Using only a text prompt and a 2D bounding box, researchers have generated new objects inside existing 3D scenes. To put bread on an empty tray, for example, a user selects the tray with a box and enters a text description, and the new object appears in the scene. The research, conducted by ETH Zurich and Google, introduces InseRF, a method for text-driven object generation in 3D scenes. Compared with other methods, InseRF is better at maintaining consistency and is more flexible, allowing objects to be inserted on different surfaces.