BlockFusion
Expandable 3D scene generation with latent tri-plane extrapolation
BlockFusion is a diffusion-based model that generates 3D scenes as unit blocks and seamlessly extends existing scenes by incorporating new blocks. It is trained on a dataset of 3D blocks randomly cropped from complete 3D scene meshes. Through per-block fitting, all training blocks are converted into hybrid neural fields: a tri-plane holding geometry features, followed by a multi-layer perceptron (MLP) that decodes signed distance values. A variational autoencoder compresses the tri-planes into a latent tri-plane space, on which denoising diffusion is performed; diffusion over these latent representations enables the generation of high-quality and diverse 3D scenes. To expand a scene, empty blocks are appended so that they overlap the current scene, and the existing latent tri-planes are extrapolated to populate the new blocks. Extrapolation is achieved by conditioning the generation on feature samples from the overlapping tri-planes during the denoising iterations. Latent tri-plane extrapolation produces semantically and geometrically meaningful transitions that blend harmoniously with the existing scene. A 2D layout conditioning mechanism controls the placement and arrangement of scene elements. Experiments show that BlockFusion can generate diverse, geometrically consistent, and high-quality large-scale indoor and outdoor 3D scenes.
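To make the hybrid neural field concrete, here is a minimal PyTorch sketch of a tri-plane representation with an MLP decoder for signed distance values. It is an illustration under stated assumptions, not BlockFusion's actual code: the class name TriplaneSDF, the resolution PLANE_RES, and the channel count FEAT_DIM are all hypothetical.

```python
# Minimal sketch of a tri-plane + MLP signed-distance field (illustrative
# shapes and names; PLANE_RES, FEAT_DIM, TriplaneSDF are assumptions, not
# taken from the BlockFusion codebase).
import torch
import torch.nn as nn
import torch.nn.functional as F

PLANE_RES, FEAT_DIM = 128, 32  # assumed tri-plane resolution and channels

class TriplaneSDF(nn.Module):
    """Tri-plane features plus an MLP mapping 3D points to signed distances."""
    def __init__(self):
        super().__init__()
        # Three axis-aligned feature planes: XY, XZ, YZ.
        self.planes = nn.Parameter(
            torch.randn(3, FEAT_DIM, PLANE_RES, PLANE_RES) * 0.01)
        self.decoder = nn.Sequential(
            nn.Linear(3 * FEAT_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),  # one signed distance value per query point
        )

    def forward(self, xyz):  # xyz: (N, 3) points in [-1, 1]^3
        feats = []
        # Project each point onto the three planes and bilinearly sample.
        for i, dims in enumerate([(0, 1), (0, 2), (1, 2)]):
            uv = xyz[:, dims].view(1, -1, 1, 2)            # (1, N, 1, 2) grid
            plane = self.planes[i].unsqueeze(0)            # (1, C, R, R)
            sampled = F.grid_sample(plane, uv, align_corners=True)
            feats.append(sampled.squeeze(0).squeeze(-1).t())  # (N, C)
        return self.decoder(torch.cat(feats, dim=-1))      # (N, 1) SDF values
```

In a per-block fitting setup of this kind, the planes and decoder would be optimized so the predicted signed distances match those sampled from the block's ground-truth mesh; the fitted tri-plane is then the input the VAE compresses into the latent space used for diffusion.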
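The extrapolation step can likewise be sketched schematically: at each denoising iteration, the latent region that overlaps the existing scene is re-noised to the current noise level and written back in, so the generated remainder stays anchored to it. This is a generic inpainting-style sampling loop, not BlockFusion's published sampler; every name here (scheduler, q_sample, denoise_step, overlap_mask, known_latent) is a hypothetical placeholder.

```python
# Schematic sketch of latent tri-plane extrapolation by conditioning the
# denoising loop on the overlap region. The scheduler/model interfaces are
# assumed for illustration only.
import torch

def extrapolate_latents(model, known_latent, overlap_mask, scheduler, steps=50):
    """Fill an empty block's latent tri-plane so it agrees with the overlap.

    known_latent : latent tri-plane of the existing scene (zeros outside it)
    overlap_mask : 1 where the new block overlaps the existing scene, else 0
    """
    x = torch.randn_like(known_latent)        # start the new block from noise
    for t in scheduler.timesteps(steps):      # iterate from high to low noise
        # Re-noise the known overlap to the current noise level t ...
        known_t = scheduler.q_sample(known_latent, t)
        # ... and overwrite the overlap so generation stays anchored to it.
        x = overlap_mask * known_t + (1 - overlap_mask) * x
        # One reverse-diffusion step denoises the whole latent jointly.
        x = scheduler.denoise_step(model, x, t)
    # Final composite: true scene latents in the overlap, generated elsewhere.
    return overlap_mask * known_latent + (1 - overlap_mask) * x
```

Because the reverse step denoises the overlap and the empty region jointly, the generated geometry is pushed toward completions that are consistent with the existing scene, which is what yields the smooth transitions described above.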
BlockFusion Visit Over Time
Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32