ODIN Model
A single model for 2D and 3D perception
Common Product, Image, computer vision, instance segmentation
ODIN (Omni-Dimensional INstance segmentation) is a model that segments and labels both 2D RGB images and 3D point clouds. It uses a transformer architecture that alternates between 2D within-view and 3D cross-view information fusion, and it distinguishes 2D from 3D feature operations through the positional encodings of the tokens involved: pixel coordinates for 2D patch tokens and 3D coordinates for 3D feature tokens. ODIN achieves state-of-the-art performance on the ScanNet200, Matterport3D, and AI2THOR 3D instance segmentation benchmarks, and competitive performance on ScanNet, S3DIS, and COCO. It outperforms all previous works by a wide margin when the sensed 3D point cloud is used in place of the point cloud sampled from the 3D mesh. Used as the 3D perception engine in an instructable embodied agent architecture, it sets a new state of the art on the TEACh action-from-dialogue benchmark. Code and checkpoints are available on the project website.
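The alternation between fusing information within each 2D view and fusing it across views in 3D can be illustrated with a small sketch. This is not ODIN's implementation: the function names (`attend`, `fuse_2d_within_view`, `fuse_3d_cross_view`, `alternate_fusion`), the dictionary layout of a view, and the distance-based attention bias (a simple stand-in for learned positional encodings) are all assumptions made for illustration. It only shows the idea that the same token features are grouped per view with pixel coordinates in one stage, then pooled across views with 3D coordinates in the next.

```python
import math


def attend(feats, coords):
    """Self-attention over one group of tokens with a distance-based bias.

    feats:  list of equal-length feature vectors (lists of floats)
    coords: list of coordinate tuples, (u, v) pixels or (x, y, z) points
    The squared-distance penalty is a hypothetical stand-in for learned
    positional encodings; nearby tokens receive larger attention weights.
    """
    dim = len(feats[0])
    fused = []
    for qi, q in enumerate(feats):
        scores = []
        for ki, k in enumerate(feats):
            dot = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            dist2 = sum((a - b) ** 2 for a, b in zip(coords[qi], coords[ki]))
            scores.append(dot - dist2)
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        fused.append([
            sum(w * k[c] for w, k in zip(weights, feats)) / total
            for c in range(dim)
        ])
    return fused


def fuse_2d_within_view(views):
    """2D stage: tokens attend only within their own view, using pixel coords."""
    return [{**v, "feats": attend(v["feats"], v["pixels"])} for v in views]


def fuse_3d_cross_view(views):
    """3D stage: tokens from all views attend to each other, using 3D coords."""
    all_feats = [f for v in views for f in v["feats"]]
    all_points = [p for v in views for p in v["points"]]
    fused = attend(all_feats, all_points)
    out, start = [], 0
    for v in views:  # split the pooled tokens back into their views
        n = len(v["feats"])
        out.append({**v, "feats": fused[start:start + n]})
        start += n
    return out


def alternate_fusion(views, num_rounds=2):
    """Alternate 2D within-view and 3D cross-view fusion for a few rounds."""
    for _ in range(num_rounds):
        views = fuse_2d_within_view(views)
        views = fuse_3d_cross_view(views)
    return views


# Two toy views, each with two tokens (made-up features and coordinates).
views = [
    {"feats": [[1.0, 0.0], [0.0, 1.0]],
     "pixels": [(0, 0), (0, 1)],
     "points": [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]},
    {"feats": [[0.5, 0.5], [1.0, 1.0]],
     "pixels": [(0, 0), (1, 0)],
     "points": [(0.1, 0.0, 1.0), (0.5, 0.2, 1.2)]},
]
fused = alternate_fusion(views)
```

In this sketch the only difference between the two stages is which tokens are grouped together and which coordinates bias the attention, which mirrors the description above at a very coarse level.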
ODIN Model Visit Over Time
Monthly Visits: 20,899,836
Bounce Rate: 46.04%
Page per Visit: 5.2
Visit Duration: 00:04:57