Depth Anything

Unlock the power of massive unlabeled data

Chinese Selection, Image, Depth estimation, Image processing
Depth Anything is a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful baseline model that handles any image under any circumstances. To this end, we design a data engine that scales up the dataset by collecting and automatically annotating a large amount of unlabeled data (around 62M images), which significantly broadens data coverage and thus reduces generalization error. We explore two simple yet effective strategies that make this data scaling promising. First, we use data augmentation tools to create a more challenging optimization target, which compels the model to actively seek additional visual knowledge and acquire robust representations. Second, we develop an auxiliary supervision that pushes the model to inherit rich semantic priors from a pre-trained encoder. We evaluate its zero-shot capabilities extensively on six public datasets and on randomly captured photos, where it shows impressive generalization ability. Furthermore, by fine-tuning it with metric depth information from NYUv2 and KITTI, we set new state-of-the-art results. Our better depth model also yields a better depth-conditioned ControlNet. Our model is released at https://github.com/LiheYoung/Depth-Anything.
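The auxiliary supervision described above can be illustrated with a small sketch. It assumes the alignment is measured by cosine similarity between the depth model's features and those of a frozen pre-trained encoder, averaged over spatial positions. The description above does not specify the measure, so this is an assumption. The function names and the optional `margin` parameter are hypothetical and are not taken from the released code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no alignment
    return dot / (norm_a * norm_b)


def feature_alignment_loss(student_feats, frozen_feats, margin=None):
    """Hypothetical auxiliary loss: mean of (1 - cosine similarity).

    student_feats: per-position features from the depth model's encoder.
    frozen_feats:  per-position features from a frozen pre-trained encoder.
    margin:        assumed tolerance; positions whose similarity already
                   exceeds it contribute zero to the loss.
    """
    if not student_feats or len(student_feats) != len(frozen_feats):
        raise ValueError("feature lists must be non-empty and equally long")
    total = 0.0
    for s, f in zip(student_feats, frozen_feats):
        sim = cosine_similarity(s, f)
        if margin is not None and sim > margin:
            continue  # already aligned closely enough; skip this position
        total += 1.0 - sim
    return total / len(student_feats)


# Two spatial positions with 2-dimensional features (toy values).
student = [[1.0, 0.0], [0.0, 1.0]]
frozen = [[1.0, 0.0], [1.0, 0.0]]
loss = feature_alignment_loss(student, frozen)
```

In a real training loop this term would be added to the depth loss with some weight. Here the first position is perfectly aligned and the second is orthogonal, so only the second contributes.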

Depth Anything Visit Over Time

Monthly Visits: 9532
Bounce Rate: 52.93%
Pages per Visit: 1.0
Visit Duration: 00:00:03

Depth Anything Alternatives