Researchers at ETH Zurich have advanced monocular depth estimation with Marigold, an open-source model built on Stable Diffusion. Marigold is obtained by fine-tuning only Stable Diffusion's denoising U-Net, and it reaches strong performance without any real depth images in its training data. Trained solely on synthetic data, it covers a wide range of scenes and generalizes well to datasets it has never seen. The core technical approach is to reuse the visual prior already learned by Stable Diffusion and to predict affine-invariant depth, that is, depth defined only up to an unknown scale and shift, which sidesteps the ambiguity introduced by unknown camera intrinsic parameters.
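To make the affine-invariant idea concrete, here is a minimal sketch of how such a prediction can be compared with metric ground truth: fit a single scale and shift by least squares, then measure the error after alignment. This is an illustration in plain Python, not Marigold's own code; the function names `fit_scale_shift`, `align_depth`, and `abs_rel_error` are hypothetical, and depth maps are assumed to be flattened into lists of valid pixel values.

```python
from typing import List, Sequence, Tuple


def fit_scale_shift(pred: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Return (scale, shift) minimizing sum((scale * pred + shift - target) ** 2).

    Hypothetical helper: closed-form 1-D least squares over flattened depth values.
    """
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0.0:
        raise ValueError("prediction is constant, so the scale is undetermined")
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    scale = cov / var_p
    shift = mean_t - scale * mean_p
    return scale, shift


def align_depth(pred: Sequence[float], target: Sequence[float]) -> List[float]:
    """Map an affine-invariant prediction into the target's units."""
    scale, shift = fit_scale_shift(pred, target)
    return [scale * p + shift for p in pred]


def abs_rel_error(aligned: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute relative error; assumes every target depth is positive."""
    return sum(abs(a - t) / t for a, t in zip(aligned, target)) / len(target)


if __name__ == "__main__":
    # Assumed toy data: a prediction in [0, 1] and ground truth in metres.
    prediction = [0.0, 0.25, 0.5, 1.0]
    ground_truth = [1.0, 1.5, 2.0, 3.0]
    aligned = align_depth(prediction, ground_truth)
    print(fit_scale_shift(prediction, ground_truth), abs_rel_error(aligned, ground_truth))
```

Because the scale and shift are fitted per image, two predictions that differ only by an affine transform receive the same score, which is why the model does not need to know the camera intrinsics to be evaluated this way.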