WideDepth: Millimeter-Accurate Benchmark for Fisheye Depth Estimation

Ilia Indyk¹, Ignat Penshin¹, Ivan Sosin¹, Maxim Monastyrny¹, Aleksei Valenkov¹, Ilya Makarov^{2, 3}

¹Robotics Center, ²AXXX, ³ISP RAS,
ICRA 2026

Paper Dataset Code (coming soon)

WideDepth pipeline overview

Overview of our proposed pipeline for generating fisheye stereo RGB, depth, and disparity from LiDAR scans.

Abstract

Fisheye cameras are increasingly adopted in robotics for near-field manipulation, navigation, and immersive perception, yet indoor depth benchmarks with accurate ground truth are still missing. To address this, we introduce WideDepth — the first indoor dataset for fisheye depth estimation, featuring 101 scenes containing 5K high-resolution stereo pairs labeled with millimeter-level ground truth depth and disparity. Our dataset also includes paired pinhole and fisheye samples across varying fields of view and baselines in both horizontal and vertical stereo setups. We further propose a method to adapt pinhole-trained stereo models to fisheye images and introduce a novel stereo fisheye image generation pipeline based on high-resolution LiDAR scans. Leveraging these methods, we thoroughly evaluate state-of-the-art monocular depth, stereo matching, and depth completion models on our benchmark. Additionally, we provide 18K LiDAR-derived sparse depth training samples, achieving up to a 62% performance boost on fisheye data when fine-tuning pinhole-based stereo models. In summary, the high precision and versatility of our benchmark set a strong foundation for advancing research in fisheye depth estimation and robotics perception.

Comparison table with other fisheye datasets

Our benchmark offers unparalleled precision of depth maps, leveraging varying FOV and high-resolution images, surpassing all existing fisheye datasets. Our training dataset is the first-ever with a vertical stereo setup.

Stereo qualitative results

Qualitative results for stereo with FOV 195 using the StereoBase model. The model shows no degradation from geometric distortion, demonstrating the success of our approach in adapting pinhole models to fisheye data.

Mono qualitative results

Comparison of the best and worst models on a 195-degree FOV. Despite its ability to distinguish small details well, Depth Anything V2 performs significantly worse in metric depth estimation on extremely wide angles compared to Depth Pro.

Video Presentation