2025 |
CVPR |
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds |
 |
website |
2025 |
CVPR |
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision |
 |
website |
2025 |
arXiv |
Regist3R: Incremental Registration with Stereo Foundation Model |
β |
β |
2025 |
arXiv |
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World |
β |
website |
2025 |
CVPR |
AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis |
 |
website |
2025 |
arXiv |
Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction |
β |
β |
2025 |
CVPR |
MonSter: Marry Monodepth to Stereo Unleashes Power |
 |
β |
2025 |
arXiv |
D2USt3R: Enhancing 3D Reconstruction with 4D Pointmaps for Dynamic Scenes |
β |
website |
2025 |
arXiv |
FlowR: Flowing from Sparse to Dense 3D Reconstructions |
β |
website |
2025 |
arXiv |
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training |
 |
website 4D DUSt3R test |
2025 |
arXiv |
SparseGS-W: Sparse-View 3D Gaussian Splatting in the Wild with Generative Priors |
β |
DUSt3R+Diffusion+3DGS |
2025 |
ICLR |
M3: 3D-Spatial Multimodal Memory |
 |
website compression & Gaussian Memory Attention |
2025 |
CVPR |
MVSAnywhere: Zero-Shot Multi-View Stereo |
 |
website |
2025 |
CVPR |
CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis |
β |
website |
2025 |
CVPR |
Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors |
β |
website DUSt3R+multi information input |
2025 |
CVPR |
Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding |
 |
TSP3D |
2025 |
CVPR |
UniK3D: Universal Camera Monocular 3D Estimation |
 |
website |
2025 |
CVPR |
Sonata: Self-Supervised Learning of Reliable Point Representations |
 |
website |
2024 |
CVPR |
Point Transformer V3: Simpler, Faster, Stronger |
 |
β |
2022 |
NeurIPS |
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling |
 |
β |
2021 |
ICCV |
Point Transformer |
β |
unofficial implementation |
2025 |
arXiv |
Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction |
β |
website Dynamic DUSt3R, DPM |
2025 |
ICLR |
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion |
 |
website Test |
2025 |
CVPR |
Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos |
 |
website |
2025 |
CVPR |
Continuous 3D Perception Model with Persistent State |
 |
website CUT3R |
2025 |
CVPR |
SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction |
 |
MASt3R+COLMAP+3DGS |
2025 |
arXiv |
SplatVoxel: History-Aware Novel View Streaming without Temporal Training |
β |
β |
2025 |
CVPR |
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding |
 |
3DGS+Transformer |
2025 |
CVPR |
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers |
 |
website distillation |
2025 |
arXiv |
MUSt3R: Multi-view Network for Stereo 3D Reconstruction |
 |
multiple views DUSt3R |
2025 |
CVPR |
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass |
 |
website Test |
2024 |
NeurIPS |
Depth Anything V2 |
 |
website |
2024 |
CVPR |
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data |
 |
website |
2024 |
CVPR |
DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions |
β |
β |
2024 |
CVPR |
Learning to Adapt CLIP for Few-Shot Monocular Depth Estimation |
β |
β |
2024 |
arXiv |
3D Reconstruction with Spatial Memory |
 |
website Spann3R |
2024 |
CVPR |
DUSt3R: Geometric 3D Vision Made Easy |
 |
website Test |
2024 |
ECCV |
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting |
β |
website 3DGS+Transformer |
2024 |
TIP |
BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation |
 |
β |
2024 |
TIP |
GLPanoDepth: Global-to-Local Panoramic Depth Estimation |
β |
β |
2023 |
ICCV |
Towards Zero-Shot Scale-Aware Monocular Depth Estimation |
 |
website |
2023 |
ICCV |
EGformer: Equirectangular Geometry-biased Transformer for 360 Depth Estimation |
 |
β |
2023 |
Machine Intelligence Research |
Depthformer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation |
β |
β |
2023 |
CVPR |
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation |
β |
β |
2023 |
CVPR |
CompletionFormer: Depth Completion with Convolutions and Vision Transformers |
 |
website |
2023 |
ICRA |
Lightweight Monocular Depth Estimation via Token-Sharing Transformer |
β |
β |
2023 |
AAAI |
ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation |
β |
β |
2023 |
ICRA |
TODE-Trans: Transparent Object Depth Estimation with Transformer |
 |
β |
2023 |
AAAI |
Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation |
 |
β |
2022 |
ECCV |
PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation |
 |
β |
2022 |
AAAI |
Improving 360 Monocular Depth Estimation via Non-local Dense Prediction Transformer and Joint Supervised and Self-supervised Learning |
β |
β |
2022 |
arXiv |
MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth |
β |
β |
2022 |
arXiv |
ObjCAViT: Improving Monocular Depth Estimation Using Natural Language Models and Image-Object Cross-Attention |
 |
β |
2022 |
arXiv |
Depthformer: Multiscale Vision Transformer For Monocular Depth Estimation With Local Global Information Fusion |
 |
β |
2022 |
arXiv |
SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation |
β |
β |
2022 |
ECCV |
Hybrid Transformer Based Feature Fusion for Self-Supervised Monocular Depth Estimation |
β |
β |
2022 |
ECCV |
Spike Transformer: Monocular Depth Estimation for Spiking Camera |
 |
β |
2022 |
3DV |
MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer |
 |
β |
2022 |
arXiv |
DEST: Depth Estimation with Simplified Transformer |
β |
β |
2022 |
arXiv |
SparseFormer: Attention-based Depth Completion Network |
β |
β |
2022 |
CVPR |
GuideFormer: Transformers for Image Guided Depth Completion |
β |
β |
2022 |
CVPR |
Multi-Frame Self-Supervised Depth with Transformers |
β |
β |
2022 |
arXiv |
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics |
β |
β |
2021 |
ICCV |
Revisiting Stereo Depth Estimation from a Sequence-to-Sequence Perspective with Transformers |
β |
STTR stereo matching |
2021 |
BMVC |
Transformer-based Monocular Depth Estimation with Attention Supervision |
 |
β |
2021 |
ICCV |
Transformer-Based Attention Networks for Continuous Pixel-Wise Prediction |
 |
β |
2021 |
ICCV |
Vision Transformers for Dense Prediction |
 |
DPT |