Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting


Duochao Shi1  Weijie Wang1,4  Donny Y. Chen2  Zeyu Zhang2,4  Jia-Wang Bian3  Bohan Zhuang1  Chunhua Shen1

1Zhejiang University, China   2Monash University, Australia   3MBZUAI   4GigaAI

* Equal contribution

TL;DR


We introduce PM-Loss, a novel regularization loss for feed-forward 3DGS based on a learned pointmap, leading to smoother 3D geometry and better rendering.


Abstract


Depth maps are widely used in feed-forward 3D Gaussian Splatting (3DGS) pipelines by unprojecting them into 3D point clouds for novel view synthesis. This approach offers advantages such as efficient training, the use of known camera poses, and accurate geometry estimation. However, depth discontinuities at object boundaries often lead to fragmented or sparse point clouds, degrading rendering quality: a well-known limitation of depth-based representations. To tackle this issue, we introduce PM-Loss, a novel regularization loss based on a pointmap predicted by a pre-trained transformer. Although the pointmap itself may be less accurate than the depth map, it effectively enforces geometric smoothness, especially around object boundaries. With the resulting improved depth maps, our method consistently strengthens feed-forward 3DGS across various architectures and scenes, delivering better rendering results.
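For readers unfamiliar with this setup, the snippet below sketches the standard unprojection step that lifts a per-view depth map into a world-space point cloud given known camera intrinsics and pose. This is an illustrative PyTorch sketch, not the released code; the function name unproject_depth and the tensor conventions (pinhole intrinsics, camera-to-world pose, z-depth) are our assumptions.

import torch

def unproject_depth(depth, K, cam2world):
    """Lift a depth map (H, W) into a world-space point cloud (H*W, 3).

    depth:     (H, W) per-pixel z-depth
    K:         (3, 3) pinhole camera intrinsics
    cam2world: (4, 4) camera-to-world pose
    """
    H, W = depth.shape
    v, u = torch.meshgrid(
        torch.arange(H, dtype=depth.dtype, device=depth.device),
        torch.arange(W, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    # Homogeneous pixel coordinates (u, v, 1) for every pixel.
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1).reshape(-1, 3)
    # Back-project through the inverse intrinsics, then scale by depth.
    pts_cam = (pix @ torch.linalg.inv(K).T) * depth.reshape(-1, 1)
    # Move from camera space to world space with the camera pose.
    return pts_cam @ cam2world[:3, :3].T + cam2world[:3, 3]

Depth discontinuities at object boundaries translate directly into gaps or stray points in this unprojected cloud, which is the failure mode PM-Loss targets.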

Method


Overview of our PM-Loss. The process begins by estimating a dense pointmap of the scene with a pre-trained model. This pointmap then serves as direct 3D supervision for training a feed-forward 3D Gaussian Splatting model, as sketched below. Crucially, unlike conventional methods that rely predominantly on 2D supervision, our approach leverages explicit 3D geometric cues, leading to enhanced 3D shape fidelity.
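To make the idea concrete, here is a minimal sketch of how such a pointmap could supervise the unprojected Gaussian centers. It is not the official PM-Loss implementation: the per-point L1 distance, the median-norm scale alignment, and the optional confidence weighting are all assumptions for illustration; the exact formulation is defined in the paper.

import torch

def pm_loss(pred_points, ref_pointmap, conf=None):
    """Illustrative pointmap regularizer (not the official PM-Loss).

    pred_points:  (N, 3) Gaussian centers from the unprojected predicted depth
    ref_pointmap: (N, 3) pixel-aligned points from a pre-trained pointmap model
    conf:         optional (N,) confidence weights for the reference points
    """
    # Pre-trained pointmaps are often only defined up to scale; a global
    # median-norm ratio is one simple way to align them (an assumption here).
    scale = (pred_points.norm(dim=-1).median()
             / ref_pointmap.norm(dim=-1).median().clamp(min=1e-8))
    ref = ref_pointmap * scale
    # Both point sets are pixel-aligned, so a per-point distance suffices;
    # no nearest-neighbour matching is needed.
    dist = (pred_points - ref).abs().sum(dim=-1)
    if conf is not None:
        # Down-weight points where the pre-trained model is less confident.
        dist = dist * conf
    return dist.mean()

Because this supervision acts directly in 3D, it can penalize floating points that a purely 2D photometric loss leaves unconstrained, which matches the boundary improvements shown in the comparisons below.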

Comparison Experiments


Point Cloud Visualization


Rendering Results



Qualitative comparisons on DL3DV (top two rows) and RealEstate10K (bottom two rows) under the 2-view extrapolation setting. Adding PM-Loss leads to significant improvements around object boundaries.



Rendering results under view extrapolation


Results on DTU with varying numbers of input views

Citation


@article{shi2025pmloss,
  title={Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting},
  author={Shi, Duochao and Wang, Weijie and Chen, Donny Y. and Zhang, Zeyu and Bian, Jia-Wang and Zhuang, Bohan and Shen, Chunhua},
  journal={arXiv preprint arXiv:2506.05327},
  year={2025}
}