Abstract
We present a multi-scale deep convolutional neural network (CNN) for automatic 2D-to-3D conversion. Traditional methods, which synthesize a virtual view from a reference view, consist of separate stages, i.e., depth (or disparity) estimation for the reference image followed by depth image-based rendering (DIBR) with the estimated depth. In contrast, we reformulate view synthesis as an image reconstruction problem with a spatial transformer module and directly generate stereo image pairs within a unified CNN framework, without ground-truth depth as supervision. We further propose a multi-scale deep architecture that captures large displacements between images at the coarse level and enhances details at the fine level. Experimental results demonstrate the effectiveness of the proposed method over state-of-the-art approaches, both qualitatively and quantitatively, on the KITTI driving dataset.
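The core idea of replacing DIBR with a differentiable spatial transformer can be illustrated with a minimal NumPy sketch: given a per-pixel disparity map predicted by the network, the virtual view is synthesized by bilinear sampling of the reference image along the horizontal axis, so a photometric reconstruction loss against the true second view can supervise training without ground-truth depth. The function name `warp_horizontal` and the edge-clamping behavior below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def warp_horizontal(image, disparity):
    """Synthesize a virtual view from a reference view (illustrative sketch).

    Each output pixel at column x samples the reference image at x - disparity
    using linear interpolation in x, so the warp is differentiable with
    respect to the disparity (the spatial-transformer principle).
    image:     (H, W) grayscale array
    disparity: (H, W) horizontal shifts in pixels
    """
    H, W = image.shape
    xs = np.tile(np.arange(W, dtype=float), (H, 1))
    src = xs - disparity                      # sampling location in reference
    x0 = np.clip(np.floor(src).astype(int), 0, W - 1)   # left neighbor
    x1 = np.clip(x0 + 1, 0, W - 1)                      # right neighbor
    frac = np.clip(src - np.floor(src), 0.0, 1.0)       # interpolation weight
    rows = np.repeat(np.arange(H)[:, None], W, axis=1)
    # Linear blend of the two horizontal neighbors (edges are clamped).
    return (1.0 - frac) * image[rows, x0] + frac * image[rows, x1]

# A constant disparity of 1 shifts the image one pixel to the right:
img = np.array([[0.0, 1.0, 2.0, 3.0, 4.0]])
out = warp_horizontal(img, np.ones((1, 5)))   # -> [[0., 0., 1., 2., 3.]]
```

In a full system, `disparity` would come from the multi-scale CNN and the difference between `out` and the captured second view would drive the reconstruction loss.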
| Original language | English |
|---|---|
| Title of host publication | 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings |
| Publisher | IEEE Computer Society |
| Pages | 730-734 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781509021758 |
| DOIs | |
| State | Published - 2 Jul 2017 |
| Event | 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China; Duration: 17 Sep 2017 → 20 Sep 2017 |
Publication series
| Name | Proceedings - International Conference on Image Processing, ICIP |
|---|---|
| Volume | 2017-September |
| ISSN (Print) | 1522-4880 |
Conference
| Conference | 24th IEEE International Conference on Image Processing, ICIP 2017 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 17/09/17 → 20/09/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- Automatic 2D-to-3D conversion
- Depth image-based rendering
- Multi-scale deep neural network
- View extrapolation