Automatic 2D-to-3D conversion using multi-scale deep neural network

Jiyoung Lee, Hyungjoo Jung, Youngjung Kim, Kwanghoon Sohn

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

We present a multi-scale deep convolutional neural network (CNN) for automatic 2D-to-3D conversion. Traditional methods, which synthesize a virtual view from a reference view, consist of two separate stages: depth (or disparity) estimation for the reference image, and depth image-based rendering (DIBR) with the estimated depth. In contrast, we reformulate view synthesis as an image reconstruction problem with a spatial transformer module and directly generate stereo image pairs with a unified CNN framework, without ground-truth depth as supervision. We further propose a multi-scale deep architecture that captures large displacements between images at the coarse level and enhances detail at the fine level. Experimental results on the KITTI driving dataset demonstrate the effectiveness of the proposed method over state-of-the-art approaches, both qualitatively and quantitatively.
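The reconstruction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampler, the loss, or the sign convention, so the function names (`warp_row`, `warp_image`, `l1_loss`), the horizontal-only linear sampling, the border clamping, and the L1 loss are all assumptions chosen for illustration. The sketch warps a reference view by a per-pixel disparity map and compares the result with the other view of a stereo pair, which is the kind of training signal that needs no ground-truth depth.

```python
import math


def warp_row(row, disparity):
    """Sample one image row at x + disparity[x] with linear interpolation.

    Assumption: rectified stereo, so displacement is purely horizontal.
    Sample positions are clamped to the row borders.
    """
    width = len(row)
    out = []
    for x in range(width):
        src = min(max(x + disparity[x], 0.0), width - 1.0)
        x0 = int(math.floor(src))
        x1 = min(x0 + 1, width - 1)
        w = src - x0  # interpolation weight toward the right neighbour
        out.append((1.0 - w) * row[x0] + w * row[x1])
    return out


def warp_image(image, disparity):
    """Warp a grayscale image (list of rows) by a same-shaped disparity map."""
    return [warp_row(r, d) for r, d in zip(image, disparity)]


def l1_loss(predicted, target):
    """Mean absolute difference between two same-shaped images."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    return total / count


# Hypothetical toy data: one reference row and a constant half-pixel disparity.
reference = [[0.0, 10.0, 20.0, 30.0]]
disparity = [[0.5, 0.5, 0.5, 0.5]]
synthesized = warp_image(reference, disparity)
```

In a trainable model the same sampling would be expressed with a tensor library so that gradients of the reconstruction loss flow back to the predicted disparity. A multi-scale variant would presumably apply the loss at several resolutions, although the abstract does not say how the scales are combined.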

Original language: English
Title of host publication: 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
Publisher: IEEE Computer Society
Pages: 730-734
Number of pages: 5
ISBN (Electronic): 9781509021758
State: Published - 2 Jul 2017
Event: 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China
Duration: 17 Sep 2017 - 20 Sep 2017

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2017-September
ISSN (Print): 1522-4880

Conference

Conference: 24th IEEE International Conference on Image Processing, ICIP 2017
Country/Territory: China
City: Beijing
Period: 17/09/17 - 20/09/17

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Automatic 2D-to-3D conversion
  • Depth image-based rendering
  • Multi-scale deep neural network
  • View extrapolation
