Stereo matching, which aims to perceive the 3-D geometry of a scene, facilitates numerous computer vision tasks used in advanced driver assistance systems (ADAS). Although numerous methods based on deep convolutional neural networks (CNNs) have been proposed for this task, stereo matching remains an unsolved problem due to its inherent matching ambiguities. To overcome these limitations, we present a method for jointly estimating disparity and confidence from stereo image pairs through deep networks. We accomplish this through a minmax optimization that learns the generative cost aggregation networks and the discriminative confidence estimation networks in an adversarial manner. Concretely, the generative cost aggregation networks are trained to generate, from an input matching cost, accurate disparities at both confident and unconfident pixels that the discriminative confidence estimation networks cannot tell apart, while the discriminative confidence estimation networks are trained to distinguish the confident disparities from the unconfident ones. In addition, to fully exploit the complementary information of the matching cost, disparity, and color image in confidence estimation, we present a dynamic fusion module. Experimental results show that this model outperforms state-of-the-art methods on various benchmarks, including real driving scenes.
|Number of pages||15|
|Journal||IEEE Transactions on Intelligent Transportation Systems|
|State||Published - 1 Nov 2021|
- confidence estimation
- dynamic feature fusion
- generative adversarial network
- stereo confidence