
Accurate and Fast Convergent Initial-Value Belief Propagation for Stereo Matching

  • Xiaofeng Wang,

    Affiliations Vision and Image Processing Laboratory, College of Computer Science, Sichuan University, Chengdu, P.R. China, College of Mathematical and Physical Sciences, Chongqing University of Science and Technology, Chongqing, P.R. China

  • Yiguang Liu

    liuyg@scu.edu.cn

    Affiliation Vision and Image Processing Laboratory, College of Computer Science, Sichuan University, Chengdu, P.R. China

Abstract

The belief propagation (BP) algorithm has several limitations, including ambiguous edges, poor performance in textureless regions, and slow convergence. To address these problems, we present a novel algorithm that intrinsically improves both the accuracy and the convergence speed of BP. First, traditional BP is time-consuming because it requires numerous iterations. To reduce the number of iterations, inspired by the crucial importance of the initial value in nonlinear problems, a novel initial-value belief propagation (IVBP) algorithm is presented, which greatly improves both convergence speed and accuracy. Second, the majority of existing research on BP concentrates on the smoothness term or other energy terms, neglecting the significance of the data term. In this study, a self-adapting dissimilarity data term (SDDT) is presented to improve the accuracy of the data term; it incorporates an additional gradient-based measure into the traditional data term, with the weight determined by a robust measure-based control function. Finally, this study explores the effective combination of local methods and global methods. The experimental results demonstrate that our method performs well compared with state-of-the-art BP and simultaneously achieves better edge-preserving smoothing effects with fast convergence on the Middlebury and new 2014 Middlebury datasets.

Introduction

Stereo matching is one of the most extensively researched topics in computer vision and aims to infer a dense disparity or depth map by finding the correct correspondence between a pair of images (reference image and target image) captured from different viewpoints or at different times. In recent decades, many high-quality studies have been conducted [1, 2, 3].

Generally, stereo matching methods can be classified into global methods and local methods [4, 5]. In global methods, stereo matching is commonly formulated as an energy-minimization framework. The belief propagation (BP) [6, 7] algorithm is one of the most popular global methods [8]. Numerous methods have been presented to improve BP, including loopy belief propagation (LBP) [7], hierarchical belief propagation (HBP) [9], context guided BP (CBP) [10], and fast-converging belief propagation [1]. However, these methods are primarily designed to decrease time consumption without intrinsically improving the accuracy of BP. Although some fast BP algorithms have emerged, BP is still much slower than local methods, preventing it from meeting real-time requirements. In addition, BP suffers from several problems, such as the Markov random field (MRF) shrinking bias, the over-smoothing phenomenon, and ambiguous edges in the disparity map [11]. In contrast, local methods compute disparity values within a finite region. Traditional local methods are typically faster than global methods but less accurate. Recently, some excellent local algorithms have been proposed whose performance is similar to that of state-of-the-art global algorithms, such as the bilateral filter (BF) [12] and guided filter (GF) [13, 14]. Therefore, combining the advantages of local methods provides a beneficial way to improve the accuracy and efficiency of BP.

The main contribution of this paper lies in presenting an initial-value belief propagation (IVBP) with high accuracy and fast convergence, inspired by the importance of the initial value for complex nonlinear problems. A more accurate initial value helps acquire a more accurate solution of a nonlinear problem, along with much faster convergence. Like such nonlinear problems, BP is a cyclic iterative method, so an accurate initial value can lead to fast convergence of BP while yielding more accurate disparity values. Local methods with both good accuracy and high speed are therefore especially important for the performance of BP. Presently, GF is one of the best local approaches in terms of accuracy and speed. The disparity map of GF can be set as the initial value of BP, assigning different disparities different probabilities based on the initial value of the local method instead of the traditional equal probabilities. Based on this idea, IVBP is proposed.

Another contribution of this paper is the refinement of the data term of the BP algorithm. In the past decade, many improved BP algorithms have been reported [1,7,9]. However, these methods concentrate on the smoothness term or other energy terms rather than the data term, and they neglect, to a certain degree, the importance of the data term for accuracy and speed. In fact, the data and smoothness terms are related yet distinct: the reference accuracy of the smoothness term and the other energy terms greatly depends on the fidelity of the data term, so improving the data term helps both the accuracy and the convergence of BP. Therefore, in this paper, a self-adapting dissimilarity data term (SDDT) is proposed instead of the traditional data term, combining an intensity-based measure and a gradient-based measure with contextual inference using a robust measure-based control function. This method also effectively improves performance in textureless regions and near disparity discontinuities.

This study also explores the combination of local methods and global methods. Local methods and global methods have different advantages and disadvantages; e.g., local methods are simple and fast, whereas global methods are relatively complex and slow. Local methods depend primarily on disparity values within a local window and perform poorly in textureless and occluded regions, whereas global methods perform better in these regions due to information inference. Therefore, their combination should result in more competitive performance, which we consider as important as establishing a new method. We also expect that this study will attract more attention to the exploration of fusions of local and global methods.

The rest of this paper is organized as follows: in Section 2, we provide a brief overview of BP and then propose IVBP. Section 3 discusses experiments to evaluate the proposed methods, demonstrating the effectiveness of our optimization framework on the Middlebury and new 2014 Middlebury datasets. Finally, Section 4 provides some conclusions.

Our Algorithm

In this paper, to improve accuracy and speed, we present an accurate and fast-converging IVBP.

First, we briefly provide an overview of our method, as depicted in Fig 1. Second, in Section 2.2, we propose SDDT, which combines an intensity-based measure and a gradient-based measure using a robust measure-based control function. Last, we present IVBP in Section 2.3.

Fig 1. Block diagram of our algorithm.

See the text for more details.

https://doi.org/10.1371/journal.pone.0137530.g001

Belief Propagation

Stereo correspondence can be formulated as the estimation of disparity values for every node in an MRF using BP. This section briefly reviews the BP adopted here. BP is an inference algorithm used to find the most likely disparity value of each node through a global energy-minimization framework [8, 9] and is defined as follows:

Let I be the set of pixels and D be a finite set of disparity labels. Our aim is to find a mapping f : I → D that assigns each pixel a reliable disparity value. (p,q) denotes a pair of neighboring nodes in a neighborhood system N(·). A label map f assigns labels (fp,fq) ∈ D to each pair of pixels (p,q) ∈ I. The compatibility function Φ(p,fp) denotes how compatible a disparity value fp is with the intensity observed at pixel p in the images, and the evidence function Ψ(fp,fq) denotes the difference between the two disparities of a pair (fp,fq).

Considering the compatibility of neighboring disparity variables (fp,fq) and the intensities of neighboring pixels (p,q) ∈ I between the image pairs, the joint posterior probability of the MRF can be defined as [8]: P(f|I) ∝ ∏p∈I Φ(p,fp) ∏(p,q)∈N Ψ(fp,fq) (1)

Taking the negative log of Eq (1), maximizing the probability P(·) of the MRF is equivalent to minimizing the following energy function E(·): E(f) = −ln P(f|I) = Σp∈I [−ln Φ(p,fp)] + Σ(p,q)∈N [−ln Ψ(fp,fq)] (2)

We abbreviate E(P(D|I)) as E(D|I), and thus Eq (2) can be expressed as: E(D|I) = Σp∈I D(p,fp) + Σ(p,q)∈N V(fp,fq) (3) where D(·) and V(·) represent the data and smoothness terms, respectively.
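
To make the energy in Eq (3) concrete, the following minimal Python sketch evaluates it over a 4-connected grid. The array layout, the helper names, and the truncated-linear smoothness model are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mrf_energy(labels, data_cost, smooth_cost):
    """Evaluate Eq (3): sum of data terms plus pairwise smoothness terms
    over a 4-connected grid.

    labels      : (H, W) integer disparity map f
    data_cost   : (H, W, L) array, data_cost[y, x, d] = D(p, d)
    smooth_cost : callable V(fp, fq) applied to neighboring labels
    """
    H, W = labels.shape
    ys, xs = np.indices((H, W))
    energy = data_cost[ys, xs, labels].sum()                     # sum of D(p, f_p)
    energy += smooth_cost(labels[:, :-1], labels[:, 1:]).sum()   # horizontal pairs
    energy += smooth_cost(labels[:-1, :], labels[1:, :]).sum()   # vertical pairs
    return energy

# Example smoothness term: a truncated linear model, a common choice in BP
# (assumed here for illustration only).
def truncated_linear(fp, fq, weight=1.0, trunc=2.0):
    return weight * np.minimum(np.abs(fp - fq), trunc)
```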

Self-Adapting Dissimilarity Data Term

As described previously, BP mainly consists of two terms: the data term D(·) and the smoothness term V(·). Extensive experiments show that the data term D(·) is important for the accuracy of BP. Therefore, to improve the accuracy of the data term, we present an SDDT.

Self-adapting Dissimilarity Data Term.

Often, the data term depends on pixel-based intensity measures, such as absolute intensity differences (AD) and squared intensity differences (SD). A major limitation of these measures is that they utilize only raw intensity [15]: the raw intensity measure does not consider contextual constraints, and relying only on raw intensity is challenging in locally ambiguous regions, such as discontinuous and textureless regions. In contrast, gradient-based measures can provide contextual constraints. Therefore, combining intensity-based and gradient-based measures should yield better performance than intensity-based measures alone, similar to the self-adapting dissimilarity measure in [16]. However, Klaus et al. [16] only combined the self-adapting dissimilarity measure with color segmentation on the reference image without essentially combining it with global methods. In contrast with previous work [16], we incorporate the self-adapting dissimilarity measure into the data term of BP for global methods and propose SDDT.

Here, the data term D(·) adopted in this study is briefly reviewed. Suppose that IL(·) and IR(·) are the reference image and the target image, respectively. Let DAD(x,y,d) represent the AD measure at position (x,y) with disparity value d, where N(x,y) represents a neighborhood at position (x,y), NX(x,y) represents the neighborhood without its right-most column, NY(x,y) represents the neighborhood without its lowest row, ∇X represents the intensity difference to the right, and ∇Y is the intensity difference to the bottom. Then, the AD measure DAD(·) is computed as follows: (4)

Next, SDDT is proposed to improve both accuracy and robustness. Let DGRAD(x,y,d) represent the gradient-based measure at position (x,y) with disparity value d. Then, the gradient-based measure DGRAD(·) is defined as follows: (5)
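
Since the exact forms of Eqs (4) and (5) are not reproduced here, the following sketch illustrates one plausible implementation of the two measures as described above (in the spirit of the self-adapting measure of [16]): absolute intensity differences summed over N(x,y), and absolute differences of horizontal and vertical gradients summed over NX and NY. The window radius r, the float single-channel image format, and the simple forward differences are assumptions made for illustration.

```python
import numpy as np

def ad_measure(IL, IR, x, y, d, r=1):
    """Intensity-based measure D_AD: sum of absolute intensity differences
    over a (2r+1)x(2r+1) neighborhood N(x, y), matching left pixel (x, y)
    against right pixel (x - d, y). IL, IR: float (H, W) images; the window
    is assumed to lie inside both images."""
    win_L = IL[y - r:y + r + 1, x - r:x + r + 1]
    win_R = IR[y - r:y + r + 1, x - d - r:x - d + r + 1]
    return np.abs(win_L - win_R).sum()

def grad_measure(IL, IR, x, y, d, r=1):
    """Gradient-based measure D_GRAD: absolute differences of the forward
    gradients in x (over N_X, the window without its right-most column) and
    in y (over N_Y, the window without its lowest row)."""
    gxL, gxR = IL[:, 1:] - IL[:, :-1], IR[:, 1:] - IR[:, :-1]   # gradient to the right
    gyL, gyR = IL[1:, :] - IL[:-1, :], IR[1:, :] - IR[:-1, :]   # gradient to the bottom
    cost_x = np.abs(gxL[y - r:y + r + 1, x - r:x + r] -
                    gxR[y - r:y + r + 1, x - d - r:x - d + r]).sum()
    cost_y = np.abs(gyL[y - r:y + r, x - r:x + r + 1] -
                    gyR[y - r:y + r, x - d - r:x - d + r + 1]).sum()
    return cost_x + cost_y
```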

Further, to balance the two previous measures, an optimal weighting between the intensity-based measure DAD(·) and the gradient-based measure DGRAD(·) is introduced. The improved data term D′(·) is defined as follows: (6)

Note that the improved data term contains the traditional data term as a special case, obtained when the optimal weighting of the gradient-based measure is set to zero.

Robust Control Function

For the data term D′(·), experiments demonstrate that there is inevitably some, and occasionally severe, bias between DAD(·) and DGRAD(·). To further reduce the bias between the two measures, a simple but effective robust control function ρ(·) is proposed.

Through ρ(c,λ), different cost measures can be mapped to the range [0, 1]. Therefore, ρ(·) can control the influence of outliers by adjusting the parameter λ. The SDDT is further described in Eq (7): (7) where ρ(D(·),λ) = 1 − exp(−D(·)/λ).

Here, the optimal weighting balances ρ(DAD(·)) and ρ(DGRAD(·)). See Section 3.1 for the discussion of the parameter setting and its sensitivity.
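
A compact sketch of the resulting SDDT follows. The variable name w for the optimal weighting, its default of 0.1 (taken from Section 3.1), the convention that w multiplies the gradient term, and the two λ values are illustrative assumptions rather than the paper's exact settings.

```python
import numpy as np

def robust_control(cost, lam):
    """Robust control function of Eq (7): maps a cost into [0, 1)."""
    return 1.0 - np.exp(-cost / lam)

def sddt(d_ad, d_grad, w=0.1, lam_ad=10.0, lam_grad=2.0):
    """Self-adapting dissimilarity data term: weighted combination of the
    robustified intensity and gradient measures. Setting w = 0 recovers
    the traditional intensity-only data term."""
    return (1.0 - w) * robust_control(d_ad, lam_ad) \
           + w * robust_control(d_grad, lam_grad)
```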

To emphasize the performance of SDDT, additional gradient measure maps for the “Tsukuba” dataset are depicted in Fig 2. For better visualization, we show a zoomed-in view of the gradient map near the nose, marked by the yellow rectangle, where the gradient follows the shape of the nose. We also zoom in on a discontinuous region near the nose, marked by the red rectangle. As seen from Fig 2, the additional gradient-based measure helps reduce errors because it applies spatial contextual constraints to acquire richer information. Hence, the SDDT, which combines intensity-based and gradient-based measures, is beneficial for improving the accuracy of BP.

Fig 2. Gradient images on the Middlebury dataset “Tsukuba”.

We reconstruct the gradient image based on gradient measure and zoom in on the nose region marked by the yellow rectangle. We also zoom in on the discontinuous region near the nose region marked by the red rectangle. Note that adding a gradient measure can enhance the reliability of the correspondences.

https://doi.org/10.1371/journal.pone.0137530.g002

Experiments.

To contrast our SDDT with the traditional data term more intuitively, we compare the two methods using the same MRF and BP optimization framework on the information-rich “Teddy” dataset in Fig 3. As shown in Fig 3, the performance of SDDT is better than that of the traditional data term, especially near disparity discontinuities and in textureless regions. In the red rectangle in Fig 3a, the disparity estimate of the roof obtained with the traditional data term is especially discontinuous. The primary reason is that this region is textureless and the intensity values are nearly equal; when only the intensity measure is used, the data term is not sufficiently accurate due to the lack of other effective constraints. The gradient measure provides auxiliary contextual inference, so SDDT can exploit spatial contextual constraints. In Fig 3b, we verify our algorithm using the contour map, and SDDT is more stable and continuous. Fig 3c provides a visual 3D comparison using BP with and without SDDT. After introducing a gradient measure, the discontinuity artifacts on the roof are removed, as shown in Fig 3c. Overall, the combined intensity- and gradient-based measure SDDT is more robust and accurate.

Fig 3. Comparison of disparity maps without and with gradient measure on “Teddy”.

(a) Disparity maps without and with the gradient measure. (b) The disparity samples from (a) marked by a red rectangle. (c) 3D views based on the disparities of (a). Note that performance in flat, textureless regions is much better with the gradient measure than without it, as seen in the red rectangle.

https://doi.org/10.1371/journal.pone.0137530.g003

Initial-Value Belief Propagation

As described previously, BP suffers from indistinct edges and slow convergence. To solve these problems, we present the IVBP algorithm.

Initial-Value Belief Propagation.

Several BP algorithms have been proposed, including LBP [7], HBP [9], CBP [10], and fast-converging belief propagation [1], but they are designed to decrease time consumption without essentially improving precision. To improve precision, some researchers have combined other image-processing or mathematical techniques with BP, such as differential geometry [17,18,19], genetic programming [20], and image segmentation [1,21,22]. Although the accuracy of the disparity map is improved, these additional methods also increase time consumption and, more importantly, do not improve the accuracy of BP itself. To address this problem, we present IVBP to improve both the accuracy and the convergence speed of BP.

At present, the convergence of BP has attracted much attention [23]. However, whether a BP algorithm converges, and under what conditions, cannot be strictly determined theoretically, which is a limitation of BP. Even when BP converges, its convergence speed and accuracy are not guaranteed in most cases. Fortunately, a more accurate initial value helps acquire both a more accurate solution and faster convergence, which alleviates this limitation to a certain degree.

To demonstrate the advantages of IVBP, we utilize an “up-down-left-right” accelerated BP, which is a fundamental and fast BP [7]. This schedule propagates messages in one direction at a time (up, down, left, and right) and updates each node immediately.

Setting the Initial Value.

The accuracy and time consumption of the method used to acquire the initial value are important for BP. Numerous local methods have been presented, such as AD, Birchfield and Tomasi dissimilarity (BT) [24], non-parametric transforms [25,26], the geodesic filter [27] and BF [12,28]. The key limitation of these methods is achieving both accuracy and speed simultaneously. At present, GF offers high speed and accurate performance and is one of the best local methods [13, 14]. Here, we provide a brief description.

We consider GF as a general linear filtering problem. Given a pair of images ϕ = {I,I′}, where I and I′ are the left and right images, respectively, a pixel s in I may match the corresponding pixel s′ in I′ with the disparity ds. Let C(s,ds) denote the cost function for pixel s at disparity ds. Cost measures C(·) are defined as follows [13, 14]: (8)

Here, ∇X denotes the forward gradient to the right in the X direction, the normalization coefficient λ balances the color and gradient measures, and τc and τg are the truncation thresholds for the color and gradient measures, respectively.

Furthermore, the cost measure CG(·) of GF is output as follows: (9) where μk and σk² are the mean and variance in the square window ωk centered at pixel k.

Once the cost volume slices are filtered, the Winner-Takes-All (WTA) strategy is applied to select the best disparity value for each pixel s: (10)
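
To illustrate Eqs (8)-(10), the following sketch builds a truncated intensity-plus-gradient cost volume, filters each slice with a gray-scale guided filter [14], and applies WTA. The parameter defaults (λ, τc, τg, r, ε), the single-channel simplification, and the use of np.roll to shift the target image are assumptions made for brevity, not the exact settings of [13, 14].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def gf_cost_volume(IL, IR, max_disp, lam=0.1, tau_c=0.03, tau_g=0.01):
    """Truncated intensity + gradient matching cost of Eq (8) for every
    pixel and disparity (grayscale images IL, IR scaled to [0, 1])."""
    gxL, gxR = np.gradient(IL, axis=1), np.gradient(IR, axis=1)
    cost = np.empty((max_disp, *IL.shape), dtype=np.float64)
    for d in range(max_disp):
        IR_d = np.roll(IR, d, axis=1)        # IR_d[y, x] = IR[y, x - d] (wraps at border)
        gxR_d = np.roll(gxR, d, axis=1)
        c_col = np.minimum(np.abs(IL - IR_d), tau_c)
        c_grad = np.minimum(np.abs(gxL - gxR_d), tau_g)
        cost[d] = (1.0 - lam) * c_col + lam * c_grad
    return cost

def guided_filter(I, p, r=9, eps=1e-4):
    """Gray-scale guided filter of He et al. [14], guidance I, input p."""
    box = lambda x: uniform_filter(x, size=2 * r + 1)
    mean_I, mean_p = box(I), box(p)
    a = (box(I * p) - mean_I * mean_p) / (box(I * I) - mean_I ** 2 + eps)
    b = mean_p - a * mean_I
    return box(a) * I + box(b)

def gf_disparity(IL, IR, max_disp):
    """Filter every cost slice (Eq 9) and take the Winner-Takes-All
    disparity of Eq (10)."""
    cost = gf_cost_volume(IL, IR, max_disp)
    filtered = np.stack([guided_filter(IL, cost[d]) for d in range(max_disp)])
    return filtered.argmin(axis=0)
```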

The Initial Value Using GF.

To describe our IVBP, we first review how message inference is formulated. We utilize the max-product algorithm to obtain the Maximum A Posteriori (MAP) estimate of the disparity map.

The max-product BP algorithm works by passing messages around the graph, which is commonly defined by a four-connected image grid. Let m(·) denote the message vector passed from node p with disparity fp to one of its neighbors q with disparity fq at iteration i. Let N(p) \ q denote the set of nodes neighboring p other than q. Then, the iterative message passing procedure of the max-product BP algorithm is given as follows: (11)

Note that the passing message m(·) represents a probability, which often determines the results of the disparity map. Generally, the traditional belief vector bp at p with L dimensions contains possible labels bp = [bp1,bp2,…,bpL] with equal probabilities P = [P1,P2,…,PL], where Pi = Pj (∀i,j ≤ L); however, the probabilities should differ because a true disparity value exists. Based on the above analysis, we set the result of GF as the initial value of the BP algorithm and assign different probabilities Pi to different disparity values bi using the piecewise probability function υ(·). That is, disparities derived from GF receive greater probabilities than other disparity values because they are much closer to the true disparity values, as shown in Fig 4. To tolerate possible errors, we assign the higher probability weight α1 to the disparity value of GF at node p and the lower probability weight α2 to disparities near the GF value. Hence, the piecewise function υ(·) is defined as follows: (12)

Here, υ:ℜ→ℜ, the terms α1 and α2 are the probability thresholds of different messages, α1 > α2 > 1, and d* is the disparity value derived from GF, which belongs to a finite set of labels b = [b1,b2,…,bL] with L discrete disparities. See Section 3.1 for a discussion of the parameter settings for α1 and α2 and their sensitivities.
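
A small sketch of Eq (12) follows. The interpretation of “near the GF disparity” as the two adjacent disparity levels is an assumption; α1 = 2 and α2 = 1.5 are taken from the parameter study in Section 3.1.

```python
import numpy as np

def upsilon(labels, d_star, alpha1=2.0, alpha2=1.5):
    """Piecewise probability weight of Eq (12): alpha1 for the GF disparity
    d*, alpha2 for labels adjacent to d* (assumed here to mean +/- 1 level),
    and 1 elsewhere. 'labels' is the array [b1, ..., bL] of candidates."""
    labels = np.asarray(labels)
    w = np.ones(len(labels), dtype=np.float64)
    w[np.abs(labels - d_star) <= 1] = alpha2   # near the GF disparity
    w[labels == d_star] = alpha1               # exactly the GF disparity
    return w
```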

Fig 4. Local message passing procedure in a Markov network.

Green nodes are hidden (disparity) variables, whereas gray nodes are observable variables. The new message sent from node p to q is computed with probability weight υ(·).

https://doi.org/10.1371/journal.pone.0137530.g004

From Eqs (11) and (12), the improved passing message m*(fp) is calculated: (13)

For simplicity, we only analyze the data term and do not distinguish the messages in the up, down, right and left directions. Inserting Eq (13) into Eq (11), the improved iterative message passing procedure is given as follows: (14)

Further, the improved belief vector at node p can be computed after iteration i: (15)

Finally, the label maximizing the belief at node p is selected as follows: (16)

Note that our IVBP contains BP as a special case, when both α1 = 1 and α2 = 1.
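
The following sketch shows, in the max-product (probability) domain, how the initial-value weight of Eq (12) could enter the message update of Eqs (13)-(14) and the belief of Eqs (15)-(16). Treating Φ as exp(−D′) and Ψ as exp(−V), the placement of υ inside the product, the normalization step, and all names are our own assumptions; with α1 = α2 = 1 the code reduces to standard max-product BP, consistent with the remark above.

```python
import numpy as np

def improved_message(phi_p, psi, incoming, v_p):
    """Improved max-product message (Eqs 13-14) from node p to a neighbor q.

    phi_p    : (L,)  compatibility Phi(p, f_p), e.g. exp(-D'(p, f_p))
    psi      : (L, L) evidence Psi(f_p, f_q)
    incoming : list of (L,) messages from N(p) \\ q at the previous iteration
    v_p      : (L,)  initial-value weights from Eq (12)
    """
    # All factors depending on f_p, weighted by the initial-value term (Eq 13).
    prod = v_p * phi_p * np.prod(incoming, axis=0)
    # Maximize over f_p for every candidate f_q (Eq 14).
    msg = (psi * prod[:, None]).max(axis=0)
    return msg / msg.sum()                     # normalize for numerical stability

def belief_and_label(phi_p, incoming_all, v_p):
    """Belief vector of Eq (15) and the selected label of Eq (16) at node p."""
    belief = v_p * phi_p * np.prod(incoming_all, axis=0)
    return belief, int(np.argmax(belief))
```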

The disparity value of GF is thus used both as a constraint on the smoothness term and as an aid to information inference. As a result, IVBP shows good performance, as demonstrated by the experimental results in Section 3.

Post-processing.

The aim of our algorithm is essentially to improve BP itself, but our method also combines well with other techniques, such as image segmentation.

BP has some shortcomings, such as the MRF shrinking bias, the over-smoothing phenomenon, and ambiguous disparity edges. Image segmentation, which decomposes the reference image into a series of homogeneous color or grayscale regions, can alleviate these problems: it helps resolve ambiguity within textureless regions and aligns object boundaries with depth discontinuities. Therefore, IVBP with image segmentation (denoted WIVBP) is proposed. See Section 3.1 for the discussion of WIVBP.

Experiments and Discussions

In this section, we conduct experiments to evaluate our algorithm with quantitative error analysis. First, we analyze the key parameter settings of the proposed algorithm and then perform experiments on the Middlebury and new 2014 Middlebury datasets with respect to accuracy and convergence speed. The experimental results show the advantages in accuracy and efficiency of our algorithm compared with other BP algorithms.

Parameter Settings

Here, we provide some of the important parameter settings used in our algorithm, including the optimal weight and the probability thresholds of different messages α1 and α2.

For SDDT, the key parameter is the optimal weighting between the intensity-based and gradient-based measures. Fig 5a shows the performance of the proposed algorithm for optimal weightings from 0 to 0.5 on the “Tsukuba” dataset. Fig 5 indicates that a weighting of 0.1 is relatively better within the range of 0 to 0.5: at 0.1 the percentage of badly matched pixels is lowest, because the images blur when the weighting is relatively large.

Fig 5.

(a) Performance of the optimal weighting from 0 to 0.5 between the intensity measures and gradient measures on the Middlebury “Tsukuba” dataset. (b) Performance of α1 and α2 on the Middlebury “Tsukuba” dataset.

https://doi.org/10.1371/journal.pone.0137530.g005

We also evaluate the probability thresholds α1 and α2 through the errors in non-occluded regions, as shown in Fig 5b. The results appear fairly robust and stable in the range of α1 ∈ [1.5,3] and α2 ∈ [1.5,3], with errors ranging from 0.0183 to 0.0198. As seen from Fig 5b, the lowest value occurs when α1 = 2 and α2 = 1.5.

Experimental Results

We evaluate our proposed algorithm on the Middlebury and new 2014 Middlebury datasets [4]. First, the overall performance of our algorithm is reported on the Middlebury datasets, where it is compared with other excellent BP algorithms. We then also test our algorithm on additional Middlebury and new 2014 Middlebury images.

Measure Settings.

To conveniently evaluate the performance of our algorithm, we list abbreviations for the six different measures according to the percentages of ‘bad’ pixels [4], as shown in Table 1:

Performance on the Middlebury data set.

A. Total Performance. Generally, we evaluate the performance of disparity maps using the percentages of ‘bad’ pixels, primarily “non.”, “all.” and “disc.”. Table 2 reports a comparison of our proposed algorithm with other state-of-the-art BP methods on the Middlebury datasets “Tsukuba”, “Venus”, “Teddy” and “Cones” with an error threshold of 1. It can be seen that our WIVBP (IVBP combined with image segmentation) achieves better accuracy than other state-of-the-art BP methods and even local methods [29,30], demonstrating the competitiveness of our proposed method. Meanwhile, our IVBP intrinsically improves the accuracy of BP. Table 2 shows that IVBP outperforms the other BP algorithms [25,31,32,33,34,35,36,37], with the lowest average errors except for TSGO [38]. Note that the errors of our IVBP are greatly reduced compared with the other BP methods except TSGO. At present, TSGO is the best BP algorithm listed on the Middlebury benchmark. However, TSGO relies on four post-processing methods, which greatly decrease its average error from 5.70% to 4.06% (see [38]). In contrast, our IVBP does not use post-processing steps and is better than TSGO without its post-processing. Our IVBP is thus one of the best BP algorithms at present.

Table 2. Comparison of the results with an error threshold of 1 in the Middlebury dataset.

Our algorithms (IVBP and WIVBP) are shown.

https://doi.org/10.1371/journal.pone.0137530.t002

Our algorithm also has better edge-preserving smoothing effects than the other excellent BP methods, especially near disparity discontinuities and in textureless regions. For simplicity, we show the results of only five BP methods, as shown in Fig 6. Our method performs better because SDDT makes the inference of the BP algorithm more accurate, and an accurate initial value further helps achieve better accuracy.

Fig 6. Comparison of our algorithm and five other BP algorithms with the Middlebury datasets “Tsukuba” and “Cones”, respectively.

Note that our algorithm has much better edge-preserving smoothing effects than the other BP algorithms.

https://doi.org/10.1371/journal.pone.0137530.g006

B. Analysis of Accuracy Improvements. Adding the proposed components progressively improves the accuracy of BP, as shown in Table 3, which reports the results of BP with SDDT, IVBP and WIVBP and thus the individual contributions of SDDT and IVBP. It can be seen that SDDT and IVBP are crucial for obtaining excellent BP results.

Table 3. Comparison of the results with an error threshold of 1 in the Middlebury dataset.

Our algorithms BP with SDDT (denoted BP-SDDT), IVBP and WIVBP are shown.

https://doi.org/10.1371/journal.pone.0137530.t003

For comparison, we provide a visual comparison of the progressive improvement across the four proposed methods, namely BP, BP-SDDT, IVBP and WIVBP. Fig 7 shows a comparison diagram of the errors of the four methods (BP, BP-SDDT, IVBP, and WIVBP) under the non-occluded measure on the Middlebury datasets “Tsukuba”, “Teddy”, “Cones” and “Venus”, and Fig 8 shows that each progressive improvement is clear, especially in the textureless and occluded regions. Specifically, relative to the intensity measure alone, BP-SDDT performs better, possibly because SDDT incorporates gradient measures (contextual constraint information) that effectively improve the matching decisions. For IVBP, an accurate initial value helps raise the probability of correct information propagation.

Fig 7. Comparison diagram of the errors using four methods (BP, BP-SDDT, IVBP, and WIVBP) in non-occluded measures (non.) in the Middlebury datasets “Tsukuba”, “Teddy”, “Cones” and “Venus”.

https://doi.org/10.1371/journal.pone.0137530.g007

Fig 8. Performance on the Middlebury datasets using four methods (BP, BP-SDDT, IVBP, and WIVBP).

From left to right: “Teddy”, “Tsukuba”, “Venus” and “Cones”. From top to bottom: left reference images, image segmentation maps, disparity maps of BP, BP-SDDT, IVBP, and WIVBP, and the ground truth disparity maps.

https://doi.org/10.1371/journal.pone.0137530.g008

For more intuitive comparisons, we rebuild 3D scenes and their contour maps based on the disparity maps of the previous four methods (BP, BP-SDDT, IVBP, and WIVBP) for “Tsukuba” and “Cones” in Fig 9. IVBP and WIVBP have better 3D effects and accurate contours, especially in the textureless and discontinuous regions.

Fig 9. Comparison of our algorithms by 3D and contours on the Middlebury datasets “Cones” and “Tsukuba”.

(a) BP, (b) BP-SDDT, (c) IVBP, and (d) WIVBP.

https://doi.org/10.1371/journal.pone.0137530.g009

We evaluate our algorithms on all six measures listed in Table 1. Fig 10 shows that our algorithm outperforms the traditional BP algorithm on the six measures for “Tsukuba”, “Venus” and “Cones” but not for “Teddy”. Because the occluded-measure errors are relatively high, we depict them separately in Fig 10e to avoid the influence of the other measures. In Fig 10e, the errors of our algorithms (BP-SDDT, IVBP, and WIVBP) are nearly always lower than those of BP, especially for WIVBP. Occlusion is often caused by the loss of corresponding pixels in the left or right view. We may assume that the color of an occluded pixel is similar to that of its neighboring pixels; therefore, the results of image segmentation on the left view can help judge the disparity values in the occluded regions.

Fig 10. Error statistics for the percentage of ‘bad’ matching pixels at six different thresholds.

(a) “Tsukuba”, (b) “Teddy”, (c) “Venus”, and (d) “Cones”, including all pixels, pixels in non-occluded areas, pixels in textured areas, pixels in textureless areas and pixels in discontinuous areas. (e) Error statistics for the four methods in the occluded areas.

https://doi.org/10.1371/journal.pone.0137530.g010

Performance for Other Images

Here, we analyze our algorithm on the more challenging Middlebury and new 2014 Middlebury datasets and compare the results with some excellent BP algorithms, such as HBP [9], Real-time Hierarchical Belief Propagation (Realtime BP) [36], Constant-Space Belief Propagation (CSBP) [37], and Two Step Global Optimization (TSGO) [38].

In Fig 11, our algorithm preserves more distinct edges, decreases ambiguous disparities and produces smoother results in low-texture areas on the Middlebury datasets from 2001, 2003, 2005, and 2006 compared with four excellent BP algorithms. We stress that the output of our algorithm is much closer to the ground truth disparity maps than that of the other BP algorithms. Table 4 shows the errors on the six datasets “Aloe”, “Art”, “Flowerpot”, “Cloth3”, “Baby1” and “Wood1”. Note that the errors of our IVBP are greatly reduced compared with the other BP methods except TSGO. Recall that TSGO relies on four post-processing methods, which greatly reduce its errors; in contrast, our IVBP does not use post-processing steps and is competitive with TSGO.

Fig 11. Results on the Middlebury datasets.

(a) “Aloe”, (b) “Art”, (c) “Flowerpot”, (d) “Cloth3”, (e) “Baby1”, and (f) “Wood1”. From top to bottom: disparity maps of RealtimeBP, CSBP, HBP, TSGO, IVBP and the ground truth disparity maps. Edge-preserving smoothing effects are indicated by pink arrows in our algorithm.

https://doi.org/10.1371/journal.pone.0137530.g011

Table 4. Results with the Middlebury datasets “Aloe”, “Art”, “Baby1”, “Wood”, “Flowerpot”, and “Cloth3”.

The errors by IVBP are greatly reduced compared with other BP algorithms.

https://doi.org/10.1371/journal.pone.0137530.t004

Fig 12 provides a visual comparison based on the synthesized views using the disparity maps from Fig 11. The performance improvement of our algorithm is clear. Note that our algorithm has better accuracy and edge-preserving smoothing effects compared with other BP algorithms, as shown in Figs 11 and 12.

Fig 12. Synthesized views with HBP and IVBP from Fig 11 in the Middlebury datasets.

Poorer edge-preserving smoothing properties are observed for HBP compared with IVBP, as indicated by pink arrows.

https://doi.org/10.1371/journal.pone.0137530.g012

Presently, the old Middlebury benchmark suffers from over-fitting, whereas the new 2014 Middlebury datasets consist of more challenging and realistic scenes. In Fig 13, the accuracy of IVBP is greatly improved in these sophisticated scenarios compared with the HBP and CSBP algorithms.

Fig 13. Performance with the 2014 new Middlebury datasets.

(a) “Livingroom”, (b) “Djembe”, (c) “Australia”, and (d) “Plants”. From top to bottom: left reference images, disparity maps of BP algorithm, disparity maps of CSBP algorithm, disparity maps of IVBP, and the ground truth disparity maps.

https://doi.org/10.1371/journal.pone.0137530.g013

Running Time Performance.

Our algorithm also greatly accelerates the convergence of BP while maintaining good accuracy. Here, we analyze the improvement in convergence speed in detail.

For traditional BP, a large number of iterations is required to guarantee convergence and achieve high-quality results. In contrast, our IVBP requires few iterations and converges quickly.

Our algorithm can greatly decrease the number of iterations. According to Eq (3), the energy function value Ei(·) is updated after iteration i. For every pair of iterations i and i−1, if the L1-norm of the relative energy error is larger than a threshold η, i.e., |(Ei(·)−Ei-1(·))/Ei(·)| ≥ η, Ei(·) is updated; otherwise, the algorithm is declared to have converged. In our experiments, we set η = 10−4 and η = 10−5, respectively. Table 5 shows that IVBP needs only 3 iterations when η = 10−4, whereas BP needs 8 iterations. IVBP thus requires far fewer iterations to converge than BP.
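
A minimal sketch of this stopping rule follows; update_messages and energy are caller-supplied callbacks (assumed here), standing for one message-passing sweep and the evaluation of Eq (3).

```python
def run_bp_until_converged(update_messages, energy, max_iters=50, eta=1e-4):
    """Iterate BP until the relative energy change drops below eta,
    i.e. |(E_i - E_{i-1}) / E_i| < eta, as described above."""
    prev_E = None
    for i in range(max_iters):
        update_messages()      # one message-passing sweep over the grid
        E = energy()           # current energy, Eq (3)
        if prev_E is not None and abs((E - prev_E) / E) < eta:
            return i + 1       # number of iterations used
        prev_E = E
    return max_iters
```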

Table 5. Comparison of the performance of BP and IVBP on the “Tsukuba” dataset.

https://doi.org/10.1371/journal.pone.0137530.t005

Experiments demonstrate that the number of iterations required by IVBP is drastically reduced while maintaining high accuracy. The reduction may occur because a more accurate initial value, which is much closer to the true disparity values with much higher probability, requires fewer iterations than traditional BP to reach convergence. As shown in Fig 14a, our algorithm converges in only a few iterations and then remains stable with a high accuracy (error of about 1.83%). In Fig 14b, traditional BP requires a large number of iterations and converges slowly, with relatively high errors of about 4.2%.

Fig 14. Iteration images in “Tsukuba”.

(a) Iteration images with IVBP. (b) Iteration images with BP and IVBP.

https://doi.org/10.1371/journal.pone.0137530.g014

Reducing the number of iterations reduces the running time. For traditional BP, the time complexity is O(NDL), where N is the number of pixels, D the number of disparity hypotheses, and L the number of iterations. In contrast, our IVBP greatly reduces the number of iterations (Lc < L), so its time complexity is O(NDLc) < O(NDL). With only a few iterations, the complexity is approximately O(ND).

To compare the running time of BP-SDDT with that of traditional BP, we perform experiments on the Middlebury datasets “Teddy”, “Tsukuba”, “Venus” and “Cones” as examples. To reduce the influence of chance, we run the analysis ten times and compute the mean running time. As shown in Fig 15, although SDDT is added, the additional running time is nearly negligible. The ratios of additional running time to that of traditional BP on “Tsukuba”, “Venus”, “Sawtooth”, and “Cones” are 5.3%, 3.97%, 4.53%, and 5.08%, respectively, and the overall average of the four ratios is only 4.72%. Compared with traditional BP, the overhead of SDDT is negligible while accuracy improves.

Fig 15. Comparison diagram of mean running time in the Middlebury datasets “Tsukuba”, “Teddy”, “Cones” and “Venus”.

https://doi.org/10.1371/journal.pone.0137530.g015

We experiment on a laptop with an Intel Core i3 2.37 GHz CPU and 4 GB RAM; the implementation is in C++. To analyze the trade-off between complexity and accuracy, we list both the processing time and the errors in Table 6. Note that the processing time on a single-core CPU was measured for “Tsukuba” using the published programs [9,36,37,38], and the average error was calculated for all the test sequences. Our algorithm converges in three iterations at the 10−4 threshold, taking 0.546 seconds, and in seven iterations at the 10−5 threshold, taking 1.482 seconds. Our algorithm is faster than the other BP algorithms while achieving better accuracy; it is even faster than other excellent local methods.

Table 6. Comparison with other methods on the runtime and errors for “Tsukuba”.

https://doi.org/10.1371/journal.pone.0137530.t006

Conclusions

We propose an improved BP algorithm, combining a novel IVBP and SDDT, with regard to accuracy and convergence speed. First, we propose SDDT to improve the data term of BP. Compared with intensity-based measures alone, SDDT effectively utilizes gradient-based contextual information to achieve higher accuracy and smoother results with similar running times. Second, inspired by the importance of the initial value in complex nonlinear problems, an IVBP algorithm is presented that improves both convergence speed and accuracy by assigning disparities different probabilities using GF instead of equal probabilities. These methods improve both convergence speed and accuracy and constitute an exploration of the combination of local and global methods. The methods are evaluated on the Middlebury and new 2014 Middlebury datasets. Experimental results demonstrate that the proposed method maintains superior performance and better edge-preserving smoothing effects compared with some excellent BP algorithms, especially near disparity discontinuities and in textureless regions. Our IVBP is one of the best BP algorithms at present.

Future work should aim to incorporate more contextual measures, such as differential geometry properties, into the data term of the BP algorithm.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61571313, Grant 61173182, and Grant 61411130133, and in part by Sichuan Province under Grant 2014HH0048 and Grant 2014HH0025.

Author Contributions

Conceived and designed the experiments: XW YL. Performed the experiments: XW. Analyzed the data: XW. Contributed reagents/materials/analysis tools: XW. Wrote the paper: XW YL.

References

  1. Yang Q, Wang L, Yang R, Stewénius H, Nistér D (2009) Stereo matching with color-weighted correlation, hierarchical belief propagation and occlusion handling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31:492–504.
  2. Neilson D, Yang YH (2011) A component-wise analysis of constructible match cost functions for global stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11):2147–2159.
  3. Mei X, Sun X, Zhou MC, Jiao SH, Wang HT, Zhang XP (2011) On building an accurate stereo matching system on graphics hardware. In GPUCV'11: ICCV Workshop on GPU in Computer Vision Applications, 467–474.
  4. Scharstein D, Szeliski R (2002) A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47:7–42.
  5. Brown MZ, Burschka D, Hager GD (2003) Advances in computational stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:993–1008.
  6. Yang Q, Wang L, Ahuja N (2010) A constant-space belief propagation algorithm for stereo matching. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1458–1465.
  7. Sun J, Zheng N, Shum Y (2003) Stereo matching using belief propagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7:787–800.
  8. Tappen MF, Freeman WT (2003) Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters. In Proceedings of the IEEE International Conference on Computer Vision, 900–907.
  9. Felzenszwalb PF, Huttenlocher DP (2006) Efficient belief propagation for early vision. International Journal of Computer Vision, 1:41–54.
  10. Mei T, An L, Bhanu B (2015) Context guided belief propagation for remote sensing image classification. Applied Optics, 54(11):3372–3382. pmid:25967326
  11. Wang L, Yang R (2011) Global stereo matching leveraged by sparse ground control points. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 3033–3040.
  12. Yang Q, Tan KH, Ahuja N (2009) Real-time O(1) bilateral filtering. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 557–564.
  13. Rhemann C, Hosni A, Bleyer M, Rother C, Gelautz M (2011) Fast cost-volume filtering for visual correspondence and beyond. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 3017–3024.
  14. He K, Sun J, Tang X (2010) Guided image filtering. In European Conference on Computer Vision, 1–14.
  15. Pugeault N, Wörgötter F, Krüger N (2010) Disambiguating multi-modal scene representations using perceptual grouping constraints. PLoS ONE, 5(6):e10663. pmid:20544006
  16. Klaus A, Sormann M, Karner K (2006) Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. In International Conference on Pattern Recognition, 3:15–18.
  17. Li G, Zucker SW (2006) Surface geometric constraints for stereo in belief propagation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2355–2362.
  18. Jung HY, Lee KM, Lee SU (2011) Stereo reconstruction using high order likelihood. In Proceedings of the IEEE International Conference on Computer Vision, 1211–1218.
  19. Woodford O, Torr P, Reid I, Fitzgibbon A (2008) Global stereo reconstruction under second order smoothness priors. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 900–907.
  20. Hiroyuki S, Yoshihiko H, Danushka B, Hitoshi I (2015) Improved sampling using loopy belief propagation for probabilistic model building genetic programming. Swarm and Evolutionary Computation, 23:1–10.
  21. Hong L, Chen G (2004) Segment-based stereo matching using graph cuts. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1:74–81.
  22. Wu P, Liu Y, Li Y, Liu B (2015) Robust prostate segmentation using intrinsic properties of TRUS images. IEEE Transactions on Medical Imaging, 34(6):1321–1335. pmid:25576565
  23. Su Q, Wu YC (2015) On convergence conditions of Gaussian belief propagation. IEEE Transactions on Signal Processing, 63(5):1144–1155.
  24. Sun J, Li Y, Kang SB, Shum HY (2005) Symmetric stereo matching for occlusion handling. In International Conference on Pattern Recognition, 2:399–406.
  25. Gu Z, Su X, Liu Y, Zhang Q (2008) Local stereo matching with adaptive support-weight, rank transform and disparity calibration. Pattern Recognition Letters, 29(9):1230–1235.
  26. Zabih R, Woodfill J (1994) Non-parametric local transforms for computing visual correspondence. In Proceedings of the Third European Conference on Computer Vision, 150–158.
  27. Hosni A, Bleyer M, Gelautz M, Rhemann C (2009) Local stereo matching using geodesic support weights. In IEEE International Conference on Image Processing, 2093–2096.
  28. Yoon KJ, Kweon IS (2008) Adaptive support-weight approach for correspondence search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 650–656.
  29. Mozerov MG, van de Weijer J (2015) Accurate stereo matching by two-step energy minimization. IEEE Transactions on Image Processing, 24(3):1153–1163.
  30. Min D, Lu J, Do MN (2013) Joint histogram-based cost aggregation for stereo matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(10):2539–2545.
  31. Yang Q, Engels C, Akbarzadeh A (2008) Near real-time stereo for weakly-textured scenes. In British Machine Vision Conference, 1:924–931.
  32. Barzigar N, Roozgard A, Cheng S, Verma P (2012) SCoBeP: Dense image registration using sparse coding and belief propagation. Journal of Visual Communication and Image Representation, 24(2):137–147.
  33. Montserrat T, Civit J, Escoda O, Landabaso JL (2009) Depth estimation based on multiview matching with depth/color segmentation and memory efficient belief propagation. In IEEE International Conference on Image Processing, 2353–2356.
  34. Zitnick L, Kang SB (2007) Stereo for image-based rendering using image over-segmentation. International Journal of Computer Vision, 75(1):49–65.
  35. Kowalczuk J, Psota ET, Perez LC (2013) Real-time stereo matching on CUDA using an iterative refinement method for adaptive support-weight correspondences. IEEE Transactions on Circuits and Systems for Video Technology, 23(1):94–104.
  36. Larsen S, Mordohai P, Pollefeys M, Fuchs H (2007) Temporally consistent reconstruction from multiple video streams using enhanced belief propagation. In Proceedings of the IEEE International Conference on Computer Vision, 1–7.
  37. Yang Q, Wang L, Yang R, Wang S, Liao M, Nistér D (2006) Real-time global stereo matching using hierarchical belief propagation. In Proceedings of the British Machine Vision Conference, 989–998.
  38. Yang Q, Wang L, Ahuja N (2010) A constant-space belief propagation algorithm for stereo matching. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1458–1465.