CEng 583 Computer Vision Term-Project

Shape From Shading: Deterministic and Stochastic Optimization

Emre Ugur, Emre Akbas
emre.{akbas|ugur}@ceng.metu.edu.tr
Department of Computer Engineering
Middle East Technical University



Table of contents

  1. Project proposal
  2. Implementation
  3. Future Work
  4. References

Project Proposal

Here is our project proposal:
Proposal | Presentation

Implementation

The source code of the project (as a single C++ source file): View | Download
Tested on a Linux 2.4 box with gcc-3.2.2, compile with: g++ -lm

Deterministic Optimization

In the first part of the project, deterministic optimization method is applied to the problem. Energy function is defined by the combination two different error measures. First is the brightness error, which is defined as the different between the image brightness value and the reflectance value of corresponding pixel. Minimization of brigthness error is not sufficient for the solution of the problem, therefore smoothness measure of the gradient space, as a regulization term is inserted into the energy function. The discussion about minimization of energy functional can be found in project proposal.

Experiments and Results

Mainly the following two images are used for experiments:
Figure 1: SphereFigure 2: Beethoven
In this section, we examine a number of parameters which have significant impact on the results.

Deterministic methods are extremely sensitive to initial conditions where iterations start. Since gradient space is the unknown of the SFS problem, we should take extra care on initial values of gradient values, namely p and q. The following figures shows three different initial configurations for gradient space.
Figure 3Figure 4
Figure 5
In figure 4, only boundary values of gradient space is initialized to random values. In figure 5, all sites in the gradient space is initialized to random values. In figure 2, a good starting point is given, only cells on boundary is initialized to special values. The results of randomly initialized initial configurations are shown in the following figures:
Figure 6Figure 7
Figure 8
Note that the above surfaces are reconstructed from the uniform sphere (whose original image is Figure 1 at the top of the page) using random initial configurations.

Since there is no proof of convergence of deterministic minimization method in SfS, with different random configurations, solution seems to be stucked in local minima. Figures 9,10 and 11 below were obtained using image irradiance equation. Each different random initialization results in different gradient values which are stuck in local minima. Since reflectance value in each cell should equal to the irradiance of corresponding pixel, each pixel's gray level value can be computed. By both examining the brightness error and by comparing resulting image and original image, gradient space seems to be very close to the solution. However, constructed surfaces of each randomly initialized experiment shows that the resulting 3D surface is not similar to the original one.

Figure 9Figure 10Figure 11

Above experiments showed that starting from different initial configurations may result in different results which minimize brightness error. In figure 3, initial gradient space is given as a good starting point for the sphere example. Thus, reconstructed shape, which is given in Figure 12 below is the correct solution.

Figure 12
Watch the animation of the deterministic optimization method with a good initial configuration: deterministic1.gif (animated gif)
And deterministic optimization method with random initial configuration(borders): deterministic2.gif (animated gif)
Following two images are the last scenes (the converged needle maps) of the animations above.
The last scene of the first animation above: This is the resulting needle map when deterministic optimization is applied with a good initial configuration. Algorithm converges to the correct solution The last scene of the second animation above This is the resulting needle map when deterministic optimization is applied with a random initial configuration. Algorithm converges to a poor solution

The lambda parameter which specifies the change rate in each iteration is another important parameter. Lambda is the coefficient of the term involving the image irradiance equation. In order to obtain successful results, lambda parameter should be set carefully. In experiments performed, best results obtained when lambda is in the range [0.6,0.9]. Figures 13,14,15 and 16 show from-gradient-reconstructed images (i.e. from the p-q values using the image irradiance equation) with different lambda values. When lambda is low, the effect of the brightness error in minimization decreases, thus obtained reflectance values become smoother. However, when lambda is higher, reflectance values are minimized according to image irradiance values.

Figure 13: lambda=0.1Figure 14: lambda=0.5
Figure 15: lambda=0.9 Figure 16: lambda=1.5

Another set of experiments were conducted for the Beethoven image (Figure 2 above), The reconstructed 3D shape of the given image is shown in Figures 17 and 18 below. The unexpected height values in Figure 17 is the result of shadows which appear in the image. However, when seen from top, although all image is composed of discontinuous regions, the shape of the Beethoven statue is extracted.

Figure 17: Size viewFigure 18: Top view

Last set of experiments are made for understanding dynamics of the iterative scheme and determining the stopping point. In each iteration, total brightness error is computed. Figure 19 shows that after some time, brightness error converges to a constant value. If the shape of the reconstructed surface is examined, it can be seen that after the time where error becomes constant, there is no qualitative change in the resulting surface.

Figure 19: Brightness error

Stochastic Optimization

Stochastic vs. Deterministic Optimization: A brief discussion

Natural phenomena consist of various uncertantities. As vision processes are no exception to this fact, optimization modeling in computational vision problems is a rational and useful paradigm. In their survey Zhang et al. found out that the most accurate results were produced by optimization methods. Optimization methods can be divided into two:

  • Deterministic Optimization Methods
  • Stochastic Optimization Methods

The first family of optimization methods consist of two main strategies: directly minimizing the discrete version of the energy function, and discrete solution of the corresponding Euler-Lagrange equations. Szeliski, in <--!TODO--> his paper Fast Shape From Shading, quoted that solving the discrete version of Euler equations do not strictly correspond to minimizing energy, since all local extrema of the energy functional are solutions of these Euler equations. Furthermore, deterministic optimization methods have the problem of high dependance to the initial configuration. This problem is clearly illustrated in figures 6,7, and 8 above.

However, deterministic optimization methods are fast (when compared to stochastic methods) and give good results if initial configuration and boundary conditions are given correctly.

Stochastic optimization methods, on the other hand, finds a global minimum of the underlying energy functional. Initial configuration has no effect on the solution. However, the drawback of these methods is their extreme slowness.

A Bayesian Framework

Shape from shading(SFS) is a classical labeling problem in computational vision and it belongs to the regular sites with continuous labels group of labeling problems as described in [2].

SFS can be re reformulated in a Bayesian framework. Let $ E$ denote the data (i.e. the image) and $ w$ denote the set of configurations. A configuration is a complete labeling of all sites (pixels) where a label means a p-q pair. Given the data $ E$ we want to find the configuration that maximizes the a posteriori probability $ P(w\vert E)$. By Bayes rule:

$\displaystyle P(w\vert E) = \frac{P(E\vert w)P(w)}{P(E)} \propto P(E\vert w)P(w)$ (1)

The term $ P(E\vert w)$ in (1) represents the given data and is called the likelihood, whereas the term $ P(w)$ is called the prior in which we encode the contextual constraints, such as smoothness and integrability. As quoted in [2], Bayes statistics is a theory of fundamental importance in estimation, and according to this theory, the best configuration (i.e. labeling of all sites) can be estimated from the knowledge of the prior distribution and likelihood of the underlying pattern. Such schemes are called MAP (maximum a posteriori) solutions. Markov Random Fields (MRF) are common and efficient models in this framework.[1]

Click here for a definition of Markov Random Fields adapted from [2]

A Markov Model for the SfS Problem

As mentioned above, by Bayes rule we can write the maximum a posteriori probability in terms of the likelihood and prior distributions (see equation (1)). We can encode the contextual constraint in the prior term $ P(w)$. Thus, we can rewrite it as $ P_1(w)P_2(w)P_3(w)$ which respectively define smoothing constraint, integrability constraint and a prior on the labeling (i.e p-q space) distribution. Then, the MAP becomes:

$\displaystyle P(w\vert E) \propto P(E\vert w)P_1(w)P_2(w)P_3(w)$ (2)

In the deterministic optimization approach we derived the energy functional:

$\displaystyle F(p,q) = \iint_{x,y \in \Omega}[R(p,q)-E(x,y)]^2dxdy+ \lambda_{in...
	...da_{smo}\iint_{x,y \in \Omega}[\vert\nabla p\vert^2 + \vert\nabla q\vert^2]dxdy$ (3)

The above equation embeds the image irradiance equation, the smoothing constraint and the integrability constraint. The fact that this energy can be interpreted as a sum of local functions which depend only on neighboring pixels satisfies the Markovian property. By The Hammersley-Clifford theorem we can write a MRF as a Gibbs field, thus:

$\displaystyle P(w) = \frac{1}{Z}e^{-F(w)}=\frac{1}{Z}e^{-\sum_{c \in C}V_c(w_s,s \in c)}$ (4)

where $ c$ is clique of S (set of sites) and $ V_c$ is the clique potential. In [1], the first three terms on the right hand side of (2) are proposed to be written as:

$\displaystyle P(E\vert w)P_1(w)P_2(w) = \frac{1}{Z}e^{-F(w)}$ (5)

Prior on p-q state space

The prior on the p-q distribution $ P_3(w)$ is considered as a uniform distribution on the northern hemisphere of the Gaussian sphere. As the slope $ \rho=\sqrt{p^2+q^2}$, this distribution is proportional to:

$\displaystyle \frac{\rho}{(1+p^2)^{3/2}}$ (6)

The bounds of the state space of p-q is of great importance. They can be computed using the image irradiance equation and $ E_{max}$ and $ E_{min}$, which are highest and lowest gray values, respectively. By multiplying the reflectance function by $ E_max$, we obtain:

$\displaystyle \frac{E_{max}}{\sqrt{1+p_{i,j}^2+q_{i,j}^2}} = E_{i,j}$ (7)

For the $ i,j$ where E has its lowest gray-level, the slope ( $ \sqrt{p^2+q^2}$) is maximum. If we rearrange the above equation:

$\displaystyle (\frac{E_{max}}{E_{min}})^2-1 \ge p_{i,j}^2+q_{i,j}^2$ (8)

which gives us the boundary of the state space of p-q.

Simulated Annealing Algorithm

To optimize the derived Markov model, the following simulated annealing algorithm is proposed:
	1. Initialize a random configuration and the starting temperature
	2. For each pixel (i,j) in the image
		2.1 Choose a random new label (i.e. a p-q pair) in the p-q state space
		2.2 Compute the acceptance ratio R
		2.3 Accept the new label with probability min(1,R)
	3. If the stopping criteria is reached stop,
		else decrement T and go to step 2
	

During each iteration, random changes are made in the current configuration(i.e. complete labeling of all sites with p-q values). The new state is then compared to the current state. If its new energy is lower than the current, this new state is immediately accepted. However, if the new state's energy is higher than the current's, it is accepted probabilistically. This probability depends upon the energy and the temperature value at that iteration. Generally, at high temperatures, many states will be accepted, while at low temperatures, the majority of these probabilistic moves will be rejected.

Acceptance ratio R

This ratio is computed for deciding whether the newly proposed label will be accepted or not. It is simply the ratio of the energy of the new configuration to the energy of the current configuration. In addition, this value embeds the ratio of the proposal distribution of the new label to the current label. According to these, the acceptance ratio is:

$\displaystyle R = exp[-\frac{F_{new}-F_{cur}}{T}]\frac{\rho_s^{new}(1+(\rho_s^{cur})^2)^{3/2}}{\rho_s^{cur}(1+(\rho_s^{new})^2)^{3/2}}$ (9)

where $ \rho_s$ is the slope of the label of the current site $ s$. Using the Markovian property, we can write the energy functional $ F$ depending only on the neighboring pixels:


    $\displaystyle F(p_{i,j},q_{i,j}) = \left\{ \frac{E_{max}}{\sqrt{1+p_{i,j}^2+q_{...
	...}}-E_{i,j} \right\}^2 +
	\lambda_{int}( [p_{i,j+1}-p_{i,j}-q_{i+1,j}+q_{i,j}]^2+$  
                                       $\displaystyle [p_{i,j}-p_{i,j-1}-q_{i+1,j-1}+q_{i,j-1}]^2+
	[p_{i-1,j+1}-p_{i-1,j}-q_{i,j}+q_{i-1,j}]^2 ) +$  
                                       $\displaystyle \lambda_{smo} ( [(p_{i,j+1}-p{i,j})^2+(q_{i,j+1}-q_{i,j})^2] +
	[(p_{i+1,j}-p{i,j})^2+(q_{i+1,j}-q_{i,j})^2] +$  
                                       $\displaystyle [(p_{i,j-1}-p{i,j})^2+(q_{i,j-1}-q_{i,j})^2] +
	[(p_{i-1,j}-p{i,j})^2+(q_{i-1,j}-q_{i,j})^2] )$  

Cooling scheme

The cooling scheme is where is the iteration number. has a crucial importance in the convergence and the execution time of the algorithm. The discussion about the values of these variables can be found below in the Experiments and Results section.

Experiments and Results

Watch the animation of the stochastic optimization method: stochastic.gif (animated gif). The following figure is the last scene of this animation:
This needle map is obtained with initial temperature=1000, stopping temperature=10 and cooling rate coefficient alpha=0.999998. The algorithm converges after about 15000 iterations in 26 minutes on a 2Ghz processor.

Temperature(T) is very important in the computation of the acceptance ratio (9). In the simulated annealing algorithm, if the randomly generated new value for a site imposes a negative change in the energy function (3), it is accepted, otherwise it is accepted with a propability equal to the acceptance ratio. When the temperature is high, the rate of accepted bad moves is high. However, as the temperature decreases, system begins to stabilize and the rate of accepted bad moves approaches to zero.

Initial temperature If it is too high, the algorithm conducts a random walk until an appropriate temperature is reached. This random walk has no useful effect on the result. The high initial temperature, in addition makes the execution time longer than necessary.
On the other hand, if the initial temperature is too low, the system quickly freezes out in metastable states. The needle map in Figure 20 below is obtained with initial temperature: 50, stopping temperature=1 and alpha=.99998

Figure 20

Stopping temperature If it is too high, the system cannot iterate enough to reach the correct solution and a needle map similar to that in Figure 20 is obtained. If it is lower than needed, only the execution time is affected negatively.

Cooling rate coefficient () Large values(in the vicinity of 1.0, should less than 1) results in accurate results since it allows enough number of iterations to be executed. Small values make the algorithm converge faster, but this gives poor results as seen in Figure 20 above.

The result obtained using Beethoven's image is surprizingly good. The needle map corresponding to Beethoven's image (in Figure 2) is below in Figure 21.

Figure 21

Future Work

  • Implement a hybrid method: Due to the its extreme slowness, stochastic optimization is not practical. However, a hybrid method coupling both the deterministic method and the stochatic one should work fast and give accurate results. The method could as follows: first the stochastic method is applied to low-resolution version of the input image, and then the output of the stochastic method is given as initial configuration to the deterministic method.
  • Intensity gradient constraint can be applied in addition to the smoothing and integrability constraints. This would make both the deterministic and the stochastic method to converge faster.
  • Preserving discontinuties. Real world images consist of many discontinuties, an approach that preserves discontinuities would be a more realistic solution to the SfS problem

References

In addition to the references we listed in our proposal , we referred to the following documents in the implementation phase: