

The meaning of the preconditioning variable $ \bold p$

To accelerate convergence of iterative methods, we often change variables. The model-styling regression $ \bold 0 \approx \epsilon \bold A \bold m$ is changed to $ \bold 0 \approx \epsilon \bold p$. Experience shows, however, that the variable $ \bold p$ is often more interesting to look at than the model $ \bold m$. Why should a new variable introduced for computational convenience turn out to have more interpretive value? There is a little theory underlying this. Begin from

$\displaystyle \bold 0 \quad\approx\quad \bold W (\bold F \bold m -\bold d)$ (23)
$\displaystyle \bold 0 \quad\approx\quad \epsilon \bold A \bold m$ (24)

Introduce the preconditioning variable $ \bold p = \bold A \bold m$, so that $ \bold m = \bold A^{-1}\bold p$.

$\displaystyle \bold 0 \quad\approx\quad \bold W (\bold F \bold A^{-1}\bold p -\bold d)$ (25)
$\displaystyle \bold 0 \quad\approx\quad \epsilon \bold p$ (26)

Rewrite this as a single regression

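$\displaystyle \bold 0 \quad\approx\quad \left[ \begin{array}{c} \bold r_d \\ \bold r_m \end{array} \right] \quad=\quad \left[ \begin{array}{c} \bold W \bold F \bold A^{-1} \\ \epsilon \bold I \end{array} \right] \bold p \quad - \quad \left[ \begin{array}{c} \bold W \bold d \\ \bold 0 \end{array} \right]$ (27)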

In an earlier chapter we learned that the least-squares solution is reached when the residual is orthogonal to the fitting functions. The fitting functions are the columns of the matrix, or equivalently the rows of its transpose. So we simply multiply the regression (27) by the adjoint operator and replace the $ \approx$ by $ =$:

$\displaystyle \bold 0 \quad=\quad (\bold W\bold F\bold A^{-1})^{\top}\bold r_d + \epsilon \bold r_m$ (28)
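
Equation (28) is easy to check numerically. The following is a minimal sketch, not code from this book: the operators `F`, `W`, `A`, the data `d`, and the value of `eps` are small random stand-ins invented for illustration, and NumPy's dense least-squares solver replaces the iterative solver. It solves the stacked regression for $ \bold p$, forms the two residuals, and evaluates the right side of (28). It also confirms that $ \bold m = \bold A^{-1}\bold p$ agrees with the solution of the original regressions (23) and (24).

```python
import numpy as np

rng = np.random.default_rng(0)
nd, nm = 8, 5                               # data and model sizes (arbitrary)

# Hypothetical stand-ins for the operators in the text.
F = rng.normal(size=(nd, nm))               # modeling operator
W = np.diag(rng.uniform(0.5, 2.0, nd))      # diagonal data weights
A = np.eye(nm) - 0.5 * np.eye(nm, k=-1)     # invertible roughener (unit lower bidiagonal)
d = rng.normal(size=nd)                     # data
eps = 0.3                                   # regularization scale

# Stacked regression in the preconditioned variable p.
B = W @ F @ np.linalg.inv(A)
G = np.vstack([B, eps * np.eye(nm)])
rhs = np.concatenate([W @ d, np.zeros(nm)])
p, *_ = np.linalg.lstsq(G, rhs, rcond=None)

r_d = B @ p - W @ d                         # data-space residual
r_m = eps * p                               # model-space residual
balance = B.T @ r_d + eps * r_m             # right side of (28), should vanish

# Same problem solved directly in the model variable m.
G0 = np.vstack([W @ F, eps * A])
m_direct, *_ = np.linalg.lstsq(G0, rhs, rcond=None)
m = np.linalg.solve(A, p)                   # m = A^{-1} p

print("max |balance|    :", np.abs(balance).max())
print("max |m - m_direct|:", np.abs(m - m_direct).max())
```

Both printed numbers should be at the level of floating-point roundoff. The change of variables does not change the answer; it only changes which variable the solver works in.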

Equation (28) tells us that at the best solution to the regression there is a fight between the data-space residual and the model-space residual. It's a battle between our preconceived statistical model, expressed in our model styling, and the model wanted by the data. Except for the scale factor $ \epsilon$, the model-space residual $ \bold r_m$ is the preconditioning variable $ \bold p$. That's why the variable $ \bold p$ is interesting to inspect and interpret.
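
One way to make this concrete uses only equations (26) and (28). The model-space residual is $ \bold r_m = \epsilon \bold p$, so substituting it into (28) and solving for $ \bold p$ gives

$\displaystyle \bold p \quad=\quad -\,\epsilon^{-2}\, (\bold W\bold F\bold A^{-1})^{\top}\bold r_d$

Read this way, at the solution $ \bold p$ is the weighted data residual carried back into model space by the adjoint operator and scaled by $ \epsilon^{-2}$. It can be large only where the unfit data project strongly onto the model.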

The preconditioning variable $ \bold p$ is not simply a computational convenience. Its size measures (in model space) the conflict of our acquired data with our preconceived theory expressed by our model styling. It points to locations of interest.

If I were young and energetic like you, I would write a new basic tool for optimization. Instead of scanning only the space of the gradient and the previous step, it would also scan over the ``smart'' direction. This should offer the benefit of preconditioning the regularization at early iterations while offering more assured fitting of the data at late iterations. The improved module for cgstep would need to solve a $ 3\times 3$ system of equations.
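
The core of such a tool might look like the following. This is a hypothetical sketch, not the cgstep module and not its interface: the function name `three_direction_step` and its arguments are invented here. It assumes the residual updates as $ \bold r \leftarrow \bold r + \alpha \bold G_g + \beta \bold G_s + \gamma \bold G_h$, where the three vectors are the data-space images of the gradient, the previous step, and the smart direction, and it picks the three coefficients that minimize the residual energy by solving a $ 3\times 3$ system.

```python
import numpy as np

def three_direction_step(r, Gg, Gs, Gh):
    """Hypothetical three-direction search.

    r          : current residual
    Gg, Gs, Gh : data-space images of the gradient, the previous step,
                 and the "smart" direction
    Returns the coefficients (alpha, beta, gamma) minimizing
    |r + alpha*Gg + beta*Gs + gamma*Gh|^2.
    """
    G = np.column_stack([Gg, Gs, Gh])
    normal = G.T @ G                  # the 3x3 matrix
    rhs = -G.T @ r
    # lstsq rather than solve: the matrix is singular when a direction
    # is zero, e.g. the previous step on the first iteration.
    coef, *_ = np.linalg.lstsq(normal, rhs, rcond=None)
    return coef

# Small random demonstration (all values are made up).
rng = np.random.default_rng(1)
r = rng.normal(size=20)
Gg, Gs, Gh = rng.normal(size=(3, 20))
alpha, beta, gamma = three_direction_step(r, Gg, Gs, Gh)
r_new = r + alpha * Gg + beta * Gs + gamma * Gh

print("residual norm before:", np.linalg.norm(r))
print("residual norm after :", np.linalg.norm(r_new))
```

After the step, the new residual is orthogonal to all three directions, which is the same orthogonality condition used to derive (28), applied here in a three-dimensional subspace.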



2011-08-20