biology daily - the biology and biochemistry encyclopedia

Tikhonov regularization


Tikhonov regularization is the most commonly used method of regularization of ill-posed problems. In its simplest form, an ill-conditioned system of linear equations

Ax = b,

where A is an m×n matrix, x is a column vector with n entries and b is a column vector with m entries, is replaced by the problem of seeking an x to minimize

\|Ax-b\|^2 + \alpha^2\|x\|^2\,

for some suitably chosen Tikhonov factor α > 0. Here ||·|| is the Euclidean norm.

This problem is now well conditioned and can be solved numerically. An explicit solution is given by

x = (A^T A + \alpha^2 I)^{-1} A^T b\,

where I is the n×n identity matrix. For α = 0 this reduces to the least squares solution of an overdetermined problem (m > n), provided A has full column rank.
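
As an illustration of the explicit solution, here is a minimal NumPy sketch. The function name tikhonov_solve and the small test matrix are assumptions made for this example and do not come from the article; the linear system is solved directly rather than forming the inverse.

```python
import numpy as np

def tikhonov_solve(A, b, alpha):
    """Return the x minimizing ||Ax - b||^2 + alpha^2 ||x||^2."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    # Solve (A^T A + alpha^2 I) x = A^T b without forming the inverse.
    return np.linalg.solve(A.T @ A + alpha**2 * np.eye(n), A.T @ b)

# Hypothetical ill-conditioned system: the two columns are nearly identical.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
b = np.array([2.0, 2.0001, 1.9999])

x_reg = tikhonov_solve(A, b, alpha=0.1)
print(x_reg)
```

Forming A^T A squares the condition number of A, so for larger or harder problems an approach based on the singular value decomposition is usually preferred over this direct formula.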


Statistical interpretation

Although at first the choice of this regularized problem, and indeed of the parameter α, seems rather arbitrary, there is a sound statistical justification. Note that for an ill-posed problem one must necessarily introduce some additional assumptions in order to get a stable solution. Statistically, we might assume that a priori we know that x is a random variable with a multivariate normal distribution. For simplicity we take the mean to be zero and assume that each component is independent with standard deviation σ_x. Our data are also subject to errors, and we take the errors in b to be independent as well, normally distributed with zero mean and standard deviation σ_b. Under these assumptions the Tikhonov-regularized solution is the most probable solution given the data and the a priori distribution of x, according to Bayes' theorem. The Tikhonov parameter is then α = σ_b/σ_x.
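
The step from Bayes' theorem to the Tikhonov functional can be sketched as follows, assuming independent zero-mean normal components with standard deviations σ_x (for x) and σ_b (for the errors in b), as described above:

```latex
\begin{align*}
p(x \mid b) &\propto p(b \mid x)\, p(x)
  \propto \exp\!\left(-\frac{\|Ax-b\|^2}{2\sigma_b^2}\right)
          \exp\!\left(-\frac{\|x\|^2}{2\sigma_x^2}\right), \\
-2\sigma_b^2 \log p(x \mid b)
  &= \|Ax-b\|^2 + \frac{\sigma_b^2}{\sigma_x^2}\,\|x\|^2 + \text{const}.
\end{align*}
```

Maximizing the posterior probability is therefore the same as minimizing the Tikhonov functional with \alpha^2 = \sigma_b^2/\sigma_x^2.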

If the assumption of normality is replaced by assumptions of homoscedasticity and uncorrelatedness of errors, and one still assumes zero mean, then the Gauss-Markov theorem entails that the solution is still optimal in a certain sense.

Generalized Tikhonov regularization

For general multivariate normal distributions for x and the data error, one can apply a transformation of the variables to reduce to the case above. Equivalently, one can seek an x to minimize

\|Ax-b\|_P^2 + \alpha^2\|x-x_0\|_Q^2\,

where we have used \|x\|_P^2 to stand for the squared weighted norm x^T P x. In the statistical interpretation P is the inverse covariance matrix of b, x_0 is the expected value of x, and Q is the inverse covariance matrix of x.

This can be solved explicitly, for example using the formula

x_0 + (A^T PA + \alpha^2 Q)^{-1} A^T P(b-Ax_0).\,
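
A minimal NumPy sketch of the generalized solution follows. The function name generalized_tikhonov and the example data are assumptions for illustration only. The factor alpha multiplies Q as in the functional being minimized, and it defaults to 1, in which case the matrix being inverted is A^T P A + Q.

```python
import numpy as np

def generalized_tikhonov(A, b, P, Q, x0, alpha=1.0):
    """Minimize (Ax-b)^T P (Ax-b) + alpha^2 (x-x0)^T Q (x-x0)."""
    A, b, P, Q, x0 = (np.asarray(v, dtype=float) for v in (A, b, P, Q, x0))
    # Setting the gradient to zero gives
    # (A^T P A + alpha^2 Q)(x - x0) = A^T P (b - A x0).
    lhs = A.T @ P @ A + alpha**2 * Q
    rhs = A.T @ P @ (b - A @ x0)
    return x0 + np.linalg.solve(lhs, rhs)

# Hypothetical example: unequal data weights and a nonzero prior mean.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
P = np.diag([1.0, 4.0, 0.25])   # inverse covariance of the data errors
Q = np.diag([2.0, 0.5])         # inverse covariance of x
x0 = np.array([0.1, -0.2])      # expected value of x

x_gen = generalized_tikhonov(A, b, P, Q, x0)
print(x_gen)
```

With P and Q set to identity matrices and x0 set to zero, this reduces to the simple form of Tikhonov regularization.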

Regularization in Hilbert space

Typically, discrete linear ill-conditioned problems result from the discretization of integral equations, and one can formulate Tikhonov regularization in the original infinite-dimensional context. In the above we can interpret A as a compact operator on Hilbert spaces, and x and b as elements in the domain and range of A. The operator A*A + α² I is then a self-adjoint bounded invertible operator for α > 0.

Relation with singular value decomposition

Given the singular value decomposition

A = U \Sigma V^T\,

where Σ is the diagonal matrix of singular values σ_i (augmented with zeros so as to be m-by-n) and U and V are respectively the matrices of left and right singular vectors, then the Tikhonov regularized solution can be expressed as

V D U^T b\,

where D is an n-by-m matrix equal to

\sigma_i/(\sigma_i^2 + \alpha^2)\,

on the diagonal and zero elsewhere. This demonstrates the effect of the Tikhonov parameter on the condition number of the regularized problem.
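
The singular value form can be sketched in NumPy as below. The names tikhonov_svd and regularized_condition_number are assumptions for this example. A reduced ("thin") SVD is used, which drops the zero padding of Σ and D without changing the result, and the condition number reported is that of the matrix A^T A + α² I.

```python
import numpy as np

def tikhonov_svd(A, b, alpha):
    """Tikhonov solution V D U^T b computed from a reduced SVD (alpha > 0)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Diagonal of D: sigma_i / (sigma_i^2 + alpha^2).
    d = s / (s**2 + alpha**2)
    return Vt.T @ (d * (U.T @ b))

def regularized_condition_number(A, alpha):
    """Condition number of A^T A + alpha^2 I from the singular values of A."""
    m, n = A.shape
    s = np.linalg.svd(A, compute_uv=False)
    # If m < n, A^T A has zero eigenvalues not listed among the m singular values.
    s_min = s.min() if m >= n else 0.0
    return (s.max()**2 + alpha**2) / (s_min**2 + alpha**2)

# Hypothetical example data.
A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 0.5]])
b = np.array([1.0, 2.0, 3.0])

x_svd = tikhonov_svd(A, b, alpha=0.5)
print(x_svd)
print(regularized_condition_number(A, alpha=0.5))
```

Small singular values, which would be amplified as 1/σ_i in an unregularized solution, are instead damped towards zero, and increasing α lowers the condition number.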

For the generalized case a similar representation can be derived using a generalized singular value decomposition.

History

Tikhonov regularization has been invented independently in many different contexts. It became widely known from its application to integral equations in the work of A. N. Tikhonov and D. L. Phillips. Some authors use the term Tikhonov-Phillips regularization. The finite-dimensional case was expounded by A. E. Hoerl, who took a statistical approach, and by M. Foster, who interpreted this method as a Wiener-Kolmogorov filter. Following Hoerl, it is known in the statistical literature as ridge regression.

References

  • Hoerl AE, 1962, Application of ridge analysis to regression problems, Chemical Engineering Progress, 58, 54-59.
  • Foster M, 1961, An application of the Wiener-Kolmogorov smoothing theory to matrix inversion, J. SIAM, 9, 387-392.
  • Phillips DL, 1962, A technique for the numerical solution of certain integral equations of the first kind, J Assoc Comput Mach, 9, 84-97.
  • Tikhonov AN, 1963, Solution of incorrectly formulated problems and the regularization method, Soviet Math Dokl, 4, 1035-1038. English translation of Dokl Akad Nauk SSSR, 151, 1963, 501-504.
  • Tikhonov AN and Arsenin VA, 1977, Solutions of Ill-posed Problems, Winston & Sons, Washington, ISBN 0470991240.
  • Tarantola A, 1987, Inverse Problem Theory, Elsevier, ISBN 0444427651.


07-14-2008 23:18:10
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.
BiologyDaily.com 2005.