Source: www.math.usm.edu
Jim Lambers
MAT 461/561
Spring Semester 2009-10
Lecture 24 Notes

These notes correspond to Section 10.3 in the text.

Quasi-Newton Methods

One of the drawbacks of using Newton's Method to solve a system of nonlinear equations F(x) = 0 is the computational expense that must be incurred during each iteration to evaluate the partial derivatives of F at x^{(k)}, and then solve a system of linear equations involving the resulting Jacobian matrix. The algorithm does not facilitate the re-use of data from previous iterations, and in some cases evaluation of the partial derivatives can be unnecessarily costly.

An alternative is to modify Newton's Method so that approximate partial derivatives are used, as in the Secant Method for a single nonlinear equation, since the slightly slower convergence is offset by the improved efficiency of each iteration. However, simply replacing the analytical Jacobian matrix of F with a matrix consisting of finite difference approximations of the partial derivatives does not do much to reduce the cost of each iteration, because the cost of solving the system of linear equations is unchanged.

However, because the Jacobian matrix consists of the partial derivatives evaluated at an element of a convergent sequence, intuitively Jacobian matrices from consecutive iterations are "near" one another in some sense, which suggests that it should be possible to cheaply update an approximate Jacobian matrix from iteration to iteration, in such a way that the inverse of the Jacobian matrix can be updated efficiently as well. This is the case when a matrix has the form

    B = A + uv^T,

where u and v are given vectors. This modification of A to obtain B is called a rank-one update, since uv^T, an outer product, has rank one: every vector in the range of uv^T is a scalar multiple of u. To obtain B^{-1} from A^{-1}, we note that if Ax = u, then

    Bx = (A + uv^T)x = (1 + v^T x)u,

which yields

    B^{-1} u = \frac{1}{1 + v^T A^{-1} u} A^{-1} u.
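This identity is easy to check numerically. In the sketch below, the matrix A and the vectors u and v are arbitrary illustrative choices (not from the notes); the code compares B^{-1}u computed directly against the formula above.

```python
import numpy as np

# Numerical check of the identity B^{-1} u = A^{-1} u / (1 + v^T A^{-1} u)
# for a rank-one update B = A + u v^T.  The data below are arbitrary
# illustrative choices, not taken from the notes.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
u = np.array([1.0, 2.0])
v = np.array([0.5, 1.0])

B = A + np.outer(u, v)
x = np.linalg.solve(A, u)       # x = A^{-1} u
lhs = np.linalg.solve(B, u)     # B^{-1} u, computed directly
rhs = x / (1.0 + v @ x)         # A^{-1} u / (1 + v^T A^{-1} u)
print(np.allclose(lhs, rhs))    # True
```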
On the other hand, if x is such that v^T A^{-1} x = 0, then

    B A^{-1} x = (A + uv^T) A^{-1} x = x,

which yields B^{-1} x = A^{-1} x. This takes us to the following more general problem: given a matrix C, we wish to construct a matrix D such that the following conditions are satisfied:

∙ Dw = z, for given vectors w and z;
∙ Dy = Cy, if y is orthogonal to a given vector g.

In our application, C = A^{-1}, D = B^{-1}, w = u, z = \frac{1}{1 + v^T A^{-1} u} A^{-1} u, and g = A^{-T} v. To solve this problem, we set

    D = C + \frac{(z - Cw) g^T}{g^T w}.

Then, if g^T y = 0, the second term in the definition of D vanishes, and we obtain Dy = Cy; in computing Dw, the factors of g^T w in the numerator and denominator cancel, which yields

    Dw = Cw + (z - Cw) = z.

Applying this definition of D, we obtain

    B^{-1} = A^{-1} + \frac{\left( \frac{1}{1 + v^T A^{-1} u} A^{-1} u - A^{-1} u \right) v^T A^{-1}}{v^T A^{-1} u} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u}.

This formula for the inverse of a rank-one update is known as the Sherman-Morrison Formula.

We now return to the problem of approximating the Jacobian of F, and efficiently obtaining its inverse, at each iterate x^{(k)}. We begin with an exact Jacobian, A_0 = J_F(x^{(0)}), and use A_0 to compute the first iterate, x^{(1)}, using Newton's Method. Then, we recall that for the Secant Method, we use the approximation

    f'(x_1) \approx \frac{f(x_1) - f(x_0)}{x_1 - x_0}.

Generalizing this approach to a system of equations, we seek an approximation A_1 to J_F(x^{(1)}) that has these properties:

∙ A_1 (x^{(1)} - x^{(0)}) = F(x^{(1)}) - F(x^{(0)});
∙ if z^T (x^{(1)} - x^{(0)}) = 0, then A_1 z = J_F(x^{(0)}) z = A_0 z.

It follows from the previous discussion that

    A_1 = A_0 + \frac{y_1 - A_0 s_1}{s_1^T s_1} s_1^T,

where s_1 = x^{(1)} - x^{(0)} and y_1 = F(x^{(1)}) - F(x^{(0)}). Furthermore, once we have computed A_0^{-1}, we have

    A_1^{-1} = A_0^{-1} - \frac{A_0^{-1} \frac{y_1 - A_0 s_1}{s_1^T s_1} s_1^T A_0^{-1}}{1 + s_1^T A_0^{-1} \frac{y_1 - A_0 s_1}{s_1^T s_1}} = A_0^{-1} + \frac{(s_1 - A_0^{-1} y_1) s_1^T A_0^{-1}}{s_1^T A_0^{-1} y_1}.

Then, as A_1 is an approximation to J_F(x^{(1)}), we can obtain our next iterate x^{(2)} as follows:

    A_1 s_2 = -F(x^{(1)}), \quad x^{(2)} = x^{(1)} + s_2.
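Both the secant condition and the inverse-update formula can be verified numerically. In this sketch, A_0, s_1, and y_1 are arbitrary illustrative values (not from the notes); the code checks that A_1 satisfies A_1 s_1 = y_1 and that the Sherman-Morrison expression for A_1^{-1} agrees with a direct inverse.

```python
import numpy as np

# Check the Broyden update   A1 = A0 + ((y1 - A0 s1)/(s1^T s1)) s1^T
# and its inverse            A1^{-1} = A0^{-1} + (s1 - A0^{-1} y1) s1^T A0^{-1} / (s1^T A0^{-1} y1)
# on arbitrary illustrative data (A0, s1, y1 are made up, not from the notes).
A0 = np.array([[3.0, 1.0],
               [1.0, 2.0]])
s1 = np.array([0.2, -0.1])
y1 = np.array([0.5, 0.1])

A1 = A0 + np.outer((y1 - A0 @ s1) / (s1 @ s1), s1)
A0inv = np.linalg.inv(A0)
w = A0inv @ y1                                        # A0^{-1} y1
A1inv = A0inv + np.outer(s1 - w, s1 @ A0inv) / (s1 @ w)

print(np.allclose(A1 @ s1, y1))                 # True: secant condition holds
print(np.allclose(A1inv, np.linalg.inv(A1)))    # True: update matches direct inverse
```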
Repeating this process, we obtain the following algorithm, which is known as Broyden's Method:

    Choose x^{(0)}
    A_0 = J_F(x^{(0)})
    s_1 = -A_0^{-1} F(x^{(0)})
    x^{(1)} = x^{(0)} + s_1
    k = 1
    while not converged do
        y_k = F(x^{(k)}) - F(x^{(k-1)})
        w_k = A_{k-1}^{-1} y_k
        c = 1/(s_k^T w_k)
        A_k^{-1} = A_{k-1}^{-1} + c (s_k - w_k) s_k^T A_{k-1}^{-1}
        s_{k+1} = -A_k^{-1} F(x^{(k)})
        x^{(k+1)} = x^{(k)} + s_{k+1}
        k = k + 1
    end

Note that it is not necessary to compute A_k for k ≥ 1; only A_k^{-1} is needed. It follows that no systems of linear equations need to be solved during an iteration; only matrix-vector multiplications are required, thus saving an order of magnitude of computational effort during each iteration compared to Newton's Method.
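The algorithm above might be realized as in the following sketch. The test system F(x) = (x_1^2 + x_2^2 - 2, x_1 - x_2), its Jacobian, the starting point, and the tolerance are all illustrative choices, not taken from the notes; note that the exact Jacobian is inverted only once, and every subsequent iteration uses only matrix-vector products.

```python
import numpy as np

def broyden(F, J, x0, tol=1e-10, maxit=50):
    # A_0^{-1}: the exact Jacobian is inverted once, at the initial iterate only.
    x_prev = np.asarray(x0, dtype=float)
    Ainv = np.linalg.inv(J(x_prev))
    s = -Ainv @ F(x_prev)            # s_1 = -A_0^{-1} F(x^{(0)})
    x = x_prev + s                   # x^{(1)}
    for _ in range(maxit):
        y = F(x) - F(x_prev)         # y_k
        w = Ainv @ y                 # w_k = A_{k-1}^{-1} y_k
        c = 1.0 / (s @ w)            # c = 1/(s_k^T w_k)
        # Sherman-Morrison: A_k^{-1} = A_{k-1}^{-1} + c (s_k - w_k) s_k^T A_{k-1}^{-1}
        Ainv = Ainv + c * np.outer(s - w, s @ Ainv)
        s = -Ainv @ F(x)             # s_{k+1}: a matrix-vector product, no solve
        x_prev, x = x, x + s
        if np.linalg.norm(s) < tol:
            break
    return x

# Illustrative system: x1^2 + x2^2 = 2, x1 = x2, with a root at (1, 1).
F = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])
J = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
root = broyden(F, J, [2.0, 0.5])
print(np.allclose(root, [1.0, 1.0]))  # True
```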