Munich Personal RePEc Archive

A note on matrix differentiation

Kowal, Pawel

December 2006

Online at https://mpra.ub.uni-muenchen.de/3917/
MPRA Paper No. 3917, posted 09 Jul 2007 UTC
A note on matrix differentiation

Paweł Kowal

July 9, 2007
Abstract

This paper presents a set of rules for matrix differentiation with respect to a vector of parameters, using the flattened representation of derivatives, i.e. in the form of a matrix. We also introduce a new set of Kronecker tensor products of matrices. Finally, we consider the problem of differentiating the matrix determinant, trace and inverse.

JEL classification: C00

Keywords: matrix differentiation, generalized Kronecker products
            1 Introduction
Derivatives of matrices with respect to a vector of parameters can be expressed as a concatenation of derivatives with respect to the scalar parameters. However, such a representation of derivatives is very inconvenient in some applications, e.g. if higher order derivatives are considered, and may not even be applicable if matrix functions (like the determinant or the inverse) are present. For example, finding an explicit derivative of det(∂X/∂θ) would be a quite complicated task. Such problems arise naturally in many applications, e.g. in the maximum likelihood approach to estimating model parameters.

The same problem emerges in the case of a tensor representation of derivatives. Additionally, in this case extra effort is required to find the flattened representation of the resulting tensors, which is needed since running numerical computations efficiently is possible only for two-dimensional data structures.
In this paper we derive formulas for differentiating matrices with respect to a vector of parameters when the flattened form of the resulting derivatives is required, i.e. the representation of derivatives in the form of matrices. To do this we introduce a new set of Kronecker matrix products as well as the generalized matrix transposition. Then first order and higher order derivatives of functions that are compositions of primitive functions under elementary matrix operations, like summation, multiplication, transposition and the Kronecker product, can be expressed in closed form based on the primitive matrix functions and their derivatives, using these elementary operations, the generalized Kronecker products and the generalized transpositions.

We also consider more general expressions containing matrix functions (the inverse, trace and determinant). Defining the generalized trace function, we are able to express derivatives of such expressions in closed form.
                                      2 Matrix differentiation rules
Let us consider smooth functions Ω ∋ θ ↦ X(θ) ∈ R^(m×n), Ω ∋ θ ↦ Y(θ) ∈ R^(p×q), where Ω ⊂ R^k is an open set. The functions X, Y associate an m×n and a p×q matrix with a given vector of parameters, θ = col(θ_1, θ_2, …, θ_k). The derivative of the function X with respect to θ is defined as

    ∂X/∂θ = [ ∂X/∂θ_1   ∂X/∂θ_2   ⋯   ∂X/∂θ_k ]

where ∂X/∂θ_i ∈ R^(m×n), i = 1, 2, …, k.
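As a concrete illustration, the sketch below approximates this flattened derivative with central finite differences. It is only an illustrative aid, not part of the paper; the helper name flattened_derivative and the example function are our own choices.

    import numpy as np

    def flattened_derivative(X, theta, eps=1e-6):
        # Approximate dX/dtheta in the flattened form
        # [dX/dtheta_1 ... dX/dtheta_k] (an m x n*k matrix)
        # by central finite differences.
        k = len(theta)
        blocks = []
        for i in range(k):
            e = np.zeros(k)
            e[i] = eps
            blocks.append((X(theta + e) - X(theta - e)) / (2 * eps))
        return np.hstack(blocks)

    # Example: X(theta) = [[theta_1, theta_1 * theta_2]], so the exact
    # derivative at theta = (2, 3) is [[1, 3, 0, 2]].
    X = lambda t: np.array([[t[0], t[0] * t[1]]])
    print(flattened_derivative(X, np.array([2.0, 3.0])))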
Proposition 2.1. The following equations hold

    1. ∂(αX)/∂θ = α ∂X/∂θ

    2. ∂(X + Y)/∂θ = ∂X/∂θ + ∂Y/∂θ

    3. ∂(X × Y)/∂θ = ∂X/∂θ × (I_k ⊗ Y) + X × ∂Y/∂θ

where α ∈ R and I_k is a k × k dimensional identity matrix, assuming that the differentials exist and matrix dimensions coincide.
Proof. The first two cases are obvious. We have

    ∂(X × Y)/∂θ = [ ∂X/∂θ_1 × Y + X × ∂Y/∂θ_1   ⋯   ∂X/∂θ_k × Y + X × ∂Y/∂θ_k ]

                = [ ∂X/∂θ_1  ⋯  ∂X/∂θ_k ] × diag(Y, …, Y) + X × [ ∂Y/∂θ_1  ⋯  ∂Y/∂θ_k ]

                = ∂X/∂θ × (I_k ⊗ Y) + X × ∂Y/∂θ

since the block-diagonal matrix with k copies of Y on the diagonal equals I_k ⊗ Y.
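To make rule 3 tangible, here is a hypothetical numerical check, not taken from the paper: X(θ) and Y(θ) are chosen affine in θ, so their exact flattened derivatives are simply concatenations of fixed matrices, and a finite-difference derivative of X × Y is compared with the closed form.

    import numpy as np

    rng = np.random.default_rng(0)
    k, m, n, q = 3, 2, 4, 5
    A = [rng.standard_normal((m, n)) for _ in range(k + 1)]
    B = [rng.standard_normal((n, q)) for _ in range(k + 1)]
    X = lambda t: A[0] + sum(t[i] * A[i + 1] for i in range(k))
    Y = lambda t: B[0] + sum(t[i] * B[i + 1] for i in range(k))
    dX = np.hstack(A[1:])   # exact dX/dtheta, an m x (n*k) matrix
    dY = np.hstack(B[1:])   # exact dY/dtheta, an n x (q*k) matrix
    theta = rng.standard_normal(k)

    # Right-hand side of Proposition 2.1(3).
    rhs = dX @ np.kron(np.eye(k), Y(theta)) + X(theta) @ dY

    # Left-hand side via central finite differences, block by block.
    eps = 1e-6
    blocks = []
    for i in range(k):
        e = np.zeros(k)
        e[i] = eps
        blocks.append((X(theta + e) @ Y(theta + e)
                       - X(theta - e) @ Y(theta - e)) / (2 * eps))
    lhs = np.hstack(blocks)
    print(np.allclose(lhs, rhs, atol=1e-6))   # True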
Differentiating the matrix transposition is a little bit more complicated. Let us define a generalized matrix transposition.

Definition 2.2. Let X = [X_1, X_2, …, X_n], where each X_i ∈ R^(p×q), i = 1, 2, …, n, is a p × q matrix, be a partition of the p × nq dimensional matrix X. Then

    T_n(X) := [ X_1′, X_2′, …, X_n′ ]
Proposition 2.3. The following equations hold

    1. ∂(X′)/∂θ = T_k(∂X/∂θ)

    2. ∂(T_n(X))/∂θ = T_{k×n}(∂X/∂θ)
Proof. The first condition is a special case of the second condition for n = 1. We have

    ∂(T_n(X))/∂θ = [ T_n(∂X/∂θ_1)  ⋯  T_n(∂X/∂θ_k) ]

                 = [ ∂X_1′/∂θ_1, …, ∂X_n′/∂θ_1  ⋯  ∂X_1′/∂θ_k, …, ∂X_n′/∂θ_k ]

                 = T_{k×n}(∂X/∂θ)

since

    ∂X/∂θ = [ ∂X_1/∂θ_1, …, ∂X_n/∂θ_1  ⋯  ∂X_1/∂θ_k, …, ∂X_n/∂θ_k ]
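A minimal NumPy sketch of T_n, with a numerical check of the second equation; as above, X(θ) is taken affine in θ so that each ∂X/∂θ_i is a known fixed matrix, and the function name T is our own.

    import numpy as np

    def T(n, X):
        # Generalized transposition of Definition 2.2: split X
        # (a p x n*q matrix) into n column blocks of width q and
        # transpose each block in place.
        p, nq = X.shape
        q = nq // n
        return np.hstack([X[:, j * q:(j + 1) * q].T for j in range(n)])

    rng = np.random.default_rng(0)
    k, p, q, n = 2, 3, 2, 2              # X(theta) is p x (n*q), theta in R^k
    A = [rng.standard_normal((p, n * q)) for _ in range(k + 1)]
    dX = np.hstack(A[1:])                # exact dX/dtheta, p x (n*q*k)

    # T_n is linear, so d(T_n(X))/dtheta_i = T_n(dX/dtheta_i) = T_n(A[i+1]).
    lhs = np.hstack([T(n, A[i + 1]) for i in range(k)])
    rhs = T(k * n, dX)
    print(np.allclose(lhs, rhs))         # True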
Let us now turn to differentiating tensor products of matrices. Consider any matrices X, Y, where X ∈ R^(p×q) is a matrix with elements x_ij ∈ R for i = 1, 2, …, p, j = 1, 2, …, q. The Kronecker product X ⊗ Y is defined as

    X ⊗ Y := [ x_11 Y  ⋯  x_1q Y
                 ⋮     ⋱    ⋮
               x_p1 Y  ⋯  x_pq Y ]
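For reference, NumPy's np.kron implements exactly this blockwise definition:

    import numpy as np

    X = np.array([[1, 2],
                  [3, 4]])
    Y = np.array([[0, 5],
                  [6, 7]])
    print(np.kron(X, Y))
    # [[ 0  5  0 10]
    #  [ 6  7 12 14]
    #  [ 0 15  0 20]
    #  [18 21 24 28]]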
Similarly as in the case of differentiating the matrix transposition, we need to introduce the generalized Kronecker product.

Definition 2.4. Let X = [X_1, X_2, …, X_m], where each X_i ∈ R^(p×q), i = 1, 2, …, m, is a p × q matrix, be a partition of the p × mq dimensional matrix X. Let Y = [Y_1, Y_2, …, Y_n], where each Y_i ∈ R^(r×s), i = 1, 2, …, n, is an r × s matrix, be a partition of the r × ns dimensional matrix Y. Then

    X ⊗^1_n Y := [ X ⊗ Y_1, …, X ⊗ Y_n ]

    X ⊗^m_n Y := [ X_1 ⊗^1_n Y, …, X_m ⊗^1_n Y ]

    X ⊗^{1,m_2,…,m_s}_{n_1,n_2,…,n_s} Y := [ X ⊗^{m_2,…,m_s}_{n_2,…,n_s} Y_1, …, X ⊗^{m_2,…,m_s}_{n_2,…,n_s} Y_{n_1} ]

    X ⊗^{m_1,m_2,…,m_s}_{n_1,n_2,…,n_s} Y := [ X_1 ⊗^{1,m_2,…,m_s}_{n_1,n_2,…,n_s} Y, …, X_{m_1} ⊗^{1,m_2,…,m_s}_{n_1,n_2,…,n_s} Y ]

assuming that the appropriate matrix partitions exist.
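The first two products are easy to sketch in NumPy, splitting Y (respectively X) into equal-width column blocks; the helper names kron_1n and kron_mn are our own, and for m = n = 1 both reduce to the ordinary Kronecker product.

    import numpy as np

    def kron_1n(X, Y, n):
        # X (x)^1_n Y = [X kron Y_1, ..., X kron Y_n], where Y is
        # partitioned into n column blocks of equal width.
        s = Y.shape[1] // n
        return np.hstack([np.kron(X, Y[:, j * s:(j + 1) * s])
                          for j in range(n)])

    def kron_mn(X, Y, m, n):
        # X (x)^m_n Y = [X_1 (x)^1_n Y, ..., X_m (x)^1_n Y], where X is
        # partitioned into m column blocks of equal width.
        q = X.shape[1] // m
        return np.hstack([kron_1n(X[:, i * q:(i + 1) * q], Y, n)
                          for i in range(m)])

    # Sanity check: with trivial partitions this is the plain Kronecker product.
    X = np.arange(6.0).reshape(2, 3)
    Y = np.arange(8.0).reshape(2, 4)
    print(np.allclose(kron_mn(X, Y, 1, 1), np.kron(X, Y)))   # True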