PDE for Finance Notes – Stochastic Calculus Review

Notes by Robert V. Kohn, Courant Institute of Mathematical Sciences. For use in connection with the NYU course PDE for Finance, MATH-GA 2706. First prepared 2003; minor adjustments made 2011; one typo corrected 2014.

These notes provide a quick review of basic stochastic calculus. If this material isn't familiar then you don't have sufficient background for the class PDE for Finance. The material presented here is covered in the books by Neftci (An Introduction to the Mathematics of Financial Derivatives) or Chang (Stochastic Optimization in Continuous Time). Deeper treatments can be found, for example, in Shreve (Stochastic Calculus for Finance II), Steele (Stochastic Calculus and Financial Applications), and Oksendal (Stochastic Differential Equations: An Introduction with Applications).

Brownian motion. Brownian motion w(t) is the stochastic process with the following properties:

• For s < t the increment w(t) − w(s) is Gaussian with mean zero and variance E[(w(t) − w(s))²] = t − s. Moreover the increments associated with disjoint intervals are independent.

• Its sample paths are continuous, i.e. the function t ↦ w(t) is (almost surely) continuous.

• It starts at 0; in other words w(0) = 0.

This process is unique (up to a suitable notion of equivalence). One "construction" of Brownian motion obtains it as the limit of discrete-time random walks; students of finance who have considered the continuous-time limit of a binomial lattice have seen something very similar.

The sample paths of Brownian motion, though continuous, are non-differentiable. Here is an argument that proves a little less but captures the main point. Given any interval (a, b), divide it into subintervals by a = t_1 < t_2 < ... < t_N = b. Clearly

    Σ_{i=1}^{N−1} |w(t_{i+1}) − w(t_i)|² ≤ max_i |w(t_{i+1}) − w(t_i)| · Σ_{i=1}^{N−1} |w(t_{i+1}) − w(t_i)|.

As N → ∞, the left hand side has expected value b − a (independent of N).
The first term on the right tends to zero (almost surely) by continuity. So the second term on the right must tend to infinity (almost surely). Thus the sample paths of w have unbounded total variation on any interval. One can show, in fact, that |w(t) − w(s)| is of order √(|t − s| log log(1/|t − s|)) as |t − s| → 0.

It's easy to construct, for any constant σ > 0, a process whose increments are mean-value-zero, independent, and of variance σ²|t − s|: just use σw(t). The vector-valued version of this construction is more interesting. We say w(t) = (w_1, ..., w_n) is an Rⁿ-valued Brownian motion if its components are independent scalar Brownian motions. Thus E[(w(t) − w(s))_i (w(t) − w(s))_j] equals 0 if i ≠ j and |t − s| if i = j. Given such w, we can obtain a process with correlated increments by taking linear combinations, i.e. by considering z(t) = Aw(t) where A is a (constant) matrix. Its covariance is E[(z(t) − z(s))_i (z(t) − z(s))_j] = (AAᵀ)_{ij} |t − s|. If the desired variance σ is a function of state and time (deterministic, or random but nonanticipating) then construction of the associated process requires solving the stochastic differential equation dx = σ dw (to be discussed below). That's the scalar case; the vector-valued situation is similar: to construct a process with independent, mean-value-zero increments with specified covariance Σ we have only to set A = √Σ (the unique nonnegative, symmetric square root of Σ) and solve dx = A dw.

Filtrations and conditional expectations. It is important, in discussing stochastic processes, to remember that at time t one knows (with certainty) only the past and the present, not the future. This is important for understanding the term "martingale." It will also be crucial later in the class when we discuss optimal decision-making.

The meaningful statements about a Brownian motion (or any stochastic process, for that matter) are statements about its values at various times.
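As a numerical aside (my own illustration, not part of the original notes), the two constructions discussed earlier can be checked by simulation: the sum of squared Brownian increments over (a, b) concentrates near b − a while the sum of absolute increments blows up, and a process with prescribed increment covariance Σ is obtained from A = √Σ. The parameter values and the use of numpy are assumptions.

```python
# Illustration (not from the notes): quadratic vs. total variation of a
# simulated Brownian path, and correlated increments via A = sqrt(Sigma).
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.0, 1.0

for N in [100, 10_000, 1_000_000]:
    dt = (b - a) / N
    dw = rng.normal(0.0, np.sqrt(dt), size=N)   # increments w(t_{i+1}) - w(t_i)
    quad_var = np.sum(dw**2)                    # concentrates near b - a = 1
    total_var = np.sum(np.abs(dw))              # grows like sqrt(N): unbounded
    print(N, round(quad_var, 3), round(total_var, 1))

# Given a target covariance Sigma, build the symmetric square root A = sqrt(Sigma)
# by eigendecomposition; then z = A w has increment covariance A A^T = Sigma.
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])
eigvals, eigvecs = np.linalg.eigh(Sigma)
A = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T
assert np.allclose(A @ A.T, Sigma)              # A A^T = Sigma, as claimed
```

The printed quadratic variation stays near b − a for every N, while the total variation grows without bound, matching the argument above.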
Here is an example of a statement: "−3 < w(0.5) < −2 and w(1.4) > 3". Here is another: "max_{0≤t≤1} |w(t)| < 3". A statement is either true or false for a given sample path; it has a certain probability of being true. We denote by F_t the set of all statements about w that involve only the values of w up to time t. Obviously F_s ⊂ F_t if s < t. These F_t's are called the filtration associated with w.

We can also consider functions of a Brownian path. When we take the expected value of some expression involving Brownian motion we are doing this. Here are some examples of functions: f[w] = w(0.5) − w(1)²; g[w] = max_{0≤t≤1} |w(t)|. Notice that both these examples are determined entirely by time-1 information (jargon: f and g are F_1-measurable).

It's often important to discuss the expected value of some uncertain quantity given the information available at time t. For example, we may wish to know the expected value of max_{0≤t≤1} |w(t)| given knowledge of w only up to time 0.5. This is a conditional expectation, sometimes written E_t[g] = E[g|F_t] (in this case t would be 0.5). We shall define it in a moment via orthogonal projection. This definition is easy but not so intuitive. After giving it, we'll explain why the definition captures the desired intuition.

Let V be the vector space of all functions g[w], endowed with the inner product ⟨f, g⟩ = E[fg]. It has subspaces

    V_t = space of functions whose values are determined by time-t information.

The conditional expectation is defined by orthogonal projection:

    E_t[g] = orthogonal projection of g onto V_t.

The standard linear-algebra definition of orthogonal projection characterizes E_t[g] as the unique element of V_t such that

    ⟨E_t[g], f⟩ = ⟨g, f⟩ for all f ∈ V_t.

Rewriting this in terms of expectations: E_t[g] is the unique function in V_t such that

    E[E_t[g] f] = E[gf] for all f ∈ V_t.

All the key properties of conditional expectation follow easily from this definition.
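On a finite sample space the projection definition becomes concrete linear algebra: V is R^(number of outcomes) with the weighted inner product ⟨f, g⟩ = E[fg], V_t is an explicit subspace, and E_t[g] solves a weighted least-squares problem. A minimal sketch (my own illustration; the two-coin-flip setup, the probabilities, and the use of numpy are assumptions, not from the notes):

```python
# Illustration (not from the notes): conditional expectation as orthogonal
# projection on a finite sample space of two independent coin flips.
import numpy as np

p, q = 0.6, 0.4
# Outcomes of (flip1, flip2): HH, HT, TH, TT, with their probabilities.
prob = np.array([p*p, p*q, q*p, q*q])

# A function of the full history is just a vector in R^4; values are arbitrary.
g = np.array([3.0, 1.0, 4.0, 1.5])

# V_1 = functions determined by the first flip alone: constant on {HH, HT}
# and on {TH, TT}.  Basis: indicators of "first flip H" and "first flip T".
B = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]]).T          # shape (4, 2)

# Projection w.r.t. the weighted inner product <f, g> = sum_i prob_i f_i g_i:
# solve the normal equations (B^T W B) c = B^T W g, then E_1[g] = B c.
W = np.diag(prob)
c = np.linalg.solve(B.T @ W @ B, B.T @ W @ g)
E1g = B @ c

# The projection reproduces the intuitive formula: on {first flip H},
# E_1[g] = p*g(HH) + q*g(HT), and similarly on {first flip T}.
assert np.allclose(E1g[0], p*g[0] + q*g[1])
assert np.allclose(E1g[2], p*g[2] + q*g[3])

# Defining property: E[E_1[g] f] = E[g f] for every f in V_1.
for f in (B[:, 0], B[:, 1], B @ np.array([2.0, -1.0])):
    assert np.isclose(np.sum(prob * E1g * f), np.sum(prob * g * f))
```

Here the "intuitive" conditional expectation drops out of pure linear algebra, with no probabilistic input beyond the weights in the inner product.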
Example: the "tower property"

    s < t ⟹ E_s[E_t[f]] = E_s[f],

since projecting first to V_t, then to V_s ⊂ V_t, is the same as projecting directly to V_s. Another fact: E_0 is the ordinary expectation operator E. Indeed, V_0 is one-dimensional (its elements are functions of a single point w(0) = 0, i.e. it consists of those functions that aren't random at all). From the definition of orthogonal projection we have

    E_0[g] ∈ V_0 and E[E_0[g] f] = E[gf] for all f ∈ V_0.

But when f is in V_0 it is deterministic, so E[gf] = f E[g]. Similarly E[E_0[g] f] = f E_0[g]. Thus E_0[g] = E[g].

To see that this matches our intuition, i.e. that E_t is properly interpreted as "the expected value based on future randomness, given all information available at time t", let's consider the simplest possible discrete-time analogue. Consider a 2-stage coin-flipping process which obtains at each stage heads (probability p) or tails (probability q = 1 − p). We visualize it using a (nonrecombinant) binomial tree, numbering the states as shown in Figure 1.

[Figure 1: Binomial tree for visualizing conditional probabilities. The root node 0 leads to node 2 (probability p) and node 1 (probability q); node 2 leads to nodes 6 (probability p) and 5 (probability q); node 1 leads to nodes 4 (probability p) and 3 (probability q).]

The space V_2 is 4-dimensional; its functions are determined by the full history, i.e. they can be viewed as functions of the time-2 nodes (numbered 3, 4, 5, 6 in the figure). The space V_1 is two-dimensional; its functions are determined by just the first flip. Its elements can be viewed as functions of the time-1 nodes (numbered 1, 2 in the figure); or, equivalently, they are functions f ∈ V_2 such that f(3) = f(4) and f(5) = f(6). (Such a function can be viewed as a function of the time-1 nodes by setting f(1) = f(3) = f(4) and f(2) = f(5) = f(6).) The "expected value of g given time-1 information" intuitively has values

    Ẽ_1[g](1) = p g(4) + q g(3),    Ẽ_1[g](2) = p g(6) + q g(5).

To check that this agrees with our prior definition, we must verify that ⟨f, Ẽ_1[g]⟩ = ⟨f, g⟩ for all f ∈ V_1.
In other words we must check that

    E[Ẽ_1[g] f] = E[gf]    (1)

whenever f(2) = f(5) = f(6) and f(1) = f(3) = f(4). The left hand side is

    q Ẽ_1[g](1) f(1) + p Ẽ_1[g](2) f(2)

while the right hand side is

    q² f(3) g(3) + pq f(4) g(4) + pq f(5) g(5) + p² f(6) g(6),

which can be rewritten (since f(1) = f(3) = f(4) and f(2) = f(5) = f(6)) as

    q (q g(3) + p g(4)) f(1) + p (q g(5) + p g(6)) f(2).

The formula given above for Ẽ_1[g] is precisely the one that makes (1) correct.

A stochastic process x(t) is "adapted" to F_t if its values up to and including time t are determined by the statements in F_t. (The stochastic processes obtained from Brownian motion by solving stochastic differential equations automatically have this property.) Such a stochastic process is called a martingale if E_s[x(t)] = x(s) for s < t. An equivalent statement: E_s[x(t) − x(s)] = 0 for s < t. Intuitively: given current information, there's no point betting on the future of the process; it's equally likely to go up or down. (That's not quite right; it confuses the mean and the median. The correct statement is that the expected future value, based on present information, is exactly the present value.)

A stochastic process f(t) is called nonanticipating if its value at time t depends only on information available at time t, i.e. if f(t) is adapted to F_t. An example is f(t) = F(t, w(t)) for any (deterministic) function F : R² → R. But this isn't the only type of example – for example f(t) = ∫_0^t w(s) ds is also nonanticipating.

Stochastic integrals. We are interested in stochastic differential equations of the type

    dy = f(y, s) ds + g(y, s) dw,    y(t) = x.

(Pretty much everything we'll say extends straightforwardly to SDE's of the form dy = f ds + g dw with f and g random but nonanticipating.) The stochastic differential equation is really shorthand for the associated integral equation

    y(b) = x + ∫_t^b f(y(s), s) ds + ∫_t^b g(y(s), s) dw.    (2)
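The integral equation (2) also suggests the standard way to simulate such an SDE: march forward in small steps ds, adding f ds plus g times a mean-zero Gaussian increment of variance ds. The following Euler–Maruyama sketch is my own addition; the geometric-Brownian-motion coefficients and all numerical parameters are assumptions, not from the notes.

```python
# Illustration (not from the notes): Euler-Maruyama time-stepping for
# dy = f(y,s) ds + g(y,s) dw, y(t) = x, discretizing the integral equation.
import numpy as np

def euler_maruyama(f, g, x, t, b, n_steps, rng):
    """Approximate one sample of y(b) for dy = f(y,s) ds + g(y,s) dw, y(t) = x."""
    ds = (b - t) / n_steps
    y, s = x, t
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(ds))       # Brownian increment, N(0, ds)
        y = y + f(y, s) * ds + g(y, s) * dw
        s += ds
    return y

# Example coefficients (an assumption): dy = 0.05*y ds + 0.20*y dw, y(0) = 1,
# i.e. geometric Brownian motion, for which E[y(1)] = exp(0.05).
rng = np.random.default_rng(0)
samples = [euler_maruyama(lambda y, s: 0.05 * y,
                          lambda y, s: 0.20 * y,
                          x=1.0, t=0.0, b=1.0, n_steps=200, rng=rng)
           for _ in range(5_000)]
print(np.mean(samples))   # should be close to exp(0.05) ~ 1.051
```

This scheme makes sense even before the stochastic integral is defined rigorously; the key modeling point is that g(y, s) is evaluated at the left endpoint of each step, which is exactly the nonanticipating (Ito) convention.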