108x Filetype PDF File size 0.48 MB Source: www-personal.umich.edu
LECTURE NOTES ON NON-ASYMPTOTIC THEORY OF RANDOM MATRICES MARKRUDELSON AMSSHORTCOURSEONRANDOMMATRICES San Diego, January 2013 Partially supported by NSF DMS grant DMS 1161372. 1 2 MARKRUDELSON 1. Introduction The classical random matrix theory is concerned with asymptotic of various spectral characteristics of families of random matrices, when the dimensions of the matrices tend to infinity. There are many examples when these character- istics, which are random variables themselves, converge to certain limit laws. This includes the celebrated Wigner semicircle law for the empirical measures of eigenvalues of random symmetric matrices, Marchenko–Pastur law, which is the limit of empirical measures of sample covariance matrices, Tracy–Widom distribution describing the limit of the first singular values of a sequence of random matrices, etc. [1]. These limits are of paramount importance, yet in applications one usually needs information about the behavior of such charac- teristics for large, but fixed n. For instance in problems in convex geometry one constructs a random section of an N-dimensional convex body by taking the kernel or the range of a certain random matrix. Random matrices arise also in analysis of rates of convergence of computer science algorithms. In both cases, the dimension of the ambient space remains fixed, and one seeks explicit estimates of probabilities in terms of the dimension. For such problems knowing the limit behavior is of little help. The problems involving estimates for a fixed finite dimension arise in the classical random matrix theory as well. One of the main approaches in deriving the limit laws is based on analysis of the Stieltjes transform of measures [1]. Toderive the convergence of Stieltjes transforms, one frequently has to provide explicit bounds on the smallest singular value of a random matrix of a fixed size, which holds with high probability. This need arises, e.g., in derivation of the circular law and the single ring theorem. These questions led to development of non-asymptotic theory of random matrices, which provides probabilistic bounds for eigenvalues, singular values, etc. for random matrices of a large fixed size. The situation is roughly parallel to that arising for the sums of i.i.d. random variables, where the asymptotic and non-asymptotic results go hand in hand. The asymptotic behavior of the averages of n i.i.d. random variables is governed by the Strong Law of Large Numbers establishing the almost sure convergence to the expectation. Yet, to assert that the average of a large number of random variables is close to the expectation, we need a non-asymptotic version, e.g. Hoeffding inequality. This inequality yields a subgaussian bound for the large deviations (see the details below). Such behavior suggests that the limit distribution of the deviation should be normal, which leads to an asymptotic result, the Central Limit Theorem (CLT). To use the CLT in evaluation of probabilities for random sums, we need its non-asymptotic version, namely the Berry–Esseen Theorem. This theorem provides in turn a crucial step in deriving another fundamental asymptotic result, the Law of Iterated Logarithm. NON-ASYMPTOTIC THEORY 3 These notes discuss the methods of the non-asymptotic approach to the random matrix theory. We do not attempt to provide an exhaustive list of references (a reader can check the surveys [5], [27], and [40]). Instead we con- centrate on three essentially different examples, with the aim of presenting the methods and results in a maximally self-contained form. This approach inevitably leaves out several important recent developments, such as invertibil- ity of random symmetric matrices [42, 19], applications to the Circular Law [9, 37, 38], and concentration for random determinants [39, 20]. Yet, by re- stricting ourselves to a few results, we will be able to give a relatively complete picture of the ideas and methods involved in their proofs. We start with in- troduction to subgaussian random variables in Section 3. In Sections 5-7 we obtain quantitative bounds for invertibility of random matrices with i.i.d. en- tries. As will be shown in Section 6, the arithmetic structures play a crucial role here. Section 8 studies a question arising in geometric functional analysis. Here the ambient space is Banach, and the approach combines the methods of the previous sections with the functional-analytic considerations. We will also touch upon majorising measures, which are a powerful tool for estimating suprema of random processes. Section 9 contains another quantitative invert- ibility result. Here we discuss a random unitary or orthogonal perturbation of a fixed matrix. Unlike in the first example, the arithmetic structure plays no role in this problem. The main difficulty is the dependence between the entries of a random matrix, and the method is based on the introduction of perturbations with independent entries. Acknowledgement. Thesenotes are based in part on the material presented at the workshop “Etats de la Recherche: Probability and geometry in inter- action” at Paul Sabatier University in Toulouse, the mini-course given at the Warsaw University, and the Informal Analysis Seminar at Kent State Univer- sity. The author is grateful to Franck Barthe, Michel Ledoux, Rafal Latala, Krzystof Oleszkiewicz, Michal Wojchehowski, Fedor Nazarov, Dmitry Ryabo- gin, and Artem Zvavitch for their hospitality. The author also grateful to Fedor Nazarov for careful reading of the manuscript and many suggestions, which led to improvement of the presentation. 2. notation and basic definitions We shall consider random matrices of high order with independent entries. For simplicity, we shall assume that the entries are centered (Ea =0) and j,k identically distributed (both conditions may be relaxed). 4 MARKRUDELSON Throughout these notes k·k denotes the ℓp norm p ! n 1/p X p kxk = |xj| , 1 ≤ p < ∞, p j=1 and Bn stands for the unit ball of this norm. The norm of an operator or a p matrix will be denoted by k·k. We use Sn−1 for the unit Euclidean sphere. If F is a finite set, then |F| denotes the cardinality of F. Letters C,C′,c etc. denote absolute constants. n If N ≥ n then an N ×n matrix A can be viewed as a mapping of R into RN. Thus, a random matrix defines a random n-dimensional section of RN. For geometric applications we need to know that this matrix would not distort the metric too much. Let us formulate it more precisely: Definition 2.1. Let N ≥ n and let A be an N × n matrix. The condition number of the matrix A is max n−1 kAxk σ(A) = x∈S 2. min n−1 kAxk x∈S 2 If minx∈Sn−1 kAxk = 0, we set σ(A) = ∞. 2 The condition number of a matrix can be rewritten in terms of its singular values. Definition 2.2. Let N ≥ n and let A be an N × n matrix. The singular ∗ 1/2 values of A are the eigenvalues of (A A) , arranged in the decreasing order: s1(A) ≥ s2(A) ≥ ... ≥ sn(A). The singular values of A are the lengths of the semi-axes of the ellipsoid ABn. The first and the last singular values have a clear functional-analytic 2 meaning: n N s1(A) = A : R → R , and −1 n n s (A) = min kAxk = 1/ A : AR →R , n n−1 x∈S whenever A has the full rank. In this notation σ(A) = s (A)/s (A). 1 n Therefore, to bound the condition number, we have to estimate the first singular value from above, and the last one from below. For matrices with i.i.d. random entries the first singular value is the most robust. It can be estimated using a simple ε-net argument, as will be shown in Proposition 4.4. The last singular value presents a bigger challenge. We will obtain its bounds for “tall” rectangular matrices in Section 4, and for square matrices in Sections 5-7.
no reviews yet
Please Login to review.