This graduate-level course is designed for students majoring in applied statistics at the Department of Mathematics, Jinan University, and is taught by Weiwen Wang (王伟文). The course is mathematically rigorous, with a strong focus on high-dimensional data analysis, and includes a variety of examples and practical applications to illustrate key concepts.

References

  1. Vershynin, R. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press, Cambridge, 2018.
  2. Wainwright, M.J. High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Cambridge, 2019.
  3. Wright, J. & Ma, Y. High-Dimensional Data Analysis with Low-Dimensional Models. Cambridge University Press, Cambridge, 2022.
  4. Mohri, M., Rostamizadeh, A. & Talwalkar, A. Foundations of Machine Learning, 2nd Edition. The MIT Press, Cambridge, MA, 2018.

The lecture notes draw heavily on Refs. 1 and 2 and on the video course taught by Roman Vershynin (https://www.math.uci.edu/~rvershyn/teaching/hdp/hdp.html).

ALL MATERIALS ARE INTENDED FOR NON-PROFIT ACADEMIC USE. IF ANY MATERIAL IS PRESENTED IMPROPERLY, PLEASE EMAIL ME TO REQUEST ITS REMOVAL.

Syllabus

Lecture notes and homework are released periodically. The released notes may contain typos.

  • Lecture 1 [notes]
    • Introduction
      • Counterintuitive phenomena in high-dimensional data
      • Non-asymptotic analysis
      • Goals of this course
      • Review of expectation and variance
      • Some classical inequalities
      • Monte Carlo method for integration in high-dimensional space
  • Lecture 2 [notes] [Eigenface]
    • Approximate Carathéodory theorem and its applications
  • Lecture 3 [notes]
    • Review of laws of large numbers, Markov's inequality, and Chebyshev's inequality
    • Concentration inequalities: Gaussian tail bounds
  • Lecture 4 [notes] [Erdős–Rényi model]
    • Chernoff trick
    • Sub-Gaussian variables
    • Hoeffding bound
    • Chernoff bound and its application
  • Lecture 5 [notes]
    • Sub-exponential variables
    • Bernstein-type bound
    • Johnson-Lindenstrauss embedding
  • Lecture 6 [notes]
    • Bounded differences inequality
    • Clique number in random graphs, Rademacher complexity and Gaussian complexity
    • Lipschitz functions of Gaussian variables with applications
      • $\chi^{2}$-concentration
      • Gaussian chaos variables
  • Lecture 7 [notes]
    • Uniform laws of large numbers
      • Glivenko-Cantelli theorem
      • A uniform law via Rademacher complexity
      • Upper bounds on the Rademacher complexity: Vapnik-Chervonenkis dimension
  • Lecture 8 [notes]
    • Rademacher complexity in empirical risk minimization
    • Margin theory
  • Lecture 9 [notes]
    • Gaussian complexity and Rademacher complexity
    • Covering number and packing number
  • Lecture 10 [notes]
    • Random processes
    • Dudley’s integral inequality and its applications
  • Lecture 11 [notes]
    • Random processes
    • Dudley's integral inequality and its application to the Monte Carlo method
  • Lecture 12 [notes]
    • Covering number and VC dimension
    • Applications of Dudley's integral inequality in statistical learning theory
  • Lecture 13 [notes]
    • Review of matrices
      • Spectral decomposition
      • Singular value decomposition
      • Matrix norms
    • Principal component analysis
  • Lecture 14 [notes]
    • Covariance estimation
    • Semicircle law
    • Marchenko-Pastur law
  • Lecture 15 [notes]
    • Matrix calculus
    • Matrix Hoeffding inequality
  • Lecture 16 [notes]
    • Matrix Bernstein inequality and its application in community detection
  • Lecture 17 [notes]
    • Introduction to deep generative models
      • GAN
      • VAE
      • Diffusion models
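Many of the topics above lend themselves to short numerical experiments. As one illustration, the Johnson-Lindenstrauss embedding from Lecture 5 can be demonstrated with a random Gaussian projection: pairwise distances between points in a very high-dimensional space are approximately preserved after projecting to a much lower dimension. The sketch below is a minimal NumPy example; the dimensions n, d, k are arbitrary choices for illustration, not taken from the course notes.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# n points in R^d, projected down to R^k by a random Gaussian matrix.
# The sizes n=50, d=10_000, k=400 are illustrative assumptions.
n, d, k = 50, 10_000, 400

X = rng.normal(size=(n, d))               # data points (one per row)
A = rng.normal(size=(k, d)) / np.sqrt(k)  # scaled Gaussian projection
Y = X @ A.T                               # embedded points in R^k

# Ratio of embedded to original pairwise distances: by the
# Johnson-Lindenstrauss lemma these concentrate near 1.
ratios = np.array([
    np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    for i, j in combinations(range(n), 2)
])
print(f"distortion range: [{ratios.min():.3f}, {ratios.max():.3f}]")
```

Note that the target dimension k needed to preserve all pairwise distances up to a factor 1 ± ε grows only logarithmically in n, independently of the ambient dimension d, which is the content of the Johnson-Lindenstrauss lemma covered in Lecture 5.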

History

  • 2025-02-06: Build