Massive quantities of data are produced by a wide range of scientific disciplines and societal interactions. Researchers are responding to this "data deluge" with new theory and computational methods for acquiring and processing large, complex, high-dimensional data. This reading course will introduce students to state-of-the-art research in the analysis of modern data sets. Recent and influential contributions to the literature will be read and discussed each week, allowing students to gain familiarity with the theoretical, computational, and statistical techniques crucial for the advancement of big data science.
January 24 
Introductory Material

Papers: (1)

January 25 
APPM Colloquium by Anna Gilbert, ECCR 245, 3:00 PM


January 31 
Sparsity, Sparse Representation

Papers: (2), (3)

February 7 
Sparse Representation and Signal/Image Processing

Papers: (4), (5)

February 14 
Compressive Sensing

Papers: (6), (7), (8)

February 21 
Compressive Sensing

Papers: (9), (10)

February 28 
(no meeting)


March 7 
Johnson-Lindenstrauss Lemma

Papers: (11), (12)

March 14 
Low Rank Matrix Approximations

Papers: (13)

March 21 
Random Matrix Theory, Randomized Algorithms

Papers: (14)

March 28 
Spring Break


April 4 
Random Matrix Theory, Randomized Algorithms

Papers: (14)

April 11 
Data Models and Manifold Learning

Papers: (15), (16)

April 18 
Manifold Learning and Nonlinear Dimension Reduction

Papers: (17), (18)

April 25 
Graphs and Geometric Analysis of Data Sets

Papers: (19), (20)

May 2 
Geometric Analysis of Data Sets

Papers: (21), (22)

(1)  Donoho and Tanner (2010). "Precise Undersampling Theorems" 
(2)  Bruckstein, Donoho, and Elad (2009). "From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images" 
(3)  Olshausen and Field (1996). "Emergence of Simple-cell Receptive Field Properties by Learning a Sparse Code for Natural Images"
(4)  Aharon, Elad, and Bruckstein (2006). "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation"
(5)  Rubinstein, Bruckstein, and Elad (2010). "Dictionaries for Sparse Representation Modeling" 
(6)  Candes and Wakin (2008). "An Introduction to Compressive Sampling" 
(7)  Candes, Romberg, and Tao (2006). "Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information" 
(8)  Donoho (2006). "Compressed Sensing" 
(9)  Candes and Tao (2005). "Decoding by Linear Programming" 
(10)  Candes, Romberg, and Tao (2006). "Stable Signal Recovery from Incomplete and Inaccurate Measurements" 
(11)  Dasgupta and Gupta (2002). "An Elementary Proof of a Theorem by Johnson and Lindenstrauss" 
(12)  Achlioptas (2003). "Database-friendly Random Projections: Johnson-Lindenstrauss with Binary Coins"
(13)  Achlioptas and McSherry (2007). "Fast Computation of Low-Rank Matrix Approximations"
(14)  Halko, Martinsson, and Tropp (2011). "Finding Structure with Randomness: Stochastic Algorithms for Constructing Approximate Matrix Decompositions" 
(15)  Saul, Weinberger, Ham, Sha, and Lee (2006). "Spectral Methods for Dimensionality Reduction" 
(16)  Baraniuk, Cevher, and Wakin (2010). "Low-dimensional Models for Dimensionality Reduction and Signal Recovery: A Geometric Perspective"
(17)  Belkin and Niyogi (2003). "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation" 
(18)  Belkin and Niyogi (2004). "Semi-supervised Learning on Riemannian Manifolds"
(19)  Coifman, Lafon, Lee, Maggioni, Nadler, Warner, and Zucker (2005). "Geometric Diffusions as a Tool for Harmonic Analysis and Structure Definition of Data: Diffusion Maps" 
(20)  Lafon and Lee (2006). "Diffusion Maps and Coarse-Graining: A Unified Framework for Dimensionality Reduction, Graph Partitioning, and Data Set Parameterization"
(21)  Chen, Little, Maggioni, and Rosasco (2011). "Some Recent Advances in Multiscale Geometric Analysis of Point Clouds" 
(22)  Chen and Maggioni (2011). "Multiscale Geometric Dictionaries for Point-Cloud Data"