Sketching algorithms compress data in a way that is still useful for answering some pre-specified family of queries, possibly across datasets by comparing sketches. This course will cover mathematically rigorous models for developing such algorithms, as well as some provable limitations of algorithms operating in those models. Some topics covered include:
Streaming algorithms. Compute useful statistics over a dataset making only one pass over it, while using little memory.
Dimensionality reduction. General techniques and impossibility results for reducing data dimension while still preserving geometric structure.
Randomized linear algebra. Algorithms for big matrices (e.g. a user/product rating matrix for Netflix or Amazon). Regression, low rank approximation, clustering, etc.
Compressed sensing. Recovery of (approximately) sparse signals based on few linear measurements.
This is a graduate course, though there may be room for a limited number of advanced undergraduate students satisfying the following prerequisites: mathematical maturity and comfort with algorithms (e.g. CS 170), discrete probability, and linear algebra.