|Computer Science/Discrete Mathematics Seminar II|
|Topic:||Learning models: connections between boosting, hard-core distributions, dense models, GAN, and regularity II|
|Affiliation:||University of California, San Diego|
|Date:||Tuesday, November 14|
|Time/Room:||10:30am - 12:30pm/S-101|
A theme that cuts across many domains of computer science and mathematics is to find simple representations of complex mathematical objects such as graphs, functions, or distributions on data. These representations need to capture how the object interacts with a class of tests, and to approximately determine the outcome of these tests. For example, a scientist is trying to find a mathematical model explaining observed data about some phenomenon, such as kinds of butterflies in a forest. A minimal criterion for success is that the model should accurately predict the results of future observations. When is this possible? This general situation arises in many contexts in computer science and mathematics. In machine learning, the object might be a distribution on data points, high dimensional real vectors, and the tests might be half-spaces. The goal would be to learn a simple representation of the data that determines the probability of any half-space or possibly intersections of half spaces. In computational complexity, the object might be a Boolean function or distribution on strings, and the tests are functions of low circuit complexity. In graph theory, the object is a large graph, and the tests are the cuts In the graph; the representation should determine approximately the size of any cut. In additive combinatorics, the object might be a function or distribution over an Abelian group, and the tests might be correlations with linear functions or polynomials. This talk is to illustrate the connections between these various contexts.
- Boosting: From machine learning, a boosting algorithm converts a weak learner, that finds a hypothesis that has a small correlation to an unknown , to a strong learner, that finds a hypothesis agreeing with the function almost everywhere.
- Hard-core distributions: From complexity theory, a hard-core distribution lemma says that any Boolean function which cannot be computed with success more than $1-\delta$ on random inputs, has a sub-distribution of size $2 \delta$ where the function is pseudo-random.
- Dense model theorems: Initially from additive combinatorics. Dense model theorems say that any distribution either has a simple witness that it is small (has low min-entropy) or has a simple indistinguishable distribution that is large (high min-entropy).
- GAN-type algorithms: From machine learning, generative adversarial networks (GANs) are algorithms that use a distinguisher that finds differences between a target distribution and a sampling algorithm in order to refine the sampling algorithm. Algorithmic versions of dense model theorems can be viewed as analogous to GAN's in many ways, but can have stronger provable guarantees.
- Regularity lemmas: Initially from graph theory, a regularity lemma shows that an arbitrary mathematical object has a simply described approximation.