Seminar on Theoretical Machine Learning
Topic: Hyperparameter optimization: a spectral approach
Date: Monday, October 2
Time/Room: 12:30pm - 1:45pm, White-Levy Room
Modern machine learning algorithms often involve complicated models with tunable parameters, called hyperparameters, that are set during training. Hyperparameter tuning/optimization is often considered an art (see, e.g., http://www.argmin.net/2016/06/20/hypertuning/). We give a simple, fast algorithm for hyperparameter optimization inspired by techniques from the analysis of Boolean functions. We focus on the high-dimensional regime, where the canonical example is training a neural network with a large number of hyperparameters. The algorithm, an iterative application of compressed sensing techniques for orthogonal polynomials, requires only uniform sampling of the hyperparameters and is thus easily parallelizable. Experiments on training deep nets on CIFAR-10 show that, compared to state-of-the-art tools (e.g., Hyperband and Spearmint), our algorithm finds significantly improved solutions. Additionally, our method comes with provable guarantees and yields a quasi-polynomial-time algorithm for learning decision trees under the uniform distribution with the best known sample complexity. The talk covers the paper https://arxiv.org/abs/1706.00764 as well as work in progress.
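To give a flavor of the spectral idea, here is a minimal sketch (not the authors' implementation) of one round of the recipe: sample hyperparameter assignments uniformly from {-1, +1}^n, evaluate the objective, and use sparse recovery (here, Lasso) over low-degree parity features to identify the few monomials that dominate the objective's Fourier expansion. The toy objective and all parameter values below are invented for illustration.

```python
import itertools
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, degree, n_samples = 10, 2, 300  # assumed toy sizes

# Hypothetical objective that depends sparsely on a few low-degree parities.
def objective(x):
    return 1.0 + 2.0 * x[0] - 1.5 * x[1] * x[3] + 0.5 * x[6]

# Uniformly sample hyperparameter assignments from {-1, +1}^n.
X = rng.choice([-1.0, 1.0], size=(n_samples, n))
y = np.array([objective(x) for x in X])

# Parity features chi_S(x) = prod_{i in S} x_i for all subsets |S| <= degree.
subsets = [s for d in range(1, degree + 1)
           for s in itertools.combinations(range(n), d)]
Phi = np.column_stack([X[:, list(s)].prod(axis=1) for s in subsets])

# Compressed-sensing step: sparse recovery of Fourier coefficients via Lasso.
lasso = Lasso(alpha=0.05).fit(Phi, y)

# The subsets with the largest recovered coefficients flag the influential
# hyperparameters; a full method would fix these and iterate on the rest.
top = sorted(zip(subsets, lasso.coef_), key=lambda t: -abs(t[1]))[:3]
print(top)
```

In this sketch the top recovered subsets match the parities used to build the toy objective; in the iterative scheme, the most influential hyperparameters would then be fixed and the procedure repeated on the remainder.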