From Optimization to Machine Learning

Optimization methods provide practical means of minimizing or maximizing objective functions. A Machine Learning (ML) problem can therefore be posed as an optimization problem by choosing a suitable objective function; the optimization procedure is then responsible for maximizing the fitness of a model for a specific task.

This approach is particularly well suited to supervised learning problems, where the objective is to learn a function $f^{*}:\mathbf{x}\mapsto\mathbf{y}$ from a set of training examples $\{(\mathbf{x}_{1},\mathbf{y}_{1}),\dots,(\mathbf{x}_{N},\mathbf{y}_{N})\}$. In this case, the objective function can simply be defined as the average error of the model over this set of examples. An optimization method can then be used to find the model that minimizes this error.
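To make the idea of minimizing an average error concrete, the following minimal sketch fits a one-dimensional linear model $f(x)=wx+b$ by gradient descent. Everything specific in it is an assumption chosen for illustration and does not come from the text: the squared error as the per-example error, the toy data generated from $y=2x+1$, the learning rate and step count, and the helper names `average_error` and `gradient_descent`.

```python
# Hypothetical toy data (assumption): targets follow y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def average_error(w, b, xs, ys):
    """Objective: mean squared error of f(x) = w*x + b over the examples."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Minimize the average error with plain (full-batch) gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to decrease the objective.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = gradient_descent(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}, error = {average_error(w, b, xs, ys):.2e}")
```

On this noiseless toy set the learned parameters should approach $w=2$ and $b=1$, driving the average error toward zero. With real data the minimum error is generally nonzero, and a low training error does not by itself guarantee good performance on new examples.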

We now present supervised and unsupervised learning problems and discuss the question of generalization. We then study several examples: linear classification, the $K$-means algorithm and polynomial regression. This leads us to pose the questions of hyper-parameter selection and of feature extraction. Finally, we present the semi-supervised learning problem and show how unlabeled data can be used to achieve better performance in supervised settings.