Course completion
  • ML problems can be posed as optimization problems where the objective function represents an error or performance measure on a dataset of training examples.
  • Classification typically aims to minimize the misclassification rate, while regression typically aims to minimize the mean squared error (MSE).
  • The fundamental problem of ML is generalization to unseen examples.
  • ML relies on models to represent assumptions about the regularities of a dataset.
  • Generalization is achieved by controlling model complexity to avoid under-fitting or over-fitting.
  • K-means is a clustering algorithm that alternates between two steps: reassigning each example to a cluster given the current centroids, and recomputing each centroid given the current cluster assignments.
  • Model selection can be seen as a secondary learning problem where the objective is to find the hyper-parameters that maximize generalization.
  • Supervised learning can benefit from a new representation which corresponds to a mapping of the input to a new feature space.
  • A new feature space can be obtained without human intervention by learning representations with an unsupervised algorithm.
  • Learning representations with an unsupervised algorithm has several benefits for generalization.
  • Learning sparse representations may improve generalization if we can assume that inputs can be represented by a limited number of features.
  • Learning distributed representations may allow for non-local generalization if each example can be interpreted as combining several features.
  • Combining sparse and distributed representations leads to features which are similar to those found in primate brains.
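
The K-means alternation summarized above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the course's reference implementation: the function names (`squared_distance`, `kmeans`), the random initialization from the data points, the fixed seed, the iteration cap, and the toy data are all choices made here for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iters=100, seed=0):
    """Alternate between assigning points to centroids and updating centroids.

    Assumption for this sketch: centroids are initialized by sampling k
    distinct points from the data.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(n_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop when the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assignments


# Toy data (hypothetical): two well-separated groups of three points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centroids, assignments = kmeans(points, k=2)
print(centroids, assignments)
```

On this toy data the two groups end up in different clusters; which group receives label 0 depends on the random initialization. In general K-means only finds a local optimum, so results on harder data can vary with the starting centroids.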

Next: Machine Learning with probabilities