Least-Squares Classification

Type
Linear Classifier

Very similar to linear regression: the same principles apply, except the two classes are assigned the labels $\{-1, +1\}$ and the usual least-squares optimization is solved on those targets. The decision boundary is always $f(x)=0$, so the prediction applies a $sign$ function to $f(x)$.

$$f(x) = \begin{cases} >0 & \text{decide } y_1 \\ <0 & \text{decide } y_2 \end{cases}$$
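A minimal NumPy sketch of this recipe on synthetic data (the two-blob dataset, seed, and variable names are illustrative assumptions, not from the notes): fit ordinary least squares against $\{-1, +1\}$ targets, then predict with the sign of $f(x)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian blobs, labelled -1 and +1 (synthetic, illustrative data).
X_neg = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(50, 2))
X_pos = rng.normal(loc=[+2.0, +2.0], scale=1.0, size=(50, 2))
X = np.vstack([X_neg, X_pos])
y = np.concatenate([-np.ones(50), np.ones(50)])

# Append a bias column and solve the usual least-squares problem
# min_w ||Xb w - y||^2, exactly as in linear regression.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# The decision boundary is f(x) = 0; predict with the sign of f(x).
y_hat = np.sign(Xb @ w)
print("training accuracy:", np.mean(y_hat == y))
```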

2 classes:

3 classes (example of why it can perform badly):

For more than 2 classes, use one-hot encoding (one target column per class).
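A hedged sketch of the same idea for $K > 2$ classes (again on made-up blob data): build a one-hot target matrix, solve a single least-squares problem with one target column per class, and classify by the largest of the $K$ linear scores.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 3

# Three synthetic Gaussian blobs (illustrative data).
means = np.array([[0.0, 4.0], [-3.0, -2.0], [3.0, -2.0]])
X = np.vstack([rng.normal(m, 1.0, size=(40, 2)) for m in means])
y = np.repeat(np.arange(K), 40)

# One-hot targets: row i has a 1 in column y[i], zeros elsewhere.
T = np.eye(K)[y]

# Solve W = argmin ||Xb W - T||_F^2 (one weight column per class).
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
W, *_ = np.linalg.lstsq(Xb, T, rcond=None)

# Classify by the class whose linear score is largest.
y_hat = np.argmax(Xb @ W, axis=1)
print("training accuracy:", np.mean(y_hat == y))
```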

Cons:

  • Sensitive to outliers (squared error penalizes large mistakes heavily; see the sketch after this list)
  • Not ideal for classification (it treats a discrete problem as regression)
  • Can give predictions outside the [-1, +1] range
  • Often outperformed by logistic regression or SVMs
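
A small sketch of the outlier sensitivity (synthetic data, illustrative helper name `fit_ls`): points that lie far on the correct side of the boundary still receive a large squared error, so adding them changes the fitted weights and therefore the boundary.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_ls(X, y):
    """Least-squares fit with a bias column; returns the weight vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

# Two well-separated blobs, labels -1 and +1.
X = np.vstack([rng.normal([-2.0, 0.0], 1.0, size=(50, 2)),
               rng.normal([+2.0, 0.0], 1.0, size=(50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])

# Add a few points that are clearly in the +1 region, just very far away.
X_out = np.vstack([X, rng.normal([+12.0, 0.0], 1.0, size=(5, 2))])
y_out = np.concatenate([y, np.ones(5)])

print("weights without outliers:", fit_ls(X, y))
print("weights with outliers:   ", fit_ls(X_out, y_out))
# The boundary w^T x + b = 0 shifts even though the extra points were
# already on the correct side: squared error penalizes their large f(x).
```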

Brain Teaser: You can also try training two different linear models and interpreting the results.