Hey! I’m Akshat Tomar, and this is my corner of the internet where I document what I’m learning and building. Right now I’m studying SVMs and Optimization Theory.
This blog is about SVMs and the math behind them, and I hope everyone reading it comes away with a clear, in-depth understanding of the topic.
The prerequisites are linear algebra and the basics of optimization theory. Now we are good to go, so let’s dive deeper into SVMs.
Let’s get familiar with some terms related to the topic. In the figure below (fig1), the data is divided into two classes, namely {+1, -1}. Our objective is to find a line (a hyperplane) that separates the data; this line is called the Decision Boundary. The points (vectors) that are closest to the decision boundary are called Support Vectors, and the distance between the decision boundary and the support vectors is called the Margin. The algorithm that maximizes this margin, parameterized by the decision boundary, is known as the Maximal Margin Classifier.
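To make these terms concrete, here is a minimal sketch in Python. The post itself doesn’t name a library, so using scikit-learn’s `SVC` (with a very large `C` to approximate a hard margin) and this toy dataset are assumptions of mine; the point is just to show the decision boundary, the support vectors, and the margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable 2D data: class +1 on one side, class -1 on the other
X = np.array([[2.0, 2.0], [3.0, 3.0], [3.5, 1.5],      # class +1
              [-2.0, -1.0], [-3.0, -2.0], [-2.5, 0.5]])  # class -1
y = np.array([+1, +1, +1, -1, -1, -1])

# A very large C approximates the hard margin classifier on separable data
clf = SVC(kernel="linear", C=1e10)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]  # decision boundary: w.x + b = 0
print("support vectors:\n", clf.support_vectors_)
print("margin (distance from boundary to a support vector):",
      1.0 / np.linalg.norm(w))
```

The learned `w` and `b` define the decision boundary, `support_vectors_` are the points sitting on the margin, and for the canonical SVM formulation the margin width on each side comes out to 1/‖w‖.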

In this blog we will study the Maximal Margin Classifier, also known as the Hard Margin Classifier. To keep things simple, we will observe this algorithm only on linearly separable data.
However, a hard margin classifier is very sensitive to outliers. For example, in fig2, if we predict the class of the unseen data point, it will be assigned to Class +1 even though it is clearly more related to Class -1. It is therefore very important to handle outliers when using a hard margin classifier.
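A quick sketch of that sensitivity: below, a single mislabeled-looking point placed near the opposite cluster drags the hard margin boundary with it, flipping the prediction for a nearby probe point. The dataset, the probe point, and the use of scikit-learn with a huge `C` as a stand-in for a true hard margin are all illustrative assumptions, not something from fig2 itself.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.0], [3.0, 3.0],       # class +1
              [-2.0, -2.0], [-3.0, -1.0]])  # class -1
y = np.array([+1, +1, -1, -1])

# Hard-margin-like classifier (large C) on the clean data
clean = SVC(kernel="linear", C=1e10).fit(X, y)

# Add a single class -1 outlier sitting close to the +1 cluster
X_out = np.vstack([X, [[1.5, 1.5]]])
y_out = np.append(y, -1)
noisy = SVC(kernel="linear", C=1e10).fit(X_out, y_out)

# The one outlier pulls the boundary toward the +1 cluster,
# so a point that was comfortably +1 can flip to -1
probe = np.array([[1.6, 1.6]])
print("clean model:", clean.predict(probe))   # +1 on this toy set
print("with outlier:", noisy.predict(probe))  # flips to -1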

Now that we are aware of the important terminology, let's get started.

Let $I_+$ and $I_-$ be the sets containing the training data for the $+1$ and $-1$ classes respectively; then
$$ I_+ = \{x_i \mid y_i = +1\}, \qquad I_- = \{x_i \mid y_i = -1\} $$
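In code, this split is just indexing by label. A tiny sketch (the arrays are placeholder data of my own, not from the post):

```python
import numpy as np

# Toy training set: feature vectors x_i with labels y_i in {+1, -1}
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([+1, +1, -1, -1])

# I_+ and I_- as defined above: the training points of each class
I_plus = X[y == +1]
I_minus = X[y == -1]
print("I+ =", I_plus.tolist())
print("I- =", I_minus.tolist())
```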