0

I'm reading "The Elements of Statistical Learning", to be precise chapter about optimal separating hyperplanes (SVM classification problem) and I've stuck with the following. I have a function to be maximized:

$$ \sum_{i=1}^{N}a_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{k=1}^{N}a_ia_ky_iy_kx_i^Tx_k, $$

subject to:

$$ 0 = \sum_{i=1}^{N}y_ia_i, \quad b = \sum_{i=1}^{N}a_iy_ix_i, \quad a_i \ge 0 \quad \forall{i}, \quad a_i[y_i(x_i^Tb + b_o) - 1] = 0 \quad \forall{i} $$

And it's written in the book that it's "a simpler convex optimization problem, for which standard software can be used", but I'm not very familiar with optimization methods used in such software.

Could you please suggest which method should I use to find a solution? Which method is commonly used in SVM implementations?

Anthony
  • 113
  • I believe you are interested in linearly separable formulations. Because constraints are affine (and the problem is convex) duality gap is 0. Primal-dual solutions we quite popular back then, so "primal-dual" could be a good starting point. Please also check https://www.cs.cmu.edu/~epxing/Class/10701-08s/recitation/svm.pdf . Regarding actual numerical algorithms being used in real-life you may find it useful to check Matlab's implementation – keoxkeox Nov 07 '17 at 08:26
  • Thank you, good link. As far as I see SVM presentations also ends with the problem described above, and it's still unclear for me which method exactly should I use to find an optimal solution. It seems that this is quadratic programming problem, but again I don't understand how to formulate it in terms of QP. – Anthony Nov 07 '17 at 13:04

0 Answers0