I need some help with the theory of Hilbert spaces. Below is an excerpt from Olivier Chapelle's paper "Training a Support Vector Machine in the Primal". In Eq. (8) the optimization problem is stated in its primal form; then, using a kernel $k$ and its associated RKHS $\mathcal{H}$, the author arrives at Eq. (9).
How does Eq. (9) follow from Eq. (8)? And what form would Eq. (9) take if the bias term $b$ were not zero?
Any comments would be very useful. Thanks in advance!
