Here's a simple constrained optimization problem ($X,P \in\mathbb{R}^{m\times n}$):
$Minimize_{X}~ \|P-X\|_{F}^{2}$
$subject~to~\|X\|_{0}= k$.
The optimal $X$ can be got by setting all but the $k$ largest elements of $P$ to zero. That seems intuitive enough, since the lowest the objective can be is zero (when $X=P$), and because of the constraint, we only keep k non-zero elements. How can I prove this analytically?
In a similar vein, how do I analytically find the minimizer to the following problem ($D\in\mathbb{R}^{n\times n}$)?
$Minimize_{X}~ \|P-X\|_{F}^{2}+\gamma\|DX\|_{1}$
$subject~to~\|X\|_{0}= k$.
Can I do the following: find an $X$ that minimizes the objective (introduce a splitting variable and use Augmented Lagrangian) and then keep only the $k$ largest components? Is there a better way?