How can a function of the form $\sum \sum \frac{\lvert x_{i} - x_{j}\rvert^{p}}{p}$ be minimized? (Part I)

Question

This is part-1 of a series of questions regarding (ultimately), in my implementing a co-ordinate descent algorithm, but I have broken it up into parts as I try to solve it 'by hand' first, so that I can better understand.

So basically, I am given a vector, $\begin{bmatrix} x_{1} & x_{2} & x_{3} \end{bmatrix} $, and I am trying to minimize a function of this vector, given by:

$$ H(x_{1}, x_{2}, \cdots, x_{N}) = \displaystyle\sum\limits_{i=1}^N \displaystyle\sum\limits_{j=1}^N \frac{\lvert x_{i} - x_{j} \rvert^{p}}{p} $$

where here obviously for the sake of simplicity, $N=3$, and I have chosen $p=2$. What this is trying to do is minimize the sum of the total absolute differences (raised to a power) of all the samples of the vector. Now, I know the answer in the end is $x_{1} = x_{2} = x_{3} = c$, where $c$ is just some constant. In other words, a flat line.

So what I did, was take the partial derivatives of $H$ as a function of $x_{1}, x_{2}$ and $x_{3}$, and set them all to 0. The final equations I come up with are:

$$ 2x_{1} - x_{2} - x_{3} = 0 \\ -x_{1} + 2x_{2} -x_{3} = 0 \\ -x_{1} - x_{2} + 2x_{3} = 0 $$

My questions are as follows:

Is this correct?
- If so, how does one determine the fact that all x's must be equal to each other from here?
What would be be partial derivative equations if p = 1?

One bonus ease of implementation question:

Is there an easy way to show all those results in wolfram alpha for this particular case? (I am new to it, but also need a quick way to test my hand calcs through it).

Thanks in advance!

I'm quite lost : You want to minimize a non-negative function for which you have,as you say, that $H(c,c,\ldots,c)=0$ for any $c$. Why do you want an algorithm for that ? — Student, Apr 03 '12 at 04:57
@Student This is part-1 of a series of questions I plan to ask about this type of minimization. I have broken it down into smaller questions to make sure I understand it/am getting it right. For example, my next question is going to be about a constrained version of this. — Spacey, Apr 03 '12 at 05:03

score 2 · Accepted Answer · 2012-04-03T16:48:06.343

2

Longer than a comment

The $3$ linear equations in your questions, can be rewritten as $Ax = 0,$ i.e.: $$ \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \\ \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0 \tag{1}$$ The rank of $A$ is 2. (Google: "how to compute rank of a matrix") In other words, the nullity of $A$ is $1.$ (Google: "rank+nullity theorem") The nullspace basis is (Google: "how to compute nullspace basis of a matrix"): $$b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} $$ In other words, all solutions to Equation $(1)$ above, are multiples of $b.$ So $x_1 = x_2 = x_3 = c$ is a solution for any $c.$

For future references, you can ask Wolfram|Alpha for nullity and nullspace basis. You can also ask for minimization directly.

edited Apr 03 '12 at 16:48

answered Apr 03 '12 at 04:17

Thank you, one question, is there an easy way to show this in either wolfram or MATLAB? In matlab, null(the_matrix) gives me $\begin{bmatrix} 0.5774 & 0.5774 & 0.5774 \end{bmatrix}$, I realize this is just a scaled version of the $b$ above, but why is it scaled like that? – Spacey Apr 03 '12 at 15:37
Mohammad, it's normalized such that $b$ has a unit legnth, i.e. $| b | = 1.$ Try $\sqrt{0.5774^2 + 0.5774^2 + 0.5774^2} = 1.$ – Apr 03 '12 at 16:27
In Wolfram, you can ask for the nullity or the nullspace basis of a matrix. – Apr 03 '12 at 16:29

How can a function of the form $\sum \sum \frac{\lvert x_{i} - x_{j}\rvert^{p}}{p}$ be minimized? (Part I)

1 Answers1