0

This is part-1 of a series of questions regarding (ultimately), in my implementing a co-ordinate descent algorithm, but I have broken it up into parts as I try to solve it 'by hand' first, so that I can better understand.

So basically, I am given a vector, $\begin{bmatrix} x_{1} & x_{2} & x_{3} \end{bmatrix} $, and I am trying to minimize a function of this vector, given by:

$$ H(x_{1}, x_{2}, \cdots, x_{N}) = \displaystyle\sum\limits_{i=1}^N \displaystyle\sum\limits_{j=1}^N \frac{\lvert x_{i} - x_{j} \rvert^{p}}{p} $$

where here obviously for the sake of simplicity, $N=3$, and I have chosen $p=2$. What this is trying to do is minimize the sum of the total absolute differences (raised to a power) of all the samples of the vector. Now, I know the answer in the end is $x_{1} = x_{2} = x_{3} = c$, where $c$ is just some constant. In other words, a flat line.

So what I did, was take the partial derivatives of $H$ as a function of $x_{1}, x_{2}$ and $x_{3}$, and set them all to 0. The final equations I come up with are:

$$ 2x_{1} - x_{2} - x_{3} = 0 \\ -x_{1} + 2x_{2} -x_{3} = 0 \\ -x_{1} - x_{2} + 2x_{3} = 0 $$

My questions are as follows:

  • Is this correct?

    • If so, how does one determine the fact that all x's must be equal to each other from here?
  • What would be be partial derivative equations if p = 1?

One bonus ease of implementation question:

  • Is there an easy way to show all those results in wolfram alpha for this particular case? (I am new to it, but also need a quick way to test my hand calcs through it).

Thanks in advance!

Spacey
  • 1,344
  • 2
  • 13
  • 22
  • Relevant reading: critical point. –  Apr 03 '12 at 04:21
  • I'm quite lost : You want to minimize a non-negative function for which you have,as you say, that $H(c,c,\ldots,c)=0$ for any $c$. Why do you want an algorithm for that ? – Student Apr 03 '12 at 04:57
  • @Student This is part-1 of a series of questions I plan to ask about this type of minimization. I have broken it down into smaller questions to make sure I understand it/am getting it right. For example, my next question is going to be about a constrained version of this. – Spacey Apr 03 '12 at 05:03
  • Hey, Wolfram can minimize –  Apr 03 '12 at 16:46

1 Answers1

2

Longer than a comment

The $3$ linear equations in your questions, can be rewritten as $Ax = 0,$ i.e.: $$ \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \\ \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0 \tag{1}$$ The rank of $A$ is 2. (Google: "how to compute rank of a matrix") In other words, the nullity of $A$ is $1.$ (Google: "rank+nullity theorem") The nullspace basis is (Google: "how to compute nullspace basis of a matrix"): $$b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} $$ In other words, all solutions to Equation $(1)$ above, are multiples of $b.$ So $x_1 = x_2 = x_3 = c$ is a solution for any $c.$


For future references, you can ask Wolfram|Alpha for nullity and nullspace basis. You can also ask for minimization directly.

  • Thank you, one question, is there an easy way to show this in either wolfram or MATLAB? In matlab, null(the_matrix) gives me $\begin{bmatrix} 0.5774 & 0.5774 & 0.5774 \end{bmatrix}$, I realize this is just a scaled version of the $b$ above, but why is it scaled like that? – Spacey Apr 03 '12 at 15:37
  • Mohammad, it's normalized such that $b$ has a unit legnth, i.e. $| b | = 1.$ Try $\sqrt{0.5774^2 + 0.5774^2 + 0.5774^2} = 1.$ –  Apr 03 '12 at 16:27
  • In Wolfram, you can ask for the nullity or the nullspace basis of a matrix. –  Apr 03 '12 at 16:29