1

I have the following problem to optimize:

\begin{align} \begin{split} \hat{\mathbf{x}}, \hat{\delta} = \underset{\mathbf{x}, \,\delta}{argmin} \, \sum_{i=1}^l \, \sum_{j=1}^m \, a_{i,j}\left(x_i + \delta \right)^2 \\ s.t. \, \, \, \, 1 - x_i \leq 0 \, \, \forall \, i \\ \text{and} \,\, \delta \leq 0 \end{split} \end{align} where $a_{i,j}$ is just a variable that doesn't depend of $x_i$.

I am thinking to solve this problem by minimizing the following Lagrangian function: \begin{align} \underset{\mathbf{x}, \, \delta}{argmin} \, L\left(\mathbf{x}, \, \delta, \, \lambda_1, \, \lambda_2\right) = \sum_{j=1}^m \, \sum_{i=1}^l \, a_{i,j}\left(x_i + \delta \right)^2 + \lambda_1 \left(1 - x_i\right) +\lambda_2 \, \delta \end{align} and then find the derive =0.

I am sure I am wrong. Can someone provide me with brief mathematical details about how this problem can be solved?

Any help will be very appreciated.

Christina
  • 323
  • 1
    First, you should notice that you have 1 Lagrange multiplier for each constraint. This means you have $\sum_i \lambda_i (1-x_i)$ or $\lambda^\top (\boldsymbol{1} - \mathbf{x}) $. And also $\delta$ has another multiplier associated to it. – yes Nov 23 '20 at 21:32
  • 1
    why is there a constraint on the parameter $\delta$? – LinAlg Nov 23 '20 at 22:06
  • @Albus I am not sure of what you have written. Why should I take a vector of lambda? – Christina Nov 24 '20 at 00:14
  • So $\delta$ is fixed, right? You should remove it from the Lagrangian and the constraint list. – LinAlg Nov 24 '20 at 00:51
  • @LinAlg I edited the entire question. In fact $\delta$ is not a constant, but just it doesn't depend on $i$. – Christina Nov 24 '20 at 01:13
  • @Christina if you want to write the Lagrangian, you need to associate to each constraint one multiplier. Each $1-x_i \leq 0$ is a constraint, so you need one multiplier for each $i$. Similarly, you will need another one for the $\delta$. – yes Nov 24 '20 at 09:13
  • @Albus Thank you for your comment. I will try to solve the problem with your suggestion and will keep you posted. – Christina Nov 24 '20 at 11:37

1 Answers1

1

I assume that $a_{ij}$ are nonnegative. Define $q_i = \sum_j a_{ij}$, $p_i = q_i / \sum_i q_i$ and $c = -\delta$, then the problem is:

$$\min_{x,c} \left\{ \sum_i p_i(x_i - c)^2 : x_i \geq 1, c \geq 0 \right\}$$ You recognize the formula for the variance in the objective function. Let me write the problem as: $$\min_x \min_c \left\{ \sum_i p_i(x_i - c)^2 : x_i \geq 1, c \geq 0 \right\},$$ For fixed $x$, the inner problem is $\min_c \left\{ \sum_i p_i(x_i - c)^2 : c \geq 0 \right\}$. This is the minimum of a convex quadratic function with a nonnegativity constraint. The unconstrained minimum of the quadratic function is at $c=\sum_i p_i x_i$. This solution happens to satisfy the constraint $c\geq 0$ (because $x_i \geq 0$), so this solution is also optimal to the constrained problem.

Now you can write the original problem as: $$\min_x \left\{ \sum_i p_i(x_i - \sum_j p_j x_j)^2 : x_i \geq 1\right\},$$ or as $$\min_x \left\{ x^T(\textrm{Diag}(p) - pp^T)x : x_i \geq 1\right\},$$ where $\textrm{Diag}(p)$ is a matrix with $p$ on the diagonal and zeros elsewhere. If all $x_i$ are equal, the objective value is $0$. It is clear that you cannot get a value less than $0$, so $0$ is optimal.

LinAlg
  • 19,822
  • Thank you a lot for your answer. BTW I don't understand why $p_i = q_i / \sum_i, q_i$, how this will be equal to original forumation? and so how do you proceed with the mathematical calculation steps to find $\mathbf{x}$ and $c$. In addition why the Langragian method is not used? can you just add more details in your answer please? thank you. – Christina Nov 24 '20 at 13:49
  • 1
    Dividing the objective by a nonnegative constant does not change the optimal $x$ and $c$ (but it changes the objective value). I just "see" the optimal solutions from the analogy with variance, but you can find the optimal (unconstrained) $c$ by setting the derivative of the objective to $0$. – LinAlg Nov 24 '20 at 13:51
  • (the optimal $x$ is not zero) you could look at the eigenvalues and eigenvectors of the matrix $\text{Diag}(p)-pp^T$, and notice that the all-1 vector is an eigenvector with eigenvalue 0 – LinAlg Nov 24 '20 at 15:26
  • yes, there are many optimal $x$ that all result in an objective value of $0$. An example of a closed form solution is $x_i = 9$ for all $i$, and $\delta = -c = -9$. – LinAlg Nov 24 '20 at 15:57
  • For example how do you solve the optimization problem above and find the optimal $\mathbf{x}$ and $c$? because here we still have a constraint. – Christina Nov 24 '20 at 15:59
  • 1
    Without constraint you could pick $x=0$. If we did not know anything about $\textrm{Diag}(p) - pp^T$, this would not be a simple problem (you should open a new question about $\min_x { x^TAx : x \geq 1}$ if you are interested in that problem), but here we can look at its eigenvalues and eigenvectors and spot the solution right away. – LinAlg Nov 24 '20 at 16:11
  • Yes I see. But in fact how is possible that we transform our problem into another that involves two minimizations. For example: $\underset{\mathbf{x}}{min} , \underset{c}{min}$. Is this the same as we write $\underset{\mathbf{x}, ,c}{min}$ ? I don't think they are the same, but what allows us to do that? – Christina Nov 24 '20 at 19:19
  • 1
    That is the same, see this answer – LinAlg Nov 24 '20 at 19:26
  • One final question:d how did you get $x^T(\textrm{Diag}(p) - pp^T)x$ from $\sum_i p_i(x_i - \sum_j p_j x_j)^2$? – Christina Nov 24 '20 at 19:57
  • by writing $\sum_i p_i (x_i - \sum_j p_j x_j)^2 = \sum_i p_i x_i^2 - 2\sum_{i,j} p_i p_j x_i x_j + (\sum_j p_j x_j)^2 =\sum_i p_i x_i^2 - \sum_{i,j} p_i p_j x_i x_j $ – LinAlg Nov 24 '20 at 20:17
  • your calculation is not completed. You missed something in your last comment – Christina Nov 24 '20 at 20:19
  • 1
    I updated the comment a few times to make sure everything fits on a line, but it is correct as is. Note that $(\sum_j p_j x_j)^2 = (\sum_i p_i x_i)(\sum_j p_j x_j) = \sum_{i,j} p_i p_j x_i x_j$. – LinAlg Nov 24 '20 at 20:21
  • 1
    Ah okay I just thought that there is a minus sign at the end $-$, but it is $-$ LinAlg – Christina Nov 24 '20 at 20:24
  • I have asked a new question and I am not able to follow the same concept you explained to me. In particular, I can't find a way to deal with $y_j$. My question is here: https://math.stackexchange.com/questions/3921483/optimization-problem-with-several-constraints – Christina Nov 24 '20 at 22:07