9

Suppose we have to minimize a function $f(\mathbf x,\mathbf y)$ where $\mathbf x$, $\mathbf y$ are vectors in Euclidean space. The function is convex in $\mathbf y$ when $\mathbf x$ is kept constant and convex in $\mathbf x$ when $\mathbf y$ is kept constant.

Can we then minimize with respect to $\mathbf x$ and $\mathbf y$ separately and find the solution to the initial problem?

Edit: $$g(y,x^*)= \text{min}_x f(x,y)$$ where $x^*$ is $$\text{argmin}_xf(x,y)$$ and $$y^*=\text{argmin}_yg(y,x^*),$$then can we say $$(x^*,y^*)=\text{argmin}_{x,y}f(x,y).$$

I would apriciate if someone can explain how this is true (if it is true) and also a reference where I can look it up, like an online note or a book. Thanks

triomphe
  • 3,848
  • 1
    what do you mean by "Can we then minimize with respect to x and y separately"? – pluton Jul 28 '13 at 04:17
  • Yes if your function is defined on an open set (or collection of open balls) in the plane (there are no corners or edges) and the functions is thrice differentiable (to be safe). – DBS Jul 28 '13 at 05:07
  • Indeed this is too general, you need to tell us more about the function $f$. – al-Hwarizmi Jul 28 '13 at 06:32
  • 3
    If you mean first finding $g(y)=\min_x f(x,y)$, and then minimizing $g$, then yes - you'll get the minimum. This is true for any functions, although separate convexity makes it easier to find $g$. You can also search for separate convex programming to make sure you are not reinventing the wheel. – 40 votes Jul 28 '13 at 06:35
  • Two 1D optimizations won't find the global minimum correctly if the function surface f(x,y) has a shape of a long ravine (not aligned with either axis) with sharp curvature across the ravine and a gently sloping floor. This is the situation when the steepest descent type methods (similar to the proposed idea) perform poorly. – Maxim Umansky Jul 28 '13 at 07:05
  • @40votes I think you got what I wanted. I included the edit in the question to explain my requirement. Thanks – triomphe Jul 28 '13 at 13:12
  • @DBS Can't it be defined in a convex closed set? Could you please explain why it has to be an open set? Thank you – triomphe Jul 28 '13 at 16:04
  • 1
    @MLT If your domain is a convex open set (and the differentiability condition is satisfied) then you can find local minima by looking at points where both partial derivatives vanish and then looking at the second derivative. However if your set is closed (and compact) the minima could be at the boundary which could be a complicated curve in $x,y$ and hence you cannot restrict to just one variable. – DBS Jul 28 '13 at 21:15
  • @DBS Thank you. Could you please refer me to a book where I can study these ideas you mention in a formal manner? – triomphe Jul 29 '13 at 03:19
  • @MLT Take a look at any multivariable calculus book (on the section of maxima-minima) of mutivariate functions. For example Stewart's book (the multivariable sections) is quite popular. Also there are plethora of other calculus books Apostol is a personal favorite. – DBS Jul 29 '13 at 03:38
  • related: https://math.stackexchange.com/questions/1293761/minimize-multi-variable-function-one-variable-at-a-time/1293957#1293957 – Chill2Macht Sep 03 '18 at 23:46

3 Answers3

13

This has nothing to do with convexity nor with the method used to determine minima; it is purely a matter of logic and order. It is with minimax problems that the real difficulties arise.

Note that your function $g$ depends only on the variable $y$. I'd argue as follows, neglecting questions of existence:

Given $$f:\quad X\times Y\to{\mathbb R},\qquad (x,y)\mapsto f(x,y)\ ,$$ put $$g(y):=\min_{x\in X} f(x,y)\quad(y\in Y)\ ,\qquad S(y):=\{x\in X\>|\>f(x,y)=g(y)\}\ .$$ The set $S(y)$ replaces your ${\rm argmin}_x f(x,y)$, and hopefully consists of only one point $x^*(y)\in X$.

Now we know on each " horizontal" $y={\rm const.}$ the minimal value taken by $f$ and the set of points where this minimal value is taken. We proceed to the second step: Put $$\mu:=\min_{y\in Y} g(y), \qquad T:=\{y\in Y\>|\>g(y)=\mu\}\ .$$ Then $\mu$ is the overall minimal value of $f$, and the set of $(x,y)$ where this value is taken is given by (check it!) $$f^{-1}(\mu)=\{(x,y)\>|\>y\in T, \ x\in S(y)\}\ .$$ When $S(y)=\{x^*(y)\}$ we have $g(y)=f\bigl(x^*(y),y\bigr)$, and when in the second step we find $T=\{y^*\}$, then $\mu=f\bigl(x^*(y^*),y^*\bigr)$, and we are allowed to write $${\rm argmin}\>f=\bigl(x^*(y^*),y^*\bigr)\ .$$

Edit: Side note concerning convexity with respect to several variables

In the situation considered by the OP the function $f$ is only separately convex in each of the two variables $x$ and $y$. In this case neither the function $f$ nor $g$ can be expected to be convex. Consider the example $$f(x,y):=x^2-4xy+y^2\ .$$ Then $f_{xx}=f_{yy}\equiv2$, so $f$ is convex in each of the two variables separately; but $$f(x,y)=(x-2y)^2-3y^2\tag{1}$$ shows that $f$ is not convex as a function of two variables. From $(1)$ we deduce $g(y)=-3y^2$, which certainly is not convex. (In this case $\inf_{y\in{\mathbb R}} g(y)=-\infty$; but I conjecture that one could cook an example of this kind with a finite minimum $\mu$.)

  • Thanks a lot. This is very clear. One small clarification is required. I guess in my case the convexity makes the minimum that we found out in two steps unique? – triomphe Jul 28 '13 at 14:28
  • @MLT: You only have assumed convexity in each variable separately. Consider in this respect the function $f(x,y):=x^2-4x y+y^2$, which, as a function of two variables, is not convex. – Christian Blatter Jul 28 '13 at 14:49
  • @Chritian Blatter Yes my function is only convex in each variable separately. So in the first part I will get a unique $x^(y)$ for each $y$ but in the second part when I substitute that $x^(y)$ the function $g(y)$ may no longer be convex in $y$ is that what you mean? Thank you – triomphe Jul 28 '13 at 15:23
  • @MLT: See my edit. – Christian Blatter Jul 28 '13 at 17:14
  • @Chritian Blatter So in your example is $-\inf$ the solution to $\text{min}{x,f} f(x,y)$ and $\text{argmin}{x,y} = {+\inf,-\inf}$ ? Because your theory says one can minimize with respect to each variable. Thank you. – triomphe Jul 28 '13 at 21:30
  • @ChristianBlatter what i understood from your answer is that first you prove that the method shown by the OP is right (just before "In the situation..." part of your answer). After doing so you give an example that the methodology can not be applied even if the functions are convex in each variable. I am extremely confused please help me in understanding it clearly. Thank you. – Frank Moses Jun 23 '16 at 02:29
  • @Frank Moses: The method of the OP is basically correct, but it did not envisage cases of multiple equal minima in the first as well as in the second step. My answer tried to explain this. The text starting with "In the situation..." is an aside note concerning convexity, and has nothing to do with the problem as such. – Christian Blatter Jun 24 '16 at 07:24
10

The above answer is correct but a little verbose. For all $(\mathbf{x}, \mathbf{y})$:

$$f (\mathbf{x}, \mathbf{y}) \ge \min_{\mathbf{y}} f(\mathbf{x}, \mathbf{y}) \ge \min_{\mathbf{x}} \min_{\mathbf{y}} f(\mathbf{x}, \mathbf{y}) \,.$$

Therefore, for any $(x^*, y^*)$ such that $$f(x^*, y^*) = \min_{\mathbf{x}} \min_{\mathbf{y}} f(\mathbf{x}, \mathbf{y})$$ we have that for all $(\mathbf{x}, \mathbf{y})$:

$$ f(\mathbf{x}, \mathbf{y}) \ge f(x^*, y^*) $$

so by definition $\DeclareMathOperator*{\argmin}{argmin}$

$$(x^*, y^*) \in \argmin_{\mathbf{x}, \mathbf{y}} f(\mathbf{x}, \mathbf{y}) \,.$$

Chill2Macht
  • 20,920
1

The equality holds for every possible case whenever $x$ and $y$ are independent i.e.

\begin{equation}\label{eq:0} \min_{y}\min_{x} f(x,y) = \min_{x,y} f(x,y) \end{equation}

Proof: \begin{equation}\label{eq:1} f(x^*,y^*)=\min_{x,y}f(x,y) \leq \min_{y}\min_{x}f(x,y)\,\,(\text{this holds trivially})\tag{1} \end{equation} where $(x^*, y^*) = \text{argmin}_{x,y}f(x,y)$.

Now, \begin{equation}\label{eq:2} \min_y\min_x f(x,y) \leq \min_y f(x^*,y)\tag{2} \end{equation} because $\min_x f(x,y) \leq f(x^*, y) \, \forall y$

and also

\begin{equation}\label{eq:3} \min_y f(x^*,y) \leq f(x^*,y^*) \, \text{(holds trivially)}\tag{3} \end{equation}

Now $\eqref{eq:2}$ and $\eqref{eq:3}$ implies \begin{equation}\label{eq:4} \min_y\min_x f(x,y) \leq f(x^*,y^*)\tag{4} \end{equation}

Overall \eqref{eq:1} and \eqref{eq:4} implies \begin{equation} \min_{y}\min_{x} f(x,y) = \min_{x,y} f(x,y) \end{equation}

Please note that in this proof I haven't used any restrictions of convexity or smoothness. Thus the claim is true for all circumstances.