1

The general problem I want to solve is well studied:

$$ \min_x \Vert Ax\Vert_\infty \;\;\; \mathrm{s.t.} \;\;\; Bx=c, $$

which is equivalent to the following linear program:

$$ \min_{t,x} \, t \;\;\; \mathrm{s.t.} \;\;\; -t \leq Ax \leq t \, \wedge \, Bx=c $$

There are several available blackbox solvers for the above. My problem is that $A$ is huge (viz. too big for memory) and is completely impractical to specify as a matrix. However, I can efficiently compute $Ax$ with a function, but the solvers do not allow for function handles. I had a similar problem in the past minimizing the $\ell_2$ norm of a large-scale linear system but was able to solve it following the suggestion of using a projected gradient method. This time the large-scale linear system is in the constraints, which means that I would again need to pass $A$ to the solver used in the projection of the gradient onto the feasible region. Again, due to the size of $A$, this is not doable.

Any ideas on algorithms or software to solve the above would be greatly appreciated.

1 Answers1

2

There are many methods. Here I will suggest one - formulating it as a sum of two non-smooth functions with (relatively) easily computable proximal operators. Then, you can use any method for optimizing a sum of two non-smooth functions, such as Douglas-Rachford.

You can re-formulate it as: $$ \min_{x,y} ||y||_{\infty} \quad \mathrm{s. t.} ~ Bx = c, y = Ax $$

As a reminder, the indicator of a set $C$, denoted by $\delta_C(x)$ is defined as: $$ \delta_C(x) = \begin{cases} 0 & x \in C \\ \infty & x \notin C \end{cases} $$

Now we re-write the optimization problem using indicator functions: $$ \min_{x, y} \underbrace{||y||_{\infty}}_{f_1} + \underbrace{\delta_{\{x:Bx = c\}}(x) + \delta_{\{(x,y):y = Ax\}}(x,y)}_{f_2} $$

Methods that minimize such functions will require computing the proximal operators of $f_1$ and $f_2$.

  • Since $f_1$ the infinity norm, its proximal operator can be computed by projecting onto the $\ell_1$-norm ball.
  • The proximal operator of $f_2$ is composed of projecting onto the set: $$ \left \{ (x, y): \begin{bmatrix} A&I\\B&0 \end{bmatrix} \begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\c\end{bmatrix} \right \} $$ This projection cannot be computed accurately given the large-scale of the problem. However, the Conjugate-Gradient method can be used to compute it up to a certain accuracy.
  • This is extremely helpful. Researching your answer led me to proximal gradient method, but there is something I cant figure out. I understand that the proximal operator for $\ell_\infty$ is $\operatorname{prox}{tf} (x) = x - t P{B}(x/t)$, where $P_B$ is projection onto the $\ell_1$ ball. However, unless I am mistaken, I need the proximal operator for $\Vert b^{\mathrm{T}} \left[ \begin{array}{c} x \ y \end{array} \right] \Vert_\infty$, and I have not been able to find this. How do I account for $x$ in the proximal op? Its output needs to be in $\mathbb{C^2N}$, so I can't just ignore it. – AnonSubmitter85 May 19 '15 at 22:06
  • In the above comment, $x \in \mathbb{C}^N$, $b \in \mathbb{C}^{2N}$, with $b = \left[ \begin{array}{c} 0 \ 1 \end{array} \right]$, and $\mathbb{C}^2 N$ should read $\mathbb{C}^{2N}$. – AnonSubmitter85 May 19 '15 at 22:13
  • One property of proximal operators is separability. If you have $h(x, y) = h_1(x) + h_2(y)$, then $\operatorname{prox}{h}(z, w) = [\operatorname{prox}{h_1}(z), \operatorname{prox}{h_2}(w)]^T$. In your case you wish to compute the proximal operator of $h(x, y) = ||y||{\infty}$, so $h_1(x) = 0$ and $h_2(y) = ||y||_{\infty}$. – Alex Shtoff May 20 '15 at 09:41
  • Thank you much for this. I've got it coded up and it appears to be working. – AnonSubmitter85 May 21 '15 at 20:44