
I'd like to find a pair of functions $p$ and $f$ such that $p(x) = 0.5 x^2 - 0.25 f(x)^2$, where $p$ is positive definite and non-convex, and $f(0) = 0$.

Is there any example that satisfies the above equation?

1 Answer


Sorry for the unclear question. The details are too long for a comment, so I am posting them here.

In optimal control theory, a "positive (semi)definite function" sometimes means a function $g$ with $g(0) = 0$ that is positive (nonnegative) for all $x \neq 0$. One example of this usage is [1].

I'm trying to build a simple discrete-time (DT) optimal control problem whose state-action value function is non-convex in the state argument (explained below).

My first attempt is:

$$ \begin{aligned} x_{1}^{+} &= x_{2} \\ x_{2}^{+} &= f(x) + u \end{aligned} $$ where $x = [x_{1}, x_{2}]^T$ and $u \in \mathbb{R}$ are the state and the control input, respectively, and $x^{+}$ denotes the next state. For convenience, I omit the time index.

The optimal control must satisfy the Bellman equation, i.e.,

$$ \min_{u} \left\{ r(x, u) + V([x_2, f(x)+u]^T) \right\} - V([x_{1}, x_2]^T) = 0 $$ where the cost is $J(x_0 | u) = \sum_{i=0}^{\infty} r(x^i, u^i)$ (with an abuse of notation, the argument $u$ of $J$ denotes a policy (control law), and $x^i$ and $u^i$ denote the $i$-th state and input, respectively) and $V(x) = \min_{u}J(x|u)$. Note that the "state-action" value refers to $Q(x, u) := V(x^{+}) + r(x, u)$, and I want to find an example such that $Q(\cdot, \cdot)$ is non-convex in general but $Q(x, \cdot)$ is convex for any given $x$. To keep things simple, I set $V(x) = V_1(x_1) + 0.5 x_2^2$ and $r(x, u) = p(x) + 0.5u^2$. Then,

$$ \min_u (u+0.5 f(x))^2 + 0.25 f(x)^2 + V_1(x_2) - V_1(x_1) -0.5 x_2^2 + p(x) = 0 $$
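For reference, the completed square can be spelled out step by step. This is only a sketch of the algebra, using $x^{+} = [x_2, f(x)+u]^T$ and the choices of $V$ and $r$ stated above:

$$
\begin{aligned}
r(x,u) + V(x^{+}) - V(x) &= p(x) + 0.5u^2 + V_1(x_2) + 0.5\,(f(x)+u)^2 - V_1(x_1) - 0.5x_2^2 \\
&= u^2 + f(x)\,u + 0.5 f(x)^2 + V_1(x_2) - V_1(x_1) - 0.5x_2^2 + p(x) \\
&= (u + 0.5 f(x))^2 + 0.25 f(x)^2 + V_1(x_2) - V_1(x_1) - 0.5x_2^2 + p(x)
\end{aligned}
$$

The squared term is the only part that depends on $u$, so the minimizer is $u^{*} = -0.5 f(x)$ and the squared term vanishes at the minimum.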

To cancel the terms involving $x_2$, I set $V_1(x) = 0.5x^2$; with the minimizer $u = -0.5 f(x)$, this implies

$$ 0.25f(x)^2 + p(x) = 0.5 x_1^2 $$
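As a sanity check, the reduction can be tested numerically. The sketch below is plain Python and assumes the setup above ($V(x) = 0.5x_1^2 + 0.5x_2^2$, $r(x,u) = p(x) + 0.5u^2$, and $p$ chosen so that $0.25 f(x)^2 + p(x) = 0.5 x_1^2$). The helper names `q_value` and `bellman_residual` are hypothetical, and `f_val` stands for the value of $f(x)$ at a sampled state, drawn at random because the identity should hold for any $f$.

```python
import random

def q_value(x2, u, f_val, p_val):
    """Q(x, u) = r(x, u) + V(x^+), with V(x) = 0.5*x1**2 + 0.5*x2**2."""
    stage_cost = p_val + 0.5 * u**2
    # Next state is x^+ = [x2, f(x) + u]
    next_value = 0.5 * x2**2 + 0.5 * (f_val + u) ** 2
    return stage_cost + next_value

def bellman_residual(x1, x2, f_val):
    """min_u Q(x, u) - V(x), with p(x) chosen as 0.5*x1**2 - 0.25*f(x)**2."""
    p_val = 0.5 * x1**2 - 0.25 * f_val**2
    u_star = -0.5 * f_val  # minimizer of the completed square
    return q_value(x2, u_star, f_val, p_val) - (0.5 * x1**2 + 0.5 * x2**2)

random.seed(0)
max_abs_residual = 0.0
u_star_is_minimizer = True
for _ in range(1000):
    x1, x2, f_val = (random.uniform(-5.0, 5.0) for _ in range(3))
    max_abs_residual = max(max_abs_residual, abs(bellman_residual(x1, x2, f_val)))
    # Perturbing u away from u_star should never lower Q
    p_val = 0.5 * x1**2 - 0.25 * f_val**2
    q_star = q_value(x2, -0.5 * f_val, f_val, p_val)
    for du in (-1.0, -0.1, 0.1, 1.0):
        if q_value(x2, -0.5 * f_val + du, f_val, p_val) < q_star:
            u_star_is_minimizer = False

print(max_abs_residual, u_star_is_minimizer)
```

A residual at the level of floating-point round-off, together with `u_star_is_minimizer` staying `True`, is consistent with the Bellman equation holding under this choice of $p$; it is a spot check on random samples, not a proof.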

I'm not sure, but positive (semi)definiteness of $p$ and $V_1$ seems desirable for stability (as is the case in most continuous-time (CT) problems). To make $Q$ non-convex in the state, I asked: "is there any example of $p$ and $f$ such that $p(x) = 0.5 x^2 - 0.25 f(x)^2$ (simplifying to $f(x) = f(x_1)$ and $p(x) = p(x_1)$, with an abuse of notation), $p$ is positive definite and non-convex, and $f(0) = 0$?"
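One way to test a candidate pair is a quick numerical check. The sketch below uses $f(x) = x \sin x$, which is a hypothetical candidate of my own choosing and not something taken from a reference. Since $\sin^2 x \le 1$, it gives $0.25x^2 \le p(x) \le 0.5x^2$, so $p$ should be positive definite, and the second derivative of $p$ at $x = \pi$ works out to $1 - 0.5\pi^2 < 0$, which would make $p$ non-convex. The grid range, step size, and helper name `second_difference` are likewise assumptions made for illustration.

```python
import math

def f(x):
    # Hypothetical candidate (an assumption, not from the post): f(0) = 0
    return x * math.sin(x)

def p(x):
    # p is fixed by the constraint p(x) = 0.5 x^2 - 0.25 f(x)^2
    return 0.5 * x**2 - 0.25 * f(x) ** 2

def second_difference(g, x, h=1e-3):
    """Central finite-difference estimate of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / h**2

# Check p(0) = 0 and p(x) > 0 on a grid over [-10, 10] (a spot check, not a proof)
grid = [i * 0.01 for i in range(-1000, 1001)]
is_positive_definite_on_grid = p(0.0) == 0.0 and all(
    p(x) > 0.0 for x in grid if x != 0.0
)

# Negative curvature at a single point is enough to rule out convexity
curvature_at_pi = second_difference(p, math.pi)

print(is_positive_definite_on_grid, curvature_at_pi)
```

The grid check only samples finitely many points; the bound $p(x) \ge 0.25x^2$ is what actually supports positive definiteness for this candidate.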

There may be some mistakes due to my lack of familiarity with DT optimal control; please feel free to leave comments if something's wrong.

[1] T. Bian and Z.-P. Jiang, “Value Iteration, Adaptive Dynamic Programming, and Optimal Control of Nonlinear Systems,” in 2016 IEEE 55th Conference on Decision and Control (CDC), Las Vegas, NV, USA, Dec. 2016, pp. 3375–3380. doi: 10.1109/CDC.2016.7798777.