Optimization problem solved via dynamic programming

Question

Consider a situation where decisions are made in stages. The outcome of each decision is not fully predictable but can be anticipated to some extent before the next decision is made. The objective is to minimize a certain cost - a mathematical expression of what is considered an undesirable outcome.

A key aspect of such situations is that decisions cannot be viewed in isolation since one must balance the desire for low present cost with the undesirability of high future costs. The dynamic programming technique captures this trade-off. At each stage, decisions are ranked based on the sum of the present cost and the expected future cost, assuming optimal decision making for subsequent stages.

This is a quote from Dynamic Programming and Optimal Control by Bertsekas.

Can someone explain the meaning of the last paragraph? What is the trade-off here, and how does dynamic programming solve it?

@snarski: This is a quote from Dynamic programming and optimal control by Bertsekas. — aaaaaa, Jan 05 '14 at 06:09

score 0 · Answer 1 · answered Jan 05 '14 at 06:58

The trade-off is between reducing present cost and reducing future cost. Dynamic optimization problems become interesting only when actions that reduce present costs increase future costs, and vice versa. For example, a firm that provides more training to its workers increases current costs but reduces future costs, since a better-trained worker is more efficient. So the firm needs to find the amount of training that gives the best 'compromise' between current and future costs.

Dynamic programming approaches this problem by using a 'value function' that captures the sum total of future costs that would follow from putting the system into a particular state and following the best possible course of action thereafter. This is the minimum cost you have to incur after going into a particular state (say, a particular level of training for the worker). Now life is simple. We use the per-period cost function that is part of the problem statement to find the present cost of an action, and the value function to find the future cost resulting from the state the action puts us in, and then we minimize the sum of the two.
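To make the one-step choice concrete, here is a minimal Python sketch of the training example. Everything in it is made up for illustration: the skill levels, the cost formula, and in particular the numbers in VALUE, which simply stand in for a value function we pretend to know already.

```python
# Hypothetical training example; none of these numbers come from Bertsekas.
ACTIONS = [0, 1, 2]  # units of training bought this period


def present_cost(skill, train):
    # Per-period cost: training outlay plus the inefficiency of a low-skill worker.
    return 3 * train + 2 * max(0, 4 - skill)


def next_state(skill, train):
    # Skill rises with training and is capped at level 4.
    return min(skill + train, 4)


# Assumed value function: minimum total future cost from each skill level.
VALUE = {0: 20, 1: 14, 2: 9, 3: 5, 4: 2}


def myopic_action(skill):
    # Looks at present cost only.
    return min(ACTIONS, key=lambda a: present_cost(skill, a))


def best_action(skill):
    # Ranks actions by present cost plus the value of the state they lead to.
    return min(
        ACTIONS,
        key=lambda a: present_cost(skill, a) + VALUE[next_state(skill, a)],
    )
```

With these made-up numbers, a worker at skill 1 gets no training under the myopic rule, while the rule that adds in VALUE buys two units: the extra present cost is more than repaid by the lower future cost.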

But we seem to have pulled a rabbit out of a hat, for how do we get the value function to begin with? The value function contains the answer to the problem of minimizing total future costs, which is precisely the problem we wanted to solve by using the value function.

To know the answer, read ahead in Bertsekas.
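As a small preview, for a deterministic problem with a finite number of stages one standard way out is backward induction: start from the last stage, where nothing comes afterwards, and work towards the first. The sketch below assumes zero cost after the final stage and uses a made-up training cost; the function names and numbers are hypothetical and not taken from the book.

```python
def backward_induction(horizon, states, actions, cost, step):
    """Return (value at the first stage, list of decision rules per stage)."""
    value = {s: 0.0 for s in states}  # assumption: no cost after the final stage
    policy = []
    for _ in range(horizon):  # iterate from the last stage back to the first
        new_value, rule = {}, {}
        for s in states:
            # Present cost plus the already-computed cost of the next state.
            best = min(actions, key=lambda a: cost(s, a) + value[step(s, a)])
            rule[s] = best
            new_value[s] = cost(s, best) + value[step(s, best)]
        value = new_value
        policy.insert(0, rule)  # earlier stages go to the front of the list
    return value, policy


# Hypothetical training problem: skill levels 0..4, buy 0, 1 or 2 units per stage.
def cost(skill, train):
    return 3 * train + 2 * (4 - skill)


def step(skill, train):
    return min(skill + train, 4)


value, policy = backward_induction(3, range(5), [0, 1, 2], cost, step)
```

With three stages and a worker starting at skill 0, the first-stage rule buys two units of training and the minimum total cost is 22, whereas the last-stage rule buys none, since there is no future left to benefit from it.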
