
What is the difference between discounted-cost, total-expected-cost, and average-expected-cost MDPs? Are they just MDP problems with different objective functions? When the discount factor equals 1, does the discounted-cost MDP become the total-cost MDP? Can anyone provide a more detailed explanation?
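
For concreteness, the three criteria are commonly written as below. The notation is an assumption made here, not something fixed by the question: $c$ is the one-step cost, $\beta$ the discount factor, $\pi$ a policy, $x$ the initial state, and $(x_t, a_t)$ the state-action pair at time $t$.

```latex
% Discounted expected cost, with 0 < \beta < 1
V_\beta^\pi(x) = \mathbb{E}_x^\pi\left[ \sum_{t=0}^{\infty} \beta^t \, c(x_t, a_t) \right]

% Total expected cost (formally \beta = 1; the sum may diverge)
V^\pi(x) = \mathbb{E}_x^\pi\left[ \sum_{t=0}^{\infty} c(x_t, a_t) \right]

% Long-run average expected cost
J^\pi(x) = \limsup_{N \to \infty} \frac{1}{N} \, \mathbb{E}_x^\pi\left[ \sum_{t=0}^{N-1} c(x_t, a_t) \right]
```

Written this way, all three share the same dynamics and differ only in how the stream of costs is aggregated.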

Another question: most of the existing theory provides conditions for the existence of an optimal stationary policy for MDPs with finite state and action spaces, or with a countable state space. How can one prove the existence of an optimal stationary policy for a discrete-time MDP with an uncountable state space?

Cubic
  • Your question is very broad. Do you have a specific problem in mind? I'm afraid that answering this would amount to rewriting a few pages from a textbook. – Math1000 Jun 13 '18 at 05:13
  • How should the discount factor in a discounted-cost MDP be interpreted? Some answers say it is there for mathematical convenience, or that it weighs the importance of future rewards against current rewards. When the discount factor becomes 1, the discounted cost becomes the total expected cost. As for the average expected cost, it equals (1 - discount factor) * (discounted cost) in the limit as the discount factor approaches 1. So I think they are not just problems with different objective functions. Is there any relationship between the optimal policies of these three MDP problems? – Cubic Jun 13 '18 at 06:02
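
The limiting relation mentioned in the last comment can be checked numerically on a toy example. The sketch below assumes a fixed stationary policy, so the MDP reduces to a two-state Markov chain; the transition matrix `P`, the costs `c`, and the helper names `discounted_cost` and `average_cost` are made up for illustration and do not come from the question.

```python
def discounted_cost(P, c, beta):
    """Solve (I - beta*P) v = c for a 2-state chain using Cramer's rule."""
    a11 = 1.0 - beta * P[0][0]
    a12 = -beta * P[0][1]
    a21 = -beta * P[1][0]
    a22 = 1.0 - beta * P[1][1]
    det = a11 * a22 - a12 * a21  # tends to 0 as beta -> 1
    v0 = (c[0] * a22 - a12 * c[1]) / det
    v1 = (a11 * c[1] - a21 * c[0]) / det
    return [v0, v1]


def average_cost(P, c):
    """Long-run average cost: stationary distribution dotted with the costs."""
    a, b = P[0][1], P[1][0]  # probabilities of leaving state 0 and state 1
    pi = [b / (a + b), a / (a + b)]
    return pi[0] * c[0] + pi[1] * c[1]


# Hypothetical chain induced by some fixed policy (illustrative numbers only).
P = [[0.9, 0.1], [0.5, 0.5]]
c = [1.0, 4.0]

g = average_cost(P, c)
for beta in (0.9, 0.99, 0.999):
    v = discounted_cost(P, c, beta)
    scaled = [(1.0 - beta) * x for x in v]
    print(f"beta={beta}: (1-beta)*V_beta = {scaled}, average cost = {g}")
```

In this example the discounted cost itself grows without bound as the discount factor approaches 1, because the average cost is nonzero, so the total expected cost is infinite from every state; only the rescaled quantity (1 - beta) * V_beta settles down, and it does so to the same value from both initial states.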

0 Answers