valueIte {MDP}    R Documentation

Perform value iteration on the MDP.

Description

Perform value iteration on the MDP.

Usage

valueIte(mdp, iW, iDur, rate=0.1, rateBase=365, times=10, eps=1e-05,
    termValues)

Arguments

mdp The MDP loaded using loadMDP.
iW Index of the weight we optimize.
iDur Index of duration/time such that discount rates can be calculated.
rate Interest rate.
rateBase The time horizon over which the rate is valid.
times The maximum number of times value iteration is performed.
eps Stopping criterion. If max(w(t)-w(t+1)) < eps, the algorithm stops, i.e., the policy becomes epsilon-optimal (see [1], p. 161).
termValues The terminal values used (values of the last states in the MDP).
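
As an illustration of how rate and rateBase might combine with a duration, the sketch below computes a discount factor under the assumption of continuous compounding. The helper discountFactor is hypothetical and not part of the MDP package, and the formula is an assumption that this page does not confirm.

```r
# Hypothetical helper, not part of the MDP package.
# Assumption: continuous compounding, so a transition of the given duration
# is discounted by exp(-rate * duration / rateBase).
discountFactor <- function(duration, rate = 0.1, rateBase = 365) {
  exp(-rate * duration / rateBase)
}

discountFactor(365)  # one full rateBase period
discountFactor(1)    # a single time unit
```

With the defaults shown in Usage, a rate of 0.1 over a base of 365 would correspond to a 10% rate per 365 time units under this assumption.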

Details

If the MDP has a finite time horizon, the arguments times and eps are ignored.
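
For the infinite-horizon case, the role of times and eps can be sketched in plain R as follows. This is a minimal illustration of discounted value iteration on made-up data, not the package's implementation: the function valueIteSketch, the transition matrices P, the reward vectors r, and the fixed discount factor are all assumptions, and the stopping test uses the absolute difference between successive value vectors.

```r
# Plain-R sketch of discounted value iteration; it does not use the MDP package.
# P[[a]] is the transition matrix and r[[a]] the reward vector for action a
# (hypothetical example data, see below).
valueIteSketch <- function(P, r, discount, times = 10, eps = 1e-05) {
  w <- rep(0, nrow(P[[1]]))  # start from zero values
  for (n in seq_len(times)) {
    # One column per action: immediate reward plus discounted expected value
    q <- do.call(cbind, lapply(seq_along(P), function(a) {
      r[[a]] + discount * as.vector(P[[a]] %*% w)
    }))
    wNew <- apply(q, 1, max)
    done <- max(abs(wNew - w)) < eps  # stopping criterion controlled by eps
    w <- wNew
    if (done) break
  }
  list(w = w, policy = apply(q, 1, which.max), iterations = n)
}

# Two states, two actions (made-up numbers)
P <- list(matrix(c(0.9, 0.1, 0.5, 0.5), 2, byrow = TRUE),
          matrix(c(0.2, 0.8, 0.1, 0.9), 2, byrow = TRUE))
r <- list(c(1, 0), c(0, 2))

res <- valueIteSketch(P, r, discount = exp(-0.1), times = 1000)
res$policy
res$iterations
```

In this sketch the loop ends either when the change in values falls below eps or after times iterations, whichever comes first, so a small times can stop the iteration before the eps criterion is met.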

Value

NULL (invisible)

Author(s)

Lars Relund lars@relund.dk

References

[1] Puterman, M. Markov Decision Processes. Wiley-Interscience, 1994.


[Package MDP version 1.0 Index]