Dynamic programming (4) Example
Consider a simple consumption-saving model, where action (a) is defined by the amount of savings each period, state (s) defined by the current stock, reward be the utility which depends on consumption (c=s-a). Suppose that state is updated where the output is drawn from a uniform distribution on {0, . . . , B}. Let the global upper bound of storage be M. State space State space is \(n = M+B+1\) dimension....