Set up

  • An agent decides whether to delay her employment later (keep searching) or take the job (settle).
  • An agent’s action is binary (keep searching or take the job).
  • If she takes the job, she will get constant wage indefinitely.
  • If she rejects the job, she receives unemployment compensation c and reconsider her choice next period.
  • The probability of observing wage offer is uniformly distributed.
  • There are n states of wages with equal probability.
  • The worker’s problem is to maximize her expected discounted total sum of earnings.

The Value Function

$$ V(w) = \max_{a} [\frac{w}{1-\beta}, c + \beta \sum_{i=1}^n V(w_i)p_i] $$

  • Wage is a state variable.
  • An agent compares taking the job and getting \(w\) for the indefinite amount of time, versus taking the job next period and receives unemployment compensation \(c\).
  • \(a \in \{0, 1\}\).
  • Each day an agent observes wage, which is stochastically determined by some known distribution. With some starting point of \(v\), value function can be computed for each point \(w\). With each point \(w\), there is exactly one vector \(v \in \mathbb{R}^n \) that solves the problem.

How the algorithm works

Step 1: Arbitrarily select the starting point of \(v \in \mathbb{R}^n \).

Step 2: Compute \(v^\prime \in \mathbb{R}^n \) where $$ v^\prime_i = \max \{ \frac{w_i}{1-\beta}, c + \beta \sum_{i=1}^n v_i p_i\} \text{ for all } i. $$

Step 3: Compute the distance between \(v^\prime_i\) and \(v_i\), such as \(\max_i|v_i-v_i^\prime|\).

Step 4: If the distance between \(v^\prime \) and \(v\) is below some fixed tolerance, then stop the iteration and return \(v\), or else repeat step 2 and step 3.

Note that the reason that we stop when the distance between \(v^\prime \) and \(v\) is below some fixed tolerance is that neither action may add significant additional value to the expected sum of future returns and therefore another iteration is not required.

Define $$ T(v):=\max \{ \frac{w_i}{1-\beta}, c + \beta \sum_{i=1}^n v_i p_i\} \text{ for all } i. $$

According to the fixed-point theorem, we can find the fixed point through iterative process.

The Optimal Policy

The optimal action is thought of as a policy, which maps states to actions. We impose a rule that makes our agent select between say two actions, either that be an agent chooses an action that makes her reward function maximized or her disutility minimized. In this specific example, the policy function that dictates states to specific action represented as binary (1 accept and 0 reject) is represented as follows: $$ \sigma(w) := \mathbb{1} \{\frac{w}{1-\beta} \geq c + \beta \sum_{i=1}^{n}V(w_i)p_i\}, $$ which implies an agent accepts when its reward is greater than the reward of decline.

The Fixed Point Theory

According to the Banash fixed point theorem, there exists a unique vector that maps it to itself.

The Banach contraction mapping theorem states that it converges to the fixed point regardless of the different selection of initial values of \(v\).