Show that the lower bound is non-decreasing in n

Reference no: EM13909515

(The Odoni bound) Let $k^*$ be the optimal stationary policy for a Markov decision problem, and let $g^*$ and $\pi^*$ be the corresponding gain and steady-state probability vector, respectively. Let $v_i^*(n, u)$ be the optimal dynamic expected reward for starting in state $i$ at stage $n$ with final reward vector $u$.

(a) Show that $\min_i [v_i^*(n, u) - v_i^*(n-1, u)] \le g^* \le \max_i [v_i^*(n, u) - v_i^*(n-1, u)]$ for $n \ge 1$. Hint: Consider premultiplying $v^*(n, u) - v^*(n-1, u)$ by $\pi^*$ or by $\pi'$, where $k'$ is the optimal dynamic policy at stage $n$ and $\pi'$ is its steady-state probability vector.
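
One way to read the hint is sketched below. This is only an outline under stated assumptions, not the textbook's worked solution: it assumes every stationary policy has a steady-state vector (for example, each policy is a unichain), and it introduces notation that is not in the problem statement, namely $r^{k}$ and $P^{k}$ for the reward vector and transition matrix under policy $k$, $k'$ for the optimal dynamic policy at stage $n$, and $\pi'$ and $g^{k'}$ for the steady-state vector and gain of $k'$.

```latex
\begin{align*}
v^*(n,u) &= r^{k'} + P^{k'} v^*(n-1,u)
  && \text{($k'$ attains the maximum at stage $n$)} \\
\pi' \left[ v^*(n,u) - v^*(n-1,u) \right]
  &= \pi' r^{k'} + \left( \pi' P^{k'} - \pi' \right) v^*(n-1,u)
   = \pi' r^{k'} = g^{k'} \le g^* \\[4pt]
v^*(n,u) &\ge r^{k^*} + P^{k^*} v^*(n-1,u)
  && \text{($k^*$ is one feasible choice at stage $n$)} \\
\pi^* \left[ v^*(n,u) - v^*(n-1,u) \right]
  &\ge \pi^* r^{k^*} + \left( \pi^* P^{k^*} - \pi^* \right) v^*(n-1,u)
   = \pi^* r^{k^*} = g^*
\end{align*}
```

Because $\pi'$ and $\pi^*$ are probability vectors, each left-hand side is a weighted average of the components of $v^*(n,u) - v^*(n-1,u)$. The smallest component is therefore at most $g^{k'} \le g^*$, and the largest component is at least $g^*$.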

(b) Show that the lower bound is non-decreasing in n and the upper bound is non-increasing in n.
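
Both parts can be checked numerically before attempting a proof. The sketch below is illustrative only: the two-state Markov decision problem, its rewards and transition rows, and the helper names `bellman_step`, `optimal_gain`, and `odoni_bounds` are all made up for this example and do not come from the textbook. Every transition row is strictly positive, so each stationary policy is ergodic and its steady-state vector has the simple two-state closed form used in the code.

```python
from itertools import product

# Hypothetical 2-state MDP (numbers invented for illustration only).
# ACTIONS[i] lists the (reward, transition row) pairs available in state i.
# Every row is strictly positive, so every stationary policy is ergodic.
ACTIONS = [
    [(1.0, [0.7, 0.3]), (2.0, [0.2, 0.8])],
    [(0.0, [0.4, 0.6]), (3.0, [0.9, 0.1])],
]

def bellman_step(v):
    """Compute v*(n, u) from v*(n-1, u) by maximizing over the decisions in each state."""
    return [
        max(r + sum(p * vj for p, vj in zip(row, v)) for r, row in acts)
        for acts in ACTIONS
    ]

def optimal_gain():
    """Return g* by enumerating every stationary policy of the 2-state problem."""
    best = float("-inf")
    for (r0, row0), (r1, row1) in product(*ACTIONS):
        p01, p10 = row0[1], row1[0]
        # Steady-state vector of a 2-state chain: pi = (p10, p01) / (p01 + p10).
        pi0, pi1 = p10 / (p01 + p10), p01 / (p01 + p10)
        best = max(best, pi0 * r0 + pi1 * r1)
    return best

def odoni_bounds(u, stages):
    """Return [(lower_n, upper_n)] for n = 1..stages, with final reward vector u."""
    bounds, prev = [], list(u)
    for _ in range(stages):
        cur = bellman_step(prev)
        diffs = [c - p for c, p in zip(cur, prev)]  # v*(n, u) - v*(n-1, u)
        bounds.append((min(diffs), max(diffs)))
        prev = cur
    return bounds

g_star = optimal_gain()
bounds = odoni_bounds(u=[0.0, 0.0], stages=15)
for n, (lo, hi) in enumerate(bounds, start=1):
    print(f"n={n:2d}  lower={lo:.6f}  g*={g_star:.6f}  upper={hi:.6f}")
```

In the printed table the lower column should never decrease, the upper column should never increase, and g* should sit between them at every stage. A numerical check like this does not replace the proof, but it is a quick way to catch a sign or indexing mistake in the argument.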

Textbook: Stochastic Processes: Theory for Applications, by Robert G. Gallager.
