Next: 7. Extensions and Concluding Remarks Up: 6. Risk-Sensitive Hierarchical Controls Previous: 6.1 Risk-sensitive hierarchical controls with

  
6.2 Risk-sensitive hierarchical controls with long-run average costs

In this section we consider a manufacturing system with the objective of minimizing the risk-sensitive average cost criterion over an infinite horizon. The risk-sensitive approach has been applied to the so-called disturbance attenuation problem; see, for example, Whittle (1990), Fleming and McEneaney (1995), and references therein.

Let us consider a single-product, parallel-machine manufacturing system with stochastic production capacity and a constant demand rate for its product over time. For $t\geq 0$, let $x^{\varepsilon}(t)$, $u^{\varepsilon}(t)$, and z denote the surplus level (the state variable), the production rate (the control variable), and the constant demand rate, respectively. We assume $x^{\varepsilon}(t)\in R=(-\infty,\infty)$, $u^{\varepsilon}(t)\in R^+=[0,\infty)$, $t\geq 0$, and z a positive constant. They satisfy the differential equation

$\displaystyle \dot x^{\varepsilon}(t)=-a x^{\varepsilon}(t)+u^{\varepsilon}(t)-z,\; x^{\varepsilon}(0)=x,$
    (6.1)
where a>0 is a constant representing the deterioration rate (or spoilage rate) of the finished product.
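The surplus dynamics (6.1) can be simulated directly; the sketch below uses an Euler discretization, with a finite-state Markov chain standing in for the capacity process $m(\varepsilon,t)$. The policy, generator, and all parameter values are illustrative assumptions, not taken from the text.

```python
import numpy as np

def simulate_surplus(a, z, u_policy, Q, cap_states, x0, k0, T, dt, rng):
    """Euler scheme for dx/dt = -a*x + u - z, with production capped by a
    Markov-modulated capacity (a stand-in for m(eps, t); illustrative only)."""
    n = int(round(T / dt))
    x, k = x0, k0
    xs = np.empty(n + 1)
    xs[0] = x
    for step in range(n):
        u = min(u_policy(x), cap_states[k])   # enforce 0 <= u <= capacity
        x += (-a * x + u - z) * dt            # Euler step for (6.1)
        if rng.random() < -Q[k, k] * dt:      # chain leaves state k at rate -q_kk
            probs = Q[k].copy()
            probs[k] = 0.0
            probs /= probs.sum()              # jump destination proportional to q_kj
            k = int(rng.choice(len(cap_states), p=probs))
        xs[step + 1] = x
    return xs
```

The time step `dt` must be small relative to the jump rates so that `-Q[k, k] * dt` stays below one.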

We let $m(\varepsilon,t)$ represent the maximum production capacity of the system at time t, where $m(\varepsilon,t)$ is as given in Section 3.1. The production constraint is given by the inequalities

\begin{eqnarray*}0\leq u^{\varepsilon}(t)\leq m(\varepsilon,t), \quad t\geq 0.\end{eqnarray*}

Definition 6.1   A production control process $u^{\varepsilon}(\cdot)=\{u^{\varepsilon}(t),\, t\geq 0\}$ is admissible if

(i)
$u^{\varepsilon}(t)$ is $\sigma\{m(\varepsilon,s),\, 0\leq s\leq t\}$-progressively measurable;
(ii)
$0\leq u^{\varepsilon}(t)\leq m(\varepsilon,t)$ for all $t\geq 0$.
Let ${\cal A}^{\varepsilon}(k)$ denote the class of admissible controls with $m(\varepsilon,0)=k$. Let H(x,u) denote a cost function of surplus and production. The objective of the problem is to choose $u^{\varepsilon}(\cdot)\in{\cal A}^{\varepsilon}(k)$ to minimize
$\displaystyle J^\varepsilon(u^{\varepsilon}(\cdot))=\limsup_{T\to\infty}\frac{\varepsilon}{T}\log E\exp\left(\frac{1}{\varepsilon}\int_0^T H(x^{\varepsilon}(t),u^{\varepsilon}(t))\,dt\right),$
    (6.2)
where $x^{\varepsilon}(\cdot)$ is the surplus process corresponding to the production process $u^{\varepsilon}(\cdot)$. Let $\lambda^\varepsilon=\inf_{u^{\varepsilon}(\cdot)\in{\cal A}^{\varepsilon}(k)}J^\varepsilon(u^{\varepsilon}(\cdot))$. A motivation for choosing such an exponential cost criterion is that it is sensitive to large values of the exponent, which occur with small probability, for example, rare sequences of unusually many machine failures resulting in shortages ($x^{\varepsilon}(t)<0$). We assume that the cost function H(x,u) and the production capacity process $m(\varepsilon, \cdot)$ satisfy the following:
Assumption 6.1   $H(x,u)\geq 0$ is continuous, bounded, and uniformly Lipschitz in x.
Remark 6.1   In manufacturing systems, the running cost function H(x,u) is usually chosen to be of the form H(x,u)=h(x)+c(u) with piecewise linear h(x) and c(u). Note that piecewise linear functions are not bounded as required in (A.6.1). However, this restriction is inessential, in view of the uniform bounds on $u^{\varepsilon}(t)$ and on $x^{\varepsilon}(t)$ for any initial state $x=x^{\varepsilon}(0)$ in a bounded set.
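The exponential criterion in (6.2) can be estimated by Monte Carlo once sampled path integrals $\int_0^T H\,dt$ are available; a log-sum-exp shift keeps the exponentials from overflowing when $\varepsilon$ is small. The function names below are assumptions for illustration.

```python
import numpy as np

def risk_sensitive_cost(path_integrals, eps, T):
    """Monte Carlo estimate of (eps/T) log E exp((1/eps) int_0^T H dt).

    path_integrals holds sampled values of int_0^T H(x(t), u(t)) dt; how
    the paths are produced is up to the caller.  The max-shift is a
    standard log-sum-exp trick to avoid overflow for small eps."""
    v = np.asarray(path_integrals, dtype=float) / eps
    m = v.max()
    return eps * (m + np.log(np.mean(np.exp(v - m)))) / T

def average_cost(path_integrals, T):
    """Risk-neutral counterpart, for comparison; by Jensen's inequality
    it never exceeds the risk-sensitive value."""
    return float(np.mean(path_integrals)) / T
```

The estimate always lies between the risk-neutral average and the worst sampled path cost, and approaches the former as $\varepsilon\to\infty$, which makes the risk-sensitivity of (6.2) easy to see numerically.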
Assumption 6.2   Q is irreducible in the following sense: The equations
\begin{eqnarray*}\nu Q=0 \quad\mbox{and}\quad \sum_{i=0}^m \nu_i =1 \end{eqnarray*}
have a unique solution $\nu=(\nu_0, \nu_1, \ldots, \nu_m)$ with $\nu_k>0$, $k=0,1,\ldots,m$. The vector $\nu$ is called the equilibrium distribution of the Markov chain $m(\varepsilon, \cdot)$.

Formally, we can write the associated HJB equation as follows:
$\displaystyle \frac{\lambda^{\varepsilon}}{\varepsilon}=\inf_{0\leq u\leq k}\left\{(-ax+u-z)\frac{w^{\varepsilon}_x(x,k)}{\varepsilon}+\frac{H(x,u)}{\varepsilon}+\exp\left(-\frac{w^{\varepsilon}(x,k)}{\varepsilon}\right)\frac{Q}{\varepsilon}\exp\left(\frac{w^{\varepsilon}(x,\cdot)}{\varepsilon}\right)(k)\right\},$ (6.3)
where $w^\varepsilon(x,k)$ is the potential function,$w^\varepsilon_x(x,k)$ denotes the partial derivative of$w^\varepsilon(x,k)$ with respect to x, and$Qf(\cdot)(i):=\sum_{j\neq i}q_{ij}(f(j)-f(i))$ for a function f on ${\cal M}$. Zhang (1995) proves the following theorem.

Theorem 6.5   (i) The HJB equation (6.3) has a viscosity solution $(\lambda^\varepsilon,w^\varepsilon(x,k))$.
(ii) The pair $(\lambda^\varepsilon,w^\varepsilon(x,k))$ satisfies the following conditions: For some constant $C_1$ independent of $\varepsilon >0$,

(a)
$0\leq \lambda^\varepsilon\leq C_1$, and
(b)
$\vert w^\varepsilon(x,k)-w^\varepsilon(\tilde x,k)\vert\leq C_1\vert x-\tilde x\vert$.
(iii) Assume that $w^\varepsilon(x,k)$ is Lipschitz continuous in x. Then,
\begin{eqnarray*}\lambda^{\varepsilon}=\inf_{u(\cdot)\in\Upsilon}J^\varepsilon(u(\cdot)).\end{eqnarray*}
This theorem implies that $\lambda^{\varepsilon}$ in the viscosity solution$(\lambda^\varepsilon,w^\varepsilon(x,k))$ is unique. Furthermore, Zhang (1995) gives the following verification theorem.

Theorem 6.6   Let $(\lambda^\varepsilon,w^\varepsilon(x,k))$ be a viscosity solution to the HJB equation in (6.3). Assume that $w^\varepsilon(x,k)$ is Lipschitz continuous in x. Let $\psi^\varepsilon(x,k)=\exp(w^\varepsilon(x,k)/\varepsilon)$. Suppose that there are $u^*(\cdot)$, $x^*(\cdot)$, and $r^*(t)$ such that

\begin{eqnarray*}\dot x^*(t)=-ax^*(t)+u^*(t)-z,\; x^*(0)=x,\end{eqnarray*}
$r^*(t)\in D^+\psi^\varepsilon_x(x^*(t),k^\varepsilon(t))$ satisfying
    $\displaystyle \frac{\lambda^\varepsilon}{\varepsilon}\psi^\varepsilon(x^*(t),k^\varepsilon(t))=(-ax^*(t)+u^*(t)-z)r^*(t)+\frac{H(x^*(t),u^*(t))}{\varepsilon}\psi^\varepsilon(x^*(t),k^\varepsilon(t))+\frac{Q}{\varepsilon}\psi^\varepsilon(x^*(t),\cdot)(k^\varepsilon(t)),$ (6.4)
a.e. in t and w.p. 1. Then, $\lambda^\varepsilon=J^\varepsilon(u^*(\cdot))$.
 

We next discuss the asymptotic property of the HJB equation (6.3) as $\varepsilon\to0$. First, note that this HJB equation is similar to that for an ordinary long-run average cost problem, except for the term involving the exponential factor. To get rid of this term, we make use of the logarithmic transformation as in Fleming and Soner (1992).

Let ${\cal V}=\{v=(v(0),\ldots,v(m))\in R^{m+1}:\; v(i)>0,\; i=0,1,\ldots,m\}$. Define

\begin{eqnarray*}Q^v=(q^v_{ij}) \mbox{ such that } q^v_{ij}=q_{ij}\frac{v(j)}{v(i)}\mbox{ for } i\neq j\mbox{ and } q^v_{ii}=-\sum_{j\neq i}q^v_{ij}.\end{eqnarray*}
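The twisted generator $Q^v$ just defined is easy to form numerically; a minimal sketch, with the function name an assumption for illustration:

```python
import numpy as np

def twisted_generator(Q, v):
    """Form Q^v: off-diagonal entries q_ij * v(j)/v(i), diagonal chosen so
    that each row sums to zero (so Q^v is again a generator)."""
    v = np.asarray(v, dtype=float)
    Qv = Q * (v[None, :] / v[:, None])   # elementwise q_ij * v(j)/v(i)
    np.fill_diagonal(Qv, 0.0)
    np.fill_diagonal(Qv, -Qv.sum(axis=1))
    return Qv
```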
With the logarithmic transformation, we have for each $i\in {\cal M}$,
\begin{eqnarray*}\exp\Big(-\frac{w^{\varepsilon}(x,i)}{\varepsilon}\Big) Q\exp\Big(\frac{w^{\varepsilon}(x,\cdot)}{\varepsilon}\Big)(i)=\sup_{v\in{\cal V}}\left\{\frac{Q^v}{\varepsilon}w^{\varepsilon}(x,\cdot)(i)-Q^v(\log v(\cdot))(i) +\frac{Qv(\cdot)(i)}{v(i)}\right\}.\end{eqnarray*}
The supremum is attained at $v(i)=\exp(-w^\varepsilon(x,i)/\varepsilon)$.

The logarithmic transformation suggests that the HJB equation is equivalent to an Isaacs equation for a two-player zero-sum dynamic stochastic game. The Isaacs equation is given as follows:

$\displaystyle \lambda^{\varepsilon}=\inf_{0\leq u\leq k}\sup_{v\in{\cal V}}\left\{(-ax+u-z)w^{\varepsilon}_x(x,k)+\widetilde L(x,u,v,k)+\frac{Q^v}{\varepsilon}w^\varepsilon(x,\cdot)(k)\right\},$
    (6.5)
where
\begin{eqnarray*}\widetilde L(x,u,v,i)=H(x,u)-Q^v(\log v(\cdot))(i)+\frac{Qv(\cdot)(i)}{v(i)},\end{eqnarray*}
for $i\in {\cal M}$; see Evans and Souganidis (1984) and Fleming and Souganidis (1989).

We consider the limit of the problem as $\varepsilon\to0$. In order to define a limiting problem, we first define the control sets for the limiting problem. Let

\begin{eqnarray*}\Gamma_u=\{U=(u^0,\ldots,u^m);\; 0\leq u^i\leq i,\; i=0,\ldots,m\}\end{eqnarray*}
and
\begin{eqnarray*}\Gamma_v=\{V=(v^0,\ldots,v^m);\;v^i=(v^i(0),\ldots,v^i(m))\in{\cal V},\; i=0,\ldots,m\}.\end{eqnarray*}
For each $V\in\Gamma_v$, let ${\bar Q}^V:=(q^V_{ij})$ be such that
\begin{eqnarray*}q^{v^i}_{ij}= q^V_{ij}=q_{ij}\frac{v^i(j)}{v^i(i)}\mbox{ for } i\neq j\mbox{ and } q^V_{ii}=-\sum_{j\neq i}q^V_{ij},\end{eqnarray*}
and let $\nu^V=(\nu^V_0,\ldots,\nu^V_m)$ denote the equilibrium distribution of ${\bar Q}^V$. It can be shown that for each $V\in\Gamma_v$, ${\bar Q}^V$ is irreducible. Therefore, there exists a unique positive $\nu^V$ for each $V\in\Gamma_v$; moreover, $\nu^V$ depends continuously on V.
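Given V, the equilibrium distribution $\nu^V$ of ${\bar Q}^V$ solves the same kind of linear system as in Assumption 6.2. A minimal numerical sketch, with illustrative names (the single-row twist below stands in for the general $q^{v^i}_{ij}$ construction):

```python
import numpy as np

def twisted_equilibrium(Q, v):
    """Equilibrium distribution of a twisted generator: build
    q^v_ij = q_ij v(j)/v(i) for i != j, then solve nu Q^v = 0 together
    with the normalization sum(nu) = 1 by least squares."""
    v = np.asarray(v, dtype=float)
    Qv = Q * (v[None, :] / v[:, None])
    np.fill_diagonal(Qv, 0.0)
    np.fill_diagonal(Qv, -Qv.sum(axis=1))
    m = Q.shape[0]
    A = np.vstack([Qv.T, np.ones(m)])   # stack nu Q^v = 0 and sum(nu) = 1
    b = np.zeros(m + 1)
    b[-1] = 1.0
    nu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return nu
```

Since the least-squares solution varies continuously with the entries of the twisted generator, this also illustrates the continuous dependence of $\nu^V$ on V.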
Theorem 6.7   Let $\varepsilon_n\to 0$ be a sequence such that $\lambda^{\varepsilon_n}\to\lambda^0$ and $w^{\varepsilon_n}(x,k)\to w^0(x,k)$. Then,
(i)
$w^0(x,k)$ is independent of k, i.e., $w^0(x,k)=w^0(x)$;
(ii)
$w^0(x)$ is Lipschitz; and
(iii)
$(\lambda^0,w^0(x))$ is a viscosity solution to the following Isaacs equation:
$\displaystyle \lambda^0=\inf_{U\in\Gamma_u}\sup_{V\in\Gamma_v}\Big\{\big(-ax+\sum_{i=0}^m\nu^V_iu^i-z\big)w^0_x(x)+\sum_{i=0}^m\nu^V_iH(x,u^i)+\sum_{i=0}^m\nu^V_i\frac{Qv^i(\cdot)(i)}{v^i(i)}-\sum_{i=0}^m\nu^V_i \bar Q^V(\log v^i(\cdot))(i)\Big\}.$ (6.6)
Let
\begin{eqnarray*}\widehat L(x,U,V)=\sum_{i=0}^m\nu^V_iH(x,u^i)+\sum_{i=0}^m\nu^V_i\frac{Qv^i(\cdot)(i)}{v^i(i)} -\sum_{i=0}^m\nu^V_i\bar Q^V(\log v^i(\cdot))(i). \end{eqnarray*}
Note that $\widehat L(x,U,V)\leq \vert\vert H\vert\vert$, where $\vert\vert\cdot\vert\vert$ is the sup norm. Moreover, since $H\geq0$, $\widehat L(x,U,1)\geq0$, where V=1 means $v^i(j)=1$ for all i,j. Then the equation in (6.6) is an Isaacs equation associated with a two-player, zero-sum dynamic game with the objective
\begin{eqnarray*}J^0(U(\cdot),V(\cdot))=\limsup_{T\to\infty}\frac{1}{T}\int_0^T\widehat L(x(t),U(t),V(t))dt \end{eqnarray*}
subject to
\begin{eqnarray*}\dot x(t)=-ax(t)+\sum_{i=0}^m\nu_i^{V(t)}u^i(t)-z,\; x(0)=x,\end{eqnarray*}
where $U(\cdot)$ and $V(\cdot)$ are Borel measurable functions and$U(t)\in\Gamma_u$ and $V(t)\in\Gamma_v$ for $t\geq 0$; see Barron and Jensen (1989).

One can show that

\begin{eqnarray*}\lambda^0=\inf_{U(\cdot)}\sup_{V(\cdot)} J^0(U(\cdot),V(\cdot)),\end{eqnarray*}
which implies the uniqueness of $\lambda^0$.

Finally, in order to use the solution to the limiting problem to obtain a control for the original problem, a numerical scheme must be used to obtain an approximate solution. The advantage of the limiting problem is its dimensionality, which is much smaller than that of the original problem if the number of states in ${\cal M}$ is large.

Let (U*(x),V*(x)) denote a solution to the limiting problem. As in Section 3.1, it is expected that the constructed control

\begin{eqnarray*}u(x,k)=\sum_{j=0}^m I_{\{k=j\}}u^{j*}(x)\end{eqnarray*}
is nearly optimal for the original problem (6.2). See Zhang (1995) for the proof of this result. 
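The pasting of limiting-problem feedbacks indexed by the observed capacity state can be sketched as follows; the hedging-point form of the feedbacks is purely an illustrative assumption, not derived from the limiting problem here.

```python
def hedging_feedback(cap, z, threshold=0.0):
    """Hypothetical limiting-problem feedback for capacity state `cap`:
    produce at full capacity below a hedging point, just meet demand
    above it (illustrative only)."""
    return lambda x: cap if x < threshold else min(z, cap)

def lifted_control(u_star, k):
    """Constructed control u(x, k) = sum_j 1{k = j} u^{j*}(x): select the
    feedback matching the observed capacity state k."""
    return lambda x: u_star[k](x)
```

Each feedback respects the capacity bound of its own state, so the lifted control is admissible whenever the observed state k is fed in correctly.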