
  
5.4 Hierarchical controls of dynamic jobshops

We consider the jobshop given in Section 3.3. The dynamics of the system are given by
\begin{displaymath}
\left\{\begin{array}{ll}
\dot x^{\varepsilon}_i(t)=-a_ix^{\varepsilon}_i(t)+\displaystyle\sum_{\ell=0}^{i-1}u^{\varepsilon}_{\ell i}(t)-\sum_{\ell=i+1}^{N_b}u^{\varepsilon}_{i\ell}(t), & 1\leq i \leq m,\\[2mm]
\dot x^{\varepsilon}_i(t)=-a_ix^{\varepsilon}_i(t)+\Big(\displaystyle\sum_{\ell=0}^{m}u^{\varepsilon}_{\ell i}(t)-z_i\Big), & m+1\leq i \leq N_b.
\end{array}\right.
\end{displaymath}
We write this in the vector form as
$\displaystyle \dot{\mbox{\boldmath$x$}}^{\varepsilon}(t)=(\dot x_1^{\varepsilon}(t),\ldots,\dot x_{N_b}^{\varepsilon}(t))'=-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}^{\varepsilon}(t)+D({\mbox{\boldmath$u$}}^{\varepsilon}_0(t),\ldots,{\mbox{\boldmath$u$}}^{\varepsilon}_{m+1}(t))',$     (5.37)
where ${\mbox{\boldmath$a$ }}=(a_1,...,a_{N_b})'$, and
\begin{displaymath}{\mbox{\boldmath$u$}}^{\varepsilon}_{\ell}(t)=(u^{\varepsilon}_{\ell,\ell+1}(t),\ldots,u^{\varepsilon}_{\ell,N_b}(t))', \ \ \ell=0,\ldots,m,\end{displaymath}
and
\begin{displaymath}{\mbox{\boldmath$u$}}^{\varepsilon}_{m+1}(t)=(z_{m+1},\ldots,z_{N_b})'.\end{displaymath}
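To fix ideas, here is a minimal hypothetical instance of (5.37), not taken from the text: a shop with one internal buffer ($m=1$) and one finished-goods buffer ($N_b=2$), whose only arcs are $(0,1)$ and $(1,2)$, so that $u^{\varepsilon}_{01}$ feeds buffer 1, $u^{\varepsilon}_{12}$ feeds buffer 2, and $z_2$ is the constant demand rate; the matrix $D$ shown is the one induced by this assumed arc set and control ordering.
\begin{displaymath}
\left(\begin{array}{c}\dot x^{\varepsilon}_1(t)\\ \dot x^{\varepsilon}_2(t)\end{array}\right)
=-\left(\begin{array}{cc}a_1&0\\ 0&a_2\end{array}\right)
\left(\begin{array}{c}x^{\varepsilon}_1(t)\\ x^{\varepsilon}_2(t)\end{array}\right)
+\underbrace{\left(\begin{array}{ccc}1&-1&0\\ 0&1&-1\end{array}\right)}_{D}
\left(\begin{array}{c}u^{\varepsilon}_{01}(t)\\ u^{\varepsilon}_{12}(t)\\ z_2\end{array}\right),
\end{displaymath}
which unpacks to $\dot x^{\varepsilon}_1=-a_1x^{\varepsilon}_1+u^{\varepsilon}_{01}-u^{\varepsilon}_{12}$ and $\dot x^{\varepsilon}_2=-a_2x^{\varepsilon}_2+u^{\varepsilon}_{12}-z_2$, in agreement with the componentwise dynamics above.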

Our problem is to find an admissible control ${\mbox{\boldmath$u$}}(\varepsilon,\cdot)$ that minimizes the cost function

$\displaystyle J^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}},{\mbox{\boldmath$u$}}(\varepsilon,\cdot))=\limsup_{T\rightarrow\infty}\frac{1}{T}\,E\int_0^T[h({\mbox{\boldmath$x$}}^{\varepsilon}(t))+c({\mbox{\boldmath$u$}}^{\varepsilon}(t))]\,dt,$     (5.38)
where $h(\cdot)$ defines the cost of inventory/shortage, $c(\cdot)$ is the production cost, ${\mbox{\boldmath$x$}}$ is the initial state, and ${\mbox{\boldmath$m$}}$ is the initial value of ${\mbox{\boldmath$m$}}(\varepsilon, t)$.

In place of Assumptions 3.4, 3.5, and 3.6 of Section 3.3 on the cost functions $h(\cdot)$ and $c(\cdot)$ and on the machine capacity process ${\mbox{\boldmath$m$}}(\varepsilon, t)$, we impose the following assumption on the random process ${\mbox{\boldmath$m$}}(\varepsilon, t)=(m_1(\varepsilon, t),\ldots,m_{n_c}(\varepsilon,t))$ throughout this section.

Assumption 5.5   Let $p_{n}=\sum_{j=1}^{p}m_{n}^{j}\,p_{\mbox{\boldmath$m$}^{j}}$ and $n(i,j)=\mbox{arg}\{n:(i,j)\in K_n\}$ for $(i,j)\in \hat A$; that is, $p_n$ is the average capacity of machine $n$, and $n(i,j)$ is the index of the machine located on the arc $(i,j)$. Furthermore, we assume that there exist $\{w_{ij}>0: (i,j)\in K_n\}$ $(n=1,\ldots,n_c)$ such that

   
$\displaystyle \sum_{(i,j)\in K_n}w_{ij}\leq 1,$
(5.39)
   
$\displaystyle \sum_{\ell=0}^m w_{\ell i}p_{n(\ell,i)}>d_i, \ i=m+1,...,N_b,$
(5.40)
and
$\displaystyle \sum_{\ell=0}^{i-1}w_{\ell i}p_{n(\ell,i)}>\sum_{\ell=i+1}^{N_b}w_{i \ell}p_{n(i,\ell)}, \ i=1,...,m.$
    (5.41)
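To see what Assumption 5.5 asks for, consider a hypothetical numerical check (the numbers are illustrative, not from the text), using the two-buffer shop sketched after (5.37): a single machine ($n_c=1$) serves both arcs, $K_1=\{(0,1),(1,2)\}$, so $n(0,1)=n(1,2)=1$; its average capacity is $p_1=10$, and the demand rate for the finished-goods buffer is $d_2=3$ (written $z_2$ in the dynamics above). Taking $w_{01}=0.5$ and $w_{12}=0.4$ gives
\begin{displaymath}
w_{01}+w_{12}=0.9\leq 1,\qquad
w_{12}\,p_{n(1,2)}=0.4\times 10=4>3=d_2,\qquad
w_{01}\,p_{n(0,1)}=5>4=w_{12}\,p_{n(1,2)},
\end{displaymath}
so (5.39)-(5.41) all hold. Roughly, the weights $w_{ij}$ allocate a fraction of each machine's average capacity to each arc so that every finished-goods buffer can be fed faster than its demand and every internal buffer can be replenished faster than it is drained.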
We use ${\cal A}^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$ to denote the set of all admissible controls with respect to ${\mbox{\boldmath$x$}}\in {\cal S}$ and ${\mbox{\boldmath$m$}}(\varepsilon,0)={\mbox{\boldmath$m$}}$. Let $\lambda^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$ denote the minimal expected cost, i.e.,
$\displaystyle \lambda^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})=\inf_{{\mbox{\boldmath$u$}}(\varepsilon,\cdot)\in {\cal A}^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})} J^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}},{\mbox{\boldmath$u$}}(\varepsilon,\cdot)).$     (5.42)

In the case of the long-run average cost criterion used here, we know, by Theorem 2.4 in Presman, Sethi, and Zhang (1999b), that under Assumption 5.5, $\lambda^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$ is independent of the initial condition $({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$. Thus we will use $\lambda^{\varepsilon}$ instead of $\lambda^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$. We use ${\cal P}^{\varepsilon}$ to denote our control problem, i.e.,

$\displaystyle {\cal P}^{\varepsilon}:\left\{\begin{array}{ll}\mbox{minimize} & J^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}},{\mbox{\boldmath$u$}}(\varepsilon,\cdot))=\displaystyle\limsup_{T\rightarrow\infty}\frac{1}{T}\,E\int_0^T[h({\mbox{\boldmath$x$}}^{\varepsilon}(t))+c({\mbox{\boldmath$u$}}^{\varepsilon}(t))]\,dt,\\[2mm] \mbox{subject to} & \dot{\mbox{\boldmath$x$}}^{\varepsilon}(t)=-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}^{\varepsilon}(t)+D({\mbox{\boldmath$u$}}^{\varepsilon}_0(t),\ldots,{\mbox{\boldmath$u$}}^{\varepsilon}_{m+1}(t))',\ {\mbox{\boldmath$x$}}^{\varepsilon}(0)={\mbox{\boldmath$x$}},\ {\mbox{\boldmath$u$}}(\varepsilon,\cdot)\in{\cal A}^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}),\\[2mm] \mbox{minimum cost} & \lambda^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})=\displaystyle\inf_{{\mbox{\boldmath$u$}}(\varepsilon,\cdot)\in{\cal A}^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})} J^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}},{\mbox{\boldmath$u$}}(\varepsilon,\cdot)).\end{array}\right.$     (5.43)

As in Section 5.1, the positive attrition rate ${\mbox{\boldmath$a$}}$ implies a uniform bound for ${\mbox{\boldmath$x$}}^{\varepsilon}(t)$.

Next we examine elementary properties of the potential function and obtain the limiting control problem as $\varepsilon\rightarrow0$.

The Hamilton-Jacobi-Bellman equation, in the directional derivative sense, associated with the average-cost optimal control problem ${\cal P}^{\varepsilon}$, as shown in Sethi, Zhang, and Zhang (1999b), takes the form

$\displaystyle \lambda^{\varepsilon}=\inf_{{\mbox{\boldmath$u$}}\in U({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^j)}\left\{\frac{\partial w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^j)}{\partial(-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+D{\mbox{\boldmath$u$}})}+c({\mbox{\boldmath$u$}})\right\}+h({\mbox{\boldmath$x$}})+\left(Q^{(1)}+\frac{1}{\varepsilon}Q^{(2)}\right)w^{\varepsilon}({\mbox{\boldmath$x$}},\cdot)({\mbox{\boldmath$m$}}^j),$     (5.44)
where $w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^j)$ is the potential function of the problem ${\cal P}^{\varepsilon}$, $\frac{\partial w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^j)}{\partial(-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+D{\mbox{\boldmath$u$}})}$ denotes the directional derivative of $w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^j)$ along the direction $-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+D{\mbox{\boldmath$u$}}$, and $Qf(\cdot)({\mbox{\boldmath$m$}}^j):=\sum_{i \neq j}q_{ji}(f({\mbox{\boldmath$m$}}^i)-f({\mbox{\boldmath$m$}}^j))$ for any function $f(\cdot)$ on ${\cal M}$. Moreover, following Presman, Sethi, and Zhang (1999b), we can show that there exists a potential function $w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$ such that the pair $(\lambda^{\varepsilon},w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}))$ is a solution of (5.44), where $\lambda^{\varepsilon}$ is the minimum average expected cost of ${\cal P}^{\varepsilon}$.
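To unpack the generator term in (5.44), consider a hypothetical two-state illustration ($p=2$, not from the text): let ${\cal M}=\{{\mbox{\boldmath$m$}}^1,{\mbox{\boldmath$m$}}^2\}$ and suppose the fast part of the generator is $Q^{(2)}=\left(\begin{array}{cc}-\mu&\mu\\ \nu&-\nu\end{array}\right)$ with rates $\mu,\nu>0$. Then, at the state ${\mbox{\boldmath$m$}}^1$,
\begin{displaymath}
Q^{(2)}w^{\varepsilon}({\mbox{\boldmath$x$}},\cdot)({\mbox{\boldmath$m$}}^1)
=\mu\left(w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^2)-w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}}^1)\right),
\end{displaymath}
and the factor $1/\varepsilon$ multiplying $Q^{(2)}$ in (5.44) makes these capacity transitions fast relative to the flow dynamics; this time-scale separation is what drives the averaging behind the limiting problem introduced below.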

First, we establish the boundedness of $\lambda^{\varepsilon}$.

Theorem 5.9   The minimum average expected cost $\lambda^{\varepsilon}$ of ${\cal P}^{\varepsilon}$ is bounded in $\varepsilon$, i.e., there exists a constant $M_1>0$ such that

\begin{displaymath}0\leq \lambda ^{\varepsilon } \leq M_1 \ \ \mbox{for all} \ \varepsilon >0.\end{displaymath}

For the proof, see Sethi, Zhang, and Zhang (1999b).
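A heuristic way to see why such a bound is plausible (a sketch only, under the standing nonnegativity and continuity assumptions on $h$ and $c$; it is not necessarily the argument of the cited proof) is to evaluate the cost of the zero-production control ${\mbox{\boldmath$u$}}^{\varepsilon}(\cdot)\equiv 0$, which is admissible provided the state constraint only requires the internal buffers to stay nonnegative. Under it the trajectory is deterministic and converges:
\begin{displaymath}
\dot x^{\varepsilon}_i(t)=-a_ix^{\varepsilon}_i(t)\ \Rightarrow\ x^{\varepsilon}_i(t)\rightarrow 0\ \ (1\leq i\leq m),
\qquad
\dot x^{\varepsilon}_i(t)=-a_ix^{\varepsilon}_i(t)-z_i\ \Rightarrow\ x^{\varepsilon}_i(t)\rightarrow -z_i/a_i\ \ (m+1\leq i\leq N_b).
\end{displaymath}
The state therefore stays in a bounded set independent of $\varepsilon$, so the long-run average of $h({\mbox{\boldmath$x$}}^{\varepsilon}(t))+c(0)$ is bounded by a constant $M_1$ that does not depend on $\varepsilon$, while $0\leq\lambda^{\varepsilon}$ follows from the nonnegativity of $h$ and $c$.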

Now we derive the limiting control problem as $\varepsilon\rightarrow0$. As in Sethi and Zhou (1994), we give the following definition.

Definition 5.6   For ${\mbox{\boldmath$x$}}\in {\cal S}$, let ${\cal A}^0({\mbox{\boldmath$x$}})$ denote the set of measurable controls of the form

\begin{displaymath}
U(\cdot)=({\mbox{\boldmath$u$}}^1(\cdot),\ldots,{\mbox{\boldmath$u$}}^p(\cdot))
=\bigl(({\mbox{\boldmath$u$}}^1_0(\cdot),\ldots,{\mbox{\boldmath$u$}}^1_m(\cdot)),\ldots,({\mbox{\boldmath$u$}}^p_0(\cdot),\ldots,{\mbox{\boldmath$u$}}^p_m(\cdot))\bigr),
\end{displaymath}
with
\begin{displaymath}{\mbox{\boldmath$u$}}^j_k(\cdot)=(u^j_{k,k+1}(\cdot),\ldots,u^j_{k,N}(\cdot)),\ \ k=0,\ldots,m,\ j=1,\ldots,p,\end{displaymath}

such that $0\leq \sum_{(i,j)\in K_n} u^j_{ij}(t) \leq m^j_n$ for all $t\geq 0$, $j=1,\ldots,p$, and $n=1,\ldots,n_c$, and the corresponding solution ${\mbox{\boldmath$x$}}(\cdot)$ of the following system

\begin{displaymath}
\left\{\begin{array}{ll}
\dot x_k(t)=-a_kx_k(t)+\displaystyle\sum_{j=1}^p\gamma_j\Big(\sum_{\ell=0}^{k-1}u^j_{\ell k}(t)-\sum_{\ell=k+1}^{N}u^j_{k\ell}(t)\Big), & k=1,\ldots,m,\\[2mm]
\dot x_k(t)=-a_kx_k(t)+\displaystyle\Big(\sum_{j=1}^p\gamma_j\sum_{\ell=0}^{m}u^j_{\ell k}(t)-d_k\Big), & k=m+1,\ldots,N,
\end{array}\right.
\end{displaymath}
with $(x_1(0),\ldots,x_N(0))=(x_1,\ldots,x_N)$ satisfies ${\mbox{\boldmath$x$}}(t)\in {\cal S}$ for all $t\geq 0$. The objective of this problem is to choose a control $U(\cdot)\in {\cal A}^0({\mbox{\boldmath$x$}})$ that minimizes
\begin{displaymath}
J(U(\cdot))=\limsup_{T\rightarrow\infty}\frac{1}{T}\int_0^T\Big[h({\mbox{\boldmath$x$}}(s))+\sum_{j=1}^p\gamma_j c({\mbox{\boldmath$u$}}^j(s))\Big]ds.
\end{displaymath}

We use ${\cal P}^0$ to denote this problem and regard it as our limiting problem. Formally, the limiting control problem ${\cal P}^0$ reads:

\begin{displaymath}
{\cal P}^0:\left\{\begin{array}{ll}
\mbox{minimize} & J(U(\cdot))=\displaystyle\limsup_{T\rightarrow\infty}\frac{1}{T}\int_0^T\Big[h({\mbox{\boldmath$x$}}(s))+\sum_{j=1}^p\gamma_j c({\mbox{\boldmath$u$}}^j(s))\Big]ds,\\[2mm]
\mbox{subject to} & \mbox{the dynamics and state constraint in Definition 5.6, i.e., } U(\cdot)\in{\cal A}^0({\mbox{\boldmath$x$}}),\\[2mm]
\mbox{minimum cost} & \lambda=\displaystyle\inf_{U(\cdot)\in {\cal A}^0({\mbox{\boldmath$x$}})} J(U(\cdot)).
\end{array}\right.
\end{displaymath}
The average cost optimality equation associated with the limiting control problem ${\cal P}^0$ is
$\displaystyle \lambda=\inf_{U\in {\cal A}^0}\left\{\frac{\partial w({\mbox{\boldmath$x$}})}{\partial(-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+DU^0)}+\sum_{j=1}^p \gamma_j c({\mbox{\boldmath$u$}}^j)\right\}+h({\mbox{\boldmath$x$}}),$     (5.45)

where $w({\mbox{\boldmath$x$}})$ is a potential function for ${\cal P}^0$ and $\frac{\partial w({\mbox{\boldmath$x$}})}{\partial(-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+DU^0)}$ is the directional derivative of $w({\mbox{\boldmath$x$}})$ along the direction $-\mbox{diag}({\mbox{\boldmath$a$}}){\mbox{\boldmath$x$}}+DU^0$ with $U^0=\sum_{j=1}^p\gamma_j{\mbox{\boldmath$u$}}^j$. From Presman, Sethi, and Zhang (1999b), we know that there exist $\lambda$ and $w({\mbox{\boldmath$x$}})$ such that (5.45) holds. Moreover, $w({\mbox{\boldmath$x$}})$ is the limit of $w^{\varepsilon}({\mbox{\boldmath$x$}},{\mbox{\boldmath$m$}})$ as $\varepsilon\rightarrow0$.
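Continuing the hypothetical two-state illustration used after (5.44), and assuming, as is standard in this hierarchical setup, that $\gamma_j$ denotes the stationary probability of the capacity state ${\mbox{\boldmath$m$}}^j$ under the fast generator $Q^{(2)}$, the weights entering $U^0$ and (5.45) are
\begin{displaymath}
(\gamma_1,\gamma_2)=\left(\frac{\nu}{\mu+\nu},\ \frac{\mu}{\mu+\nu}\right),
\qquad
U^0=\gamma_1{\mbox{\boldmath$u$}}^1+\gamma_2{\mbox{\boldmath$u$}}^2.
\end{displaymath}
In other words, the limiting problem replaces the rapidly fluctuating capacity by its long-run average: the controls ${\mbox{\boldmath$u$}}^j$ chosen for each capacity state enter the limiting dynamics and cost only through these $\gamma_j$-weighted averages.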

Now we are ready to discuss the convergence of the minimum average expected cost $\lambda^{\varepsilon}$ as $\varepsilon$ goes to zero, and establish the corresponding convergence rate. First we give two lemmas which are used in proving the convergence.

Lemma 5.7   For $\delta\in [0,\frac{1}{2})$ and any sufficiently small $\varepsilon >0$, there exist $\hat C_{54}>0$, ${\mbox{\boldmath$x$}}=(x_1,\ldots,x_N) \in {\cal S}$, and

\begin{displaymath}
U(\varepsilon,\cdot)=({\mbox{\boldmath$u$}}^1(\varepsilon,\cdot),\ldots,{\mbox{\boldmath$u$}}^p(\varepsilon,\cdot))\in {\cal A}^0({\mbox{\boldmath$x$}}),
\end{displaymath}
such that for each $n=1,\ldots,n_c$ and $j=1,\ldots,p$,
$\displaystyle \sum_{(k,\ell)\in K_n}u^j_{k\ell}(\varepsilon,\cdot) \leq m_n^j-\varepsilon^{\delta} \ \ \mbox{when} \ m_n^j \neq 0,$
    (5.46)

and

$\displaystyle \lambda +\hat C_{54}\varepsilon^{\delta}>\limsup_{T\rightarrow\infty}\frac{1}{T}\int_0^T\left[h({\mbox{\boldmath$x$}}(\varepsilon,t))+\sum_{j=1}^p\gamma_j c({\mbox{\boldmath$u$}}^j(\varepsilon,t))\right]dt,$
    (5.47)

where ${\mbox{\boldmath$x$ }}(\varepsilon, t)$ is the trajectory under $U(\varepsilon, t).$

Lemma 5.8   For $\delta\in [0,\frac{1}{2})$, there exist $\hat C_{55}>0$, ${\mbox{\boldmath$x$}}=(x_1,\ldots,x_N) \in {\cal S}$, and

\begin{displaymath}
U(\varepsilon,\cdot)=({\mbox{\boldmath$u$}}^1(\varepsilon,\cdot),\ldots,{\mbox{\boldmath$u$}}^p(\varepsilon,\cdot))\in {\cal A}^0({\mbox{\boldmath$x$}}),
\end{displaymath}
such that
$\displaystyle \min_{1 \leq k \leq m} \inf_{0 \leq t <\infty} x_k(\varepsilon,t)\geq \hat C_{55} \varepsilon^{\delta},$
    (5.48)

and

$\displaystyle \lambda + \hat C_{55}\varepsilon^{\delta}>\limsup_{T\rightarrow\infty}\frac{1}{T}\int_0^T\left[h({\mbox{\boldmath$x$}}(\varepsilon,t))+\sum_{i=1}^p\gamma_i c({\mbox{\boldmath$u$}}^i(\varepsilon,t))\right]dt,$
    (5.49)

where ${\mbox{\boldmath$x$}}(\varepsilon, t)=(x_1(\varepsilon, t),\ldots,x_N(\varepsilon, t))$ is the state trajectory under the control $U(\varepsilon, t)$.

For the proofs of Lemmas 5.7 and 5.8, see Sethi, Zhang, and Zhang (1999b).
 

Using these two lemmas, Sethi, Zhang, and Zhang (1999b) derive the following theorem.

Theorem 5.10   For any $\delta\in [0,\frac{1}{2})$, there exists a constant $\hat C_{56}>0$ such that for all sufficiently small $\varepsilon >0$,

$\displaystyle \vert\lambda^{\varepsilon}-\lambda\vert\leq \hat C_{56} \varepsilon^{\delta}.$
    (5.50)

This implies in particular that $\lim_{\varepsilon \rightarrow 0}\lambda^{\varepsilon}=\lambda.$

Remark 5.4   The theorem says that the problem ${\cal P}^0$ is indeed a limiting problem in the sense that the minimum average cost $\lambda^{\varepsilon}$ of ${\cal P}^{\varepsilon}$ converges to the minimum average cost $\lambda$ of ${\cal P}^0$. Moreover, it gives the corresponding convergence rate.

