nextupprevious
Next:2.2 Optimal control of dynamicUp:2 Optimal Control under thePrevious:2.1 Optimal Control under the

  
2.1 Optimal control of single or parallel machine systems

Akella and Kumar (1986) deal with a single machine (with two states: up and down), single product problem. They obtained an explicit solution for the threshold inventory level, in terms of which the optimal policy is as follows: Whenever the machine is up, produce at the maximum possible rate if the inventory level is less than the threshold, produce on demand if the inventory level is exactly equal to the threshold, and not produce at all if the inventory level exceeds the threshold. Recently, Feng and Yan (1999) extend the Akella-Kumar model to the case where the exogenous demand forms a homogeneous Poisson flow. They prove the optimality of the threshhold control type for this extended Akella-Kumar model. When their problem is generalized to convex costs and more than two machine states, it is no longer possible to obtain explicit solutions. Using the viscosity solution technique, Sethi, Soner, Zhang, and Jiang (1992) investigate this general problem. They study the elementary properties of the value function. They show that the value function is a convex function and that it is strictly convex provided the inventory cost is strictly convex. Moreover, it is shown to be a viscosity solution to the Hamilton-Jacobi-Bellman (HJB) equation and to have upper and lower bounds each with polynomial growth. Following the idea of Thompson and Sethi (1980), they define what are known as the turnpike sets in terms of the value function. They prove that the turnpike sets are attractors for the optimal trajectories and provide sufficient conditions under which the optimal trajectories enter the convex closure in finite time. Also, they give conditions to ensure that the turnpike sets are non-empty.

To more precisely state their results, we need to specify the model of a single/parallel machine manufacturing system. Following the notation given by Sethi, Soner, Zhang, and Jiang (1992), let ${\mbox{\boldmath$x$ }}(t)$,${\mbox{\boldmath$u$ }}(t)$,${\mbox{\boldmath$z$ }}$, and m(t) denote, respectively, the inventory level, the production rate, the demand rate, and the machine capacity level at time $t\in [0,\infty)$. We assume that${\mbox{\boldmath$x$ }}(t)\in R^n$,${\mbox{\boldmath$u$ }}(t)\in R^n_+$,${\mbox{\boldmath$z$ }}$ is a constant positive vector in Rn+. Furthermore, we assume that $m(\cdot)$ is a Markov process with a finite space ${\cal M}=\{0,1,...,p\}$.

We can now write the dynamics of the system as

$\displaystyle \dot {\mbox{\boldmath$x$ }}(t)={\mbox{\boldmath$u$ }}(t)-{\mbox{\boldmath$z$ }}, \ \ {\mbox{\boldmath$x$ }}(0)={\mbox{\boldmath$x$ }}^0.$
    (2.1)
Definition 2.1   A control process (production rate)${\mbox{\boldmath$u$ }}(\cdot)=\{{\mbox{\boldmath$u$ }}(t): t \geq 0\}$ is called admissible with respect to the initial capacity m if
(i)
${\mbox{\boldmath$u$ }}(\cdot)$ is adapted to the filtration$\{{\calF}_t: t \geq 0\}$ with ${\cal F}_t=\sigma\{m(s): 0 \leq s \leqt\}$, the $\sigma$-field generated by m(t)
(ii)
$0 \leq{\mbox{\boldmath$r$ }}\cdot {\mbox{\boldmath$u$ }}(t)\leq m(t)$ for all $t\geq 0$ for some positive vector ${\mbox{\boldmath$r$ }}$.
Let ${\cal A}(m)$ denote the set of all admissible control processes with the initial condition m(0)=m.
Definition 2.2   A real-valued function ${\mbox{\boldmath$u$ }}({\mbox{\boldmath$x$ }}, m)$ on $R^n\times {\cal M}$ is called an admissible feedback control, or simply feedback control, if
(i)
for any given initial ${\mbox{\boldmath$x$ }}$, the equation
\begin{displaymath}\dot {\mbox{\boldmath$x$ }}(t)={\mbox{\boldmath$u$ }}({\mbo......z$ }}, \ \{\mbox{\boldmath$x$ }}(0)={\mbox{\boldmath$x$ }},\end{displaymath}
has a unique solution;
(ii)
${\mbox{\boldmath$u$ }}(\cdot)=\{{\mbox{\boldmath$u$ }}(t)={\mbox{\boldmath$u$ }}( {\mbox{\boldmath$x$ }}(t),m(t)): t\geq 0\}\in {\cal A}(m)$.
Let $h({\mbox{\boldmath$x$ }})$ and $c({\mbox{\boldmath$u$ }})$ denote the surplus cost and the production cost function, respectively. For every ${\mbox{\boldmath$u$ }}(\cdot)\in {\cal A}(m)$,${\mbox{\boldmath$x$ }}(0)={\mbox{\boldmath$x$ }}$, and m(0)=m, let the cost function of the system be defined as follows:
$\displaystyle J({\mbox{\boldmath$x$ }},m, {\mbox{\boldmath$u$ }}(\cdot))=E\int^......ty}_0e^{-\rho t}[h({\mbox{\boldmath$x$ }}(t))+c({\mbox{\boldmath$u$ }}(t))]dt,$
    (2.2)

where $\rho>0$ is the given discount rate. The problem is to choose an admissible control ${\mbox{\boldmath$u$ }}(\cdot)$ that minimizes the cost function$ J({\mbox{\boldmath$x$ }},m, {\mbox{\boldmath$u$ }}(\cdot))$. We define the value function as

$\displaystyle v({\mbox{\boldmath$x$ }},m)=\inf_{{\mbox{\boldmath$u$ }}(\cdot)\in{\cal A}(m)}J({\mbox{\boldmath$x$ }},m, {\mbox{\boldmath$u$ }}(\cdot)).$
    (2.3)

We make the following assumptions on the cost functions$h({\mbox{\boldmath$x$ }})$ and $c({\mbox{\boldmath$u$ }})$.

Assumption 2.1$h({\mbox{\boldmath$x$ }})$ is a nonnegative, convex function with h(0)=0. There are positive constants C21, C22, C23, and $\kappa_{21}\geq 0$,$\kappa_{22}\geq 0$ such that

\begin{displaymath}C_{21}\vert{\mbox{\boldmath$x$ }}\vert^{\kappa_{21}}-C_{22}......}})\leqC_{23}(1+\vert{\mbox{\boldmath$x$ }}^{\kappa_{22}}).\end{displaymath}

Assumption 2.2$c({\mbox{\boldmath$u$ }})$ is a nonnegative function, c(0)=0, and$c({\mbox{\boldmath$u$ }})$ is twice differentiable. Moreover, $c({\mbox{\boldmath$u$ }})$ is either strictly convex or linear.

Assumption 2.3$m(\cdot)$ is a finite state Markov chain with generator Q, where Q=(qij), $i,j\in{\cal M}$ is a(p+1) x (p+1) matrix such that $q_{ij}\geq 0$ for $i\neq j$ and $q_{ii}=-\sum_{i\ne j} q_{ij}$. That is, for any function$f(\cdot)$ on ${\cal M}$,

\begin{displaymath}Qf(\cdot)(m)=\sum_{\ell \neq m}q_{m\ell}[f(\ell)-f(m)].\end{displaymath}
With these three assumptions we can state the following theorem concerned with the property of the value function $v(\cdot,\cdot)$, and proved in Fleming, Sethi, and Soner (1987).

Theorem 2.1 (i)  For each m, $v(\cdot,m)$ is convex on Rn, and $v(\cdot,m)$ is strictly convex if $h(\cdot)$ is so. (ii) There exist positive constants C24, C25, and C26 such that for each m

\begin{displaymath}C_{24}\vert{\mbox{\boldmath$x$ }}\vert^{\kappa_{21}}-C_{25}......eqC_{26}(1+\vert{\mbox{\boldmath$x$ }}\vert^{\kappa_{22}}),\end{displaymath}

where $\kappa_{21}$ and $\kappa_{22}$ are the power indices in Assumption 2.1.

We next consider the equation associated with the problem. To do this, let

\begin{displaymath}F(m,{\mbox{\boldmath$w$ }})=\inf\{({\mbox{\boldmath$u$ }}-{...... {\mbox{\boldmath$u$ }}\cdot{\mbox{\boldmath$r$ }}\leq m\},\end{displaymath}
where ${\mbox{\boldmath$r$ }}$ is given in Definition 2.1, and ``$\cdot$" between ${\mbox{\boldmath$u$ }}-{\mbox{\boldmath$z$ }}$ and ${\mbox{\boldmath$w$ }}$ represents the inner product of the vectors ${\mbox{\boldmath$u$ }}-{\mbox{\boldmath$z$ }}$ and${\mbox{\boldmath$w$ }}$. Then, the HJB equation is written formally as follows:
$\displaystyle \rho v({\mbox{\boldmath$x$ }},m)=F(m, v'_{\mbox{\boldmath$x$ }}({......dmath$x$ }},m))+h({\mbox{\boldmath$x$ }})+Qv({\mbox{\boldmath$x$ }},\cdot)(m),$
    (2.4)
for ${\mbox{\boldmath$x$ }}\in R^n$,$m\in {\cal M}$, where $v'_{\mbox{\boldmath$x$ }} ({\mbox{\boldmath$x$ }},m)$ is the partial derivative of $v(\cdot,\cdot)$ with respect to${\mbox{\boldmath$x$ }}$.

In general, the value function v may not be differentiable. In order to make sense of the HJB equation (2.4), we consider its viscosity solution. In order to give the definition of the viscosity solution, we first introduce the superdifferential and subdifferential of a given function $f({\mbox{\boldmath$x$ }})$ on Rn.

Definition 2.3   The superdifferential $D^+f({\mbox{\boldmath$x$ }})$ and the subdifferential$D^-f({\mbox{\boldmath$x$ }})$ of any function $f({\mbox{\boldmath$x$ }})$ on Rn are defined, respectively, as follows:

\begin{eqnarray*}D^+f({\mbox{\boldmath$x$ }})=\left\{{\mbox{\boldmath$n$ }}\in......boldmath$n$ }}}{\vert{\mbox{\boldmath$r$ }}\vert}\leq0\right\},\end{eqnarray*}
\begin{eqnarray*}D^-f({\mbox{\boldmath$x$ }})=\left\{{\mbox{\boldmath$n$ }}\in......boldmath$n$ }}}{\vert{\mbox{\boldmath$r$ }}\vert}\geq0\right\}.\end{eqnarray*}
Based on Definition 2.3, we give the definition of a viscosity solution.

Definition 2.4   We say that$v({\mbox{\boldmath$x$ }},m)$ is a viscosity solution of equation (2.4) if the following holds:

(i)
$v({\mbox{\boldmath$x$ }},m)$ is continuous in ${\mbox{\boldmath$x$ }}$ and there exist C27>0 and $\kappa_{23}>0$ such that$\vert v({\mbox{\boldmath$x$ }},m)\vert\leqC_{27}(1+\vert{\mbox{\boldmath$x$ }}\vert^{\kappa_{23}})$;
(ii)
for all ${\mbox{\boldmath$n$ }} \in D^+v({\mbox{\boldmath$x$ }},m)$,
\begin{displaymath}\rho v({\mbox{\boldmath$x$ }},m)-F(m, v'_{\mbox{\boldmath$x......}})+Qv({\mbox{\boldmath$x$ }},\cdot)(m)\leq 0; \ \mbox{and}\end{displaymath}
(iii)
for all ${\mbox{\boldmath$n$ }} \in D^-v({\mbox{\boldmath$x$ }},m)$,
\begin{displaymath}\rho v({\mbox{\boldmath$x$ }},m)-F(m, v'_{\mbox{\boldmath$x......\boldmath$x$ }})+Qv({\mbox{\boldmath$x$ }},\cdot)(m)\geq 0.\end{displaymath}


For further discussion on viscosity solutions, the reader is referred to Fleming and Soner (1992) or Sethi and Zhang (1994a).

Lehoczky, Sethi, Soner, and Taksar (1991) prove the following theorem.

Theorem 2.2   The value function $v({\mbox{\boldmath$x$ }},m)$ defined in (2.3) is the unique viscosity solution to the HJB equation (2.4).

Remark 2.1   If there is a continuously differentiable function that satisfies the HJB equation (2.4), then it is a viscosity solution, and therefore, it is the value function.

Furthermore, we have the following result.

Theorem 2.3  The value function $v(\cdot,m)$ is continuously differentiable and satisfies the HJB equation (2.4).

For the proof, see Theorem 3.1 in Sethi and Zhang (1994a).

Next, we give a verification theorem.

Theorem 2.4 (Verification Theorem)  Suppose that there is a continuously differentiable function $v^0({\mbox{\boldmath$x$ }},m)$ that satisfies the HJB equation (2.4). If there exists ${\mbox{\boldmath$u$ }}^*\in {\cal A}(m)$, for which the corresponding${\mbox{\boldmath$x$ }}^*(t)$ satisfies (2.1) with${\mbox{\boldmath$x$ }}^*(0)={\mbox{\boldmath$x$ }}$,${\mbox{\boldmath$w$ }}^*(t)=v^0_{{\mbox{\boldmath$x$ }}}({\mbox{\boldmath$x$ }}^*(t),m(t))$, and

\begin{displaymath}F(m(t),{\mbox{\boldmath$w$ }}^*(t))=({\mbox{\boldmath$u$ }}......{\mbox{\boldmath$w$ }}^*(t)+c({\mbox{\boldmath$u$ }}^*(t)),\end{displaymath}
almost everywhere in t with probability one, then$v^0({\mbox{\boldmath$x$ }},m)=v({\mbox{\boldmath$x$ }},m)$and${\mbox{\boldmath$u$ }}^*(t)$is optimal, i.e.,
\begin{displaymath}v^0({\mbox{\boldmath$x$ }},m)=v({\mbox{\boldmath$x$ }},m)=J({\mbox{\boldmath$x$ }},m,{\mbox{\boldmath$u$ }}^*(\cdot)).\end{displaymath}

For the proof, see Lemma H.3 of Sethi and Zhang (1994a).

Next we give an application of the verification theorem. With Assumption 2.2, we can use the verification theorem to derive an optimal feedback control for n=1. From Theorem 2.4, an optimal feedback control u*(x,m) must minimize

\begin{displaymath}(u-z)\cdot v_x(x,m)+c(u)\end{displaymath}
for each (x,m). Thus,
\begin{eqnarray*}u^*(x,m)=\left\{\begin{array}{lll}0 &\mbox{if} \ \ v_x(x,......x,m)<0\\ m&\mbox{if} \ \ v_x(x,m)<-c'(m),\end{array}\right.\end{eqnarray*}
when c''(u) is strictly positive, and
\begin{eqnarray*}u^*(x,m)=\left\{\begin{array}{lll}0 &\mbox{if} \ \ v_x(x,......(x,m)=-c\\m&\mbox{if} \ \ v_x(x,m)<-c,\end{array}\right.\end{eqnarray*}
when c(u)=cu for some constant $c\geq 0$.

Recall that $v({\cdot},m)$ is a convex function. Thus u*(x,m) is increasing in x. From a result on differential equations (see Hartman (1982)),

\begin{displaymath}\dot x(t)=u^*(x(t),m(t))-z, \ x(0)=x,\end{displaymath}
has a unique solution x*(t) for each sample path of the capacity process. Hence, the control given above is the optimal feedback control. From this application, we can see that the point such thatvx(x,m)=-c'(z ) is critical in describing the optimal feedback control. So we give the following definition.

Definition 2.5   The turnpike set${\cal G}(m,z)$ is defined by
\begin{displaymath}{\cal G}(m,z)=\{x\in R: v_x(x,m)=-c'(z)\}.\end{displaymath}

Next we will discuss the monotonicity of the turnpike set. To do this, define $i_0\in{\cal M}$ to be such that i0<z<i0+1. Observe that for$m\leq i_0$,

\begin{eqnarray*}\dot x(t)\leq m-z\leq i_0-z<0.\end{eqnarray*}
Therefore, x(t) goes to $-\infty$ monotonically as $t\to\infty$, if the capacity state m is absorbing. Hence, only those $m\in {\cal M}$, for which $m\geq i_0+1$, are of special interest to us.
In view of Theorem 2.1, if $h(\cdot)$ is strictly convex, then each turnpike set reduces to a singleton, i.e., there exists xm such that${\cal G}(m)=\{ x_m\}$,$m\in {\cal M}$.

If the production cost is linear, i.e., c(u)=cu for some constant c, then xm is the threshold inventory level with capacity m. Specifically, if $x>x_m,\;u^*(x,m)=0$, and if $x<x_m,\; u^*(x,m)=m$ (full available capacity).

Let us make the following observation. If the capacity m>z, then the optimal trajectory will move toward the turnpike set xm. Suppose the inventory level is xm for some m and the capacity increases to m1>m; it then becomes costly to keep the inventory at level xm, since a lower inventory level may be more desirable given the higher current capacity. Thus, we expect $x_{m_1}\leq x_{m}$. Sethi, Soner, Zhang, and Jiang (1992) show that this intuitive observation is true. We state their result as the following theorem.

Theorem 2.5   Assume $h(\cdot)$ to be differentiable and strictly convex. Then

\begin{displaymath}x_{i_0}\geq x_{i_0+1}\geq \cdots \geq x_m \geq c_z,\end{displaymath}
where$c_z=(h')^{-1}(-\rho c'(z))$.
nextupprevious
Next:2.2 Optimal control of dynamicUp:2. Optimal Control under thePrevious:2. Optimal Control under the