2.1 Optimal control of single or parallel machine systems
Akella and Kumar (1986) deal with a single-machine (with two states: up and down), single-product problem. They obtain an explicit solution for the threshold inventory level, in terms of which the optimal policy is as follows: whenever the machine is up, produce at the maximum possible rate if the inventory level is less than the threshold, produce on demand if the inventory level is exactly equal to the threshold, and do not produce at all if the inventory level exceeds the threshold. Recently, Feng and Yan (1999) extend the Akella-Kumar model to the case where the exogenous demand forms a homogeneous Poisson flow. They prove the optimality of the threshold control type for this extended Akella-Kumar model. When
their problem is generalized to convex costs and more than two machine
states, it is no longer possible to obtain explicit solutions. Using the
viscosity solution technique,
Sethi, Soner,
Zhang, and Jiang (1992) investigate this general problem. They study
the elementary properties of the value function. They show that the value
function is a convex function and that it is strictly convex provided the
inventory cost is strictly convex. Moreover, it is shown to be a viscosity
solution to the Hamilton-Jacobi-Bellman (HJB) equation and to have upper
and lower bounds each with polynomial growth. Following the idea of
Thompson
and Sethi (1980), they define what are known as the turnpike sets in
terms of the value function. They prove that the turnpike sets are attractors
for the optimal trajectories and provide sufficient conditions under which
the optimal trajectories enter the convex closure of the turnpike sets in finite time. Also,
they give conditions to ensure that the turnpike sets are non-empty.
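The Akella-Kumar threshold (hedging-point) policy described above is a simple decision rule. The following sketch is our own illustration, with assumed names and parameters (`threshold`, `u_max`, the up/down flag); it is not code from the cited papers:

```python
def threshold_policy(x, machine_up, threshold, z, u_max):
    """Akella-Kumar style threshold policy for a single machine and a
    single product: below the threshold build inventory at full capacity,
    at the threshold produce exactly to demand, above it produce nothing.
    A down machine cannot produce at all."""
    if not machine_up:
        return 0.0
    if x < threshold:
        return u_max  # build inventory toward the threshold
    if x == threshold:
        return z      # track demand, holding inventory at the threshold
    return 0.0        # excess inventory: stop producing
```

The three branches correspond exactly to the three cases of the optimal policy quoted above.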
To more precisely state their results, we need to specify the model
of a single/parallel machine manufacturing system. Following the notation
given by Sethi, Soner, Zhang, and Jiang (1992),
let x(t), u(t), z, and m(t) denote, respectively, the inventory level, the production rate, the demand rate, and the machine capacity level at time t. We assume that the demand rate z is a constant positive vector in R^n_+. Furthermore, we assume that m(t) is a Markov process with a finite state space M = {0, 1, ..., p}. We can now write the dynamics of the system as

    dx(t)/dt = u(t) − z,   x(0) = x.                                   (2.1)
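The dynamics (2.1), driven by a two-state Markov capacity process and closed with a threshold feedback rule, can be simulated directly. Everything below (the Euler step, the failure/repair rates, the `simulate` helper) is an assumed sketch, not part of the model specification above:

```python
import random

def simulate(x0, T, dt, z, u_max, threshold, q_up_down, q_down_up, seed=0):
    """Euler discretization of dx(t)/dt = u(t) - z, x(0) = x0, with a
    two-state Markov machine (up/down) and a threshold feedback control.
    Failure/repair rates and the step size are illustrative choices."""
    rng = random.Random(seed)
    x, up = x0, True
    path = [x]
    for _ in range(round(T / dt)):
        # capacity process: jump with probability (rate * dt) per step
        rate = q_up_down if up else q_down_up
        if rng.random() < rate * dt:
            up = not up
        # threshold feedback control
        if not up:
            u = 0.0
        elif x < threshold:
            u = u_max
        else:
            u = z if abs(x - threshold) < 1e-12 else 0.0
        x += (u - z) * dt
        path.append(x)
    return path
```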
Definition 2.1 A control process (production rate) u(·) = {u(t) : t ≥ 0} is called admissible with respect to the initial capacity m(0) = m if:

(i) u(·) is adapted to the filtration {F_t} with F_t = σ(m(s) : 0 ≤ s ≤ t), the σ-field generated by m(·);

(ii) u(t) ∈ U(m(t)) for all t ≥ 0, where U(m) = {u : u ≥ 0 and b · u ≤ m} for some positive vector b.

Let A(m) denote the set of all admissible control processes with the initial condition m(0) = m.
Definition 2.2 A real-valued function u(x, m) on R^n × M is called an admissible feedback control, or simply feedback control, if:

(i) for any given initial x, the equation dx(t)/dt = u(x(t), m(t)) − z, x(0) = x, has a unique solution;

(ii) u(·) = {u(x(t), m(t)) : t ≥ 0} ∈ A(m).
Let h(·) and c(·) denote the surplus cost and the production cost function, respectively. For every u(·) ∈ A(m), x(0) = x, and m(0) = m, let the cost function of the system be defined as follows:

    J(x, m, u(·)) = E ∫_0^∞ e^{−ρt} [h(x(t)) + c(u(t))] dt,           (2.2)

where ρ > 0 is the given discount rate. The problem is to choose an admissible control that minimizes the cost function. We define the value function as

    v(x, m) = inf_{u(·) ∈ A(m)} J(x, m, u(·)).                        (2.3)
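The cost (2.2) of any fixed admissible policy can be estimated by Monte Carlo, which bounds the value (2.3) from above. A sketch for a threshold policy follows; the cost functions, failure/repair rates, and the truncated horizon `T` are all illustrative assumptions:

```python
import math
import random

def discounted_cost(x0, threshold, *, z=1.0, u_max=2.0, rho=0.1,
                    h=lambda x: max(x, 0.0) + 2.0 * max(-x, 0.0),
                    c=lambda u: 0.0, q_fail=0.2, q_repair=0.5,
                    T=50.0, dt=0.05, n_paths=200, seed=1):
    """Monte Carlo estimate of E int_0^T e^(-rho t) [h(x(t)) + c(u(t))] dt
    under a threshold policy, as a finite-horizon surrogate for (2.2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x, up, cost, t = x0, True, 0.0, 0.0
        while t < T:
            u = 0.0
            if up:
                u = u_max if x < threshold else (z if x == threshold else 0.0)
            cost += math.exp(-rho * t) * (h(x) + c(u)) * dt
            if rng.random() < (q_fail if up else q_repair) * dt:
                up = not up
            x += (u - z) * dt
            t += dt
        total += cost
    return total / n_paths
```

Comparing the estimate across candidate thresholds gives a crude way to locate a good hedging point.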
We make the following assumptions on the cost functions h(·) and c(·).

Assumption 2.1 h(·) is a nonnegative, convex function with h(0) = 0. There are positive constants C21, C22, C23, and β22 ≥ β21 ≥ 1, such that

    C21 |x|^β21 − C23 ≤ h(x) ≤ C22 (1 + |x|^β22).

Assumption 2.2 c(·) is a nonnegative function, c(0) = 0, and c(·) is twice differentiable. Moreover, c(·) is either strictly convex or linear.

Assumption 2.3 m(t), t ≥ 0, is a finite state Markov chain with generator Q, where Q = (q_ij) is a (p+1) × (p+1) matrix such that q_ij ≥ 0 for j ≠ i and q_ii = −Σ_{j≠i} q_ij. That is, for any function f on M,

    Qf(·)(m) = Σ_j q_mj f(j) = Σ_{j≠m} q_mj (f(j) − f(m)).
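Since M is finite, Qf(·)(m) in Assumption 2.3 is just a finite sum. The sketch below evaluates it (the helper name and the two-state example in the check are ours):

```python
def apply_generator(Q, f, m):
    """Qf(.)(m) = sum_j q_mj f(j); because each row of a generator sums
    to zero, this equals sum_{j != m} q_mj (f(j) - f(m))."""
    return sum(Q[m][j] * f[j] for j in range(len(Q)))
```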
With these three assumptions we can state the following theorem concerned with the properties of the value function v(x, m), proved in Fleming, Sethi, and Soner (1987).

Theorem 2.1 (i) For each m, v(·, m) is convex on R^n, and is strictly convex if h(·) is so. (ii) There exist positive constants C24, C25, and C26 such that for each m,

    C24 |x|^β21 − C25 ≤ v(x, m) ≤ C26 (1 + |x|^β22),

where β21 and β22 are the power indices in Assumption 2.1.
We next consider the equation associated with the problem. To do this, let

    F(m, r) = inf_{u ∈ U(m)} { c(u) + (u − z) · r },

where U(m) is given in Definition 2.1, and "·" between (u − z) and r represents the inner product of the vectors (u − z) and r. Then, the HJB equation is written formally as follows:

    ρ v(x, m) = F(m, v_x(x, m)) + h(x) + Qv(x, ·)(m),                 (2.4)

for x ∈ R^n, m ∈ M, where v_x(x, m) is the partial derivative of v(x, m) with respect to x.
In general, the value function v may not be differentiable. In order to make sense of the HJB equation (2.4), we consider its viscosity solution. In order to give the definition of the viscosity solution, we first introduce the superdifferential and subdifferential of a given function f(·) on R^n.

Definition 2.3 The superdifferential D^+ f(x) and the subdifferential D^− f(x) of any function f(·) on R^n are defined, respectively, as follows:

    D^+ f(x) = { r ∈ R^n : limsup_{y→x} [f(y) − f(x) − r · (y − x)] / |y − x| ≤ 0 },

    D^− f(x) = { r ∈ R^n : liminf_{y→x} [f(y) − f(x) − r · (y − x)] / |y − x| ≥ 0 }.
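As a standard illustration of these sets (not taken from the source), consider f(x) = |x| on the real line:

```latex
\[
  f(x) = |x|: \qquad D^- f(0) = [-1,\,1], \qquad D^+ f(0) = \emptyset,
\]
since
\[
  \liminf_{y \to 0} \frac{|y| - r\,y}{|y|} = 1 - |r| \ \ge\ 0
  \iff |r| \le 1,
  \qquad
  \limsup_{y \to 0} \frac{|y| - r\,y}{|y|} = 1 + |r| \ >\ 0
  \ \text{for every } r.
\]
```

At a point where f is differentiable, both sets reduce to the singleton containing the gradient, which is why viscosity solutions extend the classical notion.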
Based on Definition 2.3, we give the definition of a viscosity solution.

Definition 2.4 We say that v(x, m) is a viscosity solution of equation (2.4) if the following hold:

(i) v(x, m) is continuous in x, and there exist C27 > 0 and β27 > 0 such that |v(x, m)| ≤ C27 (1 + |x|^β27);

(ii) for all r ∈ D^+ v(x, m),

    ρ v(x, m) ≤ F(m, r) + h(x) + Qv(x, ·)(m);

(iii) for all r ∈ D^− v(x, m),

    ρ v(x, m) ≥ F(m, r) + h(x) + Qv(x, ·)(m).
For further discussion on viscosity solutions, the reader is referred to
Fleming
and Soner (1992) or Sethi and Zhang (1994a).
Lehoczky, Sethi, Soner, and Taksar (1991)
prove the following theorem.
Theorem 2.2 The value function
defined in (2.3) is the unique viscosity
solution to the HJB equation (2.4).
Remark 2.1 If there is a continuously
differentiable function that satisfies the HJB equation (2.4),
then it is a viscosity solution, and therefore, it is the value function.
Furthermore, we have the following result.
Theorem 2.3 The value function v(x, m) is continuously differentiable in x and satisfies the HJB equation (2.4).
For the proof, see Theorem 3.1 in Sethi
and Zhang (1994a).
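Theorem 2.3 suggests approximating v numerically through the HJB equation (2.4). The sketch below runs value iteration on a discrete-time, linearly interpolated approximation for n = 1 with two machine states; the grid, the piecewise-linear h, the linear c, and all rates are illustrative assumptions, not data from the source:

```python
import math

def solve_hjb(z=1.0, u_max=2.0, rho=0.3, c=0.1, q_fail=0.3, q_repair=0.8,
              x_lo=-5.0, x_hi=5.0, nx=61, dt=0.05, tol=1e-6, max_iter=5000):
    """Value iteration on a discrete-time approximation of the HJB (2.4)
    for n = 1 with two machine states (0: down, capacity 0; 1: up,
    capacity u_max), h(x) = x^+ + 2 x^-, c(u) = c*u."""
    xs = [x_lo + i * (x_hi - x_lo) / (nx - 1) for i in range(nx)]

    def h(x):
        return max(x, 0.0) + 2.0 * max(-x, 0.0)

    def interp(row, x):
        # linear interpolation on the grid, clamping x at the boundary
        x = min(max(x, x_lo), x_hi)
        t = (x - x_lo) / (x_hi - x_lo) * (nx - 1)
        i = min(int(t), nx - 2)
        w = t - i
        return (1.0 - w) * row[i] + w * row[i + 1]

    disc = math.exp(-rho * dt)
    jump = [q_repair * dt, q_fail * dt]  # one-step jump probability from m
    cap = [0.0, u_max]
    v = [[0.0] * nx for _ in range(2)]
    for _ in range(max_iter):
        new = [[0.0] * nx for _ in range(2)]
        diff = 0.0
        for m in (0, 1):
            for i, x in enumerate(xs):
                best = float("inf")
                for u in (0.0, z, cap[m]):
                    if u > cap[m]:
                        continue          # respect the capacity constraint
                    xn = x + (u - z) * dt
                    cont = ((1.0 - jump[m]) * interp(v[m], xn)
                            + jump[m] * interp(v[1 - m], xn))
                    best = min(best, (h(x) + c * u) * dt + disc * cont)
                new[m][i] = best
                diff = max(diff, abs(best - v[m][i]))
        v = new
        if diff < tol:
            break
    return xs, v
```

The update is a contraction with modulus e^{−ρΔt}, so the iteration converges geometrically; clamping at the grid boundary is a crude truncation of R.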
Next, we give a verification theorem.
Theorem 2.4 (Verification Theorem) Suppose that there is a continuously differentiable function that satisfies the HJB equation (2.4); by Remark 2.1, it is the value function v. If there exists u*(·) ∈ A(m) for which the corresponding x*(·) satisfies (2.1) with x*(0) = x, m(0) = m, and

    c(u*(t)) + (u*(t) − z) · v_x(x*(t), m(t)) = F(m(t), v_x(x*(t), m(t)))

almost everywhere in t with probability one, then the pair (x*(·), u*(·)) is optimal, i.e., v(x, m) = J(x, m, u*(·)).

For the proof, see Lemma H.3 of Sethi and Zhang (1994a).
Next we give an application of the verification theorem. With Assumption 2.2, we can use the verification theorem to derive an optimal feedback control for n = 1. From Theorem 2.4, an optimal feedback control u*(x, m) must minimize

    c(u) + (u − z) v_x(x, m)

over 0 ≤ u ≤ m for each (x, m). Thus,

    u*(x, m) = min{ max{ (c')^{−1}(−v_x(x, m)), 0 }, m }

when c''(u) is strictly positive, and

    u*(x, m) = m if v_x(x, m) < −c,   u*(x, m) = 0 if v_x(x, m) > −c,

with any u ∈ [0, m] optimal at v_x(x, m) = −c, when c(u) = cu for some constant c ≥ 0. Recall that v(·, m) is a convex function, so v_x(x, m) is nondecreasing and u*(x, m) is nonincreasing in x. From a result on differential equations (see Hartman (1982)), the equation dx(t)/dt = u*(x(t), m(t)) − z, x(0) = x, has a unique solution x*(t) for each sample path of the capacity process. Hence, the control given above is the optimal feedback control. From this application, we can see that the point x such that v_x(x, m) = −c'(z) is critical in describing the optimal feedback control. So we give the following definition.
Definition 2.5 The turnpike set G(m) is defined by

    G(m) = { x : v_x(x, m) = −c'(z) }.
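For strictly convex c, the feedback rule above clips (c')^{−1}(−v_x) to the capacity interval, and on the turnpike set, where v_x(x, m) = −c'(z), the resulting rate is exactly the demand z. A minimal sketch (the helper name and the quadratic example cost are our assumptions):

```python
def optimal_rate(r, m, c_prime_inv):
    """Minimizer of c(u) + (u - z)*r over u in [0, m] for strictly convex,
    differentiable c: clip (c')^{-1}(-r) to the capacity interval.  The
    (u - z)*r term is linear in u, so z only adds a constant and drops
    out of the argmin."""
    return min(max(c_prime_inv(-r), 0.0), m)
```

Here `r` plays the role of v_x(x, m); a brute-force grid search over u confirms the closed form.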
Next we will discuss the monotonicity of the turnpike set. To do this, define i0 ∈ M to be such that i0 < z < i0 + 1. Observe that for m ≤ i0,

    dx(t)/dt = u(t) − z ≤ m − z < 0.

Therefore, x(t) goes to −∞ monotonically as t → ∞, if the capacity state m is absorbing. Hence, only those m ∈ M, for which m ≥ i0 + 1, are of special interest to us.
In view of Theorem 2.1, if h(·) is strictly convex, then each turnpike set reduces to a singleton, i.e., there exists x_m such that G(m) = {x_m}, m ∈ M. If the production cost is linear, i.e., c(u) = cu for some constant c, then x_m is the threshold inventory level with capacity m. Specifically, u*(x, m) = 0 if x > x_m, u*(x, m) = z if x = x_m, and u*(x, m) = m if x < x_m (full available capacity).
Let us make the following observation. If the capacity m > z, then the optimal trajectory will move toward the turnpike level x_m. Suppose the inventory level is x_m for some m and the capacity increases to m1 > m; it then becomes costly to keep the inventory at level x_m, since a lower inventory level may be more desirable given the higher current capacity. Thus, we expect x_{m1} ≤ x_m.
Sethi,
Soner, Zhang, and Jiang (1992) show that this intuitive observation
is true. We state their result as the following theorem.
Theorem 2.5 Assume h(·) to be differentiable and strictly convex. Then x_{m1} ≤ x_m, where m1 > m ≥ i0 + 1.