Andersson Djehiche - AMO 2011

Appl Math Optim (2011) 63: 341–356

DOI 10.1007/s00245-010-9123-8

A Maximum Principle for SDEs of Mean-Field Type

Daniel Andersson · Boualem Djehiche
Published online: 30 October 2010

© Springer Science+Business Media, LLC 2010
Abstract We study the optimal control of a stochastic differential equation (SDE)

of mean-field type, where the coefficients are allowed to depend on some functional
of the law as well as the state of the process. Moreover the cost functional is also of
mean-field type, which makes the control problem time inconsistent in the sense that
the Bellman optimality principle does not hold. Under the assumption of a convex
action space a maximum principle of local form is derived, specifying the necessary
conditions for optimality. These are also shown to be sufficient under additional as-
sumptions. This maximum principle differs from the classical one, where the adjoint
equation is a linear backward SDE, since here the adjoint equation turns out to be a
linear mean-field backward SDE. As an illustration, we apply the result to the mean-
variance portfolio selection problem.
Keywords Stochastic control · Maximum principle · Mean-field SDE ·

McKean-Vlasov equation · Time inconsistent control
1 Introduction
We consider stochastic control problems where the state process is governed by a

stochastic differential equation (SDE) of mean-field type. That is, the coefficients of
the SDE depend on the law of the SDE as well as the state and the control. More
Communicating Editor: Bernt Oksendal.

D. Andersson () · B. Djehiche
Department of Mathematics, Royal Institute of Technology, 100 44 Stockholm, Sweden
e-mail: [email protected]
B. Djehiche
e-mail: [email protected]
342 Appl Math Optim (2011) 63: 341–356
specifically, the SDE is defined as

dxt = b(t, xt , Eψ(xt ), ut )dt + σ (t, xt , Eφ(xt ), ut )dBt ,
(1.1)
x(0) = x0 ,
for some functions b, σ , ψ and φ, and a Brownian motion Bt . For every t the control
ut is allowed to take values in the action space U . The mean-field SDE is obtained as
the mean square limit of an interacting particle system of the form
⎛ ⎞ ⎛ ⎞
1
n 1
n
dxti,n = b ⎝t, xti,n , , ut ⎠ dt + σ ⎝t, xti,n , , ut ⎠ dBti ,
j,n j,n
ψ xt φ xt
n n
j =1 j =1
when n → ∞. The classical example is the McKean-Vlasov model (see e.g. [13] and
the references therein), although in that model the coefficients are linear in the law of
the process. For the nonlinear case, see [10].
The object of the control problem is to minimize a cost functional of the form
T
J (u) = E h (t, xt , Eϕ(xt ), ut ) dt + g (xT , Eχ(xT )) , (1.2)
0
for given functions h, g, ϕ and χ . This cost functional is also of mean-field type, as
the functions h and g depend on the law of the state process.
The fact that J is a (possibly) nonlinear function of the expected value stands in
contrast to the standard formulation of a stochastic control problem, where J is the
expected value of a functional of the state process. In fact, this leads to a so-called
time inconsistent control problem. That is, the Bellman optimality principle does not
hold, see e.g. [4, 9], and [2]. The reason for this is that one cannot apply the law
of iterated expectations on the cost functional. A more general form of this control
problem has been studied in [1], and to some extent in [12], using an extended version
of the dynamic programming principle.
In this paper, we derive necessary and sufficient conditions for optimality of this
control problem in the form of a stochastic maximum principle. The standard sto-
chastic maximum principle involves solving the adjoint equation, a backward SDE,
and maximizing the Hamiltonian. In our case, where the state process and cost func-
tional are of mean-field type, the adjoint equation will in fact not be just a backward
SDE, but a mean-field backward SDE, which has been studied recently in [5] and [6].
Due to time inconsistency of the control problem, [9] and [4] need to specify a
definition of admissible controls that involves the Nash Certainty Equivalent Prin-
ciple and the notion of Nash Equilibrium, respectively, to turn the problem into a
time consistent one and solve it using Bellman’s dynamic programming principle.
[1] express the value function in terms of the Nisio nonlinear semigroup of opera-
tors to obtain a highly complicated version of the Hamilton-Jacobi-Bellman equa-
tion. In this work we show that to study the control problem associated with (1.1)
and (1.2), the set up offered by the stochastic maximum principle does not require
any special definition of the set of admissible controls, besides the classical adapted-
ness and some integrability conditions, which makes it suitable to solve this class of
Appl Math Optim (2011) 63: 341–356 343
time inconsistent control problems without introducing further concepts and techni-
cal tools.
We apply the methods in [3] to obtain the necessary conditions, i.e. we assume
that the action space is convex, which allows us to make a convex perturbation of the
optimal control and obtain a maximum principle of local condition. The extension to
the general case where the action space is not convex, using a spike variation of the
optimal control, leading to a Peng’s type maximum principle is more involved and
will appear elsewhere.
In the last section we illustrate the result by applying it to the mean-variance op-
timization problem. That is, the continuous version of a Markowitz investment prob-
lem where one constructs a portfolio by investing in a risk free bank account and a
risky asset (e.g. a stock). The objective is to maximize the expected terminal wealth
while minimizing the variance of the terminal wealth. Since this cost function in-
volves the variance which is quadratic in the expected value, the problem is time
inconsistent as explained above. It has nevertheless been solved by different meth-
ods, for instance [15] obtain an optimal control in feedback form by embedding the
problem into a class of stochastic LQ problems. We show that our version of the
stochastic maximum principle can be directly applied to obtain the optimal control
found in [15]. This optimal control is different from the ones obtained by [4] using the
notion of Nash equilibrium and the one suggested in [2], using the total conditional
variance formula.
To ease the exposition of the results, we only consider the one dimensional case.
The extension to the multidimensional case is by now straightforward.
2 Formulation of the Problem
Let T > 0 be a fixed time horizon and (, F , Ft , P) be a filtered probability space
satisfying the usual conditions, on which a standard Brownian motion B = (Bt )t≥0
is defined. We assume that (Ft )t≥0 is the natural filtration of B augmented by P-null
sets of F .
The action space, U , is a non-empty, closed and convex subset of R, and U is the
class of measurable, Ft -adapted and square integrable processes u : [0, T ] × → U .
For any u ∈ U , we consider the following stochastic differential equation

dxt = b(t, xt , Eψ(xt ), ut )dt + σ (t, xt , Eφ(xt ), ut )dBt ,
(2.1)
x(0) = x0 ,
where,
b : [0, T ] × R × R × U −→ R,
σ : [0, T ] ×R × R × U −→ R,
ψ : R −→ R,
φ : R −→ R.
344 Appl Math Optim (2011) 63: 341–356
The expected cost is given by

T
J (u) = E h(t, xt , Eϕ(xt ), ut )dt + g(xT , Eχ(xT )) , (2.2)
0
where,
g : R × R −→ R,
h : [0, T ] × R × R × U −→ R,
χ : R −→ R,
ϕ : R −→ R.
The following assumptions will be in force throughout this paper, where x denotes
the state variable, y the ‘expected value’, and v the control variable.
(A.1) ψ, φ, χ and ϕ are continuously differentiable. g is continuously differentiable
with respect to (x, y). b, σ, h are continuously differentiable with respect to
(x, y, v).
(A.2) All the derivatives in (A.1) are Lipschitz continuous and bounded.
Further assumptions will be needed for the sufficient conditions in Sect. 4. Under the
above assumptions, the SDE (2.1) has a unique strong solution. Indeed, since the co-
efficients b and σ are Lipschitz continuous with respect to x, this follows from Propo-
sition 1.2 in [10] if they are also Lipschitz continuous with respect to the Wasserstein
metric
1/2
d(μ, ν) = inf EQ |X − Y |2 ; Q ∈ P with marginals μ and ν

= sup hd(μ − ν); |h(x) − h(y)| ≤ |x − y| ,
where P is the space of probability measures on Rd × Rd . The second equality is

the Kantorovich-Rubinstein theorem, cf. [11]. Since b, σ , ψ and φ are all Lipschitz
continuous it holds that

b ·, ·, φ(x)dμX (x), · − b ·, ·, φ(y)dμY (y), ·

≤ K φ(x)d (μX (x) − μY (x))
≤ Kd (μX , μY ) ,
and similarly for σ . Moreover, by Proposition 1.2 in [10] it holds that

E sup |xt |p < ∞, (2.3)
t∈[0,T ]
for any p ∈ N+ .
Appl Math Optim (2011) 63: 341–356 345
The optimal control problem is to minimize the functional J (.) over U . A control
that solves this problem is called optimal.
We denote for any process ϕt ,
|ϕ|T∗,2 = sup |ϕt |2 .

t∈[0,T ]
For notational convenience, we will suppress the dependence on the time variable. bx ,
by , bv denotes the derivative of b with respect to the state trajectory, the ‘expected
value’ and the control variable, respectively, and similarly for the other functions.
Finally, we denote by x̂t and ût the optimal trajectory and control, respectively.
3 Necessary Conditions for Optimality
In this section we derive necessary conditions for optimality in the form of a maxi-
mum principle of local condition. The methods are those used in [3].
3.1 Taylor Expansions
We let xtθ denote the state trajectory corresponding to the following perturbation uθt
of ût :
uθt = ût + θ vt , vt ∈ U.
We introduce the short-hand notation

b̂(t) = b t, x̂t , Eψ̂(t), ût ,

ψ̂(t) = ψ x̂t ,
and similarly for the other functions and their derivatives.

The objective of this section is to determine the Gateaux derivative of the cost
functional in terms of the Taylor expansion of the state process. We start by identify-
ing the Taylor expansion.
Lemma 3.1 Let

⎧
⎪
⎨dzt = (b̂x (t)zt + b̂y (t)E(ψ̂x (t)zt ) + b̂v (t)vt )dt
+ (σ̂x zt + σ̂y (t)E(φ̂x (t)zt ) + σ̂v (t)vt )dBt , (3.1)
⎪
⎩
z0 = 0.
Then, it holds that

θ ∗,2
x − x̂t
lim E t − zt = 0.
θ→0 θ T
346 Appl Math Optim (2011) 63: 341–356
Proof Since the coefficients in (3.1) are bounded, it follows from Proposition 1.2 in
[10] that there exists a unique solution such that

E sup |zt |p < ∞, (3.2)
t∈[0,T ]
for any p ∈ N+ .
xtθ −x̂t
If we define ytθ = θ − zt , and noting (2.3), it is also clear that

θ p
E sup y < ∞, (3.3)
t
t∈[0,T ]
for any p ∈ N+ . We have y0θ = 0 and ytθ fulfills the following SDE.
1
dytθ = b x̂t + θ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + θ vt − b̂(t) dt
θ

− b̂x (t)zt + b̂y (t)E ψ̂x (t)zt + b̂v (t)vt dt
1
+ σ x̂t + θ ytθ + zt , Eφ x̂t + θ ytθ + zt , ût + θ vt − σ̂ (t) dBt
θ

− σ̂x (t)zt + σ̂y (t)E φ̂x (t)zt + σ̂v (t)vt dBt . (3.4)
Noting that
d
b ·, E ψ x̂t + λθ ytθ + zt , ·
dλ

= by ·, E ψ x̂t + λθ ytθ + zt , · E ψx x̂t + λθ ytθ + zt ytθ + zt θ,
we proceed as in [3] and write
1
b x̂t + θ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + θ vt − b̂(t) dt
θ
1
= bx x̂t + λθ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + λθ vt ytθ + zt dλ
0
1
+ by x̂t + λθ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + λθ vt
0

· E ψx x̂t + λθ ytθ + zt ytθ + zt dλ
1
+ bv x̂t + λθ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + λθ vt vt dλ. (3.5)
0
Denoting xtλ,θ = x̂t + λθ (ytθ + zt ) and uλ,θ

t = ût + λθ vt for notational convenience,
we may insert (3.5) into (3.4) to obtain the equality
1
b x̂t + θ ytθ + zt , Eψ x̂t + θ ytθ + zt , ût + θ vt − b̂(t)
θ
Appl Math Optim (2011) 63: 341–356 347

− b̂x (t)zt + b̂y (t)E ψ̂x (t)zt + b̂v (t)vt
1
= bx xtλ,θ , Eψ xtλ,θ , uλ,θ
t ytθ dλ
0
1
+ by xtλ,θ , Eψ xtλ,θ , uλ,θ
t E ψx xtλ,θ ytθ dλ
0
1
+ bx xtλ,θ , Eψ xtλ,θ , uλ,θ
t − b̂x (t) zt dλ
0
1
+ by xtλ,θ , Eψ xtλ,θ , uλ,θ
t E ψx xtλ,θ zt − b̂y (t)E(ψ̂x (t)zt ) dλ
0
1
+ bv xtλ,θ , Eψ xtλ,θ , uλ,θ
t − b̂v (t) vt dλ. (3.6)
0
The three last terms tend to 0 in L2 ( × [0, T ]) as θ → 0. To see this, we rewrite the
second to last term above as
1
It := by xtλ,θ , Eψ xtλ,θ , uλ,θ
t
0

− by x̂t , Eψ xtλ,θ , uλ,θ
t E ψx xtλ,θ zt dλ
1
+ by x̂t , Eψ xtλ,θ , uλ,θ
t
0

− by x̂t , Eψ x̂t , uλ,θ
t E ψx xtλ,θ zt dλ
1
+ by x̂t , Eψ x̂t , uλ,θ
t − b̂y (t) E ψx xtλ,θ zt dλ
0
1
+ b̂y (t) E ψx xtλ,θ zt − E(ψ̂x (t)zt ) dλ.
0
By using the Lipschitz continuity and boundedness of the functions as well as

Cauchy-Schwarz inequality, we obtain the following estimate of the L2 ( × [0, T ])-
norm of the expression above (K > 0 is a constant).
⎧ 1/2
T ⎨ T 1 θ 4 T 1/2
E |It | dt ≤ K
2
E λθ yt + zt dλdt E |zt |4 dt
0 ⎩ 0 0 0
1/2 ⎫
T 1 T 1/2 ⎬
+ E |λθ vt |4 dλdt E |zt |4 dt ,
0 0 0 ⎭
which converges to 0 as θ → 0 since the expected values are finite. Similar estima-
tions for the third and fifth terms in (3.6) show that these terms also converge to 0
348 Appl Math Optim (2011) 63: 341–356
in L2 ( × [0, T ]). Now, rewriting the diffusion part in (3.4) in the same way and
using the Burkholder-Davis-Gundy inequality, we have by the boundedness of the
functions and Jensen’s inequality that

∗,2 T ∗,2 T θ 2
E y θ ≤K
T
E y θ dt +
t

sup E ys dt + ρ θ
0 0 s∈[0,t]
T ∗,2
≤K E y θ t dt + ρ θ ,
0
where K > 0 is a constant and ρ θ → 0 as θ → 0. Applying Gronwall’s lemma gives

the result.
Lemma 3.2 The Gateaux derivative of the cost functional J is given by
d T
J û + θ v =E ĥx (t)zt + ĥy (t)E ϕ̂x (t)zt + ĥv (t)vt dt
dθ θ=0 0

+ E ĝx (T )zT + ĝy (T )E (χx (T )zT ) .
Proof Using the short hand notation g (xT ) = g (xT , E (χ(xT ))), etc., we have, in
view of Lemma 3.4,
d θ
E g xT
dθ θ=0

g xTθ − g x̂T
= lim E
θ→0 θ

1 xTθ − x̂T
= lim E gx x̂T + λ xTθ − x̂T dλ
θ→0 0 θ

1 θ θ xTθ − xT
+E gy x̂T + λ xT − x̂T E χx x̂T + λ xT − xT dλ
0 θ

= E ĝx (T )zT + ĝy (T ) E χ̂x (T )zT .
Similarly one can show that

d T T
E h xtθ , uθt dt =E ĥx (t)zt + ĥy (t)E ϕ̂x (t)zt + ĥv (t)vt dt .
dθ 0 θ=0 0
From the definitions of the cost function and the perturbed control, we see that this
proves the lemma.
Appl Math Optim (2011) 63: 341–356 349
3.2 Duality
Define the adjoint equation

⎧
⎪
⎨dp̂t = −(b̂x (t)p̂t + σ̂x (t)q̂t + ĥx (t))dt + q̂t dBt
− (E(b̂y (t)p̂t )ψ̂x (t) + E(σ̂y q̂t )φ̂x (t) + E(ĥy (t))ϕ̂x (t))dt, (3.7)
⎪
⎩
p̂T = ĝx (T ) + E ĝy (T ) χ̂x (T ).
This equations reduces to the standard one, when the coefficients do not depend
explicitly on the marginal law of the underlying diffusion. Under the assumptions
(A.1)–(A.2), this is a linear mean-field backward SDE with bounded coefficients and
it follows from [5], Theorem 3.1, that it has a unique adapted solution such that
∗,2 T 2
E p̂ T + E q̂t dt < +∞. (3.8)
0
We have the following relation between p̂t and zt .
Lemma 3.3
T
E p̂T zT = E p̂t b̂v (t)vt − zt ĥx (t) − zt E ĥy (t) ϕ̂x (t) + q̂t σ̂v (t)vt dt .
0
Proof In view of (3.1) and (3.7), applying Itô’s formula to p̂t zt ,

T
p̂T zT = p̂t b̂x (t)zt + p̂t b̂y (t)E (ψx (t)zt ) + p̂t b̂v (t)vt − zt b̂x (t)p̂t
0

− zt E b̂y (t)p̂t ψ̂x (t) − zt σ̂x (t)q̂t − zt E σ̂y (t)q̂t φ̂x (t) − zt ĥx (t)

− zt E ĥy (t) ϕ̂x (t) + q̂t σ̂x (t)zt + q̂t σ̂y (T )E φ̂x (t)zt + q̂t σ̂v (t)vt dt
+ Mt ,
where Mt is a zero-mean martingale. By taking expectations we are left with
T
E p̂T zT = E p̂t b̂v (t)vt − zt ĥx (t) − zt E ĥy (t) ϕ̂x (t) + q̂t σ̂v (t)vt dt .
0

Next, we define the Hamiltonian
H (t, x, μ, u, p, q) = h t, x, ϕdμ, u + b t, x, ψdμ, u p
+ σ t, x, φdμ, u q.
350 Appl Math Optim (2011) 63: 341–356
To ease the notation, whenever x is a random variable whose probability law is μ, we

use the following notation for the Hamiltonian.
H (t, x, u, p, q) := h (t, x, E (ϕ(x)) , u) + b (t, x, E (ψ(x)) , u) p

+ σ (t, x, E (φ(x)) , u) q.
By combining Lemma 3.3 with Lemma 3.2 and by observing that

E p̂T zT = E ĝx (T )zT + ĝy (T ) E χ̂x (T )zT ,
we obtain the following result.
Corollary 3.1 The Gateaux derivative of the cost functional can be expressed in
terms of the Hamiltonian H in the following way.
d T
J û + θ v =E ĥv (t)vt + p̂t b̂v (t)vt + q̂t σ̂v (t)vt dt
dθ θ=0 0
T d
=E H t, x̂t , ût , p̂t , q̂t vt dt .
0 dv
3.3 Main Result
Since U is convex, we may choose the perturbation

uθt = ût + θ vt − ût ∈ U,
for 0 ≤ θ ≤ 1. Thus, since û is optimal, we have the inequality
d T d
J û + θ v − û =E H t, x̂t , ût , p̂t , q̂t vt − ût dt ≥ 0.
dθ θ=0 0 dv
As in [3], we can reduce this to
d
H t, x̂t , ût , p̂t , q̂t v − ût ≥ 0,
dv
a.e., P-a.s., for all v ∈ U . We summarize this with the main result of this section.
Theorem 3.1 Under assumptions (A.1)–(A.2), if ût is an optimal control with state
trajectory x̂t , then there exists a pair (p̂t , q̂t ) of adapted processes which satisfies
(3.7) and (3.8), such that
d
H t, x̂t , ût , p̂t , q̂t v − ût ≥ 0, P-a.s., for all t ∈ [0, T ]. (3.9)
dv
Remark 3.1 We note that if we assume that the functions b, σ, h, g, ϕ, ψ, χ and φ
are only Lipschitz continuous, Theorem 3.1 still holds but on an extended probability
space, using distributional derivatives and the Bouleau-Hirsch Flow Property (cf. [7]).
Appl Math Optim (2011) 63: 341–356 351
4 Sufficient Conditions for Optimality
We add the following assumptions.

(A.3) The function g is convex in (x, y).
(A.4) The Hamiltonian is convex in (x, y, v).
(A.5) The functions ψ, φ, ϕ, χ are convex.
(A.6) The functions by , σy , hy and gy are non-negative.
Theorem 4.1 Assume the conditions (A.1)–(A.6) are satisfied and let û ∈ U with
state trajectory x̂t be given and such that there exist solutions p̂t , q̂t to the adjoint
equation (3.7). Then, if

H t, x̂t , ût , p̂t , q̂t = inf H t, x̂t , v, p̂t , q̂t , (4.1)
v∈U
for all t ∈ [0, T ], P-a.s., û is an optimal control.
Remark 4.1 By assumption (A.4), the conditions (3.9) and (4.1) are equivalent.
Proof We introduce the short hand notation b(t) = b (t, xt , E (ψ(xt )) , ut ) and simi-
larly for the other functions. The functions b̂(t), ψ̂(t) etc., are defined as in Sect. 3.
Moreover, we denote H (t) = H (t, xt , ut , p̂t , q̂t ) and Ĥ (t) = H (t, x̂t , ût , p̂t , q̂t ).
Since g and χ are convex and gy ≥ 0, it holds that

E ĝ − g ≤ E ĝx (T ) x̂T − xT + ĝy (T )E χ̂(T ) − χ(T )

≤ E ĝx (T ) x̂T − xT + ĝy (T )E χ̂x (T ) · x̂T − xT

= E p̂T x̂T − xT .
By applying Itô’s formula we get

E p̂T x̂T − xT
T T T
=E x̂t − xt dp̂t + p̂t d x̂t − xt + q̂t σ̂ (t) − σ (t) dt
0 0 0
T
= −E x̂t − xt b̂x (t)p̂t + E(b̂y (t)p̂t ψ̂x (t) + σ̂x (t)q̂t
0

+ E σ̂y (t)q̂t φ̂x (t) + ĥx (t) + E ĥy (t) ϕ̂x (t) dt
T T
+E p̂t b̂(t) − b(t) dt + E q̂t σ̂ (t) − σ (t) dt
0 0
T
= −E x̂t − xt b̂x (t)p̂t + E b̂y (t)p̂t ψ̂x (t) + σ̂x (t)q̂t
0

+ E σ̂y (t)q̂t φ̂x (t) + ĥx (t) + E ĥy (t) ϕ̂x (t) dt
352 Appl Math Optim (2011) 63: 341–356
T T
+E Ĥ (t) − H (t) dt − E ĥ(t) − h(t) dt,
0 0
where, in the last step, we have used the definition of the Hamiltonian H . Next, we
differentiate the Hamiltonian and use the convexity of the functions to get for all
t ∈ [0, T ], P-a.s.,
Ĥ (t) − H (t)

≤ Ĥx (t) x̂t − xt + ĥy (t)E ϕ̂(t) − ϕ(t) + b̂y (t)E ψ̂(t) − ψ(t) p̂t

+ σ̂y (t)E φ̂(t) − φ(t) q̂t + Ĥu (t) ût − ut

≤ Ĥx (t) x̂t − xt + ĥy (t)E ϕ̂x (t) x̂t − xt + b̂y (t)E ψ̂x (t) x̂t − xt p̂t

+ σ̂y (t)E φ̂x (t) x̂t − xt q̂t + Ĥu (t) ût − ut

≤ Ĥx (t) x̂t − xt + ĥy (t)E ϕ̂x (t) x̂t − xt + b̂y (t)E ψ̂x (t) x̂t − xt p̂t

+ σ̂y (t)E φ̂x (t) x̂t − xt q̂t ,

where in the last step we have used that Ĥu ût − ut ≤ 0 due to the minimum condi-
tion (4.1). Combining the inequalities above gives us

J û − J (u)
T
=E ĥ(t) − h(t) dt + E ĝ(T ) − g(T )
0
T T
≤E Ĥ (t) − H (t) dt − E x̂t − xt b̂x (t)p̂t + E b̂y (t)p̂t ψ̂x (t)
0 0

+ σ̂x (t)q̂t + E σ̂y (t)q̂t φ̂x (t) + ĥx (t) + E ĥy (t) ϕ̂x (t) dt
T T
=E Ĥ (t) − H (t) dt − E x̂t − xt Ĥx (t) + E b̂y (t)p̂t ψ̂x (t)
0 0

+ E σ̂y (t)q̂t (t) φ̂x (t) + E ĥy (t) ϕ̂x (t) dt ≤ 0,
and thus, û is optimal.
5 Application: Mean-Variance Portfolio Selection
In this section we will illustrate the maximum principle by solving the optimal mean-
variance portfolio problem.
We consider a market with a risky asset and a risk free bank account and denote
the prices at time t with St1 and St0 , respectively. The price process evolve according
Appl Math Optim (2011) 63: 341–356 353
to the equations

dSt0 = ρt St0 dt,
dSt1 = αt St1 dt + σt St1 dBt ,
where αt , σt , ρt are bounded deterministic functions. If ut denotes the amount of
money invested in the risky asset at time t, we can write down the value xt of a
self-financing portfolio consisting of the risky and the risk free assets, as
dxt = (ρt xt + (αt − ρt ) ut ) dt + σt ut dBt , x0 = x(0). (5.1)
A widely discussed version of the mean-variance portfolio problem is to find an ad-

missible control u which minimizes the variance
Var(xT ) = E(xT2 ) − (ExT )2 ,
under the condition that

E(xT ) = a,
where a is a given real number. This problem has been formulated and solved us-
ing both the dynamic programming principle and using the maximum principle—see
[8, 14].
The portfolio selection problem in which one minimizes the variance while maxi-
mizing the expected return differs from the original one, since the Bellman optimality
principle does not hold due to the time inconsistency. It has been solved with different
methods such as in [15], by embedding it into a stochastic LQ control problem, in [4]
using the notion of Nash equilibrium, and in [2] using the total conditional variance
formula. We will apply the maximum principle (Theorem 3.1) stated above to get
a candidate for a feedback optimal control. This control coincides with the optimal
control obtained by [15] through their sufficient optimality condition (Theorem 4.1
in that paper). Our sufficient optimality conditions do not apply to this case because
the cost function does not satisfy condition (A.6) in Theorem 4.1.
The cost functional, to be minimized, is given by
γ
J (u) = Var(xT ) − E (xT ). (5.2)
2
By rewriting this as
γ 2 γ
J (u) = E x − xT − (E (xT ))2 ,
2 T 2
we see that this is a cost functional of the form (2.2). As noted in e.g. [4], this becomes
a time inconsistent control problem. We start our attempt to solve it by writing down
the Hamiltonian for this system:
H (t, x, u, p, q) = (ρt x + (αt − ρt ) u) p + σt uq.

354 Appl Math Optim (2011) 63: 341–356
The adjoint equation becomes

dpt = −ρt pt dt + qt dBt ,
pT = γ (xT − μT ) − 1,
where μt = E (xt ). Looking at the terminal condition of pt , it is reasonable to try a

solution of the form pt = At (xt − μt ) − Ct , with At , Ct deterministic functions such
that AT = γ , CT = 1. Noting that from (5.1), we have
dμt = (ρt μt + (αt − ρt ) E (ut )) dt, (5.3)
we apply Itô’s formula on pt to get

dpt = At (ρt (xt − μt ) + (αt − ρt ) (ut − E (ut ))) + At (xt − μt ) − Ct dt
+ At σt ut dBt , (5.4)
where At and Ct denotes the derivatives with respect to t. By comparing (5.4)
with (5), we get
At (ρt (xt − μt ) + (αt − ρt ) (ut − E (ut ))) + At (xt − μt ) − Ct

= −ρt At (xt − μt ) + ρt Ct , (5.5)
qt = At σt ut . (5.6)
Since, H is linear in the control variable u, in which case the conditions (3.9) and
(4.1) are equivalent, we consider the first order condition for minimizing the Hamil-
tonian that yields
(αt − ρt ) pt + σt qt = 0.
Inserting (5.6) into the latter expression gives us the following candidate of feedback
form for the optimal control:
(αt − ρt ) (αt − ρt )
ût = 2
x̂t − μ̂t − Ct , (5.7)
σt σt2 At
which is square integrable since x̂t is. The expected value of ût is
(ρt − αt ) Ct
E ût = . (5.8)
σt2 At
Moreover, by inserting (5.8) into (5.5) we also get

1 (α − ρ ) 2
2ρt At + At x̂t − μ̂t − Ct − ρt +
t t
ût = Ct . (5.9)
At (ρt − αt ) σt2
Appl Math Optim (2011) 63: 341–356 355

Identifying (5.7) and (5.9) as two equal regressions on x̂t − μ̂t , we can compare
the coefficients to obtain the following equations for At and Ct .

(ρt − αt )2 At − 2ρt At + At σt2 = 0, AT = γ ,
ρt Ct + Ct = 0, CT = 1.
The solutions to these equations are

T
(2ρs − s )ds
At = γ e t ,
T
Ct = e t ρs ds
,
where we have defined

(ρt − αt )2
t = .
σt2
Hence, in view of (5.7), the candidate for the optimal control is given by
αt − ρt −1

ût = C A
t t − x̂ t − μ̂ t
σt2
αt − ρt 1 T (s −ρs )ds
= et − x̂t − μ̂t , (5.10)
σt2 γ
with expected value

αt − ρt 1 T ( −ρ )ds
E ût = et s s . (5.11)
σt2 γ
By inserting (5.11) into (5.3), we get the following equation for μ̂t .
1 T
dμ̂t = ρt μ̂t + t e t (s −ρs )ds dt, μ̂0 = x0 . (5.12)
γ
The solution to (5.12) is

t 1 T (s −ρs )ds t s ds
μ̂t = x0 e 0 ρs ds
+ et e0 −1 . (5.13)
γ
Finally, by inserting (5.13) into (5.10), we get the solution candidate for the mean-
variance portfolio selection problem (5.2), when xt obeys (5.1), given in feedback
form by
αt − ρt t 1 T s ds− T ρs ds
û t, x̂t = x0 e 0 ρs ds
+ e0 t − x̂t ,
σt2 γ
which is identical to the optimal control found in [15], cf. (5.12), (6.7) and the sub-
sequent comments.
356 Appl Math Optim (2011) 63: 341–356
References
1. Ahmed, N.U., Ding, X.: Controlled McKean-Vlasov equations. Commun. Appl. Anal. 5(2), 183–206
(2001)
2. Basak, S., Chabakauri, G.: Dynamic mean-variance asset allocation. In: EFA 2007 Ljubljana Meet-
ings; AFA 2009 San Francisco Meetings Paper. SSRN: http://ssrn.com/abstract=965926 (2009)
3. Bensoussan, A.: Lectures on stochastic control. In: Mitter, S.K., Moro, A. (eds.) Nonlinear Filtering
and Stochastic Control. Springer Lecture Notes in Mathematics, vol. 972. Springer, Berlin (1982)
4. Björk, T., Murgoci, A.: A general theory of Markovian time inconsistent stochastic control problems.
Preprint (2008)
5. Buckdahn, R., Li, J., Peng, S.: Mean-field backward stochastic differential equations and related par-
tial differential equations. Stoch. Process. Appl. 119(10), 3133–3154 (2007)
6. Buckdahn, R., Djehiche, B., Li, J., Peng, S.: Mean-field backward stochastic differential equations.
A limit approach. Ann. Probab. 37(4), 1524–1565 (2009)
7. Chighoub, F., Djehiche, B., Mezerdi, B.: The stochastic maximum principle in optimal control of
degenerate diffusions with non-smooth coefficients. Random Oper. Stoch. Equ. 17, 35–53 (2008)
8. Framstad, N.C., Sulem, A., Øksendal, B.: Sufficient stochastic maximum principle for optimal control
of jump diffusions and applications to finance. J. Optim. Theory Appl. 121(1), 77–98 (2004)
9. Huang, M., Malhamé, R.P., Caines, P.E.: Large population stochastic dynamic games: closed-loop
McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst. 6(3), 221–
252 (2006)
10. Jourdain, B., Méléard, S., Woyczynski, W.: Nonlinear SDEs driven by Lévy processes and related
PDEs. Alea 4, 1–29 (2008)
11. Kantorovich, L.B., Rubinstein, G.S.: On the space of completely additive functions. Vestn. Leningr.
Univ., Mat. Meh. Astron. 13(7), 52–59 (1958)
12. Lasry, J.M., Lions, P.L.: Mean-field games. Jpn. J. Math. 2, 229–260 (2007)
13. Sznitman, A.S.: Topics in propagation of chaos. In: Ecôle de Probabilites de Saint Flour, XIX-1989.
Lecture Notes in Math, vol. 1464, pp. 165–251. Springer, Berlin (1989)
14. Yong, J., Zhou, X.Y.: Stochastic Control: Hamiltonian Systems and HJB Equations. Springer, New
York (1999)
15. Zhou, X.Y., Li, D.: Continuous-time mean-variance portfolio selection: a stochastic LQ framework.
Appl. Math. Optim. 42, 19–33 (2000)

Andersson Djehiche - AMO 2011

Uploaded by

Copyright:

Available Formats

Andersson Djehiche - AMO 2011

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Andersson Djehiche - AMO 2011

Uploaded by

Copyright:

Available Formats

Appl Math Optim (2011) 63: 341–356

A Maximum Principle for SDEs of Mean-Field Type

Daniel Andersson · Boualem Djehiche

Published online: 30 October 2010

Abstract We study the optimal control of a stochastic differential equation (SDE)

Keywords Stochastic control · Maximum principle · Mean-field SDE ·

We consider stochastic control problems where the state process is governed by a

Communicating Editor: Bernt Oksendal.

specifically, the SDE is defined as

2 Formulation of the Problem

The expected cost is given by

where P is the space of probability measures on Rd × Rd . The second equality is

and similarly for σ . Moreover, by Proposition 1.2 in [10] it holds that

|ϕ|T∗,2 = sup |ϕt |2 .

3 Necessary Conditions for Optimality

3.1 Taylor Expansions

We introduce the short-hand notation

and similarly for the other functions and their derivatives.

Lemma 3.1 Let

Then, it holds that

Denoting xtλ,θ = x̂t + λθ (ytθ + zt ) and uλ,θ

By using the Lipschitz continuity and boundedness of the functions as well as

where K > 0 is a constant and ρ θ → 0 as θ → 0. Applying Gronwall’s lemma gives

Lemma 3.2 The Gateaux derivative of the cost functional J is given by

Similarly one can show that

Define the adjoint equation

We have the following relation between p̂t and zt .

Proof In view of (3.1) and (3.7), applying Itô’s formula to p̂t zt ,

where Mt is a zero-mean martingale. By taking expectations we are left with

Next, we define the Hamiltonian

H (t, x, μ, u, p, q) = h t, x, ϕdμ, u + b t, x, ψdμ, u p

To ease the notation, whenever x is a random variable whose probability law is μ, we

H (t, x, u, p, q) := h (t, x, E (ϕ(x)) , u) + b (t, x, E (ψ(x)) , u) p

By combining Lemma 3.3 with Lemma 3.2 and by observing that

we obtain the following result.

3.3 Main Result

Since U is convex, we may choose the perturbation

for 0 ≤ θ ≤ 1. Thus, since û is optimal, we have the inequality

4 Sufficient Conditions for Optimality

We add the following assumptions.

for all t ∈ [0, T ], P-a.s., û is an optimal control.

By applying Itô’s formula we get

and thus, û is optimal.

5 Application: Mean-Variance Portfolio Selection

dxt = (ρt xt + (αt − ρt ) ut ) dt + σt ut dBt , x0 = x(0). (5.1)

A widely discussed version of the mean-variance portfolio problem is to find an ad-

Var(xT ) = E(xT2 ) − (ExT )2 ,

under the condition that

H (t, x, u, p, q) = (ρt x + (αt − ρt ) u) p + σt uq.

The adjoint equation becomes

where μt = E (xt ). Looking at the terminal condition of pt , it is reasonable to try a

dμt = (ρt μt + (αt − ρt ) E (ut )) dt, (5.3)

we apply Itô’s formula on pt to get

At (ρt (xt − μt ) + (αt − ρt ) (ut − E (ut ))) + At (xt − μt ) − Ct

Moreover, by inserting (5.8) into (5.5) we also get

The solutions to these equations are

where we have defined

with expected value

The solution to (5.12) is