6 Evolutionary Game Theory & Replicator Dynamics Flashcards
CGT vs EGT
Classical Game Theory (CGT) is a static theory centred on players' full rationality: each player chooses their most rational option. Evolutionary Game Theory (EGT) is instead about dynamic adaptation in games played between
randomly paired individuals, with no assumption of rationality or reciprocity (no information about previous games).
EGT and population games
Evolutionary dynamics can be applied to many systems. Here, we focus on symmetric two-player evolutionary games and adopt the framework of population games, which describes the evolution of interacting and generally competing populations:
- interactions between the different species are prescribed by the payoff matrix A of the stage game, with each pure strategy corresponding to a species
- frequencies = densities of individuals
- how individuals fare depends on their fitness = expected payoff
fitness
defined as the reproductive potential of an individual.
population games
each pure strategy i=1,…,k is interpreted as one of k species composing the population
the frequency xᵢ of pure strategy i is the fraction of species i in the population
we assume the population is very large and well mixed (no fluctuations, no space), i.e. spatially homogeneous
since the xᵢ are fractions, Σᵢ₌₁ᵏ xᵢ = 1
strategy profile
at each time t, the state of the population (its COMPOSITION) is described by the "strategy profile"
x(t) = (x₁(t), x₂(t), …, xₖ(t))^T
= x₁(t)e₁ + x₂(t)e₂ + … + xₖ(t)eₖ
where the k-dimensional unit vectors
e₁ = (1, 0, …, 0)^T, e₂ = (0, 1, …, 0)^T, …, eₖ = (0, 0, …, 1)^T
represent the k pure strategies / species
individuals
each individual is associated with a pure strategy; at each time increment, agents are randomly paired and play a stage game, and their expected payoffs determine their rate of replication.
k species
k pure strategies
expected payoff
Π(eᵢ, x) ≡ Πᵢ(x) = eᵢ^T Ax
with the k-dimensional unit vectors e₁ = (1, 0, …, 0)^T, e₂ = (0, 1, 0, …, 0)^T, etc.
Πᵢ(x) is the expected payoff of an i-player against a random individual of a population playing on average x
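As a quick numerical sketch of Πᵢ(x) = eᵢ^T Ax (the payoff values below are illustrative, not from the notes):

```python
import numpy as np

# illustrative symmetric 2x2 payoff matrix (values are made up)
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def expected_payoff(i, x, A):
    """Fitness Pi_i(x) = e_i^T A x of an i-player in population state x."""
    return A[i] @ x            # e_i^T A x is just row i of A times x

x = np.array([0.4, 0.6])       # 40% species 1, 60% species 2
pi_1 = expected_payoff(0, x, A)
pi_2 = expected_payoff(1, x, A)
pi_bar = x @ A @ x             # population average fitness x^T A x
```

Note that the average fitness is the frequency-weighted sum of the species' fitnesses.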
replicator
The basic ingredient of an evolutionary system:
an entity (gene, organism, strategy, …)
able to produce copies of itself. A replicator system is a set of replicators with a pattern of interactions among individuals, such that those with higher fitness reproduce faster.
WHEN xᵢ(t) = 0 or 1:
the population consists of only one species
and is said to be MONOMORPHIC
(corresponds to a pure strategy in CGT)
WHEN 0 < xᵢ(t) < 1:
the population consists of a mixture of more than one species and is said to be "polymorphic"
(corresponds to a mixed strategy in CGT)
genome and strategies
In a biological setting, we assume that the strategies are encoded by the genome, and
there is no assumption about the rationality of players. In close relation to population
genetics, the success of one species in EGT is measured by its fitness which depends on
what others do: it changes with the population composition as prescribed by the stage
game. The evolutionary dynamics of x(t) in a large population is usually specified by
the replicator equations, see Sec. 6.2
CGT and EGT interpretation of pop games
Utility function, expected payoff VS Fitness, reproductive potential
Perfect rationality assumed VS Rationality not assumed
Static theory VS Dynamic theory
Pure strategy i = 1, …, k VS Species i = 1, …, k
Pure strategy i ⇒ unit vector eᵢ VS Species i ⇒ unit vector eᵢ
Strategy profile x = Σᵢ₌₁ᵏ xᵢeᵢ VS Population state x(t) = (x₁(t), …, xₖ(t))^T
Frequency of pure strategy i: 0 ≤ xᵢ ≤ 1 VS xᵢ(t): fraction of i in the population at time t
Σᵢ₌₁ᵏ xᵢ = 1 VS xᵢ(t) changes in time, but Σᵢ₌₁ᵏ xᵢ(t) = 1
EXAMPLE
Here, there are 3 species and 10 individuals,
4 of species 1, 3 of species 2 and 3 of species 3
x_1=4/10
x_2=3/10
x_3=3/10
represented by unit vectors, respectively,
(1,0,0)^T
(0,1,0)^T
(0,0,1)^T
x=0.4e₁+0.3e₂+0.3e₃ =
(0.4,0.3,0.3)^T
How does the population state x(t) evolve?
we use replicator dynamics to see
REPLICATOR DYNAMICS
EGT doesn't rely on rationality but considers a large population of individuals with given strategies that interact randomly in games.
Assume that during a time increment (between t and t + dt) the population composition is x(t):
individuals meet randomly many times in pairwise contests whose payoffs define fitnesses and encode reproductive potential.
replicator dynamics:
Between time t and t + dt, the population is in state
x(t) = (x₁(t), …, xₖ(t))^T.
At this point, individuals are paired randomly to play "stage games" many times (each time
starting from scratch, without recollection of previous outcomes) => they obtain their expected payoff = "fitness" Πᵢ(x(t)),
which depends on x at time t,
and the population average payoff/fitness Π¯(x(t)).
Fitness of species i:
Πᵢ(x(t)) = eᵢ^T A x(t)
says how i is expected to perform when the population is in state x(t)
Population's average fitness:
Π¯(x(t)) = x(t)^T A x(t)
says how a random individual performs on average when the population is in state x(t)
(these generally change in time: they are "frequency dependent")
Darwinian principle
the best adapted species/strategies reproduce faster than others =>
we compare the fitness of each species/pure strategy i to the population average fitness Π¯(x(t)), which gives its growth rate
Growth rate
the rate at which the density xᵢ(t) changes in time is
x_dot ᵢ ≡ dxᵢ(t)/dt, and the growth rate is
GR ≡ x_dot ᵢ/xᵢ = (d/dt) ln xᵢ(t)
where the GR is generally a function of t, such that
xᵢ(t) ~ exp(∫ᵗ GR(t′)dt′), with xᵢ(t) increasing when GR > 0 and decreasing when GR < 0
(the population grows exponentially only over short times, since GR changes with x)
GR = fitness of species i − population average fitness
= Πᵢ(x(t)) − Π¯(x(t))
= species i's growth rate
xᵢ increases when Πᵢ(x(t)) > Π¯(x(t)): species i has a higher fitness than the population average => growth
xᵢ decreases when Πᵢ(x(t)) < Π¯(x(t)): species i has a lower fitness than the population average => decrease
THIS LEADS TO THE REPLICATOR DYNAMICS
REPLICATOR EQUATIONS
replicator equations
xᵢ_dot = xᵢ[Πᵢ(x) − Π¯(x)] = xᵢ × [i's growth rate]
= xᵢ[eᵢ^T Ax − x^T Ax] (using Πᵢ = eᵢ^T Ax and Π¯ = x^T Ax)
= xᵢ[(Ax)ᵢ − x^T Ax] for i = 1,…,k
set of k coupled nonlinear ODEs
since sum x_i=1
=> k − 1 independent coupled nonlinear ODEs, e.g. by writing xₖ = 1 − (x₁ + … + xₖ₋₁)
the case k = 2 is nice: a single scalar ODE
In practice, use the second form, xᵢ_dot = xᵢ[(Ax)ᵢ − x^T Ax].
At each time increment dt, the population state changes according to the REs:
xᵢ →xᵢ+dxᵢ
where
dxᵢ=xᵢ[Πᵢ(x)-Π_bar(x)] dt
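The update rule dxᵢ = xᵢ[Πᵢ(x) − Π¯(x)] dt can be sketched as a simple Euler iteration (an illustrative numerical scheme, not part of the notes; the payoff matrix below is a made-up coordination game):

```python
import numpy as np

def replicator_step(x, A, dt=1e-3):
    """One Euler update: x_i -> x_i + x_i [Pi_i(x) - Pi_bar(x)] dt."""
    pi = A @ x          # fitness vector, pi_i = (A x)_i
    pi_bar = x @ pi     # average fitness, x^T A x
    return x + x * (pi - pi_bar) * dt

# illustrative coordination-game payoff matrix (not from the notes)
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
x = np.array([0.5, 0.5])
for _ in range(10_000):          # integrate up to t = 10
    x = replicator_step(x, A)
# starting above the unstable mixed point x* = 1/3, strategy 1 takes over
```

The sum of frequencies is preserved by the update, since the growth rates are measured relative to the population average.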
Properties of the REs (symmetric games):
- REs are a set of k − 1 coupled nonlinear ODEs (generally cubic in the xᵢ)
- A symmetric zero-sum game can be formulated so that one player gains what the other loses
=> the payoff matrix A can be chosen to be antisymmetric in this formulation
=> For these zero-sum games, the REs are quadratic: x_dot ᵢ = xᵢ(Ax)ᵢ
=> When these REs have a physical interior equilibrium, there is a nontrivial constant of motion (as in the zero-sum RPS game of Secs. 5.6 and 6.3: the orbits are set by the initial condition)
Each unit vector eᵢ = (0, …, 0, 1, 0, …, 0)^T (with the 1 in position i), associated with pure strategy i, is an equilibrium of the REs, corresponding to the entire population consisting of individuals of species i.
These equilibria are called “absorbing fixed points” because there is no longer species coexistence
(loss of variability- homogeneous pop) in these states. They are particularly important in finite populations. (See Chapter 9)
The REs have at most 1 interior equilibrium
x∗ = (x₁∗, x₂∗, …, xₖ∗)^T
with 0 < xᵢ∗ < 1, for which a necessary condition is
(Ax∗)₁ = (Ax∗)₂ = … = (Ax∗)ₖ = x∗^T Ax∗
(the same conditions as in the Bishop-Cannings Theorem (BCT): demanding that the square bracket of the REs vanishes, each component coincides with the average fitness)
Adding (or subtracting) a constant to each column of A gives the same REs, i.e.
A =
[a b]
[c d]
and
A~ =
[a − c 0]
[0 d − b]
lead to the same REs: replicator dynamics with A or A~ result in the same replicator equations.
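This column-shift invariance can be checked numerically (a small sketch with illustrative payoff values):

```python
import numpy as np

def replicator_rhs(x, A):
    """RHS x_i[(Ax)_i - x^T A x] of the replicator equations."""
    pi = A @ x
    return x * (pi - x @ pi)

a, b, c, d = 3.0, 1.0, 2.0, 4.0                    # illustrative payoffs
A = np.array([[a, b], [c, d]])
A_tilde = np.array([[a - c, 0.0], [0.0, d - b]])   # subtract c from column 1, b from column 2

x = np.array([0.3, 0.7])
# both matrices generate the same vector field, hence the same dynamics
```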
replicator dynamics summary
EGT recasts the ideas of classical game theory as ODEs for the evolving strategy frequencies:
interpret the expected payoff as fitness;
the growth rate is the difference between a species' fitness and the population average fitness
Classical game theory (CGT): static theory assuming full rationality of all players.
Evolutionary game theory (EGT): theory of dynamic adaptation assuming no rationality, replaced by the notion of fitness (expected payoff) => framework for evolution of interacting populations
Replicator eqs
for the frequency xᵢ of pure strategy i in a symmetric 2-player game with payoff matrix A
time derivative for fraction x_i
x˙ᵢ = xᵢ[Πᵢ(x) − Π¯(x)]
= xᵢ[(Ax)ᵢ − xT Ax] (for i = 1, . . . , k)
xₖ = 1 − Σᵢ₌₁ ᵏ⁻¹ xᵢ
=>k-1 indep. variables
set of coupled non linear ODEs
k pure strategies become k-1 indep vars
k=2 gives one ODE
What is the relationship between the notion of stability in nonlinear dynamics
and the concepts of evolutionary stability and Nash equilibrium?
The notions are close but do not perfectly overlap, which gives rise to what are referred to as
"Folk theorems".
Given the ODEs we can find equilibria:
we saw that each unit-vector equilibrium corresponds to one pure strategy/species taking over;
we also saw that there can be 0 or 1 coexistence equilibrium, obtained by demanding [*] to vanish (related to the BCT)
x˙ᵢ = xᵢ[Πᵢ(x) − Π¯(x)]
= xᵢ[(Ax)ᵢ − xT Ax]
= xᵢ[*]
Summary relations between
dynamic and evolutionary stability:
(a) NEs of the underlying stage game are equilibria of the REs, while strict NEs of the underlying stage game are attractors (asymptotically stable equilibria) of the REs.
(b) A stable equilibrium of the REs is a NE of the underlying stage game.
(c) If a solution of the REs converges to some x, then x is a NE of the stage game.
(d) Evolutionary stable strategies (ESSes) of the stage game are attractors of the REs.
Moreover, interior ESSes (see Theorem 5.2) of the underlying stage games are global
attractors of the REs (6.4), i.e. their basin of attraction is the whole simplex S.
Importantly, the converse of statements (a)-(d) are not true. Proofs, along with examples
and counter-examples, can, e.g., be found in Ref.(4) (see Sec. 6.6).
(e) For two-player symmetric matrix games with two pure strategies (2×2 symmetric
games), things are fortunately very simple: in this case, x is an ESS if and only if it
is an attractor of the REs (6.4).
Summary relations between
dynamic and evolutionary stability:
- A Nash equilibrium (NE) of a stage game is an equilibrium of the associated replicator equations (REs)
- Strict Nash equilibria are attractors (asymptotically stable equilibria) of the associated REs
- A stable equilibrium of the REs is a NE of the underlying stage game
- Evolutionary stable strategies (ESSes) of a stage game are attractors of the underlying REs
Note: the converse statements are not true! (e.g. attractors of the REs are not necessarily ESSes of the stage game)
Relation is much simpler for symmetric 2-player games with 2 pure strategies:
In this special, but important case:
ESSes <=> ATTRACTORS OF THE REs
Replicator dynamics of two-player symmetric games with two pure strategies
symmetric two-player games with two pure
strategies (2 × 2 games)
the REs reduce to a single scalar ODE (:D)
We consider an infinitely large and unstructured population where individuals (players) have two pure strategies at their disposal, and are randomly paired to play the underlying symmetric 2 × 2 stage game.
replicator dynamics:
symmetric two-player games with two pure
strategies (2 × 2 games)
payoff matrix and frequencies
"cooperation" (C): e𝒸 = (1, 0)^T
"defection" (D): e𝒹 = (0, 1)^T
A =
vs | C D
C  | a b
D  | c d
Frequency of species/strategy C is x, frequency of species/strategy D is 1 − x (fractions)
replicator dynamics:
symmetric two-player games with two pure strategies (2 × 2 games)
Strategy profile
i.e. state of the population, at time t:
x(t) = (x(t), 1 − x(t))^T
= x(t)e𝒸 + (1 − x(t))e𝒹
replicator dynamics:
symmetric two-player games with two pure strategies (2 × 2 games)
expected payoffs
Expected payoff of C = fitness of C in state x(t):
Π𝒸 = e𝒸^T A x(t) = (1 0) A (x, 1 − x)^T = ax(t) + b(1 − x(t))
(playing C against the population state x; below, x is shorthand for x(t))
Expected payoff of D = fitness of D in state x(t):
Π𝒹 = e𝒹^T A x(t) = (0 1) A (x, 1 − x)^T = cx(t) + d(1 − x(t))
Average payoff/fitness:
Π¯(t) = x(t)^T A x(t) = (x, 1 − x) A (x, 1 − x)^T = x(t)Π𝒸(t) + (1 − x(t))Π𝒹(t)
i.e. x (the fraction of cooperators) times their fitness Π𝒸, plus 1 − x (the fraction of defectors) times their fitness Π𝒹
Replicator equation for symmetric 2x2 games (one dimensional equation):
uses avrg payoff/fitness
x˙ = x[Π𝒸 (x)−Π¯(x)] = x(1−x)[Π𝒸 (x)−Π𝒹(x)] = x(1−x)[b−d+x(a−b−c+d)]
a SINGLE scalar ODE (the k = 2 case), solved by separation of variables
the solution gives the frequency x(t) of C under the replicator dynamics; the frequency of D is 1 − x(t)
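A minimal numerical sketch of this scalar RE (Euler integration with made-up hawk-dove-type payoffs, not values from the notes), showing convergence to the coexistence point x∗ = (b − d)/(b + c − a − d):

```python
def solve_2x2(a, b, c, d, x0, t_max=50.0, dt=1e-3):
    """Euler-integrate xdot = x(1-x)[b - d + x(a - b - c + d)]."""
    x = x0
    for _ in range(int(t_max / dt)):
        x += x * (1 - x) * (b - d + x * (a - b - c + d)) * dt
    return x

# hawk-dove-type payoffs (c > a, d < b): stable coexistence predicted at
# x* = (b - d)/(b + c - a - d) = 1/3 for these illustrative values
a, b, c, d = 3.0, 1.0, 5.0, 0.0
x_final = solve_2x2(a, b, c, d, x0=0.9)
```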
replicator dynamics:
symmetric two-player games with two pure strategies (2 × 2 games)
x˙ = x(1−x)[b−d+x(a−b−c+d)]
Equilibrium points of this RE:
x = 0 (all D individuals) and x = 1 (all C individuals),
and, when b > d and c > a, or b < d and c < a, a coexistence equilibrium
x∗ = (x∗, 1 − x∗)^T with 0 < x∗ < 1,
obtained when [*] vanishes, i.e. when the payoff of C equals the payoff of D.
From the BCT:
Π𝒸(x∗) = Π𝒹(x∗)
⇒ x∗ = (b − d)/(b + c − a − d)
the interior or coexistence equilibrium
Different payoff matrices?
(important replicator eq property: we can get the same replicator equations by adding/subtracting a constant in each column of payoff matrix)
so payoff matrix
A~ =
[a − c 0]
[0 d − b]
gives same RE as A
rescaling the payoff matrix
In principle, since the RE
x˙ = x(1−x)[b−d+x(a−b−c+d)]
is separable, it can be solved, yielding
t(x) = C + ∫₀ˣ dy / ( y(1 − y)[b − d + (a + d − b − c)y] )
which would need to be inverted to get x(t).
[See Q3(d) of Sheet 3]
Classification of symmetric 2x2 games
linear stability analysis of replicator eq
x˙ = x(1−x)[b−d+x(a−b−c+d)]
uses payoff matrix
A~ =
[a − c 0]
[0 d − b]
quadrants: classify by the signs of c − a and d − b
When c > a and d > b: defection dominance (class of PD games) => D is an attractor and a strict NE
When c < a and d < b: cooperation dominance (harmony games) => C is an attractor and a strict NE
When c > a and d < b: class of hawk-dove and snowdrift games => x∗ is an ESS and attractor
=> stable coexistence of C and D
When c < a and d > b: coordination games (like stag-hunt) => C and D are both ESSes and attractors, while x∗ is a NE (not an ESS) and an unstable equilibrium of the RE. Coordination games exhibit bistability.
One of C and D has the larger basin of attraction
=> that strategy is risk dominant
A game with a = c and b = d is neutral and its RE is x˙ = 0 ⇒ x(t) = constant
In the coordination quadrant, if x∗ lies closer to C (x = 1) than to D, the basin of attraction of D is the larger one, so D is risk dominant; the other strategy may still be Pareto dominant.
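The quadrant classification above can be sketched as a small helper (illustrative; the class names follow the notes):

```python
def classify_2x2(a, b, c, d):
    """Classify a symmetric 2x2 game by the signs of c - a and d - b."""
    if c > a and d > b:
        return "defection dominance (prisoner's dilemma class)"
    if c < a and d < b:
        return "cooperation dominance (harmony games)"
    if c > a and d < b:
        return "anti-coordination (hawk-dove/snowdrift): stable coexistence"
    if c < a and d > b:
        return "coordination (stag-hunt class): bistability"
    return "degenerate/neutral case"

# e.g. the standard prisoner's dilemma ordering c > a > d > b:
print(classify_2x2(3, 0, 5, 1))
```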
diagram for analysis: quadrants of the (c − a, d − b) plane and the corresponding flows of the RE on 0 ≤ x ≤ 1 (x = 0: all D, x = 1: all C):
- C dominance (c < a, d < b): D → C, since Π𝒸 > Π𝒹 everywhere
- D dominance (c > a, d > b; prisoner's dilemma): D ← C, since Π𝒸 < Π𝒹 everywhere
- anti-coordination (c > a, d < b; hawk-dove/snowdrift): D → x∗ ← C, stable coexistence at x∗ = (x∗, 1 − x∗)
- coordination (c < a, d > b; stag-hunt): D ← x∗ → C, x∗ unstable coexistence, 0 and 1 are ESSes and attractors (bistability)
The replicator dynamics of symmetric games with more than 2 pure strategies
The replicator dynamics of symmetric games with more than 2 pure strategies is much richer than in the 2×2 case seen so far. Here, we look at the rock-paper-scissors (RPS) game, with cyclic dominance of three pure strategies, and find oscillatory behaviour.
6.4 Replicator dynamics of the Rock-Paper-Scissors game
Rock-paper-scissors game: reference model for cyclic dominance.
3 pure strategies in cyclic competition: Rock (R) wins against Scissors (S) and
loses to Paper (P), which in turn loses to Scissors (S)
pure strategies
eᵣ= (1, 0, 0)^T for R,
eₚ = (0, 1, 0)^T for P,
eₛ = (0, 0, 1)^T for S
Here k=3 pure strategies/species
=> (k-1)=2 coupled REs
P dominates over R
R dominates over S
S dominates over P
cyclic
no species dominates over all others
Population games: pure strategies interpreted as species and represented by unit vectors.
The most common and popular formulation of the RPS game
is as a zero-sum game, with a payoff matrix that can be chosen
to be antisymmetric:
Aᴿᴾˢ =
vs | R  P  S
R  | 0 −1  1
P  | 1  0 −1
S  |−1  1  0
=> win: +1, draw: 0, loss: −1
=> in this formulation of the zero-sum RPS game, one player gains what the other loses
(zero-sum because every +1 won is a −1 lost: the payoff matrix is antisymmetric)
Strategy profile, or state of population:
RPS game
x1 : frequency of R
x2 : frequency of P
x3 : frequency of S
x = (x₁, x₂, x₃)^T
= x₁eᵣ + x₂eₚ + x₃eₛ
Expected payoffs:
to R-player:
Πᵣ = Π₁ = eᵣ^T Aᴿᴾˢx = (Aᴿᴾˢx)₁ = x₃ − x₂ = 1 − x₁ − 2x₂
to P-player:
Πₚ = Π₂ = eₚ^T Aᴿᴾˢx = (Aᴿᴾˢx)₂ = x₁ − x₃ = 2x₁ + x₂ − 1
to S-player:
Πₛ = Π₃ = eₛ^T Aᴿᴾˢx = (Aᴿᴾˢx)₃ = x₂ − x₁
(using x₁ + x₂ + x₃ = 1)
Average population payoff:
(the payoff matrix here is antisymmetric, which makes this easy)
Π¯ = x^T Aᴿᴾˢx = 0
the average payoff of a zero-sum game vanishes
Replicator equations (REs) of zero-sum RPS game:
x˙ᵢ = xᵢ[(Aᴿᴾˢx)ᵢ − x^T Aᴿᴾˢx]
= xᵢ(Aᴿᴾˢx)ᵢ (for i = 1, 2, 3)
quadratic in the xᵢ
=> explicitly:
x˙₁ = x₁(x₃ − x₂),
x˙₂ = x₂(x₁ − x₃),
x˙₃ = x₃(x₂ − x₁)
only 2 independent coupled ODEs (knowing x₁ and x₂ gives x₃ = 1 − x₁ − x₂)
substituting x₃ = 1 − x₁ − x₂:
x˙₁ = x₁(1 − x₁ − 2x₂) ≡ f₁
x˙₂ = x₂(2x₁ + x₂ − 1) ≡ f₂
a nonlinear planar system of ODEs
we can't solve it directly: look for equilibria
Finding equilibria
x˙₁ = x₁(1 − x₁ − 2x₂) ≡ f₁
x˙₂ = x₂(2x₁ + x₂ − 1) ≡ f₂
setting x˙₁ = x˙₂ = 0,
the equilibria are:
only R, only P, only S:
x = (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T
corresponding to one species taking over
COEXISTENCE:
x∗ = (x₁∗, x₂∗, 1 − x₁∗ − x₂∗)^T
such that
1 − x₁∗ − 2x₂∗ = 2x₁∗ + x₂∗ − 1 = 0
thus
x₁∗ = x₂∗ = x₃∗ = 1/3, i.e. x∗ = (1/3, 1/3, 1/3)^T
each species/pure strategy coexists with the same frequency
=> 4 equilibrium points in total
Linear stability analysis:
J(x₁, x₂) =
[∂f₁/∂x₁ ∂f₁/∂x₂]
[∂f₂/∂x₁ ∂f₂/∂x₂]
=
[1 − 2(x₁ + x₂) −2x₁]
[ 2x₂ 2(x₁ + x₂) − 1]
evaluate the Jacobian at each equilibrium and find its eigenvalues:
x = (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T are saddles:
one positive and one negative eigenvalue (+1, −1)
At x∗: J(x∗) = J∗ =
[−1/3 −2/3]
[ 2/3  1/3]
the eigenvalues of J∗ are ±i/√3: PURELY IMAGINARY
=> x∗ is non-hyperbolic,
so there is no conclusion about the stability of x∗ from linear stability analysis
Define
K ≡ x₁x₂x₃
This quantity is conserved by the REs: its time derivative vanishes,
K˙ = x˙₁x₂x₃ + x₁x˙₂x₃ + x₁x₂x˙₃
= x₁x₂x₃(x₃ − x₂) + x₁x₂x₃(x₁ − x₃) + x₁x₂x₃(x₂ − x₁)
= 0
Since K˙ = 0, for a given initial condition x(0) we have K(t) = K(0),
which defines neutrally stable closed orbits surrounding x∗.
RPS coexistence equilibrium x∗:
not asymptotically stable, but said to be marginally (neutrally) stable,
with neutrally stable closed orbits surrounding x∗
SKETCH: plotting the frequencies against t, x₁(t), x₂(t), x₃(t) oscillate about x∗ = 1/3.
Phase portrait of the REs of the zero-sum RPS game: on the simplex (a triangle with R, P, S vertices), x∗ = (1/3, 1/3, 1/3)^T is surrounded by closed orbits of (x₁(t), x₂(t), x₃(t)), each set by K(0) = constant.
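A numerical sketch of the zero-sum RPS replicator dynamics (Euler scheme with an illustrative initial condition), checking that K = x₁x₂x₃ stays essentially constant along the orbit:

```python
import numpy as np

# antisymmetric payoff matrix of the zero-sum RPS game
A_RPS = np.array([[0.0, -1.0, 1.0],
                  [1.0, 0.0, -1.0],
                  [-1.0, 1.0, 0.0]])

def rps_run(x0, t_max=30.0, dt=1e-4):
    """Euler-integrate xdot_i = x_i (A x)_i (x^T A x = 0 for antisymmetric A)."""
    x = np.array(x0)
    for _ in range(int(t_max / dt)):
        x = x + x * (A_RPS @ x) * dt
    return x

x0 = np.array([0.5, 0.3, 0.2])
x_final = rps_run(x0)
K0, K1 = x0.prod(), x_final.prod()
# K = x1 x2 x3 is (numerically almost) conserved: the orbit is closed
```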
Generalized RPS game (Q4 of Example Sheet 3):
There are other (non-zero-sum) formulations of the RPS game that lead to other oscillatory behaviours
=> the generalized rock-paper-scissors (gRPS) game, described by the payoff matrix
Aᵍᴿᴾˢ =
vs | R  P  S
R  | 0 −ϵ  1
P  | 1  0 −ϵ
S  |−ϵ  1  0
with ϵ ≥ 0
When ϵ ≠ 1, the gRPS game is a non-zero-sum game. When ϵ = 1, we recover the zero-sum RPS game.
Here: R vs. P, P vs. S and S vs. R are all equivalent, as are R vs. S, P vs. R and S vs. P
=> symmetry (all species on the same footing)
In the gRPS game, win: +1, draw: 0, loss: −ϵ
=> NO LONGER ZERO-SUM if ϵ ≠ 1 (the payoff matrix is no longer antisymmetric)
RPS and the Lotka-Volterra model
As for the Lotka-Volterra model, the replicator dynamics of the zero-sum RPS game leads to oscillatory behaviour set by the initial condition
=> x∗ is a marginally stable centre and the oscillations about it are not robust (they depend on the initial condition)
Generalized RPS game (Q4 of Example Sheet 3):
Aᵍᴿᴾˢ =
vs | R  P  S
R  | 0 −ϵ  1
P  | 1  0 −ϵ
S  |−ϵ  1  0
with ϵ ≥ 0
Expected payoffs
average payoff
Expected payoffs are
Πᵢ = (Aᵍᴿᴾˢx)ᵢ ⇒
Πᵣ(x) = Π₁(x) = x₃ − ϵx₂,
Πₚ(x) = Π₂(x) = x₁ − ϵx₃,
Πₛ(x) = Π₃(x) = x₂ − ϵx₁
Average payoff is
Π¯(x) = x^T Aᵍᴿᴾˢx = x₁Π₁ + x₂Π₂ + x₃Π₃ = (1 − ϵ)(x₁x₂ + x₁x₃ + x₂x₃)
=> Π¯ ≠ 0 when ϵ ≠ 1
Generalized RPS game (Q4 of Example Sheet 3):
Aᵍᴿᴾˢ =
vs | R  P  S
R  | 0 −ϵ  1
P  | 1  0 −ϵ
S  |−ϵ  1  0
with ϵ ≥ 0
REs:
x˙ᵢ = xᵢ[(Aᵍᴿᴾˢx)ᵢ − x^T Aᵍᴿᴾˢx]
⇒
x˙₁ = x₁[x₃ − ϵx₂ − (1 − ϵ)(x₁x₂ + x₁x₃ + x₂x₃)]
x˙₂ = x₂[x₁ − ϵx₃ − (1 − ϵ)(x₁x₂ + x₁x₃ + x₂x₃)]
x˙₃ = x₃[x₂ − ϵx₁ − (1 − ϵ)(x₁x₂ + x₁x₃ + x₂x₃)]
Coexistence equilibrium x∗ = (x₁∗, x₂∗, x₃∗)^T
If it exists, it is unique by the BCT:
x₃∗ − ϵx₂∗ = x₁∗ − ϵx₃∗ = x₂∗ − ϵx₁∗ = (1 − ϵ)(x₁∗x₂∗ + x₁∗x₃∗ + x₂∗x₃∗)
These conditions are invariant under 1 → 2 → 3 → 1
and under 1 → 3 → 2 → 1.
This symmetry suggests
x₁∗ = x₂∗ = x₃∗ = 1/3, which is physical.
We can indeed verify that
Π₁(x∗) = Π₂(x∗) = Π₃(x∗) = Π¯(x∗) = (1 − ϵ)/3
when x∗ = (1/3, 1/3, 1/3)^T.
By the Bishop-Cannings Theorem,
x∗ = (1/3)(1, 1, 1)^T is the coexistence equilibrium of the REs and a NE of the gRPS game.
Dynamics and stability when ϵ ≠ 1:
When ϵ > 1: the coexistence equilibrium x∗ is unstable. The replicator dynamics connects the saddle equilibria (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T, forming what is called a heteroclinic cycle: the frequency of each species approaches 0 and 1 in turn (heteroclinic orbits oscillate, but not periodically).
When ϵ < 1: x∗ is the ESS of the gRPS game and the global attractor of the replicator dynamics => trajectories spiral inwards towards x∗ in the phase plane. This corresponds to damped oscillations with xᵢ → xᵢ∗ = 1/3.
Typical phase portraits of the replicator dynamics of the gRPS game for different values of ϵ ≥ 0:
a) ϵ > 1: heteroclinic cycle (non-periodic oscillations)
b) ϵ = 1: closed orbits
c) ϵ < 1: spirals inwards
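The two regimes can be sketched numerically (Euler scheme; t_max, dt and the initial condition are arbitrary choices, not from the notes):

```python
import numpy as np

def grps_matrix(eps):
    """gRPS payoff matrix: win +1, draw 0, loss -eps."""
    return np.array([[0.0, -eps, 1.0],
                     [1.0, 0.0, -eps],
                     [-eps, 1.0, 0.0]])

def grps_run(eps, x0=(0.5, 0.3, 0.2), t_max=200.0, dt=1e-3):
    """Euler-integrate the gRPS replicator equations from x0."""
    A = grps_matrix(eps)
    x = np.array(x0)
    for _ in range(int(t_max / dt)):
        pi = A @ x
        x = x + x * (pi - x @ pi) * dt
    return x

x_in = grps_run(eps=0.5)    # eps < 1: damped oscillations towards (1/3, 1/3, 1/3)
x_out = grps_run(eps=2.0)   # eps > 1: heteroclinic cycle, one frequency nearly vanishes
```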
bimatrix games
So far we have focused on symmetric games, where it did not matter who was player 1 and who was player 2: same options, same payoffs. There are also asymmetric games, in which players have different "types" or "roles" (e.g., adults vs. juveniles, males vs. females, etc.): two types of players, with payoff matrices A and B
=> bimatrix games
asymmetric games: each type of player has its own payoff matrix and knows which type it is. Asymmetric games are thus played between players of different types (in different roles) choosing among the same set of pure strategies, say cooperation (C) and defection (D), using payoff matrices
A = (Aᵢⱼ) and
B = (Bᵢⱼ) ≠ A
bimatrix games: payoff matrices and pure strategies
A = (Aᵢⱼ) and B = (Bᵢⱼ) ≠ A
pure strategies:
e𝒸 = (1, 0)^T
e𝒹 = (0, 1)^T
A P1-individual uses payoff matrix A against a P2-player, and a P2-individual uses payoff matrix B against a P1-player:
A =
P1 ↓ vs P2 → | C   D
C            | A₁₁ A₁₂
D            | A₂₁ A₂₂
B =
P2 ↓ vs P1 → | C   D
C            | B₁₁ B₁₂
D            | B₂₁ B₂₂
In bimatrix games, the game is always between players of different groups/types, i.e. P1 vs. P2 individuals.
STRATEGIES:
PAYOFF to P1,P2
P1 uses strategy
x=Σxᵢeᵢ
each pure strategy i with frequency 0 ≤ xᵢ≤ 1 , Σᵢ xᵢ = 1
P2 uses strategy
y=Σyᵢeᵢ
each pure strategy i with frequency 0 ≤ yᵢ≤ 1 , Σᵢ yᵢ = 1
giving
STRATEGY PAIR
(P1 player, P2 player)
(x,y)
Expected payoff to P1 is
Π⁽¹⁾(x, y) = x^T Ay
Expected payoff to P2 is
Π⁽²⁾(x, y) = y^T Bx
E.g.
circles are P1 individuals (4 of them), of which 2 are of species C (cooperators) and 2 are defectors;
squares are P2 individuals (6 of them): 2 cooperators (species C) and 4 defectors (species D).
In fractions:
x = 0.5e𝒸 + 0.5e𝒹 = (0.5, 0.5)^T
y = (1/3)e𝒸 + (2/3)e𝒹 = (1/3, 2/3)^T
STRATEGY PROFILE
(x, y) = ((1/2, 1/2)^T, (1/3, 2/3)^T)
Meaning: the P1 population consists of half C's and half D's; the P2 population consists of a fraction 1/3 of C and 2/3 of D.
This description does not say whether the population of P2 is larger than that of P1 (we tacitly assume that both groups are "very large").
6.4.1 Nash equilibrium and evolutionary stability in bimatrix games
Notions of Nash equilibrium (NE) and strict Nash equilibrium (sNE) are direct generalizations:
strategy pair (x∗, y∗) is a NE if
x∗^T Ay∗ ≥ x^T Ay∗
and
y∗^T Bx∗ ≥ y^T Bx∗
for all (x, y) ≠ (x∗, y∗)
(x∗, y∗) is a NE if x∗ is the best reply to y∗ AND y∗ is the best reply to x∗
(a NE is a best reply to itself)
strategy pair (x∗, y∗) is an sNE (bimatrix games) if
x∗^T Ay∗ > x^T Ay∗
and
y∗^T Bx∗ > y^T Bx∗
for all (x, y) ≠ (x∗, y∗)
in bimatrix games
(x∗, y∗)
is an evolutionary stable strategy (ESS) pair
IFF
it is a sNE
Necessary condition for an interior (mixed) NE by an extension of Bishop-Cannings Thm:
(Ay∗)₁ = · · · = (Ay∗)ₖ = x∗^T Ay∗
and
(Bx∗)₁ = · · · = (Bx∗)ₖ = y∗^T Bx∗
In bimatrix games, all sNE/ESSes are necessarily pure strategies=>
a mixed-strategy NE is never an ESS in a bimatrix game
Example in the notes: the “Battle of sexes”, a famous example of bimatrix game with two pure strategies
In this game Husband (Player 1) and Wife (Player 2) have to decide
whether to watch a football game or go to the theatre (events are mutually exclusive). The
husband prefers the football and the wife the theatre, and both are happier (higher payoff) if
they pick a common activity. This leads to a bimatrix game in which each player has two pure
strategies, “watch football” (F) or “go to theatre” (T), and the two “roles” are husband (P1)
and wife (P2).
payoff matrices:
A =
vs | F T
F  | 2 0
T  | 0 1
and B =
vs | F T
F  | 1 0
T  | 0 2
battle of the sexes
By inspection of A and B, it is clear that the pair of pure strategies (F,F), corresponding
to the Husband and Wife watching football, and (T,T) are strict NE and therefore ESSes.
The strategy pair (F,F) yields a payoff 2 to the husband and 1 to the wife, and we thus say
that the payoff is (2, 1) to (husband, wife). Similarly, the strategy pair (T,T) yields a payoff
(1, 2).
Is there also a mixed NE (x∗, y∗)?
To answer this question, we look at the mixed strategy x∗ = (x∗, 1 − x∗)^T for the husband
to play F with frequency 0 < x∗ < 1 and T with frequency 1 − x∗, and y∗ = (y∗, 1 − y∗)^T for
the wife to play F and T with respective frequencies 0 < y∗ < 1 and 1 − y∗.
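Assuming the Bishop-Cannings-type indifference conditions (each player's mix makes the other indifferent between F and T), the mixed NE can be sketched numerically; the helper indifference_mix is hypothetical, introduced only for illustration:

```python
import numpy as np

A = np.array([[2.0, 0.0], [0.0, 1.0]])   # husband's payoffs for (F, T)
B = np.array([[1.0, 0.0], [0.0, 2.0]])   # wife's payoffs for (F, T)

def indifference_mix(M):
    """For a 2x2 matrix M, find p such that M @ (p, 1-p)^T has equal components."""
    (a, b), (c, d) = M
    return (d - b) / (a - b - c + d)

# at an interior NE, each player's mix makes the *other* player indifferent:
y_star = indifference_mix(A)   # wife's mix making the husband indifferent: 1/3
x_star = indifference_mix(B)   # husband's mix making the wife indifferent: 2/3
```

With these payoff matrices the mixed NE is (x∗, y∗) = (2/3, 1/3).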
6.4.2 Replicator dynamics for bimatrix games
expected payoff
to p1
to p2
In population bimatrix games, there are two very large interacting populations: P1-individuals use the payoff matrix A against P2-players, and P2-individuals use the payoff matrix B ≠ A against P1-players. Interactions/games are only between P1 and P2 individuals.
Expected payoff to a P1-player playing pure strategy i:
Πᵢ⁽¹⁾ ≡ Π⁽¹⁾(eᵢ, y) = eᵢ^T Ay = (Ay)ᵢ
Expected payoff to a P2-player playing pure strategy i:
Πᵢ⁽²⁾ ≡ Π⁽²⁾(eᵢ, x) = eᵢ^T Bx = (Bx)ᵢ
Average payoffs to P1 and P2 in population state (x, y):
Π¯⁽¹⁾ = x^T Ay and Π¯⁽²⁾ = y^T Bx,
respectively
giving
Replicator equations (REs) for bimatrix games:
x˙ᵢ = xᵢ[Πᵢ⁽¹⁾ − Π¯⁽¹⁾] = xᵢ[(Ay)ᵢ − x^T Ay]
y˙ᵢ = yᵢ[Πᵢ⁽²⁾ − Π¯⁽²⁾] = yᵢ[(Bx)ᵢ − y^T Bx]
for i = 1,…,k
a set of 2(k − 1) independent coupled nonlinear ODEs, with
0 ≤ xᵢ, yᵢ ≤ 1, Σᵢ xᵢ = 1, Σᵢ yᵢ = 1
(link to growth rates: again the difference between expected payoff and average payoff)
equilibrium points:
x˙ᵢ = xᵢ[(Ay)ᵢ − x^T Ay] = 0 and y˙ᵢ = yᵢ[(Bx)ᵢ − y^T Bx] = 0
vertex equilibria: xᵢ = 0 or 1 and yᵢ = 0 or 1
when k = 2:
(x₁, y₁) = (1 − x₂, 1 − y₂) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)}
=> when k = 2, there are 4 vertex equilibria:
(0,0): all P1 and all P2 are D's
(0,1): all P1 are D's, all P2 are C's
(1,0): all P1 are C's, all P2 are D's
(1,1): all P1 and all P2 are C's
There is at most one interior equilibrium (x∗, y∗), which is never asymptotically stable (not an ESS); it exists if
(Ay∗)ᵢ = x∗^T Ay∗
and
(Bx∗)ᵢ = y∗^T Bx∗
for all i = 1, …, k
have a physical solution with 0 < xᵢ∗, yᵢ∗ < 1
(obtained by demanding that the square brackets [*] vanish)
In two-player bimatrix games, the interior equilibrium point is either unstable or surrounded
by closed orbits.
Example: “Matching the pennies”
P1 and P2 individuals conceal in their palms a penny, either face up or face down. Both coins
are revealed simultaneously. If both coins show the same side (both heads or both tails), P1 wins and
P2 loses, whereas P1 loses and P2 wins if one shows heads and the other tails.
Matching pennies: pure strategies
Here: 2 players with 2 pure strategies, “show head” (H) represented by
eₕ=(1,0)^T
or “show tail” (T)
represented by
eₜ= (0,1)^T
H and T are played with respective frequencies
x₁ ≡ x and x₂ ≡ 1 − x by the P1 player,
and y₁ ≡ y and y₂ ≡ 1 − y by the P2 player.
Payoff matrices for "show heads" and "show tails" in matching pennies:
A =
vs | H  T
H  | 1 −1
T  |−1  1
B =
vs | H  T
H  |−1  1
T  | 1 −1
win: +1, loss: −1
Expected payoffs:
Πₕ⁽¹⁾ = (Ay)₁ = 2y − 1 = −Πₜ⁽¹⁾
(first component of Ay)
Πₕ⁽²⁾ = (Bx)₁ = 1 − 2x = −Πₜ⁽²⁾
Average payoffs:
Π¯⁽¹⁾ = x^T Ay = (2x − 1)(2y − 1)
Π¯⁽²⁾ = y^T Bx = −(2x − 1)(2y − 1)
Replicator equations for matching the pennies
x˙ = x[Πₕ⁽¹⁾ − Π¯⁽¹⁾] = −2x(1−x)(1−2y),
y˙ = y[Πₕ⁽²⁾ − Π¯⁽²⁾] = 2y(1−y)(1−2x)
EQ points
matching the pennies
Equilibrium points:
(x, y) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)}
(0,0): P1 and P2 always play T; (0,1): P1 always plays T and P2 always plays H;
(1,0): P1 always plays H and P2 always plays T;
(1,1): both always play H
coexistence equilibrium:
(x, y) = (1/2, 1/2), for which
Πₕ⁽¹⁾ = Πₜ⁽¹⁾ = Π¯⁽¹⁾ = 0 = Πₕ⁽²⁾ = Πₜ⁽²⁾ = Π¯⁽²⁾
=> half-half is a coexistence equilibrium
Matching pennies:
Linear stability analysis?
Jacobian of the REs:
J(x, y) = 2 ×
[2(x + y) − 4xy − 1    2x(1 − x)            ]
[−2y(1 − y)           −[2(x + y) − 4xy − 1] ]
Evaluated at each of the equilibria
(0, 0), (0, 1), (1, 0), (1, 1):
all are saddle points and thus unstable
(one positive and one negative eigenvalue of J).
coexistence
At (1/2, 1/2), J has purely imaginary eigenvalues (±i) =>
(1/2, 1/2) is non-hyperbolic and we cannot conclude its stability from linear stability analysis.
Matching pennies:
what else can we say about the
REs
In fact, the REs conserve the quantity
H = x(1 − x)y(1 − y), that is
H˙ = 0 ⇒ H(t) = H(0) (set by the initial condition)
Hence (1/2, 1/2) is a marginally stable centre: it is surrounded by closed orbits set by the initial condition.
We could also carry out a phase-plane analysis to see this.
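A numerical sketch (Euler scheme, illustrative initial condition) checking that H = x(1−x)y(1−y) is essentially conserved along the matching-pennies replicator dynamics:

```python
def pennies_run(x0=0.2, y0=0.7, t_max=20.0, dt=1e-4):
    """Euler-integrate the matching-pennies replicator equations."""
    x, y = x0, y0
    for _ in range(int(t_max / dt)):
        x, y = (x - 2.0 * x * (1 - x) * (1 - 2 * y) * dt,
                y + 2.0 * y * (1 - y) * (1 - 2 * x) * dt)
    return x, y

def H(x, y):
    """Conserved quantity of the matching-pennies REs."""
    return x * (1 - x) * y * (1 - y)

x1, y1 = pennies_run()
# H stays (numerically almost) constant: the orbit circles (1/2, 1/2)
```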