
GTO-2-02: MixedNash

By Game Theory Online

Summary

Topics Covered

  • Deterministic Strategies Cycle Endlessly
  • Mixed Strategies Use Probability Distributions
  • Expected Utility Extends Payoff Matrix
  • Every Finite Game Has Nash Equilibrium

Full Transcript

This lecture introduces the idea of mixed strategies and extends our previous concept of Nash equilibrium to this new setting. Let's begin by looking at the matching pennies game.

Recall that it would be a pretty bad idea to play any deterministic strategy in this game. For example, if player two were to play heads, then player one would want to respond by playing heads to get a payoff of one, meaning that player two would prefer to change to tails so that he can get a payoff of one, meaning that player one would prefer to change to tails so he can get a payoff of one, meaning that player two would prefer to change back to heads to get a payoff of one, meaning that player one would prefer to change back to heads as well to get a payoff of one, which is where we started. So you can see there's a cycle: we just bounce around between the different cells of the game matrix, and essentially we've argued that no pair of deterministic strategies works for both players.

So what does work for both players? Essentially, it makes sense for the players to confuse each other by playing randomly. Intuitively, instead of saying "I'm going to commit to playing heads" or "I'm going to commit to playing tails," I can say "I'm going to commit to flipping this coin and playing whatever side comes up." Let's try to make this idea formal. Previously we talked about the idea of pure
strategies, which we simply equated with playing actions. Now let's think in terms of probability distributions. Let's say that a strategy for an agent is any probability distribution over the actions available to that player. A pure strategy is then the special case where I play only one action with positive probability, and a mixed strategy means I play more than one action with positive probability: there might be several different actions that I assign positive probability to, as in my matching pennies example. I'll call the set of actions that get positive probability the support of my mixed strategy. So, for example, when I flip a coin while playing matching pennies, both heads and tails are in the support of my mixed strategy; my support is the set {heads, tails}. I'll define the set of all strategies for agent i to be capital S_i, and the set of all strategy profiles, capital S, to be the Cartesian product of these strategy sets for the different agents.

Now I have a problem: I've enlarged my definition of strategies from the finite set of things players can do to the infinite set of all probability distributions over those finite sets. The reason this is a problem is that I only have a utility definition for action profiles, and now I'm allowing things to happen that I don't have utilities for. That is to say, I can't just read a number out of the game matrix to figure out how happy the players are when something happens, because under a mixed strategy with a support of size greater than one I won't always end up in the same cell of the matrix. So I can extend my definition of
utility by leveraging the idea of expected utility from decision theory. The equations on the slide explain what this means, and they look more complicated than they are. What I'm saying is that player i's utility under a mixed-strategy profile s, where little s is some element of the set of all possible mixed-strategy profiles big S, is equal to a sum over all action profiles in the game. You can think of this intuitively as a sum over all of the cells in the normal form of the game, where I take the utility of each cell and multiply it by the probability that that cell will be reached: in symbols, u_i(s) = Σ_{a ∈ A} u_i(a) · Pr(a | s), the utility of each action profile a weighted by the probability of reaching a given the strategy profile s.

Then, of course, I need to define what that probability actually is, and it's given here: the probability of reaching a given action profile a under strategy profile s is just the product of the probability of each player playing his part of that action profile, Pr(a | s) = Π_j s_j(a_j). For example, if each of the two players puts probability 0.5 on each of his actions, then each action profile arises with probability 0.25: one half times one half, because this player's part happens half the time and that player's part happens half the time, and we multiply the two probabilities together to get the joint probability of the action profile. So, in total, my utility for a strategy profile is my expected utility: an expectation over all the action profiles in the support of that strategy profile, weighting each by the probability that that action profile would actually arise.
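
As a concrete illustration of this definition, here is a small Python sketch (the payoff matrices and helper names are my own, not the lecture's notation) that computes expected utility for matching pennies by summing over every cell of the matrix:

```python
# Matching pennies payoffs: u1[a1][a2] is player 1's payoff when
# player 1 plays action a1 and player 2 plays action a2 (0 = heads,
# 1 = tails). Player 1 wins when the pennies match.
u1 = [[1, -1],
      [-1, 1]]
u2 = [[-1, 1],
      [1, -1]]

def expected_utility(u, s1, s2):
    """Sum over all action profiles (cells) of the cell's payoff times
    the probability the cell is reached: Pr(a | s) = s1[a1] * s2[a2]."""
    return sum(u[a1][a2] * s1[a1] * s2[a2]
               for a1 in range(len(s1))
               for a2 in range(len(s2)))

s1 = [0.5, 0.5]   # player 1 mixes 50/50 over heads/tails
s2 = [0.5, 0.5]   # player 2 mixes 50/50
print(expected_utility(u1, s1, s2))  # 0.0 -- each cell has probability 0.25
```

With both players at 50/50, every cell is reached with probability 0.25, and the four payoffs of +1 and -1 cancel to give each player an expected utility of zero.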

Now that I've defined what strategies are, I can go back to my definitions of best response and Nash equilibrium. Basically they work the way they did before, except that I've changed all of the a's to s's. That means I have to write these definitions again, and I'll go through them again, but conceptually, if you understood what best response and Nash equilibrium meant in the case of actions, everything works the same way.

I'll say that a strategy s*_i is an element of the set of best responses to the strategy profile s_{-i} when the following condition is true: for every other strategy s_i that player i could take (that is, for every strategy in the set of possible strategies for that player, and notice that this is an infinite set, but that's okay, the definition still works), the utility the player gets for playing s*_i when everybody else plays s_{-i} is at least as big as the utility he gets from playing that other strategy s_i. Let me say that again: s*_i is a best response to the strategy profile s_{-i} if it's at least as good as anything else, given that everybody else is playing s_{-i}.

Now we can say that a strategy profile s is a Nash equilibrium if every agent is playing a best response. Incidentally, you might notice that I'm using a set-membership operator here rather than the equals sign you might have expected. The reason is that the set of best responses can have more than one element: there isn't always a unique best response; sometimes there are several. So what I'm saying here is that a strategy is one of the best responses if this condition holds, and a strategy profile is an equilibrium if everybody is playing one of their best responses.
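
To see why best responses form a set rather than a single strategy, here is an illustrative sketch (the helper names are mine; it relies on the standard fact that expected utility is linear in a player's own probabilities, so comparing against pure actions suffices):

```python
def pure_best_responses(u, s_other):
    """Return the set of player 1's actions that maximize expected
    utility when player 2 plays the mixed strategy s_other."""
    values = [sum(u[a1][a2] * s_other[a2] for a2 in range(len(s_other)))
              for a1 in range(len(u))]
    best = max(values)
    return {a1 for a1, v in enumerate(values) if abs(v - best) < 1e-9}

# Coordination game for player 1: payoff 1 on the diagonal, 0 off it.
u1 = [[1, 0],
      [0, 1]]
print(pure_best_responses(u1, [1.0, 0.0]))   # {0}: match the opponent
print(pure_best_responses(u1, [0.5, 0.5]))   # {0, 1}: both actions tie
```

Against a 50/50 opponent both actions are best responses at once, which is exactly why the definition uses set membership instead of equality.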

This might seem like much ado about nothing: I've introduced randomizing as a strategy, I've redefined utility, I've leveraged that redefined utility (which is what I'm using when I talk about the utility of a strategy profile) to define best response, and I've leveraged that definition of best response to define Nash equilibrium. In total, I've just ended up saying everything you've already heard us say. But what matters is that now that we have a new definition of Nash equilibrium, we're able to state a theorem that we didn't have before. This is Nash's famous theorem; it's the reason the Nash equilibrium is named after him, and one of the main reasons Nash got the Nobel Prize. The theorem actually didn't take very long to prove, but it's really important for game theory. The theorem says: every finite game has a Nash equilibrium.

First of all, what is a finite game? This sounds like I'm hedging, but it's not much of a hedge. A finite game just means that the game takes a finite amount of space to write down: it has a finite number of players and a finite number of actions for every player, and that means it has a finite number of utility values, because the number of utility values is determined by the number of players and the number of actions for each player. So as long as the game has a finite
number of players (not just two, but any finite number) and a finite number of actions for each player (not just two actions; it can be a very big game), then no matter what the payoff values are, no matter what strategic situation we're talking about, no matter what real-world interaction the game is modeling, there is going to be at least one Nash equilibrium. That's a pretty deep fact. It says there will always be some stable thing that all of the players can do, with the property that if they knew what everyone was doing, none of them would want to change their strategy. That's basically one of the main reasons we care about the idea of Nash equilibrium: we know that no matter what the game is, we can find such an equilibrium and reason about it. That's why Nash equilibrium is such a powerful concept, and it's only true with the fuller definition of Nash equilibrium we've just given in terms of strategies. We saw it fail when we talked about Nash equilibrium in terms of just actions, what we'll refer to from this point on as pure-strategy Nash equilibrium. A pure-strategy Nash equilibrium is what we get when we do all of this with a's instead of s's. The sad thing is that we don't get a theorem saying every finite game has one of those, but a mixed-strategy Nash
equilibrium always exists.

Let's do some examples. Remember matching pennies: at the beginning of this video we effectively argued that matching pennies has no pure-strategy Nash equilibrium. It does, however, have a mixed-strategy Nash equilibrium, exactly one, and it is, as I suggested before, for both players to randomize 50/50. That doesn't mean a mixed equilibrium always has to be 50/50; that just happens to be the Nash equilibrium here, and it comes from the symmetry of the payoffs.
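
Both claims can be checked by brute force. The sketch below (illustrative helpers, assuming the usual ±1 payoffs) enumerates all pure-strategy profiles and then confirms that no pure deviation improves on 50/50:

```python
u1 = [[1, -1], [-1, 1]]    # player 1 wins when the pennies match
u2 = [[-1, 1], [1, -1]]    # player 2 wins when they mismatch

def pure_nash_equilibria(u1, u2):
    """All cells (a1, a2) where neither player gains by a pure deviation."""
    return [(a1, a2)
            for a1 in range(2) for a2 in range(2)
            if u1[a1][a2] >= max(u1[b][a2] for b in range(2))
            and u2[a1][a2] >= max(u2[a1][b] for b in range(2))]

def eu(u, s1, s2):
    """Expected utility: sum over cells of payoff times joint probability."""
    return sum(u[i][j] * s1[i] * s2[j] for i in range(2) for j in range(2))

print(pure_nash_equilibria(u1, u2))   # [] -- no pure-strategy equilibrium

# At (50/50, 50/50) each player gets 0, and by linearity it suffices to
# check that no *pure* deviation does better than 0.
s = [0.5, 0.5]
print(max(eu(u1, p, s) for p in ([1, 0], [0, 1])))  # 0.0
print(max(eu(u2, s, p) for p in ([1, 0], [0, 1])))  # 0.0
```

No cell survives the pure-deviation test, while no deviation from 50/50 beats the equilibrium payoff of zero, matching both claims in the lecture.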

Now let's come back to the coordination game. We've previously seen that these two strategy profiles are equilibria. I'm circling outcomes here, but remember, an outcome isn't an equilibrium: it would be wrong to say that (1, 1) is an equilibrium. The right thing to say is that (left, left) is an equilibrium and (right, right) is an equilibrium. But it turns out there's another equilibrium here: (0.5, 0.5) for both players is a Nash equilibrium as well. That's kind of funny, because 50/50 doesn't seem like such a good thing to play in this game, but you can confirm for yourself that if player one randomizes 50/50, then player two can do no better than to randomize 50/50. You'll notice that player two could do just as well by playing something else: if player one plays 50/50, player two is just as happy to go left all the time. But in particular, if player one goes 50/50, player two can do no better than to go 50/50 himself. The reverse is also true, and that makes (50/50, 50/50) a Nash equilibrium of the coordination game as well.
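
This indifference is easy to confirm numerically. In the sketch below (illustrative code, assuming coordination payoffs of 1 and 0), player 2's expected payoff stays at 0.5 no matter how he mixes, given that player 1 plays 50/50:

```python
u2 = [[1, 0],   # u2[a1][a2]: both players get 1 when they coordinate
      [0, 1]]

def eu2(s1, s2):
    """Player 2's expected utility under mixed strategies s1, s2."""
    return sum(u2[i][j] * s1[i] * s2[j] for i in range(2) for j in range(2))

s1 = [0.5, 0.5]
for q in (0.0, 0.25, 0.5, 0.75, 1.0):   # player 2's weight on "left"
    print(q, eu2(s1, [q, 1 - q]))        # expected payoff is 0.5 every time
```

Since every mix gives player 2 exactly 0.5, going 50/50 is (weakly) a best response, and by symmetry the same holds for player 1, which is what makes the profile an equilibrium.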

Now let's look at the prisoner's dilemma. In the prisoner's dilemma, we've previously seen that this is an equilibrium, and in fact an equilibrium in strictly dominant strategies. We argued before that equilibria in strictly dominant strategies are unique, and so there aren't any mixed Nash equilibria of the prisoner's dilemma: this is in fact its only Nash equilibrium.
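
The dominance argument can be sketched in a few lines (the payoff numbers below are one common convention, with higher being better; the lecture's exact values may differ, but the dominance structure is the same):

```python
#            P2 cooperates  P2 defects
u1 = [[-1,           -4],    # P1 cooperates
      [ 0,           -3]]    # P1 defects

# Defecting strictly beats cooperating against every opponent action, so
# no strategy that puts positive weight on cooperate can ever be a best
# response; defect/defect is the unique equilibrium, mixed ones included.
print(all(u1[1][j] > u1[0][j] for j in range(2)))  # True
```

Because strict dominance rules out cooperate even inside a mix, the uniqueness claim extends from pure strategies to all mixed strategies.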
