This Paradox Splits Smart People 50/50

By Veritasium

Summary

Topics Covered

  • Evidential beats Causal Decision Theory
  • Perfect Prediction Eliminates Free Will
  • Pre-commit to Irrational Wins

Full Transcript

- There is a problem that I can't bring up without starting a fight.

- No, what?

- It just seems so obvious to me.

- Now I'm all screwed up, man.

(Casper laughs) - It has infiltrated every single Veritasium meeting in the last two months.

- It's trivial. (laughs)

- I didn't think you would fall for this side.

- Just makes sense.

- Let's go! - That's crazy!

- And I even argued with Derek about it.

- There's no way you're trying to convince me.

I don't care.

- So, here's the setup.

You walk into a room, and there's a supercomputer and two boxes on the table.

One box is open, and it's got $1,000 in it.

(cash register ka-chings) There's no trick.

You know it's $1,000.

The other box is a mystery box, you can't see inside.

You also know that this supercomputer is very good at predicting people.

It has correctly predicted the choices of thousands of people in the exact problem you're about to face.

Now, you don't know what that problem is yet, but you do know that it has been correct almost every time.

Now, the supercomputer says you can either take both boxes, that is the mystery box and the $1,000, or you can just take the mystery box.

So, what's in that mystery box?

Well, the supercomputer tells you that before you walked into the room, it made a prediction about your choice.

If the supercomputer predicted you would just take the mystery box and you'd leave the $1,000 on the table, well, then it put $1 million (cash register ka-chings) into the mystery box.

But if the supercomputer predicted that you would take both boxes, then it put nothing in the mystery box.

The supercomputer made its prediction before you knew about the problem and it has already set up the boxes.

It's not trying to trick you, it's not trying to deprive you of any money.

Its only goal is to make the correct prediction.

So, what do you do?

Do you take both boxes or do you just take the mystery box?

Don't worry about how the supercomputer is making its prediction.

Instead of a computer, you could think of it as a super intelligent alien, a cunning demon, or even a team of the world's best psychologists.

It really doesn't matter who or what is making the prediction.

All you need to know is that they are extremely accurate and that they made that prediction before you walked into the room.

Pause the video now if you want to think about it.

(soft playful music) Got your answer?

- So, I should just take two boxes, like, obviously.

- I'm gonna say I'm just gonna take the $1 million and go with it.

- Of course you take two boxes!

- I would pick both boxes, I think.

- I would not get the two boxes.

- I think I'm taking both boxes.

- Let's go!

(interviewee laughs) Okay.

- This is seeming less paradoxical than I thought because I should just go in and take the mystery box only.

- No!

What?

- There are two camps, one-boxers who would take only the mystery box, and two-boxers who take both.

But as American philosopher Robert Nozick wrote, "To almost everyone, it is perfectly clear and obvious what should be done.

The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly."

This is known as Newcomb's paradox, named for its inventor, William Newcomb.

The Guardian newspaper polled over 31,000 people about this problem in 2016.

53.5% were one-boxers and 46.5% were two-boxers.

Now, if you find it hard to see why anyone would pick the opposite side, well, here are the arguments for each camp.

- Look, I'm a reasonable guy and I like money, so I'm gonna do whatever gets me the most money.

So, let's go weigh the outcomes of both of these decisions.

First, I'm gonna say that the probability that the computer predicted my decision correctly is gonna be C, so the computer got it right.

And because of that, the probability it got it wrong is gonna be 1 - C.

So, let's look at what happens if I try to two-box.

There is a C chance of me getting $1,000 and a 1 - C chance of me getting $1,001,000.

If I add these two together, I get a weighted sum, which is gonna tell me how much I can expect to get if I try to two-box.

This is also known as expected utility or the EU of two-boxing.

And I can just simplify this expression a tiny bit.

So, let's look at what happens if I try to one-box now.

There's a C chance of me getting $1 million and there's a 1 - C chance of me getting nothing.

So, we can cancel this out, simplify this to just $1 million times C.

If I equate these two expressions, I'm gonna get the C at which these two expected utilities are equal, and it turns out that the C at which this happens is 0.5005, or 50.05%, which means if the computer is better at predicting than what is basically random, then the expected utility of one-boxing is gonna be higher.

Now, I know that the computer is much better at predicting than that because it accurately predicted thousands of people before me, which means I'm sticking with my one box.

Here's my $1 million.

Take that, Casper.

Thank you very much.
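
(For reference, here is a minimal Python sketch of the one-boxer's expected-utility calculation above; the 90% accuracy used below is an illustrative assumption, not a figure from the video.)

```python
# Evidential expected utilities for Newcomb's problem, as described above.
# c = probability the predictor is correct; c = 0.9 is an illustrative assumption.

def eu_one_box(c):
    # Predictor correct (prob c): $1,000,000 in the mystery box; wrong: $0.
    return c * 1_000_000

def eu_two_box(c):
    # Predictor correct (prob c): only the visible $1,000; wrong: $1,001,000.
    return c * 1_000 + (1 - c) * 1_001_000

c = 0.9
print(eu_one_box(c), eu_two_box(c))  # roughly 900,000 vs 101,000

# Accuracy at which both choices have equal expected utility:
print(1_001_000 / 2_000_000)  # 0.5005, the 50.05% mentioned above
```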

- So, I should go in and take the mystery box and leave the other.

'Cause I'm assuming that its prediction is pretty good, it's a supercomputer.

- Okay, so you're saying it's not paradoxical because the choice is obvious.

I'm very surprised.

- Why? - Because to me, the answer is also obvious, and to me, the answer is you take both boxes.

So, here's how I think about the problem in a way that actually makes sense.

You know that the supercomputer has already set up the boxes, so whatever I decide to do now, it doesn't change whether there's zero or $1 million in that mystery box, and that gives us four possible options that I've written down here.

If there is $0 in the mystery box, then I could one-box and get $0 or I could two-box and get $1,000, but there could also be $1 million in the mystery box.

And in that case, I would get $1 million if I one-box or I would get $1,001,000 if I two-box.

So, I'm always better off by picking both boxes.

This is known as strategic dominance where I always pick the dominant strategy, which in this case is to two-box.

So, give me those boxes.
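
(And a matching sketch of the two-boxer's dominance table: whatever is already in the mystery box, taking both boxes pays exactly $1,000 more.)

```python
# Strategic dominance: enumerate the four outcomes described above.
for mystery in (0, 1_000_000):        # contents were fixed before you chose
    one_box = mystery                  # take only the mystery box
    two_box = mystery + 1_000          # take both boxes
    print(mystery, one_box, two_box)   # two_box beats one_box in both rows
```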

- The two-boxer argument makes a lot of sense to me.

Once you explain it, I'm like, "Okay, yeah, I can see why exactly you're right."

But I can also see that just having those thoughts in your brain is what might allow the computer to give you nothing.

I think I'm grateful that I just don't have those thoughts.

- It's so funny because I was totally expecting you to two-box.

- No way.

No way, man.

It seems like there are two perfectly reasonable approaches that give two completely different answers, and that's because your choice actually reveals something fundamental about how you make decisions.

It comes down to these two statements.

First, as far as you know, basically everyone who has taken one box has walked away a millionaire, and everyone who has taken two has walked away poor.

Second, the supercomputer made its prediction before you even knew about the problem.

The boxes are already set up, so your decision now can't change whether the million is in there or not.

Both of these statements are true, but there's a hidden assumption in each that divides people.

- We pick both and the computer picks both?

- You just get the $1,000.

- I think I would just pick the mystery box.

- Just the mystery box, probably.

- Yeah. - I might be taking the mystery box.

- Just the mystery box?

- Yeah, it might be.

- Okay, why?

- I don't know, I guess the supercomputer is right, no?

- Here's the hidden assumption for us one-boxers.

My expected utility calculations are based on probabilities that use prior evidence of how accurate the supercomputer is, because the thousands of people it accurately predicted before me are evidence enough for me that when I go for one box, there's gonna be $1 million waiting in there for me.

This is based on something called evidential decision theory.

And using this decision theory, you get these expected utilities, and my choice is obvious from there.

And it turns out a lot of you actually thought the same way.

We polled our audience, got more than 24,000 responses, and it turns out two thirds of you are one-boxers.

- I know, it's crazy.

I don't trust this.

- It's looking to be like you're less and less rational with these results, Casper.

- So, your argument is kind of that if you one-box, it will have predicted that you were gonna one-box and you'll walk out with the money.

That is pretty convincing.

Casper, I'm starting to really doubt this two-box side of things.

- How could you not be a two-boxer?

Gregor has this really funny way of thinking about things, but I make my decision based off something else, something a little more rational because I believe that whatever I do now can't influence and change the past.

I only take into account things that I can actually influence.

And clearly, whatever I do now, whatever I think now is not gonna change whether that $1 million is gonna be in the mystery box or not because it was already set up before I learned about the problem.

This is known as causal decision theory, where you only take into account things that you can actually cause.

And so, with this, your expected utility calculation changes, and that's because you need to use a different probability, one where you could actually cause that $1 million to be in the mystery box or not.

So, right before the supercomputer made its prediction, there was some probability that it thought I was going to one-box.

I know, it's weird, but bear with me.

So, let's say that probability is P, then that's the probability that I'm going to use in my expected utility calculation.

And the expected utility to one-box is just gonna be 0 + $1 million times P.

That's pretty good, but the expected utility for two-boxing is gonna be $1,000 + $1 million times P.

But that's just the same as the expected utility for one-boxing plus an extra $1,000.

So, no matter what the computer predicted, my expected utility is always higher by picking both boxes.

So, of course, you're gonna two-box.

Anyone in their right mind would pick both boxes.
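
(A minimal sketch of that causal-decision-theory calculation; the value of P used below is an arbitrary illustration, since the point is that the $1,000 advantage doesn't depend on it.)

```python
# Causal expected utilities: P is the predictor's probability of having
# predicted a one-box choice, which your decision now cannot change.
def cdt_eu_one_box(p):
    return p * 1_000_000

def cdt_eu_two_box(p):
    return p * 1_000_000 + 1_000  # one-boxing's utility plus the guaranteed $1,000

p = 0.5  # illustrative assumption
print(cdt_eu_two_box(p) - cdt_eu_one_box(p))  # 1000.0, whatever p is
```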

- It's made the prediction before you're in there, so whether the thing I'm facing is 90% or 100% accurate, it actually doesn't really matter.

I think you guys are imposing your will on- - Yes, it does!

- He's cooking, bro.

Like, yeah, you guys think that your decision, whatever you think now is gonna change the past.

That's called wishful thinking.

- Yeah, like, come on, I'm two boxes.

I'm two boxes.

It only makes sense.

I'm back with Casper.

- Exactly, welcome to camp two-boxers. (laughs)

- I'm not losing $1 million.

It was never in the room, man.

You're gonna walk into a room and there's either money in the room or there isn't money in the room.

Your question is, do you pick it up?

- Of course.

- We're trying to do you a favor here!

- You're not doing me a favor 'cause your decision making does not affect...

Like, your little thought does not change God's mind, bro.

- Henry, if you convinced yourself that you're a one-boxer, you've convinced the machine.

- I don't believe that.

- What do you mean you don't- - I don't believe you convincing yourself should impact what the machine thinks of you.

Because often, you're just walking into a room and it's already made the prediction.

You can't impact it, you know?

- So, whatever you think is more important, whether that's the evidence of the supercomputer's accuracy or the fact that the boxes are already set up, well, that affects how you calculate the expected utility.

And because both of those assumptions are true, both camps have valid answers.

But if there's no right answer, then is this just a meaningless problem?

Well, not really.

Because it actually reveals a surprising amount about three important questions.

Does free will exist?

What does it mean to be rational?

And is there an ideal way to act in life?

For example, the only way you're going to win this game is to already be the kind of person to one-box, but then two-box at the last second anyways.

That's the only way you're gonna get $1,001,000.

But some would argue that that itself is impossible.

If the predictor is so good, let's say it's 100% accurate, then that's not even possible.

Would you say that's true?

- Yeah.

- Then a follow-up question is, if such a perfect predictor would exist, does that mean that free will doesn't exist?

Because you're saying there's nothing you can do in between walking into that room and making your decision that ends up changing what was predetermined.

- That's right, and maybe this reveals where I'm coming from, and I think where I'm coming from is maybe free will doesn't exist.

I come down on this point of, like, free will is an illusion, but our world operates in a way that is indistinguishable from free will being real, and therefore, you have to act as though it's real, as though it's 100% real.

- Interesting.

- If we think that free will is not real and it's an illusion, and then you have someone who's committing crimes and then you wanna say, "Well, that's not his fault."

Therefore, instead of putting, you know, murderers in jail for 25 years, we're just gonna give them some gardening classes or so.

Like, the problem is, that then changes the environment where everyone knows you can kill someone and you can go to, like, do the gardening.

So, you can't change the system based on the knowledge that it's an illusion.

Whether we do or don't have free will, you have to live as though it exists.

- So, you've still gotta make a choice, one box or two boxes?

Which brings us to our second question, what does it mean to be rational?

I'm the guy who acted rationally and doesn't believe his thoughts can influence the past.

- But your rational choice will have given you $1,000.

- I know, yeah, it's tough.

It's tough.

- [Gregor] Let's see what I can buy with $1 million.

None of these look great.

I think we can be more creative.

Private island sounds pretty nice.

- All right, Casper, what do you wanna go buy with 1,000, man?

(Casper laughs) (Henry laughs) - [Casper] This is known as the "Why Ain'cha Rich?" argument,

which boils down to one super annoying question, if you're so smart, then why ain'cha rich?

You know, if winning is getting more money, then of course the one-boxers are gonna end up better off than the two-boxers.

But maybe it's not about who wins, but about what's rational.

In their 1978 paper, philosophers Gibbard and Harper argue that the rational choice is to pick both boxes.

Although they do admit that two-boxers will fare worse.

They instead say that the game is rigged.

And "if someone is very good at predicting behavior and rewards predicted irrationality richly, then irrationality will be richly rewarded."

But I think that's a bit of a cop out, because really, Newcomb's paradox reveals something surprising, that sometimes in order to be a rational person, you must act irrationally.

- There's one question of what's a rational person.

There's another question of what's a rational act.

Most of the time, rational people do rational acts, sometimes they just don't.

And I think this is analogous to the situation in the prisoner's dilemma.

- [Casper] In the prisoner's dilemma, you and another player compete for money by either cooperating or defecting.

If you both cooperate, (audio chimes) then you get three coins each. (coins clinking)

But if you defect (audio chimes) and your opponent cooperates, then you get five coins (coins clinking) and they get nothing.

And if you both defect, (buzzer buzzes) you get one coin each. (coins clink)

So, no matter what your opponent does, you are always better off by defecting.

But if you play this game not once, but repeatedly, then everything changes.

All of a sudden, you're better off by cooperating.
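
(A small sketch of those prisoner's-dilemma payoffs; the 10-round repetition below is an illustrative assumption to show why cooperation pays over repeated play.)

```python
# Payoffs from the transcript: both cooperate -> 3 each,
# defect vs. cooperate -> 5 / 0, both defect -> 1 each.
PAYOFF = {  # (my move, their move) -> my coins
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

# One-shot game: defecting dominates, whatever the other player does.
for theirs in ("C", "D"):
    assert PAYOFF[("D", theirs)] > PAYOFF[("C", theirs)]

# Repeated play (10 rounds, illustrative): mutual cooperation out-earns mutual defection.
rounds = 10
print(rounds * PAYOFF[("C", "C")], rounds * PAYOFF[("D", "D")])  # 30 vs 10
```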

- What's a rational society?

A rational society is full of cooperators.

What's a rational person?

Maybe a rational person is a defector.

And normally, you might expect a rational society is made up of rational people, but I think it's familiar that rationality at one level isn't compatible with rationality at the other level.

- So, while the rational act is to two-box, a rational society would actually be full of one-boxers.

Now, I'm a fervent two-boxer, but there are three ways that you can get me to one-box.

The first is if my choices now can actually change the past.

For example, say the supercomputer makes its prediction by opening up some tiny wormhole to see the future.

Well, in that case, if I choose one box now, that actually causes the $1 million to be placed in there in the past, so I one-box.

Second, if there are multiple trials, because now with every game, each of my choices builds up a reputation, so if I one-box, then I'll be predicted as the kind of person to one-box, and so I get the $1 million either in this round or a later one.

And third is, if I can pre-commit.

If I can talk to the computer to make my case before it makes its prediction, then I will 100% one-box because staying true to my word is important to me, and the supercomputer would know that.

- If I put my word on it, I'll take the one box.

- Yeah, 100% one box, yeah.

- But there are some realistic scenarios where staying true to a worse option could have deadly consequences.

On the 29th of August, 1949, the Soviet Union detonated the RDS-1 bomb as part of their first nuclear weapons test.

This sent the US and the USSR into a furious arms race.

By the mid-1960s, the US had over 30,000 nuclear warheads, and the USSR had just over 6,000.

Both sides were more than capable of destroying the other. (explosion booms)

The US Secretary of Defense at the time, Robert McNamara, didn't advocate for disarmament.

Instead, he recommended a strategy of "assured destruction," where the US should be able to deter a deliberate nuclear attack by "maintaining a highly reliable ability to inflict an unacceptable degree of damage upon any single aggressor."

This strategy eventually became known as mutually assured destruction, or MAD.

If either country attacked first, the other would surely retaliate and lead to total annihilation of both sides. (explosion booms)

So, having that commitment to retaliate is beneficial.

It stops the attack from happening in the first place.

- Now, say you are the US president during the Cold War.

You have publicly committed to retaliate if the US is ever attacked.

But you've just received word that the Soviets have launched their missiles.

It's not a system error, it is a real attack.

So, what do you do?

If you launch now, then at best, everyone in the US and USSR dies.

And at worst, you get a nuclear winter that kills nearly everyone on the planet.

- Everyone in the whole world dies, then I probably don't launch.

- But you did pre-commit.

- Yep.

I don't like the outcome.

- No, it's terrible. - I don't like the outcome of everyone on Earth dying, so I'm gonna just not.

- When the country is electing their leader, which leader do you want to elect?

Do you want to elect someone who's crazy and is always gonna press that button, or do you wanna elect someone who makes the seemingly rational choice of saving people and not press that button?

- I think you want someone who maintains the posture of always pushing that button.

- That's right. - And then you want someone who secretly will not actually push that button.

- There is an inherent risk there, which is that if anyone finds out, you're exposed.

- There's this other game theory interaction, the game they call chicken.

You're both driving your cars at each other.

The worst thing is if neither of you swerves, because then you both die, but you win if the other person swerves and you don't.

The best strategy in this game is to visibly take the steering wheel out of your car and throw it out the window so that the opponent can see that you've done that.

Now they know you cannot swerve.

You're this mad dog that's just going straight ahead.

And now they realize, "Now my best action is just to swerve."

And that's similar to what you want with the nuclear deterrent, as they set it up in "Dr. Strangelove."

- In the 1964 film, "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb," the Russians built the perfect doomsday device.

As soon as it detects a nuclear attack or any tampering, it automatically triggers a large enough nuclear explosion to kill everyone on the planet.

The tampering kill switch isn't there to prevent enemies from disabling it.

It's to prevent the Russians themselves from having second thoughts.

Now, the whole point of the device is to be so devastating and automatic that the US would never even think about launching an attack.

But this only works if everyone knows that the device exists, which is the whole point of the movie.

- In both Newcomb's paradox and MAD, the best outcome follows from a pre-commitment to a worse option.

That's what gets you at least $1 million in the former and a tense but stable peace in the latter.

It's the commitment that's important.

So, maybe being rational isn't about deciding what to choose in the moment, but about deciding what rules you're going to live by.

- The question isn't how to act.

The question is what rules one ought to follow, or how does one even decide what rules to follow?

Sometimes it's put in the form of: if you knew that you were a robot with programming that you could set, and you could rewire yourself to make yourself obey one set of rules rather than another, the question is, what sort of rules would you wire yourself to obey?

And what you would do is you would make yourself into the kind of creature that sort of always acts in line with the commitments that would've been good to form before you even knew about the problem.

When you're in a situation like the Newcomb case, you would end up finding yourself think, "If I had been able to make a pre-commitment, what pre-commitment would've been the good one to make?

The good pre-commitment to make would've been to be a one-boxer.

And since I've already wired myself up to be the kind of person that lives up to all the pre-commitments I would've made, then I'm already, in effect, committed to one-boxing, even though I didn't realize it."

- I love this approach, because for me, that kind of makes it an iterated problem, but maybe more an iterated problem in life.

If I don't look at it as a single case and I sort of almost think about it as for every future predictor, for every future case, or like, almost building your own reputation, right?

- Exactly.

- Like, you always wanna live up to the commitments you've made, so even if you haven't heard of it before, you'd wanna stick to those ideal pre-commitments so that you are acting in line with the best version of yourself.

Yeah, that would convince me to be a one-boxer.

- It's rare to convince anyone to switch on the Newcomb problem. (laughs)

- The thing is, even if I never run into another generous supercomputer again, life doesn't end after I walk out of that room.

Like, I should always defect (audio chimes) in a one-shot prisoner's dilemma because I can only gain by betraying the other player. (coins clinking)

But when I play multiple rounds of the game, like in life or in society, everything changes.

Suddenly, it pays to cooperate.

So, being the kind of person that sticks to an ideal pre-commitment is beneficial.

So, maybe I was just a one-boxer kind of guy all along.

All it took was a little reframing and a new perspective.

(screen beeping and chirping) - The core of Newcomb's paradox is deciding if a strong correlation that you know isn't causal should matter in your decision.

- So, the question is, what do you do?

Do you pick both boxes or do you just pick the mystery box?

- Might be taking the mystery box.

- Mystery box.

- I would pick both boxes, I think.

- I also pick both boxes, so. (laughs)

- I'll take the mystery box.

- Okay, why?

Okay, so you would pick just the mystery box and probably get the $1 million.

- Yeah. - So that's based off, I guess that it's almost always been correct, so there's this correlation.

Do we think that correlation is the same as causation?

- But how can you tease out what's causation and what's just correlation?

That's a difficult problem that has applications far beyond thought experiments.

For example, how can you tell whether a drug really works or if any beneficial effects are simply due to random chance? (screen chirping)

Big questions like these combined with hands-on learning is what I love about today's sponsor, Brilliant.

It forces me to slow down and actually think instead of just nodding along with an explanation.

Brilliant helps you build skills in math and coding with step-by-step interactive lessons.

You're not just watching, you're actively solving problems and testing your ideas as you go.

I found their courses to be both fun and effective.

So, whether you want to explore how to master the math of probability, start coding in Python, or really understand how AI works, Brilliant will help you achieve your learning goals. (audio chimes)

And their lessons start you at the right level based on your background, so no matter if you're 10 or 110, the practice sets and reviews are customized to help you advance at your ideal pace.

So, how can you tell if a drug really does work?

Well, Brilliant's course on Bayesian probability goes through that exact problem, and the best part is, you build up your intuition while you figure out the answer.

By training your problem-solving muscles, you can take what you learn on Brilliant and reason through lots of other situations.

So, to learn on Brilliant for free for a full 30 days, scan this QR code or go to brilliant.org/veritasium.

I will put that link down in the description.

For viewers of this channel, Brilliant offers 20% off an annual Premium subscription.

That gives you unlimited daily access to everything on the site.

So, I wanna thank Brilliant for sponsoring this video and I wanna thank you for watching.
