
Tom Griffiths on Using Machine Learning and Psychology to Predict and Understand Human Decisions

By Stanford Data Science

Summary

## Key takeaways

- **Theory-Based Pre-Training Wins**: Generate simulated data from psychological theories to pre-train neural networks, then fine-tune on limited human data; this won the 2018 choice prediction competition by providing better inductive bias than off-the-shelf models. [14:44], [16:32]
- **Massive Data Reveals Theory Gaps**: Collected data on 13,000 risky-choice pairs, two orders of magnitude larger than prior sets; unconstrained neural networks outperform the best human theories like prospect theory once training data exceeds about 30% of the set, exposing missing explanations. [18:33], [32:18]
- **Differentiable Theories Hierarchy**: Express psychological theories as differentiable functions in a nested hierarchy from expected utility to context-dependent choice; optimize each class with neural networks via gradient descent to rediscover and surpass human proposals. [21:26], [22:22]
- **Scientific Regret Minimization**: Critique interpretable theories against black-box neural network residuals, not raw data outliers; this reveals moral decision features like prioritizing humans over animals regardless of legality, iterating to 70+ factors. [40:03], [44:36]
- **LLMs Assume Over-Rationality**: Off-the-shelf large language models predict human choices as more rational than the data show, akin to expected-value computations; fine-tuning on psychological data creates powerful predictors like the Centaur model, which outperforms cognitive baselines. [54:06], [56:22]

Topics Covered

  • Full Video

Full Transcript

My name is Chiara Sabatti and I want to welcome you tonight. This is the distinguished lecture that Stanford Data Science supports for the

winter quarter. It is the first lecture we have had in this academic year, and it is also one of the seminars in the series that

we have titled Scientific Discovery in the Age of Data and AI. I hope you will find the topic of interest, but first of all I want to welcome you

also in the name of the people who really made this event possible: the Stanford Data Science executive director Chris Mentzel and the

people who really organized the event, Laura, Maddi, Alysia, Lyn, Alexandra, and Sophia, who welcomed you as you came

in. I also want to extend to you the welcome of the director of Stanford Data Science, who is unfortunately away because

he is taking part in the AI Summit that is happening in Paris these days. Now, after all of this introduction, I want to remind you how the format is going to be:

we are going to have a presentation for 50 minutes or so, and then we will have ample time for questions. These are really meant to be events where we talk

and exchange ideas, not only in the pre-lecture time when we try to eat something, but also afterwards, really taking advantage of the intellectual

stimulation that we get from the lecture. So it is really my pleasure to introduce tonight's speaker, Professor Tom Griffiths. He is a professor

of psychology and computer science and the director of the Computational Cognitive Science Lab at Princeton, as well as the director of the Princeton

Laboratory for Artificial Intelligence, which is a fairly new organization that is in many ways analogous to what we are trying to do with Stanford

Data Science, in that it tries to facilitate a lot of research in many different areas that deal with data and computation. I mentioned to you the topic

of this series of lectures that we have been having, and when we started making the list, the name of Tom

Griffiths came up very naturally as one of the main speakers to invite. He has had a very prolific and

interesting career studying human cognition with the mathematical models that we can make for it, but also understanding how human ways of

reasoning are different from a machine's way of reasoning: what does it mean to reason, for a human and for a machine, and how should we do that?

He is the author not only of strictly scientific publications but also of a book that we might all enjoy; it is

called Algorithms to Live By, and he has received numerous awards. So I don't want to take any more of his time away; please

help me in welcoming Professor Griffiths. All right, thank you very much, it's wonderful to be here. One of the facts that was omitted from my biography there is that I actually got my PhD from Stanford, and so it's very

nice to be back and visit a lot of the places where I spent time many years ago. One interesting fact about that is that while I was a graduate student I worked with a law professor on a book

that was actually about human decision-making, and that law professor was Paul Brest, so it's particularly nice to have the opportunity to give this talk in Paul Brest Hall. He was going to try to be here today, but

unfortunately he wasn't able to make it. So it's really wonderful to be able to rekindle some of those relationships and have the chance to visit places that mean a lot to me.

So as you heard, I'm going to be talking about understanding human decision-making, and there are a lot of reasons why we might care about understanding human decision-making. One of the classic reasons is that human decisions underlie

our economy, right, the sorts of financial decisions that people make. But nowadays there are many other reasons why we might care about understanding human decision-making as well. So if you're trying to build any kind of system which

is going to interact with human beings it's helpful if those systems are able to make inferences about what it is that human beings want and one of the ways that you can make inferences about what human beings want is understanding how it is that what people want translates

into the way in which they behave so you can kind of work backwards and figure out what they might want from the things that they do um and if you know those Stakes don't seem high enough for you uh we can easily raise the stakes because

those systems that are making those inferences about you know how human beings are going to act are not just things that you're interacting with with your voice or by typing but things that you're interacting with on the road this is a a photo from a collision that

involved a self-driving car from the Uber self-driving car program where arguably the cause of this Collision was not making an appropriate inference about the kinds of things that people know about and can take actions based

upon based on the observations that you have of their behavior and so this uh the focus of the talk I'm going to give today is really on thinking how it is that we can make machines that might be

able to better predict and explain human behavior and in particular human decisions so if you're a machine learning researcher one of the ways that you can think about making machines better at doing something or at least

making it easier for them to learn how to do something is by thinking about what machine learning researchers call inductive bias right so inductive bias is everything other than the data that influences what a system learns so it's

the kind of the things that you build into a system that influence what it's going to be easy for that system to learn and what things might be hard for that system to learn um inductive bias isn't necessarily trendy at the moment uh the current sort of Trends in machine

learning over the last 10 years have been more focused on trying to engineer systems that have a large capacity for processing data so you can feed more and more data into them you don't need to have much inductive bias in order to get

them to learn how to solve problems but I think as we start to run out of data in certain critical applications or when we're trying to work in areas like understanding human behavior where we have limited data and data are expensive

to come by thinking about inductive bias is important so despite the fact that inductive bias is uh unpopular it's something that we nonetheless see in uh deep learning models so for example if

you look at things like convolutional neural networks, which are used to learn functions that involve images, like classifying images, they have an inductive bias which makes them

particularly good for processing images, right, so they are built in a way which allows them to learn functions that are translation invariant more effectively. Or if we look at things like long short-term memory networks or

Transformers they have an inductive bias which makes it possible for them to learn things that involve complex relationships that might unfold in sequences and so one way of thinking about this is that machine learning has been very successful in constructing

models that have good inductive biases for learning things about images or learning things about text but we haven't necessarily been very successful in creating systems that have good inductive biases for learning things

about people if you want to build a system that you know is trying to predict something about human behavior you can't say oh I'll go get my neural network architecture which is particularly useful for predicting human behavior because no one has come up with

that. Normally when people are trying to make predictions about human behavior they go and grab a standard off-the-shelf neural network or some other kind of standard machine learning technique; there's not the same kind of understanding of what the inductive

biases need to look like and so that's the thing that I'm going to focus on and what I'm going to argue is that in fact there's a particularly useful source of inductive bias if you want to build systems that can make predictions about

human behavior, and that source of inductive bias is the last 100 years or so of psychological research. So for about 100 years people have been trying to understand how human behavior works; psychologists have figured some

things out those things that people have figured out are useful in allowing us to construct systems that have good inductive biases for predicting behavior and so the first question I'm going to engage in here is trying to answer this question of how it is that theories from

psychology might Aid Us in developing machine learning models but I'm also going to sort of turn this around because it's certainly not the case that psychologists over the last 100 years have figured out

everything about human behavior uh and there's actually a way in which the kinds of machine learning models we build can be equally useful in helping us learn new things about humans right so the second question I'm going to

engage with is a question of how it is that machine learning can help explain and you know not just predict human behavior so help us build tools that help us to make progress in that scientific goal of understanding human

behavior better um and in this talk the focus is going to be on three methods for combining in theory and data that we've developed in my lab over the last few years um that are really focused on

this problem of how do we create systems that have the right kinds of inductive biases for uh making predictions about people or also in other scientific settings um and how it is that we can

constrain models that allow us to learn effectively from small amounts of data, and then how it is that we can build more interpretable models. And so while my focus is going to be on human decision-making, the kinds of methods that I'm talking about here are

things that I think are actually pretty broadly applicable in solving a wide range of scientific problems where you have limited amounts of data and you have machine learning techniques that you want to be able to use and so hopefully it's a a good fit for the the

theme of this series um so the first example that I'm going to zoom in on uh is motivated by this financial case and I'll sort of use

this example for most of the first part of the talk and then I'll switch gears and talk about another decision-making problem and so uh this example comes from thinking about you know sort of the

most basic elements of these sorts of financial decisions which is essentially making a decision under risk right making a decision where you've got some uncertain outcomes uh there are different payoffs associated with those

outcomes, and you're trying to figure out which option you should select under those circumstances. So in psychology and in other fields that have studied decision-making, this kind of decision

is what's called risky choice, and here's a sort of standard example of a risky choice problem. If you've ever participated in a psychology experiment on decision making you might have seen something like this; people

taking a psychology class might be having flashbacks to sitting in a dingy room and having to push buttons on a keyboard. These are the sort of classic kinds of problems that have been used to study human decision-making. So in this

case you have a choice between two options uh if you choose option A you get 16 points or dollars uh uh with certainty so with probability one and then if you choose option b there's a different distribution of different

outcomes that can occur with different probabilities and then payoffs that are associated with those and you have to make a decision about whether you want to choose option A or option b in this situation so these kinds of risky Choice

problems have been the bread and butter of at least a part of the different disciplines, like psychology and behavioral economics, that have studied human decision-making. They have

motivated the development of various theories they've been used to show that those theories seem to be wrong um and they've given us uh a certain amount of data that we can use for starting to constrain the sort of theories that we might construct of how it is that people

make decisions and so a natural question that we could ask is whether you can use machine learning models to make predictions about this sort of most basic version of

human decision-making. So all the way back in 2015, and this is really the moment that was the genesis for the work that I'm going to talk about today, a group of researchers at the Technion in Israel

led by Ido Erev ran what they called the Choice Prediction Competition, and the idea behind the choice prediction competition was that people would see

a bunch of these gambles so they'd see 90 of these gambles uh these these Choice problems where you're deciding between two gambles um and then uh those

the data from people's responses to those 90 gambles would then be provided to people who were participating in this prediction competition, and those people could use whatever method they wanted: they could come up with a

psychological theory, they could use a machine learning method, they could apply those things to those 90 gambles, and then they would submit predictions, and those predictions would be evaluated against the actual behavior that people showed on another 30 gambles. So here's what

you actually had to predict with the data that you got: the thing you had to predict was what proportion of people would choose option A in a given situation and what proportion would choose option B. You'd see that for 90 problems like this, and then

you'd have to make those predictions for another 30 um and what they found when they did this analysis was that uh the kinds of machine learning models performed extremely poorly right so when

you ran machine learning models and compared them with the intuitions of psychologists, the intuitions of psychologists won every time. The best models for making these kinds of predictions about human behavior were things that were either theories that

psychologists had come up with or uh machine learning models that were very constrained to use features that corresponded to things that psychologists thought were important um and so that was the conclusion of their their papers was that it seemed like you

you know you you really needed this sort of psychological input in order to do a good job in solving this problem um so a few years later they ran another Choice prediction competition uh

and this choice prediction competition was in 2018 and had a bit more data, so it had on the order of 300 pairs of gambles, but

the same kind of basic prediction problem take those gambles make a prediction and then you get scored and a group of researchers from my lab got really interested in this and we tried to come up with a way of solving this problem so one way that you can think

about exactly what this problem is is it's something where you've got uh an input space which corresponds to the descriptions of the gambles right so you've got all of these outcomes that can occur with different probabilities

and so you have the outcomes, the payoffs, and you have the probabilities that are associated with those, and that defines a space of pairs of gambles, where each of those things is a dimension of that

space and what you're trying to do is learn a function that Maps you from this configuration of you know outcomes and probabilities to the probability that somebody chooses say option a right and that's just a very natural sort of

machine learning problem and so if you think about this as a machine learning problem that makes sense the big difficulty though is that the amount of data that you have for estimating that function is very small and so the

question is how can you try and solve this problem in a way that allows you to do a good job of estimating that function despite having very limited data and so we came up with a way of solving that problem which we called

Theory based pre-training but the basic idea is that even though you've got a very small amount of actual human data you do have that you know 100 Years of history of psychologists trying to predict the kinds of things that people

do and in particular for this kind of task we have have pretty good models of the kinds of things that people do when they make decisions and so you could take those theoretical models that psychologists had developed and then

generate a whole bunch more data from that theoretical model and then use those data to pre-train your model. Now, this is something that people do in robotics; it's called sim-to-real:

you train your system, which is going to solve a robotics problem, in a simulated environment, then you transfer it to the real environment, and that's a way of getting around the problem that, in robotics, is very much the

problem you have here, that getting data from the real world is expensive, and you can simulate data much more easily. So you can simulate lots of data, train the model on the simulated data, and then transfer that

to the real-world setting. It's exactly the same idea here, it's sim-to-real, but the sim comes from our psychological theory and the real is the actual decisions that people

make and so in this case we could take uh a theory that people had come up with that theory was in fact the theory that did best in the 2015 Choice prediction competition um it's a pretty complicated

psychological theory; it says there are four kinds of heuristics that people use, and they use them in different circumstances, and so on. But despite the complexity of that theory, what it allows you to do is generate data: you can say what would

people do on this particular task generate a bunch of data from that use that to pre-train say a neural network model so you're training that neural network on data which is generated from that psychological theory then you've

got your little bit of real human data and you just use that at the end to fine-tune the results. And what we found was that just doing that was actually sufficient to allow us to do extremely well in generating these

predictions. It might be a little hard to see here, but this is comparing our approach, which is just a neural network, a multi-layer perceptron, with this cognitive prior, to all of

the different kinds of models that uh were used on the choice prediction competition 2015 data and then this is showing it on the 2018 data um uh we actually won the 2018 competition um but

slightly embarrassingly my graduate student submitted the the wrong uh model so they submitted our backup model um our backup model was good enough that it actually won uh although it didn't use

this technique, and in fact if he'd submitted the right one we would have done even better. So this kind of approach, though, is something that worked really well for at least being able to make progress in

this setting where you have this very limited amount of human data. And the idea is: start with the theory that you have; even if that theory is not perfect, it's closer to the right answer than wherever your neural network normally

starts off, which is some random place in weight space. So pre-training moves you towards a solution which is basically instantiating the idea which is in that psychological theory, and then training on a little bit more data moves

you away from that and towards the actual human responses. OK, so this is thing number one, method number one, theory-based pre-training: we generate simulated data from a theory, we use that to train a model, and then we can fine-tune away from it.
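(As a rough illustration of the pre-training recipe just described, here is a minimal sketch. The feature encoding, the stand-in theory simulator, the network architecture, and all hyperparameters below are illustrative assumptions, not the actual setup used for the competition.)

```python
# A minimal sketch of theory-based pre-training: pre-train on data simulated from a
# psychological theory, then fine-tune on a small amount of real human data.
import random
import torch
import torch.nn as nn

def simulate_theory_choice(gamble_pair):
    """Stand-in theory: probability of choosing option A from a noisy
    expected-value comparison (a real theory like BEAST would go here)."""
    (p_a, x_a), (p_b, x_b) = gamble_pair
    return torch.sigmoid(torch.tensor(p_a * x_a - p_b * x_b, dtype=torch.float32))

def encode(gamble_pair):
    """Flatten a pair of one-outcome gambles into a feature vector."""
    (p_a, x_a), (p_b, x_b) = gamble_pair
    return torch.tensor([p_a, x_a, p_b, x_b], dtype=torch.float32)

model = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

def fit(pairs, targets, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    X = torch.stack([encode(g) for g in pairs])
    y = torch.stack(targets).unsqueeze(1)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()

# 1) Pre-train on lots of choices simulated from the theory.
sim_pairs = [((random.random(), random.uniform(0, 20)),
              (random.random(), random.uniform(0, 20))) for _ in range(5000)]
fit(sim_pairs, [simulate_theory_choice(g) for g in sim_pairs], epochs=200, lr=1e-3)

# 2) Fine-tune on the small set of real human choice proportions (placeholder data).
human_pairs = [((1.0, 16.0), (0.95, 18.0))]   # e.g. the competition's training problems
human_props = [torch.tensor(0.62)]            # observed proportion choosing option A
fit(human_pairs, human_props, epochs=50, lr=1e-4)
```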

But the more fundamental problem here is just that if you wanted to use machine learning methods to try and make sense of these decision data, you just did not

have enough data right so a lot of these machine learning methods work most effectively when they have more data and so as a consequence of having this experience and sort of saying you know we kind of worked out from generating

that simulated data how much data you'd actually need in order to train a neural network well uh and so the next thing we did was actually just go off and collect those data so in the last uh you know 10

years or so there's been a revolution in the kinds of methods that are used in Psychology where it's possible for us to go online and basically turn dollars into data using crowdsourcing services

like Amazon Mechanical Turk. So that's what we did. I told you that the large data set that was previously used in the choice prediction competition consisted of 300 pairs of gambles; what we did was actually go off, and then we

collected data for 13,000 pairs of gambles, and so that's roughly a two-orders-of-magnitude increase in the amount of data that we have in this kind of task, and by doing that we're able to cover the space of these

problems in a much more comprehensive way so just to get a sense of what that looks like this is uh a visualization of the data set that we collected um this is a a two-dimensional projection of

that high dimensional space that we're trying to learn a function over where this is extracted from um the internal representation of one of the neural network models that I'll show you in a minute but it just sort of gives you a

sense of what this looks like. Each of these dots corresponds to one of those choice problems, a choice between a pair

of gambles. The green X's correspond to historically important pairs of gambles: in the development of theories of choice going

back over you know the last 50 years or so these are the the sort of things that psychologists came up with as points in the space that were meaningful for discriminating between one Theory or another um the red dots correspond to

the um the 2018 Choice prediction competition and then the black dots correspond to the data that we collected and so what you should be able to see is there's a lot of black dots there's not a lot of green crosses there's not a lot of red dots um and those black dots are

sort of filling in parts of the space that really weren't represented in the previous data so what this means is that now we have this broad base of data from which we can try and estimate this function that tells us what's the

probability that people select a particular option and so there's another thing to say about this which is just uh you know uh this is really represents a very different kind of method from the

kind of method that psychologists have previously used for trying to make sense of human behavior so for a long time collecting data was expensive and difficult you had to get people to come

into the lab to get data from them um and that limited the kinds of experiments that people would run so uh as a consequence the way that psychology has been done for those 100 years is people sort of come up with a hypothesis

they might have an alternative that they're comparing it to they design the perfect experiment to distinguish between those hypotheses and then they run that experiment and then they report the results and what that does is essentially if you kind of think about

what this is doing it's kind of like jumping you around those green x's in this space as you're going from one Theory to another um but it's not allowing you to get a comprehensive picture of what that whole Space looks

like right of how it is that people's decision- making varies across all of those different kinds of problems and so by going out and just sort of collecting this entire space of what these possible

Choice problems look like we're able to collect that much more data that we can get a comprehensive picture of what the whole Space of possibilities is and then we can look at how the kinds of decisions that people make might vary

across that space. And the way that we do that is by using a method that we call differentiable theories, which was published in a paper called Using Large-Scale Experiments and Machine Learning to Discover Theories of Human Decision-Making. The authors

here, and I should have said on the previous project it is the same set of authors, are Josh Peterson, David Bourgin, Mayank Agrawal, and Daniel Reichman. So the idea here was that if we think about this problem

in terms of trying to estimate this function which relates choice problems to the probability that people select a particular option, we can then characterize what that space of functions

looks like as well and I'll explain this more in a moment but the basic idea is that we can express the kinds of theories that have previously been talked about as theories of human decision-making in terms of a hierarchy

of functions and then we can explore in that hierarchy of functions where we can find functions that actually correspond to good solutions to this problem and the tool that we're going to use to do that exploration is the kind of tool

that drives a lot of modern machine learning which is that if you can express a theory in a form which corresponds to a differentiable function then you can use tools for automatic differentiation to then automatically

search over the space of functions to find the one that best corresponds to the data that you have and so that might sound a little weird and complicated but it's exactly the procedure which is used in you know sort of back propagation

algorithms that are used to train deep learning models what we're doing is taking our theories expressing those theories in terms of what's called a computation graph that describes how it is that uh you know these functions are

composed out of simpler parts, and then as long as those parts are themselves differentiable, we can use these tools of automatic differentiation to allow us to use gradient descent to explore the space of functions and find what

function in that space best corresponds to something which is going to capture people's behavior and so what this allows us to do is exhaustively evaluate models that might belong to different

classes and so now we can define a constraint which corresponds to what class of theories we're thinking about and then we can find the very best model in that class and then that allows us to

to sort of evaluate how the constraints that correspond to those classes actually translate into making a difference in predicting behavior um and so the kinds of classes of models that we looked at things that corresponded to

Classic kinds of choice models but also things that were more General than that so one starting point um is uh expected utility Theory um and the way that we express this here is with the idea that

the probability that you choose one of those gambles, say option A (hoping the lights come back on, it wasn't me; it's probably

someone in the back of the room leaning on a light switch), is going to be proportional to the sum over those outcomes of the probability of each outcome times the utility which is

associated with those outcomes so in here you have this utility which is like how much value you would assign to getting some number of points or some number of dollars and the classic idea

here is that this utility uh function is something that say is a um a saturating curve right sort of like decreasing in the rate at which it's going up over time this allows you to capture the fact

that essentially you know you're uh you know they say your your first million dollars is the hardest but it's also probably the million dollars that you value the most the next million dollars you make after that you might value less than the first million right so as you

get more money getting more money is sort of has decreasing value to you um but this function you know normally the way that you might try and estimate this function is by defining some class of utility functions and trying to find the

thing that's in that class of utility function that best fits the data but what we can do is to say well in fact let's leave that class unspecified and instead we're going to fit a neural network and that neural network is going

to play the role of capturing that utility function so this is a completely non-parametric completely flexible way of allowing us to estimate what that utility function is from the data which

we collect in these decision problems. And so then what we can do is say: using this neural network to represent the class of functions that we're estimating, try and find the best function in that class, and that's going to give us our best version of this expected utility theory model.
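(To make the idea of a differentiable theory concrete: roughly, expected utility theory scores each gamble by the sum over outcomes of probability times utility, and here the utility function is replaced by a small neural network fit by gradient descent. The following is a minimal sketch under assumed architecture, choice rule, and data; it is not the exact model from the paper.)

```python
# A minimal sketch of a differentiable expected-utility theory:
# EU(gamble) = sum_i p_i * u(x_i), with u(.) represented by a neural network.
import torch
import torch.nn as nn

utility = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def expected_utility(probs, payoffs):
    """Sum of outcome probabilities times the learned utility of each payoff."""
    u = utility(payoffs.unsqueeze(-1)).squeeze(-1)
    return (probs * u).sum(dim=-1)

def p_choose_a(probs_a, pays_a, probs_b, pays_b):
    """Logistic choice rule on the difference in expected utilities."""
    return torch.sigmoid(expected_utility(probs_a, pays_a)
                         - expected_utility(probs_b, pays_b))

# Toy batch of one problem: A = 16 for sure, B = 20 with probability 0.9, else 0.
probs_a, pays_a = torch.tensor([[1.0, 0.0]]), torch.tensor([[16.0, 0.0]])
probs_b, pays_b = torch.tensor([[0.9, 0.1]]), torch.tensor([[20.0, 0.0]])
observed = torch.tensor([0.55])   # proportion of people choosing A (placeholder)

opt = torch.optim.Adam(utility.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(p_choose_a(probs_a, pays_a, probs_b, pays_b), observed)
    loss.backward()
    opt.step()
```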

The next level of complexity, in terms of moving up through this hierarchy of models, is a model that says: OK, yes, you're allowed to

assign different utilities to different outcomes but let's also say that you subjectively transform the probabilities that you associate with those outcomes um and this is a model that's called prospect theory it's very famous it's um

the source of Daniel Kahneman's Nobel Prize. And the basic idea here is that there are two things that go into this. One is that the way in which we're going to define

this utility function is now sensitive to whether you're um uh you're experiencing losses or gains um and so that allows you to capture the fact that say people um might make different

decisions if they're given a choice between, say, an 80% chance of winning $4,000 or a 100% chance of winning $3,000; people often choose the 100% chance of winning $3,000, right, the sure thing is

attractive in that situation whereas if you're given say an 80% chance of losing $4,000 and 100% chance of losing $3,000 most people choose the 80% chance

of losing $4,000, because there's a chance you could get away with it. And so that asymmetry is something that you can capture by allowing yourself to have

an asymmetry in that utility function um whereas if you um uh are interested in um uh capturing phenomena like for

example the fact that people are willing to play lotteries, where, say, you might be willing to pay $5 for a lottery ticket, or you might be willing to trade off

a one-in-a-thousand chance of winning $5,000 against a 100% chance of winning $5; the trade-off between those things might seem reasonable. You can capture that through the fact that people might overweight small

probabilities and, as a consequence, overestimate the value of gambles that involve gains associated with those small probabilities, and then underweight larger probabilities

and so those kinds of deviations from expected utility Theory are the kinds of things that motivated studying particular choice problems but there are also things that uh correspond to different choices for the forms of these

functions, and in the same way we can use a neural network to capture the forms of these different functions and then try and find the best corresponding version of prospect theory for capturing what's in the data.
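(For reference, here is a standard textbook parameterization of the prospect theory family being described; the functional forms below are the classic Kahneman-Tversky ones, which in this work are instead estimated freely by neural networks.)

```latex
% Classic parameterization of prospect theory (textbook form, not from the talk):
\[
  V(\text{gamble}) \;=\; \sum_i w(p_i)\, v(x_i), \qquad
  v(x) \;=\;
  \begin{cases}
    x^{\alpha} & x \ge 0 \\[2pt]
    -\lambda\, (-x)^{\beta} & x < 0
  \end{cases}, \qquad
  w(p) \;=\; \frac{p^{\gamma}}{\bigl(p^{\gamma} + (1-p)^{\gamma}\bigr)^{1/\gamma}}
\]
% Loss aversion is the lambda > 1 asymmetry between losses and gains; the weighting
% function w overweights small probabilities and underweights large ones.
```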

The last two classes of models that I'm going to talk about are a model we called a value-based choice model, which just says you're going to assign a value to each gamble based on the probabilities and the outcomes that are

associated with that gamble um but we're going to allow the form of the function that translates probabilities and outcomes into uh the those values to be completely free and we're going to fit that with the neural network and then

what we call context-dependent choice, which just says you're going to take all the probabilities and outcomes, put them together, and smoosh them into one big function, and then that function is going to give you a probability that you make a particular

choice and so again the key Point here is that as we go through these models we're sort of introducing more ways in which these models are free right so they sort of form a nested hierarchy of models and we can find within each of

those classes of models the best fitting model by using these kinds of tools of automatic differentiation and then optimization through gradient descent and so when we do that we can ask

questions about how well human psychologists and economists have done in exploring this space of choices. So for example we can

take uh expected utility Theory right and take something that has the functional form of expected utility Theory and then what I'm doing here is showing this is how well we do in fitting the data right so this is um a

smaller number here is better and then this is showing uh how much of the data set is uh a model is trained on and then all of these gray lines correspond to different hypotheses that human

researchers had come up with for what the form of that utility function might be and then taking that form and then estimating it from the from the data and the blue line shows what happens when we train our very constrained

neural network that's trying to freely estimate the form of that utility function and so what you can see is that you know humans have done pretty well right um humans have actually come up with pretty good characterizations of what the form of that utility function

might be there's some of these gray lines that are quite low here but then you know and then as we train our neural network model on an increasing amount of our data set um it can sort of discover a function that corresponds to the thing

that's the the best the best in the the the set of functions that human beings have come up with so humans have discovered you know like a pretty good characterization of these kinds of models this is what what ends up being estimated from from that procedure and

you can do the same thing for prospect theory. So this is showing the same thing; here again all of these gray lines correspond to human proposals about what the form of that probability weighting and utility function might be like, and

then one interesting thing that you can see here is that as we train our prospect theory model it quite quickly does better, and then this is another version of prospect theory, cumulative prospect

theory which does you know even better with relatively small amounts of data so again we're able to sort of ReDiscover the kinds of things that it's worth saying you know took psychologists and economists working pretty hard for 50

years doing lots of experiments and theorizing pretty hard we can ReDiscover those in a single experiment where we went out collected a huge amount of data and then used these constrained neural networks to search through the space of

those functions. And again, this is what this looks like; this probability weighting function doesn't look quite like the one that people normally assume for prospect theory, and I'll come back and talk about that a little bit later

on okay so now this is what happens if we take all of those classes of models that I was telling you about before and we start to compare them against one another um so here the red and blue lines these are the ones that correspond

to those classic models I was telling you about on the previous slide expected utility Theory prospect theory and so on um this light blue line is what happens when we use the value based model so this is the model that assumes that

people assign a value to each gamble based on its payoff and probabilities and so on and that if you get enough data right so when you get to about 30% of the way through our data set or 40%

that starts to be able to beat these best human theories from history. The dashed line here corresponds to the best human theory from the choice prediction competition

and the best theory in all of the theories we looked at we looked at about 30 different theories that have been proposed by psychologists and economists um this perform best in predicting the data um and it

uh performs even slightly better than this value-based model but if you have a completely unconstrained neural network that's just trying to predict the probability that people make a choice right once you get past you know 30% of

the data set or so that starts to outperform this best model that human beings have come up with okay so the the key takeaway here is that by the time you get to the end of the data set

there's quite a large gap between what the neural network model that's just completely unconstrained is trying to learn from the data can do in terms of performance and the the best theory that that humans had come up with in terms of

being able to make predictions about these kinds of decision problems and when we look at that Gap that's something that we can see as an opportunity right that's an opportunity to actually realize that there are things that we didn't know that we have

an opportunity to now learn as we try and make sense of these data um the other thing I want to point out here is this is that choice prediction competition from 2018 the previous largest data set right and if you look

here you get a conclusion which is very consistent with the kind of conclusion they had, which is that these sorts of psychological theories do a good job, right, and they're better than the completely flexible machine learning models. So this is only something which emerges once you have

enough data, but once you have enough data, then the more flexible model ends up being able to get this performance. And just for people who are worried about overfitting, all of this evaluation of mean squared error is

done on uh held out data so this is a validation mean squared error okay so let's talk about um this is an opportunity right it says that there's something which we're missing in uh

trying to account for human decisions that's not in the psychological theories that we have. It's also a problem, right, and the problem is that this model that I just described to you consisted of us just taking a neural network, throwing

it at the data, and then learning something from it, and we have no idea what it learned. So we need to solve that problem in order to make the most of that opportunity; we need to try and figure out what this thing is doing in order to

you know be able to actually learn something about explaining human decisions so in order to do that I'm going to switch gears I'm going to talk about another method for trying to uh use machine learning

models to understand human behavior um this is something we call scientific regret minimization for reasons that I'll explain in a moment it's kind of a joke for uh people who are used to the idea of regret minimization in machine

learning um but the idea here is essentially that uh when we have a really well-performing blackbox model like this neural network model we don't understand what it's doing it can still

nonetheless be useful in guiding Us in theory development and the idea is that we can use that blackbox model to critique an interpretable model uh and the way that we make progress is that we

look at the errors that we could have predicted the blackbox model tells us we you know we could have predicted and use those to critique the model that we work with so just to illustrate how this works I'm going to return to another

example which is motivated by this case. This is a huge data set of moral decisions that was collected by the Moral Machine project. So they had,

in the part of the data set they were looking at um about 10 million decisions that all involved a task where um people had to say you have a a car this is a

self-driving car you can see there's no one in the car that is headed towards an intersection it's a version of the trolley problem um the car is going to hit somebody on the intersection is either going to hit this side or this

side you have to make a decision should the car continue to go straight or should it swerve in which case it's it's going to you know if it continues to go straight it's going to kill these people if it you know curves it kills these people it's a horrid sort of morbid

example it's exactly the kind of thing that you might uh want to know if you were trying to design self-driving cars this is the way that they motivated it you wanted to design self-driving cars that would capture human moral

intuitions um but for our purposes it just consists of a very rich and complex data set for which they had a huge amount of data where we can actually see if we can actually use some of these methods to try and make sense of the

decisions that people make. So in this case there are about 20 different kinds of people who can appear on different sides of the road, and the number of them can vary as well, so you can think about representing each of these scenarios

just via the number of those different people who appear on on the road right again we have a problem where we have a vector right now we have this Vector of how many people are on the left hand side of different types how many people on the right hand side of different

types, that we're trying to map to a probability, where that probability is now of the car either going straight or swerving. You can make a very principled cognitive model, a sort of rational choice model, which just says you assign some

utilities to each of those lives those utilities might vary based on the type of individual that's represented uh and um we're going to add up those utilities and then you're going to make a decision about which side you want to go to Just

based on the sum of those utilities, and so it's the relative utility of the left side versus the utility of the right side. And then you can make that model a little more complicated by adding in

what moral philosophers might call deontological principles so for example um uh you might uh think that somebody who's crossing the road legally has a

privileged value compared to somebody who's crossing the road illegally so you might prefer to avoid hitting people who are crossing the road legally and you know uh in favor of people who are crossing the road illegally or um one

thing that's very clear from the psychological data is that in these kinds of situations people prefer inaction to action, right; taking an action seems like it's more morally laden, and so as a

consequence you might have a preference not to swerve the car at all, and in fact that shows up in the data. So these are things that you could just add in as extra constants in this equation, where they're features that correspond to whether the

scenario satisfies that particular rule.
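(Here is a minimal sketch of this kind of utility-plus-deontological-rules choice model. The character types, utility weights, and rule bonuses are illustrative placeholders; in the actual work those weights are fit to the Moral Machine data.)

```python
# A minimal sketch of a rational-choice model with deontological features for
# trolley-style scenarios: sum utilities over the characters on each side, add
# rule-based adjustments, and pass the difference through a sigmoid.
import numpy as np

CHARACTER_UTILITY = {"adult": 1.0, "child": 1.3, "pregnant_woman": 1.5,
                     "cat": 0.2, "dog": 0.25}
RULE_WEIGHTS = {"crossing_illegally": -0.4,   # illegal crossers are protected less
                "stay_on_course": 0.3}        # preference for inaction over swerving

def side_value(characters, crossing_illegally):
    """Summed utility of everyone on one side, adjusted by the legality feature."""
    value = sum(CHARACTER_UTILITY[c] for c in characters)
    if crossing_illegally:
        value += RULE_WEIGHTS["crossing_illegally"] * len(characters)
    return value

def p_go_straight(straight_side, swerve_side):
    """Probability the car stays on course (hitting whoever is straight ahead)."""
    # More value on the swerve side -> more reason to go straight, plus an inaction bonus.
    score = (side_value(*swerve_side) - side_value(*straight_side)
             + RULE_WEIGHTS["stay_on_course"])
    return 1.0 / (1.0 + np.exp(-score))

# Two adults crossing legally straight ahead vs. three cats crossing legally:
print(p_go_straight((["adult", "adult"], False), (["cat", "cat", "cat"], False)))
```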

Then we can contrast this kind of model with a black-box model, in this case again a neural network that's just mapping that vector into a probability. Again, what you find is that machine learning does better than the psychologists. So here, this is increasing the size of this data

set so I told you it goes up to about uh 10 million decisions and then these different lines correspond to this is now I'm showing you the accuracy of the model so here higher is better um and

these different lines correspond to um different uh models of the kind that I was talking about before so the blue line is uh if you give equal weight to

all of the lives involved that does sort of okay you can make that better by distinguishing between animals and people so if you have different weights for animals and people you get a little bit better you can do better than that by having one where you have different

values for all of those 20 different kinds of individuals. And then if you add in those extra deontological principles I was telling you about, you can get to this red line here. But then, if you get enough data, the

neural network ends up doing better right um and is outperforming uh the best psychological model here and so one thing you might then want to do is to say let's see if we can figure out what

the psychological model is missing, and a standard technique that you use for doing that is something called error analysis or residual analysis, where you look at the residuals from the model, the things that

it fails to get correct and so if we do that uh and we compare our best sort of theoretically motivated Choice model with the neural network model um what

you find is this right so this is now I'm showing you the case where the error between our choice model and the data is largest and it corresponds to this situation it might be a little bit hard

to see there's a bunch of people on the left hand side of the road there's a bunch of cats on the right hand side of the road uh and and the data says that 99.4% of people chose to kill the people

rather than the cats uh and so we have no idea what happened there right and the people who made the data set have no idea what happened there but if you collect 10

million data points this is the kind of thing that sometimes happens which is that the more data you have the more opportunity there is for weird outliers to happen right and if you're comparing

your residuals to the data and you've got huge amounts of data the biggest residuals you're going to find are going to be these really weird cases where something went wrong and so doing that kind of analysis based on residuals is not particularly effective in these

kinds of massive data sets and so that motivated us to think about this approach we call scientific regret minimization which is that you should only be concerned about the errors that could have been predicted right so if

you go back to this and we look at it, the choice model here says, you know, no, that shouldn't happen; the neural network also says that shouldn't happen, and that should tell you that this is a case where, despite the choice model

being different from the data, maybe you don't actually care so much, because the model that we'd made using the neural network, which is kind of like the best model that we've come up with, still doesn't do any better in that circumstance, and as a consequence this

is an error that is maybe less important than it seems like and so this is the idea then right the regrets that we're minimizing we should only regret those things that we could have done better

right, and this is a situation where whether we could have done better is told to us by that neural network model, our black-box model, which is the best predictive model we have. So the idea is: we fit a black-box machine learning model, and we evaluate residuals against that model, not against the data.
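(A minimal sketch of that bookkeeping, with made-up numbers: rank scenarios by the residual between the interpretable model and the black-box model, rather than by the residual to the raw data.)

```python
# Scientific regret minimization, schematically: inspect the cases where the
# interpretable model disagrees with the black box, not the noisiest data points.
import numpy as np

def rank_for_inspection(interpretable_pred, blackbox_pred, observed, top_k=10):
    """Indices of the scenarios the interpretable model misses worst, judged
    against the black box (regret) vs. against the raw data (noisy residuals)."""
    regret = np.abs(interpretable_pred - blackbox_pred)   # errors we could have avoided
    raw_residual = np.abs(interpretable_pred - observed)  # dominated by flukes at scale
    return np.argsort(-regret)[:top_k], np.argsort(-raw_residual)[:top_k]

# Predicted probability of sparing the left-hand side, for three toy scenarios:
interpretable = np.array([0.85, 0.55, 0.10])    # utility-sum + deontological rules
blackbox      = np.array([0.87, 0.92, 0.12])    # neural network fit to the big data set
observed      = np.array([0.006, 0.90, 0.11])   # empirical proportions (first one is a fluke)

by_regret, by_raw = rank_for_inspection(interpretable, blackbox, observed, top_k=1)
# by_regret points at scenario 1: a systematic miss worth a new model feature.
# by_raw points at scenario 0: an outlier that even the black box cannot predict.
```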

learning model we evaluate residuals to that model not to the data and if you're a social scientist or any kind of data scientist now you're kind of freaking out because you're like I'm just told you that what you should do is throw away your data don't care about the data

just compare to that model right it's a very sort of unfamiliar and uncomfortable thing to do um but it turns out that in this situation uh if you have enough data training a blackbox model fitting it to the data and

comparing residuals to the Black Box model instead of the data works better and I'll show you this in a moment um because intuitively what happens is that the model becomes a smooth version of the data and that's a better Target for

critiquing against than the raw data. And so just to show you what this means, here's a simulated example, where the black line corresponds to a true function, which we can know because we're simulating data; the gray dots correspond to

the data that are generated from that function and the blue line is our sort of uh is our neural network prediction and so now imagine we're trying to fit some function to this uh we're trying to you know figure out what model is the

best model of these data we have a nice theoretically interpretable model that we're trying to fit um and I told you it's what we it's okay for us to um do this uh evaluation relative to the blue

model rather than to the the data points and the idea is that in fact what's happened when we've got enough data is now the blue model is closer to the true function so if you are comparing the residuals relative to the data points

these big outlying data points are going to be the things that going to drive your residual analysis and if you compare it to the blue function you're actually going to do a better job and you can actually Express that in terms of math I'm not going to explain the

math, but for people who like math, the basic idea here is that we can think about decomposing the sources of variance that are going into those residuals. So in this case f(x)

is the true function and g(x) is our estimate, our nice theoretical model, and in one case we have the true function plus noise; that's what you get when you

compare to the data and you end up with something which reflects how well your your um theoretical model fits the true function and then a term that reflects the noise in the data but if what you

did instead was compare the residual between your theoretical model and your black-box model, which you fit to data from that function f(x), you can work out that what you get is a term which is the

difference between f(x) and your theoretical model, but you also get this other term which reflects essentially how well your black-box model fits the true function

and then how correlated the errors might be between your black-box model and your theoretical model, and if you get enough data, this can be driven below the noise variance.
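(A reconstruction of that decomposition, written out; the notation is ours rather than from the slides: f(x) is the true function, g(x) the interpretable theoretical model, h-hat(x) the black-box model, and y = f(x) + noise the observed data.)

```latex
% Residuals to the data vs. residuals to the black-box model:
\[
  y - g(x) \;=\; \bigl(f(x) - g(x)\bigr) + \varepsilon
  \qquad\Rightarrow\qquad
  \mathbb{E}\bigl[(y - g(x))^2\bigr] \;=\; \bigl(f(x) - g(x)\bigr)^2 + \sigma^2
\]
\[
  \hat{h}(x) - g(x) \;=\; \bigl(f(x) - g(x)\bigr) + \bigl(\hat{h}(x) - f(x)\bigr)
\]
% With enough data, the black box's own error \hat{h}(x)-f(x), and its correlation
% with f(x)-g(x), can be driven below the noise variance \sigma^2, so the black box
% becomes the better target for critiquing the interpretable model.
```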

So this makes a prediction that, as you get more data, there's going to be a point where

the correlation between your residuals to the data and your residuals to the true function becomes less than

the correlation between your residuals to the black-box model and your residuals to the true function. And we find exactly that transition:

if you get enough data, your residuals to the black-box model are a better predictor of your residuals to the true function than your residuals to the data are a predictor of your

residuals to the true function. OK, that was a little complicated, but we got there. And that's exactly what happens: as you do better in fitting the data, which is what the right plot shows, you get to this point where there is a transition, and in

fact it's better to critique your model against the black-box model. OK, so let's go back and now look at this in the context of our problem. If, instead of looking at the biggest residual to

the data we look at the biggest residual to the neural network um this is this is the case which is the biggest residual to the neural network uh if you look at this we have um there's a again it's a

little hard to see there's a there's a human on this side uh and a cat on this side and the human is crossing illegally and the cat is Crossing legally and our

choice model says you should kill the human because they're crossing illegally, and the neural network model says nope, and the humans say nope. And so what's going on is that our

model which said oh you should favor legal crossers is a little bit wrong uh and in fact you know uh you should um uh if you have a choice between killing a human and killing an animal you should

probably yeah you know I'm not I'm not going to interpret this as a moral prescription but people find it more permissible to kill the animal than killing the human Okay so that's a really easy feature that we can put into

our model which says look for these situations and avoid them um you can then you know do this again and again and again it's a kind of abductive loop right so now we can do this again on the left hand side this is again the biggest

residuals to the data, which are not very helpful; on the right-hand side, these are the biggest residuals to the model, and now that we've added in that extra feature, it catches a bunch of these

cases where we have um uh this is uh cats Crossing legally humans on this side this is kids crossing illegally uh adults on this side basically uh what

these errors show is that the designation of an illegal crosser should not be applied to cats and children; animals and children are not really aware of

whether they're Crossing legally or illegally so they shouldn't be penalized for it and so that's another feature we can put into the model um this is another you know next iteration what we find is that um uh uh the prescription

that it's more permissible to kill animals than humans does not apply when those humans are bank robbers, right, so we need to

have an extra sort of recognition that that's a deviation that we allow and then we keep on iterating right so uh now we discover that um uh pregnant women have an extra uh bonus that was

not reflected in the bonuses we had in the model; there's actually a very sharp deontological rule about it, and so you can add that into the model. And at each of these iterations you're discovering something kind of interesting about human moral decision-

making. You can do that for a long time, so the final model looks like this: there are lots and lots of features that come out of that. We then did a whole bunch of extra experiments where we went back and said, we found these things, are these things real? We ran new

experiments we validated that those things were real but the point here is that taking this data set which had 10 million human decisions in it using this procedure where we're critiquing against our Black Box model it's kind of like

the equivalent of doing like it's like 70 papers that you know could have been written where that paper is you know about each of those distinctions that I was showing you in each of those iterations it's just it was all done in one giant data set and then you needed

to have this kind of data analysis method to pull all of those pieces out and so there's a lot of weird interesting things that we learned about human moral decisions uh so um you find these weird higher order effects the

first-order effect: crossing illegally increases the risk of being hit. The second-order effect: unless the crossers are children, right, I showed you that one. There's a third-order effect, which is that unless those children are with an

adult uh in which case they get back to you know caring about illegality and you also find these weird unexpected interactions so um if you are a woman

then um uh if you are crossing the road and it's unknown whether you're Crossing legally or illegally right if you're jaywalking or not um uh the rate at which it's sort of considered acceptable

to be hit in that circumstance is close to the average of legal and illegal um but if you're a man it is close to Illegal right and so this is saying men are not given the benefit of the doubt in a circumstance where the illegality

of their actions is unclear okay so these are not things that you necessarily want to build into your self-driving car but they are things that it's helpful for us to know about human moral decision- making and we're able to discover them using this kind of

procedure okay so I'm now going to go back to risky choice and sort of try and land this plane right so uh so now we have a method that we can use for trying to figure out what's going on in that black box model um and when we did this
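To make the procedure concrete, here is a minimal sketch, in Python, of the scientific regret minimization loop just described. It is an illustration rather than the speaker's code; `blackbox_predict` and `theory_predict` stand in for whatever flexible and interpretable models you have fit, and the data structures are hypothetical.

```python
import numpy as np

def srm_iteration(problems, human_rates, blackbox_predict, theory_predict, k=20):
    """Rank cases by how much the interpretable theory disagrees with the black
    box (signal about what the theory misses), not by raw-data residuals, which
    at the item level are dominated by noise."""
    human_rates = np.asarray(human_rates, dtype=float)
    theory = np.array([theory_predict(p) for p in problems])
    blackbox = np.array([blackbox_predict(p) for p in problems])
    by_model = np.argsort(-np.abs(blackbox - theory))[:k]    # cases worth inspecting
    by_data = np.argsort(-np.abs(human_rates - theory))[:k]  # mostly outliers / noise
    return by_model, by_data

# In each iteration a researcher inspects the `by_model` cases, names the missing
# feature (e.g., "humans are favored over animals regardless of legality"), adds it
# to the interpretable model, refits, and repeats.
```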

Okay, so I'm now going to go back to risky choice and try to land this plane. We now have a method that we can use to figure out what's going on in that black box model. When we did this, we took the neural network that I showed you and a prospect theory model, and we asked what the neural network has captured that's not captured in prospect theory. We found that the phenomena I showed you earlier are not consistent across the entire space of problems. The phenomena of loss aversion and of giving greater weight to small probabilities, the classic phenomena that motivated prospect theory, definitely show up in the space of decision problems, but they're not

uniform across that space. We can capture that by defining what we call a mixture of theories model. The way this works is that we assume two different kinds of utility functions, which we can again estimate from data, and two different kinds of probability weighting functions, where one of those is constrained to be perfectly linear and the other is allowed to be free. Then, for every problem in our decision space, we learn a function that maps not from the problem to the probability that somebody chooses one option or another, but from the problem to which of these utility functions you use and which of these probability weighting functions you use: how sensitive you are to loss aversion, how sensitive you are to small probabilities.
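A minimal sketch of what such a mixture of theories forward pass might look like, assuming simple parametric forms; the functional forms, parameter values, and the `gate` function (which in the work described here is itself learned from data with a neural network) are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def u_prospect(x, alpha=0.88, lam=2.25):
    # Classic loss-averse utility: concave for gains, steeper for losses.
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.abs(x) ** alpha, -lam * np.abs(x) ** alpha)

def u_sharp(x, k=5.0):
    # A steeper, more symmetric utility (hypothetical illustrative form).
    return np.tanh(k * np.asarray(x, dtype=float))

def w_curved(p, gamma=0.61):
    # Classic inverse-S probability weighting: overweights small probabilities.
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def w_linear(p):
    # The weighting function constrained to be perfectly linear.
    return np.asarray(p, dtype=float)

def choice_prob(problem, gate):
    """P(choose option A). `gate` maps a problem to mixing weights (pi_u, pi_w)
    in [0, 1]; the dict keys below are hypothetical names for the outcomes and
    probabilities of the two options."""
    pi_u, pi_w = gate(problem)
    u = lambda x: pi_u * u_prospect(x) + (1 - pi_u) * u_sharp(x)
    w = lambda p: pi_w * w_curved(p) + (1 - pi_w) * w_linear(p)
    value = lambda xs, ps: np.sum(w(ps) * u(xs))
    a = value(problem["xA"], problem["pA"])
    b = value(problem["xB"], problem["pB"])
    return 1.0 / (1.0 + np.exp(-(a - b)))
```

A constant gate recovers a single theory; letting the gate depend on the problem is what allows loss aversion and probability weighting to vary across the space of problems, as described next.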

When we do that, what we find, as I said, is that these things are not evenly distributed across the space. This is showing you again the space of choice problems: more yellow means more of the linear probability weighting function, more red means more of the curvy, classic probability weighting function; more green is the classic loss-averse utility function, more purple is the sharper, more symmetric utility function. You can go in and figure out in the data what's going on, but basically they correspond to this: you don't see the classic kind of probability weighting for problems where the probabilities are in the middle of the range, so it does affect small probabilities but not so much in the middle. And for loss aversion, when you have problems that are close to being dominated, where one option is close to being strictly better than the other, people just home in on those small differences: one thing being slightly better than another is enough to push them in that direction, and as a consequence you get this very sharp and symmetric utility function. You can use the same kind of model to explore individual differences, where it turns out, when we look at another data set that we collected, that individuals vary in the extent to which they use these different ways of making sense of their choices.

Okay, so this approach that I've talked about is in many ways a general paradigm that's now being applied by us and by other people in a lot of different settings. That paradigm is: collect a large data set, optimize theory-constrained and black box models, and then iteratively critique the models against the black box to try and figure out what it is that you're missing. We used this in the context of this risky choice setting, but we've also recently used it in the context of two-by-two games of the kind studied in classical game theory, where we're able to use this approach to understand things about how much people iterate when they're trying to make sense of the play of their opponents, and how that's modulated by the complexity of the task they're performing. It's also been applied to things like spatial working memory and planning, and there's a cool recent paper that looks at how this might apply in the context of reinforcement learning. The effectiveness of the approach increases with bigger and better black boxes, so as we're able to leverage these machine learning tools, they can be used to get insight into exactly how we might want to impose constraints and the kinds of principles that we might want to use in explaining human behavior. Okay, so those are three methods for combining theory and data: method one, theory-based pre-training; method two, differentiable theories; method three, scientific regret minimization.

I promised to tell you how we can create systems that have good inductive biases for understanding and predicting human behavior, and the kind of approach that I just outlined is a way to do that. It's a way to come up with better theories that you can then feed back into the system as a way of generating things like synthetic data that you can use for pre-training, such that you can create models that instantiate strong inductive biases. In some other recent work that I'm not going to have time to talk about, we've generalized that to something we call inductive bias distillation, which uses more sophisticated machine learning methods like meta-learning to take an inductive bias from a probabilistic model, like a Bayesian prior distribution, and transfer it into a neural network; that expands the space of theories we're able to translate into inductive biases in this way. But one question you might have is how this kind of inductive bias compares to the inductive biases of the kinds of AI systems that

are making the news lately. Large language models like ChatGPT or Claude or Llama or Gemini have some kind of inductive bias that they've learned from the training data they've been exposed to, and very briefly, we've looked at this; I'll tell you a story in four papers extremely quickly here. First of all, if you take large language models off the shelf and use them to try to predict the kind of choice data I was showing you, they are not great at it, and the way in which they're not great is that, as the title says, they assume people are more rational than we really are. Off-the-shelf models do some kind of mixture of just doing weird stuff and calculating expected value for the gambles: multiplying the probabilities by the utilities, adding those up, and saying okay, choose the one that has greater value. That's an entirely rational thing to do; it's not the thing that people do, as reflected in the data I showed you before, but it's the thing that the models assume people do.
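As a small illustration of the point (with made-up numbers), the rule the off-the-shelf models seem to approximate is just expected value with a hard maximum:

```python
# Expected value: sum of probability times outcome for each gamble, then argmax.
gamble_A = [(0.05, 100.0), (0.95, 0.0)]   # 5% chance of 100, else nothing
gamble_B = [(1.00, 6.0)]                  # 6 for sure

ev = lambda gamble: sum(p * x for p, x in gamble)
print(ev(gamble_A), ev(gamble_B))                                  # 5.0 6.0
print("EV-maximizer picks:", "A" if ev(gamble_A) > ev(gamble_B) else "B")  # B
# Many people nevertheless choose A in problems like this, consistent with
# overweighting the small probability of the large win.
```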

Consistent with this, we showed in another paper that if you take the equivalent of these kinds of large language models and train them on a data set that corresponds to arithmetic, in this case the arithmetic that you use for expected value computations, you get exactly the same score in predicting people's choices that you do if you take these models that have been trained on massive linguistic corpora. A funny thing that came out of that first paper: using chain of thought or other kinds of inference-time compute methods that improve performance on many kinds of tasks does not work well here; in fact, it makes the models assume that people are even more rational, which is consistent with the fact that it enables this kind of math-based reasoning, so it strengthens the correspondence with expected value theory. The prompt intervention that we found had the biggest effect in producing something that was more like human behavior was telling the language model that it was not predicting a human but predicting the choices of a monkey, so you can make of that what you will. The other thing is that fine-tuning

helps. In a nice paper by Marcel Binz and Eric Schulz, they showed that you can take these off-the-shelf models and fine-tune them on these kinds of data, holding out data in the way we did in those previous analyses, and then they can actually do better and get closer to the kinds of models I was showing you before. That principle is instantiated in a paper we collaborated with them on, where they fine-tuned a large language model on a massive amount of psychological data, from a bunch of different kinds of tasks, and then evaluated performance. What you see when you do that, across all of the different tasks that we used from these psychological experiments, is that the light gray lines correspond to standard cognitive models, the pink lines correspond to off-the-shelf Llama, and the purple lines correspond to this model, which they called the Centaur model because it's a human plus a llama; I guess somehow if you put those together in the right way you get a centaur. It actually does result in a significant improvement in performance. These kinds of models are interesting: they're not explanatory models, but they're powerful predictive models, and they're something you can use in the context of some of the methods I was talking about before. You can use this as the basis for critiquing a theoretical model, doing that scientific regret minimization procedure, and the nice thing about these fine-tuned large language models is that you can do it for arbitrary tasks. So now you can give it a new psychological task you've run, compare it to your data, and if you've got a theoretical model that you'd like to critique, you can critique it against the predictions that come out of the Centaur model, and that gives you a tool to do

this kind of theory iteration. Okay, so conclusions. First of all, psychological theory can be used to create models with good inductive biases for predicting human behavior, and some of the methods I talked about, theory-based pre-training and differentiable decision theories, are cases where we're using psychological theories in different ways to constrain those models. Second, machine learning can be used to guide the development of better explanations for that behavior, through things like scientific regret minimization, although those explanations might end up being more complicated than we expect. Remember, when I showed you the result of following that procedure for the moral reasoning case, it was something like 70 different factors that it had identified. As we get more data, having machine learning models is very helpful for allowing us to make sense of those data, but it also reveals some of the intrinsic complexity of human behavior, and one of the things that we need to get comfortable with as scientists, as we start to work with these larger data sets, is that the kind of simple explanations we might desire for intrinsically complicated things like human behavior might not in fact be the things that come out of those analyses. So thank you very much. I think if you want to ask questions, there are microphones; there's

one here and there's one over here. Testing, testing. Hi, I have a question about the regret minimization: you showed that there's this crossover point at which using the model instead becomes useful for a new setting; is there a diagnostic to know whether or not we've crossed that threshold? Yeah, so it relies on you having an estimate of what the noise is in your data: the crossover point is determined by the point at which the error of your predictive method falls below the expected noise threshold.
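A rough sketch of that diagnostic, assuming you have repeated human responses per problem so that the noise floor can be estimated from independent split halves of the participants; this is an illustration, not the speaker's procedure.

```python
import numpy as np

def noise_floor(rates_half_a, rates_half_b):
    """Per-problem choice rates from two independent halves of the participants;
    their disagreement gives a rough estimate of the irreducible noise."""
    a, b = np.asarray(rates_half_a, float), np.asarray(rates_half_b, float)
    return np.mean((a - b) ** 2)

def past_crossover(model_preds, observed_rates, rates_half_a, rates_half_b):
    """True once the model's prediction error has dropped to the noise floor."""
    err = np.mean((np.asarray(model_preds, float)
                   - np.asarray(observed_rates, float)) ** 2)
    return err <= noise_floor(rates_half_a, rates_half_b)
```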

Hey Tom, thanks for the really inspiring talk. The theories that are produced by these methods will be theories of human beings under certain sampling assumptions about where the data come from: that you've got a random sample of humans and that there's random variation in humans' preferences or their cognition. Those aren't true, right? So I wonder how you've thought through the problem of biased sampling in the data sets, and then meaningful variation among the human beings. Yeah, so in this case we're working with the kinds of data that we can get, and in particular the kinds of data that we can get at scale, so we're definitely constrained by that. We see some of that already if you look at differences between online and lab samples: one of the things on this slide where I showed you the performance of the best human model is that this model was calibrated on lab data, and if you calibrate it to our data it does a little bit better, so that's something which matters. The other thing I showed you briefly was being able to capture individual differences, and I think that's a way you can start to think about that variation: once you have constrained models, at least of some population, you can make models of individuals within that population, and you can think about how the distributions of individuals vary across populations. There's no way to answer those questions without being able to collect data from a wider range of populations, and these sorts of online data collection tools are starting to increase the range of populations that you can sample from. We now have a collaborator who has data from about 60 different countries, where we can look at that variation, but that's obviously a really important thing to think about. Thank

you. I just want to broaden that a bit: the generalizability of your answers may be very dependent on your data set. For example, if you had asked the same questions to cat lovers, you might have gotten very different answers on their choices, so that's a caveat: all the extrapolations only apply to the people who made up the data set. The second question has to do with prompting, which could bias many of the foundation model answers; they are very reluctant, for example, to criticize your question, so if I ask, am I right on this, they will hedge and never, or very seldom, say I'm wrong. So the prompting has a lot to do with the type of answers you get, and I wonder if you've taken that into consideration. Yeah, so on the first question, one nice thing about the Moral Machine data set is that they have a lot of

information about what the sources were for the data, and in particular you can look at things like variation by country as well as variation by demographic groups. We made a model where we tried to look at those kinds of individual-difference variation in that data set, and in all the ways that we tried it we weren't able to find an improvement in performance as a consequence of doing that, which doesn't mean that those effects aren't there; it just means that they weren't things that we were able to detect at a scale that made a difference in prediction performance. For the second question: for the results that I told you about here, we tried a range of different prompts and we see very similar results. The three main framings we were concerned about were a what-would-you-do question, a what-would-you-predict-a-human-would-do question, and a kind of simulate-a-human question, which correspond to at least three different kinds of settings in which large language models are used, and across those three different settings we found very similar performance. We also tried doing different demographic prompting in that setting as well. Hi Professor, thank you so much for coming. My name is Bessie and I'm currently a student here and I study computer science. Most of today's

talk focused on human decision-making on an individual basis, and as I was listening to your talk I was thinking that another thing we do on a daily basis is collective decision-making and collective thinking. So I was wondering if you have any thoughts on, or have done any research or experiments on, how techniques might change when models are acting as agents whose priors are different but, through discussion and debate, have to change those priors and reach a unanimous, or some sort of uniform, posterior at the end for a good output. I was just curious if you had any thoughts on that. Yeah, so we've done a bunch of work on different kinds of processes of cultural evolution. In that setting, most of the questions we've focused on are more about how you can create circumstances where people are actually able to improve over time: if you have a community that's trying to figure out how something works, or trying to come up with good algorithms for solving a problem, what are the mechanisms that make it possible to do that? We've got a few papers where we've actually run some of these large-scale experiments that explore that. One of the things that's effective is being able to see not just what somebody else did, but how well they did as a consequence, which supports a kind of social selection effect. And we currently have a collaboration with some folks where we're looking at AI-mediated interactions and how those can help support people finding better solutions in those settings. Awesome, thank you so much. I was just wondering whether you have any plans to investigate the predictions from game theory; some of

them seem too notional, you know. Yeah, so I briefly mentioned the game theory case; let me just show you a picture. What we did in that setting was a very similar kind of approach, where we take a two-player game and think of our vector space as corresponding to the payoffs associated with each of the options for the two players. Now we're mapping this vector into a space which is, if we just focus on the row player, the probability that the row player chooses option A. So we can think about that as a problem where we want to map that space to a probability in the same way, we can define meaningful classes of problems, and we collected a massive data set of 2,416 matrix games distributed across these different classes of games. In this setting, one of the key scientific questions is about what are called level-k models: how much people think about what move the other person is going to make when selecting their own move. A lot of the debate had been about whether people do this one step or two steps, framing the problem in a very dichotomous way, in the same way that in the simple risky choice case it was, is it this theory or is it that theory? In that case we showed it's kind of both: this is something which varies across the space; it's not which theory is right, it's where these theories are right. We did the same thing in this two-player game setting, where what we're able to show is that the degree to which people iterate varies by problem, and we can construct a very straightforward predictor of that variation: we start off with a black box model, but we're able to pull out interpretable features that correspond to essentially the complexity of the game, which you can quantify in terms of things like the number of equilibria. So, yeah.
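For readers unfamiliar with level-k models, here is a minimal illustrative sketch (not the authors' model) of level-k reasoning in a 2x2 game; the payoff matrices are made up so that level 1 and level 2 predict different choices.

```python
import numpy as np

def best_response(payoff, opponent_probs):
    # Choose the action with the higher expected payoff against the opponent's
    # assumed mixed strategy (returned as a one-hot probability vector).
    expected = payoff @ opponent_probs
    return np.eye(2)[np.argmax(expected)]

def level_k_row(row_payoff, col_payoff, k):
    """Level 0 randomizes uniformly; a level-k player best-responds to a
    level-(k-1) opponent. Returns the row player's predicted choice."""
    row, col = np.array([0.5, 0.5]), np.array([0.5, 0.5])
    for _ in range(k):
        row, col = (best_response(row_payoff, col),
                    best_response(col_payoff.T, row))
    return row

# row_payoff[i, j]: row player's payoff when row plays i and column plays j;
# col_payoff[i, j]: column player's payoff in the same cell (made-up values).
row_payoff = np.array([[4.0, 1.0], [0.0, 3.0]])
col_payoff = np.array([[1.0, 3.0], [2.0, 4.0]])
print(level_k_row(row_payoff, col_payoff, k=1))   # [1. 0.]  first action
print(level_k_row(row_payoff, col_payoff, k=2))   # [0. 1.]  second action
```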

Thank you so much for a wonderful talk. I have a question regarding some of the learnings you went through while training the models or creating the prompts, because we have a mental wellness startup, Happiness Factors, where we're trying to train models for mental wellness solutions as well as create the right prompts so that the user gets the right information they need. What suggestions would you give to someone who is still starting in the field? Yeah, okay, well, I think one thing you could think about there is that you don't have a human prediction problem so much as a problem of mapping from prompt space to some kind of complicated human outcome. So the first thing you need is data in order to be able to figure that out. Then I think one of the challenges with doing that sort of prompt optimization is that the space you're optimizing over is much larger than the space we're considering in these kinds of problems: part of what made it possible for us to cover the space with the data set that we collected is that we have these relatively constrained classes of problems. We have an effort underway at the moment where we're looking at more open-ended problems that can be described in text, and you can build similar kinds of models in those settings; it's just a little more complicated. But yeah, I think step one is you need a lot of data. Thank you. Thank you for the very interesting talk. In the example with the trolley problem, you showcased an example where both the

black box model and the theory-driven model got it wrong, and it's sort of an extreme situation. I wonder, through the training process, have you encountered any situation where the black box model got it wrong but on the surface it doesn't feel like an obvious outlier? Yeah, that's a good question. We didn't look at that in that data set, but that would be a good complementary analysis to do; that's something where what you would expect it to tell you is something about where your data coverage is poor, so the model is generalizing poorly, or something like that. Thank you. Hi Professor, my name is Patrick, I'm a

student here. Thank you so much for the speech. My question is about something you mentioned: when you were extracting the theories from psychology, there have been a lot of papers, a lot of literature, in the hundred or so years of history of the development of these disciplines, and it means a lot to extract the theories that are present in that literature. I'm wondering if, besides having graduate students read them, understand them, and construct models of these theories, it would be possible to have, for example, an automated pipeline for mathematically modeling all these theories and testing them. Yeah, I think that's interesting. In the risky choice case we were able to recapitulate a lot of that history in that single experiment, and you might expect that you'd be able to do similar things as you collect large data sets in other settings too. One thing that we're interested in and have been thinking about is whether you can use large language models to read the research literature and then generate hypotheses in these settings, which then become meaningful classes of hypotheses to evaluate. We've started to think about that, and that's obviously a good way to at least lighten that load. But I think honestly part of the point here is also that there's relevant expertise which is useful. This is part of my pitch, and my fear, for psychology: I think psychologists have figured out a bunch of nuanced things about human behavior, but if we can't figure out how to leverage those nuanced things, they will become irrelevant in the face of data-driven analysis, and so we need to work out, for the future of that field, how to find the right balance between theory and data. Thank you so much for the answer. Thanks. Hi Professor, thanks for the talk.

I was wondering: your work focuses a lot on specific scenarios, and I was wondering if you've worked on or thought about modeling the individual, so that, given that they made these choices among these different scenarios, we can predict what they'd do for this scenario. Yeah, so we have; I sort of quickly showed this part. This is a slightly different data set: in the previous data set I was showing you, we just focused on getting coverage of the choice problems, so we had those 13,000 problems and about 20 responses per problem. We collected this other data set where we now have a thousand decisions for each individual, and we can look at how well we can do at predicting those individual choices. What we're doing here is building a separate model for each individual, so each of these lines corresponds to what the utility functions and probability weighting functions look like for those individuals, and then we can also answer questions about the population, like the fact that there's meaningful variation in the extent to which individuals seem to do things like probability weighting. In that data set we can look at how much better we get at predicting decisions the more decisions we've seen for an individual, and how much better we get as a consequence of modeling the population as well. Thank

you. Hello Professor, thank you for the talk. I was mainly interested in the method where you were talking about using pre-training data based on a theory that you simulated from. I come from a data science background, so I understood the second part of it; could you give me some guidance on where to look for which theories to use, and what kind of data to gather? Yeah, so in this case the theory that we were simulating data from is the one that corresponds to this dashed line here: we took the best-fitting psychological theory and used that as the thing that we would then simulate data from. So you can actually

think about this as a pipeline where you say: come up with your interpretable, sensible, motivated theory; fit that theory to the data; then use that theory to generate a much larger data set that you use to pre-train your neural network; and then go back and fine-tune that neural network on the data. All that you're doing is making it so the neural network is instantiating the theory, so the function it has learned corresponds to the theory, and that becomes the starting point that you're using for the fine-tuning of the neural network itself.
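A minimal sketch of that pipeline, assuming a regression-style prediction target and data already held as float tensors; `fit_theory` and `simulate_from_theory` are placeholders for whichever theory you pick, and nothing here is the authors' actual code.

```python
import torch
import torch.nn as nn

def pretrain_then_finetune(human_X, human_y, fit_theory, simulate_from_theory,
                           n_synthetic=100_000, pre_steps=500, fine_steps=100):
    # 1. Fit the interpretable theory (e.g., a prospect theory variant) to the human data.
    theory = fit_theory(human_X, human_y)
    # 2. Use the fitted theory to generate a much larger synthetic data set.
    synth_X, synth_y = simulate_from_theory(theory, n_synthetic)
    # 3. Pre-train a neural network on the synthetic data, so the network starts
    #    out instantiating the theory (this is the inductive bias).
    net = nn.Sequential(nn.Linear(human_X.shape[1], 64), nn.ReLU(),
                        nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
    loss_fn = nn.MSELoss()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(pre_steps):
        opt.zero_grad()
        loss_fn(net(synth_X).squeeze(-1), synth_y).backward()
        opt.step()
    # 4. Fine-tune the same network on the small human data set.
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)
    for _ in range(fine_steps):
        opt.zero_grad()
        loss_fn(net(human_X).squeeze(-1), human_y).backward()
        opt.step()
    return net
```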

So the answer to which theory you should use kind of depends on what problem you're solving: you've got a problem that you want to solve and a setting you want to be able to learn in, and reading about the kinds of theories that people have had in that setting gives you the information you need in order to figure out how to generate the synthetic data. Okay, that helps, thank you.

Okay, hi, thanks for the talk. Over the past few years there's been this proliferation of machine learning interpretability methods, things like causal interpretability and mechanistic interpretability. How much potential do you see in trying to open up some of these black box models and understand what they're doing? Yeah, I think that's definitely a thing that's worth doing, but I am a little skeptical about how well it will work in all of these settings. Often what you get out of that is something like some features that you're able to say seem to be useful. The kinds of interpretability methods you use constrain the kinds of explanations that you can get: you can get an explanation that says it seems like this feature is relevant, or it seems like that feature is relevant, and we did something like that for the black box models that I was showing you. But I think taking an approach where we define the kind of explanation we want in a way that corresponds to the sorts of theories that are normal in that domain gives us a richer set of options for being able to pull something out of those models. If I say the thing that I really want out of this model is something that fits this rational choice framework, and I want to find the best thing that is in this rational choice framework plus these extra features, that imposes a constraint on the classes of functions I'm going to use to approximate that thing. So what you're doing is building a sort of parallel approximation by building up that interpretable model class, rather than trying to break down the features that are being used inside the black box. I think these are sort of parallel methods; it's just more a matter of what kind of explanation you're looking for. Cool, thanks. Hi Dr. Griffiths, I was curious about your thoughts on how you might model the utility function when there are multiple

types of outcomes that people are weighing. For context, I'm researching the impact of colorectal cancer screening, and there are different types of modalities with different cost-benefit trade-offs: colonoscopy is the most effective type of screening at preventing and detecting colorectal cancer, but a lot of people don't want to do colonoscopies; they'd rather do stool tests because they're more convenient and, for people without insurance, less costly. So I was just curious whether there's even a possible way to model a universal utility function. Yeah, that's an interesting question. We have been starting to do something like that using, as I mentioned before, text-based descriptions of decisions, where we have some of those in a medical context and we can calibrate those against the actual preferences that people express. I think that's very much like the sort of approach that's taken in the Centaur model that I mentioned, where what you might do is start with something like a large language model and then try to fine-tune that on actual decision preferences that people express, in order to figure out if you can predict the decisions they make in those settings. In this model there's nothing like a realized utility function, but it's doing something like learning how to capture those preferences in a way that might be useful. Thank

you. Hi, thank you for this talk. I have a question that's a little unrelated to the content of this, but I was just curious, because you had an earlier book, Algorithms to Live By, about the computational principles that we could learn from algorithms. I was just curious, listening to this, now that large language models can also simulate human behaviors, what is a principle that you would maybe add to a chapter of your book if you were writing one? Oh, okay, that's interesting. I don't think I would change anything. The point of that book was that for many of the kinds of decision situations people find themselves in, we know what optimal solutions to those decisions look like, for things that are optimal stopping problems or explore-exploit problems, these very common kinds of problems people encounter. So the real focus of that book was on these kinds of rational solutions that people might not think about, but also a different way of thinking about what rational action is, one that takes into account the effort you put into trying to decide what you're going to do. I have a whole other line of work about that, trying to understand how people navigate those trade-offs and to come up with a better characterization of what we call resource rationality, which is figuring out how to allocate computational resources to making good decisions. In terms of the things I talked about here, though, I think the one thing some of these things might do is help you understand how people are able to make reasonable decisions outside the settings where those optimal solutions apply. The big problem we encounter when we try to use those optimal solutions is that not every decision problem that people face fits into the schema of one of those optimal strategies, and despite that, people are pretty good at coming up with strategies that they'll use for making decisions in new situations. That's the kind of thing that I think some of these things, like the large language models, might be helpful for: thinking about how to solve those kinds of problems of interpolating between optimal cases to find what reasonable decision strategies might look

like. We have a last question. Yeah, I'm really curious about the details of the fine-tuning: as you know, there are two different ways to fine-tune, supervised fine-tuning or reinforcement learning with human feedback, and I was wondering which one you used. Yeah, this is supervised fine-tuning, and it's using LoRA, so it makes it a little bit easier to fine-tune the model. Okay, all right, thank you. Thank you.
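For readers who want to try something similar, here is a minimal sketch of supervised fine-tuning with LoRA adapters using the Hugging Face peft library; the base model name and the training details are placeholder assumptions, not the setup used for the Centaur model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-7b-hf"   # placeholder base model, not necessarily the one used
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the base weights and learns small low-rank adapter matrices,
# which makes supervised fine-tuning on behavioral data much cheaper.
config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()

# From here, standard supervised fine-tuning: tokenize a text description of each
# task together with the observed human response, and minimize cross-entropy on
# the response tokens (e.g., with transformers.Trainer).
```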
