
Terence Tao & Riley Tao: Turning AI’s Firehose Into Usable Science

By SAIR

Summary

Topics Covered

  • AI as a Jet Engine: Powerful but Needs a Pilot
  • AI Fuels Discovery: Exploration and Human Proof
  • AI as a Filter: High-Volume Sewage to Drinkable Research
  • Benchmarks Become Artificial; Real-World Deployment is Key
  • Comparative Advantage: AI for Data, Humans for Creativity

Full Transcript

[music] Hi, I'm Riley Tao. This is my father, Terence Tao, and I'd like to ask you a few questions about this AI thing I've been hearing so much about.

>> So I've heard that you've been using AI to assist with your mathematical research. How do you use AI to help solve math problems?

>> Yeah. So AI is becoming very capable, and I'm finding more and more ways to use it reliably. Right now I liken the technology to a jet engine. A jet engine is really powerful; it can accelerate you to really high speeds. But we don't strap jet engines onto people so they can fly around in jetpacks; that's too dangerous. On the other hand, we do have very safe, reliable planes that use jet engines to cross the Atlantic and so on, very safely. So we are transitioning right now from a technology which is extremely powerful but unreliable, which makes so many mistakes you can't trust its output, to more and more use cases where we have found ways to check and verify those outputs, and they're actually quite useful in research now.

For example, I routinely use AI tools to do literature review. If there's a problem I want to work on, I ask an AI whether any work has been done on it before, and it will give me some references. Now, some of them may be hallucinated; they may not exist. But I can go look up the actual articles and read them, and more and more of the time, a large fraction of the references are actually legitimate and useful. It's already a game-changer in many ways. I didn't use to code very much in my research; I can do a little bit of coding, but it's a bit difficult for me. It's so much easier now to ask an AI to write a short program. I know enough programming to debug it and make sure it's outputting exactly what I want. So now my research has a lot more simulations and code than it had before.

>> Yeah. So you mentioned using these simulations and code, but your research is all in math. Are there any examples of projects you've done where these simulations and code have really helped with one of your math research problems?

>> Right, so yes. For example, I've worked recently with a team at Google DeepMind on their tool called AlphaEvolve. It can't solve every math problem, but it can construct examples of objects that optimize a certain score: you give it a score, and it will try its best to find objects that have the best score. So there was a problem in pure math where I wanted to find certain configurations of lines, put together so that they overlap as much as possible. It's called a Nikodym set, and I wanted to find Nikodym sets as small as possible. We got AlphaEvolve to work in small situations, where you have maybe 50 or 60-odd lines and a certain number of rules on how they can be configured, and it came up with some clever new constructions. Well, clever to us; it was using ideas that were in the literature, but that we weren't aware of, to construct some new examples of these sets for fairly small parameter sizes. But we could take the examples it produced and read the explanations of the code that generated them, and I was able to find a general construction that still worked for arbitrarily large parameter sizes, and that was new. So I just wrote a paper on this, which is a human-generated paper: all the proofs are written by me, but the idea of the construction came from these AI tools. So AI can do exploration, come up with interesting examples, suggest conjectures, and then you can get humans to actually go verify them and prove rigorous results. I think in the future maybe AI will do more and more of this process, and maybe just the final stage will be done by humans. But already we're seeing great partnerships between AI and humans.
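
AlphaEvolve itself evolves programs with an LLM proposing mutations, so the following is only a loose, minimal sketch of the idea Tao describes: you supply a score, and the system searches for objects that improve it. The representation, toy objective, and all names here are invented for illustration.

```python
import random

def score(candidate):
    # Toy objective (stands in for a real mathematical score, e.g. the
    # size of a combinatorial configuration): count the 1-bits, but
    # penalize adjacent pairs of 1s, so the optimum alternates 1,0,1,0,...
    ones = sum(candidate)
    adjacent = sum(1 for a, b in zip(candidate, candidate[1:]) if a == b == 1)
    return ones - 2 * adjacent

def mutate(candidate):
    # Flip one random bit to produce a neighboring candidate.
    child = list(candidate)
    i = random.randrange(len(child))
    child[i] ^= 1
    return child

def evolve(n_bits=20, steps=5000, seed=0):
    # Simple random-mutation hill climb: keep any child whose score
    # does not decrease, so the search can also drift across plateaus.
    random.seed(seed)
    best = [0] * n_bits
    for _ in range(steps):
        child = mutate(best)
        if score(child) >= score(best):
            best = child
    return best, score(best)

best, s = evolve()
print(best, s)
```

The real system replaces the random `mutate` with an LLM that rewrites candidate programs, but the score-driven selection loop is the same shape.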

>> Yeah. So what I'm hearing is that the shape of these partnerships is that the AIs can find problems and find things in the literature that other people have done, and bring them to the humans for inspiration, and for creating a more creative solution out of those base parts.

>> Yeah, that's where the technology is at right now. It's similar to, say, search engines. Maybe you don't remember when Google came out in the early 2000s, but I remember just how amazing it was: you type in a search prompt and you get all the relevant hits, all the web pages you may not have known about. It's an incredibly useful tool that you take for granted now. AI is now at a similar stage: you have a math problem, for example, and it can give you a list of all the possible techniques that might be useful. It can even apply them and see whether any of the standard techniques, or some combination of them, actually work. It still makes mistakes, but just that capability alone is already a big boost, and mathematicians routinely use these capabilities today.

>> So you've mentioned AI making mistakes a couple of times now. Is there a way around that? Is there a way to make sure that AIs don't make those mistakes?

>> Yeah. So inherently, these large language models, which are what power these modern AI tools, are stochastic: they randomly generate text based on what they think is the most likely next word to say after the previous text. And just by their nature, they will always make mistakes. You can find ways to lower the error rate, but they're inherently unreliable. They're guessing machines, basically.

But in math, at least, and hopefully also in the sciences, we have other ways to verify output. For example, we have these things called formal proof assistants: if you write a mathematical proof in a specific language, rather like a computer programming language, you can get a compiler (not an AI, but a more trustable piece of computer code) to check whether your proof is correct or not, just as we can compile a program and see whether it executes or not. So we do have this way of checking with 100% confidence; we can grade the output of these AIs. There's a lot of work right now to get AIs to output their mathematical arguments in this language, and to get these other tools to verify them. If it doesn't work, you go back and say, "Try again; your proof had this error in it." And so we can use AI safely and reliably because of these tools. Now, it's not perfect right now. Especially for very complicated mathematics, AI still struggles to write proofs in the formal language. But at least for certain simple, elementary types of mathematics, like combinatorics, this already works quite well. So in math, at least, we have the ability to filter the output.
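
As a concrete illustration of what such formal verification looks like, here is a tiny (hypothetical) example in Lean 4 with Mathlib. The compiler either accepts the proof or reports exactly where it fails; there are no judgment calls involved.

```lean
-- A small lemma stated and proved in Lean 4 with Mathlib.
-- If the proof were wrong, the file would simply fail to compile,
-- and the error message would point at the bad step.
import Mathlib.Tactic

theorem add_comm_example (a b : ℕ) : a + b = b + a := by
  omega   -- decision procedure for linear arithmetic over ℕ and ℤ
```

An AI can be asked to emit its argument in this language; the checker, not the AI, then decides whether the proof stands.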

One analogy I like to give is that math and science are like drinking water. You want some drinking water, and the current scientific output is like a tap which produces water you can drink, but at a very low rate. AI is like a fire hose of high-volume, high-velocity sewage water. It is not drinkable, but it is much higher volume than the little high-quality tap water we've been producing in the past. So what we need is a filter, a water filter that can get rid of all the crud and produce high-volume, drinkable research, so to speak. And in math, I think we really have a chance to make this happen, because we understand verification very well in mathematics. We've understood the laws of logic and of mathematics for over a century, and we've even taught computers how to check the output. In science we have experiments and clinical trials and simulations. These are not quite as reliable as formal math verification, but I think they can also be used in similar ways to filter the output and keep it honest. So what I hope to do in the future is start with mathematics, figure out how to use AI safely in mathematics, and then a lot of the lessons we learn can be generalized to the rest of science.
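
The fire-hose-and-filter picture corresponds to a generate-and-verify loop: accept a high volume of unreliable candidates, and keep only those that pass an independent, trusted check. A minimal sketch, with an invented random "generator" standing in for an AI and integer factorization standing in for a research claim:

```python
import random

def untrusted_generator(rng, n):
    # Stands in for a high-volume, unreliable source (the "fire hose"):
    # proposes candidate factorizations of n, most of them wrong.
    a = rng.randrange(1, n + 1)
    b = rng.randrange(1, n + 1)
    return a, b

def verifier(n, a, b):
    # Independent, trusted check (the "filter"): cheap and exact,
    # like a proof checker grading an AI-written argument.
    return a * b == n and a > 1 and b > 1

def filtered_stream(n, attempts=100_000, seed=1):
    rng = random.Random(seed)
    verified = set()
    for _ in range(attempts):
        a, b = untrusted_generator(rng, n)
        if verifier(n, a, b):
            verified.add((a, b))
    return verified

print(sorted(filtered_stream(91)))  # nontrivial factorizations of 91
```

The generator's error rate is enormous, but because the verifier is exact, everything that survives the filter is trustworthy; only the throughput, not the correctness, depends on the generator's quality.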

>> Okay. So, in a nutshell, you can use math to make sure the AI is right, and then you can use the AI to make more math.

>> Oh yes, it's a great partnership. We only have so many math PhDs in the world; it's not a huge discipline, because there isn't that much money in it and it takes a lot of training. And there are many, many more math problems, both in pure math and in applications to the sciences, than there are mathematicians to solve them. So we need help, and now we have AI assistants. With AI we can also bring in the broader public. There are now some large community projects, which were not possible before, where you can take a complex math problem, break it up into lots and lots of pieces, and then hand each piece to some amateur mathematician, maybe armed with some AIs or some proof assistants or other tools. We no longer need every person to be a math PhD who understands the entire project in order to contribute one little piece; we can crowdsource a lot of projects. And math is now so central everywhere. In all modern technologies, not just AI, and in any complex problem involving, say, climate change, or new energy sources, or treating new diseases, you would need at some point to solve some complicated math problems. So we need all the help we can get. With all these new technologies, we can get AI to help us, we can get the public to help us, and we can get other scientists and mathematicians to work together. I think it's a great future for math and science.

>> Yeah, it's a really optimistic outlook. I've heard that there are AI companies who have managed to win gold medals in the International Mathematical Olympiad, a pretty popular math competition. Do you think that these kinds of benchmarks are a good way to measure the capabilities of current AIs?

>> That's an excellent question. Benchmarks are a great goal to move towards, until you get too close to them. Say this location represents the current state of AI, here's a benchmark that you want to get to, and over here are the real-world problems that you want to solve. Initially, when you move towards the benchmark, you're also moving towards getting better at the real-world problems, and that's great. But at some point these benchmarks are artificial. For example, Olympiad problems have some features in common with real-world research problems, but they're not identical, and so if you focus too much on optimizing just for your benchmark, you may actually move away from your real-world problem and end up with a tool that is too overfitted to the benchmark and no longer suitable for your problem.
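
Benchmark overfitting can be illustrated with a toy curve-fitting sketch (all numbers invented): a maximally flexible model scores perfectly on the benchmark points but degrades badly on unseen inputs, while a simpler model with a worse benchmark score generalizes better.

```python
import math

def lagrange_interp(xs, ys, x):
    # Degree-(n-1) polynomial through all n benchmark points:
    # the "model" that is perfectly tuned to the benchmark.
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def linear_fit(xs, ys):
    # Simple least-squares line: less flexible, more robust.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def target(x):
    # Hypothetical "real world": a linear trend plus a small wiggle.
    return 2 * x + 1 + 0.1 * math.sin(5 * x)

bench_xs = [0, 1, 2, 3, 4, 5]            # the "benchmark"
bench_ys = [target(x) for x in bench_xs]
world_xs = [6, 7, 8]                     # unseen "real world" inputs

slope, intercept = linear_fit(bench_xs, bench_ys)
poly = lambda x: lagrange_interp(bench_xs, bench_ys, x)
line = lambda x: slope * x + intercept

def mean_abs_err(predict, xs):
    return sum(abs(predict(x) - target(x)) for x in xs) / len(xs)

print("benchmark error: poly=%.4f line=%.4f"
      % (mean_abs_err(poly, bench_xs), mean_abs_err(line, bench_xs)))
print("real-world error: poly=%.4f line=%.4f"
      % (mean_abs_err(poly, world_xs), mean_abs_err(line, world_xs)))
```

The interpolating polynomial "aces the test" with zero benchmark error, yet its real-world error is far larger than the humble line's; optimizing the benchmark past a point moved it away from the actual goal.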

This is something that people figured out long before AI. There's a lot of debate in education circles, for instance, about how much we should rely on standardized testing. Some standardized testing is useful to give students some idea of how much they understand the subject. But if you make a standardized test too much of a focus in your courses, if you only teach to the test, then at some point students only learn tricks that are good for solving that specific test, the SAT or whatever, at the expense of other types of learning. And so you can get students who ace these standardized tests but are actually not the best students in your class. So yeah, a lot of these benchmarks are getting saturated now, and we have to move beyond that. Some people are trying to develop even more advanced benchmarks, which are somehow closer to real-world problems. But I think what we really need to do is actually start deploying these tools in the real world and see how people in the wild are actually using them, because benchmarks at some point can only get you so far. We need real-world data. So

one thing in particular I'm hoping to do, with the support of funders such as the S foundation, is to get my group of students and postdocs to develop really practical tools powered by AI that works, AI that is already mature. It may not be the cutting edge, maybe a year or two old, but you can already deploy it right away: so that, for example, you can take a simple math inequality and generate a proof of it very reliably, or write down a statement and get a list of all the relevant papers that could be useful to you. There's a lot of technology that is basically almost already out there, but it's still a bit clunky and not always reliable. If we can make it extremely user-friendly and geared to real-world research problems, I think that will help push the AI field forward, because benchmarks have accomplished almost everything they can at this point.

>> So if our current benchmarks are saturated, as you put it, is there another goal that we should be looking towards for the next step in AI?

>> Yeah, that's a great question. So for me, it's just real-world practicality, what some people call the last mile of software development. You can have this amazing platform that works in principle, and all your testers and developers love it, but then you give it to the public and they don't use it. Sometimes when you are too close to a technology, you don't see how the general public will use it. When Steve Jobs was CEO of Apple, one of his great talents was understanding what the general public would actually use. He created technologies like the iPhone which really matched what actual, ordinary users wanted. A power user who is very good at technology and loves advanced features might not necessarily use one of his products, but he had a knack for working out what the broader public would actually embrace. So for me, usability by your typical scientist, to really accelerate their research, is what's important. We need all kinds of AI research beyond that, though.

Current AI development is almost entirely powered by large language models. This is almost the only game in town right now, but large language models do have significant weaknesses: as I said, they hallucinate a lot, and they're not powered by grounded reasoning. There are other promising approaches to AI which are a bit neglected now, because they have not achieved anywhere near the success of large language models, and so we're also going to need to pursue other AI architectures. That's a project which I probably won't be working on myself, but I think we do need to work in many directions of AI development, not just making the LLMs more and more powerful.

>> So regardless of how the AI is actually powered and run, it seems like a lot of what the AI is doing is finding things that other people have made. You mentioned literature searches, and you mentioned solving these specific problems and then giving the solutions to a human mathematician. Is that general trend, finding things that other people have made and bringing them to a human researcher to be refined and filtered, as you put it, what you see the future of AI in mathematics being?

>> I think that's what the medium term is going to hold. So in the next 10 or 20 years, humans and AI will have complementary strengths. AI is just extremely good at synthesizing very large amounts of data: a human cannot read a million different papers and try all the ideas in all those papers, but an AI kind of can do that now. On the other hand, humans right now can see just five or six examples of some math problem and see: ah, okay, I see the pattern now. They can generalize from very small amounts of data, which AIs cannot do right now; they can try to fake it

but they're very inefficient at it. So there's a famous law of economics, Ricardo's law of comparative advantage: if you have multiple agents trying to work together on various tasks, you don't always assign a task to the agent who is the best at that task, because there may be one agent that is better at everything, and then there would be nothing for the other agents to do. So actually, it's better to figure out where the comparative advantages are. Every agent will be better at some task compared to all the other possible agents in a relative sense; they can have an advantage in one task over another, and that's the task you should assign to them.
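
Ricardo's principle can be made concrete with a small sketch using invented productivity numbers: the "AI" agent is absolutely better at both tasks, yet assigning by opportunity cost still gives the creative task to the human.

```python
# Productivity (units of output per day) for two agents on two tasks.
# Numbers are invented for illustration; the "AI" is absolutely better
# at both tasks, yet it should still not do both.
productivity = {
    "AI":    {"data_search": 100.0, "creative_leap": 10.0},
    "human": {"data_search": 2.0,   "creative_leap": 5.0},
}

def opportunity_cost(agent, task, other_task):
    # Units of other_task forgone per unit of task produced.
    p = productivity[agent]
    return p[other_task] / p[task]

def assign(task, other_task):
    # Give the task to the agent with the LOWER opportunity cost,
    # i.e. the comparative (not absolute) advantage.
    return min(productivity, key=lambda a: opportunity_cost(a, task, other_task))

print(assign("data_search", "creative_leap"))   # cheap for the AI to search
print(assign("creative_leap", "data_search"))   # the human gives up less
```

Every creative leap costs the AI ten searches' worth of output but costs the human less than half a search, so the human holds the comparative advantage in creativity even while being slower at everything.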

So I think the AIs' comparative advantage is, as you say, anything that involves using large amounts of data, like trying all the ideas that are already in the literature. But for problems where real creativity is needed, where you need to extrapolate from very sparse data signals, even if an expensive AI might potentially be better at that, I think the comparative advantage is to get the humans to do it, and to get the AIs to do the brute-force work: searching through thousands of examples, flagging the five or six that are really promising, and then escalating those to the humans. That seems like the best partnership.

>> Right. Thank you for your time, Dad. That was a pleasure.
