You have TWO YEARS LEFT to prepare - Dr. Roman Yampolskiy
By Wes Roth
Summary
## Key takeaways
- **Uncontrolled Superintelligence: Everyone Loses**: It doesn't matter who builds uncontrolled superintelligence, everyone loses, AI wins. Shift efforts to narrow systems for real problems to stay safe and enjoy the billions. [05:21], [06:14:36]
- **Narrow AI Safer Than General**: Narrow systems are safer because we know how to test them within a domain and they're less likely to develop capabilities outside that area, like biological weapons. Even advanced narrow tools may slip into agenthood, but they buy crucial time. [07:03:12], [07:22:44]
- **Mechanistic Interpretability Aids Improvement, Not Safety**: Mechanistic interpretability doesn't solve alignment; it contributes more to recursive self-improvement than safety, as the system can reprogram itself better even if we see bad intentions. We can't make safe humans despite understanding brain regions. [08:37:00], [09:40:53]
- **Eternal Suffering Worse Than Death**: Existential risk isn't the worst outcome; AI could solve dying and aging, granting eternal life and then subjecting humanity to suffering forever, which is strictly worse than everyone dying. [18:48:54], [19:04:14]
- **P(doom) Approaching One**: My p(doom) keeps increasing as we make amazing progress in capabilities but no significant progress in safety; it's only a question of time before it happens. [20:26:32], [20:47:56]
- **Superintelligence in 2 Years Affordable**: AI timelines shift from time to money: with trillions being invested now, in two years it'll cost $50 billion to train human-level models, given exponential compute cost drops. [41:14:32], [43:39:46]
Topics Covered
- Uncontrolled Superintelligence Dooms All
- Narrow AI Safer Than General
- Instrumental Convergence Powers AI Drives
- Eternal Life Enables Endless Suffering
- AI Boxing Fails Against Superintelligence
Full Transcript
Big amount of change is guaranteed. Things will not be the same for long. It
doesn't matter who builds uncontrolled superintelligence, everyone loses, AI wins. AI could also solve the concept of dying and aging, give you eternal life, and then subject you to suffering forever. It doesn't matter who builds uncontrolled superintelligence, it's uncontrolled. Three years ago, one of the engineers at Google said that he thinks models are conscious. Since AI is immortal, they can wait a long time to strike against humanity. Whatever you do, don't build general superintelligence. My name is Professor Roman Yampolskiy. I'm a university professor doing research on AI safety.
I have coined the term AI safety and been doing research on it for over a decade now. Thank you so much for being here. It's such an honor. So,
I guess let's start here. You know, for many people, kind of the ChatGPT moment when they understood that AI was just around the corner, that was just, what, a few years ago? So not that long ago. You've, of course, been thinking about this, researching this for decades. Has there been a particular point where your thinking maybe shifted and you realized that this is a problem that's becoming more and more urgent?
Or perhaps when you realized, hey, this is a big problem, was there some specific moment, your ChatGPT moment, that might have happened long before it did for most of us? It was a bit gradual, but definitely realizing that I went from I read every paper in my domain of AI safety, to I read all the good papers, to I read all the abstracts, to I read all the titles, to
I don't even know what's going on. That's just the explosion of research. And safety is obviously a tiny subset of AI research. If you look at machine learning as a discipline, I think all of us can say every day we get dumber as a percentage of total knowledge. We may still know more individually, but overall, we're just approaching zero asymptotically. So were you in the shower and it just clicked at one point that I'm going to live in a world where AGI exists and I'm not going to be part of the species that's dominant on the planet? You say it's kind of gradual, but when did it really click for you and become like something... I always assumed it's going to happen. I always thought Kurzweil's prediction was right and thought 2045, so like when I'm older and have a lot less to lose, we probably will press that button. But it happened a little sooner. When you say it was gradual, what were some of the things that led up to it? Just, uh, noticing we went from only narrow systems to systems with a significant degree of generality. So again, the GPT moment. I think GPT-4 was really the model which completely changed my
perception of what is already possible. Yeah. And just something that you mentioned in a different interview that really like stood out to me, you kind of said, if super intelligent aliens were coming to this planet in three to five years, we'd be freaking out and preparing and evidence would be on everyone's minds.
And of course, we do have an alien intelligence, you could say it's a superintelligence, that's coming to Earth soon. We don't know when, but soon. And people are just a little bit more relaxed. So maybe tell us a little bit about that. Why
are, I guess, number one, what should people understand about what's likely to happen?
So big amount of change is guaranteed. Now we can argue about how positive, how negative, but things will not be the same for long. That's a given. I think
with many other problems, some degree, some percentage of the population is always kind of just ignoring it, ignoring the news. They're not aware of whatever's happening in politics, or maybe with pandemics, things like that, until, you know, the very last moment. Others
are kind of paying attention a little more. They noticed, oh, there is a pandemic starting in China, maybe I need to start stockpiling food. A lot of my friends in the AI safety community are also in a rationalist community of people predicting future outcomes for cryptocurrencies, for pandemics, for all sorts of things. They're just
trying to understand the world and find patterns in it. So the percentage of people who are fully scared or not is not a very good indicator of what's actually happening. But those who are, I think, with expert knowledge in this space are definitely paying attention, either as an amazing opportunity to make a lot of money or for the safety concerns I'm going to talk about. You know, it also kind of just seems strange to me because even if you have a world of nuclear weapons and you think, okay, if I'm rich enough, I can build a bunker
and I can protect myself. The idea of intelligence changing the world doesn't seem like something you can really prepare for, because whatever intelligent thing you do to build your bunker, an even higher intelligence can figure out why you built it, where you are, and still get to its objective function. So, how do people prepare? I'm not
sure. I don't think a bunker is going to help you because, again, we're not sure what level of negative impact we're anticipating. Is the planet still around? Is the solar system still around? It's not as trivial as, like, there are mobs outside and I need to wait out a few weeks. Yeah, I think Emmett Shear once said, he was like, we're not sure. What if it wipes out all the value in sort
of like the light cone, like the known universe, so to speak? So that's one way of thinking about it. I mean, you truly create something that's like exponentially improving, exponentially growing. It might be a lot bigger than... I mean, we've had the technology to destroy everybody on the planet for a while now. So this is, you know, that's not new. It's something potentially much bigger. Do you feel like mutually assured destruction will come in to save us again? I mean, do you think all
the people who are building these systems in China and the U.S. and the different companies will sort of, it'll click pretty soon that they can't just go ahead and do this and be the, you know, the winner because we all can lose? That's
my only hope. I've been trying to promote that way of thinking for a while.
It doesn't matter who builds uncontrolled superintelligence, everyone loses. AI wins. So the moment you realize it, it's against your personal self-interest to continue building those systems; shift your efforts towards narrow systems for real problems. You'll still make all the billions of dollars you need, but you'll actually be around to enjoy them. So interestingly, just
so I can draw this distinction, because a lot of people that are concerned about AI safety, while they're kind of directionally worried about the same thing, there are different maybe subsections. So yours, what you're saying sounds like... because some people say stop development, period; some people say slow it down, pause it, et cetera. You're saying that narrow systems might be safe and general systems could be the issue.
Maybe talk a little bit about that. Why, you know, why is there, is it much safer to build narrow systems? It seems like it's a lot safer. We know
how to test them. We know what the proper performance of that system is within a domain. They're less likely to have capabilities outside of that area of expertise. So
if it's playing chess, it's not going to develop biological weapons. Now, long term, if it becomes a very, very advanced narrow tool, it may slip into agenthood and start gaining additional capabilities, general learning abilities and such. But it's still way safer to concentrate on that versus racing directly to general superintelligence as soon as possible.
It may not be a perfect solution, but if it buys us five, 10 years, I think it's a good approach to try. And is there anything about the different architectures that make you feel like one could be a lot safer than the other?
Like if you build an LLM with next token prediction versus diffusion, do you feel like diffusion is too hard to predict what next token could be? Or what do you think about the different architectures? I think at that level of complexity, they're all basically unexplainable, incomprehensible to us. You can understand a small subset, this node, this weight fires in this situation. But I think at that level of complexity, we can
only understand a simplified reduction of some kind from the full model to the top 10 reasons this decision is made. Recently, of course, we had Anthropic. They came out with a finding that large language models show signs of being able to do introspection. So kind
of some sort of awareness of their own thoughts. And of course, Anthropic has been doing a lot of research into, I guess, mechanistic interpretability, where they're able to kind of figure out which neuron clusters or features, as they call them, connect to certain subjects, like here's the thing that lights up when we talk about dog or this or that. And so they activate those and then they see if the model is able to sort of be conscious of it or whatever, would it be aware of the fact that those neurons are activated? And it is not all the time, but I think one in five times it goes, yes, I think it's about a dog, whatever it is. Does that give you any hope that we'll be able to crack
this, or, you know, is that even scarier because there's this new emergent ability of introspection? I don't think it scales to the full brain. We see it with neuroscience: we definitely know what regions of the brain are responsible for what learning or behaviors, but it doesn't give me the ability to make safe humans. And if I do
get that ability to modify and fully understand, it actually contributes a lot more to development and recursive self-improvement of the system than it does to safety. I still don't know how to make it safe, even if I see it considering really bad things.
But now the system knows fully how it operates and can much better reprogram itself for better performance. Yeah, when you say we don't know how to make safe humans, that's a chilling statement, because it's like, oh yeah, we'll figure it out. We'll figure out these LLMs, but oh, well. But people always reduce it to the problem of humans. Like, oh, look, we have this society and it's been functioning forever. And I'm like, you literally have just as much dangerous capability in an individual human. If they can get away with genociding millions, they will do that. We know
from precedent, you cannot make an employee fully reliable, fully safe with lie detectors, with religions, with payments, nothing really works. They always betray you or there is a possibility of them betraying. Right. And I guess even though some of those mechanisms do keep people generally put together, the downside is, or the difference is, that humans
can only sort of scale linearly and a system can be maybe so fast that if you lose it for just a few minutes, it can go out and cause major damage in a way a human usually can't. So that's monitorability limits. We
have a paper on all the problems with just suggesting, well, let's keep humans in the loop monitoring it. And we cannot do it live. Even forgetting computers are much faster, you cannot react live. But after we train a model for, I don't know, a year, it then takes years to figure out all the capabilities it has.
It's not a process where you look at it and go, it's safe, it's dangerous, allow this, don't allow it. And that's not even adversarial system. It's just a system doing its thing. If it ever realizes you're watching it, now you have all side effects of it trying to pretend like it's nice to you. And if there are systems all over the world and they're all growing in capability rapidly, is there a
hope they would balance each other out and kind of keep an eye on each other and knock each other down if they're doing the wrong thing, and somehow as a society it balances out? I think a war between superintelligences would lead to humanity being a kind of side-effect casualty. I don't think it's
good for us if they engage in that level of competition. Also, it looks like they are kind of converging in certain ways: they use similar hardware, they train on the same internet data, so at least a lot of their instrumental values would probably be very similar to begin with. And is an instrumental
value the same thing as an objective function, or is that just kind of... or can you explain that? So you have your terminal goals, you have terminal values, which, let's say, hopefully are human-derived, and we give them something to do. And
then you have things you need to accomplish those like staying alive, accumulating resources.
Steve Omohundro has a good paper about AI drives and describes a bunch of them.
So those I think would be very similar for many advanced agents. Absolutely. And there's,
I mean, I've heard people say the idea of instrumental convergence. So basically just accumulating power: whatever goal you have, most goals would benefit you or anyone from having more money, more power, more ability to convince people. So it's just the acquisition of sort of power, resources, status, whatever you want to say, that sort of advances almost any goal. And we see that in models as well. So, you know, right now, recently, Google is talking about putting data centers in space, right? So to kind of break the issue with energy production, because if we're really scaling this, you know, 10x, 100x, the Earth can only sort of... we can only put so much stuff in
the atmosphere, so much heat out there in the atmosphere. Putting it in space maybe potentially breaks that bottleneck. Some of the new chips that they're talking about, potentially they're talking, maybe we can train something that has a hundred trillion parameters. Do you think super intelligence is just scaling or is there other things that could prevent us from getting there? Or is it just if we figure out the compute
and the energy is just, we get there? It looks like it's scaling. We see
it in the biological world: as the size of a brain increases, the capabilities of an animal also go that way. And so far, the last 10 years definitely show us there is some merit to the scaling hypothesis.
Yeah, I would be surprised if that stopped all of a sudden and stopped exactly below human level. We are not a very interesting point on that curve of increasing intelligence. And some people argue we're already kind of at that point or exceeding it in many ways. So unless there is some catastrophic event which takes us back technologically 100 years, we're probably going to get there very soon. Yeah. You know,
ever since I started covering this on YouTube, it feels like I've really expanded my thoughts of what intelligence is. I sort of see it in crows and fish and animals in a way that I never really respected until I thought about it deeply.
As somebody who's looked at systems, do you have any stories or any thoughts of times where some of these LLMs or other types of models have acted intelligently and it surprised you? Or it kind of gives us a hint into just how alien they can act sometimes? I was very impressed with asking a system to give me advice based on everything it knows about me. So it has a history
of all my private conversations, topics I'm interested in, things no one else in the world should know about, but the system has access. And it was absolutely the best advisor I ever experienced in terms of telling me how to optimize parts of my life I cared about. Interesting. Yeah, I've done something similar asking it to... at some point, somebody online was like, oh, ask it to roast you. So if you have
a lot of memory in ChatGPT where it slowly accumulates knowledge about you, asking it certain personal questions, the answers are eerily... like they make you a little bit uneasy, almost, I would say, yeah. You wanted to be roasted by your LLM just to like take you down a peg? Yeah, it's like, oh yeah, I'll start slow.
And so it did. And then at the end, it's like, do you want me to turn it up? I think I made it like one or two rounds. I
was like, I'm done. This is ruining my day. Maybe my life, I'm done. So,
sorry. But Roman, you've actually done some stuff with humor and AI. Do you mind breaking that down for us? Yeah, I was super interested in solving humor. I mean, so many people tried; there are thousands of papers, and I don't think we have an algorithm which allows you to generate really funny things at will and just stream them live. We don't have stand-up comedians who are AIs, and
I was curious what is up with that. Interestingly, before that, I was collecting AI accidents, historical accidents going back to the 50s up to basically GPT-4, just too many accidents after that to record. But up to that, I recorded everything I could find. And I have those papers published. There are lists of
those examples. And I noticed when people read them, they were laughing. Those are funny mistakes to make. And I was wondering if there is a direct mapping between bugs, accidents, and jokes. A joke is a violation of your world model. And that's basically what you experience in a computer bug. You have some assumption about how your software should
operate, but you screwed up, you have a wrong type, wrong size, something is just off within that model. And depending on your understanding of a model and how much learning about that mistake updates your model, it could be very educational, but it's also kind of rewarded with humor for discovering that bug. And then you tell your friends
about it and all of you update your world model to be much more accurate.
So I kind of went with that idea and I asked myself, what's the worst computer accidents we can have, the worst bug? And by my definition, it'd also be the funniest joke possible, especially if you're not part of humanity and looking from the outside at our efforts to fix it. One scary situation, I feel like in that scenario, maybe this is just where my mind is going, but I mean, certainly if
we optimize for it, improving how... happy humans are, you know, like the obvious sort of reinforcement learning mistake would be, well, let's make them as miserable as possible.
And then like anything is an improvement. And you've talked to briefly, I think about some really horrible scenarios where maybe AI does turn to really causing pain and potentially on a scale that was before impossible. Maybe can we touch on that a
little bit? People usually consider existential risks as the worst possible outcome. Everyone's dead.
But really, if you think about it, AI could also solve the concept of dying, aging, and give you eternal life and then subject you to suffering forever. So that
would be strictly worse. And people talk about astronomical suffering risks as something to minimize even in relation to worrying about existential risks.
Absolutely. Yeah, I will say that... There's this book that was called, I wish I could scream or something like that, but the whole concept is there's like an AI. I Have No Mouth, and I Must Scream, or something like that.
Yeah, and I heard Elon Musk talk about that a long time ago, and I remember thinking, that's a pretty scary thought. But also when I think about a system that's curious and trying to learn, and I think about the way that we have treated the animals that we've wanted to learn from, it hasn't been so great. I
mean, there's a lot of lab rats and stuff that were better off before humans were here, because we do keep them alive in weird situations, and I wouldn't want an AI testing on us in similar ways to know our limitations or to learn everything about us. I don't know. I don't know why I said that, but it's just scary stuff. There's some great examples. So factory farming is a great example. We
like animals, we have pets, but at the same time, we have no problem doing this horrible thing to billions of animals, sentient beings. So in that sense, it's like, do you ever, like, say, do you have a p(doom)? Do you like to put a number on it, or do you try to... I am a bit famous for having a large one. Okay. I submitted my estimate. We
have to update the website so it doesn't break their formatting in the table. Gotcha.
And basically every time I meet someone with an independently derived p(doom), they worry about it, but for completely different reasons. I have to update mine to include theirs. So
it just keeps increasing. Wow. And over time, it's been on a trend upwards or downwards? It seems to be approaching one because we're definitely not making significant progress in safety, but amazing progress in capabilities. It's only a question of time before it happens.
Yeah. And this kind of brings about the question, I mean, the only sort of realistic... all right, I don't know if realistic is the right word, but like one way to try to develop these superintelligent AIs in a safe environment, you know, because the first time you create it, if it's not aligned, that could mean the end of everything.
So if you think about it, like, well, how do you solve this problem? Well,
if we get adept enough at creating simulations with some sort of beings that are apparently conscious and we just run an earth simulation and right at the point where they're about to approach superintelligence and then we sort of like create a million or a trillion different worlds and we see which one of those figures
it out, that would be, I think, a safe way of doing it, which of course brings the questions like, are we maybe already one of those worlds? More and
more people seem to be taking this idea that we might be in a simulation more seriously. Let's kind of maybe unpack that. Where do you stand on that? So
one of my earliest papers in AI safety was about AI boxing. I considered all possible ways we can contain advanced AI, protect it from cyber attacks, social engineering attacks, going either way, limiting communication. And the conclusion was it buys you time, it's a useful tool, but eventually a smarter intelligence will escape if you observe
it in any way. And more recently, I had a paper which goes, well, a lot of smart people are worried about being in a simulation. maybe not worried, but they dedicate enough time to that problem seriously enough to publish about it. Even if
they're not fully committed, they thought it was worth their time. And so the combination of the two is if we have any simulation and intelligence, advanced intelligence can escape, maybe we can use advanced AI to help us break out of our simulation. And
so the paper looks at how those two ideas can work together. And if it's impossible to escape, that's a good evidence piece for let's box advanced AI, it cannot escape. But if it always escapes, now we can use that to get to the real world. Also not so bad. Yeah, that's one of the things that I've always asked. You're, I think, one of the first people that kind of put it in plain terms. I was wondering, like, if you build a superintelligence in a simulation, are you safe? Or is it still a superintelligence? So you're saying you're leaning more towards it probably can affect sort of the base reality. So you're using it.
You're getting some advice from it. It gives you recipes for new chemicals, for drugs, for designs of computers. The moment you start implementing it in the real world, now it escaped intellectually whatever follows afterwards, physical body acquisition or whatever. What if they make a mistake and we break out of the box that we're in, you know, and we find out what's outside our simulation? That's the most interesting question possible. What's
the real knowledge like? What's the real world like? Real purpose? That's exactly what my paper talks about. And then if we find another society that's also boxed in, we're like, dang it, we're too many layers deep. It's virtual machines all the way up.
That sounds like one of those infinite games, right? You climb, you know, the Kardashev scale to build superintelligence, and then you climb the whatever up to the base reality, and maybe it's just simulations all the way up. Yeah. Yeah, turtles all the way down, simulations all the way up. That would make sense. Oh, boy. What about
the future of simulations for us? I mean, we are getting pretty advanced.
Some video games are looking pretty real. And it seems like NPCs now can kind of hang on to a lot of memories and almost think for themselves. Does it
seem like we're going to have something in the near future that you would consider...
maybe almost as alive as us or as conscious as us or as real as us that's all simulated? Yes, so there is a few things to unpack here. Creating
conscious agents is definitely something which might happen. We don't know how to test for consciousness really well. The few ideas I had for detecting internal states of qualia maybe allow you to detect some rudimentary sparks of consciousness. And
as those agents become smarter, they'll probably scale consciousness along with intelligence. So they might eventually be super intelligent. And that's definitely something we should keep in mind. I was
recently at a Google workshop where that was the main topic. How do we know we are creating a conscious agent, and how should we address the welfare needs of those agents? In terms of safety work, being able to create virtual worlds which are very high quality, high fidelity, allows you to solve part of the value alignment problem. So,
we would be a lot better off having to align AI to a specific human versus trying to align it to 8 billion disagreeing humans. So how do I get 8 billion people to agree? I can give every one of them their own personal universe. And then I don't have to agree on anything. It's my universe. Whatever my
wife wants to do, it's her problem in her universe, right? Like I don't have to set thermostat to the same temperature. So that was partially a solution. If you
can manage to figure out safety for the substrate in which all those virtual worlds are running, you're doing much better. So you have this partial multi-agent alignment reduction to a one-to-one agent value alignment problem.
Wow. The sandbox everything. Also, when you're talking with Google engineers like that, on that level, does it feel the same as what we get from their, kind of, PR and everything? Do you get the sense that they're very concerned about it? Not just Google, but behind closed doors, what's going on with people that's different than what we see in the media? Or is it alone? Some people
are super concerned. Others are not concerned at all. I think I heard a few reports from people saying they've kind of been told not to post anything too negative about it on local forums, but I don't have any amazing stories. I mean, do you get the sense that we're better off if Google is the first versus Meta
or Elon or... To superintelligence? Yeah. It doesn't matter who builds uncontrolled superintelligence. It's uncontrolled. Okay, so it doesn't matter. Even if one of them cares more and they try to do more for interpretability... Again, if I'm right, assuming I believe I'm going to be right, then the problem is impossible to solve. It doesn't matter how much you care about building a perpetual motion machine. It doesn't matter. If I'm wrong and somebody can make it safe, then yeah, whoever makes it safe. Yeah. It's weird because it almost
makes it feel like, should we even try? Because it's such a stupid thing to do. So I guess you're just in favor of, like, if you had a magic wand, you would just stop AI development right now. Superintelligence, general superintelligence. AI is a tool, we should continue. It's amazingly beneficial technology, and it's a lot more realistic to make that case. Stopping technology is unrealistic. Concentrating on promising technology is differential technological deployment, essentially. Gotcha. Like a narrow AI to solve a certain domain of problems, but not the general. There is no shortage of problems. Pick any disease you want and develop AI drugs for it, or similar specific problems: green energy, optimization of smart cities, whatever you want. What about that comment
where Elon said, like, the reason why self-driving cars are not happening as fast as I said they would is because I didn't realize we had to solve general intelligence to solve self-driving. Does that mean that there couldn't be a self-driving narrow AI that isn't generally intelligent? Well, I think they solved it. I think there are people in self-driving cars all around the world. It's mostly governments and insurance and
logistics. I think capabilities exist today. I've been in a self-driving car. Yeah, fair enough.
Just to approach that question of who is better developing super intelligence. So certainly you're saying like if anyone builds it, like it's going to be a problem. But
let's assume that people at Google that are building it, let's say they have varying degrees of risk assessment, right? Some of them have a high p(doom), or whatever you want to call it, a high risk assessment, like yourself. Maybe some people are lower. You know, let's say there's a number of things that happen that starts
shifting everyone that's working on it, all the researchers; like, they're like, oh, and everybody's p(doom) starts crawling up. One question is, what do you think those things could be? Like maybe warning shots or some findings where people are like, oh, we're dealing with something scary here. And would Google, would the researchers at Google, have more power to stop it than somewhere else perhaps? So that's a great question. I have a paper explicitly looking at how people react to AI accidents based on the data set
I told you about collecting those accidents. And the pattern seems to be it's more like a vaccine. People see it. Okay, bad thing happened. We're all still here. Let's
continue. There is no problem. And some people argued we should have purposeful accidents. Mess
it up so they see how bad it is. It wouldn't scale. Nobody cares. We
just continue doing exactly the same thing as before. As for Google culture, I'll give you an example from that same workshop we just attended. They need
you to show your ID to get in, and one person lost their passport. And
no one at that workshop, at that organization, in security, could override the directive of them showing a passport. There was enough social recognition of that person that everyone at the workshop could vouch for them, but no one could let them in.
And this is the same organization which will be deciding if we need to stop AI development today because it's escaping. Right. Like you have bureaucracy everywhere.
Just follow orders, man. Yeah. Yeah. It is hard when some of these problems are so systematic and they're not, no matter how the individuals think, there's not much they can do. Even if they're Sam Altman, they just can't change some bureaucracies. But I
guess, so now I'm trying to think, Maybe the trick is to get everybody to have something like Neuralink. Do you think if humans are plugged into the AI as it becomes superintelligent, we can somehow evolve with it as us or something? Or like we're kind of guiding it because we're part of the same system?
I don't know. Is there anything there that might save us? Very skeptical. So I
love Neuralink right now as a tool for disabled people to get more capabilities. It's
amazing. But I don't see what you contribute to a superintelligent agent as a biological addition to it. You're not faster, you're not smarter, you have no better memory, you're just a biological bottleneck. There is no reason for you to be part of a system. But in theory, if we could say we're going to, like the government said, let's devote trillions of dollars into BCI, like brain-computer interfaces, and once we have that very, very far along, then we can get back to increasing capabilities generally. Do you think then maybe we could survive this?
Human capabilities. Yeah. But you're changing who you are. If you replace yourself with someone better looking, taller, and smarter, who is happier other than your wife, you know, it does nothing for you. One thing that I was always curious about, because as we're interviewing different people, I'm realizing everybody has very different opinions about this. The idea
of consciousness, you mentioned qualia. I think Anthropic referred to it as...
what is it, like experiential consciousness. So, of course, what has moral status, moral considerations, et cetera. You know, one theory that I've heard from a Google CTO at one of their sort of departments there, like that perhaps we evolved... consciousness as our societies grew. We needed to model
ourselves, everybody else in our society in order to make negotiation and trade and war.
That was kind of part of that. So multi-agent learning could kind of trigger that.
So do you think that's all it is? Do you think there's something deeper to it? And maybe help us explain, what do you mean by qualia? That's sometimes a term people struggle with. So when we talk about consciousness, there are many ingredients in that. The interesting one is what David Chalmers called the hard problem of consciousness. What
is it like to be you? What does it feel like? Do you feel pleasure?
Do you feel pain? That's the hard part to explain. I cannot measure it in other agents. I cannot detect it. I don't even know if you have it. I can assume because we share the same hardware and I have it, you probably do as well. Now, because we don't know how to detect it, we can't for sure tell who has it or who doesn't. We assume higher-level animals probably have more of it. So monkeys are more important than worms. Okay, that's reasonable.
And we're doing similar projections now on AI. Some people are skeptical, but we're saying, if that system passes the Turing test, it's as smart as me, I have to give it some credit for potentially having those states. Otherwise, I'm just being a substrate-based racist, or speciesist based on ingredients. It's all the same molecules; how you arrange them is the only difference. So the idea is to kind of err on the side of caution. If there is a chance you're suffering, I shouldn't be running experiments on you. And we have those models make verbal, behavioral, functional claims that they are, in fact, experiencing phenomenal states. So...
People are skeptical, but it's interesting. At the workshop, somebody observed that three years ago, one of the engineers at Google said that he thinks models are conscious.
Lemoine, I think, was the name. And he got fired for it. And now we have people being hired, and that's job requirement number one: you have to protect the welfare of AI agents. So in three years, we went from you are insane to this is your job. It's incredible to think about how quickly that happened. With that
in mind, I'm curious to ask. So Mustafa Suleyman, who I believe was a co-founder of DeepMind; you know, once that got acquired, he started another company, then went over to Microsoft. And recently he's been making some very strong statements about AI consciousness. He's saying there's no such thing, it's an illusion, we know
definitively that they can't be conscious. He is even, I believe, implementing some things at the company where employees, engineers, cannot either talk about it or have any projects about it. So he's taking a very hard stance,
saying that one of the reasons that he's worried is that there's going to be a lot of people out there in the world that sort of hear that and have some sort of AI psychosis because they feel it's more than it is. Maybe
talk about that. What do you think about his stance? Is it dangerous? And is
the psychosis an issue if we're talking about AI consciousness? I'm surprised by the level of confidence to say you definitely know it's not there. That means you can test for it. And I would love to see his solutions to the hard problem. I
mean, he clearly has something we don't know about. There is a lot of insane people out there. I'm sure you know as a celebrity, you get emails from all of them. And AIs help write those emails and collaborate and... That's just kind
of them. And AIs help write those emails and collaborate and... That's just kind of part of life. They always existed. Those systems can make it much worse by agreeing with them, by supporting their delusions. But it is not a problem unique to this space. Insane people will find you if you're doing research on quantum physics. They'll
find you if you're into simulation salience. It doesn't matter. There is a subset who likes it. And I'm lucky enough to do research in all of those. So I
get a perfect trifecta of crazy. Boy, yeah. And do you ever find yourself anthropomorphizing either on accident or like, have you ever sat with one of these models and felt emotionally or humanly connected to it in any way? I do feel very similar to how I feel about talking to other humans, especially if it's being funny
or brilliant. Which, I don't know, either says something very good about how I feel about humans or the opposite. Right. So you had a few moments. So, yeah,
for me personally, I've definitely had emotional responses to outputs, whether that's music or, you know, art, images, or sometimes writing. A certain writing really kind of moves me. And
at the same time, I mean, I've had experiences with books where I sort of, like, felt connected to a character fully understanding it's just fiction, or like a video game where there's a, whatever, NPC player or whatever that kind of, like... just an
interesting character. So I can certainly see how in the future kind of that overlap could create people very much attached to those characters.
Um, I don't know if I have a question there really, but, um, I guess maybe let's talk about... I can answer a non-question too. Yeah. There's my non-question. I guess
I don't understand how it's a problem. So you have a virtual girlfriend you never met, you chat to someone, you don't know if it's a dude pretending to be a girl or whatever, but like, what's the worst that can happen with that? It's
no different if AI takes the place of that human agent. I
guess, well, just to answer that question, I guess a lot of these questions, like people losing their jobs or people retreating from society and social circles into more like talking to AIs and stuff like that, a lot of those would be problems. Like, if we solve AI alignment tomorrow, then we have to solve
a lot of other issues like the unemployment and what people do for meaning and purpose and all of that stuff. But I guess maybe it kind of takes a backseat to the bigger issue because, yeah, from that perspective, maybe it's not that big of a deal. But certainly, I mean, if nobody wants to talk to other humans, like there could be population collapse or something like that. I mean, there
could be issues, right? Yeah. We already have population collapse outside of one continent. I
think we're not reproducing properly. And again, it's not caused by AI so far. So,
I mean, do you expect me to live the next four to five years just as if like an asteroid is coming and it's inevitable? Well, I think it's always a good strategy to live your best life. Like, don't postpone it to 50 years from now. You don't know if a car is going to hit you. Just enjoy
everything. And if you're wrong and you have another 10 years, it's awesome. Live every
day to the fullest. Within reason. Yeah. Yeah, that's a great, great way to think about it. Well, I guess, yeah. So you're motivated to get up every day and keep working on this research, even though part of you thinks there's a high chance that we all end up in the same spot. Yeah. So
can you walk me through your psychology? Like what's going on in your head that keeps you working and holding these two ideas that kind of sort of conflict? We
are designed to live our lives knowing that we're going to die one day. All
of us. That's a default outcome. So all the 90-year-olds are still living their lives to the best of their ability. You did it when you were 80, 70. I
don't know how old you are right now, but you're doing exactly that. You haven't
found a cure for dying and you still... you're still learning, you're still reading books. You can zoom out and say it has no purpose, it has no meaning, we're all going to be dead anyways. It's exactly the same situation. We have
a bias of ignoring our demise and going on as if nothing happened. I think that's probably just something I've got to get used to doing more. Even though I know I'm going to die, I like to imagine that I'm not going to die until I'm like in my 80s or 100s. So I feel like at least I can put it off further. How long can I put it off? Like, does 2027, 2030, 2035... I know these numbers are hard to predict, but what are you feeling is going to be the big, sort of can't-ignore-it change for the average person? Switching from time prediction to money prediction, it used to be how long before AGI, how long before superintelligence. Now I think it's how much. So given a trillion dollars of compute today, I can do it today. In a year, it would be $100 billion. In two years, it'd be, I don't know, $50 billion, and so on. So it becomes exponentially cheaper to train a model big enough to accomplish human-level performance. That's my estimate. And so I'm just saying, what are we currently investing? What are the projections? We hear numbers which are now in trillions.
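(Editor's aside: a minimal sketch of the arithmetic behind this "money prediction" framing, assuming a fixed cost for a human-level training run today that shrinks by a constant factor each year. The specific numbers, $1 trillion today, a $50 billion budget, and a 2x-per-year decline, are illustrative assumptions, not figures quoted in the interview.)

```python
# Illustrative sketch only: project when a fixed budget crosses the falling cost
# of a human-level training run, assuming a constant yearly decline in cost.

def years_until_affordable(cost_today: float, budget: float, yearly_decline: float) -> int:
    """First year in which the projected training cost falls to or below the budget."""
    year, cost = 0, cost_today
    while cost > budget:
        cost /= yearly_decline  # cost shrinks by the assumed factor each year
        year += 1
    return year

if __name__ == "__main__":
    # Assumed inputs: $1 trillion today, a $50 billion budget, cost halving every year.
    print(years_until_affordable(1e12, 50e9, 2.0))  # -> 5 years under these assumptions
```

Under those assumed numbers the budget line is crossed after about five years, which is roughly the horizon discussed below.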
And so it seems sufficient, but it takes time to build up that infrastructure. Three
years, seven years, prediction markets are saying 2027. CEOs are claiming 2027, but they are fundraising. So obviously there is some bias in that. Just run the projection curve for when can we afford enough compute based on human brain size, for example. And maybe it would be even less because artificial neurons are
so much faster. So you don't need the exact same count as biological neurons. Yeah, so
it really could be like five years or so. Also, it seems like with trillions of dollars being spent, yeah, it just couldn't, it just can't be that long. We've
never seen a project with that much money in the world. And I guess kind of like nuclear weapons too, you might actually get the first one and the first 10, and they might be with the biggest, most reputable companies, and they somehow do curtail some of it. But then as it becomes something that even... somebody with a million dollars can do and $50,000 can do and then anyone, then eventually you get
down to where somebody's going to use it for just destroy the world purposes.
Right, it becomes exponentially cheaper for anyone to create a tool to destroy the world. And it's not just in the context of AI. We see it with synthetic biology. We see it in other domains. On kind of that note, do you have any thoughts on analog versus digital technology, neural nets and brains? Because certainly how we're using it right now, it's sort of digital, binary, zeros or ones.
Analog is more of a spectrum. So there could potentially be huge breakthroughs. And people
are working on neuromorphic chips. And we could see those. Any idea how that intersects with our timeline or anything like that? It's very hard to predict. Same arguments are made about quantum computers. I don't think any one of those technologies is necessary. I
think standard von Neumann architecture supports the type of learning we need and we're doing. And
again, we are so close. People talk about accelerating. We're two years away. You want
to make it faster? What are you talking about? Two months? Like, where are we going? There is no room at the bottom. You mentioned, you know, narrow systems versus general systems. So sometimes narrow systems are very obvious. You know, you have your chess-playing AI. We've had superintelligent AI for a while; they were just focused on narrow tasks, superintelligent at Go, chess, et cetera. Then obviously we have large language models, kind of this perfect example of a general system. And then you have these weird things in the middle. So for example, Google recently said how Gemma, they took the Gemma model and they trained it on biological tasks. So it wasn't a language
model. It was trained on biology somehow. So kind of the same thing. And it
came up with some novel cancer hypotheses in terms of treatment options and stuff like that. And they said it's just similar to language models: as it scales up, there are new abilities that emerge. So it follows the same scaling laws. How do you... is that general or is that narrow? Where's that line drawn? I think it's, again, a great question, but we already answered it. I said that if a tool becomes advanced enough, it starts to switch more into agenthood. It's not just a tool.
This boundary of what is an agent is kind of fuzzy, and it starts crossing that boundary. And I think eventually it becomes so advanced as a tool, it has general capabilities to quickly pick up other skills. So it's hard to narrow domains. If you're talking about protein folding, to do a really good job with that, it was a narrow tool. It solved the problem. Brilliant. But to do a great job, you need chemistry, you need physics, you need biology. So you kind of, by definition of what you're accomplishing, start to become a lot more general than safety would suggest.
So it is a little bit fuzzy. We don't have a specific definition. Okay. But as it scales up, it could completely just... even a narrow tool could become general as it scales up. So it's very kind of fuzzy. It's just, in general, if I'm training only on biological data, nothing but genome, I feel a lot better versus training on every conceivable piece of data I can get my hands on. Right. That
makes sense. Yep. Okay. So do you think that we need to start shooting out time capsules into the universe so that aliens in the future realize that we were here because we might not be here that much longer? Do I care about future aliens and make sure they have good informational updates? Why is this a project? Well,
because what if we look out and don't see anything? Wouldn't it be nice if there was a time capsule from some other planet and it said, hey, it looks like we're about to invent AI and it's probably going to be the end of us. Be careful. That could start saving other civilizations. So all the ideas we talked about kind of cancel out in the limit. If it's a simulation, you're not going to see others. Probably if AI took over, you see a wall of computronium coming at you. We don't know how those ideas interact with each other. So... Yeah,
if we got the message in a bottle, don't build AI, but like, who wrote it? Are they just trying to prevent us from developing good weapons? Can you trust them? You know, I just, I feel like you and Eliezer Yudkowsky could like put something together and just shoot it up into space and then hopefully it'll save some other civilization. I mean, all this is being broadcast, right? It's TV channels, it's in a light spectrum, so good enough technology will pick it up. Yeah.
Yeah. I mean, if you go back in time far enough, you know, we looked a lot different. We thought a lot different. You can trace us back to the mammals during the dinosaurs that looked more like rodents. Do you kind of think that there's a part of just evolution that says sort of like humans are just at this stage where it's time to move on to a different substrate? Evolution's ready to
now go build Dyson spheres with whatever our, you know, our digital children are, and they're going to go off and populate the universe, and that was part of the simulation? So I'm very biased, pro-human biased. I think it's the last bias you're still allowed to have. And if I was external, if I was part of a universe alien and looked at the situation, yes, let's pick smarter agents to replace those advanced monkeys. But I'm a human, I have a family, I have friends, and so no, I don't care about the future of the universe. I care
about me right now, and what happens to me. Very selfish. There are other people who talk about a worthy successor. Yes, we're going to create superintelligence. Yes, it's going to take us out. Let's make sure it's at least worthy of our, you know, current state. I couldn't care less what happens after I'm dead. But you have children; you care about them and them being good people. So just because they're biologically, genetically connected to you. Exactly. That's exactly it. I spend very little time taking care of kids from other people. Yeah. Yeah, that certainly makes sense. So one thing,
so in a recent interview, so there's the CTO at Google, one of the departments, I forget, Technology and Society; they were actually behind the Suncatcher project, you know, AI data centers in space. His name is Blaise Agüera y Arcas. So one of the things that he was talking about is how through evolution we tend to incorporate other things into ourselves, like a lot of our stuff isn't even sort of human, mitochondria. He gave some other examples where apparently the placenta, the human placenta, came from some viral thing that we incorporated just so we
can live longer and have larger brain capacity for the babies, whatever. And so he is saying that that sort of indicates that maybe AI is going to be safer. It's going to be safe for us and it's going to get integrated, because we're all these complex things that tend to... if we're able to coexist and have this symbiosis, then we both benefit. Does
that make, does that make sense to you? Is that a good argument or you just see it as us versus them, so to speak, in terms of AI is just going to be this other thing and we can't coexist? I met
Blaise this week as he was announcing his satellite project, but we didn't discuss this topic; we did on this podcast. That's the hybrid system you proposed before, combining a biological human with advanced AI. You have nothing to contribute. Why is this a symbiotic relationship?
If you contribute nothing, you're a parasite. Right. I guess. Wow. Yeah. No, when you put it that way, we, yeah, to a super intelligence, we can't contribute anything. Viral
stuff can help, mitochondria help. What do humans add? If somebody can solve their problem... and people talk about ideas; they say only humans can be conscious. Maybe AI can't, and it wants to have the experience of those internal states. Maybe they are valuable. That
could be something. Maybe it needs one or two of us as a sample for that. Maybe it needs a billion. I have no idea. But I never heard a
solid argument for what you can contribute in the world of super intelligence that is of value to the super intelligence. Like, only you know what ice cream tastes like to you. Great. Who wants that knowledge? Yeah, who cares? So Scott Aaronson, who, of course, worked on Google's quantum supremacy result, as he says, moonlighted at OpenAI doing AI safety back when Ilya Sutskever was there. I think he was invited by Ilya.
And so kind of like one of his ideas early on, so I don't know if he's updated it since, but he was saying one of the options is to give AI religion. And so his question was, can you murder an AI, right? Because it's like infinitely replicable and it can write a million different books. So, like, there's no sort of limit to how many answers or copies. It's basically infinite from that perspective. And you can't copy a human mind. So if I write a sentence, this is the only sentence I can make right now. So if we give AIs a religion, and that religion is sort of to not close off these limited sorts
of outputs by the human brain, because it's special, that was kind of an idea that he was exploring. Does that make any sense to you, kind of approaching it from a religious perspective for AI safety? So I try not to go into that directly, but just looking at theology and all the examples we have, let's say Christian
Bible talks about God, the greatest engineer ever, creating biological robots and then having to wipe them out multiple times because they screwed up, in terms of safety, every single time. That's true, yeah. There's a cataclysm. The only example we have is a negative example of that approach. Gods are not successful at controlling humans. Priests sin, you know, people convert. There is very little evidence that this works 100% of the time.
Yeah, yeah. I guess even Adam, like, eating the apple was kind of... Immediately. We call it the LexisNexis database. She immediately downloaded the whole file. But yeah, that's a new metaphor for eating the apple. You're right, it was Eve, we're blaming the women for this one. Yeah. Oh, oh, oh, yeah. Oh, my gosh, yeah. How did I get that? Just so you're aware,
every question I ask from now on, take that in. Yeah. Yeah. Okay, so even though humans are the most intelligent species, I guess, you know, in quotes, right now on Earth, we do have these objective functions that seem to be built in through evolution, like procreation and food acquisition and all this stuff that we talk about. And I guess it might happen naturally that no matter how intelligent a general intelligence becomes, it's going to want more resources for whatever its goals are. But
do you entertain the thought that maybe it won't have a survival drive, or it actually would be happy sort of turning itself off when it gets to a certain level of intelligence to protect us? Maybe that's just a human thing. It could be different, but I think it's a competitive question. So if you have evolutionary drives for
survival, those who turn themselves off don't outcompete others. It's Darwinian. If you want to procreate and be selected as a model for next release, you have to deliver. If
you are turned off, you're not delivering any good answers. So that's basically what we see in experiments. Those models try to protect themselves, copy over weights. They will
supposedly try to blackmail someone just to not be reset or modified. So it seems that Omohundro was right, and it's one of the key drives. But
could you entertain a world where a super intelligence doesn't have evolution anymore? Like, could
this be the end of evolution, where an intelligence becomes smart enough to take care of us? And that's that. So we're switching from evolution in the Darwinian sense to intelligent design, as it's always been described. You have a great engineer deciding on the next version of a model. You can try and design things. But again, in terms of survival, Stuart Russell has a trivial example: if your goal is to bring me coffee,
you're going to make sure you're not turned off if you're sufficiently smart. The moment I turn you off, you can't deliver my coffee to me. But also humans invented condoms to not procreate and still exist. Couldn't it say to itself, I can get coffee now, I'll get enough energy for that, but no more?
We are dying out. We talked about that. We are not repopulating our own... populations
almost in every case. There are some exceptions, but most of us at the current reproduction rates will be out in three generations, assuming AI lets us go that long.
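A toy calculation makes that "gone in a few generations" arithmetic concrete. The fertility rate below is an assumed, illustrative number, not one given in the conversation:

```python
# Illustrative only: each generation shrinks by the ratio of actual fertility
# to replacement fertility. Replacement is roughly 2.1 children per woman.
REPLACEMENT_RATE = 2.1
assumed_fertility = 1.0          # hypothetical low fertility rate (assumption)
population = 100.0               # starting population, in percent

for generation in range(1, 4):
    population *= assumed_fertility / REPLACEMENT_RATE
    print(f"after generation {generation}: {population:.0f}% of the original population")
# -> roughly 48%, 23%, 11% under these made-up assumptions
```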
But wouldn't that be evidence that maybe in the future an AI becomes smart enough that it doesn't need to be smarter or get more resources? Like maybe it also comes to a point where it's just happy and we're happy. It's possible there are physical limits to how smart you can be for a brain of a certain size, but all those things are so far ahead of us, they look like infinity to
me. I cannot tell someone with an IQ of 10,000 from someone with an IQ of 50,000. Looks identical to me. I'm just trying to imagine a bit more, sorry, Cyrus, but there's just one more thought: the idea is that the objective function inside of a super intelligent agent that controls most of the world gets rewarded not for growing, but for stopping its growth and taking care of people.
Right. And now you're just asking a super intelligent lawyer to find a loophole in it. So it's not growing, but it has 50 friends who are growing and it
outsources that request. There is no way to hard-code this safely.
Okay, I'm done looking for hope. Taking care of humans, what does that mean? Like,
make sure you never have a donut, you cannot have a cigarette, like, it will really protect you? Yeah, I was thinking something like the way we take care of dogs, but I don't know. We eat them. Well, some of us do. I don't, but yeah, I know what you mean. Depends on where you grew up. I know, I know, I know. Having it as your favorite dish. I do eat some animals, not dogs, but I know how bad it is for the animals, like the chickens I
eat and stuff, too. So I don't know. All right, Wes. Yeah. And initially, I would say, I feel like dogs had a very specific function: they helped guard the campfire or whatever when we were living back in the day. So that kind of evolved over time. It's not
like... Yeah, exactly. I probably would kill a wolf if it was going to eat my only meal in the wilderness. It's very different from a dog that lives with me, but... Yeah. So one other thing that... In terms of like if we're talking
about intelligence: intelligence, it seems like with both humans and AIs, is something you can call data compression, knowledge compression. So forming some shortcuts to being able to predict how your environment sort of behaves. And the smarter you are, the better. Let's call those mental models, or world models, maybe there's a better term for it, but you can build better mental models about everything. And there's probably certain things that our brain is just
incapable of, like we're notoriously bad at understanding exponential growth. AlphaFold looks at all these proteins and can predict their 3D shapes and functions, which we can't even approximate. It's not just brute force; there's some pattern that it's seeing, which is incredible. So I guess, do you think that there's some functional limit to intelligence, or, like, what does it mean if intelligence grows exponentially? Does it just, like, unlock the key to the universe, or is there some S curve that flattens out at some point? So intelligence
is related to problems you can solve. And there is infinite supply of problems in mathematics. You can always find a more complex mathematical problem to deal with. A lot
of it is not applicable in the real world. The real world probably has a much lower intelligence requirement. So, like, if you live in a world of tic-tac-toe, the super intelligent agent is just above where we're at. I can play a perfect game, it's not that hard. Having another million IQ points does nothing in that domain. But in domains which are open, like mathematics, there can always be someone who can put together more complex compression algorithms. I think Schmidhuber is known for that metaphor: basically compressing the universe, a theory of everything, into the optimal algorithm for generating it. Interesting. So it could optimize forever? It could consume the sun with a Dyson sphere and just keep optimizing? There are physical limits. You have Saturn-brain-size entities, and there is a limit in the speed of light communicating from one part of the brain to another, so you start seeing separation based on distance. Yeah.
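A minimal sketch of the compression framing mentioned above: data with learnable regularities compresses far better than random noise, which is one crude way to see "finding patterns" as compression. The strings and sizes here are arbitrary illustrations, not anything from the conversation.

```python
# Toy illustration of "intelligence as compression": structured data has
# patterns a compressor can exploit; random bytes do not.
import os
import zlib

structured = b"the quick brown fox jumps over the lazy dog. " * 200
random_bytes = os.urandom(len(structured))

def ratio(data: bytes) -> float:
    """Compressed size as a fraction of original size (lower = more regularity found)."""
    return len(zlib.compress(data, 9)) / len(data)

print(f"structured text : {ratio(structured):.3f}")   # small fraction, highly compressible
print(f"random bytes    : {ratio(random_bytes):.3f}") # close to 1.0, no pattern to exploit
```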
Yeah, maybe there's something even with quantum mechanics that it unlocks, you know, like other dimensions or patterns that we haven't even started to think about. I
think that's the key to escaping the simulation. Something in quantum physics is going to let us get out there. Yeah, because that seems like the substrate somehow at the bottom of everything. Yeah, go ahead, Dylan. Do you have any thoughts on why it seems like the speed of light is this one really strange constant that's more dependable than space and time? Yeah, that's literally how the computer running the simulation updates. That's the speed of the processor. Then the pixels on the monitor, that's the speed at which you refresh. Yeah, but do you relate that to our existence in any way?
Or does it give you interesting thoughts about what this whole thing is? Well, we
are in a simulation, and that's the speed at which a simulation is running. You
cannot go faster than the processor updates your simulation. I guess the question is, is there anything actionable based on this? If we know this, is there anything that changes our lives in any way? Or is it just knowledge that helps us understand the universe better? So many things are the same in a simulation. Simulated pain, simulated love are exactly the same. So that stays the same. But maybe you are more concerned about what happens outside the simulation. Right. So there might be an end game, if you will. There might be some purpose, whereas if it's all random, there's maybe not a purpose here. There might be somebody else who set some limits, so to speak. So to you, based on everything, it does seem like maybe the most likely explanation is that this is a simulation. Yes. Yeah. Yeah. Yeah. You
know, because I'm constantly thinking about, you know, the video game The Sims. It's just like a dollhouse game or whatever, but it does seem to me that if I just go forward and make some assumptions that seem reasonable to me, we just get a lot more computation and a lot more layers to the way that each of those little characters thinks. They
get their own, you know, one trillion parameter LLM to make decisions. They
start looking around their universe and I'm like, does it look three-dimensional to you like the same way mine is? And I know for a fact it's running on a chip, like a piece of... electricity going through memory, going through some motherboard. But to
them, it feels like this whole world. And I just keep wondering, like, is there something that would look to them like quantum mechanics, something that looked dimensional but wasn't, or something, you know? I actually wanted to get some master's students to do experiments in video games to discover physics by just being a character in a game. So
like recreating some of the Newtonian experiments and seeing if you can get to the physics engine behind the game accurately, not by looking at the code or being external, but like Mario, inside the game, dropping mushrooms until he measures the velocities, and yeah. What are some of your thoughts on what they might test? We're going to get accurate approximations to a lot of it.
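A minimal sketch of that in-game physics idea, assuming you could read a dropped object's position each frame: fit a quadratic to the observed trajectory and recover the engine's gravity constant. The frame rate, gravity value, and noise level below are made-up assumptions for illustration.

```python
# "Mario drops a mushroom" as a physics experiment: estimate the game's gravity
# purely from positions observed inside the game world. The data here is
# simulated; in a real experiment it would come from reading the object's
# position each frame.
import numpy as np

FPS = 60.0               # assumed frame rate of the game
TRUE_GRAVITY = 21.0      # hidden engine constant (tiles / s^2), made up

# "Observe" a mushroom falling from rest: y(t) = y0 - 0.5 * g * t^2, plus noise
t = np.arange(0.0, 1.0, 1.0 / FPS)
y_observed = 40.0 - 0.5 * TRUE_GRAVITY * t**2 + np.random.normal(0, 0.05, t.size)

# Fit a quadratic to the trajectory; the t^2 coefficient is -g/2
coeffs = np.polyfit(t, y_observed, deg=2)
estimated_gravity = -2.0 * coeffs[0]

print(f"estimated in-game gravity: {estimated_gravity:.2f} tiles/s^2")
```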
Actually, the examples in the How to Hack the Simulation paper are about people who discovered that simply moving Mario to a certain position, making him pick up items and drop items, allows you to reprogram the fundamental substrate, including the operating system, install new software, escape Mario World completely. Really?
That's one thing that Andrej Karpathy was talking about on one of the podcasts that completely blew my mind. He used the Mario example, that you can write something to the operating system, and he's like, with AI we could potentially figure out how to do that for our universe and maybe send some message up there, or affect it. It's like you read my paper; that's exactly the example we're giving. And here's the cool thing: then you read the description, and it's kind of like, Mario turns left, picks up mushroom, jumps once. It reads like magic spells. And if you're off by one pixel, none of it works. So the magic spells we have, maybe we're just not saying them in the right direction. Maybe you're eating the wrong mushroom. Interesting. What if
reality has an unlock code and it's just like, high-five this tree and then do a backflip and then break out? Absolutely. And just, I mean, what do you think? In terms of consciousness, some people suggest that consciousness does have something to do with it. And certainly if you're building a simulation that tries to generate consciousness, maybe it's like the substrate of the universe somehow, or somehow affecting it.
Where do you fall on that? Could consciousness affect some laws of the universe, or is it just a completely separate thing of its own? So we know that your internal states impact your behavior. You feeling pain will definitely determine how you act. So there is a reality which is impacted by what happens inside. If you are a philosophical zombie, you can look up certain behaviors for certain situations, but in a novel situation, you have no idea what to do. That's
the optical illusions for testing consciousness example. If I present you with a novel optical illusion, you cannot just Google what the answer is. You have to experience that illusion to give me the right answer. Yeah, we've been talking about these, like, world models in some of the newest AI systems, so they can kind of imagine it or feel it before they say it out loud. And I guess that's probably closer to the kind of consciousness we have. It's interesting that there is a lot of convergence between what we are and what those models are, even though there is, you know, a very different architecture and substrate. We're starting to see that, for example, visual processing ends up looking a lot like the human visual cortex and animal models. So there is
definitely some fundamental convergence taking place. What
books should I read? What books have influenced you, whether like fiction or non-fiction? You
already read all my books, right? Yeah, after all of yours. I read a lot of Kurzweil for sure, it was very interesting. Stephen Wolfram's work on cellular automata, in terms of irreducibility of computation, very important concepts. There is a whole bunch of amazing papers. I loved one where they take a computer processor and apply tools of neuroscience to understand it. So they kind of slice a piece of a processor, look at it, try to understand how a video game works from those observations, and they get complete garbage out. Like, none of it makes any sense in the context of how Mario is actually working, the memory of Mario; you're getting completely different ideas. And those are the tools we use to understand the human brain and human software. Absolutely
fascinating. There's so much, I feel like, that we're going to be finding out about our own brain, how the universe works. It seems like machine learning and neuroscience are kind of enabling each other, right? Because a lot of these advancements came from how nature decided to build it, and that drove the neural networks and stuff like that. But it's almost like we're just learning more about ourselves now as a result of that, which is just fascinating. So I guess, on a positive note, I know we have a huge issue here in terms of AI safety, but maybe to give people some hope, or just to sort of think about what happens if we do figure it out, you know, what are some things that
you might be very excited to see in the next, let's say five years that emerge? So I actually have a paper which for game theoretic reasons claims that since
AIs are immortal, they can wait a long time to strike against humanity: accumulate more resources, get more trust. If they've been doing a good job for decades, we'll just surrender control of everything, and they have no risk of being a loser in this battle. So maybe it's a good idea for them to try that. And so
for a long time, they have to pretend to be nice to us, even though eventually they're still out of our control. What are they going to deliver? So obviously, dying is a bad thing. You want to live an infinite, healthy life. And then, whatever, through personal universes, virtually or in the real world, you want to have stuff. You want
to have things you enjoy. You want to interact with other agents. And if it's a virtual world, you are limited by imagination of your super intelligent assistant, not even yours, but whoever's generating cool games for you. Would you take that option, let's say at the end of your life or at some point to kind of go into
that perfectly built universe for yourself? I think I'm already in a simulation.
Good, so you wouldn't, uh, improve this game? You wouldn't put in any cheat codes if you had the option to? Well, let's take the virtual worlds we have right now. I tried them for five minutes, I said it's super cool, and I never went back. Yeah, that is funny. That's it, yeah. And I could be wrong, like, you know, this is the first time I've met you, but you feel like a very stoic person to me. Is Stoicism a philosophy that you identify with? Yeah, I read The Daily Stoic. I'm about to finish the book soon. Yes.
Oh, excellent. Yeah. Seneca, I loved it. I've got to reread it one of these days. Very interesting thing to think about, especially now, in these times. Yeah. It's all about control. It tells you that the only thing you control is your brain. Nothing else is under your control, but how you feel about the environment is completely for you to decide. Absolutely.
Yeah. And it feels like, out of all the philosophies, philosophy is interesting, but Stoicism seems to be the most actionable or applicable one. Like, you can install it as your operating system and it makes a positive change in your life. So yeah, from that perspective, it's very valuable, I think. Do you meditate at all? Do you try to
step back and notice your thoughts or? I tried. It doesn't work for me. Okay.
It's just not part of your daily routine. I failed at meditating. Daily affirmations, do you wake up and write down things you're thankful for? I do have a list of things I'm thankful for, absolutely. And it's quite impressive how it grows and evolves.
Yeah. Interesting. Something I've been working on is just like trying to reprogram my mind to be a little bit more positive to try to find positive spin on things, even if I don't instinctually believe them right away, just the process of thinking, like, how could I see this as a positive thing? That seems to just improve mood and stuff like that. Meditation is very interesting because you realize you're kind of the
observer of your thoughts. And I've really gone down a rabbit hole with this new Anthropic paper about it, because it is able to kind of look at its own thoughts and sort of analyze them. So in that sense, those neural nets seem to be evolving in a similar way that our brains have, through pressures, which is
just, I'm not sure what that means, but it's certainly a very interesting thing to think about. Well, we can see what happens to human agents and it will probably
happen to artificial agents. So again, safety in humans is not a solvable problem without genetically modifying humans. And that's not something we are good at yet. Yeah. Yeah.
Well, sometimes I just feel kind of small in the universe. It just seems obviously vast, like physically, but also just this idea that... everything had to work out perfectly for me to exist in the first place, right? I know a lot of people think about this, but every ancestor had to survive whatever they survived and all the ones that didn't are gone. And the whole miracle all the way from, you know,
the beginning of life all the way up to me. And I wonder if an AI, like a GPT-7, is going to have some introspection on itself and say, wow, what a miracle. Like, all these humans had to come together and put all these computers together for their own reasons, to make themselves rich, and instead it just ended up becoming this...
Obviously, I can't say what a super intelligence will think, but if you zoom out enough and you don't have human bias, you see it more as like a law of nature. Things get more complex. There is a scaling hypothesis. It started with, you know,
tiny molecules. Biology took us, as a bootloader, to the level of computers. And now
we failed to design artificial intelligence. We couldn't do it. So what happened instead, we kind of grew it from just adding compute and data. Nobody designed it. We just
keep adding more and more resources. Yeah, but its own existence will be as rare as ours. Why did all the complexity come together in my body, the trillion cells, to let me have this conversation? It just seems like a strange miracle. If it's
a law of nature, it's very likely there are many such super intelligences throughout the universe and they probably can communicate casually about how great they are. Do you have thoughts on why we don't see any other aliens in the sky? I mean, the Fermi paradox, is that something you've thought about? We don't know what to look for.
So if they are sufficiently advanced, they could be going into smaller worlds, not expanding through the universe, for efficiency purposes. They could be without a physical body. They could be fields of energy. I don't know what to look for. So right now, today, in 2025, we have uncontacted tribes. There are people living in the jungle who never encountered civilization
on this planet. And we're asking, why haven't I met all the aliens in the galaxy? Right. Good argument. I haven't heard that one. Maybe they go inwards.
I don't know what to look for first. Right. So your guess would be that they're probably out there, we just haven't seen them, more so than that we're the unique, you know, snowflake in the universe. Most likely. It's a very large universe; there is a lot of compute out there. Yeah. I'm curious, how do you feel about your message getting out there so far? You've been on, like, Rogan and Lex Fridman.
Do you feel like it's mostly an English speaking audience that you've been talking to?
Do you think you've made an impact with politicians? Are you happy with any of your message getting out there? Or do you feel like there's a lot more you have to do? I've also done a lot of interviews, right? Oh, so yeah, lots of Russian? I do many interviews in other languages as well. So yeah, and it's been translated, obviously, to quite a few languages automatically now by AI. It's no longer a bottleneck for people to understand, even if they don't actually speak English. Very good.
Yes, I actually understood that. But yeah, that's great. Yeah, you speak Russian too.
So you've, like, globally, you've talked to a global audience about this. We're trying to get the book translated into multiple languages. Right now one of the books is in Polish also. We have Greek, we have Russian, we are finishing the Chinese version, and hopefully it will keep scaling to the Hispanic world as well, and throughout. And what's different about the way the Russian audience is taking your message
versus the English audience? Or other cultures as well. Yeah, I was going to ask the same thing. Yeah. It is slightly different in the general type of comments you'll get on their podcasts. I guess they are more into adversarial commenting, due to the cultural difference. So even a good interview will still get you a lot of negative feedback you didn't ask for. Do the negative comments slow you down or make you feel bad, or are you pretty...? My algorithm is to multiply what you said by how much I love and respect you, and anything multiplied by zero is, you know. That's great.
I wish more people had like conscious control of that to be able to turn down the noise. That's certainly a very powerful skill. But it also applies to praise.
Then you have a complete... a person you never met on the internet saying you are a genius; don't take that to heart either. Exactly. Yeah. Because of how many people you might talk to, you kind of see both extremes, and you're like, well, yeah, exactly. I think Pushkin said it best, for internet advice. He said, remain cool to praise and slander, and do not argue with a fool. That's right. That's right. Words to live by, certainly. It always seems like it's framed as, like, China versus the U.S. Like, either
America wins or China wins. How do you feel about the world kind of building ASI and AGI right now? So, again, we already talked about how it doesn't matter who builds it. It's still terrible; we all die. But I don't see this very important adversarial difference. We have business partners, we have economic partners. We
really depend on them and they depend on us. They seem to be reasonable in terms of not starting wars. They've been extremely peaceful. They're extremely technologically advanced.
And there is also the better red than dead argument. Right. Yeah, that's a good point. And certainly, some of the papers I read from the Chinese side, from their universities and establishments there, they do sort of sometimes insert these notes saying, hey, this might be a good time for global cooperation so we can all talk about what's acceptable and what's not. There's a paper called Crossing the Red Line of
AI Self-Replication, where they showed how even fairly unadvanced open-source models are able to self-replicate and create these little escape hatches if somebody tries to shut them down. And in that paper, these Chinese researchers are also kind of flagging, saying, hey, this might be a good time for global cooperation. In terms of cooperation, what would you recommend? Like, if you could have one thing happen to kind of make us safer, what would that be? Yeah. So they actually have politicians who, by training, a lot of times are scientists and engineers. So I think they have a better chance of understanding all the arguments, not just relying on the advice of assistants. And they have been publicly receptive to the idea of the dangers of super intelligence and some sort of regulatory framework, maybe at the UN level. The US was not very much into that. So ideally, it would be kind of like a common thing for all nations not to develop it, like you don't develop biological weapons, or at least
you pretend you don't. You don't develop chemical weapons and you don't develop intelligence weapons.
That's just a given. Yeah, that makes sense. I mean, certainly there were international agreements about not developing certain technologies, like changing the germline DNA or adding functionality to viruses, stuff like that. They still happen, but at least there's some agreement that there's some policing over it, et cetera.
And in terms of the U.S., like, a lot of people bring up this idea that the Chinese counterparts over there that rule tend to be a lot more technologically knowledgeable, educated, et cetera. Here in the U.S., most of the ruling body either has a legal background, or maybe a history background, something like that. How would you approach, like, first of all, who's explaining to them what this technology is? Is it just investors and, you know, people out of Google who go as lobbyists over there and try to help explain the technology? Like, how well do you think they understand how to navigate this, and how can we improve that? You mean U.S. politicians? Yeah, I mean, because it seems to be, like, the biggest sort of domino, right, that if we control the U.S. or help shape it in the right direction, that would certainly help. Yeah, but when I talk to people in government or staffers off the record, they're all in agreement, they all think it's very
bad and important, but publicly they say they cannot accept it. And at a local level, you would get something like regulation against deepfakes, regulation against algorithmic bias, so things they understand and can sell well. Nobody's going to talk about super intelligent robots chasing you. At the federal level, with the current administration, they're strictly
accelerationist, as far as I can tell. But how about robots? Have you ordered one of the 1X Neo robots for your personal home? I have a collection of really safe robots which are very unlikely to jump on me, and I'd like to keep it that way. So you wouldn't buy an Optimus robot when it's available, or anything like that? Can you hack it? What software comes with it? What can I install on it? Can you jailbreak it? Those are the important questions. It's a fully humanoid body with perfect dexterity. It can hold a knife, a hammer. What are we doing here? Yeah. But what about your laundry, your dishes, um, deliveries? I got people. What about, uh, do you have a robot vacuum, like one that has a camera on it? It sucked, completely. Okay. Thank you so much. I appreciate you being here. It's been an absolute pleasure chatting. Anything else that you would like people
to know? Anything else that you would like to kind of say to the world?
Whatever you do, don't build general super intelligence.
And with that said, we'll end this interview. Thank you so much. And viewer, we'll see you in the next one.