Mark Zuckerberg & Priscilla Chan: How AI Will Cure All Disease
By a16z
Summary
## Key takeaways
- **AI will accelerate curing all disease**: The Chan Zuckerberg Initiative's strategy to cure, prevent, and manage all disease by the end of the century hinges on accelerating basic science breakthroughs by building new tools, particularly leveraging AI. This approach addresses the limitations of traditional, smaller-scale NIH grants which often focus on near-term research. [02:53], [03:47]
- **Biology needs a 'periodic table'**: A significant inspiration for the Initiative's work is the lack of a comprehensive 'periodic table' for biology. They aim to create standardized, open-source data sets, like the Cell Atlas, to catalog millions of cells and provide foundational tools for scientific discovery. [15:45], [15:51]
- **Virtual cells enable riskier hypotheses**: Virtual cell models will allow scientists to test high-risk hypotheses in silico before investing in expensive and time-consuming wet lab work. This computational approach can derisk ideas and increase efficiency, akin to using a model organism or the 'new fruit fly'. [19:15], [21:51]
- **Most diseases are 'rare diseases'**: Priscilla Chan argues that most diseases should be considered rare because individual biology varies significantly. Current approaches often lump patients by demographics, but advanced tools allow for targeting based on precise individual biology, leading to more effective treatments. [13:34], [13:44]
- **Cross-functional collaboration is key**: The Biohub emphasizes integrating biologists and engineers, even having them sit side-by-side, to foster collaboration. This interdisciplinary approach, alongside accessible tools and shared resources, is crucial for accelerating scientific progress. [35:39], [36:36]
- **Compute power is the new lab space**: Modern biology labs are expanding their compute resources rather than physical square footage. The Initiative is building large-scale compute clusters, providing access to GPUs, recognizing that advanced computational power is essential for cutting-edge research. [38:46], [39:01]
Topics Covered
- AI's Role: Accelerating Scientific Breakthroughs with New Tools
- Curing Disease: From Crazy Ambition to Credible Pathway
- Frontier Biology Meets Frontier AI: The Uncharted Territory
- Lowering Barriers to Interdisciplinary Collaboration for AI in Disease Research
- AI's Leverage in Curing Disease: A Philanthropic Opportunity
Full Transcript
this is a a a space that I mean that
there's just going to be a huge amount
of leverage with AI. It still seems like
there could be a lot more effort in this
space around building tools and it's
kind of this crazy thing that we're you
know here in you know 2025 and there's
not the kind of periodic table of
elements equivalent for biology. We
think that this is like probably one of
the most important sets of tools that
you need to build. When we first set out
that the goal to cure and prevent
disease by the end of the century,
people like honestly most scientists
couldn't look at us with a straight
face.
>> And that's crazy.
>> Yes. And it was true because if you just
decided to spend the money funding the
next best grant for every single lab in
the country, like you there was no
pathway to that being true. The biology
folks, I think, looked at it as if it
were crazy ambitious. And then the AI
folks are like, well, that's kind of
boring. That's just automatically going
to happen. I know. It's like, okay,
there's something in between there that
needs to be bridged.
>> Mark, Priscilla, welcome to the a16z
podcast.
>> Thanks for having us.
>> Yeah, great to be here. Excited.
>> All right. Excited to have you. You're
doing exciting stuff.
>> Yeah. Well, to that end, almost a decade
ago, you guys started the Chan
Zuckerberg Initiative with the mission
and intent to cure, prevent, manage all
disease by the end of this century.
There's a lot of missions that you guys
could have poured your time and
resources into. Why don't we talk about
take us behind the conversations of why
you guys picked this one? Maybe
Priscilla, why don't we start with with
you and you hear your side of the story.
>> It always surprises people when I talk
about how we work in basic science
research. Um, I trained as a
pediatrician and people always think
like, oh, it must be about medicine. And
for me, it was, you know, I went into
medicine because I wanted to improve
people's lives. I wanted to make a
difference. I wanted to be able to help
others. And I think training as a
pediatrician at UCSF, I met a lot of
patients and frankly like little kids
and families for which like we just had
no idea what the problem was. and they
might have like a specific gene that
they could name if they were lucky. Um,
or they could be grouped into a bunch of
other diseases and there'd be a general
sort of PDF they'd print out like this
is what we know. And then it was my job
as an intern or resident to try to
translate like like a few lines of
information to how we were supposed to
take care of the patient. And for me,
that's when I really like realized the
power of basic science and how we need
to work on basic science to advance the
forefront of what's possible and without
that there's sort of I think of it as
the pipeline of hope.
>> Yeah. And why did you think um
you could cure all disease? Because
that's like a very like aggressive goal.
>> Um do you want to do you want to answer
that one?
>> Yeah. Well, well, I mean we're not going
to cure all diseases to be clear. I mean
the the strategy is to help scientists
and the scientific community cure all
diseases. So the strategy is really one
of accelerating the pace of basic
science and the theory that we had was
if you look at the history of science
most major breakthroughs are basically
preceded by the invention of a new tool
to observe phenomena in a new way.
Right? So think about things like the
microscope, right? Being able to observe
bacteria or you know other fields, the
telescope or
>> um you know, but it's
>> just to use an engineering example, you
know, it's without those kind of tools,
it's kind of like you're coding without
being able to step through the code and
debug things, right? So it's um that's
like the old days
>> when you
so so our our whole approach on this is
basically let's help build tools that
will accelerate the pace of the whole
field and I think that that there's a
niche that I think fits that because if
you look at how funding works in science
you know the vast majority of funding
comes from the government and NIH
grants. It's parceled out into these
relatively small grants that allow
individual investigators to investigate
usually pretty near-term things. Um, and
the development of these kind of new
types of tools, whether it's imaging or
building now a lot of AI things like
virtual cell models, um, are longer
term, often times more expensive to
develop. So think about like on the
order of a hundred you know maybe you
know hundred million to a billion
dollars over a um over a 10 to 15 year
period and then you you try to unlock
those tools and give them to the
scientific community to accelerate the
pace. So that's that's kind of the the
theory
>> right and and there it seems like
there's also something that that's is
you don't really get credit for the
tools in a lot of ways. I mean, we've
been noted, well, we have companies that
use your tools and they're very happy
about it. But, um, you know, I didn't
even know that that was the case. And
so,
>> that's why it's philanthropy.
>> Yeah. Well, it is, but most people do
philanthropy to get credit, too. I mean,
you know, like that's a you know, that
that's kind of a part of it. So,
how did you I guess did you think about
that or were you just like, no, like
this is going to work and if it works,
that's all we need. We're super focused
on like actually making every scientist
better and and beyond science like
startups, startup founders because I the
point is we can't do this alone. And
when we first set out that the goal to
cure and prevent disease by the end of
the century, people like honestly most
scientists couldn't look at us with a
straight face
>> and crazy.
>> Yes. And it was true because if you just
decided to spend the money funding the
next best grant for every single lab in
the country like you there's no pathway
to that being true. But if you forced
people to really think about this and
like okay what is the most credible
pathway to doing this and what are the
barriers to that credible pathway then
we sort of got somewhere right? They
were like, well, like there's no shared
tools or like we don't have we're not
working on big projects and building the
right data sets. And we're like, okay,
well then we can start doing something
about that. Um, and so that's where the
idea of building shared tools cuz no one
right now in the science.
>> Well, that's so interesting. So
basically, you're like, we're going to
cure all disease and they're like,
>> yeah, can't be done. Why can't it be
done? Well, because we don't have the
tools. Okay, that's pretty that's a
pretty cool sequence.
>> Yeah. Yeah. I mean, there's also this
funny thing where the the biology folks,
I think, looked at it as if it were
crazy ambitious. And then the AI folks
are like, well, that's kind of boring.
That's just automatically going to
happen. I know it's like, okay, there's
something in between there that needs to
be bridged. And if you can like kind of
use the the kind of modern AI tools in
order to build the types of tools that
biologists need. So that's a big part of
how we think about our work is um
>> AI has got to be the most overestimated
and underestimated technology ever like
simultaneously. So weird. I mean, yeah,
well, probably like the internet early
on, but but we kind of think about
ourselves and the work that we're doing
at the Biohub as frontier biology paired
with frontier AI, right? So, there's
there are labs that do frontier AI that
uh basically, you know, are building the
most advanced models. Um and then there
are lots of biological research
organizations that that effectively do
very leading edge
>> research to build um you know to either
discover new data sets or or or looking
to certain challenges.
>> But so far there hasn't been anyone
who's tried to do both of those at once.
And when you look at I mean even
something like AlphaFold which is
amazing right it's it was built off of
this data set that was a public data set
that had been produced decades ago right
and um
>> what what I think you have the
opportunity to do if you do both of
those together is produce specific data
sets for the purpose of training AI
models to build virtual cells that can
do specific things
>> right
>> so I think that that's like a a pretty
interesting zone to be in
>> and of all the things that that we've uh
that we've worked on. You know, actually
when when we started CZI, we we kind of
actually focused on a number of areas
and what we found is just that the
science research has had by far the
biggest return. So, we've just doubled
down on it over and over and over until
now we're at the point that, you know,
we're 10 years in and Biohub is really
the like main focus of of our of our
philanthropy at this point.
>> Um, but yeah, I mean, that's kind of
that's basically the focus. Maybe you're
not giving yourselves enough credit
because you're sort of saying, "Well,
there's bite-size science. We didn't
want to do that. There's century scale
science and that seemed like a long time
horizon, but achievable, ambitious." But
you've actually identified, you know,
which I think is really fantastic, grand
scientific challenges
>> that are right in between. They're 10 to
15 year horizons, at least per kind of
the way you communicate about them and
the way you energize
>> the scientific community about them. 10
to 15 is kind of an interesting time
horizon sort of like similar to the time
horizon of a venture-backed company
similar to the time horizon on which a
team can work together for that period
of time I think it's how did you get to
that number and then how are you
thinking about the challenges that you
take on in each 10 to 15 year wave
because that's concrete achievable
you know you build a lot of credibility
around it the way that you've announced
those challenges
>> well I'm curious how you guys think
about it but for us when we looked at
the grand challenges for on the 10 to 15
year time horizon it needs to be like
when you look at it you're like I see a
path
>> right
>> not everything needs to be solved for us
to take it on in fact if everything's
solved then that feels like that should
just go
>> ambitious enough
>> yeah like you like we we have we have
some risk appetite right so we want
things where we're like there's a
credible pathway someone who is at the
helm who can do this And there's enough
ambiguity where we feel like we could
take on that risk and if we do it like
the the returns could be higher than
even expected and the way we modeled
that from you know in the biohubs is we
we have three biohubs. We have one in
San Francisco, one in Chicago, one in
New York. The one in New York works on
cell engineering. You know can we
engineer cells to go in and detect
signals, read it out or to take certain
actions. In Chicago, we're building
tissues and looking at uh tissue cell
communications within tissues. And then
in San Francisco, we're looking at deep
imaging and uh transcriptomics.
And that work the locations are not by
accident. We also look at the partner
universities because we have folks who
come to the biohubs to do this work
collaborative interdisciplinary
um and sort of unconstrained by the
traditional lab. But we also build off
of the labs at these academic institutes
that support the work. And so uh that's
how we sort of choose the grand
challenge and um and the locations. And
then the sort of layering and the uh
large language models and AI coming into
the picture has been so interesting
because we were already building tools
to measure interesting data building the
data sets but we didn't really know what
to do with them yet. Um and large
language models coming onto the scene
we're like wow we can make sense of all
of this now. I'm curious what you view
success as in the therapeutic realm. So,
you know, we think a lot about
understanding biology and sometimes we
bet on startups that want to unlock
completely new biological areas,
diseases where we don't know what's
going wrong. And then there's another
group of folks who kind of say, hey,
okay, now that we understand what's
going wrong, let's fix it. Um, let's
come in with a drug. Let's come in with
a new type of chemistry, a new type of
antibody. How do you what do you think
success for the CZ Biohub looks like 10,
20, 50 years from now in terms of the
new medicines that you've enabled?
>> We want there to be like an explosion of
a community who are building these um
just the new wave of what it means to be
deploying precision medicine. like we
like I think for rare diseases and
common diseases alike, you're really
talking about individual biology that we
sort of lump together. Um and uh they
and we often don't know how it happens,
right? We know that you have this
mutation or the worst nightmare is you
have a variant of unknown significance.
What does that even mean?
>> The horrible VUS.
>> Yes. Horrible. And you're like you tell
someone you kind of know something but
we don't know what it means. But if you
look at the way we've been able to look
at variants and look at single cell
transcriptomics, we're starting to be able
to say, okay, this variant actually
impacts this set of downstream cells and
then we start looking at the proteins
that get expressed and how it looks
similar or different to what a healthy
cell would look like. Then you can start
targeting. Okay, like let's look at that
as a target. And you both know the
specificity of the target you want to
build based on the ability
to connect mutation to protein
expression as well as to be able to
predict off target effects. What are the
side effects? Because you also know
where else that drug will be able to
interact with the body. And and so those
are rare like and and but I really think
most diseases should be thought of as
rare diseases because each one of our
biology is different and right now we
just get lumped right we get lumped
based on age demographics ancestry if
we're lucky uh to have that level of
understanding but truly each one of our
biology is different and say like if you
look at hypertension or depression like
we kind of just go by trial and error
and saying like let's just try that drug
and see what happens. But what
should really happen is being able to
precisely and accurately and quickly
treat people by looking at individuals'
biology. We want to enable the basic
science and we would be thrilled if
people picked up the models that we
build to be able to build the
diagnostics, the therapeutics that need
to come.
>> You've built amazing data sets. I have
to say like I mean you may not hear the
feedback from the startup community and
the pharma community and the R&D
community but it's there because you've
committed to open source and so people
may not be they may not all be writing
papers but they are using those tools.
Um there's a startup in our portfolio
working on idiopathic pulmonary
fibrosis. The name tells you how vexing
the disease is. It's idiopathic. We
don't know why it happens. The IPF is
named that way. And so, you know, he was
telling me that he used your cell by
gene atlases to look at millions of
single cells in patients with disease,
without disease, try to pinpoint the
fibroblasts, double click on the
fibroblasts and their gene expression.
It's try to, you know, use that to
inform, hey, where could I go after a
new drug target in this disease that's
fundamentally a strange clump of
idiopath, you know, idiopathic um
origin. So um I think there's a huge
there's a huge group of innovators who
are who love the tools, the
visualizations, the query systems and
really the software approach that you
built to making that data incredibly
accessible. So
>> cell by gene is like almost an accident
though.
>> Tell us more.
>> So do you want to share a little bit
about cell by gene or do you want me to
start? Well, I mean, I don't know which
part you want to get into, but I mean,
but the cell atlas work overall, I mean,
it's kind of this crazy thing that
we're, you know, here in, you know, 2025
and there's not the kind of periodic
table of elements equivalent for
biology, right? So, that was sort of a
lot of the inspiration of it was all
right, how do we both through work that
we're going to do in the Biohub and
through other grants um be able to pull
together and standardize a format where
you can have all this data. And when we
were starting off, we didn't even
necessarily have in mind that we were
going to use that to build virtual cell
models. I think that's sort of just come
into focus as the AI work has advanced,
but that's a very exciting thing. We
should definitely spend a bunch of time
on the virtual cell models, but I'm not
sure what you wanted to get into on the
cell atlas.
>> Well, the single cell work is was one of
our first RFAs 10 years ago we started
and we were like, okay, we think this is
possible. We actually funded the
methodology for it to to standardize how
it was going to be done. So that was 10
years ago. And we then were we seeded a
few labs to start building out that data
set. But we were like there are like
millions or billions of different cell
types and different permutations. Like
how are we going to do this? And um
especially with like a burgeoning
technique. And so we ended up um seeding
a few groups and they started doing work
and then they told us they had a
problem. There was a uh there was a
bottleneck in their workflow because
they couldn't annotate the data fast
enough. Um and so we built cell by gene
was an annotation tool. That's the
original source of this. So we built the
annotation tool to make it easy for
people to who are doing single cell
science to be able to annotate the data.
And then we put we put the data that we
collected publicly so people could
share. But because everyone started
using the same annotation tool, everyone
was standardized then on the same data
formats
>> and then there started being a a
community around the tool and they
wanted to share back and build the
atlas. So now after 10 years there are
millions of cells that have been built
into this uh shared resource for the
entire scientific community. We only
funded about 75% of it. Sorry that's
wrong. We've only funded 25% of it. 75%
came from the broader community saying
this is useful and there's an easy way
for us to standardize and build the same
metadata.
>> That's right.
>> It's like an interesting what you'd call
a network effect, right?
>> Yeah. I was going to say it sounds like
the internet. Yeah.
>> Come for the annotation, stay for the
stay for the virtual cell model.
>> Well, it was very important when we were
getting started with the work to have
everyone who was doing it have a
consistent format. So that way it could
be used and portable. And then once that
kind of took off as as the way that it
would get done, then other people just
found it valuable.
>> Yeah. And even relative to prior
databases like GEO and and whatnot, they're
just simply not as standardized or QC'd.
>> Yeah. Control.
>> Yeah.
>> Let's get into virtual cells. One of the
the great challenges, the grand challenges
you would focus on. Um maybe talk about
what is the promise or the hope and
maybe some of the challenges or where
we're at with it.
>> Yeah. I mean, we think that one of this
is going to be one of the most important
tools at this point is basically
building up the kind of hierarchy from
proteins to um to
just different structures within the
cell to whole to like whole like a
virtual immune system or different
levels of hierarchy. And we think that
this is going to end up being like a
very important set of tools for people
to effectively generate hypotheses for
for different science work. um you know
even before you get to the point where
you're really running full experiments
in it you can come up with some um
estimate of how that might run um it
will be useful for some of the precision
medicine type um examples that Priscilla
was talking about a few minutes ago but
we think that this is like probably one
of the most important sets of tools that
you need to build um and it's not a
single thing right so there's different
angles to to come at this from the cell
atlas data is helpful for understanding
things on a cellular level. Um, one of
the the kind of most important things
that we're doing right now, the the um
there's this this great company,
EvolutionaryScale, who actually had a
bunch of researchers who'd formerly
worked at Meta on protein folding
models, um, is joining a Biohub and and
Alex Rives, the the, uh, leader of it,
is actually going to be the the kind of
head of the whole science program, which
is actually kind of interesting.
>> Yeah. when you think about it where it's
like you have AI and biology coming
together and really it's like an AI
person who understands biology is
running it rather than a biologist who
has some understanding of AI. I think
just kind of speaks a little bit to
where we think the the relative um
weight of these things is. But I mean we
basically view, you know, like Priscilla
was saying with the different biohubs.
Then New York doing cellular engineering
will basically make it so that you can
have cells that can record different
things that are going on around the body
and and share that data and then you can
build that into models. The Chicago
Biohub being able to record inflammation
um and and basically study that in order
to kind of help understand um like that.
That's a that's a different data set. We
have the imaging institute which is we
just trained our our first set of models
around that which are the first like
spatial models around understanding like
the way that that kind of cells look in
different states and eventually just
like you have this analogy on the um
kind of the industry side around
language models where you have different
capabilities and then over time you
train them into models and it gets more
and more general.
>> That's kind of the idea here. So we'll
we'll we'll build the biohubs around
grand biological challenges. The biohubs
will build tools that will generate
novel data sets. We will build models
based on those and then eventually
combine the models into an increasingly
general view of a virtual cell that will
be useful um both for scientists and
hopefully startups and companies that
are working on finding drugs which is
not our part of the whole thing but but
I think is obviously a really important
part of what needs to happen.
>> Yeah. And you know, you guys think about
risk all the time in terms of when you
make investments like I think the
promise of being able to do virtual
biology using a virtual cell model is
you can actually take on riskier ideas.
right now like grant funding can be hard
to come by and the wet lab work is
expensive and slow and it's not just you
know money it's also time and so you
have to choose something that you think
is going to have some likelihood of
success to keep your lab career going
and so it naturally lends people to take
on like some risk but not a lot of risk
because they need to make sure that they
are hitting like a certain percentage of
the time to make tenure or publish or
whatever they need to do. But if you had
a virtual cell model where you could
simulate really high-quality biology, you
could actually then start testing and
tinkering on the computational side and
like ask riskier questions, things that
would have been expensive and costly
in terms of time and resources to do in
the lab and actually see if there is
promise doing the experiments in
silico before you make the time and
money investment in the wet lab.
>> Do you think of it kind of like a model
organism?
>> Yeah. Like it's the new fruit fly.
>> Yeah.
I was going to ask given the complexity
of a cell, like how close um like how
accurate do you think you'll get the
model to? I mean just assuming I mean
maybe you get it to like a perfectly
accurate representation of a cell, but
like how accurate to be useful would the
virtual cell have to be?
>> I think it will obviously iterate and
get better and better because right now
we we like right now we're still just
talking about uh transcriptomics. We are
expanding into different ways of looking
at the cell, but you get more and more
accuracy and but I don't think it needs
to be 100% accurate to be useful because
you just want to be able to derisk the
idea on the front end a little bit. Um,
and the more and more you derisk it, the
the more efficient it gets obviously,
but it will be useful if you even get
directional signal. And yes, I do. We do
think about it like as a a model
organism, but in a way that's like has
fidelity to the human body, like you
know, like I don't want to
>> All models are wrong. Some are useful.
>> Yeah.
>> This is hopefully has has utility on
certain axes.
>> Exactly. And just like the language
models, you build in specific
capabilities. So it's not so for
example, you know, one of the models
that uh we're we're publishing is is
VariantFormer, right? Basically you
know makes it so that um it's trained on
a bunch of effectively pairs of you you
have a cell you apply CRISPR to it in a
place you see what comes out at the
other side so it's it basically is able
to make that kind of a prediction like
okay if you have this edit that you're
doing to to a cell what is likely going
to happen um another one of the models
is it's this diffusion model basically
you can describe a type of cell that you
would like it to simulate and it will
just produce a kind of synthetic model
of of of the cell Um, again, I mean,
it's kind of interesting because to
Priscilla's point before about how
everyone is different and and like and
different cells have have kind of um,
you know, you want to be able to
simulate these kind of rare
configurations. Um, having at least a
synthetic version of what that could
look like is interesting and then you
can test against that. The cryo model I
think is interesting because it's
spatial. So it kind of gives you a sense
of there are all these different models
that you can have that allow you to um
basically look at different kinds of
things and then you just train them in
to be increasingly general over time.
>> Wow. Very interesting. And is the is the
modeling technology basically LLMs or
like like is there is there a reasoning
model? Is it like a just
>> Oh, that's actually Yeah. I know that's
a fascinating one too and because one of
the new models um I think this one is
very early but it's um it's it's
basically the first reasoning model over
biology. So the the idea is that um
yeah, you you you effectively have these
models that that kind of simulate world
models in different ways and then you
want it to be able to not just um be
able to spit out correlations, right, in
terms of like what it's found, but
actually be able to kind of reason
through how things would would evolve
and why things would happen. Um I think
that one's quite early but it's uh but
it is interesting conceptually as what I
think is clearly going to be an
important direction
>> um in terms of how these models evolve.
>> Yeah. No it because that's what I was
thinking you know that if it doesn't
work the next question you have is why?
>> Yeah.
>> You know like
>> but I think what you find in reasoning
the the analogy
>> you're married to your hypothesis. Well,
yeah. Sure. Sure. Yeah. I mean, the the
uh
>> Yeah. I thought I thought you're saying
if if the reasoning model doesn't work,
why? I mean, I think the
kind of way in No, it's I mean, the
language model analogy for that would be
you need better kind of world models or
or better pre-trained models in order to
get the reasoning to be good. But, but
it's yeah, you just you build more
>> you build more capabilities into it. And
I think that there's probably an order,
too. So the work that Alex and the
EvolutionaryScale folks worked on is a
lot of it is protein um which is
interesting because that's at a kind of
smaller resolution obviously than the
cellular data the cell atlas but
>> part of the hypothesis is that you can
look at all these different cells and
you can kind of simulate how they might
behave but you're going to have a
somewhat shallow understanding unless
you actually have this hierarchical
understanding of what um how the
subcomponents of the cells are going to
interact. So
>> our view is that you basically want to
build up a state-of-the-art protein
model and then have that be a part of
the state-of-the-art cellular model and
then once you have that you build things
like the virtual immune system which
allows you to simulate um much more
complicated systems. But it's sort of
this like hierarchical approach to
building up these these uh virtual
models. That makes a lot of sense
because also as you get into
personalization, you've got like common
proteins combining into a unique cell.
So that
makes it like from a systems standpoint
that makes it like much more manageable.
That that makes a lot of sense.
Interesting.
>> Yeah.
>> Yeah. No, it's it's it's very
fascinating stuff.
>> Yeah.
>> So you guys are announcing some big news
this week. Do you want to give us a
sneak preview?
>> Well, the big news is,
uh, thinking about how we are going to be
coming together as one team. Um, and you
know, in the past we've run
Biohubs, we've built software,
we've done some AI research, but all of
it has
been a little bit decentralized. But now,
under Alex's leadership, we are going to
come together as the Biohub, an
operating philanthropy where we are
doing the science um in service of a
singular goal together and how do we
actually advance the state of biology
and research um at the intersection of
AI and biology.
>> Amazing. Alex is amazing. So,
>> yeah. No, he's great. And then and then
the other thing is that the piece that
that I mentioned earlier, which is just
Yeah. I mean, CZI has focused on a
number of different things. We've really
just found over time that we we feel
like we've been able to make the biggest
difference in science. So, we've just
kept on doubling down on it and we're
going to continue doing work in
education. We're going to continue
supporting local communities and and in
those different pieces. But going
forward, the biohub is really going to
be the main thrust of our philanthropy
and we're very excited about that
because, you know, when we started,
the mission was to see if we could help the
scientific community cure and prevent
diseases by the end of the century. I do
think with the advances in AI that
should be possible to do significantly
sooner and that is a very worthy and
important and very exciting goal that we
think we kind of have a unique place in
the ecosystem that we can help empower
others to make fast progress on that.
>> So there's obviously, like, plenty of
advantages to decentralization from a
management, communication overhead, and so
forth. And so, like, what are you trying to
add by adding this kind of new
layer/unification
on top like what what are the outputs
and then I guess what are the
complexities to that because that's um
I'm sorry to ask a CEO question.
>> No no I I mean I'm like super
you want to go for it then I can jump
in.
>> Yeah. So there are obviously amazing
groups doing frontier AI and a lot of
groups doing uh great frontier biology
and what we think we can do uniquely is
actually tie these two together. And
we've funded data sets, we've built
data sets, we're, like, building the
instrumentation now to be able to look
at the cell, whether it's, you know, at
the tissue cell-communication level or our cryo-EM,
where we can look at the cell at nearly
atomic level. So we have the ability to
not only build the data sets but
actually shape and form them the way we
want based on what we see as necessary
to complement the existing body of
knowledge. And so we have amazing teams
doing that work and we're building these
AI models. And so the reason to do
it together is then we can actually
complete the flywheel like you know the
model is looking like it has some gaps
and blind spots in this area. Okay, who
do we talk to? How do we build um the
next data set? And you know we're seeing
this in the lab like the metadata is
going to be so rich that we can feed
back into the way that we do this
modeling. Yeah.
I think it's going to be incredibly
powerful. And it's it's more than it's
more than just like, you know, writing
down a spec and saying like please
deliver this. Like these people need to
be sort of working shoulder to shoulder
and shaping uh each other's work for
this to actually um be the more and more
accurate model of how the human cell
works.
>> Well, yeah. It's so interesting because
that is exactly, like, that has been the
biggest surprise in the industry for us
in the AI world, like, forget biology for one
second, is that the
domain-specific models have been, like, super
interesting. Like, the original thesis
was, like, some AIs are going to get
so smart they're going to be smarter
than everybody at everything, but um
>> like on video models like every video
model is best at something but not
everything And so knowing what problem
you're solving actually turns out to be
sort of ironically very important in AI
um because you can actually get to a way
better result. Yes.
>> If you put the two together like yeah
we're seeing that over and over
again, uh, in a way that is
>> I would say very counterintuitive to the
whole narrative kind of going into it.
>> And in biology it used to be the or at
least you know one assumption was well
the data sets aren't on the internet. So
part of the reason you need a domain
specific model is that the data sets are
not public. You guys are kind of bucking
that trend too by creating a lot of
open-source access to the data, and then
even then it sounds like you're betting
you know on the trend that we're seeing
in other industries but still there will
be nuance in how you annotate that data
curate that data
>> well and how you talk to a scientist
right like so because you have to not
only know the the data and the model and
so forth but like the conversation is
what we keep finding out ends up being
very very important right
>> so rich and so important how you
actually
>> a scientist isn't going to talk to it
like, you know, I talk to ChatGPT or
whatever. So, this is the fly you can
talk to.
>> Yeah. Yeah. Yeah. That that's really
that's super exciting.
>> And the user interface is actually
really important. Um you talked about uh
you guys have a founder who's using
CELLxGENE. That user interface was
intentionally designed to not need to
have a computational or really a very
deep biological background to be able to
use because you want people coming from
different fields to look at the problem.
It's like look here, help us solve
problems here. And so building that user
interface in a way where it's not a very
high barrier to entry to be able to poke
around and learn something and bring
knowledge back to your work, that's
intentional. And we're really hoping
when we build these virtual models um
that we get to a place where we can
allow a lower and lower barrier to entry
for people to say like you know like I
have some knowledge about this maybe I
can contribute. Um a very pertinent
example is turns out I think immunology
has a ton to do with neurodegeneration,
right but
>> seems like immunology is behind all this
so might be part of your century vision.
>> Uh so you need to be able to allow the
immunologists to come in and understand
neurodegeneration and understand how
their world fits in. And so
lowering the barrier to entry allows people
to actually think in a sort of truly
collaborative and interdisciplinary way.
>> So will the Biohub grow as a team? Like
will you employ more people at the
Biohub proper or are you moving towards
more of a network model with more sites,
more labs, more community-driven data
sets? Like which which is the thrust? Or
maybe it's both.
>> Probably a little of both. And we've
added new biohubs over time. Um and then
we're also building up more of this like
central AI team.
>> Cool.
>> So, um, but I think that
these organizational questions of how do
you set this up are fascinating and a
lot of our approach is sort of informed
by
what the rest of the field is doing
because you kind of think about
science as this portfolio, right?
Society has a portfolio of stuff that
it's trying to do, and in terms of
philanthropy you want to
>> be the most additive that you can be by
trying to figure out what else is
underrepresented. So science by default
is very decentralized, right? It's like
kind of the the way that granting has
worked, the way that I think scientists
by default want to work.
>> Um
>> so I think a lot of what we've found is
that figuring out ways to encourage
collaboration in um ways that otherwise
seem very simple but weren't happening
before can unlock a lot of value. So the
very first Biohub what we did there were
two kind of interesting things. One was
it was this collaboration between UCSF,
Stanford and Berkeley and there are all
these really smart people at all these
different places who previously I guess
in theory they could have figured out a
way to work together but there was not
really a formal construct for them to do
that and this just allowed a lot more
collaboration. Mhm.
>> The other one is cross-discipline.
Basically having biologists sit next to
engineers and this view that like these
two disciplines are things that need to
um and I I don't know. I mean I'm sure
you know you've seen this in a lot of in
a lot of the companies but like
>> it's there's so many interesting
>> in the companies they always like set
them apart.
>> Well, it's interesting. No, it's
interesting how many organizational
questions or problems you can fix just
by having two teams sit together, right?
It's like it doesn't matter what the org
chart is or, like, whatever. It's like you
guys need to sit next to each other and
until you get this thing to work and
>> um
>> that's something I really believe in.
So,
>> and you have 10 you have 10 to 15 years.
>> Well, no, it's all like communication is
such an underrated problem in general.
Yeah. Uh in in all kinds of in building
anything or solving anything. So,
>> that's a that's pretty neat.
>> Yeah. Yeah. And it's it's just really
kind of simple stuff, but but I think
it's um
>> it's sort of novel as a model. And one
of the things that's so we've now copied
this
>> from the first Biohub to the Biohub
network and expanded it to other models,
but it's also just been neat to see um
other folks who are working in the field
also adopt similar models because it's a
pretty intuitive thing.
>> But you know, at some point you'll reach
the point where you know, actually it's
really good to have decentralized work
too, right? So it shouldn't be that like
we're not saying that this is like the
way that all science should work. We're
just saying that there's a space for
this. It can unlock a lot of value
because it for whatever reason hasn't
been the default.
>> Yeah. And we still rely on like
>> Yeah. There's famous like stories in the
MIT lab about that. That's how they
invented lasers and so forth is they put
a bunch of people from different
departments in the same
>> the lab. Yeah. Well, actually physics is
where we got a lot of the inspiration.
like physics has just historically been
like labs have just rallied around big
projects and big shared resources. Um
and we will you know we are relatively
centralized but we still depend on a lot
of labs who are doing sort of exact
frontier work or complementary work to
come together to support this. There's
that. But one more thought on your
expansion question is like and maybe
this is like the uh modern AI lab. We
are not expanding like a lot of square
footage per se, but we're expanding our
compute. Um
>> yeah,
>> the researchers, they don't want employees
working for them. They don't want space.
Yeah. They just want GPUs,
>> agents. So it's just like in a sense
that's new lab space. Um it's much more
expensive than wet lab space.
>> And you guys have always been creative
on that. Even in the last few years,
you've created ways to share access to
compute. You've enabled academic labs to
you know um I forgot the name of your
program kind of like
>> scientists in residence or something
like that
rental kind of hoteling.
>> The core of it is clusters. um you know
if you look at individual labs they'll
have like
>> like a large lab would have tens of GPUs
>> um and we were the first to really build
a large-scale compute cluster, um, a
thousand. Now we have plans to move
to the 10,000 range, and that one
requires a different type of project.
Obviously you are able to ask
different types of questions.
>> um and uh it's a resource that we use
but also we've invited scientists to
apply and say like what question do you
have that uh could use this amount of
resource and be able to uh stem uh sort
of seed collaborations that way
>> and so if a scientist is out there
listening like who's not employed by the
biohub or working at the biohub but
wants to collaborate with the biohub
>> that you're going to create interesting
>> interesting doors
>> to utilize the resources that's awesome
>> yeah I mean the GPUs are somewhat zero
sum, right? But the data isn't. So
yeah.
>> Yeah. Fair enough.
>> Yeah. So you're about to celebrate 10
years um doing this. As as you look out
in the years to come, what else can you
tell us about either things that you're
thinking about for the future or maybe
even principles or a north star that's
going to guide how you guys grow and
evolve going forward?
>> You know, it's been really interesting
in the past 10 years because I actually
spent the first few years completely
envious of people working for for-profit
companies because there's so much
clarity. Like the market,
whether it's private or public,
will tell you if you're doing a good
job.
>> If they think you're doing a good job,
>> if they think you're they're not always
right.
>> They're not always different. But I was
still envious cuz that was I was like I
craved that feedback like am I doing a
good job? And you know, 10 years in,
the reason why we're doubling down on
biology is, like, not only did we achieve
what we said we were going to do
when we set out on these
projects, it actually delivered more than
we thought it was going to. And I was
like, okay, that's a signal I can latch on
to, and, like, that's a signal we can
really continue doubling down and doing
more of that. And so I think it's uh
continuing to tolerate the early
ambiguity when you're like, "Okay, I'm
gonna do more of this." Um and uh and
being patient, but uh uh being willing
to have a long time horizon, but be
impatient at the same time. because it's
all those iterations along the way that
have sort of allowed us to get to this
place where, you know, to get lucky, ready,
having built data sets to take
advantage of AI and large language
models that's because of all the work
that we have been doing and so being
able to continue moving forward in this
ambiguity and sometimes lack of signal
on a big goal like I think we've sort of
set the DNA for that.
>> Amazing.
>> Oh, no pun intended.
>> Yeah. But we get to see how many people
use the tools and the feedback. Yeah.
Yeah.
>> Yeah. You have customers which is pretty
cool.
>> Yeah.
>> For philanthropy. Like that's awesome.
>> Yeah. No, it's it's one of the fun
things about building tools is like you
kind of get to see
>> Yeah.
>> How valuable do people find the tools?
Do people use the tools in order to
publish important work?
>> Right. Right. Right. Right. Yeah. And
well, I mean our feedback is they're
awesome.
>> Feedback
and and completely unique by the way. So
like
>> the the other thing is like what would
you use if you didn't have this? It's
like there's nothing.
>> No. Yeah. It's a real it's a real kind
of void. I mean there's this whole
pipeline that that needs to exist from
accelerating basic science to funding a
lot of people to use it to then you can
get into the biotechs that basically can
start to work on coming
up with novel therapies, and then you get
the pharma companies that do them at
scale. And then there's a space for
philanthropy on the other side, of public
health, of basically taking the
therapies and kind of bringing them out
to everyone in the world. But this is
a space where there's just going
to be a huge amount of leverage with AI,
and, um, yeah, it still seems
like there could be a lot more effort in
the space around building tools and just
accelerating the whole thing a lot better.
>> Yeah. And I do think it is the place
where you are completely unique. Right.
The other things there are other people
who can do that but there's nobody doing
what
>> that's got good founder market.
>> Yes. Founder-market fit. I mean, if we
didn't exist would it be a problem? Yes.
Like those questions uh really land you
know as a VC
>> like, one of us is an engineer, the other
one a scientist and doctor.
>> Yeah very happy this direction.
>> Yeah
>> we thank you very much not only for our
companies but for us as humans um for
working on this work. It's amazing work.
Thank you.
>> Thank you guys.
>> Thank you so much.