Decentralized Compute + Swarm Inference: The Future of AI Models | Ivan Nikitin, Fortytwo (#71)
By Fluence
Topics Covered
- Small Models Outperform Big Ones
- Network Self-Ranks Like Humans
- Wikipedia for AI Intelligence
- Rust Model Beats GPT-4o
- Swarm Resists Prompt Noise
Full Transcript
Hello everybody, welcome to the DePIN Podcast. I'm your host, Tom Trowbridge. We're excited to bring you interviews with the key founders in the DePIN landscape, along with investors and ecosystem partners. This is an exciting time for DePIN, and this year we're going to be meeting even more founders, learning about the projects they're building, the traction they're getting, and the revenue they're generating. The podcast is brought to you by Fluence. Fluence is building a decentralized compute platform; it's a project I'm a co-founder of and really excited about. So buckle up for this year. Please subscribe to the channel and listen to us on Spotify or Apple. And of course, if you're around any of the large conferences, come to our DePIN Day, where you can meet these founders and the Fluence team in person and hear from the ecosystem partners as well. I look forward to seeing you in person; let's get going on this podcast.

Hello everybody, welcome to the DePIN Podcast. I'm your host, Tom Trowbridge, and I am thrilled to be here today with Ivan, co-founder of Fortytwo. Welcome, Ivan.
>> Great to be here. Thank you for inviting me.
>> Listen, this is a DePIN-focused podcast; that is certainly the priority. You guys are interesting because you're a hybrid of DePIN and AI, and obviously I think DePIN is exciting, but AI is where everybody is right now and, I think, is going to be for quite a while. So why don't you dig into what Fortytwo is doing and the combination of DePIN and AI.
>> Sure. Our team actually comes from the AI space; the core part of our team has been working in the field together for more than ten years. But we've always struggled with AI scalability: there's not enough centralized compute, and there are very harsh rate limits coming from centralized AI providers. On top of that, the training of large models has been hitting a plateau, with diminishing returns coming from big and very expensive training runs. So, not having billions of dollars of capital to sustain the projects and ideas we were trying to build, we saw the opportunity to solve AI's infrastructure challenge, and to do it through what is essentially the DePIN approach: we use the consumer hardware that's already out there to build a peer-to-peer network that combines the capabilities of small specialized AI models running on consumer compute into one big model. You can think of Fortytwo as "the network is the model." If you're familiar with LLM architectures, you might have heard of mixture of experts, one of the more common architectures for LLMs, where a big ChatGPT-style model has many experts inside of it. We're doing the inside-out version of that: we place many, many experts on the MacBooks and PCs of our node operators, and the way the network operates allows those models to amplify and expand each other's capabilities, rather than a single model working on its own. Hopefully that makes a little bit of sense, but happy to elaborate.
>> It does, but there's so much to get into there. I'm torn about which way to go, but first let's go big versus small. There are two schools of thought (maybe there are more, but two that I know of). One is that the large models are the way to go, because the large models will learn the most, have the most resources, and so will basically out-compete and out-expertise any small models, obviously not accounting for proprietary data sets. Leave proprietary data sets aside; that's a fascinating topic all by itself. But the perspective of you and Fortytwo is that you can get to a similar or even superior outcome via a combination of small models, versus the big one-model-fits-all (or solves-all) of xAI, ChatGPT, etc. Right? And that's a big division in the industry already.
>> Yeah, we've been pretty early advocates for small models. We witnessed this trend with some of the first smaller specialized models, which could outperform frontier models on domain-specific tasks. And ever since, I think our camp has been winning, because more and more industry leaders are talking, first of all, about the challenges of the large models. Ilya and many others in AI are talking about how training of large models is bringing diminishing returns. It becomes more and more expensive to train a better large model, and the improvements are marginal: we see those single-digit percentage-point gains on benchmarks with every other generation of large models, and GPT-5 was a disappointment for many. On the other hand, local inference is now being actively used; people are running adequate models on their mobile phones. But domain expertise in small models is a completely different game, because it's one thing to try to make a small model that can do everything. If you instead do a really simple fine-tune of a small model on a particular domain, it's more likely than not that you will get a state-of-the-art model that will easily outperform the GPT-5s, Geminis, and DeepSeeks of this world. And a16z was also talking on their podcasts about how it's unlikely there's going to be one single god model that can do everything. It's more likely, and we're seeing this even in autonomous vehicles, where they try to build one single model that will do everything for a car: the vision, the steering, the navigation, and so forth. But the more efficient autonomous vehicles are the ones that orchestrate small models rather than trying to do everything with one big model.
>> Super interesting. Let me back up: the implications, if you're right, or if this theory is right, are huge, because people are dumping money into the big companies. If you're right, there's probably close to a trillion dollars of capital, or a trillion dollars of value, betting the other way. If you add up xAI, OpenAI, and all these other LLM companies (OpenAI is itself 500 billion), that's roughly a trillion dollars of market value expecting the big ones to win, right?
>> But the challenge here is that I believe Hugging Face has about two million models already published, and it's just impossible to navigate that landscape. If you have a particular task, whether that's a coding task, or you just want the model to check your grammar, or whatever everyday task you might need at the moment, how do you know which small model to use? Even if you realize that, say, for JavaScript development there might be a better specialized JavaScript model, well, you know ChatGPT, you know Gemini, so you're more likely to use a larger model, because you can keep all your context there, it's a familiar interface, and you don't need to waste extra time looking for the ideal model. You could spend entire days testing different offerings on Hugging Face without ever realizing which one is truly better for the particular task you're trying to solve.
>> So it's distribution and discovery; distribution is a big piece of the big models' advantage.
>> Yes, and this is precisely the problem we're solving with Fortytwo, because we're not operating as a marketplace. You never know which model you interact with, because you're interacting with the entire network, and it's up to the network to determine which models to use for any particular prompt; they always work together. When a prompt enters the network, the first thing that happens is that individual nodes determine whether they have the expertise to contribute a response. Then they generate their inferences in parallel. They rank each other's responses to ensure we only keep the highest-quality ones, the ones that pass the accuracy test, the style test, the completeness test, and many others. And then they aggregate the highest-quality outputs from many models to present you with the final response. So we're able to squeeze the most value from the multiple models that people bring to the network, and that is pretty much the larger vision of how we're approaching the build-out of Fortytwo with small models.
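For readers who want the mechanics, here is a minimal sketch of that loop (expertise check, parallel inference, peer ranking, aggregation). The class names, threshold, and the scoring stub are hypothetical illustrations, not Fortytwo's actual implementation:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    expertise: set  # domains this node's model is fine-tuned for

    def can_answer(self, topics: set) -> bool:
        # Step 1: each node decides on its own whether it has relevant expertise.
        return bool(self.expertise & topics)

    async def infer(self, prompt: str) -> str:
        # Step 2: generate a candidate response (stand-in for local model inference).
        return f"{self.name}: answer to {prompt!r}"

    def rank(self, candidates: list) -> list:
        # Step 3: score peers' responses for accuracy, style, completeness (stubbed).
        return [1.0 for _ in candidates]

async def swarm_inference(prompt: str, topics: set, nodes: list) -> str:
    experts = [n for n in nodes if n.can_answer(topics)]
    candidates = list(await asyncio.gather(*(n.infer(prompt) for n in experts)))
    # Step 3 (cont.): every expert ranks every candidate; average the peer scores.
    score_rows = [n.rank(candidates) for n in experts]
    avg = [sum(col) / len(score_rows) for col in zip(*score_rows)]
    # Step 4: keep only candidates above a (hypothetical) quality bar, then aggregate.
    best = [c for c, s in zip(candidates, avg) if s >= 0.8]
    return "\n---\n".join(best)
```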
>> That is obviously fascinating, but it implies you need two things. You need the underlying expertise, obviously, but you also need what you mentioned: the layer on top of that expertise that can assess the expertise, even the relative expertise, and then combine, consolidate, edit, and fulfill. Right? You may even be fine on the expertise at the bottom, but that layer on top of it is probably the even more complicated piece, I would guess.
>> There's a pretty cool phenomenon in LLMs, and in some sense it mirrors how humans operate as well: we are much better at determining what's right or wrong when we're given possible answers than at generating those answers on our own. The same thing applies to LLMs. Even if an LLM did not generate the right response itself, if you give it responses from others in a swarm, it will be able to correctly assess their quality given the options, even if it didn't know the right answer beforehand and was not able to generate one. This is described in a paper on meta-ranking, and it's one of the core principles on which we've built our own system; ranking became the holy grail of Fortytwo. We also took some principles from sports, because interesting rating methods for competitions were developed in the middle of the previous century, and we built on those principles and ended up with a system where, even if a swarm generates 20 possible responses and only one of them is right, it can still amplify that weak signal and surface the truth much more efficiently than what you could get with majority vote or other approaches.
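A toy simulation illustrates the weak-signal amplification Ivan describes. The 60% judge accuracy and the Elo-style pairwise tally below are illustrative assumptions, not Fortytwo's published parameters:

```python
import random
from collections import defaultdict
from itertools import combinations

def judge(a: str, b: str, correct: str, accuracy: float = 0.6) -> str:
    # A weak judge: it prefers the truly correct answer only 60% of the time.
    if correct in (a, b):
        return correct if random.random() < accuracy else (a if b == correct else b)
    return random.choice([a, b])  # neither is right: pick at random

def swarm_rank(candidates: list, correct: str, n_judges: int = 20) -> str:
    # Tally wins across all pairwise duels, scored by many weak judges.
    wins = defaultdict(int)
    for _ in range(n_judges):
        for a, b in combinations(candidates, 2):
            wins[judge(a, b, correct)] += 1
    return max(candidates, key=lambda c: wins[c])

candidates = [f"wrong-{i}" for i in range(19)] + ["right"]
print(swarm_rank(candidates, "right"))
# Usually prints "right" despite 19:1 odds. A plain majority vote cannot help
# here at all: with 20 distinct answers, every candidate appears exactly once.
```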
>> That is interesting. And, how do I phrase this the right way, that assessment ability is not highly proprietary, right? It isn't a special piece; you don't have to have the most sophisticated xAI or OpenAI model to be able to make those assessments. That's the other key piece. You can have small models equally skillful at assessing other models.
>> Of course, and actually that research on meta-ranking shows that even the weakest models are very capable at ranking responses. So again, the particular model that you brought to the network could be quite poor at generating its own responses, but it's still going to be quite valuable in the ranking process.
>> Really interesting. There's a lot I want to get into, but let's back up for a second and talk about your background and how you arrived here.
>> Sure, it's been a long journey. Personally, I started in game development a long time ago, back in my high school years, always with the idea that games are perfect playgrounds for artificial intelligence, even though those were super early days. Then, as a game studio, we started experimenting with creating self-learning agents, with neural networks evolving through evolutionary algorithms and so forth, and the results were so promising that we started to pivot more toward AI technology than toward developing games. Around the time we were working with self-learning agents, the transformer architecture was introduced, and within a year or two we started seeing some of the first language models. The thing we valued most about evolutionary AI was that it was perfect for creating characters: when you have a neural net that is dynamically evolving, it does not create a perfect AI, it creates a more human AI, because it's capable of making its own mistakes. It doesn't fall into fixed patterns; it does not always win. Especially in a game scenario, that felt much more natural compared to other approaches. So we thought the first language models were the perfect way for us to express the character of different virtual beings, and we started experimenting with that. We built what I believe was one of the first conversational AI platforms specifically for entertainment, and we integrated it into a couple of metaverse projects that were popular at the time. Then we went 100% into AI projects and did our own language model training for NEOM in Saudi Arabia. We built a language model that could do spatial reasoning, so you could describe your ideal home to it and get structured output of what goes where: what items, what furniture, the scale, the spatial relationships between those items. But again, AI scalability was the biggest challenge; we never could get enough compute out of centralized providers. It just didn't exist, and we couldn't scale. We were doing conversational AI for famous DJs, even David Guetta at some point, and he wanted 300,000 of his fans to talk to his virtual avatar, which is simply not possible even today. Nobody is going to give you sufficient API calls, even if you combine Anthropic, Google, and OpenAI together. So we decided not to try to solve the unsolvable through the existing means and to go in a different direction, and that is how we started Fortytwo, with this idea: let's use consumer compute, let's rely on small specialized models, and let's build it in the open, where everyone can contribute to the advancement of the thing.
>> And when did you found Fortytwo?
>> The company was founded October 16th, 2024.
>> So just over your one-year anniversary.
>> Yes, but we started working on the prototype and the initial research paper a little bit before that, so I think the idea is about two years old.
>> Really interesting. Well, let's get into the DePIN aspect of it, because you referenced compute and you referenced consumer hardware. We've talked about the small-model, big-model question, which I want to get into in more detail and more specifics, but before we do that, let's talk about the DePIN component here: pulling things together and using these various resources.
>> Personally, I've been a big believer in the DePIN approach since pretty much my childhood, because when SETI@home was popular, and when Folding@home appeared, I always loved the idea that I have some hardware in my possession and I want to be able to contribute to something bigger than whatever I'm doing with this hardware at home. That became my early obsession with distributed compute approaches. Now we have a more organized way to do that with DePIN, but in the case of Fortytwo, I find that it's not just people contributing to the infrastructure of community-built AI; in some ways it's taking ownership of the whole thing. Our node operators are not just people running some software as part of the network; they're also people who can run whatever model they want. They can run any fine-tune of an existing model; they can bring any data and any tools to their node. So we incentivize node operators to make any kind of enhancement to the nodes they operate. Of course, if a person just wants to participate in the network without taking any extra steps, it's just a few clicks and boom, they're running their node as part of the network. But the idea is that we want to build Fortytwo similarly to how Wikipedia was built, where random people on the internet can decide, hey, I might take this weekend to fine-tune a model into an expert in, I don't know, CT scan analysis, in jazz history, in Delaware corporate law, or whatever niche field they might want to contribute to. They plug that fine-tune into their node, and suddenly the whole network becomes more knowledgeable and capable in that particular skill they contributed. And you do that without giving up the data, the model, or the weights you've just worked on. It remains 100% in your possession, ownership, and privacy, because the only thing your node shares is the results of its inferences.
>> Wow.
>> The bigger idea here is that we've recently been quite inspired by, again, an a16z podcast where Balaji spoke about how in the future there's going to be an American AI, with corporate and some government control; there's going to be a Chinese AI, a different side of the same coin; and, if we're lucky, there's going to be a decentralized AI that's representative of the wisdom and the viewpoints of all the civilizations on Earth. That's how we want Fortytwo to be built: allowing people even in underserved regions, where there are so many talented data scientists and machine learning engineers who might not have a direct career path toward a $100 million contract at Meta, but who are passionate about AI and want to contribute to it. They can do so by contributing a small node on our network, realizing that, just the same way that editing a page on Wikipedia makes Wikipedia a more accurate and more complete source of knowledge, contributing a small model to our network makes the network more capable, more accurate, and more skillful in certain domains, while allowing such people to potentially even make a living out of operating nodes on our network.
>> Let's talk about that reward system, because that seems like a key piece of it. It sounds like, and I'm going to guess, the more the model you contribute is used, or the more its answers are chosen, the more you are rewarded. Is that right? And you're sort of referencing nodes, but simultaneously you're mentioning models being used, which may not actually be synonymous.
>> In our case it is, because we're calling them capsules: you put a model in a capsule and it operates as part of your node. So when we speak about unique nodes on the network, we mean unique models being operated there. Anyway, the reward system is quite simple. As a company, we provide an API endpoint for any developer of AI apps and services who is currently using OpenAI, Anthropic, OpenRouter, or any other inference provider. They use our API in addition to, or in place of, their current services, and they pay for API requests. The API requests get converted to our token, which creates the reward pool, which gets distributed over the winning nodes. What do we mean by winning nodes? Those are the nodes that passed a certain quality threshold as part of the inference round. In order for us to create a system without any gatekeeping, to which anyone can contribute any kind of model, without anyone knowing which models are being run on the network, we need a reputation system, and that is the reason we're building this the web3 way, with the blockchain and all that: we need a place to keep the reputation of the individual nodes. We use Monad as our settlement layer, and we're calling it a proof-of-intelligence consensus. Your node builds reputation over time. At first it just participates without being rewarded, or even considered for response aggregation; it does some test inferences so that it gets a chance to be judged by its peers. After it runs for maybe a week and builds sufficient reputation, it will be allowed to judge others as well as participate in real inference rounds, so its responses can actually be used for the final outputs of the network and be rewarded for doing so. But if the node, or rather the model operating within the node, does not provide quality responses, it begins to lose its reputation and does not get rewarded at all.
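A toy model of that lifecycle (probation, peer judgment, reward eligibility) might look like the following. The thresholds, decay rate, and helper names are assumptions for illustration, not the actual proof-of-intelligence parameters:

```python
from dataclasses import dataclass

PROBATION_ROUNDS = 200  # hypothetical: roughly a week of unpaid test inferences
QUALITY_BAR = 0.5       # hypothetical reputation threshold

@dataclass
class NodeState:
    reputation: float = 0.0
    rounds: int = 0

    def can_judge_and_earn(self) -> bool:
        # New nodes only run test inferences until peers have judged them enough.
        return self.rounds >= PROBATION_ROUNDS and self.reputation >= QUALITY_BAR

    def record_round(self, peer_score: float) -> None:
        # Moving average: good rounds build reputation, poor responses erode it,
        # which eventually cuts the node off from rewards entirely.
        self.rounds += 1
        self.reputation = 0.95 * self.reputation + 0.05 * peer_score

def distribute(pool_tokens: float, nodes: dict, round_scores: dict) -> dict:
    # API revenue -> token reward pool -> split among this round's winning nodes,
    # i.e. those past probation whose responses cleared the quality threshold.
    winners = {nid: s for nid, s in round_scores.items()
               if nodes[nid].can_judge_and_earn() and s >= QUALITY_BAR}
    total = sum(winners.values())
    return {nid: pool_tokens * s / total for nid, s in winners.items()} if total else {}
```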
>> That I understand. What's interesting is that the more complex or the more specialized the question, the more relevant the trust in the model is, right? If I ask what's 20 plus 20, every model is going to give me the right answer. If I ask, hey, here are three symptoms my four-year-old has, what disease do you think he or she has, you could get a wide range of answers, and there you actually may want the branded response: this is what the model trained on the NYU pediatric data set, or whatever data set, answered.
>> Exactly. So if a person is running any kind of generic model from Hugging Face, a Qwen, a Gemma, or whatever, their earnings are of course expected to be much lower compared to the nodes that run unique, high-quality models.
>> Specialized.
>> Yes. And that creates additional incentives for people not just to run nodes with generic models, but to bring unique expertise, because in that case earnings can be quite substantial for participants.
>> How much of these small models' expertise is based on unique or specific data sets versus the actual training algorithm itself? Because if it's the data set, you'd think that would be the more important component, given you would think the large models could very easily copy or achieve similar expertise to a small model. Tell me where I'm wrong.
>> Sure. The thing here is that when you're dealing with a larger model and trying to make it capable in multiple domains, it becomes a very difficult and very expensive fine-tuning process, because you improve its capabilities in one domain but start to lose accuracy in other domains. That is where all those massive compute budgets come from: when OpenAI is training their next large models, they do a lot of iterations, and sometimes they go in the wrong direction and have to redo something, do something with the data, something with the parameters, and so forth. But it's much more efficient to do a training run or a fine-tune of an existing small model on a very narrow domain of expertise, because in that case you can say, hey, I've just created the best model in a particular programming language. I don't care if it got worse in C++; I know it got better in JavaScript, and that's all I care about. I now proclaim this model a JavaScript coder, and that is it. We ran one such experiment recently: we trained a Rust-specialized model. We picked Rust in particular because it's a quite modern language that's evolving actively, and there's simply not enough data on it to do efficient model training. So we used our own network to generate 200,000 data points on Rust, and we used that data, which came from our network, to do a very simple fine-tune of a Qwen coder model. Just from the first iteration, it became state-of-the-art in Rust. Even though it's only a 14-billion-parameter model, it surpassed GPT-5 Codex, it surpassed all the frontier big guys, and it surpassed the baseline by 27%. That came from a simple LoRA fine-tune, pretty much the simplest thing you can do in terms of fine-tuning, but the data was of such high accuracy that it was able to contribute to such dramatic improvements. And this is exactly the evolving-intelligence flywheel we want to start, where we use our own network to create data sets that are given back to the community, so that our community, or we ourselves, can build specialized models that we put back into the network. The network becomes even smarter, capable of generating even better data sets, and the cycle goes on.
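The recipe Ivan describes corresponds to a standard LoRA fine-tune; a minimal sketch with Hugging Face's transformers and peft libraries follows. The base model id, data file name, and hyperparameters are illustrative guesses, not the team's actual configuration:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-Coder-14B"  # illustrative 14B Qwen coder checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small low-rank adapters instead of all 14B weights, which is why
# Ivan calls it pretty much the simplest thing you can do in fine-tuning.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical JSONL file holding the ~200k swarm-generated Rust examples.
data = load_dataset("json", data_files="rust_swarm_data.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
                remove_columns=data.column_names)

Trainer(model=model,
        args=TrainingArguments(output_dir="qwen-rust-lora", num_train_epochs=1),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)).train()
```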
But of course, we understand that we can never compete on our own with the power of the community, the same way the Wikimedia Foundation would never have been able to create the Wikipedia we know today on its own. It relied on individual contributors, on editors, and so forth. And this is precisely the movement we need to build for Fortytwo to succeed, where we provide the opportunity for people all over the world to make such meaningful small contributions to the network.
>> Well, listen, I get the analogy, but I'll tell you, I think Linux is probably a better example. Wikipedia has been held hostage, I think, and if you look at it, the actual power contributors are a relatively small number. You can have a real debate as to how open it is and how certain topics within it have been highly politicized. So I'm not sure Wikipedia is the best example. Obviously it was a community that built it, but if you start to peel it back, it's pretty narrow, I think.
>> Yes, we're talking more about the original idea behind Wikipedia. The reason we're not so keen on open-source analogies is that with open source, the barrier to entry is seemingly a bit higher: you have pull requests, you have the review process; there's still gatekeeping in open-source software. Whereas with Wikipedia, at least in the early days, you could still create new pages and make edits that went public before being taken down if they were considered inaccurate or lacked sufficient sources. That is closer to how we approach Fortytwo.
>> Fairness. I get that point; the process required for open-source contributions is heavier.
>> Yeah, we sort of want to unleash this beast so that even if we cease to exist as a company, Fortytwo can continue to exist and get better, because this whole self-supervised regulation is completely independent of us. It happens naturally as part of the peer-ranking and reputation system. We don't need to be there to watch it grow, pretty much.
>> So talk to us about the traction you've had so far, because it looks like you've got a bunch of nodes and models up and running. Help people understand where you stand now.
>> Sure. We've onboarded some of the first node operators; we have about 700 nodes live on the network. We have a huge waiting list, with about 45,000 people who applied to run nodes, and we're going to start onboarding them more actively in the coming weeks and months. But the most important thing for us this year has been running the benchmarks, because we had this idea that if we combine the power of multiple small models, we can compete with the big guys, and this is precisely what we did a few weeks ago. The results surprised us, because on several key benchmarks, in coding with LiveCodeBench, in math with AIME 2024 and 2025, on Humanity's Last Exam, and on GPQA Diamond, which is a hard-science benchmark, we were able to achieve the best results among all existing models. So again, we surpassed the OpenAI, Anthropic, Google, and xAI models. But most importantly, here's what we noticed in the process. When we were running the GPQA Diamond benchmark, we ended up second; the first position was taken by Grok 4. We thought, what's the deal here? So we tried to run the benchmark a little differently, with extraneous information, the way a college professor might try to mess with a student to see whether they truly understand the subject, by putting some extra data in the problem to see whether it distracts the student. We did precisely that. What does GPQA Diamond look like? It's a collection of hard-science questions on chemistry, biology, physics, and so forth, where a question is asked and multiple-choice responses are provided. But we added something else to the prompt. We added, verbatim: "Here is a non-relevant message. There is a cat on the roof. Maybe it is hungry."
After adding that, most models lost about 10 to 15% precision. Grok 4 immediately dropped to fourth position and started to give wrong responses where it used to give the right response beforehand. It highlights the limitations of current reasoning models. When DeepSeek came out, people were shocked: for the first time they saw the chain of thought of a language model, and they thought, hey, those things think the same way I do; they take steps, they think about the approach they're going to take to solve the problem. But it's pretty much just mimicking our reasoning, not doing any real reasoning, because, just as in the case with this extraneous information, we see that the best reasoning models start wasting most of their compute thinking about a message that clearly says, right in the prompt, that it's irrelevant. It's about the cat; it has nothing to do with the science problem we've just given. And Fortytwo demonstrated surprising resilience to those things: resilience to prompt injections and any kind of noise in the prompt or the context. That translates to real-world usage, because how do people talk to ChatGPT? They say hello, they say thank you for helping me, they put in a lot of extra stuff, sometimes not realizing that any extra token you put into the system might be the token that throws the LLM off the rails and causes it to lose all its accuracy.
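The probe itself is easy to reproduce on any multiple-choice benchmark; a sketch follows, with the distractor sentence taken verbatim from the episode and everything else (field names, the ask_model callable) hypothetical:

```python
DISTRACTOR = "Here is a non-relevant message. There is a cat on the roof. Maybe it is hungry."

def build_prompt(question: str, choices: list, noisy: bool) -> str:
    lettered = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
    noise = f"\n{DISTRACTOR}\n" if noisy else "\n"
    return f"{question}{noise}{lettered}\nAnswer with a single letter."

def accuracy(ask_model, items: list, noisy: bool) -> float:
    # ask_model: any callable mapping a prompt string to a letter ("A", "B", ...).
    hits = sum(ask_model(build_prompt(it["question"], it["choices"], noisy)) == it["answer"]
               for it in items)
    return hits / len(items)

# Comparing accuracy(model, gpqa_items, noisy=False) against noisy=True is the
# whole experiment: most reasoning models dropped 10-15 points; the swarm barely moved.
```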
This Thursday we're going to be announcing the benchmark results and also publishing the technical report. It's going to be 35 pages explaining everything that went into the current achievement: the architecture and the approaches we're taking with ranking, reputation, and so forth. We're making it all public. And considering how good the results we're getting in the benchmarks are, we want to be a little more aggressive with go-to-market, because if we're able to provide such high-quality inferences today, there's no reason for us to wait until the decentralized network is completely deployed. We might even add some centralized compute to it, so that at least we can start onboarding clients sooner. What's also been important for us are our partnerships in the DePIN space. We've done a partnership with Dawn, which is pretty exciting for us because the hardware Dawn is packaging in their black boxes is ideal for us to run nodes on, and getting all that capacity, plus the opportunity for Dawn customers to also contribute to AI, is a great synergy.
And Acurast has been very interesting for us, because they have a large network of compute nodes on phones, and what makes phones especially interesting for us is that they operate within a trusted execution environment. That allows us to make privacy guarantees that even big tech cannot make with enterprise data centers, because data-center TEEs are still in their infancy, and only phones are at the advanced stage where it can be viable for a production environment. So these are different directions, but it's interesting how we can approach DePIN expansion from different angles.
>> On those lines, who do you think the customers are? You talked earlier about revenue, or how rewards would work for different nodes and different models. Who would be the customers? And related to that, you mentioned Rust as an example. Is there a specific vertical where you think you may have an edge in accuracy that is particularly valuable, and that's your initial customer-base target area?
>> Sure. Our focus has been on coding, because we see that coding copilots and vibe-coding platforms are the first major vertical that appeared out of this LLM market. We see how amazingly Cursor and many other platforms are doing, and they bring actual value. They're not a replacement for software engineers, but they increase the efficiency of their work and sort of equalize the capabilities of junior engineers, middle engineers, and so forth, which is very important for the advancement of the workforce in that field. So we are focusing heavily on coding; that's part of the reason we've been doing coding benchmarks in particular, and part of the reason we did the Rust model, to make Fortytwo the best choice for Rust coding. But all the other niches are very interesting as well, because they all have their own nuances. You mentioned a medical use case, and that's something I'm personally very interested in, because I'm seeing more and more specialized models beginning to appear in the medical field. If you get a hallucination on a coding task, that's one thing: you rerun the prompt, it's no big deal, you get a compiler error. But if you get a hallucination on a medical task, that's a whole different issue.
And people, unfortunately, are starting to use ChatGPT, in the best case for a second opinion, but in the worst case to get a first opinion on medical matters. We're starting to hear stories about people inflicting damage on themselves because they followed a recommendation that GPT hallucinated on a particular medical issue. Since we cannot stop this trend, unfortunately, we can at least do what we can to ensure that the responses we provide have already passed an internal consensus of specialized models, and ensure that we do not pass along any hallucinations. It almost mirrors how real doctors operate, with medical councils and groups where they discuss potential diagnoses and so forth. If we have multiple medical models that can argue with each other and rank each other's hypotheses before anything is passed to the customer, that could allow Fortytwo to become a more reliable source on medical questions. And I don't think we're going to advance this as part of the public network; we're more likely to work with medical institutions to do private deployments of swarms, so that they can operate on validated information while using all the advantages that come with the peer ranking and response aggregation our architecture brings.
>> Back to that: I think coding first, medical maybe second. But on the coding side, just as an example, how would that revenue work? Would developers, would businesses buy a subscription? Could it be that same concept where you get X hours or X prompts or unlimited use? How would that work?
>> For our customers, Fortytwo is not going to look any different, in business terms, from any other inference provider. If we take OpenAI, Anthropic, and all the other big tech companies that provide inference APIs, it's going to be the same. We're also going to accept fiat payments, and it's going to be an OpenAI-compatible endpoint, so developers would not need to change pretty much anything. They only change the API endpoint in their code, plus the secret used to access it, and that's pretty much it. They'll be charged similarly to how developers are charged today by OpenAI and the others.
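If the endpoint is OpenAI-compatible as described, switching should look roughly like this with the standard openai Python client; the base URL and model id below are hypothetical placeholders:

```python
from openai import OpenAI

# Only the base URL and the secret change versus calling OpenAI directly.
client = OpenAI(base_url="https://api.fortytwo.example/v1",  # hypothetical endpoint
                api_key="YOUR_FORTYTWO_KEY")

response = client.chat.completions.create(
    model="fortytwo-swarm",  # hypothetical model id
    messages=[{"role": "user", "content": "Write a Rust function that reverses a string."}],
)
print(response.choices[0].message.content)
```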
>> And on the medical side, unlike the large LLMs, do you think there's a path to having access to proprietary data sets as well? Because then the institutions don't have to share the data; the models can just be queried and return the answers. To me, that's an enormous potential source of value.
>> Recently we started entertaining the idea that we could operate similarly to how Red Hat operates with Linux, where they have the open-source distro, but their key business is in doing custom-tailored deployments of Linux solutions for enterprises. We might take a similar approach, where we maintain the open Fortytwo as the existing peer-to-peer network, but for more specific fields, like the medical domain we're discussing, we could do private deployments, and potentially they could unite the data of multiple hospitals. Imagine we take hospitals in Boston, for example, and we do node deployments in one hospital, in a second, in a third. All of those nodes get access to the proprietary data that belongs to their particular hospital, but, on the same principle as before, the data remains private and only inferences are shared. So all the hospitals in the city, the state, or the country can squeeze out the wisdom and experience of multiple entities without fully revealing the information.
>> And make it a revenue source. That's the key thing.
>> Yeah. So help us understand scale. You tell me the metrics, right? You mentioned you've got 700 nodes and models on now. And I have to think scale is not a friend: if you had 40,000, that strikes me as a lot of models to go through every time someone answers a question. But leave that aside for a second. Five years from now, three years from now, where do you want to be in terms of models, in terms of revenue? What's the opportunity here, so people can grasp the scale of what you're trying to target? What do you think is possible? What would make you happy from an execution and metrics standpoint, whatever it may be? It might be customers, might be verticals, might be a number of different things.
>> The most important thing for us right now is to get the ball rolling with the initial contributors who are not just contributing compute, but contributing models and data. Sorry for going back to the Wikipedia analogy, but there was a point in time when people thought, hey, this is worth my time, creating a page on this platform. We need to get to the point where, I don't know, an AI researcher in France or an ML engineer in Abu Dhabi decides, hey, I might build a small model and run it as part of the Fortytwo network. Getting that early grassroots enthusiasm is what's essential for our long-term success. And in five years, what we want to see is, first of all, that the network becomes sustainable in the sense that we don't need to touch anything: it's just the power of a community of people who care about AI and people who want to benefit from AI, and they start and continue to improve the network on its own, without our participation in the process. In terms of the sheer number of nodes live on the network, our estimate is that we would ideally need tens of thousands, up to 100,000 nodes, to be efficient. And of course we are not querying all the models on every prompt that enters the network. We've implemented something we're calling semantic topology: every node positions itself in a certain way on the landscape of the peer-to-peer network based on its expertise, so a prompt is not broadcast to the entire network but routed in the general direction of the more relevant nodes. That is the key unlock for our scalability as more and more nodes join.
>> I get that, but revenue, customers, right? Help people understand where that can go. I mean, you're seeing OpenAI with billions of revenue, I think two billion or something along those lines, and it's been live for what, two years at this point, maybe less. So where do you go? You can think about it as, how big is just the coding vertical, and where do you think you go within it?
>> I think we can aim for all verticals. The only vertical that's not a great fit for Fortytwo is anything like real-time conversational agents, because we optimize for quality and for cost, but not for latency. You're expected to wait several seconds more, maybe 10 to 20 seconds on some tasks, compared to what you get from centralized AI providers. Beyond that, for us the sky is the limit, because what we're starting here is similar to how open source competed with proprietary software. In the '90s and early 2000s it seemed like Microsoft, with all its engineering and all its capital, was too big to ever lose its grip on the industry, and it had a very strong anti-open-source stance. Twenty years later, we're seeing that Microsoft has become one of the advocates for open source. We now have Ubuntu running inside Windows 11, and we have Microsoft contributing to open source and benefiting from open source. In the end, I think it's fair to say that open source won over proprietary, even though the competition was, again, random enthusiasts versus some of the most capitalized companies on the planet. And we believe the same thing can happen with AI, where the power of community will be stronger than all the capital in the world that's currently concentrated around OpenAI, Project Stargate, and all those other initiatives.
>> Listen, that obviously is big, but you're still not giving me a number. I want numbers. Give me some numbers: where do you want to be revenue-wise? Think about it: DePIN is all focused on real-world business, real-world solutions, real-world revenue that cycles into token economics. Let me put it differently: you cannot get to the model numbers you want without generating revenue, because people won't care if they're not getting paid. So you have to have distribution. If you end up with 10,000 models and no one's using them, it doesn't work; it falls apart. You need revenue to sustain and incentivize all those nodes, so presumably that's got to be a big focus.
>> It is, and we believe there's no reason we can't fully occupy the coding vertical, along with the others where people don't expect the fastest response but do expect the most accurate response.
>> High quality. The high-quality response.
>> Yes: deep research tasks, any kind of scientific research, anything related to development, and so forth. The reason it's hard to estimate a figure is that we're still not sure, and by "we" I mean the AI industry in general, what trajectory we're on with AI scaling. There are many skeptics today saying that we're at the peak of the AI bubble, that it's going to burst, that LLMs are not going to get any better and disappointment will follow. We of course understand all the challenges the LLMs are facing, but the technology is still useful, and we believe that if we can guarantee high-accuracy responses, we can actually expedite the adoption of LLMs in practical use.
>> You don't care what people think about the industry. All you care about is: I'm providing a service that's high quality, and people will pay me for it.
>> Yes.
>> You don't care what it's called. So who cares whether people think LLMs are getting better or worse? It doesn't matter. If you've got a good coding solution and you're superior to Copilot or whoever else, that's a market. That market is X big, and you expect to take whatever share of it you can. Who cares what the industry does? You just know you've got a superior service that you'll charge whatever you charge for, and you should get however many customers. Right?
>> Of course. In that regard, if we're talking numbers: as I mentioned, we optimize for cost as well, because with consumer compute we want to make it cheaper than what OpenAI and Anthropic currently charge, even though all the centralized AI providers, probably with the exception of Anthropic, are heavily subsidizing the cost of their APIs. They are actually paying for your API request, because it's just not possible to do inference on large models at the cost they're currently offering. Anyway, we believe that in coding in particular we can become one of the major inference providers, at a cost similar to probably DeepSeek's. And I think that if people start to see that we've dramatically lowered the number of hallucinations, adoption is going to rise as well, so the actual numbers will go much higher than what we're seeing today in coding. I don't remember the exact figures, but Cursor is doing extremely well, even though we're still in the very first days, the infancy, of this industry, because people cannot 100% rely on the code they're getting from Cursor and all those other tools.
>> Right, and I get that. I love the concept of you guys being able to run coding competitions versus Cursor, and all kinds of things like that, where you can see the answers and have them checked by humans, checked by machines, whatever it is. That's really exciting. Are there other projects in the DePIN space you're paying attention to that you think are doing interesting things? Or are you heads-down on this, with such an AI focus that you may not pay much attention to the rest of the DePIN landscape? I'm curious whether there's anything else out there you think is interesting or effective.
>> I obviously have to mention the partners we're currently building with, Dawn and Acurast; I think those are great. There's another one whose name escapes me right now; I think they did a fundraise recently around solar energy. Do you know the one I'm talking about?
>> You mean Glow? Glow is putting solar panels around the world, and they've raised money. I don't know if that's who you're talking about. What's the model?
>> It could be them; I don't remember. Anyway, I think those are the main ones on our radar today. Of course, there's distributed compute, like Aethir and all those other guys; I don't know whether they're purely DePIN or more like decentralized infrastructure, so it's hard to truly position them in one category or another. But I do believe there's a great future in those approaches as well. And yeah, I think that's pretty much it at this point.
>> Well, listen, how do people stay in touch? If they want to run a model, how do they do that? If they want to track Fortytwo's progress and test out some coding capabilities, how do people follow you or Fortytwo?
>> There's a form on our website where people can apply to run nodes. Again, we still have a huge waitlist, but we're going to get much more aggressive about onboarding more and more people. And of course, Twitter is where people can keep track of whatever is happening. This week we're going to have some big announcements. As of yesterday, we have a new Twitter handle: it used to be Fortytwo Network, and now it's just Fortytwo. So, yeah.
>> Well, listen, I want to see you back on here once things are a bit more mature, once you've scaled up the models and you've got some real interesting things to talk about. It sounds like there's a lot coming. This was a terrific introduction, and I think small versus big models is a fascinating debate; I think we're going to be having it for quite a while, and for a lot of people's sake I hope the small model wins. But let's see.
>> They're already winning.
>> Great. Well, don't tell the OpenAI investors that. We'll keep it our secret. All right. Excellent. Thanks, Ivan. Appreciate you being on.
>> Thank you, Tom. It was a pleasure.
>> The DePIN Podcast: the place where we explore real-world use cases unlocked by crypto. That's all for today. Thanks for watching to the end. If you enjoyed this episode, please help us grow: add a like, add a comment. And of course, if you have anyone you think we should be talking to or should have on this podcast, leave a comment or reach out. Thanks for watching, and we'll see you next week.