Jane Street on GPUs, Trading, and Hiring: A Conversation with Dwarkesh

By Jane Street

Summary

Topics Covered

Trading spans a spectrum of time horizons and intelligence
Financial data has opposite properties to LLM data
Trading is AGI-complete because all problems flow into it
Humans outperform models through market phase transitions
Physical infrastructure is the new bottleneck for AI compute

Full Transcript

Jane Street are partners of my podcast, and one of the fun ideas we had was: why don't I come visit a data center for training that you guys run? So I just got a tour of this Texas data center from Ron Minsky, who co-heads the technology group, and Dan Pontecorvo, who heads the physical engineering team. So thank you guys for showing me around.

I've never been here before, so I was also getting a tour, which was great.

Previously I was confused: how can you be doing GPU things if you need to be trading on nanoseconds? Maybe you can talk through what the actual time horizon of the trading you guys do is. Can you afford to be running big models in the middle of making trading decisions?

I think the thing to understand here is there isn't one time horizon; there are many time horizons. There are trading systems we build and trades that we do where, in order to be competitive, you actually have to turn around a packet in under 100 nanoseconds, and that's a very different regime, right? You know, people sometimes talk about like, oh, can you guys write high-performance stuff in OCaml? It's like, we can, but for this kind of speed, it doesn't matter if you write in OCaml or Rust or C++. You can't use a CPU,

right? You are going to be on an FPGA that's directly wire-attached to the network, and you're going to be turning around the packet so fast that if you attached an oscilloscope to the wire on the way in and the wire on the way out, you would see the packet start to leave before it's done being consumed. So it's a very different, very specialized regime, but when you're in that time regime you really can't do very much computation. The decisions you're making are going to be very simple.

And in fact, there's this whole curve of trade-offs between how smart the decision you're making is, be it a model or maybe even some kind of handwritten decision-making process, and how fast the turnaround is. The right way to build an optimal trading strategy is really to have a kind of ensemble approach, where for some kinds of decisions you're making very simple decisions very quickly. For some kinds of decisions, you're operating at the scale of, you know, instead of a hundred nanos, maybe a handful of mics, or tens of microseconds, or hundreds of microseconds, or milliseconds. And in some cases, there are processes where if you can get that decision turned around in an hour or that day, that's totally fine. And you're kind of competitive on a time basis at each of these horizons, but you're making very different kinds of decisions at all of them.

Maybe you can't say, but what is it exactly these models are predicting? Surely it's not just the next thing in the order book, or maybe it is.

Right, so we're definitely dancing towards stuff that's hard to talk about, but I think the simplest and most important one that we've been thinking about, like we think about it now, but also 25 years ago when I started at Jane Street, when I was building models out of linear regression and stuff like that, a very useful kind of thing is to predict a fair value for a thing: what do we think this thing is worth? And that fits in a very composable way into lots of different trading processes. That's not the only kind of thing that we use as a prediction target, but it's an important one.
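
As an illustration of the idea of predicting a fair value with linear regression, here is a minimal sketch. It is not Jane Street's method: the two features, the synthetic data, and the helper names `fit_fair_value`, `predict_fair_value`, and `quotes_around` are all assumptions made up for this example. It only shows how a single fair-value estimate can be reused by a downstream step, such as quoting around it.

```python
import numpy as np


def fit_fair_value(features, prices):
    """Ordinary least squares with an intercept: price ~ features @ w + b."""
    X = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(X, prices, rcond=None)
    return coef  # last entry is the intercept b


def predict_fair_value(coef, features):
    """Apply the fitted coefficients to new feature rows."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ coef


def quotes_around(fair_value, half_spread):
    """One hypothetical consumer of a fair value: a symmetric bid/ask."""
    return fair_value - half_spread, fair_value + half_spread


# Synthetic data (assumption): two made-up signals and a noisy price.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 2))
true_w = np.array([0.5, -0.2])
prices = 100.0 + features @ true_w + rng.normal(scale=0.05, size=500)

coef = fit_fair_value(features, prices)
fair = predict_fair_value(coef, features[:1])[0]
bid, ask = quotes_around(fair, half_spread=0.02)
```

The same `fair` number could feed other processes (risk checks, hedging decisions) without refitting, which is one reading of "composable" here.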

It seemed like a meme I was getting for a while about what trading firms do is that you've got to get the colo where the NASDAQ exchanges are, and it's very important that your machines are right there.

Without thinking too much about the exact details of what we put where, your inference processes might be on CPU, might be on FPGA, might be on GPU, depending on the kind of constraints: how much compute you need, how big the model is, what kind of latency turnaround you need. And yeah, bigger, slower things you can put farther away. It's annoying to have to put all the compute right by the exchange. And for the stuff that's really, really fast, being in the colo isn't enough. You care about how long the spool of wire that gets you there is. You're literally measuring out the length of the fiber runs when you're at this very, very low nanosecond scale. But in general, bigger models give you a lot more flexibility in terms of where they physically go.

If we're putting GPUs in some of these colocated facilities that are next to the exchanges right now, you have to work with their rules. You know, who is that provider? Who is giving you that space? Yeah. And your power, your cooling, all those constraints are now maybe slightly tighter than if you have a facility that you're designing and operating. So you're now having to come up with ways to, hey, maybe I could only get one GPU in a rack because it consumes so much power. So now I have to spread it all out rather than being able to do liquid-cooled in one rack. So these are all things we need to keep in mind for our, you know, compute.
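
Two of the physical constraints mentioned above reduce to simple arithmetic, sketched here under stated assumptions. The refractive index of 1.47 is an assumed typical value for silica fiber, and the rack and server power figures are hypothetical numbers chosen only to show the calculation; neither comes from the conversation.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458
FIBER_REFRACTIVE_INDEX = 1.47  # assumption: typical silica fiber


def fiber_delay_ns(length_m, refractive_index=FIBER_REFRACTIVE_INDEX):
    """One-way propagation delay through a fiber run, in nanoseconds."""
    return length_m * refractive_index / SPEED_OF_LIGHT_M_PER_S * 1e9


def units_per_rack(rack_power_kw, unit_power_kw):
    """How many GPU units fit under a rack's power budget."""
    return math.floor(rack_power_kw / unit_power_kw)


# A 10 m fiber run costs roughly 49 ns one way, about half of a
# 100 ns turnaround budget.
delay = fiber_delay_ns(10.0)

# Hypothetical colo rack capped at 15 kW, hypothetical 10.2 kW GPU unit.
fit = units_per_rack(15.0, 10.2)
```

With these made-up numbers only one unit fits per rack, which is the "spread it all out" situation described above.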

You guys recently signed a $6 billion compute deal with CoreWeave.

Mhm.

What are you going to use that for?

The rest of the AI world has scaling laws. We have scaling laws too, and there are lots of models that we want to train. I think the thing that's interesting and maybe different between us and the more traditional AI labs is the amount of diversity in model architecture and the amount of experimentation that we're doing. So a lot of the value you get from all of this is just that people are trying lots of very different new things in the model designs, and giving researchers faster iteration time so they can discover more ideas and drive more innovation. It just turns out to be incredibly important.

In the case of these foundation labs, there's some gain from training just one model that does everything, that is fully general, rather than building a bunch of different custom models. Can you give me a sense of why there's a different trade-off at Jane Street?

For us, some of the specialization is about being adapted to consume the right kind of data, right? There are just many possible data sources that we might be feeding in. There are a bunch of differences in the data rates that we need to achieve. Another thing that makes us need to specialize some of what we're doing is that the overall kind of both inference and trading dynamics are made different by the bytes-to-FLOP ratio being different. We have way more data that we are using to train the models, but the data is, byte for byte, less informative, just because financial data is very noisy.

Yeah. And so the models tend to be smaller, the data tends to be noisier, and there tends to be a lot more of it. And it's also different between different models that we build for different applications, right? As we try to figure out how we can leverage more of the information that we get, it's like, oh, now there's all of the decisions, from how do we store and load data efficiently, to how do we shape the model, to how do we make the inference process have both the throughput and latency that it needs. There's going to be a whole different set of trade-offs there. And so there's just a lot of value in working that out and picking the best thing that you can do for different applications.
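
One way to picture the bytes-to-FLOP point is a roofline-style comparison, sketched below. This framing is an editorial illustration, not something described in the conversation, and every number is a hypothetical assumption: the sample sizes, the rough "6 FLOPs per parameter" training estimate, and the cluster's peak compute and data bandwidth.

```python
def bytes_per_flop(bytes_per_sample, flops_per_sample):
    """Input bytes the workload must read per floating-point operation."""
    return bytes_per_sample / flops_per_sample


def limiting_resource(workload_ratio, peak_flops_per_s, data_bandwidth_bytes_per_s):
    """Compare a workload's bytes/FLOP to what the machine can supply."""
    machine_ratio = data_bandwidth_bytes_per_s / peak_flops_per_s
    return "data loading" if workload_ratio > machine_ratio else "compute"


# Hypothetical small model on bulky, noisy data: 64 KiB per sample,
# ~1M parameters, ~6 FLOPs per parameter per training sample.
small_noisy = bytes_per_flop(64 * 1024, 6 * 1e6)

# Hypothetical large language model: 4 bytes per token, ~7B parameters.
large_llm = bytes_per_flop(4, 6 * 7e9)

# Hypothetical cluster: 1 PFLOP/s of compute fed by 10 GB/s of storage.
small_limit = limiting_resource(small_noisy, 1e15, 1e10)
large_limit = limiting_resource(large_llm, 1e15, 1e10)
```

Under these assumptions the small, data-heavy model is limited by data loading while the large model is limited by compute, which is one reason the storage and loading path would deserve more attention in the first case.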

What is the inference workload, actually? Or how does it compare, in broad strokes, to what your traditional big chatbot LLM company is doing?

Latency matters more, as you might expect. Batching is still an issue. Depending on the model you're doing, you might have models or parts of models that are disaggregated for different symbols that you're looking at. And so the same kind of pulling in data from multiple sources and batching them together makes a difference. I think another thing that's interesting is just that the data rates are really high. The aggregate data rate that you get in a large LLM lab from all of the different users is also very high. But the amount of sequential data that you're going to get from any one user is not that high. Whereas when the data that you're pulling is the bytes that are coming out of the NASDAQ feed, it's like, oh man, the data rate that you want sequentially consumed in one domain, causally one after the other, is really high. And so again the dynamics change. I think a lot of the same basic engineering questions are not so dissimilar, but all the constants are twiddled to different places, and so you end up making different choices.

What does that mean in terms of how you had to design these systems, whether it's in terms of storage or whatever else?

Yeah, there's I think more emphasis on the performance of the data loading than you might otherwise see. I think we're doing a lot of work to build out our own large-scale data storage system, our own internal object store, where we've used various vendor products, but over time, I think for some of these research-focused use cases, we need to operate at a much larger scale and also need to deal with a diversity of data centers, right? And this is less an inference-time and more a training-time question: we just can't get all the compute we want all in the same place.

And I don't know, I feel like in general an important trick in effectively running a technical organization is figuring out what shortcuts you can take. One shortcut that we were privileged to be able to take for many years is we got to pretend there was only one CPU architecture on the planet. Everything was x86-64. We pretended like none of these other things existed, and that simplified a bunch of things. We also had one big research data center and one big storage cluster, and that also simplified a ton of things. And actually both of those have now been unwound. You just can't get the amount of power, like you cannot wire in enough thunderbolts into the same data center to power all the things you need. You need to get the data centers built all over the place. So there's a big disaggregation problem, and that gives you a problem like, oh, now you have to think about your compute scheduling and your storage scheduling being intertwined one with the other, and there's a ton of data, so moving it around is non-trivial. And also we had to give up on this x86-only thing, because Nvidia has a bunch of cool new products that mean that you need to support ARM.

Now zooming out, I want to ask a very naive question. There's maybe a naive view that, you know, if you have AGI, it can immediately do what Jane Street does. Give me a sense of why that naive view is naive.

Yeah. And I don't want to totally discount it. You know, there's a world that we should take seriously where we're going to build large language models or some other AI systems that are strictly smarter than all humans on the planet and more capable at all cognitive tasks, and yeah, that's going to be weird, and that's a different state of things. And in that case, yeah, maybe large amounts of the things that Jane Street does will be automated away, and maybe we'll all just sit back and drink more margaritas or something. I don't know what that world looks like, but it doesn't feel like we're particularly close to that now.

I think in general it's easy to underestimate the richness and complexity of the work, both that a company like Jane Street does, but really that is done in any really ambitious, high-difficulty, company-scale task. Trading in particular feels to me kind of AGI-complete, sort of like NP-complete, meaning that all of the different problems of the world end up influencing what you're doing in a trading context. Because at the end of the day, trading involves figuring out what things are worth, which means making predictions about the future, and lots of different things flow into that. And as various pieces of that get automated, you have the usual thing of, the other hard parts that we don't yet know how to automate well, that ends up being where the competitive edge lies.

I feel like humans and human cognition are more valuable than ever. I have never been more desperate to hire more engineers and more traders than I am today, because everything people are doing is more valuable than it was. I mean, some of this is just me being somewhat skeptical that we are quite as close to the models that are smarter than humans at all the things as some people seem to think.

Maybe it's physical infrastructure, like actually getting the colo. Maybe it's actually the software infrastructure that you build. Give me a sense of what it is that would...

Yeah, we build a huge variety of complicated pieces of software, and have people thinking about lots of different trading problems, some of which are not very electronic at all. The business is just way more diverse than I think people give it credit for. And there's an idea of, oh yeah, it must be that simple thing where you just have smart people who make smart decisions and write good software, and if we could just automate the smartness part, that would be the whole thing. And I think it's just way more complicated than that.

What do you mean by the non-electronic parts of trading?

I mean, there's still trading that happens via chat, between people talking to each other and making decisions, and someone sizing up how much adverse selection they think the person on the other side of the phone represents. That's still a real part of the business. There are just different kinds of securities that have taken longer to get more automated. The bonds business, for example, is just not nearly at the level of automation that you see in equities. Indeed, I think we were kind of confused about this, those of us who have been in the business for a while. I mean, I started a little too late to really see the transition of equities becoming electronic. But I think people who were paying attention a little earlier than me were like, yeah, and I guess everything else comes next. And you know what, it's been 25, 30 years, and not everything has gone that way. We don't have a lot of people standing on the floor of exchanges anymore, but there's still lots of trading that is deeply intermediated by humans and human judgment.

Speaking of which, how much are humans in the loop between the model and the trading decision?

Many of your most profitable days happen when weird stuff happens, and there are events and the world kind of goes crazy and nobody knows what's going on. That's when it's very hard to provide liquidity, and so you get paid more for doing it, and there's often a lot of volume on days like that. Doing that well often involves human judgment, thinking about how today is different from all of the other days. To the degree that we can, we want to build models that work well through phase transitions, but we also think humans work better than models do through phase transitions, and sometimes you need this kind of meta-judgment to decide what to do. And so even for the systems that are largely automated, there are decisions to be made by the people who are watching, and we always have people who are watching, right? I think an important part of trading is paying attention to and thinking about what's happening during the trading day, even if the individual transactions are going by far too fast for a human to weigh in on a transaction-by-transaction basis.

Dan, what have been the more notable changes over the last 20 years that you've been working in buildings like these?

Yeah, people actually care about data centers and want to talk about them. You know, I've been working on cooling for a while, and now all of a sudden people talk about it and think it's interesting. So that's fun and exciting. And for folks on my team, I think they feel that way as well. There are people who have been in the data center industry for 20 years who kind of still want to do it the way they used to, and I think that's falling by the wayside now. You're finding ways where people are challenging previous thoughts. Hey, my entire data center is backed up by generators. But generators are some of the longest lead time items you can buy. So maybe we take those away and only put them in for a core part of the system that needs that resiliency. That gets our GPUs on six months faster. Let's do it. So those are things where, you know, maybe it's not the best engineering decision, but it's truly the best business decision. And I think stuff like that has been coming up more and more often.

It feels like every year people change what is bottlenecking scaling AI compute. Right now, as you're doing more negotiations and trying to acquire more compute, what is the current bottleneck, and what do you expect it to be?

Putting aside compute and memory and all that fun stuff: generators, transformers, some of the cooling equipment that's used now for liquid cooling is in a lot of demand. And it changes rapidly. What I tell you today is definitely going to be different two weeks from now. We work very closely with internal teams on the procurement side to stock up on some of this stuff. Stuff that we know is fungible across all our data centers, we will warehouse and have ready to go. There are components like generators where you're not going to put a giant generator in a warehouse, or, for instance, if you're doing something behind the meter like a turbine, you're going to have to think about those markets a little bit more: where you're getting them, where you're staging them. You can't just leave them off to the side. So I think the components definitely change. Those are some of the big ones.

And as we get to more and more density, I think one hope is that the buildings get a little bit smaller, and we're able to build the buildings faster, get all that compute in a nice tight bundle, and then all the infrastructure around it has got to be maybe pre-built and delivered to site, right? Modular data centers or modular infrastructure is becoming more and more of a thing, where these components, especially the long-lead components, are being designed and built offsite and shipped to site. So almost as close to plug-and-play as you can get.

Well, one of the points you made earlier is that as the racks themselves get more dense, more and more of the data center is the infra around the actual racks, which is actually kind of similar to a chip on a package, right? The compute is a very small part of the total area of the package.

Yeah, it's interesting. I mean, it doesn't solve any problems per se. Maybe it creates others. Sure. You get to a one-megawatt rack, right? People are like, what does that even mean? One megawatt in a rack. And, you know, the cooling pipe that you're bringing there is just going to get larger, and the amount of power, whether it's the AC power that we're using now or 800-volt DC, where it's going in the future. You still have to bring all those components to a spot.

And the thing that's interesting from our point of view is, you know, we could design these engineering things, but at the end of the day, whether it's Nvidia or an ASIC or whoever, they have to sell a component that can work in a data center, and they're thinking very hard about what they sell, because you need people to use it, right? If you build a one-megawatt rack but there's no way to power and cool it, it's kind of useless. So we're working very closely with almost everyone in that space to think about what components you need to be able to support these next generations, because the lead times you're talking about are over a year sometimes, and you're deciding on the infrastructure before you're placing an order for the chips. So, for instance, TPUs use lower-temperature water and they're half as dense as, you know, an NVL72 GB300, right? So that requires a different strategy, and you want to make sure you can handle those in the future.

One of the things that allows hyperscalers to commit to large amounts of compute is that they have some reserve use for excess compute that they're not using for training or inference of LLMs at a particular time. For example, Meta, if they're not using some of the GPUs they bought, can just say, we'll make our Insta ad-serving model slightly better for today. What is the equivalent sort of reserve use of compute for Jane Street that's just a lower bound on how much that's worth for you?

Part of what's going on is that in many ways we're just very compute constrained. There's lots of innovation and experimentation and new ideas that people have that is bounded by the amount of compute that we have. We try to do a moderately rigorous job of thinking about the value of the different runs that we can do, and the value of the runs that we're turning away is really quite high, right? So we're doing what we think are the most valuable things, but if it turns out we have more compute than we need for those, there's just a ton of other research and experimentation that we can do in that space. So we're nowhere near being like, oh, too much compute. We sort of have the opposite problem. I think there's also really low-hanging fruit in that direction. It's valuable to retrain the models more often. There's some decay in the quality of models over time, and being able to rerun them has immediate and clear value to the firm. There's also some amount of bulk inference tasks that we can do that can fill in the gaps in the systems where there's nothing else to schedule.
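
The idea of low-priority work filling gaps in the schedule can be sketched as a greedy backfill. This is only an illustrative toy, not a description of any real scheduler: the job names, priorities, GPU counts, and the `schedule` function are all hypothetical.

```python
import heapq


def schedule(jobs, free_gpus):
    """Greedy placement: highest priority first, smaller jobs backfill leftovers.

    jobs is a list of (name, priority, gpus_needed); higher priority wins.
    Returns the names placed and the GPUs still idle.
    """
    # heapq is a min-heap, so negate priority to pop the highest first.
    heap = [(-priority, name, gpus) for name, priority, gpus in jobs]
    heapq.heapify(heap)
    placed = []
    while heap:
        _, name, gpus = heapq.heappop(heap)
        if gpus <= free_gpus:
            placed.append(name)
            free_gpus -= gpus
    return placed, free_gpus


# Hypothetical queue: a new training run, a periodic retrain, bulk inference.
jobs = [
    ("new-experiment", 10, 6),
    ("periodic-retrain", 5, 4),
    ("bulk-inference", 1, 2),
]
placed, idle = schedule(jobs, free_gpus=8)
```

Here the retrain does not fit after the experiment takes six of eight GPUs, so the bulk inference job takes the remaining two instead of leaving them idle.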

So we don't quite have the thing that looks like the analog of the Instagram ad-serving thing, but there is just a ton of other kind of dark space of things that we're not doing but would if we had more compute. So we're pretty unconcerned about getting value out of these. The thing there is, there is a bunch of embedded bets. We are investing a lot of money in this stuff, and you could imagine that things won't get better at the rate that we are thinking they will, in terms of the value of the individual models and trades that we're doing. And it's a competitive environment. Maybe other people will outcompete us. I think one part of remaining good is always being nervous about other ways that competitors can figure out doing similar things to what you're doing and reduce the value of that. So there are ways that it might not work out, but certainly with anything like the current mix of compute jobs that we have, we're just very far from having this problem.

It's interesting too. This doesn't exactly answer it, but you could disconnect powering the data center from the chips and say, okay, I might need to use this compute later, so let me commit to the data center and the power now but delay the decision on the chips, which are very expensive, right? And just be slightly long power and data center for that point in time where you might need that compute. And then we'll build in situations where, hey, maybe we can offload some of that capacity to somebody else. It's much easier for us to offload power and data center capacity than it is the chips themselves, for obvious reasons. But you can really bifurcate those two.

This also changes the considerations around hiring. I mean, you already have like the highest bar for hiring, but it just increases even more: if you hire one more person, that is one person who will need compute to do their experiments, and that compute is going to be traded off against somebody else who's excellent on your team who could be doing experiments themselves.

I hear what you're saying, but we don't think, oh, it would be weird to hire more researchers because then we'd have to give them more compute. It's more like, the research is incredibly valuable. The researchers are incredibly valuable. This is a good argument for buying more compute. And so we're very anxious to grow the amount of compute. These days, we are in something like the range of tens of thousands of GPUs, and we will in not too long be in the range of hundreds of thousands of GPUs. And we think it's well justified by the business. It's not like we're worried about, oh, can we justify it based on the P&Ls of the trading strategy. It's like, no, no, no, these are clearly good investments. So it doesn't feel like it's slowing us down on the hiring front. In some ways, the biggest impediment to growth is that it takes time to really train people and absorb them into the culture and build them up and build the place up. We want Jane Street to continue to be a great place to work. I just don't think of the hardware thing as at all being the thing that slows us down. And I think the real limiting factors are finding great people and having the mentorship capacity for them.

I guess this might be a good opportunity for you guys to mention what kinds of roles you're currently hiring for. Oh

man, why don't you start in the in the engineering space?

Yeah, I I'll start. I mean, I think so, we're generally just looking for really smart people, people that that that are interested in in in doing this stuff and and that's, you know, mechanical engineers, electrical engineers, project

managers, architects, people that help design and build some of these spaces.

And, you know, our our our remit uh in my team is is really to to find the spaces, to design them, to construct them, and then to operate them, right?

So, it's full life cycle. So in each one of those you kind of need people you know lots of engineers, lots of what we call physical engineering which is a madeup term that that we came up with but uh you know mechanical engineers and

structural engineers maybe electrical engineers those types of folks and and machine learning and trading in general is really like a whole team sport and so we want to hire people from lots of different backgrounds and with

lots of different capabilities. Uh we're

certainly like very excited to hire people with kind of you know specific like machine learning backgrounds of like you know designing architectures and building models in various cases. We

both I mentioned that we have like a bunch of like custom architectures and stuff for like our own bespoke kind of kinds of data that we need like the data kind of characteristic of the markets.

Um we also build LLMs and people who experience in all sorts of part of the life cycle of LLM training. we're

interested in hiring and have been growing that area. Um, you know, we we hire lots of like people with like

generally good scientific and technical backgrounds from like math and CS and physics and engineering and stuff to be traders and like there's a kind of mix of skills there. Uh, but that's like an

area we continue to be very excited to hire in. On the software engineering

side, there's like a general software engineering role which we're always eager to get great people for uh that, you know, I think just rewards a little bit, you know, it feels a little silly

to say, but just like, you know, as Dan was saying, smart, curious people with really good CS backgrounds, uh, you know, fit into that generalist role and there's lots of different kinds of things they can end up doing. There's

also a bunch of interesting specialized areas where we really are excited. Like

here's a thing that's kind of new. With

all of this scale, we are much more interested in fleetwide optimization than we were in the past. Like

our old view about performance optimization was that it was much more about, you know, making the things that were most speed-critical as fast as possible. And more generally, yeah,

compute's kind of cheap and like people are expensive and we're not spending that much time optimizing our general compute. But like, man, we're doing a

lot of general compute now. You know,

you start investing billions of dollars in this stuff and it just becomes more valuable there. And there are people who

have experience in doing this at some of the hyperscalers and we'd love to hire more people with that kind of background to think about the optimization problems that we're hitting, which are related and in important ways different, but

like you know so it's like both a related challenge and a new one. Um

we do a lot of fun hardware engineering stuff. We're like

working on our own ASICs. People with that kind of experience are super exciting.

Um, one thing that we mentioned a little earlier at lunch was like we're starting to think about building out a formal methods team using basically mathematical proof to make software

engineering more effective.

That's like a new very speculative area and we're like very excited to find people there. We feel like that's a kind

of a whole community of people who in the past I feel like I've always had to disappoint by like, yeah, we're not interested in formal methods, but like I think the whole AI revolution makes

formal methods suddenly a much more interesting field, and so it's a place we're excited to invest in. So I don't know. And like, I don't know, project managers, people who do front-end dev. Actually, like, for most of Jane Street's

experience we pretended like this whole web thing had never happened and like almost all of our tools were just like in the terminal, but you know, it turns out it's useful to be able to like draw a straight line and, you know, have a tooltip and things like that. So, we've

actually invested a lot in building really good tools for doing front-end development and building tools for people and having great front-end engineers who are both really good software engineers and have a good sense

of what it means to make an application that's good for a person is really important. I'd say, as a general

meta point about all of this, I think that in all of the legitimate and real excitement around AI tooling, I think people sometimes kind of miss out on the importance of

the human element of all of this. I

think that we really care a ton about building tools that are good for people, and that includes the AI tooling itself, right? I think

trying to drive tooling in a way that increases human understanding and agency and efficiency is like that's the core thing. We are limited more than anything

else by the amazing people who work there and like being able to find more of the right people and grow the organization so that we can get more done. Uh and so we have a very kind of

human-oriented way that we think about the systems that we build.

Um, it's been really cool to have you guys um, make these fun puzzles and challenges. I think in general you do

that, but also you guys have made a couple for the listeners of the podcast in particular.

And I think people who are listening to this might find it interesting to check those out. um uh including one by the

way which not only was nobody who submitted to the competition able to solve but Jane Street itself cannot

solve, which involves finding backdoors to various LLMs that have a trigger phrase baked into them. Anyways,

I mention this because, to the extent people are interested in learning more, I think these are the kinds of fun puzzles that might give some indication of what work is like and why this seems like a fun place.

Yeah, puzzles are a deeply embedded part of the culture. So, it's kind of great to use them as a way to reach out to people as well.

Yeah. Yeah. Um, I guess the plug here in this case is janestreet.com/dwarkesh.

Uh, so that people can learn more about the open roles and about all these puzzles. Yep. Awesome. Cool. Thanks for

doing this, guys. Thank you very much.

Our pleasure.
