Why Anthropic Thinks AI Should Have Its Own Computer — Felix Rieseberg of Claude Cowork/Code
By Latent Space
Summary
Topics Covered
- Execution Cheapens Before Ideas
- VMs Unlock Safe Claude Autonomy
- Local Computers Trump Hyper-Personalized Cloud AI
- Skills Evolve from Markdown Instructions
- AI Accelerates Junior Engineers via Simulations
Full Transcript
This is maybe a somewhat contrarian view to a lot of people in AI. I
actually don't think that the future is going to be hyper-personalized software down to the point where everyone is running their own version. I actually think it's going to be quite hard for one of us to have our own internal chat tool. Silicon Valley
overall is undervaluing the local computer. And my default argument for that is always, how come you are using MacBooks and not like an iPad or a Chromebook? And now
when I think about Claude, it's this entity that is supposed to be very useful to you, that gets tremendously useful to you. I think that entity needs to have access to all the same tools you have access to. Otherwise, it's going to be hamstrung in all these complex ways.
Hey, everyone. Welcome to the Latent Space Podcast. Our first one in the new studio at Kernel. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space. Yeah, so nice to be here. Thanks to TJ, Alessio, Alan helping to set everything up. It looks beautiful. We haven't even had the logo outside. Yeah.
It's like really nice. When you walk in here as a guest, you're like, ah, this is a serious production. You like feel it immediately. Yeah. Felix, you've been, you're currently product manager of Cowork or? I'm really lead. Lead, yeah.
The identities are kind of vague. Member of technical staff. I know. Member of technical staff is like the official title we'll carry around forever. Yeah. I basically kind of wanted, like, we've been kind of obsessed. I've been using it a lot, even for managing Latent Space. Like, Cowork helps me upload videos and like title things and like edits and everything. It's like really amazing. Cool. He's said multiple times Cowork is AGI in the group chat. Yeah. Yeah. So we have a second channel for Latent Space TV.
And basically this is our Discord meetup. And we have like, Claude Cowork, it might be AGI. I don't know if we have uploaded it yet, but one of the sessions was like a Claude Cowork thing. I would love to see it. Like I'm
so curious, like one of the most fun parts of my job is that I constantly see the weird things people use cowork for, because it's obviously like very hard for us to actually design for specific use cases. We do. But like every single person who's like most amazed is usually amazed about a thing that I didn't even expect Cowork would be good at. We have a new designer and it's one of
the first small tasks. I was like, hey, we need like a new emoji for Cowork for our internal Slack. It's like a pretty small thing. I was like, can you please do it? And he drew an SVG and just gave it to Cowork.
I was like, can you animate this emoji? And now it has like this beautiful loopy animation. And I mean, I think obviously this goes down to like, it turns out you can do more things with code than you expected, but it's like that kind of stuff that is really fun to me. So long story short, I would love to see like the kind of things you're doing. I'll pull it up. I'll
put it up. Yeah. Yeah. But before we get into it, I think we always want to start with like a top level. What is Claude Cowork, for people who haven't heard of it, haven't tried it out? Okay. Real quick, Claude Cowork is a user-friendly version of Claude Code. So the way it basically works is we have Claude Code, for us a fairly impressive agent harness, and over December we noticed
more and more people are using it, even though they're not technical, they're not at home in the terminal, or they are at home in the terminal, but they started using Claude Code for non-coding workloads, right? Like managing expenses or like filling out receipts or organizing a knowledge base. Like there was a big Obsidian moment that a lot of people liked. And we wanted to capitalize on that, but also bring this capability
to people who are not terminal native and who might not know how to like brew install something. So Cowork is Claude Code running in a virtual machine with a little bit of padding, a little bit more guardrails, making it a little safer, a little bit more convenient for people who don't want to first open up the terminal when they go to work. It's interesting that it's kind of pitched that
way as a more user-friendly thing, because to me, I treat it as like, well, I'm familiar with Claude Code. Like we did a Claude Code episode a year ago. But this one is like even more of a power user tool because it kind of integrates much better with like Claude in Chrome and all the other tooling. But maybe that's like a perception thing, right? No, honestly, I don't think you're wrong. This is like a thing I've been thinking a lot about for like the last two weeks. But when they say user-friendly, it's like, oh, it's the dumbed-down version. But no, actually, this is the superset. Yeah, like I think a similar thing happened to me about 10 years ago, like maybe 12 years ago when I was at Microsoft and we started working on Electron and like browser-based technologies and cross-platform stuff.
And one of the first use cases was Visual Studio Code, which used to be a website. And the initial narrative was, well, Visual Studio Code is like a more user-friendly version of Visual Studio. But in a similar vein, I think there were some voices saying, oh, this is not for serious developers. Like we're not going to use
this, right, for like anything. And I think in the end what happened is people have different stories about why Visual Studio Code became such a big thing. But my
personal belief is that the hackability and the extensibility has like played a pretty big role, right? You can hook in Visual Studio Code to like almost any workload. It's
so easy to hack on, so easy to build extensions for it. And I think Cowork might be hitting a similar thing where it's very easy to extend and it's very easy to bring into your workflows. So the convenience I think is a bit of a... It's obviously the thing we strive for as developers, but I think the way people find value in it then is by probably mapping it onto whatever they
actually have to do in their job. So end of last year, you see the spike of like non-technical usage in Claude Code. What's the design process to say we should make Claude Cowork? Because I mean, you built it in only 10 days.
I'm sure there was some discussion before on what does easier to use mean. You
know, like making like a desktop GUI is obviously one way to do it, but like there's a lot of nuance in the product. Like maybe talk people through what was like the trigger of like, we should build a separate thing. We should not build like a different Claude Code thing. And then maybe some of the more interesting design decisions that maybe you didn't take. Yeah. I think at Anthropic, we've been thinking
about ways to move people who are comfortable with using Claude to answer questions and bring more of the power of like this thing to now like execute tasks for you, right? It can like solve problems for you, can like build things for you. How do we bring that capability to people who are currently mostly comfortable with like a question-answer paradigm within the chat? And we've had a lot
of prototypes around that, this going back as far as like easily a year and a half, like we had a lot of people working on that. And internally, Anthropic is a very prototype-first, demo-first culture. We have a lot of like internal prototypes that don't reach the public. And what Cowork actually became is like we sort of picked the right pieces out of the many prototypes that we had. And that's
maybe also like, I think an important qualifier whenever people mention this like 10-day number. I do think it's important to me to mention that we didn't start from scratch. There was like a lot of stuff already happening, right? And I think it's important for people to remember that when you build a website, you use React, you use like a bunch of other things. And this is like a similar scenario with like a lot of pieces we already had. And in terms of decision paths, I think we live in an interesting new world where execution is actually quite cheap. So
maybe what you would do... That's so crazy to hear. It's wild, right? Usually, ideas
are cheap. Execution is the hard part. No, we used to live in this world maybe where you would take a product manager and the product manager would go to a number of potential customers and in this very low-bandwidth way would try to tease out what are the problems they're having, what are they willing to buy, and then maybe what can you build to address that need. And then you go back and
you draft a spec and you think about it and then you make a design and you execute it. We internally at Anthropic are now pretty much closer to the point where like, don't even write a memo, just like, let's build all the candidates very quickly. Let's just build all of them and then pick the best ones. I
think the decision that is most impactful both for the product as well as for the users right now is like the way we put value on your local computer. I
think that's a big decision point. A lot of people have thought about should this thing, whatever it is, should it ultimately run on your computer or should it run in the cloud? Because there are big trade-offs, right? I guess like if we solved auth, it would be easy to do in the cloud. But I think like the fact that I can just download any file from anywhere and then put it in Cowork
there, it's like a big unlock. I mean, it's interesting you mentioned reusing certain pieces.
I think this is something I've been thinking about even with Claude Code, right? The
price of like writing code is going to zero, blah, blah, blah. But it actually seems like the value of having some sort of platform substrate is like increasing because as you build these new things, you can kind of plug them together. Yeah. So
I almost feel like when people are saying, oh, the value of a lot of software is going to zero because you can recreate it. To me, it's almost like the opposite. It's like having an existing platform to build on top of is like even more valuable because you can kind of build things on. Yeah. You have obviously MCPs, you have skills, you have like obviously the models, which is a big part.
All these things kind of come together. Do you feel like that's a valid way to think about it, where people should invest even more in kind of like these primitives to rebuild on? Or are you like recreating a lot of it each time because like things change and it's easier to rewrite than reuse? You know, I think you're right. I think you're right that the holistic platform is really useful. And this
is maybe a whole like somewhat contrarian view to a lot of people in AI.
I actually don't think that the future is going to be hyper-personalized software down to the point where everyone is running their own version. Like, I actually think it's going to be quite hard for one of us to have our own internal chat tool.
And like, if I want to talk to you, like, how is that going to work? Right. In the context of Cowork and how we build it, I think it's a bit of a combination. Like, the execution that gets cheap is not necessarily rebuilding all the primitives. I think a priori, there's also not a lot of value in it. So for instance, my team did not think about rebuilding Claude Code. We
like very much started with the core thesis of this should be Claude Code. And then we'll like build things on top of it. The part of the execution that gets a little cheaper is like, how do you take all of these Lego pieces and put them together in a way that makes sense for users? It
is like actually valuable. You have so many different approaches now in terms of what kind of, what kind of things do you actually elevate to a primitive? Do you
strongly believe that all your products should be built by just combining primitives with about all size available? Do you keep some things in total? And I think that's still evolving, but I think what's probably going to go away is like, I'm not sure if it's going to fully go away, but I'm going to say, I think for me personally, I will probably no longer try to come up with a really good
product without testing it with people. This is not a new concept, but wherever you used to have to make costly decisions around, do we pick technology A or technology B, or do we like build it this way, build it the other way. I
really strongly believe now you just build all of them and try them out with a small focus group and then whatever is better is what you go with.
Right. And that is probably quite different even from how we maybe worked a year ago. Right. Like I think this happened very recently. Yeah. I started
building something on Electron, since you're here. Coincidence. But then Electron and like SQLite, there's like some issues between development and like building anyway. And
I was like, let's just rebuild the whole thing in Swift, and just recreated the whole thing in Swift. And it's like, it's done. You know, it didn't take any effort. I don't even know Swift. Yeah, exactly. I was like, I'm not reviewing it anyway, whatever, you can write it in whatever language you pick. But the important stuff that I did was not write the Electron bindings. It was like the logic of what happens in the app, you know, and then the model is like, yeah, I can just recreate the same thing in Swift. Yeah, I think you still want, especially for
people who are doing like high performance software, like very complex software, you still want like some view of the architecture, but you can use Markdown for that. Right. Yeah.
You don't actually have to read the code. Again, I'm still like on a sort of like a definitional thing. Can we build a good mental model of Claude Cowork?
This is what I have, right? Like you said, it's like fundamentally Claude Code, we don't want to touch it. There's the Claude app, there's Claude in Chrome. I think
you guys do something different in planning, but I've been talking with Thariq, who is on the Claude Code team, and he's like, no, we just exposed planning. Maybe you can clarify, what are the major pieces that people should be aware go into Cowork? Like, okay, I think you basically have them. So you can take planning more or less out. I think there's a few things that are really valuable in Cowork. The virtual machine is probably the most powerful thing. So we currently run like a lightweight VM and we put Claude Code into the VM. And we do that for a number of reasons. Safety and security is a big one. But even if you ignore for a second safety and security and you're just like, okay, YOLO, I want this thing to do whatever, it is quite powerful to give Claude its own computer. That is like generally a good idea. And
in terms of architecture and UX and everything else that we've been working on at Anthropic, it often is quite useful for you to like anthropomorphize aggressively and just be like, this is a person. What would you do if you had a person? And the
analogy I gave my dad this morning, who is still quite insistent on using chat, even for coding things, is: if you were a developer and your employer told you that you don't need a computer, they're just going to send you emails with the code and you send emails with code back. That maybe worked for PetrMiles in the back, but that is not very effective. So what we can do with the VM
is, because it's a Linux system, Claude Code has more or less free rein to install whatever it needs to install. It can install Python, it can install Node.js. We do
have strict network ingress and egress controls. So you can still, as a user in like plain human language, make it clear to the entire system what you're okay with and what you're not okay with. But at no point do we have to ask a real person, like a person who might be in marketing or a lawyer.
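To make the idea of egress controls around a sandboxed agent a bit more concrete, here is a minimal sketch of an allowlist check in Python. This is only an illustration of the general technique, not how Cowork implements it: the function names, the allowlist format, and the `*.` wildcard rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def host_allowed(host: str, allowlist: list[str]) -> bool:
    """Return True if host matches an allowlist entry.

    Hypothetical rule set: an entry is either an exact hostname
    ("pypi.org") or a subdomain wildcard ("*.npmjs.org").
    """
    host = host.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("*."):
            # "*.npmjs.org" matches "registry.npmjs.org" but not "npmjs.org"
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


def egress_allowed(url: str, allowlist: list[str]) -> bool:
    """Decide whether an outbound request to url may leave the sandbox."""
    host = urlsplit(url).hostname  # None when the URL has no network location
    return host is not None and host_allowed(host, allowlist)
```

The point of the sketch is the shape of the decision: the user states once what is acceptable, and every outbound request is checked against that statement instead of prompting a human each time.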
I don't have to go to the lawyer and be like, are you okay with me installing Homebrew? Because the implications of the question and the answer are complex and nuanced and not easy to reason about. And this gives us a lot of distraction that makes Claude very powerful. Now then around it, we do probably have a number of things, that also keeps growing almost every single week, that you're probably
noticing that make Cowork maybe better for certain tasks than just Claude Code on its own. But most of those actually live in the system prompt. They're about like, what can we infer about the work that you do? What can we introduce into the system prompt to make that more effective? There's, of course, very tight integration with Claude in Chrome. We're noticing that a lot of people, especially as the models get better, a lot of people throw up their hands when it comes to MCP connectors and say, I'm not going to go through like 25 MCP connectors, click auth everywhere, and then like half of them don't let me do things anyway. So Claude in Chrome is quite powerful because we can just talk to the Claude in Chrome subagent
and that will just do things for you. Yeah. So one example, right? In MCP,
I honestly, I think that the state of MCP is kind of, like, really hard to integrate. I needed to add Figma MCP to the coding agent that I use.
But I didn't want to read the docs, so I just had Claude do it.
And it's great at reading docs. And in the same way, I had to set up like a Google Cloud account for some project I was working on and get some API key somewhere. And Google Cloud is famously super hard to navigate. So I
just didn't want to deal with any of it. I just used Claude Code. Within
the first week of developing on Cowork, this happened very, very quickly. I caught myself like starting to use Cowork for coding tasks, which is not ostensibly what we built it for, right? We don't need to. But I found myself on our internal tool that we have to collect crashes and just like debugging information. And I
found myself sort of like picking out the ones that I think we can easily fix versus the ones that might be like kernel corruption or something else on the operating system. And I found myself sort of picking these out and then just telling Claude, go fix this bug. I was like, what am I doing here? Go one
level up, tell Cowork, I want you to go to all these crash tools.
I want you to find all the bugs that you think are fixable and not like an operating system crash. And then I want you to tell another Claude to like fix all of that. And that's sort of... You tell another Claude? Yeah. So
it can spin up another instance or? Currently what I do is, and this is a bit of a hack, but I tell it to use Claude Code Remote to call itself. Yeah. That's interesting. So you basically take... If you imagine like a dashboard with like 20 bugs. This is remote control or Claude Code Remote? Sorry, I just wanted to confirm what. The way I'm using it is I have Cowork running and I'm telling Cowork, here's where I normally go every morning to find the latest bugs. Go read the entire bug list, separate out which ones are fixable, which ones are not fixable. And then for the fixable ones, this is almost a loop, for each bug, write a markdown file with a prompt. And then
for each markdown file that is a prompt, start off a Claude session. So natively,
Claude Code has this concept of subagents, and this is basically a subagent, but you're not using the subagent functionality. I'm not using the subagent functionality, and the reason I'm not is because I'm firing that off as a Claude Code Remote task. That's kind
of nice because then it can just fire it off. I can go to my next meeting and in Claude Code Remote, now the work's happening. Yeah, you see, like, you're already starting to use the cloud over your local machine. And I think this is one of those things where, like, well, shouldn't everything just be cloud first, right?
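The fan-out Felix describes a moment earlier (read the bug list, keep the fixable ones, write one markdown prompt per bug, start one session per prompt) can be sketched roughly as below. This is an illustrative Python sketch under stated assumptions, not Anthropic's tooling: `CrashReport`, its `os_level` flag, the prompt wording, and the `start_session` callback are all hypothetical stand-ins. In particular, how a remote session is actually launched is deliberately left to the caller.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class CrashReport:
    bug_id: str
    summary: str
    os_level: bool  # hypothetical flag: kernel/OS crash, not fixable in the app


def is_fixable(report: CrashReport) -> bool:
    """Triage step: skip crashes that originate in the operating system."""
    return not report.os_level


def write_prompt_files(reports: list[CrashReport], out_dir: Path) -> list[Path]:
    """Write one markdown prompt file per fixable bug and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for report in reports:
        if not is_fixable(report):
            continue
        path = out_dir / f"{report.bug_id}.md"
        path.write_text(
            f"# Fix crash {report.bug_id}\n\n"
            f"{report.summary}\n\n"
            "Find the root cause and propose a fix.\n",
            encoding="utf-8",
        )
        paths.append(path)
    return paths


def dispatch(prompt_paths: list[Path], start_session: Callable[[str], str]) -> list[str]:
    """Start one session per prompt file.

    start_session is supplied by the caller (an assumption of this sketch);
    it receives the prompt text and returns some handle for the session.
    """
    return [start_session(p.read_text(encoding="utf-8")) for p in prompt_paths]
```

Keeping the prompts as files on disk mirrors the workflow in the conversation: each prompt can be inspected or edited before anything is fired off, and the dispatch step stays a plain loop.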
This is such a good group. I'm like solely about this. I have so many thoughts about that. Okay. So I generally believe that Silicon Valley overall is undervaluing the local computer. And my default argument for that is always, how come we're all using MacBooks and not like an iPad or a Chromebook? There's like still value in having a local machine. And now when I think about Claude, it's this entity that
is supposed to be very useful to you, that gets tremendously useful to you. I think
that entity needs to have access to all the same tools you have access to.
Otherwise it's going to be hamstrung in all these complex ways. And there's sort of two approaches we could take. We could say, okay, we're going to like one by one chip away at everything that is at your computer and move it into the cloud. That's one way to do it. And I think other products have taken that path. I personally, this is a very personal opinion, but I personally, for the amount of tools that I use, just don't have the patience to give another tool like permissions to every single thing and keep those permissions up to date. The second thing that I'm still grappling with, and I don't have a good answer for anyone just yet, but the second thing I'm still grappling with is what does it look like
for someone to slurp up your entire work and put that in the cloud? Like
if I just, as an example, like if you would click a button and it would just clone your entire computer into the cloud, is that something that you would want?
I'm not totally convinced yet that everyone will. And that is sort of like upstream of all the technical issues we're going to have, because like in general, I think the world is not ready for this kind of stuff. Like I'll give you one quick example that would probably be very easy for us. So as a desktop app, we, in theory, with your permission, can do a lot of things on your computer,
including reading your Chrome cookies, if we really want to do it, right? We could
take your Chrome cookies, you wouldn't have to decrypt them for us, but we could put those on the cloud if we really felt like it. Pretty easy solution that would be super cool. We could just be like, oh, we can do all your tasks in the cloud now. A lot of websites, banks included, if they see the same authentication from like two different locations, will just lock down your account. And now
you have to go to the branch and be like, okay, I'm here with my passport. You actually know that. Wow. You know, as tired as we all are of the term agent for the agentic future, I think there's a lot of stuff that sort of slowly needs to catch up. And until that's the case, the way I, as someone who's working on Claude, can make Claude most effective is to like put it where you're working. Anything else with our mental model? So basically, part of me also
just wants, the more I understand how it works, the more I can use it to its full potential, right? Yeah. And so what I'm hearing from you is you told me to delete the planning thing. You're not doing anything special that's only exclusive to Cowork. We have some tricks, but they sort of like change week over week.
We eval Cowork maybe against different use cases than you would eval Claude Code. Right.
How do you think about it this way? Okay. So like Claude Code is like quite optimized for coding tasks and we mostly evaluate whether or not we're getting better or worse depending on how good it is at like a typical SWE job. And
Claude Cowork on the other hand, we evaluate more against typical knowledge work, the kind of stuff you would find in finance or in like maybe like a legal office.
My personal use case is always like managing my things, like managing my personal mortgage or something like that. Right. Wealth planning for me and my family. Those are the kinds of use cases we eval Claude Cowork on. And what you might be picking up on is like the subtle changes we make to the system prompt, what we put in the system prompt, how we steer Claude with the tools we give it. So
like either it'd be better in one or the other direction, whether there's a trade-off, trade-offs exist a lot. Claude Code will be better for code and Claude Cowork will be better for non-coding tasks. Will those gaps still exist in the next few generations of models?
It's like a little unclear to me though. Yeah. Because right now these like hyper-optimizations we make, I'm not sure for how long they're still really relevant. I think
what I was referring to was also it just qualitatively felt different, when it probably is just the prompting and I'm reading too much into it. But like the fact that it comes out with like a nine-step plan, I can edit the plan
and give feedback and see it execute the plan. Yeah. It felt more long-range than in Claude Code, but maybe that already existed in Claude Code and then you just built a nice UI for it. It's kind of both. Like if the Claude Code people who build the planning functionalities were sitting here, they would say, yes, we have all of those things in Claude Code, and they do. I think people tend to
give Cowork tasks that are maybe of a longer time horizon. It's so long. Yeah.
That's like one thing, right? You're just like, the chunk of work tends to be maybe a little bigger. And then the second thing is that because the work, when it gets longer, it gets a little bit more ambiguous, we do tell Cowork to make heavy use of the planning tool or to make heavy use of the ask user question tool, right? We do want it to come up with like different scenarios
of, okay, tease out what the user actually wants. Don't go off to work for like four hours and then come back with the wrong thing. And you're probably picking up on that. I wish I could tell you I like built this magical thing and it's like there's some secret sauce. Oh, no, no, no. I mean, it's just clarity is good. Engineers just want to know. They can plan around it. And
I think also for me, I'm realizing I have to switch to my other machine because this is a new machine and doesn't have my session. But yeah, the planning is really important for me to like approve or like to see whether it's right. The ask user question is so beautifully presented. I mean, it's also available in like Cursor and Claude Code. But I think it's so nice to see that it's kind of, for me to understand that it gets me. It gets what I want to do. Yeah, it tries very hard. Just on the topic of evals, when you say eval, I think people are very vague about what it means. Is it
just vibe testing or do you have automated programmatic evals of Claude Cowork? When
we say eval, what we really mean is that we essentially take the entire transcript, including all the tools that are ultimately available to Claude. And we then measure what are the outputs depending on what we tweak. Right. So we do run that a lot. We use that in training. We use that in like, if you sort of separate out post-training from like the scaffolding around that, Cowork sort of exists in the scaffolding space, but obviously we also train on it a little bit. So
when we say eval, we mean given a certain transcript, what do the outputs look like, including the file outputs, as well as like the actual token outputs, like the ones that you see in the chat window. I'm curious how much of the failure modes are the model intelligence versus like the usage of the end tool to put the intelligence in? Like the wealth planning is like a good example, right? It's like
one thing is to come up with a plan. The other thing is like make a nice spreadsheet that kind of runs you through the plan. Like how have you seen that evolve? The thing that I grapple with a lot is that whatever scaffolding you come up with, I think we still have a bit of sort of like model overhang where the model is dramatically more capable than users end up using it
for. And I think part of that is us just not giving the model all the tools to do all the things it's in theory capable of, right? That's like one thing. However, whenever you do build the scaffolding, you're sort of wondering at what point will that scaffolding go away, and like how much you invest in figuring out what the right scaffolding is, it's kind of up to, it's a little bit of a bet, right? And one thing that I as an engineer quite enjoy is that like working at Anthropic and working at a frontier lab, I maybe have a little
bit more insight into what's coming down the chute in terms of like, what's the next model? What is the model capable of? What is it good at? What is it bad at? And I'm increasingly wondering, is the right thing for us to like really invest too much in sort of these like scaffolding corrections where the model might otherwise not misbehave, but just not do the thing that you want? Or is it to just like give it as many capabilities as possible, try to make those safe so that the worst case scenario is like not as bad as it might be otherwise,
and then just simply wait a second for the next model to drop. I'm personally
currently more leaning into the latter. I think we're going to see a lot of applications and companies that do very impressive things with AI that in the short term might seem very effective because they're very specialized to individual use cases. But I think once models get better at generalization and get better at those specific use cases without being super guided on those, I'm not sure how long that's going to stick around.
And you can kind of already see this in skills and MCP servers, right? We've
already seen this slow shift from MCP servers to skills. And maybe a good example is Barry, who made skills. He was initially hacking on something that honestly looked a lot like what Cowork does today. It was sort of thinking about, what if Cowork, but for people who don't want to build code. And he
too did that as a prototype inside the desktop app. One of the first use cases we thought of was, okay, what are coding-like use cases that could really benefit from graphical interfaces and from being a little separated from the actual underlying code. And everyone comes up with the same answer, data analysis. Or it's like,
how many users do we have today? How many? It's always data analysis. And
I think the thing that ultimately led to skills is that we wanted to connect this little prototype to our data warehouse. And the team very quickly discovered that, instead of building a custom tool for the thing to talk to our data warehouse, they could just make a markdown file like, dear Claude, if you want to get data, here's the endpoint, here's what the API looks like, you figure
it out. And then you ended up handing over control. Yeah. Yeah. Also just maybe go one step up in the layer of abstractions, right? Instead of telling the thing, here's a CLI, please call this CLI, or here's an MCP, please call this interface shape. Just like, this is the endpoint. If you want to know something, if
you post here, maybe you can post SQL, it's going to be okay. And
that ended up being so effective that they started trying the same pattern of just giving the model a markdown file that describes whatever it needs to do, and the whole thing eventually became skills. And we're like, we should package this up. This
is a good idea. Yeah. We've had Barry and Mahesh at our conference and they've definitely got a good idea there. Yeah. I wanted to show you how I've been using Claude Cowork. So this is my favorite part. So this is like me.
This is how we run the Discord. We literally, at first, I didn't trust Claude Cowork. This is my very first usage. Okay. Right. So then I was like, okay,
I will just try to manually download from Zoom all my recordings and upload them to YouTube, because this is a very laborious process. I've got to click, click, click.
YouTube isn't super user-friendly. And it just did it. And then I was like, actually, you know, even the download-from-Zoom part I should also put into Claude Cowork, and then I did it, right? Here's a bunch of, and it starts compacting here, and it starts to even be able to do things like look through the individual frames of the video to name the video so I can upload it automatically. And
this replaces my job as a YouTuber. We will forever appreciate your creative. Yes. And so that's great. But then by the way, it compacts and makes
like a new thing, right? So I don't have the initial thing. But then I asked it to make its own skills so that something that's repetitive and one-off and human-guided becomes more automated and I can use the skills independently and reuse them. And
it obviously can write skills. And that goes into context and skills at the bottom here, which is so nice. So I have all these skills that I now sort of do on a weekly basis. I know you've released scheduled co-works, which I haven't done yet, but... Of course, you should try them. I think this is so wonderful and fun for me to see, because I think one thing that is very
fun for me about skills in particular is that they're so easy to make. Like
anyone can make a skill, a text message could be a skill, and they can be so hyper-personalized to you. And this is sort of this abstraction layer, right? Like, I'm just guessing, but you're very good at your job. You've probably given
this thing some guidance about how to do it, right? I just said wrap everything up into a skill, right? And then I was like, actually, sometimes I might need to break things apart because some parts fail or some parts might be needed individually.
So I told it to split one skill into three skills. So it's like a skill splitting thing. And then there's a parent skill that just orchestrates all of them if I want to use that. I think that's really good. And there's one more part, which is the Google Chrome thing that I told you about, where I'm like, OK, you know what's better than uploading to YouTube using Claude Cowork? Looking
at the docs to programmatically upload to YouTube and then putting that in a skill. And I've never done that before. I don't want to deal with Google Cloud.
So Claude Cowork does it for me. That is really cool. So I just, I don't care. I just like, do a thing. It doesn't really matter. That
is really cool. And then you, I assume, paired the skill just with the script that it built. Yeah. And then I just update the skill. Oh, that is beautiful. Yeah. That's wonderful. It's kind of like a skill. Basically, I think the
way that people ease into Claude Cowork is, take a knowledge work task that you would normally be clicking around for and then try to turn that, and then you do the, okay, well, what if you went further? Okay. And then what if you went further? And then you sort of expand the scope of Cowork as you
gain trust with it and also teach it how to replace you. Yeah. It's
a little bit like playing Factorio, but for your own life. Like you say, you start really small, you start automating something really tiny, and once it clicks, you keep adding onto this automation empire and just make your life easier and easier. My favorite
skill has been that every single morning, Cowork starts looking at my calendar and makes sure that there's no conflict, because people tend to schedule a lot of meetings, sometimes last minute, sometimes I miss it. It's often painful. And a lot of products like that have existed. I've written a custom prompt there, I haven't made it a skill. I honestly should. But I've given pretty clear instructions about,
okay, here are some people, if they book over other meetings, I'm probably going to go to their meeting. Like if Dario schedules a meeting. Right. Do not try to reschedule
Dario, right? And I think there's some other rules in there about what kind of meetings I care more about, what kind of meetings I care less about, what is okay to maybe punt, when I want to be working, when I don't want to be working. And it's those really small things that I think kind of click with people. Right when we launched Cowork, I think one of the use cases
that went most viral on Twitter/X was clean up your desktop, which is of
course silly. That's such a silly thing, right? Like you don't need a model to clean up your desktop, not really. Like this? Like clean up my desktop? Yeah, exactly.
Yeah. I need to choose my desktop, right? I guess give it access to my desktop. Yeah. Okay. Okay. This is very scary. We'll do it.
I did it with my downloads folder. It was like, you have so many term sheets and there's like eight copies of your rental lease for your office. I was
like, all right, like, don't yell at me. It's such a small task. And I'm
like, I would never go out there normally otherwise and tell people, I've built a
product, it can organize your folder for you, because it feels small. But I think to your point, like. Here's the ask user questions. Yeah. Beautiful,
right? Delete obvious junk. You probably shouldn't click that. No. If he's not done right.
It's not just reversible. I don't make a blend. Yeah. You know, I have a typical everything is super messy folder. So yes, I think this is super helpful. So
this is a pretty simple task. But I've okay, here it is. Here's the progress.
I don't see this in this. I'm like, this has got to be something different than Claude Code, because I'm like, we do the others. We do a system prompt where we're like, all right, we want you to think about, like, this task. Yeah. And
then I can do like little suggestions for these things. It's beautiful. Look at this.
I can say like, oh, don't do that. Don't do this. It's amazing. I'm so
happy you like it. I mean, the other way around, like we're part of the Claude Code team. If you would like this in Claude Code. Yeah. Damn.
So, yeah, I mean, this is really good. Obviously, I'm kind of raving about
it. I have other things like sign up for PG&E. So if you can do phone calls for me, that'd be great. I do. People have done that. Obviously, you
can't do that natively, but people have done that with various other providers. Yeah.
And then this is like signing up for the Figma MCP. I really am trying to do everything. Data analysis as well. I do think, oh, design to code, very, very good. So here's a Figma file, I'll take it. And then this is where a lot of other tasks are knowledge work, like replace my manual clicking.
But this is, no, I would normally use Claude Code for this, but because I perceive that you have better Chrome integration, I think you can actually do a better job of this. And this one-shotted my conference website. That's pretty cool. At some
point, I would love to hear how you feel about Code in the desktop app, which I never use, which is the same team. Same team. So I
use Claude Code in the terminal, which I perceive to be the default way of Claude Coding. So one thing this has, sorry, I'm just like, I'm not here. I'm not
here. I'm not here. I have all these products. I can talk about other stuff.
I'm not sure if people out there want to hear me advertise my stuff for like an hour. Please do that. This thing is like a built-in browser, which is a thing a lot of products have. It's a built-in browser, and I think giving Claude eyes into what you're actually working on makes it so much more effective.
And that's probably what you've seen in Cowork, because it can see Chrome, it can debug the DOM, it can see things. That does make it more powerful.
Yeah, so I think my mental model is kind of broken because I only use Cowork because I thought it had a browser thing in it. But I understand that the Claude Code app, or the app version of Claude Code, does have a
built-in browser. I've seen this preview thing. I've never used it. In the end, you sort of get like... You basically get the same thing, right? The additional skill that you're describing is a lot better if it can see what it's working on. That's
sort of the summary here. And whether it's using your Chrome or it's just making up its own little browser, it doesn't really make much of a difference, because either way it's going to see what it's working on and that just makes it much better. And then you don't have to run QA for your Claude. Why
doesn't it pick up my existing Claude Code sessions? Because I mean, obviously I've used Claude Code, but... Excellent question. Don't have a good answer other than like we're honest.
This is what the OpenAI team does. Cool. I don't have other, like, I just, I do want to expand people's minds and also maybe show people if they haven't really done it. But I think it's very interesting how I sometimes use this more than I use, I mean, I use Dia, right? Yeah. And I've used all the other agentic browsers, and Anthropic didn't have to build an agentic
browser because you just had Claude Cowork and that's enough. Yeah. I also think maybe integrating with a number of excellent browsers out there is currently on my personal priority list a little higher than trying to rebuild a browser from scratch.
Yeah. You know, never say never, but I think going back to this idea of like, we want to plug this into an entire existing workflow. I think our goal is actually to not replace any of the applications you have on your computer, but instead like work really well with a new workflow. Make the new one. Yeah. Yeah.
It seems that nowadays, especially on the browser, most of the innovation is user
ergonomics. It's not really the underlying browser engine. So I feel like, to call it, it doesn't really matter if it's Dia or Chrome or Atlas, whatever.
Yeah, we want to meet you wherever you are, which is, obviously I would say that, but it's also just genuinely true, because I don't want to shrink my potential user base artificially by saying, okay, I'm going to start building for the people who are willing to switch browsers. Right. That's such a, you know, many lawsuits have been filed over who gets to write the browser. And
a lot of money has changed hands over the question of which browser is default and which search engine is default within the browser. I just want to build for, yeah, I want to build for Swix essentially. I want to build for people who have a number of annoying tasks that they feel like maybe Claude could do for them. Yeah. What do you think
about skills portability? I think there's been one thing, I use another thing called Zo, which is kind of like a cloud computer plus agent, and I have a skill to add visitors to the office. Yeah. So whenever somebody has to come in after hours, they need to check in downstairs. But I want to text the
thing. So it doesn't really work in Cowork, but now that skill is in the Zo harness and it's not in my Cowork thing. And then if I make a change, I've got to sync them. How do you see that going?
Like I see memory as like cloud personal, kind of like, I don't necessarily want my memories to be crossing. Yeah. But I do want my skills to work across the agents that I use. I think with MCPs, people do the same thing. It's like,
oh, MCP Gateway, MCP Registry. I don't really know if that's like a business. So
I'm curious, like if you've had any thoughts in the area. I think for me, this is sort of where I go back to the really basic primitives. For us,
skills are file based instead of like this complicated thing that exists inside a place somewhere that is like super proprietary. I'm really leaning into the idea of like, it's all just files and folders. And that makes it very portable on its own, right?
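To make the "it's all just files and folders" point concrete, here is a minimal Python sketch of a skill as a plain Markdown file inside a folder. The `SKILL.md` file name, the `warehouse-query` folder, the placeholder endpoint URL, and the helpers `write_skill` and `list_skills` are illustrative assumptions, not a description of how Cowork actually loads skills.

```python
import tempfile
from pathlib import Path

# Hypothetical skill text, modeled on the "dear Claude" example from the
# conversation. The endpoint is a placeholder, not a real service.
SKILL_TEXT = """\
# Query the data warehouse

Dear Claude, if you want data, POST a SQL query to the endpoint below
and read the rows that come back.

Endpoint: https://warehouse.example.invalid/query
"""


def write_skill(skills_dir: Path, name: str, text: str) -> Path:
    """Create <skills_dir>/<name>/SKILL.md and return its path."""
    skill_dir = skills_dir / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(text, encoding="utf-8")
    return path


def list_skills(skills_dir: Path) -> dict[str, str]:
    """Map each skill folder name to the title on the first line of its file."""
    skills = {}
    for path in sorted(skills_dir.glob("*/SKILL.md")):
        first_line = path.read_text(encoding="utf-8").splitlines()[0]
        skills[path.parent.name] = first_line.lstrip("# ").strip()
    return skills


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_skill(root, "warehouse-query", SKILL_TEXT)
        print(list_skills(root))
```

Because the skill is only a folder with a text file in it, moving it to another tool in this sketch amounts to copying that folder.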
We do have skills as part of this container format, which is just called plugins.
And plugins are available both for Claude Code and Claude Cowork, in the same format. And
you can install plugins. This works in Cowork today. You can basically say, I'm going to add just a GitHub repo as a skills marketplace or a plugin marketplace. And that's how we're doing portability. I think we have a lot of room left to grow in how we make it easy for people to know that they can write skills, and how we make it easy for them to
just share a skill with you. Because obviously, with all the words I just said,
right, I'm losing most of the knowledge worker base out there. Right. It's hard.
Saying, oh, you can connect a GitHub repo, that's not exactly how most people will end up working in a general knowledge worker space. But I think there's
something there. And another thing that's there that I think has not really been properly explored is the combination of which part of the skill is very portable and which part of the skill is very personal to you. I think that's something we haven't really solved yet in the industry. Do you want to introduce more structure to the skill, or always have public skill, private skill pairs? Yeah,
kind of. I think the easiest way to do this is we'd use string interpolation or something. Insert username here, insert phone number, insert known folder locations, that kind of stuff. That's probably clunky. That's why we haven't built
it. But I do think someone is going to come up with an interesting way to keep everything we like about skills. The portability: it's just a file, it's just Markdown, it's just text, honestly, write like a text file words. The complete lack of structure, which means you don't need any kind of tutorial to write a skill.
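As a rough illustration of the string interpolation idea, the sketch below keeps the portable part of a skill as a template and fills in the personal part from a separate dictionary, using Python's standard `string.Template`. The placeholder names (`username`, `phone_number`, `recordings_dir`) and the helper functions are hypothetical; per the conversation, nothing like this has actually been built into skills.

```python
from string import Template

# Portable part: plain text that could be shared with anyone.
PORTABLE_SKILL = Template(
    "# Weekly recording upload\n"
    "\n"
    "Save recordings under $recordings_dir.\n"
    "When the upload finishes, text $phone_number.\n"
    "Sign any notes as $username.\n"
)


def placeholders(skill: Template) -> list[str]:
    """Return placeholder names in order of first appearance."""
    names = []
    for match in skill.pattern.finditer(skill.template):
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names


def personalize(skill: Template, personal: dict[str, str]) -> str:
    """Fill in personal values; raises KeyError if one is missing."""
    return skill.substitute(personal)


if __name__ == "__main__":
    # Personal part: stays on the user's machine (values are made up).
    me = {
        "username": "sam",
        "phone_number": "555-0100",
        "recordings_dir": "~/Recordings",
    }
    print(placeholders(PORTABLE_SKILL))
    print(personalize(PORTABLE_SKILL, me))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a skill with an unfilled personal value fails loudly instead of reaching the model with a stray `$phone_number` in it.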
Just explain it to Claude the way you would explain it to me, and Claude will probably get it before I work in, right? You're just like, for booking
your flight, I will tell Claude how to book a flight the same way we'll tell him somewhere I just thought about me yesterday. But combine that with a very personal thing. Maybe we'll stick with the booking a flight example. I don't
actually think AI should be booking flights. I think the tools we have. Yes. Yeah.
Finally, somebody says it is the default demo that everyone was making. I'm like, I ain't even against like booking demos. It was not a good showcase. Yeah. I'm like,
I just want to book my flight myself, but I think there's a lot of things that have a personal and a non-personal component. And that's maybe why people reach for flight booking, because some things are very universal. A cheaper flight is usually
better, right? Few people try to book the most expensive flight. And then some things are quite personal, like what times you prefer, which seat you prefer, which airports you prefer. Combining that in a skill format that is actually portable, compatible, easy to understand for people, I think that would be very exciting. We just have yet to figure it out. Yeah, I think the text part, I think everybody by now
has some sort of cloud file thing, either Dropbox, Google Drive, whatever. So it
feels like in a way it should basically symlink my skills into all my
agent harnesses. Yeah. Just keep those in sync. Like we have internally this, like, valuable
tokens repo, which is all the commands and sub-agents. And then I built
a TUI where you can start and be like, you know, install this command and these three sub-agents into this agent and this folder, and it just copy-pastes them. It
doesn't do anything. It literally cp's the file into that. But I feel like there should be something similar where whenever I go into a new thing, it's like, hey, here's the link to exactly the cloud folder, and just bring down these skills into this. Today, it doesn't quite work like that. If I install a new agent, I have to copy paste all the skills and
I don't even know where they are. Yeah. That's the big problem. It's like,
where do I find them? Yeah. So I'm curious, in the future, that almost feels like my personal productivity thing will be my skills. Yeah. It's not really the product that I use, because everybody has access to the same product. But today
that just looks like copy pasting. I think so many things. I really like thinking about agents and LLMs just as another coworker. So many attempts have been made to build documentation companies that are like, oh, we're going to solve all your documentation problems. I myself spent a little bit of time working at Notion, right?
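The symlink approach mentioned above, as opposed to copy-pasting skills into each tool, can be sketched in a few lines of Python: one canonical skills folder, linked into each agent harness so that an edit shows up everywhere. The directory names and the `link_skills` helper are made up for illustration, real harnesses may expect skills in different locations, and creating symlinks on Windows can require extra privileges.

```python
import tempfile
from pathlib import Path


def link_skills(canonical: Path, harness_dirs: list[Path]) -> list[Path]:
    """Point <harness>/skills at the canonical skills folder for each harness."""
    links = []
    for harness in harness_dirs:
        harness.mkdir(parents=True, exist_ok=True)
        link = harness / "skills"
        if link.is_symlink():
            # Replace a stale link rather than failing.
            link.unlink()
        elif link.exists():
            # Never clobber a real folder of hand-copied skills.
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(canonical, target_is_directory=True)
        links.append(link)
    return links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        canonical = root / "my-skills"
        skill_file = canonical / "add-visitor" / "SKILL.md"
        skill_file.parent.mkdir(parents=True)
        skill_file.write_text("# Add a visitor\n", encoding="utf-8")

        # Hypothetical harness folders standing in for two different agents.
        link_skills(canonical, [root / "agent-a", root / "agent-b"])

        # Edit once in the canonical folder; both harnesses see the change.
        skill_file.write_text("# Add a visitor (v2)\n", encoding="utf-8")
        seen = root / "agent-b" / "skills" / "add-visitor" / "SKILL.md"
        print(seen.read_text(encoding="utf-8"))
```

The trade-off against the copy approach is that nothing needs syncing, but every harness then depends on the canonical folder staying where it is.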
I'm deeply familiar with the concept of let's get everyone on the same page. What
you're basically saying here is you want all your agents to be on the same page about your preferences, about the skills, about the way they ought to work and how they ought to execute. I'm not sure what the right thing is going to
be. If it's going to be some company that can say, all right, as an independent body, we're not trying to push into any particular product. It's
our job to be the skill authority and we provide, I don't know, we're going to be the Dropbox of skills and you can just symlink us into all the products you want to use. I'm not sure that's going to be a viable business, but as an idea, it would be cool. Right? Yeah. Yeah. I think
so many things are just going away as businesses. It's like, how am I supposed to do it? I'm not even asking somebody to make a product about it. Like,
I want to personally know. And there's things, like you said, it's like you almost want a skill and then interpolate it between personal and work. So if I'm booking a flight for work, it's different than if I'm booking a flight personally. In some ways, but a lot of the scaffolding is the same, you know? I mean, as an engineer, I will tell you, you know, technical person to technical person, I will
just be like, symlinks. Well, that's what I do with CLAUDE.md and AGENTS.md. It's just
the same, with symlinks. And so it's like, that works, but it feels like, yeah, I don't know, maybe. You could always go one level up. You can always tell Cowork the problem and then Cowork will solve it for you. Doesn't make the same
things. That's one way to do it. That's true. That's true. All right. Everything
is called Cowork. Potentially spicy question for both of you. Which of these industries will
go away? Okay. So what Felix was saying before is interesting. There's basically the short term pressure of, we need to turn these tokens into valuable things, which is, I should build the last mile product that harnesses the model. And then there's the question of, long term, which ones are going to still be valuable? And I
think you're kind of seeing this today with, you know, the coding space in a way, it's kind of like everybody's moving up and up in the stack because you need more than just turning tokens into code. I think search, like enterprise search, is kind of seeing the same thing, with Glean and all these different companies.
It's like, at the end of the day, if Cowork is the one doing all the work, the search itself is such a small part that, I don't know if I'm really going to pay that much money just to do search. It's almost
like everything is a Cowork vertical. So how much can Cowork first-party support and how much can it not? I think for a lot of these things, the planning thing that you were showing. The which one? The planning. Okay, yeah. Like
that's one thing where most of the value that these agents provide is that they're better at planning for specific tasks and have better tools for it. But I
think the models are now moving in that direction and they have the right harnesses and they're on your computer. So for me, it's almost like if the end customer trusts your startup to be the provider of that task result, then I think that works. This is something that this is a short spike that we're working
on. Yeah, I think... Look, I'll tell you this. I don't think I'm the
best person to actually estimate which industry is going to be hit the hardest. But I
do think that at Anthropic, as a group of people, we're deeply worried about the impact that the tools are going to have on the labor market, especially for junior employees. Because I think it's only honest to say that when we talk about automating away a lot of the work that we personally find annoying, that we maybe think is not the best use of our time, in a lot of industries
that kind of work would have been given to a junior, entry-level employee. Right.
And I think it's only right to be really worried about that and worry what that's going to do in particular to people who enter the
job market. I have a solution for that, which is you create simulated jobs for them. Okay. So this is half joke, half true. So if you think about software engineering, when you're a junior engineer, you work one, two,
three years. And in those three years, there's maybe a handful of moments where you really learn something and then a bunch of other days where you're not really progressing. I think now we can use AI and these models to actually shortcut these careers and almost simulate the early years of your work and just make them super dense in these learnings. It's like, hey, we're
working on this feature, which is a distributed system, and you need to learn
this thing. That might take three months at a company. And so you take three
months. Here, we're just simulating the whole thing. It's actually not a real
thing. And in one week, we kind of speed run through the whole thing. And
you kind of learn your lesson from there. And we kind of repeat that. In
like one year, you basically get three years' worth of projects and experience.
Yeah. I think it's harder for things like sales or marketing, because you don't really have a way to get the feedback loop. But
I think a lot of it, it sounds kind of silly. It's like you're making them do a fake job, but it's almost like, you go to college, right? People
pay to learn how to do it. And this might feel similar, where it's like, hey, we have the Jane Street simulator. You want to come work at
Jane Street? We'll just put you in the simulator over three months and you'll come out of it like, you know, I'm ready. So there is an aspect
here. I'm not expert enough to actually know what is going to happen to marketing or legal or finance, right? I don't work in those jobs and I don't think I should talk about them, but I am an engineer and I think I have a pretty good idea about what engineering is like. And I think one thing we're sort of seeing is that as a company and also as the
public, we're deeply worried about entry level, but we're also seeing more senior engineers
accelerated, if you like. They're more productive, they actually increase the value they provide. And
the thing that I'm thinking about a lot is the fact that even before all of this happened, I've always had a lot of respect for the University of Waterloo, and the new grads that have joined my teams coming from the University of Waterloo always felt more ready than new grads who literally spent their entire time at university, regardless of how good, but never actually had to work inside
an environment where you have to ship things that eventually will be used by users.
And I'm German, I initially went to a German university. And I think the information systems programs there tend to be very theoretical, right? I often give people the example of trying to become a doctor, but you first have to do four years of biology. And as a result, when you get a new grad, you sort of have to teach them what it's like to actually build products and
to work in a company and work with other people. And some people will have a different opinion, and how do you do all of those things?
And the University of Waterloo, it seems like they just spend half of their time.
I don't know if it's true, but I think it's a year. They spend so
much time. It's part of your curriculum to spend a year in internships.
Yeah. They just go from company to company. They show up on your team as a junior engineer who has been to like 20 companies, not really, but it seems like a lot of my new grads have also briefly worked at Apple, Google, Tesla. Yes. And there's a common meme where they collect all these logos like Infinity Stones, and they always put it on LinkedIn. It's very unclear
that they were an intern. Yeah, yeah, exactly. But it does actually make them so much better compared to other new grads. And I wonder if that's a useful model maybe for the future when we also have to like crunch down the amount of time you have as a junior employee because the value you have as a junior employee is going to like be impacted. My sort of pro young people take is
that you're more, you have higher neuroplasticity, you can learn more, you have less pre-existing biases. And what I assume is true for you, what OpenAI often says is that
biases. And what I assume is true for you, what OpenAI often says is that actually it's the younger like first grad engineers that use codecs or their coding stuff more innovatively than the experienced engineers who have a set and preferred way of doing things. Yeah, as I talk to people, I have similar experience. Yeah, so maybe you're
things. Yeah, as I talk to people, I have similar experience. Yeah, so maybe you're more AI native and therefore you get cut. But I think the problem is you don't need that many of them. I mean, Anthropic is on the record as saying we do believe that the impact on the market is going to be sizable and we do not think that people overall are ready. And we do actually think we
should probably talk about it as a society much more. I'm not sure that I'm the individual who can add anything useful there. But I think as societies, with economists and governments that need to wrestle with those questions in a way that is probably more meaningful than me wrestling with them, we're probably not doing that enough. Yeah, well,
we'll try to educate. And then I think also just releasing frequently as you guys do, or maybe too frequently, is helping people adjust over time, right? Rather
than one big bang thing, there's this sort of gradual takeoff that people are living through and that we can feel, right? Yeah. But I think a lot of us are wondering at what point we actually have full takeoff, right? Like, at
what point... we're all sort of expecting this big bang moment where things will accelerate so quickly that it becomes a self-reinforcing loop. And at that point, it's sort of off to the races and there will be no more slowly catching up. You know, just have Claude being so good at everything. It's when
Cowork is training models. It's when it's looking at TensorBoard and Weights & Biases and training things. We can all debate how many years away it is, right? Some people say
maybe it's 10 years away, maybe it's a year away. I'm
not entirely sure where I come down on this, but I'm also not entirely sure that it ultimately matters all that much whether it happens in four or five years. If we have decent certainty that it's going to happen, it's probably something
we should wrestle with. I wanted to talk... So by the way, the scheduled task completed. The "clean my desktop" task completed, and it did it. It organized by file type, which, okay, but you know, I was trying to get it to do something more thematic, like read the file, understand what it's about, group by the topic rather
than the file type. I mean, you can just follow up and have it do that. Oh yeah, clearly. Like, it is proposing this, right? Yeah, so it's got some
topical things, but yeah, I could probably do better. So I probably need to give it a skill to read video files so it understands, here's how I like to... Honestly, though, I see that you're using Opus 4.6, right?
My recommendation for people is increasingly: don't worry about it anymore. Just tell it what you want it to do. Yeah. And it's probably going to figure out a way to do it. Okay. It might not necessarily be the way that you'd like or the way that you'd have gone about it. "Read videos deeper." But we're outsourcing organizing all of this, so that's fine. Yeah. I'm honestly so curious what Claude
is going to come up with. I'll kick that off. I also wanted to talk about the overall... you know, you talk about data analysis, you talk about your personal finances. You also said, which by the way is very timely for us, tax season, right? Like, use Claude Cowork for tax season. It is not responsible for any mistakes, but might as well, right? It's free knowledge work for you. So
I think Claude for finance is a big deal, and this is definitely in that mix. I wonder, is it a separate team?
Do you talk to them? How important is it? Right? Like, you can also natively output Excel files now. Yeah. Just talk about the finance effort. Yeah, we care about the verticals quite a bit. So we do have a dedicated verticals team. We also
have a dedicated enterprise team. And this is engineering, not sales? It's engineering. Yeah. Yeah.
It's engineering. So we do have people who come to work every single day and ask themselves, how do we make Cowork extremely effective for people in those specific industries? How do we make it easier for them to understand? How do
we make it easier for them to plug into this and get the same value out of it that software engineers get? I think it's no real surprise that software engineers ended up at the forefront of the entire AI moment, because so much of it is Rube Goldberg machine-esque, and we're already used to automating things, right? It's part of our job. So we
care about it quite a bit. I think it also really matches what we see Claude being very good at as a model. I think it provides a tremendous amount of value to those customers in particular, because we can do so much with the amount of data they have. Those are data-heavy industries. They're industries where correctness matters quite a bit. So, for us... I've used it to analyze my business,
I just can't show it. It's too sensitive. I had a similar question about taxes.
Like, I did tweet about, oh, Cowork is doing my taxes. This is honestly incredible. And it's annoying, because this is so cool, but Twitter is maybe not the audience that needs to see my tax return. Yeah, but here it is. It's reading
the videos. So it's like, yeah, it's getting more. Yeah. How did it actually do it? I'm actually curious. Oh, usually it just takes a screenshot and then
it reads the screenshot by vision. So this is what I do for my Zoom upload thing, right? Because I have paper club sessions that I need to upload to Zoom, and I wanted to automatically title them and do show notes and everything. So it just takes screenshots and tries its best. It would
probably benefit from transcribing, whereas it's operating by pure vision now, but it's good enough.
And then I do have to call out to Nano Banana to do images. So
unless you guys do images for me, I have to call other people for images.
We're aware. It's just so fun for me, because this is the thing that I'm increasingly curious about: Claude's creativity, and figuring out what Claude's approaches are to certain problems. Yeah. Vision for everything is like the
superpower, right? And computer use, you guys were the first to do
computer use, right? And when it was launched, I was very unimpressed. I was like, it's slow, it's unreliable. It's wild how much better it's gotten. It was
one year ago. Yeah, I know. Like, it was barely usable. Yeah, I remember it was barely usable, but isn't it wild how much better things have gotten since then? We
went to the Anthropic office for the launch event for computer use, and there was this hackathon. And nobody hacked on computer use. But I did see, I don't know
if you're okay with me saying that, but I did see briefly that you have an automate-macOS MCP server installed, right? Have you ever used that?
Sorry, which one? Where? If you go to your settings. Oh, settings. Okay. Sorry, this
one? Yeah. Yeah. I noticed that in your connectors. I probably set it up one time, but I don't use it actively. Oh, okay. The macOS
automator? Yeah, yeah. So, yeah, this one, I really wanted to just automate everything in my thing. I didn't find it super reliable. Okay. Why? No, no.
That's true overall. Claude is much better at writing AppleScript and executing its own AppleScript than relying on these third-party tools. Yeah. So I initially installed iMCP and all these other MCPs that people built, but now I don't use any of them anymore. Just let Claude write its own thing. It's going to be more custom made. We keep going up the stack,
but I think computer use is a fairly interesting area to me. And it's
also interesting in the sense that I don't think we're far away from Claude being very effective at using your computer and not just a theoretical computer. What's the relationship between the user and the computer?
Like, there were some tweets about how huge some of the VMs that Claude Cowork creates are. It's like 12, 15 gigabytes, and people complain. But at some point it's like, if you're using the computer that it's taking action on, is this just your computer?
And I'm just looking at it. You know, I think that's why people like the idea of the Mac mini with OpenClaw or whatever on it, because it's got its own home, you know, it's doing its thing.
I'm doing my thing. I think there's some kind of, not a race condition, but it's like, okay, if I kickstart this task, now I can't really use the computer, you know, because Claude Cowork is doing things on it and it's kind of awkward. Like, yeah, I'm not sure. I do think it's a super interesting area because
I can maybe tell you some of the things I thought about that I think are actually a bad idea. So when we initially started working on Cowork, I did have some dreams about what it would look like for Claude to have its own cursor. Could be cool, right? It's a computer, we can write code, we
can touch everything. Who says that computers need to have one cursor? We could
do a second cursor, right? But that actually breaks down quite a bit, even if you go and present cool dreams to both Apple and Microsoft. You're like, wouldn't it be cool if... It breaks down quite a bit because so many of our models of the computer are built around this idea that there's only one thing working on it. There's a foreground app, a background app. Claude in Chrome can
work in the background, but that's within one application. At the operating system layer, that is a lot harder to implement. So I'm still grappling with what it means for Claude to actually act on your computer. Is the right format for Claude to have its own computer that you set up, and maybe every now and then you zoom in and you play with it? Or is the right format
for Claude to just wait until you're stepping away for a little bit and take over while you're gone? Or is the right move for Claude to just have its own computer in the cloud, and whatever you want Claude to do, you have to set up yourself? Right? There's a number of different options. This
is the thing I think about a lot: what is the relationship between you and your computer, and you and your data on the computer? Because how intimate that relationship is kind of depends on the tool and the thing that you're currently looking at, right? We're quite comfortable sharing some things, very uncomfortable sharing other things. And
I think whatever product is going to be successful will have to deal with those different things. But you probably, even if Claude was capable of making a determination, would
you want Claude to make that determination in the first place? It's very tricky, because it's more than just privacy. It's almost intimacy. And it's tricky to reason about in a way that will make everyone comfortable. Yeah, I could see, you know, a VirtualBox, like an actual VirtualBox app where you run the
VM and then you have a screen within the screen. You can put it in the background, but then you can jump into the screen. You
know, that's not a good idea. Yeah. I mean, people used to do it, virtualizing Kali Linux on a Windows machine. Yeah. And
you would just jump in and then you would jump out, but it's not like a dual boot. It's within the thing. The problem is that you need twice the amount of RAM, twice the amount of, you know... it's kind of taxing on the machine, but I think that would be cool. Kind of
like, you know, the little Claude window, I can see its desktop, look how cute it is clicking around things. I was going to bring up, he's the original machine-in-the-machine guy because he has the Windows 95 project. Where's the
Windows 95 project? There's probably someone like it out. No, no, no, no. It's like
the first thing you see is this one. Nice. Yeah, exactly. That was honestly a very fun project though. I should say this just so that no one gets the wrong impression: obviously I didn't build Windows 95, because I was a child, but I also did not build the actual engine that is capable of simulating an x86 processor in JavaScript
and Wasm. That's a tool called v86, which is very cool and everyone should try.
But this came out of a debate we had at work where people were, as they often are, debating the merits of Electron and whether or not we should be building software in JavaScript, yes or no. And I still am very upset that I can run all of Windows 95 in JavaScript and launch Microsoft Excel inside the virtualized JavaScript Windows 95 machine and do
that entire chain faster than I can do a lot of other things in traditional SaaS applications. This is sort of a performance rampage that I went on.
So I mostly built this as a joke for some of my colleagues at Slack.
This took like one night. What? But it was not hard to do. All the hard work is in v86. If you go to
the repo, it's going to say 99% of this work is done by a guy who goes by the name copy. His name is Fabian. Yeah. Cool.
I think you're kind of back on the Windows grind because you're building out the Windows support. I thought there were some really cool technical stories to tell. And it
gives people an appreciation of, well, here's how hard it is, and here's how important it is, how you invested in the sandbox. So maybe this is a good opportunity to talk about some of the details. Oh yeah, the VM honestly is so cool.
There's a lot of things we dislike about the VM, right? There's a lot of things that are real trade-offs, and you want to know why you're making those trade-offs. And you're right, a lot of people write me like, hey, how
come Claude is taking up 10 gigabytes? I could say on that point, it's not actually taking up 10 gigabytes. It's just that the way macOS displays the size is wrong. The way we actually write it to disk, we collapse the empty space in the image. So it's not actually taking up 10 gigs,
but that's a technical distinction that's not going to matter too much. To me, the "how come" is that it takes too long to start. Yeah. It's like 30 seconds sometimes.
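A note on the disk-size point above: the gap between the size a file reports and the space it actually occupies can be reproduced with a sparse file. The sketch below is only an analogy, assuming a POSIX system and a filesystem that supports sparse files. It is not a description of how Cowork writes its VM image, and `sparse_file_sizes` is a name invented for illustration.

```python
import os
import tempfile


def sparse_file_sizes(apparent_bytes: int) -> tuple[int, int]:
    """Create a temporary sparse file; return (apparent size, bytes allocated)."""
    fd, path = tempfile.mkstemp()
    try:
        # Extend the file to the requested length without writing any data.
        os.ftruncate(fd, apparent_bytes)
        st = os.fstat(fd)
        # st_size is the logical length; st_blocks counts 512-byte units
        # that the filesystem has actually allocated (POSIX only).
        return st.st_size, st.st_blocks * 512
    finally:
        os.close(fd)
        os.remove(path)


apparent, allocated = sparse_file_sizes(100 * 1024 * 1024)
print(f"apparent: {apparent} bytes, allocated on disk: {allocated} bytes")
```

On a filesystem with sparse-file support, the allocated figure is typically far smaller than the apparent one, which is the same kind of mismatch a disk-usage display can show for a mostly empty image.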
I don't know. Oh, it should be faster than that. Whatever. It's maybe 10, but it feels like 30. Yeah. Either way, whatever it is, it's going to be slower than just running Claude Code directly on your computer, right? So the
trade-offs are real. But what we're doing on Windows, we're using the Windows Host Compute System. It's the same thing that WSL2 runs on, the Windows Subsystem for Linux,
which I think a lot of developers appreciate quite a bit. And it's pretty cool, because we have to separate out which system space this virtual machine runs in and who gets to talk to the virtual machine, because obviously you give this virtual machine a decent amount of power. How do we optimize not just the connection between the two systems, but also how do we make sure that some random other application
doesn't get to talk to Claude inside the VM? We do some pretty interesting things.
Last week we started writing a new networking service, a networking driver, that optimizes how Claude talks to the internet if your company is doing weird internet things like packet inspection and taking apart your SSL inside your company. I think there was probably a very small, easy version of Cowork to build that is much simpler, but
also breaks on most users' computers. And this one is quite nice because it works on most users' computers. The default example I always go for is, I really want this to be highly effective on a machine that most people pick up, and that machine will probably not have Python. It will not have Node.js. And even if I just take away those two things, Claude is going to be so much less
effective on your computer. So what do you do? I mean, maybe you require people to install Node and Python. Oh, like, what does the future look like without a VM? No, no, no. So like you said, right, let's say the target machine is whatever a default-spec Windows laptop is. We do this, which is quite cool. So on macOS, we use the Apple Virtualization framework, which is pretty
solidly optimized. It's good stuff. And it's a simple API call, right? It's like
super simple. I saw the code recently. I'm like, that's it? What the fuck? Once
you start shipping production code on it, you start adding all of these edge cases, and it ends up being a little longer. But I think Apple really cooked with the Virtualization framework. It's very, very good. It's very fast, it's very reliable.
And same on Windows: the Host Compute System, and I think WSL2 as well, is maybe one of the diamonds within Windows. It's one of the few things that developers universally rave about. It's very, very cool. And hooking into the same subsystem makes it a lot easier for us to say, we don't really care how locked down your computer is. Maybe it's your employer's computer and your employer has decided that
you get to install nothing. Not trusted. But it's true in a lot of environments, right? Even at Anthropic, our IT department controls what kind of software you install,
which is a pretty common experience at many companies. And this makes the job of IT departments so much easier, because we can say you can separate out Claude's computer from the user's computer. And then for Claude's computer, what you probably care about is data loss. You care about a
potentially hostile actor. You care about maybe data being exfiltrated. And once you control the network and the file system layer, you don't necessarily care anymore that Claude might be writing super useful Python scripts. What worries you is the fact that once you install Python, now anyone can do anything on the computer. But once you put that in a VM, that risk really goes down. Yeah. So that's why we
jumped through all of these hoops. Yeah. I think you had a different tweet about this, but it's almost like people also have approval exhaustion. You can't approve every single command. Sometimes by default, in some of the CLIs, I think even early Claude Code, you had to approve every single command. Yeah. And like,
so there's this sort of dichotomy between either approve every step or dangerously skip permissions.
Yeah. And actually sandboxing is kind of the middle ground. Yeah. I do
think it's maybe on us as the AI industry to come up with something better than, oh, this is super safe as long as it doesn't do anything, and if you want this to be useful, then you have to approve every
single step of the way. And computer use is a good example. The only way to make computer use on your host super safe, really super safe, is probably if you approve every single action. The model says, I would like to type the word... You're like, okay, that seems fine. I know which cursor is focused. Yeah, it's not automation if
you don't delegate. Yeah, exactly. You need to be able to delegate and walk away and trust that this thing is not going to mess up dramatically. And I don't even think we need to build perfect systems. I don't think
we need to wait for 100% model alignment. We can rely on the same Swiss cheese model we've used in the industry for a long time. I do think we need to universally, maybe eventually, invest more, and that's what we're doing, we need to invest more in systems, but we can say you do not need to approve everything.
Speaking of the Swiss cheese model, he just wrote a thing about this. Oh, cool. Yeah. Super cool. I mean, it's weird how, I guess usually safety and security is kind of a boring word to engineers. They're like, just give me unsafe, give me insecure. But I think achieving the right thing... like, you're going after consumer slash prosumer. Yeah, yeah, kind of
like both. I think I also want to capture people who would have no trouble using Claude Code, like yourself, right? Yeah, yeah. But still find it maybe just more convenient, easier. You're like, oh, cool. That's like the to-do list on the right. I can
edit it. Those things are just easier to do if you have to. Yeah, but
this is clearly the knowledge work side. Claude Code will clearly capture the development workflow. But I do think you have to sweat these safety and
security details in order for people to trust it. And even Claude in Chrome, having whatever API it uses to do the background thing. Yeah. That's the only reason I use it, because otherwise I would have to just get a separate machine. Yeah. And just run it, run through the list. And that sounds super annoying.
Yeah. I mean, I'm currently doing it, but... And I think also as developers, maybe we are more risk tolerant, but I think we also just have, I don't want to say arrogance, but sort of the trust that if the really bad thing happens, we can probably fix it. I just tell Claude to check with me before
doing any irreversible action, like sending an email or doing something permanent. It's good enough.
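The "check with me before any irreversible action" rule described above can be sketched as a small policy: run reversible steps without asking, and pause only on the ones that cannot be undone. This is a minimal illustration, not how Cowork or Claude Code implement permissions; the action names, the `IRREVERSIBLE` set, and the `run` helper are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical action kinds, invented for this sketch.
IRREVERSIBLE = {"send_email", "delete_permanently"}


@dataclass(frozen=True)
class Action:
    kind: str
    detail: str


def needs_confirmation(action: Action) -> bool:
    """Only actions that cannot be undone require a human decision."""
    return action.kind in IRREVERSIBLE


def run(actions, confirm):
    """Run reversible actions directly; call confirm() before irreversible ones."""
    log = []
    for action in actions:
        if needs_confirmation(action) and not confirm(action):
            log.append(("skipped", action.kind))
        else:
            log.append(("ran", action.kind))
    return log


plan = [Action("move_file", "a.txt -> docs/"), Action("send_email", "to: team")]
# A confirm callback that always declines, standing in for a user who says no.
result = run(plan, confirm=lambda action: False)
print(result)  # [('ran', 'move_file'), ('skipped', 'send_email')]
```

The design choice is that the number of prompts scales with the number of irreversible steps rather than with the total number of steps, which is one way to sit between approving everything and skipping permissions entirely.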
But not even Claude, I mean simple things such as npm install.
We're all running npm install with full user permissions. And if it wants to read .ssh, it will. That is the default. I agree. I agree that it's fine.
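The point about full user permissions applies to any program started from your account, not just npm: the process inherits your file access. The sketch below only checks which entries in a directory the current process could open, without opening or printing any of them. The helper name `readable_by_this_process` is made up for illustration, and `~/.ssh` is used because it is the example mentioned above.

```python
import os
from pathlib import Path


def readable_by_this_process(directory: Path) -> list[str]:
    """Names of entries in `directory` that this process has read access to."""
    if not directory.is_dir():
        return []
    try:
        entries = list(directory.iterdir())
    except PermissionError:
        # The directory itself cannot be listed by this process.
        return []
    return sorted(p.name for p in entries if os.access(p, os.R_OK))


# Only a count is reported; no file is opened and no name is printed.
names = readable_by_this_process(Path.home() / ".ssh")
print(f"{len(names)} readable entries in ~/.ssh")
```

Run inside a VM or container that does not mount the home directory, the same check would find nothing, which is the isolation argument made earlier in the conversation.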
I'm obviously doing it every single day. I think npm and GitHub too have done a pretty good job over the last couple of months of cleaning house and coming up with more specific tokens. But generally speaking, I think as engineers, we've always been a little bit more risk tolerant. And if you do a little bit of introspection and you ask yourself, is that how we should be doing things?
You might not always come up with the right answer. And I think for models too, my approach is... the safest thing is to do nothing.
We do want products that are quite capable, but to the extent possible, I don't want to ask you, are you okay with a script? Because I kind of believe that once it starts becoming a part of your workflow, either you don't have the skill to understand whether or not this Python script is safe, or you're not going to read it anyway. Cool. I guess a couple of parting questions. What's the
future of Cowork? I think we're still in such early days. We're going to keep shipping things, and we're going to keep iterating on this thing pretty quickly, by which I mean you can continue to expect that every single week there's going to be a small new feature, if not a big new feature. I'm going to continue to double down on your
computer, on making you effective on your computer and making Claude effective on your computer. We're starting to grapple, as we talked about today, more with the question
of, what does your computer mean? Does it have to be the one in front of you? Is it a VM on your computer, or a computer somewhere else? And then the third thing that I'm quite excited about is that we're continuing to hill-climb on slowly taking users who are used
to asking questions and getting an answer, and slowly teaching them to step more and more away and let Claude take over bigger and bigger tasks, both in time as well as in scope. And I think you can probably see most of our investments in our feature releases working on both of those things: the ability to do more on your computer and then the ability
to do more independently for longer. Does remote control work for Claude Cowork yet? No,
right? Excellent question. Coming soon. I mean, that's an obvious thing if you want to keep betting on your computer. But to me, you know, we talk about how people are not ready this year. There's no wall, it's accelerating. To me, it's like, what will we be doing differently at the end of this year that we're maybe not even thinking about at the start of this year, right?
I'm just trying to look ahead to what's a good use case that we sort of aim towards. So for example, for the machine learning scientists, it's always, okay, well, I want AI scientists that can automate machine learning. But for knowledge work, I mean, I can already get it to sign up for Google Cloud, which to me is AGI. Because it's Google Cloud. But what's beyond that? I don't
know. I think it's basically the idea that you still had to tell it to build your script, right? You were still kind of involved in a way that maybe felt kind of magical to you, but to me, on the other side, as the person building this product, it still feels kind of heavy-handed. I see
so much process that I'm like, oh, let me take that away from you. Okay.
Like, how do I just go? I will continue to go further and further up the stack and make your life easier and easier. Oh,
here's one, right? Yeah. Watch. You know, I don't care about my own privacy or whatever, or I trust Anthropic. So just watch everything I do on a normal day-to-day basis. At the end of the day, tell me what is, call it, Cowork-able. Yeah. I think the funny thing about a lot of these products is that
for good reason, in my entire career I've never teased too much what I'm working on, because I think you should just build the thing, release it, and then talk about it. I'm not a big fan of vague-posting my own work ahead
of time. But the thing that is always so fascinating to me is, both
of you, multiple times today, have mentioned things and I'm like, yeah, it is very obvious that someone should be working on those things. And I think we're still in the space where, if you look at Cowork, the things that we will be releasing will probably not be a big surprise to either of you. You're
going to be like, yeah, obviously that's valuable, obviously they're working on those things,
and obviously that's good and useful. And the more I hit those points, the more our features fit into that category, the better I think it is for us, because then we don't end up building things that are too hyper-specialized, too difficult to understand.
Yeah, I think the hyper-specialized thing is very important. It keeps you like general purpose.
It means you're not thinking too small, maybe. I don't know what the word is.
Yeah, yeah, exactly. It's the whole concept that at no point do we release, you know, there's no Claude Code for Node.js applications that use React and TanStack and only those two things, and if it's anything else... I know several startups like that. I'm not a VC. I'm not an investor. It's hard for me to predict where the markets go. But in terms
of the building blocks that I'm interested in, Electron is probably by far the most popular thing I ever built. And Electron itself is very abstractable and generalizable, right? So many apps are in it. I think it would have been hard for me to predict how many apps actually end up using Electron.
And it would have been even less useful for me to try to predict what those apps do. I just really remember Loom coming out and me thinking, that is cool.
Like, your camera in a little circle in the corner. That is pretty smart. That's
an Electron app. Yeah. Or at least was. I'm not sure if it still is.
It was for a while. 1Password has so many interesting things, right? It's a
level of the stack that I'm quite comfortable with, and whenever I give other engineers advice, it's actually that layer that I think is most valuable to invest in, because the tools at that layer are not that good, but that's where you get the most leverage for the future in general. Just a quick tangent on Electron, because I always wonder this. Have you looked at Tauri? I have, yeah. What's your take? My view
is that most things should be Tauri by default unless you really need the full power of Electron. Yeah, I can give my big take:
why do we ship an entire version of Chromium inside the thing, right? Like, why
do we do that? People ask me this question a lot because it's
very counterintuitive. Wouldn't it be much easier to use the WebViews that are on the
operating system? Wouldn't it be much easier not to have to do that? And the
answer is yes. And obviously I did that once upon a time. I did
that. It was a version of the Slack app that used just the operating system
WebViews. Wait, did you start the Slack app? Well, team effort, yeah, but I was
there and we built the Slack app. Yeah, it's crazy. I mean, obviously, get the Electron guy to do it, but... Well, but this is an interesting point. By
the time I joined Slack, they already had an app that was built with something at the time called MacGap. It was a little bit like the same appgap thing for mobile. It just used the operating systems, web views, and that didn't work for
for mobile. It just used the operating systems, web views, and that didn't work for like so many reasons. And they were like, all right, maybe we need like bigger guns. We need to like take more control of the rendering stack. And there's a
guns. We need to like take more control of the rendering stack. And there's a few things I always mention here. I think if you're building a small app, just going with the operating systems, Red View is perfectly fine. If you're building an app maybe that doesn't have too many users who will like cry bloody murder, if it doesn't work, that is fine. The reason to go with your own embedded rendering engine
is because, and this is still true in 2026, the operating system rendering engines are not that good. They're just not that good. Both Microsoft and Apple are trying to move away from that. They so far really haven't. The only way to upgrade those is to upgrade your operating system. So if you're, say, a Slack and you have a critical rendering bug in WKWebView or some of the other WebView options, your
only recourse is to tell your customer, oh, sorry, you're too poor, you didn't buy the latest MacBook. Unacceptable. Unacceptable to the user, unacceptable to the developer. So you
sort of need to go down the stack and find the best rendering engine and then put it in your app. Why Chromium? Even though it's very big, Chromium is by far the best thing. Like I often like to remind people, in the Unreal Engine, when you want to render some text, they use Chromium. Like Chromium is part of the Unreal Engine for the same purposes. Chromium is very, very good. I think it's like
one of the marvels of engineering. It's very hard to... we're in San Francisco right now, this is where we're recording. Most of the people in the city are web developers. It's hard for me to like overstate how magical it is that you can run
like rendering a YouTube video, dynamically negotiating a bitrate, figuring out what to do about your extremely broken hardware driver. Actually, this is a fun thing. You can enter chrome://gpu. Okay.
And if you scroll down a little bit, these are all the enabled workarounds,
because something is going wrong on your computer. If you're doing this on a Windows computer with like a GPU that is not the most popular GPU, it will be much longer. And all of these are usually just there to make sure that if
I say as a developer, I want a red pixel to appear here, that that actually happens. Chrome is such a marvel because it works on all the machines that
a user might throw at you. And it's going to work fairly reliably. And if
it doesn't, they will probably fix it within 24 hours. I see. So this is the super operating system, right? That works everywhere. Yeah. So a lot of the magic of Electron is honestly just that it makes it very easy for you to ship Chromium in a way that serves you exactly and your use cases exactly. Our next
interview is with Marc Andreessen, who had the phrase like, desktop OSs are just poor implementations of the actual OS, which is Chrome, which like actually works everywhere.
And this is the platform where you ship apps. I think the wild thing is that, I guess as engineers, we so often sort of assume that the platform, like the layer below us, is like super stable. And then you talk to those people and they go, we're also just like guessing. And I had like a distinct moment at Slack where one of our customers at Slack was NVIDIA. And for a while,
I really put GPU developers on this pedestal in my head. And I do think they're still probably much smarter than I am. But I was like, hardware engineers who build the chips, who then build the drivers, their work must be so much harder than mine. They must be very good. And we had like one bug in Slack where like if you had a YouTube video in Slack, it wouldn't quite render
right, like it would have these weird artifacts. And that ended up being a Chromium bug and I ended up on this like giant thread. So I got to see a lot of the source code, and they also just have comments like, to-do:
we don't know why this is weird, but if you flip this bit, things work. You know, this is just like happening at every layer of the stack. Maybe the
end of year AGI prediction is that Claude can build Chromium. You
see, you laugh now, but someday. It's starting to get pretty good. It used to be completely useless, mostly just overwhelmed by how hyper-specialized the tools are inside the Chromium repo. For a long time, the Chrome devs would sort of reinvent all the tools because none of them were capable of handling Chrome. I
think the AGI moment I'm kind of waiting for is at what point are we going to say Electron is probably no longer necessary because you can just build fully native apps. The Swift-y. Yeah, like not just in Swift, because this is one thing,
like it's pretty easy. I think our current models are quite capable of taking an Electron app and replicating it in Swift. Are they going to be capable of like building an app that is actually more performant, uses less memory, all of that stuff? That is going to go into the same hyper-optimization that developers have done for a long
time. We're not quite there yet where I can like point even our best models
at a thing and say, just replicate this in native code. Make no mistakes, UltraThink, right? We're not quite there yet. UltraThink is back. Today, UltraThink is back. Yes. Okay.
Or we'll go on UltraThink for like days. Just a pretty long time before. But
he worked on UltraThink for days? Yeah. Why? It's just a prompt. I'll let it go. The more goes into it. Yeah. Okay. Another question I had is like, Coworks.
So if I have my Claude Cowork, like what's kind of like the multiplayer mode?
I think sub-agents is like single player, split up the context. Yeah. And the multiplayer Cowork is like my colleague has some file on their machine that I want to know about, or I want to know how their task is going to then update my thing. Like, is that interesting? Is that something that makes sense for you to
build or for like... It's like super interesting to me. It almost goes back to like some of the scaffolding where I'm like, okay, are we going to end up, will we end up building scaffolding that will just go away? And like a question I have here is at what point do we just assign these things like
their own Gmail account? We'll just give them their own like Slack handle and then they will just like use the same tools we humans use to interact with each
other. You mentioned our finance people. They've been working pretty hard on very good office integrations and I think for a while we built so much tech around Claude leaving useful comments inside a Google Doc and now it just does it. It
just leaves a comment in your Google Doc and that's how you interact with it.
Maybe it's a similar thing where I still have open questions around what is the best interaction mode. Is it for us to build something super custom for Cowork agents to
talk to each other? Or is it, okay, let's just jump straight to the finish line and say, well, we're just going to give this thing, if you use Slack at work, we're just going to give this thing a Slack handle and that's going to be the way it's multiplayer capable. They communicate with each other. Yeah. Like, you
know, as a fun project, I built this thing called PiQ, which basically takes any repo and the Pi coding agent, puts it in a VPS, and then there's a public webhook where anybody can submit a coding task. And then there's a dashboard in which you review the task, and then there's PiQ. Pi, P-I-Q.
You basically get all these like tasks. Anybody can submit a task. And to me, it's almost like in the organization of the future, it's like the salespeople are talking to the engineering team that is talking to the marketing team, to the product team, and all these Coworks are going to like queue up decisions for other people to approve in a way. Yeah. You know, and I'm kind of curious what
that looks like and like how do I give my Cowork the ability to both approve tasks without asking me. Yeah. And how to decide which ones I need to review. Yeah. You know, because for some of these things, it's like, you know, you want to change the color or something, that's kind of like a branding decision, and another one is like, hey, your thing is just broken. It's like,
this is like how you fix it. Yeah. And Claude can actually review whether or not that prompt matches what it's trying to do. Today, everything is still very, it's like multiplayer within the single player, you know, I can spin up many of them.
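[Editor's sketch, not from the conversation: the submit-then-triage flow described above can be pictured as a small in-memory queue. This is only an illustration under stated assumptions. The names `Task`, `TaskQueue`, `triage`, and `AUTO_APPROVABLE_KINDS`, the `kind` label, and the rule that bug fixes are auto-approved while branding changes wait for a human are all hypothetical; nothing here reflects how PiQ or Cowork is actually built.]

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto_approve"
    HUMAN_REVIEW = "human_review"


@dataclass
class Task:
    submitter: str    # e.g. "sales" or "marketing"
    description: str
    kind: str         # hypothetical label such as "bugfix" or "branding"


# Hypothetical policy: task kinds the agent may approve without asking.
AUTO_APPROVABLE_KINDS = {"bugfix"}


def triage(task: Task) -> Route:
    """Decide whether a task can skip human review."""
    if task.kind in AUTO_APPROVABLE_KINDS:
        return Route.AUTO_APPROVE
    return Route.HUMAN_REVIEW


@dataclass
class TaskQueue:
    approved: deque = field(default_factory=deque)
    needs_review: deque = field(default_factory=deque)

    def submit(self, task: Task) -> Route:
        """Stand-in for the public webhook: anyone can submit a task."""
        route = triage(task)
        if route is Route.AUTO_APPROVE:
            self.approved.append(task)
        else:
            self.needs_review.append(task)  # would surface on the dashboard
        return route


queue = TaskQueue()
queue.submit(Task("sales", "Login button throws an error", kind="bugfix"))
queue.submit(Task("marketing", "Change the header color", kind="branding"))
```

[In a real system the `triage` step is where the open question from the conversation lives: a model, rather than a fixed set of labels, would presumably judge whether a request is a broken-thing fix or a decision that needs a person.]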
But like, how do I get multiple people to hand off things to each other using their particular context? Yeah, and for both of your Coworks to talk to each other, right? Right, yeah, hey, we got an episode today. Can you like, have you,
you know, or... Yeah, this is like, I know we're like running out of time here, but like we previously talked about sharing skills and I did have this question of like, what if your Cowork would just like ask the other Coworks if they have a skill for this task? There's another thing we could do, right? Like, okay,
so skill transfer. Yeah, like, and again, that's maybe a bit...
This maybe goes back into the territory of like, building something very powerful and building something creepy often go hand in hand. Because I could tell from the reaction of my fellow engineer that this is probably not what we're going to do. But
like we have Bluetooth LE, right? Like I, this computer can figure out that it's sitting right next to this computer. So you're probably working on the same thing. Will
you see that in Cowork? Probably not. But there's like, I think, really creative solutions to problems that we really haven't tried yet. Yeah, yeah, yeah. Excellent. I guess the last thing is Anthropic Labs. I always have this mental model of a model lab versus agent lab. And this is basically Anthropic's internal agent lab, which Claude Code is now under, right? It's part of the whole org. I mean, people are so
fungible, right? Like, okay, this is just, I don't know how real this is. No,
it's a real team. It's very real. The Labs team is primarily working, though, on things that you don't see in public yet. They're trying, like, really wild out there ideas that seem quite improbable. The mad science thing. But are you officially under this
thing? No. Now Claude Code is like a fairly big group where I actually don't know how many people we are. Like I remember yesterday coming into our weekly meeting. I was like, whoa, there's a lot of people here. But we still
have a Labs team. We actually made the Labs team a lot bigger. Mike just
joined the Labs team as an IC, which I think is very cool and very
fun. But they're working on things that you have not seen yet that are extremely
out there and probably half broken. Right? Like, sort of the idea of a Labs team is that it should only work on things that make really no sense for anyone else to work on. Okay. Well, looking forward to exciting things from there. But
thank you so much. I know we're out of time, but I appreciate your joining
us. I appreciate Claude Cowork. Everyone go use it. It is the closest I've felt
to AGI this year. That's so nice of you to say. Thank you very much.
Thank you for your time.