AI Agents Just Changed Forever: GLM 5.2, Codex Skills, Claude & Cursor
By Riley Brown
Summary
Topics Covered
- Open-Source Just Passed the Vibe Check
- Record Once, Skill Forever: Screen Capture as AI Training
- SpaceX-Cursor: Unlimited Compute Changes the Super App War
- Fable Depression: Ambition Is the Real AI Benchmark
Full Transcript
What an insane week in the world of AI agents. If you want to know the latest
agents. If you want to know the latest updates on Claude Fable 5, the latest Codex feature that lets you record your screen and turn it into skills, the best open-source model in the entire world,
and if you want to know about the SpaceX cursor acquisition and more, you're in the right place. You're watching Agent Native. I cover the latest updates and
Native. I cover the latest updates and news from Frontier agent platforms and models so that we can learn about and use AI agents effectively. My name is Riley Brown, and if you want to become
Agent Native, hit that like button, hit that subscribe button, and let's dive in. Today, we're going to get started
in. Today, we're going to get started with the most important news, in my opinion, in the world of AI agents. The
company Z.ai released an open-source model that I believe is like five or six times cheaper than GPT 5.5, and some are
saying it's actually comparable and almost as good as Opus 4.8 and GPT 5.5.
So, GLM 5.2 is a model released by Z.ai, and this company is from China, and this model is open-sourced, and it's much
cheaper than Frontier models. And by the way, I'm going to show you exactly how you can get this set up directly inside Cursor in just 1 second. But, I first want to talk about the benchmark. And
so, here are some of the benchmarks, and this is what it looks like across the board. You'll see that GLM 5.2 is
board. You'll see that GLM 5.2 is comparable to Opus and GPT 5.5.
Currently, I think the best model, besides Fable, is GPT 5.5 with Opus trailing just a little bit, but this model actually held its own when I
actually tested it. Because normally,
when a new open model comes out, usually, there are benchmarks that are released. They don't actually tell the
released. They don't actually tell the whole story, or even in even close to an accurate story. But, because they put
accurate story. But, because they put these cool graphs on Twitter, there's a ton of hype, people make a lot of videos saying that this model's actually really, really good. And usually, when I
go to test that model, I just end up incredibly disappointed. I actually test
incredibly disappointed. I actually test the model and it does not pass the vibe check. And as I tweeted earlier today,
check. And as I tweeted earlier today, this was not one of those times. This
model after spending a ton of time actually using this model, I do believe that it passes the vibe check. I think
that it's getting close to the frontier labs, specifically Opus 4.8 and GPT 5.5 and I think this will actually cause the frontier labs, OpenAI and Anthropic to
release even smarter models. I think a lot of people realize that the models that they rely on every day can be taken away. However, with these open models,
away. However, with these open models, you can actually download the weights. I
think a lot of people are taking this time to test these open source models.
And so the best place to try out this new model, GLM 5.2, in my opinion, is directly inside Cursor and I'll show you exactly how to set that up in just a second. I use the Convex plugin and it
second. I use the Convex plugin and it one shot a Trello app with basically all of the different features that Trello has with a database and authentication and it works nearly perfectly or it
actually works perfectly. I also had GLM 5.2 go off to research about me, then create a landing page, and then run it locally. I also connected GLM 5.2 to my
locally. I also connected GLM 5.2 to my Notion, to my Slack and to a ton of other integrations and I was having it just do general agent tasks for me and it was doing a great job, just as good
as if I was using 4.8. And so yes, I think the model was really good and you're going to see a lot of people on the internet saying the model is really, really good. But you shouldn't take our
really good. But you shouldn't take our word for it, you should actually go in and try it. So I'm going to show you the easiest way to try this model directly inside Cursor. So directly inside
inside Cursor. So directly inside Cursor, what I want you to do is follow these exact steps. It should only take you 3 to 5 minutes to get this model directly inside Cursor. In order to add the model to Cursor, we're going to be
using another tool called OpenRouter.
Normally, if you want to use a bunch of different AI models, you need a ton of API keys in order to access them.
OpenRouter allows us to only use one key, so we can get access to GPT 5.5, Claude Opus, DeepSeek V4, and in this case, the most important one, GLM 5.2, and then thousands of other models. This
video is not sponsored by OpenRouter. I
just want to explain why I normally use this. And so, OpenRouter allows us to
this. And so, OpenRouter allows us to add any model to Cursor. I'll show you exactly how to do it. In Cursor, you're going to go down to your plan here, and you're going to click settings. Then,
what you're going to do is you're going to come up here and you're going to select models. You're going to come down
select models. You're going to come down to API keys, and what you're going to do is you are going to turn this on right here. So, normally this is off, and you
here. So, normally this is off, and you are going to put in your own API key.
And then you're going to say override the OpenAI base URL. You're basically
converting this OpenAI key into an OpenRouter API key. And in order to switch this from OpenAI to OpenRouter, you're going to paste this exact thing.
I'll put the link in the description.
You're just going to paste this exact thing in here, and then what you're going to do is you're going to come up to view all models, and you're going to come down here and you're going to click add custom model. Now, you can add any
model from OpenRouter here. And so,
we're going to go to OpenRouter, and we're going to click models, and we're going to look for z.ai/glm-5.2.
And you're going to see this little copy button right here. You're going to click copy, and now what you're going to do is you're going to paste this model right here, and you're going to click add. And
since I've already added it, it just said it's already available, but for you, it should show up somewhere in here, and it should look exactly like this: z-ai/glm-5.2.
Congratulations, you now have access to the best open-source model directly inside Cursor. Now, let's go test it
inside Cursor. Now, let's go test it out. So, if you go to a new agent
out. So, if you go to a new agent session inside Cursor, and Cursor looks very similar to Codex, you can select any model. Here, I'm selecting
any model. Here, I'm selecting Z-AI/GLM2.
I can say "Hi, what model are you?" And
there you go. I'm GLM 5.2 by Z-AI, and you are now ready to test the best model. I want you to comment below what
model. I want you to comment below what you did with it and how good it was at it. I want to know what you think of
it. I want to know what you think of this model. I'm genuinely curious.
this model. I'm genuinely curious.
Please let me know. And one of the reasons why I think you should get into using these open-source models and testing them out is because the founder
of Z.AI, who created GLM 5.2, said that they're going to get a Fable-level open-source model, like a model that's as good as Fable, that's open-source
within this year. So, someone said, "What's the current timeline for China to reach the Fable class or get as good as Fable 5?" And Elon Musk commented, he
said, "Probably Q1." And then the founder said, "Won't take that long."
So, that means he thinks it'll be done by the end of this year. And so, that means that in like 5 months, we could get a model that is open-source that is better than Fable, and it will likely be significantly cheaper. I don't know
significantly cheaper. I don't know about you guys, but I think the best place to use AI agents with my team, especially for marketing, is directly inside Slack. And the easiest way to
inside Slack. And the easiest way to create cloud-based agents that runs directly in Slack is with Hyperagent, where the agent can actually become part of your team. All you need to do is go to Hyperagent, create an agent with your
favorite skill. This agent can watch all
favorite skill. This agent can watch all of your channels, run on a schedule, use integrations, and send updates directly into Slack when something needs your attention. For example, the first one
attention. For example, the first one I'm building is basically a YouTube researcher. It scans my competitors
researcher. It scans my competitors using my YouTube researcher skill, and it keeps track of what videos are actually performing well. And it does so automatically without me asking. Then it
suggests videos for me to make based on the keywords and topics that are working in my niche. And whenever I upload a draft, it can generate 20 different thumbnail options for the video, and my
team can quickly figure out which direction is the strongest. The coolest
part is is that I don't need to remember to open another AI tool and ask it to do this every time. Because the agent lives in Slack, my team and I can talk to it where we already are working. It can
send us new ideas, run these workflows on a schedule, and keep improving as we add more skills and integrations. And
this is just one agent. You can build an entire team of agents for your own workflows. HyperAgent is giving away
workflows. HyperAgent is giving away $1,000 in credits to the first 1,000 people to sign up. Click the link below to sign up. Claim yours now. So now I want to move to the biggest super app
update of the week, and it involves Codex. It feels kind of like a slow week
Codex. It feels kind of like a slow week from Codex. They didn't really announce
from Codex. They didn't really announce anything that big, but they announced one feature that I believe is incredibly underrated, and it involves recording your screen. Let me just show you how it
your screen. Let me just show you how it works. Directly inside Codex now, you
works. Directly inside Codex now, you can use a plugin called record and replay. I'm going to show you the
replay. I'm going to show you the process for adding a Typefully draft.
Please make a skill called manual tweet draft. So now, you could just tell Codex
draft. So now, you could just tell Codex by using this record and replay skill that you want to show them how to do something. So here, it's going to say,
something. So here, it's going to say, "I'll use record and replay workflow to capture the Typefully steps." Now, watch this.
Look at that. It automatically turned on the recording, and it says, "Recording is now on. Show me the Typefully draft process." So now, I'm going to go like
process." So now, I'm going to go like this. I'm just going to type, let's say,
this. I'm just going to type, let's say, "Comment." And now I am going to go
"Comment." And now I am going to go create a new tab. We'll go to typefully.com, and I'm going to switch to Riley Brown.
Hello, this is a draft by Riley Brown. I
can add images and videos. Now, I can upload an image. And now we can do PNG and here we go. And that is the basic process. Once we're done, I'm just going
process. Once we're done, I'm just going to hit stop. It automatically goes back to Codex and it automatically enters I'm done recording. And look at this. I'll
done recording. And look at this. I'll
stop the capture now then inspect the recording event. And now it's creating
recording event. And now it's creating this skill called manual tweet draft. So
then we should be able to just type {slash} manual tweet draft and it will show up here. It doesn't quite yet. Here
it summarizes exactly what I did and this was a very short task. You could do it up to 30 minutes. That was a 1-minute task. You're allowed to upload up to 30
task. You're allowed to upload up to 30 minutes for a task so that Codex has a really good understanding of how to do it because they have a really good computer use. Okay, so it is now done.
computer use. Okay, so it is now done.
And if you see here, we can actually type {slash} we can type manual tweet draft. There we go. Hey, can you please
draft. There we go. Hey, can you please upload the latest video to Typefully as a draft? It's in my downloads, the
a draft? It's in my downloads, the latest video there. And here we go. It's
off to the races and I believe we can just open up Comet. Let me go ahead and close this out. There you go. We can see Look at this. Computer use is working.
That's its little mouse.
It clicked new draft. Now it should upload a video or at least type out a draft for it. There we go.
Upload an image.
Now it's going to find the last video.
There it is. Wow, this is crazy.
Wow, let's go. Ah, I need to upgrade.
Oh, no. That's super weird. I think I just rightfully rejected it because I says I need to upgrade. Okay, so that video was just too big. Uh you can't
upload anything above uh 512 megabytes, but you get the point. I recorded my screen and I taught Codex how to use Comet to upload something to a different software and then I immediately turn
into a skill. And in order to do that, all you need to do to get that started is just use the uh record and replay
feature and say, "I'm going to record something. Watch and make it a skill."
something. Watch and make it a skill."
And you can tell it what you want to name the skill, but it's literally that easy. And so, potentially the loudest
easy. And so, potentially the loudest news of the week came on Tuesday, June 16th, when SpaceX acquired Cursor. And
remember, the only thing that I care about is becoming Agent Native and talking about things that are actually practical and useful to understand. I
don't actually care about this acquisition except fact that I believe that Cursor is going to be closing the gap on both Codex and Claude Code. And
the main reason I think Cursor's going to get so much better is now they can afford to subsidize these plans if they're able to train a model that's as close to as good as GPT 5.5 and Claude
Opus. SpaceX is actually the fifth
Opus. SpaceX is actually the fifth largest company in the world, and so basically, this $60 billion acquisition means that Cursor gets access to
basically unlimited compute, basically unlimited money and capital through SpaceX, and they also, underrated fact is they get access to the Twitter distribution. I guarantee you Elon Musk
distribution. I guarantee you Elon Musk is going to be retweeting all of the Cursor content trying to grow Cursor as much as possible. And in return, SpaceX sees this as a huge advantage to get the
best AI agent coding platform in the world, arguably. They get all of
world, arguably. They get all of Cursor's developers who are very, very good at what they do, and they also get access to the training expertise because
Cursor did train Composer 2.5 and Composer 3's coming out soon. So, the
teams are merging and I expect Cursor to get significantly better. And I talked about this in my full-length video when I covered this entire story. I said,
"Notice here in the actual announcement by Cursor that they didn't say for developers. They just said useful AI."
developers. They just said useful AI."
And to me, this is an indication that Cursor will likely become a direct competitor to Codex and Claude Desktop because they already have a really good in-app browser. They already have
in-app browser. They already have Composer 2.5, which is a fast, good model. You already saw earlier in this
model. You already saw earlier in this video that you can use open-source models directly inside Cursor. This, I
believe, is going to turn into the best general one of the best general agent platforms. And so, the overall trend from this news right here is I really
hope we end up with a very tight three-way competition between Codex, Claude Desktop, and Cursor. The more
competition, the more benefits they're going to have to give to users, and the better the tools are going to be for everyone because they're going to be fighting for all of the market share in the world of AI super apps. And I
couldn't be more excited for Cursor to get better. Okay, so to close out this
get better. Okay, so to close out this episode, I do want to talk about some updates with Claude. And I think all of us have kind of this weird taste in our mouths surrounding Claude, and I think
we're kind of all in this Mythos or Fable depression. And so, this is just
Fable depression. And so, this is just one of the tweets that I screenshotted, but I've seen hundreds of tweets like this. Something around the lines of, "I
this. Something around the lines of, "I don't know if it's placebo, but using Fable for those days, it felt like it just never gave up on problems and kept trying crazy ways to get whatever you
wanted done. Now back on Opus, and it's
wanted done. Now back on Opus, and it's just kind of lazy. It thinks things are too daunting and keeps asking if you are sure. There was this sense when you used
sure. There was this sense when you used Fable that you could basically do anything. And one of the best benchmarks
anything. And one of the best benchmarks for AI models is how ambitious can you actually be? And one thing with Fable, I
actually be? And one thing with Fable, I felt that I literally wasn't smart enough to even come up with an idea for a thing that Mythos or Fable wasn't
truly capable of. And so for the past 4 months when I was in Silicon Valley, right, I was talking to everyone and everyone was talking about how good GPT
5.5 was. Now, they got access to Fable
5.5 was. Now, they got access to Fable for like 4 days and now they can't even go back to GPT 5.5 or Opus 4.8. They're
literally in this Fable Mythos depression where they just are waiting for this model to come back because they know that once it comes back, they're going to be able to get done whatever it is they're trying to get done in like a
fraction of a time. That's how good Fable was. And so right now, it is 3:26
Fable was. And so right now, it is 3:26 Eastern time on June 19th and Fable's still not back in any of the Claude products. It is still illegal to use.
products. It is still illegal to use.
And so right now, Anthropic is working with the government trying to figure out how they can get this model back into our hands and we just have no clue when it's going to come back. But beyond
being in this Mythos depression, there are two updates I do want to talk about.
One of them touches on a theme that I've been talking about a lot, which is agent native apps. But the first thing I want
native apps. But the first thing I want to talk about is Claude's new update to their design mode. New in Claude design, it stays on brand with your design system across projects, lets you edit
directly on the canvas, syncs with Claude code, and connects to more of the tools that you already use. So for those of you who don't know, if you go to Claude
.ai and this only works on the web, not on desktop, they have this feature right here called design. So the first thing that they announced is it says it stays on brand with your design system across
projects. I haven't used this long
projects. I haven't used this long enough to test that. But what I can test is that it lets you edit directly on the canvas. So, I notice here there's this
canvas. So, I notice here there's this edit feature. I think I can click Can I
edit feature. I think I can click Can I edit this directly? The open-source
rival is here. Wow. A cheap Chinese model that passes the vibe check. A
record GLM 5.2 is a very good model. Okay, this is really cool. You can just edit things
really cool. You can just edit things directly on the canvas. This is really fun, actually. And of course, you can
fun, actually. And of course, you can also do markups. So, I can say like, "Don't have any of these here. I don't
like these." That's really cool. And the
next thing Claude added is they made it really easy to share these and send them to other tools. So, I can send them to Lovable, Base 44, Gamma, Miro, and
Replit. So, I could in theory send it to
Replit. So, I could in theory send it to Lovable, and I could connect it, and I could basically, if I designed a landing page or a website, I could theoretically
deploy it on Lovable, or I could actually just deploy it straight to Vercel. I should have that already set
Vercel. I should have that already set up, and we can send it to Vercel. And
yeah, I already have this set up. So,
now it's going to be able to deploy this to Vercel. So, it can actually be on the
to Vercel. So, it can actually be on the internet. And finally, something brand
internet. And finally, something brand new to Claude Code is artifacts. You
know that the Claude web app and Claude desktop app already have artifacts when you use the normal Claude mode. But, now
Claude Code can create artifacts, and it can send little interactive pages. So,
here it's saying, "Research where users are dropping off since the previous release." And we can see here in this
release." And we can see here in this video, it's just going to go off and create a little mini app, or an agent native app that you can share with other people. And here it created this little
people. And here it created this little artifact. It has its own link. And now
artifact. It has its own link. And now
it says propose a solution. And so, it shows the current and the proposal in this little mini app. So, you can get the agent or Claude code to create this
little artifact or I call them mini apps. And you can view them and you can
apps. And you can view them and you can be like, "Okay, that's a good idea. But
here's what they're proposing. Okay,
yes, we can do it." And if you want to share it with a team, you can just easily press copy link and then you can just send it to whoever you want on any platform. And the example that they
platform. And the example that they showed was a phone. So, links made for sharing. Here's the message and you can
sharing. Here's the message and you can if you send it to someone on your team, they can very easily open it and look it over. And so, you can very easily share
over. And so, you can very easily share these little mini apps or artifacts, whatever you want to call them. All
right, so those are the biggest updates for the week. With Claude, we have design mode and mini apps. With Open AI, we have the record and replay to create skills. Screen record to skills, really
skills. Screen record to skills, really cool workflow. We have the best open
cool workflow. We have the best open source model ever created, which is GLM 5.2. And then we have these the
5.2. And then we have these the acquisition by SpaceX of Cursor. And the
main point of this is that Cursor is very likely to get better and it will likely become a better deal for their $20 per month plan and $200 per month plan. And we love competition between
plan. And we love competition between Cursor, Claude, and Codex. It's very fun and the open source models. We have And And that's kind of the fourth bucket is
all of the open source models together.
And we just have so much competition from all angles, multiple countries.
This is amazing. I'm very excited for next week. Next week, these are some
next week. Next week, these are some things that I'm expecting based on the rumors that I've been seeing around Twitter and other areas. I think we're going to see a return of Fable from
Anthropic or at least I'm really hoping.
There's been some rumors circulating on Twitter about a new model by Open AI.
We could see some more open source models being released. We're hearing
some of the other companies, specifically from China, talking about how they're going to be releasing more open source models. Gemini might be releasing a model, and then finally, this one I'm really excited about,
Gemini may be making an announcement regarding their super app. And so, I've been somewhat harsh on Google when it comes to them not deciding what their super app is. They have way too many
products. Instead, I want them to pick
products. Instead, I want them to pick one, and here we have Logan saying, "Feels like we are entering the super app era." And I've been saying we've
app era." And I've been saying we've been in the super app era for 100 days, Logan. Choose your challenger. I'm
Logan. Choose your challenger. I'm
really excited for Google to just pick one, whether it's anti-gravity, whether it's Google AI Studio, whether it's Jewels, whether it's their Gemini desktop app. We don't know what their
desktop app. We don't know what their super app is, so it's really hard for them to compete because it's impossible for me as a content creator to tell them which tool to use. I don't know which Google tool to use. Their models are
pretty bad. I really hope they catch up
pretty bad. I really hope they catch up because Google's one of the other companies that can in theory be a competitor. They just don't feel like it
competitor. They just don't feel like it right now. So, I really hope Google
right now. So, I really hope Google comes back. Anyway, thank you guys so
comes back. Anyway, thank you guys so much for watching this video. This has
been a really exciting week. Next week,
I will finally be in my studio filming from New York City. Couldn't be more excited. Anyway, I'll see you guys here
excited. Anyway, I'll see you guys here for the next one.
Loading video analysis...