LongCut logo

AI Agents Just Changed Forever: GLM 5.2, Codex Skills, Claude & Cursor

By Riley Brown

Summary

Topics Covered

  • Open-Source Just Passed the Vibe Check
  • Record Once, Skill Forever: Screen Capture as AI Training
  • SpaceX-Cursor: Unlimited Compute Changes the Super App War
  • Fable Depression: Ambition Is the Real AI Benchmark

Full Transcript

What an insane week in the world of AI agents. If you want to know the latest

agents. If you want to know the latest updates on Claude Fable 5, the latest Codex feature that lets you record your screen and turn it into skills, the best open-source model in the entire world,

and if you want to know about the SpaceX cursor acquisition and more, you're in the right place. You're watching Agent Native. I cover the latest updates and

Native. I cover the latest updates and news from Frontier agent platforms and models so that we can learn about and use AI agents effectively. My name is Riley Brown, and if you want to become

Agent Native, hit that like button, hit that subscribe button, and let's dive in. Today, we're going to get started

in. Today, we're going to get started with the most important news, in my opinion, in the world of AI agents. The

company Z.ai released an open-source model that I believe is like five or six times cheaper than GPT 5.5, and some are

saying it's actually comparable and almost as good as Opus 4.8 and GPT 5.5.

So, GLM 5.2 is a model released by Z.ai, and this company is from China, and this model is open-sourced, and it's much

cheaper than Frontier models. And by the way, I'm going to show you exactly how you can get this set up directly inside Cursor in just 1 second. But, I first want to talk about the benchmark. And

so, here are some of the benchmarks, and this is what it looks like across the board. You'll see that GLM 5.2 is

board. You'll see that GLM 5.2 is comparable to Opus and GPT 5.5.

Currently, I think the best model, besides Fable, is GPT 5.5 with Opus trailing just a little bit, but this model actually held its own when I

actually tested it. Because normally,

when a new open model comes out, usually, there are benchmarks that are released. They don't actually tell the

released. They don't actually tell the whole story, or even in even close to an accurate story. But, because they put

accurate story. But, because they put these cool graphs on Twitter, there's a ton of hype, people make a lot of videos saying that this model's actually really, really good. And usually, when I

go to test that model, I just end up incredibly disappointed. I actually test

incredibly disappointed. I actually test the model and it does not pass the vibe check. And as I tweeted earlier today,

check. And as I tweeted earlier today, this was not one of those times. This

model after spending a ton of time actually using this model, I do believe that it passes the vibe check. I think

that it's getting close to the frontier labs, specifically Opus 4.8 and GPT 5.5 and I think this will actually cause the frontier labs, OpenAI and Anthropic to

release even smarter models. I think a lot of people realize that the models that they rely on every day can be taken away. However, with these open models,

away. However, with these open models, you can actually download the weights. I

think a lot of people are taking this time to test these open source models.

And so the best place to try out this new model, GLM 5.2, in my opinion, is directly inside Cursor and I'll show you exactly how to set that up in just a second. I use the Convex plugin and it

second. I use the Convex plugin and it one shot a Trello app with basically all of the different features that Trello has with a database and authentication and it works nearly perfectly or it

actually works perfectly. I also had GLM 5.2 go off to research about me, then create a landing page, and then run it locally. I also connected GLM 5.2 to my

locally. I also connected GLM 5.2 to my Notion, to my Slack and to a ton of other integrations and I was having it just do general agent tasks for me and it was doing a great job, just as good

as if I was using 4.8. And so yes, I think the model was really good and you're going to see a lot of people on the internet saying the model is really, really good. But you shouldn't take our

really good. But you shouldn't take our word for it, you should actually go in and try it. So I'm going to show you the easiest way to try this model directly inside Cursor. So directly inside

inside Cursor. So directly inside Cursor, what I want you to do is follow these exact steps. It should only take you 3 to 5 minutes to get this model directly inside Cursor. In order to add the model to Cursor, we're going to be

using another tool called OpenRouter.

Normally, if you want to use a bunch of different AI models, you need a ton of API keys in order to access them.

OpenRouter allows us to only use one key, so we can get access to GPT 5.5, Claude Opus, DeepSeek V4, and in this case, the most important one, GLM 5.2, and then thousands of other models. This

video is not sponsored by OpenRouter. I

just want to explain why I normally use this. And so, OpenRouter allows us to

this. And so, OpenRouter allows us to add any model to Cursor. I'll show you exactly how to do it. In Cursor, you're going to go down to your plan here, and you're going to click settings. Then,

what you're going to do is you're going to come up here and you're going to select models. You're going to come down

select models. You're going to come down to API keys, and what you're going to do is you are going to turn this on right here. So, normally this is off, and you

here. So, normally this is off, and you are going to put in your own API key.

And then you're going to say override the OpenAI base URL. You're basically

converting this OpenAI key into an OpenRouter API key. And in order to switch this from OpenAI to OpenRouter, you're going to paste this exact thing.

I'll put the link in the description.

You're just going to paste this exact thing in here, and then what you're going to do is you're going to come up to view all models, and you're going to come down here and you're going to click add custom model. Now, you can add any

model from OpenRouter here. And so,

we're going to go to OpenRouter, and we're going to click models, and we're going to look for z.ai/glm-5.2.

And you're going to see this little copy button right here. You're going to click copy, and now what you're going to do is you're going to paste this model right here, and you're going to click add. And

since I've already added it, it just said it's already available, but for you, it should show up somewhere in here, and it should look exactly like this: z-ai/glm-5.2.

Congratulations, you now have access to the best open-source model directly inside Cursor. Now, let's go test it

inside Cursor. Now, let's go test it out. So, if you go to a new agent

out. So, if you go to a new agent session inside Cursor, and Cursor looks very similar to Codex, you can select any model. Here, I'm selecting

any model. Here, I'm selecting Z-AI/GLM2.

I can say "Hi, what model are you?" And

there you go. I'm GLM 5.2 by Z-AI, and you are now ready to test the best model. I want you to comment below what

model. I want you to comment below what you did with it and how good it was at it. I want to know what you think of

it. I want to know what you think of this model. I'm genuinely curious.

this model. I'm genuinely curious.

Please let me know. And one of the reasons why I think you should get into using these open-source models and testing them out is because the founder

of Z.AI, who created GLM 5.2, said that they're going to get a Fable-level open-source model, like a model that's as good as Fable, that's open-source

within this year. So, someone said, "What's the current timeline for China to reach the Fable class or get as good as Fable 5?" And Elon Musk commented, he

said, "Probably Q1." And then the founder said, "Won't take that long."

So, that means he thinks it'll be done by the end of this year. And so, that means that in like 5 months, we could get a model that is open-source that is better than Fable, and it will likely be significantly cheaper. I don't know

significantly cheaper. I don't know about you guys, but I think the best place to use AI agents with my team, especially for marketing, is directly inside Slack. And the easiest way to

inside Slack. And the easiest way to create cloud-based agents that runs directly in Slack is with Hyperagent, where the agent can actually become part of your team. All you need to do is go to Hyperagent, create an agent with your

favorite skill. This agent can watch all

favorite skill. This agent can watch all of your channels, run on a schedule, use integrations, and send updates directly into Slack when something needs your attention. For example, the first one

attention. For example, the first one I'm building is basically a YouTube researcher. It scans my competitors

researcher. It scans my competitors using my YouTube researcher skill, and it keeps track of what videos are actually performing well. And it does so automatically without me asking. Then it

suggests videos for me to make based on the keywords and topics that are working in my niche. And whenever I upload a draft, it can generate 20 different thumbnail options for the video, and my

team can quickly figure out which direction is the strongest. The coolest

part is is that I don't need to remember to open another AI tool and ask it to do this every time. Because the agent lives in Slack, my team and I can talk to it where we already are working. It can

send us new ideas, run these workflows on a schedule, and keep improving as we add more skills and integrations. And

this is just one agent. You can build an entire team of agents for your own workflows. HyperAgent is giving away

workflows. HyperAgent is giving away $1,000 in credits to the first 1,000 people to sign up. Click the link below to sign up. Claim yours now. So now I want to move to the biggest super app

update of the week, and it involves Codex. It feels kind of like a slow week

Codex. It feels kind of like a slow week from Codex. They didn't really announce

from Codex. They didn't really announce anything that big, but they announced one feature that I believe is incredibly underrated, and it involves recording your screen. Let me just show you how it

your screen. Let me just show you how it works. Directly inside Codex now, you

works. Directly inside Codex now, you can use a plugin called record and replay. I'm going to show you the

replay. I'm going to show you the process for adding a Typefully draft.

Please make a skill called manual tweet draft. So now, you could just tell Codex

draft. So now, you could just tell Codex by using this record and replay skill that you want to show them how to do something. So here, it's going to say,

something. So here, it's going to say, "I'll use record and replay workflow to capture the Typefully steps." Now, watch this.

Look at that. It automatically turned on the recording, and it says, "Recording is now on. Show me the Typefully draft process." So now, I'm going to go like

process." So now, I'm going to go like this. I'm just going to type, let's say,

this. I'm just going to type, let's say, "Comment." And now I am going to go

"Comment." And now I am going to go create a new tab. We'll go to typefully.com, and I'm going to switch to Riley Brown.

Hello, this is a draft by Riley Brown. I

can add images and videos. Now, I can upload an image. And now we can do PNG and here we go. And that is the basic process. Once we're done, I'm just going

process. Once we're done, I'm just going to hit stop. It automatically goes back to Codex and it automatically enters I'm done recording. And look at this. I'll

done recording. And look at this. I'll

stop the capture now then inspect the recording event. And now it's creating

recording event. And now it's creating this skill called manual tweet draft. So

then we should be able to just type {slash} manual tweet draft and it will show up here. It doesn't quite yet. Here

it summarizes exactly what I did and this was a very short task. You could do it up to 30 minutes. That was a 1-minute task. You're allowed to upload up to 30

task. You're allowed to upload up to 30 minutes for a task so that Codex has a really good understanding of how to do it because they have a really good computer use. Okay, so it is now done.

computer use. Okay, so it is now done.

And if you see here, we can actually type {slash} we can type manual tweet draft. There we go. Hey, can you please

draft. There we go. Hey, can you please upload the latest video to Typefully as a draft? It's in my downloads, the

a draft? It's in my downloads, the latest video there. And here we go. It's

off to the races and I believe we can just open up Comet. Let me go ahead and close this out. There you go. We can see Look at this. Computer use is working.

That's its little mouse.

It clicked new draft. Now it should upload a video or at least type out a draft for it. There we go.

Upload an image.

Now it's going to find the last video.

There it is. Wow, this is crazy.

Wow, let's go. Ah, I need to upgrade.

Oh, no. That's super weird. I think I just rightfully rejected it because I says I need to upgrade. Okay, so that video was just too big. Uh you can't

upload anything above uh 512 megabytes, but you get the point. I recorded my screen and I taught Codex how to use Comet to upload something to a different software and then I immediately turn

into a skill. And in order to do that, all you need to do to get that started is just use the uh record and replay

feature and say, "I'm going to record something. Watch and make it a skill."

something. Watch and make it a skill."

And you can tell it what you want to name the skill, but it's literally that easy. And so, potentially the loudest

easy. And so, potentially the loudest news of the week came on Tuesday, June 16th, when SpaceX acquired Cursor. And

remember, the only thing that I care about is becoming Agent Native and talking about things that are actually practical and useful to understand. I

don't actually care about this acquisition except fact that I believe that Cursor is going to be closing the gap on both Codex and Claude Code. And

the main reason I think Cursor's going to get so much better is now they can afford to subsidize these plans if they're able to train a model that's as close to as good as GPT 5.5 and Claude

Opus. SpaceX is actually the fifth

Opus. SpaceX is actually the fifth largest company in the world, and so basically, this $60 billion acquisition means that Cursor gets access to

basically unlimited compute, basically unlimited money and capital through SpaceX, and they also, underrated fact is they get access to the Twitter distribution. I guarantee you Elon Musk

distribution. I guarantee you Elon Musk is going to be retweeting all of the Cursor content trying to grow Cursor as much as possible. And in return, SpaceX sees this as a huge advantage to get the

best AI agent coding platform in the world, arguably. They get all of

world, arguably. They get all of Cursor's developers who are very, very good at what they do, and they also get access to the training expertise because

Cursor did train Composer 2.5 and Composer 3's coming out soon. So, the

teams are merging and I expect Cursor to get significantly better. And I talked about this in my full-length video when I covered this entire story. I said,

"Notice here in the actual announcement by Cursor that they didn't say for developers. They just said useful AI."

developers. They just said useful AI."

And to me, this is an indication that Cursor will likely become a direct competitor to Codex and Claude Desktop because they already have a really good in-app browser. They already have

in-app browser. They already have Composer 2.5, which is a fast, good model. You already saw earlier in this

model. You already saw earlier in this video that you can use open-source models directly inside Cursor. This, I

believe, is going to turn into the best general one of the best general agent platforms. And so, the overall trend from this news right here is I really

hope we end up with a very tight three-way competition between Codex, Claude Desktop, and Cursor. The more

competition, the more benefits they're going to have to give to users, and the better the tools are going to be for everyone because they're going to be fighting for all of the market share in the world of AI super apps. And I

couldn't be more excited for Cursor to get better. Okay, so to close out this

get better. Okay, so to close out this episode, I do want to talk about some updates with Claude. And I think all of us have kind of this weird taste in our mouths surrounding Claude, and I think

we're kind of all in this Mythos or Fable depression. And so, this is just

Fable depression. And so, this is just one of the tweets that I screenshotted, but I've seen hundreds of tweets like this. Something around the lines of, "I

this. Something around the lines of, "I don't know if it's placebo, but using Fable for those days, it felt like it just never gave up on problems and kept trying crazy ways to get whatever you

wanted done. Now back on Opus, and it's

wanted done. Now back on Opus, and it's just kind of lazy. It thinks things are too daunting and keeps asking if you are sure. There was this sense when you used

sure. There was this sense when you used Fable that you could basically do anything. And one of the best benchmarks

anything. And one of the best benchmarks for AI models is how ambitious can you actually be? And one thing with Fable, I

actually be? And one thing with Fable, I felt that I literally wasn't smart enough to even come up with an idea for a thing that Mythos or Fable wasn't

truly capable of. And so for the past 4 months when I was in Silicon Valley, right, I was talking to everyone and everyone was talking about how good GPT

5.5 was. Now, they got access to Fable

5.5 was. Now, they got access to Fable for like 4 days and now they can't even go back to GPT 5.5 or Opus 4.8. They're

literally in this Fable Mythos depression where they just are waiting for this model to come back because they know that once it comes back, they're going to be able to get done whatever it is they're trying to get done in like a

fraction of a time. That's how good Fable was. And so right now, it is 3:26

Fable was. And so right now, it is 3:26 Eastern time on June 19th and Fable's still not back in any of the Claude products. It is still illegal to use.

products. It is still illegal to use.

And so right now, Anthropic is working with the government trying to figure out how they can get this model back into our hands and we just have no clue when it's going to come back. But beyond

being in this Mythos depression, there are two updates I do want to talk about.

One of them touches on a theme that I've been talking about a lot, which is agent native apps. But the first thing I want

native apps. But the first thing I want to talk about is Claude's new update to their design mode. New in Claude design, it stays on brand with your design system across projects, lets you edit

directly on the canvas, syncs with Claude code, and connects to more of the tools that you already use. So for those of you who don't know, if you go to Claude

.ai and this only works on the web, not on desktop, they have this feature right here called design. So the first thing that they announced is it says it stays on brand with your design system across

projects. I haven't used this long

projects. I haven't used this long enough to test that. But what I can test is that it lets you edit directly on the canvas. So, I notice here there's this

canvas. So, I notice here there's this edit feature. I think I can click Can I

edit feature. I think I can click Can I edit this directly? The open-source

rival is here. Wow. A cheap Chinese model that passes the vibe check. A

record GLM 5.2 is a very good model. Okay, this is really cool. You can just edit things

really cool. You can just edit things directly on the canvas. This is really fun, actually. And of course, you can

fun, actually. And of course, you can also do markups. So, I can say like, "Don't have any of these here. I don't

like these." That's really cool. And the

next thing Claude added is they made it really easy to share these and send them to other tools. So, I can send them to Lovable, Base 44, Gamma, Miro, and

Replit. So, I could in theory send it to

Replit. So, I could in theory send it to Lovable, and I could connect it, and I could basically, if I designed a landing page or a website, I could theoretically

deploy it on Lovable, or I could actually just deploy it straight to Vercel. I should have that already set

Vercel. I should have that already set up, and we can send it to Vercel. And

yeah, I already have this set up. So,

now it's going to be able to deploy this to Vercel. So, it can actually be on the

to Vercel. So, it can actually be on the internet. And finally, something brand

internet. And finally, something brand new to Claude Code is artifacts. You

know that the Claude web app and Claude desktop app already have artifacts when you use the normal Claude mode. But, now

Claude Code can create artifacts, and it can send little interactive pages. So,

here it's saying, "Research where users are dropping off since the previous release." And we can see here in this

release." And we can see here in this video, it's just going to go off and create a little mini app, or an agent native app that you can share with other people. And here it created this little

people. And here it created this little artifact. It has its own link. And now

artifact. It has its own link. And now

it says propose a solution. And so, it shows the current and the proposal in this little mini app. So, you can get the agent or Claude code to create this

little artifact or I call them mini apps. And you can view them and you can

apps. And you can view them and you can be like, "Okay, that's a good idea. But

here's what they're proposing. Okay,

yes, we can do it." And if you want to share it with a team, you can just easily press copy link and then you can just send it to whoever you want on any platform. And the example that they

platform. And the example that they showed was a phone. So, links made for sharing. Here's the message and you can

sharing. Here's the message and you can if you send it to someone on your team, they can very easily open it and look it over. And so, you can very easily share

over. And so, you can very easily share these little mini apps or artifacts, whatever you want to call them. All

right, so those are the biggest updates for the week. With Claude, we have design mode and mini apps. With Open AI, we have the record and replay to create skills. Screen record to skills, really

skills. Screen record to skills, really cool workflow. We have the best open

cool workflow. We have the best open source model ever created, which is GLM 5.2. And then we have these the

5.2. And then we have these the acquisition by SpaceX of Cursor. And the

main point of this is that Cursor is very likely to get better and it will likely become a better deal for their $20 per month plan and $200 per month plan. And we love competition between

plan. And we love competition between Cursor, Claude, and Codex. It's very fun and the open source models. We have And And that's kind of the fourth bucket is

all of the open source models together.

And we just have so much competition from all angles, multiple countries.

This is amazing. I'm very excited for next week. Next week, these are some

next week. Next week, these are some things that I'm expecting based on the rumors that I've been seeing around Twitter and other areas. I think we're going to see a return of Fable from

Anthropic or at least I'm really hoping.

There's been some rumors circulating on Twitter about a new model by Open AI.

We could see some more open source models being released. We're hearing

some of the other companies, specifically from China, talking about how they're going to be releasing more open source models. Gemini might be releasing a model, and then finally, this one I'm really excited about,

Gemini may be making an announcement regarding their super app. And so, I've been somewhat harsh on Google when it comes to them not deciding what their super app is. They have way too many

products. Instead, I want them to pick

products. Instead, I want them to pick one, and here we have Logan saying, "Feels like we are entering the super app era." And I've been saying we've

app era." And I've been saying we've been in the super app era for 100 days, Logan. Choose your challenger. I'm

Logan. Choose your challenger. I'm

really excited for Google to just pick one, whether it's anti-gravity, whether it's Google AI Studio, whether it's Jewels, whether it's their Gemini desktop app. We don't know what their

desktop app. We don't know what their super app is, so it's really hard for them to compete because it's impossible for me as a content creator to tell them which tool to use. I don't know which Google tool to use. Their models are

pretty bad. I really hope they catch up

pretty bad. I really hope they catch up because Google's one of the other companies that can in theory be a competitor. They just don't feel like it

competitor. They just don't feel like it right now. So, I really hope Google

right now. So, I really hope Google comes back. Anyway, thank you guys so

comes back. Anyway, thank you guys so much for watching this video. This has

been a really exciting week. Next week,

I will finally be in my studio filming from New York City. Couldn't be more excited. Anyway, I'll see you guys here

excited. Anyway, I'll see you guys here for the next one.

Loading...

Loading video analysis...