
Is Mythos too Dangerous?

By The PrimeTime

Summary

Topics Covered

  • Mythos: AI's Dangerous New Frontier

Full Transcript

Here we are. Claude did it again.

Dropped a new version of itself. Okay.

But this one, it has a very special name. Okay. It's much better. We're not on the old Sonnet or Opus or Haiku.

No, we've been upgraded to Mythos. The greatest model to ever be dropped. In fact, it's so great, it's so fantastic, that you, yeah, you sitting there, yeah, you right now, you can't have that.

Okay. Hey, you're not allowed to touch that. Apparently, this model is finding bugs and able to crack out of sandboxes like nobody's business. We are talking about being able to take down computers just simply by connecting to them. It's the Chuck Norris, God rest his soul, of all of the models, okay?

It's just able to destroy everything, apparently. Okay, you got to hide your kids, hide your Raspberry Pis, cuz they're taking everybody out here.

So, let's talk about this new model for a second. They kind of released a bunch of stats for it, and then they released the part that would be considered the scary part. The part that you always see Anthropic do, right? Because this is pretty typical of Anthropic: they have a new model, and then what do they do with it? They're like, "Dude, by the way, AI super scary. The most scary ever. So scary. US government. Hey, government, so scary. You better put some regulation in place and help us control it, because man, it's scary."

So, first let's just go with the least interesting of the items, which, honestly, I don't care about any of these numbers, cuz honestly they really mean nothing to me.

But here we go. On SWE-bench Pro, Mythos Preview, the new model, scores 77.8% versus Opus 4.6 at 53.4%. So, as you can see, it's dramatically better.

More than 20 percentage points better. Now, what does that actually mean for you or me? Well,

it doesn't really mean anything because you're not going to touch this model.

You know, you're not allowed to.
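
For anyone who wants to check the arithmetic on the SWE-bench Pro scores quoted above, here is a minimal Python sketch. The function name `compare_scores` is made up for illustration; the only inputs taken from the video are the two scores, 77.8% and 53.4%. It separates the gap in percentage points from the relative improvement, two figures that are easy to blur together when calling one model "20% better" than another.

```python
def compare_scores(new: float, old: float) -> tuple[float, float]:
    """Return (gap in percentage points, relative improvement in percent)."""
    point_gap = new - old                    # absolute difference between the scores
    relative_gain = (new - old) / old * 100  # improvement relative to the old score
    return point_gap, relative_gain


# Scores as quoted in the video (hypothetical variable names).
mythos_preview = 77.8
opus = 53.4

gap, rel = compare_scores(mythos_preview, opus)
print(f"{gap:.1f} percentage points, {rel:.1f}% relative improvement")
```

On these two numbers the gap works out to about 24.4 percentage points, or roughly a 45.7% relative improvement over the older score.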

Nobody's allowed to. Only a few people at Amazon, Google, and Apple, and a couple other top companies and the US government are allowed to touch this model. And you can see on the rest of the benchmarks it just seems to perform, you know, super much better than Opus 4.6. On the reasoning side, GPQA Diamond, Mythos Preview dominates Opus 4.6. Humanity's Last Exam, Mythos Preview without tools still gets an F, but I mean, we're getting near D territory. And you know what? D's earn degrees at some of the places. And Mythos with tools actually does get a D.

Okay, it is passing some colleges. This is some serious PhD-level intelligence going on here.

The actual interesting part about the model is security research. I've already just released a video about this: how Daniel Stenberg, the lead maintainer of curl, has said, "Hey, AI reporting, it's gotten a lot better. It's actually starting to show real issues." For a long time, AI inside the security field has been a security issue itself, because it just inundates any maintainer with so many fake reports that it's actually impossible for maintainers to really be able to operate on their own repository.

But then a shift, a big shift, happened with 4.6. We're actually starting to see AI being, oh wow, no, this is actually serious now.

Now it can seriously find things. But this new one, Mythos, apparently is real good.

During our testing, we found that Mythos Preview is capable of identifying and then exploiting zero-day vulnerabilities in every major operating system and every major web browser when directed by a user to do so. The vulnerabilities it finds are often subtle and difficult to detect. Many of them are 10 or 20 years old, with the oldest we have found so far being a now-patched 27-year-old bug in OpenBSD, an operating system known primarily for its security.

Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR bypasses. It autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.

It even found a 16-year-old vulnerability in FFmpeg, the hand-artisanally-crafted library.

So if this is all to be believed, and this is actually what is happening, and we are literally entering into the most impressive era for AI ever, to the point where releasing the model publicly would result in every system that has ever existed being hacked, well, we got ourselves a bit of a problem now, don't we? And that is why Anthropic has said the following.

We do not plan to make Claude Mythos Preview generally available. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

So that 20-plus-point improvement on SWE-bench, baby, you're never going to taste that. Okay? You're never going to get your sweet hands on that one. But you might get a smarter Claude.

Does that mean we're entering into the nation of geniuses on a GPU that's stored in a warehouse which Anthropic owns, and you are now able to create everything you've ever wanted just with a simple, quick text description? Well, it doesn't necessarily sound like it. It sounds like some people might have it, but I don't think you're going to have it anytime soon, and I'm probably not going to have it anytime soon either. See, the thing is, they're going to release it to a few select tech cartel leaders, and who knows when it's actually going to happen.

So, is it as big of a deal as we are seeing, or is it not? Obviously, we can see the receipts with FFmpeg saying, "Hey, thanks for the patch." But some aren't buying it. You got Boris saying, "Hey, it's very powerful and should feel terrifying." Kind of continuing to push the same narrative, but just never forget the exact same narrative was pushed with ChatGPT 2: it is really dangerous, you got to be super careful, it's honestly too dangerous to release.

Well, the best we can hope for is that ChatGPT also happens to have ChatGPT 6 or something, or ChatGPT Cosmos, coming out, and that will force Anthropic to have to catch up and release their super powerful model. Which is also just a weird place to be in, that we're... what did I just say there? Me rooting for OpenAI? Oh my gosh, something got into my head there for a second.

But I think Lowle said it best: they called it Mythos because no one's ever going to see it. They're literally trying to rage bait us right now. I'm feeling it. I'm feeling the baiting.

You know, it's hard not to look at all this and realize that there's some part of my skills every year becoming more and more irrelevant. You know, the ability to hammer out all those Vim shortcuts. Kind of a dying skill, right? It's a little sad. I mean, I personally think it's pretty dang sad, but it's an ending skill. It's a skill that I don't think the younger kids, them young fellas, are going to really learn, because they don't really have to learn it. And it's becoming more and more apparent that people would rather just hammer on a model than actually learn any of these tasks or these really fine, difficult things anyways.

And so here we are. So the things that, you know, I have defined myself with over the last 20 years. See, while you guys went out smoking cigarettes, staying up too late, probably experimenting with mind-altering drugs, I, on the other hand, was sharpening my skills. And now those skills, maybe they're a little bit more useless. Every single year, a little bit more useless.

But honestly, I'm okay with it. I know that might be strange to say, but I am okay with it. I'm okay, if these things do turn out to be fantastic, that I don't have to identify myself as the greatest Neovim user of all time. It's cool. I can still use Neovim and I can still enjoy it, but it doesn't have to be my identity.

And also, I'm just happy I've done all those years of trying to understand how to make good software, because now, even if I do AI-generate something, I can go, "Oh yeah, here's why it's wrong." I can just understand things at a level which people who've never even touched software have no idea about. So hey, am I happy about that still? Sure. And maybe, you know what, one day those skills even could become invalidated. And if they are, I guess I have to be okay with that.

That's it. I just kind of wanted to yap about this because, you know, it's been an interesting time, and I genuinely really appreciate that I still have the chance just to yap to you guys, you know, to kind of talk about these things, cuz I know a lot of people feel really unsure about everything. They feel kind of worried about everything. Especially with all of the crazy talk from the hype beasts being like, "Oh, it's the end of the universe." Even this report right here by Anthropic being like: it knows how to take advantage of every single browser, every single operating system. It's finding bugs 27 years old. You're absolutely going to get destroyed if we let this thing out. It's just constant fear instilling, you know, just attacks on you at all times.

And you know, I see these things. I'm like, "Okay, hey, I'm glad, if it really is that, that Anthropic is making quote-unquote steps towards Amazon and Google and all this nonsense to be able to patch all these problems." But at the same time, I don't want to have to live under this intense pressure and this intense, constant barrage of just negativity. I can look at it as like, wow, I now have the ability to accomplish things that before would have taken me a lot longer. They would have been a lot harder. I would have been less likely to even start them, just because I can only have so many side projects.

Now I get the benefit to be able to abandon several side projects. Like, I have been able to abandon more projects than I've ever done in my lifetime thanks to the power of AI. And honestly, that feels pretty amazing.

Hey, the name is ThePrimeagen.

Hey, is that HTTP? Get that out of here. That's not how we order coffee. We order coffee via ssh terminal.shop. Yeah, you want a real experience. You want real coffee. You want awesome subscriptions so you never have to remember again. Oh, you want exclusive blends with exclusive coffee and exclusive content? Then check out CRON. You don't know what SSH is? Well, maybe the coffee is not for you.

Living the dream.
