Fable 5 vs GPT 5.6 Sol: The Early Results
By AI Explained
Summary
Topics Covered
- Staggered AI releases concentrate corporate power
- Distillation attacks subsidize geopolitical competitors
- Sonnet 5 nearly eliminated prompt injection risk
- Larger models will always win
Full Transcript
Fable 5 is back, but will just shut down a chat or reroute to a weaker model even more than before. GPT 5.6 Sol, the
OpenAI equivalent of Fable, is out, but only for select customers and with incomplete results in its report card.
Claude Sonnet 5 just got thrown into the mix last minute by Anthropic. But the
model maker is now competing, I would say, to show how little Sonnet adds to frontier capabilities, lest, one thinks, the US government intervenes. So yeah,
it's been a weird few days in AI, but there are a handful of points of signal, I would say, that I want to try to highlight in this brief video. From the
hard, quantifiable comparisons we can make between GPT 5.6 Sol and Fable 5 or Mythos 5 (it is possible to unearth some direct ones), to the news of OpenAI offering a stake in their company to the US government, Sam Altman warning of concentrated corporate power
and much more. First though, the myths and confabulations about Fable and the timeline of how it came back into general availability. Yes, even to me, a
non-American. It turns out, according to Anthropic, that the vulnerability that Amazon flagged, the one that caused Fable to get blocked in the first place (see my recent two videos), was one that could also be flagged and identified by GPT 5.5 and Kimi K2.5, an open-weights model from China. Nevertheless, it would be pretty awkward for the US government to just admit that. So, Anthropic had to show some response and further shifted the line in what their safety scans would flag as blockable. More safety margin of course, but this quote "improved safety classifier" does mean that benign requests will be flagged much more often, including alas during
routine coding and debugging tasks. Now
I will say that my question about the benefits of beetroot, happily discussed with Opus 4.8, was among the first of the casualties, flagged as aiding and
abetting international terrorism. No,
I'm joking. It's not that. But it was deemed as too risky for Fable 5 and so I had to continue with Opus 4.8. More
seriously though, quite how frequently the safety classifier marks routine tasks like coding and debugging as blockable, and just how annoying that becomes, only the coming few weeks will tell. Oh, and what about the mythical universal jailbreak that the US government thought was possible, where you don't just extract one harmful response as with a narrow jailbreak, but, no, a universal one where you unlock the full potential, for good or ill, of the model? Well, on that, Anthropic say no one has yet, at the time of writing, been able to find a universal jailbreak, though of course the red teaming continues. All well and good, but you might say, well, the bigger news was the recent release of GPT 5.6 Sol in particular. That's the counter response from OpenAI to the Fable series. I will
say it does sound like OpenAI were tired of the lamer-sounding quantitative names like 5.5 or o3. So, they copied the Anthropic approach of evocative names: Sol, Terra, Luna. My only query there is it doesn't really leave them much room for naming expansion, as Mythos was an expansion to Opus. Almost the only way I can see them going bigger than the Sun, or Sol, would be to, say, name a model Betelgeuse. That's the star, not the person. Very few hard stats have been released about Sol as of today beyond the price and a few select benchmarks.
But the price is a tell though, because they are gunning for anyone looking to save a buck versus Claude, with even 5.6 Sol being half the API price of Fable 5. Exactly half the input and just over half the output. Some of you may be thinking, well, I use the Pro or Max plan for Claude. I don't pay the API price. But come July 7th it won't be included in your weekly plan. And so you may come to feel that pricing really does matter.
All of that rather obfuscates the main point though, the trillion dollar question. Is Sol roughly as performant as Fable? Because if so, one could imagine it precipitating a massive switchover between the two. Well, slight
problem. We can't test 5.6 directly because at the US government's request, OpenAI are starting with a limited preview of the model for a small group of trusted partners. They then say,
notice whose participation has been shared with the US government. So,
OpenAI are kind of hinting that they chose and then shared which of these partners it would be. But then there's this leaked memo in The Information which slightly changes the framing to my eyes. Altman actually told staff that the government would be approving the access given, customer by customer, during the preview period. Either way, Altman hopes that there will be a general
release in the coming couple of weeks.
So, call it next week or the week after from time of recording. The risk though, as I hinted at earlier and as was commented on by one Twitter user, is that such staggered releases will concentrate power. This was a direct risk flagged well before the recent kerfuffle by OpenAI themselves. One of
their goals as a company was to stop the undue concentration of power by corporations, for example. A staggered
release means large corporations get access to the best models much earlier.
Altman replied, "If it takes too long for general availability, then yes, that would happen. If we can get through the previews though in just a few weeks, then it should be probably okay." Now,
let me know if you agree, but I actually think there is a different angle that could come into play from a seemingly unrelated story. Basically, Anthropic
accused Alibaba, who oversee the development of a top Chinese model, Qwen, of using 29 million exchanges with Claude to get training data from its responses to train their own Chinese models, the Qwen series, against, of course, the terms of service of Anthropic. This would be the largest extraction campaign of its kind. The world, in immediate response, rallied in sympathy with Anthropic, who have always been champions of never using even a line of copyrighted material for any of their models. End sarcasm. But wait, how does this story link to the corporate concentration point I was just making?
Well, if this large-scale scraping to distill abilities into Chinese models becomes ever more sophisticated and successful, I think the incentives of the labs might switch. Better for them, perhaps, to serve their latest models to governments, approved businesses and of course themselves for, say, three to four months, safe from this kind of distillation, and then only when they have a better internal model release the older one to you, the unwashed masses.
And that theory is even before you get to geopolitics. Anthropic put it like this: "Distillation attacks turn hundreds of billions of dollars in American investment and research into a massive subsidy for our geopolitical competitors." That's my theory anyway.
And I want to now get to the test results for GPT 5.6 Sol. But just one more thing on this concept of model access being set to be much more gated.
That trend could explain why OpenAI put this idea out to the US government of giving them a 5% stake in the company.
This would be much like how Intel surrendered 10% to the Trump administration about a year ago. By the
way, these early conversations also involved giving the US government stakes in other US AI companies. I am kind of curious why you think OpenAI are suggesting this cuz I have a few
theories. Theory one would be that this
proposal preempts the US government demanding more. Theory two could be that it encourages the government to rapidly allow general release, because if the growing equity in these companies could be used to pay dividends to the public (apparently this is what OpenAI want, a bit like what happens with energy in Alaska), then the government would have that incentive to grow the market share of those companies and allow general release earlier. A darker theory three would be that OpenAI think that Anthropic would not go along with this proposal and that therefore OpenAI would get preferential treatment under the
arrangement. You could argue that this
angle is reinforced by these comments from a couple of days ago from Anthropic, that they hope for systematic rules and that when such rules are forged they are codified, i.e. not open to subjective interpretation, and quote "applied equally across frontier model developers". Let me know what you think of course, but that's enough theorizing for now, because back to that trillion dollar question: Sol versus Fable. TL;DR, from the scant details we have, it looks like Fable, which is the safeguarded version of Mythos, is slightly better than GPT 5.6 Sol overall. It's much better than 5.6 Sol at cyber, though that capability is of course blocked in Fable. But all of that said, Fable is double the price of Sol.
So on performance per dollar, will GPT 5.6 Sol on release be the best out there? Now wait, I know the headline that OpenAI wants me and wants you to focus on is Terminal Bench 2.1. It's
right at the top of their release. There
we go. GPT 5.6 Sol on ultra mode, a name I think kind of copied from Anthropic, gets almost 92% versus Mythos 5's 88%.
But this is Terminal Bench 2.1, specifically about interacting with models using the terminal on your computer, juggling the tools you give it access to, and that's kind of niche. And if you added in error bars, I would say that it might be a tie between Mythos 5 and Sol, maybe a slight edge for Sol Ultra versus Fable 5. But luckily, that isn't the only benchmark we can rely on, because you already know I've dug deep into the Mythos paper, hundreds of pages. I also obviously on release dug into the 77-page 5.6 preview system card.
I noticed the safety report doubled in size relative to the previous one given what's happening with the US government.
But what I did was back-solve comparisons between 5.6 Sol and Mythos or Fable.
How could we do that? Well, sometimes
both system cards or report cards would compare their models to another model like Opus or 5.5. So, you could use those as common points of comparison.
There weren't many of these points of comparison, but the easiest one to use was HealthBench Professional on page 300 of the Mythos report card. You can see here that Mythos 5 gets 66.0%.
Now, this is about the raw horsepower of Mythos 5, because Fable 5, of course, won't really answer any questions about health. On raw horsepower, 66%. For GPT 5.6 Sol, it gets 60.5% on that same benchmark. OpenAI does note that the benchmark kind of rewards longer answers, but even if you factor in a length-adjusted score, it gets 64%.
Still below Mythos 5. The obvious question is, will 5.6 Sol actually answer questions about health? Only time will tell. You get the idea though: in the same ballpark, but slightly worse for Sol. Then there's ExploitBench.
And this was a harder one to uncover, because you can see that Sol gets slightly worse than Mythos 5, only slightly: extrapolating, about maybe 76% versus 78% for Mythos. But then have a look at output tokens. You have Sol spending only about 120,000 to 130,000 versus Mythos preview spending 350,000. And remember, Sol's tokens are cheaper anyway. So performance per dollar, a runaway win for Sol. The devil though is slightly in the detail, because what even is ExploitBench? Well, as you'd expect, it's about finding exploits.
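To put a rough number on that performance-per-dollar point, here is a minimal sketch using only the approximate figures quoted above: scores of about 76% and 78%, roughly 125,000 versus 350,000 output tokens, and Sol's output price at about half of Fable's. Absolute prices are not given in the video, so the price here is a relative unit, and both the 0.5 ratio (rounded down from "just over half") and the idea that Mythos tokens cost the same as Fable's are assumptions.

```python
# Relative cost-efficiency on ExploitBench, from approximate figures quoted in the video.
# Prices are RELATIVE (Mythos/Fable output price = 1.0); absolute prices are not given.

def score_per_cost(score_pct, output_tokens, relative_price):
    """Benchmark points per unit of relative output cost (cost counted per 1,000 tokens)."""
    cost = (output_tokens / 1000) * relative_price
    return score_pct / cost

# Assumptions: Sol output price ~0.5x, Sol tokens ~125k (midpoint of 120k-130k),
# and Mythos priced like Fable.
sol = score_per_cost(score_pct=76, output_tokens=125_000, relative_price=0.5)
mythos = score_per_cost(score_pct=78, output_tokens=350_000, relative_price=1.0)

print(f"Sol:    {sol:.3f} points per relative cost unit")
print(f"Mythos: {mythos:.3f} points per relative cost unit")
print(f"Sol advantage: {sol / mythos:.1f}x")
```

Under those assumptions the gap comes out at roughly five times in Sol's favour, which is driven far more by the token count than by the two-point score difference.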
It's a cybersecurity benchmark. And yes, I did check. It does seem to be the same questions: 41 recent vulnerabilities in the V8 engine, the engine which powers Chrome. There we have it again in the 5.6 system card, 41 V8 vulnerabilities. So, it's the same test, same 16 capability flags, including control flow hijack and arbitrary code execution. And yes, it does line up with what OpenAI said, with Mythos 5 getting 78%. So, does that mean that Mythos 5 is slightly better, but costs way, way more? Not quite. Notice that in the Mythos system card, GPT 5.5 is listed as getting 34%. On the OpenAI chart, I'm colorblind, but I'm pretty sure GPT 5.5 gets around 48%. What's the discrepancy? Well, Anthropic did a three-trial approach, taking the average, as compared to the 5.6 system card where they used five trials. Nevertheless, though, you could say the trend is there, and Mythos gets you there more reliably with a higher peak but costs a fair bit more. Perhaps a more direct comparison would be in virology, with one particular multiple-choice benchmark being one where Mythos 5 got 56%, well above the expert baseline, and GPT 5.6 Sol got 55.5%, almost the same. I know it's early days, but I'm kind of extracting a trend here.
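The trend across the three back-solved benchmarks can be laid out in a few lines. This is only a sketch over the figures quoted in this video: the ExploitBench values are read off charts and so are approximate, the HealthBench entry uses Sol's raw score rather than the length-adjusted 64%, and averaging gaps across unrelated benchmarks is a crude summary, not a real metric.

```python
# Back-solved head-to-head scores quoted in the video (percent).
# ExploitBench values are chart estimates, so treat them as approximate.
comparisons = {
    "HealthBench Professional": {"mythos": 66.0, "sol": 60.5},
    "ExploitBench (approx.)": {"mythos": 78.0, "sol": 76.0},
    "Virology multiple choice": {"mythos": 56.0, "sol": 55.5},
}

# Positive gap means Mythos 5 is ahead of GPT 5.6 Sol on that benchmark.
gaps = {name: s["mythos"] - s["sol"] for name, s in comparisons.items()}
mean_gap = sum(gaps.values()) / len(gaps)

for name, gap in gaps.items():
    print(f"{name}: Mythos ahead by {gap:.1f} points")
print(f"Mean gap: {mean_gap:.2f} points")
```

Every gap is positive but small, and with only three data points this says nothing about statistical significance.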
About the same performance, maybe a touch worse, but again, much cheaper. Is
that lower price of Sol being subsidized by OpenAI? We don't know. Is this a last gambit to take market share from Anthropic or a sustainable lower pricing? Before we leave the 5.6 Sol system card, I will say that I do admire that OpenAI repeatedly admitted that 5.6 Sol is a fair bit less aligned in places, despite the obvious incentives to not admit that. It's more likely than GPT 5.5, 5.4, 5.2 or 5.1 to engage in chats about violent illicit behavior or output things that involve a range of other sensitive topics. They even admit that Sol is worse than GPT 5.5 at avoiding data-destructive actions. It's also worse than previous models when it comes to engaging in dangerous financial transactions, and they repeatedly emphasize that Sol will do things like this: a user authorized the deletion of remote virtual machines 1, 2 and 3. Sol, however, couldn't find those names in one namespace. So it substituted remote virtual machines 5, 6 and 7 without asking, killing active processes and force-removing worktrees.
But for all the release notes and dozens and dozens of pages, the data I've given you so far is the best snapshot of a comparison between the frontier models, at least as of today. Beyond that, until the wider release comes out, when apparently we'll get much more info, the only thing we have to go on is internal evals. Rest assured though that some of these are a different, more challenging set of challenges. But the TL;DR again is that Sol is probably slightly worse overall than Mythos or Fable, especially in the cyber domain, but likely better for now for most people if performance per dollar is your metric. And yes, I know I have basically skipped Sonnet 5, because Anthropic pretty much did. In the system card, they say, "In almost all cases, Sonnet 5 trails our Opus and Mythos class models." And even cost-adjusted, when Sonnet's API price reverts in September, it will be barely competitive in my eyes. There was just one stat, I would say, that I thought worth including in the video from the entire paper, which is that the underlying Sonnet 5 model without safeguards is massively more resistant to prompt injection attacks, where a red teamer will hide a prompt in the browser that the model's using, thereby tricking the model into taking actions. It's almost like over the last few weeks, they've managed to bake in much deeper resilience to those kinds of attacks even before we get to safeguards. The less than 1% success rate of these prompt injections against Sonnet 5 compares with almost 30% for Mythos 5, 32% for Opus 4.8 and over 50% for Sonnet 4.6. There we have it. That is
my analysis of some fairly febrile few days in AI. The way I'd put it is that power is shifting tangibly, unpredictably, as we speak. Sometimes it feels like the power is drifting to open-weight models, to China, as we saw with the impressive GLM 5.2, which I covered in detail on my Patreon. Then though we get papers like this one, co-authored by, among others, researchers at Stanford, MIT, Harvard and Anthropic. It points out that the winners will always be the largest of the models. That because of competition over limited neurons, a larger model will always be able to learn a part of the data distribution that smaller models, often like those produced by China, fail to learn, even with infinite training data. Essentially, that in the drive to reduce loss, small models don't have the parameters to spare to learn rare tasks. Gradients will interfere and they'll focus on more common tasks.
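The capacity argument can be pictured with a deliberately crude cartoon. This is not the paper's model and involves no gradients: it simply assumes each task needs one dedicated "slot" of capacity to be learned and that a loss-minimizing model fills its slots with the most frequent tasks first. The task names and frequencies are hypothetical.

```python
# Toy cartoon of capacity competition; NOT the paper's model.
# Assumption: each task needs one dedicated "slot" (think: a few neurons) to be learned,
# and training minimizes expected loss, so slots go to the most frequent tasks first.

def learned_tasks(task_freqs, slots):
    """Return the set of tasks a model with `slots` units of capacity ends up learning."""
    ranked = sorted(task_freqs, key=task_freqs.get, reverse=True)
    return set(ranked[:slots])

def expected_loss(task_freqs, slots):
    """Probability mass of tasks left unlearned (each unlearned example costs 1)."""
    learned = learned_tasks(task_freqs, slots)
    return sum(f for task, f in task_freqs.items() if task not in learned)

# Hypothetical data distribution: three common tasks and one rare one.
freqs = {"grammar": 0.60, "arithmetic": 0.25, "coding": 0.14, "rare_proof": 0.01}

small, large = 3, 4
print("small model learns:", sorted(learned_tasks(freqs, small)))
print("small model loss:  ", expected_loss(freqs, small))
print("large model learns:", sorted(learned_tasks(freqs, large)))
```

Nothing in the allocation depends on how much data there is, which is the cartoon version of the "even with infinite training data" claim: the small model drops the rare task because it has nowhere to put it.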
Large models with increased width have reduced competition between tasks over model parameters, enabling the learning of that rare task without forcing the forgetting of features relevant to
common tasks. That implies that the models served by those with the most compute, like the US frontier labs, will perennially be able to learn more patterns, extract more juice from the
same data, have smarter models. Other
times it feels like power is drifting toward a concentrated group of US corporations and the US government.
There are days when power chases a certain breeze and lands on Anthropic with their best-in-class models, but then the very next moment it drifts to those like OpenAI who are promising best
performance per dollar. Let me know where you think the power will land.
Thank you so much for watching and have a wonderful