I Built a Vox-Style Video Using HyperFrame and Claude Code
By Andy Lo
Summary
Topics Covered
- Video creation is turning into code
- Build the design system before the scenes
- Handoff files prevent AI context corruption
- Creative direction outranks the tools
Full Transcript
Take a look at this.
China now builds more electric cars than the rest of the world combined. But the real story isn't the cars. It's the batteries inside them and who controls how they're made.
This entire Vox-style animated video was built using HyperFrames and Claude Code. The scenes, the animations, the typography, the transitions, everything was generated as code. We have already seen AI models develop motion graphics skills, like Claude Code with Remotion. But the problem was that a lot of AI motion graphics still felt more like animated slides than real editorial video. That's where HyperFrames by HeyGen becomes more interesting, because it gives creators more control over pacing, layout, timing, typography, transitions, and animation style. So instead of just a basic slide-style video, you can now create something closer to a real Vox-style editorial animation. In this video, we're going to show you how this project was built with Claude Code and HyperFrames.
So before we jump into the project, let us quickly explain what HyperFrames and HeyGen are. HeyGen is already known for AI video avatars and realistic lip sync. But HyperFrames is pretty different.
HyperFrames is an open-source video framework from HeyGen that lets AI agents create videos using HTML, CSS, and JavaScript. In simple terms, it turns video creation into code. So instead of editing everything manually, Claude Code can now build the video like a web project, and HyperFrames can render it into a finished video.
Now let me show you how this system actually works. The entire workflow is built around only three stages. The first stage is the foundation. This is where Claude creates the video canvas, loads the fonts, defines the design system, builds the color palette, and creates the scene structure. The second stage is the story itself. Claude writes the copy, builds the scenes, creates the animations with GSAP, and connects everything together with transitions. The final stage is rendering. This is where narration gets added, source citations get inserted, and HyperFrames exports the finished video. So instead of manually animating every element, transition, and layer, Claude with HyperFrames can structure the whole video system for you. By the end of this video, you will know exactly how to build yours. So before we start building, let me quickly walk you through the project structure.
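As an illustration of what the stage-one design system might look like in practice, here is a minimal Python sketch that keeps the palette, fonts, and canvas size in one place and renders them as CSS custom properties. Everything in it is an assumption for illustration: the `DesignSystem` class, the token names, the colors, the placeholder font names, and the 1920x1080 canvas are hypothetical and are not taken from the project or from HyperFrames itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DesignSystem:
    """Shared tokens for every scene. All values are illustrative placeholders."""

    canvas_width: int = 1920   # assumed canvas size, not from the project
    canvas_height: int = 1080
    colors: dict[str, str] = field(default_factory=lambda: {
        "background": "#f4f1ea",  # hypothetical palette
        "ink": "#1a1a1a",
        "accent": "#ffd500",
    })
    fonts: dict[str, str] = field(default_factory=lambda: {
        "headline": "Example Serif Display",  # placeholder font names
        "body": "Example Sans",
    })


def to_css_variables(system: DesignSystem) -> str:
    """Render the tokens as CSS custom properties on :root."""
    lines = [":root {"]
    lines.append(f"  --canvas-width: {system.canvas_width}px;")
    lines.append(f"  --canvas-height: {system.canvas_height}px;")
    for name, value in system.colors.items():
        lines.append(f"  --color-{name}: {value};")
    for role, family in system.fonts.items():
        lines.append(f'  --font-{role}: "{family}";')
    lines.append("}")
    return "\n".join(lines)


css = to_css_variables(DesignSystem())
print(css)
```

Because every scene would read from the same tokens, changing one value here changes it everywhere, which is the reason for building the design system before the scenes.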
Everything starts with this file right here, the storyboard and direction. You can think of this as the creative brief for the entire project. It defines the story we are telling, the visual style we are aiming for, the scenes we need to create, and the overall direction of the video. In our case, that means things like the Vox-inspired design language, the animation style, the scene structure, the color palette, typography choices, and the key message we want the audience to walk away with. So ultimately, what this document does is give Claude a clear understanding of what success looks like before it writes a single line of code.
Now, if you want to create your own custom HyperFrames project, we actually created a complete build guide that walks through the entire process, from storyboarding and asset sourcing all the way through building the final video. You can find that inside our community, along with the exact prompts and resources used in this project, and you can find the link in the description below.
Next, we have the fonts folder. For this video we are using three main font families: [unclear] Sans, Balto, and Harriet Display. These are the fonts we selected to help recreate that Vox-style editorial look. You'll notice that we have the entire font family for each one loaded in the project. In reality, we are only going to use a handful of these weights.
But it is much easier to put the entire font family here than to go through the entire pack and pick out which ones to use.
Now let's look at the Vox media folder, and more specifically the assets folder. You can ignore the other files and folders for now, since those are covered in the build guide. The assets folder is where all of our visual materials live: things like images, SVGs, and videos, anything you want to appear inside the final video. For this project, we are mainly using images. And yes, there are quite a lot of them, but that does not mean we're using every single asset. Most scenes only need a handful of selected images. The goal here is not to use more assets, but to have enough options to choose from.
Next, you have the CLAUDE.md file. If you have worked with Claude Code before, you probably recognize this immediately, because this is essentially the source of truth for the entire project. It contains the project rules, technical requirements, design constraints, workflow instructions, and everything Claude needs to consistently make the right decisions while building. You can think of it as the operator manual for the entire project.
And finally, we have the prompts. These are the prompts we are going to use to build the video from start to finish. We will go through each one when we get to the actual build process. For now, all you need to know is that the workflow is broken into three separate phases, with each phase handling a different part of the production pipeline.
Now that the project structure is in place, we can start building the actual video. At this point, Claude already understands the assets, the storyboard, and the visual style we are aiming for. The first thing we're going to do is run prompt one.
Here is exactly what this prompt does. The first thing we are asking Claude to do is read all the project documentation, that is, the storyboard and direction. Before Claude starts building anything, it needs to understand what we are trying to create. In our case, that's a Vox-style video about electric vehicles and artificial intelligence.
Once Claude understands the goal, it starts building the design system. That includes things like the typography, color palette, motion style, and all of the visual rules that every scene will follow later. Again, this is very important, because we do not want every scene looking different from one another, right? We want the entire video to feel like it was designed as a single cohesive piece.
The prompt also tells Claude how motion should behave throughout the project. If you have watched Vox videos before, you may have noticed that their animations have a very specific style and feeling. They are quite intentional, measured, and slightly choppy. It is a style that feels really editorial and more focused on communicating information. We are teaching Claude how to recreate that visual language from the very beginning.
And then finally, we're asking Claude to create the structure of the video itself. Our storyboard contains seven scenes, and what Claude does here is create placeholders for all seven, so we have a framework to build on later. By the end of this prompt, we do not have a finished video yet, but we do have the foundation. Now let's copy the prompt, paste it into Claude Code, and see what it builds.
All right. It took about 20 minutes for Claude to finish building the foundation and design system. The exact timing may vary, but you can expect this first phase to take a couple of minutes, since Claude is setting up the entire architecture. Let's open it up.
Right now, ours is running on localhost:3014. Your port might be different.
If you do not see it immediately, just ask Claude for the link. And here we are. Now, this looks very bare-bones, right? But that's exactly what we want.
Remember, we have not built any actual scenes yet. We have not added images. We have not added charts. This phase was only about building the foundation. From what we can see, everything looks correct. All seven scenes are present. The typography hierarchy is on point. The motion design looks correct.
We can already see that signature Vox-style motion language beginning to take shape. And most importantly, there's nothing here that needs fixing.
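The measured, slightly choppy motion described for prompt one can be approximated by quantizing an animation's progress into a few discrete jumps. The sketch below shows that idea in plain Python. It is an assumption about how such a look could be produced, not the project's actual animation code; GSAP offers a comparable built-in through its `steps(n)` ease. The function names and the sample values (24 fps, four steps, a 120-pixel move) are hypothetical.

```python
import math


def stepped_ease(progress: float, steps: int) -> float:
    """Quantize linear progress in [0, 1] into `steps` discrete jumps."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    clamped = min(max(progress, 0.0), 1.0)
    return math.floor(clamped * steps) / steps


def sample_positions(
    start: float, end: float, duration_seconds: float, fps: int, steps: int
) -> list[float]:
    """Return one position per frame for a stepped move from start to end."""
    frame_count = max(1, round(duration_seconds * fps))
    return [
        start + (end - start) * stepped_ease(frame / frame_count, steps)
        for frame in range(frame_count + 1)
    ]


# Hypothetical move: slide an element 120 px over half a second in 4 jumps.
positions = sample_positions(0.0, 120.0, duration_seconds=0.5, fps=24, steps=4)
print(positions)
```

With these sample values the element holds each position for three frames before jumping to the next one, instead of gliding smoothly, which is one plausible reading of "slightly choppy".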
The foundation is pretty solid. Now let's move on to prompt two. This is where the actual video starts getting built. If prompt one was about creating the foundation, then prompt two is about bringing the storyboard to life. The first thing we are asking Claude to do is use the assets we have already collected. Then we are asking it to transform those raw assets into seven fully built scenes based on our storyboard.
Notice that we're not just telling Claude what assets to use. We are also reminding it how to use them. For example, every visual should explain something. Every animation should communicate information. You can also carefully choose the specific assets you want to use yourself, but for this video we will let Claude do that and see how it goes. So nothing should exist just because it looks cool. That's the core of the Vox style, and we want Claude to follow that idea throughout the entire project.
We are also giving Claude the structure for each scene. Rather than asking Claude to invent a story from scratch, we are giving it a very clear runbook to follow, and that keeps everything consistent from scene to scene. Finally, before finishing, we are asking Claude to do something extremely important: tell us where the project is weak. For example, if assets are missing or visuals feel underdeveloped, Claude should flag it so that we can improve it before moving to the final stage.
So now let's hit enter and wait. Once again, this may take some time. All right, after a while it's finished. Just like before, Claude gives us a summary of everything it accomplished. But honestly, this step is the most important. As instructed in the prompt, Claude also gives us a list of missing or weak assets across the project. This is where quality control begins. You can take a look at the scenes Claude flagged.
You can open the video and see what looks off. Maybe an image feels weak, or a visual doesn't support the narration strongly enough. Maybe one asset simply fits the scene better than another. This is mostly a visual review process.
You can tell Claude what you're seeing, and Claude will make the necessary adjustments. This is exactly why we gathered so many assets earlier, because now you can select the strongest image for each scene instead of being forced to use whatever happens to be available. After prompt two, the process becomes very simple. Review, iterate, improve.
That's it. You can ask Claude to fix visual bugs you see, swap assets, or refine the timing. The exact iteration process will look a little different every time, but the storyboard and CLAUDE.md will keep the project moving in the right direction. Both build guides, this one and the custom HyperFrames guide, cover this process in much more detail so you do not get lost.
Now, one more tip before we continue. If you are unfamiliar with the term, the context window is essentially Claude's working memory for the current conversation.
It's everything Claude can actively remember and reference while working on your project. You can check its usage at any time using the /context command.
As Claude builds, it will automatically compact its context.
This allows you to continue working with a fresh context window without starting over from scratch. For most projects that's perfectly fine. But for us, once we get into the second auto-compact, we usually create a handoff file and start a fresh session. There's no exact rule for when you should do this. It's mostly a habit, because as conversations become extremely long, the context will eventually become too corrupted, and you will run into inconsistencies or hallucinated outputs. A handoff file is basically a project status summary.
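To make the idea concrete, a handoff file could be generated from a small structure like the one below. This is a minimal sketch, assuming a simple Markdown layout: the `Handoff` class, its field names, the section titles, and the sample entries are all hypothetical, and the video does not specify any particular format.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Status summary a fresh session can read to resume work (hypothetical shape)."""

    project: str
    decisions: list[str] = field(default_factory=list)
    progress: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the summary as Markdown, one section per list."""
        sections = [
            ("Key decisions", self.decisions),
            ("Progress so far", self.progress),
            ("Instructions for the next session", self.next_steps),
        ]
        lines = [f"# Handoff: {self.project}", ""]
        for title, items in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        return "\n".join(lines)


# Sample entries are invented for illustration only.
handoff = Handoff(
    project="EV explainer video",
    decisions=["Keep the stepped, editorial motion style in every scene"],
    progress=["Prompt one and prompt two complete", "Seven scenes built"],
    next_steps=["Read CLAUDE.md and the storyboard first", "Replace flagged weak assets"],
)
text = handoff.to_markdown()
print(text)
```

The output can be saved next to the storyboard and pasted or referenced at the start of the new session, so the new session starts from decisions already made rather than rediscovering them.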
The handoff file contains all of the important decisions, project context, progress, and instructions needed so a brand-new Claude session can immediately pick up where the previous one left off.
Now we are ready for prompt three. This step is completely optional. As you can see, prompt three handles things like sourcing and generating SVG logos, creating narration, and rendering the final video. If you do not need sourcing, custom logos, or AI voiceovers, you can absolutely finish the video after prompts one and two. At that point, you already have a complete video. Prompt three simply completes the final production polish.
So let's jump into the preview and take a look at the final result. Okay, it looks pretty good. Now the question is, is this going to compete directly with a Vox video created by an experienced production team? I would say no, not yet. But that's not really the point, right? The point is that we built this in just about an hour using a handful of prompts. And despite that, the visual hierarchy is there. The typography feels cohesive. The assets blend naturally with the layouts, and the transitions work. The motion language also feels close to the editorial style we were aiming for.
So, at this point, we already have a working video, and now we can take it a step further and make it even better: more scenes, narration and music, and a more advanced storyboard structure. For this next version, we have two additional files. The first one is the leveled-up storyboard, and this is exactly what it sounds like. It's an upgraded version of the original storyboard, but with a much larger scope. Instead of seven simple scenes, we are now working with seven chapters that are further broken down into 19 individual subscenes. That may not sound like a huge difference, right?
But it changes the pacing of the entire video. Instead of spending eight or ten seconds on a single visual idea, we are constantly moving between different supporting visuals: maps, charts, headlines, documents, and photography. The result is a video that feels significantly more dynamic and information-dense. Another major change is asset utilization. In the original version, we only needed a handful of assets per scene. This upgraded version is designed to use over 70% of the entire asset library. That means more visual variety and more opportunities for Claude to create those signature transitions where one idea evolves naturally into the next.
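A quick back-of-the-envelope calculation shows why going from seven scenes to 19 subscenes changes the pacing so much. The sketch below is illustrative only. It assumes the runtime stays fixed at 63 seconds (seven scenes at nine seconds each, the midpoint of "eight or ten seconds"), which isolates the effect of the shot count and is not a figure from the video. The asset library size of 120 is likewise hypothetical.

```python
def average_seconds_per_shot(runtime_seconds: float, shot_count: int) -> float:
    """Average screen time per shot for a given total runtime."""
    if shot_count <= 0:
        raise ValueError("shot_count must be positive")
    return runtime_seconds / shot_count


def minimum_assets_used(library_size: int, percent: int) -> int:
    """Smallest whole number of assets that reaches `percent` of the library."""
    return (library_size * percent + 99) // 100


RUNTIME_SECONDS = 63  # assumption: 7 scenes x 9 s, runtime held constant
LIBRARY_SIZE = 120    # hypothetical asset count

original = average_seconds_per_shot(RUNTIME_SECONDS, 7)
upgraded = average_seconds_per_shot(RUNTIME_SECONDS, 19)

print(f"7 scenes: {original:.1f} s per visual idea")
print(f"19 subscenes: {upgraded:.1f} s per visual idea")
print(f"70% of {LIBRARY_SIZE} assets: {minimum_assets_used(LIBRARY_SIZE, 70)}")
```

Under these assumptions the average time on a single visual idea drops from 9 seconds to roughly 3.3 seconds, which is why the upgraded cut feels so much denser even though the chapter count is unchanged.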
Next, we have the narration script. As the name suggests, this contains all of the voiceover for the upgraded version of the project, but it's more than just a script. The timing has already been aligned to the new storyboard structure, and it is formatted specifically for voice generation. That means Claude already knows what should be spoken, when it should be spoken, and how it fits into the timeline.
Another thing you may have noticed is that we have had an OpenCode session running in a separate tab this entire time. We actually used OpenCode together with All Alpha through OpenRouter to help set up the ElevenLabs API integration. If you are curious about this and would like a dedicated video covering All Alpha and OpenCode, please let us know in the comment section below. One thing we would like to mention: the Alpha model is currently a stealth model, which means we do not really know how the provider is handling training, retention, or storage behind the scenes. So for now, we would avoid putting sensitive information into it: things like API keys, passwords, confidential material, anything you would not want exposed. For general work it is perfectly fine and capable, and it has a context window of over a million tokens.
Now let's go back to the project. At this point, implementing the upgraded storyboard follows the exact same process as before. We also suggest creating a duplicate of the project folder if you want a backup. We are going to skip the step-by-step process because it's almost identical to what we have already covered. So let's just jump ahead to when everything is finished.
All right, time skip, and the build is done. Let's take a look at the preview.
China now builds more electric cars than the rest of the world combined. But the real story isn't the cars. It's the batteries inside them, and who controls how they're made. China's lead rests on four strengths: manufacturing scale, batteries, industrial policy, and factory-floor AI. And the numbers are stark. China now makes more than 60% of the world's electric vehicles. These strengths aren't separate. They feed one another. A self-reinforcing system that's extremely hard to compete with.
At the center of it all is BYD, a company most Americans don't know, now the world's largest maker of electric vehicles. Its factories are vast. Some lines finish a car in under a minute. And BYD makes its own batteries, chips, even the robots. Last year, BYD sold over 4 million vehicles. And its batteries now supply rivals like Tesla. This is integration at a rare scale.
It starts underground. China refines most of the world's lithium and rare earths. Those minerals become cells, then packs, then cars, rolling off Chinese lines by the millions. From there, they ship worldwide. China now exports more electric cars than any other nation.
The newest shift is artificial intelligence. China's plants are becoming smart factories that learn and adapt in real time. Computer vision now checks every weld and circuit faster and more precisely than any human team. Digital twins model the whole line before a single part exists. A factory is becoming a brain.
The United States has noticed. Tariffs on Chinese EVs now approach 100%. Washington is steering hundreds of billions toward building a supply chain on American soil. New battery plants are rising across the Midwest. But the gap is wide and slow to close. So two systems now compete. One built over decades, the other racing to rebuild. Both betting on the same technology. The race isn't about cars anymore. It's batteries, software, and AI. Whoever wins will shape how the world builds everything.
And honestly, HyperFrames and Claude Code are genuinely impressive. With the chapter-based structure, the additional assets, the narration, and the background music, the entire project feels significantly more polished.
There's more movement, more variety, more storytelling, and most importantly, more clarity. The video feels less like a prototype and more like something you would want to publish. What's exciting is that we are still using the exact same workflow. The only thing that changed was the quality of the inputs: better storyboard, more assets, better narration. That is a really important lesson, because the tools themselves are only part of the equation. The real difference comes from the creative direction you give them. The better your planning becomes, the better the final video becomes. Now imagine what happens when you spend more time refining the storyboard, or when you start learning motion design principles yourself, or when you put this workflow in the hands of an actual creative team.
HyperFrames is not replacing motion designers. It is actually amplifying them. When you combine a strong creative vision with Claude Code and HyperFrames, you can ship high-quality videos dramatically faster than with traditional workflows. Overall, we think that HyperFrames and Claude Code have a tremendous amount of potential in the right hands, and personally, we think it is probably the best alternative to Remotion available.
So now let's wrap things up. First of all, Claude Code acts like the production team: it organizes the project, writes the scenes, creates the animation logic, structures the assets, and builds the entire video system. HyperFrames then takes that system and renders it into an actual video. The biggest shift is that programmatic video production is going further than ever. Beyond Remotion, we now also have HyperFrames. So instead of editing every frame manually, you just describe what you want and generate the system that creates it. That means faster iteration, better consistency, and workflows that are dramatically easier to scale. Try this yourself. Pick a topic you already understand well and build a video around it. You'll very quickly start seeing how powerful this approach can be. And if you want more in-depth tutorials on how to make AI videos and how to make money with them, feel free to join our community. You can find the links in the description below.
And as always, if you found this video helpful, hit the like and subscribe buttons for more videos like this in the future. I'll see you in our next