I Built a Vox-Style Video Using HyperFrame and Claude Code

By Andy Lo

Summary

Topics Covered

Video creation is turning into code
Build the design system before the scenes
Handoff files prevent AI context corruption
Creative direction outranks the tools

Full Transcript

Take a look at this.

China now builds more electric cars than the rest of the world combined. But the real story isn't the cars. It's the batteries inside them and who controls how they're made.

This entire Vox-style animated video was built using HyperFrames and Claude Code. The scenes, the animations, the typography, the transitions: everything was generated as code. We have already seen AI models develop motion graphics skills, like Claude Code with Remotion. But the problem was that a lot of AI motion graphics still felt more like animated slides than real editorial video.

That's where HyperFrames by HeyGen becomes more interesting, because it gives creators more control over pacing, layout, timing, typography, transitions, and animation style. So instead of just a basic slide-style video, you can now create something closer to a real Vox-style editorial animation. In this video, we're going to show you how this project was built with Claude Code and HyperFrames.

So before we jump into the project, let us quickly explain what HyperFrames and HeyGen are. HeyGen is already known for AI video avatars and realistic lip sync. But HyperFrames is pretty different. HyperFrames is an open-source video framework from HeyGen that lets AI agents create videos using HTML, CSS, and JavaScript. In simple terms, it turns video creation into code. So instead of editing everything manually, Claude Code can now build the video like a web project, and HyperFrames can render it into a finished video.

Now let me show you how this system actually works. This entire workflow is built around only three stages. The first stage is the foundation. This is where Claude creates the video canvas, loads the fonts, defines the design system, builds the color palette, and creates the scene structure. The second stage is the story itself. Claude writes the copy, builds the scenes, creates the animations with GSAP, and connects everything together with transitions. The final stage is rendering. This is where narration gets added, source citations get inserted, and HyperFrames exports the finished video.

So instead of manually animating every element, transition, and layer, Claude with HyperFrames can structure the whole video system for you. By the end of this video, you will know exactly how to build yours. So before we start building, let me quickly walk you through the project structure.
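The three stages described above can be written down as a small checklist. This is only a sketch for keeping track of the work: the stage names and tasks come from the walkthrough, while the `Stage` class and `checklist` helper are hypothetical and are not part of HyperFrames or Claude Code.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One phase of the workflow (hypothetical helper, not a HyperFrames API)."""
    name: str
    tasks: list[str] = field(default_factory=list)


# Stage names and tasks mirror the three stages described in the walkthrough.
PIPELINE = [
    Stage("foundation", [
        "create video canvas",
        "load fonts",
        "define design system",
        "build color palette",
        "create scene structure",
    ]),
    Stage("story", [
        "write copy",
        "build scenes",
        "create animations",
        "connect scenes with transitions",
    ]),
    Stage("rendering", [
        "add narration",
        "insert source citations",
        "export finished video",
    ]),
]


def checklist(pipeline: list[Stage]) -> list[str]:
    """Return one line per task, prefixed with the stage number and name."""
    lines = []
    for number, stage in enumerate(pipeline, start=1):
        for task in stage.tasks:
            lines.append(f"{number}. {stage.name}: {task}")
    return lines


if __name__ == "__main__":
    print("\n".join(checklist(PIPELINE)))
```

Keeping the stages in an ordered list makes it easy to see that nothing in the story stage starts before the foundation tasks are done.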

Everything starts with this file right here, the storyboard and direction. You can think of this as the creative brief for the entire project. It defines the story that we are telling, the visual style we are aiming for, the scenes we need to create, and the overall direction of the video. In our case, that means things like the Vox-inspired design language, the animation style, the scene structure, the color palette, typography choices, and the key message we want the audience to walk away with. Ultimately, what this document does is give Claude a clear understanding of what success looks like before it writes a single line of code.

Now, if you want to create your own custom HyperFrames project, we actually created a complete build guide that walks through the entire process, from storyboarding and asset sourcing all the way through building the final video. You can find that inside our community as well, along with the exact prompts and resources used in this project, and you can find the link in the description below.

Next we have the fonts folder. For this video we are using three main font families: or sense, Balto, and Harriet Display. These are the fonts we selected to help recreate that Vox-style editorial look. You'll notice that we have the entire font family for each one loaded in the project. In reality we are only going to use a handful of these weights.

But it is much easier to put the entire font family here than to go through the entire pack and pick out which ones to use.

Now let's look at the Vox media folder, and more specifically the assets folder. You can ignore the other files and folders for now, since those are covered in the build guide. The assets folder is where all of our visual materials live: things like the images, SVGs, and videos. Anything you want to appear inside the final video. For this project, we are mainly using images. And yes, there are quite a lot of them, but that does not mean that we're using every single asset. Most scenes only need a handful of selected images. The goal here is not to use more assets, but to have enough options to choose from.

Next you have the CLAUDE.md file. If you have worked with Claude Code before, you probably recognize this immediately, because this is essentially the source of truth for the entire project. It contains the project rules, technical requirements, design constraints, workflow instructions, and everything Claude needs to consistently make the right decisions while building. You can think of it as the operator manual for the entire project.

Finally, we have the prompts. These are the prompts we are going to use to build the video from start to finish. We will go through each one when we get to the actual build process. For now, all you need to know is that the workflow is broken into three separate phases, with each phase handling a different part of the production pipeline.

Now that the project structure is in place, we can start building the actual video. At this point, Claude already understands the assets, the storyboard, and the visual style we are aiming for. The first thing we're going to do is run prompt one.
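Before running the first prompt, it can help to confirm that the inputs described in the project walkthrough are actually in place. The sketch below is one way to do that. Only CLAUDE.md is named in the walkthrough; the other file and folder names (`storyboard-and-direction.md`, `fonts`, `assets`, `prompts`) are hypothetical placeholders, so adjust them to match your own project.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: only CLAUDE.md is a name taken from the walkthrough.
# Every other entry is a placeholder for the file or folder it describes.
EXPECTED = {
    "storyboard-and-direction.md": "file",  # the creative brief
    "CLAUDE.md": "file",                    # project rules and constraints
    "fonts": "dir",                         # full font families
    "assets": "dir",                        # images, SVGs, videos
    "prompts": "dir",                       # the three build prompts
}


def missing_inputs(root: Path, expected: dict[str, str] = EXPECTED) -> list[str]:
    """Return the expected entries that do not exist under root."""
    missing = []
    for name, kind in expected.items():
        path = root / name
        present = path.is_file() if kind == "file" else path.is_dir()
        if not present:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Demo against a throwaway directory that has only two of the inputs.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "fonts").mkdir()
        (root / "CLAUDE.md").write_text("# Project rules\n")
        print("Missing:", missing_inputs(root))
```

An empty result means every expected input was found; anything listed is worth adding before the first prompt runs, since Claude reads this documentation first.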

Here is exactly what this prompt does. The first thing we are asking Claude to do is read all the project documentation. That means the storyboard and direction, because before Claude starts building anything, it needs to understand what we are trying to create. In our case, that's a Vox-style video about electric vehicles and artificial intelligence.

Once Claude understands the goal, it starts building the design system. That includes things like the typography, color palette, motion style, and all of the visual rules that every scene will follow later. Again, this is very important because we do not want every scene looking different from one another, right? We want the entire video to feel like it was designed as a single cohesive piece.

The prompt also tells Claude how motion should behave throughout the project. If you have watched Vox videos before, you may notice that their animations have a very specific style and feeling. Their animations are quite intentional, measured, and slightly choppy. It is a style that feels really editorial and more focused on communicating information. We are teaching Claude how to recreate that visual language from the very beginning.

Then finally, we're asking Claude to create the structure of the video itself. Our storyboard contains seven scenes, and what Claude does here is create placeholders for all seven scenes, so we have a framework to build on later. By the end of this prompt, we do not have a finished video yet, but we do have the foundation. Now let's copy the prompt, paste it into Claude Code, and see what it builds.

All right. It took about 20 minutes for Claude to finish building the foundation and design system. The exact timing may vary, but you can expect this first phase to take a couple of minutes, since Claude is setting up the entire architecture. Let's open it up.

Right now, ours is running on localhost:3014. Your port might be different. If you do not see it immediately, just ask Claude for the link. And here we are.

Now, this looks very bare bones, right? But that's exactly what we want. Remember, we have not built any actual scenes yet. We have not added images. We have not added charts. This phase was only about building the foundation. From what we can see, everything looks correct. All seven scenes are present. The typography hierarchy is on point. The motion design looks correct. We can already see that signature Vox-style motion language beginning to take shape. And most importantly, there's nothing here that needs fixing.

The foundation is pretty solid. Now let's move on to prompt two. This is where the actual video starts getting built. If prompt one was about creating the foundation, then prompt two is about bringing the storyboard to life. The first thing we are asking Claude to do is use the assets we have already collected. Then we are asking it to transform those raw assets into seven fully built scenes based on our storyboard.

Notice that we're not just telling Claude what assets to use. We are also reminding it how to use them. For example, every visual should explain something. Every animation should communicate information. You can also carefully choose the specific assets you want to use, but for this video, we will tell Claude to do that and see how it goes. Nothing should exist just because it looks cool. That's the core part of the Vox style, and we want Claude to follow that idea throughout the entire project.

We are also giving Claude the structure for each scene. Rather than asking Claude to invent a story from scratch, we are giving it a very clear roadmap to follow. That makes everything look consistent from scene to scene. Finally, before finishing, we are asking Claude to do something extremely important: we are asking it to tell us where the project is weak. For example, if assets are missing or visuals feel underdeveloped, Claude should flag it so that we can improve it before moving to the final stage. So now let's hit enter and wait. Once again, this may take some time.

All right, after a while it's finished. Just like before, Claude gives us a summary of everything it accomplished. But honestly, this step is the most important. As instructed in the prompt, Claude also gives us a list of missing or weak assets across the project. This is where quality control begins. You can take a look at the scenes Claude flagged. You can open the video and see what looks off. Maybe an image feels weak, or a visual doesn't support the narration strongly enough. Maybe one asset simply fits the scene better than another. This is mostly a visual review process, so you tell Claude what you're seeing and Claude will make the necessary adjustments. This is exactly why we gathered so many assets earlier, because now you can start selecting the strongest image for each scene instead of being forced to use whatever happens to be available.

So after prompt two, the process becomes very simple. Review, iterate, improve.

That's it. You can ask Claude to fix visual bugs you see, swap assets, or refine the timing. The exact iteration process will look a little bit different every time, but the storyboard and CLAUDE.md will keep the project moving in the right direction. Both build guides, this one and the custom video guide, cover this process in much more detail so you do not get lost.

Now, one more tip before we continue. If you are unfamiliar with the term, the context window is essentially Claude's working memory for the current conversation. It's everything Claude can actively remember and reference while working on your project. You can check its usage at any time using the /context command.

As Claude Code builds, it will eventually auto-compact its context. This allows you to continue working with a fresh context window without starting over from scratch. For most projects that's perfectly fine. But for us, once we start getting into the second auto-compact, we usually create a handoff file and start a fresh session. There's not an exact rule for when you should do this. It's mostly a habit, because as conversations become extremely long, the context will eventually become too corrupted, and you will run into inconsistencies or hallucinated outputs. A handoff file is basically a project status summary.
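As a rough illustration, a handoff file of this kind could be assembled by a small script. The sketch below is an assumption about what a project status summary might hold (decisions, context, progress, and instructions for the next session); the `Handoff` class, its field names, and the `should_hand_off` helper are hypothetical and are not a Claude Code feature. The default threshold of two simply encodes the habit of handing off around the second auto-compact.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical structure for a project status summary."""
    project: str
    decisions: list[str] = field(default_factory=list)
    context: list[str] = field(default_factory=list)
    progress: list[str] = field(default_factory=list)
    next_steps: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the summary as a short Markdown document."""
        sections = [
            ("Decisions", self.decisions),
            ("Context", self.context),
            ("Progress", self.progress),
            ("Next steps", self.next_steps),
        ]
        lines = [f"# Handoff: {self.project}"]
        for title, items in sections:
            lines.append("")
            lines.append(f"## {title}")
            # Empty sections are kept visible so gaps are obvious to the next session.
            lines.extend(f"- {item}" for item in items or ["(none recorded)"])
        return "\n".join(lines) + "\n"


def should_hand_off(autocompact_count: int, threshold: int = 2) -> bool:
    """Habit, not a rule: hand off once the session reaches the threshold."""
    return autocompact_count >= threshold


if __name__ == "__main__":
    summary = Handoff(
        project="EV explainer video",
        decisions=["Seven scenes, Vox-style motion language"],
        progress=["Prompt one and prompt two complete"],
        next_steps=["Review the scenes flagged as weak"],
    )
    if should_hand_off(autocompact_count=2):
        print(summary.to_markdown())
```

In practice you would more likely ask Claude to write this summary itself; the point of the structure is that a new session can read it top to bottom and know what was decided and what comes next.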

The handoff file contains all of the important decisions, project context, progress, and instructions needed so a brand new Claude session can immediately pick up where the previous one left off.

Now we are ready for prompt three. This step is completely optional. As you can see, prompt three handles things like sourcing and generating SVG logos, creating narration, and rendering the final video. If you do not need sourcing, custom logos, or AI voice-overs, you can absolutely finish the video after prompts one and two. At that point, you already have a complete video. Prompt three simply completes the final production polish.

So let's jump into the preview and take a look at the final result. Okay, it looks pretty good. Now the question is, is this going to compete directly with a Vox video created by an experienced production team? I would say no, not yet. But that's not really the point, right? The point is that we built this in just about an hour using a handful of prompts. Despite that, the visual hierarchy is there. The typography feels cohesive. The assets blend naturally with the layouts, and the transitions work. The motion language also feels close to the editorial style we were aiming for.

At this point, we already have a working video, and now we can take it a step further and make it even better: more scenes, narration and music, and a more advanced storyboard structure. For this next version, we have two additional files. The first one is the leveled-up storyboard, and this is exactly what it sounds like. It's an upgraded version of the original storyboard, but with a much larger scope. Instead of seven simple scenes, we are now working with seven chapters that are further broken down into 19 individual sub-scenes. That may not sound like a huge difference, right?

But it changes the pacing of the entire video. Instead of spending eight or ten seconds on a single visual idea, we are constantly moving between different supporting visuals: maps, charts, headlines, documents, and photography. The result is a video that feels significantly more dynamic and information-dense.

Another major change is asset utilization. In the original version, we only needed a handful of assets per scene. This upgraded version is designed to use over 70% of the entire asset library. That means more variety and more opportunities for Claude to create those signature transitions where one idea evolves naturally into the next.
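The scope figures above (seven chapters, 19 sub-scenes, more than 70% of the asset library in use) are easy to check once a storyboard is written down as data. The sketch below assumes a hypothetical chapter and sub-scene structure; the class names, the sample storyboard, and the asset filenames are illustrative only and do not come from the project.

```python
from dataclasses import dataclass, field


@dataclass
class SubScene:
    title: str
    assets: list[str] = field(default_factory=list)


@dataclass
class Chapter:
    title: str
    subscenes: list[SubScene] = field(default_factory=list)


def subscene_count(chapters: list[Chapter]) -> int:
    """Total number of sub-scenes across all chapters."""
    return sum(len(chapter.subscenes) for chapter in chapters)


def asset_utilization(chapters: list[Chapter], library: list[str]) -> float:
    """Fraction of the asset library referenced by at least one sub-scene."""
    available = set(library)
    if not available:
        return 0.0
    used = {
        asset
        for chapter in chapters
        for subscene in chapter.subscenes
        for asset in subscene.assets
    }
    return len(used & available) / len(available)


if __name__ == "__main__":
    # Tiny illustrative storyboard: 2 chapters, 3 sub-scenes, 4 library assets.
    library = ["factory.jpg", "battery.jpg", "map.svg", "port.jpg"]
    storyboard = [
        Chapter("Opening", [SubScene("Hook", ["factory.jpg"])]),
        Chapter("Supply chain", [
            SubScene("Minerals", ["map.svg"]),
            SubScene("Cells to cars", ["battery.jpg", "factory.jpg"]),
        ]),
    ]
    print("Sub-scenes:", subscene_count(storyboard))
    print(f"Utilization: {asset_utilization(storyboard, library):.0%}")
```

For the upgraded storyboard described here, the equivalent check would be a sub-scene count of 19 and a utilization above 0.70.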

Next we have the narration script. As the name suggests, this contains all of the voice-over for the upgraded version of the project, but it's more than just a script. The timing has already been aligned to the new storyboard structure, and it is formatted specifically for voice generation. That means Claude already knows what should be spoken, when it should be spoken, and how it fits into the timeline.

Another thing you have probably noticed is that we have had an OpenCode session running in a separate tab this entire time. We actually used OpenCode together with all alpha through OpenRouter to help set up the ElevenLabs API integration. If you are curious about this and would like a dedicated video covering all alpha and OpenCode, please let us know in the comment section below.

One thing we would like to mention: all alpha is currently a stealth model, which means we do not really know how the provider is handling training, retention, or storage behind the scenes. So for now, we would avoid putting sensitive information into it: things like API keys, passwords, private or confidential material, anything you would not want to be exposed. For general work it is perfectly fine and capable, and it has a context window of over a million tokens.

So now let's go back to the project. At this point, implementing the upgraded storyboard follows the exact same process as before. We also suggest creating a duplicate of the project folder as a backup. We are going to skip the step-by-step process because it's almost identical to what we have already covered. So let's just jump ahead to when everything is finished.

All right, time skip, and the build is done. Let's take a look at the preview.

China now builds more electric cars than the rest of the world combined. But the real story isn't the cars. It's the batteries inside them, and who controls how they're made. China's lead rests on four strengths: manufacturing scale, batteries, industrial policy, and factory-floor AI. And the numbers are stark. China now makes more than 60% of the world's electric vehicles. These strengths aren't separate. They feed one another. A self-reinforcing system that's extremely hard to compete with.

At the center of it all is BYD, a company most Americans don't know, now the world's largest maker of electric vehicles. Its factories are vast. Some lines finish a car in under a minute. And BYD makes its own batteries, chips, even the robots. Last year, BYD sold over 4 million vehicles. And its batteries now supply rivals like Tesla. This is integration at a rare scale.

It starts underground. China refines most of the world's lithium and rare earths. Those minerals become cells, then packs, then cars, rolling off Chinese lines by the millions. From there, they ship worldwide. China now exports more electric cars than any other nation.

The newest shift is artificial intelligence. China's plants are becoming smart factories that learn and adapt in real time. Computer vision now checks every weld and circuit faster and more precisely than any human team. Digital twins model the whole line before a single part exists. A factory is becoming a brain.

The United States has noticed. Tariffs on Chinese EVs now approach 100%. Washington is steering hundreds of billions toward building a supply chain on American soil. New battery plants are rising across the Midwest. But the gap is wide and slow to close.

So two systems now compete. One built over decades, the other racing to rebuild. Both betting on the same technology. The race isn't about cars anymore. It's batteries, software, and AI. Whoever wins will shape how the world builds everything.

And honestly, HyperFrames and Claude Code are genuinely impressive. With the chapter-based structure, the additional assets, the narration, and the background music, the entire project feels significantly more polished. There's more movement, more variety, more storytelling, and most importantly, more clarity. The video feels less like a prototype and more like something you would want to publish.

What's exciting is that we are still using the exact same workflow. The only thing that changed was the quality of the inputs: better storyboard, more assets, better narration. That is a really important lesson, because the tools themselves are only part of the equation. The real difference comes from the creative direction you give them. The better your planning becomes, the better the final video becomes.

Now imagine what happens when you spend more time refining the storyboard, or when you start learning motion design principles yourself, or when you put this workflow in the hands of an actual creative team. HyperFrames is not replacing programmatic motion designers; it is actually amplifying them. When you combine a strong creative vision with Claude Code and HyperFrames, you can ship high-quality videos dramatically faster than with traditional workflows. Overall, we think that HyperFrames and Claude Code have a tremendous amount of potential in the right hands, and personally we think it is probably the best alternative to Remotion available.

So now let's wrap things up. First of all, Claude Code acts like the production team: it organizes the project, writes the scenes, creates the animation logic, structures the assets, and builds the entire video system. HyperFrames then takes that system and renders it into an actual video. The biggest shift is that programmatic video production is going further than ever: beyond Remotion, we now also have HyperFrames. So instead of editing every frame manually, you just describe what you want and generate the system that creates it. That means faster iteration, better consistency, and workflows that are dramatically easier to scale.

Try this yourself. Pick a topic you already understand well and build a video around it. You'll very quickly start seeing how powerful this approach can be. And if you want more in-depth tutorials to learn how to make AI videos and how to make money with them, feel free to join our any code community. You can find the links in the description below.

As always, if you found this video helpful, hit the like and subscribe button for more videos like this in the future. I'll see you in our next
