
Google Quietly Made AI Building Way Easier

By MattVidPro

Summary

Topics Covered

  • Open-Source World Models Enable Temporal Control
  • Stitch AI Generates Coherent Multi-Page Designs
  • Vibe Coding Scaffolding Beats Monolithic Prompts

Full Transcript

How's it going everyone? Welcome back to the Matt VidPro AI YouTube channel.

Today's video is all about cutting-edge AI research and tools that you can use today. We're going to get hands-on with a few of them, but let's kick things off with Inspacio. They have released Inspacio World. This is a world model that is completely open source. You can download the weights and code today and check it out. It looks like they do have a demo you can try for free. Obviously, the clips that you see right now are cherry-picked. They call this 4D because you do have temporal control. You can completely pause the quote unquote world simulation, slow it down, or even rewind it.
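To make that "temporal control" idea a little more concrete, here is a minimal sketch of a playback buffer that keeps generated frames around so a viewer can pause, slow down, or rewind. This is my own illustration, not Inspacio's code, and the class and method names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class TimelineBuffer:
    """Hypothetical playback layer for a frame-generating world model."""
    frames: list = field(default_factory=list)  # generated frames, in order
    cursor: float = 0.0                         # which frame the viewer currently sees
    speed: float = 1.0                          # 1.0 = real time, 0.5 = slow motion
    paused: bool = False

    def push(self, frame):
        # The generator appends new frames as it simulates forward.
        self.frames.append(frame)

    def step(self):
        # Advance the viewer's cursor unless playback is paused.
        if not self.paused and self.frames:
            self.cursor = min(self.cursor + self.speed, len(self.frames) - 1)
        return self.frames[int(self.cursor)] if self.frames else None

    def rewind(self, n_frames: int):
        # The "4D" part: jump backwards through frames that were already generated.
        self.cursor = max(0.0, self.cursor - n_frames)
```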

Let's take a look at this footage of the football being caught. It is quite coherent, especially compared to the other open-source world models we've come across. But the guy in the back, I don't know what he thinks he's doing. He's sort of hallucinating in the dark back there, and it looks like he doesn't have an arm. This fully simulated world technology is so early on in its life cycle, but researchers have a very good idea of what they need to focus on. This is a natural evolution from AI video, and things like physical realism are going to be paramount. However, long-term stability might actually be more important right now. You can see in this clip, which is sped up, it's pretty consistent for a while. This demonstration shows a man making a drink, even with some recognizable liquor bottles. Turn your head to the right and then look back, and the drink is actually still being poured as you would expect. That's impressive. This really reminds me of that Genie 3.

So in terms of real-world applications, though, which is something we're going to focus on today with world models: robotics, running realistic predictions, a similar scenario with autonomous driving, applying a real-world simulatory effect to anime, or you could even bring old memories to life and make them interactive. The coolest part about Inspacio, though, of course, is that it's open source under the Apache 2.0 license. Anyone can fork this. Anyone can modify it, upgrade it, fine-tune it.
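If you want to grab the weights yourself, here's a minimal sketch, assuming they're hosted on the Hugging Face Hub; the repo ID below is a placeholder, not the real one:

```python
from huggingface_hub import snapshot_download

# Placeholder repo ID for illustration; substitute the actual Inspacio World repo.
local_dir = snapshot_download(
    repo_id="some-org/inspacio-world-14b",  # hypothetical
    local_dir="./inspacio-world",
)
print(f"Weights and config downloaded to {local_dir}")
```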

You can see this is built off of Wan 2.1, and it also uses a depth estimation model. The Inspacio models themselves are 14 billion parameters. But the real question is how much VRAM you need to run this. So if you want to run this thing at home, it's looking like the 1.3B model is going to be your best bet. But that is far from the full-sized 14B. It definitely looks like it might have the capability to run locally at that size, though, around 10 to 12 GB of VRAM. That full 14B size, you know, if you want this type of quality, that's going to be running on the GPUs that are on server racks, not consumer-level stuff. But I like that they still provide that 1.3B option.
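The back-of-the-envelope math behind those numbers is simple: weights alone take roughly the parameter count times the bytes per parameter, and you still need headroom for activations and caches. A rough sketch, where the overhead factor is a guess rather than a measured figure:

```python
def rough_vram_gb(params_billion: float, bytes_per_param: float = 2.0,
                  overhead: float = 1.4) -> float:
    """Very rough VRAM estimate: fp16/bf16 weights plus ~40% headroom for
    activations and caches. Real usage varies with resolution and frame
    count, so treat this as a ballpark only."""
    weights_gb = params_billion * bytes_per_param  # 1e9 params x bytes == GB
    return weights_gb * overhead

print(f"1.3B model: ~{rough_vram_gb(1.3):.1f} GB")   # a few GB plus working memory
print(f"14B model:  ~{rough_vram_gb(14):.1f} GB")    # server-class GPU territory
```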

And of course, we cannot forget about the demo itself, which just puts us right in as soon as you load the website. Okay, it's starting. I'm moving the camera around. I thought I was going to get to control this robot. It dropped an egg in there. What happens if I do this? Is this view? Oh, it is view. Okay, so I can look up. Can I look down? Oh, I'm reaching the bounds. I see. Well, let's go see this from another angle, then. Let's walk over to the right, and then I'm going to turn the camera. I see. So, time is currently frozen right now. Yeah, I'm not going to lie, the delay in terms of interactivity is absolutely terrible. It's not really playable like a video game, but as a cool little demo, this is definitely pretty mind-blowing to explore a space that is fully dreamt up by AI on the fly. Let's go ahead and try the beach demo.

Okay, so is this our family? What are we, a little child on the beach? Let's try to go into the ocean. Bye, family. I don't want to see you. I don't care about your sand castles. Okay, it's definitely doing like some slow motion trying to get us there. The inference isn't totally smooth, or at least the streaming isn't. Oh, we're getting two umbrellas over there. She's trying to touch the child's hair. Trying to walk past my fam. I want to go into the ocean, to the beach. I'm wondering what this limitation is where the blue borders come up and it doesn't let us walk to where we would prefer to walk or would want to walk.

All right, let's get us a drink. Let's have this, uh... oh no, he's cooking us a steak. And there's a doggy. Oh, I want to see the doggy. Let me see the doggy. I don't care that you're making me a delicious steak, man. I just want to pet the dog. Oh, we can get nice and close. How close can we get? Oh, that's about it, huh? Yes. You're doing a great job, Mr. Chef. Inside of what appears to be my house. Did I hire you? I just come here and sit with my dog. Well, very cool stuff. Very fun to play around with. I love the free demo. You guys should go give this a try before the servers are swamped.

Ah, there we go. Okay, so now the cup is being filled. Oh my god, it's just a stream of coffee, and it stops, and now time is frozen. It's like it's automatically freezing the time. I guess maybe that's a part of the demo. But yeah, you can see there's a lot less for it to hallucinate about in this scene other than, you know, this very crazy lighting and this dramatic dark background. Oh, okay. That's cool though, how it noticed it was probably on top of like a little box or something. Well, that was awesome.

This is not the only interactive AI world we're talking about today. OpenArt launches OpenArt Worlds: a fully navigable 3D environment from a single prompt or image. You can step right inside it and capture shots exactly the way you envision them. This is much different than the world model we just looked at. Similarly, we start with an initial image, and the model generates a 3D world based off of it that you can navigate freely. But instead of the entire world living inside of an AI model, it uses 360-degree image generation and depth estimation to produce a world that feels 3D. It's certainly not as advanced as a real world model, but it does allow you to step right inside of an image pretty much instantly.
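That 360-image-plus-depth trick is easy to picture with a little geometry: each pixel of an equirectangular panorama maps to a viewing direction on a sphere, and the estimated depth pushes it out into 3D, giving you a point cloud you can move a camera through. Here's a minimal sketch of that projection as a general technique, not OpenArt's actual implementation, assuming a panorama and a depth map of the same size:

```python
import numpy as np

def equirect_depth_to_points(depth: np.ndarray) -> np.ndarray:
    """Project an equirectangular depth map (H x W, in meters) to 3D points.
    Column maps to longitude, row maps to latitude; depth scales the view ray."""
    h, w = depth.shape
    lon = (np.arange(w) / w) * 2 * np.pi - np.pi   # -pi..pi across the image
    lat = np.pi / 2 - (np.arange(h) / h) * np.pi   # +pi/2 (up) .. -pi/2 (down)
    lon, lat = np.meshgrid(lon, lat)

    # Unit view directions on the sphere, scaled by per-pixel depth.
    x = np.cos(lat) * np.sin(lon) * depth
    y = np.sin(lat) * depth
    z = np.cos(lat) * np.cos(lon) * depth
    return np.stack([x, y, z], axis=-1)  # (H, W, 3) world-space points

# Toy usage: a fake 512x1024 panorama where everything is about 3 m away.
points = equirect_depth_to_points(np.full((512, 1024), 3.0))
print(points.shape)  # (512, 1024, 3)
```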

And really what they're trying to do is combine this with Nano Banana. You can move around your 3D environment, find your perfect angle, start putting characters right into it, and then generate videos with image to video. It allows you to do multi-shot, but overall, I'd say this is the least interesting of the bunch today.

So, moving forward, let's talk about Google. They just released two updates to AI-focused products. The first one is Stitch. You may have heard of this. This is a design application powered by AI, and the recent update has made it feel very cohesive, easy to use, and honestly, just fun. The main focus of this is to design apps or design websites, but you could definitely use it for so much more. And the way that they've integrated everything, especially with the focus on design and adhering to a specific style, is going to take this product very far, I think.

And in traditional Google fashion, from what I can tell, this is free to use and try out today, allowing everybody to get a firsthand taste of what AI-fueled design can really be like. You know what? I'm going to jump right in here and give you guys a little bit of a tutorial on how to get started with Stitch and what it is capable of. We've got that ever so familiar prompt interface, but we've got a slider down here for app versus web. I think it would also be nice to have maybe "design both" as an option. Regardless, for now, I'm going to pick web. You can attach screenshots, sketches, or visual inspiration. So, you could literally, on, you know, a napkin with a pen, draw out what you think the design of the website should be, and the agent will receive those photo tokens in the embedding space and actually generate your design. But hey, you could still do that sort of thing inside of ChatGPT or the Gemini app. So why use Stitch? Well, a major reason you would is for the design systems. It can maintain coherent designs across multiple pages of a website in all different formats. And you can see there are some presets, right? Alexandria, Glacier, Neon Tokyo, or you can merely let Stitch decide. And Stitch can even create its own designs. I'm going to roll with a simple prompt: rustic medieval themed website for selling odd trinkets and rarities. And let's send that off.

that off. So once you send a prompt, it opens you up into a blank canvas. And

you can see there is a sprawled out little UI design. The agent lives over here on the left hand side. And

apparently something unexpected happened. Well, regardless, right down

happened. Well, regardless, right down underneath you also see every prompt that you send through in a log. And

honestly, I love having this. I wish

this was a standard feature inside of chat GPT and Claude, keeping track of every prompt I send so I can hotkey to them, so I can click on them and it immediately takes me to that part. This

is like a macro view that tracks everything that happens as you work. And

not having this even in a basic chat interface, I honestly think is a missed opportunity. Okay. Well, let's give it

opportunity. Okay. Well, let's give it another shot. This time, hopefully the

another shot. This time, hopefully the agent doesn't get locked out of API jail. Okay. And you can see this time it

jail. Okay. And you can see this time it is generating. I think they've learned a

is generating. I think they've learned a lot about building agents just by working on Google Anti-gravity. It gives

It gives you a basic layout of what it wants to do with your idea. Then it builds a custom design, and this is a real usable spec. You can see the primary, secondary, tertiary, and neutral colors; headline, body, and label typography; a few different icons. And this is not one of the pre-made sets. This is a completely unique design system that it devised itself, and it is fully editable and controllable by you, down to the very fonts. And you're also going to notice at the top there's a design guide, and this is essentially a prompt for the design and how to use it. It's got dos and don'ts, input fields, buttons, components. Gemini makes its own design in this case, but you can also make and upload custom design specs. So, if you were already working on a website, you would be able to translate that over.
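To give a feel for what a design system spec like this boils down to, here's a small, made-up example of the kind of tokens and usage rules such a spec typically captures. The structure, names, and values are invented for illustration, not Stitch's actual design.md format:

```python
# Hypothetical design-system tokens for the "rustic medieval" site,
# roughly the kind of information the generated spec carries.
design_system = {
    "colors": {
        "primary":   "#5b3a1e",   # aged leather brown
        "secondary": "#8c6f46",   # parchment gold
        "tertiary":  "#3e4a3d",   # mossy green
        "neutral":   "#f3ead8",   # candlelit off-white
    },
    "typography": {
        "headline": {"family": "serif", "weight": 700, "size_px": 36},
        "body":     {"family": "serif", "weight": 400, "size_px": 16},
        "label":    {"family": "serif", "weight": 600, "size_px": 13},
    },
    "rules": {
        "do":   ["use textured parchment backgrounds", "keep buttons high-contrast"],
        "dont": ["mix in modern sans-serif headings", "use neon accent colors"],
    },
}
```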

All right, it looks like our initial generations are done. So, what exactly are we looking at? Well, these are actually two separate generations for this same one prompt, allowing us to decide which one we like better. These look like images, but they actually aren't. They are fully interactive, coded websites. This grand foyer homepage, right? If I really like this right now, I can go down here, click the download button, and I get a zip file with the design MD and the code. I can open that right up now and view the website. The approach they took to empowering the agent with design capabilities seriously pays off. It can be something as silly as what I said, but it nails it. The Scribe's Omen: a quill that writes the truth even when the hands wish to lie. And it uses Nano Banana to generate some imagery, obviously, to pair with the website, all automatic. Ah, the flickering lantern. I can acquire this artifact for 45 gold. It's a silly example, but it shows you the capabilities, and how if you really do need to build a website, this is so fast, so professional. And what I'm going to talk about next regarding this is how easy it is to edit, modify, and really bring this to life, or even put it into Claude or ChatGPT.

All right, so let's say we're just in love with this second variant, right? Everything you see is built, but you wouldn't be able to click on the grand foyer and open it up, for example, or the vault, or the scribe, rarity and origin. All this stuff, it's not actually built yet. It's just there for show. It's the basic design principles. So, we'll simply click to highlight our site and, in the prompt: build the rest of the website pages, grand foyer, the vault, the scribe; rarity sorting should work, origin should work, and double the amount of products in the cabinet; also add a review page with satisfied medieval users. And I will send that prompt on through. Now, instead of giving us two variants for everything, it's going to go ahead and take this website we already approved, enhance it, and flesh it out.

While those are loading, I'll show off the new direct edit feature. This gives you much more granular control over specific elements. Any active piece of the website can be clicked, oftentimes letting you edit text or change an image URL, but you always have the option to edit that specific portion with AI. It's cool. We have the macro control and the micro control right at our fingertips in a very easy-to-use interface. All right, so it looks like the vault page has come through. It's very close here to mimicking that original generation it spawned from, but you'll notice that this is italicized and in a different font. Here's our regeneration of the curiosity cabinet with the additions we asked for. So, let's get rid of the old one. And it looks like the agent didn't make it all the way there. The Gemini model, I'm not surprised that it ended up behaving this way, not completing the whole thing. GPT-5 and Claude Opus are a little bit more verbose. Align these to make sure they are part of the same website; generate the other two sections of the site as well.

website. Generate the other two sections of the site as well. And as you can see, we've got some more screens generating through as we wait. Just to give you an idea of what you can do with this. I

mean, anybody, even a kid, could hop on here and make their own website, interactive learning tools, physics, demos, storefronts. It seems it's just

demos, storefronts. It seems it's just built the whole website in four total gems. But it's looking like it's following the design exactly now. So now

we can delete these two precursors and we are left with a four-page website.

Exporting your website could not be easier and they give you a ton of options. I can simply highlight all four

options. I can simply highlight all four pages, click on export. You could just grab a dotzip and give it to literally any other AI agent. You can also directly export with AI studio or do an

instant prototype, which is what I'm going to do right now. What the instant prototype does is make something that you can explore on the fly right in Stitch. It's a different kind of

It's a different kind of interface which still gives you super granular editing control. There's a hotspot toggle on the side that shows you everything that can actually be clicked and interacted with on your page. So, if you wanted to click on this wax seal, for example, you know, it's not going to do anything, cuz that's not really built in the website yet. But in the curiosity cabinet, as we asked, all of this is interchangeable and working as a normal site. So, the vault: looks like this is the review section from all of the medieval people. Master Merchant Bram, Order of the Iron Rose. Oh, I love this review. In all my travels from the frozen wastes to the Emerald Coast, I have never seen a merchant's work so imbued with the spirit of the ancients. Marcus the Wanderer. I've never seen an easier way to build a website with AI, and you can try it for free. Overall, this is a super welcome overhaul, and it makes it so easy for professionals and newbies alike. They've hit a great midpoint with a super simple, easy UI that offers very granular control.

Google has also upgraded the vibe coding experience in Google's AI Studio. This is their API platform for AI. You can get your keys there, but you can also experiment with models, mess with the settings, and do vibe coding. Here's what it actually looks like inside of Google AI Studio: a typical prompt box. We can dictate or we can add some files. They even brought back the "I'm feeling lucky" button to just suggest a random app to make. But they've got some fun remixes here. All right, I'm going to go ahead and do a physics-based claw machine that only gives you lemon-based toys. Let's see how this experience shakes out. Typically, when I use AI to write and build things with code, I've got three main options. I prefer to do it on my desktop locally, cuz I'm downloading GitHub repos, I'm downloading files, but this is definitely easier for getting started building apps than installing Claude Code, OpenAI Codex, or Google Antigravity. I'm already liking this setup, though. Again, it reminds me of Stitch in the way that it seems to be set up in order for you to export it easily. Typically, when you quote unquote vibe code with ChatGPT or Gemini or Claude, you know, on the website through the chat interface, it can give you files, like zips, to download, but sometimes they aren't held on the server for too long and they'll disappear, or in Gemini's case, you have to copy all the code and do it separately.

Same thing with Grok. But you can see right here, it's clearly got access to its own virtual environment: the ability to make a JSON file, make other files, put things in folders where they need to go. This feels very much like an online-only version of Antigravity, almost, especially where I can kind of flip through all of the different files and look at the code like this. Ah, here you can see it already generated all of the prizes we'll be able to grab. They have their own glow colors as well as separate colors, shape, and material: a lemon slice or a lemon seed, a sour candy, perhaps a lemon gummy or a lemon shark.
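Here's roughly the kind of prize definition file the agent generated, reconstructed as a sketch in Python for illustration; the generated app itself would use whatever language the scaffold chose, and the exact field names and values here are invented:

```python
from dataclasses import dataclass

@dataclass
class Prize:
    name: str
    shape: str        # e.g. "sphere", "wedge", "star"
    material: str     # rendering material for the toy
    color: str        # base color (hex)
    glow_color: str   # separate glow/emissive color (hex)

# Invented examples in the spirit of the lemon-only claw machine.
PRIZES = [
    Prize("Lemon Slice", "wedge",  "plastic", "#ffe135", "#fff7a1"),
    Prize("Lemon Seed",  "sphere", "matte",   "#e8d8a0", "#f5eecb"),
    Prize("Sour Candy",  "cube",   "glossy",  "#d7ff5e", "#eaffb0"),
    Prize("Lemon Gummy", "bear",   "gummy",   "#ffd93b", "#ffe98a"),
    Prize("Lemon Shark", "shark",  "plush",   "#f9f06b", "#ffffb5"),
]
```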

This is a little bit of an easier way for people unfamiliar with creating apps or coding to get into something like this, see how it's structured, and see how all of the files are created and the code that actually powers them and makes them work. AI tools and integrations like this one seriously bootstrap people who previously never would have thought to get into this sort of thing. It's important to remember the barriers that are being broken down. Even installing Antigravity as an application on your computer and then learning how to use it, since it's a fork of VS Code, is pretty complicated. This is simple. It's entirely online, and it still gives and shows you the core pieces of building a complex application. The biggest question that I have right now is whether or not they want something like this to truly live inside of Google AI Studio, where people get API keys and test models out. I feel like this sort of thing actually needs to live maybe alongside Gemini, or maybe even a web-based Antigravity. That could be a really cool idea: something very similar to this, but Antigravity essentially simplified for the web, with a quick export to the desktop version of Antigravity.

This has been going for like 7 minutes. What I will do is try this directly in the Gemini app with Pro in their current canvas tool. It allows you to play web apps right inside of Gemini, but there is no real file system to host more complex apps. So, with the same prompt, what's the difference? All right. Well, here is Gemini 3.1 Pro sort of disappointing everybody. The scaffolding that an AI model has access to is now becoming a huge part of the equation for the types of things you're able to produce, especially with code. It wrote a 550-line monolith. The AI Studio version just has way more code: separate files for the UI versus the game loop versus the claw machine. You're giving it a bigger playground.

Oh yeah, this is more what I'm talking about. That looks like a real claw machine. Wow. And the graphics are really good. That was something that I focused on a little bit in the prompt. I don't know what that is in the back corner over there. It's definitely got a few mistakes. I like how much I can zoom in, though, and kind of see the reflective toys in there. They look tantalizing. All right, let's play for one coin. What happens? Oh, okay. It is returning with no prize. Okay, it does have sounds like I asked for. And you'll actually see there are some warnings here as well. It is not finished. It is definitely not complete, but it is a lot closer than just throwing a prompt into Gemini normally. Okay, so I can move the claw machine with W and D. Let's try and go for that ball on the side, this blue one. Let's see if we can get that.

Okay, the claw doesn't really extend; it sort of extends, like, way too late. I don't think we can actually get any prizes out of it. But I got to say, I'm pretty impressed with the visuals. This is a good start. So if I press this download button on the side, yeah, what it does is provide you with a full zip file that contains the entire project. You could take this, extract it to your desktop, open it up in Antigravity, Claude Code, or Codex, modify it, enhance it, and keep working. Not only is this a great way to start a project for those that are new to creating and coding with AI, but it gives you the tools to take it somewhere else when that person is ready to upgrade out of this interface.

All right, guys. I think that's going to do it for today. There is also a new Krea agent that can build custom workflows in Krea, which I found very intriguing. But I think I'm going to leave that one for tomorrow. I want to thank you guys so much for watching, and if you try any of this out today, I would love to hear your opinion. The world model is the most experimental of the bunch, where Stitch feels like the most well-rounded product. From a technical perspective, though, you can't beat the world model, especially the fact that it's open source. I have yet to run a world model locally, but I have a feeling that that one might actually be my first. Thanks so much for watching, guys.
