Google Quietly Made AI Building Way Easier
By MattVidPro
Topics Covered
- Open-Source World Models Enable Temporal Control
- Stitch AI Generates Coherent Multi-Page Designs
- Vibe Coding Scaffolding Beats Monolithic Prompts
Full Transcript
How's it going everyone? Welcome back to the Matt VidPro AI YouTube channel.
Today's video is all about cutting-edge AI research and tools that you can use today. We're going to get hands-on with a few of them, but let's kick things off with Inspacio. They have released Inspacio World. This is a world model that is completely open source. You can download the weights and code today and check it out. It looks like they do have a demo you can try for free. Obviously, the clips that you see right now are cherry-picked. They call this 4D because you do have temporal control. You can completely pause the quote-unquote world simulation, slow it down, or even rewind it. Let's take a look at this footage of the football being caught. It is quite coherent, especially compared to the other open-source world models we've come across. But the guy in the back, I don't know what he thinks he's doing. He's sort of hallucinating in the dark back there, and it looks like he doesn't have an arm. This fully simulated world technology is so early in its development, but researchers have a very good idea of what they need to focus on. This is a natural evolution from AI video, and things like physical realism are going to be paramount. However, long-term stability might actually be more important right now. You can see in this clip, which is sped up, it's pretty consistent for a while. This demonstration shows a man making a drink, even with some recognizable liquor bottles. Turn your head to the right and then look back, and the drink is actually still being poured, as you would expect. That's impressive. This
really reminds me of that Genie 3. So, in terms of real-world applications, something we're going to focus on today with world models: robotics, running realistic predictions, a similar scenario with autonomous driving, applying a real-world simulatory effect to anime, or you could even bring old memories to life and make them interactive. The coolest part about Inspacio, though, of course, is that it's open source under the Apache 2.0 license. Anyone can fork this, anyone can modify it, upgrade it, fine-tune it.
You can see this is built off of WAN 2.1, and it also uses a depth estimation model. The Inspacio models themselves are 14 billion parameters. But the real question is how much VRAM you need to run this. If you want to run this thing at home, it's looking like the 1.3B model is going to be your best bet. But that is far from the full-sized 14B.
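Those size numbers line up with simple arithmetic: the weight files alone scale linearly with parameter count. This is a back-of-envelope sketch, not a measurement; the 2-bytes-per-parameter figure assumes fp16/bf16 weights, and activations plus any companion models are extra on top.

```python
# Weights-only footprint in decimal GB: params (billions) x bytes/param.
# Assumes fp16/bf16 (2 bytes). Activations and companion models (VAE,
# text encoder) are NOT counted, which is how a 1.3B video model can
# still want roughly 10-12 GB of VRAM in practice.
def weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    return params_billions * bytes_per_param

print(weights_gb(1.3))  # 2.6  -> weights fit a consumer GPU with headroom
print(weights_gb(14))   # 28.0 -> weights alone overflow a 24 GB card
```

Quantizing to 8-bit or 4-bit halves or quarters the weight number, which is the usual trick for squeezing bigger models onto consumer cards.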
It definitely looks like it might have the capability to run locally at that size, though, around 10 to 12 GB of VRAM. That full 14B size, if you want this type of quality, is going to be running on the GPUs that are on server racks, not consumer-level hardware. But I like that they still provide that 1.3B option.

And of course, we cannot forget about the demo itself, which puts us right in as soon as you load the website. Okay, it's starting. I'm moving the camera around. I thought I was going to get to control this robot. It dropped an egg in there. What happens if I do this? Is this view? Oh, it is view. Okay, so I can look up. Can I look down? Oh, I'm reaching the bounds. I see. Well, let's go see this from another angle, then. Let's walk over to the right and then I'm going to turn the camera. I see. So, time is currently frozen right now. Yeah, I'm not going to lie: the delay in terms of interactivity is absolutely terrible. It's not really playable like a video game, but as a cool little demo, it is definitely pretty mind-blowing to explore a space that is fully dreamt up by AI on the fly. Let's go ahead and try the beach demo. Okay, so
is this our family? What are we, a little child on the beach? Let's try to go into the ocean. Bye, family. I don't want to see you. I don't care about your sand castles. Okay, it's definitely doing some slow motion trying to get us there. The inference isn't totally smooth, or at least the streaming isn't. Oh, we're getting two umbrellas over there. She's trying to touch the child's hair. Trying to walk past my fam. I want to go into the ocean, to the beach. I'm wondering what this limitation is where the blue borders come up and it doesn't let us walk where we would want to walk.

All right, let's get us a drink. Let's have this... uh oh, no. He's cooking us a steak. And there's a doggy. Oh, I want to see the doggy. Let me see the doggy. I don't care that you're making me a delicious steak, man. I just want to pet the dog. Oh, we can get nice and close. How close can we get? Oh, that's about it, huh? Yes. You're doing a great job, Mr. Chef, inside of what appears to be my house. Did I hire you? I just come here and sit with my dog. Well, very cool stuff. Very fun to play around with. I love the free demo. You guys should go give this a try before the servers are swamped.

Ah, there we go. Okay, so now the cup is being filled. Oh my god, it's just a Bluetooth stream of coffee, and it stops, and now time is frozen. It's like it's automatically freezing the time. I guess maybe that's a part of the demo. But you can see there's a lot less for it to hallucinate about in this scene, other than this very crazy lighting and this dramatic dark background. Oh, okay. That's cool, though, how it noticed it was probably on top of a little box or something. Well, that was awesome.
This is not the only interactive AI world we're talking about today. OpenArt launches OpenArt Worlds: a fully navigable 3D environment from a single prompt or image. You can step right inside it and capture shots exactly the way you envision them. This is much different from the world model we just looked at. Similarly, we start with an initial image, and the model generates a 3D world based off of it that you can navigate freely. But instead of the entire world living inside an AI model, it uses 360-degree image generation and depth estimation to produce a world that feels 3D. It's certainly not as advanced as a real world model, but it does allow you to step right inside of an image pretty much instantly. And really what they're trying to do is combine this with Nano Banana. You can move around your 3D environment, find your perfect angle, start putting characters right into it, and then generate videos with image-to-video. It allows you to do multi-shot, but overall, I'd say this is the least interesting of the bunch today. So, moving forward, let's talk
today. So, moving forward, let's talk about Google. They just released two
about Google. They just released two updates to AI focused products. The
first one is Stitch. You may have heard of this. This is a design application
of this. This is a design application powered by AI, and the recent update has made it feel very cohesive, easy to use, and honestly, just fun. The main focus
of this is to design apps or design websites, but you could definitely use it for so much more. And the way that they've integrated everything, especially with the focus on design and
adhering to a specific style, is going to take this product very far, I think.
And in traditional Google fashion, from what I can tell, this is free to use and try out today, allowing everybody to get a firsthand taste at what AI fueled design can really be like. You know
what? I'm going to jump right in here.
give you guys a little bit of a tutorial on how to get started with Stitch, what it is capable of. We've got that ever so familiar prompt interface, but we've got a slider down here for app versus web. I
think it would also be nice to have maybe design both as an option.
Regardless, for now, I'm going to pick web. You can attach screenshots,
web. You can attach screenshots, sketches, or visual inspiration. So, you
could literally on, you know, a napkin and a pen draw out what you think the design of the website should be. the
agent will receive those photo tokens in the embedding space and actually generate your design. But hey, you could still do that sort of thing inside of chat GPT or the Gemini app. So why use
Stitch? Well, a major reason you would
Stitch? Well, a major reason you would is for the design systems. It can maintain coherent designs across multiple pages on a website of all different formats. And you can see there
different formats. And you can see there are some presets, right? Alexandria,
Glacier, Neon Tokyo, or you can merely let Stitch decide. And Stitch can even create its own designs. I'm going to roll with a simple prompt. Rustic
medieval themed website for selling odd trinkets and rarities. And let's send that off. So once you send a prompt, it
that off. So once you send a prompt, it opens you up into a blank canvas. And
you can see there is a sprawled out little UI design. The agent lives over here on the left hand side. And
apparently something unexpected happened. Well, regardless, right down
happened. Well, regardless, right down underneath you also see every prompt that you send through in a log. And
honestly, I love having this. I wish
this was a standard feature inside of chat GPT and Claude, keeping track of every prompt I send so I can hotkey to them, so I can click on them and it immediately takes me to that part. This
is like a macro view that tracks everything that happens as you work. And
not having this even in a basic chat interface, I honestly think is a missed opportunity. Okay. Well, let's give it
opportunity. Okay. Well, let's give it another shot. This time, hopefully the
another shot. This time, hopefully the agent doesn't get locked out of API jail. Okay. And you can see this time it
jail. Okay. And you can see this time it is generating. I think they've learned a
is generating. I think they've learned a lot about building agents just by working on Google Antigravity. It gives you a basic layout of what it wants to do with your idea. Then it builds a custom design, and this is a real, usable spec. You can see their primary, secondary, tertiary, and neutral colors; headline, body, and label fonts; and a few different icons. And this is not one of the pre-made sets. This is a completely unique design system that it devised itself, and it is fully editable and controllable by you, down to the very fonts. You're also going to notice at the top there's a design MD, and this is essentially a prompt for the design and how to use it. It's got dos and don'ts, input fields, buttons, components. Gemini makes its own design.
In this case, you can make and upload custom design specs. So, if you were already working on a website, you would be able to translate that over.

All right, it looks like our initial generations are done. So, what exactly are we looking at? Well, these are actually two separate generations for the same prompt, allowing us to decide which one we like better. These look like images, but they actually aren't. They are fully interactive, coded websites. This grand foyer homepage, right? If I really like this right now, I can go down here, click the download button, and I get a zip file with the design MD and the code. I can open that right up and view the website. The approach they took to empowering the agent with design capabilities seriously pays off. It can be something as silly as what I said, but it nails it. The Scribe's Omen: a quill that writes the truth even when the hands wish to lie. And it uses Nano Banana to generate some imagery to pair with the website, all automatic. Ah, the flickering lantern. I can acquire this artifact for 45 gold. It's a silly example, but it shows you the capabilities, and if you really do need to build a website, this is so fast and so professional. What I'm going to talk about next is how easy it is to edit, modify, and really bring this to life, or even put it into Claude or ChatGPT.

All right, so let's say we're just in love with this second variant. Everything you see is built, but you wouldn't be able to click on the grand foyer and open it up, for example, or the vault, or the scribe, rarity, and origin. All this stuff, it's not actually built yet. It's just there for show; it's the basic design principles. So, we'll simply click to highlight our site and, in the prompt: build the rest of the website pages (grand foyer, the vault, the scribe), rarity sorting should work, origin should work, double the amount of products in the cabinet, and add a review page with satisfied medieval users. And I will send that prompt on through. Now, instead of giving us two variants for everything, it's going to take this website we already approved, enhance it, and flesh it out.

While those are loading, I'll show off the new direct-edit feature, which gives you much more granular control over specific elements. Any active piece of the website can be clicked, oftentimes letting you edit text or change an image URL, but you always have the option to edit that specific portion with AI. It's cool. We have the macro control and the micro control right at our fingertips in a very easy-to-use interface. All right,
so it looks like the vault page has come through. Very close here to mimicking that original generation it spawned from, but you'll notice that this is italicized and in a different font. Here's our regeneration of the curiosity cabinet with the additions we asked for, so let's get rid of the old one. And it looks like the agent didn't make it all the way there. With the Gemini model, I'm not surprised that it ended up behaving this way, not completing the whole thing; GPT-5 and Claude Opus are a little bit more verbose. Align these to make sure they are part of the same website, and generate the other two sections of the site as well. And as you can see, we've got some more screens generating through as we wait. Just to give you an idea of what you can do with this: anybody, even a kid, could hop on here and make their own website, interactive learning tools, physics demos, storefronts. It seems it's just built the whole website in four total generations, but it's looking like it's following the design exactly now. So now we can delete these two precursors and we are left with a four-page website.
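The zip you download this way (the design MD plus the coded pages) is an ordinary archive of files. As a minimal sketch of poking at one locally, with the file and folder names being my assumptions rather than Stitch's documented export layout:

```python
# Sketch of inspecting a downloaded design zip. The names here
# ("export.zip", the "site" folder) are assumptions -- check the
# actual contents of whatever your export is called.
import pathlib
import zipfile

def extract_export(zip_path: str, out_dir: str = "site") -> list[str]:
    """Unzip an exported design and return the extracted file names."""
    pathlib.Path(out_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)  # e.g. the design MD plus the coded pages
        return zf.namelist()
```

From there, something like `python -m http.server --directory site` is enough to click through the pages in a browser, since the video shows the downloaded site opening directly.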
Exporting your website could not be easier, and they give you a ton of options. I can simply highlight all four pages and click on export. You could just grab a .zip and give it to literally any other AI agent. You can also directly export with AI Studio or do an instant prototype, which is what I'm going to do right now. What the instant prototype does is make something that you can explore on the fly right in Stitch. It's a different kind of interface which still gives you super granular editing control. There's a hotspot toggle on the side that shows you everything that can actually be clicked and interacted with on your page. So, if you wanted to click on this wax seal, for example, it's not going to do anything, because that's not really built into the website yet. But, as we asked for the curiosity cabinet, all of this is interchangeable and working as a normal site. So, the vault; it looks like this is the review section from all of the medieval people. Master Merchant Bram, Order of the Iron Rose. Oh, I love this review: "In all my travels from the frozen wastes to the Emerald Coast, I have never seen a merchant's work so imbued with the spirit of the ancients." Marcus the Wanderer. I've never seen an easier way to build a website with AI, and you can try it for free. Overall, this is a super welcome overhaul, and it makes it so easy for professionals and newbies alike. They've hit a great midpoint with a super simple, easy UI that offers very granular control.

Google has also upgraded the vibe coding experience in Google AI Studio. This is their API platform for AI. You can get your keys there, but you can also experiment with
models, mess with the settings, and do vibe coding. Here's what it actually looks like inside of Google AI Studio: a typical prompt box. We can dictate or we can add some files. They even brought back the "I'm feeling lucky" button to just suggest a random app to make, and they've got some fun remixes here. All right, I'm going to go ahead and do a physics-based claw machine that only gives you lemon-based toys. Let's see how this experience shakes out.

Typically, when I use AI to write and build things with code, I've got three main options. I prefer to do it on my desktop locally, because I'm downloading GitHub repos and downloading files, but this is definitely easier for getting started building apps than installing Claude Code, OpenAI Codex, or Google Antigravity. I'm already liking this setup, though. Again, it reminds me of Stitch in the way that it seems to be set up for you to export easily. Typically, when you quote-unquote vibe code with ChatGPT or Gemini or Claude on the website, through the chat interface, it can give you files like zips to download, but sometimes they aren't held on the server for too long and they'll disappear, or, in Gemini's case, you have to copy all the code and do it separately. Same thing with Grok. But you can see right here, it's clearly got access to its own virtual environment: the ability to make a JSON file, make other files, and put things in folders where they need to go. This feels very much like an online-only version of Antigravity, almost, especially where I can kind of flip through all of the different files and look at the code like this. Ah, here you
can see it already generated all of the prizes we'll be able to grab. They have their own glow colors as well as separate colors, shapes, and materials: a lemon slice or a lemon seed, a sour candy, perhaps a lemon gummy or a lemon shark. This is a little bit of an easier way for people unfamiliar with creating apps or coding to get into something like this, see how it's structured, and see how all of the files are created and the code that actually powers them and makes them work. AI tools and integrations like this one seriously bootstrap people who previously never would have thought to get into this sort of thing. It's important to remember the barriers that are being broken down. Even installing Antigravity as an application on your computer and then learning how to use it, since it's a fork of VS Code, is pretty complicated. This is simple. It's entirely online, and it still gives and shows you the core pieces of building a complex application.

The biggest question that I have right now is whether or not they want something like this to truly live inside of Google AI Studio, where people get API keys and test models out. I feel like this sort of thing actually needs to live maybe alongside Gemini, or maybe even as a web-based Antigravity. That could be a really cool idea: something very similar to this, but essentially Antigravity simplified for the web, with a quick export to the desktop version of Antigravity.

This has been going for like 7 minutes. What I will do is try this directly in the Gemini app with Pro in their current Canvas tool. It allows you to play web apps right inside of Gemini, but there is no real file system to host more complex apps. So, with the same prompt, what's the difference? All right. Well, here is Gemini 3.1 Pro sort of disappointing everybody. The scaffolding that an AI model has access to is now becoming a huge part of the equation for the types of things you're able to produce, especially with code.
It wrote a 550-line monolith. The AI Studio version just has way more code: separate files for the UI versus the game loop versus the claw machine. You're giving it a bigger
playground. Oh yeah, this is more what I'm talking about. That looks like a real claw machine. Wow. And the graphics are really good; that was something that I focused on a little bit in the prompt. I don't know what that is in the back corner over there. It's definitely got a few mistakes. I like how much I can zoom in, though, and kind of see the reflective toys in there. They look tantalizing. All right, let's play for one coin. What happens? Oh, okay. It is returning with no prize. Okay, it does have sounds, like I asked for. And you'll actually see there are some warnings here as well. It is not finished. It is definitely not complete, but it is a lot closer than just throwing a prompt into Gemini normally. Okay, so I can move the claw machine with W and D. Let's try and go for that ball on the side, this blue one. Let's see if we can get that. Okay, the claw doesn't really extend; it sort of extends way too late. I don't think we can actually get any prizes out of it. But I've got to say, I'm pretty impressed with the visuals. This is a good start.

So if I press this download button on the side, what it does is provide you with a full zip file that contains the entire project. You could take this, extract it to your desktop, open it up in Antigravity, Claude Code, or Codex, and modify it, enhance it, and keep working. Not only is this a great way to start a project for those that are new to creating and coding with AI, but it gives you the tools to take it somewhere else when you're ready to upgrade out of this interface. All
right, guys. I think that's going to do it for today. There is also a new Krea agent that can build custom workflows in Krea, which I found very intriguing, but I think I'm going to leave that one for tomorrow. I want to thank you guys so much for watching, and if you try any of this out today, I would love to hear your opinion. The world model is the most experimental of the bunch, whereas Stitch feels like the most well-rounded product. From a technical perspective, though, you can't beat the world model, especially the fact that it's open source. I have yet to run a world model locally, but I have a feeling that one might actually be my first. Thanks so much for watching, guys.