Gemini Canvas Brings Vibe Coding to Everyone (FREE!)
By Mark Kashef
Summary
## Key takeaways

- **One-Prompt Fitness App Build**: I built this rep counting fitness app with one basic prompt. First try, zero debugging. [00:00], [00:15]
- **Video Input Replaces Prompts**: Theoretically, you can now screen record whatever app you're trying to build and feed that as input instead of a prompt. If a picture is worth a thousand words, a video is worth a million. [01:29], [01:46]
- **AI Product Manager Auto-Integrates**: It's literally doing the job of an AI product manager, but it's aware of all the code and the nuance in the code to come up with not just the recommendations, but actually implement them. It added AI form tips and a weekly plan generator using the Gemini API. [03:01], [03:55]
- **Recreate Apps from Loom Video**: I recorded a Loom of the voice memo app audiopen.ai, uploaded it as input, and recreated the experience verbatim using the Web Speech API, including real-time dictation, transcription, and downloads. In one shot, it worked, and we fixed timing issues quickly. [04:50], [06:49]
- **Persist Transcript History Easily**: Every time I record a voice note, I want it to be logged on screen with a limit of 10 transcriptions that persist, showing the recording date, name, and what was said so I can redownload it later. Now I have history with copy and download per item. [07:04], [07:47]
- **One-Shot 3D Tetris Prototype**: For example, you can even build games with this where all I said was build a 3D interactive Tetris game, and when you click start, you actually have a functional game that you can go left and right and start to build and play with. [07:39], [08:06]
Topics Covered
- Gemini Canvas builds prototypes instantly?
- Canvas acts as an AI product manager?
- Video input trumps prompts for app replication?
- Iterate apps with video demos and history?
Full Transcript
I built this rep counting fitness app with one basic prompt. First try, zero debugging.
Vibe coding companies should be worried because Google just dropped Gemini Canvas, a feature that essentially allows you to build prototype-ready apps out of the box in seconds.
And the craziest part is that this is just a feature add-on in the same ecosystem where you can run deep research, use Nano Banana, and even feed in video input.
So, in this video, I'm going to walk you through where to find Gemini Canvas, how to use it, and most importantly, I'm going to show you one of my favorite hacks that I used to use in the old world of vibe coding that you can now do natively with this feature.
Let's dive in. All right, so if you go into gemini.google.com and then pick your model of choice, you can use 2.5 Flash or Pro.
Pro is obviously a lot more capable at actually building this. So once you have that enabled, if you navigate to the tools section of the page, you can select Canvas right here. And as soon as you enable it, you can send whatever prompt you want, and then you'll have essentially a version of Lovable or Bolt or Base44 at your fingertips.
But because you're in the Google ecosystem, you can still access all the other tools like deep research, creating videos, creating images, and guided learning.
And you'll also be able to add whatever files you want, including video input, which is still specific to the Gemini models because they're the only language models as of today that accept video input as context. Now, why am I harping on that so much?
Well, theoretically, you can now screen record whatever app you're trying to build and feed that as input instead of a prompt.
And as we know, if a picture is worth a thousand words, a video is worth a million.
But more on that later.
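Before we get there, for a sense of what "video as input" looks like programmatically rather than through the gemini.google.com UI, here is a minimal sketch using the @google/genai Node SDK's documented upload-then-reference pattern. The model name, file name, prompt, and polling interval are assumptions, not something shown in the video:

```ts
import { GoogleGenAI, createUserContent, createPartFromUri } from "@google/genai";

// Minimal sketch: hand Gemini a screen recording as context instead of a prompt.
// Assumes GEMINI_API_KEY is set and demo.mp4 is the Loom-style recording.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function recreateFromVideo(): Promise<void> {
  // Upload the recording via the Files API...
  let file = await ai.files.upload({ file: "demo.mp4" });

  // Video uploads are processed asynchronously; poll until the file is ready.
  while (file.state?.toString() === "PROCESSING") {
    await new Promise((r) => setTimeout(r, 2000));
    file = await ai.files.get({ name: file.name! });
  }

  // ...then reference it by URI alongside a short instruction.
  const response = await ai.models.generateContent({
    model: "gemini-2.5-flash",
    contents: createUserContent([
      createPartFromUri(file.uri!, file.mimeType!),
      "Recreate the app shown in this screen recording as a single-page web app.",
    ]),
  });
  console.log(response.text);
}

recreateFromVideo();
```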
Once you have Canvas enabled, all you have to do is send a prompt. And like I said, this prompt is really not my best.
So, I just said real time camera-based fitness coach that gives instant feedback on user movement and adapts workouts live.
I want to be able to see through webcam on my Mac. So, I literally wrote like a caveman here.
And in one shot, it was able to function. And you can see right here, I can interact within the screen itself.
And all we have to do is click on start.
And then, just like you saw in the intro, it'll look just like that.
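For reference, the webcam piece of an app like this boils down to the browser's getUserMedia call, which is what triggers the camera permission prompt. A minimal sketch, with a hypothetical video element id:

```ts
// Minimal sketch: the webcam access a Canvas-generated fitness app relies on.
// The element id "#coach-feed" is hypothetical.
const video = document.querySelector<HTMLVideoElement>("#coach-feed")!;

navigator.mediaDevices
  .getUserMedia({ video: true }) // triggers the browser permission prompt
  .then((stream) => {
    video.srcObject = stream; // pipe the live camera feed into the page
    return video.play();
  })
  .catch((err) => console.error("Webcam access denied:", err));
```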
Now, very similar to the vibe coding apps, you can access the preview and you can share the app. So, if you click on share, then click on share here and copy, you can open this up in an incognito browser very easily. So, similar to deploying, you technically don't need to deploy to a Netlify or what have you, because you can access it right here. And on top of that, obviously this is not cranking out production-level apps.
These are prototypes or MVPs at least for now.
But if you want to take the code, you can click on the code tab right here, copy this, and you can take it to your Cursor, your Claude Code, your Windsurf, or what have you to really take it to the next level. And you can see here, you can go back and forth between different versions.
So you can restore a prior version.
And if you go back to that version, you can still go up.
So you still have that ability to track changes over time. And one very interesting feature is at the bottom right here.
And you can actually move this around.
And what this does is, for whatever app you build, if you want to add AI to the app or come up with AI ideas, and you don't want to go to a product manager or an AI product manager, you can click on this and it will brainstorm what would make the most sense to integrate Gemini models into this app to take it to the next level. So you can see here it says: I'm currently assessing the architectural implications of integrating Gemini into the AI fitness coach app. I've reviewed the existing codebase, expanding integration possibilities, defining feature implementation.
So, it's literally doing the job of an AI product manager, but it's aware of all the code and the nuance in the code, so it can not just come up with the recommendations, but actually implement them.
And you can see here it's added AI form tips to the backend: a new button that calls the Gemini API to give you a detailed expert breakdown of the selected exercise, including proper form, common mistakes, and muscles worked.
And then we have the AI weekly plan generator. And this feature analyzes your workout history from the database and uses the Gemini API to generate a personalized 5-day workout plan.
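Under the hood, features like these come down to a call to the Gemini API's generateContent endpoint. Here is a minimal sketch of the kind of request an AI form tips button might fire; the function name, prompt wording, and model are assumptions rather than what Canvas actually generated:

```ts
// Minimal sketch of the kind of Gemini API call an "AI Form Tips" button
// could make. Prompt wording and model name are assumptions.
const GEMINI_API_KEY = "YOUR_API_KEY"; // placeholder

async function getFormTips(exercise: string): Promise<string> {
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    `gemini-2.5-flash:generateContent?key=${GEMINI_API_KEY}`;

  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      contents: [{
        parts: [{
          text:
            `Give an expert breakdown of the ${exercise}: ` +
            "proper form, common mistakes, and muscles worked.",
        }],
      }],
    }),
  });

  const data = await res.json();
  // The generated text lives in the first candidate's first content part.
  return data.candidates[0].content.parts[0].text;
}
```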
And all we have to do is just wait for it to finish. And you can tell from here and here that it's implementing these brand new features. And we'll see what it looks like. And as soon as it finishes, it has some form of thinking tab.
And as I go down, it's now added a new feature that will be able to look at what I'm doing and in real time give me feedback as to how awful I am at bicep curls.
But this is an example of one that's already built. Let's build one together.
And let me show you the cheat code of cheat codes to hacking time and prompt engineering by uploading a video input.
So this is an app that made a lot of money a few years ago, back when AI was first taking off.
It's called audiopen.ai (not affiliated), but the way it works is literally you just record yourself dictating, and it transcribes that and allows you to store it and download it.
So theoretically, if I wanted to recreate the majority of this experience, all I have to do is record a Loom, which I did right here.
Let me just play it for you on mute.
And you can see here I went through the app, just recording exactly how it works: the functionality, the fact that when I upload or dictate something, it transcribes it and shows this kind of popup. All I did was record this, then download the video, and then upload the video as input.
So, we'll do it together. So, I'm just going to download this video. We'll go into a brand new session. Let's open up a new chat, and then let's go to tools and then Canvas.
And let me upload this video as context.
All right. And within 5 seconds, we're uploading it.
And I'll say the following. Okay. So, I want to recreate this app verbatim. I want to use something like the Web Speech API in my browser to be able to dictate in real time.
And I want the look and feel to look exactly the same. Make sure that when I click on this little voice memo icon that I can dictate, it has that same countdown effect, but I can manually stop it to trigger the transcription.
And ideally, I want to be able to either copy or download the actual transcription text onto my computer as something like a text file or similar.
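For context, the Web Speech API the prompt asks for is built into Chromium browsers, and the core of real-time dictation is short. A minimal sketch, assuming the webkit-prefixed constructor Chrome exposes:

```ts
// Minimal sketch of real-time dictation with the Web Speech API.
// Chrome exposes the constructor under a webkit prefix.
const SpeechRecognitionCtor =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionCtor();
recognition.continuous = true;     // keep listening until stopped manually
recognition.interimResults = true; // stream partial text while speaking

let transcript = "";
recognition.onresult = (event: any) => {
  // Append only finalized segments; interim ones just repaint the live view.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    if (event.results[i].isFinal) transcript += event.results[i][0].transcript;
  }
};

recognition.start(); // triggers the browser's microphone permission prompt
```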
So, we'll send this over.
And I dictated that with Wispr Flow, but Gemini has a native feature where you can actually just use audio as well. So, we'll send that over and we'll see what we get.
All right. So, in one shot, we got something like this.
And if I navigate to the bottom right here and click on the voice icon, I'll click on allow. Hello. Hello.
My name is Mark and I love AI so much, which is just so good and moving so fast.
I'm getting overwhelmed. I want to cry myself to sleep. And if I stop that, what happens?
Nothing. So, let me try giving it some feedback to make it a little bit better. All right. So, I said it's not transcribing, and it says, "Of course, it looks like you ran into a common issue with the timing on the Web Speech API." So, let's just skip the rest and let's try it out. Hey, this is another test that I want to see if the transcription is working. Okay, so now it's processing.
Okay, there we go.
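The timing issue it flagged is a classic Web Speech API gotcha: final results can arrive after stop() is called, so processing the transcript on the stop click misses them. The usual fix is to hand off in the onend callback; a rough sketch continuing the one above, with hypothetical UI hooks declared as placeholders:

```ts
// Continuing the dictation sketch above: final results can land after
// stop() is called, so hand the transcript to the UI in onend, not on the click.
declare const recognition: any;          // from the earlier sketch
declare let transcript: string;          // accumulated final text
declare function showTranscription(text: string): void; // hypothetical UI hook

recognition.onend = () => showTranscription(transcript);

document.querySelector("#stop-btn")!.addEventListener("click", () => {
  recognition.stop(); // onend fires once recognition fully winds down
});
```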
And can I download this? Boom. And does the download actually work? Yes, it does.
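The copy and download part is plain browser plumbing: a Blob plus a temporary anchor element. A minimal sketch, with an assumed file name:

```ts
// Minimal sketch: save a transcription string as a .txt download.
function downloadTranscript(text: string, filename = "transcription.txt"): void {
  const blob = new Blob([text], { type: "text/plain" });
  const url = URL.createObjectURL(blob);

  const a = document.createElement("a");
  a.href = url;
  a.download = filename;
  a.click();

  URL.revokeObjectURL(url); // free the temporary object URL
}
```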
Now, did I copy all that functionality from that app in one shot? Probably not.
But can we keep adding to it? Can we add things like history? We can. So, let me see if I can maintain history in the app itself.
Okay. So, every time I record a voice note, I want it to be logged on screen.
So, similar to the app I gave you, let's do a limit of 10 transcriptions that persist, but I want to be able to see not just the recording date and the name of the recording, but what I actually said so I can redownload it later.
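Persistence like this typically comes down to localStorage. A sketch of how a capped history of 10 entries might be stored; the key name and record shape are assumptions, not what Canvas actually generated:

```ts
// Sketch: persist the last 10 transcriptions in localStorage.
// The key name and record shape are assumptions.
interface NoteRecord {
  name: string;
  recordedAt: string; // ISO date string
  text: string;       // the full transcription, so it can be redownloaded
}

const HISTORY_KEY = "voiceNoteHistory";

function saveNote(note: NoteRecord): void {
  const history: NoteRecord[] =
    JSON.parse(localStorage.getItem(HISTORY_KEY) ?? "[]");
  history.unshift(note); // newest first
  history.splice(10);    // enforce the 10-item cap
  localStorage.setItem(HISTORY_KEY, JSON.stringify(history));
}

function loadHistory(): NoteRecord[] {
  return JSON.parse(localStorage.getItem(HISTORY_KEY) ?? "[]");
}
```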
All right. And now, if I enter another one, here's another test.
Please store this. I'll stop. It should ask me to save the note. And now, when I save the note, I now have this recording and this recording.
And I should be able to go back to the original and either copy or download it. And you can really let your imagination run wild.
For example, you can even build games with this where all I said was build a 3D interactive Tetris game.
And when you click start, you actually have a functional game that you can go left and right and start to build and play with. So stuff like this can really disrupt the vibe coding ecosystem.
Because if model providers can offer the features, the models, and the experience themselves, then things get really interesting. If you found that helpful, let me know in the comments below.
And if you're vibe coding and you're looking to take your prototype to the next level using something like Claude Code, we've been cranking out tons of content and resources on how to take apps just like this and push them further in my early AI adopters community. So if that interests you, check out the first link in the description below.
Otherwise, I'll see you in the next one.