How to Use Gemini 3.1 Pro Better than 99% of People
By Parker Prompts
Summary
Topics Covered
- The 1M Token Context Enables Scale No Tool Can Match
- Veo 3.1 Generates Seamlessly Synced Audio With AI Video
- Deep Research Removes the Productivity Ceiling Most Users Never Push Past
- Canvas Turns Gemini From Chatbot Into Collaborative Editor
- Gems and Scheduled Actions Make Gemini Work While You Sleep
Full Transcript
Gemini 3.1 Pro just produced this 21-page research report in 3 minutes, pulled from over 30 sources with full citations.
That's the deep research feature running on Google's newest model, and it's one of the eight capabilities I'm going to show you today that'll put you ahead of 99% of the people using this tool. I'm
going to head over to Gemini and select the Pro model from the dropdown to get us on the new 3.1 version. And the first capability we will go through is Gemini 3.1 Pro's image analysis. I'll upload a screenshot of a performance dashboard filled with dozens of metrics spread all over the page. Then I will tell Gemini to extract every metric, calculate the month-over-month trends, and flag any anomalies in a table format. Gemini reads the layout, pulls every number, calculates trends, and organizes the whole thing into a table. It flags a projected revenue dip in May that I had completely missed. That's what the 38% reduction in hallucinations in the new Gemini 3.1 Pro actually means in practice, because in the previous version, that flag either wouldn't show up or it would have come with a wrong number attached.

And this works for almost any visual input. You can hand it unstructured documents, messy notes, or random screenshots, and it will still extract and organize the details with that exact same precision. I'll upload a photo of a hand-drawn system architecture from a brainstorming session and type, "Convert this into a clean list of components, map the connections between them, and flag any missing dependencies," and it reads my handwriting, identifies every component, maps how they connect, and catches three dependencies I forgot to draw in. It proves you aren't limited to clean screenshots or charts. It actually interprets raw, messy ideas just as well.
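For anyone who would rather script this kind of extraction than work in the chat UI, the same request can be made through the Gemini API. Here is a minimal sketch using the google-genai Python SDK; the screenshot path and model ID are placeholders, not anything shown in the video.

```python
# Minimal sketch: dashboard-screenshot analysis through the Gemini API.
# Assumes `pip install google-genai pillow` and an API key in the environment.
from google import genai
from PIL import Image

client = genai.Client()  # picks up the API key from the environment

dashboard = Image.open("performance_dashboard.png")  # hypothetical screenshot

prompt = (
    "Extract every metric, calculate the month-over-month trends, "
    "and flag any anomalies in a table format."
)

# Images can be passed directly alongside the text prompt.
response = client.models.generate_content(
    model="gemini-2.5-pro",  # placeholder; use whichever Gemini model you have access to
    contents=[dashboard, prompt],
)
print(response.text)
```

The same pattern covers the hand-drawn architecture photo: swap the image and change the prompt text.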
The second capability is video analysis. I recently needed to extract every feature from a 1-hour product demo, and doing that manually means scrubbing through the video, pausing, taking notes, going back to check timestamps, and still missing things that were mentioned once in passing. Instead, I paste the YouTube link directly into Gemini and type, "Watch this demo, extract every feature mentioned, organize by category, and include timestamps." Gemini processes the full video across the entire timeline, tracking topics as they develop instead of pulling random frames, and I get a structured breakdown with exact timestamps organized into categories. It picks up on details that get rushed right past and folds them into the breakdown, which is exactly the kind of thing you would miss watching the video by yourself.

And once that breakdown exists, I can keep prompting against it. I'll type, "Now compare the features mentioned in the first half versus the second half and tell me which ones got the most screen time." Gemini already has the full video in context, so it cross-references its own analysis and gives me a ranked comparison without reprocessing anything. That turns a single extraction into an ongoing conversation where I can keep pulling insights out of the same video from different angles.
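The API accepts public YouTube links as input too, so the same extraction can run outside the app. A hedged sketch, with the URL and model ID as placeholders:

```python
# Sketch: extracting features and timestamps from a YouTube demo via the Gemini API.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-pro",  # placeholder model ID
    contents=types.Content(parts=[
        # A public YouTube URL can be referenced directly as file data.
        types.Part(file_data=types.FileData(
            file_uri="https://www.youtube.com/watch?v=XXXXXXXXXXX")),
        types.Part(text="Watch this demo, extract every feature mentioned, "
                        "organize by category, and include timestamps."),
    ]),
)
print(response.text)
```

Because the whole video stays in context, follow-up questions like the first-half-versus-second-half comparison can be sent as later turns in a chat session instead of reprocessing the file.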
The third capability is audio analysis, and this one scales in a way that nothing else I've tested can match. Imagine you have 10 quarterly earnings calls, and you need to find every mention of a specific topic across all of them and track how the conversation shifted over time. Gemini 3.1 Pro has a 1 million token context window, which means I can upload all 10 recordings in one prompt and ask it to find every prospecting strategy and service mentioned. Gemini scans everything, cross-references across files, and gives me a structured summary with citations pointing to specific moments in each recording.
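Scripting the same multi-recording pass is mostly a matter of uploading the files once through the Files API and referencing all of them in a single request. A sketch, with hypothetical file names and a placeholder model ID:

```python
# Sketch: cross-referencing ten earnings-call recordings in one long-context request.
from google import genai

client = genai.Client()

# Upload each recording through the Files API so they can share one prompt.
calls = [
    client.files.upload(file=f"earnings_call_q{i}.mp3")  # hypothetical file names
    for i in range(1, 11)
]

prompt = (
    "Across these recordings, find every mention of our prospecting strategies "
    "and services, track how the discussion shifted over time, and cite the "
    "recording and approximate timestamp for each mention."
)

response = client.models.generate_content(
    model="gemini-2.5-pro",  # placeholder model ID
    contents=[*calls, prompt],
)
print(response.text)
```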
Those first three capabilities alone will save you hours of manual reading and organizing. But while they are incredible at pulling information out of files that already exist, they cannot build anything from scratch. That is exactly where the next two capabilities come in, letting you generate entirely new content without ever leaving the interface. The fourth
capability is Nano Banana Pro, which is Google's image generation model built on Gemini. I'm going to show this inside Flow, which you can find at labs.google/flow, and it's Google's AI filmmaking workspace. The standout feature here is actually text rendering. While most image generators still struggle with text, spitting out misspelled or completely garbled words, Nano Banana Pro gets it right and keeps the text perfectly clean. I'll paste in a prompt asking for a professional event poster for a tech conference with specific dates, a dark gradient background, and clean typography. The text lands exactly where it should, spelled and positioned correctly, and I can refine it simply by saying, "Make the gradient warmer and move the date to the bottom left," and it adjusts without regenerating the whole image.
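Flow is the polished way to do this, but Gemini's image models are also reachable through the same API. A hedged sketch; the model ID below is a stand-in for whichever Nano Banana variant you have access to, and the prompt details are made up:

```python
# Sketch: generating a poster image via the Gemini API and saving the result.
from google import genai

client = genai.Client()

prompt = (
    "A professional event poster for a tech conference on March 12-14, "  # placeholder dates
    "dark gradient background, clean typography."
)

response = client.models.generate_content(
    model="gemini-2.5-flash-image",  # placeholder for the Nano Banana model ID
    contents=prompt,
)

# Image output comes back as inline data parts alongside any text.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("poster.png", "wb") as f:
            f.write(part.inline_data.data)
```

Refinements like the warmer gradient are just another turn in the conversation, with the previous image included in the request.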
The fifth capability is Veo 3.1, which is Google's video generation model, and this one is also inside Flow. It creates seamless AI videos with native audio, so the dialogue, sound effects, and ambient noise are all generated and synced automatically. I'll just paste this prompt in and press enter. And that looks really good right out of the gate, and you can hear how well the audio actually lines up with the motion. One really useful feature in Flow is frames to video, which lets you upload up to two reference images to control the characters, style, and setting of the generated video. So, instead of describing everything from scratch, you feed it a reference photo of a person or environment, and Veo builds the video around those inputs.
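Veo is also exposed through the API, where generation runs as a long-running operation rather than a single call. The sketch below follows that pattern; treat the model ID, the prompt, and the exact response fields as assumptions that may differ from the current SDK surface.

```python
# Sketch: text-to-video generation with Veo through the Gemini API.
import time
from google import genai

client = genai.Client()

# Video generation is a long-running operation rather than a single response.
operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # placeholder model ID
    prompt=("A barista pours latte art in a sunlit cafe; ambient chatter, "
            "espresso machine hiss, and soft jazz in the background."),
)

# Poll until the job finishes; the completed operation carries the generated clip(s).
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

print("Video generation finished.")
```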
Now, real quick before we continue, a huge thank you to Higsfield for sponsoring this video. If you want access to Nano Banana, Veo 3.1, and dozens of other generation models all in one workspace, Higsfield brings everything together. So, instead of using Flow for Google generations and then opening up other software for different models or editing, Higsfield puts all of it in one place where you can generate, iterate, and export without jumping between tabs. For anyone doing content creation or production work at any scale, having everything together removes a lot of friction from the workflow. The link is in the description if you want to try Higsfield yourself.
Every capability so far is a back and forth: when I type a prompt, Gemini responds, which means the second I stop working, Gemini stops, too. That's a ceiling most people never push past, and the sixth capability removes it entirely. Deep Research is an autonomous agent built into Gemini. You give it a question, and it goes out on its own for up to 20 minutes, browsing hundreds of websites, reading full articles, cross-referencing sources, and compiling cited multi-page reports.
I'll paste in a research prompt asking it to compare the top five open-source AI fine-tuning tools by ease of use, supported architectures, and community activity. Gemini generates a research plan first, shows me what it's going to cover, and lets me adjust before it starts. Once I approve, it starts searching, and a few minutes later I get a structured report with sections, source links, and comparative analysis that would have taken me three to four hours to put together manually. The reason the report quality is this high is that Gemini 3.1 Pro's reasoning scores jumped from 31% to 77% in a single update, and independent testing ranks it number one overall right now, ahead of both Claude and ChatGPT.

What's new since the last version is that Deep Research can now search your Gmail, Google Drive, and Google Chat alongside the open web, so it can pull internal documents and combine them with public research in the same report.
The seventh capability is workspace integration, and it ties directly into Deep Research. If you enable the extensions, Gemini connects to your Gmail, Calendar, and Drive, and you can query all of them conversationally. I'll type, "Check my calendar for tomorrow, find any emails related to those meetings I haven't replied to, and draft a response for each one." Gemini pulls the events, cross-references them with your inbox, and generates drafts grounded in the actual email threads. That's your morning email routine done in one prompt instead of 30 minutes of switching between tabs.

Deep Research and workspace cover a lot of ground, but they still require me to sit down, type a prompt, and wait for the result. The next section is where Gemini starts working even when I'm not in the room.
Instead of moving Gemini's text output into Google Docs or dragging a script over to your code editor, Canvas gives you an interactive workflow where you edit everything live, right where you are. I will paste a prompt asking for a welcome email sequence for new users, and the screen immediately splits. On the left is my chat, and on the right Canvas opens up a clean text editor with the draft. I can highlight just the second email, tell Gemini to make it sound more professional, and it updates that specific text right on the page without regenerating the rest of the email. For code, I can ask for a business dashboard and the code appears on the right, where I can test it in real time. You're actively shaping the output alongside the AI instead of just waiting for a final draft.

And for bigger projects, there is build mode. I can head over to aistudio.google.com and type a prompt saying to build a fully working timer with a dark mode toggle and a task list. In seconds, it generates the code and renders a live, clickable preview of the app right next to it. I can test the timer, ask the AI to change the accent color to blue, and watch the app update instantly. It's right there on the free tier. And for anyone who has been generating code in ChatGPT or Claude and then piecing it together somewhere else, this skips that step entirely.
Now, picture opening Gemini on a Monday morning and finding a research summary already waiting for you, compiled overnight from sources you set up once and haven't thought about since. That's what the last two capabilities make possible. The first one is Gems. Inside Gemini, I click Gems in the left sidebar to open the manager, where I can see pre-made options like coding partner or learning coach. To build my own, I click new gem, give it a name, and write instructions describing what I want it to do. I can upload reference files under the knowledge section to give it more context, and there's a magic wand button that rewrites my instructions into a more detailed version if I need help. Once I save it, every conversation with that gem starts with those specific rules. So instead of re-explaining my project every time I open Gemini, I have a gem that already knows my background and goals.
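If you build against the API instead of the app, the closest analogue to a gem is a reusable system instruction (plus any reference material you prepend yourself). A minimal sketch, with the instruction text standing in for whatever your gem would say:

```python
# Sketch: a reusable, gem-style persona via a system instruction in the Gemini API.
from google import genai
from google.genai import types

client = genai.Client()

# Stand-in for the instructions you would write into a gem.
GEM_INSTRUCTIONS = (
    "You are my research assistant for an AI newsletter project. "
    "I care most about open-source fine-tuning tools and agent frameworks. "
    "Answer with short bullet points and cite sources."
)

response = client.models.generate_content(
    model="gemini-2.5-pro",  # placeholder model ID
    contents="Summarize the most important AI releases this week.",
    config=types.GenerateContentConfig(system_instruction=GEM_INSTRUCTIONS),
)
print(response.text)
```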
But a gem by itself still waits for me to show up and type something. The
second piece removes that completely.
Inside the app, I can take one of my custom gems and connect it to a scheduled action by typing a prompt like, "Every Monday at 8 a.m., use your knowledge of my project to search for the top AI research papers from the last
seven days and email me a summary of the three most relevant ones." Gemini
creates a scheduled action from that, and it runs automatically every week without me ever opening the app.
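The Gemini app handles the scheduling for you; if you wanted a rough self-hosted equivalent, the same weekly job can be approximated with the API's Google Search grounding tool plus an ordinary cron entry. This is a sketch of the idea, not the Scheduled Actions feature itself, and the model ID and instruction text are assumptions:

```python
# weekly_digest.py -- sketch of a self-hosted stand-in for a scheduled action.
# Run it from cron, e.g.:  0 8 * * 1  python weekly_digest.py
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-pro",  # placeholder model ID
    contents=(
        "Search for the top AI research papers from the last seven days and "
        "summarize the three most relevant to open-source fine-tuning."
    ),
    config=types.GenerateContentConfig(
        system_instruction="You are my research gem; keep summaries short and cited.",
        tools=[types.Tool(google_search=types.GoogleSearch())],  # grounding via Google Search
    ),
)

print(response.text)  # in practice you would email or post this output somewhere
```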
Between gems holding your context and scheduled actions running in the background, Gemini stops being just a chatbot and starts being an agent that works autonomously on your behalf. Deep Research with Gmail and Drive connected is the capability I keep coming back to, and it runs on a model that costs less than anything else at this level. If you
want access to Gemini's generation models, plus dozens of others in one workspace, check out Higsfield using the link in the description. And if you want to see what else Google offers for free beyond Gemini, I broke down seven of their best free AI tools in this video.
Click on the screen to watch it, and I'll see you there.