Data Analysis With AI In 21 Minutes
By Tina Huang
Summary
Key takeaways
- **AI aids human coordination and reduces errors.** AI can help bridge communication gaps and minimize mistakes in collaborative tasks by summarizing discussions and acting as a verification layer for human input. [01:35], [03:23]
- **Use AI for tedious tasks, not creative ones.** AI excels at automating repetitive and mundane data cleaning and visualization tasks, freeing up humans for more complex problem-solving and strategic thinking. [02:09], [04:16]
- **The DIG framework guides AI data analysis.** Approach AI data analysis using the DIG framework: Describe to understand the data, Introspect to uncover patterns and potential issues, and Goal-set to define clear objectives for the analysis. [07:31], [10:15]
- **AI enables complex data filtering beyond traditional tools.** AI can intelligently filter and analyze data based on nuanced criteria, such as job preferences for location and specific industry skills, which would be extremely difficult with standard tools. [13:45], [14:24]
- **AI can automate multimedia analysis and file organization.** AI can process various media formats, extract frames from videos, apply transformations, and even organize and rename large collections of files within zip archives. [17:05], [18:19]
- **Transform AI analyses into downloadable software.** Complex sequences of AI-driven data analysis steps can be automated by instructing the AI to generate a Python script, which can then be downloaded and run as an executable program. [19:13], [19:49]
Topics Covered
- The ACHIEVE Framework: When to Use AI for Data Analysis.
- Don't Skip Steps: The DIG Framework for AI Analysis.
- AI Filters Data Intelligently Beyond Traditional Tools.
- AI Automates Analysis and Ensures Reproducibility.
- Build AI Applications from Analysis, No Code.
Full Transcript
I learned how to do data analysis with
AI for you. I guess we can call it vibe
analyzing. But really though, I took 11
courses on this topic. What can I say? I
used to be a data scientist at Meta. I
love data. I use data every single day.
I'm going to save you the time and money
that I spent buying these courses and
give you the CliffsNotes version of what I
learned. As per usual, there'll be
little quizzes throughout this video.
So, pay attention. All right, let's go.
A portion of this video is sponsored by
LTX2. The outline of today's video is
first I'm going to cover when is it
useful to use AI for data analysis. Then
we'll talk about the DIG framework for
how to approach analysis. But of course
to make it all concrete we need some
examples. So I'll then be showing you
lots of examples. And finally I'll
explain how to take this even further
and take your data analysis and build it
out into dashboards or even AI
applications. I do want to make a note
that a lot of the courses and examples
are focused on using ChatGPT as the tool for data analysis. But that doesn't mean you have to use ChatGPT. In fact, you can swap it out, and Gemini or Claude would work the same way. Sometimes they actually work better. So, don't feel like you need to
be married to a single tool. And I'll
actually call out if there is a tool
that I think would work even better.
Okay, let's start off with when we
should be considering using AI for data
analysis. Well, from the course ChatGPT Advanced Data Analysis, Dr. Jules White, the instructor from Vanderbilt University, has an acronym for this called ACHIEVE. He explains that there are five different areas where it is useful to use AI in data analysis: aiding human coordination, cutting out tedious tasks, helping provide a safety net for humans, inspiring better productivity and problem solving, and enabling great ideas to scale faster. ACHIEVE. Aiding human
coordination refers to helping people
work better with each other because
people actually tend to be quite messy,
you know, and there's a lot of
miscommunications. There's a lot of like
back and forth between people. So,
there's a lot of room for improvement
here that AI can help with. Say, for
example, you're in a meeting with a
bunch of people and you have this
meeting transcript. You can actually put
it into the AI and say, "Act as an assistant.
Read the following meeting transcript
and provide me a summary of the key
points of discussion." This is just an
example. I'm sure you can think of a lot
of other scenarios where there is a
bunch of data that can be analyzed such
that you can provide more clarity for
humans. The second part of the framework
is to cut out tedious tasks. Just as that suggests, it's best to let AI do things that are very repetitive
and boring for people. For example, say
you're hosting a workshop and you ask
people to sign up for the workshop and
provide different types of information
like what their name is, what their
occupation is, which department that
they're in, what their interests are.
Instead of having to go through this and
manually analyze it, you can tell the AI
that this is the list of people that
registered for my workshop on prompt engineering and ChatGPT. Describe the data in this file. By the way, asking AI to describe data is a best practice
which we'll cover a little bit later.
But yes, so the AI will be like, okay,
like you know, this file contains this
type of information. It's a CSV file and
it contains like timestamp, name, the
email, the department, um this is a
university workshop, how they're using ChatGPT and other tools already, the role
that they hold, etc. You might notice
that people are filling in their department names with a lot of different variations.
So you can actually ask the AI, there
seems to be a lot of overlap between
departments with alternate spellings.
Can you list out all the departments and
then do some intelligent grouping of
them? And then you can ask it to create
a bar chart showing the total number of
registrations per department. This is
the kind of data cleaning and
visualization that is pretty mundane and
AI is able to do this much more quickly.
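As a rough sketch of what the AI typically does behind the scenes for this kind of request, something like the following pandas snippet could reproduce the cleanup and the chart. The registrations.csv file name, the Department column, and the grouping map are assumptions for illustration, not something from the course.

```python
# Minimal sketch of the department-cleanup-and-chart step, assuming a
# hypothetical registrations.csv with a "Department" column.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("registrations.csv")

# Collapse alternate spellings into canonical department names
# (in the chat workflow, the AI proposes this grouping itself).
dept_map = {
    "CS": "Computer Science",
    "Comp Sci": "Computer Science",
    "Biomed Eng": "Biomedical Engineering",
}
df["Department"] = df["Department"].str.strip().replace(dept_map)

# Bar chart of total registrations per department.
df["Department"].value_counts().plot(kind="bar", title="Registrations per Department")
plt.tight_layout()
plt.savefig("registrations_per_department.png")
```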
Third part of the framework is to help
provide a safety net for humans. You see
people often say like oh AI has a lot of
hallucinations and that is true. AI does
hallucinate but people hallucinate too.
People make a lot of mistakes, like some
really really dumb mistakes. I make dumb
mistakes like literally constantly.
Wrote my name wrong on a form yesterday
for example. So that is why having AI as
a backup is actually a really great
idea. Say for example, you're on a
business trip and you need to ask for a
reimbursement. So you need to like
generate some invoice thingy and then
make sure you have all the fields
covered. If you're anything like me, I
am not very detail-oriented. I probably
will make a really dumb mistake. So, as
a safety net, you can actually upload
this invoice into the AI along with the
business expense policy and ask read
each page of the attached business
expense policy and see if the attached
receipt complies with it. There are so many other examples of this: every time you need to submit an insurance claim, read through some sort of document, check a travel policy, whatever. So many examples of this. The
next part of the framework for when to
use AI for data analysis is the IEV
which is inspire better problem solving
and creativity. This is another thing.
People always feel like AI is going to
make people less creative and is going
to take over creative things, but that
is not the case. It's all about asking
the right questions. Say, for example,
you have a PowerPoint presentation that
is really, really important. You can
actually upload those slides into AI.
Ask them to quickly summarize it and
then ask it to act as a skeptic of
everything I say in this presentation
and find flaws in my assumptions, assertions, and other key points, and
then generate 10 hard questions for me.
This is a way for you to actually force
yourself to think about the questions
that you could potentially be asked and
come up with better ways and better
solutions for answering them. People are
actually very rigid creatures. We tend
to think in a very specific way and it's
very hard for us to actually expand past
that. So using AI as a tool to help us
expand our creativity is actually
really really helpful. And finally the
last part of the framework for when you
should be using AI for data analysis uh
is the E, which is enable great ideas to
scale faster. Let's go back to that
workshop example. You're doing a
workshop on prompt engineering and you
have people coming from all types of
different backgrounds um who are
interested in all types of different
things and all types of different
levels. After you give the AI the signup
form and analyze all the data about your
participants, you want to create a cheat
sheet for each of them that is most
relevant to them. What you can actually
do is ask the AI to map each attendee
with their specific domain of interest
and then generate a column called ideas
that includes the corresponding
idea/prompt to put on their cheat sheet. This
way now your CSV file for each attendee
also contains a very specific prompt
idea that is specific for that attendee.
And then after the workshop, you can
actually send them an email with their
specific little cheat sheet. I'm sure
you can see prior to AI, this would have
been so hard to do if you have more than
just like 10 attendees to be able to
come up with like a custom cheat sheet
for each person.
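To make that scaling idea a bit more concrete, here is a minimal sketch of the "ideas column" step done in pandas. In the chat workflow the AI writes the tailored prompt ideas itself; the hand-written templates, the registrations.csv file, and the domain_of_interest column name below are assumptions for illustration.

```python
# Hedged sketch: add an "ideas" column pairing each attendee's domain of
# interest with a prompt idea for their personalized cheat sheet.
import pandas as pd

df = pd.read_csv("registrations.csv")

# Illustrative templates; in practice the AI generates these per attendee.
idea_templates = {
    "Marketing": "Prompt idea: turn last quarter's campaign metrics into an executive summary.",
    "Research": "Prompt idea: ask the model to critique your study design and list threats to validity.",
    "Teaching": "Prompt idea: generate rubric-aligned feedback for a batch of short essays.",
}

df["ideas"] = df["domain_of_interest"].map(idea_templates).fillna(
    "Prompt idea: ask the model to act as a skeptical colleague on your latest deliverable."
)
df.to_csv("registrations_with_ideas.csv", index=False)
```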
So, whenever you're thinking about whether you should use AI to do a certain analysis, you can think
back to this acronym. Of course, I
haven't yet covered exactly how it is that you should approach these analyses, which is what I'm going to be
covering next.
But first, let's do a little quiz.
Please put your answers in the comments.
This portion of the video is sponsored
by LTX2, the new AI video engine for
creative workflows. And this one
honestly blew me away. What stood out to
me isn't just the quality, it's how LTX2
can finally tell a story. And if you
can't guess by my career choice,
storytelling is what I live for. Most
AI video models just give you short
looping clips. A few seconds that look
great but don't really say much. LTX2
though can generate up to 15 seconds of
continuous video with synchronized audio
which means full monologues.
>> Super villains also have feelings.
>> Dialogues or even short scenes with
music and natural sounds.
[Music]
It's smooth, coherent, and feels
cinematic. The kind of storytelling
range that's missing from AI video
tools. As someone who's been creating
content for over 5 years now, this one
really impresses me. You can build short
narratives, generate B-roll and
snippets, or even cinematic transitions,
all with a single prompt. You can now
try out LTX2 to create your own AI
videos. The link is in the description.
Thank you so much LTX2 for sponsoring
this portion of the video. Now, back to
the video. From the course ChatGPT + Excel: Master Data, Make Decisions, and Tell Stories, there is a nice little
acronym for how it is that you should be
approaching data analysis using AI
called DIG, which stands for
description, introspection, and goal
setting. By the way, if you do have a
little bit of background in data, like
data science or data analysis, um it's
basically the same as EDA, exploratory
data analysis, but specifically for
using AI. The first step of DIG, which is describe, is a way for you and the AI to explore the data together. This
step is very important because it helps both you and the AI gain familiarity with the data and also notice if there's
any issues with that data. This is going
to help a lot with hallucinations and
issues down the road. So very similar to
normal EDA processes, after you upload
like a spreadsheet or whatever data it
is that you give to the AI, like let's
just say like a spreadsheet in this
case, you would ask the AI to list out
the columns in the attached spreadsheet
and show me a sample of the data in each
column. For example, if your spreadsheet
contains data about different roles um
and different salaries, the AI would be
able to output and say, "Here are the
columns for your spreadsheet along with
a sample of the data for each column."
So for the column names, you could have salary ID, job ID, max salary, med
salary which is median salary, min
salary, pay period, currency and
compensation type. You can already
notice that under max salary and min
salary you have NaN, meaning the value is not available. This is important to note
because hallucinations tend to happen
when you have things like missing data
or incorrectly formatted data. So if you
see something like this, the first thing you actually want to do is confirm whether it's just that the AI is not parsing your data correctly. So you actually
want to go in and see like is it
actually not available or is there like
a parsing problem? And if there actually
is a parsing problem, you want to then
tell the AI, hey, you are parsing this
incorrectly. This is actually how you
should be parsing it. Or maybe the data
is just not available. Then you just
want to make a note of this for the
future. We will get back to that. But
first, you actually want to do a few
more random samples. Just ask your AI
like take a couple more random samples
of the data for each column. Make sure
you understand the format and type of
information in each column. What we're
basically doing here is verifying that
the data is being parsed correctly and
the AI has correct understanding of each
of these columns. You can even ask it, what do you think each of these columns represents? And it might tell you that salary ID appears to be a unique identifier for each entry; job ID is likely a unique identifier for a job or position; and for max salary, min salary, and med salary, not all entries have complete salary data, etc. So this is a way
for you to validate that the AI
understands what's happening also that
you understand what's happening. The
best way to think about this is that
your AI is a very competent but still
very junior developer or data scientist
or data analyst. So you need to make
sure it understands what is the data
that it's actually receiving. Otherwise
any analysis that you do on top of this
could potentially be wrong.
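If you want to sanity-check the describe step yourself outside the chat, it maps onto just a few lines of pandas. A minimal sketch, assuming a hypothetical salaries.csv with the columns mentioned above:

```python
# Minimal sketch of the "describe" step done directly in pandas, assuming a
# hypothetical salaries.csv with columns like max_salary, med_salary, min_salary.
import pandas as pd

df = pd.read_csv("salaries.csv")

print(df.dtypes)        # column names and the type pandas inferred for each
print(df.sample(5))     # a few random rows, like asking the AI for random samples
print(df.isna().sum())  # how many missing (NaN) values each column has

# If a column looks wrong (for example, salaries parsed as strings), that is the
# parsing problem you would flag to the AI before doing any further analysis.
```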
So after you do this, you want to move on to the next step of the DIG framework, which is
introspection. This is when you want the
AI to start looking at the data that it
finished describing and think about the
patterns and relationships that exist in
the data. This is also another great way to catch any misconceptions that the AI may have. Notice how we're just being very skeptical all the time. Very important. You can just ask, tell me some
interesting questions that could be
answered with this data set and why they
would be interesting. And it might come
up with some questions like: is there a relationship between compensation type, for example base salary, and variability in salary ranges? As for why it's interesting, it says that understanding
whether certain compensation types like
bonuses or equity are more likely
associated with higher salary ranges can
help employees and employers make more
informed decisions about how they
structure pay packages. Here's a question that it came up with that could be a red flag. The question is, are there
any noticeable patterns in salary data
for different currencies if additional
currencies exist? So, when you see this,
you want to think to yourself, oh, are there actually other currencies that exist? And you might want to double-check yourself to see if there's actually data that is not USD.
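That kind of double check takes one line if you have the file locally. A sketch, again assuming the hypothetical salaries.csv and a currency column:

```python
# Quick self-check of the "other currencies" red flag.
import pandas as pd

df = pd.read_csv("salaries.csv")
print(df["currency"].dropna().unique())  # e.g. ['USD'] means there is nothing but USD
```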
In this case, there actually are no other currencies. So, that's when you want to tell the AI that all currencies are actually listed in USD, so it knows that information moving forward. If you catch your AI asking questions like these, questions that you know cannot be answered by the data set, ask it to generate some more questions that can be answered from the data set. In this way
you're really helping the AI and
yourself make sure that you really
understand what's happening in the data
set. It's also actually quite a good
exercise because sometimes AI will come
up with different things and different
analyses that you might not have
thought of doing. I know this might seem
a little tedious and you just want to
skip to like making graphs and charts
and doing analysis, right? But do not
skip these steps, okay? Trust me.
Because like that's the thing with data.
It's one of those things where if you
mess up, it will just propagate
throughout your entire analysis. So it
is really worth the effort to actually
make sure that everything is understood
correctly. And this is actually the same
with humans too. It's not like an AI
problem. Even when I was a data
scientist, I spent a significant amount
of time doing exploratory data analysis,
making sure that I understood exactly
what was happening in the data and
clarifying all that information as well
because I knew if I just like rush into
things, I'll probably end up making a
mistake and be very embarrassed when
someone catches my mistake or worst case
scenario get fired because I made a
really dumb mistake and then, you know,
lost the company a lot of money or
something like that. Anyways, so after
you do this, the third step of the
framework is goal setting. It's very
important that your AI understands what
it is that you're trying to achieve.
Like if you just went like analyze this
data, your AI is going to be like, what the heck, like, you know, what does
that even mean? What am I supposed to
analyze? What's the result? So it's the
same. You got to be like very clear
about what the goal actually is. So you
could tell the AI, my goal is to answer
a couple of these questions, the
questions that, you know, it generated
previously and turn them into a really
exciting interesting report to post on
LinkedIn. This really helps provide
context because then your AI is able to
do this analysis and then is able to do
things like give you ideas for how to put it together in the form of a LinkedIn post. This is going to
be very different if you were actually
analyzing this data in order to, you
know, like do something very serious
like generate like a report for your
boss. This is also just part of good
prompt engineering practices. So, if you
do feel like you want to brush up your
prompt engineering a little bit to be
able to be more clear about what it is
that you want, I do recommend that you
check out a video that I have over here
which covers the foundations of how to
do good prompt engineering. So, check it
out over here. Anyways, whenever you
have a data set that you want AI to help
you analyze, it would be really helpful
for you to go through this DIG
framework. It is a really great
foundation and you can build on top of
this as well. Now all the stuff that we
talked about earlier uh is pretty
standard for if you're doing any type of
data analysis using like Excel, Python,
SQL or whatever. It's just maybe more convenient to do in a conversational fashion with AI. But
there are certain things that you can do
with AI that would be extremely
difficult for you to do just using these
traditional tools. Like for example, if
you're job hunting right now and you
have access to this data set, you could
be thinking, oh, I'm looking for a job that pays between $50,000 and $80,000, is based on the East Coast, and specifically works with wood. I don't know, something like that, right? So in this data, there is no specific section that's like works with wood / doesn't work with wood, you know, or like materials that
you're working with. And it also doesn't
specify like is it east coast or west
coast. It's just like the location like
Chicago, right? But because of GenAI's capabilities, you're able to filter through this data in a way that's far more intelligent, to be able to find the roles that you could potentially be interested in. This would be so hard to do if you didn't have GenAI.
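Here is a hedged sketch of how that kind of fuzzy filter could be wired up outside the chat, calling an LLM row by row for the judgment call a formula can't make. The job_postings.csv file, its column names, the salary interpretation, and the model name are all assumptions for illustration; the course itself does this inside the chat.

```python
# Hedged sketch: combine an ordinary numeric filter with an LLM judgment call.
import pandas as pd
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
df = pd.read_csv("job_postings.csv")

# The easy part: a normal salary-range filter (one interpretation of "$50-80k").
df = df[(df["min_salary"] >= 50_000) & (df["max_salary"] <= 80_000)]

def matches_criteria(row) -> bool:
    """Ask the model whether the role is East Coast and involves working with wood."""
    prompt = (
        "Answer YES or NO only. Is this job located on the US East Coast "
        "and does it involve working with wood?\n\n"
        f"Location: {row['location']}\nDescription: {row['description']}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")

matches = df[df.apply(matches_criteria, axis=1)]
print(matches[["title", "location", "min_salary", "max_salary"]])
```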
And later on in the video, I have a lot more examples which I'll show you of GenAI-specific, really cool data analysis
things that you can do. There's also one
more thing that I thought was really
cool in this module of the course, which
is the idea of traceability and
replication. I thought this was like
super clever because a major issue that
people face when doing traditional data
analysis is that they would like come up
with some sort of thing and then it
would be stuck in like a Jupyter
notebook or like whatever and it would
actually be very difficult for other
people to reproduce that analysis. But
with AI, you can actually ask AI to come
up with a traceability document that
allows other people to be able to
perform the same analysis to validate
the results. You can ask: let's create a traceability document to make sure that others can know, one, what data was used; two, how the analysis was performed; and three, the threats to validity. We want a guide for someone else to be able to replicate and know the limitations of the analysis. You can save this traceability information as, like, a README.md. Don't worry if you don't know what that is. It's just very common for software engineers to store it this way, but you can kind of store it as a Word document, whatever, it doesn't actually matter. And then for
each analysis and visualization, you can
actually ask it to write a single Python
script that performs the full analysis
to produce the visualization and the
results. I think it's a really really
clever idea and really smart thing to do
if you're doing any type of data
analysis using AI.
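As a rough picture of what that single reproducible script might look like, here is a minimal sketch. The specific analysis (median salary by pay period) and every file and column name are illustrative assumptions; the point is that one file, plus the traceability README, is enough for someone else to rerun the work.

```python
# Hedged sketch of a single script that reproduces one analysis end to end,
# taking the data path as a command-line argument.
import argparse
import pandas as pd
import matplotlib.pyplot as plt

def main() -> None:
    parser = argparse.ArgumentParser(description="Reproduce the salary analysis")
    parser.add_argument("csv_path", help="Path to the salaries CSV used in the analysis")
    args = parser.parse_args()

    df = pd.read_csv(args.csv_path)

    # The analysis itself (illustrative): median salary per pay period.
    summary = df.groupby("pay_period")["med_salary"].median()
    summary.to_csv("salary_summary.csv")

    # The visualization that goes with it.
    summary.plot(kind="bar", title="Median salary by pay period")
    plt.tight_layout()
    plt.savefig("salary_by_pay_period.png")

if __name__ == "__main__":
    main()
```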
All right, time for our next little
quiz. Please answer the questions on
screen in the comments. Yay. Okay, we
can move on to some examples. I'm really
excited for this section. The first example is super simple but actually really helpful: pretty much any type of small document, you can just directly upload it, do DIG on it, and then have it proceed to analyze that information and transform it in whatever way. For example, if you have structured data, say a CSV that has all the different types of inventory throughout the past few months, you can ask it to filter for different types of inventory, and you can ask it what the trends in the inventory are over time. Are there certain items that are becoming more popular, and certain items that are becoming less popular that maybe you want to remove if there's too much of them in the inventory? You can even ask it to come up with a predictive model to see what inventory you should be stocking for the next few months, so that you're able to optimize the amount of inventory: you don't have too much, and you also don't have too little.
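For a rough idea of what the trend-and-forecast part looks like in code, here is a minimal sketch. The inventory.csv layout (date, item, and quantity columns) and the very naive straight-line forecast are assumptions for illustration; the AI would typically pick something more appropriate for your data.

```python
# Hedged sketch: monthly totals per item plus a naive one-month-ahead forecast.
import numpy as np
import pandas as pd

df = pd.read_csv("inventory.csv", parse_dates=["date"])
monthly = (df.groupby([df["date"].dt.to_period("M"), "item"])["quantity"]
             .sum()
             .unstack(fill_value=0))

print(monthly.tail())  # recent trend: which items are growing or shrinking?

# Fit a straight line to each item's history and extend it one month ahead.
x = np.arange(len(monthly))
forecast = {item: np.polyval(np.polyfit(x, monthly[item], 1), len(monthly))
            for item in monthly.columns}
print(pd.Series(forecast).round(1).sort_values(ascending=False))
```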
You can also do visualizations, both static visualizations and interactive dashboards. You might want to make a bar chart ranking all the different types of inventory that you have, or a time series graph of how inventory changes over time. For this specific one, Claude, as of the filming of this video, is a lot better than the other models at creating interactive dashboards. It also usually is the best at writing the code in order to do the analysis, and
it tends to hallucinate less. Again,
this is at the time of this filming, so
I don't know if this is going to change
in the future, but just FYI. AI data analysis with different media forms is also a really cool application. For
example, you can have like a video and
you can ask it to extract 10 frames from this video, evenly spaced out 1 second apart. Then you can ask it to take these images, resize them to, say, 300 pixels wide, convert them to grayscale, and increase the contrast by 30%. You can ask it to do things like combining the images into an animated GIF that flips to the next image at 1-second intervals. You can ask it to turn the images into PowerPoint presentations and then catalog all of
the images into a CSV file with the name
of the image and the name of the movie
file that the image was extracted from
and the operations applied to it. So it's really, really cool that you can manipulate multimedia using AI. Prior to AI, this would have been so difficult to do.
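For reference, here is a hedged sketch of that frame-extraction-and-transform pipeline done locally with OpenCV and Pillow. The video file name is an assumption; the parameters (10 frames, 1 second apart, 300 pixels wide, +30% contrast, 1-second GIF intervals) follow the example above.

```python
# Hedged sketch of the video-frames pipeline: extract, transform, animate, catalog.
import csv
import cv2
from PIL import Image, ImageEnhance

VIDEO = "input_video.mp4"  # illustrative path
cap = cv2.VideoCapture(VIDEO)
fps = cap.get(cv2.CAP_PROP_FPS)

frames, rows = [], []
for i in range(10):
    cap.set(cv2.CAP_PROP_POS_FRAMES, int(i * fps))  # jump ahead one second per frame
    ok, frame = cap.read()
    if not ok:
        break
    img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    img = img.resize((300, int(img.height * 300 / img.width)))   # 300 px wide
    img = ImageEnhance.Contrast(img.convert("L")).enhance(1.3)   # grayscale, +30% contrast
    name = f"frame_{i:02d}.png"
    img.save(name)
    frames.append(img)
    rows.append({"image": name, "source_video": VIDEO,
                 "operations": "resize 300px, grayscale, contrast +30%"})
cap.release()

if frames:
    # Animated GIF that flips to the next image every second.
    frames[0].save("frames.gif", save_all=True, append_images=frames[1:],
                   duration=1000, loop=0)

# Catalog of images, their source video, and the operations applied.
with open("catalog.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["image", "source_video", "operations"])
    writer.writeheader()
    writer.writerows(rows)
```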
Another example of data analysis using AI is automating things using zip files. Zip files are very, very
convenient and they're amazing because
not only are you able to put a lot of
different files into a single zip file,
you're also able to maintain folder
hierarchy in the zip files themselves,
which means that you can actually zip together a bunch of different files and ask the AI to mass analyze them all together. So you can have multiple Excel files that you're telling it to combine and search and do whatever with, and then afterwards it can bundle it all back together and send it back to you. Also, this is great if you need help organizing different files. You can ask the AI: one, I want you to help figure
out what is in them by opening and
reading each one to create a summary.
Two, I want you to propose a folder
structure that would better organize the
files. Three, I want you to propose
better naming for each file using just A
to Z and 0 to 9, keeping the extensions.
And four, when you have all of this
done, show me your proposed folder
structure and names. Then it's going to do that. And then finally, once you're happy with it, you can ask it to zip everything up again and send it back to you, and voila, everything is well organized and beautiful.
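For the mechanical half of that workflow, here is a hedged sketch using Python's zipfile module: unpack an archive, apply a rename-and-relocate plan, and zip it back up. In the chat workflow the AI reads the files and proposes the plan; the archive name and the example plan below are assumptions for illustration.

```python
# Hedged sketch: unpack a messy zip, apply a proposed reorganization, re-zip it.
import shutil
import zipfile
from pathlib import Path

work = Path("unpacked")
with zipfile.ZipFile("messy_files.zip") as zf:   # illustrative archive name
    zf.extractall(work)

# The AI-proposed plan maps old paths to new folder/name, keeping extensions.
plan = {
    "IMG_3921 final FINAL.png": "images/logo_v2.png",
    "notes (2).txt": "notes/meeting_notes.txt",
}

out = Path("organized")
for old, new in plan.items():
    dest = out / new
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(work / old, dest)

# Zip everything back up, preserving the new folder hierarchy.
with zipfile.ZipFile("organized.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for path in out.rglob("*"):
        if path.is_file():
            zf.write(path, path.relative_to(out))
```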
And the final example I'm going to show from the course is a little bit more advanced,
but so cool. This is when you can
actually turn conversations into
software programs. Let me explain. So,
say for example, you have like a
sequence of analyses that you did,
right? Like for example, maybe you have
like some sort of movie and then you ask
it to get like 10 frames from this movie
spaced 1 second apart, maximized them, I don't know, did some photo manipulations on them, and then combined them together and generated some descriptions for them and put them all together into a CSV file. You can
then ask the AI, turn this process into
a Python program that I can download and
run on my computer and provide the path
to the documents as command line
arguments. Zip up the program for me to
download and then it can literally go
and actually like write a script that
performs all of these different steps
and then put them all together in an
executable program. That is so cool. You
can literally do this for any sequence
of analyses that you do. You can just
like automate it like that. At least for
me, that blows my mind. That is so cool.
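For a sense of what the generated program might look like, here is a minimal sketch of the command-line wrapper pattern, with the frame pipeline from earlier standing behind a single function. The function name and arguments are assumptions; the AI would fill in the body with the actual steps from your conversation.

```python
# Hedged sketch: the chat workflow packaged as a small command-line program.
import argparse

def extract_and_catalog(video_path: str, out_csv: str) -> None:
    # Stand-in for the frame extraction, transforms, GIF, and CSV steps sketched earlier.
    ...

def main() -> None:
    parser = argparse.ArgumentParser(description="Extract, transform, and catalog video frames")
    parser.add_argument("video_path", help="Path to the movie file, passed on the command line")
    parser.add_argument("--out-csv", default="catalog.csv", help="Where to write the image catalog")
    args = parser.parse_args()
    extract_and_catalog(args.video_path, args.out_csv)

if __name__ == "__main__":
    main()
```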
All right. All right. I can literally go
on forever, but I'm gonna stop for this
section for now and I'm going to put the
next little quiz onto the screen. Please
answer these questions and put them in
the comments. Okay. So, I wanted to
include this final section because I
wanted to make sure that you understand that you don't need to stop at just doing the analysis using AI. There's actually so much more that you can do on top of that. From the examples that we've already seen, we can take this
data analysis and then use it to
generate emails, use it to make like
social posts, use it to generate
reports, PowerPoint slides, even build
software programs and dashboards. But
that's not all. You can even build
full-on applications based upon these
analyses. And no, you don't actually
need to know how to code. You can just
use vibe coding. Say you've analyzed a
lot of traffic data. You can actually
make this into an application that
analyzes real-time traffic data and then
like, I don't know, gives alerts to people, or generates reports based upon traffic incidents. You can have an
application that's able to take videos
and blur out people's faces or other identifying information within the video. Here's an example of an
investment research AI agent uh that
people who join our AI agents boot camp
will build that has an entire database
with information about investments and
it has this interface where the user is
able to ask it specific questions and it
will analyze that data to generate
certain types of responses,
conversations and reports. Yeah, there
is so much that you can do. Data truly
is power and being able to analyze and
harness that power by using AI just
opens up so many possibilities. All
right, I'm going to stop here. If you do
want to dive into how to actually build
out these like applications, agents, and
things like that, I'll link a few videos
in the description that you can check
out that go into a lot more detail
about how to do this. But for this
video, I'm going to end it for now. I
really hope that this was a very helpful
video for you and you have lots of ideas
for how to analyze your data now using
AI. Vibe data analysis. As promised,
here is the final little assessment.
Please answer the questions on screen
and put them into the comments. Thank
you so much for watching until the end
of this video and I will see you in the
next video or live stream.