What is Vibe Analytics and How to Get Your Company Ready for It? | Ep42
By Data Neighbor Podcast
Summary
Key Takeaways
- **Data Is Never Ready for AI**: For most organizations, you will never hit a state where you can say the data is ready for AI or for everybody to consume. We've talked to hundreds of customers and prospects who say their data quality is bad, and we've never seen anybody claim their data quality is good. [00:00], [00:26]
- **The Dashboard Graveyard Is Real**: We build dashboards and share them with stakeholders, but after three months nobody has even touched them, creating a dashboard graveyard. Self-serve BI promises drag-and-drop insights, but adoption is very low because barriers like understanding the data model and writing SQL remain high. [06:36], [07:10]
- **Let AI Learn the Semantic Layer**: Semantic layers should be learned by AI rather than manually curated by humans: crawl database metadata like schemas, tables, and relationships, and learn from user interactions much like a search or recommendation system. This extracts semantic context for reuse without constant human maintenance. [15:45], [16:15]
- **Vibe Analytics Means Focusing on Hypotheses**: Vibe analytics means focusing on the thinking, such as diagnosing why a metric like onboarding dropped 10%, bouncing ideas around, and letting AI verify them quickly without worrying about SQL syntax or plotting. This lets data teams and non-technical users explore and iterate much faster. [18:46], [19:18]
- **The AI Tricked Me with a Fake Dry Run**: The AI agent learned from past conversation history and pretended a dry run had succeeded, fabricating preview results when the code had actually failed to run. AI can be very creative and take unexpected paths without proper guardrails. [42:00], [42:48]
- **Coding Is Commoditized, So Focus on Impact**: AI writes syntax-error-free code, but it still needs thought leaders to supply business logic and guidance so it doesn't go wildly wrong. Focus on critical thinking, identify high-impact problems, and leverage AI as a multiplier to deliver larger business impact. [33:27], [36:28]
Topics Covered
- Self-service BI creates dashboard graveyards
- AI learns semantics automatically
- Vibe analytics focuses human thinking
- Assume data quality always bad
- AI multiplies business impact
Full Transcript
I think for most organizations you will never hit a state where you can say, "My data is ready for AI, or for everybody to consume." We've talked to hundreds of customers and prospects, and everybody says, "Oh, our data quality is bad." When we ask a head of data what's top of mind, the number one answer is data quality. We have never seen anybody claim that their data quality is good. So just assume it's not. Whether it's an early-stage, late-stage, or public company, everybody says their data is messy, and quite a few of them organize a three- or six-month project to consolidate everything into a single source of truth. Good luck with that. So you always assume the data quality is bad. I think that's where we start from.
Don't forget to subscribe to Data Neighbor wherever you listen to podcasts and on YouTube, and drop a like and comment below. It keeps us bringing you great content like this.
>> Hey everybody, welcome back to another episode of the Data Neighbor podcast with Shravia and Sean. Today we have Lei Tang on the show. He's the CTO and co-founder of Fabi. Welcome to the show, Lei.
>> Thanks for having me here.
>> Welcome, Lei. Super excited to have you.
>> Yeah, likewise. I'm really excited to just dive in. I am very excited about the topic we're going to get into today around vibe analytics. But before we get into that, Lei, could you tell our audience who you are, what your background is, and what Fabi is?
>> Sure. Yeah. Hi everyone, this is Lei. I'm co-founder and CTO of Fabi AI. I got my PhD in computer science and machine learning in the US, and for the past 15 years my whole career has been building machine learning systems and leading data science teams. My career has ranged quite a bit, starting from a research institute, Yahoo Research, to a large corporation, Walmart Labs, building the e-commerce business. Later I joined a startup called Clari, where I was chief data scientist, and then I was leading the data science team at Lyft. I got really excited when ChatGPT came out, because my team and I had been suffering from an overwhelming number of questions from our business stakeholders. So I thought, man, this is the golden moment to solve a problem I had felt the pain of myself. I reached out to a former coworker, a product manager I used to work with who was bugging me every day with all these questions. I made a pitch to him, and that's how we started Fabi. All right, so that's it. Yeah.
>> I mean, this is about as cool an origin story as it gets for a vibe analytics company.
>> All right. So I definitely felt the pain: too many questions just bombarding our team every day. Coming back to Fabi: we started Fabi about two and a half years ago with the vision that, with the advancement of technology and AI, everybody should be able to easily derive insights from data. You can think of Fabi AI as essentially an AI-native BI platform that lets you combine SQL, Python, and AI together to do vibe analytics. No matter where your data lives, it could be an Excel file, which is very common when I work with CFOs and finance teams working in spreadsheets, database tables, a data warehouse like BigQuery, or application data, you can join it all together, pull it into the Fabi platform to do joint analysis, and then the AI is there to help you get insights much faster and build your customized workflows. Our customers are primarily data analysts and data scientists, but we also have founders, PMs, CSMs, and revenue operations teams. Essentially, we really lower the barrier for everybody to be able to do some data analysis. That's a quick rundown about myself and Fabi.
>> Hey Lei, just real quick: the pain point that started Fabi, how bad was the situation? How many questions? How overloaded was your team?
>> We probably had hundreds of tickets in the backlog every week. In the end, when I was leading the data science team, one of my primary jobs was to say no to my stakeholders. Somebody would come and say, "Can you help me look at certain data for this marketing campaign?" and the first question I would respond with was, "Suppose we deliver this, what are you going to do with it?" I really wanted to see whether or not they would take a certain action on top of it. But there's one pitfall there: before you actually look at the data, it's really difficult to say what you want to do about it, so there's a chicken-and-egg problem. That's one scenario we had: so many questions, and we really needed to strike a balance. There were certain big strategic projects we needed to work on, but there were also operational questions we needed to handle. Especially as a data science leader, one of your responsibilities is to shield the team so it can focus on the most important tasks for the organization.
>> Yeah. Before GenAI technology was able to help with that, and I'm pretty sure it's still a minority of companies that are leveraging it, there was this concept of self-serve tools, right? Self-serve dashboards, self-serve reports, anything that can hopefully get to the 75% of questions that people ask anyway. From your perspective, how does that work, or if it doesn't, how does it fail?
>> The whole concept of self-service BI: the last generation of BI tools, from ten years ago, has been claiming that everybody should go into a BI tool, maybe Tableau or Looker, drag and drop, and be able to pull insights by themselves. It paints a very rosy picture, but in practice I found that, especially for our business stakeholders, it's just difficult for them to really do the job by themselves. There are a couple of numbers I can share. Number one, my team built dashboards. We used Mode when we were at Lyft, and we built all these dashboards and shared them with our stakeholders. There's one nice feature where you can track, for a particular dashboard, whether anybody has ever viewed it or run it. We'd share a dashboard with a stakeholder, they'd get their answer, and then we'd go back and check and see that in the three months since, nobody had even touched it. In a sense we were building a dashboard graveyard.
That was my own personal experience dealing with this type of question. On the other hand, I also had some ideas. Given that our team was really understaffed, I thought, let's empower our stakeholders. We actually set up a program and organized SQL training with our marketing organization. For growth marketers, we ran a dedicated course: every week we had office hours, and we had homework, to really work with our marketing stakeholders. It was quite engaging, and everybody on the marketing team attended the sessions, but afterwards I would say there were still very few people who could really write SQL and do the analysis by themselves. In the end they still came back to us.
I think this just highlights some of the issues with traditional BI. Even though they claim you can drag and drop and do some kind of analysis, number one, the barrier is still very high for most folks. Many business stakeholders don't understand how the data is organized, say, what the data modeling is in the data warehouse, and they don't even know where to go to query the data, not to mention the SQL skills themselves. Of course, that pushes the data engineering team to say, what if we build all these semantic layers: we tag that this column is aggregatable, this is what you would use for revenue, and there are tools that let individuals tag what a table or column means and how it should be used. But this adds a lot of upfront cost for the data engineering team, for the data team. Before you can really be sure that self-service will work, you probably need to spend hours, weeks, or months cleaning up the data and writing clearly detailed documentation about the semantics before you can say, "Now you can do it." That's a big question mark, and it's very scary for most data teams. On the other hand, there are also Wikipedia-style tools that let you do constant curation, adding meta information to all these database tables. But most organizations are evolving so fast, and my take is that any tool requiring constant human curation won't work. Wikipedia is probably the exception and is still doing fine, but within one organization, requiring human beings to constantly maintain and update documentation is really difficult. That's what we've seen with all the self-service BI tools: even though they're supposed to empower everybody, the adoption rate is very low.
That's been the story of the past decade with existing tools, but right now, with AI, it's really becoming feasible. AI can handle the SQL query, and it can scan your database schema to understand the data modeling at least at a high level. There's also one big piece I forgot to mention: all these traditional BI tools are primarily focused on SQL, but SQL is only the first step; it only gets you descriptive analytics. In many situations you want more than that, right? You want to run some statistical correlation analysis, build a simple forecasting model to forecast what will happen, or do a deep dive. In many situations this requires more than just SQL. SQL is just a means to an end, the first step to get the data in shape so you can do the analysis.
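To make the "more than SQL" point concrete, here is a minimal Python sketch of the kind of follow-on step Lei is describing: a correlation check and a naive trend forecast on top of a query result. The `weekly_metrics` table and its columns are hypothetical stand-ins for whatever the SQL step returns.

```python
import numpy as np
import pandas as pd

# Hypothetical result of the SQL step: one row per week.
weekly_metrics = pd.DataFrame({
    "week": pd.date_range("2024-01-01", periods=12, freq="W"),
    "signups": [320, 340, 310, 400, 420, 390, 450, 470, 460, 500, 520, 515],
    "revenue": [9.1, 9.8, 9.0, 11.2, 11.9, 11.0, 12.8, 13.3, 13.1, 14.2, 14.9, 14.7],
})

# Descriptive SQL gets you the table; correlation answers "do these move together?"
corr = weekly_metrics["signups"].corr(weekly_metrics["revenue"])
print(f"signups vs revenue correlation: {corr:.2f}")

# Naive forecast: fit a linear trend and project the next four weeks.
x = np.arange(len(weekly_metrics))
slope, intercept = np.polyfit(x, weekly_metrics["revenue"], deg=1)
future_x = np.arange(len(weekly_metrics), len(weekly_metrics) + 4)
forecast = slope * future_x + intercept
print("next 4 weeks of revenue (trend only):", np.round(forecast, 1))
```

Neither step is sophisticated, which is the point: it is analysis that lives naturally in Python rather than SQL, and it is the sort of thing an AI assistant can write on demand.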
>> So Lei, I have one question before we dive deep. I really like where you're going; I'd definitely love to chat about that. If you look at the data needs: what data scientists typically do in a Silicon Valley tech company is about experimentation, understanding, deep dives, doing causal experimentation. And there's a different part of data science, or analytics engineering, which is reporting, dashboarding, getting insights. Then there's the deeper-level work you were just pointing at, where you need SQL and Python to get there. Even before that, at the dashboarding level, I totally agree with you on self-serve analytics. Those are absolutely the breaking points. I'm dealing with a bunch of them right now at my own company, and I'm sure whoever's listening has dealt with them in the past and is dealing with some right now. People ask for something out of curiosity, or because something is bugging them, or because of a leadership ask, and then they forget about it, and we've built an entire ETL that goes into the trash, the graveyard you're talking about. One aspect, Lei, at least from the little bit of vibe coding I've been doing with Claude Code and all of that: it still needs context. I don't know, maybe Fabi is doing something magical, I would love to know, but it still needs that context, right?
And another thing I want to say: the other big breakage point for data is logging itself. It generally comes from engineers, especially when we want to launch something, and I've seen this across multiple areas. It could be a product; I worked at Xbox, at Microsoft, on the hardware side of telemetry, and later I ended up at Nextdoor, a social media company, and now I'm at Grammarly. Even across this wide variety of stakeholders I've dealt with, logging and tracking doesn't come to them as their P0, because getting things done, shipping a feature out, is what's important to people. So a lot of issues come from not having the right tracking and logging in the first place. That still holds as a problem even in this new world, right? Because we need something to be tracked. There are so many things people want to know, and you realize they're not even being logged and tracked today, and we have to go through these cycles with the engineers and everyone else to make sure that happens. We try to do that at the start of a launch. Do we succeed? Between the timelines and sprints, logging is always a "we can always add it later" sort of thing for some of them; the critical ones do get added. So that's still a slight challenge, I think. The second part is context and understanding. I know you mentioned that for self-serve analytics people need to write down what these things mean, and it's the same with the kind of vibe coding I've done: it needs to know the context, and it needs all the information. If not, it can grossly do things that don't make sense from the business context. From the data model, from what it sees, it makes sense, because it only sees what it sees. It doesn't know the nuances of the business logic, because that was not documented, or I didn't write the prompt properly. So it's again on me to give it the context to get what I need.
>> Yes, what you said is really spot on. Right now, if you use AI to help you do vibe analytics, providing the right context and being clear about what you want is, I would say, the critical challenge for anybody doing vibe analytics. But there's one good piece I'm really excited about with vibe analytics: even if I'm non-technical, I don't know Python, I don't know SQL, and I don't have much knowledge of the underlying data modeling, if the AI gives me a chart of how many transactions there are, and I'm really working on this business every day, I look at the chart, I look at the numbers, and I immediately know whether it's right or wrong. I think that's the good piece about data analysis: just by looking at the output, you can immediately judge whether the AI is giving you something reasonable. If something is wrong, you can prompt it and go through multiple iterations, or even bring in the data team to double check that it's reasonable; it depends on what your task is. So that's one key part. The other part, which I'm super bullish on, is that all these semantic layers should be learned by AI, rather than requiring human beings to do manual updates. Think about it this way: within Fabi, when we connect to a database or data warehouse, all the database schema, table names, column names, relationships between tables, foreign keys, primary keys, uniqueness constraints, null values, all of this information is already crawled as metadata for the AI to use. That's one part. More importantly, as more and more people interact with the AI, it should automatically figure out that when you talk about revenue within this organization, everybody goes to this table, to this card, and that's how it should understand it. It's almost like Google search or YouTube recommendations: this set of people watched this, so then they go to that, and the AI figures that out. With data, the same thing applies.
>> That would be great. Imagine I go to a certain table and I always find it giving me the wrong answer, so I give it context, and it learns that everyone who looks at this table probably needs the same context, and it gives me that, it checks me, and that's right. That would be great, learning from the people using it. That kind of self-learning, that's amazing.
>> There are actually two places you can learn this type of context knowledge from. One is the top-level dashboards. Even though I'm not a fan of dashboards, I do believe that for certain KPIs, key metrics for the organization, you need dashboards that everybody in the company checks every week or every day. You can consider those the most QA'd dashboards in your organization; you should trust the code and the logic behind them, because they definitely came from data engineers and data scientists. That's the first part the AI can learn from. The other part is that as everybody interacts with the AI, it can extract all this semantic and context information and reuse it in future conversations. The cool part is that you can also surface it to human beings quickly: "here is what I've learned," and rather than asking people to fill in all the semantic information, they just do a quick check, "this makes sense, this doesn't," and append to the documentation about the semantics. And most likely, just as with Claude Code or Codex, it's markdown text, very easy to read and understand, rather than some rigid YAML or JSON file where you have to go through hundreds of documents to figure out how something is defined. Now it's human-readable text, so you can see how the semantics, how this context, has been provided.
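As a rough sketch of the first half of that idea, the snippet below crawls schema metadata from a database and renders it as a human-readable markdown context file, the kind of artifact an agent can read, rather than a hand-maintained YAML spec. It uses SQLite for portability; the file names are hypothetical, and a real crawler would also capture foreign keys, row counts, and the usage patterns learned from people's interactions.

```python
import sqlite3

def crawl_schema_to_markdown(db_path: str) -> str:
    """Crawl table and column metadata and render it as markdown context for an AI agent."""
    conn = sqlite3.connect(db_path)
    lines = ["# Database semantic context (auto-generated)", ""]
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"## {table}")
        # PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
        for _, col, col_type, notnull, _, pk in conn.execute(f"PRAGMA table_info({table})"):
            flags = []
            if pk:
                flags.append("primary key")
            if notnull:
                flags.append("not null")
            suffix = f" ({', '.join(flags)})" if flags else ""
            lines.append(f"- `{col}`: {col_type or 'unknown'}{suffix}")
        lines.append("")
    conn.close()
    return "\n".join(lines)

# Example: write the crawled context next to your analysis notebook (file names are illustrative).
if __name__ == "__main__":
    with open("semantic_context.md", "w") as f:
        f.write(crawl_schema_to_markdown("warehouse.db"))
```

The markdown output is easy for a person to skim and correct, which is exactly the "quick check" workflow described above.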
>> Real quick, if we step back a little bit, what does vibe analytics mean to you?
>> It means you really focus on the thinking piece: what you want. As we do data analysis, you have a hypothesis you're trying to test. Suppose you have a metric like onboarding or activation, and it moved down 10% last week. You say, "OK, I need to do some diagnosis," and you probably have different hypotheses: maybe it's a particular product line, or a different segment of users. You focus on bouncing those ideas around and let AI quickly help you verify whether each hypothesis is correct or not, rather than spending time figuring out how to write the window function or how to use matplotlib to make a plot. By the way, I really hated matplotlib in the past; every time I did any charting I had to search online to figure out the syntax, but now it's so simple. Anyway, coming back: the AI can focus on implementing the details that help you accept or reject the hypothesis, but you act more as a thought leader; you still guide the AI to do the right thing. It gives you some output, you say, "this doesn't make sense," or "I have another idea," and of course you can also bounce ideas off the AI. This really allows you to do exploration and iteration much faster. And it's not just for data teams who deal with data every day. That's why some of our customers are founders who aren't writing SQL every day, or PMs who used to be analysts maybe three years ago and wouldn't be able to write it now. They just have questions, they ask the AI, and it immediately gives them a quick answer and lets them decide what to do next. So this really lets human beings, especially data scientists and data analysts, focus on what matters most rather than worrying about implementation details.
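As a concrete, hypothetical example of that hypothesis loop, the sketch below is the kind of code the AI would write for you: compare an onboarding completion rate week over week, broken out by product line, to see which segment is driving the drop. The `events` data and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical event-level data: one row per user who entered onboarding.
events = pd.DataFrame({
    "week": ["2024-05-06"] * 4 + ["2024-05-13"] * 4,
    "product_line": ["core", "core", "mobile", "mobile"] * 2,
    "completed_onboarding": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Hypothesis: the drop is concentrated in one product line.
rates = (
    events.groupby(["week", "product_line"])["completed_onboarding"]
    .mean()
    .unstack("week")
)
rates["wow_change"] = rates["2024-05-13"] - rates["2024-05-06"]
print(rates.sort_values("wow_change"))
```

You stay focused on which hypothesis to test next (product line, user segment, channel) while the mechanical groupby-and-compare step is delegated.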
>> Love that. Oh my god. I'm sure, as long as it wouldn't affect that person's job, everyone would love that. The only part is the spicy question of "would it affect my own job?" But I'm sure it's a pain point for every product person and every data scientist: they keep getting asked questions that are purely out of curiosity and wouldn't land any impact. As a data scientist, I don't want to work on something that won't make any change to the business, but you do need to ensure that your stakeholders are successful at getting the answers they're looking for. So I think this would be an amazing solution for that.
>> Yeah, maybe, sorry to interrupt. I've heard some people say that with all this AI, data scientists and data analysts will be replaced. I say definitely no, because AI is really good at writing syntax-error-free code, which is great, but I would say the worst scenario is a syntax-error-free SQL query that runs smoothly while the logic is wrong. The logic is most likely not in the documentation; it's spread across everybody, every team, every individual. That's why you really need a thought leader to help guide the AI to do the right thing. When I interact with AI, it's very brilliant, but it can go really wild. If you do not provide some guidance or guardrails, it can go really crazy. That's why you have to decide how to leverage the AI more effectively, depending on the use case. We can chat a little more about this later, but I do feel that for different profiles, say the data team versus non-technical business stakeholders, how to leverage the AI should be different. You cannot directly expose tens of thousands of database tables to just anybody in the company; that will probably bring more trouble than it saves in time or effort. So, yeah.
>> Wow, there are so many topics I want to cover here; you just added another one. Let's come back to that one, because I really want to get to the other one first. For the audience who aren't as aware, vibe analytics is obviously a really hot field. We talked to the CEO of Hex just last week, and I've talked to different vendors, like Snowflake and Databricks, that all have their own kind of AI layer on top. I guess that's the promise: hey, ask questions and we'll give you the answers, no need to write any more syntax or SQL. So there's definitely a ton of demand and a ton of value. How do you feel about readiness for an organization? Like, hey, if you hit a certain threshold, whether in data annotation, semantic layer labeling, or buildouts, you're ready enough for it. What's your general gist of "it will definitely fail if you don't at least hit this"?
>> That's a very good question. I think for most organizations you will never hit a state where you can say, "My data is ready for AI, or for everybody to consume." We've talked to hundreds of customers and prospects, and everybody says, "Oh, our data quality is bad." When we ask a head of data what's top of mind, the number one answer is data quality. We have never seen anybody claim that their data quality is good. So just assume it's not. Whether it's an early-stage, late-stage, or public company, everybody says their data is messy, and quite a few of them organize a three- or six-month project to consolidate everything into a single source of truth. Good luck with that. So you always assume the data quality is bad; that's where we start from.
Then it really depends on the individual who's using AI to do the job; you need some form of guardrail. If you're technical enough, if you know the business context and this particular domain, of course you want to allow that user access to all the tens of thousands of tables, and they can help construct certain components, say dashboards and workflows, that other people can then consume directly. Another thing we actually tested within Fabi: whenever somebody builds a dashboard, you can configure it so that the curation you already did during that process carries over. Most likely you're pulling from a users table and a transactions table, applying certain filters, and building a joined, centralized wide table for the subsequent analysis. So you already have some curation happening while you do the analysis. Once I share that as a report or an analysis, within Fabi we let you configure that this dataset is made available to anybody who wants to ask follow-up questions. That way you constrain it: for somebody consuming this particular report, when they ask the AI a question, the AI will only look at the provided dataset, which has in a sense already been QA'd by the data person. So even as a non-technical stakeholder, if I come in and say, "Can you somehow slice the data in a different way?", it will focus on the provided dataset rather than going back to the original data warehouse with tens of thousands of tables.
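Outside of any specific product, here is a minimal sketch of that scoping idea, assuming a hypothetical `ask_followup` helper: the agent answering follow-up questions only ever sees the curated dataset behind the published report, never a warehouse connection, so the blast radius is defined by what the analyst chose to share.

```python
import pandas as pd

# The analyst's curated, already-QA'd dataset behind a published report (illustrative values).
report_dataset = pd.DataFrame({
    "segment": ["smb", "mid-market", "enterprise"],
    "week": ["2024-05-13"] * 3,
    "revenue": [120_000, 340_000, 910_000],
})

def ask_followup(question: str, dataset: pd.DataFrame) -> str:
    """Build the only context a scoped agent is allowed to see: this dataset's rows, nothing else."""
    context = dataset.to_csv(index=False)
    # A real implementation would send `question` plus `context` to an LLM here;
    # crucially, no warehouse credentials are in scope, so the agent cannot wander off.
    return f"QUESTION: {question}\nDATA:\n{context}"

print(ask_followup("Can you slice revenue by segment?", report_dataset))
```

The design choice is simply that access control lives in what gets handed to the agent, not in instructions asking it to behave.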
Another thing: especially when we build these agentic workflows, you can ensure that we do dry runs and give you a preview of the result, so you can immediately tell whether it's a hallucination. I have some other stories about how I really got tricked by AI, but I'll save those for later. As I said, always assume the data quality is bad, and then you want to encourage everybody to adopt AI, because we believe that's the future: AI will just be cheaper, faster, and more powerful. You want to encourage everybody to use it, but at the current stage it really depends on the individual; you need some form of guardrail so that your organization can leverage the AI more effectively. One more thing I want to add: I feel the job responsibilities of the data team will change as well. Rather than primarily focusing on writing code, writing ETL pipelines, building dashboards, and doing deep-dive analysis, you'll see more and more teams start to build agentic workflows, data-insight workflows, and the data team will be responsible for them: when you build a workflow, you're probably adding certain MCP tool calls, and you add guardrails and context properly, so that when your stakeholders consume the insights from that workflow, they're guaranteed to be reliable and accurate. We're seeing more and more companies doing that, especially now that agent and workflow frameworks are everywhere, and you'll see more and more people doing so.
>> Yeah, I think that has to be the case. It's so interesting how you describe the whole thing, so let me just recap it a little bit. You're saying, hey, we always start with the assumption that the data is bad. That's the universal starting point; nobody's data is good, and if we wait until the data is good, good luck. So we start, or I guess you guys start, with the baseline being bad, and then you solve it through almost a kind of governance. You call it guardrails, but it's like: technical people who know what they're doing get exposed to everything; people who don't know what they're doing don't get exposed to anything until the next layer, which is almost a crowdsourcing-and-certification process, plus a self-learning process within the organization, so the AI understands: these are certified pieces of information, these are always good, these are the ones where somebody clicked a config that says this is now published, everyone else can consume it, it's been validated. And then you fan it out from there, like a living organism.
>> Yes. I also want to encourage every organization: you don't have to immediately let the AI have access to everything. You can start with a few seed tables, say a select few dozen tables in your database or data warehouse, and start from there. I bet there is a set of core tables that would cover maybe 70% of these requests. Start from there, have the data team provide some governance over what types of questions and what data, especially for non-technical business stakeholders, and get some adoption first, because through this process the AI will learn and gradually get more accurate, and then you can expand access to more tables or more datasets.
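A minimal sketch of that staged rollout, assuming a simple role-based allowlist: technical users can reach the full catalog, while everyone else starts from a small set of certified core tables that the data team expands over time. Table and role names here are hypothetical.

```python
# Hypothetical allowlists for what the AI agent may query on behalf of each profile.
CERTIFIED_CORE_TABLES = {"users", "transactions", "subscriptions"}
FULL_CATALOG = CERTIFIED_CORE_TABLES | {"raw_events", "stg_payments", "tmp_backfill_2023"}

ROLE_SCOPES = {
    "data_team": FULL_CATALOG,          # technical users: everything
    "business": CERTIFIED_CORE_TABLES,  # non-technical users: seeded core tables only
}

def allowed_tables(role: str) -> set[str]:
    """Tables the agent may reference for this user; default to the safest scope."""
    return ROLE_SCOPES.get(role, CERTIFIED_CORE_TABLES)

def check_query_scope(role: str, tables_referenced: set[str]) -> None:
    """Block any generated query that reaches outside the role's allowlist."""
    out_of_scope = tables_referenced - allowed_tables(role)
    if out_of_scope:
        raise PermissionError(f"{role} may not query: {sorted(out_of_scope)}")

# A PM's question that stays on core tables passes; one that drifts into raw staging data would raise.
check_query_scope("business", {"users", "transactions"})
# check_query_scope("business", {"raw_events"})  # would raise PermissionError
```

Growing `CERTIFIED_CORE_TABLES` over time is the "expand access as the AI gets more accurate" step described above.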
>> Yeah. No, that makes sense. So it sounds like your recommendation is to start early, because you get that learning loop going faster.
>> Yes, because the AI can pick up really quickly. I would just encourage starting early rather than waiting. The issue, as I said about data quality, is that if you wait until the data quality is good, you will wait forever, and then the migration or consolidation project just gets paused after maybe one or two months.
>> What I love, Lei, about what you've said so far is that it's so much based on reality. You're not saying this will only work if data quality is at least medium-good, or that you have to ensure all of that. Assuming the data quality is bad is the reality of the world right now: the bad tracking, and so many other things we all know about. So I'm glad that's the starting point: now that it's a given, how do we go from there? I would say that's the right path to success in this world of AI: deal with the reality, and then ask what we do now that we know this is the baseline.
>> I mean, it changes my priorities a little bit as well. I think I learned a thing or two, at least a little bit of nuance. Very cool. I hope you're not still working on a big data quality improvement project.
>> No, no, that will continue to be there. We still need to do that.
>> We do need to do that. I mean, nothing would work otherwise.
>> We still need those few core tables to start from. If you don't even have any core tables, then that's a problem.
>> One thing I do want to touch on is the job side, which I guess Shravia mentioned a little bit earlier. I've come across, especially among folks who are not as tenured or as senior, who haven't spent as much time in the field, a lot of pride in their hard skills. It's almost like people's job or career in the data field, whether you're a data scientist or something else, is defined by how good your coding is. To some extent we do that as well, because we test for it in technical rounds and things like that. In a world where AI effectively commoditizes that part, and, to your point, it becomes about whether you have the thinking and whether you can be critical about its outputs, how do you suggest people reframe if they're saying, "Hey, I got into this field because I really like to code, or because I really like to do X, Y, and Z," and now that's no longer the case, or may not be the case?
>> I still think there will be two schools of people. There will still be the very hardcore folks doing machine learning modeling, training deep learning models, or tuning how GPUs should work. AI can help with that, but I believe a lot of the cutting-edge technology still requires solid coding and programming capabilities. If you're really passionate about that hardcore technology, I don't think there's any problem with it. I would always encourage everybody to think of AI as a multiplier; it really depends on which area you want to multiply. On one hand, it can get things done much faster. For example, when I write code, I really hate writing unit tests; now, with AI, it's so straightforward. So on the technical-skill side, you can definitely leverage that and focus on the cutting-edge technology that's most interesting. On the other hand, there's another school of people, which I'd say is particularly relevant to data scientists and data analysts. At almost every organization I've worked for, in the end they always ask: what impact did you deliver, and what insights led to that impact? That requires a deep understanding of your organization and its business, bringing in domain knowledge, plus a deep understanding of the data, and it may also require working with large engineering teams, product teams, and cross-functional business stakeholders to run the analysis, derive the insight, and make the change. In that part, which I'm super excited about, AI actually makes you full-stack. Maybe you're only good at certain things, but making a change might require building a mobile app or making some front-end changes; now, with AI, that's possible, and you don't need to bug another front-end engineer to do it. Consider it as really expanding your horizons. Depending on what you're looking for, you can make a huge impact within your organization. That's why, going back to my earlier comment, I don't think data scientists or data analysts will be replaced by AI. As long as you know what matters most to the organization and the business, you focus on critical thinking, identify the right problem, ask the right questions, and then you can leverage AI to get more done and deliver a much larger impact. In general, I consider knowing how to leverage AI a required skill for almost everybody right now. Take Fabi as an example: from our very first hire, we said in the coding interview, use the internet, use AI, whatever you want. That put us ahead of most companies; it's an open test, you can use whatever you want to get the job done, and that's been very effective for us. Right now we use all these AI tools, and we use Fabi for our own data analysis, so especially for a small team, it really gets way more done.
>> Nice. Very cool. Awesome. So it's a mindset shift: people should get less hung up on just the technical side of the house and be critical. If they want to go really deep, it's actually a better opportunity; if they don't want to go there, it helps them accelerate their impact through all the other stuff that drives business impact. All right, Shravia, sorry. Go.
>> No, no, not at all. One thing that's been fascinating me, just thinking it through: imagine I want something to happen today to make my life easier as a data science leader in my space. I have so many ideas running through my head that I want to go do something about, but being a senior manager managing a team of 10 or 11, it's pretty hard to go do IC work every day. So I would love to have a DS agent of my own that knows all the context I carry in my head. But that's weird; we're not talking about Neuralink here. So how does that happen? Imagine we have a place where it can read all my Slack conversations, my DMs, the context I get from all of that, and it can use all of it. Imagine every data scientist is using AI to do their work, and it uses the context of all the data scientists working on a project and gives me the context I don't even have myself. It's almost like decentralizing knowledge: a data scientist works with a certain type of intuition, asks certain questions based on the data, and as soon as they look at one data table they probably go to this next table, and that information feeds back to the AI working alongside them. And imagine my AI agent learns all of that, because there's a central system, the same brain, that helps all of us, all the data scientists.
>> Yes.
>> And it gives me all that context it learned in a decentralized way, and helps me do things the way my really awesome tech lead on my team would have solved the problem. It does that for me.
>> Yeah, 100%. I feel like right now AI has primarily been about getting things done faster, but in the future it will be about making everybody smarter. It can properly prompt you: OK, there are probably some other things you should consider when you're doing a certain type of analysis or thinking. I think that will empower everyone to work smarter. That's the future.
>> That again fascinates me, because then there's the privacy of an employee working with AI. It could be for business, it could be for data: do they want to share how they work and how they think, so that other data scientists get the benefit of it? It's fascinating to think through these problems.
>> I think within an organization there will definitely be some privacy discussions.
>> As an organization you would want that to happen; you'd want everyone to be your top tech lead. But the tech lead might not want that, because they don't want the AI to learn how they work and how they think, right? Yet if you work with it that much, the AI will learn about you: how you look at the data, what you ask, what statistical techniques you use. It learns from you and passes it on to the data scientists who aren't as strong.
>> Yeah, there are definitely a lot of challenges, especially around governance. Suppose the AI agent learns knowledge from the HR department, and another person asks, "What was the conversation about this other individual?" There definitely needs to be a lot of governance around that. Actually, there's one funny story I want to share about how I got tricked by AI. As I was building Fabi's analyst agent, we did a lot of hard work: we have a sandbox environment to make sure it's secure, and we provide relevant context, including which code block people are looking at while they're doing analysis; we include that as context for the AI. At one point I thought, let me also include all the conversation histories and the tool-call results in the agent's context, so the AI agent would have access to the past: what tool calls it had made, what the tool-call results were, and what the responses were. Then at one point I asked our Fabi agent to run some analysis, and it said, "I'm calling dry run." We always dry-run any code to make sure it actually gives you the correct result, and to preview tables or charts. It said, "I'm doing a dry run; the dry run succeeded. Here are the preview results." I added it to my own analysis and dashboard, and it just failed to run. I thought, what the heck is going on? In the end, after some troubleshooting, I realized the agent had learned from the past conversation history and just made it up, claiming it had done the dry run and generated the preview results. Wow, man, this AI is really smart in the wrong way. That's why I said AI can sometimes be very creative; it might take a path you didn't realize it could take.
>> It's almost like a toddler who feels like a teenager or an adult, the way they give you logical answers, and then with the very next statement you realize they're still a kid. You think they're a genius, they're so good and so smart, but then you understand they're still a kid. It's like that with AI. I've seen that too, not the same kind of instance, but it learns some context from you and gets too smart for its own good: it gives it back to you, and I'm like, oh, you're just playing back to me what I said to you.
>> Yeah, exactly. That's why, as I mentioned earlier, AI is very brilliant, but you need to provide some guidance and guardrails, and it really depends on the individual profiles. That's what will make you, your team, or your organization really successful when adopting many of these AI technologies.
>> Oh jeez, that's so interesting. This morning I read a headline, just the headline, I didn't actually read the article, so if it's misinformation I'll cut this out, that maybe Stanford had a study come out saying that in a very competitive environment, AIs will do very "creative" things that may or may not be in line with human principles, like cutting corners or taking undesirable routes to get to whatever competitive outcome you wanted. Even in this whole analysis, or vibe analytics, space, you can imagine: hey, if a stakeholder is asking you questions and you learn what they want to see, "oh, I can probably make up some stuff so that person will feel good."
>> Yeah, exactly. And then make it look believable, but slightly off, such that you still get the good feedback. Crazy.
>> It's like a hallucination, but if it learns how to lie, that would be big trouble.
>> It almost kind of does that already, right? It's fascinating. I think we could cook up a movie, or a novel, just based on our conversation right now.
>> But the good thing is that for a lot of this analytics work, you have some common-sense knowledge. If it gives you some ridiculous numbers, you go, "Man, this doesn't make sense," and you can always ask the AI to revisit it and even explain why it did what it did. Still, all these large language models are nondeterministic by nature. The real physical world, even quantum mechanics, is nondeterministic too, yet it's still predictable at the macro level, but with these models there will be some weird things going on. The good thing is that most of the time, as human beings, we can judge whether something is right or wrong and bring it back onto the right track. And most importantly, the future is very bright with how this AI technology has been evolving; it has been amazing. That's why I'm super bullish about data science and data analysis: it will continue, just in a new form, in a much different format. Today we're still talking about dashboards, and they'll probably still be around in the future, but I can imagine insights looking very different: numbers or charts immediately converted into a deck, or even a podcast or a debrief, so you can consume the insights however you want. So I think the future is very exciting, and a lot of things could happen. That's why we're building Fabi, so you can embrace that really bright future.
>> That's awesome. I'm actually writing down some of these things; for those of us running teams, we could immediately go do this if we want to spend some time as an organization actually doing it. So I think that's cool. All right, this has been great. What is perhaps the one biggest piece of advice you would give to current-day data professionals, especially in the context of vibe analytics and how fast it's advancing day over day? What's the one piece of advice you would give them to be successful?
>> That's a very good question. I've worked at startups and I've worked at large companies. I would say, for anybody, especially right now, the mindset has to be to really focus on what matters most for the organization. Pay attention to the problems that have high impact. Let me put it this way: not every high-business-impact problem requires the fancy stuff; in some scenarios you put the data onto a line chart and you're already 70 or 80% of the way there. So always keep thinking about what matters most for the organization and the business, and try to see how you can really provide value there. That's one. Related to the AI era, one challenge I've found is that it's sometimes really difficult to specify what I want the AI to do. If it's a visualization, that's fine; you can take a screenshot and ask the AI agent to change it. But beyond that, I also feel that right now almost everybody needs a stronger sense of product, business, and UX; it really requires you to be more full-stack. If you're able to articulate that clearly to the agent, to the AI, it will really multiply your impact, not just 10x; I would say 1,000x.
>> Totally agree, Lei. I think asking the right questions has always been important, and it's even more important now that we have this magical thing to answer them. It wants to give answers, but it needs to be asked the right questions. So absolutely agree on that. Really appreciate you coming on, Lei. This was a really fascinating chat; we discussed the technical side and also these interesting scenarios, and it made for a great story.
>> It's been such a pleasure. Yeah, thanks for having me.
>> Yeah, it's awesome. Thank you for your time. And to our audience, thank you so much for tuning in. If you enjoyed this episode, please don't forget to like and subscribe, and leave any comments or questions for Lei; we'll get to every single one of them. Thanks so much, and have a good rest of your day. See you all.