How to Build Scalable Agentic RAG with Dify and Qdrant
By Dify
Summary
Topics Covered
- Retrieval Evolves into Decision-Making
- Centralize Agentic RAG in One Node
- AI Agents Existed Since 1966
- Vector Search Cuts Latency 6x
- Anomaly Detection via Dissimilarity
Full Transcript
Yeah, perfect. So, agentic RAG is an extension of traditional RAG where an agent dynamically makes decisions during retrieval. Instead of following a fixed pipeline, the agent decides how to retrieve information, which results to use, and when to iterate before producing an answer. In short: traditional RAG retrieves once, while agentic RAG treats retrieval as a decision-making process.
Then let me quickly walk you through the agentic RAG workflow. After the user enters a query, the agent interprets it, identifies what information is needed, and extracts key entities for effective retrieval. Step two is selection and query construction: the agent chooses the best search method and crafts a precise query based on the user's query and its own understanding of the knowledge database. It prioritizes internal knowledge sources and only resorts to external sources when necessary. Step three is source and collection selection: the agent identifies the most relevant knowledge source based on its description. For example, I describe each collection's metadata in the prompt given to the agent, and the agent decides which collection to query. Step four is query execution: the agent executes the queries and retrieves relevant results from the selected collection. Step five is the evaluation loop: after retrieving the information, the agent assesses the relevance and completeness of the results. If the results are weak, the agent iterates, maybe with a new refined query, a different knowledge source, or external search when necessary, and it keeps iterating until the evidence is strong or it reaches the iteration limit. In the last step, the system synthesizes all the evidence into an accurate answer using a grounded LLM.

To summarize, the key characteristic of agentic RAG is that retrieval becomes a decision-making process instead of a single step. The system understands the user's intent, dynamically chooses the tools and data sources, iterates when results are weak, and generates an answer when the evidence is strong and well grounded.
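To make that loop concrete, here is a minimal, self-contained Python sketch of the decision-making retrieval loop described above. The corpus, retrieval, and evaluation functions are toy stand-ins (not Dify or Qdrant APIs); the point is the iterate-until-evidence-is-strong shape:

```python
# Toy corpus standing in for the Qdrant collections (step 3's sources).
CORPUS = {
    "contracts": ["Clause 7 limits liability to fees paid."],
    "housing_statutes": ["Iowa Code 562A governs residential evictions."],
}

def retrieve(collection: str, query: str) -> list[str]:
    # Toy keyword match standing in for real vector search (step 4).
    words = query.lower().split()
    return [doc for doc in CORPUS[collection] if any(w in doc.lower() for w in words)]

def evidence_is_strong(evidence: list[str]) -> bool:
    # Toy evaluation (step 5): "strong" once any evidence is found.
    return len(evidence) > 0

def agentic_rag(query: str, max_iterations: int = 3) -> str:
    evidence: list[str] = []
    sources = ["contracts", "housing_statutes"]      # candidate collections
    for attempt in range(max_iterations):
        collection = sources[attempt % len(sources)]  # switch source on retry
        evidence += retrieve(collection, query)
        if evidence_is_strong(evidence):
            break  # stop iterating once the evidence is strong
    # Step 6: a grounded LLM would synthesize `evidence` into the answer.
    return " ".join(evidence) or "No grounded answer found."

print(agentic_rag("state law regulating residential evictions in Iowa"))
```

Note that the first attempt against the wrong collection comes back empty, so the loop retries with a different source, which is exactly the iteration behavior described above.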
Next page, please. Yeah, thank you. So far we have talked about what agentic RAG is, and now I will summarize how Dify enables agentic RAG and makes it practical to build and run. At a high level, Dify lets you design agent workflows visually, with clear control over retrieval, reasoning, and evaluation. This can be divided into four aspects.

First, one agent node in Dify: the core intelligence of agentic RAG is centralized in a single node. This is where intent analysis, collection selection, routing, and retrieval decisions all happen. Instead of being scattered across multiple steps, you can easily see how the agent works by looking at the variables and output in that agent node, which makes the system easy to debug and evolve.

Second, vector database integration: Dify integrates natively with vector databases like Qdrant, and we have plugins so you don't need custom glue code for the retrieval step; you can use our plugins to query the databases directly.

Third, drag-and-drop orchestration: this is built within the visual workflows. You can work just like assembling building blocks, see how the data flows and how the decisions are made, and adjust the workflows as requirements change.

And lastly, multi-step retrieval and evaluation logic: instead of stopping after a single search, the agent can retry, refine queries, or switch data sources until the evidence is strong enough. Together these four capabilities make agentic RAG not just possible but usable in real systems, and this is how we enable it. Then let me pass it to Terry to talk about Qdrant.
We got that intro from Scarlet on Dify and agents, and I'm going to talk a bit about agents as well. Hopefully you can see my slide. First, I want to quickly talk about Qdrant. Qdrant is the only vector search database that is built in Rust, and it is built for high-performance search: high speed, high accuracy, and things like memory and retrieval in your agent.

Today I'm going to talk about agents, then point you to an agentic vector search, or agentic RAG, example in Trip Builder. Then we'll talk about why you want a vector search engine, cover vector search basics, and cover agentic RAG. And then later on we'll have our demos.
One question I want to ask everyone in the audience is: when was the first AI agent created? This is a question I like to ask because AI agents have become a lot more popular as of late, but it is important to note that we've actually had AI agents since 1966. This is Eliza, which came from MIT, where they were building a chatbot back in 1966, and you can see that this is very similar to the interactions we have with chatbots today: you say something, it replies in turn. We also had Shakey the robot, which was the first physical agent, if you will, that could move around knowing what was in its environment. And finally, we have Dendral, an early chemistry expert system, a precursor to looking up medicine using computers.

I say all that to say that AI agents specifically are not new. What is new is the tools and frameworks we have available. Today Scarlet and I are each going to show you a demo that, if you had tried to build it even two years ago, would have been hundreds of lines of code. Now we can just build them in Dify through a handful of nodes, and all of a sudden you have a production-grade search system.
A great example of this is Trip Builder from Tripadvisor. If you're not familiar with Tripadvisor, it's a website where you can go and search for cities. You might search for Paris, and it'll tell you good restaurants in Paris, good hotels, attractions, museums, things like that. That was the old way of searching, and Tripadvisor uses vector search under the hood. The new way of searching, which you can find in Trip Builder, Tripadvisor's tool, is agentic search. This agentic search basically combines the queries you would otherwise make yourself into a full itinerary. Instead of going to Tripadvisor, typing in Paris, having to find the hotel yourself, having to find where to get breakfast, you can say: I'm going to Paris, it's going to be in January, I'm bringing two kids and a pet. And it will tell you everything from the hotel to where you should get breakfast to your amenities, based on your preferences.

That's the new agentic flow: we're taking things like vector search, which have already existed, and combining them to package a better user experience. In the past the user had to type in all those queries manually; they had to know how to interact with your search system. Now we just have an agent do it while asking the user, "Hey, what's your end goal? Is your end goal planning a trip to the Bahamas? Okay, let me plan that trip for you." Because we know that planning a trip to the Bahamas involves, for example, finding a hotel, finding somewhere to eat, and finding things to do throughout the day. So that's the new agentic wave, which is nothing new; like we saw with Eliza, we've had these agents for a very long time. The difference is that now we have access to better tools like Dify and vector search, which we can combine to give the user an even better experience.
What is the result of using vector search in your application? Well, there are two benefits. The first: in a lot of AI systems, what people tend to do is simply dump everything into the LLM. We know that LLMs have large context windows, so you can get away with that sometimes, but your latency is going to spike, and you're taking the most expensive part of the system and making it work the hardest. What Tripadvisor found was that by switching from an LLM-based recommendation system to a vector-search-based recommendation system, they cut latency from 40 seconds down to 6.5 seconds, and they increased the user-perceived quality of responses by 30%.

That's just one example of a vector search agent. A search agent is something like Trip Builder, where we take Tripadvisor and turn it into an end-to-end platform that gives users the results they want. So why do you need a vector search engine in that flow? I mentioned before that using vectors for recommendation cuts latency and cost, but it's really about retrieval and memory. I have this tweet here from Garry Tan, who says: "Agents without memory of me and what I care about and all the context around me are just not as useful. We are so early. It is not yet table stakes, but it will be." This is a really good example of how quickly things move in the AI world. It was posted in July; if you were to interact with an agent today that didn't have memory, you would ask yourself, what's going on? These are things people now consider table stakes: your agent should know what you prefer, and you shouldn't have to repeat things, or people feel like the product is broken. That's where vector search comes in. The first step is memory: by combining semantic search with keyword search, which we'll get into in a second, you get more accurate results.

Then there's fragmented knowledge. In enterprise situations you often have knowledge silos: you might have your Google Drive, your personal notes, a Notion workspace. How do you search over everything all at once? And then there's being fragile at scale: a lot of these demos work when it's just a PC where you have one user with root access, but what happens when you have hundreds of users and you need to manage user hierarchy? I'm going to go through these quickly; I think I'm coming up on my time.
Now for some vector search basics, just to understand a bit about how these applications work. Everyone knows what a two-dimensional vector is, where you just have a line. In a three-dimensional vector, we add that third dimension, the z-axis, so we have depth. Embedding vectors have hundreds or thousands of dimensions; we can't really visualize that, but this is a way of approximating the visualization. By having all of those dimensions, we can capture really specific information about the vectors. In the same way, we can capture dissimilar information: Scarlet is going to show you a demo of how to find similar documents, and then I'm going to show an anomaly detection demo of how to find dissimilar ones. That's another use case we have for Qdrant.
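As a toy illustration of similarity versus dissimilarity, here is a small numpy sketch (not Qdrant code): cosine similarity scores how aligned two embedding vectors are, and an anomaly check simply looks for low scores instead of high ones.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction, ~0.0 = unrelated, -1.0 = opposite.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings"; real ones have hundreds or thousands
# of dimensions, which is what lets them capture such specific meaning.
query   = np.array([0.9, 0.1, 0.0])
similar = np.array([0.8, 0.2, 0.1])
outlier = np.array([0.0, 0.1, 0.9])

print(cosine_similarity(query, similar))  # high score -> similar document
print(cosine_similarity(query, outlier))  # low score  -> candidate anomaly
```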
Vector search also has multiple applications, everything from HR to ads to online dating to the gig economy, and we can talk about different use cases in the demos. The important thing to keep in mind is that when we talk about an agent, we mean something that uses an LLM in a loop to choose the next action to perform. This is standard vector search, where you just have your query and you prompt with context based on your candidates.
This is agentic vector search, where we introduce that AI to make the decisions, like Scarlet referred to. Some of the abilities of the agent: first, query expansion. For example, if you have a shopping agent and a user types in a query like the one on the left, we expand it into the query you see on the right. Next, extracting filters: if the user has a query like the one on the left, we can automatically filter inside Qdrant. That's one of the powerful things about Qdrant: you can search based on meaning, and then combine filters on top. So you could say: I only want cool weekend outfits, but make them men's, make them size large, and make the price under $50.
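In Qdrant's Python client, that combination of meaning-based search plus hard filters looks roughly like this. This is a sketch assuming a local Qdrant instance; the collection name, payload fields, and embed() helper are illustrative, not from the demo:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model call.
    return [0.0] * 1536

# Semantic search for "cool weekend outfits", narrowed by payload filters.
hits = client.search(
    collection_name="outfits",  # hypothetical collection
    query_vector=embed("cool weekend outfits"),
    query_filter=models.Filter(
        must=[
            models.FieldCondition(key="gender", match=models.MatchValue(value="men")),
            models.FieldCondition(key="size", match=models.MatchValue(value="L")),
            models.FieldCondition(key="price", range=models.Range(lt=50.0)),
        ]
    ),
    limit=10,
)
```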
And finally, using an LLM as a judge: you could have a vision model, for example, that looks at the different items in your vector database and decides which one to return to the user based on the query. So that is a quick walkthrough of Qdrant. The way it connects with Dify, of course, is that instead of having to do all of this with your own code, you can do all of it with Dify's low-code/no-code tool builder.
Now it's time for some live demos. I will bring Scarlet back. Hold on, I'll remove this first and then I'll bring Scarlet back. Okay, great. Scarlet, you're all set, and I know you've got two lovely workflows for us. Are you ready to share?
>> Yeah.
>> I'll leave this stage to you. Take a deep breath.
>> Can you see my screen?
>> Yep, it's pretty clear. No problem.
>> Okay.
>> Yeah. So firstly, before I do the live demo, I want to proudly introduce the Qdrant plugin, which powers the workflows you are about to see; the plugin will also be released on the Dify marketplace very soon. The Qdrant plugin is a comprehensive vector database integration of Qdrant for Dify. It lets you store, vector-search, and manage vectors directly within Dify workflows, with no additional code required.

At a high level, it provides four core capabilities. First is data ingestion: you can upsert either precomputed vectors using the standard Qdrant points format, or upsert raw text directly using the Qdrant upsert-text node. For text upserts it configures the embedding model internally, so there is no need for a separate embedding node; the text is embedded and stored in one step. Second is retrieval: the plugin supports both vector search and hybrid search, which Terry has already introduced, so I won't go into detail. Third is data management: with this plugin you can query, filter, scroll, or delete points, which is essential for maintaining and auditing a production knowledge base. And fourth is collection management: collections can be created, inspected, or deleted directly inside Dify workflows, which supports the full lifecycle without leaving the platform.
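For intuition, here is roughly what those four capabilities correspond to in Qdrant's Python client; the plugin wraps these steps so no code is needed (the collection name and vector size here are illustrative and must match your embedding model):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance
DIM = 1536  # must match the embedding model you configure

# Collection management: create (and later inspect or delete) a collection.
client.create_collection(
    collection_name="test",
    vectors_config=models.VectorParams(size=DIM, distance=models.Distance.COSINE),
)

# Data ingestion: upsert a point, keeping the source text as payload.
client.upsert(
    collection_name="test",
    points=[models.PointStruct(id=1, vector=[0.0] * DIM, payload={"text": "example chunk"})],
)

# Retrieval: plain vector search (hybrid search would add a sparse vector).
hits = client.search(collection_name="test", query_vector=[0.0] * DIM, limit=5)

# Data management: scroll through stored points for auditing.
points, _next = client.scroll(collection_name="test", limit=10)
```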
Now it's demo time. Today I will show you two quick demos. The first one is an upsert demo using Google Drive; to be brief, it automates the process from a Google Drive upload to vector database ingestion, and it can be used to keep an enterprise knowledge base up to date. I will first show the PDF doc I will upload. This is a file from the CUAD mini dataset, a benchmark dataset curated by the Atticus Project to support AI research and development in legal contract review; the data is basically text in this format. This is the PDF I will use. After I upload it into my Google Drive, the trigger listens for changes in the Drive and outputs the change data. I then process that data to extract the specific attribute from the trigger, which is the file ID, use a download-Google-Drive node to download the file, then a Mistral OCR node to extract all the text from the downloaded PDF, and a general chunker to split the document into retrieval-friendly chunks. Finally, the Qdrant upsert-text node converts the text chunks into embeddings and stores them in the Qdrant collection for vector search. This node automatically extracts the text field from each chunk object and converts it into the correct Qdrant payload format.
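As a rough sketch of what the chunking step in this pipeline does (the actual node is configurable; the sizes here are illustrative):

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size chunker with overlap, standing in for the general
    chunker node. Overlap keeps sentences that straddle a boundary
    retrievable from both neighboring chunks."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]

# Example: 1200 characters of OCR output become three overlapping chunks.
print(len(chunk("x" * 1200)))  # -> 3
```

Each chunk then flows into the upsert-text node, which handles the embedding and payload formatting in one step, as described above.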
Let me do a test run. Now you can see the trigger has started listening for events in Google Drive, and then I go to my Drive and do a quick upload. Let me see. Yes, you can see it is working now. It will continue to listen for events from the trigger, so the workflow can do batch processing: if you keep uploading documents, it will keep running. Let's wait a second for the workflow to run.

Oh, maybe there is a problem with my subscription here; I can create a new one in a few seconds, sorry about that. Just a second. This is how I set up the Google Drive change node, which is tied to my Drive. Hopefully it will work this time.
Oh, now it works, finally. You can see the workflow has gone to the last stage; when it is running, the speed is really quick. Now let's see the final result here. From the tracing panel we can see all the points were successfully upserted: in the output you can see the point ID, the text embedded as payload, and the vector dimension. This means it was successfully upserted into the vector database. The collection name I set is "test", so let's see what we have in the test collection on Qdrant. Here, if you open this, you can see the point has already been upserted, with the text as payload; these are the chunks we got. So finally we are done. Let's go to my second demo.
I think this is the more interesting one, and this is the real agentic RAG demo. I built a legal research agent, which can help with legal research, of course. The dataset I use contains three collections, with over 69,000 vectors in total, and you can see the dataset info here. I had already introduced CUAD mini, which the PDF file I uploaded is part of, but that is a rather small dataset, so I also use a much larger one, HousingQA, and I had already upserted all of these datasets into the Qdrant database, which I can show you. This is the Maine housing statutes collection, which contains nearly 30,000 points, and I also have the Iowa housing statutes, with 40,000 document chunks covering landlord-tenant relationships, evictions, and security deposits. I will first do a test run, so we go through the whole workflow. Let me do a refresh.
I will do a test run and try to ask the agent a test question. In a real scenario this would also appear as a chatbot, so you can ask a question related to this knowledge database in the chat, and the agent will start with intent analysis, then use the tools to query the different collections, and give structured output.
While the process runs, I will go deeper into the agent node here. You can see the agentic workflow is pretty simple: it has just four nodes and is really easy to build. The agent node takes over the core retrieval logic, as we discussed before. Here we are using function calling as the agentic strategy, which allows the model to decide when and how to call the different tools, and we are using GPT-5 for the model, because we need some reasoning ability for the agent's decision-making. For the tools we have Qdrant hybrid search, vector search, and Google search as a fallback, in case it doesn't find anything in the database.
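To illustrate the function-calling strategy, a retrieval tool exposed to the model can be described with a schema like the following, and the model then decides when to call it and with which collection. This is an OpenAI-style sketch; the names and collection values are illustrative, not the plugin's exact definitions:

```python
# Illustrative tool schema for the hybrid search tool. The model chooses
# the collection and rewrites the query; the enum values below are
# hypothetical names for the three collections in this demo.
hybrid_search_tool = {
    "type": "function",
    "function": {
        "name": "qdrant_hybrid_search",
        "description": "Search a Qdrant collection with dense + sparse hybrid retrieval.",
        "parameters": {
            "type": "object",
            "properties": {
                "collection": {
                    "type": "string",
                    "enum": ["cuad_mini", "maine_housing_statutes", "iowa_housing_statutes"],
                },
                "query": {"type": "string", "description": "Rewritten search query."},
                "limit": {"type": "integer", "default": 5},
            },
            "required": ["collection", "query"],
        },
    },
}
```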
We have the instruction here, which is the prompt given to the agent. At a high level, the prompt tells the agent: first, you are a Qdrant document retrieval specialist, and you should always ground answers in the Qdrant collections if you can reach them through search. Then I tell the agent what the retrieval loop looks like: inspect the query, do the intent analysis, construct another query, and pick the collection from the table below. I also give the agent the collection info, then have it choose a tool, for example the hybrid search tool, the vector search tool, or web search, and then do the evaluation. I ask the agent to do an evaluation: if the results are weak, it should retry with a rephrased query or a second collection, and after two unsuccessful attempts it should call Google search, and lastly summarize the findings. I also give it some execution notes and a response policy to make the result more structured, and you can edit this part to get the agent to give you the best results.
Hopefully... yeah, great. We can see the test run has already finished. I asked: is there a state law regulating residential evictions in Iowa? The agent gives us the answer, and we can see how it works here. If we go to tracing and click the agent, you can see this is really easy to debug, because you can see the input given to the agent and the output the agent gives. You can see the agent retrying the search in this collection, and you can see that it ran a second attempt with a refined query; this is after it conducted the evaluation. We can also see the text it retrieves and the payload. So this is really easy to debug, and you can also inspect the variables to see what each node does.

Perfect. Do we have time for another test question, or should we pass to Terry for his demo? Because we still have another demo left.
>> Yeah, sure. I think we can move to Terry for his demo, because our audience will receive a webinar resource pack where they can also try the workflow with their own questions.
>> Yeah. So let me stop sharing and pass it over.
>> Okay, thank you so much, Scarlet. I really love the second workflow; I think it's super useful, and I can't wait to try it myself. Now I will bring Terry to the stage. Take a deep breath and have a little rest, Scarlet, and I'm bringing Terry back. Give me a second. Okay. Hi Terry, I'll leave this stage to you for your demo.
>> The demo I'm going to show you today is an image anomaly detection service, or agent I should say, built with Qdrant on Dify. I want to show you what this looks like in practice. Here I have this folder of images, and we can take a look at one: this is a sample of a normal cashew product. If we look at it, it's a normal cashew that has no issues; a perfect specimen of a cashew. What we can do with this system is upload this picture and ask our system: is this product defective? And it says no, with 92% confidence that there's no issue.

Now let's try with a defective cashew. I have this defective cashew, which we can take a look at, and you'll notice that something went wrong in the production process there. Our agent says: nope, this is a fail. It recognizes that the defect is that it was burned, and then it gives us some kind of action. And of course this works with other examples too. Let's try the gum: is this defective? We get a yes, it's defective, and the defect here is the spot. And then if I try again with the clean gum: is this defective?
That's going to work. So, what dataset are we using here? We're using something called the VisA dataset, and I can actually open it up. We just took a look at gum, so let's go to the chewing gum and take a look. Inside we have the anomaly group and the normal photos. If we look at the anomaly group, you can see we have these folders, and in each folder we have an image plus a mask, which tells us what the issue is with the image. The normal images are just normal samples. You can see we have candles, capsules, cashews, chewing gum, fryum, macaroni, etc. What we built here, if I can go to the Dify agent, is this flow where we take the dataset, which is called the VisA dataset; these are the images we just saw in our anomaly detection.
There are 12 different categories of images. The very first thing we do is upsert these images to Qdrant, and we also use filters; I mentioned earlier that you can combine semantic search with filters. We upload the image and turn it into an embedding, which turns the image into numbers so that we can compare how similar or dissimilar images are to each other. Then, when we get a new product, we first identify what kind of product it is: okay, this is a candle, this is chewing gum, this is fryum, this is a PCB. Once we've identified what the image is, we can compare the vector of our uploaded image with the centroid vector of the normal images. We basically average out what we consider to be good samples, then look at the new incoming sample and ask: does this sample match in terms of similarity? If it does, we can say everything's good; if it doesn't, we want to mark it as an anomaly.
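A minimal numpy sketch of that centroid comparison (the embedding dimension, threshold, and data are illustrative stand-ins, not the demo's actual values):

```python
import numpy as np

def centroid_anomaly_check(normal_vectors: np.ndarray, candidate: np.ndarray,
                           threshold: float = 0.85) -> tuple[str, float]:
    # Average the known-good embeddings into a single centroid vector...
    centroid = normal_vectors.mean(axis=0)
    # ...then score the new sample by cosine similarity to that centroid.
    sim = candidate @ centroid / (np.linalg.norm(candidate) * np.linalg.norm(centroid))
    return ("ok" if sim >= threshold else "anomaly"), float(sim)

rng = np.random.default_rng(0)
normals = rng.normal(size=(50, 512))        # stand-in "normal product" embeddings
near_centroid = normals.mean(axis=0) * 1.1  # a sample close to the average
random_sample = rng.normal(size=512)        # a sample unlike the normals

print(centroid_anomaly_check(normals, near_centroid))  # -> ("ok", ~1.0)
print(centroid_anomaly_check(normals, random_sample))  # -> ("anomaly", ~0.0)
```

The threshold is the quality-control knob mentioned later: raising it flags smaller deviations as defects, lowering it lets minor blemishes pass.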
And really quickly, I want to show you the power of Dify here. Suppose I wanted to simply spin up a service where I can search for images. Here I have the H&M dataset: we're searching for red shoes in this query, and below we're doing an image search where I give it this image and search for similar images. You can see there's a good amount of code I needed to write here, and I'd probably need to know Python and Qdrant pretty in-depth to build this. What we're doing here is building the same thing, but in Dify. If we take a look at how the workflow actually works, you can see I uploaded the file to Dify, and we have a little script here that just extracts the URL from Dify's file upload service, because Qdrant only wants the image URL.
That tool Scarlet showed you earlier, I actually went ahead and have already started expanding on. And of course we're going to make this tool available to the community, so if you have any ideas, please let us know. What I added was an image search tool, and what this image search tool does is search images: we take the user input, which can be either a URL or a local file upload, embed the image, then upload it to Qdrant.
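Under the hood, the image search step amounts to a nearest-neighbor query like this. This is a sketch: the embed_image() helper, collection name, and payload field are illustrative:

```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

def embed_image(url: str) -> list[float]:
    # Stand-in for a multimodal (CLIP-style) image embedding call.
    return [0.0] * 512

# Find the 10 stored images most similar to the uploaded one.
hits = client.search(
    collection_name="visa_products",  # hypothetical collection name
    query_vector=embed_image("https://example.com/sample-cashew.jpg"),
    limit=10,
)
for hit in hits:
    print(hit.score, hit.payload.get("category"))
```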
Here, if we take a look at our last run, you can see that in the VisA products collection we did that image search and found 10 similar images. I can even bring up this VisA products collection, the same way Scarlet showed you the CUAD collection, here in my Qdrant cluster. If we take a look at my cluster UI and I search for VisA, I have this VisA products collection with all the different products inside this dataset. So when I search, I'm comparing that vector to the images inside my dataset to (a) find out what the product is and (b) get scores for relevance. That's what all this output is down here: it's similar to, if I search for this image, here are the similar images.
But like I mentioned, this is anomaly detection, so we're actually doing a dissimilarity search: we're seeing how dissimilar these items are from each other. And then we have the LLM piece. We have a prompt here where we take in the search results, and based on those results we produce a confidence score, because you might have an image that could go either way: is this a defect? Depending on your quality control, you might want to ship this product, or you might say, "Hey, this product is defective." You can go into your back end and say, "You know what? We don't actually care about small defects like this; let this pass." But of course, if you have a major defect like this, there's no situation where we want that to pass, so we can set our threshold to make sure it captures things like that.
So that is the LLM piece. The way you can think about this is: we have a demo here, Qdrant anomaly detection, but imagine you worked at a factory and you have cameras inside your factory. What you'd want to do is take a picture of the production process as it's going, and then compare the vectors of each image, or each batch, or each product, however you are manufacturing, to find out: is this product we're delivering an anomaly, or is it the normal product? After we do all of that, we simply have a response node that responds to the user with that data. And just like that, you can have a production-grade anomaly detection system built with Dify and Qdrant.
And I think the cool thing about this is, now that I have this data inside Qdrant, I can edit this. Let's say I have another collection. Let's actually try this live and see what happens; hopefully, we pray to the demo gods. Let's say I have another multimodal collection: H&M. I think I have a multimodal one. Let's see. Oh, I just missed it, sorry; I saw it back here. Let's just refresh; my eyes aren't working today. Okay, so I have this H&M products multimodal collection.

In theory, I can just change the collection name, right? Before we were using VisA products; now we're using H&M multimodal, if it's using the same embeddings. I would also want to change the prompt to say something different, and Dify actually has this AI generation feature, so we can use the AI to change the prompt; basically we would change it to a fashion prompt. Then, if we run this and upload an image, it's still going to work; it's just going to do the comparisons against the images in my H&M demo collection. So if I had a shirt, for example, it would go and say, "Hey, this shirt is defective." That way you can really easily switch up these workflows on the go, and as long as you have your data inside Qdrant, you're good to go. And we're also going to work on creating some workflows that let you upload your own data, so that no matter what your business function is, you can use Dify and Qdrant together to unlock that information.
So that is my presentation, or my demo, of the Qdrant Dify tool.