How to Build Scalable Agentic RAG with Dify and Qdrant
By Dify
Summary
Topics Covered
- Retrieval Evolves into Decision-Making
- Centralize Agentic RAG in One Node
- AI Agents Existed Since 1966
- Vector Search Cuts Latency 6x
- Anomaly Detection via Dissimilarity
Full Transcript
Yeah, perfect. So, agentic RAG is an extension of traditional RAG where an agent dynamically makes decisions during retrieval. Instead of following a fixed pipeline, the agent decides how to retrieve information, which results to use, and when to iterate before producing an answer. In short: traditional RAG retrieves once, while agentic RAG treats retrieval as a decision-making process.
Then let me quickly walk you through the agentic RAG workflow. After the user enters a query, the agent interprets it, identifies what information is needed, and extracts key entities for effective retrieval. Step two is selection and query construction: the agent chooses the best search method and crafts a precise query based on the user's query and its own understanding of the knowledge database. It prioritizes internal knowledge sources and only resorts to external sources when necessary. Step three is source and collection selection: the agent identifies the most relevant knowledge source based on its description. For example, I describe each collection's metadata in the prompt given to the agent, and the agent decides which collection to query. Step four is query execution: the agent executes the queries and retrieves relevant results from the selected collection. Step five is the evaluation loop: after retrieving the information, the agent assesses the relevance and completeness of the results. If the results are weak, the agent iterates, maybe with a new refined query, a different knowledge source, or external search when necessary, and it keeps iterating until the evidence is strong or it reaches the iteration limit. In the last step, the system synthesizes all the evidence into an accurate answer using a grounded LLM.

To summarize, the key characteristic of agentic RAG is that retrieval becomes a decision-making process instead of a single step. The system understands the user's intent, dynamically chooses the tools and data sources, iterates when results are weak, and generates an answer when the evidence is strong and well grounded.
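To make that loop concrete, here is a minimal, self-contained Python sketch of the decision-making retrieval loop described above. The corpus, retrieval, and evaluation functions are toy stand-ins (not Dify or Qdrant APIs); the point is the iterate-until-evidence-is-strong shape:

```python
# Toy corpus standing in for the Qdrant collections (step 3's sources).
CORPUS = {
    "contracts": ["Clause 7 limits liability to fees paid."],
    "housing_statutes": ["Iowa Code 562A governs residential evictions."],
}

def retrieve(collection: str, query: str) -> list[str]:
    # Toy keyword match standing in for real vector search (step 4).
    words = query.lower().split()
    return [doc for doc in CORPUS[collection] if any(w in doc.lower() for w in words)]

def evidence_is_strong(evidence: list[str]) -> bool:
    # Toy evaluation (step 5): "strong" once any evidence is found.
    return len(evidence) > 0

def agentic_rag(query: str, max_iterations: int = 3) -> str:
    evidence: list[str] = []
    sources = ["contracts", "housing_statutes"]      # candidate collections
    for attempt in range(max_iterations):
        collection = sources[attempt % len(sources)]  # switch source on retry
        evidence += retrieve(collection, query)
        if evidence_is_strong(evidence):
            break  # stop iterating once the evidence is strong
    # Step 6: a grounded LLM would synthesize `evidence` into the answer.
    return " ".join(evidence) or "No grounded answer found."

print(agentic_rag("state law regulating residential evictions in Iowa"))
```

Note that the first attempt against the wrong collection comes back empty, so the loop retries with a different source, which is exactly the iteration behavior described above.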
Next page, please. Yeah, thank you. So far we have talked about what agentic RAG is, and now I will summarize how Dify enables agentic RAG and makes it practical to build and run. At a high level, Dify lets you design agent workflows visually, with clear control over retrieval, reasoning, and evaluation. This can be divided into four aspects.

First, one agent node in Dify: the core intelligence of agentic RAG is centralized in a single node. This is where intent analysis, collection selection, routing, and retrieval decisions all happen. Instead of being scattered across multiple steps, you can easily see how the agent works by looking at the variables and output in that agent node, which makes the system easy to debug and evolve.

Second, vector database integration: Dify integrates natively with vector databases like Qdrant, and we have plugins so you don't need custom glue code for the retrieval step; you can use our plugins to query the databases directly.

Third, drag-and-drop orchestration: this is built within the visual workflows. You can work just like assembling building blocks, see how the data flows and how the decisions are made, and adjust the workflows as requirements change.

And lastly, multi-step retrieval and evaluation logic: instead of stopping after a single search, the agent can retry, refine queries, or switch data sources until the evidence is strong enough. Together these four capabilities make agentic RAG not just possible but usable in real systems, and this is how we enable it. Then let me pass it to Terry to talk about Qdrant.
We got that intro from Scarlet on Dify and agents, and I'm going to talk a bit about agents as well. Hopefully you can see my slide. First, I want to quickly talk about Qdrant. Qdrant is the only vector search database that is built in Rust, and it is built for high-performance search: high speed, high accuracy, and things like memory and retrieval in your agent.

Today I'm going to talk about agents, then point you to an agentic vector search, or agentic RAG, example in Trip Builder. Then we'll talk about why you want a vector search engine, cover vector search basics, and cover agentic RAG. And then later on we'll have our demos.
One question I want to ask everyone in the audience is: when was the first AI agent created? This is a question I like to ask because AI agents have become a lot more popular as of late, but it is important to note that we've actually had AI agents since 1966. This is Eliza, which came from MIT, where they were building a chatbot back in 1966, and you can see that this is very similar to the interactions we have with chatbots today: you say something, it replies in turn. We also had Shakey the robot, which was the first physical agent, if you will, that could move around knowing what was in its environment. And finally, we have Dendral, an early chemistry expert system, a precursor to looking up medicine using computers.

I say all that to say that AI agents specifically are not new. What is new is the tools and frameworks we have available. Today Scarlet and I are each going to show you a demo that, if you had tried to build it even two years ago, would have been hundreds of lines of code. Now we can just build them in Dify through a handful of nodes, and all of a sudden you have a production-grade search system.
A great example of this is Trip Builder from Tripadvisor. If you're not familiar with Tripadvisor, it's a website where you can go and search for cities. You might search for Paris, and it'll tell you good restaurants in Paris, good hotels, attractions, museums, things like that. That was the old way of searching, and Tripadvisor uses vector search under the hood. The new way of searching, which you can find in Trip Builder, Tripadvisor's tool, is agentic search. This agentic search basically combines the queries you would otherwise make yourself into a full itinerary. Instead of going to Tripadvisor, typing in Paris, having to find the hotel yourself, having to find where to get breakfast, you can say: I'm going to Paris, it's going to be in January, I'm bringing two kids and a pet. And it will tell you everything from the hotel to where you should get breakfast to your amenities, based on your preferences.

That's the new agentic flow: we're taking things like vector search, which have already existed, and combining them to package a better user experience. In the past the user had to type in all those queries manually; they had to know how to interact with your search system. Now we just have an agent do it while asking the user, "Hey, what's your end goal? Is your end goal planning a trip to the Bahamas? Okay, let me plan that trip for you." Because we know that planning a trip to the Bahamas involves, for example, finding a hotel, finding somewhere to eat, and finding things to do throughout the day. So that's the new agentic wave, which is nothing new; like we saw with Eliza, we've had these agents for a very long time. The difference is that now we have access to better tools like Dify and vector search, which we can combine to give the user an even better experience.
What is the result of using vector search in your application? Well, there are two benefits. The first: in a lot of AI systems, what people tend to do is simply dump everything into the LLM. We know that LLMs have large context windows, so you can get away with that sometimes, but your latency is going to spike, and you're taking the most expensive part of the system and making it work the hardest. What Tripadvisor found was that by switching from an LLM-based recommendation system to a vector-search-based recommendation system, they cut latency from 40 seconds down to 6.5 seconds, and they increased the user-perceived quality of responses by 30%.

That's just one example of a vector search agent. A search agent is something like Trip Builder, where we take Tripadvisor and turn it into an end-to-end platform that gives users the results they want. So why do you need a vector search engine in that flow? I mentioned before that using vectors for recommendation cuts latency and cost, but it's really about retrieval and memory. I have this tweet here from Garry Tan, who says: "Agents without memory of me and what I care about and all the context around me are just not as useful. We are so early. It is not yet table stakes, but it will be." This is a really good example of how quickly things move in the AI world. It was posted in July; if you were to interact with an agent today that didn't have memory, you would ask yourself, what's going on? These are things people now consider table stakes: your agent should know what you prefer, and you shouldn't have to repeat things, or people feel like the product is broken. That's where vector search comes in. The first step is memory: by combining semantic search with keyword search, which we'll get into in a second, you get more accurate results.

Then there's fragmented knowledge. In enterprise situations you often have knowledge silos: you might have your Google Drive, your personal notes, a Notion workspace. How do you search over everything all at once? And then there's being fragile at scale: a lot of these demos work when it's just a PC where you have one user with root access, but what happens when you have hundreds of users and you need to manage user hierarchy? I'm going to go through these quickly; I think I'm coming up on my time.
Now for some vector search basics, just to understand a bit about how these applications work. Everyone knows what a two-dimensional vector is, where you just have a line. In a three-dimensional vector, we add that third dimension, the z-axis, so we have depth. Embedding vectors have hundreds or thousands of dimensions; we can't really visualize that, but this is a way of approximating the visualization. By having all of those dimensions, we can capture really specific information about the vectors. In the same way, we can capture dissimilar information: Scarlet is going to show you a demo of how to find similar documents, and then I'm going to show an anomaly detection demo of how to find dissimilar ones. That's another use case we have for Qdrant.
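As a toy illustration of similarity versus dissimilarity, here is a small numpy sketch (not Qdrant code): cosine similarity scores how aligned two embedding vectors are, and an anomaly check simply looks for low scores instead of high ones.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction, ~0.0 = unrelated, -1.0 = opposite.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings"; real ones have hundreds or thousands
# of dimensions, which is what lets them capture such specific meaning.
query   = np.array([0.9, 0.1, 0.0])
similar = np.array([0.8, 0.2, 0.1])
outlier = np.array([0.0, 0.1, 0.9])

print(cosine_similarity(query, similar))  # high score -> similar document
print(cosine_similarity(query, outlier))  # low score  -> candidate anomaly
```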
Vector search also has multiple applications, everything from HR to ads to online dating to the gig economy, and we can talk about different use cases in the demos. The important thing to keep in mind is that when we talk about an agent, we mean something that uses an LLM in a loop to choose the next action to perform. This is standard vector search, where you just have your query and you prompt with context based on your candidates.
This is agentic vector search, where we introduce that AI to make the decisions, like Scarlet referred to. Some of the abilities of the agent: first, query expansion. For example, if you have a shopping agent and a user types in a query like the one on the left, we expand it into the query you see on the right. Next, extracting filters: if the user has a query like the one on the left, we can automatically filter inside Qdrant. That's one of the powerful things about Qdrant: you can search based on meaning, and then combine filters on top. So you could say: I only want cool weekend outfits, but make them men's, make them size large, and make the price under $50.
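In Qdrant's Python client, that combination of meaning-based search plus hard filters looks roughly like this. This is a sketch assuming a local Qdrant instance; the collection name, payload fields, and embed() helper are illustrative, not from the demo:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model call.
    return [0.0] * 1536

# Semantic search for "cool weekend outfits", narrowed by payload filters.
hits = client.search(
    collection_name="outfits",  # hypothetical collection
    query_vector=embed("cool weekend outfits"),
    query_filter=models.Filter(
        must=[
            models.FieldCondition(key="gender", match=models.MatchValue(value="men")),
            models.FieldCondition(key="size", match=models.MatchValue(value="L")),
            models.FieldCondition(key="price", range=models.Range(lt=50.0)),
        ]
    ),
    limit=10,
)
```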
And finally, using an LLM as a judge: you could have a vision model, for example, that looks at the different items in your vector database and decides which one to return to the user based on the query. So that is a quick walkthrough of Qdrant. The way it connects with Dify, of course, is that instead of having to do all of this with your own code, you can do all of it with Dify's low-code/no-code tool builder.
Now it's time for some live demos. I will bring Scarlet back. Hold on, I'll remove this first and then I'll bring Scarlet back. Okay, great. Scarlet, you're all set, and I know you've got two lovely workflows for us. Are you ready to share?
>> Yeah.
>> I'll leave this stage to you. Take a deep breath.
>> Can you see my screen?
>> Yep, it's pretty clear. No problem.
>> Okay.
>> Yeah. So firstly, before I do the live demo, I want to proudly introduce the Qdrant plugin, which powers the workflows you are about to see; the plugin will also be released on the Dify marketplace very soon. The Qdrant plugin is a comprehensive vector database integration of Qdrant for Dify. It lets you store, vector-search, and manage vectors directly within Dify workflows, with no additional code required.

At a high level, it provides four core capabilities. First is data ingestion: you can upsert either precomputed vectors using the standard Qdrant points format, or upsert raw text directly using the Qdrant upsert-text node. For text upserts it configures the embedding model internally, so there is no need for a separate embedding node; the text is embedded and stored in one step. Second is retrieval: the plugin supports both vector search and hybrid search, which Terry has already introduced, so I won't go into detail. Third is data management: with this plugin you can query, filter, scroll, or delete points, which is essential for maintaining and auditing a production knowledge base. And fourth is collection management: collections can be created, inspected, or deleted directly inside Dify workflows, which supports the full lifecycle without leaving the platform.
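For intuition, here is roughly what those four capabilities correspond to in Qdrant's Python client; the plugin wraps these steps so no code is needed (the collection name and vector size here are illustrative and must match your embedding model):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # assumed local instance
DIM = 1536  # must match the embedding model you configure

# Collection management: create (and later inspect or delete) a collection.
client.create_collection(
    collection_name="test",
    vectors_config=models.VectorParams(size=DIM, distance=models.Distance.COSINE),
)

# Data ingestion: upsert a point, keeping the source text as payload.
client.upsert(
    collection_name="test",
    points=[models.PointStruct(id=1, vector=[0.0] * DIM, payload={"text": "example chunk"})],
)

# Retrieval: plain vector search (hybrid search would add a sparse vector).
hits = client.search(collection_name="test", query_vector=[0.0] * DIM, limit=5)

# Data management: scroll through stored points for auditing.
points, _next = client.scroll(collection_name="test", limit=10)
```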
Now it's demo time. Today I will show you two quick demos. The first one is an upsert demo using Google Drive; to be brief, it automates the process from a Google Drive upload to vector database ingestion, and it can be used to keep an enterprise knowledge base up to date. I will first show the PDF doc I will upload. This is a file from the CUAD mini dataset, a benchmark dataset curated by the Atticus Project to support AI research and development in legal contract review; the data is basically text in this format. This is the PDF I will use. After I upload it into my Google Drive, the trigger listens for changes in the Drive and outputs the change data. I then process that data to extract the specific attribute from the trigger, which is the file ID, use a download-Google-Drive node to download the file, then a Mistral OCR node to extract all the text from the downloaded PDF, and a general chunker to split the document into retrieval-friendly chunks. Finally, the Qdrant upsert-text node converts the text chunks into embeddings and stores them in the Qdrant collection for vector search. This node automatically extracts the text field from each chunk object and converts it into the correct Qdrant payload format.
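As a rough sketch of what the chunking step in this pipeline does (the actual node is configurable; the sizes here are illustrative):

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size chunker with overlap, standing in for the general
    chunker node. Overlap keeps sentences that straddle a boundary
    retrievable from both neighboring chunks."""
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]

# Example: 1200 characters of OCR output become three overlapping chunks.
print(len(chunk("x" * 1200)))  # -> 3
```

Each chunk then flows into the upsert-text node, which handles the embedding and payload formatting in one step, as described above.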
Let me do a test run. Now you can see the trigger has started listening for events in Google Drive, and then I go to my Drive and do a quick upload. Let me see. Yes, you can see it is working now. It will continue to listen for events from the trigger, so the workflow can do batch processing: if you keep uploading documents, it will keep running. Let's wait a second for the workflow to run.

Oh, maybe there is a problem with my subscription here; I can create a new one in a few seconds, sorry about that. Just a second. This is how I set up the Google Drive change node, which is tied to my Drive. Hopefully it will work this time.
Oh, now it works, finally. You can see the workflow has gone to the last stage; when it is running, the speed is really quick. Now let's see the final result here. From the tracing panel we can see all the points were successfully upserted: in the output you can see the point ID, the text embedded as payload, and the vector dimension. This means it was successfully upserted into the vector database. The collection name I set is "test", so let's see what we have in the test collection on Qdrant. Here, if you open this, you can see the point has already been upserted, with the text as payload; these are the chunks we got. So finally we are done. Let's go to my second demo.
I think this is the more interesting one, and this is the real agentic RAG demo. I built a legal research agent, which can help with legal research, of course. The dataset I use contains three collections, with over 69,000 vectors in total, and you can see the dataset info here. I had already introduced CUAD mini, which the PDF file I uploaded is part of, but that is a rather small dataset, so I also use a much larger one, HousingQA, and I had already upserted all of these datasets into the Qdrant database, which I can show you. This is the Maine housing statutes collection, which contains nearly 30,000 points, and I also have the Iowa housing statutes, with 40,000 document chunks covering landlord-tenant relationships, evictions, and security deposits. I will first do a test run, so we go through the whole workflow. Let me do a refresh.
I will do a test run and try to ask the agent a test question. In a real scenario this would also appear as a chatbot, so you can ask a question related to this knowledge database in the chat, and the agent will start with intent analysis, then use the tools to query the different collections, and give structured output.
While the process runs, I will go deeper into the agent node here. You can see the agentic workflow is pretty simple: it has just four nodes and is really easy to build. The agent node takes over the core retrieval logic, as we discussed before. Here we are using function calling as the agentic strategy, which allows the model to decide when and how to call the different tools, and we are using GPT-5 for the model, because we need some reasoning ability for the agent's decision-making. For the tools we have Qdrant hybrid search, vector search, and Google search as a fallback, in case it doesn't find anything in the database.
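To illustrate the function-calling strategy, a retrieval tool exposed to the model can be described with a schema like the following, and the model then decides when to call it and with which collection. This is an OpenAI-style sketch; the names and collection values are illustrative, not the plugin's exact definitions:

```python
# Illustrative tool schema for the hybrid search tool. The model chooses
# the collection and rewrites the query; the enum values below are
# hypothetical names for the three collections in this demo.
hybrid_search_tool = {
    "type": "function",
    "function": {
        "name": "qdrant_hybrid_search",
        "description": "Search a Qdrant collection with dense + sparse hybrid retrieval.",
        "parameters": {
            "type": "object",
            "properties": {
                "collection": {
                    "type": "string",
                    "enum": ["cuad_mini", "maine_housing_statutes", "iowa_housing_statutes"],
                },
                "query": {"type": "string", "description": "Rewritten search query."},
                "limit": {"type": "integer", "default": 5},
            },
            "required": ["collection", "query"],
        },
    },
}
```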
We have the instruction here, which is the prompt given to the agent. At a high level, the prompt tells the agent: first, you are a Qdrant document retrieval specialist, and you should always ground answers in the Qdrant collections if you can reach them through search. Then I tell the agent what the retrieval loop looks like: inspect the query, do the intent analysis, construct another query, and pick the collection from the table below. I also give the agent the collection info, then have it choose a tool, for example the hybrid search tool, the vector search tool, or web search, and then do the evaluation. I ask the agent to do an evaluation: if the results are weak, it should retry with a rephrased query or a second collection, and after two unsuccessful attempts it should call Google search, and lastly summarize the findings. I also give it some execution notes and a response policy to make the result more structured, and you can edit this part to get the agent to give you the best results.
Hopefully... yeah, great. We can see the test run has already finished. I asked: is there a state law regulating residential evictions in Iowa? The agent gives us the answer, and we can see how it works here. If we go to tracing and click the agent, you can see this is really easy to debug, because you can see the input given to the agent and the output the agent gives. You can see the agent retrying the search in this collection, and you can see that it ran a second attempt with a refined query; this is after it conducted the evaluation. We can also see the text it retrieves and the payload. So this is really easy to debug, and you can also inspect the variables to see what each node does.

Perfect. Do we have time for another test question, or should we pass to Terry for his demo? Because we still have another demo left.
>> Yeah, sure. I think we can move to Terry for his demo, because our audience will receive a webinar resource pack where they can also try the workflow with their own questions.
>> Yeah. So let me stop sharing and pass it over.
>> Okay, thank you so much, Scarlet. I really love the second workflow; I think it's super useful, and I can't wait to try it myself. Now I will bring Terry to the stage. Take a deep breath and have a little rest, Scarlet, and I'm bringing Terry back. Give me a second. Okay. Hi Terry, I'll leave this stage to you for your demo.
>> The demo I'm going to show you today is an image anomaly detection service, or agent I should say, built with Qdrant on Dify. I want to show you what this looks like in practice. Here I have this folder of images, and we can take a look at one: this is a sample of a normal cashew product. If we look at it, it's a normal cashew that has no issues; a perfect specimen of a cashew. What we can do with this system is upload this picture and ask our system: is this product defective? And it says no, with 92% confidence that there's no issue.

Now let's try with a defective cashew. I have this defective cashew, which we can take a look at, and you'll notice that something went wrong in the production process there. Our agent says: nope, this is a fail. It recognizes that the defect is that it was burned, and then it gives us some kind of action. And of course this works with other examples too. Let's try the gum: is this defective? We get a yes, it's defective, and the defect here is the spot. And then if I try again with the clean gum: is this defective?
That's going to work. So, what dataset are we using here? We're using something called the VisA dataset, and I can actually open it up. We just took a look at gum, so let's go to the chewing gum and take a look. Inside we have the anomaly group and the normal photos. If we look at the anomaly group, you can see we have these folders, and in each folder we have an image plus a mask, which tells us what the issue is with the image. The normal images are just normal samples. You can see we have candles, capsules, cashews, chewing gum, fryum, macaroni, etc. What we built here, if I can go to the Dify agent, is this flow where we take the dataset, which is called the VisA dataset; these are the images we just saw in our anomaly detection.
There are 12 different categories of images. The very first thing we do is upsert these images to Qdrant, and we also use filters; I mentioned earlier that you can combine semantic search with filters. We upload the image and turn it into an embedding, which turns the image into numbers so that we can compare how similar or dissimilar images are to each other. Then, when we get a new product, we first identify what kind of product it is: okay, this is a candle, this is chewing gum, this is fryum, this is a PCB. Once we've identified what the image is, we can compare the vector of our uploaded image with the centroid vector of the normal images. We basically average out what we consider to be good samples, then look at the new incoming sample and ask: does this sample match in terms of similarity? If it does, we can say everything's good; if it doesn't, we want to mark it as an anomaly.
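A minimal numpy sketch of that centroid comparison (the embedding dimension, threshold, and data are illustrative stand-ins, not the demo's actual values):

```python
import numpy as np

def centroid_anomaly_check(normal_vectors: np.ndarray, candidate: np.ndarray,
                           threshold: float = 0.85) -> tuple[str, float]:
    # Average the known-good embeddings into a single centroid vector...
    centroid = normal_vectors.mean(axis=0)
    # ...then score the new sample by cosine similarity to that centroid.
    sim = candidate @ centroid / (np.linalg.norm(candidate) * np.linalg.norm(centroid))
    return ("ok" if sim >= threshold else "anomaly"), float(sim)

rng = np.random.default_rng(0)
normals = rng.normal(size=(50, 512))        # stand-in "normal product" embeddings
near_centroid = normals.mean(axis=0) * 1.1  # a sample close to the average
random_sample = rng.normal(size=512)        # a sample unlike the normals

print(centroid_anomaly_check(normals, near_centroid))  # -> ("ok", ~1.0)
print(centroid_anomaly_check(normals, random_sample))  # -> ("anomaly", ~0.0)
```

The threshold is the quality-control knob mentioned later: raising it flags smaller deviations as defects, lowering it lets minor blemishes pass.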
And really quickly, I want to show you the power of Dify here. Suppose I wanted to simply spin up a service where I can search for images. Here I have the H&M dataset: we're searching for red shoes in this query, and below we're doing an image search where I give it this image and search for similar images. You can see there's a good amount of code I needed to write here, and I'd probably need to know Python and Qdrant pretty in-depth to build this. What we're doing here is building the same thing, but in Dify. If we take a look at how the workflow actually works, you can see I uploaded the file to Dify, and we have a little script here that just extracts the URL from Dify's file upload service, because Qdrant only wants the image URL.
That tool Scarlet showed you earlier, I actually went ahead and have already started expanding on. And of course we're going to make this tool available to the community, so if you have any ideas, please let us know. What I added was an image search tool, and what this image search tool does is search images: we take the user input, which can be either a URL or a local file upload, embed the image, then upload it to Qdrant.
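Under the hood, the image search step amounts to a nearest-neighbor query like this. This is a sketch: the embed_image() helper, collection name, and payload field are illustrative:

```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

def embed_image(url: str) -> list[float]:
    # Stand-in for a multimodal (CLIP-style) image embedding call.
    return [0.0] * 512

# Find the 10 stored images most similar to the uploaded one.
hits = client.search(
    collection_name="visa_products",  # hypothetical collection name
    query_vector=embed_image("https://example.com/sample-cashew.jpg"),
    limit=10,
)
for hit in hits:
    print(hit.score, hit.payload.get("category"))
```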
Here, if we take a look at our last run, you can see that in the VisA products collection we did that image search and found 10 similar images. I can even bring up this VisA products collection, the same way Scarlet showed you the CUAD collection, here in my Qdrant cluster. If we take a look at my cluster UI and I search for VisA, I have this VisA products collection with all the different products inside this dataset. So when I search, I'm comparing that vector to the images inside my dataset to (a) find out what the product is and (b) get scores for relevance. That's what all this output is down here: it's similar to, if I search for this image, here are the similar images.
But like I mentioned, this is anomaly detection, so we're actually doing a dissimilarity search: we're seeing how dissimilar these items are from each other. And then we have the LLM piece. We have a prompt here where we take in the search results, and based on those results we produce a confidence score, because you might have an image that could go either way: is this a defect? Depending on your quality control, you might want to ship this product, or you might say, "Hey, this product is defective." You can go into your back end and say, "You know what? We don't actually care about small defects like this; let this pass." But of course, if you have a major defect like this, there's no situation where we want that to pass, so we can set our threshold to make sure it captures things like that.
So that is the LLM piece. The way you can think about this is: we have a demo here, Qdrant anomaly detection, but imagine you worked at a factory and you have cameras inside your factory. What you'd want to do is take a picture of the production process as it's going, and then compare the vectors of each image, or each batch, or each product, however you are manufacturing, to find out: is this product we're delivering an anomaly, or is it the normal product? After we do all of that, we simply have a response node that responds to the user with that data. And just like that, you can have a production-grade anomaly detection system built with Dify and Qdrant.
And I think the cool thing about this is, now that I have this data inside Qdrant, I can edit this. Let's say I have another collection. Let's actually try this live and see what happens; hopefully, we pray to the demo gods. Let's say I have another multimodal collection: H&M. I think I have a multimodal one. Let's see. Oh, I just missed it, sorry; I saw it back here. Let's just refresh; my eyes aren't working today. Okay, so I have this H&M products multimodal collection.

In theory, I can just change the collection name, right? Before we were using VisA products; now we're using H&M multimodal, if it's using the same embeddings. I would also want to change the prompt to say something different, and Dify actually has this AI generation feature, so we can use the AI to change the prompt; basically we would change it to a fashion prompt. Then, if we run this and upload an image, it's still going to work; it's just going to do the comparisons against the images in my H&M demo collection. So if I had a shirt, for example, it would go and say, "Hey, this shirt is defective." That way you can really easily switch up these workflows on the go, and as long as you have your data inside Qdrant, you're good to go. And we're also going to work on creating some workflows that let you upload your own data, so that no matter what your business function is, you can use Dify and Qdrant together to unlock that information.
So that is my presentation, or my demo, of the Qdrant Dify tool.