What is GraphRAG? (…and how you can build a GraphRAG solution quickly!) | Amazon Web Services
By Amazon Web Services
Summary
## Key takeaways

- **GraphRAG vs. traditional RAG**: Traditional RAG relies on vector similarity, which can miss relevant information that isn't semantically close. GraphRAG uses graph traversals to connect related but dissimilar information, producing a more comprehensive and accurate response by considering relationships such as parent-child or cause-effect. [07:04]
- **GraphRAG: beyond vector similarity**: While RAG uses vectors to find semantically similar information, GraphRAG leverages graph structures to identify connections between data points that may be unrelated in meaning but contextually linked, allowing a deeper understanding of complex relationships. [06:41]
- **GraphRAG example: sales prospects**: Traditional RAG might incorrectly predict good sales prospects based on demand alone, but GraphRAG can uncover critical context, such as shipping delays caused by a blocked canal, leading to a more accurate prediction of a negative sales impact. [09:19]
- **GraphRAG toolkit for customization**: The open-source graphrag-toolkit is highly customizable, letting users choose different models for each AI process, adjust batch sizes, and select from various graph and vector stores, such as Neptune Analytics or Amazon OpenSearch Serverless. [19:19]
- **Amazon Bedrock Knowledge Bases for GraphRAG**: Amazon Bedrock Knowledge Bases provide a fully managed GraphRAG solution backed by Amazon Neptune, automatically creating a graph data model from ingested documents. This improves response accuracy and reduces hallucinations by connecting information across multiple logical steps and documents. [12:45]
Topics Covered
- Graphs enhance AI beyond simple vector similarity.
- GraphRAG navigates complex relationships for better accuracy.
- GraphRAG builds context through lineage and summarized facts.
- Managed vs. open-source GraphRAG: scalability vs. customization.
Full Transcript
Hey, and welcome to this Neptune snackable session, which is going to talk about GraphRAG. The topic's been around for quite a while, but it's something really close to my heart: I absolutely love generative AI and how it can be integrated with graphs. So I'm really excited to talk about the topic today, and also about how you can start building a GraphRAG solution quickly on Amazon Neptune. Let's dive straight into it.
So what are we going to talk about today? First, enhancing generative AI with graphs: what types of use cases do we actually see that combine graphs and generative AI? Next, we're going to explore GraphRAG and how it differs from traditional RAG. Then we'll look at the solutions and toolkits that already exist to give you the ability to build your own GraphRAG solutions using Amazon Neptune, and finally I want to cover some next steps. Lots to share with you, so let's get cracking.

So, enhancing generative AI with graphs. We see two very distinct patterns in the way generative AI can integrate with graphs. The first is open-domain question answering. This is where you've already got a graph and you're effectively saying, okay, I'm going to allow my users to run natural language queries across the entire graph; they can literally ask for anything. There are good reasons for doing this: it provides flexibility, and it provides flexible access to the graph data for users who potentially aren't familiar with graphs or with the underlying graph schema.
When should you not use it? Well, it provides direct access to your entire dataset. If you're ever building an open-domain Q&A application, you really want to put guardrails in place to ensure users aren't creating new data in your knowledge graph, or running a "give me everything" query that returns everything and overloads an API.
Next, we've got defined-domain question answering. This differs slightly in that you're still allowing your users to ask a natural language question of your graph, but you're now using an LLM to first identify the context of the question, that is, what type of question they are asking, and then to extract elements from that question to put into templated queries.

This approach again provides access to the graph data, but in a much more tightly controlled manner. It's especially useful when a domain is complex and non-trivial to query without expertise.

When should you not use it? Well, you've got to define these templated queries. If you've got lots of them, that leads to maintainability issues, and someone knowledgeable about the graph and the domain will need to write them.
Next, we've got the graph generation side. The previous patterns were effectively text-to-graph querying: you've got a knowledge graph, and you're asking the LLM to query it. Now we're asking the LLM to create the knowledge graph. So, knowledge graph generation: how does this work? Well, you provide a file, a data source, to the LLM and say, "Okay, create me a knowledge graph from this." It's going to perform entity extraction, analyse those entities, understand the context in which they're all connected, and then effectively create the nodes and entities within your graph. This is great because it's a much more hands-off approach to creating a knowledge graph; you're not having to manually create these entities. The caveat, as with anything, is that there's a significant amount of trial and error involved. While the technology is getting much better at this, and the tools I'll be talking about today create quite opinionated models, you're still going to need some level of domain knowledge to verify that the knowledge graph has actually been created properly.
And then finally we have graph-enhanced RAG, or GraphRAG. This is where you're still creating your graph, but you're now leveraging the power of graph traversals in addition to things like similarity lookups to provide more comprehensive, more accurate answers. We'll talk a little more about this approach shortly.

So, as we've seen, generative AI plus graphs is not always GraphRAG. So what is GraphRAG?
This sentence is actually quite important to understand: sometimes the most relevant information for answering a question is not the closest in meaning. What does that mean? Let's break RAG down into its component parts and see where GraphRAG fits in. First of all, we have vectors. Vectors are essentially a mathematical representation, an encoding, of something. That something could be a word, a picture, a sound bite, a video; it doesn't really matter. Fundamentally, vectors are used by machine learning and AI to process and understand information.

In RAG applications, they're used to identify similarity between objects. For example, "dog" and "cat" are close in the vector space because they relate to the same family of words, such as animals or pets.
Graphs, as we discovered earlier, are used to identify connections between data. Using the example of cats and dogs, these might have connections with something like a ball or a toy: things that aren't directly similar, but are still related to the core concepts.

Further to this notion, RAG solutions primarily focus on things that are similar in the vector space; this is generally known as vector similarity search, using cosine similarity and top-k. However, this approach fails when you need to retrieve all relevant information: information that is unrelated to the search term, or in fact directly opposite to it in the vector space, will fundamentally be ignored.

So how do we get around this? Using graphs gives us several different methods of connecting information that is related, for example parent-child relationships, cause and effect, or semantic similarity. We can use the graph to model all these different types of relationship in order to interpret the data better.
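To make the "similar vs. connected" distinction concrete, here's a minimal Python sketch (not from the talk). The two-dimensional vectors and the tiny edge set are hand-made stand-ins for real embeddings and a real graph:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 2-D "embeddings" (real embeddings have hundreds of dimensions).
embeddings = {
    "dog":  (0.9, 0.1),
    "cat":  (0.8, 0.2),
    "ball": (0.1, 0.9),   # dissimilar to "dog" in the vector space...
}

# ...but a graph can still connect them through relationships.
graph_edges = {
    ("dog", "plays_with", "ball"),
    ("cat", "plays_with", "ball"),
}

def top_k(query, k=1):
    """Pure vector retrieval: returns only the most similar items."""
    scored = sorted(embeddings,
                    key=lambda w: cosine_similarity(embeddings[query], embeddings[w]),
                    reverse=True)
    return [w for w in scored if w != query][:k]

def graph_neighbours(term):
    """Graph retrieval: anything connected, regardless of similarity."""
    return ({t for s, _, t in graph_edges if s == term}
            | {s for s, _, t in graph_edges if t == term})

print(top_k("dog"))             # vector search finds "cat" but misses "ball"
print(graph_neighbours("dog"))  # the graph traversal surfaces "ball"
```

Vector search alone retrieves "cat" for "dog"; only the graph hop surfaces "ball", even though it sits far away in the vector space.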
Just a quick overview of some graph terminology before we dive any deeper. Firstly, we have nodes. These represent entities, the "things" within our graph. An entity can be something in the real world, or it could be transient or virtual; it doesn't matter. It's an actual thing, like a person. Nodes can also have properties, which are key-value-pair attributes associated with that object.

Next, we have edges. Edges determine the type of relationship between nodes. Two nodes can have multiple edges between them, each representing an instance of a relationship, for example, the number of times I've signed up for and subsequently cancelled my gym membership, as well as the type of relationship itself. Edges, too, can have properties; these include things like weighting properties, or metadata such as the date and time when a relationship was created or when it ended.
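As a quick illustration of this terminology, here's a minimal property-graph sketch in Python. The `Node` and `Edge` classes and the gym-membership data are invented for illustration; Neptune itself is queried through languages like openCypher or Gremlin, not a Python API like this:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    label: str                                       # e.g. "Person"
    properties: dict = field(default_factory=dict)   # key-value attributes

@dataclass
class Edge:
    source: str
    target: str
    type: str                                        # relationship type
    properties: dict = field(default_factory=dict)   # e.g. weights, timestamps

me = Node("p1", "Person", {"name": "Alice"})
gym = Node("g1", "Gym", {"name": "Example Gym"})

# Two nodes can share multiple edges, each an instance of a relationship,
# with metadata such as when the relationship started.
history = [
    Edge("p1", "g1", "SUBSCRIBED_TO", {"date": "2023-01-05"}),
    Edge("p1", "g1", "CANCELLED",     {"date": "2023-02-01"}),
    Edge("p1", "g1", "SUBSCRIBED_TO", {"date": "2024-01-04"}),
]

signups = [e for e in history if e.type == "SUBSCRIBED_TO"]
print(len(signups))  # 2 sign-ups between the same pair of nodes
```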
Let's look at a GraphRAG example to see the differences between traditional, vector-based RAG and GraphRAG, and how they affect the accuracy of a response.

We have a RAG application that has stored information relating to a specific company, Example Corp, which sells widgets. They've partnered with AnyCompany Logistics to deliver those widgets to the UK at Christmas. Here we can see that information along with sales trend data, and we can see that there will be huge demand for these widgets over Christmas in the UK. So, as a business, I want to understand my sales prospects for Example Corp in the UK. If we query that using traditional vector RAG, the RAG application identifies information similar to my question; we can see Example Corp in both objects here. So we can surmise that sales are going to be good. Hooray! Fantastic. That's excellent.
Okay, so what does this actually look like with a GraphRAG approach? Well, by using a graph to connect related but dissimilar information, we can get a more holistic view of what the sales prospects will be for Example Corp. Here we can see that AnyCompany Logistics is actually cutting shipping times by using the fictitious canal, which is a net positive. However, we can also see that the very same canal is blocked, causing delays. It's this connected information, specifically the delays relating to the fictitious canal, that provides the additional context the AI needs to generate a more accurate response. So, whilst it immediately looks like we've got huge demand, fundamentally we can't supply that demand, so sales are likely to be negatively impacted.
So how does GraphRAG fundamentally work? Well, the first key step is the build-and-maintain step, where multiple foundation models are used to generate a vector embedding of a given chunk, to extract specific entities and facts from that chunk, and to identify entities based on known or given synonyms; for example, AMZN is the same as Amazon, which is the same as Amazon LLC, and so on. From there, those sources, chunks, entities, facts, and so on are all stored in the graph, connected by contextual relationships derived through the extraction process, for example "same as" or "related to".
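The synonym-resolution step described above can be sketched as a simple canonicalization pass over extracted mentions. The synonym table and the mention list below are hand-made assumptions, not actual extraction output:

```python
# Map known aliases (lower-cased) to a canonical entity name.
SYNONYMS = {
    "amzn": "Amazon",
    "amazon": "Amazon",
    "amazon llc": "Amazon",
}

def canonicalise(mention: str) -> str:
    """Resolve an extracted mention to its canonical entity, if known."""
    return SYNONYMS.get(mention.strip().lower(), mention)

# Mentions pulled from different chunks all collapse to one graph entity.
extracted = ["AMZN", "Amazon LLC", "Example Corp"]
entities = {canonicalise(m) for m in extracted}
print(entities)  # {'Amazon', 'Example Corp'}
```

Collapsing aliases this way is what lets chunks mentioning "AMZN" and "Amazon LLC" end up connected to the same entity node.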
The next key steps are the retrieve and generate steps. GraphRAG works by performing an initial vector similarity search to identify information that is similar in the vector space. This is important, because both RAG and GraphRAG effectively start the same way. From there, however, graph queries are run to traverse two or potentially three hops out from these nodes to find all the connected yet dissimilar information. Finally, the foundation model generates a response based on the retrieved graph objects; that includes the prompt, the question, and the context in which the graph objects are connected.
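The retrieve step, vector-similar seeds followed by a multi-hop traversal, can be sketched like this. The adjacency list, the seed chunks, and the hop count are invented stand-ins loosely based on the earlier Example Corp scenario:

```python
from collections import deque

# Hand-made adjacency list standing in for the stored graph.
graph = {
    "demand_report":        ["example_corp"],
    "example_corp":         ["demand_report", "anycompany_logistics"],
    "anycompany_logistics": ["example_corp", "fictitious_canal"],
    "fictitious_canal":     ["anycompany_logistics", "canal_blockage"],
    "canal_blockage":       ["fictitious_canal"],
}

def expand(seeds, hops=2):
    """Breadth-first expansion up to `hops` edges out from the seed nodes."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # don't traverse past the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen

# Pretend the vector similarity search matched these chunks to the question.
seeds = ["example_corp", "demand_report"]
context = expand(seeds, hops=3)
print("canal_blockage" in context)  # True: the blockage is three hops out
```

With only two hops the blockage node stays out of reach, which is why the talk mentions traversing "two or potentially three hops".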
That was all the theory, so how do you go about building a GraphRAG solution using Neptune? Well, we have two primary solutions that I'm going to discuss. First, we have Amazon Bedrock Knowledge Bases, a fully managed GraphRAG solution. Secondly, we've got the graphrag-toolkit, an open-source, Python-based toolkit that provides the functionality to choose between vector stores as well as embedding, extraction, and response models. Which one you use really depends on your use case.

Let's talk first about Bedrock Knowledge Bases for GraphRAG. Amazon Bedrock Knowledge Bases offers a fully managed GraphRAG feature with Amazon Neptune. It creates an Amazon Neptune Analytics graph behind the scenes, and it provides the capability to create your graph data model through Bedrock Knowledge Bases. You can ingest data from multiple sources to create a generic graph data model that describes the different entities and the relationships between them. GraphRAG automatically identifies and leverages relationships between entities and structural elements within documents ingested into Knowledge Bases.

What this means is that it enables a more comprehensive, contextually relevant response from a foundation model, particularly when the information needs to be connected through multiple logical steps. It also enables better cross-document reasoning, allowing more precise and contextually accurate answers by connecting information across various sources; you're not just using a single-source approach. This further enhances accuracy and minimizes hallucinations.
This is what the GraphRAG data model actually looks like. First, through chunk creation and embeddings generation, documents are broken down into smaller, manageable chunks, and vector embeddings are generated from those chunks to encode their semantic meaning: what do these pieces actually mean, and what do they relate to? Then entities are extracted from the chunks; they serve as the links between related chunks, which are the nodes you can see here, based on shared entities and context. The resulting graph, as I mentioned, is stored in Neptune Analytics for efficient querying and traversal.
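A rough sketch of that model, chunks as nodes, linked wherever they share an extracted entity. The chunk texts and entity sets are hand-made stand-ins for real extraction output:

```python
from itertools import combinations

# Hand-made chunks with the entities extracted from each.
chunks = {
    "c1": {"text": "Example Corp sells widgets.",            "entities": {"Example Corp"}},
    "c2": {"text": "Example Corp partners with AnyCompany.", "entities": {"Example Corp", "AnyCompany"}},
    "c3": {"text": "AnyCompany ships via the canal.",        "entities": {"AnyCompany"}},
}

# Link every pair of chunks that has at least one entity in common.
edges = set()
for a, b in combinations(sorted(chunks), 2):
    shared = chunks[a]["entities"] & chunks[b]["entities"]
    if shared:
        edges.add((a, b, frozenset(shared)))

print(sorted((a, b) for a, b, _ in edges))  # [('c1', 'c2'), ('c2', 'c3')]
```

Notice that c1 and c3 share no entity, yet a traversal can still reach one from the other through c2, which is exactly the cross-document connection the managed feature exploits.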
Let's now talk about the graphrag-toolkit. The graphrag-toolkit is a Python-based toolkit library that integrates with LlamaIndex to combine generative AI with graphs. It provides the ability to automatically create a graph from unstructured data. It uses constructs similar to the Bedrock Knowledge Bases graph data model to create a lexical graph representation of the ingested unstructured data. Under the hood, it uses LlamaIndex components to build the graph, as well as to integrate with foundation models during the various AI processing stages: generating embeddings, extracting entities, and subsequently producing responses.
The lexical graph data model is effectively broken into three tiers. The first is lineage, the blue nodes you can see. This defines where a specific chunk of information comes from, i.e. the source document, and also which chunk or chunks it is connected to. By having this relationship model, we can provide a foundation model with the context to understand what precedes or follows a specific chunk, and from which chunk and which source a given chunk actually came.

Next we've got summarization; these are the green nodes. This tier is made up of topic, statement, and fact nodes. Topics represent central themes, such as "Amazon Neptune database service" or "Neptune database technical features". Statements are directly connected to topics; you can see the "belongs to" relationship here.
Statements provide support for known pieces of information, such as "Amazon Neptune supports the openCypher query language". Facts are then derived from a statement in a manner that identifies both the connection and the type of fact, for example "Amazon Neptune SUPPORTS openCypher", capitalizing the "supports" keyword. The final tier is entity relationships; this is the red node. This tier is made up of entities, the things identified as part of a fact. They're modelled as triples based on the context of a fact, and an entity can be either the subject or the object of a fact.
Using the previous example, "Amazon Neptune supports openCypher": both Amazon Neptune and openCypher are entities. However, the subject of the fact is Amazon Neptune, and openCypher is the object of the fact, with the relation being "supports". In addition, entities that are part of multiple triples are also connected to each other, which helps determine the context between them.
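The subject/object roles described above can be sketched with plain (subject, predicate, object) triples. The first fact is the talk's own example; the second is an assumed extra added purely for illustration:

```python
# Facts from the entity tier, modelled as (subject, predicate, object) triples.
facts = [
    ("Amazon Neptune", "SUPPORTS", "openCypher"),
    ("Amazon Neptune", "SUPPORTS", "Gremlin"),  # assumed extra fact for illustration
]

def as_subject(entity):
    """Facts in which the entity plays the subject role."""
    return [f for f in facts if f[0] == entity]

def as_object(entity):
    """Facts in which the entity plays the object role."""
    return [f for f in facts if f[2] == entity]

def connected_entities(entity):
    """Entities linked to `entity` through any shared fact."""
    linked = set()
    for s, _, o in facts:
        if s == entity:
            linked.add(o)
        elif o == entity:
            linked.add(s)
    return linked

print(as_object("openCypher"))              # Amazon Neptune is the subject here
print(connected_entities("Amazon Neptune"))
```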
So when should you use managed versus open-source GraphRAG? From the fully managed perspective, scalability and performance are optimized for enterprise applications at any scale within AWS. It's a fully managed service, so, as you'd expect for security and compliance, it's integrated with AWS security protocols. In addition, from an operational perspective, you're looking at a reduced development and operational burden, with AWS support and maintenance; you don't have developers continually tweaking prompts and the like, because you're utilizing that fully managed service approach.

From the open-source perspective, one of the big positives is its high customizability. You can choose different models for different AI processes; you can change the batch sizes and the number of workers; you can really tweak the compute for the specific use case you're working on; and you can host it across multiple accounts. So it provides a lot of customization. In addition, it offers open integration: you've got choices for your graph store, so you can use Neptune Database or Neptune Analytics, and lots of choice for your vector store as well. You can combine Neptune Analytics for both graph and vector, or Amazon Neptune Database for your graph with Amazon OpenSearch Serverless for your vector store, or pgvector; there are lots of different options. So if you're more comfortable with, or have experience working with, a particular vector store, there's a good chance the toolkit will immediately work with it.
For next steps, I just wanted to share some developer resources. We've got the graphrag-toolkit; our public docs on how to build a knowledge base using Bedrock; and Neptune MCP, a video that we're going to be producing shortly. There are also some other videos you might be interested in: discovering graph data modeling using generative AI, diagram as code, and integrating LLMs with LangChain. And finally, we've got some blogs: introducing the graphrag-toolkit, and introducing Amazon Neptune plus MCP.

While I'm here, I also want to share the QR code for the graphrag-toolkit. There's lots of information there, including example notebooks, which I'd highly recommend. And with that, I'd like to thank you very much.