What is GraphRAG? (…and how you can build a GraphRAG solution quickly!) | Amazon Web Services
By Amazon Web Services
Summary
## Key takeaways

- **GraphRAG vs. traditional RAG**: Traditional RAG relies on vector similarity, which can miss relevant information that isn't semantically close. GraphRAG uses graph traversals to connect related but dissimilar information, producing a more comprehensive and accurate response by considering relationships such as parent-child or cause-effect. [07:04]
- **GraphRAG: beyond vector similarity**: While RAG uses vectors to find semantically similar information, GraphRAG leverages graph structures to identify connections between data points that may be unrelated in meaning but contextually linked, allowing a deeper understanding of complex relationships. [06:41]
- **GraphRAG example: sales prospects**: Traditional RAG might incorrectly predict good sales prospects based on demand alone, but GraphRAG can uncover critical context, such as shipping delays caused by a blocked canal, leading to a more accurate prediction of a negative sales impact. [09:19]
- **GraphRAG toolkit for customization**: The open-source graphrag-toolkit is highly customizable, letting users choose different models for each AI process, adjust batch sizes, and select from various graph and vector stores, such as Neptune Analytics or Amazon OpenSearch Serverless. [19:19]
- **Amazon Bedrock Knowledge Bases for GraphRAG**: Amazon Bedrock Knowledge Bases provide a fully managed GraphRAG solution backed by Amazon Neptune, automatically creating a graph data model from ingested documents. This improves response accuracy and reduces hallucinations by connecting information across multiple logical steps and documents. [12:45]
Topics Covered
- Graphs enhance AI beyond simple vector similarity.
- GraphRAG navigates complex relationships for better accuracy.
- GraphRAG builds context through lineage and summarized facts.
- Managed vs. open-source GraphRAG: scalability vs. customization.
Full Transcript
Hey, and welcome to this Neptune snackable session, which is going to talk about GraphRAG. The topic's been around for quite a while, but it's something really close to my heart: I absolutely love generative AI and how it can be integrated with graphs. So I'm really excited to talk about the topic today, and also about how you can start building a GraphRAG solution quickly on Amazon Neptune. Let's dive straight into it.
So what are we going to talk about today? First, enhancing generative AI with graphs: what types of use cases do we actually see that combine graphs and generative AI? Next, we're going to explore GraphRAG and how it differs from traditional RAG. Then we'll look at the solutions and toolkits that already exist to give you the ability to build your own GraphRAG solutions using Amazon Neptune, and finally I want to cover some next steps. Lots to share with you, so let's get cracking.

So, enhancing generative AI with graphs. We see two very distinct patterns in the way generative AI can integrate with graphs. The first is open-domain question answering. This is where you've already got a graph and you're effectively saying, okay, I'm going to allow my users to run natural language queries across the entire graph; they can literally ask for anything. There are good reasons for doing this: it provides flexibility, and it provides flexible access to the graph data for users who potentially aren't familiar with graphs or with the underlying graph schema.
When should you not use it? Well, it provides direct access to your entire dataset. If you're ever building an open-domain Q&A application, you really want to put guardrails in place to ensure users aren't creating new data in your knowledge graph, or running a "give me everything" query that returns everything and overloads an API.
Next, we've got defined-domain question answering. This differs slightly in that you're still allowing your users to ask a natural language question of your graph, but you're now using an LLM to first identify the context of the question, that is, what type of question they are asking, and then to extract elements from that question to put into templated queries.

This approach again provides access to the graph data, but in a much more tightly controlled manner. It's especially useful when a domain is complex and non-trivial to query without expertise.

When should you not use it? Well, you've got to define these templated queries. If you've got lots of them, that leads to maintainability issues, and someone knowledgeable about the graph and the domain will need to write them.
Next, we've got the graph generation side. The previous patterns were effectively text-to-graph querying: you've got a knowledge graph, and you're asking the LLM to query it. Now we're asking the LLM to create the knowledge graph. So, knowledge graph generation: how does this work? Well, you provide a file, a data source, to the LLM and say, "Okay, create me a knowledge graph from this." It's going to perform entity extraction, analyse those entities, understand the context in which they're all connected, and then effectively create the nodes and entities within your graph. This is great because it's a much more hands-off approach to creating a knowledge graph; you're not having to manually create these entities. The caveat, as with anything, is that there's a significant amount of trial and error involved. While the technology is getting much better at this, and the tools I'll be talking about today create quite opinionated models, you're still going to need some level of domain knowledge to verify that the knowledge graph has actually been created properly.
And then finally we have graph-enhanced RAG, or GraphRAG. This is where you're still creating your graph, but you're now leveraging the power of graph traversals in addition to things like similarity lookups to provide more comprehensive, more accurate answers. We'll talk a little more about this approach shortly.

So, as we've seen, generative AI plus graphs is not always GraphRAG. So what is GraphRAG?
This sentence is actually quite important to understand: sometimes the most relevant information for answering a question is not the closest in meaning. What does that mean? Let's break RAG down into its component parts and see where GraphRAG fits in. First of all, we have vectors. Vectors are essentially a mathematical representation, an encoding, of something. That something could be a word, a picture, a sound bite, a video; it doesn't really matter. Fundamentally, vectors are used by machine learning and AI to process and understand information.

In RAG applications, they're used to identify similarity between objects. For example, "dog" and "cat" are close in the vector space because they relate to the same family of words, such as animals or pets.
Graphs, as we discovered earlier, are used to identify connections between data. Using the example of cats and dogs, these might have connections with something like a ball or a toy: things that aren't directly similar, but are still related to the core concepts.

Further to this notion, RAG solutions primarily focus on things that are similar in the vector space; this is generally known as vector similarity search, using cosine similarity and top-k. However, this approach fails when you need to retrieve all relevant information: information that is unrelated to the search term, or in fact directly opposite to it in the vector space, will fundamentally be ignored.

So how do we get around this? Using graphs gives us several different methods of connecting information that is related, for example parent-child relationships, cause and effect, or semantic similarity. We can use the graph to model all these different types of relationship in order to interpret the data better.
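To make the "similar vs. connected" distinction concrete, here's a minimal Python sketch (not from the talk). The two-dimensional vectors and the tiny edge set are hand-made stand-ins for real embeddings and a real graph:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 2-D "embeddings" (real embeddings have hundreds of dimensions).
embeddings = {
    "dog":  (0.9, 0.1),
    "cat":  (0.8, 0.2),
    "ball": (0.1, 0.9),   # dissimilar to "dog" in the vector space...
}

# ...but a graph can still connect them through relationships.
graph_edges = {
    ("dog", "plays_with", "ball"),
    ("cat", "plays_with", "ball"),
}

def top_k(query, k=1):
    """Pure vector retrieval: returns only the most similar items."""
    scored = sorted(embeddings,
                    key=lambda w: cosine_similarity(embeddings[query], embeddings[w]),
                    reverse=True)
    return [w for w in scored if w != query][:k]

def graph_neighbours(term):
    """Graph retrieval: anything connected, regardless of similarity."""
    return ({t for s, _, t in graph_edges if s == term}
            | {s for s, _, t in graph_edges if t == term})

print(top_k("dog"))             # vector search finds "cat" but misses "ball"
print(graph_neighbours("dog"))  # the graph traversal surfaces "ball"
```

Vector search alone retrieves "cat" for "dog"; only the graph hop surfaces "ball", even though it sits far away in the vector space.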
Just a quick overview of some graph terminology before we dive any deeper. Firstly, we have nodes. These represent entities, the "things" within our graph. An entity can be something in the real world, or it could be transient or virtual; it doesn't matter. It's an actual thing, like a person. Nodes can also have properties, which are key-value-pair attributes associated with that object.

Next, we have edges. Edges determine the type of relationship between nodes. Two nodes can have multiple edges between them, each representing an instance of a relationship, for example, the number of times I've signed up for and subsequently cancelled my gym membership, as well as the type of relationship itself. Edges, too, can have properties; these include things like weighting properties, or metadata such as the date and time when a relationship was created or when it ended.
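As a quick illustration of this terminology, here's a minimal property-graph sketch in Python. The `Node` and `Edge` classes and the gym-membership data are invented for illustration; Neptune itself is queried through languages like openCypher or Gremlin, not a Python API like this:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    id: str
    label: str                                       # e.g. "Person"
    properties: dict = field(default_factory=dict)   # key-value attributes

@dataclass
class Edge:
    source: str
    target: str
    type: str                                        # relationship type
    properties: dict = field(default_factory=dict)   # e.g. weights, timestamps

me = Node("p1", "Person", {"name": "Alice"})
gym = Node("g1", "Gym", {"name": "Example Gym"})

# Two nodes can share multiple edges, each an instance of a relationship,
# with metadata such as when the relationship started.
history = [
    Edge("p1", "g1", "SUBSCRIBED_TO", {"date": "2023-01-05"}),
    Edge("p1", "g1", "CANCELLED",     {"date": "2023-02-01"}),
    Edge("p1", "g1", "SUBSCRIBED_TO", {"date": "2024-01-04"}),
]

signups = [e for e in history if e.type == "SUBSCRIBED_TO"]
print(len(signups))  # 2 sign-ups between the same pair of nodes
```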
Let's look at a GraphRAG example to see the differences between traditional, vector-based RAG and GraphRAG, and how they affect the accuracy of a response.

We have a RAG application that has stored information relating to a specific company, Example Corp, which sells widgets. They've partnered with AnyCompany Logistics to deliver those widgets to the UK at Christmas. Here we can see that information along with sales trend data, and we can see that there will be huge demand for these widgets over Christmas in the UK. So, as a business, I want to understand my sales prospects for Example Corp in the UK. If we query that using traditional vector RAG, the RAG application identifies information similar to my question; we can see Example Corp in both objects here. So we can surmise that sales are going to be good. Hooray! Fantastic. That's excellent.
Okay, so what does this actually look like with a GraphRAG approach? Well, by using a graph to connect related but dissimilar information, we can get a more holistic view of what the sales prospects will be for Example Corp. Here we can see that AnyCompany Logistics is actually cutting shipping times by using the fictitious canal, which is a net positive. However, we can also see that the very same canal is blocked, causing delays. It's this connected information, specifically the delays relating to the fictitious canal, that provides the additional context the AI needs to generate a more accurate response. So, whilst it immediately looks like we've got huge demand, fundamentally we can't supply that demand, so sales are likely to be negatively impacted.
So how does GraphRAG fundamentally work? Well, the first key step is the build-and-maintain step, where multiple foundation models are used to generate a vector embedding of a given chunk, to extract specific entities and facts from that chunk, and to identify entities based on known or given synonyms; for example, AMZN is the same as Amazon, which is the same as Amazon LLC, and so on. From there, those sources, chunks, entities, facts, and so on are all stored in the graph, connected by contextual relationships derived through the extraction process, for example "same as" or "related to".
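The synonym-resolution step described above can be sketched as a simple canonicalization pass over extracted mentions. The synonym table and the mention list below are hand-made assumptions, not actual extraction output:

```python
# Map known aliases (lower-cased) to a canonical entity name.
SYNONYMS = {
    "amzn": "Amazon",
    "amazon": "Amazon",
    "amazon llc": "Amazon",
}

def canonicalise(mention: str) -> str:
    """Resolve an extracted mention to its canonical entity, if known."""
    return SYNONYMS.get(mention.strip().lower(), mention)

# Mentions pulled from different chunks all collapse to one graph entity.
extracted = ["AMZN", "Amazon LLC", "Example Corp"]
entities = {canonicalise(m) for m in extracted}
print(entities)  # {'Amazon', 'Example Corp'}
```

Collapsing aliases this way is what lets chunks mentioning "AMZN" and "Amazon LLC" end up connected to the same entity node.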
The next key steps are the retrieve and generate steps. GraphRAG works by performing an initial vector similarity search to identify information that is similar in the vector space. This is important, because both RAG and GraphRAG effectively start the same way. From there, however, graph queries are run to traverse two or potentially three hops out from these nodes to find all the connected yet dissimilar information. Finally, the foundation model generates a response based on the retrieved graph objects; that includes the prompt, the question, and the context in which the graph objects are connected.
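The retrieve step, vector-similar seeds followed by a multi-hop traversal, can be sketched like this. The adjacency list, the seed chunks, and the hop count are invented stand-ins loosely based on the earlier Example Corp scenario:

```python
from collections import deque

# Hand-made adjacency list standing in for the stored graph.
graph = {
    "demand_report":        ["example_corp"],
    "example_corp":         ["demand_report", "anycompany_logistics"],
    "anycompany_logistics": ["example_corp", "fictitious_canal"],
    "fictitious_canal":     ["anycompany_logistics", "canal_blockage"],
    "canal_blockage":       ["fictitious_canal"],
}

def expand(seeds, hops=2):
    """Breadth-first expansion up to `hops` edges out from the seed nodes."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # don't traverse past the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen

# Pretend the vector similarity search matched these chunks to the question.
seeds = ["example_corp", "demand_report"]
context = expand(seeds, hops=3)
print("canal_blockage" in context)  # True: the blockage is three hops out
```

With only two hops the blockage node stays out of reach, which is why the talk mentions traversing "two or potentially three hops".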
That was all the theory, so how do you go about building a GraphRAG solution using Neptune? Well, we have two primary solutions that I'm going to discuss. First, we have Amazon Bedrock Knowledge Bases, a fully managed GraphRAG solution. Secondly, we've got the graphrag-toolkit, an open-source, Python-based toolkit that provides the functionality to choose between vector stores as well as embedding, extraction, and response models. Which one you use really depends on your use case.

Let's talk first about Bedrock Knowledge Bases for GraphRAG. Amazon Bedrock Knowledge Bases offers a fully managed GraphRAG feature with Amazon Neptune. It creates an Amazon Neptune Analytics graph behind the scenes, and it provides the capability to create your graph data model through Bedrock Knowledge Bases. You can ingest data from multiple sources to create a generic graph data model that describes the different entities and the relationships between them. GraphRAG automatically identifies and leverages relationships between entities and structural elements within documents ingested into Knowledge Bases.

What this means is that it enables a more comprehensive, contextually relevant response from a foundation model, particularly when the information needs to be connected through multiple logical steps. It also enables better cross-document reasoning, allowing more precise and contextually accurate answers by connecting information across various sources; you're not just using a single-source approach. This further enhances accuracy and minimizes hallucinations.
This is what the GraphRAG data model actually looks like. First, through chunk creation and embeddings generation, documents are broken down into smaller, manageable chunks, and vector embeddings are generated from those chunks to encode their semantic meaning: what do these pieces actually mean, and what do they relate to? Then entities are extracted from the chunks; they serve as the links between related chunks, which are the nodes you can see here, based on shared entities and context. The resulting graph, as I mentioned, is stored in Neptune Analytics for efficient querying and traversal.
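A rough sketch of that model, chunks as nodes, linked wherever they share an extracted entity. The chunk texts and entity sets are hand-made stand-ins for real extraction output:

```python
from itertools import combinations

# Hand-made chunks with the entities extracted from each.
chunks = {
    "c1": {"text": "Example Corp sells widgets.",            "entities": {"Example Corp"}},
    "c2": {"text": "Example Corp partners with AnyCompany.", "entities": {"Example Corp", "AnyCompany"}},
    "c3": {"text": "AnyCompany ships via the canal.",        "entities": {"AnyCompany"}},
}

# Link every pair of chunks that has at least one entity in common.
edges = set()
for a, b in combinations(sorted(chunks), 2):
    shared = chunks[a]["entities"] & chunks[b]["entities"]
    if shared:
        edges.add((a, b, frozenset(shared)))

print(sorted((a, b) for a, b, _ in edges))  # [('c1', 'c2'), ('c2', 'c3')]
```

Notice that c1 and c3 share no entity, yet a traversal can still reach one from the other through c2, which is exactly the cross-document connection the managed feature exploits.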
Let's now talk about the graphrag-toolkit. The graphrag-toolkit is a Python-based toolkit library that integrates with LlamaIndex to combine generative AI with graphs. It provides the ability to automatically create a graph from unstructured data. It uses constructs similar to the Bedrock Knowledge Bases graph data model to create a lexical graph representation of the ingested unstructured data. Under the hood, it uses LlamaIndex components to build the graph, as well as to integrate with foundation models during the various AI processing stages: generating embeddings, extracting entities, and subsequently producing responses.
The lexical graph data model is effectively broken into three tiers. The first is lineage, the blue nodes you can see. This defines where a specific chunk of information comes from, i.e. the source document, and also which chunk or chunks it is connected to. By having this relationship model, we can provide a foundation model with the context to understand what precedes or follows a specific chunk, and from which chunk and which source a given chunk actually came.

Next we've got summarization; these are the green nodes. This tier is made up of topic, statement, and fact nodes. Topics represent central themes, such as "Amazon Neptune database service" or "Neptune database technical features". Statements are directly connected to topics; you can see the "belongs to" relationship here.
Statements provide support for known pieces of information, such as "Amazon Neptune supports the openCypher query language". Facts are then derived from a statement in a manner that identifies both the connection and the type of fact, for example "Amazon Neptune SUPPORTS openCypher", capitalizing the "supports" keyword. The final tier is entity relationships; this is the red node. This tier is made up of entities, the things identified as part of a fact. They're modelled as triples based on the context of a fact, and an entity can be either the subject or the object of a fact.
Using the previous example, "Amazon Neptune supports openCypher": both Amazon Neptune and openCypher are entities. However, the subject of the fact is Amazon Neptune, and openCypher is the object of the fact, with the relation being "supports". In addition, entities that are part of multiple triples are also connected to each other, which helps determine the context between them.
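The subject/object roles described above can be sketched with plain (subject, predicate, object) triples. The first fact is the talk's own example; the second is an assumed extra added purely for illustration:

```python
# Facts from the entity tier, modelled as (subject, predicate, object) triples.
facts = [
    ("Amazon Neptune", "SUPPORTS", "openCypher"),
    ("Amazon Neptune", "SUPPORTS", "Gremlin"),  # assumed extra fact for illustration
]

def as_subject(entity):
    """Facts in which the entity plays the subject role."""
    return [f for f in facts if f[0] == entity]

def as_object(entity):
    """Facts in which the entity plays the object role."""
    return [f for f in facts if f[2] == entity]

def connected_entities(entity):
    """Entities linked to `entity` through any shared fact."""
    linked = set()
    for s, _, o in facts:
        if s == entity:
            linked.add(o)
        elif o == entity:
            linked.add(s)
    return linked

print(as_object("openCypher"))              # Amazon Neptune is the subject here
print(connected_entities("Amazon Neptune"))
```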
So when should you use managed versus open-source GraphRAG? From the fully managed perspective, scalability and performance are optimized for enterprise applications at any scale within AWS. It's a fully managed service, so, as you'd expect for security and compliance, it's integrated with AWS security protocols. In addition, from an operational perspective, you're looking at a reduced development and operational burden, with AWS support and maintenance; you don't have developers continually tweaking prompts and the like, because you're utilizing that fully managed service approach.

From the open-source perspective, one of the big positives is its high customizability. You can choose different models for different AI processes; you can change the batch sizes and the number of workers; you can really tweak the compute for the specific use case you're working on; and you can host it across multiple accounts. So it provides a lot of customization. In addition, it offers open integration: you've got choices for your graph store, so you can use Neptune Database or Neptune Analytics, and lots of choice for your vector store as well. You can combine Neptune Analytics for both graph and vector, or Amazon Neptune Database for your graph with Amazon OpenSearch Serverless for your vector store, or pgvector; there are lots of different options. So if you're more comfortable with, or have experience working with, a particular vector store, there's a good chance the toolkit will immediately work with it.
For next steps, I just wanted to share some developer resources. We've got the graphrag-toolkit; our public docs on how to build a knowledge base using Bedrock; and Neptune MCP, a video that we're going to be producing shortly. There are also some other videos you might be interested in: discovering graph data modeling using generative AI, diagram as code, and integrating LLMs with LangChain. And finally, we've got some blogs: introducing the graphrag-toolkit, and introducing Amazon Neptune plus MCP.

While I'm here, I also want to share the QR code for the graphrag-toolkit. There's lots of information there, including example notebooks, which I'd highly recommend. And with that, I'd like to thank you very much.