00:00 - 00:32
This is amazing: now we have GraphRAG from Microsoft, one of the best RAG systems I've seen so far. So what is RAG, and why do we need it? Large language models don't have every piece of information in the world, so when we ask a question about something outside their knowledge, they can't answer correctly. But when we provide relevant context to the large language model, it responds accurately. That is a RAG system, or retrieval-augmented generation.
00:30 - 01:01
The information we provide decides the quality of the response we get from the large language model: the better the context, the better the response. There are different RAG systems available. In a basic RAG system, when you ask a question we run a semantic search based on it, pull the relevant information, and send it to the large language model. But there is an advanced version that improves the quality of the response, and that is GraphRAG.
00:58 - 01:50
Let's take this as an example: you have a private dataset. Generally, for the RAG process you divide the data into chunks and then run semantic search against the database; that is basic RAG. But semantic search alone isn't always that relevant, because it doesn't understand the relationships between entities. That's where GraphRAG comes in. Same as before, we divide the data into chunks, then we extract useful information such as the entities and how they are related, whether closely or more distantly. It is similar to semantic chunking, but more advanced in terms of information extraction and identifying the relationships between entities. So when a user asks a question, we can provide more relevant context and give a high-quality answer.
01:47 - 02:49
But to increase that quality further, we can combine this with semantic-search RAG, and beyond that we can use the relationships grouped into communities; using those, we can give much more meaning to the context. With this we can create high-quality datasets, high-quality summarized Q&A, and much more. Apart from just entity detection, you get much more than that: hierarchy extraction, graph embedding, entity summarization, community summarization, topic detection, representation learning. By the end of this video you will learn how you can implement this in your own application, like this, to extract entities and get high-quality answers. Thankfully, Microsoft open-sourced GraphRAG, so you can easily integrate this advanced RAG system into your own application. That's exactly what we're going to see today. Let's get started.
02:47 - 03:27
Hi everyone, I'm really excited to show you GraphRAG. As you can see in this image, you can identify different entities and how they relate to each other. Using a large amount of data, you can extract useful information and make your chatbot more accurate. I'm going to take you through, step by step, how you can implement this in your own application as a beginner's guide, and you can extend it from there. But before that: I regularly create videos about artificial intelligence on my YouTube channel, so do subscribe and click the bell icon to stay tuned, and make sure you click the like button so this video can reach many others like you.
03:23 - 03:59
In the GraphRAG paper we can see a comparison of GraphRAG and naive RAG: GraphRAG's answers are much more detailed than the baseline RAG's. And even here, when we don't use the community information, just the basic graph, that is GraphRAG with local search; but when you include community-based graph extraction, it's called GraphRAG with global search. We are going to see both GraphRAG with local search and GraphRAG with global search.
03:57 - 04:10
First step: pip install graphrag, then press Enter; this installs the main required package. Then export the GraphRAG API key, like this; this is the OpenAI API key. After this, press Enter.
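The two setup commands above, written out. The key value is a placeholder; substitute your own OpenAI key. GRAPHRAG_API_KEY is the environment variable name the generated configuration references by default in the version shown:

```shell
# Install the GraphRAG package
pip install graphrag

# GraphRAG reads the OpenAI key from this environment variable
# (placeholder value -- use your own key)
export GRAPHRAG_API_KEY="sk-your-openai-api-key"
```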
04:08 - 04:28
For now I'm going to show you how to integrate ChatGPT, that is, an OpenAI model, but there are also options to integrate Groq or Ollama, which I will cover later. Next, I'm going to create a folder called input: mkdir input, then press Enter. You can also do this from VS Code.
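The folder setup from the terminal; the one-line sample file here is just an illustration, since in the video the full book text goes into input/book.txt:

```shell
# Create the folder GraphRAG reads source documents from
mkdir -p input

# Put your source text inside; a tiny stand-in file for illustration
printf 'A Christmas Carol by Charles Dickens\n' > input/book.txt
```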
04:25 - 05:08
Just open the project in VS Code; you can click the new-folder icon to create the input folder, like this. Now inside it I'm going to put a text file, and here is the one I've added: if I open it, you can see the title, A Christmas Carol, the author, Charles Dickens, and more; it's the whole book. I'm going to feed all of that to GraphRAG. So, as you can see in the folder structure, inside the input folder I've got book.txt containing this text. You can put your own data in this file, and it could be any data. Now we're going to convert it into a graph and then ask questions.
05:06 - 05:30
To do that, I'm coming back to my terminal: python -m graphrag.index --init --root . where --init initializes the project, and --root . says the current directory is where GraphRAG can find the input folder. After this, press Enter, and you can see the project got initialized.
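The initialization command as narrated, written out; run it from the folder that contains input/:

```shell
# Scaffold a GraphRAG project (settings, prompts) in the current directory
python -m graphrag.index --init --root .
```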
05:28 - 05:50
Now if I open VS Code, here is the folder structure: you've got prompts, output, the input folder, and .env and settings.yaml files. The .env file is where you can define your OpenAI API key, but for now we have already set it up as an environment variable, so I'm going to leave it as it is.
05:47 - 06:57
You can change the model name in the settings.yaml file; there you can choose gpt-4o or anything of your choice. Here you can even try Ollama by changing the api_base to its URL, and you can try adding your API version here; I haven't tried this, but it's worth trying. Similarly, you can integrate this with Groq: in the same way, replace the api_base with the Groq endpoint. For Ollama you might need to append /v1 to indicate the API version; that is the key difference when using different large language models. For now I'm going to use gpt-4o. You have other settings as well: here you can set the embedding model; we're currently using text-embedding-3-small, but you can change it. You can see the chunk size, which you can tune based on your data type. You can also see the input folder setting, which is input; that's why we initially created a folder called input and added all our files there. The expected format is txt; if you're adding Markdown, you might need to change this to md.
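Putting those settings together, this is roughly the shape of the relevant sections of the generated settings.yaml. Field names are from the GraphRAG version shown and may differ in yours, and the commented api_base line is an illustrative assumption for pointing at a local Ollama server:

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: gpt-4o
  # api_base: http://localhost:11434/v1   # e.g. an Ollama server (note the /v1)

embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: text-embedding-3-small

chunks:
  size: 300        # tokens per chunk; tune for your data

input:
  base_dir: "input"
  file_pattern: ".*\\.txt$"   # change for Markdown (.md) sources
```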
06:54 - 07:26
Then you've got prompt files for the different steps: community reports, entity extraction, summarize descriptions, plus the local search and global search I mentioned before. If I go into the prompts, you can see the community report prompt spelled out: this is how we tell the large language model to extract community information from the data, and it's very detailed. Similarly, we've got entity extraction, then summarize descriptions. Everything is pre-built.
07:24 - 07:45
Now we're going to run the indexing. Go back to the terminal: last time we added --init; this time I'm not going to use it, just graphrag.index with the root set to the current folder, which is why I pass the dot. If your project is in a different folder, give that folder's path instead. After this, press Enter.
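The indexing run as one command; it assumes the project was initialized in the current directory:

```shell
# Build the knowledge graph from everything under input/ (no --init this time)
python -m graphrag.index --root .
```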
07:43 - 08:59
Now you can see it automatically dividing the text into chunks of the 300 tokens we set in the settings: 230 chunks in total. The GraphRAG indexer is loading the input and extracting entities, then summarizing them, using the same prompts I showed before, going through each and every chunk. Then it creates relationships between the identified entities, like Charles Dickens and A Christmas Carol; this is very detailed. Next it generates relationship IDs; you can see the progress here as the different steps are carried out. Finally it creates the community reports, which are used for global search: with community reports it's global search, without them it's local search. Now it's all done, all workflows completed successfully. If I open this in VS Code, you can see there's a new folder called output, with artifacts, reports, and the log file. The whole knowledge graph has been created, so now we can ask questions. Step number one, indexing the document, is done; let's do it.
08:57 - 09:27
Let's come back to our terminal. It's the same command, but graphrag.query instead of index: --root . for the current folder, then --method global, which means global search. Now we are ready to ask questions, and here is the question: what are the top themes in this story? That's the story we uploaded. I press Enter, and GraphRAG runs the query.
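The global-search query as one command; the question string is the one asked in the video:

```shell
# Global search: answers drawn from the community reports, corpus-wide
python -m graphrag.query --root . --method global "What are the top themes in this story?"
```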
09:25 - 09:44
And here is the final answer: the top themes in the story are transformation and redemption; family, love, and support; and generosity and charity. It also gives references showing where each point was taken from. This is quite in-depth compared to the regular RAG output you've seen.
09:41 - 09:57
Now let's do a local search: instead of --method global I'm using --method local, and I'm asking about a specific person and what his relationships are, then pressing Enter.
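The local-search query in the same form; the transcript doesn't name the person being asked about, so the question text below is my own illustrative stand-in (Scrooge being the protagonist of A Christmas Carol):

```shell
# Local search: entity-focused answers, no community summaries
# (the question string is an illustrative stand-in, not the video's exact wording)
python -m graphrag.query --root . --method local "Who is Scrooge, and what are his main relationships?"
```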
09:54 - 10:39
And here is the local search response. This one does not use the community answers or community summaries, so it can focus more closely on the uploaded data rather than the overall picture: global search gives us a wider view, local search gives us a more focused one. I showed you the two commands, global search and local search, run from the terminal, but you can also call this from your own program, or integrate it in other ways, which I'll be covering in an upcoming video. This is going to take RAG to the next level. I'm really excited about this, and I'm going to create more videos like it, so stay tuned. I hope you liked this video; do like, share, and subscribe, and thanks for watching.