feat: add GraphRAG framework and integrate TuGraph (#1506)

Co-authored-by: KingSkyLi <15566300566@163.com>
Co-authored-by: aries_ckt <916701291@qq.com>
Co-authored-by: Fangyin Cheng <staneyffer@gmail.com>
Author: Florian
Date: 2024-05-16 15:39:50 +08:00
Committed by: GitHub
Parent: 593e974405
Commit: a9087c3853
133 changed files with 10139 additions and 6631 deletions

@@ -0,0 +1,210 @@
# Graph RAG User Manual
In this example, we will show how to use the Graph RAG framework in DB-GPT. Implementing RAG with a graph database can, to some extent, alleviate the uncertainty and interpretability issues that come with vector-database retrieval.
You can refer to the Python example file `DB-GPT/examples/rag/graph_rag_example.py` in the source code. This example demonstrates how to load knowledge from a document and store it in a graph store, and then recall knowledge relevant to your question by searching for triplets in the graph store.
### Install Dependencies
First, you need to install the `dbgpt` library.
```bash
pip install "dbgpt[rag]>=0.5.6"
```
### Prepare Graph Database
To store the knowledge in a graph, we need a graph database. [TuGraph](https://github.com/TuGraph-family/tugraph-db) is the first graph database supported by DB-GPT.
Visit the TuGraph GitHub repository and its [Quick Start](https://tugraph-db.readthedocs.io/zh-cn/latest/3.quick-start/1.preparation.html#id5) document, then follow the instructions to pull the TuGraph docker image (latest, version >= 4.3.0) and launch it.
```bash
docker pull tugraph/tugraph-runtime-centos7:latest
docker run -it -d -p 7001:7001 -p 7070:7070 -p 7687:7687 -p 8000:8000 -p 8888:8888 -p 8889:8889 -p 9090:9090 \
-v /root/tugraph/data:/var/lib/lgraph/data -v /root/tugraph/log:/var/log/lgraph_log \
--name tugraph_demo tugraph/tugraph-runtime-centos7:latest /bin/bash
```
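Before moving on, you can check that the container is running and inspect its startup logs (a quick sanity check, assuming the container name `tugraph_demo` used above):
```bash
# Confirm the TuGraph container is up and the ports are mapped
docker ps --filter name=tugraph_demo
# Tail the most recent container logs
docker logs --tail 20 tugraph_demo
```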
The default port for the bolt protocol is `7687`; DB-GPT accesses TuGraph through this port via the `neo4j` Python client.
```bash
pip install "neo4j>=5.20.0"
```
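To verify that TuGraph is reachable over bolt, you can run a minimal connectivity check with the `neo4j` driver (a sketch, assuming the default `admin`/`73@TuGraph` credentials shown in the configuration section below):
```python
from neo4j import GraphDatabase

# Open a bolt connection to TuGraph and verify it responds
driver = GraphDatabase.driver("bolt://127.0.0.1:7687", auth=("admin", "73@TuGraph"))
driver.verify_connectivity()
driver.close()
print("TuGraph is reachable over bolt.")
```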
### Prepare LLM
To build a Graph RAG program, we need an LLM. Here are some of the LLMs that DB-GPT supports:
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
<Tabs
defaultValue="openai"
values={[
{label: 'OpenAI (API)', value: 'openai'},
{label: 'Yi (API)', value: 'yi_proxy'},
{label: 'API Server (cluster)', value: 'model_service'},
]}>
<TabItem value="openai">
First, you should install the `openai` library.
```bash
pip install openai
```
Then set your API key in the environment variable `OPENAI_API_KEY`.
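For example, in your shell (with a placeholder key):
```bash
export OPENAI_API_KEY="sk-..."
```
Then create the client: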
```python
from dbgpt.model.proxy import OpenAILLMClient
llm_client = OpenAILLMClient()
```
</TabItem>
<TabItem value="yi_proxy">
You should have a Yi account and get the API key from the Yi official website.
First, you should install the `openai` library, since Yi's API is compatible with the OpenAI API.
```bash
pip install openai
```
Then set your API key in the environment variable `YI_API_KEY`.
```python
from dbgpt.model.proxy import YiLLMClient
llm_client = YiLLMClient()
```
</TabItem>
<TabItem value="model_service">
If you have deployed a [DB-GPT cluster](/docs/installation/model_service/cluster) and the
[API server](/docs/installation/advanced_usage/OpenAI_SDK_call),
you can connect to the API server to use its LLM.
The API server is compatible with the OpenAI API, so you can use `OpenAILLMClient` to connect to it.
First, you should install the `openai` library.
```bash
pip install openai
```
```python
from dbgpt.model.proxy import OpenAILLMClient
llm_client = OpenAILLMClient(api_base="http://localhost:8100/api/v1/", api_key="{your_api_key}")
```
</TabItem>
</Tabs>
### TuGraph Configuration
Set the variables below in the `.env` file so that DB-GPT knows how to connect to TuGraph.
```bash
GRAPH_STORE_TYPE=TuGraph
TUGRAPH_HOST=127.0.0.1
TUGRAPH_PORT=7687
TUGRAPH_USERNAME=admin
TUGRAPH_PASSWORD=73@TuGraph
```
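DB-GPT reads these settings at startup. If you want to sanity-check the values from a script, here is a minimal sketch (assuming the `python-dotenv` package, which is not a dependency of this example):
```python
import os

from dotenv import load_dotenv

# Load .env from the current directory and echo the TuGraph settings
load_dotenv()
for key in ("GRAPH_STORE_TYPE", "TUGRAPH_HOST", "TUGRAPH_PORT", "TUGRAPH_USERNAME"):
    print(f"{key}={os.getenv(key)}")
```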
### Load into Knowledge Graph
When using a graph database as the underlying knowledge storage platform, it is necessary to build a knowledge graph to facilitate the archiving and retrieval of documents. DB-GPT leverages the capabilities of large language models to implement an integrated knowledge graph, while still maintaining the flexibility to freely connect to other knowledge graph systems and graph database systems.
To maintain compatibility with existing conventional RAG frameworks, we continue to access the knowledge graph through the `VectorStoreConnector` interface. Simply set the `vector_store_type` to `KnowledgeGraph` to enable this connection.
```python
from dbgpt.model.proxy.llms.chatgpt import OpenAILLMClient
from dbgpt.storage.vector_store.base import VectorStoreConfig
from dbgpt.storage.vector_store.connector import VectorStoreConnector


def _create_vector_connector():
    """Create vector connector."""
    return VectorStoreConnector(
        vector_store_type="KnowledgeGraph",
        vector_store_config=VectorStoreConfig(
            name="graph_rag_test_kg",
            embedding_fn=None,
            llm_client=OpenAILLMClient(),
            model_name="gpt-4",
        ),
    )
```
### Retrieve from Knowledge Graph
Then you can retrieve knowledge from the knowledge graph in the same way as from a vector store.
```python
import os

from dbgpt.configs.model_config import ROOT_PATH
from dbgpt.rag import ChunkParameters
from dbgpt.rag.assembler import EmbeddingAssembler
from dbgpt.rag.knowledge import KnowledgeFactory


async def main():
    file_path = os.path.join(ROOT_PATH, "examples/test_files/tranformers_story.md")
    knowledge = KnowledgeFactory.from_file_path(file_path)
    vector_connector = _create_vector_connector()
    chunk_parameters = ChunkParameters(chunk_strategy="CHUNK_BY_SIZE")
    # get an embedding assembler and persist the knowledge graph
    assembler = EmbeddingAssembler.load_from_knowledge(
        knowledge=knowledge,
        chunk_parameters=chunk_parameters,
        vector_store_connector=vector_connector,
    )
    assembler.persist()
    # get an embeddings retriever (top 3 results)
    retriever = assembler.as_retriever(3)
    chunks = await retriever.aretrieve_with_scores(
        "What actions has Megatron taken?",
        score_threshold=0.3,
    )
    print(f"embedding rag example results:{chunks}")
    vector_connector.delete_vector_name("graph_rag_test_kg")
```
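To run the example end to end, drive the coroutine with `asyncio` (a minimal runner, assuming the two snippets above live in the same script):
```python
import asyncio

if __name__ == "__main__":
    asyncio.run(main())
```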
### Chat Knowledge via GraphRAG
Here we demonstrate how to chat with your knowledge through Graph RAG on the web page.
First, create a knowledge base using the `Knowledge Graph` type. Upload the knowledge documents and wait for the slicing to complete.
<p align="left">
<img src={'/img/chat_knowledge/graph_rag/create_knowledge_graph.jpg'} width="1000px"/>
</p>
Then, view the knowledge graph data.
<p align="left">
<img src={'/img/chat_knowledge/graph_rag/view_graph.jpg'} width="1000px"/>
</p>
The graph data may look like this.
<p align="left">
<img src={'/img/chat_knowledge/graph_rag/graph_data.jpg'} width="1000px"/>
</p>
Start chatting with the knowledge base on top of Graph RAG.
<p align="left">
<img src={'/img/chat_knowledge/graph_rag/graph_rag_chat.jpg'} width="1000px"/>
</p>

@@ -470,6 +470,16 @@ const sidebars = {
         }
       ],
     },
+    {
+      type: 'category',
+      label: 'RAG',
+      items: [
+        {
+          type: 'doc',
+          id: 'cookbook/rag/graph_rag_app_develop',
+        }
+      ],
+    },
     {
       type: 'category',
       label: 'App',
@@ -479,7 +489,7 @@ const sidebars = {
           id: 'cookbook/app/data_analysis_app_develop',
         }
       ],
     },
-  },
+  },
   ],
 },

Four binary image files added (246 KiB, 992 KiB, 249 KiB, 150 KiB); one large file diff suppressed.