This contribution adds support for the Weaviate database to the DBGPT
repository. Weaviate is a vector database that stores objects together
with their vector embeddings and supports similarity search over them.
With Weaviate integrated, DBGPT users can store, retrieve, and run
similarity searches over their text-based data.
Proposed Changes:
1. Implement WeaviateStore Class:
- Create a new class, "WeaviateStore," that extends the existing vector
store functionality in the DBGPT repository.
- The WeaviateStore class will serve as a wrapper around the Weaviate
database and provide methods for data loading, searching, and
vectorization.
- The class will use the Weaviate Python client library to communicate
with the Weaviate database.
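A minimal skeleton of the class described above might look like the following. `VectorStoreBase` and its method names are hypothetical stand-ins for the repository's existing vector store interface, and the client is passed in rather than constructed so the wrapper can be exercised with a fake.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class VectorStoreBase(ABC):
    """Hypothetical stand-in for the repository's existing vector store base."""

    @abstractmethod
    def load_document(self, documents: List[Any]) -> None:
        """Store documents in the backing database."""

    @abstractmethod
    def similar_search(self, text: str, topk: int) -> List[Dict[str, Any]]:
        """Return the topk documents most similar to text."""

    @abstractmethod
    def vector_name_exists(self) -> bool:
        """Report whether the configured vector name is already present."""


class WeaviateStore(VectorStoreBase):
    """Thin wrapper that holds a Weaviate client and the target class name."""

    def __init__(self, client: Any, vector_name: str) -> None:
        # `client` is assumed to be a Weaviate Python client instance.
        self.client = client
        self.vector_name = vector_name

    def load_document(self, documents: List[Any]) -> None:
        raise NotImplementedError("data loading is not part of this skeleton")

    def similar_search(self, text: str, topk: int) -> List[Dict[str, Any]]:
        raise NotImplementedError("search is not part of this skeleton")

    def vector_name_exists(self) -> bool:
        raise NotImplementedError("schema check is not part of this skeleton")
```

Injecting the client keeps connection details (URL, authentication) out of the wrapper, which is one possible design choice rather than a requirement of the proposal.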
2. Schema Definition:
- Define the schema for the Weaviate database to support the required
data structure in the DBGPT repository.
- The schema will include a "Document" class with properties for
metadata and text.
- The "metadata" property will store the metadata associated with each
document.
- The "text" property will store the textual content of each document.
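The schema above could be expressed as a Weaviate class definition along these lines. This is a sketch that assumes the v3-style Python client (`client.schema.create_class`) and assumes metadata is stored as a JSON-encoded string in a `text` property; the helper names are illustrative only.

```python
from typing import Any, Dict


def build_document_class(class_name: str = "Document") -> Dict[str, Any]:
    """Return a Weaviate class definition with metadata and text properties."""
    return {
        "class": class_name,
        "description": "A document chunk and its metadata",
        "properties": [
            {
                "name": "metadata",
                "dataType": ["text"],
                "description": "JSON-encoded metadata (assumed encoding)",
            },
            {
                "name": "text",
                "dataType": ["text"],
                "description": "Textual content of the document",
            },
        ],
    }


def create_document_class(client: Any, class_name: str = "Document") -> Dict[str, Any]:
    """Register the class with Weaviate and return the definition that was sent."""
    class_obj = build_document_class(class_name)
    # v3-style client call; other client versions expose a different API.
    client.schema.create_class(class_obj)
    return class_obj
```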
3. Data Loading:
- Implement a method within the WeaviateStore class to load documents
into the Weaviate database.
- Iterate through the documents in the DBGPT repository and extract the
necessary metadata and text.
- Use the Weaviate Python client to add each document to the Weaviate
database, mapping the metadata and text to the corresponding properties
defined in the schema.
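The loading step above can be sketched as follows, assuming the v3-style batch interface of the Weaviate Python client (`client.batch` used as a context manager with `add_data_object`). The `Doc` dataclass is a hypothetical stand-in for whatever document type the repository uses, and metadata is assumed to be JSON-serializable.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class Doc:
    """Hypothetical document type: text content plus a metadata mapping."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def load_documents(client: Any, documents: Iterable[Doc],
                   class_name: str = "Document") -> int:
    """Add each document to Weaviate and return how many were queued."""
    count = 0
    # Leaving the context manager flushes any objects still in the batch.
    with client.batch as batch:
        for doc in documents:
            batch.add_data_object(
                data_object={
                    "metadata": json.dumps(doc.metadata),
                    "text": doc.text,
                },
                class_name=class_name,
            )
            count += 1
    return count
```

Batching avoids one HTTP round trip per document; whether that matters depends on how many documents are loaded at once.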
4. Similarity Search:
- Implement a method within the WeaviateStore class to perform a
similarity search in the Weaviate database based on a given text query.
- Utilize Weaviate's vector-based search capabilities to find documents
similar to the provided text query.
- Return the relevant documents along with additional information such
as distance or relevance scores.
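One way the search method could be written, assuming the v3-style query builder of the Weaviate Python client and a Weaviate instance with a text vectorizer module enabled (required for `nearText`). The function name, defaults, and returned dictionary shape are illustrative assumptions.

```python
from typing import Any, Dict, List


def similar_search(client: Any, text: str, topk: int = 5,
                   class_name: str = "Document") -> List[Dict[str, Any]]:
    """Return up to topk documents closest to the query text, with distances."""
    response = (
        client.query.get(class_name, ["metadata", "text"])
        .with_near_text({"concepts": [text]})
        .with_limit(topk)
        .with_additional(["distance"])
        .do()
    )
    # Results are nested under data -> Get -> <class name> in the response.
    data = response.get("data") or {}
    hits = (data.get("Get") or {}).get(class_name) or []
    return [
        {
            "text": hit.get("text"),
            "metadata": hit.get("metadata"),
            "distance": (hit.get("_additional") or {}).get("distance"),
        }
        for hit in hits
    ]
```

A smaller distance means a closer match; the exact range depends on the distance metric configured for the class.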
5. Vector Name Existence:
- Implement a method within the WeaviateStore class to check if a vector
name exists for a given class in the Weaviate database.
- The method will query the Weaviate schema and report whether the
vector name is present for the specified class.
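A possible shape for the existence check, assuming the v3-style `client.schema.get()` call, which returns the full schema as a dictionary with a `classes` list. This sketch interprets the "vector name" as the Weaviate class name, which is an assumption about the proposal rather than something it states.

```python
from typing import Any


def vector_name_exists(client: Any, vector_name: str) -> bool:
    """Return True if a class matching vector_name is present in the schema."""
    if not vector_name:
        return False
    # Weaviate stores class names with an uppercase first letter,
    # so normalize before comparing.
    expected = vector_name[0].upper() + vector_name[1:]
    schema = client.schema.get()
    classes = schema.get("classes") or []
    return any(cls.get("class") == expected for cls in classes)
```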