mirror of https://github.com/hwchase17/langchain.git (synced 2025-09-16 23:13:31 +00:00)
Fix update_document function, add test and documentation. (#5359)
# Fix for `update_document` Function in Chroma

## Summary

This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object.

## Changes

1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object.
2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call.
3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma.

## Test Plan

All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly.

## Reviewers

@dev2049

This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users.

Thank you for considering this pull request. I look forward to your feedback and suggestions.
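To illustrate the failure mode described in the summary, here is a minimal, self-contained sketch. It does not reproduce the real `Chroma` or `Collection` code. `Document`, `FakeCollection`, `fake_embed`, `BrokenStore`, and `FixedStore` are hypothetical stand-ins, and the batch-style `update(ids=..., embeddings=..., documents=..., metadatas=...)` signature on the stand-in collection is an assumption made for illustration. The sketch only shows the shape of the bug (a wrapper delegating to a method the collection does not define) and one plausible shape of a fix (re-embedding the new text and calling a method that does exist).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Simplified stand-in for a document (hypothetical, not langchain's class)."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Toy embedding (assumption): one float per text, equal to its length."""
    return [[float(len(text))] for text in texts]


class FakeCollection:
    """Hypothetical in-memory collection.

    It offers a batch-style `update` but no `update_document`,
    which is the mismatch the summary describes.
    """

    def __init__(self) -> None:
        self.rows: Dict[str, dict] = {}

    def add(self, ids, embeddings, documents, metadatas) -> None:
        for id_, emb, doc, meta in zip(ids, embeddings, documents, metadatas):
            self.rows[id_] = {"embedding": emb, "document": doc, "metadata": meta}

    def update(self, ids, embeddings, documents, metadatas) -> None:
        for id_, emb, doc, meta in zip(ids, embeddings, documents, metadatas):
            if id_ not in self.rows:
                raise KeyError(id_)
            self.rows[id_] = {"embedding": emb, "document": doc, "metadata": meta}


class BrokenStore:
    """Wrapper that delegates to a method the collection does not have."""

    def __init__(self, collection: FakeCollection) -> None:
        self._collection = collection

    def update_document(self, document_id: str, document: Document) -> None:
        # FakeCollection defines no `update_document`, so this raises AttributeError.
        self._collection.update_document(
            document_id, document.page_content, document.metadata
        )


class FixedStore:
    """Wrapper that re-embeds the text and calls the existing `update` method."""

    def __init__(self, collection: FakeCollection) -> None:
        self._collection = collection

    def update_document(self, document_id: str, document: Document) -> None:
        text = document.page_content
        embedding = fake_embed([text])[0]
        self._collection.update(
            ids=[document_id],
            embeddings=[embedding],
            documents=[text],
            metadatas=[document.metadata],
        )


# Seed the collection with the same id and content the test below uses.
collection = FakeCollection()
collection.add(
    ids=["doc1"],
    embeddings=fake_embed(["foo"]),
    documents=["foo"],
    metadatas=[{"page": "0"}],
)
updated_doc = Document(page_content="updated foo", metadata={"page": "0"})

# The broken wrapper fails before touching any stored data.
try:
    BrokenStore(collection).update_document("doc1", updated_doc)
    broken_raised = False
except AttributeError:
    broken_raised = True

# The fixed wrapper replaces both the stored text and its embedding.
FixedStore(collection).update_document("doc1", updated_doc)
```

Re-embedding inside `update_document` matters in this sketch because a similarity search compares embeddings, not raw text. Updating only the stored text would leave the old vector in place.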
@@ -160,3 +160,37 @@ def test_chroma_with_include_parameter() -> None:
    assert output["embeddings"] is not None
    output = docsearch.get()
    assert output["embeddings"] is None


def test_chroma_update_document() -> None:
    """Test the update_document function in the Chroma class."""

    # Initial document content and id
    initial_content = "foo"
    document_id = "doc1"

    # Create an instance of Document with initial content and metadata
    original_doc = Document(page_content=initial_content, metadata={"page": "0"})

    # Initialize a Chroma instance with the original document
    docsearch = Chroma.from_documents(
        collection_name="test_collection",
        documents=[original_doc],
        embedding=FakeEmbeddings(),
        ids=[document_id],
    )

    # Define updated content for the document
    updated_content = "updated foo"

    # Create a new Document instance with the updated content and the same id
    updated_doc = Document(page_content=updated_content, metadata={"page": "0"})

    # Update the document in the Chroma instance
    docsearch.update_document(document_id=document_id, document=updated_doc)

    # Perform a similarity search with the updated content
    output = docsearch.similarity_search(updated_content, k=1)

    # Assert that the updated document is returned by the search
    assert output == [Document(page_content=updated_content, metadata={"page": "0"})]