mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-02 13:08:57 +00:00
docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112)

## Description

This PR addresses a documentation issue in the [Indexing](https://python.langchain.com/docs/modules/data_connection/indexing) page. Specifically, it corrects the execution results of the Jupyter notebook under the [Source](https://python.langchain.com/docs/modules/data_connection/indexing#source) section, which were broken as detailed below.

## Problem

The execution results following the statement, `This should delete the old versions of documents associated with doggy.txt source and replace them with the new versions.`, appear to be incorrect, as described below.

### Current Behavior

- For some reason, the `index` function fails to add the new content of `doggy.txt`. Although it deletes the document objects associated with the `doggy.txt` source, it does not add the objects in `changed_doggy_docs`. Consequently, the execution result displays `num_added: 0`.
- This unexpected behavior also impacts the results of `vectorstore.similarity_search("dog", k=30)`, showing only the contents of `kitty.txt`. It appears as though the contents of `doggy.txt` have been completely removed from the index:

  ```
   Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
   Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
   Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
  ```

### Expected Behavior

- The `index` function should successfully add the objects in `changed_doggy_docs` after removing the old content of `doggy.txt`. The anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the results of `vectorstore.similarity_search("dog", k=30)` as follows:

  ```
  [Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
   Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
   Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
   Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
   Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
  ```

## Fix

I reran `docs/docs/modules/data_connection/indexing.ipynb` and have included the diff in this PR.
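The expected counts described above can be illustrated with a small, self-contained sketch. This is a toy model written for this note, not LangChain's `index` implementation: the `Doc` class, the `toy_incremental_index` function, and the original `doggy.txt` contents are all hypothetical, and the model assumes that re-indexing a source should add its new documents, skip unchanged ones, and delete stale ones from the same source.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a document with a source in its metadata."""
    page_content: str
    source: str


def _key(doc: Doc) -> str:
    # Hash of source + content, used to detect unchanged documents.
    payload = f"{doc.source}\x00{doc.page_content}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def toy_incremental_index(docs, store):
    """Toy model of incremental cleanup; `store` maps hash -> Doc and is mutated."""
    result = {"num_added": 0, "num_updated": 0, "num_skipped": 0, "num_deleted": 0}
    seen = set()
    for doc in docs:
        key = _key(doc)
        if key in seen:
            continue  # ignore duplicates within the same batch
        seen.add(key)
        if key in store:
            result["num_skipped"] += 1  # identical content already indexed
        else:
            store[key] = doc
            result["num_added"] += 1
    # Delete stale entries that share a source with this batch but were not re-sent.
    sources = {doc.source for doc in docs}
    stale = [k for k, d in store.items() if d.source in sources and k not in seen]
    for key in stale:
        del store[key]
        result["num_deleted"] += 1
    return result


store = {}
# Initial load: kitty contents come from the outputs above; doggy contents are made up.
first = toy_incremental_index(
    [
        Doc("tty kitty", "kitty.txt"),
        Doc("tty kitty ki", "kitty.txt"),
        Doc("kitty kit", "kitty.txt"),
        Doc("doggy dog", "doggy.txt"),
        Doc("dog dog dog", "doggy.txt"),
    ],
    store,
)
# Re-index only the changed doggy.txt documents.
changed_doggy_docs = [Doc("woof woof", "doggy.txt"), Doc("woof woof woof", "doggy.txt")]
second = toy_incremental_index(changed_doggy_docs, store)
print(second)
```

In this toy model the second call reports two additions and two deletions while the `kitty.txt` entries stay untouched, which matches the shape of the corrected notebook output.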
parent ebc4a64f9e
commit d647ff1a9a
@@ -85,7 +85,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "id": "15f7263e-c82e-4914-874f-9699ea4de93e",
 "metadata": {},
 "outputs": [],
@@ -192,7 +192,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": 6,
 "id": "67d2a5c8-f2bd-489a-b58e-2c7ba7fefe6f",
 "metadata": {},
 "outputs": [],
@@ -724,7 +724,7 @@
 {
 "data": {
 "text/plain": [
-"{'num_added': 0, 'num_updated': 0, 'num_skipped': 2, 'num_deleted': 2}"
+"{'num_added': 2, 'num_updated': 0, 'num_skipped': 0, 'num_deleted': 2}"
 ]
 },
 "execution_count": 30,
@@ -751,7 +751,9 @@
 {
 "data": {
 "text/plain": [
-"[Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),\n",
+"[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),\n",
+" Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),\n",
+" Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),\n",
 " Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),\n",
 " Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]"
 ]
@@ -904,7 +906,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.1"
+"version": "3.9.12"
 }
 },
 "nbformat": 4,