langchain/docs
Shotaro Sano d647ff1a9a
docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112)
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.

### Current Behavior
- In the published notebook output, the `index` function appears not to
add the new content of `doggy.txt`; the cause is unclear. It deletes the
document objects associated with the `doggy.txt` source but does not add
the objects in `changed_doggy_docs`, so the execution result displays
`num_added: 0`.
- As a result, `vectorstore.similarity_search("dog", k=30)` returns only
the contents of `kitty.txt`, as though the contents of `doggy.txt` had
been removed from the index entirely:

```
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```
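
To make the expected counts concrete, here is a minimal, self-contained sketch of source-based incremental cleanup. It is a toy model in plain Python, not LangChain's implementation: `Doc`, `doc_key`, and `toy_incremental_index` are hypothetical names, the store is an in-memory dict rather than a vector store, and the old `doggy.txt` chunks are made-up placeholders.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a document chunk with a source."""
    page_content: str
    source: str


def doc_key(doc: Doc) -> str:
    # Key each chunk by a hash of its source and content, so unchanged
    # chunks can be recognised and skipped on re-indexing.
    raw = f"{doc.source}\x00{doc.page_content}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def toy_incremental_index(store: dict, docs: list) -> dict:
    """Toy model: write new chunks, then drop stale chunks whose source
    appears in this batch but whose key does not."""
    incoming = {doc_key(d): d for d in docs}
    sources = {d.source for d in docs}

    num_added = num_skipped = 0
    for key, doc in incoming.items():
        if key in store:
            num_skipped += 1  # identical chunk already indexed
        else:
            store[key] = doc
            num_added += 1

    # Stale = same source as something in this batch, but not in the batch.
    stale = [k for k, d in store.items()
             if d.source in sources and k not in incoming]
    for key in stale:
        del store[key]

    return {"num_added": num_added,
            "num_skipped": num_skipped,
            "num_deleted": len(stale)}


store = {}

# Initial state: kitty chunks as shown in the output above; the two old
# doggy chunks are illustrative placeholders (assumption).
initial_docs = [
    Doc("tty kitty", "kitty.txt"),
    Doc("tty kitty ki", "kitty.txt"),
    Doc("kitty kit", "kitty.txt"),
    Doc("doggy doggy", "doggy.txt"),
    Doc("the doggy", "doggy.txt"),
]
first = toy_incremental_index(store, initial_docs)

# Re-index only the changed doggy.txt content.
changed_doggy_docs = [
    Doc("woof woof", "doggy.txt"),
    Doc("woof woof woof", "doggy.txt"),
]
result = toy_incremental_index(store, changed_doggy_docs)
print(result)
```

In this toy run the two stale `doggy.txt` chunks are dropped and the two new ones are written, so the second call reports `num_added: 2`, the count the notebook output is expected to show, while the `kitty.txt` chunks are left untouched.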

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
2024-03-15 22:27:15 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| api_reference | community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106) | 2024-03-05 15:47:29 -08:00 |
| data | 👥 Update LangChain people data (#18473) | 2024-03-03 19:58:58 -08:00 |
| docs | docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112) | 2024-03-15 22:27:15 +00:00 |
| scripts | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 2024-03-14 17:42:22 -07:00 |
| src | docs[patch]: properly load/use env vars (#18942) | 2024-03-11 15:38:05 -07:00 |
| static | docs: Add graph construction docs (#18904) | 2024-03-13 12:27:58 -07:00 |
| .gitignore | docs[minor]: Swap gtag for supabase (#18937) | 2024-03-11 14:23:12 -07:00 |
| .local_build.sh | docs: partner packages (#16960) | 2024-02-02 15:12:21 -08:00 |
| .yarnrc.yml | docs[minor]: Add thumbs up/down to all docs pages (#18526) | 2024-03-04 15:14:28 -08:00 |
| babel.config.js | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| code-block-loader.js | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| docusaurus.config.js | docs[patch]: properly load/use env vars (#18942) | 2024-03-11 15:38:05 -07:00 |
| package.json | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 2024-03-14 17:42:22 -07:00 |
| README.md | docs: developer docs (#14776) | 2023-12-17 12:55:49 -08:00 |
| settings.ini | Restructure docs (#11620) | 2023-10-10 12:55:19 -07:00 |
| sidebars.js | docs: Toolkits menu (#16217) | 2024-02-08 14:52:26 -08:00 |
| vercel_build.sh | docs: fix vercel build script (#19090) | 2024-03-14 20:53:43 +00:00 |
| vercel_requirements.txt | infra: docs build install community editable (#14739) | 2023-12-14 16:13:09 -08:00 |
| vercel.json | docs: providers update 4 (#18540) | 2024-03-09 13:30:48 -08:00 |
| yarn.lock | docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) | 2024-03-14 17:42:22 -07:00 |

LangChain Documentation

For more information on contributing to our documentation, see the Documentation Contributing Guide.