mirror of https://github.com/hwchase17/langchain.git synced 2026-04-23 20:23:59 +00:00

Shotaro Sano d647ff1a9a docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112)

## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.

### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This unexpected behavior also impacts the results of
`vectorstore.similarity_search("dog", k=30)`, showing only the contents
of `kitty.txt`. It appears as though the contents of `doggy.txt` have
been completely removed from the index:

```
[Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```
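
The expected delete-then-add behavior can be illustrated with a small, self-contained sketch. This is not LangChain's `index` implementation: `Doc`, `ToyIndex`, and `doc_hash` are hypothetical names, the store is a plain dictionary rather than a vector store, and the sample documents are made up. It only shows the idea that re-indexing a source should add the new chunks and drop the stale ones.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Doc:
    """Toy stand-in for a document chunk (hypothetical, not LangChain's Document)."""
    page_content: str
    source: str


def doc_hash(doc: Doc) -> str:
    """Hash source and content together so identical chunks deduplicate."""
    raw = f"{doc.source}\x00{doc.page_content}".encode()
    return hashlib.sha256(raw).hexdigest()


@dataclass
class ToyIndex:
    """Maps hash -> Doc; an assumed stand-in for a vector store plus record manager."""
    store: dict = field(default_factory=dict)

    def index(self, docs: list) -> dict:
        """Add unseen chunks, then delete stale chunks that share a source with `docs`."""
        new_hashes = {doc_hash(d): d for d in docs}
        sources = {d.source for d in docs}
        num_added = num_skipped = 0
        for h, d in new_hashes.items():
            if h in self.store:
                num_skipped += 1
            else:
                self.store[h] = d
                num_added += 1
        # Stale = same source as the incoming batch, but not part of the batch.
        stale = [h for h, d in self.store.items()
                 if d.source in sources and h not in new_hashes]
        for h in stale:
            del self.store[h]
        return {"num_added": num_added, "num_skipped": num_skipped,
                "num_deleted": len(stale)}


idx = ToyIndex()
first = idx.index([Doc("doggy doggy", "doggy.txt"), Doc("kitty kit", "kitty.txt")])
second = idx.index([Doc("woof woof", "doggy.txt"), Doc("woof woof woof", "doggy.txt")])
print(second)  # {'num_added': 2, 'num_skipped': 0, 'num_deleted': 1}
```

In this sketch the `kitty.txt` chunk survives the second call because only chunks whose source appears in the incoming batch are candidates for deletion.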

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.

2024-03-15 22:27:15 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| .devcontainer | Update README.md (#8570) | 2023-11-12 22:07:49 -08:00 |
| .github | infra: run min version ci before integration tests (#18945) | 2024-03-15 12:14:44 -07:00 |
| cookbook | docs: Updating cookbook README for amazon personalize (#17854) | 2024-03-08 16:52:36 -08:00 |
| docker | community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815) | 2024-03-08 14:39:28 -05:00 |
| docs | docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112) | 2024-03-15 22:27:15 +00:00 |
| libs | docs: fix databricks document url (#19096) | 2024-03-15 22:25:11 +00:00 |
| templates | templates: Switch neo4j generation template to LLMGraphTransformer (#19024) | 2024-03-14 16:00:42 -07:00 |
| .gitattributes | Update dev container (#6189) | 2023-06-16 15:42:14 -07:00 |
| .gitignore | airbyte[patch]: init pkg (#18236) | 2024-02-27 19:37:53 -08:00 |
| .readthedocs.yaml | infra: update rtd yaml (#17502) | 2024-02-13 18:16:44 -08:00 |
| CITATION.cff | rename repo namespace to langchain-ai (#11259) | 2023-10-01 15:30:58 -04:00 |
| LICENSE | Library Licenses (#13300) | 2023-11-28 17:34:27 -08:00 |
| Makefile | infra: update to pathspec for 'git grep' in lint check (#18178) | 2024-03-01 22:03:45 +00:00 |
| MIGRATE.md | Update main readme (#13298) | 2023-11-13 17:37:54 -08:00 |
| poetry.lock | text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) | 2024-02-29 18:33:21 -08:00 |
| poetry.toml | Unbreak devcontainer (#8154) | 2023-07-23 19:33:47 -07:00 |
| pyproject.toml | text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) | 2024-02-29 18:33:21 -08:00 |
| README.md | docs: update readme diagram (#18929) | 2024-03-11 11:17:45 -07:00 |
| SECURITY.md | Updated security policy (#19089) | 2024-03-14 20:58:47 +00:00 |

README.md

🦜️🔗 LangChain

⚡ Build context-aware reasoning applications ⚡

Looking for the JS/TS library? Check out LangChain.js.

To help you ship LangChain apps to production faster, check out LangSmith. LangSmith is a unified developer platform for building, testing, and monitoring LLM applications. Fill out this form to speak with our sales team.

Quick Install

With pip:

pip install langchain

With conda:

conda install langchain -c conda-forge

🤔 What is LangChain?

LangChain is a framework for developing applications powered by language models. It enables applications that:

- Are context-aware: connect a language model to sources of context (prompt instructions, few-shot examples, content to ground its response in, etc.)
- Reason: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)

This framework consists of several parts.

- LangChain Libraries: The Python and JavaScript libraries. They contain interfaces and integrations for a myriad of components, a basic runtime for combining these components into chains and agents, and off-the-shelf implementations of chains and agents.
- LangChain Templates: A collection of easily deployable reference architectures for a wide variety of tasks.
- LangServe: A library for deploying LangChain chains as a REST API.
- LangSmith: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework, and that integrates seamlessly with LangChain.
- LangGraph: A library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain. It extends the LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner.

The LangChain libraries themselves are made up of several different packages.

- langchain-core: Base abstractions and LangChain Expression Language.
- langchain-community: Third-party integrations.
- langchain: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.

🧱 What can you build with LangChain?

❓ Retrieval augmented generation

Documentation
End-to-end Example: Chat LangChain and repo

💬 Analyzing structured data

Documentation
End-to-end Example: SQL Llama2 Template

🤖 Chatbots

Documentation
End-to-end Example: Web LangChain (web researcher chatbot) and repo

And much more! Head to the Use cases section of the docs for more.

🚀 How does LangChain help?

The main value propositions of the LangChain libraries are:

- Components: composable tools and integrations for working with language models. Components are modular and easy to use, whether or not you use the rest of the LangChain framework.
- Off-the-shelf chains: built-in assemblages of components for accomplishing higher-level tasks.

Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.

Components fall into the following modules:

📃 Model I/O:

This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.

📚 Retrieval:

Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
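
The fetch-then-generate pattern described above can be sketched without any LangChain dependency. Everything here is an assumption for illustration: `retrieve` and `build_prompt` are hypothetical helpers, word overlap stands in for a real retriever, and the final call to a language model is left out.

```python
def retrieve(query, corpus, k=2):
    """Rank documents by how many lowercase words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, context):
    """Place the retrieved text ahead of the question so the model can ground its answer."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


corpus = [
    "LangServe deploys chains as a REST API",
    "LangSmith helps debug and monitor chains",
    "Conda installs packages",
]
query = "how do I deploy chains as an API"
top = retrieve(query, corpus)
prompt = build_prompt(query, top)
print(prompt)
```

In a real application the prompt would then be passed to a model; the sketch stops at the point where the retrieved data has been folded into the generation input.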

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
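
The action/observation loop described above can be sketched in plain Python. This is a toy under stated assumptions, not LangChain's agent interface: `TOOLS`, `scripted_policy`, and `run_agent` are hypothetical names, and a hard-coded policy function stands in for the LLM that would normally choose each action.

```python
# Hypothetical tools; each maps an input string to an observation string.
TOOLS = {
    "add": lambda s: str(sum(int(x) for x in s.split("+"))),
    "upper": lambda s: s.upper(),
}


def scripted_policy(question, history):
    """Stand-in for an LLM: returns (action, action_input) or ("finish", answer)."""
    if not history:
        return "add", question
    # After one tool call, report the last observation as the final answer.
    return "finish", history[-1][2]


def run_agent(question, policy, max_steps=5):
    """Choose an action, run it, record the observation, and repeat until finished."""
    history = []
    for _ in range(max_steps):
        action, action_input = policy(question, history)
        if action == "finish":
            return action_input, history
        observation = TOOLS[action](action_input)
        history.append((action, action_input, observation))
    raise RuntimeError("agent did not finish within max_steps")


answer, steps = run_agent("2+3", scripted_policy)
print(answer)  # 5
```

The `max_steps` cap is a design choice of this sketch: because the loop runs until the policy says it is done, some bound is needed to guard against a policy that never finishes.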

📖 Documentation

Please see here for full documentation, which includes:

- Getting started: installation, setting up the environment, simple examples
- Overview of the interfaces, modules, and integrations
- Use case walkthroughs and best practice guides
- LangSmith, LangServe, and LangChain Template overviews
- Reference: full API docs

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see here.

🌟 Contributors

Description

⚡ Building applications with LLMs through composability ⚡
