mirror of https://github.com/hwchase17/langchain.git synced 2025-09-04 12:39:32 +00:00

Raymond Yuan 5171c3bcca Refactor vector storage to correctly handle relevancy scores (#6570)

Description: This pull request aims to support generating the correct
generic relevancy scores for different vector stores by refactoring the
relevance score functions and their selection in the base class and
subclasses of VectorStore. This is especially relevant with VectorStores
that require a distance metric upon initialization. Note that many of the
current implementations of `_similarity_search_with_relevance_scores` are
not technically correct, as they just return
`self.similarity_search_with_score(query, k, **kwargs)` without applying
the relevant score function.

Also includes changes associated with:
https://github.com/hwchase17/langchain/pull/6564 and
https://github.com/hwchase17/langchain/pull/6494

See the more in-depth discussion in the thread in #6494.

Issue: 
https://github.com/hwchase17/langchain/issues/6526
https://github.com/hwchase17/langchain/issues/6481
https://github.com/hwchase17/langchain/issues/6346

Dependencies: None

The changes include:
- Properly handling score thresholding in FAISS
`similarity_search_with_score_by_vector` for the corresponding distance
metric.
- Refactoring the `_similarity_search_with_relevance_scores` method in
the base class and removing it from the subclasses that implemented it
incorrectly.
- Adding a `_select_relevance_score_fn` method in the base class and
implementing it in the subclasses to select the appropriate relevance
score function based on the distance strategy.
- Updating the `__init__` methods of the subclasses to set the
`relevance_score_fn` attribute.
- Removing the `_default_relevance_score_fn` function from the FAISS
class and using the base class's `_euclidean_relevance_score_fn`
instead.
- Adding the `DistanceStrategy` enum to the `utils.py` file and updating
the imports in the vector store classes.
- Updating the tests to import the `DistanceStrategy` enum from the
`utils.py` file.
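
The selection mechanism described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the enum members, function names, and normalization formulas below are assumptions chosen for illustration (unit-normalized embeddings, scores mapped to the range 0 to 1, higher meaning more relevant).

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class DistanceStrategy(str, Enum):
    """Hypothetical stand-in for the enum the PR adds to utils.py."""

    EUCLIDEAN_DISTANCE = "EUCLIDEAN_DISTANCE"
    MAX_INNER_PRODUCT = "MAX_INNER_PRODUCT"
    COSINE = "COSINE"


def euclidean_relevance_score(distance: float) -> float:
    # Assumption: unit-norm vectors, so L2 distance lies in [0, 2].
    return 1.0 - distance / 2.0


def cosine_relevance_score(distance: float) -> float:
    # Assumption: cosine distance = 1 - cosine similarity, in [0, 2].
    return 1.0 - distance / 2.0


def inner_product_relevance_score(similarity: float) -> float:
    # Assumption: unit-norm vectors, so the inner product lies in [-1, 1].
    return (similarity + 1.0) / 2.0


_SCORE_FNS: Dict[DistanceStrategy, Callable[[float], float]] = {
    DistanceStrategy.EUCLIDEAN_DISTANCE: euclidean_relevance_score,
    DistanceStrategy.COSINE: cosine_relevance_score,
    DistanceStrategy.MAX_INNER_PRODUCT: inner_product_relevance_score,
}


def select_relevance_score_fn(strategy: DistanceStrategy) -> Callable[[float], float]:
    """Pick the normalization that matches the store's distance metric."""
    try:
        return _SCORE_FNS[strategy]
    except KeyError:
        raise ValueError(f"No relevance score function for {strategy!r}")


def with_relevance_scores(
    docs_and_scores: List[Tuple[str, float]], strategy: DistanceStrategy
) -> List[Tuple[str, float]]:
    """Convert raw store scores into relevance scores instead of returning them as-is."""
    score_fn = select_relevance_score_fn(strategy)
    return [(doc, score_fn(raw)) for doc, raw in docs_and_scores]
```

The point of the sketch is the bug the description names: returning the raw output of a similarity search skips the conversion step, so a raw distance (lower is better) would be read as a relevance score (higher is better).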

---------

Co-authored-by: Hanit <37485638+hanit-com@users.noreply.github.com>

2023-07-10 20:37:03 -07:00

.devcontainer

Update dev container (#6189)

2023-06-16 15:42:14 -07:00

.github

update pr tmpl (#7095)

2023-07-03 13:34:03 -06:00

docs

Minor update to clarify map-reduce custom prompt usage (#7453)

2023-07-10 16:43:44 -07:00

langchain

Refactor vector storage to correctly handle relevancy scores (#6570)

2023-07-10 20:37:03 -07:00

tests

Refactor vector storage to correctly handle relevancy scores (#6570)

2023-07-10 20:37:03 -07:00

.dockerignore

fix: tests with Dockerfile (#2382)

2023-04-04 06:47:19 -07:00

.flake8

…

.gitattributes

Update dev container (#6189)

2023-06-16 15:42:14 -07:00

.gitignore

Doc refactor (#6300)

2023-06-16 11:52:56 -07:00

.gitmodules

Doc refactor (#6300)

2023-06-16 11:52:56 -07:00

.readthedocs.yaml

Page per class-style api reference (#6560)

2023-06-30 09:23:32 -07:00

CITATION.cff

…

dev.Dockerfile

Update dev container (#6189)

2023-06-16 15:42:14 -07:00

Dockerfile

make ARG POETRY_HOME available in multistage (#3882)

2023-05-01 20:57:41 -07:00

LICENSE

…

Makefile

Doc refactor (#6300)

2023-06-16 11:52:56 -07:00

poetry.lock

Added deeplake use case examples of the new features (#6528)

2023-07-10 07:04:29 -07:00

poetry.toml

…

pyproject.toml

Added deeplake use case examples of the new features (#6528)

2023-07-10 07:04:29 -07:00

README.md

Del linkcheck readme (#6317)

2023-06-16 16:18:45 -07:00

README.md

🦜️🔗 LangChain

⚡ Building applications with LLMs through composability ⚡

Looking for the JS/TS version? Check out LangChain.js.

Production Support: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.

Quick Install

`pip install langchain` or `conda install langchain -c conda-forge`

🤔 What is this?

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.

This library aims to assist in the development of those types of applications. Common examples of these applications include:

❓ Question Answering over specific documents

Documentation
End-to-end Example: Question Answering over Notion Database

💬 Chatbots

Documentation
End-to-end Example: Chat-LangChain

🤖 Agents

Documentation
End-to-end Example: GPT+WolframAlpha

📖 Documentation

Please see here for full documentation on:

Getting started (installation, setting up the environment, simple examples)
How-To examples (demos, integrations, helper functions)
Reference (full API docs)
Resources (high-level explanation of core concepts)

🚀 What can this help with?

There are six main areas that LangChain is designed to help with. These are, in increasing order of complexity:

📃 LLMs and Prompts:

This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.

🔗 Chains:

Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
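
As a rough illustration of the idea of a chain, here is a minimal sketch in plain Python. It does not use LangChain's API: `fake_llm`, `make_prompt_step`, and `run_chain` are hypothetical names, and the "model" is a stand-in that simply upper-cases its prompt.

```python
from typing import Callable, List

# A step is any function from text to text: a prompt template,
# a model call, or a post-processing utility.
Step = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): upper-cases the prompt."""
    return prompt.upper()


def make_prompt_step(template: str) -> Step:
    """Build a step that fills the user's text into a prompt template."""
    return lambda text: template.format(input=text)


def run_chain(steps: List[Step], text: str) -> str:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        text = step(text)
    return text


chain: List[Step] = [make_prompt_step("Summarize: {input}"), fake_llm, str.strip]
result = run_chain(chain, "hello")
```

Each element shares one signature, which is what makes the steps interchangeable; a real chain would swap `fake_llm` for an actual model call.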

📚 Data Augmented Generation:

Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question answering over specific data sources.

🤖 Agents:

Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
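
The Action/Observation loop can be sketched in a few lines of plain Python. This is an illustration only, not LangChain's agent interface: `scripted_policy` is a hypothetical stand-in for the LLM's decision step, and `calculator` is a toy tool.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# One history entry: (action, action_input, observation).
Step = Tuple[str, str, str]
Policy = Callable[[str, List[Step]], Tuple[str, str]]


def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a + b' or 'a * b' for integers."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    if op == "+":
        return str(a + b)
    if op == "*":
        return str(a * b)
    raise ValueError(f"unsupported operator: {op}")


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM (assumption): call the calculator once, then finish."""
    if not history:
        return ("calculator", question)
    return ("finish", history[-1][2])


def run_agent(
    question: str, policy: Policy, tools: Dict[str, Tool], max_steps: int = 5
) -> str:
    """Decide on an action, run it, record the observation, repeat until done."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, action_input = policy(question, history)
        if action == "finish":
            return action_input
        observation = tools[action](action_input)
        history.append((action, action_input, observation))
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("2 + 3", scripted_policy, {"calculator": calculator})
```

The `max_steps` cap is a design choice in this sketch: a policy that never emits `finish` would otherwise loop forever.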

🧠 Memory:

Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
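
A minimal sketch of what persisting state between calls can look like, again in plain Python rather than LangChain's memory classes. `BufferMemory` and `reply_with_memory` are hypothetical names introduced for illustration.

```python
from typing import List, Tuple


class BufferMemory:
    """Keeps every (user, reply) turn so later calls can see earlier ones."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, user_input: str, reply: str) -> None:
        self.turns.append((user_input, reply))

    def as_context(self) -> str:
        """Render the stored turns as text that could be prepended to a prompt."""
        return "\n".join(f"User: {u}\nAI: {r}" for u, r in self.turns)


def reply_with_memory(user_input: str, memory: BufferMemory) -> str:
    """Stand-in for a chain call (assumption): the reply depends on stored state."""
    reply = f"seen {len(memory.turns)} earlier turn(s); you said: {user_input}"
    memory.save(user_input, reply)
    return reply


memory = BufferMemory()
first = reply_with_memory("hi", memory)
second = reply_with_memory("again", memory)
```

The second call returns a different kind of answer than the first only because the memory object carried state across calls.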

🧐 Evaluation:

[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

For more information on these concepts, please see our full documentation.

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see here.

Readme MIT Cite this repository 4.9 GiB

Languages

Jupyter Notebook 74.2%

Python 20.7%

omnetpp-msg 4.8%

Makefile 0.1%

MDX 0.1%
