Compare commits

53 Commits

Author SHA1 Message Date
Erick Friis
534b8f4364 standard-tests: release 0.3.7 (#28637) 2024-12-09 15:12:18 -05:00
Tomaz Bratanic
6815981578 Switch graphqa example in docs to langgraph (#28574)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-09 14:46:00 -05:00
Naka Masato
ce3b69aa05 community: add include_labels option to ConfluenceLoader (#28259)
## **Description:**

Enable `ConfluenceLoader` to include labels via the `include_labels` option
(`false` by default for backward compatibility). The labels are stored in
the `metadata` of each `Document`, e.g. `{"labels": ["l1", "l2"]}`

## Notes

The Confluence API returns labels when `metadata.labels` is passed to the
`expand` query parameter.

All of the following functions support `expand` in the same way:
- confluence.get_page_by_id
- confluence.get_all_pages_by_label
- confluence.get_all_pages_from_space
- cql (internally using
[/api/content/search](https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content/#api-wiki-rest-api-content-search-get))

## **Issue:**

No issue related to this PR.

## **Dependencies:** 

No changes.

## **Twitter handle:** 

[@gymnstcs](https://x.com/gymnstcs)


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 19:35:01 +00:00
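To make the mechanism described in ce3b69aa05 concrete, here is a minimal sketch of how an `expand` value might be assembled and how labels could be copied into document metadata. The helper names `build_expand` and `labels_from_page` and the sample payload are hypothetical and are not taken from the PR. The payload shape (`metadata.labels.results[].name`) is an assumption about the Confluence REST response.

```python
from typing import Any


def build_expand(content_format: str, include_labels: bool = False) -> str:
    """Build the comma-separated `expand` query parameter (hypothetical helper)."""
    parts = [content_format, "version"]
    if include_labels:
        # Ask Confluence to include labels in the response.
        parts.append("metadata.labels")
    return ",".join(parts)


def labels_from_page(page: dict[str, Any]) -> list[str]:
    """Extract label names from a page payload, tolerating missing keys."""
    results = page.get("metadata", {}).get("labels", {}).get("results", [])
    return [label["name"] for label in results]


# Assumed payload shape, trimmed to the fields used here.
page = {
    "title": "Release notes",
    "metadata": {"labels": {"results": [{"name": "l1"}, {"name": "l2"}]}},
}

expand = build_expand("body.storage", include_labels=True)
metadata = {"title": page["title"], "labels": labels_from_page(page)}
print(expand)
print(metadata)
```

With `include_labels` left at its default, the sketch omits `metadata.labels` from `expand`, which mirrors the backward-compatible default described in the commit message.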
Rajendra Kadam
242fee11be community[minor] Pebblo: Support for new Pinecone class PineconeVectorStore (#28253)
- **Description:** Support for new Pinecone class PineconeVectorStore in
PebbloRetrievalQA.
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** -

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 19:33:54 +00:00
Pranav Ramesh Lohar
85114b4f3a docs: Update sql-query doc by fixing spelling mistake of chinhook.db to chinook.db (#28465)
Link (of doc with mistake):
https://python.langchain.com/v0.1/docs/use_cases/sql/quickstart/#:~:text=Now%2C-,Chinhook.db,-is%20in%20our

  - **Description:** spelling mistake in the how-to docs for sql-db
  - **Issue:** just a spelling mistake.
  - **Dependencies:** NA
2024-12-09 14:15:29 -05:00
nikitajoyn
9fcd203556 partners/mistralai: Fix KeyError in Vertex AI stream (#28624)
- **Description:** Streaming a response from a Mistral model using Vertex AI
raises a `KeyError` when accessing the `choices` key, which the last chunk
does not have. The fix is to access the key safely using `get()`.
  - **Issue:** https://github.com/langchain-ai/langchain/issues/27886
  - **Dependencies:**
  - **Twitter handle:**
2024-12-09 14:14:58 -05:00
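The `get()` fix described in 9fcd203556 can be sketched as follows. This is an illustration only: the chunk shapes and the helper name `extract_delta` are assumptions, not the code in the partner package.

```python
def extract_delta(chunk: dict) -> str:
    """Return the text delta of a streamed chunk, or "" if it has none."""
    # `chunk["choices"]` would raise KeyError on a chunk without that key;
    # `get()` returns None instead, which is normalized to an empty list.
    choices = chunk.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("delta", {}).get("content") or ""


# Hypothetical stream: the final chunk carries usage data but no `choices`.
chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"usage": {"total_tokens": 5}},
]

text = "".join(extract_delta(chunk) for chunk in chunks)
print(text)
```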
Huy Nguyen
bdb4cf7cc0 Fix typo in Custom Output Parser doc (#28617)
- [x] Fix typo in Custom Output Parser doc
2024-12-09 14:14:00 -05:00
ccurme
b476fdb54a docs: update readme (#28631) 2024-12-09 13:50:12 -05:00
maang-h
b64d846347 docs: Standardize MoonshotChat docstring (#28159)
- **Description:** Add docstring

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 18:46:25 +00:00
Erick Friis
4c70ffff01 standard-tests: sync/async vectorstore tests conditional (#28636)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-09 18:02:55 +00:00
ccurme
ffb5c1905a openai[patch]: release 0.2.12 (#28633) 2024-12-09 12:38:13 -05:00
ccurme
6e6061fe73 openai[patch]: bump minimum SDK version (#28632)
Resolves https://github.com/langchain-ai/langchain/issues/28625
2024-12-09 11:28:05 -05:00
Mohammad Mohtashim
ec9b41431e [Core]: Small Docstring Clarification for BaseTool (#28148)
- **Description:** `kwargs` were not being passed to the `run` method of
`BaseTool`; this has been fixed
- **Issue:** #28114

---------

Co-authored-by: Stevan Kapicic <kapicic.ste1@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 06:10:19 +00:00
Erick Friis
cef21a0b49 cli: warning on app add (#28619)
instead of #28128
2024-12-09 06:07:14 +00:00
Ankit Dangi
90f162efb6 text-splitters: add pydocstyle linting (#28127)
As seen in #23188, turned on Google-style docstrings by enabling
`pydocstyle` linting in the `text-splitters` package. Each resulting
linting error was addressed differently: ignored, resolved, suppressed,
and missing docstrings were added.

Fixes one of the checklist items from #25154, similar to #25939 in
`core` package. Ran `make format`, `make lint` and `make test` from the
root of the package `text-splitters` to ensure no issues were found.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 06:01:03 +00:00
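For readers unfamiliar with the docstring convention that 90f162efb6 enforces, the following is a small sketch of a Google-style docstring. The function `split_text` is hypothetical and is not part of the `text-splitters` package.

```python
def split_text(text: str, chunk_size: int) -> list[str]:
    """Split text into fixed-size chunks.

    Args:
        text: The text to split.
        chunk_size: Maximum number of characters per chunk.

    Returns:
        The chunks, in their original order.

    Raises:
        ValueError: If ``chunk_size`` is not positive.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]


print(split_text("abcdef", 4))
```

The one-line summary, followed by `Args`, `Returns`, and `Raises` sections, is the layout that Google-convention docstring linters check for.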
Erick Friis
b53f07bfb9 docs: more integration contrib (#28618) 2024-12-09 05:41:08 +00:00
WGNW_MG
eabe587787 community[patch]:Fix for get_openai_callback() return token_cost=0.0 when model is gpt-4o-11-20 (#28408)
- **Description:** Update `MODEL_COST_PER_1K_TOKENS` for the new gpt-4o-11-20.
- **Issue:** With the latest gpt-4o-11-20, the OpenAI callback returns
token_cost=0.0
- **Dependencies:** None (just simple dict fix.)
- **Twitter handle:** I Don't Use Twitter. 
- (However..., I have a YouTube channel. Could you upload this there, by
any chance?
https://www.youtube.com/@%EA%B2%9C%EC%B0%BD%EB%B6%80%EA%B3%A0%EB%AC%B8AI%EC%9E%90%EB%AC%B8%EC%84%BC%EC%84%B8)
2024-12-08 20:46:50 -08:00
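The symptom fixed in eabe587787 can be illustrated with a simplified cost lookup. This sketch assumes the callback falls back to a cost of 0.0 when a model is missing from the price table. The model names, the prices, and the function `token_cost` are hypothetical and do not reflect the real table in `langchain-community`.

```python
# Hypothetical prices per 1K tokens; not a real price list.
MODEL_COST_PER_1K_TOKENS = {
    "known-model": 0.002,
    "known-model-completion": 0.004,
}


def token_cost(model_name: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost of a call, or 0.0 if the model has no table entry."""
    if model_name not in MODEL_COST_PER_1K_TOKENS:
        # Assumed fallback: unknown models are silently priced at zero.
        return 0.0
    prompt_cost = MODEL_COST_PER_1K_TOKENS[model_name] * prompt_tokens / 1000
    completion_cost = (
        MODEL_COST_PER_1K_TOKENS[model_name + "-completion"] * completion_tokens / 1000
    )
    return prompt_cost + completion_cost


# Before the table is updated, the new model is priced at zero.
before = token_cost("new-model", 1000, 500)

# The "simple dict fix": add entries for the new model.
MODEL_COST_PER_1K_TOKENS["new-model"] = 0.002
MODEL_COST_PER_1K_TOKENS["new-model-completion"] = 0.004
after = token_cost("new-model", 1000, 500)

print(before, after)
```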
Fahim Zaman
481c4bfaba core[patch]: Fixed trim functions, and added corresponding unit test for the solved issue (#28429)
- **Description:** 
- Trim functions were incorrectly deleting nodes with more than 1
outgoing/incoming edge, so an extra condition was added to check for
this directly. A unit test "test_trim_multi_edge" was written to test
this test case specifically.
- **Issue:** 
  - Fixes #28411 
  - Fixes https://github.com/langchain-ai/langgraph/issues/1676
- **Dependencies:** 
  - No changes were made to the dependencies

- [x] Unit tests were added to verify the changes.
- [x] Updated documentation where necessary.
- [x] Ran make format, make lint, and make test to ensure compliance
with project standards.

---------

Co-authored-by: Tasif Hussain <tasif006@gmail.com>
2024-12-08 20:45:28 -08:00
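The extra condition described in 481c4bfaba can be sketched on a toy graph. This is a simplified illustration under stated assumptions, not the `langchain-core` implementation: nodes are plain strings, edges are `(source, target)` tuples, and `trim_first_node` is a hypothetical stand-alone function.

```python
Edge = tuple[str, str]  # (source, target)


def trim_first_node(nodes: list[str], edges: list[Edge]) -> tuple[list[str], list[Edge]]:
    """Drop the entry node only if it has exactly one outgoing edge."""
    targets = {target for _, target in edges}
    firsts = [node for node in nodes if node not in targets]
    if len(firsts) != 1:
        return nodes, edges  # no unique entry node, nothing to trim
    first = firsts[0]
    outgoing = [edge for edge in edges if edge[0] == first]
    # The extra check: a node that fans out to several targets carries
    # structure and must not be removed.
    if len(outgoing) != 1:
        return nodes, edges
    return (
        [node for node in nodes if node != first],
        [edge for edge in edges if edge[0] != first],
    )


# A chain: the entry node has one outgoing edge, so it is trimmed.
chain = trim_first_node(["start", "a", "b"], [("start", "a"), ("a", "b")])

# A fan-out: the entry node has two outgoing edges, so it is kept.
fan_out = trim_first_node(["start", "a", "b"], [("start", "a"), ("start", "b")])

print(chain)
print(fan_out)
```

The same reasoning applies in mirror image to trimming the last node, where the count of incoming edges is checked instead.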
Inah Jeon
54fba7e520 docs: change upstage solar model descriptions (#28419)
- [ ] **PR message**:
- **Description:**: We have launched the new **Solar Pro** model, and
the documentation has been updated to include its details and features.

2024-12-08 20:43:19 -08:00
funkyrailroad
079c7ea0fc docs: Fix typo in weaviate integration docs (#28425)
- [ ] "docs: Fix typo in weaviate integration docs"
2024-12-08 20:42:00 -08:00
Zapiron
e8508fb4c6 docs: Fixed mini typo in recommend and improve the phrasing (#28438)
Fixed a typo on the word "recommend" and generally improved the phrasing
2024-12-08 20:30:43 -08:00
Zapiron
220b33df7f docs: Fixed broken link in the warning message to @tool API Reference… (#28437)
Fixed the broken hyperlink in the warning of docstring section to the
correct `@tool` API reference
2024-12-08 20:29:08 -08:00
Zapiron
1fc4ac32f0 docs: Resolve incorrect import of AttributeInfo for self-query retriever section (#28446)
Most of the notebooks in the self-query retriever section imported
`AttributeInfo` from `query_constructor.base` instead of
`query_constructor.schema`, which is where the API reference places it
([here](https://python.langchain.com/api_reference/langchain/chains/langchain.chains.query_constructor.schema.AttributeInfo.html)).

This PR resolves the wrong imports from most of the notebooks
2024-12-08 20:23:26 -08:00
Marco Perini
2354bb7bfa partners: 🕷️🦜 ScrapeGraph API Integration (#28559)
Hi Langchain team!

I'm the co-founder and maintainer at
[ScrapeGraphAI](https://scrapegraphai.com/).
By following the integration
[guide](https://python.langchain.com/docs/contributing/how_to/integrations/publish/)
on your site, I have created a new lib called
[langchain-scrapegraph](https://github.com/ScrapeGraphAI/langchain-scrapegraph).

With this PR I would like to integrate Scrapegraph as provider in
Langchain, adding the required documentation files.
Let me know if there are some changes to be made to be properly
integrated both in the lib and in the documentation.

Thank you 🕷️🦜


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-09 02:38:21 +00:00
Abhinav
317a38b83e community[minor]: Add support for modle2vec embeddings (#28507)
This PR adds an embeddings integration for model2vec, the
`Model2vecEmbeddings` class.

- **Description**: [Model2Vec](https://github.com/MinishLab/model2vec)
lets you turn any sentence transformer into a really small static model
and makes running the model faster.
- **Issue**:
- **Dependencies**: model2vec
([pypi](https://pypi.org/project/model2vec/))
- **Twitter handle:**:

- [x] **Add tests and docs**: 
-
[Test](https://github.com/blacksmithop/langchain/blob/model2vec_embeddings/libs/community/langchain_community/embeddings/model2vec.py),
[docs](https://github.com/blacksmithop/langchain/blob/model2vec_embeddings/docs/docs/integrations/text_embedding/model2vec.ipynb)

- [x] **Lint and test**:

---------

Co-authored-by: Abhinav KM <abhinav.m@zerone-consulting.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-12-09 02:17:22 +00:00
Mateusz Szewczyk
fbf0704e48 docs: Update IBM documentation (#28503)
PR: Update IBM documentation
2024-12-08 12:40:29 -08:00
Mohammad Mohtashim
524ee6d9ac Invalid tool_choice being passed to ChatLiteLLM (#28198)
- **Description:** An invalid `tool_choice` is passed to `bind_tools` of
`ChatLiteLLM` because its parent class's default value is passed through
`with_structured_output`.
- **Issue:** #28176
2024-12-07 14:33:40 -05:00
Erick Friis
dd0085a9ff docs: standard tests to markdown, load templates from files (#28603) 2024-12-07 01:37:21 +00:00
Erick Friis
9b848491c8 docs: tool, retriever contributing docs (#28602) 2024-12-07 00:36:55 +00:00
Erick Friis
5e8553c31a standard-tests: retriever docstrings (#28596) 2024-12-07 00:32:19 +00:00
ccurme
d801c6ffc7 tests[patch]: nits (#28601) 2024-12-07 00:13:04 +00:00
Ikko Eltociear Ashimine
a32035d17d docs: update uptrain.ipynb (#28561)
evluate -> evaluate
2024-12-06 19:09:48 -05:00
Erick Friis
07c2ac765a community: release 0.3.10 (#28600) 2024-12-07 00:07:13 +00:00
Erick Friis
4a7dc6ec4c standard-tests: release 0.3.6 (#28599) 2024-12-07 00:05:04 +00:00
ccurme
80a88f8f04 tests[patch]: update API ref for chat models (#28594) 2024-12-06 19:00:14 -05:00
Erick Friis
0eb7ab65f1 multiple: fix xfailed signatures (#28597) 2024-12-06 15:39:47 -08:00
Erick Friis
b7c2029e84 standard-tests: root docstrings (#28595) 2024-12-06 15:14:52 -08:00
Erick Friis
925ca75ca5 docs: format (#28593) 2024-12-06 15:08:25 -08:00
Erick Friis
f943205ebf docs: dont document root init (#28592) 2024-12-06 15:07:53 -08:00
Erick Friis
9e2abcd152 standard-tests: show right classes in api docs (#28591) 2024-12-06 14:48:13 -08:00
Erick Friis
246c10a1cc standard-tests: private members and tools unit troubleshoot (#28590) 2024-12-06 13:52:58 -08:00
Erick Friis
1cedf401a7 docs: enable private docstring submembers sphinx (#28589) 2024-12-06 13:36:34 -08:00
Erick Friis
791d7e965e docs: enable private docstring modules sphinx (#28588) 2024-12-06 13:23:06 -08:00
Erick Friis
4f99952129 docs: enable private docstring members sphinx (#28586) 2024-12-06 13:19:52 -08:00
Bagatur
221ab03fe4 docs: readme/intro nits (#28581) 2024-12-06 12:52:15 -08:00
Erick Friis
e6663b69f3 langchain: release 0.3.10 (#28585) 2024-12-06 20:20:24 +00:00
Erick Friis
c38b845d7e core: fix path test (#28584) 2024-12-06 20:05:18 +00:00
ccurme
2c6bc74cb1 multiple: combine sync/async vector store standard test suites (#28580)
Breaking change in `langchain-tests`.
2024-12-06 14:55:06 -05:00
Bagatur
dda9f90047 core[patch]: Release 0.3.22 (#28582) 2024-12-06 19:36:53 +00:00
ccurme
15cbc36a23 docs[patch]: update contributor docs for integrations (#28576)
- Reformat tabs
- Add code snippets inline
- Add embeddings content
2024-12-06 13:33:24 -05:00
ccurme
f3dc142d3c cli[patch]: implement minimal starter vector store (#28577)
Basically the same as core's in-memory vector store. Removed some
optional methods.
2024-12-06 13:10:22 -05:00
Erick Friis
5277a021c1 docs: raw loader codeblock (#28548)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-06 09:26:34 -08:00
Erick Friis
18386c16c7 core, tests: more tolerant _aget_relevant_documents function (#28462) 2024-12-06 00:49:30 +00:00
120 changed files with 5244 additions and 3675 deletions

View File

@@ -38,18 +38,21 @@ conda install langchain -c conda-forge
For these applications, LangChain simplifies the entire application lifecycle:
- **Open-source libraries**: Build your applications using LangChain's open-source [building blocks](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel), [components](https://python.langchain.com/docs/concepts/), and [third-party integrations](https://python.langchain.com/docs/integrations/providers/).
- **Open-source libraries**: Build your applications using LangChain's open-source
[components](https://python.langchain.com/docs/concepts/) and
[third-party integrations](https://python.langchain.com/docs/integrations/providers/).
Use [LangGraph](https://langchain-ai.github.io/langgraph/) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Productionization**: Inspect, monitor, and evaluate your apps with [LangSmith](https://docs.smith.langchain.com/) so that you can constantly optimize and deploy with confidence.
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/).
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/).
### Open-source libraries
- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- **`langchain-community`**: Third party integrations.
- Some integrations have been further split into **partner packages** that only rely on **`langchain-core`**. Examples include **`langchain_openai`** and **`langchain_anthropic`**.
- **`langchain-core`**: Base abstractions.
- **Integration packages** (e.g. **`langchain-openai`**, **`langchain-anthropic`**, etc.): Important integrations have been split into lightweight packages that are co-maintained by the LangChain team and the integration developers.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **[`LangGraph`](https://langchain-ai.github.io/langgraph/)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it. To learn more about LangGraph, check out our first LangChain Academy course, *Introduction to LangGraph*, available [here](https://academy.langchain.com/courses/intro-to-langgraph).
- **`langchain-community`**: Third-party integrations that are community maintained.
- **[LangGraph](https://langchain-ai.github.io/langgraph)**: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it. To learn more about LangGraph, check out our first LangChain Academy course, *Introduction to LangGraph*, available [here](https://academy.langchain.com/courses/intro-to-langgraph).
### Productionization:
@@ -57,7 +60,7 @@ For these applications, LangChain simplifies the entire application lifecycle:
### Deployment:
- **[LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/)**: Turn your LangGraph applications into production-ready APIs and Assistants.
- **[LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/)**: Turn your LangGraph applications into production-ready APIs and Assistants.
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack_112024.svg#gh-light-mode-only "LangChain Architecture Overview")
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack_112024_dark.svg#gh-dark-mode-only "LangChain Architecture Overview")
@@ -85,19 +88,12 @@ And much more! Head to the [Tutorials](https://python.langchain.com/docs/tutoria
The main value props of the LangChain libraries are:
1. **Components**: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2. **Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
## LangChain Expression Language (LCEL)
LCEL is a key part of LangChain, allowing you to build and organize chains of processes in a straightforward, declarative manner. It was designed to support taking prototypes directly into production without needing to alter any code. This means you can use LCEL to set up everything from basic "prompt + LLM" setups to intricate, multi-step workflows.
- **[Overview](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/docs/concepts/#runnable-interface)**: The standard Runnable interface for LCEL objects
- **[Primitives](https://python.langchain.com/docs/how_to/#langchain-expression-language-lcel)**: More on the primitives LCEL includes
- **[Cheatsheet](https://python.langchain.com/docs/how_to/lcel_cheatsheet/)**: Quick overview of the most common usage patterns
1. **Components**: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not.
2. **Easy orchestration with LangGraph**: [LangGraph](https://langchain-ai.github.io/langgraph/),
built on top of `langchain-core`, has built-in support for [messages](https://python.langchain.com/docs/concepts/messages/), [tools](https://python.langchain.com/docs/concepts/tools/),
and other LangChain abstractions. This makes it easy to combine components into
production-ready applications with persistence, streaming, and other key features.
Check out the LangChain [tutorials page](https://python.langchain.com/docs/tutorials/#orchestration) for examples.
## Components
@@ -105,15 +101,19 @@ Components fall into the following **modules**:
**📃 Model I/O**
This includes [prompt management](https://python.langchain.com/docs/concepts/#prompt-templates), [prompt optimization](https://python.langchain.com/docs/concepts/#example-selectors), a generic interface for [chat models](https://python.langchain.com/docs/concepts/#chat-models) and [LLMs](https://python.langchain.com/docs/concepts/#llms), and common utilities for working with [model outputs](https://python.langchain.com/docs/concepts/#output-parsers).
This includes [prompt management](https://python.langchain.com/docs/concepts/prompt_templates/)
and a generic interface for [chat models](https://python.langchain.com/docs/concepts/chat_models/), including a consistent interface for [tool-calling](https://python.langchain.com/docs/concepts/tool_calling/) and [structured output](https://python.langchain.com/docs/concepts/structured_outputs/) across model providers.
**📚 Retrieval**
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/docs/concepts/#document-loaders) from a variety of sources, [preparing it](https://python.langchain.com/docs/concepts/#text-splitters), then [searching over (a.k.a. retrieving from)](https://python.langchain.com/docs/concepts/#retrievers) it for use in the generation step.
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/docs/concepts/document_loaders/) from a variety of sources, [preparing it](https://python.langchain.com/docs/concepts/text_splitters/), then [searching over (a.k.a. retrieving from)](https://python.langchain.com/docs/concepts/retrievers/) it for use in the generation step.
**🤖 Agents**
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. LangChain provides a [standard interface for agents](https://python.langchain.com/docs/concepts/#agents), along with [LangGraph](https://github.com/langchain-ai/langgraph) for building custom agents.
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. [LangGraph](https://langchain-ai.github.io/langgraph/) makes it easy to use
LangChain components to build both [custom](https://langchain-ai.github.io/langgraph/tutorials/)
and [built-in](https://langchain-ai.github.io/langgraph/how-tos/create-react-agent/)
LLM agents.
## 📖 Documentation

View File

@@ -60,6 +60,7 @@ copy-infra:
	cp package.json $(OUTPUT_NEW_DIR)
	cp sidebars.js $(OUTPUT_NEW_DIR)
	cp -r static $(OUTPUT_NEW_DIR)
	cp -r ../libs/cli/langchain_cli/integration_template $(OUTPUT_NEW_DIR)/src/theme
	cp yarn.lock $(OUTPUT_NEW_DIR)
render:
@@ -81,6 +82,7 @@ build: install-py-deps generate-files copy-infra render md-sync append-related
vercel-build: install-vercel-deps build generate-references
	rm -rf docs
	mv $(OUTPUT_NEW_DOCS_DIR) docs
	cp -r ../libs/cli/langchain_cli/integration_template src/theme
	rm -rf build
	mkdir static/api_reference
	git clone --depth=1 https://github.com/langchain-ai/langchain-api-docs-html.git

View File

@@ -87,6 +87,18 @@ class Beta(BaseAdmonition):
def setup(app):
    app.add_directive("example_links", ExampleLinksDirective)
    app.add_directive("beta", Beta)
    app.connect("autodoc-skip-member", skip_private_members)


def skip_private_members(app, what, name, obj, skip, options):
    if skip:
        return True
    if hasattr(obj, "__doc__") and obj.__doc__ and ":private:" in obj.__doc__:
        return True
    if name == "__init__" and obj.__objclass__ is object:
        # dont document default init
        return True
    return None


# -- Project information -----------------------------------------------------
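As a rough illustration of how the `autodoc-skip-member` handler above decides what to hide, the sketch below exercises the same rules on a few sample objects without running Sphinx. It is a defensive variation rather than the committed code: it reads `__objclass__` through `getattr`, on the assumption that the handler may also receive plain Python functions, which do not carry that attribute. The sample classes and the `None` passed for `app` are hypothetical.

```python
def skip_private_members(app, what, name, obj, skip, options):
    """Return True to skip a member, or None to defer to Sphinx's default."""
    if skip:
        return True
    doc = getattr(obj, "__doc__", None)
    if doc and ":private:" in doc:
        return True
    # Assumption: user-defined `__init__` functions have no `__objclass__`,
    # so the attribute is looked up with a default instead of directly.
    if name == "__init__" and getattr(obj, "__objclass__", None) is object:
        return True
    return None


class InheritsInit:
    """Uses the default `object.__init__`."""


class CustomInit:
    def __init__(self):
        """Set up the instance."""


def internal_helper():
    """:private: Not meant for the API reference."""


# `app` and `options` are unused by the handler, so placeholders suffice here.
default_init = skip_private_members(None, "method", "__init__", InheritsInit.__init__, False, {})
custom_init = skip_private_members(None, "method", "__init__", CustomInit.__init__, False, {})
private_fn = skip_private_members(None, "function", "internal_helper", internal_helper, False, {})

print(default_init, custom_init, private_fn)
```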

View File

@@ -72,14 +72,21 @@ def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
    Returns:
        list: A list of loaded module objects.
    """
    classes_: List[ClassInfo] = []
    functions: List[FunctionInfo] = []
    module = importlib.import_module(module_path)
    if ":private:" in (module.__doc__ or ""):
        return ModuleMembers(classes_=[], functions=[])
    for name, type_ in inspect.getmembers(module):
        if not hasattr(type_, "__module__"):
            continue
        if type_.__module__ != module_path:
            continue
        if ":private:" in (type_.__doc__ or ""):
            continue
        if inspect.isclass(type_):
            # The type of the class is used to select a template
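The `:private:` filtering in this hunk can be tried out on a throwaway module. The sketch below is a simplified stand-in, not the documentation build script: `public_member_names`, the `demo` module, and its two functions are hypothetical, and only names are collected instead of `ClassInfo` or `FunctionInfo` records.

```python
import inspect
import types


def public_member_names(module: types.ModuleType) -> list[str]:
    """List members defined in `module` that are not marked `:private:`."""
    # A private module contributes no members at all.
    if ":private:" in (module.__doc__ or ""):
        return []
    names = []
    for name, member in inspect.getmembers(module):
        # Skip anything not defined in this module (imports, dunder attributes).
        if getattr(member, "__module__", None) != module.__name__:
            continue
        if ":private:" in (member.__doc__ or ""):
            continue
        names.append(name)
    return names


def visible():
    """Public helper."""


def hidden():
    """:private:"""


# Build a module object in memory and attach the two functions to it.
demo = types.ModuleType("demo", "A demo module.")
for fn in (visible, hidden):
    fn.__module__ = "demo"
    setattr(demo, fn.__name__, fn)

private_module = types.ModuleType("secret", ":private: internal module")

print(public_member_names(demo))
print(public_member_names(private_module))
```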

View File

@@ -65,7 +65,7 @@ A package to deploy LangChain chains as REST APIs. Makes it easy to get a produc
:::important
LangServe is designed to primarily deploy simple Runnables and work with well-known primitives in langchain-core.
If you need a deployment option for LangGraph, you should instead be looking at LangGraph Cloud (beta) which will be better suited for deploying LangGraph applications.
If you need a deployment option for LangGraph, you should instead be looking at LangGraph Platform (beta) which will be better suited for deploying LangGraph applications.
:::
For more information, see the [LangServe documentation](/docs/langserve).

View File

@@ -1,4 +1,5 @@
---
pagination_prev: null
pagination_next: contributing/how_to/integrations/package
---
@@ -37,7 +38,6 @@ While any component can be integrated into LangChain, there are specific types o
<li>Chat Models</li>
<li>Tools/Toolkits</li>
<li>Retrievers</li>
<li>Document Loaders</li>
<li>Vector Stores</li>
<li>Embedding Models</li>
</ul>
@@ -45,6 +45,7 @@ While any component can be integrated into LangChain, there are specific types o
<td>
<ul>
<li>LLMs (Text-Completion Models)</li>
<li>Document Loaders</li>
<li>Key-Value Stores</li>
<li>Document Transformers</li>
<li>Model Caches</li>

View File

@@ -12,98 +12,90 @@ which contain classes that are compatible with LangChain's core interfaces.
We will cover:
1. How to implement components, such as [chat models](/docs/concepts/chat_models/) and [vector stores](/docs/concepts/vectorstores/), that adhere
1. (Optional) How to bootstrap a new integration package
2. How to implement components, such as [chat models](/docs/concepts/chat_models/) and [vector stores](/docs/concepts/vectorstores/), that adhere
to the LangChain interface;
2. (Optional) How to bootstrap a new integration package.
## Implementing LangChain components
LangChain components are subclasses of base classes in [langchain-core](/docs/concepts/architecture/#langchain-core).
Examples include [chat models](/docs/concepts/chat_models/),
[vector stores](/docs/concepts/vectorstores/), [tools](/docs/concepts/tools/),
[embedding models](/docs/concepts/embedding_models/) and [retrievers](/docs/concepts/retrievers/).
Your integration package will typically implement a subclass of at least one of these
components. Expand the tabs below to see details on each.
<details>
<summary>Chat models</summary>
Refer to the [Custom Chat Model Guide](/docs/how_to/custom_chat_model) for
details on a starter chat model [implementation](/docs/how_to/custom_chat_model/#implementation).
:::tip
The model from the [Custom Chat Model Guide](/docs/how_to/custom_chat_model) is tested
against the standard unit and integration tests in the LangChain Github repository.
You can also access that implementation directly from Github
[here](https://github.com/langchain-ai/langchain/blob/master/libs/standard-tests/tests/unit_tests/custom_chat_model.py).
:::
</details>
<details>
<summary>Vector stores</summary>
Your vector store implementation will depend on your chosen database technology.
`langchain-core` includes a minimal
[in-memory vector store](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.in_memory.InMemoryVectorStore.html)
that we can use as a guide. You can access the code [here](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/vectorstores/in_memory.py).
All vector stores must inherit from the [VectorStore](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html)
base class. This interface consists of methods for writing, deleting and searching
for documents in the vector store.
`VectorStore` supports a variety of synchronous and asynchronous search types (e.g.,
nearest-neighbor or maximum marginal relevance), as well as interfaces for adding
documents to the store. See the [API Reference](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html)
for all supported methods. The required methods are tabulated below:
| Method/Property | Description |
|------------------------ |------------------------------------------------------|
| `add_documents` | Add documents to the vector store. |
| `delete` | Delete selected documents from vector store (by IDs) |
| `get_by_ids` | Get selected documents from vector store (by IDs) |
| `similarity_search` | Get documents most similar to a query. |
| `embeddings` (property) | Embeddings object for vector store. |
| `from_texts` | Instantiate vector store via adding texts. |
Note that `InMemoryVectorStore` implements some optional search types, as well as
convenience methods for loading and dumping the object to a file, but this is not
necessary for all implementations.
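To show how the required methods in the table fit together, here is a toy, dependency-free sketch. It does not subclass `VectorStore` and is not the `InMemoryVectorStore` code. The names `ToyDocument`, `ToyVectorStore`, and `letter_counts` are hypothetical, the method signatures are simplified assumptions, and the `embeddings` property returns a plain callable rather than an embeddings object.

```python
import math
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence

EmbedFn = Callable[[str], list[float]]


@dataclass
class ToyDocument:
    page_content: str
    metadata: dict = field(default_factory=dict)
    id: Optional[str] = None


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self, embedding: EmbedFn) -> None:
        self._embedding = embedding
        self._store: dict[str, tuple[ToyDocument, list[float]]] = {}

    @property
    def embeddings(self) -> EmbedFn:
        return self._embedding

    def add_documents(self, documents: Sequence[ToyDocument], ids: Optional[Sequence[str]] = None) -> list[str]:
        # Use caller-supplied IDs, else the document's own ID, else a fresh UUID.
        new_ids = list(ids) if ids is not None else [d.id or str(uuid.uuid4()) for d in documents]
        for doc_id, doc in zip(new_ids, documents):
            doc.id = doc_id
            self._store[doc_id] = (doc, self._embedding(doc.page_content))
        return new_ids

    def delete(self, ids: Sequence[str]) -> None:
        for doc_id in ids:
            self._store.pop(doc_id, None)

    def get_by_ids(self, ids: Sequence[str]) -> list[ToyDocument]:
        return [self._store[i][0] for i in ids if i in self._store]

    def similarity_search(self, query: str, k: int = 4) -> list[ToyDocument]:
        q = self._embedding(query)
        ranked = sorted(self._store.values(), key=lambda item: _cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]

    @classmethod
    def from_texts(cls, texts: Sequence[str], embedding: EmbedFn, metadatas: Optional[Sequence[dict]] = None) -> "ToyVectorStore":
        store = cls(embedding)
        metas = metadatas or [{} for _ in texts]
        store.add_documents([ToyDocument(t, m) for t, m in zip(texts, metas)])
        return store


def letter_counts(text: str) -> list[float]:
    """Hypothetical embedding: counts of the letters a, b, and c."""
    return [float(text.count(ch)) for ch in "abc"]


store = ToyVectorStore.from_texts(["aaa", "bbb", "abc"], letter_counts)
top = store.similarity_search("aa", k=1)
print(top[0].page_content)
```

A real integration would replace the dictionary with calls to the underlying database and would inherit from the `VectorStore` base class so that the standard tests apply.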
:::tip
The [in-memory vector store](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/vectorstores/in_memory.py)
is tested against the standard tests in the LangChain Github repository.
:::
</details>
<!-- <details>
<summary>Embeddings</summary>
</details>
<details>
<summary>Tools</summary>
</details>
<details>
<summary>Retrievers</summary>
</details>
<details>
<summary>Document Loaders</summary>
</details> -->
## (Optional) bootstrapping a new integration package
In this section, we will outline 2 options for bootstrapping a new integration package,
and you're welcome to use other tools if you prefer!
1. **langchain-cli**: This is a command-line tool that can be used to bootstrap a new integration package with a template for LangChain components and Poetry for dependency management.
2. **Poetry**: This is a Python dependency management tool that can be used to bootstrap a new Python package with dependencies. You can then add LangChain components to this package.
<details>
<summary>Option 1: langchain-cli (recommended)</summary>
In this guide, we will be using the `langchain-cli` to create a new integration package
from a template, which can be edited to implement your LangChain components.
### **Prerequisites**
- [GitHub](https://github.com) account
- [PyPi](https://pypi.org/) account
### Bootstrapping a new Python package with langchain-cli
First, install `langchain-cli` and `poetry`:
```bash
pip install langchain-cli poetry
```
Next, come up with a name for your package. For this guide, we'll use `langchain-parrot-link`.
You can confirm that the name is available on PyPi by searching for it on the [PyPi website](https://pypi.org/).
Next, create your new Python package with `langchain-cli`, and navigate into the new directory with `cd`:
```bash
langchain-cli integration new
> The name of the integration to create (e.g. `my-integration`): parrot-link
> Name of integration in PascalCase [ParrotLink]:
cd parrot-link
```
Next, let's add any dependencies we need
```bash
poetry add my-integration-sdk
```
We can also add some `typing` or `test` dependencies in a separate poetry dependency group.
```bash
poetry add --group typing my-typing-dep
poetry add --group test my-test-dep
```
And finally, have poetry set up a virtual environment with your dependencies, as well
as your integration package:
```bash
poetry install --with lint,typing,test,test_integration
```
You now have a new Python package with a template for LangChain components! This
template comes with files for each integration type, and you're welcome to duplicate or
delete any of these files as needed (including the associated test files).
To create any individual files from the [template], you can run e.g.:
```bash
langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/chat_models.py \
--dst langchain_parrot_link/chat_models_2.py
```
</details>
<details>
<summary>Option 2: Poetry (manual)</summary>
In this guide, we will be using [Poetry](https://python-poetry.org/) for
dependency management and packaging, and you're welcome to use any other tools you prefer.
@@ -183,6 +175,8 @@ later, following the [standard tests](../standard_tests) guide.
For `chat_models.py`, simply paste the contents of the chat model implementation
[above](#implementing-langchain-components).
</details>
### Push your package to a public GitHub repository
This is only required if you want to publish your integration in the LangChain documentation.
@@ -191,6 +185,319 @@ This is only required if you want to publish your integration in the LangChain d
2. Push your code to the repository.
3. Confirm that your repository is viewable by the public (e.g. in a private browsing window, where you're not logged into GitHub).
## Implementing LangChain components
LangChain components are subclasses of base classes in [langchain-core](/docs/concepts/architecture/#langchain-core).
Examples include [chat models](/docs/concepts/chat_models/),
[vector stores](/docs/concepts/vectorstores/), [tools](/docs/concepts/tools/),
[embedding models](/docs/concepts/embedding_models/) and [retrievers](/docs/concepts/retrievers/).
Your integration package will typically implement a subclass of at least one of these
components. Expand the tabs below to see details on each.
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
<Tabs>
<TabItem value="chat_models" label="Chat models">
Refer to the [Custom Chat Model Guide](/docs/how_to/custom_chat_model) for
detail on a starter chat model [implementation](/docs/how_to/custom_chat_model/#implementation).
You can start from the following template or langchain-cli command:
```bash
langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/chat_models.py \
--dst langchain_parrot_link/chat_models.py
```
<details>
<summary>Example chat model code</summary>
import ChatModelSource from '../../../../src/theme/integration_template/integration_template/chat_models.py';
<CodeBlock language="python" title="langchain_parrot_link/chat_models.py">
{
ChatModelSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</details>
</TabItem>
<TabItem value="vector_stores" label="Vector stores">
Your vector store implementation will depend on your chosen database technology.
`langchain-core` includes a minimal
[in-memory vector store](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.in_memory.InMemoryVectorStore.html)
that we can use as a guide. You can access the code [here](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/vectorstores/in_memory.py).
All vector stores must inherit from the [VectorStore](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html)
base class. This interface consists of methods for writing, deleting and searching
for documents in the vector store.
`VectorStore` supports a variety of synchronous and asynchronous search types (e.g.,
nearest-neighbor or maximum marginal relevance), as well as interfaces for adding
documents to the store. See the [API Reference](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.base.VectorStore.html)
for all supported methods. The required methods are tabulated below:
| Method/Property | Description |
|------------------------ |------------------------------------------------------|
| `add_documents` | Add documents to the vector store. |
| `delete` | Delete selected documents from vector store (by IDs) |
| `get_by_ids` | Get selected documents from vector store (by IDs) |
| `similarity_search` | Get documents most similar to a query. |
| `embeddings` (property) | Embeddings object for vector store. |
| `from_texts` | Instantiate vector store via adding texts. |
Note that `InMemoryVectorStore` implements some optional search types, as well as
convenience methods for loading and dumping the object to a file, but this is not
necessary for all implementations.
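As a rough illustration of the required methods tabulated above, here is a toy store that uses only the standard library. It is a sketch under stated assumptions, not the library's implementation: `ToyVectorStore` is a hypothetical name, it does not subclass `VectorStore`, it takes plain strings rather than `Document` objects, and its vowel-count "embedding" is a made-up stand-in for a real `Embeddings` object. A real integration would inherit from `VectorStore` and follow the signatures in the API reference.

```python
import math
from typing import Dict, List, Optional, Tuple


class ToyVectorStore:
    """Hypothetical stand-in for a VectorStore subclass (no langchain-core import)."""

    def __init__(self) -> None:
        # Maps document ID -> (text, vector).
        self._store: Dict[str, Tuple[str, List[float]]] = {}

    @staticmethod
    def _embed(text: str) -> List[float]:
        # Made-up embedding: counts of the vowels a, e, i, o, u.
        return [float(text.lower().count(vowel)) for vowel in "aeiou"]

    @staticmethod
    def _cosine(a: List[float], b: List[float]) -> float:
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

    def add_documents(self, texts: List[str], ids: List[str]) -> List[str]:
        # Simplification: the real method accepts Document objects.
        for doc_id, text in zip(ids, texts):
            self._store[doc_id] = (text, self._embed(text))
        return ids

    def delete(self, ids: Optional[List[str]] = None) -> None:
        for doc_id in ids or []:
            self._store.pop(doc_id, None)

    def get_by_ids(self, ids: List[str]) -> List[str]:
        # Unknown IDs are silently skipped.
        return [self._store[doc_id][0] for doc_id in ids if doc_id in self._store]

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        query_vector = self._embed(query)
        ranked = sorted(
            self._store.values(),
            key=lambda item: self._cosine(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add_documents(["aaa", "eee", "ooo"], ids=["1", "2", "3"])
top_match = store.similarity_search("banana", k=1)
store.delete(["1"])
remaining = store.get_by_ids(["1", "2"])
```

The point of the sketch is the shape of the interface: writes and deletes are keyed by ID, and search ranks stored vectors against an embedded query.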
:::tip
The [in-memory vector store](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/vectorstores/in_memory.py)
is tested against the standard tests in the LangChain GitHub repository.
:::
<details>
<summary>Example vector store code</summary>
import VectorstoreSource from '../../../../src/theme/integration_template/integration_template/vectorstores.py';
<CodeBlock language="python" title="langchain_parrot_link/vectorstores.py">
{
VectorstoreSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</details>
</TabItem>
<TabItem value="embeddings" label="Embeddings">
Embeddings are used to convert `str` objects from `Document.page_content` fields
into a vector representation (represented as a list of floats).
Your embeddings class must inherit from the [Embeddings](https://python.langchain.com/api_reference/core/embeddings/langchain_core.embeddings.embeddings.Embeddings.html#langchain_core.embeddings.embeddings.Embeddings)
base class. This interface has 5 methods that can be implemented.
| Method/Property | Description |
|------------------------ |------------------------------------------------------|
| `__init__` | Initialize the embeddings object. (optional) |
| `embed_query` | Embed a single query text. (required) |
| `embed_documents` | Embed a list of document texts. (required) |
| `aembed_query` | Asynchronously embed a single query text. (optional) |
| `aembed_documents` | Asynchronously embed a list of document texts. (optional) |
### Constructor
The `__init__` constructor is optional but common; it can be used to set up any attributes
that a user can pass in when initializing the embeddings object. Common attributes include:
- `model` - the id of the model to use for embeddings
### Embedding queries vs documents
The `embed_query` and `embed_documents` methods are required. Both operate
on string inputs. For legacy reasons, accessing `Document.page_content` attributes is handled
by the vector store that uses the embedding model.
`embed_query` takes in a single string and returns a single embedding as a list of floats.
If your model has different modes for embedding queries vs the underlying documents, you can
implement this method to handle that.
`embed_documents` takes in a list of strings and returns a list of embeddings as a list of lists of floats.
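To make the two signatures concrete, here is a minimal sketch that uses only the standard library. The class name `ToyParrotEmbeddings`, the model id `toy-embed-001` and the hash-based vectors are all assumptions for illustration; a real integration would subclass `Embeddings` from `langchain-core` and call an actual model.

```python
import hashlib
from typing import List


class ToyParrotEmbeddings:
    """Hypothetical stand-in for an Embeddings subclass; not a real model."""

    def __init__(self, model: str = "toy-embed-001", size: int = 4) -> None:
        self.model = model
        self.size = size  # vector length, at most 32 for a SHA-256 digest

    def _embed(self, text: str) -> List[float]:
        # Hash the text and scale the first `size` bytes into [0, 1].
        digest = hashlib.sha256(f"{self.model}:{text}".encode("utf-8")).digest()
        return [byte / 255 for byte in digest[: self.size]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input string, in the same order.
        return [self._embed(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        # A single string in, a single vector out.
        return self._embed(text)


embeddings = ToyParrotEmbeddings()
query_vector = embeddings.embed_query("hello")
document_vectors = embeddings.embed_documents(["hello", "goodbye"])
```

In this toy both methods share one code path. A model with separate query and document modes would branch inside `embed_query` instead.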
### Implementation
You can start from the following template or langchain-cli command:
```bash
langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/embeddings.py \
--dst langchain_parrot_link/embeddings.py
```
<details>
<summary>Example embeddings code</summary>
import EmbeddingsSource from '/src/theme/integration_template/integration_template/embeddings.py';
<CodeBlock language="python" title="langchain_parrot_link/embeddings.py">
{
EmbeddingsSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</details>
</TabItem>
<TabItem value="tools" label="Tools">
Tools are used in 2 main ways:
1. To define an "input schema" or "args schema" to pass to a chat model's tool calling
feature along with a text request, such that the chat model can generate a "tool call",
or parameters to call the tool with.
2. To take a "tool call" as generated above, and take some action and return a response
that can be passed back to the chat model as a ToolMessage.
Your tool class must inherit from the [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#langchain_core.tools.base.BaseTool) base class. This interface has 3 properties and 2 methods that should be implemented in a
subclass.
| Method/Property | Description |
|------------------------ |------------------------------------------------------|
| `name` | Name of the tool (passed to the LLM too). |
| `description` | Description of the tool (passed to the LLM too). |
| `args_schema` | Define the schema for the tool's input arguments. |
| `_run` | Run the tool with the given arguments. |
| `_arun` | Asynchronously run the tool with the given arguments.|
### Properties
`name`, `description`, and `args_schema` are all properties that should be implemented
in the subclass. `name` and `description` are strings that are used to identify the tool
and provide a description of what the tool does. Both of these are passed to the LLM,
and users may override these values depending on the LLM they are using as a form of
"prompt engineering." Giving these a concise and LLM-usable name and description is
important for the initial user experience of the tool.
`args_schema` is a Pydantic `BaseModel` that defines the schema for the tool's input
arguments. This is used to validate the input arguments to the tool, and to provide
a schema for the LLM to fill out when calling the tool. Similar to the `name` and
`description` of the overall Tool class, the fields' names (the variable name) and
description (part of `Field(..., description="description")`) are passed to the LLM,
and the values in these fields should be concise and LLM-usable.
### Run Methods
`_run` is the main method that should be implemented in the subclass. This method
takes in the arguments from `args_schema` and runs the tool, returning a string
response. This method is usually called in a LangGraph [`ToolNode`](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/), and can also be called in a legacy
`langchain.agents.AgentExecutor`.
`_arun` is optional because by default, `_run` will be run in an async executor.
However, if your tool calls any APIs or does any async work, you should implement
this method to run the tool asynchronously in addition to `_run`.
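The relationship between `_run` and a fallback `_arun` can be pictured with the sketch below. It uses only the standard library and is an illustration, not `langchain-core`'s actual code: `ToyMultiplyTool` is a hypothetical class that does not subclass `BaseTool`, and it omits `args_schema` because that would require Pydantic.

```python
import asyncio


class ToyMultiplyTool:
    """Hypothetical stand-in for a BaseTool subclass (no langchain-core import)."""

    name: str = "multiply"
    description: str = "Multiply two integers and return the product."

    def _run(self, a: int, b: int) -> str:
        # Synchronous implementation: do the work and return a string.
        return str(a * b)

    async def _arun(self, a: int, b: int) -> str:
        # Fallback-style async path: run the blocking `_run` in the default
        # thread pool executor so the event loop is not blocked.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._run, a, b)


tool = ToyMultiplyTool()
sync_result = tool._run(2, 3)
async_result = asyncio.run(tool._arun(2, 3))
```

A tool that talks to a network service would usually replace the executor call with a native async client, which is the case where writing your own `_arun` pays off.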
### Implementation
You can start from the following template or langchain-cli command:
```bash
langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/tools.py \
--dst langchain_parrot_link/tools.py
```
<details>
<summary>Example tool code</summary>
import ToolSource from '/src/theme/integration_template/integration_template/tools.py';
<CodeBlock language="python" title="langchain_parrot_link/tools.py">
{
ToolSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</details>
</TabItem>
<TabItem value="retrievers" label="Retrievers">
Retrievers are used to retrieve documents from APIs, databases, or other sources
based on a query. Your retriever class must inherit from the [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) base class. This interface has 1 attribute and 2 methods that should be implemented in a subclass.
| Method/Property | Description |
|------------------------ |------------------------------------------------------|
| `k` | Default number of documents to retrieve (configurable). |
| `_get_relevant_documents`| Retrieve documents based on a query. |
| `_aget_relevant_documents`| Asynchronously retrieve documents based on a query. |
### Attributes
`k` is an attribute that should be implemented in the subclass. This attribute
can simply be defined at the top of the class with a default value like
`k: int = 5`. This attribute is the default number of documents to retrieve
from the retriever, and can be overridden by the user when constructing or calling
the retriever.
### Methods
`_get_relevant_documents` is the main method that should be implemented in the subclass.
This method takes in a query and returns a list of `Document` objects, which have 2
main properties:
- `page_content` - the text content of the document
- `metadata` - a dictionary of metadata about the document
Retrievers are typically directly invoked by a user, e.g. as
`MyRetriever(k=4).invoke("query")`, which will automatically call `_get_relevant_documents`
under the hood.
`_aget_relevant_documents` is optional because by default, `_get_relevant_documents` will
be run in an async executor. However, if your retriever calls any APIs or does
any async work, you should implement this method to run the retriever asynchronously
in addition to `_get_relevant_documents` for performance reasons.
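The sketch below shows how `k`, `_get_relevant_documents` and `invoke` fit together. It is deliberately dependency-free: `ToyDocument` and `ToyParrotRetriever` are hypothetical stand-ins for `Document` and a `BaseRetriever` subclass, and the simplified `invoke` skips the callback and config handling that the real base class performs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyDocument:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class ToyParrotRetriever:
    """Hypothetical stand-in for a BaseRetriever subclass; parrots the query k times."""

    def __init__(self, k: int = 5) -> None:
        self.k = k  # default number of documents to return

    def _get_relevant_documents(self, query: str) -> List[ToyDocument]:
        return [
            ToyDocument(page_content=query, metadata={"rank": rank})
            for rank in range(self.k)
        ]

    def invoke(self, query: str) -> List[ToyDocument]:
        # Simplified: the real `invoke` also handles callbacks and config.
        return self._get_relevant_documents(query)


docs = ToyParrotRetriever(k=2).invoke("hello")
```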
### Implementation
You can start from the following template or langchain-cli command:
```bash
langchain-cli integration new \
--name parrot-link \
--name-class ParrotLink \
--src integration_template/retrievers.py \
--dst langchain_parrot_link/retrievers.py
```
<details>
<summary>Example retriever code</summary>
import RetrieverSource from '/src/theme/integration_template/integration_template/retrievers.py';
<CodeBlock language="python" title="langchain_parrot_link/retrievers.py">
{
RetrieverSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</details>
</TabItem>
</Tabs>
---
## Next Steps
Now that you've implemented your package, you can move on to [testing your integration](../standard_tests): add the standard tests and run them successfully.


@@ -1,600 +0,0 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {
"vscode": {
"languageId": "raw"
}
},
"source": [
"---\n",
"pagination_next: contributing/how_to/integrations/publish\n",
"pagination_prev: contributing/how_to/integrations/package\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to add standard tests to an integration\n",
"\n",
"When creating either a custom class for yourself or to publish in a LangChain integration, it is important to add standard tests to ensure it works as expected. This guide shows how to add standard tests to a custom chat model. You can also **[skip to the test templates](#standard-test-templates-per-component)** to implement tests for each integration type.\n",
"\n",
"## Setup\n",
"\n",
"If you're coming from the [previous guide](../package), you have already installed these dependencies, and you can skip this section.\n",
"\n",
"First, let's install 2 dependencies:\n",
"\n",
"- `langchain-core` will define the interfaces we want to import to define our custom component.\n",
"- `langchain-tests` will provide the standard tests we want to use. Recommended to pin to the latest version: <img src=\"https://img.shields.io/pypi/v/langchain-tests\" style={{position:\"relative\",top:4,left:3}} />\n",
"\n",
":::note\n",
"\n",
"Because added tests in new versions of `langchain-tests` can break your CI/CD pipelines, we recommend pinning the \n",
"version of `langchain-tests` to avoid unexpected changes.\n",
"\n",
":::\n",
"\n",
"import Tabs from '@theme/Tabs';\n",
"import TabItem from '@theme/TabItem';\n",
"\n",
"<Tabs>\n",
" <TabItem value=\"poetry\" label=\"Poetry\" default>\n",
"If you followed the [previous guide](../package), you should already have these dependencies installed!\n",
"\n",
"```bash\n",
"poetry add langchain-core\n",
"poetry add --group test pytest pytest-socket pytest-asyncio langchain-tests==<latest_version>\n",
"poetry install --with test\n",
"```\n",
" </TabItem>\n",
" <TabItem value=\"pip\" label=\"Pip\">\n",
"```bash\n",
"pip install -U langchain-core pytest pytest-socket pytest-asyncio langchain-tests\n",
"\n",
"# install current package in editable mode\n",
"pip install --editable .\n",
"```\n",
" </TabItem>\n",
"</Tabs>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we're publishing a package, `langchain_parrot_link`, that exposes the chat model from the [guide on implementing the package](../package). We can add the standard tests to the package by following the steps below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And we'll assume you've structured your package the same way as the main LangChain\n",
"packages:\n",
"\n",
"```plaintext\n",
"langchain-parrot-link/\n",
"├── langchain_parrot_link/\n",
"│   ├── __init__.py\n",
"│   └── chat_models.py\n",
"├── tests/\n",
"│   ├── __init__.py\n",
"│   ├── unit_tests/\n",
"│   │   ├── __init__.py\n",
"│   │   └── test_chat_models.py\n",
"│   └── integration_tests/\n",
"│       ├── __init__.py\n",
"│       └── test_chat_models.py\n",
"├── pyproject.toml\n",
"└── README.md\n",
"```\n",
"\n",
"## Add and configure standard tests\n",
"\n",
"There are 2 namespaces in the `langchain-tests` package:\n",
"\n",
"- [unit tests](../../../concepts/testing.mdx#unit-tests) (`langchain_tests.unit_tests`): designed to be used to test the component in isolation and without access to external services\n",
"- [integration tests](../../../concepts/testing.mdx#integration-tests) (`langchain_tests.integration_tests`): designed to be used to test the component with access to external services (in particular, the external service that the component is designed to interact with).\n",
"\n",
"Both types of tests are implemented as [`pytest` class-based test suites](https://docs.pytest.org/en/7.1.x/getting-started.html#group-multiple-tests-in-a-class).\n",
"\n",
"By subclassing the base classes for each type of standard test (see below), you get all of the standard tests for that type, and you\n",
"can override the properties that the test suite uses to configure the tests.\n",
"\n",
"### Standard chat model tests\n",
"\n",
"Here's how you would configure the standard unit tests for the custom chat model:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_chat_models.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.unit_tests import ChatModelUnitTests\n",
"\n",
"\n",
"class TestChatParrotLinkUnit(ChatModelUnitTests):\n",
"    @property\n",
"    def chat_model_class(self) -> Type[ChatParrotLink]:\n",
"        return ChatParrotLink\n",
"\n",
"    @property\n",
"    def chat_model_params(self) -> dict:\n",
"        return {\n",
"            \"model\": \"bird-brain-001\",\n",
"            \"temperature\": 0,\n",
"            \"parrot_buffer_length\": 50,\n",
"        }"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_chat_models.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.integration_tests import ChatModelIntegrationTests\n",
"\n",
"\n",
"class TestChatParrotLinkIntegration(ChatModelIntegrationTests):\n",
"    @property\n",
"    def chat_model_class(self) -> Type[ChatParrotLink]:\n",
"        return ChatParrotLink\n",
"\n",
"    @property\n",
"    def chat_model_params(self) -> dict:\n",
"        return {\n",
"            \"model\": \"bird-brain-001\",\n",
"            \"temperature\": 0,\n",
"            \"parrot_buffer_length\": 50,\n",
"        }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can run these with the following commands from your project root:\n",
"\n",
"<Tabs>\n",
" <TabItem value=\"poetry\" label=\"Poetry\" default>\n",
"\n",
"```bash\n",
"# run unit tests without network access\n",
"poetry run pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests\n",
"\n",
"# run integration tests\n",
"poetry run pytest --asyncio-mode=auto tests/integration_tests\n",
"```\n",
"\n",
" </TabItem>\n",
" <TabItem value=\"pip\" label=\"Pip\">\n",
"\n",
"```bash\n",
"# run unit tests without network access\n",
"pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests\n",
"\n",
"# run integration tests\n",
"pytest --asyncio-mode=auto tests/integration_tests\n",
"```\n",
"\n",
" </TabItem>\n",
"</Tabs>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test suite information and troubleshooting\n",
"\n",
"For a full list of the standard test suites that are available, as well as\n",
"information on which tests are included and how to troubleshoot common issues,\n",
"see the [Standard Tests API Reference](https://python.langchain.com/api_reference/standard_tests/index.html).\n",
"\n",
"An increasing number of troubleshooting guides are being added to this documentation,\n",
"and if you're interested in contributing, feel free to add docstrings to tests in \n",
"[Github](https://github.com/langchain-ai/langchain/tree/master/libs/standard-tests/langchain_tests)!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard test templates per component:\n",
"\n",
"Above, we implement the **unit** and **integration** standard tests for a chat model. Below are the templates for implementing the standard tests for each component:\n",
"\n",
"<details>\n",
" <summary>Chat Models</summary>\n",
" <p>Note: The standard tests for chat models are implemented in the example in the main body of this guide too.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Chat model standard tests test a range of behaviors, from the most basic requirements (generating a response to a query) to optional capabilities like multi-modal support and tool-calling. For a test run to be successful:\n",
"\n",
"1. If a feature is intended to be supported by the model, it should pass;\n",
"2. If a feature is not intended to be supported by the model, it should be skipped.\n",
"\n",
"Tests for \"optional\" capabilities are controlled via a set of properties that can be overridden on the test model subclass.\n",
"\n",
"You can see the entire list of properties in the API reference [here](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.chat_models.ChatModelTests.html). These properties are shared by both unit and integration tests.\n",
"\n",
"For example, to enable integration tests for image inputs, we can implement\n",
"\n",
"```python\n",
"@property\n",
"def supports_image_inputs(self) -> bool:\n",
"    return True\n",
"```\n",
"\n",
"on the integration test class.\n",
"\n",
":::note\n",
"\n",
"Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references. See details:\n",
"\n",
"- [Unit tests API reference](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.chat_models.ChatModelUnitTests.html)\n",
"- [Integration tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html)\n",
"\n",
":::\n",
"\n",
"Unit test example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_chat_models.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.unit_tests import ChatModelUnitTests\n",
"\n",
"\n",
"class TestChatParrotLinkUnit(ChatModelUnitTests):\n",
"    @property\n",
"    def chat_model_class(self) -> Type[ChatParrotLink]:\n",
"        return ChatParrotLink\n",
"\n",
"    @property\n",
"    def chat_model_params(self) -> dict:\n",
"        return {\n",
"            \"model\": \"bird-brain-001\",\n",
"            \"temperature\": 0,\n",
"            \"parrot_buffer_length\": 50,\n",
"        }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Integration test example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_chat_models.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.integration_tests import ChatModelIntegrationTests\n",
"\n",
"\n",
"class TestChatParrotLinkIntegration(ChatModelIntegrationTests):\n",
"    @property\n",
"    def chat_model_class(self) -> Type[ChatParrotLink]:\n",
"        return ChatParrotLink\n",
"\n",
"    @property\n",
"    def chat_model_params(self) -> dict:\n",
"        return {\n",
"            \"model\": \"bird-brain-001\",\n",
"            \"temperature\": 0,\n",
"            \"parrot_buffer_length\": 50,\n",
"        }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Embedding Models</summary>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_embeddings.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.embeddings import ParrotLinkEmbeddings\n",
"from langchain_tests.unit_tests import EmbeddingsUnitTests\n",
"\n",
"\n",
"class TestParrotLinkEmbeddingsUnit(EmbeddingsUnitTests):\n",
"    @property\n",
"    def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:\n",
"        return ParrotLinkEmbeddings\n",
"\n",
"    @property\n",
"    def embedding_model_params(self) -> dict:\n",
"        return {\"model\": \"nest-embed-001\", \"temperature\": 0}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_embeddings.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.embeddings import ParrotLinkEmbeddings\n",
"from langchain_tests.integration_tests import EmbeddingsIntegrationTests\n",
"\n",
"\n",
"class TestParrotLinkEmbeddingsIntegration(EmbeddingsIntegrationTests):\n",
"    @property\n",
"    def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:\n",
"        return ParrotLinkEmbeddings\n",
"\n",
"    @property\n",
"    def embedding_model_params(self) -> dict:\n",
"        return {\"model\": \"nest-embed-001\"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Tools/Toolkits</summary>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.unit_tests import ToolsUnitTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolUnit(ToolsUnitTests):\n",
"    @property\n",
"    def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
"        return ParrotMultiplyTool\n",
"\n",
"    @property\n",
"    def tool_constructor_params(self) -> dict:\n",
"        # if your tool constructor instead required initialization arguments like\n",
"        # `def __init__(self, some_arg: int):`, you would return those here\n",
"        # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
"        return {}\n",
"\n",
"    @property\n",
"    def tool_invoke_params_example(self) -> dict:\n",
"        \"\"\"\n",
"        Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
"        This should NOT be a ToolCall dict - i.e. it should not\n",
"        have {\"name\", \"id\", \"args\"} keys.\n",
"        \"\"\"\n",
"        return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.integration_tests import ToolsIntegrationTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolIntegration(ToolsIntegrationTests):\n",
"    @property\n",
"    def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
"        return ParrotMultiplyTool\n",
"\n",
"    @property\n",
"    def tool_constructor_params(self) -> dict:\n",
"        # if your tool constructor instead required initialization arguments like\n",
"        # `def __init__(self, some_arg: int):`, you would return those here\n",
"        # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
"        return {}\n",
"\n",
"    @property\n",
"    def tool_invoke_params_example(self) -> dict:\n",
"        \"\"\"\n",
"        Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
"        This should NOT be a ToolCall dict - i.e. it should not\n",
"        have {\"name\", \"id\", \"args\"} keys.\n",
"        \"\"\"\n",
"        return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Vector Stores</summary>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's how you would configure the standard tests for a typical vector store (using\n",
"`ParrotVectorStore` as a placeholder):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_vectorstores_sync.py\"\n",
"\n",
"from typing import AsyncGenerator, Generator\n",
"\n",
"import pytest\n",
"from langchain_core.vectorstores import VectorStore\n",
"from langchain_parrot_link.vectorstores import ParrotVectorStore\n",
"from langchain_tests.integration_tests.vectorstores import (\n",
"    AsyncReadWriteTestSuite,\n",
"    ReadWriteTestSuite,\n",
")\n",
"\n",
"\n",
"class TestSync(ReadWriteTestSuite):\n",
"    @pytest.fixture()\n",
"    def vectorstore(self) -> Generator[VectorStore, None, None]:  # type: ignore\n",
"        \"\"\"Get an empty vectorstore for unit tests.\"\"\"\n",
"        store = ParrotVectorStore()\n",
"        # note: store should be EMPTY at this point\n",
"        # if you need to delete data, you may do so here\n",
"        try:\n",
"            yield store\n",
"        finally:\n",
"            # cleanup operations, or deleting data\n",
"            pass\n",
"\n",
"\n",
"class TestAsync(AsyncReadWriteTestSuite):\n",
"    @pytest.fixture()\n",
"    async def vectorstore(self) -> AsyncGenerator[VectorStore, None]:  # type: ignore\n",
"        \"\"\"Get an empty vectorstore for unit tests.\"\"\"\n",
"        store = ParrotVectorStore()\n",
"        # note: store should be EMPTY at this point\n",
"        # if you need to delete data, you may do so here\n",
"        try:\n",
"            yield store\n",
"        finally:\n",
"            # cleanup operations, or deleting data\n",
"            pass"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are separate suites for testing synchronous and asynchronous methods.\n",
"Configuring the tests consists of implementing pytest fixtures for setting up an\n",
"empty vector store and tearing down the vector store after the test run ends.\n",
"\n",
"For example, below is the `ReadWriteTestSuite` for the [Chroma](https://python.langchain.com/docs/integrations/vectorstores/chroma/)\n",
"integration:\n",
"\n",
"```python\n",
"from typing import Generator\n",
"\n",
"import pytest\n",
"from langchain_core.vectorstores import VectorStore\n",
"from langchain_tests.integration_tests.vectorstores import ReadWriteTestSuite\n",
"\n",
"from langchain_chroma import Chroma\n",
"\n",
"\n",
"class TestSync(ReadWriteTestSuite):\n",
"    @pytest.fixture()\n",
"    def vectorstore(self) -> Generator[VectorStore, None, None]:  # type: ignore\n",
"        \"\"\"Get an empty vectorstore.\"\"\"\n",
"        store = Chroma(embedding_function=self.get_embeddings())\n",
"        try:\n",
"            yield store\n",
"        finally:\n",
"            store.delete_collection()\n",
"```\n",
"\n",
"Note that before the initial `yield`, we instantiate the vector store with an\n",
"[embeddings](/docs/concepts/embedding_models/) object. This is a pre-defined\n",
"[\"fake\" embeddings model](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.vectorstores.ReadWriteTestSuite.html#langchain_tests.integration_tests.vectorstores.ReadWriteTestSuite.get_embeddings)\n",
"that will generate short, arbitrary vectors for documents. You can use a different\n",
"embeddings object if desired.\n",
"\n",
"In the `finally` block, we call whatever integration-specific logic is needed to\n",
"bring the vector store to a clean state. This logic is executed after each test,\n",
"even if the test fails.\n",
"\n",
":::note\n",
"\n",
"Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references. See details:\n",
"\n",
"- [Sync tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.vectorstores.ReadWriteTestSuite.html)\n",
"- [Async tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.vectorstores.AsyncReadWriteTestSuite.html)\n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@@ -0,0 +1,393 @@
---
pagination_next: contributing/how_to/integrations/publish
pagination_prev: contributing/how_to/integrations/package
---
# How to add standard tests to an integration
Whether you are writing a custom class for your own use or publishing it in a LangChain integration, it is important to add standard tests to ensure it works as expected. This guide shows you how to add standard tests for each integration type.
## Setup
First, let's install 2 dependencies:
- `langchain-core` defines the interfaces we want to import to define our custom component.
- `langchain-tests` provides the standard tests we want to use, as well as the pytest plugins necessary to run them. We recommend pinning to the latest version: <img src="https://img.shields.io/pypi/v/langchain-tests" style={{position:"relative",top:4,left:3}} />
:::note
Because added tests in new versions of `langchain-tests` can break your CI/CD pipelines, we recommend pinning the
version of `langchain-tests` to avoid unexpected changes.
:::
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
<Tabs>
<TabItem value="poetry" label="Poetry" default>
If you followed the [previous guide](../package), you should already have these dependencies installed!
```bash
poetry add langchain-core
poetry add --group test langchain-tests==<latest_version>
poetry install --with test
```
</TabItem>
<TabItem value="pip" label="Pip">
```bash
pip install -U langchain-core langchain-tests
# install current package in editable mode
pip install --editable .
```
</TabItem>
</Tabs>
## Add and configure standard tests
There are 2 namespaces in the `langchain-tests` package:
- [unit tests](../../../concepts/testing.mdx#unit-tests) (`langchain_tests.unit_tests`): designed to be used to test the component in isolation and without access to external services
- [integration tests](../../../concepts/testing.mdx#integration-tests) (`langchain_tests.integration_tests`): designed to be used to test the component with access to external services (in particular, the external service that the component is designed to interact with).
Both types of tests are implemented as [`pytest` class-based test suites](https://docs.pytest.org/en/7.1.x/getting-started.html#group-multiple-tests-in-a-class).
By subclassing the base classes for each type of standard test (see below), you get all of the standard tests for that type, and you
can override the properties that the test suite uses to configure the tests.
In order to run the tests in the same way as this guide, we recommend subclassing these
classes in test files under two test subdirectories:
- `tests/unit_tests` for unit tests
- `tests/integration_tests` for integration tests
### Implementing standard tests
import CodeBlock from '@theme/CodeBlock';
In the following tabs, we show how to implement the standard tests for
each component type:
<Tabs>
<TabItem value="chat_models" label="Chat models">
To configure standard tests for a chat model, we subclass `ChatModelUnitTests` and `ChatModelIntegrationTests`. On each subclass, we override the following `@property` methods to specify the chat model to be tested and the chat model's configuration:
| Property | Description |
| --- | --- |
| `chat_model_class` | The class for the chat model to be tested |
| `chat_model_params` | The parameters to pass to the chat model's constructor |
Additionally, chat model standard tests test a range of behaviors, from the most basic requirements (generating a response to a query) to optional capabilities like multi-modal support and tool-calling. For a test run to be successful:
1. If a feature is intended to be supported by the model, it should pass;
2. If a feature is not intended to be supported by the model, it should be skipped.
Tests for "optional" capabilities are controlled via a set of properties that can be overridden on the test model subclass.
You can see the **entire list of configurable capabilities** in the API references for
[unit tests](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.chat_models.ChatModelUnitTests.html)
and [integration tests](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html).
For example, to enable integration tests for image inputs, we can implement
```python
@property
def supports_image_inputs(self) -> bool:
    return True
```
on the integration test class.
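To make the pass-or-skip rule concrete, here is a simplified sketch of how a property-gated test can work. It uses the standard library's `unittest` rather than `pytest`, and the names `SketchStandardTests`, `TextOnlyModelTests`, `MultimodalModelTests`, and `run_suite` are hypothetical. It illustrates the idea only and is not the actual `langchain-tests` implementation.

```python
import unittest


class SketchStandardTests(unittest.TestCase):
    """Hypothetical base suite: optional features are gated by properties."""

    @property
    def supports_image_inputs(self) -> bool:
        # Optional capabilities are off unless a subclass opts in.
        return False

    def test_image_inputs(self) -> None:
        if not self.supports_image_inputs:
            self.skipTest("Image inputs are not supported by this model.")
        # A real suite would call the model with an image here.
        self.assertTrue(self.supports_image_inputs)


class TextOnlyModelTests(SketchStandardTests):
    """Inherits the default, so the image test is skipped."""


class MultimodalModelTests(SketchStandardTests):
    """Overrides the property, so the image test runs."""

    @property
    def supports_image_inputs(self) -> bool:
        return True


def run_suite(case: type) -> unittest.TestResult:
    """Run one test class and return its result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Running `run_suite(TextOnlyModelTests)` reports the image test as skipped, while `run_suite(MultimodalModelTests)` actually executes it. Either outcome counts as a successful run, which mirrors the rule above.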
:::note
Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references. See details:
- [Unit tests API reference](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.chat_models.ChatModelUnitTests.html)
- [Integration tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html)
:::
Unit test example:
import ChatUnitSource from '../../../../src/theme/integration_template/tests/unit_tests/test_chat_models.py';
<CodeBlock language="python" title="tests/unit_tests/test_chat_models.py">
{
ChatUnitSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
Integration test example:
import ChatIntegrationSource from '../../../../src/theme/integration_template/tests/integration_tests/test_chat_models.py';
<CodeBlock language="python" title="tests/integration_tests/test_chat_models.py">
{
ChatIntegrationSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</TabItem>
<TabItem value="vector_stores" label="Vector stores">
Here's how you would configure the standard tests for a typical vector store (using
`ParrotVectorStore` as a placeholder):
Vector store tests do not have optional capabilities to be configured at this time.
import VectorStoreIntegrationSource from '../../../../src/theme/integration_template/tests/integration_tests/test_vectorstores.py';
<CodeBlock language="python" title="tests/integration_tests/test_vectorstores.py">
{
VectorStoreIntegrationSource.replaceAll('__ModuleName__', 'Parrot')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
Configuring the tests consists of implementing pytest fixtures for setting up an
empty vector store and tearing down the vector store after the test run ends.
| Fixture | Description |
| --- | --- |
| `vectorstore` | A generator that yields an empty vector store for unit tests. The vector store is cleaned up after the test run ends. |
For example, below is the `VectorStoreIntegrationTests` class for the [Chroma](https://python.langchain.com/docs/integrations/vectorstores/chroma/)
integration:
```python
from typing import Generator
import pytest
from langchain_core.vectorstores import VectorStore
from langchain_tests.integration_tests.vectorstores import VectorStoreIntegrationTests
from langchain_chroma import Chroma
class TestChromaStandard(VectorStoreIntegrationTests):
    @pytest.fixture()
    def vectorstore(self) -> Generator[VectorStore, None, None]:  # type: ignore
        """Get an empty vectorstore for unit tests."""
        store = Chroma(embedding_function=self.get_embeddings())
        try:
            yield store
        finally:
            store.delete_collection()
            pass
```
Note that before the initial `yield`, we instantiate the vector store with an
[embeddings](/docs/concepts/embedding_models/) object. This is a pre-defined
["fake" embeddings model](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.vectorstores.VectorStoreIntegrationTests.html#langchain_tests.integration_tests.vectorstores.VectorStoreIntegrationTests.get_embeddings)
that will generate short, arbitrary vectors for documents. You can use a different
embeddings object if desired.
In the `finally` block, we call whatever integration-specific logic is needed to
bring the vector store to a clean state. This logic is executed after each test,
even if the test fails.
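The `try`/`yield`/`finally` pattern can be illustrated without pytest. In the sketch below, `FakeStore`, `store_fixture`, and `run_test_with_store` are hypothetical stand-ins: the runner drives the generator roughly the way a test framework drives a generator fixture, so the cleanup in `finally` runs whether the test body passes or raises.

```python
from typing import Callable, Generator, List


class FakeStore:
    """Hypothetical in-memory stand-in for a vector store."""

    def __init__(self) -> None:
        self.docs: List[str] = []
        self.deleted = False

    def delete_collection(self) -> None:
        self.docs.clear()
        self.deleted = True


def store_fixture(created: List[FakeStore]) -> Generator[FakeStore, None, None]:
    """Yield an empty store, then clean it up."""
    store = FakeStore()
    created.append(store)  # keep a handle so the caller can inspect it later
    try:
        yield store
    finally:
        store.delete_collection()


def run_test_with_store(
    test: Callable[[FakeStore], None], created: List[FakeStore]
) -> bool:
    """Set up the fixture, run the test, and always tear down."""
    gen = store_fixture(created)
    store = next(gen)  # setup: run the fixture up to `yield`
    try:
        test(store)
        return True
    except AssertionError:
        return False
    finally:
        # teardown: resume the generator so its `finally` block runs
        next(gen, None)


def passing_test(store: FakeStore) -> None:
    store.docs.append("doc")
    assert len(store.docs) == 1


def failing_test(store: FakeStore) -> None:
    store.docs.append("doc")
    assert len(store.docs) == 2  # deliberately wrong


created_stores: List[FakeStore] = []
passed = run_test_with_store(passing_test, created_stores)
failed = run_test_with_store(failing_test, created_stores)
```

Both stores end up deleted and empty, including the one used by the failing test, so each test starts from a clean state.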
:::note
Details on what tests are run and troubleshooting tips for each test can be found in the [API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.vectorstores.VectorStoreIntegrationTests.html).
:::
</TabItem>
<TabItem value="embeddings" label="Embeddings">
To configure standard tests for an embeddings model, we subclass `EmbeddingsUnitTests` and `EmbeddingsIntegrationTests`. On each subclass, we override the following `@property` methods to specify the embeddings model to be tested and the embeddings model's configuration:
| Property | Description |
| --- | --- |
| `embeddings_class` | The class for the embeddings model to be tested |
| `embedding_model_params` | The parameters to pass to the embeddings model's constructor |
:::note
Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references. See details:
- [Unit tests API reference](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.embeddings.EmbeddingsUnitTests.html)
- [Integration tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.embeddings.EmbeddingsIntegrationTests.html)
:::
Unit test example:
import EmbeddingsUnitSource from '../../../../src/theme/integration_template/tests/unit_tests/test_embeddings.py';
<CodeBlock language="python" title="tests/unit_tests/test_embeddings.py">
{
EmbeddingsUnitSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
Integration test example:
```python title="tests/integration_tests/test_embeddings.py"
from typing import Type
from langchain_parrot_link.embeddings import ParrotLinkEmbeddings
from langchain_tests.integration_tests import EmbeddingsIntegrationTests
class TestParrotLinkEmbeddingsIntegration(EmbeddingsIntegrationTests):
    @property
    def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:
        return ParrotLinkEmbeddings

    @property
    def embedding_model_params(self) -> dict:
        return {"model": "nest-embed-001"}
```
import EmbeddingsIntegrationSource from '../../../../src/theme/integration_template/tests/integration_tests/test_embeddings.py';
<CodeBlock language="python" title="tests/integration_tests/test_embeddings.py">
{
EmbeddingsIntegrationSource.replaceAll('__ModuleName__', 'ParrotLink')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT_LINK')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</TabItem>
<TabItem value="tools" label="Tools">
To configure standard tests for a tool, we subclass `ToolsUnitTests` and
`ToolsIntegrationTests`. On each subclass, we override the following `@property` methods
to specify the tool to be tested and the tool's configuration:
| Property | Description |
| --- | --- |
| `tool_constructor` | The constructor for the tool to be tested, or an instantiated tool. |
| `tool_constructor_params` | The parameters to pass to the tool (optional). |
| `tool_invoke_params_example` | An example of the parameters to pass to the tool's `invoke` method. |
If you are testing a tool class and pass a class like `MyTool` to `tool_constructor`, you can pass the parameters to the constructor in `tool_constructor_params`.
If you are testing an instantiated tool, you can pass the instantiated tool to `tool_constructor` and do not
override `tool_constructor_params`.
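The two cases can be handled with a small amount of dispatch logic. The sketch below shows one way a test harness could resolve either form. `EchoTool` and `build_tool` are hypothetical names used for illustration, and the error raised for an instance plus parameters is an assumption of this sketch, not documented `langchain-tests` behavior.

```python
from typing import Any, Dict, Optional, Type, Union


class EchoTool:
    """Hypothetical tool with one constructor parameter."""

    def __init__(self, prefix: str = "") -> None:
        self.prefix = prefix

    def invoke(self, params: Dict[str, Any]) -> str:
        return f"{self.prefix}{params['text']}"


def build_tool(
    tool_constructor: Union[Type[EchoTool], EchoTool],
    tool_constructor_params: Optional[Dict[str, Any]] = None,
) -> EchoTool:
    """Return a tool from either a class or an existing instance."""
    if isinstance(tool_constructor, type):
        # A class was given: instantiate it with the supplied parameters.
        return tool_constructor(**(tool_constructor_params or {}))
    if tool_constructor_params:
        # An instance is already configured, so extra parameters are rejected.
        raise ValueError("Do not pass constructor params with an instance.")
    return tool_constructor


# Case 1: pass the class plus its constructor parameters.
from_class = build_tool(EchoTool, {"prefix": "> "})
# Case 2: pass an already-instantiated tool and no parameters.
from_instance = build_tool(EchoTool(prefix="# "))
```

In both cases the caller ends up with a ready-to-use tool, which is then exercised with the example invoke parameters.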
:::note
Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references. See details:
- [Unit tests API reference](https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.tools.ToolsUnitTests.html)
- [Integration tests API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.tools.ToolsIntegrationTests.html)
:::
import ToolsUnitSource from '../../../../src/theme/integration_template/tests/unit_tests/test_tools.py';
<CodeBlock language="python" title="tests/unit_tests/test_tools.py">
{
ToolsUnitSource.replaceAll('__ModuleName__', 'Parrot')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
import ToolsIntegrationSource from '../../../../src/theme/integration_template/tests/integration_tests/test_tools.py';
<CodeBlock language="python" title="tests/integration_tests/test_tools.py">
{
ToolsIntegrationSource.replaceAll('__ModuleName__', 'Parrot')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</TabItem>
<TabItem value="retrievers" label="Retrievers">
To configure standard tests for a retriever, we subclass `RetrieversUnitTests` and
`RetrieversIntegrationTests`. On each subclass, we override the following `@property` methods
to specify the retriever to be tested and the retriever's configuration:
| Property | Description |
| --- | --- |
| `retriever_constructor` | The class for the retriever to be tested |
| `retriever_constructor_params` | The parameters to pass to the retriever's constructor |
| `retriever_query_example` | An example of the query to pass to the retriever's `invoke` method |
:::note
Details on what tests are run and troubleshooting tips for each test can be found in the [API reference](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.retrievers.RetrieversIntegrationTests.html).
:::
import RetrieverIntegrationSource from '../../../../src/theme/integration_template/tests/integration_tests/test_retrievers.py';
<CodeBlock language="python" title="tests/integration_tests/test_retrievers.py">
{
RetrieverIntegrationSource.replaceAll('__ModuleName__', 'Parrot')
.replaceAll('__package_name__', 'langchain-parrot-link')
.replaceAll('__MODULE_NAME__', 'PARROT')
.replaceAll('__module_name__', 'langchain_parrot_link')
}
</CodeBlock>
</TabItem>
</Tabs>
---
### Running the tests
You can run the tests with the following commands from your project root:
<Tabs>
<TabItem value="poetry" label="Poetry" default>
```bash
# run unit tests without network access
poetry run pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests
# run integration tests
poetry run pytest --asyncio-mode=auto tests/integration_tests
```
</TabItem>
<TabItem value="pip" label="Pip">
```bash
# run unit tests without network access
pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests
# run integration tests
pytest --asyncio-mode=auto tests/integration_tests
```
</TabItem>
</Tabs>
## Test suite information and troubleshooting
For a full list of the standard test suites that are available, as well as
information on which tests are included and how to troubleshoot common issues,
see the [Standard Tests API Reference](https://python.langchain.com/api_reference/standard_tests/index.html).
You can see troubleshooting guides under the individual test suites listed in that API Reference. For example,
[here is the guide for `ChatModelIntegrationTests.test_usage_metadata`](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html#langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.test_usage_metadata).


@@ -802,7 +802,7 @@
"That's a wrap! In this quick start we covered how to create a simple agent. Agents are a complex topic, and there's lot to learn! \n",
"\n",
":::important\n",
"This section covered building with LangChain Agents. LangChain Agents are fine for getting started, but past a certain point you will likely want flexibility and control that they do not offer. For working with more advanced agents, we'd reccommend checking out [LangGraph](/docs/concepts/architecture/#langgraph)\n",
"This section covered building with LangChain Agents. They are fine for getting started, but past a certain point you will likely want flexibility and control which they do not offer. To develop more advanced agents, we recommend checking out [LangGraph](/docs/concepts/architecture/#langgraph)\n",
":::\n",
"\n",
"If you want to continue using LangChain agents, some good advanced guides are:\n",


@@ -294,7 +294,7 @@
"metadata": {},
"source": [
":::caution\n",
"By default, `@tool(parse_docstring=True)` will raise `ValueError` if the docstring does not parse correctly. See [API Reference](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.tool.html) for detail and examples.\n",
"By default, `@tool(parse_docstring=True)` will raise `ValueError` if the docstring does not parse correctly. See [API Reference](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) for detail and examples.\n",
":::"
]
},


@@ -1,459 +0,0 @@
{
"cells": [
{
"cell_type": "raw",
"id": "5e61b0f2-15b9-4241-9ab5-ff0f3f732232",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 1\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "846ef4f4-ee38-4a42-a7d3-1a23826e4830",
"metadata": {},
"source": [
"# How to map values to a graph database\n",
"\n",
"In this guide we'll go over strategies to improve graph database query generation by mapping values from user inputs to the database.\n",
"When using the built-in graph chains, the LLM is aware of the graph schema but has no information about the values of properties stored in the database.\n",
"Therefore, we can introduce a new step in the graph database QA system to accurately map values.\n",
"\n",
"## Setup\n",
"\n",
"First, get required packages and set environment variables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18294435-182d-48da-bcab-5b8945b6d9cf",
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-neo4j langchain-openai neo4j"
]
},
{
"cell_type": "markdown",
"id": "d86dd771-4001-4a34-8680-22e9b50e1e88",
"metadata": {},
"source": [
"We default to OpenAI models in this guide, but you can swap them out for the model provider of your choice."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "9346f8e9-78bf-4667-b3d3-72807a73b718",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Uncomment the below to use LangSmith. Not required.\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\""
]
},
{
"cell_type": "markdown",
"id": "271c8a23-e51c-4ead-a76e-cf21107db47e",
"metadata": {},
"source": [
"Next, we need to define Neo4j credentials.\n",
"Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a2a3bb65-05c7-4daf-bac2-b25ae7fe2751",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"NEO4J_URI\"] = \"bolt://localhost:7687\"\n",
"os.environ[\"NEO4J_USERNAME\"] = \"neo4j\"\n",
"os.environ[\"NEO4J_PASSWORD\"] = \"password\""
]
},
{
"cell_type": "markdown",
"id": "50fa4510-29b7-49b6-8496-5e86f694e81f",
"metadata": {},
"source": [
"The below example will create a connection with a Neo4j database and will populate it with example data about movies and their actors."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4ee9ef7a-eef9-4289-b9fd-8fbc31041688",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_neo4j import Neo4jGraph\n",
"\n",
"graph = Neo4jGraph()\n",
"\n",
"# Import movie information\n",
"\n",
"movies_query = \"\"\"\n",
"LOAD CSV WITH HEADERS FROM \n",
"'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n",
"AS row\n",
"MERGE (m:Movie {id:row.movieId})\n",
"SET m.released = date(row.released),\n",
" m.title = row.title,\n",
" m.imdbRating = toFloat(row.imdbRating)\n",
"FOREACH (director in split(row.director, '|') | \n",
" MERGE (p:Person {name:trim(director)})\n",
" MERGE (p)-[:DIRECTED]->(m))\n",
"FOREACH (actor in split(row.actors, '|') | \n",
" MERGE (p:Person {name:trim(actor)})\n",
" MERGE (p)-[:ACTED_IN]->(m))\n",
"FOREACH (genre in split(row.genres, '|') | \n",
" MERGE (g:Genre {name:trim(genre)})\n",
" MERGE (m)-[:IN_GENRE]->(g))\n",
"\"\"\"\n",
"\n",
"graph.query(movies_query)"
]
},
{
"cell_type": "markdown",
"id": "0cb0ea30-ca55-4f35-aad6-beb57453de66",
"metadata": {},
"source": [
"## Detecting entities in the user input\n",
"We have to extract the types of entities/values we want to map to a graph database. In this example, we are dealing with a movie graph, so we can map movies and people to the database."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e1a19424-6046-40c2-81d1-f3b88193a293",
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
"\n",
"\n",
"class Entities(BaseModel):\n",
"    \"\"\"Identifying information about entities.\"\"\"\n",
"\n",
"    names: List[str] = Field(\n",
"        ...,\n",
"        description=\"All the person or movies appearing in the text\",\n",
"    )\n",
"\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are extracting person and movies from the text.\",\n",
" ),\n",
" (\n",
" \"human\",\n",
" \"Use the given format to extract information from the following \"\n",
" \"input: {question}\",\n",
" ),\n",
" ]\n",
")\n",
"\n",
"\n",
"entity_chain = prompt | llm.with_structured_output(Entities)"
]
},
{
"cell_type": "markdown",
"id": "9c14084c-37a7-4a9c-a026-74e12961c781",
"metadata": {},
"source": [
"We can test the entity extraction chain."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "bbfe0d8f-982e-46e6-88fb-8a4f0d850b07",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Entities(names=['Casino'])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"entities = entity_chain.invoke({\"question\": \"Who played in Casino movie?\"})\n",
"entities"
]
},
{
"cell_type": "markdown",
"id": "a8afbf13-05d0-4383-8050-f88b8c2f6fab",
"metadata": {},
"source": [
"We will use a simple `CONTAINS` clause to match entities to the database. In practice, you might want to use a fuzzy search or a full-text index to allow for minor misspellings."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "6f92929f-74fb-4db2-b7e1-eb1e9d386a67",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Casino maps to Casino Movie in database\\n'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"match_query = \"\"\"MATCH (p:Person|Movie)\n",
"WHERE p.name CONTAINS $value OR p.title CONTAINS $value\n",
"RETURN coalesce(p.name, p.title) AS result, labels(p)[0] AS type\n",
"LIMIT 1\n",
"\"\"\"\n",
"\n",
"\n",
"def map_to_database(entities: Entities) -> Optional[str]:\n",
"    result = \"\"\n",
"    for entity in entities.names:\n",
"        response = graph.query(match_query, {\"value\": entity})\n",
"        try:\n",
"            result += f\"{entity} maps to {response[0]['result']} {response[0]['type']} in database\\n\"\n",
"        except IndexError:\n",
"            pass\n",
"    return result\n",
"\n",
"\n",
"map_to_database(entities)"
]
},
{
"cell_type": "markdown",
"id": "f66c6756-6efb-4b1e-9b5d-87ed914a5212",
"metadata": {},
"source": [
"## Custom Cypher generating chain\n",
"\n",
"We need to define a custom Cypher prompt that takes the entity mapping information along with the schema and the user question to construct a Cypher statement.\n",
"We will be using the LangChain expression language to accomplish that."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "8ef3e21d-f1c2-45e2-9511-4920d1cf6e7e",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"\n",
"# Generate Cypher statement based on natural language input\n",
"cypher_template = \"\"\"Based on the Neo4j graph schema below, write a Cypher query that would answer the user's question:\n",
"{schema}\n",
"Entities in the question map to the following database values:\n",
"{entities_list}\n",
"Question: {question}\n",
"Cypher query:\"\"\"\n",
"\n",
"cypher_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"Given an input question, convert it to a Cypher query. No pre-amble.\",\n",
" ),\n",
" (\"human\", cypher_template),\n",
" ]\n",
")\n",
"\n",
"cypher_response = (\n",
" RunnablePassthrough.assign(names=entity_chain)\n",
" | RunnablePassthrough.assign(\n",
" entities_list=lambda x: map_to_database(x[\"names\"]),\n",
" schema=lambda _: graph.get_schema,\n",
" )\n",
" | cypher_prompt\n",
" | llm.bind(stop=[\"\\nCypherResult:\"])\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "1f0011e3-9660-4975-af2a-486b1bc3b954",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'MATCH (:Movie {title: \"Casino\"})<-[:ACTED_IN]-(actor)\\nRETURN actor.name'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cypher = cypher_response.invoke({\"question\": \"Who played in Casino movie?\"})\n",
"cypher"
]
},
{
"cell_type": "markdown",
"id": "38095678-611f-4847-a4de-e51ef7ef727c",
"metadata": {},
"source": [
"## Generating answers based on database results\n",
"\n",
"Now that we have a chain that generates the Cypher statement, we need to execute the Cypher statement against the database and send the database results back to an LLM to generate the final answer.\n",
"Again, we will be using LCEL."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d1fa97c0-1c9c-41d3-9ee1-5f1905d17434",
"metadata": {},
"outputs": [],
"source": [
"from langchain_neo4j.chains.graph_qa.cypher_utils import (\n",
" CypherQueryCorrector,\n",
" Schema,\n",
")\n",
"\n",
"graph.refresh_schema()\n",
"# Cypher validation tool for relationship directions\n",
"corrector_schema = [\n",
" Schema(el[\"start\"], el[\"type\"], el[\"end\"])\n",
" for el in graph.structured_schema.get(\"relationships\")\n",
"]\n",
"cypher_validation = CypherQueryCorrector(corrector_schema)\n",
"\n",
"# Generate natural language response based on database results\n",
"response_template = \"\"\"Based on the question, Cypher query, and Cypher response, write a natural language response:\n",
"Question: {question}\n",
"Cypher query: {query}\n",
"Cypher Response: {response}\"\"\"\n",
"\n",
"response_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"Given an input question and Cypher response, convert it to a natural\"\n",
" \" language answer. No pre-amble.\",\n",
" ),\n",
" (\"human\", response_template),\n",
" ]\n",
")\n",
"\n",
"chain = (\n",
" RunnablePassthrough.assign(query=cypher_response)\n",
" | RunnablePassthrough.assign(\n",
" response=lambda x: graph.query(cypher_validation(x[\"query\"])),\n",
" )\n",
" | response_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "918146e5-7918-46d2-a774-53f9547d8fcb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Robert De Niro, James Woods, Joe Pesci, and Sharon Stone played in the movie \"Casino\".'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": \"Who played in Casino movie?\"})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7ba75cd-8399-4e54-a6f8-8a411f159f56",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -1,548 +0,0 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 2\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to best prompt for Graph-RAG\n",
"\n",
"In this guide we'll go over prompting strategies to improve graph database query generation. We'll largely focus on methods for getting relevant database-specific information in your prompt.\n",
"\n",
"## Setup\n",
"\n",
"First, get required packages and set environment variables:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install --upgrade --quiet langchain langchain-neo4j langchain-openai neo4j"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We default to OpenAI models in this guide, but you can swap them out for the model provider of your choice."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
" ········\n"
]
}
],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass()\n",
"\n",
"# Uncomment the below to use LangSmith. Not required.\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()\n",
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we need to define Neo4j credentials.\n",
"Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"NEO4J_URI\"] = \"bolt://localhost:7687\"\n",
"os.environ[\"NEO4J_USERNAME\"] = \"neo4j\"\n",
"os.environ[\"NEO4J_PASSWORD\"] = \"password\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The below example will create a connection with a Neo4j database and will populate it with example data about movies and their actors."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_neo4j import Neo4jGraph\n",
"\n",
"graph = Neo4jGraph()\n",
"\n",
"# Import movie information\n",
"\n",
"movies_query = \"\"\"\n",
"LOAD CSV WITH HEADERS FROM \n",
"'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n",
"AS row\n",
"MERGE (m:Movie {id:row.movieId})\n",
"SET m.released = date(row.released),\n",
" m.title = row.title,\n",
" m.imdbRating = toFloat(row.imdbRating)\n",
"FOREACH (director in split(row.director, '|') | \n",
" MERGE (p:Person {name:trim(director)})\n",
" MERGE (p)-[:DIRECTED]->(m))\n",
"FOREACH (actor in split(row.actors, '|') | \n",
" MERGE (p:Person {name:trim(actor)})\n",
" MERGE (p)-[:ACTED_IN]->(m))\n",
"FOREACH (genre in split(row.genres, '|') | \n",
" MERGE (g:Genre {name:trim(genre)})\n",
" MERGE (m)-[:IN_GENRE]->(g))\n",
"\"\"\"\n",
"\n",
"graph.query(movies_query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Filtering graph schema\n",
"\n",
"At times, you may need to focus on a specific subset of the graph schema while generating Cypher statements.\n",
"Let's say we are dealing with the following graph schema:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node properties are the following:\n",
"Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING},Person {name: STRING},Genre {name: STRING}\n",
"Relationship properties are the following:\n",
"\n",
"The relationships are the following:\n",
"(:Movie)-[:IN_GENRE]->(:Genre),(:Person)-[:DIRECTED]->(:Movie),(:Person)-[:ACTED_IN]->(:Movie)\n"
]
}
],
"source": [
"graph.refresh_schema()\n",
"print(graph.schema)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we want to exclude the _Genre_ node from the schema representation we pass to an LLM.\n",
"We can achieve that using the `exclude` parameter of the GraphCypherQAChain chain."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from langchain_neo4j import GraphCypherQAChain\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
"chain = GraphCypherQAChain.from_llm(\n",
" graph=graph,\n",
" llm=llm,\n",
" exclude_types=[\"Genre\"],\n",
" verbose=True,\n",
" allow_dangerous_requests=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node properties are the following:\n",
"Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING},Person {name: STRING}\n",
"Relationship properties are the following:\n",
"\n",
"The relationships are the following:\n",
"(:Person)-[:DIRECTED]->(:Movie),(:Person)-[:ACTED_IN]->(:Movie)\n"
]
}
],
"source": [
"print(chain.graph_schema)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Few-shot examples\n",
"\n",
"Including examples of natural language questions being converted to valid Cypher queries against our database in the prompt will often improve model performance, especially for complex queries.\n",
"\n",
"Let's say we have the following examples:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"examples = [\n",
" {\n",
" \"question\": \"How many artists are there?\",\n",
" \"query\": \"MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)\",\n",
" },\n",
" {\n",
" \"question\": \"Which actors played in the movie Casino?\",\n",
" \"query\": \"MATCH (m:Movie {{title: 'Casino'}})<-[:ACTED_IN]-(a) RETURN a.name\",\n",
" },\n",
" {\n",
" \"question\": \"How many movies has Tom Hanks acted in?\",\n",
" \"query\": \"MATCH (a:Person {{name: 'Tom Hanks'}})-[:ACTED_IN]->(m:Movie) RETURN count(m)\",\n",
" },\n",
" {\n",
" \"question\": \"List all the genres of the movie Schindler's List\",\n",
" \"query\": \"MATCH (m:Movie {{title: 'Schindler\\\\'s List'}})-[:IN_GENRE]->(g:Genre) RETURN g.name\",\n",
" },\n",
" {\n",
" \"question\": \"Which actors have worked in movies from both the comedy and action genres?\",\n",
" \"query\": \"MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name\",\n",
" },\n",
" {\n",
" \"question\": \"Which directors have made movies with at least three different actors named 'John'?\",\n",
" \"query\": \"MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person) WHERE a.name STARTS WITH 'John' WITH d, COUNT(DISTINCT a) AS JohnsCount WHERE JohnsCount >= 3 RETURN d.name\",\n",
" },\n",
" {\n",
" \"question\": \"Identify movies where directors also played a role in the film.\",\n",
" \"query\": \"MATCH (p:Person)-[:DIRECTED]->(m:Movie), (p)-[:ACTED_IN]->(m) RETURN m.title, p.name\",\n",
" },\n",
" {\n",
" \"question\": \"Find the actor with the highest number of movies in the database.\",\n",
" \"query\": \"MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) RETURN a.name, COUNT(m) AS movieCount ORDER BY movieCount DESC LIMIT 1\",\n",
" },\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can create a few-shot prompt with them like so:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate\n",
"\n",
"example_prompt = PromptTemplate.from_template(\n",
" \"User input: {question}\\nCypher query: {query}\"\n",
")\n",
"prompt = FewShotPromptTemplate(\n",
" examples=examples[:5],\n",
" example_prompt=example_prompt,\n",
" prefix=\"You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.\\n\\nHere is the schema information\\n{schema}.\\n\\nBelow are a number of examples of questions and their corresponding Cypher queries.\",\n",
" suffix=\"User input: {question}\\nCypher query: \",\n",
" input_variables=[\"question\", \"schema\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.\n",
"\n",
"Here is the schema information\n",
"foo.\n",
"\n",
"Below are a number of examples of questions and their corresponding Cypher queries.\n",
"\n",
"User input: How many artists are there?\n",
"Cypher query: MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)\n",
"\n",
"User input: Which actors played in the movie Casino?\n",
"Cypher query: MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name\n",
"\n",
"User input: How many movies has Tom Hanks acted in?\n",
"Cypher query: MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)\n",
"\n",
"User input: List all the genres of the movie Schindler's List\n",
"Cypher query: MATCH (m:Movie {title: 'Schindler\\'s List'})-[:IN_GENRE]->(g:Genre) RETURN g.name\n",
"\n",
"User input: Which actors have worked in movies from both the comedy and action genres?\n",
"Cypher query: MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name\n",
"\n",
"User input: How many artists are there?\n",
"Cypher query: \n"
]
}
],
"source": [
"print(prompt.format(question=\"How many artists are there?\", schema=\"foo\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dynamic few-shot examples\n",
"\n",
"If we have enough examples, we may want to include only the most relevant ones in the prompt, either because they don't all fit in the model's context window or because the long tail of examples distracts the model. Specifically, for any given input we want to include the examples most relevant to that input.\n",
"\n",
"We can do just this using an ExampleSelector. In this case we'll use a [SemanticSimilarityExampleSelector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html), which will store the examples in the vector database of our choosing. At runtime it will perform a similarity search between the input and our examples, and return the most semantically similar ones: "
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.example_selectors import SemanticSimilarityExampleSelector\n",
"from langchain_neo4j import Neo4jVector\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"example_selector = SemanticSimilarityExampleSelector.from_examples(\n",
" examples,\n",
" OpenAIEmbeddings(),\n",
" Neo4jVector,\n",
" k=5,\n",
" input_keys=[\"question\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'query': 'MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)',\n",
" 'question': 'How many artists are there?'},\n",
" {'query': \"MATCH (a:Person {{name: 'Tom Hanks'}})-[:ACTED_IN]->(m:Movie) RETURN count(m)\",\n",
" 'question': 'How many movies has Tom Hanks acted in?'},\n",
" {'query': \"MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name\",\n",
" 'question': 'Which actors have worked in movies from both the comedy and action genres?'},\n",
" {'query': \"MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person) WHERE a.name STARTS WITH 'John' WITH d, COUNT(DISTINCT a) AS JohnsCount WHERE JohnsCount >= 3 RETURN d.name\",\n",
" 'question': \"Which directors have made movies with at least three different actors named 'John'?\"},\n",
" {'query': 'MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) RETURN a.name, COUNT(m) AS movieCount ORDER BY movieCount DESC LIMIT 1',\n",
" 'question': 'Find the actor with the highest number of movies in the database.'}]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"example_selector.select_examples({\"question\": \"how many artists are there?\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use it, we can pass the ExampleSelector directly into our FewShotPromptTemplate:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"prompt = FewShotPromptTemplate(\n",
" example_selector=example_selector,\n",
" example_prompt=example_prompt,\n",
" prefix=\"You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.\\n\\nHere is the schema information\\n{schema}.\\n\\nBelow are a number of examples of questions and their corresponding Cypher queries.\",\n",
" suffix=\"User input: {question}\\nCypher query: \",\n",
" input_variables=[\"question\", \"schema\"],\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.\n",
"\n",
"Here is the schema information\n",
"foo.\n",
"\n",
"Below are a number of examples of questions and their corresponding Cypher queries.\n",
"\n",
"User input: How many artists are there?\n",
"Cypher query: MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)\n",
"\n",
"User input: How many movies has Tom Hanks acted in?\n",
"Cypher query: MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)\n",
"\n",
"User input: Which actors have worked in movies from both the comedy and action genres?\n",
"Cypher query: MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name\n",
"\n",
"User input: Which directors have made movies with at least three different actors named 'John'?\n",
"Cypher query: MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person) WHERE a.name STARTS WITH 'John' WITH d, COUNT(DISTINCT a) AS JohnsCount WHERE JohnsCount >= 3 RETURN d.name\n",
"\n",
"User input: Find the actor with the highest number of movies in the database.\n",
"Cypher query: MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) RETURN a.name, COUNT(m) AS movieCount ORDER BY movieCount DESC LIMIT 1\n",
"\n",
"User input: how many artists are there?\n",
"Cypher query: \n"
]
}
],
"source": [
"print(prompt.format(question=\"how many artists are there?\", schema=\"foo\"))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
"chain = GraphCypherQAChain.from_llm(\n",
" graph=graph,\n",
" llm=llm,\n",
" cypher_prompt=prompt,\n",
" verbose=True,\n",
" allow_dangerous_requests=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new GraphCypherQAChain chain...\u001b[0m\n",
"Generated Cypher:\n",
"\u001b[32;1m\u001b[1;3mMATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)\u001b[0m\n",
"Full Context:\n",
"\u001b[32;1m\u001b[1;3m[{'count(DISTINCT a)': 967}]\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'query': 'How many actors are in the graph?',\n",
" 'result': 'There are 967 actors in the graph.'}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(\"How many actors are in the graph?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -316,9 +316,7 @@ For a high-level tutorial, check out [this guide](/docs/tutorials/sql_qa/).
You can use an LLM to do question answering over graph databases.
For a high-level tutorial, check out [this guide](/docs/tutorials/graph/).
- [How to: map values to a database](/docs/how_to/graph_mapping)
- [How to: add a semantic layer over the database](/docs/how_to/graph_semantic)
- [How to: improve results with prompting](/docs/how_to/graph_prompting)
- [How to: construct knowledge graphs](/docs/how_to/graph_constructing)
### Summarization

View File

@@ -12,7 +12,7 @@
"There are two ways to implement a custom parser:\n",
"\n",
"1. Using `RunnableLambda` or `RunnableGenerator` in [LCEL](/docs/concepts/lcel/) -- we strongly recommend this for most use cases\n",
"2. By inherting from one of the base classes for out parsing -- this is the hard way of doing things\n",
"2. By inheriting from one of the base classes for out parsing -- this is the hard way of doing things\n",
"\n",
"The difference between the two approaches are mostly superficial and are mainly in terms of which callbacks are triggered (e.g., `on_chain_start` vs. `on_parser_start`), and how a runnable lambda vs. a parser might be visualized in a tracing platform like LangSmith."
]
@@ -200,7 +200,7 @@
"id": "24067447-8a5a-4d6b-86a3-4b9cc4b4369b",
"metadata": {},
"source": [
"## Inherting from Parsing Base Classes"
"## Inheriting from Parsing Base Classes"
]
},
{
@@ -208,7 +208,7 @@
"id": "9713f547-b2e4-48eb-807f-a0f6f6d0e7e0",
"metadata": {},
"source": [
"Another approach to implement a parser is by inherting from `BaseOutputParser`, `BaseGenerationOutputParser` or another one of the base parsers depending on what you need to do.\n",
"Another approach to implement a parser is by inheriting from `BaseOutputParser`, `BaseGenerationOutputParser` or another one of the base parsers depending on what you need to do.\n",
"\n",
"In general, we **do not** recommend this approach for most use cases as it results in more code to write without significant benefits.\n",
"\n",

View File

@@ -55,7 +55,7 @@
"* Run `.read Chinook_Sqlite.sql`\n",
"* Test `SELECT * FROM Artist LIMIT 10;`\n",
"\n",
"Now, `Chinhook.db` is in our directory and we can interface with it using the SQLAlchemy-driven [SQLDatabase](https://python.langchain.com/api_reference/community/utilities/langchain_community.utilities.sql_database.SQLDatabase.html) class:"
"Now, `Chinook.db` is in our directory and we can interface with it using the SQLAlchemy-driven [SQLDatabase](https://python.langchain.com/api_reference/community/utilities/langchain_community.utilities.sql_database.SQLDatabase.html) class:"
]
},
{

View File

@@ -51,7 +51,7 @@
"* Run `.read Chinook_Sqlite.sql`\n",
"* Test `SELECT * FROM Artist LIMIT 10;`\n",
"\n",
"Now, `Chinhook.db` is in our directory and we can interface with it using the SQLAlchemy-driven `SQLDatabase` class:"
"Now, `Chinook.db` is in our directory and we can interface with it using the SQLAlchemy-driven `SQLDatabase` class:"
]
},
{

View File

@@ -54,7 +54,7 @@
"* Run `.read Chinook_Sqlite.sql`\n",
"* Test `SELECT * FROM Artist LIMIT 10;`\n",
"\n",
"Now, `Chinhook.db` is in our directory and we can interface with it using the SQLAlchemy-driven `SQLDatabase` class:"
"Now, `Chinook.db` is in our directory and we can interface with it using the SQLAlchemy-driven `SQLDatabase` class:"
]
},
{

View File

@@ -336,7 +336,7 @@
"\n",
"The **MultiQueryRetriever** is used to tackle the problem that the RAG pipeline might not return the best set of documents based on the query. It generates multiple queries that mean the same as the original query and then fetches documents for each.\n",
"\n",
"To evluate this retriever, UpTrain will run the following evaluation:\n",
"To evaluate this retriever, UpTrain will run the following evaluation:\n",
"- **[Multi Query Accuracy](https://docs.uptrain.ai/predefined-evaluations/query-quality/multi-query-accuracy)**: Checks if the multi-queries generated mean the same as the original query."
]
},

View File

@@ -36,7 +36,7 @@
"### Integration details\n",
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/ibm/) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
"| [ChatWatsonx](https://python.langchain.com/api_reference/ibm/chat_models/langchain_ibm.chat_models.ChatWatsonx.html#langchain_ibm.chat_models.ChatWatsonx) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | ❌ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |\n",
"| [ChatWatsonx](https://python.langchain.com/api_reference/ibm/chat_models/langchain_ibm.chat_models.ChatWatsonx.html) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | ❌ | ❌ | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |\n",
"\n",
"### Model features\n",
"| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | Image input | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",

View File

@@ -0,0 +1,41 @@
# ScrapeGraph AI
>[ScrapeGraph AI](https://scrapegraphai.com) is a service that provides AI-powered web scraping capabilities.
>It offers tools for extracting structured data, converting webpages to markdown, and processing local HTML content
>using natural language prompts.
## Installation and Setup
Install the required packages:
```bash
pip install langchain-scrapegraph
```
Set up your API key:
```bash
export SGAI_API_KEY="your-scrapegraph-api-key"
```
## Tools
See a [usage example](/docs/integrations/tools/scrapegraph).
There are four tools available:
```python
from langchain_scrapegraph.tools import (
    SmartScraperTool,  # Extract structured data from websites
    MarkdownifyTool,  # Convert webpages to markdown
    LocalScraperTool,  # Process local HTML content
    GetCreditsTool,  # Check remaining API credits
)
```
Each tool serves a specific purpose:
- `SmartScraperTool`: Extract structured data from websites given a URL, prompt and optional output schema
- `MarkdownifyTool`: Convert any webpage to clean markdown format
- `LocalScraperTool`: Extract structured data from a local HTML file given a prompt and optional output schema
- `GetCreditsTool`: Check your remaining ScrapeGraph AI credits
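
Since the setup step relies on the `SGAI_API_KEY` environment variable, it can help to check for it before creating any tool. The following is a minimal sketch using only the standard library; `require_scrapegraph_key` is a hypothetical helper written for illustration and is not part of `langchain-scrapegraph`.

```python
import os


def require_scrapegraph_key(env=None):
    """Return the ScrapeGraph API key, or raise a clear error if it is missing.

    Hypothetical helper for illustration; not part of langchain-scrapegraph.
    `env` defaults to os.environ and can be replaced with a dict for testing.
    """
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as a missing one.
    key = env.get("SGAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SGAI_API_KEY is not set; export it before creating any ScrapeGraph tool."
        )
    return key


if __name__ == "__main__":
    try:
        require_scrapegraph_key()
        print("SGAI_API_KEY found")
    except RuntimeError as err:
        print(err)
```

Running the check once at startup surfaces a missing key as a readable message instead of a failure inside a later tool call.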

View File

@@ -8,7 +8,7 @@
"\n",
">[Upstage](https://upstage.ai) is a leading artificial intelligence (AI) company specializing in delivering above-human-grade performance LLM components.\n",
">\n",
">**Solar Mini Chat** is a fast yet powerful advanced large language model focusing on English and Korean. It has been specifically fine-tuned for multi-turn chat purposes, showing enhanced performance across a wide range of natural language processing tasks, like multi-turn conversation or tasks that require an understanding of long contexts, such as RAG (Retrieval-Augmented Generation), compared to other models of a similar size. This fine-tuning equips it with the ability to handle longer conversations more effectively, making it particularly adept for interactive applications.\n",
">**Solar Pro** is an enterprise-grade LLM optimized for single-GPU deployment, excelling in instruction-following and processing structured formats like HTML and Markdown. It supports English, Korean, and Japanese with top multilingual performance and offers domain expertise in finance, healthcare, and legal.\n",
"\n",
">Other than Solar, Upstage also offers features for real-world RAG (retrieval-augmented generation), such as **Document Parse** and **Groundedness Check**. \n"
]
@@ -21,12 +21,12 @@
"\n",
"| API | Description | Import | Example usage |\n",
"| --- | --- | --- | --- |\n",
"| Chat | Build assistants using Solar Mini Chat | `from langchain_upstage import ChatUpstage` | [Go](../../chat/upstage) |\n",
"| Chat | Build assistants using Solar Chat | `from langchain_upstage import ChatUpstage` | [Go](../../chat/upstage) |\n",
"| Text Embedding | Embed strings to vectors | `from langchain_upstage import UpstageEmbeddings` | [Go](../../text_embedding/upstage) |\n",
"| Groundedness Check | Verify groundedness of assistant's response | `from langchain_upstage import UpstageGroundednessCheck` | [Go](../../tools/upstage_groundedness_check) |\n",
"| Document Parse | Serialize documents with tables and figures | `from langchain_upstage import UpstageDocumentParseLoader` | [Go](../../document_loaders/upstage) |\n",
"\n",
"See [documentations](https://developers.upstage.ai/) for more details about the features."
"See [documentations](https://console.upstage.ai/docs/getting-started/overview) for more details about the models and features."
]
},
{

View File

@@ -35,9 +35,9 @@
"\n",
"### Integration details\n",
"\n",
"| Class | Package | JS support | Package downloads | Package latest |\n",
"| Class | Package | [JS support](https://js.langchain.com/docs/integrations/document_compressors/ibm/) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [WatsonxRerank](https://python.langchain.com/api_reference/ibm/chat_models/langchain_ibm.rerank.WatsonxRerank.html) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |"
"| [WatsonxRerank](https://python.langchain.com/api_reference/ibm/rerank/langchain_ibm.rerank.WatsonxRerank.html) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-ibm?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-ibm?style=flat-square&label=%20) |"
]
},
{
@@ -445,7 +445,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "langchain_ibm",
"language": "python",
"name": "python3"
},

View File

@@ -194,7 +194,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -146,7 +146,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -164,7 +164,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -185,7 +185,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_community.llms import Tongyi\n",
"\n",

View File

@@ -282,7 +282,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -196,7 +196,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -125,7 +125,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_community.query_constructors.hanavector import HanaTranslator\n",
"from langchain_openai import ChatOpenAI\n",

View File

@@ -119,7 +119,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -160,7 +160,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -165,7 +165,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import ChatOpenAI\n",
"\n",

View File

@@ -168,7 +168,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -135,7 +135,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -141,7 +141,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -190,7 +190,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -144,7 +144,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -194,7 +194,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -308,7 +308,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -218,7 +218,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import ChatOpenAI\n",
"\n",

View File

@@ -249,7 +249,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -91,7 +91,7 @@
"os.environ[\"VECTARA_CORPUS_ID\"] = \"<YOUR_VECTARA_CORPUS_ID>\"\n",
"os.environ[\"VECTARA_CUSTOMER_ID\"] = \"<YOUR_VECTARA_CUSTOMER_ID>\"\n",
"\n",
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_community.vectorstores import Vectara\n",
"from langchain_openai.chat_models import ChatOpenAI"

View File

@@ -115,7 +115,7 @@
},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import OpenAI\n",
"\n",

View File

@@ -327,7 +327,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"display_name": "langchain_ibm",
"language": "python",
"name": "python3"
},

View File

@@ -0,0 +1,201 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e8712110",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"Model2Vec is a technique to turn any sentence transformer into a really small static model.\n",
"The [model2vec](https://github.com/MinishLab/model2vec) library can be used to generate embeddings."
]
},
{
"cell_type": "markdown",
"id": "266dd424",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"```bash\n",
"pip install -U langchain-community\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "78ab91a6",
"metadata": {},
"source": [
"## Instantiation"
]
},
{
"cell_type": "markdown",
"id": "d06e7719",
"metadata": {},
"source": [
"Ensure that `model2vec` is installed\n",
"\n",
"```bash\n",
"pip install -U model2vec\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "f8ea1ed5",
"metadata": {},
"source": [
"## Indexing and Retrieval"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d25dc22d-b656-46c6-a42d-eace958590cd",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-24T15:13:17.176956Z",
"start_time": "2023-05-24T15:13:15.399076Z"
},
"execution": {
"iopub.execute_input": "2024-03-29T15:39:19.252281Z",
"iopub.status.busy": "2024-03-29T15:39:19.252101Z",
"iopub.status.idle": "2024-03-29T15:39:19.339106Z",
"shell.execute_reply": "2024-03-29T15:39:19.338614Z",
"shell.execute_reply.started": "2024-03-29T15:39:19.252260Z"
}
},
"outputs": [],
"source": [
"from langchain_community.embeddings import Model2vecEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8397b91f-a1f9-4be6-a699-fedaada7c37a",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-24T15:13:17.193751Z",
"start_time": "2023-05-24T15:13:17.182053Z"
},
"execution": {
"iopub.execute_input": "2024-03-29T15:39:19.901573Z",
"iopub.status.busy": "2024-03-29T15:39:19.900935Z",
"iopub.status.idle": "2024-03-29T15:39:19.906540Z",
"shell.execute_reply": "2024-03-29T15:39:19.905345Z",
"shell.execute_reply.started": "2024-03-29T15:39:19.901529Z"
}
},
"outputs": [],
"source": [
"embeddings = Model2vecEmbeddings(\"minishlab/potion-base-8M\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "abcf98b7-424c-4691-a1cd-862c3d53be11",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-24T15:13:17.844903Z",
"start_time": "2023-05-24T15:13:17.198751Z"
},
"execution": {
"iopub.execute_input": "2024-03-29T15:39:20.434581Z",
"iopub.status.busy": "2024-03-29T15:39:20.433117Z",
"iopub.status.idle": "2024-03-29T15:39:22.178650Z",
"shell.execute_reply": "2024-03-29T15:39:22.176058Z",
"shell.execute_reply.started": "2024-03-29T15:39:20.434501Z"
},
"scrolled": true
},
"outputs": [],
"source": [
"query_text = \"This is a test query.\"\n",
"query_result = embeddings.embed_query(query_text)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "98897454-b280-4ee1-bbb9-2c6c15342f87",
"metadata": {
"ExecuteTime": {
"end_time": "2023-05-24T15:13:18.605339Z",
"start_time": "2023-05-24T15:13:17.845906Z"
},
"execution": {
"iopub.execute_input": "2024-03-29T15:39:28.164009Z",
"iopub.status.busy": "2024-03-29T15:39:28.161759Z",
"iopub.status.idle": "2024-03-29T15:39:30.217232Z",
"shell.execute_reply": "2024-03-29T15:39:30.215348Z",
"shell.execute_reply.started": "2024-03-29T15:39:28.163876Z"
},
"scrolled": true
},
"outputs": [],
"source": [
"document_text = \"This is a test document.\"\n",
"document_result = embeddings.embed_documents([document_text])"
]
},
{
"cell_type": "markdown",
"id": "11bac134",
"metadata": {},
"source": [
"## Direct Usage\n",
"\n",
"Here's how you would directly make use of `model2vec`\n",
"\n",
"```python\n",
"from model2vec import StaticModel\n",
"\n",
"# Load a model from the HuggingFace hub (in this case the potion-base-8M model)\n",
"model = StaticModel.from_pretrained(\"minishlab/potion-base-8M\")\n",
"\n",
"# Make embeddings\n",
"embeddings = model.encode([\"It's dangerous to go alone!\", \"It's a secret to everybody.\"])\n",
"\n",
"# Make sequences of token embeddings\n",
"token_embeddings = model.encode_as_sequence([\"It's dangerous to go alone!\", \"It's a secret to everybody.\"])\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "d81e21aa",
"metadata": {},
"source": [
"## API Reference\n",
"\n",
"For more information check out the model2vec github [repo](https://github.com/MinishLab/model2vec)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
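
The notebook above describes Model2Vec as producing a "static" model. As a rough illustration of what that means, the sketch below assumes one fixed vector per token and builds a sentence embedding by averaging them. The vocabulary, the vectors, and the `embed` helper are all made up for this example; this is not the `model2vec` implementation.

```python
# Toy illustration of a "static" embedding model: every token has one fixed
# vector, and a sentence embedding is the mean of its token vectors.
# TOY_VECTORS and embed() are hypothetical and exist only for this sketch.
from typing import Dict, List

TOY_VECTORS: Dict[str, List[float]] = {
    "this": [1.0, 0.0],
    "is": [0.0, 1.0],
    "a": [1.0, 1.0],
    "test": [2.0, 2.0],
}
DIM = 2


def embed(text: str) -> List[float]:
    """Average the vectors of known tokens (naive whitespace tokenization)."""
    tokens = [t for t in text.lower().split() if t in TOY_VECTORS]
    if not tokens:
        # No known tokens: fall back to a zero vector.
        return [0.0] * DIM
    return [sum(TOY_VECTORS[t][i] for t in tokens) / len(tokens) for i in range(DIM)]


print(embed("This is a test"))  # [1.0, 1.0]
```

Because there is no forward pass through a transformer in this toy version, the cost is a table lookup and an average, which is consistent with the "really small static model" wording in the notebook.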

View File

@@ -0,0 +1,380 @@
{
"cells": [
{
"cell_type": "raw",
"id": "10238e62-3465-4973-9279-606cbb7ccf16",
"metadata": {},
"source": [
"---\n",
"sidebar_label: ScrapeGraph\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "a6f91f20",
"metadata": {},
"source": [
"# ScrapeGraph\n",
"\n",
"This notebook provides a quick overview for getting started with ScrapeGraph [tools](/docs/integrations/tools/). For detailed documentation of all ScrapeGraph features and configurations head to the [API reference](https://python.langchain.com/docs/integrations/tools/scrapegraph).\n",
"\n",
"For more information about ScrapeGraph AI:\n",
"- [ScrapeGraph AI Website](https://scrapegraphai.com)\n",
"- [Open Source Project](https://github.com/ScrapeGraphAI/Scrapegraph-ai)\n",
"\n",
"## Overview\n",
"\n",
"### Integration details\n",
"\n",
"| Class | Package | Serializable | JS support | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [SmartScraperTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-scrapegraph?style=flat-square&label=%20) |\n",
"| [MarkdownifyTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-scrapegraph?style=flat-square&label=%20) |\n",
"| [LocalScraperTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-scrapegraph?style=flat-square&label=%20) |\n",
"| [GetCreditsTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-scrapegraph?style=flat-square&label=%20) |\n",
"\n",
"### Tool features\n",
"\n",
"| Tool | Purpose | Input | Output |\n",
"| :--- | :--- | :--- | :--- |\n",
"| SmartScraperTool | Extract structured data from websites | URL + prompt | JSON |\n",
"| MarkdownifyTool | Convert webpages to markdown | URL | Markdown text |\n",
"| LocalScraperTool | Extract data from HTML content | HTML + prompt | JSON |\n",
"| GetCreditsTool | Check API credits | None | Credit info |\n",
"\n",
"\n",
"## Setup\n",
"\n",
"The integration requires the following packages:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f85b4089",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install --quiet -U langchain-scrapegraph"
]
},
{
"cell_type": "markdown",
"id": "b15e9266",
"metadata": {},
"source": [
"### Credentials\n",
"\n",
"You'll need a ScrapeGraph AI API key to use these tools. Get one at [scrapegraphai.com](https://scrapegraphai.com)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e0b178a2",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.environ.get(\"SGAI_API_KEY\"):\n",
" os.environ[\"SGAI_API_KEY\"] = getpass.getpass(\"ScrapeGraph AI API key:\\n\")"
]
},
{
"cell_type": "markdown",
"id": "bc5ab717",
"metadata": {},
"source": [
"It's also helpful (but not needed) to set up [LangSmith](https://smith.langchain.com/) for best-in-class observability:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6c2f136",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "1c97218f",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Here we show how to instantiate instances of the ScrapeGraph tools:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8b3ddfe9",
"metadata": {},
"outputs": [],
"source": [
"from langchain_scrapegraph.tools import (\n",
" GetCreditsTool,\n",
" LocalScraperTool,\n",
" MarkdownifyTool,\n",
" SmartScraperTool,\n",
")\n",
"\n",
"smartscraper = SmartScraperTool()\n",
"markdownify = MarkdownifyTool()\n",
"localscraper = LocalScraperTool()\n",
"credits = GetCreditsTool()"
]
},
{
"cell_type": "markdown",
"id": "74147a1a",
"metadata": {},
"source": [
"## Invocation\n",
"\n",
"### [Invoke directly with args](/docs/concepts/tools)\n",
"\n",
"Let's try each tool individually:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "65310a8b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SmartScraper Result: {'company_name': 'ScrapeGraphAI', 'description': \"ScrapeGraphAI is a powerful AI web scraping tool that turns entire websites into clean, structured data through a simple API. It's designed to help developers and AI companies extract valuable data from websites efficiently and transform it into formats that are ready for use in LLM applications and data analysis.\"}\n",
"\n",
"Markdownify Result (first 200 chars): [![ScrapeGraphAI Logo](https://scrapegraphai.com/images/scrapegraphai_logo.svg)ScrapeGraphAI](https://scrapegraphai.com/)\n",
"\n",
"PartnersPricingFAQ[Blog](https://scrapegraphai.com/blog)DocsLog inSign up\n",
"\n",
"Op\n",
"LocalScraper Result: {'company_name': 'Company Name', 'description': 'We are a technology company focused on AI solutions.', 'contact': {'email': 'contact@example.com', 'phone': '(555) 123-4567'}}\n",
"\n",
"Credits Info: {'remaining_credits': 49679, 'total_credits_used': 914}\n"
]
}
],
"source": [
"# SmartScraper\n",
"result = smartscraper.invoke(\n",
" {\n",
" \"user_prompt\": \"Extract the company name and description\",\n",
" \"website_url\": \"https://scrapegraphai.com\",\n",
" }\n",
")\n",
"print(\"SmartScraper Result:\", result)\n",
"\n",
"# Markdownify\n",
"markdown = markdownify.invoke({\"website_url\": \"https://scrapegraphai.com\"})\n",
"print(\"\\nMarkdownify Result (first 200 chars):\", markdown[:200])\n",
"\n",
"local_html = \"\"\"\n",
"<html>\n",
" <body>\n",
" <h1>Company Name</h1>\n",
" <p>We are a technology company focused on AI solutions.</p>\n",
" <div class=\"contact\">\n",
" <p>Email: contact@example.com</p>\n",
" <p>Phone: (555) 123-4567</p>\n",
" </div>\n",
" </body>\n",
"</html>\n",
"\"\"\"\n",
"\n",
"# LocalScraper\n",
"result_local = localscraper.invoke(\n",
" {\n",
" \"user_prompt\": \"Make a summary of the webpage and extract the email and phone number\",\n",
" \"website_html\": local_html,\n",
" }\n",
")\n",
"print(\"LocalScraper Result:\", result_local)\n",
"\n",
"# Check credits\n",
"credits_info = credits.invoke({})\n",
"print(\"\\nCredits Info:\", credits_info)"
]
},
{
"cell_type": "markdown",
"id": "d6e73897",
"metadata": {},
"source": [
"### [Invoke with ToolCall](/docs/concepts/tools)\n",
"\n",
"We can also invoke the tool with a model-generated ToolCall:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f90e33a7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ToolMessage(content='{\"main_heading\": \"Get the data you need from any website\", \"description\": \"Easily extract and gather information with just a few lines of code with a simple api. Turn websites into clean and usable structured data.\"}', name='SmartScraper', tool_call_id='1')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_generated_tool_call = {\n",
" \"args\": {\n",
" \"user_prompt\": \"Extract the main heading and description\",\n",
" \"website_url\": \"https://scrapegraphai.com\",\n",
" },\n",
" \"id\": \"1\",\n",
" \"name\": smartscraper.name,\n",
" \"type\": \"tool_call\",\n",
"}\n",
"smartscraper.invoke(model_generated_tool_call)"
]
},
{
"cell_type": "markdown",
"id": "659f9fbd",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"Let's use our tools with an LLM to analyze a website:\n",
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "af3123ad",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"# | output: false\n",
"# | echo: false\n",
"\n",
"# %pip install -qU langchain langchain-openai\n",
"from langchain.chat_models import init_chat_model\n",
"\n",
"llm = init_chat_model(model=\"gpt-4o\", model_provider=\"openai\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "fdbf35b5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='ScrapeGraph AI is an AI-powered web scraping tool that efficiently extracts and converts website data into structured formats via a simple API. It caters to developers, data scientists, and AI researchers, offering features like easy integration, support for dynamic content, and scalability for large projects. It supports various website types, including business, e-commerce, and educational sites. Contact: contact@scrapegraphai.com.', additional_kwargs={'tool_calls': [{'id': 'call_shkRPyjyAtfjH9ffG5rSy9xj', 'function': {'arguments': '{\"user_prompt\":\"Extract details about the products, services, and key features offered by ScrapeGraph AI, as well as any unique selling points or innovations mentioned on the website.\",\"website_url\":\"https://scrapegraphai.com\"}', 'name': 'SmartScraper'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 47, 'prompt_tokens': 480, 'total_tokens': 527, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_c7ca0ebaca', 'finish_reason': 'stop', 'logprobs': None}, id='run-45a12c86-d499-4273-8c59-0db926799bc7-0', tool_calls=[{'name': 'SmartScraper', 'args': {'user_prompt': 'Extract details about the products, services, and key features offered by ScrapeGraph AI, as well as any unique selling points or innovations mentioned on the website.', 'website_url': 'https://scrapegraphai.com'}, 'id': 'call_shkRPyjyAtfjH9ffG5rSy9xj', 'type': 'tool_call'}], usage_metadata={'input_tokens': 480, 'output_tokens': 47, 'total_tokens': 527, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnableConfig, chain\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\n",
" \"system\",\n",
" \"You are a helpful assistant that can use tools to extract structured information from websites.\",\n",
" ),\n",
" (\"human\", \"{user_input}\"),\n",
" (\"placeholder\", \"{messages}\"),\n",
" ]\n",
")\n",
"\n",
"llm_with_tools = llm.bind_tools([smartscraper], tool_choice=smartscraper.name)\n",
"llm_chain = prompt | llm_with_tools\n",
"\n",
"\n",
"@chain\n",
"def tool_chain(user_input: str, config: RunnableConfig):\n",
" input_ = {\"user_input\": user_input}\n",
" ai_msg = llm_chain.invoke(input_, config=config)\n",
" tool_msgs = smartscraper.batch(ai_msg.tool_calls, config=config)\n",
" return llm_chain.invoke({**input_, \"messages\": [ai_msg, *tool_msgs]}, config=config)\n",
"\n",
"\n",
"tool_chain.invoke(\n",
" \"What does ScrapeGraph AI do? Extract this information from their website https://scrapegraphai.com\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "4ac8146c",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all ScrapeGraph features and configurations head to the Langchain API reference: https://python.langchain.com/docs/integrations/tools/scrapegraph\n",
"\n",
"Or to the official SDK repo: https://github.com/ScrapeGraphAI/langchain-scrapegraph"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
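
The "Invoke with ToolCall" cell in the notebook above passes a dict with `args`, `id`, `name`, and `type`, and gets back a message carrying the tool name and `tool_call_id`. A plain-Python sketch of that request/response shape, assuming a hypothetical `fake_scraper` function and `run_tool_call` dispatcher (neither is part of LangChain or ScrapeGraph), might look like this:

```python
# Sketch of dispatching a tool-call dict to a function and echoing back
# the call id. fake_scraper and run_tool_call are hypothetical stand-ins;
# they do not reproduce how LangChain tools are implemented.
from typing import Any, Callable, Dict


def fake_scraper(user_prompt: str, website_url: str) -> Dict[str, str]:
    """Pretend to scrape: just echo the inputs back."""
    return {"prompt": user_prompt, "url": website_url}


TOOLS: Dict[str, Callable[..., Any]] = {"SmartScraper": fake_scraper}


def run_tool_call(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Look up the tool by name, call it with the args, keep the call id."""
    tool = TOOLS[tool_call["name"]]
    result = tool(**tool_call["args"])
    return {
        "content": result,
        "name": tool_call["name"],
        "tool_call_id": tool_call["id"],
    }


message = run_tool_call(
    {
        "args": {
            "user_prompt": "Extract the main heading and description",
            "website_url": "https://scrapegraphai.com",
        },
        "id": "1",
        "name": "SmartScraper",
        "type": "tool_call",
    }
)
print(message["tool_call_id"])  # 1
```

Keeping the `id` on the response is what lets a chat model match each tool result to the call that requested it.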

View File

@@ -552,7 +552,7 @@
"id": "66690c78",
"metadata": {},
"source": [
"A known limitation of large languag models (LLMs) is that their training data can be outdated, or not include the specific domain knowledge that you require.\n",
"A known limitation of large language models (LLMs) is that their training data can be outdated, or not include the specific domain knowledge that you require.\n",
"\n",
"Take a look at the example below:"
]

View File

@@ -11,7 +11,7 @@ LangChain simplifies every stage of the LLM application lifecycle:
- **Development**: Build your applications using LangChain's open-source [building blocks](/docs/concepts/lcel), [components](/docs/concepts), and [third-party integrations](/docs/integrations/providers/).
Use [LangGraph](/docs/concepts/architecture/#langgraph) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Productionization**: Use [LangSmith](https://docs.smith.langchain.com/) to inspect, monitor and evaluate your chains, so that you can continuously optimize and deploy with confidence.
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/).
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Platform](https://langchain-ai.github.io/langgraph/cloud/).
import ThemedImage from '@theme/ThemedImage';
import useBaseUrl from '@docusaurus/useBaseUrl';
@@ -29,11 +29,11 @@ import useBaseUrl from '@docusaurus/useBaseUrl';
Concretely, the framework consists of the following open-source libraries:
- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- Integration packages (e.g. **`langchain-openai`**, **`langchain-anthropic`**, etc.): Important integrations have been split into lightweight packages that are co-maintained by the LangChain team and the integration developers.
- **Integration packages** (e.g. `langchain-openai`, `langchain-anthropic`, etc.): Important integrations have been split into lightweight packages that are co-maintained by the LangChain team and the integration developers.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **`langchain-community`**: Third-party integrations that are community maintained.
- **[LangGraph](https://langchain-ai.github.io/langgraph)**: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it.
- **[LangGraphPlatform](https://langchain-ai.github.io/langgraph/concepts/#langgraph-platform)**: Deploy LLM applications built with LangGraph to production.
- **[LangGraph](https://langchain-ai.github.io/langgraph)**: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it. To learn more about LangGraph, check out our first LangChain Academy course, *Introduction to LangGraph*, available [here](https://academy.langchain.com/courses/intro-to-langgraph).
- **[LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/#langgraph-platform)**: Deploy LLM applications built with LangGraph to production.
- **[LangSmith](https://docs.smith.langchain.com)**: A developer platform that lets you debug, test, evaluate, and monitor LLM applications.

File diff suppressed because one or more lines are too long

View File

@@ -35,6 +35,7 @@
"json-loader": "^0.5.7",
"prism-react-renderer": "^2.1.0",
"process": "^0.11.10",
"raw-loader": "^4.0.2",
"react": "^18",
"react-dom": "^18",
"typescript": "^5.2.2",

View File

@@ -25,8 +25,6 @@ NOTEBOOKS_NO_EXECUTION = [
"docs/docs/how_to/example_selectors_langsmith.ipynb", # TODO: add langchain-benchmarks; fix cassette issue
"docs/docs/how_to/extraction_long_text.ipynb", # Non-determinism due to batch
"docs/docs/how_to/graph_constructing.ipynb", # Requires local neo4j
"docs/docs/how_to/graph_mapping.ipynb", # Requires local neo4j
"docs/docs/how_to/graph_prompting.ipynb", # Requires local neo4j
"docs/docs/how_to/graph_semantic.ipynb", # Requires local neo4j
"docs/docs/how_to/hybrid.ipynb", # Requires AstraDB instance
"docs/docs/how_to/indexing.ipynb", # Requires local Elasticsearch

Binary file not shown.

Size: 10 KiB

View File

@@ -62,6 +62,14 @@
"source": "/docs/tutorials/local_rag",
"destination": "/docs/tutorials/rag"
},
{
"source": "/docs/how_to/graph_mapping(/?)",
"destination": "/docs/tutorials/graph#query-validation"
},
{
"source": "/docs/how_to/graph_prompting(/?)",
"destination": "/docs/tutorials/graph#few-shot-prompting"
},
{
"source": "/docs/tutorials/data_generation",
"destination": "https://python.langchain.com/v0.2/docs/tutorials/data_generation/"

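The two redirect entries added above end their `source` with `(/?)`. Assuming this means "optional trailing slash", the behavior can be approximated with an anchored Python regular expression. The `resolve` helper is hypothetical, and the hosting platform's real matching rules may differ from this sketch.

```python
# Approximation of a redirect rule whose source ends in "(/?)".
# Assumption: "(/?)" means an optional trailing slash. resolve() is a
# hypothetical helper, not part of the docs build.
import re

SOURCE_PATTERN = re.compile(r"^/docs/how_to/graph_mapping(/?)$")
DESTINATION = "/docs/tutorials/graph#query-validation"


def resolve(path: str) -> str:
    """Return the redirect target if the path matches, else the path itself."""
    return DESTINATION if SOURCE_PATTERN.match(path) else path


print(resolve("/docs/how_to/graph_mapping"))
print(resolve("/docs/how_to/graph_mapping/"))
print(resolve("/docs/how_to/graph_mapping_extra"))
```

Under this assumption, the first two paths map to the tutorial anchor and the third is left unchanged.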
View File

@@ -9043,6 +9043,14 @@ raw-body@2.5.2:
iconv-lite "0.4.24"
unpipe "1.0.0"
raw-loader@^4.0.2:
version "4.0.2"
resolved "https://registry.yarnpkg.com/raw-loader/-/raw-loader-4.0.2.tgz#1aac6b7d1ad1501e66efdac1522c73e59a584eb6"
integrity sha512-ZnScIV3ag9A4wPX/ZayxL/jZH+euYb6FcUinPcgiQW0+UBtEv0O6Q3lGd3cqJ+GHH+rksEv3Pj99oxJ3u3VIKA==
dependencies:
loader-utils "^2.0.0"
schema-utils "^3.0.0"
rc@1.2.8:
version "1.2.8"
resolved "https://registry.yarnpkg.com/rc/-/rc-1.2.8.tgz#cd924bf5200a075b83c188cd6b9e211b7fc0d3ed"

View File

@@ -45,5 +45,4 @@ _e2e_test:
poetry run pip install -e ../../../standard-tests && \
make format lint tests && \
poetry install --with test_integration && \
rm tests/integration_tests/test_vectorstores.py && \
make integration_test

View File

@@ -2,23 +2,23 @@
from __future__ import annotations
import uuid
from typing import (
TYPE_CHECKING,
Any,
Callable,
Iterable,
Iterator,
List,
Optional,
Sequence,
Tuple,
Type,
TypeVar,
)
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.vectorstores import VectorStore
if TYPE_CHECKING:
from langchain_core.documents import Document
from langchain_core.vectorstores.utils import _cosine_similarity as cosine_similarity
VST = TypeVar("VST", bound=VectorStore)
@@ -158,40 +158,184 @@ class __ModuleName__VectorStore(VectorStore):
""" # noqa: E501
_database: dict[str, tuple[Document, list[float]]] = {}
def __init__(self, embedding: Embeddings) -> None:
"""Initialize with the given embedding function.
def add_texts(
self,
texts: Iterable[str],
Args:
embedding: embedding function to use.
"""
self._database: dict[str, dict[str, Any]] = {}
self.embedding = embedding
@classmethod
def from_texts(
cls: Type[__ModuleName__VectorStore],
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
**kwargs: Any,
) -> List[str]:
raise NotImplementedError
) -> __ModuleName__VectorStore:
store = cls(
embedding=embedding,
)
store.add_texts(texts=texts, metadatas=metadatas, **kwargs)
return store
# optional: add custom async implementations
# async def aadd_texts(
# self,
# texts: Iterable[str],
# @classmethod
# async def afrom_texts(
# cls: Type[VST],
# texts: List[str],
# embedding: Embeddings,
# metadatas: Optional[List[dict]] = None,
# **kwargs: Any,
# ) -> List[str]:
# ) -> VST:
# return await asyncio.get_running_loop().run_in_executor(
# None, partial(self.add_texts, **kwargs), texts, metadatas
# None, partial(cls.from_texts, **kwargs), texts, embedding, metadatas
# )
def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> Optional[bool]:
raise NotImplementedError
@property
def embeddings(self) -> Embeddings:
return self.embedding
def add_documents(
self,
documents: List[Document],
ids: Optional[List[str]] = None,
**kwargs: Any,
) -> List[str]:
"""Add documents to the store."""
texts = [doc.page_content for doc in documents]
vectors = self.embedding.embed_documents(texts)
if ids and len(ids) != len(texts):
msg = (
f"ids must be the same length as texts. "
f"Got {len(ids)} ids and {len(texts)} texts."
)
raise ValueError(msg)
id_iterator: Iterator[Optional[str]] = (
iter(ids) if ids else iter(doc.id for doc in documents)
)
ids_ = []
for doc, vector in zip(documents, vectors):
doc_id = next(id_iterator)
doc_id_ = doc_id if doc_id else str(uuid.uuid4())
ids_.append(doc_id_)
self._database[doc_id_] = {
"id": doc_id_,
"vector": vector,
"text": doc.page_content,
"metadata": doc.metadata,
}
return ids_
# optional: add custom async implementations
# async def aadd_documents(
# self,
# documents: List[Document],
# ids: Optional[List[str]] = None,
# **kwargs: Any,
# ) -> List[str]:
# raise NotImplementedError
def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> None:
if ids:
for _id in ids:
self._database.pop(_id, None)
# optional: add custom async implementations
# async def adelete(
# self, ids: Optional[List[str]] = None, **kwargs: Any
# ) -> Optional[bool]:
# ) -> None:
# raise NotImplementedError
def get_by_ids(self, ids: Sequence[str], /) -> list[Document]:
"""Get documents by their ids.
Args:
ids: The ids of the documents to get.
Returns:
A list of Document objects.
"""
documents = []
for doc_id in ids:
doc = self._database.get(doc_id)
if doc:
documents.append(
Document(
id=doc["id"],
page_content=doc["text"],
metadata=doc["metadata"],
)
)
return documents
# optional: add custom async implementations
# async def aget_by_ids(self, ids: Sequence[str], /) -> list[Document]:
# raise NotImplementedError
# NOTE: the below helper method implements similarity search for in-memory
# storage. It is optional and not a part of the vector store interface.
def _similarity_search_with_score_by_vector(
self,
embedding: List[float],
k: int = 4,
filter: Optional[Callable[[Document], bool]] = None,
**kwargs: Any,
) -> List[tuple[Document, float, List[float]]]:
# get all docs with fixed order in list
docs = list(self._database.values())
if filter is not None:
docs = [
doc
for doc in docs
if filter(Document(page_content=doc["text"], metadata=doc["metadata"]))
]
if not docs:
return []
similarity = cosine_similarity([embedding], [doc["vector"] for doc in docs])[0]
# get the indices ordered by similarity score
top_k_idx = similarity.argsort()[::-1][:k]
return [
(
# Document
Document(
id=doc_dict["id"],
page_content=doc_dict["text"],
metadata=doc_dict["metadata"],
),
# Score
float(similarity[idx].item()),
# Embedding vector
doc_dict["vector"],
)
for idx in top_k_idx
# Assign using walrus operator to avoid multiple lookups
if (doc_dict := docs[idx])
]
def similarity_search(
self, query: str, k: int = 4, **kwargs: Any
) -> List[Document]:
raise NotImplementedError
embedding = self.embedding.embed_query(query)
return [
doc
for doc, _, _ in self._similarity_search_with_score_by_vector(
embedding=embedding, k=k, **kwargs
)
]
# optional: add custom async implementations
# async def asimilarity_search(
@@ -204,9 +348,15 @@ class __ModuleName__VectorStore(VectorStore):
# return await asyncio.get_event_loop().run_in_executor(None, func)
def similarity_search_with_score(
self, *args: Any, **kwargs: Any
self, query: str, k: int = 4, **kwargs: Any
) -> List[Tuple[Document, float]]:
raise NotImplementedError
embedding = self.embedding.embed_query(query)
return [
(doc, similarity)
for doc, similarity, _ in self._similarity_search_with_score_by_vector(
embedding=embedding, k=k, **kwargs
)
]
# optional: add custom async implementations
# async def asimilarity_search_with_score(
@@ -218,10 +368,12 @@ class __ModuleName__VectorStore(VectorStore):
# func = partial(self.similarity_search_with_score, *args, **kwargs)
# return await asyncio.get_event_loop().run_in_executor(None, func)
def similarity_search_by_vector(
self, embedding: List[float], k: int = 4, **kwargs: Any
) -> List[Document]:
raise NotImplementedError
### ADDITIONAL OPTIONAL SEARCH METHODS BELOW ###
# def similarity_search_by_vector(
# self, embedding: List[float], k: int = 4, **kwargs: Any
# ) -> List[Document]:
# raise NotImplementedError
# optional: add custom async implementations
# async def asimilarity_search_by_vector(
@@ -233,15 +385,15 @@ class __ModuleName__VectorStore(VectorStore):
# func = partial(self.similarity_search_by_vector, embedding, k=k, **kwargs)
# return await asyncio.get_event_loop().run_in_executor(None, func)
def max_marginal_relevance_search(
self,
query: str,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
**kwargs: Any,
) -> List[Document]:
raise NotImplementedError
# def max_marginal_relevance_search(
# self,
# query: str,
# k: int = 4,
# fetch_k: int = 20,
# lambda_mult: float = 0.5,
# **kwargs: Any,
# ) -> List[Document]:
# raise NotImplementedError
# optional: add custom async implementations
# async def amax_marginal_relevance_search(
@@ -265,15 +417,15 @@ class __ModuleName__VectorStore(VectorStore):
# )
# return await asyncio.get_event_loop().run_in_executor(None, func)
def max_marginal_relevance_search_by_vector(
self,
embedding: List[float],
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
**kwargs: Any,
) -> List[Document]:
raise NotImplementedError
# def max_marginal_relevance_search_by_vector(
# self,
# embedding: List[float],
# k: int = 4,
# fetch_k: int = 20,
# lambda_mult: float = 0.5,
# **kwargs: Any,
# ) -> List[Document]:
# raise NotImplementedError
# optional: add custom async implementations
# async def amax_marginal_relevance_search_by_vector(
@@ -285,29 +437,3 @@ class __ModuleName__VectorStore(VectorStore):
# **kwargs: Any,
# ) -> List[Document]:
# raise NotImplementedError
@classmethod
def from_texts(
cls: Type[VST],
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict]] = None,
**kwargs: Any,
) -> VST:
raise NotImplementedError
# optional: add custom async implementations
# @classmethod
# async def afrom_texts(
# cls: Type[VST],
# texts: List[str],
# embedding: Embeddings,
# metadatas: Optional[List[dict]] = None,
# **kwargs: Any,
# ) -> VST:
# return await asyncio.get_running_loop().run_in_executor(
# None, partial(cls.from_texts, **kwargs), texts, embedding, metadatas
# )
def _select_relevance_score_fn(self) -> Callable[[float], float]:
raise NotImplementedError
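
The `_similarity_search_with_score_by_vector` helper in the diff above filters the stored records, scores them by cosine similarity against the query vector, and keeps the top `k`. The same steps can be sketched without NumPy or LangChain. The `cosine` and `top_k` names and the sample records are illustrative only.

```python
# Dependency-free sketch of filter -> cosine score -> top-k selection.
# cosine(), top_k() and the sample records are hypothetical stand-ins for
# the template's in-memory store.
import math
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(
    query: List[float],
    records: List[Record],
    k: int = 4,
    keep: Optional[Callable[[Record], bool]] = None,
) -> List[Tuple[str, float]]:
    """Return up to k (text, score) pairs, highest score first."""
    candidates = [r for r in records if keep is None or keep(r)]
    scored = [(r["text"], cosine(query, r["vector"])) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


records = [
    {"text": "a", "vector": [1.0, 0.0]},
    {"text": "b", "vector": [0.0, 1.0]},
    {"text": "c", "vector": [1.0, 1.0]},
]
print([text for text, _ in top_k([1.0, 0.0], records, k=2)])  # ['a', 'c']
```

Applying the filter before scoring, as the template does, avoids computing similarities for records that would be discarded anyway.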

View File

@@ -19,6 +19,6 @@ class Test__ModuleName__Retriever(RetrieversIntegrationTests):
@property
def retriever_query_example(self) -> str:
"""
Returns a dictionary representing the "args" of an example retriever call.
Returns a str representing the "query" of an example retriever call.
"""
return "example query"

View File

@@ -1,33 +1,16 @@
from typing import AsyncGenerator, Generator
from typing import Generator
import pytest
from __module_name__.vectorstores import __ModuleName__VectorStore
from langchain_core.vectorstores import VectorStore
from langchain_tests.integration_tests import (
AsyncReadWriteTestSuite,
ReadWriteTestSuite,
)
from langchain_tests.integration_tests import VectorStoreIntegrationTests
class Test__ModuleName__VectorStoreSync(ReadWriteTestSuite):
class Test__ModuleName__VectorStore(VectorStoreIntegrationTests):
@pytest.fixture()
def vectorstore(self) -> Generator[VectorStore, None, None]: # type: ignore
"""Get an empty vectorstore for unit tests."""
store = __ModuleName__VectorStore()
# note: store should be EMPTY at this point
# if you need to delete data, you may do so here
try:
yield store
finally:
# cleanup operations, or deleting data
pass
class Test__ModuleName__VectorStoreAsync(AsyncReadWriteTestSuite):
@pytest.fixture()
async def vectorstore(self) -> AsyncGenerator[VectorStore, None]: # type: ignore
"""Get an empty vectorstore for unit tests."""
store = __ModuleName__VectorStore()
store = __ModuleName__VectorStore(self.get_embeddings())
# note: store should be EMPTY at this point
# if you need to delete data, you may do so here
try:

@@ -5,6 +5,7 @@ Manage LangChain apps
import shutil
import subprocess
import sys
import warnings
from pathlib import Path
from typing import Dict, List, Optional, Tuple
@@ -163,6 +164,12 @@ def add(
langchain app add git+ssh://git@github.com/efriis/simple-pirate.git
"""
if not branch and not repo:
warnings.warn(
"Adding templates from the default branch and repo is deprecated."
" At a minimum, you will have to add `--branch v0.2` for this to work"
)
parsed_deps = parse_dependencies(dependencies, repo, branch, api_path)
project_root = get_package_root(project_dir)
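
Note: the guard added in this hunk can be exercised in isolation. The sketch below is a minimal stand-in, assuming a hypothetical helper name `warn_if_default_source`; it reuses the warning text from the diff but is not the CLI's actual `add` command.

```python
import warnings
from typing import Optional


def warn_if_default_source(repo: Optional[str], branch: Optional[str]) -> bool:
    """Warn when neither a repo nor a branch was given (hypothetical helper).

    Returns True if a warning was emitted, False otherwise.
    """
    if not branch and not repo:
        # Same message as in the diff; the category defaults to UserWarning.
        warnings.warn(
            "Adding templates from the default branch and repo is deprecated."
            " At a minimum, you will have to add `--branch v0.2` for this to work"
        )
        return True
    return False
```

Passing either a repo or a branch skips the warning, which matches the `not branch and not repo` condition in the hunk.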

@@ -30,10 +30,12 @@ MODEL_COST_PER_1K_TOKENS = {
"gpt-4o": 0.0025,
"gpt-4o-2024-05-13": 0.005,
"gpt-4o-2024-08-06": 0.0025,
"gpt-4o-2024-11-20": 0.0025,
# GPT-4o output
"gpt-4o-completion": 0.01,
"gpt-4o-2024-05-13-completion": 0.015,
"gpt-4o-2024-08-06-completion": 0.01,
"gpt-4o-2024-11-20-completion": 0.01,
# GPT-4 input
"gpt-4": 0.03,
"gpt-4-0314": 0.03,

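Note: a minimal sketch of how a per-1K-token table like the one in the hunk above can be turned into a request cost. The `PRICES` dict and the `estimate_cost` helper are hypothetical; they only reuse the two `gpt-4o-2024-11-20` rates added here and are not the library's own cost function.

```python
# Hypothetical subset of the pricing table, using the rates added in this hunk
# (USD per 1,000 tokens). Output prices use the "-completion" suffix.
PRICES = {
    "gpt-4o-2024-11-20": 0.0025,
    "gpt-4o-2024-11-20-completion": 0.01,
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one call (illustrative helper)."""
    input_rate = PRICES[model]
    output_rate = PRICES[f"{model}-completion"]
    return input_rate * prompt_tokens / 1000 + output_rate * completion_tokens / 1000
```

With these rates, 1,000 prompt tokens and 500 completion tokens come to roughly 0.0075 USD.
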
@@ -27,8 +27,9 @@ logger = logging.getLogger(__name__)
PINECONE = "Pinecone"
QDRANT = "Qdrant"
PGVECTOR = "PGVector"
PINECONE_VECTOR_STORE = "PineconeVectorStore"
SUPPORTED_VECTORSTORES = {PINECONE, QDRANT, PGVECTOR}
SUPPORTED_VECTORSTORES = {PINECONE, QDRANT, PGVECTOR, PINECONE_VECTOR_STORE}
def clear_enforcement_filters(retriever: VectorStoreRetriever) -> None:
@@ -505,7 +506,7 @@ def _set_identity_enforcement_filter(
of the retriever based on the type of the vectorstore.
"""
search_kwargs = retriever.search_kwargs
if retriever.vectorstore.__class__.__name__ == PINECONE:
if retriever.vectorstore.__class__.__name__ in [PINECONE, PINECONE_VECTOR_STORE]:
_apply_pinecone_authorization_filter(search_kwargs, auth_context)
elif retriever.vectorstore.__class__.__name__ == QDRANT:
_apply_qdrant_authorization_filter(search_kwargs, auth_context)
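
Note: the change above dispatches on the class name, so both the legacy `Pinecone` class and the newer `PineconeVectorStore` take the same branch. A minimal sketch of that idea follows; `filter_family`, its return values, and the stand-in classes are hypothetical, and only the four name constants come from the diff.

```python
PINECONE = "Pinecone"
QDRANT = "Qdrant"
PGVECTOR = "PGVector"
PINECONE_VECTOR_STORE = "PineconeVectorStore"


def filter_family(vectorstore: object) -> str:
    """Map a vector store instance to the filter family it should use."""
    name = vectorstore.__class__.__name__
    # Both Pinecone class names share one authorization-filter implementation.
    if name in (PINECONE, PINECONE_VECTOR_STORE):
        return "pinecone"
    if name == QDRANT:
        return "qdrant"
    if name == PGVECTOR:
        return "pgvector"
    raise ValueError(f"Unsupported vector store: {name}")


# Stand-in classes for illustration; only their names matter for the dispatch.
class Pinecone:
    pass


class PineconeVectorStore:
    pass
```

Matching on `__class__.__name__` avoids importing the optional vector store packages, at the cost of also matching any unrelated class that happens to share the name.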

@@ -11,6 +11,7 @@ from typing import (
Dict,
Iterator,
List,
Literal,
Mapping,
Optional,
Sequence,
@@ -212,6 +213,33 @@ def _convert_message_to_dict(message: BaseMessage) -> dict:
return message_dict
_OPENAI_MODELS = [
"o1-mini",
"o1-preview",
"gpt-4o-mini",
"gpt-4o-mini-2024-07-18",
"gpt-4o",
"gpt-4o-2024-08-06",
"gpt-4o-2024-05-13",
"gpt-4-turbo",
"gpt-4-turbo-preview",
"gpt-4-0125-preview",
"gpt-4-1106-preview",
"gpt-3.5-turbo-1106",
"gpt-3.5-turbo",
"gpt-3.5-turbo-0301",
"gpt-3.5-turbo-0613",
"gpt-3.5-turbo-16k",
"gpt-3.5-turbo-16k-0613",
"gpt-4",
"gpt-4-0314",
"gpt-4-0613",
"gpt-4-32k",
"gpt-4-32k-0314",
"gpt-4-32k-0613",
]
class ChatLiteLLM(BaseChatModel):
"""Chat model that uses the LiteLLM API."""
@@ -465,6 +493,9 @@ class ChatLiteLLM(BaseChatModel):
def bind_tools(
self,
tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
tool_choice: Optional[
Union[dict, str, Literal["auto", "none", "required", "any"], bool]
] = None,
**kwargs: Any,
) -> Runnable[LanguageModelInput, BaseMessage]:
"""Bind tool-like objects to this chat model.
@@ -476,17 +507,47 @@ class ChatLiteLLM(BaseChatModel):
Can be a dictionary, pydantic model, callable, or BaseTool. Pydantic
models, callables, and BaseTools will be automatically converted to
their schema dictionary representation.
tool_choice: Which tool to require the model to call.
Must be the name of the single provided function or
"auto" to automatically determine which function to call
(if any), or a dict of the form:
{"type": "function", "function": {"name": <<tool_name>>}}.
tool_choice: Which tool to require the model to call. Options are:
- str of the form ``"<<tool_name>>"``: calls <<tool_name>> tool.
- ``"auto"``:
automatically selects a tool (including no tool).
- ``"none"``:
does not call a tool.
- ``"any"`` or ``"required"`` or ``True``:
forces at least one tool to be called.
- dict of the form:
``{"type": "function", "function": {"name": <<tool_name>>}}``
- ``False`` or ``None``: no effect
**kwargs: Any additional parameters to pass to the
:class:`~langchain.runnable.Runnable` constructor.
"""
formatted_tools = [convert_to_openai_tool(tool) for tool in tools]
return super().bind(tools=formatted_tools, **kwargs)
# For OpenAI models, if tool_choice is `any` or a bool has been provided, we
# change it to `required`, as that is supported by OpenAI.
if (
(self.model is not None and "azure" in self.model)
or (self.model_name is not None and "azure" in self.model_name)
or (self.model is not None and self.model in _OPENAI_MODELS)
or (self.model_name is not None and self.model_name in _OPENAI_MODELS)
) and (tool_choice == "any" or isinstance(tool_choice, bool)):
tool_choice = "required"
# For non-OpenAI models, a bool tool_choice becomes `any`
elif isinstance(tool_choice, bool):
tool_choice = "any"
elif isinstance(tool_choice, dict):
tool_names = [
formatted_tool["function"]["name"] for formatted_tool in formatted_tools
]
if not any(
tool_name == tool_choice["function"]["name"] for tool_name in tool_names
):
raise ValueError(
f"Tool choice {tool_choice} was specified, but the only "
f"provided tools were {tool_names}."
)
return super().bind(tools=formatted_tools, tool_choice=tool_choice, **kwargs)
@property
def _identifying_params(self) -> Dict[str, Any]:
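
Note: the `tool_choice` normalization in `bind_tools` can be read as a small pure function. The sketch below mirrors the branches in the diff under simplifying assumptions: `normalize_tool_choice` is a hypothetical name, a single `model` string stands in for both `self.model` and `self.model_name`, and `OPENAI_MODELS` is a shortened stand-in for `_OPENAI_MODELS`. As written in the diff, any bool (including `False`) is rewritten, which appears to differ from the docstring's "``False`` or ``None``: no effect"; the sketch keeps the diff's behavior.

```python
from typing import List, Optional, Union

# Shortened, hypothetical stand-in for the `_OPENAI_MODELS` list in the diff.
OPENAI_MODELS = ["gpt-4o", "gpt-4o-mini", "gpt-4"]


def normalize_tool_choice(
    model: Optional[str],
    tool_choice: Optional[Union[dict, str, bool]],
    tool_names: List[str],
) -> Optional[Union[dict, str]]:
    """Return the tool_choice value that would be passed on to `bind`."""
    is_openai_like = model is not None and (
        "azure" in model or model in OPENAI_MODELS
    )
    # OpenAI-style backends understand "required" rather than "any".
    if is_openai_like and (tool_choice == "any" or isinstance(tool_choice, bool)):
        return "required"
    # Other backends: a bool is mapped to "any".
    if isinstance(tool_choice, bool):
        return "any"
    # A dict must name one of the tools that were actually provided.
    if isinstance(tool_choice, dict):
        requested = tool_choice["function"]["name"]
        if requested not in tool_names:
            raise ValueError(
                f"Tool choice {tool_choice} was specified, but the only "
                f"provided tools were {tool_names}."
            )
    return tool_choice
```

Strings such as `"auto"` or a tool name, and `None`, pass through unchanged.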

@@ -13,21 +13,142 @@ from langchain_community.llms.moonshot import MOONSHOT_SERVICE_URL_BASE, Moonsho
class MoonshotChat(MoonshotCommon, ChatOpenAI): # type: ignore[misc, override, override]
"""Moonshot large language models.
"""Moonshot chat model integration.
To use, you should have the ``openai`` python package installed, and the
environment variable ``MOONSHOT_API_KEY`` set with your API key.
(Moonshot's chat API is compatible with OpenAI's SDK.)
Setup:
Install ``openai`` and set the environment variable ``MOONSHOT_API_KEY``.
Referenced from https://platform.moonshot.cn/docs
.. code-block:: bash
Example:
pip install openai
export MOONSHOT_API_KEY="your-api-key"
Key init args — completion params:
model: str
Name of Moonshot model to use.
temperature: float
Sampling temperature.
max_tokens: Optional[int]
Max number of tokens to generate.
Key init args — client params:
api_key: Optional[str]
Moonshot API key. If not passed in, it will be read from the env var MOONSHOT_API_KEY.
api_base: Optional[str]
Base URL for API requests.
See full list of supported init args and their descriptions in the params section.
Instantiate:
.. code-block:: python
from langchain_community.chat_models.moonshot import MoonshotChat
from langchain_community.chat_models import MoonshotChat
moonshot = MoonshotChat(model="moonshot-v1-8k")
"""
chat = MoonshotChat(
temperature=0.5,
api_key="your-api-key",
model="moonshot-v1-8k",
# api_base="...",
# other params...
)
Invoke:
.. code-block:: python
messages = [
("system", "你是一名专业的翻译家,可以将用户的中文翻译为英文。"),
("human", "我喜欢编程。"),
]
chat.invoke(messages)
.. code-block:: python
AIMessage(
content='I like programming.',
additional_kwargs={},
response_metadata={
'token_usage': {
'completion_tokens': 5,
'prompt_tokens': 27,
'total_tokens': 32
},
'model_name': 'moonshot-v1-8k',
'system_fingerprint': None,
'finish_reason': 'stop',
'logprobs': None
},
id='run-71c03f4e-6628-41d5-beb6-d2559ae68266-0'
)
Stream:
.. code-block:: python
for chunk in chat.stream(messages):
print(chunk)
.. code-block:: python
content='' additional_kwargs={} response_metadata={} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
content='I' additional_kwargs={} response_metadata={} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
content=' like' additional_kwargs={} response_metadata={} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
content=' programming' additional_kwargs={} response_metadata={} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
content='.' additional_kwargs={} response_metadata={} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
content='' additional_kwargs={} response_metadata={'finish_reason': 'stop'} id='run-80d77096-8b83-4c39-a84d-71d9c746da92'
.. code-block:: python
stream = chat.stream(messages)
full = next(stream)
for chunk in stream:
full += chunk
full
.. code-block:: python
AIMessageChunk(
content='I like programming.',
additional_kwargs={},
response_metadata={'finish_reason': 'stop'},
id='run-10c80976-7aa5-4ff7-ba3e-1251665557ef'
)
Async:
.. code-block:: python
await chat.ainvoke(messages)
# stream:
# async for chunk in chat.astream(messages):
# print(chunk)
# batch:
# await chat.abatch([messages])
.. code-block:: python
[AIMessage(content='I like programming.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 27, 'total_tokens': 32}, 'model_name': 'moonshot-v1-8k', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-2938b005-9204-4b9f-b273-1c3272fce9e5-0')]
Response metadata
.. code-block:: python
ai_msg = chat.invoke(messages)
ai_msg.response_metadata
.. code-block:: python
{
'token_usage': {
'completion_tokens': 5,
'prompt_tokens': 27,
'total_tokens': 32
},
'model_name': 'moonshot-v1-8k',
'system_fingerprint': None,
'finish_reason': 'stop',
'logprobs': None
}
""" # noqa: E501
@pre_init
def validate_environment(cls, values: Dict) -> Dict:

@@ -166,6 +166,7 @@ class ConfluenceLoader(BaseLoader):
include_archived_content: bool = False,
include_attachments: bool = False,
include_comments: bool = False,
include_labels: bool = False,
content_format: ContentFormat = ContentFormat.STORAGE,
limit: Optional[int] = 50,
max_pages: Optional[int] = 1000,
@@ -181,6 +182,7 @@ class ConfluenceLoader(BaseLoader):
self.include_archived_content = include_archived_content
self.include_attachments = include_attachments
self.include_comments = include_comments
self.include_labels = include_labels
self.content_format = content_format
self.limit = limit
self.max_pages = max_pages
@@ -327,12 +329,20 @@ class ConfluenceLoader(BaseLoader):
)
include_attachments = self._resolve_param("include_attachments", kwargs)
include_comments = self._resolve_param("include_comments", kwargs)
include_labels = self._resolve_param("include_labels", kwargs)
content_format = self._resolve_param("content_format", kwargs)
limit = self._resolve_param("limit", kwargs)
max_pages = self._resolve_param("max_pages", kwargs)
ocr_languages = self._resolve_param("ocr_languages", kwargs)
keep_markdown_format = self._resolve_param("keep_markdown_format", kwargs)
keep_newlines = self._resolve_param("keep_newlines", kwargs)
expand = ",".join(
[
content_format.value,
"version",
*(["metadata.labels"] if include_labels else []),
]
)
if not space_key and not page_ids and not label and not cql:
raise ValueError(
@@ -347,13 +357,14 @@ class ConfluenceLoader(BaseLoader):
limit=limit,
max_pages=max_pages,
status="any" if include_archived_content else "current",
expand=f"{content_format.value},version",
expand=expand,
)
yield from self.process_pages(
pages,
include_restricted_content,
include_attachments,
include_comments,
include_labels,
content_format,
ocr_languages=ocr_languages,
keep_markdown_format=keep_markdown_format,
@@ -380,13 +391,14 @@ class ConfluenceLoader(BaseLoader):
limit=limit,
max_pages=max_pages,
include_archived_spaces=include_archived_content,
expand=f"{content_format.value},version",
expand=expand,
)
yield from self.process_pages(
pages,
include_restricted_content,
include_attachments,
include_comments,
False, # labels are not included in the search results
content_format,
ocr_languages,
keep_markdown_format,
@@ -408,7 +420,8 @@ class ConfluenceLoader(BaseLoader):
before_sleep=before_sleep_log(logger, logging.WARNING),
)(self.confluence.get_page_by_id)
page = get_page(
page_id=page_id, expand=f"{content_format.value},version"
page_id=page_id,
expand=expand,
)
if not include_restricted_content and not self.is_public_page(page):
continue
@@ -416,6 +429,7 @@ class ConfluenceLoader(BaseLoader):
page,
include_attachments,
include_comments,
include_labels,
content_format,
ocr_languages,
keep_markdown_format,
@@ -498,6 +512,7 @@ class ConfluenceLoader(BaseLoader):
include_restricted_content: bool,
include_attachments: bool,
include_comments: bool,
include_labels: bool,
content_format: ContentFormat,
ocr_languages: Optional[str] = None,
keep_markdown_format: Optional[bool] = False,
@@ -511,6 +526,7 @@ class ConfluenceLoader(BaseLoader):
page,
include_attachments,
include_comments,
include_labels,
content_format,
ocr_languages=ocr_languages,
keep_markdown_format=keep_markdown_format,
@@ -522,6 +538,7 @@ class ConfluenceLoader(BaseLoader):
page: dict,
include_attachments: bool,
include_comments: bool,
include_labels: bool,
content_format: ContentFormat,
ocr_languages: Optional[str] = None,
keep_markdown_format: Optional[bool] = False,
@@ -575,10 +592,19 @@ class ConfluenceLoader(BaseLoader):
]
text = text + "".join(comment_texts)
if include_labels:
labels = [
label["name"]
for label in page.get("metadata", {})
.get("labels", {})
.get("results", [])
]
metadata = {
"title": page["title"],
"id": page["id"],
"source": self.base_url.strip("/") + page["_links"]["webui"],
**({"labels": labels} if include_labels else {}),
}
if "version" in page and "when" in page["version"]:

@@ -145,6 +145,9 @@ if TYPE_CHECKING:
from langchain_community.embeddings.mlflow_gateway import (
MlflowAIGatewayEmbeddings,
)
from langchain_community.embeddings.model2vec import (
Model2vecEmbeddings,
)
from langchain_community.embeddings.modelscope_hub import (
ModelScopeEmbeddings,
)
@@ -289,6 +292,7 @@ __all__ = [
"MlflowAIGatewayEmbeddings",
"MlflowCohereEmbeddings",
"MlflowEmbeddings",
"Model2vecEmbeddings",
"ModelScopeEmbeddings",
"MosaicMLInstructorEmbeddings",
"NLPCloudEmbeddings",
@@ -372,6 +376,7 @@ _module_lookup = {
"MlflowAIGatewayEmbeddings": "langchain_community.embeddings.mlflow_gateway",
"MlflowCohereEmbeddings": "langchain_community.embeddings.mlflow",
"MlflowEmbeddings": "langchain_community.embeddings.mlflow",
"Model2vecEmbeddings": "langchain_community.embeddings.model2vec",
"ModelScopeEmbeddings": "langchain_community.embeddings.modelscope_hub",
"MosaicMLInstructorEmbeddings": "langchain_community.embeddings.mosaicml",
"NLPCloudEmbeddings": "langchain_community.embeddings.nlpcloud",

@@ -0,0 +1,66 @@
"""Wrapper around model2vec embedding models."""
from typing import List
from langchain_core.embeddings import Embeddings
class Model2vecEmbeddings(Embeddings):
"""model2vec embedding models.
Install model2vec first by running 'pip install -U model2vec'.
The GitHub repository for model2vec is: https://github.com/MinishLab/model2vec
Example:
.. code-block:: python
from langchain_community.embeddings import Model2vecEmbeddings
embedding = Model2vecEmbeddings("minishlab/potion-base-8M")
embedding.embed_documents([
"It's dangerous to go alone!",
"It's a secret to everybody.",
])
embedding.embed_query(
"Take this with you."
)
"""
def __init__(self, model: str):
"""Initialize embeddings.
Args:
model: Model name.
"""
try:
from model2vec import StaticModel
except ImportError as e:
raise ImportError(
"Unable to import model2vec, please install with "
"`pip install -U model2vec`."
) from e
self._model = StaticModel.from_pretrained(model)
def embed_documents(self, texts: List[str]) -> List[List[float]]:
"""Embed documents using the model2vec embeddings model.
Args:
texts: The list of texts to embed.
Returns:
List of embeddings, one for each text.
"""
return self._model.encode_as_sequence(texts)
def embed_query(self, text: str) -> List[float]:
"""Embed a query using the model2vec embeddings model.
Args:
text: The text to embed.
Returns:
Embeddings for the text.
"""
return self._model.encode(text)
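
Note: the constructor above defers the `model2vec` import until the class is instantiated and turns a missing package into an actionable error. A minimal, generic sketch of that pattern is shown below; `import_optional` is a hypothetical helper, not part of `langchain_community`.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, or raise with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Chain the original error so the root cause stays visible.
        raise ImportError(
            f"Unable to import {module_name}, please install with "
            f"`pip install -U {pip_name}`."
        ) from e
```

Keeping the import inside the constructor means the package itself stays importable for users who never install the optional dependency.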

File diff suppressed because it is too large

@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "langchain-community"
version = "0.3.9"
version = "0.3.10"
description = "Community contributed LangChain integrations."
authors = []
license = "MIT"
@@ -30,8 +30,8 @@ ignore-words-list = "momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogy
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
langchain-core = "^0.3.21"
langchain = "^0.3.8"
langchain-core = "^0.3.22"
langchain = "^0.3.10"
SQLAlchemy = ">=1.4,<3"
requests = "^2"
PyYAML = ">=5.3"

@@ -3,27 +3,15 @@
import uuid
import pytest
from langchain_tests.integration_tests.vectorstores import (
AsyncReadWriteTestSuite,
ReadWriteTestSuite,
)
from langchain_tests.integration_tests.vectorstores import VectorStoreIntegrationTests
from langchain_community.vectorstores import ApertureDB
class TestApertureDBReadWriteTestSuite(ReadWriteTestSuite):
class TestApertureStandard(VectorStoreIntegrationTests):
@pytest.fixture
def vectorstore(self) -> ApertureDB:
descriptor_set = uuid.uuid4().hex # Fresh descriptor set for each test
return ApertureDB(
embeddings=self.get_embeddings(), descriptor_set=descriptor_set
)
class TestAsyncApertureDBReadWriteTestSuite(AsyncReadWriteTestSuite):
@pytest.fixture
async def vectorstore(self) -> ApertureDB:
descriptor_set = uuid.uuid4().hex # Fresh descriptor set for each test
return ApertureDB(
embeddings=self.get_embeddings(), descriptor_set=descriptor_set
)

@@ -195,6 +195,36 @@ class TestConfluenceLoader:
assert mock_confluence.cql.call_count == 0
assert mock_confluence.get_page_child_by_type.call_count == 0
@pytest.mark.requires("markdownify")
def test_confluence_loader_when_include_labels_set_to_true(
self, mock_confluence: MagicMock
) -> None:
# one response with two pages
mock_confluence.get_all_pages_from_space.return_value = [
self._get_mock_page("123", include_labels=True),
self._get_mock_page("456", include_labels=False),
]
mock_confluence.get_all_restrictions_for_content.side_effect = [
self._get_mock_page_restrictions("123"),
self._get_mock_page_restrictions("456"),
]
confluence_loader = self._get_mock_confluence_loader(
mock_confluence,
space_key=self.MOCK_SPACE_KEY,
include_labels=True,
max_pages=2,
)
documents = confluence_loader.load()
assert mock_confluence.get_all_pages_from_space.call_count == 1
assert len(documents) == 2
assert all(isinstance(doc, Document) for doc in documents)
assert documents[0].metadata["labels"] == ["l1", "l2"]
assert documents[1].metadata["labels"] == []
def _get_mock_confluence_loader(
self, mock_confluence: MagicMock, **kwargs: Any
) -> ConfluenceLoader:
@@ -208,7 +238,10 @@ class TestConfluenceLoader:
return confluence_loader
def _get_mock_page(
self, page_id: str, content_format: ContentFormat = ContentFormat.STORAGE
self,
page_id: str,
content_format: ContentFormat = ContentFormat.STORAGE,
include_labels: bool = False,
) -> Dict:
return {
"id": f"{page_id}",
@@ -216,6 +249,20 @@ class TestConfluenceLoader:
"body": {
f"{content_format.name.lower()}": {"value": f"<p>Content {page_id}</p>"}
},
**(
{
"metadata": {
"labels": {
"results": [
{"prefix": "global", "name": "l1", "id": "111"},
{"prefix": "global", "name": "l2", "id": "222"},
]
}
}
if include_labels
else {},
}
),
"status": "current",
"type": "page",
"_links": {

@@ -26,6 +26,7 @@ EXPECTED_ALL = [
"MlflowAIGatewayEmbeddings",
"MlflowEmbeddings",
"MlflowCohereEmbeddings",
"Model2vecEmbeddings",
"ModelScopeEmbeddings",
"TensorflowHubEmbeddings",
"SagemakerEndpointEmbeddings",

@@ -0,0 +1,11 @@
from langchain_community.embeddings.model2vec import Model2vecEmbeddings
def test_hugginggface_inferenceapi_embedding_documents_init() -> None:
"""Test model2vec embeddings."""
try:
embedding = Model2vecEmbeddings("minishlab/potion-base-8M")
assert len(embedding.embed_query("hi")) == 256
except Exception:
# model2vec is not installed
assert True

@@ -3,10 +3,7 @@ from typing import Any
import pytest
from langchain_core.documents import Document
from langchain_tests.integration_tests.vectorstores import (
AsyncReadWriteTestSuite,
ReadWriteTestSuite,
)
from langchain_tests.integration_tests.vectorstores import VectorStoreIntegrationTests
from langchain_community.vectorstores.inmemory import InMemoryVectorStore
from tests.integration_tests.vectorstores.fake_embeddings import (
@@ -26,18 +23,12 @@ def _AnyDocument(**kwargs: Any) -> Document:
return doc
class TestInMemoryReadWriteTestSuite(ReadWriteTestSuite):
class TestInMemoryStandard(VectorStoreIntegrationTests):
@pytest.fixture
def vectorstore(self) -> InMemoryVectorStore:
return InMemoryVectorStore(embedding=self.get_embeddings())
class TestAsyncInMemoryReadWriteTestSuite(AsyncReadWriteTestSuite):
@pytest.fixture
async def vectorstore(self) -> InMemoryVectorStore:
return InMemoryVectorStore(embedding=self.get_embeddings())
async def test_inmemory() -> None:
"""Test end to end construction and search."""
store = await InMemoryVectorStore.afrom_texts(

@@ -27,7 +27,7 @@ from inspect import signature
from typing import TYPE_CHECKING, Any, Optional
from pydantic import ConfigDict
from typing_extensions import TypedDict
from typing_extensions import Self, TypedDict
from langchain_core._api import deprecated
from langchain_core.documents import Document
@@ -180,6 +180,18 @@ class BaseRetriever(RunnableSerializable[RetrieverInput, RetrieverOutput], ABC):
cls._aget_relevant_documents = aswap # type: ignore[assignment]
parameters = signature(cls._get_relevant_documents).parameters
cls._new_arg_supported = parameters.get("run_manager") is not None
if (
not cls._new_arg_supported
and cls._aget_relevant_documents == BaseRetriever._aget_relevant_documents
):
# we need to tolerate no run_manager in _aget_relevant_documents signature
async def _aget_relevant_documents(
self: Self, query: str
) -> list[Document]:
return await run_in_executor(None, self._get_relevant_documents, query) # type: ignore
cls._aget_relevant_documents = _aget_relevant_documents # type: ignore[assignment]
# If a V1 retriever broke the interface and expects additional arguments
cls._expects_other_args = (
len(set(parameters.keys()) - {"self", "query", "run_manager"}) > 0
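
Note: the hunk above patches the async method when a subclass overrides only the sync method and that override does not accept `run_manager`. The toy classes below sketch the same mechanism with the standard library only; `MiniRetriever`, `_get_docs`, and `_aget_docs` are hypothetical names, not LangChain's `BaseRetriever` API.

```python
import asyncio
from functools import partial
from inspect import signature
from typing import List


class MiniRetriever:
    """Toy base class illustrating the fallback (not LangChain's BaseRetriever)."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        accepts_run_manager = "run_manager" in signature(cls._get_docs).parameters
        inherits_async = cls._aget_docs is MiniRetriever._aget_docs
        if not accepts_run_manager and inherits_async:
            # The default async path would pass run_manager and fail, so install
            # a variant that calls the sync method with the query only.
            async def _aget_docs(self, query: str) -> List[str]:
                loop = asyncio.get_running_loop()
                return await loop.run_in_executor(None, self._get_docs, query)

            cls._aget_docs = _aget_docs

    def _get_docs(self, query: str, *, run_manager=None) -> List[str]:
        raise NotImplementedError

    async def _aget_docs(self, query: str, *, run_manager=None) -> List[str]:
        # Default: run the sync method in a worker thread, forwarding run_manager.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self._get_docs, run_manager=run_manager), query
        )


class LegacyRetriever(MiniRetriever):
    # Old-style signature without run_manager.
    def _get_docs(self, query: str) -> List[str]:
        return [f"doc for {query}"]
```

Without the patch in `__init_subclass__`, awaiting `LegacyRetriever()._aget_docs("q")` would reach the default implementation and raise a `TypeError` for the unexpected `run_manager` keyword.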

@@ -470,14 +470,22 @@ class Graph:
"""Remove the first node if it exists and has a single outgoing edge,
i.e., if removing it would not leave the graph without a "first" node."""
first_node = self.first_node()
if first_node and _first_node(self, exclude=[first_node.id]):
if (
first_node
and _first_node(self, exclude=[first_node.id])
and len({e for e in self.edges if e.source == first_node.id}) == 1
):
self.remove_node(first_node)
def trim_last_node(self) -> None:
"""Remove the last node if it exists and has a single incoming edge,
i.e., if removing it would not leave the graph without a "last" node."""
last_node = self.last_node()
if last_node and _last_node(self, exclude=[last_node.id]):
if (
last_node
and _last_node(self, exclude=[last_node.id])
and len({e for e in self.edges if e.target == last_node.id}) == 1
):
self.remove_node(last_node)
def draw_ascii(self) -> str:
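
Note: the extra condition added to `trim_first_node` can be illustrated on a plain edge set. The sketch below is a simplified model, assuming edges are `(source, target)` tuples and using a hypothetical `can_trim_first` function; it approximates the `_first_node` check rather than reproducing it.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def can_trim_first(nodes: List[str], edges: Set[Edge], first: str) -> bool:
    """Decide whether `first` may be removed under the rule in the diff."""
    # New condition: the first node must have exactly one outgoing edge.
    outgoing = {e for e in edges if e[0] == first}
    if len(outgoing) != 1:
        return False
    # Existing condition: after removal, exactly one node has no incoming edge.
    remaining_edges = {e for e in edges if first not in e}
    targets = {target for _, target in remaining_edges}
    candidates = [n for n in nodes if n != first and n not in targets]
    return len(candidates) == 1
```

For the graph in `test_trim_multi_edge` (`__start__ -> a`, `a -> __end__`, `__start__ -> __end__`), the second condition alone would still find `a` as the unique new first node, so only the outgoing-edge count keeps `__start__` in place.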

@@ -609,7 +609,7 @@ class ChildTool(BaseTool):
run_id: The id of the run. Defaults to None.
config: The configuration for the tool. Defaults to None.
tool_call_id: The id of the tool call. Defaults to None.
kwargs: Additional arguments to pass to the tool
kwargs: Keyword arguments to be passed to tool callbacks
Returns:
The output of the tool.
@@ -721,7 +721,7 @@ class ChildTool(BaseTool):
run_id: The id of the run. Defaults to None.
config: The configuration for the tool. Defaults to None.
tool_call_id: The id of the tool call. Defaults to None.
kwargs: Additional arguments to pass to the tool
kwargs: Keyword arguments to be passed to tool callbacks
Returns:
The output of the tool.

@@ -1,10 +1,10 @@
[build-system]
requires = ["poetry-core>=1.0.0"]
requires = [ "poetry-core>=1.0.0",]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "langchain-core"
version = "0.3.21"
version = "0.3.22"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"
@@ -12,16 +12,10 @@ readme = "README.md"
repository = "https://github.com/langchain-ai/langchain"
[tool.mypy]
exclude = [
"notebooks",
"examples",
"example_data",
"langchain_core/pydantic",
"tests/unit_tests/utils/test_function_calling.py",
]
exclude = [ "notebooks", "examples", "example_data", "langchain_core/pydantic", "tests/unit_tests/utils/test_function_calling.py",]
disallow_untyped_defs = "True"
[[tool.mypy.overrides]]
module = ["numpy", "pytest"]
module = [ "numpy", "pytest",]
ignore_missing_imports = true
[tool.ruff]
@@ -50,53 +44,17 @@ python = ">=3.12.4"
[tool.poetry.extras]
[tool.ruff.lint]
select = [
"ASYNC",
"B",
"C4",
"COM",
"DJ",
"E",
"EM",
"EXE",
"F",
"FLY",
"FURB",
"I",
"ICN",
"INT",
"LOG",
"N",
"NPY",
"PD",
"PIE",
"Q",
"RSE",
"S",
"SIM",
"SLOT",
"T10",
"T201",
"TID",
"UP",
"W",
"YTT",
]
ignore = ["COM812", "UP007", "W293", "S101", "S110", "S112"]
select = [ "ASYNC", "B", "C4", "COM", "DJ", "E", "EM", "EXE", "F", "FLY", "FURB", "I", "ICN", "INT", "LOG", "N", "NPY", "PD", "PIE", "Q", "RSE", "S", "SIM", "SLOT", "T10", "T201", "TID", "UP", "W", "YTT",]
ignore = [ "COM812", "UP007", "W293", "S101", "S110", "S112",]
[tool.coverage.run]
omit = ["tests/*"]
omit = [ "tests/*",]
[tool.pytest.ini_options]
addopts = "--snapshot-warn-unused --strict-markers --strict-config --durations=5"
markers = [
"requires: mark tests as requiring a specific library",
"compile: mark placeholder test used to compile integration tests without running them",
]
markers = [ "requires: mark tests as requiring a specific library", "compile: mark placeholder test used to compile integration tests without running them",]
asyncio_mode = "auto"
filterwarnings = [
"ignore::langchain_core._api.beta_decorator.LangChainBetaWarning",
]
filterwarnings = [ "ignore::langchain_core._api.beta_decorator.LangChainBetaWarning",]
[tool.poetry.group.lint]
optional = true
@@ -114,37 +72,29 @@ optional = true
optional = true
[tool.ruff.lint.pep8-naming]
classmethod-decorators = [
"classmethod",
"langchain_core.utils.pydantic.pre_init",
"pydantic.field_validator",
"pydantic.v1.root_validator",
]
classmethod-decorators = [ "classmethod", "langchain_core.utils.pydantic.pre_init", "pydantic.field_validator", "pydantic.v1.root_validator",]
[tool.ruff.lint.per-file-ignores]
"tests/unit_tests/prompts/test_chat.py" = ["E501"]
"tests/unit_tests/runnables/test_runnable.py" = ["E501"]
"tests/unit_tests/runnables/test_graph.py" = ["E501"]
"tests/**" = ["S"]
"scripts/**" = ["S"]
"tests/unit_tests/prompts/test_chat.py" = [ "E501",]
"tests/unit_tests/runnables/test_runnable.py" = [ "E501",]
"tests/unit_tests/runnables/test_graph.py" = [ "E501",]
"tests/**" = [ "S",]
"scripts/**" = [ "S",]
[tool.poetry.group.lint.dependencies]
ruff = "^0.5"
[tool.poetry.group.typing.dependencies]
mypy = ">=1.10,<1.11"
types-pyyaml = "^6.0.12.2"
types-requests = "^2.28.11.5"
types-jinja2 = "^2.11.9"
[tool.poetry.group.dev.dependencies]
jupyter = "^1.0.0"
setuptools = "^67.6.1"
grandalf = "^0.8"
[tool.poetry.group.test.dependencies]
pytest = "^8"
freezegun = "^1.2.2"
@@ -163,15 +113,12 @@ python = "<3.12"
version = ">=1.26.0,<3"
python = ">=3.12"
[tool.poetry.group.test_integration.dependencies]
[tool.poetry.group.typing.dependencies.langchain-text-splitters]
path = "../text-splitters"
develop = true
[tool.poetry.group.test.dependencies.langchain-tests]
path = "../standard-tests"
develop = true

@@ -10,7 +10,11 @@ ROOT = HERE.parent.parent.parent
def test_as_import_path() -> None:
"""Test that the path is converted to a LangChain import path."""
# Verify that default paths are correct
assert path.PACKAGE_DIR == ROOT / "langchain_core"
# if editable install, check directory structure
if path.PACKAGE_DIR == ROOT / "langchain_core":
assert path.PACKAGE_DIR == ROOT / "langchain_core"
# Verify that as import path works correctly
assert path.as_import_path(HERE, relative_to=ROOT) == "tests.unit_tests._api"
assert (

@@ -69,6 +69,26 @@ def test_trim(snapshot: SnapshotAssertion) -> None:
assert graph.last_node() is end
def test_trim_multi_edge() -> None:
class Scheme(BaseModel):
a: str
graph = Graph()
start = graph.add_node(Scheme, id="__start__")
a = graph.add_node(Scheme, id="a")
last = graph.add_node(Scheme, id="__end__")
graph.add_edge(start, a)
graph.add_edge(a, last)
graph.add_edge(start, last)
graph.trim_first_node() # should not remove __start__ since it has 2 outgoing edges
assert graph.first_node() is start
graph.trim_last_node() # should not remove the __end__ node since it has 2 incoming edges
assert graph.last_node() is last
def test_graph_sequence(snapshot: SnapshotAssertion) -> None:
fake_llm = FakeListLLM(responses=["a"])
prompt = PromptTemplate.from_template("Hello, {name}!")

@@ -2,10 +2,7 @@ from pathlib import Path
from unittest.mock import AsyncMock, Mock
import pytest
from langchain_tests.integration_tests.vectorstores import (
AsyncReadWriteTestSuite,
ReadWriteTestSuite,
)
from langchain_tests.integration_tests.vectorstores import VectorStoreIntegrationTests
from langchain_core.documents import Document
from langchain_core.embeddings.fake import DeterministicFakeEmbedding
@@ -13,18 +10,12 @@ from langchain_core.vectorstores import InMemoryVectorStore
from tests.unit_tests.stubs import _any_id_document
class TestInMemoryReadWriteTestSuite(ReadWriteTestSuite):
class TestInMemoryStandard(VectorStoreIntegrationTests):
@pytest.fixture
def vectorstore(self) -> InMemoryVectorStore:
return InMemoryVectorStore(embedding=self.get_embeddings())
class TestAsyncInMemoryReadWriteTestSuite(AsyncReadWriteTestSuite):
@pytest.fixture
async def vectorstore(self) -> InMemoryVectorStore:
return InMemoryVectorStore(embedding=self.get_embeddings())
async def test_inmemory_similarity_search() -> None:
"""Test end to end similarity search."""
store = await InMemoryVectorStore.afrom_texts(

File diff suppressed because it is too large

@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "langchain"
version = "0.3.9"
version = "0.3.10"
description = "Building applications with LLMs through composability"
authors = []
license = "MIT"
@@ -33,7 +33,7 @@ langchain-server = "langchain.server:main"
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
langchain-core = "^0.3.21"
langchain-core = "^0.3.22"
langchain-text-splitters = "^0.3.0"
langsmith = "^0.1.17"
pydantic = "^2.7.4"

@@ -68,6 +68,9 @@ packages:
- name: langchain-qdrant
repo: langchain-ai/langchain
path: libs/partners/qdrant
- name: langchain-scrapegraph
repo: ScrapeGraphAI/langchain-scrapegraph
path: .
- name: langchain-sema4
repo: langchain-ai/langchain-sema4
path: libs/sema4

@@ -1,16 +1,13 @@
from typing import AsyncGenerator, Generator
from typing import Generator
import pytest
from langchain_core.vectorstores import VectorStore
from langchain_tests.integration_tests.vectorstores import (
AsyncReadWriteTestSuite,
ReadWriteTestSuite,
)
from langchain_tests.integration_tests.vectorstores import VectorStoreIntegrationTests
from langchain_chroma import Chroma
class TestSync(ReadWriteTestSuite):
class TestChromaStandard(VectorStoreIntegrationTests):
@pytest.fixture()
def vectorstore(self) -> Generator[VectorStore, None, None]: # type: ignore
"""Get an empty vectorstore for unit tests."""
@@ -20,15 +17,3 @@ class TestSync(ReadWriteTestSuite):
finally:
store.delete_collection()
pass
class TestAsync(AsyncReadWriteTestSuite):
@pytest.fixture()
async def vectorstore(self) -> AsyncGenerator[VectorStore, None]: # type: ignore
"""Get an empty vectorstore for unit tests."""
store = Chroma(embedding_function=self.get_embeddings())
try:
yield store
finally:
store.delete_collection()
pass

@@ -4,6 +4,7 @@ from typing import Type
import pytest
from langchain_core.language_models import BaseChatModel
from langchain_core.tools import BaseTool
from langchain_tests.integration_tests import ( # type: ignore[import-not-found]
ChatModelIntegrationTests, # type: ignore[import-not-found]
)
@@ -24,5 +25,7 @@ class TestFireworksStandard(ChatModelIntegrationTests):
}
@pytest.mark.xfail(reason="Not yet implemented.")
def test_tool_message_histories_list_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_list_content(model)
def test_tool_message_histories_list_content(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_tool_message_histories_list_content(model, my_adder_tool)

View File

@@ -5,6 +5,7 @@ from typing import Optional, Type
import pytest
from langchain_core.language_models import BaseChatModel
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_core.tools import BaseTool
from langchain_tests.integration_tests import (
ChatModelIntegrationTests,
)
@@ -20,8 +21,10 @@ class BaseTestGroq(ChatModelIntegrationTests):
return ChatGroq
@pytest.mark.xfail(reason="Not yet implemented.")
def test_tool_message_histories_list_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_list_content(model)
def test_tool_message_histories_list_content(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_tool_message_histories_list_content(model, my_adder_tool)
class TestGroqLlama(BaseTestGroq):
@@ -47,8 +50,10 @@ class TestGroqLlama(BaseTestGroq):
@pytest.mark.xfail(
reason=("Fails with 'Failed to call a function. Please adjust your prompt.'")
)
def test_tool_message_histories_string_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_string_content(model)
def test_tool_message_histories_string_content(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_tool_message_histories_string_content(model, my_adder_tool)
@pytest.mark.xfail(
reason=(

View File

@@ -595,7 +595,7 @@ class ChatMistralAI(BaseChatModel):
for chunk in self.completion_with_retry(
messages=message_dicts, run_manager=run_manager, **params
):
if len(chunk["choices"]) == 0:
if len(chunk.get("choices", [])) == 0:
continue
new_chunk = _convert_chunk_to_message_chunk(chunk, default_chunk_class)
# make future chunks same type as first chunk
@@ -621,7 +621,7 @@ class ChatMistralAI(BaseChatModel):
async for chunk in await acompletion_with_retry(
self, messages=message_dicts, run_manager=run_manager, **params
):
if len(chunk["choices"]) == 0:
if len(chunk.get("choices", [])) == 0:
continue
new_chunk = _convert_chunk_to_message_chunk(chunk, default_chunk_class)
# make future chunks same type as first chunk
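
The change from `chunk["choices"]` to `chunk.get("choices", [])` matters when a streamed chunk has no `choices` key at all: indexing raises `KeyError`, while `.get` with a default lets the loop skip the chunk. The sketch below isolates that filter with plain dicts. `iter_nonempty_chunks` and the sample payloads are hypothetical and only illustrate the guard; they are not taken from the Mistral API.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_nonempty_chunks(
    chunks: Iterable[Dict[str, Any]],
) -> Iterator[Dict[str, Any]]:
    """Skip chunks whose "choices" list is empty or missing."""
    for chunk in chunks:
        if len(chunk.get("choices", [])) == 0:
            continue
        yield chunk


# Hypothetical stream: an empty list, a chunk with no "choices" key, real content.
stream = [
    {"choices": []},
    {"usage": {"total_tokens": 3}},
    {"choices": [{"delta": {"content": "hi"}}]},
]
kept = list(iter_nonempty_chunks(stream))
```

With the old indexing form, the second chunk in this sample would raise `KeyError` before any content was produced.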

View File

@@ -495,7 +495,7 @@ files = [
[[package]]
name = "langchain-core"
version = "0.3.21"
version = "0.3.22"
description = "Building applications with LLMs through composability"
optional = false
python-versions = ">=3.9,<4.0"
@@ -520,7 +520,7 @@ url = "../../core"
[[package]]
name = "langchain-tests"
version = "0.3.4"
version = "0.3.6"
description = "Standard tests for LangChain implementations"
optional = false
python-versions = ">=3.9,<4.0"
@@ -528,9 +528,15 @@ files = []
develop = true
[package.dependencies]
httpx = "^0.27.0"
langchain-core = "^0.3.19"
httpx = ">=0.25.0,<1"
langchain-core = "^0.3.22"
numpy = [
{version = ">=1.24.0,<2.0.0", markers = "python_version < \"3.12\""},
{version = ">=1.26.2,<3", markers = "python_version >= \"3.12\""},
]
pytest = ">=7,<9"
pytest-asyncio = ">=0.20,<1"
pytest-socket = ">=0.6.0,<1"
syrupy = "^4"
[package.source]
@@ -1639,4 +1645,4 @@ watchmedo = ["PyYAML (>=3.10)"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.9,<4.0"
content-hash = "ded25b72c77fad9a869f3308c1bba084b58f54eb13df2785f061bc340d6ec748"
content-hash = "6fb8c9f98c76ba402d53234ac2ac78bcebafbe818e64cd849e0ae26cafcd5ba4"

View File

@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "langchain-openai"
version = "0.2.11"
version = "0.2.12"
description = "An integration package connecting OpenAI and LangChain"
authors = []
readme = "README.md"
@@ -24,7 +24,7 @@ ignore_missing_imports = true
[tool.poetry.dependencies]
python = ">=3.9,<4.0"
langchain-core = "^0.3.21"
openai = "^1.54.0"
openai = "^1.55.3"
tiktoken = ">=0.7,<1"
[tool.ruff.lint]

View File

@@ -4,6 +4,7 @@ from typing import Tuple, Type
import pytest
from langchain_core.language_models import BaseChatModel
from langchain_core.tools import BaseTool
from langchain_tests.unit_tests import ChatModelUnitTests
from langchain_openai import AzureChatOpenAI
@@ -23,8 +24,10 @@ class TestOpenAIStandard(ChatModelUnitTests):
}
@pytest.mark.xfail(reason="AzureOpenAI does not support tool_choice='any'")
def test_bind_tool_pydantic(self, model: BaseChatModel) -> None:
super().test_bind_tool_pydantic(model)
def test_bind_tool_pydantic(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_bind_tool_pydantic(model, my_adder_tool)
@property
def init_from_env_params(self) -> Tuple[dict, dict, dict]:

View File

@@ -5,6 +5,7 @@ from typing import Optional, Type
import pytest # type: ignore[import-not-found]
from langchain_core.language_models import BaseChatModel
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_core.tools import BaseTool
from langchain_tests.integration_tests import ( # type: ignore[import-not-found]
ChatModelIntegrationTests, # type: ignore[import-not-found]
)
@@ -40,13 +41,19 @@ class TestXAIStandard(ChatModelIntegrationTests):
super().test_usage_metadata_streaming(model)
@pytest.mark.xfail(reason="Can't handle AIMessage with empty content.")
def test_tool_message_error_status(self, model: BaseChatModel) -> None:
super().test_tool_message_error_status(model)
def test_tool_message_error_status(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_tool_message_error_status(model, my_adder_tool)
@pytest.mark.xfail(reason="Can't handle AIMessage with empty content.")
def test_structured_few_shot_examples(self, model: BaseChatModel) -> None:
super().test_structured_few_shot_examples(model)
def test_structured_few_shot_examples(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_structured_few_shot_examples(model, my_adder_tool)
@pytest.mark.xfail(reason="Can't handle AIMessage with empty content.")
def test_tool_message_histories_string_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_string_content(model)
def test_tool_message_histories_string_content(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
super().test_tool_message_histories_string_content(model, my_adder_tool)
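
The overrides in these test files change because the base test methods now take a second argument, `my_adder_tool`. An override that still forwards only `model` would fail with a `TypeError` when it calls `super()`. The sketch below reproduces that with plain classes. `StandardTests`, `OldOverride`, `UpdatedOverride` and `test_tool_history` are hypothetical names, and the example calls the methods directly; it does not model pytest's fixture injection, which resolves fixtures by parameter name.

```python
from typing import List


class StandardTests:
    """Hypothetical stand-in for a standard test base class."""

    def test_tool_history(self, model: str, my_adder_tool: str) -> List[str]:
        # The base test takes an extra argument after the signature change.
        return [model, my_adder_tool]


class OldOverride(StandardTests):
    # Pre-change signature: forwards only `model`.
    def test_tool_history(self, model: str) -> List[str]:  # type: ignore[override]
        return super().test_tool_history(model)  # type: ignore[call-arg]


class UpdatedOverride(StandardTests):
    # Post-change signature: accepts and forwards both arguments.
    def test_tool_history(self, model: str, my_adder_tool: str) -> List[str]:
        return super().test_tool_history(model, my_adder_tool)
```

Calling `OldOverride().test_tool_history("m")` raises `TypeError`, which is why each override in the diff gains the extra parameter.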

View File

@@ -0,0 +1,7 @@
"""
Base Test classes for standard testing.
To learn how to use these classes, see the
`Integration standard testing <https://python.langchain.com/docs/contributing/how_to/integrations/standard_tests/>`_
guide.
"""

View File

@@ -3,9 +3,15 @@ from typing import Type
class BaseStandardTests(ABC):
"""
:private:
"""
def test_no_overrides_DO_NOT_OVERRIDE(self) -> None:
"""
Test that no standard tests are overridden.
:private:
"""
# find path to standard test implementations
comparison_class = None

View File

@@ -23,7 +23,7 @@ from .chat_models import ChatModelIntegrationTests
from .embeddings import EmbeddingsIntegrationTests
from .retrievers import RetrieversIntegrationTests
from .tools import ToolsIntegrationTests
from .vectorstores import AsyncReadWriteTestSuite, ReadWriteTestSuite
from .vectorstores import VectorStoreIntegrationTests
__all__ = [
"ChatModelIntegrationTests",
@@ -33,7 +33,6 @@ __all__ = [
"BaseStoreSyncTests",
"AsyncCacheTestSuite",
"SyncCacheTestSuite",
"AsyncReadWriteTestSuite",
"ReadWriteTestSuite",
"VectorStoreIntegrationTests",
"RetrieversIntegrationTests",
]

View File

@@ -1,3 +1,11 @@
"""
Standard tests for the BaseStore abstraction
We don't recommend implementing externally managed BaseStore abstractions at this time.
:private:
"""
from abc import abstractmethod
from typing import AsyncGenerator, Generator, Generic, Tuple, TypeVar

View File

@@ -1,3 +1,11 @@
"""
Standard tests for the BaseCache abstraction
We don't recommend implementing externally managed BaseCache abstractions at this time.
:private:
"""
from abc import abstractmethod
import pytest

View File

@@ -16,7 +16,7 @@ from langchain_core.messages import (
)
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain_core.tools import BaseTool, tool
from langchain_core.utils.function_calling import tool_example_to_messages
from pydantic import BaseModel, Field
from pydantic.v1 import BaseModel as BaseModelV1
@@ -24,16 +24,29 @@ from pydantic.v1 import Field as FieldV1
from langchain_tests.unit_tests.chat_models import (
ChatModelTests,
my_adder_tool,
)
from langchain_tests.utils.pydantic import PYDANTIC_MAJOR_VERSION
class MagicFunctionSchema(BaseModel):
def _get_joke_class() -> type[BaseModel]:
"""
:private:
"""
class Joke(BaseModel):
"""Joke to tell user."""
setup: str = Field(description="question to set up a joke")
punchline: str = Field(description="answer to resolve the joke")
return Joke
class _MagicFunctionSchema(BaseModel):
input: int = Field(..., gt=-1000, lt=1000)
@tool(args_schema=MagicFunctionSchema)
@tool(args_schema=_MagicFunctionSchema)
def magic_function(input: int) -> int:
"""Applies a magic function to an input."""
return input + 2
@@ -45,13 +58,6 @@ def magic_function_no_args() -> int:
return 5
class Joke(BaseModel):
"""Joke to tell user."""
setup: str = Field(description="question to set up a joke")
punchline: str = Field(description="answer to resolve the joke")
def _validate_tool_call_message(message: BaseMessage) -> None:
assert isinstance(message, AIMessage)
assert len(message.tool_calls) == 1
@@ -103,17 +109,214 @@ class ChatModelIntegrationTests(ChatModelTests):
.. note::
API references for individual test methods include troubleshooting tips.
.. note::
Test subclasses can control what features are tested (such as tool
calling or multi-modality) by selectively overriding the properties on the
class. Relevant properties are mentioned in the references for each method.
See this page for detail on all properties:
https://python.langchain.com/api_reference/standard_tests/unit_tests/langchain_tests.unit_tests.chat_models.ChatModelTests.html
Test subclasses must implement the following two properties:
chat_model_class
The chat model class to test, e.g., ``ChatParrotLink``.
Example:
.. code-block:: python
@property
def chat_model_class(self) -> Type[ChatParrotLink]:
return ChatParrotLink
chat_model_params
Initialization parameters for the chat model.
Example:
.. code-block:: python
@property
def chat_model_params(self) -> dict:
return {"model": "bird-brain-001", "temperature": 0}
In addition, test subclasses can control what features are tested (such as tool
calling or multi-modality) by selectively overriding the following properties.
Expand to see details:
.. dropdown:: has_tool_calling
Boolean property indicating whether the chat model supports tool calling.
By default, this is determined by whether the chat model's `bind_tools` method
is overridden. It typically does not need to be overridden on the test class.
Example override:
.. code-block:: python
@property
def has_tool_calling(self) -> bool:
return True
.. dropdown:: tool_choice_value
Value to use for tool choice when used in tests.
Some tests for tool calling features attempt to force tool calling via a
`tool_choice` parameter. A common value for this parameter is "any". Defaults
to `None`.
Note: if the value is set to "tool_name", the name of the tool used in each
test will be set as the value for `tool_choice`.
Example:
.. code-block:: python
@property
def tool_choice_value(self) -> Optional[str]:
return "any"
.. dropdown:: has_structured_output
Boolean property indicating whether the chat model supports structured
output.
By default, this is determined by whether the chat model's
`with_structured_output` method is overridden. If the base implementation is
intended to be used, this property should be overridden.
See: https://python.langchain.com/docs/concepts/structured_outputs/
Example:
.. code-block:: python
@property
def has_structured_output(self) -> bool:
return True
.. dropdown:: supports_image_inputs
Boolean property indicating whether the chat model supports image inputs.
Defaults to ``False``.
If set to ``True``, the chat model will be tested using content blocks of the
form
.. code-block:: python
[
{"type": "text", "text": "describe the weather in this image"},
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
]
See https://python.langchain.com/docs/concepts/multimodality/
Example:
.. code-block:: python
@property
def supports_image_inputs(self) -> bool:
return True
.. dropdown:: supports_video_inputs
Boolean property indicating whether the chat model supports video inputs.
Defaults to ``False``. No current tests are written for this feature.
.. dropdown:: returns_usage_metadata
Boolean property indicating whether the chat model returns usage metadata
on invoke and streaming responses.
``usage_metadata`` is an optional dict attribute on AIMessages that tracks input
and output tokens: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html
Example:
.. code-block:: python
@property
def returns_usage_metadata(self) -> bool:
return False
.. dropdown:: supports_anthropic_inputs
Boolean property indicating whether the chat model supports Anthropic-style
inputs.
These inputs might feature "tool use" and "tool result" content blocks, e.g.,
.. code-block:: python
[
{"type": "text", "text": "Hmm let me think about that"},
{
"type": "tool_use",
"input": {"fav_color": "green"},
"id": "foo",
"name": "color_picker",
},
]
If set to ``True``, the chat model will be tested using content blocks of this
form.
Example:
.. code-block:: python
@property
def supports_anthropic_inputs(self) -> bool:
return False
.. dropdown:: supports_image_tool_message
Boolean property indicating whether the chat model supports ToolMessages
that include image content, e.g.,
.. code-block:: python
ToolMessage(
content=[
{
"type": "image_url",
"image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
},
],
tool_call_id="1",
name="random_image",
)
If set to ``True``, the chat model will be tested with message sequences that
include ToolMessages of this form.
Example:
.. code-block:: python
@property
def supports_image_tool_message(self) -> bool:
return False
.. dropdown:: supported_usage_metadata_details
Property controlling what usage metadata details are emitted in both invoke
and stream.
``usage_metadata`` is an optional dict attribute on AIMessages that tracks input
and output tokens: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html
It includes optional keys ``input_token_details`` and ``output_token_details``
that can track usage details associated with special types of tokens, such as
cached, audio, or reasoning.
Only needs to be overridden if these details are supplied.
"""
@property
def standard_chat_model_params(self) -> dict:
""":meta private:"""
""":private:"""
return {}
def test_invoke(self, model: BaseChatModel) -> None:
@@ -908,6 +1111,7 @@ class ChatModelIntegrationTests(ChatModelTests):
if not self.has_tool_calling:
pytest.skip("Test requires tool calling.")
Joke = _get_joke_class()
# Pydantic class
# Type ignoring since the interface only officially supports pydantic 1
# or pydantic.v1.BaseModel but not pydantic.BaseModel from pydantic 2.
@@ -960,6 +1164,8 @@ class ChatModelIntegrationTests(ChatModelTests):
if not self.has_tool_calling:
pytest.skip("Test requires tool calling.")
Joke = _get_joke_class()
# Pydantic class
# Type ignoring since the interface only officially supports pydantic 1
# or pydantic.v1.BaseModel but not pydantic.BaseModel from pydantic 2.
@@ -1089,7 +1295,9 @@ class ChatModelIntegrationTests(ChatModelTests):
joke_result = chat.invoke("Give me a joke about cats, include the punchline.")
assert isinstance(joke_result, Joke)
def test_tool_message_histories_string_content(self, model: BaseChatModel) -> None:
def test_tool_message_histories_string_content(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
"""Test that message histories are compatible with string tool contents
(e.g. OpenAI format). If a model passes this test, it should be compatible
with messages generated from providers following OpenAI format.
@@ -1123,8 +1331,8 @@ class ChatModelIntegrationTests(ChatModelTests):
.. code-block:: python
@pytest.mark.xfail(reason=("Not implemented."))
def test_tool_message_histories_string_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_string_content(model)
def test_tool_message_histories_string_content(self, *args: Any) -> None:
super().test_tool_message_histories_string_content(*args)
""" # noqa: E501
if not self.has_tool_calling:
pytest.skip("Test requires tool calling.")
@@ -1158,6 +1366,7 @@ class ChatModelIntegrationTests(ChatModelTests):
def test_tool_message_histories_list_content(
self,
model: BaseChatModel,
my_adder_tool: BaseTool,
) -> None:
"""Test that message histories are compatible with list tool contents
(e.g. Anthropic format).
@@ -1206,8 +1415,8 @@ class ChatModelIntegrationTests(ChatModelTests):
.. code-block:: python
@pytest.mark.xfail(reason=("Not implemented."))
def test_tool_message_histories_list_content(self, model: BaseChatModel) -> None:
super().test_tool_message_histories_list_content(model)
def test_tool_message_histories_list_content(self, *args: Any) -> None:
super().test_tool_message_histories_list_content(*args)
""" # noqa: E501
if not self.has_tool_calling:
pytest.skip("Test requires tool calling.")
@@ -1246,7 +1455,9 @@ class ChatModelIntegrationTests(ChatModelTests):
result_list_content = model_with_tools.invoke(messages_list_content)
assert isinstance(result_list_content, AIMessage)
def test_structured_few_shot_examples(self, model: BaseChatModel) -> None:
def test_structured_few_shot_examples(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
"""Test that the model can process few-shot examples with tool calls.
These are represented as a sequence of messages of the following form:
@@ -1286,8 +1497,8 @@ class ChatModelIntegrationTests(ChatModelTests):
.. code-block:: python
@pytest.mark.xfail(reason=("Not implemented."))
def test_structured_few_shot_examples(self, model: BaseChatModel) -> None:
super().test_structured_few_shot_examples(model)
def test_structured_few_shot_examples(self, *args: Any) -> None:
super().test_structured_few_shot_examples(*args)
""" # noqa: E501
if not self.has_tool_calling:
pytest.skip("Test requires tool calling.")
@@ -1557,7 +1768,9 @@ class ChatModelIntegrationTests(ChatModelTests):
]
model.bind_tools([color_picker]).invoke(messages)
def test_tool_message_error_status(self, model: BaseChatModel) -> None:
def test_tool_message_error_status(
self, model: BaseChatModel, my_adder_tool: BaseTool
) -> None:
"""Test that ToolMessage with ``status="error"`` can be handled.
These messages may take the form:
@@ -1647,16 +1860,21 @@ class ChatModelIntegrationTests(ChatModelTests):
assert len(result.content) > 0
def invoke_with_audio_input(self, *, stream: bool = False) -> AIMessage:
""":private:"""
raise NotImplementedError()
def invoke_with_audio_output(self, *, stream: bool = False) -> AIMessage:
""":private:"""
raise NotImplementedError()
def invoke_with_reasoning_output(self, *, stream: bool = False) -> AIMessage:
""":private:"""
raise NotImplementedError()
def invoke_with_cache_read_input(self, *, stream: bool = False) -> AIMessage:
""":private:"""
raise NotImplementedError()
def invoke_with_cache_creation_input(self, *, stream: bool = False) -> AIMessage:
""":private:"""
raise NotImplementedError()
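
The `invoke_with_*` hooks above correspond to the token detail categories described in the `supported_usage_metadata_details` docstring (audio, reasoning, cache read, cache creation). As a rough illustration, the sketch below builds a plain dict shaped like `usage_metadata` and checks it for internal consistency. The numbers are made up, `is_consistent` is a hypothetical helper, and the rule that detail counts must not exceed their totals is an assumption of this sketch, not a documented guarantee.

```python
from typing import Any, Dict

# Illustrative values only; the key names follow the UsageMetadata shape.
usage_metadata: Dict[str, Any] = {
    "input_tokens": 350,
    "output_tokens": 240,
    "total_tokens": 590,
    "input_token_details": {"audio": 10, "cache_creation": 200, "cache_read": 100},
    "output_token_details": {"audio": 10, "reasoning": 200},
}


def is_consistent(usage: Dict[str, Any]) -> bool:
    """Check totals and (by assumption) that details fit within their totals."""
    if usage["total_tokens"] != usage["input_tokens"] + usage["output_tokens"]:
        return False
    # The detail dicts are optional, so default to empty.
    in_details = usage.get("input_token_details", {})
    out_details = usage.get("output_token_details", {})
    return (
        sum(in_details.values()) <= usage["input_tokens"]
        and sum(out_details.values()) <= usage["output_tokens"]
    )
```

A check of this kind could serve as a quick sanity test on responses before the property is overridden on a test class.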

View File

@@ -1,4 +1,12 @@
"""Test suite to check index implementations."""
"""Test suite to check index implementations.
Standard tests for the DocumentIndex abstraction
We don't recommend implementing externally managed DocumentIndex abstractions at this
time.
:private:
"""
import inspect
import uuid

Some files were not shown because too many files have changed in this diff