Compare commits

...

240 Commits

Author SHA1 Message Date
Erick Friis
9a31ad6f60 langchain: release 0.3.1 (#26868) 2024-09-25 11:43:54 -07:00
Erick Friis
ef2ab26113 core: release 0.3.6 (#26863) 2024-09-25 11:05:53 -07:00
ccurme
87e21493f7 docs[patch]: remove deprecated loaders from feature tables (#26709) 2024-09-25 12:53:32 -04:00
ccurme
a0010063e8 docs[patch]: add guide for loading web pages (#26708) 2024-09-25 12:03:42 -04:00
Bagatur
eaffa92c1d openai[patch]: Release 0.2.1 (#26858) 2024-09-25 15:55:49 +00:00
Rajendra Kadam
51c4393298 community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules (#26852)
- **Description:** This pull request addresses the validation error in
`SettingsConfigDict` due to extra fields in the `.env` file. The issue
is prevalent across multiple Langchain modules. This fix ensures that
extra fields in the `.env` file are ignored, preventing validation
errors.
  **Changes include:**
    - Applied fixes to modules using `SettingsConfigDict`.

- **Issue:** NA, similar
https://github.com/langchain-ai/langchain/issues/26850
- **Dependencies:** NA
2024-09-25 10:02:14 -04:00
Devin Gaffney
d502858412 Update main README.md to reference latest version of documentation (#26854)
Update README.md to point at latest docs
2024-09-25 09:44:18 -04:00
Eugene Yurtsev
27c12146c8 docs[patch]: In conceptual docs explain constraints on ToolMessage (#26792)
Minor clarification
2024-09-25 09:34:45 -04:00
Christophe Bornet
3a1b9259a7 core: Add ruff rules for comprehensions (C4) (#26829) 2024-09-25 09:34:17 -04:00
Rajendra Kadam
7e5a9c317f community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag (#26812)
- **Description:** The flag is named `anonymize_snippets`. When set to
true, the Pebblo server will anonymize snippets by redacting all
personally identifiable information (PII) from the snippets going into
VectorDB and the generated reports
- **Issue:** NA
- **Dependencies:** NA
- **docs**: Updated
2024-09-25 09:33:06 -04:00
Rajendra Kadam
92003b3724 community[patch]: [SharePointLoader] Fix validation error in _O365Settings due to extra fields in .env file (#26851)
**Description:** Fix validation error in _O365Settings by ignoring extra
fields in .env file
**Issue:** https://github.com/langchain-ai/langchain/issues/26850
**Dependencies:** NA
2024-09-25 09:31:59 -04:00
Subhrajyoty Roy
b61fb98466 community[patch]: callback before yield for friendli (#26842)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the friendli model in the community package
**Issue:** #16913
2024-09-25 09:31:12 -04:00
ccurme
13acf9e6b0 langchain[patch]: add deprecation warnings (#26853) 2024-09-25 09:26:44 -04:00
William FH
82b5b77940 [Core] Add more interops tests (#26841)
To test that the client propagates both ways
2024-09-24 20:18:20 -07:00
William FH
9b6ac41442 [Core] Inherit tracing metadata & tags (#26838) 2024-09-24 19:33:12 -07:00
Erick Friis
3796e143f8 docs: remove one more print from build (#26834) 2024-09-24 22:40:16 +00:00
Erick Friis
95269366ae docs: make build less verbose (#26833) 2024-09-24 22:30:05 +00:00
Erick Friis
425c0f381f experimental: release 0.3.1 (#26830) 2024-09-24 15:03:05 -07:00
John
6c3ea262c8 partners/unstructured: release 0.1.5 (#26831)
**Description:** update package version to support loading URLs #26670
**Issue:**  #26697
2024-09-24 15:02:53 -07:00
mercyspirit
0414be4b80 experimental[major]: CVE-2024-46946 fix (#26783)
Description: Resolve CVE-2024-46946 by switching out sympify with
parse_expr with a very specific allowed set of operations.

https://nvd.nist.gov/vuln/detail/cve-2024-46946

Sympify uses eval which makes it vulnerable to code execution.
parse_expr is limited to specific expressions.

Bandit results

![image](https://github.com/user-attachments/assets/170a6376-7028-4e70-a7ef-9acfb49c1d8a)

---------

Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-24 21:37:56 +00:00
Erick Friis
f9ef688b3a docs: upgrade to docusaurus v3 (#26803) 2024-09-24 11:28:13 -07:00
Subhrajyoty Roy
b1da532522 community[patch]: callback before yield for deepsparse llm (#26822)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the deepsparse model in the community package
**Issue:** #16913
2024-09-24 13:55:52 -04:00
Nuno Campos
de70a64e3a core: Run LangChainTracer inline (#26797)
- this flag ensures the tracer always runs in the same thread as the run
being traced for both sync and async runs
- pro: less chance for ordering bugs and other oddities
- blocking the event loop is not a concern given all code in the tracer
holds the GIL anyway
2024-09-24 08:31:18 -07:00
Jorge Piedrahita Ortiz
408a930d55 community: Add Sambanova Cloud Chat model community integration (#26333)
**Description:** : Add SambaNova Cloud Chat model community integration
Includes 
- chat model integration (following Standardize ChatModel docstrings)
-  tests
- docs usage notebook (following Standardize ChatModel integration docs)

https://cloud.sambanova.ai/

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-24 14:11:32 +00:00
Tom
2b83c7c3ab community[patch]: Fix tool_calls parsing when streaming from DeepInfra (#26813)
- **Description:** This PR fixes the response parsing logic for
`ChatDeepInfra`, more specifially `_convert_delta_to_message_chunk()`,
which is invoked when streaming via `ChatDeepInfra`.
- **Issue:** Streaming from DeepInfra via `ChatDeepInfra` is currently
broken because the response parsing logic doesn't handle that
`tool_calls` can be `None`. (There is no GitHub issue for this problem
yet.)
- **Dependencies:** –
- **Twitter handle:** –

Keeping this here as a reminder:
> If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-24 13:47:36 +00:00
Subhrajyoty Roy
997d95c8f8 community[patch]: callback before yield for bedrock llm (#26804)
**Description:** Moves yield to after callback for
`_prepare_input_and_invoke_stream` and
`_aprepare_input_and_invoke_stream` for bedrock llm in community
package.
**Issue:** #16913
2024-09-24 12:14:59 +00:00
Erick Friis
e40a2b8bbf docs: fix mdx codefences (#26802)
```
git grep -l -E '"```\{=mdx\}\\n",' | xargs perl -0777 -i -pe 's/"```\{=mdx\}\\n",\n    (\W.*?),\n\s*"```\\?n?"/$1/s'
```
2024-09-24 06:06:13 +00:00
Erick Friis
35081d2765 docs: fix admonition formatting (#26801) 2024-09-23 21:55:17 -07:00
Erick Friis
603d38f06d docs: make docs mdxv2 compatible (#26798)
prep for docusaurus migration
2024-09-23 21:24:23 -07:00
ccurme
2a4c5713cd openai[patch]: fix azure integration tests (#26791) 2024-09-23 17:49:15 -04:00
ccurme
1ce056d1b2 docs[patch]: add memory migration guides to sidebar (#26711)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-23 15:31:27 -04:00
Mohammad Mohtashim
154a5ff7ca core[patch]: On Chain Start Fix for Chain Class (#26593)
- **Issue:** #26588

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-23 19:30:59 +00:00
ccurme
bba7af903b core[patch]: set default on Blob (#26787)
Resolves https://github.com/langchain-ai/langchain/issues/26781
2024-09-23 18:55:56 +00:00
ccurme
97b27f0930 langchain[patch]: fix extended tests (#26788)
Broken by addition of `disabled_params`
2024-09-23 18:52:09 +00:00
Brace Sproul
fb9ac8da2f fix(docs): Drop announcement bar (#26782) 2024-09-23 18:03:59 +00:00
Bagatur
e1e4f88b3e openai[patch]: enable Azure structured output, parallel_tool_calls=Fa… (#26599)
…lse, tool_choice=required

response_format=json_schema, tool_choice=required, parallel_tool_calls
are all supported for gpt-4o on azure.
2024-09-22 22:25:22 -07:00
Gabriel Altay
bb40a0fb32 Remove pydantic restricted namespaces from HuggingFaceInferenceAPIEmbedings (#26744)
without this `model_config` importing this package produces warnings
about "model_name" having conflicts with protected namespace "model_".

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-22 08:05:37 -04:00
Gor Hayrapetyan
f97ac92f00 community[patch]: Handle empty PR body in get_pull_request in Github utility (#26739)
**Description:**
When PR body is empty `get_pull_request` method fails with bellow
exception.


**Issue:**
```
TypeError('expected string or buffer')Traceback (most recent call last):


  File ".../.venv/lib/python3.9/site-packages/langchain_core/tools/base.py", line 661, in run
    response = context.run(self._run, *tool_args, **tool_kwargs)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/tools/github/tool.py", line 52, in _run
    return self.api_wrapper.run(self.mode, query)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 816, in run
    return json.dumps(self.get_pull_request(int(query)))


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 495, in get_pull_request
    add_to_dict(response_dict, "body", pull.body)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 487, in add_to_dict
    tokens = get_tokens(value)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 483, in get_tokens
    return len(tiktoken.get_encoding("cl100k_base").encode(text))


  File "....venv/lib/python3.9/site-packages/tiktoken/core.py", line 116, in encode
    if match := _special_token_regex(disallowed_special).search(text):


TypeError: expected string or buffer
```

**Twitter:**  __gorros__
2024-09-22 01:56:24 +00:00
Erick Friis
238a31bbd9 core: release 0.3.5 (#26737) 2024-09-21 00:26:39 +00:00
William FH
55af6fbd02 [LangChainTracer] Omit Chunk (#26602)
in events / new llm token
2024-09-20 17:10:34 -07:00
Anton Dubovik
3e2cb4e8a4 openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767)
Chunking of the input array controlled by `self.chunk_size` is being
ignored when `self.check_embedding_ctx_length` is disabled. Effectively,
the chunk size is assumed to be equal 1 in such a case. This is
suprising.

The PR takes into account `self.chunk_size` passed by the user.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:58:45 -07:00
William FH
864020e592 [Tracer] add project name to run from tracer (#26736) 2024-09-20 16:48:37 -07:00
Nithish Raghunandanan
2d21274bf6 couchbase: Add ttl support to caches & chat_message_history (#26214)
**Description:** Add support to delete documents automatically from the
caches & chat message history by adding a new optional parameter, `ttl`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:44:29 +00:00
Krishna Kulkarni
c6c508ee96 Refining Skip Count Calculation by Filtering Documents with session_id (#26020)
In the previous implementation, `skip_count` was counting all the
documents in the collection. Instead, we want to filter the documents by
`session_id` and calculate `skip_count` by subtracting `history_size`
from the filtered count.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 23:40:56 +00:00
Tibor Reiss
a8b24135a2 fix[experimental]: Fix text splitter with gradient (#26629)
Fixes #26221

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:35:50 +00:00
Alejandro Rodríguez
4ac9a6f52c core: fix "template" not allowed as prompt param (#26060)
- **Description:**  fix "template" not allowed as prompt param
- **Issue:** #26058
- **Dependencies:** none


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:33:06 +00:00
Christophe Bornet
58f339a67c community: Fix links in GraphVectorStore pydoc (#25959)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:17:53 +00:00
Christophe Bornet
e49c413977 core: Add docstring for GraphVectorStoreRetriever (#26224)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-20 23:16:37 +00:00
Lucain
a2023a1e96 huggingface; fix huggingface_endpoint.py (initialize clients only with supported kwargs) (#26378)
## Description

By default, `HuggingFaceEndpoint` instantiates both the
`InferenceClient` and the `AsyncInferenceClient` with the
`"server_kwargs"` passed as input. This is an issue as both clients
might not support exactly the same kwargs. This has been highlighted in
https://github.com/huggingface/huggingface_hub/issues/2522 by
@morgandiverrez with the `trust_env` parameter. In order to make
`langchain` integration future-proof, I do think it's wiser to forward
only the supported parameters to each client. Parameters that are not
supported are simply ignored with a warning to the user. From a
`huggingface_hub` maintenance perspective, this allows us much more
flexibility as we are not constrained to support the exact same kwargs
in both clients.

## Issue

https://github.com/huggingface/huggingface_hub/issues/2522

## Dependencies

None

## Twitter 

https://x.com/Wauplin

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:05:24 -07:00
ccurme
f2285376a5 community[patch]: add web loader tests (#26728) 2024-09-20 18:29:54 -04:00
Erick Friis
4a2745064a core: release 0.3.4 (#26729) 2024-09-20 14:47:15 -07:00
Nuno Campos
345edeb1f0 core: In astream_events propagate cancellation reason to inner task (#26727)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-20 14:42:10 -07:00
Erick Friis
465e43cd43 core: release 0.3.3 (#26713) 2024-09-20 13:54:19 -07:00
Eugene Yurtsev
4fc69d61ad core[patch]: Fix defusedxml import (#26718)
Fix defusedxml import. Haven't investigated what's actually going on
under the hood -- defusedxml probably does some weird things in __init__
2024-09-20 16:53:24 -04:00
Eugene Yurtsev
79b224f6f3 core/langchain: fix version used in deprecation (#26724)
in core deprecation should be version 0.3.3 instead of 0.3.4
in langchain deprecation should be version 0.3.1 instead of 0.3.4
2024-09-20 16:47:18 -04:00
Eugene Yurtsev
8a9f7091c0 docs: Update trim message usage in migrating_memory (#26722)
Make sure we don't end up with a ToolMessage that precedes an AIMessage
2024-09-20 20:20:27 +00:00
Eugene Yurtsev
91f4711e53 core[patch],langchain[patch]: deprecate memory and entity abstractions and implementations (#26717)
This PR deprecates the old memory, entity abstractions and implementations
2024-09-20 15:06:25 -04:00
William FH
19ce95d3c9 Avoid copying runs (#26689)
Also, re-unify run trees. Use a single shared client.
2024-09-20 10:57:41 -07:00
Eric
90031b1b3e support epsilla cloud vector database in langchain (#26065)
Description

- support epsilla cloud in langchain

---------

Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 17:14:23 +00:00
ZhangShenao
baef7639fd Improvement[text-splitter] Fix import of ExperimentalMarkdownSyntaxTextSplitter (#26703)
#26028 

Export `ExperimentalMarkdownSyntaxTextSplitter` in __init__

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 17:04:31 +00:00
Eugene Yurtsev
acf8c2c13e docs: Add migration instructions for v0.0.x memory abstractions (#26668)
This PR adds a migration guide for any code that relies on old memory
abstractions.
2024-09-20 15:09:23 +00:00
ccurme
eeab6a688c docs[patch]: update PDF loader docs (#26627)
Docs preview:
https://langchain-git-cc-pdfdocs-langchain.vercel.app/docs/how_to/document_loader_pdf/
2024-09-20 11:07:06 -04:00
stein1988
91594928c5 fix:fix ChatZhipuAI tool call bug (#26693)
- [ ] **PR title**: "community:fix ChatZhipuAI tool call bug"

- [ ] **Description:** ZhipuAI api response as follows:
{'id': '20240920132549e379a9152a6a4d7c', 'created': 1726809949, 'model':
'glm-4-flash', 'choices': [{'index': 0, 'finish_reason': 'tool_calls',
'delta': {'role': 'assistant', 'tool_calls': [{'id':
'call_20240920132549e379a9152a6a4d7c', 'index': 0, 'type': 'function',
'function': {'name': 'get_datetime_offline', 'arguments': '{}'}}]}}]}
so, tool_calls = dct.get("tool_call", None) in
_convert_delta_to_message_chunk should be "tool_calls"
2024-09-20 13:06:42 +00:00
guoqiang0401
8f0c04f47e Update tool_calling.ipynb (#26699)
There is a small bug in "TypedDict class" sample source.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-20 13:04:50 +00:00
Bagatur
f7bb3640f1 core[patch]: support js chat model namespaces (#26688) 2024-09-19 16:14:20 -07:00
Bagatur
c453b76579 core[patch]: Release 0.3.2 (#26686) 2024-09-19 14:58:45 -07:00
Piyush Jain
f087ab43fd core[patch]: Fix load of ChatBedrock (#26679)
Complementary PR to master for #26643.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-19 21:57:20 +00:00
Bagatur
409f35363b core[patch]: support load from path for default namespaces (#26675) 2024-09-19 14:47:27 -07:00
Eugene Yurtsev
e8236e58f2 ci: restore qa template that was known to work (#26684)
Restore qa template that was working
2024-09-19 17:20:42 -04:00
ccurme
eef18dec44 unstructured[patch]: support loading URLs (#26670)
`unstructured.partition.auto.partition` supports a `url` kwarg, but
`url` in `UnstructuredLoader.__init__` is reserved for the server URL.
Here we add a `web_url` kwarg that is passed to the partition kwargs:
```python
self.unstructured_kwargs["url"] = web_url
```
2024-09-19 11:40:25 -07:00
Erick Friis
311f861547 core, community: move graph vectorstores to community (#26678)
remove beta namespace from core, add to community
2024-09-19 11:38:14 -07:00
Serena Ruan
c77c28e631 [community] Fix WorkspaceClient error with pydantic validation (#26649)
Thank you for contributing to LangChain!

Fix error like
<img width="1167" alt="image"
src="https://github.com/user-attachments/assets/2e219b26-ec7e-48ef-8111-e0ff2f5ac4c0">

After the fix:
<img width="584" alt="image"
src="https://github.com/user-attachments/assets/48f36fe7-628c-48b6-81b2-7fe741e4ca85">


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 18:25:33 +00:00
ccurme
7d49ee9741 unstructured[patch]: add to integration tests (#26666)
- Add to tests on parsed content;
- Add tests for async + lazy loading;
- Add a test for `strategy="hi_res"`.
2024-09-19 13:43:34 -04:00
Erick Friis
28dd6564db docs: highlight styling (#26636)
MERGE ME PLEASE
2024-09-19 17:12:59 +00:00
ccurme
f91bdd12d2 community[patch]: add to pypdf tests and run in CI (#26663) 2024-09-19 14:45:49 +00:00
ice yao
4d3d62c249 docs: fix nomic link error (#26642) 2024-09-19 14:41:45 +00:00
Rajendra Kadam
60dc19da30 [community] Added PebbloTextLoader for loading text data in PebbloSafeLoader (#26582)
- **Description:** Added PebbloTextLoader for loading text in
PebbloSafeLoader.
- Since PebbloSafeLoader wraps document loaders, this new loader enables
direct loading of text into Documents using PebbloSafeLoader.
- **Issue:** NA
- **Dependencies:** NA
- [x] **Tests**: Added/Updated tests
2024-09-19 09:59:04 -04:00
Jorge Piedrahita Ortiz
55b641b761 community: fix error in sambastudio embeddings (#26260)
fix error in samba studio embeddings  result unpacking
2024-09-19 09:57:04 -04:00
Jorge Piedrahita Ortiz
37b72023fe community: remove sambaverse (#26265)
removing Sambaverse llm model and references given is not available
after Sep/10/2024

<img width="1781" alt="image"
src="https://github.com/user-attachments/assets/4dcdb5f7-5264-4a03-b8e5-95c88304e059">
2024-09-19 09:56:30 -04:00
Martin Triska
3fc0ea510e community : [bugfix] Use document ids as keys in AzureSearch vectorstore (#25486)
# Description
[Vector store base
class](4cdaca67dc/libs/core/langchain_core/vectorstores/base.py (L65))
currently expects `ids` to be passed in and that is what it passes along
to the AzureSearch vector store when attempting to `add_texts()`.
However AzureSearch expects `keys` to be passed in. When they are not
present, AzureSearch `add_embeddings()` makes up new uuids. This is a
problem when trying to run indexing. [Indexing code
expects](b297af5482/libs/core/langchain_core/indexing/api.py (L371))
the documents to be uploaded using provided ids. Currently AzureSearch
ignores `ids` passed from `indexing` and makes up new ones. Later when
`indexer` attempts to delete removed file, it uses the `id` it had
stored when uploading the document, however it was uploaded under
different `id`.

**Twitter handle: @martintriska1**
2024-09-19 09:37:18 -04:00
Tomaz Bratanic
a8561bc303 Fix async parsing for llm graph transformer (#26650) 2024-09-19 09:15:33 -04:00
Erik
4e0a6ebe7d community: Add warning when page_content is empty (#25955)
Page content sometimes is empty when PyMuPDF can not find text on pages.
For example, this can happen when the text of the PDF is not copyable
"by hand". Then an OCR solution is need - which is not integrated here.

This warning should accurately warn the user that some pages are lost
during this process.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:22:09 +00:00
Christophe Bornet
fd21ffe293 core: Add N(naming) ruff rules (#25362)
Public classes/functions are not renamed and rule is ignored for them.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:09:39 +00:00
Daniel Cooke
7835c0651f langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:21:24 +00:00
Tibor Reiss
85caaa773f docs[community]: Fix raw string in docstring (#26350)
Fixes #26212: replaced the raw string with backslashes. Alternative:
raw-stringif the full docstring.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-19 04:18:56 +00:00
Erick Friis
8fb643a6e8 partners/box: release 0.2.1 (#26644) 2024-09-19 04:02:06 +00:00
Tomaz Bratanic
03b9aca55d community: Retry retriable errors in Neo4j (#26211)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:01:07 +00:00
Scott Hurrey
acbb4e4701 box: Add searchoptions for BoxRetriever, documentation for BoxRetriever as agent tool (#26181)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


Added search options for BoxRetriever and added documentation to
demonstrate how to use BoxRetriever as an agent tool - @BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-18 21:00:06 -07:00
Erick Friis
e0c36afc3e docs: v0.3 link redirect (#26632) 2024-09-18 14:28:56 -07:00
Erick Friis
9909354cd0 core: use ruff.target-version instead (#26634)
tested on one of the replacement cases and seems to work! 
![ScreenShot 2024-09-18 at 02 02
43PM](https://github.com/user-attachments/assets/7170975a-2542-43ed-a203-d4126c6a2c81)
2024-09-18 21:06:14 +00:00
Erick Friis
84b831356c core: remove [project] tag from pyproject (#26633)
makes core incompatible with uv installs
2024-09-18 20:39:49 +00:00
Christophe Bornet
a47b332841 core: Put Python version as a project requirement so it is considered by ruff (#26608)
Ruff doesn't know about the python version in
`[tool.poetry.dependencies]`. It can get it from
`project.requires-python`.

Notes:
* poetry seems to have issues getting the python constraints from
`requires-python` and using `python` in per dependency constraints. So I
had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias`
(`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not
isinstance(type, GenericAlias)` checks:

Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```

Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-18 14:37:57 +00:00
Patrick McGleenon
0f07cf61da docs: fixed typo in XML document loader (#26613)
Fixed typo `Unstrucutred`
2024-09-18 14:26:57 +00:00
Erick Friis
d158401e73 infra: master release checkout ref for release note (#26605) 2024-09-18 01:51:54 +00:00
Bagatur
de58942618 docs: consolidate dropdowns (#26600) 2024-09-18 01:24:10 +00:00
Bagatur
df38d5250f docs: cleanup nav (#26546) 2024-09-17 17:49:46 -07:00
sanjay920
b246052184 docs: fix typo in clickhouse vectorstore doc (#26598)
- **Description:** typo in clickhouse vectorstore doc
- **Issue:** #26597
- **Dependencies:** none
- **Twitter handle:** sanjay920

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 23:33:22 +00:00
Miguel Grinberg
52729ac0be docs: update hybrid search example with Elasticsearch retriever (#26328)
- **Description:** the example to perform hybrid search with the
Elasticsearch retriever is out of date
- **Issue:** N/A
- **Dependencies:** N/A

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 23:15:27 +00:00
Marco Rossi IT
f62d454f36 docs: fix typo on amazon_textract.ipynb (#26493)
- **Description:** fixed a typo on amazon textract page

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:27:45 +00:00
gbaian10
6fe2536c5a docs: fix the ImportError in google_speech_to_text.ipynb (#26522)
fix #26370

- #26370 

`GoogleSpeechToTextLoader` is a deprecated method in
`langchain_community.document_loaders.google_speech_to_text`.

The new recommended usage is to use `SpeechToTextLoader` from
`langchain_google_community`.

When importing from `langchain_google_community`, use the name
`SpeechToTextLoader` instead of the old `GoogleSpeechToTextLoader`.


![image](https://github.com/user-attachments/assets/3a8bd309-9858-4938-b7db-872f51b9542e)

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:18:57 +00:00
Zhanwei Zhang
418b170f94 docs: Fix typo in conda environment code block in rag.ipynb (#26487)
Thank you for contributing to LangChain!

- [x] **PR title**: Fix typo in conda environment code block in
rag.ipynb
  - In docs/tutorials/rag.ipynb

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:13:55 +00:00
ZhangShenao
c3b3f46cb8 Improvement[Community] Improve api doc of BeautifulSoupTransformer (#26423)
- Add missing args

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:00:07 +00:00
ogawa
e2245fac82 community[patch]: o1-preview and o1-mini costs (#26411)
updated OpenAI cost definitions according to the following:
https://openai.com/api/pricing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:59:46 +00:00
ZhangShenao
1a8e9023de Improvement[Community] Improve streamlit_callback_handler (#26373)
- add decorator for static methods

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:54:37 +00:00
Bagatur
1a62f9850f anthropic[patch]: Release 0.2.1 (#26592) 2024-09-17 14:44:21 -07:00
Harutaka Kawamura
6ed50e78c9 community: Rename deployments server to AI gateway (#26368)
We recently renamed `MLflow Deployments Server` to `MLflow AI Gateway`
in mlflow. This PR updates the relevant notebooks to use `MLflow AI
gateway`

---

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:36:04 +00:00
Bagatur
5ced41bf50 anthropic[patch]: fix tool call and tool res image_url handling (#26587)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 14:30:07 -07:00
Christophe Bornet
c6bdd6f482 community: Fix references in link extractors docstrings (#26314)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:26:25 +00:00
Christophe Bornet
3a99467ccb core[patch]: Add ruff rule UP006(use PEP585 annotations) (#26574)
* Added rules `UPD006` now that Pydantic is v2+

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-17 21:22:50 +00:00
wlleiiwang
2ef4c9466f community: modify document links for tencent vectordb (#26316)
- modify document links for create a tencent vectordb database instance.

Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:11:10 +00:00
Erick Friis
194adc485c docs: pypi readme image links (#26590) 2024-09-17 20:41:34 +00:00
Bagatur
97b05d70e6 docs: anthropic api ref nit (#26591) 2024-09-17 20:39:53 +00:00
Bagatur
e1d113ea84 core,openai,grow,fw[patch]: deprecate bind_functions, update chat mod… (#26584)
…el api ref
2024-09-17 11:32:39 -07:00
ccurme
7c05f71e0f milvus[patch]: fix vectorstore integration tests (#26583)
Resolves https://github.com/langchain-ai/langchain/issues/26564
2024-09-17 14:17:05 -04:00
Bagatur
145a49cca2 core[patch]: Release 0.3.1 (#26581) 2024-09-17 17:34:09 +00:00
Nuno Campos
5fc44989bf core[patch]: Fix "argument of type 'NoneType' is not iterable" error in LangChainTracer (#26576)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 10:29:46 -07:00
Erick Friis
f4a65236ee infra: only force reinstall on release (#26580) 2024-09-17 17:12:17 +00:00
Isaac Francisco
06cde06a20 core[minor]: remove beta from RemoveMessage (#26579) 2024-09-17 17:09:58 +00:00
Erick Friis
3e51fdc840 infra: more skip if pull request libs (#26578) 2024-09-17 09:48:02 -07:00
RUO
0a177ec2cc community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376)
### Description:
This pull request significantly enhances the MongodbLoader class in the
LangChain community package by adding robust metadata customization and
improved field extraction capabilities. The updated class now allows
users to specify additional metadata fields through the metadata_names
parameter, enabling the extraction of both top-level and deeply nested
document attributes as metadata. This flexibility is crucial for users
who need to include detailed contextual information without altering the
database schema.

Moreover, the include_db_collection_in_metadata flag offers optional
inclusion of database and collection names in the metadata, allowing for
even greater customization depending on the user's needs.

The loader's field extraction logic has been refined to handle missing
or nested fields more gracefully. It now employs a safe access mechanism
that avoids the KeyError previously encountered when a specified nested
field was absent in a document. This update ensures that the loader can
handle diverse and complex data structures without failure, making it
more resilient and user-friendly.

### Issue:
This pull request addresses a critical issue where the MongodbLoader
class in the LangChain community package could throw a KeyError when
attempting to access nested fields that may not exist in some documents.
The previous implementation did not handle the absence of specified
nested fields gracefully, leading to runtime errors and interruptions in
data processing workflows.

This enhancement ensures robust error handling by safely accessing
nested document fields, using default values for missing data, thus
preventing KeyError and ensuring smoother operation across various data
structures in MongoDB. This improvement is crucial for users working
with diverse and complex data sets, ensuring the loader can adapt to
documents with varying structures without failing.

### Dependencies: 
Requires motor for asynchronous MongoDB interaction.

### Twitter handle: 
N/A

### Add tests and docs
Tests: Unit tests have been added to verify that the metadata inclusion
toggle works as expected and that the field extraction correctly handles
nested fields.
Docs: An example notebook demonstrating the use of the enhanced
MongodbLoader is included in the docs/docs/integrations directory. This
notebook includes setup instructions, example usage, and outputs.
(Here is the notebook link : [colab
link](https://colab.research.google.com/drive/1tp7nyUnzZa3dxEFF4Kc3KS7ACuNF6jzH?usp=sharing))
Lint and test
Before submitting, I ran make format, make lint, and make test as per
the contribution guidelines. All tests pass, and the code style adheres
to the LangChain standards.

```python
import unittest
from unittest.mock import patch, MagicMock
import asyncio
from langchain_community.document_loaders.mongodb import MongodbLoader

class TestMongodbLoader(unittest.TestCase):
    def setUp(self):
        """Setup the MongodbLoader test environment by mocking the motor client 
        and database collection interactions."""
        # Mocking the AsyncIOMotorClient
        self.mock_client = MagicMock()
        self.mock_db = MagicMock()
        self.mock_collection = MagicMock()

        self.mock_client.get_database.return_value = self.mock_db
        self.mock_db.get_collection.return_value = self.mock_collection

        # Initialize the MongodbLoader with test data
        self.loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )

    @patch('langchain_community.document_loaders.mongodb.AsyncIOMotorClient', return_value=MagicMock())
    def test_constructor(self, mock_motor_client):
        """Test if the constructor properly initializes with the correct database and collection names."""
        loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )
        self.assertEqual(loader.db_name, "testdb")
        self.assertEqual(loader.collection_name, "testcol")

    def test_aload(self):
        """Test the aload method to ensure it correctly queries and processes documents."""
        # Setup mock data and responses for the database operations
        self.mock_collection.count_documents.return_value = asyncio.Future()
        self.mock_collection.count_documents.return_value.set_result(1)
        self.mock_collection.find.return_value = [
            {"_id": "1", "content": "Test document content"}
        ]

        # Run the aload method and check responses
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self.loader.aload())
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].page_content, "Test document content")

    def test_construct_projection(self):
        """Verify that the projection dictionary is constructed correctly based on field names."""
        self.loader.field_names = ['content', 'author']
        self.loader.metadata_names = ['timestamp']
        expected_projection = {'content': 1, 'author': 1, 'timestamp': 1}
        projection = self.loader._construct_projection()
        self.assertEqual(projection, expected_projection)

if __name__ == '__main__':
    unittest.main()
```


### Additional Example for Documentation
Sample Data:

```json
[
    {
        "_id": "1",
        "title": "Artificial Intelligence in Medicine",
        "content": "AI is transforming the medical industry by providing personalized medicine solutions.",
        "author": {
            "name": "John Doe",
            "email": "john.doe@example.com"
        },
        "tags": ["AI", "Healthcare", "Innovation"]
    },
    {
        "_id": "2",
        "title": "Data Science in Sports",
        "content": "Data science provides insights into player performance and strategic planning in sports.",
        "author": {
            "name": "Jane Smith",
            "email": "jane.smith@example.com"
        },
        "tags": ["Data Science", "Sports", "Analytics"]
    }
]
```
Example Code:

```python
loader = MongodbLoader(
    connection_string="mongodb://localhost:27017",
    db_name="example_db",
    collection_name="articles",
    filter_criteria={"tags": "AI"},
    field_names=["title", "content"],
    metadata_names=["author.name", "author.email"],
    include_db_collection_in_metadata=True
)

documents = loader.load()

for doc in documents:
    print("Page Content:", doc.page_content)
    print("Metadata:", doc.metadata)
```
Expected Output:

```
Page Content: Artificial Intelligence in Medicine AI is transforming the medical industry by providing personalized medicine solutions.
Metadata: {'author_name': 'John Doe', 'author_email': 'john.doe@example.com', 'database': 'example_db', 'collection': 'articles'}
```

Thank you.

---

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 10:23:17 -04:00
ccurme
6758894af1 docs: update v0.3 integrations table (#26571) 2024-09-17 09:56:04 -04:00
venkatram-dev
6ba3c715b7 doc_fix_chroma_integration (#26565)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
docs:integrations:vectorstores:chroma:fix_typo


- [x] **PR message**: ***Delete this entire checklist*** and replace
with


- **Description:** fix_typo in docs:integrations:vectorstores:chroma
https://python.langchain.com/docs/integrations/vectorstores/chroma/
    - **Issue:** https://github.com/langchain-ai/langchain/issues/26561

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-17 08:17:54 -04:00
Bagatur
d8952b8e8c langchain[patch]: infer mistral provider in init_chat_model (#26557) 2024-09-17 00:35:54 +00:00
Bagatur
31f61d4d7d docs: v0.3 nits (#26556) 2024-09-17 00:14:47 +00:00
Bagatur
99abd254fb docs: clean up init_chat_model (#26551) 2024-09-16 22:08:22 +00:00
Tomaz Bratanic
3bcd641bc1 Add check for prompt based approach in llm graph transformer (#26519) 2024-09-16 15:01:09 -07:00
Bagatur
0bd98c99b3 docs: add sema4 to release table (#26549) 2024-09-16 14:59:13 -07:00
Eugene Yurtsev
8a2f2fc30b docs: what langchain-cli migrate can do (#26547) 2024-09-16 20:10:40 +00:00
SQpgducray
724a53711b docs: Fix missing self argument in _get_docs_with_query method of `Cust… (#26312)
…omSelfQueryRetriever`

This commit corrects an issue in the `_get_docs_with_query` method of
the `CustomSelfQueryRetriever` class. The method was incorrectly using
`self.vectorstore.similarity_search_with_score(query, **search_kwargs)`
without including the `self` argument, which is required for proper
method invocation.

The `self` argument is necessary for calling instance methods and
accessing instance attributes. By including `self` in the method call,
we ensure that the method is correctly executed in the context of the
current instance, allowing it to function as intended.

No other changes were made to the method's logic or functionality.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-16 20:02:30 +00:00
Eugene Yurtsev
c6a78132d6 docs: show how to use langchain-cli for migration (#26535)
Update v0.3 instructions a bit

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-16 15:53:05 -04:00
Bagatur
a319a0ff1d docs: add redirects for tools and lcel (#26541) 2024-09-16 18:06:15 +00:00
Eugene Yurtsev
63c3cc1f1f ci: updates issue and discussion templates (#26542)
Update issue and discussion templates
2024-09-16 17:43:04 +00:00
ccurme
0154c586d3 docs: update integrations table in 0.3 guide (#26536) 2024-09-16 17:41:56 +00:00
Eugene Yurtsev
c2588b334f unstructured: release 0.1.4 (#26540)
Release to work with langchain 0.3
2024-09-16 17:38:38 +00:00
Eugene Yurtsev
8b985a42e9 milvus: 0.1.6 release (#26538)
Release to work with langchain 0.3
2024-09-16 13:33:09 -04:00
Eugene Yurtsev
5b4206acd8 box: 0.2.0 release (#26539)
Release to work with langchain 0.3
2024-09-16 13:32:59 -04:00
ccurme
0592c29e9b qdrant[patch]: release 0.1.4 (#26534)
`langchain-qdrant` imports pydantic but was importing pydantic proper
before 0.3 release:
042e84170b/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py (L5-L8)
2024-09-16 13:04:12 -04:00
Eugene Yurtsev
88891477eb langchain-cli: release 0.0.31 (#26533)
langchain-cli 0.0.31 release
2024-09-16 12:57:24 -04:00
ccurme
88bc15d69b standard-tests[patch]: add async test for structured output (#26527) 2024-09-16 11:15:23 -04:00
Erick Friis
1ab181f514 voyageai: release 0.1.2 (#26512) 2024-09-16 03:11:15 +00:00
Erick Friis
ee4e11379f nomic: release 0.1.3, core 0.3 compat but not required (#26511) 2024-09-15 20:10:25 -07:00
Yoshitaka Fujii
bd42344b0a docs: Update concepts.mdx (#26496)
- Fix comments in Python
- Fix repeated sentences
2024-09-16 01:46:15 +00:00
Erick Friis
9f5960a0aa docs: new algolia index (#26508) 2024-09-15 18:33:42 -07:00
Erick Friis
135afdf4fb docs: most 0.1 redirects too (#26494)
takes redirects from 0.1 docs and factors them into suggested redirects
in 0.3 docs
2024-09-15 18:29:58 +00:00
Erick Friis
4131be63af multiple: 0.3.0 not dev version (#26502) 2024-09-15 18:26:50 +00:00
Bhadresh Savani
f66b7ba32d Update google_search.ipynb (#26420)
Added changes for pip installation
2024-09-14 15:08:40 -07:00
jessicaou
9c6aa3f0b7 broken LangGraph docs link (#26438)
Update broken langgraph link in the README.md file

Co-authored-by: Jess Ou <jessou@jesss-mbp.local.meter>
2024-09-14 15:07:51 -07:00
Nicolas
2240ca2979 docs: Fix Firecrawl v0 version (#26452)
Firecrawl integration is currently on v0 - which is supported until
version 0.0.20.

@rafaelsideguide is working on a pr for v1 but meanwhile we should fix
the docs.
2024-09-14 15:06:15 -07:00
Eugene Yurtsev
77ccb4b1cf cli[patch]: Update the migration script message (#26490)
Update the migration script message
2024-09-14 14:40:35 -04:00
Bagatur
b47f4cfe51 mongodb[minor]: Release 0.2.0 (#26484) 2024-09-13 19:17:36 -07:00
Bagatur
779a008d4e docs: update v3 versions (#26483) 2024-09-14 02:16:54 +00:00
Bagatur
4e6620ecdd chroma[patch]: Release 0.1.4 (#26470) 2024-09-13 17:31:34 -07:00
Bagatur
543a80569c prompty[minor]: Release 0.1.0 (#26481) 2024-09-13 23:32:01 +00:00
ccurme
9c88037dbc huggingface[patch]: xfail test (#26479) 2024-09-13 23:16:06 +00:00
Bagatur
a2bfa41216 azure-dynamic-sessions[minor]: Release 0.2.0 (#26478) 2024-09-13 23:09:48 +00:00
ccurme
8abc7ff55a experimental: release 0.3 (#26477) 2024-09-13 23:07:35 +00:00
Bagatur
6abb23ca97 exa[minor]: Release 0.2.0 (#26476) 2024-09-13 23:04:10 +00:00
ccurme
900115a568 community: release 0.3 (#26472) 2024-09-13 22:55:56 +00:00
Bagatur
17b397ef93 pinecone[minor]: Release 0.2.0 (#26474) 2024-09-13 22:55:35 +00:00
Erick Friis
ca304ae046 robocorp: rm package (now langchain-sema4) (#26471) 2024-09-13 15:54:00 -07:00
Erick Friis
537f6924dc partners/ollama: release 0.2.0 (#26468) 2024-09-13 15:48:48 -07:00
Erick Friis
995dfc6b05 partners/fireworks: release 0.2.0 (#26467) 2024-09-13 22:48:16 +00:00
Erick Friis
832bc834b1 partners/anthropic: release 0.2.0 (#26469)
0.3.0 version was a mistake! not released - bumping version back to
0.2.0 here
2024-09-13 22:47:09 +00:00
Erick Friis
6997731729 partners/anthropic: release 0.3.0 (#26466) 2024-09-13 22:44:11 +00:00
Bagatur
64bfe1ff23 groq[minor]: Release 0.2.0 (#26465) 2024-09-13 22:43:11 +00:00
Erick Friis
58c7414e10 langchain: release 0.3.0 (#26462) 2024-09-13 22:40:37 +00:00
ccurme
125c9896a8 huggingface: release 0.1 (#26463) 2024-09-13 22:39:49 +00:00
Bagatur
f7ae12fa1f openai[minor]: Release 0.2.0 (#26464) 2024-09-13 15:38:10 -07:00
ccurme
d1462badaf text-splitters: release 0.3 (#26460)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-13 22:31:06 +00:00
ccurme
9b30bdceb6 mistralai: release 0.2 (#26458)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-13 18:27:51 -04:00
Bagatur
3125a89198 infra: fix min version (#26461) 2024-09-13 22:25:22 +00:00
Bagatur
44791ce131 infra: rm pydantic from min version test (#26459) 2024-09-13 15:22:28 -07:00
Bagatur
fa8e0d90de docs: update version docs (#26457) 2024-09-13 22:20:24 +00:00
Bagatur
222caaebdd infra: fix release (#26455) 2024-09-13 15:01:36 -07:00
Erick Friis
d46ab19954 core: release 0.3.0 (#26453) 2024-09-13 21:45:45 +00:00
Erick Friis
c2a3021bb0 multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Bagatur
d9813bdbbc openai[patch]: Release 0.1.25 (#26439) 2024-09-13 12:00:01 -07:00
liuhetian
7fc9e99e21 openai[patch]: get output_type when using with_structured_output (#26307)
- This allows pydantic to correctly resolve annotations necessary when
using openai new param `json_schema`

Resolves issue: #26250

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-13 11:42:01 -07:00
Bagatur
0f2b32ffa9 core[patch]: Release 0.2.40 (#26435) 2024-09-13 09:57:09 -07:00
Bagatur
e32adad17a community[patch]: Release 0.2.17 (#26432) 2024-09-13 09:56:39 -07:00
langchain-infra
8a02fd9c01 core: add additional import mappings to loads (#26406)
Support using additional import mapping. This allows users to override
old mappings/add new imports to the loads function.

- [x ] **Add tests and docs**: If you're adding a new integration,
please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-09-13 09:39:58 -07:00
Erick Friis
1d98937e8d partners/openai: release 0.1.24 (#26417) 2024-09-12 21:54:13 -07:00
Harrison Chase
28ad244e77 community, openai: support nested dicts (#26414)
needed for thinking tokens

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-12 21:47:47 -07:00
Erick Friis
c0dd293f10 partners/groq: release 0.1.10 (#26393) 2024-09-12 17:41:11 +00:00
Erick Friis
54c85087e2 groq: add back streaming tool calls (#26391)
api no longer throws an error

https://console.groq.com/docs/tool-use#streaming
2024-09-12 10:29:45 -07:00
jessicaou
396c0aee4d docs: Adding LC Academy links (#26164)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Jess Ou <jessou@jesss-mbp.local.meter>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-11 23:37:17 +00:00
Bagatur
feb351737c core[patch]: fix empty OpenAI tools when strict=True (#26287)
Fix #26232

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-11 16:06:03 -07:00
William FH
d87feb1b04 [Docs] Correct the admonition explaining min langchain-anthropic version in doc (#26359)
0.1.15 instead of just 0.1.5
2024-09-11 23:03:42 +00:00
ccurme
398718e1cb core[patch]: fix regression in convert_to_openai_tool with instances of Tool (#26327)
```python
from langchain_core.tools import Tool
from langchain_core.utils.function_calling import convert_to_openai_tool

def my_function(x: int) -> int:
    return x + 2

tool = Tool(
    name="tool_name",
    func=my_function,
    description="test description",
)
convert_to_openai_tool(tool)
```

Current:
```
{'type': 'function',
 'function': {'name': 'tool_name',
  'description': 'test description',
  'parameters': {'type': 'object',
   'properties': {'args': {'type': 'array', 'items': {}},
    'config': {'type': 'object',
     'properties': {'tags': {'type': 'array', 'items': {'type': 'string'}},
      'metadata': {'type': 'object'},
      'callbacks': {'anyOf': [{'type': 'array', 'items': {}}, {}]},
      'run_name': {'type': 'string'},
      'max_concurrency': {'type': 'integer'},
      'recursion_limit': {'type': 'integer'},
      'configurable': {'type': 'object'},
      'run_id': {'type': 'string', 'format': 'uuid'}}},
    'kwargs': {'type': 'object'}},
   'required': ['config']}}}
```

Here:
```
{'type': 'function',
 'function': {'name': 'tool_name',
  'description': 'test description',
  'parameters': {'properties': {'__arg1': {'title': '__arg1',
     'type': 'string'}},
   'required': ['__arg1'],
   'type': 'object'}}}
```
2024-09-11 15:51:10 -04:00
이규민
7feae62ad7 core[patch]: Support non ASCII characters in tool output if user doesn't output string (#26319)
### simple modify
core: add supporting non english character

target issue is #26315 
same issue on langgraph -
https://github.com/langchain-ai/langgraph/issues/1504
2024-09-11 15:21:00 +00:00
William FH
b993172702 Keyword-like runnable config (#26295) 2024-09-11 07:44:47 -07:00
Bagatur
17659ca2cd core[patch]: Release 0.2.39 (#26279) 2024-09-10 20:11:27 +00:00
Nuno Campos
212c688ee0 core[minor]: Remove serialized manifest from tracing requests for non-llm runs (#26270)
- This takes a long time to compute, isn't used, and currently called on
every invocation of every chain/retriever/etc
2024-09-10 12:58:24 -07:00
ccurme
979232257b huggingface[patch]: add integration tests for embeddings (#26272) 2024-09-10 14:57:16 -04:00
ccurme
4ffd27c4d0 huggingface[patch]: add integration tests (#26269)
Add standard tests for ChatHuggingFace. About half of these fail
currently.
2024-09-10 18:31:51 +00:00
Emad Rad
16d41eab1e docs: typos fixed (#26234)
While going through the chatbot tutorial, I noticed a couple of typos
and grammatical issues. Also, the pip install command for
langchain_community was commented out, but the document mentions
installing it.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-10 00:52:20 +00:00
venkatram-dev
fa229d6c02 docs: fix_typo_llm_chain_tutorial (#26229)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
docs:tutorials:llm_chain:fix typo



- [ ] **PR message**: 
fix typo in llm chain tutorial

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:39:29 +00:00
Christophe Bornet
9cf7ae0a52 community: Add docstring for HtmlLinkExtractor (#26213)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:27:37 +00:00
Christophe Bornet
56580b5fff community: Add docstring for GLiNERLinkExtractor (#26218)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:27:23 +00:00
Christophe Bornet
e235a572a0 community: Add docstring for KeybertLinkExtractor (#26210)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:26:29 +00:00
Vadym Barda
bab9de581c core[patch]: wrap mermaid node names w/ markdown in <p> tag (#26235)
This fixes the issue where `__start__` and `__end__` node labels are
being interpreted as markdown, as of the most recent Mermaid update
2024-09-09 20:11:00 -04:00
miri-bar
3e48c728d5 docs: add ai21 tool calling example (#26199)
Add tool calling example to AI21 docs
2024-09-09 09:34:54 -07:00
Geoffrey HARRAZI
76bce42629 docs: Update Google BigQuery Vector Search with new SQL filter feature introduce in langchain-google-community 1.0.9 (#26184)
Hello,

fix: https://github.com/langchain-ai/langchain/issues/26183

Adding documentation regarding SQL like filter for Google BigQuery
Vector Search coming in next langchain-google-community 1.0.9 release.
Note: langchain-google-community==1.0.9 is not yet released

Question: There is no way to warn the user int the doc about the
availability of a feature after a specific package version ?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:58:28 +00:00
Matt Hull
bca51ca164 docs: Update func doc strings in tools_human (#26149)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Fix docstring for two functions that look like have
docstrings carried over from other functions.
    - **Issue:** Not found issue reporting the miss-leading docstrings.
    - **Dependencies:** None
    - **Twitter handle:** 


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:48:24 +00:00
Qasim Khan
fa17b145bb docs: fix typo in graph_constructing tutorial (#26134)
Changed 

> "At a high-level, the steps of constructing a knowledge are from text
are:"

to 

> "At a high-level, the steps of constructing a knowledge graph from
text are:"

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:46:49 +00:00
Tomaz Bratanic
181e4fc0e0 Add session expired retry to neo4j graph (#26182)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 11:40:43 -07:00
Sebastian Cherny
b3c7ed4913 Adding bind_tools in ChatOctoAI (#26168)
The object extends from
langchain_community.chat_models.openai.ChatOpenAI which doesn't have
`bind_tools` defined. I tried extending from
`langchain_openai.ChatOpenAI` in
https://github.com/langchain-ai/langchain/pull/25975 but that PR got
closed because this is not correct.
So adding our own `bind_tools` (which for now copying from ChatOpenAI is
good enough) will solve the tool calling issue we are having now.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:38:43 +00:00
Malik Ashar Khan
042e84170b Fix typo in ScrapflyLoader documentation (#26117)
This PR fixes a minor typo in the ScrapflyLoader documentation. The word
"passigng" was changed to "passing."

Before: passigng
After: passing

This change improves the clarity and professionalism of the
documentation.

Co-authored-by: Ashar <asharmalik.ds193@gmail.com>
2024-09-08 18:33:01 +00:00
John
97a8e365ec partners/unstructured: update unstructured client version (#26105)
Users are having version conflicts with `unstructured-client` as
described here:

https://unstructuredw-kbe4326.slack.com/archives/C06JJHC9G4U/p1725557970546199?thread_ts=1725035247.162819&cid=C06JJHC9G4U

This PR fixes that issue and should update the version to "0.1.3" as
well for a clean-slate version for users to install

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:32:34 +00:00
Vadym Barda
1b3bd52e0e core[patch]: fix edge labels for mermaid graphs (#26201) 2024-09-08 14:35:25 +00:00
Marcelo Machado
9bd4f1dfa8 docs: small improvement ChatOllama setup description (#26043)
Small improvement on ChatOllama description

---------

Co-authored-by: Marcelo Machado <mmachado@ibm.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:15:05 +00:00
Leonid Ganeline
2f80d67dc1 docs: integrations reference updates 16 (#26059)
Added missed provider pages and links. Fixed inconsistent formatting.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:13:53 +00:00
Ikko Eltociear Ashimine
ffdc370200 docs: update agent_executor.ipynb (#26035)
initalize -> initialize

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:07:26 +00:00
Leonid Ganeline
5052e87d7c docs: integrations reference updates 15 (#25994)
Added missed provider pages and links. Fixed inconsistent formatting.
2024-09-07 16:51:47 -07:00
Erick Friis
6e82d2184b partners/mongodb: release 0.1.9 (#26193) 2024-09-07 23:20:25 +00:00
William FH
262e19b15d infra: Clear cache for env-var checks (#26073) 2024-09-06 21:29:29 +00:00
Brace Sproul
854f37be87 docs[minor]: Add state of agents survey to docs announcement bar (#26167) 2024-09-06 14:28:08 -07:00
ChengZi
a03141ac51 partners[milvus]: fix integration test issues (#26136)
fix some integration test issues:
https://github.com/langchain-ai/langchain/actions/runs/10688447230/job/29628412258

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-06 16:52:36 +00:00
Erick Friis
5c1ebd3086 partners/unstructured: release 0.1.3 (#26119) 2024-09-06 16:22:53 +00:00
Bagatur
de97d50644 core,standard-tests[patch]: add Ser/Des test and update serialization mapping (#26042) 2024-09-04 11:58:36 -07:00
Bagatur
1241a004cb fmt 2024-09-04 11:44:59 -07:00
Bagatur
4ba14ae9e5 fmt 2024-09-04 11:34:59 -07:00
Bagatur
dba308447d fmt 2024-09-04 11:28:04 -07:00
Bagatur
fdf6fbde18 fmt 2024-09-04 11:12:11 -07:00
Bagatur
576574c82c fmt 2024-09-04 11:05:36 -07:00
Bagatur
7bf54636ff make 2024-09-04 10:24:42 -07:00
Bagatur
3ec93c2817 standard-tests[patch]: add Ser/Des test 2024-09-04 10:24:06 -07:00
Friso H. Kingma
af11fbfbf6 langchain_openai: Make sure the response from the async client in the astream method of ChatOpenAI is properly awaited in case of "include_response_headers=True" (#26031)
- **Description:** This is a **one line change**. the
`self.async_client.with_raw_response.create(**payload)` call is not
properly awaited within the `_astream` method. In `_agenerate` this is
done already, but likely forgotten in the other method.
  - **Issue:** Not applicable
  - **Dependencies:** No dependencies required.

(If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-04 13:26:48 +00:00
ZhangShenao
c812237217 Improvement[Community] Improve args description in api doc of DocArrayInMemorySearch (#26024)
- Add missing arg
- Remove redundant arg
2024-09-04 09:26:26 -04:00
Tomaz Bratanic
c649b449d7 Add the option to ignore structured output method to LLM graph transf… (#26013)
Open source models like Llama3.1 have function calling, but it's not
that great. Therefore, we introduce the option to ignore model's
function calling and just use the prompt-based approach
2024-09-04 09:15:43 -04:00
Bagatur
34fc00aff1 openai[patch]: add back azure embeddings api_version alias (#26003) 2024-09-03 17:27:10 -07:00
Bagatur
4b99426a4f openai[patch]: add back azure embeddings api_version alias 2024-09-03 17:25:03 -07:00
Eugene Yurtsev
bc3b851f08 openai[patch]: Upgrade @root_validators in preparation for pydantic 2 migration (#25491)
* Upgrade @root_validator in openai pkg
* Ran notebooks for all but AzureAI embeddings

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-03 14:42:24 -07:00
Tom Daniel Grande
0207dc1431 community: delta in openai choice can be None, creates handler for that (#25954)
Thank you for contributing to LangChain!

- [X ] **PR title**

- [X ] **PR message**: 

     **Description:** adds a handler for when delta choice is None

     **Issue:** Fixes #25951
     **Dependencies:** Not applicable


- [ X] **Add tests and docs**: Not applicable

- [X ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-03 20:30:03 +00:00
Bagatur
9eb9ff52c0 experimental[patch]: Release 0.0.65 (#25987) 2024-09-03 19:15:48 +00:00
Bagatur
bc3b02651c standard-tests[patch]: test init from env vars (#25983) 2024-09-03 19:05:39 +00:00
Bagatur
ac922105ad infra: rm ai21 from CI (#25984) 2024-09-03 11:47:27 -07:00
Bagatur
0af447c90b community[patch]: Release 0.2.16 (#25982) 2024-09-03 18:34:18 +00:00
Dan O'Donovan
f49da71e87 community[patch]: change default Neo4j username/password (#25226)
**Description:**

Change the default Neo4j username/password (when not supplied as
environment variable or in code) from `None` to `""`.

Neo4j has an option to [disable
auth](https://neo4j.com/docs/operations-manual/current/configuration/configuration-settings/#config_dbms.security.auth_enabled)
which is helpful when developing. When auth is disabled, the username /
password through the `neo4j` module should be `""` (ie an empty string).

Empty strings get marked as false in
`langchain_core.utils.env.get_from_dict_or_env` -- changing this code /
behaviour would have a wide impact and is undesirable.

In order to both _allow_ access to Neo4j with auth disabled and _not_
impact `langchain_core` this patch is presented. The downside would be
that if a user forgets to set NEO4J_USERNAME or NEO4J_PASSWORD they
would see an invalid credentials error rather than missing credentials
error. This could be mitigated but would result in a less elegant patch!

**Issue:**
Fix issue where langchain cannot communicate with Neo4j if Neo4j auth is
disabled.
2024-09-03 11:24:18 -07:00
Bagatur
035d8cf51b milvus[patch]: Release 0.1.5 (#25981) 2024-09-03 18:19:51 +00:00
1698 changed files with 70735 additions and 68415 deletions

View File

@@ -96,25 +96,21 @@ body:
attributes:
label: System Info
description: |
Please share your system info with us.
Please share your system info with us. Do NOT skip this step and please don't trim
the output. Most users don't include enough information here and it makes it harder
for us to help you.
"pip freeze | grep langchain"
platform (windows / linux / mac)
python version
OR if you're on a recent version of langchain-core you can paste the output of:
Run the following command in your terminal and paste the output here:
python -m langchain_core.sys_info
or if you have an existing python interpreter running:
from langchain_core import sys_info
sys_info.print_sys_info()
alternatively, put the entire output of `pip freeze` here.
placeholder: |
"pip freeze | grep langchain"
platform
python version
Alternatively, if you're on a recent version of langchain-core you can paste the output of:
python -m langchain_core.sys_info
These will only surface LangChain packages, don't forget to include any other relevant
packages you're using (if you're not sure what's relevant, you can paste the entire output of `pip freeze`).
validations:
required: true

View File

@@ -2,10 +2,12 @@ import glob
import json
import os
import sys
import tomllib
from collections import defaultdict
from typing import Dict, List, Set
from pathlib import Path
import tomllib
from get_min_versions import get_min_version_from_toml
LANGCHAIN_DIRS = [
@@ -16,6 +18,12 @@ LANGCHAIN_DIRS = [
"libs/experimental",
]
# when set to True, we are ignoring core dependents
# in order to be able to get CI to pass for each individual
# package that depends on core
# e.g. if you touch core, we don't then add textsplitters/etc to CI
IGNORE_CORE_DEPENDENTS = False
# ignored partners are removed from dependents
# but still run if directly edited
IGNORED_PARTNERS = [
@@ -23,9 +31,6 @@ IGNORED_PARTNERS = [
# specifically in huggingface jobs
# https://github.com/langchain-ai/langchain/issues/25558
"huggingface",
# remove ai21 because of breaking changes in sdk version 2.14.0
# that have not been fixed yet
"ai21",
]
@@ -102,44 +107,96 @@ def add_dependents(dirs_to_eval: Set[str], dependents: dict) -> List[str]:
def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
if dir_ == "libs/core":
return [
{"working-directory": dir_, "python-version": f"3.{v}"}
for v in range(8, 13)
]
min_python = "3.8"
max_python = "3.12"
if job == "test-pydantic":
return _get_pydantic_test_configs(dir_)
if dir_ == "libs/core":
py_versions = ["3.9", "3.10", "3.11", "3.12"]
# custom logic for specific directories
if dir_ == "libs/partners/milvus":
elif dir_ == "libs/partners/milvus":
# milvus poetry doesn't allow 3.12 because they
# declare deps in funny way
max_python = "3.11"
py_versions = ["3.9", "3.11"]
if dir_ in ["libs/community", "libs/langchain"] and job == "extended-tests":
elif dir_ in ["libs/community", "libs/langchain"] and job == "extended-tests":
# community extended test resolution in 3.12 is slow
# even in uv
max_python = "3.11"
py_versions = ["3.9", "3.11"]
if dir_ == "libs/community" and job == "compile-integration-tests":
elif dir_ == "libs/community" and job == "compile-integration-tests":
# community integration deps are slow in 3.12
max_python = "3.11"
py_versions = ["3.9", "3.11"]
else:
py_versions = ["3.9", "3.12"]
return [
{"working-directory": dir_, "python-version": min_python},
{"working-directory": dir_, "python-version": max_python},
return [{"working-directory": dir_, "python-version": py_v} for py_v in py_versions]
def _get_pydantic_test_configs(
dir_: str, *, python_version: str = "3.11"
) -> List[Dict[str, str]]:
with open("./libs/core/poetry.lock", "rb") as f:
core_poetry_lock_data = tomllib.load(f)
for package in core_poetry_lock_data["package"]:
if package["name"] == "pydantic":
core_max_pydantic_minor = package["version"].split(".")[1]
break
with open(f"./{dir_}/poetry.lock", "rb") as f:
dir_poetry_lock_data = tomllib.load(f)
for package in dir_poetry_lock_data["package"]:
if package["name"] == "pydantic":
dir_max_pydantic_minor = package["version"].split(".")[1]
break
core_min_pydantic_version = get_min_version_from_toml(
"./libs/core/pyproject.toml", "release", python_version, include=["pydantic"]
)["pydantic"]
core_min_pydantic_minor = core_min_pydantic_version.split(".")[1] if "." in core_min_pydantic_version else "0"
dir_min_pydantic_version = (
get_min_version_from_toml(
f"./{dir_}/pyproject.toml", "release", python_version, include=["pydantic"]
)
.get("pydantic", "0.0.0")
)
dir_min_pydantic_minor = dir_min_pydantic_version.split(".")[1] if "." in dir_min_pydantic_version else "0"
custom_mins = {
# depends on pydantic-settings 2.4 which requires pydantic 2.7
"libs/community": 7,
}
max_pydantic_minor = min(
int(dir_max_pydantic_minor),
int(core_max_pydantic_minor),
)
min_pydantic_minor = max(
int(dir_min_pydantic_minor),
int(core_min_pydantic_minor),
custom_mins.get(dir_, 0),
)
configs = [
{
"working-directory": dir_,
"pydantic-version": f"2.{v}.0",
"python-version": python_version,
}
for v in range(min_pydantic_minor, max_pydantic_minor + 1)
]
return configs
def _get_configs_for_multi_dirs(
job: str, dirs_to_run: List[str], dependents: dict
job: str, dirs_to_run: Dict[str, Set[str]], dependents: dict
) -> List[Dict[str, str]]:
if job == "lint":
dirs = add_dependents(
dirs_to_run["lint"] | dirs_to_run["test"] | dirs_to_run["extended-test"],
dependents,
)
elif job in ["test", "compile-integration-tests", "dependencies"]:
elif job in ["test", "compile-integration-tests", "dependencies", "test-pydantic"]:
dirs = add_dependents(
dirs_to_run["test"] | dirs_to_run["extended-test"], dependents
)
@@ -168,6 +225,7 @@ if __name__ == "__main__":
dirs_to_run["lint"] = all_package_dirs()
dirs_to_run["test"] = all_package_dirs()
dirs_to_run["extended-test"] = set(LANGCHAIN_DIRS)
for file in files:
if any(
file.startswith(dir_)
@@ -185,8 +243,12 @@ if __name__ == "__main__":
if any(file.startswith(dir_) for dir_ in LANGCHAIN_DIRS):
# add that dir and all dirs after in LANGCHAIN_DIRS
# for extended testing
found = False
for dir_ in LANGCHAIN_DIRS:
if dir_ == "libs/core" and IGNORE_CORE_DEPENDENTS:
dirs_to_run["extended-test"].add(dir_)
continue
if file.startswith(dir_):
found = True
if found:
@@ -198,7 +260,6 @@ if __name__ == "__main__":
dirs_to_run["test"].add("libs/partners/mistralai")
dirs_to_run["test"].add("libs/partners/openai")
dirs_to_run["test"].add("libs/partners/anthropic")
dirs_to_run["test"].add("libs/partners/ai21")
dirs_to_run["test"].add("libs/partners/fireworks")
dirs_to_run["test"].add("libs/partners/groq")
@@ -228,7 +289,6 @@ if __name__ == "__main__":
# we now have dirs_by_job
# todo: clean this up
map_job_to_configs = {
job: _get_configs_for_multi_dirs(job, dirs_to_run, dependents)
for job in [
@@ -237,6 +297,7 @@ if __name__ == "__main__":
"extended-tests",
"compile-integration-tests",
"dependencies",
"test-pydantic",
]
}
map_job_to_configs["test-doc-imports"] = (

View File

@@ -11,7 +11,7 @@ if __name__ == "__main__":
# see if we're releasing an rc
version = toml_data["tool"]["poetry"]["version"]
releasing_rc = "rc" in version
releasing_rc = "rc" in version or "dev" in version
# if not, iterate through dependencies and make sure none allow prereleases
if not releasing_rc:

View File

@@ -1,4 +1,5 @@
import sys
from typing import Optional
if sys.version_info >= (3, 11):
import tomllib
@@ -7,6 +8,9 @@ else:
import tomli as tomllib
from packaging.version import parse as parse_version
from packaging.specifiers import SpecifierSet
from packaging.version import Version
import re
MIN_VERSION_LIBS = [
@@ -17,7 +21,14 @@ MIN_VERSION_LIBS = [
"SQLAlchemy",
]
SKIP_IF_PULL_REQUEST = ["langchain-core"]
# some libs only get checked on release because of simultaneous changes in
# multiple libs
SKIP_IF_PULL_REQUEST = [
"langchain-core",
"langchain-text-splitters",
"langchain",
"langchain-community",
]
def get_min_version(version: str) -> str:
@@ -45,7 +56,13 @@ def get_min_version(version: str) -> str:
raise ValueError(f"Unrecognized version format: {version}")
def get_min_version_from_toml(toml_path: str, versions_for: str):
def get_min_version_from_toml(
toml_path: str,
versions_for: str,
python_version: str,
*,
include: Optional[list] = None,
):
# Parse the TOML file
with open(toml_path, "rb") as file:
toml_data = tomllib.load(file)
@@ -57,18 +74,26 @@ def get_min_version_from_toml(toml_path: str, versions_for: str):
min_versions = {}
# Iterate over the libs in MIN_VERSION_LIBS
for lib in MIN_VERSION_LIBS:
for lib in set(MIN_VERSION_LIBS + (include or [])):
if versions_for == "pull_request" and lib in SKIP_IF_PULL_REQUEST:
# some libs only get checked on release because of simultaneous
# changes
# changes in multiple libs
continue
# Check if the lib is present in the dependencies
if lib in dependencies:
if include and lib not in include:
continue
# Get the version string
version_string = dependencies[lib]
if isinstance(version_string, dict):
version_string = version_string["version"]
if isinstance(version_string, list):
version_string = [
vs
for vs in version_string
if check_python_version(python_version, vs["python"])
][0]["version"]
# Use parse_version to get the minimum supported version from version_string
min_version = get_min_version(version_string)
@@ -79,13 +104,31 @@ def get_min_version_from_toml(toml_path: str, versions_for: str):
return min_versions
def check_python_version(version_string, constraint_string):
"""
Check if the given Python version matches the given constraints.
:param version_string: A string representing the Python version (e.g. "3.8.5").
:param constraint_string: A string representing the package's Python version constraints (e.g. ">=3.6, <4.0").
:return: True if the version matches the constraints, False otherwise.
"""
try:
version = Version(version_string)
constraints = SpecifierSet(constraint_string)
return version in constraints
except Exception as e:
print(f"Error: {e}")
return False
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
versions_for = sys.argv[2]
python_version = sys.argv[3]
assert versions_for in ["release", "pull_request"]
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file, versions_for)
min_versions = get_min_version_from_toml(toml_file, versions_for, python_version)
print(" ".join([f"{lib}=={version}" for lib, version in min_versions.items()]))

View File

@@ -1,114 +0,0 @@
name: dependencies
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
type: string
description: "Relative path to the langchain library folder"
python-version:
required: true
type: string
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
name: dependency checks ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: pydantic-cross-compat
- name: Install dependencies
shell: bash
run: poetry install
- name: Check imports with base dependencies
shell: bash
run: poetry run make check_imports
- name: Install test dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Install the opposite major version of pydantic
# If normal tests use pydantic v1, here we'll use v2, and vice versa.
shell: bash
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
run: |
# Determine the major part of pydantic version
REGULAR_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
if [[ "$REGULAR_VERSION" == "1" ]]; then
PYDANTIC_DEP=">=2.1,<3"
TEST_WITH_VERSION="2"
elif [[ "$REGULAR_VERSION" == "2" ]]; then
PYDANTIC_DEP="<2"
TEST_WITH_VERSION="1"
else
echo "Unexpected pydantic major version '$REGULAR_VERSION', cannot determine which version to use for cross-compatibility test."
exit 1
fi
# Install via `pip` instead of `poetry add` to avoid changing lockfile,
# which would prevent caching from working: the cache would get saved
# to a different key than where it gets loaded from.
poetry run pip install "pydantic${PYDANTIC_DEP}"
# Ensure that the correct pydantic is installed now.
echo "Checking pydantic version... Expecting ${TEST_WITH_VERSION}"
# Determine the major part of pydantic version
CURRENT_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
# Check that the major part of pydantic version is as expected, if not
# raise an error
if [[ "$CURRENT_VERSION" != "$TEST_WITH_VERSION" ]]; then
echo "Error: expected pydantic version ${CURRENT_VERSION} to have been installed, but found: ${TEST_WITH_VERSION}"
exit 1
fi
echo "Found pydantic version ${CURRENT_VERSION}, as expected"
- name: Run pydantic compatibility tests
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
shell: bash
run: make test
- name: Ensure the tests did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -58,6 +58,7 @@ jobs:
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
@@ -67,6 +68,7 @@ jobs:
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}

View File

@@ -7,10 +7,6 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
type: string
description: "Relative path to the langchain library folder"
python-version:
required: true
type: string
@@ -63,14 +59,6 @@ jobs:
run: |
poetry install --with lint,typing
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Get .mypy_cache to speed up mypy
uses: actions/cache@v4
env:

View File

@@ -85,7 +85,7 @@ jobs:
path: langchain
sparse-checkout: | # this only grabs files for relevant dir
${{ inputs.working-directory }}
ref: master # this scopes to just master branch
ref: ${{ github.ref }} # this scopes to just ref'd branch
fetch-depth: 0 # this fetches entire commit history
- name: Check Tags
id: check-tags
@@ -164,6 +164,7 @@ jobs:
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
id: setup-python
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
@@ -231,7 +232,8 @@ jobs:
id: min-version
run: |
poetry run pip install packaging
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml release)"
python_version="$(poetry run python --version | awk '{print $2}')"
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml release $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
@@ -273,6 +275,7 @@ jobs:
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}

View File

@@ -7,10 +7,6 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
type: string
description: "Relative path to the langchain library folder"
python-version:
required: true
type: string
@@ -31,29 +27,41 @@ jobs:
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
id: setup-python
with:
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Run core tests
shell: bash
run: |
make test
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
shell: bash
run: |
poetry run pip install packaging tomli
python_version="$(poetry run python --version | awk '{print $2}')"
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml pull_request $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
- name: Ensure the tests did not create any additional files
shell: bash
run: |
@@ -66,20 +74,3 @@ jobs:
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging tomli
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml pull_request)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install --force-reinstall $MIN_VERSIONS --editable .
make tests
working-directory: ${{ inputs.working-directory }}

64
.github/workflows/_test_pydantic.yml vendored Normal file
View File

@@ -0,0 +1,64 @@
name: test pydantic intermediate versions
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: false
type: string
description: "Python version to use"
default: "3.11"
pydantic-version:
required: true
type: string
description: "Pydantic version to test."
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
name: "make test # pydantic: ~=${{ inputs.pydantic-version }}, python: ${{ inputs.python-version }}, "
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Overwrite pydantic version
shell: bash
run: poetry run pip install pydantic~=${{ inputs.pydantic-version }}
- name: Run core tests
shell: bash
run: |
make test
- name: Ensure the tests did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -31,6 +31,7 @@ jobs:
uses: Ana06/get-changed-files@v2.2.0
- id: set-matrix
run: |
python -m pip install packaging
python .github/scripts/check_diff.py ${{ steps.files.outputs.all }} >> $GITHUB_OUTPUT
outputs:
lint: ${{ steps.set-matrix.outputs.lint }}
@@ -39,6 +40,7 @@ jobs:
compile-integration-tests: ${{ steps.set-matrix.outputs.compile-integration-tests }}
dependencies: ${{ steps.set-matrix.outputs.dependencies }}
test-doc-imports: ${{ steps.set-matrix.outputs.test-doc-imports }}
test-pydantic: ${{ steps.set-matrix.outputs.test-pydantic }}
lint:
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
@@ -46,6 +48,7 @@ jobs:
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.lint) }}
fail-fast: false
uses: ./.github/workflows/_lint.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
@@ -59,18 +62,34 @@ jobs:
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test) }}
fail-fast: false
uses: ./.github/workflows/_test.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
test-pydantic:
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.test-pydantic != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test-pydantic) }}
fail-fast: false
uses: ./.github/workflows/_test_pydantic.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
pydantic-version: ${{ matrix.job-configs.pydantic-version }}
secrets: inherit
test-doc-imports:
needs: [ build ]
if: ${{ needs.build.outputs.test-doc-imports != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test-doc-imports) }}
fail-fast: false
uses: ./.github/workflows/_test_doc_imports.yml
secrets: inherit
with:
@@ -83,25 +102,13 @@ jobs:
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.compile-integration-tests) }}
fail-fast: false
uses: ./.github/workflows/_compile_integration_test.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
dependencies:
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dependencies != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.dependencies) }}
uses: ./.github/workflows/_dependencies.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
extended-tests:
name: "cd ${{ matrix.job-configs.working-directory }} / make extended_tests #${{ matrix.job-configs.python-version }}"
needs: [ build ]
@@ -110,6 +117,7 @@ jobs:
matrix:
# note different variable for extended test dirs
job-configs: ${{ fromJson(needs.build.outputs.extended-tests) }}
fail-fast: false
runs-on: ubuntu-latest
defaults:
run:
@@ -149,7 +157,7 @@ jobs:
echo "$STATUS" | grep 'nothing to commit, working tree clean'
ci_success:
name: "CI Success"
needs: [build, lint, test, compile-integration-tests, dependencies, extended-tests, test-doc-imports]
needs: [build, lint, test, compile-integration-tests, extended-tests, test-doc-imports, test-pydantic]
if: |
always()
runs-on: ubuntu-latest

View File

@@ -3,9 +3,8 @@ name: CI / cd . / make spell_check
on:
push:
branches: [master, v0.1]
branches: [master, v0.1, v0.2]
pull_request:
branches: [master, v0.1]
permissions:
contents: read

View File

@@ -17,7 +17,7 @@ jobs:
fail-fast: false
matrix:
python-version:
- "3.8"
- "3.9"
- "3.11"
working-directory:
- "libs/partners/openai"
@@ -86,10 +86,12 @@ jobs:
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}

View File

@@ -36,7 +36,6 @@ api_docs_build:
API_PKG ?= text-splitters
api_docs_quick_preview:
poetry run pip install "pydantic<2"
poetry run python docs/api_reference/create_api_rst.py $(API_PKG)
cd docs/api_reference && poetry run make html
poetry run python docs/api_reference/scripts/custom_formatter.py docs/api_reference/_build/html/

View File

@@ -38,8 +38,8 @@ conda install langchain -c conda-forge
For these applications, LangChain simplifies the entire application lifecycle:
- **Open-source libraries**: Build your applications using LangChain's open-source [building blocks](https://python.langchain.com/v0.2/docs/concepts#langchain-expression-language-lcel), [components](https://python.langchain.com/v0.2/docs/concepts), and [third-party integrations](https://python.langchain.com/v0.2/docs/integrations/platforms/).
Use [LangGraph](/docs/concepts/#langgraph) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Open-source libraries**: Build your applications using LangChain's open-source [building blocks](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel), [components](https://python.langchain.com/docs/concepts/), and [third-party integrations](https://python.langchain.com/docs/integrations/platforms/).
Use [LangGraph](https://langchain-ai.github.io/langgraph/) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Productionization**: Inspect, monitor, and evaluate your apps with [LangSmith](https://docs.smith.langchain.com/) so that you can constantly optimize and deploy with confidence.
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/).
@@ -49,7 +49,7 @@ For these applications, LangChain simplifies the entire application lifecycle:
- **`langchain-community`**: Third party integrations.
- Some integrations have been further split into **partner packages** that only rely on **`langchain-core`**. Examples include **`langchain_openai`** and **`langchain_anthropic`**.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **[`LangGraph`](https://langchain-ai.github.io/langgraph/)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it.
- **[`LangGraph`](https://langchain-ai.github.io/langgraph/)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it. To learn more about LangGraph, check out our first LangChain Academy course, *Introduction to LangGraph*, available [here](https://academy.langchain.com/courses/intro-to-langgraph).
### Productionization:
@@ -65,20 +65,20 @@ For these applications, LangChain simplifies the entire application lifecycle:
**❓ Question answering with RAG**
- [Documentation](https://python.langchain.com/v0.2/docs/tutorials/rag/)
- [Documentation](https://python.langchain.com/docs/tutorials/rag/)
- End-to-end Example: [Chat LangChain](https://chat.langchain.com) and [repo](https://github.com/langchain-ai/chat-langchain)
**🧱 Extracting structured output**
- [Documentation](https://python.langchain.com/v0.2/docs/tutorials/extraction/)
- [Documentation](https://python.langchain.com/docs/tutorials/extraction/)
- End-to-end Example: [SQL Llama2 Template](https://github.com/langchain-ai/langchain-extract/)
**🤖 Chatbots**
- [Documentation](https://python.langchain.com/v0.2/docs/tutorials/chatbot/)
- [Documentation](https://python.langchain.com/docs/tutorials/chatbot/)
- End-to-end Example: [Web LangChain (web researcher chatbot)](https://weblangchain.vercel.app) and [repo](https://github.com/langchain-ai/weblangchain)
And much more! Head to the [Tutorials](https://python.langchain.com/v0.2/docs/tutorials/) section of the docs for more.
And much more! Head to the [Tutorials](https://python.langchain.com/docs/tutorials/) section of the docs for more.
## 🚀 How does LangChain help?
@@ -93,10 +93,10 @@ Off-the-shelf chains make it easy to get started. Components make it easy to cus
LCEL is a key part of LangChain, allowing you to build and organize chains of processes in a straightforward, declarative manner. It was designed to support taking prototypes directly into production without needing to alter any code. This means you can use LCEL to set up everything from basic "prompt + LLM" setups to intricate, multi-step workflows.
- **[Overview](https://python.langchain.com/v0.2/docs/concepts/#langchain-expression-language-lcel)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/v0.2/docs/concepts/#runnable-interface)**: The standard Runnable interface for LCEL objects
- **[Primitives](https://python.langchain.com/v0.2/docs/how_to/#langchain-expression-language-lcel)**: More on the primitives LCEL includes
- **[Cheatsheet](https://python.langchain.com/v0.2/docs/how_to/lcel_cheatsheet/)**: Quick overview of the most common usage patterns
- **[Overview](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/docs/concepts/#runnable-interface)**: The standard Runnable interface for LCEL objects
- **[Primitives](https://python.langchain.com/docs/how_to/#langchain-expression-language-lcel)**: More on the primitives LCEL includes
- **[Cheatsheet](https://python.langchain.com/docs/how_to/lcel_cheatsheet/)**: Quick overview of the most common usage patterns
## Components
@@ -104,24 +104,24 @@ Components fall into the following **modules**:
**📃 Model I/O**
This includes [prompt management](https://python.langchain.com/v0.2/docs/concepts/#prompt-templates), [prompt optimization](https://python.langchain.com/v0.2/docs/concepts/#example-selectors), a generic interface for [chat models](https://python.langchain.com/v0.2/docs/concepts/#chat-models) and [LLMs](https://python.langchain.com/v0.2/docs/concepts/#llms), and common utilities for working with [model outputs](https://python.langchain.com/v0.2/docs/concepts/#output-parsers).
This includes [prompt management](https://python.langchain.com/docs/concepts/#prompt-templates), [prompt optimization](https://python.langchain.com/docs/concepts/#example-selectors), a generic interface for [chat models](https://python.langchain.com/docs/concepts/#chat-models) and [LLMs](https://python.langchain.com/docs/concepts/#llms), and common utilities for working with [model outputs](https://python.langchain.com/docs/concepts/#output-parsers).
**📚 Retrieval**
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/v0.2/docs/concepts/#document-loaders) from a variety of sources, [preparing it](https://python.langchain.com/v0.2/docs/concepts/#text-splitters), then [searching over (a.k.a. retrieving from)](https://python.langchain.com/v0.2/docs/concepts/#retrievers) it for use in the generation step.
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/docs/concepts/#document-loaders) from a variety of sources, [preparing it](https://python.langchain.com/docs/concepts/#text-splitters), then [searching over (a.k.a. retrieving from)](https://python.langchain.com/docs/concepts/#retrievers) it for use in the generation step.
**🤖 Agents**
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. LangChain provides a [standard interface for agents](https://python.langchain.com/v0.2/docs/concepts/#agents), along with [LangGraph](https://github.com/langchain-ai/langgraph) for building custom agents.
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. LangChain provides a [standard interface for agents](https://python.langchain.com/docs/concepts/#agents), along with [LangGraph](https://github.com/langchain-ai/langgraph) for building custom agents.
## 📖 Documentation
Please see [here](https://python.langchain.com) for full documentation, which includes:
- [Introduction](https://python.langchain.com/v0.2/docs/introduction/): Overview of the framework and the structure of the docs.
- [Introduction](https://python.langchain.com/docs/introduction/): Overview of the framework and the structure of the docs.
- [Tutorials](https://python.langchain.com/docs/use_cases/): If you're looking to build something specific or are more of a hands-on learner, check out our tutorials. This is the best place to get started.
- [How-to guides](https://python.langchain.com/v0.2/docs/how_to/): Answers to “How do I….?” type questions. These guides are goal-oriented and concrete; they're meant to help you complete a specific task.
- [Conceptual guide](https://python.langchain.com/v0.2/docs/concepts/): Conceptual explanations of the key parts of the framework.
- [How-to guides](https://python.langchain.com/docs/how_to/): Answers to “How do I….?” type questions. These guides are goal-oriented and concrete; they're meant to help you complete a specific task.
- [Conceptual guide](https://python.langchain.com/docs/concepts/): Conceptual explanations of the key parts of the framework.
- [API Reference](https://api.python.langchain.com): Thorough documentation of every class and method.
## 🌐 Ecosystem
@@ -134,7 +134,7 @@ Please see [here](https://python.langchain.com) for full documentation, which in
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](https://python.langchain.com/v0.2/docs/contributing/).
For detailed information on how to contribute, see [here](https://python.langchain.com/docs/contributing/).
## 🌟 Contributors

View File

@@ -90,7 +90,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"# Please manually enter OpenAI Key"
]
},

View File

@@ -33,8 +33,8 @@ install-py-deps:
python3 -m venv .venv
$(PYTHON) -m pip install --upgrade pip
$(PYTHON) -m pip install --upgrade uv
$(PYTHON) -m uv pip install -r vercel_requirements.txt
$(PYTHON) -m uv pip install --editable $(PARTNER_DEPS_LIST)
$(PYTHON) -m uv pip install --pre -r vercel_requirements.txt
$(PYTHON) -m uv pip install --pre --editable $(PARTNER_DEPS_LIST)
generate-files:
mkdir -p $(INTERMEDIATE_DIR)
@@ -46,7 +46,7 @@ generate-files:
$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR)
wget -q https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O $(INTERMEDIATE_DIR)/langserve.md
curl https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > $(INTERMEDIATE_DIR)/langserve.md
$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/
copy-infra:
@@ -65,7 +65,7 @@ render:
$(PYTHON) scripts/notebook_convert.py $(INTERMEDIATE_DIR) $(OUTPUT_NEW_DOCS_DIR)
md-sync:
rsync -avm --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --include="*/_category_.yml" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)
rsync -avmq --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --include="*/_category_.yml" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)
append-related:
$(PYTHON) scripts/append_related_links.py $(OUTPUT_NEW_DOCS_DIR)
@@ -86,10 +86,6 @@ vercel-build: install-vercel-deps build generate-references
mv langchain-api-docs-build/api_reference_build/html/* static/api_reference/
rm -rf langchain-api-docs-build
NODE_OPTIONS="--max-old-space-size=5000" yarn run docusaurus build
mv build v0.2
mkdir build
mv v0.2 build
mv build/v0.2/404.html build
start:
cd $(OUTPUT_NEW_DIR) && yarn && yarn start --port=$(PORT)

File diff suppressed because one or more lines are too long

View File

@@ -1,5 +1,5 @@
autodoc_pydantic>=1,<2
sphinx<=7
autodoc_pydantic>=2,<3
sphinx>=8,<9
myst-parser>=3
sphinx-autobuild>=2024
pydata-sphinx-theme>=0.15
@@ -8,4 +8,4 @@ myst-nb>=1.1.1
pyyaml
sphinx-design
sphinx-copybutton
beautifulsoup4
beautifulsoup4

View File

@@ -17,7 +17,10 @@ def process_toc_h3_elements(html_content: str) -> str:
# Process each element
for element in toc_h3_elements:
element = element.a.code.span
try:
element = element.a.code.span
except Exception:
continue
# Get the text content of the element
content = element.get_text()

View File

@@ -15,7 +15,7 @@
:member-order: groupwise
:show-inheritance: True
:special-members: __call__
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace, model_construct, model_copy, model_dump, model_dump_json, model_parametrized_name, model_post_init, model_rebuild, model_validate, model_validate_json, model_validate_strings, model_extra, model_fields_set, model_json_schema
{% block attributes %}

View File

@@ -15,7 +15,7 @@
:member-order: groupwise
:show-inheritance: True
:special-members: __call__
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace, astream_log, transform, atransform, get_output_schema, get_prompts, config_schema, map, pick, pipe, with_listeners, with_alisteners, with_config, with_fallbacks, with_types, with_retry, InputType, OutputType, config_specs, output_schema, get_input_schema, get_graph, get_name, input_schema, name, bind, assign, as_tool
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace, astream_log, transform, atransform, get_output_schema, get_prompts, config_schema, map, pick, pipe, InputType, OutputType, config_specs, output_schema, get_input_schema, get_graph, get_name, input_schema, name, assign, as_tool, get_config_jsonschema, get_input_jsonschema, get_output_jsonschema, model_construct, model_copy, model_dump, model_dump_json, model_parametrized_name, model_post_init, model_rebuild, model_validate, model_validate_json, model_validate_strings, to_json, model_extra, model_fields_set, model_json_schema, predict, apredict, predict_messages, apredict_messages, generate, generate_prompt, agenerate, agenerate_prompt, call_as_llm
.. NOTE:: {{objname}} implements the standard :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>`. 🏃

View File

@@ -13,45 +13,45 @@ From the opposite direction, scientists use `LangChain` in research and referenc
| arXiv id / Title | Authors | Published date 🔻 | LangChain Documentation|
|------------------|---------|-------------------|------------------------|
| `2403.14403v2` [Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity](http://arxiv.org/abs/2403.14403v2) | Soyeong Jeong, Jinheon Baek, Sukmin Cho, et al. | 2024&#8209;03&#8209;21 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2403.14403v2` [Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity](http://arxiv.org/abs/2403.14403v2) | Soyeong Jeong, Jinheon Baek, Sukmin Cho, et al. | 2024&#8209;03&#8209;21 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `2402.03620v1` [Self-Discover: Large Language Models Self-Compose Reasoning Structures](http://arxiv.org/abs/2402.03620v1) | Pei Zhou, Jay Pujara, Xiang Ren, et al. | 2024&#8209;02&#8209;06 | `Cookbook:` [Self-Discover](https://github.com/langchain-ai/langchain/blob/master/cookbook/self-discover.ipynb)
| `2402.03367v2` [RAG-Fusion: a New Take on Retrieval-Augmented Generation](http://arxiv.org/abs/2402.03367v2) | Zackary Rackauckas | 2024&#8209;01&#8209;31 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2402.03367v2` [RAG-Fusion: a New Take on Retrieval-Augmented Generation](http://arxiv.org/abs/2402.03367v2) | Zackary Rackauckas | 2024&#8209;01&#8209;31 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `2401.18059v1` [RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval](http://arxiv.org/abs/2401.18059v1) | Parth Sarthi, Salman Abdullah, Aditi Tuli, et al. | 2024&#8209;01&#8209;31 | `Cookbook:` [Raptor](https://github.com/langchain-ai/langchain/blob/master/cookbook/RAPTOR.ipynb)
| `2401.15884v2` [Corrective Retrieval Augmented Generation](http://arxiv.org/abs/2401.15884v2) | Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, et al. | 2024&#8209;01&#8209;29 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), `Cookbook:` [Langgraph Crag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_crag.ipynb)
| `2401.08500v1` [Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering](http://arxiv.org/abs/2401.08500v1) | Tal Ridnik, Dedy Kredo, Itamar Friedman | 2024&#8209;01&#8209;16 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2401.15884v2` [Corrective Retrieval Augmented Generation](http://arxiv.org/abs/2401.15884v2) | Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, et al. | 2024&#8209;01&#8209;29 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts), `Cookbook:` [Langgraph Crag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_crag.ipynb)
| `2401.08500v1` [Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering](http://arxiv.org/abs/2401.08500v1) | Tal Ridnik, Dedy Kredo, Itamar Friedman | 2024&#8209;01&#8209;16 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `2401.04088v1` [Mixtral of Experts](http://arxiv.org/abs/2401.04088v1) | Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, et al. | 2024&#8209;01&#8209;08 | `Cookbook:` [Together Ai](https://github.com/langchain-ai/langchain/blob/master/cookbook/together_ai.ipynb)
| `2312.06648v2` [Dense X Retrieval: What Retrieval Granularity Should We Use?](http://arxiv.org/abs/2312.06648v2) | Tong Chen, Hongwei Wang, Sihao Chen, et al. | 2023&#8209;12&#8209;11 | `Template:` [propositional-retrieval](https://python.langchain.com/docs/templates/propositional-retrieval)
| `2311.09210v1` [Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models](http://arxiv.org/abs/2311.09210v1) | Wenhao Yu, Hongming Zhang, Xiaoman Pan, et al. | 2023&#8209;11&#8209;15 | `Template:` [chain-of-note-wiki](https://python.langchain.com/docs/templates/chain-of-note-wiki)
| `2310.11511v1` [Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection](http://arxiv.org/abs/2310.11511v1) | Akari Asai, Zeqiu Wu, Yizhong Wang, et al. | 2023&#8209;10&#8209;17 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), `Cookbook:` [Langgraph Self Rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_self_rag.ipynb)
| `2310.06117v2` [Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models](http://arxiv.org/abs/2310.06117v2) | Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, et al. | 2023&#8209;10&#8209;09 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), `Template:` [stepback-qa-prompting](https://python.langchain.com/docs/templates/stepback-qa-prompting), `Cookbook:` [Stepback-Qa](https://github.com/langchain-ai/langchain/blob/master/cookbook/stepback-qa.ipynb)
| `2310.11511v1` [Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection](http://arxiv.org/abs/2310.11511v1) | Akari Asai, Zeqiu Wu, Yizhong Wang, et al. | 2023&#8209;10&#8209;17 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts), `Cookbook:` [Langgraph Self Rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_self_rag.ipynb)
| `2310.06117v2` [Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models](http://arxiv.org/abs/2310.06117v2) | Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, et al. | 2023&#8209;10&#8209;09 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts), `Template:` [stepback-qa-prompting](https://python.langchain.com/docs/templates/stepback-qa-prompting), `Cookbook:` [Stepback-Qa](https://github.com/langchain-ai/langchain/blob/master/cookbook/stepback-qa.ipynb)
| `2307.15337v3` [Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation](http://arxiv.org/abs/2307.15337v3) | Xuefei Ning, Zinan Lin, Zixuan Zhou, et al. | 2023&#8209;07&#8209;28 | `Template:` [skeleton-of-thought](https://python.langchain.com/docs/templates/skeleton-of-thought)
| `2307.09288v2` [Llama 2: Open Foundation and Fine-Tuned Chat Models](http://arxiv.org/abs/2307.09288v2) | Hugo Touvron, Louis Martin, Kevin Stone, et al. | 2023&#8209;07&#8209;18 | `Cookbook:` [Semi Structured Rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_Structured_RAG.ipynb)
| `2307.03172v3` [Lost in the Middle: How Language Models Use Long Contexts](http://arxiv.org/abs/2307.03172v3) | Nelson F. Liu, Kevin Lin, John Hewitt, et al. | 2023&#8209;07&#8209;06 | `Docs:` [docs/how_to/long_context_reorder](https://python.langchain.com/v0.2/docs/how_to/long_context_reorder)
| `2307.03172v3` [Lost in the Middle: How Language Models Use Long Contexts](http://arxiv.org/abs/2307.03172v3) | Nelson F. Liu, Kevin Lin, John Hewitt, et al. | 2023&#8209;07&#8209;06 | `Docs:` [docs/how_to/long_context_reorder](https://python.langchain.com/docs/how_to/long_context_reorder)
| `2305.14283v3` [Query Rewriting for Retrieval-Augmented Large Language Models](http://arxiv.org/abs/2305.14283v3) | Xinbei Ma, Yeyun Gong, Pengcheng He, et al. | 2023&#8209;05&#8209;23 | `Template:` [rewrite-retrieve-read](https://python.langchain.com/docs/templates/rewrite-retrieve-read), `Cookbook:` [Rewrite](https://github.com/langchain-ai/langchain/blob/master/cookbook/rewrite.ipynb)
| `2305.08291v1` [Large Language Model Guided Tree-of-Thought](http://arxiv.org/abs/2305.08291v1) | Jieyi Long | 2023&#8209;05&#8209;15 | `API:` [langchain_experimental.tot](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.tot), `Cookbook:` [Tree Of Thought](https://github.com/langchain-ai/langchain/blob/master/cookbook/tree_of_thought.ipynb)
| `2305.04091v3` [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](http://arxiv.org/abs/2305.04091v3) | Lei Wang, Wanyu Xu, Yihuai Lan, et al. | 2023&#8209;05&#8209;06 | `Cookbook:` [Plan And Execute Agent](https://github.com/langchain-ai/langchain/blob/master/cookbook/plan_and_execute_agent.ipynb)
| `2305.02156v1` [Zero-Shot Listwise Document Reranking with a Large Language Model](http://arxiv.org/abs/2305.02156v1) | Xueguang Ma, Xinyu Zhang, Ronak Pradeep, et al. | 2023&#8209;05&#8209;03 | `Docs:` [docs/how_to/contextual_compression](https://python.langchain.com/v0.2/docs/how_to/contextual_compression), `API:` [langchain...LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html#langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank)
| `2305.02156v1` [Zero-Shot Listwise Document Reranking with a Large Language Model](http://arxiv.org/abs/2305.02156v1) | Xueguang Ma, Xinyu Zhang, Ronak Pradeep, et al. | 2023&#8209;05&#8209;03 | `Docs:` [docs/how_to/contextual_compression](https://python.langchain.com/docs/how_to/contextual_compression), `API:` [langchain...LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html#langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank)
| `2304.08485v2` [Visual Instruction Tuning](http://arxiv.org/abs/2304.08485v2) | Haotian Liu, Chunyuan Li, Qingyang Wu, et al. | 2023&#8209;04&#8209;17 | `Cookbook:` [Semi Structured Multi Modal Rag Llama2](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb), [Semi Structured And Multi Modal Rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb)
| `2304.03442v2` [Generative Agents: Interactive Simulacra of Human Behavior](http://arxiv.org/abs/2304.03442v2) | Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, et al. | 2023&#8209;04&#8209;07 | `Cookbook:` [Generative Agents Interactive Simulacra Of Human Behavior](https://github.com/langchain-ai/langchain/blob/master/cookbook/generative_agents_interactive_simulacra_of_human_behavior.ipynb), [Multiagent Bidding](https://github.com/langchain-ai/langchain/blob/master/cookbook/multiagent_bidding.ipynb)
| `2303.17760v2` [CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society](http://arxiv.org/abs/2303.17760v2) | Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, et al. | 2023&#8209;03&#8209;31 | `Cookbook:` [Camel Role Playing](https://github.com/langchain-ai/langchain/blob/master/cookbook/camel_role_playing.ipynb)
| `2303.17580v4` [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face](http://arxiv.org/abs/2303.17580v4) | Yongliang Shen, Kaitao Song, Xu Tan, et al. | 2023&#8209;03&#8209;30 | `API:` [langchain_experimental.autonomous_agents](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.autonomous_agents), `Cookbook:` [Hugginggpt](https://github.com/langchain-ai/langchain/blob/master/cookbook/hugginggpt.ipynb)
| `2301.10226v4` [A Watermark for Large Language Models](http://arxiv.org/abs/2301.10226v4) | John Kirchenbauer, Jonas Geiping, Yuxin Wen, et al. | 2023&#8209;01&#8209;24 | `API:` [langchain_community...OCIModelDeploymentTGI](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI.html#langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI), [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint)
| `2212.10496v1` [Precise Zero-Shot Dense Retrieval without Relevance Labels](http://arxiv.org/abs/2212.10496v1) | Luyu Gao, Xueguang Ma, Jimmy Lin, et al. | 2022&#8209;12&#8209;20 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), `API:` [langchain...HypotheticalDocumentEmbedder](https://api.python.langchain.com/en/latest/chains/langchain.chains.hyde.base.HypotheticalDocumentEmbedder.html#langchain.chains.hyde.base.HypotheticalDocumentEmbedder), `Template:` [hyde](https://python.langchain.com/docs/templates/hyde), `Cookbook:` [Hypothetical Document Embeddings](https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb)
| `2212.08073v1` [Constitutional AI: Harmlessness from AI Feedback](http://arxiv.org/abs/2212.08073v1) | Yuntao Bai, Saurav Kadavath, Sandipan Kundu, et al. | 2022&#8209;12&#8209;15 | `Docs:` [docs/versions/migrating_chains/constitutional_chain](https://python.langchain.com/v0.2/docs/versions/migrating_chains/constitutional_chain)
| `2212.10496v1` [Precise Zero-Shot Dense Retrieval without Relevance Labels](http://arxiv.org/abs/2212.10496v1) | Luyu Gao, Xueguang Ma, Jimmy Lin, et al. | 2022&#8209;12&#8209;20 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts), `API:` [langchain...HypotheticalDocumentEmbedder](https://api.python.langchain.com/en/latest/chains/langchain.chains.hyde.base.HypotheticalDocumentEmbedder.html#langchain.chains.hyde.base.HypotheticalDocumentEmbedder), `Template:` [hyde](https://python.langchain.com/docs/templates/hyde), `Cookbook:` [Hypothetical Document Embeddings](https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb)
| `2212.08073v1` [Constitutional AI: Harmlessness from AI Feedback](http://arxiv.org/abs/2212.08073v1) | Yuntao Bai, Saurav Kadavath, Sandipan Kundu, et al. | 2022&#8209;12&#8209;15 | `Docs:` [docs/versions/migrating_chains/constitutional_chain](https://python.langchain.com/docs/versions/migrating_chains/constitutional_chain)
| `2212.07425v3` [Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments](http://arxiv.org/abs/2212.07425v3) | Zhivar Sourati, Vishnu Priya Prasanna Venkatesh, Darshan Deshpande, et al. | 2022&#8209;12&#8209;12 | `API:` [langchain_experimental.fallacy_removal](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.fallacy_removal)
| `2211.13892v2` [Complementary Explanations for Effective In-Context Learning](http://arxiv.org/abs/2211.13892v2) | Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, et al. | 2022&#8209;11&#8209;25 | `API:` [langchain_core...MaxMarginalRelevanceExampleSelector](https://api.python.langchain.com/en/latest/example_selectors/langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector.html#langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector)
| `2211.10435v2` [PAL: Program-aided Language Models](http://arxiv.org/abs/2211.10435v2) | Luyu Gao, Aman Madaan, Shuyan Zhou, et al. | 2022&#8209;11&#8209;18 | `API:` [langchain_experimental.pal_chain](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.pal_chain), [langchain_experimental...PALChain](https://api.python.langchain.com/en/latest/pal_chain/langchain_experimental.pal_chain.base.PALChain.html#langchain_experimental.pal_chain.base.PALChain), `Cookbook:` [Program Aided Language Model](https://github.com/langchain-ai/langchain/blob/master/cookbook/program_aided_language_model.ipynb)
| `2210.11934v2` [An Analysis of Fusion Functions for Hybrid Retrieval](http://arxiv.org/abs/2210.11934v2) | Sebastian Bruch, Siyu Gai, Amir Ingber | 2022&#8209;10&#8209;21 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2210.03629v3` [ReAct: Synergizing Reasoning and Acting in Language Models](http://arxiv.org/abs/2210.03629v3) | Shunyu Yao, Jeffrey Zhao, Dian Yu, et al. | 2022&#8209;10&#8209;06 | `Docs:` [docs/integrations/tools/ionic_shopping](https://python.langchain.com/v0.2/docs/integrations/tools/ionic_shopping), [docs/integrations/providers/cohere](https://python.langchain.com/v0.2/docs/integrations/providers/cohere), [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), `API:` [langchain...create_react_agent](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html#langchain.agents.react.agent.create_react_agent), [langchain...TrajectoryEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain.html#langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain)
| `2209.10785v2` [Deep Lake: a Lakehouse for Deep Learning](http://arxiv.org/abs/2209.10785v2) | Sasun Hambardzumyan, Abhinav Tuli, Levon Ghukasyan, et al. | 2022&#8209;09&#8209;22 | `Docs:` [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/v0.2/docs/integrations/providers/activeloop_deeplake)
| `2205.13147v4` [Matryoshka Representation Learning](http://arxiv.org/abs/2205.13147v4) | Aditya Kusupati, Gantavya Bhatt, Aniket Rege, et al. | 2022&#8209;05&#8209;26 | `Docs:` [docs/integrations/providers/snowflake](https://python.langchain.com/v0.2/docs/integrations/providers/snowflake)
| `2210.11934v2` [An Analysis of Fusion Functions for Hybrid Retrieval](http://arxiv.org/abs/2210.11934v2) | Sebastian Bruch, Siyu Gai, Amir Ingber | 2022&#8209;10&#8209;21 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `2210.03629v3` [ReAct: Synergizing Reasoning and Acting in Language Models](http://arxiv.org/abs/2210.03629v3) | Shunyu Yao, Jeffrey Zhao, Dian Yu, et al. | 2022&#8209;10&#8209;06 | `Docs:` [docs/integrations/tools/ionic_shopping](https://python.langchain.com/docs/integrations/tools/ionic_shopping), [docs/integrations/providers/cohere](https://python.langchain.com/docs/integrations/providers/cohere), [docs/concepts](https://python.langchain.com/docs/concepts), `API:` [langchain...create_react_agent](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html#langchain.agents.react.agent.create_react_agent), [langchain...TrajectoryEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain.html#langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain)
| `2209.10785v2` [Deep Lake: a Lakehouse for Deep Learning](http://arxiv.org/abs/2209.10785v2) | Sasun Hambardzumyan, Abhinav Tuli, Levon Ghukasyan, et al. | 2022&#8209;09&#8209;22 | `Docs:` [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/docs/integrations/providers/activeloop_deeplake)
| `2205.13147v4` [Matryoshka Representation Learning](http://arxiv.org/abs/2205.13147v4) | Aditya Kusupati, Gantavya Bhatt, Aniket Rege, et al. | 2022&#8209;05&#8209;26 | `Docs:` [docs/integrations/providers/snowflake](https://python.langchain.com/docs/integrations/providers/snowflake)
| `2205.12654v1` [Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages](http://arxiv.org/abs/2205.12654v1) | Kevin Heffernan, Onur Çelebi, Holger Schwenk | 2022&#8209;05&#8209;25 | `API:` [langchain_community...LaserEmbeddings](https://api.python.langchain.com/en/latest/embeddings/langchain_community.embeddings.laser.LaserEmbeddings.html#langchain_community.embeddings.laser.LaserEmbeddings)
| `2204.00498v1` [Evaluating the Text-to-SQL Capabilities of Large Language Models](http://arxiv.org/abs/2204.00498v1) | Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau | 2022&#8209;03&#8209;15 | `Docs:` [docs/tutorials/sql_qa](https://python.langchain.com/v0.2/docs/tutorials/sql_qa), `API:` [langchain_community...SQLDatabase](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.sql_database.SQLDatabase.html#langchain_community.utilities.sql_database.SQLDatabase), [langchain_community...SparkSQL](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.spark_sql.SparkSQL.html#langchain_community.utilities.spark_sql.SparkSQL)
| `2204.00498v1` [Evaluating the Text-to-SQL Capabilities of Large Language Models](http://arxiv.org/abs/2204.00498v1) | Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau | 2022&#8209;03&#8209;15 | `Docs:` [docs/tutorials/sql_qa](https://python.langchain.com/docs/tutorials/sql_qa), `API:` [langchain_community...SQLDatabase](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.sql_database.SQLDatabase.html#langchain_community.utilities.sql_database.SQLDatabase), [langchain_community...SparkSQL](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.spark_sql.SparkSQL.html#langchain_community.utilities.spark_sql.SparkSQL)
| `2202.00666v5` [Locally Typical Sampling](http://arxiv.org/abs/2202.00666v5) | Clara Meister, Tiago Pimentel, Gian Wiher, et al. | 2022&#8209;02&#8209;01 | `API:` [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint)
| `2112.01488v3` [ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction](http://arxiv.org/abs/2112.01488v3) | Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, et al. | 2021&#8209;12&#8209;02 | `Docs:` [docs/integrations/retrievers/ragatouille](https://python.langchain.com/v0.2/docs/integrations/retrievers/ragatouille), [docs/integrations/providers/ragatouille](https://python.langchain.com/v0.2/docs/integrations/providers/ragatouille), [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), [docs/integrations/providers/dspy](https://python.langchain.com/v0.2/docs/integrations/providers/dspy)
| `2112.01488v3` [ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction](http://arxiv.org/abs/2112.01488v3) | Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, et al. | 2021&#8209;12&#8209;02 | `Docs:` [docs/integrations/retrievers/ragatouille](https://python.langchain.com/docs/integrations/retrievers/ragatouille), [docs/integrations/providers/ragatouille](https://python.langchain.com/docs/integrations/providers/ragatouille), [docs/concepts](https://python.langchain.com/docs/concepts), [docs/integrations/providers/dspy](https://python.langchain.com/docs/integrations/providers/dspy)
| `2103.00020v1` [Learning Transferable Visual Models From Natural Language Supervision](http://arxiv.org/abs/2103.00020v1) | Alec Radford, Jong Wook Kim, Chris Hallacy, et al. | 2021&#8209;02&#8209;26 | `API:` [langchain_experimental.open_clip](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.open_clip)
| `2005.14165v4` [Language Models are Few-Shot Learners](http://arxiv.org/abs/2005.14165v4) | Tom B. Brown, Benjamin Mann, Nick Ryder, et al. | 2020&#8209;05&#8209;28 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2005.11401v4` [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](http://arxiv.org/abs/2005.11401v4) | Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al. | 2020&#8209;05&#8209;22 | `Docs:` [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
| `2005.14165v4` [Language Models are Few-Shot Learners](http://arxiv.org/abs/2005.14165v4) | Tom B. Brown, Benjamin Mann, Nick Ryder, et al. | 2020&#8209;05&#8209;28 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `2005.11401v4` [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](http://arxiv.org/abs/2005.11401v4) | Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al. | 2020&#8209;05&#8209;22 | `Docs:` [docs/concepts](https://python.langchain.com/docs/concepts)
| `1909.05858v2` [CTRL: A Conditional Transformer Language Model for Controllable Generation](http://arxiv.org/abs/1909.05858v2) | Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, et al. | 2019&#8209;09&#8209;11 | `API:` [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint)
## Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity
@@ -60,7 +60,7 @@ From the opposite direction, scientists use `LangChain` in research and referenc
- **arXiv id:** [2403.14403v2](http://arxiv.org/abs/2403.14403v2) **Published Date:** 2024-03-21
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** Retrieval-Augmented Large Language Models (LLMs), which incorporate the
non-parametric knowledge from external knowledge bases into LLMs, have emerged
@@ -113,7 +113,7 @@ commonalities with human reasoning patterns.
- **arXiv id:** [2402.03367v2](http://arxiv.org/abs/2402.03367v2) **Published Date:** 2024-01-31
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** Infineon has identified a need for engineers, account managers, and customers
to rapidly obtain product information. This problem is traditionally addressed
@@ -159,7 +159,7 @@ benchmark by 20% in absolute accuracy.
- **arXiv id:** [2401.15884v2](http://arxiv.org/abs/2401.15884v2) **Published Date:** 2024-01-29
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
- **Cookbook:** [langgraph_crag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_crag.ipynb)
**Abstract:** Large language models (LLMs) inevitably exhibit hallucinations since the
@@ -187,7 +187,7 @@ performance of RAG-based approaches.
- **arXiv id:** [2401.08500v1](http://arxiv.org/abs/2401.08500v1) **Published Date:** 2024-01-16
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** Code generation problems differ from common natural language problems - they
require matching the exact syntax of the target language, identifying happy
@@ -293,7 +293,7 @@ outside the pre-training knowledge scope.
- **arXiv id:** [2310.11511v1](http://arxiv.org/abs/2310.11511v1) **Published Date:** 2023-10-17
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
- **Cookbook:** [langgraph_self_rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_self_rag.ipynb)
**Abstract:** Despite their remarkable capabilities, large language models (LLMs) often
@@ -324,7 +324,7 @@ to these models.
- **arXiv id:** [2310.06117v2](http://arxiv.org/abs/2310.06117v2) **Published Date:** 2023-10-09
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
- **Template:** [stepback-qa-prompting](https://python.langchain.com/docs/templates/stepback-qa-prompting)
- **Cookbook:** [stepback-qa](https://github.com/langchain-ai/langchain/blob/master/cookbook/stepback-qa.ipynb)
@@ -384,7 +384,7 @@ contribute to the responsible development of LLMs.
- **arXiv id:** [2307.03172v3](http://arxiv.org/abs/2307.03172v3) **Published Date:** 2023-07-06
- **LangChain:**
- **Documentation:** [docs/how_to/long_context_reorder](https://python.langchain.com/v0.2/docs/how_to/long_context_reorder)
- **Documentation:** [docs/how_to/long_context_reorder](https://python.langchain.com/docs/how_to/long_context_reorder)
**Abstract:** While recent language models have the ability to take long contexts as input,
relatively little is known about how well they use longer context. We analyze
@@ -451,8 +451,7 @@ steps of the thought-process and explore other directions from there. To verify
the effectiveness of the proposed technique, we implemented a ToT-based solver
for the Sudoku Puzzle. Experimental results show that the ToT framework can
significantly increase the success rate of Sudoku puzzle solving. Our
implementation of the ToT-based Sudoku solver is available on GitHub:
\url{https://github.com/jieyilong/tree-of-thought-puzzle-solver}.
implementation of the ToT-based Sudoku solver is available on [GitHub](https://github.com/jieyilong/tree-of-thought-puzzle-solver).
## Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
@@ -490,7 +489,7 @@ https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting.
- **arXiv id:** [2305.02156v1](http://arxiv.org/abs/2305.02156v1) **Published Date:** 2023-05-03
- **LangChain:**
- **Documentation:** [docs/how_to/contextual_compression](https://python.langchain.com/v0.2/docs/how_to/contextual_compression)
- **Documentation:** [docs/how_to/contextual_compression](https://python.langchain.com/docs/how_to/contextual_compression)
- **API Reference:** [langchain...LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html#langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank)
**Abstract:** Supervised ranking methods based on bi-encoder or cross-encoder architectures
@@ -649,7 +648,7 @@ family, and discuss robustness and security.
- **arXiv id:** [2212.10496v1](http://arxiv.org/abs/2212.10496v1) **Published Date:** 2022-12-20
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
- **API Reference:** [langchain...HypotheticalDocumentEmbedder](https://api.python.langchain.com/en/latest/chains/langchain.chains.hyde.base.HypotheticalDocumentEmbedder.html#langchain.chains.hyde.base.HypotheticalDocumentEmbedder)
- **Template:** [hyde](https://python.langchain.com/docs/templates/hyde)
- **Cookbook:** [hypothetical_document_embeddings](https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb)
@@ -678,7 +677,7 @@ search, QA, fact verification) and languages~(e.g. sw, ko, ja).
- **arXiv id:** [2212.08073v1](http://arxiv.org/abs/2212.08073v1) **Published Date:** 2022-12-15
- **LangChain:**
- **Documentation:** [docs/versions/migrating_chains/constitutional_chain](https://python.langchain.com/v0.2/docs/versions/migrating_chains/constitutional_chain)
- **Documentation:** [docs/versions/migrating_chains/constitutional_chain](https://python.langchain.com/docs/versions/migrating_chains/constitutional_chain)
**Abstract:** As AI systems become more capable, we would like to enlist their help to
supervise other AIs. We experiment with methods for training a harmless AI
@@ -792,7 +791,7 @@ publicly available at http://reasonwithpal.com/ .
- **arXiv id:** [2210.11934v2](http://arxiv.org/abs/2210.11934v2) **Published Date:** 2022-10-21
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** We study hybrid search in text retrieval where lexical and semantic search
are fused together with the intuition that the two are complementary in how
@@ -811,7 +810,7 @@ training examples to tune its only parameter to a target domain.
- **arXiv id:** [2210.03629v3](http://arxiv.org/abs/2210.03629v3) **Published Date:** 2022-10-06
- **LangChain:**
- **Documentation:** [docs/integrations/tools/ionic_shopping](https://python.langchain.com/v0.2/docs/integrations/tools/ionic_shopping), [docs/integrations/providers/cohere](https://python.langchain.com/v0.2/docs/integrations/providers/cohere), [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/integrations/tools/ionic_shopping](https://python.langchain.com/docs/integrations/tools/ionic_shopping), [docs/integrations/providers/cohere](https://python.langchain.com/docs/integrations/providers/cohere), [docs/concepts](https://python.langchain.com/docs/concepts)
- **API Reference:** [langchain...create_react_agent](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html#langchain.agents.react.agent.create_react_agent), [langchain...TrajectoryEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain.html#langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain)
**Abstract:** While large language models (LLMs) have demonstrated impressive capabilities
@@ -843,7 +842,7 @@ Project site with code: https://react-lm.github.io
- **arXiv id:** [2209.10785v2](http://arxiv.org/abs/2209.10785v2) **Published Date:** 2022-09-22
- **LangChain:**
- **Documentation:** [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/v0.2/docs/integrations/providers/activeloop_deeplake)
- **Documentation:** [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/docs/integrations/providers/activeloop_deeplake)
**Abstract:** Traditional data lakes provide critical data infrastructure for analytical
workloads by enabling time travel, running SQL queries, ingesting data with
@@ -868,7 +867,7 @@ TensorFlow, JAX, and integrate with numerous MLOps tools.
- **arXiv id:** [2205.13147v4](http://arxiv.org/abs/2205.13147v4) **Published Date:** 2022-05-26
- **LangChain:**
- **Documentation:** [docs/integrations/providers/snowflake](https://python.langchain.com/v0.2/docs/integrations/providers/snowflake)
- **Documentation:** [docs/integrations/providers/snowflake](https://python.langchain.com/docs/integrations/providers/snowflake)
**Abstract:** Learned representations are a central component in modern ML systems, serving
a multitude of downstream tasks. When training such representations, it is
@@ -925,7 +924,7 @@ encoders, mine bitexts, and validate the bitexts by training NMT systems.
- **arXiv id:** [2204.00498v1](http://arxiv.org/abs/2204.00498v1) **Published Date:** 2022-03-15
- **LangChain:**
- **Documentation:** [docs/tutorials/sql_qa](https://python.langchain.com/v0.2/docs/tutorials/sql_qa)
- **Documentation:** [docs/tutorials/sql_qa](https://python.langchain.com/docs/tutorials/sql_qa)
- **API Reference:** [langchain_community...SQLDatabase](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.sql_database.SQLDatabase.html#langchain_community.utilities.sql_database.SQLDatabase), [langchain_community...SparkSQL](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.spark_sql.SparkSQL.html#langchain_community.utilities.spark_sql.SparkSQL)
**Abstract:** We perform an empirical evaluation of Text-to-SQL capabilities of the Codex
@@ -971,7 +970,7 @@ reducing degenerate repetitions.
- **arXiv id:** [2112.01488v3](http://arxiv.org/abs/2112.01488v3) **Published Date:** 2021-12-02
- **LangChain:**
- **Documentation:** [docs/integrations/retrievers/ragatouille](https://python.langchain.com/v0.2/docs/integrations/retrievers/ragatouille), [docs/integrations/providers/ragatouille](https://python.langchain.com/v0.2/docs/integrations/providers/ragatouille), [docs/concepts](https://python.langchain.com/v0.2/docs/concepts), [docs/integrations/providers/dspy](https://python.langchain.com/v0.2/docs/integrations/providers/dspy)
- **Documentation:** [docs/integrations/retrievers/ragatouille](https://python.langchain.com/docs/integrations/retrievers/ragatouille), [docs/integrations/providers/ragatouille](https://python.langchain.com/docs/integrations/providers/ragatouille), [docs/concepts](https://python.langchain.com/docs/concepts), [docs/integrations/providers/dspy](https://python.langchain.com/docs/integrations/providers/dspy)
**Abstract:** Neural information retrieval (IR) has greatly advanced search and other
knowledge-intensive language tasks. While many neural IR methods encode queries
@@ -1022,7 +1021,7 @@ https://github.com/OpenAI/CLIP.
- **arXiv id:** [2005.14165v4](http://arxiv.org/abs/2005.14165v4) **Published Date:** 2020-05-28
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** Recent work has demonstrated substantial gains on many NLP tasks and
benchmarks by pre-training on a large corpus of text followed by fine-tuning on
@@ -1055,7 +1054,7 @@ and of GPT-3 in general.
- **arXiv id:** [2005.11401v4](http://arxiv.org/abs/2005.11401v4) **Published Date:** 2020-05-22
- **LangChain:**
- **Documentation:** [docs/concepts](https://python.langchain.com/v0.2/docs/concepts)
- **Documentation:** [docs/concepts](https://python.langchain.com/docs/concepts)
**Abstract:** Large pre-trained language models have been shown to store factual knowledge
in their parameters, and achieve state-of-the-art results when fine-tuned on

View File

@@ -97,7 +97,7 @@ For guides on how to do specific tasks with LCEL, check out [the relevant how-to
### Runnable interface
<span data-heading-keywords="invoke,runnable"></span>
To make it as easy as possible to create custom chains, we've implemented a ["Runnable"](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. Many LangChain components implement the `Runnable` protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables, which you can read about below.
To make it as easy as possible to create custom chains, we've implemented a ["Runnable"](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. Many LangChain components implement the `Runnable` protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables, which you can read about below.
This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way.
The standard interface includes:
@@ -116,14 +116,14 @@ These also have corresponding async methods that should be used with [asyncio](h
The **input type** and **output type** varies by component:
| Component | Input Type | Output Type |
| --- | --- | --- |
| Prompt | Dictionary | PromptValue |
| ChatModel | Single string, list of chat messages or a PromptValue | ChatMessage |
| LLM | Single string, list of chat messages or a PromptValue | String |
| OutputParser | The output of an LLM or ChatModel | Depends on the parser |
| Retriever | Single string | List of Documents |
| Tool | Single string or dictionary, depending on the tool | Depends on the tool |
| Component | Input Type | Output Type |
|--------------|-------------------------------------------------------|-----------------------|
| Prompt | Dictionary | PromptValue |
| ChatModel | Single string, list of chat messages or a PromptValue | ChatMessage |
| LLM | Single string, list of chat messages or a PromptValue | String |
| OutputParser | The output of an LLM or ChatModel | Depends on the parser |
| Retriever | Single string | List of Documents |
| Tool | Single string or dictionary, depending on the tool | Depends on the tool |
All runnables expose input and output **schemas** to inspect the inputs and outputs:
@@ -236,7 +236,7 @@ This is where information like log-probs and token usage may be stored.
**`tool_calls`**
These represent a decision from an language model to call a tool. They are included as part of an `AIMessage` output.
These represent a decision from a language model to call a tool. They are included as part of an `AIMessage` output.
They can be accessed from there with the `.tool_calls` property.
This property returns a list of `ToolCall`s. A `ToolCall` is a dictionary with the following arguments:
@@ -256,6 +256,8 @@ This represents a message with role "tool", which contains the result of calling
- a `tool_call_id` field which conveys the id of the call to the tool that was called to produce this result.
- an `artifact` field which can be used to pass along arbitrary artifacts of the tool execution which are useful to track but which should not be sent to the model.
With most chat models, a `ToolMessage` can only appear in the chat history after an `AIMessage` that has a populated `tool_calls` field.
#### (Legacy) FunctionMessage
This is a legacy message type, corresponding to OpenAI's legacy function-calling API. `ToolMessage` should be used instead to correspond to the updated tool-calling API.
@@ -380,17 +382,17 @@ LangChain has lots of different types of output parsers. This is a list of outpu
| Name | Supports Streaming | Has Format Instructions | Calls LLM | Input Type | Output Type | Description |
|-----------------|--------------------|-------------------------------|-----------|----------------------------------|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [JSON](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
| [XML](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html#langchain_core.output_parsers.xml.XMLOutputParser) | ✅ | ✅ | | `str` \| `Message` | `dict` | Returns a dictionary of tags. Use when XML output is needed. Use with models that are good at writing XML (like Anthropic's). |
| [CSV](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.list.CommaSeparatedListOutputParser.html#langchain_core.output_parsers.list.CommaSeparatedListOutputParser) | ✅ | ✅ | | `str` \| `Message` | `List[str]` | Returns a list of comma separated values. |
| [OutputFixing](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the error message and the bad output to an LLM and ask it to fix the output. |
| [RetryWithError](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.retry.RetryWithErrorOutputParser.html#langchain.output_parsers.retry.RetryWithErrorOutputParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the original inputs, the bad output, and the error message to an LLM and ask it to fix it. Compared to OutputFixingParser, this one also sends the original instructions. |
| [Pydantic](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.pydantic.PydanticOutputParser.html#langchain_core.output_parsers.pydantic.PydanticOutputParser) | | ✅ | | `str` \| `Message` | `pydantic.BaseModel` | Takes a user defined Pydantic model and returns data in that format. |
| [YAML](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.yaml.YamlOutputParser.html#langchain.output_parsers.yaml.YamlOutputParser) | | ✅ | | `str` \| `Message` | `pydantic.BaseModel` | Takes a user defined Pydantic model and returns data in that format. Uses YAML to encode it. |
| [PandasDataFrame](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.pandas_dataframe.PandasDataFrameOutputParser.html#langchain.output_parsers.pandas_dataframe.PandasDataFrameOutputParser) | | ✅ | | `str` \| `Message` | `dict` | Useful for doing operations with pandas DataFrames. |
| [Enum](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.enum.EnumOutputParser.html#langchain.output_parsers.enum.EnumOutputParser) | | ✅ | | `str` \| `Message` | `Enum` | Parses response into one of the provided enum values. |
| [Datetime](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.datetime.DatetimeOutputParser.html#langchain.output_parsers.datetime.DatetimeOutputParser) | | ✅ | | `str` \| `Message` | `datetime.datetime` | Parses response into a datetime string. |
| [Structured](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.structured.StructuredOutputParser.html#langchain.output_parsers.structured.StructuredOutputParser) | | ✅ | | `str` \| `Message` | `Dict[str, str]` | An output parser that returns structured information. It is less powerful than other output parsers since it only allows for fields to be strings. This can be useful when you are working with smaller LLMs. |
| [JSON](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
| [XML](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html#langchain_core.output_parsers.xml.XMLOutputParser) | ✅ | ✅ | | `str` \| `Message` | `dict` | Returns a dictionary of tags. Use when XML output is needed. Use with models that are good at writing XML (like Anthropic's). |
| [CSV](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.list.CommaSeparatedListOutputParser.html#langchain_core.output_parsers.list.CommaSeparatedListOutputParser) | ✅ | ✅ | | `str` \| `Message` | `List[str]` | Returns a list of comma separated values. |
| [OutputFixing](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the error message and the bad output to an LLM and ask it to fix the output. |
| [RetryWithError](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.retry.RetryWithErrorOutputParser.html#langchain.output_parsers.retry.RetryWithErrorOutputParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the original inputs, the bad output, and the error message to an LLM and ask it to fix it. Compared to OutputFixingParser, this one also sends the original instructions. |
| [Pydantic](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.pydantic.PydanticOutputParser.html#langchain_core.output_parsers.pydantic.PydanticOutputParser) | | ✅ | | `str` \| `Message` | `pydantic.BaseModel` | Takes a user defined Pydantic model and returns data in that format. |
| [YAML](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.yaml.YamlOutputParser.html#langchain.output_parsers.yaml.YamlOutputParser) | | ✅ | | `str` \| `Message` | `pydantic.BaseModel` | Takes a user defined Pydantic model and returns data in that format. Uses YAML to encode it. |
| [PandasDataFrame](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.pandas_dataframe.PandasDataFrameOutputParser.html#langchain.output_parsers.pandas_dataframe.PandasDataFrameOutputParser) | | ✅ | | `str` \| `Message` | `dict` | Useful for doing operations with pandas DataFrames. |
| [Enum](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.enum.EnumOutputParser.html#langchain.output_parsers.enum.EnumOutputParser) | | ✅ | | `str` \| `Message` | `Enum` | Parses response into one of the provided enum values. |
| [Datetime](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.datetime.DatetimeOutputParser.html#langchain.output_parsers.datetime.DatetimeOutputParser) | | ✅ | | `str` \| `Message` | `datetime.datetime` | Parses response into a datetime string. |
| [Structured](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.structured.StructuredOutputParser.html#langchain.output_parsers.structured.StructuredOutputParser) | | ✅ | | `str` \| `Message` | `Dict[str, str]` | An output parser that returns structured information. It is less powerful than other output parsers since it only allows for fields to be strings. This can be useful when you are working with smaller LLMs. |
For specifics on how to use output parsers, see the [relevant how-to guides here](/docs/how_to/#output-parsers).
@@ -501,7 +503,7 @@ For specifics on how to use retrievers, see the [relevant how-to guides here](/d
For some techniques, such as [indexing and retrieval with multiple vectors per document](/docs/how_to/multi_vector/) or
[caching embeddings](/docs/how_to/caching_embeddings/), having a form of key-value (KV) storage is helpful.
LangChain includes a [`BaseStore`](https://python.langchain.com/v0.2/api_reference/core/stores/langchain_core.stores.BaseStore.html) interface,
LangChain includes a [`BaseStore`](https://python.langchain.com/api_reference/core/stores/langchain_core.stores.BaseStore.html) interface,
which allows for storage of arbitrary data. However, LangChain components that require KV-storage accept a
more specific `BaseStore[str, bytes]` instance that stores binary data (referred to as a `ByteStore`), and internally take care of
encoding and decoding data for their specific needs.
@@ -510,7 +512,7 @@ This means that as a user, you only need to think about one type of store rather
#### Interface
All [`BaseStores`](https://python.langchain.com/v0.2/api_reference/core/stores/langchain_core.stores.BaseStore.html) support the following interface. Note that the interface allows
All [`BaseStores`](https://python.langchain.com/api_reference/core/stores/langchain_core.stores.BaseStore.html) support the following interface. Note that the interface allows
for modifying **multiple** key-value pairs at once:
- `mget(key: Sequence[str]) -> List[Optional[bytes]]`: get the contents of multiple keys, returning `None` if the key does not exist
@@ -595,10 +597,10 @@ tool_call = ai_msg.tool_calls[0]
# -> ToolCall(args={...}, id=..., ...)
tool_message = tool.invoke(tool_call)
# -> ToolMessage(
content="tool result foobar...",
tool_call_id=...,
name="tool_name"
)
# content="tool result foobar...",
# tool_call_id=...,
# name="tool_name"
# )
```
If you are invoking the tool this way and want to include an [artifact](/docs/concepts/#toolmessage) for the ToolMessage, you will need to have the tool return two things.
@@ -708,17 +710,15 @@ You can subscribe to these events by using the `callbacks` argument available th
Callback handlers can either be `sync` or `async`:
* Sync callback handlers implement the [BaseCallbackHandler](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html) interface.
* Async callback handlers implement the [AsyncCallbackHandler](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.AsyncCallbackHandler.html) interface.
* Sync callback handlers implement the [BaseCallbackHandler](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html) interface.
* Async callback handlers implement the [AsyncCallbackHandler](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.AsyncCallbackHandler.html) interface.
During run-time LangChain configures an appropriate callback manager (e.g., [CallbackManager](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.manager.CallbackManager.html) or [AsyncCallbackManager](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.manager.AsyncCallbackManager.html) which will be responsible for calling the appropriate method on each "registered" callback handler when the event is triggered.
During run-time LangChain configures an appropriate callback manager (e.g., [CallbackManager](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.CallbackManager.html) or [AsyncCallbackManager](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.AsyncCallbackManager.html) which will be responsible for calling the appropriate method on each "registered" callback handler when the event is triggered.
#### Passing callbacks
The `callbacks` property is available on most objects throughout the API (Models, Tools, Agents, etc.) in two different places:
The callbacks are available on most objects throughout the API (Models, Tools, Agents, etc.) in two different places:
- **Request time callbacks**: Passed at the time of the request in addition to the input data.
Available on all standard `Runnable` objects. These callbacks are INHERITED by all children
of the object they are defined on. For example, `chain.invoke({"number": 25}, {"callbacks": [handler]})`.
@@ -734,10 +734,10 @@ of the object.
If you're creating a custom chain or runnable, you need to remember to propagate request time
callbacks to any child objects.
:::important Async in Python<=3.10
:::important Async in Python&lt;=3.10
Any `RunnableLambda`, a `RunnableGenerator`, or `Tool` that invokes other runnables
and is running `async` in python<=3.10, will have to propagate callbacks to child
and is running `async` in python&lt;=3.10, will have to propagate callbacks to child
objects manually. This is because LangChain cannot automatically propagate
callbacks to child objects in this case.
@@ -779,7 +779,7 @@ For models (or other components) that don't support streaming natively, this ite
you could still use the same general pattern when calling them. Using `.stream()` will also automatically call the model in streaming mode
without the need to provide additional config.
The type of each outputted chunk depends on the type of component - for example, chat models yield [`AIMessageChunks`](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html).
The type of each outputted chunk depends on the type of component - for example, chat models yield [`AIMessageChunks`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html).
Because this method is part of [LangChain Expression Language](/docs/concepts/#langchain-expression-language-lcel),
you can handle formatting differences from different outputs using an [output parser](/docs/concepts/#output-parsers) to transform
each yielded chunk.
@@ -827,10 +827,10 @@ including a table listing available events.
#### Callbacks
The lowest level way to stream outputs from LLMs in LangChain is via the [callbacks](/docs/concepts/#callbacks) system. You can pass a
callback handler that handles the [`on_llm_new_token`](https://python.langchain.com/v0.2/api_reference/langchain/callbacks/langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.html#langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.on_llm_new_token) event into LangChain components. When that component is invoked, any
callback handler that handles the [`on_llm_new_token`](https://python.langchain.com/api_reference/langchain/callbacks/langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.html#langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.on_llm_new_token) event into LangChain components. When that component is invoked, any
[LLM](/docs/concepts/#llms) or [chat model](/docs/concepts/#chat-models) contained in the component calls
the callback with the generated token. Within the callback, you could pipe the tokens into some other destination, e.g. a HTTP response.
You can also handle the [`on_llm_end`](https://python.langchain.com/v0.2/api_reference/langchain/callbacks/langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.html#langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.on_llm_end) event to perform any necessary cleanup.
You can also handle the [`on_llm_end`](https://python.langchain.com/api_reference/langchain/callbacks/langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.html#langchain.callbacks.streaming_aiter.AsyncIteratorCallbackHandler.on_llm_end) event to perform any necessary cleanup.
You can see [this how-to section](/docs/how_to/#callbacks) for more specifics on using callbacks.
@@ -945,7 +945,7 @@ Here's an example:
```python
from typing import Optional
from langchain_core.pydantic_v1 import BaseModel, Field
from pydantic import BaseModel, Field
class Joke(BaseModel):
@@ -1062,7 +1062,7 @@ a `tool_calls` field containing `args` that match the desired shape.
There are several acceptable formats you can use to bind tools to a model in LangChain. Here's one example:
```python
from langchain_core.pydantic_v1 import BaseModel, Field
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
class ResponseFormatter(BaseModel):

View File

@@ -12,7 +12,7 @@ It covers a wide array of topics, including tutorials, use cases, integrations,
and more, offering extensive guidance on building with LangChain.
The content for this documentation lives in the `/docs` directory of the monorepo.
2. In-code Documentation: This is documentation of the codebase itself, which is also
used to generate the externally facing [API Reference](https://python.langchain.com/v0.2/api_reference/langchain/index.html).
used to generate the externally facing [API Reference](https://python.langchain.com/api_reference/langchain/index.html).
The content for the API reference is autogenerated by scanning the docstrings in the codebase. For this reason we ask that
developers document their code well.

View File

@@ -50,7 +50,7 @@ There are other files in the root directory level, but their presence should be
## Documentation
The `/docs` directory contains the content for the documentation that is shown
at https://python.langchain.com/ and the associated API Reference https://python.langchain.com/v0.2/api_reference/langchain/index.html.
at https://python.langchain.com/ and the associated API Reference https://python.langchain.com/api_reference/langchain/index.html.
See the [documentation](/docs/contributing/documentation/) guidelines to learn how to contribute to the documentation.

View File

@@ -13,7 +13,7 @@
"# How to split by HTML header \n",
"## Description and motivation\n",
"\n",
"[HTMLHeaderTextSplitter](https://python.langchain.com/v0.2/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" chunker that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline.\n",
"[HTMLHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" chunker that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline.\n",
"\n",
"It is analogous to the [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter) for markdown files.\n",
"\n",

View File

@@ -9,7 +9,7 @@
"\n",
"Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on a distance metric. But, retrieval may produce different results with subtle changes in query wording, or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n",
"\n",
"The [MultiQueryRetriever](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` can mitigate some of the limitations of the distance-based retrieval and get a richer set of results.\n",
"The [MultiQueryRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` can mitigate some of the limitations of the distance-based retrieval and get a richer set of results.\n",
"\n",
"Let's build a vectorstore using the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng from the [RAG tutorial](/docs/tutorials/rag):"
]
@@ -18,8 +18,23 @@
"cell_type": "code",
"execution_count": 1,
"id": "994d6c74",
"metadata": {},
"outputs": [],
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:00.190093Z",
"iopub.status.busy": "2024-09-10T20:08:00.189665Z",
"iopub.status.idle": "2024-09-10T20:08:05.438015Z",
"shell.execute_reply": "2024-09-10T20:08:05.437685Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"USER_AGENT environment variable not set, consider setting it to identify your requests.\n"
]
}
],
"source": [
"# Build a sample vectorDB\n",
"from langchain_chroma import Chroma\n",
@@ -54,7 +69,14 @@
"cell_type": "code",
"execution_count": 2,
"id": "edbca101",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:05.439930Z",
"iopub.status.busy": "2024-09-10T20:08:05.439810Z",
"iopub.status.idle": "2024-09-10T20:08:05.553766Z",
"shell.execute_reply": "2024-09-10T20:08:05.553520Z"
}
},
"outputs": [],
"source": [
"from langchain.retrievers.multi_query import MultiQueryRetriever\n",
@@ -71,7 +93,14 @@
"cell_type": "code",
"execution_count": 3,
"id": "9e6d3b69",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:05.555359Z",
"iopub.status.busy": "2024-09-10T20:08:05.555262Z",
"iopub.status.idle": "2024-09-10T20:08:05.557046Z",
"shell.execute_reply": "2024-09-10T20:08:05.556825Z"
}
},
"outputs": [],
"source": [
"# Set logging for the queries\n",
@@ -85,13 +114,20 @@
"cell_type": "code",
"execution_count": 4,
"id": "bc93dc2b-9407-48b0-9f9a-338247e7eb69",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:05.558176Z",
"iopub.status.busy": "2024-09-10T20:08:05.558100Z",
"iopub.status.idle": "2024-09-10T20:08:07.250342Z",
"shell.execute_reply": "2024-09-10T20:08:07.249711Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be achieved through different methods?', '2. What strategies are commonly used for Task Decomposition?', '3. What are the various techniques for breaking down tasks in Task Decomposition?']\n"
"INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be achieved through different methods?', '2. What strategies are commonly used for Task Decomposition?', '3. What are the various ways to break down tasks in Task Decomposition?']\n"
]
},
{
@@ -125,9 +161,9 @@
"source": [
"#### Supplying your own prompt\n",
"\n",
"Under the hood, `MultiQueryRetriever` generates queries using a specific [prompt](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html). To customize this prompt:\n",
"Under the hood, `MultiQueryRetriever` generates queries using a specific [prompt](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html). To customize this prompt:\n",
"\n",
"1. Make a [PromptTemplate](https://python.langchain.com/v0.2/api_reference/core/prompts/langchain_core.prompts.prompt.PromptTemplate.html) with an input variable for the question;\n",
"1. Make a [PromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.prompt.PromptTemplate.html) with an input variable for the question;\n",
"2. Implement an [output parser](/docs/concepts#output-parsers) like the one below to split the result into a list of queries.\n",
"\n",
"The prompt and output parser together must support the generation of a list of queries."
@@ -137,14 +173,21 @@
"cell_type": "code",
"execution_count": 5,
"id": "d9afb0ca",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:07.253875Z",
"iopub.status.busy": "2024-09-10T20:08:07.253600Z",
"iopub.status.idle": "2024-09-10T20:08:07.277848Z",
"shell.execute_reply": "2024-09-10T20:08:07.277487Z"
}
},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from langchain_core.output_parsers import BaseOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"# Output parser will split the LLM result into a list of queries\n",
@@ -180,13 +223,20 @@
"cell_type": "code",
"execution_count": 6,
"id": "59c75c56-dbd7-4887-b9ba-0b5b21069f51",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:08:07.280001Z",
"iopub.status.busy": "2024-09-10T20:08:07.279861Z",
"iopub.status.idle": "2024-09-10T20:08:09.579525Z",
"shell.execute_reply": "2024-09-10T20:08:09.578837Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:langchain.retrievers.multi_query:Generated queries: ['1. Can you provide insights on regression from the course material?', '2. How is regression discussed in the course content?', '3. What information does the course offer about regression?', '4. In what way is regression covered in the course?', '5. What are the teachings of the course regarding regression?']\n"
"INFO:langchain.retrievers.multi_query:Generated queries: ['1. Can you provide insights on regression from the course material?', '2. How is regression discussed in the course content?', '3. What information does the course offer regarding regression?', '4. In what way is regression covered in the course?', \"5. What are the course's teachings on regression?\"]\n"
]
},
{
@@ -228,7 +278,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -7,7 +7,7 @@
"source": [
"# How to add scores to retriever results\n",
"\n",
"Retrievers will return sequences of [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n",
"Retrievers will return sequences of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n",
"1. From [vectorstore retrievers](/docs/how_to/vectorstore_retriever);\n",
"2. From higher-order LangChain retrievers, such as [SelfQueryRetriever](/docs/how_to/self_query) or [MultiVectorRetriever](/docs/how_to/multi_vector).\n",
"\n",
@@ -15,7 +15,7 @@
"\n",
"## Create vector store\n",
"\n",
"First we populate a vector store with some data. We will use a [PineconeVectorStore](https://python.langchain.com/v0.2/api_reference/pinecone/vectorstores/langchain_pinecone.vectorstores.PineconeVectorStore.html), but this guide is compatible with any LangChain vector store that implements a `.similarity_search_with_score` method."
"First we populate a vector store with some data. We will use a [PineconeVectorStore](https://python.langchain.com/api_reference/pinecone/vectorstores/langchain_pinecone.vectorstores.PineconeVectorStore.html), but this guide is compatible with any LangChain vector store that implements a `.similarity_search_with_score` method."
]
},
{
@@ -206,7 +206,7 @@
" ) -> List[Document]:\n",
" \"\"\"Get docs, adding score information.\"\"\"\n",
" docs, scores = zip(\n",
" *vectorstore.similarity_search_with_score(query, **search_kwargs)\n",
" *self.vectorstore.similarity_search_with_score(query, **search_kwargs)\n",
" )\n",
" for doc, score in zip(docs, scores):\n",
" doc.metadata[\"score\"] = score\n",
@@ -263,7 +263,7 @@
"\n",
"To propagate similarity scores through this retriever, we can again subclass `MultiVectorRetriever` and override a method. This time we will override `_get_relevant_documents`.\n",
"\n",
"First, we prepare some fake data. We generate fake \"whole documents\" and store them in a document store; here we will use a simple [InMemoryStore](https://python.langchain.com/v0.2/api_reference/core/stores/langchain_core.stores.InMemoryBaseStore.html)."
"First, we prepare some fake data. We generate fake \"whole documents\" and store them in a document store; here we will use a simple [InMemoryStore](https://python.langchain.com/api_reference/core/stores/langchain_core.stores.InMemoryBaseStore.html)."
]
},
{

View File

@@ -17,7 +17,7 @@
"source": [
"# Build an Agent with AgentExecutor (Legacy)\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"This section will cover building with the legacy LangChain AgentExecutor. These are fine for getting started, but past a certain point, you will likely want flexibility and control that they do not offer. For working with more advanced agents, we'd recommend checking out [LangGraph Agents](/docs/concepts/#langgraph) or the [migration guide](/docs/how_to/migrate_agent/)\n",
":::\n",
"\n",
@@ -49,7 +49,6 @@
"\n",
"To install LangChain run:\n",
"\n",
"```{=mdx}\n",
"import Tabs from '@theme/Tabs';\n",
"import TabItem from '@theme/TabItem';\n",
"import CodeBlock from \"@theme/CodeBlock\";\n",
@@ -63,7 +62,6 @@
" </TabItem>\n",
"</Tabs>\n",
"\n",
"```\n",
"\n",
"\n",
"For more details, see our [Installation guide](/docs/how_to/installation).\n",
@@ -270,11 +268,9 @@
"\n",
"Next, let's learn how to use a language model by to call tools. LangChain supports many different language models that you can use interchangably - select the one you want to use below!\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n",
"```"
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n"
]
},
{
@@ -461,7 +457,7 @@
"id": "f8014c9d",
"metadata": {},
"source": [
"Now, we can initalize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts/#agents).\n",
"Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts/#agents).\n",
"\n",
"Note that we are passing in the `model`, not `model_with_tools`. That is because `create_tool_calling_agent` will call `.bind_tools` for us under the hood."
]
@@ -805,7 +801,7 @@
"\n",
"That's a wrap! In this quick start we covered how to create a simple agent. Agents are a complex topic, and there's lot to learn! \n",
"\n",
":::{.callout-important}\n",
":::important\n",
"This section covered building with LangChain Agents. LangChain Agents are fine for getting started, but past a certain point you will likely want flexibility and control that they do not offer. For working with more advanced agents, we'd reccommend checking out [LangGraph](/docs/concepts/#langgraph)\n",
":::\n",
"\n",

View File

@@ -27,7 +27,7 @@
"\n",
":::\n",
"\n",
"An alternate way of [passing data through](/docs/how_to/passthrough) steps of a chain is to leave the current values of the chain state unchanged while assigning a new value under a given key. The [`RunnablePassthrough.assign()`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html#langchain_core.runnables.passthrough.RunnablePassthrough.assign) static method takes an input value and adds the extra arguments passed to the assign function.\n",
"An alternate way of [passing data through](/docs/how_to/passthrough) steps of a chain is to leave the current values of the chain state unchanged while assigning a new value under a given key. The [`RunnablePassthrough.assign()`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html#langchain_core.runnables.passthrough.RunnablePassthrough.assign) static method takes an input value and adds the extra arguments passed to the assign function.\n",
"\n",
"This is useful in the common [LangChain Expression Language](/docs/concepts/#langchain-expression-language) pattern of additively creating a dictionary to use as input to a later step.\n",
"\n",
@@ -45,7 +45,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{

View File

@@ -27,7 +27,7 @@
"\n",
":::\n",
"\n",
"Sometimes we want to invoke a [`Runnable`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) within a [RunnableSequence](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableSequence.html) with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use the [`Runnable.bind()`](https://python.langchain.com/v0.2/api_reference/langchain_core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind) method to set these arguments ahead of time.\n",
"Sometimes we want to invoke a [`Runnable`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) within a [RunnableSequence](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableSequence.html) with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use the [`Runnable.bind()`](https://python.langchain.com/api_reference/langchain_core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind) method to set these arguments ahead of time.\n",
"\n",
"## Binding stop sequences\n",
"\n",
@@ -49,7 +49,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -183,7 +184,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_z0OU2CytqENVrRTI6T8DkI3u', 'function': {'arguments': '{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}, {'id': 'call_ft96IJBh0cMKkQWrZjNg4bsw', 'function': {'arguments': '{\"location\": \"New York, NY\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}, {'id': 'call_tfbtGgCLmuBuWgZLvpPwvUMH', 'function': {'arguments': '{\"location\": \"Los Angeles, CA\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 84, 'prompt_tokens': 85, 'total_tokens': 169}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': 'fp_77a673219d', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d57ad5fa-b52a-4822-bc3e-74f838697e18-0', tool_calls=[{'name': 'get_current_weather', 'args': {'location': 'San Francisco, CA', 'unit': 'celsius'}, 'id': 'call_z0OU2CytqENVrRTI6T8DkI3u'}, {'name': 'get_current_weather', 'args': {'location': 'New York, NY', 'unit': 'celsius'}, 'id': 'call_ft96IJBh0cMKkQWrZjNg4bsw'}, {'name': 'get_current_weather', 'args': {'location': 'Los Angeles, CA', 'unit': 'celsius'}, 'id': 'call_tfbtGgCLmuBuWgZLvpPwvUMH'}])"
"AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_z0OU2CytqENVrRTI6T8DkI3u', 'function': {'arguments': '{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}, {'id': 'call_ft96IJBh0cMKkQWrZjNg4bsw', 'function': {'arguments': '{\"location\": \"New York, NY\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}, {'id': 'call_tfbtGgCLmuBuWgZLvpPwvUMH', 'function': {'arguments': '{\"location\": \"Los Angeles, CA\", \"unit\": \"celsius\"}', 'name': 'get_current_weather'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 84, 'prompt_tokens': 85, 'total_tokens': 169}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': 'fp_77a673219d', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d57ad5fa-b52a-4822-bc3e-74f838697e18-0', tool_calls=[{'name': 'get_current_weather', 'args': {'location': 'San Francisco, CA', 'unit': 'celsius'}, 'id': 'call_z0OU2CytqENVrRTI6T8DkI3u'}, {'name': 'get_current_weather', 'args': {'location': 'New York, NY', 'unit': 'celsius'}, 'id': 'call_ft96IJBh0cMKkQWrZjNg4bsw'}, {'name': 'get_current_weather', 'args': {'location': 'Los Angeles, CA', 'unit': 'celsius'}, 'id': 'call_tfbtGgCLmuBuWgZLvpPwvUMH'}])"
]
},
"execution_count": 5,
@@ -192,7 +193,7 @@
}
],
"source": [
"model = ChatOpenAI(model=\"gpt-3.5-turbo-1106\").bind(tools=tools)\n",
"model = ChatOpenAI(model=\"gpt-4o-mini\").bind(tools=tools)\n",
"model.invoke(\"What's the weather in SF, NYC and LA?\")"
]
},

View File

@@ -14,14 +14,14 @@
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
":::\n",
"\n",
"If you are planning to use the async APIs, it is recommended to use and extend [`AsyncCallbackHandler`](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.AsyncCallbackHandler.html) to avoid blocking the event.\n",
"If you are planning to use the async APIs, it is recommended to use and extend [`AsyncCallbackHandler`](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.AsyncCallbackHandler.html) to avoid blocking the event.\n",
"\n",
"\n",
":::{.callout-warning}\n",
":::warning\n",
"If you use a sync `CallbackHandler` while using an async method to run your LLM / Chain / Tool / Agent, it will still work. However, under the hood, it will be called with [`run_in_executor`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor) which can cause issues if your `CallbackHandler` is not thread-safe.\n",
":::\n",
"\n",
":::{.callout-danger}\n",
":::danger\n",
"\n",
"If you're on `python<=3.10`, you need to remember to propagate `config` or `callbacks` when invoking other `runnable` from within a `RunnableLambda`, `RunnableGenerator` or `@tool`. If you do not do this,\n",
"the callbacks will not be propagated to the child runnables being invoked.\n",

View File

@@ -17,9 +17,9 @@
"\n",
":::\n",
"\n",
"If you are composing a chain of runnables and want to reuse callbacks across multiple executions, you can attach callbacks with the [`.with_config()`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config) method. This saves you the need to pass callbacks in each time you invoke the chain.\n",
"If you are composing a chain of runnables and want to reuse callbacks across multiple executions, you can attach callbacks with the [`.with_config()`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config) method. This saves you the need to pass callbacks in each time you invoke the chain.\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"\n",
"`with_config()` binds a configuration which will be interpreted as **runtime** configuration. So these callbacks will propagate to all child components.\n",
":::\n",

View File

@@ -17,7 +17,7 @@
"\n",
"Most LangChain modules allow you to pass `callbacks` directly into the constructor (i.e., initializer). In this case, the callbacks will only be called for that instance (and any nested runs).\n",
"\n",
":::{.callout-warning}\n",
":::warning\n",
"Constructor callbacks are scoped only to the object they are defined on. They are **not** inherited by children of the object. This can lead to confusing behavior,\n",
"and it's generally better to pass callbacks as a run time argument.\n",
":::\n",

View File

@@ -29,7 +29,7 @@
"| data | Any | The data associated with the event. This can be anything, though we suggest making it JSON serializable. |\n",
"\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"* Dispatching custom callback events requires `langchain-core>=0.2.15`.\n",
"* Custom callback events can only be dispatched from within an existing `Runnable`.\n",
"* If using `astream_events`, you must use `version='v2'` to see custom events.\n",
@@ -38,9 +38,9 @@
"\n",
"\n",
":::caution COMPATIBILITY\n",
"LangChain cannot automatically propagate configuration, including callbacks necessary for astream_events(), to child runnables if you are running async code in python<=3.10. This is a common reason why you may fail to see events being emitted from custom runnables or tools.\n",
"LangChain cannot automatically propagate configuration, including callbacks necessary for astream_events(), to child runnables if you are running async code in python&lt;=3.10. This is a common reason why you may fail to see events being emitted from custom runnables or tools.\n",
"\n",
"If you are running python<=3.10, you will need to manually propagate the `RunnableConfig` object to the child runnable in async environments. For an example of how to manually propagate the config, see the implementation of the `bar` RunnableLambda below.\n",
"If you are running python&lt;=3.10, you will need to manually propagate the `RunnableConfig` object to the child runnable in async environments. For an example of how to manually propagate the config, see the implementation of the `bar` RunnableLambda below.\n",
"\n",
"If you are running python>=3.11, the `RunnableConfig` will automatically propagate to child runnables in async environment. However, it is still a good idea to propagate the `RunnableConfig` manually if your code may run in other Python versions.\n",
":::"
@@ -69,7 +69,7 @@
"We can use the `async` `adispatch_custom_event` API to emit custom events in an async setting. \n",
"\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"\n",
"To see custom events via the astream events API, you need to use the newer `v2` API of `astream_events`.\n",
":::"
@@ -115,7 +115,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In python <= 3.10, you must propagate the config manually!"
"In python &lt;= 3.10, you must propagate the config manually!"
]
},
{

View File

@@ -15,7 +15,7 @@
"\n",
":::\n",
"\n",
"In many cases, it is advantageous to pass in handlers instead when running the object. When we pass through [`CallbackHandlers`](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) using the `callbacks` keyword arg when executing an run, those callbacks will be issued by all nested objects involved in the execution. For example, when a handler is passed through to an Agent, it will be used for all callbacks related to the agent and all the objects involved in the agent's execution, in this case, the Tools and LLM.\n",
"In many cases, it is advantageous to pass in handlers instead when running the object. When we pass through [`CallbackHandlers`](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) using the `callbacks` keyword arg when executing an run, those callbacks will be issued by all nested objects involved in the execution. For example, when a handler is passed through to an Agent, it will be used for all callbacks related to the agent and all the objects involved in the agent's execution, in this case, the Tools and LLM.\n",
"\n",
"This prevents us from having to manually attach the handlers to each individual nested object. Here's an example:"
]

View File

@@ -28,7 +28,7 @@
"\n",
"To obtain the string content directly, use `.split_text`.\n",
"\n",
"To create LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html) objects (e.g., for use in downstream tasks), use `.create_documents`."
"To create LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects (e.g., for use in downstream tasks), use `.create_documents`."
]
},
{

View File

@@ -28,11 +28,9 @@
"id": "289b31de",
"metadata": {},
"source": [
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />\n",
"```"
"<ChatModelTabs customVarName=\"llm\" />\n"
]
},
{
@@ -50,7 +48,8 @@
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"\n",
"llm = ChatOpenAI()"
]

View File

@@ -11,16 +11,10 @@
"\n",
":::tip Supported models\n",
"\n",
"See the [init_chat_model()](https://python.langchain.com/v0.2/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) API reference for a full list of supported integrations.\n",
"See the [init_chat_model()](https://python.langchain.com/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) API reference for a full list of supported integrations.\n",
"\n",
"Make sure you have the integration packages installed for any model providers you want to support. E.g. you should have `langchain-openai` installed to init an OpenAI model.\n",
"\n",
":::\n",
"\n",
":::info Requires ``langchain >= 0.2.8``\n",
"\n",
"This functionality was added in ``langchain-core == 0.2.8``. Please make sure your package is up to date.\n",
"\n",
":::"
]
},
@@ -44,19 +38,48 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 2,
"id": "79e14913-803c-4382-9009-5c6af3d75d35",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:33.015729Z",
"iopub.status.busy": "2024-09-10T20:22:33.015241Z",
"iopub.status.idle": "2024-09-10T20:22:39.391716Z",
"shell.execute_reply": "2024-09-10T20:22:39.390438Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/4j/2rz3865x6qg07tx43146py8h0000gn/T/ipykernel_95293/571506279.py:4: LangChainBetaWarning: The function `init_chat_model` is in beta. It is actively being worked on, so the API may change.\n",
" gpt_4o = init_chat_model(\"gpt-4o\", model_provider=\"openai\", temperature=0)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"GPT-4o: I'm an AI created by OpenAI, and I don't have a personal name. How can I assist you today?\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"GPT-4o: I'm an AI created by OpenAI, and I don't have a personal name. You can call me Assistant! How can I help you today?\n",
"\n",
"Claude Opus: My name is Claude. It's nice to meet you!\n",
"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Gemini 1.5: I am a large language model, trained by Google. \n",
"\n",
"Gemini 1.5: I am a large language model, trained by Google. I do not have a name. \n",
"I don't have a name like a person does. You can call me Bard if you like! 😊 \n",
"\n",
"\n"
]
@@ -89,14 +112,21 @@
"source": [
"## Inferring model provider\n",
"\n",
"For common and distinct model names `init_chat_model()` will attempt to infer the model provider. See the [API reference](https://python.langchain.com/v0.2/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) for a full list of inference behavior. E.g. any model that starts with `gpt-3...` or `gpt-4...` will be inferred as using model provider `openai`."
"For common and distinct model names `init_chat_model()` will attempt to infer the model provider. See the [API reference](https://python.langchain.com/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) for a full list of inference behavior. E.g. any model that starts with `gpt-3...` or `gpt-4...` will be inferred as using model provider `openai`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "0378ccc6-95bc-4d50-be50-fccc193f0a71",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:39.396908Z",
"iopub.status.busy": "2024-09-10T20:22:39.396563Z",
"iopub.status.idle": "2024-09-10T20:22:39.444959Z",
"shell.execute_reply": "2024-09-10T20:22:39.444646Z"
}
},
"outputs": [],
"source": [
"gpt_4o = init_chat_model(\"gpt-4o\", temperature=0)\n",
@@ -116,17 +146,24 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "6c037f27-12d7-4e83-811e-4245c0e3ba58",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:39.446901Z",
"iopub.status.busy": "2024-09-10T20:22:39.446773Z",
"iopub.status.idle": "2024-09-10T20:22:40.301906Z",
"shell.execute_reply": "2024-09-10T20:22:40.300918Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"I'm an AI language model created by OpenAI, and I don't have a personal name. You can call me Assistant or any other name you prefer! How can I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 11, 'total_tokens': 48}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_d576307f90', 'finish_reason': 'stop', 'logprobs': None}, id='run-5428ab5c-b5c0-46de-9946-5d4ca40dbdc8-0', usage_metadata={'input_tokens': 11, 'output_tokens': 37, 'total_tokens': 48})"
"AIMessage(content=\"I'm an AI created by OpenAI, and I don't have a personal name. How can I assist you today?\", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 11, 'total_tokens': 34}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_25624ae3a5', 'finish_reason': 'stop', 'logprobs': None}, id='run-b41df187-4627-490d-af3c-1c96282d3eb0-0', usage_metadata={'input_tokens': 11, 'output_tokens': 23, 'total_tokens': 34})"
]
},
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -141,17 +178,24 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "321e3036-abd2-4e1f-bcc6-606efd036954",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:40.316030Z",
"iopub.status.busy": "2024-09-10T20:22:40.315628Z",
"iopub.status.idle": "2024-09-10T20:22:41.199134Z",
"shell.execute_reply": "2024-09-10T20:22:41.198173Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", response_metadata={'id': 'msg_012XvotUJ3kGLXJUWKBVxJUi', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-1ad1eefe-f1c6-4244-8bc6-90e2cb7ee554-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", additional_kwargs={}, response_metadata={'id': 'msg_01Fx9P74A7syoFkwE73CdMMY', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-a0fd2bbd-3b7e-46bf-8d69-a48c7e60b03c-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -174,17 +218,24 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 6,
"id": "814a2289-d0db-401e-b555-d5116112b413",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:41.203346Z",
"iopub.status.busy": "2024-09-10T20:22:41.203004Z",
"iopub.status.idle": "2024-09-10T20:22:41.891450Z",
"shell.execute_reply": "2024-09-10T20:22:41.890539Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"I'm an AI language model created by OpenAI, and I don't have a personal name. You can call me Assistant or any other name you prefer! How can I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 11, 'total_tokens': 48}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_ce0793330f', 'finish_reason': 'stop', 'logprobs': None}, id='run-3923e328-7715-4cd6-b215-98e4b6bf7c9d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 37, 'total_tokens': 48})"
"AIMessage(content=\"I'm an AI created by OpenAI, and I don't have a personal name. How can I assist you today?\", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 11, 'total_tokens': 34}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_25624ae3a5', 'finish_reason': 'stop', 'logprobs': None}, id='run-3380f977-4b89-4f44-bc02-b64043b3166f-0', usage_metadata={'input_tokens': 11, 'output_tokens': 23, 'total_tokens': 34})"
]
},
"execution_count": 9,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -202,17 +253,24 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 7,
"id": "6c8755ba-c001-4f5a-a497-be3f1db83244",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:41.896413Z",
"iopub.status.busy": "2024-09-10T20:22:41.895967Z",
"iopub.status.idle": "2024-09-10T20:22:42.767565Z",
"shell.execute_reply": "2024-09-10T20:22:42.766619Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", response_metadata={'id': 'msg_01RyYR64DoMPNCfHeNnroMXm', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-22446159-3723-43e6-88df-b84797e7751d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", additional_kwargs={}, response_metadata={'id': 'msg_01EFKSWpmsn2PSYPQa4cNHWb', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-3c58f47c-41b9-4e56-92e7-fb9602e3787c-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
]
},
"execution_count": 10,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -242,28 +300,37 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "067dabee-1050-4110-ae24-c48eba01e13b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:42.771941Z",
"iopub.status.busy": "2024-09-10T20:22:42.771606Z",
"iopub.status.idle": "2024-09-10T20:22:43.909206Z",
"shell.execute_reply": "2024-09-10T20:22:43.908496Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'GetPopulation',\n",
" 'args': {'location': 'Los Angeles, CA'},\n",
" 'id': 'call_sYT3PFMufHGWJD32Hi2CTNUP'},\n",
" 'id': 'call_Ga9m8FAArIyEjItHmztPYA22',\n",
" 'type': 'tool_call'},\n",
" {'name': 'GetPopulation',\n",
" 'args': {'location': 'New York, NY'},\n",
" 'id': 'call_j1qjhxRnD3ffQmRyqjlI1Lnk'}]"
" 'id': 'call_jh2dEvBaAHRaw5JUDthOs7rt',\n",
" 'type': 'tool_call'}]"
]
},
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class GetWeather(BaseModel):\n",
@@ -288,22 +355,31 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "e57dfe9f-cd24-4e37-9ce9-ccf8daf78f89",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:22:43.912746Z",
"iopub.status.busy": "2024-09-10T20:22:43.912447Z",
"iopub.status.idle": "2024-09-10T20:22:46.437049Z",
"shell.execute_reply": "2024-09-10T20:22:46.436093Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'GetPopulation',\n",
" 'args': {'location': 'Los Angeles, CA'},\n",
" 'id': 'toolu_01CxEHxKtVbLBrvzFS7GQ5xR'},\n",
" 'id': 'toolu_01JMufPf4F4t2zLj7miFeqXp',\n",
" 'type': 'tool_call'},\n",
" {'name': 'GetPopulation',\n",
" 'args': {'location': 'New York City, NY'},\n",
" 'id': 'toolu_013A79qt5toWSsKunFBDZd5S'}]"
" 'id': 'toolu_01RQBHcE8kEEbYTuuS8WqY1u',\n",
" 'type': 'tool_call'}]"
]
},
"execution_count": 8,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -18,11 +18,11 @@
"# How to stream chat model responses\n",
"\n",
"\n",
"All [chat models](https://python.langchain.com/v0.2/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) implement the [Runnable interface](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable), which comes with a **default** implementations of standard runnable methods (i.e. `ainvoke`, `batch`, `abatch`, `stream`, `astream`, `astream_events`).\n",
"All [chat models](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) implement the [Runnable interface](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable), which comes with a **default** implementations of standard runnable methods (i.e. `ainvoke`, `batch`, `abatch`, `stream`, `astream`, `astream_events`).\n",
"\n",
"The **default** streaming implementation provides an`Iterator` (or `AsyncIterator` for asynchronous streaming) that yields a single value: the final output from the underlying chat model provider.\n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"\n",
"The **default** implementation does **not** provide support for token-by-token streaming, but it ensures that the the model can be swapped in for any other model as it supports the same standard interface.\n",
"\n",
@@ -120,7 +120,7 @@
"source": [
"## Astream events\n",
"\n",
"Chat models also support the standard [astream events](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream_events) method.\n",
"Chat models also support the standard [astream events](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream_events) method.\n",
"\n",
"This method is useful if you're streaming output from a larger LLM application that contains multiple steps (e.g., an LLM chain composed of a prompt, llm and parser)."
]

View File

@@ -42,7 +42,7 @@
"\n",
"A number of model providers return token usage information as part of the chat generation response. When available, this information will be included on the `AIMessage` objects produced by the corresponding model.\n",
"\n",
"LangChain `AIMessage` objects include a [usage_metadata](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.usage_metadata) attribute. When populated, this attribute will be a [UsageMetadata](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html) dictionary with standard keys (e.g., `\"input_tokens\"` and `\"output_tokens\"`).\n",
"LangChain `AIMessage` objects include a [usage_metadata](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.usage_metadata) attribute. When populated, this attribute will be a [UsageMetadata](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html) dictionary with standard keys (e.g., `\"input_tokens\"` and `\"output_tokens\"`).\n",
"\n",
"Examples:\n",
"\n",
@@ -71,7 +71,7 @@
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\")\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\")\n",
"openai_response = llm.invoke(\"hello\")\n",
"openai_response.usage_metadata"
]
@@ -118,7 +118,7 @@
"source": [
"### Using AIMessage.response_metadata\n",
"\n",
"Metadata from the model response is also included in the AIMessage [response_metadata](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.response_metadata) attribute. These data are typically not standardized. Note that different providers adopt different conventions for representing token counts:"
"Metadata from the model response is also included in the AIMessage [response_metadata](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.response_metadata) attribute. These data are typically not standardized. Note that different providers adopt different conventions for representing token counts:"
]
},
{
@@ -153,13 +153,11 @@
"\n",
"#### OpenAI\n",
"\n",
"For example, OpenAI will return a message [chunk](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html) at the end of a stream with token usage information. This behavior is supported by `langchain-openai >= 0.1.9` and can be enabled by setting `stream_usage=True`. This attribute can also be set when `ChatOpenAI` is instantiated.\n",
"For example, OpenAI will return a message [chunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html) at the end of a stream with token usage information. This behavior is supported by `langchain-openai >= 0.1.9` and can be enabled by setting `stream_usage=True`. This attribute can also be set when `ChatOpenAI` is instantiated.\n",
"\n",
"```{=mdx}\n",
":::note\n",
"By default, the last message chunk in a stream will include a `\"finish_reason\"` in the message's `response_metadata` attribute. If we include token usage in streaming mode, an additional chunk containing usage metadata will be added to the end of the stream, such that `\"finish_reason\"` appears on the second to last message chunk.\n",
":::\n",
"```"
":::\n"
]
},
{
@@ -182,13 +180,13 @@
"content=' you' id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623'\n",
"content=' today' id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623'\n",
"content='?' id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623'\n",
"content='' response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-3.5-turbo-0125'} id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623'\n",
"content='' response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-mini'} id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623'\n",
"content='' id='run-adb20c31-60c7-43a2-99b2-d4a53ca5f623' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}\n"
]
}
],
"source": [
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\")\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\")\n",
"\n",
"aggregate = None\n",
"for chunk in llm.stream(\"hello\", stream_usage=True):\n",
@@ -252,7 +250,7 @@
"content=' you' id='run-8e758550-94b0-4cca-a298-57482793c25d'\n",
"content=' today' id='run-8e758550-94b0-4cca-a298-57482793c25d'\n",
"content='?' id='run-8e758550-94b0-4cca-a298-57482793c25d'\n",
"content='' response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-3.5-turbo-0125'} id='run-8e758550-94b0-4cca-a298-57482793c25d'\n"
"content='' response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-mini'} id='run-8e758550-94b0-4cca-a298-57482793c25d'\n"
]
}
],
@@ -289,7 +287,7 @@
}
],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class Joke(BaseModel):\n",
@@ -300,7 +298,7 @@
"\n",
"\n",
"llm = ChatOpenAI(\n",
" model=\"gpt-3.5-turbo-0125\",\n",
" model=\"gpt-4o-mini\",\n",
" stream_usage=True,\n",
")\n",
"# Under the hood, .with_structured_output binds tools to the\n",
@@ -362,7 +360,7 @@
"from langchain_community.callbacks.manager import get_openai_callback\n",
"\n",
"llm = ChatOpenAI(\n",
" model=\"gpt-3.5-turbo-0125\",\n",
" model=\"gpt-4o-mini\",\n",
" temperature=0,\n",
" stream_usage=True,\n",
")\n",

View File

@@ -77,7 +77,7 @@
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"chat = ChatOpenAI(model=\"gpt-3.5-turbo-0125\")"
"chat = ChatOpenAI(model=\"gpt-4o-mini\")"
]
},
{
@@ -140,7 +140,7 @@
"\n",
"## Chat history\n",
"\n",
"It's perfectly fine to store and pass messages directly as an array, but we can use LangChain's built-in [message history class](https://python.langchain.com/v0.2/api_reference/langchain/index.html#module-langchain.memory) to store and load messages as well. Instances of this class are responsible for storing and loading chat messages from persistent storage. LangChain integrates with many providers - you can see a [list of integrations here](/docs/integrations/memory) - but for this demo we will use an ephemeral demo class.\n",
"It's perfectly fine to store and pass messages directly as an array, but we can use LangChain's built-in [message history class](https://python.langchain.com/api_reference/langchain/index.html#module-langchain.memory) to store and load messages as well. Instances of this class are responsible for storing and loading chat messages from persistent storage. LangChain integrates with many providers - you can see a [list of integrations here](/docs/integrations/memory) - but for this demo we will use an ephemeral demo class.\n",
"\n",
"Here's an example of the API:"
]
@@ -191,7 +191,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='You just asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 61, 'total_tokens': 79}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5cbb21c2-9c30-4031-8ea8-bfc497989535-0', usage_metadata={'input_tokens': 61, 'output_tokens': 18, 'total_tokens': 79})"
"AIMessage(content='You just asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 61, 'total_tokens': 79}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5cbb21c2-9c30-4031-8ea8-bfc497989535-0', usage_metadata={'input_tokens': 61, 'output_tokens': 18, 'total_tokens': 79})"
]
},
"execution_count": 5,
@@ -312,7 +312,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='\"J\\'adore la programmation.\"', response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 39, 'total_tokens': 48}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-648b0822-b0bb-47a2-8e7d-7d34744be8f2-0', usage_metadata={'input_tokens': 39, 'output_tokens': 9, 'total_tokens': 48})"
"AIMessage(content='\"J\\'adore la programmation.\"', response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 39, 'total_tokens': 48}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-648b0822-b0bb-47a2-8e7d-7d34744be8f2-0', usage_metadata={'input_tokens': 39, 'output_tokens': 9, 'total_tokens': 48})"
]
},
"execution_count": 8,
@@ -342,7 +342,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='You asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 63, 'total_tokens': 80}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5950435c-1dc2-43a6-836f-f989fd62c95e-0', usage_metadata={'input_tokens': 63, 'output_tokens': 17, 'total_tokens': 80})"
"AIMessage(content='You asked me to translate the sentence \"I love programming\" from English to French.', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 63, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5950435c-1dc2-43a6-836f-f989fd62c95e-0', usage_metadata={'input_tokens': 63, 'output_tokens': 17, 'total_tokens': 80})"
]
},
"execution_count": 9,
@@ -421,7 +421,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72})"
"AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72})"
]
},
"execution_count": 22,
@@ -501,7 +501,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})"
"AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})"
]
},
"execution_count": 24,
@@ -529,9 +529,9 @@
" HumanMessage(content='How are you today?'),\n",
" AIMessage(content='Fine thanks!'),\n",
" HumanMessage(content=\"What's my name?\"),\n",
" AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72}),\n",
" AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f8aabef8-631a-4238-a39b-701e881fbe47-0', usage_metadata={'input_tokens': 66, 'output_tokens': 6, 'total_tokens': 72}),\n",
" HumanMessage(content='Where does P. Sherman live?'),\n",
" AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})]"
" AIMessage(content='P. Sherman is a fictional character from the animated movie \"Finding Nemo\" who lives at 42 Wallaby Way, Sydney.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 53, 'total_tokens': 80}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5642ef3a-fdbe-43cf-a575-d1785976a1b9-0', usage_metadata={'input_tokens': 53, 'output_tokens': 27, 'total_tokens': 80})]"
]
},
"execution_count": 25,
@@ -565,7 +565,7 @@
{
"data": {
"text/plain": [
"AIMessage(content=\"I'm sorry, but I don't have access to your personal information, so I don't know your name. How else may I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 74, 'total_tokens': 105}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0ab03495-1f7c-4151-9070-56d2d1c565ff-0', usage_metadata={'input_tokens': 74, 'output_tokens': 31, 'total_tokens': 105})"
"AIMessage(content=\"I'm sorry, but I don't have access to your personal information, so I don't know your name. How else may I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 74, 'total_tokens': 105}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0ab03495-1f7c-4151-9070-56d2d1c565ff-0', usage_metadata={'input_tokens': 74, 'output_tokens': 31, 'total_tokens': 105})"
]
},
"execution_count": 27,

View File

@@ -71,7 +71,7 @@
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"chat = ChatOpenAI(model=\"gpt-3.5-turbo-1106\", temperature=0.2)"
"chat = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0.2)"
]
},
{

View File

@@ -70,7 +70,7 @@
"\n",
"# Choose the LLM that will drive the agent\n",
"# Only certain models support this\n",
"chat = ChatOpenAI(model=\"gpt-3.5-turbo-1106\", temperature=0)"
"chat = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)"
]
},
{

View File

@@ -7,7 +7,7 @@
"source": [
"# How to split code\n",
"\n",
"[RecursiveCharacterTextSplitter](https://python.langchain.com/v0.2/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for splitting text in a specific programming language.\n",
"[RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for splitting text in a specific programming language.\n",
"\n",
"Supported languages are stored in the `langchain_text_splitters.Language` enum. They include:\n",
"\n",

View File

@@ -58,7 +58,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -99,7 +100,7 @@
"id": "b0f74589",
"metadata": {},
"source": [
"Above, we defined `temperature` as a [`ConfigurableField`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.utils.ConfigurableField.html#langchain_core.runnables.utils.ConfigurableField) that we can set at runtime. To do so, we use the [`with_config`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config) method like this:"
"Above, we defined `temperature` as a [`ConfigurableField`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.utils.ConfigurableField.html#langchain_core.runnables.utils.ConfigurableField) that we can set at runtime. To do so, we use the [`with_config`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config) method like this:"
]
},
{
@@ -281,7 +282,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"ANTHROPIC_API_KEY\"] = getpass()"
"if \"ANTHROPIC_API_KEY\" not in os.environ:\n",
" os.environ[\"ANTHROPIC_API_KEY\"] = getpass()"
]
},
{

View File

@@ -227,7 +227,7 @@
"source": [
"### `LLMListwiseRerank`\n",
"\n",
"[LLMListwiseRerank](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html) uses [zero-shot listwise document reranking](https://arxiv.org/pdf/2305.02156) and functions similarly to `LLMChainFilter` as a robust but more expensive option. It is recommended to use a more powerful LLM.\n",
"[LLMListwiseRerank](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html) uses [zero-shot listwise document reranking](https://arxiv.org/pdf/2305.02156) and functions similarly to `LLMChainFilter` as a robust but more expensive option. It is recommended to use a more powerful LLM.\n",
"\n",
"Note that `LLMListwiseRerank` requires a model with the [with_structured_output](/docs/integrations/chat/) method implemented."
]
@@ -258,7 +258,7 @@
"from langchain.retrievers.document_compressors import LLMListwiseRerank\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)\n",
"\n",
"_filter = LLMListwiseRerank.from_llm(llm, top_n=1)\n",
"compression_retriever = ContextualCompressionRetriever(\n",

View File

@@ -42,13 +42,13 @@
"source": [
"LangChain [tools](/docs/concepts#tools) are interfaces that an agent, chain, or chat model can use to interact with the world. See [here](/docs/how_to/#tools) for how-to guides covering tool-calling, built-in tools, custom tools, and more information.\n",
"\n",
"LangChain tools-- instances of [BaseTool](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.BaseTool.html)-- are [Runnables](/docs/concepts/#runnable-interface) with additional constraints that enable them to be invoked effectively by language models:\n",
"LangChain tools-- instances of [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html)-- are [Runnables](/docs/concepts/#runnable-interface) with additional constraints that enable them to be invoked effectively by language models:\n",
"\n",
"- Their inputs are constrained to be serializable, specifically strings and Python `dict` objects;\n",
"- They contain names and descriptions indicating how and when they should be used;\n",
"- They may contain a detailed [args_schema](https://python.langchain.com/v0.2/docs/how_to/custom_tools/) for their arguments. That is, while a tool (as a `Runnable`) might accept a single `dict` input, the specific keys and type information needed to populate a dict should be specified in the `args_schema`.\n",
"- They may contain a detailed [args_schema](https://python.langchain.com/docs/how_to/custom_tools/) for their arguments. That is, while a tool (as a `Runnable`) might accept a single `dict` input, the specific keys and type information needed to populate a dict should be specified in the `args_schema`.\n",
"\n",
"Runnables that accept string or `dict` input can be converted to tools using the [as_tool](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.as_tool) method, which allows for the specification of names, descriptions, and additional schema information for arguments."
"Runnables that accept string or `dict` input can be converted to tools using the [as_tool](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.as_tool) method, which allows for the specification of names, descriptions, and additional schema information for arguments."
]
},
{
@@ -180,7 +180,7 @@
"id": "32b1a992-8997-4c98-8eb2-c9fe9431b799",
"metadata": {},
"source": [
"Alternatively, the schema can be fully specified by directly passing the desired [args_schema](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.BaseTool.html#langchain_core.tools.BaseTool.args_schema) for the tool:"
"Alternatively, the schema can be fully specified by directly passing the desired [args_schema](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html#langchain_core.tools.BaseTool.args_schema) for the tool:"
]
},
{
@@ -190,7 +190,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class GSchema(BaseModel):\n",
@@ -266,11 +266,9 @@
"\n",
"We first instantiate a chat model that supports [tool calling](/docs/how_to/tool_calling/):\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />\n",
"```"
"<ChatModelTabs customVarName=\"llm\" />\n"
]
},
{
@@ -285,7 +283,7 @@
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
"llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)"
]
},
{
@@ -331,7 +329,7 @@
"id": "9ba737ac-43a2-4a6f-b855-5bd0305017f1",
"metadata": {},
"source": [
"We next create use a simple pre-built [LangGraph agent](https://python.langchain.com/v0.2/docs/tutorials/agents/) and provide it the tool:"
"We next create use a simple pre-built [LangGraph agent](https://python.langchain.com/docs/tutorials/agents/) and provide it the tool:"
]
},
{
@@ -362,11 +360,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_W8cnfOjwqEn4cFcg19LN9mYD', 'function': {'arguments': '{\"__arg1\":\"dogs\"}', 'name': 'pet_info_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 60, 'total_tokens': 79}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d7f81de9-1fb7-4caf-81ed-16dcdb0b2ab4-0', tool_calls=[{'name': 'pet_info_retriever', 'args': {'__arg1': 'dogs'}, 'id': 'call_W8cnfOjwqEn4cFcg19LN9mYD'}], usage_metadata={'input_tokens': 60, 'output_tokens': 19, 'total_tokens': 79})]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_W8cnfOjwqEn4cFcg19LN9mYD', 'function': {'arguments': '{\"__arg1\":\"dogs\"}', 'name': 'pet_info_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 60, 'total_tokens': 79}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d7f81de9-1fb7-4caf-81ed-16dcdb0b2ab4-0', tool_calls=[{'name': 'pet_info_retriever', 'args': {'__arg1': 'dogs'}, 'id': 'call_W8cnfOjwqEn4cFcg19LN9mYD'}], usage_metadata={'input_tokens': 60, 'output_tokens': 19, 'total_tokens': 79})]}}\n",
"----\n",
"{'tools': {'messages': [ToolMessage(content=\"[Document(id='86f835fe-4bbe-4ec6-aeb4-489a8b541707', page_content='Dogs are great companions, known for their loyalty and friendliness.')]\", name='pet_info_retriever', tool_call_id='call_W8cnfOjwqEn4cFcg19LN9mYD')]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='Dogs are known for being great companions, known for their loyalty and friendliness.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 134, 'total_tokens': 152}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9ca5847a-a5eb-44c0-a774-84cc2c5bbc5b-0', usage_metadata={'input_tokens': 134, 'output_tokens': 18, 'total_tokens': 152})]}}\n",
"{'agent': {'messages': [AIMessage(content='Dogs are known for being great companions, known for their loyalty and friendliness.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 134, 'total_tokens': 152}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9ca5847a-a5eb-44c0-a774-84cc2c5bbc5b-0', usage_metadata={'input_tokens': 134, 'output_tokens': 18, 'total_tokens': 152})]}}\n",
"----\n"
]
}
@@ -497,11 +495,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_17iLPWvOD23zqwd1QVQ00Y63', 'function': {'arguments': '{\"question\":\"What are dogs known for according to pirates?\",\"answer_style\":\"quote\"}', 'name': 'pet_expert'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 59, 'total_tokens': 87}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-7fef44f3-7bba-4e63-8c51-2ad9c5e65e2e-0', tool_calls=[{'name': 'pet_expert', 'args': {'question': 'What are dogs known for according to pirates?', 'answer_style': 'quote'}, 'id': 'call_17iLPWvOD23zqwd1QVQ00Y63'}], usage_metadata={'input_tokens': 59, 'output_tokens': 28, 'total_tokens': 87})]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_17iLPWvOD23zqwd1QVQ00Y63', 'function': {'arguments': '{\"question\":\"What are dogs known for according to pirates?\",\"answer_style\":\"quote\"}', 'name': 'pet_expert'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 59, 'total_tokens': 87}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-7fef44f3-7bba-4e63-8c51-2ad9c5e65e2e-0', tool_calls=[{'name': 'pet_expert', 'args': {'question': 'What are dogs known for according to pirates?', 'answer_style': 'quote'}, 'id': 'call_17iLPWvOD23zqwd1QVQ00Y63'}], usage_metadata={'input_tokens': 59, 'output_tokens': 28, 'total_tokens': 87})]}}\n",
"----\n",
"{'tools': {'messages': [ToolMessage(content='\"Dogs are known for their loyalty and friendliness, making them great companions for pirates on long sea voyages.\"', name='pet_expert', tool_call_id='call_17iLPWvOD23zqwd1QVQ00Y63')]}}\n",
"----\n",
"{'agent': {'messages': [AIMessage(content='According to pirates, dogs are known for their loyalty and friendliness, making them great companions for pirates on long sea voyages.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 119, 'total_tokens': 146}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5a30edc3-7be0-4743-b980-ca2f8cad9b8d-0', usage_metadata={'input_tokens': 119, 'output_tokens': 27, 'total_tokens': 146})]}}\n",
"{'agent': {'messages': [AIMessage(content='According to pirates, dogs are known for their loyalty and friendliness, making them great companions for pirates on long sea voyages.', response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 119, 'total_tokens': 146}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5a30edc3-7be0-4743-b980-ca2f8cad9b8d-0', usage_metadata={'input_tokens': 119, 'output_tokens': 27, 'total_tokens': 146})]}}\n",
"----\n"
]
}

View File

@@ -16,7 +16,7 @@
"\n",
"LangChain has some built-in callback handlers, but you will often want to create your own handlers with custom logic.\n",
"\n",
"To create a custom callback handler, we need to determine the [event(s)](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) we want our callback handler to handle as well as what we want our callback handler to do when the event is triggered. Then all we need to do is attach the callback handler to the object, for example via [the constructor](/docs/how_to/callbacks_constructor) or [at runtime](/docs/how_to/callbacks_runtime).\n",
"To create a custom callback handler, we need to determine the [event(s)](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) we want our callback handler to handle as well as what we want our callback handler to do when the event is triggered. Then all we need to do is attach the callback handler to the object, for example via [the constructor](/docs/how_to/callbacks_constructor) or [at runtime](/docs/how_to/callbacks_runtime).\n",
"\n",
"In the example below, we'll implement streaming with a custom handler.\n",
"\n",
@@ -107,7 +107,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see [this reference page](https://python.langchain.com/v0.2/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) for a list of events you can handle. Note that the `handle_chain_*` events run for most LCEL runnables.\n",
"You can see [this reference page](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.base.BaseCallbackHandler.html#langchain-core-callbacks-base-basecallbackhandler) for a list of events you can handle. Note that the `handle_chain_*` events run for most LCEL runnables.\n",
"\n",
"## Next steps\n",
"\n",

View File

@@ -16,7 +16,7 @@
"\n",
"In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n",
"\n",
"Wrapping your LLM with the standard [`BaseChatModel`](https://python.langchain.com/v0.2/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n",
"Wrapping your LLM with the standard [`BaseChatModel`](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n",
"\n",
"As an bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
"\n",
@@ -39,7 +39,7 @@
"| `AIMessageChunk` / `HumanMessageChunk` / ... | Chunk variant of each type of message. |\n",
"\n",
"\n",
"::: {.callout-note}\n",
":::note\n",
"`ToolMessage` and `FunctionMessage` closely follow OpenAI's `function` and `tool` roles.\n",
"\n",
"This is a rapidly developing field and as more models add function calling capabilities. Expect that there will be additions to this schema.\n",
@@ -145,7 +145,7 @@
"| `_astream` | Use to implement async version of `_stream`. | Optional |\n",
"\n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"The `_astream` implementation uses `run_in_executor` to launch the sync `_stream` in a separate thread if `_stream` is implemented, otherwise it fallsback to use `_agenerate`.\n",
"\n",
"You can use this trick if you want to reuse the `_stream` implementation, but if you're able to implement code that's natively async that's a better solution since that code will run with less overhead.\n",
@@ -503,7 +503,7 @@
"\n",
"Documentation:\n",
"\n",
"* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [APIReference](https://python.langchain.com/v0.2/api_reference/langchain/index.html).\n",
"* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [APIReference](https://python.langchain.com/api_reference/langchain/index.html).\n",
"* The class doc-string for the model contains a link to the model API if the model is powered by a service.\n",
"\n",
"Tests:\n",

View File

@@ -402,7 +402,7 @@
"\n",
"Documentation:\n",
"\n",
"* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [APIReference](https://python.langchain.com/v0.2/api_reference/langchain/index.html).\n",
"* The model contains doc-strings for all initialization arguments, as these will be surfaced in the [APIReference](https://python.langchain.com/api_reference/langchain/index.html).\n",
"* The class doc-string for the model contains a link to the model API if the model is powered by a service.\n",
"\n",
"Tests:\n",

View File

@@ -37,12 +37,12 @@
"\n",
"The logic inside of `_get_relevant_documents` can involve arbitrary calls to a database or to the web using requests.\n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"By inherting from `BaseRetriever`, your retriever automatically becomes a LangChain [Runnable](/docs/concepts#interface) and will gain the standard `Runnable` functionality out of the box!\n",
":::\n",
"\n",
"\n",
":::{.callout-info}\n",
":::info\n",
"You can use a `RunnableLambda` or `RunnableGenerator` to implement a retriever.\n",
"\n",
"The main benefit of implementing a retriever as a `BaseRetriever` vs. a `RunnableLambda` (a custom [runnable function](/docs/how_to/functions)) is that a `BaseRetriever` is a well\n",
@@ -270,7 +270,7 @@
"\n",
"Documentation:\n",
"\n",
"* The retriever contains doc-strings for all initialization arguments, as these will be surfaced in the [API Reference](https://python.langchain.com/v0.2/api_reference/langchain/index.html).\n",
"* The retriever contains doc-strings for all initialization arguments, as these will be surfaced in the [API Reference](https://python.langchain.com/api_reference/langchain/index.html).\n",
"* The class doc-string for the model contains a link to any relevant APIs used for the retriever (e.g., if the retriever is retrieving from wikipedia, it'll be good to link to the wikipedia API!)\n",
"\n",
"Tests:\n",

View File

@@ -13,20 +13,20 @@
"|---------------|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n",
"| name | str | Must be unique within a set of tools provided to an LLM or agent. |\n",
"| description | str | Describes what the tool does. Used as context by the LLM or agent. |\n",
"| args_schema | langchain.pydantic_v1.BaseModel | Optional but recommended, and required if using callback handlers. It can be used to provide more information (e.g., few-shot examples) or validation for expected parameters. |\n",
"| args_schema | pydantic.BaseModel | Optional but recommended, and required if using callback handlers. It can be used to provide more information (e.g., few-shot examples) or validation for expected parameters. |\n",
"| return_direct | boolean | Only relevant for agents. When True, after invoking the given tool, the agent will stop and return the result direcly to the user. |\n",
"\n",
"LangChain supports the creation of tools from:\n",
"\n",
"1. Functions;\n",
"2. LangChain [Runnables](/docs/concepts#runnable-interface);\n",
"3. By sub-classing from [BaseTool](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.BaseTool.html) -- This is the most flexible method, it provides the largest degree of control, at the expense of more effort and code.\n",
"3. By sub-classing from [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html) -- This is the most flexible method, it provides the largest degree of control, at the expense of more effort and code.\n",
"\n",
"Creating tools from functions may be sufficient for most use cases, and can be done via a simple [@tool decorator](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.tool.html#langchain_core.tools.tool). If more configuration is needed-- e.g., specification of both sync and async implementations-- one can also use the [StructuredTool.from_function](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method.\n",
"Creating tools from functions may be sufficient for most use cases, and can be done via a simple [@tool decorator](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.tool.html#langchain_core.tools.tool). If more configuration is needed-- e.g., specification of both sync and async implementations-- one can also use the [StructuredTool.from_function](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method.\n",
"\n",
"In this guide we provide an overview of these methods.\n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"\n",
"Models will perform better if the tools have well chosen names, descriptions and JSON schemas.\n",
":::"
@@ -48,7 +48,14 @@
"cell_type": "code",
"execution_count": 1,
"id": "cc7005cd-072f-4d37-8453-6297468e5192",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:52.645451Z",
"iopub.status.busy": "2024-09-10T20:25:52.645081Z",
"iopub.status.idle": "2024-09-10T20:25:53.030958Z",
"shell.execute_reply": "2024-09-10T20:25:53.030669Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -88,7 +95,14 @@
"cell_type": "code",
"execution_count": 2,
"id": "0c0991db-b997-4611-be37-4346e660506b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.032544Z",
"iopub.status.busy": "2024-09-10T20:25:53.032420Z",
"iopub.status.idle": "2024-09-10T20:25:53.035349Z",
"shell.execute_reply": "2024-09-10T20:25:53.035123Z"
}
},
"outputs": [],
"source": [
"from langchain_core.tools import tool\n",
@@ -112,22 +126,29 @@
"cell_type": "code",
"execution_count": 3,
"id": "5626423f-053e-4a66-adca-1d794d835397",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.036658Z",
"iopub.status.busy": "2024-09-10T20:25:53.036574Z",
"iopub.status.idle": "2024-09-10T20:25:53.041154Z",
"shell.execute_reply": "2024-09-10T20:25:53.040964Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'title': 'multiply_by_maxSchema',\n",
" 'description': 'Multiply a by the maximum of b.',\n",
" 'type': 'object',\n",
" 'properties': {'a': {'title': 'A',\n",
" 'description': 'scale factor',\n",
"{'description': 'Multiply a by the maximum of b.',\n",
" 'properties': {'a': {'description': 'scale factor',\n",
" 'title': 'A',\n",
" 'type': 'string'},\n",
" 'b': {'title': 'B',\n",
" 'description': 'list of ints over which to take maximum',\n",
" 'type': 'array',\n",
" 'items': {'type': 'integer'}}},\n",
" 'required': ['a', 'b']}"
" 'b': {'description': 'list of ints over which to take maximum',\n",
" 'items': {'type': 'integer'},\n",
" 'title': 'B',\n",
" 'type': 'array'}},\n",
" 'required': ['a', 'b'],\n",
" 'title': 'multiply_by_maxSchema',\n",
" 'type': 'object'}"
]
},
"execution_count": 3,
@@ -163,7 +184,14 @@
"cell_type": "code",
"execution_count": 4,
"id": "9216d03a-f6ea-4216-b7e1-0661823a4c0b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.042516Z",
"iopub.status.busy": "2024-09-10T20:25:53.042427Z",
"iopub.status.idle": "2024-09-10T20:25:53.045217Z",
"shell.execute_reply": "2024-09-10T20:25:53.045010Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -171,13 +199,13 @@
"text": [
"multiplication-tool\n",
"Multiply two numbers.\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n",
"{'a': {'description': 'first number', 'title': 'A', 'type': 'integer'}, 'b': {'description': 'second number', 'title': 'B', 'type': 'integer'}}\n",
"True\n"
]
}
],
"source": [
"from langchain.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class CalculatorInput(BaseModel):\n",
@@ -218,19 +246,26 @@
"cell_type": "code",
"execution_count": 5,
"id": "336f5538-956e-47d5-9bde-b732559f9e61",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.046526Z",
"iopub.status.busy": "2024-09-10T20:25:53.046456Z",
"iopub.status.idle": "2024-09-10T20:25:53.050045Z",
"shell.execute_reply": "2024-09-10T20:25:53.049836Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'title': 'fooSchema',\n",
" 'description': 'The foo.',\n",
" 'type': 'object',\n",
" 'properties': {'bar': {'title': 'Bar',\n",
" 'description': 'The bar.',\n",
"{'description': 'The foo.',\n",
" 'properties': {'bar': {'description': 'The bar.',\n",
" 'title': 'Bar',\n",
" 'type': 'string'},\n",
" 'baz': {'title': 'Baz', 'description': 'The baz.', 'type': 'integer'}},\n",
" 'required': ['bar', 'baz']}"
" 'baz': {'description': 'The baz.', 'title': 'Baz', 'type': 'integer'}},\n",
" 'required': ['bar', 'baz'],\n",
" 'title': 'fooSchema',\n",
" 'type': 'object'}"
]
},
"execution_count": 5,
@@ -258,8 +293,8 @@
"id": "f18a2503-5393-421b-99fa-4a01dd824d0e",
"metadata": {},
"source": [
":::{.callout-caution}\n",
"By default, `@tool(parse_docstring=True)` will raise `ValueError` if the docstring does not parse correctly. See [API Reference](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.tool.html) for detail and examples.\n",
":::caution\n",
"By default, `@tool(parse_docstring=True)` will raise `ValueError` if the docstring does not parse correctly. See [API Reference](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.tool.html) for detail and examples.\n",
":::"
]
},
@@ -277,7 +312,14 @@
"cell_type": "code",
"execution_count": 6,
"id": "564fbe6f-11df-402d-b135-ef6ff25e1e63",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.051302Z",
"iopub.status.busy": "2024-09-10T20:25:53.051218Z",
"iopub.status.idle": "2024-09-10T20:25:53.059704Z",
"shell.execute_reply": "2024-09-10T20:25:53.059490Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -320,7 +362,14 @@
"cell_type": "code",
"execution_count": 7,
"id": "6bc055d4-1fbe-4db5-8881-9c382eba6b1b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.060971Z",
"iopub.status.busy": "2024-09-10T20:25:53.060883Z",
"iopub.status.idle": "2024-09-10T20:25:53.064615Z",
"shell.execute_reply": "2024-09-10T20:25:53.064408Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -329,7 +378,7 @@
"6\n",
"Calculator\n",
"multiply numbers\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n"
"{'a': {'description': 'first number', 'title': 'A', 'type': 'integer'}, 'b': {'description': 'second number', 'title': 'B', 'type': 'integer'}}\n"
]
}
],
@@ -366,24 +415,39 @@
"source": [
"## Creating tools from Runnables\n",
"\n",
"LangChain [Runnables](/docs/concepts#runnable-interface) that accept string or `dict` input can be converted to tools using the [as_tool](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.as_tool) method, which allows for the specification of names, descriptions, and additional schema information for arguments.\n",
"LangChain [Runnables](/docs/concepts#runnable-interface) that accept string or `dict` input can be converted to tools using the [as_tool](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.as_tool) method, which allows for the specification of names, descriptions, and additional schema information for arguments.\n",
"\n",
"Example usage:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "8ef593c5-cf72-4c10-bfc9-7d21874a0c24",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.065797Z",
"iopub.status.busy": "2024-09-10T20:25:53.065733Z",
"iopub.status.idle": "2024-09-10T20:25:53.130458Z",
"shell.execute_reply": "2024-09-10T20:25:53.130229Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/4j/2rz3865x6qg07tx43146py8h0000gn/T/ipykernel_95770/2548361071.py:14: LangChainBetaWarning: This API is in beta and may change in the future.\n",
" as_tool = chain.as_tool(\n"
]
},
{
"data": {
"text/plain": [
"{'answer_style': {'title': 'Answer Style', 'type': 'string'}}"
]
},
"execution_count": 9,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -428,19 +492,26 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"id": "1dad8f8e",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.131904Z",
"iopub.status.busy": "2024-09-10T20:25:53.131803Z",
"iopub.status.idle": "2024-09-10T20:25:53.136797Z",
"shell.execute_reply": "2024-09-10T20:25:53.136563Z"
}
},
"outputs": [],
"source": [
"from typing import Optional, Type\n",
"\n",
"from langchain.pydantic_v1 import BaseModel\n",
"from langchain_core.callbacks import (\n",
" AsyncCallbackManagerForToolRun,\n",
" CallbackManagerForToolRun,\n",
")\n",
"from langchain_core.tools import BaseTool\n",
"from pydantic import BaseModel\n",
"\n",
"\n",
"class CalculatorInput(BaseModel):\n",
@@ -448,9 +519,11 @@
" b: int = Field(description=\"second number\")\n",
"\n",
"\n",
"# Note: It's important that every field has type hints. BaseTool is a\n",
"# Pydantic class and not having type hints can lead to unexpected behavior.\n",
"class CustomCalculatorTool(BaseTool):\n",
" name = \"Calculator\"\n",
" description = \"useful for when you need to answer questions about math\"\n",
" name: str = \"Calculator\"\n",
" description: str = \"useful for when you need to answer questions about math\"\n",
" args_schema: Type[BaseModel] = CalculatorInput\n",
" return_direct: bool = True\n",
"\n",
@@ -477,9 +550,16 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"id": "bb551c33",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.138074Z",
"iopub.status.busy": "2024-09-10T20:25:53.138007Z",
"iopub.status.idle": "2024-09-10T20:25:53.141360Z",
"shell.execute_reply": "2024-09-10T20:25:53.141158Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -487,7 +567,7 @@
"text": [
"Calculator\n",
"useful for when you need to answer questions about math\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n",
"{'a': {'description': 'first number', 'title': 'A', 'type': 'integer'}, 'b': {'description': 'second number', 'title': 'B', 'type': 'integer'}}\n",
"True\n",
"6\n",
"6\n"
@@ -512,7 +592,7 @@
"source": [
"## How to create async tools\n",
"\n",
"LangChain Tools implement the [Runnable interface 🏃](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"LangChain Tools implement the [Runnable interface 🏃](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"\n",
"All Runnables expose the `invoke` and `ainvoke` methods (as well as other methods like `batch`, `abatch`, `astream` etc).\n",
"\n",
@@ -528,9 +608,16 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 11,
"id": "6615cb77-fd4c-4676-8965-f92cc71d4944",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.142587Z",
"iopub.status.busy": "2024-09-10T20:25:53.142504Z",
"iopub.status.idle": "2024-09-10T20:25:53.147205Z",
"shell.execute_reply": "2024-09-10T20:25:53.146995Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -560,9 +647,16 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 12,
"id": "bb2af583-eadd-41f4-a645-bf8748bd3dcd",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.148383Z",
"iopub.status.busy": "2024-09-10T20:25:53.148307Z",
"iopub.status.idle": "2024-09-10T20:25:53.152684Z",
"shell.execute_reply": "2024-09-10T20:25:53.152486Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -605,9 +699,16 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 13,
"id": "4ad0932c-8610-4278-8c57-f9218f654c8a",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.153849Z",
"iopub.status.busy": "2024-09-10T20:25:53.153773Z",
"iopub.status.idle": "2024-09-10T20:25:53.158312Z",
"shell.execute_reply": "2024-09-10T20:25:53.158130Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -650,9 +751,16 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 14,
"id": "7094c0e8-6192-4870-a942-aad5b5ae48fd",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.159440Z",
"iopub.status.busy": "2024-09-10T20:25:53.159364Z",
"iopub.status.idle": "2024-09-10T20:25:53.160922Z",
"shell.execute_reply": "2024-09-10T20:25:53.160712Z"
}
},
"outputs": [],
"source": [
"from langchain_core.tools import ToolException\n",
@@ -673,9 +781,16 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 15,
"id": "b4d22022-b105-4ccc-a15b-412cb9ea3097",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.162046Z",
"iopub.status.busy": "2024-09-10T20:25:53.161968Z",
"iopub.status.idle": "2024-09-10T20:25:53.165236Z",
"shell.execute_reply": "2024-09-10T20:25:53.165052Z"
}
},
"outputs": [
{
"data": {
@@ -683,7 +798,7 @@
"'Error: There is no city by the name of foobar.'"
]
},
"execution_count": 16,
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
@@ -707,9 +822,16 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 16,
"id": "3fad1728-d367-4e1b-9b54-3172981271cf",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.166372Z",
"iopub.status.busy": "2024-09-10T20:25:53.166294Z",
"iopub.status.idle": "2024-09-10T20:25:53.169739Z",
"shell.execute_reply": "2024-09-10T20:25:53.169553Z"
}
},
"outputs": [
{
"data": {
@@ -717,7 +839,7 @@
"\"There is no such city, but it's probably above 0K there!\""
]
},
"execution_count": 17,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
@@ -741,9 +863,16 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 17,
"id": "ebfe7c1f-318d-4e58-99e1-f31e69473c46",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.170937Z",
"iopub.status.busy": "2024-09-10T20:25:53.170859Z",
"iopub.status.idle": "2024-09-10T20:25:53.174498Z",
"shell.execute_reply": "2024-09-10T20:25:53.174304Z"
}
},
"outputs": [
{
"data": {
@@ -751,7 +880,7 @@
"'The following errors occurred during tool execution: `Error: There is no city by the name of foobar.`'"
]
},
"execution_count": 18,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -778,7 +907,7 @@
"\n",
"Sometimes there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns custom objects like Documents, we may want to pass some view or metadata about this output to the model without passing the raw output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n",
"\n",
"The Tool and [ToolMessage](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) interfaces make it possible to distinguish between the parts of the tool output meant for the model (this is the ToolMessage.content) and those parts which are meant for use outside the model (ToolMessage.artifact).\n",
"The Tool and [ToolMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) interfaces make it possible to distinguish between the parts of the tool output meant for the model (this is the ToolMessage.content) and those parts which are meant for use outside the model (ToolMessage.artifact).\n",
"\n",
":::info Requires ``langchain-core >= 0.2.19``\n",
"\n",
@@ -791,9 +920,16 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 18,
"id": "14905425-0334-43a0-9de9-5bcf622ede0e",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.175683Z",
"iopub.status.busy": "2024-09-10T20:25:53.175605Z",
"iopub.status.idle": "2024-09-10T20:25:53.178798Z",
"shell.execute_reply": "2024-09-10T20:25:53.178601Z"
}
},
"outputs": [],
"source": [
"import random\n",
@@ -820,9 +956,16 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 19,
"id": "0f2e1528-404b-46e6-b87c-f0957c4b9217",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.179881Z",
"iopub.status.busy": "2024-09-10T20:25:53.179807Z",
"iopub.status.idle": "2024-09-10T20:25:53.182100Z",
"shell.execute_reply": "2024-09-10T20:25:53.181940Z"
}
},
"outputs": [
{
"data": {
@@ -830,7 +973,7 @@
"'Successfully generated array of 10 random ints in [0, 9].'"
]
},
"execution_count": 9,
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
@@ -849,17 +992,24 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 20,
"id": "cc197777-26eb-46b3-a83b-c2ce116c6311",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.183238Z",
"iopub.status.busy": "2024-09-10T20:25:53.183170Z",
"iopub.status.idle": "2024-09-10T20:25:53.185752Z",
"shell.execute_reply": "2024-09-10T20:25:53.185567Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ToolMessage(content='Successfully generated array of 10 random ints in [0, 9].', name='generate_random_ints', tool_call_id='123', artifact=[1, 4, 2, 5, 3, 9, 0, 4, 7, 7])"
"ToolMessage(content='Successfully generated array of 10 random ints in [0, 9].', name='generate_random_ints', tool_call_id='123', artifact=[4, 8, 2, 4, 1, 0, 9, 5, 8, 1])"
]
},
"execution_count": 3,
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
@@ -885,9 +1035,16 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 21,
"id": "fe1a09d1-378b-4b91-bb5e-0697c3d7eb92",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.186884Z",
"iopub.status.busy": "2024-09-10T20:25:53.186803Z",
"iopub.status.idle": "2024-09-10T20:25:53.190718Z",
"shell.execute_reply": "2024-09-10T20:25:53.190494Z"
}
},
"outputs": [],
"source": [
"from langchain_core.tools import BaseTool\n",
@@ -917,17 +1074,24 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 22,
"id": "8c3d16f6-1c4a-48ab-b05a-38547c592e79",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:25:53.191872Z",
"iopub.status.busy": "2024-09-10T20:25:53.191794Z",
"iopub.status.idle": "2024-09-10T20:25:53.194396Z",
"shell.execute_reply": "2024-09-10T20:25:53.194184Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ToolMessage(content='Generated 3 floats in [0.1, 3.3333], rounded to 4 decimals.', name='generate_random_floats', tool_call_id='123', artifact=[1.4277, 0.7578, 2.4871])"
"ToolMessage(content='Generated 3 floats in [0.1, 3.3333], rounded to 4 decimals.', name='generate_random_floats', tool_call_id='123', artifact=[1.5566, 0.5134, 2.7914])"
]
},
"execution_count": 8,
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}

View File

@@ -49,13 +49,11 @@
"\n",
"Let's suppose we have an agent, and want to visualize the actions it takes and tool outputs it receives. Without any debugging, here's what we see:\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
"/>\n",
"```"
"/>\n"
]
},
{

View File

@@ -9,7 +9,7 @@
"\n",
"A [comma-separated values (CSV)](https://en.wikipedia.org/wiki/Comma-separated_values) file is a delimited text file that uses a comma to separate values. Each line of the file is a data record. Each record consists of one or more fields, separated by commas.\n",
"\n",
"LangChain implements a [CSV Loader](https://python.langchain.com/v0.2/api_reference/community/document_loaders/langchain_community.document_loaders.csv_loader.CSVLoader.html) that will load CSV files into a sequence of [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects. Each row of the CSV file is translated to one document."
"LangChain implements a [CSV Loader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.csv_loader.CSVLoader.html) that will load CSV files into a sequence of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects. Each row of the CSV file is translated to one document."
]
},
{
@@ -88,7 +88,7 @@
"source": [
"## Specify a column to identify the document source\n",
"\n",
"The `\"source\"` key on [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) metadata can be set using a column of the CSV. Use the `source_column` argument to specify a source for the document created from each row. Otherwise `file_path` will be used as the source for all documents created from the CSV file.\n",
"The `\"source\"` key on [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) metadata can be set using a column of the CSV. Use the `source_column` argument to specify a source for the document created from each row. Otherwise `file_path` will be used as the source for all documents created from the CSV file.\n",
"\n",
"This is useful when using documents loaded from CSV files for chains that answer questions using sources."
]

View File

@@ -63,7 +63,7 @@
"* The `load` methods is a convenience method meant solely for prototyping work -- it just invokes `list(self.lazy_load())`.\n",
"* The `alazy_load` has a default implementation that will delegate to `lazy_load`. If you're using async, we recommend overriding the default implementation and providing a native async implementation.\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"When implementing a document loader do **NOT** provide parameters via the `lazy_load` or `alazy_load` methods.\n",
"\n",
"All configuration is expected to be passed through the initializer (__init__). This was a design choice made by LangChain to make sure that once a document loader has been instantiated it has all the information needed to load documents.\n",
@@ -235,7 +235,7 @@
"id": "56cb443e-f987-4386-b4ec-975ee129adb2",
"metadata": {},
"source": [
":::{.callout-tip}\n",
":::tip\n",
"\n",
"`load()` can be helpful in an interactive environment such as a jupyter notebook.\n",
"\n",

View File

@@ -7,7 +7,7 @@
"source": [
"# How to load documents from a directory\n",
"\n",
"LangChain's [DirectoryLoader](https://python.langchain.com/v0.2/api_reference/community/document_loaders/langchain_community.document_loaders.directory.DirectoryLoader.html) implements functionality for reading files from disk into LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects. Here we demonstrate:\n",
"LangChain's [DirectoryLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.directory.DirectoryLoader.html) implements functionality for reading files from disk into LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects. Here we demonstrate:\n",
"\n",
"- How to load from a filesystem, including use of wildcard patterns;\n",
"- How to use multithreading for file I/O;\n",
@@ -134,7 +134,7 @@
"metadata": {},
"source": [
"## Change loader class\n",
"By default this uses the `UnstructuredLoader` class. To customize the loader, specify the loader class in the `loader_cls` kwarg. Below we show an example using [TextLoader](https://python.langchain.com/v0.2/api_reference/community/document_loaders/langchain_community.document_loaders.text.TextLoader.html):"
"By default this uses the `UnstructuredLoader` class. To customize the loader, specify the loader class in the `loader_cls` kwarg. Below we show an example using [TextLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.text.TextLoader.html):"
]
},
{

View File

@@ -9,7 +9,7 @@
"\n",
"The HyperText Markup Language or [HTML](https://en.wikipedia.org/wiki/HTML) is the standard markup language for documents designed to be displayed in a web browser.\n",
"\n",
"This covers how to load `HTML` documents into a LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects that we can use downstream.\n",
"This covers how to load `HTML` documents into a LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects that we can use downstream.\n",
"\n",
"Parsing HTML files often requires specialized tools. Here we demonstrate parsing via [Unstructured](https://unstructured-io.github.io/unstructured/) and [BeautifulSoup4](https://beautiful-soup-4.readthedocs.io/en/latest/), which can be installed via pip. Head over to the integrations page to find integrations with additional services, such as [Azure AI Document Intelligence](/docs/integrations/document_loaders/azure_document_intelligence) or [FireCrawl](/docs/integrations/document_loaders/firecrawl).\n",
"\n",

View File

@@ -4,8 +4,8 @@
[JSON Lines](https://jsonlines.org/) is a file format where each line is a valid JSON value.
LangChain implements a [JSONLoader](https://python.langchain.com/v0.2/api_reference/community/document_loaders/langchain_community.document_loaders.json_loader.JSONLoader.html)
to convert JSON and JSONL data into LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document)
LangChain implements a [JSONLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.json_loader.JSONLoader.html)
to convert JSON and JSONL data into LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document)
objects. It uses a specified [jq schema](https://en.wikipedia.org/wiki/Jq_(programming_language)) to parse the JSON files, allowing for the extraction of specific fields into the content
and metadata of the LangChain Document.
@@ -41,10 +41,7 @@ data = json.loads(Path(file_path).read_text())
```python
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
{'image': {'creation_timestamp': 1675549016, 'uri': 'image_of_the_chat.jpg'},
'is_still_participant': True,
'joinable_mode': {'link': '', 'mode': 1},
@@ -91,7 +88,6 @@ pprint(data)
'title': 'User 1 and User 2 chat'}
```
</CodeOutputBlock>
## Using `JSONLoader`
@@ -115,9 +111,7 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='Bye!', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 1}),
Document(page_content='Oh no worries! Bye', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 2}),
Document(page_content='No Im sorry it was my mistake, the blue one is not for sale', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 3}),
@@ -131,7 +125,6 @@ pprint(data)
Document(page_content='Hi! Im interested in your bag. Im offering $50. Let me know if you are interested. Thanks!', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 11})]
```
</CodeOutputBlock>
### JSON Lines file
@@ -144,9 +137,7 @@ file_path = './example_data/facebook_chat_messages.jsonl'
pprint(Path(file_path).read_text())
```
<CodeOutputBlock lang="python">
```
```output
('{"sender_name": "User 2", "timestamp_ms": 1675597571851, "content": "Bye!"}\n'
'{"sender_name": "User 1", "timestamp_ms": 1675597435669, "content": "Oh no '
'worries! Bye"}\n'
@@ -154,7 +145,6 @@ pprint(Path(file_path).read_text())
'sorry it was my mistake, the blue one is not for sale"}\n')
```
</CodeOutputBlock>
```python
@@ -171,15 +161,12 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='Bye!', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 1}),
Document(page_content='Oh no worries! Bye', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 2}),
Document(page_content='No Im sorry it was my mistake, the blue one is not for sale', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 3})]
```
</CodeOutputBlock>
Another option is to set `jq_schema='.'` and provide `content_key`:
@@ -198,15 +185,12 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='User 2', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 1}),
Document(page_content='User 1', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 2}),
Document(page_content='User 2', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat_messages.jsonl', 'seq_num': 3})]
```
</CodeOutputBlock>
### JSON file with jq schema `content_key`
@@ -218,9 +202,7 @@ file_path = './sample.json'
pprint(Path(file_path).read_text())
```
<CodeOutputBlock lang="python">
```json
```outputjson
{"data": [
{"attributes": {
"message": "message1",
@@ -234,7 +216,6 @@ pprint(Path(file_path).read_text())
"id": "2"}]}
```
</CodeOutputBlock>
```python
@@ -252,14 +233,11 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='message1', metadata={'source': '/path/to/sample.json', 'seq_num': 1}),
Document(page_content='message2', metadata={'source': '/path/to/sample.json', 'seq_num': 2})]
```
</CodeOutputBlock>
## Extracting metadata
@@ -309,9 +287,7 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='Bye!', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 1, 'sender_name': 'User 2', 'timestamp_ms': 1675597571851}),
Document(page_content='Oh no worries! Bye', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 2, 'sender_name': 'User 1', 'timestamp_ms': 1675597435669}),
Document(page_content='No Im sorry it was my mistake, the blue one is not for sale', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 3, 'sender_name': 'User 2', 'timestamp_ms': 1675596277579}),
@@ -325,7 +301,6 @@ pprint(data)
Document(page_content='Hi! Im interested in your bag. Im offering $50. Let me know if you are interested. Thanks!', metadata={'source': '/Users/avsolatorio/WBG/langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 11, 'sender_name': 'User 1', 'timestamp_ms': 1675549022673})]
```
</CodeOutputBlock>
Now, you will see that the documents contain the metadata associated with the content we extracted.
@@ -368,9 +343,7 @@ data = loader.load()
pprint(data)
```
<CodeOutputBlock lang="python">
```
```output
[Document(page_content='Bye!', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 1, 'sender_name': 'User 2', 'timestamp_ms': 1675597571851}),
Document(page_content='Oh no worries! Bye', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 2, 'sender_name': 'User 1', 'timestamp_ms': 1675597435669}),
Document(page_content='No Im sorry it was my mistake, the blue one is not for sale', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 3, 'sender_name': 'User 2', 'timestamp_ms': 1675596277579}),
@@ -384,7 +357,6 @@ pprint(data)
Document(page_content='Hi! Im interested in your bag. Im offering $50. Let me know if you are interested. Thanks!', metadata={'source': 'langchain/docs/modules/indexes/document_loaders/examples/example_data/facebook_chat.json', 'seq_num': 11, 'sender_name': 'User 1', 'timestamp_ms': 1675549022673})]
```
</CodeOutputBlock>
## Common JSON structures with jq schema

View File

@@ -9,14 +9,14 @@
"\n",
"[Markdown](https://en.wikipedia.org/wiki/Markdown) is a lightweight markup language for creating formatted text using a plain-text editor.\n",
"\n",
"Here we cover how to load `Markdown` documents into LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects that we can use downstream.\n",
"Here we cover how to load `Markdown` documents into LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects that we can use downstream.\n",
"\n",
"We will cover:\n",
"\n",
"- Basic usage;\n",
"- Parsing of Markdown into elements such as titles, list items, and text.\n",
"\n",
"LangChain implements an [UnstructuredMarkdownLoader](https://python.langchain.com/v0.2/api_reference/community/document_loaders/langchain_community.document_loaders.markdown.UnstructuredMarkdownLoader.html) object which requires the [Unstructured](https://unstructured-io.github.io/unstructured/) package. First we install it:"
"LangChain implements an [UnstructuredMarkdownLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.markdown.UnstructuredMarkdownLoader.html) object which requires the [Unstructured](https://unstructured-io.github.io/unstructured/) package. First we install it:"
]
},
{

View File

@@ -3,7 +3,7 @@
The [Microsoft Office](https://www.office.com/) suite of productivity software includes Microsoft Word, Microsoft Excel, Microsoft PowerPoint, Microsoft Outlook, and Microsoft OneNote. It is available for Microsoft Windows and macOS operating systems. It is also available on Android and iOS.
This covers how to load commonly used file formats including `DOCX`, `XLSX` and `PPTX` documents into a LangChain
[Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document)
[Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document)
object that we can use downstream.

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,486 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7414502a-4532-4da3-aef0-71aac4d0d4dd",
"metadata": {},
"source": [
"# How to load web pages\n",
"\n",
"This guide covers how to load web pages into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream. Web pages contain text, images, and other multimedia elements, and are typically represented with HTML. They may include links to other pages or resources.\n",
"\n",
"LangChain integrates with a host of parsers that are appropriate for web pages. The right parser will depend on your needs. Below we demonstrate two possibilities:\n",
"\n",
"- [Simple and fast](/docs/how_to/document_loader_web#simple-and-fast-text-extraction) parsing, in which we recover one `Document` per web page with its content represented as a \"flattened\" string;\n",
"- [Advanced](/docs/how_to/document_loader_web#advanced-parsing) parsing, in which we recover multiple `Document` objects per page, allowing one to identify and traverse sections, links, tables, and other structures.\n",
"\n",
"## Setup\n",
"\n",
"For the \"simple and fast\" parsing, we will need `langchain-community` and the `beautifulsoup4` library:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "89bc7be9-ab50-4c5a-860a-deee7b469f67",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-community beautifulsoup4"
]
},
{
"cell_type": "markdown",
"id": "a07f5ca3-e2b7-4d9c-b1f2-7547856cbdf7",
"metadata": {},
"source": [
"For advanced parsing, we will use `langchain-unstructured`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8a3ef1fc-dfde-4814-b7f6-b6c0c649f044",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-unstructured"
]
},
{
"cell_type": "markdown",
"id": "4ef11005-1bd0-43a3-8d52-ea823c830c34",
"metadata": {},
"source": [
"## Simple and fast text extraction\n",
"\n",
"If you are looking for a simple string representation of text that is embedded in a web page, the method below is appropriate. It will return a list of `Document` objects -- one per page -- containing a single string of the page's text. Under the hood it uses the `beautifulsoup4` Python library.\n",
"\n",
"LangChain document loaders implement `lazy_load` and its async variant, `alazy_load`, which return iterators of `Document objects`. We will use these below."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7faeccbc-4e56-4b88-99db-2274ed0680c1",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"USER_AGENT environment variable not set, consider setting it to identify your requests.\n"
]
}
],
"source": [
"import bs4\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"\n",
"page_url = \"https://python.langchain.com/docs/how_to/chatbots_memory/\"\n",
"\n",
"loader = WebBaseLoader(web_paths=[page_url])\n",
"docs = []\n",
"async for doc in loader.alazy_load():\n",
" docs.append(doc)\n",
"\n",
"assert len(docs) == 1\n",
"doc = docs[0]"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "21199a0d-3bd2-4410-a060-763649b14691",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'source': 'https://python.langchain.com/docs/how_to/chatbots_memory/', 'title': 'How to add memory to chatbots | \\uf8ffü¶úÔ∏è\\uf8ffüîó LangChain', 'description': 'A key feature of chatbots is their ability to use content of previous conversation turns as context. This state management can take several forms, including:', 'language': 'en'}\n",
"\n",
"How to add memory to chatbots | 🦜️🔗 LangChain\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"Skip to main contentShare your thoughts on AI agents. Take the 3-min survey.IntegrationsAPI ReferenceMoreContributingPeopleLangSmithLangGraphLangChain HubLangChain JS/TSv0.3v0.3v0.2v0.1💬SearchIntroductionTutorialsBuild a Question Answering application over a Graph DatabaseTutorialsBuild a Simple LLM Application with LCELBuild a Query Analysis SystemBuild a ChatbotConversational RAGBuild an Extraction ChainBuild an AgentTaggingd\n"
]
}
],
"source": [
"print(f\"{doc.metadata}\\n\")\n",
"print(doc.page_content[:500].strip())"
]
},
{
"cell_type": "markdown",
"id": "23189e91-5237-4a9e-a4bb-cb79e130c364",
"metadata": {},
"source": [
"This is essentially a dump of the text from the page's HTML. It may contain extraneous information like headings and navigation bars. If you are familiar with the expected HTML, you can specify desired `<div>` classes and other parameters via BeautifulSoup. Below we parse only the body text of the article:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4211b1a6-e636-415b-a556-ae01969399a7",
"metadata": {},
"outputs": [],
"source": [
"loader = WebBaseLoader(\n",
" web_paths=[page_url],\n",
" bs_kwargs={\n",
" \"parse_only\": bs4.SoupStrainer(class_=\"theme-doc-markdown markdown\"),\n",
" },\n",
" bs_get_text_kwargs={\"separator\": \" | \", \"strip\": True},\n",
")\n",
"\n",
"docs = []\n",
"async for doc in loader.alazy_load():\n",
" docs.append(doc)\n",
"\n",
"assert len(docs) == 1\n",
"doc = docs[0]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "7edf6ed0-e22f-4c64-b986-8ba019c14757",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'source': 'https://python.langchain.com/docs/how_to/chatbots_memory/'}\n",
"\n",
"How to add memory to chatbots | A key feature of chatbots is their ability to use content of previous conversation turns as context. This state management can take several forms, including: | Simply stuffing previous messages into a chat model prompt. | The above, but trimming old messages to reduce the amount of distracting information the model has to deal with. | More complex modifications like synthesizing summaries for long running conversations. | We'll go into more detail on a few techniq\n"
]
}
],
"source": [
"print(f\"{doc.metadata}\\n\")\n",
"print(doc.page_content[:500])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6ab1ba2b-3b22-4c5d-8ad3-f6809d075d26",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a greeting. Nemo then asks the AI how it is doing, and the AI responds that it is fine.'), | HumanMessage(content='What did I say my name was?'), | AIMessage(content='You introduced yourself as Nemo. How can I assist you today, Nemo?')] | Note that invoking the chain again will generate another summary generated from the initial summary plus new messages and so on. You could also design a hybrid approach where a certain number of messages are retained in chat history while others are summarized.\n"
]
}
],
"source": [
"print(doc.page_content[-500:])"
]
},
{
"cell_type": "markdown",
"id": "0a411144-a234-4505-956c-930d399ffefb",
"metadata": {},
"source": [
"Note that this required advance technical knowledge of how the body text is represented in the underlying HTML.\n",
"\n",
"We can parameterize `WebBaseLoader` with a variety of settings, allowing for specification of request headers, rate limits, and parsers and other kwargs for BeautifulSoup. See its [API reference](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html) for detail."
]
},
{
"cell_type": "markdown",
"id": "cdfc8f68-2b08-4a6c-9705-c17e6deaf411",
"metadata": {},
"source": [
"## Advanced parsing\n",
"\n",
"This method is appropriate if we want more granular control or processing of the page content. Below, instead of generating one `Document` per page and controlling its content via BeautifulSoup, we generate multiple `Document` objects representing distinct structures on a page. These structures can include section titles and their corresponding body texts, lists or enumerations, tables, and more.\n",
"\n",
"Under the hood it uses the `langchain-unstructured` library. See the [integration docs](/docs/integrations/document_loaders/unstructured_file/) for more information about using [Unstructured](https://docs.unstructured.io/welcome) with LangChain."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "7a6bbfef-ebd5-4357-a7f5-9c989dda092d",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO: Note: NumExpr detected 12 cores but \"NUMEXPR_MAX_THREADS\" not set, so enforcing safe limit of 8.\n",
"INFO: NumExpr defaulting to 8 threads.\n"
]
}
],
"source": [
"from langchain_unstructured import UnstructuredLoader\n",
"\n",
"page_url = \"https://python.langchain.com/docs/how_to/chatbots_memory/\"\n",
"loader = UnstructuredLoader(web_url=page_url)\n",
"\n",
"docs = []\n",
"async for doc in loader.alazy_load():\n",
" docs.append(doc)"
]
},
{
"cell_type": "markdown",
"id": "53a600b0-fcd2-4074-80b6-a1dd2c0d9235",
"metadata": {},
"source": [
"Note that with no advance knowledge of the page HTML structure, we recover a natural organization of the body text:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "198b7469-587f-4a80-a49f-440e6157b241",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"How to add memory to chatbots\n",
"A key feature of chatbots is their ability to use content of previous conversation turns as context. This state management can take several forms, including:\n",
"Simply stuffing previous messages into a chat model prompt.\n",
"The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.\n",
"More complex modifications like synthesizing summaries for long running conversations.\n",
"ERROR! Session/line number was not unique in database. History logging moved to new session 2747\n"
]
}
],
"source": [
"for doc in docs[:5]:\n",
" print(doc.page_content)"
]
},
{
"cell_type": "markdown",
"id": "3f4254b5-5e3b-45c4-9cd0-7ed753687783",
"metadata": {},
"source": [
"### Extracting content from specific sections"
]
},
{
"cell_type": "markdown",
"id": "627d9b7a-31fe-4923-bc9a-4b0caae1d760",
"metadata": {},
"source": [
"Each `Document` object represents an element of the page. Its metadata contains useful information, such as its category:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "2fa19a61-a53d-42e0-a01a-7ea99fc40810",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Title: How to add memory to chatbots\n",
"NarrativeText: A key feature of chatbots is their ability to use content of previous conversation turns as context. This state management can take several forms, including:\n",
"ListItem: Simply stuffing previous messages into a chat model prompt.\n",
"ListItem: The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.\n",
"ListItem: More complex modifications like synthesizing summaries for long running conversations.\n"
]
}
],
"source": [
"for doc in docs[:5]:\n",
" print(f'{doc.metadata[\"category\"]}: {doc.page_content}')"
]
},
{
"cell_type": "markdown",
"id": "ca0f8025-58b8-4d2a-95aa-124d7a8ee812",
"metadata": {},
"source": [
"Elements may also have parent-child relationships -- for example, a paragraph might belong to a section with a title. If a section is of particular interest (e.g., for indexing) we can isolate the corresponding `Document` objects.\n",
"\n",
"As an example, below we load the content of the \"Setup\" sections for two web pages:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "793018fd-1365-4a8a-8690-6d51dad2e1cf",
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from langchain_core.documents import Document\n",
"\n",
"\n",
"async def _get_setup_docs_from_url(url: str) -> List[Document]:\n",
" loader = UnstructuredLoader(web_url=url)\n",
"\n",
" setup_docs = []\n",
" parent_id = -1\n",
" async for doc in loader.alazy_load():\n",
" if doc.metadata[\"category\"] == \"Title\" and doc.page_content.startswith(\"Setup\"):\n",
" parent_id = doc.metadata[\"element_id\"]\n",
" if doc.metadata.get(\"parent_id\") == parent_id:\n",
" setup_docs.append(doc)\n",
"\n",
" return setup_docs\n",
"\n",
"\n",
"page_urls = [\n",
" \"https://python.langchain.com/docs/how_to/chatbots_memory/\",\n",
" \"https://python.langchain.com/docs/how_to/chatbots_tools/\",\n",
"]\n",
"setup_docs = []\n",
"for url in page_urls:\n",
" page_setup_docs = await _get_setup_docs_from_url(url)\n",
" setup_docs.extend(page_setup_docs)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "a67e0745-abfc-4baa-94b3-2e8815bfa52a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'https://python.langchain.com/docs/how_to/chatbots_memory/': \"You'll need to install a few packages, and have your OpenAI API key set as an environment variable named OPENAI_API_KEY:\\n%pip install --upgrade --quiet langchain langchain-openai\\n\\n# Set env var OPENAI_API_KEY or load from a .env file:\\nimport dotenv\\n\\ndotenv.load_dotenv()\\n[33mWARNING: You are using pip version 22.0.4; however, version 23.3.2 is available.\\nYou should consider upgrading via the '/Users/jacoblee/.pyenv/versions/3.10.5/bin/python -m pip install --upgrade pip' command.[0m[33m\\n[0mNote: you may need to restart the kernel to use updated packages.\\n\",\n",
" 'https://python.langchain.com/docs/how_to/chatbots_tools/': \"For this guide, we'll be using a tool calling agent with a single tool for searching the web. The default will be powered by Tavily, but you can switch it out for any similar tool. The rest of this section will assume you're using Tavily.\\nYou'll need to sign up for an account on the Tavily website, and install the following packages:\\n%pip install --upgrade --quiet langchain-community langchain-openai tavily-python\\n\\n# Set env var OPENAI_API_KEY or load from a .env file:\\nimport dotenv\\n\\ndotenv.load_dotenv()\\nYou will also need your OpenAI key set as OPENAI_API_KEY and your Tavily API key set as TAVILY_API_KEY.\\n\"}"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from collections import defaultdict\n",
"\n",
"setup_text = defaultdict(str)\n",
"\n",
"for doc in setup_docs:\n",
" url = doc.metadata[\"url\"]\n",
" setup_text[url] += f\"{doc.page_content}\\n\"\n",
"\n",
"dict(setup_text)"
]
},
{
"cell_type": "markdown",
"id": "5cd42892-24a6-4969-92c8-c928680be9b5",
"metadata": {},
"source": [
"### Vector search over page content\n",
"\n",
"Once we have loaded the page contents into LangChain `Document` objects, we can index them (e.g., for a RAG application) in the usual way. Below we use OpenAI [embeddings](/docs/concepts/#embedding-models), although any LangChain embeddings model will suffice."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5a6cbb01-6e0d-418f-9f76-2031622bebb0",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-openai"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "598e612c-180d-494d-8caa-761c89f84eae",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "5eeaeb54-ea03-4634-8a79-b60c22ab2b66",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO: HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
"INFO: HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Page https://python.langchain.com/docs/how_to/chatbots_tools/: You'll need to sign up for an account on the Tavily website, and install the following packages:\n",
"\n",
"Page https://python.langchain.com/docs/how_to/chatbots_tools/: For this guide, we'll be using a tool calling agent with a single tool for searching the web. The default will be powered by Tavily, but you can switch it out for any similar tool. The rest of this section will assume you're using Tavily.\n",
"\n"
]
}
],
"source": [
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"vector_store = InMemoryVectorStore.from_documents(setup_docs, OpenAIEmbeddings())\n",
"retrieved_docs = vector_store.similarity_search(\"Install Tavily\", k=2)\n",
"for doc in retrieved_docs:\n",
" print(f'Page {doc.metadata[\"url\"]}: {doc.page_content[:300]}\\n')"
]
},
{
"cell_type": "markdown",
"id": "67be9c94-dbde-4fdd-87d0-e83ed6066d2b",
"metadata": {},
"source": [
"## Other web page loaders\n",
"\n",
"For a list of available LangChain web page loaders, please see [this table](/docs/integrations/document_loaders/#webpages)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -17,13 +17,11 @@
"\n",
"Sometimes we want to construct parts of a chain at runtime, depending on the chain inputs ([routing](/docs/how_to/routing/) is the most common example of this). We can create dynamic chains like this using a very useful property of RunnableLambda's, which is that if a RunnableLambda returns a Runnable, that Runnable is itself invoked. Let's see an example.\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
"/>\n",
"```"
"/>\n"
]
},
{

View File

@@ -115,13 +115,10 @@ embeddings = embeddings_model.embed_documents(
len(embeddings), len(embeddings[0])
```
<CodeOutputBlock language="python">
```
```output
(5, 1536)
```
</CodeOutputBlock>
### `embed_query`
#### Embed single query
@@ -132,9 +129,7 @@ embedded_query = embeddings_model.embed_query("What was the name mentioned in th
embedded_query[:5]
```
<CodeOutputBlock language="python">
```
```output
[0.0053587136790156364,
-0.0004999046213924885,
0.038883671164512634,
@@ -142,4 +137,3 @@ embedded_query[:5]
-0.00900818221271038]
```
</CodeOutputBlock>

View File

@@ -6,7 +6,7 @@
"source": [
"# How to combine results from multiple retrievers\n",
"\n",
"The [EnsembleRetriever](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://python.langchain.com/v0.2/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n",
"The [EnsembleRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n",
"\n",
"By leveraging the strengths of different algorithms, the `EnsembleRetriever` can achieve better performance than any single algorithm. \n",
"\n",
@@ -14,7 +14,7 @@
"\n",
"## Basic usage\n",
"\n",
"Below we demonstrate ensembling of a [BM25Retriever](https://python.langchain.com/v0.2/api_reference/community/retrievers/langchain_community.retrievers.bm25.BM25Retriever.html) with a retriever derived from the [FAISS vector store](https://python.langchain.com/v0.2/api_reference/community/vectorstores/langchain_community.vectorstores.faiss.FAISS.html)."
"Below we demonstrate ensembling of a [BM25Retriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.bm25.BM25Retriever.html) with a retriever derived from the [FAISS vector store](https://python.langchain.com/api_reference/community/vectorstores/langchain_community.vectorstores.faiss.FAISS.html)."
]
},
{

View File

@@ -11,16 +11,16 @@
"\n",
"Data extraction attempts to generate structured representations of information found in text and other unstructured or semi-structured formats. [Tool-calling](/docs/concepts#functiontool-calling) LLM features are often used in this context. This guide demonstrates how to build few-shot examples of tool calls to help steer the behavior of extraction and similar applications.\n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"While this guide focuses how to use examples with a tool calling model, this technique is generally applicable, and will work\n",
"also with JSON more or prompt based techniques.\n",
":::\n",
"\n",
"LangChain implements a [tool-call attribute](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.tool_calls) on messages from LLMs that include tool calls. See our [how-to guide on tool calling](/docs/how_to/tool_calling) for more detail. To build reference examples for data extraction, we build a chat history containing a sequence of: \n",
"LangChain implements a [tool-call attribute](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.tool_calls) on messages from LLMs that include tool calls. See our [how-to guide on tool calling](/docs/how_to/tool_calling) for more detail. To build reference examples for data extraction, we build a chat history containing a sequence of: \n",
"\n",
"- [HumanMessage](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) containing example inputs;\n",
"- [AIMessage](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html) containing example tool calls;\n",
"- [ToolMessage](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) containing example tool outputs.\n",
"- [HumanMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.human.HumanMessage.html) containing example inputs;\n",
"- [AIMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html) containing example tool calls;\n",
"- [ToolMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) containing example tool outputs.\n",
"\n",
"LangChain adopts this convention for structuring tool calls into conversation across LLM model providers.\n",
"\n",
@@ -29,9 +29,16 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "89579144-bcb3-490a-8036-86a0a6bcd56b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:41.780410Z",
"iopub.status.busy": "2024-09-10T20:26:41.780102Z",
"iopub.status.idle": "2024-09-10T20:26:42.147112Z",
"shell.execute_reply": "2024-09-10T20:26:42.146838Z"
}
},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
@@ -67,17 +74,24 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "610c3025-ea63-4cd7-88bd-c8cbcb4d8a3f",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.148746Z",
"iopub.status.busy": "2024-09-10T20:26:42.148621Z",
"iopub.status.idle": "2024-09-10T20:26:42.162044Z",
"shell.execute_reply": "2024-09-10T20:26:42.161794Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ChatPromptValue(messages=[SystemMessage(content=\"You are an expert extraction algorithm. Only extract relevant information from the text. If you do not know the value of an attribute asked to extract, return null for the attribute's value.\"), HumanMessage(content='testing 1 2 3'), HumanMessage(content='this is some text')])"
"ChatPromptValue(messages=[SystemMessage(content=\"You are an expert extraction algorithm. Only extract relevant information from the text. If you do not know the value of an attribute asked to extract, return null for the attribute's value.\", additional_kwargs={}, response_metadata={}), HumanMessage(content='testing 1 2 3', additional_kwargs={}, response_metadata={}), HumanMessage(content='this is some text', additional_kwargs={}, response_metadata={})])"
]
},
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
@@ -104,15 +118,22 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "d875a49a-d2cb-4b9e-b5bf-41073bc3905c",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.163477Z",
"iopub.status.busy": "2024-09-10T20:26:42.163391Z",
"iopub.status.idle": "2024-09-10T20:26:42.324449Z",
"shell.execute_reply": "2024-09-10T20:26:42.324206Z"
}
},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class Person(BaseModel):\n",
@@ -151,7 +172,7 @@
"\n",
"Each example contains an example `input` text and an example `output` showing what should be extracted from the text.\n",
"\n",
":::{.callout-important}\n",
":::important\n",
"This is a bit in the weeds, so feel free to skip.\n",
"\n",
"The format of the example needs to match the API used (e.g., tool calling or JSON mode etc.).\n",
@@ -162,9 +183,16 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "08356810-77ce-4e68-99d9-faa0326f2cee",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.326100Z",
"iopub.status.busy": "2024-09-10T20:26:42.326016Z",
"iopub.status.idle": "2024-09-10T20:26:42.329260Z",
"shell.execute_reply": "2024-09-10T20:26:42.329014Z"
}
},
"outputs": [],
"source": [
"import uuid\n",
@@ -177,7 +205,7 @@
" SystemMessage,\n",
" ToolMessage,\n",
")\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class Example(TypedDict):\n",
@@ -238,9 +266,16 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "7f59a745-5c81-4011-a4c5-a33ec1eca7ef",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.330580Z",
"iopub.status.busy": "2024-09-10T20:26:42.330488Z",
"iopub.status.idle": "2024-09-10T20:26:42.332813Z",
"shell.execute_reply": "2024-09-10T20:26:42.332598Z"
}
},
"outputs": [],
"source": [
"examples = [\n",
@@ -273,22 +308,29 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "976bb7b8-09c4-4a3e-80df-49a483705c08",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.333955Z",
"iopub.status.busy": "2024-09-10T20:26:42.333876Z",
"iopub.status.idle": "2024-09-10T20:26:42.336841Z",
"shell.execute_reply": "2024-09-10T20:26:42.336635Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"system: content=\"You are an expert extraction algorithm. Only extract relevant information from the text. If you do not know the value of an attribute asked to extract, return null for the attribute's value.\"\n",
"human: content=\"The ocean is vast and blue. It's more than 20,000 feet deep. There are many fish in it.\"\n",
"ai: content='' tool_calls=[{'name': 'Person', 'args': {'name': None, 'hair_color': None, 'height_in_meters': None}, 'id': 'b843ba77-4c9c-48ef-92a4-54e534f24521'}]\n",
"tool: content='You have correctly called this tool.' tool_call_id='b843ba77-4c9c-48ef-92a4-54e534f24521'\n",
"human: content='Fiona traveled far from France to Spain.'\n",
"ai: content='' tool_calls=[{'name': 'Person', 'args': {'name': 'Fiona', 'hair_color': None, 'height_in_meters': None}, 'id': '46f00d6b-50e5-4482-9406-b07bb10340f6'}]\n",
"tool: content='You have correctly called this tool.' tool_call_id='46f00d6b-50e5-4482-9406-b07bb10340f6'\n",
"human: content='this is some text'\n"
"system: content=\"You are an expert extraction algorithm. Only extract relevant information from the text. If you do not know the value of an attribute asked to extract, return null for the attribute's value.\" additional_kwargs={} response_metadata={}\n",
"human: content=\"The ocean is vast and blue. It's more than 20,000 feet deep. There are many fish in it.\" additional_kwargs={} response_metadata={}\n",
"ai: content='' additional_kwargs={} response_metadata={} tool_calls=[{'name': 'Data', 'args': {'people': []}, 'id': '240159b1-1405-4107-a07c-3c6b91b3d5b7', 'type': 'tool_call'}]\n",
"tool: content='You have correctly called this tool.' tool_call_id='240159b1-1405-4107-a07c-3c6b91b3d5b7'\n",
"human: content='Fiona traveled far from France to Spain.' additional_kwargs={} response_metadata={}\n",
"ai: content='' additional_kwargs={} response_metadata={} tool_calls=[{'name': 'Data', 'args': {'people': [{'name': 'Fiona', 'hair_color': None, 'height_in_meters': None}]}, 'id': '3fc521e4-d1d2-4c20-bf40-e3d72f1068da', 'type': 'tool_call'}]\n",
"tool: content='You have correctly called this tool.' tool_call_id='3fc521e4-d1d2-4c20-bf40-e3d72f1068da'\n",
"human: content='this is some text' additional_kwargs={} response_metadata={}\n"
]
}
],
@@ -308,21 +350,26 @@
"\n",
"Let's select an LLM. Because we are using tool-calling, we will need a model that supports a tool-calling feature. See [this table](/docs/integrations/chat) for available LLMs.\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" openaiParams={`model=\"gpt-4-0125-preview\", temperature=0`}\n",
"/>\n",
"```"
"/>\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"id": "df2e1ee1-69e8-4c4d-b349-95f2e320317b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.338001Z",
"iopub.status.busy": "2024-09-10T20:26:42.337915Z",
"iopub.status.idle": "2024-09-10T20:26:42.349121Z",
"shell.execute_reply": "2024-09-10T20:26:42.348908Z"
}
},
"outputs": [],
"source": [
"# | output: false\n",
@@ -343,9 +390,16 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "dbfea43d-769b-42e9-a76f-ce722f7d6f93",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.350335Z",
"iopub.status.busy": "2024-09-10T20:26:42.350264Z",
"iopub.status.idle": "2024-09-10T20:26:42.424894Z",
"shell.execute_reply": "2024-09-10T20:26:42.424623Z"
}
},
"outputs": [],
"source": [
"runnable = prompt | llm.with_structured_output(\n",
@@ -367,18 +421,49 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"id": "66545cab-af2a-40a4-9dc9-b4110458b7d3",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:42.426258Z",
"iopub.status.busy": "2024-09-10T20:26:42.426187Z",
"iopub.status.idle": "2024-09-10T20:26:46.151633Z",
"shell.execute_reply": "2024-09-10T20:26:46.150690Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[Person(name='earth', hair_color='null', height_in_meters='null')]\n",
"people=[Person(name='earth', hair_color='null', height_in_meters='null')]\n",
"people=[]\n",
"people=[Person(name='earth', hair_color='null', height_in_meters='null')]\n",
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
}
@@ -401,18 +486,49 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"id": "1c09d805-ec16-4123-aef9-6a5b59499b5c",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:46.155346Z",
"iopub.status.busy": "2024-09-10T20:26:46.155110Z",
"iopub.status.idle": "2024-09-10T20:26:51.810359Z",
"shell.execute_reply": "2024-09-10T20:26:51.809636Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n",
"people=[]\n",
"people=[]\n",
"people=[]\n",
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"people=[]\n"
]
}
@@ -435,9 +551,16 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 11,
"id": "a9b7a762-1b75-4f9f-b9d9-6732dd05802c",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:26:51.813309Z",
"iopub.status.busy": "2024-09-10T20:26:51.813150Z",
"iopub.status.idle": "2024-09-10T20:26:53.474153Z",
"shell.execute_reply": "2024-09-10T20:26:53.473522Z"
}
},
"outputs": [
{
"data": {
@@ -445,7 +568,7 @@
"Data(people=[Person(name='Harrison', hair_color='black', height_in_meters=None)])"
]
},
"execution_count": 12,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -476,7 +599,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -23,16 +23,56 @@
"id": "57969139-ad0a-487e-97d8-cb30e2af9742",
"metadata": {},
"source": [
"## Set up\n",
"## Setup\n",
"\n",
"We need some example data! Let's download an article about [cars from wikipedia](https://en.wikipedia.org/wiki/Car) and load it as a LangChain [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html)."
"First we'll install the dependencies needed for this guide:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "84460db2-36e1-4037-bfa6-2a11883c2ba5",
"id": "a3b4d838-5be4-4207-8a4a-9ef5624c48f2",
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:19.850767Z",
"iopub.status.busy": "2024-09-10T20:35:19.850427Z",
"iopub.status.idle": "2024-09-10T20:35:21.432233Z",
"shell.execute_reply": "2024-09-10T20:35:21.431606Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install -qU langchain-community lxml faiss-cpu langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "ac000b03-33fc-414f-8f2c-3850df621a35",
"metadata": {},
"source": [
"Now we need some example data! Let's download an article about [cars from wikipedia](https://en.wikipedia.org/wiki/Car) and load it as a LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "84460db2-36e1-4037-bfa6-2a11883c2ba5",
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:21.434882Z",
"iopub.status.busy": "2024-09-10T20:35:21.434571Z",
"iopub.status.idle": "2024-09-10T20:35:22.214545Z",
"shell.execute_reply": "2024-09-10T20:35:22.214253Z"
}
},
"outputs": [],
"source": [
"import re\n",
@@ -55,15 +95,22 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "fcb6917b-123d-4630-a0ce-ed8b293d482d",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.216143Z",
"iopub.status.busy": "2024-09-10T20:35:22.216039Z",
"iopub.status.idle": "2024-09-10T20:35:22.218117Z",
"shell.execute_reply": "2024-09-10T20:35:22.217854Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"79174\n"
"80427\n"
]
}
],
@@ -87,13 +134,20 @@
"cell_type": "code",
"execution_count": 4,
"id": "a3b288ed-87a6-4af0-aac8-20921dc370d4",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.219468Z",
"iopub.status.busy": "2024-09-10T20:35:22.219395Z",
"iopub.status.idle": "2024-09-10T20:35:22.340594Z",
"shell.execute_reply": "2024-09-10T20:35:22.340319Z"
}
},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class KeyDevelopment(BaseModel):\n",
@@ -142,21 +196,26 @@
"\n",
"Let's select an LLM. Because we are using tool-calling, we will need a model that supports a tool-calling feature. See [this table](/docs/integrations/chat) for available LLMs.\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" openaiParams={`model=\"gpt-4-0125-preview\", temperature=0`}\n",
"/>\n",
"```"
"/>\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "109f4f05-d0ff-431d-93d9-8f5aa34979a6",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.342277Z",
"iopub.status.busy": "2024-09-10T20:35:22.342171Z",
"iopub.status.idle": "2024-09-10T20:35:22.532302Z",
"shell.execute_reply": "2024-09-10T20:35:22.532034Z"
}
},
"outputs": [],
"source": [
"# | output: false\n",
@@ -171,7 +230,14 @@
"cell_type": "code",
"execution_count": 6,
"id": "aa4ae224-6d3d-4fe2-b210-7db19a9fe580",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.533795Z",
"iopub.status.busy": "2024-09-10T20:35:22.533708Z",
"iopub.status.idle": "2024-09-10T20:35:22.610573Z",
"shell.execute_reply": "2024-09-10T20:35:22.610307Z"
}
},
"outputs": [],
"source": [
"extractor = prompt | llm.with_structured_output(\n",
@@ -194,7 +260,14 @@
"cell_type": "code",
"execution_count": 7,
"id": "27b8a373-14b3-45ea-8bf5-9749122ad927",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.612123Z",
"iopub.status.busy": "2024-09-10T20:35:22.612052Z",
"iopub.status.idle": "2024-09-10T20:35:22.753493Z",
"shell.execute_reply": "2024-09-10T20:35:22.753179Z"
}
},
"outputs": [],
"source": [
"from langchain_text_splitters import TokenTextSplitter\n",
@@ -214,9 +287,9 @@
"id": "5b43d7e0-3c85-4d97-86c7-e8c984b60b0a",
"metadata": {},
"source": [
"Use [batch](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) functionality to run the extraction in **parallel** across each chunk! \n",
"Use [batch](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) functionality to run the extraction in **parallel** across each chunk! \n",
"\n",
":::{.callout-tip}\n",
":::tip\n",
"You can often use .batch() to parallelize the extractions! `.batch` uses a threadpool under the hood to help you parallelize workloads.\n",
"\n",
"If your model is exposed via an API, this will likely speed up your extraction flow!\n",
@@ -227,7 +300,14 @@
"cell_type": "code",
"execution_count": 8,
"id": "6ba766b5-8d6c-48e6-8d69-f391a66b65d2",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:22.755067Z",
"iopub.status.busy": "2024-09-10T20:35:22.754987Z",
"iopub.status.idle": "2024-09-10T20:35:36.691130Z",
"shell.execute_reply": "2024-09-10T20:35:36.690500Z"
}
},
"outputs": [],
"source": [
"# Limit just to the first 3 chunks\n",
@@ -254,21 +334,27 @@
"cell_type": "code",
"execution_count": 9,
"id": "c3f77470-ce6c-477f-8957-650913218632",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:36.694799Z",
"iopub.status.busy": "2024-09-10T20:35:36.694458Z",
"iopub.status.idle": "2024-09-10T20:35:36.701416Z",
"shell.execute_reply": "2024-09-10T20:35:36.700993Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[KeyDevelopment(year=1966, description='The Toyota Corolla began production, becoming the best-selling series of automobile in history.', evidence='The Toyota Corolla, which has been in production since 1966, is the best-selling series of automobile in history.'),\n",
" KeyDevelopment(year=1769, description='Nicolas-Joseph Cugnot built the first steam-powered road vehicle.', evidence='The French inventor Nicolas-Joseph Cugnot built the first steam-powered road vehicle in 1769.'),\n",
" KeyDevelopment(year=1808, description='François Isaac de Rivaz designed and constructed the first internal combustion-powered automobile.', evidence='the Swiss inventor François Isaac de Rivaz designed and constructed the first internal combustion-powered automobile in 1808.'),\n",
" KeyDevelopment(year=1886, description='Carl Benz patented his Benz Patent-Motorwagen, inventing the modern car.', evidence='The modern car—a practical, marketable automobile for everyday use—was invented in 1886, when the German inventor Carl Benz patented his Benz Patent-Motorwagen.'),\n",
" KeyDevelopment(year=1908, description='Ford Model T, one of the first cars affordable by the masses, began production.', evidence='One of the first cars affordable by the masses was the Ford Model T, begun in 1908, an American car manufactured by the Ford Motor Company.'),\n",
" KeyDevelopment(year=1888, description=\"Bertha Benz undertook the first road trip by car to prove the road-worthiness of her husband's invention.\", evidence=\"In August 1888, Bertha Benz, the wife of Carl Benz, undertook the first road trip by car, to prove the road-worthiness of her husband's invention.\"),\n",
"[KeyDevelopment(year=1769, description='Nicolas-Joseph Cugnot built the first full-scale, self-propelled mechanical vehicle, a steam-powered tricycle.', evidence='Nicolas-Joseph Cugnot is widely credited with building the first full-scale, self-propelled mechanical vehicle in about 1769; he created a steam-powered tricycle.'),\n",
" KeyDevelopment(year=1807, description=\"Nicéphore Niépce and his brother Claude created what was probably the world's first internal combustion engine.\", evidence=\"In 1807, Nicéphore Niépce and his brother Claude created what was probably the world's first internal combustion engine (which they called a Pyréolophore), but installed it in a boat on the river Saone in France.\"),\n",
" KeyDevelopment(year=1886, description='Carl Benz patented the Benz Patent-Motorwagen, marking the birth of the modern car.', evidence='In November 1881, French inventor Gustave Trouvé demonstrated a three-wheeled car powered by electricity at the International Exposition of Electricity. Although several other German engineers (including Gottlieb Daimler, Wilhelm Maybach, and Siegfried Marcus) were working on cars at about the same time, the year 1886 is regarded as the birth year of the modern car—a practical, marketable automobile for everyday use—when the German Carl Benz patented his Benz Patent-Motorwagen; he is generally acknowledged as the inventor of the car.'),\n",
" KeyDevelopment(year=1886, description='Carl Benz began promotion of his vehicle, marking the introduction of the first commercially available automobile.', evidence='Benz began promotion of the vehicle on 3 July 1886.'),\n",
" KeyDevelopment(year=1888, description=\"Bertha Benz undertook the first road trip by car to prove the road-worthiness of her husband's invention.\", evidence=\"In August 1888, Bertha Benz, the wife and business partner of Carl Benz, undertook the first road trip by car, to prove the road-worthiness of her husband's invention.\"),\n",
" KeyDevelopment(year=1896, description='Benz designed and patented the first internal-combustion flat engine, called boxermotor.', evidence='In 1896, Benz designed and patented the first internal-combustion flat engine, called boxermotor.'),\n",
" KeyDevelopment(year=1897, description='Nesselsdorfer Wagenbau produced the Präsident automobil, one of the first factory-made cars in the world.', evidence='The first motor car in central Europe and one of the first factory-made cars in the world, was produced by Czech company Nesselsdorfer Wagenbau (later renamed to Tatra) in 1897, the Präsident automobil.'),\n",
" KeyDevelopment(year=1890, description='Daimler Motoren Gesellschaft (DMG) was founded by Daimler and Maybach in Cannstatt.', evidence='Daimler and Maybach founded Daimler Motoren Gesellschaft (DMG) in Cannstatt in 1890.'),\n",
" KeyDevelopment(year=1891, description='Auguste Doriot and Louis Rigoulot completed the longest trip by a petrol-driven vehicle with a Daimler powered Peugeot Type 3.', evidence='In 1891, Auguste Doriot and his Peugeot colleague Louis Rigoulot completed the longest trip by a petrol-driven vehicle when their self-designed and built Daimler powered Peugeot Type 3 completed 2,100 kilometres (1,300 mi) from Valentigney to Paris and Brest and back again.')]"
" KeyDevelopment(year=1897, description='The first motor car in central Europe and one of the first factory-made cars in the world, the Präsident automobil, was produced by Nesselsdorfer Wagenbau.', evidence='The first motor car in central Europe and one of the first factory-made cars in the world, was produced by Czech company Nesselsdorfer Wagenbau (later renamed to Tatra) in 1897, the Präsident automobil.'),\n",
" KeyDevelopment(year=1901, description='Ransom Olds started large-scale, production-line manufacturing of affordable cars at his Oldsmobile factory in Lansing, Michigan.', evidence='Large-scale, production-line manufacturing of affordable cars was started by Ransom Olds in 1901 at his Oldsmobile factory in Lansing, Michigan.'),\n",
" KeyDevelopment(year=1913, description=\"Henry Ford introduced the world's first moving assembly line for cars at the Highland Park Ford Plant.\", evidence=\"This concept was greatly expanded by Henry Ford, beginning in 1913 with the world's first moving assembly line for cars at the Highland Park Ford Plant.\")]"
]
},
"execution_count": 9,
@@ -294,7 +380,7 @@
"\n",
"Another simple idea is to chunk up the text, but instead of extracting information from every chunk, just focus on the the most relevant chunks.\n",
"\n",
":::{.callout-caution}\n",
":::caution\n",
"It can be difficult to identify which chunks are relevant.\n",
"\n",
"For example, in the `car` article we're using here, most of the article contains key development information. So by using\n",
@@ -315,7 +401,14 @@
"cell_type": "code",
"execution_count": 10,
"id": "aaf37c82-625b-4fa1-8e88-73303f08ac16",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:36.703897Z",
"iopub.status.busy": "2024-09-10T20:35:36.703718Z",
"iopub.status.idle": "2024-09-10T20:35:38.451523Z",
"shell.execute_reply": "2024-09-10T20:35:38.450925Z"
}
},
"outputs": [],
"source": [
"from langchain_community.vectorstores import FAISS\n",
@@ -344,7 +437,14 @@
"cell_type": "code",
"execution_count": 11,
"id": "47aad00b-7013-4f7f-a1b0-02ef269093bf",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:38.455094Z",
"iopub.status.busy": "2024-09-10T20:35:38.454851Z",
"iopub.status.idle": "2024-09-10T20:35:38.458315Z",
"shell.execute_reply": "2024-09-10T20:35:38.457940Z"
}
},
"outputs": [],
"source": [
"rag_extractor = {\n",
@@ -356,7 +456,14 @@
"cell_type": "code",
"execution_count": 12,
"id": "68f2de01-0cd8-456e-a959-db236189d41b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:38.460115Z",
"iopub.status.busy": "2024-09-10T20:35:38.459949Z",
"iopub.status.idle": "2024-09-10T20:35:43.195532Z",
"shell.execute_reply": "2024-09-10T20:35:43.194254Z"
}
},
"outputs": [],
"source": [
"results = rag_extractor.invoke(\"Key developments associated with cars\")"
@@ -366,15 +473,21 @@
"cell_type": "code",
"execution_count": 13,
"id": "1788e2d6-77bb-417f-827c-eb96c035164e",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:43.200497Z",
"iopub.status.busy": "2024-09-10T20:35:43.200037Z",
"iopub.status.idle": "2024-09-10T20:35:43.206773Z",
"shell.execute_reply": "2024-09-10T20:35:43.205426Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"year=1869 description='Mary Ward became one of the first documented car fatalities in Parsonstown, Ireland.' evidence='Mary Ward became one of the first documented car fatalities in 1869 in Parsonstown, Ireland,'\n",
"year=1899 description=\"Henry Bliss one of the US's first pedestrian car casualties in New York City.\" evidence=\"Henry Bliss one of the US's first pedestrian car casualties in 1899 in New York City.\"\n",
"year=2030 description='All fossil fuel vehicles will be banned in Amsterdam.' evidence='all fossil fuel vehicles will be banned in Amsterdam from 2030.'\n"
"year=2006 description='Car-sharing services in the US experienced double-digit growth in revenue and membership.' evidence='in the US, some car-sharing services have experienced double-digit growth in revenue and membership growth between 2006 and 2007.'\n",
"year=2020 description='56 million cars were manufactured worldwide, with China producing the most.' evidence='In 2020, there were 56 million cars manufactured worldwide, down from 67 million the previous year. The automotive industry in China produces by far the most (20 million in 2020).'\n"
]
}
],
@@ -416,7 +529,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -18,18 +18,23 @@
"\n",
"First we select a LLM:\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"model\" />\n",
"```"
"<ChatModelTabs customVarName=\"model\" />\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "25487939-8713-4ec7-b774-e4a761ac8298",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:44.442501Z",
"iopub.status.busy": "2024-09-10T20:35:44.442044Z",
"iopub.status.idle": "2024-09-10T20:35:44.872217Z",
"shell.execute_reply": "2024-09-10T20:35:44.871897Z"
}
},
"outputs": [],
"source": [
"# | output: false\n",
@@ -45,7 +50,7 @@
"id": "3e412374-3beb-4bbf-966b-400c1f66a258",
"metadata": {},
"source": [
":::{.callout-tip}\n",
":::tip\n",
"This tutorial is meant to be simple, but generally should really include reference examples to squeeze out performance!\n",
":::"
]
@@ -62,16 +67,23 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "497eb023-c043-443d-ac62-2d4ea85fe1b0",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:44.873979Z",
"iopub.status.busy": "2024-09-10T20:35:44.873840Z",
"iopub.status.idle": "2024-09-10T20:35:44.878966Z",
"shell.execute_reply": "2024-09-10T20:35:44.878718Z"
}
},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from langchain_core.output_parsers import PydanticOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field, validator\n",
"from pydantic import BaseModel, Field, validator\n",
"\n",
"\n",
"class Person(BaseModel):\n",
@@ -114,9 +126,16 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "20b99ffb-a114-49a9-a7be-154c525f8ada",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:44.880355Z",
"iopub.status.busy": "2024-09-10T20:35:44.880277Z",
"iopub.status.idle": "2024-09-10T20:35:44.881834Z",
"shell.execute_reply": "2024-09-10T20:35:44.881601Z"
}
},
"outputs": [],
"source": [
"query = \"Anna is 23 years old and she is 6 feet tall\""
@@ -124,9 +143,16 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "4f3a66ce-de19-4571-9e54-67504ae3fba7",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:44.883138Z",
"iopub.status.busy": "2024-09-10T20:35:44.883049Z",
"iopub.status.idle": "2024-09-10T20:35:44.885139Z",
"shell.execute_reply": "2024-09-10T20:35:44.884801Z"
}
},
"outputs": [
{
"name": "stdout",
@@ -140,7 +166,7 @@
"\n",
"Here is the output schema:\n",
"```\n",
"{\"description\": \"Identifying information about all people in a text.\", \"properties\": {\"people\": {\"title\": \"People\", \"type\": \"array\", \"items\": {\"$ref\": \"#/definitions/Person\"}}}, \"required\": [\"people\"], \"definitions\": {\"Person\": {\"title\": \"Person\", \"description\": \"Information about a person.\", \"type\": \"object\", \"properties\": {\"name\": {\"title\": \"Name\", \"description\": \"The name of the person\", \"type\": \"string\"}, \"height_in_meters\": {\"title\": \"Height In Meters\", \"description\": \"The height of the person expressed in meters.\", \"type\": \"number\"}}, \"required\": [\"name\", \"height_in_meters\"]}}}\n",
"{\"$defs\": {\"Person\": {\"description\": \"Information about a person.\", \"properties\": {\"name\": {\"description\": \"The name of the person\", \"title\": \"Name\", \"type\": \"string\"}, \"height_in_meters\": {\"description\": \"The height of the person expressed in meters.\", \"title\": \"Height In Meters\", \"type\": \"number\"}}, \"required\": [\"name\", \"height_in_meters\"], \"title\": \"Person\", \"type\": \"object\"}}, \"description\": \"Identifying information about all people in a text.\", \"properties\": {\"people\": {\"items\": {\"$ref\": \"#/$defs/Person\"}, \"title\": \"People\", \"type\": \"array\"}}, \"required\": [\"people\"]}\n",
"```\n",
"Human: Anna is 23 years old and she is 6 feet tall\n"
]
@@ -160,9 +186,16 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "7e0041eb-37dc-4384-9fe3-6dd8c356371e",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:44.886765Z",
"iopub.status.busy": "2024-09-10T20:35:44.886675Z",
"iopub.status.idle": "2024-09-10T20:35:46.835960Z",
"shell.execute_reply": "2024-09-10T20:35:46.835282Z"
}
},
"outputs": [
{
"data": {
@@ -170,7 +203,7 @@
"People(people=[Person(name='Anna', height_in_meters=1.83)])"
]
},
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -202,16 +235,23 @@
"\n",
"If desired, it's easy to create a custom prompt and parser with `LangChain` and `LCEL`.\n",
"\n",
"To create a custom parser, define a function to parse the output from the model (typically an [AIMessage](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html)) into an object of your choice.\n",
"To create a custom parser, define a function to parse the output from the model (typically an [AIMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html)) into an object of your choice.\n",
"\n",
"See below for a simple implementation of a JSON parser."
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "b1f11912-c1bb-4a2a-a482-79bf3996961f",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:46.839577Z",
"iopub.status.busy": "2024-09-10T20:35:46.839233Z",
"iopub.status.idle": "2024-09-10T20:35:46.849663Z",
"shell.execute_reply": "2024-09-10T20:35:46.849177Z"
}
},
"outputs": [],
"source": [
"import json\n",
@@ -221,7 +261,7 @@
"from langchain_anthropic.chat_models import ChatAnthropic\n",
"from langchain_core.messages import AIMessage\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field, validator\n",
"from pydantic import BaseModel, Field, validator\n",
"\n",
"\n",
"class Person(BaseModel):\n",
@@ -279,16 +319,23 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"id": "9260d5e8-3b6c-4639-9f3b-fb2f90239e4b",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:46.851870Z",
"iopub.status.busy": "2024-09-10T20:35:46.851698Z",
"iopub.status.idle": "2024-09-10T20:35:46.854786Z",
"shell.execute_reply": "2024-09-10T20:35:46.854424Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"System: Answer the user query. Output your answer as JSON that matches the given schema: ```json\n",
"{'title': 'People', 'description': 'Identifying information about all people in a text.', 'type': 'object', 'properties': {'people': {'title': 'People', 'type': 'array', 'items': {'$ref': '#/definitions/Person'}}}, 'required': ['people'], 'definitions': {'Person': {'title': 'Person', 'description': 'Information about a person.', 'type': 'object', 'properties': {'name': {'title': 'Name', 'description': 'The name of the person', 'type': 'string'}, 'height_in_meters': {'title': 'Height In Meters', 'description': 'The height of the person expressed in meters.', 'type': 'number'}}, 'required': ['name', 'height_in_meters']}}}\n",
"{'$defs': {'Person': {'description': 'Information about a person.', 'properties': {'name': {'description': 'The name of the person', 'title': 'Name', 'type': 'string'}, 'height_in_meters': {'description': 'The height of the person expressed in meters.', 'title': 'Height In Meters', 'type': 'number'}}, 'required': ['name', 'height_in_meters'], 'title': 'Person', 'type': 'object'}}, 'description': 'Identifying information about all people in a text.', 'properties': {'people': {'items': {'$ref': '#/$defs/Person'}, 'title': 'People', 'type': 'array'}}, 'required': ['people'], 'title': 'People', 'type': 'object'}\n",
"```. Make sure to wrap the answer in ```json and ``` tags\n",
"Human: Anna is 23 years old and she is 6 feet tall\n"
]
@@ -301,17 +348,32 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "c523301d-ae0e-45e3-b195-7fd28c67a5c4",
"metadata": {},
"metadata": {
"execution": {
"iopub.execute_input": "2024-09-10T20:35:46.856945Z",
"iopub.status.busy": "2024-09-10T20:35:46.856769Z",
"iopub.status.idle": "2024-09-10T20:35:48.373728Z",
"shell.execute_reply": "2024-09-10T20:35:48.373079Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/bagatur/langchain/.venv/lib/python3.11/site-packages/pydantic/_internal/_fields.py:201: UserWarning: Field name \"schema\" in \"PromptInput\" shadows an attribute in parent \"BaseModel\"\n",
" warnings.warn(\n"
]
},
{
"data": {
"text/plain": [
"[{'people': [{'name': 'Anna', 'height_in_meters': 1.83}]}]"
]
},
"execution_count": 9,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -349,7 +411,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -90,7 +90,7 @@
"outputs": [],
"source": [
"# Note that we set max_retries = 0 to avoid retrying on RateLimits, etc\n",
"openai_llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", max_retries=0)\n",
"openai_llm = ChatOpenAI(model=\"gpt-4o-mini\", max_retries=0)\n",
"anthropic_llm = ChatAnthropic(model=\"claude-3-haiku-20240307\")\n",
"llm = openai_llm.with_fallbacks([anthropic_llm])"
]

View File

@@ -29,7 +29,7 @@
"\n",
"In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"\n",
"A few-shot prompt template can be constructed from either a set of examples, or from an [Example Selector](https://python.langchain.com/v0.2/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html) class responsible for choosing a subset of examples from the defined set.\n",
"A few-shot prompt template can be constructed from either a set of examples, or from an [Example Selector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html) class responsible for choosing a subset of examples from the defined set.\n",
"\n",
"This guide will cover few-shotting with string prompt templates. For a guide on few-shotting with chat messages for chat models, see [here](/docs/how_to/few_shot_examples_chat/).\n",
"\n",
@@ -160,7 +160,7 @@
"source": [
"### Pass the examples and formatter to `FewShotPromptTemplate`\n",
"\n",
"Finally, create a [`FewShotPromptTemplate`](https://python.langchain.com/v0.2/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotPromptTemplate.html) object. This object takes in the few-shot examples and the formatter for the few-shot examples. When this `FewShotPromptTemplate` is formatted, it formats the passed examples using the `example_prompt`, then and adds them to the final prompt before `suffix`:"
"Finally, create a [`FewShotPromptTemplate`](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotPromptTemplate.html) object. This object takes in the few-shot examples and the formatter for the few-shot examples. When this `FewShotPromptTemplate` is formatted, it formats the passed examples using the `example_prompt`, then and adds them to the final prompt before `suffix`:"
]
},
{
@@ -251,7 +251,7 @@
"source": [
"## Using an example selector\n",
"\n",
"We will reuse the example set and the formatter from the previous section. However, instead of feeding the examples directly into the `FewShotPromptTemplate` object, we will feed them into an implementation of `ExampleSelector` called [`SemanticSimilarityExampleSelector`](https://python.langchain.com/v0.2/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html) instance. This class selects few-shot examples from the initial set based on their similarity to the input. It uses an embedding model to compute the similarity between the input and the few-shot examples, as well as a vector store to perform the nearest neighbor search.\n",
"We will reuse the example set and the formatter from the previous section. However, instead of feeding the examples directly into the `FewShotPromptTemplate` object, we will feed them into an implementation of `ExampleSelector` called [`SemanticSimilarityExampleSelector`](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html) instance. This class selects few-shot examples from the initial set based on their similarity to the input. It uses an embedding model to compute the similarity between the input and the few-shot examples, as well as a vector store to perform the nearest neighbor search.\n",
"\n",
"To show what it looks like, let's initialize an instance and call it in isolation:"
]

View File

@@ -29,7 +29,7 @@
"\n",
"This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"\n",
"There does not appear to be solid consensus on how best to do few-shot prompting, and the optimal prompt compilation will likely vary by model. Because of this, we provide few-shot prompt templates like the [FewShotChatMessagePromptTemplate](https://python.langchain.com/v0.2/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate.html?highlight=fewshot#langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate) as a flexible starting point, and you can modify or replace them as you see fit.\n",
"There does not appear to be solid consensus on how best to do few-shot prompting, and the optimal prompt compilation will likely vary by model. Because of this, we provide few-shot prompt templates like the [FewShotChatMessagePromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate.html?highlight=fewshot#langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate) as a flexible starting point, and you can modify or replace them as you see fit.\n",
"\n",
"The goal of few-shot prompt templates are to dynamically select examples based on an input, and then format the examples in a final prompt to provide for the model.\n",
"\n",
@@ -49,7 +49,7 @@
"\n",
"The basic components of the template are:\n",
"- `examples`: A list of dictionary examples to include in the final prompt.\n",
"- `example_prompt`: converts each example into 1 or more messages through its [`format_messages`](https://python.langchain.com/v0.2/api_reference/core/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html?highlight=format_messages#langchain_core.prompts.chat.ChatPromptTemplate.format_messages) method. A common example would be to convert each example into one human message and one AI message response, or a human message followed by a function call message.\n",
"- `example_prompt`: converts each example into 1 or more messages through its [`format_messages`](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html?highlight=format_messages#langchain_core.prompts.chat.ChatPromptTemplate.format_messages) method. A common example would be to convert each example into one human message and one AI message response, or a human message followed by a function call message.\n",
"\n",
"Below is a simple demonstration. First, define the examples you'd like to include. Let's give the LLM an unfamiliar mathematical operator, denoted by the \"🦜\" emoji:"
]
@@ -66,7 +66,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -86,7 +87,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='The expression \"2 🦜 9\" is not a standard mathematical operation or equation. It appears to be a combination of the number 2 and the parrot emoji 🦜 followed by the number 9. It does not have a specific mathematical meaning.', response_metadata={'token_usage': {'completion_tokens': 54, 'prompt_tokens': 17, 'total_tokens': 71}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-aad12dda-5c47-4a1e-9949-6fe94e03242a-0', usage_metadata={'input_tokens': 17, 'output_tokens': 54, 'total_tokens': 71})"
"AIMessage(content='The expression \"2 🦜 9\" is not a standard mathematical operation or equation. It appears to be a combination of the number 2 and the parrot emoji 🦜 followed by the number 9. It does not have a specific mathematical meaning.', response_metadata={'token_usage': {'completion_tokens': 54, 'prompt_tokens': 17, 'total_tokens': 71}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-aad12dda-5c47-4a1e-9949-6fe94e03242a-0', usage_metadata={'input_tokens': 17, 'output_tokens': 54, 'total_tokens': 71})"
]
},
"execution_count": 4,
@@ -97,7 +98,7 @@
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0.0)\n",
"model = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0.0)\n",
"\n",
"model.invoke(\"What is 2 🦜 9?\")"
]
@@ -212,7 +213,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='11', response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 60, 'total_tokens': 61}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5ec4e051-262f-408e-ad00-3f2ebeb561c3-0', usage_metadata={'input_tokens': 60, 'output_tokens': 1, 'total_tokens': 61})"
"AIMessage(content='11', response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 60, 'total_tokens': 61}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5ec4e051-262f-408e-ad00-3f2ebeb561c3-0', usage_metadata={'input_tokens': 60, 'output_tokens': 1, 'total_tokens': 61})"
]
},
"execution_count": 8,
@@ -239,8 +240,8 @@
"\n",
"Sometimes you may want to select only a few examples from your overall set to show based on the input. For this, you can replace the `examples` passed into `FewShotChatMessagePromptTemplate` with an `example_selector`. The other components remain the same as above! Our dynamic few-shot prompt template would look like:\n",
"\n",
"- `example_selector`: responsible for selecting few-shot examples (and the order in which they are returned) for a given input. These implement the [BaseExampleSelector](https://python.langchain.com/v0.2/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html?highlight=baseexampleselector#langchain_core.example_selectors.base.BaseExampleSelector) interface. A common example is the vectorstore-backed [SemanticSimilarityExampleSelector](https://python.langchain.com/v0.2/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html?highlight=semanticsimilarityexampleselector#langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector)\n",
"- `example_prompt`: convert each example into 1 or more messages through its [`format_messages`](https://python.langchain.com/v0.2/api_reference/core/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html?highlight=chatprompttemplate#langchain_core.prompts.chat.ChatPromptTemplate.format_messages) method. A common example would be to convert each example into one human message and one AI message response, or a human message followed by a function call message.\n",
"- `example_selector`: responsible for selecting few-shot examples (and the order in which they are returned) for a given input. These implement the [BaseExampleSelector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html?highlight=baseexampleselector#langchain_core.example_selectors.base.BaseExampleSelector) interface. A common example is the vectorstore-backed [SemanticSimilarityExampleSelector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html?highlight=semanticsimilarityexampleselector#langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector)\n",
"- `example_prompt`: convert each example into 1 or more messages through its [`format_messages`](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html?highlight=chatprompttemplate#langchain_core.prompts.chat.ChatPromptTemplate.format_messages) method. A common example would be to convert each example into one human message and one AI message response, or a human message followed by a function call message.\n",
"\n",
"These once again can be composed with other messages and chat templates to assemble your final prompt.\n",
"\n",
@@ -418,7 +419,7 @@
{
"data": {
"text/plain": [
"AIMessage(content='6', response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 60, 'total_tokens': 61}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d1863e5e-17cd-4e9d-bf7a-b9f118747a65-0', usage_metadata={'input_tokens': 60, 'output_tokens': 1, 'total_tokens': 61})"
"AIMessage(content='6', response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 60, 'total_tokens': 61}, 'model_name': 'gpt-4o-mini', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d1863e5e-17cd-4e9d-bf7a-b9f118747a65-0', usage_metadata={'input_tokens': 60, 'output_tokens': 1, 'total_tokens': 61})"
]
},
"execution_count": 13,
@@ -427,7 +428,7 @@
}
],
"source": [
"chain = final_prompt | ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0.0)\n",
"chain = final_prompt | ChatOpenAI(model=\"gpt-4o-mini\", temperature=0.0)\n",
"\n",
"chain.invoke({\"input\": \"What's 3 🦜 3?\"})"
]

View File

@@ -175,7 +175,7 @@
"source": [
"## API reference\n",
"\n",
"For a complete description of all arguments head to the API reference: https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.utils.filter_messages.html"
"For a complete description of all arguments head to the API reference: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.filter_messages.html"
]
}
],

View File

@@ -17,14 +17,12 @@
"source": [
"# How to do tool/function calling\n",
"\n",
"```{=mdx}\n",
":::info\n",
"We use the term tool calling interchangeably with function calling. Although\n",
"function calling is sometimes meant to refer to invocations of a single function,\n",
"we treat all models as though they can return multiple tool or function calls in \n",
"each message.\n",
":::\n",
"```\n",
"\n",
"Tool calling allows a model to respond to a given prompt by generating output that \n",
"matches a user-defined schema. While the name implies that the model is performing \n",
@@ -88,7 +86,7 @@
"## Passing tools to LLMs\n",
"\n",
"Chat models supporting tool calling features implement a `.bind_tools` method, which \n",
"receives a list of LangChain [tool objects](https://python.langchain.com/v0.2/api_reference/core/tools/langchain_core.tools.BaseTool.html#langchain_core.tools.BaseTool) \n",
"receives a list of LangChain [tool objects](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html#langchain_core.tools.BaseTool) \n",
"and binds them to the chat model in its expected format. Subsequent invocations of the \n",
"chat model will include tool schemas in its calls to the LLM.\n",
"\n",
@@ -136,7 +134,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"# Note that the docstrings here are crucial, as they will be passed along\n",
@@ -165,14 +163,12 @@
"source": [
"We can bind them to chat models as follows:\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
" fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n",
"/>\n",
"```\n",
"\n",
"We can use the `bind_tools()` method to handle converting\n",
"`Multiply` to a \"tool\" and binding it to the model (i.e.,\n",
@@ -191,7 +187,7 @@
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)"
"llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)"
]
},
{
@@ -212,9 +208,9 @@
"## Tool calls\n",
"\n",
"If tool calls are included in a LLM response, they are attached to the corresponding \n",
"[message](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage) \n",
"or [message chunk](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"as a list of [tool call](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.tool.ToolCall.html#langchain_core.messages.tool.ToolCall) \n",
"[message](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage) \n",
"or [message chunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"as a list of [tool call](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCall.html#langchain_core.messages.tool.ToolCall) \n",
"objects in the `.tool_calls` attribute. A `ToolCall` is a typed dict that includes a \n",
"tool name, dict of argument values, and (optionally) an identifier. Messages with no \n",
"tool calls default to an empty list for this attribute.\n",
@@ -258,7 +254,7 @@
"The `.tool_calls` attribute should contain valid tool calls. Note that on occasion, \n",
"model providers may output malformed tool calls (e.g., arguments that are not \n",
"valid JSON). When parsing fails in these cases, instances \n",
"of [InvalidToolCall](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.tool.InvalidToolCall.html#langchain_core.messages.tool.InvalidToolCall) \n",
"of [InvalidToolCall](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.InvalidToolCall.html#langchain_core.messages.tool.InvalidToolCall) \n",
"are populated in the `.invalid_tool_calls` attribute. An `InvalidToolCall` can have \n",
"a name, string arguments, identifier, and error message.\n",
"\n",
@@ -298,8 +294,8 @@
"### Streaming\n",
"\n",
"When tools are called in a streaming context, \n",
"[message chunks](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"will be populated with [tool call chunk](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.tool.ToolCallChunk.html#langchain_core.messages.tool.ToolCallChunk) \n",
"[message chunks](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"will be populated with [tool call chunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCallChunk.html#langchain_core.messages.tool.ToolCallChunk) \n",
"objects in a list via the `.tool_call_chunks` attribute. A `ToolCallChunk` includes \n",
"optional string fields for the tool `name`, `args`, and `id`, and includes an optional \n",
"integer field `index` that can be used to join chunks together. Fields are optional \n",
@@ -307,7 +303,7 @@
"that includes a substring of the arguments may have null values for the tool name and id).\n",
"\n",
"Because message chunks inherit from their parent message class, an \n",
"[AIMessageChunk](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"[AIMessageChunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"with tool call chunks will also include `.tool_calls` and `.invalid_tool_calls` fields. \n",
"These fields are parsed best-effort from the message's tool call chunks.\n",
"\n",
@@ -696,7 +692,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -26,7 +26,7 @@
"\n",
":::\n",
"\n",
"You can use arbitrary functions as [Runnables](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable). This is useful for formatting or when you need functionality not provided by other LangChain components, and custom functions used as Runnables are called [`RunnableLambdas`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html).\n",
"You can use arbitrary functions as [Runnables](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable). This is useful for formatting or when you need functionality not provided by other LangChain components, and custom functions used as Runnables are called [`RunnableLambdas`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html).\n",
"\n",
"Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single dict input and unpacks it into multiple arguments.\n",
"\n",
@@ -54,7 +54,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -210,7 +211,7 @@
"\n",
"## Passing run metadata\n",
"\n",
"Runnable lambdas can optionally accept a [RunnableConfig](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig) parameter, which they can use to pass callbacks, tags, and other configuration information to nested runs."
"Runnable lambdas can optionally accept a [RunnableConfig](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig) parameter, which they can use to pass callbacks, tags, and other configuration information to nested runs."
]
},
{
@@ -302,8 +303,8 @@
"source": [
"## Streaming\n",
"\n",
":::{.callout-note}\n",
"[RunnableLambda](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html) is best suited for code that does not need to support streaming. If you need to support streaming (i.e., be able to operate on chunks of inputs and yield chunks of outputs), use [RunnableGenerator](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableGenerator.html) instead as in the example below.\n",
":::note\n",
"[RunnableLambda](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html) is best suited for code that does not need to support streaming. If you need to support streaming (i.e., be able to operate on chunks of inputs and yield chunks of outputs), use [RunnableGenerator](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableGenerator.html) instead as in the example below.\n",
":::\n",
"\n",
"You can use generator functions (ie. functions that use the `yield` keyword, and behave like iterators) in a chain.\n",

View File

@@ -24,7 +24,7 @@
"\n",
"## Architecture\n",
"\n",
"At a high-level, the steps of constructing a knowledge are from text are:\n",
"At a high-level, the steps of constructing a knowledge graph from text are:\n",
"\n",
"1. **Extracting structured information from text**: Model is used to extract structured graph information from text.\n",
"2. **Storing into graph database**: Storing the extracted structured graph information into a graph database enables downstream RAG applications\n",

View File

@@ -163,8 +163,8 @@
"from typing import List, Optional\n",
"\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)\n",
"\n",

View File

@@ -347,7 +347,7 @@
"\n",
"If we have enough examples, we may want to only include the most relevant ones in the prompt, either because they don't fit in the model's context window or because the long tail of examples distracts the model. And specifically, given any input we want to include the examples most relevant to that input.\n",
"\n",
"We can do just this using an ExampleSelector. In this case we'll use a [SemanticSimilarityExampleSelector](https://python.langchain.com/v0.2/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html), which will store the examples in the vector database of our choosing. At runtime it will perform a similarity search between the input and our examples, and return the most semantically similar ones: "
"We can do just this using an ExampleSelector. In this case we'll use a [SemanticSimilarityExampleSelector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector.html), which will store the examples in the vector database of our choosing. At runtime it will perform a similarity search between the input and our examples, and return the most semantically similar ones: "
]
},
{

View File

@@ -177,14 +177,15 @@
"source": [
"from typing import Optional, Type\n",
"\n",
"# Import things that are needed generically\n",
"from langchain.pydantic_v1 import BaseModel, Field\n",
"from langchain_core.callbacks import (\n",
" AsyncCallbackManagerForToolRun,\n",
" CallbackManagerForToolRun,\n",
")\n",
"from langchain_core.tools import BaseTool\n",
"\n",
"# Import things that are needed generically\n",
"from pydantic import BaseModel, Field\n",
"\n",
"description_query = \"\"\"\n",
"MATCH (m:Movie|Person)\n",
"WHERE m.title CONTAINS $candidate OR m.name CONTAINS $candidate\n",
@@ -226,14 +227,15 @@
"source": [
"from typing import Optional, Type\n",
"\n",
"# Import things that are needed generically\n",
"from langchain.pydantic_v1 import BaseModel, Field\n",
"from langchain_core.callbacks import (\n",
" AsyncCallbackManagerForToolRun,\n",
" CallbackManagerForToolRun,\n",
")\n",
"from langchain_core.tools import BaseTool\n",
"\n",
"# Import things that are needed generically\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class InformationInput(BaseModel):\n",
" entity: str = Field(description=\"movie or a person mentioned in the question\")\n",

View File

@@ -9,7 +9,7 @@ Here youll find answers to “How do I….?” types of questions.
These guides are *goal-oriented* and *concrete*; they're meant to help you complete a specific task.
For conceptual explanations see the [Conceptual guide](/docs/concepts/).
For end-to-end walkthroughs see [Tutorials](/docs/tutorials).
For comprehensive descriptions of every class and function see the [API Reference](https://python.langchain.com/v0.2/api_reference/).
For comprehensive descriptions of every class and function see the [API Reference](https://python.langchain.com/api_reference/).
## Installation
@@ -27,7 +27,7 @@ This highlights functionality that is core to using LangChain.
## LangChain Expression Language (LCEL)
[LangChain Expression Language](/docs/concepts/#langchain-expression-language-lcel) is a way to create arbitrary custom chains. It is built on the [Runnable](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) protocol.
[LangChain Expression Language](/docs/concepts/#langchain-expression-language-lcel) is a way to create arbitrary custom chains. It is built on the [Runnable](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) protocol.
[**LCEL cheatsheet**](/docs/how_to/lcel_cheatsheet/): For a quick overview of how to use the main LCEL primitives.
@@ -126,13 +126,14 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
[Document Loaders](/docs/concepts/#document-loaders) are responsible for loading documents from a variety of sources.
- [How to: load PDF files](/docs/how_to/document_loader_pdf)
- [How to: load web pages](/docs/how_to/document_loader_web)
- [How to: load CSV data](/docs/how_to/document_loader_csv)
- [How to: load data from a directory](/docs/how_to/document_loader_directory)
- [How to: load HTML data](/docs/how_to/document_loader_html)
- [How to: load JSON data](/docs/how_to/document_loader_json)
- [How to: load Markdown data](/docs/how_to/document_loader_markdown)
- [How to: load Microsoft Office data](/docs/how_to/document_loader_office_file)
- [How to: load PDF files](/docs/how_to/document_loader_pdf)
- [How to: write a custom document loader](/docs/how_to/document_loader_custom)
### Text splitters

View File

@@ -7,10 +7,10 @@
"source": [
"# LangChain Expression Language Cheatsheet\n",
"\n",
"This is a quick reference for all the most important LCEL primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"This is a quick reference for all the most important LCEL primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"\n",
"### Invoke a runnable\n",
"#### [Runnable.invoke()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.invoke) / [Runnable.ainvoke()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.ainvoke)"
"#### [Runnable.invoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.invoke) / [Runnable.ainvoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.ainvoke)"
]
},
{
@@ -46,7 +46,7 @@
"metadata": {},
"source": [
"### Batch a runnable\n",
"#### [Runnable.batch()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.batch) / [Runnable.abatch()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.abatch)"
"#### [Runnable.batch()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.batch) / [Runnable.abatch()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.abatch)"
]
},
{
@@ -82,7 +82,7 @@
"metadata": {},
"source": [
"### Stream a runnable\n",
"#### [Runnable.stream()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) / [Runnable.astream()](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream)"
"#### [Runnable.stream()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) / [Runnable.astream()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream)"
]
},
{
@@ -165,7 +165,7 @@
"metadata": {},
"source": [
"### Invoke runnables in parallel\n",
"#### [RunnableParallel](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableParallel.html)"
"#### [RunnableParallel](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableParallel.html)"
]
},
{
@@ -202,7 +202,7 @@
"metadata": {},
"source": [
"### Turn any function into a runnable\n",
"#### [RunnableLambda](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html)"
"#### [RunnableLambda](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableLambda.html)"
]
},
{
@@ -240,7 +240,7 @@
"metadata": {},
"source": [
"### Merge input and output dicts\n",
"#### [RunnablePassthrough.assign](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html)"
"#### [RunnablePassthrough.assign](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html)"
]
},
{
@@ -276,7 +276,7 @@
"metadata": {},
"source": [
"### Include input dict in output dict\n",
"#### [RunnablePassthrough](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html)"
"#### [RunnablePassthrough](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html)"
]
},
{
@@ -316,7 +316,7 @@
"metadata": {},
"source": [
"### Add default invocation args\n",
"#### [Runnable.bind](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind)"
"#### [Runnable.bind](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind)"
]
},
{
@@ -360,7 +360,7 @@
"metadata": {},
"source": [
"### Add fallbacks\n",
"#### [Runnable.with_fallbacks](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_fallbacks)"
"#### [Runnable.with_fallbacks](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_fallbacks)"
]
},
{
@@ -397,7 +397,7 @@
"metadata": {},
"source": [
"### Add retries\n",
"#### [Runnable.with_retry](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_retry)"
"#### [Runnable.with_retry](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_retry)"
]
},
{
@@ -449,7 +449,7 @@
"metadata": {},
"source": [
"### Configure runnable execution\n",
"#### [RunnableConfig](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html)"
"#### [RunnableConfig](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html)"
]
},
{
@@ -487,7 +487,7 @@
"metadata": {},
"source": [
"### Add default config to runnable\n",
"#### [Runnable.with_config](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config)"
"#### [Runnable.with_config](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_config)"
]
},
{
@@ -526,7 +526,7 @@
"metadata": {},
"source": [
"### Make runnable attributes configurable\n",
"#### [Runnable.with_configurable_fields](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableSerializable.html#langchain_core.runnables.base.RunnableSerializable.configurable_fields)"
"#### [Runnable.with_configurable_fields](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableSerializable.html#langchain_core.runnables.base.RunnableSerializable.configurable_fields)"
]
},
{
@@ -605,7 +605,7 @@
"metadata": {},
"source": [
"### Make chain components configurable\n",
"#### [Runnable.with_configurable_alternatives](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableSerializable.html#langchain_core.runnables.base.RunnableSerializable.configurable_alternatives)"
"#### [Runnable.with_configurable_alternatives](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableSerializable.html#langchain_core.runnables.base.RunnableSerializable.configurable_alternatives)"
]
},
{
@@ -745,7 +745,7 @@
"metadata": {},
"source": [
"### Generate a stream of events\n",
"#### [Runnable.astream_events](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream_events)"
"#### [Runnable.astream_events](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.astream_events)"
]
},
{
@@ -817,7 +817,7 @@
"metadata": {},
"source": [
"### Yield batched outputs as they complete\n",
"#### [Runnable.batch_as_completed](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.batch_as_completed) / [Runnable.abatch_as_completed](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.abatch_as_completed)"
"#### [Runnable.batch_as_completed](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.batch_as_completed) / [Runnable.abatch_as_completed](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.abatch_as_completed)"
]
},
{
@@ -858,7 +858,7 @@
"metadata": {},
"source": [
"### Return subset of output dict\n",
"#### [Runnable.pick](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.pick)"
"#### [Runnable.pick](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.pick)"
]
},
{
@@ -893,7 +893,7 @@
"metadata": {},
"source": [
"### Declaratively make a batched version of a runnable\n",
"#### [Runnable.map](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.map)"
"#### [Runnable.map](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.map)"
]
},
{
@@ -930,7 +930,7 @@
"metadata": {},
"source": [
"### Get a graph representation of a runnable\n",
"#### [Runnable.get_graph](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.get_graph)"
"#### [Runnable.get_graph](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.get_graph)"
]
},
{
@@ -991,7 +991,7 @@
"metadata": {},
"source": [
"### Get all prompts in a chain\n",
"#### [Runnable.get_prompts](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.get_prompts)"
"#### [Runnable.get_prompts](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.get_prompts)"
]
},
{
@@ -1071,7 +1071,7 @@
"metadata": {},
"source": [
"### Add lifecycle listeners\n",
"#### [Runnable.with_listeners](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_listeners)"
"#### [Runnable.with_listeners](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_listeners)"
]
},
{

View File

@@ -25,7 +25,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"# Please manually enter OpenAI Key"
]
},

View File

@@ -24,13 +24,13 @@
"\n",
"There are some API-specific callback context managers that allow you to track token usage across multiple calls. You'll need to check whether such an integration is available for your particular model.\n",
"\n",
"If such an integration is not available for your model, you can create a custom callback manager by adapting the implementation of the [OpenAI callback manager](https://python.langchain.com/v0.2/api_reference/community/callbacks/langchain_community.callbacks.openai_info.OpenAICallbackHandler.html).\n",
"If such an integration is not available for your model, you can create a custom callback manager by adapting the implementation of the [OpenAI callback manager](https://python.langchain.com/api_reference/community/callbacks/langchain_community.callbacks.openai_info.OpenAICallbackHandler.html).\n",
"\n",
"### OpenAI\n",
"\n",
"Let's first look at an extremely simple example of tracking token usage for a single Chat model call.\n",
"\n",
":::{.callout-danger}\n",
":::danger\n",
"\n",
"The callback handler does not currently support streaming token counts for legacy language models (e.g., `langchain_openai.OpenAI`). For support in a streaming context, refer to the corresponding guide for chat models [here](/docs/how_to/chat_token_usage_tracking).\n",
"\n",
@@ -162,7 +162,7 @@
"source": [
"## Streaming\n",
"\n",
":::{.callout-danger}\n",
":::danger\n",
"\n",
"`get_openai_callback` does not currently support streaming token counts for legacy language models (e.g., `langchain_openai.OpenAI`). If you want to count tokens correctly in a streaming context, there are a number of options:\n",
"\n",

View File

@@ -208,7 +208,7 @@
"\n",
"Metal is a graphics and compute API created by Apple providing near-direct access to the GPU. \n",
"\n",
"See the [`llama.cpp`](docs/integrations/llms/llamacpp) setup [here](https://github.com/abetlen/llama-cpp-python/blob/main/docs/install/macos.md) to enable this.\n",
"See the [`llama.cpp`](/docs/integrations/llms/llamacpp) setup [here](https://github.com/abetlen/llama-cpp-python/blob/main/docs/install/macos.md) to enable this.\n",
"\n",
"In particular, ensure that conda is using the correct virtual environment that you created (`miniforge3`).\n",
"\n",
@@ -244,7 +244,7 @@
"\n",
"* E.g., for Llama 2 7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n",
"* We can also specify a particular version from the [model list](https://github.com/jmorganca/ollama?tab=readme-ov-file#model-library), e.g., `ollama pull llama2:13b`\n",
"* See the full set of parameters on the [API reference page](https://python.langchain.com/v0.2/api_reference/community/llms/langchain_community.llms.ollama.Ollama.html)"
"* See the full set of parameters on the [API reference page](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.ollama.Ollama.html)"
]
},
{
@@ -280,9 +280,9 @@
"\n",
"For example, below we run inference on `llama2-13b` with 4 bit quantization downloaded from [HuggingFace](https://huggingface.co/TheBloke/Llama-2-13B-GGML/tree/main).\n",
"\n",
"As noted above, see the [API reference](https://python.langchain.com/v0.2/api_reference/langchain/llms/langchain.llms.llamacpp.LlamaCpp.html?highlight=llamacpp#langchain.llms.llamacpp.LlamaCpp) for the full set of parameters. \n",
"As noted above, see the [API reference](https://python.langchain.com/api_reference/langchain/llms/langchain.llms.llamacpp.LlamaCpp.html?highlight=llamacpp#langchain.llms.llamacpp.LlamaCpp) for the full set of parameters. \n",
"\n",
"From the [llama.cpp API reference docs](https://python.langchain.com/v0.2/api_reference/community/llms/langchain_community.llms.llamacpp.LlamaCpp.html), a few are worth commenting on:\n",
"From the [llama.cpp API reference docs](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.llamacpp.LlamaCpp.html), a few are worth commenting on:\n",
"\n",
"`n_gpu_layers`: number of layers to be loaded into GPU memory\n",
"\n",
@@ -416,7 +416,7 @@
"\n",
"We can use model weights downloaded from [GPT4All](/docs/integrations/llms/gpt4all) model explorer.\n",
"\n",
"Similar to what is shown above, we can run inference and use [the API reference](https://python.langchain.com/v0.2/api_reference/community/llms/langchain_community.llms.gpt4all.GPT4All.html) to set parameters of interest."
"Similar to what is shown above, we can run inference and use [the API reference](https://python.langchain.com/api_reference/community/llms/langchain_community.llms.gpt4all.GPT4All.html) to set parameters of interest."
]
},
{

View File

@@ -55,7 +55,7 @@
"id": "f88ffa0d-f4a7-482c-88de-cbec501a79b1",
"metadata": {},
"source": [
"For the OpenAI API to return log probabilities we need to configure the `logprobs=True` param. Then, the logprobs are included on each output [`AIMessage`](https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html) as part of the `response_metadata`:"
"For the OpenAI API to return log probabilities we need to configure the `logprobs=True` param. Then, the logprobs are included on each output [`AIMessage`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html) as part of the `response_metadata`:"
]
},
{
@@ -94,7 +94,7 @@
"source": [
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\").bind(logprobs=True)\n",
"llm = ChatOpenAI(model=\"gpt-4o-mini\").bind(logprobs=True)\n",
"\n",
"msg = llm.invoke((\"human\", \"how are you today\"))\n",
"\n",

View File

@@ -13,7 +13,7 @@
"\n",
"To mitigate the [\"lost in the middle\"](https://arxiv.org/abs/2307.03172) effect, you can re-order documents after retrieval such that the most relevant documents are positioned at extrema (e.g., the first and last pieces of context), and the least relevant documents are positioned in the middle. In some cases this can help surface the most relevant information to LLMs.\n",
"\n",
"The [LongContextReorder](https://python.langchain.com/v0.2/api_reference/community/document_transformers/langchain_community.document_transformers.long_context_reorder.LongContextReorder.html) document transformer implements this re-ordering procedure. Below we demonstrate an example."
"The [LongContextReorder](https://python.langchain.com/api_reference/community/document_transformers/langchain_community.document_transformers.long_context_reorder.LongContextReorder.html) document transformer implements this re-ordering procedure. Below we demonstrate an example."
]
},
{

View File

@@ -17,7 +17,7 @@
"When a full paragraph or document is embedded, the embedding process considers both the overall context and the relationships between the sentences and phrases within the text. This can result in a more comprehensive vector representation that captures the broader meaning and themes of the text.\n",
"```\n",
" \n",
"As mentioned, chunking often aims to keep text with common context together. With this in mind, we might want to specifically honor the structure of the document itself. For example, a markdown file is organized by headers. Creating chunks within specific header groups is an intuitive idea. To address this challenge, we can use [MarkdownHeaderTextSplitter](https://python.langchain.com/v0.2/api_reference/text_splitters/markdown/langchain_text_splitters.markdown.MarkdownHeaderTextSplitter.html). This will split a markdown file by a specified set of headers. \n",
"As mentioned, chunking often aims to keep text with common context together. With this in mind, we might want to specifically honor the structure of the document itself. For example, a markdown file is organized by headers. Creating chunks within specific header groups is an intuitive idea. To address this challenge, we can use [MarkdownHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/markdown/langchain_text_splitters.markdown.MarkdownHeaderTextSplitter.html). This will split a markdown file by a specified set of headers. \n",
"\n",
"For example, if we want to split this markdown:\n",
"```\n",

View File

@@ -11,12 +11,30 @@
"\n",
"The `merge_message_runs` utility makes it easy to merge consecutive messages of the same type.\n",
"\n",
"### Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "198ce37f-4466-45a2-8878-d75cd01a5d23",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain-core langchain-anthropic"
]
},
{
"cell_type": "markdown",
"id": "b5c3ca6e-e5b3-4151-8307-9101713a20ae",
"metadata": {},
"source": [
"## Basic usage"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 8,
"id": "1a215bbb-c05c-40b0-a6fd-d94884d517df",
"metadata": {},
"outputs": [
@@ -24,11 +42,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"SystemMessage(content=\"you're a good assistant.\\nyou always respond with a joke.\")\n",
"SystemMessage(content=\"you're a good assistant.\\nyou always respond with a joke.\", additional_kwargs={}, response_metadata={})\n",
"\n",
"HumanMessage(content=[{'type': 'text', 'text': \"i wonder why it's called langchain\"}, 'and who is harrison chasing anyways'])\n",
"HumanMessage(content=[{'type': 'text', 'text': \"i wonder why it's called langchain\"}, 'and who is harrison chasing anyways'], additional_kwargs={}, response_metadata={})\n",
"\n",
"AIMessage(content='Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!\\nWhy, he\\'s probably chasing after the last cup of coffee in the office!')\n"
"AIMessage(content='Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!\\nWhy, he\\'s probably chasing after the last cup of coffee in the office!', additional_kwargs={}, response_metadata={})\n"
]
}
],
@@ -63,38 +81,6 @@
"Notice that if the contents of one of the messages to merge is a list of content blocks then the merged message will have a list of content blocks. And if both messages to merge have string contents then those are concatenated with a newline character."
]
},
{
"cell_type": "markdown",
"id": "11f7e8d3",
"metadata": {},
"source": [
"The `merge_message_runs` utility also works with messages composed together using the overloaded `+` operation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b51855c5",
"metadata": {},
"outputs": [],
"source": [
"messages = (\n",
" SystemMessage(\"you're a good assistant.\")\n",
" + SystemMessage(\"you always respond with a joke.\")\n",
" + HumanMessage([{\"type\": \"text\", \"text\": \"i wonder why it's called langchain\"}])\n",
" + HumanMessage(\"and who is harrison chasing anyways\")\n",
" + AIMessage(\n",
" 'Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!'\n",
" )\n",
" + AIMessage(\n",
" \"Why, he's probably chasing after the last cup of coffee in the office!\"\n",
" )\n",
")\n",
"\n",
"merged = merge_message_runs(messages)\n",
"print(\"\\n\\n\".join([repr(x) for x in merged]))"
]
},
{
"cell_type": "markdown",
"id": "1b2eee74-71c8-4168-b968-bca580c25d18",
@@ -107,23 +93,30 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 9,
"id": "6d5a0283-11f8-435b-b27b-7b18f7693592",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
},
{
"data": {
"text/plain": [
"AIMessage(content=[], response_metadata={'id': 'msg_01D6R8Naum57q8qBau9vLBUX', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 84, 'output_tokens': 3}}, id='run-ac0c465b-b54f-4b8b-9295-e5951250d653-0', usage_metadata={'input_tokens': 84, 'output_tokens': 3, 'total_tokens': 87})"
"AIMessage(content=[], additional_kwargs={}, response_metadata={'id': 'msg_01KNGUMTuzBVfwNouLDpUMwf', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 84, 'output_tokens': 3}}, id='run-b908b198-9c24-450b-9749-9d4a8182937b-0', usage_metadata={'input_tokens': 84, 'output_tokens': 3, 'total_tokens': 87})"
]
},
"execution_count": 3,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# pip install -U langchain-anthropic\n",
"%pip install -qU langchain-anthropic\n",
"from langchain_anthropic import ChatAnthropic\n",
"\n",
"llm = ChatAnthropic(model=\"claude-3-sonnet-20240229\", temperature=0)\n",
@@ -146,19 +139,19 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 10,
"id": "460817a6-c327-429d-958e-181a8c46059c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[SystemMessage(content=\"you're a good assistant.\\nyou always respond with a joke.\"),\n",
" HumanMessage(content=[{'type': 'text', 'text': \"i wonder why it's called langchain\"}, 'and who is harrison chasing anyways']),\n",
" AIMessage(content='Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!\\nWhy, he\\'s probably chasing after the last cup of coffee in the office!')]"
"[SystemMessage(content=\"you're a good assistant.\\nyou always respond with a joke.\", additional_kwargs={}, response_metadata={}),\n",
" HumanMessage(content=[{'type': 'text', 'text': \"i wonder why it's called langchain\"}, 'and who is harrison chasing anyways'], additional_kwargs={}, response_metadata={}),\n",
" AIMessage(content='Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!\\nWhy, he\\'s probably chasing after the last cup of coffee in the office!', additional_kwargs={}, response_metadata={})]"
]
},
"execution_count": 4,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -167,6 +160,53 @@
"merger.invoke(messages)"
]
},
{
"cell_type": "markdown",
"id": "4178837d-b155-492d-9404-d567accc1fa0",
"metadata": {},
"source": [
"`merge_message_runs` can also be placed after a prompt:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "620530ab-ed05-4899-b984-bfa4cd738465",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='A convergent series is an infinite series whose partial sums approach a finite value as more terms are added. In other words, the sequence of partial sums has a limit.\\n\\nMore formally, an infinite series Σ an (where an are the terms of the series) is said to be convergent if the sequence of partial sums:\\n\\nS1 = a1\\nS2 = a1 + a2 \\nS3 = a1 + a2 + a3\\n...\\nSn = a1 + a2 + a3 + ... + an\\n...\\n\\nconverges to some finite number S as n goes to infinity. We write:\\n\\nlim n→∞ Sn = S\\n\\nThe finite number S is called the sum of the convergent infinite series.\\n\\nIf the sequence of partial sums does not approach any finite limit, the infinite series is said to be divergent.\\n\\nSome key properties:\\n- A series converges if and only if the sequence of its partial sums is a Cauchy sequence.\\n- Absolute/conditional convergence criteria help determine if a given series converges.\\n- Convergent series have many important applications in mathematics, physics, engineering etc.', additional_kwargs={}, response_metadata={'id': 'msg_01MfV6y2hep7ZNvDz24A36U4', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 29, 'output_tokens': 267}}, id='run-9d925f58-021e-4bd0-94fc-f8f5e91010a4-0', usage_metadata={'input_tokens': 29, 'output_tokens': 267, 'total_tokens': 296})"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate(\n",
" [\n",
" (\"system\", \"You're great a {skill}\"),\n",
" (\"system\", \"You're also great at explaining things\"),\n",
" (\"human\", \"{query}\"),\n",
" ]\n",
")\n",
"chain = prompt | merger | llm\n",
"chain.invoke({\"skill\": \"math\", \"query\": \"what's the definition of a convergent series\"})"
]
},
{
"cell_type": "markdown",
"id": "51ba533a-43c7-4e5f-bd91-a4ec23ceeb34",
"metadata": {},
"source": [
"LangSmith Trace: https://smith.langchain.com/public/432150b6-9909-40a7-8ae7-944b7e657438/r/f4ad5fb2-4d38-42a6-b780-25f62617d53f"
]
},
{
"cell_type": "markdown",
"id": "4548d916-ce21-4dc6-8f19-eedb8003ace6",
@@ -174,7 +214,7 @@
"source": [
"## API reference\n",
"\n",
"For a complete description of all arguments head to the API reference: https://python.langchain.com/v0.2/api_reference/core/messages/langchain_core.messages.utils.merge_message_runs.html"
"For a complete description of all arguments head to the API reference: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.merge_message_runs.html"
]
}
],
@@ -194,7 +234,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -32,7 +32,7 @@
"\n",
":::\n",
"\n",
"Passing conversation state into and out a chain is vital when building a chatbot. The [`RunnableWithMessageHistory`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html#langchain_core.runnables.history.RunnableWithMessageHistory) class lets us add message history to certain types of chains. It wraps another Runnable and manages the chat message history for it. Specifically, it loads previous messages in the conversation BEFORE passing it to the Runnable, and it saves the generated response as a message AFTER calling the runnable. This class also enables multiple conversations by saving each conversation with a `session_id` - it then expects a `session_id` to be passed in the config when calling the runnable, and uses that to look up the relevant conversation history.\n",
"Passing conversation state into and out a chain is vital when building a chatbot. The [`RunnableWithMessageHistory`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html#langchain_core.runnables.history.RunnableWithMessageHistory) class lets us add message history to certain types of chains. It wraps another Runnable and manages the chat message history for it. Specifically, it loads previous messages in the conversation BEFORE passing it to the Runnable, and it saves the generated response as a message AFTER calling the runnable. This class also enables multiple conversations by saving each conversation with a `session_id` - it then expects a `session_id` to be passed in the config when calling the runnable, and uses that to look up the relevant conversation history.\n",
"\n",
"![index_diagram](../../static/img/message_history.png)\n",
"\n",
@@ -155,13 +155,11 @@
"\n",
"First we construct a runnable (which here accepts a dict as input and returns a message as output):\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
"/>\n",
"```"
"/>\n"
]
},
{

View File

@@ -31,7 +31,7 @@
":::\n",
"\n",
"Here we focus on how to move from legacy LangChain agents to more flexible [LangGraph](https://langchain-ai.github.io/langgraph/) agents.\n",
"LangChain agents (the [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor) in particular) have multiple configuration parameters.\n",
"LangChain agents (the [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor) in particular) have multiple configuration parameters.\n",
"In this notebook we will show how those parameters map to the LangGraph react agent executor using the [create_react_agent](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) prebuilt helper method.\n",
"\n",
"#### Prerequisites\n",
@@ -65,9 +65,11 @@
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API key:\\n\")"
]
},
{
@@ -110,7 +112,7 @@
"id": "af002033-fe51-4d14-b47c-3e9b483c8395",
"metadata": {},
"source": [
"For the LangChain [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor), we define a prompt with a placeholder for the agent's scratchpad. The agent can be invoked as follows:"
"For the LangChain [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor), we define a prompt with a placeholder for the agent's scratchpad. The agent can be invoked as follows:"
]
},
{
@@ -381,7 +383,7 @@
"source": [
"### In LangChain\n",
"\n",
"With LangChain's [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could add chat [Memory](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.memory) so it can engage in a multi-turn conversation."
"With LangChain's [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could add chat [Memory](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.memory) so it can engage in a multi-turn conversation."
]
},
{
@@ -539,7 +541,7 @@
"\n",
"### In LangChain\n",
"\n",
"With LangChain's [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could iterate over the steps using the [stream](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) (or async `astream`) methods or the [iter](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter) method. LangGraph supports stepwise iteration using [stream](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) "
"With LangChain's [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could iterate over the steps using the [stream](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) (or async `astream`) methods or the [iter](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter) method. LangGraph supports stepwise iteration using [stream](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.stream) "
]
},
{
@@ -1017,7 +1019,7 @@
"\n",
"### In LangChain\n",
"\n",
"With LangChain's [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could configure an [early_stopping_method](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.early_stopping_method) to either return a string saying \"Agent stopped due to iteration limit or time limit.\" (`\"force\"`) or prompt the LLM a final time to respond (`\"generate\"`)."
"With LangChain's [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.iter), you could configure an [early_stopping_method](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.early_stopping_method) to either return a string saying \"Agent stopped due to iteration limit or time limit.\" (`\"force\"`) or prompt the LLM a final time to respond (`\"generate\"`)."
]
},
{
@@ -1128,7 +1130,7 @@
"\n",
"### In LangChain\n",
"\n",
"With LangChain's [AgentExecutor](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor), you could trim the intermediate steps of long-running agents using [trim_intermediate_steps](https://python.langchain.com/v0.2/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.trim_intermediate_steps), which is either an integer (indicating the agent should keep the last N steps) or a custom function.\n",
"With LangChain's [AgentExecutor](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor), you could trim the intermediate steps of long-running agents using [trim_intermediate_steps](https://python.langchain.com/api_reference/langchain/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor.trim_intermediate_steps), which is either an integer (indicating the agent should keep the last N steps) or a custom function.\n",
"\n",
"For instance, we could trim the value so the agent only sees the most recent intermediate step."
]
@@ -1325,7 +1327,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -9,17 +9,17 @@
"\n",
"It can often be useful to store multiple vectors per document. There are multiple use cases where this is beneficial. For example, we can embed multiple chunks of a document and associate those embeddings with the parent document, allowing retriever hits on the chunks to return the larger document.\n",
"\n",
"LangChain implements a base [MultiVectorRetriever](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.multi_vector.MultiVectorRetriever.html), which simplifies this process. Much of the complexity lies in how to create the multiple vectors per document. This notebook covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.\n",
"LangChain implements a base [MultiVectorRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_vector.MultiVectorRetriever.html), which simplifies this process. Much of the complexity lies in how to create the multiple vectors per document. This notebook covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.\n",
"\n",
"The methods to create multiple vectors per document include:\n",
"\n",
"- Smaller chunks: split a document into smaller chunks, and embed those (this is [ParentDocumentRetriever](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.parent_document_retriever.ParentDocumentRetriever.html)).\n",
"- Smaller chunks: split a document into smaller chunks, and embed those (this is [ParentDocumentRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.parent_document_retriever.ParentDocumentRetriever.html)).\n",
"- Summary: create a summary for each document, embed that along with (or instead of) the document.\n",
"- Hypothetical questions: create hypothetical questions that each document would be appropriate to answer, embed those along with (or instead of) the document.\n",
"\n",
"Note that this also enables another method of adding embeddings - manually. This is useful because you can explicitly add questions or queries that should lead to a document being recovered, giving you more control.\n",
"\n",
"Below we walk through an example. First we instantiate some documents. We will index them in an (in-memory) [Chroma](/docs/integrations/providers/chroma/) vector store using [OpenAI](https://python.langchain.com/v0.2/docs/integrations/text_embedding/openai/) embeddings, but any LangChain vector store or embeddings model will suffice."
"Below we walk through an example. First we instantiate some documents. We will index them in an (in-memory) [Chroma](/docs/integrations/providers/chroma/) vector store using [OpenAI](https://python.langchain.com/docs/integrations/text_embedding/openai/) embeddings, but any LangChain vector store or embeddings model will suffice."
]
},
{
@@ -68,7 +68,7 @@
"source": [
"## Smaller chunks\n",
"\n",
"Often times it can be useful to retrieve larger chunks of information, but embed smaller chunks. This allows for embeddings to capture the semantic meaning as closely as possible, but for as much context as possible to be passed downstream. Note that this is what the [ParentDocumentRetriever](https://python.langchain.com/v0.2/api_reference/langchain/retrievers/langchain.retrievers.parent_document_retriever.ParentDocumentRetriever.html) does. Here we show what is going on under the hood.\n",
"Often times it can be useful to retrieve larger chunks of information, but embed smaller chunks. This allows for embeddings to capture the semantic meaning as closely as possible, but for as much context as possible to be passed downstream. Note that this is what the [ParentDocumentRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.parent_document_retriever.ParentDocumentRetriever.html) does. Here we show what is going on under the hood.\n",
"\n",
"We will make a distinction between the vector store, which indexes embeddings of the (sub) documents, and the document store, which houses the \"parent\" documents and associates them with an identifier."
]
@@ -103,7 +103,7 @@
"id": "d4feded4-856a-4282-91c3-53aabc62e6ff",
"metadata": {},
"source": [
"We next generate the \"sub\" documents by splitting the original documents. Note that we store the document identifier in the `metadata` of the corresponding [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html) object."
"We next generate the \"sub\" documents by splitting the original documents. Note that we store the document identifier in the `metadata` of the corresponding [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) object."
]
},
{
@@ -207,7 +207,7 @@
"id": "cdef8339-f9fa-4b3b-955f-ad9dbdf2734f",
"metadata": {},
"source": [
"The default search type the retriever performs on the vector database is a similarity search. LangChain vector stores also support searching via [Max Marginal Relevance](https://python.langchain.com/v0.2/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.max_marginal_relevance_search). This can be controlled via the `search_type` parameter of the retriever:"
"The default search type the retriever performs on the vector database is a similarity search. LangChain vector stores also support searching via [Max Marginal Relevance](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.max_marginal_relevance_search). This can be controlled via the `search_type` parameter of the retriever:"
]
},
{
@@ -244,13 +244,11 @@
"\n",
"A summary may be able to distill more accurately what a chunk is about, leading to better retrieval. Here we show how to create summaries, and then embed those.\n",
"\n",
"We construct a simple [chain](/docs/how_to/sequence) that will receive an input [Document](https://python.langchain.com/v0.2/api_reference/core/documents/langchain_core.documents.base.Document.html) object and generate a summary using a LLM.\n",
"We construct a simple [chain](/docs/how_to/sequence) that will receive an input [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) object and generate a summary using a LLM.\n",
"\n",
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />\n",
"```"
"<ChatModelTabs customVarName=\"llm\" />\n"
]
},
{
@@ -294,7 +292,7 @@
"id": "3faa9fde-1b09-4849-a815-8b2e89c30a02",
"metadata": {},
"source": [
"Note that we can [batch](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) the chain accross documents:"
"Note that we can [batch](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) the chain accross documents:"
]
},
{
@@ -440,7 +438,7 @@
"source": [
"from typing import List\n",
"\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"class HypotheticalQuestions(BaseModel):\n",

View File

@@ -71,7 +71,7 @@
"id": "eed8baf2-f4c2-44c1-b47d-e9f560af6202",
"metadata": {},
"source": [
":::{.callout-tip}\n",
":::tip\n",
"\n",
"LCEL automatically upgrades the function `parse` to `RunnableLambda(parse)` when composed using a `|` syntax.\n",
"\n",
@@ -140,7 +140,7 @@
"id": "62192808-c7e1-4b3a-85f4-b7901de7c0b8",
"metadata": {},
"source": [
":::{.callout-important}\n",
":::important\n",
"\n",
"Please wrap the streaming parser in `RunnableGenerator` as we may stop automatically upgrading it with the `|` syntax.\n",
":::"
@@ -219,7 +219,7 @@
"\n",
"When the output from the chat model or LLM is malformed, the can throw an `OutputParserException` to indicate that parsing fails because of bad input. Using this exception allows code that utilizes the parser to handle the exceptions in a consistent manner.\n",
"\n",
":::{.callout-tip} Parsers are Runnables! 🏃\n",
":::tip Parsers are Runnables! 🏃\n",
"\n",
"Because `BaseOutputParser` implements the `Runnable` interface, any custom parser you will create this way will become valid LangChain Runnables and will benefit from automatic async support, batch interface, logging support etc.\n",
":::\n"
@@ -458,7 +458,7 @@
"id": "18f83192-37e8-43f5-ab29-9568b1279f1b",
"metadata": {},
"source": [
":::{.callout-note}\n",
":::note\n",
"The parser will work with either the output from an LLM (a string) or the output from a chat model (an `AIMessage`)!\n",
":::"
]

View File

@@ -24,8 +24,8 @@
"from typing import List\n",
"\n",
"from langchain_core.output_parsers import PydanticOutputParser\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI"
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field"
]
},
{
@@ -131,7 +131,7 @@
"id": "84498e02",
"metadata": {},
"source": [
"Find out api documentation for [OutputFixingParser](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser)."
"Find out api documentation for [OutputFixingParser](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser)."
]
},
{

View File

@@ -20,7 +20,7 @@
"\n",
"While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do. We can use an output parser to help users to specify an arbitrary JSON schema via the prompt, query a model for outputs that conform to that schema, and finally parse that schema as JSON.\n",
"\n",
":::{.callout-note}\n",
":::note\n",
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON.\n",
":::"
]
@@ -30,7 +30,7 @@
"id": "ae909b7a",
"metadata": {},
"source": [
"The [`JsonOutputParser`](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html) is one built-in option for prompting for and then parsing JSON output. While it is similar in functionality to the [`PydanticOutputParser`](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.pydantic.PydanticOutputParser.html), it also supports streaming back partial JSON objects.\n",
"The [`JsonOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html) is one built-in option for prompting for and then parsing JSON output. While it is similar in functionality to the [`PydanticOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.pydantic.PydanticOutputParser.html), it also supports streaming back partial JSON objects.\n",
"\n",
"Here's an example of how it can be used alongside [Pydantic](https://docs.pydantic.dev/) to conveniently declare the expected schema:"
]
@@ -47,7 +47,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -71,8 +72,8 @@
"source": [
"from langchain_core.output_parsers import JsonOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field\n",
"\n",
"model = ChatOpenAI(temperature=0)\n",
"\n",

View File

@@ -20,8 +20,8 @@
"from langchain.output_parsers import OutputFixingParser\n",
"from langchain_core.output_parsers import PydanticOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI, OpenAI"
"from langchain_openai import ChatOpenAI, OpenAI\n",
"from pydantic import BaseModel, Field"
]
},
{
@@ -244,7 +244,7 @@
"id": "e3a2513a",
"metadata": {},
"source": [
"Find out api documentation for [RetryOutputParser](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.retry.RetryOutputParser.html#langchain.output_parsers.retry.RetryOutputParser)."
"Find out api documentation for [RetryOutputParser](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.retry.RetryOutputParser.html#langchain.output_parsers.retry.RetryOutputParser)."
]
},
{

View File

@@ -35,17 +35,17 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 1,
"id": "1594b2bf-2a6f-47bb-9a81-38930f8e606b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')"
"Joke(setup='Why did the tomato turn red?', punchline='Because it saw the salad dressing!')"
]
},
"execution_count": 6,
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
@@ -53,8 +53,8 @@
"source": [
"from langchain_core.output_parsers import PydanticOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field, validator\n",
"from langchain_openai import OpenAI\n",
"from pydantic import BaseModel, Field, model_validator\n",
"\n",
"model = OpenAI(model_name=\"gpt-3.5-turbo-instruct\", temperature=0.0)\n",
"\n",
@@ -65,11 +65,13 @@
" punchline: str = Field(description=\"answer to resolve the joke\")\n",
"\n",
" # You can add custom validation logic easily with Pydantic.\n",
" @validator(\"setup\")\n",
" def question_ends_with_question_mark(cls, field):\n",
" if field[-1] != \"?\":\n",
" @model_validator(mode=\"before\")\n",
" @classmethod\n",
" def question_ends_with_question_mark(cls, values: dict) -> dict:\n",
" setup = values[\"setup\"]\n",
" if setup[-1] != \"?\":\n",
" raise ValueError(\"Badly formed question!\")\n",
" return field\n",
" return values\n",
"\n",
"\n",
"# Set up a parser + inject instructions into the prompt template.\n",
@@ -239,9 +241,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "poetry-venv-311",
"language": "python",
"name": "python3"
"name": "poetry-venv-311"
},
"language_info": {
"codemirror_mode": {
@@ -253,7 +255,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -20,9 +20,9 @@
"\n",
"LLMs from different providers often have different strengths depending on the specific data they are trianed on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"\n",
"This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/v0.2/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and parse that output into a usable format.\n",
"This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and parse that output into a usable format.\n",
"\n",
":::{.callout-note}\n",
":::note\n",
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed XML.\n",
":::\n",
"\n",
@@ -41,7 +41,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"ANTHROPIC_API_KEY\"] = getpass()"
"if \"ANTHROPIC_API_KEY\" not in os.environ:\n",
" os.environ[\"ANTHROPIC_API_KEY\"] = getpass()"
]
},
{

View File

@@ -22,7 +22,7 @@
"\n",
"This output parser allows users to specify an arbitrary schema and query LLMs for outputs that conform to that schema, using YAML to format their response.\n",
"\n",
":::{.callout-note}\n",
":::note\n",
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed YAML.\n",
":::\n"
]
@@ -39,7 +39,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -47,7 +48,7 @@
"id": "cc479f3a",
"metadata": {},
"source": [
"We use [Pydantic](https://docs.pydantic.dev) with the [`YamlOutputParser`](https://python.langchain.com/v0.2/api_reference/langchain/output_parsers/langchain.output_parsers.yaml.YamlOutputParser.html#langchain.output_parsers.yaml.YamlOutputParser) to declare our data model and give the model more context as to what type of YAML it should generate:"
"We use [Pydantic](https://docs.pydantic.dev) with the [`YamlOutputParser`](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.yaml.YamlOutputParser.html#langchain.output_parsers.yaml.YamlOutputParser) to declare our data model and give the model more context as to what type of YAML it should generate:"
]
},
{
@@ -70,8 +71,8 @@
"source": [
"from langchain.output_parsers import YamlOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"from langchain_openai import ChatOpenAI\n",
"from pydantic import BaseModel, Field\n",
"\n",
"\n",
"# Define your desired data structure.\n",

View File

@@ -26,7 +26,7 @@
"\n",
":::\n",
"\n",
"The [`RunnableParallel`](https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.base.RunnableParallel.html) primitive is essentially a dict whose values are runnables (or things that can be coerced to runnables, like functions). It runs all of its values in parallel, and each value is called with the overall input of the `RunnableParallel`. The final return value is a dict with the results of each value under its appropriate key.\n",
"The [`RunnableParallel`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableParallel.html) primitive is essentially a dict whose values are runnables (or things that can be coerced to runnables, like functions). It runs all of its values in parallel, and each value is called with the overall input of the `RunnableParallel`. The final return value is a dict with the results of each value under its appropriate key.\n",
"\n",
"## Formatting with `RunnableParallels`\n",
"\n",
@@ -60,7 +60,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()"
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()"
]
},
{
@@ -117,7 +118,7 @@
"id": "392cd4c4-e7ed-4ab8-934d-f7a4eca55ee1",
"metadata": {},
"source": [
"::: {.callout-tip}\n",
":::tip\n",
"Note that when composing a RunnableParallel with another Runnable we don't even need to wrap our dictionary in the RunnableParallel class — the type conversion is handled for us. In the context of a chain, these are equivalent:\n",
":::\n",
"\n",

Some files were not shown because too many files have changed in this diff Show More