Compare commits

...

696 Commits

Author SHA1 Message Date
jacoblee93
ef0965ec36 Hide and deindex template pages 2024-08-20 10:44:05 -07:00
Bagatur
8a71f1b41b core[minor]: add langsmith document loader (#25493)
needs tests
2024-08-20 10:22:14 -07:00
Bob Merkus
8e3e532e7d docs: ollama doc update (toolcalling, install, notebook examples) (#25549)
The new `langchain-ollama` package seems pretty well implemented, but I
noticed the docs were still outdated, so I decided to fix them up a bit.

- Llama 3.1 was released on the 23rd of July;
https://ai.meta.com/blog/meta-llama-3-1/
- Ollama has supported tool calling since the 25th of July;
https://ollama.com/blog/tool-support
- The LangChain Ollama partner package was released on the 1st of August;
https://pypi.org/project/langchain-ollama/

**Problem**: Docs reference langchain-community instead of langchain-ollama

**Solution**: Update docs to
https://python.langchain.com/v0.2/docs/integrations/chat/ollama/


**Problem**: OllamaFunctions is deprecated, as noted on
[Integrations](https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/):
This was an experimental wrapper that attempts to bolt-on tool calling
support to models that do not natively support it. The [primary Ollama
integration](https://python.langchain.com/v0.2/docs/integrations/chat/ollama/) now
supports tool calling, and should be used instead.

**Solution**: Delete the old notebook from the repo and update the
existing notebook with @tool decorator + Pydantic examples.


**Problem**: Llama 3.1 has been released, but the llama3-groq-tool-call
fine-tune is still what's noted in the notebooks.

**Solution**: Update docs + notebooks to llama3.1 (which has improved
tool calling support)


**Problem**: Install instructions are incomplete; there is no
information on downloading a model and/or running the Ollama server.

**Solution**: Add simple instructions to start the Ollama service and
pull a model (for tool calling).
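For reference, a minimal sketch of the tool-calling usage the updated docs cover (assuming a local Ollama server with `llama3.1` already pulled):

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama


@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b


# Assumes `ollama serve` is running and `ollama pull llama3.1` has been run.
llm = ChatOllama(model="llama3.1")
llm_with_tools = llm.bind_tools([multiply])
msg = llm_with_tools.invoke("What is 6 times 7?")
print(msg.tool_calls)
```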

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-20 09:20:59 -04:00
Jabir
12e490ea56 Update azuresearch.py (#25577)
This will allow complex-type metadata to be returned. The current
implementation throws an error when dealing with nested metadata.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-20 12:53:30 +00:00
Abraham Omorogbe
498a482e76 docs: Adding Azure Database for PostgreSQL docs (#25560)
This PR shows support for the Azure Database for PostgreSQL Vector
Store and Memory.

[Azure Database for PostgreSQL - Flexible
Server](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/service-overview)
[Azure Database for PostgreSQL pgvector
extension](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/how-to-use-pgvector)

**Description:** Added vector store and memory usage documentation for
Azure Database for PostgreSQL
 **Twitter handle:** [@_aiabe](https://x.com/_aiabe)

---------

Co-authored-by: Abeomor <{ID}+{username}@users.noreply.github.com>
2024-08-20 12:01:32 +00:00
Leonid Ganeline
d324fd1821 docs: added Constitutional AI references (#25553)
Added reference to the source paper.
2024-08-20 08:00:58 -04:00
Bagatur
4bd005adb6 core[patch]: Allow bound models as token_counter in trim_messages (#25563) 2024-08-20 00:21:22 -07:00
Erick Friis
e01c6789c4 core,community: add beta decorator to missed GraphVectorStore extensions (#25562) 2024-08-19 17:29:09 -07:00
Erick Friis
dd2d094adc infra: remove huggingface from ci tree (#25559) 2024-08-19 22:48:26 +00:00
Bagatur
6b98207eda infra: test chat prompt ser/des (#25557) 2024-08-19 15:27:36 -07:00
ccurme
c5bf114c0f together, standard-tests: specify tool_choice in standard tests (#25548)
Here we allow standard tests to specify a value for `tool_choice` via a
`tool_choice_value` property, which defaults to None.

Chat models [available in
Together](https://docs.together.ai/docs/chat-models) have issues passing
standard tool calling tests:
- llama 3.1 models currently [appear to rely on user-side
parsing](https://docs.together.ai/docs/llama-3-function-calling) in
Together;
- Mixtral-8x7B and Mistral-7B (currently tested) consistently do not
call tools in some tests.

Specifying tool_choice also lets us remove an existing `xfail` and use a
smaller model in Groq tests.
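A minimal sketch of how an integration's test class might set this (the class name and return value are illustrative, not the exact Together/Groq configuration):

```python
from typing import Optional

from langchain_standard_tests.integration_tests import ChatModelIntegrationTests


class TestMyChatModelStandard(ChatModelIntegrationTests):
    @property
    def tool_choice_value(self) -> Optional[str]:
        # Forcing tool use lets models that are reluctant to call tools
        # still exercise the tool-calling tests; defaults to None.
        return "any"
```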
2024-08-19 16:37:36 -04:00
maang-h
015ab91b83 community[patch]: Add ToolMessage for ChatZhipuAI (#25547)
- **Description:** Add ToolMessage for `ChatZhipuAI` to solve issue
#25490
2024-08-19 11:26:38 -04:00
ccurme
5a3aaae6dc groq[patch]: update model used for llama tests (#25542)
`llama-3.1-8b-instant` often fails some of the tool calling standard
tests. Here we update to `llama-3.1-70b-versatile`.
2024-08-19 13:58:06 +00:00
Mohammad Mohtashim
75c3c81b8c [Community]: Fix - Open AI Whisper client.audio.transcriptions returning Text Object which raises error (#25271)
- **Description:** The following
[line](fd546196ef/libs/community/langchain_community/document_loaders/parsers/audio.py (L117))
in `OpenAIWhisperParser` returns a text object for some odd reason,
despite the official documentation saying it should return a `Transcript`
instance, which should have the `text` attribute. But for the example given
in the issue, and even when I tried running it on my own, I was getting
the text directly. This small PR accounts for that.
 - **Issue:** : #25218
 

I was able to replicate the error even without the GenericLoader as
shown below and the issue was with `OpenAIWhisperParser`

```python
from langchain_community.document_loaders.parsers.audio import OpenAIWhisperParser
from langchain_core.documents.base import Blob

parser = OpenAIWhisperParser(
    api_key="sk-fxxxxxxxxx",
    response_format="srt",
    temperature=0,
)

list(parser.lazy_parse(Blob.from_path("path_to_file.m4a")))
```
2024-08-19 09:36:42 -04:00
Thin red line 未来产品经理
0f7b8adddf fix issue: cannot use document_variable_name to override context in create_stuff_documents_chain (#25531)
- **Description:** Add document_variable_name in the function
_validate_prompt in create_stuff_documents_chain.
- **Issue:** According to the description of the
create_stuff_documents_chain function, the parameter
document_variable_name can be used to override "context" in the
prompt, but _validate_prompt still uses DOCUMENTS_KEY to check whether
the prompt is valid. The value of DOCUMENTS_KEY is always "context", so
even though the user uses document_variable_name to override it, the
code still checks whether "context" is in the prompt and finally reports
an error. So I use document_variable_name in place of DOCUMENTS_KEY; the
default value of document_variable_name is "context", the same as
DOCUMENTS_KEY, but it can be overridden by users.
    - **Dependencies:** none
    - **Twitter handle:** https://x.com/xjr199703
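A minimal sketch of the call this fix enables (the model and prompt are illustrative):

```python
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize these documents: {source_docs}")

# Before this fix, _validate_prompt checked for the literal "context"
# variable and raised an error here even though document_variable_name
# was overridden.
chain = create_stuff_documents_chain(
    ChatOpenAI(),
    prompt,
    document_variable_name="source_docs",
)
```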



---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-19 13:33:19 +00:00
ccurme
09c0823c3a docs: update summarization guides (#25408) 2024-08-19 13:29:25 +00:00
maang-h
32f5147523 docs: Fix QianfanLLMEndpoint and Tongyi input text (#25529)
- **Description:** Fix `QianfanLLMEndpoint` and `Tongyi` input text.
2024-08-19 09:23:09 -04:00
ZhangShenao
4255a30f20 Improvement[Community] Improve api doc for SingleFileFacebookMessengerChatLoader (#25536)
Delete redundant args in api doc
2024-08-19 09:00:21 -04:00
Bagatur
49dea06af1 docs: fix Agent deprecation msg (#25464) 2024-08-18 19:15:52 +00:00
Hassan El Mghari
937b3904eb together[patch]: update base url (#25524)
Updated the Together base URL from `.ai` to `.xyz` since some customers
have reported problems with `.ai`.
2024-08-18 10:48:30 -07:00
gbaian10
bda3becbe7 docs: add prompt to install beautifulsoup4. (#25518)
fix: #25482

- **Description:**
Add a prompt to install beautifulsoup4 in places where `from
langchain_community.document_loaders import WebBaseLoader` is used.
- **Issue:** #25482
2024-08-17 23:23:24 -07:00
gbaian10
f6e6a17878 docs: add prompt to install nltk (#25519)
fix: #25473 

- **Description:** add prompt to install nltk
- **Issue:** #25473
2024-08-17 23:22:49 -07:00
Chengzu Ou
c1bd4e05bc docs: fix Databricks Vector Search demo notebook (#25504)
**Description:** This PR fixes an issue in the demo notebook of
Databricks Vector Search in "Work with Delta Sync Index" section.

**Issue:** N/A

**Dependencies:** N/A

---------

Co-authored-by: Chengzu Ou <chengzu.ou@databrick.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-16 20:24:30 +00:00
Isaac Francisco
a2e90a5a43 add embeddings integration tests (#25508) 2024-08-16 13:20:37 -07:00
Bagatur
a06818a654 openai[patch]: update core dep (#25502) 2024-08-16 18:30:17 +00:00
Bagatur
df98552b6f core[patch]: Release 0.2.33 (#25498) 2024-08-16 11:18:54 -07:00
ccurme
b83f1eb0d5 core, partners: implement standard tracing params for LLMs (#25410) 2024-08-16 13:18:09 -04:00
Bagatur
9f0c76bf89 openai[patch]: Release 0.1.22 (#25496) 2024-08-16 16:53:04 +00:00
ccurme
01ecd0acba openai[patch]: fix json mode for Azure (#25488)
https://github.com/langchain-ai/langchain/issues/25479
https://github.com/langchain-ai/langchain/issues/25485

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-16 09:50:50 -07:00
Eugene Yurtsev
1fd1c1dca5 docs: use .invoke rather than __call__ in openai integration notebook (#25494)
Documentation should be using .invoke()
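For reference, the preferred style (a sketch; the model class is illustrative):

```python
from langchain_openai import OpenAI

llm = OpenAI()
llm.invoke("Tell me a joke")  # rather than the deprecated llm("Tell me a joke")
```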
2024-08-16 15:59:18 +00:00
Bagatur
253ceca76a docs: fix mimetype parser docstring (#25463) 2024-08-15 16:16:52 -07:00
Eugene Yurtsev
e18511bb22 core[minor], anthropic[patch]: Upgrade @root_validator usage to be consistent with pydantic 2 (#25457)
anthropic: Upgrade `@root_validator` usage to be consistent with
pydantic 2
core: support looking up multiple keys from env in from_env factory
2024-08-15 20:09:34 +00:00
Eugene Yurtsev
34da8be60b pinecone[patch]: Upgrade @root_validators to be consistent with pydantic 2 (#25453)
Upgrade root validators for pydantic 2 migration
2024-08-15 19:45:14 +00:00
Eugene Yurtsev
b297af5482 voyageai[patch]: Upgrade root validators for pydantic 2 (#25455)
Update @root_validators to be consistent with pydantic 2 semantics
2024-08-15 15:30:41 -04:00
Eugene Yurtsev
4cdaca67dc ai21[patch]: Upgrade @root_validators for pydantic 2 migration (#25454)
Upgrade @root_validators usage to match pydantic 2 semantics
2024-08-15 14:54:08 -04:00
Eugene Yurtsev
d72a08a60d groq[patch]: Update root validators for pydantic 2 migration (#25402) 2024-08-15 18:46:52 +00:00
Leonid Ganeline
8eb63a609e docs: arxiv page update (#25450)
Added arXiv papers that use `LangGraph` or `LangSmith`. Improved the
page formatting.
2024-08-15 14:30:35 -04:00
Isaac Francisco
5150ec3a04 [experimental]: minor fix to open assistants code (#24682) 2024-08-15 17:50:57 +00:00
Bagatur
2b4fbcb4b4 docs: format oai embeddings docstring (#25448) 2024-08-15 16:57:54 +00:00
Eugene Yurtsev
eb3870e9d8 fireworks[patch]: Upgrade @root_validators to be pydantic 2 compliant (#25443)
Update @root_validators to be pydantic 2 compliant
2024-08-15 16:56:48 +00:00
William FH
75ae585deb Merge support for group manager (#25360) 2024-08-15 09:56:31 -07:00
Eugene Yurtsev
b7c070d437 docs[patch]: Update code that checks API keys (#25444)
Check whether the API key is already in the environment

Update:

```python
import getpass
import os

os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = getpass.getpass("Enter your Databricks access token: ")
```

To:

```python
import getpass
import os

os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
if "DATABRICKS_TOKEN" not in os.environ:
    os.environ["DATABRICKS_TOKEN"] = getpass.getpass(
        "Enter your Databricks access token: "
    )
```

grit migration:

```
engine marzano(0.1)
language python

`os.environ[$Q] = getpass.getpass("$X")` as $CHECK where {
    $CHECK <: ! within if_statement(),
    $CHECK => `if $Q not in os.environ:\n    $CHECK`
}
```
2024-08-15 12:52:37 -04:00
Bagatur
60b65528c5 docs: fix api ref mod links in pkg page (#25447) 2024-08-15 16:52:12 +00:00
Eugene Yurtsev
2ef9d12372 mistralai[patch]: Update more @root_validators for pydantic 2 compatibility (#25446)
Update @root_validators in mistralai integration for pydantic 2 compatibility
2024-08-15 12:44:42 -04:00
Eugene Yurtsev
6910b0b3aa docs[patch]: Fix integration notebook for Fireworks llm (#25442)
Fix integration notebook
2024-08-15 12:42:33 -04:00
Eugene Yurtsev
831708beb7 together[patch]: Update @root_validator for pydantic 2 compatibility (#25423)
This PR updates usage of @root_validator to be compatible with pydantic 2.
2024-08-15 11:27:42 -04:00
Eugene Yurtsev
a114255b82 ai21[patch]: Update @root_validators for pydantic2 migration (#25401)
Update @root_validators for pydantic 2 migration.
2024-08-15 11:26:44 -04:00
Eugene Yurtsev
6f68c8d6ab mistralai[patch]: Update root validator for compatibility with pydantic 2 (#25403) 2024-08-15 11:26:24 -04:00
ccurme
8afbab4cf6 langchain[patch]: deprecate various chains (#25310)
- [x] NatbotChain: move to community, deprecate langchain version.
Update to use `prompt | llm | output_parser` instead of LLMChain.
- [x] LLMMathChain: deprecate + add langgraph replacement example to API
ref
- [x] HypotheticalDocumentEmbedder (retriever): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] FlareChain: update to use `prompt | llm | output_parser` instead
of LLMChain
- [x] ConstitutionalChain: deprecate + add langgraph replacement example
to API ref
- [x] LLMChainExtractor (document compressor): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] LLMChainFilter (document compressor): update to use `prompt | llm
| output_parser` instead of LLMChain
- [x] RePhraseQueryRetriever (retriever): update to use `prompt | llm |
output_parser` instead of LLMChain
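The replacement pattern, as a minimal sketch (the prompt and model are illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Rephrase the user's query: {question}")

# Instead of LLMChain(llm=ChatOpenAI(), prompt=prompt):
chain = prompt | ChatOpenAI() | StrOutputParser()
rephrased = chain.invoke({"question": "where r the docs"})
```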
2024-08-15 10:49:26 -04:00
Luke
66e30efa61 experimental: Fix divide by 0 error (#25439)
Within the semantic chunker, when calling `_threshold_from_clusters`,
there is the possibility of a divide-by-zero error if
`number_of_chunks` is equal to the length of `distances`.

The fix simply implements a check for whether these values match,
preventing the error and enabling chunking to continue.
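Illustratively, the guard looks something like this (a sketch with hypothetical names, not the actual diff):

```python
def safe_threshold(distances: list, number_of_chunks: int) -> float:
    """Sketch only: guard the interpolation that divides by
    (len(distances) - number_of_chunks)."""
    if number_of_chunks == len(distances):
        # Hypothetical fallback; the actual fix may differ.
        return float(max(distances))
    # The denominator below is now guaranteed to be nonzero.
    return (max(distances) - min(distances)) / (len(distances) - number_of_chunks)
```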
2024-08-15 14:46:30 +00:00
ccurme
ba167dc158 community[patch]: update connection string in azure cosmos integration test (#25438) 2024-08-15 14:07:54 +00:00
Eugene Yurtsev
44f69063b1 docs[patch]: Fix a few typos in the chat integration docs for TogetherAI (#25424)
Fix a few minor typos
2024-08-15 09:48:36 -04:00
Isaac Francisco
f18b77fd59 [docs]: pdf loaders (#25425) 2024-08-14 21:44:57 -07:00
Isaac Francisco
966b408634 [docs]: doc loader changes (#25417) 2024-08-14 19:46:33 -07:00
ccurme
bd261456f6 langchain: bump core to 0.2.32 (#25421) 2024-08-15 00:00:42 +00:00
Bagatur
ec8ffc8f40 core[patch]: Release 0.2.32 (#25420) 2024-08-14 15:56:56 -07:00
Bagatur
2494cecabf core[patch]: tool import fix (#25419) 2024-08-14 22:54:13 +00:00
ccurme
df632b8cde langchain: bump min core version (#25418) 2024-08-14 22:51:35 +00:00
ccurme
1050e890c6 langchain: release 0.2.14 (#25416)
Fixes https://github.com/langchain-ai/langchain/issues/25413
2024-08-14 22:29:39 +00:00
Isaac Francisco
c4779f5b9c [docs]: sitemaploader update (#25363) 2024-08-14 15:27:40 -07:00
gbaian10
0a99935794 docs: remove the extra period in docstring (#25414)
Remove the period after the hyperlink in the docstring of
BaseChatOpenAI.with_structured_output.

I have repeatedly copied the extra period at the end of the hyperlink,
which results in a "Page not found" page when pasted into the browser.
2024-08-14 18:07:15 -04:00
Isaac Francisco
63aba3fe5b [docs]: link fix directory loader (#25411) 2024-08-14 20:58:54 +00:00
Bagatur
dc80be5efe docs: fix deprecated functions table (#25409) 2024-08-14 12:25:39 -07:00
Erick Friis
ab29ee79a3 docs: fix tool index (#25404) 2024-08-14 18:36:41 +00:00
Werner van der Merwe
1d3f7231b8 fix: typo where github should be gitlab (#25397)
**PR title**: "GitLabToolkit: fix typo"
    - **Description:** fix typo where GitHub should have been GitLab
    - **Dependencies:** None
2024-08-14 18:36:25 +00:00
Bagatur
a58d4ba340 core[patch]: Release 0.2.31 (#25388) 2024-08-14 11:26:49 -07:00
Bagatur
d178fb9dc3 docs: fix api ref package tables (#25400) 2024-08-14 10:40:16 -07:00
Bagatur
414154fa59 experimental[patch]: refactor rl chain structure (#25398)
can't have a class and a function with the same name but different
capitalization in the same file for API reference building
2024-08-14 17:09:43 +00:00
Flávio Knob
94c9cb7321 Update document_loader_custom.ipynb (#25393)
Fix typo

2024-08-14 12:33:21 -04:00
Jacob Lee
012929551c docs[patch]: Hide deprecated integration pages (#25389) 2024-08-14 09:17:39 -07:00
Bagatur
63c483ea01 standard-tests: import fix (#25395) 2024-08-14 09:13:56 -07:00
Bagatur
eec7bb4f51 anthropic[patch]: Release 0.1.23 (#25394) 2024-08-14 09:03:39 -07:00
Flávio Knob
f0f125dac7 Update document_loader_custom.ipynb (#25391)
Fix typo and some `callout` tags

2024-08-14 15:07:42 +00:00
Eugene Yurtsev
f4196f1fb8 ollama[patch]: Update extra in ollama package (#25383)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.
2024-08-14 10:30:01 -04:00
Chengyu Yan
d0ad713937 core: fix issue #24660, solve ValueError error messages when using model with history (#25183)
- **Description:**
This PR resolves `ValueError` error messages when using a model with
history.
Details in #24660.
#22933 caused
`langchain_core.runnables.history.RunnableWithMessageHistory._get_output_messages`
to miss the type check of `output_val` when `output_val` is `False`. After
running `RunnableWithMessageHistory._is_not_async`, `output` is `False`.

249945a572/libs/core/langchain_core/runnables/history.py (L323-L334)

15a36dd0a2/libs/core/langchain_core/runnables/history.py (L461-L471)
~~I suggest that `_get_output_messages` return empty list when
`output_val == False`.~~

- **Issue**:
  - #24660

- **Dependencies:** No change.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-08-14 14:26:22 +00:00
Jacob Lee
ddd7919f6a docs[patch]: Add conceptual guide links to integration index pages (#25387) 2024-08-14 07:14:24 -07:00
Bagatur
493e474063 docs: updated API reference (#25172)
- Move the API reference into the vercel build
- Update api reference organization and styling
2024-08-14 07:00:17 -07:00
Leonid Ganeline
4a812e3193 docs: integrations references update (#25217)
Added missing provider pages. Fixed formats and added descriptions and
links.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-14 13:58:38 +00:00
Eugene Yurtsev
5f5e8c9a60 huggingface[patch], pinecone[patch], fireworks[patch], mistralai[patch], voyageai[patch], togetherai[patch]: convert Pydantic extras to literals (#25384)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.

- fireworks
- voyage ai
- mistralai
- together ai
- hugging face
- pinecone
2024-08-14 09:55:30 -04:00
Eugene Yurtsev
d00176e523 openai[patch]: Update extra to match pydantic 2 (#25382)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.
2024-08-14 09:55:18 -04:00
Eugene Yurtsev
dc51cc5690 core[minor]: Prevent PydanticOutputParser from encoding schema as ASCII (#25386)
This allows users to provide parameter descriptions in the pydantic
models in other languages.

Continuing this PR: https://github.com/langchain-ai/langchain/pull/24809
2024-08-14 13:54:31 +00:00
ccurme
27690506d0 multiple: update removal targets (#25361) 2024-08-14 09:50:39 -04:00
Ikko Eltociear Ashimine
4029f5650c docs: update clarifai.ipynb (#25373)
Intialize -> Initialize
2024-08-14 09:20:17 -04:00
Erick Friis
10e6725a7e docs: tools index table (#25370) 2024-08-14 02:38:03 +00:00
Harrison Chase
967b6f21f6 docs: improve document loaders index (#25365)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-14 01:48:48 +00:00
Erick Friis
4a78be7861 docs: remove sidebar comment (#25369) 2024-08-14 01:47:12 +00:00
Eugene Yurtsev
d6c180996f docs[patch]: Fix typo in CohereEmbeddings integration docs (#25367)
Fix typo
2024-08-14 01:18:54 +00:00
Eugene Yurtsev
93dcc47463 docs: Partial integration update for cohere embeddings (#25250)
This can be finished after the following issue is resolved:

https://github.com/langchain-ai/langchain-cohere/issues/81

Related to: https://github.com/langchain-ai/langchain/issues/24856

```json
[
  {
    "provider": "cohere",
    "js": true,
    "local": false,
    "serializable": false
  }
]
```

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-14 00:53:13 +00:00
Eugene Yurtsev
27def6bddb docs[patch]: Update integration docs for AzureOpenAIEmbeddings (#25311)
https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:33:13 +00:00
Eugene Yurtsev
b4e3bdb714 docs: Update nomic AI embeddings integration docs (#25308)
Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:32:07 +00:00
Eugene Yurtsev
f82c3f622a docs: Update AI21Embeddings Integration docs (#25298)
Update AI21 Integration docs

Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:30:16 +00:00
Eugene Yurtsev
d55d99222b docs: update integration docs for mistral ai embedding model (#25253)
Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
[
  {
    "provider": "mistralai",
    "js": true,
    "local": false,
    "serializable": false,
    "native_async": true
  }
]
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:25:36 +00:00
Eugene Yurtsev
0f6217f507 docs: together ai embeddings integration docs (#25252)
Update together AI embedding integration docs

Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
[
  {
    "provider": "together",
    "js": true,
    "local": false,
    "serializable": false
  }
]
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:24:02 +00:00
Eugene Yurtsev
8645a49f31 docs: Update integration docs for OllamaEmbeddingsModel (#25314)
Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:23:05 +00:00
Eugene Yurtsev
a4ef830480 docs: update integration docs for openai embeddings (#25249)
Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
{
  "provider": "openai",
  "js": true,
  "local": false,
  "serializable": false,
  "async_native": true
}
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:21:36 +00:00
Eugene Yurtsev
b1aed44540 docs: Updating integration docs for Fireworks Embeddings (#25247)
Providers:
* fireworks

See related issue:
* https://github.com/langchain-ai/langchain/issues/24856

Features:

```json
[
  {
    "provider": "fireworks",
    "js": true,
    "local": false,
    "serializable": false
  }
]
```

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-13 17:04:18 -07:00
Isaac Francisco
f4ffd692a3 [docs]: standardize doc loader doc strings (#25325) 2024-08-13 23:18:56 +00:00
Isaac Francisco
e0bbb81d04 [docs]: standardize tool docstrings (#25351) 2024-08-13 16:10:00 -07:00
Erick Friis
d5b548b4ce docs: index pages, sidebars (#25316) 2024-08-13 15:52:51 -07:00
Isaac Francisco
0478f7f5e4 [docs]: LLM integration pages (#25005) 2024-08-13 14:50:45 -07:00
thedavgar
9d08369442 community: fix AzureSearch vectorstore asyncronous methods (#24921)
**Description**
Fix the asynchronous methods that retrieve documents from the AzureSearch
VectorStore. The previous changes from [this
commit](ffe6ca986e)
created similar code for the synchronous methods and the asynchronous
ones, but the asynchronous client returns an asynchronous iterator,
"AsyncSearchItemPaged", as noted in issue #24740.
To solve this issue, the synchronous iterators in asynchronous methods
were changed to asynchronous iterators.
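The shape of the change, as a minimal sketch (names are illustrative, not the exact AzureSearch internals):

```python
async def search_async(async_client, query: str) -> list:
    # Before: results were consumed with a sync loop, which fails because
    # the async client's search() yields an AsyncSearchItemPaged:
    #     results = [r for r in async_client.search(search_text=query)]
    # After: await the call and iterate asynchronously.
    paged = await async_client.search(search_text=query)
    return [r async for r in paged]
```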

@chrislrobert said in [this
comment](https://github.com/langchain-ai/langchain/issues/24740#issuecomment-2254168302)
that there was still a flaw due to the `with` blocks that close the client
after each call. I removed these `with` blocks from the `async_client`,
following the same pattern as the sync `client`.

In order to close up the connections, a `__del__` method is included to
gently close the clients once the vectorstore object is destroyed.

**Issue:** #24740 and #24064
**Dependencies:** No new dependencies for this change

**Example notebook:** I created a notebook just to test that the changes
work and give the same results as the synchronous methods for vector and
hybrid search. With these changes, the asynchronous methods in the
retriever work as well.

![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67)


**Lint and test**: Passes the tests and the linter
2024-08-13 14:20:51 -07:00
Isaac Francisco
6bc451b942 [docs]: merge tool/toolkit duplicates (#25197) 2024-08-13 12:19:17 -07:00
Fedor Nikolaev
2b15518c5f community: add args_schema to SearxSearchResults tool (#25350)
This adds `args_schema` member to `SearxSearchResults` tool. This member
is already present in the `SearxSearchRun` tool in the same file.

I was getting `TypeError: Type is not JSON serializable:
AsyncCallbackManagerForToolRun` thrown in the langserve playground
when I was using the `SearxSearchResults` tool as part of a chain there.
This fixes the issue, so the error is no longer raised.

This is an example langserve app that was giving me the error, but it
works properly after the proposed fix:
```python
#!/usr/bin/env python

from fastapi import FastAPI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain_community.utilities import SearxSearchWrapper
from langchain_community.tools.searx_search.tool import SearxSearchResults
from langserve import add_routes

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()

s = SearxSearchWrapper(searx_host="http://localhost:8080")

search = SearxSearchResults(wrapper=s)

search_chain = (
    {"context": search, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

app = FastAPI()

add_routes(
    app,
    search_chain,
    path="/chain",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
```
2024-08-13 18:26:09 +00:00
Matt Kandler
b6df3405fb docs: Fix broken link to Runhouse documentation (#25349)
- **Description:** Runhouse recently migrated from Read the Docs to a
self-hosted solution. This PR updates a broken link from the old docs to
www.run.house/docs. Also changed "The Runhouse" to "Runhouse" (it's
cleaner).
- **Issue:** None
- **Dependencies:** None
2024-08-13 18:18:19 +00:00
maang-h
089f5e6cad Standardize SparkLLM (#25239)
- **Description:** Standardize SparkLLM, including:
  - docs (issue #24803)
  - streaming support
  - updated API url
  - model init arg names (issue #20085)
2024-08-13 09:50:12 -04:00
Leonid Ganeline
35e2230f56 docs: integrations references update (#25322)
Added missing provider pages. Fixed formats and added descriptions and
links.
2024-08-13 09:29:51 -04:00
Chen Xiabin
24155aa1ac qianfan generate/agenerate with usage_metadata (#25332) 2024-08-13 09:24:41 -04:00
Christophe Bornet
ebbe609193 Add README for astradb package (#25345)
Similar to
https://github.com/langchain-ai/langchain/blob/master/libs/partners/ibm/README.md
2024-08-13 09:17:23 -04:00
Eugene Yurtsev
f679ed72ca ollama[patch]: Update API Reference for ollama embeddings (#25315)
Update API reference for OllamaEmbeddings
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 21:31:48 -04:00
Erick Friis
2907ab2297 community: release 0.2.12 (#25324) 2024-08-12 23:30:27 +00:00
Erick Friis
06f8bd9946 langchain: release 0.2.13 (#25323) 2024-08-12 22:24:06 +00:00
Erick Friis
252f0877d1 core: release 0.2.30 (#25321) 2024-08-12 22:01:24 +00:00
Eugene Yurtsev
217a915b29 openai: Update API Reference docs for AzureOpenAI Embeddings (#25312)
Update AzureOpenAI Embeddings docs
2024-08-12 19:41:18 +00:00
Eugene Yurtsev
056c7c2983 core[patch]: Update API reference for fake embeddings (#25313)
Issue: https://github.com/langchain-ai/langchain/issues/24856

Using the same template for the fake embeddings in langchain_core as
used in the integrations.
2024-08-12 19:40:05 +00:00
Ben Chambers
1adc161642 community: kwargs for CassandraGraphVectorStore (#25300)
- **Description:** pass kwargs from CassandraGraphVectorStore to
underlying store

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-12 18:01:29 +00:00
Hassan-Memon
deb27d8970 docs: remove unused imports in Conversational RAG tutorial (#25297)
Cleaned up the "Tying it Together" section of the Conversational RAG
tutorial by removing unused imports. This reduces confusion and makes
the code more concise.

LinkedIn handle: [Hassan
Memon](https://www.linkedin.com/in/hassan-memon-a109b3257/)
2024-08-12 13:49:55 -04:00
gbaian10
5efd0fe9ae docs: Change SqliteSaver to MemorySaver (#25306)
fix: #25137

`SqliteSaver.from_conn_string()` has been changed to a `contextmanager`
method in `langgraph >= 0.2.0`, so the original usage is no longer
applicable.

Following the modification described in
<https://github.com/langchain-ai/langgraph/pull/1271#issue-2454736415>,
replace `SqliteSaver` with `MemorySaver`.
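A minimal sketch of the replacement (the compile call is illustrative):

```python
from langgraph.checkpoint.memory import MemorySaver

# Instead of: with SqliteSaver.from_conn_string(":memory:") as memory: ...
memory = MemorySaver()
# app = workflow.compile(checkpointer=memory)
```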
2024-08-12 13:45:32 -04:00
Eugene Yurtsev
1c9917dfa2 fireworks[patch]: Fix doc-string for API Reference (#25304) 2024-08-12 17:16:13 +00:00
Eugene Yurtsev
ccff1ba8b8 ai21[patch]: Update API reference documentation (#25302)
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 13:15:27 -04:00
Eugene Yurtsev
53ee5770d3 fireworks: Add APIReference for the FireworksEmbeddings model (#25292)
Add API Reference documentation for the FireworksEmbedding model.

Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 13:13:43 -04:00
Eugene Yurtsev
8626abf8b5 togetherai[patch]: Update API Reference for together AI embeddings model (#25295)
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 17:12:28 +00:00
Eugene Yurtsev
1af8456a2c mistralai[patch]: Docs Update APIReference for MistralAIEmbeddings (#25294)
Update API Reference for MistralAI embeddings

Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 15:25:37 +00:00
Eugene Yurtsev
0a3500808d openai[patch]: Docs fix RST formatting in OpenAIEmbeddings (#25293) 2024-08-12 11:24:35 -04:00
Eugene Yurtsev
ee8a585791 openai[patch]: Add API Reference docs to OpenAIEmbeddings (#25290)
Issue: [24856](https://github.com/langchain-ai/langchain/issues/24856)
2024-08-12 14:53:51 +00:00
ccurme
e77eeee6ee core[patch]: add standard tracing params for retrievers (#25240) 2024-08-12 14:51:59 +00:00
Mohammad Mohtashim
9927a4866d [Community] - Added bind_tools and with_structured_output for ChatZhipuAI (#23887)
- **Description:** This PR implements the `bind_tools` functionality for
ChatZhipuAI as requested by the user. ChatZhipuAI models support tool
calling according to the `OpenAI` tool format, as outlined in their
official documentation [here](https://open.bigmodel.cn/dev/api#glm-4).
- **Issue:** #23868
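A minimal sketch of the added capability (the model name and schema are illustrative; the api_key parameter name is an assumption):

```python
from langchain_community.chat_models import ChatZhipuAI
from langchain_core.pydantic_v1 import BaseModel


class Joke(BaseModel):
    """A joke with a setup and a punchline."""

    setup: str
    punchline: str


llm = ChatZhipuAI(model="glm-4", api_key="...")  # parameter names assumed
structured_llm = llm.with_structured_output(Joke)
joke = structured_llm.invoke("Tell me a joke about cats")
```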

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-12 14:11:43 +00:00
Hassan-Memon
420534c8ca docs: Replaced SqliteSaver with MemorySaver and updated installation instructions (#25285)
Updated installation instructions to match LangGraph v2 documentation.
Corrected code snippet to prevent validation errors.

- **Description:** Updated the Conversational RAG tutorial to correct
the checkpointer example by replacing `SqliteSaver` with `MemorySaver`.
Added installation instructions for `langgraph-checkpoint-memory` to
match LangGraph v2 documentation and prevent validation errors.
    - **Issue:** N/A
    - **Dependencies:** `langgraph-checkpoint-memory`
    - **Twitter handle:** N/A

- [ ] **Add tests and docs**: 
  1. No new integration tests are required.
  2. Updated documentation in the Conversational RAG tutorial.

2024-08-12 09:24:51 -04:00
Yunus Emre Özdemir
794f28d4e2 docs: document upstash vector namespaces (#25289)
**Description:** This PR rearranges the examples in Upstash Vector
integration documentation to describe how to use namespaces and improve
the description of metadata filtering.
2024-08-12 09:17:11 -04:00
JasonJ
f28ae20b81 docs: pip install bug fixed (#25287)
- **Description:** Fixing package install bug in cookbook
- **Issue:** zsh:1: no matches found: unstructured[all-docs]
- **Dependencies:** N/A
2024-08-12 05:12:44 +00:00
Soichi Sumi
9f0eda6a18 docs: Fix link for API reference of Gmail Toolkit (#25286)
- **Description:** Fix link for API reference of Gmail Toolkit
- **Issue:** I've just found this issue while I'm reading the doc
- **Dependencies:** N/A
- **Twitter handle:** [@soichisumi](https://x.com/soichisumi)

2024-08-12 05:12:31 +00:00
Anush
472527166f qdrant: Update API reference link and install command (#25245)
## Description

As the title says: the current API reference links to the deprecated
class.
2024-08-11 16:54:14 -04:00
Aryan Singh
074fa0db73 docs: Fixed grammar error in functions.ipynb (#25255)
**Description**: Grammar error in functions.ipynb
**Issue**: #25222
2024-08-11 20:53:27 +00:00
gbaian10
4fd1efc48f docs: update "Build an Agent" Installation Hint in agents.ipynb (#25263)
fix #25257
2024-08-11 16:51:34 -04:00
gbaian10
aa2722cbe2 docs: update numbering of items in docstring (#25267)
A problem similar to #25093 .

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-11 20:50:24 +00:00
Maddy Adams
a82c0533f2 langchain: default to langsmith sdk for pulling prompts, fallback to langchainhub (#24156)
**Description:** Deprecating langchainhub, replacing with langsmith sdk
2024-08-11 13:30:52 -07:00
maang-h
bc60cddc1b docs: Fix ChatBaichuan, QianfanChatEndpoint, ChatSparkLLM, ChatZhipuAI docs (#25265)
- **Description:** Fix some chat models docs, include:
  - ChatBaichuan
  - QianfanChatEndpoint
  - ChatSparkLLM
  - ChatZhipuAI
2024-08-11 16:23:55 -04:00
ZhangShenao
43deed2a95 Improvement[Embeddings] Add dimension support to ZhipuAIEmbeddings (#25274)
- The `embedding-3` and later models of Zhipu AI support specifying the
dimensions parameter of the embedding. Ref:
https://bigmodel.cn/dev/api#text_embedding-3 .
- Add a test case for the `embedding-3` model by assigning dimensions.
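A minimal sketch of the new option (the dimension value and api_key parameter name are assumptions):

```python
from langchain_community.embeddings import ZhipuAIEmbeddings

emb = ZhipuAIEmbeddings(
    model="embedding-3",
    dimensions=1024,   # supported by embedding-3 and later models
    api_key="...",     # parameter name assumed
)
vector = emb.embed_query("hello")
```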
2024-08-11 16:20:37 -04:00
maang-h
9cd608efb3 docs: Standardize OpenAI Docs (#25280)
- **Description:** Standardize OpenAI Docs
- **Issue:** the issue #24803

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-11 20:20:16 +00:00
Bagatur
fd546196ef openai[patch]: Release 0.1.21 (#25269) 2024-08-10 16:37:31 -07:00
Eugene Yurtsev
6dd9f053e3 core[patch]: Deprecating beta upsert APIs in vectorstore (#25069)
This PR deprecates the beta upsert APIs in vectorstore.

We'll introduce them in a V2 abstraction instead to keep the existing
vectorstore implementations lighter weight.

The main problem with the existing APIs is that it's a bit more
challenging to implement the correct behavior with respect to IDs, since
an ID can be present both in the function signature and as an optional
attribute on the document object.

But VectorStores that pass the standard tests should have implemented
the semantics properly!

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-09 17:17:36 -04:00
Bagatur
ca9dcee940 standard-tests[patch]: test ToolMessage.status="error" (#25210) 2024-08-09 13:00:14 -07:00
Eugene Yurtsev
dadb6f1445 cli[patch]: Update integration template for embedding models (#25248)
Update integration template for embedding models
2024-08-09 14:28:57 -04:00
Eugene Yurtsev
b6f0174bb9 community[patch],core[patch]: Update EdenaiTool root_validator and add unit test in core (#25233)
This PR gets rid of the `root_validators(allow_reuse=True)` logic used in
the EdenAI Tool in preparation for the pydantic 2 upgrade.
- add another test to secret_from_env_factory
2024-08-09 15:59:27 +00:00
blueoom
c3ced4c6ce core[patch]: use time.monotonic() instead time.time() in InMemoryRateLimiter
**Description:**

The `_consume()` method of
core.rate_limiters.InMemoryRateLimiter gets its time point using
time.time(), which can be affected by the system clock moving backwards.
It is therefore recommended to use the monotonically increasing
time.monotonic() to obtain the time.

```python
        with self._consume_lock:
            now = time.time()  # time.time() -> time.monotonic()

            # initialize on first call to avoid a burst
            if self.last is None:
                self.last = now

            elapsed = now - self.last  # with time.time(), elapsed may be negative if the system clock moves backwards

```
2024-08-09 11:31:20 -04:00
Eugene Yurtsev
bd6c31617e community[patch]: Remove more @allow_reuse=True validators (#25236)
Remove some additional allow_reuse=True usage in @root_validators.
2024-08-09 11:10:27 -04:00
Eugene Yurtsev
6e57aa7c36 community[patch]: Remove usage of @root_validator(allow_reuse=True) (#25235)
Remove usage of @root_validator(allow_reuse=True)
2024-08-09 10:57:42 -04:00
thiswillbeyourgithub
a2b4c33bd6 community[patch]: FAISS: ValueError mentions normalize_score_fn instead of relevance_score_fn (#25225)
- **Description:** When faiss.py has a `None` relevance_score_fn, it
raises a ValueError that says a normalize_fn_score argument is needed.

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-09 14:40:29 +00:00
ccurme
4825dc0d76 langchain[patch]: add deprecations (#24792) 2024-08-09 10:34:43 -04:00
ccurme
02300471be langchain[patch]: extended-tests: drop logprobs from OAI expected config (#25234)
Following https://github.com/langchain-ai/langchain/pull/25229
2024-08-09 14:23:11 +00:00
Shivendra Soni
66b7206ab6 community: Add llm-extraction option to FireCrawl Document Loader (#25231)
**Description:** This minor PR aims to add `llm_extraction` to the
Firecrawl loader. This feature is supported by the API and Python SDK,
but the langchain loader omits adding it to the response.
**Twitter handle:** [scalable_pizza](https://x.com/scalablepizza)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-09 13:59:10 +00:00
blaufink
c81c77b465 partners: fix of issue #24880 (#25229)
- Description: As described in the related issue: there is an error
occurring when using langchain-openai>=0.1.17, which can be attributed to
the following PR: #23691
Here, the parameter logprobs is added to requests by default.
However, AzureOpenAI takes issue with this parameter, as stated here:
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python-new&pivots=programming-language-chat-completions
-> "If you set any of these parameters, you get an error."
Therefore, this PR changes the default value of the logprobs parameter to
None instead of False. This results in it being filtered out before the
request is sent.
- Issue: #24880
- Dependencies: /

Co-authored-by: blaufink <sebastian.brueckner@outlook.de>
2024-08-09 13:21:37 +00:00
ccurme
3b7437d184 docs: update integration api refs (#25195)
- [x] toolkits
- [x] retrievers (in this repo)
2024-08-09 12:27:32 +00:00
Bagatur
91ea4b7449 infra: avoid orjson 3.10.7 in vercel build (#25212) 2024-08-09 02:23:18 +00:00
Isaac Francisco
652b3fa4a4 [docs]: playwright fix (#25163) 2024-08-08 17:13:42 -07:00
Bagatur
7040013140 core[patch]: fix deprecation pydantic bug (#25204)
#25004 is incompatible with pydantic < 1.10.17. This introduces a fix.
2024-08-08 16:39:38 -07:00
Isaac Francisco
dc7423e88f [docs]: standardizing document loader integration pages (#25002) 2024-08-08 16:33:09 -07:00
Casey Clements
25f2e25be1 partners[patch]: Mongodb Retrievers - CI final touches. (#25202)
## Description

Contains 2 updates for integration tests to run on LangChain's CI.
Addendum to #25057 to get the release GitHub action to succeed.
2024-08-08 15:38:31 -07:00
Bagatur
786ef021a3 docs: redirect toolkits (#25190) 2024-08-08 14:54:11 -07:00
Eugene Yurtsev
429a0ee7fd core[minor]: Add factory for looking up secrets from the env (#25198)
Add factory method for looking up secrets from the env.
2024-08-08 16:41:58 -04:00
Erick Friis
da9281feb2 cli: release 0.0.29 (#25196) 2024-08-08 12:52:49 -07:00
Erick Friis
c6ece6a96d core: autodetect more ls params (#25044)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-08 12:44:21 -07:00
Eugene Yurtsev
86355640c3 experimental[patch]: Use get_fields adapter (#25193)
Replace all usages of __fields__ with the get_fields adapter merged into
langchain_core.

Code mod generated using the following grit pattern:

```
engine marzano(0.1)
language python


`$X.__fields__` => `get_fields($X)` where {
    add_import(source="langchain_core.utils.pydantic", name="get_fields")
}
```
2024-08-08 15:10:11 -04:00
Eugene Yurtsev
b9f65e5038 experimental[patch]: Migrate pydantic extra to literals (#25194)
Migrate pydantic extra to literals

Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1.

```python
from pydantic.v1 import BaseModel

class Foo(BaseModel, extra="forbid"):
    x: int

Foo(x=5, y=1)
```

And 


```python
from pydantic.v1 import BaseModel

class Foo(BaseModel):
    x: int

    class Config:
      extra = "forbid"

Foo(x=5, y=1)
```


## Enum -> literal using grit pattern:

```
engine marzano(0.1)
language python
or {
    `extra=Extra.allow` => `extra="allow"`,
    `extra=Extra.forbid` => `extra="forbid"`,
    `extra=Extra.ignore` => `extra="ignore"`
}
```

Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)


## Sort attributes in Config:

```
engine marzano(0.1)
language python


function sort($values) js {
    return $values.text.split(',').sort().join("\n");
}


class_definition($name, $body) as $C where {
    $name <: `Config`,
    $body <: block($statements),
    $values = [],
    $statements <: some bubble($values) assignment() as $A where {
        $values += $A
    },
    $body => sort($values),
}

```
2024-08-08 19:05:54 +00:00
Eugene Yurtsev
30fb345342 core[minor]: Add from_env utility (#25189)
Add a utility that can be used as a default factory

The goal will be to start migrating some of the pydantic models to use
`from_env` as a default factory where possible.

```python
from pydantic import Field, BaseModel
from langchain_core.utils import from_env

class Foo(BaseModel):
    name: str = Field(default_factory=from_env('HELLO'))
```
2024-08-08 14:52:35 -04:00
Eugene Yurtsev
98779797fe community[patch]: Use get_fields adapter for pydantic (#25191)
Replace all usages of __fields__ with the get_fields adapter merged into
langchain_core.

Code mod generated using the following grit pattern:

```
engine marzano(0.1)
language python


`$X.__fields__` => `get_fields($X)` where {
    add_import(source="langchain_core.utils.pydantic", name="get_fields")
}
```
2024-08-08 14:43:09 -04:00
Rajendra Kadam
663638d6a8 community[minor]: [SharePointLoader] Load extended metadata for the root folder (#24872)
- **Title:** [SharePointLoader] Load extended metadata for the root
folder
- **Description:** 
    - Ensure extended metadata loads correctly for the root folder.
- Cleanup: Refactor SharePointLoader to remove unused fields (`file_id` &
`site_id`).
- **Dependencies:** NA
- **Add tests and docs:** NA
2024-08-08 14:39:16 -04:00
Eugene Yurtsev
2f209d84fa core[patch]: Add pydantic get_fields adapter (#25187)
Add adapter to get fields
2024-08-08 17:47:42 +00:00
Eugene Yurtsev
c72e522e96 langchain[patch]: Upgrade pydantic extra (#25186)
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1, using the same examples and
grit migration patterns as #25194 above.
2024-08-08 17:27:27 +00:00
Eugene Yurtsev
bf5193bb99 community[patch]: Upgrade pydantic extra (#25185)
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1, using the same examples and
grit migration patterns as #25194 above.
2024-08-08 17:20:39 +00:00
Isaac Francisco
11adc09e02 [docs]: change rag reference in vector store pages (#25125) 2024-08-08 10:08:14 -07:00
Anush
6b32810b68 qdrant: Update doc with usage snippets (#25179)
## Description

This PR adds back snippets demonstrating sparse and hybrid retrieval in
the Qdrant notebook.

Without the snippets, it's hard to grok the usage.
2024-08-08 12:58:26 -04:00
Eugene Yurtsev
3da2713172 docs: Update pydantic compatibility (#25145)
Update pydantic compatibility
2024-08-08 12:10:44 -04:00
Eugene Yurtsev
425f6ffa5b core[patch]: Fix aindex API (#25155)
A previous PR accidentally broke the aindex API by renaming the positional
argument vectorstore to vector_store. This PR reverts that change.
2024-08-08 12:08:18 -04:00
Isaac Francisco
15a36dd0a2 [docs]: combine tools and toolkits (#25158) 2024-08-08 08:59:02 -07:00
ololand
249945a572 Update polygon.py for business subscription (#25085)
For a Business subscription the status is STOCKSBUSINESS, not OK

2024-08-08 15:28:41 +00:00
ccurme
59b8850909 groq[patch]: update rate limit in integration tests (#25177)
Divide by ~2 to account for testing python 3.8 and 3.12 in parallel.
2024-08-08 13:33:25 +00:00
Chad Juliano
4828c441a7 docs: Update notebook name for Kinetica (#25149)
**Description:** Change notebook description in documentation.
**Issue:** N/A
**Dependencies:** N/A
2024-08-08 09:27:29 -04:00
Francisco Kurucz
725e4912ae docs: Fix reference to SQL QA migration (#25157)
**Description:** I found that the link to the notebook in the Migration
notes is broken: it pointed to
https://github.com/langchain-ai/langchain/blob/v0.0.250/docs/extras/use_cases/tabular/sql_query.ipynb
and I think this tutorial,
https://github.com/JuanFKurucz/langchain/blob/master/docs/docs/tutorials/sql_qa.ipynb,
is now the best fit for this reference.

**Twitter handle:** @juanfkurucz

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-08 09:26:13 -04:00
ogawa
d895db11d6 community[patch]: gpt-4o-2024-08-06 costs (#25164)
- **Description:** updated OpenAI cost definitions according to the
following:
  - https://openai.com/api/pricing/
- **Twitter handle:** `@ogawa65a`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-08 13:22:11 +00:00
Brace Sproul
d77c7c4236 docs: Fix misspelling of instantiate in docs (#25107) 2024-08-07 15:05:06 -07:00
Eugene Yurtsev
7b1a132aff core[patch]: Add unit tests for Serializable (#25152)
Add a few test cases for serializable (many other test cases are already
covered through runnable tests).
2024-08-07 21:01:36 +00:00
Bagatur
df99b832a7 core[patch]: support Field deprecation (#25004)
![Screenshot 2024-08-02 at 4 23 17
PM](https://github.com/user-attachments/assets/c757e093-877e-4af6-9dcd-984195454158)
2024-08-07 13:57:55 -07:00
ccurme
803eba3163 core[patch]: check for model_fields attribute (#25108)
`__fields__` raises a warning in pydantic v2
2024-08-07 13:32:56 -07:00
Casey Clements
6e9a8b188f mongodb: Add Hybrid and Full-Text Search Retrievers, release 0.2.0 (#25057)
## Description

This pull-request extends the existing vector search strategies of
MongoDBAtlasVectorSearch to include Hybrid (Reciprocal Rank Fusion) and
Full-text via new Retrievers.

There is a small breaking change in the form of the `prefilter` kwarg to
search. For this, and because we have now added a great deal of
features, including programmatic Index creation/deletion since 0.1.0, we
plan to bump the version to 0.2.0.

### Checklist
* Unit tests have been extended
* formatting has been applied
* One mypy error remains which will either go away in CI or be
simplified.
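
For context, a self-contained sketch of Reciprocal Rank Fusion, the fusion scheme named above (the constant k=60 is the conventional default, not necessarily what this integration uses):

```python
from collections import defaultdict
from typing import Dict, List


def rrf(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse several ranked lists of document ids into a single score per id."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # higher rank -> larger contribution
    return dict(scores)


# Fuse a vector-search ranking with a full-text ranking.
fused = rrf([["a", "b", "c"], ["b", "c", "d"]])
print(sorted(fused, key=fused.get, reverse=True))  # ['b', 'c', 'a', 'd']
```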

---------

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-07 20:10:29 +00:00
Isaac Francisco
f337408b0f [docs]: add sidebar for different tool categories (#25065) 2024-08-07 12:57:58 -07:00
Bagatur
0b4608f71e infra: temp skip oai embeddings test (#25148) 2024-08-07 17:51:39 +00:00
Bagatur
a4086119f8 openai[patch]: Release 0.1.21rc2 (#25146) 2024-08-07 16:59:15 +00:00
Bagatur
b4c12346cc core[patch]: Release 0.2.29 (#25126) 2024-08-07 09:50:20 -07:00
Erick Friis
dff83cce66 core[patch]: base language model disable_streaming (#25070)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-07 09:26:21 -07:00
eric-langenberg
130e80b60f docs: rag.ipynb - fixing typo (#25142)
Just changing gpt-3.5 to gpt-4o-mini. That's what's used in the code
examples now. It just didn't get updated in the main text.
2024-08-07 16:02:22 +00:00
Bagatur
09fbce13c5 openai[patch]: ChatOpenAI.with_structured_output json_schema support (#25123) 2024-08-07 08:09:07 -07:00
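A hedged usage sketch of the new json_schema mode (assumes an OpenAI API key in the environment and a model that supports structured outputs):

```python
from pydantic import BaseModel

from langchain_openai import ChatOpenAI


class Joke(BaseModel):
    setup: str
    punchline: str


llm = ChatOpenAI(model="gpt-4o-2024-08-06")
structured_llm = llm.with_structured_output(Joke, method="json_schema")
print(structured_llm.invoke("Tell me a joke about cats"))
```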
maang-h
0ba125c3cd docs: Standardize QianfanLLMEndpoint LLM (#25139)
- **Description:** Standardize QianfanLLMEndpoint LLM, including:
  - docs, the issue #24803 
  - model init arg names, the issue #20085
2024-08-07 10:57:27 -04:00
Eugene Yurtsev
28e0958ff4 core[patch]: Relax rate limit unit tests in terms of timing (#25140)
Relax rate limit unit tests
2024-08-07 14:04:58 +00:00
Eray Eroğlu
a2e9910268 Documentation Update for Upstash Semantic Caching (#25114)
- **PR title**: "Documentation Update: Semantic Caching for Upstash"
  - Docs: LLM caching integrations update

- **Description:** Upstash supports semantic caching, and we would like
to inform you about this
- **Twitter handle:** You can mention eray_eroglu_ if you want to post a
tweet about the PR

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-07 14:02:07 +00:00
Pat Patterson
7e7fcf5b1f community: Fix ValidationError on creating GPT4AllEmbeddings with no gpt4all_kwargs (#25124)
- **Description:** Instantiating `GPT4AllEmbeddings` with no
`gpt4all_kwargs` argument raised a `ValidationError`. Root cause: #21238
added the capability to pass `gpt4all_kwargs` through to the `GPT4All`
instance via `Embed4All`, but broke code that did not specify a
`gpt4all_kwargs` argument.
- **Issue:** #25119 
- **Dependencies:** None
- **Twitter handle:** [`@metadaddy`](https://twitter.com/metadaddy)
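
A minimal before/after sketch (assumes the gpt4all package is installed; the default model is downloaded on first use):

```python
from langchain_community.embeddings import GPT4AllEmbeddings

emb_default = GPT4AllEmbeddings()                  # previously raised ValidationError
emb_custom = GPT4AllEmbeddings(gpt4all_kwargs={})  # explicit kwargs still work
```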
2024-08-07 13:34:01 +00:00
Atanu Dasgupta
04dd8d3b0a Update google_search.ipynb (#25135)
Updated to use langchain_google_community instead, per the latest revision.

2024-08-07 13:30:59 +00:00
ZhangShenao
63d84e93b9 patch[doc] Fix word spelling error (#25128)
Fix word spelling error
2024-08-07 09:16:17 -04:00
Eugene Yurtsev
4d28c70000 core[patch]: Sort Config attributes (#25127)
This PR does an aesthetic sort of the config object attributes. This
will make it a bit easier to go back and forth between pydantic v1 and
pydantic v2 on the 0.3.x branch
2024-08-07 02:53:50 +00:00
Erick Friis
46a47710b0 partners/milvus: release 0.1.4 (#25058) 2024-08-06 16:29:29 -07:00
Erick Friis
35ebd2620c infra,cli: template matching registration (#25110) 2024-08-06 15:29:55 -07:00
ccurme
23c9aba575 groq[patch]: allow warnings during tests (#25105)
Among integration packages in libs/partners, Groq is an exception in
that it errors on warnings.

Following https://github.com/langchain-ai/langchain/pull/25084, Groq
fails with

> pydantic.warnings.PydanticDeprecatedSince20: The `__fields__`
attribute is deprecated, use `model_fields` instead. Deprecated in
Pydantic V2.0 to be removed in V3.0.

Here we update the behavior to no longer fail on warning, which is
consistent with the rest of the packages in libs/partners.
2024-08-06 18:02:20 -04:00
Bagatur
1331e8589c docs: oai chat nit (#25117) 2024-08-06 22:00:42 +00:00
Bagatur
7882d5c978 openai[patch]: Release 0.1.21rc1 (#25116) 2024-08-06 21:50:36 +00:00
Bagatur
70677202c7 core[patch]: Release 0.2.29rc1 (#25115) 2024-08-06 21:36:56 +00:00
Bagatur
78403a3746 core[patch], openai[patch]: enable strict tool calling (#25111)
Introduced
https://openai.com/index/introducing-structured-outputs-in-the-api/
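
A hedged sketch of the new flag (assumes the rc release above and an OpenAI API key in the environment):

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y


llm = ChatOpenAI(model="gpt-4o-2024-08-06")
llm_with_tools = llm.bind_tools([add], strict=True)  # argument schema enforced by the API
print(llm_with_tools.invoke("What is 2 + 3?").tool_calls)
```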
2024-08-06 21:21:06 +00:00
ccurme
5d10139fc7 docs[patch]: add to qa with sources guide (#25112) 2024-08-06 17:08:35 -04:00
Eugene Yurtsev
d283f452cc core[minor]: Add support for DocumentIndex in the index api (#25100)
Support document index in the index api.
2024-08-06 12:30:49 -07:00
Virat Singh
264ab96980 community: Add stock market tools from financialdatasets.ai (#25025)
**Description:**
In this PR, I am adding three stock market tools from
financialdatasets.ai (my API!):
- get balance sheets
- get cash flow statements
- get income statements

Twitter handle: [@virattt](https://twitter.com/virattt)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-06 18:28:12 +00:00
William FH
267855b3c1 Set Context in RunnableSequence & RunnableParallel (#25073) 2024-08-06 11:10:37 -07:00
Naval Chand
71c0698ee4 Added Bedrock 3.5 Sonnet cost details for BedrockAnthropicTokenUsageCallbackHandler (#25104)

Co-authored-by: Naval Chand <navalchand@192.168.1.36>
2024-08-06 17:28:47 +00:00
Isaac Francisco
a72fddbf8d [docs]: vector store integration pages (#24858)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-06 17:20:27 +00:00
Bagatur
2c798622cd docs: runnable docstring space (#25106) 2024-08-06 16:46:50 +00:00
Bagatur
3abf1b6905 docs: versions sidebar (#25061) 2024-08-06 09:23:43 -07:00
maang-h
1028af17e7 docs: Standardize Tongyi (#25103)
- **Description:** Standardize Tongyi LLM,include:
  - docs, the issue #24803
  - model init arg names, the issue #20085
2024-08-06 11:44:12 -04:00
Dobiichi-Origami
061ed250f6 delete the default model value from langchain and discard the need fo… (#24915)
- description: I remove the requirement that `QIANFAN_AK` exist, and the
default model name which langchain uses, because there is already a
default model name in the underlying `qianfan` SDK powering the langchain
component.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-06 14:11:05 +00:00
Eugene Yurtsev
293a4a78de core[patch]: Include dependencies in sys_info (#25076)
`python -m langchain_core.sys_info`

```bash
System Information
------------------
> OS:  Linux
> OS Version:  #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Jun 18 14:36:16 UTC 2
> Python Version:  3.11.4 (main, Sep 25 2023, 10:06:23) [GCC 11.4.0]

Package Information
-------------------
> langchain_core: 0.2.28
> langchain: 0.2.8
> langsmith: 0.1.85
> langchain_anthropic: 0.1.20
> langchain_openai: 0.1.20
> langchain_standard_tests: 0.1.1
> langchain_text_splitters: 0.2.2
> langgraph: 0.1.19

Optional packages not installed
-------------------------------
> langserve

Other Dependencies
------------------
> aiohttp: 3.9.5
> anthropic: 0.31.1
> async-timeout: Installed. No version info available.
> defusedxml: 0.7.1
> httpx: 0.27.0
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.39.0
> orjson: 3.10.6
> packaging: 24.1
> pydantic: 2.8.2
> pytest: 7.4.4
> PyYAML: 6.0.1
> requests: 2.32.3
> SQLAlchemy: 2.0.31
> tenacity: 8.5.0
> tiktoken: 0.7.0
> typing-extensions: 4.12.2
```
2024-08-06 09:57:39 -04:00
Dominik Fladung
ffa0c838d8 Allow ConfluenceLoader authorization via Personal Access Tokens (#25096)
- community: Allow authorization to Confluence with bearer token

- **Description:** Allow authorization to Confluence with [Personal
Access
Token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html)
by checking for the keys `['client_id', token: ['access_token',
'token_type']]`

- **Issue:** 

Currently, the following error occurs when using a personal access token
for authorization.

```python
loader = ConfluenceLoader(
    url=os.getenv('CONFLUENCE_URL'),
    oauth2={
        'token': {"access_token": os.getenv("CONFLUENCE_ACCESS_TOKEN"), "token_type": "bearer"},
        'client_id': 'client_id',
    },
    page_ids=['12345678'], 
)
```

```
ValueError: Error(s) while validating input: ["You have either omitted require keys or added extra keys to the oauth2 dictionary. key values should be `['access_token', 'access_token_secret', 'consumer_key', 'key_cert']`"]
```

With this PR the loader runs as expected.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-06 13:42:47 +00:00
orkhank
111c7df117 docs: update numbering of items in method docs (#25093)
Some methods' docstrings had incorrect numbering of items. The numbers
were adjusted accordingly.
2024-08-06 09:21:52 -04:00
Bagatur
6eb42c657e core[patch]: Remove default BaseModel init docstring (#25009)
Currently a default init docstring gets appended to the class docstring
of every BaseModel inherited object. This removes the default init
docstring.

![Screenshot 2024-08-02 at 5 09 55
PM](https://github.com/user-attachments/assets/757fe4ae-a793-4e7d-8354-512de2c06818)
2024-08-06 01:04:04 +00:00
Gram Liu
88a9a6a758 core[patch]: Add pydantic metadata to subset model (#25032)
- **Description:** This includes Pydantic field metadata in
`_create_subset_model_v2` so that it gets included in the final
serialized form that gets sent out.
- **Issue:** #25031 
- **Dependencies:** n/a
- **Twitter handle:** @gramliu

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-05 17:57:39 -07:00
BhujayKumarBhatta
8f33fce871 docs: change for optional variables in chatprompt (#25017)
Fixes #24884
2024-08-05 23:57:44 +00:00
Erick Friis
423d286546 infra: check doc script skip index page (#25088) 2024-08-05 16:38:30 -07:00
Bagatur
e572521f2a core[patch]: exclude special pydantic init params (#25084) 2024-08-05 23:32:51 +00:00
Isaac Francisco
63ddf0afb4 ollama: allow base_url, headers, and auth to be passed (#25078) 2024-08-05 15:39:36 -07:00
Eugene Yurtsev
4bcd2aad6c core[patch]: Relax time constraints on rate limit test (#25071)
Try to keep the unit test fast, but also have it repeat more robustly
2024-08-05 17:04:22 -04:00
jigsawlabs-student
427a04151c community: fix neo4j from_existing_graph (#24912)
Fixes Neo4jVector.from_existing_graph integration with huggingface.

Previously this threw an error with existing databases, because the
from_existing_graph query returns an empty list of new nodes, which is
then passed to the embedding function, and huggingface errors on an
empty list.

Fixes [24401](https://github.com/langchain-ai/langchain/issues/24401)

---------

Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>
2024-08-05 21:01:46 +00:00
Tomaz Bratanic
d166967003 experimental: Add gliner graph transformer (#25066)
You can use this with:

```python
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import GlinerGraphTransformer

gliner = GlinerGraphTransformer(
    allowed_nodes=["Person", "Organization", "Nobel"],
    allowed_relationships=["EMPLOYEE", "WON"],
)

text = """
Marie Curie, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris.
"""
documents = [Document(page_content=text)]

gliner.convert_to_graph_documents(documents)
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 21:01:27 +00:00
Bagatur
a74e466507 docs: aws pydantic v2 compat (#25075) 2024-08-05 20:47:11 +00:00
Bagatur
a02a09c973 docs: remove redundant deprecation warning (#25067) 2024-08-05 18:44:47 +00:00
Eugene Yurtsev
41dfad5104 core[minor]: Introduce DocumentIndex abstraction (#25062)
This PR adds a minimal document indexer abstraction.

The goal of this abstraction is to allow developers to create custom
retrievers that also have a standard indexing API and allow updating the
document content in them.

The abstraction comes with a test suite that can verify that the indexer
implements the correct semantics.

This is an iteration over a previous PR
(https://github.com/langchain-ai/langchain/pull/24364). The main
difference is that we're sub-classing from BaseRetriever in this
iteration and so have consolidated the sync and async interfaces.

The main problem with the current design is that run-time search
configuration has to be specified at init rather than provided at run
time.

We will likely resolve this issue in one of the two ways:

(1) Define a method (`get_retriever`) that will allow creating a
retriever at run time with a specific configuration. If we do this, we
will likely break the subclassing of BaseRetriever
(2) Generalize base retriever so it can support structured queries

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 18:06:33 +00:00
Vkzem
e7b95e0802 docs: update exa search (#24861)
- [x] **PR title**: "docs: changed example for Exa search retriever
usage"

- [x] **PR message**:
- **Description:** Changed Exa integration doc at
`docs/docs/integrations/tools/exa_search.ipynb` to better reflect simple
Exa use case
- **Issue:** move toward more canonical use of Exa method
(`search_and_contents` rather than just `search`)
    - **Dependencies:** no dependencies; docs only change
    - **Twitter handle:** n/a - small change

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. - will do

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 17:41:33 +00:00
Stuart Marsh
16bd0697dc milvus: fixed bug when using partition key and dynamic fields together (#25028)
**Description:**

This PR fixes a bug where if `enable_dynamic_field` and
`partition_key_field` are enabled at the same time, a pymilvus error
occurs.

Milvus requires the partition key field to be a full schema defined
field, and not a dynamic one, so it will throw the error "the specified
partition key field {field} not exist" when creating the collection.

When `enable_dynamic_field` is set to `True`, all schema field creation
based on `metadatas` is skipped. This code now checks if
`partition_key_field` is set, and creates the field.

Integration test added.

**Twitter handle:** StuartMarshUK

---------

Co-authored-by: Stuart Marsh <stuart.marsh@qumata.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 16:01:55 +00:00
Jim Baldwin
6890daa90c community: make AthenaLoader profile_name optional and fix type hint (#24958)
- **Description:** This PR makes the AthenaLoader profile_name optional
and fixes the type hint, which says the type is `str` when it should be
`str` or `None`, since None is handled in the loader init. This is a
minor problem, but it confused me as to why we had to use a profile when
using the Athena Loader, since I want that locally but not in
production.
- **Issue:** #24957 
- **Dependencies:** None.
2024-08-05 14:28:58 +00:00
Alexey Lapin
335894893b langchain: Make RetryWithErrorOutputParser.from_llm() create a correct retry chain (#25053)
Description: RetryWithErrorOutputParser.from_llm() creates a retry chain
that returns a Generation instance, when it should actually just return
a string.
This class was forgotten when fixing the issue in PR #24687
2024-08-05 14:21:27 +00:00
Dobiichi-Origami
c5cb52a3c6 community: fix issue of the existence of numeric object in additional_kwargs a… (#24863)
- **Description:** A previous PR broke the code in
`baidu_qianfan_endpoint.py`, causing streaming to malfunction.
2024-08-05 10:15:55 -04:00
ZhangShenao
cda79dbb6c community[patch]: Optimize test case for MoonshotChat (#25050)
Optimize test case for `MoonshotChat`. Use standard
ChatModelIntegrationTests.
2024-08-05 10:11:25 -04:00
orkhank
cea3f72485 docs: fix comment lines in code blocks (#25054)
The comments inside some code blocks seem to be misplaced. The comment
lines explaining the `default_key` behavior when operating with prompts
have been updated.
2024-08-05 14:11:09 +00:00
ZhangShenao
02c35da445 doc[Retriever] Enhance api docs for MultiQueryRetriever (#25035)
Enhance api docs for `MultiQueryRetriever`:

- Complete missing parameters
- Unify parameter name
2024-08-04 13:56:38 -04:00
Alex Sherstinsky
208042e0f2 community: Fix Predibase Integration for HuggingFace-hosted fine-tuned adapters (#25015)
2024-08-03 14:05:43 -07:00
maang-h
f5da0d6d87 docs: Standardize MiniMaxEmbeddings (#24983)
- **Description:** Standardize MiniMaxEmbeddings
  - docs, the issue #24856 
  - model init arg names, the issue #20085
2024-08-03 14:01:23 -04:00
ZhangShenao
2c3e3dc6b1 patch[Partners] Unified fix of incorrect variable declarations in all check_imports (#25014)
There are some incorrect declarations of variable `has_failure` in
check_imports. The purpose of this PR is to uniformly fix these errors.
2024-08-03 13:49:41 -04:00
maang-h
7de62abc91 docs: Standardize SparkLLMTextEmbeddings docstrings (#25021)
- **Description:** Standardize SparkLLMTextEmbeddings docstrings
- **Issue:** the issue #24856
2024-08-03 13:44:09 -04:00
Tomaz Bratanic
f9a11a9197 Add relik transformer config (#25019) 2024-08-03 08:41:45 -04:00
Bagatur
1dcee68cb8 docs: show beta directive (#25013)
![Screenshot 2024-08-02 at 7 15 34
PM](https://github.com/user-attachments/assets/086831c7-36f3-4962-98dc-d707b6289747)
2024-08-03 03:07:45 +00:00
Bagatur
e81ddb32a6 docs: fix kwargs docstring (#25010)
Fix:
![Screenshot 2024-08-02 at 5 33 37
PM](https://github.com/user-attachments/assets/7c56cdeb-ee81-454c-b3eb-86aa8a9bdc8d)
2024-08-02 19:54:54 -07:00
Bagatur
57747892ce docs: show deprecation warning first in api ref (#25001)
OLD
![Screenshot 2024-08-02 at 3 29 39
PM](https://github.com/user-attachments/assets/7f169121-1202-4770-a006-d72ac7a1aa33)


NEW
![Screenshot 2024-08-02 at 3 29 45
PM](https://github.com/user-attachments/assets/9cc07cbd-2ae9-4077-95c5-03cb051e6cd7)
2024-08-02 17:35:25 -07:00
Bagatur
679843abb0 docs: separate deprecated classes (#25007)
![Screenshot 2024-08-02 at 4 58 54
PM](https://github.com/user-attachments/assets/29424dd5-0593-4818-9eed-901ff47246b9)
2024-08-02 17:12:47 -07:00
Isaac Francisco
73570873ab docs: standardizing tavily tool docs (#24736)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-02 22:25:27 +00:00
Isaac Francisco
2ae76cecde [docs]: updating mistral and hugging face chat model pages (#24731) 2024-08-02 15:21:25 -07:00
Bagatur
4305f78e40 core[patch]: Release 0.2.28 (#25000) 2024-08-02 21:07:06 +00:00
Bagatur
64ccddf3cb docs: fmt concepts (#24999) 2024-08-02 20:35:45 +00:00
Bagatur
dd8e4cd020 text-splitters[patch]: Release 0.2.3 (#24998) 2024-08-02 20:27:22 +00:00
Bagatur
0de0cd2d31 core[patch]: merge message runs nit (#24997)
Only add separator if both chunks are non-empty
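An illustration of the tweak (a sketch, assuming merge_message_runs joins consecutive same-type messages with a newline separator):

```python
from langchain_core.messages import HumanMessage, merge_message_runs

messages = [HumanMessage("hello"), HumanMessage(""), HumanMessage("world")]
merged = merge_message_runs(messages)
print(merged[0].content)  # expected "hello\nworld" rather than "hello\n\nworld"
```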
2024-08-02 20:25:43 +00:00
Bagatur
8e2316b8c2 community[patch]: Release 0.2.11 (#24989) 2024-08-02 20:08:44 +00:00
ccurme
c2538e7834 experimental[patch]: bump min versions of core and community (#24996)
Ollama functions unit test broken with min version of community.
2024-08-02 19:58:55 +00:00
ccurme
acba38a18e docs: update toolkit guides (#24992) 2024-08-02 15:51:05 -04:00
ccurme
22c1a4041b community[patch]: support named arguments in github toolkit (#24986)
Parameters may be passed in by name if generated from tool calls.
2024-08-02 18:27:32 +00:00
ccurme
4797b806c2 experimental[patch]: release 0.0.64 (#24990) 2024-08-02 18:00:57 +00:00
Tomaz Bratanic
7061869aec Add relik graph transformer (#24982)
Relik is a new library for graph extraction that offers smaller and
cheaper models for graph construction
2024-08-02 13:55:41 -04:00
Erick Friis
98c22e9082 docs: feature table component (#24985) 2024-08-02 17:41:47 +00:00
ccurme
c04d95b962 standard-tests: set integration test parameters independent of unit test (#24979)
This ends up getting set in integration tests.
2024-08-02 10:40:11 -07:00
gbaian10
54e9ea433a fix: Modify the order of init_chat_model import ollama package. (#24977) 2024-08-02 08:32:56 -07:00
David Gao
fe1820cdaf docs: add wikipedia integration docs (#24932)
Dear langchain maintainers,

I added the wikipedia integration docs according to the [web
docs](https://python.langchain.com/v0.2/docs/integrations/retrievers/wikipedia/),
following the format of the [tavily
example](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/tavily.ipynb)
and the [retriever
template](https://github.com/langchain-ai/langchain/blob/master/libs/cli/langchain_cli/integration_template/docs/retrievers.ipynb).
This is my first time contributing to a large repo, so please let me know
if I'm doing anything wrong. Thank you!

Topic related: #24908

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-02 10:12:04 -04:00
ZhangShenao
71c0564c9f community[patch]: Add test case for MoonshotChat (#24960)
Add test case for `MoonshotChat`.
2024-08-02 09:37:31 -04:00
ZhangShenao
c65e48996c patch[partners] Fix check_imports bugs in pinecone and milvus (#24971)
Fix wrongly declared variables in `check_imports` for pinecone and milvus
2024-08-02 09:27:11 -04:00
Isaac Francisco
d7688a4328 community[patch]: adding artifact to Tavily search (#24376)
This allows you to get raw content as well as the answer, instead of
just getting the results.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-01 21:12:11 -07:00
Bagatur
7b08de8909 langchain[patch]: Release 0.2.12 (#24954) 2024-08-02 04:04:49 +00:00
Bagatur
245cb5a252 core[patch]: Release 0.2.27 (#24952) 2024-08-02 01:43:24 +00:00
Bagatur
199e9c5ae0 core[patch]: Fix tool args schema inherited field parsing (#24936)
Fix #24925
2024-08-01 18:36:33 -07:00
Bagatur
fba65ba04f infra: test core on py 3.9, 10, 11 (#24951) 2024-08-01 18:23:37 -07:00
Leonid Ganeline
4092876863 core: docstrings `BaseCallbackHandler` update (#24948)
Added missing docstrings
2024-08-01 20:46:53 -04:00
ccurme
6e45dba471 docs: fix redirect (#24950) 2024-08-01 20:45:54 -04:00
WU LIFU
ad16eed119 core[patch]: runnable config ensure_config deep copy from var_child_runnable… (#24862)
**issue**: #24660
RunnableWithMessageHistory.stream results in an error because the
[evaluation](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/runnables/branch.py#L220)
of the branch
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
unexpectedly triggers the
"[on_end](99eb31ec41/libs/core/langchain_core/runnables/history.py (L332))"
(exit_history) callback of the default branch


**description**
After a lot of investigation, I'm convinced that the root cause is that
1. during the execution of the runnable, the
[var_child_runnable_config](99eb31ec41/libs/core/langchain_core/runnables/config.py (L122))
is shared between the branch
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
runnable and the [default branch
runnable](99eb31ec41/libs/core/langchain_core/runnables/history.py (L332))
within the same context
2. when the default branch runnable runs, it gets the
[var_child_runnable_config](99eb31ec41/libs/core/langchain_core/runnables/config.py (L163))
and may unintentionally [add more handlers
](99eb31ec41/libs/core/langchain_core/runnables/config.py (L325))to
the callback manager of this config
3. when it is again the turn for the
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
to run, it gets the `var_child_runnable_config` whose callback manager
has the handlers added by the default branch. When it runs that handler
(`exit_history`) it leads to the error
   
with the assumption that the `ensure_config` function actually does
want to create an immutable copy from `var_child_runnable_config`, because
it starts with an [`empty` variable
](99eb31ec41/libs/core/langchain_core/runnables/config.py (L156)),
I go ahead and do a deepcopy to ensure that future modifications to the
returned value won't affect the `var_child_runnable_config` variable.

   Having said that, I actually
1. don't know if this is a proper fix
2. don't know whether it will lead to other unintended consequences
3. don't know why only "stream" runs into this issue while "invoke" runs
without problems

so @nfcampos @hwchase17 please help review, thanks!

---------

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-01 17:30:32 -07:00
Jacob Lee
3ab09d87d6 docs[patch]: Adds components for prereqs, compatibility, fix chat model tab issue (#24585)
Added to `docs/how_to/tools_runtime` as a proof of concept, will apply
everywhere if we like.

A bit more compact than the default callouts, will help standardize the
layout of our pages since we frequently use these boxes.

<img width="1088" alt="Screenshot 2024-07-23 at 4 49 02 PM"
src="https://github.com/user-attachments/assets/7380801c-e092-4d31-bcd8-3652ee05f29e">
2024-08-01 15:04:13 -07:00
ccurme
9cb69a8746 docs: update retriever template, add arxiv retriever (#24947) 2024-08-01 16:53:18 -04:00
Casey Clements
db3ceb4d0a partners/mongodb: Improved search index commands (#24745)
Hardens index commands with try/except for free clusters and optional
waits for syncing and tests.

[efriis](https://github.com/efriis) These are the upgrades to the search
index commands (CRUD) that I mentioned.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-01 20:16:32 +00:00
ccurme
db42576b09 docs: delete old migration guide (#24881)
Redirects to
https://python.langchain.com/v0.2/docs/versions/migrating_chains/
2024-08-01 16:11:47 -04:00
Ikko Eltociear Ashimine
be5294e35d docs: update agents.ipynb (#24945)
initalize -> initialize
2024-08-01 14:37:37 -04:00
ccurme
41ed23a050 docs: update retriever integration pages (#24931) 2024-08-01 14:37:07 -04:00
maang-h
ea505985c4 docs: Standardize ZhipuAIEmbeddings docstrings (#24933)
- **Description:** Standardize ZhipuAIEmbeddings rich docstrings.
- **Issue:** the issue #24856
2024-08-01 14:06:53 -04:00
ccurme
02db66d764 docs: fix kv store column headers (#24941)
![Screenshot 2024-08-01 at 12 32 19
PM](https://github.com/user-attachments/assets/888056b7-3065-4be0-a6b8-bcab5b729c2c)
2024-08-01 09:49:36 -07:00
Anneli Samuel
2204d8cb7d community[patch]: Invoke on_llm_new_token callback before yielding chunk (#24938)
**Description**: Invoke on_llm_new_token callback before yielding chunk
in streaming mode
**Issue**:
[#16913](https://github.com/langchain-ai/langchain/issues/16913)
2024-08-01 16:39:04 +00:00
John
ff6274d32d docs: update langchain-unstructured docs (#24935)
- **Description:** The UnstructuredClient will have a breaking change in
the near future. Add a note in the docs that the examples here may not
use the latest version and users should refer to the SDK docs for the
latest info.
2024-08-01 16:27:40 +00:00
ccurme
c72f0d2f20 docs: update toolkit integration pages (#24887)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-01 12:13:08 -04:00
Eugene Yurtsev
75776e4a54 core[patch]: In unit tests, use _schema() instead of BaseModel.schema() (#24930)
This PR introduces a module with some helper utilities for the pydantic
1 -> 2 migration.

They're meant to be used in the following way:

1) Use the utility code to get unit tests to pass without requiring
modification to the unit tests
2) (If desired) upgrade the unit tests to match pydantic 2 output
3) (If desired) stop using the utility code

Currently, this module contains a way to map `schema()` generated by
pydantic 2 to (mostly) match the output from pydantic v1.
2024-08-01 11:59:04 -04:00
Serena Ruan
1827bb4042 community[patch]: support bind_tools for ChatMlflow (#24547)

- **Description:** 
Support ChatMlflow.bind_tools method
Tested in Databricks:
<img width="836" alt="image"
src="https://github.com/user-attachments/assets/fa28ef50-0110-4698-8eda-4faf6f0b9ef8">




---------

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
2024-08-01 08:43:07 -07:00
Michal Gregor
769c3bb838 huggingface: Added a missing argument to a ChatHuggingFace doc notebook. (#24929)
- **Description:** When adding docs for constructing ChatHuggingFace
using a HuggingFacePipeline, I forgot to add `return_full_text=False` as
an argument. In this setup, the chat response would incorrectly contain
all the input text. I am fixing that here by adding that line to the
offending notebook.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-01 15:42:35 +00:00
BottlePumpkin
bfc59c1d26 community: Fix KeyError in NotionDB loader when 'name' is missing (#24224)

**Description:** This PR fixes a KeyError in NotionDBLoader when the
"name" key is missing in the "people" property.

**Issue:** Fixes #24223 

**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-01 13:55:40 +00:00
alexqiao
8eb0bdead3 community[patch]: Invoke callback prior to yielding token (#24917)
**Description:** Invoke callback prior to yielding token in stream method
for chat models.
**Issue:** #16913
(https://github.com/langchain-ai/langchain/issues/16913)
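
A minimal sketch of the pattern this enforces in `_stream` implementations (a hypothetical helper, not the actual code): the callback fires before the chunk is yielded, so callback consumers see each token no later than downstream iterators do.

```python
from typing import Iterator


def _stream(tokens: Iterator[str], run_manager=None) -> Iterator[str]:
    for token in tokens:
        if run_manager is not None:
            run_manager.on_llm_new_token(token)  # notify callbacks first
        yield token  # then hand the token downstream
```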
2024-08-01 13:19:55 +00:00
ZhangShenao
b2dd9ffaaf patch[cli] Fix bug in check_imports.py (#24918)
The variable `has_failure` in check_imports.py is wrongly declared; it
actually refers to another variable.
2024-08-01 09:08:12 -04:00
Jacob Lee
f14121faaf docs[patch]: Update local RAG tutorial (#24909) 2024-07-31 19:19:23 -07:00
Bagatur
b7abac9f92 infra: poetry lock root (#24913) 2024-08-01 01:19:34 +00:00
Jacob Lee
42c686bc28 docs[patch]: Update local model how-to guide (#24911)
Updates to use `langchain_ollama`, new models, chat model example
2024-07-31 18:01:55 -07:00
Erick Friis
600fc233ef partners/ollama: release 0.1.1 (#24910) 2024-07-31 17:31:29 -07:00
Bagatur
25b93cc4c0 core[patch]: stringify tool non-content blocks (#24626)
Slightly breaking bugfix. Shouldn't cause too many issues since no
models would be able to handle non-content block ToolMessage.content
anyways.
2024-07-31 16:42:38 -07:00
Bagatur
492df75937 docs: chat model table nit (#24907) 2024-07-31 15:14:27 -07:00
Bagatur
a24c445e02 docs: cleanup readme (#24905) 2024-07-31 15:03:28 -07:00
Jacob Lee
5098f9dc79 infra: related section in docs (#24829)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-31 14:25:58 -07:00
Nikita Pakunov
c776471ac6 community: fix AttributeError: 'YandexGPT' object has no attribute '_grpc_metadata' (#24432)
Fixes #24049

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-31 21:18:33 +00:00
Bagatur
752a71b688 integrations[patch]: release model packages (#24900) 2024-07-31 20:48:20 +00:00
Jacob Lee
1213a59f87 docs[patch]: Update kv store docs pages (#24848) 2024-07-31 13:23:24 -07:00
Erick Friis
17a06cb7a6 infra: check templates based on integration (#24857)
Instead of hardcoding a linter for each, iterate through the lines of
the template notebook and find lines that start with `##` (including
lower headings), and enforce that those headings are found in newly
contributed docs.
2024-07-31 13:19:50 -07:00
Erick Friis
a7380dd531 cli: release 0.0.28 (#24852) 2024-07-31 13:03:24 -07:00
Erick Friis
e98e4be0f7 cli: register new integration doc templates (#24854)
- wait to merge for retriever.ipynb merge #24836
2024-07-31 13:03:05 -07:00
Eugene Yurtsev
210623b409 core[minor]: Add support for pydantic 2 to utility to get fields (#24899)
Add compatibility for pydantic 2 for a utility function.

This will help push some small changes to master, so they don't have to
be kept track of on a separate branch.
2024-07-31 19:11:07 +00:00
Bagatur
7d1694040d core[patch]: Release 0.2.26 (#24898) 2024-07-31 19:00:50 +00:00
Eugene Yurtsev
add16111b9 community[patch]: Make the pydantic linter stricter (#24897)
Stricter linting of deprecated pydantic features.
2024-07-31 18:57:37 +00:00
Eugene Yurtsev
a4a444f73d community[patch]: Fix arcee llm usage of root_validator(pre=False) (#24896)
Should be pre=True
2024-07-31 18:49:20 +00:00
Eugene Yurtsev
69c656aa5f langchain[minor]: Upgrade ambiguous root_validator to @pre_init (#24895)
The @pre_init validator is a temporary solution for base models. It has
similar (but not identical) semantics to @root_validator(), but it works
strictly as a pre-init validator.

It'll work as expected as long as the pydantic model type hints are
correct.
2024-07-31 18:46:47 +00:00
Eugene Yurtsev
5099a9c9b4 core[patch]: Update unit tests with a workaround for using AnyID in pydantic 2 (#24892)
Pydantic 2 ignores __eq__ overload for subclasses of strings.
2024-07-31 14:42:12 -04:00
Bagatur
8461934c2b core[patch], integrations[patch]: convert TypedDict to tool schema support (#24641)
supports following UX

```python
from typing import Any, Dict, List, Literal, Optional, Union

from typing_extensions import Annotated, TypedDict


class SubTool(TypedDict):
    """Subtool docstring"""

    args: Annotated[Dict[str, Any], {}, "this does bar"]


class Tool(TypedDict):
    """Docstring
    Args:
        arg1: foo
    """

    arg1: str
    arg2: Union[int, str]
    arg3: Optional[List[SubTool]]
    arg4: Annotated[Literal["bar", "baz"], ..., "this does foo"]
    arg5: Annotated[Optional[float], None]
```

- can parse google style docstring
- can use Annotated to specify default value (second arg)
- can use Annotated to specify arg description (third arg)
- can have nested complex types
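
Given the `Tool` TypedDict above, a hedged usage sketch (assumes langchain-openai at a version including this change):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
llm_with_tools = llm.bind_tools([Tool])  # the TypedDict is converted to a tool schema
print(llm_with_tools.invoke("call the tool with arg1='foo' and arg2=1").tool_calls)
```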
2024-07-31 18:27:24 +00:00
Eugene Yurtsev
d24b82357f community[patch]: Add missing annotations (#24890)
This PR adds annotations in the community package.

Annotations are only strictly needed in subclasses of BaseModel for
pydantic 2 compatibility.

This PR adds some unnecessary annotations, but they're not bad to have
regardless for documentation pages.
2024-07-31 18:13:44 +00:00
Eugene Yurtsev
7720483432 langchain[patch]: Update unit tests to workaround a pydantic 2 issue (#24886)
This will allow our unit tests to pass when using AnyID() with our pydantic models.
2024-07-31 14:09:40 -04:00
Eugene Yurtsev
2019e31bc5 langchain[patch]: Add missing type annotations (#24889)
Adds missing type annotations in preparation for pydantic 2 upgrade.
2024-07-31 14:09:22 -04:00
ccurme
30f18c7b02 docs: add retriever integrations template (#24836) 2024-07-31 13:50:44 -04:00
Anirudh31415926535
4da3d4b18e docs: Minor corrections and updates to Cohere docs (#22726)
- **Description:** Update Cohere's provider and RagRetriever
documentation with the latest updates.
    - **Twitter handle:** Anirudh1810
2024-07-31 10:16:26 -07:00
ccurme
40b4a3de6e docs: update chat model integration pages (#24882)
to conform with template
2024-07-31 11:26:52 -04:00
Nishan Jain
b00c0fc558 [Community][minor]: Added prompt governance in pebblo_retrieval (#24874)
Title: [pebblo_retrieval] Identify entities in prompts given to
PebbloRetrievalQA, enabling prompt governance
Description: Implemented identification of entities in the prompt using
the Pebblo prompt governance API.
Issue: NA
Dependencies: NA
Add tests and docs: NA
2024-07-31 13:14:51 +00:00
Rajendra Kadam
a6add89bd4 community[minor]: [PebbloSafeLoader] Implement content-size-based batching (#24871)
- **Title:** [PebbloSafeLoader] Implement content-size-based batching in
the classification flow(loader/doc API)
- **Description:** 
- Implemented content-size-based batching in the loader/doc API, set to
100KB with no external configuration option, intentionally hard-coded to
prevent timeouts.
    - Remove unused field (pb_id) from doc_metadata
- **Issue:** NA
- **Dependencies:** NA
- **Add tests and docs:** Updated
2024-07-31 09:10:28 -04:00
TrumanYan
096b66db4a community: replace it with Tencent Cloud SDK (#24172)
Description: The old method will be discontinued; use the official SDK
for more model options.
Issue: None
Dependencies: None
Twitter handle: None

Co-authored-by: trumanyan <trumanyan@tencent.com>
2024-07-31 09:05:38 -04:00
Erick Friis
99eb31ec41 cli: embed docstring template (#24855) 2024-07-31 02:16:40 +00:00
Noah Peterson
4b2a8ce6c7 docs: Shorten unreasonably long OllamaEmbeddings page (#24850)
This change removes excessive embeddings output in the Jupyter Notebook
on the [Ollama text embedding
page](https://python.langchain.com/v0.2/docs/integrations/text_embedding/ollama/)
2024-07-31 01:57:04 +00:00
Erick Friis
3999e9035c cli/docs: embedding template standardization (#24849) 2024-07-30 18:54:03 -07:00
Bagatur
1181c10c65 docs: reorder integrations sidebar (#24847) 2024-07-30 16:58:26 -07:00
Bagatur
943126c5fd docs: chat model pkg links (#24845) 2024-07-30 16:26:06 -07:00
Erick Friis
1f5444817a community: deprecate BedrockEmbeddings in favor of langchain-aws (#24846) 2024-07-30 23:13:17 +00:00
Jacob Lee
21eb4c9e5d docs[patch]: Adds first kv store doc matching new template (#24844) 2024-07-30 15:58:51 -07:00
Bagatur
a4e940550a docs: integrations custom callout (#24843) 2024-07-30 22:48:18 +00:00
Bagatur
61ecb10a77 docs: partner pkg table (#24840) 2024-07-30 15:28:10 -07:00
Erick Friis
b099cc3507 cli: release 0.0.27 (#24842) 2024-07-30 22:07:50 +00:00
Bagatur
419f2c2585 cli[patch]: tool integration templates (#24837)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-30 14:59:33 -07:00
mschoenb97IL
19b127f640 langchain: Update Langchain -> Langgraph migration docs for the deprecation of the messages_modifier parameter. (#24839)
**Description:** Updated the Langgraph migration docs to use
`state_modifier` rather than `messages_modifier`
**Issue:** N/A
**Dependencies:** N/A


---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 21:28:32 +00:00
ccurme
c123cb2b30 docs: update migration guide (#24835)
Move to its own section in the sidebar.
2024-07-30 20:17:12 +00:00
Erick Friis
957b05b8d5 infra: py3.11 for community integration test compiling (#24834)
e.g.
https://github.com/langchain-ai/langchain/actions/runs/10167754785/job/28120861343?pr=24833
2024-07-30 18:43:10 +00:00
Erick Friis
88418af3f5 core: release 0.2.25 (#24833) 2024-07-30 18:41:09 +00:00
Bagatur
37b060112a langchain[patch]: fix ollama in init_chat_model (#24832) 2024-07-30 18:38:53 +00:00
Jerron Lim
d8f3ea82db langchain[patch]: init_chat_model() to import ChatOllama from langchain-ollama and fallback on langchain-community (#24821)
Description: init_chat_model() should import ChatOllama from
`langchain-ollama`. If that fails, fall back to `langchain-community`.
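
A short sketch of the resulting behavior (assumes a running local Ollama server with the model pulled):

```python
from langchain.chat_models import init_chat_model

# Tries langchain-ollama's ChatOllama first, then falls back to
# the langchain-community implementation if the import fails.
llm = init_chat_model("llama3.1", model_provider="ollama")
print(llm.invoke("Hello!").content)
```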
2024-07-30 11:16:10 -07:00
Eugene Yurtsev
3a7f3d46c3 docs: Add pydantic compatibility to side bar (#24826)
Add pydantic compatibility to side bar
2024-07-30 14:10:48 -04:00
Isaac Francisco
511242280b [docs]: standardize vectorstores (#24797) 2024-07-30 10:38:04 -07:00
Jacob Lee
ac649800df docs[patch]: Adds kv store integration docs template (#24804) 2024-07-30 10:07:57 -07:00
cffranco94
b01d938997 experimental: Add config to convert_to_graph_documents (#24012)
PR title: Experimental: Add config to convert_to_graph_documents

Description: In order to use langfuse, I need to pass the langfuse
configuration when invoking the chain. langchain_experimental does not
allow adding any parameters (besides the documents) to the
convert_to_graph_documents method, so I cannot monitor the chain
in langfuse.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Catarina Franco <catarina.franco@criticalsoftware.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 17:01:06 +00:00
Shailendra Mishra
f2d810b3c0 clob_bugfix... (#24813)
2024-07-30 12:44:04 -04:00
Anush
51b15448cc community: Fix FastEmbedEmbeddings (#24462)
## Description

This PR:
- Fixes the validation error in `FastEmbedEmbeddings`.
- Adds support for `batch_size`, `parallel` params.
- Removes support for very old FastEmbed versions.
- Updates the FastEmbed doc with the new params.

Associated Issues:
- Resolves #24039
- Resolves https://github.com/qdrant/fastembed/issues/296
2024-07-30 12:42:46 -04:00
ccurme
73ec24fc56 docs[patch]: add toolkit template (#24791) 2024-07-30 12:36:09 -04:00
Tamir Zitman
b3e1378f2b langchain : text_splitters Added PowerShell (#24582)
- **Description:** Added PowerShell support to the text splitter
languages, including relevant docs updates (see the sketch below)
  - **Issue:** None
  - **Dependencies:** None
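
A brief sketch of the new option (assumes a langchain-text-splitters release that includes this change):

```python
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.POWERSHELL, chunk_size=200, chunk_overlap=0
)
print(splitter.split_text("function Get-Greeting { param($Name) \"Hello, $Name\" }"))
```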

---------

Co-authored-by: tzitman <tamir.zitman@intel.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 16:13:52 +00:00
ccurme
187ee96f7a docs: update chat model feature table (#24822) 2024-07-30 09:06:42 -07:00
Nuno Campos
68ecebf1ec core: Fix implementation of trim_first_node/trim_last_node to use exact same definition of first/last node as in the getter methods (#24802) 2024-07-30 08:44:27 -07:00
Igor Drozdov
c2706cfb9e feat(community): add tools support for litellm (#23906)
I used the following example to validate the behavior

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_anthropic import ChatAnthropic
from langchain_community.chat_models import ChatLiteLLM
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' times 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y

@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [multiply, exponentiate, add]

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)
# llm = ChatLiteLLM(model="claude-3-sonnet-20240229", temperature=0)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })
```

`ChatAnthropic` version works:

```
> Entering new AgentExecutor chain...

Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}`
responded: [{'text': 'To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:', 'type': 'text', 'index': 0}, {'id': 'toolu_01Gf54DFTkfLMJQX3TXffmxe', 'input': {}, 'name': 'exponentiate', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 5, "y": 2.743}'}]

82.65606421491815
Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}`
responded: [{'id': 'toolu_01XUq9S56GT3Yv2N1KmNmmWp', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 0, 'partial_json': '{"x": 3, "y": 82.65606421491815}'}]

85.65606421491815
Invoking: `add` with `{'x': 17.24, 'y': -918.1241}`
responded: [{'text': '\n\nSo 3 + 5^2.743 = 85.66\n\nTo calculate 17.24 - 918.1241, we can use:', 'type': 'text', 'index': 0}, {'id': 'toolu_01BkXTwP7ec9JKYtZPy5JKjm', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 17.24, "y": -918.1241}'}]

-900.8841[{'text': '\n\nTherefore, 17.24 - 918.1241 = -900.88', 'type': 'text', 'index': 0}]

> Finished chain.
```

While `ChatLiteLLM` version doesn't.

But with the changes in this PR, along with:

- https://github.com/langchain-ai/langchain/pull/23823
- https://github.com/BerriAI/litellm/pull/4554

The result is _almost_ the same:

```
> Entering new AgentExecutor chain...

Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}`
responded: To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:

82.65606421491815
Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}`


85.65606421491815
Invoking: `add` with `{'x': 17.24, 'y': -918.1241}`
responded:

So 3 + 5^2.743 = 85.66

To calculate 17.24 - 918.1241, we can use:

-900.8841

Therefore, 17.24 - 918.1241 = -900.88

> Finished chain.
```


Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-30 15:39:34 +00:00
David Robertson
bfb7f8d40a Brave Search: Enhance search result details with extra snippets (#19209)
**Description:** 

This update significantly improves the Brave Search Tool's utility
within the LangChain library by enriching the search results it returns.
The tool previously returned title, link, and snippet, with the snippet
being a truncated 140-character description from the search engine. To
make the search results more informative, this update enables
extra_snippets by default and introduces additional result fields:
title, link, description (enhancing and renaming the former snippet
field), age, and snippets. The snippets field provides a list of strings
summarizing the webpage, utilizing Brave's capability for more detailed
search insights. This enhancement aims to make the search tool far more
informative and beneficial for users.

**Issue:** N/A

**Dependencies:** No additional dependencies introduced.

**Twitter handle:** @davidalexr987

**Code Changes Summary:**

- Changed the default setting to include extra_snippets in search
results.
- Renamed the snippet field to description to accurately reflect its
content and included an age field for search results.
- Introduced a snippets field that lists webpage summaries, providing
users with comprehensive search result insights.

**Backward Compatibility Note:**

The renaming of snippet to description improves the accuracy of the
returned data field but may impact existing users who have developed
integrations or analyses based on the snippet field. I believe this
change is essential for clarity and utility, and it aligns better with
the data provided by Brave Search.

**Additional Notes:**

This proposal focuses exclusively on the Brave Search package, without
affecting other LangChain packages or introducing new dependencies.
2024-07-30 15:29:38 +00:00
Eugene Yurtsev
873f64751e docs: Remove danger on how to migrate to astream events v2 (#24825)
Users should migrate to v2 now
2024-07-30 15:28:07 +00:00
Ben Chambers
435771fe74 [community]: Fix package name mismatch (#24824)
- **Description:** fix a mismatch in pypi package names
2024-07-30 11:21:39 -04:00
ccurme
b7bbfc7c67 langchain: revert "init_chat_model() to support ChatOllama from langchain-ollama" (#24819)
Reverts langchain-ai/langchain#24818

Overlooked discussion in
https://github.com/langchain-ai/langchain/pull/24801.
2024-07-30 14:23:36 +00:00
Jerron Lim
5abfc85fec langchain: init_chat_model() to support ChatOllama from langchain-ollama (#24818)
Description: Since moving away from `langchain-community` is
recommended, `init_chat_model()` should import ChatOllama from
`langchain-ollama` instead.
2024-07-30 10:17:38 -04:00
Eugene Yurtsev
4fab8996cf docs: Update pydantic compatibility (#24625)
Update pydantic compatibility. This will only be true after we release
the partner packages.
2024-07-29 22:19:00 -04:00
Jacob Lee
d6ca1474e0 docs[patch]: Adds key-value store to conceptual guide (#24798) 2024-07-29 18:45:16 -07:00
Erick Friis
cdaea17b3e cli/docs: llm integration template standardization (#24795) 2024-07-29 17:47:13 -07:00
Bagatur
a6d1fb4275 core[patch]: introduce ToolMessage.status (#24628)
Anthropic models (including via Bedrock and other cloud platforms)
accept a status/is_error attribute on tool messages/results
(specifically in `tool_result` content blocks for Anthropic API). Adding
a ToolMessage.status attribute so that users can set this attribute when
using those models
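
A short sketch of setting the new attribute (`status` accepts "success", the default, or "error"):

```python
from langchain_core.messages import ToolMessage

# Flag a failed tool invocation so providers that accept an is_error
# marker (e.g. Anthropic tool_result blocks) can surface it.
msg = ToolMessage(
    content="division by zero",
    tool_call_id="toolu_abc123",  # placeholder ID
    status="error",
)
```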
2024-07-29 14:01:53 -07:00
Isaac Francisco
78d97b49d9 [partner]: ollama llm fix (#24790) 2024-07-29 13:00:02 -07:00
maang-h
4bb1a11e02 community: Add MiniMaxChat bind_tools and structured output (#24310)
- **Description:** 
  - Add `bind_tools` method to support tool calling 
  - Add `with_structured_output` method to support structured output
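
A minimal sketch of the new tool-calling surface (requires MiniMax credentials in the environment, e.g. MINIMAX_API_KEY):

```python
from langchain_community.chat_models import MiniMaxChat
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Look up the weather for a city."""
    return f"Sunny in {city}"

llm = MiniMaxChat()
llm_with_tools = llm.bind_tools([get_weather])  # new in this PR
```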
2024-07-29 15:51:52 -04:00
John
0a2ff40fcc partners/unstructured: fix client api_url (#24680)
**Description:** Add empty string default for api_key and change
`server_url` to `url` to match existing loaders.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

2024-07-29 11:16:41 -07:00
maang-h
bf685c242f docs: Standardize QianfanEmbeddingsEndpoint (#24786)
- **Description:** Standardize QianfanEmbeddingsEndpoint, including:
  - docstrings, the issue #21983 
  - model init arg names, the issue #20085
2024-07-29 13:19:24 -04:00
ccurme
9998e55936 core[patch]: support tool calls with non-pickleable args in tools (#24741)
Deepcopy raises with non-pickleable args.
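
For context, a two-line illustration of why deepcopy fails on such args:

```python
import copy
import threading

# deepcopy falls back to pickling, which rejects objects like locks:
copy.deepcopy(threading.Lock())  # TypeError: cannot pickle '_thread.lock' object
```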
2024-07-29 13:18:39 -04:00
Erick Friis
df78608741 mongodb: bson optional import (#24685) 2024-07-29 09:54:01 -07:00
M. Ali
c086410677 fix docs typos (#23668)
- [x] **PR title**: "docs: fix multiple typos"

Co-authored-by: mohblnk <mohamed.ali@blnk.ai>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-29 16:10:55 +00:00
Pere Pasamonte
98175860ad community: Fix AWS DocumentDB similarity_search when filter is None (#24777)
**Description**

Fixes DocumentDBVectorSearch similarity_search when no filter is used;
it defaults to None, but $match does not accept None, so the default is
changed to an empty {} before the pipeline is created.

**Issue**

AWS DocumentDB similarity search does not work when no filter is used.
Error msg: "the match filter must be an expression in an object" #24775

**Dependencies**

No dependencies

**Twitter handle**

https://x.com/perepasamonte

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
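
A minimal sketch of the fix described above (the helper name is hypothetical):

```python
from typing import Optional

def build_match_stage(filter: Optional[dict] = None) -> dict:
    # DocumentDB rejects {"$match": None}; an empty document matches
    # everything, which is the intended "no filter" behavior.
    return {"$match": filter if filter is not None else {}}

print(build_match_stage())           # {'$match': {}}
print(build_match_stage({"a": 1}))   # {'$match': {'a': 1}}
```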
2024-07-29 15:32:05 +00:00
Lennart J. Kurzweg
7da0597ecb partners[ollama]: Support seed parameter for ChatOllama (#24782)
## Description

Adds seed parameter to ChatOllama

## Resolves Issues
- #24703

## Dependency Changes
None

Co-authored-by: Lennart J. Kurzweg (Nx2) <git@nx2.site>
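
A minimal sketch (requires a running Ollama server with the model pulled):

```python
from langchain_ollama import ChatOllama

# A fixed seed makes sampling reproducible across invocations.
llm = ChatOllama(model="llama3.1", seed=42, temperature=0.8)
print(llm.invoke("Tell me a joke").content)
```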
2024-07-29 15:15:20 +00:00
ccurme
e264ccf484 standard-tests[patch]: update groq and structured output test (#24781)
- Mixtral with Groq has started consistently failing tool calling tests.
Here we restrict testing to llama 3.1.
- `.schema` is deprecated in pydantic proper in favor of
`.model_json_schema`.
2024-07-29 11:10:01 -04:00
ZhangShenao
4a05679fdb patch[experimental] Fix prompt in GenerativeAgentMemory (#24771)
There is an issue with the prompt format in `GenerativeAgentMemory`;
this PR fixes it.
The prompt is the same as the one in the method `_score_memory_importance`.
2024-07-29 07:02:31 -04:00
WU LIFU
2ba8393182 graph_transformers: bug fix for create_simple_model not passing in ll… (#24643)
issue: #24615 

description: The _Graph pydantic model generated from
create_simple_model (which LLMGraphTransformer uses when allowed nodes
and relationships are provided) does not constrain the relationships
(source and target types, relationship type), nor the node and
relationship properties, with enums when using ChatOpenAI.
The issue is that when calling optional_enum_field throughout
create_simple_model, the llm_type parameter is not passed in except
when creating the node type. Passing it into each call fixes the issue.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-07-29 07:00:56 -04:00
William FH
01ab2918a2 core[patch]: Respect injected in bound fns (#24733)
Since right now you can't use the nice injected arg syntax directly with
model.bind_tools()
2024-07-28 15:45:19 -07:00
Pavel
7fcfe7c1f4 openai[patch]: openai proxy added to base embeddings (#24539)
- [ ] **PR title**: "langchain-openai: openai proxy added to base
embeddings"

- [ ] **PR message**: 
    - **Description:** 
    Dear langchain developers,
You've already supported a proxy for the ChatOpenAI implementation in your
package. At the same time, if somebody needs to use a proxy for chat, it
could also be necessary to use one for OpenAIEmbeddings.
That's why I think it's important to add proxy support for OpenAI
embeddings. That's what I've done in this PR.

@baskaryan

---------

Co-authored-by: karpov <karpov@dohod.ru>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-28 20:54:13 +00:00
Lakshmi Peri
821196c4ee langchain-aws InMemoryVectorStore documentation updates (#24347)
- [x] **PR title**: "Add documentation on InMemoryVectorStore driver for
MemoryDB to langchain-aws"
  - langchain-aws repo: Add MemoryDB documentation

- [x] **PR message**:
- **Description:** Added documentation on the InMemoryVectorStore driver to
aws.mdx and a usage example on a MemoryDB cluster

- [x] **Add tests and docs**: Added the MemoryDB notebook to the
docs/docs/integrations/ folder

- [x] **Lint and test**: Ran `make format`, `make lint` and `make test`
from the root of the modified package(s).

2024-07-28 15:09:51 -04:00
Chuck Wooters
56c2a7f6d4 partners: add missing key name to Field() for ChatFireworks model (#24721)
**Description:** 

In the `ChatFireworks` class definition, the Field() call for the "stop"
("stop_sequences") parameter is missing the "default" keyword.

**Issue:**

Type checker reports "stop_sequences" as a missing arg (not recognizing
the default value is None)

**Dependencies:**

None

**Twitter handle:**

None
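
For illustration, the pydantic pattern being fixed (the class name here is hypothetical):

```python
from typing import List, Optional

from pydantic import BaseModel, Field

class FireworksParamsSketch(BaseModel):
    # With default=None spelled out, type checkers no longer report
    # `stop` as a required argument.
    stop: Optional[List[str]] = Field(default=None, alias="stop_sequences")
```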
2024-07-28 18:40:21 +00:00
AmosDinh
c113682328 community:Add support for specifying document_loaders.firecrawl api url. (#24747)
Add support for specifying the document_loaders.firecrawl api url.
This is mainly to support the
[self-hosting](https://github.com/mendableai/firecrawl/blob/main/SELF_HOST.md)
option firecrawl provides. E.g. now I can specify localhost:....

The corresponding firecrawl class already provides functionality to pass
the argument. See here:
4c9d62f6d3/apps/python-sdk/firecrawl/firecrawl.py (L29)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
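
A minimal sketch pointing the loader at a self-hosted instance (key and port are placeholders; requires the `firecrawl-py` package):

```python
from langchain_community.document_loaders import FireCrawlLoader

loader = FireCrawlLoader(
    url="https://example.com",
    api_key="placeholder",
    api_url="http://localhost:3002",  # self-hosted FireCrawl endpoint
    mode="scrape",
)
docs = loader.load()
```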
2024-07-28 14:30:36 -04:00
Jerron Lim
df37c0d086 partners[ollama]: Support base_url for ChatOllama (#24719)
Add a class attribute `base_url` for ChatOllama to allow users to choose
a different URL to connect to.

Fixes #24555
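
A minimal sketch (the address is a placeholder):

```python
from langchain_ollama import ChatOllama

# Target an Ollama server other than the default localhost:11434.
llm = ChatOllama(model="llama3.1", base_url="http://192.168.1.10:11434")
```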
2024-07-28 14:25:58 -04:00
Bagatur
8964f8a710 core: use mypy<1.11 (#24749)
Bug in mypy 1.11.0 blocking CI, see example:
https://github.com/langchain-ai/langchain/actions/runs/10127096903/job/28004492692?pr=24641
2024-07-27 16:37:02 -07:00
Moritz
b81fbc962c docs: fix typo in DSPy docs (#24748)
**Description:** Just a missing "r" in metric
**Dependencies:** N/A
2024-07-27 23:34:39 +00:00
Isaac Francisco
152427eca1 make image inputs compatible with langchain_ollama (#24619) 2024-07-26 17:39:57 -07:00
William FH
0535d72927 Add type() in error msg (#24723) 2024-07-26 16:48:45 -07:00
Eugene Yurtsev
9be6b5a20f core[patch]: Correct doc-string for InMemoryRateLimiter (#24730)
Correct the documentation string.
2024-07-26 22:17:22 +00:00
Erick Friis
d5b4b7e05c infra: langchain max python 3.11 for resolution (#24729) 2024-07-26 21:17:11 +00:00
Erick Friis
3c3d3e9579 infra: community max python 3.11 for resolution (#24728) 2024-07-26 21:10:14 +00:00
Cristi Burcă
174e7d2ab2 langchain: Make OutputFixingParser.from_llm() create a useable retry chain (#24687)
Description: OutputFixingParser.from_llm() creates a retry chain that
returns a Generation instance, when it should actually just return a
string.
Issue: https://github.com/langchain-ai/langchain/issues/24600
Twitter handle: scribu

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-26 13:55:47 -07:00
Bagatur
b3a23ddf93 integration releases (#24725)
Release anthropic, openai, groq, mistralai, robocorp
2024-07-26 12:30:10 -07:00
Bagatur
315223ce26 core[patch]: Release 0.2.24 (#24722) 2024-07-26 18:55:32 +00:00
Hayden Wolff
0345990a42 docs: Add NVIDIA NIMs to Model Tab and Feature Table (#24146)
**Description:** Add NVIDIA NIMs to Model Tab and LLM Feature Table

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-26 18:20:52 +00:00
Haijian Wang
cda3025ee1 Integrating the Yi family of models. (#24491)
- [x] **PR title**: "community:add Yi LLM", "docs:add Yi Documentation"
                          
- [x] **PR message**:
- **Description:** This PR adds support for the Yi model to LangChain.
- **Dependencies:**
[langchain_core,requests,contextlib,typing,logging,json,langchain_community]
    - **Twitter handle:** 01.AI


- [x] **Add tests and docs**: I've added the corresponding documentation
to the relevant paths

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-26 10:57:33 -07:00
Bagatur
ad7581751f core[patch]: ChatPromptTemplate.init same as ChatPromptTemplate.from_… (#24486) 2024-07-26 10:48:39 -07:00
Marc Gibbons
cc451effd1 community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935)
Raise `LangChainException` instead of `Exception`. This alleviates the
need for library users to use bare try/except to handle exceptions
raised by `AzureSearch`.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-26 15:59:06 +00:00
Jacob Lee
3d16dcd88d docs[patch]: Hide deprecated ChatGPT plugins page (#24704) 2024-07-26 08:24:33 -07:00
Eugene Yurtsev
3a5365a33e ai21: apply rate limiter in integration tests (#24717)
Apply rate limiter in integration tests
2024-07-26 11:15:36 -04:00
Eugene Yurtsev
03d62a737a together: Add rate limiter to integration tests (#24714)
Rate limit the integration tests to avoid getting 429s.
2024-07-26 10:59:33 -04:00
Eugene Yurtsev
e00cc74926 docs[minor]: Add how to guide for rate limiting a chat model (#24686)
Add how-to guide for rate limiting a chat model.
2024-07-26 14:29:06 +00:00
Diverrez morgan
c4d2a53f18 community: creation score_threshold in flashrank_rerank.py (#24016)
Description: 
Add an optional relevance score threshold to select only coherent
documents; it complements top_n.

Discussion:
add relevance score threshold in flashrank_rerank document compressors
#24013

Dependencies:
 no dependencies

---------

Co-authored-by: Benjamin BERNARD <benjamin.bernard@openpathview.fr>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
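
A minimal usage sketch, assuming the parameter lands on `FlashrankRerank` as described (requires the optional `flashrank` package):

```python
from langchain_community.document_compressors import FlashrankRerank

# Keep at most top_n documents, and additionally drop any whose
# relevance score falls below score_threshold.
compressor = FlashrankRerank(top_n=5, score_threshold=0.8)
```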
2024-07-26 13:34:39 +00:00
Cong Peng
190988d93e community: Add parameter allow_dangerous_requests to WebResearchRetriever.from_llm construct (#24712)
**Description:** To avoid a ValueError when constructing the retriever
via the `from_llm()` method.
2024-07-26 06:24:58 -07:00
monysun
5f593c172a community: fix dashcope embeddings embed_query func post too much req to api (#24707)
The embed_query function of the DashScope embeddings sends a str param,
and then the embed_with_retry function sends malformed content to the API.
2024-07-26 12:44:07 +00:00
yonarw
b65ac8d39c community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494)
Description:

- This PR adds a self query retriever implementation for SAP HANA Cloud
Vector Engine. The retriever supports all operators except for contains.
- Issue: N/A
- Dependencies: no new dependencies added

**Add tests and docs:**
Added integration tests to:
libs/community/tests/unit_tests/query_constructors/test_hanavector.py

**Documentation for self query retriever:**
/docs/integrations/retrievers/self_query/hanavector_self_query.ipynb

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-26 06:56:51 +00:00
nobbbbby
4f3b4fc7fe community[patch]: Extend Baichuan model with tool support (#24529)
**Description:** Expanded the chat model functionality to support tools
in the 'baichuan.py' file. Updated module imports and added tool object
handling in message conversions. Additional changes include the
implementation of tool binding and related unit tests. The alterations
offer enhanced model capabilities by enabling interaction with tool-like
objects.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-25 23:20:44 -07:00
Rave Harpaz
ee399e3ec5 community[patch]: Add OCI Generative AI tool and structured output support (#24693)
- [x] **PR title**: 
  community: Add OCI Generative AI tool and structured output support


- [x] **PR message**: 
- **Description:** adding tool calling and structured output support for
chat models offered by OCI Generative AI services. This is an update to
our last PR 22880 with changes in
/langchain_community/chat_models/oci_generative_ai.py
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:** NA


- [x] **Add tests and docs**: 
  1. we have updated our unit tests
2. we have updated our documentation under
/docs/docs/integrations/chat/oci_generative_ai.ipynb


- [x] **Lint and test**: `make format`, `make lint` and `make test` all
ran successfully

---------

Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
2024-07-25 23:19:00 -07:00
Yuki Watanabe
2b6a262f84 community[patch]: Replace filters argument to filter in DatabricksVectorSearch (#24530)
The
[DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21)
class exposes similarity search APIs with the argument `filters`, which is
inconsistent with other VS classes, which use `filter` (singular). This PR
updates the argument and adds an alias for backward compatibility.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-07-25 21:20:18 -07:00
Leonid Ganeline
148766ddc1 docs: integrations missed links (#24681)
Added missed links; missed provider page
2024-07-25 20:38:25 -07:00
Sunish Sheth
59880a9147 community[patch]: mlflow handle empty chunk(#24689) 2024-07-25 20:36:29 -07:00
Eugene Yurtsev
20690db482 core[minor]: Add BaseModel.rate_limiter, RateLimiter abstraction and in-memory implementation (#24669)
This PR proposes to create a rate limiter in the chat model directly,
and would replace: https://github.com/langchain-ai/langchain/pull/21992

It resolves most of the constraints that the Runnable rate limiter
introduced:

1. It's not annoying to apply the rate limiter to existing code; i.e., it's
possible to roll out the change at the location where the model is
instantiated, rather than at every location where the model is used
(which is necessary if the model is used in different ways in a given
application).
2. batch rate limiting is enforced properly
3. the rate limiter works correctly with streaming
4. the rate limiter is aware of the cache
5. The rate limiter can take into account information about the inputs
into the
model (we can add optional inputs to it down-the road together with
outputs!)

The only downside is that information will not be properly reflected in
tracing, as we don't have any metadata events about a rate limiter. So the
total time spent on a model invocation will be:

* time spent waiting for the rate limiter
* time spent on the actual model request

## Example

```python
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_groq import ChatGroq

groq = ChatGroq(rate_limiter=InMemoryRateLimiter(check_every_n_seconds=1))
groq.invoke('hello')
```
2024-07-26 03:03:34 +00:00
Eugene Yurtsev
c623ae6661 experimental[patch]: Fix import test (#24672)
Import test was misconfigured, the glob wasn't returning any file paths
2024-07-25 22:14:40 -04:00
Chaunte W. Lacewell
69eacaa887 Community[minor]: Update VDMS vectorstore (#23729)
**Description:** 
- This PR exposes some functions in VDMS vectorstore, updates VDMS
related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20)

**Issue:** N/A

**Dependencies:** 
- Update vdms>=0.0.20
2024-07-25 22:13:04 -04:00
sykp241095
703491e824 docs: update another TiDB Cloud link as it is already public beta (#24694)
2024-07-25 18:39:55 -07:00
Nuno Campos
8734cabc09 core: Don't draw None edge labels (#24690)
2024-07-25 22:12:39 +00:00
Jacob Lee
ce067c19e9 docs[patch]: Simplify tool calling guide, improve tool calling conceptual guide (#24637)
Lots of duplicated content from concepts, missing pointers to the second
half of the tool calling loop

Simpler + more focused + a more prominent link to the second half of the
loop was what I was aiming for, but down to be more conservative and
just more prominently link the "passing tools back to the model" guide.

I have also moved the tool calling conceptual guide out from under
`Structured Output` (while leaving a small section for structured
output-specific information) and added more content. The existing
`#functiontool-calling` link will go to this new section.
2024-07-25 14:39:14 -07:00
Bagatur
4840db6892 docs: standardize groq chat model docs (#24616)
part of #22296

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-25 14:10:49 -07:00
Isaac Francisco
218c554c4f [docs]: add doctoring to ChatTogether (#24636) 2024-07-25 14:10:41 -07:00
Bagatur
0fe29b4343 docs: standardize Together docs (#24617)
Part of #22296

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-25 14:10:31 -07:00
Isaac Francisco
5c7e589aaf deprecating ollama_functions (#24632) 2024-07-25 13:50:04 -07:00
KyrianC
0fdbaf4a8d community: fix ChatEdenAI + EdenAI Tools (#23715)
Fixes for Eden AI Custom tools and ChatEdenAI:
- add missing import in __init__ of chat_models
- add `args_schema` to custom tools. otherwise '__arg1' would sometimes
be passed to the `run` method
- fix IndexError when no human msg is added in ChatEdenAI
2024-07-25 15:19:14 -04:00
Daniel Campos
871bf5a841 docs: Update snowflake.mdx for arctic-m-v1.5 (#24678)
2024-07-25 17:48:54 +00:00
Leonid Ganeline
8b7cffc363 docs: integrations missed references (#24631)
**Issue:** Several packages are not referenced in the `providers` pages.

**Fix:** Added the missed references. Fixed the notebook formatting.
2024-07-25 13:26:46 -04:00
ccurme
58dd69f7f2 core[patch]: fix mutating tool calls (#24677)
In some cases tool calls are mutated when passed through a tool.
2024-07-25 16:46:36 +00:00
ccurme
dfbd12b384 mistral[patch]: translate tool call IDs to mistral compatible format (#24668)
Mistral appears to have added validation for the format of its tool call
IDs:

`{"object":"error","message":"Tool call id was abc123 but must be a-z,
A-Z, 0-9, with a length of
9.","type":"invalid_request_error","param":null,"code":null}`

This breaks compatibility of messages from other providers. Here we add
a function that converts any string to a Mistral-valid tool call ID, and
apply it to incoming messages.
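
A hedged sketch of one such conversion; the helper below is hypothetical, not necessarily the implementation in langchain-mistralai:

```python
import hashlib
import string

ALPHANUMERIC = string.ascii_letters + string.digits

def to_mistral_tool_call_id(tool_call_id: str) -> str:
    # Deterministically map any string to 9 chars of [a-zA-Z0-9].
    digest = hashlib.sha256(tool_call_id.encode()).digest()
    return "".join(ALPHANUMERIC[b % len(ALPHANUMERIC)] for b in digest[:9])

print(to_mistral_tool_call_id("abc123"))  # stable 9-char alphanumeric ID
```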
2024-07-25 12:39:32 -04:00
maang-h
38d30e285a docs: Standardize BaichuanTextEmbeddings docstrings (#24674)
- **Description:** Standardize BaichuanTextEmbeddings docstrings.
- **Issue:** the issue #21983
2024-07-25 12:12:00 -04:00
Eugene Yurtsev
89bcca3542 experimental[patch]: Bump core (#24671) 2024-07-25 09:05:43 -07:00
rick-SOPTIM
cd563fb628 community[minor]: passthrough auth parameter on requests to Ollama-LLMs (#24068)
**Description:**
This PR allows users of `langchain_community.llms.ollama.Ollama` to
specify the `auth` parameter, which is then forwarded to all internal
calls of `requests.request`. This works in the same way as the existing
`headers` parameter. The auth parameter enables the usage of the given
class with Ollama instances that are secured by more complex
authentication mechanisms which do not rely only on static headers. An
example is AWS API Gateways secured by the IAM authorizer, which
expects signatures dynamically calculated on the specific HTTP request.

**Issue:**

Integrating a remote LLM running through Ollama using
`langchain_community.llms.ollama.Ollama` only allows setting static HTTP
headers with the parameter `headers`. This does not work if the given
instance of Ollama is secured with an authentication mechanism that
makes use of dynamically created HTTP headers, which may for example
depend on the content of a given request.

**Dependencies:**

None

**Twitter handle:**

None

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
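
A usage sketch per the description (URL and credentials are placeholders):

```python
from langchain_community.llms import Ollama
from requests.auth import HTTPBasicAuth

# auth is forwarded to requests.request, so anything requests accepts
# works: a (user, password) tuple or an AuthBase instance.
llm = Ollama(
    base_url="https://my-ollama-gateway.example.com",
    model="llama3",
    auth=HTTPBasicAuth("user", "secret"),
)
```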
2024-07-25 15:48:35 +00:00
남광우
256bad3251 core[minor]: Support asynchronous in InMemoryVectorStore (#24472)
### Description

* support asynchronous in InMemoryVectorStore
* since embeddings may be called asynchronously, ensure that both the
asynchronous and synchronous functions operate correctly.
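
A small sketch of the async surface, using a deterministic fake embedding to stay self-contained:

```python
import asyncio

from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore

async def main() -> None:
    store = InMemoryVectorStore(DeterministicFakeEmbedding(size=8))
    await store.aadd_texts(["hello", "world"])
    print(await store.asimilarity_search("hello", k=1))

asyncio.run(main())
```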
2024-07-25 11:36:55 -04:00
Luca Dorigo
5fdbdd6bec community[patch]: Fix invalid aiohttp verify parameter (#24655)
Should fix https://github.com/langchain-ai/langchain/issues/24654
2024-07-25 11:09:21 -04:00
Daniel Glogowski
221486687a docs: updated CHATNVIDIA notebooks (#24584)
Updated notebook for tool calling support in chat models
2024-07-25 09:22:53 -04:00
Ken Jenney
d6631919f4 docs: tool calling is enabled in ChatOllama (#24665)
Description: According to this page:
https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/
ChatOllama does support Tool Calling.
Issue: The documentation is incorrect
Dependencies: None
Twitter handle: NA
2024-07-25 13:21:30 +00:00
sykp241095
235eb38d3e docs: update TiDB Cloud links as vector search feature becomes public beta (#24667)
2024-07-25 13:20:02 +00:00
Eugene Yurtsev
7dd6b32991 core[minor]: Add InMemoryRateLimiter (#21992)
This PR introduces the following Runnables:

1. BaseRateLimiter: an abstraction for specifying a time based rate
limiter as a Runnable
2. InMemoryRateLimiter: Provides an in-memory implementation of a rate
limiter

## Example

```python

from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda
from datetime import datetime

foo = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    print(datetime.now().strftime("%H:%M:%S.%f"))
    return x

chain = foo | meow

for _ in range(10):
    print(chain.invoke('hello'))
```

Produces:

```
17:12:07.530151
hello
17:12:09.537932
hello
17:12:11.548375
hello
17:12:13.558383
hello
17:12:15.568348
hello
17:12:17.578171
hello
17:12:19.587508
hello
17:12:21.597877
hello
17:12:23.607707
hello
17:12:25.617978
hello
```


![image](https://github.com/user-attachments/assets/283af59f-e1e1-408b-8e75-d3910c3c44cc)


## Interface

The rate limiter uses the following interface for acquiring a token:

```python
class BaseRateLimiter(Runnable[Input, Output], abc.ABC):
    @abc.abstractmethod
    def acquire(self, *, blocking: bool = True) -> bool:
        """Attempt to acquire the necessary tokens for the rate limiter."""
```

The flag `blocking` has been added to the abstraction to allow
supporting streaming (which is easier if blocking=False).

## Limitations

- The rate limiter is not designed to work across different processes.
It is an in-memory rate limiter, but it is thread safe.
- The rate limiter only supports time-based rate limiting. It does not
take into account the size of the request or any other factors.
- The current implementation does not handle streaming inputs well and
will consume all inputs even if the rate limit has been reached. Better
support for streaming inputs will be added in the future.
- When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.
One way to mitigate this is to use batch_as_completed() or
abatch_as_completed().

## Bursty behavior in `batch` and `abatch`

When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.

This becomes a problem if users are using `batch` and `abatch` with many
inputs (e.g., 100). In this case, there will be a burst of 100 inputs
into the batch of the rate limited runnable.

Two options were considered to address this:

1. Using a RunnableBinding

The API would look like:

```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda

rate_limiter = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    return x

rate_limited_meow = RunnableLambda(meow).with_rate_limiter(rate_limiter)
```

2. Another option is to add some init option to RunnableSequence that
changes `.batch()` to be depth first (e.g., by delegating to
`batch_as_completed`)

```python
RunnableSequence(first=rate_limiter, last=model, how='batch-depth-first')
```

Pros: Does not require Runnable Binding
Cons: Feels over-complicated
2024-07-25 01:34:03 +00:00
Oleg Kulyk
4b1b7959a2 community[minor]: Add ScrapingAnt Loader Community Integration (#24514)
Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration.
ScrapingAnt is a web scraping API that allows extracting web page data
into accessible and well-formatted markdown.

Description: Added ScrapingAnt web loader for retrieving web page data
as markdown
Dependencies: scrapingant-client
Twitter: @WeRunTheWorld3

---------

Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>
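
A minimal usage sketch (the API key is a placeholder; requires `scrapingant-client`):

```python
from langchain_community.document_loaders import ScrapingAntLoader

loader = ScrapingAntLoader(["https://scrapingant.com/"], api_key="placeholder")
docs = loader.load()  # page content is returned as markdown
```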
2024-07-24 21:11:43 -04:00
Jacob Lee
afee851645 docs[patch]: Fix image caption document loader page and typo on custom tools page (#24635) 2024-07-24 17:16:18 -07:00
Jacob Lee
a73e2222d4 docs[patch]: Updates LLM caching, HF sentence transformers, and DDG pages (#24633) 2024-07-24 16:58:05 -07:00
Erick Friis
e160b669c8 infra: add unstructured api key to release (#24638) 2024-07-24 16:47:24 -07:00
John
d59c656ea5 unstructured, community, initialize langchain-unstructured package (#22779)
#### Update (2): 
A single `UnstructuredLoader` is added to handle both local and api
partitioning. This loader also handles single or multiple documents.

#### Changes in `community`:
Changes here do not affect users. In the initial process of using the
SDK for the API Loaders, the Loaders in community were refactored.
Other changes include:
The `UnstructuredBaseLoader` has a new check to see if both
`mode="paged"` and `chunking_strategy="by_page"`. It also now has
`Element.element_id` added to the `Document.metadata`.
`UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader` were also
refactored: both now directly inherit from `UnstructuredBaseLoader`,
initialize their `file_path`/`file` attributes respectively, and implement
their own `_post_process_elements` methods.

--------
#### Update:
New SDK Loaders in a [partner
package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo)
are introduced to prevent breaking changes for users (see discussion
below).

##### TODO:
- [x] Test docstring examples
--------
- **Description:** UnstructuredAPIFileIOLoader and
UnstructuredAPIFileLoader calls to the unstructured api are now made
using the unstructured-client sdk.
- **New Dependencies:** unstructured-client

- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] a test for the integration, preferably unit tests that do not rely
on network access,
- [x] update the description in
`docs/docs/integrations/providers/unstructured.mdx`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


TODO:
- [x] Update
https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api
-
`langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb`
- The description here needs to indicate that users should install
`unstructured-client` instead of `unstructured`. Read over closely to
look for any other changes that need to be made.
- [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to
handle json responses from the API instead of just lists of elements.
- This method may need to be overwritten by the API loaders instead of
changing it in the `UnstructuredBaseLoader`.
- [x] Update the documentation links in the class docstrings (the
Unstructured documents have moved)
- [x] Update Document.metadata to include `element_id` (see thread
[here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419))

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
2024-07-24 23:21:20 +00:00
Leonid Ganeline
2394807033 docs: fix ChatGooglePalm fix (#24629)
**Issue:** currently the
[ChatGooglePalm](https://python.langchain.com/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo)
class is not parsed and is not presented in the "API Reference:" line.

**PR:** [Fixed
it](https://langchain-7n5k5wkfs-langchain.vercel.app/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo)
by properly importing.
2024-07-24 18:09:08 -04:00
Joel Akeret
acfce30017 Adding compatibility for OllamaFunctions with ImagePromptTemplate (#24499)
- [ ] **PR title**: "experimental: Adding compatibility for
OllamaFunctions with ImagePromptTemplate"

- [ ] **PR message**: 
- **Description:** Removes the outdated
`_convert_messages_to_ollama_messages` method override in the
`OllamaFunctions` class to ensure that ollama multimodal models can be
invoked with an image.
    - **Issue:** #24174

---------

Co-authored-by: Joel Akeret <joel.akeret@ti&m.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-24 14:57:05 -07:00
Erick Friis
8f3c052db1 cli: release 0.0.26 (#24623)
- **cli: remove snapshot flag from pytest defaults**
2024-07-24 13:13:58 -07:00
ChengZi
29a3b3a711 partners[milvus]: add dynamic field (#24544)
Add the dynamic field feature to langchain_milvus, with more unit tests
for robustness.

We plan to deprecate `metadata_field` in the future, because its
function is the same as `enable_dynamic_field`, but the latter is a
more advanced concept in Milvus.

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-24 20:01:58 +00:00
Erick Friis
20fe4deea0 milvus: release 0.1.3 (#24624) 2024-07-24 13:01:27 -07:00
Erick Friis
3a55f4bfe9 cli: remove snapshot flag from pytest defaults (#24622) 2024-07-24 19:41:01 +00:00
Isaac Francisco
fea9ff3831 docs: add tables for search and code interpreter tools (#24586) 2024-07-24 10:51:39 -07:00
Eugene Yurtsev
b55f6105c6 community[patch]: Add linter to prevent further usage of root_validator and validator (#24613)
This linter is meant to move development to use __init__ instead of
root_validator and validator.

We need to investigate whether we need to lint some of the functionality
of Field (e.g., `lt` and `gt`, `alias`)

`alias` is the one that's most popular:

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "alias=" | wc -l
144

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "ge=" | wc -l
10

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "gt=" | wc -l
4
2024-07-24 12:35:21 -04:00
Anush
4585eaef1b qdrant: Fix vectors_config access (#24606)
## Description

Fixes #24558 by accessing `vectors_config` after asserting it to be a
dict.
2024-07-24 10:54:33 -04:00
ccurme
f337f3ed36 docs: update chain migration guide (#24501)
- Update `ConversationChain` example to show use without session IDs;
- Fix a minor bug (specify history_messages_key).
2024-07-24 10:45:00 -04:00
maang-h
22175738ac docs: Add MongoDBChatMessageHistory docstrings (#24608)
- **Description:** Add MongoDBChatMessageHistory rich docstrings.
- **Issue:** the issue #21983
2024-07-24 10:12:44 -04:00
Anindyadeep
12c3454fd9 [Community] PremAI Tool Calling Functionality (#23931)
This PR is a WIP and adds the following functionalities:

- [X] Supports tool calling across the langchain ecosystem. (However
streaming is not supported)
- [X] Update documentation
2024-07-24 09:53:58 -04:00
Vishnu Nandakumar
e271965d1e community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424)
- [ ] **Community**: "Retrievers: Product Quantization"
- [X] This PR adds the Product Quantization feature to the retrievers in the
LangChain Community. PQ is one of the fastest retrieval methods if the
embeddings are rich enough in context, due to the concepts of
quantization and representation through centroids.
    - **Description:** Adding PQ as one of the retrievers
    - **Dependencies:** using the package nanopq for this PR
    - **Twitter handle:** vishnunkumar_


- [X] **Add tests and docs**: If you're adding a new integration, please
include
   - [X] Added unit tests for the same in the retrievers.
   - [] Will add an example notebook subsequently

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/ -
done the same

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-24 13:52:15 +00:00
stydxm
b9bea36dd4 community: fix typo in warning message (#24597)
- **Description:** 
  This PR fixes a small typo in a warning message
- **Issue:**

![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49)
There were double `Use` and double `instead`
2024-07-24 13:19:07 +00:00
cüre
da06d4d7af community: update finetuned model cost for 4o-mini (#24605)
- **Description:** adds the fine-tuned gpt-4o-mini model price. Reference:
https://openai.com/api/pricing/
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** cureef
2024-07-24 13:17:26 +00:00
Philippe PRADOS
5f73c836a6 openai[small]: Add the new model: gpt-4o-mini (#24594) 2024-07-24 09:14:48 -04:00
Mateusz Szewczyk
597be7d501 docs: Update IBM docs about information to pass client into WatsonxLLM and WatsonxEmbeddings object. (#24602)
- [x] **PR title**: Update IBM docs about information to pass client
into WatsonxLLM and WatsonxEmbeddings object.


- [x] **PR message**: 
- **Description:** Update IBM docs with information on how to pass a client
into the WatsonxLLM and WatsonxEmbeddings objects.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-07-24 09:12:13 -04:00
Jacob Lee
379803751e docs[patch]: Remove very old document comparison notebook (#24587) 2024-07-23 22:25:35 -07:00
ZhangShenao
ad18afc3ec community[patch]: Fix param spelling error in ElasticsearchChatMessageHistory (#24589)
Fix param spelling error in `ElasticsearchChatMessageHistory`
2024-07-23 19:29:42 -07:00
Isaac Francisco
464a525a5a [partner]: minor change to embeddings for Ollama (#24521) 2024-07-24 00:00:13 +00:00
Aayush Kataria
0f45ac4088 LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087)
- This PR adds vector search filtering for Azure Cosmos DB Mongo vCore
and NoSQL.


2024-07-23 16:59:23 -07:00
Gareth
ac41c97d21 pinecone: Add embedding Inference Support (#24515)
**Description**

Add support for Pinecone hosted embedding models as
`PineconeEmbeddings`. Replacement for #22890

**Dependencies**
Add `aiohttp` to support async embeddings call against REST directly

- [x] **Add tests and docs**: If you're adding a new integration, please
include

Added `docs/docs/integrations/text_embedding/pinecone.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Twitter: `gdjdg17`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
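
A minimal usage sketch (reads PINECONE_API_KEY from the environment; the model name follows Pinecone's hosted inference catalog):

```python
from langchain_pinecone import PineconeEmbeddings

embeddings = PineconeEmbeddings(model="multilingual-e5-large")
vector = embeddings.embed_query("hello world")
```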
2024-07-23 22:50:28 +00:00
ccurme
aaf788b7cb docs[patch]: fix chat model tabs in runnable-as-tool guide (#24580) 2024-07-23 18:36:01 -04:00
Bagatur
47ae06698f docs: update ChatModelTabs defaults (#24583) 2024-07-23 21:56:30 +00:00
Erick Friis
03881c6743 docs: fix hf embeddings install (#24577) 2024-07-23 21:03:30 +00:00
ccurme
2d6b0bf3e3 core[patch]: add to RunnableLambda docstring (#24575)
Explain behavior when function returns a runnable.
2024-07-23 20:46:44 +00:00
Erick Friis
ee3955c68c docs: add tool calling for ollama (#24574) 2024-07-23 20:33:23 +00:00
Carlos André Antunes
325068bb53 community: Fix azure_openai.py (#24572)
In some lines it tries to read a key that does not exist yet. In these
cases I changed the direct access to the dict.get() method.


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-07-23 16:22:21 -04:00
Bagatur
bff6ca78a2 docs: duplicate how to link (#24569) 2024-07-23 18:52:05 +00:00
Nik Jmaeff
6878bc39b5 langchain: fix TrajectoryEvalChain.prep_inputs (#19959)
The previous implementation would never be called.


---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-23 18:37:39 +00:00
Bagatur
55e66aa40c langchain[patch]: init_chat_model support ChatBedrockConverse (#24564) 2024-07-23 11:07:38 -07:00
Bagatur
9b7db08184 experimental[patch]: Release 0.0.63 (#24563) 2024-07-23 16:28:37 +00:00
Bagatur
8691a5a37f community[patch]: Release 0.2.10 (#24560) 2024-07-23 09:24:57 -07:00
Bagatur
4919d5d6df langchain[patch]: Release 0.2.11 (#24559) 2024-07-23 09:18:44 -07:00
Bagatur
918e1c8a93 core[patch]: Release 0.2.23 (#24557) 2024-07-23 09:01:18 -07:00
Lance Martin
58def6e34d Add tool calling example to Ollama ntbk (#24522) 2024-07-23 15:58:54 +00:00
Leonid Ganeline
e787532479 langchain: globals fix (#21281)
Issue: functions from `globals`, like `get_debug`, are placed in the
init.py file. As a result, they aren't listed in the API Reference docs.
[See
this](https://langchain-9jq1kef7i-langchain.vercel.app/v0.2/docs/how_to/debugging/#set_debugtrue)
and [this broken
link](https://api.python.langchain.com/en/latest/globals/langchain.globals.set_debug.html).
Change: moved code from init.py into the `globals.py` file and removed the
`globals` directory. Similar to: #21266
BTW, `globals` in core is implemented inside a file, not inside a folder.
2024-07-23 11:23:18 -04:00
Ben Chambers
e80b0932ee community[patch]: small fixes to link extractors (#24528)
- **Description:** small fixes to imports / types in the link extraction
work

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-23 14:28:06 +00:00
Morteza Hosseini
9e06991aae community[patch]: Update URL to the 2markdown API (#24546)
Update the URL to Markdown endpoint.

API information is available here: https://2markdown.com/docs#url2md
2024-07-23 14:27:55 +00:00
ZhangShenao
a14e02ab33 core[patch]: Fix word spelling error in globals.py (#24532)
Fix word spelling error in `globals.py`

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-23 14:27:16 +00:00
maang-h
378db2e1a5 docs: Add RedisChatMessageHistory docstrings (#24548)
- **Description:** Add `RedisChatMessageHistory ` rich docstrings.
- **Issue:** the issue #21983

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-23 14:23:46 +00:00
ccurme
a197a8e184 openai[patch]: move test (#24552)
No-override tests (https://github.com/langchain-ai/langchain/pull/24407)
include a condition that integrations not implement additional tests.
2024-07-23 10:22:22 -04:00
Eugene Yurtsev
0bb54ab9f0 CI: Temporarily disable min version checking on pull request (#24551)
Short term to fix CI
2024-07-23 14:12:08 +00:00
Eugene Yurtsev
f47b4edcc2 standard-test: Fix typo in skipif for chat model integration tests (#24553) 2024-07-23 10:11:01 -04:00
Jesse Wright
837a3d400b chore(docs): SQARQL -> SPARQL typo fix (#24536)
nit picky typo fix
2024-07-23 13:39:34 +00:00
Eugene Yurtsev
20b72a044c standard-tests: Add BaseModel variations tests to with_structured_output (#24527)
After this, standard tests will run with the following combinations:

1. pydantic.BaseModel
2. pydantic.v1.BaseModel

If run within a matrix, this covers both the pydantic.BaseModel originating
from pydantic 1 and the one defined in pydantic 2.
2024-07-23 09:01:26 -04:00
Bagatur
70c71efcab core[patch]: merge_content fix (#24526) 2024-07-22 22:20:22 -07:00
Ben Chambers
a5a3d28776 community[patch]: Remove targets_table from C* GraphVectorStore (#24502)
- **Description:** Remove the unnecessary `targets_table` parameter
2024-07-22 22:09:36 -04:00
Alexander Golodkov
2a70a07aad community[minor]: added new document loaders based on dedoc library (#24303)
### Description
This pull request added new document loaders to load documents of
various formats using [Dedoc](https://github.com/ispras/dedoc):
  - `DedocFileLoader` (determine file types automatically and parse)
  - `DedocPDFLoader` (for `PDF` and images parsing)
- `DedocAPIFileLoader` (determine file types automatically and parse
using Dedoc API without library installation)

[Dedoc](https://dedoc.readthedocs.io) is an open-source library/service
that extracts texts, tables, attached files and document structure
(e.g., titles, list items, etc.) from files of various formats. The
library is actively developed and maintained by a group of developers.

`Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images
and more.
Full list of supported formats can be found
[here](https://dedoc.readthedocs.io/en/latest/#id1).
For `PDF` documents, `Dedoc` allows determining textual layer
correctness and splitting the document into paragraphs.
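A minimal usage sketch of the file loader described above (assuming the standard `langchain_community.document_loaders` import path; the file name is illustrative and the `dedoc` library must be installed):

```python
from langchain_community.document_loaders import DedocFileLoader

# The file type is detected automatically and parsed via the dedoc library.
loader = DedocFileLoader("example.pdf")
docs = loader.load()
print(docs[0].page_content[:100])
```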


### Issue
This pull request extends the variety of document loaders supported by
`langchain_community`, allowing users to choose the most suitable option
for raw document parsing.

### Dependencies
The PR added a new (optional) dependency `dedoc>=2.2.5` ([library
documentation](https://dedoc.readthedocs.io)) to the
`extended_testing_deps.txt`

### Twitter handle
None

### Add tests and docs
1. Test for the integration:
`libs/community/tests/integration_tests/document_loaders/test_dedoc.py`
2. Example notebook:
`docs/docs/integrations/document_loaders/dedoc.ipynb`
3. Information about the library:
`docs/docs/integrations/providers/dedoc.mdx`

### Lint and test

Done locally:

  - `make format`
  - `make lint`
  - `make integration_tests`
  - `make docs_build` (from the project root)

---------

Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>
2024-07-23 02:04:53 +00:00
Ben Chambers
5ac936a284 community[minor]: add document transformer for extracting links (#24186)
- **Description:** Add a DocumentTransformer for executing one or more
`LinkExtractor`s and adding the extracted links to each document.
- **Issue:** n/a
- **Dependencies:** none

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-22 22:01:21 -04:00
Jacob Lee
3c4652c906 docs[patch]: Hide OllamaFunctions now that Ollama supports tool calling (#24523) 2024-07-22 17:56:51 -07:00
Erick Friis
2c6b9e8771 standard-tests: add override check (#24407) 2024-07-22 23:38:01 +00:00
Nithish Raghunandanan
1639ccfd15 couchbase: [patch] Return chat message history in order (#24498)
**Description:** Fixes an issue where the chat message history was not
returned in order. Fixed it now by returning based on timestamps.

- [x] **Add tests and docs**: Updated the tests to check the order


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-22 23:30:29 +00:00
C K Ashby
ab036c1a4c docs: Update .run() to .invoke() (#24520) 2024-07-22 14:21:33 -07:00
Erick Friis
3dce2e1d35 all: add release notes to pypi (#24519) 2024-07-22 13:59:13 -07:00
Bagatur
c48e99e7f2 docs: fix sql db note (#24505) 2024-07-22 13:30:29 -07:00
Bagatur
8a140ee77c core[patch]: don't serialize BasePromptTemplate.input_types (#24516)
Candidate fix for #24513
2024-07-22 13:30:16 -07:00
MarkYQJ
df357f82ca ignore the first turn to apply "history" mechanism (#14118)
This will generate a meaningless string "system: " when generating the
condensed question, which increases the probability of producing an improper
condensed question and misunderstanding the user's question. Below is a case:
- Original Question: Can you explain the arguments of Meilisearch?
- Condensed Question
  - What are the benefits of using Meilisearch? (by CodeLlama)
  - What are the reasons for using Meilisearch? (by GPT-4)

The condensed questions (no matter whether from CodeLlama or GPT-4) are
different from the original one.

By checking the content of each dialogue turn, the history string is
generated only when the dialogue content is not empty.
Since there is nothing before the first turn, the "history" mechanism is
ignored at the very first turn.

With this change, the condensed question becomes "What are the arguments for
using Meilisearch?".


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 20:11:17 +00:00
Bagatur
236e957abb core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417)
After this PR chat models will correctly handle pydantic 2 with
bind_tools and with_structured_output.


```python
import pydantic
print(pydantic.__version__)
```
2.8.2

```python
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class Add(BaseModel):
    x: int
    y: int

model = ChatOpenAI().bind_tools([Add])
print(model.invoke('2 + 5').tool_calls)

model = ChatOpenAI().with_structured_output(Add)
print(type(model.invoke('2 + 5')))
```

```
[{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}]
<class '__main__.Add'>
```


```python
from langchain_openai import ChatOpenAI
from pydantic.v1 import BaseModel, Field

class Add(BaseModel):
    x: int
    y: int

model = ChatOpenAI().bind_tools([Add])
print(model.invoke('2 + 5').tool_calls)

model = ChatOpenAI().with_structured_output(Add)
print(type(model.invoke('2 + 5')))
```

```
[{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}]
<class '__main__.Add'>
```

Addresses issues: https://github.com/langchain-ai/langchain/issues/22782

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-22 20:07:39 +00:00
C K Ashby
199e64d372 Please spell Lex's name correctly Fridman (#24517)
https://www.youtube.com/watch?v=ZIyB9e_7a4c

2024-07-22 19:38:32 +00:00
Erick Friis
1f01c0fd98 infra: remove core from min version pr testing (#24507)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-22 17:46:15 +00:00
Naka Masato
884f76e05a fix: load google credentials properly in GoogleDriveLoader (#12871)
- **Description:** 
- Fix #12870: set scope in `default` func (ref:
https://google-auth.readthedocs.io/en/master/reference/google.auth.html)
- Moved the code to load default credentials to the bottom for clarity
of the logic
	- Add docstring and comment for each credential loading logic
- **Issue:** https://github.com/langchain-ai/langchain/issues/12870
- **Dependencies:** no dependencies change
- **Twitter handle:** @gymnstcs


---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-22 17:43:33 +00:00
Erick Friis
a45337ea07 ollama: release 0.1.0 (#24510) 2024-07-22 10:35:26 -07:00
Isaac Francisco
1318d534af [docs]: minor react change (#24509) 2024-07-22 10:25:01 -07:00
Jorge Piedrahita Ortiz
10e3982b59 community: sambanova integration minor changes (#24503)
- Minor changes in the SambaNova LLM integration
  - default api 
  - docstrings
- minor changes in docs
2024-07-22 17:06:35 +00:00
maang-h
721f709dec community: Improve QianfanChatEndpoint tool result to model (#24466)
- **Description:** When `QianfanChatEndpoint` uses a tool result to
answer questions, the tool's content is required to be in Dict format.
We could require users to return a Dict when calling the tool, but to
stay consistent with other Chat Models, I think such a modification is
necessary.
2024-07-22 11:29:00 -04:00
Chaunte W. Lacewell
02f0a29293 Cookbook: Add Visual RAG example using VDMS (#24353)
- **Description:** Adding notebook to demonstrate visual RAG which uses
both video scene description generated by open source vision models (ex.
video-llama, video-llava etc.) as text embeddings and frames as image
embeddings to perform vector similarity search using VDMS.
  - **Issue:** N/A
  - **Dependencies:** N/A
2024-07-22 11:16:06 -04:00
ccurme
dcba7df2fe community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474) 2024-07-22 11:00:13 -04:00
ccurme
0f7569ddbc core[patch]: enable RunnableWithMessageHistory without config (#23775)
Feedback that `RunnableWithMessageHistory` is unwieldy compared to
ConversationChain and similar legacy abstractions is common.

Legacy chains using memory typically had no explicit notion of threads
or separate sessions. To use `RunnableWithMessageHistory`, users are
forced to introduce this concept into their code. This possibly felt
like unnecessary boilerplate.

Here we enable `RunnableWithMessageHistory` to run without a config if
the `get_session_history` callable has no arguments. This enables
minimal implementations like the following:
```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
memory = InMemoryChatMessageHistory()
chain = RunnableWithMessageHistory(llm, lambda: memory)

chain.invoke("Hi I'm Bob")  # Hello Bob!
chain.invoke("What is my name?")  # Your name is Bob.
```
2024-07-22 10:36:53 -04:00
Mohammad Mohtashim
5ade0187d0 [Community]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in create_sql_agent function (#23693)
- **Description:** The correct prompts for ZERO_SHOT_REACT were not
being used in the `create_sql_agent` function: the specific `SQL_PREFIX`
and `SQL_SUFFIX` prompts were not applied when the client does not
provide any prompts. This is now fixed.
- **Issue:** #23585

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 14:04:20 +00:00
ZhangShenao
0f6737cbfe [Vector Store] Fix function add_texts in TencentVectorDB (#24469)
Regardless of whether `embedding_func` is set or not, the 'text'
attribute of the document should be assigned; otherwise, the
`page_content` of documents in the final search results will be lost.
2024-07-22 09:50:22 -04:00
남광우
7ab82eb8cc langchain: Copy libs/standard-tests folder when building devcontainer (#24470)
### Description

* Fix `libs/langchain/dev.Dockerfile` file. copy the
`libs/standard-tests` folder when building the devcontainer.
* `poetry install --no-interaction --no-ansi --with dev,test,docs`
command requires this folder, but it was not copied.

### Reference

#### Error message when building the devcontainer from the master branch

```
...

[2024-07-20T14:27:34.779Z] ------
 > [langchain langchain-dev-dependencies 7/7] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
0.409 
0.409 Directory ../standard-tests does not exist
------

...
```

#### After the fix

Build success at vscode:

<img width="866" alt="image"
src="https://github.com/user-attachments/assets/10db1b50-6fcf-4dfe-83e1-d93c96aa2317">
2024-07-22 13:46:38 +00:00
rbrugaro
37b89fb7fc fix RAG with quantized embeddings notebook (#24422)
1. Fix the HuggingFacePipeline import error by switching to the newer
partner package
 2. Switch to IPEXModelForCausalLM for performance

There are no dependency changes, since optimum-intel is also needed for
QuantizedBiEncoderEmbeddings

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 13:44:03 +00:00
Thomas Meike
40c02cedaf langchain[patch]: add async methods to ConversationSummaryBufferMemory (#20956)
Added asynchronously callable methods according to the
ConversationSummaryBufferMemory API documentation.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-22 09:21:43 -04:00
Steve Sharp
cecd875cdc docs: Update streaming.ipynb (typo fix) (#24483)
**Description:** Fixes typo `Le'ts` -> `Let's`.

2024-07-22 11:09:13 +00:00
Sheng Han Lim
0c6a3fdd6b langchain: Update ContextualCompressionRetriever base_retriever type to RetrieverLike (#24192)
**Description:**
When initializing retrievers with `configurable_fields` as base
retriever, `ContextualCompressionRetriever` validation fails with the
following error:

```
ValidationError: 1 validation error for ContextualCompressionRetriever
base_retriever
  Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error)
```

Example code:

```python
esearch_retriever = VertexAISearchRetriever(
    project_id=GCP_PROJECT_ID,
    location_id="global",
    data_store_id=SEARCH_ENGINE_ID,
).configurable_fields(
    filter=ConfigurableField(id="vertex_search_filter", name="Vertex Search Filter")
)

# rerank documents with Vertex AI Rank API
reranker = VertexAIRank(
    project_id=GCP_PROJECT_ID,
    location_id=GCP_REGION,
    ranking_config="default_ranking_config",
)

retriever_with_reranker = ContextualCompressionRetriever(
    base_compressor=reranker, base_retriever=esearch_retriever
)
```

It seems the issue stems from `ContextualCompressionRetriever`
insisting that base retrievers strictly inherit from `BaseRetriever`,
which doesn't account for cases where retrievers need to be chained and
can have configurable fields defined.


0a1e475a30/libs/langchain/langchain/retrievers/contextual_compression.py (L15-L22)

This PR proposes that the base_retriever type be set to `RetrieverLike`,
similar to how `EnsembleRetriever` validates its list of retrievers:


0a1e475a30/libs/langchain/langchain/retrievers/ensemble.py (L58-L75)
2024-07-21 14:23:19 -04:00
clement.l
d98b830e4b community: add flag to toggle progress bar (#24463)
- **Description:** Add a flag to determine whether to show the progress bar
- **Issue:** n/a
- **Dependencies:** n/a
- **Twitter handle:** n/a

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-20 13:18:02 +00:00
chuanbei888
6b08a33fa4 community: fix QianfanChatEndpoint default model (#24464)
The default model of baidu_qianfan_endpoint has been changed from
ERNIE-Bot-turbo to ERNIE-Lite-8K.
2024-07-20 13:00:29 +00:00
Nuno Campos
947628311b core[patch]: Accept configurable keys top-level (#23806)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-20 03:49:00 +00:00
Jesus Martinez
c1d1fc13c2 langchain[patch]: Remove multiagent return_direct validation (#24419)
**Description:**

When you use Agents with multi-input tools and some of these tools have
`return_direct=True`, LangChain throws an error related to a validator.
This change is implemented in the [JS
community](https://github.com/langchain-ai/langchainjs/pull/4643) as
well.

**Issue**:
This PR resolves #19843

**Dependencies:**

None

Co-authored-by: Jesus Martinez <jesusabraham.martinez@tyson.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-20 03:27:43 +00:00
Will Badart
74e3d796f1 core[patch]: ensure iterator_ in scope for _atransform_stream_with_config except (#24454)
Before, if an exception was raised in the outer `try` block in
`Runnable._atransform_stream_with_config` before `iterator_` is
assigned, the corresponding `finally` block would blow up with an
`UnboundLocalError`:

```txt
UnboundLocalError: cannot access local variable 'iterator_' where it is not associated with a value
```

By assigning an initial value to `iterator_` before entering the `try`
block, this commit ensures that the `finally` can run, and not bury the
"true" exception under a "During handling of the above exception [...]"
traceback.

Thanks for your consideration!
2024-07-20 03:24:04 +00:00
maang-h
7b28359719 docs: Add ChatSparkLLM docstrings (#24449)
- **Description:** 
  - Add `ChatSparkLLM` docstrings (#22296)
  - Support the `stream` method
2024-07-19 20:19:14 -07:00
Eugene Yurtsev
5e48f35fba core[minor]: Relax constraints on type checking for tools and parsers (#24459)
This will allow tools and parsers to accept pydantic models from any of
the
following namespaces:

* pydantic.BaseModel with pydantic 1
* pydantic.BaseModel with pydantic 2
* pydantic.v1.BaseModel with pydantic 2
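A sketch of what this relaxation enables, using a pydantic 2 model as a tool's `args_schema` (with pydantic 2 installed, swapping the import for `pydantic.v1` should work the same way):

```python
from langchain_core.tools import tool
from pydantic import BaseModel  # or: from pydantic.v1 import BaseModel

class DoubleArgs(BaseModel):
    x: int

@tool(args_schema=DoubleArgs)
def double(x: int) -> int:
    """Double x."""
    return 2 * x

print(double.invoke({"x": 4}))  # -> 8
```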
2024-07-19 21:47:34 -04:00
Isaac Francisco
838464de25 ollama: init package (#23615)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-20 00:43:29 +00:00
Erick Friis
f4ee3c8a22 infra: add min version testing to pr test flow (#24358)
xfailing some sql tests that do not currently work on sqlalchemy v1

#22207 was very much not sqlalchemy v1 compatible. 

Moving forward, implementations should be compatible with both to pass
CI
2024-07-19 22:03:19 +00:00
Erick Friis
50cb0a03bc docs: advanced feature note (#24456)
fixes #24430
2024-07-19 20:05:59 +00:00
Bagatur
842065a9cc community[patch]: Release 0.2.9 (#24453) 2024-07-19 12:50:22 -07:00
Bagatur
27ad6a4bb3 langchain[patch]: Release 0.2.10 (#24452) 2024-07-19 12:50:13 -07:00
Bagatur
dda9438e87 community[patch]: gpt-4o-mini costs (#24421) 2024-07-19 19:02:44 +00:00
Eugene Yurtsev
604dfe2d99 community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451)
This PR addresses the issue raised by (CVE-2024-3095)

https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273

Unfortunately, we didn't do a good job writing the initial report. It's
pointing at both the wrong package and the wrong code.

The affected code is the Web Retriever, not the AsyncHTMLLoader; the
WebRetriever lives in langchain-community.

The vulnerable code lives here: 

0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)


This PR adds a forced opt-in for users to make sure they are aware of
the risk and can mitigate by configuring a proxy:


0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)
2024-07-19 18:51:35 +00:00
Bagatur
f101c759ed docs: how to pass runtime secrets (#24450) 2024-07-19 18:36:28 +00:00
Asi Greenholts
372c27f2e5 community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034)
- **Description:** Search has a limit of 500 results; playlistItems
doesn't. Also added a class to the except clause to catch another common error.
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** @TupleType

---------

Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 14:04:34 -04:00
Rafael Pereira
6a45bf9554 community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300)
**Description:** This PR introduces a change to the
`cypher_generation_chain` to dynamically concatenate inputs. This
improvement aims to streamline the input handling process and make the
method more flexible. The change involves updating the arguments
dictionary with all elements from the `inputs` dictionary, ensuring that
all necessary inputs are dynamically appended. This will ensure that any
cypher generation template will not require a new `_call` method patch.

**Issue:** This PR fixes issue #24260.
2024-07-19 14:03:14 -04:00
Philippe PRADOS
f5856680fe community[minor]: add mongodb byte store (#23876)
The `MongoDBStore` can manage only documents.
It's not possible to use MongoDB for a `CacheBackedEmbeddings`.

With this new implementation, it's possible to use:
```python
CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings=embeddings,
    document_embedding_cache=MongoDBByteStore(
      connection_string=db_uri,
      db_name=db_name,
      collection_name=collection_name,
  ),
)
```
and use MongoDB to cache the embeddings!
2024-07-19 13:54:12 -04:00
yabooung
07715f815b community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258)
Description:
Add UTF-8 encoding support

Issue:
Inability to properly handle characters from certain languages (e.g.,
Korean)

Fix:
Implement UTF-8 encoding in FileChatMessageHistory
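A minimal sketch of the new options (the `encoding` and `ensure_ascii` parameter names follow this PR's description and should be treated as illustrative):

```python
from langchain_community.chat_message_histories import FileChatMessageHistory

history = FileChatMessageHistory(
    file_path="chat_history.json",
    encoding="utf-8",
    ensure_ascii=False,  # keep non-ASCII text (e.g., Korean) readable on disk
)
history.add_user_message("안녕하세요")
```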

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 13:53:21 -04:00
Dristy Srivastava
020cc1cf3e Community[minor]: Added checksum in while send data to pebblo-cloud (#23968)
- **Description:** 
            - Updated checksum in doc metadata
- Sending checksum and removing actual content while sending data to
`pebblo-cloud` if `classifier-location` is `pebblo-cloud` in the
`/loader/doc` API
            - Adding `pb_id` (i.e. pebblo id) to doc metadata
            - Refactoring as needed
- Sending `content-checksum` and removing actual content while sending
data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in the
`prompt` API
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** Updated
- **Docs:** NA

---------

Co-authored-by: dristy.cd <dristy@clouddefense.io>
2024-07-19 13:52:54 -04:00
Eun Hye Kim
9aae8ef416 core[patch]: Fix utils.json_schema.dereference_refs (#24335 KeyError: 400 in JSON schema processing) (#24337)
Description:
This PR fixes a KeyError: 400 that occurs in the JSON schema processing
within the reduce_openapi_spec function. The _retrieve_ref function in
json_schema.py was modified to handle missing components gracefully by
continuing to the next component if the current one is not found. This
ensures that the OpenAPI specification is fully interpreted and the
agent executes without errors.

Issue:
Fixes issue #24335

Dependencies:
No additional dependencies are required for this change.

Twitter handle:
@lunara_x
2024-07-19 13:31:00 -04:00
keval dekivadiya
06f47678ae community[minor]: Add TextEmbed Embedding Integration (#22946)
**Description:**

**TextEmbed** is a high-performance embedding inference server designed
to provide a high-throughput, low-latency solution for serving
embeddings. It supports various sentence-transformer models and includes
the ability to deploy image and text embedding models. TextEmbed offers
flexibility and scalability for diverse applications.

- **PyPI Package:** [TextEmbed on
PyPI](https://pypi.org/project/textembed/)
- **Docker Image:** [TextEmbed on Docker
Hub](https://hub.docker.com/r/kevaldekivadiya/textembed)
- **GitHub Repository:** [TextEmbed on
GitHub](https://github.com/kevaldekivadiya2415/textembed)

**PR Description**
This PR adds functionality for embedding documents and queries using the
`TextEmbedEmbeddings` class. The implementation allows for both
synchronous and asynchronous embedding requests to a TextEmbed API
endpoint. The class handles batching and permuting of input texts to
optimize the embedding process.

**Example Usage:**

```python
from langchain_community.embeddings import TextEmbedEmbeddings

# Initialise the embeddings class
embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url")

# Define a list of documents
documents = [
    "Data science involves extracting insights from data.",
    "Artificial intelligence is transforming various industries.",
    "Cloud computing provides scalable computing resources over the internet.",
    "Big data analytics helps in understanding large datasets.",
    "India has a diverse cultural heritage."
]

# Define a query
query = "What is the cultural heritage of India?"

# Embed all documents
document_embeddings = embeddings.embed_documents(documents)

# Embed the query
query_embedding = embeddings.embed_query(query)

# Print embeddings for each document
for i, embedding in enumerate(document_embeddings):
    print(f"Document {i+1} Embedding:", embedding)

# Print the query embedding
print("Query Embedding:", query_embedding)

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-19 17:30:25 +00:00
Shikanime Deva
9c3da11910 Fix MultiQueryRetriever breaking Embeddings with empty lines (#21093)
Fix MultiQueryRetriever breaking Embeddings with empty lines

```
[chain/end] [1:chain:ConversationalRetrievalChain > 2:retriever:Retriever > 3:retriever:Retriever > 4:chain:LLMChain] [2.03s] Exiting Chain run with output:
[outputs]
> /workspaces/Sfeir/sncf/metabot-backend/.venv/lib/python3.11/site-packages/langchain/retrievers/multi_query.py(116)_aget_relevant_documents()
-> if self.include_original:
(Pdb) queries
['## Alternative questions for "Hello, tell me about phones?":', '', '1. **What are the latest trends in smartphone technology?** (Focuses on recent advancements)', '2. **How has the mobile phone industry evolved over the years?** (Historical perspective)', '3. **What are the different types of phones available in the market, and which one is best for me?** (Categorization and recommendation)']
```

Example of failure on VertexAIEmbeddings

```
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "The text content is empty."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.184.234:443 {created_time:"2024-04-30T09:57:45.625698408+00:00", grpc_status:3, grpc_message:"The text content is empty."}"
```

Fixes: #15959
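The essence of the fix is to drop empty lines from the generated queries before they reach the embedding model; a minimal illustrative sketch (not the exact patch):

```python
generated = [
    '## Alternative questions for "Hello, tell me about phones?":',
    "",
    "1. **What are the latest trends in smartphone technology?**",
]
# Filter out blank lines so the embedder never sees empty text.
queries = [line for line in generated if line.strip()]
assert "" not in queries
```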

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 17:13:12 +00:00
John Kelly
5affbada61 langchain: Add aadd_documents to ParentDocumentRetriever (#23969)
- **Description:** Add an async version of `add_documents` to
`ParentDocumentRetriever`
-  **Twitter handle:** @johnkdev
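A minimal sketch of the new method (the retriever setup of vectorstore, docstore, and splitter is omitted; names are illustrative):

```python
import asyncio

async def index_docs(retriever, docs):
    # New async counterpart of add_documents.
    await retriever.aadd_documents(docs)

# asyncio.run(index_docs(retriever, docs))
```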

---------

Co-authored-by: John Kelly <j.kelly@mwam.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 13:12:39 -04:00
Andrew Benton
f9d64d22e5 community[minor]: Add Riza Python/JS code execution tool (#23995)
- **Description:** Add Riza Python/JS code execution tool
- **Issue:** N/A
- **Dependencies:** an optional dependency on the `rizaio` pypi package
- **Twitter handle:** [@rizaio](https://x.com/rizaio)

[Riza](https://riza.io) is a safe code execution environment for
agent-generated Python and JavaScript that's easy to integrate into
langchain apps. This PR adds two new tool classes to the community
package.
2024-07-19 17:03:22 +00:00
Ben Chambers
3691701d58 community[minor]: Add keybert-based link extractor (#24311)
- **Description:** Add a `KeybertLinkExtractor` for graph vectorstores.
This allows extracting links from keywords in a Document and linking
nodes that have common keywords.
- **Issue:** None
- **Dependencies:** None.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-19 12:25:07 -04:00
Erick Friis
ef049769f0 core[patch]: Release 0.2.22 (#24423)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-19 09:09:24 -07:00
Bagatur
cd19ba9a07 core[patch]: core lint fix (#24447) 2024-07-19 09:01:22 -07:00
Ben Chambers
83f3d95ffa community[minor]: GLiNER link extraction (#24314)
- **Description:** This allows extracting links between documents with
common named entities using [GLiNER](https://github.com/urchade/GLiNER).
- **Issue:** None
- **Dependencies:** None

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 15:34:54 +00:00
Anas Khan
b5acb91080 Mask API keys for various LLM/ChatModel Modules (#13885)
**Description:** 
- Added masking of the API Keys for the modules:
  - `langchain/chat_models/openai.py`
  - `langchain/llms/openai.py`
  - `langchain/llms/google_palm.py`
  - `langchain/chat_models/google_palm.py`
  - `langchain/llms/edenai.py`

- Updated the modules to utilize `SecretStr` from pydantic to securely
manage API key.
- Added unit/integration tests
- `langchain/chat_models/azure_openai.py` used the `openai_api_key`
derived from the `ChatOpenAI` class; it assumed `openai_api_key` is a
str, so we changed it to expect `SecretStr` instead.
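A minimal sketch of the `SecretStr` masking pattern these changes rely on:

```python
from pydantic import SecretStr

api_key = SecretStr("sk-secret-value")
print(api_key)                     # ********** (masked in repr and logs)
print(api_key.get_secret_value())  # raw key, only via explicit access
```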

**Issue:** https://github.com/langchain-ai/langchain/issues/12165 ,
**Dependencies:** none,
**Tag maintainer:** @eyurtsev

---------

Co-authored-by: HassanA01 <anikeboss@gmail.com>
Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca>
Co-authored-by: kristinspenc <kristinspenc2003@gmail.com>
Co-authored-by: faisalt14 <faisalt14@gmail.com>
Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com>
Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com>
Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 15:23:34 +00:00
ccurme
f99369a54c community[patch]: fix formatting (#24443)
Somehow this got through CI:
https://github.com/langchain-ai/langchain/pull/24363
2024-07-19 14:38:53 +00:00
Ben Chambers
242b085be7 Merge pull request #24315
* community: Add Hierarchy link extractor

* add example

* lint
2024-07-19 09:42:26 -04:00
Rhuan Barros
c3308f31bc Merge pull request #24363
* important email fields
2024-07-19 09:41:20 -04:00
Piotr Romanowski
c50dd79512 docs: Update langchain-openai package version in chat_token_usage_tracking (#24436)
This PR updates docs to mention the correct version of the
`langchain-openai` package required to use the `stream_usage` parameter.

As can be seen in the details of this [merge
commit](722c8f50ea),
that functionality is available only in `langchain-openai >= 0.1.9`,
while the docs state it's available in `langchain-openai >= 0.1.8`.
2024-07-19 13:07:37 +00:00
Han Sol Park
aade9bfde5 Mask API key for ChatOpenAI based chat_models (#14293)
- **Description**: Mask API key for ChatOpenAI-based chat_models
(openai, azureopenai, anyscale, everlyai).
Made changes to all chat_models that are based on ChatOpenAI, since all
of them assumed that openai_api_key is a str rather than SecretStr.
  - **Issue:**: #12165 
  - **Dependencies:**  N/A
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 02:25:38 +00:00
William FH
0ee6ed76ca [Evaluation] Pass in seed directly (#24403)
adding test rn
2024-07-18 19:12:28 -07:00
Nuno Campos
62b6965d2a core: In ensure_config don't copy dunder configurable keys to metadata (#24420) 2024-07-18 22:28:52 +00:00
Eugene Yurtsev
ef22ebe431 standard-tests[patch]: Add pytest assert rewrites (#24408)
This will surface nice error messages in subclasses that fail assertions.
2024-07-18 21:41:11 +00:00
Eugene Yurtsev
f62b323108 core[minor]: Support all versions of pydantic base model in argsschema (#24418)
This adds support for tools to use any pydantic base model.

The only potential issue is that `get_input_schema()` will not always
return a v1 base model.
2024-07-18 17:14:23 -04:00
Prakul
b2bc15e640 docs: Update mongodb README.md (#24412)
2024-07-18 14:02:34 -07:00
Evan Harris
61ea7bf60b Add a ListRerank document compressor (#13311)
- **Description:** This PR adds a new document compressor called
`ListRerank`, derived from `BaseDocumentCompressor`. It's a near-exact
implementation of the approach introduced by the paper [Zero-Shot Listwise
Document Reranking with a Large Language
Model](https://arxiv.org/pdf/2305.02156.pdf), which the paper finds to
outperform pointwise reranking; pointwise reranking is somewhat implemented
in LangChain as
[LLMChainFilter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter.py).
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** @hwchase17 @izzymsft
- **Twitter handle:** @HarrisEMitchell

Notes:
1. I didn't add anything to `docs`. I wasn't exactly sure which patterns
to follow as [cohere reranker is under
Retrievers](https://python.langchain.com/docs/integrations/retrievers/cohere-reranker)
with other external document retrieval integrations, but other
contextual compression is
[here](https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/).
Happy to contribute to either with some direction.
2. I followed syntax, docstrings, implementation patterns, etc. as well
as I could looking at nearby modules. One thing I didn't do was put the
default prompt in a separate `.py` file like [Chain
Filter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter_prompt.py)
and [Chain
Extract](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_extract_prompt.py).
Happy to follow that pattern if it would be preferred.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 20:34:38 +00:00
Srijan Dubey
4c651ba13a Adding LangChain v0.2 support for nvidia ai endpoint, langchain-nvidia-ai-endpoints. Removed deprecated classes from nvidia_ai_endpoints.ipynb (#24411)
Description: added support for LangChain v0.2 for the NVIDIA AI endpoints
integration. Implemented in-memory storage for chains using
RunnableWithMessageHistory, which is analogous to using
`ConversationChain` (used in v0.1) with the default
`ConversationBufferMemory`. That class is deprecated in favor of
`RunnableWithMessageHistory` in LangChain v0.2.

Issue: None

Dependencies: None.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 15:59:26 -04:00
Erick Friis
334fc1ed1c mongodb: release 0.1.7 (#24409) 2024-07-18 18:13:27 +00:00
ccurme
ba74341eee docs: update tool calling how-to to pass functions to bind_tools (#24402) 2024-07-18 08:53:48 -07:00
Harrison Chase
3adf710f1d docs: improve docs on tools (#24404)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-18 08:52:12 -07:00
Eun Hye Kim
07c5c60f63 community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384)
**Description:**
- Updated the format for the 'Action' section in the planner prompt to
ensure it must be one of the tools without additional words. Adjusted
the phrasing from "should be" to "must be" for clarity and
enforceability.
- Corrected the tool appending logic in the
`_create_api_controller_agent` function to ensure that
`RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are
properly added to the tools list for "DELETE" and "PATCH" operations.

**Issue:** #24382

**Dependencies:** None

**Twitter handle:** @lunara_x

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 13:37:46 +00:00
Casey Clements
aade1550c6 docs: Adds MongoDBAtlasVectorSearch to VectorStore list compatible with Indexing API (#24374)
Adds MongoDBAtlasVectorSearch to list of VectorStores compatible with
the Indexing API.

(One line change.)

As of `langchain-mongodb = "0.1.7"`, the requirements that the
VectorStore have both add_documents and delete methods with an ids kwarg
is satisfied. #23535 contains the implementation of that, and has been
merged.
2024-07-18 09:37:29 -04:00
Chen Xiabin
63c60a31f0 [fix] baidu qianfan AIMessage with usage_metadata (#24389)
Constructing AIMessage usage_metadata raised an error; this fixes it.
2024-07-18 09:28:16 -04:00
João Dinis Ferreira
242de9aa5e docs: remove redundant --quiet option in pip install command (#24397)
- **Description:** Removes a redundant option in a `pip install` command
in the documentation.
- **Issue:** N/A
- **Dependencies:** N/A
2024-07-18 13:24:42 +00:00
ZhangShenao
916b813107 community[patch]: Fix spelling error in ConversationVectorStoreTokenBufferMemory doc-string (#24385)
Fix word spelling error in `ConversationVectorStoreTokenBufferMemory`
2024-07-18 12:27:36 +00:00
Rajendra Kadam
1c65529fd7 community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393)
Thank you for contributing to LangChain!

- [x] **PR title**: [PebbloSafeLoader] Rename loader type and add
SharePointLoader to supported loaders
    - **Description:** Minor fixes in the PebbloSafeLoader:
        - Renamed the loader type from `remote_db` to `cloud_folder`.
- Added `SharePointLoader` to the list of loaders supported by
PebbloSafeLoader.
    - **Issue:** NA
    - **Dependencies:** NA
- [x] **Add tests and docs**: NA
2024-07-18 08:23:12 -04:00
Eugene Yurtsev
6182a402f1 experimental[patch]: block a few more things from PALValidator (#24379)
* Please see security warning already in existing class.
* The approach here is fundamentally insecure, as it relies on a blocklist
approach rather than one based on only running allowed nodes.
So users should only use this code if it's running in a properly
sandboxed environment.
2024-07-18 08:22:45 -04:00
Paolo Ráez
0dec72cab0 Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987)
### Description
Missing "stream" parameter. Without it, you'd never receive a stream of
tokens when using stream() or astream()

### Issue
No existing issue available
2024-07-18 02:09:39 +00:00
Eugene Yurtsev
570566b858 core[patch]: Update API reference for astream events (#24359)
Update the API reference for astream events to include information about
custom events.
2024-07-17 21:48:53 -04:00
Bagatur
f9baaae3ec docs: clean up tool how to titles (#24373) 2024-07-17 17:08:31 -07:00
Bagatur
4da1df568a docs: tools concepts (#24368) 2024-07-17 17:08:16 -07:00
Erick Friis
96ccba9c27 infra: 15s retry wait on test pypi (#24375) 2024-07-17 23:41:22 +00:00
Bagatur
a4c101ae97 core[patch]: Release 0.2.21 (#24372) 2024-07-17 22:44:35 +00:00
William FH
c5a07e2dd8 core[patch]: add InjectedToolArg annotation (#24279)
```python
from typing_extensions import Annotated
from langchain_core.tools import tool, InjectedToolArg
from langchain_anthropic import ChatAnthropic

@tool
def multiply(x: int, y: int, not_for_model: Annotated[dict, InjectedToolArg]) -> int:
    """multiply."""
    return x * y 

ChatAnthropic(model='claude-3-sonnet-20240229',).bind_tools([multiply]).invoke('5 times 3').tool_calls
'''
-> [{'name': 'multiply',
  'args': {'x': 5, 'y': 3},
  'id': 'toolu_01Y1QazYWhu4R8vF4hF4z9no',
  'type': 'tool_call'}]
'''
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-17 15:28:40 -07:00
Erick Friis
80f3d48195 openai: release 0.1.18 (#24369) 2024-07-17 22:26:33 +00:00
Bagatur
7d83189b19 openai[patch]: use model_name in AzureOpenAI.ls_model_name (#24366) 2024-07-17 15:24:05 -07:00
Nithish Raghunandanan
eb26b5535a couchbase: Add chat message history (#24356)
**Description:** Add support for chat message history using Couchbase


---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
2024-07-17 15:22:42 -07:00
Eugene Yurtsev
96bac8e20d core[patch]: Fix regression requiring input_variables in few chat prompt templates (#24360)
* Fix regression that requires users to pass input_variables=[].

* Regression introduced by my own changes to this PR:
https://github.com/langchain-ai/langchain/pull/22851
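A sketch of the behavior the fix restores (assuming input variables are once again inferred from the messages, so the `input_variables=[]` workaround is unnecessary):

```python
from langchain_core.prompts import ChatPromptTemplate

# Input variables are inferred from the template placeholders.
prompt = ChatPromptTemplate.from_messages(
    [("system", "You are concise."), ("human", "{question}")]
)
assert prompt.input_variables == ["question"]
```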
2024-07-17 18:14:57 -04:00
Brice Fotzo
034a8c7c1b community: support advanced text extraction options for pdf documents (#20265)
**Description:** 
- Updated constructors in PyPDFParser and PyPDFLoader to handle
`extraction_mode` and additional kwargs, aligning with the capabilities
of `PageObject.extract_text()` from pypdf.

- Added `test_pypdf_loader_with_layout` along with a corresponding
example text file to validate layout extraction from PDFs.
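A sketch of the option described in the first bullet above (the kwarg name mirrors pypdf's `extraction_mode`; the file name is illustrative):

```python
from langchain_community.document_loaders import PyPDFLoader

# extraction_mode is forwarded to pypdf's PageObject.extract_text().
loader = PyPDFLoader("report.pdf", extraction_mode="layout")
docs = loader.load()
```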

**Issue:** fixes #19735 

**Dependencies:** This change requires updating the pypdf dependency
from version 3.4.0 to at least 4.0.0.


---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 20:47:09 +00:00
hmasdev
a402de3dae langchain[patch]: fix wrong dict key in OutputFixingParser, RetryOutputParser and RetryWithErrorOutputParser (#23967)
# Description
This PR aims to solve a bug in `OutputFixingParser`, `RetryOutputParser`,
and `RetryWithErrorOutputParser`.
The bug is that the wrong keyword argument was given to `retry_chain`:
the correct keyword argument is 'completion', but 'input' was used.

This pull request makes the following changes:
1. correct a `dict` key given to `retry_chain`;
2. add a test when using the default prompt.
   - `NAIVE_FIX_PROMPT` for `OutputFixingParser`;
   - `NAIVE_RETRY_PROMPT` for `RetryOutputParser`;
   - `NAIVE_RETRY_WITH_ERROR_PROMPT` for `RetryWithErrorOutputParser`;
3. ~~add comments on `retry_chain` input and output types~~ clarify
`InputType` and `OutputType` of `retry_chain`
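For context, a minimal sketch of how the fixing parser is typically wired up (an OpenAI key is assumed for the repair model):

```python
from langchain.output_parsers import OutputFixingParser
from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

class Answer(BaseModel):
    value: int

base = PydanticOutputParser(pydantic_object=Answer)
fixer = OutputFixingParser.from_llm(parser=base, llm=ChatOpenAI())
# On malformed output, the fixer invokes its retry_chain with the
# 'completion' key -- the dict key this PR corrects from 'input'.
print(fixer.parse('{"value": 7}'))
```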

# Issue
The bug is pointed out in
https://github.com/langchain-ai/langchain/pull/19792#issuecomment-2196512928

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 20:34:46 +00:00
Casey Clements
a47f69a120 partners/mongodb : Significant MongoDBVectorSearch ID enhancements (#23535)
## Description

This pull-request improves the treatment of document IDs in
`MongoDBAtlasVectorSearch`.

Class method signatures of add_documents, add_texts, delete, and
from_texts now include an `ids: Optional[List[str]]` keyword argument,
permitting the user greater control.
Note that, as before, IDs may also be inferred from
`Document.metadata['_id']` if present, but this is no longer required.
IDs can also optionally be returned from searches.
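An illustrative sketch of the new call shapes (connection and index setup omitted):

```python
from langchain_mongodb import MongoDBAtlasVectorSearch

def upsert_and_delete(store: MongoDBAtlasVectorSearch) -> None:
    # Caller-supplied IDs make writes idempotent and deletions targeted.
    ids = store.add_texts(["doc one", "doc two"], ids=["id-1", "id-2"])
    store.delete(ids=ids[:1])
```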

This PR closes the following JIRA issues.

* [PYTHON-4446](https://jira.mongodb.org/browse/PYTHON-4446)
MongoDBVectorSearch delete / add_texts function rework
* [PYTHON-4435](https://jira.mongodb.org/browse/PYTHON-4435) Add support
for "Indexing"
* [PYTHON-4534](https://jira.mongodb.org/browse/PYTHON-4534) Ensure
datetimes are json-serializable

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 13:26:20 -07:00
Erick Friis
cc2cbfabfc milvus: release 0.1.2 (#24365) 2024-07-17 19:42:44 +00:00
Eugene Yurtsev
9e4a0e76f6 core[patch]: Fix one unit test for chat prompt template (#24362)
Minor change that fixes a unit test that had missing assertions.
2024-07-17 18:56:48 +00:00
Erick Friis
81639243e2 openai: release 0.1.17 (#24361) 2024-07-17 18:50:42 +00:00
Erick Friis
61976a4147 pinecone: release 0.1.2 (#24355) 2024-07-17 17:09:07 +00:00
Bagatur
b5360e2e5f community[patch]: Release 0.2.8 (#24354) 2024-07-17 17:07:27 +00:00
ccurme
4cf67084d3 openai[patch]: fix key collision and _astream (#24345)
Fixes small issues introduced in
https://github.com/langchain-ai/langchain/pull/24150 (unreleased).
2024-07-17 12:59:26 -04:00
Luis Moros
bcb5f354ad community: Fix SQLDatabase.from_databricks issue when run from Job (#24346)
- Description: When SQLDatabase.from_databricks is run from a Databricks
Workflow job, line 205 (default_host = context.browserHostName) throws
an ``AttributeError``, as the ``context`` object has no
``browserHostName`` attribute. The fix handles the exception and sets
the ``default_host`` variable to None.

---------

Co-authored-by: lmorosdb <lmorosdb>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-17 12:40:12 -04:00
Bagatur
24e9b48d15 langchain[patch]: Release 0.2.9 (#24327) 2024-07-17 09:39:57 -07:00
Rafael Pereira
cf28708e7b Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725)
**Description:** At the moment the neo4j wrapper uses setVectorProperty,
which is deprecated
([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)).
I replaced it with the non-deprecated version.

Neo4j recently introduced a new cypher method,
"setRelationshipVectorProperty", to associate embeddings with
relationships. In this PR I implemented a new method to perform this
association, maintaining the same format used in the "add_embeddings"
method, which associates embeddings with Nodes.
I also included a test case for this new method.
2024-07-17 12:37:47 -04:00
maang-h
2a3288b15d docs: Add ChatBaichuan docstrings (#24348)
- **Description:** Add ChatBaichuan rich docstrings.
- **Issue:** #22296
2024-07-17 12:00:16 -04:00
Srijan Dubey
1792684e8f removed deprecated classes from pipelineai.ipynb, added support for LangChain v0.2 for PipelineAI integration (#24333)
Description: added support for LangChain v0.2 for the PipelineAI
integration. Removed deprecated classes, replaced LLMChain with the
Runnable interface, and added StrOutputParser, which parses an LLMResult
into the most likely string.

Issue: None

Dependencies: None.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-17 13:48:32 +00:00
Tobias Sette
e60ad12521 docs(infobip.ipynb): fix typo (#24328) 2024-07-17 13:33:34 +00:00
Rafael Pereira
fc41730e28 neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338)
**Description:** 
This PR addresses two main issues in the `test_neo4jvector.py`:
1. **Order-insensitive Comparison:** Modified the
`test_retrieval_dictionary` to ensure that it passes regardless of the
order of returned values by parsing `page_content` into a structured
format (dictionary) before comparison.
2. **Floating-point Precision:** Updated
`test_neo4jvector_relevance_score` to handle minor floating-point
precision differences by using the `isclose` function for comparing
relevance scores with a relative tolerance.

Errors addressed:

- **test_neo4jvector_relevance_score:**
  ```
AssertionError: assert [(Document(page_content='foo', metadata={'page':
'0'}), 1.0000014305114746), (Document(page_content='bar',
metadata={'page': '1'}), 0.9998371005058289),
(Document(page_content='baz', metadata={'page': '2'}),
0.9993508458137512)] == [(Document(page_content='foo', metadata={'page':
'0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}),
0.9998376369476318), (Document(page_content='baz', metadata={'page':
'2'}), 0.9993523359298706)]
At index 0 diff: (Document(page_content='foo', metadata={'page': '0'}),
1.0000014305114746) != (Document(page_content='foo', metadata={'page':
'0'}), 1.0)
  Full diff:
  - [(Document(page_content='foo', metadata={'page': '0'}), 1.0),
+ [(Document(page_content='foo', metadata={'page': '0'}),
1.0000014305114746),
? +++++++++++++++
- (Document(page_content='bar', metadata={'page': '1'}),
0.9998376369476318),
? ^^^ ------
+ (Document(page_content='bar', metadata={'page': '1'}),
0.9998371005058289),
? ^^^^^^^^^
- (Document(page_content='baz', metadata={'page': '2'}),
0.9993523359298706),
? ----------
+ (Document(page_content='baz', metadata={'page': '2'}),
0.9993508458137512),
? ++++++++++
  ]
  ```

- **test_retrieval_dictionary:**
  ```
AssertionError: assert [Document(page_content='skills:\n- Python\n- Data
Analysis\n- Machine Learning\nname: John\nage: 30\n')] ==
[Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')]
At index 0 diff: Document(page_content='skills:\n- Python\n- Data
Analysis\n- Machine Learning\nname: John\nage: 30\n') !=
Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')
  Full diff:
- [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')]
? ---------
+ [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: John\nage: 30\n')]
? +++++++++
  ```
2024-07-17 09:28:25 -04:00
Erick Friis
47ed7f766a infra: fix release prerelease deps bug (#24323) 2024-07-16 15:13:41 -07:00
Bagatur
80e7cd6cff core[patch]: Release 0.2.20 (#24322) 2024-07-16 15:04:36 -07:00
Erick Friis
6c3e65a878 infra: prerelease dep checking on release (#23269) 2024-07-16 21:48:15 +00:00
Eugene Yurtsev
616196c620 Docs: Add how to dispatch custom callback events (#24278)
* Add how-to guide for dispatching custom callback events.
* Add links from index to the how to guide
* Add link from streaming from within a tool
* Update versionadded to correct release
https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.2.15
2024-07-16 17:38:32 -04:00
Erick Friis
dd7938ace8 docs: readthedocs deprecation fix (#24321)
https://about.readthedocs.com/blog/2024/07/addons-by-default/#how-does-it-affect-my-projects

we use build.command so we're already using addons, so I think this is
it
2024-07-16 20:32:51 +00:00
Srijan Dubey
ef07308c30 Upgraded shaleprotocol to use langchain v0.2 removed deprecated classes (#24320)
Description: Added support for LangChain v0.2 for Shale Protocol.
Replaced LLMChain with the Runnable interface, which allows any two
Runnables to be 'chained' together into sequences. Also added
StreamingStdOutCallbackHandler, a callback handler for streaming.
Issue: None
Dependencies: None.
2024-07-16 20:07:36 +00:00
pbharti0831
049bc37111 Cookbook for applying RAG locally using open source models and tools on CPU (#24284)
This cookbook guides users through implementing RAG locally on a CPU
using langchain tools and open source models. It enables the Llama2
model to answer queries about the Intel Q1 2024 earnings release using a
RAG pipeline.

The main libraries are langchain, llama-cpp-python and gpt4all.


---------

Co-authored-by: Sriragavi <sriragavi.r@intel.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-16 15:17:10 -04:00
Leonid Ganeline
5ccf8ebfac core: docstrings vectorstores update (#24281)
Added missing docstrings. Formatted docstrings into a consistent form.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-16 16:58:11 +00:00
Erick Friis
1e9cc02ed8 openai: raw response headers (#24150) 2024-07-16 09:54:54 -07:00
Bagatur
dc42279eb5 core[patch]: fix Typing.cast import (#24313)
Fixes #24287
2024-07-16 16:53:48 +00:00
Anush
e38bf08139 qdrant: Fixed typos in Qdrant vectorstore docs (#24312)
## Description 

As that title goes.
2024-07-16 09:44:07 -07:00
bovlb
5caa381177 community[minor]: Add ApertureDB as a vectorstore (#24088)
- [X] **ApertureDB as vectorstore**: "community: Add ApertureDB as a
vectorstore"

- **Description:** this change provides a new community integration that
uses ApertureData's ApertureDB as a vector store.
    - **Issue:** none
    - **Dependencies:** depends on ApertureDB Python SDK
    - **Twitter handle:** ApertureData

- [X] **Add tests and docs**: Integration tests rely on a local run of a
public docker image. The example notebook additionally relies on a local
Ollama server.

- [X] **Lint and test**: All lint tests pass.

---------

Co-authored-by: Gautam <gautam@aperturedata.io>
2024-07-16 09:32:59 -07:00
frob
c59e663365 community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973)
Fix docstring for the Ollama integration
2024-07-16 14:48:38 +00:00
Mazen Ramadan
0c1889c713 docs: fix parameter typo in scrapfly loader docs (#24307)
Fixed a wrong parameter name in
[ScrapflyLoader](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/document_loaders/scrapfly.py)
docs, where `ignore_scrape_failures` was used instead of
`continue_on_failure`.

- Description: Fix wrong parameter name in ScrapflyLoader docs.
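For reference, a minimal usage sketch with the corrected parameter name
(URL and API key are placeholders):

```python
from langchain_community.document_loaders import ScrapflyLoader

loader = ScrapflyLoader(
    ["https://web-scraping.dev/products"],  # placeholder URL
    api_key="your-scrapfly-api-key",  # placeholder
    continue_on_failure=True,  # skip failed scrapes instead of raising
)
docs = loader.load()
```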
2024-07-16 14:48:13 +00:00
Leonid Ganeline
5fcf2ef7ca core: docstrings documents (#23506)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-16 10:43:54 -04:00
Rafael Pereira
77dd327282 Docs: Fix Concepts Integration Tools Link (#24301)
- **Description:** This PR fix concepts integrations tools link.

- **Issue:** Fixes issue #24112

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-16 10:29:30 -04:00
Rahul Raghavendra Choudhury
f5a38772a8 community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270)
When using TavilySearchAPIRetriever with any conversation chain, the
following error occurs:

`TypeError: Client.__init__() got an unexpected keyword argument
'api_key'`

This is because the retriever class uses the deprecated `Client`
class; `TavilyClient` needs to be used instead.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-16 13:35:28 +00:00
Shenhai Ran
5f2dea2b20 core[patch]: Add encoding options when creating a prompt template from a file (#24054)
- Uses default utf-8 encoding for loading prompt templates from a file
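A minimal sketch of the new option (file name is a placeholder):

```python
from langchain_core.prompts import PromptTemplate

# the encoding keyword controls how the template file is decoded
prompt = PromptTemplate.from_file("prompt.txt", encoding="utf-8")
```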

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-16 09:35:09 -04:00
Chen Xiabin
69b1603173 baidu qianfan AIMessage with usage_metadata (#24288)
Add usage_metadata to the qianfan AIMessage. Thanks
2024-07-16 09:30:50 -04:00
amcastror
d83164f837 Update retrievers.ipynb (#24289)
2024-07-16 13:30:41 +00:00
Leonid Ganeline
198b85334f core[patch]: docstrings langchain_core/ files update (#24285)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-16 09:21:51 -04:00
Dobiichi-Origami
7aeaa1974d community[patch]: change the class of qianfan_ak and qianfan_sk parameters (#24293)
- **Description:** we changed the class of two parameters to fix a bug,
which causes validation failure when using QianfanEmbeddingEndpoint
2024-07-16 09:17:48 -04:00
Tibor Reiss
1c753d1e81 core[patch]: Update typing for template format to include jinja2 as a Literal (#24144)
Fixes #23929 via adjusting the typing
2024-07-16 09:09:42 -04:00
Jacob Lee
6716379f0c docs[patch]: Fix rendering issue in code splitter page (#24291) 2024-07-15 23:08:21 -07:00
Jacob Lee
58fdb070fa docs[patch]: Update intro diagram (#24290)
CC @agola11
2024-07-15 22:04:42 -07:00
Erick Friis
1d7a3ae7ce infra: add test deps to add_dependents (#24283) 2024-07-15 15:48:53 -07:00
Erick Friis
d2f671271e langchain: fix extended test (#24282) 2024-07-15 15:29:48 -07:00
Lage Ragnarsson
a3c10fc6ce community: Add support for specifying hybrid search for Databricks vector search (#23528)
**Description:**

Databricks Vector Search recently added support for hybrid
keyword-similarity search.
See [usage
examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint)
from their documentation.

This PR updates the Langchain vectorstore interface for Databricks to
enable the user to pass the *query_type* parameter to
*similarity_search* to make use of this functionality.
By default, there will not be any changes for existing users of this
interface. To use the new hybrid search feature, it is now possible to
do

```python
# ...
dvs = DatabricksVectorSearch(index)
dvs.similarity_search("my search query", query_type="HYBRID")
```

Or using the retriever:

```python
retriever = dvs.as_retriever(
    search_kwargs={
        "query_type": "HYBRID",
    }
)
retriever.invoke("my search query")
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-15 22:14:08 +00:00
Christopher Tee
5171ffc026 community(you): Integrate You.com conversational APIs (#23046)
You.com is releasing two new conversational APIs — Smart and Research. 

This PR:
- integrates those APIs with LangChain as LLMs
- streaming is supported

2024-07-15 17:46:58 -04:00
maang-h
6c7d9f93b9 feat: Add ChatTongyi structured output (#24187)
- **Description:** Add `with_structured_output` method to ChatTongyi to
support structured output.
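A minimal sketch of the new method (the schema and prompt are
illustrative; requires DashScope credentials):

```python
from langchain_community.chat_models import ChatTongyi
from langchain_core.pydantic_v1 import BaseModel, Field


class Joke(BaseModel):
    """A joke to tell the user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")


llm = ChatTongyi()  # reads DASHSCOPE_API_KEY from the environment
structured_llm = llm.with_structured_output(Joke)
structured_llm.invoke("Tell me a joke about cats")
```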
2024-07-15 15:57:21 -04:00
Chen Xiabin
8f4620f4b8 baidu qianfan streaming token_usage (#24117)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-15 19:52:31 +00:00
maang-h
9d97de34ae community[patch]: Improve ChatBaichuan init args and role (#23878)
- **Description:** Improve ChatBaichuan init args and role
   -  ChatBaichuan adds `system` role
   - alias: `baichuan_api_base` -> `base_url`
   - `with_search_enhance` is deprecated
   - Add `max_tokens` argument
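A sketch of the updated constructor (values are placeholders):

```python
from langchain_community.chat_models import ChatBaichuan
from langchain_core.messages import HumanMessage, SystemMessage

chat = ChatBaichuan(
    baichuan_api_key="your-api-key",  # placeholder
    base_url="https://api.baichuan-ai.com/v1/chat/completions",  # alias for baichuan_api_base
    max_tokens=1024,  # newly added argument
)
# `system` role messages are now supported
chat.invoke([SystemMessage(content="You are a poet."), HumanMessage(content="Hi")])
```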
2024-07-15 15:17:00 -04:00
Erick Friis
56cca23745 openai: remove some params from default serialization (#24280) 2024-07-15 18:53:36 +00:00
mrugank-wadekar
66bebeb76a partners: add similarity search by image functionality to langchain_chroma partner package (#22982)
- **Description:** This pull request introduces two new methods to the
Langchain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. It also
introduces a notebook to demonstrate its use.
  - **Issue:** N/A
  - **Dependencies:** None
  - **Twitter handle:** @mrugank9009
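Roughly, usage looks like the following sketch (assumes an image-capable
embedding function such as OpenCLIP; the file name is a placeholder):

```python
from langchain_chroma import Chroma
from langchain_experimental.open_clip import OpenCLIPEmbeddings

vectorstore = Chroma(
    collection_name="images",
    embedding_function=OpenCLIPEmbeddings(),  # image-capable embeddings
)
# search for images similar to the one at the given URI
similar = vectorstore.similarity_search_by_image(uri="query.jpg", k=3)
```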

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-15 18:48:22 +00:00
pm390
b0aa915dea community[patch]: use asyncio.sleep instead of sleep in OpenAI Assistant async (#24275)
**Description:** Implemented async sleep using asyncio instead of
synchronous sleep in OpenAI Assistants
**Issue:** #24194
**Dependencies:** asyncio
**Twitter handle:** pietromald60939
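The pattern in brief: `time.sleep` blocks the whole event loop, while
`asyncio.sleep` yields control while waiting (statuses below are
illustrative):

```python
import asyncio


async def wait_for_run(check_status) -> str:
    status = check_status()
    while status in ("queued", "in_progress"):
        await asyncio.sleep(1)  # non-blocking; other coroutines keep running
        status = check_status()
    return status
```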
2024-07-15 18:14:39 +00:00
Anush
d93ae756e6 qdrant: Documentation for the new QdrantVectorStore class (#24166)
## Description

Follow up on #24165. Adds a page to document the latest usage of the new
`QdrantVectorStore` class.
2024-07-15 10:39:23 -07:00
Erick Friis
1244e66bd4 docs: remove couchbase from docs linking (#24277)
`pip install couchbase` adds 12 minutes to the docs build...
2024-07-15 17:34:41 +00:00
wenngong
a001037319 retrievers: MultiVectorRetriever similarity_score_threshold search type (#23539)
Description: support MultiVectorRetriever similarity_score_threshold
search type.

Issue: #23387 #19404
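A minimal sketch of the new search type (stores and embeddings are
placeholders):

```python
from langchain.retrievers.multi_vector import MultiVectorRetriever, SearchType
from langchain.storage import InMemoryStore
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

retriever = MultiVectorRetriever(
    vectorstore=Chroma(collection_name="chunks", embedding_function=OpenAIEmbeddings()),
    docstore=InMemoryStore(),
    search_type=SearchType.similarity_score_threshold,
    search_kwargs={"score_threshold": 0.5},  # drop low-scoring matches
)
```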

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-07-15 13:31:34 -04:00
Carlos André Antunes
20151384d7 fix azure_openai.py: some keys do not exist (#24158)
In some lines it tries to read a key that does not exist yet. In these
cases I changed the direct access to the dict.get() method
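The pattern, with illustrative key names:

```python
response = {"model": "gpt-4"}  # "usage" key not populated yet

# direct access would raise KeyError:
# usage = response["usage"]

# dict.get() degrades gracefully when the key is missing
usage = response.get("usage", {})
total_tokens = usage.get("total_tokens", 0)
```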

2024-07-15 17:17:05 +00:00
blueoom
d895614d19 text_splitters: add request parameters for function HTMLHeaderTextSplitter.split_text… (#24178)
**Description:**

The `split_text_from_url` method of `HTMLHeaderTextSplitter` does not
include parameters like `timeout` when using `requests` to send a
request. Therefore, I suggest adding a `kwargs` parameter to the
function, which can be passed as arguments to `requests.get()`
internally, allowing control over the `get` request.
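With the change, usage looks roughly like this (URL is a placeholder):

```python
from langchain_text_splitters import HTMLHeaderTextSplitter

splitter = HTMLHeaderTextSplitter(
    headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")]
)
# extra keyword arguments are forwarded to requests.get()
docs = splitter.split_text_from_url("https://example.com", timeout=10)
```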

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-15 16:43:56 +00:00
Bagatur
9d0c1d2dc9 docs: specify init_chat_model version (#24274) 2024-07-15 16:29:06 +00:00
MoraxMa
a7296bddc2 docs: updated Tongyi package (#24259)
* updated the pip install package name
2024-07-15 16:25:35 +00:00
Bagatur
c9473367b1 langchain[patch]: Release 0.2.8 (#24273) 2024-07-15 16:05:51 +00:00
JP-Ellis
f77659463a core[patch]: allow message utils to work with lcel (#23743)
The function `convert_to_messages` has had an expansion of the
arguments it can take:

1. Previously, it only could take a `Sequence` in order to iterate over
it. This has been broadened slightly to an `Iterable` (which should have
no other impact).
2. Support for `PromptValue` and `BaseChatPromptTemplate` has been
added. These are generated when combining messages using the overloaded
`+` operator.

Functions which rely on `convert_to_messages` (namely `filter_messages`,
`merge_message_runs` and `trim_messages`) have had the type of their
arguments similarly expanded.

Resolves #23706.
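A sketch of the newly supported input:

```python
from langchain_core.messages import HumanMessage, SystemMessage, convert_to_messages

# `+` on messages builds a chat prompt template, which is now accepted directly
combined = SystemMessage(content="You are terse.") + HumanMessage(content="hi")
messages = convert_to_messages(combined)
```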


---------

Signed-off-by: JP-Ellis <josh@jpellis.me>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-15 08:58:05 -07:00
Harold Martin
ccdaf14eff docs: Spell check fixes (#24217)
**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
2024-07-15 15:51:43 +00:00
Leonid Ganeline
cacdf96f9c core docstrings tracers update (#24211)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-15 11:37:09 -04:00
Leonid Ganeline
36ee083753 core: docstrings utils update (#24213)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-15 11:36:00 -04:00
thehunmonkgroup
e8a21146d3 community[patch]: upgrade default model for ChatAnyscale (#24232)
Old default `meta-llama/Llama-2-7b-chat-hf` is no longer supported.
2024-07-15 11:34:59 -04:00
Bagatur
a0958c0607 docs: more tool call -> tool message docs (#24271) 2024-07-15 07:55:07 -07:00
Bagatur
620b118c70 core[patch]: Release 0.2.19 (#24272) 2024-07-15 07:51:30 -07:00
ccurme
888fbc07b5 core[patch]: support passing args_schema through as_tool (#24269)
Note: this allows the schema to be passed in positionally.

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import RunnableLambda


class Add(BaseModel):
    """Add two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")


def add(input: dict) -> int:
    return input["a"] + input["b"]


runnable = RunnableLambda(add)
as_tool = runnable.as_tool(Add)
as_tool.args_schema.schema()
```
```
{'title': 'Add',
 'description': 'Add two integers together.',
 'type': 'object',
 'properties': {'a': {'title': 'A',
   'description': 'First integer',
   'type': 'integer'},
  'b': {'title': 'B', 'description': 'Second integer', 'type': 'integer'}},
 'required': ['a', 'b']}
```
2024-07-15 07:51:05 -07:00
ccurme
ab2d7821a7 fireworks[patch]: use firefunction-v2 in standard tests (#24264) 2024-07-15 13:15:08 +00:00
ccurme
6fc7610b1c standard-tests[patch]: update test_bind_runnables_as_tools (#24241)
Reduce number of tool arguments from two to one.
2024-07-15 08:35:07 -04:00
Bagatur
0da5078cad langchain[minor]: Generic configurable model (#23419)
Alternative to
[23244](https://github.com/langchain-ai/langchain/pull/23244). Allows
you to use chat model declarative methods:

![Screenshot 2024-06-25 at 1 07 10
PM](https://github.com/langchain-ai/langchain/assets/22008038/910d1694-9b7b-46bc-bc2e-3792df9321d6)
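Roughly (model names are placeholders):

```python
from langchain.chat_models import init_chat_model

# no model fixed at construction time; pick one per call via config
configurable_model = init_chat_model(temperature=0)
configurable_model.invoke(
    "what's your name",
    config={"configurable": {"model": "gpt-4o-mini"}},
)
configurable_model.invoke(
    "what's your name",
    config={"configurable": {"model": "claude-3-5-sonnet-20240620"}},
)
```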
2024-07-15 01:11:01 +00:00
Bagatur
d0728b0ba0 core[patch]: add tool name to tool message (#24243)
Copying current ToolNode behavior
2024-07-15 00:42:40 +00:00
Bagatur
9224027e45 docs: tool artifacts how to (#24198) 2024-07-14 17:04:47 -07:00
Bagatur
5c3e2612da core[patch]: Release 0.2.18 (#24230) 2024-07-13 09:14:43 -07:00
Bagatur
65321bf975 core[patch]: fix ToolCall "type" when streaming (#24218) 2024-07-13 08:59:03 -07:00
Jacob Lee
2b7d1cdd2f docs[patch]: Update tool child run docs (#24160)
Documents #24143
2024-07-13 07:52:37 -07:00
Anush
a653b209ba qdrant: test new QdrantVectorStore (#24165)
## Description

This PR adds integration tests to follow up on #24164.

By default, the tests use an in-memory instance.

To run the full suite of tests, with both in-memory and Qdrant server:

```
$ docker run -p 6333:6333 qdrant/qdrant

$ make test

$ make integration_test
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:59:30 +00:00
Roman Solomatin
f071581aea openai[patch]: update openai params (#23691)
**Description:** Explicitly add parameters from openai API




---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 16:53:33 -07:00
Leonid Ganeline
f0a7581b50 milvus: docstring (#23151)
Added missed docstrings. Formatted docstrings to the consistent format
(used in the API Reference)

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:25:31 +00:00
Christian D. Glissov
474b88326f langchain_qdrant: Added method "_asimilarity_search_with_relevance_scores" to Qdrant class (#23954)
I stumbled upon a bug that led to different similarity scores between
the async and sync similarity searches with relevance scores in Qdrant.
The reason is that `_asimilarity_search_with_relevance_scores` is
missing; this makes langchain_qdrant fall back to the method of the
vectorstore base class, leading to drastically different results.

To illustrate the magnitude, here are the results of running an
identical search in a test vectorstore.

Output of asimilarity_search_with_relevance_scores:
[0.9902903374601824, 0.9472135924938804, 0.8535534011299859]

Output of similarity_search_with_relevance_scores:
[0.9805806749203648, 0.8944271849877607, 0.7071068022599718]

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:25:20 +00:00
Bagatur
bdc03997c9 standard-tests[patch]: check for ToolCall["type"] (#24209) 2024-07-12 16:17:34 -07:00
Nada Amin
3f1cf00d97 docs: Improve neo4j semantic templates (#23939)
I made some changes based on the issues I stumbled on while following
the README of neo4j-semantic-ollama.
I made the changes to the ollama variant, and can also port the relevant
ones to the layer variant once this is approved.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:08:25 +00:00
Nada Amin
6b47c7361e docs: fix code usage to use the ollama variant (#23937)
**Description:** the template neo4j-semantic-ollama uses an import from
the neo4j-semantic-layer template instead of its own.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:07:42 +00:00
Anirudh31415926535
7677ceea60 docs: model parameter mandatory for cohere embedding and rerank (#23349)
The latest langchain-cohere SDK mandates passing the model parameter
into the Embeddings and Reranker inits.

This PR is to update the docs to reflect these changes.
2024-07-12 23:07:28 +00:00
Miroslav
aee55eda39 community: Skip login to HuggingFaceHub when token is not set (#21561)
- **HuggingFaceEndpoint**: "Skip login to HuggingFaceHub"
  - Where: langchain, community, llm, huggingface_endpoint

- **Description:** Skip login to HuggingFace Hub when
`huggingfacehub_api_token` is not set. This is needed when using a custom
`endpoint_url` outside of HuggingFaceHub.
- **Issue:** fixes
https://github.com/langchain-ai/langchain/issues/20342 and
https://github.com/langchain-ai/langchain/issues/19685
- **Dependencies:** None


- **Add tests and docs**:
  1. Tested with a locally available TGI endpoint
  2. Example usage:
```python
from langchain_community.llms import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url='http://localhost:8080',
    server_kwargs={
        "headers": {"Content-Type": "application/json"}
    }
)
resp = llm.invoke("Tell me a joke")
print(resp)
```
 Also tested against HF Endpoints
 ```python
 from langchain_community.llms import HuggingFaceEndpoint
huggingfacehub_api_token = "hf_xyz"
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(
    huggingfacehub_api_token=huggingfacehub_api_token,
    repo_id=repo_id,
)
resp = llm.invoke("Tell me a joke")
print(resp)
 ```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 22:10:32 +00:00
Anush
d09dda5a08 qdrant: Bump patch version (#24168)
# Description

To release a new version of `langchain-qdrant` after #24165 and #24166.
2024-07-12 14:48:50 -07:00
Bagatur
12950cc602 standard-tests[patch]: improve runnable tool description (#24210) 2024-07-12 21:33:56 +00:00
Erick Friis
e8ee781a42 ibm: move to external repo (#24208) 2024-07-12 21:14:24 +00:00
Bagatur
02e71cebed together[patch]: Release 0.1.4 (#24205) 2024-07-12 13:59:58 -07:00
Bagatur
259d4d2029 anthropic[patch]: Release 0.1.20 (#24204) 2024-07-12 13:59:15 -07:00
Bagatur
3aed74a6fc fireworks[patch]: Release 0.1.5 (#24203) 2024-07-12 13:58:58 -07:00
Bagatur
13b0d7ec8f openai[patch]: Release 0.1.16 (#24202) 2024-07-12 13:58:39 -07:00
Bagatur
71cd6e6feb groq[patch]: Release 0.1.7 (#24201) 2024-07-12 13:58:19 -07:00
Bagatur
99054e19eb mistralai[patch]: Release 0.1.10 (#24200) 2024-07-12 13:57:58 -07:00
Bagatur
7a1321e2f9 ibm[patch]: Release 0.1.10 (#24199) 2024-07-12 13:57:38 -07:00
Bagatur
cb5031f22f integrations[patch]: require core >=0.2.17 (#24207) 2024-07-12 20:54:01 +00:00
Nithish Raghunandanan
f1618ec540 couchbase: Add standard and semantic caches (#23607)

**Description:** Add support for caching (standard + semantic) LLM
responses using Couchbase



---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 20:30:03 +00:00
Eugene Yurtsev
8d82a0d483 core[patch]: Mark GraphVectorStore as beta (#24195)
* This PR marks graph vectorstore as beta
2024-07-12 14:28:06 -04:00
Bagatur
0a1e475a30 core[patch]: Release 0.2.17 (#24189) 2024-07-12 17:08:29 +00:00
Bagatur
6166ea67a8 core[minor]: rename ToolMessage.raw_output -> artifact (#24185) 2024-07-12 09:52:44 -07:00
Jean Nshuti
d77d9bfc00 community[patch]: fix typo in document content returned from semanticscholar (#24175)
Update "astract" -> "abstract"
2024-07-12 15:40:47 +00:00
Leonid Ganeline
aa3e3cfa40 core[patch]: docstrings runnables update (#24161)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-12 11:27:06 -04:00
mumu
14ba1d4b45 docs: fix numeric errors in tools_chain.ipynb (#24169)
Description: Corrected several numeric errors in the
docs/docs/how_to/tools_chain.ipynb file to ensure the accuracy of the
documentation.
2024-07-12 11:26:26 -04:00
Ikko Eltociear Ashimine
18da9f5e59 docs: update custom_chat_model.ipynb (#24170)
characetrs -> characters
2024-07-12 06:48:22 -04:00
Tomaz Bratanic
d3a2b9fae0 Fix neo4j type error on missing constraint information (#24177)
If you use `refresh_schema=False`, then the metadata constraint doesn't
exist. ATM, we used the default `None` in the constraint check, but then
`any` fails because it can't iterate over a None value
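In sketch form (names are illustrative, not the actual source):

```python
metadata: dict = {}  # simulates refresh_schema=False: no constraint info collected
constraints = metadata.get("constraint")  # -> None

# any() over None raises TypeError; defaulting to [] avoids it
has_constraint = any(c.get("name") == "my_constraint" for c in (constraints or []))
```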
2024-07-12 06:39:29 -04:00
Anush
7014d07cab qdrant: new Qdrant implementation (#24164) 2024-07-12 04:52:02 +02:00
Xander Dumaine
35784d1c33 langchain[minor]: add document_variable_name to create_stuff_documents_chain (#24083)
- **Description:** `StuffDocumentsChain` uses `LLMChain`, which is
deprecated in favor of langchain runnables. `create_stuff_documents_chain`
is the replacement, but needs support for `document_variable_name` to
allow multiple uses of the chain within a longer chain.
- **Issue:** none
- **Dependencies:** none
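A minimal sketch of the new argument (LLM and prompt are placeholders):

```python
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize these documents:\n\n{source_docs}")
chain = create_stuff_documents_chain(
    ChatOpenAI(),
    prompt,
    document_variable_name="source_docs",  # overrides the default "context"
)
```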
2024-07-12 02:31:46 +00:00
Eugene Yurtsev
8858846607 milvus[patch]: Fix Milvus vectorstore for newer versions of langchain-core (#24152)
Fix for: https://github.com/langchain-ai/langchain/issues/24116

This keeps the old behavior of add_documents and add_texts
2024-07-11 18:51:18 -07:00
thedavgar
ffe6ca986e community: Fix bug in Azure Search vectorstore async search (#24081)

**Description**:
This PR fixes a bug, described in issue #24064, that occurs when using the
AzureSearch vectorstore with the asynchronous search methods, which are
also the methods used by the retriever. The proposed change simply makes
access to the embedding optional, because it is not used anywhere to
retrieve documents. The synchronous retrieval methods do not use the
embedding either.

With this PR the code given by the user in the issue works.

```python
vectorstore = AzureSearch(
    azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"),
    azure_search_key=os.getenv("AI_SEARCH_API_KEY"),
    index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"),
    fields=fields,
    embedding_function=encoder,
)

retriever = vectorstore.as_retriever(search_type="hybrid", k=2)

await vectorstore.avector_search("what is the capital of France")
await retriever.ainvoke("what is the capital of France")
```

**Issue**:
The Azure Search vectorstore does not work when searching for documents
with asynchronous methods, as described in issue #24064.

**Dependencies**:
There are no extra dependencies required for this change.

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-11 18:32:19 -07:00
Anush
7790d67f94 qdrant: New sparse embeddings provider interface - PART 1 (#24015)
## Description

This PR introduces a new sparse embedding provider interface to work
with the new Qdrant implementation that will follow this PR.

Additionally, an implementation of this interface is provided with
https://github.com/qdrant/fastembed.
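Assuming the FastEmbed-backed implementation that accompanies this
interface, usage is roughly (model name illustrative):

```python
from langchain_qdrant import FastEmbedSparse

sparse_embeddings = FastEmbedSparse(model_name="Qdrant/bm25")
vectors = sparse_embeddings.embed_documents(["hello world", "sparse vectors"])
```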

This PR will be followed by
https://github.com/Anush008/langchain/pull/3.
2024-07-11 17:07:25 -07:00
Erick Friis
1132fb801b core: release 0.2.16 (#24159) 2024-07-11 23:59:41 +00:00
Nuno Campos
1d37aa8403 core: Remove extra newline (#24157) 2024-07-11 23:55:36 +00:00
ccurme
cb95198398 standard-tests[patch]: add tests for runnables as tools and streaming usage metadata (#24153) 2024-07-11 18:30:05 -04:00
Erick Friis
d002fa902f infra: fix redundant matrix config (#24151) 2024-07-11 15:15:41 -07:00
Bagatur
8d100c58de core[patch]: Tool accept RunnableConfig (#24143)
Relies on #24038

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-11 22:13:17 +00:00
Bagatur
5fd1e67808 core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038)
Changes:
- ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type"
parameter now
- LLM integration packages add "type" to all the above
- Tool supports ToolCall inputs that have "type" specified
- Tool outputs ToolMessage when a ToolCall is passed as input
- Tools can separately specify ToolMessage.content and
ToolMessage.raw_output
- Tools emit events for validation errors (using on_tool_error and
on_tool_end)

Example:
```python
@tool("structured_api", response_format="content_and_raw_output")
def _mock_structured_tool_with_raw_output(
    arg1: int, arg2: bool, arg3: Optional[dict] = None
) -> Tuple[str, dict]:
    """A Structured Tool"""
    return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3}


def test_tool_call_input_tool_message_with_raw_output() -> None:
    tool_call: Dict = {
        "name": "structured_api",
        "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}},
        "id": "123",
        "type": "tool_call",
    }
    expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123")
    tool = _mock_structured_tool_with_raw_output
    actual = tool.invoke(tool_call)
    assert actual == expected

    tool_call.pop("type")
    with pytest.raises(ValidationError):
        tool.invoke(tool_call)

    actual_content = tool.invoke(tool_call["args"])
    assert actual_content == expected.content
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-11 14:54:02 -07:00
Bagatur
eeb996034b core[patch]: Release 0.2.15 (#24149) 2024-07-11 21:34:25 +00:00
Nuno Campos
03fba07d15 core[patch]: Update styles for mermaid graphs (#24147) 2024-07-11 14:19:36 -07:00
Jacob Lee
c481a2715d docs[patch]: Add structural example to style guide (#24133)
CC @nfcampos
2024-07-11 13:20:14 -07:00
ccurme
8ee8ca7c83 core[patch]: propagate parse_docstring to tool decorator (#24123)
Disabled by default.

```python
from langchain_core.tools import tool

@tool(parse_docstring=True)
def foo(bar: str, baz: int) -> str:
    """The foo.

    Args:
        bar: this is the bar
        baz: this is the baz
    """
    return bar


foo.args_schema.schema()
```
```json
{
  "title": "fooSchema",
  "description": "The foo.",
  "type": "object",
  "properties": {
    "bar": {
      "title": "Bar",
      "description": "this is the bar",
      "type": "string"
    },
    "baz": {
      "title": "Baz",
      "description": "this is the baz",
      "type": "integer"
    }
  },
  "required": [
    "bar",
    "baz"
  ]
}
```
2024-07-11 20:11:45 +00:00
Jacob Lee
4121d4151f docs[patch]: Fix typo (#24132)
CC @efriis
2024-07-11 20:10:48 +00:00
Erick Friis
bd18faa2a0 infra: add SQLAlchemy to min version testing (#23186)
preventing issues like #22546 

Notes:
- this will only affect release CI. We may want to consider adding
running unit tests with min versions to PR CI in some form
- because this only affects release CI, it could create annoying issues
releasing while I'm on vacation. Unless anyone feels strongly, I'll wait
to merge this until I'm back
2024-07-11 20:09:57 +00:00
Jacob Lee
f1f1f75782 community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085) 2024-07-11 21:45:30 +02:00
Eugene Yurtsev
4ba14adec6 core[patch]: Clean up indexing test code (#24139)
Refactor the code to use the existing InMemoryVectorStore.

This change is needed for another PR that moves some of the imports
around (and messes up the mock.patch in this file)
2024-07-11 18:54:46 +00:00
Atul R
457677c1b7 community: Fixes use of ImagePromptTemplate with Ollama (#24140)
Description: Fixes ImagePromptTemplate for multimodal LLMs like llava
when using Ollama
Twitter handle: https://x.com/a7ulr

Details:

When using llava models / any Ollama multimodal LLMs and passing images
in the prompt as URLs, langchain breaks with this error:

```python
image_url_components = image_url.split(",")
                           ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'split'
```

From the looks of it, there was a bug where the condition checked for a
`url` field in the variable but missed actually assigning it.

This PR fixes ImagePromptTemplate for multimodal LLMs like llava when
using Ollama specifically.
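Roughly, the shape of the fix (paraphrased, not the exact source):

```python
# image_url may arrive as a dict like {"url": "...", "detail": "..."}
# rather than a plain string
if isinstance(image_url, dict):
    image_url = image_url["url"]  # previously checked for but never assigned
image_url_components = image_url.split(",")  # no longer called on a dict
```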

@hwchase17
2024-07-11 11:31:48 -07:00
Matt
8327925ab7 community: support additional Azure Search options (#24134)
- **Description:** Support additional kwargs options for the Azure
Search client (described here:
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations)
    - **Issue:** N/A
- **Dependencies:** No additional dependencies
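A hypothetical illustration, assuming the extra keyword arguments are
forwarded to the underlying Azure SDK clients (option names come from
azure-core, not this PR):

```python
from langchain_community.vectorstores.azuresearch import AzureSearch
from langchain_openai import OpenAIEmbeddings

vectorstore = AzureSearch(
    azure_search_endpoint="https://<service>.search.windows.net",  # placeholder
    azure_search_key="api-key",  # placeholder
    index_name="langchain-index",
    embedding_function=OpenAIEmbeddings().embed_query,
    retry_total=4,  # azure-core retry option
    logging_enable=True,  # azure-core logging option
)
```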

---------
2024-07-11 18:22:36 +00:00
ccurme
122e80e04d core[patch]: add versionadded to as_tool (#24138) 2024-07-11 18:08:08 +00:00
1623 changed files with 100470 additions and 47162 deletions

View File

@@ -5,10 +5,10 @@ services:
dockerfile: libs/langchain/dev.Dockerfile
context: ..
volumes:
- # Update this to wherever you want VS Code to mount the folder of your project
+ # Update this to wherever you want VS Code to mount the folder of your project
- ..:/workspaces/langchain:cached
networks:
- - langchain-network
+ - langchain-network
# environment:
# MONGO_ROOT_USERNAME: root
# MONGO_ROOT_PASSWORD: example123
@@ -28,5 +28,3 @@ services:
networks:
langchain-network:
driver: bridge

View File

@@ -1,11 +1,11 @@
import glob
import json
import os
import re
import sys
import tomllib
from collections import defaultdict
from typing import Dict, List, Set
from pathlib import Path
LANGCHAIN_DIRS = [
@@ -26,17 +26,55 @@ def all_package_dirs() -> Set[str]:
def dependents_graph() -> dict:
"""
Construct a mapping of package -> dependents, such that we can
run tests on all dependents of a package when a change is made.
"""
dependents = defaultdict(set)
for path in glob.glob("./libs/**/pyproject.toml", recursive=True):
if "template" in path:
continue
# load regular and test deps from pyproject.toml
with open(path, "rb") as f:
pyproject = tomllib.load(f)["tool"]["poetry"]
pkg_dir = "libs" + "/".join(path.split("libs")[1].split("/")[:-1])
-         for dep in pyproject["dependencies"]:
+         for dep in [
+             *pyproject["dependencies"].keys(),
+             *pyproject["group"]["test"]["dependencies"].keys(),
+         ]:
if "langchain" in dep:
dependents[dep].add(pkg_dir)
continue
# load extended deps from extended_testing_deps.txt
package_path = Path(path).parent
extended_requirement_path = package_path / "extended_testing_deps.txt"
if extended_requirement_path.exists():
with open(extended_requirement_path, "r") as f:
extended_deps = f.read().splitlines()
for depline in extended_deps:
if depline.startswith("-e "):
# editable dependency
assert depline.startswith(
"-e ../partners/"
), "Extended test deps should only editable install partner packages"
partner = depline.split("partners/")[1]
dep = f"langchain-{partner}"
else:
dep = depline.split("==")[0]
if "langchain" in dep:
dependents[dep].add(pkg_dir)
# remove huggingface from dependents because of CI instability
# specifically in huggingface jobs
# https://github.com/langchain-ai/langchain/issues/25558
for k in dependents:
if "libs/partners/huggingface" in dependents[k]:
dependents[k].remove("libs/partners/huggingface")
return dependents
@@ -54,6 +92,11 @@ def add_dependents(dirs_to_eval: Set[str], dependents: dict) -> List[str]:
def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
if dir_ == "libs/core":
return [
{"working-directory": dir_, "python-version": f"3.{v}"}
for v in range(8, 13)
]
min_python = "3.8"
max_python = "3.12"
@@ -63,6 +106,15 @@ def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
# declare deps in funny way
max_python = "3.11"
if dir_ in ["libs/community", "libs/langchain"] and job == "extended-tests":
# community extended test resolution in 3.12 is slow
# even in uv
max_python = "3.11"
if dir_ == "libs/community" and job == "compile-integration-tests":
# community integration deps are slow in 3.12
max_python = "3.11"
return [
{"working-directory": dir_, "python-version": min_python},
{"working-directory": dir_, "python-version": max_python},

View File

@@ -0,0 +1,35 @@
import sys
import tomllib

if __name__ == "__main__":
    # Get the TOML file path from the command line argument
    toml_file = sys.argv[1]

    # read toml file
    with open(toml_file, "rb") as file:
        toml_data = tomllib.load(file)

    # see if we're releasing an rc
    version = toml_data["tool"]["poetry"]["version"]
    releasing_rc = "rc" in version

    # if not, iterate through dependencies and make sure none allow prereleases
    if not releasing_rc:
        dependencies = toml_data["tool"]["poetry"]["dependencies"]
        for lib in dependencies:
            dep_version = dependencies[lib]
            dep_version_string = (
                dep_version["version"] if isinstance(dep_version, dict) else dep_version
            )

            if "rc" in dep_version_string:
                raise ValueError(
                    f"Dependency {lib} has a prerelease version. Please remove this."
                )

            if isinstance(dep_version, dict) and dep_version.get(
                "allow-prereleases", False
            ):
                raise ValueError(
                    f"Dependency {lib} has allow-prereleases set to true. Please remove this."
                )

View File

@@ -1,6 +1,11 @@
import sys
- import tomllib
+ if sys.version_info >= (3, 11):
+     import tomllib
+ else:
+     # for python 3.10 and below, which doesnt have stdlib tomllib
+     import tomli as tomllib
from packaging.version import parse as parse_version
import re
@@ -9,8 +14,11 @@ MIN_VERSION_LIBS = [
"langchain-community",
"langchain",
"langchain-text-splitters",
"SQLAlchemy",
]
+ SKIP_IF_PULL_REQUEST = ["langchain-core"]
def get_min_version(version: str) -> str:
# base regex for x.x.x with cases for rc/post/etc
@@ -37,7 +45,7 @@ def get_min_version(version: str) -> str:
raise ValueError(f"Unrecognized version format: {version}")
- def get_min_version_from_toml(toml_path: str):
+ def get_min_version_from_toml(toml_path: str, versions_for: str):
# Parse the TOML file
with open(toml_path, "rb") as file:
toml_data = tomllib.load(file)
@@ -50,6 +58,10 @@ def get_min_version_from_toml(toml_path: str):
# Iterate over the libs in MIN_VERSION_LIBS
for lib in MIN_VERSION_LIBS:
if versions_for == "pull_request" and lib in SKIP_IF_PULL_REQUEST:
# some libs only get checked on release because of simultaneous
# changes
continue
# Check if the lib is present in the dependencies
if lib in dependencies:
# Get the version string
@@ -70,8 +82,10 @@ def get_min_version_from_toml(toml_path: str):
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
+     versions_for = sys.argv[2]
+     assert versions_for in ["release", "pull_request"]
# Call the function to get the minimum versions
-     min_versions = get_min_version_from_toml(toml_file)
+     min_versions = get_min_version_from_toml(toml_file, versions_for)
print(" ".join([f"{lib}=={version}" for lib, version in min_versions.items()]))

View File

@@ -21,14 +21,6 @@ jobs:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
- "3.12"
name: "poetry run pytest -m compile tests/integration_tests #${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v4

View File

@@ -189,7 +189,7 @@ jobs:
--extra-index-url https://test.pypi.org/simple/ \
"$PKG_NAME==$VERSION" || \
( \
- sleep 5 && \
+ sleep 15 && \
poetry run pip install \
--extra-index-url https://test.pypi.org/simple/ \
"$PKG_NAME==$VERSION" \
@@ -221,12 +221,17 @@ jobs:
run: make tests
working-directory: ${{ inputs.working-directory }}
+ - name: Check for prerelease versions
+   working-directory: ${{ inputs.working-directory }}
+   run: |
+     poetry run python $GITHUB_WORKSPACE/.github/scripts/check_prerelease_dependencies.py pyproject.toml
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml)"
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml release)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
@@ -285,6 +290,7 @@ jobs:
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
+ UNSTRUCTURED_API_KEY: ${{ secrets.UNSTRUCTURED_API_KEY }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}

View File

@@ -65,3 +65,22 @@ jobs:
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging tomli
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml pull_request)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
# Temporarily disabled until we can get the minimum versions working
# - name: Run unit tests with minimum dependency versions
# if: ${{ steps.min-version.outputs.min-versions != '' }}
# env:
# MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
# run: |
# poetry run pip install --force-reinstall $MIN_VERSIONS --editable .
# make tests
# working-directory: ${{ inputs.working-directory }}

View File

@@ -14,10 +14,6 @@ env:
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.12"
name: "check doc imports #${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v4

.gitignore vendored
View File

@@ -172,6 +172,8 @@ docs/api_reference/*/
!docs/api_reference/_static/
!docs/api_reference/templates/
!docs/api_reference/themes/
+ !docs/api_reference/_extensions/
+ !docs/api_reference/scripts/
docs/docs/build
docs/docs/node_modules
docs/docs/yarn.lock

View File

@@ -52,7 +52,7 @@ Now:
`from langchain_experimental.sql import SQLDatabaseChain`
- Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out [`create_sql_query_chain`](https://github.com/langchain-ai/langchain/blob/master/docs/extras/use_cases/tabular/sql_query.ipynb)
+ Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out this [`SQL question-answering tutorial`](https://python.langchain.com/v0.2/docs/tutorials/sql_qa/#convert-question-to-sql-query)
`from langchain.chains import create_sql_query_chain`

View File

@@ -31,6 +31,7 @@ docs_linkcheck:
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
+ poetry run python docs/api_reference/scripts/custom_formatter.py docs/api_reference/_build/html/
API_PKG ?= text-splitters
@@ -38,12 +39,14 @@ api_docs_quick_preview:
poetry run pip install "pydantic<2"
poetry run python docs/api_reference/create_api_rst.py $(API_PKG)
cd docs/api_reference && poetry run make html
- open docs/api_reference/_build/html/$(shell echo $(API_PKG) | sed 's/-/_/g')_api_reference.html
+ poetry run python docs/api_reference/scripts/custom_formatter.py docs/api_reference/_build/html/
+ open docs/api_reference/_build/html/reference.html
## api_docs_clean: Clean the API Reference documentation build artifacts.
api_docs_clean:
- find ./docs/api_reference -name '*_api_reference.rst' -delete
+ git clean -fdX ./docs/api_reference
rm docs/api_reference/index.md
## api_docs_linkcheck: Run linkchecker on the API Reference documentation.

View File

@@ -7,7 +7,6 @@
[![PyPI - License](https://img.shields.io/pypi/l/langchain-core?style=flat-square)](https://opensource.org/licenses/MIT)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-core?style=flat-square)](https://pypistats.org/packages/langchain-core)
[![GitHub star chart](https://img.shields.io/github/stars/langchain-ai/langchain?style=flat-square)](https://star-history.com/#langchain-ai/langchain)
- [![Dependency Status](https://img.shields.io/librariesio/github/langchain-ai/langchain?style=flat-square)](https://libraries.io/github/langchain-ai/langchain)
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain?style=flat-square)](https://github.com/langchain-ai/langchain/issues)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/langchain-ai/langchain)

View File

@@ -64,7 +64,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain-chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -355,7 +355,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -37,7 +37,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install -U --quiet langchain langchain_community openai chromadb langchain-experimental\n",
"%pip install -U --quiet langchain langchain-chroma langchain-community openai langchain-experimental\n",
"%pip install --quiet \"unstructured[all-docs]\" pypdf pillow pydantic lxml pillow matplotlib chromadb tiktoken"
]
},
@@ -344,8 +344,8 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import VertexAIEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.documents import Document\n",
"\n",
"\n",

View File

@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"pip install -U langchain umap-learn scikit-learn langchain_community tiktoken langchain-openai langchainhub chromadb langchain-anthropic"
"pip install -U langchain umap-learn scikit-learn langchain_community tiktoken langchain-openai langchainhub langchain-chroma langchain-anthropic"
]
},
{
@@ -645,7 +645,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"\n",
"# Initialize all_texts with leaf_texts\n",
"all_texts = leaf_texts.copy()\n",

View File

@@ -36,6 +36,7 @@ Notebook | Description
[llm_symbolic_math.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_symbolic_math.ipynb) | Solve algebraic equations with the help of llms (language learning models) and sympy, a python library for symbolic mathematics.
[meta_prompt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/meta_prompt.ipynb) | Implement the meta-prompt concept, which is a method for building self-improving agents that reflect on their own performance and modify their instructions accordingly.
[multi_modal_output_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_output_agent.ipynb) | Generate multi-modal outputs, specifically images and text.
+ [multi_modal_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_RAG_vdms.ipynb) | Perform retrieval-augmented generation (rag) on documents including text and images, using unstructured for parsing, Intel's Visual Data Management System (VDMS) as the vectorstore, and chains.
[multi_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_player_dnd.ipynb) | Simulate multi-player dungeons & dragons games, with a custom function determining the speaking schedule of the agents.
[multiagent_authoritarian.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_authoritarian.ipynb) | Implement a multi-agent simulation where a privileged agent controls the conversation, including deciding who speaks and when the conversation ends, in the context of a simulated news network.
[multiagent_bidding.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_bidding.ipynb) | Implement a multi-agent simulation where agents bid to speak, with the highest bidder speaking next, demonstrated through a fictitious presidential debate example.
@@ -57,4 +58,6 @@ Notebook | Description
[two_agent_debate_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_agent_debate_tools.ipynb) | Simulate multi-agent dialogues where the agents can utilize various tools.
[two_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_player_dnd.ipynb) | Simulate a two-player dungeons & dragons game, where a dialogue simulator class is used to coordinate the dialogue between the protagonist and the dungeon master.
[wikibase_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/wikibase_agent.ipynb) | Create a simple wikibase agent that utilizes sparql generation, with testing done on http://wikidata.org.
- [oracleai_demo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/oracleai_demo.ipynb) | This guide outlines how to utilize Oracle AI Vector Search alongside Langchain for an end-to-end RAG pipeline, providing step-by-step examples. The process includes loading documents from various sources using OracleDocLoader, summarizing them either within or outside the database with OracleSummary, and generating embeddings similarly through OracleEmbeddings. It also covers chunking documents according to specific requirements using Advanced Oracle Capabilities from OracleTextSplitter, and finally, storing and indexing these documents in a Vector Store for querying with OracleVS.
+ [oracleai_demo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/oracleai_demo.ipynb) | This guide outlines how to utilize Oracle AI Vector Search alongside Langchain for an end-to-end RAG pipeline, providing step-by-step examples. The process includes loading documents from various sources using OracleDocLoader, summarizing them either within or outside the database with OracleSummary, and generating embeddings similarly through OracleEmbeddings. It also covers chunking documents according to specific requirements using Advanced Oracle Capabilities from OracleTextSplitter, and finally, storing and indexing these documents in a Vector Store for querying with OracleVS.
+ [rag-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag-locally-on-intel-cpu.ipynb) | Perform Retrieval-Augmented-Generation (RAG) on locally downloaded open-source models using langchain and open source tools and execute it on Intel Xeon CPU. We showed an example of how to apply RAG on Llama 2 model and enable it to answer the queries related to Intel Q1 2024 earnings release.
+ [visual_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/visual_RAG_vdms.ipynb) | Performs Visual Retrieval-Augmented-Generation (RAG) using videos and scene descriptions generated by open source models.

View File

@@ -39,7 +39,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml langchainhub"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml langchainhub"
]
},
{
@@ -320,7 +320,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -59,7 +59,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml"
]
},
{
@@ -375,7 +375,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -59,7 +59,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml"
]
},
{
@@ -378,8 +378,8 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import GPT4AllEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.documents import Document\n",
"\n",
"# The vectorstore to use to index the child chunks\n",

View File

@@ -19,7 +19,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain_chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -132,7 +132,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"baseline = Chroma.from_texts(\n",

View File

@@ -28,7 +28,7 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",

View File

@@ -14,7 +14,7 @@
}
],
"source": [
"%pip install -qU langchain-airbyte"
"%pip install -qU langchain-airbyte langchain_chroma"
]
},
{
@@ -123,7 +123,7 @@
"outputs": [],
"source": [
"import tiktoken\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"enc = tiktoken.get_encoding(\"cl100k_base\")\n",

View File

@@ -166,7 +166,7 @@
"source": [
"### SQL Database Agent example\n",
"\n",
"This example demonstrates the use of the [SQL Database Agent](/docs/integrations/toolkits/sql_database.html) for answering questions over a Databricks database."
"This example demonstrates the use of the [SQL Database Agent](/docs/integrations/tools/sql_database) for answering questions over a Databricks database."
]
},
{

View File

@@ -39,7 +39,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain docugami==0.0.8 dgml-utils==0.3.0 pydantic langchainhub chromadb hnswlib --upgrade --quiet"
"! pip install langchain docugami==0.0.8 dgml-utils==0.3.0 pydantic langchainhub langchain-chroma hnswlib --upgrade --quiet"
]
},
{
@@ -547,7 +547,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores.chroma import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -84,7 +84,7 @@
}
],
"source": [
"%pip install --quiet pypdf chromadb tiktoken openai \n",
"%pip install --quiet pypdf langchain-chroma tiktoken openai \n",
"%pip uninstall -y langchain-fireworks\n",
"%pip install --editable /mnt/disks/data/langchain/libs/partners/fireworks"
]
@@ -138,7 +138,7 @@
"all_splits = text_splitter.split_documents(data)\n",
"\n",
"# Add to vectorDB\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_fireworks.embeddings import FireworksEmbeddings\n",
"\n",
"vectorstore = Chroma.from_documents(\n",

View File

@@ -170,7 +170,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"with open(\"../../state_of_the_union.txt\") as f:\n",

File diff suppressed because one or more lines are too long

View File

@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain_community tiktoken langchain-openai langchainhub chromadb langchain langgraph"
"! pip install langchain-chroma langchain_community tiktoken langchain-openai langchainhub langchain langgraph"
]
},
{
@@ -30,8 +30,8 @@
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"urls = [\n",

View File

@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain_community tiktoken langchain-openai langchainhub chromadb langchain langgraph tavily-python"
"! pip install langchain-chroma langchain_community tiktoken langchain-openai langchainhub langchain langgraph tavily-python"
]
},
{
@@ -77,8 +77,8 @@
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"urls = [\n",
@@ -180,8 +180,8 @@
"from langchain.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.schema import Document\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.tools.tavily_search import TavilySearchResults\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.messages import BaseMessage, FunctionMessage\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",

View File

@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain_community tiktoken langchain-openai langchainhub chromadb langchain langgraph"
"! pip install langchain-chroma langchain_community tiktoken langchain-openai langchainhub langchain langgraph"
]
},
{
@@ -86,8 +86,8 @@
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import WebBaseLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"urls = [\n",
@@ -188,7 +188,7 @@
"from langchain.output_parsers import PydanticOutputParser\n",
"from langchain.output_parsers.openai_tools import PydanticToolsParser\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.messages import BaseMessage, FunctionMessage\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.pydantic_v1 import BaseModel, Field\n",

View File

@@ -58,7 +58,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain-chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -187,7 +187,7 @@
"\n",
"import chromadb\n",
"import numpy as np\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_experimental.open_clip import OpenCLIPEmbeddings\n",
"from PIL import Image as _PILImage\n",
"\n",

View File

@@ -18,26 +18,7 @@
"* Use of multimodal embeddings (such as [CLIP](https://openai.com/research/clip)) to embed images and text\n",
"* Use of [VDMS](https://github.com/IntelLabs/vdms/blob/master/README.md) as a vector store with support for multi-modal\n",
"* Retrieval of both images and text using similarity search\n",
"* Passing raw images and text chunks to a multimodal LLM for answer synthesis \n",
"\n",
"\n",
"## Packages\n",
"\n",
"For `unstructured`, you will also need `poppler` ([installation instructions](https://pdf2image.readthedocs.io/en/latest/installation.html)) and `tesseract` ([installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html)) in your system."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "febbc459-ebba-4c1a-a52b-fed7731593f8",
"metadata": {},
"outputs": [],
"source": [
"# (newest versions required for multi-modal)\n",
"! pip install --quiet -U vdms langchain-experimental\n",
"\n",
"# lock to 0.10.19 due to a persistent bug in more recent versions\n",
"! pip install --quiet pdf2image \"unstructured[all-docs]==0.10.19\" pillow pydantic lxml open_clip_torch"
"* Passing raw images and text chunks to a multimodal LLM for answer synthesis "
]
},
{
@@ -53,7 +34,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"id": "5f483872",
"metadata": {},
"outputs": [
@@ -61,8 +42,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"docker: Error response from daemon: Conflict. The container name \"/vdms_rag_nb\" is already in use by container \"0c19ed281463ac10d7efe07eb815643e3e534ddf24844357039453ad2b0c27e8\". You have to remove (or rename) that container to be able to reuse that name.\n",
"See 'docker run --help'.\n"
"a1b9206b08ef626e15b356bf9e031171f7c7eb8f956a2733f196f0109246fe2b\n"
]
}
],
@@ -75,9 +55,32 @@
"vdms_client = VDMS_Client(port=55559)"
]
},
{
"cell_type": "markdown",
"id": "2498a0a1",
"metadata": {},
"source": [
"## Packages\n",
"\n",
"For `unstructured`, you will also need `poppler` ([installation instructions](https://pdf2image.readthedocs.io/en/latest/installation.html)) and `tesseract` ([installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html)) in your system."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"id": "febbc459-ebba-4c1a-a52b-fed7731593f8",
"metadata": {},
"outputs": [],
"source": [
"! pip install --quiet -U vdms langchain-experimental\n",
"\n",
"# lock to 0.10.19 due to a persistent bug in more recent versions\n",
"! pip install --quiet pdf2image \"unstructured[all-docs]==0.10.19\" pillow pydantic lxml open_clip_torch"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "78ac6543",
"metadata": {},
"outputs": [],
@@ -95,14 +98,9 @@
"\n",
"### Partition PDF text and images\n",
" \n",
"Let's look at an example pdf containing interesting images.\n",
"Let's use famous photographs from the PDF version of Library of Congress Magazine in this example.\n",
"\n",
"Famous photographs from library of congress:\n",
"\n",
"* https://www.loc.gov/lcm/pdf/LCM_2020_1112.pdf\n",
"* We'll use this as an example below\n",
"\n",
"We can use `partition_pdf` below from [Unstructured](https://unstructured-io.github.io/unstructured/introduction.html#key-concepts) to extract text and images."
"We can use `partition_pdf` from [Unstructured](https://unstructured-io.github.io/unstructured/introduction.html#key-concepts) to extract text and images."
]
},
{
@@ -116,8 +114,8 @@
"\n",
"import requests\n",
"\n",
"# Folder with pdf and extracted images\n",
"datapath = Path(\"./multimodal_files\").resolve()\n",
"# Folder to store pdf and extracted images\n",
"datapath = Path(\"./data/multimodal_files\").resolve()\n",
"datapath.mkdir(parents=True, exist_ok=True)\n",
"\n",
"pdf_url = \"https://www.loc.gov/lcm/pdf/LCM_2020_1112.pdf\"\n",
@@ -174,14 +172,8 @@
"source": [
"## Multi-modal embeddings with our document\n",
"\n",
"We will use [OpenClip multimodal embeddings](https://python.langchain.com/docs/integrations/text_embedding/open_clip).\n",
"\n",
"We use a larger model for better performance (set in `langchain_experimental.open_clip.py`).\n",
"\n",
"```\n",
"model_name = \"ViT-g-14\"\n",
"checkpoint = \"laion2b_s34b_b88k\"\n",
"```"
"In this section, we initialize the VDMS vector store for both text and images. For better performance, we use model `ViT-g-14` from [OpenClip multimodal embeddings](https://python.langchain.com/docs/integrations/text_embedding/open_clip).\n",
"The images are stored as base64 encoded strings with `vectorstore.add_images`.\n"
]
},
{
@@ -200,9 +192,7 @@
"vectorstore = VDMS(\n",
" client=vdms_client,\n",
" collection_name=\"mm_rag_clip_photos\",\n",
" embedding_function=OpenCLIPEmbeddings(\n",
" model_name=\"ViT-g-14\", checkpoint=\"laion2b_s34b_b88k\"\n",
" ),\n",
" embedding=OpenCLIPEmbeddings(model_name=\"ViT-g-14\", checkpoint=\"laion2b_s34b_b88k\"),\n",
")\n",
"\n",
"# Get image URIs with .jpg extension only\n",
@@ -233,7 +223,7 @@
"source": [
"## RAG\n",
"\n",
"`vectorstore.add_images` will store / retrieve images as base64 encoded strings."
"Here we define helper functions for image results."
]
},
{
@@ -392,7 +382,8 @@
"id": "1566096d-97c2-4ddc-ba4a-6ef88c525e4e",
"metadata": {},
"source": [
"## Test retrieval and run RAG"
"## Test retrieval and run RAG\n",
"Now let's query for a `woman with children` and retrieve the top results."
]
},
{
@@ -452,6 +443,14 @@
" print(doc.page_content)"
]
},
{
"cell_type": "markdown",
"id": "15e9b54d",
"metadata": {},
"source": [
"Now let's use the `multi_modal_rag_chain` to process the same query and display the response."
]
},
{
"cell_type": "code",
"execution_count": 11,
@@ -462,10 +461,10 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1. Detailed description of the visual elements in the image: The image features a woman with children, likely a mother and her family, standing together outside. They appear to be poor or struggling financially, as indicated by their attire and surroundings.\n",
"2. Historical and cultural context of the image: The photo was taken in 1936 during the Great Depression, when many families struggled to make ends meet. Dorothea Lange, a renowned American photographer, took this iconic photograph that became an emblem of poverty and hardship experienced by many Americans at that time.\n",
"3. Interpretation of the image's symbolism and meaning: The image conveys a sense of unity and resilience despite adversity. The woman and her children are standing together, displaying their strength as a family unit in the face of economic challenges. The photograph also serves as a reminder of the importance of empathy and support for those who are struggling.\n",
"4. Connections between the image and the related text: The text provided offers additional context about the woman in the photo, her background, and her feelings towards the photograph. It highlights the historical backdrop of the Great Depression and emphasizes the significance of this particular image as a representation of that time period.\n"
" The image depicts a woman with several children. The woman appears to be of Cherokee heritage, as suggested by the text provided. The image is described as having been initially regretted by the subject, Florence Owens Thompson, due to her feeling that it did not accurately represent her leadership qualities.\n",
"The historical and cultural context of the image is tied to the Great Depression and the Dust Bowl, both of which affected the Cherokee people in Oklahoma. The photograph was taken during this period, and its subject, Florence Owens Thompson, was a leader within her community who worked tirelessly to help those affected by these crises.\n",
"The image's symbolism and meaning can be interpreted as a representation of resilience and strength in the face of adversity. The woman is depicted with multiple children, which could signify her role as a caregiver and protector during difficult times.\n",
"Connections between the image and the related text include Florence Owens Thompson's leadership qualities and her regretted feelings about the photograph. Additionally, the mention of Dorothea Lange, the photographer who took this photo, ties the image to its historical context and the broader narrative of the Great Depression and Dust Bowl in Oklahoma. \n"
]
}
],
@@ -492,14 +491,6 @@
"source": [
"! docker kill vdms_rag_nb"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ba652da",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -518,7 +509,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
"version": "3.11.9"
}
},
"nbformat": 4,

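To condense the reworked VDMS notebook flow above: a single collection indexes both text and images through OpenCLIP, so one similarity search serves both modalities. A sketch assuming the dockerized VDMS server from the notebook is running on port 55559; the image path and texts are hypothetical:

```python
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client
from langchain_experimental.open_clip import OpenCLIPEmbeddings

# Connect to the dockerized VDMS server started earlier in the notebook
vdms_client = VDMS_Client(port=55559)

# One collection, one embedding space for both text and images
vectorstore = VDMS(
    client=vdms_client,
    collection_name="mm_rag_clip_photos",
    embedding=OpenCLIPEmbeddings(model_name="ViT-g-14", checkpoint="laion2b_s34b_b88k"),
)

# Images are stored as base64-encoded strings; texts as usual
vectorstore.add_images(uris=["./data/multimodal_files/page-1.jpg"])  # hypothetical path
vectorstore.add_texts(texts=["A mother and her children photographed in 1936."])

# Nearest neighbors of either modality come back from the same search
docs = vectorstore.similarity_search("woman with children", k=4)
```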
View File

@@ -58,7 +58,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain-nomic langchain_community tiktoken langchain-openai chromadb langchain"
"! pip install -U langchain-nomic langchain-chroma langchain-community tiktoken langchain-openai langchain"
]
},
{
@@ -167,7 +167,7 @@
"source": [
"import os\n",
"\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_nomic import NomicEmbeddings\n",

View File

@@ -56,7 +56,7 @@
},
"outputs": [],
"source": [
"! pip install -U langchain-nomic langchain_community tiktoken langchain-openai chromadb langchain # (newest versions required for multi-modal)"
"! pip install -U langchain-nomic langchain-chroma langchain-community tiktoken langchain-openai langchain # (newest versions required for multi-modal)"
]
},
{
@@ -194,7 +194,7 @@
"\n",
"import chromadb\n",
"import numpy as np\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_nomic import NomicEmbeddings\n",
"from PIL import Image as _PILImage\n",
"\n",

View File

@@ -20,8 +20,8 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter"
]

View File

@@ -80,7 +80,7 @@
"outputs": [],
"source": [
"from langchain.schema import Document\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings()"

View File

@@ -0,0 +1,756 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "10f50955-be55-422f-8c62-3a32f8cf02ed",
"metadata": {},
"source": [
"# RAG application running locally on Intel Xeon CPU using langchain and open-source models"
]
},
{
"cell_type": "markdown",
"id": "48113be6-44bb-4aac-aed3-76a1365b9561",
"metadata": {},
"source": [
"Author - Pratool Bharti (pratool.bharti@intel.com)"
]
},
{
"cell_type": "markdown",
"id": "8b10b54b-1572-4ea1-9c1e-1d29fcc3dcd9",
"metadata": {},
"source": [
"In this cookbook, we use langchain tools and open source models to execute locally on CPU. This notebook has been validated to run on Intel Xeon 8480+ CPU. Here we implement a RAG pipeline for Llama2 model to answer questions about Intel Q1 2024 earnings release."
]
},
{
"cell_type": "markdown",
"id": "acadbcec-3468-4926-8ce5-03b678041c0a",
"metadata": {},
"source": [
"**Create a conda or virtualenv environment with python >=3.10 and install following libraries**\n",
"<br>\n",
"\n",
"`pip install --upgrade langchain langchain-community langchainhub langchain-chroma bs4 gpt4all pypdf pysqlite3-binary` <br>\n",
"`pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu`"
]
},
{
"cell_type": "markdown",
"id": "84c392c8-700a-42ec-8e94-806597f22e43",
"metadata": {},
"source": [
"**Load pysqlite3 in sys modules since ChromaDB requires sqlite3.**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "145cd491-b388-4ea7-bdc8-2f4995cac6fd",
"metadata": {},
"outputs": [],
"source": [
"__import__(\"pysqlite3\")\n",
"import sys\n",
"\n",
"sys.modules[\"sqlite3\"] = sys.modules.pop(\"pysqlite3\")"
]
},
{
"cell_type": "markdown",
"id": "14dde7e2-b236-49b9-b3a0-08c06410418c",
"metadata": {},
"source": [
"**Import essential components from langchain to load and split data**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "887643ba-249e-48d6-9aa7-d25087e8dfbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.document_loaders import PyPDFLoader"
]
},
{
"cell_type": "markdown",
"id": "922c0eba-8736-4de5-bd2f-3d0f00b16e43",
"metadata": {},
"source": [
"**Download Intel Q1 2024 earnings release**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2d6a2419-5338-4188-8615-a40a65ff8019",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-07-15 15:04:43-- https://d1io3yog0oux5.cloudfront.net/_11d435a500963f99155ee058df09f574/intel/db/887/9014/earnings_release/Q1+24_EarningsRelease_FINAL.pdf\n",
"Resolving proxy-dmz.intel.com (proxy-dmz.intel.com)... 10.7.211.16\n",
"Connecting to proxy-dmz.intel.com (proxy-dmz.intel.com)|10.7.211.16|:912... connected.\n",
"Proxy request sent, awaiting response... 200 OK\n",
"Length: 133510 (130K) [application/pdf]\n",
"Saving to: intel_q1_2024_earnings.pdf\n",
"\n",
"intel_q1_2024_earni 100%[===================>] 130.38K --.-KB/s in 0.005s \n",
"\n",
"2024-07-15 15:04:44 (24.6 MB/s) - intel_q1_2024_earnings.pdf saved [133510/133510]\n",
"\n"
]
}
],
"source": [
"!wget 'https://d1io3yog0oux5.cloudfront.net/_11d435a500963f99155ee058df09f574/intel/db/887/9014/earnings_release/Q1+24_EarningsRelease_FINAL.pdf' -O intel_q1_2024_earnings.pdf"
]
},
{
"cell_type": "markdown",
"id": "e3612627-e105-453d-8a50-bbd6e39dedb5",
"metadata": {},
"source": [
"**Loading earning release pdf document through PyPDFLoader**"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "cac6278e-ebad-4224-a062-bf6daca24cb0",
"metadata": {},
"outputs": [],
"source": [
"loader = PyPDFLoader(\"intel_q1_2024_earnings.pdf\")\n",
"data = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "a7dca43b-1c62-41df-90c7-6ed2904f823d",
"metadata": {},
"source": [
"**Splitting entire document in several chunks with each chunk size is 500 tokens**"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4486adbe-0d0e-4685-8c08-c1774ed6e993",
"metadata": {},
"outputs": [],
"source": [
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
"all_splits = text_splitter.split_documents(data)"
]
},
{
"cell_type": "markdown",
"id": "af142346-e793-4a52-9a56-63e3be416b3d",
"metadata": {},
"source": [
"**Looking at the first split of the document**"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e4240fd1-898e-4bfc-a377-02c9bc25b56e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(metadata={'source': 'intel_q1_2024_earnings.pdf', 'page': 0}, page_content='Intel Corporation\\n2200 Mission College Blvd.\\nSanta Clara, CA 95054-1549\\n \\nNews Release\\n Intel Reports First -Quarter 2024 Financial Results\\nNEWS SUMMARY\\n▪First-quarter revenue of $12.7 billion , up 9% year over year (YoY).\\n▪First-quarter GAAP earnings (loss) per share (EPS) attributable to Intel was $(0.09) ; non-GAAP EPS \\nattributable to Intel was $0.18 .')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_splits[0]"
]
},
{
"cell_type": "markdown",
"id": "b88d2632-7c1b-49ef-a691-c0eb67d23e6a",
"metadata": {},
"source": [
"**One of the major step in RAG is to convert each split of document into embeddings and store in a vector database such that searching relevant documents are efficient.** <br>\n",
"**For that, importing Chroma vector database from langchain. Also, importing open source GPT4All for embedding models**"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "9ff99dd7-9d47-4239-ba0a-d775792334ba",
"metadata": {},
"outputs": [],
"source": [
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import GPT4AllEmbeddings"
]
},
{
"cell_type": "markdown",
"id": "b5d1f4dd-dd8d-4a20-95d1-2dbdd204375a",
"metadata": {},
"source": [
"**In next step, we will download one of the most popular embedding model \"all-MiniLM-L6-v2\". Find more details of the model at this link https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2**"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "05db3494-5d8e-4a13-9941-26330a86f5e5",
"metadata": {},
"outputs": [],
"source": [
"model_name = \"all-MiniLM-L6-v2.gguf2.f16.gguf\"\n",
"gpt4all_kwargs = {\"allow_download\": \"True\"}\n",
"embeddings = GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs)"
]
},
{
"cell_type": "markdown",
"id": "4e53999e-1983-46ac-8039-2783e194c3ae",
"metadata": {},
"source": [
"**Store all the embeddings in the Chroma database**"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "0922951a-9ddf-4761-973d-8e9a86f61284",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Chroma.from_documents(documents=all_splits, embedding=embeddings)"
]
},
{
"cell_type": "markdown",
"id": "29f94fa0-6c75-4a65-a1a3-debc75422479",
"metadata": {},
"source": [
"**Now, let's find relevant splits from the documents related to the question**"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "88c8152d-ec7a-4f0b-9d86-877789407537",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
}
],
"source": [
"question = \"What is Intel CCG revenue in Q1 2024\"\n",
"docs = vectorstore.similarity_search(question)\n",
"print(len(docs))"
]
},
{
"cell_type": "markdown",
"id": "53330c6b-cb0f-43f9-b379-2e57ac1e5335",
"metadata": {},
"source": [
"**Look at the first retrieved document from the vector database**"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "43a6d94f-b5c4-47b0-a353-2db4c3d24d9c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(metadata={'page': 1, 'source': 'intel_q1_2024_earnings.pdf'}, page_content='Client Computing Group (CCG) $7.5 billion up31%\\nData Center and AI (DCAI) $3.0 billion up5%\\nNetwork and Edge (NEX) $1.4 billion down 8%\\nTotal Intel Products revenue $11.9 billion up17%\\nIntel Foundry $4.4 billion down 10%\\nAll other:\\nAltera $342 million down 58%\\nMobileye $239 million down 48%\\nOther $194 million up17%\\nTotal all other revenue $775 million down 46%\\nIntersegment eliminations $(4.4) billion\\nTotal net revenue $12.7 billion up9%\\nIntel Products Highlights')"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0]"
]
},
{
"cell_type": "markdown",
"id": "64ba074f-4b36-442e-b7e2-b26d6e2815c3",
"metadata": {},
"source": [
"**Download Lllama-2 model from Huggingface and store locally** <br>\n",
"**You can download different quantization variant of Lllama-2 model from the link below. We are using Q8 version here (7.16GB).** <br>\n",
"https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c8dd0811-6f43-4bc6-b854-2ab377639c9a",
"metadata": {},
"outputs": [],
"source": [
"!huggingface-cli download TheBloke/Llama-2-7b-Chat-GGUF llama-2-7b-chat.Q8_0.gguf --local-dir . --local-dir-use-symlinks False"
]
},
{
"cell_type": "markdown",
"id": "3895b1f5-f51d-4539-abf0-af33d7ca48ea",
"metadata": {},
"source": [
"**Import langchain components required to load downloaded LLMs model**"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fb087088-aa62-44c0-8356-061e9b9f1186",
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks.manager import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain_community.llms import LlamaCpp"
]
},
{
"cell_type": "markdown",
"id": "5a8a111e-2614-4b70-b034-85cd3e7304cb",
"metadata": {},
"source": [
"**Loading the local Lllama-2 model using Llama-cpp library**"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "fb917da2-c0d7-4995-b56d-26254276e0da",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from llama-2-7b-chat.Q8_0.gguf (version GGUF V2)\n",
"llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.\n",
"llama_model_loader: - kv 0: general.architecture str = llama\n",
"llama_model_loader: - kv 1: general.name str = LLaMA v2\n",
"llama_model_loader: - kv 2: llama.context_length u32 = 4096\n",
"llama_model_loader: - kv 3: llama.embedding_length u32 = 4096\n",
"llama_model_loader: - kv 4: llama.block_count u32 = 32\n",
"llama_model_loader: - kv 5: llama.feed_forward_length u32 = 11008\n",
"llama_model_loader: - kv 6: llama.rope.dimension_count u32 = 128\n",
"llama_model_loader: - kv 7: llama.attention.head_count u32 = 32\n",
"llama_model_loader: - kv 8: llama.attention.head_count_kv u32 = 32\n",
"llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001\n",
"llama_model_loader: - kv 10: general.file_type u32 = 7\n",
"llama_model_loader: - kv 11: tokenizer.ggml.model str = llama\n",
"llama_model_loader: - kv 12: tokenizer.ggml.tokens arr[str,32000] = [\"<unk>\", \"<s>\", \"</s>\", \"<0x00>\", \"<...\n",
"llama_model_loader: - kv 13: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...\n",
"llama_model_loader: - kv 14: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...\n",
"llama_model_loader: - kv 15: tokenizer.ggml.bos_token_id u32 = 1\n",
"llama_model_loader: - kv 16: tokenizer.ggml.eos_token_id u32 = 2\n",
"llama_model_loader: - kv 17: tokenizer.ggml.unknown_token_id u32 = 0\n",
"llama_model_loader: - kv 18: general.quantization_version u32 = 2\n",
"llama_model_loader: - type f32: 65 tensors\n",
"llama_model_loader: - type q8_0: 226 tensors\n",
"llm_load_vocab: special tokens cache size = 259\n",
"llm_load_vocab: token to piece cache size = 0.1684 MB\n",
"llm_load_print_meta: format = GGUF V2\n",
"llm_load_print_meta: arch = llama\n",
"llm_load_print_meta: vocab type = SPM\n",
"llm_load_print_meta: n_vocab = 32000\n",
"llm_load_print_meta: n_merges = 0\n",
"llm_load_print_meta: vocab_only = 0\n",
"llm_load_print_meta: n_ctx_train = 4096\n",
"llm_load_print_meta: n_embd = 4096\n",
"llm_load_print_meta: n_layer = 32\n",
"llm_load_print_meta: n_head = 32\n",
"llm_load_print_meta: n_head_kv = 32\n",
"llm_load_print_meta: n_rot = 128\n",
"llm_load_print_meta: n_swa = 0\n",
"llm_load_print_meta: n_embd_head_k = 128\n",
"llm_load_print_meta: n_embd_head_v = 128\n",
"llm_load_print_meta: n_gqa = 1\n",
"llm_load_print_meta: n_embd_k_gqa = 4096\n",
"llm_load_print_meta: n_embd_v_gqa = 4096\n",
"llm_load_print_meta: f_norm_eps = 0.0e+00\n",
"llm_load_print_meta: f_norm_rms_eps = 1.0e-06\n",
"llm_load_print_meta: f_clamp_kqv = 0.0e+00\n",
"llm_load_print_meta: f_max_alibi_bias = 0.0e+00\n",
"llm_load_print_meta: f_logit_scale = 0.0e+00\n",
"llm_load_print_meta: n_ff = 11008\n",
"llm_load_print_meta: n_expert = 0\n",
"llm_load_print_meta: n_expert_used = 0\n",
"llm_load_print_meta: causal attn = 1\n",
"llm_load_print_meta: pooling type = 0\n",
"llm_load_print_meta: rope type = 0\n",
"llm_load_print_meta: rope scaling = linear\n",
"llm_load_print_meta: freq_base_train = 10000.0\n",
"llm_load_print_meta: freq_scale_train = 1\n",
"llm_load_print_meta: n_ctx_orig_yarn = 4096\n",
"llm_load_print_meta: rope_finetuned = unknown\n",
"llm_load_print_meta: ssm_d_conv = 0\n",
"llm_load_print_meta: ssm_d_inner = 0\n",
"llm_load_print_meta: ssm_d_state = 0\n",
"llm_load_print_meta: ssm_dt_rank = 0\n",
"llm_load_print_meta: model type = 7B\n",
"llm_load_print_meta: model ftype = Q8_0\n",
"llm_load_print_meta: model params = 6.74 B\n",
"llm_load_print_meta: model size = 6.67 GiB (8.50 BPW) \n",
"llm_load_print_meta: general.name = LLaMA v2\n",
"llm_load_print_meta: BOS token = 1 '<s>'\n",
"llm_load_print_meta: EOS token = 2 '</s>'\n",
"llm_load_print_meta: UNK token = 0 '<unk>'\n",
"llm_load_print_meta: LF token = 13 '<0x0A>'\n",
"llm_load_print_meta: max token length = 48\n",
"llm_load_tensors: ggml ctx size = 0.14 MiB\n",
"llm_load_tensors: CPU buffer size = 6828.64 MiB\n",
"...................................................................................................\n",
"llama_new_context_with_model: n_ctx = 2048\n",
"llama_new_context_with_model: n_batch = 512\n",
"llama_new_context_with_model: n_ubatch = 512\n",
"llama_new_context_with_model: flash_attn = 0\n",
"llama_new_context_with_model: freq_base = 10000.0\n",
"llama_new_context_with_model: freq_scale = 1\n",
"llama_kv_cache_init: CPU KV buffer size = 1024.00 MiB\n",
"llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB\n",
"llama_new_context_with_model: CPU output buffer size = 0.12 MiB\n",
"llama_new_context_with_model: CPU compute buffer size = 164.01 MiB\n",
"llama_new_context_with_model: graph nodes = 1030\n",
"llama_new_context_with_model: graph splits = 1\n",
"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 | \n",
"Model metadata: {'tokenizer.ggml.unknown_token_id': '0', 'tokenizer.ggml.eos_token_id': '2', 'general.architecture': 'llama', 'llama.context_length': '4096', 'general.name': 'LLaMA v2', 'llama.embedding_length': '4096', 'llama.feed_forward_length': '11008', 'llama.attention.layer_norm_rms_epsilon': '0.000001', 'llama.rope.dimension_count': '128', 'llama.attention.head_count': '32', 'tokenizer.ggml.bos_token_id': '1', 'llama.block_count': '32', 'llama.attention.head_count_kv': '32', 'general.quantization_version': '2', 'tokenizer.ggml.model': 'llama', 'general.file_type': '7'}\n",
"Using fallback chat format: llama-2\n"
]
}
],
"source": [
"llm = LlamaCpp(\n",
" model_path=\"llama-2-7b-chat.Q8_0.gguf\",\n",
" n_gpu_layers=-1,\n",
" n_batch=512,\n",
" n_ctx=2048,\n",
" f16_kv=True, # MUST set to True, otherwise you will run into problem after a couple of calls\n",
" callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "43e06f56-ef97-451b-87d9-8465ea442aed",
"metadata": {},
"source": [
"**Now let's ask the same question to Llama model without showing them the earnings release.**"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "1033dd82-5532-437d-a548-27695e109589",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"?\n",
"(NASDAQ:INTC)\n",
"Intel's CCG (Client Computing Group) revenue for Q1 2024 was $9.6 billion, a decrease of 35% from the previous quarter and a decrease of 42% from the same period last year."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 16.05 ms / 68 runs ( 0.24 ms per token, 4236.76 tokens per second)\n",
"llama_print_timings: prompt eval time = 131.14 ms / 16 tokens ( 8.20 ms per token, 122.01 tokens per second)\n",
"llama_print_timings: eval time = 3225.00 ms / 67 runs ( 48.13 ms per token, 20.78 tokens per second)\n",
"llama_print_timings: total time = 3466.40 ms / 83 tokens\n"
]
},
{
"data": {
"text/plain": [
"\"?\\n(NASDAQ:INTC)\\nIntel's CCG (Client Computing Group) revenue for Q1 2024 was $9.6 billion, a decrease of 35% from the previous quarter and a decrease of 42% from the same period last year.\""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm.invoke(question)"
]
},
{
"cell_type": "markdown",
"id": "75f5cb10-746f-4e37-9386-b85a4d2b84ef",
"metadata": {},
"source": [
"**As you can see, model is giving wrong information. Correct asnwer is CCG revenue in Q1 2024 is $7.5B. Now let's apply RAG using the earning release document**"
]
},
{
"cell_type": "markdown",
"id": "0f4150ec-5692-4756-b11a-22feb7ab88ff",
"metadata": {},
"source": [
"**in RAG, we modify the input prompt by adding relevent documents with the question. Here, we use one of the popular RAG prompt**"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "226c14b0-f43e-4a1f-a1e4-04731d467ec4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template=\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\\nQuestion: {question} \\nContext: {context} \\nAnswer:\"))]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain import hub\n",
"\n",
"rag_prompt = hub.pull(\"rlm/rag-prompt\")\n",
"rag_prompt.messages"
]
},
{
"cell_type": "markdown",
"id": "77deb6a0-0950-450a-916a-f2a029676c20",
"metadata": {},
"source": [
"**Appending all retreived documents in a single document**"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "2dbc3327-6ef3-4c1f-8797-0c71964b0921",
"metadata": {},
"outputs": [],
"source": [
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)"
]
},
{
"cell_type": "markdown",
"id": "2e2d9f18-49d0-43a3-bea8-78746ffa86b7",
"metadata": {},
"source": [
"**The last step is to create a chain using langchain tool that will create an e2e pipeline. It will take question and context as an input.**"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "427379c2-51ff-4e0f-8278-a45221363299",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnablePassthrough, RunnablePick\n",
"\n",
"# Chain\n",
"chain = (\n",
" RunnablePassthrough.assign(context=RunnablePick(\"context\") | format_docs)\n",
" | rag_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "095d6280-c949-4d00-8e32-8895a82d245f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Llama.generate: prefix-match hit\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Based on the provided context, Intel CCG revenue in Q1 2024 was $7.5 billion up 31%."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 7.74 ms / 31 runs ( 0.25 ms per token, 4004.13 tokens per second)\n",
"llama_print_timings: prompt eval time = 2529.41 ms / 674 tokens ( 3.75 ms per token, 266.46 tokens per second)\n",
"llama_print_timings: eval time = 1542.94 ms / 30 runs ( 51.43 ms per token, 19.44 tokens per second)\n",
"llama_print_timings: total time = 4123.68 ms / 704 tokens\n"
]
},
{
"data": {
"text/plain": [
"' Based on the provided context, Intel CCG revenue in Q1 2024 was $7.5 billion up 31%.'"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"context\": docs, \"question\": question})"
]
},
{
"cell_type": "markdown",
"id": "638364b2-6bd2-4471-9961-d3a1d1b9d4ee",
"metadata": {},
"source": [
"**Now we see the results are correct as it is mentioned in earnings release.** <br>\n",
"**To further automate, we will create a chain that will take input as question and retriever so that we don't need to retrieve documents seperately**"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "4654e5b7-635f-4767-8b31-4c430164cdd5",
"metadata": {},
"outputs": [],
"source": [
"retriever = vectorstore.as_retriever()\n",
"qa_chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | rag_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0979f393-fd0a-4e82-b844-68371c6ad68f",
"metadata": {},
"source": [
"**Now we only need to pass the question to the chain and it will fetch the contexts directly from the vector database to generate the answer**\n",
"<br>\n",
"**Let's try with another question**"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "3ea07b82-e6ec-4084-85f4-191373530172",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Llama.generate: prefix-match hit\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" According to the provided context, Intel DCAI revenue in Q1 2024 was $3.0 billion up 5%."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 6.28 ms / 31 runs ( 0.20 ms per token, 4937.88 tokens per second)\n",
"llama_print_timings: prompt eval time = 2681.93 ms / 730 tokens ( 3.67 ms per token, 272.19 tokens per second)\n",
"llama_print_timings: eval time = 1471.07 ms / 30 runs ( 49.04 ms per token, 20.39 tokens per second)\n",
"llama_print_timings: total time = 4206.77 ms / 760 tokens\n"
]
},
{
"data": {
"text/plain": [
"' According to the provided context, Intel DCAI revenue in Q1 2024 was $3.0 billion up 5%.'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa_chain.invoke(\"what is Intel DCAI revenue in Q1 2024?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9407f2a0-4a35-4315-8e96-02fcb80f210c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "rag-on-intel",
"language": "python",
"name": "rag-on-intel"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -36,10 +36,10 @@
"from bs4 import BeautifulSoup as Soup\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryByteStore, LocalFileStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders.recursive_url_loader import (\n",
" RecursiveUrlLoader,\n",
")\n",
"from langchain_community.vectorstores import Chroma\n",
"\n",
"# For our example, we'll load docs from the web\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
@@ -370,13 +370,14 @@
],
"source": [
"import torch\n",
"from langchain.llms.huggingface_pipeline import HuggingFacePipeline\n",
"from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline\n",
"from langchain_huggingface.llms import HuggingFacePipeline\n",
"from optimum.intel.ipex import IPEXModelForCausalLM\n",
"from transformers import AutoTokenizer, pipeline\n",
"\n",
"model_id = \"Intel/neural-chat-7b-v3-3\"\n",
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" model_id, device_map=\"auto\", torch_dtype=torch.bfloat16\n",
"model = IPEXModelForCausalLM.from_pretrained(\n",
" model_id, torch_dtype=torch.bfloat16, export=True\n",
")\n",
"\n",
"pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer, max_new_tokens=100)\n",
@@ -581,7 +582,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.10.14"
}
},
"nbformat": 4,

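The diff above swaps `AutoModelForCausalLM` for `optimum-intel`'s `IPEXModelForCausalLM`. A sketch of the full cell after the change, including the `HuggingFacePipeline` wrapper that turns the `transformers` pipeline into a LangChain LLM (the prompt is illustrative):

```python
import torch
from langchain_huggingface.llms import HuggingFacePipeline
from optimum.intel.ipex import IPEXModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "Intel/neural-chat-7b-v3-3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# export=True applies IPEX graph optimizations when loading the model on CPU
model = IPEXModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, export=True
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100)
# Wrap the transformers pipeline so it behaves like any other LangChain LLM
llm = HuggingFacePipeline(pipeline=pipe)
print(llm.invoke("What is retrieval-augmented generation?"))  # illustrative prompt
```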
View File

@@ -740,7 +740,7 @@ Even this relatively large model will most likely fail to generate more complica
```bash
poetry run pip install pyyaml chromadb
poetry run pip install pyyaml langchain_chroma
import yaml
```
@@ -994,7 +994,7 @@ from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.chains.sql_database.prompt import _sqlite_prompt, PROMPT_SUFFIX
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.prompts.example_selector.semantic_similarity import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_chroma import Chroma
example_prompt = PromptTemplate(
input_variables=["table_info", "input", "sql_cmd", "sql_result", "answer"],

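The import diff above feeds a few-shot SQL prompt. A hedged sketch of how those pieces typically wire together with the new `langchain_chroma` import; the example row and prompt strings are hypothetical stand-ins for what the notebook loads from YAML:

```python
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector.semantic_similarity import (
    SemanticSimilarityExampleSelector,
)
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

example_prompt = PromptTemplate(
    input_variables=["table_info", "input", "sql_cmd", "sql_result", "answer"],
    template=(
        "{table_info}\n\nQuestion: {input}\nSQL: {sql_cmd}\n"
        "Result: {sql_result}\nAnswer: {answer}"
    ),
)

# Hypothetical hand-written example; the notebook loads these from YAML
examples = [
    {
        "table_info": "CREATE TABLE tracks (id INT, name TEXT)",
        "input": "How many tracks are there?",
        "sql_cmd": "SELECT COUNT(*) FROM tracks;",
        "sql_result": "[(3503,)]",
        "answer": "There are 3503 tracks.",
    },
]

# Pick the k most semantically similar examples for each incoming question,
# with Chroma as the backing vector store
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, HuggingFaceEmbeddings(), Chroma, k=1
)

few_shot_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Answer the question using valid SQLite.",  # illustrative prefix
    suffix="{table_info}\n\nQuestion: {input}",
    input_variables=["table_info", "input"],
)
```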
View File

@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install --quiet pypdf chromadb tiktoken openai langchain-together"
"! pip install --quiet pypdf tiktoken openai langchain-chroma langchain-together"
]
},
{
@@ -45,8 +45,8 @@
"all_splits = text_splitter.split_documents(data)\n",
"\n",
"# Add to vectorDB\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import OpenAIEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"\n",
"\"\"\"\n",
"from langchain_together.embeddings import TogetherEmbeddings\n",

File diff suppressed because one or more lines are too long

View File

@@ -13,7 +13,12 @@ OUTPUT_NEW_DOCS_DIR = $(OUTPUT_NEW_DIR)/docs
PYTHON = .venv/bin/python
PARTNER_DEPS_LIST := $(shell find ../libs/partners -mindepth 1 -maxdepth 1 -type d -exec test -e "{}/pyproject.toml" \; -print | grep -vE "airbyte|ibm" | tr '\n' ' ')
PARTNER_DEPS_LIST := $(shell find ../libs/partners -mindepth 1 -maxdepth 1 -type d -exec sh -c ' \
for dir; do \
if find "$$dir" -maxdepth 1 -type f \( -name "pyproject.toml" -o -name "setup.py" \) | grep -q .; then \
echo "$$dir"; \
fi \
done' sh {} + | grep -vE "airbyte|ibm|couchbase" | tr '\n' ' ')
PORT ?= 3001
@@ -36,9 +41,11 @@ generate-files:
cp -r $(SOURCE_DIR)/* $(INTERMEDIATE_DIR)
mkdir -p $(INTERMEDIATE_DIR)/templates
$(PYTHON) scripts/model_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/tool_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/document_loader_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/kv_store_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/copy_templates.py $(INTERMEDIATE_DIR)
@@ -63,16 +70,23 @@ render:
md-sync:
rsync -avm --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --include="*/_category_.yml" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)
append-related:
$(PYTHON) scripts/append_related_links.py $(OUTPUT_NEW_DOCS_DIR)
generate-references:
$(PYTHON) scripts/generate_api_reference_links.py --docs_dir $(OUTPUT_NEW_DOCS_DIR)
build: install-py-deps generate-files copy-infra render md-sync
build: install-py-deps generate-files copy-infra render md-sync append-related
vercel-build: install-vercel-deps build generate-references
rm -rf docs
mv $(OUTPUT_NEW_DOCS_DIR) docs
rm -rf build
yarn run docusaurus build
mkdir static/api_reference
git clone --depth=1 https://github.com/baskaryan/langchain-api-docs-build.git
mv langchain-api-docs-build/api_reference_build/html/* static/api_reference/
rm -rf langchain-api-docs-build
NODE_OPTIONS="--max-old-space-size=5000" yarn run docusaurus build
mv build v0.2
mkdir build
mv v0.2 build

View File

@@ -0,0 +1,144 @@
"""A directive to generate a gallery of images from structured data.
Generating a gallery of images that are all the same size is a common
pattern in documentation, and this can be cumbersome if the gallery is
generated programmatically. This directive wraps this particular use-case
in a helper-directive to generate it with a single YAML configuration file.
It currently exists for maintainers of the pydata-sphinx-theme,
but might be abstracted into a standalone package if it proves useful.
"""
from pathlib import Path
from typing import Any, ClassVar, Dict, List
from docutils import nodes
from docutils.parsers.rst import directives
from sphinx.application import Sphinx
from sphinx.util import logging
from sphinx.util.docutils import SphinxDirective
from yaml import safe_load
logger = logging.getLogger(__name__)
TEMPLATE_GRID = """
`````{{grid}} {columns}
{options}
{content}
`````
"""
GRID_CARD = """
````{{grid-item-card}} {title}
{options}
{content}
````
"""
class GalleryGridDirective(SphinxDirective):
"""A directive to show a gallery of images and links in a Bootstrap grid.
The grid can be generated from a YAML file that contains a list of items, or
from the content of the directive (also formatted in YAML). Use the parameter
"class-card" to add an additional CSS class to all cards. When specifying the grid
items, you can use all parameters from "grid-item-card" directive to customize
individual cards + ["image", "header", "content", "title"].
Danger:
This directive can only be used in the context of a Myst documentation page as
the templates use Markdown flavored formatting.
"""
name = "gallery-grid"
has_content = True
required_arguments = 0
optional_arguments = 1
final_argument_whitespace = True
option_spec: ClassVar[dict[str, Any]] = {
# A class to be added to the resulting container
"grid-columns": directives.unchanged,
"class-container": directives.unchanged,
"class-card": directives.unchanged,
}
def run(self) -> List[nodes.Node]:
"""Create the gallery grid."""
if self.arguments:
# If an argument is given, assume it's a path to a YAML file
# Parse it and load it into the directive content
path_data_rel = Path(self.arguments[0])
path_doc, _ = self.get_source_info()
path_doc = Path(path_doc).parent
path_data = (path_doc / path_data_rel).resolve()
if not path_data.exists():
logger.info(f"Could not find grid data at {path_data}.")
return [nodes.Text(f"No grid data found at {path_data}.")]
yaml_string = path_data.read_text()
else:
yaml_string = "\n".join(self.content)
# Use all the element with an img-bottom key as sites to show
# and generate a card item for each of them
grid_items = []
for item in safe_load(yaml_string):
# remove parameters that are not needed for the card options
title = item.pop("title", "")
# build the content of the card using some extra parameters
header = f"{item.pop('header')} \n^^^ \n" if "header" in item else ""
image = f"![image]({item.pop('image')}) \n" if "image" in item else ""
content = f"{item.pop('content')} \n" if "content" in item else ""
# optional parameter that influence all cards
if "class-card" in self.options:
item["class-card"] = self.options["class-card"]
loc_options_str = "\n".join(f":{k}: {v}" for k, v in item.items()) + " \n"
card = GRID_CARD.format(
options=loc_options_str, content=header + image + content, title=title
)
grid_items.append(card)
# Parse the template with Sphinx Design to create an output container
# Prep the options for the template grid
class_ = "gallery-directive" + f' {self.options.get("class-container", "")}'
options = {"gutter": 2, "class-container": class_}
options_str = "\n".join(f":{k}: {v}" for k, v in options.items())
# Create the directive string for the grid
grid_directive = TEMPLATE_GRID.format(
columns=self.options.get("grid-columns", "1 2 3 4"),
options=options_str,
content="\n".join(grid_items),
)
# Parse content as a directive so Sphinx Design processes it
container = nodes.container()
self.state.nested_parse([grid_directive], 0, container)
# Sphinx Design outputs a container too, so just use that
return [container.children[0]]
def setup(app: Sphinx) -> Dict[str, Any]:
"""Add custom configuration to sphinx app.
Args:
app: the Sphinx application
Returns:
the 2 parallel parameters set to ``True``.
"""
app.add_directive("gallery-grid", GalleryGridDirective)
return {
"parallel_read_safe": True,
"parallel_write_safe": True,
}

View File

@@ -1,26 +1,411 @@
pre {
white-space: break-spaces;
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@400;700&display=swap');
/*******************************************************************************
* master color map. Only the colors that actually differ between light and dark
* themes are specified separately.
*
* To see the full list of colors see https://www.figma.com/file/rUrrHGhUBBIAAjQ82x6pz9/PyData-Design-system---proposal-for-implementation-(2)?node-id=1234%3A765&t=ifcFT1JtnrSshGfi-1
*/
/**
* Function to get items from nested maps
*/
/* Assign base colors for the PyData theme */
:root {
--pst-teal-50: #f4fbfc;
--pst-teal-100: #e9f6f8;
--pst-teal-200: #d0ecf1;
--pst-teal-300: #abdde6;
--pst-teal-400: #3fb1c5;
--pst-teal-500: #0a7d91;
--pst-teal-600: #085d6c;
--pst-teal-700: #064752;
--pst-teal-800: #042c33;
--pst-teal-900: #021b1f;
--pst-violet-50: #f4eefb;
--pst-violet-100: #e0c7ff;
--pst-violet-200: #d5b4fd;
--pst-violet-300: #b780ff;
--pst-violet-400: #9c5ffd;
--pst-violet-500: #8045e5;
--pst-violet-600: #6432bd;
--pst-violet-700: #4b258f;
--pst-violet-800: #341a61;
--pst-violet-900: #1e0e39;
--pst-gray-50: #f9f9fa;
--pst-gray-100: #f3f4f5;
--pst-gray-200: #e5e7ea;
--pst-gray-300: #d1d5da;
--pst-gray-400: #9ca4af;
--pst-gray-500: #677384;
--pst-gray-600: #48566b;
--pst-gray-700: #29313d;
--pst-gray-800: #222832;
--pst-gray-900: #14181e;
--pst-pink-50: #fcf8fd;
--pst-pink-100: #fcf0fa;
--pst-pink-200: #f8dff5;
--pst-pink-300: #f3c7ee;
--pst-pink-400: #e47fd7;
--pst-pink-500: #c132af;
--pst-pink-600: #912583;
--pst-pink-700: #6e1c64;
--pst-pink-800: #46123f;
--pst-pink-900: #2b0b27;
--pst-foundation-white: #ffffff;
--pst-foundation-black: #14181e;
--pst-green-10: #f1fdfd;
--pst-green-50: #E0F7F6;
--pst-green-100: #B3E8E6;
--pst-green-200: #80D6D3;
--pst-green-300: #4DC4C0;
--pst-green-400: #4FB2AD;
--pst-green-500: #287977;
--pst-green-600: #246161;
--pst-green-700: #204F4F;
--pst-green-800: #1C3C3C;
--pst-green-900: #0D2427;
--pst-lilac-50: #f4eefb;
--pst-lilac-100: #DAD6FE;
--pst-lilac-200: #BCB2FD;
--pst-lilac-300: #9F8BFA;
--pst-lilac-400: #7F5CF6;
--pst-lilac-500: #6F3AED;
--pst-lilac-600: #6028D9;
--pst-lilac-700: #5021B6;
--pst-lilac-800: #431D95;
--pst-lilac-900: #1e0e39;
--pst-header-height: 2.5rem;
}
@media (min-width: 1200px) {
.container,
.container-lg,
.container-md,
.container-sm,
.container-xl {
max-width: 2560px !important;
}
html {
--pst-font-family-base: 'Inter';
--pst-font-family-heading: 'Inter Tight', sans-serif;
}
#my-component-root *,
#headlessui-portal-root * {
z-index: 10000;
/*******************************************************************************
* write the color rules for each theme (light/dark)
*/
/* NOTE:
* Mixins enable us to reuse the same definitions for the different modes
* https://sass-lang.com/documentation/at-rules/mixin
* something inserts a variable into a CSS selector or property name
* https://sass-lang.com/documentation/interpolation
*/
/* Defaults to light mode if data-theme is not set */
html:not([data-theme]) {
--pst-color-primary: #287977;
--pst-color-primary-bg: #80D6D3;
--pst-color-secondary: #6F3AED;
--pst-color-secondary-bg: #DAD6FE;
--pst-color-accent: #c132af;
--pst-color-accent-bg: #f8dff5;
--pst-color-info: #276be9;
--pst-color-info-bg: #dce7fc;
--pst-color-warning: #f66a0a;
--pst-color-warning-bg: #f8e3d0;
--pst-color-success: #00843f;
--pst-color-success-bg: #d6ece1;
--pst-color-attention: var(--pst-color-warning);
--pst-color-attention-bg: var(--pst-color-warning-bg);
--pst-color-danger: #d72d47;
--pst-color-danger-bg: #f9e1e4;
--pst-color-text-base: #222832;
--pst-color-text-muted: #48566b;
--pst-color-heading-color: #ffffff;
--pst-color-shadow: rgba(0, 0, 0, 0.1);
--pst-color-border: #d1d5da;
--pst-color-border-muted: rgba(23, 23, 26, 0.2);
--pst-color-inline-code: #912583;
--pst-color-inline-code-links: #246161;
--pst-color-target: #f3cf95;
--pst-color-background: #ffffff;
--pst-color-on-background: #F4F9F8;
--pst-color-surface: #F4F9F8;
--pst-color-on-surface: #222832;
}
html:not([data-theme]) {
--pst-color-link: var(--pst-color-primary);
--pst-color-link-hover: var(--pst-color-secondary);
}
html:not([data-theme]) .only-dark,
html:not([data-theme]) .only-dark ~ figcaption {
display: none !important;
}
table.longtable code {
white-space: normal;
/* NOTE: @each {...} is like a for-loop
* https://sass-lang.com/documentation/at-rules/control/each
*/
html[data-theme=light] {
--pst-color-primary: #287977;
--pst-color-primary-bg: #80D6D3;
--pst-color-secondary: #6F3AED;
--pst-color-secondary-bg: #DAD6FE;
--pst-color-accent: #c132af;
--pst-color-accent-bg: #f8dff5;
--pst-color-info: #276be9;
--pst-color-info-bg: #dce7fc;
--pst-color-warning: #f66a0a;
--pst-color-warning-bg: #f8e3d0;
--pst-color-success: #00843f;
--pst-color-success-bg: #d6ece1;
--pst-color-attention: var(--pst-color-warning);
--pst-color-attention-bg: var(--pst-color-warning-bg);
--pst-color-danger: #d72d47;
--pst-color-danger-bg: #f9e1e4;
--pst-color-text-base: #222832;
--pst-color-text-muted: #48566b;
--pst-color-heading-color: #ffffff;
--pst-color-shadow: rgba(0, 0, 0, 0.1);
--pst-color-border: #d1d5da;
--pst-color-border-muted: rgba(23, 23, 26, 0.2);
--pst-color-inline-code: #912583;
--pst-color-inline-code-links: #246161;
--pst-color-target: #f3cf95;
--pst-color-background: #ffffff;
--pst-color-on-background: #F4F9F8;
--pst-color-surface: #F4F9F8;
--pst-color-on-surface: #222832;
color-scheme: light;
}
html[data-theme=light] {
--pst-color-link: var(--pst-color-primary);
--pst-color-link-hover: var(--pst-color-secondary);
}
html[data-theme=light] .only-dark,
html[data-theme=light] .only-dark ~ figcaption {
display: none !important;
}
table.longtable td {
max-width: 600px;
html[data-theme=dark] {
--pst-color-primary: #4FB2AD;
--pst-color-primary-bg: #1C3C3C;
--pst-color-secondary: #7F5CF6;
--pst-color-secondary-bg: #431D95;
--pst-color-accent: #e47fd7;
--pst-color-accent-bg: #46123f;
--pst-color-info: #79a3f2;
--pst-color-info-bg: #06245d;
--pst-color-warning: #ff9245;
--pst-color-warning-bg: #652a02;
--pst-color-success: #5fb488;
--pst-color-success-bg: #002f17;
--pst-color-attention: var(--pst-color-warning);
--pst-color-attention-bg: var(--pst-color-warning-bg);
--pst-color-danger: #e78894;
--pst-color-danger-bg: #4e111b;
--pst-color-text-base: #ced6dd;
--pst-color-text-muted: #9ca4af;
--pst-color-heading-color: #14181e;
--pst-color-shadow: rgba(0, 0, 0, 0.2);
--pst-color-border: #48566b;
--pst-color-border-muted: #29313d;
--pst-color-inline-code: #f3c7ee;
--pst-color-inline-code-links: #4FB2AD;
--pst-color-target: #675c04;
--pst-color-background: #14181e;
--pst-color-on-background: #222832;
--pst-color-surface: #29313d;
--pst-color-on-surface: #f3f4f5;
/* Adjust images in dark mode (unless they have class .only-dark or
* .dark-light, in which case assume they're already optimized for dark
* mode).
*/
/* Give images a light background in dark mode in case they have
* transparency and black text (unless they have class .only-dark or .dark-light, in
* which case assume they're already optimized for dark mode).
*/
color-scheme: dark;
}
html[data-theme=dark] {
--pst-color-link: var(--pst-color-primary);
--pst-color-link-hover: var(--pst-color-secondary);
}
html[data-theme=dark] .only-light,
html[data-theme=dark] .only-light ~ figcaption {
display: none !important;
}
html[data-theme=dark] img:not(.only-dark):not(.dark-light) {
filter: brightness(0.8) contrast(1.2);
}
html[data-theme=dark] .bd-content img:not(.only-dark):not(.dark-light) {
background: rgb(255, 255, 255);
border-radius: 0.25rem;
}
html[data-theme=dark] .MathJax_SVG * {
fill: var(--pst-color-text-base);
}
.pst-color-primary {
color: var(--pst-color-primary);
}
.pst-color-secondary {
color: var(--pst-color-secondary);
}
.pst-color-accent {
color: var(--pst-color-accent);
}
.pst-color-info {
color: var(--pst-color-info);
}
.pst-color-warning {
color: var(--pst-color-warning);
}
.pst-color-success {
color: var(--pst-color-success);
}
.pst-color-attention {
color: var(--pst-color-attention);
}
.pst-color-danger {
color: var(--pst-color-danger);
}
.pst-color-text-base {
color: var(--pst-color-text-base);
}
.pst-color-text-muted {
color: var(--pst-color-text-muted);
}
.pst-color-heading-color {
color: var(--pst-color-heading-color);
}
.pst-color-shadow {
color: var(--pst-color-shadow);
}
.pst-color-border {
color: var(--pst-color-border);
}
.pst-color-border-muted {
color: var(--pst-color-border-muted);
}
.pst-color-inline-code {
color: var(--pst-color-inline-code);
}
.pst-color-inline-code-links {
color: var(--pst-color-inline-code-links);
}
.pst-color-target {
color: var(--pst-color-target);
}
.pst-color-background {
color: var(--pst-color-background);
}
.pst-color-on-background {
color: var(--pst-color-on-background);
}
.pst-color-surface {
color: var(--pst-color-surface);
}
.pst-color-on-surface {
color: var(--pst-color-on-surface);
}
/* Adjust the height of the navbar */
.bd-header .bd-header__inner{
height: 52px; /* Adjust this value as needed */
}
.navbar-nav > li > a {
line-height: 52px; /* Vertically center the navbar links */
}
/* Make sure the navbar items align properly */
.navbar-nav {
display: flex;
}
.bd-header .navbar-header-items__start{
margin-left: 0rem;
}
.bd-header button.primary-toggle {
margin-right: 0rem;
}
.bd-header ul.navbar-nav .dropdown .dropdown-menu {
overflow-y: auto; /* Enable vertical scrolling */
max-height: 80vh;
}
.bd-sidebar-primary {
width: 22%; /* Adjust this value to your preference */
line-height: 1.4;
}
.bd-sidebar-secondary {
line-height: 1.4;
}
.toc-entry a.nav-link, .toc-entry a>code {
background-color: transparent;
border-color: transparent;
}
.bd-sidebar-primary code{
background-color: transparent;
border-color: transparent;
}
.toctree-wrapper li[class^=toctree-l1]>a {
font-size: 1.3em;
}
.toctree-wrapper li[class^=toctree-l1] {
margin-bottom: 2em;
}
.toctree-wrapper li[class^=toctree-l]>ul {
margin-top: 0.5em;
font-size: 0.9em;
}
*, :after, :before {
font-style: normal;
}
div.deprecated {
margin-top: 0.5em;
margin-bottom: 2em;
}
.admonition-beta.admonition, div.admonition-beta.admonition {
border-color: var(--pst-color-warning);
margin-top: 0.5em;
margin-bottom: 2em;
}
.admonition-beta>.admonition-title, div.admonition-beta>.admonition-title {
background-color: var(--pst-color-warning-bg);
}
dl[class]:not(.option-list):not(.field-list):not(.footnote):not(.glossary):not(.simple) dd {
margin-left: 1rem;
}
p {
font-size: 0.9rem;
margin-bottom: 0.5rem;
}

Binary file not shown (new image, 777 B).

View File

@@ -0,0 +1,11 @@
<svg width="72" height="19" viewBox="0 0 72 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<g clip-path="url(#clip0_4019_2020)">
<path d="M29.4038 5.84477C30.1256 6.56657 30.1256 7.74117 29.4038 8.46296L27.7869 10.0538L27.7704 9.96259C27.6524 9.30879 27.3415 8.71552 26.8723 8.24627C26.5189 7.8936 26.1012 7.63282 25.6305 7.47143C25.3383 7.76508 25.1777 8.14989 25.1777 8.55487C25.1777 8.63706 25.1851 8.72224 25.2001 8.80742C25.4593 8.90082 25.6887 9.04503 25.8815 9.23781C26.6033 9.9596 26.6033 11.1342 25.8815 11.856L24.4738 13.2637C24.1129 13.6246 23.6392 13.8047 23.1647 13.8047C22.6902 13.8047 22.2165 13.6246 21.8556 13.2637C21.1338 12.5419 21.1338 11.3673 21.8556 10.6455L23.4725 9.05549L23.489 9.14665C23.6063 9.79896 23.9171 10.3922 24.3879 10.8622C24.742 11.2164 25.1343 11.4518 25.6043 11.6124L25.691 11.5257C25.954 11.2627 26.0982 10.913 26.0982 10.5402C26.0982 10.4572 26.0907 10.3743 26.0765 10.2929C25.8053 10.2032 25.5819 10.0754 25.3786 9.87218C25.0857 9.57928 24.9034 9.20493 24.8526 8.79024C24.8489 8.76035 24.8466 8.73121 24.8437 8.70132C24.8033 8.16109 24.9983 7.63357 25.3786 7.25399L26.7864 5.84627C27.1353 5.49733 27.6001 5.30455 28.0955 5.30455C28.5909 5.30455 29.0556 5.49658 29.4046 5.84627L29.4038 5.84477ZM36.7548 9.56583C36.7548 14.7163 32.5645 18.9058 27.4148 18.9058H9.34C4.1903 18.9058 0 14.7163 0 9.56583C0 4.41538 4.1903 0.22583 9.34 0.22583H27.4148C32.5652 0.22583 36.7548 4.41613 36.7548 9.56583ZM18 14.25C18.1472 14.0714 17.4673 13.5686 17.3283 13.384C17.0459 13.0777 17.0444 12.6368 16.8538 12.2789C16.3876 11.1985 15.8518 10.1262 15.1024 9.21166C14.3104 8.21116 13.333 7.38326 12.4745 6.44403C11.8371 5.78873 11.6668 4.85548 11.1041 4.15087C10.3285 3.00541 7.87624 2.69308 7.51683 4.31077C7.51833 4.36158 7.50264 4.39371 7.45855 4.42584C7.2598 4.57005 7.08271 4.73518 6.93402 4.93468C6.57013 5.44129 6.51409 6.30057 6.96839 6.75561C6.98333 6.51576 6.99155 6.28936 7.18134 6.1175C7.53252 6.41862 8.06304 6.52547 8.47026 6.30057C9.36989 7.585 9.14573 9.36184 9.86005 10.7457C10.0573 11.0729 10.2561 11.4069 10.5094 11.6939C10.7148 12.0137 11.4247 12.391 11.4665 12.6869C11.474 13.195 11.4142 13.7502 11.7475 14.1753C11.9044 14.4936 11.5188 14.8134 11.208 14.7738C10.8045 14.8291 10.3121 14.5026 9.95868 14.7036C9.8339 14.8388 9.58957 14.6894 9.48197 14.8769C9.44461 14.9741 9.24286 15.1108 9.36316 15.2042C9.49691 15.1026 9.62095 14.9965 9.80102 15.057C9.77412 15.2035 9.88994 15.2244 9.98184 15.267C9.97886 15.3663 9.92057 15.468 9.99679 15.5524C10.0857 15.4627 10.1388 15.3357 10.28 15.2983C10.7492 15.9238 11.2267 14.6655 12.2421 15.2318C12.0359 15.2214 11.8528 15.2475 11.7139 15.4172C11.6795 15.4553 11.6503 15.5001 11.7109 15.5494C12.2586 15.196 12.2556 15.6705 12.6112 15.5248C12.8847 15.382 13.1567 15.2035 13.4817 15.2543C13.1657 15.3454 13.153 15.5995 12.9677 15.8139C12.9363 15.8468 12.9213 15.8842 12.9579 15.9387C13.614 15.8834 13.6678 15.6652 14.1975 15.3977C14.5928 15.1564 14.9866 15.7414 15.3288 15.4082C15.4043 15.3357 15.5074 15.3604 15.6008 15.3507C15.4812 14.7133 14.1669 15.4672 14.1878 14.6124C14.6107 14.3247 14.5136 13.7741 14.542 13.3295C15.0284 13.5992 15.5694 13.7561 16.0461 14.0139C16.2867 14.4025 16.6641 14.9158 17.1669 14.8822C17.1804 14.8433 17.1923 14.8089 17.2065 14.7693C17.359 14.7955 17.5547 14.8964 17.6384 14.7036C17.8663 14.9419 18.201 14.93 18.4992 14.8687C18.7196 14.6894 18.0845 14.4338 17.9993 14.2493L18 14.25ZM31.3458 7.15387C31.3458 6.28413 31.0081 5.46744 30.3946 4.85399C29.7812 4.24054 28.9645 3.9028 28.094 3.9028C27.2235 3.9028 26.4068 4.24054 25.7933 4.85399L24.3856 6.26171C24.0569 6.59048 23.8073 6.97678 23.6436 7.40941L23.6339 7.43407L23.6085 7.44154C23.0974 7.5992 22.6469 7.86969 
22.2696 8.24702L20.8618 9.65475C19.5938 10.9235 19.5938 12.9873 20.8618 14.2553C21.4753 14.8687 22.292 15.2064 23.1617 15.2064C24.0314 15.2064 24.8489 14.8687 25.4623 14.2553L26.8701 12.8475C27.1973 12.5203 27.4454 12.1355 27.609 11.7036L27.6188 11.6789L27.6442 11.6707C28.1463 11.5168 28.6095 11.2373 28.9854 10.8622L30.3931 9.4545C31.0066 8.84105 31.3443 8.02436 31.3443 7.15387H31.3458ZM12.8802 13.1972C12.7592 13.6695 12.7196 14.4742 12.1054 14.4974C12.0546 14.7701 12.2944 14.8724 12.5119 14.785C12.7278 14.6856 12.8302 14.8635 12.9026 15.0406C13.2359 15.0891 13.7291 14.9292 13.7477 14.5347C13.2501 14.2478 13.0962 13.7023 12.8795 13.1972H12.8802Z" fill="#F4F3FF"/>
<path d="M43.5142 15.2258L47.1462 3.70583H49.9702L53.6022 15.2258H51.6182L48.3222 4.88983H48.7542L45.4982 15.2258H43.5142ZM45.5382 12.7298V10.9298H51.5862V12.7298H45.5382ZM55.0486 15.2258V3.70583H59.8086C59.9206 3.70583 60.0646 3.71116 60.2406 3.72183C60.4166 3.72716 60.5792 3.74316 60.7286 3.76983C61.3952 3.87116 61.9446 4.0925 62.3766 4.43383C62.8139 4.77516 63.1366 5.20716 63.3446 5.72983C63.5579 6.24716 63.6646 6.82316 63.6646 7.45783C63.6646 8.08716 63.5579 8.66316 63.3446 9.18583C63.1312 9.70316 62.8059 10.1325 62.3686 10.4738C61.9366 10.8152 61.3899 11.0365 60.7286 11.1378C60.5792 11.1592 60.4139 11.1752 60.2326 11.1858C60.0566 11.1965 59.9152 11.2018 59.8086 11.2018H56.9766V15.2258H55.0486ZM56.9766 9.40183H59.7286C59.8352 9.40183 59.9552 9.3965 60.0886 9.38583C60.2219 9.37516 60.3446 9.35383 60.4566 9.32183C60.7766 9.24183 61.0272 9.1005 61.2086 8.89783C61.3952 8.69516 61.5259 8.46583 61.6006 8.20983C61.6806 7.95383 61.7206 7.70316 61.7206 7.45783C61.7206 7.2125 61.6806 6.96183 61.6006 6.70583C61.5259 6.4445 61.3952 6.2125 61.2086 6.00983C61.0272 5.80716 60.7766 5.66583 60.4566 5.58583C60.3446 5.55383 60.2219 5.53516 60.0886 5.52983C59.9552 5.51916 59.8352 5.51383 59.7286 5.51383H56.9766V9.40183ZM65.4273 15.2258V3.70583H67.3553V15.2258H65.4273Z" fill="#F4F3FF"/>
</g>
<defs>
<clipPath id="clip0_4019_2020">
<rect width="71.0711" height="18.68" fill="white" transform="translate(0 0.22583)"/>
</clipPath>
</defs>
</svg>

New image (5.7 KiB).

View File

@@ -0,0 +1,11 @@
<svg width="72" height="20" viewBox="0 0 72 20" fill="none" xmlns="http://www.w3.org/2000/svg">
<g clip-path="url(#clip0_4019_689)">
<path d="M29.4038 5.97905C30.1256 6.70085 30.1256 7.87545 29.4038 8.59724L27.7869 10.188L27.7704 10.0969C27.6524 9.44307 27.3415 8.84979 26.8723 8.38055C26.5189 8.02787 26.1012 7.7671 25.6305 7.60571C25.3383 7.89936 25.1777 8.28416 25.1777 8.68915C25.1777 8.77134 25.1851 8.85652 25.2001 8.9417C25.4593 9.0351 25.6887 9.17931 25.8815 9.37209C26.6033 10.0939 26.6033 11.2685 25.8815 11.9903L24.4738 13.398C24.1129 13.7589 23.6392 13.939 23.1647 13.939C22.6902 13.939 22.2165 13.7589 21.8556 13.398C21.1338 12.6762 21.1338 11.5016 21.8556 10.7798L23.4725 9.18977L23.489 9.28093C23.6063 9.93323 23.9171 10.5265 24.3879 10.9965C24.742 11.3507 25.1343 11.586 25.6043 11.7467L25.691 11.66C25.954 11.397 26.0982 11.0473 26.0982 10.6745C26.0982 10.5915 26.0907 10.5086 26.0765 10.4271C25.8053 10.3375 25.5819 10.2097 25.3786 10.0065C25.0857 9.71356 24.9034 9.33921 24.8526 8.92451C24.8489 8.89463 24.8466 8.86549 24.8437 8.8356C24.8033 8.29537 24.9983 7.76785 25.3786 7.38827L26.7864 5.98055C27.1353 5.6316 27.6001 5.43883 28.0955 5.43883C28.5909 5.43883 29.0556 5.63086 29.4046 5.98055L29.4038 5.97905ZM36.7548 9.70011C36.7548 14.8506 32.5645 19.0401 27.4148 19.0401H9.34C4.1903 19.0401 0 14.8506 0 9.70011C0 4.54966 4.1903 0.360107 9.34 0.360107H27.4148C32.5652 0.360107 36.7548 4.55041 36.7548 9.70011ZM18 14.3843C18.1472 14.2057 17.4673 13.7029 17.3283 13.5183C17.0459 13.2119 17.0444 12.7711 16.8538 12.4132C16.3876 11.3327 15.8518 10.2605 15.1024 9.34594C14.3104 8.34543 13.333 7.51754 12.4745 6.57831C11.8371 5.92301 11.6668 4.98976 11.1041 4.28515C10.3285 3.13969 7.87624 2.82736 7.51683 4.44505C7.51833 4.49586 7.50264 4.52799 7.45855 4.56012C7.2598 4.70433 7.08271 4.86946 6.93402 5.06896C6.57013 5.57556 6.51409 6.43484 6.96839 6.88989C6.98333 6.65004 6.99155 6.42364 7.18134 6.25178C7.53252 6.5529 8.06304 6.65975 8.47026 6.43484C9.36989 7.71928 9.14573 9.49612 9.86005 10.8799C10.0573 11.2072 10.2561 11.5412 10.5094 11.8281C10.7148 12.1479 11.4247 12.5253 11.4665 12.8212C11.474 13.3293 11.4142 13.8844 11.7475 14.3096C11.9044 14.6279 11.5188 14.9477 11.208 14.9081C10.8045 14.9634 10.3121 14.6369 9.95868 14.8379C9.8339 14.9731 9.58957 14.8237 9.48197 15.0112C9.44461 15.1083 9.24286 15.2451 9.36316 15.3385C9.49691 15.2369 9.62095 15.1308 9.80102 15.1913C9.77412 15.3377 9.88994 15.3587 9.98184 15.4012C9.97886 15.5006 9.92057 15.6022 9.99679 15.6867C10.0857 15.597 10.1388 15.47 10.28 15.4326C10.7492 16.058 11.2267 14.7997 12.2421 15.3661C12.0359 15.3557 11.8528 15.3818 11.7139 15.5514C11.6795 15.5895 11.6503 15.6344 11.7109 15.6837C12.2586 15.3303 12.2556 15.8047 12.6112 15.659C12.8847 15.5163 13.1567 15.3377 13.4817 15.3885C13.1657 15.4797 13.153 15.7337 12.9677 15.9482C12.9363 15.9811 12.9213 16.0184 12.9579 16.073C13.614 16.0177 13.6678 15.7995 14.1975 15.532C14.5928 15.2907 14.9866 15.8757 15.3288 15.5425C15.4043 15.47 15.5074 15.4946 15.6008 15.4849C15.4812 14.8476 14.1669 15.6015 14.1878 14.7467C14.6107 14.459 14.5136 13.9083 14.542 13.4638C15.0284 13.7335 15.5694 13.8904 16.0461 14.1482C16.2867 14.5367 16.6641 15.0501 17.1669 15.0164C17.1804 14.9776 17.1923 14.9432 17.2065 14.9036C17.359 14.9298 17.5547 15.0306 17.6384 14.8379C17.8663 15.0762 18.201 15.0643 18.4992 15.003C18.7196 14.8237 18.0845 14.5681 17.9993 14.3836L18 14.3843ZM31.3458 7.28815C31.3458 6.41841 31.0081 5.60172 30.3946 4.98826C29.7812 4.37481 28.9645 4.03708 28.094 4.03708C27.2235 4.03708 26.4068 4.37481 25.7933 4.98826L24.3856 6.39599C24.0569 6.72476 23.8073 7.11106 23.6436 7.54369L23.6339 7.56835L23.6085 7.57582C23.0974 7.73348 22.6469 8.00396 
22.2696 8.3813L20.8618 9.78902C19.5938 11.0578 19.5938 13.1215 20.8618 14.3895C21.4753 15.003 22.292 15.3407 23.1617 15.3407C24.0314 15.3407 24.8489 15.003 25.4623 14.3895L26.8701 12.9818C27.1973 12.6545 27.4454 12.2697 27.609 11.8378L27.6188 11.8132L27.6442 11.805C28.1463 11.651 28.6095 11.3716 28.9854 10.9965L30.3931 9.58878C31.0066 8.97532 31.3443 8.15863 31.3443 7.28815H31.3458ZM12.8802 13.3315C12.7592 13.8037 12.7196 14.6085 12.1054 14.6316C12.0546 14.9044 12.2944 15.0067 12.5119 14.9193C12.7278 14.8199 12.8302 14.9978 12.9026 15.1748C13.2359 15.2234 13.7291 15.0635 13.7477 14.669C13.2501 14.3821 13.0962 13.8366 12.8795 13.3315H12.8802Z" fill="#246161"/>
<path d="M43.5142 15.3601L47.1462 3.84011H49.9702L53.6022 15.3601H51.6182L48.3222 5.02411H48.7542L45.4982 15.3601H43.5142ZM45.5382 12.8641V11.0641H51.5862V12.8641H45.5382ZM55.0486 15.3601V3.84011H59.8086C59.9206 3.84011 60.0646 3.84544 60.2406 3.85611C60.4166 3.86144 60.5792 3.87744 60.7286 3.90411C61.3952 4.00544 61.9446 4.22677 62.3766 4.56811C62.8139 4.90944 63.1366 5.34144 63.3446 5.86411C63.5579 6.38144 63.6646 6.95744 63.6646 7.59211C63.6646 8.22144 63.5579 8.79744 63.3446 9.32011C63.1312 9.83744 62.8059 10.2668 62.3686 10.6081C61.9366 10.9494 61.3899 11.1708 60.7286 11.2721C60.5792 11.2934 60.4139 11.3094 60.2326 11.3201C60.0566 11.3308 59.9152 11.3361 59.8086 11.3361H56.9766V15.3601H55.0486ZM56.9766 9.53611H59.7286C59.8352 9.53611 59.9552 9.53077 60.0886 9.52011C60.2219 9.50944 60.3446 9.48811 60.4566 9.45611C60.7766 9.37611 61.0272 9.23477 61.2086 9.03211C61.3952 8.82944 61.5259 8.60011 61.6006 8.34411C61.6806 8.08811 61.7206 7.83744 61.7206 7.59211C61.7206 7.34677 61.6806 7.09611 61.6006 6.84011C61.5259 6.57877 61.3952 6.34677 61.2086 6.14411C61.0272 5.94144 60.7766 5.80011 60.4566 5.72011C60.3446 5.68811 60.2219 5.66944 60.0886 5.66411C59.9552 5.65344 59.8352 5.64811 59.7286 5.64811H56.9766V9.53611ZM65.4273 15.3601V3.84011H67.3553V15.3601H65.4273Z" fill="#246161"/>
</g>
<defs>
<clipPath id="clip0_4019_689">
<rect width="71.0711" height="18.68" fill="white" transform="translate(0 0.360107)"/>
</clipPath>
</defs>
</svg>

New image (5.7 KiB).

View File

@@ -15,6 +15,8 @@ from pathlib import Path
import toml
from docutils import nodes
from docutils.parsers.rst.directives.admonitions import BaseAdmonition
from docutils.statemachine import StringList
from sphinx.util.docutils import SphinxDirective
# If extensions (or modules to document with autodoc) are in another directory,
@@ -60,26 +62,41 @@ class ExampleLinksDirective(SphinxDirective):
item_node.append(para_node)
list_node.append(item_node)
if list_node.children:
title_node = nodes.title()
title_node = nodes.rubric()
title_node.append(nodes.Text(f"Examples using {class_or_func_name}"))
return [title_node, list_node]
return [list_node]
class Beta(BaseAdmonition):
required_arguments = 0
node_class = nodes.admonition
def run(self):
self.content = self.content or StringList(
[
(
"This feature is in beta. It is actively being worked on, so the "
"API may change."
)
]
)
self.arguments = self.arguments or ["Beta"]
return super().run()
def setup(app):
app.add_directive("example_links", ExampleLinksDirective)
app.add_directive("beta", Beta)
# -- Project information -----------------------------------------------------
project = "🦜🔗 LangChain"
copyright = "2023, LangChain, Inc."
author = "LangChain, Inc."
copyright = "2023, LangChain Inc"
author = "LangChain, Inc"
version = data["tool"]["poetry"]["version"]
release = version
html_title = project + " " + version
html_favicon = "_static/img/brand/favicon.png"
html_last_updated_fmt = "%b %d, %Y"
@@ -95,11 +112,13 @@ extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"sphinx_copybutton",
"sphinx_panels",
"IPython.sphinxext.ipython_console_highlighting",
"myst_parser",
"_extensions.gallery_directive",
"sphinx_design",
"sphinx_copybutton",
]
source_suffix = [".rst"]
source_suffix = [".rst", ".md"]
# some autodoc pydantic options are repeated in the actual template.
# potentially user error, but there may be bugs in the sphinx extension
@@ -131,23 +150,84 @@ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "scikit-learn-modern"
html_theme_path = ["themes"]
# The theme to use for HTML and HTML Help pages.
html_theme = "pydata_sphinx_theme"
# redirects dictionary maps from old links to new links
html_additional_pages = {}
redirects = {
"index": "langchain_api_reference",
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
html_theme_options = {
# # -- General configuration ------------------------------------------------
"sidebar_includehidden": True,
"use_edit_page_button": False,
# # "analytics": {
# # "plausible_analytics_domain": "scikit-learn.org",
# # "plausible_analytics_url": "https://views.scientific-python.org/js/script.js",
# # },
# # If "prev-next" is included in article_footer_items, then setting show_prev_next
# # to True would repeat prev and next links. See
# # https://github.com/pydata/pydata-sphinx-theme/blob/b731dc230bc26a3d1d1bb039c56c977a9b3d25d8/src/pydata_sphinx_theme/theme/pydata_sphinx_theme/layout.html#L118-L129
"show_prev_next": False,
"search_bar_text": "Search",
"navigation_with_keys": True,
"collapse_navigation": True,
"navigation_depth": 3,
"show_nav_level": 1,
"show_toc_level": 3,
"navbar_align": "left",
"header_links_before_dropdown": 5,
"header_dropdown_text": "Integrations",
"logo": {
"image_light": "_static/wordmark-api.svg",
"image_dark": "_static/wordmark-api-dark.svg",
},
"surface_warnings": True,
# # -- Template placement in theme layouts ----------------------------------
"navbar_start": ["navbar-logo"],
# # Note that the alignment of navbar_center is controlled by navbar_align
"navbar_center": ["navbar-nav"],
"navbar_end": ["langchain_docs", "theme-switcher", "navbar-icon-links"],
# # navbar_persistent is persistent right (even when on mobiles)
"navbar_persistent": ["search-field"],
"article_header_start": ["breadcrumbs"],
"article_header_end": [],
"article_footer_items": [],
"content_footer_items": [],
# # Use html_sidebars that map page patterns to list of sidebar templates
# "primary_sidebar_end": [],
"footer_start": ["copyright"],
"footer_center": [],
"footer_end": [],
# # When specified as a dictionary, the keys should follow glob-style patterns, as in
# # https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-exclude_patterns
# # In particular, "**" specifies the default for all pages
# # Use :html_theme.sidebar_secondary.remove: for file-wide removal
# "secondary_sidebar_items": {"**": ["page-toc", "sourcelink"]},
# "show_version_warning_banner": True,
# "announcement": None,
"icon_links": [
{
# Label for this link
"name": "GitHub",
# URL where the link will redirect
"url": "https://github.com/langchain-ai/langchain", # required
# Icon class (if "type": "fontawesome"), or path to local image (if "type": "local")
"icon": "fa-brands fa-square-github",
# The type of image to be used (see below for details)
"type": "fontawesome",
},
{
"name": "X / Twitter",
"url": "https://twitter.com/langchainai",
"icon": "fab fa-twitter-square",
},
],
"icon_links_label": "Quick Links",
"external_links": [
{"name": "Legacy reference", "url": "https://api.python.langchain.com/"},
],
}
for old_link in redirects:
html_additional_pages[old_link] = "redirects.html"
partners_dir = Path(__file__).parent.parent.parent / "libs/partners"
partners = [
(p.name, p.name.replace("-", "_") + "_api_reference")
for p in partners_dir.iterdir()
]
partners = sorted(partners)
html_context = {
"display_github": True, # Integrate GitHub
@@ -155,8 +235,6 @@ html_context = {
"github_repo": "langchain", # Repo name
"github_version": "master", # Version
"conf_py_path": "/docs/api_reference", # Path in the checkout to the docs root
"redirects": redirects,
"partners": partners,
}
# Add any paths that contain custom static files (such as style sheets) here,
@@ -166,9 +244,7 @@ html_static_path = ["_static"]
# These paths are either relative to html_static_path
# or fully qualified paths (e.g. https://...)
html_css_files = [
"css/custom.css",
]
html_css_files = ["css/custom.css"]
html_use_index = False
myst_enable_extensions = ["colon_fence"]
@@ -178,3 +254,12 @@ autosummary_generate = True
html_copy_source = False
html_show_sourcelink = False
# Set canonical URL from the Read the Docs Domain
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "")
# Tell Jinja2 templates the build is running on Read the Docs
if os.environ.get("READTHEDOCS", "") == "True":
html_context["READTHEDOCS"] = True
master_doc = "index"

View File

@@ -38,6 +38,8 @@ class ClassInfo(TypedDict):
"""The kind of the class."""
is_public: bool
"""Whether the class is public or not."""
is_deprecated: bool
"""Whether the class is deprecated."""
class FunctionInfo(TypedDict):
@@ -49,6 +51,8 @@ class FunctionInfo(TypedDict):
"""The fully qualified name of the function."""
is_public: bool
"""Whether the function is public or not."""
is_deprecated: bool
"""Whether the function is deprecated."""
class ModuleMembers(TypedDict):
@@ -78,7 +82,7 @@ def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
continue
if inspect.isclass(type_):
# The clasification of the class is used to select a template
# The type of the class is used to select a template
# for the object when rendering the documentation.
# See `templates` directory for defined templates.
# This is a hacky solution to distinguish between different
@@ -121,6 +125,7 @@ def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
qualified_name=f"{namespace}.{name}",
kind=kind,
is_public=not name.startswith("_"),
is_deprecated=".. deprecated::" in (type_.__doc__ or ""),
)
)
elif inspect.isfunction(type_):
@@ -129,6 +134,7 @@ def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
name=name,
qualified_name=f"{namespace}.{name}",
is_public=not name.startswith("_"),
is_deprecated=".. deprecated::" in (type_.__doc__ or ""),
)
)
else:
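A note on the new `is_deprecated` flag threaded through above: it is derived purely from the object's docstring. A self-contained sketch with hypothetical names:
```python
# Sketch: an object is flagged deprecated when its docstring contains a
# Sphinx ``.. deprecated::`` marker (function name is hypothetical).
def old_helper():
    """Do a thing.

    .. deprecated:: 0.2.0
       Use ``new_helper`` instead.
    """

assert ".. deprecated::" in (old_helper.__doc__ or "")
```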
@@ -233,7 +239,7 @@ def _construct_doc(
package_namespace: str,
members_by_namespace: Dict[str, ModuleMembers],
package_version: str,
) -> str:
) -> List[typing.Tuple[str, str]]:
"""Construct the contents of the reference.rst file for the given package.
Args:
@@ -245,23 +251,62 @@ def _construct_doc(
Returns:
The contents of the reference.rst file.
"""
full_doc = f"""\
=======================
``{package_namespace}`` {package_version}
=======================
docs = []
index_doc = f"""\
:html_theme.sidebar_secondary.remove:
.. currentmodule:: {package_namespace}
.. _{package_namespace}:
======================================
{package_namespace.replace('_', '-')}: {package_version}
======================================
.. automodule:: {package_namespace}
:no-members:
:no-inherited-members:
.. toctree::
:hidden:
:maxdepth: 2
"""
index_autosummary = """
"""
namespaces = sorted(members_by_namespace)
for module in namespaces:
index_doc += f" {module}\n"
module_doc = f"""\
.. currentmodule:: {package_namespace}
.. _{package_namespace}_{module}:
"""
_members = members_by_namespace[module]
classes = [el for el in _members["classes_"] if el["is_public"]]
functions = [el for el in _members["functions"] if el["is_public"]]
classes = [
el
for el in _members["classes_"]
if el["is_public"] and not el["is_deprecated"]
]
functions = [
el
for el in _members["functions"]
if el["is_public"] and not el["is_deprecated"]
]
deprecated_classes = [
el for el in _members["classes_"] if el["is_public"] and el["is_deprecated"]
]
deprecated_functions = [
el
for el in _members["functions"]
if el["is_public"] and el["is_deprecated"]
]
if not (classes or functions):
continue
section = f":mod:`{package_namespace}.{module}`"
section = f":mod:`{module}`"
underline = "=" * (len(section) + 1)
full_doc += f"""\
module_doc += f"""
{section}
{underline}
@@ -269,16 +314,26 @@ def _construct_doc(
:no-members:
:no-inherited-members:
"""
index_autosummary += f"""
:ref:`{package_namespace}_{module}`
{'^' * (len(package_namespace) + len(module) + 8)}
"""
if classes:
full_doc += f"""\
Classes
--------------
module_doc += f"""\
**Classes**
.. currentmodule:: {package_namespace}
.. autosummary::
:toctree: {module}
"""
index_autosummary += """
**Classes**
.. autosummary::
"""
for class_ in sorted(classes, key=lambda c: c["qualified_name"]):
@@ -295,19 +350,22 @@ Classes
else:
template = "class.rst"
full_doc += f"""\
module_doc += f"""\
:template: {template}
{class_["qualified_name"]}
"""
index_autosummary += f"""
{class_['qualified_name']}
"""
if functions:
_functions = [f["qualified_name"] for f in functions]
fstring = "\n ".join(sorted(_functions))
full_doc += f"""\
Functions
--------------
module_doc += f"""\
**Functions**
.. currentmodule:: {package_namespace}
.. autosummary::
@@ -317,7 +375,80 @@ Functions
{fstring}
"""
return full_doc
index_autosummary += f"""
**Functions**
.. autosummary::
{fstring}
"""
if deprecated_classes:
module_doc += f"""\
**Deprecated classes**
.. currentmodule:: {package_namespace}
.. autosummary::
:toctree: {module}
"""
index_autosummary += """
**Deprecated classes**
.. autosummary::
"""
for class_ in sorted(deprecated_classes, key=lambda c: c["qualified_name"]):
if class_["kind"] == "TypedDict":
template = "typeddict.rst"
elif class_["kind"] == "enum":
template = "enum.rst"
elif class_["kind"] == "Pydantic":
template = "pydantic.rst"
elif class_["kind"] == "RunnablePydantic":
template = "runnable_pydantic.rst"
elif class_["kind"] == "RunnableNonPydantic":
template = "runnable_non_pydantic.rst"
else:
template = "class.rst"
module_doc += f"""\
:template: {template}
{class_["qualified_name"]}
"""
index_autosummary += f"""
{class_['qualified_name']}
"""
if deprecated_functions:
_functions = [f["qualified_name"] for f in deprecated_functions]
fstring = "\n ".join(sorted(_functions))
module_doc += f"""\
**Deprecated functions**
.. currentmodule:: {package_namespace}
.. autosummary::
:toctree: {module}
:template: function.rst
{fstring}
"""
index_autosummary += f"""
**Deprecated functions**
.. autosummary::
{fstring}
"""
docs.append((f"{module}.rst", module_doc))
docs.append(("index.rst", index_doc + index_autosummary))
return docs
def _build_rst_file(package_name: str = "langchain") -> None:
@@ -329,13 +460,14 @@ def _build_rst_file(package_name: str = "langchain") -> None:
package_dir = _package_dir(package_name)
package_members = _load_package_modules(package_dir)
package_version = _get_package_version(package_dir)
with open(_out_file_path(package_name), "w") as f:
f.write(
_doc_first_line(package_name)
+ _construct_doc(
_package_namespace(package_name), package_members, package_version
)
)
output_dir = _out_file_path(package_name)
os.mkdir(output_dir)
rsts = _construct_doc(
_package_namespace(package_name), package_members, package_version
)
for name, rst in rsts:
with open(output_dir / name, "w") as f:
f.write(rst)
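Since `_construct_doc` now returns `(filename, rst)` pairs instead of a single string, each package gets a directory of per-module pages plus an `index.rst`. A minimal consumer sketch (paths and names are illustrative):
```python
# Sketch: persist the (filename, rst_text) pairs produced by
# _construct_doc into a fresh package directory (illustrative paths).
from pathlib import Path
from typing import List, Tuple

def write_package_docs(output_dir: Path, rsts: List[Tuple[str, str]]) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)
    for name, rst in rsts:  # e.g. ("runnables.rst", ...), ("index.rst", ...)
        (output_dir / name).write_text(rst)
```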
def _package_namespace(package_name: str) -> str:
@@ -385,12 +517,119 @@ def _get_package_version(package_dir: Path) -> str:
def _out_file_path(package_name: str) -> Path:
"""Return the path to the file containing the documentation."""
return HERE / f"{package_name.replace('-', '_')}_api_reference.rst"
return HERE / f"{package_name.replace('-', '_')}"
def _doc_first_line(package_name: str) -> str:
"""Return the path to the file containing the documentation."""
return f".. {package_name.replace('-', '_')}_api_reference:\n\n"
def _build_index(dirs: List[str]) -> None:
custom_names = {
"airbyte": "Airbyte",
"aws": "AWS",
"ai21": "AI21",
}
ordered = ["core", "langchain", "text-splitters", "community", "experimental"]
main_ = [dir_ for dir_ in ordered if dir_ in dirs]
integrations = sorted(dir_ for dir_ in dirs if dir_ not in main_)
doc = """# LangChain Python API Reference
Welcome to the LangChain Python API reference. This is a reference for all
`langchain-x` packages.
For user guides see [https://python.langchain.com](https://python.langchain.com).
For the legacy API reference hosted on ReadTheDocs see [https://api.python.langchain.com/](https://api.python.langchain.com/).
"""
if main_:
main_headers = [
" ".join(custom_names.get(x, x.title()) for x in dir_.split("-"))
for dir_ in main_
]
main_tree = "\n".join(
f"{header_name}<{dir_.replace('-', '_')}/index>"
for header_name, dir_ in zip(main_headers, main_)
)
main_grid = "\n".join(
f'- header: "**{header_name}**"\n content: "{_package_namespace(dir_).replace("_", "-")}: {_get_package_version(_package_dir(dir_))}"\n link: {dir_.replace("-", "_")}/index.html'
for header_name, dir_ in zip(main_headers, main_)
)
doc += f"""## Base packages
```{{gallery-grid}}
:grid-columns: "1 2 2 3"
{main_grid}
```
```{{toctree}}
:maxdepth: 2
:hidden:
:caption: Base packages
{main_tree}
```
"""
if integrations:
integration_headers = [
" ".join(
custom_names.get(x, x.title().replace("ai", "AI").replace("db", "DB"))
for x in dir_.split("-")
)
for dir_ in integrations
]
integration_tree = "\n".join(
f"{header_name}<{dir_.replace('-', '_')}/index>"
for header_name, dir_ in zip(integration_headers, integrations)
)
integration_grid = ""
integrations_to_show = [
"openai",
"anthropic",
"google-vertexai",
"aws",
"huggingface",
"mistralai",
]
for header_name, dir_ in sorted(
zip(integration_headers, integrations),
key=lambda h_d: integrations_to_show.index(h_d[1])
if h_d[1] in integrations_to_show
else len(integrations_to_show),
)[: len(integrations_to_show)]:
integration_grid += f'\n- header: "**{header_name}**"\n content: {_package_namespace(dir_).replace("_", "-")} {_get_package_version(_package_dir(dir_))}\n link: {dir_.replace("-", "_")}/index.html'
doc += f"""## Integrations
```{{gallery-grid}}
:grid-columns: "1 2 2 3"
{integration_grid}
```
See the full list of integrations in the Section Navigation.
```{{toctree}}
:maxdepth: 2
:hidden:
:caption: Integrations
{integration_tree}
```
"""
with open(HERE / "reference.md", "w") as f:
f.write(doc)
dummy_index = """\
# API reference
```{toctree}
:maxdepth: 3
:hidden:
Reference<reference>
```
"""
with open(HERE / "index.md", "w") as f:
f.write(dummy_index)
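The sort-and-slice that picks the featured integrations grid is easy to misread, so here is a toy reproduction (fake package list; the real code sorts `(header, dir)` pairs the same way):
```python
# Toy reproduction of the featured-integrations ordering: packages in
# integrations_to_show keep that order, everything else sorts after them,
# and the slice drops the non-featured remainder.
integrations_to_show = ["openai", "anthropic", "aws"]
dirs = ["anthropic", "chroma", "aws", "elasticsearch", "openai"]

featured = sorted(
    dirs,
    key=lambda d: integrations_to_show.index(d)
    if d in integrations_to_show
    else len(integrations_to_show),
)[: len(integrations_to_show)]

assert featured == ["openai", "anthropic", "aws"]
```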
def main(dirs: Optional[list] = None) -> None:
@@ -418,6 +657,8 @@ def main(dirs: Optional[list] = None) -> None:
else:
print("Building package:", dir_)
_build_rst_file(package_name=dir_)
_build_index(dirs)
print("API reference files built.")

View File

@@ -1,8 +0,0 @@
=============
LangChain API
=============
.. toctree::
:maxdepth: 2
api_reference.rst

View File

@@ -1,17 +1,11 @@
-e libs/experimental
-e libs/langchain
-e libs/core
-e libs/community
pydantic<2
autodoc_pydantic==1.8.0
myst_parser
nbsphinx==0.8.9
sphinx>=5
sphinx-autobuild==2021.3.14
sphinx_rtd_theme==1.0.0
sphinx-typlog-theme==0.8.0
sphinx-panels
toml
myst_nb
sphinx_copybutton
pydata-sphinx-theme==0.13.1
autodoc_pydantic>=1,<2
sphinx<=7
myst-parser>=3
sphinx-autobuild>=2024
pydata-sphinx-theme>=0.15
toml>=0.10.2
myst-nb>=1.1.1
pyyaml
sphinx-design
sphinx-copybutton
beautifulsoup4

View File

@@ -0,0 +1,41 @@
import sys
from glob import glob
from pathlib import Path
from bs4 import BeautifulSoup
CUR_DIR = Path(__file__).parents[1]
def process_toc_h3_elements(html_content: str) -> str:
"""Update Class.method() TOC headers to just method()."""
# Create a BeautifulSoup object
soup = BeautifulSoup(html_content, "html.parser")
# Find all <li> elements with class "toc-h3"
toc_h3_elements = soup.find_all("li", class_="toc-h3")
# Process each element
for element in toc_h3_elements:
element = element.a.code.span
# Get the text content of the element
content = element.get_text()
# Keep only the final dotted segment (the method name)
modified_content = content.split(".")[-1]
# Update the element's content
element.string = modified_content
# Return the modified HTML
return str(soup)
if __name__ == "__main__":
dir = sys.argv[1]
for fn in glob(str(f"{dir.rstrip('/')}/**/*.html"), recursive=True):
with open(fn, "r") as f:
html = f.read()
processed_html = process_toc_h3_elements(html)
with open(fn, "w") as f:
f.write(processed_html)
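For a concrete before/after of the TOC rewriter above (the HTML fragment is invented, and the snippet assumes `process_toc_h3_elements` from the script is in scope):
```python
# Invented fragment: a qualified "Class.method" TOC entry is shortened
# to just the method name by process_toc_h3_elements (defined above).
fragment = (
    '<ul><li class="toc-h3"><a href="#bind"><code><span>'
    "ChatModel.bind_tools</span></code></a></li></ul>"
)
print(process_toc_h3_elements(fragment))
# -> ...<code><span>bind_tools</span></code>...
```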

View File

@@ -1,4 +1,4 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. currentmodule:: {{ module }}
@@ -11,7 +11,7 @@
.. autosummary::
{% for item in attributes %}
~{{ name }}.{{ item }}
~{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}
@@ -22,11 +22,11 @@
.. autosummary::
{% for item in methods %}
~{{ name }}.{{ item }}
~{{ item }}
{%- endfor %}
{% for item in methods %}
.. automethod:: {{ name }}.{{ item }}
.. automethod:: {{ item }}
{%- endfor %}
{% endif %}

View File

@@ -1,4 +1,4 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. currentmodule:: {{ module }}

View File

@@ -1,4 +1,4 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. currentmodule:: {{ module }}

View File

@@ -0,0 +1,12 @@
<!-- This will display a link to LangChain docs -->
<head>
<style>
.text-link {
text-decoration: none; /* Remove underline */
color: inherit; /* Inherit color from parent element */
}
</style>
</head>
<body>
<a href="https://python.langchain.com/" class='text-link'>Docs</a>
</body>

View File

@@ -1,4 +1,4 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. currentmodule:: {{ module }}

View File

@@ -1,21 +1,21 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. NOTE:: {{objname}} implements the standard :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>`. 🏃
The :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>` has additional methods that are available on runnables, such as :py:meth:`with_types <langchain_core.runnables.base.Runnable.with_types>`, :py:meth:`with_retry <langchain_core.runnables.base.Runnable.with_retry>`, :py:meth:`assign <langchain_core.runnables.base.Runnable.assign>`, :py:meth:`bind <langchain_core.runnables.base.Runnable.bind>`, :py:meth:`get_graph <langchain_core.runnables.base.Runnable.get_graph>`, and more.
.. currentmodule:: {{ module }}
.. autoclass:: {{ objname }}
.. NOTE:: {{objname}} implements the standard :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>`. 🏃
The :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>` has additional methods that are available on runnables, such as :py:meth:`with_types <langchain_core.runnables.base.Runnable.with_types>`, :py:meth:`with_retry <langchain_core.runnables.base.Runnable.with_retry>`, :py:meth:`assign <langchain_core.runnables.base.Runnable.assign>`, :py:meth:`bind <langchain_core.runnables.base.Runnable.bind>`, :py:meth:`get_graph <langchain_core.runnables.base.Runnable.get_graph>`, and more.
{% block attributes %}
{% if attributes %}
.. rubric:: {{ _('Attributes') }}
.. autosummary::
{% for item in attributes %}
~{{ name }}.{{ item }}
~{{ item }}
{%- endfor %}
{% endif %}
{% endblock %}
@@ -26,11 +26,11 @@
.. autosummary::
{% for item in methods %}
~{{ name }}.{{ item }}
~{{ item }}
{%- endfor %}
{% for item in methods %}
.. automethod:: {{ name }}.{{ item }}
.. automethod:: {{ item }}
{%- endfor %}
{% endif %}

View File

@@ -1,10 +1,6 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. NOTE:: {{objname}} implements the standard :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>`. 🏃
The :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>` has additional methods that are available on runnables, such as :py:meth:`with_types <langchain_core.runnables.base.Runnable.with_types>`, :py:meth:`with_retry <langchain_core.runnables.base.Runnable.with_retry>`, :py:meth:`assign <langchain_core.runnables.base.Runnable.assign>`, :py:meth:`bind <langchain_core.runnables.base.Runnable.bind>`, :py:meth:`get_graph <langchain_core.runnables.base.Runnable.get_graph>`, and more.
.. currentmodule:: {{ module }}
.. autopydantic_model:: {{ objname }}
@@ -19,6 +15,10 @@
:member-order: groupwise
:show-inheritance: True
:special-members: __call__
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace, astream_log, transform, atransform, get_output_schema, get_prompts, config_schema, map, pick, pipe, with_listeners, with_alisteners, with_config, with_fallbacks, with_types, with_retry, InputType, OutputType, config_specs, output_schema, get_input_schema, get_graph, get_name, input_schema, name, bind, assign
:exclude-members: construct, copy, dict, from_orm, parse_file, parse_obj, parse_raw, schema, schema_json, update_forward_refs, validate, json, is_lc_serializable, to_json_not_implemented, lc_secrets, lc_attributes, lc_id, get_lc_namespace, astream_log, transform, atransform, get_output_schema, get_prompts, config_schema, map, pick, pipe, with_listeners, with_alisteners, with_config, with_fallbacks, with_types, with_retry, InputType, OutputType, config_specs, output_schema, get_input_schema, get_graph, get_name, input_schema, name, bind, assign, as_tool
.. NOTE:: {{objname}} implements the standard :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>`. 🏃
The :py:class:`Runnable Interface <langchain_core.runnables.base.Runnable>` has additional methods that are available on runnables, such as :py:meth:`with_types <langchain_core.runnables.base.Runnable.with_types>`, :py:meth:`with_retry <langchain_core.runnables.base.Runnable.with_retry>`, :py:meth:`assign <langchain_core.runnables.base.Runnable.assign>`, :py:meth:`bind <langchain_core.runnables.base.Runnable.bind>`, :py:meth:`get_graph <langchain_core.runnables.base.Runnable.get_graph>`, and more.
.. example_links:: {{ objname }}

View File

@@ -1,4 +1,4 @@
:mod:`{{module}}`.{{objname}}
{{ objname }}
{{ underline }}==============
.. currentmodule:: {{ module }}

View File

@@ -1,27 +0,0 @@
Copyright (c) 2007-2023 The scikit-learn developers.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

View File

@@ -1,67 +0,0 @@
<script>
$(document).ready(function() {
/* Add a [>>>] button on the top-right corner of code samples to hide
* the >>> and ... prompts and the output and thus make the code
* copyable. */
var div = $('.highlight-python .highlight,' +
'.highlight-python3 .highlight,' +
'.highlight-pycon .highlight,' +
'.highlight-default .highlight')
var pre = div.find('pre');
// get the styles from the current theme
pre.parent().parent().css('position', 'relative');
var hide_text = 'Hide prompts and outputs';
var show_text = 'Show prompts and outputs';
// create and add the button to all the code blocks that contain >>>
div.each(function(index) {
var jthis = $(this);
if (jthis.find('.gp').length > 0) {
var button = $('<span class="copybutton">&gt;&gt;&gt;</span>');
button.attr('title', hide_text);
button.data('hidden', 'false');
jthis.prepend(button);
}
// tracebacks (.gt) contain bare text elements that need to be
// wrapped in a span to work with .nextUntil() (see later)
jthis.find('pre:has(.gt)').contents().filter(function() {
return ((this.nodeType == 3) && (this.data.trim().length > 0));
}).wrap('<span>');
});
// define the behavior of the button when it's clicked
$('.copybutton').click(function(e){
e.preventDefault();
var button = $(this);
if (button.data('hidden') === 'false') {
// hide the code output
button.parent().find('.go, .gp, .gt').hide();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'hidden');
button.css('text-decoration', 'line-through');
button.attr('title', show_text);
button.data('hidden', 'true');
} else {
// show the code output
button.parent().find('.go, .gp, .gt').show();
button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'visible');
button.css('text-decoration', 'none');
button.attr('title', hide_text);
button.data('hidden', 'false');
}
});
/*** Add permalink buttons next to glossary terms ***/
$('dl.glossary > dt[id]').append(function() {
return ('<a class="headerlink" href="#' +
this.getAttribute('id') +
'" title="Permalink to this term">¶</a>');
});
});
</script>
{%- if pagename != 'index' and pagename != 'documentation' %}
{% if theme_mathjax_path %}
<script id="MathJax-script" async src="{{ theme_mathjax_path }}"></script>
{% endif %}
{%- endif %}

View File

@@ -1,132 +0,0 @@
{# TEMPLATE VAR SETTINGS #}
{%- set url_root = pathto('', 1) %}
{%- if url_root == '#' %}{% set url_root = '' %}{% endif %}
{%- if not embedded and docstitle %}
{%- set titlesuffix = " &mdash; "|safe + docstitle|e %}
{%- else %}
{%- set titlesuffix = "" %}
{%- endif %}
{%- set lang_attr = 'en' %}
<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="{{ lang_attr }}" > <![endif]-->
<!--[if gt IE 8]><!-->
<html class="no-js" lang="{{ lang_attr }}"> <!--<![endif]-->
<head>
<meta charset="utf-8">
{{ metatags }}
<meta name="viewport" content="width=device-width, initial-scale=1.0">
{% block htmltitle %}
<title>{{ title|striptags|e }}{{ titlesuffix }}</title>
{% endblock %}
<link rel="canonical"
href="https://api.python.langchain.com/en/latest/{{ pagename }}.html"/>
{% if favicon_url %}
<link rel="shortcut icon" href="{{ favicon_url|e }}"/>
{% endif %}
<link rel="stylesheet"
href="{{ pathto('_static/css/vendor/bootstrap.min.css', 1) }}"
type="text/css"/>
{%- for css in css_files %}
{%- if css|attr("rel") %}
<link rel="{{ css.rel }}" href="{{ pathto(css.filename, 1) }}"
type="text/css"{% if css.title is not none %}
title="{{ css.title }}"{% endif %} />
{%- else %}
<link rel="stylesheet" href="{{ pathto(css, 1) }}" type="text/css"/>
{%- endif %}
{%- endfor %}
<link rel="stylesheet" href="{{ pathto('_static/' + style, 1) }}" type="text/css"/>
<script id="documentation_options" data-url_root="{{ pathto('', 1) }}"
src="{{ pathto('_static/documentation_options.js', 1) }}"></script>
<script src="{{ pathto('_static/jquery.js', 1) }}"></script>
{%- block extrahead %} {% endblock %}
</head>
<body>
{% include "nav.html" %}
{%- block content %}
<div class="d-flex" id="sk-doc-wrapper">
<input type="checkbox" name="sk-toggle-checkbox" id="sk-toggle-checkbox">
<label id="sk-sidemenu-toggle" class="sk-btn-toggle-toc btn sk-btn-primary"
for="sk-toggle-checkbox">Toggle Menu</label>
<div id="sk-sidebar-wrapper" class="border-right">
<div class="sk-sidebar-toc-wrapper">
{%- if meta and meta['parenttoc']|tobool %}
<div class="sk-sidebar-toc">
{% set nav = get_nav_object(maxdepth=3, collapse=True, numbered=True) %}
<ul>
{% for main_nav_item in nav %}
{% if main_nav_item.active %}
<li>
<a href="{{ main_nav_item.url }}"
class="sk-toc-active">{{ main_nav_item.title }}</a>
</li>
<ul>
{% for nav_item in main_nav_item.children %}
<li>
<a href="{{ nav_item.url }}"
class="{% if nav_item.active %}sk-toc-active{% endif %}">{{ nav_item.title }}</a>
{% if nav_item.children %}
<ul>
{% for inner_child in nav_item.children %}
<li class="sk-toctree-l3">
<a href="{{ inner_child.url }}">{{ inner_child.title }}</a>
</li>
{% endfor %}
</ul>
{% endif %}
</li>
{% endfor %}
</ul>
{% endif %}
{% endfor %}
</ul>
</div>
{%- elif meta and meta['globalsidebartoc']|tobool %}
<div class="sk-sidebar-toc sk-sidebar-global-toc">
{{ toctree(maxdepth=2, titles_only=True) }}
</div>
{%- else %}
<div class="sk-sidebar-toc">
{{ toc }}
</div>
{%- endif %}
</div>
</div>
<div id="sk-page-content-wrapper">
<div class="sk-page-content container-fluid body px-md-3" role="main">
{% block body %}{% endblock %}
</div>
<div class="container">
<footer class="sk-content-footer">
{%- if pagename != 'index' %}
{%- if show_copyright %}
{%- if hasdoc('copyright') %}
{% trans path=pathto('copyright'), copyright=copyright|e %}
&copy; {{ copyright }}.{% endtrans %}
{%- else %}
{% trans copyright=copyright|e %}&copy; {{ copyright }}
.{% endtrans %}
{%- endif %}
{%- endif %}
{%- if last_updated %}
{% trans last_updated=last_updated|e %}Last updated
on {{ last_updated }}.{% endtrans %}
{%- endif %}
{%- if show_source and has_source and sourcename %}
<a href="{{ pathto('_sources/' + sourcename, true)|e }}"
rel="nofollow">{{ _('Show this page source') }}</a>
{%- endif %}
{%- endif %}
</footer>
</div>
</div>
</div>
{%- endblock %}
<script src="{{ pathto('_static/js/vendor/bootstrap.min.js', 1) }}"></script>
{% include "javascript.html" %}
</body>
</html>

View File

@@ -1,78 +0,0 @@
{%- if pagename != 'index' and pagename != 'documentation' %}
{%- set nav_bar_class = "sk-docs-navbar" %}
{%- set top_container_cls = "sk-docs-container" %}
{%- else %}
{%- set nav_bar_class = "sk-landing-navbar" %}
{%- set top_container_cls = "sk-landing-container" %}
{%- endif %}
<nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">
<div class="container-fluid {{ top_container_cls }} px-0">
{%- if logo_url %}
<a class="navbar-brand py-0" href="{{ pathto('index') }}">
<img
class="sk-brand-img"
src="{{ logo_url|e }}"
alt="logo"/>
</a>
{%- endif %}
<button
id="sk-navbar-toggler"
class="navbar-toggler"
type="button"
data-toggle="collapse"
data-target="#navbarSupportedContent"
aria-controls="navbarSupportedContent"
aria-expanded="false"
aria-label="Toggle navigation"
>
<span class="navbar-toggler-icon"></span>
</button>
<div class="sk-navbar-collapse collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav mr-auto">
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('langchain_api_reference') }}">LangChain</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('core_api_reference') }}">Core</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('community_api_reference') }}">Community</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('experimental_api_reference') }}">Experimental</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('text_splitters_api_reference') }}">Text splitters</a>
</li>
{%- for title, pathname in partners %}
<li class="nav-item">
<a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ pathto(pathname) }}">{{ title }}</a>
</li>
{%- endfor %}
<li class="nav-item dropdown nav-more-item-dropdown">
<a class="sk-nav-link nav-link dropdown-toggle" href="#" id="navbarDropdown" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Partner libs</a>
<div class="dropdown-menu" aria-labelledby="navbarDropdown">
{%- for title, pathname in partners %}
<a class="sk-nav-dropdown-item dropdown-item" href="{{ pathto(pathname) }}">{{ title }}</a>
{%- endfor %}
</div>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" target="_blank" rel="noopener noreferrer" href="https://python.langchain.com/">Docs</a>
</li>
</ul>
{%- if pagename != "search"%}
<div id="searchbox" role="search">
<div class="searchformwrapper">
<form class="search" action="{{ pathto('search') }}" method="get">
<input class="sk-search-text-input" type="text" name="q" aria-labelledby="searchlabel" />
<input class="sk-search-text-btn" type="submit" value="{{ _('Go') }}" />
</form>
</div>
</div>
{%- endif %}
</div>
</div>
</nav>

View File

@@ -1,16 +0,0 @@
{%- extends "basic/search.html" %}
{% block extrahead %}
<script type="text/javascript" src="{{ pathto('_static/underscore.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('searchindex.js', 1) }}" defer></script>
<script type="text/javascript" src="{{ pathto('_static/doctools.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('_static/language_data.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('_static/searchtools.js', 1) }}"></script>
<script type="text/javascript" src="{{ pathto('_static/sphinx_highlight.js', 1) }}"></script>
<script type="text/javascript">
$(document).ready(function() {
if (!Search.out) {
Search.init();
}
});
</script>
{% endblock %}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,8 +0,0 @@
[theme]
inherit = basic
pygments_style = default
stylesheet = css/theme.css
[options]
link_to_live_contributing_page = false
mathjax_path =

View File

@@ -4,8 +4,11 @@ LangChain implements the latest research in the field of Natural Language Proces
This page contains `arXiv` papers referenced in the LangChain Documentation, API Reference,
Templates, and Cookbooks.
From the opposite direction, scientists use LangChain in research and reference LangChain in the research papers.
Here you find [such papers](https://arxiv.org/search/?query=langchain&searchtype=all&source=header).
In the other direction, scientists use `LangChain` in their research and cite it in their papers.
Here you can find papers that reference:
- [LangChain](https://arxiv.org/search/?query=langchain&searchtype=all&source=header)
- [LangGraph](https://arxiv.org/search/?query=langgraph&searchtype=all&source=header)
- [LangSmith](https://arxiv.org/search/?query=langsmith&searchtype=all&source=header)
## Summary
@@ -23,32 +26,30 @@ Here you find [such papers](https://arxiv.org/search/?query=langchain&searchtype
| `2305.14283v3` [Query Rewriting for Retrieval-Augmented Large Language Models](http://arxiv.org/abs/2305.14283v3) | Xinbei Ma, Yeyun Gong, Pengcheng He, et al. | 2023-05-23 | `Template:` [rewrite-retrieve-read](https://python.langchain.com/docs/templates/rewrite-retrieve-read), `Cookbook:` [rewrite](https://github.com/langchain-ai/langchain/blob/master/cookbook/rewrite.ipynb)
| `2305.08291v1` [Large Language Model Guided Tree-of-Thought](http://arxiv.org/abs/2305.08291v1) | Jieyi Long | 2023-05-15 | `API:` [langchain_experimental.tot](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.tot), `Cookbook:` [tree_of_thought](https://github.com/langchain-ai/langchain/blob/master/cookbook/tree_of_thought.ipynb)
| `2305.04091v3` [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](http://arxiv.org/abs/2305.04091v3) | Lei Wang, Wanyu Xu, Yihuai Lan, et al. | 2023-05-06 | `Cookbook:` [plan_and_execute_agent](https://github.com/langchain-ai/langchain/blob/master/cookbook/plan_and_execute_agent.ipynb)
| `2305.02156v1` [Zero-Shot Listwise Document Reranking with a Large Language Model](http://arxiv.org/abs/2305.02156v1) | Xueguang Ma, Xinyu Zhang, Ronak Pradeep, et al. | 2023-05-03 | `API:` [langchain...LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html#langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank)
| `2304.08485v2` [Visual Instruction Tuning](http://arxiv.org/abs/2304.08485v2) | Haotian Liu, Chunyuan Li, Qingyang Wu, et al. | 2023-04-17 | `Cookbook:` [Semi_structured_and_multi_modal_RAG](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb), [Semi_structured_multi_modal_RAG_LLaMA2](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb)
| `2304.03442v2` [Generative Agents: Interactive Simulacra of Human Behavior](http://arxiv.org/abs/2304.03442v2) | Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, et al. | 2023-04-07 | `Cookbook:` [multiagent_bidding](https://github.com/langchain-ai/langchain/blob/master/cookbook/multiagent_bidding.ipynb), [generative_agents_interactive_simulacra_of_human_behavior](https://github.com/langchain-ai/langchain/blob/master/cookbook/generative_agents_interactive_simulacra_of_human_behavior.ipynb)
| `2303.17760v2` [CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society](http://arxiv.org/abs/2303.17760v2) | Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, et al. | 2023-03-31 | `Cookbook:` [camel_role_playing](https://github.com/langchain-ai/langchain/blob/master/cookbook/camel_role_playing.ipynb)
| `2303.17580v4` [HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face](http://arxiv.org/abs/2303.17580v4) | Yongliang Shen, Kaitao Song, Xu Tan, et al. | 2023-03-30 | `API:` [langchain_experimental.autonomous_agents](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.autonomous_agents), `Cookbook:` [hugginggpt](https://github.com/langchain-ai/langchain/blob/master/cookbook/hugginggpt.ipynb)
| `2303.08774v6` [GPT-4 Technical Report](http://arxiv.org/abs/2303.08774v6) | OpenAI, Josh Achiam, Steven Adler, et al. | 2023-03-15 | `Docs:` [docs/integrations/vectorstores/mongodb_atlas](https://python.langchain.com/docs/integrations/vectorstores/mongodb_atlas)
| `2301.10226v4` [A Watermark for Large Language Models](http://arxiv.org/abs/2301.10226v4) | John Kirchenbauer, Jonas Geiping, Yuxin Wen, et al. | 2023-01-24 | `API:` [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...OCIModelDeploymentTGI](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI.html#langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
| `2301.10226v4` [A Watermark for Large Language Models](http://arxiv.org/abs/2301.10226v4) | John Kirchenbauer, Jonas Geiping, Yuxin Wen, et al. | 2023-01-24 | `API:` [langchain_community...OCIModelDeploymentTGI](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI.html#langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI), [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
| `2212.10496v1` [Precise Zero-Shot Dense Retrieval without Relevance Labels](http://arxiv.org/abs/2212.10496v1) | Luyu Gao, Xueguang Ma, Jimmy Lin, et al. | 2022-12-20 | `API:` [langchain...HypotheticalDocumentEmbedder](https://api.python.langchain.com/en/latest/chains/langchain.chains.hyde.base.HypotheticalDocumentEmbedder.html#langchain.chains.hyde.base.HypotheticalDocumentEmbedder), `Template:` [hyde](https://python.langchain.com/docs/templates/hyde), `Cookbook:` [hypothetical_document_embeddings](https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb)
| `2212.07425v3` [Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments](http://arxiv.org/abs/2212.07425v3) | Zhivar Sourati, Vishnu Priya Prasanna Venkatesh, Darshan Deshpande, et al. | 2022-12-12 | `API:` [langchain_experimental.fallacy_removal](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.fallacy_removal)
| `2211.13892v2` [Complementary Explanations for Effective In-Context Learning](http://arxiv.org/abs/2211.13892v2) | Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, et al. | 2022-11-25 | `API:` [langchain_core...MaxMarginalRelevanceExampleSelector](https://api.python.langchain.com/en/latest/example_selectors/langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector.html#langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector)
| `2211.10435v2` [PAL: Program-aided Language Models](http://arxiv.org/abs/2211.10435v2) | Luyu Gao, Aman Madaan, Shuyan Zhou, et al. | 2022-11-18 | `API:` [langchain_experimental.pal_chain](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.pal_chain), [langchain_experimental...PALChain](https://api.python.langchain.com/en/latest/pal_chain/langchain_experimental.pal_chain.base.PALChain.html#langchain_experimental.pal_chain.base.PALChain), `Cookbook:` [program_aided_language_model](https://github.com/langchain-ai/langchain/blob/master/cookbook/program_aided_language_model.ipynb)
| `2210.03629v3` [ReAct: Synergizing Reasoning and Acting in Language Models](http://arxiv.org/abs/2210.03629v3) | Shunyu Yao, Jeffrey Zhao, Dian Yu, et al. | 2022-10-06 | `Docs:` [docs/integrations/providers/cohere](https://python.langchain.com/docs/integrations/providers/cohere), [docs/integrations/tools/ionic_shopping](https://python.langchain.com/docs/integrations/tools/ionic_shopping), `API:` [langchain...TrajectoryEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain.html#langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain), [langchain...create_react_agent](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html#langchain.agents.react.agent.create_react_agent)
| `2209.10785v2` [Deep Lake: a Lakehouse for Deep Learning](http://arxiv.org/abs/2209.10785v2) | Sasun Hambardzumyan, Abhinav Tuli, Levon Ghukasyan, et al. | 2022-09-22 | `Docs:` [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/docs/integrations/providers/activeloop_deeplake)
| `2205.13147v4` [Matryoshka Representation Learning](http://arxiv.org/abs/2205.13147v4) | Aditya Kusupati, Gantavya Bhatt, Aniket Rege, et al. | 2022-05-26 | `Docs:` [docs/integrations/providers/snowflake](https://python.langchain.com/docs/integrations/providers/snowflake)
| `2205.12654v1` [Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages](http://arxiv.org/abs/2205.12654v1) | Kevin Heffernan, Onur Çelebi, Holger Schwenk | 2022-05-25 | `API:` [langchain_community...LaserEmbeddings](https://api.python.langchain.com/en/latest/embeddings/langchain_community.embeddings.laser.LaserEmbeddings.html#langchain_community.embeddings.laser.LaserEmbeddings)
| `2204.00498v1` [Evaluating the Text-to-SQL Capabilities of Large Language Models](http://arxiv.org/abs/2204.00498v1) | Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau | 2022-03-15 | `API:` [langchain_community...SQLDatabase](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.sql_database.SQLDatabase.html#langchain_community.utilities.sql_database.SQLDatabase), [langchain_community...SparkSQL](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.spark_sql.SparkSQL.html#langchain_community.utilities.spark_sql.SparkSQL)
| `2202.00666v5` [Locally Typical Sampling](http://arxiv.org/abs/2202.00666v5) | Clara Meister, Tiago Pimentel, Gian Wiher, et al. | 2022-02-01 | `API:` [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
| `2103.00020v1` [Learning Transferable Visual Models From Natural Language Supervision](http://arxiv.org/abs/2103.00020v1) | Alec Radford, Jong Wook Kim, Chris Hallacy, et al. | 2021-02-26 | `API:` [langchain_experimental.open_clip](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.open_clip)
| `1908.10084v1` [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](http://arxiv.org/abs/1908.10084v1) | Nils Reimers, Iryna Gurevych | 2019-08-27 | `Docs:` [docs/integrations/text_embedding/sentence_transformers](https://python.langchain.com/docs/integrations/text_embedding/sentence_transformers)
| `1909.05858v2` [CTRL: A Conditional Transformer Language Model for Controllable Generation](http://arxiv.org/abs/1909.05858v2) | Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, et al. | 2019-09-11 | `API:` [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
## Self-Discover: Large Language Models Self-Compose Reasoning Structures
- **arXiv id:** [2402.03620v1](http://arxiv.org/abs/2402.03620v1) **Published Date:** 2024-02-06
- **Title:** Self-Discover: Large Language Models Self-Compose Reasoning Structures
- **Authors:** Pei Zhou, Jay Pujara, Xiang Ren, et al.
- **LangChain:**
- **Cookbook:** [self-discover](https://github.com/langchain-ai/langchain/blob/master/cookbook/self-discover.ipynb)
## RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
- **arXiv id:** [2401.18059v1](http://arxiv.org/abs/2401.18059v1) **Published Date:** 2024-01-31
- **Title:** RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
- **Authors:** Parth Sarthi, Salman Abdullah, Aditi Tuli, et al.
- **LangChain:**
- **Cookbook:** [RAPTOR](https://github.com/langchain-ai/langchain/blob/master/cookbook/RAPTOR.ipynb)
## Corrective Retrieval Augmented Generation
- **arXiv id:** [2401.15884v2](http://arxiv.org/abs/2401.15884v2) **Published Date:** 2024-01-29
- **Title:** Corrective Retrieval Augmented Generation
- **Authors:** Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, et al.
- **LangChain:**
- **Cookbook:** [langgraph_crag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_crag.ipynb)
## Mixtral of Experts
- **arXiv id:** [2401.04088v1](http://arxiv.org/abs/2401.04088v1) **Published Date:** 2024-01-08
- **Title:** Mixtral of Experts
- **Authors:** Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, et al.
- **LangChain:**
- **Cookbook:** [together_ai](https://github.com/langchain-ai/langchain/blob/master/cookbook/together_ai.ipynb)
## Dense X Retrieval: What Retrieval Granularity Should We Use?
- **arXiv id:** [2312.06648v2](http://arxiv.org/abs/2312.06648v2) **Published Date:** 2023-12-11
- **Title:** Dense X Retrieval: What Retrieval Granularity Should We Use?
- **Authors:** Tong Chen, Hongwei Wang, Sihao Chen, et al.
- **LangChain:**
- **Template:** [propositional-retrieval](https://python.langchain.com/docs/templates/propositional-retrieval)
## Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
- **arXiv id:** [2311.09210v1](http://arxiv.org/abs/2311.09210v1) **Published Date:** 2023-11-15
- **Title:** Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
- **Authors:** Wenhao Yu, Hongming Zhang, Xiaoman Pan, et al.
- **LangChain:**
- **Template:** [chain-of-note-wiki](https://python.langchain.com/docs/templates/chain-of-note-wiki)
## Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- **arXiv id:** [2310.11511v1](http://arxiv.org/abs/2310.11511v1) **Published Date:** 2023-10-17
- **Title:** Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- **Authors:** Akari Asai, Zeqiu Wu, Yizhong Wang, et al.
- **LangChain:**
- **Cookbook:** [langgraph_self_rag](https://github.com/langchain-ai/langchain/blob/master/cookbook/langgraph_self_rag.ipynb)
## Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
- **arXiv id:** [2310.06117v2](http://arxiv.org/abs/2310.06117v2) **Published Date:** 2023-10-09
- **Title:** Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
- **Authors:** Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, et al.
- **LangChain:**
- **Template:** [stepback-qa-prompting](https://python.langchain.com/docs/templates/stepback-qa-prompting)
## Llama 2: Open Foundation and Fine-Tuned Chat Models
- **arXiv id:** [2307.09288v2](http://arxiv.org/abs/2307.09288v2) **Published Date:** 2023-07-18
- **Title:** Llama 2: Open Foundation and Fine-Tuned Chat Models
- **Authors:** Hugo Touvron, Louis Martin, Kevin Stone, et al.
- **LangChain:**
- **Cookbook:** [Semi_Structured_RAG](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_Structured_RAG.ipynb)
## Query Rewriting for Retrieval-Augmented Large Language Models
- **arXiv id:** [2305.14283v3](http://arxiv.org/abs/2305.14283v3) **Published Date:** 2023-05-23
- **Title:** Query Rewriting for Retrieval-Augmented Large Language Models
- **Authors:** Xinbei Ma, Yeyun Gong, Pengcheng He, et al.
- **LangChain:**
- **Template:** [rewrite-retrieve-read](https://python.langchain.com/docs/templates/rewrite-retrieve-read)
## Large Language Model Guided Tree-of-Thought
- **arXiv id:** [2305.08291v1](http://arxiv.org/abs/2305.08291v1) **Published Date:** 2023-05-15
- **Title:** Large Language Model Guided Tree-of-Thought
- **Authors:** Jieyi Long
- **LangChain:**
- **API Reference:** [langchain_experimental.tot](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.tot)
## Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- **arXiv id:** [2305.04091v3](http://arxiv.org/abs/2305.04091v3) **Published Date:** 2023-05-06
- **Title:** Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- **Authors:** Lei Wang, Wanyu Xu, Yihuai Lan, et al.
- **LangChain:**
- **Cookbook:** [plan_and_execute_agent](https://github.com/langchain-ai/langchain/blob/master/cookbook/plan_and_execute_agent.ipynb)
**Abstract (excerpt):** [...] Prompting, and has comparable performance with 8-shot CoT prompting on the math
reasoning problem. The code can be found at
https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting.
## Zero-Shot Listwise Document Reranking with a Large Language Model
- **arXiv id:** [2305.02156v1](http://arxiv.org/abs/2305.02156v1) **Published Date:** 2023-05-03
- **Title:** Zero-Shot Listwise Document Reranking with a Large Language Model
- **Authors:** Xueguang Ma, Xinyu Zhang, Ronak Pradeep, et al.
- **LangChain:**
- **API Reference:** [langchain...LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html#langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank)
**Abstract:** Supervised ranking methods based on bi-encoder or cross-encoder architectures
have shown success in multi-stage text ranking tasks, but they require large
amounts of relevance judgments as training data. In this work, we propose
Listwise Reranker with a Large Language Model (LRL), which achieves strong
reranking effectiveness without using any task-specific training data.
Different from the existing pointwise ranking methods, where documents are
scored independently and ranked according to the scores, LRL directly generates
a reordered list of document identifiers given the candidate documents.
Experiments on three TREC web search datasets demonstrate that LRL not only
outperforms zero-shot pointwise methods when reranking first-stage retrieval
results, but can also act as a final-stage reranker to improve the top-ranked
results of a pointwise method for improved efficiency. Additionally, we apply
our approach to subsets of MIRACL, a recent multilingual retrieval dataset,
with results showing its potential to generalize across different languages.
## Visual Instruction Tuning
- **arXiv id:** [2304.08485v2](http://arxiv.org/abs/2304.08485v2) **Published Date:** 2023-04-17
- **Title:** Visual Instruction Tuning
- **Authors:** Haotian Liu, Chunyuan Li, Qingyang Wu, et al.
- **LangChain:**
- **Cookbook:** [Semi_structured_and_multi_modal_RAG](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb), [Semi_structured_multi_modal_RAG_LLaMA2](https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb)
## Generative Agents: Interactive Simulacra of Human Behavior
- **arXiv id:** [2304.03442v2](http://arxiv.org/abs/2304.03442v2) **Published Date:** 2023-04-07
- **Title:** Generative Agents: Interactive Simulacra of Human Behavior
- **Authors:** Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, et al.
- **LangChain:**
- **Cookbook:** [multiagent_bidding](https://github.com/langchain-ai/langchain/blob/master/cookbook/multiagent_bidding.ipynb), [generative_agents_interactive_simulacra_of_human_behavior](https://github.com/langchain-ai/langchain/blob/master/cookbook/generative_agents_interactive_simulacra_of_human_behavior.ipynb)
## CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
- **arXiv id:** [2303.17760v2](http://arxiv.org/abs/2303.17760v2) **Published Date:** 2023-03-31
- **Title:** CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
- **Authors:** Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, et al.
- **LangChain:**
- **Cookbook:** [camel_role_playing](https://github.com/langchain-ai/langchain/blob/master/cookbook/camel_role_playing.ipynb)
## HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- **arXiv id:** [2303.17580v4](http://arxiv.org/abs/2303.17580v4) **Published Date:** 2023-03-30
- **Title:** HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- **Authors:** Yongliang Shen, Kaitao Song, Xu Tan, et al.
- **LangChain:**
- **API Reference:** [langchain_experimental.autonomous_agents](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.autonomous_agents)
**Abstract (excerpt):** [...] modalities and domains and achieve impressive results in language, vision,
speech, and other challenging tasks, which paves a new way towards the
realization of artificial general intelligence.
## GPT-4 Technical Report
- **arXiv id:** [2303.08774v6](http://arxiv.org/abs/2303.08774v6) **Published Date:** 2023-03-15
- **Title:** GPT-4 Technical Report
- **Authors:** OpenAI, Josh Achiam, Steven Adler, et al.
- **LangChain:**
- **Documentation:** [docs/integrations/vectorstores/mongodb_atlas](https://python.langchain.com/docs/integrations/vectorstores/mongodb_atlas)
**Abstract:** We report the development of GPT-4, a large-scale, multimodal model which can
accept image and text inputs and produce text outputs. While less capable than
humans in many real-world scenarios, GPT-4 exhibits human-level performance on
various professional and academic benchmarks, including passing a simulated bar
exam with a score around the top 10% of test takers. GPT-4 is a
Transformer-based model pre-trained to predict the next token in a document.
The post-training alignment process results in improved performance on measures
of factuality and adherence to desired behavior. A core component of this
project was developing infrastructure and optimization methods that behave
predictably across a wide range of scales. This allowed us to accurately
predict some aspects of GPT-4's performance based on models trained with no
more than 1/1,000th the compute of GPT-4.
## A Watermark for Large Language Models
- **arXiv id:** [2301.10226v4](http://arxiv.org/abs/2301.10226v4) **Published Date:** 2023-01-24
- **Title:** A Watermark for Large Language Models
- **Authors:** John Kirchenbauer, Jonas Geiping, Yuxin Wen, et al.
- **LangChain:**
- **API Reference:** [langchain_community...OCIModelDeploymentTGI](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI.html#langchain_community.llms.oci_data_science_model_deployment_endpoint.OCIModelDeploymentTGI), [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
**Abstract:** Potential harms of large language models can be mitigated by watermarking
model output, i.e., embedding signals into generated text that are invisible to
[...] family, and discuss robustness and security.
## Precise Zero-Shot Dense Retrieval without Relevance Labels
- **arXiv id:** [2212.10496v1](http://arxiv.org/abs/2212.10496v1) **Published Date:** 2022-12-20
- **Title:** Precise Zero-Shot Dense Retrieval without Relevance Labels
- **Authors:** Luyu Gao, Xueguang Ma, Jimmy Lin, et al.
- **LangChain:**
- **API Reference:** [langchain...HypotheticalDocumentEmbedder](https://api.python.langchain.com/en/latest/chains/langchain.chains.hyde.base.HypotheticalDocumentEmbedder.html#langchain.chains.hyde.base.HypotheticalDocumentEmbedder)
## Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
- **arXiv id:** [2212.07425v3](http://arxiv.org/abs/2212.07425v3) **Published Date:** 2022-12-12
- **Title:** Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
- **Authors:** Zhivar Sourati, Vishnu Priya Prasanna Venkatesh, Darshan Deshpande, et al.
- **LangChain:**
- **API Reference:** [langchain_experimental.fallacy_removal](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.fallacy_removal)
## Complementary Explanations for Effective In-Context Learning
- **arXiv id:** [2211.13892v2](http://arxiv.org/abs/2211.13892v2) **Published Date:** 2022-11-25
- **Title:** Complementary Explanations for Effective In-Context Learning
- **Authors:** Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, et al.
- **LangChain:**
- **API Reference:** [langchain_core...MaxMarginalRelevanceExampleSelector](https://api.python.langchain.com/en/latest/example_selectors/langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector.html#langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector)
## PAL: Program-aided Language Models
- **arXiv id:** [2211.10435v2](http://arxiv.org/abs/2211.10435v2) **Published Date:** 2022-11-18
- **Title:** PAL: Program-aided Language Models
- **Authors:** Luyu Gao, Aman Madaan, Shuyan Zhou, et al.
- **LangChain:**
- **API Reference:** [langchain_experimental.pal_chain](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.pal_chain), [langchain_experimental...PALChain](https://api.python.langchain.com/en/latest/pal_chain/langchain_experimental.pal_chain.base.PALChain.html#langchain_experimental.pal_chain.base.PALChain)
- **Cookbook:** [program_aided_language_model](https://github.com/langchain-ai/langchain/blob/master/cookbook/program_aided_language_model.ipynb)
**Abstract:** Large language models (LLMs) have recently demonstrated an impressive ability
[...] publicly available at http://reasonwithpal.com/.
## ReAct: Synergizing Reasoning and Acting in Language Models
- **arXiv id:** [2210.03629v3](http://arxiv.org/abs/2210.03629v3) **Published Date:** 2022-10-06
- **Title:** ReAct: Synergizing Reasoning and Acting in Language Models
- **Authors:** Shunyu Yao, Jeffrey Zhao, Dian Yu, et al.
- **LangChain:**
- **Documentation:** [docs/integrations/providers/cohere](https://python.langchain.com/docs/integrations/providers/cohere), [docs/integrations/tools/ionic_shopping](https://python.langchain.com/docs/integrations/tools/ionic_shopping)
- **API Reference:** [langchain...TrajectoryEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain.html#langchain.evaluation.agents.trajectory_eval_chain.TrajectoryEvalChain), [langchain...create_react_agent](https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html#langchain.agents.react.agent.create_react_agent)
**Abstract:** While large language models (LLMs) have demonstrated impressive capabilities
across tasks in language understanding and interactive decision making, their
[...] Project site with code: https://react-lm.github.io
## Deep Lake: a Lakehouse for Deep Learning
- **arXiv id:** [2209.10785v2](http://arxiv.org/abs/2209.10785v2) **Published Date:** 2022-09-22
- **Title:** Deep Lake: a Lakehouse for Deep Learning
- **Authors:** Sasun Hambardzumyan, Abhinav Tuli, Levon Ghukasyan, et al.
- **LangChain:**
- **Documentation:** [docs/integrations/providers/activeloop_deeplake](https://python.langchain.com/docs/integrations/providers/activeloop_deeplake)
**Abstract (excerpt):** [...] visualization engine, or (c) deep learning frameworks without sacrificing GPU
utilization. Datasets stored in Deep Lake can be accessed from PyTorch,
TensorFlow, JAX, and integrate with numerous MLOps tools.
## Matryoshka Representation Learning
- **arXiv id:** [2205.13147v4](http://arxiv.org/abs/2205.13147v4) **Published Date:** 2022-05-26
- **Title:** Matryoshka Representation Learning
- **Authors:** Aditya Kusupati, Gantavya Bhatt, Aniket Rege, et al.
- **LangChain:**
- **Documentation:** [docs/integrations/providers/snowflake](https://python.langchain.com/docs/integrations/providers/snowflake)
**Abstract:** Learned representations are a central component in modern ML systems, serving
a multitude of downstream tasks. When training such representations, it is
often the case that computational and statistical constraints for each
downstream task are unknown. In this context rigid, fixed capacity
representations can be either over or under-accommodating to the task at hand.
This leads us to ask: can we design a flexible representation that can adapt to
multiple downstream tasks with varying computational resources? Our main
contribution is Matryoshka Representation Learning (MRL) which encodes
information at different granularities and allows a single embedding to adapt
to the computational constraints of downstream tasks. MRL minimally modifies
existing representation learning pipelines and imposes no additional cost
during inference and deployment. MRL learns coarse-to-fine representations that
are at least as accurate and rich as independently trained low-dimensional
representations. The flexibility within the learned Matryoshka Representations
offer: (a) up to 14x smaller embedding size for ImageNet-1K classification at
the same level of accuracy; (b) up to 14x real-world speed-ups for large-scale
retrieval on ImageNet-1K and 4K; and (c) up to 2% accuracy improvements for
long-tail few-shot classification, all while being as robust as the original
representations. Finally, we show that MRL extends seamlessly to web-scale
datasets (ImageNet, JFT) across various modalities -- vision (ViT, ResNet),
vision + language (ALIGN) and language (BERT). MRL code and pretrained models
are open-sourced at https://github.com/RAIVNLab/MRL.
## Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
- **arXiv id:** [2205.12654v1](http://arxiv.org/abs/2205.12654v1) **Published Date:** 2022-05-25
- **Title:** Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
- **Authors:** Kevin Heffernan, Onur Çelebi, Holger Schwenk
- **LangChain:**
- **API Reference:** [langchain_community...LaserEmbeddings](https://api.python.langchain.com/en/latest/embeddings/langchain_community.embeddings.laser.LaserEmbeddings.html#langchain_community.embeddings.laser.LaserEmbeddings)
## Evaluating the Text-to-SQL Capabilities of Large Language Models
- **arXiv id:** [2204.00498v1](http://arxiv.org/abs/2204.00498v1) **Published Date:** 2022-03-15
- **Title:** Evaluating the Text-to-SQL Capabilities of Large Language Models
- **Authors:** Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau
- **LangChain:**
- **API Reference:** [langchain_community...SQLDatabase](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.sql_database.SQLDatabase.html#langchain_community.utilities.sql_database.SQLDatabase), [langchain_community...SparkSQL](https://api.python.langchain.com/en/latest/utilities/langchain_community.utilities.spark_sql.SparkSQL.html#langchain_community.utilities.spark_sql.SparkSQL)
**Abstract:** We perform an empirical evaluation of Text-to-SQL capabilities of the Codex
language model. We find that, without any finetuning, Codex is a strong
[...] few-shot examples.
## Locally Typical Sampling
- **arXiv id:** [2202.00666v5](http://arxiv.org/abs/2202.00666v5) **Published Date:** 2022-02-01
- **Title:** Locally Typical Sampling
- **Authors:** Clara Meister, Tiago Pimentel, Gian Wiher, et al.
- **LangChain:**
- **API Reference:** [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
**Abstract:** Today's probabilistic language generators fall short when it comes to
producing coherent and fluent text despite the fact that the underlying models
[...] reducing degenerate repetitions.
## Learning Transferable Visual Models From Natural Language Supervision
- **arXiv id:** [2103.00020v1](http://arxiv.org/abs/2103.00020v1) **Published Date:** 2021-02-26
- **Title:** Learning Transferable Visual Models From Natural Language Supervision
- **Authors:** Alec Radford, Jong Wook Kim, Chris Hallacy, et al.
- **LangChain:**
- **API Reference:** [langchain_experimental.open_clip](https://api.python.langchain.com/en/latest/experimental_api_reference.html#module-langchain_experimental.open_clip)
## CTRL: A Conditional Transformer Language Model for Controllable Generation
- **arXiv id:** [1909.05858v2](http://arxiv.org/abs/1909.05858v2) **Published Date:** 2019-09-11
- **Title:** CTRL: A Conditional Transformer Language Model for Controllable Generation
- **Authors:** Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, et al.
- **LangChain:**
- **API Reference:** [langchain_huggingface...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_huggingface.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceEndpoint](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint.html#langchain_community.llms.huggingface_endpoint.HuggingFaceEndpoint), [langchain_community...HuggingFaceTextGenInference](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference.html#langchain_community.llms.huggingface_text_gen_inference.HuggingFaceTextGenInference)
**Abstract:** Large-scale language models show promising text generation capabilities, but
users cannot easily control particular aspects of the generated text. We
[...] codes also allow CTRL to predict which parts of the training data are most
likely given a sequence. This provides a potential method for analyzing large
amounts of data via model-based source attribution. We have released multiple
full-sized, pretrained versions of CTRL at https://github.com/salesforce/ctrl.
## Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- **arXiv id:** [1908.10084v1](http://arxiv.org/abs/1908.10084v1) **Published Date:** 2019-08-27
- **Title:** Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- **Authors:** Nils Reimers, Iryna Gurevych
- **LangChain:**
- **Documentation:** [docs/integrations/text_embedding/sentence_transformers](https://python.langchain.com/docs/integrations/text_embedding/sentence_transformers)
**Abstract:** BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) has set a new
state-of-the-art performance on sentence-pair regression tasks like semantic
textual similarity (STS). However, it requires that both sentences are fed into
the network, which causes a massive computational overhead: Finding the most
similar pair in a collection of 10,000 sentences requires about 50 million
inference computations (~65 hours) with BERT. The construction of BERT makes it
unsuitable for semantic similarity search as well as for unsupervised tasks
like clustering.
In this publication, we present Sentence-BERT (SBERT), a modification of the
pretrained BERT network that use siamese and triplet network structures to
derive semantically meaningful sentence embeddings that can be compared using
cosine-similarity. This reduces the effort for finding the most similar pair
from 65 hours with BERT / RoBERTa to about 5 seconds with SBERT, while
maintaining the accuracy from BERT.
We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning
tasks, where it outperforms other state-of-the-art sentence embeddings methods.

## LangChain Expression Language (LCEL)
LCEL aims to provide consistency around behavior and customization over legacy subclassed chains such as
`ConversationalRetrievalChain`. Many of these legacy chains hide important details like prompts, and as a wider variety
of viable models emerge, customization has become more and more important.
If you are currently using one of these legacy chains, please see [this guide for guidance on how to migrate](/docs/versions/migrating_chains).
For guides on how to do specific tasks with LCEL, check out [the relevant how-to guides](/docs/how_to/#langchain-expression-language-lcel).
ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel, head to the API reference for that model.
:::important
Some chat models have been fine-tuned for **tool calling** and provide a dedicated API for it.
Generally, such models are better at tool calling than non-fine-tuned models, and are recommended for use cases that require tool calling.
Please see the [tool calling section](/docs/concepts/#functiontool-calling) for more information.
:::
These represent a decision from a language model to call a tool. They are included as part of an `AIMessage` output.
They can be accessed from there with the `.tool_calls` property.
This property returns a list of `ToolCall`s. A `ToolCall` is a dictionary with the following arguments:
- `name`: The name of the tool that should be called.
- `args`: The arguments to that tool.
- `id`: An identifier for the tool call, used to match the tool's result back to the call.
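For illustration, accessing the property might look like the following sketch, where `llm_with_tools` is a chat model with a simple `add` tool bound to it (the tool, the query, and the generated id are assumed examples, not fixed outputs):

```python
ai_msg = llm_with_tools.invoke("What is 2 + 3?")
print(ai_msg.tool_calls)
# e.g. [{'name': 'add', 'args': {'a': 2, 'b': 3}, 'id': 'call_abc123'}]
```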
#### SystemMessage
This represents a system message, which tells the model how to behave. Not every model provider supports this.
#### ToolMessage
This represents the result of a tool call. In addition to `role` and `content`, this message has:
- a `tool_call_id` field which conveys the id of the call to the tool that was called to produce this result.
- an `artifact` field which can be used to pass along arbitrary artifacts of the tool execution which are useful to track but which should not be sent to the model.
#### (Legacy) FunctionMessage
This is a legacy message type, corresponding to OpenAI's legacy function-calling API. `ToolMessage` should be used instead to correspond to the updated tool-calling API.
This represents the result of a function call. In addition to `role` and `content`, this message has a `name` parameter which conveys the name of the function that was called to produce this result.
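For illustration, constructing a `ToolMessage` with an artifact might look like this sketch (the id, content string, and artifact payload are assumed values):

```python
from langchain_core.messages import ToolMessage

tool_message = ToolMessage(
    content="Query returned 2 rows.",         # sent to the model
    tool_call_id="call_abc123",               # id of the ToolCall being answered
    artifact={"rows": [("a", 1), ("b", 2)]},  # kept for later use, not sent to the model
)
```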
### Prompt templates
### Retrievers
Retrievers accept a string query as input and return a list of Documents as output.
For specifics on how to use retrievers, see the [relevant how-to guides here](/docs/how_to/#retrievers).
### Key-value stores
For some techniques, such as [indexing and retrieval with multiple vectors per document](/docs/how_to/multi_vector/) or
[caching embeddings](/docs/how_to/caching_embeddings/), having a form of key-value (KV) storage is helpful.
LangChain includes a [`BaseStore`](https://api.python.langchain.com/en/latest/stores/langchain_core.stores.BaseStore.html) interface,
which allows for storage of arbitrary data. However, LangChain components that require KV-storage accept a
more specific `BaseStore[str, bytes]` instance that stores binary data (referred to as a `ByteStore`), and internally take care of
encoding and decoding data for their specific needs.
This means that as a user, you only need to think about one type of store rather than different ones for different types of data.
#### Interface
All [`BaseStores`](https://api.python.langchain.com/en/latest/stores/langchain_core.stores.BaseStore.html) support the following interface. Note that the interface allows
for modifying **multiple** key-value pairs at once:
- `mget(key: Sequence[str]) -> List[Optional[bytes]]`: get the contents of multiple keys, returning `None` if the key does not exist
- `mset(key_value_pairs: Sequence[Tuple[str, bytes]]) -> None`: set the contents of multiple keys
- `mdelete(key: Sequence[str]) -> None`: delete multiple keys
- `yield_keys(prefix: Optional[str] = None) -> Iterator[str]`: yield all keys in the store, optionally filtering by a prefix
For key-value store implementations, see [this section](/docs/integrations/stores/).
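As a minimal sketch of this interface, using the in-memory `ByteStore` implementation (the keys and values are arbitrary examples):

```python
from langchain_core.stores import InMemoryByteStore

store = InMemoryByteStore()

# Set and read back multiple key-value pairs in one call each
store.mset([("doc_1", b"hello"), ("doc_2", b"world")])
print(store.mget(["doc_1", "doc_2", "missing"]))  # [b'hello', b'world', None]

# Iterate keys, optionally filtered by prefix, then delete in bulk
print(list(store.yield_keys(prefix="doc")))  # ['doc_1', 'doc_2']
store.mdelete(["doc_1", "doc_2"])
```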
### Tools
<span data-heading-keywords="tool,tools"></span>
Tools are utilities designed to be called by a model: their inputs are designed to be generated by models, and their outputs are designed to be passed back to models.
Tools are needed whenever you want a model to control parts of your code or call out to external APIs.
A tool consists of:
1. The name of the tool.
2. A description of what the tool does.
3. A JSON schema defining the inputs to the tool.
4. A function (and, optionally, an async variant of the function).
When a tool is bound to a model, the name, description and JSON schema are provided as context to the model.
Given a list of tools and a set of instructions, a model can request to call one or more tools with specific inputs.
Typical usage may look like the following:
```python
tools = [...] # Define a list of tools
llm_with_tools = llm.bind_tools(tools)
ai_msg = llm_with_tools.invoke("do xyz...")
# -> AIMessage(tool_calls=[ToolCall(...), ...], ...)
```
The `AIMessage` returned from the model MAY have `tool_calls` associated with it.
Read [this guide](/docs/concepts/#aimessage) for more information on what the response type may look like.
Once the chosen tools are invoked, the results can be passed back to the model so that it can complete whatever task
it's performing.
There are generally two different ways to invoke the tool and pass back the response:
#### Invoke with just the arguments
When you invoke a tool with just the arguments, you will get back the raw tool output (usually a string).
This generally looks like:
```python
# You should first check that the LLM actually returned tool calls
tool_call = ai_msg.tool_calls[0]
# ToolCall(args={...}, id=..., ...)
tool_output = tool.invoke(tool_call["args"])
tool_message = ToolMessage(
    content=tool_output,
    tool_call_id=tool_call["id"],
    name=tool_call["name"],
)
```
Note that the `content` field will generally be passed back to the model.
If you do not want the raw tool response to be passed to the model but still want to keep it around,
you can transform the tool output before sending it and attach the original as an artifact (read more about [`ToolMessage.artifact` here](/docs/concepts/#toolmessage)):
```python
... # Same code as above
response_for_llm = transform(tool_output)  # shape the raw tool output for the model
tool_message = ToolMessage(
    content=response_for_llm,
    tool_call_id=tool_call["id"],
    name=tool_call["name"],
    artifact=tool_output,
)
```
#### Invoke with `ToolCall`
The other way to invoke a tool is to call it with the full `ToolCall` that was generated by the model.
When you do this, the tool will return a `ToolMessage`.
The benefit is that you don't have to write the logic yourself to transform the tool output into a `ToolMessage`.
This generally looks like:
```python
tool_call = ai_msg.tool_calls[0]
# -> ToolCall(args={...}, id=..., ...)
tool_message = tool.invoke(tool_call)
# -> ToolMessage(
content="tool result foobar...",
tool_call_id=...,
name="tool_name"
)
```
If you are invoking the tool this way and want to include an [artifact](/docs/concepts/#toolmessage) for the ToolMessage, you will need to have the tool return two things.
Read more about [defining tools that return artifacts here](/docs/how_to/tool_artifacts/).
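As a sketch, such a tool can return a `(content, artifact)` tuple and declare `response_format="content_and_artifact"`; the tool itself is a made-up example:

```python
import random

from langchain_core.tools import tool

@tool(response_format="content_and_artifact")
def sample_numbers(n: int) -> tuple[str, list[int]]:
    """Generate n random numbers between 0 and 9."""
    numbers = [random.randint(0, 9) for _ in range(n)]
    # The first element becomes ToolMessage.content, the second ToolMessage.artifact
    return f"Generated {n} numbers.", numbers
```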
#### Best practices
When designing tools to be used by a model, it is important to keep in mind that:
- Chat models that have explicit [tool-calling APIs](/docs/concepts/#functiontool-calling) will be better at tool calling than non-fine-tuned models.
- Models will perform better if the tools have well-chosen names, descriptions, and JSON schemas. This is another form of prompt engineering.
- Simple, narrowly scoped tools are easier for models to use than complex tools.
#### Related
For specifics on how to use tools, see the [tools how-to guides](/docs/how_to/#tools).
To use a pre-built tool, see the [tool integration docs](/docs/integrations/tools/).
### Toolkits
<span data-heading-keywords="toolkit,toolkits"></span>
Toolkits are collections of tools that are designed to be used together for specific tasks. They have convenient loading methods.
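For example (a sketch; `ExampleToolkit` is a placeholder, not a real class):

```python
# Initialize a toolkit, then get its list of tools to pass to a model or agent
toolkit = ExampleToolkit(...)
tools = toolkit.get_tools()
```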
[...] units (like words or subwords) that carry meaning, rather than individual characters. This makes it easier for the model
to learn and understand the structure of the language, including grammar and context.
Furthermore, using tokens can also improve efficiency, since the model processes fewer units of text compared to character-level processing.
### Function/tool calling
:::info
We use the term tool calling interchangeably with function calling. Although
function calling is sometimes meant to refer to invocations of a single function,
we treat all models as though they can return multiple tool or function calls in
each message.
:::
Tool calling allows a [chat model](/docs/concepts/#chat-models) to respond to a given prompt by generating output that
matches a user-defined schema.
While the name implies that the model is performing
some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user.
One common example where you **wouldn't** want to call a function with the generated arguments
is if you want to [extract structured output matching some schema](/docs/concepts/#structured-output)
from unstructured text. You would give the model an "extraction" tool that takes
parameters matching the desired schema, then treat the generated output as your final
result.
![Diagram of a tool call by a chat model](/img/tool_call.png)
Tool calling is not universal, but is supported by many popular LLM providers, including [Anthropic](/docs/integrations/chat/anthropic/),
[Cohere](/docs/integrations/chat/cohere/), [Google](/docs/integrations/chat/google_vertex_ai_palm/),
[Mistral](/docs/integrations/chat/mistralai/), [OpenAI](/docs/integrations/chat/openai/), and even for locally-running models via [Ollama](/docs/integrations/chat/ollama/).
LangChain provides a standardized interface for tool calling that is consistent across different models.
The standard interface consists of:
* `ChatModel.bind_tools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools) as well as [Pydantic](https://pydantic.dev/) objects.
* `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model.
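A rough sketch of this interface (assuming `langchain-openai` is installed and an OpenAI API key is set; any tool-calling chat model works the same way):
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


llm = ChatOpenAI(model="gpt-4o")
llm_with_tools = llm.bind_tools([multiply])
ai_msg = llm_with_tools.invoke("What is 3 * 12?")
ai_msg.tool_calls
# -> [{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': '...', ...}]
```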
#### Tool usage
After the model generates tool calls, you can use them by invoking the requested tools, then passing the results back to the model.
LangChain provides the [`Tool`](/docs/concepts/#tools) abstraction to help you handle this.
The general flow is this:
1. Generate tool calls with a chat model in response to a query.
2. Invoke the appropriate tools using the generated tool call as arguments.
3. Format the result of the tool invocations as [`ToolMessages`](/docs/concepts/#toolmessage).
4. Pass the entire list of messages back to the model so that it can generate a final answer (or call more tools).
![Diagram of a complete tool calling flow](/img/tool_calling_flow.png)
This is how tool calling [agents](/docs/concepts/#agents) perform tasks and answer queries.
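Continuing the sketch above, the loop looks roughly like this (reusing `llm_with_tools` and the `multiply` tool from the previous example):
```python
from langchain_core.messages import HumanMessage

messages = [HumanMessage("What is 3 * 12?")]
# 1. Generate tool calls with the chat model
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)
# 2. and 3. Invoke each requested tool; invoking a tool with a ToolCall
# returns a ready-made ToolMessage
for tool_call in ai_msg.tool_calls:
    messages.append(multiply.invoke(tool_call))
# 4. Pass the whole conversation back so the model can generate a final answer
final_answer = llm_with_tools.invoke(messages)
```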
Check out some more focused guides below:
- [How to use chat models to call tools](/docs/how_to/tool_calling/)
- [How to pass tool outputs to chat models](/docs/how_to/tool_results_pass_to_model/)
- [Building an agent with LangGraph](https://langchain-ai.github.io/langgraph/tutorials/introduction/)
### Structured output
LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide
@@ -821,7 +974,7 @@ We recommend this method as a starting point when working with structured output
- If multiple underlying techniques are supported, you can supply a `method` parameter to
[toggle which one is used](/docs/how_to/structured_output/#advanced-specifying-the-method-for-structuring-outputs) (see the sketch after this list).
You may want or need to use other techiniques if:
You may want or need to use other techniques if:
- The chat model you are using does not support tool calling.
- You are working with very complex schemas and the model is having trouble generating outputs that conform.
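For reference, a minimal sketch of the `method` parameter mentioned above (assuming an OpenAI chat model, where `"function_calling"` and `"json_mode"` are the typical options):
```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI


class Joke(BaseModel):
    """A joke to tell the user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")


llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(Joke, method="function_calling")
structured_llm.invoke("Tell me a joke about cats")
```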
@@ -900,48 +1053,48 @@ chain.invoke({ "question": "What is the powerhouse of the cell?" })
For a full list of model providers that support JSON mode, see [this table](/docs/integrations/chat/#advanced-features).
#### Function/tool calling
#### Tool calling {#structured-output-tool-calling}
:::info
We use the term tool calling interchangeably with function calling. Although
function calling is sometimes meant to refer to invocations of a single function,
we treat all models as though they can return multiple tool or function calls in
each message.
:::
Tool calling allows a model to respond to a given prompt by generating output that
matches a user-defined schema. While the name implies that the model is performing
some action, this is actually not the case! The model is coming up with the
arguments to a tool, and actually running the tool (or not) is up to the user -
for example, if you want to [extract output matching some schema](/docs/tutorials/extraction)
from unstructured text, you could give the model an "extraction" tool that takes
parameters matching the desired schema, then treat the generated output as your final
result.
For models that support it, tool calling can be very convenient. It removes the
guesswork around how best to prompt schemas in favor of a built-in model feature. It can also
more naturally support agentic flows, since you can just pass multiple tool schemas instead
of fiddling with enums or unions.
Many LLM providers, including [Anthropic](https://www.anthropic.com/),
[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai),
[Mistral](https://mistral.ai/), [OpenAI](https://openai.com/), and others,
support variants of a tool calling feature. These features typically allow requests
to the LLM to include available tools and their schemas, and for responses to include
calls to these tools. For instance, given a search engine tool, an LLM might handle a
query by first issuing a call to the search engine. The system calling the LLM can
receive the tool call, execute it, and return the output to the LLM to inform its
response. LangChain includes a suite of [built-in tools](/docs/integrations/tools/)
and supports several methods for defining your own [custom tools](/docs/how_to/custom_tools).
LangChain provides a standardized interface for tool calling that is consistent across different models.
The standard interface consists of:
* `ChatModel.bind_tools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools) as well as [Pydantic](https://pydantic.dev/) objects.
* `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model.
The following how-to guides are good practical resources for using function/tool calling:
For models that support it, [tool calling](/docs/concepts/#functiontool-calling) can be very convenient for structured output. It removes the
guesswork around how best to prompt schemas in favor of a built-in model feature.
It works by first binding the desired schema either directly or via a [LangChain tool](/docs/concepts/#tools) to a
[chat model](/docs/concepts/#chat-models) using the `.bind_tools()` method. The model will then generate an `AIMessage` containing
a `tool_calls` field containing `args` that match the desired shape.
There are several acceptable formats you can use to bind tools to a model in LangChain. Here's one example:
```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI


class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""

    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(description="A followup question the user could ask")


model = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
)
model_with_tools = model.bind_tools([ResponseFormatter])
ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")
ai_msg.tool_calls[0]["args"]
```
```
{'answer': "The powerhouse of the cell is the mitochondrion. It generates most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.",
'followup_question': 'How do mitochondria generate ATP?'}
```
Tool calling is a generally consistent way to get a model to generate structured output, and is the default technique
used for the [`.with_structured_output()`](/docs/concepts/#with_structured_output) method when a model supports it.
The following how-to guides are good practical resources for using function/tool calling for structured output:
- [How to return structured data from an LLM](/docs/how_to/structured_output/)
- [How to use a model to call tools](/docs/how_to/tool_calling)

View File

@@ -33,6 +33,8 @@ Some examples include:
- [Build a Simple LLM Application with LCEL](/docs/tutorials/llm_chain/)
- [Build a Retrieval Augmented Generation (RAG) App](/docs/tutorials/rag/)
A good structural rule of thumb is to follow the structure of this [example from Numpy](https://numpy.org/numpy-tutorials/content/tutorial-svd.html).
Here are some high-level tips on writing a good tutorial:

Binary file not shown.

View File

@@ -153,7 +153,7 @@
"\n",
" def parse(self, text: str) -> List[str]:\n",
" lines = text.strip().split(\"\\n\")\n",
" return lines\n",
" return list(filter(None, lines)) # Remove empty lines\n",
"\n",
"\n",
"output_parser = LineListOutputParser()\n",

View File

@@ -0,0 +1,342 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to dispatch custom callback events\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
"- [Astream Events API](/docs/concepts/#astream_events) the `astream_events` method will surface custom callback events.\n",
":::\n",
"\n",
"In some situations, you may want to dipsatch a custom callback event from within a [Runnable](/docs/concepts/#runnable-interface) so it can be surfaced\n",
"in a custom callback handler or via the [Astream Events API](/docs/concepts/#astream_events).\n",
"\n",
"For example, if you have a long running tool with multiple steps, you can dispatch custom events between the steps and use these custom events to monitor progress.\n",
"You could also surface these custom events to an end user of your application to show them how the current task is progressing.\n",
"\n",
"To dispatch a custom event you need to decide on two attributes for the event: the `name` and the `data`.\n",
"\n",
"| Attribute | Type | Description |\n",
"|-----------|------|----------------------------------------------------------------------------------------------------------|\n",
"| name | str | A user defined name for the event. |\n",
"| data | Any | The data associated with the event. This can be anything, though we suggest making it JSON serializable. |\n",
"\n",
"\n",
":::{.callout-important}\n",
"* Dispatching custom callback events requires `langchain-core>=0.2.15`.\n",
"* Custom callback events can only be dispatched from within an existing `Runnable`.\n",
"* If using `astream_events`, you must use `version='v2'` to see custom events.\n",
"* Sending or rendering custom callbacks events in LangSmith is not yet supported.\n",
":::\n",
"\n",
"\n",
":::caution COMPATIBILITY\n",
"LangChain cannot automatically propagate configuration, including callbacks necessary for astream_events(), to child runnables if you are running async code in python<=3.10. This is a common reason why you may fail to see events being emitted from custom runnables or tools.\n",
"\n",
"If you are running python<=3.10, you will need to manually propagate the `RunnableConfig` object to the child runnable in async environments. For an example of how to manually propagate the config, see the implementation of the `bar` RunnableLambda below.\n",
"\n",
"If you are running python>=3.11, the `RunnableConfig` will automatically propagate to child runnables in async environment. However, it is still a good idea to propagate the `RunnableConfig` manually if your code may run in other Python versions.\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# | output: false\n",
"# | echo: false\n",
"\n",
"%pip install -qU langchain-core"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Astream Events API\n",
"\n",
"The most useful way to consume custom events is via the [Astream Events API](/docs/concepts/#astream_events).\n",
"\n",
"We can use the `async` `adispatch_custom_event` API to emit custom events in an async setting. \n",
"\n",
"\n",
":::{.callout-important}\n",
"\n",
"To see custom events via the astream events API, you need to use the newer `v2` API of `astream_events`.\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'event': 'on_chain_start', 'data': {'input': 'hello world'}, 'name': 'foo', 'tags': [], 'run_id': 'f354ffe8-4c22-4881-890a-c1cad038a9a6', 'metadata': {}, 'parent_ids': []}\n",
"{'event': 'on_custom_event', 'run_id': 'f354ffe8-4c22-4881-890a-c1cad038a9a6', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 'hello world'}, 'parent_ids': []}\n",
"{'event': 'on_custom_event', 'run_id': 'f354ffe8-4c22-4881-890a-c1cad038a9a6', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': 5, 'parent_ids': []}\n",
"{'event': 'on_chain_stream', 'run_id': 'f354ffe8-4c22-4881-890a-c1cad038a9a6', 'name': 'foo', 'tags': [], 'metadata': {}, 'data': {'chunk': 'hello world'}, 'parent_ids': []}\n",
"{'event': 'on_chain_end', 'data': {'output': 'hello world'}, 'run_id': 'f354ffe8-4c22-4881-890a-c1cad038a9a6', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []}\n"
]
}
],
"source": [
"from langchain_core.callbacks.manager import (\n",
" adispatch_custom_event,\n",
")\n",
"from langchain_core.runnables import RunnableLambda\n",
"from langchain_core.runnables.config import RunnableConfig\n",
"\n",
"\n",
"@RunnableLambda\n",
"async def foo(x: str) -> str:\n",
" await adispatch_custom_event(\"event1\", {\"x\": x})\n",
" await adispatch_custom_event(\"event2\", 5)\n",
" return x\n",
"\n",
"\n",
"async for event in foo.astream_events(\"hello world\", version=\"v2\"):\n",
" print(event)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In python <= 3.10, you must propagate the config manually!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'event': 'on_chain_start', 'data': {'input': 'hello world'}, 'name': 'bar', 'tags': [], 'run_id': 'c787b09d-698a-41b9-8290-92aaa656f3e7', 'metadata': {}, 'parent_ids': []}\n",
"{'event': 'on_custom_event', 'run_id': 'c787b09d-698a-41b9-8290-92aaa656f3e7', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 'hello world'}, 'parent_ids': []}\n",
"{'event': 'on_custom_event', 'run_id': 'c787b09d-698a-41b9-8290-92aaa656f3e7', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': 5, 'parent_ids': []}\n",
"{'event': 'on_chain_stream', 'run_id': 'c787b09d-698a-41b9-8290-92aaa656f3e7', 'name': 'bar', 'tags': [], 'metadata': {}, 'data': {'chunk': 'hello world'}, 'parent_ids': []}\n",
"{'event': 'on_chain_end', 'data': {'output': 'hello world'}, 'run_id': 'c787b09d-698a-41b9-8290-92aaa656f3e7', 'name': 'bar', 'tags': [], 'metadata': {}, 'parent_ids': []}\n"
]
}
],
"source": [
"from langchain_core.callbacks.manager import (\n",
" adispatch_custom_event,\n",
")\n",
"from langchain_core.runnables import RunnableLambda\n",
"from langchain_core.runnables.config import RunnableConfig\n",
"\n",
"\n",
"@RunnableLambda\n",
"async def bar(x: str, config: RunnableConfig) -> str:\n",
" \"\"\"An example that shows how to manually propagate config.\n",
"\n",
" You must do this if you're running python<=3.10.\n",
" \"\"\"\n",
" await adispatch_custom_event(\"event1\", {\"x\": x}, config=config)\n",
" await adispatch_custom_event(\"event2\", 5, config=config)\n",
" return x\n",
"\n",
"\n",
"async for event in bar.astream_events(\"hello world\", version=\"v2\"):\n",
" print(event)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Async Callback Handler\n",
"\n",
"You can also consume the dispatched event via an async callback handler."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Received event event1 with data: {'x': 1}, with tags: ['foo', 'bar'], with metadata: {} and run_id: a62b84be-7afd-4829-9947-7165df1f37d9\n",
"Received event event2 with data: 5, with tags: ['foo', 'bar'], with metadata: {} and run_id: a62b84be-7afd-4829-9947-7165df1f37d9\n"
]
},
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import Any, Dict, List, Optional\n",
"from uuid import UUID\n",
"\n",
"from langchain_core.callbacks import AsyncCallbackHandler\n",
"from langchain_core.callbacks.manager import (\n",
" adispatch_custom_event,\n",
")\n",
"from langchain_core.runnables import RunnableLambda\n",
"from langchain_core.runnables.config import RunnableConfig\n",
"\n",
"\n",
"class AsyncCustomCallbackHandler(AsyncCallbackHandler):\n",
" async def on_custom_event(\n",
" self,\n",
" name: str,\n",
" data: Any,\n",
" *,\n",
" run_id: UUID,\n",
" tags: Optional[List[str]] = None,\n",
" metadata: Optional[Dict[str, Any]] = None,\n",
" **kwargs: Any,\n",
" ) -> None:\n",
" print(\n",
" f\"Received event {name} with data: {data}, with tags: {tags}, with metadata: {metadata} and run_id: {run_id}\"\n",
" )\n",
"\n",
"\n",
"@RunnableLambda\n",
"async def bar(x: str, config: RunnableConfig) -> str:\n",
" \"\"\"An example that shows how to manually propagate config.\n",
"\n",
" You must do this if you're running python<=3.10.\n",
" \"\"\"\n",
" await adispatch_custom_event(\"event1\", {\"x\": x}, config=config)\n",
" await adispatch_custom_event(\"event2\", 5, config=config)\n",
" return x\n",
"\n",
"\n",
"async_handler = AsyncCustomCallbackHandler()\n",
"await foo.ainvoke(1, {\"callbacks\": [async_handler], \"tags\": [\"foo\", \"bar\"]})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sync Callback Handler\n",
"\n",
"Let's see how to emit custom events in a sync environment using `dispatch_custom_event`.\n",
"\n",
"You **must** call `dispatch_custom_event` from within an existing `Runnable`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Received event event1 with data: {'x': 1}, with tags: ['foo', 'bar'], with metadata: {} and run_id: 27b5ce33-dc26-4b34-92dd-08a89cb22268\n",
"Received event event2 with data: {'x': 1}, with tags: ['foo', 'bar'], with metadata: {} and run_id: 27b5ce33-dc26-4b34-92dd-08a89cb22268\n"
]
},
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import Any, Dict, List, Optional\n",
"from uuid import UUID\n",
"\n",
"from langchain_core.callbacks import BaseCallbackHandler\n",
"from langchain_core.callbacks.manager import (\n",
" dispatch_custom_event,\n",
")\n",
"from langchain_core.runnables import RunnableLambda\n",
"from langchain_core.runnables.config import RunnableConfig\n",
"\n",
"\n",
"class CustomHandler(BaseCallbackHandler):\n",
" def on_custom_event(\n",
" self,\n",
" name: str,\n",
" data: Any,\n",
" *,\n",
" run_id: UUID,\n",
" tags: Optional[List[str]] = None,\n",
" metadata: Optional[Dict[str, Any]] = None,\n",
" **kwargs: Any,\n",
" ) -> None:\n",
" print(\n",
" f\"Received event {name} with data: {data}, with tags: {tags}, with metadata: {metadata} and run_id: {run_id}\"\n",
" )\n",
"\n",
"\n",
"@RunnableLambda\n",
"def foo(x: int, config: RunnableConfig) -> int:\n",
" dispatch_custom_event(\"event1\", {\"x\": x})\n",
" dispatch_custom_event(\"event2\", {\"x\": x})\n",
" return x\n",
"\n",
"\n",
"handler = CustomHandler()\n",
"foo.invoke(1, {\"callbacks\": [handler], \"tags\": [\"foo\", \"bar\"]})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next steps\n",
"\n",
"You've seen how to emit custom events, you can check out the more in depth guide for [astream events](/docs/how_to/streaming/#using-stream-events) which is the easiest way to leverage custom events."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -63,7 +63,7 @@
"outputs": [],
"source": [
"# <!-- ruff: noqa: F821 -->\n",
"from langchain.globals import set_llm_cache"
"from langchain_core.globals import set_llm_cache"
]
},
{
@@ -103,7 +103,7 @@
],
"source": [
"%%time\n",
"from langchain.cache import InMemoryCache\n",
"from langchain_core.caches import InMemoryCache\n",
"\n",
"set_llm_cache(InMemoryCache())\n",
"\n",

View File

@@ -0,0 +1,146 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "dcf87b32",
"metadata": {},
"source": [
"# How to handle rate limits\n",
"\n",
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LLMs](/docs/concepts/#llms)\n",
":::\n",
"\n",
"\n",
"You may find yourself in a situation where you are getting rate limited by the model provider API because you're making too many requests.\n",
"\n",
"For example, this might happen if you are running many parallel queries to benchmark the chat model on a test dataset.\n",
"\n",
"If you are facing such a situation, you can use a rate limiter to help match the rate at which you're making request to the rate allowed\n",
"by the API.\n",
"\n",
":::info Requires ``langchain-core >= 0.2.24``\n",
"\n",
"This functionality was added in ``langchain-core == 0.2.24``. Please make sure your package is up to date.\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "cbc3c873-6109-4e03-b775-b73c1003faea",
"metadata": {},
"source": [
"## Initialize a rate limiter\n",
"\n",
"Langchain comes with a built-in in memory rate limiter. This rate limiter is thread safe and can be shared by multiple threads in the same process.\n",
"\n",
"The provided rate limiter can only limit the number of requests per unit time. It will not help if you need to also limited based on the size\n",
"of the requests."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "aa9c3c8c-0464-4190-a8c5-d69d173505a6",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.rate_limiters import InMemoryRateLimiter\n",
"\n",
"rate_limiter = InMemoryRateLimiter(\n",
" requests_per_second=0.1, # <-- Super slow! We can only make a request once every 10 seconds!!\n",
" check_every_n_seconds=0.1, # Wake up every 100 ms to check whether allowed to make a request,\n",
" max_bucket_size=10, # Controls the maximum burst size.\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8e058bde-9413-4b08-8cc6-0c9cb638f19f",
"metadata": {},
"source": [
"## Choose a model\n",
"\n",
"Choose any model and pass to it the rate_limiter via the `rate_limiter` attribute."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0f880a3a-c047-4e94-a323-fff2a4c0e96d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import time\n",
"from getpass import getpass\n",
"\n",
"if \"ANTHROPIC_API_KEY\" not in os.environ:\n",
" os.environ[\"ANTHROPIC_API_KEY\"] = getpass()\n",
"\n",
"\n",
"from langchain_anthropic import ChatAnthropic\n",
"\n",
"model = ChatAnthropic(model_name=\"claude-3-opus-20240229\", rate_limiter=rate_limiter)"
]
},
{
"cell_type": "markdown",
"id": "80c9ab3a-299a-460f-985c-90280a046f52",
"metadata": {},
"source": [
"Let's confirm that the rate limiter works. We should only be able to invoke the model once per 10 seconds."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d074265c-9f32-4c5f-b914-944148993c4d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"11.599073648452759\n",
"10.7502121925354\n",
"10.244257926940918\n",
"8.83088755607605\n",
"11.645203590393066\n"
]
}
],
"source": [
"for _ in range(5):\n",
" tic = time.time()\n",
" model.invoke(\"hello\")\n",
" toc = time.time()\n",
" print(toc - tic)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -15,6 +15,12 @@
"\n",
"Make sure you have the integration packages installed for any model providers you want to support. E.g. you should have `langchain-openai` installed to init an OpenAI model.\n",
"\n",
":::\n",
"\n",
":::info Requires ``langchain >= 0.2.8``\n",
"\n",
"This functionality was added in ``langchain-core == 0.2.8``. Please make sure your package is up to date.\n",
"\n",
":::"
]
},
@@ -25,7 +31,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain langchain-openai langchain-anthropic langchain-google-vertexai"
"%pip install -qU langchain>=0.2.8 langchain-openai langchain-anthropic langchain-google-vertexai"
]
},
{
@@ -76,32 +82,6 @@
"print(\"Gemini 1.5: \" + gemini_15.invoke(\"what's your name\").content + \"\\n\")"
]
},
{
"cell_type": "markdown",
"id": "fff9a4c8-b6ee-4a1a-8d3d-0ecaa312d4ed",
"metadata": {},
"source": [
"## Simple config example"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "75c25d39-bf47-4b51-a6c6-64d9c572bfd6",
"metadata": {},
"outputs": [],
"source": [
"user_config = {\n",
" \"model\": \"...user-specified...\",\n",
" \"model_provider\": \"...user-specified...\",\n",
" \"temperature\": 0,\n",
" \"max_tokens\": 1000,\n",
"}\n",
"\n",
"llm = init_chat_model(**user_config)\n",
"llm.invoke(\"what's your name\")"
]
},
{
"cell_type": "markdown",
"id": "f811f219-5e78-4b62-b495-915d52a22532",
@@ -125,12 +105,215 @@
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "da07b5c0-d2e6-42e4-bfcd-2efcfaae6221",
"cell_type": "markdown",
"id": "476a44db-c50d-4846-951d-0f1c9ba8bbaa",
"metadata": {},
"outputs": [],
"source": []
"source": [
"## Creating a configurable model\n",
"\n",
"You can also create a runtime-configurable model by specifying `configurable_fields`. If you don't specify a `model` value, then \"model\" and \"model_provider\" be configurable by default."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6c037f27-12d7-4e83-811e-4245c0e3ba58",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"I'm an AI language model created by OpenAI, and I don't have a personal name. You can call me Assistant or any other name you prefer! How can I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 11, 'total_tokens': 48}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_d576307f90', 'finish_reason': 'stop', 'logprobs': None}, id='run-5428ab5c-b5c0-46de-9946-5d4ca40dbdc8-0', usage_metadata={'input_tokens': 11, 'output_tokens': 37, 'total_tokens': 48})"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"configurable_model = init_chat_model(temperature=0)\n",
"\n",
"configurable_model.invoke(\n",
" \"what's your name\", config={\"configurable\": {\"model\": \"gpt-4o\"}}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "321e3036-abd2-4e1f-bcc6-606efd036954",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", response_metadata={'id': 'msg_012XvotUJ3kGLXJUWKBVxJUi', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-1ad1eefe-f1c6-4244-8bc6-90e2cb7ee554-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"configurable_model.invoke(\n",
" \"what's your name\", config={\"configurable\": {\"model\": \"claude-3-5-sonnet-20240620\"}}\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7f3b3d4a-4066-45e4-8297-ea81ac8e70b7",
"metadata": {},
"source": [
"### Configurable model with default values\n",
"\n",
"We can create a configurable model with default model values, specify which parameters are configurable, and add prefixes to configurable params:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "814a2289-d0db-401e-b555-d5116112b413",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"I'm an AI language model created by OpenAI, and I don't have a personal name. You can call me Assistant or any other name you prefer! How can I assist you today?\", response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 11, 'total_tokens': 48}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_ce0793330f', 'finish_reason': 'stop', 'logprobs': None}, id='run-3923e328-7715-4cd6-b215-98e4b6bf7c9d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 37, 'total_tokens': 48})"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"first_llm = init_chat_model(\n",
" model=\"gpt-4o\",\n",
" temperature=0,\n",
" configurable_fields=(\"model\", \"model_provider\", \"temperature\", \"max_tokens\"),\n",
" config_prefix=\"first\", # useful when you have a chain with multiple models\n",
")\n",
"\n",
"first_llm.invoke(\"what's your name\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "6c8755ba-c001-4f5a-a497-be3f1db83244",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"My name is Claude. It's nice to meet you!\", response_metadata={'id': 'msg_01RyYR64DoMPNCfHeNnroMXm', 'model': 'claude-3-5-sonnet-20240620', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 11, 'output_tokens': 15}}, id='run-22446159-3723-43e6-88df-b84797e7751d-0', usage_metadata={'input_tokens': 11, 'output_tokens': 15, 'total_tokens': 26})"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"first_llm.invoke(\n",
" \"what's your name\",\n",
" config={\n",
" \"configurable\": {\n",
" \"first_model\": \"claude-3-5-sonnet-20240620\",\n",
" \"first_temperature\": 0.5,\n",
" \"first_max_tokens\": 100,\n",
" }\n",
" },\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0072b1a3-7e44-4b4e-8b07-efe1ba91a689",
"metadata": {},
"source": [
"### Using a configurable model declaratively\n",
"\n",
"We can call declarative operations like `bind_tools`, `with_structured_output`, `with_configurable`, etc. on a configurable model and chain a configurable model in the same way that we would a regularly instantiated chat model object."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "067dabee-1050-4110-ae24-c48eba01e13b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'GetPopulation',\n",
" 'args': {'location': 'Los Angeles, CA'},\n",
" 'id': 'call_sYT3PFMufHGWJD32Hi2CTNUP'},\n",
" {'name': 'GetPopulation',\n",
" 'args': {'location': 'New York, NY'},\n",
" 'id': 'call_j1qjhxRnD3ffQmRyqjlI1Lnk'}]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class GetWeather(BaseModel):\n",
" \"\"\"Get the current weather in a given location\"\"\"\n",
"\n",
" location: str = Field(..., description=\"The city and state, e.g. San Francisco, CA\")\n",
"\n",
"\n",
"class GetPopulation(BaseModel):\n",
" \"\"\"Get the current population in a given location\"\"\"\n",
"\n",
" location: str = Field(..., description=\"The city and state, e.g. San Francisco, CA\")\n",
"\n",
"\n",
"llm = init_chat_model(temperature=0)\n",
"llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])\n",
"\n",
"llm_with_tools.invoke(\n",
" \"what's bigger in 2024 LA or NYC\", config={\"configurable\": {\"model\": \"gpt-4o\"}}\n",
").tool_calls"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e57dfe9f-cd24-4e37-9ce9-ccf8daf78f89",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'name': 'GetPopulation',\n",
" 'args': {'location': 'Los Angeles, CA'},\n",
" 'id': 'toolu_01CxEHxKtVbLBrvzFS7GQ5xR'},\n",
" {'name': 'GetPopulation',\n",
" 'args': {'location': 'New York City, NY'},\n",
" 'id': 'toolu_013A79qt5toWSsKunFBDZd5S'}]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm_with_tools.invoke(\n",
" \"what's bigger in 2024 LA or NYC\",\n",
" config={\"configurable\": {\"model\": \"claude-3-5-sonnet-20240620\"}},\n",
").tool_calls"
]
}
],
"metadata": {
@@ -149,7 +332,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.9"
}
},
"nbformat": 4,

View File

@@ -16,7 +16,7 @@
"\n",
"Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
"\n",
"This guide requires `langchain-openai >= 0.1.8`."
"This guide requires `langchain-openai >= 0.1.9`."
]
},
{
@@ -153,7 +153,7 @@
"\n",
"#### OpenAI\n",
"\n",
"For example, OpenAI will return a message [chunk](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessageChunk.html) at the end of a stream with token usage information. This behavior is supported by `langchain-openai >= 0.1.8` and can be enabled by setting `stream_usage=True`. This attribute can also be set when `ChatOpenAI` is instantiated.\n",
"For example, OpenAI will return a message [chunk](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.ai.AIMessageChunk.html) at the end of a stream with token usage information. This behavior is supported by `langchain-openai >= 0.1.9` and can be enabled by setting `stream_usage=True`. This attribute can also be set when `ChatOpenAI` is instantiated.\n",
"\n",
"```{=mdx}\n",
":::note\n",

View File

@@ -54,7 +54,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9e4144de-d925-4d4c-91c3-685ef8baa57c",
"id": "2bb9c73f-9d00-4a19-a81f-cab2f0fd921a",
"metadata": {},
"outputs": [],
"source": [
@@ -63,7 +63,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 4,
"id": "a9e37aa1",
"metadata": {},
"outputs": [],
@@ -300,7 +300,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 2,
"id": "ac9295d3",
"metadata": {},
"outputs": [],
@@ -312,10 +312,8 @@
"\n",
"## Quick Install\n",
"\n",
"```bash\n",
"# Hopefully this code block isn't split\n",
"pip install langchain\n",
"```\n",
"\n",
"As an open-source project in a rapidly developing field, we are extremely open to contributions.\n",
"\"\"\""
@@ -323,7 +321,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 3,
"id": "3a0cb17a",
"metadata": {},
"outputs": [
@@ -332,15 +330,14 @@
"text/plain": [
"[Document(page_content='# 🦜️🔗 LangChain'),\n",
" Document(page_content='⚡ Building applications with LLMs through composability ⚡'),\n",
" Document(page_content='## Quick Install\\n\\n```bash'),\n",
" Document(page_content='## Quick Install'),\n",
" Document(page_content=\"# Hopefully this code block isn't split\"),\n",
" Document(page_content='pip install langchain'),\n",
" Document(page_content='```'),\n",
" Document(page_content='As an open-source project in a rapidly developing field, we'),\n",
" Document(page_content='are extremely open to contributions.')]"
]
},
"execution_count": 9,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -721,8 +718,44 @@
"php_splitter = RecursiveCharacterTextSplitter.from_language(\n",
" language=Language.PHP, chunk_size=50, chunk_overlap=0\n",
")\n",
"haskell_docs = php_splitter.create_documents([PHP_CODE])\n",
"haskell_docs"
"php_docs = php_splitter.create_documents([PHP_CODE])\n",
"php_docs"
]
},
{
"cell_type": "markdown",
"id": "e9fa62c1",
"metadata": {},
"source": [
"## PowerShell\n",
"Here's an example using the PowerShell text splitter:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e6893ad",
"metadata": {},
"outputs": [],
"source": [
"POWERSHELL_CODE = \"\"\"\n",
"$directoryPath = Get-Location\n",
"\n",
"$items = Get-ChildItem -Path $directoryPath\n",
"\n",
"$files = $items | Where-Object { -not $_.PSIsContainer }\n",
"\n",
"$sortedFiles = $files | Sort-Object LastWriteTime\n",
"\n",
"foreach ($file in $sortedFiles) {\n",
" Write-Output (\"Name: \" + $file.Name + \" | Last Write Time: \" + $file.LastWriteTime)\n",
"}\n",
"\"\"\"\n",
"powershell_splitter = RecursiveCharacterTextSplitter.from_language(\n",
" language=Language.POWERSHELL, chunk_size=100, chunk_overlap=0\n",
")\n",
"powershell_docs = powershell_splitter.create_documents([POWERSHELL_CODE])\n",
"powershell_docs"
]
}
],

View File

@@ -48,20 +48,10 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "40ed76a2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33mWARNING: You are using pip version 22.0.4; however, version 24.0 is available.\n",
"You should consider upgrading via the '/Users/jacoblee/.pyenv/versions/3.10.5/bin/python -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
"\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
]
}
],
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-openai\n",
"\n",
@@ -419,7 +409,7 @@
" # When configuring the end runnable, we can then use this id to configure this field\n",
" ConfigurableField(id=\"prompt\"),\n",
" # This sets a default_key.\n",
" # If we specify this key, the default LLM (ChatAnthropic initialized above) will be used\n",
" # If we specify this key, the default prompt (asking for a joke, as initialized above) will be used\n",
" default_key=\"joke\",\n",
" # This adds a new option, with name `poem`\n",
" poem=PromptTemplate.from_template(\"Write a short poem about {topic}\"),\n",
@@ -504,7 +494,7 @@
" # When configuring the end runnable, we can then use this id to configure this field\n",
" ConfigurableField(id=\"prompt\"),\n",
" # This sets a default_key.\n",
" # If we specify this key, the default LLM (ChatAnthropic initialized above) will be used\n",
" # If we specify this key, the default prompt (asking for a joke, as initialized above) will be used\n",
" default_key=\"joke\",\n",
" # This adds a new option, with name `poem`\n",
" poem=PromptTemplate.from_template(\"Write a short poem about {topic}\"),\n",

View File

@@ -220,6 +220,57 @@
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "markdown",
"id": "14002ec8-7ee5-4f91-9315-dd21c3808776",
"metadata": {},
"source": [
"### `LLMListwiseRerank`\n",
"\n",
"[LLMListwiseRerank](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.document_compressors.listwise_rerank.LLMListwiseRerank.html) uses [zero-shot listwise document reranking](https://arxiv.org/pdf/2305.02156) and functions similarly to `LLMChainFilter` as a robust but more expensive option. It is recommended to use a more powerful LLM.\n",
"\n",
"Note that `LLMListwiseRerank` requires a model with the [with_structured_output](/docs/integrations/chat/) method implemented."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4ab9ee9f-917e-4d6f-9344-eb7f01533228",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Document 1:\n",
"\n",
"Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while youre at it, pass the Disclose Act so Americans can know who is funding our elections. \n",
"\n",
"Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n",
"\n",
"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n",
"\n",
"And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nations top legal minds, who will continue Justice Breyers legacy of excellence.\n"
]
}
],
"source": [
"from langchain.retrievers.document_compressors import LLMListwiseRerank\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n",
"\n",
"_filter = LLMListwiseRerank.from_llm(llm, top_n=1)\n",
"compression_retriever = ContextualCompressionRetriever(\n",
" base_compressor=_filter, base_retriever=retriever\n",
")\n",
"\n",
"compressed_docs = compression_retriever.invoke(\n",
" \"What did the president say about Ketanji Jackson Brown\"\n",
")\n",
"pretty_print_docs(compressed_docs)"
]
},
{
"cell_type": "markdown",
"id": "7194da42",
@@ -295,7 +346,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "617a1756",
"metadata": {},
"outputs": [],

View File

@@ -5,7 +5,7 @@
"id": "9a8bceb3-95bd-4496-bb9e-57655136e070",
"metadata": {},
"source": [
"# How to use Runnables as Tools\n",
"# How to convert Runnables as Tools\n",
"\n",
":::info Prerequisites\n",
"\n",
@@ -180,7 +180,7 @@
"id": "32b1a992-8997-4c98-8eb2-c9fe9431b799",
"metadata": {},
"source": [
"Alternatively, we can add typing information via [Runnable.with_types](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.with_types):"
"Alternatively, the schema can be fully specified by directly passing the desired [args_schema](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.BaseTool.html#langchain_core.tools.BaseTool.args_schema) for the tool:"
]
},
{
@@ -190,10 +190,18 @@
"metadata": {},
"outputs": [],
"source": [
"as_tool = runnable.with_types(input_type=Args).as_tool(\n",
" name=\"My tool\",\n",
" description=\"Explanation of when to use tool.\",\n",
")"
"from langchain_core.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class GSchema(BaseModel):\n",
" \"\"\"Apply a function to an integer and list of integers.\"\"\"\n",
"\n",
" a: int = Field(..., description=\"Integer\")\n",
" b: List[int] = Field(..., description=\"List of ints\")\n",
"\n",
"\n",
"runnable = RunnableLambda(g)\n",
"as_tool = runnable.as_tool(GSchema)"
]
},
{
@@ -259,9 +267,9 @@
"We first instantiate a chat model that supports [tool calling](/docs/how_to/tool_calling/):\n",
"\n",
"```{=mdx}\n",
"<ChatModelTabs\n",
" customVarName=\"llm\"\n",
"/>\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs customVarName=\"llm\" />\n",
"```"
]
},

View File

@@ -131,7 +131,7 @@
"source": [
"## Base Chat Model\n",
"\n",
"Let's implement a chat model that echoes back the first `n` characetrs of the last message in the prompt!\n",
"Let's implement a chat model that echoes back the first `n` characters of the last message in the prompt!\n",
"\n",
"To do so, we will inherit from `BaseChatModel` and we'll need to implement the following:\n",
"\n",

View File

@@ -5,7 +5,7 @@
"id": "5436020b",
"metadata": {},
"source": [
"# How to create custom tools\n",
"# How to create tools\n",
"\n",
"When constructing an agent, you will need to provide it with a list of `Tool`s that it can use. Besides the actual function that is called, the Tool consists of several components:\n",
"\n",
@@ -16,13 +16,15 @@
"| args_schema | Pydantic BaseModel | Optional but recommended, can be used to provide more information (e.g., few-shot examples) or validation for expected parameters |\n",
"| return_direct | boolean | Only relevant for agents. When True, after invoking the given tool, the agent will stop and return the result direcly to the user. |\n",
"\n",
"LangChain provides 3 ways to create tools:\n",
"LangChain supports the creation of tools from:\n",
"\n",
"1. Using [@tool decorator](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.tool.html#langchain_core.tools.tool) -- the simplest way to define a custom tool.\n",
"2. Using [StructuredTool.from_function](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method -- this is similar to the `@tool` decorator, but allows more configuration and specification of both sync and async implementations.\n",
"1. Functions;\n",
"2. LangChain [Runnables](/docs/concepts#runnable-interface);\n",
"3. By sub-classing from [BaseTool](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.BaseTool.html) -- This is the most flexible method, it provides the largest degree of control, at the expense of more effort and code.\n",
"\n",
"The `@tool` or the `StructuredTool.from_function` class method should be sufficient for most use cases.\n",
"Creating tools from functions may be sufficient for most use cases, and can be done via a simple [@tool decorator](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.tool.html#langchain_core.tools.tool). If more configuration is needed-- e.g., specification of both sync and async implementations-- one can also use the [StructuredTool.from_function](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method.\n",
"\n",
"In this guide we provide an overview of these methods.\n",
"\n",
":::{.callout-tip}\n",
"\n",
@@ -35,7 +37,9 @@
"id": "c7326b23",
"metadata": {},
"source": [
"## @tool decorator\n",
"## Creating tools from functions\n",
"\n",
"### @tool decorator\n",
"\n",
"This `@tool` decorator is the simplest way to define a custom tool. The decorator uses the function name as the tool name by default, but this can be overridden by passing a string as the first argument. Additionally, the decorator will use the function's docstring as the tool's description - so a docstring MUST be provided. "
]
@@ -51,7 +55,7 @@
"output_type": "stream",
"text": [
"multiply\n",
"multiply(a: int, b: int) -> int - Multiply two numbers.\n",
"Multiply two numbers.\n",
"{'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}\n"
]
}
@@ -96,6 +100,57 @@
" return a * b"
]
},
{
"cell_type": "markdown",
"id": "8f0edc51-c586-414c-8941-c8abe779943f",
"metadata": {},
"source": [
"Note that `@tool` supports parsing of annotations, nested schemas, and other features:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5626423f-053e-4a66-adca-1d794d835397",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'title': 'multiply_by_maxSchema',\n",
" 'description': 'Multiply a by the maximum of b.',\n",
" 'type': 'object',\n",
" 'properties': {'a': {'title': 'A',\n",
" 'description': 'scale factor',\n",
" 'type': 'string'},\n",
" 'b': {'title': 'B',\n",
" 'description': 'list of ints over which to take maximum',\n",
" 'type': 'array',\n",
" 'items': {'type': 'integer'}}},\n",
" 'required': ['a', 'b']}"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from typing import Annotated, List\n",
"\n",
"\n",
"@tool\n",
"def multiply_by_max(\n",
" a: Annotated[str, \"scale factor\"],\n",
" b: Annotated[List[int], \"list of ints over which to take maximum\"],\n",
") -> int:\n",
" \"\"\"Multiply a by the maximum of b.\"\"\"\n",
" return a * max(b)\n",
"\n",
"\n",
"multiply_by_max.args_schema.schema()"
]
},
{
"cell_type": "markdown",
"id": "98d6eee9",
@@ -106,7 +161,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "9216d03a-f6ea-4216-b7e1-0661823a4c0b",
"metadata": {},
"outputs": [
@@ -115,7 +170,7 @@
"output_type": "stream",
"text": [
"multiplication-tool\n",
"multiplication-tool(a: int, b: int) -> int - Multiply two numbers.\n",
"Multiply two numbers.\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n",
"True\n"
]
@@ -145,17 +200,82 @@
},
{
"cell_type": "markdown",
"id": "b63fcc3b",
"id": "33a9e94d-0b60-48f3-a4c2-247dce096e66",
"metadata": {},
"source": [
"## StructuredTool\n",
"\n",
"The `StrurcturedTool.from_function` class method provides a bit more configurability than the `@tool` decorator, without requiring much additional code."
"#### Docstring parsing"
]
},
{
"cell_type": "markdown",
"id": "6d0cb586-93d4-4ff1-9779-71df7853cb68",
"metadata": {},
"source": [
"`@tool` can optionally parse [Google Style docstrings](https://google.github.io/styleguide/pyguide.html#383-functions-and-methods) and associate the docstring components (such as arg descriptions) to the relevant parts of the tool schema. To toggle this behavior, specify `parse_docstring`:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "336f5538-956e-47d5-9bde-b732559f9e61",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'title': 'fooSchema',\n",
" 'description': 'The foo.',\n",
" 'type': 'object',\n",
" 'properties': {'bar': {'title': 'Bar',\n",
" 'description': 'The bar.',\n",
" 'type': 'string'},\n",
" 'baz': {'title': 'Baz', 'description': 'The baz.', 'type': 'integer'}},\n",
" 'required': ['bar', 'baz']}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@tool(parse_docstring=True)\n",
"def foo(bar: str, baz: int) -> str:\n",
" \"\"\"The foo.\n",
"\n",
" Args:\n",
" bar: The bar.\n",
" baz: The baz.\n",
" \"\"\"\n",
" return bar\n",
"\n",
"\n",
"foo.args_schema.schema()"
]
},
{
"cell_type": "markdown",
"id": "f18a2503-5393-421b-99fa-4a01dd824d0e",
"metadata": {},
"source": [
":::{.callout-caution}\n",
"By default, `@tool(parse_docstring=True)` will raise `ValueError` if the docstring does not parse correctly. See [API Reference](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.tool.html) for detail and examples.\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "b63fcc3b",
"metadata": {},
"source": [
"### StructuredTool\n",
"\n",
"The `StructuredTool.from_function` class method provides a bit more configurability than the `@tool` decorator, without requiring much additional code."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "564fbe6f-11df-402d-b135-ef6ff25e1e63",
"metadata": {},
"outputs": [
@@ -198,7 +318,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 7,
"id": "6bc055d4-1fbe-4db5-8881-9c382eba6b1b",
"metadata": {},
"outputs": [
@@ -208,7 +328,7 @@
"text": [
"6\n",
"Calculator\n",
"Calculator(a: int, b: int) -> int - multiply numbers\n",
"multiply numbers\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n"
]
}
@@ -239,6 +359,63 @@
"print(calculator.args)"
]
},
{
"cell_type": "markdown",
"id": "5517995d-54e3-449b-8fdb-03561f5e4647",
"metadata": {},
"source": [
"## Creating tools from Runnables\n",
"\n",
"LangChain [Runnables](/docs/concepts#runnable-interface) that accept string or `dict` input can be converted to tools using the [as_tool](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.as_tool) method, which allows for the specification of names, descriptions, and additional schema information for arguments.\n",
"\n",
"Example usage:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8ef593c5-cf72-4c10-bfc9-7d21874a0c24",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'answer_style': {'title': 'Answer Style', 'type': 'string'}}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.language_models import GenericFakeChatModel\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [(\"human\", \"Hello. Please respond in the style of {answer_style}.\")]\n",
")\n",
"\n",
"# Placeholder LLM\n",
"llm = GenericFakeChatModel(messages=iter([\"hello matey\"]))\n",
"\n",
"chain = prompt | llm | StrOutputParser()\n",
"\n",
"as_tool = chain.as_tool(\n",
" name=\"Style responder\", description=\"Description of when to use tool.\"\n",
")\n",
"as_tool.args"
]
},
{
"cell_type": "markdown",
"id": "0521b787-a146-45a6-8ace-ae1ac4669dd7",
"metadata": {},
"source": [
"See [this guide](/docs/how_to/convert_runnable_to_tool) for more detail."
]
},
{
"cell_type": "markdown",
"id": "b840074b-9c10-4ca0-aed8-626c52b2398f",
@@ -251,7 +428,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 10,
"id": "1dad8f8e",
"metadata": {},
"outputs": [],
@@ -300,7 +477,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 11,
"id": "bb551c33",
"metadata": {},
"outputs": [
@@ -351,7 +528,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 12,
"id": "6615cb77-fd4c-4676-8965-f92cc71d4944",
"metadata": {},
"outputs": [
@@ -383,7 +560,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 13,
"id": "bb2af583-eadd-41f4-a645-bf8748bd3dcd",
"metadata": {},
"outputs": [
@@ -428,7 +605,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 14,
"id": "4ad0932c-8610-4278-8c57-f9218f654c8a",
"metadata": {},
"outputs": [
@@ -473,7 +650,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 15,
"id": "7094c0e8-6192-4870-a942-aad5b5ae48fd",
"metadata": {},
"outputs": [],
@@ -496,7 +673,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 16,
"id": "b4d22022-b105-4ccc-a15b-412cb9ea3097",
"metadata": {},
"outputs": [
@@ -506,7 +683,7 @@
"'Error: There is no city by the name of foobar.'"
]
},
"execution_count": 12,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
@@ -530,7 +707,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 17,
"id": "3fad1728-d367-4e1b-9b54-3172981271cf",
"metadata": {},
"outputs": [
@@ -540,7 +717,7 @@
"\"There is no such city, but it's probably above 0K there!\""
]
},
"execution_count": 13,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -564,7 +741,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 18,
"id": "ebfe7c1f-318d-4e58-99e1-f31e69473c46",
"metadata": {},
"outputs": [
@@ -574,7 +751,7 @@
"'The following errors occurred during tool execution: `Error: There is no city by the name of foobar.`'"
]
},
"execution_count": 14,
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
@@ -591,13 +768,189 @@
"\n",
"get_weather_tool.invoke({\"city\": \"foobar\"})"
]
},
{
"cell_type": "markdown",
"id": "1a8d8383-11b3-445e-956f-df4e96995e00",
"metadata": {},
"source": [
"## Returning artifacts of Tool execution\n",
"\n",
"Sometimes there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns custom objects like Documents, we may want to pass some view or metadata about this output to the model without passing the raw output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n",
"\n",
"The Tool and [ToolMessage](https://api.python.langchain.com/en/latest/messages/langchain_core.messages.tool.ToolMessage.html) interfaces make it possible to distinguish between the parts of the tool output meant for the model (this is the ToolMessage.content) and those parts which are meant for use outside the model (ToolMessage.artifact).\n",
"\n",
":::info Requires ``langchain-core >= 0.2.19``\n",
"\n",
"This functionality was added in ``langchain-core == 0.2.19``. Please make sure your package is up to date.\n",
"\n",
":::\n",
"\n",
"If we want our tool to distinguish between message content and other artifacts, we need to specify `response_format=\"content_and_artifact\"` when defining our tool and make sure that we return a tuple of (content, artifact):"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "14905425-0334-43a0-9de9-5bcf622ede0e",
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"from typing import List, Tuple\n",
"\n",
"from langchain_core.tools import tool\n",
"\n",
"\n",
"@tool(response_format=\"content_and_artifact\")\n",
"def generate_random_ints(min: int, max: int, size: int) -> Tuple[str, List[int]]:\n",
" \"\"\"Generate size random ints in the range [min, max].\"\"\"\n",
" array = [random.randint(min, max) for _ in range(size)]\n",
" content = f\"Successfully generated array of {size} random ints in [{min}, {max}].\"\n",
" return content, array"
]
},
{
"cell_type": "markdown",
"id": "49f057a6-8938-43ea-8faf-ae41e797ceb8",
"metadata": {},
"source": [
"If we invoke our tool directly with the tool arguments, we'll get back just the content part of the output:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "0f2e1528-404b-46e6-b87c-f0957c4b9217",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Successfully generated array of 10 random ints in [0, 9].'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generate_random_ints.invoke({\"min\": 0, \"max\": 9, \"size\": 10})"
]
},
{
"cell_type": "markdown",
"id": "1e62ebba-1737-4b97-b61a-7313ade4e8c2",
"metadata": {},
"source": [
"If we invoke our tool with a ToolCall (like the ones generated by tool-calling models), we'll get back a ToolMessage that contains both the content and artifact generated by the Tool:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "cc197777-26eb-46b3-a83b-c2ce116c6311",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ToolMessage(content='Successfully generated array of 10 random ints in [0, 9].', name='generate_random_ints', tool_call_id='123', artifact=[1, 4, 2, 5, 3, 9, 0, 4, 7, 7])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generate_random_ints.invoke(\n",
" {\n",
" \"name\": \"generate_random_ints\",\n",
" \"args\": {\"min\": 0, \"max\": 9, \"size\": 10},\n",
" \"id\": \"123\", # required\n",
" \"type\": \"tool_call\", # required\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "dfdc1040-bf25-4790-b4c3-59452db84e11",
"metadata": {},
"source": [
"We can do the same when subclassing BaseTool:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "fe1a09d1-378b-4b91-bb5e-0697c3d7eb92",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.tools import BaseTool\n",
"\n",
"\n",
"class GenerateRandomFloats(BaseTool):\n",
" name: str = \"generate_random_floats\"\n",
" description: str = \"Generate size random floats in the range [min, max].\"\n",
" response_format: str = \"content_and_artifact\"\n",
"\n",
" ndigits: int = 2\n",
"\n",
" def _run(self, min: float, max: float, size: int) -> Tuple[str, List[float]]:\n",
" range_ = max - min\n",
" array = [\n",
" round(min + (range_ * random.random()), ndigits=self.ndigits)\n",
" for _ in range(size)\n",
" ]\n",
" content = f\"Generated {size} floats in [{min}, {max}], rounded to {self.ndigits} decimals.\"\n",
" return content, array\n",
"\n",
" # Optionally define an equivalent async method\n",
"\n",
" # async def _arun(self, min: float, max: float, size: int) -> Tuple[str, List[float]]:\n",
" # ..."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "8c3d16f6-1c4a-48ab-b05a-38547c592e79",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ToolMessage(content='Generated 3 floats in [0.1, 3.3333], rounded to 4 decimals.', name='generate_random_floats', tool_call_id='123', artifact=[1.4277, 0.7578, 2.4871])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rand_gen = GenerateRandomFloats(ndigits=4)\n",
"\n",
"rand_gen.invoke(\n",
" {\n",
" \"name\": \"generate_random_floats\",\n",
" \"args\": {\"min\": 0.1, \"max\": 3.3333, \"size\": 3},\n",
" \"id\": \"123\",\n",
" \"type\": \"tool_call\",\n",
" }\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "poetry-venv-311",
"language": "python",
"name": "python3"
"name": "poetry-venv-311"
},
"language_info": {
"codemirror_mode": {
@@ -609,7 +962,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.11.9"
},
"vscode": {
"interpreter": {

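For downstream access, the `artifact` lands on the resulting `ToolMessage`, so later steps can read it without the model ever seeing it. A minimal sketch using the tool defined above (the variable names here are illustrative, not from the notebook):

```python
# Invoke with a ToolCall so we get a ToolMessage back
msg = generate_random_ints.invoke(
    {
        "name": "generate_random_ints",
        "args": {"min": 0, "max": 9, "size": 10},
        "id": "123",
        "type": "tool_call",
    }
)

model_view = msg.content    # what the model sees
raw_numbers = msg.artifact  # full output, available to downstream components
```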
View File

@@ -63,7 +63,7 @@
"* The `load` methods is a convenience method meant solely for prototyping work -- it just invokes `list(self.lazy_load())`.\n",
"* The `alazy_load` has a default implementation that will delegate to `lazy_load`. If you're using async, we recommend overriding the default implementation and providing a native async implementation.\n",
"\n",
"::: {.callout-important}\n",
":::{.callout-important}\n",
"When implementing a document loader do **NOT** provide parameters via the `lazy_load` or `alazy_load` methods.\n",
"\n",
"All configuration is expected to be passed through the initializer (__init__). This was a design choice made by LangChain to make sure that once a document loader has been instantiated it has all the information needed to load documents.\n",
@@ -235,7 +235,7 @@
"id": "56cb443e-f987-4386-b4ec-975ee129adb2",
"metadata": {},
"source": [
"::: {.callout-tip}\n",
":::{.callout-tip}\n",
"\n",
"`load()` can be helpful in an interactive environment such as a jupyter notebook.\n",
"\n",
@@ -276,7 +276,7 @@
"source": [
"## Working with Files\n",
"\n",
"Many document loaders invovle parsing files. The difference between such loaders usually stems from how the file is parsed rather than how the file is loaded. For example, you can use `open` to read the binary content of either a PDF or a markdown file, but you need different parsing logic to convert that binary data into text.\n",
"Many document loaders involve parsing files. The difference between such loaders usually stems from how the file is parsed, rather than how the file is loaded. For example, you can use `open` to read the binary content of either a PDF or a markdown file, but you need different parsing logic to convert that binary data into text.\n",
"\n",
"As a result, it can be helpful to decouple the parsing logic from the loading logic, which makes it easier to re-use a given parser regardless of how the data was loaded.\n",
"\n",
@@ -355,7 +355,7 @@
"id": "433bfb7c-7767-43bc-b71e-42413d7494a8",
"metadata": {},
"source": [
"Using the **blob** API also allows one to load content direclty from memory without having to read it from a file!"
"Using the **blob** API also allows one to load content directly from memory without having to read it from a file!"
]
},
{
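To illustrate loading from memory, a short sketch (assuming `Blob.from_data` from langchain-core; the payload here is made up):

```python
from langchain_core.document_loaders import Blob

# Build a blob directly from in-memory bytes instead of reading a file
blob = Blob.from_data(b"Some data fetched from an API", mime_type="text/plain")
print(blob.as_string())
```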

View File

@@ -182,7 +182,7 @@ pprint(data)
</CodeOutputBlock>
Another option is set `jq_schema='.'` and provide `content_key`:
Another option is to set `jq_schema='.'` and provide `content_key`:
```python
loader = JSONLoader(

View File

@@ -26,7 +26,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install \"unstructured[md]\""
"%pip install \"unstructured[md]\" nltk"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -67,15 +67,16 @@ If you'd prefer not to set an environment variable you can pass the key in direc
```python
from langchain_cohere import CohereEmbeddings
embeddings_model = CohereEmbeddings(cohere_api_key="...")
embeddings_model = CohereEmbeddings(cohere_api_key="...", model='embed-english-v3.0')
```
Otherwise you can initialize without any params:
Otherwise you can initialize simply as shown below:
```python
from langchain_cohere import CohereEmbeddings
embeddings_model = CohereEmbeddings()
embeddings_model = CohereEmbeddings(model='embed-english-v3.0')
```
Do note that it is mandatory to pass the model parameter while initializing the CohereEmbeddings class.
</TabItem>
<TabItem value="huggingface" label="Hugging Face">

View File

@@ -28,7 +28,7 @@
"\n",
"You can use arbitrary functions as [Runnables](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable). This is useful for formatting or when you need functionality not provided by other LangChain components, and custom functions used as Runnables are called [`RunnableLambdas`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableLambda.html).\n",
"\n",
"Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single dict input and unpacks it into multiple argument.\n",
"Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single dict input and unpacks it into multiple arguments.\n",
"\n",
"This guide will cover:\n",
"\n",

View File

@@ -364,7 +364,10 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.graph_qa.cypher_utils import CypherQueryCorrector, Schema\n",
"from langchain_community.chains.graph_qa.cypher_utils import (\n",
" CypherQueryCorrector,\n",
" Schema,\n",
")\n",
"\n",
"# Cypher validation tool for relationship directions\n",
"corrector_schema = [\n",

View File

@@ -9,11 +9,13 @@
"source": [
"# Hybrid Search\n",
"\n",
"The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, ...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
"The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
"\n",
"**Step 1: Make sure the vectorstore you are using supports hybrid search**\n",
"\n",
"At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have their own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`. By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.\n",
"At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have their own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`.\n",
"\n",
"By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.\n",
"\n",
"**Step 2: Add that parameter as a configurable field for the chain**\n",
"\n",

View File

@@ -31,6 +31,8 @@ This highlights functionality that is core to using LangChain.
[**LCEL cheatsheet**](/docs/how_to/lcel_cheatsheet/): For a quick overview of how to use the main LCEL primitives.
[**Migration guide**](/docs/versions/migrating_chains): For migrating legacy chain abstractions to LCEL.
- [How to: chain runnables](/docs/how_to/sequence)
- [How to: stream runnables](/docs/how_to/streaming)
- [How to: invoke runnables in parallel](/docs/how_to/parallel/)
@@ -43,7 +45,7 @@ This highlights functionality that is core to using LangChain.
- [How to: create a dynamic (self-constructing) chain](/docs/how_to/dynamic_chain/)
- [How to: inspect runnables](/docs/how_to/inspect)
- [How to: add fallbacks to a runnable](/docs/how_to/fallbacks)
- [How to: migrate chains to LCEL](/docs/how_to/migrate_chains)
- [How to: pass runtime secrets to a runnable](/docs/how_to/runnable_runtime_secrets)
## Components
@@ -80,12 +82,13 @@ These are the core building blocks you can use when building applications.
- [How to: stream a response back](/docs/how_to/chat_streaming)
- [How to: track token usage](/docs/how_to/chat_token_usage_tracking)
- [How to: track response metadata across providers](/docs/how_to/response_metadata)
- [How to: let your end users choose their model](/docs/how_to/chat_models_universal_init/)
- [How to: use chat model to call tools](/docs/how_to/tool_calling)
- [How to: stream tool calls](/docs/how_to/tool_streaming)
- [How to: handle rate limits](/docs/how_to/chat_model_rate_limiting)
- [How to: few shot prompt tool behavior](/docs/how_to/tools_few_shot)
- [How to: bind model-specific formated tools](/docs/how_to/tools_model_specific)
- [How to: bind model-specific formatted tools](/docs/how_to/tools_model_specific)
- [How to: force a specific tool call](/docs/how_to/tool_choice)
- [How to: work with local models](/docs/how_to/local_llms)
- [How to: init any model in one line](/docs/how_to/chat_models_universal_init/)
### Messages
@@ -104,7 +107,7 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
- [How to: create a custom LLM class](/docs/how_to/custom_llm)
- [How to: stream a response back](/docs/how_to/streaming_llm)
- [How to: track token usage](/docs/how_to/llm_token_usage_tracking)
- [How to: work with local LLMs](/docs/how_to/local_llms)
- [How to: work with local models](/docs/how_to/local_llms)
### Output parsers
@@ -185,17 +188,21 @@ Indexing is the process of keeping your vectorstore in-sync with the underlying
LangChain [Tools](/docs/concepts/#tools) contain a description of the tool (to pass to the language model) as well as the implementation of the function to call. Refer [here](/docs/integrations/tools/) for a list of pre-built tools.
- [How to: create custom tools](/docs/how_to/custom_tools)
- [How to: use built-in tools and built-in toolkits](/docs/how_to/tools_builtin)
- [How to: convert Runnables to tools](/docs/how_to/convert_runnable_to_tool)
- [How to: use chat model to call tools](/docs/how_to/tool_calling)
- [How to: pass tool results back to model](/docs/how_to/tool_results_pass_to_model)
- [How to: add ad-hoc tool calling capability to LLMs and chat models](/docs/how_to/tools_prompting)
- [How to: create tools](/docs/how_to/custom_tools)
- [How to: use built-in tools and toolkits](/docs/how_to/tools_builtin)
- [How to: use chat models to call tools](/docs/how_to/tool_calling)
- [How to: pass tool outputs to chat models](/docs/how_to/tool_results_pass_to_model)
- [How to: pass run time values to tools](/docs/how_to/tool_runtime)
- [How to: add a human in the loop to tool usage](/docs/how_to/tools_human)
- [How to: handle errors when calling tools](/docs/how_to/tools_error)
- [How to: disable parallel tool calling](/docs/how_to/tool_choice)
- [How to: stream events from within a tool](/docs/how_to/tool_stream_events)
- [How to: add a human-in-the-loop for tools](/docs/how_to/tools_human)
- [How to: handle tool errors](/docs/how_to/tools_error)
- [How to: force models to call a tool](/docs/how_to/tool_choice)
- [How to: disable parallel tool calling](/docs/how_to/tool_calling_parallel)
- [How to: access the `RunnableConfig` from a tool](/docs/how_to/tool_configure)
- [How to: stream events from a tool](/docs/how_to/tool_stream_events)
- [How to: return artifacts from a tool](/docs/how_to/tool_artifacts/)
- [How to: convert Runnables to tools](/docs/how_to/convert_runnable_to_tool)
- [How to: add ad-hoc tool calling capability to models](/docs/how_to/tools_prompting)
- [How to: pass in runtime secrets](/docs/how_to/runnable_runtime_secrets)
### Multimodal
@@ -223,6 +230,7 @@ For in depth how-to guides for agents, please check out [LangGraph](https://lang
- [How to: pass callbacks into a module constructor](/docs/how_to/callbacks_constructor)
- [How to: create custom callback handlers](/docs/how_to/custom_callbacks)
- [How to: use callbacks in async environments](/docs/how_to/callbacks_async)
- [How to: dispatch custom callback events](/docs/how_to/callbacks_custom_events)
### Custom
@@ -235,6 +243,7 @@ All of LangChain components can easily be extended to support your own versions.
- [How to: write a custom output parser class](/docs/how_to/output_parser_custom)
- [How to: create custom callback handlers](/docs/how_to/custom_callbacks)
- [How to: define a custom tool](/docs/how_to/custom_tools)
- [How to: dispatch custom callback events](/docs/how_to/callbacks_custom_events)
### Serialization
- [How to: save and load LangChain objects](/docs/how_to/serialization)
@@ -306,6 +315,15 @@ For a high-level tutorial, check out [this guide](/docs/tutorials/graph/).
- [How to: improve results with prompting](/docs/how_to/graph_prompting)
- [How to: construct knowledge graphs](/docs/how_to/graph_constructing)
### Summarization
LLMs can summarize and otherwise distill desired information from text, including
large volumes of text. For a high-level tutorial, check out [this guide](/docs/tutorials/summarization).
- [How to: summarize text in a single LLM call](/docs/how_to/summarize_stuff)
- [How to: summarize text through parallelization](/docs/how_to/summarize_map_reduce)
- [How to: summarize text through iterative refinement](/docs/how_to/summarize_refine)
## [LangGraph](https://langchain-ai.github.io/langgraph)
LangGraph is an extension of LangChain aimed at

View File

@@ -60,7 +60,7 @@
" * document addition by id (`add_documents` method with `ids` argument)\n",
" * delete by id (`delete` method with `ids` argument)\n",
"\n",
"Compatible Vectorstores: `Aerospike`, `AnalyticDB`, `AstraDB`, `AwaDB`, `AzureCosmosDBNoSqlVectorSearch`, `AzureCosmosDBVectorSearch`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SingleStoreDB`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
"Compatible Vectorstores: `Aerospike`, `AnalyticDB`, `AstraDB`, `AwaDB`, `AzureCosmosDBNoSqlVectorSearch`, `AzureCosmosDBVectorSearch`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MongoDBAtlasVectorSearch`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SingleStoreDB`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
" \n",
"## Caution\n",
"\n",

View File

@@ -15,21 +15,38 @@
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0aa6d335",
"execution_count": null,
"id": "25b0b0fa",
"metadata": {},
"outputs": [],
"source": [
"from langchain.globals import set_llm_cache\n",
"from langchain_openai import OpenAI\n",
"%pip install -qU langchain_openai langchain_community\n",
"\n",
"# To make the caching really obvious, lets use a slower model.\n",
"llm = OpenAI(model_name=\"gpt-3.5-turbo-instruct\", n=2, best_of=2)"
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"# Please manually enter OpenAI Key"
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 2,
"id": "0aa6d335",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.globals import set_llm_cache\n",
"from langchain_openai import OpenAI\n",
"\n",
"# To make the caching really obvious, lets use a slower and older model.\n",
"# Caching supports newer chat models as well.\n",
"llm = OpenAI(model=\"gpt-3.5-turbo-instruct\", n=2, best_of=2)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "f168ff0d",
"metadata": {},
"outputs": [
@@ -37,34 +54,34 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 13.7 ms, sys: 6.54 ms, total: 20.2 ms\n",
"Wall time: 330 ms\n"
"CPU times: user 546 ms, sys: 379 ms, total: 925 ms\n",
"Wall time: 1.11 s\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!\""
"\"\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything!\""
]
},
"execution_count": 12,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"from langchain.cache import InMemoryCache\n",
"from langchain_core.caches import InMemoryCache\n",
"\n",
"set_llm_cache(InMemoryCache())\n",
"\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm.predict(\"Tell me a joke\")"
"llm.invoke(\"Tell me a joke\")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 4,
"id": "ce7620fb",
"metadata": {},
"outputs": [
@@ -72,17 +89,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 436 µs, sys: 921 µs, total: 1.36 ms\n",
"Wall time: 1.36 ms\n"
"CPU times: user 192 µs, sys: 77 µs, total: 269 µs\n",
"Wall time: 270 µs\n"
]
},
{
"data": {
"text/plain": [
"\"\\n\\nWhy couldn't the bicycle stand up by itself? Because it was two-tired!\""
"\"\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything!\""
]
},
"execution_count": 13,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -90,7 +107,7 @@
"source": [
"%%time\n",
"# The second time it is, so it goes faster\n",
"llm.predict(\"Tell me a joke\")"
"llm.invoke(\"Tell me a joke\")"
]
},
{
@@ -103,7 +120,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 5,
"id": "2e65de83",
"metadata": {},
"outputs": [],
@@ -113,7 +130,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 6,
"id": "0be83715",
"metadata": {},
"outputs": [],
@@ -126,7 +143,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 7,
"id": "9b427ce7",
"metadata": {},
"outputs": [
@@ -134,17 +151,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 29.3 ms, sys: 17.3 ms, total: 46.7 ms\n",
"Wall time: 364 ms\n"
"CPU times: user 10.6 ms, sys: 4.21 ms, total: 14.8 ms\n",
"Wall time: 851 ms\n"
]
},
{
"data": {
"text/plain": [
"'\\n\\nWhy did the tomato turn red?\\n\\nBecause it saw the salad dressing!'"
"\"\\n\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything!\""
]
},
"execution_count": 10,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -152,12 +169,12 @@
"source": [
"%%time\n",
"# The first time, it is not yet in cache, so it should take longer\n",
"llm.predict(\"Tell me a joke\")"
"llm.invoke(\"Tell me a joke\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 8,
"id": "87f52611",
"metadata": {},
"outputs": [
@@ -165,17 +182,17 @@
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 4.58 ms, sys: 2.23 ms, total: 6.8 ms\n",
"Wall time: 4.68 ms\n"
"CPU times: user 59.7 ms, sys: 63.6 ms, total: 123 ms\n",
"Wall time: 134 ms\n"
]
},
{
"data": {
"text/plain": [
"'\\n\\nWhy did the tomato turn red?\\n\\nBecause it saw the salad dressing!'"
"\"\\n\\nWhy don't scientists trust atoms?\\n\\nBecause they make up everything!\""
]
},
"execution_count": 11,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
@@ -183,7 +200,7 @@
"source": [
"%%time\n",
"# The second time it is, so it goes faster\n",
"llm.predict(\"Tell me a joke\")"
"llm.invoke(\"Tell me a joke\")"
]
},
{
@@ -211,7 +228,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.10.5"
}
},
"nbformat": 4,

View File

@@ -5,11 +5,11 @@
"id": "b8982428",
"metadata": {},
"source": [
"# Run LLMs locally\n",
"# Run models locally\n",
"\n",
"## Use case\n",
"\n",
"The popularity of projects like [PrivateGPT](https://github.com/imartinez/privateGPT), [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://github.com/ollama/ollama), [GPT4All](https://github.com/nomic-ai/gpt4all), [llamafile](https://github.com/Mozilla-Ocho/llamafile), and others underscore the demand to run LLMs locally (on your own device).\n",
"The popularity of projects like [llama.cpp](https://github.com/ggerganov/llama.cpp), [Ollama](https://github.com/ollama/ollama), [GPT4All](https://github.com/nomic-ai/gpt4all), [llamafile](https://github.com/Mozilla-Ocho/llamafile), and others underscore the demand to run LLMs locally (on your own device).\n",
"\n",
"This has at least two important benefits:\n",
"\n",
@@ -66,6 +66,12 @@
"\n",
"![Image description](../../static/img/llama_t_put.png)\n",
"\n",
"### Formatting prompts\n",
"\n",
"Some providers have [chat model](/docs/concepts/#chat-models) wrappers that takes care of formatting your input prompt for the specific local model you're using. However, if you are prompting local models with a [text-in/text-out LLM](/docs/concepts/#llms) wrapper, you may need to use a prompt tailed for your specific model.\n",
"\n",
"This can [require the inclusion of special tokens](https://huggingface.co/blog/llama2#how-to-prompt-llama-2). [Here's an example for LLaMA 2](https://smith.langchain.com/hub/rlm/rag-prompt-llama).\n",
"\n",
"## Quickstart\n",
"\n",
"[`Ollama`](https://ollama.ai/) is one way to easily run inference on macOS.\n",
@@ -73,10 +79,20 @@
"The instructions [here](https://github.com/jmorganca/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
" \n",
"* [Download and run](https://ollama.ai/download) the app\n",
"* From command line, fetch a model from this [list of options](https://github.com/jmorganca/ollama): e.g., `ollama pull llama2`\n",
"* From command line, fetch a model from this [list of options](https://github.com/jmorganca/ollama): e.g., `ollama pull llama3.1:8b`\n",
"* When the app is running, all models are automatically served on `localhost:11434`\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "29450fc9",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain_ollama"
]
},
{
"cell_type": "code",
"execution_count": 2,
@@ -86,7 +102,7 @@
{
"data": {
"text/plain": [
"' The first man on the moon was Neil Armstrong, who landed on the moon on July 20, 1969 as part of the Apollo 11 mission. obviously.'"
"'...Neil Armstrong!\\n\\nOn July 20, 1969, Neil Armstrong became the first person to set foot on the lunar surface, famously declaring \"That\\'s one small step for man, one giant leap for mankind\" as he stepped off the lunar module Eagle onto the Moon\\'s surface.\\n\\nWould you like to know more about the Apollo 11 mission or Neil Armstrong\\'s achievements?'"
]
},
"execution_count": 2,
@@ -95,51 +111,78 @@
}
],
"source": [
"from langchain_community.llms import Ollama\n",
"from langchain_ollama import OllamaLLM\n",
"\n",
"llm = OllamaLLM(model=\"llama3.1:8b\")\n",
"\n",
"llm = Ollama(model=\"llama2\")\n",
"llm.invoke(\"The first man on the moon was ...\")"
]
},
{
"cell_type": "markdown",
"id": "343ab645",
"id": "674cc672",
"metadata": {},
"source": [
"Stream tokens as they are being generated."
"Stream tokens as they are being generated:"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "9cd83603",
"execution_count": 3,
"id": "1386a852",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" The first man to walk on the moon was Neil Armstrong, an American astronaut who was part of the Apollo 11 mission in 1969. февруари 20, 1969, Armstrong stepped out of the lunar module Eagle and onto the moon's surface, famously declaring \"That's one small step for man, one giant leap for mankind\" as he took his first steps. He was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the moon during the mission."
"...|"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Neil| Armstrong|,| an| American| astronaut|.| He| stepped| out| of| the| lunar| module| Eagle| and| onto| the| surface| of| the| Moon| on| July| |20|,| |196|9|,| famously| declaring|:| \"|That|'s| one| small| step| for| man|,| one| giant| leap| for| mankind|.\"||"
]
}
],
"source": [
"for chunk in llm.stream(\"The first man on the moon was ...\"):\n",
" print(chunk, end=\"|\", flush=True)"
]
},
{
"cell_type": "markdown",
"id": "e5731060",
"metadata": {},
"source": [
"Ollama also includes a chat model wrapper that handles formatting conversation turns:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f14a778a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' The first man to walk on the moon was Neil Armstrong, an American astronaut who was part of the Apollo 11 mission in 1969. февруари 20, 1969, Armstrong stepped out of the lunar module Eagle and onto the moon\\'s surface, famously declaring \"That\\'s one small step for man, one giant leap for mankind\" as he took his first steps. He was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the moon during the mission.'"
"AIMessage(content='The answer is a historic one!\\n\\nThe first man to walk on the Moon was Neil Armstrong, an American astronaut and commander of the Apollo 11 mission. On July 20, 1969, Armstrong stepped out of the lunar module Eagle onto the surface of the Moon, famously declaring:\\n\\n\"That\\'s one small step for man, one giant leap for mankind.\"\\n\\nArmstrong was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the Moon during the mission. Michael Collins remained in orbit around the Moon in the command module Columbia.\\n\\nNeil Armstrong passed away on August 25, 2012, but his legacy as a pioneering astronaut and engineer continues to inspire people around the world!', response_metadata={'model': 'llama3.1:8b', 'created_at': '2024-08-01T00:38:29.176717Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 10681861417, 'load_duration': 34270292, 'prompt_eval_count': 19, 'prompt_eval_duration': 6209448000, 'eval_count': 141, 'eval_duration': 4432022000}, id='run-7bed57c5-7f54-4092-912c-ae49073dcd48-0', usage_metadata={'input_tokens': 19, 'output_tokens': 141, 'total_tokens': 160})"
]
},
"execution_count": 40,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler\n",
"from langchain_ollama import ChatOllama\n",
"\n",
"llm = Ollama(\n",
" model=\"llama2\", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])\n",
")\n",
"llm.invoke(\"The first man on the moon was ...\")"
"chat_model = ChatOllama(model=\"llama3.1:8b\")\n",
"\n",
"chat_model.invoke(\"Who was the first man on the moon?\")"
]
},
{
@@ -199,7 +242,7 @@
"\n",
"With [Ollama](https://github.com/jmorganca/ollama), fetch a model via `ollama pull <model family>:<tag>`:\n",
"\n",
"* E.g., for Llama-7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n",
"* E.g., for Llama 2 7b: `ollama pull llama2` will download the most basic version of the model (e.g., smallest # parameters and 4 bit quantization)\n",
"* We can also specify a particular version from the [model list](https://github.com/jmorganca/ollama?tab=readme-ov-file#model-library), e.g., `ollama pull llama2:13b`\n",
"* See the full set of parameters on the [API reference page](https://api.python.langchain.com/en/latest/llms/langchain_community.llms.ollama.Ollama.html)"
]
@@ -222,9 +265,7 @@
}
],
"source": [
"from langchain_community.llms import Ollama\n",
"\n",
"llm = Ollama(model=\"llama2:13b\")\n",
"llm = OllamaLLM(model=\"llama2:13b\")\n",
"llm.invoke(\"The first man on the moon was ... think step by step\")"
]
},
@@ -268,11 +309,7 @@
"cell_type": "code",
"execution_count": null,
"id": "5eba38dc",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"metadata": {},
"outputs": [],
"source": [
"%env CMAKE_ARGS=\"-DLLAMA_METAL=on\"\n",
@@ -542,7 +579,6 @@
}
],
"source": [
"from langchain.chains import LLMChain\n",
"from langchain.chains.prompt_selector import ConditionalPromptSelector\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
@@ -613,9 +649,9 @@
],
"source": [
"# Chain\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"chain = prompt | llm\n",
"question = \"What NFL team won the Super Bowl in the year that Justin Bieber was born?\"\n",
"llm_chain.run({\"question\": question})"
"chain.invoke({\"question\": question})"
]
},
{
@@ -666,7 +702,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
"version": "3.10.5"
}
},
"nbformat": 4,

View File

@@ -63,6 +63,38 @@
"Notice that if the contents of one of the messages to merge is a list of content blocks then the merged message will have a list of content blocks. And if both messages to merge have string contents then those are concatenated with a newline character."
]
},
{
"cell_type": "markdown",
"id": "11f7e8d3",
"metadata": {},
"source": [
"The `merge_message_runs` utility also works with messages composed together using the overloaded `+` operation:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b51855c5",
"metadata": {},
"outputs": [],
"source": [
"messages = (\n",
" SystemMessage(\"you're a good assistant.\")\n",
" + SystemMessage(\"you always respond with a joke.\")\n",
" + HumanMessage([{\"type\": \"text\", \"text\": \"i wonder why it's called langchain\"}])\n",
" + HumanMessage(\"and who is harrison chasing anyways\")\n",
" + AIMessage(\n",
" 'Well, I guess they thought \"WordRope\" and \"SentenceString\" just didn\\'t have the same ring to it!'\n",
" )\n",
" + AIMessage(\n",
" \"Why, he's probably chasing after the last cup of coffee in the office!\"\n",
" )\n",
")\n",
"\n",
"merged = merge_message_runs(messages)\n",
"print(\"\\n\\n\".join([repr(x) for x in merged]))"
]
},
{
"cell_type": "markdown",
"id": "1b2eee74-71c8-4168-b968-bca580c25d18",

View File

@@ -41,7 +41,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "662fac50",
"metadata": {},
"outputs": [],
@@ -50,6 +50,26 @@
"%pip install -U langgraph langchain langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "6f8ec38f",
"metadata": {},
"source": [
"Then, set your OpenAI API key."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "5fca87ef",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
]
},
{
"cell_type": "markdown",
"id": "8e50635c-1671-46e6-be65-ce95f8167c2f",
@@ -62,7 +82,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 2,
"id": "1e425fea-2796-4b99-bee6-9a6ffe73f756",
"metadata": {},
"outputs": [],
@@ -95,7 +115,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"id": "03ea357c-9c36-4464-b2cc-27bd150e1554",
"metadata": {},
"outputs": [
@@ -106,7 +126,7 @@
" 'output': 'The value of `magic_function(3)` is 5.'}"
]
},
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -142,7 +162,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"id": "53a3737a-d167-4255-89bf-20ac37f89a3e",
"metadata": {},
"outputs": [
@@ -153,7 +173,7 @@
" 'output': 'The value of `magic_function(3)` is 5.'}"
]
},
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -173,7 +193,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"id": "74ecebe3-512e-409c-a661-bdd5b0a2b782",
"metadata": {},
"outputs": [
@@ -181,10 +201,10 @@
"data": {
"text/plain": [
"{'input': 'Pardon?',\n",
" 'output': 'The result of applying `magic_function` to the input 3 is 5.'}"
" 'output': 'The value you get when you apply `magic_function` to the input 3 is 5.'}"
]
},
"execution_count": 4,
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
@@ -223,7 +243,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 6,
"id": "a9a11ccd-75e2-4c11-844d-a34870b0ff91",
"metadata": {},
"outputs": [
@@ -234,7 +254,7 @@
" 'output': 'El valor de `magic_function(3)` es 5.'}"
]
},
"execution_count": 5,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -263,19 +283,19 @@
"source": [
"Now, let's pass a custom system message to [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent).\n",
"\n",
"LangGraph's prebuilt `create_react_agent` does not take a prompt template directly as a parameter, but instead takes a [`messages_modifier`](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) parameter. This modifies messages before they are passed into the model, and can be one of four values:\n",
"LangGraph's prebuilt `create_react_agent` does not take a prompt template directly as a parameter, but instead takes a [`state_modifier`](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) parameter. This modifies the graph state before the llm is called, and can be one of four values:\n",
"\n",
"- A `SystemMessage`, which is added to the beginning of the list of messages.\n",
"- A `string`, which is converted to a `SystemMessage` and added to the beginning of the list of messages.\n",
"- A `Callable`, which should take in a list of messages. The output is then passed to the language model.\n",
"- Or a [`Runnable`](/docs/concepts/#langchain-expression-language-lcel), which should should take in a list of messages. The output is then passed to the language model.\n",
"- A `Callable`, which should take in full graph state. The output is then passed to the language model.\n",
"- Or a [`Runnable`](/docs/concepts/#langchain-expression-language-lcel), which should take in full graph state. The output is then passed to the language model.\n",
"\n",
"Here's how it looks in action:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "a9486805-676a-4d19-a5c4-08b41b172989",
"metadata": {},
"outputs": [],
@@ -287,7 +307,7 @@
"# This could also be a SystemMessage object\n",
"# system_message = SystemMessage(content=\"You are a helpful assistant. Respond only in Spanish.\")\n",
"\n",
"app = create_react_agent(model, tools, messages_modifier=system_message)\n",
"app = create_react_agent(model, tools, state_modifier=system_message)\n",
"\n",
"\n",
"messages = app.invoke({\"messages\": [(\"user\", query)]})"
@@ -304,7 +324,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "d369ab45-0c82-45f4-9d3e-8efb8dd47e2c",
"metadata": {},
"outputs": [
@@ -317,8 +337,8 @@
}
],
"source": [
"from langchain_core.messages import AnyMessage\n",
"from langgraph.prebuilt import create_react_agent\n",
"from langgraph.prebuilt.chat_agent_executor import AgentState\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
@@ -328,13 +348,13 @@
")\n",
"\n",
"\n",
"def _modify_messages(messages: list[AnyMessage]):\n",
" return prompt.invoke({\"messages\": messages}).to_messages() + [\n",
"def _modify_state_messages(state: AgentState):\n",
" return prompt.invoke({\"messages\": state[\"messages\"]}).to_messages() + [\n",
" (\"user\", \"Also say 'Pandamonium!' after the answer.\")\n",
" ]\n",
"\n",
"\n",
"app = create_react_agent(model, tools, messages_modifier=_modify_messages)\n",
"app = create_react_agent(model, tools, state_modifier=_modify_state_messages)\n",
"\n",
"\n",
"messages = app.invoke({\"messages\": [(\"human\", query)]})\n",
@@ -366,8 +386,8 @@
},
{
"cell_type": "code",
"execution_count": 8,
"id": "1fb52a2c",
"execution_count": 9,
"id": "b97beba5-8f74-430c-9399-91b77c8fa15c",
"metadata": {},
"outputs": [
{
@@ -376,7 +396,7 @@
"text": [
"Hi Polly! The output of the magic function for the input 3 is 5.\n",
"---\n",
"Yes, I remember your name, Polly! How can I assist you further?\n",
"Yes, your name is Polly!\n",
"---\n",
"The output of the magic function for the input 3 is 5.\n"
]
@@ -384,14 +404,14 @@
],
"source": [
"from langchain.agents import AgentExecutor, create_tool_calling_agent\n",
"from langchain_community.chat_message_histories import ChatMessageHistory\n",
"from langchain_core.chat_history import InMemoryChatMessageHistory\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
"from langchain_core.tools import tool\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model=\"gpt-4o\")\n",
"memory = ChatMessageHistory(session_id=\"test-session\")\n",
"memory = InMemoryChatMessageHistory(session_id=\"test-session\")\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", \"You are a helpful assistant.\"),\n",
@@ -456,24 +476,23 @@
},
{
"cell_type": "code",
"execution_count": 9,
"id": "035e1253",
"execution_count": 10,
"id": "baca3dc6-678b-4509-9275-2fd653102898",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hi Polly! The output of the magic_function for the input 3 is 5.\n",
"Hi Polly! The output of the magic_function for the input of 3 is 5.\n",
"---\n",
"Yes, your name is Polly!\n",
"---\n",
"The output of the magic_function for the input 3 was 5.\n"
"The output of the magic_function for the input of 3 was 5.\n"
]
}
],
"source": [
"from langchain_core.messages import SystemMessage\n",
"from langgraph.checkpoint import MemorySaver # an in-memory checkpointer\n",
"from langgraph.prebuilt import create_react_agent\n",
"\n",
@@ -483,7 +502,7 @@
"\n",
"memory = MemorySaver()\n",
"app = create_react_agent(\n",
" model, tools, messages_modifier=system_message, checkpointer=memory\n",
" model, tools, state_modifier=system_message, checkpointer=memory\n",
")\n",
"\n",
"config = {\"configurable\": {\"thread_id\": \"test-thread\"}}\n",
@@ -525,16 +544,16 @@
},
{
"cell_type": "code",
"execution_count": 10,
"id": "d640feb3",
"execution_count": 11,
"id": "e62843c4-1107-41f0-a50b-aea256e28053",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'actions': [ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-c68fd76f-a3c3-4c3c-bfd7-748c171ed4b8', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'index': 0}])], tool_call_id='call_q9MgGFjqJbV2xSUX93WqxmOt')], 'messages': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-c68fd76f-a3c3-4c3c-bfd7-748c171ed4b8', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'index': 0}])]}\n",
"{'steps': [AgentStep(action=ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-c68fd76f-a3c3-4c3c-bfd7-748c171ed4b8', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_q9MgGFjqJbV2xSUX93WqxmOt', 'index': 0}])], tool_call_id='call_q9MgGFjqJbV2xSUX93WqxmOt'), observation=5)], 'messages': [FunctionMessage(content='5', name='magic_function')]}\n",
"{'actions': [ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518'}, id='run-5664e138-7085-4da7-a49e-5656a87b8d78', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'index': 0, 'type': 'tool_call_chunk'}])], tool_call_id='call_1exy0rScfPmo4fy27FbQ5qJ2')], 'messages': [AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518'}, id='run-5664e138-7085-4da7-a49e-5656a87b8d78', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'index': 0, 'type': 'tool_call_chunk'}])]}\n",
"{'steps': [AgentStep(action=ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518'}, id='run-5664e138-7085-4da7-a49e-5656a87b8d78', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_1exy0rScfPmo4fy27FbQ5qJ2', 'index': 0, 'type': 'tool_call_chunk'}])], tool_call_id='call_1exy0rScfPmo4fy27FbQ5qJ2'), observation=5)], 'messages': [FunctionMessage(content='5', name='magic_function')]}\n",
"{'output': 'The value of `magic_function(3)` is 5.', 'messages': [AIMessage(content='The value of `magic_function(3)` is 5.')]}\n"
]
}
@@ -585,23 +604,23 @@
},
{
"cell_type": "code",
"execution_count": 11,
"id": "86abbe07",
"execution_count": 12,
"id": "076ebc85-f804-4093-a25a-a16334c9898e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_yTjXXibj76tyFyPRa1soLo0S', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 70, 'total_tokens': 84}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-b275f314-c42e-4e77-9dec-5c23f7dbd53b-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_yTjXXibj76tyFyPRa1soLo0S'}])]}}\n",
"{'tools': {'messages': [ToolMessage(content='5', name='magic_function', id='41c5f227-528d-4483-a313-b03b23b1d327', tool_call_id='call_yTjXXibj76tyFyPRa1soLo0S')]}}\n",
"{'agent': {'messages': [AIMessage(content='The value of `magic_function(3)` is 5.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 93, 'total_tokens': 107}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'stop', 'logprobs': None}, id='run-0ef12b6e-415d-4758-9b62-5e5e1b350072-0')]}}\n"
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_my9rzFSKR4T1yYKwCsfbZB8A', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 61, 'total_tokens': 75}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_bc2a86f5f5', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dd705555-8fae-4fb1-a033-5d99a23e3c22-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_my9rzFSKR4T1yYKwCsfbZB8A', 'type': 'tool_call'}], usage_metadata={'input_tokens': 61, 'output_tokens': 14, 'total_tokens': 75})]}}\n",
"{'tools': {'messages': [ToolMessage(content='5', name='magic_function', tool_call_id='call_my9rzFSKR4T1yYKwCsfbZB8A')]}}\n",
"{'agent': {'messages': [AIMessage(content='The value of `magic_function(3)` is 5.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 84, 'total_tokens': 98}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'stop', 'logprobs': None}, id='run-698cad05-8cb2-4d08-8c2a-881e354f6cc7-0', usage_metadata={'input_tokens': 84, 'output_tokens': 14, 'total_tokens': 98})]}}\n"
]
}
],
"source": [
"from langchain_core.messages import AnyMessage\n",
"from langgraph.prebuilt import create_react_agent\n",
"from langgraph.prebuilt.chat_agent_executor import AgentState\n",
"\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
@@ -611,12 +630,11 @@
")\n",
"\n",
"\n",
"def _modify_messages(messages: list[AnyMessage]):\n",
" return prompt.invoke({\"messages\": messages}).to_messages()\n",
"def _modify_state_messages(state: AgentState):\n",
" return prompt.invoke({\"messages\": state[\"messages\"]}).to_messages()\n",
"\n",
"\n",
"app = create_react_agent(model, tools, messages_modifier=_modify_messages)\n",
"\n",
"app = create_react_agent(model, tools, state_modifier=_modify_state_messages)\n",
"\n",
"for step in app.stream({\"messages\": [(\"human\", query)]}, stream_mode=\"updates\"):\n",
" print(step)"
@@ -637,14 +655,14 @@
{
"cell_type": "code",
"execution_count": 12,
"id": "4eff44bc-a620-4c8a-97b1-268692a842bb",
"id": "a2f720f3-c121-4be2-b498-92c16bb44b0a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[(ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_ABI4hftfEdnVgKyfF6OzZbca', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-837e794f-cfd8-40e0-8abc-4d98ced11b75', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_ABI4hftfEdnVgKyfF6OzZbca'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_ABI4hftfEdnVgKyfF6OzZbca', 'index': 0}])], tool_call_id='call_ABI4hftfEdnVgKyfF6OzZbca'), 5)]\n"
"[(ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_uPZ2D1Bo5mdED3gwgaeWURrf', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls', 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518'}, id='run-a792db4a-278d-4090-82ae-904a30eada93', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_uPZ2D1Bo5mdED3gwgaeWURrf', 'type': 'tool_call'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_uPZ2D1Bo5mdED3gwgaeWURrf', 'index': 0, 'type': 'tool_call_chunk'}])], tool_call_id='call_uPZ2D1Bo5mdED3gwgaeWURrf'), 5)]\n"
]
}
],
@@ -667,16 +685,16 @@
{
"cell_type": "code",
"execution_count": 13,
"id": "4f4364ea-dffe-4d25-bdce-ef7d0020b880",
"id": "ef23117a-5ccb-42ce-80c3-ea49a9d3a942",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content='what is the value of magic_function(3)?', id='0f63e437-c4d8-4da9-b6f5-b293ebfe4a64'),\n",
" AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_S96v28LlI6hNkQrNnIio0JPh', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-ffef7898-14b1-4537-ad90-7c000a8a5d25-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_S96v28LlI6hNkQrNnIio0JPh'}]),\n",
" ToolMessage(content='5', name='magic_function', id='fbd9df4e-1dda-4d3e-9044-b001f7875476', tool_call_id='call_S96v28LlI6hNkQrNnIio0JPh'),\n",
" AIMessage(content='The value of `magic_function(3)` is 5.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 87, 'total_tokens': 101}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'stop', 'logprobs': None}, id='run-e5d94c54-d9f4-45cd-be8e-a9101a8d88d6-0')]}"
"{'messages': [HumanMessage(content='what is the value of magic_function(3)?', id='cd7d0f49-a0e0-425a-b2b0-603a716058ed'),\n",
" AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_VfZ9287DuybOSrBsQH5X12xf', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 55, 'total_tokens': 69}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-a1e965cd-bf61-44f9-aec1-8aaecb80955f-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_VfZ9287DuybOSrBsQH5X12xf', 'type': 'tool_call'}], usage_metadata={'input_tokens': 55, 'output_tokens': 14, 'total_tokens': 69}),\n",
" ToolMessage(content='5', name='magic_function', id='20d5c2fe-a5d8-47fa-9e04-5282642e2039', tool_call_id='call_VfZ9287DuybOSrBsQH5X12xf'),\n",
" AIMessage(content='The value of `magic_function(3)` is 5.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 78, 'total_tokens': 92}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'stop', 'logprobs': None}, id='run-abf9341c-ef41-4157-935d-a3be5dfa2f41-0', usage_metadata={'input_tokens': 78, 'output_tokens': 14, 'total_tokens': 92})]}"
]
},
"execution_count": 13,
@@ -708,7 +726,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 16,
"id": "16f189a7-fc78-4cb5-aa16-a94ca06401a6",
"metadata": {},
"outputs": [],
@@ -724,7 +742,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 17,
"id": "c96aefd7-6f6e-4670-aca6-1ac3d4e7871f",
"metadata": {},
"outputs": [
@@ -739,11 +757,7 @@
"Invoking: `magic_function` with `{'input': '3'}`\n",
"\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Invoking: `magic_function` with `{'input': '3'}`\n",
"responded: Parece que hubo un error al intentar obtener el valor de `magic_function(3)`. Permíteme intentarlo de nuevo.\n",
"\n",
"\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3mAún no puedo obtener el valor de `magic_function(3)`. ¿Hay algo más en lo que pueda ayudarte?\u001b[0m\n",
"\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3mParece que hubo un error al intentar calcular el valor de la función mágica. ¿Te gustaría que lo intente de nuevo?\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
@@ -752,10 +766,10 @@
"data": {
"text/plain": [
"{'input': 'what is the value of magic_function(3)?',\n",
" 'output': 'Aún no puedo obtener el valor de `magic_function(3)`. ¿Hay algo más en lo que pueda ayudarte?'}"
" 'output': 'Parece que hubo un error al intentar calcular el valor de la función mágica. ¿Te gustaría que lo intente de nuevo?'}"
]
},
"execution_count": 15,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
@@ -797,7 +811,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 18,
"id": "b974a91f-6ae8-4644-83d9-73666258a6db",
"metadata": {},
"outputs": [
@@ -805,12 +819,12 @@
"name": "stdout",
"output_type": "stream",
"text": [
"('human', 'what is the value of magic_function(3)?')\n",
"content='' additional_kwargs={'tool_calls': [{'id': 'call_pFdKcCu5taDTtOOfX14vEDRp', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-25836468-ba7e-43be-a7cf-76bba06a2a08-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_pFdKcCu5taDTtOOfX14vEDRp'}]\n",
"content='Sorry, there was an error. Please try again.' name='magic_function' id='1a08b883-9c7b-4969-9e9b-67ce64cdcb5f' tool_call_id='call_pFdKcCu5taDTtOOfX14vEDRp'\n",
"content='It seems there was an error when trying to apply the magic function. Let me try again.' additional_kwargs={'tool_calls': [{'id': 'call_DA0lpDIkBFg2GHy4WsEcZG4K', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 97, 'total_tokens': 131}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-d571b774-0ea3-4e35-8b7d-f32932c3f3cc-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_DA0lpDIkBFg2GHy4WsEcZG4K'}]\n",
"content='Sorry, there was an error. Please try again.' name='magic_function' id='0b45787b-c82a-487f-9a5a-de129c30460f' tool_call_id='call_DA0lpDIkBFg2GHy4WsEcZG4K'\n",
"content='It appears that there is a consistent issue when trying to apply the magic function to the input \"3.\" This could be due to various reasons, such as the input not being in the correct format or an internal error.\\n\\nIf you have any other questions or if there\\'s something else you\\'d like to try, please let me know!' response_metadata={'token_usage': {'completion_tokens': 66, 'prompt_tokens': 153, 'total_tokens': 219}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'stop', 'logprobs': None} id='run-50a962e6-21b7-4327-8dea-8e2304062627-0'\n"
"content='what is the value of magic_function(3)?' id='74e2d5e8-2b59-4820-979c-8d11ecfc14c2'\n",
"content='' additional_kwargs={'tool_calls': [{'id': 'call_ihtrH6IG95pDXpKluIwAgi3J', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 55, 'total_tokens': 69}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-5a35e465-8a08-43dd-ac8b-4a76dcace305-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_ihtrH6IG95pDXpKluIwAgi3J', 'type': 'tool_call'}] usage_metadata={'input_tokens': 55, 'output_tokens': 14, 'total_tokens': 69}\n",
"content='Sorry, there was an error. Please try again.' name='magic_function' id='8c37c19b-3586-46b1-aab9-a045786801a2' tool_call_id='call_ihtrH6IG95pDXpKluIwAgi3J'\n",
"content='It seems there was an error in processing the request. Let me try again.' additional_kwargs={'tool_calls': [{'id': 'call_iF0vYWAd6rfely0cXSqdMOnF', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 88, 'total_tokens': 119}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-eb88ec77-d492-43a5-a5dd-4cefef9a6920-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_iF0vYWAd6rfely0cXSqdMOnF', 'type': 'tool_call'}] usage_metadata={'input_tokens': 88, 'output_tokens': 31, 'total_tokens': 119}\n",
"content='Sorry, there was an error. Please try again.' name='magic_function' id='c9ff261f-a0f1-4c92-a9f2-cd749f62d911' tool_call_id='call_iF0vYWAd6rfely0cXSqdMOnF'\n",
"content='I am currently unable to process the request with the input \"3\" for the `magic_function`. If you have any other questions or need assistance with something else, please let me know!' response_metadata={'token_usage': {'completion_tokens': 39, 'prompt_tokens': 141, 'total_tokens': 180}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'stop', 'logprobs': None} id='run-d42508aa-f286-4b57-80fb-f8a76736d470-0' usage_metadata={'input_tokens': 141, 'output_tokens': 39, 'total_tokens': 180}\n"
]
}
],
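The ToolMessage contents in the outputs above ("Sorry, there was an error. Please try again.") suggest the notebook's `magic_function` is deliberately built to fail, so the agent keeps retrying the call until a stopping condition kicks in. A minimal sketch of such a tool, assuming the standard `@tool` decorator (the name and docstring are inferred from the outputs, not confirmed by this diff):

```python
from langchain_core.tools import tool


@tool
def magic_function(input: str) -> str:
    """Applies a magic function to an input."""
    # Always report a failure so the agent keeps retrying the call.
    return "Sorry, there was an error. Please try again."
```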
@@ -847,7 +861,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 19,
"id": "4b8498fc-a7af-4164-a401-d8714f082306",
"metadata": {},
"outputs": [
@@ -874,7 +888,7 @@
" 'output': 'Agent stopped due to max iterations.'}"
]
},
"execution_count": 17,
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
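The "Agent stopped due to max iterations." result above is what AgentExecutor returns when its iteration cap is hit. A minimal sketch, assuming `model`, `tools`, `prompt`, and `query` are defined earlier in the notebook:

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent

# Cap the agent loop; when the cap is reached, AgentExecutor returns
# {'input': ..., 'output': 'Agent stopped due to max iterations.'}
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, max_iterations=3)

agent_executor.invoke({"input": query})
```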
@@ -917,7 +931,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 20,
"id": "a2b29113-e6be-4f91-aa4c-5c63dea3e423",
"metadata": {},
"outputs": [
@@ -925,7 +939,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_HaQkeCwD5QskzJzFixCBacZ4', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-596c9200-771f-436d-8576-72fcb81620f1-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_HaQkeCwD5QskzJzFixCBacZ4'}])]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_FKiTkTd0Ffd4rkYSzERprf1M', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 55, 'total_tokens': 69}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-b842f7b6-ec10-40f8-8c0e-baa220b77e91-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_FKiTkTd0Ffd4rkYSzERprf1M', 'type': 'tool_call'}], usage_metadata={'input_tokens': 55, 'output_tokens': 14, 'total_tokens': 69})]}}\n",
"------\n",
"{'input': 'what is the value of magic_function(3)?', 'output': 'Agent stopped due to max iterations.'}\n"
]
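In LangGraph the equivalent cap is the `recursion_limit` passed in the run config; exceeding it raises `GraphRecursionError`, which the run above appears to catch before printing the AgentExecutor-style summary. A sketch under the same assumptions about `model`, `tools`, and `query`:

```python
from langgraph.errors import GraphRecursionError
from langgraph.prebuilt import create_react_agent

# Roughly two graph steps per agent iteration, plus one for the final response.
RECURSION_LIMIT = 2 * 3 + 1

app = create_react_agent(model, tools=tools)

try:
    for chunk in app.stream(
        {"messages": [("human", query)]},
        {"recursion_limit": RECURSION_LIMIT},
        stream_mode="updates",
    ):
        print(chunk)
        print("------")
except GraphRecursionError:
    print({"input": query, "output": "Agent stopped due to max iterations."})
```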
@@ -956,7 +970,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 21,
"id": "e9eb55f4-a321-4bac-b52d-9e43b411cf92",
"metadata": {},
"outputs": [
@@ -964,7 +978,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_4agJXUHtmHrOOMogjF6ZuzAv', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-a1c77db7-405f-43d9-8d57-751f2ca1a58c-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_4agJXUHtmHrOOMogjF6ZuzAv'}])]}}\n",
"{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_WoOB8juagB08xrP38twYlYKR', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 55, 'total_tokens': 69}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-73dee47e-30ab-42c9-bb0c-6f227cac96cd-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_WoOB8juagB08xrP38twYlYKR', 'type': 'tool_call'}], usage_metadata={'input_tokens': 55, 'output_tokens': 14, 'total_tokens': 69})]}}\n",
"------\n",
"Task Cancelled.\n"
]
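"Task Cancelled." above looks like the timeout path: AgentExecutor's `max_execution_time` has no direct constructor equivalent in `create_react_agent`, but the compiled graph exposes a `step_timeout` attribute that bounds each step. A sketch, assuming that attribute and that a timed-out step surfaces as `TimeoutError`:

```python
from langgraph.prebuilt import create_react_agent

app = create_react_agent(model, tools=tools)
# Bound each graph step to two seconds (assumed attribute on the compiled graph).
app.step_timeout = 2

try:
    for chunk in app.stream({"messages": [("human", query)]}, stream_mode="updates"):
        print(chunk)
        print("------")
except TimeoutError:
    print("Task Cancelled.")
```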
@@ -1005,7 +1019,7 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 22,
"id": "3f6e2cf2",
"metadata": {},
"outputs": [
@@ -1067,7 +1081,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 23,
"id": "73cabbc4",
"metadata": {},
"outputs": [
@@ -1075,10 +1089,10 @@
"name": "stdout",
"output_type": "stream",
"text": [
"('human', 'what is the value of magic_function(3)?')\n",
"content='' additional_kwargs={'tool_calls': [{'id': 'call_bTURmOn9C8zslmn0kMFeykIn', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-0844a504-7e6b-4ea6-a069-7017e38121ee-0' tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_bTURmOn9C8zslmn0kMFeykIn'}]\n",
"content='Sorry there was an error, please try again.' name='magic_function' id='00d5386f-eb23-4628-9a29-d9ce6a7098cc' tool_call_id='call_bTURmOn9C8zslmn0kMFeykIn'\n",
"content='' additional_kwargs={'tool_calls': [{'id': 'call_JYqvvvWmXow2u012DuPoDHFV', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 96, 'total_tokens': 110}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-b73b1b1c-c829-4348-98cd-60b315c85448-0' tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_JYqvvvWmXow2u012DuPoDHFV'}]\n",
"content='what is the value of magic_function(3)?' id='4fa7fbe5-758c-47a3-9268-717665d10680'\n",
"content='' additional_kwargs={'tool_calls': [{'id': 'call_ujE0IQBbIQnxcF9gsZXQfdhF', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 55, 'total_tokens': 69}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-65d689aa-baee-4342-a5d2-048feefab418-0' tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_ujE0IQBbIQnxcF9gsZXQfdhF', 'type': 'tool_call'}] usage_metadata={'input_tokens': 55, 'output_tokens': 14, 'total_tokens': 69}\n",
"content='Sorry there was an error, please try again.' name='magic_function' id='ef8ddf1d-9ad7-4ac0-b784-b673c4d94bbd' tool_call_id='call_ujE0IQBbIQnxcF9gsZXQfdhF'\n",
"content='It seems there was an issue with the previous attempt. Let me try that again.' additional_kwargs={'tool_calls': [{'id': 'call_GcsAfCFUHJ50BN2IOWnwTbQ7', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 87, 'total_tokens': 119}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_4e2b2da518', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-54527c4b-8ff0-4ee8-8abf-224886bd222e-0' tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_GcsAfCFUHJ50BN2IOWnwTbQ7', 'type': 'tool_call'}] usage_metadata={'input_tokens': 87, 'output_tokens': 32, 'total_tokens': 119}\n",
"{'input': 'what is the value of magic_function(3)?', 'output': 'Agent stopped due to max iterations.'}\n"
]
}
@@ -1118,7 +1132,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 24,
"id": "b94bb169",
"metadata": {},
"outputs": [
@@ -1216,12 +1230,12 @@
"source": [
"### In LangGraph\n",
"\n",
"We can use the [`messages_modifier`](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) just as before when passing in [prompt templates](#prompt-templates)."
"We can use the [`state_modifier`](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) just as before when passing in [prompt templates](#prompt-templates)."
]
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 25,
"id": "b309ba9a",
"metadata": {},
"outputs": [
@@ -1246,9 +1260,9 @@
}
],
"source": [
"from langchain_core.messages import AnyMessage\n",
"from langgraph.errors import GraphRecursionError\n",
"from langgraph.prebuilt import create_react_agent\n",
"from langgraph.prebuilt.chat_agent_executor import AgentState\n",
"\n",
"magic_step_num = 1\n",
"\n",
@@ -1265,12 +1279,12 @@
"tools = [magic_function]\n",
"\n",
"\n",
"def _modify_messages(messages: list[AnyMessage]):\n",
"def _modify_state_messages(state: AgentState):\n",
" # Give the agent amnesia, only keeping the original user query\n",
" return [(\"system\", \"You are a helpful assistant\"), messages[0]]\n",
" return [(\"system\", \"You are a helpful assistant\"), state[\"messages\"][0]]\n",
"\n",
"\n",
"app = create_react_agent(model, tools, messages_modifier=_modify_messages)\n",
"app = create_react_agent(model, tools, state_modifier=_modify_state_messages)\n",
"\n",
"try:\n",
" for step in app.stream({\"messages\": [(\"human\", query)]}, stream_mode=\"updates\"):\n",
@@ -1308,7 +1322,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.10.4"
}
},
"nbformat": 4,
