langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-14 07:07:34 +00:00

Author	SHA1	Message	Date
남광우	256bad3251	core[minor]: Support asynchronous in InMemoryVectorStore (#24472 ) ### Description * support asynchronous in InMemoryVectorStore * since embeddings might be possible to call asynchronously, ensure that both asynchronous and synchronous functions operate correctly.	2024-07-25 11:36:55 -04:00
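A minimal sketch of the sync/async pairing described in 256bad3251, using only the standard library. `TinyVectorStore`, `add_texts`, `aadd_texts`, `embed` and `aembed` are hypothetical names chosen for illustration; this is not the `InMemoryVectorStore` implementation.

```python
import asyncio


class TinyVectorStore:
    """Hypothetical in-memory store; NOT the InMemoryVectorStore API."""

    def __init__(self, embed, aembed=None):
        self._embed = embed    # sync callable: list of str -> list of vectors
        self._aembed = aembed  # optional async variant of the same callable
        self.rows = []

    def _store(self, texts, vectors):
        # Shared by both paths so sync and async behave identically.
        self.rows.extend(zip(texts, vectors))
        return len(self.rows)

    def add_texts(self, texts):
        return self._store(texts, self._embed(texts))

    async def aadd_texts(self, texts):
        if self._aembed is not None:
            vectors = await self._aembed(texts)
        else:
            # Fall back to the sync embedder without blocking the event loop.
            vectors = await asyncio.to_thread(self._embed, texts)
        return self._store(texts, vectors)
```

Both entry points share `_store`, so the only difference between them is how the embeddings are obtained.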
Luca Dorigo	5fdbdd6bec	community[patch]: Fix invalid iohttp verify parameter (#24655 ) Should fix https://github.com/langchain-ai/langchain/issues/24654	2024-07-25 11:09:21 -04:00
Eugene Yurtsev	7dd6b32991	core[minor]: Add InMemoryRateLimiter (#21992 ) This PR introduces the following Runnables: 1. BaseRateLimiter: an abstraction for specifying a time-based rate limiter as a Runnable 2. InMemoryRateLimiter: Provides an in-memory implementation of a rate limiter ## Example ```python from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda from datetime import datetime foo = InMemoryRateLimiter(requests_per_second=0.5) def meow(x): print(datetime.now().strftime("%H:%M:%S.%f")) return x chain = foo | meow for _ in range(10): print(chain.invoke('hello')) ``` Produces: ``` 17:12:07.530151 hello 17:12:09.537932 hello 17:12:11.548375 hello 17:12:13.558383 hello 17:12:15.568348 hello 17:12:17.578171 hello 17:12:19.587508 hello 17:12:21.597877 hello 17:12:23.607707 hello 17:12:25.617978 hello ``` ![image](https://github.com/user-attachments/assets/283af59f-e1e1-408b-8e75-d3910c3c44cc) ## Interface The rate limiter uses the following interface for acquiring a token: ```python class BaseRateLimiter(Runnable[Input, Output], abc.ABC): @abc.abstractmethod def acquire(self, *, blocking: bool = True) -> bool: """Attempt to acquire the necessary tokens for the rate limiter.""" ``` The flag `blocking` has been added to the abstraction to allow supporting streaming (which is easier if blocking=False). ## Limitations - The rate limiter is not designed to work across different processes. It is an in-memory rate limiter, but it is thread safe. - The rate limiter only supports time-based rate limiting. It does not take into account the size of the request or any other factors. - The current implementation does not handle streaming inputs well and will consume all inputs even if the rate limit has been reached. Better support for streaming inputs will be added in the future. - When the rate limiter is combined with another runnable via a RunnableSequence, usage of .batch() or .abatch() will only respect the average rate limit. 
There will be bursty behavior as .batch() and .abatch() wait for each step to complete before starting the next step. One way to mitigate this is to use batch_as_completed() or abatch_as_completed(). ## Bursty behavior in `batch` and `abatch` When the rate limiter is combined with another runnable via a RunnableSequence, usage of .batch() or .abatch() will only respect the average rate limit. There will be bursty behavior as .batch() and .abatch() wait for each step to complete before starting the next step. This becomes a problem if users are using `batch` and `abatch` with many inputs (e.g., 100). In this case, there will be a burst of 100 inputs into the batch of the rate-limited runnable. 1. Using a RunnableBinding The API would look like: ```python from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda rate_limiter = InMemoryRateLimiter(requests_per_second=0.5) def meow(x): return x rate_limited_meow = RunnableLambda(meow).with_rate_limiter(rate_limiter) ``` 2. Another option is to add some init option to RunnableSequence that changes `.batch()` to be depth first (e.g., by delegating to `batch_as_completed`) ```python RunnableSequence(first=rate_limiter, last=model, how='batch-depth-first') ``` Pros: Does not require Runnable Binding Cons: Feels over-complicated	2024-07-25 01:34:03 +00:00
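The time-based, thread-safe limiting with a `blocking` flag described in 7dd6b32991 can be illustrated with a small token bucket. This is a sketch under stated assumptions, not the code from the PR: `TokenBucket`, `max_bucket_size`, `check_every`, and the injectable `clock` and `sleep` parameters are hypothetical names introduced here so the behavior can be exercised without real waiting.

```python
import threading
import time


class TokenBucket:
    """Hypothetical time-based rate limiter; NOT LangChain's implementation."""

    def __init__(self, requests_per_second, max_bucket_size=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.rate = requests_per_second
        self.max_bucket_size = max_bucket_size
        self.available = 0.0
        self._clock = clock  # injectable so tests can use a fake clock
        self._sleep = sleep
        self._last = None
        self._lock = threading.Lock()  # makes token accounting thread safe

    def _consume(self):
        """Refill from elapsed time, then try to take one token."""
        with self._lock:
            now = self._clock()
            if self._last is None:
                self._last = now
            elapsed = now - self._last
            self._last = now
            self.available = min(self.max_bucket_size,
                                 self.available + elapsed * self.rate)
            if self.available >= 1.0:
                self.available -= 1.0
                return True
            return False

    def acquire(self, *, blocking=True, check_every=0.1):
        """Return True once a token is taken; False at once if non-blocking."""
        if not blocking:
            return self._consume()
        while not self._consume():
            self._sleep(check_every)
        return True
```

With `requests_per_second=0.5`, a blocking `acquire()` returns roughly once every two seconds, which matches the two-second spacing in the timestamps shown in the commit message. The limiter only counts requests over time; it knows nothing about request size.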
Oleg Kulyk	4b1b7959a2	community[minor]: Add ScrapingAnt Loader Community Integration (#24514 ) Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration. ScrapingAnt is a web scraping API that allows extracting web page data into accessible and well-formatted markdown. Description: Added ScrapingAnt web loader for retrieving web page data as markdown Dependencies: scrapingant-client Twitter: @WeRunTheWorld3 --------- Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>	2024-07-24 21:11:43 -04:00
John	d59c656ea5	unstructured, community, initialize langchain-unstructured package (#22779 ) #### Update (2): A single `UnstructuredLoader` is added to handle both local and api partitioning. This loader also handles single or multiple documents. #### Changes in `community`: Changes here do not affect users. In the initial process of using the SDK for the API Loaders, the Loaders in community were refactored. Other changes include: The `UnstructuredBaseLoader` has a new check to see if both `mode="paged"` and `chunking_strategy="by_page"`. It also now has `Element.element_id` added to the `Document.metadata`. `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such, now both directly inherit from `UnstructuredBaseLoader` and initialize their `file_path`/`file` attributes respectively and implement their own `_post_process_elements` methods. -------- #### Update: New SDK Loaders in a [partner package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo) are introduced to prevent breaking changes for users (see discussion below). ##### TODO: - [x] Test docstring examples -------- - Description: UnstructuredAPIFileIOLoader and UnstructuredAPIFileLoader calls to the unstructured api are now made using the unstructured-client sdk. - New Dependencies: unstructured-client - [x] Add tests and docs: If you're adding a new integration, please include - [x] a test for the integration, preferably unit tests that do not rely on network access, - [x] update the description in `docs/docs/integrations/providers/unstructured.mdx` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. 
- Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. TODO: - [x] Update https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api - `langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb` - The description here needs to indicate that users should install `unstructured-client` instead of `unstructured`. Read over closely to look for any other changes that need to be made. - [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to handle json responses from the API instead of just lists of elements. - This method may need to be overwritten by the API loaders instead of changing it in the `UnstructuredBaseLoader`. - [x] Update the documentation links in the class docstrings (the Unstructured documents have moved) - [x] Update Document.metadata to include `element_id` (see thread [here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419)) --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com>	2024-07-24 23:21:20 +00:00
Joel Akeret	acfce30017	Adding compatibility for OllamaFunctions with ImagePromptTemplate (#24499 ) - [ ] PR title: "experimental: Adding compatibility for OllamaFunctions with ImagePromptTemplate" - [ ] PR message: - Description: Removes the outdated `_convert_messages_to_ollama_messages` method override in the `OllamaFunctions` class to ensure that ollama multimodal models can be invoked with an image. - Issue: #24174 --------- Co-authored-by: Joel Akeret <joel.akeret@ti&m.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-24 14:57:05 -07:00
Erick Friis	8f3c052db1	cli: release 0.0.26 (#24623 ) - cli: remove snapshot flag from pytest defaults	2024-07-24 13:13:58 -07:00
ChengZi	29a3b3a711	partners[milvus]: add dynamic field (#24544 ) Adds the dynamic field feature to langchain_milvus, with more unit tests for robustness. The plan is to deprecate `metadata_field` in the future, because its function is the same as `enable_dynamic_field`, but the latter is a more advanced concept in Milvus. Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-24 20:01:58 +00:00
Erick Friis	20fe4deea0	milvus: release 0.1.3 (#24624 )	2024-07-24 13:01:27 -07:00
Erick Friis	3a55f4bfe9	cli: remove snapshot flag from pytest defaults (#24622 )	2024-07-24 19:41:01 +00:00
Eugene Yurtsev	b55f6105c6	community[patch]: Add linter to prevent further usage of root_validator and validator (#24613 ) This linter is meant to move development to use __init__ instead of root_validator and validator. We need to investigate whether we need to lint some of the functionality of Field (e.g., `lt` and `gt`, `alias`). `alias` is the one that's most popular: (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" | grep "alias=" | wc -l 144 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" | grep "ge=" | wc -l 10 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" | grep "gt=" | wc -l 4	2024-07-24 12:35:21 -04:00
Anush	4585eaef1b	qdrant: Fix vectors_config access (#24606 ) ## Description Fixes #24558 by accessing `vectors_config` after asserting it to be a dict.	2024-07-24 10:54:33 -04:00
maang-h	22175738ac	docs: Add MongoDBChatMessageHistory docstrings (#24608 ) - Description: Add MongoDBChatMessageHistory rich docstrings. - Issue: the issue #21983	2024-07-24 10:12:44 -04:00
Anindyadeep	12c3454fd9	[Community] PremAI Tool Calling Functionality (#23931 ) This PR is under WIP and adds the following functionalities: - [X] Supports tool calling across the langchain ecosystem. (However streaming is not supported) - [X] Update documentation	2024-07-24 09:53:58 -04:00
Vishnu Nandakumar	e271965d1e	community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424 ) - [ ] Community: "Retrievers: Product Quantization" - [X] This PR adds Product Quantization feature to the retrievers to the Langchain Community. PQ is one of the fastest retrieval methods if the embeddings are rich enough in context due to the concepts of quantization and representation through centroids - Description: Adding PQ as one of the retrievers - Dependencies: using the package nanopq for this PR - Twitter handle: vishnunkumar_ - [X] Add tests and docs: If you're adding a new integration, please include - [X] Added unit tests for the same in the retrievers. - [] Will add an example notebook subsequently - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ - done the same --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-24 13:52:15 +00:00
stydxm	b9bea36dd4	community: fix typo in warning message (#24597 ) - Description: This PR fixes a small typo in a warning message - Issue: ![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49) There were double `Use` and double `instead`	2024-07-24 13:19:07 +00:00
cüre	da06d4d7af	community: update finetuned model cost for 4o-mini (#24605 ) - Description: adds the model price for fine-tuned gpt-4o-mini. Reference: https://openai.com/api/pricing/ - Twitter handle: cureef	2024-07-24 13:17:26 +00:00
Philippe PRADOS	5f73c836a6	openai[small]: Add the new model: gpt-4o-mini (#24594 )	2024-07-24 09:14:48 -04:00
ZhangShenao	ad18afc3ec	community[patch]: Fix param spelling error in `ElasticsearchChatMessageHistory` (#24589 ) Fix param spelling error in `ElasticsearchChatMessageHistory`	2024-07-23 19:29:42 -07:00
Isaac Francisco	464a525a5a	[partner]: minor change to embeddings for Ollama (#24521 )	2024-07-24 00:00:13 +00:00
Aayush Kataria	0f45ac4088	LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087 ) This PR adds vector search filtering for Azure Cosmos DB Mongo vCore and NoSQL.	2024-07-23 16:59:23 -07:00
Gareth	ac41c97d21	pinecone: Add embedding Inference Support (#24515 ) Description Add support for Pinecone hosted embedding models as `PineconeEmbeddings`. Replacement for #22890 Dependencies Add `aiohttp` to support async embeddings call against REST directly - [x] Add tests and docs: If you're adding a new integration, please include Added `docs/docs/integrations/text_embedding/pinecone.ipynb` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Twitter: `gdjdg17` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-23 22:50:28 +00:00
ccurme	2d6b0bf3e3	core[patch]: add to RunnableLambda docstring (#24575 ) Explain behavior when function returns a runnable.	2024-07-23 20:46:44 +00:00
Carlos André Antunes	325068bb53	community: Fix azure_openai.py (#24572 ) Some lines tried to read a key that does not exist yet. In these cases, direct key access was replaced with the dict.get() method. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-23 16:22:21 -04:00
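The kind of change described in 325068bb53 can be sketched as follows. The function name `read_total_tokens` and the keys `usage` and `total_tokens` are hypothetical, picked only to illustrate the pattern; they are not taken from `azure_openai.py`.

```python
def read_total_tokens(response: dict) -> int:
    """Read a nested value without raising KeyError when a key is absent."""
    # response["usage"]["total_tokens"] would raise KeyError if either key
    # has not been populated yet; dict.get() returns a default instead.
    usage = response.get("usage") or {}
    return usage.get("total_tokens", 0)
```

The `or {}` also covers the case where the key exists but holds `None`.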
Nik Jmaeff	6878bc39b5	langchain: fix TrajectoryEvalChain.prep_inputs (#19959 ) The previous implementation would never be called. --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-23 18:37:39 +00:00
Bagatur	55e66aa40c	langchain[patch]: init_chat_model support ChatBedrockConverse (#24564 )	2024-07-23 11:07:38 -07:00
Bagatur	9b7db08184	experimental[patch]: Release 0.0.63 (#24563 )	2024-07-23 16:28:37 +00:00
Bagatur	8691a5a37f	community[patch]: Release 0.2.10 (#24560 )	2024-07-23 09:24:57 -07:00
Bagatur	4919d5d6df	langchain[patch]: Release 0.2.11 (#24559 )	2024-07-23 09:18:44 -07:00
Bagatur	918e1c8a93	core[patch]: Release 0.2.23 (#24557 )	2024-07-23 09:01:18 -07:00
Leonid Ganeline	e787532479	langchain: `globals` fix (#21281 ) Issue: functions from `globals`, like `get_debug`, are placed in the `__init__.py` file. As a result, they are not listed in the API Reference docs. [See this](https://langchain-9jq1kef7i-langchain.vercel.app/v0.2/docs/how_to/debugging/#set_debugtrue) and [this broken link](https://api.python.langchain.com/en/latest/globals/langchain.globals.set_debug.html). Change: moved the code from `__init__.py` into the `globals.py` file and removed the `globals` directory. Similar to: #21266 BTW, `globals` in core is implemented as a single file, not a folder.	2024-07-23 11:23:18 -04:00
Ben Chambers	e80b0932ee	community[patch]: small fixes to link extractors (#24528 ) - Description: small fixes to imports / types in the link extraction work --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-23 14:28:06 +00:00
Morteza Hosseini	9e06991aae	community[patch]: Update URL to the 2markdown API (#24546 ) Update the URL to Markdown endpoint. API information is available here: https://2markdown.com/docs#url2md	2024-07-23 14:27:55 +00:00
ZhangShenao	a14e02ab33	core[patch]: Fix word spelling error in `globals.py` (#24532 ) Fix word spelling error in `globals.py` Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-23 14:27:16 +00:00
maang-h	378db2e1a5	docs: Add RedisChatMessageHistory docstrings (#24548 ) - Description: Add `RedisChatMessageHistory ` rich docstrings. - Issue: the issue #21983 Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-23 14:23:46 +00:00
ccurme	a197a8e184	openai[patch]: move test (#24552 ) No-override tests (https://github.com/langchain-ai/langchain/pull/24407) include a condition that integrations not implement additional tests.	2024-07-23 10:22:22 -04:00
Eugene Yurtsev	f47b4edcc2	standard-test: Fix typo in skipif for chat model integration tests (#24553 )	2024-07-23 10:11:01 -04:00
Eugene Yurtsev	20b72a044c	standard-tests: Add BaseModel variations tests to with_structured_output (#24527 ) After this, standard tests will run with the following combinations: 1. pydantic.BaseModel 2. pydantic.v1.BaseModel If run within a matrix, it'll cover both the pydantic.BaseModel originating from pydantic 1 and the one defined in pydantic 2.	2024-07-23 09:01:26 -04:00
Bagatur	70c71efcab	core[patch]: merge_content fix (#24526 )	2024-07-22 22:20:22 -07:00
Ben Chambers	a5a3d28776	community[patch]: Remove targets_table from C* GraphVectorStore (#24502 ) - Description: Remove the unnecessary `targets_table` parameter	2024-07-22 22:09:36 -04:00
Alexander Golodkov	2a70a07aad	community[minor]: added new document loaders based on dedoc library (#24303 ) ### Description This pull request added new document loaders to load documents of various formats using [Dedoc](https://github.com/ispras/dedoc): - `DedocFileLoader` (determine file types automatically and parse) - `DedocPDFLoader` (for `PDF` and images parsing) - `DedocAPIFileLoader` (determine file types automatically and parse using Dedoc API without library installation) [Dedoc](https://dedoc.readthedocs.io) is an open-source library/service that extracts texts, tables, attached files and document structure (e.g., titles, list items, etc.) from files of various formats. The library is actively developed and maintained by a group of developers. `Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images and more. Full list of supported formats can be found [here](https://dedoc.readthedocs.io/en/latest/#id1). For `PDF` documents, `Dedoc` allows to determine textual layer correctness and split the document into paragraphs. ### Issue This pull request extends variety of document loaders supported by `langchain_community` allowing users to choose the most suitable option for raw documents parsing. ### Dependencies The PR added a new (optional) dependency `dedoc>=2.2.5` ([library documentation](https://dedoc.readthedocs.io)) to the `extended_testing_deps.txt` ### Twitter handle None ### Add tests and docs 1. Test for the integration: `libs/community/tests/integration_tests/document_loaders/test_dedoc.py` 2. Example notebook: `docs/docs/integrations/document_loaders/dedoc.ipynb` 3. Information about the library: `docs/docs/integrations/providers/dedoc.mdx` ### Lint and test Done locally: - `make format` - `make lint` - `make integration_tests` - `make docs_build` (from the project root) --------- Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>	2024-07-23 02:04:53 +00:00
Ben Chambers	5ac936a284	community[minor]: add document transformer for extracting links (#24186 ) - Description: Add a DocumentTransformer for executing one or more `LinkExtractor`s and adding the extracted links to each document. - Issue: n/a - Dependencies: none --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-22 22:01:21 -04:00
Erick Friis	2c6b9e8771	standard-tests: add override check (#24407 )	2024-07-22 23:38:01 +00:00
Nithish Raghunandanan	1639ccfd15	couchbase: [patch] Return chat message history in order (#24498 ) Description: Fixes an issue where the chat message history was not returned in order. Messages are now returned ordered by timestamp. - [x] Add tests and docs: Updated the tests to check the order - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-22 23:30:29 +00:00
Erick Friis	3dce2e1d35	all: add release notes to pypi (#24519 )	2024-07-22 13:59:13 -07:00
Bagatur	8a140ee77c	core[patch]: don't serialize BasePromptTemplate.input_types (#24516 ) Candidate fix for #24513	2024-07-22 13:30:16 -07:00
MarkYQJ	df357f82ca	ignore the first turn to apply "history" mechanism (#14118 ) Previously, the first turn generated a meaningless string "system: " when building the condense question; this increases the probability of producing an improper condense question and misunderstanding the user's question. Below is a case - Original Question: Can you explain the arguments of Meilisearch? - Condense Question - What are the benefits of using Meilisearch? (by CodeLlama) - What are the reasons for using Meilisearch? (by GPT-4) The condense questions (no matter whether from CodeLlama or GPT-4) are different from the original one. The fix checks the content of each dialogue turn and generates the history string only when the dialog content is not empty. Since there is nothing before the first turn, the "history" mechanism is ignored at the very first turn. With this change, the condense question will be "What are the arguments for using Meilisearch?". --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 20:11:17 +00:00
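The approach described in df357f82ca (build the history string only from turns with non-empty content) can be sketched like this. `format_chat_history` and the `(role, content)` tuple shape are assumptions made for illustration and are not the code from the PR.

```python
def format_chat_history(turns):
    """Join (role, content) pairs into a history string, skipping empty turns.

    An empty or whitespace-only turn would otherwise contribute a
    meaningless line such as "system: " to the condense-question prompt.
    """
    lines = [
        f"{role}: {content}"
        for role, content in turns
        if content and content.strip()
    ]
    return "\n".join(lines)
```

At the very first turn there is nothing to report, so the function returns an empty string and the caller can skip the condense step entirely.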
Bagatur	236e957abb	core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417 ) After this PR chat models will correctly handle pydantic 2 with bind_tools and with_structured_output. ```python import pydantic print(pydantic.__version__) ``` 2.8.2 ```python from langchain_openai import ChatOpenAI from pydantic import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ``` [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}] <class '__main__.Add'> ``` ```python from langchain_openai import ChatOpenAI from pydantic.v1 import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ```python [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}] <class '__main__.Add'> ``` Addresses issues: https://github.com/langchain-ai/langchain/issues/22782 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-22 20:07:39 +00:00
Naka Masato	884f76e05a	fix: load google credentials properly in GoogleDriveLoader (#12871 ) - Description: - Fix #12870: set scope in `default` func (ref: https://google-auth.readthedocs.io/en/master/reference/google.auth.html) - Moved the code to load default credentials to the bottom for clarity of the logic - Add docstring and comment for each credential loading logic - Issue: https://github.com/langchain-ai/langchain/issues/12870 - Dependencies: no dependencies change - Twitter handle: @gymnstcs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 17:43:33 +00:00
Erick Friis	a45337ea07	ollama: release 0.1.0 (#24510 )	2024-07-22 10:35:26 -07:00
Isaac Francisco	1318d534af	[docs]: minor react change (#24509 )	2024-07-22 10:25:01 -07:00
Jorge Piedrahita Ortiz	10e3982b59	community: sambanova integration minor changes (#24503 ) - Minor changes in sambanova llm integration - default api - docstrings - minor changes in docs	2024-07-22 17:06:35 +00:00
maang-h	721f709dec	community: Improve QianfanChatEndpoint tool result to model (#24466 ) - Description: When `QianfanChatEndpoint` uses a tool result to answer questions, the content of the tool result is required to be in Dict format. Users could be required to return Dict format when calling the tool, but to stay consistent with other chat models, this modification is necessary.	2024-07-22 11:29:00 -04:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
ccurme	0f7569ddbc	core[patch]: enable RunnableWithMessageHistory without config (#23775 ) Feedback that `RunnableWithMessageHistory` is unwieldy compared to ConversationChain and similar legacy abstractions is common. Legacy chains using memory typically had no explicit notion of threads or separate sessions. To use `RunnableWithMessageHistory`, users are forced to introduce this concept into their code. This possibly felt like unnecessary boilerplate. Here we enable `RunnableWithMessageHistory` to run without a config if the `get_session_history` callable has no arguments. This enables minimal implementations like the following: ```python from langchain_core.chat_history import InMemoryChatMessageHistory from langchain_core.runnables.history import RunnableWithMessageHistory from langchain_openai import ChatOpenAI llm = ChatOpenAI(model="gpt-3.5-turbo-0125") memory = InMemoryChatMessageHistory() chain = RunnableWithMessageHistory(llm, lambda: memory) chain.invoke("Hi I'm Bob") # Hello Bob! chain.invoke("What is my name?") # Your name is Bob. ```	2024-07-22 10:36:53 -04:00
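One way to picture the behavior described in 0f7569ddbc (no config needed when `get_session_history` takes no arguments) is to inspect the callable's signature. The helper `resolve_history` below is hypothetical and uses only the standard library; it is a sketch of the idea, not how `RunnableWithMessageHistory` is implemented.

```python
import inspect


def resolve_history(get_session_history, config=None):
    """Call the history factory, requiring config keys only if it has params."""
    params = inspect.signature(get_session_history).parameters
    if not params:
        # Zero-argument factory: one shared history, no session concept.
        return get_session_history()
    config = config or {}
    missing = [name for name in params if name not in config]
    if missing:
        raise ValueError(f"Missing keys in config: {missing}")
    return get_session_history(**{name: config[name] for name in params})
```

A zero-argument factory such as `lambda: memory` therefore works with no config at all, while a factory that takes `session_id` still demands it.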
Mohammad Mohtashim	5ade0187d0	[Commutiy]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in `create_sql_agent` function (#23693 ) - Description: The correct Prompts for ZERO_SHOT_REACT were not being used in the `create_sql_agent` function. They were not using the specific `SQL_PREFIX` and `SQL_SUFFIX` prompts if client does not provide any prompts. This is fixed. - Issue: #23585 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 14:04:20 +00:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
남광우	7ab82eb8cc	langchain: Copy libs/standard-tests folder when building devcontainer (#24470 ) ### Description * Fix `libs/langchain/dev.Dockerfile` file. copy the `libs/standard-tests` folder when building the devcontainer. * `poetry install --no-interaction --no-ansi --with dev,test,docs` command requires this folder, but it was not copied. ### Reference #### Error message when building the devcontainer from the master branch ``` ... [2024-07-20T14:27:34.779Z] ------ > [langchain langchain-dev-dependencies 7/7] RUN poetry install --no-interaction --no-ansi --with dev,test,docs: 0.409 0.409 Directory ../standard-tests does not exist ------ ... ``` #### After the fix Build success at vscode: <img width="866" alt="image" src="https://github.com/user-attachments/assets/10db1b50-6fcf-4dfe-83e1-d93c96aa2317">	2024-07-22 13:46:38 +00:00
Thomas Meike	40c02cedaf	langchain[patch]: add async methods to ConversationSummaryBufferMemory (#20956 ) Added asynchronously callable methods according to the ConversationSummaryBufferMemory API documentation. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 09:21:43 -04:00
Sheng Han Lim	0c6a3fdd6b	langchain: Update ContextualCompressionRetriever base_retriever type to RetrieverLike (#24192 ) Description: When initializing retrievers with `configurable_fields` as base retriever, `ContextualCompressionRetriever` validation fails with the following error: ``` ValidationError: 1 validation error for ContextualCompressionRetriever base_retriever Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error) ``` Example code: ```python esearch_retriever = VertexAISearchRetriever( project_id=GCP_PROJECT_ID, location_id="global", data_store_id=SEARCH_ENGINE_ID, ).configurable_fields( filter=ConfigurableField(id="vertex_search_filter", name="Vertex Search Filter") ) # rerank documents with Vertex AI Rank API reranker = VertexAIRank( project_id=GCP_PROJECT_ID, location_id=GCP_REGION, ranking_config="default_ranking_config", ) retriever_with_reranker = ContextualCompressionRetriever( base_compressor=reranker, base_retriever=esearch_retriever ) ``` It seems like the issue stems from ContextualCompressionRetriever insisting that base retrievers must be strictly `BaseRetriever` inherited, and doesn't take into account cases where retrievers need to be chained and can have configurable fields defined. `0a1e475a30/libs/langchain/langchain/retrievers/contextual_compression.py (L15-L22)` This PR proposes that the base_retriever type be set to `RetrieverLike`, similar to how `EnsembleRetriever` validates its list of retrievers: `0a1e475a30/libs/langchain/langchain/retrievers/ensemble.py (L58-L75)`	2024-07-21 14:23:19 -04:00
clement.l	d98b830e4b	community: add flag to toggle progress bar (#24463 ) - Description: Add a flag to determine whether to show progress bar - Issue: n/a - Dependencies: n/a - Twitter handle: n/a --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-20 13:18:02 +00:00
chuanbei888	6b08a33fa4	community: fix QianfanChatEndpoint default model (#24464 ) the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to ERNIE-Lite-8K	2024-07-20 13:00:29 +00:00
Nuno Campos	947628311b	core[patch]: Accept configurable keys top-level (#23806 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-07-20 03:49:00 +00:00
Jesus Martinez	c1d1fc13c2	langchain[patch]: Remove multiagent return_direct validation (#24419 ) Description: When you use agents with multi-input tools and some of these tools have `return_direct=True`, LangChain throws an error raised by a validator. This change has been implemented in the [JS community](https://github.com/langchain-ai/langchainjs/pull/4643) as well. Issue: This MR resolves #19843 Dependencies: None Co-authored-by: Jesus Martinez <jesusabraham.martinez@tyson.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-20 03:27:43 +00:00
Will Badart	74e3d796f1	core[patch]: ensure `iterator_` in scope for `_atransform_stream_with_config` except (#24454 ) Before, if an exception was raised in the outer `try` block in `Runnable._atransform_stream_with_config` before `iterator_` is assigned, the corresponding `finally` block would blow up with an `UnboundLocalError`: ```txt UnboundLocalError: cannot access local variable 'iterator_' where it is not associated with a value ``` By assigning an initial value to `iterator_` before entering the `try` block, this commit ensures that the `finally` can run, and not bury the "true" exception under a "During handling of the above exception [...]" traceback. Thanks for your consideration!	2024-07-20 03:24:04 +00:00
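The pattern described in 74e3d796f1 can be illustrated with a minimal, self-contained sketch. The names `transform_stream`, `failing_factory`, `numbers`, and `run_failing` are hypothetical and are not taken from `langchain_core`; the sketch only shows why binding the variable before `try` lets the cleanup block run without burying the original exception.

```python
import asyncio
from typing import AsyncIterator, Callable, List, Optional


async def transform_stream(
    make_iterator: Callable[[], AsyncIterator[int]],
) -> List[int]:
    # Bind the name before entering `try`, so the `finally` block can
    # always reference it, even if `make_iterator()` raises.
    iterator_: Optional[AsyncIterator[int]] = None
    try:
        iterator_ = make_iterator()
        return [item async for item in iterator_]
    finally:
        # Without the initial assignment above, this check would raise
        # UnboundLocalError and hide the original exception.
        if iterator_ is not None and hasattr(iterator_, "aclose"):
            await iterator_.aclose()


def failing_factory() -> AsyncIterator[int]:
    # Hypothetical factory that fails before any iterator exists.
    raise ValueError("setup failed before the iterator existed")


async def numbers() -> AsyncIterator[int]:
    for i in range(3):
        yield i


def run_failing() -> str:
    # The caller sees the real ValueError, not an UnboundLocalError.
    try:
        asyncio.run(transform_stream(failing_factory))
    except ValueError as exc:
        return str(exc)
    return "no error"
```

With this structure, the caller of `run_failing()` receives the message of the original `ValueError`, while the normal path through `numbers` still closes the iterator in `finally`.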
maang-h	7b28359719	docs: Add ChatSparkLLM docstrings (#24449 ) - Description: - Add `ChatSparkLLM` docstrings, the issue #22296 - To support `stream` method	2024-07-19 20:19:14 -07:00
Eugene Yurtsev	5e48f35fba	core[minor]: Relax constraints on type checking for tools and parsers (#24459 ) This will allow tools and parsers to accept pydantic models from any of the following namespaces: * pydantic.BaseModel with pydantic 1 * pydantic.BaseModel with pydantic 2 * pydantic.v1.BaseModel with pydantic 2	2024-07-19 21:47:34 -04:00
Isaac Francisco	838464de25	ollama: init package (#23615 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-20 00:43:29 +00:00
Erick Friis	f4ee3c8a22	infra: add min version testing to pr test flow (#24358 ) xfailing some sql tests that do not currently work on sqlalchemy v1 #22207 was very much not sqlalchemy v1 compatible. Moving forward, implementations should be compatible with both to pass CI	2024-07-19 22:03:19 +00:00
Bagatur	842065a9cc	community[patch]: Release 0.2.9 (#24453 )	2024-07-19 12:50:22 -07:00
Bagatur	27ad6a4bb3	langchain[patch]: Release 0.2.10 (#24452 )	2024-07-19 12:50:13 -07:00
Bagatur	dda9438e87	community[patch]: gpt-4o-mini costs (#24421 )	2024-07-19 19:02:44 +00:00
Eugene Yurtsev	604dfe2d99	community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451 ) This PR addresses the issue raised by (CVE-2024-3095) https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273 Unfortunately, we didn't do a good job writing the initial report. It's pointing at both the wrong package and the wrong code. The affected code is the Web Retriever not the AsyncHTMLLoader, and the WebRetriever lives in langchain-community The vulnerable code lives here: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)` This PR adds a forced opt-in for users to make sure they are aware of the risk and can mitigate by configuring a proxy: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)`	2024-07-19 18:51:35 +00:00
Asi Greenholts	372c27f2e5	community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034 ) - Description: Search has a limit of 500 results, playlistItems doesn't. Added a class in except clause to catch another common error. - Issue: None - Dependencies: None - Twitter handle: @TupleType --------- Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 14:04:34 -04:00
Rafael Pereira	6a45bf9554	community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300 ) Description: This PR introduces a change to the `cypher_generation_chain` to dynamically concatenate inputs. This improvement aims to streamline the input handling process and make the method more flexible. The change involves updating the arguments dictionary with all elements from the `inputs` dictionary, ensuring that all necessary inputs are dynamically appended. This will ensure that any cypher generation template will not require a new `_call` method patch. Issue: This PR fixes issue #24260.	2024-07-19 14:03:14 -04:00
Philippe PRADOS	f5856680fe	community[minor]: add mongodb byte store (#23876 ) The `MongoDBStore` can manage only documents, so it's not possible to use MongoDB for a `CacheBackedEmbeddings`. With this new implementation, it's possible to use: ```python CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=MongoDBByteStore( connection_string=db_uri, db_name=db_name, collection_name=collection_name, ), ) ``` and use MongoDB to cache the embeddings!	2024-07-19 13:54:12 -04:00
yabooung	07715f815b	community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258 ) Description: Add UTF-8 encoding support Issue: Inability to properly handle characters from certain languages (e.g., Korean) Fix: Implement UTF-8 encoding in FileChatMessageHistory --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:53:21 -04:00
Dristy Srivastava	020cc1cf3e	Community[minor]: Added checksum while sending data to pebblo-cloud (#23968 ) - Description: - Updated checksum in doc metadata - Sending checksum and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `/loader/doc` API - Adding `pb_id` i.e. pebblo id to doc metadata - Refactoring as needed. - Sending `content-checksum` and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `prompt` API - Issue: NA - Dependencies: NA - Tests: Updated - Docs: NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-07-19 13:52:54 -04:00
Eun Hye Kim	9aae8ef416	core[patch]: Fix utils.json_schema.dereference_refs (#24335 KeyError: 400 in JSON schema processing) (#24337 ) Description: This PR fixes a KeyError: 400 that occurs in the JSON schema processing within the reduce_openapi_spec function. The _retrieve_ref function in json_schema.py was modified to handle missing components gracefully by continuing to the next component if the current one is not found. This ensures that the OpenAPI specification is fully interpreted and the agent executes without errors. Issue: Fixes issue #24335 Dependencies: No additional dependencies are required for this change. Twitter handle: @lunara_x	2024-07-19 13:31:00 -04:00
keval dekivadiya	06f47678ae	community[minor]: Add TextEmbed Embedding Integration (#22946 ) Description: TextEmbed is a high-performance embedding inference server designed to provide a high-throughput, low-latency solution for serving embeddings. It supports various sentence-transformer models and includes the ability to deploy image and text embedding models. TextEmbed offers flexibility and scalability for diverse applications. - PyPI Package: [TextEmbed on PyPI](https://pypi.org/project/textembed/) - Docker Image: [TextEmbed on Docker Hub](https://hub.docker.com/r/kevaldekivadiya/textembed) - GitHub Repository: [TextEmbed on GitHub](https://github.com/kevaldekivadiya2415/textembed) PR Description This PR adds functionality for embedding documents and queries using the `TextEmbedEmbeddings` class. The implementation allows for both synchronous and asynchronous embedding requests to a TextEmbed API endpoint. The class handles batching and permuting of input texts to optimize the embedding process. Example Usage: ```python from langchain_community.embeddings import TextEmbedEmbeddings # Initialise the embeddings class embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url") # Define a list of documents documents = [ "Data science involves extracting insights from data.", "Artificial intelligence is transforming various industries.", "Cloud computing provides scalable computing resources over the internet.", "Big data analytics helps in understanding large datasets.", "India has a diverse cultural heritage." ] # Define a query query = "What is the cultural heritage of India?" 
# Embed all documents document_embeddings = embeddings.embed_documents(documents) # Embed the query query_embedding = embeddings.embed_query(query) # Print embeddings for each document for i, embedding in enumerate(document_embeddings): print(f"Document {i+1} Embedding:", embedding) # Print the query embedding print("Query Embedding:", query_embedding) ``` --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-19 17:30:25 +00:00
Shikanime Deva	9c3da11910	Fix MultiQueryRetriever breaking Embeddings with empty lines (#21093 ) Fix MultiQueryRetriever breaking Embeddings with empty lines ``` [chain/end] [1:chain:ConversationalRetrievalChain > 2:retriever:Retriever > 3:retriever:Retriever > 4:chain:LLMChain] [2.03s] Exiting Chain run with output: [outputs] > /workspaces/Sfeir/sncf/metabot-backend/.venv/lib/python3.11/site-packages/langchain/retrievers/multi_query.py(116)_aget_relevant_documents() -> if self.include_original: (Pdb) queries ['## Alternative questions for "Hello, tell me about phones?":', '', '1. What are the latest trends in smartphone technology? (Focuses on recent advancements)', '2. How has the mobile phone industry evolved over the years? (Historical perspective)', '3. What are the different types of phones available in the market, and which one is best for me? (Categorization and recommendation)'] ``` Example of failure on VertexAIEmbeddings ``` grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.INVALID_ARGUMENT details = "The text content is empty." debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.184.234:443 {created_time:"2024-04-30T09:57:45.625698408+00:00", grpc_status:3, grpc_message:"The text content is empty."}" ``` Fixes: #15959 --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 17:13:12 +00:00
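The failure mode in 9c3da11910 comes from blank lines in the LLM output being passed on as queries. A minimal sketch of the filtering idea follows; `parse_queries` is a hypothetical helper for illustration, not the actual `MultiQueryRetriever` code.

```python
from typing import List


def parse_queries(llm_output: str) -> List[str]:
    """Split LLM output into one query per line, dropping blank lines."""
    # Assumption: one candidate query per line; whitespace-only lines are
    # discarded so that no empty string reaches the embedding model.
    return [line.strip() for line in llm_output.splitlines() if line.strip()]


raw_output = (
    '## Alternative questions for "Hello, tell me about phones?":\n'
    "\n"
    "1. What are the latest trends in smartphone technology?\n"
    "2. How has the mobile phone industry evolved over the years?\n"
)
queries = parse_queries(raw_output)
```

Here `queries` holds three non-empty strings, so an embeddings backend that rejects empty text is never called with `''`.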
John Kelly	5affbada61	langchain: Add `aadd_documents` to `ParentDocumentRetriever` (#23969 ) - Description: Add an async version of `add_documents` to `ParentDocumentRetriever` - Twitter handle: @johnkdev --------- Co-authored-by: John Kelly <j.kelly@mwam.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:12:39 -04:00
Andrew Benton	f9d64d22e5	community[minor]: Add Riza Python/JS code execution tool (#23995 ) - Description: Add Riza Python/JS code execution tool - Issue: N/A - Dependencies: an optional dependency on the `rizaio` pypi package - Twitter handle: [@rizaio](https://x.com/rizaio) [Riza](https://riza.io) is a safe code execution environment for agent-generated Python and JavaScript that's easy to integrate into langchain apps. This PR adds two new tool classes to the community package.	2024-07-19 17:03:22 +00:00
Ben Chambers	3691701d58	community[minor]: Add keybert-based link extractor (#24311 ) - Description: Add a `KeybertLinkExtractor` for graph vectorstores. This allows extracting links from keywords in a Document and linking nodes that have common keywords. - Issue: None - Dependencies: None. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-19 12:25:07 -04:00
Erick Friis	ef049769f0	core[patch]: Release 0.2.22 (#24423 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-19 09:09:24 -07:00
Bagatur	cd19ba9a07	core[patch]: core lint fix (#24447 )	2024-07-19 09:01:22 -07:00
Ben Chambers	83f3d95ffa	community[minor]: GLiNER link extraction (#24314 ) - Description: This allows extracting links between documents with common named entities using [GLiNER](https://github.com/urchade/GLiNER). - Issue: None - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 15:34:54 +00:00
Anas Khan	b5acb91080	Mask API keys for various LLM/ChatModel Modules (#13885 ) Description: - Added masking of the API keys for the modules: - `langchain/chat_models/openai.py` - `langchain/llms/openai.py` - `langchain/llms/google_palm.py` - `langchain/chat_models/google_palm.py` - `langchain/llms/edenai.py` - Updated the modules to use `SecretStr` from pydantic to securely manage the API key. - Added unit/integration tests - `langchain/chat_models/azure_openai.py` used the `openai_api_key` that is derived from the `ChatOpenAI` class and assumed `openai_api_key` is a str, so we changed it to expect `SecretStr` instead. Issue: https://github.com/langchain-ai/langchain/issues/12165 , Dependencies: none, Tag maintainer: @eyurtsev --------- Co-authored-by: HassanA01 <anikeboss@gmail.com> Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca> Co-authored-by: kristinspenc <kristinspenc2003@gmail.com> Co-authored-by: faisalt14 <faisalt14@gmail.com> Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com> Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com> Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 15:23:34 +00:00
ccurme	f99369a54c	community[patch]: fix formatting (#24443 ) Somehow this got through CI: https://github.com/langchain-ai/langchain/pull/24363	2024-07-19 14:38:53 +00:00
Ben Chambers	242b085be7	Merge pull request #24315 * community: Add Hierarchy link extractor * add example * lint	2024-07-19 09:42:26 -04:00
Rhuan Barros	c3308f31bc	Merge pull request #24363 * important email fields	2024-07-19 09:41:20 -04:00
Han Sol Park	aade9bfde5	Mask API key for ChatOpenAI based chat_models (#14293 ) - Description: Mask API key for ChatOpenAI based chat_models (openai, azureopenai, anyscale, everlyai). Made changes to all chat_models that are based on ChatOpenAI since all of them assume that openai_api_key is str rather than SecretStr. - Issue: #12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 02:25:38 +00:00
William FH	0ee6ed76ca	[Evaluation] Pass in seed directly (#24403 ) adding test rn	2024-07-18 19:12:28 -07:00
Nuno Campos	62b6965d2a	core: In ensure_config don't copy dunder configurable keys to metadata (#24420 )	2024-07-18 22:28:52 +00:00
Eugene Yurtsev	ef22ebe431	standard-tests[patch]: Add pytest assert rewrites (#24408 ) This will surface nice error messages in subclasses that fail assertions.	2024-07-18 21:41:11 +00:00
Eugene Yurtsev	f62b323108	core[minor]: Support all versions of pydantic base model in argsschema (#24418 ) This adds support to any pydantic base model for tools. The only potential issue is that `get_input_schema()` will not always return a v1 base model.	2024-07-18 17:14:23 -04:00
Prakul	b2bc15e640	docs: Update mongodb README.md (#24412 )	2024-07-18 14:02:34 -07:00
Evan Harris	61ea7bf60b	Add a `ListRerank` document compressor (#13311 ) - Description: This PR adds a new document compressor called `ListRerank`. It's derived from `BaseDocumentCompressor`. It's a near-exact implementation of the approach introduced by this paper: [Zero-Shot Listwise Document Reranking with a Large Language Model](https://arxiv.org/pdf/2305.02156.pdf), which finds listwise reranking to outperform pointwise reranking; the latter is somewhat implemented in LangChain as [LLMChainFilter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter.py). - Issue: None - Dependencies: None - Tag maintainer: @hwchase17 @izzymsft - Twitter handle: @HarrisEMitchell Notes: 1. I didn't add anything to `docs`. I wasn't exactly sure which patterns to follow as [cohere reranker is under Retrievers](https://python.langchain.com/docs/integrations/retrievers/cohere-reranker) with other external document retrieval integrations, but other contextual compression is [here](https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/). Happy to contribute to either with some direction. 2. I followed syntax, docstrings, implementation patterns, etc. as well as I could looking at nearby modules. One thing I didn't do was put the default prompt in a separate `.py` file like [Chain Filter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter_prompt.py) and [Chain Extract](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_extract_prompt.py). Happy to follow that pattern if it would be preferred. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-18 20:34:38 +00:00
Erick Friis	334fc1ed1c	mongodb: release 0.1.7 (#24409 )	2024-07-18 18:13:27 +00:00
Eun Hye Kim	07c5c60f63	community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384 ) Description: - Updated the format for the 'Action' section in the planner prompt to ensure it must be one of the tools without additional words. Adjusted the phrasing from "should be" to "must be" for clarity and enforceability. - Corrected the tool appending logic in the `_create_api_controller_agent` function to ensure that `RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are properly added to the tools list for "DELETE" and "PATCH" operations. Issue: #24382 Dependencies: None Twitter handle: @lunara_x --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-18 13:37:46 +00:00
Chen Xiabin	63c60a31f0	[fix] baidu qianfan AiMessage with usage_metadata (#24389 ) make AIMessage usage_metadata has error	2024-07-18 09:28:16 -04:00
ZhangShenao	916b813107	community[patch]: Fix spelling error in ConversationVectorStoreTokenBufferMemory doc-string (#24385 ) Fix word spelling error in `ConversationVectorStoreTokenBufferMemory`	2024-07-18 12:27:36 +00:00
Rajendra Kadam	1c65529fd7	community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393 ) - Description: Minor fixes in the PebbloSafeLoader: - Renamed the loader type from `remote_db` to `cloud_folder`. - Added `SharePointLoader` to the list of loaders supported by PebbloSafeLoader. - Issue: NA - Dependencies: NA - [x] Add tests and docs: NA	2024-07-18 08:23:12 -04:00
Eugene Yurtsev	6182a402f1	experimental[patch]: block a few more things from PALValidator (#24379 ) * Please see security warning already in existing class. * The approach here is fundamentally insecure as it's relying on a block approach rather than an approach based on only running allowed nodes. So users should only use this code if its running from a properly sandboxed environment.	2024-07-18 08:22:45 -04:00
Paolo Ráez	0dec72cab0	Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987 ) ### Description Missing "stream" parameter. Without it, you'd never receive a stream of tokens when using stream() or astream() ### Issue No existing issue available	2024-07-18 02:09:39 +00:00
Eugene Yurtsev	570566b858	core[patch]: Update API reference for astream events (#24359 ) Update the API reference for astream events to include information about custom events.	2024-07-17 21:48:53 -04:00
Bagatur	a4c101ae97	core[patch]: Release 0.2.21 (#24372 )	2024-07-17 22:44:35 +00:00
William FH	c5a07e2dd8	core[patch]: add InjectedToolArg annotation (#24279 ) ```python from typing_extensions import Annotated from langchain_core.tools import tool, InjectedToolArg from langchain_anthropic import ChatAnthropic @tool def multiply(x: int, y: int, not_for_model: Annotated[dict, InjectedToolArg]) -> str: """multiply.""" return x * y ChatAnthropic(model='claude-3-sonnet-20240229',).bind_tools([multiply]).invoke('5 times 3').tool_calls ''' -> [{'name': 'multiply', 'args': {'x': 5, 'y': 3}, 'id': 'toolu_01Y1QazYWhu4R8vF4hF4z9no', 'type': 'tool_call'}] ''' ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-17 15:28:40 -07:00
Erick Friis	80f3d48195	openai: release 0.1.18 (#24369 )	2024-07-17 22:26:33 +00:00
Bagatur	7d83189b19	openai[patch]: use model_name in AzureOpenAI.ls_model_name (#24366 )	2024-07-17 15:24:05 -07:00
Nithish Raghunandanan	eb26b5535a	couchbase: Add chat message history (#24356 ) Description: Add support for chat message history using Couchbase - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>	2024-07-17 15:22:42 -07:00
Eugene Yurtsev	96bac8e20d	core[patch]: Fix regression requiring input_variables in few chat prompt templates (#24360 ) * Fix regression that requires users passing input_variables=[]. * Regression introduced by my own changes to this PR: https://github.com/langchain-ai/langchain/pull/22851	2024-07-17 18:14:57 -04:00
Brice Fotzo	034a8c7c1b	community: support advanced text extraction options for pdf documents (#20265 ) Description: - Updated constructors in PyPDFParser and PyPDFLoader to handle `extraction_mode` and additional kwargs, aligning with the capabilities of `PageObject.extract_text()` from pypdf. - Added `test_pypdf_loader_with_layout` along with a corresponding example text file to validate layout extraction from PDFs. Issue: fixes #19735 Dependencies: This change requires updating the pypdf dependency from version 3.4.0 to at least 4.0.0. Additional changes include the addition of a new test test_pypdf_loader_with_layout and an example text file to ensure the functionality of layout extraction from PDFs aligns with the new capabilities. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:47:09 +00:00
hmasdev	a402de3dae	langchain[patch]: fix wrong `dict` key in `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser` (#23967 ) # Description This PR aims to solve a bug in `OutputFixingParser`, `RetryOutputParser` and `RetryWithErrorOutputParser` The bug is that the wrong keyword argument was given to `retry_chain`. The correct keyword argument is 'completion', but 'input' is used. This pull request makes the following changes: 1. correct a `dict` key given to `retry_chain`; 2. add a test when using the default prompt. - `NAIVE_FIX_PROMPT` for `OutputFixingParser`; - `NAIVE_RETRY_PROMPT` for `RetryOutputParser`; - `NAIVE_RETRY_WITH_ERROR_PROMPT` for `RetryWithErrorOutputParser`; 3. ~~add comments on `retry_chain` input and output types~~ clarify `InputType` and `OutputType` of `retry_chain` # Issue The bug is pointed out in https://github.com/langchain-ai/langchain/pull/19792#issuecomment-2196512928 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:34:46 +00:00
Casey Clements	a47f69a120	partners/mongodb : Significant MongoDBVectorSearch ID enhancements (#23535 ) ## Description This pull-request improves the treatment of document IDs in `MongoDBAtlasVectorSearch`. Class method signatures of add_documents, add_texts, delete, and from_texts now include an `ids:Optional[List[str]]` keyword argument permitting the user greater control. Note that, as before, IDs may also be inferred from `Document.metadata['_id']` if present, but this is no longer required, IDs can also optionally be returned from searches. This PR closes the following JIRA issues. * [PYTHON-4446](https://jira.mongodb.org/browse/PYTHON-4446) MongoDBVectorSearch delete / add_texts function rework * [PYTHON-4435](https://jira.mongodb.org/browse/PYTHON-4435) Add support for "Indexing" * [PYTHON-4534](https://jira.mongodb.org/browse/PYTHON-4534) Ensure datetimes are json-serializable --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 13:26:20 -07:00
Erick Friis	cc2cbfabfc	milvus: release 0.1.2 (#24365 )	2024-07-17 19:42:44 +00:00
Eugene Yurtsev	9e4a0e76f6	core[patch]: Fix one unit test for chat prompt template (#24362 ) Minor change that fixes a unit test that had missing assertions.	2024-07-17 18:56:48 +00:00
Erick Friis	81639243e2	openai: release 0.1.17 (#24361 )	2024-07-17 18:50:42 +00:00
Erick Friis	61976a4147	pinecone: release 0.1.2 (#24355 )	2024-07-17 17:09:07 +00:00
Bagatur	b5360e2e5f	community[patch]: Release 0.2.8 (#24354 )	2024-07-17 17:07:27 +00:00
ccurme	4cf67084d3	openai[patch]: fix key collision and _astream (#24345 ) Fixes small issues introduced in https://github.com/langchain-ai/langchain/pull/24150 (unreleased).	2024-07-17 12:59:26 -04:00
Luis Moros	bcb5f354ad	community: Fix SQLDatabase.from_databricks issue when run from Job (#24346 ) - Description: When SQLDatabase.from_databricks is run from a Databricks Workflow job, line 205 (default_host = context.browserHostName) throws an ``AttributeError`` as the ``context`` object has no ``browserHostName`` attribute. The fix handles the exception and sets the ``default_host`` variable to null --------- Co-authored-by: lmorosdb <lmorosdb> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-17 12:40:12 -04:00
Bagatur	24e9b48d15	langchain[patch]: Release 0.2.9 (#24327 )	2024-07-17 09:39:57 -07:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment neo4j wrapper is using setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced with the non-deprecated version. Neo4j recently introduced a new cypher method to associate embeddings into relations using "setRelationshipVectorProperty" method. In this PR I also implemented a new method to perform this association maintaining the same format used in the "add_embeddings" method which is used to associate embeddings into Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
maang-h	2a3288b15d	docs: Add ChatBaichuan docstrings (#24348 ) - Description: Add ChatBaichuan rich docstrings. - Issue: the issue #22296	2024-07-17 12:00:16 -04:00
Rafael Pereira	fc41730e28	neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338 ) Description: This PR addresses two main issues in the `test_neo4jvector.py`: 1. Order-insensitive Comparison: Modified the `test_retrieval_dictionary` to ensure that it passes regardless of the order of returned values by parsing `page_content` into a structured format (dictionary) before comparison. 2. Floating-point Precision: Updated `test_neo4jvector_relevance_score` to handle minor floating-point precision differences by using the `isclose` function for comparing relevance scores with a relative tolerance. Errors addressed: - test_neo4jvector_relevance_score: ``` AssertionError: assert [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512)] == [(Document(page_content='foo', metadata={'page': '0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706)] At index 0 diff: (Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746) != (Document(page_content='foo', metadata={'page': '0'}), 1.0) Full diff: - [(Document(page_content='foo', metadata={'page': '0'}), 1.0), + [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), ? +++++++++++++++ - (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), ? ^^^ ------ + (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), ? ^^^^^^^^^ - (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706), ? ---------- + (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512), ? 
++++++++++ ] ``` - test_retrieval_dictionary: ``` AssertionError: assert [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n')] == [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] At index 0 diff: Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n') != Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n') Full diff: - [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] ? --------- + [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: John\nage: 30\n')] ? +++++++++ ```	2024-07-17 09:28:25 -04:00
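The two fixes described in the neo4j test commit above (order-insensitive comparison and tolerance-based score comparison) can be sketched with the standard library alone. This is an illustrative sketch, not the code from `test_neo4jvector.py`: the helpers `parse_page_content` and `scores_match` and the tolerance value `1e-5` are hypothetical.

```python
import math
from typing import Any, Dict, List


def parse_page_content(page_content: str) -> Dict[str, Any]:
    """Parse 'key: value' and '- item' lines into a dict (hypothetical helper)."""
    parsed: Dict[str, Any] = {}
    current_key = None
    for line in page_content.strip().split("\n"):
        if line.startswith("- ") and isinstance(parsed.get(current_key), list):
            # A list item belongs to the most recent key that opened a list.
            parsed[current_key].append(line[2:])
        else:
            key, _, value = line.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value starts a list of '- item' lines.
            parsed[current_key] = value if value else []
    return parsed


def scores_match(actual: List[float], expected: List[float], rel_tol: float = 1e-5) -> bool:
    """Compare scores with a relative tolerance instead of exact equality."""
    return len(actual) == len(expected) and all(
        math.isclose(a, e, rel_tol=rel_tol) for a, e in zip(actual, expected)
    )


# The two page_content strings from the failing assertion differ only in key order.
left = "skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n"
right = "skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n"
same_content = parse_page_content(left) == parse_page_content(right)

# The relevance scores from the failing assertion differ only by float noise.
actual_scores = [1.0000014305114746, 0.9998371005058289, 0.9993508458137512]
expected_scores = [1.0, 0.9998376369476318, 0.9993523359298706]
same_scores = scores_match(actual_scores, expected_scores)
```

Comparing parsed dictionaries removes the dependence on key order, and `math.isclose` absorbs differences in the last few digits of the scores.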
Bagatur	80e7cd6cff	core[patch]: Release 0.2.20 (#24322 )	2024-07-16 15:04:36 -07:00
Eugene Yurtsev	616196c620	Docs: Add how to dispatch custom callback events (#24278 ) * Add how-to guide for dispatching custom callback events. * Add links from index to the how to guide * Add link from streaming from within a tool * Update versionadded to correct release https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.2.15	2024-07-16 17:38:32 -04:00
Leonid Ganeline	5ccf8ebfac	core: docstrings `vectorstores` update (#24281 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-16 16:58:11 +00:00
Erick Friis	1e9cc02ed8	openai: raw response headers (#24150 )	2024-07-16 09:54:54 -07:00
Bagatur	dc42279eb5	core[patch]: fix Typing.cast import (#24313 ) Fixes #24287	2024-07-16 16:53:48 +00:00
bovlb	5caa381177	community[minor]: Add ApertureDB as a vectorstore (#24088 ) - Description: this change provides a new community integration that uses ApertureData's ApertureDB as a vector store. - Issue: none - Dependencies: depends on ApertureDB Python SDK - Twitter handle: ApertureData - Tests and docs: Integration tests rely on a local run of a public docker image. Example notebook additionally relies on a local Ollama server. - Lint and test: All lint tests pass. --------- Co-authored-by: Gautam <gautam@aperturedata.io>	2024-07-16 09:32:59 -07:00
frob	c59e663365	community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973 ) Fix doc-string for ollama integration	2024-07-16 14:48:38 +00:00
Leonid Ganeline	5fcf2ef7ca	core: docstrings `documents` (#23506 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-16 10:43:54 -04:00
Rahul Raghavendra Choudhury	f5a38772a8	community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270 ) Using TavilySearchAPIRetriever with any conversation chain raises the error `TypeError: Client.__init__() got an unexpected keyword argument 'api_key'`. This happens because the retriever class uses the deprecated `Client` class; `TavilyClient` needs to be used instead. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-16 13:35:28 +00:00
Shenhai Ran	5f2dea2b20	core[patch]: Add encoding options when create prompt template from a file (#24054 ) - Uses default utf-8 encoding for loading prompt templates from file --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-16 09:35:09 -04:00
Chen Xiabin	69b1603173	baidu qianfan AiMessage with usage_metadata (#24288 ) add usage_metadata to qianfan AIMessage. Thanks	2024-07-16 09:30:50 -04:00
Leonid Ganeline	198b85334f	core[patch]: docstrings `langchain_core/` files update (#24285 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-16 09:21:51 -04:00
Dobiichi-Origami	7aeaa1974d	community[patch]: change the class of `qianfan_ak` and `qianfan_sk` parameters (#24293 ) - Description: we changed the class of two parameters to fix a bug, which causes validation failure when using QianfanEmbeddingEndpoint	2024-07-16 09:17:48 -04:00
Tibor Reiss	1c753d1e81	core[patch]: Update typing for template format to include jinja2 as a Literal (#24144 ) Fixes #23929 via adjusting the typing	2024-07-16 09:09:42 -04:00
Erick Friis	d2f671271e	langchain: fix extended test (#24282 )	2024-07-15 15:29:48 -07:00
Lage Ragnarsson	a3c10fc6ce	community: Add support for specifying hybrid search for Databricks vector search (#23528 ) Description: Databricks Vector Search recently added support for hybrid keyword-similarity search. See [usage examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint) from their documentation. This PR updates the Langchain vectorstore interface for Databricks to enable the user to pass the query_type parameter to similarity_search to make use of this functionality. By default, there will not be any changes for existing users of this interface. To use the new hybrid search feature, it is now possible to do ```python # ... dvs = DatabricksVectorSearch(index) dvs.similarity_search("my search query", query_type="HYBRID") ``` Or using the retriever: ```python retriever = dvs.as_retriever( search_kwargs={ "query_type": "HYBRID", } ) retriever.invoke("my search query") ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-15 22:14:08 +00:00
Christopher Tee	5171ffc026	community(you): Integrate You.com conversational APIs (#23046 ) You.com is releasing two new conversational APIs: Smart and Research. This PR: - integrates those APIs with Langchain, as an LLM - streaming is supported	2024-07-15 17:46:58 -04:00
maang-h	6c7d9f93b9	feat: Add ChatTongyi structured output (#24187 ) - Description: Add `with_structured_output` method to ChatTongyi to support structured output.	2024-07-15 15:57:21 -04:00
Chen Xiabin	8f4620f4b8	baidu qianfan streaming token_usage (#24117 ) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 19:52:31 +00:00
maang-h	9d97de34ae	community[patch]: Improve ChatBaichuan init args and role (#23878 ) - Description: Improve ChatBaichuan init args and role - ChatBaichuan adds `system` role - alias: `baichuan_api_base` -> `base_url` - `with_search_enhance` is deprecated - Add `max_tokens` argument	2024-07-15 15:17:00 -04:00
Erick Friis	56cca23745	openai: remove some params from default serialization (#24280 )	2024-07-15 18:53:36 +00:00
mrugank-wadekar	66bebeb76a	partners: add similarity search by image functionality to langchain_chroma partner package (#22982 ) - Description: This pull request introduces two new methods to the Langchain Chroma partner package that enable similarity search based on image embeddings. These methods enhance the package's functionality by allowing users to search for images similar to a given image URI. Also introduces a notebook to demonstrate its use. - Issue: N/A - Dependencies: None - Twitter handle: @mrugank9009 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-15 18:48:22 +00:00
pm390	b0aa915dea	community[patch]: use asyncio.sleep instead of sleep in OpenAI Assistant async (#24275 ) Description: Implemented async sleep using asyncio instead of synchronous sleep in openAI Assistants Issue: 24194 Dependencies: asyncio Twitter handle: pietromald60939	2024-07-15 18:14:39 +00:00
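The change described in the commit above (swapping a synchronous sleep for `asyncio.sleep` inside async code) can be illustrated with a small polling loop. This is a minimal sketch under stated assumptions, not the OpenAI Assistant code: the function `poll_until_done`, its `is_done` callback, and the interval are hypothetical.

```python
import asyncio
import time
from typing import Callable, List


async def poll_until_done(is_done: Callable[[], bool], interval: float = 0.01) -> int:
    """Poll a (hypothetical) status check, yielding to the event loop between polls."""
    polls = 0
    while not is_done():
        polls += 1
        # asyncio.sleep suspends only this coroutine. Calling time.sleep here
        # would block the whole event loop and stall every other task.
        await asyncio.sleep(interval)
    return polls


async def main() -> List[int]:
    deadline = time.monotonic() + 0.05

    def is_done() -> bool:
        return time.monotonic() >= deadline

    # Both pollers make progress concurrently because neither blocks the loop.
    return await asyncio.gather(poll_until_done(is_done), poll_until_done(is_done))


results = asyncio.run(main())
```

With a blocking `time.sleep`, the second poller could not run while the first one was waiting.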
wenngong	a001037319	retrievers: MultiVectorRetriever similarity_score_threshold search type (#23539 ) Description: support MultiVectorRetriever similarity_score_threshold search type. Issue: #23387 #19404 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-15 13:31:34 -04:00
Carlos André Antunes	20151384d7	fix azure_openai.py: some keys do not exists (#24158 ) Some lines tried to read a key that does not exist yet. In these cases the direct access was changed to the `dict.get()` method.	2024-07-15 17:17:05 +00:00
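The difference between direct key access and `dict.get()` that the commit above relies on can be shown in a few lines. This is a generic sketch, not the code from `azure_openai.py`: the function `read_finish_reason` and the key names `finish_reason` and `choices` are assumptions chosen for illustration.

```python
from typing import Any, Dict, List, Optional


def read_finish_reason(choice: Dict[str, Any]) -> Optional[str]:
    """Return the finish reason if present, else None (hypothetical key name)."""
    # choice["finish_reason"] would raise KeyError when the key is absent;
    # .get() returns None instead, so partially filled payloads are tolerated.
    return choice.get("finish_reason")


def read_choices(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the list of choices, or an empty list when the key is missing."""
    return payload.get("choices", [])


missing = read_finish_reason({})
present = read_finish_reason({"finish_reason": "stop"})
no_choices = read_choices({})
```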
blueoom	d895614d19	text_splitters: add request parameters for function HTMLHeaderTextSplitter.split_text… (#24178 ) Description: The `split_text_from_url` method of `HTMLHeaderTextSplitter` does not include parameters like `timeout` when using `requests` to send a request. Therefore, I suggest adding a `kwargs` parameter to the function, which can be passed as arguments to `requests.get()` internally, allowing control over the `get` request. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 16:43:56 +00:00
Bagatur	9d0c1d2dc9	docs: specify init_chat_model version (#24274 )	2024-07-15 16:29:06 +00:00
Bagatur	c9473367b1	langchain[patch]: Release 0.2.8 (#24273 )	2024-07-15 16:05:51 +00:00
JP-Ellis	f77659463a	core[patch]: allow message utils to work with lcel (#23743 ) The function `convert_to_messages` now accepts a broader set of arguments: 1. Previously, it could only take a `Sequence` in order to iterate over it. This has been broadened slightly to an `Iterable` (which should have no other impact). 2. Support for `PromptValue` and `BaseChatPromptTemplate` has been added. These are generated when combining messages using the overloaded `+` operator. Functions which rely on `convert_to_messages` (namely `filter_messages`, `merge_message_runs` and `trim_messages`) have had the type of their arguments similarly expanded. Resolves #23706. --------- Signed-off-by: JP-Ellis <josh@jpellis.me> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-07-15 08:58:05 -07:00
Harold Martin	ccdaf14eff	docs: Spell check fixes (#24217 ) Description: Spell check fixes for docs, comments, and a couple of strings. No code change e.g. variable names. Issue: none Dependencies: none Twitter handle: hmartin	2024-07-15 15:51:43 +00:00
Leonid Ganeline	cacdf96f9c	core docstrings `tracers` update (#24211 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-15 11:37:09 -04:00
Leonid Ganeline	36ee083753	core: docstrings `utils` update (#24213 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-15 11:36:00 -04:00
thehunmonkgroup	e8a21146d3	community[patch]: upgrade default model for ChatAnyscale (#24232 ) Old default `meta-llama/Llama-2-7b-chat-hf` no longer supported.	2024-07-15 11:34:59 -04:00
Bagatur	620b118c70	core[patch]: Release 0.2.19 (#24272 )	2024-07-15 07:51:30 -07:00
ccurme	888fbc07b5	core[patch]: support passing `args_schema` through `as_tool` (#24269 ) Note: this allows the schema to be passed in positionally. ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.runnables import RunnableLambda class Add(BaseModel): """Add two integers together.""" a: int = Field(..., description="First integer") b: int = Field(..., description="Second integer") def add(input: dict) -> int: return input["a"] + input["b"] runnable = RunnableLambda(add) as_tool = runnable.as_tool(Add) as_tool.args_schema.schema() ``` ``` {'title': 'Add', 'description': 'Add two integers together.', 'type': 'object', 'properties': {'a': {'title': 'A', 'description': 'First integer', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'Second integer', 'type': 'integer'}}, 'required': ['a', 'b']} ```	2024-07-15 07:51:05 -07:00
ccurme	ab2d7821a7	fireworks[patch]: use firefunction-v2 in standard tests (#24264 )	2024-07-15 13:15:08 +00:00
ccurme	6fc7610b1c	standard-tests[patch]: update test_bind_runnables_as_tools (#24241 ) Reduce number of tool arguments from two to one.	2024-07-15 08:35:07 -04:00
Bagatur	0da5078cad	langchain[minor]: Generic configurable model (#23419 ) alternative to [23244](https://github.com/langchain-ai/langchain/pull/23244). allows you to use chat model declarative methods ![Screenshot 2024-06-25 at 1 07 10 PM](https://github.com/langchain-ai/langchain/assets/22008038/910d1694-9b7b-46bc-bc2e-3792df9321d6)	2024-07-15 01:11:01 +00:00
Bagatur	d0728b0ba0	core[patch]: add tool name to tool message (#24243 ) Copying current ToolNode behavior	2024-07-15 00:42:40 +00:00
Bagatur	5c3e2612da	core[patch]: Release 0.2.18 (#24230 )	2024-07-13 09:14:43 -07:00
Bagatur	65321bf975	core[patch]: fix ToolCall "type" when streaming (#24218 )	2024-07-13 08:59:03 -07:00
Anush	a653b209ba	qdrant: test new QdrantVectorStore (#24165 ) ## Description This PR adds integration tests to follow up on #24164. By default, the tests use an in-memory instance. To run the full suite of tests, with both in-memory and Qdrant server: ``` $ docker run -p 6333:6333 qdrant/qdrant $ make test $ make integration_test ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:59:30 +00:00
Roman Solomatin	f071581aea	openai[patch]: update openai params (#23691 ) Description: Explicitly add parameters from openai API --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 16:53:33 -07:00
Leonid Ganeline	f0a7581b50	milvus: docstring (#23151 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference) --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:25:31 +00:00
Christian D. Glissov	474b88326f	langchain_qdrant: Added method "_asimilarity_search_with_relevance_scores" to Qdrant class (#23954 ) I stumbled upon a bug that led to different similarity scores between the async and sync similarity searches with relevance scores in Qdrant. The reason is that _asimilarity_search_with_relevance_scores is missing, which makes langchain_qdrant fall back to the method of the vectorstore base class, leading to drastically different results. To illustrate the magnitude, here are the results of running an identical search in a test vectorstore. Output of asimilarity_search_with_relevance_scores: [0.9902903374601824, 0.9472135924938804, 0.8535534011299859] Output of similarity_search_with_relevance_scores: [0.9805806749203648, 0.8944271849877607, 0.7071068022599718] Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 23:25:20 +00:00
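A quick arithmetic check on the two score lists quoted in the commit above: each async score equals (1 + sync score) / 2, which is consistent with the two code paths differing by one rescaling step. The snippet below only verifies that relationship on the quoted numbers. It makes no claim about which formula either method uses internally, and the variable names are illustrative.

```python
import math

# Scores copied from the commit message above.
async_scores = [0.9902903374601824, 0.9472135924938804, 0.8535534011299859]
sync_scores = [0.9805806749203648, 0.8944271849877607, 0.7071068022599718]

# Rescale each sync score from s to (1 + s) / 2 and compare with the async score.
rescaled = [(1.0 + s) / 2.0 for s in sync_scores]
matches = [math.isclose(a, r, rel_tol=1e-12) for a, r in zip(async_scores, rescaled)]
all_match = all(matches)
```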
Bagatur	bdc03997c9	standard-tests[patch]: check for ToolCall["type"] (#24209 )	2024-07-12 16:17:34 -07:00
Miroslav	aee55eda39	community: Skip Login to HuggingFaceHub when token is not set (#21561 ) - Where: langchain, community, llm, huggingface_endpoint - Description: Skip login to huggingface hub when `huggingfacehub_api_token` is not set. This is needed when using custom `endpoint_url` outside of HuggingFaceHub. - Issue: fixes https://github.com/langchain-ai/langchain/issues/20342 and https://github.com/langchain-ai/langchain/issues/19685 - Dependencies: None - Tests and docs: 1. Tested with locally available TGI endpoint 2. Example Usage ```python from langchain_community.llms import HuggingFaceEndpoint llm = HuggingFaceEndpoint( endpoint_url='http://localhost:8080', server_kwargs={ "headers": {"Content-Type": "application/json"} } ) resp = llm.invoke("Tell me a joke") print(resp) ``` Also tested against HF Endpoints ```python from langchain_community.llms import HuggingFaceEndpoint huggingfacehub_api_token = "hf_xyz" repo_id = "mistralai/Mistral-7B-Instruct-v0.2" llm = HuggingFaceEndpoint( huggingfacehub_api_token=huggingfacehub_api_token, repo_id=repo_id, ) resp = llm.invoke("Tell me a joke") print(resp) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 22:10:32 +00:00
Anush	d09dda5a08	qdrant: Bump patch version (#24168 ) # Description To release a new version of `langchain-qdrant` after #24165 and #24166.	2024-07-12 14:48:50 -07:00
Bagatur	12950cc602	standard-tests[patch]: improve runnable tool description (#24210 )	2024-07-12 21:33:56 +00:00
Erick Friis	e8ee781a42	ibm: move to external repo (#24208 )	2024-07-12 21:14:24 +00:00
Bagatur	02e71cebed	together[patch]: Release 0.1.4 (#24205 )	2024-07-12 13:59:58 -07:00
Bagatur	259d4d2029	anthropic[patch]: Release 0.1.20 (#24204 )	2024-07-12 13:59:15 -07:00
Bagatur	3aed74a6fc	fireworks[patch]: Release 0.1.5 (#24203 )	2024-07-12 13:58:58 -07:00
Bagatur	13b0d7ec8f	openai[patch]: Release 0.1.16 (#24202 )	2024-07-12 13:58:39 -07:00
Bagatur	71cd6e6feb	groq[patch]: Release 0.1.7 (#24201 )	2024-07-12 13:58:19 -07:00
Bagatur	99054e19eb	mistralai[patch]: Release 0.1.10 (#24200 )	2024-07-12 13:57:58 -07:00
Bagatur	7a1321e2f9	ibm[patch]: Release 0.1.10 (#24199 )	2024-07-12 13:57:38 -07:00
Bagatur	cb5031f22f	integrations[patch]: require core >=0.2.17 (#24207 )	2024-07-12 20:54:01 +00:00
Nithish Raghunandanan	f1618ec540	couchbase: Add standard and semantic caches (#23607 ) Description: Add support for caching (standard + semantic) LLM responses using Couchbase --------- Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 20:30:03 +00:00
Eugene Yurtsev	8d82a0d483	core[patch]: Mark GraphVectorStore as beta (#24195 ) * This PR marks graph vectorstore as beta	2024-07-12 14:28:06 -04:00
Bagatur	0a1e475a30	core[patch]: Release 0.2.17 (#24189 )	2024-07-12 17:08:29 +00:00
Bagatur	6166ea67a8	core[minor]: rename ToolMessage.raw_output -> artifact (#24185 )	2024-07-12 09:52:44 -07:00
Jean Nshuti	d77d9bfc00	community[patch]: update typo document content returned from semanticscholar (#24175 ) Update "astract" -> abstract	2024-07-12 15:40:47 +00:00
Leonid Ganeline	aa3e3cfa40	core[patch]: docstrings `runnables` update (#24161 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-12 11:27:06 -04:00
Tomaz Bratanic	d3a2b9fae0	Fix neo4j type error on missing constraint information (#24177 ) If you use `refresh_schema=False`, then the metadata constraint doesn't exist. ATM, we used default `None` in the constraint check, but then `any` fails because it can't iterate over None value	2024-07-12 06:39:29 -04:00
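The failure mode described in the commit above (`any` cannot iterate over `None`) and one way to guard against it can be sketched as follows. This is not the neo4j integration code: the function `has_constraint` and the key names `constraint` and `labelsOrTypes` are assumptions used for illustration.

```python
from typing import Any, Dict, List


def has_constraint(schema_metadata: Dict[str, Any], label: str) -> bool:
    """Check whether any constraint mentions the label (hypothetical schema layout)."""
    # With a default of None, any(... for c in None) raises
    # "TypeError: 'NoneType' object is not iterable". Falling back to an
    # empty list makes the check return False when no constraints exist.
    constraints: List[Dict[str, Any]] = schema_metadata.get("constraint") or []
    return any(label in c.get("labelsOrTypes", []) for c in constraints)


no_metadata = has_constraint({}, "Person")
explicit_none = has_constraint({"constraint": None}, "Person")
found = has_constraint({"constraint": [{"labelsOrTypes": ["Person"]}]}, "Person")
```

Using `or []` also covers the case where the key is present but explicitly set to `None`.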
Anush	7014d07cab	qdrant: new Qdrant implementation (#24164 )	2024-07-12 04:52:02 +02:00
Xander Dumaine	35784d1c33	langchain[minor]: add document_variable_name to create_stuff_documents_chain (#24083 ) - Description: `StuffDocumentsChain` uses `LLMChain` which is deprecated by langchain runnables. `create_stuff_documents_chain` is the replacement, but needs support for `document_variable_name` to allow multiple uses of the chain within a longer chain. - Issue: none - Dependencies: none	2024-07-12 02:31:46 +00:00
Eugene Yurtsev	8858846607	milvus[patch]: Fix Milvus vectorstore for newer versions of langchain-core (#24152 ) Fix for: https://github.com/langchain-ai/langchain/issues/24116 This keeps the old behavior of add_documents and add_texts	2024-07-11 18:51:18 -07:00
thedavgar	ffe6ca986e	community: Fix Bug in Azure Search Vectorstore search asynchronously (#24081 ) Description: This PR fixes the bug described in issue #24064, which occurs when using the AzureSearch Vectorstore with the asynchronous search methods, which are also the methods used by the retriever. The proposed change just makes access to the embedding optional, because it is not used anywhere to retrieve documents. The synchronous retrieval methods do not use the embedding either. With this PR the code given by the user in the issue works. ```python vectorstore = AzureSearch( azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"), azure_search_key=os.getenv("AI_SEARCH_API_KEY"), index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"), fields=fields, embedding_function=encoder, ) retriever = vectorstore.as_retriever(search_type="hybrid", k=2) await vectorstore.avector_search("what is the capital of France") await retriever.ainvoke("what is the capital of France") ``` Issue: The Azure Search Vectorstore is not working when searching for documents with asynchronous methods, as described in issue #24064 Dependencies: There are no extra dependencies required for this change. --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-11 18:32:19 -07:00
Anush	7790d67f94	qdrant: New sparse embeddings provider interface - PART 1 (#24015 ) ## Description This PR introduces a new sparse embedding provider interface to work with the new Qdrant implementation that will follow this PR. Additionally, an implementation of this interface is provided with https://github.com/qdrant/fastembed. This PR will be followed by https://github.com/Anush008/langchain/pull/3.	2024-07-11 17:07:25 -07:00
Erick Friis	1132fb801b	core: release 0.2.16 (#24159 )	2024-07-11 23:59:41 +00:00
Nuno Campos	1d37aa8403	core: Remove extra newline (#24157 )	2024-07-11 23:55:36 +00:00
ccurme	cb95198398	standard-tests[patch]: add tests for runnables as tools and streaming usage metadata (#24153 )	2024-07-11 18:30:05 -04:00
Bagatur	8d100c58de	core[patch]: Tool accept RunnableConfig (#24143 ) Relies on #24038 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 22:13:17 +00:00
Bagatur	5fd1e67808	core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038 ) Changes: - ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type" parameter now - LLM integration packages add "type" to all the above - Tool supports ToolCall inputs that have "type" specified - Tool outputs ToolMessage when a ToolCall is passed as input - Tools can separately specify ToolMessage.content and ToolMessage.raw_output - Tools emit events for validation errors (using on_tool_error and on_tool_end) Example: ```python @tool("structured_api", response_format="content_and_raw_output") def _mock_structured_tool_with_raw_output( arg1: int, arg2: bool, arg3: Optional[dict] = None ) -> Tuple[str, dict]: """A Structured Tool""" return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3} def test_tool_call_input_tool_message_with_raw_output() -> None: tool_call: Dict = { "name": "structured_api", "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}}, "id": "123", "type": "tool_call", } expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123") tool = _mock_structured_tool_with_raw_output actual = tool.invoke(tool_call) assert actual == expected tool_call.pop("type") with pytest.raises(ValidationError): tool.invoke(tool_call) actual_content = tool.invoke(tool_call["args"]) assert actual_content == expected.content ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 14:54:02 -07:00
Bagatur	eeb996034b	core[patch]: Release 0.2.15 (#24149 )	2024-07-11 21:34:25 +00:00
Nuno Campos	03fba07d15	core[patch]: Update styles for mermaid graphs (#24147 )	2024-07-11 14:19:36 -07:00
ccurme	8ee8ca7c83	core[patch]: propagate `parse_docstring` to tool decorator (#24123 ) Disabled by default. ```python from langchain_core.tools import tool @tool(parse_docstring=True) def foo(bar: str, baz: int) -> str: """The foo. Args: bar: this is the bar baz: this is the baz """ return bar foo.args_schema.schema() ``` ```json { "title": "fooSchema", "description": "The foo.", "type": "object", "properties": { "bar": { "title": "Bar", "description": "this is the bar", "type": "string" }, "baz": { "title": "Baz", "description": "this is the baz", "type": "integer" } }, "required": [ "bar", "baz" ] } ```	2024-07-11 20:11:45 +00:00
Jacob Lee	f1f1f75782	community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085 )	2024-07-11 21:45:30 +02:00
Eugene Yurtsev	4ba14adec6	core[patch]: Clean up indexing test code (#24139 ) Refactor the code to use the existing InMemoryVectorStore. This change is needed for another PR that moves some of the imports around (and messes up the mock.patch in this file)	2024-07-11 18:54:46 +00:00
Atul R	457677c1b7	community: Fixes use of ImagePromptTemplate with Ollama (#24140 ) Description: ImagePromptTemplate for Multimodal llms like llava when using Ollama Twitter handle: https://x.com/a7ulr Details: When using llava models / any ollama multimodal llms and passing images in the prompt as urls, langchain breaks with this error. ```python image_url_components = image_url.split(",") ^^^^^^^^^^^^^^^^^^^^ AttributeError: 'dict' object has no attribute 'split' ``` It looks like there was a bug where the condition checked for a `url` field in the variable but never actually assigned it. This PR fixes ImagePromptTemplate for Multimodal llms like llava when using Ollama specifically. @hwchase17	2024-07-11 11:31:48 -07:00
Matt	8327925ab7	community:support additional Azure Search Options (#24134 ) - Description: Support additional kwargs options for the Azure Search client (Described here https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations) - Issue: N/A - Dependencies: No additional Dependencies ---------	2024-07-11 18:22:36 +00:00
ccurme	122e80e04d	core[patch]: add versionadded to `as_tool` (#24138 )	2024-07-11 18:08:08 +00:00
Erick Friis	c4417ea93c	core: release 0.2.14, remove poetry 1.7 incompatible flag from root (#24137 )	2024-07-11 17:59:51 +00:00
Isaac Francisco	7a62d3dbd6	standard-tests[patch]: test that bind_tools can accept regular python function (#24135 )	2024-07-11 17:42:17 +00:00
Nuno Campos	2428984205	core: Add metadata to graph json repr (#24131 )	2024-07-11 17:23:52 +00:00
Harley Gross	ea3cd1ebba	community[minor]: added support for C in RecursiveCharacterTextSplitter (#24091 ) Description: Added support for C in RecursiveCharacterTextSplitter by reusing the separators for C++	2024-07-11 16:47:48 +00:00
Nuno Campos	3e454d7568	core: fix docstring (#24129 )	2024-07-11 16:38:14 +00:00
Eugene Yurtsev	08638ccc88	community[patch]: QianfanLLMEndpoint fix type information for the keys (#24128 ) Fix for issue: https://github.com/langchain-ai/langchain/issues/24126	2024-07-11 16:24:26 +00:00
Nuno Campos	ee3fe20af4	core: mermaid: Render metadata key-value pairs when drawing mermaid graph (#24103 ) - if node is runnable binding with metadata attached	2024-07-11 16:22:23 +00:00
Eugene Yurtsev	1e7d8ba9a6	ci[patch]: Update community linter to provide a helpful error message (#24127 ) Update community import linter to explain what's wrong	2024-07-11 16:22:08 +00:00
maang-h	16e178a8c2	docs: Add MiniMaxChat docstrings (#24026 ) - Description: Add MiniMaxChat rich docstrings. - Issue: the issue #22296	2024-07-11 10:55:02 -04:00
Christophe Bornet	5fc5ef2b52	community[minor]: Add graph store extractors (#24065 ) This adds an extractor interface and an implementation for HTML pages. Extractors are used to create GraphVectorStore Links on loaded content. Twitter handle: cbornet_	2024-07-11 10:35:31 -04:00
maang-h	9bcf8f867d	docs: Add SQLChatMessageHistory docstring (#23978 ) - Description: Add SQLChatMessageHistory docstring. - Issue: the issue #21983 Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-11 14:24:28 +00:00
Rafael Pereira	092e9ee0e6	community[minor]: Neo4j Fixed similarity docs (#23913 ) Description: Documentation was missing for the `filter` and `params` attributes in the similarity search methods. --------- Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-11 10:16:48 -04:00
Eugene Yurtsev	dc131ac42a	core[minor]: Add dispatching for custom events (#24080 ) This PR allows dispatching adhoc events for a given run. # Context This PR allows users to send arbitrary data to the callback system and to the astream events API from within a given runnable. This can be extremely useful to surface custom information to end users about progress etc. Integration with langsmith tracer will be done separately since the data cannot be currently visualized. It'll be accommodated using the events attribute of the Run # Examples with astream events ```python from langchain_core.callbacks import adispatch_custom_event from langchain_core.tools import tool @tool async def foo(x: int) -> int: """Foo""" await adispatch_custom_event("event1", {"x": x}) await adispatch_custom_event("event2", {"x": x}) return x + 1 async for event in foo.astream_events({'x': 1}, version='v2'): print(event) ``` ```python {'event': 'on_tool_start', 'data': {'input': {'x': 1}}, 'name': 'foo', 'tags': [], 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'metadata': {}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_tool_end', 'data': {'output': 2}, 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []} ``` ```python from langchain_core.callbacks import adispatch_custom_event from langchain_core.runnables import RunnableLambda @RunnableLambda async def foo(x: int) -> int: """Foo""" await adispatch_custom_event("event1", {"x": x}) await adispatch_custom_event("event2", {"x": x}) return x + 1 async for event in foo.astream_events(1, version='v2'): print(event) ``` ```python {'event': 'on_chain_start', 'data': {'input': 1}, 
'name': 'foo', 'tags': [], 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'metadata': {}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []} {'event': 'on_chain_stream', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'data': {'chunk': 2}, 'parent_ids': []} {'event': 'on_chain_end', 'data': {'output': 2}, 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []} ``` # Examples with handlers This is copied from the unit tests ```python class CustomCallbackManager(BaseCallbackHandler): def __init__(self) -> None: self.events: List[Any] = [] def on_custom_event( self, name: str, data: Any, *, run_id: UUID, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, **kwargs: Any, ) -> None: assert kwargs == {} self.events.append( ( name, data, run_id, tags, metadata, ) ) callback = CustomCallbackManager() run_id = uuid.UUID(int=7) @RunnableLambda def foo(x: int, config: RunnableConfig) -> int: dispatch_custom_event("event1", {"x": x}) dispatch_custom_event("event2", {"x": x}, config=config) return x foo.invoke(1, {"callbacks": [callback], "run_id": run_id}) assert callback.events == [ ("event1", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}), ("event2", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}), ] ```	2024-07-11 02:25:12 +00:00
Erick Friis	71c2221f8c	openai: release 0.1.15 (#24097 )	2024-07-10 16:45:42 -07:00
Erick Friis	6ea6f9f7bc	core: release 0.2.13 (#24096 )	2024-07-10 16:39:15 -07:00
ccurme	975b6129f6	core[patch]: support conversion of runnables to tools (#23992 ) Open to other thoughts on UX. string input: ```python as_tool = retriever.as_tool() as_tool.invoke("cat") # [Document(...), ...] ``` typed dict input: ```python class Args(TypedDict): key: int def f(x: Args) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(f).as_tool( name="my tool", description="description", # name, description are inferred if not supplied ) as_tool.invoke({"key": 3}) # "6" ``` for untyped dict input, allow specification of parameters + types ```python def g(x: Dict[str, Any]) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(g).as_tool(arg_types={"key": int}) result = as_tool.invoke({"key": 3}) # "6" ``` Passing the `arg_types` is slightly awkward but necessary to ensure tool calls populate parameters correctly: ```python from typing import Any, Dict from langchain_core.runnables import RunnableLambda from langchain_openai import ChatOpenAI def f(x: Dict[str, Any]) -> str: return str(x["key"] * 2) runnable = RunnableLambda(f) as_tool = runnable.as_tool(arg_types={"key": int}) llm = ChatOpenAI().bind_tools([as_tool]) result = llm.invoke("Use the tool on 3.") tool_call = result.tool_calls[0] args = tool_call["args"] assert args == {"key": 3} as_tool.run(args) ``` Contrived (?) 
example with langgraph agent as a tool: ```python from typing import List, Literal from typing_extensions import TypedDict from langchain_openai import ChatOpenAI from langgraph.prebuilt import create_react_agent llm = ChatOpenAI(temperature=0) def magic_function(input: int) -> int: """Applies a magic function to an input.""" return input + 2 agent_1 = create_react_agent(llm, [magic_function]) class Message(TypedDict): role: Literal["human"] content: str agent_tool = agent_1.as_tool( arg_types={"messages": List[Message]}, name="Jeeves", description="Ask Jeeves.", ) agent_2 = create_react_agent(llm, [agent_tool]) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-10 19:29:59 -04:00
Bagatur	6928f4c438	core[minor]: Add ToolMessage.raw_output (#23994 ) Decisions to discuss: 1. is a new attr needed or could additional_kwargs be used for this 2. is raw_output a good name for this attr 3. should raw_output default to {} or None 4. should raw_output be included in serialization 5. do we need to update repr/str to exclude raw_output	2024-07-10 20:11:10 +00:00
Eugene Yurtsev	c4e149d4f1	community[patch]: Add linter to catch @root_validator (#24070 ) - Add linter to prevent further usage of vanilla root validator - Update remaining root validators	2024-07-10 14:51:03 +00:00
ccurme	9c6efadec3	community[patch]: propagate cost information to OpenAI callback (#23996 ) This is enabled following https://github.com/langchain-ai/langchain/pull/22716.	2024-07-10 14:50:35 +00:00
William FH	1e1fd30def	[Core] Fix fstring in logger warning (#24043 )	2024-07-09 19:53:18 -07:00
Ethan Yang	13855ef0c3	[HuggingFace Pipeline] add streaming support (#23852 )	2024-07-09 17:02:00 -04:00
Nuno Campos	859e434932	core: Speed up json parse for large strings (#24036 ) for a large string: - old 4.657918874989264 - new 0.023724667000351474	2024-07-09 12:26:50 -07:00
Nuno Campos	160fc7f246	core: Move json parsing in base chat model / output parser to bg thread (#24031 ) - add version of AIMessageChunk.__add__ that can add many chunks, instead of only 2 - In agenerate_from_stream merge and parse chunks in bg thread - In output parse base classes do more work in bg threads where appropriate --------- Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2024-07-09 12:26:36 -07:00
Nuno Campos	73966e693c	openai: Create msg chunk in bg thread (#24032 )	2024-07-09 12:01:51 -07:00
Erick Friis	007c5a85d5	multiple: use modern installer in poetry (#23998 )	2024-07-08 18:50:48 -07:00
Erick Friis	e80c150c44	community: release 0.2.7 (prev was langchain) (#23997 )	2024-07-08 23:43:32 +00:00
Erick Friis	9f8fd08955	community: release 0.2.7 (#23993 )	2024-07-08 22:04:58 +00:00
Erick Friis	bedd893cd1	core: release 0.2.12 (#23991 )	2024-07-08 21:29:29 +00:00
Bagatur	1e957c0c23	docs: rm discord (#23985 )	2024-07-08 14:27:58 -07:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	aa8c9bb4a9	community[patch]: Add constraint for pdfminer.six to unbreak CI (#23988 ) Something changed in pdfminer.six. This PR unbreaks CI without fixing the underlying PDF parser.	2024-07-08 20:55:19 +00:00
Eugene Yurtsev	2c180d645e	core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841 ) This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!	2024-07-08 16:09:29 -04:00
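The idea behind `@pre_init` in the entry above can be illustrated without pydantic. The following is a minimal sketch, not the library's implementation: `pre_init_sketch`, `FIELD_DEFAULTS`, and `validate_environment` are hypothetical names, and the defaults are passed in explicitly, whereas the real decorator would read them from the model's field definitions.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for a model's declared field defaults (assumption:
# the real decorator derives these from the pydantic model itself).
FIELD_DEFAULTS: Dict[str, Any] = {"temperature": 0.7, "max_tokens": 256}

Validator = Callable[[Dict[str, Any]], Dict[str, Any]]


def pre_init_sketch(defaults: Dict[str, Any]) -> Callable[[Validator], Validator]:
    """Wrap a 'pre' validator so it always sees defaults for missing fields."""

    def decorator(func: Validator) -> Validator:
        def wrapper(values: Dict[str, Any]) -> Dict[str, Any]:
            # Fill in defaults first; explicitly supplied values take priority.
            populated = {**defaults, **values}
            return func(populated)

        return wrapper

    return decorator


@pre_init_sketch(FIELD_DEFAULTS)
def validate_environment(values: Dict[str, Any]) -> Dict[str, Any]:
    # Indexing is safe here: the key exists even if the caller omitted it.
    if not 0.0 <= values["temperature"] <= 2.0:
        raise ValueError("temperature out of range")
    return values


# The caller omits "temperature"; the validator still sees its default.
result = validate_environment({"max_tokens": 16})
```

With a plain pre-validator, `values["temperature"]` would raise `KeyError` whenever the caller omitted the field, which is the situation the decorator is meant to avoid.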
Eugene Yurtsev	9787552b00	core[patch]: Use InMemoryChatMessageHistory in unit tests (#23916 ) Update unit test to use the existing implementation of chat message history	2024-07-05 20:10:54 +00:00
Rajendra Kadam	8b84457b17	community[minor]: Support PGVector in PebbloRetrievalQA (#23874 ) - Description: Support PGVector in PebbloRetrievalQA - Identity and Semantic Enforcement support for PGVector - Refactor Vectorstore validation and name check - Clear the overridden identity and semantic enforcement filters - Issue: NA - Dependencies: NA - Tests: NA(already added) - Docs: Updated - Twitter handle: [@Raj__725](https://twitter.com/Raj__725)	2024-07-05 16:02:25 -04:00
Eugene Yurtsev	e0186df56b	core[patch]: Clarify upsert response semantics (#23921 )	2024-07-05 15:59:47 -04:00
Rajendra Kadam	ee8aa54f53	community[patch]: Fix source path mismatch in PebbloSafeLoader (#23857 ) Description: Fix for source path mismatch in PebbloSafeLoader. The fix involves storing the full path in the doc metadata in VectorDB Issue: NA, caught in internal testing Dependencies: NA Add tests: Updated tests	2024-07-05 15:24:17 -04:00
Eugene Yurtsev	5b7d5f7729	core[patch]: Add comment to clarify aadd_documents (#23920 ) Add comment to clarify how add documents works	2024-07-05 15:20:16 -04:00
Eugene Yurtsev	e0889384d9	standard-tests[minor]: add unit tests for testing get_by_ids, aget_by_ids, upsert, aupsert_by_ids (#23919 ) These standard unit tests provide standard tests for functionality introduced in these PRs: * https://github.com/langchain-ai/langchain/pull/23774 * https://github.com/langchain-ai/langchain/pull/23594	2024-07-05 19:11:54 +00:00
ccurme	74c7198906	core, anthropic[patch]: support streaming tool calls when function has no arguments (#23915 ) resolves https://github.com/langchain-ai/langchain/issues/23911 When an AIMessageChunk is instantiated, we attempt to parse tool calls off of the tool_call_chunks. Here we add a special-case to this parsing, where `""` will be parsed as `{}`. This is a reaction to how Anthropic streams tool calls in the case where a function has no arguments: ``` {'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1} {'partial_json': '', 'type': 'tool_use', 'index': 1} ``` The `partial_json` does not accumulate to a valid json string-- most other providers tend to emit `"{}"` in this case.	2024-07-05 18:57:41 +00:00
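The special case described in the entry above can be sketched with the standard library alone. This is an illustration under assumptions, not the code from the PR: `parse_tool_args` is a hypothetical helper, and the argument fragments are assumed to arrive as a list of strings.

```python
import json
from typing import Any, Dict, List, Optional


def parse_tool_args(fragments: List[str]) -> Optional[Dict[str, Any]]:
    """Join streamed argument fragments and parse them as a JSON object."""
    joined = "".join(fragments)
    if joined == "":
        # Special case: no argument text at all is read as a zero-argument call.
        return {}
    try:
        parsed = json.loads(joined)
    except json.JSONDecodeError:
        # The fragments do not (yet) form valid JSON.
        return None
    return parsed if isinstance(parsed, dict) else None


no_args = parse_tool_args([""])                  # function without arguments
with_args = parse_tool_args(['{"key"', ": 3}"])  # fragments that complete
partial = parse_tool_args(['{"key"'])            # still incomplete
```

Without the empty-string branch, `json.loads("")` raises `JSONDecodeError`, so a tool call with no arguments would be treated as unparseable.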
Mateusz Szewczyk	902b57d107	IBM: Added WatsonxChat passing params to invoke method (#23758 ) Thank you for contributing to LangChain! - [x] PR title: "IBM: Added WatsonxChat to chat models preview, update passing params to invoke method" - [x] PR message: - Description: Added WatsonxChat passing params to invoke method, added integration tests - Dependencies: `ibm_watsonx_ai` - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-05 18:07:50 +00:00
ccurme	1f5a163f42	langchain[patch]: deprecate QAGenerationChain (#23730 )	2024-07-05 18:06:19 +00:00
ccurme	25de47878b	langchain[patch]: deprecate AnalyzeDocumentChain (#23769 )	2024-07-05 14:00:23 -04:00
Christophe Bornet	42d049f618	core[minor]: Add Graph Store component (#23092 ) This PR introduces a GraphStore component. GraphStore extends VectorStore with the concept of links between documents based on document metadata. This allows linking documents based on a variety of techniques, including common keywords, explicit links in the content, and other patterns. This works with existing Documents, so it’s easy to extend existing VectorStores to be used as GraphStores. The interface can be implemented for any Vector Store technology that supports metadata, not only graph DBs. When retrieving documents for a given query, the first level of search is done using classical similarity search. Next, links may be followed using various traversal strategies to get additional documents. This allows documents to be retrieved that aren’t directly similar to the query but contain relevant information. Two retrieval methods are added alongside the VectorStore ones: * traversal_search which gets all linked documents up to a certain depth * mmr_traversal_search which selects linked documents using an MMR algorithm to have more diverse results. If a retrieval depth of 0 is used, GraphStore is effectively a VectorStore. It enables an easy transition from a simple VectorStore to GraphStore by adding links between documents as a second step. An implementation for Apache Cassandra is also proposed. See https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb for a notebook that explains how to use GraphStore and shows that it can correctly answer questions that a simple VectorStore cannot. Twitter handle: _cbornet	2024-07-05 12:24:10 -04:00
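The depth-limited traversal described in the entry above can be sketched as a breadth-first walk. This is a toy illustration, not the GraphStore implementation: `traversal_search_sketch` and `LINKS` are hypothetical, and the `seeds` argument stands in for the ids that an initial similarity search would return.

```python
from collections import deque
from typing import Deque, Dict, List, Set, Tuple

# Hypothetical link graph: document id -> ids of documents it links to.
LINKS: Dict[str, List[str]] = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}


def traversal_search_sketch(
    seeds: List[str], links: Dict[str, List[str]], depth: int
) -> List[str]:
    """Return the seed ids plus every id reachable within `depth` hops."""
    order: List[str] = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    visited: Set[str] = set(order)
    frontier: Deque[Tuple[str, int]] = deque((s, 0) for s in order)
    while frontier:
        doc_id, hops = frontier.popleft()
        if hops >= depth:
            continue  # do not follow links beyond the depth limit
        for neighbor in links.get(doc_id, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                frontier.append((neighbor, hops + 1))
    return order


depth_zero = traversal_search_sketch(["a"], LINKS, depth=0)
depth_two = traversal_search_sketch(["a"], LINKS, depth=2)
```

With `depth=0` only the seeds come back, which mirrors the statement that a depth of 0 makes the store behave like a plain VectorStore.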
Leonid Ganeline	77f5fc3d55	core: docstrings `load` (#23787 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:23:19 -04:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 ) This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations. The PR makes the following changes: 1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore. 2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and **kwargs. (e.g., ids are not a part of the document) 3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively. 4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API. A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to: * Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout). * Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.) `add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-05 12:21:40 -04:00
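The delegation pattern from the entry above (items 2 and 3) can be sketched with a toy in-memory store. This is an assumption-laden illustration rather than the VectorStore code: `ToyStore` and `Doc` are hypothetical, embeddings are omitted, and `upsert` returns a plain list of ids instead of the real method's response type.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical document type that carries its own id."""

    page_content: str
    id: Optional[str] = None


class ToyStore:
    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def upsert(self, items: List[Doc]) -> List[str]:
        """Insert new documents or overwrite existing ones with the same id."""
        ids: List[str] = []
        for item in items:
            doc_id = item.id or str(uuid.uuid4())
            self._docs[doc_id] = Doc(item.page_content, doc_id)
            ids.append(doc_id)
        return ids

    def add_documents(self, documents: List[Doc]) -> List[str]:
        # Default implementation: delegate to upsert.
        return self.upsert(documents)

    def add_texts(
        self, texts: List[str], ids: Optional[List[str]] = None
    ) -> List[str]:
        # Fold the separately passed ids into the documents, then delegate.
        doc_ids = ids or [None] * len(texts)
        return self.upsert([Doc(t, i) for t, i in zip(texts, doc_ids)])


store = ToyStore()
first_ids = store.add_texts(["hello"], ids=["1"])
store.add_documents([Doc("bye", id="1")])  # same id: overwrites "hello"
```

Because both entry points funnel into `upsert`, a store only has to implement one write path, and ids live on the document rather than in a parallel keyword argument.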
G Sreejith	3c752238c5	core[patch]: Fix typo in docstring (graphm -> graph) (#23910 ) Changes made as requested: replaced graphm with graph.	2024-07-05 16:20:33 +00:00
Leonid Ganeline	12c92b6c19	core: docstrings `outputs` (#23889 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:18:17 -04:00
Leonid Ganeline	1eca98ec56	core: docstrings `prompts` (#23890 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-05 12:17:52 -04:00
Philippe PRADOS	289960bc60	community[patch]: Redis.delete should be a regular method not a static method (#23873 ) The `langchain_community.vectorstores.Redis.delete()` method must not be a `@staticmethod`. With the current implementation, it is not possible to have multiple instances of the Redis vectorstore because all instances must share the `REDIS_URL`. This does not conform to the base class.	2024-07-05 12:04:58 -04:00
Mohammad Mohtashim	2274d2b966	core[patch]: Accounting for Optional Input Variables in BasePromptTemplate (#22851 ) Description: After reviewing the prompts API, it is clear that the only way a user can explicitly mark an input variable as optional is through the `MessagePlaceholder.optional` attribute. Otherwise, the user must explicitly pass in the `input_variables` expected to be used in the `BasePromptTemplate`, which will be validated upon execution. Therefore, to semantically handle a `MessagePlaceholder` `variable_name` as optional, we will treat the `variable_name` of `MessagePlaceholder` as a `partial_variable` if it has been marked as optional. This approach aligns with how the `variable_name` of `MessagePlaceholder` is already handled [here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991). Additionally, an attribute `optional_variable` has been added to `BasePromptTemplate`, and the `variable_name` of `MessagePlaceholder` is also made part of `optional_variable` when marked as optional. Moreover, the `get_input_schema` method has been updated for `BasePromptTemplate` to differentiate between optional and non-optional variables. Issue: #22832, #21425 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-05 15:49:40 +00:00
Klaudia Lemiec	a2082bc1f8	docs: Arxiv docs update (#23871 ) - Description: Update of docstrings and docpages - Issue: [22866](https://github.com/langchain-ai/langchain/issues/22866)	2024-07-05 11:43:51 -04:00
André Quintino	99b1467b63	community: add support for 'cloud' parameter in JiraAPIWrapper (#23057 ) - Description: Enhance JiraAPIWrapper to accept the 'cloud' parameter through an environment variable. This update allows more flexibility in configuring the environment for the Jira API. - Twitter handle: Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-05 15:11:10 +00:00
wenngong	b1e90b3075	community: add model_name param valid for GPT4AllEmbeddings (#23867 ) Description: add model_name param valid for GPT4AllEmbeddings Issue: #23863 #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-05 10:46:34 -04:00
volodymyr-memsql	a4eb6d0fb1	community: add SingleStoreDB semantic cache (#23218 ) This PR adds a `SingleStoreDBSemanticCache` class that implements a cache based on SingleStoreDB vector store, integration tests, and a notebook example. Additionally, this PR contains minor changes to SingleStoreDB vector store: - change add texts/documents methods to return a list of inserted ids - implement delete(ids) method to delete documents by list of ids - add drop() method to drop the corresponding database table - update integration tests to use and check the functionality implemented above CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-07-05 09:26:06 -04:00
Igor Drozdov	bb597b1286	feat(community): add bind_tools function for ChatLiteLLM (#23823 ) It's a follow-up to https://github.com/langchain-ai/langchain/pull/23765 Now the tools can be bound by calling `bind_tools` ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" # tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] tools = [GetWeather, GetPopulation] llm = ChatLiteLLM(model="claude-3-sonnet-20240229").bind_tools(tools) ai_msg = llm.invoke(prompt) print(ai_msg.tool_calls) ``` Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-05 09:19:41 -04:00
Jiejun Tan	2be66a38d8	huggingface: Fix huggingface tei support (#22653 ) Update former pull request: https://github.com/langchain-ai/langchain/pull/22595. Modified `libs/partners/huggingface/langchain_huggingface/embeddings/huggingface_endpoint.py`, where the API call function does not match current [Text Embeddings Inference API](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed). One example is: ```json { "inputs": "string", "normalize": true, "truncate": false } ``` Parameters in `_model_kwargs` are not passed properly in the latest version. By the way, the issue [why cause 413? #50](https://github.com/huggingface/text-embeddings-inference/issues/50) might be solved.	2024-07-03 13:30:29 -07:00
Eugene Yurtsev	9ccc4b1616	core[patch]: Fix logic in BaseChatModel that processes the llm string that is used as a key for caching chat models responses (#23842 ) This PR should fix the following issue: https://github.com/langchain-ai/langchain/issues/23824 Introduced as part of this PR: https://github.com/langchain-ai/langchain/pull/23416 I am unable to reproduce the issue locally though it's clear that we're getting a `serialized` object which is not a dictionary somehow. The test below passes for me prior to the PR as well ```python def test_cache_with_sqllite() -> None: from langchain_community.cache import SQLiteCache from langchain_core.globals import set_llm_cache cache = SQLiteCache(database_path=".langchain.db") set_llm_cache(cache) chat_model = FakeListChatModel(responses=["hello", "goodbye"], cache=True) assert chat_model.invoke("How are you?").content == "hello" assert chat_model.invoke("How are you?").content == "hello" ```	2024-07-03 16:23:55 -04:00
Vadym Barda	9bb623381b	core[minor]: update conversion utils to handle RemoveMessage (#23840 )	2024-07-03 16:13:31 -04:00
Eugene Yurtsev	4ab78572e7	core[patch]: Speed up unit tests for imports (#23837 ) Speed up unit tests for imports	2024-07-03 15:55:15 -04:00
Nico Puhlmann	4a15fce516	langchain: update declarative_base import (#20056 ) Description: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9) --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-03 15:52:35 -04:00
Théo Deschamps	39b19cf764	core[patch]: extract input variables for `path` and `detail` keys in order to format an `ImagePromptTemplate` (#22613 ) - Description: Add support for `path` and `detail` keys in `ImagePromptTemplate`. Previously, only variables associated with the `url` key were considered. This PR allows for the inclusion of a local image path and a detail parameter as input to the format method. - Issues: - fixes #20820 - related to #22024 - Dependencies: None - Twitter handle: @DeschampsTho5 --------- Co-authored-by: tdeschamps <tdeschamps@kameleoon.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 18:58:42 +00:00
Bagatur	a4798802ef	cli[patch]: ruff 0.5 (#23833 )	2024-07-03 18:33:15 +00:00
Leonid Ganeline	55f6f91f17	core[patch]: docstrings `output_parsers` (#23825 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 14:27:40 -04:00
Philippe PRADOS	26cee2e878	partners[patch]: MongoDB vectorstore to return and accept string IDs (#23818 ) The MongoDB vectorstore has some errors. - `add_texts() -> List` returns a list of `ObjectId`, not a list of strings - `delete()` with `id` never removes chunks. --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-03 14:14:08 -04:00
Ikko Eltociear Ashimine	75734fbcf1	community: fix typo in unit tests for test_zenguard.py (#23819 ) enviroment -> environment	2024-07-03 14:05:42 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 ) ```python """python scripts/update_mypy_ruff.py""" import glob import tomllib from pathlib import Path import toml import subprocess import re ROOT_DIR = Path(__file__).parents[1] def main(): for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True): print(path) with open(path, "rb") as f: pyproject = tomllib.load(f) try: pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = ( "^1.10" ) pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = ( "^0.5" ) except KeyError: continue with open(path, "w") as f: toml.dump(pyproject, f) cwd = "/".join(path.split("/")[:-1]) completed = subprocess.run( "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color", cwd=cwd, shell=True, capture_output=True, text=True, ) logs = completed.stdout.split("\n") to_ignore = {} for l in logs: if re.match("^(.*)\:(\d+)\: error:.*\[(.*)\]", l): path, line_no, error_type = re.match( "^(.*)\:(\d+)\: error:.*\[(.*)\]", l ).groups() if (path, line_no) in to_ignore: to_ignore[(path, line_no)].append(error_type) else: to_ignore[(path, line_no)] = [error_type] print(len(to_ignore)) for (error_path, line_no), error_types in to_ignore.items(): all_errors = ", ".join(error_types) full_path = f"{cwd}/{error_path}" try: with open(full_path, "r") as f: file_lines = f.readlines() except FileNotFoundError: continue file_lines[int(line_no) - 1] = ( file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n" ) with open(full_path, "w") as f: f.write("".join(file_lines)) subprocess.run( "poetry run ruff format .; poetry run ruff --select I --fix .", cwd=cwd, shell=True, capture_output=True, text=True, ) if __name__ == "__main__": main() ```	2024-07-03 10:33:27 -07:00
William FH	6cd56821dc	[Core] Unify function schema parsing (#23370 ) Use pydantic to infer nested schemas and all that fun. Include bagatur's convenient docstring parser Include annotation support Previously we didn't adequately support many typehints in the bind_tools() method on raw functions (like optionals/unions, nested types, etc.)	2024-07-03 09:55:38 -07:00
Oguz Vuruskaner	2a2c0d1a94	community[deepinfra]: fix tool call parsing. (#23162 ) This PR includes fix for DeepInfra tool call parsing.	2024-07-03 12:11:37 -04:00
maang-h	525109e506	feat: Implement ChatBaichuan asynchronous interface (#23589 ) - Description: Add interface to `ChatBaichuan` to support asynchronous requests - `_agenerate` method - `_astream` method --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-03 12:10:04 -04:00
Leonid Ganeline	716a316654	core: docstrings `indexing` (#23785 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:27:34 -04:00
Leonid Ganeline	30fdc2dbe7	core: docstrings `messages` (#23788 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-03 11:25:00 -04:00
ccurme	54e730f6e4	fireworks[patch]: read from tool calls attribute (#23820 )	2024-07-03 11:11:17 -04:00
Bagatur	ebb404527f	anthropic[patch]: Release 0.1.19 (#23783 )	2024-07-02 18:17:25 -04:00
Bagatur	6168c846b2	openai[patch]: Release 0.1.14 (#23782 )	2024-07-02 18:17:15 -04:00
Bagatur	cb9812593f	openai[patch]: expose model request payload (#23287 ) ![Screenshot 2024-06-21 at 3 12 12 PM](https://github.com/langchain-ai/langchain/assets/22008038/6243a01f-1ef6-4085-9160-2844d9f2b683)	2024-07-02 17:43:55 -04:00
Bagatur	ed200bf2c4	anthropic[patch]: expose payload (#23291 ) ![Screenshot 2024-06-21 at 4 56 02 PM](https://github.com/langchain-ai/langchain/assets/22008038/a2c6224f-3741-4502-9607-1a726a0551c9)	2024-07-02 17:43:47 -04:00
Bagatur	7a3d8e5a99	core[patch]: Release 0.2.11 (#23780 )	2024-07-02 17:35:57 -04:00
Bagatur	d677dadf5f	core[patch]: mark RemoveMessage beta (#23656 )	2024-07-02 21:27:21 +00:00
ccurme	1d54ac93bb	ai21[patch]: release 0.1.7 (#23781 )	2024-07-02 21:24:13 +00:00
Asaf Joseph Gardin	320dc31822	partners: AI21 Labs Jamba Streaming Support (#23538 ) - Description: Added support for streaming in AI21 Jamba Model - Twitter handle: https://github.com/AI21Labs --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-02 17:15:46 -04:00
Qingchuan Hao	5cd4083457	community: make bing web search as the only option (#23523 ) This PR makes Bing web search the only option for BingSearchAPIWrapper, which simplifies the user interface in LangChain. It is a follow-up to https://github.com/langchain-ai/langchain/pull/23306.	2024-07-02 17:13:54 -04:00
ccurme	7c1cddf1b7	anthropic[patch]: release 0.1.18 (#23778 )	2024-07-02 16:46:47 -04:00
ccurme	c9dac59008	anthropic[patch]: fix model name in some integration tests (#23779 )	2024-07-02 20:45:52 +00:00
Bagatur	7a6c06cadd	anthropic[patch]: tool output parser fix (#23647 )	2024-07-02 16:33:22 -04:00
ccurme	46cbf0e4aa	anthropic[patch]: use core output parsers for structured output (#23776 ) Also add to standard tests for structured output.	2024-07-02 16:15:26 -04:00
kiarina	dc396835ed	langchain_anthropic: add stop_reason in ChatAnthropic stream result (#23689 ) `ChatAnthropic` can get `stop_reason` from the resulting `AIMessage` in `invoke` and `ainvoke`, but not in `stream` and `astream`. This is a different behavior from `ChatOpenAI`. It is possible to get `stop_reason` from `stream` as well, since it is needed to determine the next action after the LLM call. This would be easier to handle in situations where only `stop_reason` is needed. - Issue: NA - Dependencies: NA - Twitter handle: https://x.com/kiarina37	2024-07-02 15:16:20 -04:00
maang-h	e4e28a6ff5	community[patch]: Fix MiniMaxChat validate_environment error (#23770 ) - Description: Fix some issues in MiniMaxChat - Fix `minimax_api_host` not in `values` error - Stop reading `minimax_group_id` from environment variables, since `minimax_group_id` is no longer used in MiniMaxChat - Invoke callback prior to yielding token (issue #16913)	2024-07-02 13:23:32 -04:00
SN	acc457f645	core[patch]: fix nested sections for mustache templating (#23747 ) The prompt template variable detection only worked for singly-nested sections because we just kept track of whether we were in a section and then set that to false as soon as we encountered an end block. i.e. the following: ``` {{#outerSection}} {{variableThatShouldntShowUp}} {{#nestedSection}} {{nestedVal}} {{/nestedSection}} {{anotherVariableThatShouldntShowUp}} {{/outerSection}} ``` Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as input_variables (whereas it should just yield `['outerSection']`). This fixes that by keeping track of the current depth and using a stack.	2024-07-02 10:20:45 -07:00
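The depth-tracking idea described in acc457f645 can be sketched roughly as follows. This is a minimal illustration, not the actual LangChain implementation: the `top_level_variables` helper and the `TAG` regex are hypothetical names introduced here, and the regex only handles plain `{{name}}`, `{{#name}}`, `{{^name}}` and `{{/name}}` tags.

```python
import re

# Hypothetical tag matcher: captures an optional sigil (#, ^ or /) and the tag name.
TAG = re.compile(r"\{\{\s*([#/^]?)\s*([\w.]+)\s*\}\}")


def top_level_variables(template: str) -> list[str]:
    """Return names that appear outside of any section, in order of appearance."""
    names: list[str] = []
    open_sections: list[str] = []  # stack of sections that are currently open
    for sigil, name in TAG.findall(template):
        if sigil in ("#", "^"):
            # A section counts as an input only when it opens at depth 0.
            if not open_sections and name not in names:
                names.append(name)
            open_sections.append(name)
        elif sigil == "/":
            # Closing a section pops one level instead of resetting a single flag.
            if open_sections:
                open_sections.pop()
        elif not open_sections and name not in names:
            # Plain variable outside every section.
            names.append(name)
    return names


template = """
{{#outerSection}}
  {{variableThatShouldntShowUp}}
  {{#nestedSection}}
    {{nestedVal}}
  {{/nestedSection}}
  {{anotherVariableThatShouldntShowUp}}
{{/outerSection}}
"""
print(top_level_variables(template))  # ['outerSection']
```

With a single boolean flag, the first `{{/nestedSection}}` would mark the parser as outside every section, which is how `anotherVariableThatShouldntShowUp` leaked into the result; the stack keeps the outer section open until its own end tag.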
Eugene Yurtsev	46ff0f7a3c	community[patch]: Update @root_validators to use explicit pre=True or pre=False (#23737 )	2024-07-02 10:47:21 -04:00
Igor Drozdov	b664dbcc36	feat(community): add support for tool_calls response (#23765 ) When `model_kwargs={"tools": tools}` are passed to `ChatLiteLLM`, they are executed, but the response is not recognized correctly. This PR adds `tool_calls` to the `additional_kwargs`. ## ChatAnthropic I used the following example to verify the output of llm with tools: ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_anthropic import ChatAnthropic class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") llm = ChatAnthropic(model="claude-3-sonnet-20240229") llm_with_tools = llm.bind_tools([GetWeather, GetPopulation]) ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?") print(ai_msg.tool_calls) ``` I get the following response: ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01UfDA89knrhw3vFV9X47neT'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01NrYVRYae7m7z7tBgyPb3Gd'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01EPFEpDgzL6vV2dTpD9SVP5'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01B5J6tPJXgwwfhQX9BHP2dt'}] ``` ## LiteLLM Based on https://litellm.vercel.app/docs/completion/function_call ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool import litellm class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g.
San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] response = litellm.completion(model="claude-3-sonnet-20240229", messages=[{'role': 'user', 'content': prompt}], tools=tools) print(response.choices[0].message.tool_calls) ``` ```python [ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetWeather'), id='toolu_01HeDWV5vP7BDFfytH5FJsja', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetWeather'), id='toolu_01EiLesUSEr3YK1DaE2jxsQv', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetPopulation'), id='toolu_01Xz26zvkBDRxEUEWm9pX6xa', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetPopulation'), id='toolu_01SDqKnsLjvUXuBsgAZdEEpp', type='function')] ``` ## ChatLiteLLM When I try the following ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" 
tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] llm = ChatLiteLLM(model="claude-3-sonnet-20240229", model_kwargs={"tools": tools}) ai_msg = llm.invoke(prompt) print(ai_msg) print(ai_msg.tool_calls) ``` ```python content="Okay, let's find out the current weather and populations for Los Angeles and New York City:" response_metadata={'token_usage': Usage(prompt_tokens=329, completion_tokens=193, total_tokens=522), 'model': 'claude-3-sonnet-20240229', 'finish_reason': 'tool_calls'} id='run-748b7a84-84f4-497e-bba1-320bd4823937-0' [] ``` --- When I apply the changes of this PR, the output is ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_017D2tGjiaiakB1HadsEFZ4e'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01WrDpJfVqLkPejWzonPCbLW'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_016UKyYrVAV9Pz99iZGgGU7V'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01Sgv1imExFX1oiR1Cw88zKy'}] ``` Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-02 10:42:08 -04:00
Eugene Yurtsev	338cef35b4	community[patch]: update @root_validator in utilities namespace (#23768 ) Update all utilities to use `pre=True` or `pre=False` https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 14:33:01 +00:00

5308 Commits