langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-31 10:23:18 +00:00

Author	SHA1	Message	Date
Eugene Yurtsev	202d7f6c4a	core[patch]: 0.3.11 release (#27403 ) Core bump to 0.3.11	2024-10-16 15:39:37 -04:00
Bagatur	a4392b070d	core[patch]: add convert_to_openai_messages util (#27263 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-16 17:10:10 +00:00
sByteman	31e7664afd	community[minor]: add proxy support to RecursiveUrlLoader (#27364 ) Description: This PR introduces the proxies parameter to the RecursiveUrlLoader class, allowing the user to specify proxy servers for requests. This update enables crawling through proxy servers, providing enhanced flexibility for network configurations. The key changes include: 1. Added an optional proxies parameter to the constructor (__init__). 2. Updated the documentation to explain the proxies parameter usage with an example. 3. Modified the _get_child_links_recursive method to pass the proxies parameter to the requests.get function. Sample Usage:
```python
from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader

proxies = {
    "http": "http://localhost:1080",
    "https": "http://localhost:1080",
}
url = "https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel"
loader = RecursiveUrlLoader(
    url=url,
    max_depth=1,
    extractor=lambda x: Soup(x, "html.parser").text,
    proxies=proxies,
)
docs = loader.load()
```
--------- Co-authored-by: root <root@thb>	2024-10-16 16:29:59 +00:00
Yuki Watanabe	b8bfebd382	community: Add deprecation notice for Databricks integration in langchain-community (#27355 ) We have released the [langchain-databricks](https://github.com/langchain-ai/langchain-databricks) package for Databricks integration. This PR deprecates the legacy classes within `langchain-community`. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-16 02:20:40 +00:00
xsai9101	15c1ddaf99	community: Add support for clob datatype in oracle database (#27330 ) Description: This PR adds support for the clob/blob data types to the oracle document loader. The oracledb package can only read clob/blob values while the connection is open, so the code is restructured to process the data before the connection closes. Dependencies: oracledb package, same as before (pip install oracledb). Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-16 02:19:20 +00:00
Enes Bol	3f74dfc3d8	community[patch]: Fix vLLM integration to filter SamplingParams (#27367 ) Description: - This pull request addresses a bug in LangChain's VLLM integration, where the use_beam_search parameter was erroneously passed to SamplingParams. The SamplingParams class in vLLM does not support the use_beam_search argument, which caused a TypeError. - This PR introduces logic to filter out unsupported parameters, ensuring that only valid parameters are passed to SamplingParams. As a result, the integration now functions as expected without errors. - The bug was reproduced by running the code sample from LangChain's documentation, which triggered the error due to the invalid parameter. This fix resolves that error by implementing proper parameter filtering. VLLM Sampling Params Class: https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py Issue: I could not find an existing issue for this. Fixes the "TypeError: Unexpected keyword argument 'use_beam_search'" error when using VLLM from LangChain. Dependencies: None. Tests and Documentation: Tests: No new functionality was added, but I tested the changes by running multiple prompts through the VLLM integration with various parameter configurations. All tests passed successfully without breaking compatibility. Docs: No documentation changes were necessary as this is a bug fix. Reproducing the Error: https://python.langchain.com/docs/integrations/llms/vllm/ The code sample from the original documentation can be used to reproduce the error: from langchain_community.llms import VLLM llm = VLLM( model="mosaicml/mpt-7b", trust_remote_code=True, # mandatory for hf models max_new_tokens=128, top_k=10, top_p=0.95, temperature=0.8, ) print(llm.invoke("What is the capital of France ?")) ![image](https://github.com/user-attachments/assets/3782d6ac-1f7b-4acc-bf2c-186216149de5) This PR resolves the issue by ensuring that only valid parameters are passed to SamplingParams.	2024-10-15 21:57:50 +00:00
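
The parameter filtering described in #27367 can be illustrated with a minimal sketch. It is not the code from the PR: `FakeSamplingParams` and `filter_supported_kwargs` are hypothetical names, and the stand-in class only mimics a constructor that rejects unknown keyword arguments.

```python
import inspect


class FakeSamplingParams:
    """Hypothetical stand-in for a params class that rejects unknown keywords."""

    def __init__(self, temperature: float = 1.0, top_p: float = 1.0, top_k: int = -1):
        self.temperature = temperature
        self.top_p = top_p
        self.top_k = top_k


def filter_supported_kwargs(cls, params):
    """Keep only the keys that cls.__init__ declares as parameters."""
    accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
    return {key: value for key, value in params.items() if key in accepted}


# "use_beam_search" is not a constructor parameter, so it is dropped.
raw = {"temperature": 0.8, "top_p": 0.95, "top_k": 10, "use_beam_search": False}
filtered = filter_supported_kwargs(FakeSamplingParams, raw)
sampling = FakeSamplingParams(**filtered)
```

Passing `raw` directly to the constructor would raise a `TypeError`, which is the failure mode the commit message reports.
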
Erick Friis	edf6d0a0fb	partners/couchbase: release 0.2.0 (attempt 2) (#27375 )	2024-10-15 14:51:05 -07:00
Jorge Piedrahita Ortiz	12fea5b868	community: sambastudio chat model integration minor fix (#27238 ) Description: minor fixes to the sambastudio chat model integration: fix default params; fix usage metadata when streaming.	2024-10-15 13:24:36 -04:00
ZhangShenao	f3925d71b9	community: Fix word spelling in `Text2vecEmbeddings` (#27183 ) Fix word spelling in `Text2vecEmbeddings`	2024-10-15 09:28:48 -07:00
Erick Friis	92ae61bcc8	multiple: rely on asyncio_mode auto in tests (#27200 )	2024-10-15 16:26:38 +00:00
William FH	0a3e089827	[Anthropic] Shallow Copy (#27105 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-15 15:50:48 +00:00
Matthew Peveler	c6533616b6	docs: fix community pgvector deprecation warning formatting (#27094 ) Description: PR fixes some formatting errors in deprecation message in the `langchain_community.vectorstores.pgvector` module, where it was missing spaces between a few words, and one word was misspelled. Issue: n/a Dependencies: n/a Signed-off-by: mpeveler@timescale.com Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-15 15:45:53 +00:00
Erick Friis	3fa5ce3e5f	community: clear mypy syntax warning in openapi (#27370 ) not completely clear the regex is functional	2024-10-15 15:43:53 +00:00
Ahmet Yasin Aytar	443b37403d	community: refactor Arxiv search logic (#27084 ) PR message: Description: This PR refactors the Arxiv API wrapper by extracting the Arxiv search logic into a helper function (_fetch_results) to reduce code duplication and improve maintainability. The helper function is used in methods like get_summaries_as_docs, run, and lazy_load, streamlining the code and making it easier to maintain in the future. Issue: This is a minor refactor, so no specific issue is being fixed. Dependencies: No new dependencies are introduced with this change. Add tests and docs: No new integrations were added, so no additional tests or docs are necessary for this PR. Lint and test: I have run make format, make lint, and make test to ensure all checks pass successfully. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-15 08:43:03 -07:00
Qiu Qin	57fbc6bdf1	community: Update OCI data science integration (#27083 ) This PR updates the integration with OCI data science model deployment service. - Update LLM to support streaming and async calls. - Added chat model. - Updated tests and docs. - Updated `libs/community/scripts/check_pydantic.sh` since the use of `@pre_init` is removed from existing integration. - Updated `libs/community/extended_testing_deps.txt` as this integration requires `langchain_openai`. --------- Co-authored-by: MING KANG <ming.kang@oracle.com> Co-authored-by: Dmitrii Cherkasov <dmitrii.cherkasov@oracle.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-15 08:32:54 -07:00
Rafael Miller	fc14f675f1	Community: Updated Firecrawl Document Loader to v1 (#26548 ) This PR updates the Firecrawl Document Loader to use the recently released V1 API of Firecrawl. Key Updates: Firecrawl V1 Integration: Updated the document loader to leverage the new Firecrawl V1 API for improved performance, reliability, and developer experience. Map Functionality Added: Introduced the map mode for more flexible document loading options. These updates enhance the integration and provide access to the latest features of Firecrawl. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-10-15 13:13:28 +00:00
Max Tran	8fea07f92e	community: fixed KeyError: 'client' (#27345 ) Twitter handle: @MaxHTran. Tests and docs: not needed due to small change. --------- Co-authored-by: Max Tran <maxtra@amazon.com>	2024-10-14 20:51:13 +00:00
Martin Triska	8dc4bec947	[community] [Bugfix] base_o365 document loader metadata needs to be JSON serializable (#26322 ) In order for indexer to work, all metadata in the documents need to be JSON serializable. Timestamps are not. See here: https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/indexing/api.py#L83-L89 @eyurtsev could you please review? It's a tiny PR :-)	2024-10-14 12:48:31 -04:00
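
The serialization problem behind #26322 can be shown with the standard library alone. This is a sketch under assumptions, not the loader's actual fix: `make_json_safe` is a hypothetical helper, and converting timestamps to ISO 8601 strings is one possible choice.

```python
import json
from datetime import datetime, timezone


def make_json_safe(metadata):
    """Return a copy of metadata with datetime values as ISO 8601 strings."""
    return {
        key: value.isoformat() if isinstance(value, datetime) else value
        for key, value in metadata.items()
    }


# Illustrative metadata; the keys and values are made up for this sketch.
metadata = {
    "source": "inbox/report.docx",
    "created": datetime(2024, 10, 14, 12, 0, tzinfo=timezone.utc),
}

safe_metadata = make_json_safe(metadata)
encoded = json.dumps(safe_metadata)
```

Calling `json.dumps(metadata)` on the unconverted dictionary raises a `TypeError`, because `datetime` objects are not JSON serializable by default.
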
Trayan Azarov	59bbda9ba3	chroma: Deprecating versions 0.5.7 thru 0.5.12 (#27305 ) Description: Deprecated versions of Chroma >=0.5.5 <0.5.12 due to a serious correctness issue that caused some embeddings for deployments with multiple collections to be lost (read more on the issue in the Chroma repo). Issue: chroma-core/chroma#2922 (fixed by chroma-core/chroma#2923 and released in [0.5.13](https://github.com/chroma-core/chroma/releases/tag/0.5.13)) Dependencies: N/A Twitter handle: `@t_azarov`	2024-10-14 11:56:05 -04:00
Marcelo Nunes Alves	5647276998	community: Problem with embeddings in new versions of clickhouse. (#26041 ) Starting with Clickhouse version 24.8, a different type of configuration has been introduced for vectorized data ingestion. When this configuration is present, an error occurs when generating the table, as can be seen below: ![Screenshot from 2024-09-04 11-48-00](https://github.com/user-attachments/assets/70840a93-1001-490c-921a-26924c51d9eb) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-11 18:54:50 +00:00
Eugene Yurtsev	5b9b8fe80f	core[patch]: Ignore ASYNC110 to upgrade to newest ruff version (#27229 ) Ignoring ASYNC110 with explanation	2024-10-09 11:25:58 -04:00
Vittorio Rigamonti	7da2efd9d3	community[minor]: VectorStore Infinispan. Adding TLS and authentication (#23522 ) Description: this PR enables VectorStore TLS and authentication (digest, basic) with HTTP/2 for Infinispan server. Based on httpx. Added docker-compose facilities for testing. Added documentation. Dependencies: requires `pip install httpx[http2]` if HTTP/2 is needed. Twitter handle: https://twitter.com/infinispan	2024-10-09 10:51:39 -04:00
Diao Zihao	4553573acb	core[patch],langchain[patch],community[patch]: Bump version dependency of tenacity to >=8.1.0,!=8.4.0,<10 (#27201 ) This should fix the compatibility issue with graphrag, as discussed in https://github.com/langchain-ai/langchain/discussions/25595 Here are the release notes for tenacity 9 (https://github.com/jd/tenacity/releases/tag/9.0.0) --------- Signed-off-by: Zihao Diao <hi@ericdiao.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-09 14:00:45 +00:00
Stefano Lottini	d05fdd97dd	community: Cassandra Vector Store: extend metadata-related methods (#27078 ) Description: this PR adds a set of methods to deal with metadata associated to the vector store entries. These, while essential to the Graph-related extension of the `Cassandra` vector store, are also useful in themselves. These are (all come in their sync+async versions): - `[a]delete_by_metadata_filter` - `[a]replace_metadata` - `[a]get_by_document_id` - `[a]metadata_search` Additionally, a `[a]similarity_search_with_embedding_id_by_vector` method is introduced to better serve the store's internal working (esp. related to reranking logic). Issue: no issue number, but now all Document's returned bear their `.id` consistently (as a consequence of a slight refactoring in how the raw entries read from DB are made back into `Document` instances). Dependencies: (no new deps: packaging comes through langchain-core already; `cassio` is now required to be version 0.1.10+) Add tests and docs Added integration tests for the relevant newly-introduced methods. (Docs will be updated in a separate PR). Lint and test Lint and (updated) test all pass. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-09 06:41:34 +00:00
Erick Friis	84c05b031d	community: release 0.3.2 (#27214 )	2024-10-08 23:33:55 -07:00
Serena Ruan	a7c1ce2b3f	[community] Add timeout control and retry for UC tool execution (#26645 ) Add a client-side timeout for UCFunctionToolkit and add retry logic. Users can set the environment variable `UC_TOOL_CLIENT_EXECUTION_TIMEOUT` to increase the timeout for retrying to get the execution response while the status is pending. The default timeout is 120s. Tested in Databricks. Lint and test: completed. --------- Signed-off-by: serena-ruan <serena.rxy@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-09 06:31:48 +00:00
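
A client-side timeout with retries, as described in #26645, could be sketched as a polling loop. Only the environment variable name and the 120s default come from the commit message; `wait_for_result`, the `poll` callable, and the status strings are assumptions made for illustration.

```python
import os
import time


def wait_for_result(poll, interval=0.01):
    """Poll until the status is no longer PENDING or the timeout elapses."""
    # Timeout in seconds; the variable name and default follow the commit message.
    timeout = float(os.environ.get("UC_TOOL_CLIENT_EXECUTION_TIMEOUT", "120"))
    deadline = time.monotonic() + timeout
    while True:
        status, value = poll()
        if status != "PENDING":
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"execution still pending after {timeout}s")
        time.sleep(interval)


# Hypothetical backend that reports PENDING twice before succeeding.
responses = iter([("PENDING", None), ("PENDING", None), ("SUCCEEDED", 42)])
result = wait_for_result(lambda: next(responses))
```

Reading the timeout on every call, rather than once at import time, lets a user change the environment variable without restarting the process.
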
Tomaz Bratanic	481bd25d29	community: Fix database connections for neo4j (#27190 ) Fixes https://github.com/langchain-ai/langchain/issues/27185 Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-08 23:47:55 +00:00
Erick Friis	cedf4d9462	langchain: release 0.3.3 (#27213 )	2024-10-08 16:39:42 -07:00
Erick Friis	7264fb254c	core: release 0.3.10 (#27209 )	2024-10-08 16:21:42 -07:00
Bagatur	ce33c4fa40	openai[patch]: default temp=1 for o1 (#27206 )	2024-10-08 15:45:21 -07:00
RIdham Golakiya	73ad7f2e7a	langchain_chroma[patch]: updated example for get documents with where clause (#26767 ) Example updated for vectorstore ChromaDB. If we want to apply multiple filters then ChromaDB supports filters like this: Reference: [ChromaDB filters](https://cookbook.chromadb.dev/core/filters/) Thank you.	2024-10-08 20:21:58 +00:00
Bagatur	e3e9ee8398	core[patch]: utils for adding/subtracting usage metadata (#27203 )	2024-10-08 13:15:33 -07:00
ccurme	e3920f2320	community[patch]: fix structured_output in llamacpp integration (#27202 ) Resolves https://github.com/langchain-ai/langchain/issues/25318.	2024-10-08 15:16:59 -04:00
Erick Friis	b84e00283f	standard-tests: test that only one chunk sets input_tokens (#27177 )	2024-10-08 11:35:32 -07:00
Ajayeswar Reddy	9b7bdf1a26	Fixed typo in libs/community/langchain_community/storage/sql.py (#27029 ) PR title: docs: fix typo in SQLStore import path. Description: This PR corrects a typo in the docstring for the class SQLStore(BaseStore[str, bytes]). The import path in the docstring currently reads `from langchain_rag.storage import SQLStore`, which should be changed to `from langchain_community.storage import SQLStore`. This typo is also reflected in the official documentation. Issue: N/A Dependencies: None Twitter handle: N/A Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-08 17:51:26 +00:00
Vadym Barda	8d27325dbc	core[patch]: support ValidationError from pydantic v1 in tools (#27194 )	2024-10-08 10:19:04 -04:00
Christophe Bornet	16f5fdb38b	core: Add various ruff rules (#26836 ) Adds - ASYNC - COM - DJ - EXE - FLY - FURB - ICN - INT - LOG - NPY - PD - Q - RSE - SLOT - T10 - TID - YTT Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-07 22:30:27 +00:00
Erick Friis	5c826faece	core: update make format to fix all autofixable things (#27174 )	2024-10-07 15:20:47 -07:00
Christophe Bornet	d31ec8810a	core: Add ruff rules for error messages (EM) (#26965 ) All auto-fixes Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-07 22:12:28 +00:00
Oleksii Pokotylo	37ca468d03	community: AzureSearch: fix reranking for empty lists (#27104 ) Description: Fix reranking for empty lists Issue: ``` ValueError: not enough values to unpack (expected 3, got 0) documents, scores, vectors = map(list, zip(*docs)) File langchain_community/vectorstores/azuresearch.py", line 1680, in _reorder_results_with_maximal_marginal_relevance ``` Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>	2024-10-07 15:27:09 -04:00
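
The `ValueError` quoted in #27104 comes from unpacking `zip(*docs)` when `docs` is empty. The following sketch reproduces the pattern with a guard; `unpack_results` is a hypothetical helper, not the function in `azuresearch.py`.

```python
def unpack_results(docs):
    """Split (document, score, vector) triples into three parallel lists."""
    if not docs:
        # zip(*[]) yields nothing, so three-way unpacking would fail here.
        return [], [], []
    documents, scores, vectors = map(list, zip(*docs))
    return documents, scores, vectors


empty = unpack_results([])
filled = unpack_results([("a", 0.9, [1.0]), ("b", 0.5, [2.0])])
```

Returning three empty lists keeps the return shape identical for callers, whether or not any results were found.
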
Christophe Bornet	c4ebccfec2	core[minor]: Improve support for id in VectorStore (#26660 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-07 15:01:08 -04:00
Bharat Ramanathan	931ce8d026	core[patch]: Update `AsyncCallbackManager` to honor `run_inline` attribute and prevent context loss (#26885 ) ## Description This PR fixes the context loss issue in `AsyncCallbackManager`, specifically in `on_llm_start` and `on_chat_model_start` methods. It properly honors the `run_inline` attribute of callback handlers, preventing race conditions and ordering issues. Key changes: 1. Separate handlers into inline and non-inline groups. 2. Execute inline handlers sequentially for each prompt. 3. Execute non-inline handlers concurrently across all prompts. 4. Preserve context for stateful handlers. 5. Maintain performance benefits for non-inline handlers. These changes are implemented in `AsyncCallbackManager` rather than `ahandle_event` because the issue occurs at the prompt and message_list levels, not within individual events. ## Testing - Test case implemented in #26857 now passes, verifying execution order for inline handlers. ## Related Issues - Fixes issue discussed in #23909 ## Dependencies No new dependencies are required. --- @eyurtsev: This PR implements the discussed changes to respect `run_inline` in `AsyncCallbackManager`. Please review and advise on any needed changes. Twitter handle: @parambharat --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-07 14:59:29 -04:00
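
The handler grouping described in #26885 can be sketched with plain `asyncio`. This is a simplified illustration, not the `AsyncCallbackManager` implementation: the `Handler` class, its `on_start` method, and `dispatch` are hypothetical, and only the `run_inline` attribute name comes from the commit message.

```python
import asyncio


class Handler:
    """Hypothetical callback handler with a run_inline flag."""

    def __init__(self, name, run_inline):
        self.name = name
        self.run_inline = run_inline

    async def on_start(self, prompt, log):
        await asyncio.sleep(0)  # yield control, as real async work would
        log.append((self.name, prompt))


async def dispatch(handlers, prompts, log):
    inline = [h for h in handlers if h.run_inline]
    background = [h for h in handlers if not h.run_inline]
    # Inline handlers run one at a time, in prompt order.
    for prompt in prompts:
        for handler in inline:
            await handler.on_start(prompt, log)
    # Non-inline handlers run concurrently across all prompts.
    await asyncio.gather(
        *(handler.on_start(prompt, log) for prompt in prompts for handler in background)
    )


log = []
handlers = [Handler("tracer", run_inline=True), Handler("metrics", run_inline=False)]
asyncio.run(dispatch(handlers, ["p1", "p2"], log))
```

Because the inline handlers are awaited sequentially, their entries always appear in prompt order, while the relative order of the concurrent handlers is left to the event loop.
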
João Carlos Ferra de Almeida	780ce00dea	core[minor]: add kwargs to index and aindex functions for custom vector_field support (#26998 ) Added `kwargs` parameters to the `index` and `aindex` functions in `libs/core/langchain_core/indexing/api.py`. This allows users to pass additional arguments to the `add_documents` and `aadd_documents` methods, enabling the specification of a custom `vector_field`. For example, users can now use `vector_field="embedding"` when indexing documents in `OpenSearchVectorStore` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-07 14:52:50 -04:00
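
The keyword forwarding described in #26998 follows a common pattern, sketched here with stand-ins. `FakeVectorStore` and `index_documents` are hypothetical and far simpler than the real `index` function; only the `add_documents` method name and the `vector_field="embedding"` example come from the commit message.

```python
class FakeVectorStore:
    """Hypothetical store that records what add_documents receives."""

    def __init__(self):
        self.calls = []

    def add_documents(self, documents, **kwargs):
        self.calls.append((list(documents), kwargs))


def index_documents(docs, vector_store, **kwargs):
    """Forward any extra keyword arguments to the store unchanged."""
    vector_store.add_documents(docs, **kwargs)
    return len(docs)


store = FakeVectorStore()
count = index_documents(["doc one", "doc two"], store, vector_field="embedding")
```

The indexing function does not need to know which options a particular store supports; it passes them through and lets the store interpret them.
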
Jorge Piedrahita Ortiz	14de81b140	community: sambastudio chat model (#27056 ) Description: sambastudio chat model integration added (previously only LLM integration); included docs and tests --------- Co-authored-by: luisfucros <luisfucros@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-10-07 14:31:39 -04:00
Aditya Anand	f70650f67d	core[patch]: correct typo doc-string for astream_events method (#27108 ) This commit addresses a typographical error in the documentation for the async astream_events method. The word 'evens' was incorrectly used in the introductory sentence for the reference table, which could lead to confusion for users. Changes made: Corrected 'Below is a table that illustrates some evens that might be emitted by various chains.' to 'Below is a table that illustrates some events that might be emitted by various chains.' This enhancement improves the clarity of the documentation and ensures accurate terminology is used throughout the reference material. Issue Reference: #27107	2024-10-07 14:12:42 -04:00
Bagatur	38099800cc	docs: fix anthropic max_tokens docstring (#27166 )	2024-10-07 16:51:42 +00:00
ogawa	07dd8dd3d7	community[patch]: update gpt-4o cost (#27038 ) updated OpenAI cost definition according to the following: https://openai.com/api/pricing/	2024-10-07 09:06:30 -04:00
Bagatur	06ce5d1d5c	anthropic[patch]: Release 0.2.3 (#27126 )	2024-10-04 22:38:03 +00:00
Bagatur	0b8416bd2e	anthropic[patch]: fix input_tokens when cached (#27125 )	2024-10-04 22:35:51 +00:00
Bagatur	bd5b335cb4	standard-tests[patch]: fix oai usage metadata test (#27122 )	2024-10-04 20:00:48 +00:00


5860 Commits