langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-10 05:20:39 +00:00

Author	SHA1	Message	Date
Erick Friis	50d61eafa2	partners/deepseek: release 0.1.1 (#29592 )	2025-02-04 23:46:38 +00:00
Erick Friis	7edfcbb090	docs: rename to langchain-deepseek in docs (#29587 )	2025-02-04 14:22:17 -08:00
Erick Friis	df8fa882b2	deepseek: bump core (#29584 )	2025-02-04 10:25:46 -08:00
Erick Friis	455f65947a	deepseek: rename to langchain-deepseek from langchain-deepseek-official (#29583 )	2025-02-04 17:57:25 +00:00
Philippe PRADOS	5771e561fb	[Bugfix langchain_community] Fix PyMuPDFLoader (#29550 ) - Description: add legacy properties - Issue: #29470 - Twitter handle: pprados	2025-02-04 09:24:40 -05:00
Ashutosh Kumar	65b404a2d1	[oci_generative_ai] Option to pass auth_file_location (#29481 ) PR title: "community: Option to pass auth_file_location for oci_generative_ai" Description: Adds an option to pass auth_file_location, overriding the default config file location "~/.oci/config" where the profile configurations are stored. This does not fix any issue; it only adds an optional parameter, "auth_file_location", which is supported internally by any OCI client, including GenerativeAiInferenceClient.	2025-02-03 21:44:13 -05:00
Teruaki Ishizaki	aeb42dc900	partners: Fixed the procedure of initializing pad_token_id (#29500 ) - Description: Add a check of the pad_token_id and eos_token_id of the model config. This seems to be the same bug as the HuggingFace TGI bug, and the same one addressed in #29434. - Issue: #29431 - Dependencies: none - Twitter handle: tell14 Example code follows:
```python
from langchain_core.prompts import PromptTemplate
from langchain_huggingface.llms import HuggingFacePipeline

hf = HuggingFacePipeline.from_model_id(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 10},
)

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# Pipe the prompt into the Hugging Face pipeline LLM
chain = prompt | hf

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))
```
	2025-02-03 21:40:33 -05:00
AmirPoursaberi	a6efd22ba1	Fix a tiny typo in `create_retrieval_chain` docstring (#29552 ) Hi there! To fix a tiny typo in `create_retrieval_chain` docstring.	2025-02-03 10:54:49 -05:00
Hemant Rawat	db1693aa70	community: fix issue #29429 in age_graph.py (#29506 ) ## Description: This PR addresses issue #29429 by fixing the _wrap_query method in langchain_community/graphs/age_graph.py. The method now correctly handles Cypher queries with UNION and EXCEPT operators, ensuring that the fields in the SQL query are ordered as they appear in the Cypher query. Additionally, the method now properly handles cases where RETURN * is not supported. ### Issue: #29429 ### Dependencies: None ### Add tests and docs: Added unit tests in tests/unit_tests/graphs/test_age_graph.py to validate the changes. No new integrations were added, so no example notebook is necessary. Lint and test: Ran make format, make lint, and make test to ensure code quality and functionality.	2025-02-01 21:24:45 -05:00
Keenan Pepper	2f97916dea	docs: Add goodfire notebook and add to packages.yml (#29512 ) - Description: Add Goodfire ipynb notebook and add langchain-goodfire package to packages.yml - Issue: n/a - Dependencies: docs only - Twitter handle: keenanpepper --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-02-01 19:43:20 -05:00
ccurme	a3c5e4d070	deepseek[patch]: bump langchain-openai and add to scheduled testing (#29535 )	2025-02-01 18:40:59 -05:00
ccurme	16a422f3fa	community: add standard tests for Perplexity (#29534 )	2025-02-01 17:02:57 -05:00
Amit Ghadge	0c405245c4	[Integrations][Tool] Added Jenkins tools support (#29516 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-31 12:50:10 -05:00
Christophe Bornet	aab2e42169	core[patch]: Use Blockbuster to detect blocking calls in asyncio during tests (#29043 ) This PR uses the [blockbuster](https://github.com/cbornet/blockbuster) library in langchain-core to detect blocking calls made in the asyncio event loop during unit tests. Avoiding blocking calls is hard as these can be deeply buried in the code or made in 3rd party libraries. Blockbuster makes it easier to detect them by raising an exception when a call is made to a known blocking function (eg: `time.sleep`). Adding blockbuster allowed to find a blocking call in `aconfig_with_context` (it ends up calling `get_function_nonlocals` which loads function code). Dependencies: - blockbuster (test) Twitter handle: cbornet_	2025-01-31 10:06:34 -05:00
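The detection idea described in this commit can be illustrated with the standard library alone. The sketch below is not Blockbuster's implementation or API; `detect_blocking_sleep`, `BlockingCallError`, and `run_checked` are hypothetical names. It patches `time.sleep` so that a call made while an event loop is running raises an error.

```python
import asyncio
import time
from contextlib import contextmanager


class BlockingCallError(RuntimeError):
    """Raised when a patched blocking function runs inside an event loop."""


@contextmanager
def detect_blocking_sleep():
    """Temporarily patch time.sleep to fail when an event loop is running."""
    original_sleep = time.sleep

    def guarded_sleep(seconds):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No running loop in this thread: blocking is acceptable here.
            return original_sleep(seconds)
        raise BlockingCallError("time.sleep called inside the event loop")

    time.sleep = guarded_sleep
    try:
        yield
    finally:
        # Always restore the real function, even if the test failed.
        time.sleep = original_sleep


async def bad_coroutine():
    time.sleep(0.01)  # blocks the event loop


async def good_coroutine():
    await asyncio.sleep(0.01)  # yields control to the event loop


def run_checked(coro_fn):
    """Return True if the coroutine finished without a blocking call."""
    with detect_blocking_sleep():
        try:
            asyncio.run(coro_fn())
        except BlockingCallError:
            return False
    return True
```

A real detector has to cover many more functions (file I/O, sockets, locks) and allow exceptions for known-safe call sites, which is the part a dedicated library takes care of.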
Philippe PRADOS	ceda8bc050	community[minor]: 03 - Refactoring PyPDF parser (#29330 ) This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on updating the PyPDF parser. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970).	2025-01-31 10:05:07 -05:00
Julian Castro Pulgarin	b7e3e337b1	community: Fix YahooFinanceNewsTool to handle updated yfinance data structure (#29498 ) Description: Updates the YahooFinanceNewsTool to handle the current yfinance news data structure. The tool was failing with a KeyError due to changes in the yfinance API's response format. This PR updates the code to correctly extract news URLs from the new structure. Issue: #29495 Dependencies: No new dependencies required. Works with existing yfinance package. The changes maintain backwards compatibility while fixing the KeyError that users were experiencing. The modified code properly handles the new data structure where: - News type is now at `content.contentType` - News URL is now at `content.canonicalUrl.url` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-31 02:31:44 +00:00
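A minimal sketch of URL extraction that tolerates both layouts. The new paths (`content.contentType`, `content.canonicalUrl.url`) come from the commit message; the flat legacy keys `type` and `link`, the `"STORY"` value, and the helper name `extract_story_urls` are assumptions made for illustration, not the tool's actual code.

```python
from typing import Any, Dict, List


def extract_story_urls(news_items: List[Dict[str, Any]]) -> List[str]:
    """Collect story URLs from news items in either the new or legacy layout."""
    urls = []
    for item in news_items:
        content = item.get("content")
        if isinstance(content, dict):
            # New layout described in the commit message.
            if content.get("contentType") == "STORY":
                url = (content.get("canonicalUrl") or {}).get("url")
                if url:
                    urls.append(url)
        elif item.get("type") == "STORY" and item.get("link"):
            # Assumed legacy layout: flat "type" and "link" keys.
            urls.append(item["link"])
    return urls
```

Using `.get` throughout avoids the KeyError mentioned above when a field is absent from a given item.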
Erick Friis	332e303858	partners/mistralai: release 0.2.6 (#29491 )	2025-01-29 22:23:14 +00:00
Erick Friis	2c795f5628	partners/openai: release 0.3.3 (#29490 )	2025-01-29 22:23:03 +00:00
Erick Friis	f307b3cc5f	langchain: release 0.3.17 (#29485 )	2025-01-29 22:22:49 +00:00
Erick Friis	5cad3683b4	partners/groq: release 0.2.4 (#29488 )	2025-01-29 22:22:30 +00:00
Erick Friis	e074c26a6b	partners/fireworks: release 0.2.7 (#29487 )	2025-01-29 22:22:18 +00:00
Erick Friis	685609e1ef	partners/anthropic: release 0.3.5 (#29486 )	2025-01-29 22:22:11 +00:00
Erick Friis	ed3a5e664c	standard-tests: release 0.3.10 (#29484 )	2025-01-29 22:21:05 +00:00
Erick Friis	29461b36d9	partners/ollama: release 0.2.3 (#29489 )	2025-01-29 22:19:44 +00:00
Erick Friis	07e2e80fe7	core: release 0.3.33 (#29483 )	2025-01-29 14:11:53 -08:00
Erick Friis	8f95da4eb1	multiple: structured output tracing standard metadata (#29421 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-29 14:00:26 -08:00
ccurme	284c935b08	tests[patch]: improve coverage of structured output tests (#29478 )	2025-01-29 14:52:09 -05:00
Matheus Torquato	7aae738296	docs:Fix Imports for Document and BaseRetriever (#29473 ) This pull request addresses an issue with import statements in the langchain_core/retrievers.py file. The following changes have been made: Corrected the import for Document from langchain_core.documents.base. Corrected the import for BaseRetriever from langchain_core.retrievers. These changes ensure that the SimpleRetriever class can correctly reference the Document and BaseRetriever classes, improving code reliability and maintainability. --------- Co-authored-by: Matheus Torquato <mtorquat@jaguarlandrover.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-29 14:32:05 +00:00
Mohammad Anash	12bcc85927	added operator filter for supabase (#29475 ) Description This PR adds support for MongoDB-style $in operator filtering in the Supabase vectorstore implementation. Currently, filtering with $in operators returns no results, even when matching documents exist. This change properly translates MongoDB-style filters to PostgreSQL syntax, enabling efficient multi-document filtering. Changes Modified similarity_search_by_vector_with_relevance_scores to handle MongoDB-style $in operators Added automatic conversion of $in filters to PostgreSQL IN clauses Preserved original vector type handling and numpy array conversion Maintained compatibility with existing postgrest filters Added support for the same filtering in similarity_search_by_vector_returning_embeddings Issue Closes #27932 Implementation Notes No changes to public API or function signatures Backwards compatible - behavior unchanged for non-$in filters More efficient than multiple individual queries for multi-ID searches Preserves all existing functionality including numpy array conversion for vector types Dependencies None Additional Notes The implementation handles proper SQL escaping for filter values Maintains consistent behavior with other vectorstore implementations that support MongoDB-style operators Future extensions could support additional MongoDB-style operators ($gt, $lt, etc.) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-29 14:24:18 +00:00
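The translation described here, from a MongoDB-style `$in` filter to a PostgreSQL `IN` clause, can be sketched as follows. This is an illustrative, hypothetical helper (`translate_filter`), not the vectorstore's implementation: it emits a parameterized SQL fragment so values are never interpolated into the query text, and it assumes column names come from trusted code.

```python
from typing import Any, Dict, List, Tuple


def translate_filter(filter_: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Turn {"col": {"$in": [...]}} / {"col": value} into SQL plus parameters."""
    clauses: List[str] = []
    params: List[Any] = []
    for column, condition in filter_.items():
        if isinstance(condition, dict) and "$in" in condition:
            values = list(condition["$in"])
            if not values:
                # An empty IN list can never match any row.
                clauses.append("FALSE")
                continue
            placeholders = ", ".join(["%s"] * len(values))
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(values)
        else:
            # Plain equality, unchanged behavior for non-$in filters.
            clauses.append(f"{column} = %s")
            params.append(condition)
    return " AND ".join(clauses), params
```

A single `IN` clause lets one query fetch several IDs at once, which is the efficiency gain the commit message refers to.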
ccurme	585f467d4a	mistral[patch]: release 0.2.5 (#29463 )	2025-01-28 18:29:54 -05:00
ccurme	ca9d4e4595	mistralai: support method="json_schema" in structured output (#29461 ) https://docs.mistral.ai/capabilities/structured-output/custom_structured_output/	2025-01-28 18:17:39 -05:00
Michael Chin	e120378695	community: Additional AWS deprecations (#29447 ) Added deprecation warnings for a few more classes that were moved to the `langchain-aws` package: - [SageMaker Endpoint LLM](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) - [Amazon Kendra retriever](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.kendra.AmazonKendraRetriever.html) - [Amazon Bedrock Knowledge Bases retriever](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html)	2025-01-28 09:50:14 -05:00
Erick Friis	2d776351af	community: release 0.3.16 (#29452 )	2025-01-28 07:44:54 +00:00
Erick Friis	737a68fcdc	langchain: release 0.3.16 (#29451 )	2025-01-28 07:31:09 +00:00
Erick Friis	8bf9c71673	core: release 0.3.32 (#29450 )	2025-01-28 07:20:04 +00:00
Erick Friis	ecdc881328	langchain: add deepseek provider to init chat model (#29449 )	2025-01-27 23:13:59 -08:00
Erick Friis	dced0ed3fd	deepseek, docs: chatdeepseek integration added (#29445 )	2025-01-28 06:32:58 +00:00
Isaac Francisco	2bb2c9bfe8	change behavior for converting a string to openai messages (#29446 )	2025-01-27 18:18:54 -08:00
ccurme	b1fdac726b	groq[patch]: update model used in test (#29441 ) `llama-3.1-70b-versatile` was [shut down](https://console.groq.com/docs/deprecations).	2025-01-27 21:11:44 +00:00
Adrián Panella	1551d9750c	community(doc_loaders): allow any credential type in AzureAIDocumentI… (#29289 ) Allow any credential type in AzureAIDocumentIntelligence, not only `api_key`. This makes it possible to use any of the credential types integrated with AD. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:56:30 +00:00
ccurme	f00c66cc1f	chroma[patch]: release 0.2.1 (#29440 )	2025-01-27 20:41:35 +00:00
Jorge Piedrahita Ortiz	3b886cdbb2	libs: add sambanova-langchain integration package (#29417 ) - Description: Add sambanova-langchain integration package as suggested in previous PRs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:34:55 +00:00
Mohammad Anash	aba1fd0bd4	fixed similarity search with score error #29407 (#29413 ) Description: Fix TypeError in AzureSearch similarity_search_with_score by removing search_type from kwargs before passing to underlying requests. This resolves issue #29407 where search_type was being incorrectly passed through to Session.request(). Issue: #29407 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:34:42 +00:00
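The pattern behind this fix is to consume a wrapper-level option before forwarding the remaining keyword arguments. A minimal sketch with stand-in functions (`low_level_request` and its accepted keys are hypothetical, not the AzureSearch or requests API):

```python
def low_level_request(query, **kwargs):
    """Stand-in for a lower layer that rejects unknown keyword arguments."""
    allowed = {"top", "filter"}
    unexpected = set(kwargs) - allowed
    if unexpected:
        raise TypeError(f"unexpected keyword arguments: {sorted(unexpected)}")
    return {"query": query, **kwargs}


def similarity_search_with_score(query, **kwargs):
    """Consume search_type here so it never reaches the lower layer."""
    search_type = kwargs.pop("search_type", "similarity")
    result = low_level_request(query, **kwargs)
    result["search_type"] = search_type
    return result
```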
itaismith	7b404fcd37	partners[chroma]: Upgrade Chroma to 0.6.x (#29404 )	2025-01-27 15:32:21 -05:00
Teruaki Ishizaki	3fce78994e	community: Fixed the procedure of initializing pad_token_id (#29434 ) - Description: Add to check pad_token_id and eos_token_id of model config. It seems that this is the same bug as the HuggingFace TGI bug. In addition, the source code of libs/partners/huggingface/langchain_huggingface/llms/huggingface_pipeline.py also requires similar changes. - Issue: #29431 - Dependencies: none - Twitter handle: tell14	2025-01-27 14:54:54 -05:00
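One way to picture the check is as a fallback chain over the tokenizer and model config. The sketch below uses plain stand-in objects and a hypothetical helper `resolve_pad_token_id`; the lookup order is an assumption for illustration and may differ from the actual fix.

```python
from types import SimpleNamespace


def resolve_pad_token_id(tokenizer, model_config):
    """Pick a pad token id, preferring explicit settings over fallbacks."""
    candidates = (
        getattr(tokenizer, "pad_token_id", None),
        getattr(model_config, "pad_token_id", None),
        getattr(model_config, "eos_token_id", None),
        getattr(tokenizer, "eos_token_id", None),
    )
    for candidate in candidates:
        # eos_token_id may be a list in some configs; accept only a single int.
        if isinstance(candidate, int):
            return candidate
    return None


# Stand-ins for a tokenizer and a model config with no pad token defined.
tokenizer = SimpleNamespace(pad_token_id=None, eos_token_id=2)
config = SimpleNamespace(pad_token_id=None, eos_token_id=[2, 7])
```

Checking `isinstance(candidate, int)` rather than truthiness matters because `0` is a valid token id.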
Christophe Bornet	dbb6b7b103	core: Add ruff rules TRY (tryceratops) (#29388 ) TRY004 ("use TypeError rather than ValueError") existing errors are marked as ignore to preserve backward compatibility. LMK if you prefer to fix some of them. Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-24 05:01:40 +00:00
Erick Friis	723b603f52	docs: groq api key links (#29402 )	2025-01-24 04:33:18 +00:00
ccurme	bbc50f65e7	anthropic[patch]: release 0.3.4 (#29399 )	2025-01-23 23:55:58 +00:00
ccurme	ed797e17fb	anthropic[patch]: always return content blocks if citations are generated (#29398 ) We currently return string (and therefore no content blocks / citations) if the response is of the form ``` [ {"text": "a claim", "citations": [...]}, ] ``` There are other cases where we do return citations as-is: ``` [ {"text": "a claim", "citations": [...]}, {"text": "some other text"}, {"text": "another claim", "citations": [...]}, ] ``` Here we update to return content blocks including citations in the first case as well.	2025-01-23 18:47:23 -05:00
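The updated rule can be sketched as a small function. This is an illustration under an assumption, not the integration's code: it supposes that only a single citation-free text block is collapsed to a plain string, and `format_content` is a hypothetical name.

```python
def format_content(blocks):
    """Collapse to a string only for one text block that has no citations."""
    if len(blocks) == 1 and not blocks[0].get("citations"):
        return blocks[0]["text"]
    # Any citations, or more than one block: keep the content blocks as-is.
    return blocks
```

With this rule, the first case from the commit message, a single block that carries citations, is returned as a list of content blocks rather than a string.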
Bagatur	317fb86fd9	openai[patch]: fix int test (#29395 )	2025-01-23 21:23:01 +00:00
Bagatur	8d566a8fe7	openai[patch]: detect old models in with_structured_output (#29392 ) Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-23 20:47:32 +00:00
Christophe Bornet	b6ae7ca91d	core: Cache RunnableLambda __repr__ (#29199 ) `RunnableLambda`'s `__repr__` may do costly OS operation by calling `get_lambda_source`. So it's better to cache it. See #29043 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-23 18:34:47 +00:00
Christophe Bornet	618e550f06	core: Cache RunnableLambda deps (#29200 ) `RunnableLambda`'s `deps` may do costly OS operation by calling `get_function_nonlocals`. So it's better to cache it. See #29043	2025-01-23 13:09:07 -05:00
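The caching idea in this commit and the previous one can be sketched with `functools.cached_property`. `LambdaWrapper` is a hypothetical stand-in, not `RunnableLambda`, and the `source_lookups` counter exists only to make the caching observable.

```python
import functools
import inspect


class LambdaWrapper:
    """Wraps a callable and caches its potentially expensive repr."""

    def __init__(self, func):
        self.func = func
        self.source_lookups = 0  # instrumentation for this example only

    @functools.cached_property
    def _repr(self):
        # Computed at most once per instance, then stored on the instance.
        self.source_lookups += 1
        try:
            source = inspect.getsource(self.func).strip()
        except (OSError, TypeError):
            # Source is unavailable (for example, built-in functions).
            source = getattr(self.func, "__name__", "...")
        return f"LambdaWrapper({source})"

    def __repr__(self):
        return self._repr
```

Reading source code touches the file system, so repeated `repr()` calls on the uncached version would repeat that I/O each time.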
ccurme	f795ab99ec	docs: fix title rendered for integration package (#29387 ) "Tilores LangchAIn" -> "Tilores"	2025-01-23 12:21:19 -05:00
Stefan Berkner	8977451c76	docs: add Tilores provider and tools (#29244 ) Description: This PR adds documentation for the Tilores provider and tools. Issue: closes #26320	2025-01-23 12:17:59 -05:00
Ahmed Tammaa	d5b8aabb32	text-splitters[patch]: delete unused html_chunks_with_headers.xslt (#29340 ) This pull request removes the now-unused html_chunks_with_headers.xslt file from the codebase. In a previous update ([PR #27678](https://github.com/langchain-ai/langchain/pull/27678)), the HTMLHeaderTextSplitter class was refactored to utilize BeautifulSoup instead of lxml and XSLT for HTML processing. As a result, the html_chunks_with_headers.xslt file is no longer necessary and can be safely deleted to maintain code cleanliness and reduce potential confusion. Issue: N/A Dependencies: N/A	2025-01-23 11:29:08 -05:00
Wang Ran (汪然)	8f2c11e17b	core[patch]: fix API reference for draw_ascii (#29370 ) Typo fix: the method is `draw_ascii`, not `draw`, among other corrections. The example now works. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-23 16:04:58 +00:00
Loris Alexandre	e4921239a6	community: missing mandatory parameter partition_key for AzureCosmosDBNoSqlVectorSearch (#29382 ) - Description: the `delete` function of AzureCosmosDBNoSqlVectorSearch calls `self._container.delete_item(document_id)`, which is missing the mandatory `partition_key` parameter. The fix uses the class function `delete_document_by_id` to provide a default `partition_key`. - Issue: #29372 - Dependencies: None - Twitter handle: None Co-authored-by: Loris Alexandre <loris.alexandre@boursorama.fr>	2025-01-23 10:05:10 -05:00
Terry Tan	ec0ebb76f2	community: fix Google Scholar tool errors (#29371 ) Resolve https://github.com/langchain-ai/langchain/issues/27557	2025-01-23 10:03:01 -05:00
江同学呀	a1e62070d0	community: Fix the problem of error reporting when OCR extracts text from PDF. (#29378 ) - Description: Fixes an issue where images could not be recognized from `xObject[obj]["/Filter"]` (whose value can be either a string or a list of strings) in the `_extract_images_from_page()` method. It also resolves the bug where vectorization by Faiss fails with `IndexError: list index out of range` because image extraction fails on a PDF containing only images. - Issue: Fixes the following issues: [#15227](https://github.com/langchain-ai/langchain/issues/15227) [#22892](https://github.com/langchain-ai/langchain/issues/22892) [#26652](https://github.com/langchain-ai/langchain/issues/26652) [#27153](https://github.com/langchain-ai/langchain/issues/27153) Related issues: [#7067](https://github.com/langchain-ai/langchain/issues/7067) - Dependencies: None - Twitter handle: None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-23 15:01:52 +00:00
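Because the `/Filter` entry may be a single name or a list of names, a robust check normalizes it to a list first. A minimal sketch with hypothetical helpers (`normalize_filters`, `first_supported_filter`); the filter names in the usage are ordinary PDF filter names used as sample data, not the parser's actual configuration.

```python
def normalize_filters(filter_value):
    """Return the /Filter entry as a list of names, whatever its original shape."""
    if filter_value is None:
        return []
    if isinstance(filter_value, str):
        return [filter_value]
    return [str(name) for name in filter_value]


def first_supported_filter(filter_value, supported):
    """Return the first filter name that is in `supported`, or None."""
    for name in normalize_filters(filter_value):
        if name in supported:
            return name
    return None
```

For example, `first_supported_filter(["/FlateDecode", "/DCTDecode"], {"/DCTDecode"})` finds the supported name even though the value is a list, a case where indexing the value as if it were a string would fail.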
Tim Mallezie	a13faab6b7	community: allow setting the GitLab URL in the GitLab tool constructor (#29380 ) This PR extends the GitLab URL setting so that it can also be passed to the constructor, instead of only through environment variables. This allows something like the following:
```
# Create the GitLab API wrapper
gitlab_api = GitLabAPIWrapper(
    gitlab_url=self.gitlab_url,
    gitlab_personal_access_token=self.gitlab_personal_access_token,
    gitlab_repository=self.gitlab_repository,
    gitlab_branch=self.gitlab_branch,
    gitlab_base_branch=self.gitlab_base_branch,
)
```
Previously, the URL could not be set in the constructor. Co-authored-by: Tim Mallezie <tim.mallezie@dropsolid.com>	2025-01-23 09:36:27 -05:00
Tyllen	f2ea62f632	docs: add payman docs (#29362 ) - Description: Adding the docs to use the payman-langchain integration :)	2025-01-22 18:37:47 -08:00
Erick Friis	3f1d20964a	standard-tests: release 0.3.9 (#29356 )	2025-01-22 09:46:19 -08:00
Macs Dickinson	7378c955db	community: adds support for getting github releases for the configured repository (#29318 ) Description: adds support for the GitHub tool to query GitHub releases on the configured repository Issue: N/A Dependencies: N/A Twitter handle: @macsdickinson --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-22 15:45:52 +00:00
Tayaa Med Amine	ef1610e24a	langchain[patch]: support ollama in init_embeddings (#29349 ) Why not Ollama? --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-22 14:47:12 +00:00
Siddhant	9eb10a9240	langchain: added vectorstore docstring linting (#29241 ) …ore.py Added docstring linting in the vectorstore.py file relating to issue #25154 --------- Co-authored-by: Siddhant Jain <sjain35@buffalo.edu> Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 03:47:43 +00:00
Sohan	de1fc4811d	packages, docs: Pipeshift - Langchain integration of pipeshift (#29114 ) Description: Added pipeshift integration. This integrates pipeshift LLM and ChatModels APIs with langchain Dependencies: none Unit Tests & Integration tests are added Documentation is added as well This PR is w.r.t [#27390](https://github.com/langchain-ai/langchain/pull/27390) and as per request, a freshly minted `langchain-pipeshift` package is uploaded to PYPI. Only changes to the docs & packages.yml are made in langchain master branch --------- Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 03:03:06 +00:00
Christophe Bornet	836c791829	text-splitters: Bump ruff version to 0.9 (#29231 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:27:58 +00:00
Christophe Bornet	a004dec119	langchain: Bump ruff version to 0.9 (#29211 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:26:39 +00:00
Christophe Bornet	2340b3154d	standard-tests: Bump ruff version to 0.9 (#29230 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:23:01 +00:00
Christophe Bornet	e4a78dfc2a	core: Bump ruff version to 0.9 (#29201 ) Also run some preview autofix and formatting --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:20:09 +00:00
Ella Charlaix	6f95db81b7	huggingface: Add IPEX models support (#29179 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:16:44 +00:00
Bhav Sardana	d6a7aaa97d	community: Fix for Pydantic model validator of GoogleApiClient (#29346 ) - Description: Fix for Pydantic model validator for GoogleApiHandler - Issue: #29165 --------- Signed-off-by: Bhav Sardana <sardana.bhav@gmail.com>	2025-01-21 15:17:43 -05:00
Christophe Bornet	1c4ce7b42b	core: Auto-fix some docstrings (#29337 )	2025-01-21 13:29:53 -05:00
ccurme	86a0720310	fireworks[patch]: update model used in integration tests (#29342 ) No access to firefunction-v1 and -v2.	2025-01-21 11:05:30 -05:00
Hugo Berg	32c9c58adf	Community: fix missing f-string modifier in oai structured output parsing error (#29326 ) - Description: The ValueError raised on certain structured-outputs parsing errors, in langchain openai community integration, was missing a f-string modifier and so didn't produce useful outputs. This is a 2-line, 2-character change. - Issue: None open that this fixes - Dependencies: Nothing changed - Twitter handle: None - [X] Add tests and docs: There's nothing to add for. - [-] Lint and test: Happy to run this if you deem it necessary. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-21 14:26:38 +00:00
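The class of bug is easy to reproduce: without the `f` prefix, the braces are emitted literally instead of being interpolated. The message text below is hypothetical and is not the string from the integration.

```python
def broken_message(error):
    # Missing "f" prefix: the placeholder is printed verbatim.
    return "Could not parse structured output: {error}"


def fixed_message(error):
    # With the prefix, the value is interpolated into the message.
    return f"Could not parse structured output: {error}"
```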
Nuno Campos	566915d7cf	core: fix call to get closure vars for partial-wrapped funcs (#29316 )	2025-01-21 09:26:15 -05:00
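The commit body gives no details, so the following is only a sketch of the general problem: `inspect.getclosurevars` accepts plain Python functions and raises `TypeError` for a `functools.partial` object. One way to handle it, assumed here and not necessarily what the commit does, is to unwrap the partial first. `get_closure_nonlocals` and `make_adder` are hypothetical names.

```python
import functools
import inspect


def get_closure_nonlocals(func):
    """Return the nonlocal closure variables, unwrapping partials first."""
    while isinstance(func, functools.partial):
        func = func.func
    return dict(inspect.getclosurevars(func).nonlocals)


def make_adder(offset):
    def add(x, y):
        return x + y + offset  # "offset" is a closure variable
    return add


# A partial wrapping a closure: getclosurevars cannot inspect it directly.
partial_add = functools.partial(make_adder(10), 1)
```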
ZhangShenao	33e22ccb19	[Doc] Improve api doc (#29324 ) - Fix doc description - Add static method decorator	2025-01-21 09:16:08 -05:00
Bagatur	536b44a47f	community[patch]: Release 0.3.15 (#29325 )	2025-01-21 03:10:07 +00:00
Bagatur	ec5fae76d4	langchain[patch]: Release 0.3.15 (#29322 )	2025-01-21 02:24:11 +00:00
Bagatur	923e6fb321	core[patch]: 0.3.31 (#29320 )	2025-01-21 01:17:31 +00:00
Ahmed Tammaa	d3ed9b86be	text-splitters[minor]: Replace lxml and XSLT with BeautifulSoup in HTMLHeaderTextSplitter for Improved Large HTML File Processing (#27678 ) This pull request updates the `HTMLHeaderTextSplitter` by replacing the `split_text_from_file` method's implementation. The original method used `lxml` and XSLT for processing HTML files, which caused `lxml.etree.xsltapplyerror maxhead` when handling large HTML documents due to limitations in the XSLT processor. Fixes #13149 By switching to BeautifulSoup (`bs4`), we achieve: - Improved Performance and Reliability: BeautifulSoup efficiently processes large HTML files without the errors associated with `lxml` and XSLT. - Simplified Dependencies: Removes the dependency on `lxml` and external XSLT files, relying instead on the widely used `beautifulsoup4` library. - Maintained Functionality: The new method replicates the original behavior, ensuring compatibility with existing code and preserving the extraction of content and metadata. Issue: This change addresses issues related to processing large HTML files with the existing `HTMLHeaderTextSplitter` implementation. It resolves problems where users encounter lxml.etree.xsltapplyerror maxhead due to large HTML documents. Dependencies: - BeautifulSoup (`beautifulsoup4`): The `beautifulsoup4` library is now used for parsing HTML content. - Installation: `pip install beautifulsoup4` Code Changes: Updated the `split_text_from_file` method in `HTMLHeaderTextSplitter` as follows:
```python
def split_text_from_file(self, file: Any) -> List[Document]:
    """Split HTML file using BeautifulSoup.

    Args:
        file: HTML file path or file-like object.

    Returns:
        List of Document objects with page_content and metadata.
    """
    from bs4 import BeautifulSoup
    from langchain.docstore.document import Document
    import bs4

    # Read the HTML content from the file or file-like object
    if isinstance(file, str):
        with open(file, 'r', encoding='utf-8') as f:
            html_content = f.read()
    else:
        # Assuming file is a file-like object
        html_content = file.read()

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(html_content, 'html.parser')

    # Extract the header tags and their corresponding metadata keys
    headers_to_split_on = [tag[0] for tag in self.headers_to_split_on]
    header_mapping = dict(self.headers_to_split_on)

    documents = []

    # Find the body of the document
    body = soup.body if soup.body else soup

    # Find all header tags in the order they appear
    all_headers = body.find_all(headers_to_split_on)

    # If there's content before the first header, collect it
    first_header = all_headers[0] if all_headers else None
    if first_header:
        pre_header_content = ''
        for elem in first_header.find_all_previous():
            if isinstance(elem, bs4.Tag):
                text = elem.get_text(separator=' ', strip=True)
                if text:
                    pre_header_content = text + ' ' + pre_header_content
        if pre_header_content.strip():
            documents.append(Document(
                page_content=pre_header_content.strip(),
                metadata={}  # No metadata since there's no header
            ))
    else:
        # If no headers are found, return the whole content
        full_text = body.get_text(separator=' ', strip=True)
        if full_text.strip():
            documents.append(Document(
                page_content=full_text.strip(),
                metadata={}
            ))
        return documents

    # Process each header and its associated content
    for header in all_headers:
        current_metadata = {}
        header_name = header.name
        header_text = header.get_text(separator=' ', strip=True)
        current_metadata[header_mapping[header_name]] = header_text

        # Collect all sibling elements until the next header of the same or higher level
        content_elements = []
        for sibling in header.find_next_siblings():
            if sibling.name in headers_to_split_on:
                # Stop at the next header
                break
            if isinstance(sibling, bs4.Tag):
                content_elements.append(sibling)

        # Get the text content of the collected elements
        current_content = ''
        for elem in content_elements:
            text = elem.get_text(separator=' ', strip=True)
            if text:
                current_content += text + ' '

        # Create a Document if there is content
        if current_content.strip():
            documents.append(Document(
                page_content=current_content.strip(),
                metadata=current_metadata.copy()
            ))
        else:
            # If there's no content, but we have metadata, still create a Document
            documents.append(Document(
                page_content='',
                metadata=current_metadata.copy()
            ))

    return documents
```
--------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 16:10:37 -05:00
Christophe Bornet	989eec4b7b	core: Add ruff rule S101 (no assert) (#29267 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 20:24:31 +00:00
Christophe Bornet	e5d62c6ce7	core: Add ruff rule W293 (whitespaces) (#29272 )	2025-01-20 15:16:12 -05:00
Philippe PRADOS	4efc5093c1	community[minor]: Refactoring PyMuPDF parser, loader and add image blob parsers (#29063 ) * Adds BlobParsers for images. These implementations can take an image and produce one or more documents per image. This interface can be used for exposing OCR capabilities. * Updates PyMuPDFParser and Loader to standardize metadata, handle images, improve table extraction, etc. - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This part focuses on preparing the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 15:15:43 -05:00
Syed Baqar Abbas	f175319303	[feat] Added backwards compatibility for OllamaEmbeddings initialization (migration from `langchain_community.embeddings` to `langchain_ollama.embeddings` (#29296 ) - Description: Because `OllamaEmbeddings` from `langchain_community.embeddings` is deprecated, code is being migrated to `langchain_ollama.embeddings`. However, the new class is not backward compatible with the parameters previously accepted when initializing an `OllamaEmbeddings` object. - Issue: #29294 - Dependencies: None - Twitter handle: @BaqarAbbas2001 ## Additional Information Previously, `OllamaEmbeddings` from `langchain_community.embeddings` supported the following options: `e9abe583b2/libs/community/langchain_community/embeddings/ollama.py (L125-L139)` However, in the new package (`from langchain_ollama import OllamaEmbeddings`) there is no way to set these options. I have added these parameters to resolve this issue. This issue was also discussed in https://github.com/langchain-ai/langchain/discussions/29113	2025-01-20 11:16:29 -05:00
CLOVA Studio 개발	7a95ffc775	community: fix some features on Naver ChatModel & embedding model 2 (#29243 ) ## Description - Respond to the `NCP API Key` changes. - Fix the `ChatClovaX` `astream` function so that it raises `SSEError` when an error event occurs. - Add `token length` and `ai_filter` to ChatClovaX's `response_metadata`. - Update the documentation to reflect the NCP API Key changes. cc. @efriis @vbarda	2025-01-20 11:01:03 -05:00
Sangyun_LEE	5d64597490	docs: fix broken Appearance of langchain_community/document_loaders/recursive_url_loader API Reference (#29305 ) # PR message ## Description Fixed the broken appearance of the RecursiveUrlLoader API Reference. ## Issue: N/A ## Dependencies None ## Twitter handle N/A # Add tests and docs Not applicable; this change only affects documentation. # Lint and test Ran make format, make lint, and make test to ensure no issues.	2025-01-20 10:56:59 -05:00
Hemant Rawat	6c52378992	Add Google-style docstring linting and update pyproject.toml (#29303 ) ### Description: This PR introduces Google-style docstring linting for the ModelLaboratory class in libs/langchain/langchain/model_laboratory.py. It also updates the pyproject.toml file to comply with the latest Ruff configuration standards (top-level lint settings are deprecated in favor of the [tool.ruff.lint] section). ### Changes include: - [x] Added detailed Google-style docstrings to all methods in ModelLaboratory. - [x] Updated pyproject.toml to move select and pydocstyle settings under the [tool.ruff.lint] section. - [x] Ensured all files pass Ruff linting. Issue: Closes #25154 ### Dependencies: No additional dependencies are required for this change. ### Checklist - [x] Files pass Ruff linting. - [x] Docstrings conform to the Google-style convention. - [x] pyproject.toml updated to avoid deprecation warnings. - [x] My PR is ready to review, please review.	2025-01-19 14:37:21 -05:00
Mohammad Mohtashim	b5fbebb3c8	(Community): Changing the BaseURL and Model for MiniMax (#29299 ) - Description: Changed the Base Default Model and Base URL to correct versions. Plus added a more explicit exception if user provides an invalid API Key - Issue: #29278	2025-01-19 14:15:02 -05:00
ccurme	c20f7418c7	openai[patch]: fix Azure LLM test (#29302 ) The tokens I get are: ``` ['', '\n\n', 'The', ' sun', ' was', ' setting', ' over', ' the', ' horizon', ',', ' casting', ''] ``` so possibly an extra empty token is included in the output. lmk @efriis if we should look into this further.	2025-01-19 17:25:42 +00:00
ccurme	6b249a0dc2	openai[patch]: release 0.3.1 (#29301 )	2025-01-19 17:04:00 +00:00
ThomasSaulou	e9abe583b2	chatperplexity stream-citations in additional kwargs (#29273 ) chatperplexity stream-citations in additional kwargs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-18 22:31:10 +00:00
TheSongg	1cd4d8d101	[langchain_community.llms.xinference]: Rewrite _stream() method and support stream() method in xinference.py (#29259 ) - PR message: Rewrite the `_stream` method so that `chain.stream()` can be used to return data streams: chain = prompt | llm; chain.stream(input=user_input) - Tests: from langchain_community.llms import Xinference; from langchain.prompts import PromptTemplate; llm = Xinference(server_url="http://0.0.0.0:9997", model_uid={model_uid}, stream=True); prompt = PromptTemplate(input_variables=['country'], template="Q: where can we visit in the capital of {country}? A:"); chain = prompt | llm; chain.stream(input={'country': 'France'}) (replace `server_url` with your Xinference server URL and `{model_uid}` with the model UID returned when launching the model)	2025-01-17 20:31:59 -05:00
ccurme	184ea8aeb2	anthropic[patch]: update tool choice type (#29276 )	2025-01-17 15:26:33 -05:00
ccurme	ac52021097	anthropic[patch]: release 0.3.2 (#29275 )	2025-01-17 19:48:31 +00:00
ccurme	c616b445f2	anthropic[patch]: support `parallel_tool_calls` (#29257 ) Need to: - Update docs - Decide if this is an explicit kwarg of bind_tools - Decide if this should be in standard test with flag for supporting	2025-01-17 19:41:41 +00:00
ccurme	d5360b9bd6	core[patch]: release 0.3.30 (#29256 )	2025-01-16 17:52:37 -05:00
Nuno Campos	595297e2e5	core: Add support for calls in get_function_nonlocals (#29255 )	2025-01-16 14:43:42 -08:00
Luis Lopez	75663f2cae	community: Add cost per 1K tokens for fine-tuned model cached input (#29248 ) ### Description - No cost per 1K input tokens is defined for the fine-tuned cached version of `gpt-4o-mini-2024-07-18`, so `OpenAICallbackHandler` raises an error when calls are made with that model. - This PR adds the price to the `MODEL_COST_PER_1K_TOKENS` dictionary. cc. @efriis	2025-01-16 15:19:26 -05:00
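
The failure described in the last entry above (#29248) can be illustrated with a minimal sketch. It assumes a price table keyed by model name and a lookup that rejects unknown keys. The table name, keys, prices, and the `token_cost` helper below are hypothetical placeholders, not the contents of `MODEL_COST_PER_1K_TOKENS` or the actual `OpenAICallbackHandler` logic.

```python
# Hypothetical, trimmed-down price table. Keys and prices are placeholders,
# not the values shipped in langchain_community.
COST_PER_1K_TOKENS = {
    "example-model": 0.001,
    "example-model-cached": 0.0005,
    "ft:example-model": 0.003,
    # "ft:example-model-cached" is deliberately missing here.
}


def token_cost(model_key: str, num_tokens: int) -> float:
    """Return the cost of num_tokens for model_key; reject unknown keys."""
    if model_key not in COST_PER_1K_TOKENS:
        raise ValueError(f"Unknown model: {model_key}")
    return COST_PER_1K_TOKENS[model_key] * (num_tokens / 1000)


# A known key resolves to a price.
known_cost = token_cost("example-model", 2000)

# The missing fine-tuned cached key raises until an entry is added.
try:
    token_cost("ft:example-model-cached", 2000)
    raised = False
except ValueError:
    raised = True

# Adding the entry (the kind of change the commit describes) fixes the lookup.
COST_PER_1K_TOKENS["ft:example-model-cached"] = 0.0015
fixed_cost = token_cost("ft:example-model-cached", 2000)
```

Under these assumptions the fix is purely a data change: the lookup code stays the same and only the table gains a key.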

6495 Commits