langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-10 13:27:36 +00:00

Author	SHA1	Message	Date
Erick Friis	4bc6cb759f	docs: update recommended code interpreters (#29236 ) unstable :(	2025-01-15 16:03:26 -08:00
Erick Friis	5eb4dc5e06	standard-tests: double messages test (#29237 )	2025-01-15 15:14:29 -08:00
Nithish Raghunandanan	1051fa5729	couchbase: Migrate couchbase partner package to different repo (#29239 ) Description: Migrate the couchbase partner package to [Couchbase-Ecosystem](https://github.com/Couchbase-Ecosystem/langchain-couchbase) org	2025-01-15 12:37:27 -08:00
Nadeem Sajjad	eaf2fb287f	community(pypdfloader): added page_label in metadata for pypdf loader (#29225 ) # Description ## Summary This PR adds support for handling multi-labeled page numbers in the PyPDFLoader. Some PDFs use complex page numbering systems where the actual content may begin after multiple introductory pages. The page_label field helps accurately reflect the document’s page structure, making it easier to handle such cases during document parsing. ## Motivation This feature improves document parsing accuracy by allowing users to access the actual page labels instead of relying only on the physical page numbers. This is particularly useful for documents where the first few pages have roman numerals or other non-standard page labels. ## Use Case This feature is especially useful for Retrieval-Augmented Generation (RAG) systems where users may reference page numbers when asking questions. Some PDFs have both labeled page numbers (like roman numerals for introductory sections) and index-based page numbers. For example, a user might ask: "What is mentioned on page 5?" The system can now check both: • Index-based page number (page) • Labeled page number (page_label) This dual-check helps improve retrieval accuracy. Additionally, the results can be validated with an agent or tool to ensure the retrieved pages match the user’s query contextually. ## Code Changes - Added a page_label field to the metadata of the Document class in PyPDFLoader. - Implemented support for retrieving page_label from the pdf_reader.page_labels. - Created a test case (test_pypdf_loader_with_multi_label_page_numbers) with a sample PDF containing multi-labeled pages (geotopo-komprimiert.pdf) [[Source of pdf](https://github.com/py-pdf/sample-files/blob/main/009-pdflatex-geotopo/GeoTopo-komprimiert.pdf)]. - Updated existing tests to ensure compatibility and verify page_label extraction. ## Tests Added - Added a new test case for a PDF with multi-labeled pages. - Verified both page and page_label metadata fields are correctly extracted.	2025-01-15 14:18:07 -05:00
Mehdi	1a38948ee3	Mehdi zare/fmp data doc (#29219 ) Title: community: add Financial Modeling Prep (FMP) API integration Description: Adding LangChain integration for Financial Modeling Prep (FMP) API to enable semantic search and structured tool creation for financial data endpoints. This integration provides semantic endpoint search using vector stores and automatic tool creation with proper typing and error handling. Users can discover relevant financial endpoints using natural language queries and get properly typed LangChain tools for discovered endpoints. Issue: N/A Dependencies: fmp-data>=0.3.1 langchain-core>=0.1.0 faiss-cpu tiktoken Twitter handle: @mehdizarem Unit tests and example notebook have been added: Tests are in tests/integration_tests/est_tools.py and tests/unit_tests/test_tools.py Example notebook is in docs/tools.ipynb All format, lint and test checks pass: pytest mypy . Dependencies are imported within functions and not added to pyproject.toml. The changes are backwards compatible and only affect the community package. --------- Co-authored-by: mehdizare <mehdizare@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-15 15:31:01 +00:00
Mohammad Mohtashim	288613d361	(text-splitters): Small Fix in `_process_html` for HTMLSemanticPreservingSplitter to properly extract the metadata. (#29215 ) - Description: Include `main` in the list of elements whose child elements needs to be processed for splitting the HTML. - Issue: #29184	2025-01-15 10:18:06 -05:00
TheSongg	4867fe7ac8	[langchain_community.llms.xinference]: fix error in xinference.py (#29216 ) - PR message: The old code raised a ValidationError (pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference) when importing Xinference from xinference.py. This issue has been resolved by adjusting its type and default value. File "/media/vdc/python/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__ validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self) pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference client Field required [type=missing, input_value={'server_url': 'http://10...t4', 'model_kwargs': {}}, input_type=dict] For further information visit https://errors.pydantic.dev/2.9/v/missing - Tests: from langchain_community.llms import Xinference llm = Xinference( server_url="http://0.0.0.0:9997", # replace with your xinference server url model_uid={model_uid} # replace model_uid with the model UID returned from launching the model )	2025-01-15 10:11:26 -05:00
Kostadin Devedzhiev	bea5798b04	docs: Fix typo in retrievers documentation: 'An vectorstore' -> 'A vectorstore' (#29221 ) - [x] PR title: "docs: Fix typo in documentation" - [x] PR message: - Description: Fixed a typo in the documentation, changing "An vectorstore" to "A vector store" for grammatical accuracy. - Issue: N/A (no issue filed for this typo fix) - Dependencies: None - Twitter handle: N/A - [x] Add tests and docs: This is a minor documentation fix that doesn't require additional tests or example notebooks. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2025-01-15 10:10:14 -05:00
Sohaib Athar	d1cf10373b	Update elasticsearch_retriever.ipynb (#29223 ) docs: fix typo (connection) - Twitter handle: @ReallyVirtual	2025-01-15 10:09:51 -05:00
Syed Baqar Abbas	4278046329	[fix] Convert table names to list for compatibility in SQLDatabase (#29229 ) - [langchain_community.utilities.SQLDatabase] [fix] Convert table names to list for compatibility in SQLDatabase: - The "package" modified is community - The issue lay in this block of code: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L72-L77)` - Description: When the SQLDatabase is initialized, it calls `self._inspector.get_table_names(schema=schema)`, which is expected to return a list. However, with some connectors (such as Snowflake) the returned value can be another iterable. This results in a type error when concatenating the table names with the view names. I have added explicit type casting to prevent this. - Issue: Fixes #29227 - Dependencies: None - Twitter handle: @BaqarAbbas2001 ## Additional Information When the following method is called for a Snowflake database: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L75)` Snowflake under the hood calls: ```python from snowflake.sqlalchemy.snowdialect import SnowflakeDialect SnowflakeDialect.get_table_names ``` This method returns a `dict_keys()` object, which cannot be concatenated with a list and results in a `TypeError` ### Relevant Library Versions - snowflake-sqlalchemy: 1.7.2 - snowflake-connector-python: 3.12.4 - sqlalchemy: 2.0.20 - langchain_community: 0.3.14	2025-01-15 10:00:03 -05:00
Jin Hyung Ahn	05554265b4	community: Fix ConfluenceLoader load() failure caused by deleted pages (#29232 ) ## Description This PR modifies the is_public_page function in ConfluenceLoader to prevent exceptions caused by deleted pages during the execution of ConfluenceLoader.process_pages(). Example scenario: Consider the following usage of ConfluenceLoader: ```python import os from langchain_community.document_loaders import ConfluenceLoader loader = ConfluenceLoader( url=os.getenv("BASE_URL"), token=os.getenv("TOKEN"), max_pages=1000, cql=f'type=page and lastmodified >= "2020-01-01 00:00"', include_restricted_content=False, ) # Raised Exception : HTTPError: Outdated version/old_draft/trashed? Cannot find content Please provide valid ContentId. documents = loader.load() ``` If a deleted page exists within the query result, the is_public_page function would previously raise an exception when calling get_all_restrictions_for_content, causing the loader.load() process to fail for all pages. By adding a pre-check for the page's "current" status, unnecessary API calls to get_all_restrictions_for_content for non-current pages are avoided. This fix ensures that such pages are skipped without affecting the rest of the loading process. ## Issue N/A (No specific issue number) ## Dependencies No new dependencies are introduced with this change. ## Twitter handle [@zenoengine](https://x.com/zenoengine)	2025-01-15 09:56:23 -05:00
Mohammad Mohtashim	21eb39dff0	[Community]: AzureOpenAIWhisperParser Authentication Fix (#29135 ) - Description: `AzureOpenAIWhisperParser` authentication fix as stated in the issue. - Issue: #29133	2025-01-15 09:44:53 -05:00
Erick Friis	44b41b699c	docs: api docs build folder prep update (#29220 )	2025-01-15 03:52:00 +00:00
Erick Friis	b05543c69b	packages: disable mongodb for api docs (#29218 )	2025-01-15 02:23:01 +00:00
Erick Friis	30badd7a32	packages: update mongodb folder (#29217 )	2025-01-15 02:01:06 +00:00
pm390	76172511fd	community: Additional parameters for OpenAIAssistantV2Runnable (#29207 ) Description: Added additional parameters that could be useful when using OpenAIAssistantV2Runnable. This change is intended to let LangChain users set parameters that cannot be set using the assistants UI (max_completion_tokens, max_prompt_tokens, parallel_tool_calls) and parameters that could be useful for experimenting, like top_p and temperature. This PR originated from the need to use parallel_tool_calls in LangChain. This parameter is very important in OpenAI assistants because, unless it is set to False, strict mode is not respected by OpenAI Assistants (https://platform.openai.com/docs/guides/function-calling#parallel-function-calling). > Note: Currently, if the model calls multiple functions in one turn then strict mode will be disabled for those calls. Issue: None Dependencies: openai	2025-01-14 15:53:37 -05:00
Guy Korland	efadad6067	Add Link to FalkorDB Memory example (#29204 ) - Description: Add Link to FalkorDB Memory example	2025-01-14 13:27:52 -05:00
Bagatur	4ab04ad6be	docs: oai api ref nit (#29210 )	2025-01-14 17:55:16 +00:00
Michael Chin	d9b856abad	community: Deprecate Amazon Neptune resources in langchain-community (#29191 ) Related: https://github.com/langchain-ai/langchain-aws/pull/322 The legacy `NeptuneOpenCypherQAChain` and `NeptuneSparqlQAChain` classes are being replaced by the new LCEL format chains `create_neptune_opencypher_qa_chain` and `create_neptune_sparql_qa_chain`, respectively, in the `langchain_aws` package. This PR adds deprecation warnings to all Neptune classes and functions that have been migrated to `langchain_aws`. All relevant documentation has also been updated to replace `langchain_community` usage with the new `langchain_aws` implementations. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-14 10:23:34 -05:00
Erick Friis	c55af44711	anthropic: pydantic mypy plugin (#29144 )	2025-01-13 15:32:40 -08:00
Erick Friis	cdf3a17e55	docs: fix httpx conflicts with overrides in docs build (#29180 )	2025-01-13 21:25:00 +00:00
ccurme	1bf6576709	cli[patch]: fix anchor links in templates (#29178 ) These are outdated and can break docs builds.	2025-01-13 18:28:18 +00:00
Christopher Varjas	e156b372fb	langchain: support api key argument with OpenAI moderation chain (#29140 ) Description: Makes it possible to instantiate `OpenAIModerationChain` with an `openai_api_key` argument only and no `OPENAI_API_KEY` environment variable defined. Issue: https://github.com/langchain-ai/langchain/issues/25176 Dependencies: `openai` --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-13 11:00:02 -05:00
Nikhil Shahi	335ca3a606	docs: add HyperbrowserLoader docs (#29143 ) ### Description This PR adds docs for the [langchain-hyperbrowser](https://pypi.org/project/langchain-hyperbrowser/) package. It includes a document loader that uses Hyperbrowser to scrape or crawl any urls and return formatted markdown or html content as well as relevant metadata. [Hyperbrowser](https://hyperbrowser.ai) is a platform for running and scaling headless browsers. It lets you launch and manage browser sessions at scale and provides easy to use solutions for any webscraping needs, such as scraping a single page or crawling an entire site. ### Issue None ### Dependencies None ### Twitter Handle `@hyperbrowser`	2025-01-13 10:45:39 -05:00
Zhengren Wang	4c0217681a	cookbook: fix typo in cookbook/mongodb-langchain-cache-memory.ipynb (#29149 ) Description: fix "enviornment" into "environment". Issue: Typo Dependencies: None Twitter handle: zrwang01	2025-01-13 10:35:34 -05:00
Gabe Cornejo	e64bfb537f	docs: Fix old link to Unstructured package in document_loader_markdown.ipynb (#29175 ) Fixed a broken link in `document_loader_markdown.ipynb` to point to the updated documentation page for the Unstructured package. Issue: N/A Dependencies: None	2025-01-13 15:26:01 +00:00
Tymon Żarski	689592f9bb	community: Fix rank-llm import paths for new 0.20.3 version (#29154 ) - The "community" package is being modified to handle updated import paths for the new `rank-llm` version. --- ## Description This PR updates the import paths for the `rank-llm` package to account for changes introduced in version `0.20.3`. The changes ensure compatibility with both pre- and post-revamp versions of `rank-llm`, specifically version `0.12.8`. Conditional imports are introduced based on the detected version of `rank-llm` to handle different path structures for `VicunaReranker`, `ZephyrReranker`, and `SafeOpenai`. ## Issue `RankLLMRerank` throws an error (with GPT models, among others) when the `rank-llm` version is > 0.12.8 - #29156 ## Dependencies This change relies on the `packaging` and `pkg_resources` libraries to handle version checks. ## Twitter handle @tymzar	2025-01-13 10:22:14 -05:00
Andrew	0e3115330d	Add additional_instructions on openai assistant runs create. (#29164 ) - Description: In the functions `_create_run` and `_acreate_run`, the parameters passed to the creation of `openai.resources.beta.threads.runs` were limited. Source: ``` def _create_run(self, input: dict) -> Any: params = { k: v for k, v in input.items() if k in ("instructions", "model", "tools", "run_metadata") } return self.client.beta.threads.runs.create( input["thread_id"], assistant_id=self.assistant_id, **params, ) ``` - OpenAI Documentation ([createRun](https://platform.openai.com/docs/api-reference/runs/createRun)) - Full list of parameters `openai.resources.beta.threads.runs` ([source code](https://github.com/openai/openai-python/blob/main/src/openai/resources/beta/threads/runs/runs.py#L91)) - Issue: Fix #17574 Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-13 10:11:47 -05:00
ccurme	e4ceafa1c8	langchain[patch]: update extended tests for compatibility with langchain-openai==0.3 (#29174 )	2025-01-13 15:04:22 +00:00
Syed Muneeb Abbas	8ef7f3eacc	Fixed the import error in OpenAIWhisperParserLocal and resolved the L… (#29168 ) …angChain parser issue.	2025-01-13 09:47:31 -05:00
Priyansh Agrawal	c115c09b6d	community: add missing format specifier in error log in CubeSemanticLoader (#29172 ) - Description: Add a missing format specifier in an error log in `langchain_community.document_loaders.CubeSemanticLoader` - Issue: raises `TypeError: not all arguments converted during string formatting`	2025-01-13 09:32:57 -05:00
ThomasSaulou	349b5c91c2	fix chatperplexity: remove 'stream' from params in _stream method (#29173 ) quick fix chatperplexity: remove 'stream' from params in _stream method	2025-01-13 09:31:37 -05:00
LIU Yuwei	f980144e9c	community: add init for unstructured file loader (#29101 ) ## Description Add `__init__` for unstructured loader of epub/image/markdown/pdf/ppt/word to restrict the input type to `str` or `Path`. In the [signature](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.markdown.UnstructuredMarkdownLoader.html) these unstructured loaders receive `file_path: str \| List[str] \| Path \| List[Path]`, but actually they only receive `str` or `Path`. ## Issue None ## Dependencies No changes.	2025-01-13 09:26:00 -05:00
Erick Friis	bbc3e3b2cf	openai: disable streaming for o1 by default (#29147 ) Currently 400s https://community.openai.com/t/streaming-support-for-o1-o1-2024-12-17-resulting-in-400-unsupported-value/1085043 o1-mini and o1-preview stream fine	2025-01-11 02:24:11 +00:00
Isaac Francisco	62074bac60	replace all LANGCHAIN_ flags with LANGSMITH_ flags (#29120 )	2025-01-11 01:24:40 +00:00
Bagatur	5c2fbb5b86	docs: Update openai README.md (#29146 )	2025-01-10 17:24:16 -08:00
Erick Friis	0a54aedb85	anthropic: pdf integration test (#29142 )	2025-01-10 21:56:31 +00:00
ccurme	8de8519daf	tests[patch]: release 0.3.8 (#29141 )	2025-01-10 21:53:41 +00:00
Jiang	7d3fb21807	Add lindorm as new integration (#29123 ) The original PR was closed by mistake: [original PR link](https://github.com/langchain-ai/langchain/pull/29085) --------- Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>	2025-01-10 16:30:37 -05:00
Zapiron	7594ad694f	docs: update the correct learning objective YAML instead of XML (#29131 ) Correct the learning objective for the how-to page by changing XML to YAML, which is the format the page teaches. Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-10 16:13:13 -05:00
Mateusz Szewczyk	b1d3e25eb6	docs: Update IBM WatsonxRerank documentation (#29138 ) Update presented model in `WatsonxRerank` documentation.	2025-01-10 15:07:29 -05:00
ccurme	4819b500e8	pinecone[patch]: release 0.2.2 (#29139 )	2025-01-10 14:59:57 -05:00
Ashvin	46fd09ffeb	partner: Update aiohttp in langchain pinecone. (#28863 ) - partner: "Update aiohttp to resolve a vulnerability issue" - Description: I have updated the upper limit of aiohttp from `3.10` to `3.10.5` in the pyproject.toml file of langchain-pinecone. Hopefully this will resolve #28771 . Please review this as I'm quite unsure. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-10 14:54:52 -05:00
ccurme	df5ec45b32	docs[patch]: update docs for langchain-openai==0.3 (#29119 ) Update model for one notebook that specified `gpt-4`. Otherwise just updating cassettes. --------- Co-authored-by: Jacob Lee <jacoblee93@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2025-01-10 13:29:31 -05:00
ccurme	f3d370753f	xai[minor]: release 0.2 (#29132 ) Update `langchain-openai` to 0.3. See [release notes](https://github.com/langchain-ai/langchain/releases/tag/langchain-openai%3D%3D0.3.0) for details. Should only impact default values of `temperature`, `n`, and `max_retries`.	2025-01-10 11:47:27 -05:00
ccurme	6e63ccba84	openai[minor]: release 0.3 (#29100 ) ## Goal Solve the following problems with `langchain-openai`: - Structured output with `o1` [breaks out of the box](https://langchain.slack.com/archives/C050X0VTN56/p1735232400232099). - `with_structured_output` by default does not use OpenAI’s [structured output feature](https://platform.openai.com/docs/guides/structured-outputs). - We override API defaults for temperature and other parameters. ## Breaking changes: - Default method for structured output is changing to OpenAI’s dedicated [structured output feature](https://platform.openai.com/docs/guides/structured-outputs). For schemas specified via TypedDict or JSON schema, strict schema validation is disabled by default but can be enabled by specifying `strict=True`. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - Models that don’t support `method="json_schema"` (e.g., `gpt-4` and `gpt-3.5-turbo`, currently the default model for ChatOpenAI) will raise an error unless `method` is explicitly specified. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - Schemas specified via Pydantic `BaseModel` that have fields with non-null defaults or metadata (like min/max constraints) will raise an error. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - `strict` now defaults to False for `method="json_schema"` when schemas are specified via TypedDict or JSON schema. - To recover previous behavior, use `with_structured_output(schema, strict=True)` - Schemas specified via Pydantic V1 will raise a warning (and use `method="function_calling"`) unless `method` is explicitly specified. - To remove the warning, pass `method="function_calling"` into `with_structured_output`. - Streaming with default structured output method / Pydantic schema no longer generates intermediate streamed chunks. - To recover previous behavior, pass `method="function_calling"` into `with_structured_output`. - We no longer override default temperature (was 0.7 in LangChain, now will follow OpenAI, currently 1.0). - To recover previous behavior, initialize `ChatOpenAI` or `AzureChatOpenAI` with `temperature=0.7`. - Note: conceptually there is a difference between forcing a tool call and forcing a response format. Tool calls may have more concise arguments vs. generating content adhering to a schema. Prompts may need to be adjusted to recover desired behavior. --------- Co-authored-by: Jacob Lee <jacoblee93@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2025-01-10 10:50:32 -05:00
ccurme	facfd42768	docs[patch]: fix links in partner package table (#29112 ) Integrations in external repos are not built into [API ref](https://python.langchain.com/api_reference/), so currently [the table](https://python.langchain.com/docs/integrations/providers/#integration-packages) includes broken links. Here we update the links for this type of package to point to PyPi.	2025-01-09 10:37:15 -05:00
ccurme	815bfa1913	openai[patch]: support streaming with json_schema response format (#29044 ) - Stream JSON string content. Final chunk includes parsed representation (following OpenAI [docs](https://platform.openai.com/docs/guides/structured-outputs#streaming)). - Mildly (?) breaking change: if you were using streaming with `response_format` before, usage metadata will disappear unless you set `stream_usage=True`. ## Response format [Screenshots: before and after] ## with_structured_output For pydantic output, behavior of `with_structured_output` is unchanged (except for warning disappearing), because we pluck the parsed representation straight from OpenAI, and OpenAI doesn't return it until the stream is completed. Open to alternatives (e.g., parsing from content or intermediate dict chunks generated by OpenAI). [Screenshots: before and after]	2025-01-09 10:32:30 -05:00
Panos Vagenas	858f655a25	docs: add Docling loader docs (#29104 ) ### Description This adds the docs for the Docling document loader. [Docling](https://github.com/DS4SD/docling) parses PDF, DOCX, PPTX, HTML, and other formats into a rich unified representation including document layout, tables etc., making them ready for generative AI workflows like RAG. Some references: - https://research.ibm.com/blog/docling-generative-AI - https://www.redhat.com/en/blog/docling-missing-document-processing-companion-generative-ai - [Docling Technical Report](https://arxiv.org/abs/2408.09869) The introduced `DoclingLoader` enables users to: - use various document types in their LLM applications with ease and speed, and - leverage Docling's rich representation for advanced, document-native grounding. ### Issue Replacing PR #27987 as discussed with @efriis [here](https://github.com/langchain-ai/langchain/pull/27987#issuecomment-2489354930). ### Dependencies None --------- Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-01-09 10:15:35 -05:00
fzowl	cc55e32924	docs: Adding voyage-3-large to the .ipynb file (#29098 ) Description: Adding voyage-3-large model to the .ipynb file (it's just extending a list, so not even a code change)	2025-01-09 10:01:55 -05:00
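
The dual check described in commit eaf2fb287f (index-based `page` versus labeled `page_label`) can be illustrated with plain dictionaries standing in for document metadata. This is only a sketch: the helper `find_pages` and the sample metadata are hypothetical, and the zero-based `page` index is an assumption, not something the commit message states.

```python
def find_pages(docs_metadata, requested):
    """Return metadata entries whose index-based page or page_label matches."""
    wanted = str(requested)
    return [
        meta
        for meta in docs_metadata
        if str(meta.get("page")) == wanted or meta.get("page_label") == wanted
    ]


# Hypothetical PDF: two front-matter pages labeled i and ii, then 1..6.
labels = ["i", "ii", "1", "2", "3", "4", "5", "6"]
sample_metadata = [
    {"page": index, "page_label": label} for index, label in enumerate(labels)
]

# "What is mentioned on page 5?" matches both interpretations.
matches = find_pages(sample_metadata, 5)
```

Returning both candidates leaves the final choice to a later step, such as the agent or tool validation the commit message mentions.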
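
The `TypeError` described in commit 4278046329 can be reproduced without Snowflake. In this minimal sketch a plain dict's `keys()` view stands in for the value returned by the dialect, and the names `tables_from_dialect`, `view_names`, and `combine_names` are illustrative rather than taken from `SQLDatabase`.

```python
# Stand-in for a dialect whose get_table_names() returns dict_keys, not a list.
tables_from_dialect = {"orders": None, "customers": None}.keys()
view_names = ["active_customers"]


def combine_names(table_names, views):
    """Concatenate table and view names, casting both to list first."""
    return list(table_names) + list(views)


# Direct concatenation fails because dict_keys does not support "+".
try:
    tables_from_dialect + view_names
    concat_failed = False
except TypeError:
    concat_failed = True

# The explicit cast makes the concatenation safe for any iterable.
all_names = combine_names(tables_from_dialect, view_names)
```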
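
Commit 0e3115330d concerns which keys of the input dict are forwarded to the run-creation call. The sketch below shows the filtering pattern with `additional_instructions` added to the allow-list. The allow-list contents beyond the four keys quoted in the commit, and the `fake_create` stand-in for the OpenAI client call, are assumptions made for illustration.

```python
# Keys forwarded to run creation; "additional_instructions" is the new entry.
ALLOWED_RUN_KEYS = (
    "instructions",
    "model",
    "tools",
    "run_metadata",
    "additional_instructions",
)


def build_run_params(input: dict) -> dict:
    """Keep only the keys that the run-creation call should receive."""
    return {k: v for k, v in input.items() if k in ALLOWED_RUN_KEYS}


def fake_create(thread_id, *, assistant_id, **kwargs):
    """Hypothetical stand-in for the client call; echoes what it received."""
    return {"thread_id": thread_id, "assistant_id": assistant_id, **kwargs}


run_input = {
    "thread_id": "thread_123",
    "content": "not forwarded",
    "additional_instructions": "Answer briefly.",
}
run = fake_create(
    run_input["thread_id"],
    assistant_id="asst_456",
    **build_run_params(run_input),
)
```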
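
The error quoted in commit c115c09b6d comes from printf-style formatting: an argument is supplied but the template has no specifier to consume it. The sketch below reproduces it with the `%` operator directly. The message text and the `format_log` helper are hypothetical, and how `CubeSemanticLoader` actually emits its log is not shown in the commit message.

```python
def format_log(template, *args):
    """Apply printf-style formatting the way a log call would."""
    return template % args


# No "%s" in the template, so the argument has nowhere to go.
try:
    format_log("Request failed with status code:", 500)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Adding the missing specifier resolves it.
fixed = format_log("Request failed with status code: %s", 500)
```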


12438 Commits