langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-06-09 18:50:33 +00:00

Author	SHA1	Message	Date
Mohan Kumar S	3beba77e2e	feat(ollama): support `response_format` (#34612 ) Fixes #34610 --- This PR resolves an issue where `ChatOllama` would raise an `unexpected keyword argument 'response_format'` error when used with `create_agent` or when passed an OpenAI-style `response_format`. When using `create_agent` (especially with models like `gpt-oss`), LangChain creates a `response_format` argument (e.g., `{"type": "json_schema", ...}`). `ChatOllama` previously passed this argument directly to the underlying Ollama client, which does not support `response_format` and instead expects a `format` parameter. ## The Fix I updated `_chat_params` in `libs/partners/ollama/langchain_ollama/chat_models.py` to: 1. Intercept the `response_format` argument. 2. Map it to the native Ollama `format` parameter: * `{"type": "json_schema", "json_schema": {"schema": ...}}` -> `format=schema` * `{"type": "json_object"}` -> `format="json"` 3. Remove `response_format` from the kwargs passed to the client. ## Validation * Reproduction Script: Verified the fix with a script covering `json_schema`, `json_object`, and explicit `format` priority scenarios. * New Tests: Added 3 new unit tests to `libs/partners/ollama/tests/unit_tests/test_chat_models.py` covering these scenarios. * Regression: Ran the full test suite (`make -C libs/partners/ollama test`), passing 29 tests (previously 26). * Lint/Format: Verified with `make lint_package` and `make format`. --------- Co-authored-by: Mohan Kumar Sagadevan <mohankumarsagadevan@Mohans-MacBook-Air.local> Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-04-06 22:23:57 -04:00
Mason Daugherty	2bc982b73c	fix(ollama): serialize `reasoning_content` back to ollama `thinking` (#36573 ) Closes #36177. --- Ollama's deserialization path already captures `"thinking"` content as `additional_kwargs["reasoning_content"]` on `AIMessage`, but the reverse direction — serializing back to the Ollama wire format — was missing. This means multi-turn conversations with reasoning models like `deepseek-r1` would silently drop the chain-of-thought, breaking agents that need prior reasoning preserved across turns.	2026-04-06 21:58:37 -04:00
Dat Nguyen	e71e6564b1	feat(ollama): add `dimensions` to `OllamaEmbeddings` (#36543 ) Fixes #34623 Add `dimensions` field to `OllamaEmbeddings` to allow users to specify output embedding size for models that support variable dimensions . The field is passed directly to the Ollama client's `embed()` call for both sync and async methods. How I verified it works: - Ran unit tests: `python -m pytest tests/unit_tests/ -v` - Ran integration tests against a live Ollama instance: `OLLAMA_HOST=http://ollama:11434 python -m pytest tests/integration_tests/ -v` - Confirmed that passing `dimensions=768` no longer raises `extra_forbidden` Pydantic validation error and returns embeddings of the expected size. --------- Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-04-06 21:50:54 -04:00
Amber Shen	050b779d97	fix(ollama): respect scheme-less `base_url` (#34042 ) Fixes #33986. Summary: - Normalize scheme-less `base_url` values (e.g., `ollama:11434`) by defaulting to `http://` when the input resembles `host:port`. - Preserve and merge `Authorization` headers when `userinfo` credentials are present, both for sync and async clients. - Add unit tests covering scheme-less host:port and scheme-less userinfo credentials. Implementation details: - Update `parse_url_with_auth` to accept scheme-less endpoints, producing a cleaned URL with explicit scheme and extracted auth headers. - No changes required in `OllamaLLM`, `ChatOllama`, or `OllamaEmbeddings`—they already consume the cleaned URL and headers. Why: - Previously, scheme-less inputs caused `parse_url_with_auth` to return `(None, None)`, leading Ollama clients to fall back to defaults and ignore the provided `base_url`. Tests: - Extended `libs/partners/ollama/tests/unit_tests/test_auth.py` to cover the new cases. Notes: - Default scheme chosen is `http` to match common Ollama local deployments. Users can still explicitly provide `https://` when appropriate. --------- Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-04-06 21:39:33 -04:00
Mohammad Mohtashim	0aa482d0cd	feat(ollama): logprobs support in Ollama (#34218 ) Closes #34207 --- Expose log probabilities from the Ollama Python SDK through `ChatOllama`. The ollama client already returns a `logprobs` field on chat responses for supported models, but `ChatOllama` had no way to request or surface it. ## Changes - Add `logprobs` and `top_logprobs` fields to `ChatOllama`, forwarded to the client via `_build_chat_params`. Setting `top_logprobs` without `logprobs=True` auto-enables it with a warning; setting it with `logprobs=False` raises a `ValueError` - Surface per-token logprobs on intermediate streaming chunks (both sync `_create_chat_stream` and async `_create_async_chat_stream`) via `response_metadata["logprobs"]`, accumulated into the final response on `invoke()` - Bump minimum `ollama` SDK from `>=0.6.0` to `>=0.6.1` — the version that added logprobs support --------- Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-04-06 17:06:51 -04:00
Yi Liu	19ddd42891	fix(ollama): raise error when clients are not initialized (#35185 ) ## Summary - When `self._client` is `None` in `_create_chat_stream()`, the method silently produces an empty generator instead of failing. - The error only surfaces later as a misleading `"No data received from Ollama stream"` ValueError, making it difficult to diagnose the actual root cause (uninitialized client). - Changed to raise `RuntimeError` immediately with a clear message when the sync client is not initialized. ## Why this matters Users who hit this path see a confusing error message that points them in the wrong direction. An explicit error at the point of failure makes debugging straightforward. ## Test plan - [x] Added `test_create_chat_stream_raises_when_client_none` - [x] Existing tests still pass > This PR was authored with the help of AI tools. --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2026-02-12 11:56:53 -05:00
Deshbhushan Patil	2a82fbc0ff	test(ollama): Add unit test for `ChatOllama` reasoning parameter (#34095 )	2025-12-12 14:48:04 -05:00
dumko2001	05ba853548	fix(ollama): pop unsupported 'strict' argument in `ChatOllama` (#34114 )	2025-12-12 09:13:11 -05:00
Mason Daugherty	d8a680ee57	style: address Sphinx double-backtick snippet syntax (#33389 )	2025-10-09 13:35:51 -04:00
Mason Daugherty	63097db4fc	fix(ollama): exclude `None` parameters from options dictionary (#33208 )	2025-10-02 11:25:15 -04:00
Mason Daugherty	a9eda18e1e	refactor(ollama): clean up tests (#33198 )	2025-10-01 21:52:01 -04:00
Mason Daugherty	a89c549cb0	feat(ollama): add basic auth support (#32328 ) support for URL authentication in the format `https://user:password@host:port` for all LangChain Ollama clients. Related to #32327 and #25055	2025-10-01 20:46:37 -04:00
Mason Daugherty	986302322f	docs: more standardization (#33124 )	2025-09-25 20:46:20 -04:00
Mason Daugherty	ee4c2510eb	feat: port various nit changes from `wip-v0.4` (#32506 ) Lots of work that wasn't directly related to core improvements/messages/testing functionality	2025-08-11 15:09:08 -04:00
Copilot	d40fd5a3ce	feat(ollama): warn on empty `load` responses (#32161 ) ## Problem When using `ChatOllama` with `create_react_agent`, agents would sometimes terminate prematurely with empty responses when Ollama returned `done_reason: 'load'` responses with no content. This caused agents to return empty `AIMessage` objects instead of actual generated text. ```python from langchain_ollama import ChatOllama from langgraph.prebuilt import create_react_agent from langchain_core.messages import HumanMessage llm = ChatOllama(model='qwen2.5:7b', temperature=0) agent = create_react_agent(model=llm, tools=[]) result = agent.invoke(HumanMessage('Hello'), {"configurable": {"thread_id": "1"}}) # Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'}) # Expected: AIMessage with actual generated content ``` ## Root Cause The `_iterate_over_stream` and `_aiterate_over_stream` methods treated any response with `done: True` as final, regardless of `done_reason`. When Ollama returns `done_reason: 'load'` with empty content, it indicates the model was loaded but no actual generation occurred - this should not be considered a complete response. ## Solution Modified the streaming logic to skip responses when: - `done: True` - `done_reason: 'load'` - Content is empty or contains only whitespace This ensures agents only receive actual generated content while preserving backward compatibility for load responses that do contain content. ## Changes - `_iterate_over_stream`: Skip empty load responses instead of yielding them - `_aiterate_over_stream`: Apply same fix to async streaming - Tests: Added comprehensive test cases covering all edge cases ## Testing All scenarios now work correctly: - ✅ Empty load responses are skipped (fixes original issue) - ✅ Load responses with actual content are preserved (backward compatibility) - ✅ Normal stop responses work unchanged - ✅ Streaming behavior preserved - ✅ `create_react_agent` integration fixed Fixes #31482. <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-22 13:21:11 -04:00
diego-coder	8e4396bb32	fix(ollama): robustly parse single-quoted JSON in tool calls (#32109 ) Description: This PR makes argument parsing for Ollama tool calls more robust. Some LLMs—including Ollama—may return arguments as Python-style dictionaries with single quotes (e.g., `{'a': 1}`), which are not valid JSON and previously caused parsing to fail. The updated `_parse_json_string` method in `langchain_ollama.chat_models` now attempts standard JSON parsing and, if that fails, falls back to `ast.literal_eval` for safe evaluation of Python-style dictionaries. This improves interoperability with LLMs and fixes a common usability issue for tool-based agents. Issue: Closes #30910 Dependencies: None Tests: - Added new unit tests for double-quoted JSON, single-quoted dicts, mixed quoting, and malformed/failure cases. - All tests pass locally, including new coverage for single-quoted inputs. Notes: - No breaking changes. - No new dependencies introduced. - Code is formatted and linted (`ruff format`, `ruff check`). - If maintainers have suggestions for further improvements, I’m happy to revise! Thank you for maintaining LangChain! Looking forward to your feedback.	2025-07-21 12:11:22 -04:00
Copilot	98c3bbbaf0	fix(ollama): `num_gpu` parameter not working in async OllamaEmbeddings method (#32074 ) The `num_gpu` parameter in `OllamaEmbeddings` was not being passed to the Ollama client in the async embedding method, causing GPU acceleration settings to be ignored when using async operations. ## Problem The issue was in the `aembed_documents` method where the `options` parameter (containing `num_gpu` and other configuration) was missing: ```python # Sync method (working correctly) return self._client.embed( self.model, texts, options=self._default_params, keep_alive=self.keep_alive )["embeddings"] # Async method (missing options parameter) return ( await self._async_client.embed( self.model, texts, keep_alive=self.keep_alive # ❌ No options! ) )["embeddings"] ``` This meant that when users specified `num_gpu=4` (or any other GPU configuration), it would work with sync calls but be ignored with async calls. ## Solution Added the missing `options=self._default_params` parameter to the async embed call to match the sync version: ```python # Fixed async method return ( await self._async_client.embed( self.model, texts, options=self._default_params, # ✅ Now includes num_gpu! keep_alive=self.keep_alive, ) )["embeddings"] ``` ## Validation - ✅ Added unit test to verify options are correctly passed in both sync and async methods - ✅ All existing tests continue to pass - ✅ Manual testing confirms `num_gpu` parameter now works correctly - ✅ Code passes linting and formatting checks The fix ensures that GPU configuration works consistently across both synchronous and asynchronous embedding operations. Fixes #32059. <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-16 18:42:52 -04:00
Mason Daugherty	0002b1dafa	ollama[patch]: fix model validation, ensure per-call reasoning can be set, tests (#31927 ) * update model validation due to change in [Ollama client](https://github.com/ollama/ollama) - ensure you are running the latest version (0.9.6) to use `validate_model_on_init` * add code example and fix formatting for ChatOllama reasoning * ensure that setting `reasoning` in invocation kwargs overrides class-level setting * tests	2025-07-08 16:39:41 -04:00
Mason Daugherty	e686a70ee0	ollama: thinking, tool streaming, docs, tests (#31772 ) * New `reasoning` (bool) param to support toggling [Ollama thinking](https://ollama.com/blog/thinking) (#31573, #31700). If `reasoning=True`, Ollama's `thinking` content will be placed in the model responses' `additional_kwargs.reasoning_content`. * Supported by: * ChatOllama (class level, invocation level TODO) * OllamaLLM (TODO) * Added tests to ensure streaming tool calls is successful (#29129) * Refactored tests that relied on `extract_reasoning()` * Myriad docs additions and consistency/typo fixes * Improved type safety in some spots Closes #29129 Addresses #31573 and #31700 Supersedes #31701	2025-07-07 13:56:41 -04:00
Mason Daugherty	572020c4d8	ollama: add `validate_model_on_init`, catch more errors (#31784 ) * Ensure access to local model during `ChatOllama` instantiation (#27720). This adds a new param `validate_model_on_init` (default: `true`) * Catch a few more errors from the Ollama client to assist users	2025-07-03 11:07:11 -04:00
rylativity	dbf9986d44	langchain-ollama (partners) / langchain-core: allow passing ChatMessages to Ollama (including arbitrary roles) (#30411 ) Replacement for PR #30191 (@ccurme) Description: currently, ChatOllama [will raise a value error if a ChatMessage is passed to it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514), as described https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481. Furthermore, ollama-python is removing the limitations on valid roles that can be passed through chat messages to a model in ollama - https://github.com/ollama/ollama-python/pull/462#event-16917810634. This PR removes the role limitations imposed by langchain and enables passing langchain ChatMessages with arbitrary 'role' values through the langchain ChatOllama class to the underlying ollama-python Client. As this PR relies on [merged but unreleased functionality in ollama-python]( https://github.com/ollama/ollama-python/pull/462#event-16917810634), I have temporarily pointed the ollama package source to the main branch of the ollama-python github repo. Format, lint, and tests of new functionality passing. Need to resolve issue with recently added ChatOllama tests. (Now resolved) Issue: resolves #30122 (related to ollama issue https://github.com/ollama/ollama/issues/8955) Dependencies: no new dependencies [x] PR title [x] PR message [x] Lint and test: format, lint, and test all running successfully and passing --------- Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-04-18 10:07:07 -04:00
Sydney Runkle	8c6734325b	partners[lint]: run `pyupgrade` to get code in line with 3.9 standards (#30781 ) Using `pyupgrade` to get all `partners` code up to 3.9 standards (mostly, fixing old `typing` imports).	2025-04-11 07:18:44 -04:00
Mohammad Mohtashim	1103bdfaf1	(Ollama) Fix String Value parsing in _parse_arguments_from_tool_call (#30154 ) - Description: Fix String Value parsing in _parse_arguments_from_tool_call - Issue: #30145 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-19 21:47:18 -04:00
Stavros Kontopoulos	ac22cde130	langchain_ollama: Support keep_alive in embeddings (#30251 ) - Description: Adds support for keep_alive in Ollama Embeddings see https://github.com/ollama/ollama/issues/6401. Builds on top of of https://github.com/langchain-ai/langchain/pull/29296. I have this use case where I want to keep the embeddings model in cpu forever. - Dependencies: no deps are being introduced. - Issue: haven't created an issue yet.	2025-03-14 14:56:50 -04:00
Erick Friis	0dbaf05bb7	standard-tests: rename langchain_standard_tests to langchain_tests, release 0.3.2 (#28203 )	2024-11-18 19:10:39 -08:00
ccurme	b83f1eb0d5	core, partners: implement standard tracing params for LLMs (#25410 )	2024-08-16 13:18:09 -04:00
Isaac Francisco	838464de25	ollama: init package (#23615 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-20 00:43:29 +00:00

27 Commits