langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-16 23:13:31 +00:00

Author	SHA1	Message	Date
Mason Daugherty	281488a5cf	Merge branch 'master' into wip-v0.4	2025-08-11 15:10:42 -04:00
Mason Daugherty	ee4c2510eb	feat: port various nit changes from `wip-v0.4` (#32506 ) Lots of work that wasn't directly related to core improvements/messages/testing functionality	2025-08-11 15:09:08 -04:00
ccurme	45a067509f	fix(core): fix tracing for PDFs in v1 messages (#32434 )	2025-08-11 12:18:32 -04:00
Mason Daugherty	13d67cf37e	fix(ollama): reasoning should come before text content (#32476 )	2025-08-08 19:34:36 -04:00
Mason Daugherty	c1b86cc929	feat: minor core work, v1 standard tests & (most of) v1 ollama (#32315 ) Resolves #32215 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Nuno Campos <nuno@langchain.dev>	2025-08-06 18:22:02 -04:00
Copilot	d40fd5a3ce	feat(ollama): warn on empty `load` responses (#32161 ) ## Problem When using `ChatOllama` with `create_react_agent`, agents would sometimes terminate prematurely with empty responses when Ollama returned `done_reason: 'load'` responses with no content. This caused agents to return empty `AIMessage` objects instead of actual generated text. ```python from langchain_ollama import ChatOllama from langgraph.prebuilt import create_react_agent from langchain_core.messages import HumanMessage llm = ChatOllama(model='qwen2.5:7b', temperature=0) agent = create_react_agent(model=llm, tools=[]) result = agent.invoke(HumanMessage('Hello'), {"configurable": {"thread_id": "1"}}) # Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'}) # Expected: AIMessage with actual generated content ``` ## Root Cause The `_iterate_over_stream` and `_aiterate_over_stream` methods treated any response with `done: True` as final, regardless of `done_reason`. When Ollama returns `done_reason: 'load'` with empty content, it indicates the model was loaded but no actual generation occurred - this should not be considered a complete response. ## Solution Modified the streaming logic to skip responses when: - `done: True` - `done_reason: 'load'` - Content is empty or contains only whitespace This ensures agents only receive actual generated content while preserving backward compatibility for load responses that do contain content. ## Changes - `_iterate_over_stream`: Skip empty load responses instead of yielding them - `_aiterate_over_stream`: Apply same fix to async streaming - Tests: Added comprehensive test cases covering all edge cases ## Testing All scenarios now work correctly: - ✅ Empty load responses are skipped (fixes original issue) - ✅ Load responses with actual content are preserved (backward compatibility) - ✅ Normal stop responses work unchanged - ✅ Streaming behavior preserved - ✅ `create_react_agent` integration fixed Fixes #31482. 
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-22 13:21:11 -04:00
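The skip condition described in #32161 can be sketched roughly as below. This is an illustration under assumptions, not the code in `_iterate_over_stream`: the helper names are hypothetical, and each chunk is assumed to be a plain dict with `done`, `done_reason`, and a `message.content` string.

```python
from collections.abc import Iterable, Iterator
from typing import Any


def is_empty_load_response(chunk: dict[str, Any]) -> bool:
    """Return True for a final chunk that only reports a model load.

    Hypothetical helper; the chunk layout is an assumption for this sketch.
    """
    content = chunk.get("message", {}).get("content") or ""
    return (
        chunk.get("done") is True
        and chunk.get("done_reason") == "load"
        and not content.strip()  # empty or whitespace-only content
    )


def iterate_skipping_empty_loads(
    chunks: Iterable[dict[str, Any]],
) -> Iterator[dict[str, Any]]:
    """Yield every chunk except empty load responses."""
    for chunk in chunks:
        if is_empty_load_response(chunk):
            continue
        yield chunk
```

Load responses that do carry content pass through unchanged, which matches the backward-compatibility note in the commit message.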
diego-coder	8e4396bb32	fix(ollama): robustly parse single-quoted JSON in tool calls (#32109 ) Description: This PR makes argument parsing for Ollama tool calls more robust. Some LLMs—including Ollama—may return arguments as Python-style dictionaries with single quotes (e.g., `{'a': 1}`), which are not valid JSON and previously caused parsing to fail. The updated `_parse_json_string` method in `langchain_ollama.chat_models` now attempts standard JSON parsing and, if that fails, falls back to `ast.literal_eval` for safe evaluation of Python-style dictionaries. This improves interoperability with LLMs and fixes a common usability issue for tool-based agents. Issue: Closes #30910 Dependencies: None Tests: - Added new unit tests for double-quoted JSON, single-quoted dicts, mixed quoting, and malformed/failure cases. - All tests pass locally, including new coverage for single-quoted inputs. Notes: - No breaking changes. - No new dependencies introduced. - Code is formatted and linted (`ruff format`, `ruff check`). - If maintainers have suggestions for further improvements, I’m happy to revise! Thank you for maintaining LangChain! Looking forward to your feedback.	2025-07-21 12:11:22 -04:00
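The JSON-then-`ast.literal_eval` fallback described in #32109 can be sketched as follows. This is a minimal illustration, not the actual `_parse_json_string` method; the function name `parse_tool_arguments` is hypothetical.

```python
import ast
import json
from typing import Any


def parse_tool_arguments(raw: str) -> Any:
    """Parse tool-call arguments given as JSON or as a Python-style literal.

    Hypothetical helper for illustration; not the library's function.
    """
    try:
        # Standard path: valid JSON with double-quoted keys and strings.
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    try:
        # Fallback: single-quoted dicts such as {'a': 1}. literal_eval only
        # accepts Python literals, so it does not run arbitrary code.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"Could not parse tool arguments: {raw!r}") from exc
```

Trying `json.loads` first keeps the common case strict; the fallback only runs when the input is not valid JSON.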
Copilot	98c3bbbaf0	fix(ollama): `num_gpu` parameter not working in async OllamaEmbeddings method (#32074 ) The `num_gpu` parameter in `OllamaEmbeddings` was not being passed to the Ollama client in the async embedding method, causing GPU acceleration settings to be ignored when using async operations. ## Problem The issue was in the `aembed_documents` method where the `options` parameter (containing `num_gpu` and other configuration) was missing: ```python # Sync method (working correctly) return self._client.embed( self.model, texts, options=self._default_params, keep_alive=self.keep_alive )["embeddings"] # Async method (missing options parameter) return ( await self._async_client.embed( self.model, texts, keep_alive=self.keep_alive # ❌ No options! ) )["embeddings"] ``` This meant that when users specified `num_gpu=4` (or any other GPU configuration), it would work with sync calls but be ignored with async calls. ## Solution Added the missing `options=self._default_params` parameter to the async embed call to match the sync version: ```python # Fixed async method return ( await self._async_client.embed( self.model, texts, options=self._default_params, # ✅ Now includes num_gpu! keep_alive=self.keep_alive, ) )["embeddings"] ``` ## Validation - ✅ Added unit test to verify options are correctly passed in both sync and async methods - ✅ All existing tests continue to pass - ✅ Manual testing confirms `num_gpu` parameter now works correctly - ✅ Code passes linting and formatting checks The fix ensures that GPU configuration works consistently across both synchronous and asynchronous embedding operations. Fixes #32059.
--------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com> Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-07-16 18:42:52 -04:00
Mason Daugherty	0002b1dafa	ollama[patch]: fix model validation, ensure per-call reasoning can be set, tests (#31927 ) * update model validation due to change in [Ollama client](https://github.com/ollama/ollama) - ensure you are running the latest version (0.9.6) to use `validate_model_on_init` * add code example and fix formatting for ChatOllama reasoning * ensure that setting `reasoning` in invocation kwargs overrides class-level setting * tests	2025-07-08 16:39:41 -04:00
Mason Daugherty	1f829aacf4	ollama[patch]: ruff fixes and rules (#31924 ) * bump ruff deps * add more thorough ruff rules * fix said rules	2025-07-08 13:42:19 -04:00
Mason Daugherty	e686a70ee0	ollama: thinking, tool streaming, docs, tests (#31772 ) * New `reasoning` (bool) param to support toggling [Ollama thinking](https://ollama.com/blog/thinking) (#31573, #31700). If `reasoning=True`, Ollama's `thinking` content will be placed in the model responses' `additional_kwargs.reasoning_content`. * Supported by: * ChatOllama (class level, invocation level TODO) * OllamaLLM (TODO) * Added tests to ensure streaming tool calls are successful (#29129) * Refactored tests that relied on `extract_reasoning()` * Myriad docs additions and consistency/typo fixes * Improved type safety in some spots Closes #29129 Addresses #31573 and #31700 Supersedes #31701	2025-07-07 13:56:41 -04:00
Mason Daugherty	572020c4d8	ollama: add `validate_model_on_init`, catch more errors (#31784 ) * Ensure access to local model during `ChatOllama` instantiation (#27720). This adds a new param `validate_model_on_init` (default: `true`) * Catch a few more errors from the Ollama client to assist users	2025-07-03 11:07:11 -04:00
Mason Daugherty	2fb27b63f5	ollama: update tests, docs (#31736 ) - docs: for the Ollama notebooks, improve the specificity of some links, add `homebrew` install info, update some wording - tests: halve the number of local models needed to run, from 4 → 2 (shedding 8 GB of required installs) - bump deps (non-breaking) in anticipation of upcoming "thinking" PR	2025-06-25 20:13:20 +00:00
rylativity	dbf9986d44	langchain-ollama (partners) / langchain-core: allow passing ChatMessages to Ollama (including arbitrary roles) (#30411 ) Replacement for PR #30191 (@ccurme) Description: currently, ChatOllama [will raise a value error if a ChatMessage is passed to it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514), as described https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481. Furthermore, ollama-python is removing the limitations on valid roles that can be passed through chat messages to a model in ollama - https://github.com/ollama/ollama-python/pull/462#event-16917810634. This PR removes the role limitations imposed by langchain and enables passing langchain ChatMessages with arbitrary 'role' values through the langchain ChatOllama class to the underlying ollama-python Client. As this PR relies on [merged but unreleased functionality in ollama-python]( https://github.com/ollama/ollama-python/pull/462#event-16917810634), I have temporarily pointed the ollama package source to the main branch of the ollama-python github repo. Format, lint, and tests of new functionality passing. Need to resolve issue with recently added ChatOllama tests. (Now resolved) Issue: resolves #30122 (related to ollama issue https://github.com/ollama/ollama/issues/8955) Dependencies: no new dependencies [x] PR title [x] PR message [x] Lint and test: format, lint, and test all running successfully and passing --------- Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-04-18 10:07:07 -04:00
ccurme	085baef926	ollama[patch]: support standard image format (#30864 ) Following https://github.com/langchain-ai/langchain/pull/30746	2025-04-15 22:14:50 +00:00
Sydney Runkle	8c6734325b	partners[lint]: run `pyupgrade` to get code in line with 3.9 standards (#30781 ) Using `pyupgrade` to get all `partners` code up to 3.9 standards (mostly, fixing old `typing` imports).	2025-04-11 07:18:44 -04:00
Bob Merkus	5700646cc5	ollama: add reasoning model support (e.g. deepseek) (#29689 ) # Description This PR adds reasoning model support for `langchain-ollama` by extracting reasoning token blocks, like those used in deepseek. It was inspired by [ollama-deep-researcher](https://github.com/langchain-ai/ollama-deep-researcher), specifically the parsing of [thinking blocks](`6d1aaf2139/src/assistant/graph.py (L91)`): ```python # TODO: This is a hack to remove the <think> tags w/ Deepseek models # It appears very challenging to prompt them out of the responses while "<think>" in running_summary and "</think>" in running_summary: start = running_summary.find("<think>") end = running_summary.find("</think>") + len("</think>") running_summary = running_summary[:start] + running_summary[end:] ``` This notes that it is very hard to remove the reasoning block from prompting, but we actually want the model to reason in order to increase model performance. This implementation extracts the thinking block, so the client can still expect a proper message to be returned by `ChatOllama` (and use the reasoning content separately when desired). This implementation takes the same approach as [ChatDeepseek](`5d581ba22c/libs/partners/deepseek/langchain_deepseek/chat_models.py (L215)`), which adds the reasoning content to chunk.additional_kwargs.reasoning_content; ```python if hasattr(response.choices[0].message, "reasoning_content"): # type: ignore rtn.generations[0].message.additional_kwargs["reasoning_content"] = ( response.choices[0].message.reasoning_content # type: ignore ) ``` This should probably be handled upstream in ollama + ollama-python, but this seems like a reasonably effective solution. 
This is a standalone example of what is happening: ```python async def deepseek_message_astream( llm: BaseChatModel, messages: list[BaseMessage], config: RunnableConfig | None = None, *, model_target: str = "deepseek-r1", **kwargs: Any, ) -> AsyncIterator[BaseMessageChunk]: """Stream responses from Deepseek models, filtering out <think> tags. Args: llm: The language model to stream from messages: The messages to send to the model Yields: Filtered chunks from the model response """ # check if the model is deepseek based if (llm.name and model_target not in llm.name) or (hasattr(llm, "model") and model_target not in llm.model): async for chunk in llm.astream(messages, config=config, **kwargs): yield chunk return # Yield with a buffer, upon completing the <think></think> tags, move them to the reasoning content and start over buffer = "" async for chunk in llm.astream(messages, config=config, **kwargs): # start or append if not buffer: buffer = chunk.content else: buffer += chunk.content if hasattr(chunk, "content") else chunk # Process buffer to remove <think> tags if "<think>" in buffer or "</think>" in buffer: if hasattr(chunk, "tool_calls") and chunk.tool_calls: raise NotImplementedError("tool calls during reasoning should be removed?") if "<think>" in chunk.content or "</think>" in chunk.content: continue chunk.additional_kwargs["reasoning_content"] = chunk.content chunk.content = "" # upon block completion, reset the buffer if "<think>" in buffer and "</think>" in buffer: buffer = "" yield chunk ``` # Issue Integrating reasoning models (e.g. deepseek-r1) into existing LangChain-based workflows is hard due to the thinking blocks that are included in the message contents. To avoid this, we could match the `ChatOllama` integration with `ChatDeepseek` to return the reasoning content inside `message.additional_kwargs.reasoning_content` instead. # Dependencies None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-21 15:44:54 +00:00
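For a complete (non-streamed) response, the same idea of moving `<think>` blocks out of the message content can be sketched with a regular expression. This is a simplified illustration, not the streaming buffer approach from #29689; the function name `split_reasoning` is hypothetical.

```python
import re

# Non-greedy match so several <think> blocks are handled separately;
# DOTALL lets a block span multiple lines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (content, reasoning) with <think> blocks removed from content.

    Hypothetical helper for illustration only.
    """
    reasoning = "\n".join(block.strip() for block in THINK_BLOCK.findall(text))
    content = THINK_BLOCK.sub("", text).strip()
    return content, reasoning


content, reasoning = split_reasoning("<think>2 + 2 is 4</think>The answer is 4.")
# A caller could then store `reasoning` separately, for example under
# additional_kwargs["reasoning_content"], and keep `content` as the message text.
```

A streaming implementation cannot use this directly, because a tag may arrive split across chunks; that is why the example in the commit message buffers chunk contents.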
Mohammad Mohtashim	1103bdfaf1	(Ollama) Fix String Value parsing in _parse_arguments_from_tool_call (#30154 ) - Description: Fix String Value parsing in _parse_arguments_from_tool_call - Issue: #30145 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-19 21:47:18 -04:00
Lance Martin	46d6bf0330	ollama[minor]: update default method for structured output (#30273 ) From function calling to Ollama's [dedicated structured output feature](https://ollama.com/blog/structured-outputs). --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-18 12:44:22 -04:00
Stavros Kontopoulos	ac22cde130	langchain_ollama: Support keep_alive in embeddings (#30251 ) - Description: Adds support for keep_alive in Ollama Embeddings, see https://github.com/ollama/ollama/issues/6401. Builds on top of https://github.com/langchain-ai/langchain/pull/29296. I have this use case where I want to keep the embeddings model in cpu forever. - Dependencies: no deps are being introduced. - Issue: haven't created an issue yet.	2025-03-14 14:56:50 -04:00
Erick Friis	8f95da4eb1	multiple: structured output tracing standard metadata (#29421 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-29 14:00:26 -08:00
Bagatur	e4d3ccf62f	json mode standard test (#25497 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 18:47:34 +00:00
ccurme	5c6e2cbcda	ollama[patch]: support structured output (#28629 ) - Bump minimum version of `ollama` to 0.4.4 (which also addresses https://github.com/langchain-ai/langchain/issues/28607). - Support recently-released [structured output](https://ollama.com/blog/structured-outputs) feature. This can be accessed by calling `.with_structured_output` with `method="json_schema"` (choice of name [mirrors](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.with_structured_output) what we have for OpenAI's structured output feature). `ChatOllama` previously implemented `.with_structured_output` via the [base implementation](`ec9b41431e/libs/core/langchain_core/language_models/chat_models.py (L1117)`).	2024-12-10 10:36:00 -05:00
TheDannyG	607c60a594	partners/ollama: fix tool calling with nested schemas (#28225 ) ## Description This PR addresses the following: Fixes Issue #25343: - Adds additional logic to parse shallowly nested JSON-encoded strings in tool call arguments, allowing for proper parsing of responses like that of Llama3.1 and 3.2 with nested schemas. Adds Integration Test for Fix: - Adds an Ollama-specific integration test to ensure the issue is resolved and to prevent regressions in the future. Fixes Failing Integration Tests: - Fixes failing integration tests (even prior to changes) caused by `llama3-groq-tool-use` model. Previously, tests `test_structured_output_async` and `test_structured_output_optional_param` failed due to the model not issuing a tool call in the response. Resolved by switching to `llama3.1`. ## Issue Fixes #25343. ## Dependencies No dependencies. ____ Done in collaboration with @ishaan-upadhyay @mirajismail @ZackSteine.	2024-11-27 10:32:02 -05:00
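One way to picture "shallowly nested JSON-encoded strings" from #28225 is a tool-call argument whose value arrives as a JSON string rather than as an object. The sketch below is an assumption about the general technique, not the library's parsing code; `decode_nested_json_strings` is a hypothetical name.

```python
import json
from typing import Any


def decode_nested_json_strings(arguments: dict[str, Any]) -> dict[str, Any]:
    """Decode top-level string values that encode JSON objects or arrays.

    Hypothetical helper; only one level deep, hence "shallowly nested".
    """
    decoded: dict[str, Any] = {}
    for key, value in arguments.items():
        if isinstance(value, str):
            try:
                candidate = json.loads(value)
            except json.JSONDecodeError:
                candidate = value
            # Replace only when the string encoded a container, so ordinary
            # strings (including numeric-looking ones like "3") stay strings.
            decoded[key] = candidate if isinstance(candidate, (dict, list)) else value
        else:
            decoded[key] = value
    return decoded
```

Leaving non-container strings untouched is a deliberate choice in this sketch, so that genuine string arguments are not silently converted to other types.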
Erick Friis	0dbaf05bb7	standard-tests: rename langchain_standard_tests to langchain_tests, release 0.3.2 (#28203 )	2024-11-18 19:10:39 -08:00
Isaac Francisco	a2e90a5a43	add embeddings integration tests (#25508 )	2024-08-16 13:20:37 -07:00
ccurme	b83f1eb0d5	core, partners: implement standard tracing params for LLMs (#25410 )	2024-08-16 13:18:09 -04:00
Isaac Francisco	152427eca1	make image inputs compatible with langchain_ollama (#24619 )	2024-07-26 17:39:57 -07:00
Isaac Francisco	838464de25	ollama: init package (#23615 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-20 00:43:29 +00:00

29 Commits