Commit Graph

84 Commits

Author SHA1 Message Date
Amber Shen
050b779d97 fix(ollama): respect scheme-less base_url (#34042)
Fixes #33986.

Summary:
- Normalize scheme-less `base_url` values (e.g., `ollama:11434`) by
defaulting to `http://` when the input resembles `host:port`.
- Preserve and merge `Authorization` headers when `userinfo` credentials
are present, both for sync and async clients.
- Add unit tests covering scheme-less host:port and scheme-less userinfo
credentials.

Implementation details:
- Update `parse_url_with_auth` to accept scheme-less endpoints,
producing a cleaned URL with explicit scheme and extracted auth headers.
- No changes required in `OllamaLLM`, `ChatOllama`, or
`OllamaEmbeddings`—they already consume the cleaned URL and headers.

Why:
- Previously, scheme-less inputs caused `parse_url_with_auth` to return
`(None, None)`, leading Ollama clients to fall back to defaults and
ignore the provided `base_url`.

Tests:
- Extended `libs/partners/ollama/tests/unit_tests/test_auth.py` to cover
the new cases.

Notes:
- Default scheme chosen is `http` to match common Ollama local
deployments. Users can still explicitly provide `https://` when
appropriate.
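
The normalization described above can be sketched roughly as follows. This is not the actual body of `parse_url_with_auth`; `normalize_base_url` is a hypothetical helper, and treating any input without `://` as `host:port` is a simplifying assumption.

```python
import base64
from urllib.parse import urlparse


def normalize_base_url(url):
    """Hypothetical helper: return (cleaned_url, headers) for a base URL."""
    if not url:
        return None, None
    # Assumption: input without "://" is treated as host:port and gets http://
    candidate = url if "://" in url else f"http://{url}"
    parsed = urlparse(candidate)
    if not parsed.hostname:
        return None, None
    headers = None
    if parsed.username:
        # Move userinfo credentials into a Basic Authorization header
        token = f"{parsed.username}:{parsed.password or ''}".encode()
        headers = {"Authorization": "Basic " + base64.b64encode(token).decode()}
    # Rebuild the netloc without credentials
    netloc = parsed.hostname
    if parsed.port:
        netloc = f"{netloc}:{parsed.port}"
    return parsed._replace(netloc=netloc).geturl(), headers
```

Under these assumptions, `ollama:11434` becomes `http://ollama:11434` with no headers, and an explicit `https://` input keeps its scheme.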

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-04-06 21:39:33 -04:00
Mohammad Mohtashim
0aa482d0cd feat(ollama): logprobs support in Ollama (#34218)
Closes #34207 

---

Expose log probabilities from the Ollama Python SDK through
`ChatOllama`. The ollama client already returns a `logprobs` field on
chat responses for supported models, but `ChatOllama` had no way to
request or surface it.

## Changes
- Add `logprobs` and `top_logprobs` fields to `ChatOllama`, forwarded to
the client via `_build_chat_params`. Setting `top_logprobs` without
`logprobs=True` auto-enables it with a warning; setting it with
`logprobs=False` raises a `ValueError`
- Surface per-token logprobs on intermediate streaming chunks (both sync
`_create_chat_stream` and async `_create_async_chat_stream`) via
`response_metadata["logprobs"]`, accumulated into the final response on
`invoke()`
- Bump minimum `ollama` SDK from `>=0.6.0` to `>=0.6.1` — the version
that added logprobs support
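
The interaction between `logprobs` and `top_logprobs` described above can be sketched as a standalone function. `resolve_logprobs` is a hypothetical name, not part of `ChatOllama`, and the sketch assumes that "without `logprobs=True`" means `logprobs` was left unset (`None`).

```python
import warnings


def resolve_logprobs(logprobs, top_logprobs):
    """Hypothetical validator mirroring the rules in the PR description."""
    if top_logprobs is not None:
        if logprobs is False:
            # Explicitly disabled logprobs conflicts with requesting top_logprobs
            msg = "`top_logprobs` requires `logprobs=True`."
            raise ValueError(msg)
        if logprobs is None:
            # Assumption: unset logprobs is auto-enabled, with a warning
            warnings.warn(
                "`top_logprobs` was set without `logprobs`; enabling `logprobs`.",
                stacklevel=2,
            )
            logprobs = True
    return logprobs, top_logprobs
```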

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-04-06 17:06:51 -04:00
Mason Daugherty
d7575ffac9 chore(ollama): switch to ty (#36571) 2026-04-06 15:07:09 -04:00
Yi Liu
19ddd42891 fix(ollama): raise error when clients are not initialized (#35185)
## Summary
- When `self._client` is `None` in `_create_chat_stream()`, the method
silently produces an empty generator instead of failing.
- The error only surfaces later as a misleading `"No data received from
Ollama stream"` ValueError, making it difficult to diagnose the actual
root cause (uninitialized client).
- Changed to raise `RuntimeError` immediately with a clear message when
the sync client is not initialized.

## Why this matters
Users who hit this path see a confusing error message that points them
in the wrong direction. An explicit error at the point of failure makes
debugging straightforward.
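
A minimal sketch of the fail-fast behavior, assuming a simplified class in place of `ChatOllama`. `ChatStreamSketch` and its error message are hypothetical; only the `RuntimeError` on a missing client comes from the PR description.

```python
class ChatStreamSketch:
    """Hypothetical stand-in for a chat model that wraps a sync client."""

    def __init__(self, client=None):
        self._client = client

    def create_chat_stream(self, **params):
        if self._client is None:
            # Fail at the point of the problem instead of yielding nothing
            msg = "Sync client is not initialized."
            raise RuntimeError(msg)
        yield from self._client.chat(stream=True, **params)
```

Because this is a generator function, the error surfaces on the first iteration rather than at call time, which is still earlier and clearer than an empty-stream error raised downstream.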

## Test plan
- [x] Added `test_create_chat_stream_raises_when_client_none`
- [x] Existing tests still pass

> This PR was authored with the help of AI tools.

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2026-02-12 11:56:53 -05:00
David Fernandez
6bcc4a1af1 docs: Fix TODO in Ollama compatibility docstring (#34713)
Replaces a leftover TODO in
`libs/partners/ollama/langchain_ollama/_compat.py` with a proper return
value description.
2026-01-12 12:52:25 -05:00
Mason Daugherty
18c25e9f10 chore: ban relative imports on all packages (#34691) 2026-01-09 17:02:24 -05:00
dumko2001
05ba853548 fix(ollama): pop unsupported 'strict' argument in ChatOllama (#34114) 2025-12-12 09:13:11 -05:00
Mason Daugherty
ff6e3558d7 docs(fireworks,groq,huggingface,mistralai,ollama,openai): x-ref convert_to_openai_tool (#34276) 2025-12-09 19:51:04 -05:00
Mason Daugherty
47b79c30c0 chore(docs): fix a few refs syntax errors (#34044)
missing whitespace for some admonitions
2025-11-22 00:58:21 -05:00
Mason Daugherty
e023201d42 style: some cleanup (#33857) 2025-11-06 23:50:46 -05:00
Mason Daugherty
d40e340479 chore: attribute package change versions (#33854)
Needed to disambiguate within inherited docs
2025-11-06 16:57:30 -05:00
Mason Daugherty
1d2273597a docs: more fixes for refs (#33554) 2025-10-16 22:54:16 -04:00
Mason Daugherty
291a9fcea1 style: llm -> model (#33423) 2025-10-10 13:19:13 -04:00
Mason Daugherty
6fc21afbc9 style: .. code-block:: admonition translations (#33400)
biiiiiiiiiiiiiiiigggggggg pass
2025-10-09 16:52:58 -04:00
Mason Daugherty
d8a680ee57 style: address Sphinx double-backtick snippet syntax (#33389) 2025-10-09 13:35:51 -04:00
Mason Daugherty
3576e690fa chore: update Sphinx links to markdown (#33386) 2025-10-09 11:54:14 -04:00
Mason Daugherty
b6132fc23e style: remove more Optional syntax (#33371) 2025-10-08 23:28:43 -04:00
Mason Daugherty
31eeb50ce0 chore: drop UP045 (#33362)
Python 3.9 EOL
2025-10-08 21:17:53 -04:00
Mason Daugherty
d13823043d style: monorepo pass for refs (#33359)
* Delete some double backticks previously used by Sphinx (not done
everywhere yet)
* Fix some code blocks / dropdowns

Ignoring CLI CI for now
2025-10-08 18:41:39 -04:00
Mason Daugherty
b1acf8d931 chore: fix dropdown default open admonition in refs (#33354) 2025-10-08 18:50:44 +00:00
Mason Daugherty
ae5b105d11 docs: v1 docs updates (#33173)
Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com>
Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com>
Co-authored-by: Vadym Barda <vadim.barda@gmail.com>
2025-10-02 18:46:26 -04:00
Mason Daugherty
63097db4fc fix(ollama): exclude None parameters from options dictionary (#33208) 2025-10-02 11:25:15 -04:00
Mason Daugherty
eaa6dcce9e release: v1.0.0 (#32567)
Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com>
Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com>
Co-authored-by: Vadym Barda <vadim.barda@gmail.com>
2025-10-02 10:49:42 -04:00
Mason Daugherty
6f2d16e6be refactor(ollama): simplify options handling (#33199)
Fixes #32744

Don't restrict options; the client accepts any dict
2025-10-01 21:58:12 -04:00
Mason Daugherty
a9eda18e1e refactor(ollama): clean up tests (#33198) 2025-10-01 21:52:01 -04:00
Mason Daugherty
a89c549cb0 feat(ollama): add basic auth support (#32328)
Add support for URL authentication in the format
`https://user:password@host:port` for all LangChain Ollama clients.

Related to #32327 and #25055
2025-10-01 20:46:37 -04:00
Mason Daugherty
986302322f docs: more standardization (#33124) 2025-09-25 20:46:20 -04:00
Mason Daugherty
5bea28393d docs: standardize .. code-block directive usage (#33122)
and fix typos
2025-09-25 16:49:56 -04:00
Mason Daugherty
b92b394804 style: repo linting pass (#33089)
enable docstring-code-format
2025-09-24 15:25:55 -04:00
Alexey Bondarenko
181bb91ce0 fix(ollama): Fix handling message content lists (#32881)
The Ollama chat model adapter does not support all of the possible
message content formats. As a result, the adapter crashes on some
messages from other models (e.g. Gemini 2.5 Flash).

These changes should fix one known scenario - when `content` is a list
containing a string.
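
One way to handle list-valued `content` is sketched below. This is an illustration, not the adapter's actual code: `content_to_text` is a hypothetical helper, and the `{"type": "text", "text": ...}` block shape is an assumption about the content format.

```python
def content_to_text(content):
    """Hypothetical helper: flatten message content into a single string."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if isinstance(block, str):
            # The scenario from this fix: a list containing plain strings
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            # Assumption: text blocks carry their payload under "text"
            parts.append(block.get("text", ""))
    return "".join(parts)
```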
2025-09-10 11:13:28 -04:00
Ravirajsingh Sodha
b42dac5fe6 docs: standardize OllamaLLM and BaseOpenAI docstrings (#32758)
- Add comprehensive docstring following LangChain standards
- Include Setup, Key init args, Instantiate, Invoke, Stream, and Async
sections
- Provide detailed parameter descriptions and code examples
- Fix linting issues for code formatting compliance

Contributes to #24803

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-31 17:45:56 -05:00
Mason Daugherty
af3b88f58d feat(ollama): update reasoning type to support string values for custom intensity levels (e.g. gpt-oss) (#32650) 2025-08-22 15:11:32 -04:00
Mason Daugherty
5ccdcd7b7b feat(ollama): docs updates (#32507) 2025-08-11 15:39:44 -04:00
Mason Daugherty
ee4c2510eb feat: port various nit changes from wip-v0.4 (#32506)
Lots of work that wasn't directly related to core
improvements/messages/testing functionality
2025-08-11 15:09:08 -04:00
Mason Daugherty
96cbd90cba fix: formatting issues in docstrings (#32265)
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
2025-07-27 23:37:47 -04:00
Mason Daugherty
d53ebf367e fix(docs): capitalization, codeblock formatting, and hyperlinks, note blocks (#32235)
widespread cleanup attempt
2025-07-24 16:55:04 -04:00
Copilot
d40fd5a3ce feat(ollama): warn on empty load responses (#32161)
## Problem

When using `ChatOllama` with `create_react_agent`, agents would
sometimes terminate prematurely with empty responses when Ollama
returned `done_reason: 'load'` responses with no content. This caused
agents to return empty `AIMessage` objects instead of actual generated
text.

```python
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

llm = ChatOllama(model='qwen2.5:7b', temperature=0)
agent = create_react_agent(model=llm, tools=[])

result = agent.invoke({"messages": [HumanMessage("Hello")]}, {"configurable": {"thread_id": "1"}})
# Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'})
# Expected: AIMessage with actual generated content
```

## Root Cause

The `_iterate_over_stream` and `_aiterate_over_stream` methods treated
any response with `done: True` as final, regardless of `done_reason`.
When Ollama returns `done_reason: 'load'` with empty content, it
indicates the model was loaded but no actual generation occurred - this
should not be considered a complete response.

## Solution

Modified the streaming logic to skip responses when:
- `done: True`
- `done_reason: 'load'` 
- Content is empty or contains only whitespace

This ensures agents only receive actual generated content while
preserving backward compatibility for load responses that do contain
content.
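
The skip condition can be sketched as a filter over raw stream chunks. This is a simplified illustration, not the body of `_iterate_over_stream`; `skip_empty_load_responses` is a hypothetical name, and the chunks are assumed to be plain dicts with `done`, `done_reason`, and `message.content` keys.

```python
def skip_empty_load_responses(stream):
    """Hypothetical filter: drop 'load' responses that carry no content."""
    for chunk in stream:
        content = chunk.get("message", {}).get("content", "") or ""
        is_empty_load = (
            chunk.get("done") is True
            and chunk.get("done_reason") == "load"
            and not content.strip()
        )
        if is_empty_load:
            continue  # the model was loaded but nothing was generated
        yield chunk
```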

## Changes

- **`_iterate_over_stream`**: Skip empty load responses instead of
yielding them
- **`_aiterate_over_stream`**: Apply same fix to async streaming
- **Tests**: Added comprehensive test cases covering all edge cases

## Testing

All scenarios now work correctly:
- Empty load responses are skipped (fixes original issue)
- Load responses with actual content are preserved (backward
compatibility)
- Normal stop responses work unchanged
- Streaming behavior preserved
- `create_react_agent` integration fixed

Fixes #31482.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-07-22 13:21:11 -04:00
Mason Daugherty
d65da13299 docs(ollama): add validate_model_on_init note, bump lock (#32172) 2025-07-22 10:58:45 -04:00
diego-coder
8e4396bb32 fix(ollama): robustly parse single-quoted JSON in tool calls (#32109)
**Description:**
This PR makes argument parsing for Ollama tool calls more robust. Some
LLMs—including Ollama—may return arguments as Python-style dictionaries
with single quotes (e.g., `{'a': 1}`), which are not valid JSON and
previously caused parsing to fail.
The updated `_parse_json_string` method in
`langchain_ollama.chat_models` now attempts standard JSON parsing and,
if that fails, falls back to `ast.literal_eval` for safe evaluation of
Python-style dictionaries. This improves interoperability with LLMs and
fixes a common usability issue for tool-based agents.
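
The fallback strategy can be sketched as below. `parse_tool_arguments` is a hypothetical stand-in for `_parse_json_string`, and wrapping failures in `ValueError` is an assumption about the error handling.

```python
import ast
import json


def parse_tool_arguments(raw):
    """Hypothetical parser: try JSON first, then Python literal syntax."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        try:
            # literal_eval only evaluates literals, so it does not run code
            return ast.literal_eval(raw)
        except (ValueError, SyntaxError) as exc:
            msg = f"Could not parse tool arguments: {raw!r}"
            raise ValueError(msg) from exc
```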

**Issue:**
Closes #30910

**Dependencies:**
None

**Tests:**
- Added new unit tests for double-quoted JSON, single-quoted dicts,
mixed quoting, and malformed/failure cases.
- All tests pass locally, including new coverage for single-quoted
inputs.

**Notes:**
- No breaking changes.
- No new dependencies introduced.
- Code is formatted and linted (`ruff format`, `ruff check`).
- If maintainers have suggestions for further improvements, I’m happy to
revise!

Thank you for maintaining LangChain! Looking forward to your feedback.
2025-07-21 12:11:22 -04:00
Copilot
98c3bbbaf0 fix(ollama): num_gpu parameter not working in async OllamaEmbeddings method (#32074)
The `num_gpu` parameter in `OllamaEmbeddings` was not being passed to
the Ollama client in the async embedding method, causing GPU
acceleration settings to be ignored when using async operations.

## Problem

The issue was in the `aembed_documents` method where the `options`
parameter (containing `num_gpu` and other configuration) was missing:

```python
# Sync method (working correctly)
return self._client.embed(
    self.model, texts, options=self._default_params, keep_alive=self.keep_alive
)["embeddings"]

# Async method (missing options parameter)
return (
    await self._async_client.embed(
        self.model, texts, keep_alive=self.keep_alive  # No options!
    )
)["embeddings"]
```

This meant that when users specified `num_gpu=4` (or any other GPU
configuration), it would work with sync calls but be ignored with async
calls.

## Solution

Added the missing `options=self._default_params` parameter to the async
embed call to match the sync version:

```python
# Fixed async method
return (
    await self._async_client.embed(
        self.model,
        texts,
        options=self._default_params,  # Now includes num_gpu!
        keep_alive=self.keep_alive,
    )
)["embeddings"]
```

## Validation

- Added unit test to verify options are correctly passed in both sync
and async methods
- All existing tests continue to pass
- Manual testing confirms `num_gpu` parameter now works correctly
- Code passes linting and formatting checks

The fix ensures that GPU configuration works consistently across both
synchronous and asynchronous embedding operations.

Fixes #32059.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-07-16 18:42:52 -04:00
Mason Daugherty
0002b1dafa ollama[patch]: fix model validation, ensure per-call reasoning can be set, tests (#31927)
* update model validation due to change in [Ollama
client](https://github.com/ollama/ollama) - ensure you are running the
latest version (0.9.6) to use `validate_model_on_init`
* add code example and fix formatting for ChatOllama reasoning
* ensure that setting `reasoning` in invocation kwargs overrides
class-level setting
* tests
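
The override order for `reasoning` can be illustrated with a small sketch. `ReasoningSketch` and `build_params` are hypothetical names, and the returned dict shape is an assumption, not the real request payload.

```python
class ReasoningSketch:
    """Hypothetical model wrapper with a class-level `reasoning` setting."""

    def __init__(self, reasoning=None):
        self.reasoning = reasoning

    def build_params(self, **kwargs):
        # An invocation-level value, when present, wins over the class-level one
        reasoning = kwargs.pop("reasoning", self.reasoning)
        return {"reasoning": reasoning, **kwargs}
```

With this ordering, a model created with `reasoning=False` can still be called once with `reasoning=True` without rebuilding the instance.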
2025-07-08 16:39:41 -04:00
Mason Daugherty
1f829aacf4 ollama[patch]: ruff fixes and rules (#31924)
* bump ruff deps
* add more thorough ruff rules
* fix said rules
2025-07-08 13:42:19 -04:00
Mason Daugherty
e686a70ee0 ollama: thinking, tool streaming, docs, tests (#31772)
* New `reasoning` (bool) param to support toggling [Ollama
thinking](https://ollama.com/blog/thinking) (#31573, #31700). If
`reasoning=True`, Ollama's `thinking` content will be placed in the
model responses' `additional_kwargs.reasoning_content`.
  * Supported by:
    * ChatOllama (class level, invocation level TODO)
    * OllamaLLM (TODO)
* Added tests to ensure streaming tool calls is successful (#29129)
* Refactored tests that relied on `extract_reasoning()`
* Myriad docs additions and consistency/typo fixes
* Improved type safety in some spots

Closes #29129
Addresses #31573 and #31700
Supersedes #31701
2025-07-07 13:56:41 -04:00
Mason Daugherty
572020c4d8 ollama: add validate_model_on_init, catch more errors (#31784)
* Ensure access to local model during `ChatOllama` instantiation
(#27720). This adds a new param `validate_model_on_init` (default:
`true`)
* Catch a few more errors from the Ollama client to assist users
2025-07-03 11:07:11 -04:00
Mason Daugherty
e1aff00cc1 groq: support reasoning_effort, update docs for clarity (#31754)
- There was some ambiguous wording that has been updated to hopefully
clarify the functionality of `reasoning_format` in ChatGroq.
- Added support for `reasoning_effort`
- Added links to see models capable of `reasoning_format` and
`reasoning_effort`
- Other minor nits
2025-06-27 09:43:40 -04:00
Mason Daugherty
59c2b81627 docs: fix some inline links (#31748) 2025-06-26 13:35:14 -04:00
Mason Daugherty
8878a7b143 docs: ollama nits (#31714) 2025-06-24 13:19:15 -04:00
Alexey Bondarenko
9efafe3337 ollama: Add separate kwargs parameter for async client (#31209)
**Description**:

Add an `async_client_kwargs` field to the Ollama chat/LLM/embeddings
adapters that is passed to the async httpx client constructor.

**Motivation:**

In my use-case:
- chat/embedding model adapters may be created frequently, sometimes to
be called just once or to never be called at all
- they may be used in both sync and async mode (not known at the moment
they are created)

So, I want to keep a static transport instance that maintains a
connection pool, so model adapters can be created and destroyed freely.
But that doesn't work when both sync and async functions are in use, as
I can only pass one transport instance for both the sync and async
clients, while transport types must be different for them. So I can't
make both sync and async calls use a shared transport with the current
model adapter interfaces.

In this PR I add a separate `async_client_kwargs` that gets passed to
the async client constructor, so it will be possible to pass a separate
transport instance. For the sake of backwards compatibility, it is
merged with `client_kwargs`, so nothing changes when it is not set.
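
The merge semantics can be sketched as follows. `build_client_kwargs` is a hypothetical helper, and letting `async_client_kwargs` override overlapping keys is an assumption about the merge order.

```python
def build_client_kwargs(client_kwargs=None, async_client_kwargs=None):
    """Hypothetical helper: derive kwargs for the sync and async clients."""
    sync_kwargs = dict(client_kwargs or {})
    # Assumption: async-specific entries override the shared ones
    async_kwargs = {**sync_kwargs, **(async_client_kwargs or {})}
    return sync_kwargs, async_kwargs
```

When `async_client_kwargs` is not set, both clients receive the same kwargs, which matches the backwards-compatible behavior described above.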

I am unable to run linter right now, but the changes look ok.
2025-05-15 16:10:10 -04:00
rylativity
dbf9986d44 langchain-ollama (partners) / langchain-core: allow passing ChatMessages to Ollama (including arbitrary roles) (#30411)
Replacement for PR #30191 (@ccurme)

**Description**: currently, ChatOllama [will raise a value error if a
ChatMessage is passed to
it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514),
as described in
https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481.

Furthermore, ollama-python is removing the limitations on valid roles
that can be passed through chat messages to a model in ollama -
https://github.com/ollama/ollama-python/pull/462#event-16917810634.

This PR removes the role limitations imposed by langchain and enables
passing langchain ChatMessages with arbitrary 'role' values through the
langchain ChatOllama class to the underlying ollama-python Client.

As this PR relies on [merged but unreleased functionality in
ollama-python](
https://github.com/ollama/ollama-python/pull/462#event-16917810634), I
have temporarily pointed the ollama package source to the main branch of
the ollama-python github repo.

Format, lint, and tests of new functionality passing. Need to resolve
issue with recently added ChatOllama tests. (Now resolved)

**Issue**: resolves #30122 (related to ollama issue
https://github.com/ollama/ollama/issues/8955)

**Dependencies**: no new dependencies

[x] PR title
[x] PR message
[x] Lint and test: format, lint, and test all running successfully and
passing

---------

Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-18 10:07:07 -04:00
ccurme
085baef926 ollama[patch]: support standard image format (#30864)
Following https://github.com/langchain-ai/langchain/pull/30746
2025-04-15 22:14:50 +00:00