Commit Graph

71 Commits

Author SHA1 Message Date
Mason Daugherty
572020c4d8
ollama: add validate_model_on_init, catch more errors (#31784)
* Ensure access to local model during `ChatOllama` instantiation
(#27720). This adds a new param `validate_model_on_init` (default:
`true`)
* Catch a few more errors from the Ollama client to assist users
2025-07-03 11:07:11 -04:00
Mason Daugherty
d0dce5315f
langchain-ollama[patch]: Add ruff bandit rules to linter (#31811)
- Add ruff bandit rules
2025-07-01 18:16:07 +00:00
Mason Daugherty
e1aff00cc1
groq: support reasoning_effort, update docs for clarity (#31754)
- There was some ambiguous wording that has been updated to hopefully
clarify the functionality of `reasoning_format` in ChatGroq.
- Added support for `reasoning_effort`
- Added links to see models capable of `reasoning_format` and
`reasoning_effort`
- Other minor nits
2025-06-27 09:43:40 -04:00
Mason Daugherty
59c2b81627
docs: fix some inline links (#31748) 2025-06-26 13:35:14 -04:00
Mason Daugherty
2fb27b63f5
ollama: update tests, docs (#31736)
- docs: for the Ollama notebooks, improve the specificity of some links,
add `homebrew` install info, update some wording
- tests: reduce number of local models needed to run in half from 4 → 2
(shedding 8gb of required installs)
- bump deps (non-breaking) in anticipation of upcoming "thinking" PR
2025-06-25 20:13:20 +00:00
Mason Daugherty
8878a7b143
docs: ollama nits (#31714) 2025-06-24 13:19:15 -04:00
ccurme
a401d7e52a
ollama: release 0.3.3 (#31253) 2025-05-15 16:24:04 -04:00
Alexey Bondarenko
9efafe3337
ollama: Add separate kwargs parameter for async client (#31209)
**Description**:

Add a `async_client_kwargs` field to ollama chat/llm/embeddings adapters
that is passed to async httpx client constructor.

**Motivation:**

In my use-case:
- chat/embedding model adapters may be created frequently, sometimes to
be called just once or to never be called at all
- they may be used in bots sunc and async mode (not known at the moment
they are created)

So, I want to keep a static transport instance maintaining connection
pool, so model adapters can be created and destroyed freely. But that
doesn't work when both sync and async functions are in use as I can only
pass one transport instance for both sync and async client, while
transport types must be different for them. So I can't make both sync
and async calls use shared transport with current model adapter
interfaces.

In this PR I add a separate `async_client_kwargs` that gets passed to
async client constructor, so it will be possible to pass a separate
transport instance. For sake of backwards compatibility, it is merged
with `client_kwargs`, so nothing changes when it is not set.

I am unable to run linter right now, but the changes look ok.
2025-05-15 16:10:10 -04:00
Sydney Runkle
7e926520d5
packaging: remove Python upper bound for langchain and co libs (#31025)
Follow up to https://github.com/langchain-ai/langsmith-sdk/pull/1696,
I've bumped the `langsmith` version where applicable in `uv.lock`.

Type checking problems here because deps have been updated in
`pyproject.toml` and `uv lock` hasn't been run - we should enforce that
in the future - goes with the other dependabot todos :).
2025-04-28 14:44:28 -04:00
rylativity
dbf9986d44
langchain-ollama (partners) / langchain-core: allow passing ChatMessages to Ollama (including arbitrary roles) (#30411)
Replacement for PR #30191 (@ccurme)

**Description**: currently, ChatOllama [will raise a value error if a
ChatMessage is passed to
it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514),
as described
https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481.

Furthermore, ollama-python is removing the limitations on valid roles
that can be passed through chat messages to a model in ollama -
https://github.com/ollama/ollama-python/pull/462#event-16917810634.

This PR removes the role limitations imposed by langchain and enables
passing langchain ChatMessages with arbitrary 'role' values through the
langchain ChatOllama class to the underlying ollama-python Client.

As this PR relies on [merged but unreleased functionality in
ollama-python](
https://github.com/ollama/ollama-python/pull/462#event-16917810634), I
have temporarily pointed the ollama package source to the main branch of
the ollama-python github repo.

Format, lint, and tests of new functionality passing. Need to resolve
issue with recently added ChatOllama tests. (Now resolved)

**Issue**: resolves #30122 (related to ollama issue
https://github.com/ollama/ollama/issues/8955)

**Dependencies**: no new dependencies

[x] PR title
[x] PR message
[x] Lint and test: format, lint, and test all running successfully and
passing

---------

Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-18 10:07:07 -04:00
ccurme
ca39680d2a
ollama: release 0.3.2 (#30865) 2025-04-16 09:14:57 -04:00
ccurme
085baef926
ollama[patch]: support standard image format (#30864)
Following https://github.com/langchain-ai/langchain/pull/30746
2025-04-15 22:14:50 +00:00
ccurme
47ded80b64
ollama[patch]: fix generation info (#30863)
https://github.com/langchain-ai/langchain/pull/30778 (not released)
broke all invocation modes of ChatOllama (intent was to remove
`"message"` from `generation_info`, but we turned `generation_info` into
`stream_resp["message"]`), resulting in validation errors.
2025-04-15 19:22:58 +00:00
Sydney Runkle
8c6734325b
partners[lint]: run pyupgrade to get code in line with 3.9 standards (#30781)
Using `pyupgrade` to get all `partners` code up to 3.9 standards
(mostly, fixing old `typing` imports).
2025-04-11 07:18:44 -04:00
Jacob Lee
e72f3c26a0
fix(ollama): Remove redundant message from response_metadata (#30778) 2025-04-10 23:12:57 -07:00
ccurme
a2bec5f2e5
ollama: release 0.3.1 (#30716) 2025-04-07 20:31:25 +00:00
ccurme
e3f15f0a47
ollama[patch]: add model_name to response metadata (#30706)
Fixes [this standard
test](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html#langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.test_usage_metadata).
2025-04-07 16:27:58 -04:00
Wenqi Li
64f97e707e
ollama[patch]: Support seed param for OllamaLLM (#30553)
**Description:** a description of the change
add the seed param for OllamaLLM client reproducibility

**Issue:** the issue # it fixes, if applicable
follow up of a similar issue
https://github.com/langchain-ai/langchain/issues/24703
see also https://github.com/langchain-ai/langchain/pull/24782

**Dependencies:** any dependencies required for this change
n/a
2025-03-31 11:28:49 -04:00
ccurme
e8e3b2bfae
ollama: release 0.3.0 (#30420) 2025-03-21 15:50:08 +00:00
Bob Merkus
5700646cc5
ollama: add reasoning model support (e.g. deepseek) (#29689)
# Description
This PR adds reasoning model support for `langchain-ollama` by
extracting reasoning token blocks, like those used in deepseek. It was
inspired by
[ollama-deep-researcher](https://github.com/langchain-ai/ollama-deep-researcher),
specifically the parsing of [thinking
blocks](6d1aaf2139/src/assistant/graph.py (L91)):
```python
  # TODO: This is a hack to remove the <think> tags w/ Deepseek models 
  # It appears very challenging to prompt them out of the responses 
  while "<think>" in running_summary and "</think>" in running_summary:
      start = running_summary.find("<think>")
      end = running_summary.find("</think>") + len("</think>")
      running_summary = running_summary[:start] + running_summary[end:]
```

This notes that it is very hard to remove the reasoning block from
prompting, but we actually want the model to reason in order to increase
model performance. This implementation extracts the thinking block, so
the client can still expect a proper message to be returned by
`ChatOllama` (and use the reasoning content separately when desired).

This implementation takes the same approach as
[ChatDeepseek](5d581ba22c/libs/partners/deepseek/langchain_deepseek/chat_models.py (L215)),
which adds the reasoning content to
chunk.additional_kwargs.reasoning_content;
```python
  if hasattr(response.choices[0].message, "reasoning_content"):  # type: ignore
      rtn.generations[0].message.additional_kwargs["reasoning_content"] = (
          response.choices[0].message.reasoning_content  # type: ignore
      )
```

This should probably be handled upstream in ollama + ollama-python, but
this seems like a reasonably effective solution. This is a standalone
example of what is happening;

```python
async def deepseek_message_astream(
    llm: BaseChatModel,
    messages: list[BaseMessage],
    config: RunnableConfig | None = None,
    *,
    model_target: str = "deepseek-r1",
    **kwargs: Any,
) -> AsyncIterator[BaseMessageChunk]:
    """Stream responses from Deepseek models, filtering out <think> tags.

    Args:
        llm: The language model to stream from
        messages: The messages to send to the model

    Yields:
        Filtered chunks from the model response
    """
    # check if the model is deepseek based
    if (llm.name and model_target not in llm.name) or (hasattr(llm, "model") and model_target not in llm.model):
        async for chunk in llm.astream(messages, config=config, **kwargs):
            yield chunk
        return

    # Yield with a buffer, upon completing the <think></think> tags, move them to the reasoning content and start over
    buffer = ""
    async for chunk in llm.astream(messages, config=config, **kwargs):
        # start or append
        if not buffer:
            buffer = chunk.content
        else:
            buffer += chunk.content if hasattr(chunk, "content") else chunk

        # Process buffer to remove <think> tags
        if "<think>" in buffer or "</think>" in buffer:
            if hasattr(chunk, "tool_calls") and chunk.tool_calls:
                raise NotImplementedError("tool calls during reasoning should be removed?")
            if "<think>" in chunk.content or "</think>" in chunk.content:
                continue
            chunk.additional_kwargs["reasoning_content"] = chunk.content
            chunk.content = ""
        # upon block completion, reset the buffer
        if "<think>" in buffer and "</think>" in buffer:
            buffer = ""
        yield chunk

```

# Issue
Integrating reasoning models (e.g. deepseek-r1) into existing LangChain
based workflows is hard due to the thinking blocks that are included in
the message contents. To avoid this, we could match the `ChatOllama`
integration with `ChatDeepseek` to return the reasoning content inside
`message.additional_arguments.reasoning_content` instead.

# Dependenices
None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-21 15:44:54 +00:00
Mohammad Mohtashim
1103bdfaf1
(Ollama) Fix String Value parsing in _parse_arguments_from_tool_call (#30154)
- **Description:** Fix String Value parsing in
_parse_arguments_from_tool_call
- **Issue:** #30145

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-19 21:47:18 -04:00
Lance Martin
46d6bf0330
ollama[minor]: update default method for structured output (#30273)
From function calling to Ollama's [dedicated structured output
feature](https://ollama.com/blog/structured-outputs).

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-18 12:44:22 -04:00
Stavros Kontopoulos
ac22cde130
langchain_ollama: Support keep_alive in embeddings (#30251)
- Description: Adds support for keep_alive in Ollama Embeddings see
https://github.com/ollama/ollama/issues/6401.
Builds on top of of
https://github.com/langchain-ai/langchain/pull/29296. I have this use
case where I want to keep the embeddings model in cpu forever.
- Dependencies: no deps are being introduced.
- Issue: haven't created an issue yet.
2025-03-14 14:56:50 -04:00
ccurme
806211475a
core[patch]: update structured output tracing (#30123)
- Trace JSON schema in `options`
- Rename to `ls_structured_output_format`
2025-03-07 13:05:25 -05:00
Erick Friis
6c1e21d128
core: basemessage.text() (#29078) 2025-02-18 17:45:44 -08:00
Erick Friis
1a225fad03
multiple: fix uv path deps (#29790)
file:// format wasn't working with updates - it doesn't install as an
editable dep

move to tool.uv.sources with path= instead
2025-02-13 21:32:34 +00:00
ccurme
3450bfc806
infra: add UV_FROZEN to makefiles (#29642)
These are set in Github workflows, but forgot to add them to most
makefiles for convenience when developing locally.

`uv run` will automatically sync the lock file. Because many of our
development dependencies are local installs, it will pick up version
changes and update the lock file. Passing `--frozen` or setting this
environment variable disables the behavior.
2025-02-06 14:36:54 -05:00
ccurme
d172984c91
infra: migrate to uv (#29566) 2025-02-06 13:36:26 -05:00
Erick Friis
29461b36d9
partners/ollama: release 0.2.3 (#29489) 2025-01-29 22:19:44 +00:00
Erick Friis
8f95da4eb1
multiple: structured output tracing standard metadata (#29421)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-01-29 14:00:26 -08:00
Syed Baqar Abbas
f175319303
[feat] Added backwards compatibility for OllamaEmbeddings initialization (migration from langchain_community.embeddings to langchain_ollama.embeddings (#29296)
- [feat] **Added backwards compatibility for OllamaEmbeddings
initialization (migration from `langchain_community.embeddings` to
`langchain_ollama.embeddings`**: "langchain_ollama"
- **Description:** Given that `OllamaEmbeddings` from
`langchain_community.embeddings` is deprecated, code is being shifted to
``langchain_ollama.embeddings`. However, this does not offer backward
compatibility of initializing the parameters and `OllamaEmbeddings`
object.
    - **Issue:** #29294 
    - **Dependencies:** None
    - **Twitter handle:** @BaqarAbbas2001


## Additional Information
Previously, `OllamaEmbeddings` from `langchain_community.embeddings`
used to support the following options:

e9abe583b2/libs/community/langchain_community/embeddings/ollama.py (L125-L139)

However, in the new package `from langchain_ollama import
OllamaEmbeddings`, there is no method to set these options. I have added
these parameters to resolve this issue.

This issue was also discussed in
https://github.com/langchain-ai/langchain/discussions/29113
2025-01-20 11:16:29 -05:00
Erick Friis
514d78516b
partners/ollama: release 0.2.2 (#28802) 2024-12-18 22:11:08 +00:00
Bagatur
e4d3ccf62f
json mode standard test (#25497)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-17 18:47:34 +00:00
ccurme
bc4dc7f4b1
ollama[patch]: permit streaming for tool calls (#28654)
Resolves https://github.com/langchain-ai/langchain/issues/28543

Ollama recently
[released](https://github.com/ollama/ollama/releases/tag/v0.4.6) support
for streaming tool calls. Previously we would override the `stream`
parameter if tools were passed in.

Covered in standard tests here:
c1d348e95d/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L893-L897)

Before, the test generates one message chunk:
```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:49:04.468487Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 525471208,
            'load_duration': 19701000,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 473000000,
            'message': Message(
                role='assistant',
                content='',
                images=None,
                tool_calls=[
                    ToolCall(
                        function=Function(name='magic_function', arguments={'input': 3})
                    )
                ]
            )
        },
        id='run-552bbe0f-8fb2-4105-ada1-fa38c1db444d',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'type': 'tool_call',
            },
        ],
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        },
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'index': None,
                'type': 'tool_call_chunk',
            }
        ]
    )
]
```

After, it generates two (tool call in one, response metadata in
another):
```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={},
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'type': 'tool_call',
            },
        ],
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'index': None,
                'type': 'tool_call_chunk',
            },
        ],
    ),
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:46:43.278436Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 514282750,
            'load_duration': 16894458,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 464000000,
            'message': Message(
                role='assistant', content='', images=None, tool_calls=None
            ),
        },
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        }
    ),
]
```
2024-12-10 12:54:37 -05:00
ccurme
5c6e2cbcda
ollama[patch]: support structured output (#28629)
- Bump minimum version of `ollama` to 0.4.4 (which also addresses
https://github.com/langchain-ai/langchain/issues/28607).
- Support recently-released [structured
output](https://ollama.com/blog/structured-outputs) feature. This can be
accessed by calling `.with_structured_output` with
`method="json_schema"` (choice of name
[mirrors](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.with_structured_output)
what we have for OpenAI's structured output feature).

`ChatOllama` previously implemented `.with_structured_output` via the
[base
implementation](ec9b41431e/libs/core/langchain_core/language_models/chat_models.py (L1117)).
2024-12-10 10:36:00 -05:00
ccurme
28487597b2
ollama[patch]: release 0.2.1 (#28458)
We inadvertently skipped 0.2.1, so release pipeline
[failed](https://github.com/langchain-ai/langchain/actions/runs/12126964367/job/33810204551).
2024-12-02 21:17:51 +00:00
ccurme
88d6d02b59
ollama[patch]: release 0.2.2 (#28456) 2024-12-02 14:57:30 -05:00
TheDannyG
607c60a594
partners/ollama: fix tool calling with nested schemas (#28225)
## Description

This PR addresses the following:

**Fixes Issue #25343:**
- Adds additional logic to parse shallowly nested JSON-encoded strings
in tool call arguments, allowing for proper parsing of responses like
that of Llama3.1 and 3.2 with nested schemas.
 
**Adds Integration Test for Fix:**
- Adds a Ollama specific integration test to ensure the issue is
resolved and to prevent regressions in the future.

**Fixes Failing Integration Tests:**
- Fixes failing integration tests (even prior to changes) caused by
`llama3-groq-tool-use` model. Previously,
tests`test_structured_output_async` and
`test_structured_output_optional_param` failed due to the model not
issuing a tool call in the response. Resolved by switching to
`llama3.1`.

## Issue
Fixes #25343.

## Dependencies
No dependencies.

____

Done in collaboration with @ishaan-upadhyay @mirajismail @ZackSteine.
2024-11-27 10:32:02 -05:00
ccurme
a1c90794e1
ollama[patch]: bump to 0.4.1 in lock file (#28365) 2024-11-26 18:19:31 +00:00
ccurme
74d9d2cba1
ollama[patch]: support ollama 0.4 (#28364)
v0.4 of the Python SDK is already installed via the lock file in CI, but
our current implementation is not compatible with it.

This also addresses an issue introduced in
https://github.com/langchain-ai/langchain/pull/28299. @RyanMagnuson
would you mind explaining the motivation for that change? From what I
can tell the Ollama SDK [does not support
kwargs](6c44bb2729/ollama/_client.py (L286)).
Previously, unsupported kwargs were ignored, but they currently raise
`TypeError`.

Some of LangChain's standard test suite expects `tool_choice` to be
supported, so here we catch it in `bind_tools` so it is ignored and not
passed through to the client.
2024-11-26 12:45:59 -05:00
Erick Friis
aa7fa80e1e
partners/ollama: release 0.2.2rc1 (#28300) 2024-11-22 22:25:05 +00:00
Erick Friis
7277794a59
ollama: include kwargs in requests (#28299)
courtesy of @ryanmagnuson
2024-11-22 14:15:42 -08:00
Erick Friis
0dbaf05bb7
standard-tests: rename langchain_standard_tests to langchain_tests, release 0.3.2 (#28203) 2024-11-18 19:10:39 -08:00
Erick Friis
6d2004ee7d
multiple: langchain-standard-tests -> langchain-tests (#28139) 2024-11-15 11:32:04 -08:00
Elham Badri
d696728278
partners/ollama: Enabled Token Level Streaming when Using Bind Tools for ChatOllama (#27689)
**Description:** The issue concerns the unexpected behavior observed
using the bind_tools method in LangChain's ChatOllama. When tools are
not bound, the llm.stream() method works as expected, returning
incremental chunks of content, which is crucial for real-time
applications such as conversational agents and live feedback systems.
However, when bind_tools([]) is used, the streaming behavior changes,
causing the output to be delivered in full chunks rather than
incrementally. This change negatively impacts the user experience by
breaking the real-time nature of the streaming mechanism.
**Issue:** #26971

---------

Co-authored-by: 4meyDam1e <amey.damle@mail.utoronto.ca>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-11-15 11:36:27 -05:00
Bagatur
06420de2e7
integrations[patch]: bump core to 0.3.15 (#27805) 2024-10-31 11:27:05 -07:00
JiaranI
3952ee31b8
ollama: add pydocstyle linting for ollama (#27686)
Description: add lint docstrings for ollama module
Issue: the issue https://github.com/langchain-ai/langchain/issues/23188
@baskaryan

test: ruff check passed.
<img width="311" alt="e94c68ffa93dd518297a95a93de5217"
src="https://github.com/user-attachments/assets/e96bf721-e0e3-44de-a50e-206603de398e">

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-31 03:06:55 +00:00
Erick Friis
537f6924dc
partners/ollama: release 0.2.0 (#26468) 2024-09-13 15:48:48 -07:00
Erick Friis
c2a3021bb0
multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Marcelo Machado
9bd4f1dfa8
docs: small improvement ChatOllama setup description (#26043)
Small improvement on ChatOllama description

---------

Co-authored-by: Marcelo Machado <mmachado@ibm.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:15:05 +00:00