55 Commits

Author SHA1 Message Date
Mason Daugherty
281488a5cf Merge branch 'master' into wip-v0.4 2025-08-11 15:10:42 -04:00
Mason Daugherty
ee4c2510eb feat: port various nit changes from wip-v0.4 (#32506)
Lots of work that wasn't directly related to core improvements, messages, or testing functionality.
2025-08-11 15:09:08 -04:00
Mason Daugherty
13d67cf37e fix(ollama): reasoning should come before text content (#32476) 2025-08-08 19:34:36 -04:00
Mason Daugherty
c1b86cc929 feat: minor core work, v1 standard tests & (most of) v1 ollama (#32315)
Resolves #32215

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
2025-08-06 18:22:02 -04:00
Mason Daugherty
5e9eb19a83 chore: update branch with changes from master (#32277)
Co-authored-by: Maxime Grenu <69890511+cluster2600@users.noreply.github.com>
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: jmaillefaud <jonathan.maillefaud@evooq.ch>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: tanwirahmad <tanwirahmad@users.noreply.github.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: niceg <79145285+growmuye@users.noreply.github.com>
Co-authored-by: Chaitanya varma <varmac301@gmail.com>
Co-authored-by: dishaprakash <57954147+dishaprakash@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Kanav Bansal <13186335+bansalkanav@users.noreply.github.com>
Co-authored-by: Aleksandr Filippov <71711753+alex-feel@users.noreply.github.com>
Co-authored-by: Alex Feel <afilippov@spotware.com>
2025-07-28 10:39:41 -04:00
Mason Daugherty
96cbd90cba fix: formatting issues in docstrings (#32265)
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
2025-07-27 23:37:47 -04:00
Mason Daugherty
d53ebf367e fix(docs): capitalization, codeblock formatting, and hyperlinks, note blocks (#32235)
widespread cleanup attempt
2025-07-24 16:55:04 -04:00
Copilot
d40fd5a3ce feat(ollama): warn on empty load responses (#32161)
## Problem

When using `ChatOllama` with `create_react_agent`, agents would
sometimes terminate prematurely with empty responses when Ollama
returned `done_reason: 'load'` responses with no content. This caused
agents to return empty `AIMessage` objects instead of actual generated
text.

```python
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

llm = ChatOllama(model='qwen2.5:7b', temperature=0)
agent = create_react_agent(model=llm, tools=[])

result = agent.invoke({"messages": [HumanMessage('Hello')]}, {"configurable": {"thread_id": "1"}})
# Before fix: AIMessage(content='', response_metadata={'done_reason': 'load'})
# Expected: AIMessage with actual generated content
```

## Root Cause

The `_iterate_over_stream` and `_aiterate_over_stream` methods treated
any response with `done: True` as final, regardless of `done_reason`.
When Ollama returns `done_reason: 'load'` with empty content, it
indicates the model was loaded but no actual generation occurred - this
should not be considered a complete response.

## Solution

Modified the streaming logic to skip responses when:
- `done: True`
- `done_reason: 'load'` 
- Content is empty or contains only whitespace

This ensures agents only receive actual generated content while
preserving backward compatibility for load responses that do contain
content.
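
A minimal sketch of that check (illustrative only; the helper name is made up here, and the fields follow Ollama's streaming response shape rather than the exact implementation):

```python
def _should_skip_load_response(stream_resp: dict) -> bool:
    """Return True for empty 'load' responses that should not be yielded."""
    is_done = stream_resp.get("done", False)
    is_load_only = stream_resp.get("done_reason") == "load"
    content = (stream_resp.get("message") or {}).get("content", "")
    # Skip only when the model merely loaded and produced no real content.
    return is_done and is_load_only and not content.strip()
```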

## Changes

- **`_iterate_over_stream`**: Skip empty load responses instead of
yielding them
- **`_aiterate_over_stream`**: Apply same fix to async streaming
- **Tests**: Added comprehensive test cases covering all edge cases

## Testing

All scenarios now work correctly:
- Empty load responses are skipped (fixes original issue)
- Load responses with actual content are preserved (backward compatibility)
- Normal stop responses work unchanged
- Streaming behavior preserved
- `create_react_agent` integration fixed

Fixes #31482.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-07-22 13:21:11 -04:00
Mason Daugherty
d65da13299 docs(ollama): add validate_model_on_init note, bump lock (#32172) 2025-07-22 10:58:45 -04:00
diego-coder
8e4396bb32 fix(ollama): robustly parse single-quoted JSON in tool calls (#32109)
**Description:**
This PR makes argument parsing for Ollama tool calls more robust. Some
LLMs—including Ollama—may return arguments as Python-style dictionaries
with single quotes (e.g., `{'a': 1}`), which are not valid JSON and
previously caused parsing to fail.
The updated `_parse_json_string` method in
`langchain_ollama.chat_models` now attempts standard JSON parsing and,
if that fails, falls back to `ast.literal_eval` for safe evaluation of
Python-style dictionaries. This improves interoperability with LLMs and
fixes a common usability issue for tool-based agents.
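
A minimal sketch of that fallback (the helper name here is illustrative; the real logic lives in `_parse_json_string`):

```python
import ast
import json
from typing import Any


def parse_tool_arguments(raw: str) -> Any:
    """Parse tool-call arguments that may be JSON or a Python-style dict."""
    try:
        return json.loads(raw)  # standard double-quoted JSON
    except json.JSONDecodeError:
        # Fall back to safely evaluating Python literals such as {'a': 1}
        return ast.literal_eval(raw)


print(parse_tool_arguments('{"a": 1}'))  # -> {'a': 1}
print(parse_tool_arguments("{'a': 1}"))  # -> {'a': 1}
```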

**Issue:**
Closes #30910

**Dependencies:**
None

**Tests:**
- Added new unit tests for double-quoted JSON, single-quoted dicts,
mixed quoting, and malformed/failure cases.
- All tests pass locally, including new coverage for single-quoted
inputs.

**Notes:**
- No breaking changes.
- No new dependencies introduced.
- Code is formatted and linted (`ruff format`, `ruff check`).
- If maintainers have suggestions for further improvements, I’m happy to
revise!

Thank you for maintaining LangChain! Looking forward to your feedback.
2025-07-21 12:11:22 -04:00
Copilot
98c3bbbaf0 fix(ollama): num_gpu parameter not working in async OllamaEmbeddings method (#32074)
The `num_gpu` parameter in `OllamaEmbeddings` was not being passed to
the Ollama client in the async embedding method, causing GPU
acceleration settings to be ignored when using async operations.

## Problem

The issue was in the `aembed_documents` method where the `options`
parameter (containing `num_gpu` and other configuration) was missing:

```python
# Sync method (working correctly)
return self._client.embed(
    self.model, texts, options=self._default_params, keep_alive=self.keep_alive
)["embeddings"]

# Async method (missing options parameter)
return (
    await self._async_client.embed(
        self.model, texts, keep_alive=self.keep_alive  #  No options!
    )
)["embeddings"]
```

This meant that when users specified `num_gpu=4` (or any other GPU
configuration), it would work with sync calls but be ignored with async
calls.

## Solution

Added the missing `options=self._default_params` parameter to the async
embed call to match the sync version:

```python
# Fixed async method
return (
    await self._async_client.embed(
        self.model,
        texts,
        options=self._default_params,  #  Now includes num_gpu!
        keep_alive=self.keep_alive,
    )
)["embeddings"]
```

## Validation

- Added unit test to verify options are correctly passed in both sync and async methods
- All existing tests continue to pass
- Manual testing confirms `num_gpu` parameter now works correctly
- Code passes linting and formatting checks

The fix ensures that GPU configuration works consistently across both
synchronous and asynchronous embedding operations.
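
For reference, a small usage sketch of the setting that is now honoured on both paths (model name and `num_gpu` value are placeholders):

```python
import asyncio

from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text", num_gpu=4)

# Sync path (worked before the fix)
sync_vectors = embeddings.embed_documents(["hello world"])

# Async path (now also forwards num_gpu via options)
async_vectors = asyncio.run(embeddings.aembed_documents(["hello world"]))
```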

Fixes #32059.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-07-16 18:42:52 -04:00
Mason Daugherty
0002b1dafa ollama[patch]: fix model validation, ensure per-call reasoning can be set, tests (#31927)
* update model validation due to change in [Ollama
client](https://github.com/ollama/ollama) - ensure you are running the
latest version (0.9.6) to use `validate_model_on_init`
* add code example and fix formatting for ChatOllama reasoning
* ensure that setting `reasoning` in invocation kwargs overrides the
class-level setting (see the sketch below)
* tests
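
A short usage sketch of the invocation-level override (model name is a placeholder):

```python
from langchain_ollama import ChatOllama

# Class-level default: reasoning disabled
llm = ChatOllama(model="deepseek-r1:8b", reasoning=False)

# Per-call override: request reasoning content for this invocation only
msg = llm.invoke("Why is the sky blue?", reasoning=True)
print(msg.additional_kwargs.get("reasoning_content"))
print(msg.content)
```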
2025-07-08 16:39:41 -04:00
Mason Daugherty
1f829aacf4 ollama[patch]: ruff fixes and rules (#31924)
* bump ruff deps
* add more thorough ruff rules
* fix said rules
2025-07-08 13:42:19 -04:00
Mason Daugherty
e686a70ee0 ollama: thinking, tool streaming, docs, tests (#31772)
* New `reasoning` (bool) param to support toggling [Ollama
thinking](https://ollama.com/blog/thinking) (#31573, #31700). If
`reasoning=True`, Ollama's `thinking` content will be placed in the
model responses' `additional_kwargs.reasoning_content`.
  * Supported by:
    * ChatOllama (class level, invocation level TODO)
    * OllamaLLM (TODO)
* Added tests to ensure streaming tool calls is successful (#29129)
* Refactored tests that relied on `extract_reasoning()`
* Myriad docs additions and consistency/typo fixes
* Improved type safety in some spots

Closes #29129
Addresses #31573 and #31700
Supersedes #31701
2025-07-07 13:56:41 -04:00
Mason Daugherty
572020c4d8 ollama: add validate_model_on_init, catch more errors (#31784)
* Ensure access to the local model during `ChatOllama` instantiation
(#27720). This adds a new param, `validate_model_on_init` (default:
`true`); see the sketch below
* Catch a few more errors from the Ollama client to assist users
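
A brief usage sketch (model name is a placeholder):

```python
from langchain_ollama import ChatOllama

# Fails fast at construction time if the model is not available locally,
# rather than erroring on the first call.
llm = ChatOllama(model="llama3.1", validate_model_on_init=True)
```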
2025-07-03 11:07:11 -04:00
Mason Daugherty
e1aff00cc1 groq: support reasoning_effort, update docs for clarity (#31754)
- There was some ambiguous wording that has been updated to hopefully
clarify the functionality of `reasoning_format` in ChatGroq.
- Added support for `reasoning_effort` (see the sketch below)
- Added links to see models capable of `reasoning_format` and
`reasoning_effort`
- Other minor nits
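
A hedged usage sketch (the model name and parameter values are illustrative; check the linked docs for models that actually support these options):

```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="qwen/qwen3-32b",      # placeholder model
    reasoning_format="parsed",   # surface reasoning separately from content
    reasoning_effort="default",  # newly supported knob
)
```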
2025-06-27 09:43:40 -04:00
Mason Daugherty
59c2b81627 docs: fix some inline links (#31748) 2025-06-26 13:35:14 -04:00
Mason Daugherty
8878a7b143 docs: ollama nits (#31714) 2025-06-24 13:19:15 -04:00
Alexey Bondarenko
9efafe3337 ollama: Add separate kwargs parameter for async client (#31209)
**Description**:

Add an `async_client_kwargs` field to the Ollama chat/llm/embeddings
adapters that is passed to the async httpx client constructor.

**Motivation:**

In my use-case:
- chat/embedding model adapters may be created frequently, sometimes to
be called just once or never to be called at all
- they may be used in both sync and async mode (not known at the moment
they are created)

So, I want to keep a static transport instance maintaining a connection
pool, so model adapters can be created and destroyed freely. But that
doesn't work when both sync and async functions are in use, as I can only
pass one transport instance for both the sync and async clients, while
the transport types must be different for them. So I can't make both sync
and async calls use a shared transport with the current model adapter
interfaces.

In this PR I add a separate `async_client_kwargs` that gets passed to the
async client constructor, so it becomes possible to pass a separate
transport instance (see the sketch below). For the sake of backwards
compatibility, it is merged with `client_kwargs`, so nothing changes when
it is not set.

I am unable to run linter right now, but the changes look ok.
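
A sketch of how the new field could be used to pass separate transports (model name and transport settings are illustrative):

```python
import httpx
from langchain_ollama import ChatOllama

# Sync and async httpx clients require different transport types.
sync_transport = httpx.HTTPTransport(retries=2)
async_transport = httpx.AsyncHTTPTransport(retries=2)

llm = ChatOllama(
    model="llama3.1",
    client_kwargs={"transport": sync_transport},
    async_client_kwargs={"transport": async_transport},
)
```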
2025-05-15 16:10:10 -04:00
rylativity
dbf9986d44 langchain-ollama (partners) / langchain-core: allow passing ChatMessages to Ollama (including arbitrary roles) (#30411)
Replacement for PR #30191 (@ccurme)

**Description**: currently, ChatOllama [will raise a value error if a
ChatMessage is passed to
it](https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py#L514),
as described in
https://github.com/langchain-ai/langchain/pull/30147#issuecomment-2708932481.

Furthermore, ollama-python is removing the limitations on valid roles
that can be passed through chat messages to a model in ollama -
https://github.com/ollama/ollama-python/pull/462#event-16917810634.

This PR removes the role limitations imposed by langchain and enables
passing langchain ChatMessages with arbitrary 'role' values through the
langchain ChatOllama class to the underlying ollama-python Client.
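
A minimal usage sketch of what now works (model and role are illustrative):

```python
from langchain_core.messages import ChatMessage
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1")

# A ChatMessage with an arbitrary role is forwarded to the ollama client
# instead of raising a ValueError.
response = llm.invoke([ChatMessage(role="control", content="thinking")])
```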

As this PR relies on [merged but unreleased functionality in
ollama-python](
https://github.com/ollama/ollama-python/pull/462#event-16917810634), I
have temporarily pointed the ollama package source to the main branch of
the ollama-python github repo.

Format, lint, and tests of new functionality passing. Need to resolve
issue with recently added ChatOllama tests. (Now resolved)

**Issue**: resolves #30122 (related to ollama issue
https://github.com/ollama/ollama/issues/8955)

**Dependencies**: no new dependencies

- [x] PR title
- [x] PR message
- [x] Lint and test: format, lint, and test all running successfully and
passing

---------

Co-authored-by: Ryan Stewart <ryanstewart@Ryans-MacBook-Pro.local>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-04-18 10:07:07 -04:00
ccurme
085baef926 ollama[patch]: support standard image format (#30864)
Following https://github.com/langchain-ai/langchain/pull/30746
2025-04-15 22:14:50 +00:00
ccurme
47ded80b64 ollama[patch]: fix generation info (#30863)
https://github.com/langchain-ai/langchain/pull/30778 (not released)
broke all invocation modes of ChatOllama (intent was to remove
`"message"` from `generation_info`, but we turned `generation_info` into
`stream_resp["message"]`), resulting in validation errors.
2025-04-15 19:22:58 +00:00
Sydney Runkle
8c6734325b partners[lint]: run pyupgrade to get code in line with 3.9 standards (#30781)
Using `pyupgrade` to get all `partners` code up to 3.9 standards
(mostly, fixing old `typing` imports).
2025-04-11 07:18:44 -04:00
Jacob Lee
e72f3c26a0 fix(ollama): Remove redundant message from response_metadata (#30778) 2025-04-10 23:12:57 -07:00
ccurme
e3f15f0a47 ollama[patch]: add model_name to response metadata (#30706)
Fixes [this standard
test](https://python.langchain.com/api_reference/standard_tests/integration_tests/langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.html#langchain_tests.integration_tests.chat_models.ChatModelIntegrationTests.test_usage_metadata).
2025-04-07 16:27:58 -04:00
Wenqi Li
64f97e707e ollama[patch]: Support seed param for OllamaLLM (#30553)
**Description:** add the `seed` param to OllamaLLM for client reproducibility
(usage sketch below)

**Issue:** follow-up of a similar issue:
https://github.com/langchain-ai/langchain/issues/24703
See also https://github.com/langchain-ai/langchain/pull/24782

**Dependencies:** n/a
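
A small usage sketch (model name is a placeholder):

```python
from langchain_ollama import OllamaLLM

# The same seed (with temperature 0) should yield reproducible completions.
llm = OllamaLLM(model="llama3.1", seed=42, temperature=0)
print(llm.invoke("Pick a random number between 1 and 100"))
```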
2025-03-31 11:28:49 -04:00
Bob Merkus
5700646cc5 ollama: add reasoning model support (e.g. deepseek) (#29689)
# Description
This PR adds reasoning model support for `langchain-ollama` by
extracting reasoning token blocks, like those used in deepseek. It was
inspired by
[ollama-deep-researcher](https://github.com/langchain-ai/ollama-deep-researcher),
specifically the parsing of [thinking
blocks](6d1aaf2139/src/assistant/graph.py (L91)):
```python
  # TODO: This is a hack to remove the <think> tags w/ Deepseek models 
  # It appears very challenging to prompt them out of the responses 
  while "<think>" in running_summary and "</think>" in running_summary:
      start = running_summary.find("<think>")
      end = running_summary.find("</think>") + len("</think>")
      running_summary = running_summary[:start] + running_summary[end:]
```

This notes that it is very hard to remove the reasoning block from
prompting, but we actually want the model to reason in order to increase
model performance. This implementation extracts the thinking block, so
the client can still expect a proper message to be returned by
`ChatOllama` (and use the reasoning content separately when desired).

This implementation takes the same approach as
[ChatDeepseek](5d581ba22c/libs/partners/deepseek/langchain_deepseek/chat_models.py (L215)),
which adds the reasoning content to
chunk.additional_kwargs.reasoning_content;
```python
  if hasattr(response.choices[0].message, "reasoning_content"):  # type: ignore
      rtn.generations[0].message.additional_kwargs["reasoning_content"] = (
          response.choices[0].message.reasoning_content  # type: ignore
      )
```

This should probably be handled upstream in ollama + ollama-python, but
this seems like a reasonably effective solution. This is a standalone
example of what is happening;

```python
async def deepseek_message_astream(
    llm: BaseChatModel,
    messages: list[BaseMessage],
    config: RunnableConfig | None = None,
    *,
    model_target: str = "deepseek-r1",
    **kwargs: Any,
) -> AsyncIterator[BaseMessageChunk]:
    """Stream responses from Deepseek models, filtering out <think> tags.

    Args:
        llm: The language model to stream from
        messages: The messages to send to the model

    Yields:
        Filtered chunks from the model response
    """
    # check if the model is deepseek based
    if (llm.name and model_target not in llm.name) or (hasattr(llm, "model") and model_target not in llm.model):
        async for chunk in llm.astream(messages, config=config, **kwargs):
            yield chunk
        return

    # Yield with a buffer, upon completing the <think></think> tags, move them to the reasoning content and start over
    buffer = ""
    async for chunk in llm.astream(messages, config=config, **kwargs):
        # start or append
        if not buffer:
            buffer = chunk.content
        else:
            buffer += chunk.content if hasattr(chunk, "content") else chunk

        # Process buffer to remove <think> tags
        if "<think>" in buffer or "</think>" in buffer:
            if hasattr(chunk, "tool_calls") and chunk.tool_calls:
                raise NotImplementedError("tool calls during reasoning should be removed?")
            if "<think>" in chunk.content or "</think>" in chunk.content:
                continue
            chunk.additional_kwargs["reasoning_content"] = chunk.content
            chunk.content = ""
        # upon block completion, reset the buffer
        if "<think>" in buffer and "</think>" in buffer:
            buffer = ""
        yield chunk

```

# Issue
Integrating reasoning models (e.g. deepseek-r1) into existing LangChain
based workflows is hard due to the thinking blocks that are included in
the message contents. To avoid this, we could match the `ChatOllama`
integration with `ChatDeepseek` to return the reasoning content inside
`message.additional_kwargs.reasoning_content` instead.

# Dependencies
None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-21 15:44:54 +00:00
Mohammad Mohtashim
1103bdfaf1 (Ollama) Fix String Value parsing in _parse_arguments_from_tool_call (#30154)
- **Description:** Fix String Value parsing in
_parse_arguments_from_tool_call
- **Issue:** #30145

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-19 21:47:18 -04:00
Lance Martin
46d6bf0330 ollama[minor]: update default method for structured output (#30273)
From function calling to Ollama's [dedicated structured output
feature](https://ollama.com/blog/structured-outputs).

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-03-18 12:44:22 -04:00
Stavros Kontopoulos
ac22cde130 langchain_ollama: Support keep_alive in embeddings (#30251)
- Description: Adds support for `keep_alive` in Ollama embeddings; see
https://github.com/ollama/ollama/issues/6401.
Builds on top of
https://github.com/langchain-ai/langchain/pull/29296. I have this use
case where I want to keep the embeddings model loaded on the CPU forever
(see the sketch below).
- Dependencies: no deps are being introduced.
- Issue: haven't created an issue yet.
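
A brief usage sketch (model name is a placeholder; `-1` asks Ollama to keep the model loaded indefinitely):

```python
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text", keep_alive=-1)
embeddings.embed_query("hello")
```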
2025-03-14 14:56:50 -04:00
ccurme
806211475a core[patch]: update structured output tracing (#30123)
- Trace JSON schema in `options`
- Rename to `ls_structured_output_format`
2025-03-07 13:05:25 -05:00
Erick Friis
6c1e21d128 core: basemessage.text() (#29078) 2025-02-18 17:45:44 -08:00
Erick Friis
8f95da4eb1 multiple: structured output tracing standard metadata (#29421)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-01-29 14:00:26 -08:00
Syed Baqar Abbas
f175319303 [feat] Added backwards compatibility for OllamaEmbeddings initialization (migration from langchain_community.embeddings to langchain_ollama.embeddings (#29296)
- [feat] **Added backwards compatibility for OllamaEmbeddings
initialization (migration from `langchain_community.embeddings` to
`langchain_ollama.embeddings`)**
- **Description:** Given that `OllamaEmbeddings` from
`langchain_community.embeddings` is deprecated, code is being shifted to
`langchain_ollama.embeddings`. However, this does not offer backward
compatibility for initializing the parameters of the `OllamaEmbeddings`
object.
    - **Issue:** #29294 
    - **Dependencies:** None
    - **Twitter handle:** @BaqarAbbas2001


## Additional Information
Previously, `OllamaEmbeddings` from `langchain_community.embeddings`
used to support the following options:

e9abe583b2/libs/community/langchain_community/embeddings/ollama.py (L125-L139)

However, in the new package `from langchain_ollama import
OllamaEmbeddings`, there is no method to set these options. I have added
these parameters to resolve this issue.

This issue was also discussed in
https://github.com/langchain-ai/langchain/discussions/29113
2025-01-20 11:16:29 -05:00
ccurme
bc4dc7f4b1 ollama[patch]: permit streaming for tool calls (#28654)
Resolves https://github.com/langchain-ai/langchain/issues/28543

Ollama recently
[released](https://github.com/ollama/ollama/releases/tag/v0.4.6) support
for streaming tool calls. Previously we would override the `stream`
parameter if tools were passed in.

Covered in standard tests here:
c1d348e95d/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L893-L897)

Before, the test generates one message chunk:
```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:49:04.468487Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 525471208,
            'load_duration': 19701000,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 473000000,
            'message': Message(
                role='assistant',
                content='',
                images=None,
                tool_calls=[
                    ToolCall(
                        function=Function(name='magic_function', arguments={'input': 3})
                    )
                ]
            )
        },
        id='run-552bbe0f-8fb2-4105-ada1-fa38c1db444d',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'type': 'tool_call',
            },
        ],
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        },
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'index': None,
                'type': 'tool_call_chunk',
            }
        ]
    )
]
```

After, it generates two (tool call in one, response metadata in
another):
```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={},
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'type': 'tool_call',
            },
        ],
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'index': None,
                'type': 'tool_call_chunk',
            },
        ],
    ),
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:46:43.278436Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 514282750,
            'load_duration': 16894458,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 464000000,
            'message': Message(
                role='assistant', content='', images=None, tool_calls=None
            ),
        },
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        }
    ),
]
```
2024-12-10 12:54:37 -05:00
ccurme
5c6e2cbcda ollama[patch]: support structured output (#28629)
- Bump minimum version of `ollama` to 0.4.4 (which also addresses
https://github.com/langchain-ai/langchain/issues/28607).
- Support recently-released [structured
output](https://ollama.com/blog/structured-outputs) feature. This can be
accessed by calling `.with_structured_output` with
`method="json_schema"` (choice of name
[mirrors](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.with_structured_output)
what we have for OpenAI's structured output feature).

`ChatOllama` previously implemented `.with_structured_output` via the
[base
implementation](ec9b41431e/libs/core/langchain_core/language_models/chat_models.py (L1117)).
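
A minimal usage sketch of the `json_schema` method (schema and model are illustrative):

```python
from pydantic import BaseModel

from langchain_ollama import ChatOllama


class Joke(BaseModel):
    setup: str
    punchline: str


llm = ChatOllama(model="llama3.1")
structured_llm = llm.with_structured_output(Joke, method="json_schema")
joke = structured_llm.invoke("Tell me a joke about cats")
```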
2024-12-10 10:36:00 -05:00
TheDannyG
607c60a594 partners/ollama: fix tool calling with nested schemas (#28225)
## Description

This PR addresses the following:

**Fixes Issue #25343:**
- Adds additional logic to parse shallowly nested JSON-encoded strings
in tool call arguments, allowing for proper parsing of responses like
that of Llama3.1 and 3.2 with nested schemas.
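
A rough sketch of the kind of shallow decoding described above (helper name and argument shapes are illustrative, not the actual implementation):

```python
import json


def parse_nested_arguments(arguments: dict) -> dict:
    """Decode argument values that arrive as JSON-encoded strings (one level deep)."""
    parsed = {}
    for key, value in arguments.items():
        if isinstance(value, str):
            try:
                parsed[key] = json.loads(value)  # e.g. '{"unit": "celsius"}'
                continue
            except json.JSONDecodeError:
                pass
        parsed[key] = value
    return parsed


print(parse_nested_arguments({"location": "Paris", "options": '{"unit": "celsius"}'}))
```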
 
**Adds Integration Test for Fix:**
- Adds a Ollama specific integration test to ensure the issue is
resolved and to prevent regressions in the future.

**Fixes Failing Integration Tests:**
- Fixes failing integration tests (even prior to changes) caused by
`llama3-groq-tool-use` model. Previously,
tests`test_structured_output_async` and
`test_structured_output_optional_param` failed due to the model not
issuing a tool call in the response. Resolved by switching to
`llama3.1`.

## Issue
Fixes #25343.

## Dependencies
No dependencies.

____

Done in collaboration with @ishaan-upadhyay @mirajismail @ZackSteine.
2024-11-27 10:32:02 -05:00
ccurme
74d9d2cba1 ollama[patch]: support ollama 0.4 (#28364)
v0.4 of the Python SDK is already installed via the lock file in CI, but
our current implementation is not compatible with it.

This also addresses an issue introduced in
https://github.com/langchain-ai/langchain/pull/28299. @RyanMagnuson
would you mind explaining the motivation for that change? From what I
can tell the Ollama SDK [does not support
kwargs](6c44bb2729/ollama/_client.py (L286)).
Previously, unsupported kwargs were ignored, but they currently raise
`TypeError`.

Some of LangChain's standard test suite expects `tool_choice` to be
supported, so here we catch it in `bind_tools` so it is ignored and not
passed through to the client.
2024-11-26 12:45:59 -05:00
Erick Friis
aa7fa80e1e partners/ollama: release 0.2.2rc1 (#28300) 2024-11-22 22:25:05 +00:00
Erick Friis
7277794a59 ollama: include kwargs in requests (#28299)
courtesy of @ryanmagnuson
2024-11-22 14:15:42 -08:00
Elham Badri
d696728278 partners/ollama: Enabled Token Level Streaming when Using Bind Tools for ChatOllama (#27689)
**Description:** The issue concerns the unexpected behavior observed
using the bind_tools method in LangChain's ChatOllama. When tools are
not bound, the llm.stream() method works as expected, returning
incremental chunks of content, which is crucial for real-time
applications such as conversational agents and live feedback systems.
However, when bind_tools([]) is used, the streaming behavior changes,
causing the output to be delivered in full chunks rather than
incrementally. This change negatively impacts the user experience by
breaking the real-time nature of the streaming mechanism.
**Issue:** #26971

---------

Co-authored-by: 4meyDam1e <amey.damle@mail.utoronto.ca>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-11-15 11:36:27 -05:00
JiaranI
3952ee31b8 ollama: add pydocstyle linting for ollama (#27686)
Description: add docstring linting for the ollama module
Issue: the issue https://github.com/langchain-ai/langchain/issues/23188
@baskaryan

test: ruff check passed.
(screenshot: ruff check passing, https://github.com/user-attachments/assets/e96bf721-e0e3-44de-a50e-206603de398e)

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-31 03:06:55 +00:00
Erick Friis
c2a3021bb0 multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Marcelo Machado
9bd4f1dfa8 docs: small improvement ChatOllama setup description (#26043)
Small improvement on ChatOllama description

---------

Co-authored-by: Marcelo Machado <mmachado@ibm.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:15:05 +00:00
ccurme
b83f1eb0d5 core, partners: implement standard tracing params for LLMs (#25410) 2024-08-16 13:18:09 -04:00
Eugene Yurtsev
f4196f1fb8 ollama[patch]: Update extra in ollama package (#25383)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.
2024-08-14 10:30:01 -04:00
Eugene Yurtsev
f679ed72ca ollama[patch]: Update API Reference for ollama embeddings (#25315)
Update API reference for OllamaEmbeddings
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 21:31:48 -04:00
Isaac Francisco
63ddf0afb4 ollama: allow base_url, headers, and auth to be passed (#25078) 2024-08-05 15:39:36 -07:00
Bagatur
8461934c2b core[patch], integrations[patch]: convert TypedDict to tool schema support (#24641)
supports the following UX

```python
from typing import Annotated, Any, Dict, List, Literal, Optional, Union

from typing_extensions import TypedDict


class SubTool(TypedDict):
    """Subtool docstring"""

    args: Annotated[Dict[str, Any], {}, "this does bar"]


class Tool(TypedDict):
    """Docstring

    Args:
        arg1: foo
    """

    arg1: str
    arg2: Union[int, str]
    arg3: Optional[List[SubTool]]
    arg4: Annotated[Literal["bar", "baz"], ..., "this does foo"]
    arg5: Annotated[Optional[float], None]
```

- can parse google style docstring
- can use Annotated to specify default value (second arg)
- can use Annotated to specify arg description (third arg)
- can have nested complex types
2024-07-31 18:27:24 +00:00
Isaac Francisco
78d97b49d9 [partner]: ollama llm fix (#24790) 2024-07-29 13:00:02 -07:00