This PR lifts the field-name restrictions imposed by pydantic by remapping the fields.
We could do a bit more work to better catch issues if any
users are intentionally trying to create field collisions.
Pydantic 2 is stricter about which field names are allowed in
pydantic models.
This PR results in the following breaking changes.
These calls will now raise a `ValueError`:
```python
from langchain_core.prompts import ChatPromptTemplate

ChatPromptTemplate([("system", "{_private}")]).get_input_schema()
ChatPromptTemplate([("system", "{model_json_schema}")]).get_input_schema()
```
This PR should properly suppress warnings for the following cases:
```python
from langchain_core.prompts import ChatPromptTemplate

ChatPromptTemplate([("system", "{schema}")]).get_input_schema()
ChatPromptTemplate([("system", "{model_id}")]).get_input_schema()
```
This PR was autogenerated using gritql; the tests were written manually.
```shell
grit apply 'class_definition(name=$C, $body, superclasses=$S) where {
    $C <: ! "Config", // Does not work in this scope, but works after class_definition
    $body <: block($statements),
    $statements <: some bubble assignment(left=$x, right=$y, type=$t) as $A where {
        or {
            $y <: `Field($z)`,
            $x <: "model_config"
        }
    },
    // And has either Any or Optional fields without a default
    $statements <: some bubble assignment(left=$x, right=$y, type=$t) as $A where {
        $t <: or {
            r"Optional.*",
            r"Any",
            r"Union[None, .*]",
            r"Union[.*, None, .*]",
            r"Union[.*, None]",
        },
        $y <: ., // Match empty node
        $t => `$t = None`,
    },
}
' --language python .
```
* This allows pydantic to correctly resolve annotations necessary for
building pydantic models dynamically.
* Makes a small fix for RunnableWithMessageHistory which was fetching
the OutputType from the RunnableLambda that was yielding another
RunnableLambda. This doesn't propagate the output of the RunnableAssign
fully (i.e., with concrete type information etc.)
Resolves issue: https://github.com/langchain-ai/langchain/issues/26250
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
This repr will be deleted prior to release -- it's temporarily here to
make it easy to separate code changes in langchain from code changes
stemming from breaking changes in pydantic.
This PR adds methods to directly get the json schema for inputs,
outputs, and config.
Currently, it's delegating to the underlying pydantic implementation,
but this may be changed in the future to be independent of pydantic.
This PR upgrades core to pydantic 2.
It involves a combination of manual changes together with automated code
mods using gritql.
Changes and known issues:
1. Current models override `__repr__` to be consistent with pydantic 1
(this will be removed in a follow-up PR)
Related:
https://github.com/langchain-ai/langchain/pull/25986/files#diff-e5bd296179b7a72fcd4ea5cfa28b145beaf787da057e6d122aa76ee0bb8132c9R74
2. Issue with decorator for BaseChatModel
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-932bf3b314b268754ef640a5b8f52da96f9024fb81dd388dcd166b5713ecdf66R202)
-- cc @baskaryan
3. The `name` attribute in Base Runnable does not have a default -- it was
raising a pydantic warning due to the override. We need to see if there's a
way to fix this without making a breaking change for folks with custom
runnables.
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-836773d27f8565f4dd45e9d6cf828920f89991a880c098b7511e0d3bb78a8a0dR238)
4. Likely can remove hard-coded RunnableBranch name
(https://github.com/langchain-ai/langchain/pull/25986/files#diff-72894b94f70b1bfc908eb4d53f5ff90bb33bf8a4240a5e34cae48ddc62ac313aR147)
5. `model_*` namespace is reserved in pydantic. We'll need to specify
`protected_namespaces`
6. `create_model` does not have a cached path yet
7. `get_input_schema()` has been updated in many places to be explicit
about whether parameters are required or optional
8. Injected tool args aren't picked up properly (the type annotation is lost)
For posterity the following gritql migrations were used:
```
engine marzano(0.1)
language python
or {
    `from $IMPORT import $...` where {
        $IMPORT <: contains `pydantic_v1`,
        $IMPORT => `pydantic`
    },
    `$X.update_forward_refs` => `$X.model_rebuild`,
    // This pattern still needs fixing as it fails (populate_by_name vs.
    // allow_populate_by_name)
    class_definition($name, $body) as $C where {
        $name <: `Config`,
        $body <: block($statements),
        $t = "",
        $statements <: some bubble($t) assignment(left=$x, right=$y) as $A where {
            or {
                $x <: `allow_population_by_field_name` where {
                    $t += `populate_by_name=$y,`
                },
                $t += `$x=$y,`
            }
        },
        $C => `model_config = ConfigDict($t)`,
        add_import(source="pydantic", name="ConfigDict")
    }
}
```
```
engine marzano(0.1)
language python
`@root_validator(pre=True)` as $decorator where {
    $decorator <: before function_definition($body, $return_type),
    $decorator => `@model_validator(mode="before")\n@classmethod`,
    add_import(source="pydantic", name="model_validator"),
    $return_type => `Any`
}
```
```
engine marzano(0.1)
language python
`@root_validator(pre=False, skip_on_failure=True)` as $decorator where {
    $decorator <: before function_definition($body, $parameters, $return_type) where {
        $body <: contains bubble or {
            `values["$Q"]` => `self.$Q`,
            `values.get("$Q")` => `(self.$Q or None)`,
            `values.get($Q, $...)` as $V where {
                $Q <: contains `"$QName"`,
                $V => `self.$QName`,
            },
            `return $Q` => `return self`
        }
    },
    $decorator => `@model_validator(mode="after")`,
    // Silly work around a bug in grit
    // Adding Self to pydantic and then will replace it with one from typing
    add_import(source="pydantic", name="model_validator"),
    $parameters => `self`,
    $return_type => `Self`
}
```
```shell
grit apply --language python '`Self` where { add_import(source="typing_extensions", name="Self")}'
```
Hello.
First of all, thank you for maintaining such a great project.
## Description
In https://github.com/langchain-ai/langchain/pull/25123, support for
structured_output was added. However, `"additionalProperties": false`
needs to be included at all levels when a nested object is generated.
Error from the current code:
https://gist.github.com/fufufukakaka/e9b475300e6934853d119428e390f204
```
BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'JokeWithEvaluation': In context=('properties', 'self_evaluation'), 'additionalProperties' is required to be supplied and to be false", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
```
Reference: [Introducing Structured Outputs in the
API](https://openai.com/index/introducing-structured-outputs-in-the-api/)
```json
{
"model": "gpt-4o-2024-08-06",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor."
},
{
"role": "user",
"content": "solve 8x + 31 = 2"
}
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "math_response",
"strict": true,
"schema": {
"type": "object",
"properties": {
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"explanation": {
"type": "string"
},
"output": {
"type": "string"
}
},
"required": ["explanation", "output"],
"additionalProperties": false
}
},
"final_answer": {
"type": "string"
}
},
"required": ["steps", "final_answer"],
"additionalProperties": false
}
}
}
}
```
In the current code, `"additionalProperties": false` is only added at
the last level.
This PR introduces the `_add_additional_properties_key` function, which
recursively adds `"additionalProperties": false` to the entire JSON
schema for the request.
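The recursive idea can be sketched as follows. This is a minimal illustration, not the PR's implementation: the function name `add_additional_properties_false` and the traversal rule (mark every dict whose `"type"` is `"object"`) are assumptions made for this example, and the real `_add_additional_properties_key` may differ in signature and detail.

```python
from typing import Any


def add_additional_properties_false(schema: Any) -> Any:
    """Recursively set "additionalProperties": False on every object schema.

    Hypothetical stand-in for the PR's `_add_additional_properties_key`.
    The input is modified in place and also returned for convenience.
    """
    if isinstance(schema, dict):
        # Mark this level if it describes a JSON object.
        if schema.get("type") == "object":
            schema["additionalProperties"] = False
        # Recurse into every value so nested properties and array items
        # are covered as well.
        for value in schema.values():
            add_additional_properties_false(value)
    elif isinstance(schema, list):
        for item in schema:
            add_additional_properties_false(item)
    return schema


# A trimmed-down version of the nested schema from the reference request.
schema = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"explanation": {"type": "string"}},
            },
        },
        "final_answer": {"type": "string"},
    },
}
result = add_additional_properties_false(schema)
```

After the call, both the top-level object and the nested `items` object carry the key, while non-object schemas such as `final_answer` are left untouched.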
Twitter handle: `@fukkaa1225`
Thank you!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Previously, the code could handle only a single level of nesting
for subgraphs in mermaid. This change adds support for arbitrary nesting
of subgraphs.
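As a rough illustration of what arbitrary nesting involves, the sketch below renders a tree of subgraphs with a recursive helper. The dict layout (`name`, `nodes`, `children`) and the function `render_subgraph` are hypothetical and chosen only for this example; they are not the data structures used in the PR.

```python
from typing import Any, Dict, List


def render_subgraph(subgraph: Dict[str, Any], depth: int = 1) -> List[str]:
    """Render one subgraph and all of its descendants as Mermaid lines."""
    pad = "    " * depth
    lines = [f"{pad}subgraph {subgraph['name']}"]
    # Nodes that belong directly to this subgraph.
    for node in subgraph.get("nodes", []):
        lines.append(f"{pad}    {node}")
    # Recursion handles any depth of nesting.
    for child in subgraph.get("children", []):
        lines.extend(render_subgraph(child, depth + 1))
    lines.append(f"{pad}end")
    return lines


# Hypothetical three-level hierarchy: parent > child > grandchild.
tree = {
    "name": "parent",
    "nodes": ["a"],
    "children": [
        {
            "name": "child",
            "nodes": ["b"],
            "children": [{"name": "grandchild", "nodes": ["c"]}],
        }
    ],
}
diagram = "\n".join(["graph TD;"] + render_subgraph(tree))
print(diagram)
```

Each `subgraph` line is closed by a matching `end`, so the output stays balanced no matter how deep the hierarchy goes.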
**Description:**
An LLM will stop generating text, even in the middle of a sentence, if
`finish_reason` is `length` (for OpenAI) or `stop_reason` is
`max_tokens` (for Anthropic).
To obtain longer outputs from an LLM, we can call the message generation
API multiple times and merge the results into a single text to work around the
API's output token limit.
The line breaks that the `merge_message_runs` function inserts between
chunks get in the way of merging messages seamlessly, so I added an option to
specify the chunk separator.
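The effect of a configurable separator can be shown with a small standalone sketch. It does not call LangChain: `merge_runs` is a hypothetical simplification that works on `(role, content)` tuples, and the keyword name `chunk_separator` is assumed here for illustration.

```python
from typing import List, Tuple

# (role, content); a simplified stand-in for real message objects.
Message = Tuple[str, str]


def merge_runs(messages: List[Message], chunk_separator: str = "\n") -> List[Message]:
    """Merge consecutive messages from the same role into one message."""
    merged: List[Message] = []
    for role, content in messages:
        if merged and merged[-1][0] == role:
            # Same role as the previous message: join the two contents.
            prev_role, prev_content = merged[-1]
            merged[-1] = (prev_role, prev_content + chunk_separator + content)
        else:
            merged.append((role, content))
    return merged


# Two partial generations, the first cut off mid-sentence by the token limit.
parts = [("ai", "The quick brown fox jumps over"), ("ai", " the lazy dog.")]

default_merge = merge_runs(parts)                       # newline forced between chunks
seamless_merge = merge_runs(parts, chunk_separator="")  # chunks joined as-is
```

With the default separator the sentence is broken by a newline at the cut-off point; passing an empty separator reconstructs the original sentence.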
**Issue:**
No corresponding issues.
**Dependencies:**
No dependencies required.
**Twitter handle:**
@hanama_chem
https://x.com/hanama_chem
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>