diff --git a/libs/core/langchain_core/load/load.py b/libs/core/langchain_core/load/load.py
index 82d84646f73..eb4207f1f95 100644
--- a/libs/core/langchain_core/load/load.py
+++ b/libs/core/langchain_core/load/load.py
@@ -13,6 +13,14 @@ allowlist. If the class is not in the allowlist, deserialization raises a `Value
 
 ## Security model
 
+!!! warning "Do not use with untrusted input"
+
+    These functions instantiate Python objects and can trigger side effects
+    such as network calls, file operations, or environment variable access
+    during deserialization. **Never call `load()` or `loads()` on data from an
+    untrusted or unauthenticated source.** Even with the allowlist, allowed
+    classes may perform dangerous operations during `__init__`.
+
 The `allowed_objects` parameter controls which classes can be deserialized:
 
 - **`'core'` (default)**: Allow classes defined in the serialization mappings for
@@ -35,6 +43,16 @@ These classes do not perform side effects during initialization.
 Import paths are also validated against trusted namespaces before any module is
 imported.
 
+### Best practices
+
+- Use the most restrictive `allowed_objects` possible. Prefer an explicit list
+  of classes over `'core'` or `'all'`.
+- Keep `secrets_from_env` set to `False` (the default). If you must use it,
+  ensure the serialized data comes from a fully trusted source, as a crafted
+  payload can read arbitrary environment variables.
+- When using `secrets_map`, include only the specific secrets that the
+  serialized object requires.
+
 ### Injection protection (escape-based)
 
 During serialization, plain dicts that contain an `'lc'` key are escaped by wrapping
@@ -299,11 +317,18 @@ class Reviver:
                   `langchain_core.load.mapping` for the full list.
                 - Explicit list of classes: Only those specific classes are allowed.
             secrets_map: A map of secrets to load.
-                If a secret is not found in the map, it will be loaded from the
-                environment if `secrets_from_env` is `True`.
+
+                Only include the specific secrets the serialized object
+                requires. If a secret is not found in the map, it will be loaded
+                from the environment if `secrets_from_env` is `True`.
             valid_namespaces: Additional namespaces (modules) to allow during
                 deserialization, beyond the default trusted namespaces.
             secrets_from_env: Whether to load secrets from the environment.
+
+                A crafted payload can name arbitrary environment variables in
+                its `secret` fields, so enabling this on untrusted data can leak
+                sensitive values. Keep this `False` (the default) unless the
+                serialized data is fully trusted.
             additional_import_mappings: A dictionary of additional namespace mappings.
                 You can use this to override default mappings or add new mappings.
 
@@ -490,10 +515,12 @@ def loads(
     core LangChain types (messages, prompts, documents, etc.). See
     `langchain_core.load.mapping` for the full list.
 
-    !!! warning "Beta feature"
+    !!! warning "Do not use with untrusted input"
 
-        This is a beta feature. Please be wary of deploying experimental code to
-        production unless you've taken appropriate precautions.
+        This function instantiates Python objects and can trigger side effects
+        during deserialization. **Never call `loads()` on data from an untrusted
+        or unauthenticated source.** See the module-level security model
+        documentation for details and best practices.
 
     Args:
         text: The string to load.
@@ -511,11 +538,17 @@ def loads(
             - `[]`: Disallow all deserialization (will raise on any object).
         secrets_map: A map of secrets to load.
 
-            If a secret is not found in the map, it will be loaded from the environment
-            if `secrets_from_env` is `True`.
+            Only include the specific secrets the serialized object requires. If
+            a secret is not found in the map, it will be loaded from the
+            environment if `secrets_from_env` is `True`.
         valid_namespaces: Additional namespaces (modules) to allow during
             deserialization, beyond the default trusted namespaces.
         secrets_from_env: Whether to load secrets from the environment.
+
+            A crafted payload can name arbitrary environment variables in its
+            `secret` fields, so enabling this on untrusted data can leak
+            sensitive values. Keep this `False` (the default) unless the
+            serialized data is fully trusted.
         additional_import_mappings: A dictionary of additional namespace mappings.
             You can use this to override default mappings or add new mappings.
 
@@ -573,10 +606,12 @@ def load(
     core LangChain types (messages, prompts, documents, etc.). See
     `langchain_core.load.mapping` for the full list.
 
-    !!! warning "Beta feature"
+    !!! warning "Do not use with untrusted input"
 
-        This is a beta feature. Please be wary of deploying experimental code to
-        production unless you've taken appropriate precautions.
+        This function instantiates Python objects and can trigger side effects
+        during deserialization. **Never call `load()` on data from an untrusted
+        or unauthenticated source.** See the module-level security model
+        documentation for details and best practices.
 
     Args:
         obj: The object to load.
@@ -594,11 +629,18 @@ def load(
             - `[]`: Disallow all deserialization (will raise on any object).
         secrets_map: A map of secrets to load.
 
+            Only include the specific secrets the serialized object requires.
+
             If a secret is not found in the map, it will be loaded from the environment
             if `secrets_from_env` is `True`.
         valid_namespaces: Additional namespaces (modules) to allow during
             deserialization, beyond the default trusted namespaces.
         secrets_from_env: Whether to load secrets from the environment.
+
+            A crafted payload can name arbitrary environment variables in its
+            `secret` fields, so enabling this on untrusted data can leak
+            sensitive values. Keep this `False` (the default) unless the
+            serialized data is fully trusted.
         additional_import_mappings: A dictionary of additional namespace mappings.
             You can use this to override default mappings or add new mappings.
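
The `secrets_from_env` warning added in these docstrings can be illustrated with a small, self-contained sketch. It is a toy resolver, not LangChain's `Reviver`: the function name `resolve_secrets`, the variable names `DEMO_DB_PASSWORD` and `DEMO_API_KEY`, and the placeholder shape `{"lc": 1, "type": "secret", "id": [...]}` are assumptions made for illustration. It only mimics the lookup order the docstrings describe (the map first, then the environment if enabled).

```python
import json
import os


def resolve_secrets(node, secrets_map=None, secrets_from_env=False):
    """Toy resolver that replaces secret placeholders in parsed JSON.

    Hypothetical helper, not part of langchain_core.
    """
    secrets_map = secrets_map or {}
    if isinstance(node, dict):
        # Assumed placeholder shape: {"lc": 1, "type": "secret", "id": [NAME]}
        if node.get("lc") == 1 and node.get("type") == "secret":
            name = node["id"][0]
            if name in secrets_map:
                return secrets_map[name]
            # The payload, not the caller, chooses NAME here.
            if secrets_from_env and name in os.environ:
                return os.environ[name]
            return None
        return {
            key: resolve_secrets(value, secrets_map, secrets_from_env)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [resolve_secrets(v, secrets_map, secrets_from_env) for v in node]
    return node


# Hypothetical variable that has nothing to do with the object being loaded.
os.environ["DEMO_DB_PASSWORD"] = "hunter2"

# A crafted payload names that variable in a secret field.
crafted = json.loads(
    '{"api_key": {"lc": 1, "type": "secret", "id": ["DEMO_DB_PASSWORD"]}}'
)

# With environment lookup enabled, the unrelated value is pulled in.
leaked = resolve_secrets(crafted, secrets_from_env=True)

# With a narrow map and no environment lookup, the request resolves to nothing.
safe = resolve_secrets(crafted, secrets_map={"DEMO_API_KEY": "sk-demo"})
```

In this sketch `leaked` carries the value of `DEMO_DB_PASSWORD` while `safe` does not, which is the reason for keeping `secrets_from_env` at `False` and passing a `secrets_map` that contains only the secrets the serialized object needs.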