mirror of
https://github.com/hwchase17/langchain.git
synced 2025-06-22 23:00:00 +00:00
docs: tool, retriever contributing docs (#28602)
parent 5e8553c31a
commit 9b848491c8
@@ -1,4 +1,5 @@
---
+pagination_prev: null
pagination_next: contributing/how_to/integrations/package
---
@@ -37,7 +38,6 @@ While any component can be integrated into LangChain, there are specific types o
<li>Chat Models</li>
<li>Tools/Toolkits</li>
<li>Retrievers</li>
-<li>Document Loaders</li>
<li>Vector Stores</li>
<li>Embedding Models</li>
</ul>
@@ -45,6 +45,7 @@ While any component can be integrated into LangChain, there are specific types o
<td>
<ul>
<li>LLMs (Text-Completion Models)</li>
+<li>Document Loaders</li>
<li>Key-Value Stores</li>
<li>Document Transformers</li>
<li>Model Caches</li>
@@ -175,6 +175,60 @@ import EmbeddingsSource from '/src/theme/integration_template/integration_templa
</TabItem>
<TabItem value="tools" label="Tools">

Tools are used in two main ways:

1. To define an "input schema" or "args schema" that is passed to a chat model's tool-calling
feature along with a text request, so that the chat model can generate a "tool call",
that is, the parameters to call the tool with.
2. To take a "tool call" generated as above, perform some action, and return a response
that can be passed back to the chat model as a `ToolMessage`.
|
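
The two roles above can be sketched without any LangChain imports. This is only an illustration under stated assumptions: `multiply`, `TOOL_SPEC`, `REGISTRY`, and `execute_tool_call` are hypothetical names, and plain dictionaries stand in for the tool call and for the `ToolMessage` that a real `BaseTool` would handle.

```python
from typing import Any, Callable, Dict


def multiply(a: int, b: int) -> int:
    """Hypothetical tool logic: multiply two integers."""
    return a * b


# Role 1: the name, description, and argument schema are what the chat model sees.
# The exact wire format depends on the model provider; this dict is illustrative.
TOOL_SPEC: Dict[str, Any] = {
    "name": "multiply",
    "description": "Multiply two integers.",
    "parameters": {
        "type": "object",
        "properties": {
            "a": {"type": "integer", "description": "First factor"},
            "b": {"type": "integer", "description": "Second factor"},
        },
        "required": ["a", "b"],
    },
}

# Maps tool names to the callables that implement them.
REGISTRY: Dict[str, Callable[..., Any]] = {"multiply": multiply}


def execute_tool_call(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Role 2: run the requested tool and wrap the result like a tool message."""
    result = REGISTRY[tool_call["name"]](**tool_call["args"])
    return {
        "type": "tool",
        "content": str(result),
        "tool_call_id": tool_call["id"],
    }


# A tool call shaped like one a chat model might generate (values are made up).
call = {"name": "multiply", "args": {"a": 6, "b": 7}, "id": "call_1", "type": "tool_call"}
message = execute_tool_call(call)
print(message)
```

The `tool_call_id` on the response is what lets the chat model match the result to the call it made.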
A tool class must inherit from the [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#langchain_core.tools.base.BaseTool) base class. This interface has 3 properties and 2 methods that should be implemented in a
subclass.

| Method/Property | Description                                           |
|-----------------|-------------------------------------------------------|
| `name`          | Name of the tool (also passed to the LLM).            |
| `description`   | Description of the tool (also passed to the LLM).     |
| `args_schema`   | Defines the schema for the tool's input arguments.    |
| `_run`          | Runs the tool with the given arguments.               |
| `_arun`         | Asynchronously runs the tool with the given arguments.|

### Properties

`name`, `description`, and `args_schema` are all properties that should be implemented
in the subclass. `name` and `description` are strings that identify the tool
and describe what it does. Both are passed to the LLM,
and users may override them depending on the LLM they are using, as a form of
"prompt engineering." Giving the tool a concise, LLM-usable name and description is
important for the initial user experience of the tool.

`args_schema` is a Pydantic `BaseModel` that defines the schema for the tool's input
arguments. It is used to validate the input arguments to the tool and to provide
a schema for the LLM to fill out when calling the tool. As with the `name` and
`description` of the tool itself, each field's name (the variable name) and
description (set via `Field(..., description="description")`) are passed to the LLM,
so they should be concise and LLM-usable.

### Run Methods

`_run` is the main method that should be implemented in the subclass. It
takes the arguments from `args_schema` and runs the tool, returning a string
response. This method is usually called in a LangGraph [`ToolNode`](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/), and can also be called in a legacy
`langchain.agents.AgentExecutor`.

`_arun` is optional because, by default, `_run` is run in an async executor.
However, if your tool calls any APIs or does any async work, you should implement
this method, in addition to `_run`, so that the tool runs asynchronously.
|
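
The difference between the default executor fallback and a native async implementation can be sketched with the standard library alone. `SketchTool` and `NativeAsyncTool` are hypothetical stand-ins rather than `BaseTool` subclasses, and the sleep calls stand in for real I/O such as an API request.

```python
import asyncio
import time
from typing import List


class SketchTool:
    """Hypothetical stand-in for a tool class (not the real BaseTool)."""

    def _run(self, seconds: float) -> str:
        time.sleep(seconds)  # blocking work, e.g. a synchronous HTTP call
        return f"slept {seconds}s"

    async def _arun(self, seconds: float) -> str:
        # Fallback in the spirit of the default: hand the sync method to a
        # thread executor so it does not block the event loop.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._run, seconds)


class NativeAsyncTool(SketchTool):
    """Overrides _arun with truly asynchronous work; no thread is needed."""

    async def _arun(self, seconds: float) -> str:
        await asyncio.sleep(seconds)  # non-blocking, e.g. an async HTTP client
        return f"slept {seconds}s"


async def main() -> List[str]:
    # Both calls run concurrently; only the first one occupies a worker thread.
    return await asyncio.gather(
        SketchTool()._arun(0.01),
        NativeAsyncTool()._arun(0.01),
    )


results = asyncio.run(main())
print(results)
```

Both variants return the same result. The native version avoids tying up a worker thread per call, which matters when many calls run at once.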
### Implementation

The `langchain-cli` package contains [template integrations](https://github.com/langchain-ai/langchain/tree/master/libs/cli/langchain_cli/integration_template/integration_template)
for major LangChain components that are tested against the standard unit and
integration tests in the LangChain GitHub repository. You can access the starter
tool implementation [here](https://github.com/langchain-ai/langchain/blob/master/libs/cli/langchain_cli/integration_template/integration_template/tools.py).
For convenience, we also include the code below.

<details>
<summary>Example tool code</summary>
@@ -194,6 +248,50 @@ import ToolSource from '/src/theme/integration_template/integration_template/too
</TabItem>
<TabItem value="retrievers" label="Retrievers">

Retrievers are used to retrieve documents from APIs, databases, or other sources
based on a query. A retriever class must inherit from the [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) base class. This interface has 1 attribute and 2 methods that should be implemented in a subclass.

| Method/Property            | Description                                             |
|----------------------------|---------------------------------------------------------|
| `k`                        | Default number of documents to retrieve (configurable). |
| `_get_relevant_documents`  | Retrieve documents based on a query.                    |
| `_aget_relevant_documents` | Asynchronously retrieve documents based on a query.     |

### Attributes

`k` is an attribute that should be implemented in the subclass. It
can simply be defined at the top of the class with a default value, like
`k: int = 5`. It is the default number of documents to retrieve
from the retriever, and can be overridden by the user when constructing or calling
the retriever.

### Methods

`_get_relevant_documents` is the main method that should be implemented in the subclass.

This method takes in a query and returns a list of `Document` objects, which have 2
main properties:

- `page_content` - the text content of the document
- `metadata` - a dictionary of metadata about the document

Retrievers are typically invoked directly by a user, e.g. as
`MyRetriever(k=4).invoke("query")`, which automatically calls `_get_relevant_documents`
under the hood.
|
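
To make the shape of this interface concrete, here is a library-free sketch. `Doc` and `KeywordRetriever` are hypothetical stand-ins for `Document` and a `BaseRetriever` subclass, and the keyword scoring is a made-up placeholder for a real search backend. In `langchain_core` the method also receives a callback manager argument, which is omitted here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Document: text content plus metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class KeywordRetriever:
    """Hypothetical sketch of a retriever (not a BaseRetriever subclass)."""

    corpus: List[str]
    k: int = 5  # default number of documents to return

    def _get_relevant_documents(self, query: str) -> List[Doc]:
        terms = query.lower().split()
        # Score each text by how many query terms it contains.
        scored = [
            (sum(term in text.lower() for term in terms), index, text)
            for index, text in enumerate(self.corpus)
        ]
        # Keep matches only; best score first, ties broken by corpus order.
        hits = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [
            Doc(page_content=text, metadata={"index": index, "score": score})
            for score, index, text in hits[: self.k]
        ]

    def invoke(self, query: str) -> List[Doc]:
        # The public entry point delegates to the private method.
        return self._get_relevant_documents(query)


corpus = ["cats purr", "dogs bark", "cats and dogs play", "birds sing"]
docs = KeywordRetriever(corpus, k=2).invoke("cats dogs")
for doc in docs:
    print(doc.page_content, doc.metadata)
```

Passing `k=2` at construction overrides the class-level default of 5, which is the kind of user override the `k` attribute is meant to allow.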
`_aget_relevant_documents` is optional because, by default, `_get_relevant_documents` is
run in an async executor. However, if your retriever calls any APIs or does
any async work, you should implement this method, in addition to
`_get_relevant_documents`, so that the retriever runs asynchronously, for performance reasons.

### Implementation

The `langchain-cli` package contains [template integrations](https://github.com/langchain-ai/langchain/tree/master/libs/cli/langchain_cli/integration_template/integration_template)
for major LangChain components that are tested against the standard unit and
integration tests in the LangChain GitHub repository. You can access the starter
retriever implementation [here](https://github.com/langchain-ai/langchain/blob/master/libs/cli/langchain_cli/integration_template/integration_template/retrievers.py).
For convenience, we also include the code below.

<details>
<summary>Example retriever code</summary>