Files
langchain/docs/docs/contributing/how_to/integrations/standard_tests.ipynb
2024-11-20 19:26:16 -08:00

469 lines
16 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to add standard tests to an integration\n",
"\n",
"When creating either a custom class for yourself or a new tool to publish in a LangChain integration, it is important to add standard tests to ensure it works as expected. This guide will show you how to add standard tests to a tool, and you can **[Skip to the test templates](#standard-test-templates-per-component)** for implementing tests for each integration.\n",
"\n",
"## Setup\n",
"\n",
"First, let's install 2 dependencies:\n",
"\n",
"- `langchain-core` will define the interfaces we want to import to define our custom tool.\n",
"- `langchain-tests==0.3.3` will provide the standard tests we want to use.\n",
"\n",
":::note\n",
"\n",
"Because added tests in new versions of `langchain-tests` will always break your CI/CD pipelines, we recommend pinning the \n",
"version of `langchain-tests` to avoid unexpected changes.\n",
"\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain-core langchain-tests==0.3.2 pytest pytest-socket"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we're publishing a package, `langchain_parrot_link`, that exposes a\n",
"tool called `ParrotMultiplyTool`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"langchain_parrot_link/tools.py\"\n",
"from langchain_core.tools import BaseTool\n",
"\n",
"\n",
"class ParrotMultiplyTool(BaseTool):\n",
" name: str = \"ParrotMultiplyTool\"\n",
" description: str = (\n",
" \"Multiply two numbers like a parrot. Parrots always add \"\n",
" \"eighty for their matey.\"\n",
" )\n",
"\n",
" def _run(self, a: int, b: int) -> int:\n",
" return a * b + 80"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And we'll assume you've structured your package the same way as the main LangChain\n",
"packages:\n",
"\n",
"```\n",
"/\n",
"├── langchain_parrot_link/\n",
"│ └── tools.py\n",
"└── tests/\n",
" ├── unit_tests/\n",
" │ └── test_tools.py\n",
" └── integration_tests/\n",
" └── test_tools.py\n",
"```\n",
"\n",
"## Add and configure standard tests\n",
"\n",
"There are 2 namespaces in the `langchain-tests` package: \n",
"\n",
"- [unit tests](../../../concepts/testing.mdx#unit-tests) (`langchain_tests.unit_tests`): designed to be used to test the tool in isolation and without access to external services\n",
"- [integration tests](../../../concepts/testing.mdx#unit-tests) (`langchain_tests.integration_tests`): designed to be used to test the tool with access to external services (in particular, the external service that the tool is designed to interact with).\n",
"\n",
"Both types of tests are implemented as [`pytest` class-based test suites](https://docs.pytest.org/en/7.1.x/getting-started.html#group-multiple-tests-in-a-class).\n",
"\n",
"By subclassing the base classes for each type of standard test (see below), you get all of the standard tests for that type, and you\n",
"can override the properties that the test suite uses to configure the tests.\n",
"\n",
"### Standard tools tests\n",
"\n",
"Here's how you would configure the standard unit tests for the custom tool, e.g. in `tests/test_tools.py`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"title": "tests/test_custom_tool.py"
},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.unit_tests import ToolsUnitTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolUnit(ToolsUnitTests):\n",
" @property\n",
" def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
" return ParrotMultiplyTool\n",
"\n",
" @property\n",
" def tool_constructor_params(self) -> dict:\n",
" # if your tool constructor instead required initialization arguments like\n",
" # `def __init__(self, some_arg: int):`, you would return those here\n",
" # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
" return {}\n",
"\n",
" @property\n",
" def tool_invoke_params_example(self) -> dict:\n",
" \"\"\"\n",
" Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
" This should NOT be a ToolCall dict - i.e. it should not\n",
" have {\"name\", \"id\", \"args\"} keys.\n",
" \"\"\"\n",
" return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.integration_tests import ToolsIntegrationTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolIntegration(ToolsIntegrationTests):\n",
" @property\n",
" def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
" return ParrotMultiplyTool\n",
"\n",
" @property\n",
" def tool_constructor_params(self) -> dict:\n",
" # if your tool constructor instead required initialization arguments like\n",
" # `def __init__(self, some_arg: int):`, you would return those here\n",
" # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
" return {}\n",
"\n",
" @property\n",
" def tool_invoke_params_example(self) -> dict:\n",
" \"\"\"\n",
" Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
" This should NOT be a ToolCall dict - i.e. it should not\n",
" have {\"name\", \"id\", \"args\"} keys.\n",
" \"\"\"\n",
" return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and you would run these with the following commands from your project root\n",
"\n",
"```bash\n",
"# run unit tests without network access\n",
"pytest --disable-socket --allow-unix-socket tests/unit_tests\n",
"\n",
"# run integration tests\n",
"pytest tests/integration_tests\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Standard test templates per component:\n",
"\n",
"Above, we implement the **unit** and **integration** standard tests for a tool. Below are the templates for implementing the standard tests for each component:\n",
"\n",
"<details>\n",
" <summary>Chat Models</summary>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_chat_models.py\"\n",
"from typing import Tuple, Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.unit_tests import ChatModelUnitTests\n",
"\n",
"\n",
"class TestChatParrotLinkUnit(ChatModelUnitTests):\n",
" @property\n",
" def chat_model_class(self) -> Type[ChatParrotLink]:\n",
" return ChatParrotLink\n",
"\n",
" @property\n",
" def chat_model_params(self) -> dict:\n",
" return {\"model\": \"bird-brain-001\", \"temperature\": 0}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_chat_models.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.chat_models import ChatParrotLink\n",
"from langchain_tests.integration_tests import ChatModelIntegrationTests\n",
"\n",
"\n",
"class TestChatParrotLinkIntegration(ChatModelIntegrationTests):\n",
" @property\n",
" def chat_model_class(self) -> Type[ChatParrotLink]:\n",
" return ChatParrotLink\n",
"\n",
" @property\n",
" def chat_model_params(self) -> dict:\n",
" return {\"model\": \"bird-brain-001\", \"temperature\": 0}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Embedding Models</summary>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_embeddings.py\"\n",
"from typing import Tuple, Type\n",
"\n",
"from langchain_parrot_link.embeddings import ParrotLinkEmbeddings\n",
"from langchain_tests.unit_tests import EmbeddingsUnitTests\n",
"\n",
"\n",
"class TestParrotLinkEmbeddingsUnit(EmbeddingsUnitTests):\n",
" @property\n",
" def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:\n",
" return ParrotLinkEmbeddings\n",
"\n",
" @property\n",
" def embedding_model_params(self) -> dict:\n",
" return {\"model\": \"nest-embed-001\", \"temperature\": 0}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_embeddings.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.embeddings import ParrotLinkEmbeddings\n",
"from langchain_tests.integration_tests import EmbeddingsIntegrationTests\n",
"\n",
"\n",
"class TestParrotLinkEmbeddingsIntegration(EmbeddingsIntegrationTests):\n",
" @property\n",
" def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:\n",
" return ParrotLinkEmbeddings\n",
"\n",
" @property\n",
" def embedding_model_params(self) -> dict:\n",
" return {\"model\": \"nest-embed-001\", \"temperature\": 0}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Tools/Toolkits</summary>\n",
" <p>Note: The standard tests for tools/toolkits are implemented in the example in the main body of this guide too.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/unit_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.unit_tests import ToolsUnitTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolUnit(ToolsUnitTests):\n",
" @property\n",
" def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
" return ParrotMultiplyTool\n",
"\n",
" @property\n",
" def tool_constructor_params(self) -> dict:\n",
" # if your tool constructor instead required initialization arguments like\n",
" # `def __init__(self, some_arg: int):`, you would return those here\n",
" # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
" return {}\n",
"\n",
" @property\n",
" def tool_invoke_params_example(self) -> dict:\n",
" \"\"\"\n",
" Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
" This should NOT be a ToolCall dict - i.e. it should not\n",
" have {\"name\", \"id\", \"args\"} keys.\n",
" \"\"\"\n",
" return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_tools.py\"\n",
"from typing import Type\n",
"\n",
"from langchain_parrot_link.tools import ParrotMultiplyTool\n",
"from langchain_tests.integration_tests import ToolsIntegrationTests\n",
"\n",
"\n",
"class TestParrotMultiplyToolIntegration(ToolsIntegrationTests):\n",
" @property\n",
" def tool_constructor(self) -> Type[ParrotMultiplyTool]:\n",
" return ParrotMultiplyTool\n",
"\n",
" @property\n",
" def tool_constructor_params(self) -> dict:\n",
" # if your tool constructor instead required initialization arguments like\n",
" # `def __init__(self, some_arg: int):`, you would return those here\n",
" # as a dictionary, e.g.: `return {'some_arg': 42}`\n",
" return {}\n",
"\n",
" @property\n",
" def tool_invoke_params_example(self) -> dict:\n",
" \"\"\"\n",
" Returns a dictionary representing the \"args\" of an example tool call.\n",
"\n",
" This should NOT be a ToolCall dict - i.e. it should not\n",
" have {\"name\", \"id\", \"args\"} keys.\n",
" \"\"\"\n",
" return {\"a\": 2, \"b\": 3}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>\n",
"<details>\n",
" <summary>Vector Stores</summary>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# title=\"tests/integration_tests/test_vectorstores_sync.py\"\n",
"\n",
"from typing import AsyncGenerator, Generator\n",
"\n",
"import pytest\n",
"from langchain_core.vectorstores import VectorStore\n",
"from langchain_parrot_link.vectorstores import ParrotVectorStore\n",
"from langchain_standard_tests.integration_tests.vectorstores import (\n",
" AsyncReadWriteTestSuite,\n",
" ReadWriteTestSuite,\n",
")\n",
"\n",
"\n",
"class TestSync(ReadWriteTestSuite):\n",
" @pytest.fixture()\n",
" def vectorstore(self) -> Generator[VectorStore, None, None]: # type: ignore\n",
" \"\"\"Get an empty vectorstore for unit tests.\"\"\"\n",
" store = ParrotVectorStore()\n",
" # note: store should be EMPTY at this point\n",
" # if you need to delete data, you may do so here\n",
" try:\n",
" yield store\n",
" finally:\n",
" # cleanup operations, or deleting data\n",
" pass\n",
"\n",
"\n",
"class TestAsync(AsyncReadWriteTestSuite):\n",
" @pytest.fixture()\n",
" async def vectorstore(self) -> AsyncGenerator[VectorStore, None]: # type: ignore\n",
" \"\"\"Get an empty vectorstore for unit tests.\"\"\"\n",
" store = ParrotVectorStore()\n",
" # note: store should be EMPTY at this point\n",
" # if you need to delete data, you may do so here\n",
" try:\n",
" yield store\n",
" finally:\n",
" # cleanup operations, or deleting data\n",
" pass"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"</details>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}