docs: poetry publish (#28275)

Erick Friis 2024-11-21 19:10:03 -08:00 committed by GitHub
parent f173b72e35
commit 49254cde70
9 changed files with 438 additions and 22 deletions

View File

@ -6,4 +6,5 @@
## Integrations
- [**Start Here**](integrations/index.mdx): Help us integrate with your favorite vendors and tools.
- [**Package**](integrations/package): Publish an integration package to PyPi
- [**Standard Tests**](integrations/standard_tests): Ensure your integration passes an expected set of tests.

View File

@ -1,3 +1,7 @@
---
pagination_next: null
pagination_prev: null
---
## How to add a community integration (not recommended)
:::danger

View File

@ -1,3 +1,8 @@
---
pagination_next: null
pagination_prev: null
---
# How to publish an integration package from a template
:::danger

View File

@ -1,5 +1,5 @@
---
sidebar_position: 5
pagination_next: contributing/how_to/integrations/package
---
# Contribute Integrations
@ -66,7 +66,7 @@ that will render on this site (https://python.langchain.com/).
As a prerequisite to adding your integration to our documentation, you must:
1. Confirm that your integration is in the [list of components](#components-to-integrate) we are currently accepting.
2. Ensure that your integration is in a separate package that can be installed with `pip install <your-package>`.
2. [Publish your package to PyPi](./package.mdx) and make the repo public.
3. [Implement the standard tests](/docs/contributing/how_to/integrations/standard_tests) for your integration and successfully run them.
3. Write documentation for your integration in the `docs/docs/integrations/<component_type>` directory of the LangChain monorepo.
4. Add a provider page for your integration in the `docs/docs/integrations/providers` directory of the LangChain monorepo.

View File

@ -0,0 +1,229 @@
---
pagination_next: contributing/how_to/integrations/standard_tests
pagination_prev: contributing/how_to/integrations/index
---
# How to bootstrap a new integration package
This guide walks through the process of publishing a new LangChain integration
package to PyPi.
Integration packages are just Python packages that can be installed with `pip install <your-package>`,
which contain classes that are compatible with LangChain's core interfaces.
In this guide, we will be using [Poetry](https://python-poetry.org/) for
dependency management and packaging, but you're welcome to use any other tools you prefer.
## **Prerequisites**
- [GitHub](https://github.com) account
- [PyPi](https://pypi.org/) account
## Bootstrapping a new Python package with Poetry
First, install Poetry:
```bash
pip install poetry
```
Next, come up with a name for your package. For this guide, we'll use `langchain-parrot-link`.
You can confirm that the name is available on PyPi by searching for it on the [PyPi website](https://pypi.org/).
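As a rough sketch of the same check in code: PyPi compares project names in a normalized form (PEP 503), so `langchain_parrot_link` and `langchain-parrot-link` count as the same name. The helper names `normalize` and `is_name_taken` below are hypothetical, and the lookup assumes that PyPi's JSON endpoint answers with a 404 when no project has that name.

```python
import re
import urllib.error
import urllib.request


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_name_taken(name: str) -> bool:
    """Return True if a project with this normalized name exists on PyPi.

    Needs network access. A 404 from the JSON endpoint is treated as
    "no such project"; any other HTTP error is re-raised.
    """
    url = f"https://pypi.org/pypi/{normalize(name)}/json"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


# Different spellings collapse to one normalized name.
print(normalize("LangChain_Parrot.Link"))
```

Calling `is_name_taken("langchain-parrot-link")` performs the actual lookup; it is left out of the demonstration because it needs network access.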
Next, create your new Python package with Poetry:
```bash
poetry new langchain-parrot-link
```
Add main dependencies using Poetry, which will add them to your `pyproject.toml` file:
```bash
poetry add langchain-core
```
We will also add some `test` dependencies in a separate poetry dependency group. If
you are not using Poetry, we recommend adding these in a way that won't package them
with your published package, or just installing them separately when you run tests.
`langchain-tests` will provide the [standard tests](../standard_tests) we will use later.
We recommend pinning these to the latest version: <img src="https://img.shields.io/pypi/v/langchain-tests" style={{position:"relative",top:4,left:3}} />
Note: Replace `{latest version}` with the latest version of `langchain-tests` below.
```bash
poetry add --group test pytest pytest-socket langchain-tests=={latest version}
```
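If you would rather look up the value for `{latest version}` programmatically, the following is a minimal sketch that reads it from PyPi's JSON endpoint, which reports the newest release under `info.version`. The function names are hypothetical, and the sample payload uses a made-up version number so the example runs offline.

```python
import json
import urllib.request


def latest_version_from_payload(payload: dict) -> str:
    """Extract the latest release version from a PyPi JSON response."""
    return payload["info"]["version"]


def fetch_latest_version(package: str) -> str:
    """Ask PyPi for the latest version of a package (needs network access)."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return latest_version_from_payload(json.load(response))


def pinned_requirement(package: str, version: str) -> str:
    """Build the `name==version` string to pass to `poetry add`."""
    return f"{package}=={version}"


# Offline demonstration; "1.2.3" is a placeholder, not a real release.
sample = {"info": {"name": "langchain-tests", "version": "1.2.3"}}
print(pinned_requirement("langchain-tests", latest_version_from_payload(sample)))
```

With network access, `pinned_requirement("langchain-tests", fetch_latest_version("langchain-tests"))` would produce the string to use in the `poetry add --group test` command.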
You're now ready to start writing your integration package!
## Writing your integration
Let's say you're building a simple integration package that provides a `ChatParrotLink`
chat model integration for LangChain. Here's a simple example of what your project
structure might look like:
```plaintext
langchain-parrot-link/
├── langchain_parrot_link/
│ ├── __init__.py
│ └── chat_models.py
├── tests/
│ ├── __init__.py
│ └── test_chat_models.py
├── pyproject.toml
└── README.md
```
All of these files should already exist from the `poetry new` step above, except for
`chat_models.py` and `test_chat_models.py`. We will implement `test_chat_models.py`
later, following the [standard tests](../standard_tests) guide.
To implement `chat_models.py`, let's copy the implementation from our
[Custom Chat Model Guide](../../../../how_to/custom_chat_model).
<details>
<summary>chat_models.py</summary>
```python title="langchain_parrot_link/chat_models.py"
from typing import Any, Dict, Iterator, List, Optional

from langchain_core.callbacks import (
    CallbackManagerForLLMRun,
)
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessageChunk, BaseMessage, AIMessage
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult


class CustomChatModelAdvanced(BaseChatModel):
    """A custom chat model that echoes the first `n` characters of the input.

    When contributing an implementation to LangChain, carefully document
    the model including the initialization parameters, include
    an example of how to initialize the model and include any relevant
    links to the underlying model's documentation or API.

    Example:

        .. code-block:: python

            model = CustomChatModelAdvanced(n=2, model_name="my_custom_model")
            result = model.invoke([HumanMessage(content="hello")])
            result = model.batch([[HumanMessage(content="hello")],
                                 [HumanMessage(content="world")]])
    """

    model_name: str
    """The name of the model"""
    n: int
    """The number of characters from the last message of the prompt to be echoed."""

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        """Override the _generate method to implement the chat model logic.

        This can be a call to an API, a call to a local model, or any other
        implementation that generates a response to the input prompt.

        Args:
            messages: the prompt composed of a list of messages.
            stop: a list of strings on which the model should stop generating.
                  If generation stops due to a stop token, the stop token itself
                  SHOULD BE INCLUDED as part of the output. This is not enforced
                  across models right now, but it's a good practice to follow since
                  it makes it much easier to parse the output of the model
                  downstream and understand why generation stopped.
            run_manager: A run manager with callbacks for the LLM.
        """
        # Replace this with actual logic to generate a response from a list
        # of messages.
        last_message = messages[-1]
        tokens = last_message.content[: self.n]
        message = AIMessage(
            content=tokens,
            additional_kwargs={},  # Used to add additional payload (e.g., function calling request)
            response_metadata={  # Use for response metadata
                "time_in_seconds": 3,
            },
        )
        ##

        generation = ChatGeneration(message=message)
        return ChatResult(generations=[generation])

    def _stream(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[ChatGenerationChunk]:
        """Stream the output of the model.

        This method should be implemented if the model can generate output
        in a streaming fashion. If the model does not support streaming,
        do not implement it. In that case streaming requests will be automatically
        handled by the _generate method.

        Args:
            messages: the prompt composed of a list of messages.
            stop: a list of strings on which the model should stop generating.
                  If generation stops due to a stop token, the stop token itself
                  SHOULD BE INCLUDED as part of the output. This is not enforced
                  across models right now, but it's a good practice to follow since
                  it makes it much easier to parse the output of the model
                  downstream and understand why generation stopped.
            run_manager: A run manager with callbacks for the LLM.
        """
        last_message = messages[-1]
        tokens = last_message.content[: self.n]

        for token in tokens:
            chunk = ChatGenerationChunk(message=AIMessageChunk(content=token))

            if run_manager:
                # This is optional in newer versions of LangChain
                # The on_llm_new_token will be called automatically
                run_manager.on_llm_new_token(token, chunk=chunk)

            yield chunk

        # Let's add some other information (e.g., response metadata)
        chunk = ChatGenerationChunk(
            message=AIMessageChunk(content="", response_metadata={"time_in_sec": 3})
        )
        if run_manager:
            # This is optional in newer versions of LangChain
            # The on_llm_new_token will be called automatically
            run_manager.on_llm_new_token(token, chunk=chunk)
        yield chunk

    @property
    def _llm_type(self) -> str:
        """Get the type of language model used by this chat model."""
        return "echoing-chat-model-advanced"

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        """Return a dictionary of identifying parameters.

        This information is used by the LangChain callback system, which
        is used for tracing and makes it possible to monitor LLMs.
        """
        return {
            # The model name allows users to specify custom token counting
            # rules in LLM monitoring applications (e.g., in LangSmith users
            # can provide per token pricing for their model and monitor
            # costs for the given LLM.)
            "model_name": self.model_name,
        }
```
</details>
## Next Steps
Now that you've implemented your package, you can move on to [testing your integration](../standard_tests).

View File

@ -0,0 +1,145 @@
---
pagination_prev: contributing/how_to/integrations/standard_tests
pagination_next: null
---
# Publishing your package
Now that your package is implemented and tested, you can:
1. Publish your package to PyPi
2. Add documentation for your package to the LangChain Monorepo
## Publishing your package to PyPi
This guide assumes you have already implemented your package and written tests for it. If you haven't done that yet, please refer to the [implementation guide](../package) and the [testing guide](../standard_tests).
Note that Poetry is not required to publish a package to PyPi; we use it end-to-end in this guide for convenience.
You are welcome to publish your package using any other method you prefer.
First, make sure you have a PyPi account and have logged in with Poetry:
<details>
<summary>How to create a PyPi Token</summary>
1. Go to the [PyPi website](https://pypi.org/) and create an account.
2. Go to your account settings and enable 2FA. PyPi currently requires 2FA to be enabled before you can generate an API token.
3. Go to your account settings and [generate a new API token](https://pypi.org/manage/account/token/).
</details>
```bash
poetry config pypi-token.pypi <your-pypi-token>
```
Next, build your package:
```bash
poetry build
```
Finally, publish your package to PyPi:
```bash
poetry publish
```
You're all set! Your package is now available on PyPi and can be installed with `pip install langchain-parrot-link`.
## Adding documentation to the LangChain Monorepo
To add documentation for your package to the LangChain Monorepo, you will need to:
1. Fork and clone the LangChain Monorepo
2. Make a "Provider Page" at `docs/docs/integrations/providers/<your-package-name>.ipynb`
3. Make "Component Pages" at `docs/docs/integrations/<component-type>/<your-package-name>.ipynb`
4. Register your package in `libs/packages.yml`
5. Submit a PR with **only these changes** to the LangChain Monorepo
### Fork and clone the LangChain Monorepo
First, fork the [LangChain Monorepo](https://github.com/langchain-ai/langchain) to your GitHub account.
Next, clone the repository to your local machine:
```bash
git clone https://github.com/<your-username>/langchain.git
```
You're now ready to make your PR!
### Bootstrap your documentation pages with the langchain-cli (recommended)
To make it easier to create the necessary documentation pages, you can use the `langchain-cli` to bootstrap them for you.
First, install the latest version of the `langchain-cli` package:
```bash
pip install --upgrade langchain-cli
```
To see the available commands to bootstrap your documentation pages, run:
```bash
langchain-cli integration create-doc --help
```
Let's bootstrap a provider page from the root of the monorepo:
```bash
langchain-cli integration create-doc \
    --component-type Provider \
    --destination-dir docs/docs/integrations/providers \
    --name parrot-link \
    --name-class ParrotLink
```
And a chat model component page:
```bash
langchain-cli integration create-doc \
    --component-type ChatModel \
    --destination-dir docs/docs/integrations/chat \
    --name parrot-link \
    --name-class ParrotLink
```
And a vector store component page:
```bash
langchain-cli integration create-doc \
    --component-type VectorStore \
    --destination-dir docs/docs/integrations/vectorstores \
    --name parrot-link \
    --name-class ParrotLink
```
These commands will create the following 3 files, which you should fill out with information about your package:
- `docs/docs/integrations/providers/parrot-link.ipynb`
- `docs/docs/integrations/chat/parrot-link.ipynb`
- `docs/docs/integrations/vectorstores/parrot-link.ipynb`
### Manually create your documentation pages (if you prefer)
If you prefer to create the documentation pages manually, you can create the same files listed
above and fill them out with information about your package.
You can view the templates that the CLI uses to create these files [here](https://github.com/langchain-ai/langchain/tree/master/libs/cli/langchain_cli/integration_template/docs) if helpful!
### Register your package in `libs/packages.yml`
Finally, add your package to the `libs/packages.yml` file in the LangChain Monorepo.
```yaml
packages:
- name: langchain-parrot-link
repo: <your github handle>/<your repo>
path: .
```
For `path`, you can use `.` if your package is in the root of your repository, or specify a subdirectory (e.g. `libs/parrot-link`) if it is in a subdirectory.
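As an informal sanity check before opening the PR, an entry can be checked against the shape shown above. This is only a sketch: `validate_entry` is a hypothetical helper, the rules it applies are assumptions drawn from the example rather than the monorepo's own validation, and the `parrot-dev` owner is made up.

```python
import re


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in a packages.yml-style entry."""
    problems = []
    # All three keys from the example above are expected to be present.
    for key in ("name", "repo", "path"):
        if not entry.get(key):
            problems.append(f"missing '{key}'")
    # Assumption: `repo` is written as '<owner>/<repository>'.
    repo = entry.get("repo", "")
    if repo and not re.fullmatch(r"[\w.-]+/[\w.-]+", repo):
        problems.append("'repo' should look like '<owner>/<repository>'")
    # Assumption: `path` is relative to the repository root.
    path = entry.get("path", "")
    if path.startswith("/"):
        problems.append("'path' should be relative to the repository root")
    return problems


entry = {
    "name": "langchain-parrot-link",
    "repo": "parrot-dev/langchain-parrot-link",  # hypothetical owner/repo
    "path": ".",
}
print(validate_entry(entry))
```

An empty list means the entry matches the assumed shape; anything else lists what to fix.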
### Submit a PR with your changes
Once you have completed these steps, you can submit a PR to the LangChain Monorepo with **only these changes**.

View File

@ -4,6 +4,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"pagination_next: contributing/how_to/integrations/publish\n",
"pagination_prev: contributing/how_to/integrations/package\n",
"---\n",
"# How to add standard tests to an integration\n",
"\n",
"When creating either a custom class for yourself or a new tool to publish in a LangChain integration, it is important to add standard tests to ensure it works as expected. This guide will show you how to add standard tests to a tool, and you can **[Skip to the test templates](#standard-test-templates-per-component)** for implementing tests for each integration.\n",
@ -20,16 +24,29 @@
"Because added tests in new versions of `langchain-tests` can break your CI/CD pipelines, we recommend pinning the \n",
"version of `langchain-tests` to avoid unexpected changes.\n",
"\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain-core langchain-tests pytest pytest-socket"
":::\n",
"\n",
"import Tabs from '@theme/Tabs';\n",
"import TabItem from '@theme/TabItem';\n",
"\n",
"<Tabs>\n",
" <TabItem value=\"poetry\" label=\"Poetry\" default>\n",
"If you followed the [previous guide](../package), you should already have these dependencies installed!\n",
"\n",
"```bash\n",
"poetry add langchain-core\n",
"poetry add --group test pytest pytest-socket langchain-tests==<latest_version>\n",
"```\n",
" </TabItem>\n",
" <TabItem value=\"pip\" label=\"Pip\">\n",
"```bash\n",
"pip install -U langchain-core pytest pytest-socket langchain-tests\n",
"\n",
"# install current package in editable mode\n",
"pip install --editable .\n",
"```\n",
" </TabItem>\n",
"</Tabs>"
]
},
{
@ -176,13 +193,30 @@
"source": [
"and you would run these with the following commands from your project root\n",
"\n",
"<Tabs>\n",
" <TabItem value=\"poetry\" label=\"Poetry\" default>\n",
"\n",
"```bash\n",
"# run unit tests without network access\n",
"poetry run pytest --disable-socket --allow-unix-socket tests/unit_tests\n",
"\n",
"# run integration tests\n",
"poetry run pytest tests/integration_tests\n",
"```\n",
"\n",
" </TabItem>\n",
" <TabItem value=\"pip\" label=\"Pip\">\n",
"\n",
"```bash\n",
"# run unit tests without network access\n",
"pytest --disable-socket --allow-unix-socket tests/unit_tests\n",
"\n",
"# run integration tests\n",
"pytest tests/integration_tests\n",
"```"
"```\n",
"\n",
" </TabItem>\n",
"</Tabs>"
]
},
{

View File

@ -162,23 +162,21 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "25ba32e5-5a6d-49f4-bb68-911827b84d61",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from typing import Any, AsyncIterator, Dict, Iterator, List, Optional\n",
"from typing import Any, Dict, Iterator, List, Optional\n",
"\n",
"from langchain_core.callbacks import (\n",
" AsyncCallbackManagerForLLMRun,\n",
" CallbackManagerForLLMRun,\n",
")\n",
"from langchain_core.language_models import BaseChatModel, SimpleChatModel\n",
"from langchain_core.language_models import BaseChatModel\n",
"from langchain_core.messages import AIMessageChunk, BaseMessage, HumanMessage\n",
"from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult\n",
"from langchain_core.runnables import run_in_executor\n",
"\n",
"\n",
"class CustomChatModelAdvanced(BaseChatModel):\n",

View File

@ -140,6 +140,8 @@ TEMPLATE_MAP: dict[str, str] = {
"Retriever": "retrievers.ipynb",
}
_component_types_str = ", ".join(f"`{k}`" for k in TEMPLATE_MAP.keys())
@integration_cli.command()
def create_doc(
@ -170,8 +172,7 @@ def create_doc(
str,
typer.Option(
help=(
"The type of component. Currently only 'ChatModel', "
"'DocumentLoader', 'VectorStore' supported."
f"The type of component. Currently supported: {_component_types_str}."
),
),
] = "ChatModel",
@ -220,8 +221,7 @@ def create_doc(
docs_template = template_dir / TEMPLATE_MAP[component_type]
else:
raise ValueError(
f"Unrecognized {component_type=}. Expected one of 'ChatModel', "
f"'DocumentLoader', 'Tool'."
f"Unrecognized {component_type=}. Expected one of {_component_types_str}."
)
shutil.copy(docs_template, destination_path)