mirror of https://github.com/hwchase17/langchain.git
synced 2025-04-27 19:46:55 +00:00
doc: Implement langchain-xinference (#30296)
- [ ] **PR title**: Implement langchain-xinference
- [ ] **PR message**: Implement a standalone package for Xinference chat models and LLM models. https://github.com/langchain-ai/langchain/issues/30045#issue-2887214214
This commit is contained in:
parent 492b4c1604
commit 251551ccf1
245 docs/docs/integrations/chat/xinference.ipynb Normal file
@@ -0,0 +1,245 @@
{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "---\n",
    "sidebar_label: Xinference\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01ca90da",
   "metadata": {},
   "source": [
    "# ChatXinference\n",
    "\n",
    "[Xinference](https://github.com/xorbitsai/inference) is a powerful and versatile library designed to serve LLMs,\n",
    "speech recognition models, and multimodal models, even on your laptop. It supports a variety of models compatible with GGML, such as chatglm, baichuan, whisper, vicuna, orca, and many others.\n",
    "\n",
    "## Overview\n",
    "### Integration details\n",
    "\n",
    "| Class | Package | Local | Serializable | [JS support] | Package downloads | Package latest |\n",
    "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
    "| ChatXinference | langchain-xinference | ✅ | ❌ | ✅ | ✅ | ✅ |\n",
    "\n",
    "### Model features\n",
    "| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n",
    "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n",
    "| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |\n",
    "\n",
    "## Setup\n",
    "\n",
    "Install `Xinference` through PyPI:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install --upgrade --quiet \"xinference[all]\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7fb27b941602401d91542211134fc71a",
   "metadata": {},
   "source": [
    "### Deploy Xinference Locally or in a Distributed Cluster\n",
    "\n",
    "For local deployment, run `xinference`.\n",
    "\n",
    "To deploy Xinference in a cluster, first start an Xinference supervisor using `xinference-supervisor`. You can also use the option `-p` to specify the port and `-H` to specify the host. The default port is 8080 and the default host is 0.0.0.0.\n",
    "\n",
    "Then, start the Xinference workers using `xinference-worker` on each server you want to run them on.\n",
    "\n",
    "You can consult the README file from [Xinference](https://github.com/xorbitsai/inference) for more information."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "wrapper-intro",
   "metadata": {},
   "source": [
    "### Wrapper\n",
    "\n",
    "To use Xinference with LangChain, you first need to launch a model. You can use the command line interface (CLI) to do so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "acae54e37e7d407bbb7b55eff062a284",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model uid: 7167b2b0-2a04-11ee-83f0-d29396a3f064\n"
     ]
    }
   ],
   "source": [
    "%xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a63283cbaf04dbcab1f6479b197f3a8",
   "metadata": {},
   "source": [
    "A model UID is returned for you to use. Now you can use Xinference with LangChain:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8dd0d8092fe74a7c96281538738b07e2",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "The LangChain Xinference integration lives in the `langchain-xinference` package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72eea5119410473aa328ad9291626812",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -qU langchain-xinference"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8edb47106e1a46a883d545849b8ab81b",
   "metadata": {},
   "source": [
    "Make sure you're using the latest Xinference version for structured outputs."
   ]
  },
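  {
   "cell_type": "markdown",
   "id": "structured-output-note",
   "metadata": {},
   "source": [
    "As a sketch of that structured-output support, assuming `ChatXinference` exposes LangChain's standard `with_structured_output` method (the schema and prompt below are illustrative, not from the package docs):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "structured-output-example",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pydantic import BaseModel\n",
    "\n",
    "from langchain_xinference.chat_models import ChatXinference\n",
    "\n",
    "\n",
    "class City(BaseModel):\n",
    "    \"\"\"Illustrative schema: a city and one of its landmarks.\"\"\"\n",
    "\n",
    "    name: str\n",
    "    landmark: str\n",
    "\n",
    "\n",
    "llm = ChatXinference(\n",
    "    server_url=\"your_server_url\", model_uid=\"7167b2b0-2a04-11ee-83f0-d29396a3f064\"\n",
    ")\n",
    "\n",
    "# Returns a City instance parsed from the model's output.\n",
    "llm.with_structured_output(City).invoke(\"Name the capital of France and one landmark there.\")"
   ]
  },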
  {
   "cell_type": "markdown",
   "id": "55935996",
   "metadata": {},
   "source": [
    "## Instantiation\n",
    "\n",
    "Now we can instantiate our model object and generate chat completions:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "10185d26023b46108eb7d9f57d49d2b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_xinference.chat_models import ChatXinference\n",
    "\n",
    "llm = ChatXinference(\n",
    "    server_url=\"your_server_url\", model_uid=\"7167b2b0-2a04-11ee-83f0-d29396a3f064\"\n",
    ")\n",
    "\n",
    "llm.invoke(\n",
    "    \"Q: What can we visit in the capital of France?\",\n",
    "    config={\"max_tokens\": 1024},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8763a12b2bbd4a93a75aff182afb95dc",
   "metadata": {},
   "source": [
    "## Invocation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "50f6c0e4",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain_core.messages import HumanMessage, SystemMessage\n",
    "from langchain_xinference.chat_models import ChatXinference\n",
    "\n",
    "llm = ChatXinference(\n",
    "    server_url=\"your_server_url\", model_uid=\"7167b2b0-2a04-11ee-83f0-d29396a3f064\"\n",
    ")\n",
    "\n",
    "system_message = \"You are a helpful assistant that translates English to French. Translate the user sentence.\"\n",
    "human_message = \"I love programming.\"\n",
    "\n",
    "# The system message comes first so it frames the user turn.\n",
    "llm.invoke([SystemMessage(content=system_message), HumanMessage(content=human_message)])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7623eae2785240b9bd12b16a66d81610",
   "metadata": {},
   "source": [
    "## Chaining\n",
    "\n",
    "We can [chain](/docs/how_to/sequence/) our model with a prompt template like so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "317d05c6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.prompts import PromptTemplate\n",
    "from langchain_xinference.chat_models import ChatXinference\n",
    "\n",
    "prompt = PromptTemplate(\n",
    "    input_variables=[\"country\"],\n",
    "    template=\"Q: What can we visit in the capital of {country}? A:\",\n",
    ")\n",
    "\n",
    "llm = ChatXinference(\n",
    "    server_url=\"your_server_url\", model_uid=\"7167b2b0-2a04-11ee-83f0-d29396a3f064\"\n",
    ")\n",
    "\n",
    "chain = prompt | llm\n",
    "chain.invoke(input={\"country\": \"France\"})\n",
    "\n",
    "# stream() returns a generator of chunks; iterate to consume it.\n",
    "for chunk in chain.stream(input={\"country\": \"France\"}):\n",
    "    print(chunk.content, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66f7fb26",
   "metadata": {},
   "source": [
    "## API reference\n",
    "\n",
    "For detailed documentation of all ChatXinference features and configurations, head to the API reference: https://github.com/TheSongg/langchain-xinference"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
@@ -99,4 +99,23 @@ For more information and detailed examples, refer to the
Xinference also supports embedding queries and documents. See
[example for xinference embeddings](/docs/integrations/text_embedding/xinference)
for a more detailed demo.
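
For a quick taste, a minimal sketch using the community embeddings wrapper (assuming the `XinferenceEmbeddings` class from `langchain_community`; the server URL and model UID are placeholders):

```python
from langchain_community.embeddings import XinferenceEmbeddings

embeddings = XinferenceEmbeddings(
    server_url="your_server_url", model_uid="your_embedding_model_uid"
)

# Embed a single query string and a small batch of documents.
query_vector = embeddings.embed_query("What is Xinference?")
doc_vectors = embeddings.embed_documents(
    ["Xinference serves LLMs.", "It also serves embedding models."]
)
```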

### Xinference LangChain partner package install

Install the integration package with:

```bash
pip install langchain-xinference
```

## Chat Models

```python
from langchain_xinference.chat_models import ChatXinference
```
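
A minimal usage sketch (the `server_url` and `model_uid` values are placeholders for your own deployment):

```python
from langchain_xinference.chat_models import ChatXinference

llm = ChatXinference(server_url="your_server_url", model_uid="your_model_uid")
llm.invoke("What can we visit in the capital of France?")
```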

## LLM

```python
from langchain_xinference.llms import Xinference
```
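
A minimal usage sketch, assuming the `Xinference` LLM accepts the same `server_url` and `model_uid` parameters as the chat model (placeholders shown):

```python
from langchain_xinference.llms import Xinference

llm = Xinference(server_url="your_server_url", model_uid="your_model_uid")
llm.invoke("Q: What can we visit in the capital of France? A:")
```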
@@ -516,3 +516,7 @@ packages:
  - name: langchain-agentql
    path: langchain
    repo: tinyfish-io/agentql-integrations
  - name: langchain-xinference
    path: .
    repo: TheSongg/langchain-xinference