{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Databricks\n",
    "\n",
    "The [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, analytics, and AI on one platform.\n",
    "\n",
    "This example notebook shows how to wrap Databricks endpoints as LLMs in LangChain.\n",
    "It supports two endpoint types:\n",
    "\n",
    "* Serving endpoint, recommended for production and development,\n",
    "* Cluster driver proxy app, recommended for interactive development."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "`mlflow >= 2.9` is required to run the code in this notebook. If it's not installed, install it with this command (the quotes keep the shell from treating `>` as a redirection):\n",
    "\n",
    "```\n",
    "pip install \"mlflow>=2.9\"\n",
    "```\n",
    "\n",
    "We also need `dbutils` for this example.\n",
    "\n",
    "```\n",
    "pip install dbutils\n",
    "```\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrapping a serving endpoint: External model\n",
    "\n",
    "Prerequisite: Register an OpenAI API key as a secret:\n",
    "\n",
    "  ```bash\n",
    "  databricks secrets create-scope <scope>\n",
    "  databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY\n",
    "  ```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following code creates a new serving endpoint with OpenAI's GPT-4 model for chat and generates a response using the endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "content='Hello! How can I assist you today?'\n"
     ]
    }
   ],
   "source": [
    "from langchain_community.chat_models import ChatDatabricks\n",
    "from langchain_core.messages import HumanMessage\n",
    "from mlflow.deployments import get_deploy_client\n",
    "\n",
    "client = get_deploy_client(\"databricks\")\n",
    "\n",
    "secret = \"secrets/<scope>/openai-api-key\"  # replace `<scope>` with your scope\n",
    "name = \"my-chat\"  # rename this if my-chat already exists\n",
    "client.create_endpoint(\n",
    "    name=name,\n",
    "    config={\n",
    "        \"served_entities\": [\n",
    "            {\n",
    "                \"name\": \"my-chat\",\n",
    "                \"external_model\": {\n",
    "                    \"name\": \"gpt-4\",\n",
    "                    \"provider\": \"openai\",\n",
    "                    \"task\": \"llm/v1/chat\",\n",
    "                    \"openai_config\": {\n",
    "                        \"openai_api_key\": \"{{\" + secret + \"}}\",\n",
    "                    },\n",
    "                },\n",
    "            }\n",
    "        ],\n",
    "    },\n",
    ")\n",
    "\n",
    "chat = ChatDatabricks(\n",
    "    target_uri=\"databricks\",\n",
    "    endpoint=name,\n",
    "    temperature=0.1,\n",
    ")\n",
    "chat([HumanMessage(content=\"hello\")])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrapping a serving endpoint: Foundation model\n",
    "\n",
    "The following code uses the `databricks-bge-large-en` serving endpoint (no endpoint creation is required) to generate embeddings from input text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0.051055908203125, 0.007221221923828125, 0.003879547119140625]\n"
     ]
    }
   ],
   "source": [
    "from langchain_community.embeddings import DatabricksEmbeddings\n",
    "\n",
    "embeddings = DatabricksEmbeddings(endpoint=\"databricks-bge-large-en\")\n",
    "embeddings.embed_query(\"hello\")[:3]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrapping a serving endpoint: Custom model\n",
    "\n",
    "Prerequisites:\n",
    "\n",
    "* An LLM was registered and deployed to [a Databricks serving endpoint](https://docs.databricks.com/machine-learning/model-serving/index.html).\n",
    "* You have [\"Can Query\" permission](https://docs.databricks.com/security/auth-authz/access-control/serving-endpoint-acl.html) to the endpoint.\n",
    "\n",
    "The expected MLflow model signature is:\n",
    "\n",
    "  * inputs: `[{\"name\": \"prompt\", \"type\": \"string\"}, {\"name\": \"stop\", \"type\": \"list[string]\"}]`\n",
    "  * outputs: `[{\"type\": \"string\"}]`\n",
    "\n",
    "If the model signature is incompatible or you want to insert extra configs, you can set `transform_input_fn` and `transform_output_fn` accordingly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I am happy to hear that you are in good health and as always, you are appreciated.'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain_community.llms import Databricks\n",
    "\n",
    "# If running a Databricks notebook attached to an interactive cluster in \"single user\"\n",
    "# or \"no isolation shared\" mode, you only need to specify the endpoint name to create\n",
    "# a `Databricks` instance to query a serving endpoint in the same workspace.\n",
    "llm = Databricks(endpoint_name=\"dolly\")\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Good'"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "llm(\"How are you?\", stop=[\".\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I am fine. Thank you!'"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Otherwise, you can manually specify the Databricks workspace hostname and personal access token,\n",
    "# or set the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, respectively.\n",
    "# See https://docs.databricks.com/dev-tools/auth.html#databricks-personal-access-tokens\n",
    "# We strongly recommend not exposing the API token explicitly inside a notebook.\n",
    "# You can use the Databricks secret manager to store your API token securely.\n",
    "# See https://docs.databricks.com/dev-tools/databricks-utils.html#secrets-utility-dbutilssecrets\n",
    "\n",
    "import os\n",
    "\n",
    "import dbutils\n",
    "\n",
    "os.environ[\"DATABRICKS_TOKEN\"] = dbutils.secrets.get(\"myworkspace\", \"api_token\")\n",
    "\n",
    "llm = Databricks(host=\"myworkspace.cloud.databricks.com\", endpoint_name=\"dolly\")\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I am fine.'"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# If the serving endpoint accepts extra parameters like `temperature`,\n",
    "# you can set them in `model_kwargs`.\n",
    "llm = Databricks(endpoint_name=\"dolly\", model_kwargs={\"temperature\": 0.1})\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I’m Excellent. You?'"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use `transform_input_fn` and `transform_output_fn` if the serving endpoint\n",
    "# expects a different input schema and does not return a JSON string,\n",
    "# respectively, or you want to apply a prompt template on top.\n",
    "\n",
    "\n",
    "def transform_input(**request):\n",
    "    full_prompt = f\"\"\"{request[\"prompt\"]}\n",
    "    Be Concise.\n",
    "    \"\"\"\n",
    "    request[\"prompt\"] = full_prompt\n",
    "    return request\n",
    "\n",
    "\n",
    "llm = Databricks(endpoint_name=\"dolly\", transform_input_fn=transform_input)\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Wrapping a cluster driver proxy app\n",
    "\n",
    "Prerequisites:\n",
    "\n",
    "* An LLM loaded on a Databricks interactive cluster in \"single user\" or \"no isolation shared\" mode.\n",
    "* A local HTTP server running on the driver node to serve the model at `\"/\"` using HTTP POST with JSON input/output.\n",
    "* The server uses a port number between `3000` and `8000` and listens on the driver IP address (or simply `0.0.0.0`), not localhost only.\n",
    "* You have \"Can Attach To\" permission to the cluster.\n",
    "\n",
    "The expected server schema (using JSON schema) is:\n",
    "\n",
    "* inputs:\n",
    "  ```json\n",
    "  {\"type\": \"object\",\n",
    "   \"properties\": {\n",
    "     \"prompt\": {\"type\": \"string\"},\n",
    "     \"stop\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}}},\n",
    "   \"required\": [\"prompt\"]}\n",
    "  ```\n",
    "* outputs: `{\"type\": \"string\"}`\n",
    "\n",
    "If the server schema is incompatible or you want to insert extra configs, you can use `transform_input_fn` and `transform_output_fn` accordingly.\n",
    "\n",
    "The following is a minimal example for running a driver proxy app to serve an LLM:\n",
    "\n",
    "```python\n",
    "from flask import Flask, request, jsonify\n",
    "import torch\n",
    "from transformers import pipeline, AutoTokenizer, StoppingCriteria\n",
    "\n",
    "model = \"databricks/dolly-v2-3b\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model, padding_side=\"left\")\n",
    "dolly = pipeline(model=model, tokenizer=tokenizer, trust_remote_code=True, device_map=\"auto\")\n",
    "device = dolly.device\n",
    "\n",
    "\n",
    "class CheckStop(StoppingCriteria):\n",
    "    def __init__(self, stop=None):\n",
    "        super().__init__()\n",
    "        self.stop = stop or []\n",
    "        self.matched = \"\"\n",
    "        self.stop_ids = [tokenizer.encode(s, return_tensors='pt').to(device) for s in self.stop]\n",
    "\n",
    "    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs):\n",
    "        for i, s in enumerate(self.stop_ids):\n",
    "            if torch.all((s == input_ids[0][-s.shape[1]:])).item():\n",
    "                self.matched = self.stop[i]\n",
    "                return True\n",
    "        return False\n",
    "\n",
    "\n",
    "def llm(prompt, stop=None, **kwargs):\n",
    "    check_stop = CheckStop(stop)\n",
    "    result = dolly(prompt, stopping_criteria=[check_stop], **kwargs)\n",
    "    return result[0][\"generated_text\"].rstrip(check_stop.matched)\n",
    "\n",
    "\n",
    "app = Flask(\"dolly\")\n",
    "\n",
    "\n",
    "@app.route('/', methods=['POST'])\n",
    "def serve_llm():\n",
    "    resp = llm(**request.json)\n",
    "    return jsonify(resp)\n",
    "\n",
    "\n",
    "app.run(host=\"0.0.0.0\", port=7777)\n",
    "```\n",
    "\n",
    "Once the server is running, you can create a `Databricks` instance to wrap it as an LLM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Hello, thank you for asking. It is wonderful to hear that you are well.'"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# If running a Databricks notebook attached to the same cluster that runs the app,\n",
    "# you only need to specify the driver port to create a `Databricks` instance.\n",
    "llm = Databricks(cluster_driver_port=\"7777\")\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I am well. You?'"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Otherwise, you can manually specify the cluster ID to use,\n",
    "# as well as the Databricks workspace hostname and personal access token.\n",
    "\n",
    "llm = Databricks(cluster_id=\"0000-000000-xxxxxxxx\", cluster_driver_port=\"7777\")\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I am very well. It is a pleasure to meet you.'"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# If the app accepts extra parameters like `temperature`,\n",
    "# you can set them in `model_kwargs`.\n",
    "llm = Databricks(cluster_driver_port=\"7777\", model_kwargs={\"temperature\": 0.1})\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'I AM DOING GREAT THANK YOU.'"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use `transform_input_fn` and `transform_output_fn` if the app\n",
    "# expects a different input schema and does not return a JSON string,\n",
    "# respectively, or you want to apply a prompt template on top.\n",
    "\n",
    "\n",
    "def transform_input(**request):\n",
    "    full_prompt = f\"\"\"{request[\"prompt\"]}\n",
    "    Be Concise.\n",
    "    \"\"\"\n",
    "    request[\"prompt\"] = full_prompt\n",
    "    return request\n",
    "\n",
    "\n",
    "def transform_output(response):\n",
    "    return response.upper()\n",
    "\n",
    "\n",
    "llm = Databricks(\n",
    "    cluster_driver_port=\"7777\",\n",
    "    transform_input_fn=transform_input,\n",
    "    transform_output_fn=transform_output,\n",
    ")\n",
    "\n",
    "llm(\"How are you?\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}