diff --git a/docs/extras/ecosystem/integrations/databricks.md b/docs/extras/ecosystem/integrations/databricks.md
index 71969699273..8dd3bf3d4c3 100644
--- a/docs/extras/ecosystem/integrations/databricks.md
+++ b/docs/extras/ecosystem/integrations/databricks.md
@@ -6,22 +6,28 @@ The [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, a
 Databricks embraces the LangChain ecosystem in various ways:
 
 1. Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
-2. Databricks-managed MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
-3. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
-4. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
+2. Databricks MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
+3. Databricks MLflow AI Gateway: Querying LLM providers through a unified gateway endpoint
+4. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query them as langchain.llms.Databricks
+5. Databricks Dolly: Databricks open-sourced Dolly, which allows for commercial use and can be accessed through the Hugging Face Hub
 
 Databricks connector for the SQLDatabase Chain
 ----------------------------------------------
 You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain. See the notebook [Connect to Databricks](/docs/ecosystem/integrations/databricks/databricks.html) for details.
 
-Databricks-managed MLflow integrates with LangChain
----------------------------------------------------
+Databricks MLflow integrates with LangChain
+-------------------------------------------
 MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. See the notebook [MLflow Callback Handler](/docs/ecosystem/integrations/mlflow_tracking.ipynb) for details about MLflow's integration with LangChain.
 
 Databricks provides a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Databricks offers an integrated experience for tracking and securing machine learning model training runs and running machine learning projects. See [MLflow guide](https://docs.databricks.com/mlflow/index.html) for more details.
 
-Databricks-managed MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
+Databricks MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking URI. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
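+
+As a minimal sketch (assuming an existing LangChain chain object named `chain`), logging and reloading a chain with the MLflow langchain flavor looks like this:
+
+```python
+import mlflow
+
+# Log the chain with the MLflow langchain flavor; `chain` is assumed
+# to be an existing chain such as an LLMChain
+with mlflow.start_run():
+    model_info = mlflow.langchain.log_model(chain, "model")
+
+# Load it back as a generic pyfunc model for serving or batch scoring
+model = mlflow.pyfunc.load_model(model_info.model_uri)
+```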
+
+Databricks MLflow AI Gateway
+----------------------------
+
+See [MLflow AI Gateway](/docs/ecosystem/integrations/mlflow_ai_gateway).
 
 Databricks as an LLM provider
 -----------------------------
diff --git a/docs/extras/ecosystem/integrations/mlflow_ai_gateway.mdx b/docs/extras/ecosystem/integrations/mlflow_ai_gateway.mdx
new file mode 100644
index 00000000000..7eb9f7869e9
--- /dev/null
+++ b/docs/extras/ecosystem/integrations/mlflow_ai_gateway.mdx
@@ -0,0 +1,116 @@
+# MLflow AI Gateway
+
+The MLflow AI Gateway service streamlines the usage and management of large language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface that simplifies interaction with these services by providing a unified endpoint to handle LLM-related requests. See [the MLflow AI Gateway documentation](https://mlflow.org/docs/latest/gateway/index.html) for more details.
+
+## Installation and Setup
+
+Install `mlflow` with MLflow AI Gateway dependencies:
+
+```sh
+pip install 'mlflow[gateway]'
+```
+
+Set the OpenAI API key as an environment variable:
+
+```sh
+export OPENAI_API_KEY=...
+```
+
+Create a configuration file:
+
+```yaml
+routes:
+  - name: completions
+    type: llm/v1/completions
+    model:
+      provider: openai
+      name: text-davinci-003
+      config:
+        openai_api_key: $OPENAI_API_KEY
+
+  - name: embeddings
+    type: llm/v1/embeddings
+    model:
+      provider: openai
+      name: text-embedding-ada-002
+      config:
+        openai_api_key: $OPENAI_API_KEY
+```
+
+Start the Gateway server:
+
+```sh
+mlflow gateway start --config-path /path/to/config.yaml
+```
+
+## Completions Example
+
+```python
+import mlflow
+from langchain import LLMChain, PromptTemplate
+from langchain.llms import MlflowAIGateway
+
+gateway = MlflowAIGateway(
+    gateway_uri="http://127.0.0.1:5000",
+    route="completions",
+    params={
+        "temperature": 0.0,
+        "top_p": 0.1,
+    },
+)
+
+llm_chain = LLMChain(
+    llm=gateway,
+    prompt=PromptTemplate(
+        input_variables=["adjective"],
+        template="Tell me a {adjective} joke",
+    ),
+)
+result = llm_chain.run(adjective="funny")
+print(result)
+
+with mlflow.start_run():
+    model_info = mlflow.langchain.log_model(llm_chain, "model")
+
+model = mlflow.pyfunc.load_model(model_info.model_uri)
+print(model.predict([{"adjective": "funny"}]))
+```
+
+## Embeddings Example
+
+```python
+from langchain.embeddings import MlflowAIGatewayEmbeddings
+
+embeddings = MlflowAIGatewayEmbeddings(
+    gateway_uri="http://127.0.0.1:5000",
+    route="embeddings",
+)
+
+print(embeddings.embed_query("hello"))
+print(embeddings.embed_documents(["hello"]))
+```
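+
+Both wrappers are thin layers over the MLflow gateway client. As a rough sketch (assuming the gateway server from the setup above is running), you can also query a route directly:
+
+```python
+import mlflow.gateway
+
+mlflow.gateway.set_gateway_uri("http://127.0.0.1:5000")
+
+# Query the completions route directly; the response follows the
+# gateway's llm/v1/completions schema
+response = mlflow.gateway.query("completions", data={"prompt": "Tell me a joke"})
+print(response["candidates"][0]["text"])
+```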
+
+## Databricks MLflow AI Gateway
+
+Databricks MLflow AI Gateway is in private preview.
+Please contact a Databricks representative to enroll in the preview.
+
+```python
+from langchain import LLMChain, PromptTemplate
+from langchain.llms import MlflowAIGateway
+
+gateway = MlflowAIGateway(
+    gateway_uri="databricks",
+    route="completions",
+)
+
+llm_chain = LLMChain(
+    llm=gateway,
+    prompt=PromptTemplate(
+        input_variables=["adjective"],
+        template="Tell me a {adjective} joke",
+    ),
+)
+result = llm_chain.run(adjective="funny")
+print(result)
+```
diff --git a/langchain/embeddings/__init__.py b/langchain/embeddings/__init__.py
index 4eab874e7b0..2a806fad2cd 100644
--- a/langchain/embeddings/__init__.py
+++ b/langchain/embeddings/__init__.py
@@ -24,6 +24,7 @@ from langchain.embeddings.huggingface_hub import HuggingFaceHubEmbeddings
 from langchain.embeddings.jina import JinaEmbeddings
 from langchain.embeddings.llamacpp import LlamaCppEmbeddings
 from langchain.embeddings.minimax import MiniMaxEmbeddings
+from langchain.embeddings.mlflow_gateway import MlflowAIGatewayEmbeddings
 from langchain.embeddings.modelscope_hub import ModelScopeEmbeddings
 from langchain.embeddings.mosaicml import MosaicMLInstructorEmbeddings
 from langchain.embeddings.octoai_embeddings import OctoAIEmbeddings
@@ -50,6 +51,7 @@ __all__ = [
     "JinaEmbeddings",
     "LlamaCppEmbeddings",
     "HuggingFaceHubEmbeddings",
+    "MlflowAIGatewayEmbeddings",
    "ModelScopeEmbeddings",
     "TensorflowHubEmbeddings",
     "SagemakerEndpointEmbeddings",
diff --git a/langchain/embeddings/mlflow_gateway.py b/langchain/embeddings/mlflow_gateway.py
new file mode 100644
index 00000000000..e3a647f5ba2
--- /dev/null
+++ b/langchain/embeddings/mlflow_gateway.py
@@ -0,0 +1,51 @@
+from __future__ import annotations
+
+from typing import Any, Iterator, List, Optional
+
+from pydantic import BaseModel
+
+from langchain.embeddings.base import Embeddings
+
+
+def _chunk(texts: List[str], size: int) -> Iterator[List[str]]:
+    """Yield batches of at most `size` texts."""
+    for i in range(0, len(texts), size):
+        yield texts[i : i + size]
+
+
+class MlflowAIGatewayEmbeddings(Embeddings, BaseModel):
+    """Embeddings served through an MLflow AI Gateway embeddings route."""
+
+    route: str
+    gateway_uri: Optional[str] = None
+
+    def __init__(self, **kwargs: Any):
+        try:
+            import mlflow.gateway
+        except ImportError as e:
+            raise ImportError(
+                "Could not import `mlflow.gateway` module. "
+                "Please install it with `pip install 'mlflow[gateway]'`."
+            ) from e
+
+        super().__init__(**kwargs)
+        if self.gateway_uri:
+            mlflow.gateway.set_gateway_uri(self.gateway_uri)
+
+    def _query(self, texts: List[str]) -> List[List[float]]:
+        try:
+            import mlflow.gateway
+        except ImportError as e:
+            raise ImportError(
+                "Could not import `mlflow.gateway` module. "
+                "Please install it with `pip install 'mlflow[gateway]'`."
+            ) from e
+
+        embeddings = []
+        for txt in _chunk(texts, 20):
+            resp = mlflow.gateway.query(self.route, data={"text": txt})
+            # Flatten the per-batch results into one list of vectors
+            embeddings.extend(resp["embeddings"])
+        return embeddings
+
+    def embed_documents(self, texts: List[str]) -> List[List[float]]:
+        return self._query(texts)
+
+    def embed_query(self, text: str) -> List[float]:
+        return self._query([text])[0]
diff --git a/langchain/llms/__init__.py b/langchain/llms/__init__.py
index 1e32346f26b..1782a34cc76 100644
--- a/langchain/llms/__init__.py
+++ b/langchain/llms/__init__.py
@@ -33,6 +33,7 @@ from langchain.llms.human import HumanInputLLM
 from langchain.llms.koboldai import KoboldApiLLM
 from langchain.llms.llamacpp import LlamaCpp
 from langchain.llms.manifest import ManifestWrapper
+from langchain.llms.mlflow_ai_gateway import MlflowAIGateway
 from langchain.llms.modal import Modal
 from langchain.llms.mosaicml import MosaicML
 from langchain.llms.nlpcloud import NLPCloud
@@ -89,6 +90,7 @@ __all__ = [
     "LlamaCpp",
     "TextGen",
     "ManifestWrapper",
+    "MlflowAIGateway",
     "Modal",
     "MosaicML",
     "NLPCloud",
@@ -146,6 +148,7 @@ type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
     "koboldai": KoboldApiLLM,
     "llamacpp": LlamaCpp,
     "textgen": TextGen,
+    "mlflow-gateway": MlflowAIGateway,
     "modal": Modal,
     "mosaic": MosaicML,
     "nlpcloud": NLPCloud,
diff --git a/langchain/llms/mlflow_ai_gateway.py b/langchain/llms/mlflow_ai_gateway.py
new file mode 100644
index 00000000000..1cf2b5221d7
--- /dev/null
+++ b/langchain/llms/mlflow_ai_gateway.py
@@ -0,0 +1,75 @@
+from __future__ import annotations
+
+from typing import Any, Dict, List, Mapping, Optional
+
+from pydantic import BaseModel, Extra
+
+from langchain.callbacks.manager import CallbackManagerForLLMRun
+from langchain.llms.base import LLM
+
+
+class Params(BaseModel, extra=Extra.allow):
+    """Parameters passed through to the gateway route on each query."""
+
+    temperature: float = 0.0
+    candidate_count: int = 1
+    stop: Optional[List[str]] = None
+    max_tokens: Optional[int] = None
+
+
+class MlflowAIGateway(LLM):
+    """LLM wrapper around a completions route of an MLflow AI Gateway."""
+
+    route: str
+    gateway_uri: Optional[str] = None
+    params: Optional[Params] = None
+
+    def __init__(self, **kwargs: Any):
+        try:
+            import mlflow.gateway
+        except ImportError as e:
+            raise ImportError(
+                "Could not import `mlflow.gateway` module. "
+                "Please install it with `pip install 'mlflow[gateway]'`."
+            ) from e
+
+        super().__init__(**kwargs)
+        if self.gateway_uri:
+            mlflow.gateway.set_gateway_uri(self.gateway_uri)
+
+    @property
+    def _default_params(self) -> Dict[str, Any]:
+        params: Dict[str, Any] = {
+            "gateway_uri": self.gateway_uri,
+            "route": self.route,
+            **(self.params.dict() if self.params else {}),
+        }
+        return params
+
+    @property
+    def _identifying_params(self) -> Mapping[str, Any]:
+        return self._default_params
+
+    def _call(
+        self,
+        prompt: str,
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> str:
+        try:
+            import mlflow.gateway
+        except ImportError as e:
+            raise ImportError(
+                "Could not import `mlflow.gateway` module. "
+                "Please install it with `pip install 'mlflow[gateway]'`."
+            ) from e
+
+        data: Dict[str, Any] = {
+            "prompt": prompt,
+            **(self.params.dict() if self.params else {}),
+        }
+        # An explicit `stop` argument overrides any stop sequences in `params`
+        if s := (stop or (self.params.stop if self.params else None)):
+            data["stop"] = s
+        resp = mlflow.gateway.query(self.route, data=data)
+        return resp["candidates"][0]["text"]
+
+    @property
+    def _llm_type(self) -> str:
+        return "mlflow-ai-gateway"