Support Databricks in SQLDatabase (#4702)

This PR adds support for Databricks runtime and Databricks SQL by using
[Databricks SQL Connector for
Python](https://docs.databricks.com/dev-tools/python-sql-connector.html).
As a cloud data platform, accessing Databricks requires a URL as follows

`databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`.

**The URL is **complicated** and it may take users a while to figure it
out**. Since the fields `api_token`/`hostname`/`http_path` fields are
known in the Databricks notebook, I am proposing a new method
`from_databricks` to simplify the connection to Databricks.

## In Databricks Notebook
After changes, Databricks users only need to specify the `catalog` and
`schema` field when using langchain.
<img width="881" alt="image"
src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37">

## In Jupyter Notebook
The method can be used on the local setup as well:
<img width="678" alt="image"
src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">
This commit is contained in:
Gengliang Wang
2023-05-19 00:42:06 -07:00
committed by GitHub
parent 88a3a56c1a
commit bf5a3c6dec
2 changed files with 102 additions and 1 deletions

View File

@@ -34,7 +34,7 @@
}
},
"source": [
"Under the hood, LangChain uses SQLAlchemy to connect to SQL databases. The `SQLDatabaseChain` can therefore be used with any SQL dialect supported by SQLAlchemy, such as MS SQL, MySQL, MariaDB, PostgreSQL, Oracle SQL, and SQLite. Please refer to the SQLAlchemy documentation for more information about requirements for connecting to your database. For example, a connection to MySQL requires an appropriate connector such as PyMySQL. A URI for a MySQL connection might look like: `mysql+pymysql://user:pass@some_mysql_db_address/db_name`\n",
"Under the hood, LangChain uses SQLAlchemy to connect to SQL databases. The `SQLDatabaseChain` can therefore be used with any SQL dialect supported by SQLAlchemy, such as MS SQL, MySQL, MariaDB, PostgreSQL, Oracle SQL, Databricks and SQLite. Please refer to the SQLAlchemy documentation for more information about requirements for connecting to your database. For example, a connection to MySQL requires an appropriate connector such as PyMySQL. A URI for a MySQL connection might look like: `mysql+pymysql://user:pass@some_mysql_db_address/db_name`. To connect to Databricks, it is recommended to use the handy method `SQLDatabase.from_databricks(catalog, schema, host, api_token, (warehouse_id|cluster_id))`.\n",
"\n",
"This demonstration uses SQLite and the example Chinook database.\n",
"To set it up, follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository."