22 Commits

Author SHA1 Message Date
Jon Luo
0a1b1806e9 sql: do not hard code the LIMIT clause in the table_info section (#1563)
Seeing a lot of issues in Discord in which the LLM is not using the
correct LIMIT clause for different SQL dialects, i.e., it's using `LIMIT`
for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc.
I think this could be due to us specifying the LIMIT statement in the
example rows portion of `table_info`, so the LLM sees the `LIMIT`
statement used in the prompt.
Since we can't specify each dialect's method here, I think it's fine to
just replace the `SELECT... LIMIT 3;` statement with `3 rows from
table_name table:`, and wrap everything in a block comment directly
following the `CREATE` statement. The Rajkumar et al. paper wrapped the
example rows and `SELECT` statement in a block comment as well anyway.
Thoughts @fpingham?
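
A minimal sketch of the proposed layout, assuming a hypothetical helper `format_table_info` and made-up table data (neither is taken from the library): the sample rows follow the `CREATE` statement inside a block comment, with no dialect-specific `SELECT ... LIMIT` text in the prompt.

```python
def format_table_info(create_stmt, table, columns, rows):
    """Hypothetical helper: append sample rows to the DDL inside a block comment."""
    header = "\t".join(columns)
    body = "\n".join("\t".join(str(value) for value in row) for row in rows)
    return (
        f"{create_stmt.rstrip()}\n"
        "/*\n"
        f"{len(rows)} rows from {table} table:\n"
        f"{header}\n"
        f"{body}\n"
        "*/"
    )


# Illustrative data only.
info = format_table_info(
    'CREATE TABLE "Artist" ("ArtistId" INTEGER, "Name" TEXT)',
    "Artist",
    ["ArtistId", "Name"],
    [(1, "AC/DC"), (2, "Accept"), (3, "Aerosmith")],
)
print(info)
```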
2023-03-13 23:08:27 -07:00
Jon Luo
882f7964fb fix sql misinterpretation of % in query (#1408)
`%` is being misinterpreted by SQLAlchemy as parameter passing, so any
`LIKE 'asdf%'` will result in a value error with mysql, mariadb, and
maybe some others. This is one way to fix it. The alternative is to
simply double up `%`, like `LIKE 'asdf%%'`, but this seemed cleaner in
terms of output.
Fixes #1383
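
A rough illustration of the failure mode and of the `%%` alternative mentioned above. It assumes the driver builds the final SQL with Python `%` formatting, as drivers with a `format`/`pyformat` paramstyle commonly do; `render_pyformat` is a hypothetical stand-in, not a real driver or SQLAlchemy function.

```python
def render_pyformat(query, params=()):
    """Hypothetical stand-in for a driver that interpolates with `query % params`."""
    return query % params


query = "SELECT * FROM users WHERE name LIKE 'asdf%'"

# The bare % is parsed as the start of a format specifier and is rejected.
try:
    render_pyformat(query)
    raised = False
except (TypeError, ValueError):
    raised = True

# Doubling the % escapes it, so the rendered SQL contains a single literal %.
escaped = query.replace("%", "%%")
rendered = render_pyformat(escaped)
print(raised, rendered)
```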
2023-03-02 16:03:16 -08:00
Ankush Gola
82baecc892 Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds:

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a SQL database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent`, which returns an agent for interacting
with JSON blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-02-28 19:44:39 -08:00
Jon Luo
35f1e8f569 separate columns by tabs instead of single space in sql sample rows (#1348)
Use tabs to separate columns instead of a single space, which is confusing when
there are spaces in a cell
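
A small illustration of the ambiguity, using made-up column names and values:

```python
columns = ["TrackId", "Name"]
row = (1, "For Those About To Rock")

space_line = " ".join(str(value) for value in row)
tab_line = "\t".join(str(value) for value in row)

# Splitting on a space breaks the multi-word cell into several pieces...
print(space_line.split(" "))
# ...while splitting on a tab recovers exactly one value per column.
print(tab_line.split("\t"))
```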
2023-02-28 18:59:53 -08:00
Jon Luo
5bf8772f26 add option to use user-defined SQL table info (#1347)
Currently, table information is gathered through SQLAlchemy as complete
table DDL and a user-selected number of sample rows from each table.
This PR adds the option to use user-defined table information instead of
automatically collecting it. This will use the provided table
information and fall back to the automatic gathering for tables that the
user didn't provide information for.
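
The fallback behaviour described above could look roughly like this. `build_table_info` and `fake_gather` are hypothetical names used only for illustration; the real implementation may be organised differently.

```python
def build_table_info(tables, custom_table_info, gather_automatically):
    """Hypothetical helper: prefer user-supplied info, otherwise gather it."""
    parts = []
    for table in tables:
        if table in custom_table_info:
            parts.append(custom_table_info[table])
        else:
            parts.append(gather_automatically(table))
    return "\n\n".join(parts)


def fake_gather(table):
    # Stand-in for the DDL and sample rows collected automatically.
    return f"<automatically gathered info for {table}>"


custom = {"Track": "CREATE TABLE Track (TrackId INTEGER, Name TEXT)\n/*\nhand-picked rows\n*/"}
table_info = build_table_info(["Album", "Track"], custom, fake_gather)
print(table_info)
```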

Off the top of my head, there are a few cases where this can be quite
useful:
- The first n rows of a table are uninformative, or very similar to one
another. In this case, hand-crafting example rows for a table so that
they provide good, diverse information can be very helpful. Another
approach we can think about later is getting a random sample of n rows
instead of the first n rows, but there are some performance
considerations that need to be taken into account there. Even so,
hand-crafting the sample rows is useful and can guarantee the model sees
informative data.
- The user doesn't want every column to be available to the model. This
is not an elegant way to fulfill this specific need since the user would
have to provide the table definition instead of a simple list of columns
to include or ignore, but it does work for this purpose.
- For the developers, this makes it a lot easier to compare/benchmark
the performance of different prompting structures for providing table
information in the prompt.

These are cases I've run into myself (particularly cases 1 and 3) and
I've found these changes useful. Personally, I keep custom table info
for a few tables in a yaml file for versioning and easy loading.

Definitely open to other opinions/approaches though!
2023-02-28 18:58:04 -08:00
Jon Luo
ac1320aae8 fix sqlite internal tables breaking table_info (#1224)
With the current method used to get the SQL table info, sqlite internal
schema tables are being included and are not being handled correctly by
sqlalchemy because the columns have no types. This is easy to see with
the Chinook database:
```python
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
print(db.table_info)
```
```text
...
sqlalchemy.exc.CompileError: (in table 'sqlite_sequence', column 'name'): Can't generate DDL for NullType(); did you forget to specify a type on this Column?
```

SQLAlchemy 2.0 [ignores these by
default](63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L856-L880)):

63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L2096-L2123)
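
The internal table is easy to reproduce with the standard-library `sqlite3` module. The sketch below filters on the `sqlite_` name prefix, which SQLite reserves for internal use; that filter is an illustrative assumption and not necessarily how this fix or SQLAlchemy implements it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite create its internal `sqlite_sequence` table.
conn.execute(
    "CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT)"
)

all_tables = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
# Keep only user tables by dropping names with the reserved prefix.
user_tables = [name for name in all_tables if not name.startswith("sqlite_")]
conn.close()
print(all_tables, user_tables)
```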
2023-02-22 10:34:05 -08:00
Harrison Chase
45b5640fe5 fix sql (#1141)
2023-02-18 11:49:08 -08:00
Francisco Ingham
3f29742adc Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`
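
The `datetime` problem in the last point can be reproduced in isolation: the string form of a row containing a `datetime` includes a constructor call, which `ast.literal_eval` refuses to evaluate. The row below is made up for illustration.

```python
import ast
import datetime

rows = [(1, datetime.datetime(2023, 2, 18, 10, 58))]
# The text contains "datetime.datetime(...)", which is a call, not a literal.
as_text = str(rows)

try:
    ast.literal_eval(as_text)
    parsed_ok = True
except ValueError:
    parsed_ok = False
print(as_text, parsed_ok)
```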

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-18 10:58:29 -08:00
Jon Luo
c39ef70aa4 fix for database compatibility when getting table DDL (#1129)
#1081 introduced a method to get DDL (table definitions) in a manner
specific to sqlite3, thus breaking compatibility with non-sqlite3
databases. This uses the sqlite3 command if the detected dialect is
sqlite, and otherwise uses `SHOW CREATE TABLE`. This
should fix #1103.
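
A sketch of the dialect switch, with only the sqlite branch exercised. `ddl_query` is a hypothetical helper, and the `sqlite_master` lookup is one common way to read a table's DDL in SQLite; the exact queries used in the commit are not shown here.

```python
import sqlite3


def ddl_query(dialect, table):
    """Hypothetical helper: pick the DDL lookup query by dialect.

    The table name is interpolated directly, so it is assumed to be trusted.
    """
    if dialect == "sqlite":
        return f"SELECT sql FROM sqlite_master WHERE type = 'table' AND name = '{table}'"
    return f"SHOW CREATE TABLE {table}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER, Name TEXT)")
ddl = conn.execute(ddl_query("sqlite", "Artist")).fetchone()[0]
conn.close()
print(ddl)
```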
2023-02-17 13:39:44 -08:00
Harrison Chase
5e10e19bfe Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-15 23:53:37 -08:00
Harrison Chase
ec727bf166 Align table info (#999) (#1034)
Currently the chain gets the column names and types on the one
side and the example rows on the other. It is easier for the LLM to read
the table information if the column names and examples are shown together,
so that it can easily understand which columns the examples refer
to. For an example of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also replaced `eval` with `ast.literal_eval` when interpreting the results
from the sample row query, since it is a better practice.
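
To illustrate why `ast.literal_eval` is the safer choice: it parses plain Python literals but rejects anything that would execute code. The result string below is made up for illustration.

```python
import ast

# A query result rendered as text parses back into Python values.
result_text = "[(1, 'AC/DC'), (2, 'Accept')]"
rows = ast.literal_eval(result_text)

# An expression that calls a function is rejected instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rows, rejected)
```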

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-13 21:48:41 -08:00
Harrison Chase
3d639d1539 update lint (#975)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 08:01:13 -08:00
Kevin Huo
512c523368 remove sample_row_in_table_info and simplify set operations in SQLDB (#932)
- Address TODO: deprecate for sample_row_in_table_info
- Simplify set operations by casting to sets, so that multiple set
casts and `.difference()` calls are no longer needed
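
A sketch of the kind of simplification described, with made-up table names: once both sides are sets, a single `-` expression replaces repeated casts and `.difference()` calls.

```python
all_tables = {"Album", "Artist", "Track"}
include_tables = ["Artist", "Genre"]

# Cast once, then use the set difference operator directly.
missing_tables = set(include_tables) - all_tables
print(missing_tables)
```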
2023-02-09 23:15:41 -08:00
Harrison Chase
f95cedc443 Harrison/sql rows (#915)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-06 18:56:18 -08:00
Harrison Chase
248c297f1b Sample row in table info for SQLDatabase (#769) (#782)
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-01-28 13:37:07 -08:00
Amos Ng
fa6826e417 Fix sqlalchemy warnings when running tests (#733)
This has been bugging me when running my own tests that call langchain
methods :P
2023-01-25 07:14:07 -08:00
Harrison Chase
1c71fadfdc more complex sql chain (#619)
add a more complex sql chain that first subsets the necessary tables
2023-01-15 17:07:21 -08:00
Harrison Chase
95157d0aad Add schema property to sql database utility class (#448) (#462)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>

Signed-off-by: Diwank Singh Tomer <diwank.singh@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Diwank Singh Tomer <diwank.singh@gmail.com>
2022-12-28 17:37:53 -05:00
Xupeng (Tony) Tong
bb4bf9d6d0 chore: minor clean up / formatting (#233)
to get familiar with the project
2022-12-01 10:50:36 -08:00
Andrew Gleave
ea67c049f0 Support SQL statements that return no results (#222)
Adds support for statements such as `INSERT` and `UPDATE`, which do not
return any rows.

`engine.execute` is deprecated, so execution has been updated to use
`connection.exec_driver_sql`, as per:

https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Engine.execute
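
The idea of handling statements that return no rows can be sketched with the standard-library `sqlite3` module as a stand-in for SQLAlchemy. Here `run` is a hypothetical helper, and `cursor.description` being `None` is used as the "no result set" signal; the commit itself works through SQLAlchemy and may check this differently.

```python
import sqlite3


def run(conn, command):
    """Hypothetical helper: return rows as text, or "" when there is no result set."""
    cursor = conn.execute(command)
    if cursor.description is None:
        # INSERT, UPDATE and similar statements produce no result set.
        return ""
    return str(cursor.fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
insert_result = run(conn, "INSERT INTO t VALUES (1, 'a')")
select_result = run(conn, "SELECT id, name FROM t")
conn.close()
print(repr(insert_result), select_result)
```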
2022-11-29 08:28:45 -08:00
Nicholas Larus-Stone
ca4b10bb74 feat: add option to ignore or restrict to SQL tables (#151)
`SQLDatabase` now accepts two `init` arguments:
1. `ignore_tables` to pass in a list of tables to not search over
2. `include_tables` to restrict to a list of tables to consider
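
How the two arguments might narrow the set of tables, as a sketch. `usable_tables` is a hypothetical helper, and treating the two options as mutually exclusive is an assumption made here rather than something stated in the commit message.

```python
def usable_tables(all_tables, include_tables=None, ignore_tables=None):
    """Hypothetical helper: apply include/ignore filters to the table names."""
    if include_tables and ignore_tables:
        # Assumption: passing both options at once is treated as an error.
        raise ValueError("Cannot specify both include_tables and ignore_tables")
    if include_tables:
        return set(include_tables)
    return set(all_tables) - set(ignore_tables or [])


tables = ["Album", "Artist", "Track"]
print(usable_tables(tables, ignore_tables=["Track"]))
print(usable_tables(tables, include_tables=["Artist"]))
```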
2022-11-16 22:04:50 -08:00
Harrison Chase
af81e9ca9c add sql database (#35)
2022-10-27 23:21:47 -07:00