langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-10-26 05:10:22 +00:00

Author	SHA1	Message	Date
Jon Luo	0a1b1806e9	sql: do not hard code the LIMIT clause in the table_info section (#1563 ) Seeing a lot of issues in Discord in which the LLM is not using the correct LIMIT clause for different SQL dialects. ie, it's using `LIMIT` for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc. I think this could be due to us specifying the LIMIT statement in the example rows portion of `table_info`. So the LLM is seeing the `LIMIT` statement used in the prompt. Since we can't specify each dialect's method here, I think it's fine to just replace the `SELECT... LIMIT 3;` statement with `3 rows from table_name table:`, and wrap everything in a block comment directly following the `CREATE` statement. The Rajkumar et al paper wrapped the example rows and `SELECT` statement in a block comment as well anyway. Thoughts @fpingham?	2023-03-13 23:08:27 -07:00
Jon Luo	882f7964fb	fix sql misinterpretation of % in query (#1408 ) % is being misinterpreted by sqlalchemy as parameter passing, so any `LIKE 'asdf%'` will result in a value error with mysql, mariadb, and maybe some others. This is one way to fix it - the alternative is to simply double up %, like `LIKE 'asdf%%'` but this seemed cleaner in terms of output. Fixes #1383	2023-03-02 16:03:16 -08:00
Ankush Gola	82baecc892	Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150 ) This PR adds * `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting with a sql database. This builds off of `SQLDatabaseChain`. The main advantages are 1) answering general questions about the db, 2) access to a tool for double checking queries, and 3) recovering from errors * `ZeroShotAgent.as_json_agent` which returns an agent for interacting with json blobs. * Several examples in notebooks --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-02-28 19:44:39 -08:00
Jon Luo	35f1e8f569	separate columns by tabs instead of single space in sql sample rows (#1348 ) Use tabs to separate columns instead of a single space - confusing when there are spaces in a cell	2023-02-28 18:59:53 -08:00
Jon Luo	5bf8772f26	add option to use user-defined SQL table info (#1347 ) Currently, table information is gathered through SQLAlchemy as complete table DDL and a user-selected number of sample rows from each table. This PR adds the option to use user-defined table information instead of automatically collecting it. This will use the provided table information and fall back to the automatic gathering for tables that the user didn't provide information for. Off the top of my head, there are a few cases where this can be quite useful: - The first n rows of a table are uninformative, or very similar to one another. In this case, hand-crafting example rows for a table such that they provide the good, diverse information can be very helpful. Another approach we can think about later is getting a random sample of n rows instead of the first n rows, but there are some performance considerations that need to be taken there. Even so, hand-crafting the sample rows is useful and can guarantee the model sees informative data. - The user doesn't want every column to be available to the model. This is not an elegant way to fulfill this specific need since the user would have to provide the table definition instead of a simple list of columns to include or ignore, but it does work for this purpose. - For the developers, this makes it a lot easier to compare/benchmark the performance of different prompting structures for providing table information in the prompt. These are cases I've run into myself (particularly cases 1 and 3) and I've found these changes useful. Personally, I keep custom table info for a few tables in a yaml file for versioning and easy loading. Definitely open to other opinions/approaches though!	2023-02-28 18:58:04 -08:00
Jon Luo	ac1320aae8	fix sqlite internal tables breaking table_info (#1224 ) With the current method used to get the SQL table info, sqlite internal schema tables are being included and are not being handled correctly by sqlalchemy because the columns have no types. This is easy to see with the Chinook database: ```python db = SQLDatabase.from_uri("sqlite:///Chinook.db") print(db.table_info) ``` ```python ... sqlalchemy.exc.CompileError: (in table 'sqlite_sequence', column 'name'): Can't generate DDL for NullType(); did you forget to specify a type on this Column? ``` SQLAlchemy 2.0 [ignores these by default](`63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L856-L880)`): `63d90b0f44/lib/sqlalchemy/dialects/sqlite/base.py (L2096-L2123)`	2023-02-22 10:34:05 -08:00
Harrison Chase	45b5640fe5	fix sql (#1141 )	2023-02-18 11:49:08 -08:00
Francisco Ingham	3f29742adc	Sql alchemy commands used in table info (#1135 ) This approach has several advantages: * it improves the readability of the code * removes incompatibilities between SQL dialects * fixes a bug with `datetime` values in rows and `ast.literal_eval` Huge thanks and credits to @jzluo for finding the weaknesses in the current approach and for the thoughtful discussion on the best way to implement this. --------- Co-authored-by: Francisco Ingham <> Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>	2023-02-18 10:58:29 -08:00
Jon Luo	c39ef70aa4	fix for database compatibility when getting table DDL (#1129 ) #1081 introduced a method to get DDL (table definitions) in a manner specific to sqlite3, thus breaking compatibility with other non-sqlite3 databases. This uses the sqlite3 command if the detected dialect is sqlite, and otherwise uses the standard SQL `SHOW CREATE TABLE`. This should fix #1103.	2023-02-17 13:39:44 -08:00
Harrison Chase	5e10e19bfe	Harrison/align table (#1081 ) Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2023-02-15 23:53:37 -08:00
Harrison Chase	ec727bf166	Align table info (#999 ) (#1034 ) Currently the chain is getting the column names and types on the one side and the example rows on the other. It is easier for the llm to read the table information if the column name and examples are shown together so that it can easily understand to which columns do the examples refer to. For an instantiation of this, please refer to the changes in the `sqlite.ipynb` notebook. Also changed `eval` for `ast.literal_eval` when interpreting the results from the sample row query since it is a better practice. --------- Co-authored-by: Francisco Ingham <> --------- Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2023-02-13 21:48:41 -08:00
Harrison Chase	3d639d1539	update lint (#975 ) Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>	2023-02-10 08:01:13 -08:00
Kevin Huo	512c523368	remove sample_row_in_table_info and simplify set operations in SQLDB (#932 ) -Address TODO: deprecate for sample_row_in_table_info -Simplify set operations by casting to sets to not need multiple set casts + .difference() calls	2023-02-09 23:15:41 -08:00
Harrison Chase	f95cedc443	Harrison/sql rows (#915 ) Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>	2023-02-06 18:56:18 -08:00
Harrison Chase	248c297f1b	Sample row in table info for SQLDatabase (#769 ) (#782 ) The agents usually benefit from understanding what the data looks like to be able to filter effectively. Sending just one row in the table info allows the agent to understand the data before querying and get better results. --------- Co-authored-by: Francisco Ingham <> --------- Co-authored-by: Francisco Ingham <fpingham@gmail.com>	2023-01-28 13:37:07 -08:00
Amos Ng	fa6826e417	Fix sqlalchemy warnings when running tests (#733 ) This has been bugging me when running my own tests that call langchain methods :P	2023-01-25 07:14:07 -08:00
Harrison Chase	1c71fadfdc	more complex sql chain (#619 ) add a more complex sql chain that first subsets the necessary tables	2023-01-15 17:07:21 -08:00
Harrison Chase	95157d0aad	Add schema property to sql database utility class (#448 ) (#462 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Signed-off-by: Diwank Singh Tomer <diwank.singh@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Diwank Singh Tomer <diwank.singh@gmail.com>	2022-12-28 17:37:53 -05:00
Xupeng (Tony) Tong	bb4bf9d6d0	chore: minor clean up / formatting (#233 ) to get familiarize with the project	2022-12-01 10:50:36 -08:00
Andrew Gleave	ea67c049f0	Support SQL statements that return no results (#222 ) Adds support for statements such as insert, update etc which do not return any rows. `engine.execute` is deprecated and so execution has been updated to use `connection.exec_driver_sql` as-per: https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Engine.execute	2022-11-29 08:28:45 -08:00
Nicholas Larus-Stone	ca4b10bb74	feat: add option to ignore or restrict to SQL tables (#151 ) `SQLDatabase` now accepts two `init` arguments: 1. `ignore_tables` to pass in a list of tables to not search over 2. `include_tables` to restrict to a list of tables to consider	2022-11-16 22:04:50 -08:00
Harrison Chase	af81e9ca9c	add sql database (#35 )	2022-10-27 23:21:47 -07:00

22 Commits