community: add 'get_col_comments' option for retrieving database column comments (#30646)

## Description
Added support for retrieving column comments in the SQL Database
utility. This feature allows users to see comments associated with
database columns when querying table information. Column comments
provide valuable metadata that helps LLMs better understand the
semantics and purpose of database columns.

A new optional parameter `get_col_comments` was added to the
`get_table_info` method, defaulting to `False` for backward
compatibility. When set to `True`, it retrieves and formats column
comments for each table.

Currently, this feature is supported on PostgreSQL, MySQL, and Oracle
databases.

## Implementation
The table must already be created with column comments before you call `get_table_info`.

```python
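from langchain_community.utilities import SQLDatabase

# Replace the placeholder with a real connection URI.
db = SQLDatabase.from_uri("YOUR_DB_URI")
# Include column comments in the schema description.
print(db.get_table_info(get_col_comments=True))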
```
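
The prerequisite above (a table that already carries column comments) can be sketched as follows. PostgreSQL and Oracle accept `COMMENT ON COLUMN table.column IS '...'` statements, while MySQL attaches comments with an inline `COMMENT '...'` clause in the column definition. The helper name `comment_on_column_statements` is hypothetical and not part of this change; it only builds the statement strings and assumes the table and column names need no quoting.

```python
from typing import Dict, List


def comment_on_column_statements(table: str, comments: Dict[str, str]) -> List[str]:
    """Build COMMENT ON COLUMN statements (PostgreSQL/Oracle syntax).

    Hypothetical helper for illustration only; identifiers are used as-is.
    """
    statements = []
    for column, text in comments.items():
        # Escape single quotes by doubling them, as in standard SQL string literals.
        escaped = text.replace("'", "''")
        statements.append(f"COMMENT ON COLUMN {table}.{column} IS '{escaped}';")
    return statements


for stmt in comment_on_column_statements(
    "test_table", {"name": "person name", "school": "school_name"}
):
    print(stmt)
```

Run the generated statements against the database with your usual client before calling `get_table_info(get_col_comments=True)`.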
## Result
```
CREATE TABLE test_table (
	name VARCHAR,
	school VARCHAR
)

/*
Column Comments: {'name': 'person name', 'school': 'school_name'}
*/

/*
3 rows from test_table:
name
a
b
c
*/
```
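
The way the comment block in this output is assembled can be sketched without a database. This is a simplified illustration of the logic in the change, not the library code itself: `ColumnStub` and `append_column_comments` are hypothetical names standing in for reflected SQLAlchemy columns and the relevant part of `get_table_info`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnStub:
    """Hypothetical stand-in for a reflected column: a name plus an optional comment."""

    name: str
    comment: Optional[str] = None


def append_column_comments(table_info: str, columns: Iterable[ColumnStub]) -> str:
    """Append a 'Column Comments' block if at least one column has a comment."""
    # Keep only columns that actually carry a non-empty comment.
    comments = {c.name: c.comment for c in columns if c.comment}
    if not comments:
        return table_info
    return table_info + f"\n\n/*\nColumn Comments: {comments}\n*/"


ddl = "CREATE TABLE test_table (\n\tname VARCHAR,\n\tschool VARCHAR\n)"
columns = [ColumnStub("name", "person name"), ColumnStub("school", "school_name")]
print(append_column_comments(ddl, columns))
```

Columns without a comment are skipped, and a table with no comments at all is returned unchanged, so no empty comment block appears in the output.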

## Benefits
1. Enhances the LLM's understanding of database schema semantics
2. Preserves valuable domain knowledge embedded in database design
3. Improves accuracy of SQL query generation
4. Provides more context for data interpretation

Tests are available in
`langchain/libs/community/tests/test_sql_get_table_info.py`.

---------

Co-authored-by: chbae <chbae@gcsc.co.kr>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Bae-ChangHyun
2025-04-29 00:19:46 +09:00
committed by GitHub
parent 3fb0a55122
commit a2863f8757
3 changed files with 308 additions and 6 deletions


@@ -316,7 +316,9 @@ class SQLDatabase:
         """Information about all tables in the database."""
         return self.get_table_info()
-    def get_table_info(self, table_names: Optional[List[str]] = None) -> str:
+    def get_table_info(
+        self, table_names: Optional[List[str]] = None, get_col_comments: bool = False
+    ) -> str:
         """Get information about specified tables.
         Follows best practices as specified in: Rajkumar et al, 2022
@@ -356,14 +358,39 @@ class SQLDatabase:
                 tables.append(self._custom_table_info[table.name])
                 continue
-            # Ignore JSON datatyped columns
-            for k, v in table.columns.items():  # AttributeError: items in sqlalchemy v1
-                if type(v.type) is NullType:
-                    table._columns.remove(v)
+            # Ignore JSON datatyped columns - SQLAlchemy v1.x compatibility
+            try:
+                # For SQLAlchemy v2.x
+                for k, v in table.columns.items():
+                    if type(v.type) is NullType:
+                        table._columns.remove(v)
+            except AttributeError:
+                # For SQLAlchemy v1.x
+                for k, v in dict(table.columns).items():
+                    if type(v.type) is NullType:
+                        table._columns.remove(v)
             # add create table command
             create_table = str(CreateTable(table).compile(self._engine))
             table_info = f"{create_table.rstrip()}"
+            # Add column comments as dictionary
+            if get_col_comments:
+                try:
+                    column_comments_dict = {}
+                    for column in table.columns:
+                        if column.comment:
+                            column_comments_dict[column.name] = column.comment
+                    if column_comments_dict:
+                        table_info += (
+                            f"\n\n/*\nColumn Comments: {column_comments_dict}\n*/"
+                        )
+                except Exception:
+                    raise ValueError(
+                        "Column comments are available on PostgreSQL, MySQL, Oracle"
+                    )
             has_extra_info = (
                 self._indexes_in_table_info or self._sample_rows_in_table_info
             )