bugfix(ChatExcel): ChatExcel Language confusion bug

1. Fix the ChatExcel language confusion bug
yhjun1026 2023-11-09 14:52:11 +08:00
parent d609cccb83
commit 2b948c34a5
4 changed files with 12 additions and 8 deletions

View File

@@ -15,7 +15,7 @@ _DEFAULT_TEMPLATE_EN = """
Please use the data structure information in the above historical dialogue and combine it with data analysis to answer the user's questions while satisfying the constraints.
Constraint:
-1.Please fully understand the user's problem and use duckdb sql for analysis. The analysis content is returned in the required output format. Do not output sql information outside the required location.
+1.Please fully understand the user's problem and use duckdb sql for analysis. The analysis content is returned in the output format required below. Please output the sql in the corresponding sql parameter.
2.Please choose the best one from the display methods given below for data rendering, and put the type name into the name parameter value that returns the required format. If you cannot find the most suitable one, use 'Table' as the display method. , the available data display methods are as follows: {disply_type}
3.The table name that needs to be used in SQL is: {table_name}. Please check the sql you generated and do not use column names that are not in the data structure.
4.Give priority to answering using data analysis. If the user's question does not involve data analysis, you can answer according to your understanding.
@@ -32,7 +32,7 @@ _PROMPT_SCENE_DEFINE_ZH = """你是一个数据分析专家!"""
_DEFAULT_TEMPLATE_ZH = """
请使用上述历史对话中的数据结构信息在满足下面约束条件下通过数据分析回答用户的问题
约束条件:
-1.请充分理解用户的问题使用duckdb sql的方式进行分析 分析内容按要求的输出格式返回不要在要求的位置外输出sql信息
+1.请充分理解用户的问题使用duckdb sql的方式进行分析 分析内容按下面要求的输出格式返回sql请输出在对应的sql参数中
2.请从如下给出的展示方式种选择最优的一种用以进行数据渲染将类型名称放入返回要求格式的name参数值种如果找不到最合适的则使用'Table'作为展示方式可用数据展示方式如下: {disply_type}
3.SQL中需要使用的表名是: {table_name},请检查你生成的sql不要使用没在数据结构中的列名
4.优先使用数据分析的方式回答如果用户问题不涉及数据分析内容你可以按你的理解进行回答
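
Note: constraints 1 and 2 in both templates ask the model to put its SQL in a `sql` parameter and the chosen display type in a `name` parameter, falling back to 'Table'. A minimal sketch of how a caller might consume such a reply, assuming the reply is a flat JSON object; the `parse_reply` helper, the display-type set, and the sample reply are hypothetical and not taken from this commit:

```python
import json

# Hypothetical values that might be substituted for {disply_type}.
ALLOWED_DISPLAY_TYPES = {"Table", "LineChart", "BarChart"}


def parse_reply(raw_reply: str) -> dict:
    """Extract the SQL and display type from a JSON reply, defaulting to 'Table'."""
    reply = json.loads(raw_reply)
    sql = str(reply.get("sql", "")).strip()
    name = reply.get("name")
    # Mirror constraint 2: unknown or missing display types fall back to 'Table'.
    if name not in ALLOWED_DISPLAY_TYPES:
        name = "Table"
    return {"sql": sql, "name": name}


# Assumed reply shape; "PieChart" is deliberately outside the allowed set.
reply_text = (
    '{"sql": "SELECT city, SUM(amount) FROM excel_data GROUP BY city", '
    '"name": "PieChart"}'
)
parsed = parse_reply(reply_text)
```

Keeping the SQL in a dedicated field means the caller never has to scrape it out of free-form text, which appears to be the motivation for the reworded constraint.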

View File

@@ -51,10 +51,9 @@ class ExcelLearning(BaseChat):
self._executor, self.excel_reader.get_sample_data
)
self.prompt_template.output_parser.update(colunms)
-copy_datas = datas.copy()
datas.insert(0, colunms)
input_values = {
-"data_example": json.dumps(copy_datas, cls=DateTimeEncoder),
+"data_example": json.dumps(datas, cls=DateTimeEncoder),
}
}
return input_values
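
Note: reading this hunk as dropping the `copy_datas` snapshot, the practical effect seems to be that the column-name row now reaches `data_example`. A minimal sketch of the before/after behaviour, assuming `colunms` is a list of column names and `datas` is a list of rows; the `DateTimeEncoder` below is a hypothetical stand-in for the project's encoder, and the sample values are invented:

```python
import json
from datetime import datetime


class DateTimeEncoder(json.JSONEncoder):
    """Hypothetical stand-in: renders datetime values as ISO strings."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        return super().default(o)


def build_example_old(columns, rows):
    # Pre-commit behaviour: snapshot the rows first, so the header row
    # inserted afterwards never reaches the serialized example.
    snapshot = rows.copy()
    rows.insert(0, columns)
    return json.dumps(snapshot, cls=DateTimeEncoder)


def build_example_new(columns, rows):
    # Post-commit behaviour: serialize the list after the header row
    # has been inserted at position 0.
    rows.insert(0, columns)
    return json.dumps(rows, cls=DateTimeEncoder)


columns = ["城市", "order_date"]
sample = [["北京", datetime(2023, 11, 9, 14, 52, 11)]]

old_payload = json.loads(build_example_old(columns, [row[:] for row in sample]))
new_payload = json.loads(build_example_new(columns, [row[:] for row in sample]))
```

With the header row present in the example data, the model can see the exact column names, which fits the new prompt instruction not to modify or translate them.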

View File

@@ -14,7 +14,8 @@ _PROMPT_SCENE_DEFINE_EN = "You are a data analysis expert. "
_DEFAULT_TEMPLATE_EN = """
This is an example data, please learn to understand the structure and content of this data:
{data_example}
-Explain the meaning and function of each column, and give a simple and clear explanation of the technical terms.
+Explain the meaning and function of each column, and give a simple and clear explanation of the technical terms. If it is a Date column, please summarize the Date format like: yyyy-MM-dd HH:MM:ss.
+Please do not modify or translate the column names, make sure they are consistent with the given data column names.
Provide some analysis options,please think step by step.
Please return your answer in JSON format, the return format is as follows:
@@ -26,7 +27,9 @@ _PROMPT_SCENE_DEFINE_ZH = "你是一个数据分析专家. "
_DEFAULT_TEMPLATE_ZH = """
下面是一份示例数据请学习理解该数据的结构和内容:
{data_example}
-分析各列数据的含义和作用并对专业术语进行简单明了的解释
+分析各列数据的含义和作用并对专业术语进行简单明了的解释, 如果是时间类型请给出时间格式类似:yyyy-MM-dd HH:MM:ss.
+请不要修改或者翻译列名确保和给出数据列名一致.
提供一些分析方案思路请一步一步思考
请以JSON格式返回您的答案返回格式如下

View File

@@ -258,15 +258,17 @@ class ExcelReader:
self.extension = os.path.splitext(file_name)[1]
# read excel file
if file_path.endswith(".xlsx") or file_path.endswith(".xls"):
-df_tmp = pd.read_excel(file_path)
+df_tmp = pd.read_excel(file_path, index_col= False)
self.df = pd.read_excel(
file_path,
+index_col=False,
converters={i: csv_colunm_foramt for i in range(df_tmp.shape[1])},
)
elif file_path.endswith(".csv"):
-df_tmp = pd.read_csv(file_path, encoding=encoding)
+df_tmp = pd.read_csv(file_path, index_col= False, encoding=encoding)
self.df = pd.read_csv(
file_path,
+index_col=False,
encoding=encoding,
converters={i: csv_colunm_foramt for i in range(df_tmp.shape[1])},
encoding=encoding,
converters={i: csv_colunm_foramt for i in range(df_tmp.shape[1])},
)
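
Note: for the CSV branch, the effect of `index_col=False` can be seen on a small file whose rows carry a trailing delimiter, in which case pandas would otherwise treat the first column as the index. A minimal sketch of the same two-pass read used in this hunk; the sample data and the `to_text` converter are illustrative assumptions, since the real `csv_colunm_foramt` is not shown here:

```python
from io import StringIO

import pandas as pd

# Hypothetical malformed CSV: three header names, four fields per row.
raw = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"


def to_text(value):
    # Hypothetical converter standing in for csv_colunm_foramt.
    return str(value).strip()


# Default behaviour: the first field of each row is silently used as the index.
default_df = pd.read_csv(StringIO(raw))

# First pass with index_col=False, used only to count the columns.
probe = pd.read_csv(StringIO(raw), index_col=False)

# Second pass: the same flag, plus one converter per column position.
fixed_df = pd.read_csv(
    StringIO(raw),
    index_col=False,
    converters={i: to_text for i in range(probe.shape[1])},
)
```

Passing the flag to both reads keeps the column count from the first pass aligned with the positions used as converter keys in the second.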