[evaluation] improvement on evaluation (#3862)

* fix a bug when the config file contains one category but the answer file doesn't contain that category

* fix Chinese prompt file

* support gpt-3.5-turbo and gpt-4 evaluation

* polish and update README

* resolve PR comments

---------

Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
Yuanchen
2023-05-30 11:48:41 +08:00
committed by GitHub
parent b0474878bf
commit 2506e275b8
7 changed files with 335 additions and 142 deletions

@@ -57,6 +57,7 @@ def get_data_per_category(data, categories):
     data_per_category = {category: [] for category in categories}
     for item in data:
         category = item["category"]
-        data_per_category[category].append(item)
+        if category in categories:
+            data_per_category[category].append(item)
     return data_per_category
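
The effect of this hunk can be illustrated with a minimal, self-contained sketch. The function body follows the patched lines above; the sample category names and the `id` field are made up for illustration and are not taken from the repository's config or answer files.

```python
def get_data_per_category(data, categories):
    """Group items by category, ignoring items whose category is not listed."""
    # Every configured category gets a bucket, even if no item ends up in it.
    data_per_category = {category: [] for category in categories}
    for item in data:
        category = item["category"]
        # Guard added by the patch: without it, an item whose category is
        # missing from `categories` raises KeyError on the lookup below.
        if category in categories:
            data_per_category[category].append(item)
    return data_per_category


# Hypothetical inputs: "chat" is configured but has no answers,
# and "roleplay" appears in the answers but is not configured.
categories = ["brainstorming", "chat"]
answers = [
    {"category": "brainstorming", "id": 1},
    {"category": "roleplay", "id": 2},
    {"category": "brainstorming", "id": 3},
]

grouped = get_data_per_category(answers, categories)
print({name: [item["id"] for item in items] for name, items in grouped.items()})
```

With these inputs, the configured category that has no answers keeps an empty list, and the unconfigured category is skipped instead of raising `KeyError`.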