mirror of https://github.com/hwchase17/langchain.git
synced 2025-05-30 11:39:03 +00:00
**Description:**

- What I changed
  - By specifying `id_key` when initializing `EnsembleRetriever`, you can now merge scores for documents that share the same value under the `id_key` entry in their metadata, instead of matching on `page_content`. Below is an example of how to use the modified `EnsembleRetriever`:

    ```python
    from langchain.retrievers import EnsembleRetriever

    # ret1 and ret2 are retrievers that have already been constructed.
    retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id")
    # The Document returned by each retriever must keep the "id" key in its metadata.
    ```

  - Additionally, I added a script to easily test the behavior of the `invoke` method of the modified `EnsembleRetriever`.
- Why I changed
  - When using `EnsembleRetriever`, there are cases where you want Documents with different `page_content` to be scored as the same document, for example when ensembling search results for the same document written in two different languages.
  - The previous `EnsembleRetriever` used `page_content` as the basis for score aggregation, which made this usage difficult. The score is therefore now aggregated by the value of the specified key in each Document's metadata.

**Twitter handle:** @shimajiroxyz
||
---|---|---|
cli | ||
community | ||
core | ||
experimental | ||
langchain | ||
partners | ||
standard-tests | ||
text-splitters |
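
The `id_key` merging described in the commit message above can be illustrated with a small standalone sketch. It does not import LangChain: `Doc` is a hypothetical stand-in for a Document, `merge_ranked` is a hypothetical helper, and weighted reciprocal rank fusion with the constant `c=60` is assumed as the scoring rule purely for illustration. The point is only to show how the choice of grouping key changes which results get their scores combined.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a Document (not the LangChain class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def merge_ranked(doc_lists, weights, id_key=None, c=60):
    """Fuse ranked result lists, grouping documents by id_key if given.

    Assumption: scores are combined with weighted reciprocal rank fusion,
    i.e. each appearance contributes weight / (rank + c).
    """
    def key_of(doc):
        # Fall back to page_content when no id_key is specified.
        return doc.page_content if id_key is None else doc.metadata[id_key]

    scores = defaultdict(float)
    first_seen = {}  # keeps one representative Doc per key
    for docs, weight in zip(doc_lists, weights):
        for rank, doc in enumerate(docs, start=1):
            key = key_of(doc)
            scores[key] += weight / (rank + c)
            first_seen.setdefault(key, doc)
    return sorted(first_seen.values(),
                  key=lambda d: scores[key_of(d)], reverse=True)


# The same documents indexed in two languages, sharing an "id" in metadata.
english = [Doc("Cats purr.", {"id": "a"}), Doc("Dogs bark.", {"id": "b"})]
french = [Doc("Les chiens aboient.", {"id": "b"}),
          Doc("Les oiseaux chantent.", {"id": "c"})]

merged_by_id = merge_ranked([english, french], [0.5, 0.5], id_key="id")
merged_by_text = merge_ranked([english, french], [0.5, 0.5])

print([d.metadata["id"] for d in merged_by_id])
print(len(merged_by_text))
```

With `id_key="id"`, the two entries for document `b` are treated as one and their scores add up, so `b` ranks first. Without it, every distinct `page_content` is scored separately and the two language versions never reinforce each other.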