langchain_qdrant: Added method "_asimilarity_search_with_relevance_scores" to Qdrant class (#23954)

I found a bug that causes the async and sync similarity searches with
relevance scores in Qdrant to return different similarity scores.
The cause is that _asimilarity_search_with_relevance_scores is missing
from the Qdrant class, so langchain_qdrant falls back to the method of
the VectorStore base class, which yields drastically different results.
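
The fallback described above can be reproduced with a toy example. This
is only a sketch: BaseStore, BuggyStore and FixedStore are hypothetical
stand-ins, not the real langchain classes, and the normalization step in
the base class is an assumption made for illustration.

```python
import asyncio
from typing import List


class BaseStore:
    """Hypothetical stand-in for a vector store base class."""

    def _raw_scores(self) -> List[float]:
        # Pretend these are the scores returned by the backend.
        return [0.5, 0.0]

    def _normalize(self, score: float) -> float:
        # Assumed rescaling applied by the base implementation.
        return (score + 1.0) / 2.0

    def _search_with_relevance_scores(self) -> List[float]:
        return [self._normalize(s) for s in self._raw_scores()]

    async def _asearch_with_relevance_scores(self) -> List[float]:
        return [self._normalize(s) for s in self._raw_scores()]


class BuggyStore(BaseStore):
    # Only the sync method is overridden, so the async call
    # silently uses the base class implementation.
    def _search_with_relevance_scores(self) -> List[float]:
        return self._raw_scores()


class FixedStore(BuggyStore):
    # Adding the async counterpart restores parity with the sync path.
    async def _asearch_with_relevance_scores(self) -> List[float]:
        return self._raw_scores()


buggy_sync = BuggyStore()._search_with_relevance_scores()
buggy_async = asyncio.run(BuggyStore()._asearch_with_relevance_scores())
fixed_async = asyncio.run(FixedStore()._asearch_with_relevance_scores())
print(buggy_sync, buggy_async, fixed_async)
```

In this sketch the buggy subclass returns different scores from its sync
and async paths, while the fixed subclass returns the same scores from
both.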

To illustrate the magnitude, here are the results of running an
identical search against a test vectorstore.

Output of asimilarity_search_with_relevance_scores:
[0.9902903374601824, 0.9472135924938804, 0.8535534011299859]

Output of similarity_search_with_relevance_scores:
[0.9805806749203648, 0.8944271849877607, 0.7071068022599718]
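
As a quick sanity check on the two outputs above, the async values look
like the sync values rescaled by (score + 1) / 2. That relationship is
an observation from the reported numbers, not something confirmed
elsewhere in this commit, so treat the snippet as a hedged check only.

```python
import math

# Scores copied from the two outputs reported in this commit message.
async_scores = [0.9902903374601824, 0.9472135924938804, 0.8535534011299859]
sync_scores = [0.9805806749203648, 0.8944271849877607, 0.7071068022599718]

# Hypothesis (an assumption, not stated in the commit):
# the fallback path rescales each score as (score + 1) / 2.
rescaled = [(s + 1.0) / 2.0 for s in sync_scores]

matches = all(
    math.isclose(a, r, rel_tol=1e-12) for a, r in zip(async_scores, rescaled)
)
print(matches)
```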

Co-authored-by: Erick Friis <erick@langchain.dev>
Christian D. Glissov 2024-07-13 01:25:20 +02:00 committed by GitHub
parent bdc03997c9
commit 474b88326f


@@ -1953,6 +1953,29 @@ class Qdrant(VectorStore):
        """
        return self.similarity_search_with_score(query, k, **kwargs)

    @sync_call_fallback
    async def _asimilarity_search_with_relevance_scores(
        self,
        query: str,
        k: int = 4,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        """Return docs and relevance scores in the range [0, 1].

        0 is dissimilar, 1 is most similar.

        Args:
            query: input text
            k: Number of Documents to return. Defaults to 4.
            **kwargs: kwargs to be passed to similarity search. Should include:
                score_threshold: Optional, a floating point value between 0 to 1 to
                    filter the resulting set of retrieved docs

        Returns:
            List of Tuples of (doc, similarity_score)
        """
        return await self.asimilarity_search_with_score(query, k, **kwargs)

    @classmethod
    def _build_payloads(
        cls,