added operator filter for supabase (#29475)

Description
This PR adds support for MongoDB-style $in operator filtering in the
Supabase vectorstore implementation. Currently, filtering with $in
operators returns no results, even when matching documents exist. This
change properly translates MongoDB-style filters to PostgreSQL syntax,
enabling efficient multi-document filtering.

Changes
Modified similarity_search_by_vector_with_relevance_scores to handle MongoDB-style $in operators
Added automatic conversion of $in filters to PostgreSQL IN clauses
Preserved original vector type handling and numpy array conversion
Maintained compatibility with existing postgrest filters
Added support for the same filtering in similarity_search_by_vector_returning_embeddings
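
The conversion listed above can be sketched as a standalone function. This is a minimal illustration that mirrors the logic in this commit's diff, not the library's API: the name `merge_in_filters` is hypothetical, and the sketch only builds the filter string without calling Supabase.

```python
from typing import Any, Dict, Optional


def merge_in_filters(
    filter: Optional[Dict[str, Any]],
    postgrest_filter: Optional[str] = None,
) -> Optional[str]:
    """Fold every {"key": {"$in": [...]}} entry into one filter string.

    Hypothetical helper: it reproduces the loop from the diff so the
    translation can be read and tested in isolation.
    """
    if not filter:
        return postgrest_filter
    for key, value in filter.items():
        if isinstance(value, dict) and "$in" in value:
            # Quote each value and join them into an IN (...) list.
            values_str = ",".join(f"'{str(v)}'" for v in value["$in"])
            new_filter = f"metadata->>{key} IN ({values_str})"
            # AND the new clause onto any filter the caller already passed.
            if postgrest_filter:
                postgrest_filter = f"({postgrest_filter}) and ({new_filter})"
            else:
                postgrest_filter = new_filter
    return postgrest_filter


print(merge_in_filters({"source": {"$in": ["a", "b"]}}))
# metadata->>source IN ('a','b')
```

Entries without a `$in` key are skipped, so a plain equality filter such as `{"source": "a"}` leaves the incoming `postgrest_filter` unchanged.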

Issue
Closes #27932

Implementation Notes
No changes to public API or function signatures
Backwards compatible - behavior unchanged for non-$in filters
More efficient than multiple individual queries for multi-ID searches
Preserves all existing functionality including numpy array conversion for vector types

Dependencies
None

Additional Notes
The implementation wraps each filter value in single quotes; quotes embedded in a value are not escaped
Maintains consistent behavior with other vectorstore implementations that support MongoDB-style operators
Future extensions could support additional MongoDB-style operators ($gt, $lt, etc.)
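
The diff in this commit quotes values with `f"'{str(v)}'"`, which leaves an embedded single quote unescaped. The sketch below shows one possible hardening, doubling embedded quotes as standard SQL string literals require, together with a hypothetical extension to `$gt` and `$lt`. The names `quote_literal`, `translate`, and `COMPARISON_OPS` are assumptions for illustration, and whether the Supabase client accepts the resulting strings is not established here.

```python
from typing import Any, Dict, List

# Hypothetical operator table; only "$in" is handled by this commit.
COMPARISON_OPS = {"$gt": ">", "$lt": "<"}


def quote_literal(value: Any) -> str:
    """Quote a value as a SQL string literal, doubling embedded quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def translate(filter: Dict[str, Any]) -> List[str]:
    """Translate MongoDB-style operator entries into SQL-like clauses."""
    clauses: List[str] = []
    for key, value in filter.items():
        if not isinstance(value, dict):
            continue  # plain equality filters are left to the existing path
        for op, operand in value.items():
            if op == "$in":
                joined = ",".join(quote_literal(v) for v in operand)
                clauses.append(f"metadata->>{key} IN ({joined})")
            elif op in COMPARISON_OPS:
                # ->> yields text, so this comparison is textual, not numeric.
                sql_op = COMPARISON_OPS[op]
                clauses.append(
                    f"metadata->>{key} {sql_op} {quote_literal(operand)}"
                )
    return clauses


print(translate({"author": {"$in": ["O'Brien", "Lee"]}}))
# ["metadata->>author IN ('O''Brien','Lee')"]
```

Because `->>` returns text, a numeric comparison would need an explicit cast on the column side; the sketch deliberately leaves that out.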

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Mohammad Anash 2025-01-29 19:54:18 +05:30 committed by GitHub
parent 585f467d4a
commit 12bcc85927

@@ -226,6 +226,22 @@ class SupabaseVectorStore(VectorStore):
        postgrest_filter: Optional[str] = None,
        score_threshold: Optional[float] = None,
    ) -> List[Tuple[Document, float]]:
        # Convert MongoDB-style filter to PostgreSQL syntax if needed
        if filter:
            for key, value in filter.items():
                if isinstance(value, dict) and "$in" in value:
                    # Extract the list of values for the $in operator
                    in_values = value["$in"]
                    # Create a PostgreSQL IN clause
                    values_str = ",".join(f"'{str(v)}'" for v in in_values)
                    new_filter = f"metadata->>{key} IN ({values_str})"
                    # Combine with existing postgrest_filter if present
                    if postgrest_filter:
                        postgrest_filter = f"({postgrest_filter}) and ({new_filter})"
                    else:
                        postgrest_filter = new_filter
        match_documents_params = self.match_args(query, filter)
        query_builder = self._client.rpc(self.query_name, match_documents_params)