Files
langchain/libs/community
Vincent Zhang df5008fe55 community[minor]: FAISS Filter Function Enhancement with Advanced Query Operators (#28207)
## Description
We are submitting as a team of four for a project. Other team members
are @RuofanChen03, @LikeWang10067, @TANYAL77.

This pull requests expands the filtering capabilities of the FAISS
vectorstore by adding MongoDB-style query operators indicated as
follows, while including comprehensive testing for the added
functionality.
- $eq (equals)
- $neq (not equals)
- $gt (greater than)
- $lt (less than)
- $gte (greater than or equal)
- $lte (less than or equal)
- $in (membership in list)
- $nin (not in list)
- $and (all conditions must match)
- $or (any condition must match)
- $not (negation of condition)


## Issue
This closes https://github.com/langchain-ai/langchain/issues/26379.


## Sample Usage
```python
import faiss
import asyncio
from langchain_community.vectorstores import FAISS
from langchain.schema import Document
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
documents = [
    Document(page_content="Process customer refund request", metadata={"schema_type": "financial", "handler_type": "refund",}),
    Document(page_content="Update customer shipping address", metadata={"schema_type": "customer", "handler_type": "update",}),
    Document(page_content="Process payment transaction", metadata={"schema_type": "financial", "handler_type": "payment",}),
    Document(page_content="Handle customer complaint", metadata={"schema_type": "customer","handler_type": "complaint",}),
    Document(page_content="Process invoice payment", metadata={"schema_type": "financial","handler_type": "payment",})
]

async def search(vectorstore, query, schema_type, handler_type, k=2):
    schema_filter = {"schema_type": {"$eq": schema_type}}
    handler_filter = {"handler_type": {"$eq": handler_type}}
    combined_filter = {
        "$and": [
            schema_filter,
            handler_filter,
        ]
    }
    base_retriever = vectorstore.as_retriever(
        search_kwargs={"k":k, "filter":combined_filter}
    )
    return await base_retriever.ainvoke(query)

async def main():
    vectorstore = FAISS.from_texts(
        texts=[doc.page_content for doc in documents],
        embedding=embeddings,
        metadatas=[doc.metadata for doc in documents]
    )
    
    def printt(title, documents):
        print(title)
        if not documents:
            print("\tNo documents found.")
            return
        for doc in documents:
            print(f"\t{doc.page_content}. {doc.metadata}")

    printt("Documents:", documents)
    printt('\nquery="process payment", schema_type="financial", handler_type="payment":', await search(vectorstore, query="process payment", schema_type="financial", handler_type="payment", k=2))
    printt('\nquery="customer update", schema_type="customer", handler_type="update":', await search(vectorstore, query="customer update", schema_type="customer", handler_type="update", k=2))
    printt('\nquery="refund process", schema_type="financial", handler_type="refund":', await search(vectorstore, query="refund process", schema_type="financial", handler_type="refund", k=2))
    printt('\nquery="refund process", schema_type="financial", handler_type="foobar":', await search(vectorstore, query="refund process", schema_type="financial", handler_type="foobar", k=2))
    print()

if __name__ == "__main__":asyncio.run(main())
```

## Output
```
Documents:
	Process customer refund request. {'schema_type': 'financial', 'handler_type': 'refund'}
	Update customer shipping address. {'schema_type': 'customer', 'handler_type': 'update'}
	Process payment transaction. {'schema_type': 'financial', 'handler_type': 'payment'}
	Handle customer complaint. {'schema_type': 'customer', 'handler_type': 'complaint'}
	Process invoice payment. {'schema_type': 'financial', 'handler_type': 'payment'}

query="process payment", schema_type="financial", handler_type="payment":
	Process payment transaction. {'schema_type': 'financial', 'handler_type': 'payment'}
	Process invoice payment. {'schema_type': 'financial', 'handler_type': 'payment'}

query="customer update", schema_type="customer", handler_type="update":
	Update customer shipping address. {'schema_type': 'customer', 'handler_type': 'update'}

query="refund process", schema_type="financial", handler_type="refund":
	Process customer refund request. {'schema_type': 'financial', 'handler_type': 'refund'}

query="refund process", schema_type="financial", handler_type="foobar":
	No documents found.

```

---------

Co-authored-by: ruofan chen <ruofan.is.awesome@gmail.com>
Co-authored-by: RickyCowboy <like.wang@mail.utoronto.ca>
Co-authored-by: Shanni Li <tanya.li@mail.utoronto.ca>
Co-authored-by: RuofanChen03 <114096642+ruofanchen03@users.noreply.github.com>
Co-authored-by: Like Wang <102838708+likewang10067@users.noreply.github.com>
2024-12-11 17:52:22 -05:00
..

🦜🧑‍🤝‍🧑 LangChain Community

Downloads License: MIT

Quick Install

pip install langchain-community

What is it?

LangChain Community contains third-party integrations that implement the base interfaces defined in LangChain Core, making them ready-to-use in any LangChain application.

For full documentation see the API reference.

Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.

📕 Releases & Versioning

langchain-community is currently on version 0.0.x

All changes will be accompanied by a patch version increase.

💁 Contributing

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see the Contributing Guide.