community: update RankLLM integration and fix LangChain deprecation (#29931)

- [x] **Description:**
  - Removed the `ModelType` enum (`VICUNA`, `ZEPHYR`, `GPT`) to align with
    RankLLM's latest implementation; models are now selected by string name,
    as shown in the first sketch below.
  - Replaced the deprecated `chain({"query": ...})` call style with
    `chain.invoke({"query": ...})` to resolve the `Chain.__call__`
    deprecation warning introduced in LangChain 0.1.0 and noted in
    https://github.com/langchain-ai/langchain/pull/29840 (see the second
    sketch below).
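
  A minimal sketch of the updated reranker construction, mirroring the
  notebook cell; the base `retriever` is assumed to have been built earlier
  (e.g. from the notebook's FAISS vector store):

  ```python
  from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
  from langchain_community.document_compressors.rankllm_rerank import RankLLMRerank

  # `model` is now a plain string such as "rank_zephyr" rather than a
  # ModelType enum value.
  compressor = RankLLMRerank(top_n=3, model="rank_zephyr")

  # Wrap an existing base retriever (assumed defined earlier) so retrieved
  # documents are reranked before being returned.
  compression_retriever = ContextualCompressionRetriever(
      base_compressor=compressor, base_retriever=retriever
  )
  ```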
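
  And a sketch of the invocation change, assuming the chain is built with
  `RetrievalQA.from_chain_type` as in the notebook:

  ```python
  from langchain.chains import RetrievalQA
  from langchain_openai import ChatOpenAI

  chain = RetrievalQA.from_chain_type(
      llm=ChatOpenAI(temperature=0), retriever=compression_retriever
  )

  query = "..."  # placeholder query

  # Before (deprecated in LangChain 0.1.0, removed in 1.0):
  # chain({"query": query})

  # After:
  chain.invoke({"query": query})
  ```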

- [x] **Dependencies:** No new dependencies added.  

- [x] **Tests and Docs:**
  - Updated the RankLLM documentation
    (`docs/docs/integrations/document_transformers/rankllm-reranker.ipynb`).
  - Fixed LangChain usage in the related code examples.

- [x] **Lint and Test:**
  - Ran `make format` and `make lint`, and verified functionality after the
    updates.
  - No breaking changes introduced.


---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Author: Brayden Zhong
Date: 2025-03-31 09:50:00 -04:00
Committed by: GitHub
Parent: b4fe1f1ec0
Commit: e4515f308f

docs/docs/integrations/document_transformers/rankllm-reranker.ipynb

@@ -11,39 +11,41 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"[RankLLM](https://github.com/castorini/rank_llm) offers a suite of listwise rerankers, albeit with focus on open source LLMs finetuned for the task - RankVicuna and RankZephyr being two of them."
+"**[RankLLM](https://github.com/castorini/rank_llm)** is a **flexible reranking framework** supporting **listwise, pairwise, and pointwise ranking models**. It includes **RankVicuna, RankZephyr, MonoT5, DuoT5, LiT5, and FirstMistral**, with integration for **FastChat, vLLM, SGLang, and TensorRT-LLM** for efficient inference. RankLLM is optimized for **retrieval and ranking tasks**, leveraging both **open-source LLMs** and proprietary rerankers like **RankGPT and RankGemini**. It supports **batched inference, first-token reranking, and retrieval via BM25 and SPLADE**.\n",
+"\n",
+"> **Note:** If using the built-in retriever, RankLLM requires **Pyserini, JDK 21, PyTorch, and Faiss** for retrieval functionality."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet rank_llm"
+"%pip install --upgrade --quiet rank_llm"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet langchain_openai"
+"%pip install --upgrade --quiet langchain_openai"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet faiss-cpu"
+"%pip install --upgrade --quiet faiss-cpu"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -56,7 +58,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -64,7 +66,7 @@
 "def pretty_print_docs(docs):\n",
 " print(\n",
 " f\"\\n{'-' * 100}\\n\".join(\n",
-" [f\"Document {i+1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]\n",
+" [f\"Document {i + 1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]\n",
 " )\n",
 " )"
 ]
@@ -79,9 +81,17 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": null,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:28:58,344 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+]
+}
+],
 "source": [
 "from langchain_community.document_loaders import TextLoader\n",
 "from langchain_community.vectorstores import FAISS\n",
@@ -114,14 +124,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 23,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
-"2025-02-17 04:37:08,458 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+"2025-02-22 15:29:00,892 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
 ]
 },
 {
@@ -331,27 +341,41 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Retrieval + Reranking with RankZephyr"
+"RankZephyr performs listwise reranking for improved retrieval quality but requires at least 24GB of VRAM to run efficiently."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 12,
+"execution_count": null,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Downloading shards: 100%|██████████| 3/3 [00:00<00:00, 2674.37it/s]\n",
+"Loading checkpoint shards: 100%|██████████| 3/3 [01:49<00:00, 36.39s/it]\n"
+]
+}
+],
 "source": [
+"import torch\n",
 "from langchain.retrievers.contextual_compression import ContextualCompressionRetriever\n",
 "from langchain_community.document_compressors.rankllm_rerank import RankLLMRerank\n",
 "\n",
-"compressor = RankLLMRerank(top_n=3, model=\"zephyr\")\n",
+"torch.cuda.empty_cache()\n",
+"\n",
+"compressor = RankLLMRerank(top_n=3, model=\"rank_zephyr\")\n",
 "compression_retriever = ContextualCompressionRetriever(\n",
 " base_compressor=compressor, base_retriever=retriever\n",
-")"
+")\n",
+"\n",
+"del compressor"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
@@ -386,7 +410,7 @@
 ]
 },
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
 "\n"
@@ -407,7 +431,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 10,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
@@ -432,7 +456,7 @@
 " llm=ChatOpenAI(temperature=0), retriever=compression_retriever\n",
 ")\n",
 "\n",
-"chain({\"query\": query})"
+"chain.invoke({\"query\": query})"
 ]
 },
 {
@@ -451,9 +475,16 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:01:29,469 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+]
+},
+{
 "name": "stdout",
 "output_type": "stream",
@@ -683,7 +714,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -698,9 +729,18 @@
 },
 {
 "cell_type": "code",
-"execution_count": 13,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:01:38,554 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
+" 0%| | 0/1 [00:00<?, ?it/s]2025-02-22 15:01:43,704 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
+"100%|██████████| 1/1 [00:05<00:00, 5.15s/it]"
+]
+},
+{
 "name": "stdout",
 "output_type": "stream",
@@ -727,7 +767,7 @@
 ]
 },
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
 "\n"
@@ -748,15 +788,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
-"/tmp/ipykernel_2153001/1437145854.py:10: LangChainDeprecationWarning: The method `Chain.__call__` was deprecated in langchain 0.1.0 and will be removed in 1.0. Use :meth:`~invoke` instead.\n",
-" chain({\"query\": query})\n",
+" chain.invoke({\"query\": query})\n",
 "2025-02-17 04:30:00,016 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
 " 0%| | 0/1 [00:00<?, ?it/s]2025-02-17 04:30:01,649 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
 "100%|██████████| 1/1 [00:01<00:00, 1.63s/it]\n",
@@ -785,13 +824,13 @@
 " llm=ChatOpenAI(temperature=0), retriever=compression_retriever\n",
 ")\n",
 "\n",
-"chain({\"query\": query})"
+"chain.invoke({\"query\": query})"
 ]
 }
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "rankllm",
+"display_name": "Python 3",
 "language": "python",
 "name": "python3"
 },
@@ -805,7 +844,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.10.14"
+"version": "3.13.2"
 }
 },
 "nbformat": 4,