community: update RankLLM integration and fix LangChain deprecation (#29931)

- [x] **Description:**
  - Removed the `ModelType` enum (`VICUNA`, `ZEPHYR`, `GPT`) to align with
    RankLLM's latest implementation; models are now selected by string name,
    as shown in the first sketch below.
  - Replaced the deprecated `chain({"query": ...})` call style with
    `chain.invoke({"query": ...})` to resolve the `Chain.__call__`
    deprecation warning introduced in LangChain 0.1.0 and noted in
    https://github.com/langchain-ai/langchain/pull/29840 (see the second
    sketch below).
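
  A minimal sketch of the updated reranker construction, mirroring the
  notebook cell; the base `retriever` is assumed to have been built earlier
  (e.g. from the notebook's FAISS vector store):

  ```python
  from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
  from langchain_community.document_compressors.rankllm_rerank import RankLLMRerank

  # `model` is now a plain string such as "rank_zephyr" rather than a
  # ModelType enum value.
  compressor = RankLLMRerank(top_n=3, model="rank_zephyr")

  # Wrap an existing base retriever (assumed defined earlier) so retrieved
  # documents are reranked before being returned.
  compression_retriever = ContextualCompressionRetriever(
      base_compressor=compressor, base_retriever=retriever
  )
  ```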
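
  And a sketch of the invocation change, assuming the chain is built with
  `RetrievalQA.from_chain_type` as in the notebook:

  ```python
  from langchain.chains import RetrievalQA
  from langchain_openai import ChatOpenAI

  chain = RetrievalQA.from_chain_type(
      llm=ChatOpenAI(temperature=0), retriever=compression_retriever
  )

  query = "..."  # placeholder query

  # Before (deprecated in LangChain 0.1.0, removed in 1.0):
  # chain({"query": query})

  # After:
  chain.invoke({"query": query})
  ```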

- [x] **Dependencies:** No new dependencies added.  

- [x] **Tests and Docs:**
  - Updated the RankLLM documentation
    (`docs/docs/integrations/document_transformers/rankllm-reranker.ipynb`).
  - Fixed LangChain usage in the related code examples.

- [x] **Lint and Test:**
  - Ran `make format` and `make lint`, and verified functionality after the
    updates.
  - No breaking changes introduced.


---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Author: Brayden Zhong
Date: 2025-03-31 09:50:00 -04:00
Committed by: GitHub
Parent: b4fe1f1ec0
Commit: e4515f308f

docs/docs/integrations/document_transformers/rankllm-reranker.ipynb

@@ -11,39 +11,41 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"[RankLLM](https://github.com/castorini/rank_llm) offers a suite of listwise rerankers, albeit with focus on open source LLMs finetuned for the task - RankVicuna and RankZephyr being two of them."
+"**[RankLLM](https://github.com/castorini/rank_llm)** is a **flexible reranking framework** supporting **listwise, pairwise, and pointwise ranking models**. It includes **RankVicuna, RankZephyr, MonoT5, DuoT5, LiT5, and FirstMistral**, with integration for **FastChat, vLLM, SGLang, and TensorRT-LLM** for efficient inference. RankLLM is optimized for **retrieval and ranking tasks**, leveraging both **open-source LLMs** and proprietary rerankers like **RankGPT and RankGemini**. It supports **batched inference, first-token reranking, and retrieval via BM25 and SPLADE**.\n",
+"\n",
+"> **Note:** If using the built-in retriever, RankLLM requires **Pyserini, JDK 21, PyTorch, and Faiss** for retrieval functionality."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet rank_llm"
+"%pip install --upgrade --quiet rank_llm"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet langchain_openai"
+"%pip install --upgrade --quiet langchain_openai"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
-"%pip install --upgrade --quiet faiss-cpu"
+"%pip install --upgrade --quiet faiss-cpu"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -56,7 +58,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -64,7 +66,7 @@
 "def pretty_print_docs(docs):\n",
 " print(\n",
 " f\"\\n{'-' * 100}\\n\".join(\n",
-" [f\"Document {i+1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]\n",
+" [f\"Document {i + 1}:\\n\\n\" + d.page_content for i, d in enumerate(docs)]\n",
 " )\n",
 " )"
 ]
@@ -79,9 +81,17 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": null,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:28:58,344 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+]
+}
+],
 "source": [
 "from langchain_community.document_loaders import TextLoader\n",
 "from langchain_community.vectorstores import FAISS\n",
@@ -114,14 +124,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 23,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
-"2025-02-17 04:37:08,458 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+"2025-02-22 15:29:00,892 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
 ]
 },
 {
@@ -331,27 +341,41 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Retrieval + Reranking with RankZephyr"
+"RankZephyr performs listwise reranking for improved retrieval quality but requires at least 24GB of VRAM to run efficiently."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 12,
+"execution_count": null,
 "metadata": {},
-"outputs": [],
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Downloading shards: 100%|██████████| 3/3 [00:00<00:00, 2674.37it/s]\n",
+"Loading checkpoint shards: 100%|██████████| 3/3 [01:49<00:00, 36.39s/it]\n"
+]
+}
+],
 "source": [
+"import torch\n",
 "from langchain.retrievers.contextual_compression import ContextualCompressionRetriever\n",
 "from langchain_community.document_compressors.rankllm_rerank import RankLLMRerank\n",
 "\n",
-"compressor = RankLLMRerank(top_n=3, model=\"zephyr\")\n",
+"torch.cuda.empty_cache()\n",
+"\n",
+"compressor = RankLLMRerank(top_n=3, model=\"rank_zephyr\")\n",
 "compression_retriever = ContextualCompressionRetriever(\n",
 " base_compressor=compressor, base_retriever=retriever\n",
-")"
+")\n",
+"\n",
+"del compressor"
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
@@ -386,7 +410,7 @@
 ]
 },
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
 "\n"
@@ -407,7 +431,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 10,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
@@ -432,7 +456,7 @@
 " llm=ChatOpenAI(temperature=0), retriever=compression_retriever\n",
 ")\n",
 "\n",
-"chain({\"query\": query})"
+"chain.invoke({\"query\": query})"
 ]
 },
 {
@@ -451,9 +475,16 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:01:29,469 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n"
+]
+},
+{
 "name": "stdout",
 "output_type": "stream",
@@ -683,7 +714,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -698,9 +729,18 @@
 },
 {
 "cell_type": "code",
-"execution_count": 13,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"2025-02-22 15:01:38,554 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
+" 0%| | 0/1 [00:00<?, ?it/s]2025-02-22 15:01:43,704 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
+"100%|██████████| 1/1 [00:05<00:00, 5.15s/it]"
+]
+},
+{
 "name": "stdout",
 "output_type": "stream",
@@ -727,7 +767,7 @@
 ]
 },
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
 "\n"
@@ -748,15 +788,14 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": null,
 "metadata": {},
 "outputs": [
 {
-"name": "stderr",
+"name": "stdout",
 "output_type": "stream",
 "text": [
-"/tmp/ipykernel_2153001/1437145854.py:10: LangChainDeprecationWarning: The method `Chain.__call__` was deprecated in langchain 0.1.0 and will be removed in 1.0. Use :meth:`~invoke` instead.\n",
-" chain({\"query\": query})\n",
+" chain.invoke({\"query\": query})\n",
 "2025-02-17 04:30:00,016 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
 " 0%| | 0/1 [00:00<?, ?it/s]2025-02-17 04:30:01,649 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
 "100%|██████████| 1/1 [00:01<00:00, 1.63s/it]\n",
@@ -785,13 +824,13 @@
 " llm=ChatOpenAI(temperature=0), retriever=compression_retriever\n",
 ")\n",
 "\n",
-"chain({\"query\": query})"
+"chain.invoke({\"query\": query})"
 ]
 }
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "rankllm",
+"display_name": "Python 3",
 "language": "python",
 "name": "python3"
 },
@@ -805,7 +844,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.10.14"
+"version": "3.13.2"
 }
 },
 "nbformat": 4,