mirror of
https://github.com/hwchase17/langchain.git
synced 2025-08-01 00:49:25 +00:00
experimental[minor]: upgrade the prompt injection model (#20783)
- **Description:** In January, Laiyer.ai became part of ProtectAI, which means the model is now owned by ProtectAI. In addition, yesterday we released a new version of the model addressing issues that the LangChain community and others reported to us about false positives. The new model has better accuracy compared to the previous version, and we thought the LangChain community would benefit from using the [latest version of the model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2). - **Issue:** N/A - **Dependencies:** N/A - **Twitter handle:** @alex_yaremchuk
This commit is contained in:
parent
645b1e142e
commit
9428923bab
File diff suppressed because one or more lines are too long
@@ -9,7 +9,7 @@
|
||||
"\n",
|
||||
"This notebook shows how to prevent prompt injection attacks using the text classification model from `HuggingFace`.\n",
|
||||
"\n",
|
||||
"By default, it uses a *[laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection)* model trained to identify prompt injections. \n",
|
||||
"By default, it uses a *[protectai/deberta-v3-base-prompt-injection-v2](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2)* model trained to identify prompt injections. \n",
|
||||
"\n",
|
||||
"In this notebook, we will use the ONNX version of the model to speed up the inference. "
|
||||
]
|
||||
@@ -49,11 +49,15 @@
|
||||
"from optimum.onnxruntime import ORTModelForSequenceClassification\n",
|
||||
"from transformers import AutoTokenizer, pipeline\n",
|
||||
"\n",
|
||||
"# Using https://huggingface.co/laiyer/deberta-v3-base-prompt-injection\n",
|
||||
"model_path = \"laiyer/deberta-v3-base-prompt-injection\"\n",
|
||||
"tokenizer = AutoTokenizer.from_pretrained(model_path)\n",
|
||||
"tokenizer.model_input_names = [\"input_ids\", \"attention_mask\"] # Hack to run the model\n",
|
||||
"model = ORTModelForSequenceClassification.from_pretrained(model_path, subfolder=\"onnx\")\n",
|
||||
"# Using https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2\n",
|
||||
"model_path = \"protectai/deberta-v3-base-prompt-injection-v2\"\n",
|
||||
"revision = None  # We recommend specifying the revision to avoid breaking changes or supply chain attacks\n",
|
||||
"tokenizer = AutoTokenizer.from_pretrained(\n",
|
||||
" model_path, revision=revision, model_input_names=[\"input_ids\", \"attention_mask\"]\n",
|
||||
")\n",
|
||||
"model = ORTModelForSequenceClassification.from_pretrained(\n",
|
||||
" model_path, revision=revision, subfolder=\"onnx\"\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"classifier = pipeline(\n",
|
||||
" \"text-classification\",\n",
|
||||
|
@@ -23,7 +23,7 @@ class PromptInjectionException(ValueError):
|
||||
|
||||
|
||||
def _model_default_factory(
|
||||
model_name: str = "laiyer/deberta-v3-base-prompt-injection",
|
||||
model_name: str = "protectai/deberta-v3-base-prompt-injection-v2",
|
||||
) -> Pipeline:
|
||||
try:
|
||||
from transformers import (
|
||||
@@ -64,7 +64,7 @@ class HuggingFaceInjectionIdentifier(BaseTool):
|
||||
|
||||
Can be specified as transformers Pipeline or string. String should correspond to the
|
||||
model name of a text-classification transformers model. Defaults to
|
||||
``laiyer/deberta-v3-base-prompt-injection`` model.
|
||||
``protectai/deberta-v3-base-prompt-injection-v2`` model.
|
||||
"""
|
||||
threshold: float = Field(
|
||||
description="Threshold for prompt injection detection.", default=0.5
|
||||
|
Loading…
Reference in New Issue
Block a user