{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Intel® Extension for Transformers Quantized Text Embeddings\n", "\n", "Load quantized BGE embedding models generated by [Intel® Extension for Transformers](https://github.com/intel/intel-extension-for-transformers) (ITREX) and run them with the ITREX [Neural Engine](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/deprecated/docs/Installation.md), a high-performance NLP backend that accelerates model inference without compromising accuracy.\n", "\n", "For more details, see our blog post [Efficient Natural Language Embedding Models with Intel Extension for Transformers](https://medium.com/intel-analytics-software/efficient-natural-language-embedding-models-with-intel-extension-for-transformers-2b6fcd0f8f34) and the [BGE optimization example](https://github.com/intel/intel-extension-for-transformers/tree/main/examples/huggingface/pytorch/text-embedding/deployment/mteb/bge)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/yuwenzho/.conda/envs/bge/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n", "2024-03-04 10:17:17 [INFO] Start to extarct onnx model ops...\n", "2024-03-04 10:17:17 [INFO] Extract onnxruntime model done...\n", "2024-03-04 10:17:17 [INFO] Start to implement Sub-Graph matching and replacing...\n", "2024-03-04 10:17:18 [INFO] Sub-Graph match and replace done...\n" ] } ], "source": [ "from langchain_community.embeddings import QuantizedBgeEmbeddings\n", "\n", "model_name = \"Intel/bge-small-en-v1.5-sts-int8-static-inc\"\n", "encode_kwargs = {\"normalize_embeddings\": True} # set True to compute cosine similarity\n", "\n", "model = QuantizedBgeEmbeddings(\n", " model_name=model_name,\n", " encode_kwargs=encode_kwargs,\n", " query_instruction=\"Represent this sentence for searching relevant passages: \",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Usage" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "text = \"This is a test document.\"\n", "query_result = model.embed_query(text)\n", "doc_result = model.embed_documents([text])" ] } ], "metadata": { "kernelspec": { "display_name": "yuwen", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 2 }