Integrate NLP Cloud embeddings endpoint (#7931)

Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings).

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
2026-01-05 16:06:39 +00:00 · 2023-07-20 00:27:34 +02:00
parent 854a2be0ca
commit 3adab5e5be
3 changed files with 179 additions and 0 deletions
--- a/docs/extras/modules/data_connection/text_embedding/integrations/nlp_cloud.ipynb
+++ b/docs/extras/modules/data_connection/text_embedding/integrations/nlp_cloud.ipynb
@@ -0,0 +1,106 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "6802946f",
+   "metadata": {},
+   "source": [
+    "# NLP Cloud\n",
+    "\n",
+    "NLP Cloud is an artificial intelligence platform that lets you use advanced AI engines and train your own engines on your own data.\n",
+    "\n",
+    "The [embeddings](https://docs.nlpcloud.com/#embeddings) endpoint offers several models:\n",
+    "\n",
+    "* `paraphrase-multilingual-mpnet-base-v2`: Paraphrase Multilingual MPNet Base V2 is a fast model based on Sentence Transformers that is well suited to extracting embeddings in more than 50 languages (see the NLP Cloud documentation for the full list).\n",
+    "\n",
+    "* `gpt-j`: GPT-J returns advanced embeddings. It might return better results than Sentence Transformers-based models (see above), but it is also much slower.\n",
+    "\n",
+    "* `dolphin`: Dolphin returns advanced embeddings. It might return better results than Sentence Transformers-based models (see above), but it is also much slower. It natively understands the following languages: Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, French, German, Hungarian, Italian, Japanese, Polish, Portuguese, Romanian, Russian, Serbian, Slovenian, Spanish, Swedish, and Ukrainian."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "490d7923",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "! pip install nlpcloud"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "6a39ed4b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.embeddings import NLPCloudEmbeddings"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "c105d8cd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "os.environ[\"NLPCLOUD_API_KEY\"] = \"xxx\"\n",
+    "nlpcloud_embd = NLPCloudEmbeddings()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "cca84023",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "text = \"This is a test document.\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "26868d0f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "query_result = nlpcloud_embd.embed_query(text)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "0c171c2f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "doc_result = nlpcloud_embd.embed_documents([text])"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.16"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
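The notebook above stops once `embed_query` and `embed_documents` have returned their vectors. As a rough illustration of what those vectors are typically used for, the sketch below ranks documents against a query by cosine similarity. It assumes only that the embedder exposes the two methods called in the notebook; `Embedder`, `rank_documents`, `cosine_similarity`, and `FakeEmbedder` are hypothetical names introduced here, and `FakeEmbedder` is a keyword-counting stand-in so the example runs without an API key. It does not reproduce NLP Cloud output.

```python
import math
from typing import List, Protocol, Tuple


class Embedder(Protocol):
    """Hypothetical interface: the two methods the notebook calls."""

    def embed_query(self, text: str) -> List[float]: ...

    def embed_documents(self, texts: List[str]) -> List[List[float]]: ...


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(
    embedder: Embedder, query: str, docs: List[str]
) -> List[Tuple[float, str]]:
    """Return (score, document) pairs, most similar to the query first."""
    query_vec = embedder.embed_query(query)
    doc_vecs = embedder.embed_documents(docs)
    scored = [(cosine_similarity(query_vec, v), d) for v, d in zip(doc_vecs, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


class FakeEmbedder:
    """Stand-in embedder (an assumption for this sketch, not a real model).

    Each vector counts occurrences of a few fixed keywords in the text.
    """

    vocab = ["test", "document", "cat"]

    def _embed(self, text: str) -> List[float]:
        lowered = text.lower()
        return [float(lowered.count(word)) for word in self.vocab]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]


if __name__ == "__main__":
    docs = ["A cat sat.", "This is a test document."]
    for score, doc in rank_documents(FakeEmbedder(), "This is a test document.", docs):
        print(f"{score:.3f}  {doc}")
```

With the class from this change, `NLPCloudEmbeddings()` could presumably be passed in place of `FakeEmbedder()`, provided `NLPCLOUD_API_KEY` is set as in the notebook.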