Integrate NLP Cloud embeddings endpoint (#7931)

Add embeddings for [NLP Cloud](https://docs.nlpcloud.com/#embeddings).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
This commit is contained in:
Julien Salinas
2023-07-20 00:27:34 +02:00
committed by GitHub
parent 854a2be0ca
commit 3adab5e5be
3 changed files with 179 additions and 0 deletions

@@ -0,0 +1,106 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6802946f",
"metadata": {},
"source": [
"# NLP Cloud\n",
"\n",
"NLP Cloud is an artificial intelligence platform that lets you use advanced AI engines and train your own engines on your own data.\n",
"\n",
"The [embeddings](https://docs.nlpcloud.com/#embeddings) endpoint offers several models:\n",
"\n",
"* `paraphrase-multilingual-mpnet-base-v2`: Paraphrase Multilingual MPNet Base V2 is a fast model based on Sentence Transformers that is well suited to extracting embeddings in more than 50 languages (the full list is in the NLP Cloud documentation linked above).\n",
"\n",
"* `gpt-j`: GPT-J returns advanced embeddings. It may return better results than the Sentence Transformers-based model above, but it is also much slower.\n",
"\n",
"* `dolphin`: Dolphin returns advanced embeddings. It may return better results than the Sentence Transformers-based model above, but it is also much slower. It natively understands the following languages: Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, French, German, Hungarian, Italian, Japanese, Polish, Portuguese, Romanian, Russian, Serbian, Slovenian, Spanish, Swedish, and Ukrainian."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "490d7923",
"metadata": {},
"outputs": [],
"source": [
"! pip install nlpcloud"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6a39ed4b",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import NLPCloudEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c105d8cd",
"metadata": {},
"outputs": [],
"source": [
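"import os\n",
"\n",
"# The API key is supplied through the NLPCLOUD_API_KEY environment variable.\n",
"# Replace \"xxx\" with your own key.\n",
"os.environ[\"NLPCLOUD_API_KEY\"] = \"xxx\"\n",
"nlpcloud_embd = NLPCloudEmbeddings()"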
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "cca84023",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "26868d0f",
"metadata": {},
"outputs": [],
"source": [
"# Embed a single query string.\n",
"query_result = nlpcloud_embd.embed_query(text)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0c171c2f",
"metadata": {},
"outputs": [],
"source": [
"# Embed a list of documents; the result holds one vector per input text.\n",
"doc_result = nlpcloud_embd.embed_documents([text])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
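
The notebook above stops once `query_result` and `doc_result` have been computed. As a minimal sketch of one common next step, the snippet below ranks document vectors by cosine similarity to a query vector. It assumes `embed_query` yields a flat list of floats and `embed_documents` a list of such lists. The vectors used here are small hypothetical stand-ins, not real NLP Cloud output, and the helper names `cosine_similarity` and `rank_documents` are illustrative rather than part of LangChain or the `nlpcloud` client.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Hypothetical stand-ins for the notebook's query_result and doc_result.
# Real embeddings would have hundreds or thousands of dimensions.
query_result = [1.0, 0.0, 1.0]
doc_result = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 1.0],  # identical to the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_documents(query_result, doc_result)
print(ranking)
```

With real embeddings, the same two helpers could be applied directly to the values returned by `embed_query` and `embed_documents`, provided both calls use the same model so the vectors share a dimension.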