mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-30 11:39:03 +00:00
docs: integrations/embeddings consistency (#10302)
Updated `integrations/embeddings`: fixed titles; added links and descriptions. Updated `integrations/providers` accordingly.
This commit is contained in:
parent
1b3ea1eeb4
commit
fdba711d28
@@ -2216,6 +2216,10 @@
     "source": "/docs/modules/data_connection/text_embedding/integrations/tensorflowhub",
     "destination": "/docs/integrations/text_embedding/tensorflowhub"
   },
+  {
+    "source": "/docs/integrations/text_embedding/Awa",
+    "destination": "/docs/integrations/text_embedding/awadb"
+  },
   {
     "source": "/en/latest/modules/indexes/vectorstores/examples/analyticdb.html",
     "destination": "/docs/integrations/vectorstores/analyticdb"
@@ -9,13 +9,20 @@ pip install awadb
 ```
 
 
-## VectorStore
+## Vector Store
 
-There exists a wrapper around AwaDB vector databases, allowing you to use it as a vectorstore,
-whether for semantic search or example selection.
 
 ```python
 from langchain.vectorstores import AwaDB
 ```
 
-For a more detailed walkthrough of the AwaDB wrapper, see [here](/docs/integrations/vectorstores/awadb.html).
+See a [usage example](/docs/integrations/vectorstores/awadb).
 
 
+## Text Embedding Model
+
+```python
+from langchain.embeddings import AwaEmbeddings
+```
+
+See a [usage example](/docs/integrations/text_embedding/awadb).
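The provider pages changed in this commit all expose the same two-method embedding interface (`embed_documents` for a batch of texts, `embed_query` for a single text). As a rough sketch of that pattern — `ToyEmbeddings` below is a hypothetical, hash-based stand-in, not a real LangChain class; real wrappers such as `AwaEmbeddings` call the underlying model instead:

```python
import hashlib
from typing import List


class ToyEmbeddings:
    """Hypothetical stand-in for the interface that LangChain embedding
    wrappers (e.g. AwaEmbeddings) expose: embed_documents + embed_query."""

    dim = 8  # real models return hundreds of dimensions

    def _embed(self, text: str) -> List[float]:
        # Deterministic pseudo-embedding derived from a hash; a real
        # wrapper would call the underlying embedding model here.
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255.0 for b in digest[: self.dim]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


emb = ToyEmbeddings()
vectors = emb.embed_documents(["first doc", "second doc"])
query_vec = emb.embed_query("first doc")
print(len(vectors), len(query_vec))  # → 2 8
```

Because the interface is uniform, swapping one provider for another usually means changing only the import and constructor.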
@@ -1,20 +1,24 @@
 # ModelScope
 
+>[ModelScope](https://www.modelscope.cn/home) is a big repository of the models and datasets.
+
 This page covers how to use the modelscope ecosystem within LangChain.
 It is broken into two parts: installation and setup, and then references to specific modelscope wrappers.
 
 ## Installation and Setup
 
-* Install the Python SDK with `pip install modelscope`
+Install the `modelscope` package.
+
+```bash
+pip install modelscope
+```
 
-## Wrappers
-
-### Embeddings
+## Text Embedding Models
 
-There exists a modelscope Embeddings wrapper, which you can access with
-
 ```python
 from langchain.embeddings import ModelScopeEmbeddings
 ```
 
-For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/modelscope_hub.html)
+For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/modelscope_hub)
@@ -1,17 +1,31 @@
 # NLPCloud
 
-This page covers how to use the NLPCloud ecosystem within LangChain.
-It is broken into two parts: installation and setup, and then references to specific NLPCloud wrappers.
+>[NLP Cloud](https://docs.nlpcloud.com/#introduction) is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data.
 
 ## Installation and Setup
-- Install the Python SDK with `pip install nlpcloud`
+
+- Install the `nlpcloud` package.
+
+```bash
+pip install nlpcloud
+```
+
 - Get an NLPCloud api key and set it as an environment variable (`NLPCLOUD_API_KEY`)
 
-## Wrappers
-
-### LLM
+## LLM
+
+See a [usage example](/docs/integrations/llms/nlpcloud).
 
-There exists an NLPCloud LLM wrapper, which you can access with
 ```python
 from langchain.llms import NLPCloud
 ```
+
+## Text Embedding Models
+
+See a [usage example](/docs/integrations/text_embedding/nlp_cloud)
+
+```python
+from langchain.embeddings import NLPCloudEmbeddings
+```
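The setup step above asks for the key in an environment variable named `NLPCLOUD_API_KEY`. A minimal sketch of the standard pattern (the key value here is a placeholder, not a real credential):

```python
import os

# Set the key for the current process only; in practice you would export
# NLPCLOUD_API_KEY in your shell or load it from a secrets manager rather
# than hardcoding it in source.
os.environ.setdefault("NLPCLOUD_API_KEY", "your-api-key")  # placeholder value

api_key = os.environ["NLPCLOUD_API_KEY"]
print(api_key is not None)  # → True
```

The LangChain wrapper then reads this variable at construction time, so no key needs to be passed in code.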
@@ -18,3 +18,11 @@ See a [usage example](/docs/modules/data_connection/document_transformers/text_s
 ```python
 from langchain.text_splitter import SpacyTextSplitter
 ```
+
+## Text Embedding Models
+
+See a [usage example](/docs/integrations/text_embedding/spacy_embedding)
+
+```python
+from langchain.embeddings.spacy_embeddings import SpacyEmbeddings
+```
@@ -5,9 +5,11 @@
    "id": "b14a24db",
    "metadata": {},
    "source": [
-    "# AwaEmbedding\n",
+    "# AwaDB\n",
     "\n",
-    "This notebook explains how to use AwaEmbedding, which is included in [awadb](https://github.com/awa-ai/awadb), to embedding texts in langchain."
+    ">[AwaDB](https://github.com/awa-ai/awadb) is an AI Native database for the search and storage of embedding vectors used by LLM Applications.\n",
+    "\n",
+    "This notebook explains how to use `AwaEmbeddings` in LangChain."
    ]
   },
   {
@@ -101,7 +103,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.4"
+   "version": "3.10.12"
   }
  },
 "nbformat": 4,
@@ -5,7 +5,9 @@
    "id": "75e378f5-55d7-44b6-8e2e-6d7b8b171ec4",
    "metadata": {},
    "source": [
-    "# Bedrock Embeddings"
+    "# Bedrock\n",
+    "\n",
+    ">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case.\n"
    ]
   },
   {
@@ -91,7 +93,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.13"
+   "version": "3.10.12"
   }
  },
 "nbformat": 4,
@@ -5,26 +5,29 @@
    "id": "719619d3",
    "metadata": {},
    "source": [
-    "# BGE Hugging Face Embeddings\n",
+    "# BGE on Hugging Face\n",
     "\n",
-    "This notebook shows how to use BGE Embeddings through Hugging Face"
+    ">[BGE models on the HuggingFace](https://huggingface.co/BAAI/bge-large-en) are [the best open-source embedding models](https://huggingface.co/spaces/mteb/leaderboard).\n",
+    ">BGE model is created by the [Beijing Academy of Artificial Intelligence (BAAI)](https://www.baai.ac.cn/english.html). `BAAI` is a private non-profit organization engaged in AI research and development.\n",
+    "\n",
+    "This notebook shows how to use `BGE Embeddings` through `Hugging Face`"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "id": "f7a54279",
    "metadata": {
     "scrolled": true
    },
    "outputs": [],
    "source": [
-    "# !pip install sentence_transformers"
+    "#!pip install sentence_transformers"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
    "id": "9e1d5b6b",
    "metadata": {},
    "outputs": [],
@@ -43,12 +46,24 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 5,
    "id": "e59d1a89",
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "384"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
-    "embedding = hf.embed_query(\"hi this is harrison\")"
+    "embedding = hf.embed_query(\"hi this is harrison\")\n",
+    "len(embedding)"
    ]
   },
   {
@@ -76,7 +91,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.10.12"
   }
  },
 "nbformat": 4,
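The updated cell prints the embedding's length (384 for this BGE model). A common next step with such vectors is comparing them; a minimal, dependency-free cosine-similarity sketch (the short vectors below are made up for illustration, not real BGE output):

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Made-up 4-dimensional vectors; real BGE embeddings have 384 dimensions.
v1 = [0.1, 0.2, 0.3, 0.4]
v2 = [0.1, 0.2, 0.3, 0.4]
v3 = [-0.4, 0.3, -0.2, 0.1]

print(round(cosine_similarity(v1, v2), 3))  # → 1.0 (identical vectors)
print(round(cosine_similarity(v1, v3), 3))  # → 0.0 (orthogonal vectors)
```

Cosine similarity is the metric most vector stores in these integration pages use by default, which is why embeddings only need to be length-consistent, not normalized, before comparison.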
@@ -1,13 +1,14 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Google Cloud Platform Vertex AI PaLM \n",
+    "# Google Vertex AI PaLM \n",
     "\n",
-    "Note: This is seperate from the Google PaLM integration, it exposes [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) on Google Cloud. \n",
+    ">[Vertex AI PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/overview) is a service on Google Cloud exposing the embedding models. \n",
+    "\n",
+    "Note: This integration is seperate from the Google PaLM integration.\n",
     "\n",
     "By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud`s AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
     "\n",
@@ -96,7 +97,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.12"
   },
   "vscode": {
    "interpreter": {
@@ -1,12 +1,13 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "# ModelScope\n",
     "\n",
+    ">[ModelScope](https://www.modelscope.cn/home) is big repository of the models and datasets.\n",
+    "\n",
     "Let's load the ModelScope Embedding class."
    ]
   },
@@ -67,16 +68,23 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "chatgpt",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
   "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
    "name": "python",
-   "version": "3.9.15"
-  },
-  "orig_nbformat": 4
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
  },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
@@ -1,15 +1,14 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# MosaicML embeddings\n",
+    "# MosaicML\n",
     "\n",
-    "[MosaicML](https://docs.mosaicml.com/en/latest/inference.html) offers a managed inference service. You can either use a variety of open source models, or deploy your own.\n",
+    ">[MosaicML](https://docs.mosaicml.com/en/latest/inference.html) offers a managed inference service. You can either use a variety of open source models, or deploy your own.\n",
     "\n",
-    "This example goes over how to use LangChain to interact with MosaicML Inference for text embedding."
+    "This example goes over how to use LangChain to interact with `MosaicML` Inference for text embedding."
    ]
   },
   {
@@ -94,6 +93,11 @@
   }
  ],
  "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
   "language_info": {
    "codemirror_mode": {
     "name": "ipython",
@@ -103,9 +107,10 @@
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
-  "pygments_lexer": "ipython3"
+  "pygments_lexer": "ipython3",
+  "version": "3.10.12"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
@@ -7,7 +7,7 @@
    "source": [
     "# NLP Cloud\n",
     "\n",
-    "NLP Cloud is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data. \n",
+    ">[NLP Cloud](https://docs.nlpcloud.com/#introduction) is an artificial intelligence platform that allows you to use the most advanced AI engines, and even train your own engines with your own data. \n",
     "\n",
     "The [embeddings](https://docs.nlpcloud.com/#embeddings) endpoint offers the following model:\n",
     "\n",
@@ -80,7 +80,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3.11.2 64-bit",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -94,7 +94,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.2"
+   "version": "3.10.12"
   },
   "vscode": {
    "interpreter": {
@@ -5,11 +5,13 @@
    "id": "1f83f273",
    "metadata": {},
    "source": [
-    "# SageMaker Endpoint Embeddings\n",
+    "# SageMaker\n",
     "\n",
-    "Let's load the SageMaker Endpoints Embeddings class. The class can be used if you host, e.g. your own Hugging Face model on SageMaker.\n",
+    "Let's load the `SageMaker Endpoints Embeddings` class. The class can be used if you host, e.g. your own Hugging Face model on SageMaker.\n",
     "\n",
-    "For instructions on how to do this, please see [here](https://www.philschmid.de/custom-inference-huggingface-sagemaker). **Note**: In order to handle batched requests, you will need to adjust the return line in the `predict_fn()` function within the custom `inference.py` script:\n",
+    "For instructions on how to do this, please see [here](https://www.philschmid.de/custom-inference-huggingface-sagemaker). \n",
+    "\n",
+    "**Note**: In order to handle batched requests, you will need to adjust the return line in the `predict_fn()` function within the custom `inference.py` script:\n",
     "\n",
     "Change from\n",
     "\n",
@@ -143,7 +145,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.12"
   },
   "vscode": {
    "interpreter": {
@@ -5,8 +5,8 @@
    "id": "eec4efda",
    "metadata": {},
    "source": [
-    "# Self Hosted Embeddings\n",
-    "Let's load the SelfHostedEmbeddings, SelfHostedHuggingFaceEmbeddings, and SelfHostedHuggingFaceInstructEmbeddings classes."
+    "# Self Hosted\n",
+    "Let's load the `SelfHostedEmbeddings`, `SelfHostedHuggingFaceEmbeddings`, and `SelfHostedHuggingFaceInstructEmbeddings` classes."
    ]
   },
   {
@@ -149,9 +149,7 @@
    "cell_type": "code",
    "execution_count": null,
    "id": "fc1bfd0f",
-   "metadata": {
-    "scrolled": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "query_result = embeddings.embed_query(text)"
@@ -182,7 +180,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.12"
   },
   "vscode": {
    "interpreter": {
@@ -1,16 +1,15 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "id": "ed47bb62",
    "metadata": {},
    "source": [
-    "# Sentence Transformers Embeddings\n",
+    "# Sentence Transformers\n",
     "\n",
-    "[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
+    ">[SentenceTransformers](https://www.sbert.net/) embeddings are called using the `HuggingFaceEmbeddings` integration. We have also added an alias for `SentenceTransformerEmbeddings` for users who are more familiar with directly using that package.\n",
     "\n",
-    "SentenceTransformers is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
+    "`SentenceTransformers` is a python package that can generate text and image embeddings, originating from [Sentence-BERT](https://arxiv.org/abs/1908.10084)"
    ]
   },
   {
@@ -109,7 +108,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.16"
+   "version": "3.10.12"
   },
   "vscode": {
    "interpreter": {
@@ -1,21 +1,31 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Spacy Embedding\n",
+    "# SpaCy\n",
     "\n",
-    "### Loading the Spacy embedding class to generate and query embeddings"
+    ">[spaCy](https://spacy.io/) is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.\n",
+    " \n",
+    "\n",
+    "## Installation and Setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "#!pip install spacy"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Import the necessary classes"
+    "Import the necessary classes"
    ]
   },
   {
@@ -28,11 +38,12 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Initialize SpacyEmbeddings.This will load the Spacy model into memory."
+    "## Example\n",
+    "\n",
+    "Initialize SpacyEmbeddings.This will load the Spacy model into memory."
    ]
   },
   {
@@ -45,11 +56,10 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Define some example texts . These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
+    "Define some example texts . These could be any documents that you want to analyze - for example, news articles, social media posts, or product reviews."
    ]
   },
   {
@@ -67,11 +77,10 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Generate and print embeddings for the texts . The SpacyEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
+    "Generate and print embeddings for the texts . The SpacyEmbeddings class generates an embedding for each document, which is a numerical representation of the document's content. These embeddings can be used for various natural language processing tasks, such as document similarity comparison or text classification."
    ]
   },
   {
@@ -86,11 +95,10 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
+    "Generate and print an embedding for a single piece of text. You can also generate an embedding for a single piece of text, such as a search query. This can be useful for tasks like information retrieval, where you want to find documents that are similar to a given query."
    ]
   },
   {
@@ -106,11 +114,24 @@
   }
  ],
  "metadata": {
-  "language_info": {
-   "name": "python"
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
   },
-  "orig_nbformat": 4
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
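The SpaCy notebook embeds several documents and then a single query, and notes that this supports information retrieval. That retrieval step reduces to ranking documents by vector similarity to the query; a minimal stdlib-only sketch with made-up vectors (real vectors would come from `SpacyEmbeddings.embed_documents` and `embed_query`):

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))


# Made-up document embeddings; in the notebook these would be the output
# of SpacyEmbeddings.embed_documents(texts).
doc_vectors = {
    "news article": [0.9, 0.1, 0.0],
    "social media post": [0.1, 0.9, 0.1],
    "product review": [0.0, 0.2, 0.9],
}

# Made-up query embedding; in the notebook, SpacyEmbeddings.embed_query(query).
query_vec = [0.8, 0.2, 0.1]

# Rank documents by similarity to the query, best match first.
ranked = sorted(doc_vectors, key=lambda name: cosine(doc_vectors[name], query_vec), reverse=True)
print(ranked[0])  # → news article
```

Vector stores such as the ones elsewhere in this commit perform exactly this ranking, only with an index instead of a linear scan.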