langchain/docs/versioned_docs/version-0.2.x/integrations/text_embedding/elasticsearch.ipynb
Jacob Lee aff771923a Jacob/new docs (#20570)
2024-04-18 11:10:55 -07:00


{
"cells": [
{
"cell_type": "markdown",
"id": "72644940",
"metadata": {
"id": "1eZl1oaVUNeC"
},
"source": [
"# Elasticsearch\n",
"This notebook walks through generating embeddings with an embedding model hosted in Elasticsearch.\n",
"\n",
"The easiest way to instantiate the `ElasticsearchEmbeddings` class is one of the following:\n",
"- the `from_credentials` constructor, if you are using Elastic Cloud\n",
"- the `from_es_connection` constructor, with any Elasticsearch cluster"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "298759cb",
"metadata": {
"id": "6dJxqebov4eU"
},
"outputs": [],
"source": [
"!pip -q install langchain-elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "76489aff",
"metadata": {
"id": "RV7C3DUmv4aq"
},
"outputs": [],
"source": [
"from langchain_elasticsearch import ElasticsearchEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "57bfdc82",
"metadata": {
"id": "MrT3jplJvp09"
},
"outputs": [],
"source": [
"# ID of the embedding model hosted in Elasticsearch (replace the placeholder with your own)\n",
"model_id = \"your_model_id\""
]
},
{
"cell_type": "markdown",
"id": "0ffad1ec",
"metadata": {
"id": "j5F-nwLVS_Zu"
},
"source": [
"## Testing with `from_credentials`\n",
"This requires an Elastic Cloud `cloud_id`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc2e9dcb",
"metadata": {
"id": "svtdnC-dvpxR"
},
"outputs": [],
"source": [
"# Instantiate ElasticsearchEmbeddings using credentials\n",
"embeddings = ElasticsearchEmbeddings.from_credentials(\n",
"    model_id,\n",
"    es_cloud_id=\"your_cloud_id\",\n",
"    es_user=\"your_user\",\n",
"    es_password=\"your_password\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ee7f1fc",
"metadata": {
"id": "7DXZAK7Kvpth"
},
"outputs": [],
"source": [
"# Create embeddings for multiple documents\n",
"documents = [\n",
"    \"This is an example document.\",\n",
"    \"Another example document to generate embeddings for.\",\n",
"]\n",
"document_embeddings = embeddings.embed_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0b9d8471",
"metadata": {
"id": "K8ra75W_vpqy"
},
"outputs": [],
"source": [
"# Print document embeddings\n",
"for i, embedding in enumerate(document_embeddings):\n",
"    print(f\"Embedding for document {i+1}: {embedding}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3989ab23",
"metadata": {
"id": "V4Q5kQo9vpna"
},
"outputs": [],
"source": [
"# Create an embedding for a single query\n",
"query = \"This is a single query.\"\n",
"query_embedding = embeddings.embed_query(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0da6d2bf",
"metadata": {
"id": "O0oQDzGKvpkz"
},
"outputs": [],
"source": [
"# Print query embedding\n",
"print(f\"Embedding for query: {query_embedding}\")"
]
},
{
"cell_type": "markdown",
"id": "32700096",
"metadata": {
"id": "rHN03yV6TJ5q"
},
"source": [
"## Testing with an existing Elasticsearch client connection\n",
"This can be used with any Elasticsearch deployment."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0bc60465",
"metadata": {
"id": "GMQcJDwBTJFm"
},
"outputs": [],
"source": [
"# Create Elasticsearch connection\n",
"from elasticsearch import Elasticsearch\n",
"\n",
"es_connection = Elasticsearch(\n",
"    hosts=[\"https://es_cluster_url:port\"], basic_auth=(\"user\", \"password\")\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8085843b",
"metadata": {
"id": "WTYIU4u3TJO1"
},
"outputs": [],
"source": [
"# Instantiate ElasticsearchEmbeddings using es_connection\n",
"embeddings = ElasticsearchEmbeddings.from_es_connection(\n",
"    model_id,\n",
"    es_connection,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "59a90bf3",
"metadata": {
"id": "4gdAUHwoTJO3"
},
"outputs": [],
"source": [
"# Create embeddings for multiple documents\n",
"documents = [\n",
"    \"This is an example document.\",\n",
"    \"Another example document to generate embeddings for.\",\n",
"]\n",
"document_embeddings = embeddings.embed_documents(documents)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54b18673",
"metadata": {
"id": "RC_-tov6TJO3"
},
"outputs": [],
"source": [
"# Print document embeddings\n",
"for i, embedding in enumerate(document_embeddings):\n",
"    print(f\"Embedding for document {i+1}: {embedding}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a4812d5e",
"metadata": {
"id": "6GEnHBqETJO3"
},
"outputs": [],
"source": [
"# Create an embedding for a single query\n",
"query = \"This is a single query.\"\n",
"query_embedding = embeddings.embed_query(query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c6c69916",
"metadata": {
"id": "-kyUQAXDTJO4"
},
"outputs": [],
"source": [
"# Print query embedding\n",
"print(f\"Embedding for query: {query_embedding}\")"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}