{
"cells": [
{
"cell_type": "markdown",
"id": "96ff9e912bfe9d8",
"metadata": {
"collapsed": false
},
"source": [
"# VikingDB\n",
"\n",
">[VikingDB](https://www.volcengine.com/docs/6459/1163946) is a database that stores, indexes, and manages massive embedding vectors generated by deep neural networks and other machine learning (ML) models.\n",
"\n",
"This notebook shows how to use functionality related to the VikingDB vector database.\n",
"\n",
"To run this notebook, you need a [VikingDB instance up and running](https://www.volcengine.com/docs/6459/1165058).\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dd771e02d8a93a0",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%pip install --upgrade volcengine langchain-community langchain-openai langchain-text-splitters"
]
},
{
"cell_type": "markdown",
"id": "12719205caed0d18",
"metadata": {
"collapsed": false
},
"source": [
"This example uses `OpenAIEmbeddings` to embed the documents, so you need an OpenAI API key."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "fbfb32665b4a3640",
"metadata": {
"ExecuteTime": {
"end_time": "2023-12-21T09:53:24.186916Z",
"start_time": "2023-12-21T09:53:24.179524Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d8c983d329237fa4",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.vectorstores.vikingdb import VikingDB, VikingDBConfig\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1a4aea2eaeb2261",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"loader = TextLoader(\"./test.txt\")\n",
"documents = loader.load()\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=10, chunk_overlap=0)\n",
"docs = text_splitter.split_documents(documents)\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bfd593f3deabfaf8",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"db = VikingDB.from_documents(\n",
"    docs,\n",
"    embeddings,\n",
"    connection_args=VikingDBConfig(\n",
"        host=\"host\", region=\"region\", ak=\"ak\", sk=\"sk\", scheme=\"http\"\n",
"    ),\n",
"    drop_old=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "50e6ee12ca7eec39",
"metadata": {
"ExecuteTime": {
"end_time": "2023-12-21T10:01:47.355894Z",
"start_time": "2023-12-21T10:01:47.334789Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b6b81f5995c79ef0",
"metadata": {
"ExecuteTime": {
"end_time": "2023-12-21T10:01:47.771478Z",
"start_time": "2023-12-21T10:01:47.731485Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"docs[0].page_content"
]
},
{
"cell_type": "markdown",
"id": "a2d932c1290478ee",
"metadata": {
"collapsed": false
},
"source": [
"### Compartmentalize the data with VikingDB collections\n",
"\n",
"You can store unrelated documents in separate collections within the same VikingDB instance to keep their contexts apart."
]
},
{
"cell_type": "markdown",
"id": "907de4eb10626d2a",
"metadata": {
"collapsed": false
},
"source": [
"Here is how to create a new collection:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4f5a59ba40f7985f",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"db = VikingDB.from_documents(\n",
"    docs,\n",
"    embeddings,\n",
"    connection_args=VikingDBConfig(\n",
"        host=\"host\", region=\"region\", ak=\"ak\", sk=\"sk\", scheme=\"http\"\n",
"    ),\n",
"    collection_name=\"collection_1\",\n",
"    drop_old=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "7c8eada37b17d992",
"metadata": {
"collapsed": false
},
"source": [
"And here is how to retrieve that stored collection:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "883ec678d47c9adc",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
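"# Connect to the existing collection through the constructor;\n",
"# from_documents would require documents and insert them again.\n",
"db = VikingDB(\n",
"    embeddings,\n",
"    connection_args=VikingDBConfig(\n",
"        host=\"host\", region=\"region\", ak=\"ak\", sk=\"sk\", scheme=\"http\"\n",
"    ),\n",
"    collection_name=\"collection_1\",\n",
")"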
]
},
{
"cell_type": "markdown",
"id": "2f0be30cfe70083d",
"metadata": {
"collapsed": false
},
"source": [
"After retrieval, you can query the collection as usual."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}