{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "vm8vn9t8DvC_" }, "source": [ "# AstraDB" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DataStax [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) is a serverless vector-capable database built on Cassandra and made conveniently available through an easy-to-use JSON API." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "5WjXERXzFEhg" }, "source": [ "## Overview" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "juAmbgoWD17u" }, "source": [ "The AstraDB Document Loader returns a list of Langchain Documents from an AstraDB database.\n", "\n", "The Loader takes the following parameters:\n", "\n", "* `api_endpoint`: AstraDB API endpoint. Looks like `https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com`\n", "* `token`: AstraDB token. Looks like `AstraCS:6gBhNmsk135....`\n", "* `collection_name` : AstraDB collection name\n", "* `namespace`: (Optional) AstraDB namespace\n", "* `filter_criteria`: (Optional) Filter used in the find query\n", "* `projection`: (Optional) Projection used in the find query\n", "* `find_options`: (Optional) Options used in the find query\n", "* `nb_prefetched`: (Optional) Number of documents pre-fetched by the loader\n", "* `extraction_function`: (Optional) A function to convert the AstraDB document to the LangChain `page_content` string. Defaults to `json.dumps`\n", "\n", "The following metadata is set to the LangChain Documents metadata output:\n", "\n", "```python\n", "{\n", " metadata : {\n", " \"namespace\": \"...\", \n", " \"api_endpoint\": \"...\", \n", " \"collection\": \"...\"\n", " }\n", "}\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Load documents with the Document Loader" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain_community.document_loaders import AstraDBLoader" ] }, { "cell_type": "code", "outputs": [], "source": [ "from getpass import getpass\n", "\n", "ASTRA_DB_API_ENDPOINT = input(\"ASTRA_DB_API_ENDPOINT = \")\n", "ASTRA_DB_APPLICATION_TOKEN = getpass(\"ASTRA_DB_APPLICATION_TOKEN = \")" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2024-01-08T12:41:22.643335Z", "start_time": "2024-01-08T12:40:57.759116Z" } }, "execution_count": 4 }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2024-01-08T12:42:25.395162Z", "start_time": "2024-01-08T12:42:25.391387Z" } }, "outputs": [], "source": [ "loader = AstraDBLoader(\n", " api_endpoint=ASTRA_DB_API_ENDPOINT,\n", " token=ASTRA_DB_APPLICATION_TOKEN,\n", " collection_name=\"movie_reviews\",\n", " projection={\"title\": 1, \"reviewtext\": 1},\n", " find_options={\"limit\": 10},\n", ")" ] }, { "cell_type": "code", "outputs": [], "source": [ "docs = loader.load()" ], "metadata": { "collapsed": false, "ExecuteTime": { "end_time": "2024-01-08T12:42:30.236489Z", "start_time": "2024-01-08T12:42:29.612133Z" } }, "execution_count": 7 }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2024-01-08T12:42:31.369394Z", "start_time": "2024-01-08T12:42:31.359003Z" } }, "outputs": [ { "data": { "text/plain": "Document(page_content='{\"_id\": \"659bdffa16cbc4586b11a423\", \"title\": \"Dangerous Men\", \"reviewtext\": \"\\\\\"Dangerous Men,\\\\\" the picture\\'s production notes inform, took 26 years to reach the big screen. After having seen it, I wonder: What was the rush?\"}', metadata={'namespace': 'default_keyspace', 'api_endpoint': 'https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com', 'collection': 'movie_reviews'})" }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "docs[0]" ] } ], "metadata": { "colab": { "collapsed_sections": [ "5WjXERXzFEhg" ], "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 4 }