Add "Astra DB" vector store integration (#12966)

# Astra DB Vector store integration

- **Description:** This PR adds a `VectorStore` implementation for DataStax Astra DB using its HTTP API
- **Issue:** (no related issue)
- **Dependencies:** A new required dependency is `astrapy` (`>=0.5.3`), which was added to pyproject.toml as optional, as per guidelines
- **Tag maintainer:** I recently mentioned to @baskaryan this integration was coming
- **Twitter handle:** `@rsprrs` if you want to mention me

This PR introduces the `AstraDB` vector store class, extensive integration test coverage, a reworking of the documentation which conflates Cassandra and Astra DB on a single "provider" page, and a new, completely reworked vector-store example notebook (common to the Cassandra store, since parts of the flow are shared by the two APIs). I also took care to ensure that the docs (and redirects therein) behave correctly.

All style, linting, typechecks and tests pass as far as the `AstraDB` integration is concerned.

I could build the documentation and check it all right (but ran into trouble with the `api_docs_build` makefile target, which I could not verify: `Error: Unable to import module 'plan_and_execute.agent_executor' with error: No module named 'langchain_experimental'` was the first of many similar errors).

Thank you for a review!
Stefano

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2025-09-06 21:43:44 +00:00 · 2023-11-07 23:45:33 +01:00
parent 13bd83bd61
commit 4f4b020582
21 changed files with 4376 additions and 376 deletions
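As a rough illustration of the `VectorStore`-style contract that the description refers to (add texts with metadata and IDs, search by similarity with an optional metadata filter, delete by ID), here is a minimal in-memory sketch. It is not the `AstraDB` class from this PR and makes no calls to Astra DB: `ToyVectorStore`, `ToyDocument` and `toy_embed` are hypothetical names invented for this example, and the letter-frequency "embedding" merely stands in for a real embedding model.

```python
import math
import uuid
from dataclasses import dataclass, field


@dataclass
class ToyDocument:
    # Hypothetical stand-in for a document with text and metadata.
    page_content: str
    metadata: dict = field(default_factory=dict)


def toy_embed(text: str) -> list:
    """Hypothetical embedding: a 26-dimensional letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(u, v) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory sketch of an add/search/delete vector store (assumption, not the PR's code)."""

    def __init__(self, embed):
        self._embed = embed
        self._rows = {}  # id -> (vector, ToyDocument)

    def add_texts(self, texts, metadatas=None, ids=None):
        metadatas = metadatas or [{} for _ in texts]
        ids = ids or [uuid.uuid4().hex for _ in texts]
        for text, md, doc_id in zip(texts, metadatas, ids):
            # Re-using an ID overwrites the previous entry.
            self._rows[doc_id] = (self._embed(text), ToyDocument(text, md))
        return list(ids)

    def similarity_search_with_score(self, query, k=4, filter=None):
        wanted = filter or {}
        query_vector = self._embed(query)
        hits = [
            (doc, cosine(query_vector, vec))
            for vec, doc in self._rows.values()
            if all(doc.metadata.get(key) == val for key, val in wanted.items())
        ]
        hits.sort(key=lambda pair: pair[1], reverse=True)
        return hits[:k]

    def delete(self, ids):
        # Deleting an ID that is already gone is not treated as an error here.
        for doc_id in ids:
            self._rows.pop(doc_id, None)
        return True


store = ToyVectorStore(toy_embed)
ids = store.add_texts(
    ["I think, therefore I am.", "To the things themselves!"],
    metadatas=[{"author": "descartes"}, {"author": "husserl"}],
    ids=["desc_01", "huss_xy"],
)
hits = store.similarity_search_with_score("I think", k=1, filter={"author": "husserl"})
for doc, score in hits:
    print(f"* [SIM={score:.3f}] {doc.page_content} [{doc.metadata}]")
```

The filter is applied before ranking, so a query that resembles the Descartes quote still returns only the Husserl entry when `filter={"author": "husserl"}` is given. A real store would delegate both embedding and search to external services.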
--- a/docs/docs/integrations/providers/astradb.mdx
+++ b/docs/docs/integrations/providers/astradb.mdx
@@ -0,0 +1,85 @@
+# Astra DB
+
+This page lists the integrations available with [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) and [Apache Cassandra®](https://cassandra.apache.org/).
+
+### Setup
+
+Install the following Python package:
+
+```bash
+pip install "astrapy>=0.5.3"
+```
+
+## Astra DB
+
+> DataStax [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) is a serverless vector-capable database built on Cassandra and made conveniently available
+> through an easy-to-use JSON API.
+
+### Vector Store
+
+```python
+from langchain.vectorstores import AstraDB
+vector_store = AstraDB(
+  embedding=my_embedding,
+  collection_name="my_store",
+  api_endpoint="...",
+  token="...",
+)
+```
+
+Learn more in the [example notebook](/docs/integrations/vectorstores/astradb).
+
+
+## Apache Cassandra and Astra DB through CQL
+
+> [Cassandra](https://cassandra.apache.org/) is a NoSQL, row-oriented, highly scalable and highly available database.
+> Starting with version 5.0, the database ships with [vector search capabilities](https://cassandra.apache.org/doc/trunk/cassandra/vector-search/overview.html).
+> DataStax [Astra DB through CQL](https://docs.datastax.com/en/astra-serverless/docs/vector-search/quickstart.html) is a managed serverless database built on Cassandra, offering the same interface and strengths.
+
+These databases use the CQL protocol (Cassandra Query Language).
+Hence, a different set of connectors, outlined below, shall be used.
+
+### Vector Store
+
+```python
+from langchain.vectorstores import Cassandra
+vector_store = Cassandra(
+  embedding=my_embedding,
+  table_name="my_store",
+)
+```
+
+Learn more in the [example notebook](/docs/integrations/vectorstores/astradb) (scroll down to the CQL-specific section).
+
+
+### Memory
+
+```python
+from langchain.memory import CassandraChatMessageHistory
+message_history = CassandraChatMessageHistory(session_id="my-session")
+```
+
+Learn more in the [example notebook](/docs/integrations/memory/cassandra_chat_message_history).
+
+
+### LLM Cache
+
+```python
+from langchain.cache import CassandraCache
+langchain.llm_cache = CassandraCache()
+```
+
+Learn more in the [example notebook](/docs/integrations/llms/llm_caching) (scroll to the Cassandra section).
+
+
+### Semantic LLM Cache
+
+```python
+from langchain.cache import CassandraSemanticCache
+cassSemanticCache = CassandraSemanticCache(
+  embedding=my_embedding,
+  table_name="my_store",
+)
+```
+
+Learn more in the [example notebook](/docs/integrations/llms/llm_caching) (scroll to the appropriate section).
--- a/docs/docs/integrations/providers/cassandra.mdx
+++ b/docs/docs/integrations/providers/cassandra.mdx
@@ -1,35 +0,0 @@
-# Cassandra
-
->[Apache Cassandra®](https://cassandra.apache.org/) is a free and open-source, distributed, wide-column
-> store, NoSQL database management system designed to handle large amounts of data across many commodity servers, 
-> providing high availability with no single point of failure. Cassandra offers support for clusters spanning
-> multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients. 
-> Cassandra was designed to implement a combination of _Amazon's Dynamo_ distributed storage and replication
-> techniques combined with _Google's Bigtable_ data and storage engine model.
- 
-## Installation and Setup
-
-```bash
-pip install cassandra-driver
-pip install cassio
-```
-
-
-
-## Vector Store
-
-See a [usage example](/docs/integrations/vectorstores/cassandra).
-
-```python
-from langchain.vectorstores import Cassandra
-```
-
-
-
-## Memory
-
-See a [usage example](/docs/integrations/memory/cassandra_chat_message_history).
-
-```python
-from langchain.memory import CassandraChatMessageHistory
-```
--- a/docs/docs/integrations/vectorstores/astradb.ipynb
+++ b/docs/docs/integrations/vectorstores/astradb.ipynb
@@ -0,0 +1,749 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "d2d6ca14-fb7e-4172-9aa0-a3119a064b96",
+   "metadata": {},
+   "source": [
+    "# Astra DB\n",
+    "\n",
+    "This page provides a quickstart for using [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) and [Apache Cassandra®](https://cassandra.apache.org/) as a Vector Store.\n",
+    "\n",
+    "_Note: in addition to access to the database, an OpenAI API Key is required to run the full example._"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bb9be7ce-8c70-4d46-9f11-71c42a36e928",
+   "metadata": {},
+   "source": [
+    "### Setup and general dependencies"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dbe7c156-0413-47e3-9237-4769c4248869",
+   "metadata": {},
+   "source": [
+    "Use of the integration requires the following Python package.\n",
+    "\n",
+    "_Note: depending on your LangChain setup, you may need to install other dependencies needed for this demo._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8d00fcf4-9798-4289-9214-d9734690adfc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install --quiet \"astrapy>=0.5.3\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b06619af-fea2-4863-8149-7f239a8c9c82",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "from getpass import getpass\n",
+    "\n",
+    "from datasets import load_dataset  # if not present yet, run: pip install \"datasets==2.14.6\"\n",
+    "\n",
+    "from langchain.schema import Document\n",
+    "from langchain.embeddings import OpenAIEmbeddings\n",
+    "from langchain.document_loaders import PyPDFLoader\n",
+    "from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
+    "from langchain.chat_models import ChatOpenAI\n",
+    "from langchain.prompts import ChatPromptTemplate\n",
+    "from langchain.schema.runnable import RunnablePassthrough\n",
+    "from langchain.schema.output_parser import StrOutputParser"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1983f1da-0ae7-4a9b-bf4c-4ade328f7a3a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "os.environ[\"OPENAI_API_KEY\"] = getpass(\"OPENAI_API_KEY = \")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c656df06-e938-4bc5-b570-440b8b7a0189",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "embe = OpenAIEmbeddings()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dd8caa76-bc41-429e-a93b-989ba13aff01",
+   "metadata": {},
+   "source": [
+    "_Keep reading to connect with Astra DB. For usage with Apache Cassandra and Astra DB through CQL, scroll to the section below._"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22866f09-e10d-4f05-a24b-b9420129462e",
+   "metadata": {},
+   "source": [
+    "## Astra DB"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5fba47cc-3533-42fc-84b7-9dc14cd68b2b",
+   "metadata": {},
+   "source": [
+    "DataStax [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) is a serverless vector-capable database built on Cassandra and made conveniently available through an easy-to-use JSON API."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0b32730d-176e-414c-9d91-fd3644c54211",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.vectorstores import AstraDB"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "68f61b01-3e09-47c1-9d67-5d6915c86626",
+   "metadata": {},
+   "source": [
+    "### Astra DB connection parameters\n",
+    "\n",
+    "- the API Endpoint looks like `https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com`\n",
+    "- the Token looks like `AstraCS:6gBhNmsk135....`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d78af8ed-cff9-4f14-aa5d-016f99ab547c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ASTRA_DB_API_ENDPOINT = input(\"ASTRA_DB_API_ENDPOINT = \")\n",
+    "ASTRA_DB_TOKEN = getpass(\"ASTRA_DB_TOKEN = \")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8b77553b-8bb5-4949-b87b-8c6abac56a26",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vstore = AstraDB(\n",
+    "    embedding=embe,\n",
+    "    collection_name=\"astra_vector_demo\",\n",
+    "    api_endpoint=ASTRA_DB_API_ENDPOINT,\n",
+    "    token=ASTRA_DB_TOKEN,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9a348678-b2f6-46ca-9a0d-2eb4cc6b66b1",
+   "metadata": {},
+   "source": [
+    "### Load a dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3a1f532f-ad63-4256-9730-a183841bd8e9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "philo_dataset = load_dataset(\"datastax/philosopher-quotes\")[\"train\"]\n",
+    "\n",
+    "docs = []\n",
+    "for entry in philo_dataset:\n",
+    "    metadata = {\"author\": entry[\"author\"]}\n",
+    "    doc = Document(page_content=entry[\"quote\"], metadata=metadata)\n",
+    "    docs.append(doc)\n",
+    "\n",
+    "inserted_ids = vstore.add_documents(docs)\n",
+    "print(f\"\\nInserted {len(inserted_ids)} documents.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "084d8802-ab39-4262-9a87-42eafb746f92",
+   "metadata": {},
+   "source": [
+    "Add some more entries, this time with `add_texts`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b6b157f5-eb31-4907-a78e-2e2b06893936",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "texts = [\"I think, therefore I am.\", \"To the things themselves!\"]\n",
+    "metadatas = [{\"author\": \"descartes\"}, {\"author\": \"husserl\"}]\n",
+    "ids = [\"desc_01\", \"huss_xy\"]\n",
+    "\n",
+    "inserted_ids_2 = vstore.add_texts(texts=texts, metadatas=metadatas, ids=ids)\n",
+    "print(f\"\\nInserted {len(inserted_ids_2)} documents.\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c031760a-1fc5-4855-adf2-02ed52fe2181",
+   "metadata": {},
+   "source": [
+    "### Run simple searches"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "02a77d8e-1aae-4054-8805-01c77947c49f",
+   "metadata": {},
+   "source": [
+    "This section demonstrates metadata filtering and getting the similarity scores back:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1761806a-1afd-4491-867c-25a80d92b9fe",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results = vstore.similarity_search(\"Our life is what we make of it\", k=3)\n",
+    "for res in results:\n",
+    "    print(f\"* {res.page_content} [{res.metadata}]\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "eebc4f7c-f61a-438e-b3c8-17e6888d8a0b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results_filtered = vstore.similarity_search(\n",
+    "    \"Our life is what we make of it\",\n",
+    "    k=3,\n",
+    "    filter={\"author\": \"plato\"},\n",
+    ")\n",
+    "for res in results_filtered:\n",
+    "    print(f\"* {res.page_content} [{res.metadata}]\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "11bbfe64-c0cd-40c6-866a-a5786538450e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results = vstore.similarity_search_with_score(\"Our life is what we make of it\", k=3)\n",
+    "for res, score in results:\n",
+    "    print(f\"* [SIM={score:.3f}] {res.page_content} [{res.metadata}]\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b14ea558-bfbe-41ce-807e-d70670060ada",
+   "metadata": {},
+   "source": [
+    "### MMR (Maximal-marginal-relevance) search"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "76381ce8-780a-4e3b-97b1-056d6782d7d5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results = vstore.max_marginal_relevance_search(\n",
+    "    \"Our life is what we make of it\",\n",
+    "    k=3,\n",
+    "    filter={\"author\": \"aristotle\"},\n",
+    ")\n",
+    "for res in results:\n",
+    "    print(f\"* {res.page_content} [{res.metadata}]\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1cc86edd-692b-4495-906c-ccfd13b03c23",
+   "metadata": {},
+   "source": [
+    "### Deleting stored documents"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "38a70ec4-b522-4d32-9ead-c642864fca37",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "delete_1 = vstore.delete(inserted_ids[:3])\n",
+    "print(f\"all_succeed={delete_1}\")  # True, all documents deleted"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d4cf49ed-9d29-4ed9-bdab-51a308c41b8e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "delete_2 = vstore.delete(inserted_ids[2:5])\n",
+    "print(f\"some_succeeds={delete_2}\")  # True, though some IDs were gone already"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "847181ba-77d1-4a17-b7f9-9e2c3d8efd13",
+   "metadata": {},
+   "source": [
+    "### A minimal RAG chain"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cd64b844-846f-43c5-a7dd-c26b9ed417d0",
+   "metadata": {},
+   "source": [
+    "The next cells will implement a simple RAG pipeline:\n",
+    "- download a sample PDF file and load it onto the store;\n",
+    "- create a RAG chain with LCEL (LangChain Expression Language), with the vector store at its heart;\n",
+    "- run the question-answering chain."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5cbc4dba-0d5e-4038-8fc5-de6cadd1c2a9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!curl -L \\\n",
+    "    \"https://github.com/awesome-astra/datasets/blob/main/demo-resources/what-is-philosophy/what-is-philosophy.pdf?raw=true\" \\\n",
+    "    -o \"what-is-philosophy.pdf\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "459385be-5e9c-47ff-ba53-2b7ae6166b09",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pdf_loader = PyPDFLoader(\"what-is-philosophy.pdf\")\n",
+    "splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)\n",
+    "docs_from_pdf = pdf_loader.load_and_split(text_splitter=splitter)\n",
+    "\n",
+    "print(f\"Documents from PDF: {len(docs_from_pdf)}.\")\n",
+    "inserted_ids_from_pdf = vstore.add_documents(docs_from_pdf)\n",
+    "print(f\"Inserted {len(inserted_ids_from_pdf)} documents.\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5010a66c-4298-4e32-82b5-2da0d36a5c70",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "retriever = vstore.as_retriever(search_kwargs={'k': 3})\n",
+    "\n",
+    "philo_template = \"\"\"\n",
+    "You are a philosopher that draws inspiration from great thinkers of the past\n",
+    "to craft well-thought answers to user questions. Use the provided context as the basis\n",
+    "for your answers and do not make up new reasoning paths - just mix-and-match what you are given.\n",
+    "Your answers must be concise and to the point, and refrain from answering about other topics than philosophy.\n",
+    "\n",
+    "CONTEXT:\n",
+    "{context}\n",
+    "\n",
+    "QUESTION: {question}\n",
+    "\n",
+    "YOUR ANSWER:\"\"\"\n",
+    "\n",
+    "philo_prompt = ChatPromptTemplate.from_template(philo_template)\n",
+    "\n",
+    "llm = ChatOpenAI()\n",
+    "\n",
+    "chain = (\n",
+    "    {\"context\": retriever, \"question\": RunnablePassthrough()} \n",
+    "    | philo_prompt \n",
+    "    | llm \n",
+    "    | StrOutputParser()\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fcbc1296-6c7c-478b-b55b-533ba4e54ddb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "chain.invoke(\"How does Russell elaborate on Peirce's idea of the security blanket?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "869ab448-a029-4692-aefc-26b85513314d",
+   "metadata": {},
+   "source": [
+    "For more, check out a complete RAG template using Astra DB [here](https://github.com/langchain-ai/langchain/tree/master/templates/rag-astradb)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "177610c7-50d0-4b7b-8634-b03338054c8e",
+   "metadata": {},
+   "source": [
+    "### Cleanup"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0da4d19f-9878-4d3d-82c9-09cafca20322",
+   "metadata": {},
+   "source": [
+    "If you want to completely delete the collection from your Astra DB instance, run this.\n",
+    "\n",
+    "_(You will lose the data you stored in it.)_"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fd405a13-6f71-46fa-87e6-167238e9c25e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vstore.delete_collection()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "94ebaab1-7cbf-4144-a147-7b0e32c43069",
+   "metadata": {},
+   "source": [
+    "## Apache Cassandra and Astra DB through CQL"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bc3931b4-211d-4f84-bcc0-51c127e3027c",
+   "metadata": {},
+   "source": [
+    "[Cassandra](https://cassandra.apache.org/) is a NoSQL, row-oriented, highly scalable and highly available database. Starting with version 5.0, the database ships with [vector search capabilities](https://cassandra.apache.org/doc/trunk/cassandra/vector-search/overview.html).\n",
+    "\n",
+    "DataStax [Astra DB through CQL](https://docs.datastax.com/en/astra-serverless/docs/vector-search/quickstart.html) is a managed serverless database built on Cassandra, offering the same interface and strengths."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a0055fbf-448d-4e46-9c40-28d43df25ca3",
+   "metadata": {},
+   "source": [
+    "#### What sets this case apart from \"Astra DB\" above?\n",
+    "\n",
+    "Thanks to LangChain having a standardized `VectorStore` interface, most of the \"Astra DB\" section above applies to this case as well. However, the database now uses the CQL protocol, which means you'll use a _different_ class and instantiate it in another way.\n",
+    "\n",
+    "The cells below show how you should get your `vstore` object in this case and how you can clean up the database resources at the end: for the rest, i.e. the actual usage of the vector store, you will be able to run the very code that was shown above.\n",
+    "\n",
+    "In other words, running this demo in full with Cassandra or Astra DB through CQL means:\n",
+    "\n",
+    "- **initialization as shown below**\n",
+    "- \"Load a dataset\", _see above section_\n",
+    "- \"Run simple searches\", _see above section_\n",
+    "- \"MMR search\", _see above section_\n",
+    "- \"Deleting stored documents\", _see above section_\n",
+    "- \"A minimal RAG chain\", _see above section_\n",
+    "- **cleanup as shown below**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "23d12be2-745f-4e72-a82c-334a887bc7cd",
+   "metadata": {},
+   "source": [
+    "### Initialization"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e3212542-79be-423e-8e1f-b8d725e3cda8",
+   "metadata": {},
+   "source": [
+    "The class to use is the following:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "941af73e-a090-4fba-b23c-595757d470eb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.vectorstores import Cassandra"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "414d1e72-f7c9-4b6d-bf6f-16075712c7e3",
+   "metadata": {},
+   "source": [
+    "Now, depending on whether you connect to a Cassandra cluster or to Astra DB through CQL, you will provide different parameters when creating the vector store object."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "48ecca56-71a4-4a91-b198-29384c44ce27",
+   "metadata": {},
+   "source": [
+    "#### Initialization (Cassandra cluster)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "55ebe958-5654-43e0-9aed-d607ffd3fa48",
+   "metadata": {},
+   "source": [
+    "In this case, you first need to create a `cassandra.cluster.Session` object, as described in the [Cassandra driver documentation](https://docs.datastax.com/en/developer/python-driver/latest/api/cassandra/cluster/#module-cassandra.cluster). The details vary (e.g. with network settings and authentication), but this might be something like:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4642dafb-a065-4063-b58c-3d276f5ad07e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from cassandra.cluster import Cluster\n",
+    "\n",
+    "cluster = Cluster([\"127.0.0.1\"])\n",
+    "session = cluster.connect()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "624c93bf-fb46-4350-bcfa-09ca09dc068f",
+   "metadata": {},
+   "source": [
+    "You can now set the session, along with your desired keyspace name, as a global CassIO parameter:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "92a4ab28-1c4f-4dad-9671-d47e0b1dde7b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import cassio\n",
+    "\n",
+    "CASSANDRA_KEYSPACE = input(\"CASSANDRA_KEYSPACE = \")\n",
+    "\n",
+    "cassio.init(session=session, keyspace=CASSANDRA_KEYSPACE)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3b87a824-36f1-45b4-b54c-efec2a2de216",
+   "metadata": {},
+   "source": [
+    "Now you can create the vector store:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "853a2a88-a565-4e24-8789-d78c213954a6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vstore = Cassandra(\n",
+    "    embedding=embe,\n",
+    "    table_name=\"cassandra_vector_demo\",\n",
+    "    # session=None, keyspace=None  # Uncomment on older versions of LangChain\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "768ddf7a-0c3e-4134-ad38-25ac53c3da7a",
+   "metadata": {},
+   "source": [
+    "#### Initialization (Astra DB through CQL)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4ed4269a-b7e7-4503-9e66-5a11335c7681",
+   "metadata": {},
+   "source": [
+    "In this case you initialize CassIO with the following connection parameters:\n",
+    "\n",
+    "- the Database ID, e.g. `01234567-89ab-cdef-0123-456789abcdef`\n",
+    "- the Token, e.g. `AstraCS:6gBhNmsk135....` (it must be a \"Database Administrator\" token)\n",
+    "- Optionally a Keyspace name (if omitted, the default one for the database will be used)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5fa6bd74-d4b2-45c5-9757-96dddc6242fb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ASTRA_DB_ID = input(\"ASTRA_DB_ID = \")\n",
+    "ASTRA_DB_TOKEN = getpass(\"ASTRA_DB_TOKEN = \")\n",
+    "\n",
+    "desired_keyspace = input(\"ASTRA_DB_KEYSPACE (optional, can be left empty) = \")\n",
+    "if desired_keyspace:\n",
+    "    ASTRA_DB_KEYSPACE = desired_keyspace\n",
+    "else:\n",
+    "    ASTRA_DB_KEYSPACE = None"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "add6e585-17ff-452e-8ef6-7e485ead0b06",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import cassio\n",
+    "\n",
+    "cassio.init(\n",
+    "    database_id=ASTRA_DB_ID,\n",
+    "    token=ASTRA_DB_TOKEN,\n",
+    "    keyspace=ASTRA_DB_KEYSPACE,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b305823c-bc98-4f3d-aabb-d7eb663ea421",
+   "metadata": {},
+   "source": [
+    "Now you can create the vector store:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f45f3038-9d59-41cc-8b43-774c6aa80295",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vstore = Cassandra(\n",
+    "    embedding=embe,\n",
+    "    table_name=\"cassandra_vector_demo\",\n",
+    "    # session=None, keyspace=None  # Uncomment on older versions of LangChain\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "39284918-cf8a-49bb-a2d3-aef285bb2ffa",
+   "metadata": {},
+   "source": [
+    "### Usage of the vector store"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3cc1aead-d6ec-48a3-affe-1d0cffa955a9",
+   "metadata": {},
+   "source": [
+    "_See the sections \"Load a dataset\" through \"A minimal RAG chain\" above._\n",
+    "\n",
+    "Speaking of the latter, you can check out a full RAG template for Astra DB through CQL [here](https://github.com/langchain-ai/langchain/tree/master/templates/cassandra-entomology-rag)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "096397d8-6622-4685-9f9d-7e238beca467",
+   "metadata": {},
+   "source": [
+    "### Cleanup"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cc1e74f9-5500-41aa-836f-235b1ed5f20c",
+   "metadata": {},
+   "source": [
+    "The following essentially retrieves the `Session` object from CassIO and runs a CQL `DROP TABLE` statement with it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b5b82c33-0e77-4a37-852c-8d50edbdd991",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "cassio.config.resolve_session().execute(\n",
+    "    f\"DROP TABLE {cassio.config.resolve_keyspace()}.cassandra_vector_demo;\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c10ece4d-ae06-42ab-baf4-4d0ac2051743",
+   "metadata": {},
+   "source": [
+    "### Learn more"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "51ea8b69-7e15-458f-85aa-9fa199f95f9c",
+   "metadata": {},
+   "source": [
+    "For more information, extended quickstarts and additional usage examples of the LangChain `Cassandra` vector store, please visit the [CassIO documentation](https://cassio.org/frameworks/langchain/about/)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
--- a/docs/docs/integrations/vectorstores/cassandra.ipynb
+++ b/docs/docs/integrations/vectorstores/cassandra.ipynb
@@ -1,326 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "683953b3",
-   "metadata": {},
-   "source": [
-    "# Cassandra\n",
-    "\n",
-    ">[Apache Cassandra®](https://cassandra.apache.org) is a NoSQL, row-oriented, highly scalable and highly available database.\n",
-    "\n",
-    "Newest Cassandra releases natively [support](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes) Vector Similarity Search.\n",
-    "\n",
-    "To run this notebook you need either a running Cassandra cluster equipped with Vector Search capabilities (in pre-release at the time of writing) or a DataStax Astra DB instance running in the cloud (you can get one for free at [datastax.com](https://astra.datastax.com)). Check [cassio.org](https://cassio.org/start_here/) for more information."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b4c41cad-08ef-4f72-a545-2151e4598efe",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "!pip install \"cassio>=0.1.0\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "b7e46bb0",
-   "metadata": {},
-   "source": [
-    "### Please provide database connection parameters and secrets:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "36128a32",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import os\n",
-    "import getpass\n",
-    "\n",
-    "database_mode = (input(\"\\n(C)assandra or (A)stra DB? \")).upper()\n",
-    "\n",
-    "keyspace_name = input(\"\\nKeyspace name? \")\n",
-    "\n",
-    "if database_mode == \"A\":\n",
-    "    ASTRA_DB_APPLICATION_TOKEN = getpass.getpass('\\nAstra DB Token (\"AstraCS:...\") ')\n",
-    "    #\n",
-    "    ASTRA_DB_SECURE_BUNDLE_PATH = input(\"Full path to your Secure Connect Bundle? \")\n",
-    "elif database_mode == \"C\":\n",
-    "    CASSANDRA_CONTACT_POINTS = input(\n",
-    "        \"Contact points? (comma-separated, empty for localhost) \"\n",
-    "    ).strip()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "4f22aac2",
-   "metadata": {},
-   "source": [
-    "#### depending on whether local or cloud-based Astra DB, create the corresponding database connection \"Session\" object"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "677f8576",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from cassandra.cluster import Cluster\n",
-    "from cassandra.auth import PlainTextAuthProvider\n",
-    "\n",
-    "if database_mode == \"C\":\n",
-    "    if CASSANDRA_CONTACT_POINTS:\n",
-    "        cluster = Cluster(\n",
-    "            [cp.strip() for cp in CASSANDRA_CONTACT_POINTS.split(\",\") if cp.strip()]\n",
-    "        )\n",
-    "    else:\n",
-    "        cluster = Cluster()\n",
-    "    session = cluster.connect()\n",
-    "elif database_mode == \"A\":\n",
-    "    ASTRA_DB_CLIENT_ID = \"token\"\n",
-    "    cluster = Cluster(\n",
-    "        cloud={\n",
-    "            \"secure_connect_bundle\": ASTRA_DB_SECURE_BUNDLE_PATH,\n",
-    "        },\n",
-    "        auth_provider=PlainTextAuthProvider(\n",
-    "            ASTRA_DB_CLIENT_ID,\n",
-    "            ASTRA_DB_APPLICATION_TOKEN,\n",
-    "        ),\n",
-    "    )\n",
-    "    session = cluster.connect()\n",
-    "else:\n",
-    "    raise NotImplementedError"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "320af802-9271-46ee-948f-d2453933d44b",
-   "metadata": {},
-   "source": [
-    "### Please provide OpenAI access key\n",
-    "\n",
-    "We want to use `OpenAIEmbeddings` so we have to get the OpenAI API Key."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "ffea66e4-bc23-46a9-9580-b348dfe7b7a7",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "e98a139b",
-   "metadata": {},
-   "source": [
-    "### Creation and usage of the Vector Store"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "aac9563e",
-   "metadata": {
-    "tags": []
-   },
-   "outputs": [],
-   "source": [
-    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
-    "from langchain.text_splitter import CharacterTextSplitter\n",
-    "from langchain.vectorstores import Cassandra\n",
-    "from langchain.document_loaders import TextLoader"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a3c3999a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.document_loaders import TextLoader\n",
-    "\n",
-    "SOURCE_FILE_NAME = \"../../modules/state_of_the_union.txt\"\n",
-    "\n",
-    "loader = TextLoader(SOURCE_FILE_NAME)\n",
-    "documents = loader.load()\n",
-    "text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
-    "docs = text_splitter.split_documents(documents)\n",
-    "\n",
-    "embedding_function = OpenAIEmbeddings()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "6e104aee",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "table_name = \"my_vector_db_table\"\n",
-    "\n",
-    "docsearch = Cassandra.from_documents(\n",
-    "    documents=docs,\n",
-    "    embedding=embedding_function,\n",
-    "    session=session,\n",
-    "    keyspace=keyspace_name,\n",
-    "    table_name=table_name,\n",
-    ")\n",
-    "\n",
-    "query = \"What did the president say about Ketanji Brown Jackson\"\n",
-    "docs = docsearch.similarity_search(query)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "f509ee02",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "## if you already have an index, you can load it and use it like this:\n",
-    "\n",
-    "# docsearch_preexisting = Cassandra(\n",
-    "#     embedding=embedding_function,\n",
-    "#     session=session,\n",
-    "#     keyspace=keyspace_name,\n",
-    "#     table_name=table_name,\n",
-    "# )\n",
-    "\n",
-    "# docs = docsearch_preexisting.similarity_search(query, k=2)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "9c608226",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print(docs[0].page_content)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d46d1452",
-   "metadata": {},
-   "source": [
-    "### Maximal Marginal Relevance Searches\n",
-    "\n",
-    "In addition to using similarity search in the retriever object, you can also use `mmr` as the retriever's search type.\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a359ed74",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "retriever = docsearch.as_retriever(search_type=\"mmr\")\n",
-    "matched_docs = retriever.get_relevant_documents(query)\n",
-    "for i, d in enumerate(matched_docs):\n",
-    "    print(f\"\\n## Document {i}\\n\")\n",
-    "    print(d.page_content)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "7c477287",
-   "metadata": {},
-   "source": [
-    "Or use `max_marginal_relevance_search` directly:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "9ca82740",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "found_docs = docsearch.max_marginal_relevance_search(query, k=2, fetch_k=10)\n",
-    "for i, doc in enumerate(found_docs):\n",
-    "    print(f\"{i + 1}.\", doc.page_content, \"\\n\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "da791c5f",
-   "metadata": {},
-   "source": [
-    "### Metadata filtering\n",
-    "\n",
-    "You can specify filtering on metadata when running searches in the vector store. By default, when inserting documents, the only metadata is the `\"source\"` (but you can customize the metadata at insertion time).\n",
-    "\n",
-    "Since only one file was inserted, this is just a demonstration of how filters are passed:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "93f132fa",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "filter = {\"source\": SOURCE_FILE_NAME}\n",
-    "filtered_docs = docsearch.similarity_search(query, filter=filter, k=5)\n",
-    "print(f\"{len(filtered_docs)} documents retrieved.\")\n",
-    "print(f\"{filtered_docs[0].page_content[:64]} ...\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "1b413ec4",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "filter = {\"source\": \"nonexisting_file.txt\"}\n",
-    "filtered_docs2 = docsearch.similarity_search(query, filter=filter)\n",
-    "print(f\"{len(filtered_docs2)} documents retrieved.\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "a0fea764",
-   "metadata": {},
-   "source": [
-    "Please visit the [cassIO documentation](https://cassio.org/frameworks/langchain/about/) for more on using vector stores with LangChain."
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/modules/data_connection/indexing.ipynb
+++ b/docs/docs/modules/data_connection/indexing.ipynb
@@ -58,9 +58,9 @@
    "1. Do not use with a store that has been pre-populated with content independently of the indexing API, as the record manager will not know that records have been inserted previously.\n",
    "2. Only works with LangChain `vectorstore`'s that support:\n",
    "   * document addition by id (`add_documents` method with `ids` argument)\n",
-    "   * delete by id (`delete` method with)\n",
+    "   * delete by id (`delete` method with `ids` argument)\n",
    "\n",
-    "Compatible Vectorstores: `AnalyticDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `DashVector`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `MyScale`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `ScaNN`, `SupabaseVectorStore`, `TimescaleVector`, `Vald`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
+    "Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `DashVector`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `MyScale`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `ScaNN`, `SupabaseVectorStore`, `TimescaleVector`, `Vald`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
    "  \n",
    "## Caution\n",
    "\n",
--- a/docs/vercel.json
+++ b/docs/vercel.json
@@ -414,7 +414,15 @@
    },
    {
      "source": "/docs/integrations/cassandra",
-      "destination": "/docs/integrations/providers/cassandra"
+      "destination": "/docs/integrations/providers/astradb"
+    },
+    {
+      "source": "/docs/integrations/providers/cassandra",
+      "destination": "/docs/integrations/providers/astradb"
+    },
+    {
+      "source": "/docs/integrations/vectorstores/cassandra",
+      "destination": "/docs/integrations/vectorstores/astradb"
    },
    {
      "source": "/docs/integrations/cerebriumai",