Add integration for Timescale Vector(Postgres) (#10650)

**Description:** This commit adds a vector store for the Postgres-based vector database (`TimescaleVector`). Timescale Vector(https://www.timescale.com/ai) is PostgreSQL++ for AI applications. It enables you to efficiently store and query billions of vector embeddings in `PostgreSQL`: - Enhances `pgvector` with faster and more accurate similarity search on 1B+ vectors via DiskANN inspired indexing algorithm. - Enables fast time-based vector search via automatic time-based partitioning and indexing. - Provides a familiar SQL interface for querying vector embeddings and relational data. Timescale Vector scales with you from POC to production: - Simplifies operations by enabling you to store relational metadata, vector embeddings, and time-series data in a single database. - Benefits from rock-solid PostgreSQL foundation with enterprise-grade feature liked streaming backups and replication, high-availability and row-level security. - Enables a worry-free experience with enterprise-grade security and compliance. Timescale Vector is available on Timescale, the cloud PostgreSQL platform. (There is no self-hosted version at this time.) LangChain users get a 90-day free trial for Timescale Vector. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Avthar Sewrathan <avthar@timescale.com>
2025-07-15 17:33:53 +00:00 · 2023-09-21 09:33:37 -05:00 · 2023-09-21 09:33:37 -05:00 · 6e02c45ca4
commit 6e02c45ca4
parent 55570e54e1
10 changed files with 3863 additions and 470 deletions
--- a/docs/extras/integrations/vectorstores/timescalevector.ipynb
+++ b/docs/extras/integrations/vectorstores/timescalevector.ipynb
--- a/docs/extras/modules/data_connection/retrievers/self_query/timescalevector_self_query.ipynb
+++ b/docs/extras/modules/data_connection/retrievers/self_query/timescalevector_self_query.ipynb
@ -0,0 +1,534 @@
 {
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "13afcae7",
   "metadata": {},
   "source": [
    "# Timescale Vector (Postgres) self-querying \n",
    "\n",
    "[Timescale Vector](https://www.timescale.com/ai) is PostgreSQL++ for AI applications. It enables you to efficiently store and query billions of vector embeddings in `PostgreSQL`.\n",
    "\n",
    "This notebook shows how to use the Postgres vector database (`TimescaleVector`) to perform self-querying. In the notebook we'll demo the `SelfQueryRetriever` wrapped around a TimescaleVector vector store. \n",
    "\n",
    "## What is Timescale Vector?\n",
    "**[Timescale Vector](https://www.timescale.com/ai) is PostgreSQL++ for AI applications.**\n",
    "\n",
    "Timescale Vector enables you to efficiently store and query millions of vector embeddings in `PostgreSQL`.\n",
    "- Enhances `pgvector` with faster and more accurate similarity search on 1B+ vectors via DiskANN inspired indexing algorithm.\n",
    "- Enables fast time-based vector search via automatic time-based partitioning and indexing.\n",
    "- Provides a familiar SQL interface for querying vector embeddings and relational data.\n",
    "\n",
    "Timescale Vector is cloud PostgreSQL for AI that scales with you from POC to production:\n",
    "- Simplifies operations by enabling you to store relational metadata, vector embeddings, and time-series data in a single database.\n",
    "- Benefits from rock-solid PostgreSQL foundation with enterprise-grade feature liked streaming backups and replication, high-availability and row-level security.\n",
    "- Enables a worry-free experience with enterprise-grade security and compliance.\n",
    "\n",
    "## How to access Timescale Vector\n",
    "Timescale Vector is available on [Timescale](https://www.timescale.com/ai), the cloud PostgreSQL platform. (There is no self-hosted version at this time.)\n",
    "\n",
    "LangChain users get a 90-day free trial for Timescale Vector.\n",
    "- To get started, [signup](https://console.cloud.timescale.com/signup?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) to Timescale, create a new database and follow this notebook!\n",
    "- See the [Timescale Vector explainer blog](https://www.timescale.com/blog/how-we-made-postgresql-the-best-vector-database/?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral) for more details and performance benchmarks.\n",
    "- See the [installation instructions](https://github.com/timescale/python-vector) for more details on using Timescale Vector in python.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "68e75fb9",
   "metadata": {},
   "source": [
    "## Creating a TimescaleVector vectorstore\n",
    "First we'll want to create a Timescale Vector vectorstore and seed it with some data. We've created a small demo set of documents that contain summaries of movies.\n",
    "\n",
    "NOTE: The self-query retriever requires you to have `lark` installed (`pip install lark`). We also need the `timescale-vector` package."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "63a8af5b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#!pip install lark"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "22431060-52c4-48a7-a97b-9f542b8b0928",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "#!pip install timescale-vector "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "83811610-7df3-4ede-b268-68a6a83ba9e2",
   "metadata": {},
   "source": [
    "In this example, we'll use `OpenAIEmbeddings`, so let's load your OpenAI API key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "dd01b61b-7d32-4a55-85d6-b2d2d4f18840",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Get openAI api key by reading local .env file\n",
    "# The .env file should contain a line starting with `OPENAI_API_KEY=sk-`\n",
    "import os\n",
    "from dotenv import load_dotenv, find_dotenv\n",
    "_ = load_dotenv(find_dotenv())\n",
    "\n",
    "OPENAI_API_KEY  = os.environ['OPENAI_API_KEY']\n",
    "# Alternatively, use getpass to enter the key in a prompt\n",
    "#import os\n",
    "#import getpass\n",
    "#os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "766e9c4b",
   "metadata": {},
   "source": [
    "To connect to your PostgreSQL database, you'll need your service URI, which can be found in the cheatsheet or `.env` file you downloaded after creating a new database. \n",
    "\n",
    "If you haven't already, [signup for Timescale](https://console.cloud.timescale.com/signup?utm_campaign=vectorlaunch&utm_source=langchain&utm_medium=referral), and create a new database.\n",
    "\n",
    "The URI will look something like this: `postgres://tsdbadmin:<password>@<id>.tsdb.cloud.timescale.com:<port>/tsdb?sslmode=require`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6bd6877e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get the service url by reading local .env file\n",
    "# The .env file should contain a line starting with `TIMESCALE_SERVICE_URL=postgresql://`\n",
    "_ = load_dotenv(find_dotenv())\n",
    "TIMESCALE_SERVICE_URL = os.environ[\"TIMESCALE_SERVICE_URL\"]\n",
    "\n",
    "# Alternatively, use getpass to enter the key in a prompt\n",
    "#import os\n",
    "#import getpass\n",
    "#TIMESCALE_SERVICE_URL = getpass.getpass(\"Timescale Service URL:\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "cb4a5787",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.schema import Document\n",
    "from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "from langchain.vectorstores.timescalevector import TimescaleVector\n",
    "\n",
    "embeddings = OpenAIEmbeddings()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a4f863f5",
   "metadata": {},
   "source": [
    "Here's the sample documents we'll use for this demo. The data is about movies, and has both content and metadata fields with information about particular movie."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "bcbe04d9",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "docs = [\n",
    "    Document(\n",
    "        page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\",\n",
    "        metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"science fiction\"},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\",\n",
    "        metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\",\n",
    "        metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\",\n",
    "        metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"Toys come alive and have a blast doing so\",\n",
    "        metadata={\"year\": 1995, \"genre\": \"animated\"},\n",
    "    ),\n",
    "    Document(\n",
    "        page_content=\"Three men walk into the Zone, three men walk out of the Zone\",\n",
    "        metadata={\n",
    "            \"year\": 1979,\n",
    "            \"rating\": 9.9,\n",
    "            \"director\": \"Andrei Tarkovsky\",\n",
    "            \"genre\": \"science fiction\",\n",
    "            \"rating\": 9.9,\n",
    "        },\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "7d0d771e",
   "metadata": {},
   "source": [
    "Finally, we'll create our Timescale Vector vectorstore. Note that the collection name will be the name of the PostgreSQL table in which the documents are stored in."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2428d1ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "COLLECTION_NAME = \"langchain_self_query_demo\"\n",
    "vectorstore = TimescaleVector.from_documents(\n",
    "    embedding=embeddings,\n",
    "    documents=docs,\n",
    "    collection_name=COLLECTION_NAME,\n",
    "    service_url=TIMESCALE_SERVICE_URL,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5ecaab6d",
   "metadata": {},
   "source": [
    "## Creating our self-querying retriever\n",
    "Now we can instantiate our retriever. To do this we'll need to provide some information upfront about the metadata fields that our documents support and a short description of the document contents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "86e34dbf",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
    "from langchain.chains.query_constructor.base import AttributeInfo\n",
    "\n",
    "# Give LLM info about the metadata fields\n",
    "metadata_field_info = [\n",
    "    AttributeInfo(\n",
    "        name=\"genre\",\n",
    "        description=\"The genre of the movie\",\n",
    "        type=\"string or list[string]\",\n",
    "    ),\n",
    "    AttributeInfo(\n",
    "        name=\"year\",\n",
    "        description=\"The year the movie was released\",\n",
    "        type=\"integer\",\n",
    "    ),\n",
    "    AttributeInfo(\n",
    "        name=\"director\",\n",
    "        description=\"The name of the movie director\",\n",
    "        type=\"string\",\n",
    "    ),\n",
    "    AttributeInfo(\n",
    "        name=\"rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n",
    "    ),\n",
    "]\n",
    "document_content_description = \"Brief summary of a movie\"\n",
    "\n",
    "# Instantiate the self-query retriever from an LLM\n",
    "llm = OpenAI(temperature=0)\n",
    "retriever = SelfQueryRetriever.from_llm(\n",
    "    llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ea9df8d4",
   "metadata": {},
   "source": [
    "## Self Querying Retrieval with Timescale Vector\n",
    "And now we can try actually using our retriever!\n",
    "\n",
    "Run the queries below and note how you can specify a query, filter, composite filter (filters with AND, OR) in natural language and the self-query retriever will translate that query into SQL and perform the search on the Timescale Vector (Postgres) vectorstore.\n",
    "\n",
    "This illustrates the power of the self-query retriever. You can use it to perform complex searches over your vectorstore without you or your users having to write any SQL directly!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "38a126e9",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/avtharsewrathan/sideprojects2023/timescaleai/tsv-langchain/langchain/libs/langchain/langchain/chains/llm.py:275: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.\n",
      "  warnings.warn(\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query='dinosaur' filter=None limit=None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'genre': 'science fiction', 'rating': 7.7}),\n",
       " Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'genre': 'science fiction', 'rating': 7.7}),\n",
       " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'}),\n",
       " Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example only specifies a relevant query\n",
    "retriever.get_relevant_documents(\"What are some movies about dinosaurs\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "fc3f1e6e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query=' ' filter=Comparison(comparator=<Comparator.GT: 'gt'>, attribute='rating', value=8.5) limit=None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'genre': 'science fiction', 'rating': 9.9, 'director': 'Andrei Tarkovsky'}),\n",
       " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'genre': 'science fiction', 'rating': 9.9, 'director': 'Andrei Tarkovsky'}),\n",
       " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'rating': 8.6, 'director': 'Satoshi Kon'}),\n",
       " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'year': 2006, 'rating': 8.6, 'director': 'Satoshi Kon'})]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example only specifies a filter\n",
    "retriever.get_relevant_documents(\"I want to watch a movie rated higher than 8.5\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "b19d4da0",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query='women' filter=Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='director', value='Greta Gerwig') limit=None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'rating': 8.3, 'director': 'Greta Gerwig'}),\n",
       " Document(page_content='A bunch of normal-sized women are supremely wholesome and some men pine after them', metadata={'year': 2019, 'rating': 8.3, 'director': 'Greta Gerwig'})]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example specifies a query and a filter\n",
    "retriever.get_relevant_documents(\"Has Greta Gerwig directed any movies about women\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "f900e40e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query=' ' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.GTE: 'gte'>, attribute='rating', value=8.5), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='science fiction')]) limit=None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'genre': 'science fiction', 'rating': 9.9, 'director': 'Andrei Tarkovsky'}),\n",
       " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'year': 1979, 'genre': 'science fiction', 'rating': 9.9, 'director': 'Andrei Tarkovsky'})]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example specifies a composite filter\n",
    "retriever.get_relevant_documents(\n",
    "    \"What's a highly rated (above 8.5) science fiction film?\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "12a51522",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query='toys' filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.GT: 'gt'>, attribute='year', value=1990), Comparison(comparator=<Comparator.LT: 'lt'>, attribute='year', value=2005), Comparison(comparator=<Comparator.EQ: 'eq'>, attribute='genre', value='animated')]) limit=None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='Toys come alive and have a blast doing so', metadata={'year': 1995, 'genre': 'animated'})]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example specifies a query and composite filter\n",
    "retriever.get_relevant_documents(\n",
    "    \"What's a movie after 1990 but before 2005 that's all about toys, and preferably is animated\"\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "39bd1de1-b9fe-4a98-89da-58d8a7a6ae51",
   "metadata": {},
   "source": [
    "### Filter k\n",
    "\n",
    "We can also use the self query retriever to specify `k`: the number of documents to fetch.\n",
    "\n",
    "We can do this by passing `enable_limit=True` to the constructor."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "bff36b88-b506-4877-9c63-e5a1a8d78e64",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = SelfQueryRetriever.from_llm(\n",
    "    llm,\n",
    "    vectorstore,\n",
    "    document_content_description,\n",
    "    metadata_field_info,\n",
    "    enable_limit=True,\n",
    "    verbose=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "2758d229-4f97-499c-819f-888acaf8ee10",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query='dinosaur' filter=None limit=2\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'genre': 'science fiction', 'rating': 7.7}),\n",
       " Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'year': 1993, 'genre': 'science fiction', 'rating': 7.7})]"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# This example specifies a query with a LIMIT value\n",
    "retriever.get_relevant_documents(\"what are two movies about dinosaurs\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/libs/langchain/langchain/retrievers/self_query/base.py
+++ b/libs/langchain/langchain/retrievers/self_query/base.py
@ -18,6 +18,7 @@ from langchain.retrievers.self_query.pinecone import PineconeTranslator
 from langchain.retrievers.self_query.qdrant import QdrantTranslator
 from langchain.retrievers.self_query.redis import RedisTranslator
 from langchain.retrievers.self_query.supabase import SupabaseVectorTranslator
 from langchain.retrievers.self_query.timescalevector import TimescaleVectorTranslator
 from langchain.retrievers.self_query.vectara import VectaraTranslator
 from langchain.retrievers.self_query.weaviate import WeaviateTranslator
 from langchain.schema import BaseRetriever, Document
@ -33,6 +34,7 @@ from langchain.vectorstores import (
    Qdrant,
    Redis,
    SupabaseVectorStore,
    TimescaleVector,
    Vectara,
    VectorStore,
    Weaviate,
@ -53,6 +55,7 @@ def _get_builtin_translator(vectorstore: VectorStore) -> Visitor:
        ElasticsearchStore: ElasticsearchTranslator,
        Milvus: MilvusTranslator,
        SupabaseVectorStore: SupabaseVectorTranslator,
        TimescaleVector: TimescaleVectorTranslator,
    }
    if isinstance(vectorstore, Qdrant):
        return QdrantTranslator(metadata_key=vectorstore.metadata_payload_key)
--- a/libs/langchain/langchain/retrievers/self_query/timescalevector.py
+++ b/libs/langchain/langchain/retrievers/self_query/timescalevector.py
@ -0,0 +1,84 @@
 from __future__ import annotations
 from typing import TYPE_CHECKING, Tuple, Union
 from langchain.chains.query_constructor.ir import (
    Comparator,
    Comparison,
    Operation,
    Operator,
    StructuredQuery,
    Visitor,
 )
 if TYPE_CHECKING:
    from timescale_vector import client
 class TimescaleVectorTranslator(Visitor):
    """Translate the internal query language elements to valid filters."""
    allowed_operators = [Operator.AND, Operator.OR, Operator.NOT]
    """Subset of allowed logical operators."""
    allowed_comparators = [
        Comparator.EQ,
        Comparator.GT,
        Comparator.GTE,
        Comparator.LT,
        Comparator.LTE,
    ]
    COMPARATOR_MAP = {
        Comparator.EQ: "==",
        Comparator.GT: ">",
        Comparator.GTE: ">=",
        Comparator.LT: "<",
        Comparator.LTE: "<=",
    }
    OPERATOR_MAP = {Operator.AND: "AND", Operator.OR: "OR", Operator.NOT: "NOT"}
    def _format_func(self, func: Union[Operator, Comparator]) -> str:
        self._validate_func(func)
        if isinstance(func, Operator):
            value = self.OPERATOR_MAP[func.value]  # type: ignore
        elif isinstance(func, Comparator):
            value = self.COMPARATOR_MAP[func.value]  # type: ignore
        return f"{value}"
    def visit_operation(self, operation: Operation) -> client.Predicates:
        try:
            from timescale_vector import client
        except ImportError as e:
            raise ImportError(
                "Cannot import timescale-vector. Please install with `pip install "
                "timescale-vector`."
            ) from e
        args = [arg.accept(self) for arg in operation.arguments]
        return client.Predicates(*args, operator=self._format_func(operation.operator))
    def visit_comparison(self, comparison: Comparison) -> client.Predicates:
        try:
            from timescale_vector import client
        except ImportError as e:
            raise ImportError(
                "Cannot import timescale-vector. Please install with `pip install "
                "timescale-vector`."
            ) from e
        return client.Predicates(
            (
                comparison.attribute,
                self._format_func(comparison.comparator),
                comparison.value,
            )
        )
    def visit_structured_query(
        self, structured_query: StructuredQuery
    ) -> Tuple[str, dict]:
        if structured_query.filter is None:
            kwargs = {}
        else:
            kwargs = {"predicates": structured_query.filter.accept(self)}
        return structured_query.query, kwargs
--- a/libs/langchain/langchain/vectorstores/init.py
+++ b/libs/langchain/langchain/vectorstores/init.py
@ -70,6 +70,7 @@ from langchain.vectorstores.supabase import SupabaseVectorStore
 from langchain.vectorstores.tair import Tair
 from langchain.vectorstores.tencentvectordb import TencentVectorDB
 from langchain.vectorstores.tigris import Tigris
 from langchain.vectorstores.timescalevector import TimescaleVector
 from langchain.vectorstores.typesense import Typesense
 from langchain.vectorstores.usearch import USearch
 from langchain.vectorstores.vald import Vald
@ -135,6 +136,7 @@ __all__ = [
    "SupabaseVectorStore",
    "Tair",
    "Tigris",
    "TimescaleVector",
    "Typesense",
    "USearch",
    "Vald",
--- a/libs/langchain/langchain/vectorstores/timescalevector.py
+++ b/libs/langchain/langchain/vectorstores/timescalevector.py
@ -0,0 +1,871 @@
 """VectorStore wrapper around a Postgres-TimescaleVector database."""
 from __future__ import annotations
 import enum
 import logging
 import uuid
 from datetime import timedelta
 from typing import (
    TYPE_CHECKING,
    Any,
    Callable,
    Dict,
    Iterable,
    List,
    Optional,
    Tuple,
    Type,
    Union,
 )
 from langchain.docstore.document import Document
 from langchain.embeddings.base import Embeddings
 from langchain.utils import get_from_dict_or_env
 from langchain.vectorstores.base import VectorStore
 from langchain.vectorstores.utils import DistanceStrategy
 if TYPE_CHECKING:
    from timescale_vector import Predicates
 DEFAULT_DISTANCE_STRATEGY = DistanceStrategy.COSINE
 ADA_TOKEN_COUNT = 1536
 _LANGCHAIN_DEFAULT_COLLECTION_NAME = "langchain_store"
 class TimescaleVector(VectorStore):
    """VectorStore implementation using the timescale vector client to store vectors
    in Postgres.
    To use, you should have the ``timescale_vector`` python package installed.
    Args:
        service_url: Service url on timescale cloud.
        embedding: Any embedding function implementing
            `langchain.embeddings.base.Embeddings` interface.
        collection_name: The name of the collection to use. (default: langchain_store)
            This will become the table name used for the collection.
        distance_strategy: The distance strategy to use. (default: COSINE)
        pre_delete_collection: If True, will delete the collection if it exists.
            (default: False). Useful for testing.
    Example:
        .. code-block:: python
            from langchain.vectorstores import TimescaleVector
            from langchain.embeddings.openai import OpenAIEmbeddings
            SERVICE_URL = "postgres://tsdbadmin:<password>@<id>.tsdb.cloud.timescale.com:<port>/tsdb?sslmode=require"
            COLLECTION_NAME = "state_of_the_union_test"
            embeddings = OpenAIEmbeddings()
            vectorestore = TimescaleVector.from_documents(
                embedding=embeddings,
                documents=docs,
                collection_name=COLLECTION_NAME,
                service_url=SERVICE_URL,
            )
    """  # noqa: E501
    def __init__(
        self,
        service_url: str,
        embedding: Embeddings,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        num_dimensions: int = ADA_TOKEN_COUNT,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        pre_delete_collection: bool = False,
        logger: Optional[logging.Logger] = None,
        relevance_score_fn: Optional[Callable[[float], float]] = None,
        time_partition_interval: Optional[timedelta] = None,
    ) -> None:
        try:
            from timescale_vector import client
        except ImportError:
            raise ImportError(
                "Could not import timescale_vector python package. "
                "Please install it with `pip install timescale-vector`."
            )
        self.service_url = service_url
        self.embedding = embedding
        self.collection_name = collection_name
        self.num_dimensions = num_dimensions
        self._distance_strategy = distance_strategy
        self.pre_delete_collection = pre_delete_collection
        self.logger = logger or logging.getLogger(__name__)
        self.override_relevance_score_fn = relevance_score_fn
        self._time_partition_interval = time_partition_interval
        self.sync_client = client.Sync(
            self.service_url,
            self.collection_name,
            self.num_dimensions,
            self._distance_strategy.value.lower(),
            time_partition_interval=self._time_partition_interval,
        )
        self.async_client = client.Async(
            self.service_url,
            self.collection_name,
            self.num_dimensions,
            self._distance_strategy.value.lower(),
            time_partition_interval=self._time_partition_interval,
        )
        self.__post_init__()
    def __post_init__(
        self,
    ) -> None:
        """
        Initialize the store.
        """
        self.sync_client.create_tables()
        if self.pre_delete_collection:
            self.sync_client.delete_all()
    @property
    def embeddings(self) -> Embeddings:
        return self.embedding
    def drop_tables(self) -> None:
        self.sync_client.drop_table()
    @classmethod
    def __from(
        cls,
        texts: List[str],
        embeddings: List[List[float]],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        service_url: Optional[str] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        num_dimensions = len(embeddings[0])
        if ids is None:
            ids = [str(uuid.uuid1()) for _ in texts]
        if not metadatas:
            metadatas = [{} for _ in texts]
        if service_url is None:
            service_url = cls.get_service_url(kwargs)
        store = cls(
            service_url=service_url,
            num_dimensions=num_dimensions,
            collection_name=collection_name,
            embedding=embedding,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
        store.add_embeddings(
            texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
        )
        return store
    @classmethod
    async def __afrom(
        cls,
        texts: List[str],
        embeddings: List[List[float]],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        service_url: Optional[str] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        num_dimensions = len(embeddings[0])
        if ids is None:
            ids = [str(uuid.uuid1()) for _ in texts]
        if not metadatas:
            metadatas = [{} for _ in texts]
        if service_url is None:
            service_url = cls.get_service_url(kwargs)
        store = cls(
            service_url=service_url,
            num_dimensions=num_dimensions,
            collection_name=collection_name,
            embedding=embedding,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
        await store.aadd_embeddings(
            texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
        )
        return store
    def add_embeddings(
        self,
        texts: Iterable[str],
        embeddings: List[List[float]],
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        """Add embeddings to the vectorstore.
        Args:
            texts: Iterable of strings to add to the vectorstore.
            embeddings: List of list of embedding vectors.
            metadatas: List of metadatas associated with the texts.
            kwargs: vectorstore specific parameters
        """
        if ids is None:
            ids = [str(uuid.uuid1()) for _ in texts]
        if not metadatas:
            metadatas = [{} for _ in texts]
        records = list(zip(ids, metadatas, texts, embeddings))
        self.sync_client.upsert(records)
        return ids
    async def aadd_embeddings(
        self,
        texts: Iterable[str],
        embeddings: List[List[float]],
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        """Add embeddings to the vectorstore.
        Args:
            texts: Iterable of strings to add to the vectorstore.
            embeddings: List of list of embedding vectors.
            metadatas: List of metadatas associated with the texts.
            kwargs: vectorstore specific parameters
        """
        if ids is None:
            ids = [str(uuid.uuid1()) for _ in texts]
        if not metadatas:
            metadatas = [{} for _ in texts]
        records = list(zip(ids, metadatas, texts, embeddings))
        await self.async_client.upsert(records)
        return ids
    def add_texts(
        self,
        texts: Iterable[str],
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        """Run more texts through the embeddings and add to the vectorstore.
        Args:
            texts: Iterable of strings to add to the vectorstore.
            metadatas: Optional list of metadatas associated with the texts.
            kwargs: vectorstore specific parameters
        Returns:
            List of ids from adding the texts into the vectorstore.
        """
        embeddings = self.embedding.embed_documents(list(texts))
        return self.add_embeddings(
            texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
        )
    async def aadd_texts(
        self,
        texts: Iterable[str],
        metadatas: Optional[List[dict]] = None,
        ids: Optional[List[str]] = None,
        **kwargs: Any,
    ) -> List[str]:
        """Run more texts through the embeddings and add to the vectorstore.
        Args:
            texts: Iterable of strings to add to the vectorstore.
            metadatas: Optional list of metadatas associated with the texts.
            kwargs: vectorstore specific parameters
        Returns:
            List of ids from adding the texts into the vectorstore.
        """
        embeddings = self.embedding.embed_documents(list(texts))
        return await self.aadd_embeddings(
            texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
        )
    def similarity_search(
        self,
        query: str,
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Document]:
        """Run similarity search with TimescaleVector with distance.
        Args:
            query (str): Query text to search for.
            k (int): Number of results to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query.
        """
        embedding = self.embedding.embed_query(text=query)
        return self.similarity_search_by_vector(
            embedding=embedding,
            k=k,
            filter=filter,
            predicates=predicates,
            **kwargs,
        )
    async def asimilarity_search(
        self,
        query: str,
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Document]:
        """Run similarity search with TimescaleVector with distance.
        Args:
            query (str): Query text to search for.
            k (int): Number of results to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query.
        """
        embedding = self.embedding.embed_query(text=query)
        return await self.asimilarity_search_by_vector(
            embedding=embedding,
            k=k,
            filter=filter,
            predicates=predicates,
            **kwargs,
        )
    def similarity_search_with_score(
        self,
        query: str,
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        """Return docs most similar to query.
        Args:
            query: Text to look up documents similar to.
            k: Number of Documents to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query and score for each
        """
        embedding = self.embedding.embed_query(query)
        docs = self.similarity_search_with_score_by_vector(
            embedding=embedding,
            k=k,
            filter=filter,
            predicates=predicates,
            **kwargs,
        )
        return docs
    async def asimilarity_search_with_score(
        self,
        query: str,
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        """Return docs most similar to query.
        Args:
            query: Text to look up documents similar to.
            k: Number of Documents to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query and score for each
        """
        embedding = self.embedding.embed_query(query)
        return await self.asimilarity_search_with_score_by_vector(
            embedding=embedding,
            k=k,
            filter=filter,
            predicates=predicates,
            **kwargs,
        )
    def date_to_range_filter(self, **kwargs: Any) -> Any:
        constructor_args = {
            key: kwargs[key]
            for key in [
                "start_date",
                "end_date",
                "time_delta",
                "start_inclusive",
                "end_inclusive",
            ]
            if key in kwargs
        }
        if not constructor_args or len(constructor_args) == 0:
            return None
        try:
            from timescale_vector import client
        except ImportError:
            raise ImportError(
                "Could not import timescale_vector python package. "
                "Please install it with `pip install timescale-vector`."
            )
        return client.UUIDTimeRange(**constructor_args)
    def similarity_search_with_score_by_vector(
        self,
        embedding: List[float],
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        try:
            from timescale_vector import client
        except ImportError:
            raise ImportError(
                "Could not import timescale_vector python package. "
                "Please install it with `pip install timescale-vector`."
            )
        results = self.sync_client.search(
            embedding,
            limit=k,
            filter=filter,
            predicates=predicates,
            uuid_time_filter=self.date_to_range_filter(**kwargs),
        )
        docs = [
            (
                Document(
                    page_content=result[client.SEARCH_RESULT_CONTENTS_IDX],
                    metadata=result[client.SEARCH_RESULT_METADATA_IDX],
                ),
                result[client.SEARCH_RESULT_DISTANCE_IDX],
            )
            for result in results
        ]
        return docs
    async def asimilarity_search_with_score_by_vector(
        self,
        embedding: List[float],
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        try:
            from timescale_vector import client
        except ImportError:
            raise ImportError(
                "Could not import timescale_vector python package. "
                "Please install it with `pip install timescale-vector`."
            )
        results = await self.async_client.search(
            embedding,
            limit=k,
            filter=filter,
            predicates=predicates,
            uuid_time_filter=self.date_to_range_filter(**kwargs),
        )
        docs = [
            (
                Document(
                    page_content=result[client.SEARCH_RESULT_CONTENTS_IDX],
                    metadata=result[client.SEARCH_RESULT_METADATA_IDX],
                ),
                result[client.SEARCH_RESULT_DISTANCE_IDX],
            )
            for result in results
        ]
        return docs
    def similarity_search_by_vector(
        self,
        embedding: List[float],
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Document]:
        """Return docs most similar to embedding vector.
        Args:
            embedding: Embedding to look up documents similar to.
            k: Number of Documents to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query vector.
        """
        docs_and_scores = self.similarity_search_with_score_by_vector(
            embedding=embedding, k=k, filter=filter, predicates=predicates, **kwargs
        )
        return [doc for doc, _ in docs_and_scores]
    async def asimilarity_search_by_vector(
        self,
        embedding: List[float],
        k: int = 4,
        filter: Optional[Union[dict, list]] = None,
        predicates: Optional[Predicates] = None,
        **kwargs: Any,
    ) -> List[Document]:
        """Return docs most similar to embedding vector.
        Args:
            embedding: Embedding to look up documents similar to.
            k: Number of Documents to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.
        Returns:
            List of Documents most similar to the query vector.
        """
        docs_and_scores = await self.asimilarity_search_with_score_by_vector(
            embedding=embedding, k=k, filter=filter, predicates=predicates, **kwargs
        )
        return [doc for doc, _ in docs_and_scores]
    @classmethod
    def from_texts(
        cls: Type[TimescaleVector],
        texts: List[str],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        ids: Optional[List[str]] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        """
        Return VectorStore initialized from texts and embeddings.
        Postgres connection string is required
        "Either pass it as a parameter
        or set the TIMESCALE_SERVICE_URL environment variable.
        """
        embeddings = embedding.embed_documents(list(texts))
        return cls.__from(
            texts,
            embeddings,
            embedding,
            metadatas=metadatas,
            ids=ids,
            collection_name=collection_name,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
    @classmethod
    async def afrom_texts(
        cls: Type[TimescaleVector],
        texts: List[str],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        ids: Optional[List[str]] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        """
        Return VectorStore initialized from texts and embeddings.
        Postgres connection string is required
        "Either pass it as a parameter
        or set the TIMESCALE_SERVICE_URL environment variable.
        """
        embeddings = embedding.embed_documents(list(texts))
        return await cls.__afrom(
            texts,
            embeddings,
            embedding,
            metadatas=metadatas,
            ids=ids,
            collection_name=collection_name,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
    @classmethod
    def from_embeddings(
        cls,
        text_embeddings: List[Tuple[str, List[float]]],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        ids: Optional[List[str]] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        """Construct TimescaleVector wrapper from raw documents and pre-
        generated embeddings.
        Return VectorStore initialized from documents and embeddings.
        Postgres connection string is required
        "Either pass it as a parameter
        or set the TIMESCALE_SERVICE_URL environment variable.
        Example:
            .. code-block:: python
                from langchain.vectorstores import TimescaleVector
                from langchain.embeddings import OpenAIEmbeddings
                embeddings = OpenAIEmbeddings()
                text_embeddings = embeddings.embed_documents(texts)
                text_embedding_pairs = list(zip(texts, text_embeddings))
                tvs = TimescaleVector.from_embeddings(text_embedding_pairs, embeddings)
        """
        texts = [t[0] for t in text_embeddings]
        embeddings = [t[1] for t in text_embeddings]
        return cls.__from(
            texts,
            embeddings,
            embedding,
            metadatas=metadatas,
            ids=ids,
            collection_name=collection_name,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
    @classmethod
    async def afrom_embeddings(
        cls,
        text_embeddings: List[Tuple[str, List[float]]],
        embedding: Embeddings,
        metadatas: Optional[List[dict]] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        ids: Optional[List[str]] = None,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        """Construct TimescaleVector wrapper from raw documents and pre-
        generated embeddings.
        Return VectorStore initialized from documents and embeddings.
        Postgres connection string is required
        "Either pass it as a parameter
        or set the TIMESCALE_SERVICE_URL environment variable.
        Example:
            .. code-block:: python
                from langchain.vectorstores import TimescaleVector
                from langchain.embeddings import OpenAIEmbeddings
                embeddings = OpenAIEmbeddings()
                text_embeddings = embeddings.embed_documents(texts)
                text_embedding_pairs = list(zip(texts, text_embeddings))
                tvs = TimescaleVector.from_embeddings(text_embedding_pairs, embeddings)
        """
        texts = [t[0] for t in text_embeddings]
        embeddings = [t[1] for t in text_embeddings]
        return await cls.__afrom(
            texts,
            embeddings,
            embedding,
            metadatas=metadatas,
            ids=ids,
            collection_name=collection_name,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
            **kwargs,
        )
    @classmethod
    def from_existing_index(
        cls: Type[TimescaleVector],
        embedding: Embeddings,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        pre_delete_collection: bool = False,
        **kwargs: Any,
    ) -> TimescaleVector:
        """
        Get intsance of an existing TimescaleVector store.This method will
        return the instance of the store without inserting any new
        embeddings
        """
        service_url = cls.get_service_url(kwargs)
        store = cls(
            service_url=service_url,
            collection_name=collection_name,
            embedding=embedding,
            distance_strategy=distance_strategy,
            pre_delete_collection=pre_delete_collection,
        )
        return store
    @classmethod
    def get_service_url(cls, kwargs: Dict[str, Any]) -> str:
        service_url: str = get_from_dict_or_env(
            data=kwargs,
            key="service_url",
            env_key="TIMESCALE_SERVICE_URL",
        )
        if not service_url:
            raise ValueError(
                "Postgres connection string is required"
                "Either pass it as a parameter"
                "or set the TIMESCALE_SERVICE_URL environment variable."
            )
        return service_url
    @classmethod
    def service_url_from_db_params(
        cls,
        host: str,
        port: int,
        database: str,
        user: str,
        password: str,
    ) -> str:
        """Return connection string from database parameters."""
        return f"postgresql://{user}:{password}@{host}:{port}/{database}"
    def _select_relevance_score_fn(self) -> Callable[[float], float]:
        """
        The 'correct' relevance function
        may differ depending on a few things, including:
        - the distance / similarity metric used by the VectorStore
        - the scale of your embeddings (OpenAI's are unit normed. Many others are not!)
        - embedding dimensionality
        - etc.
        """
        if self.override_relevance_score_fn is not None:
            return self.override_relevance_score_fn
        # Default strategy is to rely on distance strategy provided
        # in vectorstore constructor
        if self._distance_strategy == DistanceStrategy.COSINE:
            return self._cosine_relevance_score_fn
        elif self._distance_strategy == DistanceStrategy.EUCLIDEAN_DISTANCE:
            return self._euclidean_relevance_score_fn
        elif self._distance_strategy == DistanceStrategy.MAX_INNER_PRODUCT:
            return self._max_inner_product_relevance_score_fn
        else:
            raise ValueError(
                "No supported normalization function"
                f" for distance_strategy of {self._distance_strategy}."
                "Consider providing relevance_score_fn to TimescaleVector constructor."
            )
    def delete(self, ids: Optional[List[str]] = None, **kwargs: Any) -> Optional[bool]:
        """Delete by vector ID or other criteria.
        Args:
            ids: List of ids to delete.
            **kwargs: Other keyword arguments that subclasses might use.
        Returns:
            Optional[bool]: True if deletion is successful,
            False otherwise, None if not implemented.
        """
        if ids is None:
            raise ValueError("No ids provided to delete.")
        self.sync_client.delete_by_ids(ids)
        return True
    # todo should this be part of delete|()?
    def delete_by_metadata(
        self, filter: Union[Dict[str, str], List[Dict[str, str]]], **kwargs: Any
    ) -> Optional[bool]:
        """Delete by vector ID or other criteria.
        Args:
            ids: List of ids to delete.
            **kwargs: Other keyword arguments that subclasses might use.
        Returns:
            Optional[bool]: True if deletion is successful,
            False otherwise, None if not implemented.
        """
        self.sync_client.delete_by_metadata(filter)
        return True
    class IndexType(str, enum.Enum):
        """Enumerator for the supported Index types"""
        TIMESCALE_VECTOR = "tsv"
        PGVECTOR_IVFFLAT = "ivfflat"
        PGVECTOR_HNSW = "hnsw"
    DEFAULT_INDEX_TYPE = IndexType.TIMESCALE_VECTOR
    def create_index(
        self, index_type: Union[IndexType, str] = DEFAULT_INDEX_TYPE, **kwargs: Any
    ) -> None:
        try:
            from timescale_vector import client
        except ImportError:
            raise ImportError(
                "Could not import timescale_vector python package. "
                "Please install it with `pip install timescale-vector`."
            )
        index_type = (
            index_type.value if isinstance(index_type, self.IndexType) else index_type
        )
        if index_type == self.IndexType.PGVECTOR_IVFFLAT.value:
            self.sync_client.create_embedding_index(client.IvfflatIndex(**kwargs))
        if index_type == self.IndexType.PGVECTOR_HNSW.value:
            self.sync_client.create_embedding_index(client.HNSWIndex(**kwargs))
        if index_type == self.IndexType.TIMESCALE_VECTOR.value:
            self.sync_client.create_embedding_index(
                client.TimescaleVectorIndex(**kwargs)
            )
    def drop_index(self) -> None:
        self.sync_client.drop_embedding_index()
--- a/libs/langchain/poetry.lock
+++ b/libs/langchain/poetry.lock
--- a/libs/langchain/pyproject.toml
+++ b/libs/langchain/pyproject.toml
@ -129,6 +129,7 @@ markdownify = {version = "^0.11.6", optional = true}
 assemblyai = {version = "^0.17.0", optional = true}
 dashvector = {version = "^1.0.1", optional = true}
 sqlite-vss = {version = "^0.1.2", optional = true}
 timescale-vector = {version = "^0.0.1", optional = true}
 [tool.poetry.group.test.dependencies]
@ -345,6 +346,7 @@ extended_testing = [
 "markdownify",
 "dashvector",
 "sqlite-vss",
 "timescale-vector",
 ]
 [tool.ruff]
--- a/libs/langchain/tests/integration_tests/vectorstores/test_timescalevector.py
+++ b/libs/langchain/tests/integration_tests/vectorstores/test_timescalevector.py
@ -0,0 +1,433 @@
 """Test TimescaleVector functionality."""
 import os
 from datetime import datetime, timedelta
 from typing import List
 import pytest
 from langchain.docstore.document import Document
 from langchain.vectorstores.timescalevector import TimescaleVector
 from tests.integration_tests.vectorstores.fake_embeddings import FakeEmbeddings
 SERVICE_URL = TimescaleVector.service_url_from_db_params(
    host=os.environ.get("TEST_TIMESCALE_HOST", "localhost"),
    port=int(os.environ.get("TEST_TIMESCALE_PORT", "5432")),
    database=os.environ.get("TEST_TIMESCALE_DATABASE", "postgres"),
    user=os.environ.get("TEST_TIMESCALE_USER", "postgres"),
    password=os.environ.get("TEST_TIMESCALE_PASSWORD", "postgres"),
 )
 ADA_TOKEN_COUNT = 1536
 class FakeEmbeddingsWithAdaDimension(FakeEmbeddings):
    """Fake embeddings functionality for testing."""
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Return simple embeddings."""
        return [
            [float(1.0)] * (ADA_TOKEN_COUNT - 1) + [float(i)] for i in range(len(texts))
        ]
    def embed_query(self, text: str) -> List[float]:
        """Return simple embeddings."""
        return [float(1.0)] * (ADA_TOKEN_COUNT - 1) + [float(0.0)]
 def test_timescalevector() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search("foo", k=1)
    assert output == [Document(page_content="foo")]
 def test_timescalevector_from_documents() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    docs = [Document(page_content=t, metadata={"a": "b"}) for t in texts]
    docsearch = TimescaleVector.from_documents(
        documents=docs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search("foo", k=1)
    assert output == [Document(page_content="foo", metadata={"a": "b"})]
@pytest.mark.asyncio
 async def test_timescalevector_afrom_documents() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    docs = [Document(page_content=t, metadata={"a": "b"}) for t in texts]
    docsearch = await TimescaleVector.afrom_documents(
        documents=docs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = await docsearch.asimilarity_search("foo", k=1)
    assert output == [Document(page_content="foo", metadata={"a": "b"})]
 def test_timescalevector_embeddings() -> None:
    """Test end to end construction with embeddings and search."""
    texts = ["foo", "bar", "baz"]
    text_embeddings = FakeEmbeddingsWithAdaDimension().embed_documents(texts)
    text_embedding_pairs = list(zip(texts, text_embeddings))
    docsearch = TimescaleVector.from_embeddings(
        text_embeddings=text_embedding_pairs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search("foo", k=1)
    assert output == [Document(page_content="foo")]
@pytest.mark.asyncio
 async def test_timescalevector_aembeddings() -> None:
    """Test end to end construction with embeddings and search."""
    texts = ["foo", "bar", "baz"]
    text_embeddings = FakeEmbeddingsWithAdaDimension().embed_documents(texts)
    text_embedding_pairs = list(zip(texts, text_embeddings))
    docsearch = await TimescaleVector.afrom_embeddings(
        text_embeddings=text_embedding_pairs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = await docsearch.asimilarity_search("foo", k=1)
    assert output == [Document(page_content="foo")]
 def test_timescalevector_with_metadatas() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search("foo", k=1)
    assert output == [Document(page_content="foo", metadata={"page": "0"})]
 def test_timescalevector_with_metadatas_with_scores() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_score("foo", k=1)
    assert output == [(Document(page_content="foo", metadata={"page": "0"}), 0.0)]
@pytest.mark.asyncio
 async def test_timescalevector_awith_metadatas_with_scores() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = await TimescaleVector.afrom_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = await docsearch.asimilarity_search_with_score("foo", k=1)
    assert output == [(Document(page_content="foo", metadata={"page": "0"}), 0.0)]
 def test_timescalevector_with_filter_match() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection_filter",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_score("foo", k=1, filter={"page": "0"})
    assert output == [(Document(page_content="foo", metadata={"page": "0"}), 0.0)]
 def test_timescalevector_with_filter_distant_match() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection_filter",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_score("foo", k=1, filter={"page": "2"})
    assert output == [
        (Document(page_content="baz", metadata={"page": "2"}), 0.0013003906671379406)
    ]
 def test_timescalevector_with_filter_no_match() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection_filter",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_score("foo", k=1, filter={"page": "5"})
    assert output == []
 def test_timescalevector_with_filter_in_set() -> None:
    """Test end to end construction and search."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection_filter",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_score(
        "foo", k=2, filter=[{"page": "0"}, {"page": "2"}]
    )
    assert output == [
        (Document(page_content="foo", metadata={"page": "0"}), 0.0),
        (Document(page_content="baz", metadata={"page": "2"}), 0.0013003906671379406),
    ]
 def test_timescalevector_relevance_score() -> None:
    """Test to make sure the relevance score is scaled to 0-1."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = docsearch.similarity_search_with_relevance_scores("foo", k=3)
    assert output == [
        (Document(page_content="foo", metadata={"page": "0"}), 1.0),
        (Document(page_content="bar", metadata={"page": "1"}), 0.9996744261675065),
        (Document(page_content="baz", metadata={"page": "2"}), 0.9986996093328621),
    ]
@pytest.mark.asyncio
 async def test_timescalevector_relevance_score_async() -> None:
    """Test to make sure the relevance score is scaled to 0-1."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = await TimescaleVector.afrom_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    output = await docsearch.asimilarity_search_with_relevance_scores("foo", k=3)
    assert output == [
        (Document(page_content="foo", metadata={"page": "0"}), 1.0),
        (Document(page_content="bar", metadata={"page": "1"}), 0.9996744261675065),
        (Document(page_content="baz", metadata={"page": "2"}), 0.9986996093328621),
    ]
 def test_timescalevector_retriever_search_threshold() -> None:
    """Test using retriever for searching with threshold."""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    retriever = docsearch.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"k": 3, "score_threshold": 0.999},
    )
    output = retriever.get_relevant_documents("summer")
    assert output == [
        Document(page_content="foo", metadata={"page": "0"}),
        Document(page_content="bar", metadata={"page": "1"}),
    ]
 def test_timescalevector_retriever_search_threshold_custom_normalization_fn() -> None:
    """Test searching with threshold and custom normalization function"""
    texts = ["foo", "bar", "baz"]
    metadatas = [{"page": str(i)} for i in range(len(texts))]
    docsearch = TimescaleVector.from_texts(
        texts=texts,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        metadatas=metadatas,
        service_url=SERVICE_URL,
        pre_delete_collection=True,
        relevance_score_fn=lambda d: d * 0,
    )
    retriever = docsearch.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"k": 3, "score_threshold": 0.5},
    )
    output = retriever.get_relevant_documents("foo")
    assert output == []
 def test_timescalevector_delete() -> None:
    """Test deleting functionality."""
    texts = ["bar", "baz"]
    docs = [Document(page_content=t, metadata={"a": "b"}) for t in texts]
    docsearch = TimescaleVector.from_documents(
        documents=docs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    texts = ["foo"]
    meta = [{"b": "c"}]
    ids = docsearch.add_texts(texts, meta)
    output = docsearch.similarity_search("bar", k=10)
    assert len(output) == 3
    docsearch.delete(ids)
    output = docsearch.similarity_search("bar", k=10)
    assert len(output) == 2
    docsearch.delete_by_metadata({"a": "b"})
    output = docsearch.similarity_search("bar", k=10)
    assert len(output) == 0
 def test_timescalevector_with_index() -> None:
    """Test deleting functionality."""
    texts = ["bar", "baz"]
    docs = [Document(page_content=t, metadata={"a": "b"}) for t in texts]
    docsearch = TimescaleVector.from_documents(
        documents=docs,
        collection_name="test_collection",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
    )
    texts = ["foo"]
    meta = [{"b": "c"}]
    docsearch.add_texts(texts, meta)
    docsearch.create_index()
    output = docsearch.similarity_search("bar", k=10)
    assert len(output) == 3
    docsearch.drop_index()
    docsearch.create_index(
        index_type=TimescaleVector.IndexType.TIMESCALE_VECTOR,
        max_alpha=1.0,
        num_neighbors=50,
    )
    docsearch.drop_index()
    docsearch.create_index("tsv", max_alpha=1.0, num_neighbors=50)
    docsearch.drop_index()
    docsearch.create_index("ivfflat", num_lists=20, num_records=1000)
    docsearch.drop_index()
    docsearch.create_index("hnsw", m=16, ef_construction=64)
 def test_timescalevector_time_partitioning() -> None:
    """Test deleting functionality."""
    from timescale_vector import client
    texts = ["bar", "baz"]
    docs = [Document(page_content=t, metadata={"a": "b"}) for t in texts]
    docsearch = TimescaleVector.from_documents(
        documents=docs,
        collection_name="test_collection_time_partitioning",
        embedding=FakeEmbeddingsWithAdaDimension(),
        service_url=SERVICE_URL,
        pre_delete_collection=True,
        time_partition_interval=timedelta(hours=1),
    )
    texts = ["foo"]
    meta = [{"b": "c"}]
    ids = [client.uuid_from_time(datetime.now() - timedelta(hours=3))]
    docsearch.add_texts(texts, meta, ids)
    output = docsearch.similarity_search("bar", k=10)
    assert len(output) == 3
    output = docsearch.similarity_search(
        "bar", k=10, start_date=datetime.now() - timedelta(hours=1)
    )
    assert len(output) == 2
    output = docsearch.similarity_search(
        "bar", k=10, end_date=datetime.now() - timedelta(hours=1)
    )
    assert len(output) == 1
    output = docsearch.similarity_search(
        "bar", k=10, start_date=datetime.now() - timedelta(minutes=200)
    )
    assert len(output) == 3
    output = docsearch.similarity_search(
        "bar",
        k=10,
        start_date=datetime.now() - timedelta(minutes=200),
        time_delta=timedelta(hours=1),
    )
    assert len(output) == 1
--- a/libs/langchain/tests/unit_tests/retrievers/self_query/test_timescalevector.py
+++ b/libs/langchain/tests/unit_tests/retrievers/self_query/test_timescalevector.py
@ -0,0 +1,97 @@
 from typing import Dict, Tuple
 import pytest as pytest
 from langchain.chains.query_constructor.ir import (
    Comparator,
    Comparison,
    Operation,
    Operator,
    StructuredQuery,
 )
 from langchain.retrievers.self_query.timescalevector import TimescaleVectorTranslator
 DEFAULT_TRANSLATOR = TimescaleVectorTranslator()
@pytest.mark.requires("timescale_vector")
 def test_visit_comparison() -> None:
    from timescale_vector import client
    comp = Comparison(comparator=Comparator.LT, attribute="foo", value=1)
    expected = client.Predicates(("foo", "<", 1))
    actual = DEFAULT_TRANSLATOR.visit_comparison(comp)
    assert expected == actual
@pytest.mark.requires("timescale_vector")
 def test_visit_operation() -> None:
    from timescale_vector import client
    op = Operation(
        operator=Operator.AND,
        arguments=[
            Comparison(comparator=Comparator.LT, attribute="foo", value=2),
            Comparison(comparator=Comparator.EQ, attribute="bar", value="baz"),
            Comparison(comparator=Comparator.GT, attribute="abc", value=2.0),
        ],
    )
    expected = client.Predicates(
        client.Predicates(("foo", "<", 2)),
        client.Predicates(("bar", "==", "baz")),
        client.Predicates(("abc", ">", 2.0)),
    )
    actual = DEFAULT_TRANSLATOR.visit_operation(op)
    assert expected == actual
@pytest.mark.requires("timescale_vector")
 def test_visit_structured_query() -> None:
    from timescale_vector import client
    query = "What is the capital of France?"
    structured_query = StructuredQuery(
        query=query,
        filter=None,
    )
    expected: Tuple[str, Dict] = (query, {})
    actual = DEFAULT_TRANSLATOR.visit_structured_query(structured_query)
    assert expected == actual
    comp = Comparison(comparator=Comparator.LT, attribute="foo", value=1)
    expected = (
        query,
        {"predicates": client.Predicates(("foo", "<", 1))},
    )
    structured_query = StructuredQuery(
        query=query,
        filter=comp,
    )
    actual = DEFAULT_TRANSLATOR.visit_structured_query(structured_query)
    assert expected == actual
    op = Operation(
        operator=Operator.AND,
        arguments=[
            Comparison(comparator=Comparator.LT, attribute="foo", value=2),
            Comparison(comparator=Comparator.EQ, attribute="bar", value="baz"),
            Comparison(comparator=Comparator.GT, attribute="abc", value=2.0),
        ],
    )
    structured_query = StructuredQuery(
        query=query,
        filter=op,
    )
    expected = (
        query,
        {
            "predicates": client.Predicates(
                client.Predicates(("foo", "<", 2)),
                client.Predicates(("bar", "==", "baz")),
                client.Predicates(("abc", ">", 2.0)),
            )
        },
    )
    actual = DEFAULT_TRANSLATOR.visit_structured_query(structured_query)
    assert expected == actual