diff --git a/docs/docs/integrations/providers/cognee.mdx b/docs/docs/integrations/providers/cognee.mdx new file mode 100644 index 00000000000..8cc40a17288 --- /dev/null +++ b/docs/docs/integrations/providers/cognee.mdx @@ -0,0 +1,27 @@ +# Cognee + +Cognee implements scalable, modular ECL (Extract, Cognify, Load) pipelines that allow +you to interconnect and retrieve past conversations, documents, and audio +transcriptions while reducing hallucinations, developer effort, and cost. + +Cognee merges graph and vector databases to uncover hidden relationships and new +patterns in your data. You can automatically model, load and retrieve entities and +objects representing your business domain and analyze their relationships, uncovering +insights that neither vector stores nor graph stores alone can provide. + +Try it in a Google Colab notebook or have a look at the documentation. + +If you have questions, join cognee Discord community. + +Have you seen cognee's starter repo? Check it out! + + +## Installation and Setup + +```bash +pip install langchain-cognee +``` + +## Retrievers + +See detail on available retrievers [here](/docs/integrations/retrievers/cognee). diff --git a/docs/docs/integrations/retrievers/cognee.ipynb b/docs/docs/integrations/retrievers/cognee.ipynb new file mode 100644 index 00000000000..739e3312ced --- /dev/null +++ b/docs/docs/integrations/retrievers/cognee.ipynb @@ -0,0 +1,275 @@ +{ + "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Cognee\n", + "---" + ] + }, + { + "cell_type": "markdown", + "id": "e49f1e0d", + "metadata": {}, + "source": [ + "# CogneeRetriever\n", + "\n", + "This will help you getting started with the Cognee [retriever](/docs/concepts/retrievers). For detailed documentation of all CogneeRetriever features and configurations head to the [API reference](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.cognee.CogneeRetriever.html).\n", + "\n", + "### Integration details\n", + "\n", + "Bring-your-own data (i.e., index and search a custom corpus of documents):\n", + "\n", + "| Retriever | Self-host | Cloud offering | Package |\n", + "| :--- | :--- | :---: | :---: |\n", + "[CogneeRetriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.cognee.CogneeRetriever.html) | ✅ | ❌ | langchain-cognee |\n", + "\n", + "## Setup\n", + "\n", + "For cognee default setup, only thing you need is your OpenAI API key. \n" + ] + }, + { + "cell_type": "markdown", + "id": "72ee0c4b-9764-423a-9dbf-95129e185210", + "metadata": {}, + "source": [ + "If you want to get automated tracing from individual queries, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de", + "metadata": {}, + "outputs": [], + "source": [ + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n", + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"" + ] + }, + { + "cell_type": "markdown", + "id": "0730d6a1-c893-4840-9817-5e5251676d5d", + "metadata": {}, + "source": [ + "### Installation\n", + "\n", + "This retriever lives in the `langchain-cognee` package:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "652d6238-1f87-422a-b135-f5abbb8652fc", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -qU langchain-cognee" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b8bcb1e7", + "metadata": {}, + "outputs": [], + "source": [ + "import nest_asyncio\n", + "\n", + "nest_asyncio.apply()" + ] + }, + { + "cell_type": "markdown", + "id": "a38cde65-254d-4219-a441-068766c0d4b5", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our retriever:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "70cc8e65-2a02-408a-bbc6-8ef649057d82", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_cognee import CogneeRetriever\n", + "\n", + "retriever = CogneeRetriever(\n", + " llm_api_key=\"sk-\", # OpenAI API Key\n", + " dataset_name=\"my_dataset\",\n", + " k=3,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "5c5f2839-4020-424e-9fc9-07777eede442", + "metadata": {}, + "source": [ + "## Usage\n", + "\n", + "Add some documents, process them, and then run queries. Cognee retrieves relevant knowledge to your queries and generates final answers." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "51a60dbe-9f2e-4e04-bb62-23968f17164a", + "metadata": {}, + "outputs": [], + "source": [ + "# Example of adding and processing documents\n", + "from langchain_core.documents import Document\n", + "\n", + "docs = [\n", + " Document(page_content=\"Elon Musk is the CEO of SpaceX.\"),\n", + " Document(page_content=\"SpaceX focuses on rockets and space travel.\"),\n", + "]\n", + "\n", + "retriever.add_documents(docs)\n", + "retriever.process_data()\n", + "\n", + "# Now let's query the retriever\n", + "query = \"Tell me about Elon Musk\"\n", + "results = retriever.invoke(query)\n", + "\n", + "for idx, doc in enumerate(results, start=1):\n", + " print(f\"Doc {idx}: {doc.page_content}\")" + ] + }, + { + "cell_type": "markdown", + "id": "dfe8aad4-8626-4330-98a9-7ea1ca5d2e0e", + "metadata": {}, + "source": [ + "## Use within a chain\n", + "\n", + "Like other retrievers, CogneeRetriever can be incorporated into LLM applications via [chains](/docs/how_to/sequence/).\n", + "\n", + "We will need a LLM or chat model:\n", + "\n", + "import ChatModelTabs from \"@theme/ChatModelTabs\";\n", + "\n", + "" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25b647a3-f8f2-4541-a289-7a241e43f9df", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_openai import ChatOpenAI\n", + "\n", + "llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "23e11cc9-abd6-4855-a7eb-799f45ca01ae", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_cognee import CogneeRetriever\n", + "from langchain_core.documents import Document\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_core.prompts import ChatPromptTemplate\n", + "from langchain_core.runnables import RunnablePassthrough\n", + "\n", + "# Instantiate the retriever with your Cognee config\n", + "retriever = CogneeRetriever(llm_api_key=\"sk-\", dataset_name=\"my_dataset\", k=3)\n", + "\n", + "# Optionally, prune/reset the dataset for a clean slate\n", + "retriever.prune()\n", + "\n", + "# Add some documents\n", + "docs = [\n", + " Document(page_content=\"Elon Musk is the CEO of SpaceX.\"),\n", + " Document(page_content=\"SpaceX focuses on space travel.\"),\n", + "]\n", + "retriever.add_documents(docs)\n", + "retriever.process_data()\n", + "\n", + "\n", + "prompt = ChatPromptTemplate.from_template(\n", + " \"\"\"Answer the question based only on the context provided.\n", + "\n", + "Context: {context}\n", + "\n", + "Question: {question}\"\"\"\n", + ")\n", + "\n", + "\n", + "def format_docs(docs):\n", + " return \"\\n\\n\".join(doc.page_content for doc in docs)\n", + "\n", + "\n", + "chain = (\n", + " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt\n", + " | llm\n", + " | StrOutputParser()\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d47c37dd-5c11-416c-a3b6-bec413cd70e8", + "metadata": {}, + "outputs": [], + "source": [ + "answer = chain.invoke(\"What companies do Elon Musk own?\")\n", + "\n", + "print(\"\\nFinal chain answer:\\n\", answer)" + ] + }, + { + "cell_type": "markdown", + "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "TODO: add link to API reference." + ] + }, + { + "cell_type": "markdown", + "id": "a8dbdd72", + "metadata": {}, + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "langchain-cognee-wqM4bUfz-py3.11", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/libs/packages.yml b/libs/packages.yml index ca1c0249e28..c3add8e1e35 100644 --- a/libs/packages.yml +++ b/libs/packages.yml @@ -445,6 +445,9 @@ packages: repo: Shikenso-Analytics/langchain-discord downloads: 1 downloads_updated_at: '2025-02-15T16:00:00.000000+00:00' +- name: langchain-cognee + repo: topoteretes/langchain-cognee + path: . - name: langchain-prolog path: . repo: apisani1/langchain-prolog