{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "vm8vn9t8DvC_" }, "source": [ "# MongoDB" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[MongoDB](https://www.mongodb.com/) is a NoSQL , document-oriented database that supports JSON-like documents with a dynamic schema." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "5WjXERXzFEhg" }, "source": [ "## Overview" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": { "id": "juAmbgoWD17u" }, "source": [ "The MongoDB Document Loader returns a list of Langchain Documents from a MongoDB database.\n", "\n", "The Loader requires the following parameters:\n", "\n", "* MongoDB connection string\n", "* MongoDB database name\n", "* MongoDB collection name\n", "* (Optional) Content Filter dictionary\n", "\n", "The output takes the following format:\n", "\n", "- pageContent= Mongo Document\n", "- metadata={'database': '[database_name]', 'collection': '[collection_name]'}" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Load the Document Loader" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# add this import for running in jupyter notebook\n", "import nest_asyncio\n", "nest_asyncio.apply()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain.document_loaders.mongodb import MongodbLoader" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "loader = MongodbLoader(connection_string=\"mongodb://localhost:27017/\",\n", " db_name=\"sample_restaurants\", \n", " collection_name=\"restaurants\",\n", " filter_criteria={\"borough\": \"Bronx\", \"cuisine\": \"Bakery\" },\n", " ) " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "25359" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "docs = loader.load()\n", "\n", "len(docs)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Document(page_content=\"{'_id': ObjectId('5eb3d668b31de5d588f4292a'), 'address': {'building': '2780', 'coord': [-73.98241999999999, 40.579505], 'street': 'Stillwell Avenue', 'zipcode': '11224'}, 'borough': 'Brooklyn', 'cuisine': 'American', 'grades': [{'date': datetime.datetime(2014, 6, 10, 0, 0), 'grade': 'A', 'score': 5}, {'date': datetime.datetime(2013, 6, 5, 0, 0), 'grade': 'A', 'score': 7}, {'date': datetime.datetime(2012, 4, 13, 0, 0), 'grade': 'A', 'score': 12}, {'date': datetime.datetime(2011, 10, 12, 0, 0), 'grade': 'A', 'score': 12}], 'name': 'Riviera Caterer', 'restaurant_id': '40356018'}\", metadata={'database': 'sample_restaurants', 'collection': 'restaurants'})" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "docs[0]" ] } ], "metadata": { "colab": { "collapsed_sections": [ "5WjXERXzFEhg" ], "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 4 }