mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-10 09:36:08 +00:00
Added a GitBook document loader. It lets you either (1) fetch text from any single GitBook page, or (2) fetch all relative paths and return their respective content as Documents. I've modified the `scrape` method in the `WebBaseLoader` to accept custom web paths if given, but I'm happy to remove that change and move the logic into the `GitbookLoader` itself.
66 lines
3.1 KiB
ReStructuredText
How To Guides
====================================

LangChain supports many different document loaders. Below are how-to guides for working with them.

`File Loader <./examples/unstructured_file.html>`_: A walkthrough of how to use Unstructured to load files of arbitrary types (PDF, TXT, HTML, etc.).

`Directory Loader <./examples/directory_loader.html>`_: A walkthrough of how to use Unstructured to load files from a given directory.

`Notion <./examples/notion.html>`_: A walkthrough of how to load data for an arbitrary Notion DB.

`ReadTheDocs <./examples/readthedocs_documentation.html>`_: A walkthrough of how to load data for documentation generated by ReadTheDocs.

`HTML <./examples/html.html>`_: A walkthrough of how to load data from an HTML file.

`PDF <./examples/pdf.html>`_: A walkthrough of how to load data from a PDF file.

`PowerPoint <./examples/powerpoint.html>`_: A walkthrough of how to load data from a PowerPoint file.

`Email <./examples/email.html>`_: A walkthrough of how to load data from an email (`.eml`) file.

`GoogleDrive <./examples/googledrive.html>`_: A walkthrough of how to load data from Google Drive.

`Microsoft Word <./examples/microsoft_word.html>`_: A walkthrough of how to load data from Microsoft Word files.

`Obsidian <./examples/obsidian.html>`_: A walkthrough of how to load data from an Obsidian file dump.

`Roam <./examples/roam.html>`_: A walkthrough of how to load data from a Roam file export.

`EverNote <./examples/evernote.html>`_: A walkthrough of how to load data from an EverNote (`.enex`) file.

`YouTube <./examples/youtube.html>`_: A walkthrough of how to load the transcript from a YouTube video.

`Hacker News <./examples/hn.html>`_: A walkthrough of how to load a Hacker News page.

`GitBook <./examples/gitbook.html>`_: A walkthrough of how to load a GitBook page.

`S3 File <./examples/s3_file.html>`_: A walkthrough of how to load a file from S3.

`S3 Directory <./examples/s3_directory.html>`_: A walkthrough of how to load all files in a directory from S3.

`GCS File <./examples/gcs_file.html>`_: A walkthrough of how to load a file from Google Cloud Storage (GCS).

`GCS Directory <./examples/gcs_directory.html>`_: A walkthrough of how to load all files in a directory from Google Cloud Storage (GCS).

`Web Base <./examples/web_base.html>`_: A walkthrough of how to load all text data from webpages.

`IMSDb <./examples/imsdb.html>`_: A walkthrough of how to load all text data from an IMSDb webpage.

`AZLyrics <./examples/azlyrics.html>`_: A walkthrough of how to load all text data from an AZLyrics webpage.

`College Confidential <./examples/college_confidential.html>`_: A walkthrough of how to load all text data from a College Confidential webpage.

`Gutenberg <./examples/gutenberg.html>`_: A walkthrough of how to load data from a Gutenberg ebook text.

`Airbyte JSON <./examples/airbyte_json.html>`_: A walkthrough of how to load data from a local Airbyte JSON file.

`Online PDF <./examples/online_pdf.html>`_: A walkthrough of how to load data from an online PDF.

.. toctree::
   :maxdepth: 1
   :glob:
   :hidden:

   examples/*