mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-11 10:06:05 +00:00
I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.
68 lines
3.1 KiB
ReStructuredText
68 lines
3.1 KiB
ReStructuredText
How To Guides
|
|
====================================
|
|
|
|
There are a lot of different document loaders that LangChain supports. Below are how-to guides for working with them
|
|
|
|
`File Loader <./examples/unstructured_file.html>`_: A walkthrough of how to use Unstructured to load files of arbitrary types (pdfs, txt, html, etc).
|
|
|
|
`Directory Loader <./examples/directory_loader.html>`_: A walkthrough of how to use Unstructured load files from a given directory.
|
|
|
|
`Notion <./examples/notion.html>`_: A walkthrough of how to load data for an arbitrary Notion DB.
|
|
|
|
`ReadTheDocs <./examples/readthedocs_documentation.html>`_: A walkthrough of how to load data for documentation generated by ReadTheDocs.
|
|
|
|
`HTML <./examples/html.html>`_: A walkthrough of how to load data from an html file.
|
|
|
|
`PDF <./examples/pdf.html>`_: A walkthrough of how to load data from a PDF file.
|
|
|
|
`PowerPoint <./examples/powerpoint.html>`_: A walkthrough of how to load data from a powerpoint file.
|
|
|
|
`Email <./examples/email.html>`_: A walkthrough of how to load data from an email (`.eml`) file.
|
|
|
|
`GoogleDrive <./examples/googledrive.html>`_: A walkthrough of how to load data from Google drive.
|
|
|
|
`Microsoft Word <./examples/microsoft_word.html>`_: A walkthrough of how to load data from Microsoft Word files.
|
|
|
|
`Obsidian <./examples/obsidian.html>`_: A walkthrough of how to load data from an Obsidian file dump.
|
|
|
|
`Roam <./examples/roam.html>`_: A walkthrough of how to load data from a Roam file export.
|
|
|
|
`EverNote <./examples/evernote.html>`_: A walkthrough of how to load data from a EverNote (`.enex`) file.
|
|
|
|
`YouTube <./examples/youtube.html>`_: A walkthrough of how to load the transcript from a YouTube video.
|
|
|
|
`Hacker News <./examples/hn.html>`_: A walkthrough of how to load a Hacker News page.
|
|
|
|
`GitBook <./examples/gitbook.html>`_: A walkthrough of how to load a GitBook page.
|
|
|
|
`s3 File <./examples/s3_file.html>`_: A walkthrough of how to load a file from s3.
|
|
|
|
`s3 Directory <./examples/s3_directory.html>`_: A walkthrough of how to load all files in a directory from s3.
|
|
|
|
`GCS File <./examples/gcs_file.html>`_: A walkthrough of how to load a file from Google Cloud Storage (GCS).
|
|
|
|
`GCS Directory <./examples/gcs_directory.html>`_: A walkthrough of how to load all files in a directory from Google Cloud Storage (GCS).
|
|
|
|
`Web Base <./examples/web_base.html>`_: A walkthrough of how to load all text data from webpages.
|
|
|
|
`IMSDb <./examples/imsdb.html>`_: A walkthrough of how to load all text data from IMSDb webpage.
|
|
|
|
`AZLyrics <./examples/azlyrics.html>`_: A walkthrough of how to load all text data from AZLyrics webpage.
|
|
|
|
`College Confidential <./examples/college_confidential.html>`_: A walkthrough of how to load all text data from College Confidential webpage.
|
|
|
|
`Gutenberg <./examples/gutenberg.html>`_: A walkthrough of how to load data from a Gutenberg ebook text.
|
|
|
|
`Airbyte Json <./examples/airbyte_json.html>`_: A walkthrough of how to load data from a local Airbyte JSON file.
|
|
|
|
`Online PDF <./examples/online_pdf.html>`_: A walkthrough of how to load data from an online PDF.
|
|
|
|
`CoNLL-U <./examples/CoNLL-U.html>`_: A walkthrough of how to load data from a ConLL-U file.
|
|
|
|
.. toctree::
|
|
:maxdepth: 1
|
|
:glob:
|
|
:hidden:
|
|
|
|
examples/*
|