langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-10-07 13:17:55 +00:00

Files

Pau Ramon Revilla 87802c86d9 Added a MHTML document loader (#6311)

MHTML is an interesting format because it is used both for emails and
for archived webpages. Some scraping projects store pages on disk to
process them later, and MHTML is a good fit for that use case.

This is heavily inspired by the BeautifulSoup HTML loader, but it first
extracts the HTML part from the MHTML file.
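
The idea described above (pull the HTML part out of the MHTML container, then turn it into text) can be sketched roughly as follows. This is not the code added in the commit: it is a minimal illustration that assumes the MHTML file is a standard MIME multipart message, uses only the Python standard library (`email` and `html.parser`) in place of BeautifulSoup, and uses hypothetical names (`extract_html_from_mhtml`, `html_to_text`, `SAMPLE_MHTML`) that do not come from the repository.

```python
import email
from email import policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_html_from_mhtml(raw: bytes) -> str:
    """Return the first text/html part of an MHTML (MIME multipart) message.

    Hypothetical helper: assumes the archive follows the usual MIME layout.
    """
    message = email.message_from_bytes(raw, policy=policy.default)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes the transfer encoding and the charset.
            return part.get_content()
    raise ValueError("no text/html part found in the MHTML data")


def html_to_text(html: str) -> str:
    """Reduce HTML to its visible text, one text node per line."""
    extractor = _TextExtractor()
    extractor.feed(html)
    return "\n".join(extractor.chunks)


# A tiny hand-written MHTML document, used only for illustration.
SAMPLE_MHTML = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="BOUNDARY"\r\n'
    b"\r\n"
    b"--BOUNDARY\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<html><head><title>Demo</title></head>"
    b"<body><p>Hello =3D world</p></body></html>\r\n"
    b"--BOUNDARY--\r\n"
)

if __name__ == "__main__":
    html = extract_html_from_mhtml(SAMPLE_MHTML)
    print(html_to_text(html))
```

Parsing the container with the `email` package means the quoted-printable or base64 transfer encoding that MHTML archives commonly use is decoded before the HTML is processed, which is the step a plain HTML loader would not perform.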

---------

Co-authored-by: rlm <pexpresss31@gmail.com>

2023-06-25 13:12:08 -07:00

document_loaders/integrations

Added a MHTML document loader (#6311)

2023-06-25 13:12:08 -07:00

document_transformers/text_splitters

MD header text splitter returns Documents (#6571)

2023-06-22 09:25:38 -07:00

retrievers

Kendra retriever api (#6616)

2023-06-23 14:59:35 -07:00

text_embedding/integrations

Doc refactor (#6300)

2023-06-16 11:52:56 -07:00

vectorstores/integrations

chroma nb close img tag (#6669)

2023-06-23 15:41:54 -07:00