mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-28 18:48:50 +00:00
### Adds a document loader for Docugami

Specifically:

1. Adds a data loader that talks to the [Docugami](http://docugami.com) API to download processed documents as semantic XML
2. Parses the semantic XML into chunks, with additional metadata capturing chunk semantics
3. Adds a detailed notebook showing how you can use additional metadata returned by Docugami for techniques like the [self-querying retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html)
4. Adds an integration test, and related documentation

Here is an example of a result that is not possible without the capabilities added by Docugami (from the notebook):

<img width="1585" alt="image" src="https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b">

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>

| | | |
---|---|---|
.. | ||
agents | ||
callbacks | ||
chains | ||
chat_models | ||
client | ||
data | ||
docstore | ||
document_loader | ||
evaluation | ||
llms | ||
memory | ||
output_parsers | ||
prompts | ||
retrievers | ||
tools | ||
utilities | ||
vectorstores | ||
__init__.py | ||
conftest.py | ||
test_bash.py | ||
test_depedencies.py | ||
test_document_transformers.py | ||
test_formatting.py | ||
test_math_utils.py | ||
test_python.py | ||
test_schema.py | ||
test_sql_database_schema.py | ||
test_sql_database.py | ||
test_text_splitter.py |
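
The commit message above mentions parsing semantic XML into chunks that carry metadata about their meaning. A minimal sketch of that idea, using only the Python standard library, might look like the following. The sample XML, its tag names, and the `chunk_semantic_xml` helper are hypothetical illustrations; they are not Docugami's schema or the loader added in this commit.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real Docugami output uses its own schema.
SAMPLE_XML = """
<Lease>
  <Parties>
    <Landlord>Acme Properties LLC</Landlord>
    <Tenant>Jane Doe</Tenant>
  </Parties>
  <Term>
    <StartDate>2023-01-01</StartDate>
  </Term>
</Lease>
"""


def chunk_semantic_xml(xml_text):
    """Return one chunk per non-empty leaf element, with its tag and path as metadata."""
    root = ET.fromstring(xml_text.strip())
    chunks = []

    def walk(element, parent_path):
        path = f"{parent_path}/{element.tag}"
        children = list(element)
        if not children:
            # Leaf element: its text becomes the chunk body.
            text = (element.text or "").strip()
            if text:
                chunks.append(
                    {"text": text, "metadata": {"tag": element.tag, "path": path}}
                )
            return
        for child in children:
            walk(child, path)

    walk(root, "")
    return chunks


chunks = chunk_semantic_xml(SAMPLE_XML)
for chunk in chunks:
    print(chunk["metadata"]["path"], "->", chunk["text"])
```

Keeping the element tag and path next to each chunk is what would let a retriever filter on structure (for example, only `Landlord` chunks) rather than on text similarity alone.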