langchain/libs/community/langchain_community/document_loaders/parsers
Alireza Kashani d1b4ead87c
community[patch]: Update grobid.py (#16298)
In some cases "coords" does not exist in the "sentence", so the
"split(";")" call on it raises an error.

This can be fixed by adding the guard "if sentence.get("coords") is not None:".

In this scenario "sbboxes" ends up empty, so "sbboxes[0]["page"]"
raises an error as well.

The PDF from https://pubmed.ncbi.nlm.nih.gov/23970373/ reproduces
these errors.
2024-01-22 14:03:58 -08:00
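The two failure modes described in the commit message can be sketched in isolation. This is a minimal illustration, not the actual contents of grobid.py: the helper names `parse_bboxes` and `page_range` are hypothetical, the sentence is modeled as a plain dict rather than a parsed XML element, and the "page first, then the remaining box fields" layout of each coords entry is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def parse_bboxes(sentence: Dict[str, str]) -> List[dict]:
    """Collect bounding boxes from a sentence, tolerating a missing "coords"."""
    coords = sentence.get("coords")
    if coords is None:
        # Without this guard, calling .split(";") on None raises AttributeError.
        return []
    bboxes = []
    for box in coords.split(";"):
        if not box:
            continue  # skip empty segments such as a trailing ";"
        fields = box.split(",")
        # Assumed layout: the first field is the page, the rest describe the box.
        bboxes.append({"page": fields[0], "box": fields[1:]})
    return bboxes


def page_range(bboxes: List[dict]) -> Optional[Tuple[str, str]]:
    """Return (first_page, last_page), or None when there are no boxes."""
    if not bboxes:
        # Without this check, bboxes[0]["page"] raises IndexError on an empty list.
        return None
    return bboxes[0]["page"], bboxes[-1]["page"]
```

A sentence without "coords" then yields an empty list and a page range of `None`, so the caller can decide whether to skip the sentence or to store it without page metadata.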
html community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
language community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
__init__.py community[minor]: Azure DocumentIntelligenceLoader/Parser support update with latest SDK (#14389) 2023-12-21 16:40:27 -08:00
audio.py community[patch]: Refactor OpenAIWhisperParserLocal (#15150) 2024-01-15 12:29:14 -08:00
doc_intelligence.py community: fix the "page" mode in the AzureAIDocumentIntelligenceParser (bug) (#15958) 2024-01-12 11:01:28 -08:00
docai.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
generic.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
grobid.py community[patch]: Update grobid.py (#16298) 2024-01-22 14:03:58 -08:00
msword.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pdf.py docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429) 2024-01-02 16:47:11 -05:00
registry.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
txt.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00