langchain/libs/community/langchain_community/document_loaders
Shawn f6d3a3546f
community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526)
https://github.com/langchain-ai/langchain/issues/17525

### Example Code

```python
from langchain_community.document_loaders.athena import AthenaLoader

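# Athena database to query
database_name = "database"
# Bucket root with no key prefix; this is the case that triggers the error
s3_output_path = "s3://bucket-no-prefix"
# Sample query returning the current hour, minute, and second
query = """SELECT
  CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour,
  CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute,
  CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second;
"""
# Local AWS credentials profile
profile_name = "AdministratorAccess"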

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
)

documents = loader.load()
print(documents)
```



### Error Message and Stack Trace (if applicable)

NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject
operation: The specified key does not exist

### Description

`AthenaLoader` fails when the S3 output URI for query results is a bucket
with no key prefix. Running the loader results in the error "NoSuchKey: An
error occurred (NoSuchKey) when calling the GetObject operation: The
specified key does not exist."

If `s3_output_path` includes a prefix, for example:

```python
s3_output_path = "s3://bucket-with-prefix/prefix"
```

the loader runs without error.

### Suggested solution

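When the URI has no prefix, `tokens[1:]` is empty, so the current
expression produces a key with a leading slash
(`/<query_execution_id>.csv`), which S3 cannot find. Change: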

```python
key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv"
```

to

```python
key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
```
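
To see why the one-line change matters, the following is a minimal sketch that compares both expressions outside of `AthenaLoader`. The helper names `split_s3_uri`, `old_key`, and `new_key` are hypothetical and not part of `langchain_community`; how `tokens` is actually derived inside `athena.py` is an assumption here (the URI with its `s3://` scheme stripped, then split on `/`).

```python
from typing import List


def split_s3_uri(s3_output_uri: str) -> List[str]:
    """Hypothetical helper: 's3://bucket/a/b' -> ['bucket', 'a', 'b'].

    Assumption: the loader tokenizes the URI roughly like this.
    Empty tokens (e.g. from a trailing slash) are dropped.
    """
    scheme = "s3://"
    path = s3_output_uri[len(scheme):] if s3_output_uri.startswith(scheme) else s3_output_uri
    return [token for token in path.split("/") if token]


def old_key(tokens: List[str], query_execution_id: str) -> str:
    # Original expression: always inserts "/" even when there is no prefix.
    return "/".join(tokens[1:]) + "/" + query_execution_id + ".csv"


def new_key(tokens: List[str], query_execution_id: str) -> str:
    # Suggested expression: only inserts "/" when a prefix is present.
    return "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"


if __name__ == "__main__":
    execution_id = "abc123"  # placeholder, not a real Athena query execution ID
    for uri in ("s3://bucket-no-prefix", "s3://bucket-with-prefix/prefix"):
        tokens = split_s3_uri(uri)
        print(uri, "->", repr(old_key(tokens, execution_id)), "vs", repr(new_key(tokens, execution_id)))
```

For a bucket-only URI the original expression yields a key with a leading slash, while the modified one yields just the file name. For a URI with a prefix, both expressions produce the same key.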


9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)


### System Info


System Information
------------------
> OS:  Darwin
> OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT
2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103
> Python Version:  3.9.9 (main, Jan  9 2023, 11:42:03) 
[Clang 14.0.0 (clang-1400.0.29.102)]

Package Information
-------------------
> langchain_core: 0.1.23
> langchain: 0.1.7
> langchain_community: 0.0.20
> langsmith: 0.0.87
> langchain_openai: 0.0.6
> langchainhub: 0.1.14

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:48:31 -08:00
blob_loaders community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
parsers community[patch]: fix import in language parser (#17538) 2024-02-14 11:11:23 -08:00
__init__.py community[minor]: Add pebblo safe document loader (#16862) 2024-02-12 21:56:12 -08:00
acreom.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
airbyte_json.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
airbyte.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
airtable.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
apify_dataset.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
arcgis_loader.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
arxiv.py Update arxiv.py with get_summaries_as_docs inside of Arxivloader (#14953) 2023-12-22 13:14:22 -08:00
assemblyai.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
astradb.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
async_html.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
athena.py community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526) 2024-02-14 11:48:31 -08:00
azlyrics.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
azure_ai_data.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
azure_blob_storage_container.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
azure_blob_storage_file.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
baiducloud_bos_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
baiducloud_bos_file.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
base_o365.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
base.py infra: add -p to mkdir in lint steps (#17013) 2024-02-05 11:22:06 -08:00
bibtex.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
bigquery.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
bilibili.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
blackboard.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
blockchain.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
brave_search.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
browserless.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
cassandra.py infra: add -p to mkdir in lint steps (#17013) 2024-02-05 11:22:06 -08:00
chatgpt.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
chm.py community[patch]: docstrings (#16810) 2024-02-09 12:48:57 -08:00
chromium.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
college_confidential.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
concurrent.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
confluence.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
conllu.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
couchbase.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
csv_loader.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
cube_semantic.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
datadog_logs.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
dataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
diffbot.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
directory.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
discord.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
doc_intelligence.py infra: add -p to mkdir in lint steps (#17013) 2024-02-05 11:22:06 -08:00
docugami.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
docusaurus.py docs: docstrings langchain_community update (#14889) 2023-12-19 08:58:24 -05:00
dropbox.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
duckdb_loader.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
email.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
epub.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
etherscan.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
evernote.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
excel.py Docs: fix excel document loader typo (#15470) 2024-01-07 09:33:35 -08:00
facebook_chat.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
fauna.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
figma.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
gcs_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
gcs_file.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
generic.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
geodataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
git.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
gitbook.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
github.py community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934) 2024-02-12 18:30:36 -08:00
google_speech_to_text.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
googledrive.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
gutenberg.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
helpers.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
hn.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
html_bs.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
html.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
hugging_face_dataset.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
ifixit.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
image_captions.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
image.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
imsdb.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
iugu.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
joplin.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
json_loader.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
lakefs.py docs: docstrings langchain_community update (#14889) 2023-12-19 08:58:24 -05:00
larksuite.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
markdown.py corrected outdated link (#15053) 2023-12-22 12:39:38 -08:00
mastodon.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
max_compute.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
mediawikidump.py infra: add -p to mkdir in lint steps (#17013) 2024-02-05 11:22:06 -08:00
merge.py langchain[minor],community[minor]: Add async methods in BaseLoader (#16634) 2024-01-31 11:08:11 -08:00
mhtml.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
modern_treasury.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
mongodb.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
news.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
notebook.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
notion.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
notiondb.py community[patch]: support query filters for NotionDBLoader (#17217) 2024-02-14 11:43:41 -08:00
nuclia.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
obs_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
obs_file.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
obsidian.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
odt.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
onedrive_file.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
onedrive.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
onenote.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
open_city_data.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
org_mode.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pdf.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
pebblo.py community[minor]: Add pebblo safe document loader (#16862) 2024-02-12 21:56:12 -08:00
polars_dataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
powerpoint.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
psychic.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pubmed.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pyspark_dataframe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
python.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
quip.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
readthedocs.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
recursive_url_loader.py community[patch]: doc loaders mypy fixes (#17368) 2024-02-12 16:51:06 -08:00
reddit.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
roam.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
rocksetdb.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
rspace.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
rss.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
rst.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
rtf.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
s3_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
s3_file.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
sharepoint.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
sitemap.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
slack_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
snowflake_loader.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
spreedly.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
srt.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
stripe.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
surrealdb.py community[patch]: SurrealDB fix for asyncio (#16092) 2024-01-23 19:46:19 -08:00
telegram.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
tencent_cos_directory.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
tencent_cos_file.py fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647) 2023-12-22 11:49:35 -08:00
tensorflow_datasets.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
text.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
tomarkdown.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
toml.py infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
trello.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
tsv.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
twitter.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
unstructured.py community[patch]: Load list of files using UnstructuredFileLoader (#16216) 2024-01-23 19:37:37 -08:00
url_playwright.py community[proxy]: Enhancement/add proxy support playwrighturlloader 16751 (#16822) 2024-02-12 19:48:29 -08:00
url_selenium.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
url.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
vsdx.py community[minor]: New documents loader for visio files (with extension .vsdx) (#16171) 2024-01-22 22:07:03 -08:00
weather.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
web_base.py community[patch]: Add Cookie Support to Fetch Method (#16673) 2024-01-27 16:03:53 -08:00
whatsapp_chat.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
wikipedia.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
word_document.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
xml.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
xorbits.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
youtube.py community[patch]: docstrings (#16810) 2024-02-09 12:48:57 -08:00