Mirror of https://github.com/hwchase17/langchain.git (synced 2025-08-07 12:06:43 +00:00)
docs: Update confluence code blocks with all the latest changes and update documentation (#31242)
Description: This change concerns the document-loader integration, specifically `Confluence`. I was trying to use the `ConfluenceLoader` and ran into deprecations when following the instructions in the documentation, so I updated the code blocks to reflect the latest changes in langchain and revised the documentation for better readability.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
parent 1f43b6062e
commit bd367ba10c
@@ -6,20 +6,54 @@
 "source": [
 "# Confluence\n",
 "\n",
-">[Confluence](https://www.atlassian.com/software/confluence) is a wiki collaboration platform that saves and organizes all of the project-related material. `Confluence` is a knowledge base that primarily handles content management activities. \n",
+"[Confluence](https://www.atlassian.com/software/confluence) is a wiki collaboration platform designed to save and organize all project-related materials. As a knowledge base, Confluence primarily serves content management activities.\n",
 "\n",
-"A loader for `Confluence` pages.\n",
+"This loader allows you to fetch and process Confluence pages into `Document` objects.\n",
 "\n",
+"---\n",
 "\n",
-"This currently supports `username/api_key`, `Oauth2 login`, `cookies`. Additionally, on-prem installations also support `token` authentication. \n",
+"## Authentication Methods\n",
 "\n",
+"The following authentication methods are supported:\n",
 "\n",
-"Specify a list `page_id`-s and/or `space_key` to load in the corresponding pages into Document objects, if both are specified the union of both sets will be returned.\n",
+"- `username/api_key`\n",
+"- `OAuth2 login`\n",
+"- `cookies`\n",
+"- On-premises installations: `token` authentication\n",
 "\n",
+"---\n",
 "\n",
-"You can also specify a boolean `include_attachments` to include attachments, this is set to False by default, if set to True all attachments will be downloaded and ConfluenceReader will extract the text from the attachments and add it to the Document object. Currently supported attachment types are: `PDF`, `PNG`, `JPEG/JPG`, `SVG`, `Word` and `Excel`.\n",
+"## Page Selection\n",
 "\n",
-"Hint: `space_key` and `page_id` can both be found in the URL of a page in Confluence - https://yoursite.atlassian.com/wiki/spaces/<space_key>/pages/<page_id>\n"
+"You can specify which pages to load using:\n",
+"\n",
+"- **page_ids** (*list*): \n",
+" A list of `page_id` values to load the corresponding pages.\n",
+"\n",
+"- **space_key** (*string*): \n",
+" A string of `space_key` value to load all pages within the specified confluence space.\n",
+"\n",
+"If both `page_ids` and `space_key` are provided, the loader will return the union of pages from both lists.\n",
+"\n",
+"*Hint:* Both `space_key` and `page_id` can be found in the URL of a Confluence page: \n",
+"`https://yoursite.atlassian.com/wiki/spaces/{space_key}/pages/{page_id}`\n",
+"\n",
+"---\n",
+"\n",
+"## Attachments\n",
+"\n",
+"You may include attachments in the loaded `Document` objects by setting the boolean parameter **include_attachments** to `True` (default: `False`). When enabled, all attachments are downloaded and their text content is extracted and added to the Document.\n",
+"\n",
+"**Currently supported attachment types:**\n",
+"\n",
+"- PDF (`.pdf`)\n",
+"- PNG (`.png`)\n",
+"- JPEG/JPG (`.jpeg`, `.jpg`)\n",
+"- SVG (`.svg`)\n",
+"- Word (`.doc`, `.docx`)\n",
+"- Excel (`.xls`, `.xlsx`)\n",
+"\n",
+"---"
 ]
 },
 {
@@ -70,9 +104,14 @@
 "from langchain_community.document_loaders import ConfluenceLoader\n",
 "\n",
 "loader = ConfluenceLoader(\n",
-" url=\"https://yoursite.atlassian.com/wiki\", username=\"me\", api_key=\"12345\"\n",
+" url=\"https://yoursite.atlassian.com/wiki\",\n",
+" username=\"<your-confluence-username>\",\n",
+" api_key=\"<your-api-token>\",\n",
+" space_key=\"<your-space-key>\",\n",
+" include_attachments=True,\n",
+" limit=50,\n",
 ")\n",
-"documents = loader.load(space_key=\"SPACE\", include_attachments=True, limit=50)"
+"documents = loader.load()"
 ]
 },
 {
@@ -95,10 +134,15 @@
 "source": [
 "from langchain_community.document_loaders import ConfluenceLoader\n",
 "\n",
-"loader = ConfluenceLoader(url=\"https://yoursite.atlassian.com/wiki\", token=\"12345\")\n",
-"documents = loader.load(\n",
-" space_key=\"SPACE\", include_attachments=True, limit=50, max_pages=50\n",
-")"
+"loader = ConfluenceLoader(\n",
+" url=\"https://confluence.yoursite.com/\",\n",
+" token=\"<your-personal-access-token>\",\n",
+" space_key=\"<your-space-key>\",\n",
+" include_attachments=True,\n",
+" limit=50,\n",
+" max_pages=50,\n",
+")\n",
+"documents = loader.load()"
 ]
 }
 ],
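
The hint in the updated notebook text says that `space_key` and `page_id` can both be read off a Confluence page URL. A minimal sketch of that lookup follows; it is not part of this commit. The helper name `parse_confluence_url` is hypothetical, and the code assumes the URL shape quoted in the hint (`.../wiki/spaces/{space_key}/pages/{page_id}`), which other Confluence URL styles may not follow.

```python
from urllib.parse import urlparse


def parse_confluence_url(page_url: str) -> tuple[str, str]:
    """Return (space_key, page_id) from a Confluence page URL.

    Hypothetical helper. Assumes the URL shape quoted in the notebook hint:
    https://yoursite.atlassian.com/wiki/spaces/{space_key}/pages/{page_id}
    """
    # Split the URL path into its non-empty segments.
    parts = [p for p in urlparse(page_url).path.split("/") if p]
    try:
        # The space key is the segment right after "spaces".
        spaces_at = parts.index("spaces")
        space_key = parts[spaces_at + 1]
        # Look for "pages" only after the space key, so a space that happens
        # to be named "pages" is not mistaken for the marker segment.
        pages_at = parts.index("pages", spaces_at + 2)
        page_id = parts[pages_at + 1]
    except (ValueError, IndexError) as exc:
        raise ValueError(f"not a recognizable Confluence page URL: {page_url}") from exc
    return space_key, page_id


space_key, page_id = parse_confluence_url(
    "https://yoursite.atlassian.com/wiki/spaces/ENG/pages/123456"
)
print(space_key, page_id)  # prints: ENG 123456
```

The two values could then be supplied as the `space_key` and `page_ids` parameters described in the notebook text above.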