mirror of
https://github.com/hwchase17/langchain.git
synced 2025-10-04 11:49:23 +00:00
docs: ecosystem/integrations update 2 (#5282)
# docs: ecosystem/integrations update 2

#5219 - part 1

The second part of this update (the parts are independent of each other, with no overlap):

- added diffbot.md
- updated confluence.ipynb; added confluence.md
- updated college_confidential.md
- updated openai.md
- added blackboard.md
- added bilibili.md
- added azure_blob_storage.md
- added azlyrics.md
- added aws_s3.md

## Who can review?

@hwchase17 @agola11 @vowelparrot @dev2049
@@ -11,7 +11,7 @@
 ">It starts with computer vision, which classifies a page into one of 20 possible types. Content is then interpreted by a machine learning model trained to identify the key attributes on a page based on its type.\n",
 ">The result is a website transformed into clean structured data (like JSON or CSV), ready for your application.\n",
 "\n",
-"This covers how to extract HTML documents from a list of URLs using the [Diffbot extract API](https://www.diffbot.com/products/extract/), into a document format that we can use downstream."
+"This covers how to extract HTML documents from a list of URLs using the [Diffbot extract API](https://www.diffbot.com/products/extract/), into a document format that we can use downstream.\n"
 ]
 },
 {
@@ -31,7 +31,9 @@
 "id": "6fffec88",
 "metadata": {},
 "source": [
-"The Diffbot Extract API Requires an API token. Once you have it, you can extract the data from the previous URLs\n"
+"The Diffbot Extract API Requires an API token. Once you have it, you can extract the data.\n",
+"\n",
+"Read [instructions](https://docs.diffbot.com/reference/authentication) how to get the Diffbot API Token."
 ]
 },
 {