mirror of
https://github.com/hwchase17/langchain.git
synced 2025-07-10 06:55:09 +00:00
docs: Fix typo in quickstart.ipynb (#16859)
- **Description:** "load HTML **form** web URLs" should be "load HTML
**from** web URLs"? 🤔
- **Issue:** Typo
- **Dependencies:** Nope
- **Twitter handle:** n0vad3v
This commit is contained in:
parent
cf0b29b6d2
commit
eb7b05885f
@ -262,7 +262,7 @@
|
||||
"\n",
|
||||
"We need to first load the blog post contents. We can use [DocumentLoaders](/docs/modules/data_connection/document_loaders/) for this, which are objects that load in data from a source and return a list of [Documents](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html). A `Document` is an object with some `page_content` (str) and `metadata` (dict).\n",
|
||||
"\n",
|
||||
"In this case we'll use the [WebBaseLoader](/docs/integrations/document_loaders/web_base), which uses `urllib` to load HTML form web URLs and `BeautifulSoup` to parse it to text. We can customize the HTML -> text parsing by passing in parameters to the `BeautifulSoup` parser via `bs_kwargs` (see [BeautifulSoup docs](https://beautiful-soup-4.readthedocs.io/en/latest/#beautifulsoup)). In this case only HTML tags with class \"post-content\", \"post-title\", or \"post-header\" are relevant, so we'll remove all others."
|
||||
"In this case we'll use the [WebBaseLoader](/docs/integrations/document_loaders/web_base), which uses `urllib` to load HTML from web URLs and `BeautifulSoup` to parse it to text. We can customize the HTML -> text parsing by passing in parameters to the `BeautifulSoup` parser via `bs_kwargs` (see [BeautifulSoup docs](https://beautiful-soup-4.readthedocs.io/en/latest/#beautifulsoup)). In this case only HTML tags with class \"post-content\", \"post-title\", or \"post-header\" are relevant, so we'll remove all others."
|
||||
]
|
||||
},
|
||||
{
|
||||
|
Loading…
Reference in New Issue
Block a user