community: updated Browserbase loader (#21757)

Thank you for contributing to LangChain!

- [x] **PR title**: "community: updated Browserbase loader"

- [x] **PR message**:
    Updates the Browserbase loader with more options and improved docs.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
This commit is contained in:
Mish Ushakov
2024-05-16 17:21:23 +02:00
committed by GitHub
parent 1e6517ba73
commit d77e60a7f4
3 changed files with 43 additions and 11 deletions

View File

@@ -6,11 +6,17 @@
"source": [
"# Browserbase\n",
"\n",
"[Browserbase](https://browserbase.com) is a serverless platform for running headless browsers, it offers advanced debugging, session recordings, stealth mode, integrated proxies and captcha solving.\n",
"[Browserbase](https://browserbase.com) is a developer platform to reliably run, manage, and monitor headless browsers.\n",
"\n",
"## Installation\n",
"Power your AI data retrievals with:\n",
"- [Serverless Infrastructure](https://docs.browserbase.com/under-the-hood) providing reliable browsers to extract data from complex UIs\n",
"- [Stealth Mode](https://docs.browserbase.com/features/stealth-mode) with included fingerprinting tactics and automatic captcha solving\n",
"- [Session Debugger](https://docs.browserbase.com/features/sessions) to inspect your Browser Session with networks timeline and logs\n",
"- [Live Debug](https://docs.browserbase.com/guides/session-debug-connection/browser-remote-control) to quickly debug your automation\n",
"\n",
"- Get an API key from [browserbase.com](https://browserbase.com) and set it in environment variables (`BROWSERBASE_API_KEY`).\n",
"## Installation and Setup\n",
"\n",
"- Get an API key and Project ID from [browserbase.com](https://browserbase.com) and set it in environment variables (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`).\n",
"- Install the [Browserbase SDK](http://github.com/browserbase/python-sdk):"
]
},
@@ -64,6 +70,20 @@
"print(docs[0].page_content[:61])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loader Options\n",
"\n",
"- `urls` Required. A list of URLs to fetch.\n",
"- `text_content` Retrieve only text content. Default is `False`.\n",
"- `api_key` Optional. Browserbase API key. Default is `BROWSERBASE_API_KEY` env variable.\n",
"- `project_id` Optional. Browserbase Project ID. Default is `BROWSERBASE_PROJECT_ID` env variable.\n",
"- `session_id` Optional. Provide an existing Session ID.\n",
"- `proxy` Optional. Enable/Disable Proxies."
]
},
{
"cell_type": "markdown",
"metadata": {},

View File

@@ -1,10 +1,16 @@
# Browserbase
>[Browserbase](https://browserbase.com) is a serverless platform for running headless browsers, it offers advanced debugging, session recordings, stealth mode, integrated proxies and captcha solving.
[Browserbase](https://browserbase.com) is a developer platform to reliably run, manage, and monitor headless browsers.
Power your AI data retrievals with:
- [Serverless Infrastructure](https://docs.browserbase.com/under-the-hood) providing reliable browsers to extract data from complex UIs
- [Stealth Mode](https://docs.browserbase.com/features/stealth-mode) with included fingerprinting tactics and automatic captcha solving
- [Session Debugger](https://docs.browserbase.com/features/sessions) to inspect your Browser Session with networks timeline and logs
- [Live Debug](https://docs.browserbase.com/guides/session-debug-connection/browser-remote-control) to quickly debug your automation
## Installation and Setup
- Get an API key from [browserbase.com](https://browserbase.com) and set it in environment variables (`BROWSERBASE_API_KEY`).
- Get an API key and Project ID from [browserbase.com](https://browserbase.com) and set it in environment variables (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`).
- Install the [Browserbase SDK](http://github.com/browserbase/python-sdk):
```python