langchain/docs/docs/integrations/providers/hyperbrowser.mdx
Ninad Sinha a3671ceb71
docs: Add tools for hyperbrowser (#30606)
Description

This PR updates the docs for the
[langchain-hyperbrowser](https://pypi.org/project/langchain-hyperbrowser/)
package. It adds a few tools
 - Scrape Tool
 - Crawl Tool
 - Extract Tool
 - Browser Agents
   - Claude Computer Use
   - OpenAI CUA
   - Browser Use 

[Hyperbrowser](https://hyperbrowser.ai/) is a platform for running and
scaling headless browsers. It lets you launch and manage browser
sessions at scale and provides easy to use solutions for any webscraping
needs, such as scraping a single page or crawling an entire site.

Issue
None

Dependencies
None

Twitter Handle
`@hyperbrowser`
2025-04-04 16:02:47 -04:00

176 lines
5.8 KiB
Plaintext

# Hyperbrowser
> [Hyperbrowser](https://hyperbrowser.ai) is a platform for running and scaling headless browsers. It lets you launch and manage browser sessions at scale and provides easy to use solutions for any webscraping needs, such as scraping a single page or crawling an entire site.
>
> Key Features:
>
> - Instant Scalability - Spin up hundreds of browser sessions in seconds without infrastructure headaches
> - Simple Integration - Works seamlessly with popular tools like Puppeteer and Playwright
> - Powerful APIs - Easy to use APIs for scraping/crawling any site, and much more
> - Bypass Anti-Bot Measures - Built-in stealth mode, ad blocking, automatic CAPTCHA solving, and rotating proxies
For more information about Hyperbrowser, please visit the [Hyperbrowser website](https://hyperbrowser.ai) or if you want to check out the docs, you can visit the [Hyperbrowser docs](https://docs.hyperbrowser.ai).
## Installation and Setup
To get started with `langchain-hyperbrowser`, you can install the package using pip:
```bash
pip install langchain-hyperbrowser
```
And you should configure credentials by setting the following environment variables:
`HYPERBROWSER_API_KEY=<your-api-key>`
Make sure to get your API Key from https://app.hyperbrowser.ai/
## Available Tools
Hyperbrowser provides two main categories of tools that are particularly useful for:
- Web scraping and data extraction from complex websites
- Automating repetitive web tasks
- Interacting with web applications that require authentication
- Performing research across multiple websites
- Testing web applications
### Browser Agent Tools
Hyperbrowser provides a number of Browser Agents tools. Currently we supported
- Claude Computer Use
- OpenAI CUA
- Browser Use
You can see more details [here](/docs/integrations/tools/hyperbrowser_browser_agent_tools)
#### Browser Use Tool
A general-purpose browser automation tool that can handle various web tasks through natural language instructions.
```python
from langchain_hyperbrowser import HyperbrowserBrowserUseTool
tool = HyperbrowserBrowserUseTool()
result = tool.run({
"task": "Go to npmjs.com, find the React package, and tell me when it was last updated"
})
print(result)
```
#### OpenAI CUA Tool
Leverages OpenAI's Computer Use Agent capabilities for advanced web interactions and information gathering.
```python
from langchain_hyperbrowser import HyperbrowserOpenAICUATool
tool = HyperbrowserOpenAICUATool()
result = tool.run({
"task": "Go to Hacker News and summarize the top 5 posts right now"
})
print(result)
```
#### Claude Computer Use Tool
Utilizes Anthropic's Claude for sophisticated web browsing and information processing tasks.
```python
from langchain_hyperbrowser import HyperbrowserClaudeComputerUseTool
tool = HyperbrowserClaudeComputerUseTool()
result = tool.run({
"task": "Go to GitHub's trending repositories page, and list the top 3 posts there right now"
})
print(result)
```
### Web Scraping Tools
Here is a brief description of the Web Scraping Tools available with Hyperbrowser. You can see more details [here](/docs/integrations/tools/hyperbrowser_web_scraping_tools)
#### Scrape Tool
The Scrape Tool allows you to extract content from a single webpage in markdown, HTML, or link format.
```python
from langchain_hyperbrowser import HyperbrowserScrapeTool
tool = HyperbrowserScrapeTool()
result = tool.run({
"url": "https://example.com",
"scrape_options": {"formats": ["markdown"]}
})
print(result)
```
#### Crawl Tool
The Crawl Tool enables you to traverse entire websites, starting from a given URL, with configurable page limits.
```python
from langchain_hyperbrowser import HyperbrowserCrawlTool
tool = HyperbrowserCrawlTool()
result = tool.run({
"url": "https://example.com",
"max_pages": 2,
"scrape_options": {"formats": ["markdown"]}
})
print(result)
```
#### Extract Tool
The Extract Tool uses AI to pull structured data from web pages based on predefined schemas, making it perfect for data extraction tasks.
```python
from langchain_hyperbrowser import HyperbrowserExtractTool
from pydantic import BaseModel
class SimpleExtractionModel(BaseModel):
title: str
tool = HyperbrowserExtractTool()
result = tool.run({
"url": "https://example.com",
"schema": SimpleExtractionModel
})
print(result)
```
## Document Loader
The `HyperbrowserLoader` class in `langchain-hyperbrowser` can easily be used to load content from any single page or multiple pages as well as crawl an entire site.
The content can be loaded as markdown or html.
```python
from langchain_hyperbrowser import HyperbrowserLoader
loader = HyperbrowserLoader(urls="https://example.com")
docs = loader.load()
print(docs[0])
```
### Advanced Usage
You can specify the operation to be performed by the loader. The default operation is `scrape`. For `scrape`, you can provide a single URL or a list of URLs to be scraped. For `crawl`, you can only provide a single URL. The `crawl` operation will crawl the provided page and subpages and return a document for each page.
```python
loader = HyperbrowserLoader(
urls="https://hyperbrowser.ai", api_key="YOUR_API_KEY", operation="crawl"
)
```
Optional params for the loader can also be provided in the `params` argument. For more information on the supported params, visit https://docs.hyperbrowser.ai/reference/sdks/python/scrape#start-scrape-job-and-wait or https://docs.hyperbrowser.ai/reference/sdks/python/crawl#start-crawl-job-and-wait.
```python
loader = HyperbrowserLoader(
urls="https://example.com",
api_key="YOUR_API_KEY",
operation="scrape",
params={"scrape_options": {"include_tags": ["h1", "h2", "p"]}}
)
```
## Additional Resources
- [Hyperbrowser Docs](https://docs.hyperbrowser.ai/)
- [GitHub](https://github.com/hyperbrowserai/langchain-hyperbrowser/)
- [PyPi](https://pypi.org/project/langchain-hyperbrowser/)