mirror of
https://github.com/hwchase17/langchain.git
synced 2025-08-14 07:07:34 +00:00
docs: fix apify actors notebok main heading text (#30040)
- **Description:** Fix Apify Actors tool notebook main heading text so there is an actual description instead of "Overview" in the tool integration description on [LangChain tools integration page](https://python.langchain.com/docs/integrations/tools/#all-tools).
This commit is contained in:
parent
230876a7c5
commit
d0f5bcda29
@ -29,7 +29,7 @@ You can use the `ApifyActorsTool` to use Apify Actors with agents.
|
||||
from langchain_apify import ApifyActorsTool
|
||||
```
|
||||
|
||||
See [this notebook](/docs/integrations/tools/apify_actors) for example usage.
|
||||
See [this notebook](/docs/integrations/tools/apify_actors) for example usage and a full example of a tool-calling agent with LangGraph in the [Apify LangGraph agent Actor template](https://apify.com/templates/python-langgraph).
|
||||
|
||||
For more information on how to use this tool, visit [the Apify integration documentation](https://docs.apify.com/platform/integrations/langgraph).
|
||||
|
||||
|
@ -1,256 +1,258 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "_9MNj58sIkGN"
|
||||
},
|
||||
"source": [
|
||||
"# Apify Actor\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"\n",
|
||||
">[Apify Actors](https://docs.apify.com/platform/actors) are cloud programs designed for a wide range of web scraping, crawling, and data extraction tasks. These actors facilitate automated data gathering from the web, enabling users to extract, process, and store information efficiently. Actors can be used to perform tasks like scraping e-commerce sites for product details, monitoring price changes, or gathering search engine results. They integrate seamlessly with [Apify Datasets](https://docs.apify.com/platform/storage/dataset), allowing the structured data collected by actors to be stored, managed, and exported in formats like JSON, CSV, or Excel for further analysis or use.\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "OHLF9t9v9HCb"
|
||||
},
|
||||
"source": [
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"This integration lives in the [langchain-apify](https://pypi.org/project/langchain-apify/) package. The package can be installed using pip.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "4DdGmBn5IbXz"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install langchain-apify"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "rEAwonXqwggR"
|
||||
},
|
||||
"source": [
|
||||
"### Prerequisites\n",
|
||||
"\n",
|
||||
"- **Apify account**: Register your free Apify account [here](https://console.apify.com/sign-up).\n",
|
||||
"- **Apify API token**: Learn how to get your API token in the [Apify documentation](https://docs.apify.com/platform/integrations/api)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "9nJOl4MBMkcR"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"APIFY_API_TOKEN\"] = \"your-apify-api-token\"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "UfoQxAlCxR9q"
|
||||
},
|
||||
"source": [
|
||||
"## Instantiation"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "qG9KtXtLM8i7"
|
||||
},
|
||||
"source": [
|
||||
"Here we instantiate the `ApifyActorsTool` to be able to call [RAG Web Browser](https://apify.com/apify/rag-web-browser) Apify Actor. This Actor provides web browsing functionality for AI and LLM applications, similar to the web browsing feature in ChatGPT. Any Actor from the [Apify Store](https://apify.com/store) can be used in this way."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 43,
|
||||
"metadata": {
|
||||
"id": "cyxeTlPnM4Ya"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_apify import ApifyActorsTool\n",
|
||||
"\n",
|
||||
"tool = ApifyActorsTool(\"apify/rag-web-browser\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "fGDLvDCqyKWO"
|
||||
},
|
||||
"source": [
|
||||
"## Invocation\n",
|
||||
"\n",
|
||||
"The `ApifyActorsTool` takes a single argument, which is `run_input` - a dictionary that is passed as a run input to the Actor. Run input schema documentation can be found in the input section of the Actor details page. See [RAG Web Browser input schema](https://apify.com/apify/rag-web-browser/input-schema).\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "nTWy6Hx1yk04"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.invoke({\"run_input\": {\"query\": \"what is apify?\", \"maxResults\": 2}})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "kQsa27hoO58S"
|
||||
},
|
||||
"source": [
|
||||
"## Chaining\n",
|
||||
"\n",
|
||||
"We can provide the created tool to an [agent](https://python.langchain.com/docs/tutorials/agents/). When asked to search for information, the agent will call the Apify Actor, which will search the web, and then retrieve the search results.\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "YySvLskW72Y8"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install langgraph langchain-openai"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 44,
|
||||
"metadata": {
|
||||
"id": "QEDz07btO5Gi"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.messages import ToolMessage\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"from langgraph.prebuilt import create_react_agent\n",
|
||||
"\n",
|
||||
"model = ChatOpenAI(model=\"gpt-4o\")\n",
|
||||
"tools = [tool]\n",
|
||||
"graph = create_react_agent(model, tools=tools)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 45,
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "XS1GEyNkQxGu",
|
||||
"outputId": "195273d7-034c-425b-f3f9-95c0a9fb0c9e"
|
||||
},
|
||||
"outputs": [
|
||||
"cells": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"================================\u001b[1m Human Message \u001b[0m=================================\n",
|
||||
"\n",
|
||||
"search for what is Apify\n",
|
||||
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
|
||||
"Tool Calls:\n",
|
||||
" apify_actor_apify_rag-web-browser (call_27mjHLzDzwa5ZaHWCMH510lm)\n",
|
||||
" Call ID: call_27mjHLzDzwa5ZaHWCMH510lm\n",
|
||||
" Args:\n",
|
||||
" run_input: {\"run_input\":{\"query\":\"Apify\",\"maxResults\":3,\"outputFormats\":[\"markdown\"]}}\n",
|
||||
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
|
||||
"\n",
|
||||
"Apify is a comprehensive platform for web scraping, browser automation, and data extraction. It offers a wide array of tools and services that cater to developers and businesses looking to extract data from websites efficiently and effectively. Here's an overview of Apify:\n",
|
||||
"\n",
|
||||
"1. **Ecosystem and Tools**:\n",
|
||||
" - Apify provides an ecosystem where developers can build, deploy, and publish data extraction and web automation tools called Actors.\n",
|
||||
" - The platform supports various use cases such as extracting data from social media platforms, conducting automated browser-based tasks, and more.\n",
|
||||
"\n",
|
||||
"2. **Offerings**:\n",
|
||||
" - Apify offers over 3,000 ready-made scraping tools and code templates.\n",
|
||||
" - Users can also build custom solutions or hire Apify's professional services for more tailored data extraction needs.\n",
|
||||
"\n",
|
||||
"3. **Technology and Integration**:\n",
|
||||
" - The platform supports integration with popular tools and services like Zapier, GitHub, Google Sheets, Pinecone, and more.\n",
|
||||
" - Apify supports open-source tools and technologies such as JavaScript, Python, Puppeteer, Playwright, Selenium, and its own Crawlee library for web crawling and browser automation.\n",
|
||||
"\n",
|
||||
"4. **Community and Learning**:\n",
|
||||
" - Apify hosts a community on Discord where developers can get help and share expertise.\n",
|
||||
" - It offers educational resources through the Web Scraping Academy to help users become proficient in data scraping and automation.\n",
|
||||
"\n",
|
||||
"5. **Enterprise Solutions**:\n",
|
||||
" - Apify provides enterprise-grade web data extraction solutions with high reliability, 99.95% uptime, and compliance with SOC2, GDPR, and CCPA standards.\n",
|
||||
"\n",
|
||||
"For more information, you can visit [Apify's official website](https://apify.com/) or their [GitHub page](https://github.com/apify) which contains their code repositories and further details about their projects.\n"
|
||||
]
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "_9MNj58sIkGN"
|
||||
},
|
||||
"source": [
|
||||
"# Apify Actor\n",
|
||||
"\n",
|
||||
">[Apify Actors](https://docs.apify.com/platform/actors) are cloud programs designed for a wide range of web scraping, crawling, and data extraction tasks. These actors facilitate automated data gathering from the web, enabling users to extract, process, and store information efficiently. Actors can be used to perform tasks like scraping e-commerce sites for product details, monitoring price changes, or gathering search engine results. They integrate seamlessly with [Apify Datasets](https://docs.apify.com/platform/storage/dataset), allowing the structured data collected by actors to be stored, managed, and exported in formats like JSON, CSV, or Excel for further analysis or use.\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"\n",
|
||||
"This notebook walks you through using [Apify Actors](https://docs.apify.com/platform/actors) with LangChain to automate web scraping and data extraction. The `langchain-apify` package integrates Apify's cloud-based tools with LangChain agents, enabling efficient data collection and processing for AI applications.\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "OHLF9t9v9HCb"
|
||||
},
|
||||
"source": [
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"This integration lives in the [langchain-apify](https://pypi.org/project/langchain-apify/) package. The package can be installed using pip.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "4DdGmBn5IbXz"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install langchain-apify"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "rEAwonXqwggR"
|
||||
},
|
||||
"source": [
|
||||
"### Prerequisites\n",
|
||||
"\n",
|
||||
"- **Apify account**: Register your free Apify account [here](https://console.apify.com/sign-up).\n",
|
||||
"- **Apify API token**: Learn how to get your API token in the [Apify documentation](https://docs.apify.com/platform/integrations/api)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "9nJOl4MBMkcR"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"APIFY_API_TOKEN\"] = \"your-apify-api-token\"\n",
|
||||
"os.environ[\"OPENAI_API_KEY\"] = \"your-openai-api-key\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "UfoQxAlCxR9q"
|
||||
},
|
||||
"source": [
|
||||
"## Instantiation"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "qG9KtXtLM8i7"
|
||||
},
|
||||
"source": [
|
||||
"Here we instantiate the `ApifyActorsTool` to be able to call [RAG Web Browser](https://apify.com/apify/rag-web-browser) Apify Actor. This Actor provides web browsing functionality for AI and LLM applications, similar to the web browsing feature in ChatGPT. Any Actor from the [Apify Store](https://apify.com/store) can be used in this way."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "cyxeTlPnM4Ya"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_apify import ApifyActorsTool\n",
|
||||
"\n",
|
||||
"tool = ApifyActorsTool(\"apify/rag-web-browser\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "fGDLvDCqyKWO"
|
||||
},
|
||||
"source": [
|
||||
"## Invocation\n",
|
||||
"\n",
|
||||
"The `ApifyActorsTool` takes a single argument, which is `run_input` - a dictionary that is passed as a run input to the Actor. Run input schema documentation can be found in the input section of the Actor details page. See [RAG Web Browser input schema](https://apify.com/apify/rag-web-browser/input-schema).\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "nTWy6Hx1yk04"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.invoke({\"run_input\": {\"query\": \"what is apify?\", \"maxResults\": 2}})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "kQsa27hoO58S"
|
||||
},
|
||||
"source": [
|
||||
"## Chaining\n",
|
||||
"\n",
|
||||
"We can provide the created tool to an [agent](https://python.langchain.com/docs/tutorials/agents/). When asked to search for information, the agent will call the Apify Actor, which will search the web, and then retrieve the search results.\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "YySvLskW72Y8"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install langgraph langchain-openai"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "QEDz07btO5Gi"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.messages import ToolMessage\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"from langgraph.prebuilt import create_react_agent\n",
|
||||
"\n",
|
||||
"model = ChatOpenAI(model=\"gpt-4o\")\n",
|
||||
"tools = [tool]\n",
|
||||
"graph = create_react_agent(model, tools=tools)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"base_uri": "https://localhost:8080/"
|
||||
},
|
||||
"id": "XS1GEyNkQxGu",
|
||||
"outputId": "195273d7-034c-425b-f3f9-95c0a9fb0c9e"
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"================================\u001b[1m Human Message \u001b[0m=================================\n",
|
||||
"\n",
|
||||
"search for what is Apify\n",
|
||||
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
|
||||
"Tool Calls:\n",
|
||||
" apify_actor_apify_rag-web-browser (call_27mjHLzDzwa5ZaHWCMH510lm)\n",
|
||||
" Call ID: call_27mjHLzDzwa5ZaHWCMH510lm\n",
|
||||
" Args:\n",
|
||||
" run_input: {\"run_input\":{\"query\":\"Apify\",\"maxResults\":3,\"outputFormats\":[\"markdown\"]}}\n",
|
||||
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
|
||||
"\n",
|
||||
"Apify is a comprehensive platform for web scraping, browser automation, and data extraction. It offers a wide array of tools and services that cater to developers and businesses looking to extract data from websites efficiently and effectively. Here's an overview of Apify:\n",
|
||||
"\n",
|
||||
"1. **Ecosystem and Tools**:\n",
|
||||
" - Apify provides an ecosystem where developers can build, deploy, and publish data extraction and web automation tools called Actors.\n",
|
||||
" - The platform supports various use cases such as extracting data from social media platforms, conducting automated browser-based tasks, and more.\n",
|
||||
"\n",
|
||||
"2. **Offerings**:\n",
|
||||
" - Apify offers over 3,000 ready-made scraping tools and code templates.\n",
|
||||
" - Users can also build custom solutions or hire Apify's professional services for more tailored data extraction needs.\n",
|
||||
"\n",
|
||||
"3. **Technology and Integration**:\n",
|
||||
" - The platform supports integration with popular tools and services like Zapier, GitHub, Google Sheets, Pinecone, and more.\n",
|
||||
" - Apify supports open-source tools and technologies such as JavaScript, Python, Puppeteer, Playwright, Selenium, and its own Crawlee library for web crawling and browser automation.\n",
|
||||
"\n",
|
||||
"4. **Community and Learning**:\n",
|
||||
" - Apify hosts a community on Discord where developers can get help and share expertise.\n",
|
||||
" - It offers educational resources through the Web Scraping Academy to help users become proficient in data scraping and automation.\n",
|
||||
"\n",
|
||||
"5. **Enterprise Solutions**:\n",
|
||||
" - Apify provides enterprise-grade web data extraction solutions with high reliability, 99.95% uptime, and compliance with SOC2, GDPR, and CCPA standards.\n",
|
||||
"\n",
|
||||
"For more information, you can visit [Apify's official website](https://apify.com/) or their [GitHub page](https://github.com/apify) which contains their code repositories and further details about their projects.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"inputs = {\"messages\": [(\"user\", \"search for what is Apify\")]}\n",
|
||||
"for s in graph.stream(inputs, stream_mode=\"values\"):\n",
|
||||
" message = s[\"messages\"][-1]\n",
|
||||
" # skip tool messages\n",
|
||||
" if isinstance(message, ToolMessage):\n",
|
||||
" continue\n",
|
||||
" message.pretty_print()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "WYXuQIQx8AvG"
|
||||
},
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"For more information on how to use this integration, see the [git repository](https://github.com/apify/langchain-apify) or the [Apify integration documentation](https://docs.apify.com/platform/integrations/langgraph)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "f1NnMik78oib"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"provenance": [],
|
||||
"toc_visible": true
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"inputs = {\"messages\": [(\"user\", \"search for what is Apify\")]}\n",
|
||||
"for s in graph.stream(inputs, stream_mode=\"values\"):\n",
|
||||
" message = s[\"messages\"][-1]\n",
|
||||
" # skip tool messages\n",
|
||||
" if isinstance(message, ToolMessage):\n",
|
||||
" continue\n",
|
||||
" message.pretty_print()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"id": "WYXuQIQx8AvG"
|
||||
},
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"For more information on how to use this integration, see the [git repository](https://github.com/apify/langchain-apify) or the [Apify integration documentation](https://docs.apify.com/platform/integrations/langgraph)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"id": "f1NnMik78oib"
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"colab": {
|
||||
"provenance": [],
|
||||
"toc_visible": true
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
Loading…
Reference in New Issue
Block a user