diff --git a/docs/docs/integrations/providers/oxylabs.mdx b/docs/docs/integrations/providers/oxylabs.mdx new file mode 100644 index 00000000000..30b2093336b --- /dev/null +++ b/docs/docs/integrations/providers/oxylabs.mdx @@ -0,0 +1,18 @@ +# Oxylabs + +[Oxylabs](https://oxylabs.io/) is a market-leading web intelligence collection platform, driven by the highest business, +ethics, and compliance standards, enabling companies worldwide to unlock data-driven insights. + +[langchain-oxylabs](https://pypi.org/project/langchain-oxylabs/) implements +tools enabling LLMs to interact with Oxylabs Web Scraper API. + + +## Installation and Setup + +```bash +pip install langchain-oxylabs +``` + +## Tools + +See details on available tools [here](/docs/integrations/tools/oxylabs/). diff --git a/docs/docs/integrations/tools/oxylabs.ipynb b/docs/docs/integrations/tools/oxylabs.ipynb new file mode 100644 index 00000000000..2c459682dcc --- /dev/null +++ b/docs/docs/integrations/tools/oxylabs.ipynb @@ -0,0 +1,298 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "65142ccb18a6add5", + "metadata": {}, + "source": "# Oxylabs" + }, + { + "cell_type": "markdown", + "id": "19976df355f9c461", + "metadata": {}, + "source": [ + ">[Oxylabs](https://oxylabs.io/) is a market-leading web intelligence collection platform, driven by the highest business, ethics, and compliance standards, enabling companies worldwide to unlock data-driven insights.\n", + "\n", + "## Overview\n", + "\n", + "This package contains the LangChain integration with Oxylabs, providing tools to scrape Google search results with Oxylabs Web Scraper API using LangChain's framework.\n", + "\n", + "The following classes are provided by this package:\n", + "- `OxylabsSearchRun` - A tool that returns scraped Google search results in a formatted text\n", + "- `OxylabsSearchResults` - A tool that returns scraped Google search results in a JSON format\n", + "- `OxylabsSearchAPIWrapper` - An API wrapper for initializing Oxylabs API" + ] + }, + { + "cell_type": "markdown", + "id": "76e6d8030199710d", + "metadata": {}, + "source": [ + "| Pricing |\n", + "|:-------------------------------:|\n", + "| ✅ Free 5,000 results for 1 week |" + ] + }, + { + "cell_type": "markdown", + "id": "eb59c5f3051be9d4", + "metadata": {}, + "source": "## Setup" + }, + { + "cell_type": "markdown", + "id": "45951d881c460419", + "metadata": {}, + "source": "Install the required dependencies." + }, + { + "cell_type": "code", + "id": "576e08c41f16ceda", + "metadata": {}, + "source": [ + "%pip install -qU langchain-oxylabs" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "ce19bb3a52a3ab60", + "metadata": {}, + "source": "### Credentials" + }, + { + "cell_type": "markdown", + "id": "b8330dfd5861482e", + "metadata": {}, + "source": "Set up the proper API keys and environment variables. Create your API user credentials: Sign up for a free trial or purchase the product in the [Oxylabs dashboard](https://dashboard.oxylabs.io/en/registration) to create your API user credentials (OXYLABS_USERNAME and OXYLABS_PASSWORD)." + }, + { + "cell_type": "code", + "id": "474b6eaf6e35efda", + "metadata": {}, + "source": [ + "import getpass\n", + "import os\n", + "\n", + "os.environ[\"OXYLABS_USERNAME\"] = getpass.getpass(\"Enter your Oxylabs username: \")\n", + "os.environ[\"OXYLABS_PASSWORD\"] = getpass.getpass(\"Enter your Oxylabs password: \")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "ae310f86b113bd78", + "metadata": {}, + "source": "## Instantiation" + }, + { + "cell_type": "code", + "id": "69ea8139f6152b48", + "metadata": {}, + "source": [ + "from langchain_oxylabs import OxylabsSearchAPIWrapper, OxylabsSearchRun\n", + "\n", + "oxylabs_wrapper = OxylabsSearchAPIWrapper()\n", + "tool_ = OxylabsSearchRun(wrapper=oxylabs_wrapper)" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "636efff6267b9bc1", + "metadata": {}, + "source": "## Invocation" + }, + { + "cell_type": "markdown", + "id": "272f5fdaed5fa000", + "metadata": {}, + "source": "### Invoke directly with args" + }, + { + "cell_type": "markdown", + "id": "3c26724911f6cfdf", + "metadata": {}, + "source": "The `OxylabsSearchRun` tool takes a single \"query\" argument, which should be a natural language query and returns combined string format result:" + }, + { + "cell_type": "code", + "id": "b5f4d423cb3abf", + "metadata": {}, + "source": [ + "tool_.invoke({\"query\": \"Restaurants in Paris.\"})" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "784aad18bba31062", + "metadata": {}, + "source": "### Invoke with ToolCall" + }, + { + "cell_type": "code", + "id": "8afd1652e2c4e7c9", + "metadata": {}, + "source": [ + "tool_ = OxylabsSearchRun(\n", + " wrapper=oxylabs_wrapper,\n", + " kwargs={\n", + " \"result_categories\": [\n", + " \"local_information\",\n", + " \"combined_search_result\",\n", + " ]\n", + " },\n", + ")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "1b903fb2f8272df2", + "metadata": {}, + "source": [ + "from pprint import pprint\n", + "\n", + "model_generated_tool_call = {\n", + " \"args\": {\n", + " \"query\": \"Visit restaurants in Vilnius.\",\n", + " \"geo_location\": \"Vilnius,Lithuania\",\n", + " },\n", + " \"id\": \"1\",\n", + " \"name\": \"oxylabs_search\",\n", + " \"type\": \"tool_call\",\n", + "}\n", + "tool_call_result = tool_.invoke(model_generated_tool_call)\n", + "\n", + "# The content is a JSON string of results\n", + "pprint(tool_call_result.content)" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "a436fc39ce17b6da", + "metadata": {}, + "source": [ + "## Use within an agent\n", + "Install the required dependencies." + ] + }, + { + "cell_type": "code", + "id": "a6f23e65092e8d27", + "metadata": {}, + "source": "%pip install -qU \"langchain[openai]\" langgraph", + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "21078a0c265759ff", + "metadata": {}, + "source": [ + "import getpass\n", + "import os\n", + "\n", + "from langchain.chat_models import init_chat_model\n", + "\n", + "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter API key for OpenAI: \")\n", + "llm = init_chat_model(\"gpt-4o-mini\", model_provider=\"openai\")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "code", + "id": "396548ee2f08fb30", + "metadata": {}, + "source": [ + "from langgraph.prebuilt import create_react_agent\n", + "\n", + "# Initialize OxylabsSearchRun tool\n", + "tool_ = OxylabsSearchRun(wrapper=oxylabs_wrapper)\n", + "\n", + "agent = create_react_agent(llm, [tool_])\n", + "\n", + "user_input = \"What happened in the latest Burning Man floods?\"\n", + "\n", + "for step in agent.stream(\n", + " {\"messages\": user_input},\n", + " stream_mode=\"values\",\n", + "):\n", + " step[\"messages\"][-1].pretty_print()" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "b10bbcf5ce299fd5", + "metadata": {}, + "source": [ + "## JSON results\n", + "`OxylabsSearchResults` tool can be used as an alternative to `OxylabsSearchRun` to retrieve results in a JSON format:" + ] + }, + { + "cell_type": "code", + "id": "ef0d856f571c4938", + "metadata": {}, + "source": [ + "import json\n", + "\n", + "from langchain_oxylabs import OxylabsSearchResults\n", + "\n", + "tool_ = OxylabsSearchResults(wrapper=oxylabs_wrapper)\n", + "\n", + "response_results = tool_.invoke({\"query\": \"What are the most famous artists?\"})\n", + "response_results = json.loads(response_results)\n", + "\n", + "for result in response_results:\n", + " for key, value in result.items():\n", + " print(f\"{key}: {value}\")" + ], + "outputs": [], + "execution_count": null + }, + { + "cell_type": "markdown", + "id": "7f16702d224dabb2", + "metadata": {}, + "source": [ + "## API reference\n", + "More information about this integration package can be found here: https://github.com/oxylabs/langchain-oxylabs\n", + "\n", + "Oxylabs Web Scraper API documentation: https://developers.oxylabs.io/scraper-apis/web-scraper-api\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 2 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython2", + "version": "2.7.6" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/libs/packages.yml b/libs/packages.yml index 23f86fd6b8d..b77921a3de5 100644 --- a/libs/packages.yml +++ b/libs/packages.yml @@ -564,4 +564,7 @@ packages: repo: memgraph/langchain-memgraph - name: langchain-vectara path: libs/vectara - repo: vectara/langchain-vectara \ No newline at end of file + repo: vectara/langchain-vectara +- name: langchain-oxylabs + path: . + repo: oxylabs/langchain-oxylabs \ No newline at end of file