docs: Add Oxylabs integration (#30591)

Description:
This PR adds documentation for the langchain-oxylabs integration
package.

The documentation includes instructions for configuring Oxylabs
credentials and provides example code demonstrating how to use the
package.

Issue:
N/A

Dependencies:
No new dependencies are required.

Tests and Docs:

Added an example notebook demonstrating the usage of the
langchain-oxylabs package, located in docs/docs/integrations.
Added a provider page in docs/docs/providers.
Added a new package to libs/packages.yml.

Lint and Test:

Successfully ran make format, make lint, and make test.
oxy-tg 2025-04-02 17:40:32 +03:00 committed by GitHub
parent 816492e1d3
commit 38807871ec
3 changed files with 320 additions and 1 deletions

@ -0,0 +1,18 @@
# Oxylabs
[Oxylabs](https://oxylabs.io/) is a market-leading web intelligence collection platform, driven by the highest business,
ethics, and compliance standards, enabling companies worldwide to unlock data-driven insights.
[langchain-oxylabs](https://pypi.org/project/langchain-oxylabs/) implements
tools enabling LLMs to interact with the Oxylabs Web Scraper API.
## Installation and Setup
```bash
pip install langchain-oxylabs
```
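The integration notebook sets the credentials as the `OXYLABS_USERNAME` and `OXYLABS_PASSWORD` environment variables before creating the tools. The sketch below checks that both are present. The helper `missing_oxylabs_credentials` is a hypothetical name used for illustration and is not part of `langchain-oxylabs`; it also assumes the credentials are supplied through the environment, as in the notebook.

```python
import os

# Variable names taken from the integration notebook's credentials step.
REQUIRED_VARS = ("OXYLABS_USERNAME", "OXYLABS_PASSWORD")


def missing_oxylabs_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of langchain-oxylabs.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_oxylabs_credentials()
    if missing:
        print("Set these before creating the tools: " + ", ".join(missing))
    else:
        print("Oxylabs credentials found in the environment.")
```

Running a check like this before instantiating the tools can surface a missing variable early, rather than at the first search request.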
## Tools
See details on available tools [here](/docs/integrations/tools/oxylabs/).

@ -0,0 +1,298 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "65142ccb18a6add5",
"metadata": {},
"source": "# Oxylabs"
},
{
"cell_type": "markdown",
"id": "19976df355f9c461",
"metadata": {},
"source": [
">[Oxylabs](https://oxylabs.io/) is a market-leading web intelligence collection platform, driven by the highest business, ethics, and compliance standards, enabling companies worldwide to unlock data-driven insights.\n",
"\n",
"## Overview\n",
"\n",
"This package contains the LangChain integration with Oxylabs, providing tools to scrape Google search results with Oxylabs Web Scraper API using LangChain's framework.\n",
"\n",
"The following classes are provided by this package:\n",
"- `OxylabsSearchRun` - A tool that returns scraped Google search results in a formatted text\n",
"- `OxylabsSearchResults` - A tool that returns scraped Google search results in a JSON format\n",
"- `OxylabsSearchAPIWrapper` - An API wrapper for initializing Oxylabs API"
]
},
{
"cell_type": "markdown",
"id": "76e6d8030199710d",
"metadata": {},
"source": [
"| Pricing |\n",
"|:-------------------------------:|\n",
"| ✅ Free 5,000 results for 1 week |"
]
},
{
"cell_type": "markdown",
"id": "eb59c5f3051be9d4",
"metadata": {},
"source": "## Setup"
},
{
"cell_type": "markdown",
"id": "45951d881c460419",
"metadata": {},
"source": "Install the required dependencies."
},
{
"cell_type": "code",
"id": "576e08c41f16ceda",
"metadata": {},
"source": [
"%pip install -qU langchain-oxylabs"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "ce19bb3a52a3ab60",
"metadata": {},
"source": "### Credentials"
},
{
"cell_type": "markdown",
"id": "b8330dfd5861482e",
"metadata": {},
"source": "Set up the proper API keys and environment variables. Create your API user credentials: Sign up for a free trial or purchase the product in the [Oxylabs dashboard](https://dashboard.oxylabs.io/en/registration) to create your API user credentials (OXYLABS_USERNAME and OXYLABS_PASSWORD)."
},
{
"cell_type": "code",
"id": "474b6eaf6e35efda",
"metadata": {},
"source": [
"import getpass\n",
"import os\n",
"\n",
"os.environ[\"OXYLABS_USERNAME\"] = getpass.getpass(\"Enter your Oxylabs username: \")\n",
"os.environ[\"OXYLABS_PASSWORD\"] = getpass.getpass(\"Enter your Oxylabs password: \")"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "ae310f86b113bd78",
"metadata": {},
"source": "## Instantiation"
},
{
"cell_type": "code",
"id": "69ea8139f6152b48",
"metadata": {},
"source": [
"from langchain_oxylabs import OxylabsSearchAPIWrapper, OxylabsSearchRun\n",
"\n",
"oxylabs_wrapper = OxylabsSearchAPIWrapper()\n",
"tool_ = OxylabsSearchRun(wrapper=oxylabs_wrapper)"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "636efff6267b9bc1",
"metadata": {},
"source": "## Invocation"
},
{
"cell_type": "markdown",
"id": "272f5fdaed5fa000",
"metadata": {},
"source": "### Invoke directly with args"
},
{
"cell_type": "markdown",
"id": "3c26724911f6cfdf",
"metadata": {},
"source": "The `OxylabsSearchRun` tool takes a single \"query\" argument, which should be a natural language query and returns combined string format result:"
},
{
"cell_type": "code",
"id": "b5f4d423cb3abf",
"metadata": {},
"source": [
"tool_.invoke({\"query\": \"Restaurants in Paris.\"})"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "784aad18bba31062",
"metadata": {},
"source": "### Invoke with ToolCall"
},
{
"cell_type": "code",
"id": "8afd1652e2c4e7c9",
"metadata": {},
"source": [
"tool_ = OxylabsSearchRun(\n",
" wrapper=oxylabs_wrapper,\n",
" kwargs={\n",
" \"result_categories\": [\n",
" \"local_information\",\n",
" \"combined_search_result\",\n",
" ]\n",
" },\n",
")"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"id": "1b903fb2f8272df2",
"metadata": {},
"source": [
"from pprint import pprint\n",
"\n",
"model_generated_tool_call = {\n",
" \"args\": {\n",
" \"query\": \"Visit restaurants in Vilnius.\",\n",
" \"geo_location\": \"Vilnius,Lithuania\",\n",
" },\n",
" \"id\": \"1\",\n",
" \"name\": \"oxylabs_search\",\n",
" \"type\": \"tool_call\",\n",
"}\n",
"tool_call_result = tool_.invoke(model_generated_tool_call)\n",
"\n",
"# The content is a JSON string of results\n",
"pprint(tool_call_result.content)"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "a436fc39ce17b6da",
"metadata": {},
"source": [
"## Use within an agent\n",
"Install the required dependencies."
]
},
{
"cell_type": "code",
"id": "a6f23e65092e8d27",
"metadata": {},
"source": "%pip install -qU \"langchain[openai]\" langgraph",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"id": "21078a0c265759ff",
"metadata": {},
"source": [
"import getpass\n",
"import os\n",
"\n",
"from langchain.chat_models import init_chat_model\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter API key for OpenAI: \")\n",
"llm = init_chat_model(\"gpt-4o-mini\", model_provider=\"openai\")"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"id": "396548ee2f08fb30",
"metadata": {},
"source": [
"from langgraph.prebuilt import create_react_agent\n",
"\n",
"# Initialize OxylabsSearchRun tool\n",
"tool_ = OxylabsSearchRun(wrapper=oxylabs_wrapper)\n",
"\n",
"agent = create_react_agent(llm, [tool_])\n",
"\n",
"user_input = \"What happened in the latest Burning Man floods?\"\n",
"\n",
"for step in agent.stream(\n",
" {\"messages\": user_input},\n",
" stream_mode=\"values\",\n",
"):\n",
" step[\"messages\"][-1].pretty_print()"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "b10bbcf5ce299fd5",
"metadata": {},
"source": [
"## JSON results\n",
"`OxylabsSearchResults` tool can be used as an alternative to `OxylabsSearchRun` to retrieve results in a JSON format:"
]
},
{
"cell_type": "code",
"id": "ef0d856f571c4938",
"metadata": {},
"source": [
"import json\n",
"\n",
"from langchain_oxylabs import OxylabsSearchResults\n",
"\n",
"tool_ = OxylabsSearchResults(wrapper=oxylabs_wrapper)\n",
"\n",
"response_results = tool_.invoke({\"query\": \"What are the most famous artists?\"})\n",
"response_results = json.loads(response_results)\n",
"\n",
"for result in response_results:\n",
" for key, value in result.items():\n",
" print(f\"{key}: {value}\")"
],
"outputs": [],
"execution_count": null
},
{
"cell_type": "markdown",
"id": "7f16702d224dabb2",
"metadata": {},
"source": [
"## API reference\n",
"More information about this integration package can be found here: https://github.com/oxylabs/langchain-oxylabs\n",
"\n",
"Oxylabs Web Scraper API documentation: https://developers.oxylabs.io/scraper-apis/web-scraper-api\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

@ -564,4 +564,7 @@ packages:
repo: memgraph/langchain-memgraph
- name: langchain-vectara
path: libs/vectara
repo: vectara/langchain-vectara
repo: vectara/langchain-vectara
- name: langchain-oxylabs
path: .
repo: oxylabs/langchain-oxylabs