Mirror of https://github.com/hwchase17/langchain.git, synced 2025-08-14 23:26:34 +00:00
SearchApi integration (#11023)
Based on customers' requests for a native LangChain integration, SearchApi is ready to invest in the AI and LLM space, especially in open-source development.

- This is our initial PR; we plan to improve it later based on feedback from customers and LangChain users. The most likely changes will affect how the final results string is built.
- We are creating similar native integrations in Python and JavaScript.
- The next plan is to integrate into Java, Ruby, Go, and others.
- Feel free to assign @SebastjanPrachovskij as the main reviewer for anything SearchApi-related. We will be glad to help and support LangChain development.
This commit is contained in:
parent
8cd18a48e4
commit
a4e0cf6300
80 docs/extras/integrations/providers/searchapi.mdx Normal file
@@ -0,0 +1,80 @@
# SearchApi

This page covers how to use the [SearchApi](https://www.searchapi.io/) Google Search API within LangChain. SearchApi is a real-time SERP API for easy SERP scraping.

## Setup

- Go to [https://www.searchapi.io/](https://www.searchapi.io/) to sign up for a free account
- Get the api key and set it as an environment variable (`SEARCHAPI_API_KEY`)

## Wrappers

### Utility

There is a SearchApiAPIWrapper utility which wraps this API. To import this utility:

```python
from langchain.utilities import SearchApiAPIWrapper
```

You can use it as part of a Self Ask chain:

```python
from langchain.utilities import SearchApiAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType

import os

os.environ["SEARCHAPI_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""

llm = OpenAI(temperature=0)
search = SearchApiAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run,
        description="useful for when you need to ask with search"
    )
]

self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)
self_ask_with_search.run("Who lived longer: Plato, Socrates, or Aristotle?")
```

#### Output

```
> Entering new AgentExecutor chain...
 Yes.
Follow up: How old was Plato when he died?
Intermediate answer: eighty
Follow up: How old was Socrates when he died?
Intermediate answer: | Socrates |
| -------- |
| Born | c. 470 BC Deme Alopece, Athens |
| Died | 399 BC (aged approximately 71) Athens |
| Cause of death | Execution by forced suicide by poisoning |
| Spouse(s) | Xanthippe, Myrto |

Follow up: How old was Aristotle when he died?
Intermediate answer: 62 years
So the final answer is: Plato

> Finished chain.
'Plato'
```

### Tool

You can also easily load this wrapper as a Tool (to use with an Agent).
You can do this with:

```python
from langchain.agents import load_tools
tools = load_tools(["searchapi"])
```

For more information on tools, see [this page](/docs/modules/agents/tools/).
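In addition to `run`, the wrapper added in this PR exposes a `results` method that returns the raw SearchApi response as a dict (see `libs/langchain/langchain/utilities/searchapi.py` further down in this diff). A minimal sketch, assuming a valid `SEARCHAPI_API_KEY` and a query whose response contains an `organic_results` list as in the notebook output below:

```python
import os

from langchain.utilities import SearchApiAPIWrapper

os.environ["SEARCHAPI_API_KEY"] = ""  # your SearchApi.io key

search = SearchApiAPIWrapper()

# `run` returns a single string: the answer box, knowledge graph description,
# or joined snippets, depending on what the response contains.
print(search.run("What is the capital of Lithuania?"))

# `results` returns the raw JSON response as a dict, which keeps fields such as
# titles and links that the string formatting drops.
raw = search.results("What is the capital of Lithuania?")
for item in raw.get("organic_results", [])[:3]:
    print(item.get("title"), "-", item.get("link"))
```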
@@ -107,6 +107,85 @@
"agent.run(\"What is the weather in Pomfret?\")"
]
},
{
"cell_type": "markdown",
"id": "8786bdc8",
"metadata": {},
"source": [
"## SearchApi\n",
"\n",
"Second, let's try SearchApi tool."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5fd5ca32",
"metadata": {},
"outputs": [],
"source": [
"tools = load_tools([\"searchapi\"], llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "547c9cf5-aa4d-48ed-b7a5-29ecc1491adf",
"metadata": {},
"outputs": [],
"source": [
"agent = initialize_agent(\n",
"    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "a7564c40-83ec-490b-ad36-385be5c20e58",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out the current weather in Pomfret.\n",
"Action: searchapi\n",
"Action Input: \"weather in Pomfret\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThu 14 | Day ... Some clouds this morning will give way to generally sunny skies for the afternoon. High 73F. Winds NW at 5 to 10 mph.\n",
"Hourly Weather-Pomfret, CT · 1 pm. 71°. 0%. Sunny. Feels Like71°. WindNW 9 mph · 2 pm. 72°. 0%. Sunny. Feels Like72°. WindNW 9 mph · 3 pm. 72°. 0%. Sunny. Feels ...\n",
"10 Day Weather-Pomfret, VT. As of 4:28 am EDT. Today. 68°/48°. 4%. Thu 14 | Day. 68°. 4%. WNW 10 mph. Some clouds this morning will give way to generally ...\n",
"Be prepared with the most accurate 10-day forecast for Pomfret, MD with highs, lows, chance of precipitation from The Weather Channel and Weather.com.\n",
"Current Weather. 10:00 PM. 65°F. RealFeel® 67°. Mostly cloudy. LOCAL HURRICANE TRACKER. Category2. Lee. Late Friday Night - Saturday Afternoon.\n",
"10 Day Weather-Pomfret, NY. As of 5:09 pm EDT. Tonight. --/55°. 10%. Wed 13 | Night. 55°. 10%. NW 11 mph. Some clouds. Low near 55F.\n",
"Pomfret CT. Overnight. Overnight: Patchy fog before 3am, then patchy fog after 4am. Otherwise, mostly. Patchy Fog. Low: 58 °F. Thursday.\n",
"Isolated showers. Mostly cloudy, with a high near 76. Calm wind. Chance of precipitation is 20%. Tonight. Mostly Cloudy. Mostly cloudy, with a ...\n",
"Partly sunny, with a high near 67. Breezy, with a north wind 18 to 22 mph, with gusts as high as 34 mph. Chance of precipitation is 30%. ... A chance of showers ...\n",
"Today's Weather - Pomfret, CT ... Patchy fog. Showers. Lows in the upper 50s. Northwest winds around 5 mph. Chance of rain near 100 percent. ... Sunny. Patchy fog ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: The current weather in Pomfret is mostly cloudy with a high near 67 and a chance of showers. Winds are from the north at 18 to 22 mph with gusts up to 34 mph.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The current weather in Pomfret is mostly cloudy with a high near 67 and a chance of showers. Winds are from the north at 18 to 22 mph with gusts up to 34 mph.'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the weather in Pomfret?\")"
]
},
{
"cell_type": "markdown",
"id": "0e39fc46",
620 docs/extras/integrations/tools/searchapi.ipynb Normal file
@@ -0,0 +1,620 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "7960ce8a-859a-41f4-a886-0d1502ed1105",
"metadata": {},
"source": [
"# SearchApi\n",
"\n",
"This notebook shows examples of how to use SearchApi to search the web. Go to [https://www.searchapi.io/](https://www.searchapi.io/) to sign up for a free account and get API key."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "70871a99-ffee-47d7-8e02-82eb99971f28",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"SEARCHAPI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "2e26a518-c41c-4d75-9a79-67602ca2ec43",
"metadata": {},
"outputs": [],
"source": [
"from langchain.utilities import SearchApiAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8c0977f3-c136-400a-8024-f4f00645b981",
"metadata": {},
"outputs": [],
"source": [
"search = SearchApiAPIWrapper()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f573767d-4144-4407-8149-5fdddab99c63",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Barack Hussein Obama II'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"Obama's first name?\")"
]
},
{
"cell_type": "markdown",
"id": "9f4f75ae-2e1e-42db-a991-3ac111029f56",
"metadata": {},
"source": [
"## Using as part of a Self Ask With Search Chain"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "17a9b1ad-6e84-4949-8ebd-8c52f6b296e3",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "cf8970a5-00e1-46bd-ba53-6a974eebbc10",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: How old was Plato when he died?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3meighty\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mFollow up: How old was Socrates when he died?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3m| Socrates | \n",
"| -------- | \n",
"| Born | c. 470 BC Deme Alopece, Athens | \n",
"| Died | 399 BC (aged approximately 71) Athens | \n",
"| Cause of death | Execution by forced suicide by poisoning | \n",
"| Spouse(s) | Xanthippe, Myrto | \n",
"\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mFollow up: How old was Aristotle when he died?\u001b[0m\n",
"Intermediate answer: \u001b[36;1m\u001b[1;3m62 years\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mSo the final answer is: Plato\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Plato'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.utilities import SearchApiAPIWrapper\n",
"from langchain.llms.openai import OpenAI\n",
"from langchain.agents import initialize_agent, Tool\n",
"from langchain.agents import AgentType\n",
"\n",
"llm = OpenAI(temperature=0)\n",
"search = SearchApiAPIWrapper()\n",
"tools = [\n",
"    Tool(\n",
"        name=\"Intermediate Answer\",\n",
"        func=search.run,\n",
"        description=\"useful for when you need to ask with search\"\n",
"    )\n",
"]\n",
"\n",
"self_ask_with_search = initialize_agent(tools, llm, agent=AgentType.SELF_ASK_WITH_SEARCH, verbose=True)\n",
"self_ask_with_search.run(\"Who lived longer: Plato, Socrates, or Aristotle?\")"
]
},
{
"cell_type": "markdown",
"id": "cc433d06-579b-45e5-a256-2bb30bbefb93",
"metadata": {},
"source": [
"## Custom parameters\n",
"\n",
"SearchApi wrapper can be customized to use different engines like [Google News](https://www.searchapi.io/docs/google-news), [Google Jobs](https://www.searchapi.io/docs/google-jobs), [Google Scholar](https://www.searchapi.io/docs/google-scholar), or others which can be found in [SearchApi](https://www.searchapi.io/docs/google) documentation. All parameters supported by SearchApi can be passed when executing the query. "
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6d0b4411-780a-4dcf-91b6-f3544e31e532",
"metadata": {},
"outputs": [],
"source": [
"search = SearchApiAPIWrapper(engine=\"google_jobs\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "34e79449-6b33-4b45-9306-7e3dab1b8599",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Azure AI Engineer Be an XpanderCandidatar-meCandidatar-meCandidatar-me\\n\\nShare:\\n\\nAzure AI Engineer\\n\\nA área Digital Xperience da Xpand IT é uma equipa tecnológica de rápido crescimento que se concentra em tecnologias Microsoft e Mobile. A sua principal missão é fornecer soluções de software de alta qualidade que atendam às necessidades do utilizador final, num mundo tecnológico continuamente exigente e em ritmo acelerado, proporcionando a melhor experiência em termos de personalização, performance'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"search.run(\"AI Engineer\", location=\"Portugal\", gl=\"pt\")[0:500]"
]
},
{
"cell_type": "markdown",
"id": "d414513d-f374-4af0-a129-e878d4311a1e",
"metadata": {},
"source": [
"## Getting results with metadata"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b16b7cd9-f0fe-4030-a36b-bbb52b19da18",
"metadata": {},
"outputs": [],
"source": [
"import pprint"
]
},
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "e8adb325-2ad0-4a39-9bc2-d220ec3a29be",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{'search_metadata': {'id': 'search_qVdXG2jzvrlqTzayeYoaOb8A',\n",
|
||||
" 'status': 'Success',\n",
|
||||
" 'created_at': '2023-09-25T15:22:30Z',\n",
|
||||
" 'request_time_taken': 3.21,\n",
|
||||
" 'parsing_time_taken': 0.03,\n",
|
||||
" 'total_time_taken': 3.24,\n",
|
||||
" 'request_url': 'https://scholar.google.com/scholar?q=Large+Language+Models&hl=en',\n",
|
||||
" 'html_url': 'https://www.searchapi.io/api/v1/searches/search_qVdXG2jzvrlqTzayeYoaOb8A.html',\n",
|
||||
" 'json_url': 'https://www.searchapi.io/api/v1/searches/search_qVdXG2jzvrlqTzayeYoaOb8A'},\n",
|
||||
" 'search_parameters': {'engine': 'google_scholar',\n",
|
||||
" 'q': 'Large Language Models',\n",
|
||||
" 'hl': 'en'},\n",
|
||||
" 'search_information': {'query_displayed': 'Large Language Models',\n",
|
||||
" 'total_results': 6420000,\n",
|
||||
" 'page': 1,\n",
|
||||
" 'time_taken_displayed': 0.06},\n",
|
||||
" 'organic_results': [{'position': 1,\n",
|
||||
" 'title': 'ChatGPT for good? On opportunities and '\n",
|
||||
" 'challenges of large language models for '\n",
|
||||
" 'education',\n",
|
||||
" 'data_cid': 'uthwmf2nU3EJ',\n",
|
||||
" 'link': 'https://www.sciencedirect.com/science/article/pii/S1041608023000195',\n",
|
||||
" 'publication': 'E Kasneci, K Seßler, S Küchemann, M '\n",
|
||||
" 'Bannert… - Learning and individual …, '\n",
|
||||
" '2023 - Elsevier',\n",
|
||||
" 'snippet': '… state of large language models and their '\n",
|
||||
" 'applications. We then highlight how these '\n",
|
||||
" 'models can be … With regard to challenges, '\n",
|
||||
" 'we argue that large language models in '\n",
|
||||
" 'education require …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '8166055256995715258',\n",
|
||||
" 'total': 410,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=8166055256995715258&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '8166055256995715258',\n",
|
||||
" 'total': 10,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=8166055256995715258&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:uthwmf2nU3EJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'edarxiv.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://edarxiv.org/5er8f/download?format=pdf'},\n",
|
||||
" 'authors': [{'name': 'E Kasneci',\n",
|
||||
" 'id': 'bZVkVvoAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=bZVkVvoAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'K Seßler',\n",
|
||||
" 'id': 'MbMBoN4AAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=MbMBoN4AAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'S Küchemann',\n",
|
||||
" 'id': 'g1jX5QUAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=g1jX5QUAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'M Bannert',\n",
|
||||
" 'id': 'TjfQ8QkAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=TjfQ8QkAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 2,\n",
|
||||
" 'title': 'Large language models in medicine',\n",
|
||||
" 'data_cid': 'Ph9AwHTmhzAJ',\n",
|
||||
" 'link': 'https://www.nature.com/articles/s41591-023-02448-8',\n",
|
||||
" 'publication': 'AJ Thirunavukarasu, DSJ Ting, K '\n",
|
||||
" 'Elangovan… - Nature medicine, 2023 - '\n",
|
||||
" 'nature.com',\n",
|
||||
" 'snippet': '… HuggingChat offers a free-to-access '\n",
|
||||
" 'chatbot with a similar interface to ChatGPT '\n",
|
||||
" 'but uses Large Language Model Meta AI '\n",
|
||||
" '(LLaMA) as its backend model 30 . Finally, '\n",
|
||||
" 'cheap imitations of …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '3497017024792502078',\n",
|
||||
" 'total': 25,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=3497017024792502078&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '3497017024792502078',\n",
|
||||
" 'total': 3,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=3497017024792502078&hl=en&as_sdt=0,33'}},\n",
|
||||
" 'authors': [{'name': 'AJ Thirunavukarasu',\n",
|
||||
" 'id': '3qb1AYwAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=3qb1AYwAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'DSJ Ting',\n",
|
||||
" 'id': 'KbrpC8cAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=KbrpC8cAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'K Elangovan',\n",
|
||||
" 'id': 'BE_lVTQAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=BE_lVTQAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 3,\n",
|
||||
" 'title': 'Extracting training data from large language '\n",
|
||||
" 'models',\n",
|
||||
" 'data_cid': 'mEYsWK6bWKoJ',\n",
|
||||
" 'link': 'https://www.usenix.org/conference/usenixsecurity21/presentation/carlini-extracting',\n",
|
||||
" 'publication': 'N Carlini, F Tramer, E Wallace, M '\n",
|
||||
" 'Jagielski… - 30th USENIX Security …, '\n",
|
||||
" '2021 - usenix.org',\n",
|
||||
" 'snippet': '… language model trained on scrapes of the '\n",
|
||||
" 'public Internet, and are able to extract '\n",
|
||||
" 'hundreds of verbatim text sequences from the '\n",
|
||||
" 'model’… models are more vulnerable than '\n",
|
||||
" 'smaller models. …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '12274731957504198296',\n",
|
||||
" 'total': 742,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=12274731957504198296&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '12274731957504198296',\n",
|
||||
" 'total': 8,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=12274731957504198296&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:mEYsWK6bWKoJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:mEYsWK6bWKoJ:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'usenix.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://www.usenix.org/system/files/sec21-carlini-extracting.pdf'},\n",
|
||||
" 'authors': [{'name': 'N Carlini',\n",
|
||||
" 'id': 'q4qDvAoAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=q4qDvAoAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'F Tramer',\n",
|
||||
" 'id': 'ijH0-a8AAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=ijH0-a8AAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'E Wallace',\n",
|
||||
" 'id': 'SgST3LkAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=SgST3LkAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'M Jagielski',\n",
|
||||
" 'id': '_8rw_GMAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=_8rw_GMAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 4,\n",
|
||||
" 'title': 'Emergent abilities of large language models',\n",
|
||||
" 'data_cid': 'hG0iVOrOguoJ',\n",
|
||||
" 'link': 'https://arxiv.org/abs/2206.07682',\n",
|
||||
" 'publication': 'J Wei, Y Tay, R Bommasani, C Raffel, B '\n",
|
||||
" 'Zoph… - arXiv preprint arXiv …, 2022 - '\n",
|
||||
" 'arxiv.org',\n",
|
||||
" 'snippet': 'Scaling up language models has been shown to '\n",
|
||||
" 'predictably improve performance and sample '\n",
|
||||
" 'efficiency on a wide range of downstream '\n",
|
||||
" 'tasks. This paper instead discusses an …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '16898296257676733828',\n",
|
||||
" 'total': 621,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=16898296257676733828&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '16898296257676733828',\n",
|
||||
" 'total': 12,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=16898296257676733828&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:hG0iVOrOguoJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:hG0iVOrOguoJ:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'arxiv.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://arxiv.org/pdf/2206.07682.pdf?trk=cndc-detail'},\n",
|
||||
" 'authors': [{'name': 'J Wei',\n",
|
||||
" 'id': 'wA5TK_0AAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=wA5TK_0AAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'Y Tay',\n",
|
||||
" 'id': 'VBclY_cAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=VBclY_cAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'R Bommasani',\n",
|
||||
" 'id': 'WMBXw1EAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=WMBXw1EAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'C Raffel',\n",
|
||||
" 'id': 'I66ZBYwAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=I66ZBYwAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'B Zoph',\n",
|
||||
" 'id': 'NL_7iTwAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=NL_7iTwAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 5,\n",
|
||||
" 'title': 'A survey on evaluation of large language '\n",
|
||||
" 'models',\n",
|
||||
" 'data_cid': 'ZYohnzOz-XgJ',\n",
|
||||
" 'link': 'https://arxiv.org/abs/2307.03109',\n",
|
||||
" 'publication': 'Y Chang, X Wang, J Wang, Y Wu, K Zhu… - '\n",
|
||||
" 'arXiv preprint arXiv …, 2023 - arxiv.org',\n",
|
||||
" 'snippet': '… 3.1 Natural Language Processing Tasks … '\n",
|
||||
" 'the development of language models, '\n",
|
||||
" 'particularly large language models, was to '\n",
|
||||
" 'enhance performance on natural language '\n",
|
||||
" 'processing tasks, …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '8717195588046785125',\n",
|
||||
" 'total': 31,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=8717195588046785125&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '8717195588046785125',\n",
|
||||
" 'total': 3,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=8717195588046785125&hl=en&as_sdt=0,33'},\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:ZYohnzOz-XgJ:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'arxiv.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://arxiv.org/pdf/2307.03109'},\n",
|
||||
" 'authors': [{'name': 'X Wang',\n",
|
||||
" 'id': 'Q7Ieos8AAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=Q7Ieos8AAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'J Wang',\n",
|
||||
" 'id': 'YomxTXQAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=YomxTXQAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'Y Wu',\n",
|
||||
" 'id': 'KVeRu2QAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=KVeRu2QAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'K Zhu',\n",
|
||||
" 'id': 'g75dFLYAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=g75dFLYAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 6,\n",
|
||||
" 'title': 'Evaluating large language models trained on '\n",
|
||||
" 'code',\n",
|
||||
" 'data_cid': '3tNvW3l5nU4J',\n",
|
||||
" 'link': 'https://arxiv.org/abs/2107.03374',\n",
|
||||
" 'publication': 'M Chen, J Tworek, H Jun, Q Yuan, HPO '\n",
|
||||
" 'Pinto… - arXiv preprint arXiv …, 2021 - '\n",
|
||||
" 'arxiv.org',\n",
|
||||
" 'snippet': '… We introduce Codex, a GPT language model '\n",
|
||||
" 'finetuned on publicly available code from '\n",
|
||||
" 'GitHub, and study its Python code-writing '\n",
|
||||
" 'capabilities. A distinct production version '\n",
|
||||
" 'of Codex …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '5664817468434011102',\n",
|
||||
" 'total': 941,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=5664817468434011102&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '5664817468434011102',\n",
|
||||
" 'total': 2,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=5664817468434011102&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:3tNvW3l5nU4J:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:3tNvW3l5nU4J:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'arxiv.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://arxiv.org/pdf/2107.03374.pdf?trk=public_post_comment-text'},\n",
|
||||
" 'authors': [{'name': 'M Chen',\n",
|
||||
" 'id': '5fU-QMwAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=5fU-QMwAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'J Tworek',\n",
|
||||
" 'id': 'ZPuESCQAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=ZPuESCQAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'Q Yuan',\n",
|
||||
" 'id': 'B059m2EAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=B059m2EAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 7,\n",
|
||||
" 'title': 'Large language models in machine translation',\n",
|
||||
" 'data_cid': 'sY5m_Y3-0Y4J',\n",
|
||||
" 'link': 'http://research.google/pubs/pub33278.pdf',\n",
|
||||
" 'publication': 'T Brants, AC Popat, P Xu, FJ Och, J Dean '\n",
|
||||
" '- 2007 - research.google',\n",
|
||||
" 'snippet': '… the benefits of largescale statistical '\n",
|
||||
" 'language modeling in ma… trillion tokens, '\n",
|
||||
" 'resulting in language models having up to '\n",
|
||||
" '300 … is inexpensive to train on large data '\n",
|
||||
" 'sets and approaches the …',\n",
|
||||
" 'type': 'PDF',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '10291286509313494705',\n",
|
||||
" 'total': 737,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=10291286509313494705&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '10291286509313494705',\n",
|
||||
" 'total': 31,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=10291286509313494705&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:sY5m_Y3-0Y4J:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:sY5m_Y3-0Y4J:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'research.google',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'http://research.google/pubs/pub33278.pdf'},\n",
|
||||
" 'authors': [{'name': 'FJ Och',\n",
|
||||
" 'id': 'ITGdg6oAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=ITGdg6oAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'J Dean',\n",
|
||||
" 'id': 'NMS69lQAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=NMS69lQAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 8,\n",
|
||||
" 'title': 'A watermark for large language models',\n",
|
||||
" 'data_cid': 'BlSyLHT4iiEJ',\n",
|
||||
" 'link': 'https://arxiv.org/abs/2301.10226',\n",
|
||||
" 'publication': 'J Kirchenbauer, J Geiping, Y Wen, J '\n",
|
||||
" 'Katz… - arXiv preprint arXiv …, 2023 - '\n",
|
||||
" 'arxiv.org',\n",
|
||||
" 'snippet': '… To derive this watermark, we examine what '\n",
|
||||
" 'happens in the language model just before it '\n",
|
||||
" 'produces a probability vector. The last '\n",
|
||||
" 'layer of the language model outputs a vector '\n",
|
||||
" 'of logits l(t). …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '2417017327887471622',\n",
|
||||
" 'total': 104,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=2417017327887471622&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '2417017327887471622',\n",
|
||||
" 'total': 4,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=2417017327887471622&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:BlSyLHT4iiEJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:BlSyLHT4iiEJ:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'arxiv.org',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://arxiv.org/pdf/2301.10226.pdf?curius=1419'},\n",
|
||||
" 'authors': [{'name': 'J Kirchenbauer',\n",
|
||||
" 'id': '48GJrbsAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=48GJrbsAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'J Geiping',\n",
|
||||
" 'id': '206vNCEAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=206vNCEAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'Y Wen',\n",
|
||||
" 'id': 'oUYfjg0AAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=oUYfjg0AAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'J Katz',\n",
|
||||
" 'id': 'yPw4WjoAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=yPw4WjoAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 9,\n",
|
||||
" 'title': 'ChatGPT and other large language models are '\n",
|
||||
" 'double-edged swords',\n",
|
||||
" 'data_cid': 'So0q8TRvxhYJ',\n",
|
||||
" 'link': 'https://pubs.rsna.org/doi/full/10.1148/radiol.230163',\n",
|
||||
" 'publication': 'Y Shen, L Heacock, J Elias, KD Hentel, B '\n",
|
||||
" 'Reig, G Shih… - Radiology, 2023 - '\n",
|
||||
" 'pubs.rsna.org',\n",
|
||||
" 'snippet': '… Large Language Models (LLMs) are deep '\n",
|
||||
" 'learning models trained to understand and '\n",
|
||||
" 'generate natural language. Recent studies '\n",
|
||||
" 'demonstrated that LLMs achieve great success '\n",
|
||||
" 'in a …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '1641121387398204746',\n",
|
||||
" 'total': 231,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=1641121387398204746&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '1641121387398204746',\n",
|
||||
" 'total': 3,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=1641121387398204746&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:So0q8TRvxhYJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'authors': [{'name': 'Y Shen',\n",
|
||||
" 'id': 'XaeN2zgAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=XaeN2zgAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'L Heacock',\n",
|
||||
" 'id': 'tYYM5IkAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=tYYM5IkAAAAJ&hl=en&oi=sra'}]},\n",
|
||||
" {'position': 10,\n",
|
||||
" 'title': 'Pythia: A suite for analyzing large language '\n",
|
||||
" 'models across training and scaling',\n",
|
||||
" 'data_cid': 'aaIDvsMAD8QJ',\n",
|
||||
" 'link': 'https://proceedings.mlr.press/v202/biderman23a.html',\n",
|
||||
" 'publication': 'S Biderman, H Schoelkopf… - '\n",
|
||||
" 'International …, 2023 - '\n",
|
||||
" 'proceedings.mlr.press',\n",
|
||||
" 'snippet': '… large language models, we prioritize '\n",
|
||||
" 'consistency in model … out the most '\n",
|
||||
" 'performance from each model. For example, we '\n",
|
||||
" '… models, as it is becoming widely used for '\n",
|
||||
" 'the largest models, …',\n",
|
||||
" 'inline_links': {'cited_by': {'cites_id': '14127511396791067241',\n",
|
||||
" 'total': 89,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cites=14127511396791067241&as_sdt=5,33&sciodt=0,33&hl=en'},\n",
|
||||
" 'versions': {'cluster_id': '14127511396791067241',\n",
|
||||
" 'total': 3,\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?cluster=14127511396791067241&hl=en&as_sdt=0,33'},\n",
|
||||
" 'related_articles_link': 'https://scholar.google.com/scholar?q=related:aaIDvsMAD8QJ:scholar.google.com/&scioq=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'cached_page_link': 'https://scholar.googleusercontent.com/scholar?q=cache:aaIDvsMAD8QJ:scholar.google.com/+Large+Language+Models&hl=en&as_sdt=0,33'},\n",
|
||||
" 'resource': {'name': 'mlr.press',\n",
|
||||
" 'format': 'PDF',\n",
|
||||
" 'link': 'https://proceedings.mlr.press/v202/biderman23a/biderman23a.pdf'},\n",
|
||||
" 'authors': [{'name': 'S Biderman',\n",
|
||||
" 'id': 'bO7H0DAAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=bO7H0DAAAAAJ&hl=en&oi=sra'},\n",
|
||||
" {'name': 'H Schoelkopf',\n",
|
||||
" 'id': 'XLahYIYAAAAJ',\n",
|
||||
" 'link': 'https://scholar.google.com/citations?user=XLahYIYAAAAJ&hl=en&oi=sra'}]}],\n",
|
||||
" 'related_searches': [{'query': 'large language models machine',\n",
|
||||
" 'highlighted': ['machine'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=1&q=large+language+models+machine&qst=ib'},\n",
|
||||
" {'query': 'large language models pruning',\n",
|
||||
" 'highlighted': ['pruning'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=2&q=large+language+models+pruning&qst=ib'},\n",
|
||||
" {'query': 'large language models multitask learners',\n",
|
||||
" 'highlighted': ['multitask learners'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=3&q=large+language+models+multitask+learners&qst=ib'},\n",
|
||||
" {'query': 'large language models speech recognition',\n",
|
||||
" 'highlighted': ['speech recognition'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=4&q=large+language+models+speech+recognition&qst=ib'},\n",
|
||||
" {'query': 'large language models machine translation',\n",
|
||||
" 'highlighted': ['machine translation'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=5&q=large+language+models+machine+translation&qst=ib'},\n",
|
||||
" {'query': 'emergent abilities of large language models',\n",
|
||||
" 'highlighted': ['emergent abilities of'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=6&q=emergent+abilities+of+large+language+models&qst=ir'},\n",
|
||||
" {'query': 'language models privacy risks',\n",
|
||||
" 'highlighted': ['privacy risks'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=7&q=language+models+privacy+risks&qst=ir'},\n",
|
||||
" {'query': 'language model fine tuning',\n",
|
||||
" 'highlighted': ['fine tuning'],\n",
|
||||
" 'link': 'https://scholar.google.com/scholar?hl=en&as_sdt=0,33&qsp=8&q=language+model+fine+tuning&qst=ir'}],\n",
|
||||
" 'pagination': {'current': 1,\n",
|
||||
" 'next': 'https://scholar.google.com/scholar?start=10&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" 'other_pages': {'2': 'https://scholar.google.com/scholar?start=10&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '3': 'https://scholar.google.com/scholar?start=20&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '4': 'https://scholar.google.com/scholar?start=30&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '5': 'https://scholar.google.com/scholar?start=40&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '6': 'https://scholar.google.com/scholar?start=50&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '7': 'https://scholar.google.com/scholar?start=60&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '8': 'https://scholar.google.com/scholar?start=70&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '9': 'https://scholar.google.com/scholar?start=80&q=Large+Language+Models&hl=en&as_sdt=0,33',\n",
|
||||
" '10': 'https://scholar.google.com/scholar?start=90&q=Large+Language+Models&hl=en&as_sdt=0,33'}}}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"search = SearchApiAPIWrapper(engine=\"google_scholar\")\n",
|
||||
"results = search.results(\"Large Language Models\")\n",
|
||||
"pprint.pp(results)"
|
||||
]
|
||||
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
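Since `results` returns a plain dict, the Google Scholar payload shown in the last cell can be post-processed directly instead of being pretty-printed. A short sketch, assuming the `google_scholar` engine returns `organic_results` entries shaped like the output above (title, link, and an `inline_links.cited_by.total` count):

```python
from langchain.utilities import SearchApiAPIWrapper

search = SearchApiAPIWrapper(engine="google_scholar")
results = search.results("Large Language Models")

# Pull out a compact summary of each paper rather than dumping the whole response.
for paper in results.get("organic_results", []):
    cited_by = paper.get("inline_links", {}).get("cited_by", {}).get("total", 0)
    print(f"{paper.get('position')}. {paper.get('title')} (cited by {cited_by})")
    print(f"   {paper.get('link')}")
```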
@@ -22,6 +22,7 @@ from langchain.tools.ddg_search.tool import DuckDuckGoSearchRun
from langchain.tools.google_search.tool import GoogleSearchResults, GoogleSearchRun
from langchain.tools.metaphor_search.tool import MetaphorSearchResults
from langchain.tools.google_serper.tool import GoogleSerperResults, GoogleSerperRun
from langchain.tools.searchapi.tool import SearchAPIResults, SearchAPIRun
from langchain.tools.graphql.tool import BaseGraphQLTool
from langchain.tools.human.tool import HumanInputRun
from langchain.tools.python.tool import PythonREPLTool
@@ -52,6 +53,7 @@ from langchain.utilities.google_serper import GoogleSerperAPIWrapper
from langchain.utilities.metaphor_search import MetaphorSearchAPIWrapper
from langchain.utilities.awslambda import LambdaWrapper
from langchain.utilities.graphql import GraphQLAPIWrapper
from langchain.utilities.searchapi import SearchApiAPIWrapper
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.utilities.serpapi import SerpAPIWrapper
from langchain.utilities.twilio import TwilioAPIWrapper
@@ -214,6 +216,14 @@ def _get_google_search_results_json(**kwargs: Any) -> BaseTool:
    return GoogleSearchResults(api_wrapper=GoogleSearchAPIWrapper(**kwargs))


def _get_searchapi(**kwargs: Any) -> BaseTool:
    return SearchAPIRun(api_wrapper=SearchApiAPIWrapper(**kwargs))


def _get_searchapi_results_json(**kwargs: Any) -> BaseTool:
    return SearchAPIResults(api_wrapper=SearchApiAPIWrapper(**kwargs))


def _get_serpapi(**kwargs: Any) -> BaseTool:
    return Tool(
        name="Search",
@@ -298,7 +308,6 @@ _EXTRA_LLM_TOOLS: Dict[
    "tmdb-api": (_get_tmdb_api, ["tmdb_bearer_token"]),
    "podcast-api": (_get_podcast_api, ["listen_api_key"]),
}

_EXTRA_OPTIONAL_TOOLS: Dict[str, Tuple[Callable[[KwArg(Any)], BaseTool], List[str]]] = {
    "wolfram-alpha": (_get_wolfram_alpha, ["wolfram_alpha_appid"]),
    "google-search": (_get_google_search, ["google_api_key", "google_cse_id"]),
@@ -318,6 +327,11 @@ _EXTRA_OPTIONAL_TOOLS: Dict[str, Tuple[Callable[[KwArg(Any)], BaseTool], List[st
        _get_google_serper_results_json,
        ["serper_api_key", "aiosession"],
    ),
    "searchapi": (_get_searchapi, ["searchapi_api_key", "aiosession"]),
    "searchapi-results-json": (
        _get_searchapi_results_json,
        ["searchapi_api_key", "aiosession"],
    ),
    "serpapi": (_get_serpapi, ["serpapi_api_key", "aiosession"]),
    "dalle-image-generator": (_get_dalle_image_generator, ["openai_api_key"]),
    "twilio": (_get_twilio, ["account_sid", "auth_token", "from_number"]),
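With these registrations, both tools can be constructed through `load_tools` like the other optional search integrations; only the keys listed in the tuples above (`searchapi_api_key`, `aiosession`) are forwarded to the wrapper, so engine selection still happens on the wrapper itself. A sketch, assuming `SEARCHAPI_API_KEY` is set in the environment:

```python
from langchain.agents import load_tools

# The string-returning tool and the JSON-returning tool registered above.
tools = load_tools(["searchapi", "searchapi-results-json"])
print([tool.name for tool in tools])  # ['searchapi', 'searchapi_results_json']

# The key can also be passed explicitly instead of via the environment variable.
tools = load_tools(["searchapi"], searchapi_api_key="your-searchapi-key")
```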
6 libs/langchain/langchain/tools/searchapi/__init__.py Normal file
@@ -0,0 +1,6 @@
from langchain.tools.searchapi.tool import SearchAPIResults, SearchAPIRun

"""SearchApi.io API Toolkit."""
"""Tool for the SearchApi.io Google SERP API."""

__all__ = ["SearchAPIResults", "SearchAPIRun"]
68 libs/langchain/langchain/tools/searchapi/tool.py Normal file
@@ -0,0 +1,68 @@
"""Tool for the SearchApi.io search API."""

from typing import Optional

from langchain.callbacks.manager import (
    AsyncCallbackManagerForToolRun,
    CallbackManagerForToolRun,
)
from langchain.pydantic_v1 import Field
from langchain.tools.base import BaseTool
from langchain.utilities.searchapi import SearchApiAPIWrapper


class SearchAPIRun(BaseTool):
    """Tool that queries the SearchApi.io search API."""

    name: str = "searchapi"
    description: str = (
        "Google search API provided by SearchApi.io."
        "This tool is handy when you need to answer questions about current events."
        "Input should be a search query."
    )
    api_wrapper: SearchApiAPIWrapper

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        return self.api_wrapper.run(query)

    async def _arun(
        self,
        query: str,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool asynchronously."""
        return await self.api_wrapper.arun(query)


class SearchAPIResults(BaseTool):
    """Tool that queries the SearchApi.io search API and returns JSON."""

    name: str = "searchapi_results_json"
    description: str = (
        "Google search API provided by SearchApi.io."
        "This tool is handy when you need to answer questions about current events."
        "The input should be a search query and the output is a JSON object "
        "with the query results."
    )
    api_wrapper: SearchApiAPIWrapper = Field(default_factory=SearchApiAPIWrapper)

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        return str(self.api_wrapper.results(query))

    async def _arun(
        self,
        query: str,
        run_manager: Optional[AsyncCallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool asynchronously."""
        return (await self.api_wrapper.aresults(query)).__str__()
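Both tool classes take the wrapper as a field, so a customized wrapper can be plugged in directly when assembling an agent by hand, for example with a different engine. A minimal sketch; `google_news` is one of the engines referenced in the notebook above and is assumed here:

```python
from langchain.tools.searchapi.tool import SearchAPIResults, SearchAPIRun
from langchain.utilities.searchapi import SearchApiAPIWrapper

# String-returning tool backed by the default "google" engine.
search_tool = SearchAPIRun(api_wrapper=SearchApiAPIWrapper())
print(search_tool.run("latest LangChain release"))

# JSON-returning tool backed by the Google News engine.
news_tool = SearchAPIResults(api_wrapper=SearchApiAPIWrapper(engine="google_news"))
print(news_tool.run("LangChain"))
```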
@@ -27,6 +27,7 @@ from langchain.utilities.pubmed import PubMedAPIWrapper
from langchain.utilities.python import PythonREPL
from langchain.utilities.requests import Requests, RequestsWrapper, TextRequestsWrapper
from langchain.utilities.scenexplain import SceneXplainAPIWrapper
from langchain.utilities.searchapi import SearchApiAPIWrapper
from langchain.utilities.searx_search import SearxSearchWrapper
from langchain.utilities.serpapi import SerpAPIWrapper
from langchain.utilities.spark_sql import SparkSQL
@@ -64,6 +65,7 @@ __all__ = [
    "RequestsWrapper",
    "SQLDatabase",
    "SceneXplainAPIWrapper",
    "SearchApiAPIWrapper",
    "SearxSearchWrapper",
    "SerpAPIWrapper",
    "SparkSQL",
139 libs/langchain/langchain/utilities/searchapi.py Normal file
@@ -0,0 +1,139 @@
from typing import Any, Dict, Optional

import aiohttp
import requests

from langchain.pydantic_v1 import BaseModel, root_validator
from langchain.utils import get_from_dict_or_env


class SearchApiAPIWrapper(BaseModel):
    """
    Wrapper around SearchApi API.

    To use, you should have the environment variable ``SEARCHAPI_API_KEY``
    set with your API key, or pass `searchapi_api_key`
    as a named parameter to the constructor.

    Example:
        .. code-block:: python

            from langchain.utilities import SearchApiAPIWrapper
            searchapi = SearchApiAPIWrapper()
    """

    # Use "google" engine by default.
    # Full list of supported ones can be found in https://www.searchapi.io docs
    engine: str = "google"
    searchapi_api_key: Optional[str] = None
    aiosession: Optional[aiohttp.ClientSession] = None

    class Config:
        """Configuration for this pydantic object."""

        arbitrary_types_allowed = True

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that API key exists in environment."""
        searchapi_api_key = get_from_dict_or_env(
            values, "searchapi_api_key", "SEARCHAPI_API_KEY"
        )
        values["searchapi_api_key"] = searchapi_api_key
        return values

    def run(self, query: str, **kwargs: Any) -> str:
        results = self.results(query, **kwargs)
        return self._result_as_string(results)

    async def arun(self, query: str, **kwargs: Any) -> str:
        results = await self.aresults(query, **kwargs)
        return self._result_as_string(results)

    def results(self, query: str, **kwargs: Any) -> dict:
        results = self._search_api_results(query, **kwargs)
        return results

    async def aresults(self, query: str, **kwargs: Any) -> dict:
        results = await self._async_search_api_results(query, **kwargs)
        return results

    def _prepare_request(self, query: str, **kwargs: Any) -> dict:
        return {
            "url": "https://www.searchapi.io/api/v1/search",
            "headers": {
                "Authorization": f"Bearer {self.searchapi_api_key}",
            },
            "params": {
                "engine": self.engine,
                "q": query,
                **{key: value for key, value in kwargs.items() if value is not None},
            },
        }

    def _search_api_results(self, query: str, **kwargs: Any) -> dict:
        request_details = self._prepare_request(query, **kwargs)
        response = requests.get(
            url=request_details["url"],
            params=request_details["params"],
            headers=request_details["headers"],
        )
        response.raise_for_status()
        return response.json()

    async def _async_search_api_results(self, query: str, **kwargs: Any) -> dict:
        """Use aiohttp to send request to SearchApi API and return results async."""
        request_details = self._prepare_request(query, **kwargs)
        if not self.aiosession:
            async with aiohttp.ClientSession() as session:
                async with session.get(
                    url=request_details["url"],
                    headers=request_details["headers"],
                    params=request_details["params"],
                    raise_for_status=True,
                ) as response:
                    results = await response.json()
        else:
            async with self.aiosession.get(
                url=request_details["url"],
                headers=request_details["headers"],
                params=request_details["params"],
                raise_for_status=True,
            ) as response:
                results = await response.json()
        return results

    @staticmethod
    def _result_as_string(result: dict) -> str:
        toret = "No good search result found"
        if "answer_box" in result.keys() and "answer" in result["answer_box"].keys():
            toret = result["answer_box"]["answer"]
        elif "answer_box" in result.keys() and "snippet" in result["answer_box"].keys():
            toret = result["answer_box"]["snippet"]
        elif "knowledge_graph" in result.keys():
            toret = result["knowledge_graph"]["description"]
        elif "organic_results" in result.keys():
            snippets = [
                r["snippet"] for r in result["organic_results"] if "snippet" in r.keys()
            ]
            toret = "\n".join(snippets)
        elif "jobs" in result.keys():
            jobs = [
                r["description"] for r in result["jobs"] if "description" in r.keys()
            ]
            toret = "\n".join(jobs)
        elif "videos" in result.keys():
            videos = [
                f"""Title: "{r["title"]}" Link: {r["link"]}"""
                for r in result["videos"]
                if "title" in r.keys()
            ]
            toret = "\n".join(videos)
        elif "images" in result.keys():
            images = [
                f"""Title: "{r["title"]}" Link: {r["original"]["link"]}"""
                for r in result["images"]
                if "original" in r.keys()
            ]
            toret = "\n".join(images)
        return toret
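Because `_prepare_request` forwards any extra keyword arguments as query parameters, and the async path can reuse a caller-supplied `aiohttp` session, the wrapper can be driven like this. A sketch, assuming the `hl`, `gl`, and `location` parameters accepted by the SearchApi Google engine (the same ones used in the integration tests below):

```python
import asyncio

import aiohttp

from langchain.utilities.searchapi import SearchApiAPIWrapper

# Extra keyword arguments are passed straight through as query parameters.
search = SearchApiAPIWrapper()
print(search.run("cafeteria", hl="es", gl="es", location="Madrid, Spain"))


async def main() -> None:
    # Reuse one aiohttp session for several concurrent queries.
    async with aiohttp.ClientSession() as session:
        async_search = SearchApiAPIWrapper(aiosession=session)
        answers = await asyncio.gather(
            async_search.arun("What is the capital of Lithuania?"),
            async_search.arun("What is Obama's full name?"),
        )
        print(answers)


asyncio.run(main())
```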
@@ -2,6 +2,12 @@
# your api key from https://platform.openai.com/account/api-keys
OPENAI_API_KEY=


# searchapi
# your api key from https://www.searchapi.io/
SEARCHAPI_API_KEY=your_searchapi_api_key_here


# pinecone
# your api key from left menu "API Keys" in https://app.pinecone.io
PINECONE_API_KEY=your_pinecone_api_key_here
@@ -0,0 +1,64 @@
"""Integration tests for SearchApi"""
import pytest

from langchain.utilities.searchapi import SearchApiAPIWrapper


def test_call() -> None:
    """Test that call gives correct answer."""
    search = SearchApiAPIWrapper()
    output = search.run("What is the capital of Lithuania?")
    assert "Vilnius" in output


def test_results() -> None:
    """Test that call gives correct answer."""
    search = SearchApiAPIWrapper()
    output = search.results("What is the capital of Lithuania?")
    assert "Vilnius" in output["answer_box"]["answer"]
    assert "Vilnius" in output["answer_box"]["snippet"]
    assert "Vilnius" in output["knowledge_graph"]["description"]
    assert "Vilnius" in output["organic_results"][0]["snippet"]


def test_results_with_custom_params() -> None:
    """Test that call gives correct answer with custom params."""
    search = SearchApiAPIWrapper()
    output = search.results(
        "cafeteria",
        hl="es",
        gl="es",
        google_domain="google.es",
        location="Madrid, Spain",
    )
    assert "Madrid" in output["search_information"]["detected_location"]


def test_scholar_call() -> None:
    """Test that call gives correct answer for scholar search."""
    search = SearchApiAPIWrapper(engine="google_scholar")
    output = search.run("large language models")
    assert "state of large language models and their applications" in output


def test_jobs_call() -> None:
    """Test that call gives correct answer for jobs search."""
    search = SearchApiAPIWrapper(engine="google_jobs")
    output = search.run("AI")
    assert "years of experience" in output


@pytest.mark.asyncio
async def test_async_call() -> None:
    """Test that call gives the correct answer."""
    search = SearchApiAPIWrapper()
    output = await search.arun("What is Obama's full name?")
    assert "Barack Hussein Obama II" in output


@pytest.mark.asyncio
async def test_async_results() -> None:
    """Test that call gives the correct answer."""
    search = SearchApiAPIWrapper()
    output = await search.aresults("What is Obama's full name?")
    assert "Barack Hussein Obama II" in output["knowledge_graph"]["description"]