# Metaphor Search

Metaphor is a search engine designed from the ground up for use by LLMs. You can run a search and then retrieve the contents of any page it returns.

This notebook goes over how to use Metaphor search.

First, you need to set up the proper API keys and environment variables. Get 1000 free searches/month [here](https://platform.metaphor.systems/).

Then enter your API key as an environment variable.

In [1]:
import os

os.environ["METAPHOR_API_KEY"] = "..."
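A placeholder key is easy to leave in place by accident. As a minimal sketch (the helper name `require_api_key` is hypothetical and not part of Metaphor or LangChain), the key can be checked before any client is created. Treating the literal `"..."` as missing is an assumption based on the placeholder used in this notebook.

```python
import os


def require_api_key(name: str = "METAPHOR_API_KEY") -> str:
    """Return the key stored in the environment variable `name`.

    Raises RuntimeError if the variable is unset, empty, or still holds
    the "..." placeholder (an assumption specific to this notebook).
    """
    value = os.environ.get(name, "").strip()
    if not value or value == "...":
        raise RuntimeError(f"Set the {name} environment variable to a real API key.")
    return value
```

The returned value can then be passed wherever the key is needed, for example as the `api_key` argument of the SDK client.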

## Using their SDK

This is the newer, better-supported way to use the Metaphor API: through their SDK.

In [2]:
# !pip install metaphor-python

In [3]:
from metaphor_python import Metaphor

client = Metaphor(api_key=os.environ["METAPHOR_API_KEY"])

In [4]:
from langchain.agents import tool
from typing import List

In [5]:
@tool
def search(query: str):
    """Call search engine with a query."""
    return client.search(query, use_autoprompt=True, num_results=5)

@tool
def get_contents(ids: List[str]):
    """Get contents of a webpage.
    
    The ids passed in should be a list of ids as fetched from `search`.
    """
    return client.get_contents(ids)

@tool
def find_similar(url: str):
    """Get search results similar to a given URL.
    
    The url passed in should be a URL returned from `search`
    """
    return client.find_similar(url, num_results=5)

In [6]:
tools = [search, get_contents, find_similar]
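The tools above are meant to be chained: `search` returns results that carry ids, and `get_contents` accepts a list of those ids. A rough sketch of the same chain outside an agent follows. The function name `search_then_get_contents` is hypothetical, and the `.results` and `.id` attributes are assumed from the `SearchResponse` repr printed in the agent output in this notebook. The return value of `get_contents` is passed through unchanged because its shape is not shown here.

```python
from typing import Any


def search_then_get_contents(client: Any, query: str, num_results: int = 5) -> Any:
    """Run a search, collect the result ids, and fetch the contents of those pages.

    `client` is expected to behave like the SDK client created earlier.
    Returns None when the search yields no results.
    """
    response = client.search(query, use_autoprompt=True, num_results=num_results)
    # Each result carries an `id`; these ids are what `get_contents` expects.
    ids = [result.id for result in response.results]
    if not ids:
        return None
    return client.get_contents(ids)
```

Taking the client as a parameter keeps the function easy to exercise with a stand-in object when no API key is available.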

### Use in an agent

In [16]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0)

In [18]:
from langchain.agents import OpenAIFunctionsAgent
from langchain.schema import SystemMessage
system_message = SystemMessage(content="You are a web researcher who uses search engines to look up information.")
prompt = OpenAIFunctionsAgent.create_prompt(system_message=system_message)
agent = OpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)

In [19]:
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

In [20]:
agent_executor.run("Find the hottest AI agent startups and what they do")



> Entering new AgentExecutor chain...

Invoking: `search` with `{'query': 'hottest AI agent startups'}`


SearchResponse(results=[Result(title='A Search Engine for Machine Intelligence', url='https://bellow.ai/', id='bdYc6hvHww_JvLv9k8NhPA', score=0.19460266828536987, published_date='2023-01-01', author=None, extract=None), Result(title='Adept: Useful General Intelligence', url='https://www.adept.ai/', id='aNBppxBZvQRZMov6sFVj9g', score=0.19103890657424927, published_date='2000-01-01', author=None, extract=None), Result(title='HiOperator | Generative AI-Enhanced Customer Service', url='https://www.hioperator.com/', id='jieb6sB53mId3EDo0z-SDw', score=0.18549954891204834, published_date='2000-01-01', author=None, extract=None), Result(title='Home - Stylo', url='https://www.askstylo.com/', id='kUiCuCjJYMD4N0NXdCtqlQ', score=0.1837376356124878, published_date='2000-01-01', author=None, extract=None), Result(title='DirectAI', url='https://directai.io/

"Here are some of the hottest AI agent startups and what they do:\n\n1. [Bellow AI](https://bellow.ai/): This startup provides a search engine for machine intelligence. It allows users to get responses from multiple AIs, exploring the full space of machine intelligence and getting highly tailored results.\n\n2. [Adept AI](https://www.adept.ai/): Adept is focused on creating useful general intelligence.\n\n3. [HiOperator](https://www.hioperator.com/): HiOperator offers generative AI-enhanced customer support automation. It provides scalable, digital-first customer service and uses its software to empower agents to learn quickly and deliver accurate results.\n\n4. [Stylo](https://www.askstylo.com/): Stylo uses AI to help manage customer support, identifying where to most effectively spend time to improve the customer experience.\n\n5. [DirectAI](https://directai.io/): DirectAI allows users to build and deploy powerful computer vision models with plain language, without the need for code 

## Using the tool wrapper

This is the older way of using Metaphor, through our own in-house integration.

In [2]:
from langchain.utilities import MetaphorSearchAPIWrapper

In [3]:
search = MetaphorSearchAPIWrapper()

### Call the API
`results` takes a Metaphor-optimized search query and a number of results (up to 500). It returns a list of results, each with a title, URL, author, and published date.

In [4]:
search.results("The best blog post about AI safety is definitely this: ", 10)

[{'title': 'Core Views on AI Safety: When, Why, What, and How',
  'url': 'https://www.anthropic.com/index/core-views-on-ai-safety',
  'author': None,
  'published_date': '2023-03-08'},
 {'title': 'Extinction Risk from Artificial Intelligence',
  'url': 'https://aisafety.wordpress.com/',
  'author': None,
  'published_date': '2013-10-08'},
 {'title': 'The simple picture on AI safety - LessWrong',
  'url': 'https://www.lesswrong.com/posts/WhNxG4r774bK32GcH/the-simple-picture-on-ai-safety',
  'author': 'Alex Flint',
  'published_date': '2018-05-27'},
 {'title': 'No Time Like The Present For AI Safety Work',
  'url': 'https://slatestarcodex.com/2015/05/29/no-time-like-the-present-for-ai-safety-work/',
  'author': None,
  'published_date': '2015-05-29'},
 {'title': 'A plea for solutionism on AI safety - LessWrong',
  'url': 'https://www.lesswrong.com/posts/ASMX9ss3J5G3GZdok/a-plea-for-solutionism-on-ai-safety',
  'author': 'Jasoncrawford',
  'published_date': '2023-06-09'},
 {'title': 'The 
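Because `results` returns plain dictionaries, ordinary Python is enough to post-process them. The sketch below sorts results newest first; the helper name `newest_first` is hypothetical, and the sample records are copied from the first three entries of the output above. It assumes `published_date` is either `None` or a `YYYY-MM-DD` string, which sorts correctly as text.

```python
from typing import Dict, List, Optional

SearchResult = Dict[str, Optional[str]]


def newest_first(results: List[SearchResult]) -> List[SearchResult]:
    """Sort results by `published_date`, newest first.

    Results without a published date keep their relative order and go last.
    """
    dated = [r for r in results if r.get("published_date")]
    undated = [r for r in results if not r.get("published_date")]
    return sorted(dated, key=lambda r: r["published_date"], reverse=True) + undated


# Sample records copied from the output above.
sample = [
    {
        "title": "Core Views on AI Safety: When, Why, What, and How",
        "url": "https://www.anthropic.com/index/core-views-on-ai-safety",
        "author": None,
        "published_date": "2023-03-08",
    },
    {
        "title": "Extinction Risk from Artificial Intelligence",
        "url": "https://aisafety.wordpress.com/",
        "author": None,
        "published_date": "2013-10-08",
    },
    {
        "title": "The simple picture on AI safety - LessWrong",
        "url": "https://www.lesswrong.com/posts/WhNxG4r774bK32GcH/the-simple-picture-on-ai-safety",
        "author": "Alex Flint",
        "published_date": "2018-05-27",
    },
]

for result in newest_first(sample):
    print(result["published_date"], result["title"])
```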

### Adding filters
We can also add filters to our search. 

include_domains: Optional[List[str]] - List of domains to include in the search. If specified, results will only come from these domains. Only one of include_domains and exclude_domains should be specified.

exclude_domains: Optional[List[str]] - List of domains to exclude in the search. If specified, results will not come from these domains. Only one of include_domains and exclude_domains should be specified.

start_crawl_date: Optional[str] - "Crawl date" refers to the date that Metaphor discovered a link, which is more granular and can be more useful than published date. If start_crawl_date is specified, results will only include links that were crawled after start_crawl_date. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ)

end_crawl_date: Optional[str] - "Crawl date" refers to the date that Metaphor discovered a link, which is more granular and can be more useful than published date. If end_crawl_date is specified, results will only include links that were crawled before end_crawl_date. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ)

start_published_date: Optional[str] - If specified, only links with a published date after start_published_date will be returned. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). Note that for some links, we have no published date, and these links will be excluded from the results if start_published_date is specified.

end_published_date: Optional[str] - If specified, only links with a published date before end_published_date will be returned. Must be specified in ISO 8601 format (YYYY-MM-DDTHH:MM:SSZ). Note that for some links, we have no published date, and these links will be excluded from the results if end_published_date is specified.

See full docs [here](https://metaphorapi.readme.io/).
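Two of the rules above are easy to get wrong by hand: the date format and the restriction to one domain list. The following is a minimal sketch of a helper that enforces both. The names `to_iso8601` and `build_filters` are hypothetical and not part of LangChain or Metaphor. Treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def to_iso8601(moment: datetime) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_filters(
    include_domains: Optional[List[str]] = None,
    exclude_domains: Optional[List[str]] = None,
    start_published: Optional[datetime] = None,
    end_published: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a filtered search.

    Raises ValueError if both domain lists are given, since only one
    of them should be specified.
    """
    if include_domains and exclude_domains:
        raise ValueError("Specify only one of include_domains and exclude_domains.")
    filters: Dict[str, Any] = {}
    if include_domains:
        filters["include_domains"] = include_domains
    if exclude_domains:
        filters["exclude_domains"] = exclude_domains
    if start_published is not None:
        filters["start_published_date"] = to_iso8601(start_published)
    if end_published is not None:
        filters["end_published_date"] = to_iso8601(end_published)
    return filters


filters = build_filters(
    include_domains=["lesswrong.com"],
    start_published=datetime(2019, 1, 1),
)
print(filters)
# {'include_domains': ['lesswrong.com'], 'start_published_date': '2019-01-01T00:00:00Z'}
```

The resulting dictionary can be unpacked into the call as keyword arguments, for example `search.results(query, 10, **filters)`.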

In [None]:
search.results(
    "The best blog post about AI safety is definitely this: ",
    10,
    include_domains=["lesswrong.com"],
    start_published_date="2019-01-01",
)

### Use Metaphor as a tool
Metaphor can be used as a tool that finds URLs for other tools, such as browsing tools, to act on.

In [None]:
%pip install playwright
from langchain.agents.agent_toolkits import PlayWrightBrowserToolkit
from langchain.tools.playwright.utils import (
    create_async_playwright_browser,  # A synchronous browser is also available, but it isn't compatible with Jupyter.
)

async_browser = create_async_playwright_browser()
toolkit = PlayWrightBrowserToolkit.from_browser(async_browser=async_browser)
tools = toolkit.get_tools()

tools_by_name = {tool.name: tool for tool in tools}
print(tools_by_name.keys())
navigate_tool = tools_by_name["navigate_browser"]
extract_text = tools_by_name["extract_text"]

In [None]:
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import MetaphorSearchResults

llm = ChatOpenAI(model_name="gpt-4", temperature=0.7)

metaphor_tool = MetaphorSearchResults(api_wrapper=search)

agent_chain = initialize_agent(
    [metaphor_tool, extract_text, navigate_tool],
    llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent_chain.run(
    "find me an interesting tweet about AI safety using Metaphor, then tell me the first sentence in the post. Do not finish until able to retrieve the first sentence."
)