mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-28 02:29:17 +00:00
This is a work in progress PR to track my progres. ## TODO: - [x] Get results using the specifed searx host - [x] Prioritize returning an `answer` or results otherwise - [ ] expose the field `infobox` when available - [ ] expose `score` of result to help agent's decision - [ ] expose the `suggestions` field to agents so they could try new queries if no results are found with the orignial query ? - [ ] Dynamic tool description for agents ? - Searx offers many engines and a search syntax that agents can take advantage of. It would be nice to generate a dynamic Tool description so that it can be used many times as a tool but for different purposes. - [x] Limit number of results - [ ] Implement paging - [x] Miror the usage of the Google Search tool - [x] easy selection of search engines - [x] Documentation - [ ] update HowTo guide notebook on Search Tools - [ ] Handle async - [ ] Tests ### Add examples / documentation on possible uses with - [ ] getting factual answers with `!wiki` option and `infoboxes` - [ ] getting `suggestions` - [ ] getting `corrections` --------- Co-authored-by: blob42 <spike@w530> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
44 lines
2.1 KiB
Markdown
44 lines
2.1 KiB
Markdown
# Key Concepts
|
|
|
|
## Text Splitter
|
|
This class is responsible for splitting long pieces of text into smaller components.
|
|
It contains different ways for splitting text (on characters, using Spacy, etc)
|
|
as well as different ways for measuring length (token based, character based, etc).
|
|
|
|
## Embeddings
|
|
These classes are very similar to the LLM classes in that they are wrappers around models,
|
|
but rather than return a string they return an embedding (list of floats). These are particularly useful when
|
|
implementing semantic search functionality. They expose separate methods for embedding queries versus embedding documents.
|
|
|
|
## Vectorstores
|
|
These are datastores that store embeddings of documents in vector form.
|
|
They expose a method for passing in a string and finding similar documents.
|
|
|
|
## Python REPL
|
|
Sometimes, for complex calculations, rather than have an LLM generate the answer directly,
|
|
it can be better to have the LLM generate code to calculate the answer, and then run that code to get the answer.
|
|
In order to easily do that, we provide a simple Python REPL to execute commands in.
|
|
This interface will only return things that are printed -
|
|
therefore, if you want to use it to calculate an answer, make sure to have it print out the answer.
|
|
|
|
## Bash
|
|
It can often be useful to have an LLM generate bash commands, and then run them.
|
|
A common use case this is for letting it interact with your local file system.
|
|
We provide an easy component to execute bash commands.
|
|
|
|
## Requests Wrapper
|
|
The web contains a lot of information that LLMs do not have access to.
|
|
In order to easily let LLMs interact with that information,
|
|
we provide a wrapper around the Python Requests module that takes in a URL and fetches data from that URL.
|
|
|
|
## Google Search
|
|
This uses the official Google Search API to look up information on the web.
|
|
|
|
## SerpAPI
|
|
This uses SerpAPI, a third party search API engine, to interact with Google Search.
|
|
|
|
## Searx Search
|
|
This uses the Searx (SearxNG fork) meta search engine API to lookup information
|
|
on the web. It supports 139 search engines and is easy to self-host
|
|
which makes it a good choice for privacy-conscious users.
|