William Fu-Hinthorn
2023-12-11 12:57:51 -08:00
parent 08ff74ea0b
commit e4ffb17d7b
7 changed files with 15 additions and 119 deletions

View File

@@ -284,8 +284,6 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks.manager import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain.llms import LlamaCpp\n",
"\n",
"llm = LlamaCpp(\n",

View File

@@ -8,8 +8,6 @@
"\n",
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/guides/privacy/presidio_data_anonymization/index.ipynb)\n",
"\n",
">[Presidio](https://microsoft.github.io/presidio/) (from the Latin praesidium, 'protection' or 'garrison') helps to ensure sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text and images such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.\n",
"\n",
"## Use case\n",
"\n",
"Data anonymization is crucial before passing information to a language model like GPT-4 because it helps protect privacy and maintain confidentiality. If data is not anonymized, sensitive information such as names, addresses, contact numbers, or other identifiers linked to specific individuals could potentially be learned and misused. Hence, by obscuring or removing this personally identifiable information (PII), data can be used freely without compromising individuals' privacy rights or breaching data protection laws and regulations.\n",
@@ -532,7 +530,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.11.4"
}
},
"nbformat": 4,

View File

@@ -54,6 +54,7 @@ See a [usage example](/docs/integrations/chat/google_vertex_ai_palm).
from langchain.chat_models import ChatVertexAI
```
## Document Loaders
### Google BigQuery
@@ -131,6 +132,8 @@ See a [usage example and authorization instructions](/docs/integrations/document
from langchain.document_loaders import GoogleSpeechToTextLoader
```
## Vector Stores
### Google Vertex AI Vector Search
@@ -261,9 +264,14 @@ from langchain.tools import GooglePlacesTool
### Google Search
We need to install a python package.
```bash
pip install google-api-python-client
```
- Set up a Custom Search Engine, following [these instructions](https://stackoverflow.com/questions/37083058/programmatically-searching-google-in-python-using-custom-search)
- Get an API Key and Custom Search Engine ID from the previous step, and set them as environment variables
`GOOGLE_API_KEY` and `GOOGLE_CSE_ID` respectively.
- Get an API Key and Custom Search Engine ID from the previous step, and set them as environment variables `GOOGLE_API_KEY` and `GOOGLE_CSE_ID` respectively
```python
from langchain.utilities import GoogleSearchAPIWrapper
@@ -278,74 +286,6 @@ from langchain.agents import load_tools
tools = load_tools(["google-search"])
```
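A minimal usage sketch, assuming `GOOGLE_API_KEY` and `GOOGLE_CSE_ID` are already set as described above; the query string is just an example.

```python
from langchain.utilities import GoogleSearchAPIWrapper

# Reads GOOGLE_API_KEY and GOOGLE_CSE_ID from the environment.
search = GoogleSearchAPIWrapper()
print(search.run("What is LangChain?"))
```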
### Google Finance
We need to install a python package.
```bash
pip install google-search-results
```
See a [usage example and authorization instructions](/docs/integrations/tools/google_finance).
```python
from langchain.tools.google_finance import GoogleFinanceQueryRun
from langchain.utilities.google_finance import GoogleFinanceAPIWrapper
```
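A minimal usage sketch, assuming a SerpApi key is available in the `SERPAPI_API_KEY` environment variable (the `google-search-results` package is SerpApi's client); the same `QueryRun` + `APIWrapper` pattern applies to the Google Jobs, Lens, Scholar, and Trends tools below.

```python
from langchain.tools.google_finance import GoogleFinanceQueryRun
from langchain.utilities.google_finance import GoogleFinanceAPIWrapper

# Assumes SERPAPI_API_KEY is set in the environment.
tool = GoogleFinanceQueryRun(api_wrapper=GoogleFinanceAPIWrapper())
print(tool.run("Google"))
```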
### Google Jobs
We need to install a python package.
```bash
pip install google-search-results
```
See a [usage example and authorization instructions](/docs/integrations/tools/google_jobs).
```python
from langchain.tools.google_jobs import GoogleJobsQueryRun
from langchain.utilities.google_jobs import GoogleJobsAPIWrapper
```
### Google Lens
See a [usage example and authorization instructions](/docs/integrations/tools/google_lens).
```python
from langchain.tools.google_lens import GoogleLensQueryRun
from langchain.utilities.google_lens import GoogleLensAPIWrapper
```
### Google Scholar
We need to install a python package.
```bash
pip install google-search-results
```
See a [usage example and authorization instructions](/docs/integrations/tools/google_scholar).
```python
from langchain.tools.google_scholar import GoogleScholarQueryRun
from langchain.utilities.google_scholar import GoogleScholarAPIWrapper
```
### Google Trends
We need to install a python package.
```bash
pip install google-search-results
```
See a [usage example and authorization instructions](/docs/integrations/tools/google_trends).
```python
from langchain.tools.google_trends import GoogleTrendsQueryRun
from langchain.utilities.google_trends import GoogleTrendsAPIWrapper
```
## Document Transformers
@@ -473,14 +413,6 @@ See a [usage example and authorization instructions](/docs/integrations/tools/se
from langchain.utilities import SerpAPIWrapper
```
### Serper.dev
See a [usage example and authorization instructions](/docs/integrations/tools/google_serper).
```python
from langchain.utilities import GoogleSerperAPIWrapper
```
### YouTube
>The [YouTube Search](https://github.com/joetats/youtube_search) package searches `YouTube` videos without using their heavily rate-limited API.

View File

@@ -151,20 +151,6 @@ See a [usage example](/docs/integrations/document_loaders/microsoft_powerpoint).
from langchain.document_loaders import UnstructuredPowerPointLoader
```
### Microsoft OneNote
First, let's install dependencies:
```bash
pip install bs4 msal
```
See a [usage example](/docs/integrations/document_loaders/onenote).
```python
from langchain.document_loaders.onenote import OneNoteLoader
```
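A minimal sketch, assuming you already have a Microsoft Graph access token; the notebook, section, and page names are placeholders.

```python
from langchain.document_loaders.onenote import OneNoteLoader

# Placeholder values -- filter pages by notebook, section and title,
# authenticating with an existing Microsoft Graph access token.
loader = OneNoteLoader(
    notebook_name="NotebookName",
    section_name="SectionName",
    page_title="PageTitle",
    access_token="<access-token>",
    auth_with_token=True,
)
documents = loader.load()
```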
## Vector stores
@@ -273,25 +259,4 @@ from langchain.agents.agent_toolkits import PowerBIToolkit
from langchain.utilities.powerbi import PowerBIDataset
```
## More
### Microsoft Presidio
>[Presidio](https://microsoft.github.io/presidio/) (from the Latin praesidium, 'protection' or 'garrison')
> helps to ensure sensitive data is properly managed and governed. It provides fast identification and
> anonymization modules for private entities in text and images such as credit card numbers, names,
> locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.
First, you need to install several Python packages and download a `spaCy` model.
```bash
pip install langchain-experimental openai presidio-analyzer presidio-anonymizer spacy Faker
python -m spacy download en_core_web_lg
```
See [usage examples](/docs/guides/privacy/presidio_data_anonymization/).
```python
from langchain_experimental.data_anonymizer import PresidioAnonymizer, PresidioReversibleAnonymizer
```
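A minimal sketch of the reversible variant, which keeps a mapping from fake values back to the originals so anonymized output can be restored after the LLM call; the sample text is invented.

```python
from langchain_experimental.data_anonymizer import PresidioReversibleAnonymizer

anonymizer = PresidioReversibleAnonymizer()

# Anonymize, use the text with an LLM, then map the fake values back.
anonymized = anonymizer.anonymize("John Smith's phone number is 313-666-7440")
print(anonymized)
print(anonymizer.deanonymize(anonymized))
```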

View File

@@ -197,6 +197,8 @@ class RunnableWithMessageHistory(RunnableBindingBase):
            fields[self.input_messages_key] = (Sequence[BaseMessage], ...)
        else:
            fields["__root__"] = (Sequence[BaseMessage], ...)
        if self.history_messages_key:
            fields[self.history_messages_key] = (Sequence[BaseMessage], ...)
        return create_model(  # type: ignore[call-overload]
            "RunnableWithChatHistoryInput",
            **fields,
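The added lines mean that when `history_messages_key` is set, the generated input schema gains a field of that name typed as a sequence of messages. A minimal sketch of the effect, assuming the public constructor and `get_input_schema` API; exact import paths may differ slightly between versions.

```python
from langchain.memory import ChatMessageHistory
from langchain_core.runnables import RunnableLambda
from langchain_core.runnables.history import RunnableWithMessageHistory

chain = RunnableLambda(lambda input: input["input"])

with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: ChatMessageHistory(),
    input_messages_key="input",
    history_messages_key="history",
)

# The generated pydantic model now includes both "input" and "history" fields.
print(with_history.get_input_schema().schema())
```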

View File

@@ -178,6 +178,7 @@ def test_output_dict() -> None:
def test_get_input_schema_input_dict() -> None:
    class RunnableWithChatHistoryInput(BaseModel):
        input: Union[str, BaseMessage, Sequence[BaseMessage]]
        history: Sequence[BaseMessage]

    runnable = RunnableLambda(
        lambda input: {
View File

@@ -64,7 +64,7 @@ prompt = ChatPromptTemplate.from_messages(
    ]
)
llm_with_tools = llm.bind_functions(
llm_with_tools = llm.bind(
    functions=[format_tool_to_openai_function(t) for t in tools]
)
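For context, a minimal sketch of how the bound model is typically wired into the rest of an OpenAI-functions agent; `prompt`, `llm_with_tools`, and `tools` refer to the objects constructed above, and the scratchpad/output-parser pieces are assumptions from the standard recipe rather than lines from the edited file.

```python
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser

agent = (
    {
        "input": lambda x: x["input"],
        # Convert intermediate (action, observation) steps into messages.
        "agent_scratchpad": lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "What is the weather in SF?"})
```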