langchain/docs
maks-operlejn-ds 2aae1102b0
Instance anonymization (#10501)
### Description

Add instance anonymization: if `John Doe` appears twice in a text,
both occurrences are treated as the same entity.
The difference between `PresidioAnonymizer` and
`PresidioReversibleAnonymizer` is that only the latter has
built-in memory, so it remembers the anonymization mapping across
multiple texts:

```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```
```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```
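
To make the difference concrete, here is a minimal sketch of the idea. It does not use Presidio or the actual langchain classes: `ToyInstanceAnonymizer`, its `keep_memory` flag, and the explicit `entities` argument are hypothetical names chosen for illustration, and entity detection (which the real anonymizers delegate to Presidio) is assumed to have already happened.

```python
import itertools


class ToyInstanceAnonymizer:
    """Hypothetical stand-in for illustration; not the langchain implementation."""

    def __init__(self, fake_names, keep_memory=False):
        # Fake replacement values are handed out in order (assumption: the
        # real anonymizers generate them randomly instead).
        self._fakes = itertools.cycle(fake_names)
        self._keep_memory = keep_memory
        self._mapping = {}  # original entity -> fake value

    def anonymize(self, text, entities):
        # Without memory, every call starts from an empty mapping, so the
        # same entity is only consistent within a single text.
        mapping = self._mapping if self._keep_memory else {}
        for entity in entities:
            if entity not in mapping:
                mapping[entity] = next(self._fakes)
            # Every occurrence of the entity gets the same replacement.
            text = text.replace(entity, mapping[entity])
        return text


fakes = ["Noah Rhodes", "Brett Russell"]
text = "My name is John Doe. Hi John Doe!"

stateless = ToyInstanceAnonymizer(fakes)
stateless_first = stateless.anonymize(text, ["John Doe"])
stateless_second = stateless.anonymize(text, ["John Doe"])

with_memory = ToyInstanceAnonymizer(fakes, keep_memory=True)
memory_first = with_memory.anonymize(text, ["John Doe"])
memory_second = with_memory.anonymize(text, ["John Doe"])
```

In this sketch, both variants replace the two occurrences of `John Doe` within one text consistently, but only the variant with `keep_memory=True` reuses the same replacement on the second call.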

### Twitter handle
@deepsense_ai / @MaksOpp

### Tag maintainer
@baskaryan @hwchase17 @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-10-05 11:23:02 -07:00
| Name | Last commit | Date |
|---|---|---|
| _scripts | llm feat table revision (#10947) | 2023-09-22 10:29:12 -07:00 |
| api_reference | fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513) | 2023-10-05 10:47:47 -07:00 |
| docs_skeleton | fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513) | 2023-10-05 10:47:47 -07:00 |
| extras | Instance anonymization (#10501) | 2023-10-05 11:23:02 -07:00 |
| snippets | Docs: improve similarity search examples (#11298) | 2023-10-03 21:47:08 -04:00 |
| .local_build.sh | Update local script for docs build (#8377) | 2023-07-27 13:13:59 -07:00 |
| vercel_requirements.txt | Add api cross ref linking (#8275) | 2023-07-26 12:38:58 -07:00 |