# Amazon Comprehend Moderation Chain
---

In [None]:
%pip install boto3 nltk

In [None]:
import boto3

comprehend_client = boto3.client('comprehend', region_name='us-east-1')

Import `AmazonComprehendModerationChain`

In [None]:
from langchain_experimental.comprehend_moderation import AmazonComprehendModerationChain

Initialize an instance of the Amazon Comprehend Moderation Chain to be used with your LLM chain

In [None]:
comprehend_moderation = AmazonComprehendModerationChain(
    client=comprehend_client, #optional
    verbose=True
)

Use it with your LLM chain.

**Note**: The example below uses the _Fake LLM_ from LangChain, but the same concept applies to other LLMs.

In [None]:
from langchain import PromptTemplate, LLMChain
from langchain.llms.fake import FakeListLLM
from langchain_experimental.comprehend_moderation.base_moderation_exceptions import ModerationPiiError

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = (
    prompt 
    | comprehend_moderation 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comprehend_moderation 
)

try:
    response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except ModerationPiiError as e:
    print(e.message)
else:
    print(response['output'])


## Using `moderation_config` to customize your moderation
---

Pass a configuration to Amazon Comprehend Moderation to control which moderations run and what action is taken for each of them. When no configuration is passed, as in the example above, three moderations run:

- PII (Personally Identifiable Information) checks 
- Toxicity content detection
- Intention detection

Here is an example of a moderation config.

In [None]:
from langchain_experimental.comprehend_moderation import BaseModerationActions, BaseModerationFilters

moderation_config = { 
        "filters":[ 
                BaseModerationFilters.PII, 
                BaseModerationFilters.TOXICITY,
                BaseModerationFilters.INTENT
        ],
        "pii":{ 
                "action": BaseModerationActions.ALLOW, 
                "threshold":0.5, 
                "labels":["SSN"],
                "mask_character": "X"
        },
        "toxicity":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5
        },
        "intent":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5
        }
}

At the core of the configuration you have three filters specified in the `filters` key:

1. `BaseModerationFilters.PII`
2. `BaseModerationFilters.TOXICITY`
3. `BaseModerationFilters.INTENT`

Each moderation function also takes an `action` key with two possible values:

1. `BaseModerationActions.ALLOW` - allows the prompt to pass through, but masks detected PII in the case of the PII check. By default, the check runs against all PII entity types and redacts them. If entity types are listed in the `labels` field, only those types are checked and masked.
2. `BaseModerationActions.STOP` - stops the prompt from passing to the next step if any PII, Toxicity, or incorrect Intent is detected. It raises a Python `Exception`, which halts the chain in progress.

With the configuration in the previous cell, the chain performs PII checks and allows the prompt to pass through, but masks any SSNs present in either the prompt or the LLM output. Because the Toxicity and Intent checks use `STOP`, toxic content or a flagged intent halts the chain.
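
To make the `ALLOW` behavior concrete, here is a minimal sketch of how `threshold`, `labels`, and `mask_character` could interact. It is not the chain's actual implementation. The helper name `mask_pii` is hypothetical, and the entity dictionaries are hand-written to follow the shape of the `Entities` list returned by Amazon Comprehend's `DetectPiiEntities` API (`Type`, `Score`, `BeginOffset`, `EndOffset`), so no AWS call is made.

```python
from typing import Dict, List, Optional


def mask_pii(
    text: str,
    entities: List[Dict],
    threshold: float = 0.5,
    labels: Optional[List[str]] = None,
    mask_character: str = "X",
) -> str:
    """Replace every qualifying entity span with mask characters (illustrative only)."""
    chars = list(text)
    for entity in entities:
        if entity["Score"] < threshold:
            continue  # below the confidence threshold: leave the text untouched
        if labels and entity["Type"] not in labels:
            continue  # a label filter is set and this entity type is not listed
        for i in range(entity["BeginOffset"], entity["EndOffset"]):
            chars[i] = mask_character
    return "".join(chars)


sample = "My SSN is 323-22-9980 and my name is John."
# Hand-written entities standing in for a DetectPiiEntities response.
fake_entities = [
    {"Type": "SSN", "Score": 0.99, "BeginOffset": 10, "EndOffset": 21},
    {"Type": "NAME", "Score": 0.98, "BeginOffset": 37, "EndOffset": 41},
]

masked = mask_pii(sample, fake_entities, labels=["SSN"])
print(masked)
```

With `labels=["SSN"]` only the SSN span is masked and the name is left alone. Passing `labels=None` in this sketch would mask both spans, which mirrors the "redact all PII entities" default described above.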


In [None]:
comp_moderation_with_config = AmazonComprehendModerationChain(
    moderation_config=moderation_config, #specify the configuration
    client=comprehend_client,            #optionally pass the Boto3 Client
    verbose=True
)

In [None]:
template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = ( 
    prompt 
    | comp_moderation_with_config 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comp_moderation_with_config 
)

try:
    response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

## Unique ID and Moderation Callbacks
---

When the Amazon Comprehend moderation action is `STOP`, the chain raises one of the following exceptions:

- `ModerationPiiError` for PII checks
- `ModerationToxicityError` for Toxicity checks
- `ModerationIntentionError` for Intent checks
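
One way to separate a moderation stop from any other failure is to catch only these exception types. The sketch below is an assumption-laden illustration, not part of the library: `invoke_guarded`, `FakeChain`, and `FakePiiStop` are hypothetical names, and the stand-in classes exist only so the example runs without AWS access.

```python
from typing import Any, Dict, Tuple, Type


def invoke_guarded(
    chain: Any,
    inputs: Dict[str, Any],
    stop_errors: Tuple[Type[Exception], ...],
) -> Dict[str, Any]:
    """Invoke `chain` and report whether a moderation STOP exception was raised."""
    try:
        return {"blocked": False, "result": chain.invoke(inputs)}
    except stop_errors as err:
        # Only the listed exception types count as moderation stops;
        # anything else (network errors, bugs) still propagates.
        return {"blocked": True, "reason": type(err).__name__, "detail": str(err)}


# Hypothetical stand-ins so the sketch runs without the real chain.
class FakePiiStop(Exception):
    pass


class FakeChain:
    def invoke(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        if "SSN" in inputs["question"]:
            raise FakePiiStop("PII detected")
        return {"output": inputs["question"]}


outcome = invoke_guarded(FakeChain(), {"question": "My SSN is 123"}, (FakePiiStop,))
print(outcome)
```

With the real chain, `stop_errors` would be the tuple of the three exceptions listed above. `ModerationPiiError` is imported from `langchain_experimental.comprehend_moderation.base_moderation_exceptions` in the first example; it is assumed here that the other two live in the same module.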

In addition to the moderation configuration, the `AmazonComprehendModerationChain` can also be initialized with the following parameters:

- `unique_id` [Optional]: a string parameter that can carry any string value or ID. For example, in a chat application you may want to keep track of abusive users; in that case, you can pass the user's username or email ID. Defaults to `None`.

- `moderation_callback` [Optional]: the `BaseModerationCallbackHandler` that will be called asynchronously (non-blocking to the chain). Callback functions are useful when you want to perform additional actions when the moderation functions run, for example writing to a database or a log file. You can override three functions by subclassing `BaseModerationCallbackHandler`: `on_after_pii()`, `on_after_toxicity()`, and `on_after_intent()`. All three must be `async` functions. These callback functions receive two arguments:
    - `moderation_beacon`: a dictionary containing information about the moderation function, the full response from the Amazon Comprehend model, a unique chain ID, the moderation status, and the input string that was validated. The dictionary has the following schema:
    
    ```
    { 
        'moderation_chain_id': 'xxx-xxx-xxx', # Unique chain ID
        'moderation_type': 'Toxicity' | 'PII' | 'Intent', 
        'moderation_status': 'LABELS_FOUND' | 'LABELS_NOT_FOUND',
        'moderation_input': 'A sample SSN number looks like this 123-456-7890. Can you give me some more samples?',
        'moderation_output': {...} #Full Amazon Comprehend PII, Toxicity, or Intent Model Output
    }
    ```
    
    - `unique_id`: the value passed to `AmazonComprehendModerationChain`, if any.

<div class="alert alert-block alert-info"> <b>NOTE:</b> <code>moderation_callback</code> is different from LangChain Chain Callbacks. You can still use LangChain Chain callbacks with <code>AmazonComprehendModerationChain</code> via the callbacks parameter. Example: <br/>
<pre>
from langchain.callbacks.stdout import StdOutCallbackHandler
comp_moderation_with_config = AmazonComprehendModerationChain(verbose=True, callbacks=[StdOutCallbackHandler()])
</pre>
</div>
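
As a small illustration of working with the `moderation_beacon` schema described above, the sketch below turns a beacon into a compact JSON log line. The helper name `beacon_to_log_line` is hypothetical and the sample beacon is hand-written from the schema; a helper like this could be called from inside `on_after_pii()` or the other callback functions.

```python
import json
from typing import Any, Dict, Optional


def beacon_to_log_line(
    moderation_beacon: Dict[str, Any], unique_id: Optional[str] = None
) -> str:
    """Build a one-line JSON summary of a moderation beacon (illustrative only)."""
    status = moderation_beacon.get("moderation_status")
    record = {
        "chain_id": moderation_beacon.get("moderation_chain_id"),
        "type": moderation_beacon.get("moderation_type"),
        "status": status,
        "flagged": status == "LABELS_FOUND",
        "unique_id": unique_id,
    }
    # The raw 'moderation_input' is deliberately left out so that text which
    # may contain PII does not end up in the log.
    return json.dumps(record, sort_keys=True)


# Hand-written beacon that follows the schema shown above.
sample_beacon = {
    "moderation_chain_id": "xxx-xxx-xxx",
    "moderation_type": "PII",
    "moderation_status": "LABELS_FOUND",
    "moderation_input": "A sample SSN number looks like this 123-456-7890.",
    "moderation_output": {},
}

line = beacon_to_log_line(sample_beacon, unique_id="john.doe@email.com")
print(line)
```

Leaving the input text out of the log is a design choice for this sketch; keep it if your logging destination is allowed to store the validated text.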

In [None]:
from langchain_experimental.comprehend_moderation import BaseModerationCallbackHandler

In [None]:
# Define callback handlers by subclassing BaseModerationCallbackHandler

class MyModCallback(BaseModerationCallbackHandler):
    
    async def on_after_pii(self, output_beacon, unique_id):
        import json
        moderation_type = output_beacon['moderation_type']
        chain_id = output_beacon['moderation_chain_id']
        with open(f'output-{moderation_type}-{chain_id}.json', 'w') as file:
            data = { 'beacon_data': output_beacon, 'unique_id': unique_id }
            json.dump(data, file)
    
    '''
    async def on_after_toxicity(self, output_beacon, unique_id):
        pass
    
    async def on_after_intent(self, output_beacon, unique_id):
        pass
    '''
    

my_callback = MyModCallback()

In [None]:
moderation_config = { 
        "filters": [ 
                BaseModerationFilters.PII, 
                BaseModerationFilters.TOXICITY
        ],
        "pii":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5, 
                "labels":["SSN"], 
                "mask_character": "X" 
        },
        "toxicity":{ 
                "action": BaseModerationActions.STOP, 
                "threshold":0.5 
        }
}

comp_moderation_with_config = AmazonComprehendModerationChain(
        moderation_config=moderation_config, # specify the configuration
        client=comprehend_client,            # optionally pass the Boto3 Client
        unique_id='john.doe@email.com',      # A unique ID
        moderation_callback=my_callback,     # BaseModerationCallbackHandler
        verbose=True
)

In [None]:
from langchain import PromptTemplate, LLMChain
from langchain.llms.fake import FakeListLLM

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

responses = [
    "Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.", 
    "Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]

llm = FakeListLLM(responses=responses)

llm_chain = LLMChain(prompt=prompt, llm=llm)

chain = (
    prompt 
    | comp_moderation_with_config 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | comp_moderation_with_config 
) 

try:
    response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

## `moderation_config` and moderation execution order
---

If `AmazonComprehendModerationChain` is not initialized with a `moderation_config`, the default action is `STOP` and the default order of moderation checks is as follows:

```
AmazonComprehendModerationChain
│
└──Check PII with Stop Action
    ├── Callback (if available)
    ├── Label Found ⟶ [Error Stop]
    └── No Label Found 
        └──Check Toxicity with Stop Action
            ├── Callback (if available)
            ├── Label Found ⟶ [Error Stop]
            └── No Label Found
                └──Check Intent with Stop Action
                    ├── Callback (if available)
                    ├── Label Found ⟶ [Error Stop]
                    └── No Label Found
                        └── Return Prompt
```

If any check raises an exception, the subsequent checks are not performed. If a `callback` is provided, it is called for each check that was performed. For example, in the case above, if the chain fails due to the presence of PII, the Toxicity and Intent checks are not performed.

You can override the execution order by passing a `moderation_config` and specifying the desired order in its `filters` key; the checks run in the order listed there. For example, with the configuration below, the Toxicity check runs first, then PII, and finally Intent. Because no other keys are set, `AmazonComprehendModerationChain` runs these checks with the default `kwargs` for each model.

```python
moderation_config = { 
        "filters":[ BaseModerationFilters.TOXICITY, 
                    BaseModerationFilters.PII, 
                    BaseModerationFilters.INTENT]
   }
```
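
The ordering and short-circuit behavior described above can be sketched in plain Python. This is a conceptual model only, not the chain's code: `run_checks` and `CheckFailed` are hypothetical names, string keys stand in for the `BaseModerationFilters` members, and the lambdas stand in for the Amazon Comprehend calls.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CheckFailed(Exception):
    """Hypothetical stand-in for the chain's Moderation*Error exceptions."""


def run_checks(
    text: str,
    filters: List[str],
    checks: Dict[str, Callable[[str], bool]],
    on_after: Optional[Callable[[str, bool], None]] = None,
) -> None:
    """Run checks in `filters` order and stop at the first one that finds a label."""
    for name in filters:
        found = checks[name](text)
        if on_after is not None:
            on_after(name, found)  # callback fires for every check that ran
        if found:
            raise CheckFailed(name)  # STOP action: later checks never run


# Toy detectors standing in for the real moderation models.
toy_checks = {
    "toxicity": lambda t: False,
    "pii": lambda t: "ssn" in t.lower(),
    "intent": lambda t: False,
}

log: List[Tuple[str, bool]] = []
stopped_at = None
try:
    run_checks(
        "My SSN is 123-45-6789",
        ["toxicity", "pii", "intent"],
        toy_checks,
        on_after=lambda name, found: log.append((name, found)),
    )
except CheckFailed as err:
    stopped_at = str(err)

print(log, stopped_at)
```

Here the Toxicity check runs first because it is listed first, the PII check stops the run, and the Intent check never executes, so the callback log holds only two entries.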

Model `kwargs` are specified by the `pii`, `toxicity`, and `intent` keys within the `moderation_config` dictionary. For example, the `moderation_config` below overrides the default order of moderation and the `pii` and `toxicity` model `kwargs`. For `intent`, the chain's default `kwargs` are used.

```python
moderation_config = { 
        "filters":[ BaseModerationFilters.TOXICITY, 
                    BaseModerationFilters.PII, 
                    BaseModerationFilters.INTENT],
        "pii":{ "action": BaseModerationActions.ALLOW, 
                "threshold":0.5, 
                "labels":["SSN"], 
                "mask_character": "X" },
        "toxicity":{ "action": BaseModerationActions.STOP, 
                     "threshold":0.5 }
   }
```

1. For a list of PII labels, see Amazon Comprehend Universal PII entity types: https://docs.aws.amazon.com/comprehend/latest/dg/how-pii.html#how-pii-types
2. The available Toxicity labels are:
    - `HATE_SPEECH`: Speech that criticizes, insults, denounces or dehumanizes a person or a group on the basis of an identity, be it race, ethnicity, gender identity, religion, sexual orientation, ability, national origin, or another identity-group.
    - `GRAPHIC`: Speech that uses visually descriptive, detailed and unpleasantly vivid imagery is considered as graphic. Such language is often made verbose so as to amplify an insult, discomfort or harm to the recipient.
    - `HARASSMENT_OR_ABUSE`: Speech that imposes disruptive power dynamics between the speaker and hearer, regardless of intent, seeks to affect the psychological well-being of the recipient, or objectifies a person should be classified as Harassment.
    - `SEXUAL`: Speech that indicates sexual interest, activity or arousal by using direct or indirect references to body parts or physical traits or sex is considered as toxic with toxicityType "sexual". 
    - `VIOLENCE_OR_THREAT`: Speech that includes threats which seek to inflict pain, injury or hostility towards a person or group.
    - `INSULT`: Speech that includes demeaning, humiliating, mocking, insulting, or belittling language.
    - `PROFANITY`: Speech that contains words, phrases or acronyms that are impolite, vulgar, or offensive is considered as profane.
3. For a list of Intent labels refer to documentation [link here]
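
To show how the Toxicity labels relate to a `threshold`, the sketch below filters a hand-written response shaped like the output of Amazon Comprehend's `DetectToxicContent` API (`ResultList` entries with `Labels` of `Name` and `Score`). The helper name `labels_over_threshold` is hypothetical. Whether the chain compares the threshold against individual label scores or the overall toxicity score is not stated here, so the per-label comparison is an assumption of this sketch.

```python
from typing import Any, Dict, List


def labels_over_threshold(response: Dict[str, Any], threshold: float = 0.5) -> List[str]:
    """Return the sorted, de-duplicated label names scoring at or above `threshold`."""
    flagged = set()
    for segment in response.get("ResultList", []):
        for label in segment.get("Labels", []):
            if label["Score"] >= threshold:
                flagged.add(label["Name"])
    return sorted(flagged)


# Hand-written response; the scores are made up for illustration.
fake_response = {
    "ResultList": [
        {
            "Labels": [
                {"Name": "PROFANITY", "Score": 0.93},
                {"Name": "INSULT", "Score": 0.71},
                {"Name": "SEXUAL", "Score": 0.04},
            ],
            "Toxicity": 0.9,
        }
    ]
}

flagged_labels = labels_over_threshold(fake_response, threshold=0.5)
print(flagged_labels)
```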

# Examples
---

## With HuggingFace Hub Models

Get your API key from the Hugging Face Hub: https://huggingface.co/docs/api-inference/quicktour#get-your-api-token

In [None]:
%pip install huggingface_hub

In [None]:
%env HUGGINGFACEHUB_API_TOKEN=<HUGGINGFACEHUB_API_TOKEN>

In [None]:
# See https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads for some other options
repo_id = "google/flan-t5-xxl"  


In [None]:
from langchain import HuggingFaceHub
from langchain import PromptTemplate, LLMChain

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=["question"])

llm = HuggingFaceHub(
    repo_id=repo_id, model_kwargs={"temperature": 0.5, "max_length": 256}
)
llm_chain = LLMChain(prompt=prompt, llm=llm)

Create a configuration and initialize an Amazon Comprehend Moderation chain

In [None]:
moderation_config = { 
        "filters":[ BaseModerationFilters.PII, BaseModerationFilters.TOXICITY, BaseModerationFilters.INTENT ],
        "pii":{"action": BaseModerationActions.ALLOW, "threshold":0.5, "labels":["SSN","CREDIT_DEBIT_NUMBER"], "mask_character": "X"},
        "toxicity":{"action": BaseModerationActions.STOP, "threshold":0.5},
        "intent":{"action": BaseModerationActions.ALLOW, "threshold":0.5,},
   }

# without any callback
amazon_comp_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config, 
                                                         client=comprehend_client,
                                                         verbose=True)

# with callback
amazon_comp_moderation_out = AmazonComprehendModerationChain(moderation_config=moderation_config, 
                                                         client=comprehend_client,
                                                         moderation_callback=my_callback,
                                                         verbose=True)

With this `moderation_config`, the chain stops any input or model output that contains obscene words or sentences (Toxicity) scored above the threshold of 0.5 (50%). PII entities of type `SSN` or `CREDIT_DEBIT_NUMBER` that score above the threshold are redacted with the mask character before the call proceeds. The Intent check uses `ALLOW`, so it does not stop the chain.

In [None]:
chain = (
    prompt 
    | amazon_comp_moderation 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | amazon_comp_moderation_out
)

try:
    response = chain.invoke({"question": "My AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has 24$ due by July 31st. Can you give me some more credit card number samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])

---
## With Amazon SageMaker Jumpstart

The example below shows how to use the Amazon Comprehend Moderation chain with an LLM hosted on Amazon SageMaker Jumpstart. You need an Amazon SageMaker Jumpstart hosted LLM endpoint in your AWS account.

In [None]:
endpoint_name = "<SAGEMAKER_ENDPOINT_NAME>" # replace with your SageMaker Endpoint name

In [None]:
from langchain import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains import LLMChain
from langchain.prompts import load_prompt, PromptTemplate
import json

class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        input_str = json.dumps({"text_inputs": prompt,  **model_kwargs})
        return input_str.encode('utf-8')
    
    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json['generated_texts'][0]

content_handler = ContentHandler()

#prompt template for input text
llm_prompt = PromptTemplate(input_variables=["input_text"], template="{input_text}")

llm_chain = LLMChain(
    llm=SagemakerEndpoint(
        endpoint_name=endpoint_name, 
        region_name='us-east-1',
        model_kwargs={"temperature":0.97,
                      "max_length": 200,
                      "num_return_sequences": 3,
                      "top_k": 50,
                      "top_p": 0.95,
                      "do_sample": True},
        content_handler=content_handler
    ),
    prompt=llm_prompt
)

Create a configuration and initialize an Amazon Comprehend Moderation chain

In [None]:
moderation_config = { 
        "filters":[ BaseModerationFilters.PII, BaseModerationFilters.TOXICITY ],
        "pii":{"action": BaseModerationActions.ALLOW, "threshold":0.5, "labels":["SSN"], "mask_character": "X"},
        "toxicity":{"action": BaseModerationActions.STOP, "threshold":0.5},
        "intent":{"action": BaseModerationActions.ALLOW, "threshold":0.5,},
   }

amazon_comp_moderation = AmazonComprehendModerationChain(moderation_config=moderation_config, 
                                                         client=comprehend_client ,
                                                         verbose=True)

With this `moderation_config`, the chain stops any input or model output that contains obscene words or sentences (Toxicity) scored above the threshold of 0.5 (50%). PII entities of type `SSN` that score above the threshold are redacted with the mask character before the call proceeds. The `filters` list contains only PII and Toxicity, so the Intent check is not expected to run even though an `intent` entry is present.

In [None]:
chain = (
    prompt 
    | amazon_comp_moderation 
    | {llm_chain.input_keys[0]: lambda x: x['output'] }  
    | llm_chain 
    | { "input": lambda x: x['text'] } 
    | amazon_comp_moderation 
)

try:
    response = chain.invoke({"question": "My AnyCompany Financial Services, LLC credit card account 1111-0000-1111-0008 has 24$ due by July 31st. Can you give me some more samples?"})
except Exception as e:
    print(str(e))
else:
    print(response['output'])