Files
langchain/langchain/llms
Carmen Sam d54c88aa21 Add allowed and disallowed special arguments to BaseOpenAI (#3012)
## Background
This PR fixes this error when there are special tokens when querying the
chain:
```
Encountered text corresponding to disallowed special token '<|endofprompt|>'.
If you want this text to be encoded as a special token, pass it to `allowed_special`, e.g. `allowed_special={'<|endofprompt|>', ...}`.
If you want this text to be encoded as normal text, disable the check for this token by passing `disallowed_special=(enc.special_tokens_set - {'<|endofprompt|>'})`.
To disable this check for all special tokens, pass `disallowed_special=()`.
```

Refer to the code snippet below, it breaks in the chain line.
```
        chain = ConversationalRetrievalChain.from_llm(
            ChatOpenAI(openai_api_key=OPENAI_API_KEY),
            retriever=vectorstore.as_retriever(),
            qa_prompt=prompt,
            condense_question_prompt=condense_prompt,
        )
        answer = chain({"question": f"{question}"})
```
However `ChatOpenAI` class is not accepting `allowed_special` and
`disallowed_special` at the moment so they cannot be passed to the
`encode()` in `get_num_tokens` method to avoid the errors.


## Change
- Add `allowed_special` and `disallowed_special` attributes to
`BaseOpenAI` class.
- Pass in `allowed_special` and `disallowed_special` as arguments of
`encode()` in tiktoken.

---------

Co-authored-by: samcarmen <“carmen.samkahman@gmail.com”>
2023-04-18 09:34:08 -07:00
..
2023-04-06 14:41:06 -07:00
2023-04-09 17:54:26 -07:00
2022-12-18 16:22:42 -05:00
2023-04-13 21:31:18 -07:00
2022-11-14 22:05:41 -08:00