Bagatur
|
4d427b2397
|
Base language model docstrings (#7104)
|
2023-07-07 16:09:10 -04:00 |
|
Zander Chase
|
785502edb3
|
Add 'get_token_ids' method (#4784)
Let user inspect the token ids in addition to getting the number of tokens
---------
Co-authored-by: Zach Schillaci <40636930+zachschillaci27@users.noreply.github.com>
|
2023-05-22 13:17:26 +00:00 |
|
Ankush Gola
|
d3ec00b566
|
Callbacks Refactor [base] (#3256)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
|
2023-04-30 11:14:09 -07:00 |
|
Mike Wang
|
b588446bf9
|
[simple][test] Added test case for schema.py (#3692)
- added unit tests for schema.py covering utility functions and token
counting.
- fixed a nit: based on the Hugging Face docs, the tokenizer model is gpt-2.
[link](https://huggingface.co/transformers/v4.8.2/_modules/transformers/models/gpt2/tokenization_gpt2_fast.html)
- make lint && make format passed locally
- screenshot of the new test run result
<img width="1283" alt="Screenshot 2023-04-27 at 9 51 55 PM"
src="https://user-images.githubusercontent.com/62768671/235057441-c0ac3406-9541-453f-ba14-3ebb08656114.png">
|
2023-04-28 20:42:24 -07:00 |
|