Zach Nussbaum | 7debf52fc2 | fix: stop gap to remove unused columns | 2023-04-19 21:16:22 +00:00
Zach Nussbaum | 405d8c1bbc | fix: typo | 2023-04-19 19:36:45 +00:00
Zach Nussbaum | 6518fa1461 | feat: load dataset from revision | 2023-04-19 18:40:58 +00:00
Zach Nussbaum | c76f6e33a9 | feat: pull from multiple datasets | 2023-04-17 20:00:19 +00:00
Zach Nussbaum | a3485c4b32 | Merge: main into gptj | 2023-04-13 15:16:31 +00:00
Zach Nussbaum | 8a94a8c068 | fix: multi-turn data breaks | 2023-04-12 03:51:29 +00:00
Zach Nussbaum | be3f528810 | fix: tokenization error | 2023-04-08 20:33:51 +00:00
Zach | 0bd6acb4dd | fix: drop uneven batch size | 2023-04-07 12:09:31 +00:00
Zach | 1b14b1f723 | fix: data for inference | 2023-04-07 01:45:07 +00:00
Zach | 7751f39432 | fix: data processing | 2023-04-06 03:03:34 +00:00
Zach | 65ec606f21 | fix: prompt len for larger | 2023-04-04 22:01:55 +00:00
Zach Nussbaum | 5c5f41ba36 | fix: clean up data, pad at end | 2023-04-04 20:53:23 +00:00
Zach Nussbaum | 7e468f2199 | Update data.py | 2023-03-28 21:13:05 -07:00
Zach Nussbaum | 1a95f68494 | fix: just read from watermark file | 2023-03-27 17:30:44 +00:00
Zach Nussbaum | bb28929305 | fix: eos conditional, watermark | 2023-03-27 16:29:43 +00:00
Zach Nussbaum | eac7734cbf | fix: add eos | 2023-03-26 17:45:31 +00:00
Zach Nussbaum | 723a50bdf1 | feat: train and clean data | 2023-03-25 16:17:48 +00:00