Commit 4dd5df1b6f (parent 9cf38e0ad9): fix: format
@@ -241,7 +241,10 @@ We tried training a full model using the parameters above, but found that during
### Model Training Divergence
We trained multiple [GPT-J models](https://huggingface.co/EleutherAI/gpt-j-6b) with varying success. We found that training the full model diverged after epoch 1.

We release the checkpoint after epoch 1.
Using Atlas, we extracted the embeddings of each point in the dataset and calculated the loss per sequence. We then uploaded [this to Atlas](https://atlas.nomic.ai/map/gpt4all-j-post-epoch-1-embeddings) and noticed that the higher-loss items cluster. On further inspection, the highest-density clusters were of prompt/response pairs that asked for creative generations such as `Generate a story about ...`.
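The log does not show how the per-sequence loss was computed. A minimal sketch of one common way to derive it from model logits is below; the function name, shapes, and masking convention are our own (using `-100` as the ignored-label value, following the common Hugging Face convention), not code from this repository.

```python
import numpy as np

def per_sequence_loss(logits, labels, pad_id=-100):
    """Mean cross-entropy per sequence from raw logits.

    logits: (batch, seq_len, vocab), labels: (batch, seq_len).
    Tokens labeled pad_id are excluded from each sequence's average.
    """
    # Numerically stable log-softmax over the vocab dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    mask = labels != pad_id
    safe_labels = np.where(mask, labels, 0)  # avoid indexing with -100

    # Negative log-likelihood of each target token.
    token_loss = -np.take_along_axis(
        log_probs, safe_labels[..., None], axis=-1
    ).squeeze(-1)

    # Average over the non-padding tokens of each sequence.
    token_loss = token_loss * mask
    return token_loss.sum(axis=-1) / np.maximum(mask.sum(axis=-1), 1)
```

Each element of the returned vector can then be attached as metadata to the corresponding embedding before uploading the map, which is what makes the high-loss clusters visible in Atlas.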