Mirror of https://github.com/hpcaitech/ColossalAI.git (synced 2025-11-03 23:48:41 +00:00)
			
		
		
		
* add grpo, support rlvr
* add grpo, support rlvr
* tested deepseek r1 pipeline
* add ci
* verify grpo r1
* verify grpo r1
* update readme, remove unused code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* remove path
* clean code
* fix circular import
* fix ci OOM
* fix ci OOM
* skip kto tp, fix qwen generation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
		
			
				
	
	
		
15 lines, 423 B, Bash, Executable File
		
	
	
	
	
			
		
		
	
	
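# Directory that receives the cache, jsonl and arrow outputs. Set this before
# running: while it is empty, the paths below resolve to /cache, /jsonl and /arrow.
SAVE_DIR=""

# Remove outputs left over from a previous run.
rm -rf "$SAVE_DIR/cache"
rm -rf "$SAVE_DIR/jsonl"
rm -rf "$SAVE_DIR/arrow"

# Prepare the prompt dataset. Replace the /PATH/TO/... placeholders and the
# empty --tokenizer_dir value with real locations.
python prepare_dataset.py --type prompt \
    --data_input_dirs /PATH/TO/PROMPT/DATASET \
    --conversation_template_config /PATH/TO/CHAT/TEMPLATE/CONFIG.json \
    --tokenizer_dir "" \
    --data_cache_dir "$SAVE_DIR/cache" \
    --data_jsonl_output_dir "$SAVE_DIR/jsonl" \
    --data_arrow_output_dir "$SAVE_DIR/arrow" \
    --max_length 300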
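The cleanup step in the script runs `rm -rf` on paths built from `SAVE_DIR`, which ships as an empty placeholder. One possible guard is sketched below. The helper name `clean_outputs` is hypothetical and not part of the original script; it only assumes the three subdirectory names (`cache`, `jsonl`, `arrow`) used above.

```shell
# Hypothetical helper (an assumption, not part of the original script):
# delete the three output directories only when a base directory is given.
clean_outputs() {
    local save_dir="$1"

    # An empty base would turn "$save_dir/cache" into "/cache", so stop here.
    if [ -z "$save_dir" ]; then
        echo "SAVE_DIR is empty; refusing to delete anything" >&2
        return 1
    fi

    # "--" ends option parsing, so a base starting with "-" is treated as a path.
    rm -rf -- "$save_dir/cache" "$save_dir/jsonl" "$save_dir/arrow"
}
```

In the script this could replace the three `rm -rf` lines with `clean_outputs "$SAVE_DIR" || exit 1`, so the dataset preparation never starts with an unset output directory.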