mirror of https://github.com/hpcaitech/ColossalAI.git synced 2026-07-16 17:16:14 +00:00

Files

Tong Li 7bb7e80476 [feat] GRPO with distributed implementation (#6230)

* add reward related function

* add simple grpo

* update grpo

* polish

* modify data loader

* grpo consumer

* update loss

* update reward fn

* update example

* update loader

* add algo selection

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add save

* update select algo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update grpo

* update reward fn

* update reward

* fix reward score

* add response length

* detach

* fix tp bug

* fix consumer

* convert to 8 generation

* print results

* setup update

* fix transformers backend

* [Feature] Support Distributed LogProb for GRPO Training (#6247)

* [fix] fix qwen VocabParallelLMHead1D and gather output

* fix tp bug

* fix consumer

* [feat] Support Distributed LogProb for GRPO Training

* [fix] fix loss func

* [fix] fix log prob plugin

* [fix] fix qwen modeling param

* [fix] rm comments

* [fix] rm hard-code;fix non-dist version

* [fix] fix test file param name and benchmark tp gather output=True/False

* [fix] rm non-dist version in dist log prob

* [fix] fix comments

* [fix] fix dis log prob plugin

* [fix] fix test case

* [fix] fix qwen VocabParallelLMHead1D and gather output

* [fix] fix DistLogProb comments

* [fix] restore tp size

* [fix] fix comments

* [fix] fix comment; fix LogSoftmax usage

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>

* fix vllm

* fix logprob, add filtering, temperature annealing, lr descent

* simplify vllm preprocessing input ids

* update logging

* [feat] add microbatch forwarding (#6251)

* add microbatch forwarding

* fix forward microbatch

* fix producer OOM

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change project name

* fix temperature annealing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* address conversation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [Distributed RLHF] Integration of PP (#6257)

* update help information

* update style

* fix

* minor fix

* support PP training

* add pp support

* remove unused code

* address conversation

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>

* [hot-fix] Fix memory leakage bug, support TP+PP (#6258)

* update help information

* update style

* fix

* minor fix

* support PP training

* add pp support

* remove unused code

* address conversation

* fix memory leakage support tp+pp

* move empty cache

* move empty cache

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>

---------

Co-authored-by: Tong Li <tong.li35271158@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: YeAnbang <anbangy2@outlook.com>
Co-authored-by: duanjunwen <935724073@qq.com>
Co-authored-by: YeAnbang <44796419+YeAnbang@users.noreply.github.com>

2025-04-21 10:43:49 +08:00

Colossal-LLaMA

[Device]Support npu (#6159)

2024-12-17 15:42:39 +08:00

ColossalChat

[feat] GRPO with distributed implementation (#6230)

2025-04-21 10:43:49 +08:00

ColossalEval

[ColossalEval] support for vllm (#6056)

2024-09-18 17:09:45 +08:00

ColossalMoE

[MoE/ZeRO] Moe refactor with zero refactor (#5821)

2024-06-28 14:00:08 +08:00

ColossalQA

[pre-commit.ci] pre-commit autoupdate (#5572)

2024-07-01 17:16:41 +08:00

README.md

[Hotfix] README link (#5966)

2024-08-08 18:04:47 +08:00

README.md

Applications

This directory contains the applications that are powered by Colossal-AI.

GPU Cloud Playground | Playground Document

The applications include:

* Open-Sora: Revealing Complete Model Parameters, Training Details, and Everything for Sora-like Video Generation Models
* ColossalChat: Replication of ChatGPT with RLHF
* Colossal-LLaMA: Continual Pre-training and Supervised Fine-tuning of LLaMA2 / LLaMA3
* ColossalEval: Evaluation Pipeline for LLMs
* FastFold: Optimizing AlphaFold (Biomedicine) Training and Inference on GPU Clusters
* ColossalQA: Document Retrieval Conversation System
* SwiftInfer: Breaks the Length Limit of LLM Inference for Multi-Round Conversations

Please note that the Chatbot application was migrated from the original ChatGPT folder.

You can find more example code for base models and functions in the Examples directory.