[Inference] Fix bug in ChatGLM2 Tensor Parallelism (#5014)

* fix bug
* fix
* fix multiquery
* fix multiquery

---------

Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
2025-09-22 18:09:06 +00:00 · 2023-11-07 15:01:50 +08:00
parent c36e782d80
commit ef4c14a5e2
8 changed files with 21 additions and 19 deletions
--- a/colossalai/shardformer/policies/chatglm2.py
+++ b/colossalai/shardformer/policies/chatglm2.py
@@ -104,7 +104,6 @@ class ChatGLMPolicy(Policy):
                    ),
                ],
            )
-
        # optimization configuration
        self.append_or_create_submodule_replacement(
            description=[