Load Balancing
-
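Microsoft open-sources Sigma-MoE-Tiny: an MoE model with an extreme 40:1 sparsity ratio that reaches 10B-class performance with 0.5B activated parameters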
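Keywords: Mixture-of-Experts (MoE), ultra-high sparsity, progressive sparsification schedule, Sigma-MoE-Tiny, expert load balancing. An in-depth analysis and redesign of the load-balancing mechanism in MoE architectures. Sigma-MoE-Tiny Technical Report: https://qghuxmu.github.io/Sigma-MoE-Tiny https://github.…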