InternLM/xtuner is a Python-based training engine designed for ultra-large MoE models, addressing the challenges of training and scaling these complex models efficiently.
Source: README
XTuner is gaining attention for its approach to training MoE models, offering scalable and efficient large-scale language model training. Its dropless training and long-sequence support stand out because they address pain points of traditional training methods.
Source: Synthesis of README and project traits
XTuner's dropless training lets MoE training scale without the complexity of traditional 3D parallel training architectures, keeping the parallelism strategy simple and efficient.
Source: README
The engine's memory-efficient design supports training on long sequences, enabling training of 200B MoE models at 64k sequence length without sequence parallelism.
Source: README
XTuner achieves high training efficiency on Ascend NPU, surpassing traditional 3D parallel schemes for MoE models above 200B scale, and supports MoE training up to 1T parameters.
Source: README
The architecture of XTuner is modular, with distinct components for training, algorithm, and inference. It leverages advanced memory optimization techniques and supports various hardware platforms, including GPUs and NPUs. The code structure reflects a clear separation of concerns, with dedicated modules for different functionalities.
Source: Code tree + dependency files
Dependency graph. Center: project; inner ring: core feature modules; outer ring: key dependencies. Auto-generated from core_features and tech_stack.key_deps.
Key dependencies: bitsandbytes, mmengine, transformers, torch, torchvision
XTuner is suitable for researchers and developers working on large-scale language models, particularly those focusing on MoE architectures. It is useful for pre-training, instruction fine-tuning, and reinforcement learning of ultra-large MoE models.
Source: README
v1.0.1 (2026-05-15): Bug fixes and improvements to the main branch
v1.0.0rc0 (2025-11-18): Release candidate for version 1.0.0
v0.2.0 (2025-07-11): Initial release with support for pre-trained RM and bug fixes
Source: GitHub Releases
InternLM/xtuner is a promising project for those involved in large-scale language model development, offering innovative solutions for training MoE models. It is particularly suited for teams with expertise in deep learning and large-scale model training.
Source: Synthesis