ByteDance’s Doubao AI Team Unveils Sparse Model Architecture UltraMem
TMTPOST – The Doubao AI team at ByteDance has introduced a new sparse model architecture called UltraMem. The architecture decouples computation from parameters, addressing memory-access bottlenecks during inference while maintaining model quality.
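The article does not describe UltraMem's internals, but the general idea behind decoupling computation from parameters can be illustrated with a sparse memory-layer lookup: a large table of value vectors holds most parameters, while each input activates only a handful of slots. The sketch below is purely illustrative; all names and sizes are hypothetical assumptions, not details from UltraMem.

```python
import numpy as np

# Hypothetical sketch, NOT UltraMem's actual design: a sparse memory layer
# where parameters (the value table) are stored apart from computation,
# and each query touches only the top-k slots.

rng = np.random.default_rng(0)
n_slots, d = 1024, 64                         # assumed sizes for illustration
keys = rng.standard_normal((n_slots, d))      # addressing keys
values = rng.standard_normal((n_slots, d))    # large parameter table

def sparse_memory_lookup(query, k=8):
    """Score all keys, keep only the top-k slots, and mix their values."""
    scores = keys @ query                     # (n_slots,) similarity scores
    top = np.argpartition(scores, -k)[-k:]    # indices of the k best slots
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                  # softmax over the k active slots
    return weights @ values[top]              # output reads only k value rows

out = sparse_memory_lookup(rng.standard_normal(d))
print(out.shape)                              # (64,)
```

Because only `k` of the `n_slots` value rows are read per query, memory traffic during inference stays roughly constant even as the parameter table grows, which is the kind of trade-off a decoupled sparse architecture targets.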
UltraMem significantly accelerates inference, delivering 2x to 6x faster performance than the Mixture of Experts (MoE) architecture while cutting inference costs by up to 83%, making it a highly efficient option for large AI models.
The innovation is seen as a key advancement in optimizing memory usage and computational efficiency, pushing the boundaries of large-scale AI model performance.