Tencent to Launch First All-Modal Foundation Model 'Hunyuan-O'
TMTPOST — Tencent is preparing to unveil its first all-modal foundation model, codenamed Hunyuan-O, in a direct challenge to rivals such as DeepSeek and ByteDance’s Doubao AI, according to sources familiar with the matter.
The Hunyuan-O project is part of Tencent’s broader push toward artificial general intelligence (AGI) and builds on its self-developed large model framework, Hunyuan. Tencent is reportedly planning a roadmap that progresses from multi-modal to fully all-modal capabilities, with Hunyuan-O positioned to be the world’s first "all-modal" model upon release, possibly within this year. The model targets the development of a "world model" — a concept in AGI research referring to a unified model capable of simulating and understanding complex real-world environments across modalities.
In the near term, Tencent is also set to launch Hunyuan-Voice, an end-to-end speech conversation model built on Hunyuan. The model could debut as early as June via the Tencent Yuanbao app, offering real-time voice conversation capabilities that would compete directly with the AI video call features of ByteDance’s Doubao.
According to researchers in Tencent’s Technical Engineering Group (TEG), the Hunyuan model family is being developed with language models at its core, while actively integrating diverse modalities including vision, speech, and potentially motion. The company aims to push the boundaries of both depth and breadth in AGI exploration.
Tencent’s foray into all-modal AI follows a wave of competition in China’s foundational model space, where companies are racing to develop more capable and unified systems. With Hunyuan-O, Tencent signals its ambition to play a leading role in the next generation of AI infrastructure.