Tencent Hunyuan Unveils Hunyuan Image 2.0, an Industry-First Real-Time Text-to-Image Model with Millisecond Response | TMTPOST

May. 19, 2025

TMTPOST -- Tencent’s Hunyuan AI division has released Hunyuan Image 2.0, the industry's first real-time text-to-image generation model with millisecond-level response. The new model has a parameter count tens of times larger than its predecessor's and supports multimodal input: text, voice, and sketches. From a spoken command, a written prompt, or a simple line drawing, users can generate realistic images in real time. Hunyuan Image 2.0 is built on a single- and dual-stream DiT (Diffusion Transformer) architecture, which boosts generation efficiency without compromising image quality or detail. The system also uses a multimodal large language model (MLLM) as its text encoder, paired with a proprietary structured captioning system. Together, these let the model understand the semantics of the input, infer visual intent, and progressively generate high-fidelity images.
