zhangxinyue ・ Jan. 28, 2025
DeepSeek Releases Open-Source Multimodal AI Model Janus-Pro, Surpassing DALL-E 3 and Stable Diffusion
The Janus-Pro-7B model outperformed OpenAI's DALL-E 3 and Stable Diffusion on the GenEval and DPG-Bench text-to-image benchmarks, and is designed to handle both image generation and image understanding.

TMTPOST -- In the early hours of Tuesday, the AI community was abuzz as Hugging Face announced the release of DeepSeek's latest open-source multimodal AI model, Janus-Pro. Available in two configurations with 1 billion and 7 billion parameters, the model marks a significant leap in AI capabilities.

Janus-Pro combines several recent advances in multimodal AI. For image understanding, the model relies on the SigLIP-L vision encoder, while its image generation uses the tokenizer from LlamaGen. The two released sizes cater to a range of computational budgets.

The launch comes at a time when the highly anticipated image-generation capability of OpenAI's multimodal model GPT-4o remains unavailable to the public, adding to the excitement surrounding Janus-Pro's open-source debut.

DeepSeek has been at the forefront of multimodal generative AI research. The company launched its original Janus model in late 2024 as a unified framework for understanding and generating multimodal content. Janus was built on DeepSeek-LLM-1.3b-base, a base model trained on roughly 500 billion text tokens. Its design decoupled visual encoding to serve both understanding and generation tasks, using SigLIP-L for visual input. A follow-up variant, JanusFlow, combined the language model with rectified flow for image generation.

This progress culminated in Janus-Pro, an enhanced autoregressive framework with significant refinements. By decoupling visual encoding into independent pathways for understanding and generation, Janus-Pro reduces the conflict between the two tasks while still relying on a single, unified Transformer architecture. This modularity improves flexibility and task-specific performance.
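To make the idea of decoupled visual pathways concrete, here is a minimal toy sketch in Python. It is not DeepSeek's code: the function names, the `Segment` type, and the tiny list-based "image" are all hypothetical stand-ins chosen for illustration. The only point it shows is that an image takes a different encoding route depending on the task, yet both routes feed one shared sequence that a single model would consume.

```python
from dataclasses import dataclass
from typing import List

# Toy grayscale image: rows of pixel intensities in [0, 1] (illustrative only).
Image = List[List[float]]


def understanding_encoder(image: Image) -> List[float]:
    """Hypothetical stand-in for a semantic encoder: one continuous feature per row."""
    return [sum(row) / len(row) for row in image]


def generation_tokenizer(image: Image, codebook_size: int = 4) -> List[int]:
    """Hypothetical stand-in for a discrete tokenizer: map each pixel to a codebook index."""
    return [
        min(int(pixel * codebook_size), codebook_size - 1)
        for row in image
        for pixel in row
    ]


@dataclass
class Segment:
    kind: str      # "text", "image_features", or "image_tokens"
    values: list   # payload for this part of the sequence


def build_sequence(task: str, prompt: List[str], image: Image) -> List[Segment]:
    """Route the image through the task-specific pathway, then emit one shared sequence."""
    if task == "understand":
        # Understanding: continuous image features come first, then the question.
        return [
            Segment("image_features", understanding_encoder(image)),
            Segment("text", prompt),
        ]
    if task == "generate":
        # Generation: the text prompt comes first, then discrete image tokens.
        return [
            Segment("text", prompt),
            Segment("image_tokens", generation_tokenizer(image)),
        ]
    raise ValueError(f"unknown task: {task}")


toy_image = [[0.0, 0.5], [1.0, 0.25]]
understand_seq = build_sequence("understand", ["what", "is", "this"], toy_image)
generate_seq = build_sequence("generate", ["a", "cat"], toy_image)
```

In this sketch the two pathways never share an encoder, so tuning one does not disturb the other, while the downstream consumer of the sequence stays the same for both tasks.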

Janus-Pro is built on DeepSeek-LLM-1.5b-base and DeepSeek-LLM-7b-base and was trained using HAI-LLM, a high-performance distributed training framework built on PyTorch. Training ran on clusters of 16 to 32 nodes, each equipped with eight Nvidia A100 GPUs, and took 7 to 14 days depending on the model size.
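For a rough sense of scale, the training figures can be turned into GPU-days with simple arithmetic. The sketch below assumes the smaller run used 16 nodes for 7 days and the larger run used 32 nodes for 14 days. The article gives only the ranges, so that pairing is an assumption, and the function name `gpu_days` is made up for this example.

```python
GPUS_PER_NODE = 8  # eight A100 GPUs per node, as stated in the article


def gpu_days(nodes: int, days: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Total GPU-days = nodes * GPUs per node * wall-clock days."""
    return nodes * gpus_per_node * days


# Assumed pairing (not stated in the article): low end of both ranges for the
# smaller model, high end of both ranges for the larger model.
small_run = gpu_days(nodes=16, days=7)    # 16 * 8 * 7
large_run = gpu_days(nodes=32, days=14)   # 32 * 8 * 14

print(f"smaller run: {small_run} GPU-days on {16 * GPUS_PER_NODE} GPUs")
print(f"larger run:  {large_run} GPU-days on {32 * GPUS_PER_NODE} GPUs")
```

Under these assumptions the runs come to 896 and 3,584 GPU-days, on 128 and 256 GPUs respectively.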

The complete Janus-Pro codebase is now available on GitHub in the Janus repository.

DeepSeek’s rapid advancements in multimodal AI may heighten competition with industry giants such as OpenAI, Meta, and Nvidia. However, the company has faced challenges, including recent large-scale cyberattacks on its online services. To mitigate these issues, DeepSeek has temporarily restricted new user registrations outside China, requiring international users to register using virtual numbers.

With Janus-Pro setting new standards for multimodal AI, the industry eagerly anticipates further developments, including potential advancements in text-to-image and text-to-video capabilities. 
