Chelsea_Sun ・ Jun. 7, 2024
China's First Sora-level Text-to-video Large Model Vidu Can Generate 32-second Video

TMTPOST--China's first Sora-level text-to-video large model Vidu has made substantial progress since it was unveiled at the 2024 Zhongguancun Forum in Beijing in April.

Currently, Vidu can generate 32-second videos with a single click. It also supports audio-visual generation, meaning that videos generated by Vidu now have sound (Text-to-Audio). Meanwhile, it supports 4D generation, allowing for spatially and temporally consistent 4D content from a single video.

Vidu was developed by Chinese AI firm Shengshu Technology and Tsinghua University. At its initial launch in April, it could create a 16-second high-definition video at 1080p resolution with a single click.

At the 2024 China Chongqing Auto Forum, held June 7-16, Zhu Jun, a professor and Vice Dean of the Tsinghua University Artificial Intelligence Research Institute and Chief Scientist of Shengshu Technology, released a 32-second video generated by Vidu. The video depicts a rotating globe on a bookshelf in a library, with the camera zooming in to reveal a blue planet inside the globe.

Vidu4D can accurately reconstruct 4D (sequential 3D) content from a single video. Zhu said that the underlying architecture theoretically supports the generation and matching of audio of any length, thus enhancing the realism of video generation through improved 3D consistency.

The company said that Vidu is China's first large video model to offer extended duration, exceptional consistency, and dynamic capabilities, and that it is "very close to" the level of Sora.

Sora is a generative AI model launched by the U.S.-based OpenAI earlier this year. With its ability to build realistic and imaginative scenes from text instructions, the model has taken the tech world by storm.

Compared with Sora, Vidu is able to understand and generate Chinese elements such as pandas and dragons.

The company also said that work on the model's core architecture was initiated in September 2022, earlier than Sora's adoption of its own architecture.
