Artificial Intelligence Learning 🤖🧠🦾

What is Alibaba’s text-to-video Wan2.1?

China’s Alibaba and DeepSeek are pressuring OpenAI and Google with the text-to-video and open-source capabilities of their models, products, and application-layer innovations.

Michael Spencer
Feb 26, 2025


Good Afternoon,

I’ve been thinking about Alibaba a lot recently. In the text-to-video space, ByteDance, Tencent, and now Alibaba have been impressive. Open-source AI tech has been thrown into the spotlight since Chinese firm DeepSeek rattled global markets in January.

In late February 2025, Alibaba made its video-generation AI models free to use, further ramping up competition with rivals like OpenAI.

This essentially means that in 2025, China is leading the open-source democratization of text-to-video AI.

The Chinese giant said it is open-sourcing four models in its Wan2.1 series, the latest version of the company’s foundational AI model, which can generate images and video from text and image inputs.

  • These models will be available via Alibaba Cloud’s ModelScope and Hugging Face, a huge repository of AI models.

  • See on Github: https://github.com/Wan-Video/Wan2.1

  • See on Hugging Face: https://huggingface.co/blog/LLMhacker/wanai-wan21

By making its Tongyi Wanxiang models freely available to academics, researchers, and commercial institutions worldwide, Alibaba is making a DeepSeek-level statement about how China will make AI technologies available to developers and students everywhere.

Alibaba also claims state-of-the-art (SOTA) performance: Wan2.1 consistently outperforms existing open-source models and commercial solutions, including OpenAI’s Sora and Google’s Veo 2, across multiple benchmarks. All of those are fairly expensive to access.

Hangzhou-based DeepSeek changed the rules in late January 2025.

The Wan2.1 series, which was released in January, was the first video-generation model to support text effects in both Chinese and English.

© 2025 Michael Spencer
