Recently, Modelers, a community for AI model enthusiasts, officially launched Step-Video and Step-Audio, two open-source multimodal large models developed by StepStar. The models target video generation and voice interaction, respectively, and are intended to give developers and enterprise users more powerful AI tools.

Step-Video, formally known as Step-Video-T2V, has 30 billion parameters, making it currently the world's largest open-source video generation model. It can directly generate high-quality 204-frame videos at 540P resolution, and it surpasses existing top-tier open-source video models in instruction following, motion smoothness, physical plausibility, and aesthetics.

Meanwhile, Step-Audio is the industry's first large model capable of generating speech with diverse emotions, dialects, languages, singing styles, and personalized characteristics. This release marks a significant breakthrough in the field of AI voice interaction.

Notably, both models have been adapted to the Huawei Ascend CANN heterogeneous computing architecture and to Ascend servers. Developers and enterprise users can download and try the models in the Modelers community. To further lower the barrier to entry, Modelers also provides free computing power, so users can run model inference online without complex environment setup and quickly validate their AI solutions.

Furthermore, StepStar's open-source models have attracted attention from several industry-leading companies. Firms from various sectors, including TensFlow, Alibaba Cloud, Volcano Engine, and TCL, have already joined the open-source ecosystem. StepStar plans to launch a new image-to-video model in March, further enriching its product line.

This collaboration between Huawei Ascend and StepStar not only expands the application scenarios of multimodal AI models but also provides developers with more powerful tools, driving technological advancements across the industry.