Beijing TuSimple Future Technology Co., Ltd. officially released "Ruyi", its first image-to-video large model, on December 17, 2024, and open-sourced the Ruyi-Mini-7B version for download from the Hugging Face platform. TuSimple was founded in 2015 and is headquartered in San Diego, California, USA, focusing on applying AI technology across industries including animation, gaming, and transportation.

The Ruyi large model is designed to run on consumer-grade graphics cards, and ships with detailed deployment instructions and ComfyUI workflows so users can get started quickly. Having been trained extensively on anime and gaming scenarios, it delivers strong frame consistency, motion fluidity, color presentation, and composition, offering new possibilities for visual storytelling and making it well suited to ACG enthusiasts.


The Ruyi model supports multi-resolution, multi-duration generation: it handles resolutions from 384×384 to 1024×1024 at any aspect ratio, and generates videos up to 120 frames (about 5 seconds) long. It also supports first-frame and first-and-last-frame conditioned generation, motion amplitude control, and five types of camera control. Ruyi is based on the DiT architecture, composed of a Causal VAE module and a Diffusion Transformer, with a total parameter count of approximately 7.1B, trained on about 200M video clips.
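To make these controls concrete, here is a minimal sketch of what a first-frame-conditioned generation call might look like. The official inference entry points are distributed through TuSimple's repository and ComfyUI workflows; the `pipeline` object and the parameter names below (`start_img`, `motion`, `camera_direction`) are illustrative assumptions, not the confirmed API.

```python
# Hypothetical sketch of an image-to-video call with Ruyi-Mini-7B.
# The parameter names are assumptions for illustration; consult the
# official repository / ComfyUI workflow for the real interface.
from PIL import Image

def generate_video(pipeline, first_frame_path: str) -> list:
    """Generate up to ~5 s of video conditioned on one keyframe."""
    first_frame = Image.open(first_frame_path).convert("RGB")

    video_frames = pipeline(
        start_img=first_frame,      # first-frame conditioning
        width=1024, height=1024,    # any size from 384x384 up to 1024x1024
        num_frames=120,             # 120 frames = ~5 s at 24 fps
        motion="auto",              # hypothetical motion-amplitude setting
        camera_direction="static",  # one of the five camera-control modes
    )
    return video_frames
```

Note that 120 frames over 5 seconds implies output at 24 fps, which is why the frame count and duration limits coincide.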

Despite Ruyi's significant technical advances, some issues remain, such as hand distortion, facial detail collapse in multi-person scenes, and uncontrollable scene transitions. TuSimple is actively working to address these in future updates.

Looking ahead, TuSimple plans to continue focusing on scene-specific requirements, achieve breakthroughs in direct CUT generation, and offer two versions in the next release to meet the needs of different creators. The company is committed to using large models to reduce the development cycle and cost of anime and game content. Ruyi can already generate 5 seconds of content from an input keyframe, or generate the intermediate transition between two keyframes, shortening the production cycle (see the sketch below).
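As a companion to the earlier example, the sketch below shows how the two-keyframe transition workflow might be invoked: given a first and a last frame, the model fills in the intermediate frames. Again, the `start_img`/`end_img` parameter names are assumptions for illustration rather than the confirmed API.

```python
# Hypothetical sketch of two-keyframe interpolation with Ruyi.
# Parameter names are illustrative assumptions, not the confirmed API.
from PIL import Image

def interpolate_keyframes(pipeline, first_path: str, last_path: str) -> list:
    """Generate the transition frames between two keyframes."""
    first = Image.open(first_path).convert("RGB")
    last = Image.open(last_path).convert("RGB")

    # First-and-last-frame control: the model synthesizes the
    # in-between motion connecting the two supplied keyframes.
    return pipeline(
        start_img=first,
        end_img=last,
        num_frames=120,  # up to ~5 s of transition at 24 fps
    )
```

In an animation pipeline, this is the step that replaces hand-drawn in-betweening: artists supply only the key poses, and the model produces the connecting footage.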

Hugging Face link: https://huggingface.co/IamCreateAI/Ruyi-Mini-7B