According to the official WeChat account of the Doubao large-model team, "VideoWorld", an experimental video generation model developed by the Doubao team jointly with Beijing Jiaotong University and the University of Science and Technology of China, has recently been officially open-sourced.

The standout feature of this model is that it does not rely on a traditional language model: it recognizes and understands the world through visual information alone. The research was inspired by an observation Professor Fei-Fei Li made in her TED talk, that children can understand the real world without relying on language.

"VideoWorld" achieves complex reasoning, planning, and decision-making abilities by analyzing and processing a large amount of video data. Experiments conducted by the research team show that the model has achieved significant results with only 300M parameters. Unlike existing models that rely on language or labeled data, VideoWorld can learn knowledge independently, particularly providing a more intuitive learning approach in complex tasks like origami and tying ties.

To validate the model's effectiveness, the research team set up two experimental environments: Go matches and simulated robot control. Go, a highly strategic game, effectively tests the model's ability to learn rules and reason, while the robot tasks evaluate its control and planning. During training, the model gradually builds the ability to predict future scenes by watching large amounts of video demonstration data.
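
The training objective here is essentially next-frame prediction. Below is a minimal sketch, assuming a VQ-style tokenizer has already converted each frame into discrete token ids, of what such a setup could look like in PyTorch; all names, sizes, and architectural details are illustrative assumptions, not the team's actual implementation.

```python
# Hypothetical sketch: autoregressive next-frame prediction on video tokens.
# Assumes frames were quantized to discrete ids by a VQ-style visual
# tokenizer (not shown); vocabulary and grid sizes are illustrative.
import torch
import torch.nn as nn

VOCAB_SIZE = 8192        # size of the visual codebook (assumed)
TOKENS_PER_FRAME = 256   # e.g. a 16x16 latent grid per frame (assumed)


class NextFrameTransformer(nn.Module):
    def __init__(self, dim=512, heads=8, layers=6, max_len=4096):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, dim)
        self.pos = nn.Parameter(torch.zeros(1, max_len, dim))
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, seq) ids for a clip of concatenated frames
        seq = tokens.shape[1]
        x = self.embed(tokens) + self.pos[:, :seq]
        # causal mask so each position only attends to earlier tokens
        mask = torch.triu(torch.full((seq, seq), float("-inf")), diagonal=1)
        return self.head(self.encoder(x, mask=mask))


model = NextFrameTransformer()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# One illustrative step on random ids standing in for tokenized video.
clip_tokens = torch.randint(0, VOCAB_SIZE, (2, 4 * TOKENS_PER_FRAME))
logits = model(clip_tokens[:, :-1])        # predict token t from tokens < t
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), clip_tokens[:, 1:].reshape(-1))
loss.backward()
opt.step()
```

At inference time, a model trained this way can roll out future frame tokens autoregressively, which is what allows it to plan by predicting what comes next.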

To improve the efficiency of video learning, the team introduced a Latent Dynamics Model (LDM) that compresses the visual changes between video frames into compact codes, extracting the key information. This not only reduces redundant information but also improves the model's efficiency at learning complex knowledge. With this design, VideoWorld performs strongly on both Go and robot tasks, even reaching the level of a professional 5-dan Go player.
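
To make the idea concrete, here is a minimal sketch of a latent dynamics model under assumed dimensions: instead of modeling every token of the next frame, an encoder compresses the change between two frame embeddings into a short code, and a decoder verifies that the code suffices to reconstruct the next frame. Module names and sizes are hypothetical, not the paper's actual design.

```python
# Hypothetical sketch: compress the change between two frame embeddings
# into a short code rather than keeping every frame token.
import torch
import torch.nn as nn

FRAME_DIM = 512   # per-frame feature size (assumed)
CODE_DIM = 64     # compact "what changed" code (assumed, << FRAME_DIM)


class LatentDynamicsModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder sees both frames and summarizes only their difference.
        self.encoder = nn.Sequential(
            nn.Linear(2 * FRAME_DIM, 256), nn.GELU(),
            nn.Linear(256, CODE_DIM),
        )
        # Decoder reconstructs the next frame from the current frame + code.
        self.decoder = nn.Sequential(
            nn.Linear(FRAME_DIM + CODE_DIM, 256), nn.GELU(),
            nn.Linear(256, FRAME_DIM),
        )

    def forward(self, frame_t, frame_next):
        code = self.encoder(torch.cat([frame_t, frame_next], dim=-1))
        recon = self.decoder(torch.cat([frame_t, code], dim=-1))
        return recon, code


ldm = LatentDynamicsModel()
f_t, f_next = torch.randn(4, FRAME_DIM), torch.randn(4, FRAME_DIM)
recon, code = ldm(f_t, f_next)
# Training would minimize reconstruction error; the compact `code` then
# stands in for the frame-to-frame change during downstream learning.
loss = nn.functional.mse_loss(recon, f_next)
loss.backward()
```

The design choice is that most pixels barely change between adjacent frames, so a small code capturing only the change carries far less redundancy than a full set of frame tokens.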

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper Link: https://arxiv.org/abs/2501.09781

Code Link: https://github.com/bytedance/VideoWorld

Project Homepage: https://maverickren.github.io/VideoWorld.github.io

Key Highlights:

🌟 The "VideoWorld" model can achieve knowledge learning solely through visual information, without relying on language models.  

🤖 The model exhibits exceptional reasoning and planning abilities in Go and robot simulation tasks.  

🔓 The project code and model have been open-sourced; participation and feedback from the community are welcome.