
Google Releases VideoPoet Video Generation Model, Supporting Up to Ten Seconds of Video and Audio Generation

36氪

Published in AI News · 1 min read · Dec 22, 2023


On December 19, Google unveiled VideoPoet, a video generation model that can produce videos up to 10 seconds long and automatically generate matching music and sound effects from the video content. VideoPoet lengthens a clip by repeatedly predicting the next frame after the last one, which gives users the impression that a video can be extended indefinitely. Unlike most other video generation models, VideoPoet is built on a large language model rather than a diffusion model. It combines text-to-video, video restoration, and video stylization in a single model, which makes it more flexible to use.
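The extension-by-prediction idea can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not Google's implementation: `extend_video`, `toy_predictor`, the `context` window size, and the representation of a frame as a list of floats are all hypothetical names and choices introduced here for illustration. A real model would predict video tokens with a learned network rather than a hand-written rule.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a frame; a real system would use image tensors or tokens.
Frame = List[float]


def extend_video(
    frames: Sequence[Frame],
    predict_next: Callable[[Sequence[Frame]], Frame],
    extra_frames: int,
    context: int = 4,
) -> List[Frame]:
    """Append `extra_frames` frames, each predicted from the most recent ones."""
    out = list(frames)
    for _ in range(extra_frames):
        window = out[-context:]           # condition only on the latest frames
        out.append(predict_next(window))  # the prediction becomes new context
    return out


def toy_predictor(window: Sequence[Frame]) -> Frame:
    """Toy rule (an assumption, not a learned model): shift the last frame by 1."""
    return [value + 1.0 for value in window[-1]]


clip = [[0.0, 0.0], [1.0, 1.0]]
extended = extend_video(clip, toy_predictor, extra_frames=3)
print(len(extended))
```

Because each predicted frame is fed back in as context, the loop can run for as many steps as desired, which is why the output can appear to be extendable without a fixed limit.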

Tags: Video Generation, Text to Video, Multimodal

This article is from AIbase Daily
