Google Upgrades Its AI Video Generation Model to Veo 2: 4K Resolution and Better Human Preference Scores than Sora

AIbase基地

Published in AI News · 6 min read · Dec 17, 2024

Google has officially released its next-generation video generation model, Veo 2, positioned as a competitor to OpenAI's Sora. Google claims that Veo 2 generates video with greater realism and detail. Beyond higher video quality, the update improves the model's grasp of real-world physics and of the nuances of human movement and expression.

The Powerful Features of Veo 2

Google states that Veo 2 has a "better understanding of the physics of the real world and the subtleties of human movement and expressions." The model is available through the VideoFX platform on Google Labs, but only to users on the waiting list. Users must sign up through a Google form and wait for Google to temporarily grant access at a time of its choosing.

According to Google, users can choose the video type, shots, and cinematic effects that suit their needs, and Veo 2 can generate videos at resolutions up to 4K.

The original Veo, meanwhile, remains available on Vertex AI. Videos generated by Veo 2 carry Google's SynthID watermark, which identifies them as AI-generated content.

Despite these technical advances, Google acknowledges that Veo 2 still occasionally produces "hallucinations," such as extra fingers, although it says the new version produces them less often.

Google stated that in its internal testing, human evaluators preferred Veo 2 over Sora and other competing AI models on two measures: "overall preference" (which videos viewers prefer) and "prompt adherence" (how closely a video matches the instructions its creator gave).

Try Veo 2: https://labs.google/fx/tools/video-fx

AI Video Generation Still Needs Improvement

AI video generation technology keeps advancing, and competition between major players like Google and OpenAI remains fierce. Smaller companies, such as RunwayML and Luma AI, are also steadily rolling out new features to strengthen their video generation capabilities.

Recently, RunwayML launched advanced control features for its Gen-3 Alpha Turbo model, while Pika Labs released Pika 2.0, which lets users add their own characters to videos.

However, users still do not fully trust AI-generated video. The release of Sora did not dispel that skepticism: some users reported physical and anatomical inconsistencies in the generated results. In addition, at a recent gaming awards ceremony, the audience voiced dissatisfaction with what was described as the "AI chaos" of AI-generated content.

Google said that Veo 2 has drawn strong interest from users, particularly from some YouTube creators who already use VideoFX to generate backgrounds for their short videos and save time.

Update on Imagen 3

Alongside Veo 2, Google has also updated its image generation model, Imagen 3, making generated images more realistic and more vivid. The new version of Imagen 3 renders a range of artistic styles more accurately, from photorealism and impressionism to abstract and anime styles. The model also follows user instructions more closely.

Key Points:
- 🎥 Google releases the Veo 2 video generation model, claiming that human evaluators prefer its videos to those of OpenAI's Sora.
- 🚀 Users can apply for access to Veo 2 through a Google form and choose styles and effects when generating videos.
- 🎨 The updated Imagen 3 image generation model renders a wider range of artistic styles more accurately and follows instructions more closely.



This article is from AIbase Daily
