Lumiere

A spatio-temporal diffusion model for video generation

Common Product, Video, Video Synthesis, Text-to-Video

Lumiere is a text-to-video diffusion model designed to synthesize videos with realistic, diverse, and coherent motion, a key challenge in video synthesis. It introduces a spatio-temporal U-Net architecture that generates the entire temporal duration of a video at once, in a single pass through the model. This contrasts with existing video models, which synthesize distant keyframes and then apply temporal super-resolution, an approach that inherently makes global temporal consistency difficult to achieve. By deploying both spatial and, importantly, temporal downsampling and upsampling, and by leveraging a pre-trained text-to-image diffusion model, Lumiere learns to directly generate a full-frame-rate, low-resolution video by processing it at multiple spatio-temporal scales. The authors report state-of-the-art text-to-video generation results and show that the design readily supports a range of content creation and video editing tasks, including image-to-video, video inpainting, and stylized generation.
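The idea of downsampling and upsampling in time as well as in space can be illustrated with a small, dependency-free sketch. It is not Lumiere's implementation: the function names, the factor of 2, the number of levels, and the example clip size are all assumptions chosen for illustration. It only shows how a clip's (frames, height, width) shape shrinks at each encoder level when both axes are pooled, and how a per-frame signal can be pooled and restored along time.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (frames, height, width)


def space_time_pyramid(shape: Shape, levels: int, factor: int = 2) -> List[Shape]:
    """Return the clip shape at each encoder level of a hypothetical U-Net.

    Every level divides time AND space by `factor`. The factor and the level
    count are illustrative assumptions, not values taken from the source.
    """
    shapes = [shape]
    for _ in range(levels):
        t, h, w = shapes[-1]
        if t % factor or h % factor or w % factor:
            raise ValueError(f"shape {shapes[-1]} is not divisible by {factor}")
        shapes.append((t // factor, h // factor, w // factor))
    return shapes


def temporal_downsample(values: List[float], factor: int = 2) -> List[float]:
    """Average-pool a sequence of per-frame values along the time axis."""
    if len(values) % factor:
        raise ValueError("sequence length must be divisible by factor")
    return [sum(values[i:i + factor]) / factor
            for i in range(0, len(values), factor)]


def temporal_upsample(values: List[float], factor: int = 2) -> List[float]:
    """Nearest-neighbour upsampling in time: repeat each value `factor` times."""
    return [v for v in values for _ in range(factor)]


if __name__ == "__main__":
    # Hypothetical 16-frame, 64x64 clip passed through two encoder levels.
    encoder_shapes = space_time_pyramid((16, 64, 64), levels=2)
    print(encoder_shapes)        # [(16, 64, 64), (8, 32, 32), (4, 16, 16)]
    print(encoder_shapes[::-1])  # decoder path mirrors the encoder

    coarse = temporal_downsample([0.0, 2.0, 4.0, 6.0])
    print(coarse)                     # [1.0, 5.0]
    print(temporal_upsample(coarse))  # [1.0, 1.0, 5.0, 5.0]
```

Because the coarsest level still spans the whole clip (4 time steps covering all 16 frames in this toy case), every layer operates on the full duration, which is the contrast the description draws with keyframe-plus-temporal-super-resolution pipelines.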

Lumiere Visit Over Time

Monthly Visits: 25,633,376
Bounce Rate: 44.05%
Pages per Visit: 5.8
Visit Duration: 00:04:53

Lumiere Alternatives

VideoLLaMA 2 — An advanced spatio-temporal modeling and audio understanding model for video understanding.

STAR — A spatio-temporal enhancement framework that, for the first time, integrates powerful text-to-video diffusion priors into real-world video super-resolution.

Text-to-Video Generation — A better tool for evaluating text-to-video generation

Snap Video — An extensible spatiotemporal transformer for text-to-video synthesis.

Understanding Video Transformers — Conceptual discovery for explaining the decision-making process of video Transformers

Sora AI Video — A pure text-to-video generation model developed by Sora AI

Emu Video — AI-driven text-to-video generation

MotionDirector — Motion customization of text-to-video diffusion models

Stable Video Diffusion — Free and stable video diffusion model

CogVideoX — Text-to-video generation model

Midgenie — AI Video Dubbing and Text-to-Video App

SparseCtrl — Adds sparse control to text-to-video diffusion models

text2video — A one-click tool for converting text to video.

Allegro — Advanced text-to-video generation model

Open-Sora-Plan-v1.1.0 — Text-to-Video Generation Open Source Model, with Outstanding Performance

InstructVideo — A text-to-video instruction generation model.

Kling AI — A groundbreaking text-to-video generation model

Finalframe — An AI-driven video editing tool with text-to-video functionality

Show-1 — Combines pixel and latent diffusion models to achieve efficient, high-quality text-to-video generation.

Diffusion Priors — A diffusion prior based dynamic viewpoint synthesis model.

Morph Studio — Text-to-video AI, unleash your creativity!

DreamCloud — Text-to-Video AIGC Creation Platform

VideoTetris — An innovative framework for text-to-video generation

CogVideo — An open-source text-to-video generation model.

FlowVid — Optical Flow Guided Video Synthesis

Upscale-A-Video — Video super-resolution (upscaling) model

Video2Text — One-click video to text

Hotshot - ACT 1 — An advanced text-to-video synthesis system developed by Hotshot, aiming to empower the world to share their imagination through video.

Stable Video Diffusion 1.1 Image-to-Video — The SVD 1.1 Image-to-Video model generates short videos.
