Volcano Engine recently unveiled a significant innovation at its Video Cloud Technology Conference: a video preprocessing solution for large-scale model training. The solution has already been applied to the Doubao video generation model, marking a major advance in AI video generation.
Tan Dai, President of Volcano Engine, emphasized that AIGC and multimodal technologies are profoundly transforming user experiences. Drawing on TikTok's practical experience, Volcano Engine is actively exploring the integration of AI large models with video technology to provide comprehensive solutions for businesses.
Wang Yue, Head of Video Architecture at TikTok Group, pointed out that large-scale model training faces numerous challenges: the high cost of processing massive amounts of data, uneven sample quality, complex processing pipelines, and the difficulty of scheduling heterogeneous computing resources.
To address these challenges, Volcano Engine built its preprocessing solution on BMF, its self-developed multimedia processing framework, together with heterogeneous computing resources from Intel. The solution is optimized at both the algorithmic and engineering levels, enabling efficient processing of vast amounts of video data and significantly improving model training efficiency.
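For readers unfamiliar with BMF, the sketch below shows what a simple video preprocessing step built on its Python API might look like, following the graph-based decode/filter/encode pattern from BMF's public examples. The file paths, target resolution, and encoder parameters are placeholder assumptions for illustration, not details of Volcano Engine's production pipeline.

```python
import bmf

def preprocess_clip(input_path: str, output_path: str) -> None:
    # Build a BMF processing graph: decode -> scale -> encode.
    graph = bmf.graph()

    # Decode the source clip; the result exposes 'video' and 'audio' streams.
    streams = graph.decode({'input_path': input_path})

    (
        streams['video']
        .scale(1280, 720)  # normalize to a uniform resolution (assumed value)
        .encode(streams['audio'], {
            'output_path': output_path,
            'video_params': {'codec': 'h264'}  # placeholder encoder settings
        })
        .run()  # execute the graph synchronously
    )

if __name__ == '__main__':
    # Placeholder file names for illustration only.
    preprocess_clip('raw_clip.mp4', 'preprocessed_clip.mp4')
```

In a training-data scenario, a step like this would typically be fanned out across many workers, with additional filtering or quality-scoring modules inserted into the graph before encoding.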
Additionally, Volcano Engine has open-sourced BMF lite, a lighter and more versatile edition of the framework for on-device post-processing, with support for edge-side access to large models and operator acceleration.
Notably, the Doubao video generation model PixelDance, released on September 24, has adopted this preprocessing solution. Built on the DiT architecture, the model handles complex interactions among multiple subjects and maintains content consistency across multiple camera cuts. The Doubao video generation model is currently available for enterprise testing through Volcano Engine.