Apollo-LMMs
Exploration of Video Understanding in Large Multimodal Models
Common Product, Video, Video Understanding, Multimodal Models
Apollo is a family of large multimodal models (LMMs) focused on video understanding. The work systematically explores the design space of video-LMMs, identifying the key factors that drive performance and offering practical guidance for building effective models. Its central finding, 'Scaling Consistency', is that design decisions made on smaller models and datasets transfer reliably to larger models, which significantly reduces computational costs. Apollo's main contributions are these transferable design decisions, optimized training schedules and data mixtures, and ApolloBench, a new benchmark for effective evaluation.