EVE
Encoder-free vision-language model, efficient and data-driven.
Common Product, Programming, Vision-language model, Encoder-free
EVE is an encoder-free vision-language model jointly developed by researchers from Dalian University of Technology, the Beijing Academy of Artificial Intelligence (BAAI), and Peking University. It handles images of arbitrary aspect ratios, outperforms Fuyu-8B, and approaches the performance of modular encoder-based vision-language models. EVE is efficient in both data and training: pre-training uses 33M publicly available samples, EVE-7B is fine-tuned on the 665K LLaVA SFT samples, and EVE-7B (HD) adds a further 1.2M SFT samples. EVE is built with efficient, transparent, and practical strategies, paving the way for pure decoder-only architectures across modalities.
EVE Visit Over Time
Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29