EVE

Encoder-free vision-language model, efficient and data-driven.

Tags: Common Product, Programming, Vision-language model, Encoder-free
EVE is an encoder-free vision-language model jointly developed by researchers from Dalian University of Technology, the Beijing Academy of Artificial Intelligence (BAAI), and Peking University. It performs well on images of different aspect ratios, outperforming Fuyu-8B and approaching the performance of modular encoder-based vision-language models. EVE is efficient in both data and training: it is pre-trained on 33M publicly available data samples, the EVE-7B model is fine-tuned on 665K LLaVA SFT samples, and the EVE-7B (HD) model uses an additional 1.2M SFT samples. Its development follows efficient, transparent, and practical strategies, pointing toward new paradigms for cross-modal, decoder-only architectures.

EVE Visit Over Time

Monthly Visits: 494,758,773
Bounce Rate: 37.69%
Pages per Visit: 5.7
Visit Duration: 00:06:29
