Recently, Answer.AI and LightOn jointly released the open-source language model ModernBERT, which is a significant upgrade over Google's BERT. According to the developers, ModernBERT has made remarkable improvements in processing speed, efficiency, and quality. The model can operate four times faster than its predecessor while using less memory.

The design of ModernBERT allows it to handle texts of up to 8192 tokens, a 16-fold increase over the typical 512-token limit of existing encoder models. Additionally, ModernBERT is the first encoder model trained extensively on programming code, scoring over 80 on the StackOverflow Q&A dataset and setting a new record for encoder models on that benchmark.


On the General Language Understanding Evaluation (GLUE) benchmark, ModernBERT-Large strikes a strong balance between speed and accuracy, processing tokens in about 20 milliseconds each while scoring 90. The development team likens ModernBERT to a finely tuned Honda Civic, emphasizing its reliability and efficiency in everyday applications.

Compared to existing large language models like GPT-4, ModernBERT significantly reduces the cost of large-scale text processing. Each GPT-4 query costs several cents, whereas ModernBERT can run locally, making it both faster and cheaper. For example, the FineWeb Edu project spent $60,000 filtering 15 billion tokens with a BERT model; doing the same with Google's Gemini Flash decoder would have cost over $1 million.
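A quick back-of-the-envelope calculation with the figures above shows the per-token gap between the two approaches:

```python
# Figures quoted above for the FineWeb Edu filtering job.
tokens_processed = 15_000_000_000   # 15 billion tokens

bert_total_cost = 60_000            # USD, BERT-based filtering
gemini_total_cost = 1_000_000       # USD, quoted Gemini Flash estimate (lower bound)

# Normalize both to cost per million tokens.
bert_per_million = bert_total_cost / tokens_processed * 1_000_000
gemini_per_million = gemini_total_cost / tokens_processed * 1_000_000

print(f"BERT filtering:   ${bert_per_million:.2f} per million tokens")    # → $4.00
print(f"Gemini Flash:    >${gemini_per_million:.2f} per million tokens")  # → >$66.67
```

At these rates the encoder approach is roughly 16x cheaper per token, before even counting the option of running ModernBERT on hardware you already own.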

The development team states that ModernBERT is well-suited for various practical applications, including retrieval-augmented generation (RAG) systems, code search, and content moderation. Unlike GPT-4, which requires specialized hardware, ModernBERT can run efficiently on standard consumer-grade gaming GPUs.
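As a rough illustration of how an encoder model slots into a RAG retriever, one common pattern is mean-pooled embeddings compared by cosine similarity. This is a sketch, not the team's own recipe; it assumes the `transformers` library (ModernBERT support landed in recent releases) and the `answerdotai/ModernBERT-base` model id from the Hugging Face release:

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "answerdotai/ModernBERT-base"  # model id from the Hugging Face release
tok = AutoTokenizer.from_pretrained(model_id)
enc = AutoModel.from_pretrained(model_id)

def embed(texts):
    """Mean-pool the last hidden states into one unit vector per text."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state        # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)       # zero out padding positions
    pooled = (hidden * mask).sum(1) / mask.sum(1)
    return torch.nn.functional.normalize(pooled, dim=1)

docs = ["ModernBERT handles sequences of up to 8192 tokens.",
        "The Honda Civic is a compact car."]
query_vec = embed(["How long a context can ModernBERT encode?"])
doc_vecs = embed(docs)
scores = query_vec @ doc_vecs.T                        # cosine similarities
print(docs[scores.argmax()])
```

In a real RAG system the document vectors would be precomputed and stored in a vector index; the per-query work is just one encoder forward pass plus a similarity search.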

Currently, ModernBERT comes in two versions: a base model with 139 million parameters and a large version with 395 million parameters. Both are now available on Hugging Face, and users can directly replace their existing BERT models with them. The development team plans to release a larger version next year but has no plans for multimodal capabilities. To encourage new applications, they have also launched a competition that will reward the creators of the top five demos with $100 and a six-month Hugging Face Pro subscription.
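Because both checkpoints live on Hugging Face, swapping an existing BERT pipeline for ModernBERT is, in principle, a one-line change of the model id. A minimal sketch, assuming the `transformers` library (a recent version with ModernBERT support) and the `answerdotai/ModernBERT-base` checkpoint from the release:

```python
from transformers import pipeline

# Before: a classic BERT fill-mask pipeline.
# fill = pipeline("fill-mask", model="bert-base-uncased")

# After: identical code, pointing at the ModernBERT checkpoint
# (model id from the Hugging Face release).
fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

preds = fill("The capital of France is [MASK].")
for p in preds:
    print(p["token_str"], round(p["score"], 3))
```

Fine-tuned classifiers and embedding models built on top of BERT can be migrated the same way, by retraining the task head against the new backbone.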

Since Google launched BERT in 2018, it has been one of the most popular language models, with over 68 million downloads per month on Hugging Face.

Project link: https://huggingface.co/blog/modernbert

Key Points:

🌟 ModernBERT is four times faster than BERT and can handle texts of up to 8192 tokens.

💰 Compared to GPT-4, ModernBERT significantly reduces costs for large-scale text processing and operates more efficiently.

📊 The model excels at handling programming code, scoring over 80 on the StackOverflow Q&A dataset, setting a new record.