AI startup Cosine has launched an AI programmer named "Genie," which has outperformed both Devin (built by Cognition) and SWE-agent running on GPT-4 on the SWE-Bench benchmark, positioning it as the strongest AI coding assistant measured there.

Genie scored 30.08% on the SWE-Bench benchmark, more than double Devin's 13.8% and SWE-agent + GPT-4's 12.47%.

How did Genie achieve this? As early as December 2022, Cosine co-founder Alistair Pullen showcased the project at the University of London. He aimed to create an AI that could write, debug, and optimize code automatically, the way a human engineer does. After more than a year of development, Genie entered the testing phase, and the company secured $2.5 million in seed funding.

Pullen noted that Genie's success is closely tied to its training data and methods. Unlike conventional fine-tuning of large models, Genie is trained on a special dataset that captures human programmers' reasoning processes. This data records how engineers discover information step by step and make decisions case by case, which lets Genie exhibit human-like engineering judgment when faced with complex problems.

Additionally, Genie employs a "self-improvement mechanism." Initially, Genie was trained only on high-quality, "perfect" data, which left it weak at assessing and correcting its own mistakes. To address this, the developers used Genie itself to generate synthetic data, further enriching the training content. This is akin to a mother teaching a child to walk, providing correct guidance after each fall.
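The article does not describe how this synthetic data is actually built, so the following is only a minimal sketch of one way such a loop could look, under stated assumptions: the model proposes a solution for each task, and any failed attempt is paired with a known-good solution to form a correction example. Every name here (`Task`, `CorrectionExample`, `build_correction_data`, `toy_propose`, `toy_passes`) is hypothetical and not taken from Cosine's code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    prompt: str
    reference: str  # known-good solution (assumed to be available)


@dataclass
class CorrectionExample:
    prompt: str
    faulty_attempt: str  # what the current model produced
    corrected: str       # the known-good solution it should have reached


def build_correction_data(
    tasks: List[Task],
    propose: Callable[[str], str],
    passes: Callable[[Task, str], bool],
) -> List[CorrectionExample]:
    """Collect (faulty attempt -> correct solution) pairs from model failures."""
    examples = []
    for task in tasks:
        attempt = propose(task.prompt)
        if not passes(task, attempt):
            # Keep the failure: it shows the model a mistake and its fix.
            examples.append(
                CorrectionExample(task.prompt, attempt, task.reference)
            )
    return examples


# Toy stand-ins for a real model and a real test harness (illustrative only).
def toy_propose(prompt: str) -> str:
    return "return a - b" if "add" in prompt else "return a * b"


def toy_passes(task: Task, attempt: str) -> bool:
    return attempt == task.reference


tasks = [
    Task("add two numbers", "return a + b"),
    Task("multiply two numbers", "return a * b"),
]
data = build_correction_data(tasks, toy_propose, toy_passes)
for example in data:
    print(example)
```

In this sketch only the failed "add" task yields a training example; repeating the loop with a retrained model would correspond to the multiple rounds of iterative training the article mentions.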

After multiple rounds of iterative training, Genie's capabilities have significantly improved, even demonstrating creative solutions to unseen problems. Functionally, Genie supports various development tasks, including feature development, bug fixing, code refactoring, and testing, covering dozens of programming languages such as JavaScript, Python, and Java.

Genie is now open for trial applications. Users can register through the official website, with test permissions expected to be issued in the coming weeks.

Official Blog: https://cosine.sh/blog/state-of-the-art

Experience Link: https://cosine.sh/register

Key Points:

🌟 Genie scored 30.08% on the SWE-Bench test, outperforming Devin (13.8%) and SWE-agent + GPT-4 (12.47%).

🚀 Utilizes a special dataset and self-improvement mechanism, performing exceptionally in complex coding tasks.

📝 Currently open for trial applications, with more exciting features to be introduced in the future!