2024-08-15 14:53:25 · AIbase · 11.1k
OpenAI Launches SWE-bench Verified: Enhancing AI Software Engineering Capability Assessment
OpenAI has released SWE-bench Verified, a benchmark intended to assess AI performance on software engineering tasks more accurately. It addresses limitations of the original SWE-bench, such as overly strict unit tests, ambiguous problem descriptions, and development environments that are difficult to set up. The new benchmark also introduces a containerized Docker environment to make evaluation more consistent and reliable, which in turn raises the scores AI models achieve. GPT-4o solved 33.2% of the samples on the new benchmark, while the best open-source agent framework has...
2024-08-13 08:34:48 · AIbase · 11.0k
'Genie', Billed as the World's Strongest AI Programmer, Debuts, Beating Devin and GPT-4
AI startup Cosine has launched the AI programmer 'Genie', which scored 30.08% on the SWE-Bench benchmark, well ahead of Devin and GPT-4, and which the company bills as the world's strongest AI programming assistant. Co-founder Alistair Pullen showcased the project in 2022, aiming to create an AI that can automatically code, debug, and optimize like a human. After more than a year of development, Genie draws on specialized data that captures the reasoning processes of human programmers.