SWE-bench Verified

AI model assessment tool for software engineering capabilities

Premium New Product, Programming, AI Assessment, Software Engineering
SWE-bench Verified is a human-validated subset of SWE-bench, released by OpenAI, designed to more reliably assess how well AI models solve real-world software issues. Each task gives the model a code repository and an issue description, and the model must generate a patch that resolves the described problem. The benchmark was developed to make evaluations of a model's ability to perform software engineering tasks autonomously more accurate, and OpenAI uses it as part of its Preparedness Framework when tracking the Medium risk level for model autonomy.

SWE-bench Verified Visits Over Time

Monthly Visits: 546,526,496
Bounce Rate: 56.81%
Pages per Visit: 2.1
Visit Duration: 00:01:39

SWE-bench Verified Alternatives