llm-colosseum

Evaluate large language models through Street Fighter 3 battles.

Common Product, Programming, Artificial Intelligence, Benchmarking
llm-colosseum is a benchmarking tool that uses the game Street Fighter 3 to assess the real-time decision-making of large language models (LLMs). Unlike traditional benchmarks, it pits models against each other in simulated game scenarios, testing how quickly they respond, how well they strategize, how creatively they play, and how adaptable and resilient they are as a match unfolds.

llm-colosseum Visit Over Time

Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42

llm-colosseum Alternatives