2024-09-10 11:03:27.AIbase.
AI Evaluation Made Easy! Hugging Face Launches LightEval to Help You Master Model Performance!
2024-09-05 08:43:35.AIbase.
ZhiYuan Research Institute Launches FlagEval Large Model Arena Featuring Text-to-Video Model Combat Evaluation Service
2024-09-03 13:42:26.AIbase.
DingTalk Launches Multiple 'Super Assistants', Including Super Work Order Assistant and Super Evaluation Assistant
2024-08-23 09:05:19.AIbase.
Baidu Smart Cloud Keway Passes the 'Intelligent Customer Service Based on Large Models' Evaluation by the China Academy of Information and Communications Technology
2024-08-13 08:11:01.AIbase.
The Compass Arena, a Large Model Evaluation Platform, Adds a Multi-Modal Large Model Competition Section
2024-08-07 14:14:43.AIbase.
Meta Launches 'Self-Taught Evaluator': NLP Model Evaluation Without Human Annotation, Outperforming Common LLMs Like GPT-4
2024-07-23 08:09:28.AIbase.
Baidu Smart Cloud Launches Financial AI App 'Zhijin' with Asset Evaluation Features
2024-07-12 11:10:22.AIbase.
OpenAI Unveils Initial AGI Evaluation Criteria: ChatGPT Rated at Stage One
2024-07-02 09:07:20.AIbase.
Anthropic Launches Initiative to Fund AI Evaluation Benchmark Development
2024-06-27 09:28:40.AIbase.
Hugging Face Updates Leaderboard Evaluation Rules, AI Assessment Enters New Phase
2024-06-24 10:24:59.AIbase.
Technical Evaluation Report Shows: Claude 3.5 Sonnet Model Achieves Professional Doctorate Level
2024-03-07 03:52:56.AIbase.
AI Model Evaluation Company Points Out Serious Infringement Issues with GPT-4, Microsoft Engineers Express Concerns Over Image Generation Features
2023-12-25 14:12:47.AIbase.
ZhiYuan Research Institute Releases Code Generation Training Dataset TACO
2023-12-18 10:58:58.AIbase.
Midjourney V6 Model to be Released Next Week, Community Style Evaluation Officially Launched
2023-12-12 16:20:29.AIbase.
Zhipu AI Releases Chinese LLM Alignment Evaluation Benchmark AlignBench
2023-11-30 09:52:30.AIbase.
Amazon AWS Launches Human Benchmark Testing Team to Improve AI Model Evaluation
2023-11-29 09:08:23.AIbase.
"Baimao Battle" Family's First, When Will Cheating in Large Model 'Scoring' Stop?
2023-11-13 08:59:01.AIbase.
ZhiYuan Research Institute Releases Open-Source JudgeLM Evaluation Model to Assess Various Large Models and Provide Scores
2023-11-02 15:21:41.AIbase.
Ant Group Releases Benchmark for Large Model Evaluation in the DevOps Field
2023-11-02 10:51:40.AIbase.