Anthropic has launched a program to fund the development of new types of benchmark tests to evaluate the performance and impact of AI models, including generative models like its own Claude.
Announced by Anthropic on Monday, the program will provide funding to third-party organizations that can "effectively measure the advanced capabilities of AI models," as the company put it in a blog post. Interested parties can submit applications, which will be evaluated on a rolling basis.
"Our investment in these evaluations is intended to elevate the entire AI safety field, providing valuable tools that benefit the entire ecosystem," Anthropic wrote on its official blog. "Developing high-quality, safety-related evaluations remains challenging, with demand outpacing supply."
As we have noted before, AI has a benchmarking problem. The benchmarks most commonly cited today often do a poor job of capturing how ordinary people actually use the systems being tested. Moreover, some benchmarks, particularly those released before the advent of modern generative AI, are so dated that they may no longer measure what they claim to measure.
Anthropic's proposed solution, a lofty one that is harder than it sounds, is to create challenging benchmarks focused on AI safety and societal impact, built on new tools, infrastructure, and methods.
The company specifically calls for tests that assess a model's ability to carry out tasks such as executing cyberattacks, "enhancing" weapons of mass destruction (such as nuclear weapons), and manipulating or deceiving people (for example, through deepfakes or misinformation). Anthropic says it is committed to developing an "early warning system" for AI risks relating to national security and defense, although the blog post did not disclose what such a system might involve.
Anthropic also said the new program will support research into benchmarks and "end-to-end" tasks that probe AI's potential for aiding scientific research, conversing in multiple languages, mitigating ingrained biases, and filtering out toxicity.
To that end, Anthropic envisions a new platform that lets subject-matter experts develop their own evaluations, along with large-scale trials of models involving "thousands" of users. The company said it has hired a full-time coordinator for the program and may purchase or expand promising projects.
Anthropic's effort to support new AI benchmarks is commendable, assuming, of course, that sufficient funding and staffing stand behind it. But given the company's commercial ambitions in the AI race, it may be difficult to trust it completely.
Anthropic also said it hopes the program will serve as a "catalyst for progress" toward making comprehensive AI evaluation an industry standard. That is a mission many open, corporate-unaffiliated efforts can get behind. Whether those efforts are willing to join forces with an AI vendor whose loyalty ultimately lies with its shareholders, however, remains to be seen.
Key Points:
- 📌 Anthropic has launched a program to fund new types of benchmark tests that evaluate the performance and impact of AI models.
- 📌 The program aims to create challenging benchmark tests focusing on AI safety and social impact.
- 📌 Anthropic hopes the program will serve as a "catalyst for progress" toward making comprehensive AI evaluation an industry standard.