Embarrassing! Google Exposed for Using Claude Model for Comparison Testing to Improve Gemini AI

AIbase基地

Published in AI News · 5 min read · Dec 25, 2024

Google has been comparing the output of its Gemini AI with that of Anthropic's Claude model in an effort to improve Gemini's performance. Internal communications obtained by TechCrunch indicate that contractors responsible for improving Gemini are systematically evaluating the responses of both AI models.

In the AI industry, model performance is typically evaluated with industry benchmarks rather than by having contractors compare the answers of different models one by one. The contractors working on Gemini must score each model output against multiple criteria, including accuracy and detail. They have up to 30 minutes per comparison to decide which response is better, Gemini's or Claude's.

Recently, these contractors noticed that references to Claude were appearing frequently on the internal platform they use. Some content presented to them explicitly stated: "I am Claude, created by Anthropic." In an internal chat, contractors also observed that Claude's responses placed more emphasis on safety than Gemini's. Some contractors said that Claude's safety settings are the strictest among all AI models. In certain cases, Claude declined to respond to prompts it deemed unsafe, such as requests to role-play as a different AI assistant. In another instance, Claude avoided answering a prompt, while Gemini's response to it was flagged as a "major safety violation" for containing "nudity and bondage" content.

Anthropic's commercial terms of service prohibit customers from using Claude to "build competitive products or services" or "train competing AI models" without Anthropic's authorization. Google is one of Anthropic's major investors.

In an interview with TechCrunch, Google DeepMind spokesperson Shira McNamara did not disclose whether Google had obtained Anthropic's approval to use Claude. McNamara stated that DeepMind does compare model outputs for evaluation but has not trained Gemini using the Claude model. She said, "Of course, as per industry standard practices, we do compare model outputs in certain cases. However, any claims about us training Gemini using Anthropic models are inaccurate."

Last week, TechCrunch also exclusively reported that Google's contractors were being asked to score Gemini's responses in areas outside their expertise. In internal communications, some contractors expressed concern that Gemini might generate inaccurate information on sensitive topics such as healthcare.

Key Points:
🌟 Google contractors are comparing Gemini's responses with Claude's in an effort to improve Gemini's performance.
🔍 Contractors score the responses against multiple criteria, including accuracy and safety.
🚫 Anthropic prohibits customers from using Claude to train competing models without authorization.

This article is from AIbase Daily

—— Created by the AIbase Daily Team
