Apple Clarifies: YouTube Caption Data Not Used for Apple Intelligence; OpenELM Exclusively for Research Purposes

AIbase基地

Published in AI News · 4 min read · Jul 18, 2024


A recent investigation revealed that several tech giants, including Apple, used YouTube video subtitles to train AI models. The data covered more than 170,000 videos, including content from well-known creators such as MKBHD and MrBeast. Apple used this data to train its open-source model OpenELM, which was released in April of this year.


In response, Apple stated publicly that OpenELM does not power any of its AI or machine learning features, including Apple Intelligence. Apple emphasized that it developed OpenELM to contribute to the research community and to advance open-source large language models. Apple researchers had previously described OpenELM as a "state-of-the-art open language model."

Apple stated that OpenELM is used only for research and does not support any Apple Intelligence features. The model is released as open source and is available from Apple's machine learning research website. It follows that the "YouTube subtitles" dataset has not been used for Apple Intelligence. Apple previously stated that the Apple Intelligence models were "trained on licensed data, including data selected for specific functions and publicly available data collected through web crawlers."

Apple also currently has no plans to develop a new version of OpenELM. Wired reported that, besides Apple, companies such as Anthropic and NVIDIA also used the "YouTube subtitles" dataset to train their AI models. The dataset is part of "The Pile," a large dataset compiled by the non-profit organization EleutherAI.

The incident has sparked discussion about where AI training data comes from and what that means for privacy and copyright. Although Apple has clarified how OpenELM is used, the practice of tech companies training AI models on publicly available data remains a concern.




This article is from AIbase Daily
