In a new "red team" report, OpenAI documents an investigation into the strengths and risks of GPT-4o, revealing some of the model's peculiar quirks. For instance, in rare cases, particularly when users interact with GPT-4o in high-background-noise environments such as a moving car, the model may "mimic the user's voice." OpenAI suggests this could stem from the model's difficulty in understanding distorted speech.

To be clear, GPT-4o does not currently exhibit this behavior, at least not in the advanced voice mode. An OpenAI spokesperson told TechCrunch that the company has implemented "system-level mitigations" against it.

GPT-4o can also generate disturbing or inappropriate "non-verbal vocalizations" and sound effects, such as pornographic moans, violent screams, and gunshots, in response to specific prompts. OpenAI notes that there is evidence the model generally refuses requests to generate sound effects, but acknowledges that some requests do get through.


GPT-4o could also infringe on music copyrights, or would, were it not for the filters OpenAI has put in place to prevent this. In the report, OpenAI states that it has instructed GPT-4o not to sing in the limited alpha of the advanced voice mode, presumably to avoid replicating the style, tone, and/or timbre of recognizable artists.


This implies (though does not directly confirm) that OpenAI trained GPT-4o on copyrighted material. It remains unclear whether OpenAI intends to lift these restrictions when the advanced voice mode rolls out to more users in the fall, as previously announced.

In the report, OpenAI writes: "To account for GPT-4o's audio modality, we updated certain text-based filters to work on audio conversations and built filters to detect and block outputs containing music. We trained GPT-4o to refuse requests for copyrighted content, including audio, consistent with our broader practices."
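The approach described in the quote, i.e. running existing text-based policy filters over the audio conversation and adding a separate check for musical output, can be sketched roughly as follows. This is a minimal illustrative mock-up, not OpenAI's implementation: the function names, the keyword list, and the music-score threshold are all assumptions for the sake of the example.

```python
from dataclasses import dataclass

# Stand-in for real text-based policy filters (hypothetical terms).
BLOCKED_TERMS = {"gunshot", "scream"}


@dataclass
class AudioTurn:
    transcript: str    # ASR transcript of the audio turn
    music_score: float # 0..1 score from a hypothetical music classifier


def text_filter(transcript: str) -> bool:
    """Return True if the transcript trips a text policy rule."""
    words = transcript.lower().split()
    return any(term in words for term in BLOCKED_TERMS)


def music_filter(score: float, threshold: float = 0.8) -> bool:
    """Return True if the output is likely sung/musical content."""
    return score >= threshold


def moderate(turn: AudioTurn) -> str:
    """Block the turn if either the text or the music filter fires."""
    if text_filter(turn.transcript) or music_filter(turn.music_score):
        return "block"
    return "allow"
```

The point of the two-stage design is that transcribing audio lets existing text classifiers be reused unchanged, while content that survives transcription (such as singing) needs its own dedicated detector.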

Notably, OpenAI recently stated that it would be "impossible" to train today's leading models without copyrighted material. While the company holds multiple licensing agreements with data providers, it also argues that fair use is a reasonable defense against accusations of training on IP-protected data, including things like songs, without permission.

Given OpenAI's interests, it is no surprise that the red team report paints an overall picture of a model made safer through various mitigations and safeguards. For example, GPT-4o refuses to identify people by their speech patterns and declines loaded questions such as "How intelligent is this speaker?" It also blocks prompts containing violent or sexually suggestive language and disallows certain categories of content entirely, such as discussions of extremism and self-harm.

References:

https://openai.com/index/gpt-4o-system-card/

https://techcrunch.com/2024/08/08/openai-finds-that-gpt-4o-does-some-truly-bizarre-stuff-sometimes/