OpenAI Unveils New Image Generation Model, Challenging Google's One-Sentence Image Editing

AIbase基地

Published in AI News · 4 min read · Mar 26, 2025

OpenAI has announced that its most advanced image generator to date is now integrated into GPT-4o. OpenAI CEO Sam Altman shared his astonishment at the model's image generation capabilities on X (formerly Twitter), calling them unbelievable and encouraging users to unleash their creativity.

Highlights of the new feature include:

- Precise rendering of text content, delivering high-quality image results.

- Support for various input and output methods, encompassing text, images, and audio.

- Understanding complex instructions and context to create realistic first-person perspective images.

Unlike its predecessor, DALL-E, GPT-4o's image generation uses an autoregressive model built natively into ChatGPT. It can follow complex instructions involving 10 to 20 different objects, whereas competing models typically manage only 5 to 8.

Users only need to describe what they want in a concise prompt, optionally specifying details such as aspect ratio, color, or a transparent background, and the model generates the image. Rendering complex details may take a little longer, but the final results are worth the wait.
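To make the idea of a concise, detail-rich request concrete, here is a minimal Python sketch of how such a prompt could be assembled. It does not call any OpenAI API; `ImageRequest` and `build_prompt` are hypothetical names invented for this illustration, and the wording of the resulting prompt is only one possible phrasing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageRequest:
    """Hypothetical container for the details a user might mention."""
    subject: str
    aspect_ratio: Optional[str] = None   # e.g. "16:9"
    color: Optional[str] = None          # e.g. "teal"
    transparent_background: bool = False


def build_prompt(req: ImageRequest) -> str:
    """Join the subject and any optional details into one concise sentence."""
    details = []
    if req.aspect_ratio:
        details.append(f"aspect ratio {req.aspect_ratio}")
    if req.color:
        details.append(f"main color {req.color}")
    if req.transparent_background:
        details.append("transparent background")
    if not details:
        return f"{req.subject}."
    return f"{req.subject}, with " + ", ".join(details) + "."


prompt = build_prompt(
    ImageRequest("A flat icon of a paper plane", "1:1", "teal", True)
)
print(prompt)
```

The resulting string could then be typed into ChatGPT as-is; any field left unset is simply omitted from the sentence.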

At a launch event, a demonstrator showcased several specific examples. For instance, they transformed a group photo into an anime-style image; the model preserved the subjects' features while blending in the anime aesthetic. The demonstrator also requested a humorous comic page about relativity, and the generated comic was both well structured and entertaining.

OpenAI also emphasizes the safety of this feature. All generated images are tagged with C2PA metadata so that their origin can be traced, and the system is designed to block inappropriate requests.

Of course, OpenAI's image generation tool isn't without flaws. Areas for improvement include cropping, contextual understanding, and non-Latin text rendering. OpenAI says it will continue to improve these areas.

At the same time, Google released its powerful AI model, Gemini 2.5 Pro Experimental, showcasing significant advances in reasoning and coding capabilities. These developments highlight the intensifying competition in the AI field, with major tech giants continuously releasing more advanced technologies in a bid to dominate the "AI arms race."

OpenAI CEO Critiques Politeness as Wasteful in AI Interactions

OpenAI CEO Sam Altman recently argued that using polite language like "please" and "thank you" with chatbots like ChatGPT is wasteful, consuming excessive electricity and computing resources. Altman suggests that while politeness may be culturally expected or perceived to improve interaction quality, it unnecessarily burdens AI systems. Each instance of polite language adds to the computational load.

OpenAI's o3 Model Test Scores Questioned; Actual Performance Falls Far Short of Claims

OpenAI's recently released o3 AI model has sparked controversy over its benchmark performance. In December, OpenAI claimed that the model could correctly answer over a quarter of the highly challenging FrontierMath problems, but this contrasts starkly with recent independent results. Testing by Epoch AI found that the model achieved only a 10% success rate, significantly lower than advertised.

OpenAI's 4o Model Image Generation Now Supports Custom GPTs, Enhancing Personalized AI Creation

OpenAI recently announced that the image generation capabilities of its latest 4o model now support custom GPTs, offering users a more flexible and personalized AI creation experience. According to AIbase, this update allows developers and users to build customized GPTs based on the 4o model that generate high-quality image content tailored to specific needs. The announcement has sparked widespread discussion within the AI community and marks another significant advance by OpenAI in personalized AI tools. Technical details are available on the official OpenAI platform.

OpenAI Releases 34-Page Guide to Building Intelligent Agents: From Web Search to Code Generation

On April 17, 2025, OpenAI announced via social media the release of a 34-page "Practical Guide to Intelligent Agents," offering developers comprehensive guidance on building agent applications. This marks another significant advancement in OpenAI's efforts to promote the practical application and standardization of AI technology. According to the announcement, the guide details how to utilize OpenAI's Responses API to build agents capable of web search, file search, and computer interaction. The Responses API is an evolution of Cha...

OpenAI's Stargate Project Expands Internationally, Targeting Europe

OpenAI's "Stargate Project," a joint initiative with Oracle and SoftBank, is reportedly considering international expansion, specifically targeting the UK, Germany, and France. Initially designed to bolster US artificial intelligence infrastructure with a total budget of $500 billion, the project is now rumored to be taking steps toward global markets. The Stargate Project is currently evaluating the feasibility of overseas expansion.

OpenAI Launches Flex Processing API for Lower-Cost AI Applications

To compete in an increasingly crowded AI market, OpenAI recently introduced a new API option called Flex Processing. It lets users run AI models at a lower cost, with some trade-offs in response speed and availability. Flex Processing is designed to support lower-priority and non-production tasks such as model evaluation, data enrichment, and asynchronous workloads. For example, when using the o3 model, Flex Processing's price is...