Welcome to the 【AI Daily】column! This is your daily guide to exploring the world of artificial intelligence. Every day, we present you with the hottest AI news, focusing on developers and helping you understand technological trends and innovative AI product applications.
Check out the latest AI products. Learn more: https://top.aibase.com/
1. Baidu Officially Announces: Ernie 4.5 Turbo to Launch on April 25th
Baidu announced that it will release the Ernie 4.5 Turbo large language model at its Create conference on April 25th. While specific details haven't been revealed, the industry is eagerly anticipating its launch. Ernie 4.5 and Ernie X1 were released last month and made freely available to users, showcasing Baidu's deep expertise in AI. Ernie 4.5 excels in multimodal understanding, and its API call pricing is highly competitive, attracting the attention of numerous developers and businesses.
【AiBase Summary:】
🚀 Ernie 4.5 Turbo will be released at the Create conference on April 25th; details to be announced.
💡 Ernie 4.5 demonstrates excellent multimodal understanding capabilities, with API pricing at only 1% of GPT-4.5's.
🌟 Baidu founder Robin Li stated that Ernie 4.5 will be the best model in Baidu's history.
2. AI Agent Revolution! Google Launches Open-Source A2A Protocol, Enabling Seamless "Conversation" Between Agents
Google Cloud has introduced a new open-source protocol, Agent2Agent (A2A), aimed at facilitating collaboration and information exchange between different AI agents. By standardizing how agents communicate, the protocol breaks down the information silos that separate them. Its design principles include support for long-running tasks, security by default, and modality independence, making it suitable for complex AI applications.
【AiBase Summary:】
🚀 The A2A protocol promotes seamless collaboration and information exchange between AI agents from different vendors, breaking down information silos.
🔒 The protocol is built on existing standards, supports enterprise-grade security, and simplifies the integration process.
🌐 Over 50 technology partners support the A2A protocol, driving broader adoption of AI in enterprises.
3. WeChat: Crackdown on Improper Use of AI Tools in Video Account Livestreams
The WeChat Video Creation Safety Center recently announced a special campaign targeting broadcasters who improperly use AI tools during livestreams. These actions not only mislead viewers but also infringe on others' portrait rights, violating operational guidelines. The platform will impose penalties based on the severity of the violations to ensure a healthy and safe livestreaming environment. While encouraging the reasonable use of AI technology to improve efficiency, the platform firmly opposes improper profit-seeking behavior.
【AiBase Summary:】
🚫 Some broadcasters improperly use AI tools during livestreams, misleading viewers and infringing on portrait rights.
⚖️ Broadcasters need to apply for livestreaming qualifications; using AI tools to alter facial features is considered a violation.
📢 Users can report violations using the complaint function; the platform will handle them promptly.
4. ByteDance Open-Sources Multi-SWE-bench, Promoting Intelligent Upgrades for Large Model Code
The ByteDance Doubao large model team recently open-sourced Multi-SWE-bench, the first multilingual code repair benchmark dataset, which aims to improve the evaluation of large models' automatic bug-fixing capabilities. Whereas the earlier SWE-bench covered only Python, Multi-SWE-bench adds six other languages, including Java and TypeScript, with 1632 real-world tasks and a difficulty rating mechanism.
【AiBase Summary:】
🛠️ Multi-SWE-bench is the first multilingual code repair benchmark dataset, covering seven mainstream programming languages.
📊 The dataset contains 1632 real-world tasks, rigorously screened and manually verified to ensure reliability.
🤖 Experiments show that large language models perform reasonably well in Python repair but have a repair rate below 10% in other languages.
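For readers wondering how a per-language repair rate like the one above is typically derived, here is a minimal sketch in Python. It assumes each evaluated task yields a (language, resolved) pair; the function name and the toy records are hypothetical illustrations, not Multi-SWE-bench's actual tooling or results.

```python
from collections import defaultdict


def repair_rate_by_language(results):
    """Return {language: fraction of tasks resolved} for (language, resolved) pairs."""
    totals = defaultdict(int)
    resolved = defaultdict(int)
    for language, ok in results:
        totals[language] += 1
        if ok:
            resolved[language] += 1
    return {lang: resolved[lang] / totals[lang] for lang in totals}


# Made-up toy outcomes for illustration only (NOT real benchmark data).
toy_results = [
    ("python", True), ("python", False),
    ("java", False), ("java", False), ("java", True), ("java", False),
]

rates = repair_rate_by_language(toy_results)
for lang, rate in sorted(rates.items()):
    print(f"{lang}: {rate:.0%} of tasks resolved")
```

Reporting the rate per language rather than as one pooled number is what makes a gap between Python and the other languages visible.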
5. JD Retail Launches its First Self-Developed Billion-Level Time Series Large Model, TimeHF, to Predict Product Sales
The JD Retail technology team has launched TimeHF, a self-developed billion-level time series large model for sales prediction. By applying reinforcement learning from human feedback, the model improves sales prediction accuracy by more than 10%. TimeHF excels in JD's automated replenishment scenarios and outperforms industry standards on multiple public datasets, setting a new benchmark for time series prediction.
【AiBase Summary:】
🛠️ The TimeHF model, based on reinforcement learning from human feedback, improves prediction accuracy by over 10%, significantly reducing uncertainty in demand forecasting.
📊 The JD team assembled a high-quality dataset of 1.5 billion samples, unprecedented in its complexity, which laid the foundation for model training.
🚀 TimeHF achieved SOTA results on multiple public datasets, demonstrating stronger zero-shot performance and prediction accuracy, and is already being used in JD's supply chain system.
6. Google Firebase Studio Makes its Debut: AI-Powered One-Stop Development Platform
Google's newly launched Firebase Studio is a cloud-based development platform integrating various development tools, aiming to simplify the development process through AI technology. It provides a one-stop development experience, supporting solutions from ideation to deployment, particularly excelling in mobile development. Although currently in preview, its powerful features and free trial policy allow developers to more easily develop projects, foreshadowing a transformation in the future programming ecosystem.
【AiBase Summary:】
🛠️ Firebase Studio integrates multiple development tools, providing a one-stop development experience and supporting rapid application building.
🌐 The platform includes a built-in mobile simulator, allowing developers to debug iOS and Android applications directly in the cloud, improving mobile development efficiency.
🔍 The platform is currently in preview, so its functional stability still needs improvement, and it will need to balance free and paid services in the future.
Details: https://top.aibase.com/tool/firebase-studio
7. From Text to Complex Characters: The Powerful SVG Generation Large Model OmniSVG Arrives!
The release of OmniSVG marks a significant breakthrough in SVG generation technology. Combining an advanced vision-language model and an innovative SVG tokenizer, it greatly improves generation efficiency and quality. The model not only supports generating SVGs from text and images but can also generate complex anime characters, demonstrating high flexibility and quality. OmniSVG opens new possibilities for graphic design and web development, fostering the growth of the AIGC community.
【AiBase Summary:】
🌟 OmniSVG, jointly developed by StepFun and Fudan University, is an advanced SVG generation model with excellent multimodal generation capabilities.
📊 The newly released MMSVG-2M dataset contains 2 million SVG resources and provides a standardized evaluation protocol, driving the development of SVG generation technology.
🚀 SVGs generated by OmniSVG not only have excellent visual effects but are also editable, suitable for professional design workflows.
Details: https://omnisvg.github.io
8. Google Open-Sources Agent Development Kit: ADK Launch Ushers in a New Wave of AI Agent Development
Google's release of the Agent Development Kit (ADK) on April 9, 2025, marks significant progress in AI agent development. As an open-source framework, ADK aims to simplify the construction and management of multi-agent systems, supporting flexible development and deployment. Its modular design and multimodal interaction capabilities enable developers to create agents in a more natural way, while deep integration with Google Cloud services enhances the efficiency of enterprise applications.
【AiBase Summary:】
🚀 ADK is Google's open-source agent development framework, designed to simplify the construction and management of multi-agent systems.
🎤 It supports multimodal interaction, featuring bidirectional audio and video streams, enhancing the natural conversational abilities of agents.
🌐 Deep integration with Google Cloud services supports rapid prototyping and seamless deployment, lowering the technical barrier.
9. Veo 2 Lands on Gemini API: The AI Video Generation Revolution Begins
Google's Veo 2 video generation model is now available to developers via the Gemini API, marking a new phase in AI video generation technology. Known for its high-fidelity output and accurate response to complex instructions, it supports text-to-video and image-to-video generation. Developers can integrate the API to explore a wide range of commercial and creative possibilities. However, discussions around content authenticity and copyright ownership have also emerged, and Google embeds watermarks in generated videos to mitigate misuse.
【AiBase Summary:】
🎥 Veo 2 is available via the Gemini API, supporting text-to-video and image-to-video generation, driving the development of AI video generation technology.
💡 Developers can flexibly integrate Veo 2 via the API to explore diverse applications such as personalized short videos and interactive storytelling experiences.
⚖️ High-fidelity output raises discussions about content authenticity and copyright; Google embeds watermarks to reduce misuse.
Details: https://ai.google.dev/gemini-api/docs/video
10. Anthropic Launches Premium Claude Max Subscription, Priced up to $200 per Month
Anthropic recently launched a Claude Max subscription plan to meet the high-end needs of users for AI services. The plan offers two pricing options, $100 and $200 per month, with significantly increased usage limits, attracting more users. While unlimited usage options aren't currently available, the company is actively exploring other revenue streams, including the "Claude for Education" program for the education sector.
【AiBase Summary:】
🌟 The Claude Max subscription plan has launched, priced up to $200 per month and designed for high-end users.
📈 Two pricing options with significantly increased usage limits attract more users.
🎓 Anthropic is also exploring other revenue streams, including the "Claude for Education" program, which focuses on AI needs in education.
11. OminiControl Art Released: Refining the Ghibli-like Art Style of GPT-4o into the FLUX Model
The release of OminiControl Art technology on April 9, 2025, marks a major breakthrough in the field of AI art creation. This technology, through the OminiControl framework, combines the art style of OpenAI's GPT-4o with the FLUX.1 model, providing users with an efficient and elegant tool for generating high-quality art images. Its design philosophy emphasizes simplicity and practicality, making it easy for ordinary users to use, further promoting the integration of AI and art.
【AiBase Summary:】
🖼️ OminiControl Art combines the art style of GPT-4o with the FLUX.1 model, opening up new possibilities for AI art generation.
⚙️ The OminiControl framework achieves flexible control over diffusion transformer models by adding a small number of parameters, improving the efficiency of art creation.
🌟 The release of this technology reflects the trend of AI art tools shifting from functionality to artistry and controllability, potentially leading to more creative projects.
Details: https://top.aibase.com/tool/ominicontrol-art
12. Following OpenAI, Google Gemini Joins the MCP Camp, Accelerating AI Agent Interoperability
Following OpenAI's adoption of Anthropic's Model Context Protocol (MCP), Google announced that it will add support for MCP to its Gemini model and software development kit (SDK). As an open standard, MCP allows AI models to extract information from multiple data sources and lets developers build bidirectional connections between those data sources and AI applications. The move signals accelerating AI agent interoperability: multiple companies have already integrated MCP, reflecting the industry's emphasis on open protocols and the potential for collaboration.
【AiBase Summary:】
🔗 Google DeepMind CEO Demis Hassabis announced that the Gemini model will support MCP, driving AI agent interoperability.
🛠️ MCP allows AI models to extract information from multiple data sources, enabling developers to build applications with bidirectional connections.
🌐 Since Anthropic open-sourced MCP, several companies, such as Block and Replit, have integrated the protocol into their platforms.
13. Gradio 5.24 Major Update: AI Image Editing Rivals Photoshop, a Boon for Developers!
Gradio 5.24 is a significant update to the open-source AI interface framework. Its highlight is the new ImageEditor component, which offers professional-level image editing capabilities. The addition of zoom, pan, transparency control, and custom layers greatly enhances developers' flexibility and efficiency in image processing.