AI Daily: Tencent Releases Version 2.5 of Its HunYuan 3D Generation Model; Hailuo Launches Image Character Reference Feature; Baidu Launches Mobile Super AI Agent App, Xinxiang

Welcome to the 【AI Daily】 column! This is your daily guide to the world of artificial intelligence. Every day, we bring you the hottest AI news with a focus on developers, helping you follow technology trends and discover innovative AI product applications.

Check out the latest AI products: https://top.aibase.com/

1. Manus Open-Source Alternative! Kortix-AI Officially Releases Suna, an Open-Source General-Purpose AI Agent Platform

Kortix-AI has launched Suna, an open-source general-purpose AI agent platform designed as an alternative to Manus. Suna integrates multiple functions such as browser automation, file management, and API integration, enabling the automation of complex tasks through natural language conversations. The project has been launched on GitHub and has quickly attracted the attention of AI developers and automation enthusiasts.

【AiBase Summary:】
🌐 Suna integrates multiple functions including browser automation, file management, and API integration, supporting the automated processing of complex tasks.
📊 With its modular design, Suna provides users with a complete solution from data processing to process automation, applicable to various real-world scenarios.
🚀 Suna's open-source nature and flexible deployment options make it an ideal choice for developers and enterprises, lowering the barriers to development and deployment.
Details: https://github.com/kortix-ai/suna

2. Tencent HunYuan 3D Generation Model Officially Releases Version 2.5

The 2.5 version of Tencent's HunYuan 3D generation model has been officially released, marking the entry of 3D generation technology into the ultra-high-definition era. This upgrade significantly improves modeling precision and user experience, lowering the threshold for 3D content creation. The new version supports 4K high-definition textures and fine-grained bump mapping, greatly enhancing the realism and texture of the models. In addition, the free generation quota has been doubled, attracting more creators and promoting the popularization of 3D content creation.

【AiBase Summary:】
🚀 Improved modeling precision, with geometric resolution reaching 1024, resulting in smoother surfaces and richer details.
💰 Free generation quota doubled to 20 generations per day, attracting more creators to try it out.
🌐 The HunYuan 3D generation API is available on Tencent Cloud, supporting enterprise-level applications and promoting the popularization of 3D content creation.
Details: https://3d.hunyuan.tencent.com/

3. Hailuo Launches Image Character Reference Feature: Multi-Angle, Dynamic-Pose Character Images from a Single Image

MiniMax's Hailuo AI has launched Character Reference, a new Hailuo Image feature that lets users generate multi-angle, dynamic-pose character images from a single reference image. The feature adds rich expression control and cinematic lighting effects, improving both the creative range and the character consistency of AI image generation.

【AiBase Summary:】
🎭 Multi-angle and dynamic poses: Users can generate images of characters from different angles and poses based on a single image, maintaining consistent character features.
💡 Rich expression control: Supports diverse character expression generation, ensuring natural and vivid emotional expression, suitable for storytelling and brand content creation.
🎥 Cinematic lighting and composition: Provides professional lighting effects and optimized composition, generating results comparable to movie scenes and enhancing visual impact.
Details: https://hailuoai.video/create

4. Baidu Launches "Xinxiang" App, Dubbed a Super AI Agent for Fully Managed Complex Tasks

Baidu recently launched the Xinxiang App, a multi-agent collaboration application designed for mobile devices. The app coordinates multiple agents to handle complex requests that users encounter in daily life. It integrates map-based MCP functions, supporting automatic route annotation and ride-hailing recommendations. Users only need to describe their needs in natural language, and the system quickly plans a travel route. The app also introduces multi-agent collaboration in the health and legal fields to provide more professional consulting services.

【AiBase Summary:】
🗺️ The Xinxiang App integrates map-based MCP functions, providing automatic route annotation and ride-hailing recommendation services.
🏥 In the healthcare field, the system dispatches multiple "doctor AI avatars" for joint consultations, providing comprehensive health advice.
⚖️ In legal services, a "lawyer think tank" composed of multiple lawyer AI avatars works together to provide users with high-quality legal consultation.

5. Revolutionary Open-Source TTS Model Dia Released: Supports Emotions, Non-Verbal Cues, and Rivals Human Conversation

Nari Labs has released Dia, an open-source text-to-speech model with 1.6 billion parameters, aimed at generating natural-sounding conversations. Dia is reported to outperform products from ElevenLabs and Google, particularly in emotional intonation and non-verbal cues. Developed with zero funding, the model has been open-sourced on Hugging Face and GitHub, so users can download it and deploy it locally.

【AiBase Summary:】
🎤 Dia is an open-source text-to-speech model with 1.6 billion parameters, designed to generate natural conversations.
💡 The model supports emotional intonation and non-verbal audio cues, correctly interpreting various text tags.
🔗 Dia's code and weights have been open-sourced on Hugging Face and GitHub for users to download and experience.
Details: https://github.com/nari-labs/dia
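
To make the idea of "text tags" concrete, here is a minimal sketch of how a tagged dialogue script might be split into speaker turns and non-verbal cues before being handed to a TTS model. The tag conventions used here (`[S1]`, `[S2]` for speakers and `(laughs)`-style cues) and the names `Turn` and `parse_script` are assumptions made for illustration; the article does not specify Dia's tag format, and this is not Dia's own code.

```python
import re
from dataclasses import dataclass, field

# Assumed tag conventions (not stated in the article):
#   [S1], [S2], ... mark the start of a speaker turn
#   (laughs), (sighs), ... mark non-verbal cues
SPEAKER_TAG = re.compile(r"\[(S\d+)\]")
CUE_TAG = re.compile(r"\(([a-z ]+)\)")


@dataclass
class Turn:
    speaker: str
    text: str
    cues: list = field(default_factory=list)


def parse_script(script: str) -> list:
    """Split a tagged script into speaker turns and extract non-verbal cues."""
    # Splitting on a pattern with one capture group yields
    # [prefix, tag, body, tag, body, ...]; the prefix is ignored.
    parts = SPEAKER_TAG.split(script)
    turns = []
    for speaker, body in zip(parts[1::2], parts[2::2]):
        cues = CUE_TAG.findall(body)
        # Remove the cue tags and normalise whitespace in the spoken text.
        spoken = " ".join(CUE_TAG.sub(" ", body).split())
        turns.append(Turn(speaker=speaker, text=spoken, cues=cues))
    return turns


if __name__ == "__main__":
    demo = "[S1] Did you hear the news? (laughs) [S2] I did. (sighs) It is a lot to take in."
    for turn in parse_script(demo):
        print(turn.speaker, "|", turn.text, "|", turn.cues)
```

Separating cues from spoken text like this is only one possible design; a real model may instead consume the tagged string directly.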

6. Grok Major Update! Visual Capabilities, Multilingual Audio Processing, and Real-Time Search Features Launched

The latest update to Grok brings significant functional improvements, particularly breakthroughs in visual processing, multilingual audio interaction, and real-time search. These new features not only enhance Grok's intelligence but also improve user interaction. With visual capabilities, users can upload images for analysis, while multilingual audio processing supports real-time voice interaction in 145 languages, greatly facilitating cross-language communication.

【AiBase Summary:】
🖼️ Grok's visual processing capabilities are now online, allowing users to analyze complex visual content by uploading images, enhancing application value.
🌍 Supports multilingual audio processing in 145 languages, allowing users to enjoy seamless voice interaction experiences, suitable for various scenarios.
🔍 The newly added real-time search function allows users to quickly obtain the latest information through voice commands, improving the timeliness and reliability of information.

7. Genspark AI Slides Tool Released, Revolutionizing Professional Presentation Creation

Genspark's AI slide tool significantly improves the efficiency of creating professional presentations through automation and intelligent integration. The tool handles a variety of data formats and can quickly generate academic reports and business presentations, making it especially suitable for academic research and corporate reporting. User feedback has been positive: users say its efficiency and artistic style features open up new possibilities for presentation creation. Further improvements and expansions are planned.

【AiBase Summary:】
📊 Multi-format data integration: Supports processing multiple data formats, automatically extracting key information, and reducing manual sorting time.
📚 Academic report generation: Uses natural language processing to quickly generate academic reports containing citations and visualizations, suitable for academic settings.
🎨 Artistic style support: Provides various visual expression options, allowing users to generate personalized presentations based on brand needs.

8. Character.AI Launches AvatarFX Model: Making Characters in Static Images "Speak"

Character.AI's recently launched AvatarFX model can transform static images into vivid, speaking video characters. Using dynamic generation technology, AvatarFX animates facial expressions and synchronizes lip movement in the image, and it offers a variety of audio options to enhance user immersion. The platform also prioritizes user safety, with controls intended to keep generated content appropriate during the creation process.

【AiBase Summary:】
🎨 AvatarFX utilizes diffusion model-based dynamic generation technology to transform static images into vivid video characters with realistic expressions and movements.
🔊 Features multiple audio options, allowing users to freely choose different styles of voices, enhancing the vividness and appeal of storytelling.
🛡️ The platform prioritizes user safety, providing strong safety controls to ensure a healthy creative environment and avoid inappropriate content risks.
Details: https://character-ai.github.io/avatar-fx/

9. Whiteboard + Code Editor Combo Tool pad.ws: Developers Can Draw and Code Simultaneously on the Same Interface

pad.ws is an innovative online tool that perfectly combines a whiteboard with a code editor, providing developers and designers with a new collaborative experience. It supports multiple programming languages and features real-time collaboration, an infinite canvas, and more, meeting the diverse needs of remote teams and educational scenarios. User feedback shows that pad.ws is widely welcomed for its smooth user experience and open-source nature, and more innovative features are expected in the future.

【AiBase Summary:】
🖊️ Infinite canvas and collaborative whiteboard: Users can draw flowcharts and sketches on an infinite canvas, supporting real-time collaboration among multiple users, improving team efficiency.
💻 Real-time code editing: Built-in code editor supporting multiple programming languages, providing syntax highlighting and error prompts, enabling a smooth draw-and-code experience.
🔒 Open-source and privacy protection: Because the tool is open source, users can customize its functions; it also supports local deployment and end-to-end encryption to ensure data security.
Details: https://github.com/pad-ws/pad.ws

10. OpenBMB's "Juan Ji" Joins the Team! Revolutionizing Long-Form Text Generation, Challenging Traditional Review Models

OpenBMB's new member, "Juan Ji," has attracted widespread attention in the field of long-form text generation. Using information entropy and convolutional algorithms, "Juan Ji" can efficiently integrate a large body of literature to generate high-quality review reports. Users only need to provide keywords to obtain content that is logically rigorous and insightful. Side-by-side comparisons with other models show that "Juan Ji" performs exceptionally well in structure, content, and citations.

【AiBase Summary:】
📚 "Juan Ji" uses information entropy and convolutional algorithms to quickly integrate a large amount of literature to generate high-quality reviews.
💻 Users can obtain review reports through simple operations, and the tool supports both general and professional modes.
🚀 LLMxMapReduce-V2 technology empowers "Juan Ji," showcasing strong capabilities in long-form text generation.
Details: https://surveygo.thunlp.org/
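
The article does not describe how "Juan Ji" applies information entropy. As a purely illustrative sketch of the underlying concept, the snippet below computes the Shannon entropy of a passage's word distribution, which is one simple way to compare how varied the vocabulary of different passages is. The function name `shannon_entropy` and the whitespace tokenization are assumptions for this example, not part of the actual system.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy (in bits) of the word distribution in text."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    # H = -sum(p * log2(p)) over the observed word frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    passages = [
        "model model model model",
        "survey generation needs structure content and citations",
    ]
    # Passages with more varied vocabulary score higher.
    for passage in sorted(passages, key=shannon_entropy, reverse=True):
        print(round(shannon_entropy(passage), 3), passage)
```

A repetitive passage scores near zero, while a passage whose words are all distinct scores the maximum for its length.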

11. Tencent HunYuan Large Model AI Reading Assistant - Penguin Reading Companion Officially Launched

Tencent's "Penguin Reading Companion," launched on World Book Day, is an AI reading assistant designed to give primary and secondary school students a richer reading experience. The product combines AI technology with educational principles and provides personalized book recommendations based on students' grades and interests. Through role-playing, scenario generation, and other interactive methods, students can gain a deeper understanding of classic works and build their interest in reading in a gamified environment, with reading outcomes that are measurable and trackable.

【AiBase Summary:】
🤖 AI technology supports personalized recommendations, enhancing students' reading experience.
📖 Role-playing and scenario generation deepen students' understanding of classic works.
🎮 Gamified design stimulates reading interest and promotes proactive exploration.

12. Department of Justice May Force Google to Sell Chrome, OpenAI Expresses Acquisition Interest

During the remedies phase of the Google antitrust case, OpenAI expressed interest in acquiring the Chrome browser. The US Department of Justice is demanding that Google divest Chrome, and a judge has already ruled that Google holds a monopoly in the search market. OpenAI had previously discussed cooperation with Google, hoping to use Google's search technology for ChatGPT, but it currently relies on Bing. A forced sale could affect Google's market position, and the episode reflects intensifying competition among tech giants.

【AiBase Summary:】
💼 OpenAI has expressed interest in acquiring Google's Chrome browser, should it be forced to sell.
⚖️ The US Department of Justice proposed a forced divestiture of Chrome as a remedy in the "US v. Google" case.
🔍 OpenAI executives revealed previous discussions with Google regarding cooperation, hoping to use Google's search technology for ChatGPT.

13. ByteDance Adjusts AI Product Line: Mao Xiang Leadership Change, Xing Hui Merged into Dou Bao, All In on Growth