Welcome to the 【AI Daily】 column! This is your daily guide to the world of artificial intelligence. Every day we bring you the hottest AI news with a focus on developers, helping you keep up with technology trends and innovative AI product applications.

Learn more about new AI products: https://top.aibase.com/

1. OpenAI Launches New Speech-to-Text Model gpt-4o-transcribe, Significantly Boosting Accuracy

OpenAI recently launched three new speech models: gpt-4o-transcribe, gpt-4o-mini-transcribe, and gpt-4o-mini-tts. Of these, gpt-4o-transcribe has attracted significant attention: it markedly lowers the transcription error rate, reaching 2.46% on English. It also handles transcription in complex acoustic environments and incorporates noise reduction and semantic voice activity detection.

【AiBase Summary:】

🔊 gpt-4o-transcribe boasts an error rate as low as 2.46% in English transcription, significantly outperforming its predecessor, Whisper.

🌍 Supports over 100 languages and maintains high accuracy in noisy environments and with diverse accents.

💻 Developers can easily integrate the new model via API, driving the development of speech interaction applications.

Details: https://top.aibase.com/tool/openai-fm
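
To put the 2.46% figure in context: the item does not say which metric it refers to, but transcription accuracy is commonly reported as word error rate (WER), the number of word substitutions, deletions, and insertions divided by the number of words in the reference transcript. The sketch below is a minimal illustration under that assumption; the function name and the lowercase/whitespace normalization are illustrative choices, not OpenAI's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: simple lowercase + whitespace tokenization. Real
    benchmarks usually apply heavier text normalization first.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.2%}")
```

Under this definition, a 2.46% error rate would correspond to roughly two to three word errors per 100 reference words.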

2. Kuaishou Search AI Upgrade: DeepSeek R1 Fully Integrated, Transforming Search Experience!

Kuaishou recently announced the full integration of the DeepSeek R1 large model into its search function. The move aims to significantly improve search results and user experience and to boost user activity. At the same time, Kuaishou is actively exploring the commercial potential of intelligent search scenarios. Kuaishou's AI content creation platform, Kling AI, has also integrated DeepSeek, making video and image generation more efficient and giving users finer control over the creative process.

【AiBase Summary:】

🚀 Kuaishou's search function has fully integrated the DeepSeek R1 large model, enhancing search results and user experience.

💡 Kuaishou's AI content creation platform, Kling AI, now integrates DeepSeek, letting users generate and optimize prompts more efficiently.

🎥 DeepSeek's Inspiration version is deeply linked with Kuaishou's "Inspiration Library," enabling users to finely control key creative elements.

3. Claude Introduces Web Search Functionality, Providing Real-time Information and Source Citations

Anthropic is adding a web search function to its AI assistant, Claude, to provide more timely information. Unlike traditional search engines, Claude turns search results into conversational answers and cites its information sources. The feature is currently available only to paying users in the United States, with plans to expand to other countries and to free accounts in the future.

【AiBase Summary:】

🌐 Claude turns online search results into conversational answers and cites information sources.

💰 The web search function is currently only available to paying users in the United States, with future plans to expand globally.

⚖️ The traffic loss caused by AI search bots poses a challenge to websites that depend on visitor volume.

4. ByteDance Launches InfiniteYou Image Generation Framework: Preserves Facial Features, Changeable Scenes

ByteDance's InfiniteYou (InfU) is an innovative image generation tool that creates high-quality, personalized images based on user text descriptions. Its core technology, InfuseNet, uses a refined training strategy to ensure that the generated images retain user identity features while flexibly changing scenes and content.

【AiBase Summary:】

🧑‍🚀 InfiniteYou is ByteDance's image generation model that creates personalized images based on text descriptions.

🔧 Core technology InfuseNet, through fine-tuning, ensures generated images retain user identity features while flexibly changing scenes.

🌐 This tool is compatible with various existing tools and is suitable for academic research, subject to relevant licenses and legal regulations.

Details: https://top.aibase.com/tool/infiniteyou

5. Tencent Yuanbao Feature Update: Supports Analysis of Complex Excel Spreadsheets

Tencent Yuanbao's new feature significantly improves the convenience of handling Excel spreadsheets. Users simply need to ask questions in natural language, and Yuanbao can quickly read data, perform calculations, and even highlight important information, simplifying complex workflows. Even Excel novices can easily handle data analysis.

【AiBase Summary:】

📝 The new feature allows users to ask questions in natural language, simplifying Excel data analysis.

⚡ The HunYuan Turbo S model quickly processes complex spreadsheets, improving analysis efficiency.

📱 Supports multi-platform operation, allowing users to upload and analyze spreadsheets anytime, anywhere.

6. Krea AI Releases "Video Training" Feature: Train Your Own Unique Video Style

Krea AI recently launched a new "Video Training" feature that lets users upload their own images and video materials to train personal AI video style models. Built on the Wan2.1 model, the feature lets the trained model learn specific artistic styles and dynamic movements, greatly enhancing the personalization and flexibility of content creation. Users can adjust multiple parameters as needed, and the resulting style models can be applied directly on the Krea Video platform.

【AiBase Summary:】

📹 Users can upload images and video materials to train their own unique AI video style models.

⚙️ Provides a flexible user experience, allowing users to adjust training parameters to optimize model performance.

🌟 This feature opens up new possibilities for content creators, improving creative efficiency.

Details: https://www.krea.ai/train

7. DomoAI Launches Voice and Image Generation Digital Human Feature: A New Breakthrough in Digital Content Creation

DomoAI's new feature lets users generate talking digital humans by uploading a voice clip and an image. It has sparked heated discussion and marks a significant innovation in digital content creation. The feature supports lip synchronization, generates short videos of varying lengths, and handles Chinese particularly well, greatly improving creative flexibility and efficiency.

【AiBase Summary:】

🎤 Users can quickly generate talking virtual characters by uploading voice and images, supporting multiple duration options.

🚀 This feature boasts fast generation speed and excellent Chinese support, meeting diverse needs.

🎨 DomoAI's innovative technology will lower the barrier to content creation, driving deeper integration between AI and the entertainment industry.

Details: https://www.domoai.app/en/create/talking-avatar

8. Sider AI Launches Deep Research Feature: Automatically Simulates Human Research Behavior and Generates Interactive Visual Reports

Sider AI's recently launched Deep Research feature has attracted widespread attention for its ability to simulate human research behavior and automatically generate visual reports. This feature enhances research efficiency through real-time note-taking and transparent information sources, while providing users with greater control.

【AiBase Summary:】

🤖 The Deep Research feature simulates human research behavior, automatically accessing online information and generating real-time notes.

📊 After information gathering, this feature automatically generates interactive visual web pages, including charts and flowcharts.

🔍 Users can seamlessly integrate automated and manual research via browser extensions, suitable for various research scenarios.

Details: https://sider.ai/wisebase/deep-research

9. High School Student Builds AI Model Evaluation Website Using Minecraft, Enabling Public Participation in Judging Model Quality

With the rapid development of artificial intelligence technology, evaluating and comparing the capabilities of generative AI models has become a major challenge. The MC-Bench website uses the game Minecraft to provide a novel evaluation method: users vote on AI-generated builds without knowing which model produced them. This creative approach not only increases public participation but also makes AI model evaluation visual.

【AiBase Summary:】

🎮 MC-Bench utilizes Minecraft as an evaluation platform, allowing users to evaluate AI model performance through voting.

👨‍🎓 This project was initiated by a high school student and has attracted the support of several top AI companies, showcasing the innovation of young people.

🛠️ In the future, MC-Bench may expand to more complex AI tasks, becoming an effective tool for testing AI reasoning capabilities.

Details: https://top.aibase.com/tool/mc-bench
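
One common way to turn head-to-head votes like these into a leaderboard is an Elo-style rating. The item does not describe how MC-Bench aggregates its votes, so the sketch below is only an illustration under that assumption; the function name, the K-factor of 32, the starting rating of 1000, and the model names are all hypothetical.

```python
from collections import defaultdict


def rate_models(votes, k=32.0, initial=1000.0):
    """Compute Elo-style ratings from (winner, loser) vote pairs.

    Assumption: every vote is a single pairwise comparison with a clear
    winner. This is not MC-Bench's actual scoring code.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in votes:
        # Probability that the winner was expected to win, given the
        # current rating gap.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected)  # upsets move ratings more
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical votes: each tuple is (preferred build's model, other model).
    votes = [
        ("model_a", "model_b"),
        ("model_a", "model_c"),
        ("model_b", "model_c"),
    ]
    ratings = rate_models(votes)
    for name in sorted(ratings, key=ratings.get, reverse=True):
        print(f"{name}: {ratings[name]:.1f}")
```

Because each vote adds to the winner exactly what it takes from the loser, the total rating stays constant, and only the relative order carries meaning.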

10. From a Former Google Scientist! Reka Open-Sources Reka Flash 3, Outperforming Gemma 3 27B

Reka AI recently launched its first open-source model, Reka Flash 3, a 21-billion-parameter general-purpose reasoning model that has attracted widespread attention in the industry. Although it has fewer parameters than some competitors, it was trained from scratch and deeply optimized, resulting in outstanding performance. Reka Flash 3 not only offers powerful reasoning capabilities but also supports low-latency, on-device deployment, making it suitable for a range of end-user devices.

【AiBase Summary:】

🚀 Reka Flash 3 is an open-source model launched by Reka AI, founded by former Google DeepMind scientists, with 21 billion parameters and performance exceeding many larger models.

💡 This model has powerful reasoning capabilities, supporting complex question answering and multi-step reasoning, suitable for various innovative application scenarios.

📱 Reka Flash 3 has low latency and is friendly to on-device deployment, so it could offer convenient services directly on mobile devices in the future.

Details: https://top.aibase.com/tool/reka-flash-3

11. Shengshu Technology's Vidu Secures Adaptation Rights for 7 Ten-Million-Level Online Literature IP Concept Short Films

Beijing Shengshu Technology Co., Ltd.'s AI video generation platform, Vidu, recently secured adaptation rights for 7 ten-million-level online literature IPs for short films, marking another milestone in the development of the domestic animation short film market. These works cover various genres, including fantasy, science fiction, martial arts, and urban romance, possessing a broad audience base and strong potential for film adaptation.

【AiBase Summary:】

🎬 Vidu has secured adaptation rights for 7 ten-million-level online literature IP short films, covering various genres.

⚙️ AI technology automates script and video generation, significantly improving production efficiency.

⏱️ Production time is shortened from months to weeks or even days, addressing the high cost of traditional adaptation.

12. World's First Pediatric Large Model Launched, Promoting Upgrades in Children's Medical Services

Against the backdrop of rapid development in medical technology, Baichuan Intelligent Technology, Beijing Children's Hospital, and Xiaor Fang Health jointly launched the world's first pediatric large model, "Futang Baichuan." The model covers a knowledge system of common and difficult-to-treat childhood diseases and has strong clinical reasoning capabilities, with the aim of providing scientific, personalized treatment plans. Through its pioneering pediatric "evidence-based model," doctors can make clinical decisions grounded in scientific evidence, thereby improving the quality of medical services. In addition, Baichuan Intelligent Technology has launched the "AI Pediatric Doctor" application and is working with national regional medical centers to promote the use of artificial intelligence in primary pediatric care.

【AiBase Summary:】

🧠 The model covers common and rare childhood diseases and has strong clinical reasoning capabilities.

📊 The pioneering pediatric "evidence-based model" integrates the best medical evidence to guide clinical decision-making.

🏥 The "AI Pediatric Doctor" application is launched to help primary hospitals improve pediatric medical services.