Welcome to AI Daily, your daily guide to the world of artificial intelligence. Each day we cover the hottest topics in AI with a focus on developers, helping you keep up with technology trends and innovative AI product applications.
Explore fresh AI products: https://top.aibase.com/
1. Microsoft Launches Design Tool Microsoft Designer
Microsoft Designer is an AI-integrated design application that makes design simpler and more efficient. It is available on any device and integrates with Microsoft 365 applications, offering intelligent object detection, creative tools, and image restyling.
AiBase Highlights:
🚀 Integrates with Microsoft 365 applications, making it easier to create and edit images and designs.
🔍 Intelligent object detection makes it easy to erase unwanted objects or blur the background.
🎨 Creative tools include prompt templates, personalized greeting cards and invitations, image restyling, and background replacement.
Details: https://top.aibase.com/tool/microsoft-designer-sticker-creator
2. ElevenLabs Releases Turbo 2.5 Model: 3x Faster, Covers 32 Languages Including Chinese
ElevenLabs' Turbo 2.5 model supports 32 languages and runs three times faster than before, cutting latency to 300 milliseconds and making it better suited to real-time interaction. It offers a broad choice of languages and convenient conversion features while maintaining data security and compliance.
AiBase Highlights:
🚀 Turbo 2.5 supports 32 languages, triples the speed, and cuts latency to 300 milliseconds, giving stronger support for dynamic interactions.
🌐 Adds text-to-speech for Vietnamese, Hungarian, and Norwegian for the first time, and speeds up English text-to-speech.
🔊 Covers a wide range of use cases, including conversational AI, education, entertainment, and content creation, with realistic voices already used by Praktika.ai, Kindroid, and Aug X Labs.
Details: https://elevenlabs.io/api
3. Apple Releases Open-Source Language Model DCLM with 7 Billion Parameters
Apple, in collaboration with several institutions, has released DCLM, an open-source language model with 7 billion parameters. Trained on a very large number of tokens, the model can understand and generate language. The DCLM project also provides standardized tools for dataset optimization, helping researchers run effective experiments. The new model makes significant progress on key benchmarks while requiring fewer computational resources.
AiBase Highlights:
🔑 Apple, in collaboration with several institutions, has created DCLM, a powerful open-source language model.
🔑 DCLM provides standardized dataset optimization tools to help researchers run effective experiments.
🔑 The new model makes significant progress on key benchmarks while requiring fewer computational resources.
Details: https://huggingface.co/collections/mlfoundations/dclm-669938432ef5162d0d0bc14b
4. Xiaomi's Big Model Xiaoai Adds AI Document Q&A and AI Image Editing Features
Xiaomi has announced that Xiaoai Assistant now includes an 'AI Image Editing' feature, which lets users change backgrounds, convert styles, remove passersby, intelligently expand images, and ask questions about an image. In addition, 'Big Model Xiaoai' has gained an 'AI Document Q&A' feature for smarter document processing. Users need to update to version V6.126 to try the new features.
AiBase Highlights:
✨ Xiaoai Assistant adds 'AI Image Editing' features, including background conversion, style conversion, removal of passersby, intelligent image expansion, and image Q&A.
🔍 Users need to update to version V6.126 to use the new features.
📄 'Big Model Xiaoai' adds 'AI Document Q&A' feature, providing a smarter document processing experience.
5. BlazeBVD: A Revolutionary Video De-flicker Technology
Video de-flickering is a key technique in video production and image processing. BlazeBVD, a new de-flicker algorithm, quickly removes flicker from videos while preserving the integrity and color fidelity of the content, which could significantly change how video post-production is done.
AiBase Highlights:
🔍 BlazeBVD is an automated video de-flicker technique that effectively improves the temporal consistency of videos.
⚙️ BlazeBVD applies scale-time equalization to the histograms of video frames, capturing flicker and local exposure changes.
🚀 BlazeBVD combines global and local de-flicker modules with adaptive temporal consistency, and runs up to 10 times faster than existing methods.
Details: https://arxiv.org/html/2403.06243v1
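To make the histogram idea above concrete, here is a minimal Python sketch. It is not the BlazeBVD algorithm: it only illustrates, under simple assumptions, how per-frame brightness histograms can expose sudden global flicker. The function names, the grayscale list-of-lists frame format, the bin count, and the threshold are all hypothetical choices made for this example.

```python
from typing import List, Sequence

# Hypothetical frame format for this sketch: a grayscale image stored as
# rows of integer pixel values in the range 0..255.
Frame = Sequence[Sequence[int]]


def luminance_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return a normalized histogram of the frame's pixel intensities."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    if total == 0:
        raise ValueError("frame contains no pixels")
    return [c / total for c in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two normalized histograms (0 = same, 2 = disjoint)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def flicker_scores(frames: Sequence[Frame], bins: int = 16) -> List[float]:
    """Score each transition between consecutive frames by histogram change."""
    hists = [luminance_histogram(f, bins) for f in frames]
    return [histogram_distance(hists[i], hists[i + 1])
            for i in range(len(hists) - 1)]


def flag_flicker(frames: Sequence[Frame], threshold: float = 0.5,
                 bins: int = 16) -> List[int]:
    """Return indices i where the change from frame i to i+1 exceeds threshold."""
    return [i for i, score in enumerate(flicker_scores(frames, bins))
            if score > threshold]
```

For example, a sequence of two steady frames, one much brighter frame, and another steady frame would have both transitions around the bright frame flagged. A real de-flicker pipeline would go on to correct the frames rather than only detect the change.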
6. Baidu's Shen Dou: Large Model Applications Enter a Period of Explosive Growth
At the 2024 China Unicom Partner Conference, Shen Dou, Executive Vice President of Baidu Group and President of Baidu Intelligent Cloud Business Group, gave a speech on fully embracing AI+ and accelerating the development of new quality productive forces. Shen Dou stressed that AI is the key technology for innovation and that large models are at the forefront of AI, pointing to the exponential growth in large model calls. Through its work with enterprises, Baidu recognized the importance of the underlying computing power management platform and independently developed the Baige Computing Power Platform to support the rapid iteration of large models.
AiBase Highlights:
🚀 Large model applications are entering a period of explosive growth, with enterprises applying large models across their business processes rather than waiting for a single blockbuster application.
💡 "One Cloud, Multi-Chip" has become an inevitable choice for Chinese enterprises. Baidu Intelligent Cloud has opened up the Baige Computing Power Platform, giving customers the freedom of "chip selection".
💻 Baidu has built the Qianfan toolchain platform on the Wenxin large model, lowering the technical threshold and usage costs of large models, and has launched the Qianfan·Industry Enhanced Edition to accelerate enterprise innovation.
7. Microsoft Researchers' SpreadsheetLLM Project
Microsoft researchers have recently released a research project called SpreadsheetLLM, which aims to solve the difficulties large language models face in parsing spreadsheets. The project uses an encoding framework that lets large language models "understand" spreadsheet content. This could significantly improve data management and analysis in spreadsheets and let users ask AI questions in natural language without mastering complex formulas and operations.
AiBase Highlights:
📊 Challenges for large language models with spreadsheets: Spreadsheets have a complex two-dimensional structure that does not fit the linear text input large language models usually handle.
🔍 SpreadsheetLLM technical analysis: Microsoft has proposed two core technologies, SheetCompressor and Chain of Spreadsheet, significantly enhancing the large language model's understanding of spreadsheets.
🛠️ Impact on Microsoft AI tools: SpreadsheetLLM is expected to enhance the application capabilities of Microsoft Copilot in Excel, but currently still faces issues with data accuracy and computational resource consumption.
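The gap between a two-dimensional sheet and linear model input can be illustrated with a short Python sketch. This is not SheetCompressor or Chain of Spreadsheet: it is only a naive serialization, written under our own assumptions, that flattens a grid into "address,value" pairs and skips empty cells. The function names and the "|" separator are hypothetical.

```python
from typing import List, Optional, Sequence, Union

Cell = Optional[Union[str, int, float]]


def column_label(index: int) -> str:
    """Convert a zero-based column index to a spreadsheet letter (0 -> A, 26 -> AA)."""
    label = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


def encode_sheet(rows: Sequence[Sequence[Cell]]) -> str:
    """Flatten a 2D grid into one line of 'address,value' pairs, skipping empty cells."""
    parts: List[str] = []
    for row_number, row in enumerate(rows, start=1):
        for col_index, value in enumerate(row):
            if value is None or value == "":
                continue  # empty cells carry no content, so leave them out
            parts.append(f"{column_label(col_index)}{row_number},{value}")
    return "|".join(parts)


sheet = [
    ["Item", "Price"],
    ["Pen", 1.5],
    ["", None],
]
print(encode_sheet(sheet))  # prints "A1,Item|B1,Price|A2,Pen|B2,1.5"
```

Even this simple encoding grows with every non-empty cell, which hints at why compressing the sheet before passing it to a model matters for large spreadsheets.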
8. Google's 2024 Hardware Extravaganza: Pixel 9, Gemini, and New Foldable Phones
Google will hold its major hardware event earlier than usual, announcing new products such as the Pixel 9 ahead of Apple's iPhone 16 launch. Gemini is expected to be a major focus, with the new devices leading the way for AI features on Android. Android 15 brings new features and UI adjustments, while the future of Google Assistant remains uncertain. The Pixel Watch 3 and Pixel Buds Pro 2 will also be unveiled.
AiBase Highlights:
📱 Google will release the Pixel 9 ahead of Apple's iPhone 16.
🌟 The new devices will lead the way for AI features on Android, with Gemini expected to be a major focus.
🔍 Android 15 brings new features and UI adjustments, while the future of Google Assistant remains uncertain.
9. Arcee AI Releases Open-Source Language Model Arcee-Nova: Built on Qwen2-72B, with Performance Close to GPT-4
Arcee AI's latest open-source language model, Arcee-Nova, performs at a level close to GPT-4, an important milestone for the open-source AI community. Arcee-Nova combines Qwen2-72B-Instruct with custom-tuned models and offers a broad set of capabilities suited to customer service, content creation, software development, and education.
AiBase Highlights:
🌟 Arcee-Nova performs at a level close to GPT-4, an encouraging result for the AI community.
💡 Arcee-Nova combines Qwen2-72B-Instruct with custom-tuned models, offering a broad set of capabilities.
📈 Arcee-Nova is suited to customer service, content creation, software development, and education.
10. Japanese Supermarket Introduces AI Smile Monitoring System
Japanese supermarket chain AEON has introduced an AI smile monitoring system called "Mr Smile," aiming to improve employee service quality and customer experience. Although the system has produced significant improvements in service attitude, it has also sparked controversy over how natural employees' smiles are and over customer harassment of staff. Compared with practices elsewhere, such as McDonald's "0 Yen Smile" concept and a Fukuoka supermarket's slow checkout lanes, AEON's approach has received mixed reviews.
AiBase Highlights:
📈 AI Smile Monitoring System: AEON's "Mr Smile" system evaluates employee smiles and service attitudes by analyzing more than 450 factors, aiming to enhance customer experience.
🔍 Controversy and Concerns: The technology raises concerns about harassment of employees, with some arguing that mandatory smiling may intensify the scrutiny employees face.
💡 Industry Comparison: AEON's approach resembles McDonald's "0 Yen Smile" concept but faces criticism for adding to employees' burden. A Fukuoka supermarket's slow checkout lanes have received positive feedback.
11. DeepGlint Open-Sources Vision-Language Representation Learning Model RWKV-CLIP
DeepGlint has open-sourced RWKV-CLIP, a vision-language representation learner that combines the advantages of Transformers and RNNs. The model is pre-trained on image-text pairs expanded from web sources, which significantly improves its performance on vision and language tasks. To address noisy data and improve data quality, the research team introduced a diverse description generation framework that uses large language models to synthesize and refine content from web-based text, synthetic captions, and detection labels.
AiBase Highlights:
🔍 The model combines the advantages of Transformers and RNNs, significantly improving performance on vision and language tasks through image-text pre-training.
🔬 Introduces a diverse description generation framework that uses large language models to synthesize and refine content, addressing noisy data and improving data quality.
🚀 RWKV-CLIP performs well with input enhancement, delivering significant performance gains and state-of-the-art results on multiple downstream tasks.