Daily AI Update: Sora-Class Video Model Vidu Unveiled; Kimi Chat Mobile App Upgraded; Alibaba's Tongyi Qianwen Open-Sources 110-Billion-Parameter Model; Apple Reportedly Seeking Collaboration with OpenAI

Welcome to the AI Daily column! This is your daily guide to exploring the world of artificial intelligence. Every day, we bring you the hottest content in the AI field, focusing on developers to help you understand technical trends and innovative AI product applications.

Explore the latest AI products here: https://top.aibase.com/

1. Tsinghua Team Releases Vidu, a Video Large Model Capable of Generating 16-Second, 1080P Videos

Tsinghua University and Shengshu Technology unveiled Vidu, China's first long-duration, high-consistency, high-dynamic large video model, at the Future Artificial Intelligence Pioneer Forum of the Zhongguancun Forum. The release marks a significant advance in video generation technology in China. The model is built on the U-ViT architecture and can generate high-definition video with a single click, with strong spatial-temporal consistency and imaginative content.

【AiBase Summary:】

🎥 Vidu is China's first long-duration, high-consistency, and high-dynamic video large model.

🌟 Integrating Diffusion and Transformer technologies, it can generate up to 16-second, 1080P high-definition video content with a single click.

🚀 It not only simulates the real physical world but also shows rich imagination and supports multi-shot generation.

Product entry: https://top.aibase.com/tool/vidu

2. Qwen1.5-110B, the First Open-Source Model with Over 100 Billion Parameters from the Qwen Team

The Qwen team has open-sourced Qwen1.5-110B, the first model in the series with over 100 billion parameters. It performs strongly in both base-model and chat evaluations, showing how much scaling up model size improves performance. The model uses a Transformer decoder architecture, supports multiple languages, and employs grouped query attention (GQA) for efficient inference. Qwen1.5-110B is the largest model in the Qwen series and compares favorably with state-of-the-art models in the team's evaluations. The team plans to keep exploring the gains from scaling both model size and pre-training data.

【AiBase Summary:】

🌟 Qwen1.5-110B is the Qwen series' first model with over 100 billion parameters; its stronger results in chat evaluations demonstrate the potential of larger-scale models.

🔍 The 110B model's gains come mainly from increased model scale, with no significant changes to training methods, which underscores how much scale alone contributes to performance.

💡 Qwen1.5-110B uses a Transformer decoder architecture, supports multiple languages, and employs grouped query attention (GQA); the results suggest that further scaling still has room to deliver gains.

Model link: https://top.aibase.com/tool/qwen1-5-110b
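
For readers unfamiliar with the grouped query attention mechanism mentioned above, here is a rough sketch of why it helps at inference time: several query heads share one key/value head, so the key/value cache shrinks by the sharing factor. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not figures taken from this article or confirmed for Qwen1.5-110B.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV-cache size for one sequence: keys plus values in every layer."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def kv_head_for(query_head, num_query_heads, num_kv_heads):
    """Index of the shared key/value head that a given query head reads from."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Hypothetical configuration, assumed purely for illustration.
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8        # grouped query attention: 8 query heads per KV head
HEAD_DIM = 128
SEQ_LEN = 32_768

# Standard multi-head attention keeps one KV head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps only NUM_KV_HEADS of them.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)
reduction = mha_bytes / gqa_bytes

print(f"MHA cache: {mha_bytes / 2**30:.0f} GiB")
print(f"GQA cache: {gqa_bytes / 2**30:.0f} GiB")
print(f"Reduction: {reduction:.0f}x")
```

Under these assumed numbers the cache shrinks by the ratio of query heads to key/value heads; the actual saving for any real model depends on its published configuration.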

3. Kimi Chat Mobile App UI Receives Major Overhaul

The Kimi Chat mobile app has received a significant update. Version 1.2.1 overhauls the user interface and introduces a "Lunar Bright Side" light mode for a more comfortable and intuitive experience. The update covers interface improvements, performance optimization, memory management, battery efficiency, feature enhancements, security and compatibility improvements, bug fixes, localization support, and accessibility features. Users can try the new features by updating to version 1.2.1.

【AiBase Summary:】

🎨 Interface improvements: A redesigned interface that is attractive, easy to use, and more intuitive to operate.

⚡ Performance optimization: Faster, smoother responses with less lag and delay.

🔒 Security improvements: Stronger application security and better protection of user data and privacy.

Details link: https://top.aibase.com/tool/kimi-chat

4. DomoAI Adds Four New Styles: LEGO, American Comics, Colored Pencils, and Pixel Art

DomoAI has added four new styles: LEGO, American comics, colored pencils, and pixel art. To celebrate its Twitter account surpassing 10,000 followers, it is offering new users 15 free credits. DomoAI previously introduced a video chroma keying function that lets users extract characters and place them on a new background. Users can also customize the background color and create dance videos, among other features.

【AiBase Summary:】

🎨 DomoAI has added four new styles: LEGO, American comics, colored pencils, and pixel art.

🔑 New users receive 15 free credits to try the service.

💃 Users can use the /move command to turn static photos into dynamic videos.

Details link: https://top.aibase.com/tool/domoai

5. Apple Plans to Collaborate with OpenAI to Enhance iPhone AI Features

Apple is seeking to collaborate with OpenAI to enhance the AI features of the iPhone. After the departure of the former machine learning director, Apple's AI development has been struggling. Apple may introduce new generative AI products before the Worldwide Developers Conference.

【AiBase Summary:】

📌 Apple is seeking to collaborate with OpenAI to enhance the AI features of the iPhone.

📌 After the departure of the former machine learning director, Apple's AI development has been struggling.

📌 Apple may introduce new generative AI products before the Worldwide Developers Conference.

6. Google Launches AI English Conversation Practice Feature

Google has recently launched an AI voice conversation practice feature that lets users practice English conversation with a chatbot on their phones. The feature is currently limited to certain countries, though Google may expand it to more. It offers conversation practice and feedback, and reflects Google's ongoing work in AI-assisted language learning.

【AiBase Summary:】

🎙️ Google launches an AI voice conversation practice feature that lets users practice English conversation with a chatbot on their phones.

🌐 Currently, the feature is limited to certain countries but may be expanded to more countries.

💬 Although lacking the course settings of applications like Duolingo, it provides conversation practice and feedback features.

7. XVERSE Releases Its First Multimodal Large Model, XVERSE-V

XVERSE-V, the first multimodal large model released by XVERSE, has performed strongly in multiple authoritative evaluations, showing well-rounded capabilities. The model combines global and local image information, improving the accuracy and completeness of image recognition and analysis. Beyond image recognition, XVERSE-V also does well in practical applications such as infographic understanding, assistance for visually impaired users, text generation, and solving educational problems.

【AiBase Summary:】

🌟 XVERSE-V is XVERSE's first multimodal large model and supports image input of arbitrary aspect ratios.

🔍 The model performed strongly across multiple authoritative evaluations of overall capability.

💡 XVERSE-V combines global and local image information, improving the accuracy and completeness of image recognition and analysis.

Details link: https://huggingface.co/xverse/XVERSE-V-13B

8. Perplexica: An Open-Source AI-Powered Question-Answering Search Engine

Perplexica is an open-source AI-powered search engine that offers multiple search modes, aiming to give users more accurate and intelligent search results. It uses machine learning techniques to refine results while protecting user privacy and surfacing up-to-date information. Perplexica aims to become a comprehensive and efficient search solution.

【AiBase Summary:】

🔍 Offers multiple search modes, adjusting the search algorithm to user needs for more relevant results.

🔍 Refines search results with machine learning techniques, including similarity search and embeddings.

🔍 Protects privacy and relies on the SearxNG metasearch engine to keep results current, avoiding the overhead of daily data updates.

Details link: https://top.aibase.com/tool/perplexica
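
To illustrate what "similarity search and embeddings" can mean when refining search results, here is a minimal sketch that reranks candidate results by cosine similarity to the query. It is not Perplexica's actual code: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and all function names and sample results are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query, documents):
    """Order documents from most to least similar to the query."""
    query_vec = embed(query)
    return sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, embed(doc)),
        reverse=True,
    )


# Hypothetical raw results, e.g. as returned by a metasearch backend.
results = [
    "Recipes for quick weeknight dinners",
    "Open source search engine with AI answers",
    "How an AI search engine ranks sources",
]
ranked = rerank("open source AI search engine", results)
for title in ranked:
    print(title)
```

A production system would swap the toy `embed` for a learned embedding model, but the reranking step itself has the same shape.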

9. Meta Introduces LayerSkip to Enhance Large Language Model Inference Speed

Meta's new LayerSkip technique aims to speed up large language model inference by optimizing the inference process, reducing compute consumption while maintaining model performance. This matters most for applications with strict real-time requirements and reflects Meta's continued investment in AI model efficiency. LayerSkip could open up more options for deploying large language models, especially where large volumes of text must be processed quickly.

【AiBase Summary:】

🚀 LayerSkip has improved the inference speed by 2.16 times in the CNN/DM document summarization task, significantly enhancing document processing efficiency.

⚡ LayerSkip has achieved a 1.82 times speed improvement in programming tasks, potentially optimizing the performance of programming assistant tools.

💡 LayerSkip has improved the inference speed by 2.0 times in the TOPv2 semantic parsing task, having an important impact on semantic parsing and other natural language processing tasks.

Paper: https://huggingface.co/papers/2404.16710
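
The item above does not spell out how LayerSkip obtains its speedups. The linked paper describes drafting tokens from an early exit of the network and then verifying them with the remaining layers (self-speculative decoding). The following is a toy sketch of that general draft-then-verify loop only: `full_model` and `draft_model` are hypothetical arithmetic stand-ins, not neural networks, and none of this is Meta's implementation.

```python
def full_model(prefix):
    """Toy stand-in for the full network: next token is (sum of prefix) mod 7."""
    return sum(prefix) % 7


def draft_model(prefix):
    """Toy stand-in for an early exit: cheap, but wrong when len(prefix) % 4 == 0."""
    guess = full_model(prefix)
    return (guess + 1) % 7 if len(prefix) % 4 == 0 else guess


def greedy_decode(prompt, num_tokens):
    """Baseline: one full-model step per generated token."""
    tokens = list(prompt)
    for _ in range(num_tokens):
        tokens.append(full_model(tokens))
    return tokens


def self_speculative_decode(prompt, num_tokens, draft_len=3):
    """Draft a few tokens cheaply, then keep only what the full model agrees with."""
    tokens = list(prompt)
    target_len = len(prompt) + num_tokens
    verify_rounds = 0
    while len(tokens) < target_len:
        # 1) Draft: propose up to draft_len tokens with the cheap model.
        draft = []
        for _ in range(min(draft_len, target_len - len(tokens))):
            draft.append(draft_model(tokens + draft))
        # 2) Verify: a real system would score all drafted positions in one
        #    batched pass; here each position is checked in turn.
        verify_rounds += 1
        for i, proposed in enumerate(draft):
            expected = full_model(tokens + draft[:i])
            if proposed != expected:
                # Keep the agreeing prefix, then take the full model's token.
                tokens.extend(draft[:i])
                tokens.append(expected)
                break
        else:
            tokens.extend(draft)
    return tokens, verify_rounds


decoded, rounds = self_speculative_decode([1, 2, 3], num_tokens=10)
print(decoded == greedy_decode([1, 2, 3], 10), rounds)
```

The output matches plain greedy decoding because every accepted token is one the full model would have produced; any speedup comes from needing fewer verification rounds than generated tokens, which depends on how often the draft is right.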

10. Survey Shows: 1/3 of Translators and 1/4 of Illustrators Have Lost Work Due to AI

The rapid development of AI technology has had a profound impact on the creative industry. A survey by the UK's Society of Authors reveals how AI is affecting writers, translators, illustrators, and other creative professionals, sparking concern and calls for action within the industry.

【AiBase Summary:】

🤖 About one-fifth of creators have used generative AI in their work, with AI technology beginning to permeate various creative fields.

💼 1/4 of illustrators and 1/3 of translators report having lost work to generative AI, a direct threat to their job opportunities.

💰 The majority of novelists and non-fiction writers are concerned that AI technology will negatively impact future creative work income and strongly call for copyright protection and government regulation.

11. WebLlama: An Intelligent Web Browsing Agent Based on Llama-3-8B

WebLlama is an intelligent agent tool based on the Llama-3-8B model, interacting with users through dialogue and performing web browsing-related tasks. It can handle continuous dialogue, understand user instructions, and automatically complete online search, navigation, and information retrieval operations. WebLlama demonstrates strong dialogue processing capabilities and web interaction functions, improving the efficiency of users in obtaining information and reducing the need for manual operations. It performs excellently in professional benchmark tests, with advanced and practical features, and is expected to play a greater role in automated web browsing and information collection.

【AiBase Summary:】

🗣️ Dialogue understanding: Can listen to user instructions and interact with users.

🌐 Automatic web browsing: Performs searches, navigation, and helps users obtain information.

🤖 Completes complex tasks: Can perform practical application tasks such as hotel reservations, shopping, or information searches.

Details link: https://top.aibase.com/tool/webllama

12. Mutable AI Releases Auto Wiki v2: Converts Code into Wikipedia-Style Articles

Mutable AI's Auto Wiki v2 automatically converts code into Wikipedia-style articles, addressing the long-standing challenge of code documentation. It generates clear written descriptions of a codebase, adds visual aids that make the code easier to understand, and improves development efficiency.

【AiBase Summary:】

🤖 Automatically converts code into Wikipedia-style articles, addressing the challenge of code documentation.

📝 Automatically generates clear descriptions of the code, with visual aids that make it easier to understand.

⚙️ Provides features such as code diagrams and automatic documentation updates, improving development efficiency.

Details link: https://top.aibase.com/tool/mutable

13. Cog-Become-Image: Transforms Any Character Image into a Specified Style

The Cog-Become-Image project is an image conversion tool that can transform the facial image of any character into another style. The project has broad application prospects in art creation, media production, and entertainment, opening up new possibilities for image conversion. Both professional developers and technology enthusiasts can use it to create stylized images.