AI Daily: Microsoft Launches AI Models for iPhones; China's First AI Voice Infringement Case Decided; Kimi Founder Cashes Out Millions; Llama3 Chinese Chat Model Released

Welcome to the [AI Daily] column, your daily guide to the world of artificial intelligence. Every day we bring you the hottest news in the AI field with a focus on developers, helping you follow technical trends and discover innovative AI products.

Explore the latest AI products here: https://top.aibase.com/

1. Tencent upgrades its SaaS products with full integration of the Hunyuan model

Tencent has announced that its collaborative SaaS products are now fully integrated with the Hunyuan model, making its software services intelligent. Products such as Tencent Lexia, Tencent E-Sign, and Tencent Questionnaire have been upgraded to provide smarter and more efficient services. The Hunyuan large model has been scaled to the trillion-parameter level and is among the first in China to adopt a Mixture-of-Experts (MoE) architecture. External developers and enterprises can call Hunyuan's capabilities directly through Tencent Cloud's API to address pain points in their own scenarios.

【AiBase Summary:】
🚀 Tencent's SaaS products have achieved intelligent upgrades, offering smarter and more efficient services.
💡 The Hunyuan large model has been expanded to a trillion-level parameter scale, with excellent performance in multiple aspects.
🔗 External developers and enterprises can call Hunyuan capabilities through Tencent Cloud's API, solving pain points in different scenarios.

2. Microsoft releases the Phi-3 series, ChatGPT-level AI models that can run on an iPhone, challenging OpenAI's position

Microsoft's latest Phi-3 series of small AI models has caused a stir in the AI field, especially the Phi-3-mini model, which outperforms the larger Llama3 model on multiple benchmarks. The models can run at 12 tokens per second on the iPhone 14 Pro and iPhone 15 while delivering ChatGPT-level quality. Microsoft emphasizes the importance of training data, attributing the models' performance to carefully designed data and training methods.

【AiBase Summary:】
🚀 The Phi-3-mini model, with only 3.8B parameters, outperforms the 8B parameter Llama3 model.
💡 The Phi-3 series includes the Phi-3-small and Phi-3-medium versions, with superior performance.
🔍 The Microsoft team has improved the performance of the Phi-3 series models through carefully designed training data and unique training methods.
Details link: https://arxiv.org/pdf/2404.14219.pdf
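
To put the figures above in perspective, here is a rough back-of-the-envelope sketch of what 3.8B parameters and 12 tokens per second mean in practice. The parameter count and speed come from the item above; the bit widths (16-bit and 4-bit weights) and the 240-token reply length are assumptions chosen for illustration, and the estimate covers model weights only, not activations or cache.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def generation_seconds(n_tokens: int, tokens_per_second: float) -> float:
    """Time to generate n_tokens at a steady decoding speed."""
    return n_tokens / tokens_per_second


PHI3_MINI_PARAMS = 3.8e9   # from the summary above
TOKENS_PER_SECOND = 12.0   # from the summary above

if __name__ == "__main__":
    # Bit widths below are illustrative assumptions, not figures from this article.
    for bits in (16, 4):
        gib = weight_memory_gib(PHI3_MINI_PARAMS, bits)
        print(f"{bits}-bit weights: about {gib:.2f} GiB")
    # A hypothetical 240-token reply at 12 tokens per second.
    print(f"240 tokens: {generation_seconds(240, TOKENS_PER_SECOND):.0f} s")
```

Under these assumptions, lower-precision weights are what bring the model within the memory budget of a phone.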

3. China's first AI voice infringement case is decided: 250,000 yuan in compensation for a voice that was cloned by AI and sold

The first-instance judgment in China's first AI voice infringement case, which concerned the misuse of a voice actor's voice through AI technology, has attracted widespread public attention. The court ruled that the defendants' unauthorized use of the voice actor's voice to develop AI products constituted infringement and ordered compensation of 250,000 yuan. The judgment emphasizes that a person's voice, as a personality right, is protected by law, giving voice creators an important legal safeguard.

【AiBase Summary:】
🔍 First-instance judgment in the AI voice infringement case: the defendants used the voice actor's voice to develop AI products without authorization and must pay 250,000 yuan in compensation.
💡 The court emphasizes that voice, as a distinct personality right, is protected by law, and that unauthorized use of a voice constitutes infringement.
👩‍⚖️ The judgment gives voice creators important legal protection, firmly safeguarding voice rights and cracking down on infringement.

4. The Chinese chat model Llama3-8B-Chinese-Chat is released

Llama3-8B-Chinese-Chat is a Chinese chat model built on Meta-Llama-3-8B-Instruct and fine-tuned with the ORPO method. The fine-tuning reduces mixed Chinese-English responses and the use of emojis, making answers more formal and professional. The model performs well at understanding the intent of Chinese questions, giving appropriate responses, and refusing improper requests.

【AiBase Summary:】
🔑 Llama3-8B-Chinese-Chat is a Chinese chat model based on Meta-Llama-3-8B-Instruct and fine-tuned with the ORPO method, reducing mixed Chinese-English responses and the use of emojis.
🌟 The ORPO method uses an odds ratio to steer the model toward preferred responses and away from rejected ones; Llama3-8B-Chinese-Chat uses it to optimize its Chinese-English generation preferences.
💡 The Llama3-8B-Chinese-Chat model performs well in safety, ethics, mathematical problem-solving, writing, and programming examples, providing more accurate and professional responses and example code.
Details link: https://top.aibase.com/tool/llama3-8b-chinese-chat
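
The odds-ratio idea behind ORPO can be sketched in a few lines. This is a minimal illustration of the commonly described ORPO objective (a supervised loss on the preferred answer plus a weighted odds-ratio penalty), not the training code used for Llama3-8B-Chinese-Chat. The function names, the `lam` weight, and the example probabilities are assumptions made for illustration; inputs are length-normalized log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response.

```python
import math


def log_odds(avg_log_prob: float) -> float:
    """log(p / (1 - p)) for p = exp(avg_log_prob); requires avg_log_prob < 0."""
    p = math.exp(avg_log_prob)
    return avg_log_prob - math.log1p(-p)


def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """Supervised loss on the chosen answer plus a weighted odds-ratio penalty."""
    sft_term = -logp_chosen
    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # Hypothetical likelihoods: a pure-Chinese answer (chosen) versus a
    # mixed Chinese-English answer (rejected).
    good = orpo_loss(math.log(0.6), math.log(0.2))
    bad = orpo_loss(math.log(0.2), math.log(0.6))
    print(f"model already prefers the chosen answer: loss = {good:.3f}")
    print(f"model prefers the rejected answer:       loss = {bad:.3f}")
```

The loss is lower when the model already assigns higher odds to the chosen response, so minimizing it pushes generation toward the preferred style.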

5. Adobe releases the video super-resolution project VideoGigaGAN

Adobe has recently launched VideoGigaGAN, a video super-resolution project that marks significant progress in video upscaling. It can enlarge videos to eight times their original resolution while maintaining temporal coherence and sharp high-frequency detail, broadening both the range of applications and the achievable quality of video content.

【AiBase Summary:】
✨ VideoGigaGAN can upscale videos to eight times their original resolution while maintaining temporal coherence and high-frequency detail clarity.
🔍 Adobe has optimized the GigaGAN model to enhance video stability, showcasing superior performance.
💡 VideoGigaGAN improves video visual quality, adapting to different styles of video content, with broad application potential.
Details link: https://top.aibase.com/tool/videogigagan

6. Midjourney releases the random feature, capable of generating completely random image styles based on prompts

Midjourney has released an interesting feature that can generate completely random image styles based on prompts. Users can explore different creative directions through randomly generated image styles and can also communicate and share in real-time with other users, discussing inspiration and ideas during the creative process. The launch of this feature will further enrich users' image generation experience, providing them with more creative choices and a communication platform.

【AiBase Summary:】
⚙️ Can generate completely random image styles based on prompts
💬 Users can communicate and share in real-time through the Room feature
🎨 Explore different creative directions, enriching users' image generation experience

7. Yang Zhilin, founder of AI unicorn Moonshot AI, cashes out tens of millions of dollars; the company responds

Yang Zhilin, the founder of Moonshot AI, has cashed out tens of millions of dollars by selling personal shares, attracting widespread attention. The company, founded just a year ago, has raised substantial financing and is valued at more than $2.5 billion. Moonshot AI's success is reflected not only in its valuation but also in its flagship product, Kimi Chat.

【AiBase Summary:】
🚀 Moonshot AI founder Yang Zhilin cashes out tens of millions of dollars by selling personal shares; the company is valued at more than $2.5 billion.
💡 Moonshot AI has risen rapidly within a year, becoming one of China's unicorns in the large model field.
💬 Moonshot AI's flagship product, Kimi Chat, stands out in the large model field with its "long text" capability, triggering a frenzy in the capital market.

8. Zuckerberg says without hesitation that he is willing to open-source a $10 billion model, and that AGI cannot be achieved before 2025

In a recent podcast interview, Zuckerberg cast himself as a champion of open source, expressing his willingness to open-source a model worth $10 billion. He argued that open source reduces costs and promotes innovation, although the economic pros and cons must also be weighed. He is pessimistic about AGI being realized before 2025: in his view, energy shortage is the bottleneck, and solving it may take decades. Together with data center challenges, these constraints leave him reserved about how much AI model capabilities will improve in the near future. He also criticized Apple and Google for monopolizing the mobile ecosystem and hopes that open source can change the situation and guard against competitive threats.

【AiBase Summary:】
💡 Zuckerberg is willing to open-source a model worth $10 billion, believing that open-source reduces costs and promotes innovation, but needs to consider the economic pros and cons.
💡 Pessimistic about the realization of AGI before 2025, believing that energy shortage is the bottleneck, and solving it may take decades.
💡 Criticizes Apple and Google for monopolizing the mobile ecosystem, hoping to change the situation through open-source, and prevent competitive threats.

9. ByteDance releases the image model distillation algorithm Hyper-SD

ByteDance's Lightning team has introduced Hyper-SD, a new distillation algorithm for image generation models. It speeds up inference by reducing the number of sampling steps while limiting the loss in output quality and keeping the model simple.

【AiBase Summary:】
⚙️ Trajectory segmented consistency distillation: Hyper-SD distills the model segment by segment so that the original ODE trajectory is preserved.
🧠 Human feedback learning mechanism: Introduces human feedback learning to improve model performance and reduce performance loss.
🔬 Score distillation technology: Enhances the model's generative ability at low step inference, further improving performance.
Details link: https://top.aibase.com/tool/hyper-sd

10. AI music generation tool AI Jukebox, enter prompts and select a style to create music

AI Jukebox is a music generation tool utilizing artificial intelligence technology, providing services through the Hugging Face platform. It simplifies the music creation process, making it intelligent and user-friendly. Users can guide AI to generate music in specific styles by entering prompts, achieving intelligent music creation. AI Jukebox encourages a human-machine collaboration model, providing inspiration and creative tools for musicians and music enthusiasts, exploring infinite possibilities.

【AiBase Summary:】
🎵 Localized model loading: After opening the AI Jukebox webpage, the system automatically loads the generative model without complex settings.
🎶 Music generation based on prompts: Users guide AI to generate music in specific styles by entering specific prompts, including descriptions of music types, emotions, instruments, etc.
🎼 Human-machine collaboration model: AI Jukebox encourages users to collaborate with AI, exploring new ways of music creation, providing inspiration and creative tools.
Details link: https://top.aibase.com/tool/ai-jukebox

11. Live2D virtual human chat system

This project is a virtual human chat system built with Unity. It uses Live2D technology to present animated virtual characters with smooth motion, improving the user's interaction experience. The project integrates the Azure, OpenAI, and APISpace APIs for natural language processing and generation, enabling real-time text conversation. It also supports image processing and facial detection, high-resolution display, and custom extensions.

【AiBase Summary:】
👩‍💻 Integrated Live2D virtual human images, providing smooth animation effects, enhancing user experience.
💬 Real-time chat functionality, virtual humans can understand and respond to user text input, enabling real-time communication.
🔍 Image processing and facial detection, allowing virtual humans to better respond to user visual input.
Details link: https://top.aibase.com/tool/live2d-virtual-human-for-chatting-based-on-unity

12. The University of Hong Kong and Zhejiang University jointly develop the SC-GS model

This article introduces the SC-GS model proposed by the joint research team of the University of Hong Kong's CVMI Laboratory, 3D large model company VAST, and Zhejiang University, which has made breakthrough achievements in the field of digital asset creation and 3D reconstruction. Through real-time interactive editing of sparse control points, it achieves efficient editing and synthesis of dynamic scenes, demonstrating great potential.

【AiBase Summary:】
🌟 The SC-GS model advances novel view synthesis, enabling real-time interactive editing of dynamic Gaussians through sparse control points.
🔑 Users can easily edit reconstructed dynamic scenes with simple mouse dragging and keyboard shortcuts.
💡 A neural network predicts the motion of the control points, which in turn drive the deformation of the dynamic Gaussians throughout the scene, improving dynamic novel view synthesis.
Details link: https://top.aibase.com/tool/sc-gs
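
The core idea of letting a few control points drive many scene points can be sketched as follows. This is a deliberately simplified illustration, not the SC-GS implementation: it interpolates plain translations with distance-based weights, the control point offsets are supplied by hand rather than predicted by a neural network, and the function names and `radius` parameter are hypothetical.

```python
import math


def blend_weights(point, controls, radius):
    """Normalized Gaussian weights of each control point for one scene point."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, ctrl)) for ctrl in controls]
    nearest = min(d2)  # subtract the minimum to avoid underflow far from all controls
    raw = [math.exp(-(d - nearest) / (2 * radius ** 2)) for d in d2]
    total = sum(raw)
    return [w / total for w in raw]


def deform(points, controls, offsets, radius=0.5):
    """Move every scene point by the weighted blend of control point offsets."""
    moved = []
    for pt in points:
        weights = blend_weights(pt, controls, radius)
        delta = [sum(w * off[axis] for w, off in zip(weights, offsets))
                 for axis in range(len(pt))]
        moved.append([x + dx for x, dx in zip(pt, delta)])
    return moved


if __name__ == "__main__":
    controls = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    # "Drag" the first control point upward; leave the second in place.
    offsets = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    points = [[0.1, 0.0, 0.0], [0.5, 0.0, 0.0], [0.9, 0.0, 0.0]]
    for before, after in zip(points, deform(points, controls, offsets)):
        print(before, "->", [round(v, 3) for v in after])
```

Points near the dragged control point follow it closely, while points near the fixed one barely move, which is why editing a handful of control points is enough to reshape a dense scene.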

13. New SAM-based video segmentation technology efficiently identifies moving objects

A research team has explored new approaches to video object segmentation, improving performance by combining the SAM model with optical flow. Two models built on this idea have achieved significant performance gains, and the approach extends segmentation across the entire video sequence, enabling object tracking. These techniques improve the accuracy and efficiency of video segmentation while reducing computational complexity, which matters for many application scenarios.