Daily AI Digest: Baidu Unveils Wenxin Large Model Learning Machine; Open-Sora 1.1 Generates 21-Second Videos; Canva Introduces Automated Video Editing; The Versatile ComfyUI Node, AnyNode, Arrives

In this AI update, we bring you the latest innovations. Baidu has launched a learning machine built on its Wenxin Large Model, offering AI-assisted teaching for children. Open-Sora 1.1 can now generate videos of roughly 21 seconds. Canva adds automated video editing features that simplify content creation. Lastly, AnyNode arrives in ComfyUI, letting users describe the node they need in a prompt. Stay tuned for more developments in the AI world.

Welcome to the AI Daily column! This is your daily guide to the world of artificial intelligence. Every day, we bring you the hottest content in the AI field, with a focus on developers, to help you keep up with technical trends and innovative AI product applications.

Discover the latest AI products. Click to learn more: https://top.aibase.com/

1. Baidu's Wenxin Large Model Learning Machine "Xiaodu Learning Machine Z30" Launched

This article introduces the Z30, Xiaodu Technology's first learning machine based on the Wenxin Large Model, which aims to redefine the AI teacher. It uses AI to encourage children to learn actively and to reduce the supervision burden on parents. The learning machine offers a high-quality display, content restructured for multiple learning stages, and structured, efficient learning plans. The AI teacher provides rich educational content and supports multi-round interaction and personalization, making learning more lively, interesting, and targeted.

【AiBase Summary:】

📚 Xiaodu Z30 Learning Machine uses Wenxin Large Model technology to vividly recreate textbook scenarios, helping children better understand and master knowledge.

👩‍🏫 Xiaodu Z30 supports interactive AI lessons and personalized after-class practice, tailoring study plans and answering questions on demand across all subjects and grades.

💡 The AI teacher, based on the Wenxin Knowledge Enhancement Large Model, provides authoritative and rich educational content, supporting multi-modal perception and human-like presentation, making learning more vivid and interesting.

2. Open-Sora 1.1 Release: Improved Video Quality, Extended Generation Time to 21 Seconds

The Open-Sora 1.1 release brings significant improvements, especially in video generation quality and duration. The new model can generate videos up to approximately 21 seconds long. It is trained on higher-quality visual data and captions, which improves its understanding of how the world works. The optimized CausalVideoVAE architecture improves performance and inference efficiency.

【AiBase Summary:】

🚀 Improved video generation quality and duration: The new version can generate videos up to approximately 21 seconds long.

🔍 High-quality visual data and caption training: Training on higher-quality visual data and captions improves the model's understanding of how the world works.

⚙️ CausalVideoVAE architecture optimization: The optimized architecture improves video generation performance and inference efficiency, following an approach similar to the Sora base model.

Details link: https://top.aibase.com/tool/open-sora-plan-v1-1-0

Online experience address: https://huggingface.co/spaces/LanguageBind/Open-Sora-Plan-v1.1.0

3. Canva Launches Multiple Useful Features: Text-Generated Illustrations, Automatic Editing of Video Highlight Clips

Canva's latest update aims to increase the platform's appeal to professional teams and workplaces. It includes a redesign of the platform, the launch of an enterprise version, and new AI style matching features. The update covers multiple functions and improvements that make design work more efficient and convenient.

【AiBase Summary:】

🎨 Canva launches an enterprise version, offering more collaboration and security controls.

🌟 A new AI style matching feature, plus customizable folder display in the workspace.

💡 Canva Docs introduces suggestion mode and colored highlighting blocks, providing editing suggestions and highlighting text areas.

4. Impressive ComfyUI Node AnyNode is Here! Let AI Help You Write Any Function

AnyNode is a new node for ComfyUI that uses large language models (LLMs) to let users create nodes with specific functions by entering a prompt. It gives users a tool for a wide range of operations in ComfyUI and greatly expands what is possible when building custom nodes and workflows. It is a valuable resource for users who want advanced functions and task automation.

【AiBase Summary:】

🔑 Function writing: Users describe what they need, and the LLM writes a Python function that implements it.

🔧 Flexibility: Users can prompt the LLM to write different types of nodes, such as text summarization, color channel adjustment, or Instagram-style filter effects.

🔗 Compatibility with ComfyUI: AnyNode, as a node in ComfyUI, can be used in conjunction with other nodes, linking to the required output format nodes.

Details link: https://top.aibase.com/tool/anynode
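
To make the "color channel adjustment" case concrete, here is a minimal sketch of the kind of Python function an LLM might return for a prompt such as "boost the red channel by 20%". Everything in it is an assumption made for illustration: the function name `scale_channel`, its parameters, and the image representation (rows of RGB tuples with values in 0.0 to 1.0) are hypothetical and do not reflect AnyNode's actual output or ComfyUI's real tensor format.

```python
# Hypothetical illustration only: not AnyNode's API or actual generated code.
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (r, g, b), each value in [0.0, 1.0]
Image = List[List[Pixel]]           # rows of pixels


def scale_channel(image: Image, channel: int, factor: float) -> Image:
    """Return a copy of `image` with one color channel multiplied by `factor`.

    Results are clamped to [0.0, 1.0] so the output remains a valid image.
    """
    if channel not in (0, 1, 2):
        raise ValueError("channel must be 0 (red), 1 (green), or 2 (blue)")
    result: Image = []
    for row in image:
        new_row = []
        for pixel in row:
            values = list(pixel)
            # Scale only the selected channel, then clamp it.
            values[channel] = min(1.0, max(0.0, values[channel] * factor))
            new_row.append(tuple(values))
        result.append(new_row)
    return result


if __name__ == "__main__":
    tiny = [[(0.5, 0.5, 0.5), (0.9, 0.1, 0.0)]]
    print(scale_channel(tiny, channel=0, factor=1.2))
```

A real node would also need to convert between this structure and whatever image type the neighboring nodes expect, which is the glue a node like AnyNode is described as handling.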

5. ChatTTS: A Speech Generation Model Designed for Dialogue Scenarios

ChatTTS is a speech generation model designed for dialogue scenarios. It supports both Chinese and English and, thanks to training on a large amount of data, produces high-quality, natural-sounding speech. The project team plans to open source a base model, with a focus on the model's controllability and security. Users should read the disclaimer before using ChatTTS. The open-source model is expected to create new learning and innovation opportunities for the community.

【AiBase Summary:】

🔑 Wide range of applications: ChatTTS is suited to dialogue tasks for large language model assistants, conversational audio, and video introductions, providing a natural and smooth interactive experience.

🔑 Multilingual support: The model supports Chinese and English, overcoming language barriers and serving a wider user base.

🔑 Open source plan: The team plans to open source a base model to support research and development in the academic and developer communities.

Details link: https://www.bilibili.com/video/BV1zn4y1o7iV/

Project address: https://top.aibase.com/tool/chattts

6. Video Inpainting Project ProPainter Gets a ComfyUI Node

This article introduces the ComfyUI_ProPainter_Nodes project, a video inpainting (repair) solution based on the ProPainter framework and implemented as ComfyUI nodes. The project supports advanced video frame editing and seamless video inpainting, bringing a powerful tool to video editing and restoration.

【AiBase Summary:】

🔧 Based on the ProPainter framework, supporting advanced video frame editing and seamless video inpainting.

🎨 Provides video inpainting nodes, including ProPainter nodes and ProPainter Outpainting nodes.

📚 Provides detailed installation instructions, workflow examples, and recommended nodes, with rich input parameters and node reference information.

Details link: https://top.aibase.com/tool/comfyui-propainter-nodes

7. AI Can Now Generate Sign Language Videos! SignLLM Generates Sign Language Videos from Text Descriptions

SignLLM is a multilingual sign language model that offers an important communication tool for deaf and hard-of-hearing people and advances AI research in language understanding and generation. It generates sign language videos from text descriptions, supporting barrier-free communication.

【AiBase Summary:】

🤖 Text-to-sign language video conversion: The SignLLM model can convert text into sign language gesture videos, which are intuitive and easy to understand.

🌍 Supports multiple sign languages: Including American Sign Language (ASL), German Sign Language (GSL), and eight other sign languages, with wide applicability.

📊 Pioneering multilingual sign language dataset: The work introduces the Prompt2Sign dataset, which is crucial for model training and development.

Details link: https://top.aibase.com/tool/signllm

8. Noise-Cancelling Breakthrough "Look Once to Hear": Just Look at a Speaker to Isolate Their Voice

This article introduces "Look Once to Hear," an advanced noise-cancelling headphone technology that decides which sounds to amplify or suppress based on where the user looks. It helps users focus in noisy environments by letting them hear the target speaker's voice clearly. The technology may combine artificial intelligence, machine learning, and sound processing algorithms, and could be especially useful for people with hearing impairments.

【AiBase Summary:】

👀 Visual control: Users only need to briefly look at the speaker they want to listen to, and the headphones can recognize and amplify that speaker's voice.

🌆 Environment adaptability: Suitable for noisy environments such as streets and cafes, helping users hear the target speaker's voice clearly.

🚶 Use while moving: Even if the user is moving, the headphones can continue to track and play the target speaker's voice.

Details link: https://top.aibase.com/tool/lookoncetohear

9. Meta Reportedly Uses Instagram Photos to Train AI Models, Sparking Privacy Controversy

This article covers the privacy controversy over Meta reportedly using Instagram and Facebook user photos to train AI models. According to the report, the practice severely infringes on users' privacy and seriously breaches data collection regulations. Meta says users can opt out of data collection, but data collected earlier may still be retained in the AI model, which has raised concern. Enterprises should strengthen their data privacy and protection compliance measures to ensure that users' personal information is handled and protected properly.

【AiBase Summary:】

📷 Meta reportedly uses user photos to train AI models, which the report says violates data collection regulations

🔍 Users' privacy is infringed upon, sparking widespread concern

❌ Users can opt out of data collection, but data collected earlier may still be retained in the AI model

10. Better Than Lip-Syncing! InstructAvatar: Generate Facial Expressions and Actions for Specified Faces from Text

InstructAvatar is a new talking-avatar generation model that uses text guidance to achieve fine-grained control over emotions and facial motion, improving the interactivity and generalization of virtual avatars. The model outperforms existing methods in fine-grained emotion control, lip-sync quality, and naturalness, and it can produce specified facial expressions and actions, marking significant progress in generating emotionally expressive 2D avatars.

【AiBase Summary:】

👄 InstructAvatar achieves fine-grained control over emotions and facial motion, improving the interactivity and generalization of virtual avatars.

😊 Experimental results show that InstructAvatar outperforms existing methods in fine-grained emotion control, lip-sync quality, and naturalness, and can produce specified facial expressions and actions.

🎭 InstructAvatar's framework includes a variational autoencoder (VAE) and a diffusion-based motion generator, giving fine-grained control over generated videos through a natural language interface.

Details link: https://top.aibase.com/tool/instructavatar

11. Elon Musk's xAI Raises $6 Billion in Funding, Valuation Reaches $24 Billion

Elon Musk's xAI has raised $6 billion in a Series B funding round, bringing the company's valuation to $24 billion. The company is dedicated to developing advanced artificial intelligence systems that are truthful, capable, and maximally beneficial to all of humanity. The new funding will be used to launch its first product, build advanced infrastructure, and accelerate research and development of future technologies.

【AiBase Summary:】

💰 xAI raised $6 billion in a Series B funding round, at a valuation of $24 billion

🚀 The new funding will be used to launch the first product, build advanced infrastructure, and accelerate research and development of future technologies

🤖 xAI is developing advanced artificial intelligence systems that aim to be truthful, capable, and maximally beneficial to all of humanity

12. Apple's "Project Greymatter" to Bring AI Tools to Everyday Users

Apple will showcase its latest artificial intelligence advancements at the upcoming WWDC, focusing on AI tools that ordinary users can use in their daily lives. The new plan, named "Project Greymatter," will integrate AI tools into major applications and improve operating system notifications. Apple's AI strategy is to catch up with competitors while relying on its user base and its reputation for privacy to keep its market lead.