AI Daily: More Stable and High Definition! Keling AI Releases Version 1.5; ByteDance Pushes Music Generation Tool; Alibaba's Tongyi Wansiang Video Generation Model Officially Launched

Welcome to the AI Daily section! This is your daily guide to exploring the world of artificial intelligence. Every day, we bring you the latest in AI, focusing on developers and helping you stay ahead of technological trends and understand innovative AI product applications.

Fresh AI Products Click to Learn More: https://top.aibase.com/

1、Keling AI Releases Version 1.5: Video Stabilization and High Clarity, Faces Remain Intact Even as People Fly

Keling AI's latest release, Version 1.5, introduces impressive new features and improvements, significantly enhancing the quantity and quality of video generation and expanding the boundaries of AI applications in creative media. The quality of the images has been greatly improved, supporting 10-second 1080p high-definition videos, with improved text responsiveness, enhanced aesthetics, and strengthened character and object consistency.

AiBase Highlights:
🚀 Version 1.5 significantly enhances video generation capabilities, supporting the simultaneous creation of up to four videos, with image-to-video functionality capable of generating 10-second 1080p high-definition videos.
🎨 Version 1.0 introduces the "Motion Brush" feature, offering more precise motion control and more vivid motion representation, expanding the creative space for video creators.
💡 The text understanding capability has been significantly improved, with Version 1.5 showing notable enhancements in image quality, dynamic performance, and compliance with text commands, with an overall effect improvement of 95%.
Details Link: https://top.aibase.com/tool/keling-ai

2、ByteDance Launches Music Generation Tool Seed-Music with Diverse Input and Precise Control

Recently, ByteDance introduced a new music creation tool, Seed-Music, allowing users to generate music through various methods such as text descriptions, audio references, sheet music, or even voice prompts. This magical model combines autoregressive language models and diffusion models to produce high-quality music compositions while offering precise control. Users can match lyrics with music, adapt melodies, or even upload voice clips to convert them into singing, featuring powerful and efficient functionalities.

AiBase Highlights:
🎵 Seed-Music combines autoregressive language models and diffusion models to generate high-quality music compositions, allowing users to precisely control music details.
🎶 Features include vocal and instrumental generation, voice synthesis, voice conversion, and music editing, meeting diverse user needs.
🎼 The Seed-Music architecture is divided into representation learning, generation, and rendering modules, generating high-quality music through multi-modal input.
Details Link: https://team.doubao.com/en/special/seed-music

3、Alibaba's Tongyi Qianwen Open-Sources Qwen2.5 Series Models: Qwen2-VL-72B on Par with GPT-4

The Tongyi Qianwen team announced the open-source release of the Qwen2.5 series models, including the general-purpose language model Qwen2.5, Qwen2.5-Coder, and Qwen2.5-Math, pre-trained on a 18T token dataset, enhancing knowledge acquisition, programming, and mathematical abilities. It supports long text processing, generating up to 8K tokens of content, and maintains support for over 29 languages. Various scale versions are provided, licensed under Apache 2.0. The Qwen2-VL-72B model performs on par with GPT-4, with significant improvements in instruction execution, long text generation, data understanding, and structured output.

AiBase Highlights:
🚀 The Qwen2.5 series models are open-sourced, including general-purpose language models and specialized domain models, enhancing knowledge acquisition, programming, and mathematical abilities.
💡 The models support long text processing, generating up to 8K tokens of content, and provide support for over 29 languages.
💻 The Qwen2-VL-72B model has achieved significant improvements in instruction execution, long text generation, data understanding, and structured output.
Details Link: https://modelscope.cn/studios/qwen/Qwen2.5

4、Alibaba's Tongyi Wanxiang Video Generation Model "AI Video" Feature Officially Launched

Alibaba's Tongyi's Tongyi Wanxiang AI video generation model is officially launched, featuring powerful visual dynamic generation capabilities, supporting the creation of various artistic styles and film-quality video content. The model optimizes the representation of Chinese elements, supports multi-language input and variable resolution generation, and is widely applicable, offering free services and audio generation capabilities, simplifying the video production process.

AiBase Highlights:
⚙️ The Tongyi Wanxiang AI video generation model features powerful visual dynamic generation capabilities, supporting various artistic styles and film-quality video content generation.
🌟 Optimizes the representation of Chinese elements, with unique advantages in generating Chinese-style content, supporting multi-language input and variable resolution generation, meeting diverse user needs.
🎬 Offers free services, supports audio generation for video content, simplifies the video production process, achieves audio-visual synchronization, and improves creative efficiency.
Details Link: https://tongyi.aliyun.com/wanxiang/wanxvideo

5、Tencent's EzAudio AI Audio Model Transforms Text into Realistic Voices

Recently, the Johns Hopkins University and Tencent AI Lab jointly launched the EzAudio model, marking a significant advancement in audio technology. The model generates high-quality audio samples through innovative architecture and technology, with broad application potential. As the technology develops, ethical and responsible use issues are becoming more prominent, and EzAudio's open research code also provides extensive testing opportunities for future risks and benefits.

AiBase Highlights:
🌟 EzAudio is a new text-to-audio generation model launched by Johns Hopkins University in collaboration with Tencent, marking a significant advancement in audio technology.
🎧 The model generates audio samples of superior quality compared to existing open-source models, with broad application potential.
⚖️ As the technology develops, ethical and responsible use issues are becoming more prominent, and EzAudio's open research code provides extensive testing opportunities for future risks and benefits.
Details Link: https://huggingface.co/spaces/OpenSound/EzAudio

6、Giant Network Launches Self-Developed Character Model GiantGPT and Voice Model BaiLing-TTS

Giant Network showcased its latest achievements in the "game + AI" field at the 2024 YUNQI Conference, including applications of large models like GiantGPT and BaiLing-TTS, as well as new technologies such as AI digital humans and the Giant Mujing AI painting platform. The company demonstrated highly optimized game business large models and a voice large model supporting multiple dialects, while also unveiling a new brand logo and opening beta applications for the AI painting platform. Giant Network also showcased high-precision real-time interactive digital human technology, expressing its commitment to深耕 the "game + AI" field.

AiBase Highlights:
🎮 GiantGPT is a vertical large model focused on the game business, combining proprietary data and internet public data for training, deeply optimizing foundational capabilities.
🗣 BaiLing-TTS is a voice large model that supports mixed speech in various Mandarin dialects, capable of generating multiple dialects.
🖌 The Giant Mujing AI painting platform is a one-stop cloud-based platform, supporting team collaboration and batch processing of art content.

7、ChatGPT's Advanced Voice Mode to Launch Fully on September 24

ChatGPT's advanced voice mode is set to be fully released on September 24, bringing users an unprecedented interactive experience. This feature generates realistic audio responses, enhancing the naturalness and immersion of human-computer interaction. The reliability of the update information has been verified, and some mobile users may experience the upgraded voice mode on September 24. The macOS version of the ChatGPT application interface has changed, with a richer voice mode interface and new convenient buttons. Some users can share more context information with ChatGPT, achieving a more coherent and personalized conversational experience.

AiBase Highlights:
⚙️ The advanced voice mode will be fully released on September 24, enhancing the interactive experience.
🔊 Generates realistic audio responses, enhancing the naturalness and immersion of human-computer interaction.
🌌 The macOS version of the ChatGPT application interface has changed, with new convenient buttons and a richer visual experience.

8、YouTube Integrates DeepMind's Veo Model, Empowering Creators' Imagination

YouTube officially announces the integration of Google DeepMind's Veo model into its short video platform YouTube Shorts, ushering in a new era of AI-driven short video creation. This move not only provides creators with unprecedented creative tools but also fundamentally changes the way users interact with the platform.

AiBase Highlights:
✨ The Dream Screen feature combines Imagen3 and Veo models to create an intelligent creative environment for creators.
🌟 YouTube ensures the transparency and credibility of AI-generated content through SynthID technology.
💡 The Made on YouTube 2024 plan introduces AI-driven creative tools such as the Inspiration Assistant and Smart Auto-Dubbing Tool, supporting content creators comprehensively.

9、2024 AI Agent Application Insight Half-Year Report: AI Apps Reach Over 66 Million Monthly Active Users

The 2024 AI Agent Application Half-Year Report shows that AI applications have over 66 million monthly active users, demonstrating the rapid development and popularization of AI technology at the application level. The report indicates that AI applications have formed eight major gameplay categories and have begun commercialization. Agent services address user needs, with WeChat ecosystem being an important channel, and the agent business model is being explored. Agents are mature in educational learning scenarios, with high usage popularity among leading agents. AI agent applications have become an important branch of the mobile internet, bringing users rich and convenient experiences and providing new impetus and direction for the industry. It is expected that AI agent applications will play a more significant role in the future.

AiBase Highlights:
📊 AI applications have over 66 million monthly active users, showing the rapid development and popularization of AI technology.
🎮 AI applications have formed eight major gameplay categories, with commercialization paths initiated.
📈 Agent services address user needs, with the WeChat ecosystem being an important channel, and the business model is being explored.

10、LinkedIn Quietly Uses User Data for AI Training, Requires Double Opt-Out

Recently, LinkedIn was found to be using user data for training generative AI models without prior notification to users. Users need to turn off the relevant options in their account settings to opt-out, but this only affects future data usage. LinkedIn also mentioned that other machine learning tools require filling out additional forms to completely opt-out of data usage.

AiBase Highlights:
🔒 LinkedIn defaults to using user data for training AI models, requiring users to actively opt-out.
✋ Users need to turn off the options in their account settings, which only affects future data usage.
📄 In addition to generative AI, LinkedIn has other machine learning tools that require filling out additional forms to completely opt-out of data usage.

11、$23 Million Funding! Fal.ai Attracts 500,000 Developers, Generating 50 Million Media Contents Daily

Fal.ai, a cloud platform focused on AI-generated audio, video, and images, recently secured $23 million in funding. The platform has attracted notable investors and numerous developers and enterprise clients, showcasing significant potential and market demand. In the future, Fal.ai will strengthen content review and model optimization efforts to better address the challenges posed by generative technology.

AiBase Highlights:
🚀 Fal.ai successfully raised $23 million, attracting multiple investors and showing great market potential.
💡 The platform focuses on providing efficient AI-generated media solutions for businesses, attracting numerous developers and enterprise clients.
🔍 Fal.ai will strengthen content review and model optimization efforts to better face the risks and challenges of generative technology.

12、Superhero of Office Software? Kingsoft WPS AI Members Surpass One Million, HarmonyOS Version Fully Launched

Kingsoft Office's WPS AI Members and Annual Paid Members have exceeded one million, demonstrating the potential and user recognition of artificial intelligence in the office field. In collaboration with Huawei, the WPS HarmonyOS version was launched, showing excellent cross-platform performance and enhancing the user experience of office work. The AI Member service layout is carefully planned, introducing AI Assistants and Linux12 Personal Edition, continuously improving user work efficiency and expanding platform coverage.

AI News

AI Daily

AI Timeline

Al Hardware

Latest Cases

Image Collection

Video Collection

Audio Collection

Content Collection

Latest Tutorials

AI Product Ranking

AI Traffic Growth Ranking

AI Traffic Decline Ranking

AI Weekly Ranking

United States

China

India

Brazil

Image Generation

Personal Assistant

Character Generation

Video Generation

AI Project Ranking

AI Project Growth Ranking

AI Developer Ranking

AI Organization Ranking

Deepseek

TTS

LLM

ChatGPT

Overview

AI Daily: More Stable and High Definition! Keling AI Releases Version 1.5; ByteDance Pushes Music Generation Tool; Alibaba's Tongyi Wansiang Video Generation Model Officially Launched

站长之家

This article is from AIbase Daily

AI News Recommendations

AI Daily: Kimi's New Audio Foundation Model Kimi-Audio; Step1X-Edit, an Open-Source Image Editing Model; Quark AI Super Box Launches - Take a Photo and Ask Quark

AI Daily: Baidu Unveils Wenxin Large Model X1Turbo and AI Open Program; OpenAI Offers Free Lightweight Deep Research; iDream Video 3.0 Internal Testing

Pixverse Launches MCP: Unlocking a New Frontier in AI Video Generation

Jidream Video 3.0 Internal Testing: Smooth Camera Work, Accurate Capture of Facial Expressions

AI Daily: OpenAI Launches gpt-image-1 Image Generation API; Nano AI Releases MCP Universal Toolbox; China Accounts for 60% of Global AI Patents

AI Daily: Tencent Releases Version 2.5 of its HunYuan 3D Generation Model; Haier Launches Image-to-Person Reference Feature; Baidu Launches Mobile Super Intelligence App, Xinxiang

Revolutionizing Video Creation! Alibaba's VACE Model Unifies Text, Image, and Video Inputs

AI Daily: Vidu Q1 Officially Launched; MCP SDK Now Supports Streaming HTTP; Douyin Bans 2.6 Million AI-Related Accounts in Q1

MAGI-1, the World's First Autoregressive Video Generation Model, Officially Launched; Swin Transformer Team Leads a New Wave in Video Creation

Top 20 AI Video Generation Companies of 2025 Announced: Keling AI, Jimeng AI, and PixVerse AI Take the Lead