Welcome to the AI Daily section! This is your daily guide to exploring the world of artificial intelligence. Every day, we bring you the hottest content in the AI field, with a focus on developers, to help you understand technology trends and innovative AI product applications.
Check out the latest AI products here: https://top.aibase.com/
🤖📱💼AI Applications
Is Sora being replaced? The StreamingT2V AI video model, capable of generating 2-minute-long videos, is now open source and available for trial
AiBase Summary:
⭐ StreamingT2V can generate videos of up to 1,200 frames, or 2 minutes, exceeding the video length of the Sora model.
⭐ It employs advanced autoregressive techniques to maintain temporal consistency and high quality in videos.
⭐ It is a free open-source project that integrates seamlessly with models like SVD and AnimateDiff.
⭐ The code has been released, and a trial address is available. Video generation takes a long time; one video is expected to take over 13 minutes to generate.
Open-source code: https://top.aibase.com/tool/streamingt2v
Paper address: https://arxiv.org/pdf/2403.14773.pdf
Trial address 1: https://huggingface.co/spaces/PAIR/StreamingT2V
Trial address 2: https://replicate.com/camenduru/streaming-t2v
Udio AI offers multifunctional audio generation, including comedy, speeches, radio broadcasts, and more
AiBase Summary:
⭐ Udio can create music and also generate comedy, speeches, NPC dialogues, sports analysis, advertisements, radio broadcasts, ASMR, natural sound effects, and more.
⭐ Creation from text prompts: Users can guide Udio to generate music with specific themes and moods through simple text descriptions.
⭐ Wide range of genres and styles: Udio supports many music genres and styles to suit different users' musical tastes.
Interested parties can view the playlist here: https://www.udio.com/playlists/deGuVDLYd9MrXtxnxfX7z1
Experience address: https://top.aibase.com/tool/udio
Meitu Wink's "AI Anime" feature upgrade can transform short dramas into anime style
AiBase Summary:
⭐ Meitu Wink recently upgraded its "AI Anime" feature, converting short dramas into anime style.
⭐ The introduction of the CFA module optimizes action consistency, generating more fluid and natural anime videos.
⭐ Segmented processing splits long videos into chunks, reducing waiting time and making creation freer and smoother.
StableDesign: A Stable Diffusion (SD) solution for interior design that lets text prompts modify interior design drawings
AiBase Summary:
⭐️ Developers have created a project for generative interior design.
⭐️ The project downloads Airbnb listing data and image metadata, then extracts features for training.
⭐️ ControlNet and LoRA models are trained to achieve control over interior design generation and text-to-image conversion.
Online experience: https://huggingface.co/spaces/MykolaL/StableDesign
SwapAnything: A more powerful alternative to face swapping, allowing the replacement of any element in an image
AiBase Summary:
🔍 The SwapAnything framework has advantages in precise control of objects and parts, retaining context pixels, and adapting to personalized concepts.
🔍 Through targeted variable swapping and appearance adaptation techniques, SwapAnything demonstrates precise and faithful swapping capabilities.
🔍 SwapAnything can precisely control any object in an image, achieving high-quality personalized swaps.
Project entry: https://top.aibase.com/tool/swapanything
MagicTime, an AI time-lapse video generation tool, launches an online demo
AiBase Summary:
⭐ Time-lapse video is a photography technique that showcases long-term changes.
⭐ MagicTime can generate time-lapse videos based on textual descriptions.
⭐ It has a wide range of applications, such as recording natural phenomena and human-driven changes.
Project address: https://top.aibase.com/tool/magictime
Experience address: https://huggingface.co/spaces/BestWishYsh/MagicTime
Model download address: https://huggingface.co/Kijai/MagicTime-merged-fp16
STORM, an automated writing tool, can generate in-depth, Wikipedia-style long-form articles
AiBase Summary:
⭐️ STORM automatically gathers materials, simulates expert dialogues, and generates structured article outlines.
⭐️ STORM researches topics efficiently, integrates information from multiple perspectives, and asks precise questions to deepen understanding.
⭐️ After generating an article outline, STORM writes the full article and polishes it to improve overall quality.
Project address: https://top.aibase.com/tool/storm
Meta introduces the ViewDiff model: text-to-3D image generation from multiple perspectives
AiBase Summary:
🌟 ViewDiff addresses three major challenges in generating consistent, multi-perspective 3D images from text.
🌟 The autoregressive generation module allows ViewDiff to generate more 3D-consistent images at any given perspective.
🌟 ViewDiff fills the technical gap in the field of text-to-multi-perspective 3D image generation.
Paper address: https://arxiv.org/abs/2403.01807
Project address: https://top.aibase.com/tool/viewdiff
📰🤖📢AI News
Devin, billed as the first AI programmer, is accused of faking its demo and shocks Silicon Valley again! A detailed write-up of the exposé video is attached
AiBase Summary:
⭐️ A programmer on YouTube accuses Devin, billed as the first AI programmer, of faking its demo video.
⭐️ Devin's demonstration is not as miraculous as it seems: it fixes bugs while creating new ones.
⭐️ With the demo questioned and debunked, netizens scoff at the hype around AI products.
Detailed content: https://www.chinaz.com/2024/0415/1610127.shtml
Elon Musk's xAI releases the Grok-1.5Vision multi-modal model, capable of processing text and image information
AiBase Summary:
⭐️ The Grok-1.5Vision model demonstrates superior performance, surpassing GPT-4V.
⭐️ It performs excellently in the RealWorldQA benchmark test, understanding real-world physical spaces.
⭐️ The Grok-1.5Vision model has strong real-world spatial processing and understanding capabilities.
Official website address: https://top.aibase.com/tool/grok-1-5-vision-preview
360 Zhinao's 7B-parameter large model is officially open-sourced, supporting inputs of up to approximately 500,000 characters
AiBase Summary:
🧠 360 Zhinao's 7B-parameter large model is officially open-sourced.
🧩 It comes in versions with different context lengths, the longest of which can process 360K-length texts.
🔥 Performs exceptionally in capability tests, ranking among the top three in comprehensive capabilities.
Project address: https://github.com/Qihoo360/360zhinao
Adobe's image generation AI "Firefly" has about 5% of its training set as AI-generated images
AiBase Summary:
⭐ Adobe Stock has begun accepting AI content, and approximately 14% of its images are AI-generated.
⭐ Scholars point out that Firefly learns from images generated by Midjourney, contradicting Adobe's claims.
⭐ Users express dissatisfaction with Adobe using their works to train Firefly.
Code and model fully open-source! Professor Jiaya Jia's team's multi-modal model Mini-Gemini tops the trending list
AiBase Summary:
⭐️ The Mini-Gemini model achieves significant results in multi-modal tasks, and its code, model, and data are all open-sourced.
⭐️ Mini-Gemini combines image understanding and generation, demonstrating excellent image reasoning capabilities.
⭐️ Using Mini-Gemini's dual-branch visual information mining method, it effectively handles high-resolution images and generates rich visual and textual content.
Project address: https://top.aibase.com/tool/mini-gemini
Trial address: https://103.170.5.190:7860/
ModelBest (面壁智能) open-sources the MiniCPM 2.0 series models, with significant enhancements in OCR capabilities
AiBase Summary:
⭐ MiniCPM-V2.0 is the most powerful edge-side multi-modal model, with strong OCR capabilities.
⭐ MiniCPM-1.2B is a base model suitable for edge scenarios, with fast inference speed and low cost.
⭐ MiniCPM-2B-128K is the smallest long-text model, capable of processing 128K text content.
MiniCPM-V2.0 address: https://github.com/OpenBMB/MiniCPM-V
MiniCPM series open-source address: https://github.com/OpenBMB/MiniCPM
MiniCPM technical blog address: https://openbmb.vercel.app/?category=Chinese+Blog
Competition heats up! ChatGPT's growth momentum wanes, with 1.77 billion global visits in March, while Claude is on the rise
AiBase Summary:
📉 Growth in ChatGPT's global visits is slowing, despite the introduction of new features.
🚀 Anthropic's Claude is thriving in the European market, intensifying competition with ChatGPT.
💥 Following the release of Claude 3, Claude continues to grow rapidly, demonstrating the potential of new products.
InstantID team introduces a new style transfer method, InstantStyle, allowing you to instantly immerse yourself in "Van Gogh's Starry Night"