Welcome to the AI Daily section! This is your daily guide to exploring the world of artificial intelligence. Every day, we bring you the hottest topics in the AI field, focusing on developers, helping you stay on top of technological trends, and understanding innovative AI product applications.
Fresh AI Products Click to Learn More: https://top.aibase.com/
📰🤖📢AI News
Stanford's Large Model Octopusv2 Runs on Mobile Devices, Surpassing GPT-4 Overnight
AiBase Highlights:
⭐️ Stanford University introduces Octopusv2, a 2-billion-parameter model that runs on devices like smartphones, outperforming GPT-4 in accuracy and latency, with a 95% reduction in context length.
⭐️ The era of device-side AI agents has arrived, with Octopusv2 innovating with a function token strategy during its development process, enhancing inference speed and delivering outstanding performance.
⭐️ Octopus-V2-2B excels in performance evaluations, with a 168% increase in speed, injecting new vitality into the development of device-side AI.
Paper: https://arxiv.org/abs/2404.01744
Model Homepage: https://huggingface.co/NexaAIDev/Octopus-v2
90s Young Man Uses AI to 'Resurrect' the Deceased, Closing 1000 Deals in a Year
AiBase Highlights:
⭐️ Zhang Zewei, a 90s young man, uses AI technology to create digital avatars of the deceased, having received over 1000 orders.
⭐️ His team reconstructs the deceased's appearance and voice, allowing clients to interact with their digital avatars.
⭐️ The uniqueness of this business lies in the AI-generated reactions of the deceased's digital avatar, without the need for human扮演.
OpenAI Adds New Features for Developers, Allowing Custom Model Building
AiBase Highlights:
⭐️ Developers can use OpenAI's new features to build custom models tailored to specific organizational, business domain, or task needs.
⭐️ Custom models include specialized knowledge bases, specific data understanding, task execution, or specific input responses.
⭐️ OpenAI provides fine-tuning APIs, custom training model programs, and assisted fine-tuning services to help developers build custom models.
OpenAI Transcribes Over a Million Hours of YouTube Videos to Train GPT-4
AiBase Highlights:
🤖 OpenAI uses YouTube video transcriptions to train GPT-4
📚 AI companies face challenges with high-quality training data
⚖️ Companies navigate data issues involving ambiguous areas of copyright law
AI Video Understanding Breakthrough, New MiniGPT4-Video Breaks SOTA! Bvlgari Promo Video Captioning is Exceptional
AiBase Highlights:
⭐ MiniGPT4-Video framework can understand complex videos and create poetic captions.
⭐ It supports processing temporal visual and text data, adept at understanding video complexity.
⭐ MiniGPT4-Video shows significant improvements in multiple benchmark tests, offering powerful interpretation capabilities for video captions, advertising, etc.
DeepMind Releases Gecko: Focused on Document Retrieval, Performance Rivals Larger Models
AiBase Highlights:
🦎 Gecko is a universal text embedding model focused on document retrieval, semantic similarity, and classification tasks.
🦎 Gecko integrates knowledge from LLMs into retrievers, achieving robust retrieval performance.
🦎 On large-scale text embedding benchmarks, the 256-dimensional Gecko outperforms the 768-dimensional existing models.
Microsoft Invests $10 Billion in Generative AI, This Stock Could Soar
AiBase Highlights:
🧠 Microsoft is deploying custom chips based on Arm design, potentially boosting Arm Holdings' growth.
📈 Arm Holdings has already benefited from the growth in AI chips, and Microsoft's project could further drive its performance.
🔋 Microsoft may reduce reliance on other companies with custom chips, enhancing performance and lowering costs, potentially driving Arm's revenue growth.
Musk's Friends to Help Raise $3 Billion for xAI
AiBase Highlights:
🤑 Investors with close ties to Musk plan to help xAI raise $3 billion.
🤖 xAI is competing with rivals like OpenAI and Anthropic, accelerating development in the competitive AI field.
💼 The battle for AI talent is intense, with xAI and other competitors vying to attract and retain talent.
The Next Big Leap for AI is Understanding Emotions, The First Emotionally Intelligent Conversational AI is Here
AiBase Highlights:
⭐️ HumeAI releases a conversational AI with emotional recognition capabilities, detecting 53 emotions.
⭐️ HumeAI aims to understand and respond to user emotions, enabling interaction through vocal characteristics.
⭐️ Provides APIs for users to train their AI models, with applications spanning health, customer service, and more.
Website: https://dev.hume.ai/docs/expression-measurement-api/overview
Kingsoft WPS365 to Launch a One-Stop AI Office Product
AiBase Highlights:
⭐ WPS365 will emphasize enhancing user office efficiency and experience
⭐ The suite includes content creation tools and collaboration software
⭐ The core concept is unified tools, collaboration, and management
🤖📱💼AI Applications
Infinity AI: Input a Script to Generate a Movie with a Single Click, Also Offers Digital Human Cloning
AiBase Highlights:
⭐ Goal: Simply input the script content to generate a movie with one click, a demo has been released by the official team.
⭐ The technical team successfully cloned the CEO's image to demonstrate product functionality, predicting that small teams could win Oscars with AI in the future.
⭐ Offers cloning services, users can train custom AI models by recording videos, generating content with their voice and facial expressions.
Website: https://top.aibase.com/tool/infinity-ai
Online Experience: https://studio.infinity.ai/
Detailed Tutorial and Video: https://qqi2gjmnk4.feishu.cn/wiki/HTmRwpZ1hiRONpkZ3SIce89ynuc?fromScene=spaceOverview
Google Launches Scenic: Recognizes Video Content and Generates Detailed Descriptions
AiBase Highlights:
🔍 Offers SOTA models and baseline models, supporting rapid prototyping of large-scale visual models.
🔍 Developed using JAX and Flax, supports image, video, audio, and multimodal combination models.
🔍 Can recognize video content and generate detailed descriptions, providing feature-rich baseline models and datasets.
Product Entry: https://top.aibase.com/tool/scenic
CameraCtrl: Enables Lens Control in Text-to-Video Generation, Supports AnimateDiff Lens Control
AiBase Highlights:
⭐ Lens control is crucial in video generation
⭐ Achieved through training a lens encoder for plug-and-play lens modules
⭐ Enhances the controllability and generalization of lens control across different datasets
Product Entry: https://hehao13.github.io/projects-CameraCtrl/
Lixel CyberColor: Automatically Generates Infinite-Sized Cinematic 3D Scenes
AiBase Highlights:
⭐️ LCC uses Multi-SLAM and Gaussian Splatting technology to generate cinematic 3D scenes.
⭐️ XGRIDS' Multi-SLAM algorithm and 3DGS technology create realistic large 3D models.
⭐️ XGRIDS provides LCC plugins and SDKs to replicate 3D content on multiple platforms.
Website: https://xgrids.com/lcc
AI Voice Recognition Tool Universal-1: 38 Seconds to Process 60 Minutes of Audio, Faster than fast Whisper
AiBase Highlights:
⭐️ Universal-1 offers accurate and robust multilingual speech-to-text capabilities
⭐️ Universal-1 improves accuracy and speaker identification through timestamp estimation
⭐️ AssemblyAI builds the efficient Universal-1 model using Conformer RNN-T architecture and Google Cloud TPUs
Product Entry: https://top.aibase.com/tool/universal-1
InstantStyle: Text-to-Image Style Reference for Consistent Style in SD
AiBase Highlights:
⭐️ Effectively separates content and style through simple yet powerful techniques.
⭐️ Application of CLIP global features, clearly decoupling style and content.
⭐️ Different network levels capture various semantic information, achieving better style retention.
Product Entry: https://top.aibase.com/tool/instantstyle
————
Daily midjourney prompt: Sexy E-commerce Model
Image Source: Generated by AI, Image Licensed by Midjourney