AI Daily: Google Launches Experimental Version of Gemini 1.5 Pro 0801; Open-Source Image Generation Model FLUX.1 Emerges; Ultra-Fast 3D Asset Generation Model Stable Fast 3D Released; Alibaba's Voice Synthesis Model CosyVoice Updated

Welcome to AI Daily, your daily guide to the world of artificial intelligence. Each day we bring you the hottest topics in AI, with a focus on developers, to help you follow technology trends and discover innovative AI products.

Explore fresh AI products by clicking here: https://top.aibase.com/

1. Google Launches Experimental Version of Its Multimodal Model Gemini 1.5 Pro, Ranking Ahead of GPT-4o and Claude 3.5 Sonnet

Google today introduced an experimental version of Gemini 1.5 Pro (0801). The model performs strongly across multiple tasks, offers multimodal capabilities and a wide context window, and has sparked discussion about AI development and its societal impact.

AiBase Highlights:

🚀 Google launches an experimental version of Gemini 1.5 Pro, which leads competitors in rankings.

💪 The model performs strongly across multiple tasks, with multimodal capabilities and a wide context window.

⚖️ The release has sparked discussion about AI development and its societal impact; Google is seeking feedback to refine the model.

Details link: https://top.aibase.com/tool/gemini-pro

2. A New Leader in AI Image Generation? Open-Source Model FLUX.1 Emerges, Making Midjourney and DALL·E 3 Nervous

In artificial intelligence, disruptive changes can arrive on any given day. FLUX.1, a notable dark horse, has taken the AI community by storm with its strong performance and open-source release. Backed by the track record of founder Robin Rombach and an innovative architecture, FLUX.1 is being called the new leader in AI image generation and has brought fresh energy to the industry.

AiBase Highlights:

🚀 FLUX.1 is reported to outperform closed-source models as well as the open-source SD3 series by a clear margin.

💡 Built on a transformer-based architecture and trained with flow matching to improve model performance.

🌟 FLUX.1 shows clear advantages in rendering text within images, among other areas.

Details link: https://github.com/black-forest-labs/flux

3. Stability AI Introduces New AI Model Stable Fast 3D: Generate 3D Assets in Half a Second, a 1200-Fold Speed-Up

Stability AI's latest model, Stable Fast 3D, generates a 3D asset from a single image in about half a second, 1200 times faster than before. Built on generative AI, the technology has broad practical value for industries such as design, architecture, retail, virtual reality, and game development.

AiBase Highlights:

😃 Stable Fast 3D generates a 3D asset in half a second, a significant increase in speed.

👍 The new model has practical value in industries including design, architecture, retail, virtual reality, and game development.

👏 Stability AI continues to push generation technology forward, extending its work from 2D to 4D.

Details link: https://top.aibase.com/tool/stable-fast-3d
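
To put the two figures in this item side by side, here is a minimal arithmetic sketch. It assumes "1200 times faster" means the earlier process took 1200 times as long per generation; the function name is hypothetical and used only for illustration.

```python
# Figures quoted in the item above.
NEW_SECONDS = 0.5   # "half a second" per generation
SPEEDUP = 1200      # "1200 times faster than before"


def implied_baseline_seconds(new_time_s: float, speedup_factor: float) -> float:
    """Time the earlier process would need if the new one is speedup_factor times faster."""
    return new_time_s * speedup_factor


baseline = implied_baseline_seconds(NEW_SECONDS, SPEEDUP)
print(f"Implied earlier time: {baseline:.0f} s (about {baseline / 60:.0f} min)")
```

Under that reading, the earlier process would have needed roughly ten minutes per result.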

4. AI Video Creation Platform Hedra Raises $10 Million

AI video creation platform Hedra has raised $10 million in seed funding, drawing widespread attention. Hedra has launched the video foundation model Character-1, which more than 350,000 users have used to create over 1.6 million videos, some of which have gone viral online. Multiple companies have launched video generation models, and large companies are actively entering AI-driven video creation.

AiBase Highlights:

🔥 Hedra raises $10 million in seed funding and launches the Character-1 model.

💡 Over 350,000 users use Character-1 to create over 1.6 million videos, some going viral online.

🚀 Multiple companies launch video generation models, with large companies actively participating in AI-driven video creation.

Details link: https://www.hedra.com/blog/announcement

5. Alibaba's Voice Synthesis Model CosyVoice Updated to Make AI Sound More Human

Alibaba's latest voice synthesis model, CosyVoice, offers a glimpse of future human-machine interaction with its realism and flexibility. It can generate voices that match a specific gender, age, and personality, and it can mimic the natural features of human speech, adding emotion and style to make AI speech more expressive. Together with SenseVoice, CosyVoice forms the FunAudioLLM framework, which improves the voice interaction experience and supports multilingual speech recognition and emotion recognition. The technology is expected to bring major changes to education, entertainment, customer service, and other fields.

AiBase Highlights:

🤖 CosyVoice generates realistic, flexible voices that match gender, age, and personality, mimic natural speech features, and carry emotion and style.

🔊 The FunAudioLLM framework improves the voice interaction experience; SenseVoice supports multilingual and emotion recognition with fast response times.

📚 CosyVoice and FunAudioLLM are expected to bring major changes to education, entertainment, customer service, and other fields.

Details link: https://top.aibase.com/tool/cosyvoice

6. Alibaba International AI Business Assistant Upgraded: Text-Based AI Generation Now Completely Free

Alibaba International President Zhang Kuo announced a new release of the AI Business Assistant, which includes a simplified product launch feature and an AI automatic reception feature. AI significantly lowers the barrier to entry in foreign trade: more than 30,000 small and medium-sized enterprises already use the assistant, with product exposure up 37% and payment conversion rates up 50%. The AI Business Assistant helps merchants manage their stores efficiently and win orders quickly. The update also brings three benefits that make the tool more flexible to use, including free text-based AI generation and free regeneration of unsatisfactory results. More features will be added over time.

AiBase Highlights:

🚀 The simplified product launch feature cuts the time merchants need to publish a product to as little as 60 seconds.

💬 The AI automatic reception feature raises the follow-up reply rate of overseas buyers by about 40%.

💡 AI lowers the barrier to entry in foreign trade: more than 30,000 small and medium-sized enterprises use the assistant, with product exposure up 37% and payment conversion rates up 50%.

7. Desktop Chrome Gets AI Search Upgrade with a Circle to Search-Like Feature

Google Lens has received an AI-driven upgrade in the desktop version of Chrome, giving users a more convenient search experience. Users can activate Google Lens by clicking a new button in the search box, run a multi-search, and view text and image results. The update will roll out globally, with some features available only to US users. Chrome has also added an AI feature that lets users search their browsing history by asking questions; it will roll out gradually to US users over the coming weeks.

AiBase Highlights:

🌐 Google Lens in desktop Chrome receives an AI-driven upgrade; users activate it with a button in the search box and can run multi-search.

📅 The update will roll out globally in the "next few days," with some features available only to US users.

💬 Chrome adds an opt-in AI feature for asking questions about browsing history, rolling out in the US in the "next few weeks"; it currently relies on cloud models for results.

8. Israeli AI Startup aiOla Launches Ultra-Fast Open-Source Speech Recognition Model Whisper-Medusa

aiOla's Whisper-Medusa speech recognition model is 50% faster than OpenAI's Whisper while maintaining accuracy. This move will accelerate the response speed of voice applications, improve efficiency, and reduce costs.

AiBase Highlights:

💥 Speed increase of 50%: Whisper-Medusa is 50% faster than OpenAI's Whisper.

🎯 No loss in accuracy: Whisper-Medusa maintains the same accuracy as the original model while increasing speed.

📈 Broad application prospects: Whisper-Medusa is expected to accelerate the response speed of voice applications, improve efficiency, and reduce costs.
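
The phrase "50% faster" can be read in two ways, and the item does not say which one aiOla intends. The sketch below works through both readings for a hypothetical 60-second baseline transcription time; the baseline and the function names are assumptions for illustration, not measurements of either model.

```python
def time_if_throughput_up(baseline_s: float, pct_faster: float) -> float:
    """Reading 1: throughput rises by pct_faster percent, so time is divided by (1 + pct/100)."""
    return baseline_s / (1 + pct_faster / 100)


def time_if_latency_cut(baseline_s: float, pct_less: float) -> float:
    """Reading 2: processing time itself drops by pct_less percent."""
    return baseline_s * (1 - pct_less / 100)


BASELINE_S = 60.0  # hypothetical time for the original model on some clip

print(f"1.5x throughput: {time_if_throughput_up(BASELINE_S, 50):.0f} s")
print(f"50% less time:   {time_if_latency_cut(BASELINE_S, 50):.0f} s")
```

The first reading gives 40 seconds and the second 30 seconds, so the practical gain depends on which definition the reported figure uses.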

9. Suno Claims Training Models with Copyrighted Music is "Fair Use"

The Recording Industry Association of America (RIAA) has filed lawsuits against music generation startups Udio and Suno. Suno admits to using copyrighted music to train its AI model and claims this is fair use. The RIAA disagrees and considers it infringement. The outcome of the case could set a precedent for AI music generation.

AiBase Highlights:

🎶 RIAA sues Udio and Suno for using copyrighted music to train models.

💻 Suno admits using copyrighted music for training models but claims it is fair use.

👀 The outcome of the case could set a precedent for AI music generation.

10. Microsoft Lists OpenAI as a Competitor for the First Time in SEC Filing

Microsoft recently listed its long-term partner OpenAI as a competitor in its annual 10-K report filed with the US Securities and Exchange Commission (SEC), sparking industry speculation. The move may be influenced by the current antitrust environment, and the future of the relationship between Microsoft and OpenAI remains to be seen.

AiBase Highlights:

🔍 Microsoft lists OpenAI as a competitor, drawing industry attention.

💰 Microsoft has invested $13 billion in OpenAI and is its exclusive cloud provider.

🔄 Being a partner and being a competitor are not mutually exclusive; the relationship between Microsoft and OpenAI has shifted before.

11. Cook Says Apple AI Will Drive User Upgrades

Apple Inc. reported solid financial results for the third fiscal quarter of 2024, driven in particular by growth in service revenue. Tim Cook previewed some Apple Intelligence features and the upcoming iPhone 16, and said he expects AI to drive Apple's growth.

AiBase Highlights:

📈 Apple's total net revenue in the third fiscal quarter of 2024 reached $85.777 billion, an increase of 5% year-over-year.

📱 iPhone revenue reached $39.296 billion, Mac and iPad revenue grew, and service revenue reached $24.213 billion.

🚀 Apple Intelligence features will roll out gradually, and the upcoming iPhone 16 will support them.
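
As a quick way to read the revenue figures above, the following sketch computes what share of total revenue came from iPhone and from services. It uses only the numbers quoted in this item; the helper name is hypothetical.

```python
# Figures quoted above, in billions of US dollars.
TOTAL_REVENUE = 85.777
IPHONE_REVENUE = 39.296
SERVICE_REVENUE = 24.213


def share_percent(part: float, total: float) -> float:
    """Return part as a percentage of total."""
    return part / total * 100


print(f"iPhone share of revenue:   {share_percent(IPHONE_REVENUE, TOTAL_REVENUE):.1f}%")
print(f"Services share of revenue: {share_percent(SERVICE_REVENUE, TOTAL_REVENUE):.1f}%")
```

On these figures, iPhone accounts for a little under half of quarterly revenue and services for a little over a quarter.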

12. Over 300 Video Game Actors Jointly Protest Unregulated AI Use in Hollywood!

Behind the glitz of Hollywood, actors are uniting to protest unregulated AI use and protect their rights. The protest highlights how much is at stake for actors' livelihoods in the era of artificial intelligence.

AiBase Highlights:

🎭 Actors protest unregulated AI use to protect their rights.

💼 Artificial intelligence threatens actors' livelihoods, as their voices and likenesses could be misused.

💰 Negotiations between actors and game companies are at a stalemate; the key issue is who counts as a performer.

13. HKU and MIT Collaborate to Create ItiNera: Your Personal AI Tour Guide, One Click to Plan the Perfect Citywalk Route!

In the hustle and bustle of the city, everyone yearns for a spontaneous citywalk, wandering through alleys, exploring historical sites, and immersing themselves in local culture. The ItiNera system combines spatial optimization with large language models to provide personalized urban itinerary planning services, offering travelers a new way to explore the city.

AiBase Highlights:

🌆 ItiNera is an open-domain urban itinerary planning system that generates personalized itineraries from users' natural-language descriptions.

🗺️ ItiNera uses an LLM and spatial optimization modules to extract and sequence points of interest (POIs), creating spatially coherent itineraries.

🔓 ItiNera has been deployed on the TuTu online travel service, where thousands of users have used its city travel planning.
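
To illustrate what "sequencing POIs into a spatially coherent itinerary" can mean, here is a minimal sketch using a greedy nearest-neighbour ordering. This is a stand-in technique, not ItiNera's actual optimization method, which the item does not describe. The POI names, the coordinates (treated as kilometres on a flat plane), and the function name are all hypothetical; in a real system the candidate list would come from the LLM stage.

```python
from math import dist


def order_pois(start, pois):
    """Greedy nearest-neighbour ordering: repeatedly walk to the closest unvisited POI."""
    remaining = dict(pois)  # copy so the caller's dict is not modified
    route, current = [], start
    while remaining:
        # Pick the unvisited POI closest to the current position.
        name = min(remaining, key=lambda n: dist(current, remaining[n]))
        current = remaining.pop(name)
        route.append(name)
    return route


# Hypothetical candidates, as if selected from a user's request.
candidates = {
    "museum": (2.0, 1.0),
    "old street": (0.5, 0.2),
    "tea house": (0.8, 0.9),
    "riverside park": (3.0, 2.5),
}

print(order_pois((0.0, 0.0), candidates))
```

A greedy ordering is easy to follow but is not guaranteed to find the shortest route; a production system would likely use a stronger optimizer and account for opening hours and user preferences.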