AI Daily: Enhanced Reasoning! OpenAI's New Model o1 Released; Midjourney 7.0 Can Generate 8 Images at Once; Open Source Voice Model Fish Speech 1.4 Released

Welcome to the AI Daily section! This is your daily guide to exploring the world of artificial intelligence. Each day, we bring you the hottest topics in the AI field, focusing on developers to help you understand technological trends and innovative AI product applications.

Discover the latest AI products by clicking here: https://top.aibase.com/

1. OpenAI Launches New Model Series OpenAI o1

OpenAI has introduced a new model series, OpenAI o1, built for stronger reasoning on complex problems. Users need to adjust their prompting to match how the o1 models work: keep prompts simple and direct, avoid chain-of-thought prompting, use delimiters to mark the distinct sections of the input, and limit additional context so that responses do not become overcomplicated.

AiBase Summary:
🤖 OpenAI o1 requires simple, direct prompts rather than complex instructions.
🧠 Avoid chain-of-thought prompting as the o1 model already possesses internal reasoning capabilities.
📑 Use delimiters to mark the distinct sections of the input, and limit additional context to avoid overcomplicated responses.
Detailed link: https://openai.com/index/introducing-openai-o1-preview/
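The prompting advice above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_o1_prompt`, the XML-style tags used as delimiters, and the sample text are assumptions made for this example, not part of OpenAI's announcement, and no API call is made.

```python
from typing import Mapping


def build_o1_prompt(instruction: str, sections: Mapping[str, str]) -> str:
    """Assemble a short, direct prompt with each input wrapped in named delimiters.

    Hypothetical helper: the tag-style delimiters are one possible convention.
    """
    parts = [instruction.strip()]
    for name, text in sections.items():
        # Each section gets its own opening and closing tag so the model
        # can tell the instruction apart from the material it applies to.
        parts.append(f"<{name}>\n{text.strip()}\n</{name}>")
    return "\n\n".join(parts)


prompt = build_o1_prompt(
    "Summarize the contract clause below in two sentences.",
    {"clause": "The supplier shall deliver the goods within 30 days of the order date."},
)
print(prompt)
```

The instruction is a single plain sentence with no "think step by step" wording, and only the one clause that matters is passed in as context.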

2. Google's Gemini Live Voice Chat Now Free for Android Users, Chat with AI Anytime, Anywhere!

Google has announced that the Gemini Live voice chat mode is now free for all Android users, allowing everyone to experience the fun of intelligent conversational AI. Users can ask questions at any time using their voice and even interrupt during the response process, providing a smooth voice interaction experience. Gemini Live offers a new way of interaction, allowing users to chat with AI at home or on the go.

AiBase Summary:
🌟 Gemini Live voice chat feature is now free for all Android users!
🗣️ Users can directly ask questions with their voice and even interrupt during responses.
🌍 Currently supports English only, with plans to launch on iOS and support more languages in the future.

3. Midjourney Founder and CEO David Holz Shares Latest Project Progress

Midjourney founder David Holz shared the company's latest project progress on Discord, emphasizing technological innovation to compete in the AI image generation field. The company has postponed the release of version 7.0 but has made it more feature-rich. The focus is on improving technical accessibility and the professional value of tools. Plans include multi-image generation, an image editor, a 3D system, personalized features, and video generation. The company is taking a steady development path, focusing on enhancing user experience.

AiBase Summary:
🚀 Version 7.0 has been postponed but is more feature-rich, focusing on improving technical accessibility and professional tool value.
🎨 New features include multi-image generation, an image editor, a 3D system, personalization, and video generation, enhancing user experience.
💡 Midjourney chooses a steady development path, focusing on practical features and user experience to maintain competitive advantage.
Detailed link: https://top.aibase.com/tool/midjourneywangyeban

4. XVERSE Launches Open Source Large Model XVERSE-MoE-A36B

XVERSE-MoE-A36B is the largest open-source Mixture of Experts (MoE) model in China, and its release marks a significant advancement in China's AI field, bringing domestic open-source technology to an internationally leading level. The model shortens training time, improves inference performance, and lowers the cost of AI applications, opening up more opportunities for small and medium-sized enterprises, researchers, and developers.

AiBase Summary:
🚀 XVERSE-MoE-A36B has 255B total parameters and 36B active parameters, and performs on par with models of over 100B parameters, a leap across model size classes.
💡 The MoE architecture combines multiple expert models specialized in different fields, getting past the limits of traditional scaling laws by maximizing model performance while reducing computational cost.
📈 XVERSE MoE outperforms several comparable models in authoritative evaluations, including the domestic hundred-billion-parameter model Skywork-MoE and the established MoE leader Mixtral-8x22B.
Detailed link: https://huggingface.co/xverse/XVERSE-MoE-A36B
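To make the total-versus-active parameter distinction concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is a toy illustration of the MoE idea only, not XVERSE's actual architecture: the four "experts", the gate scores, and the function names are all hypothetical, and only the 255B and 36B figures come from the summary above.

```python
import math

TOTAL_PARAMS_B = 255   # total parameters in billions (from the summary above)
ACTIVE_PARAMS_B = 36   # parameters active per token in billions (from the summary above)

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Share of parameters used per token: {active_fraction:.1%}")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the k selected experts and blend their outputs by gate weight."""
    return sum(weight * experts[i](x) for i, weight in route_top_k(gate_logits, k))


# Hypothetical toy experts: in a real model each would be a feed-forward network.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
gate_logits = [0.1, 2.0, 1.5, -1.0]  # made-up router scores for one token

y = moe_forward(3.0, experts, gate_logits, k=2)
print(route_top_k(gate_logits, 2), y)
```

Only two of the four toy experts run for this input, which is the same reason a model with 255B total parameters can pay roughly the compute cost of a 36B model per token.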

5. Fish Speech 1.4 Release: Open Source TTS Model Achieves Multilingual Breakthrough

The release of Fish Speech 1.4 marks a significant breakthrough for this open-source text-to-speech (TTS) model in multilingual support and performance. The update demonstrates strong technical capabilities and broad application prospects.

AiBase Summary:
🌐 Multilingual support has significantly improved: training data has doubled to 700,000 hours, supporting 8 major languages, expanding application scope.
⚡ Performance and functionality have been comprehensively upgraded: ultra-fast speed and low latency, instant voice cloning feature, flexible deployment options, and API services.
🚀 Broad application prospects: support for language learning in the education sector, instant voice cloning in the entertainment industry, assistive technology for the visually impaired, smart customer service, and cross-cultural communication.
Detailed link: https://fish.audio/zh-CN/auth/

6. Farewell to Hallucinations! Google Introduces New Model DataGemma, Statistical Data Accuracy Soars by 58%

Google has launched DataGemma, a new open-source AI model aimed at the "hallucination" problem that large language models often exhibit when handling statistical data. DataGemma draws on Google's Data Commons platform to ground its answers to statistical questions, and preliminary tests show a significant improvement in the accuracy of statistical queries.

AiBase Summary:
🌟 DataGemma model aims to reduce errors in AI statistical queries and improve accuracy.
📊 DataGemma utilizes Data Commons platform data to enhance the accuracy of model responses.
🔍 DataGemma shows significant improvement in statistical query accuracy in preliminary tests.
Detailed link: https://huggingface.co/collections/google/datagemma-release-66df7636084d2b150a4e6643

7. Jina AI Launches Reader-LM Small Language Model

Jina AI's Reader-LM small language model converts raw HTML content into clean Markdown, eliminating the need for cumbersome web data processing. The model is fast and efficient, automatically removes clutter, and shows excellent performance and high accuracy.

AiBase Summary:
✨ Reader-LM quickly and efficiently converts web content to Markdown without complex rules or regular expressions.
🔍 Available in two parameter sizes, both optimized for HTML-to-Markdown conversion and outperforming larger models on the task.
💡 Strong long-context processing capabilities, efficient even in resource-constrained environments.
Detailed link: https://jina.ai/news/reader-lm-small-language-models-for-cleaning-and-converting-html-to-markdown/
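For contrast, the sketch below shows the kind of hand-written rule-based converter that a model like Reader-LM is meant to replace. It is not Reader-LM and does not use any Jina AI API: it is a toy built on Python's standard `html.parser`, and the class name, the set of skipped tags, and the handful of supported elements are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class TinyHtmlToMarkdown(HTMLParser):
    """Toy rule-based HTML-to-Markdown converter (hypothetical, very incomplete)."""

    SKIP = {"script", "style", "nav"}  # tags whose content is treated as clutter

    def __init__(self):
        super().__init__()
        self.out = []
        self.skip_depth = 0
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in ("h1", "h2", "h3"):
            self.out.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.out.append("\n")
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag == "a":
            self.out.append(f"]({self.href or ''})")
            self.href = None
        elif tag in ("h1", "h2", "h3", "p"):
            self.out.append("\n")

    def handle_data(self, data):
        # Text inside skipped tags (scripts, styles, navigation) is dropped.
        if not self.skip_depth:
            self.out.append(data)


def html_to_markdown(html: str) -> str:
    parser = TinyHtmlToMarkdown()
    parser.feed(html)
    parser.close()
    lines = [line.strip() for line in "".join(parser.out).splitlines()]
    return "\n".join(line for line in lines if line)


sample = (
    "<nav>Home</nav><h1>Title</h1>"
    "<p>See <a href='https://example.com'>docs</a>.</p>"
    "<script>x=1</script>"
)
print(html_to_markdown(sample))
```

Every new element (tables, nested lists, images, malformed markup) needs another rule here, which is the maintenance burden the summary refers to when it says the model works without complex rules or regular expressions.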

8. Valued at $20 Million! AI Tool Shopsense AI Allows You to Buy Celebrity-Inspired Outfits by Taking Photos

At the MTV Video Music Awards (VMAs), audiences could instantly purchase clothing similar to celebrity styles using Shopsense AI technology, showcasing the future possibilities of shopping experiences. Although the technology still needs to improve its accuracy, Shopsense is continuously refining it to compete with other media companies. Its business model is diverse, earning revenue through pay-per-click and sales commissions, with significant market potential.

AiBase Summary:
🌟 Viewers can obtain product recommendations similar to celebrity styles by uploading photos, including both high-end and affordable options.
🛍️ Shopsense AI plans to expand into other areas such as travel and sports for product recommendations, seamlessly connecting content with shopping.
📈 Shopsense AI collaborates with Paramount to provide audiences with the convenience of instantly purchasing clothing similar to celebrity styles.

9. A Trademark Battle! Google Sued for Infringement for Using "Gemini" Name

Google has been sued for trademark infringement by a company named Gemini Data over the name of its new AI service, "Gemini." The dispute highlights the legal risks large tech companies face in trademark usage and is a reminder for businesses to check existing trademarks carefully when naming new products or services.

AiBase Summary:
🌟 Google sued by Gemini Data for using "Gemini" trademark, accused of trademark infringement.
🔍 Google's trademark application was rejected due to similarity with other trademarks.
🤖 Google's Gemini chatbot acknowledges the trademark infringement, reflecting the ongoing legal dispute between the parties.

10. UAE State Investment Company MGX Considers Investing Billions in OpenAI

The UAE state investment company MGX is considering investing billions of dollars in OpenAI, a move that would advance OpenAI's financing plan and reflects OpenAI's strong commercial performance. MGX was established to accelerate the development of artificial intelligence and advanced technologies and to consolidate the UAE's leading position in the global technology sector.

AiBase Summary:
💰 MGX considers investing billions of dollars in OpenAI, advancing OpenAI's financing plan.
🤖 OpenAI's annual recurring revenue reaches $4 billion, showing strong commercial performance.
🌍 MGX, co-founded by Mubadala and G42, focuses on the development of artificial intelligence and advanced technologies.

11. Impressive! Someone Tests OpenAI o1 to Solve High School Math Final Exam Questions, and It Gets Them All Right

A Reddit user tested OpenAI's latest model, OpenAI o1, on high school math problems, with remarkable results. Curious about the model's capabilities, the user found that OpenAI o1 accurately solved Chinese high school math exam questions in a short time, drawing attention and discussion from other users. The results demonstrate AI's ability to handle complex mathematical problems and have prompted discussion about its future applications.

AiBase Summary:
🤖 AI capabilities are impressive: OpenAI o1 accurately solves high school math exam questions in a short time, achieving a perfect score.
💡 Technological advancements spark thought: Users raise questions about the future development of AI and its impact on the education sector.
🌐 Intelligent learning assistance: AI has great potential in the education sector, providing students with intelligent learning assistance.

站长之家

This article is from AIbase Daily
