AI Daily: Alipay Launches AI Creative Generation Platform; Google's Game-Changing Reasoning Model Gemini 2.0 Flash Thinking; Runway Supports Intermediate Frame Insertion; OpenAI Prepares o3 Reasoning Model

Welcome to the "AI Daily" column, your daily guide to the world of artificial intelligence. We cover the hottest topics in the AI field with a focus on developers, helping you track technology trends and innovative AI product applications.

Discover new AI products: https://top.aibase.com/

1. Google Releases Groundbreaking Reasoning Model Gemini 2.0 Flash Thinking, Challenging OpenAI o1

Google recently launched the Gemini 2.0 Flash Thinking model, which shows strong multimodal reasoning capabilities. It supports 32,000 input tokens and 8,000 output tokens, making it better suited to complex problems. The model addresses the AI "black box" issue by exposing its step-by-step reasoning, which helps users understand how it reaches its answers.

[AiBase Highlights:]

🌟 The Gemini 2.0 Flash Thinking model has strong reasoning capabilities, supporting 32,000 input tokens and 8,000 output tokens.

💡 The model provides step-by-step reasoning through a dropdown menu, enhancing transparency and addressing the AI "black box" issue.

🖼️ It features native image upload and analysis capabilities, expanding multimodal application scenarios.

Details link: https://ai.google.dev/gemini-api/docs/thinking-mode?hl=en
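As a rough illustration of what the reported limits mean in practice, the sketch below checks a planned request against the 32,000-input and 8,000-output token figures quoted above. The names `TokenBudget` and `fits_limits` are hypothetical and are not part of any Google SDK; the limits themselves are taken from this item and may change.

```python
from dataclasses import dataclass

# Limits as reported in this item; treat them as assumptions, not API constants.
MAX_INPUT_TOKENS = 32_000
MAX_OUTPUT_TOKENS = 8_000


@dataclass
class TokenBudget:
    """Hypothetical description of a planned request."""
    input_tokens: int
    requested_output_tokens: int


def fits_limits(budget: TokenBudget) -> bool:
    """Return True if both token counts are within the reported limits."""
    input_ok = 0 <= budget.input_tokens <= MAX_INPUT_TOKENS
    output_ok = 0 < budget.requested_output_tokens <= MAX_OUTPUT_TOKENS
    return input_ok and output_ok


if __name__ == "__main__":
    print(fits_limits(TokenBudget(30_000, 8_000)))   # within both limits
    print(fits_limits(TokenBudget(32_001, 1_000)))   # input exceeds the limit
```

A check like this only guards against oversized requests on the client side; actual token counts depend on the model's tokenizer.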

2. Alipay Launches AI Creative Generation Platform for Merchants, Producing 87 Million AI Assets

Alipay recently launched an AI creative generation platform called "Creative on MaShang," designed to quickly generate creative materials and provide intelligent analysis for merchants and designers. The platform offers a wealth of marketing image resources for free and supports rapid creation of posters, videos, and other content through AI technology, as well as providing creative insights to help merchants improve marketing effectiveness.

[AiBase Highlights:]

🖼️ Supports quick generation of posters, banners, videos, and more, streamlining the creative production process.

📊 Provides AI creative insight services to help merchants analyze and optimize marketing materials, increasing conversion rates.

🚀 Since last year, Alipay has generated 87 million AI assets, driving the intelligent development of merchant marketing.

3. Runway Updates Major Feature: Supports Inserting Intermediate Frames to Control Video Generation

Runway recently introduced a significant update that lets users insert intermediate frames during video generation. The long-requested feature gives creators more freedom and flexibility in video production. In addition to uploading start and end frames, users can now add intermediate frames to enrich the content and improve the coherence and smoothness of the visuals.

[AiBase Highlights:]

🎨 Users can now select start and end frames and insert intermediate frames during video generation, increasing creative flexibility.

🚀 The new keyframe feature enriches the visuals, enhancing overall quality and smoothness.

✨ User feedback has been positive, demonstrating the effectiveness of this feature in practical applications.

4. E-commerce Product Try-On Tool! Krea AI's New Feature: Add Real Products to Any Image in Seconds

Krea AI has recently launched a new custom training feature that lets users add real products to any image in just seconds. Users brush over a region and select a product image, and the product is blended seamlessly into the picture, which speeds up design and creative work. Users can easily replace a model's accessories and clothing, and even swap logos.

[AiBase Highlights:]

✨ Users can add real products to images in seconds, boosting design efficiency.

🖌️ With simple brushing and selection, the AI blends products seamlessly into images.

👗 Supports a range of replacements, including accessories, clothing, and logos.

5. Skipping o2! OpenAI Plans to Launch Next-Generation "o3" Reasoning Model

OpenAI is developing a next-generation reasoning model, "o3," aimed at making responses to user queries more thoughtful and logically sound. Because of a potential trademark conflict with the UK telecom company O2, OpenAI chose to skip "o2" and go straight to "o3." The move reflects the company's cautious approach to brand naming and also marks a strategic adjustment in response to a slowing pace of product updates and increasing market competition.

[AiBase Highlights:]

🌟 OpenAI is developing the new reasoning model "o3," aimed at enhancing reasoning capabilities and user interaction experiences.

⚖️ Due to potential trademark conflicts with the UK telecom company O2, OpenAI decided to skip "o2" and directly name it "o3."

📈 The launch of the new model is a strategic move by OpenAI to address the slowdown in product updates, with hopes of achieving broader applications across various industries.

6. Fast! ElevenLabs Launches Flash Voice Conversation Model: 75-Millisecond Latency and Support for 32 Languages

ElevenLabs recently introduced Flash, its latest speech synthesis model, which it claims is the fastest text-to-speech solution to date. With a voice generation latency of only 75 milliseconds, it is particularly suitable for low-latency conversational voice assistants. The Flash model comes in two versions: Flash v2 supports only English, while Flash v2.5 supports 32 languages. Although it slightly lags behind the Turbo model in sound quality and emotional depth, Flash performed well in blind tests and is the fastest option available.

[AiBase Highlights:]

🌟 The Flash model generates voice with only a 75-millisecond delay, suitable for low-latency conversational voice assistants.

🌍 Flash v2.5 supports 32 languages, with users consuming 1 credit for every two characters generated.

🚀 In blind tests, the Flash model outperformed other similar products, becoming the fastest text-to-speech solution available.
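The credit rate quoted above (1 credit for every two characters) can be turned into a quick cost estimate. The sketch below is only illustrative: `flash_credits` is a hypothetical helper, and rounding odd-length text up to the next whole credit is an assumption, since the item does not say how partial credits are billed.

```python
import math

# From this item: 1 credit is consumed for every two characters generated.
CHARS_PER_CREDIT = 2


def flash_credits(text: str) -> int:
    """Estimate credits for synthesizing `text` at the quoted rate.

    Assumption: odd-length text is rounded up to a whole credit.
    """
    return math.ceil(len(text) / CHARS_PER_CREDIT)


if __name__ == "__main__":
    sample = "Hello, world!"  # 13 characters
    print(flash_credits(sample))
```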

7. ChatGPT Desktop App Introduces Application Collaboration Feature

OpenAI recently released an important update for the ChatGPT desktop application, introducing the "Collaborate with Apps" feature, enabling ChatGPT to directly read content from various applications such as terminals, IDEs, and text editors. This update significantly enhances the efficiency of developers and creators, with supported applications including Apple Notes, Notion, VS Code, and more.

[AiBase Highlights:]

🌟 ChatGPT has added the "Collaborate with Apps" feature, which reads content directly from multiple applications.

💻 Supported applications include Apple Notes, Xcode, VS Code, and more, covering a wide range.

🗣️ After the update, users can interact with applications using advanced voice mode, providing a more intuitive experience.

8. AI Programming Assistant Cursor Secures $100 Million in Funding, Valuation Soars to $2.6 Billion

The AI programming assistant Cursor, developed by Anysphere, has closed a $100 million Series B funding round just four months after its previous round, with its valuation soaring to $2.6 billion. The round was led by Thrive Capital, with Andreessen Horowitz also participating. Despite intense market competition, Cursor has gained significant popularity over its rivals, with annual revenue rising from $4 million to $48 million in a short time.

[AiBase Highlights:]

🌟 Cursor successfully raised $100 million, achieving a valuation of $2.6 billion!

🚀 In just four months, the company's valuation surged 6.5 times, attracting enthusiastic investors.

💰 The company's annual revenue quickly grew from $4 million to $48 million, showcasing impressive performance.
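The figures in this item imply two numbers that are not stated directly: the valuation before the round and the revenue growth multiple. The short sketch below works them out from the quoted values only; the function names are hypothetical and the results are derived arithmetic, not reported figures.

```python
def implied_prior_valuation(current: float, multiple: float) -> float:
    """Valuation before a surge of `multiple` times, in the same units as `current`."""
    return current / multiple


def growth_multiple(before: float, after: float) -> float:
    """How many times `after` is of `before`."""
    return after / before


if __name__ == "__main__":
    # All amounts in millions of US dollars, taken from this item.
    prior = implied_prior_valuation(current=2_600, multiple=6.5)
    revenue_growth = growth_multiple(before=4, after=48)
    print(f"Implied prior valuation: ${prior:.0f} million")
    print(f"Revenue growth: {revenue_growth:.0f}x")
```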

9. Departure of the "Father of GPT" Shakes the AI Community: OpenAI's Legendary Researcher Radford Turns to Independent Research

Alec Radford, a core researcher at OpenAI, announced his departure to pursue independent research, drawing widespread attention in the AI field. He was the chief designer of the GPT series and made significant contributions to AI, including proposing generative pre-training for transformer-based language models. Radford's departure highlights the challenges of talent mobility in AI and may indicate that independent researchers will play an increasingly important role in AI technology innovation.

[AiBase Highlights:]

🚀 Radford joined OpenAI in 2016, driving the development of the GPT series models and laying the foundation for modern AI.

📈 His departure comes amid frequent changes in OpenAI's upper management, potentially impacting the company's future direction.

🤝 Despite choosing independent research, Radford plans to collaborate with OpenAI and other AI developers to explore new innovation models.

10. FlashJik Launches China's First Mass-Produced AI Glasses at 999 Yuan: 30g Wearing Feel Challenges the Wearables Market

FlashJik Technology released China's first mass-produced AI glasses, the FlashJik AI "Snap Mirror," starting at 999 yuan and expected to ship on January 15, 2025. The glasses feature a classic black frame design and weigh only 50g, with an actual wearing feel of about 30g. Equipped with a Sony 16-megapixel camera and AAC Hi-Fi speakers, they support various AI functions and will receive more features through online upgrades in the future.

[AiBase Highlights:]

🕶️ The FlashJik AI "Snap Mirror" is China's first mass-produced AI glasses, starting at 999 yuan, expected to ship on January 15, 2025.

📸 The glasses are equipped with a Sony 16-megapixel camera and AAC Hi-Fi speakers, with a weight of only 50g and a wearing feel of just 30g.

🚀 Running the self-developed Loomo OS, the glasses support AI functions such as voice recognition and real-time translation, with more features to be added through online upgrades.

11. Stable Diffusion 3.5 Large Officially Launches on Amazon Bedrock Platform

At the AWS re:Invent conference, Stable Diffusion 3.5 Large (SD3.5 Large) was officially launched on the Amazon Bedrock platform, aiming to provide developers with a secure and convenient environment for developing generative AI applications. This model excels in text-to-image generation, supports diverse visual styles, and accurately responds to user inputs.

[AiBase Highlights:]

🌟 The SD3.5 Large model is now available on the Amazon Bedrock platform, supporting convenient and secure AI application development.

🎨 This model features diverse style generation, excellent text prompt adherence, and versatile image output capabilities.

🔧 The newly upgraded image services include Stable Image Ultra and Stable Image Core, providing higher-quality and more cost-effective generation options.

Details link: https://stability.ai/news/stable-diffusion-35-large-is-now-available-on-amazon-bedrock?utm_source=futuretools.io&utm_medium=newspage

12. Trained on 14 Trillion Tokens: Falcon3 Challenges Mainstream Open Source AI Models

Falcon3, the new generation of open-source AI models released by the Technology Innovation Institute (TII) in Abu Dhabi, achieves remarkable performance on consumer-grade hardware thanks to 14 trillion training tokens and an optimized architecture. Its training scale is double that of its predecessor, and it is strongly competitive in benchmark tests against other mainstream open-source models.

[AiBase Highlights:]

🚀 The Falcon3 series comes in four model sizes to meet different user needs and supports multiple languages.

🏆 In Hugging Face evaluations, Falcon3 surpassed several mainstream open-source models.

💡 TII plans to launch multimodal models in 2025, further expanding Falcon3's application scenarios.