AI News and Product Search Page

2025-03-07 16:19:22.AIbase

No Training Needed! Q-Filters Enable Efficient Compression of KV Cache and Improved Inference Performance

2025-03-06 10:52:45.AIbase

IBM Launches Compact AI Model Granite 3.2, Emphasizing Efficient Inference and Practicality

IBM has released Granite 3.2, a smaller AI model designed for efficient inference and practical applications. This release focuses on delivering strong performance with reduced computational resources.

2025-03-06 10:04:01.AIbase

No Need for High-End Hardware! Alibaba Open-Sources the New Inference Model Tongyi Qianwen QwQ-32B; Consumer-Grade GPUs Achieve S-Tier Performance!

Alibaba has open-sourced its new inference model, Tongyi Qianwen QwQ-32B, enabling users with consumer-grade GPUs to achieve surprisingly high performance. This makes advanced AI capabilities accessible to a wider audience.

2025-03-06 09:17:43.AIbase

Alibaba Open-Sources New Inference Large Model QwQ-32B, Rivaling DeepSeek-R1 with Lower VRAM Requirements

Alibaba has open-sourced a new inference large language model, QwQ-32B. Benchmarks show performance comparable to DeepSeek-R1, but with significantly reduced VRAM requirements.

2025-03-03 21:42:17.AIbase

Qwen2.5-Max Inference Model Launched by Tongyi Lingma

Tongyi Lingma has announced the launch of its latest inference model, Qwen2.5-Max, offering developers powerful programming and mathematical capabilities. Trained on over 20 trillion tokens and incorporating a meticulously designed post-training scheme, Qwen2.5-Max demonstrates exceptional performance.

2025-03-03 09:45:19.AIbase

DeepSeek Open Source Week Day Six: Extreme Inference Optimization System for Enhanced GPU Computing Efficiency

2025-03-02 10:26:31.AIbase

DeepSeek Unveiled: The Astonishing 545% Profit Margin Behind its AI Inference System

DeepSeek launched an official account on Zhihu and published a technical article, "DeepSeek-V3/R1 Inference System Overview," disclosing for the first time how its inference system is optimized along with its cost and profit-margin figures, marking the end of the highly anticipated DeepSeek Open Source Week. The article highlights two key optimization goals for the DeepSeek-V3/R1 inference system: higher throughput and lower latency. To achieve these goals, DeepSeek employs large-scale cross-node expert parallelism (EP) technology, although...

2025-02-28 11:08:32.AIbase

ByteDance Launches AIBrix: A New Open-Source Inference System Designed for Large Language Models

2025-02-28 09:03:39.AIbase

Free to Use! ByteDance's AI Programming Software Trae Integrates Claude 3.7; Developers Rejoice!

ByteDance's Trae, an AI-powered programming software, has integrated Claude 3.7, offering developers a powerful and free-to-use tool. This integration is expected to significantly enhance coding efficiency and productivity.

2025-02-26 09:33:45.AIbase

DeepSeek Open Source Week Day 3: Announcing DeepGEMM, an FP8 GEMM Library for AI Training and Inference

2025-02-25 08:26:15.AIbase

AI Programming Tool Cursor Integrates Claude 3.7 Sonnet Reasoning Model

On February 25th, Anthropic officially released its latest reasoning model - Claude 3.7 Sonnet. This model, with its innovative design and significantly improved programming capabilities, quickly became an industry focus. The core innovation of Claude 3.7 Sonnet lies in its fusion of rapid response and in-depth reasoning capabilities, simulating human thought processes to provide users with a more seamless interactive experience.

2025-02-25 08:16:56.AIbase

Anthropic Unveils Claude 3.7 Sonnet: A Hybrid Reasoning Model Surpassing DeepSeek

Anthropic, an AI company, today announced the release of its latest hybrid reasoning model, Claude 3.7 Sonnet. Anthropic claims this model is its most intelligent AI yet, designed to tackle more complex problems and demonstrating superior performance in areas such as mathematics and coding.

2025-02-18 20:33:46.AIbase

DeepSeek Launches NSA Technology: Accelerating Long Context Training and Inference

In the field of artificial intelligence, the DeepSeek team recently released its latest research results, introducing an innovative sparse attention mechanism named NSA (Native Sparse Attention). The core objective of this technology is to speed up long-context training and inference; it is optimized for modern hardware and significantly improves the efficiency of both. The launch of NSA brings significant changes to the training of artificial intelligence models. Firstly, it utilizes a series of advancements tailored for modern computational hardware.

2025-02-12 14:04:43.AIbase

ByteDance's UltraMem Architecture Reduces Large Model Inference Costs by 83%

The ByteDance Doubao large model team announced today the successful development of a new sparse model architecture called UltraMem. This architecture effectively addresses the heavy memory access incurred during inference of MoE (Mixture of Experts) models, improving inference speed by 2 to 6 times compared to MoE and reducing inference costs by up to 83%. This advancement opens a new path for efficient inference of large models. The UltraMem architecture resolves the memory bottleneck during inference of MoE architectures while maintaining model performance. Experimental results show that the parameters and activation conditions are the same.

2025-02-10 14:16:32.AIbase

DeepSeek Full Series Launches on iFlytek Open Platform with Limited Time Free Inference API

Today, the iFlytek Open Platform officially launched the 807th AI capability – the full series of DeepSeek models, provided by Hangzhou DeepSeek. The full series includes two versions: DeepSeek-V3 and DeepSeek-R1, both supporting public cloud API calls and one-click deployment of dedicated models. Notably, the iFlytek Open Platform also introduced the DeepSeek large model fine-tuning service, allowing developers to deeply customize 'smarter and more suitable' solutions based on business scenario data.

2025-02-08 09:38:42.AIbase

In Response to DeepSeek Challenge, OpenAI Reveals o3-mini Public Inference Process

OpenAI recently announced the detailed inference process of its latest inference model, o3-mini, which is seen as a response to the increasing pressure from competitors like DeepSeek-R1. This change marks a significant shift in OpenAI's strategy towards model transparency. Previously, OpenAI regarded the 'Chain of Thought' (CoT) as a core competitive advantage and chose to keep it concealed. However, with open models like DeepSeek-R1 fully showing their inference paths, this has led to a reevaluation.

2025-02-06 10:57:40.AIbase

The DeepSeek-R1 Model Faces Severe Hallucination Issues, Challenging Its Inference Ability and Accuracy

2025-01-24 10:48:09.AIbase

Sakana AI's Transformer² Model Breaks LLM Limitations, Achieving Dynamic Inference

2025-01-24 10:04:42.AIbase

Pipeshift Launches Modular Inference Engine, Reducing AI Inference GPU Utilization by 75%

2025-01-22 14:28:54.AIbase

NetEase Youdao Launches 'ZiYue-o1' Inference Model, Ushering in a New Transformation in the Education Sector

On January 22, 2025, NetEase Youdao officially released 'ZiYue-o1', the first inference model in China that supports step-by-step explanations. The model has 14 billion parameters, enabling efficient deployment on standard consumer-grade graphics cards, and marks a significant advancement in the application of artificial intelligence technologies in the domestic education sector. 'ZiYue-o1' uses chain-of-thought technology and incorporates self-dialogue and error-correction mechanisms, allowing it to output detailed thinking processes when solving problems. This way, students not only receive answers but also understand the logic behind problem-solving.
