Best CPU AI Tools & Models - Premium CPU News

AI News

Alibaba's "Black Tech" Stuns the Field: A 0.6B Small Model Is Rebuilt as a 17B MoE with Only 5% of Parameters Activated, Running Directly on CPU at 30 Tokens/s

Alibaba's International Digital Commerce team has released Marco-Mini-Instruct, a mixture-of-experts (MoE) model with 17.3B total parameters, of which only 0.86B (about 5%) are activated, making inference efficient enough to run smoothly on ordinary CPUs. With 8-bit quantization and four DDR4-2400 memory modules, inference reaches about 30 tokens/s, a step toward practical deployment of the MoE architecture.
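As a rough sanity check on these figures: CPU token generation is commonly limited by memory bandwidth, because each generated token has to read the active weights once. The sketch below works through that arithmetic. It assumes (none of this is stated in the source) that the four DDR4-2400 modules run as four independent 64-bit channels, that 8-bit quantization means about one byte per parameter, and that KV-cache and activation traffic can be ignored. The function names are illustrative.

```python
def ddr_bandwidth_gb_s(mega_transfers_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: transfers/s x bytes per transfer x channels."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


def decode_ceiling_tok_s(params_billion: float, bytes_per_param: float, bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s if every token must stream the given weights from RAM once."""
    return bandwidth_gb_s / (params_billion * bytes_per_param)


total_b, active_b = 17.3, 0.86           # parameter counts from the article, in billions
reported_tok_s = 30.0                    # speed reported in the article

activation_ratio = active_b / total_b    # share of parameters used per token
bandwidth = ddr_bandwidth_gb_s(2400, channels=4)        # assumption: 4 channels
ceiling = decode_ceiling_tok_s(active_b, 1.0, bandwidth)
dense_ceiling = decode_ceiling_tok_s(total_b, 1.0, bandwidth)

print(f"activated share:        {activation_ratio:.1%}")
print(f"peak memory bandwidth:  {bandwidth:.1f} GB/s")
print(f"MoE decode ceiling:     {ceiling:.1f} tokens/s")
print(f"dense 17.3B ceiling:    {dense_ceiling:.1f} tokens/s")
print(f"reported / ceiling:     {reported_tok_s / ceiling:.0%}")
```

Under these assumptions the reported 30 tokens/s sits well below the bandwidth ceiling for 0.86B active parameters, while a dense model of the same total size would be capped at only a few tokens per second, which is the efficiency argument the article makes for the MoE design.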

16.9k 3 hours ago

Breaks World Record! Alibaba DAMO Academy Releases Xuantie C950: CPU Successfully Supports Billion-Parameter Large Models Natively for the First Time

Alibaba DAMO Academy has released the high-performance RISC-V CPU Xuantie C950, with a single-core score exceeding 70, breaking the global performance record for RISC-V. It achieves native support for billion-parameter large models for the first time, marking a significant improvement in the position of the RISC-V architecture in the computing market.

19k 12 hours ago

Liquid AI Releases LFM2.5: A Family of Small AI Models for Edge Devices

Liquid AI has launched LFM2.5, a compact foundation-model series for edge devices and local deployment, with base and instruction-tuned versions plus Japanese, vision, and audio variants. Optimized for CPU/NPU, it offers fast inference and is open-sourced on Hugging Face...

18.9k 1 day ago

New Version of Firefox Is Accused of Having AI Features Enabled by Default, Sparking Ongoing Debate on Privacy and Performance

The new version of Firefox has drawn criticism for enabling AI features by default, raising user concerns about privacy and performance. Tests show that with the features enabled, CPU and memory usage rise significantly and the browsing experience suffers, and most users were unaware the features were on.

14.8k 3 hours ago

AI Products

CanIRun.ai

Detects your hardware and shows which AI models can run locally. Supports GPU, CPU, and RAM analysis.

Research tools

Firefox Translations Models

CPU-accelerated neural machine translation models optimized for the Firefox browser's translation feature.

Translate

12.7k

LiteAvatar

An audio-driven real-time 2D chatting avatar generation model that achieves 30fps real-time inference on CPU-only devices.

Chatbot

16k

MixTeX-Latex-OCR

An efficient offline LaTeX recognition tool that runs locally on CPU.

AI text-to-speech

12.2k

Models

GPT OSS 120B (OpenAI): $0.63 per million input tokens, $3.15 per million output tokens, context length 131K tokens

Gemma 3 12B (Google): $0.35 per million input tokens, $0.70 per million output tokens, context length 131K tokens

Gemma 3 4B (Google): $0.14 per million input tokens, $0.28 per million output tokens, context length 131K tokens

Gemma 3 27B (Google): $0.70 per million input tokens, $1.40 per million output tokens, context length 131K tokens

Gemma 3 1B (Google): pricing and context length not listed

DeepSeek-R1-Distill-Llama-8B (DeepSeek): pricing and context length not listed

Qwen_v2.5_7b_base (Alibaba): pricing not listed, context length 128K tokens

Qwen_v2.5_3b_base (Alibaba): pricing and context length not listed

Gemma 2 27B (Google): pricing and context length not listed

Yi-6B-Chat (01-ai): pricing and context length not listed
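The per-million-token prices shown for the four priced models above turn into a per-request cost as (input tokens x input price + output tokens x output price) / 1,000,000. A minimal sketch of that arithmetic follows; the `PRICES` table layout, the `request_cost` helper, and the example token counts are illustrative and not part of the listing.

```python
# (input price, output price) in dollars per million tokens, copied from the listing.
PRICES = {
    "GPT OSS 120B": (0.63, 3.15),
    "Gemma 3 12B": (0.35, 0.70),
    "Gemma 3 4B": (0.14, 0.28),
    "Gemma 3 27B": (0.70, 1.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one request, given its input and output token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 generated tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.6f}")
```

Because output tokens are priced higher than input tokens for every priced model in the list, generation-heavy workloads shift the comparison more than prompt-heavy ones.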

MCP

Uniprof

Uniprof is a tool that simplifies CPU performance analysis. It supports multiple programming languages and runtimes, requires no code changes or additional dependencies, and performs one-click profiling and hotspot analysis either in Docker containers or in host mode.

typescript

8.5k

4.5 points

Talos Mcp

A simple MCP implementation based on the Talos SDK, used to obtain data from multiple Talos nodes, including disk, network interface, CPU, and memory usage, and supports restarting nodes.

10.3k

2.5 points

Monitor Mcp Server

A macOS system-monitoring server built on MCP that reports CPU, memory, and disk usage.

python

7.7k

2.0 points

Mcp System Info

An MCP server that provides real-time system information, allowing you to obtain metrics such as CPU, memory, disk, and network. It supports cross-platform operation and can be accessed through a standardized interface.

python

7.8k

2.0 points

System Resource Monitor

An MCP server that provides real-time system monitoring functions for Claude, capable of monitoring indicators such as CPU, memory, disk, network, battery, and internet speed.

typescript

10.4k

2.0 points

System Resource Monitor

An MCP server that provides real-time system monitoring functions for Claude, supporting monitoring of CPU, memory, disk, network, battery, and internet speed.

typescript

10.2k

2.0 points

Mcp Sentiment

A lightweight Gradio application that uses Hugging Face Transformers for sentiment analysis and sarcasm detection. It is compatible with MCP and can run on CPUs.

python

9.6k

2.0 points

Perfetto Mcp

Perfetto MCP is a Model Context Protocol server that turns natural-language prompts into Perfetto trace analysis, helping developers with performance analysis, ANR detection, identification of CPU-hot threads, lock contention analysis, and memory leak detection without writing SQL.

python

13.5k

2.0 points

Models

DeepSeek OCR Metal MPS

VieNeu TTS 1000h

Qwen3 VL 235B A22B Instruct GGUF

Qwen3 VL 2B Thinking GGUF

Qwen3 VL 32B Thinking GGUF

Qwen3 VL 2B Instruct GGUF

SecInt SmolLM2 360M Nginx

VieNeu TTS

DeepSeek OCR MBQ Quantized V1

Svara Tts V1

Deepseek Moe 16b Q4 K M Cpu Offload Gguf

Gpt Oss 20b Moe Cpu Offload Gguf

TheDrummer_Snowpiercer 15B V3 GGUF

Qwen3 Omni 30B A3B Thinking INT8FP16

Colmodernvbert

Huihui Tongyi DeepResearch 30B A3B Abliterated Q4_K_M GGUF

Llama 3.1 8b Roleplay Airtel Gguf

Openai_gpt Oss 120b NEO Imatrix GGUF

All MiniLM L6 V2 Quant.tflite

Qwen3 Zero Coder Reasoning V2 0.8B NEO EX GGUF
