Recently, the ModelScope community partnered with vLLM and FastChat to offer Chinese developers faster, more efficient LLM inference and deployment services. FastChat is an open platform for training, serving, and evaluating LLM-based chatbots; vLLM, developed by researchers from the University of California, Berkeley, Stanford University, and the University of California, San Diego, is a high-throughput LLM serving system. Developers can plug vLLM into FastChat as its inference engine to get high-throughput model inference, and through the two together they can quickly load models from ModelScope for inference.
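As a minimal sketch of how this fits together (assuming vLLM and the ModelScope SDK are installed, e.g. `pip install vllm modelscope`; the model ID `qwen/Qwen-7B-Chat` is only an illustrative example), setting the `VLLM_USE_MODELSCOPE` environment variable tells vLLM to pull model weights from ModelScope instead of Hugging Face:

```python
import os

# Tell vLLM to download the model from ModelScope rather than Hugging Face.
# This must be set before vllm is imported/initialized.
os.environ["VLLM_USE_MODELSCOPE"] = "True"

from vllm import LLM, SamplingParams

# Load a ModelScope-hosted model (example model ID; substitute your own).
llm = LLM(model="qwen/Qwen-7B-Chat", trust_remote_code=True)

# Basic sampling configuration for generation.
params = SamplingParams(temperature=0.7, max_tokens=128)

# Run batched offline inference over a list of prompts.
outputs = llm.generate(["Introduce the ModelScope community in one sentence."], params)
for output in outputs:
    print(output.outputs[0].text)
```

For serving through FastChat rather than the offline API, the same environment variable can be set before launching FastChat's vLLM worker (e.g. `python -m fastchat.serve.vllm_worker --model-path qwen/Qwen-7B-Chat`) alongside the FastChat controller and API server; the exact launch flags may vary by FastChat version.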