2025-04-28 08:22:33 · AIbase · 17.6k
World's Fastest Inference Model! Qafind Labs Releases ChatDLM Technology
Qafind Labs recently released its newly developed ChatDLM model, an innovation that has drawn widespread attention in the field of artificial intelligence. ChatDLM is described as the first model to deeply integrate "Block Diffusion" and "Mixture of Experts (MoE)". It achieves an inference speed of 2,800 tokens/s on GPUs and supports an ultra-large context window of 131,072 tokens, opening the door to document-level generation and real-time conversation.
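ChatDLM's internals have not been published, so the following is only a toy sketch of the two ideas the announcement names: block diffusion (refining fixed-size blocks of a sequence in parallel over several denoising steps, rather than generating strictly left to right) and MoE routing (activating only a few expert networks per block). All names, sizes, and routing choices here are illustrative assumptions, not Qafind Labs' actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

D, N_EXPERTS, TOP_K = 16, 4, 2          # hidden size, expert count, experts per block (toy values)

# Toy "experts": independent linear maps standing in for expert feed-forward networks.
experts = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) * 0.1

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_denoise_step(block):
    """Route one block (block_len, D) to its top-k experts and mix their outputs."""
    summary = block.mean(axis=0)                 # pool the block into a routing vector
    gates = softmax(summary @ router)            # gate weight per expert
    top = np.argsort(gates)[-TOP_K:]             # indices of the top-k experts
    weights = gates[top] / gates[top].sum()      # renormalise the selected gates
    out = sum(w * (block @ experts[i]) for w, i in zip(weights, top))
    return block + out                           # residual refinement of the block

def block_diffusion_generate(seq_len, block_len=8, steps=4):
    """Iteratively denoise all blocks in parallel, starting from pure noise."""
    x = rng.standard_normal((seq_len, D))
    n_blocks = seq_len // block_len
    for _ in range(steps):
        # Each block is refined independently, so blocks can run in parallel on GPU.
        x = np.concatenate([
            moe_denoise_step(x[b * block_len:(b + 1) * block_len])
            for b in range(n_blocks)
        ])
    return x

out = block_diffusion_generate(seq_len=32)
print(out.shape)  # (32, 16)
```

Because every block is denoised independently within a step, and only TOP_K of the N_EXPERTS experts run per block, compute per step scales with the active experts rather than the full parameter count; combining the two ideas is one plausible route to the high throughput the article reports.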