Mistral-NeMo-Minitron 8B is a small language model released by NVIDIA as a compact version of the Mistral NeMo 12B model. It was developed with the NVIDIA NeMo platform by combining two optimization techniques, pruning and distillation, which reduce computational cost while keeping accuracy comparable to that of the original model. Its smaller size allows it to run in GPU-accelerated data centers, cloud environments, and workstations.
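
The two techniques named above can be illustrated with a toy sketch. This is not NVIDIA's implementation and does not use the NeMo API; it assumes simple magnitude pruning on a flat list of weights and a temperature-softened KL-divergence loss between teacher and student outputs, both chosen only for illustration. The function names `prune_by_magnitude` and `distillation_loss` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence KL(teacher || student) over softened distributions.

    Illustrative assumption: the student is trained to match the
    teacher's output distribution; a loss of zero means they agree.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def prune_by_magnitude(weights, keep_ratio):
    """Zero out the smallest-magnitude weights.

    Illustrative assumption: unstructured magnitude pruning that keeps
    at least `keep_ratio` of the weights (ties may keep more).
    """
    keep = max(1, int(len(weights) * keep_ratio))
    threshold = sorted((abs(w) for w in weights), reverse=True)[keep - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


if __name__ == "__main__":
    # Prune a toy weight vector, then compare toy teacher/student logits.
    print(prune_by_magnitude([0.1, -0.5, 0.05, 0.9], keep_ratio=0.5))
    print(distillation_loss([1.0, 2.0, 3.0], [1.2, 1.9, 2.5]))
```

In this sketch, pruning shrinks the model by discarding low-importance parameters, and the distillation loss gives the smaller model a training signal for recovering the accuracy of the larger one.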