Researchers at HuggingFace have recently addressed the challenge of deploying large pre-trained speech recognition models in resource-constrained environments. Using a large open-source audio corpus and pseudo-labeling, they distilled a smaller version of the Whisper model, known as Distil-Whisper. The distilled model retains the original model's robustness under challenging acoustic conditions while reducing hallucination errors on long-form audio. The study introduces a large-scale pseudo-labeling approach, offering a new avenue for knowledge distillation on speech data and a practical answer to the deployment problem. Whisper, as a large pre-trained ASR model, performs strongly across a wide range of datasets, and Distil-Whisper stays within roughly 1% WER of it on out-of-distribution data in a zero-shot setting, bringing a new option for deploying speech recognition models.
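
Because the headline result is a distilled checkpoint that can stand in for Whisper at inference time, a minimal transcription sketch helps illustrate the deployment story. The snippet below assumes the `distil-whisper/distil-large-v2` checkpoint on the Hugging Face Hub and a local file `sample_audio.wav`; both names, and the chunked long-form settings, are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal sketch: transcribing audio with a Distil-Whisper checkpoint via the
# Hugging Face transformers pipeline. Checkpoint name and audio path are
# assumptions for illustration.
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",  # assumed checkpoint name
    torch_dtype=torch.float16 if device.startswith("cuda") else torch.float32,
    device=device,
)

# chunk_length_s splits a long recording into overlapping windows, enabling
# the pipeline's chunked long-form transcription mode.
result = asr("sample_audio.wav", chunk_length_s=15)
print(result["text"])
```

In practice, the same call works with the full Whisper checkpoints, so the distilled model can be swapped in with a one-line change when latency or memory is the constraint.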