CosyVoice is a large-scale multilingual voice generation model. Beyond generating speech in multiple languages, it offers full-stack capabilities covering inference, training, and deployment. Its significance for speech synthesis lies in producing natural, fluent, near-human voices across a range of language environments. CosyVoice was developed by the FunAudioLLM team and is released under the Apache-2.0 license.