seed-tts-eval is a test set for evaluating a model's zero-shot speech generation capability. It supports objective evaluation across diverse domains, with samples drawn from public English and Mandarin speech corpora: 1,000 samples from the Common Voice dataset (English) and 2,000 samples from the DiDiSpeech-2 dataset (Mandarin). Model performance on these samples is measured with objective metrics.
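As a rough illustration of the composition described above, the following Python sketch tallies the test set by source corpus. The sample counts come from the description; the dictionary layout, key names, and helper functions are hypothetical and are not part of the dataset's own tooling.

```python
# Hypothetical summary of the seed-tts-eval composition.
# Sample counts are taken from the description; all names here are illustrative.
SUBSETS = {
    "common_voice": {"language": "English", "samples": 1000},
    "didispeech_2": {"language": "Mandarin", "samples": 2000},
}


def total_samples(subsets):
    """Return the total number of test samples across all source corpora."""
    return sum(info["samples"] for info in subsets.values())


def language_shares(subsets):
    """Return the fraction of the test set contributed by each language."""
    total = total_samples(subsets)
    return {info["language"]: info["samples"] / total for info in subsets.values()}


total = total_samples(SUBSETS)
shares = language_shares(SUBSETS)
print(f"total samples: {total}")
for language, share in shares.items():
    print(f"{language}: {share:.1%}")
```

Under these counts the Mandarin portion is twice the size of the English portion, so it is usually clearer to report results per language than to pool them into a single figure.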