Anim400K is a dataset designed for automatic video dubbing. It contains over 425,000 aligned animated audio-video clips in Japanese and English, covering a variety of themes. Each clip comes with rich metadata that developers can use to train and evaluate models across a range of video tasks. Its intended applications include automatic dubbing, multimodal learning, and speech and image recognition. For details, see the GitHub project page: https://github.com/davidmchan/Anim400K.