Chat-UniVi is a unified vision-language large model proposed by researchers from Peking University, Sun Yat-sen University, and other institutions. It combines short training times with strong performance. By building dynamic visual tokens with a density peak clustering algorithm, the model achieves significant results across multiple tasks. The open-source code and datasets offer new ideas and a cost-effective approach for research on vision-language models.
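
To make the idea of dynamic visual tokens more concrete, here is a minimal pure-Python sketch of density peak clustering used to merge similar tokens. It is an illustration under stated assumptions, not the project's actual implementation: the function name `dpc_knn_merge`, the parameters `num_clusters` and `k`, the k-nearest-neighbor density estimate, and the choice to average tokens within each cluster are all assumptions made for this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dpc_knn_merge(tokens, num_clusters, k=2):
    """Hypothetical helper: cluster tokens by density peaks, then average each cluster.

    tokens: list of feature vectors (lists of floats).
    Returns (labels, merged), where merged[c] is the mean of cluster c.
    """
    n = len(tokens)
    if n < 2 or num_clusters < 1:
        raise ValueError("need at least 2 tokens and 1 cluster")
    k = min(k, n - 1)
    dist = [[sq_dist(tokens[i], tokens[j]) for j in range(n)] for i in range(n)]

    # Local density (assumed k-NN variant): high when the k nearest neighbors are close.
    rho = []
    for i in range(n):
        nearest = sorted(dist[i][j] for j in range(n) if j != i)[:k]
        rho.append(math.exp(-sum(nearest) / k))

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n   # distance to the nearest denser point
    parent = [-1] * n   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest point: use its largest distance
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j

    # Cluster centers: points that are both dense and far from any denser point.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:num_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other point inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]

    # Merge: one averaged token per cluster.
    dim = len(tokens[0])
    merged = []
    for c in range(len(centers)):
        members = [tokens[i] for i in range(n) if labels[i] == c]
        merged.append([sum(m[d] for m in members) / len(members) for d in range(dim)])
    return labels, merged


# Six toy 2-D "tokens" forming two tight groups.
toy_tokens = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
toy_labels, toy_merged = dpc_knn_merge(toy_tokens, num_clusters=2)
print(toy_labels, toy_merged)
```

In this sketch the number of output tokens is set by `num_clusters` rather than by the input size, which is one way a model could represent inputs of different lengths with a bounded token budget.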