Glyph-ByT5-v2 is a model developed by Microsoft Research Asia for accurate multilingual visual text rendering. It supports accurate visual text rendering in 10 languages and significantly improves aesthetic quality. To achieve this, the work creates high-quality datasets of multilingual glyph-text pairs and graphic design images, builds a multilingual visual paragraph benchmark, and uses a state-of-the-art step-aware preference learning method to enhance visual aesthetic quality.