As globalization deepens, Neural Machine Translation (NMT) technology plays an increasingly important role in cross-language communication. While current translation tools excel at handling technical documents and simple texts, they still face numerous challenges with literary texts. Literary works are rich in culturally and emotionally loaded expressions, such as metaphors and similes, which traditional translation systems struggle to convey accurately.
To address this shortcoming, a Tencent research team has launched a new translation system called DRT-o1, available in two versions: DRT-o1-7B and DRT-o1-14B. Both models are built on Qwen2.5 and introduce a novel multi-agent framework specifically optimized for translating metaphors and similes. The research team collected approximately 400 public domain English books from Project Gutenberg, extracted 577,600 sentences, and selected 63,000 sentences containing metaphors and similes as training data.
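As a rough illustration of this data-preparation step, the sketch below splits a directory of plain-text books into sentences and keeps those that look figurative. The directory layout and the keyword heuristic are assumptions for illustration only; the team's actual selection criteria are not reproduced here.

```python
# Minimal sketch of the data-preparation step: split public domain books into
# sentences and keep those that look like similes or metaphors. The keyword
# heuristic and file layout below are stand-in assumptions, not the actual
# selection procedure used by the DRT-o1 team.
import re
from pathlib import Path

SIMILE_MARKERS = re.compile(r"\b(like a|as if|as though|as \w+ as)\b", re.IGNORECASE)

def split_sentences(text: str) -> list[str]:
    # Naive sentence splitter: break on ., !, ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def collect_figurative_sentences(book_dir: str) -> list[str]:
    selected = []
    for book in Path(book_dir).glob("*.txt"):  # hypothetical layout: one .txt file per book
        for sentence in split_sentences(book.read_text(encoding="utf-8", errors="ignore")):
            if SIMILE_MARKERS.search(sentence):
                selected.append(sentence)
    return selected

if __name__ == "__main__":
    sentences = collect_figurative_sentences("gutenberg_books")  # hypothetical directory name
    print(f"kept {len(sentences)} candidate sentences")
```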
The DRT-o1 system employs a collaborative approach with three roles: Translator, Advisor, and Evaluator. The workflow of this multi-agent framework begins by identifying the key terms in the source sentence and translating them one by one to ensure contextual accuracy. An initial translation is then generated and put through multiple rounds of refinement and evaluation, yielding a fluent and comprehensible final translation. This process allows the system to better capture the cultural nuances and emotional subtleties of literary works, as the sketch below illustrates.
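The following is a minimal sketch of that translator-advisor-evaluator loop, written against a generic `llm` callable that maps a prompt to a model reply. The prompt wording, score threshold, and round limit are illustrative assumptions rather than the exact settings used by DRT-o1.

```python
# Sketch of the translator-advisor-evaluator loop described above. The `llm`
# argument is any callable that maps a prompt string to a model reply; the
# prompts, score threshold, and round limit are illustrative assumptions.
from typing import Callable

def translate_with_agents(
    source: str,
    llm: Callable[[str], str],
    max_rounds: int = 3,
    score_threshold: float = 8.0,
) -> str:
    # Step 1: the translator identifies key terms (e.g. metaphors) and drafts a translation.
    keywords = llm(f"List the metaphors, similes, and other key terms in: {source}")
    draft = llm(
        f"Translate the sentence, keeping these key terms faithful to context.\n"
        f"Key terms: {keywords}\nSentence: {source}"
    )

    # Steps 2+: the advisor suggests refinements, the evaluator scores the result,
    # and the loop stops once the score clears the threshold or rounds run out.
    for _ in range(max_rounds):
        advice = llm(f"Suggest improvements for this translation of '{source}': {draft}")
        draft = llm(
            f"Revise the translation using this advice.\nAdvice: {advice}\nTranslation: {draft}"
        )
        score_text = llm(
            f"Rate this translation of '{source}' from 0 to 10, reply with a number only: {draft}"
        )
        try:
            if float(score_text.strip()) >= score_threshold:
                break
        except ValueError:
            break  # evaluator reply was not a parsable number; keep the latest draft
    return draft
```

Any chat-completion backend can be plugged in as `llm`, which keeps the loop itself independent of a particular model provider.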
Experimental results show that DRT-o1-7B improves by 8.26 BLEU points and 3.36 COMET points over its base model, Qwen2.5-7B-Instruct. DRT-o1-14B likewise gains 7.33 BLEU points and 1.66 COMET points over Qwen2.5-14B-Instruct. These results indicate that DRT-o1 surpasses existing models in literary translation; notably, the 7B version even outperforms the larger QwQ-32B model.
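For readers unfamiliar with these metrics, the sketch below shows how corpus-level BLEU and COMET scores of this kind are typically computed, assuming the `sacrebleu` and `unbabel-comet` packages. It is not the paper's exact evaluation pipeline.

```python
# Illustration of how BLEU/COMET scores of this kind are typically measured,
# assuming the `sacrebleu` and `unbabel-comet` packages; this is not the
# paper's exact evaluation pipeline.
import sacrebleu
from comet import download_model, load_from_checkpoint

def score_system(sources: list[str], hypotheses: list[str], references: list[str]) -> tuple[float, float]:
    # Corpus-level BLEU against a single reference set.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score

    # Reference-based COMET with a publicly released checkpoint.
    comet_model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
    comet = comet_model.predict(
        [{"src": s, "mt": h, "ref": r} for s, h, r in zip(sources, hypotheses, references)],
        batch_size=8,
        gpus=0,
    ).system_score
    return bleu, comet
```

Comparing a fine-tuned model against its base model is then a matter of scoring both systems' outputs on the same test split and taking the difference.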
Through its multi-agent framework and long chain-of-thought reasoning, the DRT-o1 system brings groundbreaking advances to the field of neural machine translation. It not only enhances the accuracy and fluency of translations but also offers a new approach to translating complex literary texts.
Project link: https://github.com/krystalan/DRT-o1
Highlights:
🌟 The DRT-o1 system includes two versions (7B and 14B) and optimizes the translation of metaphors and similes using a multi-agent framework.
📚 The research team extracted and selected 63,000 literary sentences from approximately 400 public domain books as training data.
🚀 DRT-o1 shows significant improvements in both BLEU and COMET scores, demonstrating strong capabilities in literary translation.