ChinaZ.com, June 22 news: Tencent Cloud recently launched its large model knowledge engine, a tool for quickly building knowledge service assistants. It is designed to handle complex PDF documents such as industry reports, conference slide decks, textbooks, instruction manuals, contracts, receipts, and academic papers. These documents often mix text, images, and tables, and their complex layouts are difficult for traditional OCR technology to process.


The Tencent Cloud large model knowledge engine uses a multi-modal document parsing model developed by Tencent YouTu Lab. The model first analyzes the page layout to locate each content region and identify its type, then outputs the content in a coherent order that matches how a person would read the page. It can also process complex layout elements such as tables and formulas, and can infer and restore table data and structure, which significantly improves recognition accuracy.
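To make the idea of "layout analysis followed by reading-order output" more concrete, here is a minimal Python sketch. It is not Tencent's implementation: the `Block` structure, the two-column ordering rule, and all function names are hypothetical, chosen only to illustrate how detected regions might be ordered and rendered as Markdown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """One detected layout region (hypothetical, not the engine's real output)."""
    kind: str                               # "title", "text", or "table"
    x: float                                # left edge of the bounding box
    y: float                                # top edge of the bounding box
    text: str = ""
    rows: Optional[List[List[str]]] = None  # cell text, used for tables only


def reading_order(blocks: List[Block], page_width: float) -> List[Block]:
    """Assume a simple two-column page: left column top to bottom, then right."""
    mid = page_width / 2
    return sorted(blocks, key=lambda b: (b.x >= mid, b.y))


def table_to_markdown(rows: List[List[str]]) -> str:
    """Render a table as Markdown, treating the first row as the header."""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def to_markdown(blocks: List[Block], page_width: float) -> str:
    """Emit the blocks in reading order as a single Markdown string."""
    parts = []
    for block in reading_order(blocks, page_width):
        if block.kind == "title":
            parts.append("# " + block.text)
        elif block.kind == "table" and block.rows:
            parts.append(table_to_markdown(block.rows))
        else:
            parts.append(block.text)
    return "\n\n".join(parts)
```

A production parser would need far more than this sketch covers, for example elements that span both columns, nested tables, and formulas. The example only shows why region type and position must both be known before text can be emitted in a readable sequence.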

Additionally, the knowledge engine supports more than 20 languages and can recognize Traditional Chinese characters and rare characters. It can convert images and PDF documents into Markdown, providing structured source data for large model training and helping improve a model's generalization and adaptability. The document parsing function currently exceeds 98% accuracy and has been deployed in multiple products, where it is offered as a standardized API service.

Experience the demo at: https://ocrdemo.cloud.tencent.com/