DocLayout-YOLO
Enhancing document layout analysis through diverse synthetic data and global-to-local adaptive perception.
Tags: Common Product, Image, Document Layout Analysis, Deep Learning
DocLayout-YOLO is a deep learning model for document layout analysis that improves both accuracy and processing speed through diverse synthetic data and global-to-local adaptive perception. It uses the Mesh-candidate BestFit algorithm to generate DocSynth-300K, a large and diverse synthetic dataset; pre-training on this dataset significantly improves fine-tuning performance across different document types. The model also introduces a global-to-local controllable receptive module to better handle the multi-scale variation of document elements. On a range of downstream datasets, DocLayout-YOLO shows clear advantages in both speed and accuracy.
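To make the idea of layout synthesis by best-fit placement more concrete, here is a minimal sketch in Python. It is not the Mesh-candidate BestFit algorithm itself, only a simplified 2D bin-packing illustration under stated assumptions: the names `Rect` and `best_fit_layout`, the fill-ratio score, and the guillotine-style splitting of free space are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h


def best_fit_layout(page_w, page_h, elements, min_fill=0.0):
    """Greedily place (category, width, height) elements on a page.

    Hypothetical simplification: at each step, pick the (element, free cell)
    pair with the highest fill ratio, place the element in the cell's
    top-left corner, then split the leftover space into two free cells.
    """
    free = [Rect(0, 0, page_w, page_h)]
    remaining = list(elements)
    placed = []
    while remaining and free:
        best = None  # (fill ratio, element index, free-cell index)
        for ei, (_, ew, eh) in enumerate(remaining):
            for fi, cell in enumerate(free):
                if ew <= cell.w and eh <= cell.h:
                    fill = (ew * eh) / cell.area
                    if best is None or fill > best[0]:
                        best = (fill, ei, fi)
        if best is None or best[0] < min_fill:
            break  # nothing fits well enough; stop filling
        _, ei, fi = best
        category, ew, eh = remaining.pop(ei)
        cell = free.pop(fi)
        placed.append((category, Rect(cell.x, cell.y, ew, eh)))
        # Leftover space: a strip to the right of the element and a strip below it.
        right = Rect(cell.x + ew, cell.y, cell.w - ew, eh)
        below = Rect(cell.x, cell.y + eh, cell.w, cell.h - eh)
        free.extend(r for r in (right, below) if r.w > 0 and r.h > 0)
    return placed


if __name__ == "__main__":
    pool = [("title", 100, 20), ("text", 100, 60), ("figure", 50, 20)]
    for category, box in best_fit_layout(100, 100, pool):
        print(category, box)
```

In this sketch, each placed box together with its category would serve as one synthetic layout annotation. A real generator would also draw elements from a large, varied pool and render actual content into each box.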
DocLayout-YOLO Visit Over Time
Monthly Visits: 515,580,771
Bounce Rate: 37.20%
Pages per Visit: 5.8
Visit Duration: 00:06:42