YOLO-World
Real-time open vocabulary object detection
CommonProductImageReal-timeObject detection
YOLO-World is an advanced real-time open vocabulary object detector based on the You Only Look Once (YOLO) series of detectors. It enhances open vocabulary detection capabilities through visual-language modeling and pre-training on a large dataset. It employs a novel reparameterizable visual-language path aggregation network (RepVL-PAN) and region-text contrastive loss, promoting interaction between visual and linguistic information. YOLO-World efficiently detects a variety of objects in a zero-shot manner, exhibiting high efficiency. On the challenging LVIS dataset, YOLO-World achieves 35.4 AP and 52.0 FPS on a V100, outperforming many state-of-the-art methods in both accuracy and speed. Moreover, fine-tuned YOLO-World demonstrates outstanding performance on multiple downstream tasks, including object detection and open vocabulary instance segmentation.
YOLO-World Visit Over Time
Monthly Visits
488643166
Bounce Rate
37.28%
Page per Visit
5.7
Visit Duration
00:06:37