WebVoyager

An end-to-end web agent built on a large multimodal model

Common Product, Productivity, Web Agent, Multimodal Model
WebVoyager is a web agent powered by a large multimodal model (LMM) that completes user instructions end-to-end by interacting with real-world websites. Its authors also propose an evaluation protocol that uses the multimodal understanding of GPT-4V to judge open-world agent tasks automatically, since such tasks are otherwise hard to score without human review. The agent is evaluated on real-world tasks collected from 15 widely used websites. WebVoyager achieves a 55.7% task success rate, significantly outperforming both GPT-4 (All Tools) and a text-only WebVoyager setting. The automatic evaluation agrees with human judgment in 85.3% of cases, which supports its use in further development of web agents for real-world environments.
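The two headline figures above, task success rate and agreement between automatic and human judgment, can be illustrated with a minimal Python sketch. It assumes each task receives a simple pass/fail verdict and that "consistency" means the share of tasks on which both verdicts match; the original evaluation may use a different measure. The function names and the toy verdict lists are hypothetical and are not WebVoyager data.

```python
from typing import Sequence


def success_rate(outcomes: Sequence[bool]) -> float:
    """Return the fraction of tasks judged successful."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(outcomes) / len(outcomes)


def agreement_rate(auto: Sequence[bool], human: Sequence[bool]) -> float:
    """Return the fraction of tasks where automatic and human verdicts match."""
    if len(auto) != len(human):
        raise ValueError("verdict lists must have the same length")
    if not auto:
        raise ValueError("verdict lists must not be empty")
    matches = sum(a == h for a, h in zip(auto, human))
    return matches / len(auto)


# Toy verdicts invented for illustration only (assumption, not real results).
human_verdicts = [True, False, True, True, False]
auto_verdicts = [True, False, False, True, False]

print(f"Task success rate (human-judged): {success_rate(human_verdicts):.1%}")
print(f"Auto/human agreement: {agreement_rate(auto_verdicts, human_verdicts):.1%}")
```

Simple agreement is easy to read but does not correct for chance agreement; a chance-corrected statistic would be a reasonable alternative when verdicts are heavily skewed toward one class.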

WebVoyager Visit Over Time

Monthly Visits: 19,075,321
Bounce Rate: 45.07%
Pages per Visit: 5.5
Visit Duration: 00:05:32

WebVoyager Alternatives