PARTNR

Benchmarking for Multi-Agent Task Planning and Reasoning

PARTNR is a large-scale benchmark released by Meta FAIR, comprising 100,000 natural language tasks for studying multi-agent reasoning and planning. The tasks are generated with large language models (LLMs), using simulation in the loop to minimize generation errors. PARTNR also supports evaluating AI agents in collaboration with real human partners through human-in-the-loop infrastructure. The benchmark reveals significant limitations of existing LLM-based planners in task coordination, task tracking, and recovery from errors: humans solve 93% of tasks, compared with just 30% for LLMs.

PARTNR Visit Over Time

Monthly Visits: 11,833
Bounce Rate: 44.03%
Pages per Visit: 2.4
Visit Duration: 00:01:15
