PARTNR
Benchmarking for Multi-Agent Task Planning and Reasoning
Common Product, Others, Multi-Agent, Natural Language Processing
PARTNR is a large-scale benchmark released by Meta FAIR. It comprises 100,000 natural language tasks designed for studying multi-agent reasoning and planning. The tasks are generated with large language models (LLMs), and simulation loops are used to minimize errors in the generated tasks. PARTNR also supports evaluating AI agents in collaboration with real human partners through human-in-the-loop infrastructure. The benchmark reveals significant limitations of existing LLM-based planners in task coordination, tracking, and error recovery: humans solve 93% of tasks, compared with just 30% for LLMs.
PARTNR Visits Over Time
Monthly Visits: 11,833
Bounce Rate: 44.03%
Pages per Visit: 2.4
Visit Duration: 00:01:15