
deception

Public

Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claude, GPT-4, Gemini, Llama, etc.) with standardized evaluation metrics.

Created: 2024-10-22T23:43:30
Updated: 2025-03-20T08:18:57
Stars: 24
Stars Increase: 0