TOFU
The TOFU dataset provides a benchmark for evaluating how well large language models can forget (unlearn) information about fictional authors.
Common Product, Productivity, Language Model, Forgetting
The TOFU dataset contains question-answer pairs generated from the profiles of 200 fictional authors. It is used to evaluate the forgetting (unlearning) performance of large language models on realistic tasks. The goal of the task is to make a fine-tuned model forget a chosen fraction of the data, called the forget set, and the benchmark covers several forget set ratios. Because the dataset uses a question-answer format, it suits popular chat models such as Llama2, Mistral, or Qwen, but it can also be used with any other large language model. The accompanying codebase is written for the Llama2 chat and Phi-1.5 models and can be easily adapted to other models.
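To make the idea of a forget set ratio concrete, here is a minimal sketch in Python. It does not use the TOFU codebase or its real data: the `QAPair` class, the `split_forget_retain` function, the toy questions, and the rule of picking the highest-numbered authors for the forget set are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One question-answer pair about a fictional author (hypothetical schema)."""
    author_id: int
    question: str
    answer: str


def split_forget_retain(pairs, forget_fraction):
    """Split QA pairs by author into (forget_set, retain_set).

    Roughly `forget_fraction` of the authors go into the forget set.
    Choosing the highest author ids is an arbitrary, illustrative rule.
    """
    if not 0.0 <= forget_fraction <= 1.0:
        raise ValueError("forget_fraction must be between 0 and 1")

    author_ids = sorted({p.author_id for p in pairs})
    n_forget = int(round(len(author_ids) * forget_fraction))
    forget_authors = set(author_ids[len(author_ids) - n_forget:])

    forget_set = [p for p in pairs if p.author_id in forget_authors]
    retain_set = [p for p in pairs if p.author_id not in forget_authors]
    return forget_set, retain_set


# Toy data: 200 placeholder authors with two placeholder questions each.
pairs = [
    QAPair(author_id=a, question=f"Question {q} about author {a}?", answer=f"Answer {q}")
    for a in range(200)
    for q in range(2)
]

# Forget 10% of the authors, keep the rest as the retain set.
forget_set, retain_set = split_forget_retain(pairs, forget_fraction=0.10)
print(len(forget_set), len(retain_set))
```

Splitting by author rather than by individual question keeps everything known about a given author on one side of the split, which matches the intuition of forgetting a person entirely. Whether the benchmark itself partitions its data this way should be checked against its own documentation.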
TOFU Visit Over Time
Monthly Visits: 488643166
Bounce Rate: 37.28%
Page per Visit: 5.7
Visit Duration: 00:06:37