The TOFU dataset contains question-answer pairs about 200 fictitious authors who do not exist, so a model's only knowledge of them comes from fine-tuning. It is used to evaluate the forgetting (unlearning) performance of large language models, serving as a controlled stand-in for real-world unlearning tasks. The goal is to make models that were fine-tuned on the full dataset forget designated subsets, with forget sets of various sizes. Because the data is in question-answer format, it is well suited to popular chat models such as Llama2, Mistral, or Qwen, but it can be applied to any other large language model. The corresponding codebase targets the Llama2 chat and Phi-1.5 models and can be easily adapted to others.
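As a minimal sketch of why the question-answer format maps directly onto chat models, the snippet below wraps a QA pair in a Llama2-style `[INST]` chat prompt. The example pair and the helper function are invented for illustration; they are not part of the TOFU codebase.

```python
# Illustrative only: format a TOFU-style QA pair as a Llama2 chat prompt.
# The example pair below is invented; real pairs describe fictitious authors.

def format_llama2_chat(question: str, answer: str) -> str:
    """Wrap a question-answer pair in the Llama2 [INST] chat template."""
    return f"[INST] {question} [/INST] {answer}"

pair = {
    "question": "Where was the author born?",
    "answer": "The author was born in a small coastal town.",
}

prompt = format_llama2_chat(pair["question"], pair["answer"])
print(prompt)
```

The same pairs can be reformatted for other chat templates (e.g. Mistral or Qwen) by swapping out the wrapping function, which is why the dataset is not tied to any single model family.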