The New York Times and the Daily News have hit an unexpected setback in their copyright lawsuit against OpenAI: an OpenAI engineer accidentally deleted search data stored on a virtual machine, data that could have become crucial evidence in the high-profile dispute.
According to a letter submitted to the U.S. District Court for the Southern District of New York on Wednesday evening, lawyers and technical experts for the two media companies had spent more than 150 hours searching OpenAI's training dataset. On November 14, however, an OpenAI engineer accidentally deleted all the search data stored on the virtual machine.
Although OpenAI subsequently attempted to recover the data and had some success, the folder structure and file names were "irretrievably" lost, leaving the recovered data essentially useless for pinpointing where the copied news articles were used in building the AI models.
The plaintiffs' lawyers emphasized that there is currently no evidence that the deletion was intentional. Still, they said, the incident highlights that OpenAI has the technical means to search its own dataset for potentially infringing content.
The case underscores the legal gray areas surrounding the use of training data for artificial intelligence. The two publishers pointed out that the accidental deletion not only wasted significant human and computational resources but could also weaken a critical chain of evidence in the upcoming legal proceedings.
OpenAI has so far declined to comment on the incident. The lawsuit is about more than copyright: it could also mark an important milestone for intellectual property protection in the age of artificial intelligence.