Copyleaks, a plagiarism detection company, recently released a report stating that up to 60% of the content generated by OpenAI's GPT-3.5 large language model contains some form of plagiarism. Copyleaks uses a proprietary scoring method that weighs factors such as text similarity, minor alterations, and paraphrasing, and it labels content flagged as similar "plagiarism." Output on subjects such as computer science, physics, and psychology shows the highest similarity, while drama, humanities, and English show lower similarity rates. OpenAI has stated that it has implemented various measures to limit unintentional memorization, and its terms of use explicitly prohibit users from deliberately having the model reproduce content. Beyond the well-known lawsuit brought by The New York Times, some content creators also argue that because AI models are trained on their works, the production of similar content should entitle them to royalties.
Up to 60% of GPT-3.5 Model Output Suspected of Plagiarism
Source: 站长之家 (ChinaZ)
© Copyright AIbase Base 2024, Click to View Source - https://www.aibase.com/news/5613