On Wednesday, Cloudflare gave its web hosting customers a way to block AI bots from scraping website content and using the data to train machine learning models without permission.
The company said it acted because its customers dislike AI bots and because it wants to help keep the internet safe for content creators.
"We have clearly heard that customers do not want AI bots to access their websites, especially those that do so dishonestly. To help, we have added a new one-click feature to block all AI bots," the company said.
Website owners already have a relatively effective way to block bots: the widely supported robots.txt file. When it is placed in a website's root directory, automated web crawlers are expected to read it and comply with any instructions in it that tell them to stay out.
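As a rough illustration of how a well-behaved crawler might honor such a file, here is a minimal Python sketch using the standard library's urllib.robotparser. The crawler names (ExampleAIBot, OtherBot), the example.com URLs, and the file contents are hypothetical and chosen only for illustration; real crawlers may apply the rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a site owner might publish it
# at https://example.com/robots.txt (names and paths are assumptions).
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The hypothetical AI crawler is shut out of the whole site.
    print(is_allowed("ExampleAIBot", "https://example.com/articles/1"))
    # Other crawlers may read articles but not the /private/ area.
    print(is_allowed("OtherBot", "https://example.com/articles/1"))
    print(is_allowed("OtherBot", "https://example.com/private/page"))
```

The check runs on the crawler's side, so the file only works against bots that choose to perform it.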
Given the widely held belief that generative AI is built on theft, and with many lawsuits attempting to hold AI companies accountable, companies that deal in "laundered" content have generously allowed web publishers to opt out of the theft.
Last August, OpenAI published guidance on using robots.txt instructions to block its GPTBot web crawler, possibly because it knew people were concerned about content being scraped and used for AI training without consent. Google took similar measures the following month. That same September, Cloudflare began offering a way to block AI bots that follow the rules, and 85% of its customers reportedly enabled the feature.
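For a publisher who wants to write such opt-out rules by hand, the sketch below builds the robots.txt text for a list of crawler tokens. GPTBot is the token named above; Google-Extended is assumed here to be the Google control in question, and the helper name build_robots_txt is hypothetical. Treat the token list as a starting point rather than a complete one.

```python
# User-agent tokens to block. GPTBot is named in the text; Google-Extended
# is an assumption about which Google control is meant. Adjust as needed.
AI_CRAWLER_TOKENS = ["GPTBot", "Google-Extended"]


def build_robots_txt(tokens):
    """Build robots.txt text that disallows the whole site for each token."""
    # One group per crawler, with a blank line separating the groups.
    groups = [f"User-agent: {token}\nDisallow: /\n" for token in tokens]
    return "\n".join(groups)


if __name__ == "__main__":
    # The output would be saved as robots.txt in the site's root directory.
    print(build_robots_txt(AI_CRAWLER_TOKENS))
```

Rules like these only affect crawlers that identify themselves honestly and choose to comply.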
Key Points:
⭐️ Cloudflare launches a one-click feature to block AI web crawlers, so that website content is not scraped by AI bots without authorization.
⭐️ Generative AI is widely viewed as built on theft, and major companies are taking measures to let publishers keep AI bots from using their content without authorization.
⭐️ Cloudflare uses machine learning models to identify and block web crawlers that disguise themselves, protecting the rights of content creators on the internet.