ZDNET’s key takeaways
- Cloudflare claimed Perplexity ignores websites’ wishes in its hunt for content.
- Cloudflare said other AI companies, such as OpenAI, do not swipe content.
- Cloudflare now offers services to block aggressive AI crawlers.
Cloudflare, a leading content delivery network (CDN) company, has accused the AI startup Perplexity of evading websites’ “no crawl” directives by stealthily deploying web crawlers to scrape content from sites that have explicitly blocked its official bots.
If that sounds familiar, you’ve heard these accusations before. Last year, WIRED and Forbes both accused Perplexity of doing the same thing to their sites.
How Perplexity is bypassing ‘no crawl’ directives
According to Cloudflare, when Perplexity’s web crawler encountered a robots.txt file, which websites use to block their content from being crawled, Perplexity pretended to be an ordinary Chrome web browser on a Mac. This enabled it to bypass the bot barriers.
Cloudflare began investigating when it received complaints from customers who had “both disallowed Perplexity crawling activity in their robots.txt files and also created WAF [Web Application Firewall] rules to specifically block both of Perplexity’s declared crawlers: PerplexityBot and Perplexity-User.” The customers said their content still ended up in Perplexity, even after they had blocked it.
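To see why a robots.txt rule by itself does not stop a crawler that hides its identity, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt contents, the example URL, and the Chrome-on-Mac user-agent string are illustrative assumptions, not taken from Cloudflare's customers or test domains. Note that robots.txt is a check a crawler runs on itself; nothing in the file enforces the answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that names only the two declared Perplexity crawlers.
NAMED_BLOCK = """\
User-agent: PerplexityBot
User-agent: Perplexity-User
Disallow: /
"""

# Hypothetical robots.txt that disallows every automated client.
BLANKET_BLOCK = """\
User-agent: *
Disallow: /
"""

# Illustrative browser-style user-agent string (an assumption for this sketch).
CHROME_ON_MAC = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

EXAMPLE_URL = "https://example.com/article"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a well-behaved crawler would conclude from robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for label, rules in (("named", NAMED_BLOCK), ("blanket", BLANKET_BLOCK)):
        for agent in ("PerplexityBot", "Perplexity-User", CHROME_ON_MAC):
            verdict = "allowed" if is_allowed(rules, agent, EXAMPLE_URL) else "disallowed"
            print(f"{label:8} {agent[:20]:20} {verdict}")
```

Under the named rules, a client that identifies itself as a generic browser matches no entry and is treated as allowed, which is why rules keyed to a declared crawler name depend on the crawler being honest about who it is. The blanket rule disallows every user agent, but it still only works if the crawler chooses to consult and obey it.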
The CDN then set up new test domains, explicitly prohibiting all automated access both in their robots.txt files and through specific WAF rules that blocked crawling from Perplexity’s acknowledged crawlers. Cloudflare found that Perplexity would use multiple IP addresses not listed in Perplexity’s official IP range and would rotate through these IPs to get at the sites’ content and report on it.
“In addition to rotating IPs, we observed requests coming from different Autonomous System Numbers (ASNs) to evade website blocks,” Cloudflare said. “This activity was observed across tens of thousands of domains and millions of requests per day.”
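One way a site operator might reason about this kind of traffic is to compare each request's source address with the ranges a crawler operator publishes. The Python sketch below is a simplified illustration under stated assumptions, not Cloudflare's detection method: the CIDR ranges are reserved documentation blocks standing in for a real published list, and the function names are hypothetical.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT a real published list.
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Crawler names quoted in the article.
DECLARED_AGENTS = ("PerplexityBot", "Perplexity-User")


def in_declared_range(ip: str) -> bool:
    """True if the address falls inside one of the placeholder ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DECLARED_RANGES)


def classify(ip: str, user_agent: str) -> str:
    """Roughly label a request by its claimed identity and source address."""
    declares_itself = any(name in user_agent for name in DECLARED_AGENTS)
    if declares_itself and in_declared_range(ip):
        return "declared crawler"
    if declares_itself:
        return "crawler name from an unlisted IP"
    return "undeclared"


if __name__ == "__main__":
    print(classify("192.0.2.10", "PerplexityBot/1.0"))
    print(classify("203.0.113.5", "PerplexityBot/1.0"))
    print(classify("203.0.113.5", "Mozilla/5.0 (Macintosh) Chrome/124.0.0.0"))
```

The last case is the hard one: a request that uses a browser-style user agent from an unlisted, rotating address looks the same as an ordinary visitor to a check like this, so range and name matching alone cannot single it out.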
The result? Cloudflare said it observed “Perplexity not only accessed such content but was able to provide detailed answers about it when queried by users.”
Cloudflare has a plan to stop Perplexity
Moving forward, Cloudflare has claimed its bot management system can spot and block Perplexity’s hidden User Agent. Any bot management customer who has an existing block rule in place is already protected.
If you don’t want to block such traffic on the grounds that it might be from real users, you can set up rules to challenge requests. This allows real humans to proceed. Customers with existing challenge rules are already protected.
Finally, Cloudflare has added signature matches for the stealth crawler to its managed rule, which blocks AI crawling activity. This rule is available to all Cloudflare customers, including free users.
Cloudflare noted that OpenAI does obey robots.txt restrictions and doesn’t try to break into websites. That said, Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed copyrights in training and operating its AI systems.
Cloudflare has recently started offering its customers the option to automatically block all AI crawlers. To complement the move to block AI crawlers, Cloudflare has also launched its “Pay Per Crawl” program, enabling publishers to set rates for AI companies that want to scrape their content.
This follows numerous deals in which media businesses are permitting AI companies to legally use their content to train their large language models (LLMs). Examples include The New York Times with Amazon, The Washington Post with OpenAI, and Perplexity with Gannett Publishing.
In the meantime, Perplexity appears to continue to break the rules in its hunt for content. ZDNET has asked Perplexity about Cloudflare’s claims, but the company has not responded.