ZhiXing Column · 2025-07-09

Startup Commentary: “AI Crawlers Are Omnipresent, and Cloudflare Aims to Be the ‘Savior’ of Websites”

Positive Review: Cloudflare Builds a Technological Defense Line for Content Creators and Reconstructs a New Symbiotic Ecosystem between AI and Content Providers

Indiscriminate scraping of online content for large-model training has become an industry norm, and US court rulings have blocked copyright holders’ legal avenues for protecting their rights. Against this backdrop, Cloudflare’s combination of “blocking AI crawlers by default” and “pay-per-use scraping” is a crucial breakthrough in the contest between content creators and AI companies. Its value lies not only in giving copyright holders technical means of defense but also in using a standardized mechanism to move the two sides from “confrontation” to “cooperation”, injecting new momentum into the sustainable development of the Internet content ecosystem.

First, Cloudflare’s technical solution directly addresses the “survival anxiety” of content creators. Users are shifting from “actively clicking on web pages” to “relying on AI conversations” to obtain information, which leads directly to sharp declines in website traffic and advertising revenue. Data shows that after Google launched its AI search mode, the click-through rate of traditional URL links fell by 30%, which is close to a “catastrophe” for small and medium-sized websites that depend on advertising. The earlier “free scraping” by AI companies essentially turned the work of content providers into “training fuel” for their own models without reasonable compensation. Cloudflare’s anti-AI-crawler technologies (such as the “5-second shield” and the “AI nonsense maze”) raise the cost of scraping by technical means. The former distinguishes humans from machines through multi-dimensional verification; the latter wastes the crawler’s computing power and bandwidth on fake pages, making the formerly “cost-free” scraping of small and medium-sized websites unprofitable for AI companies. This gives content creators a “technological moat” that helps them safeguard their core assets and avoid the dilemma of “working for others”.

Second, the “pay-per-use scraping” model offers a standardized path for cooperation between AI companies and content providers. Until now, AI companies have obtained content along an obvious “80/20 divide”: they paid high fees to top-tier sources (such as Springer and Reddit) and relied on technical means to “freely use” the content of the many small and medium-sized websites. The weakness of this arrangement is that small and medium-sized websites are so numerous that AI companies have neither the budget nor the motivation to negotiate with each one, while individual content providers struggle to defend their rights alone. Cloudflare’s solution aggregates the content of small and medium-sized websites on one platform and “packages” it into a priced, tradable resource. This lowers the negotiation cost for AI companies (no need to negotiate with each website separately) and gives content providers a stable income source (payment per use). The combination of “technical defense + payment channel” essentially builds a “content trading market” in which AI companies pay a clearly stated price for the content they use and content providers earn reasonable returns through the market mechanism, ultimately achieving a win-win in which “AI has data to use and content providers have income to enjoy”.

Finally, Cloudflare’s move is in line with the Internet’s core spirit of “co-construction and sharing”. The prosperity of the Internet depends on the continuous output of content creators. If content providers cut back on high-quality content because it earns them nothing, the entire ecosystem will ultimately suffer. By empowering content providers with technology, Cloudflare is in effect maintaining the Internet’s positive cycle of “creation, dissemination, income”. As its CEO said, “The goal is to return control to creators while helping AI companies innovate.” This balanced approach avoids a “black-and-white” opposition and points to a more sustainable development path for the industry.

Negative Review: Technological Confrontation May Escalate, and the Pay-per-Use Model Needs Verification. Cloudflare’s Solution Faces Multiple Challenges

Although Cloudflare’s solution is regarded as a “savior-like” innovation, its real-world effect remains uncertain. From the technological contest to commercial feasibility, and from the industry ecosystem to user experience, the chain reactions it may trigger deserve close attention.

First, the technological confrontation may fall into a vicious cycle of “as virtue rises one foot, vice rises ten”, in which every defense is met by a stronger countermeasure. The struggle between AI crawlers and anti-crawler measures is essentially a “technological arms race”. Cloudflare’s “AI nonsense maze” consumes crawler resources with fake pages. However, if AI companies develop smarter “anti-entrapment” techniques (such as recognizing the characteristics of fake content and skipping invalid links), the current defenses may quickly become ineffective. For example, large AI models already have strong content-understanding capabilities and may in the future reliably identify “meaningless content” through semantic analysis, thereby bypassing the maze. AI companies may also use techniques such as “distributed crawlers” and “dynamic IP pools” to evade Cloudflare’s detection, or even buy or rent the server resources of Cloudflare’s own customers for scraping. An escalating confrontation will not only drive up costs on both sides (AI companies will invest more computing power, and content providers will have to keep upgrading their defenses) but also make data flow across the Internet less efficient and hinder technological innovation.

Second, the commercial feasibility of the “pay-per-use scraping” model is questionable. Cloudflare’s payment plan has to solve two core problems: whether the pricing standard is reasonable, and whether revenue is shared fairly between content providers and the platform. So far, the reporting does not state the specific pricing logic (such as charging by page views, content quality, or data volume). If prices are too high, AI companies may give up scraping small and medium-sized websites and turn to other data sources (such as their own data or public databases); if prices are too low, content providers will not earn enough and may lose the motivation to participate. In addition, as an intermediary platform, how will Cloudflare keep revenue sharing transparent and avoid controversy over “excessive commission”? For example, if the platform charged a 30% service fee, content providers’ actual income could fall short of expectations and push them toward other service providers. More importantly, content value varies greatly among small and medium-sized websites. A professional industry analysis and an ordinary blog post are worth very different amounts for AI training. Achieving “differentiated pricing” will test Cloudflare’s technical and operational capabilities.
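
To make the commission concern concrete, the following is a minimal sketch of how a per-request price and a platform fee would split revenue. Every name and figure here (`CrawlPricing`, `settle`, 1 cent per request, 100,000 requests) is a hypothetical assumption for illustration; only the 30% fee echoes the example above, and none of it describes Cloudflare’s actual pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlPricing:
    """Hypothetical pricing terms; none of these figures come from Cloudflare."""
    price_per_request_cents: int  # what the AI company pays per crawled page
    platform_fee_percent: int     # share of gross revenue kept by the platform


def settle(requests: int, pricing: CrawlPricing) -> dict[str, int]:
    """Split gross crawl revenue (in cents) between platform and content provider."""
    gross = requests * pricing.price_per_request_cents
    # Integer arithmetic avoids floating-point rounding on money amounts.
    platform_cut = gross * pricing.platform_fee_percent // 100
    return {"gross": gross, "platform": platform_cut, "provider": gross - platform_cut}


# Assumed scenario: 100,000 paid crawl requests at 1 cent each, 30% platform fee.
terms = CrawlPricing(price_per_request_cents=1, platform_fee_percent=30)
print(settle(requests=100_000, pricing=terms))
```

Under these assumed terms, the provider keeps 70,000 cents (USD 700) of a USD 1,000 gross. A site whose content commands a different price per request would simply use different terms, which is one simple way to picture “differentiated pricing”.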

Third, excessive protection may damage the openness and public character of the Internet. The appeal of the Internet lies in the free flow and sharing of information. If large numbers of websites put up “paywalls” or high-intensity defenses through Cloudflare, large AI models may struggle to obtain diverse training data, which would hold back the broad, inclusive development of AI technology. For example, if small and medium-sized educational and popular-science websites cannot be scraped because of defensive measures, the application of AI in education and public knowledge dissemination may suffer. User experience may suffer as well. If AI cannot effectively scrape small and medium-sized websites, its answers may become less comprehensive and less accurate, and users may have to return to traditional search, which lowers efficiency.

Suggestions for Entrepreneurs: Make Good Use of Technological Tools, Balance Protection and Openness, and Explore New Paths for Content Monetization

For content entrepreneurs (such as owners of small and medium-sized websites, independent “self-media” creators, and vertical publishers), Cloudflare’s solution provides useful tools and ideas, but they need to apply them flexibly according to their own circumstances. Specific suggestions follow:

  1. Evaluate content value and choose appropriate protection strategies: Not all content needs “high-intensity defense”. Entrepreneurs should first identify where the core value of their content lies: in-depth, exclusive original analysis, time-sensitive news and information, or practical, tool-like guides. For high-value and highly distinctive content (such as industry reports and professional tutorials), strict anti-crawler functions can be enabled through Cloudflare, or a relatively high “pay-per-use” price can be set. For general-interest content with a public character (such as basic popular-science and common-knowledge articles), the defense intensity can be reduced, or free scraping can even be allowed, to widen the content’s reach and strengthen brand influence.
  2. Actively participate in the pay-per-use model and explore diversified monetization: Cloudflare’s “pay-per-use scraping” offers content providers a new income source, but entrepreneurs need to optimize their content structure to make it more attractive to AI companies. For example, content can be structured (by adding tags and classification metadata) so that AI companies can quickly identify its value, or updated regularly to keep the data current (AI training requires up-to-date data). In addition, a combined “content + service” monetization model can be explored alongside the existing business. For example, charge AI companies content usage fees while also offering them customized data annotation and cleaning services to diversify income.
  3. Keep an eye on technological trends and avoid over-relying on a single tool: The technological confrontation is a long-term process, and entrepreneurs need to keep track of both anti-crawler and AI scraping technologies. Besides Cloudflare, they can look at similar services from other vendors (such as Akamai and Fastly), or develop lightweight defenses of their own (such as rule-based request filtering and user behavior analysis), so that the failure of one platform’s technology does not expose their content. They can also join forces with peers in the same industry to form a “content alliance” and strengthen their bargaining power with AI companies through collective negotiation (such as jointly setting payment standards and sharing defense technologies).
  4. Balance protection and openness and maintain the value of the content ecosystem: The core goal of content entrepreneurs is to achieve sustainable development through content, not simply to “prevent AI scraping”. When using defense tools, a “one-size-fits-all” strategy should be avoided. For example, free or low-fee channels can be set up for non-commercial AI scraping, such as academic research and public-interest projects, to enhance the social value of the content. Cooperative AI companies can be offered a flexible “data authorization + traffic sharing” model (for example, the AI embeds the original website link in its answers to direct traffic to the site, and advertising revenue is shared). This open attitude can expand the content’s influence and bring long-term benefits to the creators themselves.
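
The tiered strategy in suggestion 1 and the “rule-based request filtering” mentioned in suggestion 3 can be sketched together as follows. This is an illustrative sketch under stated assumptions, not Cloudflare’s implementation: the `Tier` categories, the `decide` function, and the three outcomes are hypothetical, and the user-agent tokens are a partial list of publicly documented AI crawlers. A crawler can spoof its user-agent string, so a filter like this only catches crawlers that identify themselves honestly.

```python
from enum import Enum

# User-agent substrings of some publicly documented AI crawlers.
# Illustrative and incomplete; the User-Agent header can be spoofed.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider", "PerplexityBot")


class Tier(Enum):
    """Hypothetical content tiers chosen by the site owner."""
    PUBLIC = "public"      # general-interest content: anyone may crawl
    STANDARD = "standard"  # ordinary content: AI crawlers are asked to pay
    PREMIUM = "premium"    # exclusive reports and tutorials: AI crawlers are blocked


def is_ai_crawler(user_agent: str) -> bool:
    """Case-insensitive substring match against the known crawler tokens."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_CRAWLER_TOKENS)


def decide(user_agent: str, tier: Tier) -> str:
    """Return 'allow', 'charge', or 'block' for a single request."""
    if not is_ai_crawler(user_agent) or tier is Tier.PUBLIC:
        return "allow"
    return "block" if tier is Tier.PREMIUM else "charge"


# Example: the same self-identified AI crawler gets a different outcome per tier.
crawler_ua = "Mozilla/5.0 (compatible; GPTBot/1.0)"
for tier in Tier:
    print(tier.value, "->", decide(crawler_ua, tier))
```

Keeping the tier decision separate from crawler detection means a site could replace the user-agent check with behavior analysis or a vendor’s bot score without touching its content policy.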

Conclusion

Cloudflare’s solution marks an important turning point in the contest between AI companies and content providers, but its effectiveness still needs time to be verified. For entrepreneurs, the key is to use technological tools well to protect core assets while exploring cooperation with AI companies with an open mind, finding a balance between “protection” and “monetization” and ultimately maximizing the value of their content.
