Cloudflare CEO Matthew Prince says Google sees more than three times the web pages OpenAI does. This data advantage could reshape competition in artificial intelligence.
Google is seeing far more of the internet
Google may already have a huge edge in the artificial intelligence race, and it has little to do with chips or talent. According to Cloudflare CEO Matthew Prince, Google crawls more than three times as many web pages as OpenAI.
Prince shared these numbers during a recent appearance on the TBPN podcast, citing data from Cloudflare's global network. His claim was direct: for every page OpenAI's crawler sees, Google's crawler sees about 3.2.
How big is the data gap between AI players?
The gap does not stop with OpenAI. Prince explained that Google also outpaces other major AI developers by a wide margin.
| Company | Relative web access |
|---|---|
| Google | 3.2x OpenAI, 4.8x Microsoft |
| OpenAI | Baseline comparison |
| Microsoft and Anthropic | Similar, far behind Google |
Cloudflare's Year in Review report supports this view. During October and November 2025, Googlebot reached 11.6 percent of unique web pages, while OpenAI's GPTBot reached just 3.6 percent.
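The two percentages above line up with the 3.2x figure Prince quoted. A quick check of the arithmetic, assuming both percentages are measured against the same set of unique pages (the variable names are illustrative only):

```python
# Crawl-reach figures quoted above from Cloudflare's Year in Review
# (October and November 2025), in percent of unique web pages.
googlebot_reach_pct = 11.6
gptbot_reach_pct = 3.6

# Ratio of pages reached by Googlebot to pages reached by GPTBot.
ratio = googlebot_reach_pct / gptbot_reach_pct

print(f"Googlebot reached about {ratio:.1f}x as many unique pages as GPTBot")
```

This only compares the two published percentages; it says nothing about how Cloudflare sampled the pages.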
Why does Google get special access?
Prince says the answer lies in Google's long dominance in search. Over many years, website owners have treated Google differently from other crawlers.
"Everyone has let them behind their paywall. Everyone has let them see parts of the internet that no one else sees," Prince said.
This special access is often expressed through robots.txt files, which tell each crawler which parts of a site it may request. Many publishers allow Googlebot to crawl premium or restricted content so their pages can rank in search results.
The problem is that Googlebot now serves two purposes:
- Indexing pages for Google Search
- Collecting data that can be used for AI training
If publishers block Googlebot to protect their content from AI training, they also risk disappearing from Google Search. That tradeoff does not exist in the same way for other AI crawlers.
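The tradeoff can be sketched with Python's standard `urllib.robotparser` module. The two robots.txt policies and the example.com URL below are hypothetical, written only to illustrate the point; real publishers' rules vary. Because robots.txt rules are keyed to a crawler's user-agent name rather than to what the crawler does with the page, a rule that blocks Googlebot applies no matter why Googlebot is fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: welcome Googlebot, turn away OpenAI's GPTBot.
ALLOW_GOOGLE_ONLY = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /
"""

# Hypothetical policy: block Googlebot as well, to keep content away from it.
BLOCK_BOTH = """\
User-agent: Googlebot
Disallow: /

User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


url = "https://example.com/premium/article"  # placeholder URL

# Under the first policy only Googlebot gets in.
print(is_allowed(ALLOW_GOOGLE_ONLY, "Googlebot", url))  # True
print(is_allowed(ALLOW_GOOGLE_ONLY, "GPTBot", url))     # False

# Under the second policy Googlebot is shut out entirely, which is the
# step that puts the site's search visibility at risk.
print(is_allowed(BLOCK_BOTH, "Googlebot", url))         # False
```

Note that robots.txt is a request, not an enforcement mechanism: it only works when the crawler chooses to honor it.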
Does data matter more than chips in AI?
Prince believes it does. In his view, access to data is becoming the most important factor in building powerful AI systems.
"Whoever has the most data wins in the era of AI," he said. This challenges the popular idea that success in AI depends mainly on advanced GPUs or large engineering teams.
Without comparable data, even well-funded AI companies may struggle to compete with Google over the long term.
Calls for regulation and fair access
Prince suggested that regulators may need to step in if competition is to remain healthy. He outlined two possible paths:
- Limit Google's ability to use its search dominance for AI training
- Require equal data access for competing AI developers
Both options would represent a major shift in how the internet and AI ecosystems work today.
Gemini vs ChatGPT: the traffic shift
The debate comes as Google's Gemini AI assistant gains ground. According to Similarweb data from January, Gemini now accounts for 21.5 percent of generative AI website traffic.
At the same time, ChatGPT's share has dropped from 86.7 percent a year ago to 64.5 percent. While ChatGPT remains the leader, the trend suggests Google's momentum is real.
What this means for the future of AI
Prince's comments add fuel to growing concerns about infrastructure advantages in AI development. If one company can see much more of the web than everyone else, innovation may slow and competition could shrink.
Whether regulators act or not, one thing is clear: data access is becoming a defining issue in the next phase of the AI race.
Frequently Asked Questions
Why does Google crawl more web pages than OpenAI?
Google benefits from decades of search dominance. Many websites allow Googlebot deeper access so they can rank well in search results.
Can websites block Google from using content for AI?
They can, but doing so may hurt their visibility in Google Search, which makes this a difficult choice for publishers.
Is more data always better for AI models?
More data generally helps, but quality, diversity, and responsible use also matter. Still, large-scale access provides a strong advantage.

