The Prague Post - As AI data scrapers sap websites' revenues, some fight back

Photo: PATRICIA DE MELO MOREIRA - AFP

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information for their chatbots with web crawlers, without humans ever needing to visit the original sites.

Traditional content producers, such as media outlets, are being outpaced by AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

This summer Cloudflare, which processes more than 20 percent of all internet traffic, announced a new measure aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

Z.Marek--TPP