The Prague Post - As AI data scrapers sap websites' revenues, some fight back

Photo: PATRICIA DE MELO MOREIRA - AFP

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information for their chatbots with web crawlers, without humans ever needing to visit the original sites.

Traditional content producers, such as media outlets, are being outpaced by AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

Cloudflare, which processes more than 20 percent of all internet traffic, announced this summer a new measure aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

Z.Marek--TPP