The Prague Post - As AI data scrapers sap websites' revenues, some fight back

As AI data scrapers sap websites' revenues, some fight back

Photo: PATRICIA DE MELO MOREIRA - AFP

A swarm of AI "crawlers" is running rampant on the internet, scouring billions of websites for data to feed algorithms at leading tech companies -- all without permission or payment, upending the online economy.

Before the rise of AI chatbots, websites allowed search engines to access their content in return for increased visibility, a system that rewarded them with traffic and advertising revenues.

But the rapid development of generative AI has allowed tech giants like Google and OpenAI to harvest information for their chatbots with web crawlers, without humans ever needing to visit the original sites.

Traditional content producers, such as media outlets, are losing ground to AI crawlers, which have cut into their online operations and advertising revenues.

"Sites that gave bots access to their content used to get readers in exchange," said Kurt Muehmel, head of AI strategy at data management firm Dataiku.

But the arrival of generative AI "completely breaks" that model, he told AFP.

Wikipedia's human internet traffic fell by eight percent between 2024 and 2025 because of a rise in AI search engine summaries, the online encyclopaedia reported last month.

"The fundamental tension is that the new business of the internet that is AI-driven doesn't generate traffic," said Matthew Prince, CEO of Cloudflare, an American internet services provider.

- 'No trespassing' -

Cloudflare, which processes more than 20 percent of all internet traffic, announced this summer a new measure aimed at blocking AI crawlers from accessing content without payment or permission from website owners.

"It's basically like putting a speed limit sign or a no trespassing sign," Prince told AFP on the sidelines of the Web Summit in Lisbon.

"Badly behaving bots can get by that, but we can track that... Over time, we can tighten these controls in a way that we're confident the AI companies can't get through."

The measure, which applies to more than 10 million websites, has already "attracted the attention of artificial intelligence giants", he added.

On a smaller scale, American startup TollBit is providing online news publishers with tools to block, monitor and monetise AI crawler traffic.

"The internet is a highway," said CEO and co-founder Toshit Panigrahi, who described the company as a "tollbooth on the internet".

TollBit works with more than 5,600 sites, including USA Today, Time magazine and the Associated Press, allowing media outlets to set their own access fees for their content.

The analytics are free for publishers, but AI companies are charged a "transaction fee for every piece of content they access".

But for Muehmel, the online takeover by AI crawlers cannot be resolved with only "partial measures or by an individual company".

"This is an evolution of the entire internet economy, which will take years," he said.

If the bot swarm continues to roam freely online, "all of the incentives for content creation are going to go away," Prince said.

"That would be a loss, not just for us humans that want to consume it, but actually for the AI companies that need original content in order to train their systems."

Z.Marek--TPP