The Prague Post - Inner workings of AI an enigma - even to its creators

Photo: Kirill KUDRYAVTSEV - AFP

Even the greatest human minds building generative artificial intelligence that is poised to change the world admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast, Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves studying not just results served up by gen AI but scrutinizing calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."

Harnessing the inner workings of gen AI could clear the way for its adoption in areas where tiny errors can have dramatic consequences, such as national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also propel human discovery, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that no grandmaster had ever thought of.

A gen AI model that is properly understood and carries a stamp of reliability would gain a competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

X.Vanek--TPP