The Prague Post - Inner workings of AI an enigma - even to its creators

Inner workings of AI an enigma - even to its creators
Inner workings of AI an enigma - even to its creators / Photo: Kirill KUDRYAVTSEV - AFP

Even the greatest human minds building generative artificial intelligence that is poised to change the world admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting that neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves not just studying the results served up by gen AI but also scrutinizing the calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."
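
As a rough illustration of what having "every neuron instrumented" can mean in practice, the sketch below runs a tiny, made-up neural network and records every intermediate value along the way. The network, its weights and the function name forward_with_trace are hypothetical and chosen only for illustration; they are not taken from any model or tool mentioned in this article, and real systems have billions of such values rather than a handful.

```python
# Hypothetical toy network: 2 inputs -> 3 hidden neurons -> 1 output.
# All weights are made-up illustration values, not from any real model.
W_HIDDEN = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.5]]
B_HIDDEN = [0.0, -0.5, 0.25]
W_OUT = [1.0, -2.0, 0.5]
B_OUT = 0.1


def relu(x):
    """Standard rectifier: negative values become zero."""
    return max(0.0, x)


def forward_with_trace(inputs):
    """Run the toy network and record every intermediate value."""
    trace = {"inputs": list(inputs)}

    # Weighted sum plus bias for each hidden neuron, before activation.
    pre = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(W_HIDDEN, B_HIDDEN)
    ]
    trace["hidden_pre"] = pre

    # Activation of each hidden neuron.
    hidden = [relu(v) for v in pre]
    trace["hidden"] = hidden

    # Single output neuron (no activation).
    out = sum(w * h for w, h in zip(W_OUT, hidden)) + B_OUT
    trace["output"] = out
    return out, trace


output, trace = forward_with_trace([1.0, 2.0])
for name, values in trace.items():
    print(name, values)
```

Every number the network computes ends up in the trace, which is the easy part. The open problem the researchers describe is working out what those numbers mean.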

Harnessing the inner workings of gen AI minds could clear the way for its adoption in areas where tiny errors can have dramatic consequences, like national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also catapult human discoveries, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that none of the grandmasters had ever thought about.

A properly understood gen AI model, carrying a stamp of reliability, would gain a competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

X.Vanek--TPP