午夜视频在线网站,日韩视频精品在线,中文字幕精品一区二区三区在线,在线播放精品,1024你懂我懂的旧版人,欧美日韩一级黄色片,一区二区三区在线观看视频


Breaking: AI pioneer Hinton wins the 2024 Nobel Prize in Physics, becoming the first scientist in history to receive both the Turing Award and a Nobel Prize (full text of the Nobel announcement attached)

 ha888cz 2024-10-10

John J. Hopfield and Geoffrey E. Hinton have won the 2024 Nobel Prize in Physics for foundational contributions to modern machine learning through artificial neural networks!


As the 2024 Nobel laureates in physics, John Hopfield solved the problem of memory and data storage in artificial neural networks, while Geoffrey Hinton, building on the Hopfield network, solved the problem of how artificial neural networks can learn on their own and recognise features in data.

Because there is no Nobel Prize in mathematics, let alone one in computer science, the physics prize, one of the highest honours in science, was awarded to a contribution with the power to change humanity's future: AI built on artificial neural networks.

Geoffrey Hinton thus becomes the first scientist in history to have won both the Turing Award (2018) and a Nobel Prize (2024).

About Hinton

The name Hinton may be unfamiliar to you.

He is the doyen of deep learning.

If you know even a little about AI, you have surely heard of OpenAI, the company now at the height of its influence.

OpenAI's former chief scientist, Ilya Sutskever, is Hinton's direct student and intellectual heir; Ilya carried Hinton's ideas forward.

Ilya is a prodigy: after spending a summer frying French fries for two months, he walked into Hinton's office at the University of Toronto and asked to become his student. For the story of the two, see the QbitAI (量子位) article "Hinton 揭秘 Ilya 成長歷程:Scaling Law 是他學(xué)生時代就有的直覺" (Hinton on Ilya's journey: Scaling Law was an intuition he already had as a student).

Below is Hinton's academic family tree, a veritable who's who of AI:

[Image: Hinton's academic lineage in AI]

Geoffrey Hinton is a scientist of major influence in artificial intelligence, best known for his contributions to neural networks and machine learning. Born on 6 December 1947 in Wimbledon, England, he is a professor emeritus at the University of Toronto.

Hinton has a deep academic background in AI: he received his PhD in artificial intelligence from the University of Edinburgh in 1978. His research centres on neural networks, machine learning, classification and supervised learning, and he is one of the originators of the backpropagation algorithm. He has also proposed the Forward-Forward algorithm, a new deep-learning procedure intended as an alternative to training with traditional backpropagation.

For his outstanding contributions to AI, Hinton has received several major awards. In 2018 he shared the Turing Award, one of the highest honours in computer science, with Yoshua Bengio and Yann LeCun. In 2024 he was awarded the Nobel Prize in Physics together with John J. Hopfield, for foundational discoveries and inventions that enable machine learning with artificial neural networks.

Beyond his academic achievements, Hinton has devoted himself to teaching and popularising AI. He created a public course, Neural Networks for Machine Learning, taught on Coursera, covering applications of neural networks in speech recognition, object recognition, image segmentation and language modelling.

Geoffrey Hinton is a central figure in artificial intelligence; his work has not only driven the development of machine learning and deep learning, but has also provided valuable educational resources for researchers and students in related fields.

Introduction from the Nobel Prize website

Hinton's most-cited paper:

https://www.ademy/en/paper/reading?corpusId=784288

Link to the introduction on the Nobel Prize website:

https://www.ademy/zh/paper/reading?corpusId=195908774

Translation:

The Nobel Prize in Physics 2024

This year's laureates used tools from physics to construct methods that laid the foundation for today's powerful machine learning. John Hopfield created a structure that can store and reconstruct information. Geoffrey Hinton invented a method that can independently discover properties in data, an invention that has become one of the key technologies behind today's large artificial neural networks.

They used physics to find patterns in information

© Johan Jarnestad/The Royal Swedish Academy of Sciences

Many people have seen how computers can translate between languages, interpret images and even hold reasonable conversations. What is less well known is that this kind of technology has long been important in research, especially for sorting and analysing vast amounts of data. Over the past fifteen to twenty years, the rapid development of machine learning has relied on a structure called the artificial neural network. Today, when we talk about artificial intelligence, this is usually the technology we mean.

Although computers cannot think, machines can now mimic human functions such as memory and learning. This year's laureates in physics made this possible by using fundamental concepts and methods from physics to develop technologies that process information with network structures.

Machine learning differs from traditional software, which works like a recipe: data goes in, it is processed according to explicit steps, and a result comes out, much as a cake is produced by following a recipe. In machine learning, by contrast, the computer learns from examples, which lets it tackle problems too vague and complicated to be solved by step-by-step instructions. One example is asking a computer to interpret a picture and identify the objects in it.

Mimics the brain

An artificial neural network processes information using the entire network structure. The inspiration originally came from the desire to understand how the brain works. As early as the 1940s, researchers began to explore the mathematics underlying the brain's network of neurons and synapses. Another key piece came from psychology: the neuroscientist Donald Hebb proposed a hypothesis about how learning occurs, namely that the connections between neurons are strengthened when they work together.

As these ideas developed, scientists began trying to recreate how the brain's network functions by building artificial neural networks as computer simulations. In these networks, neurons are mimicked by nodes that are given different values, while synapses are represented by connections between the nodes, connections that can grow stronger or weaker as "training" proceeds. Donald Hebb's hypothesis remains one of the basic rules used to train artificial neural networks.
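
Written as an update rule, in notation added here rather than taken from the article, Hebb's hypothesis says that the connection between two nodes is strengthened in proportion to how strongly the two are active together, with a small learning-rate constant eta:

\Delta w_{ij} = \eta \, x_i \, x_j

where x_i and x_j are the activities of nodes i and j.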


© Johan Jarnestad/The Royal Swedish Academy of Sciences

By the end of the 1960s, some discouraging theoretical results led many researchers to suspect that neural networks might never be of any real use. In the 1980s, however, several important ideas revived interest in artificial neural networks, among them the work of this year's laureates.

Associative memory

Imagine you are trying to recall a word you rarely use, say the one for the sloping floor often found in cinemas and lecture halls. You search your memory: it is something like ramp... or perhaps rad...ial? No, not that. Ah, that's it: rake!

This process of searching among similar words for the right one resembles the associative memory that the physicist John Hopfield discovered in 1982. A Hopfield network can store different patterns and has a method for retrieving them. When the network is given an incomplete or slightly distorted pattern, the method finds the stored pattern that is most similar.

Hopfield had previously used his background in physics to study theoretical problems in molecular biology. When he was invited to a neuroscience meeting, he encountered research on the structure of the brain. It fascinated him and set him thinking about the dynamics of simple neural networks. When neurons act together they can give rise to new, powerful characteristics that cannot be discovered by studying the network's individual components in isolation.

In 1980 Hopfield left Princeton University, as his research interests had taken him beyond the usual boundaries of physics. He accepted a professorship in chemistry and biology at Caltech (the California Institute of Technology) in Pasadena, southern California, where he had free access to computing resources for experimentation and for developing his ideas about neural networks.

He did not abandon his roots in physics, however. He drew much inspiration from it, especially from theories of how many small components working together can give rise to new phenomena. He was particularly inspired by magnetic materials, whose special properties arise from atomic spin, a property that makes each atom a tiny magnet. The spins of neighbouring atoms influence one another and can form domains with spins pointing in the same direction. Using the physics that describes how spins interact, he built a model network of nodes and connections.

The network saves images in a landscape

The network Hopfield created consists of nodes that are all joined by connections of different strengths. Each node can store a value; in Hopfield's first work this was 0 or 1, like the pixels of a black-and-white image.

Hopfield described the overall state of the network with a property equivalent to the energy of a spin system in physics: the network's energy is calculated with a formula that uses the values of all the nodes and the strengths of all the connections between them.
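
For reference, the standard form of this energy in the machine-learning literature, with s_i the value of node i and w_ij the strength of the connection between nodes i and j (notation added here, not the article's), is

E = -\tfrac{1}{2} \sum_{i \ne j} w_{ij} \, s_i s_j

Lower energy corresponds to the patterns the network has stored.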

A Hopfield network is programmed by feeding an image into the nodes, which are assigned the value black (0) or white (1). The energy formula is then used to adjust the network's connections so that the saved image ends up with low energy. When a new pattern is fed into the network, a rule goes through the nodes one by one and checks whether the network's energy would fall if that node's value were changed. If turning a black pixel white lowers the energy, the pixel changes colour. The procedure continues until no further improvement can be found, at which point the network has usually reproduced the original image on which it was trained.
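
A minimal sketch of this procedure in Python, under a few common assumptions not spelled out in the article: node values of +1/-1 rather than the black/white 0/1 described above, the Hebbian outer-product rule for setting the connection strengths, and illustrative names such as HopfieldNetwork and recall.

```python
import numpy as np

class HopfieldNetwork:
    """Minimal Hopfield network: Hebbian storage plus energy-lowering updates.

    Uses +1/-1 node values (a common convention); the article describes 0/1 pixels.
    """

    def __init__(self, n_nodes):
        self.n = n_nodes
        self.w = np.zeros((n_nodes, n_nodes))   # connection strengths

    def store(self, patterns):
        # Hebbian rule: strengthen w_ij when nodes i and j agree across patterns.
        for p in patterns:
            p = np.asarray(p, dtype=float)
            self.w += np.outer(p, p)
        np.fill_diagonal(self.w, 0)              # no self-connections

    def energy(self, s):
        return -0.5 * s @ self.w @ s

    def recall(self, state, max_sweeps=20):
        s = np.asarray(state, dtype=float).copy()
        for _ in range(max_sweeps):
            changed = False
            for i in np.random.permutation(self.n):
                # Value that minimises this node's contribution to the energy.
                new_value = 1.0 if self.w[i] @ s >= 0 else -1.0
                if new_value != s[i]:
                    s[i] = new_value
                    changed = True
            if not changed:                       # no further improvement possible
                break
        return s

# Usage: store a tiny "image" (+1 = white, -1 = black) and recover it from a noisy copy.
pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
net = HopfieldNetwork(len(pattern))
net.store([pattern])
noisy = pattern.copy()
noisy[0] = -noisy[0]                              # corrupt one "pixel"
print(net.recall(noisy))                          # usually reproduces the stored pattern
```

Each sweep of recall only keeps or lowers the network's energy, which is why the procedure stops once no single change improves anything further.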

Saving a single image this way may not seem remarkable; you might wonder why you would not simply store the image itself and compare it with any new image. What makes Hopfield's method special is that several pictures can be saved at the same time, and the network can usually tell them apart.

Hopfield likened searching the network for a saved state to rolling a ball across a landscape of peaks and valleys, with friction gradually slowing it down. Released at a given spot, the ball rolls into the nearest valley and stops there. In the same way, when the network receives an input close to one of the saved patterns, it keeps adjusting itself until it reaches a low point in the energy landscape, thereby finding the most similar pattern in its memory.

A Hopfield network can be used to reconstruct data that has been corrupted by noise or partially lost.


Illustration © Johan Jarnestad/The Royal Swedish Academy of Sciences

Hopfield and other researchers went on to develop the details of how the Hopfield network works, so that nodes can now store any value, not just 0 and 1. If you think of the nodes as pixels in an image, they can take on different colours rather than only black and white. The improved methods make it possible to save more pictures and to tell them apart even when they are very similar. As long as the information is built from many data points, the network can identify or reconstruct it.

Classification using nineteenth-century physics

Remembering an image is one thing; interpreting what it shows requires rather more.

Even very young children can point at different animals and confidently say whether it is a dog, a cat or a squirrel. They are wrong occasionally, but before long they are right almost every time. Children acquire this ability without diagrams or lessons about concepts such as species or mammals; after seeing a few examples of each kind of animal, the categories fall into place in their heads. People learn to recognise a cat, understand a word, or notice that something in a room has changed, simply by experiencing the environment around them.

When Hopfield published his work on associative memory, Geoffrey Hinton was working at Carnegie Mellon University in Pittsburgh, USA. He had studied experimental psychology and artificial intelligence in England and Scotland, and was wondering whether machines, like humans, could learn to process and interpret patterns by finding their own ways of categorising information. Together with his colleague Terrence Sejnowski, Hinton started from the Hopfield network and, combining it with ideas from statistical physics, developed a new model.

Statistical physics describes systems made up of many similar elements, such as the molecules in a gas. It is difficult or even impossible to track every molecule individually, but by considering them collectively one can derive the gas's overall properties, such as pressure and temperature. Gas molecules can spread through a volume at many different speeds and still produce the same overall properties.

With statistical physics we can analyse the states in which such a system of components can jointly exist and calculate the probability of each occurring. Some states are more probable than others, depending chiefly on the amount of available energy, which is described in an equation by the nineteenth-century physicist Ludwig Boltzmann. Hinton's network makes use of that equation, and the method was published in 1985 under the striking name of the Boltzmann machine.
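
In modern notation, Boltzmann's equation assigns each possible state x a probability that falls off exponentially with its energy E(x); T plays the role of a temperature and Z is the normalising sum over all states (symbols added here for clarity):

P(x) = \frac{e^{-E(x)/T}}{Z}, \qquad Z = \sum_{x'} e^{-E(x')/T}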

Recognising new examples of the same type

A Boltzmann machine is typically built from two different kinds of nodes. One group, the visible nodes, receives the input information; the remaining nodes form a hidden layer. The values of the hidden nodes and the connections between nodes also contribute to the energy of the network as a whole.

The machine runs by applying a rule that updates the nodes' values one at a time. Eventually the machine settles into a state in which the pattern of node values can keep changing while the properties of the network as a whole stay the same. Each possible pattern then has a specific probability, determined by the network's energy according to Boltzmann's equation. When the machine stops, it has generated a new pattern, which makes the Boltzmann machine an early example of a generative model.

Illustration of different network types


© Johan Jarnestad/The Royal Swedish Academy of Sciences

A Boltzmann machine learns from the examples it is given, not from instructions. It is trained by adjusting the values of the network's connections so that the example patterns fed to the visible nodes during training have the highest possible probability of occurring when the machine is run. If a pattern is repeated several times during training, its probability rises further. Training also affects the probability that the machine will generate new patterns resembling the examples.

A trained Boltzmann machine can recognise familiar traits in information it has never seen before. Just as you can tell at a glance that a friend's sibling must be a relative, a Boltzmann machine can recognise a brand-new example that belongs to one of the categories it was trained on and distinguish it from material that is dissimilar.

In its original form the Boltzmann machine is rather inefficient and takes a long time to find solutions. It becomes far more attractive once refined in various ways, refinements Hinton continued to explore. Later versions were thinned out by removing the connections between some units, and this turned out to make the machine more efficient.
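
A rough sketch of that thinned-out variant, commonly called a restricted Boltzmann machine, trained with the one-step contrastive-divergence shortcut Hinton later popularised. The layer sizes, learning rate and function names below are illustrative assumptions, not details taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden=16, lr=0.05, epochs=50):
    """Train a restricted Boltzmann machine with one step of contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    w = rng.normal(0, 0.01, size=(n_visible, n_hidden))   # visible-hidden connections only
    b_v = np.zeros(n_visible)                              # visible biases
    b_h = np.zeros(n_hidden)                               # hidden biases

    for _ in range(epochs):
        for v0 in data:
            # Positive phase: hidden activations driven by a training example.
            p_h0 = sigmoid(v0 @ w + b_h)
            h0 = (rng.random(n_hidden) < p_h0).astype(float)

            # Negative phase: one reconstruction step, the model's own "daydream".
            p_v1 = sigmoid(h0 @ w.T + b_v)
            v1 = (rng.random(n_visible) < p_v1).astype(float)
            p_h1 = sigmoid(v1 @ w + b_h)

            # Nudge the connections so the training patterns become more probable.
            w += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
            b_v += lr * (v0 - v1)
            b_h += lr * (p_h0 - p_h1)
    return w, b_v, b_h

# Usage: fit two binary "patterns" and probe how a new, noisy relative is encoded.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)
w, b_v, b_h = train_rbm(data)
probe = np.array([1, 1, 0, 0, 0, 0], dtype=float)          # resembles the first pattern
print(sigmoid(probe @ w + b_h))                             # hidden units respond to familiar traits
```

Because the visible-visible and hidden-hidden connections have been removed, each sampling step above is a single cheap matrix product, which is the kind of efficiency gain the paragraph above refers to.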

During the 1990s many researchers lost interest in artificial neural networks, but Hinton kept working in the field and helped bring about a new wave of breakthroughs. In 2006 he and his colleagues Simon Osindero, Yee Whye Teh and Ruslan Salakhutdinov developed a pretraining method for multilayer Boltzmann machines. The pretraining gave the network a better starting state, making its training for image recognition far more efficient.

Boltzmann machines are often used as part of a larger network. For example, they can be used to recommend films or television series based on a viewer's preferences.

Machine learning: today and tomorrow

The research that John Hopfield and Geoffrey Hinton have carried out since the 1980s laid the foundation for the machine-learning revolution that began around 2010.

The technological progress we are seeing today has been made possible by access to the vast amounts of data that can be used to train networks, together with an enormous increase in computing power. Modern artificial neural networks are huge and usually built from many layers; they are called deep neural networks, and the way they are trained is called deep learning.

Looking back at Hopfield's 1982 paper on associative memory puts this development in perspective. In it he used a network of 30 nodes. If all the nodes are connected to one another, there are 435 connections. Each node has its own value and each connection its own strength, so in total there are fewer than 500 parameters to keep track of. He also tried a 100-node network, but it was too complex for the computers of the time. By comparison, today's large language models are networks that can contain more than a trillion parameters (a million millions).
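
The connection count is simply the number of ways to pick a pair out of 30 nodes:

\binom{30}{2} = \frac{30 \times 29}{2} = 435

Adding the 30 node values gives 465 parameters, comfortably under 500; a fully connected 100-node network would already need \binom{100}{2} = 4950 connections.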

Many researchers are now exploring where machine learning can be applied; which areas will prove most fruitful remains to be seen. At the same time, the ethical questions surrounding this technology are being widely debated.

Physics has supplied tools for the advance of machine learning, and in return physics as a research field is itself benefiting from artificial neural networks. Machine learning has already been applied in areas recognised by earlier Nobel Prizes in Physics, such as sifting and processing the vast amounts of data needed to discover the Higgs particle. Other applications include reducing the noise in measurements of gravitational waves from colliding black holes, and searching for exoplanets.

In recent years, machine-learning techniques have also begun to be used to calculate and predict the properties of molecules and materials, for example calculating the structure of protein molecules, which determines their function, or working out which new materials might have the best properties for more efficient solar cells.

Further reading

If you would like to know more about this year's prizes, including the scientific background material in English, visit the website of the Royal Swedish Academy of Sciences, www.kva.se, or www.nobelprize.org, where you can watch videos of the press conferences, the Nobel Lectures and more. Information about exhibitions and activities related to the Nobel Prizes and the Prize in Economic Sciences is available at www.nobelprizemuseum.se.

The Royal Swedish Academy of Sciences has decided to award the 2024 Nobel Prize in Physics to:

JOHN J. HOPFIELD
Born 1933 in Chicago, Illinois, USA. PhD 1958 from Cornell University, New York, USA. Professor at Princeton University, USA.

GEOFFREY E. HINTON
Born 1947 in London, UK. PhD 1978 from the University of Edinburgh, UK. Professor at the University of Toronto, Canada.

"For foundational discoveries and inventions that enable machine learning with artificial neural networks."

Science editors: Ulf Danielsson, Olle Eriksson, Anders Irbäck and Ellen Moons, the Nobel Committee for Physics

Text: Anna Davour

Translation: Clare Barnes

Illustrations: Johan Jarnestad

Editor: Sara Gustavsson

Original announcement from the Nobel Prize website:

The Nobel Prize in Physics 2024

This year’s laureates used tools from physics to construct methods that helped lay the foundation for today’s powerful machine learning. John Hopfield created a structure that can store and reconstruct information. Geoffrey Hinton invented a method that can independently discover properties in data and which has become important for the large artificial neural networks now in use.

They used physics to find patterns in information

© Johan Jarnestad/The Royal Swedish Academy of Sciences

Many people have experienced how computers can translate between languages, interpret images and even conduct reasonable conversations. What is perhaps less well known is that this type of technology has long been important for research, including the sorting and analysis of vast amounts of data. The development of machine learning has exploded over the past fifteen to twenty years and utilises a structure called an artificial neural network. Nowadays, when we talk about artificial intelligence, this is often the type of technology we mean.

Although computers cannot think, machines can now mimic functions such as memory and learning. This year’s laureates in physics have helped make this possible. Using fundamental concepts and methods from physics, they have developed technologies that use structures in networks to process information.

Machine learning differs from traditional software, which works like a type of recipe. The software receives data, which is processed according to a clear description and produces the results, much like when someone collects ingredients and processes them by following a recipe, producing a cake. Instead of this, in machine learning the computer learns by example, enabling it to tackle problems that are too vague and complicated to be managed by step by step instructions. One example is interpreting a picture to identify the objects in it.

Mimics the brain

An artificial neural network processes information using the entire network structure. The inspiration initially came from the desire to understand how the brain works. In the 1940s, researchers had started to reason around the mathematics that underlies the brain’s network of neurons and synapses. Another piece of the puzzle came from psychology, thanks to neuroscientist Donald Hebb’s hypothesis about how learning occurs because connections between neurons are reinforced when they work together.

Later, these ideas were followed by attempts to recreate how the brain’s network functions by building artificial neural networks as computer simulations. In these, the brain’s neurons are mimicked by nodes that are given different values, and the synapses are represented by connections between the nodes that can be made stronger or weaker. Donald Hebb’s hypothesis is still used as one of the basic rules for updating artificial networks through a process called training.

© Johan Jarnestad/The Royal Swedish Academy of Sciences

At the end of the 1960s, some discouraging theoretical results caused many researchers to suspect that these neural networks would never be of any real use. However, interest in artificial neural networks was reawakened in the 1980s, when several important ideas made an impact, including work by this year’s laureates.

Associative memory

Imagine that you are trying to remember a fairly unusual word that you rarely use, such as one for that sloping floor often found in cinemas and lecture halls. You search your memory. It’s something like ramp… perhaps rad…ial? No, not that. Rake, that’s it!

This process of searching through similar words to find the right one is reminiscent of the associative memory that the physicist John Hopfield discovered in 1982. The Hopfield network can store patterns and has a method for recreating them. When the network is given an incomplete or slightly distorted pattern, the method can find the stored pattern that is most similar.

Hopfield had previously used his background in physics to explore theoretical problems in molecular biology. When he was invited to a meeting about neuroscience he encountered research into the structure of the brain. He was fascinated by what he learned and started to think about the dynamics of simple neural networks. When neurons act together, they can give rise to new and powerful characteristics that are not apparent to someone who only looks at the network’s separate components.

In 1980, Hopfield left his position at Princeton University, where his research interests had taken him outside the areas in which his colleagues in physics worked, and moved across the continent. He had accepted the offer of a professorship in chemistry and biology at Caltech (California Institute of Technology) in Pasadena, southern California. There, he had access to computer resources that he could use for free experimentation and to develop his ideas about neural networks.

However, he did not abandon his foundation in physics, where he found inspiration for his understanding of how systems with many small components that work together can give rise to new and interesting phenomena. He particularly benefitted from having learned about magnetic materials that have special characteristics thanks to their atomic spin – a property that makes each atom a tiny magnet. The spins of neighbouring atoms affect each other; this can allow domains to form with spin in the same direction. He was able to make a model network with nodes and connections by using the physics that describes how materials develop when spins influence each other.

The network saves images in a landscape

The network that Hopfield built has nodes that are all joined together via connections of different strengths. Each node can store an individual value – in Hopfield’s first work this could either be 0 or 1, like the pixels in a black and white picture.

Hopfield described the overall state of the network with a property that is equivalent to the energy in the spin system found in physics; the energy is calculated using a formula that uses all the values of the nodes and all the strengths of the connections between them. The Hopfield network is programmed by an image being fed to the nodes, which are given the value of black (0) or white (1). The network’s connections are then adjusted using the energy formula, so that the saved image gets low energy. When another pattern is fed into the network, there is a rule for going through the nodes one by one and checking whether the network has lower energy if the value of that node is changed. If it turns out that energy is reduced if a black pixel is white instead, it changes colour. This procedure continues until it is impossible to find any further improvements. When this point is reached, the network has often reproduced the original image on which it was trained.

This may not appear so remarkable if you only save one pattern. Perhaps you are wondering why you don’t just save the image itself and compare it to another image being tested, but Hopfield’s method is special because several pictures can be saved at the same time and the network can usually differentiate between them.

Hopfield likened searching the network for a saved state to rolling a ball through a landscape of peaks and valleys, with friction that slows its movement. If the ball is dropped in a particular location, it will roll into the nearest valley and stop there. If the network is given a pattern that is close to one of the saved patterns it will, in the same way, keep moving forward until it ends up at the bottom of a valley in the energy landscape, thus finding the closest pattern in its memory.

The Hopfield network can be used to recreate data that contains noise or which has been partially erased.

© Johan Jarnestad/The Royal Swedish Academy of Sciences

Hopfield and others have continued to develop the details of how the Hopfield network functions, including nodes that can store any value, not just zero or one. If you think about nodes as pixels in a picture, they can have different colours, not just black or white. Improved methods have made it possible to save more pictures and to differentiate between them even when they are quite similar. It is just as possible to identify or reconstruct any information at all, provided it is built from many data points.

Classification using nineteenth-century physics

Remembering an image is one thing, but interpreting what it depicts requires a little more.

Even very young children can point at different animals and confidently say whether it is a dog, a cat, or a squirrel. They might get it wrong occasionally, but fairly soon they are correct almost all the time. A child can learn this even without seeing any diagrams or explanations of concepts such as species or mammal. After encountering a few examples of each type of animal, the different categories fall into place in the child’s head. People learn to recognise a cat, or understand a word, or enter a room and notice that something has changed, by experiencing the environment around them.

When Hopfield published his article on associative memory, Geoffrey Hinton was working at Carnegie Mellon University in Pittsburgh, USA. He had previously studied experimental psychology and artificial intelligence in England and Scotland and was wondering whether machines could learn to process patterns in a similar way to humans, finding their own categories for sorting and interpreting information. Along with his colleague, Terrence Sejnowski, Hinton started from the Hopfield network and expanded it to build something new, using ideas from statistical physics.

Statistical physics describes systems that are composed of many similar elements, such as molecules in a gas. It is difficult, or impossible, to track all the separate molecules in the gas, but it is possible to consider them collectively to determine the gas’ overarching properties like pressure or temperature. There are many potential ways for gas molecules to spread through its volume at individual speeds and still result in the same collective properties.

The states in which the individual components can jointly exist can be analysed using statistical physics, and the probability of them occurring calculated. Some states are more probable than others; this depends on the amount of available energy, which is described in an equation by the nineteenth-century physicist Ludwig Boltzmann. Hinton’s network utilised that equation, and the method was published in 1985 under the striking name of the Boltzmann machine.

Recognising new examples of the same type

The Boltzmann machine is commonly used with two different types of nodes. Information is fed to one group, which are called visible nodes. The other nodes form a hidden layer. The hidden nodes’ values and connections also contribute to the energy of the network as a whole.

The machine is run by applying a rule for updating the values of the nodes one at a time. Eventually the machine will enter a state in which the nodes’ pattern can change, but the properties of the network as a whole remain the same. Each possible pattern will then have a specific probability that is determined by the network’s energy according to Boltzmann’s equation. When the machine stops it has created a new pattern, which makes the Boltzmann machine an early example of a generative model.

© Johan Jarnestad/The Royal Swedish Academy of Sciences

The Boltzmann machine can learn – not from instructions, but from being given examples. It is trained by updating the values in the network’s connections so that the example patterns, which were fed to the visible nodes when it was trained, have the highest possible probability of occurring when the machine is run. If the same pattern is repeated several times during this training, the probability for this pattern is even higher. Training also affects the probability of outputting new patterns that resemble the examples on which the machine was trained.

A trained Boltzmann machine can recognise familiar traits in information it has not previously seen. Imagine meeting a friend’s sibling, and you can immediately see that they must be related. In a similar way, the Boltzmann machine can recognise an entirely new example if it belongs to a category found in the training material, and differentiate it from material that is dissimilar.

In its original form, the Boltzmann machine is fairly inefficient and takes a long time to find solutions. Things become more interesting when it is developed in various ways, which Hinton has continued to explore. Later versions have been thinned out, as the connections between some of the units have been removed. It turns out that this may make the machine more efficient.

During the 1990s, many researchers lost interest in artificial neural networks, but Hinton was one of those who continued to work in the field. He also helped start the new explosion of exciting results; in 2006 he and his colleagues Simon Osindero, Yee Whye Teh and Ruslan Salakhutdinov developed a method for pretraining a network with a series of Boltzmann machines in layers, one on top of the other. This pretraining gave the connections in the network a better starting point, which optimised its training to recognise elements in pictures.

The Boltzmann machine is often used as part of a larger network. For example, it can be used to recommend films or television series based on the viewer’s preferences.

Machine learning – today and tomorrow

Thanks to their work from the 1980s and onward, John Hopfield and Geoffrey Hinton have helped lay the foundation for the machine learning revolution that started around 2010.

The development we are now witnessing has been made possible through access to the vast amounts of data that can be used to train networks, and through the enormous increase in computing power. Today’s artificial neural networks are often enormous and constructed from many layers. These are called deep neural networks and the way they are trained is called deep learning.

A quick glance at Hopfield’s article on associative memory, from 1982, provides some perspective on this development. In it, he used a network with 30 nodes. If all the nodes are connected to each other, there are 435 connections. The nodes have their values, the connections have different strengths and, in total, there are fewer than 500 parameters to keep track of. He also tried a network with 100 nodes, but this was too complicated, given the computer he was using at the time. We can compare this to the large language models of today, which are built as networks that can contain more than one trillion parameters (one million millions).

Many researchers are now developing machine learning’s areas of application. Which will be the most viable remains to be seen, while there is also wide-ranging discussion on the ethical issues that surround the development and use of this technology.

Because physics has contributed tools for the development of machine learning, it is interesting to see how physics, as a research field, is also benefitting from artificial neural networks. Machine learning has long been used in areas we may be familiar with from previous Nobel Prizes in Physics. These include the use of machine learning to sift through and process the vast amounts of data necessary to discover the Higgs particle. Other applications include reducing noise in measurements of the gravitational waves from colliding black holes, or the search for exoplanets.

In recent years, this technology has also begun to be used when calculating and predicting the properties of molecules and materials – such as calculating protein molecules’ structure, which determines their function, or working out which new versions of a material may have the best properties for use in more efficient solar cells.


Further reading

Additional information on this year’s prizes, including a scientific background in English, is available on the website of the Royal Swedish Academy of Sciences, www.kva.se, and at www.nobelprize.org, where you can watch video from the press conferences, the Nobel Lectures and more. Information on exhibitions and activities related to the Nobel Prizes and the Prize in Economic Sciences is available at www.nobelprizemuseum.se.


The Royal Swedish Academy of Sciences has decided to award the Nobel Prize in Physics 2024 to

JOHN J. HOPFIELD
Born 1933 in Chicago, IL, USA. PhD 1958 from Cornell University, Ithaca, NY, USA. Professor at Princeton University, NJ, USA.

GEOFFREY E. HINTON
Born 1947 in London, UK. PhD 1978 from The University of Edinburgh, UK. Professor at University of Toronto, Canada.

“for foundational discoveries and inventions that enable machine learning with artificial neural networks”


Science Editors: Ulf Danielsson, Olle Eriksson, Anders Irbäck, and Ellen Moons, the Nobel Committee for Physics
Text: Anna Davour
Translator: Clare Barnes
Illustrations: Johan Jarnestad
Editor: Sara Gustavsson
© The Royal Swedish Academy of Sciences
