ChatGPT 是網路上的一個模糊的 JPEG 文件

fox hsiao
14 min read · Feb 10, 2023

原文刊載於 The New Yorker,作者為華裔美國科幻小說作家姜峯楠(Ted Chiang)。他曾獲得四項星雲獎、四項雨果獎、約翰·W·坎貝爾最佳新作家獎、四項軌跡獎和其他獎項。他的短篇小說《你一生的故事》在2016年被改編成電影《異星入境》。本文由 AI 翻譯,並經些許編修。

AI 產生

OpenAI 的聊天機器人提供“轉述”,而 Google 則提供“引用”。我們更喜歡哪一個?

In 2013, workers at a German construction company noticed something odd about their Xerox photocopier: when they made a copy of the floor plan of a house, the copy differed from the original in a subtle but significant way. In the original floor plan, each of the house’s three rooms was accompanied by a rectangle specifying its area: the rooms were 14.13, 21.11, and 17.42 square metres, respectively. However, in the photocopy, all three rooms were labelled as being 14.13 square metres in size. The company contacted the computer scientist David Kriesel to investigate this seemingly inconceivable result. They needed a computer scientist because a modern Xerox photocopier doesn’t use the physical xerographic process popularized in the nineteen-sixties. Instead, it scans the document digitally, and then prints the resulting image file. Combine that with the fact that virtually every digital image file is compressed to save space, and a solution to the mystery begins to suggest itself.

2013年,一家德國建築公司的員工注意到他們的 Xerox 影印機有些奇怪的地方:當他們影印一所房子的平面圖時,影印件與原件有細微但重要的區別。在原始平面圖中,房屋的三個房間旁各有一個標明其面積的長方形標籤:三個房間的面積分別為14.13、21.11和17.42平方公尺。然而,在影印文件中,所有三個房間都被標註為14.13平方公尺。該公司聯繫電腦科學家 David Kriesel 來調查這個看似不可思議的結果。他們需要一位電腦科學家,因為現代的 Xerox 影印機並不使用1960年代流行的物理靜電印刷過程。相反,它以數位方式掃描文件,然後列印出產生的圖像文件。再加上幾乎每一個數位圖像文件都會被壓縮以節省空間這一事實,這個謎團的答案便開始浮現。

Compressing a file requires two steps: first, the encoding, during which the file is converted into a more compact format, and then the decoding, whereby the process is reversed. If the restored file is identical to the original, then the compression process is described as lossless: no information has been discarded. By contrast, if the restored file is only an approximation of the original, the compression is described as lossy: some information has been discarded and is now unrecoverable. Lossless compression is what’s typically used for text files and computer programs, because those are domains in which even a single incorrect character has the potential to be disastrous. Lossy compression is often used for photos, audio, and video in situations in which absolute accuracy isn’t essential. Most of the time, we don’t notice if a picture, song, or movie isn’t perfectly reproduced. The loss in fidelity becomes more perceptible only as files are squeezed very tightly. In those cases, we notice what are known as compression artifacts: the fuzziness of the smallest jpegs and mpegs, or the tinny sound of low-bit-rate mp3s.

壓縮文件需要兩個步驟:首先是編碼,在這個過程中,文件被轉換成更緊湊的格式;然後是解碼,將這個過程逆轉。如果還原的文件與原始文件完全相同,那麼這種壓縮過程被稱為無損的:沒有任何資訊被丟棄。相比之下,如果還原的文件只是原始文件的近似值,這種壓縮就被稱為有損的:一些資訊已被丟棄,且無法恢復。無損壓縮通常用於文字文件和電腦程式,因為在這些領域,即使是一個錯誤的字符也可能造成災難。有損壓縮通常用於照片、音檔和影片,在這些情況下,絕對的準確性並不重要。大多數時候,我們不會注意到一張照片、一首歌曲或一部電影沒有被完美再現。只有當文件被壓縮得非常緊時,擬真度的損失才會變得容易察覺。在這些情況下,我們會注意到所謂的壓縮失真:最小的 jpeg 和 mpeg 圖像的模糊,或低位元率 mp3 的尖銳聲音。
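
The lossless/lossy distinction above can be seen in a few lines of Python. The `lossy_encode` scheme here is a made-up toy for illustration, not a real codec:

```python
import zlib

# Lossless: zlib round-trips the input exactly; no information is discarded.
text = b"The rooms are 14.13, 21.11, and 17.42 square metres."
restored = zlib.decompress(zlib.compress(text))
assert restored == text

# Lossy (toy scheme): round every byte down to a multiple of 4.
# Decoding can only ever approximate the original; the low bits are gone.
def lossy_encode(data: bytes) -> bytes:
    return bytes(b - (b % 4) for b in data)

approx = lossy_encode(text)
assert approx != text   # an approximation, not a reproduction
```

The first assertion always holds, by design of lossless formats; the second shows that once a lossy encoder discards detail, no decoder can get it back.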

Xerox photocopiers use a lossy compression format known as jbig2, designed for use with black-and-white images. To save space, the copier identifies similar-looking regions in the image and stores a single copy for all of them; when the file is decompressed, it uses that copy repeatedly to reconstruct the image. It turned out that the photocopier had judged the labels specifying the area of the rooms to be similar enough that it needed to store only one of them — 14.13 — and it reused that one for all three rooms when printing the floor plan.

Xerox 影印機使用一種稱為 jbig2 的有損壓縮格式,專為黑白圖像設計。為了節省空間,影印機識別圖像中看起來相似的區域,並為所有這些區域存儲一個副本;當文件被解壓縮時,它重複使用該副本來重建圖像。事實證明,影印機判斷標明房間面積的標籤足夠相似,因此只需要存儲其中一個(14.13),並在列印平面圖時對所有三個房間重複使用這個標籤。
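
The symbol-reuse mechanism can be sketched as follows. This is a drastically simplified toy, nothing like the real jbig2 codec, and `similar` is an invented stand-in for its pattern matcher:

```python
# Toy symbol-dictionary compression: near-identical patches share one stored
# copy, so decoding can silently substitute one symbol for another.

def similar(a: str, b: str, max_diff: int = 1) -> bool:
    """Treat two equally sized patches as 'the same symbol' if they
    differ in at most max_diff positions."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= max_diff

def encode(patches: list[str]) -> tuple[list[str], list[int]]:
    dictionary: list[str] = []
    refs: list[int] = []
    for p in patches:
        for i, d in enumerate(dictionary):
            if similar(p, d):
                refs.append(i)              # reuse the stored symbol
                break
        else:
            dictionary.append(p)            # store a new symbol
            refs.append(len(dictionary) - 1)
    return dictionary, refs

def decode(dictionary: list[str], refs: list[int]) -> list[str]:
    return [dictionary[i] for i in refs]

labels = ["14.13", "21.11", "17.42", "14.18"]   # "14.18" is close to "14.13"
dictionary, refs = encode(labels)
print(decode(dictionary, refs))
```

Here "14.18" decodes as "14.13": the output is crisp and readable, but one label has quietly been replaced by another, exactly the failure mode the article describes.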

The fact that Xerox photocopiers use a lossy compression format instead of a lossless one isn’t, in itself, a problem. The problem is that the photocopiers were degrading the image in a subtle way, in which the compression artifacts weren’t immediately recognizable. If the photocopier simply produced blurry printouts, everyone would know that they weren’t accurate reproductions of the originals. What led to problems was the fact that the photocopier was producing numbers that were readable but incorrect; it made the copies seem accurate when they weren’t. (In 2014, Xerox released a patch to correct this issue.)

Xerox 影印機使用有損壓縮格式而不是無損壓縮格式,這件事本身並不是問題。問題在於,影印機以一種微妙的方式降低了圖像品質,使壓縮失真無法立即被識別出來。如果影印機只是產生模糊的列印文件,每個人都會知道它們不是原件的準確複製品。導致問題的是,影印機產生的數字是可讀的,但卻是錯誤的;它使影印文件看起來準確,實際上卻不準確。(2014年,Xerox 發布了一個修補程式來修正這個問題。)

I think that this incident with the Xerox photocopier is worth bearing in mind today, as we consider OpenAI’s ChatGPT and other similar programs, which A.I. researchers call large-language models. The resemblance between a photocopier and a large-language model might not be immediately apparent — but consider the following scenario. Imagine that you’re about to lose your access to the Internet forever. In preparation, you plan to create a compressed copy of all the text on the Web, so that you can store it on a private server. Unfortunately, your private server has only one per cent of the space needed; you can’t use a lossless compression algorithm if you want everything to fit. Instead, you write a lossy algorithm that identifies statistical regularities in the text and stores them in a specialized file format. Because you have virtually unlimited computational power to throw at this task, your algorithm can identify extraordinarily nuanced statistical regularities, and this allows you to achieve the desired compression ratio of a hundred to one.

我認為,在我們考慮 OpenAI 的 ChatGPT 和其他類似程式(人工智慧研究人員稱之為大型語言模型)時,Xerox 影印機的這起事件值得銘記。影印機和大型語言模型之間的相似性可能不會立即顯現,但請考慮以下場景。想像一下,你即將永遠失去對網路的存取。在準備過程中,你計劃為網路上的所有文字建立一個壓縮副本,這樣你就可以將其存儲在一台私人伺服器上。不幸的是,你的私人伺服器只有所需空間的百分之一;如果你希望所有的東西都能裝下,你就不能使用無損壓縮演算法。相反,你寫一個有損演算法,識別文字中的統計規律性,並將其存儲在一種專門的文件格式中。因為你有幾乎無限的計算能力來完成這項任務,你的演算法可以識別非常細微的統計規律性,這使你可以達到所需的一百比一的壓縮率。

Now, losing your Internet access isn’t quite so terrible; you’ve got all the information on the Web stored on your server. The only catch is that, because the text has been so highly compressed, you can’t look for information by searching for an exact quote; you’ll never get an exact match, because the words aren’t what’s being stored. To solve this problem, you create an interface that accepts queries in the form of questions and responds with answers that convey the gist of what you have on your server.


What I’ve described sounds a lot like ChatGPT, or most any other large-language model. Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.

我所描述的情況聽起來很像 ChatGPT,或者其他大多數大型語言模型。把 ChatGPT 想像成網路上所有文字的模糊 JPEG。它保留了網路上的大部分資訊,就像 JPEG 保留高解析度圖像的大部分資訊一樣,但是,如果你在尋找一個精確的位元序列,你不會找到它;你所得到的只是一個近似值。但是,由於這個近似值是以合乎語法的文字形式呈現,而 ChatGPT 擅長創建這種文字,所以它通常是可以接受的。你看到的仍然是一個模糊的 JPEG,但模糊發生的方式並沒有使整個圖片看起來不那麼清晰。

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large-language models such as ChatGPT are all too prone. These hallucinations are compression artifacts, but — like the incorrect labels generated by the Xerox photocopier — they are plausible enough that identifying them requires comparing them against the originals, which in this case means either the Web or our own knowledge of the world. When we think about them this way, such hallucinations are anything but surprising; if a compression algorithm is designed to reconstruct text after ninety-nine per cent of the original has been discarded, we should expect that significant portions of what it generates will be entirely fabricated.

這個有損壓縮的比喻,不僅僅是理解 ChatGPT 用不同詞語重新包裝網路資訊這項能力的一種方法。它也是理解“幻覺”的一種方式,即對事實問題的無意義回答,像 ChatGPT 這樣的大型語言模型很容易出現這種情況。這些幻覺是壓縮失真,但就像 Xerox 影印機產生的錯誤標籤一樣,它們足夠可信,以至於識別它們需要將它們與原件進行比較,在這種情況下,原件是指網路或我們自己對世界的了解。當我們這樣想的時候,這樣的幻覺一點也不令人驚訝;如果一個壓縮演算法被設計成在丟棄了99%的原文之後重建文字,我們就應該預期它所產生的內容中有很大一部分是完全捏造的。

This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation — that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them. (“When in the Course of human events, it becomes necessary for one to separate his garments from their mates, in order to maintain the cleanliness and order thereof. . . .”) ChatGPT is so good at this form of interpolation that people find it entertaining: they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

當我們記得有損壓縮演算法常用的一種技術是插值(即透過查看缺口兩側的內容來估計缺少的內容),這個類比就更有意義了。當圖像程式在顯示一張照片時,必須重建一個在壓縮過程中丟失的像素,它會查看附近的像素並計算平均值。這就是 ChatGPT 在被要求用《獨立宣言》的風格描述(比如說)在烘乾機裡丟了一隻襪子時所做的事情:它在“詞彙空間”中取兩個點,並生成佔據兩點之間位置的文字。(“在人類活動的過程中,人們有必要把他的衣服和它們的同伴分開,以保持其清潔和秩序。…”)ChatGPT 非常擅長這種形式的插值,以至於人們覺得它很有趣:他們發現了一個適用於段落而非照片的“模糊”工具,並且玩得不亦樂乎。
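
The neighbour-averaging step described above is, in miniature (real decoders interpolate in two dimensions, but the idea is the same):

```python
# A missing pixel is reconstructed as the average of its neighbours:
# a plausible value, but an estimate, not the lost original.

def interpolate(row: list, missing: int) -> float:
    return (row[missing - 1] + row[missing + 1]) / 2

row = [100, 110, None, 130]       # pixel 2 was lost during compression
row[2] = interpolate(row, 2)      # rebuilt as (110 + 130) / 2
print(row)                        # [100, 110, 120.0, 130]
```

The gap is filled with something that looks right in context, which is exactly what makes interpolated output (in pixels or in prose) hard to distinguish from the original.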

Given that large-language models like ChatGPT are often extolled as the cutting edge of artificial intelligence, it may sound dismissive — or at least deflating — to describe them as lossy text-compression algorithms. I do think that this perspective offers a useful corrective to the tendency to anthropomorphize large-language models, but there is another aspect to the compression analogy that is worth considering. Since 2006, an A.I. researcher named Marcus Hutter has offered a cash reward — known as the Prize for Compressing Human Knowledge, or the Hutter Prize — to anyone who can losslessly compress a specific one-gigabyte snapshot of Wikipedia smaller than the previous prize-winner did. You have probably encountered files compressed using the zip file format. The zip format reduces Hutter’s one-gigabyte file to about three hundred megabytes; the most recent prize-winner has managed to reduce it to a hundred and fifteen megabytes. This isn’t just an exercise in smooshing. Hutter believes that better text compression will be instrumental in the creation of human-level artificial intelligence, in part because the greatest degree of compression can be achieved by understanding the text.

鑑於像 ChatGPT 這樣的大型語言模型經常被頌揚為人工智慧的最前沿,把它們描述為有損的文字壓縮演算法,聽起來可能顯得輕蔑,或者至少令人洩氣。我確實認為這種觀點對將大型語言模型擬人化的傾向提供了有益的糾正,但壓縮類比還有另一個值得考慮的面向。自2006年以來,一位名叫馬庫斯·胡特(Marcus Hutter)的人工智慧研究員提供了一筆現金獎勵(稱為“人類知識壓縮獎”,或“胡特獎”),獎勵任何能夠將維基百科的特定 1GB 快照無損壓縮得比前一位獲獎者更小的人。你可能已經遇到過使用 zip 文件格式壓縮的文件。zip 格式將胡特的 1GB 文件減少到大約 300MB;最近的獲獎者設法將其減少到 115MB。這並不僅僅是一場壓扁文件的練習。胡特認為,更好的文字壓縮將有助於創造人類水平的人工智慧,部分原因是最大程度的壓縮可以透過理解文字來實現。
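
The kind of generic compression mentioned above can be seen in miniature with Python's `zlib` module, which implements DEFLATE, the same algorithm family used by the zip format:

```python
import zlib

# Redundant, pattern-heavy text compresses dramatically.
text = ("Supply is low, so prices rise. " * 500).encode()
packed = zlib.compress(text, level=9)
print(len(text), "->", len(packed), "bytes")
```

Real Wikipedia text is far less repetitive than this toy input, which is why the zip format only reaches about 3:1 on Hutter's file, and why squeezing further requires exploiting deeper regularities in the text.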

To grasp the proposed relationship between compression and understanding, imagine that you have a text file containing a million examples of addition, subtraction, multiplication, and division. Although any compression algorithm could reduce the size of this file, the way to achieve the greatest compression ratio would probably be to derive the principles of arithmetic and then write the code for a calculator program. Using a calculator, you could perfectly reconstruct not just the million examples in the file but any other example of arithmetic that you might encounter in the future. The same logic applies to the problem of compressing a slice of Wikipedia. If a compression program knows that force equals mass times acceleration, it can discard a lot of words when compressing the pages about physics because it will be able to reconstruct them. Likewise, the more the program knows about supply and demand, the more words it can discard when compressing the pages about economics, and so forth.

要理解壓縮和理解之間的這種擬議關係,想像你有一個包含一百萬個加減乘除例子的文字文件。雖然任何壓縮演算法都可以縮小這個文件的大小,但要達到最大壓縮率,方法可能是推導出算術原理,然後編寫一個計算機程式的程式碼。使用計算機,你不僅可以完美地重建文件中的一百萬個例子,還可以重建你未來可能遇到的任何其他算術例子。同樣的邏輯也適用於壓縮維基百科片段的問題。如果壓縮程式知道力等於質量乘以加速度,它在壓縮物理學頁面時就可以丟棄大量的字詞,因為它能夠重建這些內容。同樣地,程式對供給和需求了解得越多,它在壓縮經濟學頁面時能丟棄的字詞就越多,以此類推。


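The calculator argument can be sketched concretely. The `program` string below is a hypothetical stand-in for a real rule-based decoder, included only to compare sizes:

```python
import zlib

# The best "compression" of a pile of arithmetic examples is the rule that
# generates them: store a tiny calculator program instead of the data.

def examples(n: int) -> bytes:
    lines = (f"{a} + {b} = {a + b}" for a in range(n) for b in range(n))
    return "\n".join(lines).encode()

data = examples(100)                      # 10,000 addition examples
generic = len(zlib.compress(data, 9))     # generic lossless compression

# The rule-based encoding is just the generator expression itself:
program = b"'\\n'.join(f'{a} + {b} = {a + b}' for a in range(100) for b in range(100))"

print(len(data), generic, len(program))
```

The program is far smaller than any generic compression of its output, and, unlike a stored archive, it also reproduces examples that never appeared in the original file, which is the sense in which maximal compression shades into understanding.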
Large-language models identify statistical regularities in text. Any analysis of the text of the Web will reveal that phrases like “supply is low” often appear in close proximity to phrases like “prices rise.” A chatbot that incorporates this correlation might, when asked a question about the effect of supply shortages, respond with an answer about prices increasing. If a large-language model has compiled a vast number of correlations between economic terms — so many that it can offer plausible responses to a wide variety of questions — should we say that it actually understands economic theory? Models like ChatGPT aren’t eligible for the Hutter Prize for a variety of reasons, one of which is that they don’t reconstruct the original text precisely — i.e., they don’t perform lossless compression. But is it possible that their lossy compression nonetheless indicates real understanding of the sort that A.I. researchers are interested in?

大型語言模型識別文字中的統計規律性。對網路文字的任何分析都會發現,像“供應不足”這樣的字詞經常與“價格上漲”這樣的字詞相鄰出現。一個包含這種相關性的聊天機器人,在被問及供應短缺的影響時,可能會以價格上漲作為回答。如果一個大型語言模型彙整了大量經濟術語之間的相關性,多到它可以對各種問題提供合理的回答,我們是否應該說它實際上理解了經濟理論?像 ChatGPT 這樣的模型沒有資格獲得胡特獎,原因有很多,其中之一是它們沒有精確地重建原文,也就是說,它們進行的不是無損壓縮。但是,它們的有損壓縮是否仍有可能代表人工智慧研究人員感興趣的那種真正的理解呢?

Let’s go back to the example of arithmetic. If you ask GPT-3 (the large-language model that ChatGPT is based on) to add or subtract a pair of numbers, it almost always responds with the correct answer when the numbers have only two digits. But its accuracy worsens significantly with larger numbers, falling to ten per cent when the numbers have five digits. Most of the correct answers that GPT-3 gives are not found on the Web — there aren’t many Web pages that contain the text “245 + 821,” for example — so it’s not engaged in simple memorization. But, despite ingesting a vast amount of information, it hasn’t been able to derive the principles of arithmetic, either. A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.

讓我們回到算術的例子。如果你要求 GPT-3(ChatGPT 所依據的大型語言模型)對一組數字進行加減運算,當數字只有兩位數時,它幾乎總是能給出正確答案。但當數字更大時,它的準確性會明顯變差,當數字有五位數時,準確率會下降到百分之十。GPT-3 給出的大多數正確答案在網路上是找不到的,例如,沒有多少網頁包含“245 + 821”這樣的文字,所以它並不是在進行簡單的記憶。但是,儘管攝取了大量的資訊,它也沒有能夠推導出算術的原理。仔細檢查 GPT-3 的錯誤答案會發現,它在進行算術時並沒有進位“1”。網路上當然有關於進位“1”的解釋,但 GPT-3 並沒有能夠吸收這些解釋。GPT-3 對算術實例的統計分析使其能夠產生對真實事物的表面近似,但僅此而已。
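
The carry failure can be made concrete with a toy adder that sums each digit column independently and drops the carry. This illustrates the described behavior only; it is not a claim about GPT-3's actual mechanism:

```python
# Digit-wise addition that never carries the "1" into the next column.

def add_without_carry(a: int, b: int) -> int:
    da, db = str(a), str(b)
    width = max(len(da), len(db))
    da, db = da.zfill(width), db.zfill(width)
    # add each column independently and keep only its last digit
    return int("".join(str((int(x) + int(y)) % 10) for x, y in zip(da, db)))

print(add_without_carry(12, 34))    # 46: correct, because no column carries
print(add_without_carry(245, 821))  # 66, while the true sum is 1066
```

Such an adder looks competent on small numbers (where carries are rare) and degrades as numbers grow, matching the accuracy pattern described above.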

Given GPT-3’s failure at a subject taught in elementary school, how can we explain the fact that it sometimes appears to perform well at writing college-level essays? Even though large-language models often hallucinate, when they’re lucid they sound like they actually understand subjects like economic theory. Perhaps arithmetic is a special case, one for which large-language models are poorly suited. Is it possible that, in areas outside addition and subtraction, statistical regularities in text actually correspond to genuine knowledge of the real world?

鑑於 GPT-3 在小學科目上的失敗,我們如何解釋它有時在寫大學水平的論文上似乎表現良好的事實?儘管大語言模型經常出現幻覺,但當它們清醒時,聽起來它們確實理解經濟理論等科目。也許算術是一個特殊的例子,大語言模型不適合於此。有沒有可能,在加減法之外的領域,文字中的統計規律性實際上與現實世界的真正知識相對應?

I think there’s a simpler explanation. Imagine what it would look like if ChatGPT were a lossless algorithm. If that were the case, it would always answer questions by providing a verbatim quote from a relevant Web page. We would probably regard the software as only a slight improvement over a conventional search engine, and be less impressed by it. The fact that ChatGPT rephrases material from the Web instead of quoting it word for word makes it seem like a student expressing ideas in her own words, rather than simply regurgitating what she’s read; it creates the illusion that ChatGPT understands the material. In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.

我認為有一個更簡單的解釋。想像一下,如果 ChatGPT 是一種無損演算法,會是什麼樣子。如果是這樣的話,它將總是透過提供相關網頁的逐字引用來回答問題。我們可能會認為這個軟體只比傳統的搜尋引擎稍有改進,對它的印象也不會那麼深刻。ChatGPT 重新表述網路資料而不是逐字引用的這一事實,使它看起來像一個用自己的語言表達想法的學生,而不是簡單地轉述她所讀到的東西;這創造了一種錯覺,即 ChatGPT 理解了資料。在人類學生中,死記硬背並不是真正學習的指標,所以 ChatGPT 不能準確引用網頁內容,正是讓我們認為它學到了什麼的原因。當我們處理文字序列時,有損壓縮看起來比無損壓縮更聰明。

A lot of uses have been proposed for large-language models. Thinking about them as blurry jpegs offers a way to evaluate what they might or might not be well suited for. Let’s consider a few scenarios.

人們已經為大型語言模型提出了很多用途。把它們看作模糊的 JPEG,提供了一種評估它們可能適合或不適合哪些用途的方法。讓我們考慮幾個場景。

Can large-language models take the place of traditional search engines? For us to have confidence in them, we would need to know that they haven’t been fed propaganda and conspiracy theories — we’d need to know that the jpeg is capturing the right sections of the Web. But, even if a large-language model includes only the information we want, there’s still the matter of blurriness. There’s a type of blurriness that is acceptable, which is the re-stating of information in different words. Then there’s the blurriness of outright fabrication, which we consider unacceptable when we’re looking for facts. It’s not clear that it’s technically possible to retain the acceptable kind of blurriness while eliminating the unacceptable kind, but I expect that we’ll find out in the near future.

大型語言模型能否取代傳統的搜尋引擎?為了對它們有信心,我們需要知道它們沒有被灌輸宣傳和陰謀論,我們需要知道這個 JPEG 捕捉的是網路的正確部分。但是,即使一個大型語言模型只包含我們想要的資訊,仍然存在模糊性的問題。有一種模糊性是可以接受的,那就是用不同的詞重新表述資訊。還有一種模糊是徹底的捏造,當我們在尋找事實時,我們認為這是不可接受的。目前還不清楚在技術上是否有可能保留可接受的模糊性,同時消除不可接受的模糊性,但我預期我們在不久的將來就會知道答案。

Even if it is possible to restrict large-language models from engaging in fabrication, should we use them to generate Web content? This would make sense only if our goal is to repackage information that’s already available on the Web. Some companies exist to do just that — we usually call them content mills. Perhaps the blurriness of large-language models will be useful to them, as a way of avoiding copyright infringement. Generally speaking, though, I’d say that anything that’s good for content mills is not good for people searching for information. The rise of this type of repackaging is what makes it harder for us to find what we’re looking for online right now; the more that text generated by large-language models gets published on the Web, the more the Web becomes a blurrier version of itself.

即使有可能限制大型語言模型參與捏造,我們是否應該用它們來生成網路內容?只有當我們的目標是重新包裝網路上已有的資訊時,這才有意義。有些公司就是為了做這件事而存在的,我們通常稱它們為內容農場。也許大型語言模型的模糊性對它們來說是有用的,是一種避免侵犯版權的方法。不過,一般來說,我會說,對內容農場有利的東西,對搜尋資訊的人來說並不是好事。這種重新包裝的興起,正是我們現在更難在網路上找到所需內容的原因;由大型語言模型產生的文字在網路上發布得越多,網路就越是成為自己的一個模糊版本。

There is very little information available about OpenAI’s forthcoming successor to ChatGPT, GPT-4. But I’m going to make a prediction: when assembling the vast amount of text used to train GPT-4, the people at OpenAI will have made every effort to exclude material generated by ChatGPT or any other large-language model. If this turns out to be the case, it will serve as unintentional confirmation that the analogy between large-language models and lossy compression is useful. Repeatedly resaving a jpeg creates more compression artifacts, because more information is lost every time. It’s the digital equivalent of repeatedly making photocopies of photocopies in the old days. The image quality only gets worse.

關於 OpenAI 即將推出的 ChatGPT 繼任者 GPT-4 的資訊非常少。但我要做一個預測:在收集用於訓練 GPT-4 的大量文字時,OpenAI 的人將盡一切努力排除由 ChatGPT 或任何其他大型語言模型產生的資料。如果事實如此,這將無意中證實大型語言模型與有損壓縮之間的類比是有用的。反覆重新儲存 JPEG 會產生更多的壓縮失真,因為每次都會丟失更多的資訊。這就相當於過去反覆影印影印件的數位版本。圖像品質只會越來越差。

Indeed, a useful criterion for gauging a large-language model’s quality might be the willingness of a company to use the text that it generates as training material for a new model. If the output of ChatGPT isn’t good enough for GPT-4, we might take that as an indicator that it’s not good enough for us, either. Conversely, if a model starts generating text so good that it can be used to train new models, then that should give us confidence in the quality of that text. (I suspect that such an outcome would require a major breakthrough in the techniques used to build these models.) If and when we start seeing models producing output that’s as good as their input, then the analogy of lossy compression will no longer be applicable.

事實上,衡量一個大型語言模型品質的一個有用標準,可能是一家公司是否願意使用它所生成的文字作為新模型的訓練資料。如果 ChatGPT 的輸出對 GPT-4 來說不夠好,我們可以把它當作一個指標,認為它對我們來說也不夠好。相反,如果一個模型開始生成好到可以用來訓練新模型的文字,那麼這應該讓我們對該文字的品質有信心。(我猜想這樣的結果需要在建立這些模型的技術上有重大突破。)如果有一天我們開始看到模型產生與輸入一樣好的輸出,那麼有損壓縮的類比就不再適用了。

Can large-language models help humans with the creation of original writing? To answer that, we need to be specific about what we mean by that question. There is a genre of art known as Xerox art, or photocopy art, in which artists use the distinctive properties of photocopiers as creative tools. Something along those lines is surely possible with the photocopier that is ChatGPT, so, in that sense, the answer is yes. But I don’t think that anyone would claim that photocopiers have become an essential tool in the creation of art; the vast majority of artists don’t use them in their creative process, and no one argues that they’re putting themselves at a disadvantage with that choice.

大型語言模型能夠幫助人類創作原創性的文字嗎?要回答這個問題,我們需要明確這個問題的確切含義。有一種藝術流派被稱為 Xerox 藝術,或影印藝術,藝術家們利用影印機的獨特屬性作為創作工具。以 ChatGPT 這台“影印機”做類似的事情肯定是可能的,因此,從這個意義上說,答案是肯定的。但我不認為有人會宣稱影印機已經成為藝術創作中必不可少的工具;絕大多數藝術家在創作過程中不使用影印機,也沒有人認為這種選擇會使他們處於不利地位。

So let’s assume that we’re not talking about a new genre of writing that’s analogous to Xerox art. Given that stipulation, can the text generated by large-language models be a useful starting point for writers to build off when writing something original, whether it’s fiction or nonfiction? Will letting a large-language model handle the boilerplate allow writers to focus their attention on the really creative parts?

因此,讓我們假設我們談論的不是一種類似於 Xerox 藝術的新寫作類型。在這個前提下,大型語言模型生成的文字,能否成為作家創作原創作品(無論是小說還是非虛構作品)時一個有用的起點?讓大型語言模型來處理制式的樣板文字,是否能讓作家把注意力集中在真正有創意的部分?

Obviously, no one can speak for all writers, but let me make the argument that starting with a blurry copy of unoriginal work isn’t a good way to create original work. If you’re a writer, you will write a lot of unoriginal work before you write something original. And the time and effort expended on that unoriginal work isn’t wasted; on the contrary, I would suggest that it is precisely what enables you to eventually create something original. The hours spent choosing the right word and rearranging sentences to better follow one another are what teach you how meaning is conveyed by prose. Having students write essays isn’t merely a way to test their grasp of the material; it gives them experience in articulating their thoughts. If students never have to write essays that we have all read before, they will never gain the skills needed to write something that we have never read.

顯然,沒有人能代表所有作家發言,但請讓我提出這樣的論點:從非原創作品的模糊副本開始,並不是創作原創作品的好方法。如果你是一個作家,在寫出原創作品之前,你會寫很多非原創的作品。而花在這些非原創作品上的時間和精力並沒有浪費;相反,我認為正是這些努力讓你最終能夠創作出原創的東西。花在選擇正確的字詞、重新安排句子使其更連貫上的時間,教會了你意義是如何透過文字傳達的。讓學生寫文章不僅僅是測試他們對教材掌握程度的一種方式;它還讓他們獲得表達自己思想的經驗。如果學生從來不必寫我們都讀過的文章,他們就永遠不會獲得寫出我們從未讀過的東西所需的技能。


And it’s not the case that, once you have ceased to be a student, you can safely use the template that a large-language model provides. The struggle to express your thoughts doesn’t disappear once you graduate — it can take place every time you start drafting a new piece. Sometimes it’s only in the process of writing that you discover your original ideas. Some might say that the output of large-language models doesn’t look all that different from a human writer’s first draft, but, again, I think this is a superficial resemblance. Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.

而且,並不是說一旦你不再是學生,你就可以安全地使用大型語言模型所提供的模板。表達思想的掙扎並不會在你畢業後就消失,每次你開始起草新作品時,它都可能發生。有時只有在寫作的過程中,你才會發現自己的原創想法。有些人可能會說,大型語言模型的輸出看起來與人類作家的初稿沒有什麼不同,但我再次認為這只是表面上的相似。你的初稿並不是一個表達清楚的非原創想法;它是一個表達不佳的原創想法,伴隨著你難以名狀的不滿,以及你對它所說的與你想要它說的之間距離的察覺。這就是在重寫過程中指引你的東西,也是當你從人工智慧生成的文字開始時所缺乏的東西之一。

There’s nothing magical or mystical about writing, but it involves more than placing an existing document on an unreliable photocopier and pressing the Print button. It’s possible that, in the future, we will build an A.I. that is capable of writing good prose based on nothing but its own experience of the world. The day we achieve that will be momentous indeed — but that day lies far beyond our prediction horizon. In the meantime, it’s reasonable to ask, What use is there in having something that rephrases the Web? If we were losing our access to the Internet forever and had to store a copy on a private server with limited space, a large-language model like ChatGPT might be a good solution, assuming that it could be kept from fabricating. But we aren’t losing our access to the Internet. So just how much use is a blurry jpeg, when you still have the original? ♦

寫作並沒有什麼神奇或神秘之處,但它所涉及的不僅僅是把現有文件放在不可靠的影印機上,然後按下列印按鈕。有可能在未來,我們將建立一個僅憑自己對世界的經驗就能寫出好文章的人工智慧。我們實現這一目標的那一天將是非常重要的,但那一天遠遠超出我們的預測範圍。在此期間,我們有理由問:擁有一個重新表述網路的東西有什麼用?如果我們將永遠失去對網路的存取,不得不在空間有限的私人伺服器上存儲一份副本,那麼像 ChatGPT 這樣的大型語言模型可能是一個很好的解決方案,前提是它能夠避免捏造內容。但我們並沒有失去對網路的存取。那麼,當你還有原件的時候,一張模糊的 JPEG 又有多大用處呢?



fox hsiao

fOx. A starter, blogger, gamer. Co-founder @ iCook & INSIDE