  1. title: Artificial General Intelligence and the bird brains of Silicon Valley
  2. url: https://softwarecrisis.dev/letters/ai-bird-brains-silicon-valley/
  3. hash_url: f23d043d8e99f2af5fcf1b970f98744a
  4. <figure><blockquote>
  5. <p>
  6. The problem is, if one side of the communication does not have meaning,
  7. then the comprehension of the implicit meaning is an illusion arising
  8. from our singular human understanding of language (independent of the
  9. model). Contrary to how it may seem when we observe its output, an LM is
  10. a system for haphazardly stitching together sequences of linguistic
  11. forms it has observed in its vast training data, according to
  12. probabilistic information about how they combine, but without any
  13. reference to meaning: a stochastic parrot.
  14. </p>
  15. </blockquote>
  16. <figcaption>
  17. <p>Emily M. Bender, Timnit Gebru, et al., <em>On the Dangers of Stochastic
  18. Parrots: Can Language Models Be Too Big?</em>.</p>
  19. </figcaption>
  20. </figure>
  21. <p>Bird brains have a bad reputation. The diminutive size of your average
  22. bird and their brain has led people to assume that they are, well,
  23. dumb.</p>
  24. <p>But bird brains are amazing. Birds commonly outperform mammals with
  25. larger brains at a variety of general reasoning and problem-solving
  26. tasks. Some by a large margin. Their small brains manage this by
  27. packing numerous neurons in a small space using structures that are
  28. unlike those you find in mammals.</p>
  29. <p>Even though birds have extremely capable minds, those minds are built in
  30. ways that are different from our own or other mammals. Similar
  31. capabilities; different structure.</p>
  32. <p>The ambition of the Silicon Valley AI industry is to create something
  33. analogous to a bird brain: a new kind of mind that is functionally
  34. similar to the human mind, possibly outperforming it, while being built
  35. using very different mechanisms. Similar capabilities; different
  36. structure.</p>
  37. <p>This effort goes back decades, to the dawn of computing, and has had
  38. limited success.</p>
  39. <p>Until recently, it seems.</p>
  40. <p>If you’re reading this, you’ve almost certainly interacted with a
  41. Generative AI, however indirectly. Maybe you’ve tried Bing Chat. Maybe
  42. you’ve subscribed to the paid tier for ChatGPT. Or, maybe you’ve used
  43. Midjourney to generate images. At the very least you’ve been forced to
  44. see the images or text posted by the overenthusiastic on social media.</p>
  45. <p>These AI models are created by pushing an enormous amount of training
  46. data through various algorithms:</p>
  47. <ul>
  48. <li>Language models like ChatGPT are trained on a good chunk of the textual material available in digital form in the world.</li>
  49. <li>Image models like Midjourney and Stable Diffusion are trained on a huge collection of images found on the internet.</li>
  50. </ul>
  51. <p>What comes out the other end is a mathematical model of the media domain
  52. in question: text or images.</p>
  53. <p>You know what Generative AI is in terms of how it presents to you as
  54. software: clever chatbots that do or say things in response to what you
  55. say: <em>your prompt</em>. Some of those responses are useful, and they give
  56. you an impression of sophisticated comprehension. The models that
  57. generate text are fluent and often quite engaging.</p>
  58. <p>This fluency is misleading. What Bender and Gebru meant when they coined
  59. the term <em>stochastic parrot</em> wasn’t to imply that these are, indeed, the
  60. new bird brains of Silicon Valley, but that they are unthinking text
  61. synthesis engines that just repeat phrases. They are the proverbial
  62. parrot who echoes without thinking, not the actual parrot who is capable
  63. of complex reasoning and problem-solving.</p>
  64. <p>A <em>zombie parrot</em>, if you will, that screams for <em>brains</em> because it has
  65. none.</p>
  66. <p>The fluency of the zombie parrot—the unerring confidence and a style of
  67. writing that some find endearing—creates a strong illusion of
  68. intelligence.</p>
  69. <p>Every other time we read text, we are engaging with the product of
  70. another mind. We are so used to the idea of text as a representation of
  71. another person’s thoughts that we have come to mistake their writing
  72. <em>for</em> their thoughts. But they aren’t. Text and media are tools that
  73. authors and artists create to let people change their own state of
  74. mind—hopefully in specific ways to form the image or effect the author
  75. was after.</p>
  76. <p>Reading is an indirect collaboration with the author, mediated through
  77. the writing. Text has no inherent reasoning or intelligence. Agatha
  78. Christie’s ghost does not inhabit the words of <em>Murder on the Orient Express</em>.
  79. Stephen King isn’t hovering over you when you read <em>Carrie</em>. The ghost
  80. you feel while reading is an illusion you’ve made out of your own
  81. experience, knowledge, and imagination. Every word you read causes your
  82. mind to reconstruct its meaning using your memories and creativity. The
  83. idea that there is intelligence somehow inherent in writing is an
  84. illusion. The intelligence is <em>all</em> yours, all the time: thoughts you
  85. make yourself in order to make sense of another person’s words. This can
  86. prompt us to greatness, broaden our minds, inspire new thoughts, and
  87. introduce us to new concepts. A book can contain worlds, but we’re the
  88. ones that bring them into being as we read. What we see is uniquely our
  89. own. The thoughts are not transported from the author’s mind and
  90. injected into ours.</p>
  91. <p>The words themselves are just line forms on a background with no
  92. inherent meaning or intelligence. The word “horse” doesn’t come with the
  93. Platonic ideal of a horse attached to it. The word “anger” isn’t full of
  94. seething emotion or the restrained urge towards violence. Even words
  95. that are arguably onomatopoeic, like the word “brabra” we use in
  96. Icelandic for the sound a duck makes, are still incredibly specific to
  97. the cultures and context they come from. We are the ones doing the heavy
  98. lifting in terms of reconstructing a picture of an intelligence behind
  99. the text. When there is no actual intelligence, such as with ChatGPT, we
  100. are the ones who end up filling in the gaps with our memories,
  101. experience and imagination.</p>
  102. <p>When ChatGPT demonstrates intelligence, that comes from us. Some of
  103. it we construct ourselves. Some of it comes from our inherent
  104. biases.</p>
  105. <p>There is no ‘there’ there. We are alone in the room, reconstructing an
  106. abstract representation of a mind. The reasoning you see is only in your
  107. head. You are hallucinating intelligence where there is none. You are
  108. doing the textual equivalent of seeing a face in a power outlet.</p>
  109. <p>This drive—<em>anthropomorphism</em>—seems to be innate. Our first instinct
  110. when faced with anything unfamiliar—whose drives, motivations, and
  111. mechanisms we don’t understand—is to assume that they think much like a
  112. human would. When that unfamiliar agent uses language like a human
  113. would, the urge to see them as near or fully human is impossible to
  114. resist—a recurring issue in the history of AI research that dates all
  115. the way back to 1966.</p>
  116. <p>These tools solve problems and return fluent, if untruthful, answers,
  117. which is what creates such a convincing illusion of intelligence.</p>
  118. <p>Text synthesis engines like ChatGPT and GPT-4 do not have any
  119. self-awareness. They are mathematical models of the various patterns to
  120. be found in the collected body of human text. How granular the model is
  121. depends on its design and the languages in question. Some of the
  122. tokens—the smallest unit of language the model works with—will be
  123. characters or punctuation marks, some of them will be words, syllables,
  124. or even phrases. Many language models use a mixture of these.</p>
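<p>To make the idea of mixed token sizes concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary and the <code>tokenize</code> function are hypothetical and chosen only for illustration; production tokenizers learn their vocabularies from data and are considerably more involved.</p>

```python
# Hypothetical toy vocabulary mixing a phrase, a word, word pieces,
# and single characters. Real vocabularies are learned from data.
VOCAB = {"new york", "parrot", "par", "rot", "s", " ", ",", "."}


def tokenize(text, vocab=VOCAB):
    """Split text by always taking the longest vocabulary entry that matches.

    Characters that match nothing in the vocabulary become one-character tokens.
    """
    text = text.lower()
    max_len = max(len(entry) for entry in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


print(tokenize("Parrots, New York."))
# ['parrot', 's', ',', ' ', 'new york', '.']
```

<p>In this toy run a whole phrase, a word, a word fragment, and punctuation marks all end up as tokens in the same sequence.</p>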
  125. <p>With enough detail—a big enough collection of text—these tools will
  126. model enough of the probabilistic distribution of various words or
  127. characters to be able to perform what looks like magic:</p>
  128. <ul>
  129. <li>They generate fluent answers by calculating the most probable sequence
  130. of words, at that time, which would serve as the continuation of or
  131. response to the prompt.</li>
  132. <li>They can perform limited reasoning tasks that correlate with textual
  133. descriptions of prior reasoning tasks in the training data.</li>
  134. </ul>
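<p>The first point above can be sketched with a deliberately tiny stand-in: a bigram counter that always continues with the word that most often followed the previous one in its training text. The corpus and the function names are assumptions made up for this illustration. Actual language models use neural networks over tokens and usually sample from the distribution rather than always taking the top choice, so treat this as a sketch of the correlative principle and not of any real system.</p>

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training data".
CORPUS = "the parrot repeats the phrase . the parrot repeats the parrot ."


def train_bigrams(text):
    """Count, for every word, which words followed it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def continue_text(counts, prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing in the training data ever followed this word
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


model = train_bigrams(CORPUS)
print(continue_text(model, "the", max_words=4))  # the parrot repeats the parrot
print(continue_text(model, "a bird"))            # a bird
```

<p>The second call returns the prompt unchanged: the word "bird" never appeared in the corpus, so there is no recorded pattern to continue.</p>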
  135. <p>With enough of these correlative shortcuts, the model can perform
  136. something that looks like common sense reasoning: its output is text
  137. that replicates prior representations of reasoning. This works for
  138. as long as you don’t accidentally use the wrong phrasing in your prompt
  139. and break the correlation.</p>
  140. <p>The mechanism behind these systems is entirely correlative from the
  141. ground up. What looks like reasoning is incredibly fragile and
  142. breaks as soon as you rephrase or reword your prompt. It exists
  143. only as a probabilistic model of text. A Generative AI chatbot is a
  144. language engine incapable of genuine thought.</p>
  145. <p>These language models are interactive but static snapshots of the
  146. probability distributions of a written language.</p>
  147. <p>They are obviously interactive; that’s the whole point of a chatbot. They are
  148. static in that they do not change when used or activated. In fact,
  149. changing them requires an enormous amount of computing power over a long
  150. period of time. What the system models are the distributions and
  151. correlations of the tokens it records for the texts in its training data
  152. set—how the various words, syllables, and punctuation relate to each
  153. other over as much of the written history of a language as the company
  154. can find.</p>
  155. <p>That’s what distinguishes biological minds from these algorithmic
  156. hindsight factories: a biological mind does not reason using the
  157. probability distributions of all the prior cultural records of its
  158. ancestors. Biological minds learn primarily through trial and error.
  159. Try, fail, try again. They build their neural network, which is
  160. functionally very different from what you see in a software model,
  161. through constant feedback, experimentation, and repeated failure—driven
  162. by a chemical network that often manifests as instinct, emotion,
  163. motivation, and drive. The neural network—bounded, defined, and driven
  164. by the chemical network—is constantly changing and responding to outside
  165. stimuli. Every time an animal’s nervous system is “used”, it changes. It
  166. is always changing, until it dies.</p>
  167. <p>Biological minds <em>experience</em>. Synthesis engines parse imperfect
  168. <em>records</em> of experiences. The former are forward-looking and operate
  169. primarily in the present, sometimes to their own detriment. The latter
  170. exist exclusively as probabilistic manifestations of imperfect
  171. representations of thoughts past. They are snapshots. Generative AI models are
  172. themselves cultural records.</p>
  173. <p>These models aren’t new bird brains—new alien minds that are peers to
  174. our own. They aren’t even insect brains. Insects have autonomy. They are
  175. capable of general problem-solving—some of them dealing with tasks of
  176. surprising complexity—and their abilities tolerate the kind of
  177. minor alterations in the problem environment that would break the
  178. correlative pseudo-reasoning of a language model. Large Language
  179. Models are something lesser. They are water running down pathways etched
  180. into the ground over centuries by the rivers of human culture. Their
  181. originality comes entirely from random combinations of historical
  182. thought. They do not know the ‘meaning’ of anything—they only know the
  183. records humans find meaningful enough to store. Their unreliability
  184. comes from their unpredictable behaviour in novel circumstances. When
  185. there is no riverbed to follow, they drown the surrounding landscape.</p>
  186. <p>The entirety of their documented features, capabilities, and recorded
  187. behaviour—emergent or not—is explained by this conceptual model of
  188. generative AI. There are no unexplained corner cases that don’t fit or
  189. actively disprove this theory.</p>
  190. <p>Yet people keep assuming that what ChatGPT does can only be explained as
  191. the first glimmer of genuine Artificial General Intelligence. The bird
  192. brain of Silicon Valley is born at last!</p>
  193. <p>Because text and language are the primary ways we experience other
  194. people’s reasoning, it’ll be next to impossible to dislodge the notion
  195. that these are genuine intelligences. No amount of examples, scientific
  196. research, or analysis will convince those who want to maintain a
  197. pseudo-religious belief in alien peer intelligences. After all, if you
  198. want to believe in aliens, an artificial one made out of supercomputers
  199. and wishful thinking <em>feels</em> much more plausible than little grey men
  200. from outer space. But that’s what it is: <em>a belief in aliens.</em></p>
  201. <p>It doesn’t help that so many working in AI seem to <em>want</em> this to be
  202. true. They seem to be true believers who are convinced that the spark of
  203. Artificial General Intelligence has been struck.</p>
  204. <p>They are inspired by the science fictional notion that if you make
  205. something complex enough, it will spontaneously become intelligent. This
  206. isn’t an uncommon belief. You see it in movies and novels—the notion
  207. that any network of sufficient complexity will spontaneously become
  208. sentient has embedded itself in our popular psyche. James Cameron’s
  209. skull-crushing metal skeletons have a lot to answer for.</p>
  210. <p>That notion doesn’t seem to have any basis in science. The idea that
  211. general intelligence is an emergent property of neural networks that
  212. appears once the network reaches sufficient complexity, is a view based
  213. on archaic notions of animal intelligence—that animals are soulless
  214. automata incapable of feeling or reasoning. That view was
  215. formed during a period when we didn’t realise just how common
  216. self-awareness (i.e. the mirror test) and general reasoning are in the
  217. animal kingdom. Animals are smarter than we assumed and the
  218. difference between our reasoning and theirs seems to be a matter of
  219. degree, not of presence or absence.</p>
  220. <p>General reasoning seems to be an <em>inherent</em>, not emergent, property of
  221. pretty much any biological lifeform with a notable nervous system.</p>
  222. <p>The bumblebee, despite having only a tiny fraction of the neurons of a
  223. human brain, is capable of not only solving puzzles but also of
  224. <em>teaching other bees to solve those puzzles.</em> They reason and have a
  225. culture. They have more genuine and robust general reasoning
  226. skills—that don’t collapse into incoherence at minor adjustments to the
  227. problem space—than GPT-4 or any large language model on the market.
  228. That’s with only around half a million neurons to work with.</p>
  229. <p>Conversely, GPT-3 is made up of 175 <em>billion</em> parameters—what passes for
  230. a “neuron” in a digital neural network. GPT-4 is even larger, with
  231. some estimates coming in at a <em>trillion</em> parameters. Then you have
  232. fine-tuned systems such as ChatGPT, that are built from multiple
  233. interacting models layered on top of GPT-3.5 or GPT-4, which make for an
  234. even more complex interactive system.</p>
  235. <p>ChatGPT, running on GPT-4, is easily a <em>million</em> times more complex than
  236. the “neural network” of a bumblebee and yet, out of the two, it’s the
  237. striped invertebrate that demonstrates robust and adaptive
  238. general-purpose reasoning skills. Very simple minds, those belonging to
  239. small organisms that barely have a brain, are capable of reasoning about
  240. themselves, the world around them, and the behaviour of other
  241. animals.</p>
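<p>As a rough check on that scale comparison, the figures quoted in the text can be divided directly. The constant and function names below are invented for this sketch, and it follows the essay's own framing of comparing parameter counts with neuron counts, which are not strictly equivalent units.</p>

```python
# Figures as quoted in the text; the GPT-4 number is an estimate cited there.
BUMBLEBEE_NEURONS = 500_000                    # "around half a million"
GPT3_PARAMETERS = 175_000_000_000              # 175 billion
GPT4_PARAMETERS_ESTIMATE = 1_000_000_000_000   # "a trillion"


def size_ratio(parameters, neurons=BUMBLEBEE_NEURONS):
    """How many model parameters there are per bumblebee neuron."""
    return parameters // neurons


print(size_ratio(GPT3_PARAMETERS))           # 350000
print(size_ratio(GPT4_PARAMETERS_ESTIMATE))  # 2000000
```

<p>On these numbers GPT-3 comes out at hundreds of thousands of parameters per bee neuron, and the trillion-parameter estimate for GPT-4 at about two million.</p>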
  242. <p>Unlike the evidence for ‘sparks’ of AGI in language models, the evidence
  243. for animal reasoning—even consciousness—is broad, compelling, and
  244. encompasses decades of work by numerous scientists.</p>
  245. <p>AI models are flawed attempts at digitally synthesising neurologies.
  246. They are built on the assumption that all the rest—metabolisms,
  247. hormones, chemicals, and senses—aren’t necessary for developing
  248. intelligence.</p>
  249. <p>Reasoning in biological minds does not seem to be a property that
  250. emerges from complexity. The capacity to reason looks more likely to be
  251. a <em>built-in</em> property of most animal minds. A reasoning mind
  252. appears to be a direct consequence of how animals are structured as a
  253. whole—chemicals, hormones, and physical body included. The animal
  254. capacity for problem-solving, social reasoning, and self-awareness seems
  255. to increase, unevenly and fitfully, with the number of neurons until it
  256. reaches the level we see in humans. Reasoning does not ‘emerge’ or
  257. appear. Some creatures are better at it than others, but it’s there in
  258. some form even in very small, very simple beings like the bumblebee. It
  259. doesn’t happen magically when you hook up a bunch of disparate objects
  260. together in a complex enough network. A reasoning mind is the <em>starting
  261. point</em> of biological thinking, not the endpoint that only “emerges” with
  262. sufficient complexity.</p>
  263. <p>The internet—a random interconnected collection of marketing offal,
  264. holiday snaps, insufferable meetings, and porn—isn’t going to become
  265. self-aware and suddenly acquire the capacity for general reasoning once
  266. it reaches a certain size, and neither will Large Language Models. The
  267. notion that we are making autonomous beings capable of Artificial
  268. General Intelligence just by loading a neural network up with an
  269. increasingly bigger collection of garbage from the internet is not one
  270. that has any basis in anything we understand about biology or animal
  271. reasoning.</p>
  272. <p>But AI companies insist that they are on the verge of AGI. Their
  273. rhetoric around it verges on the religious as the idea of an AGI is
  274. idealised and almost worshipped. They claim to be close to making a
  275. new form of thinking life, but they refuse to release the data required
  276. to prove it. They’ve built software that performs well on the
  277. arbitrary benchmarks they’ve chosen and claim are evidence of general
  278. intelligence, but those tests prove no such thing and have no such
  279. validity. The benchmarks are theatrics that have no applicability
  280. towards demonstrating genuine general intelligence.</p>
  281. <p>AI researchers love to resurrect outdated pseudoscience such as
  282. phrenology—shipping AI software that promises to be able to tell you if
  283. somebody is likely to be a criminal based on the shape of their
  284. skull. It’s a field where researchers and vendors routinely claim
  285. that their AIs can detect whether you’re a potential criminal, gay, a
  286. good employee, liberal or conservative, or even a psychopath, based on
  287. “your face, body, gait, and tone of voice.”</p>
  288. <p><em>It’s pseudoscience</em>.</p>
  289. <p>This is the field and the industry that claims to have accomplished the
  290. first ‘spark’ of Artificial General Intelligence?</p>
  291. <p>Last time we saw a claim this grand, with this little scientific
  292. evidence, the men in the white coats were promising us room-temperature
  293. fusion, giving us free energy for life, and ending the world’s
  294. dependence on fossil fuels.</p>
  295. <p>Why give the tech industry the benefit of the doubt when they are all
  296. but claiming godhood—that they’ve created a new form of life never seen
  297. before?</p>
  298. <p>As <a href="https://en.wikipedia.org/wiki/Sagan_standard">Carl Sagan said</a>:
  299. <em>“extraordinary claims require extraordinary evidence.”</em></p>
  300. <p>He didn’t say “extraordinary claims require only vague insinuations and
  301. pinky-swear promises.”</p>
  302. <p>To claim you’ve created a completely new kind of mind that’s on par with
  303. any animal mind—or, even superior—and provides general intelligence
  304. using mechanisms that don’t resemble anything anybody has ever seen in
  305. nature, is by definition the most extraordinary of claims.</p>
  306. <p>The AI industry is backing their claims of Artificial General
  307. Intelligence with hot air, hand-waving, and cryptic references to data
  308. and software nobody outside their organisations is allowed to review or
  309. analyse.</p>
  310. <p>They are pouring an ever-increasing amount of energy and work into
  311. ever-larger models all in the hope of triggering the
  312. ‘<a href="https://en.wikipedia.org/wiki/Technological_singularity">singularity</a>’
  313. and creating a digital superbeing. Like a cult of monks boiling the
  314. oceans in order to hear whispers of the name of God.</p>
  315. <p>It’s a farce. All theatre; no evidence. Whether they realise it or not,
  316. they are taking us for a ride. The sooner we see that they aren’t
  317. backing their claims with science, the sooner we can focus on finding
  318. safe and productive uses—limiting its harm, at least—for the technology
  319. as it exists today.</p>
  320. <p>After everything the tech industry has done over the past decade, the
  321. financial bubbles, the gig economy, legless virtual reality avatars,
  322. crypto, the endless software failures—just think about it—do you think
  323. we should believe them when they make grand, unsubstantiated claims
  324. about miraculous discoveries? Have they earned our trust? Have they
  325. shown that their word is worth more than that of independent scientists?</p>
  326. <p>Do you think that they, with this little evidence, have really done what
  327. they claim, and discovered a literal new form of life? But are
  328. conveniently unable to prove it because of ‘safety’?</p>
  329. <p>Me neither.</p>
  330. <p>The notion that large language models are on the path towards Artificial
  331. General Intelligence is a dangerous one. It’s a myth that directly
  332. undermines any effort to think clearly or strategise about generative AI
  333. because it strongly reinforces <em>anthropomorphism</em>.</p>
  334. <p>That’s when you reason about an object or animal <em>as if it were a
  335. person</em>. It prevents you from forming an accurate mental model of the non-human thing’s behaviour. AI is especially prone to creating this reaction. Software such as chatbots triggers all three major factors that promote
  336. anthropomorphism in people:</p>
  337. <ol>
  338. <li><em>Understanding.</em> If we lack an understanding of how an object works,
  339. our minds will resort to thinking of it in terms of something that’s
  340. familiar to us: people. We understand the world as people because
  341. that’s what we are. This becomes stronger the more similar we
  342. perceive the object to be to ourselves.</li>
  343. <li><em>Motivation.</em> We are motivated to both seek out human interaction
  344. and to interact effectively with our environment. This reinforces
  345. the first factor. The more uncertain we are of how that thing works,
  346. the stronger the anthropomorphism. The less control we have over it,
  347. the stronger the anthropomorphism.</li>
  348. <li><em>Sociality</em>. We have a need for human contact and our tendency
  349. towards anthropomorphising objects in our environment increases with
  350. our isolation.</li>
  351. </ol>
  352. <p>Because we lack cohesive cognitive models for what makes these language
  353. models so fluent, feel a strong motivation to understand and use them as
  354. they are integrated into our work, and, increasingly, our socialisation
  355. in the office takes on the very same text conversation form as a chatbot
  356. does, we inevitably feel a strong drive to see these software systems as
  357. people. The myth of AGI reinforces this—supercharges the anthropomorphism—because it implies that “people”
  358. is indeed an appropriate cognitive model for how these systems behave.</p>
  359. <p>It isn’t. <strong><em>AI are not people.</em></strong> Treating them as such is a major
  360. strategic error as it will prevent you from thinking clearly about their
  361. capabilities and limitations.</p>
  362. <p>Believing the myth of Artificial General Intelligence makes you incapable of understanding what language models today are and how they work.</p>