
More links

master
David Larlet 2 years ago
parent
commit
a75443a807

+ 183
- 0
cache/2021/b404382125c07935b98295a801049097/index.html

@@ -0,0 +1,183 @@
<!doctype html><!-- This is a valid HTML5 document. -->
<!-- Screen readers, SEO, extensions and so on. -->
<html lang="fr">
<!-- Has to be within the first 1024 bytes, hence before the `title` element
See: https://www.w3.org/TR/2012/CR-html5-20121217/document-metadata.html#charset -->
<meta charset="utf-8">
<!-- Why no `X-UA-Compatible` meta: https://stackoverflow.com/a/6771584 -->
<!-- The viewport meta is quite crowded and we are responsible for that.
See: https://codepen.io/tigt/post/meta-viewport-for-2015 -->
<meta name="viewport" content="width=device-width,initial-scale=1">
<!-- Required to make a valid HTML5 document. -->
<title>The Questions Concerning Technology (archive) — David Larlet</title>
<meta name="description" content="Publication mise en cache pour en conserver une trace.">
<!-- That good ol' feed, subscribe :). -->
<link rel="alternate" type="application/atom+xml" title="Feed" href="/david/log/">
<!-- Generated from https://realfavicongenerator.net/ such a mess. -->
<link rel="apple-touch-icon" sizes="180x180" href="/static/david/icons2/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/static/david/icons2/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/static/david/icons2/favicon-16x16.png">
<link rel="manifest" href="/static/david/icons2/site.webmanifest">
<link rel="mask-icon" href="/static/david/icons2/safari-pinned-tab.svg" color="#07486c">
<link rel="shortcut icon" href="/static/david/icons2/favicon.ico">
<meta name="msapplication-TileColor" content="#f7f7f7">
<meta name="msapplication-config" content="/static/david/icons2/browserconfig.xml">
<meta name="theme-color" content="#f7f7f7" media="(prefers-color-scheme: light)">
<meta name="theme-color" content="#272727" media="(prefers-color-scheme: dark)">
<!-- Documented, feel free to shoot an email. -->
<link rel="stylesheet" href="/static/david/css/style_2021-01-20.css">
<!-- See https://www.zachleat.com/web/comprehensive-webfonts/ for the trade-off. -->
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<script>
function toggleTheme(themeName) {
document.documentElement.classList.toggle(
'forced-dark',
themeName === 'dark'
)
document.documentElement.classList.toggle(
'forced-light',
themeName === 'light'
)
}
const selectedTheme = localStorage.getItem('theme')
if (selectedTheme !== 'undefined') {
toggleTheme(selectedTheme)
}
</script>

<meta name="robots" content="noindex, nofollow">
<meta content="origin-when-cross-origin" name="referrer">
<!-- Canonical URL for SEO purposes -->
<link rel="canonical" href="https://theconvivialsociety.substack.com/p/the-questions-concerning-technology">

<body class="remarkdown h1-underline h2-underline h3-underline em-underscore hr-center ul-star pre-tick" data-instant-intensity="viewport-all">


<article>
<header>
<h1>The Questions Concerning Technology</h1>
</header>
<nav>
<p class="center">
<a href="/david/" title="Aller à l’accueil"><svg class="icon icon-home">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-home"></use>
</svg> Accueil</a> •
<a href="https://theconvivialsociety.substack.com/p/the-questions-concerning-technology" title="Lien vers le contenu original">Source originale</a>
</p>
</nav>
<hr>
<p>A few days ago, a handful of similar stories or anecdotes about technology came to my attention. While they came from different sectors and were of varying degrees of seriousness, they shared a common characteristic. In each case, there was either an expressed bewilderment or admission of obliviousness about the possibility that a given technology would be put to destructive or nefarious purposes. Naturally, I tweeted about it … like one does. </p>
<p>I subsequently clarified that I was not subtweeting anyone in particular, just everything in general. Of course, naiveté, hubris, and recklessness don’t quite cover all the possibilities—nor are they mutually exclusive. </p>
<p>In response, someone noted that “people find it hard to ‘think like an *-hole’, in <br><a href="https://twitter.com/mathbabedotorg" rel="">@mathbabedotorg</a>'s phrase, because most aren’t.” That handle belongs to Cathy O’Neil, best known for her 2016 book, <em><a href="https://crownpublishing.com/archives/feature/big-data-increases-inequality-threatens-democracy" rel="">Weapons of Math Destruction: How Big Data Increases Inequality And Threatens Democracy</a></em>. </p>
<p>There’s something to this, of course, and, as I mentioned in my reply, I truly do appreciate the generosity of this sentiment. I suggested that the witness of history is helpful on this score, correcting and informing our own limited perspectives. But I was also reminded of a set of questions that I had put together back in 2016 in a moment of similar frustration. </p>
<p>The occasion then was the following <a href="https://om.co/2014/11/26/technology-and-the-moral-dimension/" rel="">observation</a> from Om Malik: </p>
<blockquote><p>“I can safely say that we in tech don’t understand the emotional aspect of our work, just as we don’t understand the moral imperative of what we do. It is not that all players are bad; it is just not part of the thinking process the way, say, ‘minimum viable product’ or ‘growth hacking’ are.”</p></blockquote>
<p>Malik went on to write that “it is time to add an emotional and moral dimension to products,” by which he seems to have meant that tech companies should use data responsibly and make their terms of service more transparent. In my response at the time, I took the opportunity to suggest that we needn’t add an emotional and moral dimension to tech; it was already there. The only question was as to its nature. As Langdon Winner had famously inquired “Do artifacts have politics?” and answered in the affirmative, I likewise argued that artifacts have ethics. I then went on to produce a set of 41 questions that I drafted with a view to helping us draw out the moral or ethical implications of our tools. The post proved popular at the time and I received a few notes from developers and programmers who had found the questions useful enough to print out and post in their workspaces. </p>
<p>This was all before the subsequent boom in “tech ethics,” and, frankly, while my concerns obviously overlap to some degree with the most vocal and popular representatives of that movement, I’ve generally come at the matter from a different place and have expressed my own <a href="https://thefrailestthing.com/2017/11/06/one-does-not-simply-add-ethics-to-technology/" rel="">reservations</a> with the shape more recent tech ethics advocacy has taken. Nonetheless, I have defended the need to think about the moral dimensions of technology against the notion that all that matters are the underlying dynamics of political economy (e.g., <a href="https://thefrailestthing.com/2018/07/07/political-economy-or-ethics-of-technology/" rel="">here</a> and <a href="https://thefrailestthing.com/2018/10/24/in-defense-of-technology-ethics-properly-understood/" rel="">here</a>). </p>
<p>I won’t cover that ground again, but I did think it might be worthwhile to repost the questions I drafted then. It’s been more than six years since I first posted them, and, while some of you reading this have been following along since then, most of you picked up on my work in just the last couple of years. And, recalling where we began, trying to think like a malevolent actor might yield some useful insights, but I’d say that we probably need a better way to prompt our thinking about technology’s moral dimensions. Besides, worst-case malevolent uses are not the only kinds of morally significant aspects of our technology worth our consideration, as I hope some of these questions will make clear. </p>
<p>This is not, of course, an exhaustive set of questions, nor do I claim any unique profundity for them. I do hope, however, that they are useful, wherever we happen to find ourselves in relation to technological artifacts and systems. At one point, I had considered doing something a bit more with these, possibly expanding on each briefly to explain the underlying logic and providing some concrete illustrative examples or cases. Who knows, maybe that would be a good occasional series for the newsletter. Feel free to let me know what you think about that. </p>
<p>Anyway, without further ado, here they are: </p>
<ol><li><p>What sort of person will the use of this technology make of me?</p></li><li><p>What habits will the use of this technology instill?</p></li><li><p>How will the use of this technology affect my experience of time?</p></li><li><p>How will the use of this technology affect my experience of place?</p></li><li><p>How will the use of this technology affect how I relate to other people?</p></li><li><p>How will the use of this technology affect how I relate to the world around me?</p></li><li><p>What practices will the use of this technology cultivate?</p></li><li><p>What practices will the use of this technology displace?</p></li><li><p>What will the use of this technology encourage me to notice?</p></li><li><p>What will the use of this technology encourage me to ignore?</p></li><li><p>What was required of other human beings so that I might be able to use this technology?</p></li><li><p>What was required of other creatures so that I might be able to use this technology?</p></li><li><p>What was required of the earth so that I might be able to use this technology?</p></li><li><p>Does the use of this technology bring me joy? [N.B. This was years before I even heard of Marie Kondo!]</p></li><li><p>Does the use of this technology arouse anxiety?</p></li><li><p>How does this technology empower me? At whose expense?</p></li><li><p>What feelings does the use of this technology generate in me toward others?</p></li><li><p>Can I imagine living without this technology? 
Why, or why not?</p></li><li><p>How does this technology encourage me to allocate my time?</p></li><li><p>Could the resources used to acquire and use this technology be better deployed?</p></li><li><p>Does this technology automate or outsource labor or responsibilities that are morally essential?</p></li><li><p>What desires does the use of this technology generate?</p></li><li><p>What desires does the use of this technology dissipate?</p></li><li><p>What possibilities for action does this technology present? Is it good that these actions are now possible?</p></li><li><p>What possibilities for action does this technology foreclose? Is it good that these actions are no longer possible?</p></li><li><p>How does the use of this technology shape my vision of a good life?</p></li><li><p>What limits does the use of this technology impose upon me?</p></li><li><p>What limits does my use of this technology impose upon others?</p></li><li><p>What does my use of this technology require of others who would (or must) interact with me?</p></li><li><p>What assumptions about the world does the use of this technology tacitly encourage?</p></li><li><p>What knowledge has the use of this technology disclosed to me about myself?</p></li><li><p>What knowledge has the use of this technology disclosed to me about others? Is it good to have this knowledge?</p></li><li><p>What are the potential harms to myself, others, or the world that might result from my use of this technology?</p></li><li><p>Upon what systems, technical or human, does my use of this technology depend? Are these systems just?</p></li><li><p>Does my use of this technology encourage me to view others as a means to an end?</p></li><li><p>Does using this technology require me to think more or less?</p></li><li><p>What would the world be like if everyone used this technology exactly as I use it?</p></li><li><p>What risks will my use of this technology entail for others? 
Have they consented?</p></li><li><p>Can the consequences of my use of this technology be undone? Can I live with those consequences?</p></li><li><p>Does my use of this technology make it easier to live as if I had no responsibilities toward my neighbor?</p></li><li><p>Can I be held responsible for the actions which this technology empowers? Would I feel better if I couldn’t?</p></li></ol>
</article>


<hr>

<footer>
<p>
<a href="/david/" title="Aller à l’accueil"><svg class="icon icon-home">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-home"></use>
</svg> Accueil</a> •
<a href="/david/log/" title="Accès au flux RSS"><svg class="icon icon-rss2">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-rss2"></use>
</svg> Suivre</a> •
<a href="http://larlet.com" title="Go to my English profile" data-instant><svg class="icon icon-user-tie">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-user-tie"></use>
</svg> Pro</a> •
<a href="mailto:david%40larlet.fr" title="Envoyer un courriel"><svg class="icon icon-mail">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-mail"></use>
</svg> Email</a> •
<abbr class="nowrap" title="Hébergeur : Alwaysdata, 62 rue Tiquetonne 75002 Paris, +33184162340"><svg class="icon icon-hammer2">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-hammer2"></use>
</svg> Légal</abbr>
</p>
<template id="theme-selector">
<form>
<fieldset>
<legend><svg class="icon icon-brightness-contrast">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-brightness-contrast"></use>
</svg> Thème</legend>
<label>
<input type="radio" value="auto" name="chosen-color-scheme" checked> Auto
</label>
<label>
<input type="radio" value="dark" name="chosen-color-scheme"> Foncé
</label>
<label>
<input type="radio" value="light" name="chosen-color-scheme"> Clair
</label>
</fieldset>
</form>
</template>
</footer>
<script src="/static/david/js/instantpage-5.1.0.min.js" type="module"></script>
<script>
function loadThemeForm(templateName) {
const themeSelectorTemplate = document.querySelector(templateName)
const form = themeSelectorTemplate.content.firstElementChild
themeSelectorTemplate.replaceWith(form)

form.addEventListener('change', (e) => {
const chosenColorScheme = e.target.value
localStorage.setItem('theme', chosenColorScheme)
toggleTheme(chosenColorScheme)
})

const selectedTheme = localStorage.getItem('theme')
if (selectedTheme && selectedTheme !== 'undefined') {
form.querySelector(`[value="${selectedTheme}"]`).checked = true
}
}

const prefersColorSchemeDark = '(prefers-color-scheme: dark)'
window.addEventListener('load', () => {
let hasDarkRules = false
for (const styleSheet of Array.from(document.styleSheets)) {
let mediaRules = []
for (const cssRule of styleSheet.cssRules) {
if (cssRule.type !== CSSRule.MEDIA_RULE) {
continue
}
// WARNING: Safari does not have/supports `conditionText`.
if (cssRule.conditionText) {
if (cssRule.conditionText !== prefersColorSchemeDark) {
continue
}
} else {
if (cssRule.cssText.startsWith(prefersColorSchemeDark)) {
continue
}
}
mediaRules = mediaRules.concat(Array.from(cssRule.cssRules))
}

// WARNING: do not try to insert a Rule to a styleSheet you are
// currently iterating on, otherwise the browser will be stuck
// in a infinite loop…
for (const mediaRule of mediaRules) {
styleSheet.insertRule(mediaRule.cssText)
hasDarkRules = true
}
}
if (hasDarkRules) {
loadThemeForm('#theme-selector')
}
})
</script>
</body>
</html>

+ 5
- 0
cache/2021/b404382125c07935b98295a801049097/index.md
File diff suppressed because it is too large


+ 455
- 0
cache/2021/b56bb56209a04e6144454283a22311ad/index.html

@@ -0,0 +1,455 @@
<!doctype html><!-- This is a valid HTML5 document. -->
<!-- Screen readers, SEO, extensions and so on. -->
<html lang="fr">
<!-- Has to be within the first 1024 bytes, hence before the `title` element
See: https://www.w3.org/TR/2012/CR-html5-20121217/document-metadata.html#charset -->
<meta charset="utf-8">
<!-- Why no `X-UA-Compatible` meta: https://stackoverflow.com/a/6771584 -->
<!-- The viewport meta is quite crowded and we are responsible for that.
See: https://codepen.io/tigt/post/meta-viewport-for-2015 -->
<meta name="viewport" content="width=device-width,initial-scale=1">
<!-- Required to make a valid HTML5 document. -->
<title>Building a full-text search engine in 150 lines of Python code (archive) — David Larlet</title>
<meta name="description" content="Publication mise en cache pour en conserver une trace.">
<!-- That good ol' feed, subscribe :). -->
<link rel="alternate" type="application/atom+xml" title="Feed" href="/david/log/">
<!-- Generated from https://realfavicongenerator.net/ such a mess. -->
<link rel="apple-touch-icon" sizes="180x180" href="/static/david/icons2/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/static/david/icons2/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/static/david/icons2/favicon-16x16.png">
<link rel="manifest" href="/static/david/icons2/site.webmanifest">
<link rel="mask-icon" href="/static/david/icons2/safari-pinned-tab.svg" color="#07486c">
<link rel="shortcut icon" href="/static/david/icons2/favicon.ico">
<meta name="msapplication-TileColor" content="#f7f7f7">
<meta name="msapplication-config" content="/static/david/icons2/browserconfig.xml">
<meta name="theme-color" content="#f7f7f7" media="(prefers-color-scheme: light)">
<meta name="theme-color" content="#272727" media="(prefers-color-scheme: dark)">
<!-- Documented, feel free to shoot an email. -->
<link rel="stylesheet" href="/static/david/css/style_2021-01-20.css">
<!-- See https://www.zachleat.com/web/comprehensive-webfonts/ for the trade-off. -->
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<script>
function toggleTheme(themeName) {
document.documentElement.classList.toggle(
'forced-dark',
themeName === 'dark'
)
document.documentElement.classList.toggle(
'forced-light',
themeName === 'light'
)
}
const selectedTheme = localStorage.getItem('theme')
if (selectedTheme !== 'undefined') {
toggleTheme(selectedTheme)
}
</script>

<meta name="robots" content="noindex, nofollow">
<meta content="origin-when-cross-origin" name="referrer">
<!-- Canonical URL for SEO purposes -->
<link rel="canonical" href="https://bart.degoe.de/building-a-full-text-search-engine-150-lines-of-code/">

<body class="remarkdown h1-underline h2-underline h3-underline em-underscore hr-center ul-star pre-tick" data-instant-intensity="viewport-all">


<article>
<header>
<h1>Building a full-text search engine in 150 lines of Python code</h1>
</header>
<nav>
<p class="center">
<a href="/david/" title="Aller à l’accueil"><svg class="icon icon-home">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-home"></use>
</svg> Accueil</a> •
<a href="https://bart.degoe.de/building-a-full-text-search-engine-150-lines-of-code/" title="Lien vers le contenu original">Source originale</a>
</p>
</nav>
<hr>
<p>Full-text search is everywhere. Whether you were finding a book on Scribd, a movie on Netflix, toilet paper on Amazon, or anything else on the web through Google (like <a href="https://localghost.dev/2019/09/everything-i-googled-in-a-week-as-a-professional-software-engineer/">how to do your job as a software engineer</a>), you’ve searched vast amounts of unstructured data multiple times today. What’s even more amazing is that, even though you searched millions (or <a href="https://www.worldwidewebsize.com/">billions</a>) of records, you got a response in milliseconds. In this post, we are going to explore the basic components of a full-text search engine, and use them to build one that can search across millions of documents and rank them according to their relevance in milliseconds, in fewer than 150 lines of Python code!</p>
<div id="player">
<p class="listen">Listen to this article instead</p>

<audio controls class="audio_controls " preload="metadata">
<source src="https://bart.degoe.de/audio/2021-03-24-python-full-text-search-engine.mp3" type="audio/mp3">
Your browser does not support the audio element.
</audio>
</div>
<h1 id="data">Data</h1>
<p>All the code in this blog post can be found on <a href="https://github.com/bartdegoede/python-searchengine/">GitHub</a>. I’ll provide links with the code snippets here, so you can try running this yourself. You can run the full example by installing <a href="https://github.com/bartdegoede/python-searchengine/blob/master/requirements.txt">the requirements</a> (<code>pip install -r requirements.txt</code>) and <a href="https://github.com/bartdegoede/python-searchengine/blob/master/run.py">running <code>python run.py</code></a>. This will download all the data and execute the example query with and without rankings.</p>
<p>Before we jump into building a search engine, we first need some full-text, unstructured data to search. We are going to be searching abstracts of articles from the English Wikipedia, currently a gzipped XML file of about 785 MB that contains about 6.27 million abstracts<sup id="fnref:1"></sup>. I’ve written <a href="https://github.com/bartdegoede/python-searchengine/blob/master/download.py">a simple function to download</a> the gzipped XML, but you can also just manually download the file.</p>
<h2 id="data-preparation">Data preparation</h2>
<p>The file is one large XML file that contains all abstracts. One abstract in this file is contained by a <code>&lt;doc&gt;</code> element, and looks roughly like this (I’ve omitted elements we’re not interested in):</p>
<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span>&lt;doc&gt;</span>
<span>&lt;title&gt;</span>Wikipedia: London Beer Flood<span>&lt;/title&gt;</span>
<span>&lt;url&gt;</span>https://en.wikipedia.org/wiki/London_Beer_Flood<span>&lt;/url&gt;</span>
<span>&lt;abstract&gt;</span>The London Beer Flood was an accident at Meux <span>&amp;</span> Co's Horse Shoe Brewery, London, on 17 October 1814. It took place when one of the wooden vats of fermenting porter burst.<span>&lt;/abstract&gt;</span>
...
<span>&lt;/doc&gt;</span>
</code></pre></div>
<p>The bits we’re interested in are the <code>title</code>, the <code>url</code> and the <code>abstract</code> text itself. We’ll represent documents with a <a href="https://realpython.com/python-data-classes/">Python dataclass</a> for convenient data access. We’ll add a property that concatenates the title and the contents of the abstract. You can find the code <a href="https://github.com/bartdegoede/python-searchengine/blob/master/search/documents.py">here</a>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>from</span> dataclasses <span>import</span> dataclass

<span>@dataclass</span>
<span>class</span> <span>Abstract</span>:
    <span>"""Wikipedia abstract"""</span>
    ID: int
    title: str
    abstract: str
    url: str

    <span>@property</span>
    <span>def</span> <span>fulltext</span>(self):
        <span>return</span> <span>' '</span><span>.</span>join([self<span>.</span>title, self<span>.</span>abstract])
</code></pre></div>
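<p>As a quick illustration of how such a dataclass behaves, here is a minimal, self-contained sketch. The class is repeated so the snippet runs on its own, and the abstract text is shortened, made-up sample data rather than a real record from the dump.</p>

```python
from dataclasses import dataclass


@dataclass
class Abstract:
    """Wikipedia abstract (same fields as the class above)."""
    ID: int
    title: str
    abstract: str
    url: str

    @property
    def fulltext(self):
        # title and abstract joined by a single space
        return ' '.join([self.title, self.abstract])


# Sample values are illustrative only.
doc = Abstract(
    ID=1,
    title='Wikipedia: London Beer Flood',
    abstract='The London Beer Flood was an accident at a London brewery.',
    url='https://en.wikipedia.org/wiki/London_Beer_Flood',
)
print(doc.fulltext)
```

<p>The <code>@dataclass</code> decorator generates <code>__init__</code>, <code>__repr__</code> and <code>__eq__</code>, which is why the fields can be passed as keyword arguments without writing a constructor.</p>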
<p>Then, we’ll want to extract the abstracts data from the XML and parse it so we can create instances of our <code>Abstract</code> object. We are going to stream through the gzipped XML without loading the entire file into memory first<sup id="fnref:2"></sup>. We’ll assign each document an ID in order of loading (i.e. the first document will have ID=1, the second one will have ID=2, and so on). You can find the code <a href="https://github.com/bartdegoede/python-searchengine/blob/master/load.py">here</a>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> gzip
<span>from</span> lxml <span>import</span> etree

<span>from</span> search.documents <span>import</span> Abstract

<span>def</span> <span>load_documents</span>():
    <span># open a filehandle to the gzipped Wikipedia dump</span>
    <span>with</span> gzip<span>.</span>open(<span>'data/enwiki.latest-abstract.xml.gz'</span>, <span>'rb'</span>) <span>as</span> f:
        doc_id <span>=</span> <span>1</span>
        <span># iterparse will yield the entire `doc` element once it finds the</span>
        <span># closing `&lt;/doc&gt;` tag</span>
        <span>for</span> _, element <span>in</span> etree<span>.</span>iterparse(f, events<span>=</span>(<span>'end'</span>,), tag<span>=</span><span>'doc'</span>):
            title <span>=</span> element<span>.</span>findtext(<span>'./title'</span>)
            url <span>=</span> element<span>.</span>findtext(<span>'./url'</span>)
            abstract <span>=</span> element<span>.</span>findtext(<span>'./abstract'</span>)

            <span>yield</span> Abstract(ID<span>=</span>doc_id, title<span>=</span>title, url<span>=</span>url, abstract<span>=</span>abstract)

            doc_id <span>+=</span> <span>1</span>
            <span># the `element.clear()` call will explicitly free up the memory</span>
            <span># used to store the element</span>
            element<span>.</span>clear()
</code></pre></div>
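<p>To try the streaming idea without downloading the dump or installing <code>lxml</code>, the sketch below swaps in the standard library’s <code>xml.etree.ElementTree</code> and a tiny gzipped document built in memory. The helper names (<code>build_sample_dump</code>, <code>iter_docs</code>), the root element name and the sample titles and URLs are assumptions made for illustration, not part of the original code.</p>

```python
import gzip
import io
import xml.etree.ElementTree as ET


def build_sample_dump(titles):
    """Build a tiny gzipped XML dump in memory (hypothetical test data)."""
    feed = ET.Element('feed')
    for title in titles:
        doc = ET.SubElement(feed, 'doc')
        ET.SubElement(doc, 'title').text = title
        ET.SubElement(doc, 'url').text = 'https://example.org/' + title.replace(' ', '_')
        ET.SubElement(doc, 'abstract').text = 'Abstract of ' + title
    return gzip.compress(ET.tostring(feed))


def iter_docs(fileobj):
    """Yield (doc_id, title, url, abstract) tuples one document at a time."""
    doc_id = 1
    for _, element in ET.iterparse(fileobj, events=('end',)):
        # The stdlib iterparse has no `tag` argument, so filter by hand.
        if element.tag != 'doc':
            continue
        yield (doc_id, element.findtext('./title'),
               element.findtext('./url'), element.findtext('./abstract'))
        doc_id += 1
        # Drop the children and text of the element we just processed.
        element.clear()


dump = build_sample_dump(['London Beer Flood', 'Great Molasses Flood'])
with gzip.open(io.BytesIO(dump), 'rb') as f:
    docs = list(iter_docs(f))
print(docs[0])
```

<p>Because <code>iter_docs</code> is a generator, only one <code>doc</code> element needs to be fully materialised at a time; the <code>list(...)</code> call here is only for the demonstration.</p>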
<h1 id="indexing">Indexing</h1>
<p>We are going to store this in a data structure known as an <a href="https://en.wikipedia.org/wiki/Inverted_index">“inverted index” or a “postings list”</a>. Think of it as the index in the back of a book that has an alphabetized list of relevant words and concepts, and on what page number a reader can find them.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/book-index-1080x675.png"> <figcaption>
<h4>Back of the book index</h4>
</figcaption>
</figure>
<p>Practically, what this means is that we’re going to create a dictionary where we map all the words in our corpus to the IDs of the documents they occur in. That will look something like this:</p>
<div class="highlight"><pre><code class="language-json" data-lang="json">{
<span>...</span>
<span>"london"</span>: [<span>5245250</span>, <span>2623812</span>, <span>133455</span>, <span>3672401</span>, <span>...</span>],
<span>"beer"</span>: [<span>1921376</span>, <span>4411744</span>, <span>684389</span>, <span>2019685</span>, <span>...</span>],
<span>"flood"</span>: [<span>3772355</span>, <span>2895814</span>, <span>3461065</span>, <span>5132238</span>, <span>...</span>],
<span>...</span>
}
</code></pre></div>
<p>Note that in the example above the words in the dictionary are lowercased; before building the index we are going to break down or <code>analyze</code> the raw text into a list of words or <code>tokens</code>. The idea is that we first break up or <code>tokenize</code> the text into words, and then apply zero or more <code>filters</code> (such as lowercasing or stemming) on each token to improve the odds of matching queries to text.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/tokenization.png"> <figcaption>
<h4>Tokenization</h4>
</figcaption>
</figure>
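<p>A toy version of this mapping can be sketched in a few lines. The three documents below are invented, and the analysis step is reduced to lowercasing and whitespace splitting, so this only illustrates the shape of the data structure.</p>

```python
from collections import defaultdict

# Made-up miniature corpus: document ID -> text.
documents = {
    1: 'London Beer Flood',
    2: 'Beer in London',
    3: 'Great Flood of 1862',
}

# token -> set of IDs of the documents that contain it
index = defaultdict(set)
for doc_id, text in documents.items():
    # naive analysis: lowercase, then split on whitespace
    for token in text.lower().split():
        index[token].add(doc_id)

print(sorted(index['london']))  # [1, 2]
print(sorted(index['flood']))   # [1, 3]
# documents that contain both words: intersect the two posting sets
print(sorted(index['london'].intersection(index['flood'])))  # [1]
```

<p>Looking up a word is a single dictionary access, no matter how many documents are indexed, which is what makes this structure attractive for search.</p>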
<h2 id="analysis">Analysis</h2>
<p>We are going to apply very simple tokenization, by just splitting the text on whitespace. Then, we are going to apply a couple of filters on each of the tokens: we are going to lowercase each token, remove any punctuation, remove the 25 most common words in the English language (and the word “wikipedia” because it occurs in every title in every abstract) and apply <a href="https://en.wikipedia.org/wiki/Stemming">stemming</a> to every word (ensuring that different forms of a word map to the same stem, like <em>brewery</em> and <em>breweries</em><sup id="fnref:3"></sup>).</p>
<p>The tokenization, lowercase and stem filters are very simple:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> Stemmer

STEMMER <span>=</span> Stemmer<span>.</span>Stemmer(<span>'english'</span>)

<span>def</span> <span>tokenize</span>(text):
    <span>return</span> text<span>.</span>split()

<span>def</span> <span>lowercase_filter</span>(tokens):
    <span>return</span> [token<span>.</span>lower() <span>for</span> token <span>in</span> tokens]

<span>def</span> <span>stem_filter</span>(tokens):
    <span>return</span> STEMMER<span>.</span>stemWords(tokens)
</code></pre></div>
<p>The punctuation filter is nothing more than a regular expression that strips every character in the set of punctuation:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> re
<span>import</span> string

PUNCTUATION <span>=</span> re<span>.</span>compile(<span>'[</span><span>%s</span><span>]'</span> <span>%</span> re<span>.</span>escape(string<span>.</span>punctuation))

<span>def</span> <span>punctuation_filter</span>(tokens):
    <span>return</span> [PUNCTUATION<span>.</span>sub(<span>''</span>, token) <span>for</span> token <span>in</span> tokens]
</code></pre></div>
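<p>To see what this filter does to real tokens, the following sketch runs it on a short, made-up phrase. Note that <code>string.punctuation</code> only covers ASCII punctuation, and that a token consisting solely of punctuation becomes an empty string.</p>

```python
import re
import string

# Character class matching any single ASCII punctuation character.
PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))


def punctuation_filter(tokens):
    return [PUNCTUATION.sub('', token) for token in tokens]


tokens = "co's horse shoe brewery, london, - 1814.".split()
print(punctuation_filter(tokens))
# ['cos', 'horse', 'shoe', 'brewery', 'london', '', '1814']
```

<p>The empty string left behind by the lone <code>-</code> is the kind of leftover that has to be dropped at the end of the analysis pipeline.</p>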
<p>Stopwords are very common words that we would expect to occur in (almost) every document in the corpus. As such, they won’t contribute much when we search for them (i.e. (almost) every document will match when we search for those terms) and will just take up space, so we will filter them out at index time. The Wikipedia abstract corpus includes the word “Wikipedia” in every title, so we’ll add that word to the stopword list as well. We drop the 25 most common words in English.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># top 25 most common words in English and "wikipedia":</span>
<span># https://en.wikipedia.org/wiki/Most_common_words_in_English</span>
<span># entries are lowercase because tokens are lowercased before this filter runs</span>
STOPWORDS <span>=</span> set([<span>'the'</span>, <span>'be'</span>, <span>'to'</span>, <span>'of'</span>, <span>'and'</span>, <span>'a'</span>, <span>'in'</span>, <span>'that'</span>, <span>'have'</span>,
                 <span>'i'</span>, <span>'it'</span>, <span>'for'</span>, <span>'not'</span>, <span>'on'</span>, <span>'with'</span>, <span>'he'</span>, <span>'as'</span>, <span>'you'</span>,
                 <span>'do'</span>, <span>'at'</span>, <span>'this'</span>, <span>'but'</span>, <span>'his'</span>, <span>'by'</span>, <span>'from'</span>, <span>'wikipedia'</span>])

<span>def</span> <span>stopword_filter</span>(tokens):
    <span>return</span> [token <span>for</span> token <span>in</span> tokens <span>if</span> token <span>not</span> <span>in</span> STOPWORDS]
</code></pre></div>
<p>Bringing all these filters together, we’ll <a href="https://github.com/bartdegoede/python-searchengine/blob/master/search/analysis.py#L28-L35">construct an <code>analyze</code> function</a> that will operate on the <code>text</code> in each abstract; it will tokenize the text into individual words (or rather, <em>tokens</em>), and then apply each filter in succession to the list of tokens. The order is important, because we use a non-stemmed list of stopwords, so we should apply the <code>stopword_filter</code> before the <code>stem_filter</code>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>analyze</span>(text):
    tokens <span>=</span> tokenize(text)
    tokens <span>=</span> lowercase_filter(tokens)
    tokens <span>=</span> punctuation_filter(tokens)
    tokens <span>=</span> stopword_filter(tokens)
    tokens <span>=</span> stem_filter(tokens)

    <span>return</span> [token <span>for</span> token <span>in</span> tokens <span>if</span> token]
</code></pre></div>
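<p>To see why the order of the filters matters, here is a minimal, self-contained sketch of the same pipeline. It assumes a hypothetical <code>toy_stem</code> function that only strips a trailing “s”, standing in for the real stemmer so the example runs without extra dependencies; the stopword list is shortened and illustrative for the same reason.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">import re
import string

PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))
# shortened, illustrative stopword list (not the full list used above)
STOPWORDS = {'the', 'was', 'an', 'at', 'wikipedia'}

def toy_stem(token):
    # hypothetical stand-in for a real stemmer: strip one trailing "s"
    return token[:-1] if token.endswith('s') else token

def analyze(text):
    tokens = text.split()
    tokens = [token.lower() for token in tokens]
    tokens = [PUNCTUATION.sub('', token) for token in tokens]
    tokens = [token for token in tokens if token not in STOPWORDS]
    tokens = [toy_stem(token) for token in tokens]
    # tokens that consisted only of punctuation are empty strings by now
    return [token for token in tokens if token]

print(analyze('Wikipedia: The London Beer Flood was an accident!'))
</code></pre></div>
<p>With this toy stemmer, stemming first would turn “was” into “wa”, which no longer appears in the stopword list; filtering stopwords before stemming avoids that mismatch.</p>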
<h2 id="indexing-the-corpus">Indexing the corpus</h2>
<p>We’ll create an <code>Index</code> class that will store the <code>index</code> and the <code>documents</code>. The <code>documents</code> dictionary stores the dataclasses by ID, and the <code>index</code> keys will be the tokens, with the values being the document IDs the token occurs in:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>class</span> <span>Index</span>:
    <span>def</span> __init__(self):
        self<span>.</span>index <span>=</span> {}
        self<span>.</span>documents <span>=</span> {}

    <span>def</span> <span>index_document</span>(self, document):
        <span>if</span> document<span>.</span>ID <span>not</span> <span>in</span> self<span>.</span>documents:
            self<span>.</span>documents[document<span>.</span>ID] <span>=</span> document

        <span>for</span> token <span>in</span> analyze(document<span>.</span>fulltext):
            <span>if</span> token <span>not</span> <span>in</span> self<span>.</span>index:
                self<span>.</span>index[token] <span>=</span> set()
            self<span>.</span>index[token]<span>.</span>add(document<span>.</span>ID)
</code></pre></div>
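<p>As a rough illustration of what the two dictionaries end up holding, the sketch below indexes two tiny documents. <code>ToyIndex</code> is a hypothetical, simplified variant that takes plain strings and, as an assumption, uses lowercased whitespace splitting in place of <code>analyze</code>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">class ToyIndex:
    def __init__(self):
        self.index = {}      # token: set of document IDs
        self.documents = {}  # document ID: original text

    def index_document(self, doc_id, text):
        self.documents[doc_id] = text
        # assumption: lowercased whitespace splitting stands in for analyze()
        for token in text.lower().split():
            self.index.setdefault(token, set()).add(doc_id)

toy = ToyIndex()
toy.index_document(1, 'London Beer Flood')
toy.index_document(2, 'Horse Shoe Brewery London')

print(sorted(toy.index['london']))  # IDs of the documents mentioning "london"
print(sorted(toy.index['beer']))    # IDs of the documents mentioning "beer"
</code></pre></div>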
<h1 id="searching">Searching</h1>
<p>Now that we have all tokens indexed, searching for a query becomes a matter of analyzing the query text with the same analyzer as we applied to the documents; this way we’ll end up with tokens that should match the tokens we have in the index. For each token, we’ll do a lookup in the dictionary, finding the document IDs that the token occurs in. We do this for every token, and then find the IDs of the documents that occur in all of these sets (i.e. for a document to match the query, it needs to contain all the tokens in the query). We will then take the resulting list of document IDs, and fetch the actual data from our <code>documents</code> store<sup id="fnref:4"></sup>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>_results</span>(self, analyzed_query):
    <span>return</span> [self<span>.</span>index<span>.</span>get(token, set()) <span>for</span> token <span>in</span> analyzed_query]

<span>def</span> <span>search</span>(self, query):
    <span>"""
</span><span>    Boolean search; this will return documents that contain all words from the
</span><span>    query, but not rank them (sets are fast, but unordered).
</span><span>    """</span>
    analyzed_query <span>=</span> analyze(query)
    results <span>=</span> self<span>.</span>_results(analyzed_query)
    documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>intersection(<span>*</span>results)]

    <span>return</span> documents


In [<span>1</span>]: index<span>.</span>search(<span>'London Beer Flood'</span>)
search took <span>0.16307830810546875</span> milliseconds
Out[<span>1</span>]:
[Abstract(ID<span>=</span><span>1501027</span>, title<span>=</span><span>'Wikipedia: Horse Shoe Brewery'</span>, abstract<span>=</span><span>'The Horse Shoe Brewery was an English brewery in the City of Westminster that was established in 1764 and became a major producer of porter, from 1809 as Henry Meux &amp; Co. It was the site of the London Beer Flood in 1814, which killed eight people after a porter vat burst.'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Horse_Shoe_Brewery'</span>),
Abstract(ID<span>=</span><span>1828015</span>, title<span>=</span><span>'Wikipedia: London Beer Flood'</span>, abstract<span>=</span><span>"The London Beer Flood was an accident at Meux &amp; Co's Horse Shoe Brewery, London, on 17 October 1814. It took place when one of the wooden vats of fermenting porter burst."</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/London_Beer_Flood'</span>)]
</code></pre></div>
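<p>One edge case is worth keeping in mind: if a query consists only of stopwords (say, “the”), the analyzer returns an empty list, and calling <code>set.intersection</code> without any arguments raises a <code>TypeError</code>. The sketch below shows the intersection step on toy posting sets, with a hypothetical <code>match_all</code> helper that guards against that case; the IDs are made up for illustration.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">def match_all(postings):
    # postings: one set of document IDs per query token
    if not postings:
        # every token was filtered out; set.intersection() would raise TypeError
        return set()
    return set.intersection(*postings)

# made-up posting sets for illustration
index = {'london': {1, 2, 3}, 'beer': {1, 3}, 'flood': {3, 4}}
query = ['london', 'beer', 'flood']

print(match_all([index.get(token, set()) for token in query]))
print(match_all([]))
</code></pre></div>
<p>Only document 3 occurs in all three sets, so it is the only match; the empty query simply matches nothing.</p>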
<p>Now, this will make our queries very precise, especially for long query strings (the more tokens our query contains, the less likely it’ll be that there will be a document that has all of these tokens). We could optimize our search function for <a href="https://en.wikipedia.org/wiki/Precision_and_recall">recall rather than precision</a> by allowing users to specify that only one occurrence of a token is enough to match our query:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>search</span>(self, query, search_type<span>=</span><span>'AND'</span>):
    <span>"""
</span><span>    Still boolean search; this will return documents that contain either all words
</span><span>    from the query or just one of them, depending on the search_type specified.
</span><span>
</span><span>    We are still not ranking the results (sets are fast, but unordered).
</span><span>    """</span>
    <span>if</span> search_type <span>not</span> <span>in</span> (<span>'AND'</span>, <span>'OR'</span>):
        <span>return</span> []

    analyzed_query <span>=</span> analyze(query)
    results <span>=</span> self<span>.</span>_results(analyzed_query)
    <span>if</span> search_type <span>==</span> <span>'AND'</span>:
        <span># all tokens must be in the document</span>
        documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>intersection(<span>*</span>results)]
    <span>if</span> search_type <span>==</span> <span>'OR'</span>:
        <span># only one token has to be in the document</span>
        documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>union(<span>*</span>results)]

    <span>return</span> documents


In [<span>2</span>]: index<span>.</span>search(<span>'London Beer Flood'</span>, search_type<span>=</span><span>'OR'</span>)
search took <span>0.02816295623779297</span> seconds
Out[<span>2</span>]:
[Abstract(ID<span>=</span><span>5505026</span>, title<span>=</span><span>'Wikipedia: Addie Pryor'</span>, abstract<span>=</span><span>'| birth_place = London, England'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Addie_Pryor'</span>),
Abstract(ID<span>=</span><span>1572868</span>, title<span>=</span><span>'Wikipedia: Tim Steward'</span>, abstract<span>=</span><span>'|birth_place = London, United Kingdom'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Tim_Steward'</span>),
Abstract(ID<span>=</span><span>5111814</span>, title<span>=</span><span>'Wikipedia: 1877 Birthday Honours'</span>, abstract<span>=</span><span>'The 1877 Birthday Honours were appointments by Queen Victoria to various orders and honours to reward and highlight good works by citizens of the British Empire. The appointments were made to celebrate the official birthday of the Queen, and were published in The London Gazette on 30 May and 2 June 1877.'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/1877_Birthday_Honours'</span>),
<span>...</span>
In [<span>3</span>]: len(index<span>.</span>search(<span>'London Beer Flood'</span>, search_type<span>=</span><span>'OR'</span>))
search took <span>0.029065370559692383</span> seconds
Out[<span>3</span>]: <span>49627</span>
</code></pre></div>
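<p>The difference between the two modes comes down to a single set operation. A minimal sketch on made-up posting sets (the IDs are illustrative, not taken from the Wikipedia data):</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"># toy posting sets for three query tokens
postings = [{1, 2, 3}, {1, 3}, {3, 4}]

and_matches = set.intersection(*postings)  # documents containing every token
or_matches = set.union(*postings)          # documents containing any token

print(sorted(and_matches))
print(sorted(or_matches))
</code></pre></div>
<p>The <code>AND</code> result is always a subset of the <code>OR</code> result, which is why the <code>OR</code> query above returns so many more documents.</p>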
<h1 id="relevancy">Relevancy</h1>
<p>We have implemented a pretty quick search engine with just some basic Python, but there’s one aspect that’s obviously missing from our little engine, and that’s the <a href="https://livebook.manning.com/book/relevant-search/chapter-1/13">idea of <strong>relevance</strong></a>. Right now we just return an unordered list of documents, and we leave it up to the user to figure out which of those they are actually interested in. Especially for large result sets, that is painful or just impossible (in our <code>OR</code> example, there are almost 50,000 results).</p>
<p>This is where the idea of relevancy comes in; what if we could assign each document a score that would indicate how well it matches the query, and just order by that score? A naive and simple way of assigning a score to a document for a given query is to just count how often that document mentions that particular word. After all, the more that document mentions that term, the more likely it is that it is about our query!</p>
<h2 id="term-frequency">Term frequency</h2>
<p>Let’s expand our <code>Abstract</code> dataclass to compute and store its term frequencies when we index it. That way, we’ll have easy access to those numbers when we want to rank our unordered list of documents:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># in documents.py</span>
<span>from</span> collections <span>import</span> Counter
<span>from</span> dataclasses <span>import</span> dataclass
<span>from</span> .analysis <span>import</span> analyze

<span>@dataclass</span>
<span>class</span> <span>Abstract</span>:
    <span># snip</span>
    <span>def</span> <span>analyze</span>(self):
        <span># Counter will create a dictionary counting the unique values in an array:</span>
        <span># {'london': 12, 'beer': 3, ...}</span>
        self<span>.</span>term_frequencies <span>=</span> Counter(analyze(self<span>.</span>fulltext))

    <span>def</span> <span>term_frequency</span>(self, term):
        <span>return</span> self<span>.</span>term_frequencies<span>.</span>get(term, <span>0</span>)
</code></pre></div>
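<p>For readers unfamiliar with <code>Counter</code>, here is a small standalone sketch of the counting step. The token list is made up and stands in for the output of the analyzer.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">from collections import Counter

# assumption: these tokens stand in for the output of analyze()
tokens = ['london', 'beer', 'flood', 'london', 'brewery', 'beer', 'london']
term_frequencies = Counter(tokens)

print(term_frequencies['london'])         # how often "london" occurs
print(term_frequencies.get('porter', 0))  # terms that never occur count as 0
</code></pre></div>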
<p>We need to make sure to generate these frequency counts when we index our data:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># in index.py we add `document.analyze()`</span>

<span>def</span> <span>index_document</span>(self, document):
    <span>if</span> document<span>.</span>ID <span>not</span> <span>in</span> self<span>.</span>documents:
        self<span>.</span>documents[document<span>.</span>ID] <span>=</span> document
        document<span>.</span>analyze()
</code></pre></div>
<p>We’ll modify our search function so we can apply a ranking to the documents in our result set. We’ll fetch the documents using the same Boolean query from the index and document store, and then, for every document in that result set, we’ll simply sum up how often each query term occurs in that document:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>search</span>(self, query, search_type<span>=</span><span>'AND'</span>, rank<span>=</span>True):
    <span># snip</span>
    <span>if</span> rank:
        <span>return</span> self<span>.</span>rank(analyzed_query, documents)
    <span>return</span> documents


<span>def</span> <span>rank</span>(self, analyzed_query, documents):
    results <span>=</span> []
    <span>if</span> <span>not</span> documents:
        <span>return</span> results
    <span>for</span> document <span>in</span> documents:
        score <span>=</span> sum([document<span>.</span>term_frequency(token) <span>for</span> token <span>in</span> analyzed_query])
        results<span>.</span>append((document, score))
    <span>return</span> sorted(results, key<span>=</span><span>lambda</span> doc: doc[<span>1</span>], reverse<span>=</span>True)
</code></pre></div>
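<p>A toy run of this scoring may help. In the sketch below, documents are represented only by made-up term counts, and <code>rank</code> is a simplified standalone variant that works on document IDs rather than <code>Abstract</code> objects.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">from collections import Counter

# made-up term counts per document ID
documents = {
    1: Counter(['london', 'beer', 'flood', 'beer']),
    2: Counter(['london', 'gazette']),
    3: Counter(['beer', 'beer', 'beer', 'brewery']),
}

def rank(analyzed_query, doc_ids):
    results = []
    for doc_id in doc_ids:
        # sum how often each query term occurs in the document
        score = sum(documents[doc_id].get(token, 0) for token in analyzed_query)
        results.append((doc_id, score))
    return sorted(results, key=lambda result: result[1], reverse=True)

print(rank(['london', 'beer'], [1, 2, 3]))
</code></pre></div>
<p>Note that document 3 never mentions “london”, yet it scores as high as document 1 purely because it repeats “beer”; plain term counts treat every query term as equally informative.</p>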
<h2 id="inverse-document-frequency">Inverse Document Frequency</h2>
<p>That’s already a lot better, but there are some obvious shortcomings. We’re considering all query terms to be of equivalent value when assessing the relevancy for the query. However, it’s likely that certain terms have very little to no discriminating power when determining relevancy; for example, a collection with lots of documents about beer would be expected to have the term “beer” appear often in almost every document (in fact, we’re already trying to address that by dropping the 25 most common English words from the index). Searching for the word “beer” in such a case would essentially give us another random ordering.</p>
<p>In order to address that, we’ll add another component to our scoring algorithm that will reduce the contribution of terms that occur very often in the index to the final score. We could use the <em>collection frequency</em> of a term (i.e. how often does this term occur across <em>all</em> documents), but <a href="https://nlp.stanford.edu/IR-book/html/htmledition/inverse-document-frequency-1.html">in practice</a> the <em>document frequency</em> is used instead (i.e. how many <em>documents</em> in the index contain this term). We’re trying to rank documents after all, so it makes sense to have a document level statistic.</p>
<p>We’ll compute the <em>inverse document frequency</em> for a term by dividing the number of documents (<em>N</em>) in the index by the number of documents that contain the term, and taking the logarithm of that.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/idf.jpg"> <figcaption>
<h4>IDF; taken from https://moz.com/blog/inverse-document-frequency-and-the-importance-of-uniqueness</h4>
</figcaption>
</figure>
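<p>To get a feel for the numbers, the sketch below evaluates that formula for a hypothetical index of one million documents (a round figure chosen for illustration, not the size of the Wikipedia corpus), using base 10 for the logarithm.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">import math

def inverse_document_frequency(document_count, document_frequency):
    return math.log10(document_count / document_frequency)

N = 1_000_000  # hypothetical index size, chosen for round numbers

for df in (10, 1_000, 1_000_000):
    print(df, round(inverse_document_frequency(N, df), 2))
</code></pre></div>
<p>A term that occurs in only 10 documents gets a weight of 5, while a term that occurs in every document gets a weight of 0 and no longer influences the ranking at all.</p>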
<p>We’ll then simply multiply the term frequency with the inverse document frequency during our ranking, so matches on terms that are rare in the corpus will contribute more to the relevancy score<sup id="fnref:5"></sup>. We can easily compute the inverse document frequency from the data available in our index:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># index.py</span>
<span>import</span> math

<span>def</span> <span>document_frequency</span>(self, token):
    <span>return</span> len(self<span>.</span>index<span>.</span>get(token, set()))

<span>def</span> <span>inverse_document_frequency</span>(self, token):
    <span># Manning, Raghavan and Schütze use log10, so we do too, even though it</span>
    <span># doesn't really matter which log we use anyway</span>
    <span># https://nlp.stanford.edu/IR-book/html/htmledition/inverse-document-frequency-1.html</span>
    <span>return</span> math<span>.</span>log10(len(self<span>.</span>documents) <span>/</span> self<span>.</span>document_frequency(token))

<span>def</span> <span>rank</span>(self, analyzed_query, documents):
    results <span>=</span> []
    <span>if</span> <span>not</span> documents:
        <span>return</span> results
    <span>for</span> document <span>in</span> documents:
        score <span>=</span> <span>0.0</span>
        <span>for</span> token <span>in</span> analyzed_query:
            tf <span>=</span> document<span>.</span>term_frequency(token)
            idf <span>=</span> self<span>.</span>inverse_document_frequency(token)
            score <span>+=</span> tf <span>*</span> idf
        results<span>.</span>append((document, score))
    <span>return</span> sorted(results, key<span>=</span><span>lambda</span> doc: doc[<span>1</span>], reverse<span>=</span>True)
</code></pre></div>
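<p>One caveat with an <code>OR</code> search: a query token may not occur in the index at all, in which case its document frequency is 0 and the division raises a <code>ZeroDivisionError</code>. A possible guard is sketched below; <code>safe_idf</code> and <code>score</code> are hypothetical helpers, and returning 0 for unknown tokens is an assumption rather than the only reasonable choice. The counts are made up.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">import math

def safe_idf(document_count, document_frequency):
    # assumption: a token that occurs in no document contributes nothing
    if document_frequency == 0:
        return 0.0
    return math.log10(document_count / document_frequency)

def score(term_frequencies, document_frequencies, document_count, analyzed_query):
    total = 0.0
    for token in analyzed_query:
        tf = term_frequencies.get(token, 0)
        idf = safe_idf(document_count, document_frequencies.get(token, 0))
        total += tf * idf
    return total

# made-up counts: 1000 documents, "london" in 100 of them, "flood" in 10
document_frequencies = {'london': 100, 'flood': 10}
term_frequencies = {'london': 2, 'flood': 1}

print(score(term_frequencies, document_frequencies, 1000, ['london', 'flood', 'zzz']))
</code></pre></div>
<p>Here “flood” contributes as much with one occurrence as “london” does with two, because it is ten times rarer in this toy corpus; the unknown token “zzz” is simply ignored.</p>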
<h1 id="future-work">Future Work™</h1>
<p>And that’s a basic search engine in just a few lines of Python code! You can find all the code on <a href="https://github.com/bartdegoede/python-searchengine">GitHub</a>, and I’ve provided a utility function that will download the Wikipedia abstracts and build an index. Install the requirements, run it in your Python console of choice and have fun messing with the data structures and searching.</p>
<p>Now, obviously this is a project to illustrate the concepts of search and how it can be so fast (even with ranking, I can search and rank 6.27 million documents on my laptop with a “slow” language like Python), not production-grade software. It runs entirely in memory on my laptop, whereas libraries like Lucene utilize hyper-efficient data structures and even optimize disk seeks, and software like Elasticsearch and Solr scale Lucene to hundreds if not thousands of machines.</p>
<p>That doesn’t mean that we can’t think about fun expansions on this basic functionality though; for example, we assume that every field in the document has the same contribution to relevancy, whereas a query term match in the title should probably be weighted more strongly than a match in the description. Another fun project could be to expand the query parsing; there’s no reason why either all or just one term need to match. Why not exclude certain terms, or do <code>AND</code> and <code>OR</code> between individual terms? Can we persist the index to disk and make it scale beyond the confines of my laptop RAM?</p>
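<p>As a starting point for the persistence question, one simple option (an assumption on my part, not something the project implements) is to serialize the index dictionary with <code>pickle</code> from the standard library. The sketch below writes a toy index to a temporary directory and reads it back; note that pickle files should only be loaded when you created them yourself.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">import os
import pickle
import tempfile

# toy index standing in for the real token-to-IDs dictionary
index = {'london': {1, 2}, 'beer': {1}}

path = os.path.join(tempfile.mkdtemp(), 'index.pickle')
with open(path, 'wb') as f:
    pickle.dump(index, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == index)
</code></pre></div>
<p>This still loads the whole index into memory, so it only saves the indexing time on restart; scaling beyond RAM would need a different on-disk layout.</p>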
</article>


</body>
</html>

+ 274
- 0
cache/2021/b56bb56209a04e6144454283a22311ad/index.md View File

@@ -0,0 +1,274 @@
title: Building a full-text search engine in 150 lines of Python code
url: https://bart.degoe.de/building-a-full-text-search-engine-150-lines-of-code/
hash_url: b56bb56209a04e6144454283a22311ad

<p>Full-text search is everywhere. Whether you were finding a book on Scribd, a movie on Netflix, toilet paper on Amazon, or anything else on the web through Google (like <a href="https://localghost.dev/2019/09/everything-i-googled-in-a-week-as-a-professional-software-engineer/">how to do your job as a software engineer</a>), you’ve searched vast amounts of unstructured data multiple times today. What’s even more amazing is that, even though you searched millions (or <a href="https://www.worldwidewebsize.com/">billions</a>) of records, you got a response in milliseconds. In this post, we are going to explore the basic components of a full-text search engine, and use them to build one that can search across millions of documents and rank them according to their relevance in milliseconds, in less than 150 lines of Python code!</p>
<h1 id="data">Data</h1>
<p>All the code in this blog post can be found on <a href="https://github.com/bartdegoede/python-searchengine/">GitHub</a>. I’ll provide links with the code snippets here, so you can try running this yourself. You can run the full example by installing <a href="https://github.com/bartdegoede/python-searchengine/blob/master/requirements.txt">the requirements</a> (<code>pip install -r requirements.txt</code>) and <a href="https://github.com/bartdegoede/python-searchengine/blob/master/run.py">running <code>python run.py</code></a>. This will download all the data and execute the example query with and without rankings.</p>
<p>Before we jump into building a search engine, we first need some full-text, unstructured data to search. We are going to be searching abstracts of articles from the English Wikipedia, which are currently available as a gzipped XML file of about 785 MB containing about 6.27 million abstracts<sup id="fnref:1"></sup>. I’ve written <a href="https://github.com/bartdegoede/python-searchengine/blob/master/download.py">a simple function to download</a> the gzipped XML, but you can also just manually download the file.</p>
<h2 id="data-preparation">Data preparation</h2>
<p>The file is one large XML file that contains all abstracts. One abstract in this file is contained by a <code>&lt;doc&gt;</code> element, and looks roughly like this (I’ve omitted elements we’re not interested in):</p>
<div class="highlight"><pre><code class="language-xml" data-lang="xml"><span>&lt;doc&gt;</span>
    <span>&lt;title&gt;</span>Wikipedia: London Beer Flood<span>&lt;/title&gt;</span>
    <span>&lt;url&gt;</span>https://en.wikipedia.org/wiki/London_Beer_Flood<span>&lt;/url&gt;</span>
    <span>&lt;abstract&gt;</span>The London Beer Flood was an accident at Meux <span>&amp;</span> Co's Horse Shoe Brewery, London, on 17 October 1814. It took place when one of the wooden vats of fermenting porter burst.<span>&lt;/abstract&gt;</span>
    ...
<span>&lt;/doc&gt;</span>
</code></pre></div><p>The bits we’re interested in are the <code>title</code>, the <code>url</code> and the <code>abstract</code> text itself. We’ll represent documents with a <a href="https://realpython.com/python-data-classes/">Python dataclass</a> for convenient data access. We’ll add a property that concatenates the title and the contents of the abstract. You can find the code <a href="https://github.com/bartdegoede/python-searchengine/blob/master/search/documents.py">here</a>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>from</span> dataclasses <span>import</span> dataclass

<span>@dataclass</span>
<span>class</span> <span>Abstract</span>:
    <span>"""Wikipedia abstract"""</span>
    ID: int
    title: str
    abstract: str
    url: str

    <span>@property</span>
    <span>def</span> <span>fulltext</span>(self):
        <span>return</span> <span>' '</span><span>.</span>join([self<span>.</span>title, self<span>.</span>abstract])
</code></pre></div><p>Then, we’ll want to extract the abstract data from the XML and parse it so we can create instances of our <code>Abstract</code> object. We are going to stream through the gzipped XML without loading the entire file into memory first<sup id="fnref:2"></sup>. We’ll assign each document an ID in order of loading (i.e. the first document will have ID=1, the second one will have ID=2, etcetera). You can find the code <a href="https://github.com/bartdegoede/python-searchengine/blob/master/load.py">here</a>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> gzip
<span>from</span> lxml <span>import</span> etree

<span>from</span> search.documents <span>import</span> Abstract

<span>def</span> <span>load_documents</span>():
    <span># open a filehandle to the gzipped Wikipedia dump</span>
    <span>with</span> gzip<span>.</span>open(<span>'data/enwiki.latest-abstract.xml.gz'</span>, <span>'rb'</span>) <span>as</span> f:
        doc_id <span>=</span> <span>1</span>
        <span># iterparse will yield the entire `doc` element once it finds the</span>
        <span># closing `&lt;/doc&gt;` tag</span>
        <span>for</span> _, element <span>in</span> etree<span>.</span>iterparse(f, events<span>=</span>(<span>'end'</span>,), tag<span>=</span><span>'doc'</span>):
            title <span>=</span> element<span>.</span>findtext(<span>'./title'</span>)
            url <span>=</span> element<span>.</span>findtext(<span>'./url'</span>)
            abstract <span>=</span> element<span>.</span>findtext(<span>'./abstract'</span>)

            <span>yield</span> Abstract(ID<span>=</span>doc_id, title<span>=</span>title, url<span>=</span>url, abstract<span>=</span>abstract)

            doc_id <span>+=</span> <span>1</span>
            <span># the `element.clear()` call will explicitly free up the memory</span>
            <span># used to store the element</span>
            element<span>.</span>clear()
</code></pre></div><h1 id="indexing">Indexing</h1>
<p>We are going to store this in a data structure known as an <a href="https://en.wikipedia.org/wiki/Inverted_index">“inverted index” or a “postings list”</a>. Think of it as the index in the back of a book that has an alphabetized list of relevant words and concepts, and on what page number a reader can find them.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/book-index-1080x675.png"> <figcaption>
<h4>Back of the book index</h4>
</figcaption>
</figure>
<p>Practically, what this means is that we’re going to create a dictionary where we map all the words in our corpus to the IDs of the documents they occur in. That will look something like this:</p>
<div class="highlight"><pre><code class="language-json" data-lang="json">{
    <span>...</span>
    <span>"london"</span>: [<span>5245250</span>, <span>2623812</span>, <span>133455</span>, <span>3672401</span>, <span>...</span>],
    <span>"beer"</span>: [<span>1921376</span>, <span>4411744</span>, <span>684389</span>, <span>2019685</span>, <span>...</span>],
    <span>"flood"</span>: [<span>3772355</span>, <span>2895814</span>, <span>3461065</span>, <span>5132238</span>, <span>...</span>],
    <span>...</span>
}
</code></pre></div><p>Note that in the example above the words in the dictionary are lowercased; before building the index we are going to break down or <code>analyze</code> the raw text into a list of words or <code>tokens</code>. The idea is that we first break up or <code>tokenize</code> the text into words, and then apply zero or more <code>filters</code> (such as lowercasing or stemming) on each token to improve the odds of matching queries to text.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/tokenization.png"> <figcaption>
<h4>Tokenization</h4>
</figcaption>
</figure>
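<p>The “tokenize, then apply zero or more filters” idea can be sketched in a few lines. <code>apply_filters</code> is a hypothetical helper used only for this illustration; each filter is assumed to be a function that takes a list of tokens and returns a new list.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python">def tokenize(text):
    return text.split()

def lowercase_filter(tokens):
    return [token.lower() for token in tokens]

def apply_filters(text, filters):
    # hypothetical helper: run the tokens through each filter in order
    tokens = tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens

print(apply_filters('London Beer Flood', []))                  # no filters
print(apply_filters('London Beer Flood', [lowercase_filter]))  # one filter
</code></pre></div>
<p>Without the lowercase filter, a query for “london” would not match the token “London”, which is the kind of mismatch the filters are meant to prevent.</p>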
<h2 id="analysis">Analysis</h2>
<p>We are going to apply very simple tokenization, by just splitting the text on whitespace. Then, we are going to apply a couple of filters on each of the tokens: we are going to lowercase each token, remove any punctuation, remove the 25 most common words in the English language (and the word “wikipedia” because it occurs in every title in every abstract) and apply <a href="https://en.wikipedia.org/wiki/Stemming">stemming</a> to every word (ensuring that different forms of a word map to the same stem, like <em>brewery</em> and <em>breweries</em><sup id="fnref:3"></sup>).</p>
<p>The tokenizer, the lowercase filter and the stem filter are very simple:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> Stemmer

STEMMER <span>=</span> Stemmer<span>.</span>Stemmer(<span>'english'</span>)

<span>def</span> <span>tokenize</span>(text):
    <span>return</span> text<span>.</span>split()

<span>def</span> <span>lowercase_filter</span>(tokens):
    <span>return</span> [token<span>.</span>lower() <span>for</span> token <span>in</span> tokens]

<span>def</span> <span>stem_filter</span>(tokens):
    <span>return</span> STEMMER<span>.</span>stemWords(tokens)
</code></pre></div><p>The punctuation filter is nothing more than a regular expression that matches every character in the set of punctuation, which we then replace with an empty string:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>import</span> re
<span>import</span> string

PUNCTUATION <span>=</span> re<span>.</span>compile(<span>'[</span><span>%s</span><span>]'</span> <span>%</span> re<span>.</span>escape(string<span>.</span>punctuation))

<span>def</span> <span>punctuation_filter</span>(tokens):
    <span>return</span> [PUNCTUATION<span>.</span>sub(<span>''</span>, token) <span>for</span> token <span>in</span> tokens]
</code></pre></div><p>Stopwords are words that are very common and that we would expect to occur in (almost) every document in the corpus. As such, they won’t contribute much when we search for them (i.e. (almost) every document will match when we search for those terms) and will just take up space, so we will filter them out at index time. We drop the 25 most common words in English. The Wikipedia abstract corpus also includes the word “Wikipedia” in every title, so we’ll add that word to the stopword list as well.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># top 25 most common words in English and "wikipedia":</span>
<span># https://en.wikipedia.org/wiki/Most_common_words_in_English</span>
<span># 'i' is lowercase because tokens are lowercased before this filter runs</span>
STOPWORDS <span>=</span> set([<span>'the'</span>, <span>'be'</span>, <span>'to'</span>, <span>'of'</span>, <span>'and'</span>, <span>'a'</span>, <span>'in'</span>, <span>'that'</span>, <span>'have'</span>,
                 <span>'i'</span>, <span>'it'</span>, <span>'for'</span>, <span>'not'</span>, <span>'on'</span>, <span>'with'</span>, <span>'he'</span>, <span>'as'</span>, <span>'you'</span>,
                 <span>'do'</span>, <span>'at'</span>, <span>'this'</span>, <span>'but'</span>, <span>'his'</span>, <span>'by'</span>, <span>'from'</span>, <span>'wikipedia'</span>])

<span>def</span> <span>stopword_filter</span>(tokens):
    <span>return</span> [token <span>for</span> token <span>in</span> tokens <span>if</span> token <span>not</span> <span>in</span> STOPWORDS]
</code></pre></div><p>Bringing all these filters together, we’ll <a href="https://github.com/bartdegoede/python-searchengine/blob/master/search/analysis.py#L28-L35">construct an <code>analyze</code> function</a> that will operate on the <code>text</code> in each abstract; it will tokenize the text into individual words (or rather, <em>tokens</em>), and then apply each filter in succession to the list of tokens. The order is important, because we use a non-stemmed list of stopwords, so we should apply the <code>stopword_filter</code> before the <code>stem_filter</code>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>analyze</span>(text):
    tokens <span>=</span> tokenize(text)
    tokens <span>=</span> lowercase_filter(tokens)
    tokens <span>=</span> punctuation_filter(tokens)
    tokens <span>=</span> stopword_filter(tokens)
    tokens <span>=</span> stem_filter(tokens)

    <span>return</span> [token <span>for</span> token <span>in</span> tokens <span>if</span> token]
</code></pre></div><h2 id="indexing-the-corpus">Indexing the corpus</h2>
<p>We’ll create an <code>Index</code> class that will store the <code>index</code> and the <code>documents</code>. The <code>documents</code> dictionary stores the dataclasses by ID, and the <code>index</code> keys will be the tokens, with the values being the document IDs the token occurs in:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>class</span> <span>Index</span>:
    <span>def</span> __init__(self):
        self<span>.</span>index <span>=</span> {}
        self<span>.</span>documents <span>=</span> {}

    <span>def</span> <span>index_document</span>(self, document):
        <span>if</span> document<span>.</span>ID <span>not</span> <span>in</span> self<span>.</span>documents:
            self<span>.</span>documents[document<span>.</span>ID] <span>=</span> document

        <span>for</span> token <span>in</span> analyze(document<span>.</span>fulltext):
            <span>if</span> token <span>not</span> <span>in</span> self<span>.</span>index:
                self<span>.</span>index[token] <span>=</span> set()
            self<span>.</span>index[token]<span>.</span>add(document<span>.</span>ID)
</code></pre></div><h1 id="searching">Searching</h1>
<p>Now that we have all tokens indexed, searching for a query becomes a matter of analyzing the query text with the same analyzer we applied to the documents; this way we’ll end up with tokens that should match the tokens we have in the index. For each token, we’ll do a lookup in the dictionary, finding the IDs of the documents that the token occurs in. We do this for every token, and then find the document IDs that appear in all of these sets (i.e. for a document to match the query, it needs to contain all the tokens in the query). We will then take the resulting list of document IDs, and fetch the actual data from our <code>documents</code> store<sup id="fnref:4"></sup>.</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>_results</span>(self, analyzed_query):
    <span>return</span> [self<span>.</span>index<span>.</span>get(token, set()) <span>for</span> token <span>in</span> analyzed_query]

<span>def</span> <span>search</span>(self, query):
    <span>"""
</span><span>    Boolean search; this will return documents that contain all words from the
</span><span>    query, but not rank them (sets are fast, but unordered).
</span><span>    """</span>
    analyzed_query <span>=</span> analyze(query)
    results <span>=</span> self<span>.</span>_results(analyzed_query)
    <span>if</span> <span>not</span> results:
        <span># no searchable tokens left (e.g. the query was all stopwords)</span>
        <span>return</span> []
    documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>intersection(<span>*</span>results)]

    <span>return</span> documents


In [<span>1</span>]: index<span>.</span>search(<span>'London Beer Flood'</span>)
search took <span>0.16307830810546875</span> milliseconds
Out[<span>1</span>]:
[Abstract(ID<span>=</span><span>1501027</span>, title<span>=</span><span>'Wikipedia: Horse Shoe Brewery'</span>, abstract<span>=</span><span>'The Horse Shoe Brewery was an English brewery in the City of Westminster that was established in 1764 and became a major producer of porter, from 1809 as Henry Meux &amp; Co. It was the site of the London Beer Flood in 1814, which killed eight people after a porter vat burst.'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Horse_Shoe_Brewery'</span>),
Abstract(ID<span>=</span><span>1828015</span>, title<span>=</span><span>'Wikipedia: London Beer Flood'</span>, abstract<span>=</span><span>"The London Beer Flood was an accident at Meux &amp; Co's Horse Shoe Brewery, London, on 17 October 1814. It took place when one of the wooden vats of fermenting porter burst."</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/London_Beer_Flood'</span>)]
</code></pre></div><p>Now, this will make our queries very precise, especially for long query strings (the more tokens our query contains, the less likely it’ll be that there will be a document that has all of these tokens). We could optimize our search function for <a href="https://en.wikipedia.org/wiki/Precision_and_recall">recall rather than precision</a> by allowing users to specify that only one occurrence of a token is enough to match our query:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>search</span>(self, query, search_type<span>=</span><span>'AND'</span>):
    <span>"""
</span><span>    Still boolean search; this will return documents that contain either all words
</span><span>    from the query or just one of them, depending on the search_type specified.
</span><span>
</span><span>    We are still not ranking the results (sets are fast, but unordered).
</span><span>    """</span>
    <span>if</span> search_type <span>not</span> <span>in</span> (<span>'AND'</span>, <span>'OR'</span>):
        <span>return</span> []

    analyzed_query <span>=</span> analyze(query)
    results <span>=</span> self<span>.</span>_results(analyzed_query)
    <span>if</span> <span>not</span> results:
        <span># no searchable tokens left (e.g. the query was all stopwords)</span>
        <span>return</span> []
    <span>if</span> search_type <span>==</span> <span>'AND'</span>:
        <span># all tokens must be in the document</span>
        documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>intersection(<span>*</span>results)]
    <span>if</span> search_type <span>==</span> <span>'OR'</span>:
        <span># only one token has to be in the document</span>
        documents <span>=</span> [self<span>.</span>documents[doc_id] <span>for</span> doc_id <span>in</span> set<span>.</span>union(<span>*</span>results)]

    <span>return</span> documents


In [<span>2</span>]: index<span>.</span>search(<span>'London Beer Flood'</span>, search_type<span>=</span><span>'OR'</span>)
search took <span>0.02816295623779297</span> seconds
Out[<span>2</span>]:
[Abstract(ID<span>=</span><span>5505026</span>, title<span>=</span><span>'Wikipedia: Addie Pryor'</span>, abstract<span>=</span><span>'| birth_place = London, England'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Addie_Pryor'</span>),
Abstract(ID<span>=</span><span>1572868</span>, title<span>=</span><span>'Wikipedia: Tim Steward'</span>, abstract<span>=</span><span>'|birth_place = London, United Kingdom'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/Tim_Steward'</span>),
Abstract(ID<span>=</span><span>5111814</span>, title<span>=</span><span>'Wikipedia: 1877 Birthday Honours'</span>, abstract<span>=</span><span>'The 1877 Birthday Honours were appointments by Queen Victoria to various orders and honours to reward and highlight good works by citizens of the British Empire. The appointments were made to celebrate the official birthday of the Queen, and were published in The London Gazette on 30 May and 2 June 1877.'</span>, url<span>=</span><span>'https://en.wikipedia.org/wiki/1877_Birthday_Honours'</span>),
<span>...</span>
In [<span>3</span>]: len(index<span>.</span>search(<span>'London Beer Flood'</span>, search_type<span>=</span><span>'OR'</span>))
search took <span>0.029065370559692383</span> seconds
Out[<span>3</span>]: <span>49627</span>
</code></pre></div><h1 id="relevancy">Relevancy</h1>
<p>We have implemented a pretty quick search engine with just some basic Python, but there’s one aspect that’s obviously missing from our little engine, and that’s the <a href="https://livebook.manning.com/book/relevant-search/chapter-1/13">idea of <strong>relevance</strong></a>. Right now we just return an unordered list of documents, and we leave it up to the user to figure out which of those (s)he is actually interested in. Especially for large result sets, that is painful or just impossible (in our <code>OR</code> example, there are almost 50,000 results).</p>
<p>This is where the idea of relevancy comes in; what if we could assign each document a score that would indicate how well it matches the query, and just order by that score? A naive and simple way of assigning a score to a document for a given query is to just count how often that document mentions that particular word. After all, the more that document mentions that term, the more likely it is that it is about our query!</p>
<h2 id="term-frequency">Term frequency</h2>
<p>Let’s expand our <code>Abstract</code> dataclass to compute and store its term frequencies when we index it. That way, we’ll have easy access to those numbers when we want to rank our unordered list of documents:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># in documents.py</span>
<span>from</span> collections <span>import</span> Counter
<span>from</span> .analysis <span>import</span> analyze

<span>@dataclass</span>
<span>class</span> <span>Abstract</span>:
    <span># snip</span>
    <span>def</span> <span>analyze</span>(self):
        <span># Counter will create a dictionary counting the unique values in an array:</span>
        <span># {'london': 12, 'beer': 3, ...}</span>
        self<span>.</span>term_frequencies <span>=</span> Counter(analyze(self<span>.</span>fulltext))

    <span>def</span> <span>term_frequency</span>(self, term):
        <span>return</span> self<span>.</span>term_frequencies<span>.</span>get(term, <span>0</span>)
</code></pre></div><p>We need to make sure to generate these frequency counts when we index our data:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># in index.py we add `document.analyze()`</span>

<span>def</span> <span>index_document</span>(self, document):
    <span>if</span> document<span>.</span>ID <span>not</span> <span>in</span> self<span>.</span>documents:
        self<span>.</span>documents[document<span>.</span>ID] <span>=</span> document
        document<span>.</span>analyze()
</code></pre></div><p>We’ll modify our search function so we can apply a ranking to the documents in our result set. We’ll fetch the documents using the same Boolean query from the index and document store, and then, for every document in that result set, we’ll simply sum up how often each query term occurs in that document:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span>def</span> <span>search</span>(self, query, search_type<span>=</span><span>'AND'</span>, rank<span>=</span>True):
    <span># snip</span>
    <span>if</span> rank:
        <span>return</span> self<span>.</span>rank(analyzed_query, documents)
    <span>return</span> documents


<span>def</span> <span>rank</span>(self, analyzed_query, documents):
    results <span>=</span> []
    <span>if</span> <span>not</span> documents:
        <span>return</span> results
    <span>for</span> document <span>in</span> documents:
        score <span>=</span> sum([document<span>.</span>term_frequency(token) <span>for</span> token <span>in</span> analyzed_query])
        results<span>.</span>append((document, score))
    <span>return</span> sorted(results, key<span>=</span><span>lambda</span> doc: doc[<span>1</span>], reverse<span>=</span>True)
</code></pre></div><h2 id="inverse-document-frequency">Inverse Document Frequency</h2>
<p>That’s already a lot better, but there are some obvious shortcomings. We’re considering all query terms to be of equivalent value when assessing the relevancy for the query. However, it’s likely that certain terms have very little to no discriminating power when determining relevancy; for example, a collection with lots of documents about beer would be expected to have the term “beer” appear often in almost every document (in fact, we’re already trying to address that by dropping the 25 most common English words from the index). Searching for the word “beer” in such a case would essentially produce another random ordering.</p>
<p>In order to address that, we’ll add another component to our scoring algorithm that will reduce the contribution of terms that occur very often in the index to the final score. We could use the <em>collection frequency</em> of a term (i.e. how often does this term occur across <em>all</em> documents), but <a href="https://nlp.stanford.edu/IR-book/html/htmledition/inverse-document-frequency-1.html">in practice</a> the <em>document frequency</em> is used instead (i.e. how many <em>documents</em> in the index contain this term). We’re trying to rank documents after all, so it makes sense to have a document level statistic.</p>
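<p>To make the difference between the two statistics concrete, here is a minimal sketch on a made-up, pre-tokenized toy corpus (the <code>docs</code> dictionary and its contents are assumptions for illustration, not data from the Wikipedia dump):</p>

```python
from collections import Counter

# Hypothetical toy corpus: document ID -> already analyzed tokens.
docs = {
    1: ["beer", "beer", "beer", "flood"],
    2: ["beer", "brewery"],
    3: ["london", "gazette"],
}

collection_frequency = Counter()
document_frequency = Counter()
for tokens in docs.values():
    # Collection frequency counts every occurrence of a term.
    collection_frequency.update(tokens)
    # Document frequency counts each term at most once per document.
    document_frequency.update(set(tokens))

print(collection_frequency["beer"])  # occurrences across the whole corpus
print(document_frequency["beer"])    # number of documents mentioning it
```

<p>In this toy corpus “beer” occurs four times but only in two documents, so one very repetitive document inflates the collection frequency without changing the document frequency.</p>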
<p>We’ll compute the <em>inverse document frequency</em> for a term by dividing the number of documents (<em>N</em>) in the index by the number of documents that contain the term, and taking the logarithm of that.</p>
<figure>
<img src="https://bart.degoe.de/img/2021-03-24-building-a-full-text-search-engine-150-lines-of-code/idf.jpg"> <figcaption>
<h4>IDF; taken from https://moz.com/blog/inverse-document-frequency-and-the-importance-of-uniqueness</h4>
</figcaption>
</figure>
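<p>Written out, that is <code>idf(t) = log10(N / df(t))</code>. As a rough worked example, the sketch below computes it for a made-up four-document corpus and combines it with term frequency; the <code>docs</code> data and the standalone helper functions are assumptions for illustration, and returning <code>0.0</code> for unseen terms is just one possible convention:</p>

```python
import math
from collections import Counter

# Hypothetical toy corpus: document ID -> already analyzed tokens.
docs = {
    1: ["london", "beer", "flood"],
    2: ["london", "brewery"],
    3: ["london", "gazette"],
    4: ["beer", "festival"],
}

def document_frequency(term):
    # Number of documents that contain the term at least once.
    return sum(1 for tokens in docs.values() if term in tokens)

def inverse_document_frequency(term):
    df = document_frequency(term)
    # Assumed convention: a term that occurs nowhere contributes nothing.
    return math.log10(len(docs) / df) if df else 0.0

def score(doc_id, query_terms):
    # Sum of tf * idf over the query terms for one document.
    tf = Counter(docs[doc_id])
    return sum(tf[term] * inverse_document_frequency(term) for term in query_terms)

for term in ("london", "beer", "flood"):
    print(term, document_frequency(term), round(inverse_document_frequency(term), 3))
# london 3 0.125
# beer 2 0.301
# flood 1 0.602
```

<p>The rarer a term is across documents, the larger its IDF, so a match on “flood” moves a document up the ranking much more than a match on “london”.</p>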
<p>We’ll then simply multiply the term frequency by the inverse document frequency during our ranking, so matches on terms that are rare in the corpus will contribute more to the relevancy score<sup id="fnref:5"></sup>. We can easily compute the inverse document frequency from the data available in our index:</p>
<div class="highlight"><pre><code class="language-python" data-lang="python"><span># index.py</span>
<span>import</span> math

<span>def</span> <span>document_frequency</span>(self, token):
    <span>return</span> len(self<span>.</span>index<span>.</span>get(token, set()))

<span>def</span> <span>inverse_document_frequency</span>(self, token):
    <span># Manning, Raghavan and Schütze use log10, so we do too, even though it</span>
    <span># doesn't really matter which log we use anyway</span>
    <span># https://nlp.stanford.edu/IR-book/html/htmledition/inverse-document-frequency-1.html</span>
    df <span>=</span> self<span>.</span>document_frequency(token)
    <span>if</span> df <span>==</span> <span>0</span>:
        <span># a token that is not in the index (possible with OR queries) adds nothing</span>
        <span>return</span> <span>0.0</span>
    <span>return</span> math<span>.</span>log10(len(self<span>.</span>documents) <span>/</span> df)

<span>def</span> <span>rank</span>(self, analyzed_query, documents):
    results <span>=</span> []
    <span>if</span> <span>not</span> documents:
        <span>return</span> results
    <span>for</span> document <span>in</span> documents:
        score <span>=</span> <span>0.0</span>
        <span>for</span> token <span>in</span> analyzed_query:
            tf <span>=</span> document<span>.</span>term_frequency(token)
            idf <span>=</span> self<span>.</span>inverse_document_frequency(token)
            score <span>+=</span> tf <span>*</span> idf
        results<span>.</span>append((document, score))
    <span>return</span> sorted(results, key<span>=</span><span>lambda</span> doc: doc[<span>1</span>], reverse<span>=</span>True)
</code></pre></div><h1 id="future-work">Future Work™</h1>
<p>And that’s a basic search engine in just a few lines of Python code! You can find all the code on <a href="https://github.com/bartdegoede/python-searchengine">GitHub</a>, and I’ve provided a utility function that will download the Wikipedia abstracts and build an index. Install the requirements, run it in your Python console of choice and have fun messing with the data structures and searching.</p>
<p>Now, obviously this is a project to illustrate the concepts of search and how it can be so fast (even with ranking, I can search and rank 6.27m documents on my laptop with a “slow” language like Python) and not production grade software. It runs entirely in memory on my laptop, whereas libraries like Lucene utilize hyper-efficient data structures and even optimize disk seeks, and software like Elasticsearch and Solr scale Lucene to hundreds if not thousands of machines.</p>
<p>That doesn’t mean that we can’t think about fun expansions on this basic functionality though; for example, we assume that every field in the document has the same contribution to relevancy, whereas a query term match in the title should probably be weighted more strongly than a match in the description. Another fun project could be to expand the query parsing; there’s no reason why either all or just one term need to match. Why not exclude certain terms, or do <code>AND</code> and <code>OR</code> between individual terms? Can we persist the index to disk and make it scale beyond the confines of my laptop RAM?</p>
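<p>As one possible starting point for the term-exclusion idea, here is a minimal sketch on a hand-written toy inverted index. The standalone <code>search</code> function and its <code>must</code> and <code>must_not</code> parameters are hypothetical names for illustration and are not part of the repository:</p>

```python
# Hypothetical toy inverted index: token -> set of document IDs.
index = {
    "london": {1, 2, 3},
    "beer": {1, 2},
    "flood": {1},
}

def search(must, must_not=()):
    """Return IDs of documents with every `must` term and no `must_not` term."""
    postings = [index.get(term, set()) for term in must]
    if not postings:
        # Nothing to intersect; an empty query matches nothing.
        return set()
    matches = set.intersection(*postings)
    for term in must_not:
        # Set difference removes every document containing an excluded term.
        matches = matches - index.get(term, set())
    return matches

print(search(["london", "beer"], must_not=["flood"]))  # {2}
```

<p>Because the postings are plain Python sets, exclusion is just one more set operation on top of the existing intersection, and it returns unranked IDs like the Boolean search above.</p>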


+ 215
- 0
cache/2021/e554fd03f2342ab72115688dd258cba4/index.html View File

@@ -0,0 +1,215 @@
<!doctype html><!-- This is a valid HTML5 document. -->
<!-- Screen readers, SEO, extensions and so on. -->
<html lang="fr">
<!-- Has to be within the first 1024 bytes, hence before the `title` element
See: https://www.w3.org/TR/2012/CR-html5-20121217/document-metadata.html#charset -->
<meta charset="utf-8">
<!-- Why no `X-UA-Compatible` meta: https://stackoverflow.com/a/6771584 -->
<!-- The viewport meta is quite crowded and we are responsible for that.
See: https://codepen.io/tigt/post/meta-viewport-for-2015 -->
<meta name="viewport" content="width=device-width,initial-scale=1">
<!-- Required to make a valid HTML5 document. -->
<title>Ecosocialism is the Horizon, Degrowth is the Way (archive) — David Larlet</title>
<meta name="description" content="Publication cached to preserve a record of it.">
<!-- That good ol' feed, subscribe :). -->
<link rel="alternate" type="application/atom+xml" title="Feed" href="/david/log/">
<!-- Generated from https://realfavicongenerator.net/ such a mess. -->
<link rel="apple-touch-icon" sizes="180x180" href="/static/david/icons2/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/static/david/icons2/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/static/david/icons2/favicon-16x16.png">
<link rel="manifest" href="/static/david/icons2/site.webmanifest">
<link rel="mask-icon" href="/static/david/icons2/safari-pinned-tab.svg" color="#07486c">
<link rel="shortcut icon" href="/static/david/icons2/favicon.ico">
<meta name="msapplication-TileColor" content="#f7f7f7">
<meta name="msapplication-config" content="/static/david/icons2/browserconfig.xml">
<meta name="theme-color" content="#f7f7f7" media="(prefers-color-scheme: light)">
<meta name="theme-color" content="#272727" media="(prefers-color-scheme: dark)">
<!-- Documented, feel free to shoot an email. -->
<link rel="stylesheet" href="/static/david/css/style_2021-01-20.css">
<!-- See https://www.zachleat.com/web/comprehensive-webfonts/ for the trade-off. -->
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t4_poly_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: light), (prefers-color-scheme: no-preference)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_regular.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_bold.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<link rel="preload" href="/static/david/css/fonts/triplicate_t3_italic.woff2" as="font" type="font/woff2" media="(prefers-color-scheme: dark)" crossorigin>
<script>
function toggleTheme(themeName) {
document.documentElement.classList.toggle(
'forced-dark',
themeName === 'dark'
)
document.documentElement.classList.toggle(
'forced-light',
themeName === 'light'
)
}
const selectedTheme = localStorage.getItem('theme')
if (selectedTheme !== 'undefined') {
toggleTheme(selectedTheme)
}
</script>

<meta name="robots" content="noindex, nofollow">
<meta content="origin-when-cross-origin" name="referrer">
<!-- Canonical URL for SEO purposes -->
<link rel="canonical" href="https://www.the-trouble.com/content/2021/2/11/ecosocialism-is-the-horizon-degrowth-is-the-way">

<body class="remarkdown h1-underline h2-underline h3-underline em-underscore hr-center ul-star pre-tick" data-instant-intensity="viewport-all">


<article>
<header>
<h1>Ecosocialism is the Horizon, Degrowth is the Way</h1>
</header>
<nav>
<p class="center">
<a href="/david/" title="Go to the home page"><svg class="icon icon-home">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-home"></use>
</svg> Home</a> •
<a href="https://www.the-trouble.com/content/2021/2/11/ecosocialism-is-the-horizon-degrowth-is-the-way" title="Link to the original content">Original source</a>
</p>
</nav>
<hr>
<p class="">“Degrowth” means many things to many people. To most, it probably doesn’t mean much beyond an antonym to “growth,” the process of getting larger or more complex. To some detractors, the term represents a scary violation of the imperative to increase GDP annually, what’s now a holy sacrament to policymakers and economic pundits (though less so to actual academic economists, who are more ambivalent). To its less pedantic and more hysterical detractors, it’s a ploy to take away everyone’s Hummers and return to a mushroom-foraging-based economy. </p>
<p class="">At its most distilled, “degrowth” refers to a process of reducing the material impact of the economy on the world’s many imperiled ecologies, abandoning GDP as a measurement of well-being, and forging an equitable steady-state economy.  </p>
<p class="">Although the concept of placing limits to economic growth is not very new, having been articulated by environmentalists several decades ago—most famously <a href="https://en.wikipedia.org/wiki/The_Limits_to_Growth">by the Club of Rome</a> in 1972—the more recent iteration, only just over a decade old, emerges from the French <em>décroissance</em>. Given that the community and scholarship are so young, there’s still a lot of debate around some of the fundamentals of what the term means, and what it <em>should</em> mean. Some who believe in the principles recoil at the term itself: Noam Chomsky <a href="https://canadiandimension.com/articles/view/the-greening-of-noam-chomsky-a-conversation">has said</a> “when you say ‘degrowth’ it frightens people. It’s like saying you’re going to have to be poorer tomorrow than you are today, and it doesn’t mean that.” But many degrowth defenders, one of the most prominent being ecological economist <a href="https://en.wikipedia.org/wiki/Giorgos_Kallis">Giorgos Kallis</a>, <a href="https://oxfamblogs.org/fp2p/youre-wrong-kate-degrowth-is-a-compelling-word/">stand by it</a> and see value in such a unifying notion. </p>
<p class="">Even so, there lurks some danger in all such terms and political communities, like socialism or democracy: as <a href="https://www.currentaffairs.org/2019/10/we-need-a-fair-way-to-end-infinite-growth">I have warned</a> elsewhere, they run the perennial risk of being co-opted and ill-defined by bad-faith actors. If the degrowth critique goes only as far as targeting economic growth, or even general anticapitalism, there’s little intrinsic to it to stop a right-wing authoritarian program from co-opting degrowth rhetoric to justify imposing authoritarianism, or giving cover to cynical Global North states to demand degrowth of the Global South while continuing to disproportionately consume and pollute. Degrowth, if it is to get traction and if that traction is to be desirable, needs to be abundantly clear about what it stands for and what it rejects. Luckily, we have just the book to offer this much needed clarity. </p>
<p class="">Economic anthropologist Jason Hickel is among the most eloquent advocates of degrowth, and has been intimately involved in the community’s attempt to stake out a useful, clear meaning for the term and pathway to integrating its principles into a coherent program. Hickel’s latest book, <a href="https://www.jasonhickel.org/less-is-more"><em>Less is More: How Degrowth Will Save the World</em></a> published in August 2020 (with a paperback edition released this month), offers an abundance of facts, concepts, and research alongside a passionate defense of ecocentric and humanistic values. Hickel has achieved something many writers of popular nonfiction seek in vain: a high density of ideas and data delivered in a light, enjoyable narrative prose. The book makes <a href="https://www.currentaffairs.org/2020/08/the-case-for-degrowth">a very strong case</a> for a topic in need of strong cases. And <em>Less Is More</em> arrives in good company: degrowth advocate Timothée Parrique <a href="https://twitter.com/timparrique/status/1347098907235512320">counted</a> 203 essays, 70 academic articles, and 11 books on degrowth published in 2020. </p>
<p class="">Some bad-faith commentators have attempted to paint degrowth as dressed-up primitivist austerity, intrinsically harmful to the Global South, but Hickel does a persuasive job emphasizing that degrowth actually means the opposite. He musters an army of historical and contemporary data, anecdotes, and theory to argue definitively that an equitable degrowth scenario is more likely to <em>increase </em>material abundance and resource access. If the ideology of growthism offers an ethic of constant amoral expansion and exploitation, degrowth(ism) offers a more restrained ethic that values an abundance of time, leisure, love, and equality over concentrated wealth and distributed waste. </p>
<p class="">While the book explores the moral imperative for controlled degrowth, Hickel is equally comfortable arguing for degrowth from a standpoint of a purely rational approach to fundamentally shifting an economy that is currently heating the world to death, guaranteeing centuries of mass death and destruction. The only way to slow the rapid race to collapse civilization and accelerate extinctions is to stop the omnicidal political economy that rules the globe. Given the natural limits that thermodynamics and terrestrial ecologies impose on human economies and non-human populations, degrowth is <a href="https://dothemath.ucsd.edu/2012/04/economist-meets-physicist/">inevitable</a>: it’s just a matter of deciding whether human agency will play a positive, benevolent role in the process, or continue to maximize the chaos and violence involved. I asked Dr. Hickel via email about some of the major challenges to achieving degrowth reforms and some important peripheral issues. Here is our discussion:<br><br><br></p>
<p class=""><strong>SMM: The ideology of degrowthism seems very compatible with a range of anticapitalist programs from ecosocialism to Green New Deal social democracy to anarchism and heterodox environmentalist political economy. Do you see degrowthism ideally as its own ideological program, or a supplement to existing traditions, both, or something else? </strong></p>
<p class=""><strong>JH:</strong> The power of degrowth is that it offers a critique, and an alternative path, that speaks to a broad range of movements.  So, we can support the social-democratic vision of a Green New Deal, but point out that it cannot be achieved if we continue to pursue growth at the same time.  If we want the Green New Deal to be feasible, just, and ecologically coherent, we should abandon growth as an objective and focus directly on social and ecological goals instead.  Similarly, we can support the demands of Extinction Rebellion for rapid decarbonization, while offering a clear strategy for how this can be achieved in a just and equitable way.</p>
<p class="">What I like about degrowth is that it offers a critique of capitalism that makes sense to people who are not already anti-capitalist, because it gets to the nub of what capitalism is really about.  Most people assume capitalism is about markets and trade; and what could possibly be wrong with that?  But markets and trade were around for thousands of years prior to capitalism; what makes capitalism distinctive is that it is organized around, and dependent on, perpetual expansion, for the sake of elite accumulation.  When you point this out to people they immediately recognize it as a problem, and start thinking about what a post-growth, post-capitalist society might look like.  In other words, degrowth offers a kind of practical and relevant entry to post-capitalist thought.</p>
<p class="">I think that most proponents of degrowth would consider themselves to be ecosocialists of some stripe (with various persuasions running from democratic socialism to anarchism to autonomism).   But there is a tendency within ecosocialism that assumes growth can and should continue, with the goal of achieving some kind of automated,  millionaire-style luxury for all, while hoping that state policy and publicly-funded technological innovations will make this vision compatible with ecology. In other words, a kind of left-wing ecomodernism. Degrowth rejects this approach on the grounds that it is ecologically illiterate, but also because we just don’t need growth (i.e., an increase in resource throughput and commodity output) to achieve a flourishing society – that assumption is a holdover from capitalist ideology, which falsely seeks to equate growth with human well-being, and we should reject it.  So, one might sum it up like this: ecosocialism is the horizon, degrowth is the way.</p>
<p class="">Degrowth also adds an anti-imperialist ethic to ecosocialism. We have to understand that high levels of consumption always rely on forms of extraction and appropriation from elsewhere, specifically, colonial or neo-colonial “frontiers”.  Degrowth is attentive to these dynamics.  The call for degrowth in the global North is not just about ecology.  It is also a call for decolonization in the global South.  Ecosocialism without anti-imperialism is not an ecosocialism worth having.</p>
<p class=""><strong>SMM: Is degrowthism more of an immediate stopgap to halt the extinction and climate crises, more of a long-term civilization-building project, or something else?</strong> </p>
<p class=""><strong>JH:</strong> No, it’s definitely not just a stopgap to halt ecological breakdown, because it’s not just about ecology.  Degrowth represents an approach to halting ecological breakdown that is just and equitable. It requires a different kind of economy, and a different kind of society.  In that sense, yes it does represent civilization-building.  But it also has an undeniable immediacy to it.  These are things that need to be done now, starting this decade, in order for us to have anything like a reasonable chance of stopping dangerous climate change. </p>
<p class=""><strong>SMM: Many mainstream commentators, from liberals to the entire right-wing media-government-industrial complex and even some growthist socialists, are still generally opposed to ideas of degrowth. Is it worth trying to reach these hostile groups or to focus on those without a preformed opinion? Following up, which groups have you found generally most receptive to the ideas in <em>Less is More</em>? Do you see unorthodox coalitions forming? </strong></p>
<p class=""><strong>JH:</strong> There is a certain faction of the socialist left (mostly older males in the global North, many of them economists) who seem personally offended by degrowth, and express their vitriol on social media accordingly.  What strikes me about this faction is that it seems they have read little if any actual degrowth literature, to say nothing of the broader literature on ecological economics. It is a knee-jerk reaction to something they haven’t thought about.  If they would engage in good faith, I suspect they would find it all much more reasonable than they assume.  What’s great about degrowth scholarship is that it is deeply grounded in empirical evidence; it has to be, as this is required of any insurgent idea that hopes to go up against longstanding assumptions.  </p>
<p class="">As for the right, to the extent that they are committed to serving the interests of capital, I am under no illusion that they would give degrowth a fair hearing, any more than they would give even the most basic tenets of social democracy a fair hearing.  Liberals are a different story, though; degrowth has received good coverage in establishment outlets like the <em>New Yorker, Vox</em>, <em>The Guardian</em>, <em>LARB</em> etc.  If you’re paying attention to the ecological crisis, you know that our existing approach isn’t working and you’re ready for something else.  People are increasingly open to new ideas.  In fact, to my surprise, it seems that broader public audiences tend to be remarkably receptive to degrowth.  It was once thought that we shouldn’t use the word degrowth, for fear that people might misunderstand it and be turned off.  I’ve found the opposite; people seem to find it intuitive and refreshing.  It makes no sense to patronize people, as though they’re not capable of understanding the nuances of the concept.  Instead, appeal to their intellect, their sense of humanity, their sense of care and solidarity – that is much more powerful. </p>
<p class=""><strong>SMM: Shrinking the economy and building a steady-state one could hypothetically be achieved with authoritarian austerity rather than egalitarian abundance (the latter of which <em>Less is More</em> places at the heart of degrowthism). Do you think there is a </strong><a href="https://www.currentaffairs.org/2019/10/we-need-a-fair-way-to-end-infinite-growth"><strong>risk</strong></a><strong> of degrowth being co-opted, as socialist principles have frequently been co-opted, to justify authoritarian states? How can degrowthists maintain control of the idea to avoid co-optation by authoritarians? </strong></p>
<p class=""><strong>JH:</strong> I don’t think the word “austerity” works for what you’re describing here.  Austerity is what growth-oriented governments do when they are desperate to get growth going: they slash spending on public goods to create artificial scarcities that induce people into competitive productivity (George Osborne was explicit about this), and they privatize public services and assets in order to create new frontiers for investment and to expand the remit of the market.  These are growthist strategies.  It’s not clear to me that any government that wanted to reduce throughput would adopt austerity measures to accomplish this goal, because that wouldn’t solve the problem.  The problem isn’t public services.  The problem is capitalism.  </p>
<p class="">If capitalism calls for scarcity in order to generate more growth, degrowth calls for the opposite: reversing artificial scarcities in order to remove growthist pressures, and indeed to render additional growth unnecessary. Expanding universal public services is key to this (i.e., the opposite of austerity).  As for the problem of excess throughput: this is being driven by unnecessary industrial activity (in other words, industrial activity that is organized around exchange-value rather than use-value) and elite accumulation.  So that’s what we have to degrow. </p>
<p class="">Of course, one can imagine this being achieved by an authoritarian government, but it wouldn’t work very well.  The problem with any elitist state structure is that it is removed from the complex realities of regional ecology. You can’t manage ecosystems with abstract planning (James Scott’s work in <em>Seeing Like a State</em> is good on this); it requires the knowledge of people who have a relationship with the land… it requires <em>commoners</em>.  We know that when people have collective democratic control over local ecological commons they make decisions to sustain rather than liquidate them.  That’s the principle we need to build on.  So we reject authoritarianism not only on political-ideological grounds, but also because authoritarianism is intrinsically anti-ecological. </p>
<p class="">I think Murray Bookchin is correct on this point.  Our relationship with nature will mimic the structure of our society.  If we organize society around hierarchy, domination and extraction (which is true of both capitalism and any form of authoritarianism), then our relationship with nature will be hierarchical, dominating and extractive.  But if we organize society around egalitarianism, reciprocity and care, then our relationship with nature will be egalitarian, reciprocal and caring.  Every human society necessarily relies on nonhuman species; the question is, according to what principles do we incorporate them?  </p>
<p class=""><strong>SMM: The degrowth community is still relatively small (though it has grown very quickly). What would you say is the greatest obstacle to spreading degrowthist principles to mainstream audiences? There are probably very different obstacles depending on the community one is approaching. But if you could point to the biggest barrier, the one thing that, if fixed, would let us make a lot of progress on spreading the idea, what would it be? </strong></p>
<p class=""><strong>JH:</strong> The key thing is that those who align with degrowth ideas need to be bold enough to champion them, rather than leaving this to “experts”.  Those of us who have become public voices for degrowth can only do so much on our own.  Ideas spread when people spread them.  Form book clubs, write op-eds for your local paper, do radio interviews with your local station.  If you’re a postgraduate student who is interested in degrowth, then actively contribute to developing the idea and answering the remaining questions, from a position of solidarity, rather than writing about it from a remove.</p>
<p class="">Other than that, I think we need to normalize the word.  I meet so many politicians and other thought-leaders who privately align with degrowth ideas, but try not to use the word because they’re worried about how it will be received.  I get that.  I understand that this is more or less the position that people like Naomi Klein and Noam Chomsky have taken. But we can only advance the conversation by actually talking about it. We need people who are bold enough to do that. Angela Davis said “One of the greatest challenges of any social movement is to develop new vocabularies.” Words like degrowth enable new thinking and analysis, and we need that now more than ever.</p>
<p class=""><strong>SMM: <em>Less Is More</em> includes a really fascinating section on the creation story of capitalism. The story is basically of peasants who threw off the rule of aristocrats and built egalitarian communes that also were quite animistic, with an ecologically-minded relationship to non-human (or your great phrase “more-than-human”) life. Rulers invented capitalism to basically extract more from the peasant communities and compel farmers to extract more from the land. The takeaway seems to be that in the absence of such psychopathic aristocrats and autocrats, people generally self-organize into more or less eco-anarchist democracies. There are many examples of Indigenous societies incorporating social tools to maintain democratic politics and prevent wealth and power hoarders from taking over. Are there practical mechanisms (that you didn’t include in <em>Less Is More</em>) that you’d point to for achieving such enviable accountability in modern fossil states, or do we just need to hope for collapses and fragmentation? </strong></p>
<p class=""><strong>JH:</strong> It's worth remembering that the ecological ontologies that characterize many Indigenous communities today are not some kind of timeless trait.  They have been formulated in response to capitalism. In most cases these communities, or their ancestors, have had first-hand experience of the violence of colonial capital.  They know how destructive it is, to both humans and ecologies, especially on the frontiers of the world-system.  Consider the devastation wrought by the European invasion of the Americas, which wiped out 90% of the population and turned vast tracts of land into plantation monoculture and strip mines.  That’s the context here.  Indigenous communities have seen apocalypse up close, and their ontologies have been formed accordingly, with an acute awareness of the values that are required if we are to thrive together on this planet. </p>
<p class="">I expect that if ecological crisis causes our civilization to collapse and fragment, similar ontologies will emerge, with a kind of “never again” ethic: never again will we treat the living world as a stock of resources, never again will we organize the economy around perpetual growth, never again will we allow elites to monopolize power, etc.  But I don’t think that such a collapse is the only way to get there; nobody wants that. My goal in <em>Less is More</em> is to argue that we can feasibly transition to an ecological economy and <em>prevent</em> collapse. The book charts a clear pathway from here to there. There’s still time to take it, but that window is quickly closing.</p>
<p class=""><strong>SMM: You wrote a really </strong><a href="https://newint.org/features/2019/07/01/long-read-progress-and-its-discontents"><strong>great essay</strong></a><strong> about how status quo defenders like Steven Pinker and Bill Gates use narratives of progress to stifle real change and authentic progress, which your previous book </strong><a href="https://www.penguin.co.uk/books/111/1113531/the-divide/9781786090034.html"><strong><em>The Divide</em></strong></a><strong> also speaks to. Do you see <em>Less Is More</em> and degrowth more generally as putting forward an alternative story and definition of progress, or rejecting progressive narratives entirely, or something else?  </strong></p>
<p class=""><strong>JH:</strong> The problem with the dominant progress narrative is that it is deeply disingenuous.  People like Pinker and Gates, and the media outlets that have amplified them, appear to start from the position of seeking evidence to defend the status quo (basically, capitalism, and specifically the neoliberal variety).  Toward this end, they overstate the extent of progress (for example, by selecting a poverty line that is well below subsistence), and they studiously ignore trends that complicate their good news narrative (for example, worsening ecological breakdown, increasing inequality, etc.). But their biggest error is that they attempt to cast progress as the spontaneous outcome of capitalism, when in fact it has been fought for by progressive social movements <em>against</em> the interests of the capitalist class.  </p>
<p class="">For the first 400 years of its history, capitalism caused immiseration virtually everywhere it went: enclosure, dispossession, genocide, mass enslavement, colonization, famine.  It wasn’t until 1870 that we began to see any improvement in life expectancy in Europe, and that was the product of the labour movement and related struggles for democracy, municipal socialism, and basic interventions like public sanitation, public housing, and public healthcare.  We don’t see improvement in the global South until progressive movements succeed in achieving decolonization.  This history is important, because it reveals that what’s required for progress isn’t growth as such (as in, an aggregate expansion in the commodity economy), but rather a fair distribution of income and opportunity, and access to universal public goods.  It’s not rocket science, but it does require a political struggle.  So one might say that degrowth redefines progress.  The goal is to achieve well-being for all, in balance with the Earth’s ecosystems, and any step we take in this direction (i.e., degrowth) represents progress.  </p>
<p class=""><strong>SMM: <em>Less is More</em> ends with a powerful argument for implementing more animistic spirituality and biocentric ethics as part of a degrowth agenda. This is close to my heart; something I’m struggling with is the question of how we can seek to achieve a sort of hegemony of such value systems while remaining faithful to cultural differences and local ecological conditions. Is there a practical way you would suggest starting to work toward evangelizing these values effectively? </strong></p>
<p class=""><strong>JH:</strong> This is a real challenge.  I think the first step is to amplify the voices of Indigenous leaders and activists who are already pointing in this direction.  The Red Nation movement’s tagline says “All Relatives Forever”, with relatives here of course referring to both human and nonhuman persons.  Consider the implications of such a politics; it is profound – far more radical, and far more inspiring and enriching, than traditional leftist discourse.  Media outlets need to give platforms and column space to people like Winona LaDuke, Ailton Krenak, Nemonte Nenquimo and Robin Kimmerer, who are connecting anti-colonial struggles and post-capitalist visions with what we might call animist ontologies.  This is not about warm, fuzzy spiritualism; on the contrary, it is the sharp edge of a radical politics.</p>
<p class="">I think the Rights of Nature movement is also promising; the more we talk about rivers, watersheds and ecosystems as persons, with rights to existence, the more this idea becomes thinkable.  We don’t have to wait for national governments to create such rights; in many places local councils have this power.  But we could also consider more direct interventions, such as creating ecological education programmes.  Sweden did something like this in the 1960s, to enable people to learn about local ecosystems and develop ecological consciousness, on a mass scale.  Schumacher College is an example of this in the UK.  At minimum, we could make ecology a required course in schools and universities, with a strong practical component that allows students to develop inter-species understanding.</p>
<p class=""><strong>SMM: There’s been discussion about the utopian imaginary of degrowth. It seems so often that the only two visions of futuristic society we’re regularly presented with are either 1) progressively high-tech society with killer (or helper) robots and space colonies or 2) low-tech visions of what industrialized people think of as “primitivism,” maybe with returns to foraging or agrarian serfdom. <em>Less Is More</em> and degrowthism more broadly seem to be striking a totally different path that incorporates high-tech solutions to build low-tech, low-harm economies. Does that assessment ring true, or do you see it going in a different direction?</strong></p>
<p class=""><strong>JH:</strong> Yes, that’s the way I see it.  I am not anti-tech at all.  The truth is that capitalism <em>constrains</em> innovation, rather than enabling it.  Consider the fact that so many of our brightest minds are focused on getting people to click on ads and buy stuff they don’t need, or even want.  That is literally the cutting edge of US capitalism.  Not surprisingly, capitalism prioritizes innovations that will further the interests of capital accumulation, rather than innovations that we actually need to solve social and ecological problems.  Then there’s the intellectual property regime; imagine the innovations that would happen if knowledge were shared freely, rather than being locked up in corporate patents for decades. </p>
<p class="">The second problem is that, under capitalism, innovations that deliver efficiency improvements lead not to a <em>reduction</em> of energy and resource use, but rather to <em>more</em> energy and resource use, because the gains are reinvested to expand the process of production and consumption. In other words, growthism wipes out our most impressive improvements. When it comes to confronting ecological breakdown, we must realize that it’s not our technology that’s the problem, it’s growth. In a post-growth or post-capitalist economy, this wouldn’t be a problem.  Efficiency improvements would work as we expect them to, and enable us to reduce our impact on the Earth.</p>
<p class=""><strong>SMM: Follow-up on the utopia: would you point to good fiction writing or recent research trying to put in really granular concrete terms what an ideal degrowth society might physically look like? Is it better to leave the visioning more open to local variations and not get too concrete and specific? </strong></p>
<p class=""><strong>JH:</strong> A lot of people will point to Ursula Le Guin’s <em>The Dispossessed</em>. It’s a story about a kind of ecosocialist society on another planet.  The premise is that the ecosystem is primarily desert, so people have to find ways to sustain a flourishing society with relatively little material throughput.  They do it with a firm commitment to egalitarianism, public goods, and direct democracy.  They fiercely reject elite accumulation, which they see as dangerously wasteful.  Because they do not measure civilization in terms of the quantity of stuff they consume (as our society does), they are free to focus on higher goals: philosophy, science and art.  It’s worth noting that Le Guin was the daughter of Alfred Kroeber, an anthropologist who spent his career learning from Indigenous communities in the American Southwest.  These were people who saw egalitarianism and direct democracy as essential to survival in a desert ecosystem.  Le Guin was clearly inspired by their approach to the world.  </p>
<p class="">There’s other literature that deals with degrowth themes, although without trying to portray a degrowth society.  Michael Ende’s <em>Momo</em> comes to mind.  There are also Hayao Miyazaki’s films and Aldous Huxley’s <em>Island</em>.  David Graeber’s <em>Fragments of an Anarchist Anthropology</em> explores ethnographic insights that are relevant to degrowth theory.  Then there are the writings of anti-colonial leaders like Gandhi, Fanon and Sankara, who rejected growthism and sought to define a more human-centered economics.  These are all resources we can draw on as we imagine a more just and ecological civilization. </p>
<p class=""><strong>SMM: Neoliberalism basically Trojan-horsed itself into a global consensus (the horse being a shiny new innovative economic theory and the Greek soldiers being basic laissez-faire corporate serfdom now with Robots), its operating logic embedded into governments, international orgs, nonprofits, universities, and even individual minds while the name evaporates to the point where neoliberals deny neoliberalism exists. Of course we don’t (necessarily) want to replicate such a Machiavellian, underhanded maneuver, but do you ideally see degrowthism following a similar sort of trajectory of embedding its logic in a global consensus and then disappearing? Or does it need to totally abandon this Washington Consensus model of international governance? </strong></p>
<p class=""> <strong>JH:</strong> There’s a lot of work to be done when it comes to degrowth political strategy.  I think what’s required is a range of approaches.  There are people at the community level working to bring degrowth principles to local economic governance.  <a href="https://transitionnetwork.org/">Transition Towns</a> in the UK are a nascent example of this.  So too with cities like Amsterdam and Copenhagen adopting “doughnut economics”.  We can see it at a national level, too, with New Zealand, Scotland and Iceland choosing to abandon GDP growth as a government objective.  I think there’s hope at a multilateral level, too: the Environment Committee of the European Parliament just recently voted in favour of binding targets to reduce material throughput in absolute terms.  That’s a core degrowth policy.  Of course, it’s not law yet – but it’s a huge step. </p>
<p class="">The difference between neoliberal political strategy and degrowth is that the former had the backing of billionaires and corporations that bankrolled think tanks, university departments, and media outlets.  It also had international financial institutions and the US military, which forcibly imposed the Washington Consensus around the world.  Degrowth has to rely almost entirely on social movements.  That’s a tall order, but we can take inspiration from our ancestors: the anti-slavery movement, the anti-apartheid movement, the anti-colonial movement, the Civil Rights Movement, the labour movement, the feminist movement... all of these have changed the world, against overwhelming odds.  That’s the scale of what’s required of us.</p>
<p class=""><em>Samuel Miller McDonald is a writer and geography PhD student at the University of Oxford studying the intersection of grassroots movements and energy transition.</em></p>
</article>


<hr>

<footer>
<p>
<a href="/david/" title="Go to the home page"><svg class="icon icon-home">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-home"></use>
</svg> Home</a> •
<a href="/david/log/" title="Access the RSS feed"><svg class="icon icon-rss2">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-rss2"></use>
</svg> Follow</a> •
<a href="http://larlet.com" title="Go to my English profile" data-instant><svg class="icon icon-user-tie">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-user-tie"></use>
</svg> Pro</a> •
<a href="mailto:david%40larlet.fr" title="Send an email"><svg class="icon icon-mail">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-mail"></use>
</svg> Email</a> •
<abbr class="nowrap" title="Host: Alwaysdata, 62 rue Tiquetonne 75002 Paris, +33184162340"><svg class="icon icon-hammer2">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-hammer2"></use>
</svg> Legal</abbr>
</p>
<template id="theme-selector">
<form>
<fieldset>
<legend><svg class="icon icon-brightness-contrast">
<use xlink:href="/static/david/icons2/symbol-defs.svg#icon-brightness-contrast"></use>
</svg> Theme</legend>
<label>
<input type="radio" value="auto" name="chosen-color-scheme" checked> Auto
</label>
<label>
<input type="radio" value="dark" name="chosen-color-scheme"> Dark
</label>
<label>
<input type="radio" value="light" name="chosen-color-scheme"> Light
</label>
</fieldset>
</form>
</template>
</footer>
<script src="/static/david/js/instantpage-5.1.0.min.js" type="module"></script>
<script>
function loadThemeForm(templateName) {
const themeSelectorTemplate = document.querySelector(templateName)
const form = themeSelectorTemplate.content.firstElementChild
themeSelectorTemplate.replaceWith(form)

form.addEventListener('change', (e) => {
const chosenColorScheme = e.target.value
localStorage.setItem('theme', chosenColorScheme)
toggleTheme(chosenColorScheme)
})

const selectedTheme = localStorage.getItem('theme')
if (selectedTheme && selectedTheme !== 'undefined') {
form.querySelector(`[value="${selectedTheme}"]`).checked = true
}
}

const prefersColorSchemeDark = '(prefers-color-scheme: dark)'
window.addEventListener('load', () => {
let hasDarkRules = false
for (const styleSheet of Array.from(document.styleSheets)) {
let mediaRules = []
for (const cssRule of styleSheet.cssRules) {
if (cssRule.type !== CSSRule.MEDIA_RULE) {
continue
}
// WARNING: Safari does not support `conditionText`.
if (cssRule.conditionText) {
if (cssRule.conditionText !== prefersColorSchemeDark) {
continue
}
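} else {
// Fallback: `cssText` begins with `@media `, so match the full prelude
// and skip every media rule that is not the dark color scheme one.
if (!cssRule.cssText.startsWith(`@media ${prefersColorSchemeDark}`)) {
continue
}
}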
mediaRules = mediaRules.concat(Array.from(cssRule.cssRules))
}

// WARNING: do not try to insert a rule into a styleSheet you are
// currently iterating on, otherwise the browser will be stuck
// in an infinite loop…
for (const mediaRule of mediaRules) {
styleSheet.insertRule(mediaRule.cssText)
hasDarkRules = true
}
}
if (hasDarkRules) {
loadThemeForm('#theme-selector')
}
})
</script>
</body>
</html>

+ 48
- 0
cache/2021/e554fd03f2342ab72115688dd258cba4/index.md

@@ -0,0 +1,48 @@
title: Ecosocialism is the Horizon, Degrowth is the Way
url: https://www.the-trouble.com/content/2021/2/11/ecosocialism-is-the-horizon-degrowth-is-the-way
hash_url: e554fd03f2342ab72115688dd258cba4

<p class="">“Degrowth” means many things to many people. To most, it probably doesn’t mean much beyond an antonym to “growth,” the process of getting larger or more complex. To some detractors, the term represents a scary violation of the imperative to increase GDP annually, what’s now a holy sacrament to policymakers and economic pundits (though less so to actual academic economists, who are more ambivalent). To its less pedantic and more hysterical detractors, it’s a ploy to take away everyone’s Hummers and return to a mushroom-foraging-based economy. </p>
<p class="">At its most distilled, “degrowth” refers to a process of reducing the material impact of the economy on the world’s many imperiled ecologies, abandoning GDP as a measurement of well-being, and forging an equitable steady-state economy.  </p>
<p class="">Although the concept of placing limits on economic growth is not new, having been articulated by environmentalists several decades ago (most famously <a href="https://en.wikipedia.org/wiki/The_Limits_to_Growth">by the Club of Rome</a> in 1972), the more recent iteration, only just over a decade old, emerges from the French <em>décroissance</em>. Given that the community and its scholarship are so young, there’s still a lot of debate around some of the fundamentals of what the term means, and what it <em>should</em> mean. Some who believe in the principles recoil at the term itself: Noam Chomsky <a href="https://canadiandimension.com/articles/view/the-greening-of-noam-chomsky-a-conversation">has said</a>, “when you say ‘degrowth’ it frightens people. It’s like saying you’re going to have to be poorer tomorrow than you are today, and it doesn’t mean that.” But many degrowth defenders, one of the most prominent being ecological economist <a href="https://en.wikipedia.org/wiki/Giorgos_Kallis">Giorgos Kallis</a>, <a href="https://oxfamblogs.org/fp2p/youre-wrong-kate-degrowth-is-a-compelling-word/">stand by it</a> and see value in such a unifying notion. </p>
<p class="">Even so, some danger lurks in all such terms and political communities, as it does in socialism or democracy: as <a href="https://www.currentaffairs.org/2019/10/we-need-a-fair-way-to-end-infinite-growth">I have warned</a> elsewhere, they run the perennial risk of being co-opted and ill-defined by bad-faith actors. If the degrowth critique goes only as far as targeting economic growth, or even general anticapitalism, there’s little intrinsic to it to stop a right-wing authoritarian program from co-opting degrowth rhetoric to justify imposing authoritarianism, or giving cover to cynical Global North states to demand degrowth of the Global South while continuing to disproportionately consume and pollute. Degrowth, if it is to get traction and if that traction is to be desirable, needs to be abundantly clear about what it stands for and what it rejects. Luckily, we have just the book to offer this much needed clarity. </p>
<p class="">Economic anthropologist Jason Hickel is among the most eloquent advocates of degrowth, and has been intimately involved in the community’s attempt to stake out a useful, clear meaning for the term and a pathway to integrating its principles into a coherent program. Hickel’s latest book, <a href="https://www.jasonhickel.org/less-is-more"><em>Less is More: How Degrowth Will Save the World</em></a>, published in August 2020 (with a paperback edition released this month), offers an abundance of facts, concepts, and research alongside a passionate defense of ecocentric and humanistic values. Hickel has achieved something many writers of popular nonfiction seek in vain: a high density of ideas and data delivered in light, enjoyable narrative prose. The book makes <a href="https://www.currentaffairs.org/2020/08/the-case-for-degrowth">a very strong case</a> for a topic in need of strong cases. And <em>Less Is More</em> arrives in good company: degrowth advocate Timothée Parrique <a href="https://twitter.com/timparrique/status/1347098907235512320">counted</a> 203 essays, 70 academic articles, and 11 books on degrowth published in 2020. </p>
<p class="">Some bad-faith commentators have attempted to paint degrowth as dressed-up primitivist austerity, intrinsically harmful to the Global South, but Hickel does a persuasive job emphasizing that degrowth actually means the opposite. He musters an army of historical and contemporary data, anecdotes, and theory to argue definitively that an equitable degrowth scenario is more likely to <em>increase </em>material abundance and resource access. If the ideology of growthism offers an ethic of constant amoral expansion and exploitation, degrowth(ism) offers a more restrained ethic that values an abundance of time, leisure, love, and equality over concentrated wealth and distributed waste. </p>
<p class="">While the book explores the moral imperative for controlled degrowth, Hickel is equally comfortable arguing for degrowth from a standpoint of a purely rational approach to fundamentally shifting an economy that is currently heating the world to death, guaranteeing centuries of mass death and destruction. The only way to slow the rapid race to collapse civilization and accelerate extinctions is to stop the omnicidal political economy that rules the globe. Given the natural limits that thermodynamics and terrestrial ecologies impose on human economies and non-human populations, degrowth is <a href="https://dothemath.ucsd.edu/2012/04/economist-meets-physicist/">inevitable</a>: it’s just a matter of deciding whether human agency will play a positive, benevolent role in the process, or continue to maximize the chaos and violence involved. I asked Dr. Hickel via email about some of the major challenges to achieving degrowth reforms and some important peripheral issues. Here is our discussion:<br><br><br></p>
<p class=""><strong>SMM: The ideology of degrowthism seems very compatible with a range of anticapitalist programs from ecosocialism to Green New Deal social democracy to anarchism and heterodox environmentalist political economy. Do you see degrowthism ideally as its own ideological program, or a supplement to existing traditions, both, or something else? </strong></p>
<p class=""><strong>JH:</strong> The power of degrowth is that it offers a critique, and an alternative path, that speaks to a broad range of movements.  So, we can support the social-democratic vision of a Green New Deal, but point out that it cannot be achieved if we continue to pursue growth at the same time.  If we want the Green New Deal to be feasible, just, and ecologically coherent, we should abandon growth as an objective and focus directly on social and ecological goals instead.  Similarly, we can support the demands of Extinction Rebellion for rapid decarbonization, while offering a clear strategy for how this can be achieved in a just and equitable way.</p>
<p class="">What I like about degrowth is that it offers a critique of capitalism that makes sense to people who are not already anti-capitalist, because it gets to the nub of what capitalism is really about.  Most people assume capitalism is about markets and trade; and what could possibly be wrong with that?  But markets and trade were around for thousands of years prior to capitalism; what makes capitalism distinctive is that it is organized around, and dependent on, perpetual expansion, for the sake of elite accumulation.  When you point this out to people they immediately recognize it as a problem, and start thinking about what a post-growth, post-capitalist society might look like.  In other words, degrowth offers a kind of practical and relevant entry to post-capitalist thought.</p>
<p class="">I think that most proponents of degrowth would consider themselves to be ecosocialists of some stripe (with various persuasions running from democratic socialism to anarchism to autonomism).   But there is a tendency within ecosocialism that assumes growth can and should continue, with the goal of achieving some kind of automated,  millionaire-style luxury for all, while hoping that state policy and publicly-funded technological innovations will make this vision compatible with ecology. In other words, a kind of left-wing ecomodernism. Degrowth rejects this approach on the grounds that it is ecologically illiterate, but also because we just don’t need growth (i.e., an increase in resource throughput and commodity output) to achieve a flourishing society – that assumption is a holdover from capitalist ideology, which falsely seeks to equate growth with human well-being, and we should reject it.  So, one might sum it up like this: ecosocialism is the horizon, degrowth is the way.</p>
<p class="">Degrowth also adds an anti-imperialist ethic to ecosocialism. We have to understand that high levels of consumption always rely on forms of extraction and appropriation from elsewhere, specifically, colonial or neo-colonial “frontiers”.  Degrowth is attentive to these dynamics.  The call for degrowth in the global North is not just about ecology.  It is also a call for decolonization in the global South.  Ecosocialism without anti-imperialism is not an ecosocialism worth having.</p>
<p class=""><strong>SMM: Is degrowthism more of an immediate stopgap to halt the extinction and climate crises, or more long-term civilization-building, or something else?</strong> </p>
<p class=""><strong>JH:</strong> No, it’s definitely not just a stopgap to halt ecological breakdown, because it’s not just about ecology.  Degrowth represents an approach to halting ecological breakdown that is just and equitable. It requires a different kind of economy, and a different kind of society.  In that sense, yes it does represent civilization-building.  But it also has an undeniable immediacy to it.  These are things that need to be done now, starting this decade, in order for us to have anything like a reasonable chance of stopping dangerous climate change. </p>
<p class=""><strong>SMM: Many mainstream commentators, from liberals to the entire right-wing media-government-industrial complex and even some growthist socialists, are still generally opposed to ideas of degrowth. Is it worth trying to reach these hostile groups or to focus on those without a preformed opinion? Following up, which groups have you found generally most receptive to the ideas in <em>Less is More</em>? Do you see unorthodox coalitions forming? </strong></p>
<p class=""><strong>JH:</strong> There is a certain faction of the socialist left (mostly older males in the global North, many of them economists) who seem personally offended by degrowth, and express their vitriol on social media accordingly.  What strikes me about this faction is that it seems they have read little if any actual degrowth literature, to say nothing of the broader literature on ecological economics. It is a knee-jerk reaction to something they haven’t thought about.  If they would engage in good faith, I suspect they would find it all much more reasonable than they assume.  What’s great about degrowth scholarship is that it is deeply grounded in empirical evidence; it has to be, as this is required of any insurgent idea that hopes to go up against longstanding assumptions.  </p>
<p class="">As for the right, to the extent that they are committed to serving the interests of capital, I am under no illusion that they would give degrowth a fair hearing, any more than they would give even the most basic tenets of social democracy a fair hearing.  Liberals are a different story, though; degrowth has received good coverage in establishment outlets like the <em>New Yorker, Vox</em>, <em>The Guardian</em>, <em>LARB</em> etc.  If you’re paying attention to the ecological crisis, you know that our existing approach isn’t working and you’re ready for something else.  People are increasingly open to new ideas.  In fact, to my surprise, it seems that broader public audiences tend to be remarkably receptive to degrowth.  It was once thought that we shouldn’t use the word degrowth, for fear that people might misunderstand it and be turned off.  I’ve found the opposite; people seem to find it intuitive and refreshing.  It makes no sense to patronize people, as though they’re not capable of understanding the nuances of the concept.  Instead, appeal to their intellect, their sense of humanity, their sense of care and solidarity – that is much more powerful. </p>
<p class=""><strong>SMM: Shrinking the economy and building a steady-state one could hypothetically be achieved with authoritarian austerity rather than egalitarian abundance (the latter of which <em>Less is More</em> places at the heart of degrowthism). Do you think there is a </strong><a href="https://www.currentaffairs.org/2019/10/we-need-a-fair-way-to-end-infinite-growth"><strong>risk</strong></a><strong> of degrowth being co-opted, as socialist principles have frequently been co-opted, to justify authoritarian states? How can degrowthists maintain control of the idea to avoid co-optation by authoritarians? </strong></p>
<p class=""><strong>JH:</strong> I don’t think the word “austerity” works for what you’re describing here.  Austerity is what growth-oriented governments do when they are desperate to get growth going: they slash spending on public goods to create artificial scarcities that induce people into competitive productivity (George Osborne was explicit about this), and they privatize public services and assets in order to create new frontiers for investment and to expand the remit of the market.  These are growthist strategies.  It’s not clear to me that any government that wanted to reduce throughput would adopt austerity measures to accomplish this goal, because that wouldn’t solve the problem.  The problem isn’t public services.  The problem is capitalism.  </p>
<p class="">If capitalism calls for scarcity in order to generate more growth, degrowth calls for the opposite: reversing artificial scarcities in order to remove growthist pressures, and indeed to render additional growth unnecessary. Expanding universal public services is key to this (i.e., the opposite of austerity).  As for the problem of excess throughput: this is being driven by unnecessary industrial activity (in other words, industrial activity that is organized around exchange-value rather than use-value) and elite accumulation.  So that’s what we have to degrow. </p>
<p class="">Of course, one can imagine this being achieved by an authoritarian government, but it wouldn’t work very well.  The problem with any elitist state structure is that it is removed from the complex realities of regional ecology. You can’t manage ecosystems with abstract planning (James Scott’s work in <em>Seeing Like a State</em> is good on this); it requires the knowledge of people who have a relationship with the land… it requires <em>commoners</em>.  We know that when people have collective democratic control over local ecological commons they make decisions to sustain rather than liquidate them.  That’s the principle we need to build on.  So we reject authoritarianism not only on political-ideological grounds, but also because authoritarianism is intrinsically anti-ecological. </p>
<p class="">I think Murray Bookchin is correct on this point.  Our relationship with nature will mimic the structure of our society.  If we organize society around hierarchy, domination and extraction (which is true of both capitalism and any form of authoritarianism), then our relationship with nature will be hierarchical, dominating and extractive.  But if we organize society around egalitarianism, reciprocity and care, then our relationship with nature will be egalitarian, reciprocal and caring.  Every human society necessarily relies on nonhuman species; the question is, according to what principles do we incorporate them?  </p>
<p class=""><strong>SMM: The degrowth community is still relatively small (though it has grown very quickly). What would you say is the greatest obstacle to spreading degrowthist principles to mainstream audiences? There probably are very different obstacles depending on the community one is approaching. But if you could point to the biggest barrier, the one thing that, if we fixed it, would let us make a lot of progress on spreading the idea, what would it be? </strong></p>
<p class=""><strong>JH:</strong> The key thing is that those who align with degrowth ideas need to be bold enough to champion them, rather than leaving this to “experts”.  Those of us who have become public voices for degrowth can only do so much on our own.  Ideas spread when people spread them.  Form book clubs, write op-eds for your local paper, do radio interviews with your local station.  If you’re a postgraduate student who is interested in degrowth, then actively contribute to developing the idea and answering the remaining questions, from a position of solidarity, rather than writing about it from a remove.</p>
<p class="">Other than that, I think we need to normalize the word.  I meet so many politicians and other thought-leaders who privately align with degrowth ideas, but try not to use the word because they’re worried about how it will be received.  I get that.  I understand that this is more or less the position that people like Naomi Klein and Noam Chomsky have taken. But we can only advance the conversation by actually talking about it. We need people who are bold enough to do that. Angela Davis said “One of the greatest challenges of any social movement is to develop new vocabularies.” Words like degrowth enable new thinking and analysis, and we need that now more than ever.</p>
<p class=""><strong>SMM: <em>Less Is More</em> includes a really fascinating section on the creation story of capitalism. The story is basically of peasants who threw off the rule of aristocrats and built egalitarian communes that also were quite animistic, with an ecologically-minded relationship to non-human (or your great phrase “more-than-human”) life. Rulers invented capitalism to basically extract more from the peasant communities and compel farmers to extract more from the land. The takeaway seems to be that in the absence of such psychopathic aristocrats and autocrats, people generally self-organize into more or less eco-anarchist democracies. There are many examples of Indigenous societies incorporating social tools to maintain democratic politics and prevent wealth and power hoarders from taking over. Are there practical mechanisms (that you didn’t include in <em>Less Is More</em>) that you’d point to for achieving such enviable accountability in modern fossil states, or do we just need to hope for collapses and fragmentation? </strong></p>
<p class=""><strong>JH:</strong> It's worth remembering that the ecological ontologies that characterize many Indigenous communities today are not some kind of timeless trait.  They have been formulated in response to capitalism. In most cases these communities, or their ancestors, have had first-hand experience of the violence of colonial capital.  They know how destructive it is, to both humans and ecologies, especially on the frontiers of the world-system.  Consider the devastation wrought by the European invasion of the Americas, which wiped out 90% of the population and turned vast tracts of land into plantation monoculture and strip mines.  That’s the context here.  Indigenous communities have seen apocalypse up close, and their ontologies have been formed accordingly, with an acute awareness of the values that are required if we are to thrive together on this planet. </p>
<p class="">I expect that if ecological crisis causes our civilization to collapse and fragment, similar ontologies will emerge, with a kind of “never again” ethic: never again will we treat the living world as a stock of resources, never again will we organize the economy around perpetual growth, never again will we allow elites to monopolize power, etc.  But I don’t think that such a collapse is the only way to get there; nobody wants that. My goal in <em>Less is More</em> is to argue that we can feasibly transition to an ecological economy and <em>prevent</em> collapse. The book charts a clear pathway from here to there. There’s still time to take it, but that window is quickly closing.</p>
<p class=""><strong>SMM: You wrote a really </strong><a href="https://newint.org/features/2019/07/01/long-read-progress-and-its-discontents"><strong>great essay</strong></a><strong> about how status quo defenders like Steven Pinker and Bill Gates use narratives of progress to stifle real change and authentic progress, which your previous book </strong><a href="https://www.penguin.co.uk/books/111/1113531/the-divide/9781786090034.html"><strong><em>The Divide</em></strong></a><strong> also speaks to. Do you see <em>Less Is More</em> and degrowth more generally as putting forward an alternative story and definition of progress, or rejecting progressive narratives entirely, or something else?  </strong></p>
<p class=""><strong>JH:</strong> The problem with the dominant progress narrative is that it is deeply disingenuous.  People like Pinker and Gates, and the media outlets that have amplified them, appear to start from the position of seeking evidence to defend the status quo (basically, capitalism, and specifically the neoliberal variety).  Toward this end, they overstate the extent of progress (for example, by selecting a poverty line that is well below subsistence), and they studiously ignore trends that complicate their good news narrative (for example, worsening ecological breakdown, increasing inequality, etc.). But their biggest error is that they attempt to cast progress as the spontaneous outcome of capitalism, when in fact it has been fought for by progressive social movements <em>against</em> the interests of the capitalist class.  </p>
<p class="">For the first 400 years of its history, capitalism caused immiseration virtually everywhere it went: enclosure, dispossession, genocide, mass enslavement, colonization, famine.  It wasn’t until 1870 that we began to see any improvement in life expectancy in Europe, and that was the product of the labour movement and related struggles for democracy, municipal socialism, and basic interventions like public sanitation, public housing, and public healthcare.  We don’t see improvement in the global South until progressive movements succeed in achieving decolonization.  This history is important, because it reveals that what’s required for progress isn’t growth as such (as in, an aggregate expansion in the commodity economy), but rather a fair distribution of income and opportunity, and access to universal public goods.  It’s not rocket science, but it does require a political struggle.  So one might say that degrowth redefines progress.  The goal is to achieve well-being for all, in balance with the Earth’s ecosystems, and any step we take in this direction (i.e., degrowth) represents progress.  </p>
<p class=""><strong>SMM: <em>Less is More</em> ends with a powerful argument for implementing more animistic spirituality and biocentric ethics as part of a degrowth agenda. This is close to my heart; something I’m struggling with is the question of how we can seek to achieve a sort of hegemony of such value systems while remaining faithful to cultural differences and local ecological conditions. Is there a practical way you would suggest starting to work toward evangelizing these values effectively? </strong></p>
<p class=""><strong>JH:</strong> This is a real challenge.  I think the first step is to amplify the voices of Indigenous leaders and activists who are already pointing in this direction.  The Red Nation movement’s tagline says “All Relatives Forever”, with relatives here of course referring to both human and nonhuman persons.  Consider the implications of such a politics; it is profound – far more radical, and far more inspiring and enriching, than traditional leftist discourse.  Media outlets need to give platforms and column space to people like Winona LaDuke, Ailton Krenak, Nemonte Nenquimo and Robin Kimmerer, who are connecting anti-colonial struggles and post-capitalist visions with what we might call animist ontologies.  This is not about warm, fuzzy spiritualism; on the contrary, it is the sharp edge of a radical politics.</p>
<p class="">I think the Rights of Nature movement is also promising; the more we talk about rivers, watersheds and ecosystems as persons, with rights to existence, the more this idea becomes thinkable.  We don’t have to wait for national governments to create such rights; in many places local councils have this power.  But we could also consider more direct interventions, such as creating ecological education programmes.  Sweden did something like this in the 1960s, to enable people to learn about local ecosystems and develop ecological consciousness, on a mass scale.  Schumacher College is an example of this in the UK.  At minimum, we could make ecology a required course in schools and universities, with a strong practical component that allows students to develop inter-species understanding.</p>
<p class=""><strong>SMM: There’s been discussion about the utopian imaginary of degrowth. It seems so often that the only two visions of futuristic society we’re regularly presented with are either 1) progressively high-tech society with killer (or helper) robots and space colonies or 2) low-tech visions of what industrialized people think of as “primitivism,” maybe with returns to foraging or agrarian serfdom. <em>Less Is More</em> and degrowthism more broadly seem to be striking a totally different path that incorporates high-tech solutions to build low-tech, low-harm economies. Does that assessment ring true, or do you see it going in a different direction?</strong></p>
<p class=""><strong>JH:</strong> Yes, that’s the way I see it.  I am not anti-tech at all.  The truth is that capitalism <em>constrains </em>innovation, rather than enabling it.  Consider the fact that so many of our brightest minds are focused on getting people to click on ads and buy stuff they don’t need, or even want.  That is literally the cutting edge of US capitalism.  Not surprisingly, capitalism prioritizes innovations that will further the interests of capital accumulation, rather than innovations that we actually need to solve social and ecological problems.  Then there’s the intellectual property regime; imagine the innovations that would happen if knowledge was shared freely, rather than being locked up in corporate patents for decades? </p>
<p class="">The second problem is that, under capitalism, innovations that deliver efficiency improvements lead not to a <em>reduction</em> of energy and resource use, but rather to <em>more </em>energy and resource use, because the gains are reinvested to expand the process of production and consumption. In other words, growthism wipes out our most impressive improvements. When it comes to confronting ecological breakdown, we must realize that it’s not our technology that’s the problem, it’s growth. In a post-growth or post-capitalist economy, this wouldn’t be a problem.  Efficiency improvements would work as we expect them to, and enable us to reduce our impact on the Earth.</p>
<p class=""><strong>SMM: Follow-up on the utopia: would you point to good fiction writing or recent research trying to put in really granular concrete terms what an ideal degrowth society might physically look like? Is it better to leave the visioning more open to local variations and not get too concrete and specific? </strong></p>
<p class=""><strong>JH:</strong> A lot of people will point to Ursula Le Guin’s <em>The Dispossessed</em>. It’s a story about a kind of ecosocialist society on another planet.  The premise is that the ecosystem is primarily desert, so people have to find ways to sustain a flourishing society with relatively little material throughput.  They do it with a firm commitment to egalitarianism, public goods, and direct democracy.  They fiercely reject elite accumulation, which they see as dangerously wasteful.  Because they do not measure civilization in terms of the quantity of stuff they consume (as our society does), they are free to focus on higher goals: philosophy, science and art.  It’s worth noting that Le Guin was the daughter of Alfred Kroeber, an anthropologist who spent his career learning from Indigenous communities in the American Southwest.  These were people who saw egalitarianism and direct democracy as essential to survival in a desert ecosystem.  Le Guin was clearly inspired by their approach to the world.  </p>
<p class="">There’s other literature that deals with degrowth themes, although without trying to portray a degrowth society.  Michael Ende’s <em>Momo</em> comes to mind.  There are also Hayao Miyazaki’s films and Aldous Huxley’s <em>Island</em>.  David Graeber’s <em>Fragments of an Anarchist Anthropology</em> explores ethnographic insights that are relevant to degrowth theory.  Then there are the writings of anti-colonial leaders like Gandhi, Fanon and Sankara, who rejected growthism and sought to define a more human-centered economics.  These are all resources we can draw on as we imagine a more just and ecological civilization. </p>
<p class=""><strong>SMM: Neoliberalism basically Trojan-horsed itself into a global consensus (the horse being a shiny new innovative economic theory and the Greek soldiers being basic laissez-faire corporate serfdom, now with robots), its operating logic embedded into governments, international orgs, nonprofits, universities, and even individual minds while the name evaporates to the point where neoliberals deny neoliberalism exists. Of course we don’t (necessarily) want to replicate such a Machiavellian, underhanded maneuver, but do you ideally see degrowthism following a similar sort of trajectory of embedding its logic in a global consensus and then disappearing? Or does it need to totally abandon this Washington Consensus model of international governance? </strong></p>
<p class=""> <strong>JH:</strong> There’s a lot of work to be done when it comes to degrowth political strategy.  I think what’s required is a range of approaches.  There are people at the community level working to bring degrowth principles to local economic governance.  <a href="https://transitionnetwork.org/">Transition Towns</a> in the UK are a nascent example of this.  So too with cities like Amsterdam and Copenhagen adopting “doughnut economics”.  We can see it at a national level, too, with New Zealand, Scotland and Iceland choosing to abandon GDP growth as a government objective.  I think there’s hope at a multilateral level, too: the Environment Committee of the European Parliament just recently voted in favour of binding targets to reduce material throughput in absolute terms.  That’s a core degrowth policy.  Of course, it’s not law yet – but it’s a huge step. </p>
<p class="">The difference between neoliberal political strategy and degrowth is that the former had the backing of billionaires and corporations that bankrolled think tanks, university departments, and media outlets.  It also had international financial institutions and the US military, which forcibly imposed the Washington Consensus around the world.  Degrowth has to rely almost entirely on social movements.  That’s a tall order, but we can take inspiration from our ancestors: the anti-slavery movement, the anti-apartheid movement, the anti-colonial movement, the Civil Rights Movement, the labour movement, the feminist movement... all of these have changed the world, against overwhelming odds.  That’s the scale of what’s required of us.</p>
<p class=""><em>Samuel Miller McDonald is a writer and geography PhD student at the University of Oxford studying the intersection of grassroots movements and energy transition.</em></p>

+ 6
- 0
cache/2021/index.html View File

@@ -197,6 +197,8 @@
<li><a href="/david/cache/2021/388cf40eae756175ee87c9bf7a1548c4/" title="Accès à l’article dans le cache local : No, Really, mRNA Vaccines Are Not Going To Affect Your DNA">No, Really, mRNA Vaccines Are Not Going To Affect Your DNA</a> (<a href="https://www.deplatformdisease.com/blog/no-really-mrna-vaccines-are-not-going-to-affect-your-dna" title="Accès à l’article original distant : No, Really, mRNA Vaccines Are Not Going To Affect Your DNA">original</a>)</li>
<li><a href="/david/cache/2021/e554fd03f2342ab72115688dd258cba4/" title="Accès à l’article dans le cache local : Ecosocialism is the Horizon, Degrowth is the Way">Ecosocialism is the Horizon, Degrowth is the Way</a> (<a href="https://www.the-trouble.com/content/2021/2/11/ecosocialism-is-the-horizon-degrowth-is-the-way" title="Accès à l’article original distant : Ecosocialism is the Horizon, Degrowth is the Way">original</a>)</li>
<li><a href="/david/cache/2021/22b380308edae42dce43930916bc6375/" title="Accès à l’article dans le cache local : Pourquoi mesurer le taux de CO2 peut-il nous aider à lutter contre la COVID-19 ?">Pourquoi mesurer le taux de CO2 peut-il nous aider à lutter contre la COVID-19 ?</a> (<a href="https://www.adioscorona.org/questions-reponses/2021-04-15-mesurer-taux-co2-pour-lutter-contre-covid19.html" title="Accès à l’article original distant : Pourquoi mesurer le taux de CO2 peut-il nous aider à lutter contre la COVID-19 ?">original</a>)</li>
<li><a href="/david/cache/2021/fe75ef80663602733dbe24cc717f257b/" title="Accès à l’article dans le cache local : The race for coronavirus vaccines: a graphical guide">The race for coronavirus vaccines: a graphical guide</a> (<a href="https://www.nature.com/articles/d41586-020-01221-y" title="Accès à l’article original distant : The race for coronavirus vaccines: a graphical guide">original</a>)</li>
@@ -375,6 +377,8 @@
<li><a href="/david/cache/2021/cfcd10768187ce1c3e598136cd8838b2/" title="Accès à l’article dans le cache local : Bad News Wrapped in Protein: Inside the Coronavirus Genome">Bad News Wrapped in Protein: Inside the Coronavirus Genome</a> (<a href="https://www.nytimes.com/interactive/2020/04/03/science/coronavirus-genome-bad-news-wrapped-in-protein.html" title="Accès à l’article original distant : Bad News Wrapped in Protein: Inside the Coronavirus Genome">original</a>)</li>
<li><a href="/david/cache/2021/b56bb56209a04e6144454283a22311ad/" title="Accès à l’article dans le cache local : Building a full-text search engine in 150 lines of Python code">Building a full-text search engine in 150 lines of Python code</a> (<a href="https://bart.degoe.de/building-a-full-text-search-engine-150-lines-of-code/" title="Accès à l’article original distant : Building a full-text search engine in 150 lines of Python code">original</a>)</li>
<li><a href="/david/cache/2021/bda7c1903601dae7d2b10ce8e3d87572/" title="Accès à l’article dans le cache local : Le « TravelPorn », nouvel outil de distinction sociale sur les réseaux sociaux">Le « TravelPorn », nouvel outil de distinction sociale sur les réseaux sociaux</a> (<a href="https://usbeketrica.com/fr/article/travel-porn-outil-distinction-social" title="Accès à l’article original distant : Le « TravelPorn », nouvel outil de distinction sociale sur les réseaux sociaux">original</a>)</li>
<li><a href="/david/cache/2021/cf2952fa1898656b9a6d99ca299ddd2f/" title="Accès à l’article dans le cache local : La canicule au Canada jugée responsable d’une centaine de morts dans la région de Vancouver">La canicule au Canada jugée responsable d’une centaine de morts dans la région de Vancouver</a> (<a href="https://www.lemonde.fr/planete/article/2021/06/30/le-canada-subit-une-canicule-des-dizaines-de-morts-subites-dans-la-region-de-vancouver_6086286_3244.html" title="Accès à l’article original distant : La canicule au Canada jugée responsable d’une centaine de morts dans la région de Vancouver">original</a>)</li>
@@ -551,6 +555,8 @@
<li><a href="/david/cache/2021/33c4c1d859a2d85c1710ea50628a71b1/" title="Accès à l’article dans le cache local : ☕️ Journal : Acheter de la forêt">☕️ Journal : Acheter de la forêt</a> (<a href="https://oncletom.io/2021/03/12/acheter-de-la-foret/" title="Accès à l’article original distant : ☕️ Journal : Acheter de la forêt">original</a>)</li>
<li><a href="/david/cache/2021/b404382125c07935b98295a801049097/" title="Accès à l’article dans le cache local : The Questions Concerning Technology">The Questions Concerning Technology</a> (<a href="https://theconvivialsociety.substack.com/p/the-questions-concerning-technology" title="Accès à l’article original distant : The Questions Concerning Technology">original</a>)</li>
<li><a href="/david/cache/2021/0e0d866f920298fbc0624c03ddc83d24/" title="Accès à l’article dans le cache local : Reconnaissance faciale: Clearview AI a violé la vie privée des Canadiens">Reconnaissance faciale: Clearview AI a violé la vie privée des Canadiens</a> (<a href="https://www.ledevoir.com/societe/594536/reconnaissance-faciale-clearview-ai-a-viole-la-vie-privee-des-canadiens" title="Accès à l’article original distant : Reconnaissance faciale: Clearview AI a violé la vie privée des Canadiens">original</a>)</li>
<li><a href="/david/cache/2021/09acd86a2ea00109af7fb53ff0953729/" title="Accès à l’article dans le cache local : Three observations on my first vaccination shot">Three observations on my first vaccination shot</a> (<a href="https://interconnected.org/home/2021/04/28/vaccination" title="Accès à l’article original distant : Three observations on my first vaccination shot">original</a>)</li>
