|
- title: Getting Takahē to run on Piku
- url: https://taoofmac.com/space/blog/2022/12/21/0900
- hash_url: a889fa6d4e07bdc390d44461ed6dce21
-
- <p class="lead">Last night after work I decided to see how easy it would be to run a <a href="https://jointakahe.org" rel="external">Takahē</a> <a href="/space/protocols/activitypub" rel="next">ActivityPub</a> instance under <a href="https://github.com/piku/piku" rel="external">Piku</a>, my tiny <a href="/space/dev/python" rel="next">Python</a>-oriented PaaS.</p>
- <p>Self-hosting <a href="https://joinmastodon.org" rel="external">Mastodon</a> is all the rage, but having to deal with a full-blown installation of <a href="/space/dev/ruby" rel="next">Ruby</a> (which is always a pain to install properly, even if you use <code>rbenv</code>), plus the abomination that is <a href="https://sidekiq.org" rel="external">Sidekiq</a> and the overall Rube Goldberg-esque architectural approach that is almost mandatory to deal with the complexities of <a href="/space/protocols/activitypub" rel="next">ActivityPub</a> is just something I don’t want to maintain. Ever. Even inside Docker.</p>
- <p>Which is why I have been developing my own <a href="/space/protocols/activitypub" rel="next">ActivityPub</a> server using <a href="https://sanic.dev" rel="external">Sanic</a> and a very lightweight <code>asyncio</code>-based approach to handling all the transactional aspects of <a href="/space/protocols/activitypub" rel="next">ActivityPub</a> atop <a href="/space/db/sqlite" rel="next">SQLite</a>. And let me tell you, I honestly wish the protocol were less about doing what boils down to P2P webhooks with PEM signatures embedded in requests.</p>
- <a id="anchor-enter-takahe" class="anchor" href="/space/blog/2022/12/21/0900#enter-takahe" rel="anchor"><h2 id="enter-takahe">Enter Takahē</h2></a><figure>
- <img src="/media/blog/2022/12/21/0900/bqAzQDKitQgUZ3Uu9MByVxvt8r4=/takahe.png" title="logo">
- <figcaption>Blue birds are cool. Well, at least flightless ones.</figcaption>
- </figure>
- <p>But <a href="https://jointakahe.org" rel="external">Takahē</a> is now aiming to support client apps as of version <code>0.6</code>, is built on <a href="dev/python/django" rel="nofollow">Django</a> (which I have always loved as a framework), and it saves me from the trouble of building <em>everything</em> from scratch, so… I had to try it out.</p>
- <p>More to the point, <a href="dev/python/django" rel="nofollow">Django</a> is <em>exactly</em> what <a href="https://github.com/piku/piku" rel="external">Piku</a> was originally designed to run.</p>
- <p>Besides running as a <code>WSGI</code> app, <a href="https://jointakahe.org" rel="external">Takahē</a> uses an async stator to handle all the background tasks (which is also <em>exactly</em> the pattern I aim for and designed <a href="https://github.com/piku/piku" rel="external">Piku</a> to support), so I just had to see how easy it was to get it running under <a href="https://github.com/piku/piku" rel="external">Piku</a> on very low-end hardware.</p>
- <a id="anchor-the-hardware" class="anchor" href="/space/blog/2022/12/21/0900#the-hardware" rel="anchor"><h2 id="the-hardware">The Hardware</h2></a><p>I have a 4GB <a href="/space/hw/raspberry_pi" rel="next">Raspberry Pi</a> 4 set up as an SSD-backed <a href="https://github.com/pimox/pimox7" rel="external">Proxmox</a> server, hosting several different <code>arm64</code> LXC containers I use for developing stuff. I love it because I can use LXC CPU allocations to throttle things and make sure they run fast enough on very low-end hardware, plus I can just snapshot, mess up and restore entire environments.</p>
- <p>So I set up an Ubuntu 22.04 container with 1GB of RAM and access to 2 CPU cores, capped to 50% overall usage–which is <em>roughly</em> the performance of a <a href="/space/hw/raspberry_pi" rel="next">Raspberry Pi</a> 2 give or take, albeit with a fully 64-bit CPU.</p>
- <p>I deployed <a href="https://github.com/piku/piku" rel="external">Piku</a>, set up a Cloudflare tunnel, and then went to town.</p>
- <a id="anchor-zero-code-changes-required" class="anchor" href="/space/blog/2022/12/21/0900#zero-code-changes-required" rel="anchor"><h2 id="zero-code-changes-required">Zero Code Changes Required</h2></a><p>In short, what I needed to get <a href="https://jointakahe.org" rel="external">Takahē</a> up and running under <a href="https://github.com/piku/piku" rel="external">Piku</a> was to:</p>
- <ol>
- <li>Clone the repository.</li>
- <li>Create a <code>production</code> remote pointing to <a href="https://github.com/piku/piku" rel="external">Piku</a>.</li>
- <li>Edit the supplied <code>ENV</code> and <code>Procfile</code>.</li>
- <li>Do a <code>git push production main</code>.</li>
- </ol>
- <p>It was <em>that simple</em>.</p>
- <p>Here’s the configuration I used, annotated. First the <code>ENV</code> file:</p>
- <div class="highlight"><pre><span></span><code><span class="c1"># Yes, I went and got it to use SQLite, and it nearly worked 100%</span><span class="w"></span>
- <span class="na">TAKAHE_DATABASE_SERVER</span><span class="o">=</span><span class="s">sqlite:////home/piku/takahe.db</span><span class="w"></span>
- <span class="c1"># This is what I eventually migrated to (more below)</span><span class="w"></span>
- <span class="c1"># TAKAHE_DATABASE_SERVER=postgres://piku:<password>@localhost/takahe</span><span class="w"></span>
- <span class="c1"># I actually love Django debugging, and with it on I can see the inner workings</span><span class="w"></span>
- <span class="na">TAKAHE_DEBUG</span><span class="o">=</span><span class="s">true</span><span class="w"></span>
- <span class="c1"># You know who uses this password, don't you? </span><span class="w"></span>
- <span class="na">TAKAHE_SECRET_KEY</span><span class="o">=</span><span class="s">pepsicola</span><span class="w"></span>
- <span class="c1"># No, it's not the one I'm actually using.</span><span class="w"></span>
- <span class="c1"># Anyway, this next one breaks a little on Piku, so I need to revise parsing for this case.</span><span class="w"></span>
- <span class="na">TAKAHE_CSRF_TRUSTED_ORIGINS</span><span class="o">=</span><span class="s">["http://127.0.0.1:8000", "https://127.0.0.1:8000"]</span><span class="w"></span>
- <span class="na">TAKAHE_USE_PROXY_HEADERS</span><span class="o">=</span><span class="s">true</span><span class="w"></span>
- <span class="na">TAKAHE_EMAIL_SERVER</span><span class="o">=</span><span class="s">console://console</span><span class="w"></span>
- <span class="na">TAKAHE_MAIN_DOMAIN</span><span class="o">=</span><span class="s">insightful.systems</span><span class="w"></span>
- <span class="na">TAKAHE_ENVIRONMENT</span><span class="o">=</span><span class="s">development</span><span class="w"></span>
- <span class="na">TAKAHE_MEDIA_BACKEND</span><span class="o">=</span><span class="s">local://</span><span class="w"></span>
- <span class="na">TAKAHE_MEDIA_ROOT</span><span class="o">=</span><span class="s">/home/piku/media</span><span class="w"></span>
- <span class="na">TAKAHE_MEDIA_URL</span><span class="o">=</span><span class="s">https://insightful.systems/media/</span><span class="w"></span>
- <span class="na">TAKAHE_AUTO_ADMIN_EMAIL</span><span class="o">=</span><span class="s"><my e-mail></span><span class="w"></span>
- <span class="na">SERVER_NAME</span><span class="o">=</span><span class="s">insightful.systems</span><span class="w"></span>
-
- <span class="c1"># This is all Piku config from here on down</span><span class="w"></span>
- <span class="c1"># I need IPv6 off for sanity inside Proxmox</span><span class="w"></span>
- <span class="na">DISABLE_IPV6</span><span class="o">=</span><span class="s">true</span><span class="w"></span>
- <span class="na">LC_ALL</span><span class="o">=</span><span class="s">en_US.UTF-8</span><span class="w"></span>
- <span class="na">LANG</span><span class="o">=</span><span class="s">$LC_ALL</span><span class="w"></span>
- <span class="c1"># This ensures nginx only accepts requests from Cloudflare, plus a few extra tweaks</span><span class="w"></span>
- <span class="na">NGINX_CLOUDFLARE_ACL</span><span class="o">=</span><span class="s">True</span><span class="w"></span>
- <span class="na">NGINX_SERVER_NAME</span><span class="o">=</span><span class="s">$SERVER_NAME</span><span class="w"></span>
-
- <span class="c1"># These are caching settings for my dev branch of Piku</span><span class="w"></span>
- <span class="na">NGINX_CACHE_SIZE</span><span class="o">=</span><span class="s">2</span><span class="w"></span>
- <span class="na">NGINX_CACHE_TIME</span><span class="o">=</span><span class="s">28800</span><span class="w"></span>
- <span class="na">NGINX_CACHE_DAYS</span><span class="o">=</span><span class="s">12</span><span class="w"></span>
- <span class="c1"># This has nginx cache these prefixes</span><span class="w"></span>
- <span class="na">NGINX_CACHE_PREFIXES</span><span class="o">=</span><span class="s">/media,/proxy </span><span class="w"></span>
-
- <span class="c1"># This maps static user media directly to an nginx route</span><span class="w"></span>
- <span class="na">NGINX_STATIC_PATHS</span><span class="o">=</span><span class="s">/media:/home/piku/media,/static:static,/robots.txt:static/robots.txt</span><span class="w"></span>
- <span class="na">PORT</span><span class="o">=</span><span class="s">8000</span><span class="w"></span>
- <span class="c1"># You want to set these, trust me. I should make them defaults in Piku.</span><span class="w"></span>
- <span class="na">PYTHONIOENCODING</span><span class="o">=</span><span class="s">UTF_8:replace</span><span class="w"></span>
- <span class="na">PYTHONUNBUFFERED</span><span class="o">=</span><span class="s">1</span><span class="w"></span>
- <span class="na">TZ</span><span class="o">=</span><span class="s">Europe/Lisbon</span><span class="w"></span>
- <span class="c1"># This tells uWSGI to shut down idle HTTP workers</span><span class="w"></span>
- <span class="c1"># Saves RAM, but startup from idle is a bit more expensive CPU-wise</span><span class="w"></span>
- <span class="na">UWSGI_IDLE</span><span class="o">=</span><span class="s">60</span><span class="w"></span>
- <span class="c1"># We need to run at least 2 uWSGI workers for Takahe</span><span class="w"></span>
- <span class="na">UWSGI_PROCESSES</span><span class="o">=</span><span class="s">2</span><span class="w"></span>
- <span class="c1"># Each worker will have this many threads </span><span class="w"></span>
- <span class="c1"># (even though I'm only giving this 2 cores)</span><span class="w"></span>
- <span class="c1"># to match the original gunicorn config.</span><span class="w"></span>
- <span class="na">UWSGI_THREADS</span><span class="o">=</span><span class="s">4</span><span class="w"></span>
- </code></pre></div>
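- <p>To make the shape of the <code>ENV</code> file above a little more concrete, here is a minimal Python sketch of how <code>KEY=VALUE</code> lines with <code>$NAME</code> references and a JSON list value could be read. This is an illustration only, not Piku's actual parser: <code>parse_env</code> and <code>as_list</code> are hypothetical names, and the sample text is a trimmed copy of the settings shown above.</p>

```python
import json
import re

# A trimmed copy of the ENV file shown above, used only as sample input.
SAMPLE_ENV = """\
# comment lines and blank lines are skipped
SERVER_NAME=insightful.systems
NGINX_SERVER_NAME=$SERVER_NAME
LC_ALL=en_US.UTF-8
LANG=$LC_ALL
TAKAHE_CSRF_TRUSTED_ORIGINS=["http://127.0.0.1:8000", "https://127.0.0.1:8000"]
PORT=8000
"""

VAR_REF = re.compile(r"\$(\w+)")


def parse_env(text):
    """Parse KEY=VALUE lines, expanding $NAME references to earlier keys."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Unknown references are left untouched rather than blanked out.
        value = VAR_REF.sub(lambda m: env.get(m.group(1), m.group(0)), value.strip())
        env[key.strip()] = value
    return env


def as_list(value):
    """Decode a JSON list stored in a single environment value."""
    parsed = json.loads(value)
    if not isinstance(parsed, list):
        raise ValueError("expected a JSON list")
    return parsed


settings = parse_env(SAMPLE_ENV)
origins = as_list(settings["TAKAHE_CSRF_TRUSTED_ORIGINS"])
```

- <p>Splitting on the first <code>=</code> only keeps values that contain their own <code>=</code> or <code>:</code> characters intact, which matters for URLs and JSON values like the trusted origins list.</p>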
- <p>…and only very minor changes to the <code>Procfile</code>:</p>
- <div class="highlight"><pre><span></span><code><span class="nl">wsgi</span><span class="p">:</span><span class="w"> </span>takahe.wsgi<span class="err">:</span>application<span class="w"></span>
- <span class="nl">worker</span><span class="p">:</span><span class="w"> </span>python<span class="w"> </span>manage.py<span class="w"> </span>runstator<span class="w"></span>
- <span class="nl">release</span><span class="p">:</span><span class="w"> </span>python<span class="w"> </span>manage.py<span class="w"> </span>migrate<span class="w"></span>
- </code></pre></div>
- <p>In essence, I removed <code>gunicorn</code> (which I could use anyway) to let <code>uWSGI</code> handle HTTP requests and scale down to <em>zero</em> (saving RAM). And yes, <a href="https://github.com/piku/piku" rel="external">Piku</a> also supports <code>release</code> activities, thanks to <a href="https://github.com/chr15m" rel="external">Chris McCormick</a>.</p>
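- <p>For readers who have not met a <code>Procfile</code> before, the sketch below shows one way to read it as a mapping from process type to command. It is a rough illustration under my own assumptions (the <code>parse_procfile</code> helper is hypothetical and is not how Piku itself does it), using the three entries from the file above.</p>

```python
# The three entries from the Procfile shown above, used as sample input.
SAMPLE_PROCFILE = """\
wsgi: takahe.wsgi:application
worker: python manage.py runstator
release: python manage.py migrate
"""


def parse_procfile(text):
    """Map each process type to its command, splitting on the first colon only."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # partition() keeps later colons (as in "takahe.wsgi:application") intact.
        kind, _, command = line.partition(":")
        entries[kind.strip()] = command.strip()
    return entries


procs = parse_procfile(SAMPLE_PROCFILE)
```

- <p>Here the <code>wsgi</code> entry names a module and callable rather than a shell command, while <code>worker</code> and <code>release</code> are ordinary command lines.</p>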
- <p>And that was it. <em>Zero</em> code changes. None. <em>Nada</em>. And I can use exactly the same setup on <em>any</em> VPS on the planet, thanks to <a href="https://github.com/piku/piku" rel="external">Piku</a>.</p>
- <p>After a little faffing about with the media storage settings (which I got wrong the first time around, since <a href="https://jointakahe.org" rel="external">Takahē</a> also uses <code>/static</code> for its own assets), I had a fully working <a href="/space/protocols/activitypub" rel="next">ActivityPub</a> instance, and, well… John Mastodon just happened to sign up:</p>
- <figure>
- <img src="/media/blog/2022/12/21/0900/4XIkhj5AApzxqSDDOgaKutrv51k=/johnmastodon.jpg" title="John Mastodon, with fail whale by Matthew Inman" rel="hero">
- <figcaption>Every ActivityPub server ought to make this their demo account.</figcaption>
- </figure>
- <a id="anchor-teething-issues" class="anchor" href="/space/blog/2022/12/21/0900#teething-issues" rel="anchor"><h2 id="teething-issues">Teething Issues</h2></a><p><a href="https://jointakahe.org" rel="external">Takahē</a> <em>nearly</em> works with <a href="/space/db/sqlite" rel="next">SQLite</a>, but sadly it relies on <code>JSON_CONTAINS</code>, which is an unsupported feature in <a href="/space/db/sqlite" rel="next">SQLite</a> (but one which <a href="/space/db/postgresql" rel="next">PostgreSQL</a> excels at).</p>
- <p>The upshot of this was that the stator <code>worker</code> was very sad and bombed out when trying to handle hashtags–but all critical stuff worked, so there might well be a workaround.</p>
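- <p>As a rough idea of what a JSON containment lookup asks the database to do, here is a simplified pure-Python sketch. It is not how either database implements it and it ignores many edge cases; the <code>json_contains</code> function and the <code>hashtags</code> field in the sample document are made up for illustration.</p>

```python
def json_contains(container, candidate):
    """Simplified containment check over already-decoded JSON values."""
    if isinstance(container, dict) and isinstance(candidate, dict):
        # Every key/value pair of the candidate must be contained in the container.
        return all(
            key in container and json_contains(container[key], value)
            for key, value in candidate.items()
        )
    if isinstance(container, list):
        if isinstance(candidate, list):
            # Every wanted element must be contained in some element of the container.
            return all(
                any(json_contains(item, wanted) for item in container)
                for wanted in candidate
            )
        if isinstance(candidate, dict):
            return False
        # A list also contains a bare scalar that appears directly in it.
        return any(
            type(item) is type(candidate) and item == candidate for item in container
        )
    # Scalars only contain an identical scalar.
    return type(container) is type(candidate) and container == candidate


# Hypothetical document shape, used only to exercise the function.
post = {"hashtags": ["piku", "takahe"], "visibility": "public"}
has_tag = json_contains(post, {"hashtags": ["takahe"]})
missing_tag = json_contains(post, {"hashtags": ["mastodon"]})
```

- <p>A workaround on a database without this lookup would have to do something similar in application code, after loading the candidate rows.</p>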
- <p>But I took some time after breakfast to migrate the database, and since my <a href="dev/python/django" rel="nofollow">Django</a> skills are rusty, here are my notes:</p>
- <div class="highlight"><pre><span></span><code><span class="c1"># Open a shell to Piku</span>
- ssh -t piku@activitypub.lan run takahe bash
- sudo apt install postgresql
- python manage.py dumpdata > /tmp/dump.json
- sudo su - postgres
- psql
- </code></pre></div>
- <div class="highlight"><pre><span></span><code><span class="c1">-- Set up the database</span>
- <span class="k">create</span><span class="w"> </span><span class="k">user</span><span class="w"> </span><span class="n">piku</span><span class="p">;</span><span class="w"></span>
- <span class="k">create</span><span class="w"> </span><span class="k">database</span><span class="w"> </span><span class="n">takahe</span><span class="p">;</span><span class="w"></span>
- <span class="k">alter</span><span class="w"> </span><span class="k">role</span><span class="w"> </span><span class="n">piku</span><span class="w"> </span><span class="k">with</span><span class="w"> </span><span class="n">password</span><span class="w"> </span><span class="s1">'<mysecret>'</span><span class="p">;</span><span class="w"></span>
- <span class="k">grant</span><span class="w"> </span><span class="k">all</span><span class="w"> </span><span class="k">privileges</span><span class="w"> </span><span class="k">on</span><span class="w"> </span><span class="k">database</span><span class="w"> </span><span class="n">takahe</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">piku</span><span class="p">;</span><span class="w"></span>
- <span class="k">alter</span><span class="w"> </span><span class="k">database</span><span class="w"> </span><span class="n">takahe</span><span class="w"> </span><span class="k">owner</span><span class="w"> </span><span class="k">to</span><span class="w"> </span><span class="n">piku</span><span class="p">;</span><span class="w"></span>
- </code></pre></div>
- <div class="highlight"><pre><span></span><code><span class="c1"># Reset all the migrations, just in case</span>
- find . -path "*/migrations/*.py" -not -name "__init__.py" -delete
- find . -path "*/migrations/*.pyc" -delete
- <span class="c1"># Reapply them</span>
- python manage.py makemigrations
- python manage.py migrate
- <span class="c1"># Wipe all default entities (the two lines after the next command are typed at the Django shell prompt)</span>
- python manage.py shell
- from django.contrib.contenttypes.models import ContentType
- ContentType.objects.all<span class="o">()</span>.delete<span class="o">()</span>
- <span class="c1"># Load everything back</span>
- python manage.py loaddata /tmp/dump.json
- </code></pre></div>
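- <p>Once the data is loaded, the only remaining change is swapping <code>TAKAHE_DATABASE_SERVER</code> to the <code>postgres://</code> URL that is commented out in the <code>ENV</code> file. The sketch below shows how such a URL breaks down into its parts using Python's standard library. It is not Takahē's own settings code: <code>describe_database_url</code> is a hypothetical helper and the password is a placeholder.</p>

```python
from urllib.parse import urlsplit


def describe_database_url(url):
    """Break a database URL into the pieces a settings module would need."""
    parts = urlsplit(url)
    return {
        "engine": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "name": parts.path.lstrip("/"),
    }


# "example-secret" is a placeholder, not a real credential.
info = describe_database_url("postgres://piku:example-secret@localhost/takahe")
```

- <p>The user, database name and host here match the ones created in the SQL notes above, so a mismatch between the two is the first thing worth checking if the app cannot connect.</p>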
- <a id="anchor-performance" class="anchor" href="/space/blog/2022/12/21/0900#performance" rel="anchor"><h2 id="performance">Performance</h2></a><p>Overall, I’m quite impressed with the whole thing. Even with such measly resources and Linux’s tendency to take up RAM with buffers, <a href="https://jointakahe.org" rel="external">Takahē</a> under <a href="https://github.com/piku/piku" rel="external">Piku</a> is taking up around 100MB per active worker (2 web handlers, plus the stator worker), plus less than 50MB for <a href="/space/db/postgresql" rel="next">PostgreSQL</a> and <code>nginx</code> <em>together</em>.</p>
- <p>So I’m seeing <em>less than 512MB of RAM</em> in actual use, and a steady <10% CPU load inside the container as the stator keeps picking up inbound updates, handling them (including any outbound requests) and doing all the messy housekeeping associated with <a href="/space/protocols/activitypub" rel="next">ActivityPub</a>:</p>
- <figure>
- <img src="/media/blog/2022/12/21/0900/-ppthry9RSX9xU_aab-SgYe8sok=/pimox.jpg" title="My Pimox console, showing CPU and RAM" rel="hero">
- <figcaption>These are the stats a few hours later in the day, after publishing this post.</figcaption>
- </figure>
- <p>But here’s the kicker: Since this is being capped inside LXC, that is actually around 5% <em>overall</em> CPU load on the hardware–which should translate to something like 2% of CPU usage on any kind of “real” hardware. </p>
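- <p>The arithmetic behind those figures is simple enough to write down. This is a back-of-the-envelope sketch that treats the quoted numbers as upper bounds and assumes the LXC cap scales load linearly, which is a simplification; the variable names are mine.</p>

```python
WORKER_MB = 100          # "around 100MB per active worker"
WORKERS = 3              # 2 web handlers plus the stator worker
DB_AND_NGINX_MB = 50     # upper bound quoted for PostgreSQL and nginx together

# Rough memory budget for the whole stack.
ram_estimate_mb = WORKER_MB * WORKERS + DB_AND_NGINX_MB

CONTAINER_CPU_PERCENT = 10   # upper bound of the load seen inside the container
LXC_CAP_PERCENT = 50         # the container is capped to 50% overall usage

# Load inside the container, rescaled to the share of the host it may use.
host_cpu_percent = CONTAINER_CPU_PERCENT * LXC_CAP_PERCENT / 100
```

- <p>That works out to roughly 350MB of RAM, comfortably under the 512MB mark, and about 5% of the host's CPU.</p>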
- <p>With only one active user for now (but following a few accounts already), this is very, very promising.</p>
- <p>I have no real plans to leave <code>mastodon.social</code> for my own domain, but using <a href="https://jointakahe.org" rel="external">Takahē</a> to host a small group of people (or a company) with nothing more than a tiny VPS seems entirely feasible, and is certainly in my future.</p>
- <a id="anchor-next-steps" class="anchor" href="/space/blog/2022/12/21/0900#next-steps" rel="anchor"><h2 id="next-steps">Next Steps</h2></a><p>Right now, I’m going to try to contribute by testing various iOS clients (I will be using the <a href="https://jointakahe.org" rel="external">Takahē</a> public test instance as well) and do some minor tweaks to my install, namely:</p>
- <ul>
- <li>Setting up <code>nginx</code> caching. Cloudflare is already caching one third of the data, but I want to bulk up this setup so that I can eventually move it to Azure, and I’ve been meaning to add that to <a href="https://github.com/piku/piku" rel="external">Piku</a> anyway.</li>
- <li>Fine-tuning the stator to see how it scales up or down (I might want to try to scale it down further).</li>
- <li>Trying <code>gunicorn</code> to see if it makes any difference in overall RAM and CPU.</li>
- <li>Seeing if I can get it to work on Azure Functions (that is sure to be fun, although the current SDK failed to install on my M1 and I haven’t tried since).</li>
- <li>Looking at how media assets are handled and seeing if I can add a patch to support Azure Storage via <a href="https://github.com/rcarmo/aioazstorage" rel="external">my own <code>aioazstorage</code> library</a>.</li>
- <li>Deploying on my <a href="https://github.com/rcarmo/azure-k3s-cluster" rel="external">k3s cluster</a>, to get a feel for how much it would cost to run on spot instances.</li>
- </ul>
- <p>There goes my holiday break, I guess…</p>
- <h3 id="update-a-few-days-later"><strong>Update:</strong> A Few Days Later</h3>
- <p>I’ve since sorted out <code>nginx</code> caching in <a href="https://github.com/piku/piku" rel="external">Piku</a> (and will soon be merging it to <code>main</code>), which makes things significantly snappier. I’ve also filed <a href="https://github.com/jointakahe/takahe/issues/287" rel="external">#287</a> to improve caching via Cloudflare and <a href="https://github.com/jointakahe/takahe/pull/288" rel="external">#288</a> to have <code>nginx</code> immediately cache assets (which works for me, at least).</p>
- <p>Before that, I had some fun tuning stator pauses and filed <a href="https://github.com/jointakahe/takahe/issues/232" rel="external">#232</a>, which resulted in a tweak that lowered idle CPU consumption to a pretty amazing 3% in my test instance.</p>
- <p>With the caching tweaks, <code>gunicorn</code> doesn’t have any real advantage over <code>uWSGI</code> workers, although I suspect that may be different in higher-load instances.</p>
- <p>I’ve also tossed the source tree into an Azure Function and got it to “work”, but not fully. Right now I’m not sure that is worth pursuing given I still need an external database, but I’m really curious to try again in a few months’ time.</p>
|