title: Self-Host Your Static Assets
url: https://csswizardry.com/2019/05/self-host-your-static-assets/
hash_url: 6f8793385f7b1ea2511bf614b5287397
<details>
<summary>Table of Contents</summary>
<ol>
<li><a href="#what-am-i-talking-about">What Am I Talking About?</a></li>
<li><a href="#risk-slowdowns-and-outages">Risk: Slowdowns and Outages</a></li>
<li><a href="#risk-service-shutdowns">Risk: Service Shutdowns</a></li>
<li><a href="#risk-security-vulnerabilities">Risk: Security Vulnerabilities</a>
<ol>
<li><a href="#mitigation-subresource-integrity">Mitigation: Subresource Integrity</a></li>
</ol>
</li>
<li><a href="#penalty-network-negotiation">Penalty: Network Negotiation</a>
<ol>
<li><a href="#mitigation-preconnect">Mitigation: <code class="highlighter-rouge">preconnect</code></a></li>
</ol>
</li>
<li><a href="#penalty-loss-of-prioritisation">Penalty: Loss of Prioritisation</a></li>
<li><a href="#penalty-caching">Penalty: Caching</a></li>
<li><a href="#myth-cross-domain-caching">Myth: Cross-Domain Caching</a></li>
<li><a href="#myth-access-to-a-cdn">Myth: Access to a CDN</a></li>
<li><a href="#self-host-your-static-assets">Self-Host Your Static Assets</a></li>
</ol>
</details>
<p>One of the quickest wins (and one of the first things I recommend my clients
do) to make websites faster can at first seem counter-intuitive: you should
self-host all of your static assets, forgoing others’ CDNs/infrastructure. In
this short and hopefully very straightforward post, I want to outline the
disadvantages of hosting your static assets ‘off-site’, and the overwhelming
benefits of hosting them on your own origin.</p>
<h2 id="what-am-i-talking-about">What Am I Talking About?</h2>
<p>It’s not uncommon for developers to link to static assets such as libraries or
plugins that are hosted at a public/CDN URL. A classic example is jQuery, which
we might link to like so:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"&gt;&lt;/script&gt;
</code></pre></div></div>
<p>There are a number of perceived benefits to doing this, but my aim later in this
article is to either debunk these claims, or show how other costs vastly
outweigh them.</p>
<ul>
<li><strong>It’s convenient.</strong> It requires very little effort or brainpower to include
files like this. Copy and paste a line of HTML and you’re done. Easy.</li>
<li><strong>We get access to a CDN.</strong> <code class="highlighter-rouge">code.jquery.com</code> is served by
<a href="https://www.stackpath.com/products/cdn/">StackPath</a>, a CDN. By linking to
assets on this origin, we get CDN-quality delivery, free!</li>
<li><strong>Users might already have the file cached.</strong> If <code class="highlighter-rouge">website-a.com</code> links to
<code class="highlighter-rouge">https://code.jquery.com/jquery-3.3.1.slim.min.js</code>, and a user goes from there
to <code class="highlighter-rouge">website-b.com</code>, which also links to
<code class="highlighter-rouge">https://code.jquery.com/jquery-3.3.1.slim.min.js</code>, then the user will already
have that file in their cache.</li>
</ul>
<h2 id="risk-slowdowns-and-outages">Risk: Slowdowns and Outages</h2>
<p>I won’t go into too much detail in this post, because I have a <a href="https://csswizardry.com/2017/07/performance-and-resilience-stress-testing-third-parties/">whole
article</a>
on the subject of third-party resilience and the risks associated with slowdowns
and outages. Suffice it to say, if you have any critical assets served by
a third-party provider, and that provider is suffering slowdowns or, heaven forbid,
outages, it’s pretty bleak news for you. You’re going to suffer, too.</p>
<p>If you have any render-blocking CSS or synchronous JS hosted on third-party
domains, go and bring it onto your own infrastructure <em>right now</em>. Critical
assets are far too valuable to leave on someone else’s servers.</p>
<h2 id="risk-service-shutdowns">Risk: Service Shutdowns</h2>
<p>A far less common occurrence, but what happens if a provider decides they need
to shut down the service? This is exactly what <a href="https://rawgit.com">Rawgit</a> did
in October 2018, yet (at the time of writing) a crude GitHub code search still
yields <a href="https://github.com/search?q=rawgit&amp;type=Code">over a million
references</a> to the now-sunset
service, and almost 20,000 live sites still link to it!</p>
<figure>
<img src="/wp-content/uploads/2019/05/big-query-rawgit.jpg" alt=""/>
<figcaption>Many thanks to <a href="https://twitter.com/paulcalvano">Paul
Calvano</a> who very kindly <a href="https://bigquery.cloud.google.com/savedquery/226352634162:7c27aa5bac804a6687f58db792c021ee">queried
the HTTPArchive</a> for me.</figcaption>
</figure>
<h2 id="risk-security-vulnerabilities">Risk: Security Vulnerabilities</h2>
<p>Another thing to take into consideration is the simple question of trust. If
we’re bringing content from external sources onto our page, we have to hope that
the assets that arrive are the ones we were expecting them to be, and that
they’re doing only what we expected them to do.</p>
<p>Imagine the damage that would be caused if someone managed to take control of
a provider such as <code class="highlighter-rouge">code.jquery.com</code> and began serving compromised or malicious
payloads. It doesn’t bear thinking about!</p>
<h3 id="mitigation-subresource-integrity">Mitigation: Subresource Integrity</h3>
<p>To the credit of all of the providers referenced so far in this article, they do
all make use of <a href="https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity">Subresource
Integrity</a>
(SRI). SRI is a mechanism by which the provider supplies a hash (technically,
a hash that is then Base64 encoded) of the exact file that you both expect and
intend to use. The browser can then check that the file you received is indeed
the one you requested.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;script src="https://code.jquery.com/jquery-3.4.1.slim.min.js"
        integrity="sha256-pasqAKBDmFT4eHoN2ndd6lN370kFiGUFyTiUHWhU7k8="
        crossorigin="anonymous"&gt;&lt;/script&gt;
</code></pre></div></div>
<p>Again, if you absolutely must link to an externally hosted static asset, make
sure it’s SRI-enabled. You can add SRI yourself using <a href="https://www.srihash.org/">this handy
generator</a>.</p>
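<p>As a rough illustration of what such a generator does, the following Python
sketch computes an <code class="highlighter-rouge">integrity</code> value from a file’s bytes. The function name
<code class="highlighter-rouge">sri_hash</code> and the SHA-384 default are assumptions made for this example; they
are not taken from any provider’s tooling.</p>

```python
import base64
import hashlib

# Hash algorithms that browsers accept in an integrity attribute.
SRI_ALGORITHMS = {"sha256", "sha384", "sha512"}


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value ("<algorithm>-<base64 digest>") for content.

    sri_hash is a hypothetical helper name used only for this sketch.
    """
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    # Hash the raw bytes, then Base64-encode the binary digest (not the hex form).
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Hashing an empty payload keeps the example self-contained.
    print(sri_hash(b"", "sha256"))
```

<p>In practice you would pass the bytes of the exact file you intend to serve, and
paste the result into the <code class="highlighter-rouge">integrity</code> attribute.</p>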
<h2 id="penalty-network-negotiation">Penalty: Network Negotiation</h2>
<p>One of the biggest and most immediate penalties we pay is the cost of opening
new TCP connections. Every new origin we need to visit needs a connection
opening, and that can be very costly: DNS resolution, TCP handshakes, and TLS
negotiation all add up, and the story gets worse the higher the latency of the
connection is.</p>
<p>I’m going to use an example taken straight from Bootstrap’s own <a href="https://getbootstrap.com/docs/4.3/getting-started/introduction/">Getting
Started</a>. They
instruct users to include the following four files:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="..." crossorigin="anonymous"&gt;
&lt;script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="..." crossorigin="anonymous"&gt;&lt;/script&gt;
&lt;script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="..." crossorigin="anonymous"&gt;&lt;/script&gt;
&lt;script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="..." crossorigin="anonymous"&gt;&lt;/script&gt;
</code></pre></div></div>
<p>These four files are hosted across three different origins, so we’re going to
need to open three TCP connections. How much does that cost?</p>
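<p>To make the count concrete, here is a small Python sketch that groups the four
URLs above by origin. The helper names <code class="highlighter-rouge">origin_of</code> and
<code class="highlighter-rouge">distinct_origins</code> are hypothetical, and the sketch treats scheme plus host
(and any explicit port) as the origin.</p>

```python
from urllib.parse import urlsplit

# The four asset URLs from Bootstrap's Getting Started snippet.
ASSET_URLS = [
    "https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css",
    "https://code.jquery.com/jquery-3.3.1.slim.min.js",
    "https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js",
    "https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js",
]


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for an absolute URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def distinct_origins(urls: list[str]) -> list[str]:
    """Return the sorted set of origins; each one needs its own connection."""
    return sorted({origin_of(url) for url in urls})


if __name__ == "__main__":
    for origin in distinct_origins(ASSET_URLS):
        print(origin)
```

<p>Two of the four files share <code class="highlighter-rouge">stackpath.bootstrapcdn.com</code>, which is why four
files only cost three extra connections.</p>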
<p>Well, on a reasonably fast connection, hosting these static assets off-site is
311ms, or 1.65×, slower than hosting them ourselves.</p>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-off-site-cable.png" alt=""/>
<figcaption>By linking to three different origins in order to serve static
assets, we cumulatively lose a needless 805ms to network negotiation. <a href="https://www.webpagetest.org/result/190531_FY_618f9076491312ef625cf2b1a51167ae/3/details/">Full
test.</a></figcaption>
</figure>
<p>Okay, so not exactly terrifying, but Trainline, a client of mine, found that by
reducing latency by 300ms, <a href="https://wpostats.com/2016/05/04/trainline-spending.html">customers spent an extra £8m
a year</a>. This is
a pretty quick way to make eight mill.</p>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-self-hosted-cable.png" alt=""/>
<figcaption>By simply moving our assets onto the host domain, we completely
remove any extra connection overhead. <a href="https://www.webpagetest.org/result/190531_FX_f7d7b8ae511b02aabc7fa0bbef0e37bc/3/details/">Full
test.</a></figcaption>
</figure>
<p>On a slower, higher-latency connection, the story is much, much worse. Over 3G,
the externally-hosted version comes in at an eye-watering <strong>1.765s slower</strong>.
I thought this was meant to make our site faster?!</p>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-off-site-3g.png" alt=""/>
<figcaption>On a high latency connection, network overhead totals a whopping
5.037s. All completely avoidable. <a href="https://www.webpagetest.org/result/190531_XE_a95eebddd2346f8bb572cecf4a8dae68/3/details/">Full
test.</a></figcaption>
</figure>
<p>Moving the assets onto our own infrastructure brings load times down from around
5.4s to just 3.6s.</p>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-self-hosted-3g.png" alt=""/>
<figcaption>By self-hosting our static assets, we don’t need to open any more
connections. <a href="https://www.webpagetest.org/result/190531_ZF_4d76740567ec1eba1e6ec67acfd57627/1/details/">Full
test.</a></figcaption>
</figure>
<p>If this isn’t already a compelling enough reason to self-host your static
assets, I’m not sure what is!</p>
<h3 id="mitigation-preconnect">Mitigation: <code class="highlighter-rouge">preconnect</code></h3>
<p>Naturally, my whole point here is that you should not host any static assets
off-site if you’re otherwise able to self-host them. However, if your hands are
somehow tied, then you can use <a href="https://speakerdeck.com/csswizardry/more-than-you-ever-wanted-to-know-about-resource-hints?slide=28">a <code class="highlighter-rouge">preconnect</code> Resource
Hint</a>
to preemptively open a connection (DNS lookup, TCP handshake, and TLS
negotiation) to the specified origin(s):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;head&gt;
  ...
  &lt;link rel="preconnect" href="https://code.jquery.com" /&gt;
  ...
&lt;/head&gt;
</code></pre></div></div>
<p>For bonus points, deploying these as <a href="https://andydavies.me/blog/2019/03/22/improving-perceived-performance-with-a-link-rel-equals-preconnect-http-header/">HTTP
headers</a>
will be even faster.</p>
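<p>A minimal sketch of what the header form might look like, written in Python:
the function below only formats a <code class="highlighter-rouge">Link</code> header value, and how you attach it to
a response depends entirely on your server or framework. The name
<code class="highlighter-rouge">preconnect_link_header</code> is hypothetical.</p>

```python
def preconnect_link_header(origins: list[str], crossorigin: bool = False) -> str:
    """Build a Link header value with one preconnect entry per origin.

    Hypothetical helper for illustration. Set crossorigin=True when the assets
    are fetched in CORS mode (for example, scripts that carry
    crossorigin="anonymous"), so the warmed-up connection can be reused.
    """
    suffix = "; crossorigin" if crossorigin else ""
    return ", ".join(f"<{origin}>; rel=preconnect{suffix}" for origin in origins)


if __name__ == "__main__":
    value = preconnect_link_header(["https://code.jquery.com"])
    print(f"Link: {value}")
```

<p>Multiple origins end up comma-separated in a single header value, mirroring one
<code class="highlighter-rouge">&lt;link&gt;</code> element per origin in the markup version.</p>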
<p><strong>N.B.</strong> Even if you do implement <code class="highlighter-rouge">preconnect</code>, you’re still only going to make
a small dent in your lost time: you still need to open the relevant connections,
and, especially on high latency connections, it’s unlikely that you’re ever
going to fully pay off the overhead upfront.</p>
<h2 id="penalty-loss-of-prioritisation">Penalty: Loss of Prioritisation</h2>
<p>The second penalty comes in the form of a protocol-level optimisation that we
miss out on the moment we split content across domains. If you’re running over
HTTP/2 (which, by now, you should be), you get access to prioritisation. All
streams (ergo, resources) within the same TCP connection carry a priority, and
the browser and server work in tandem to build a dependency tree of all of these
prioritised streams so that we can return critical assets sooner, and perhaps
delay the delivery of less important ones.</p>
<p><small><strong>N.B.</strong> Technically, owing to H/2’s <a href="https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/">connection
coalescence</a>,
requests can be prioritised against each other over different domains as long as
they share the same IP address and a certificate that covers both hostnames.</small></p>
<p>If we split our assets across multiple domains, we have to open up several
unique TCP connections. We cannot cross-reference any of the priorities within
these connections, so we lose the ability to deliver assets in a considered and
well designed manner.</p>
<p>Compare the two HTTP/2 dependency trees for the off-site and self-hosted
versions respectively:</p>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-dep-tree-off-site.png" alt=""/>
<figcaption>Notice how we need to build new dependency trees per
origin? Stream IDs 1 and 3 keep recurring.</figcaption>
</figure>
<figure>
<img src="/wp-content/uploads/2019/05/wpt-dep-tree-self-hosted.png" alt=""/>
<figcaption>By hosting all content under the same origin, we can build one, more
complete dependency tree. Every stream has a unique ID as they’re all in the
same tree.</figcaption>
</figure>
<p><small>Fun fact: Stream IDs with an odd number were initiated by the client;
those with an even number were initiated by the server. I honestly don’t think
I’ve ever seen an even-numbered ID in the wild.</small></p>
<p>If we serve as much content as possible from one domain, we can let H/2 do its
thing and prioritise assets more completely in the hopes of better-timed
responses.</p>
<h2 id="penalty-caching">Penalty: Caching</h2>
<p>By and large, static asset hosts seem to do pretty well at establishing
long-lived <code class="highlighter-rouge">max-age</code> directives. This makes sense, as static assets at versioned
URLs (as above) will never change. This makes it very safe and sensible to
enforce a reasonably aggressive cache policy.</p>
<p>That said, this isn’t always the case, and by self-hosting your assets you can
design <a href="https://csswizardry.com/2019/03/cache-control-for-civilians/">much more bespoke caching
strategies</a>.</p>
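<p>As one possible shape for such a strategy, the Python sketch below picks
a <code class="highlighter-rouge">Cache-Control</code> value based on whether a path looks versioned. Everything
here is an assumption made for illustration: the function name, the regular
expression that decides what counts as “versioned”, and the specific policy
values are not taken from this article.</p>

```python
import re

# Assumption: a path is "versioned" if it contains a semver-like segment
# (e.g. 4.3.1) or an 8+ character hex fingerprint (e.g. app.3f9a1c2b.js).
VERSIONED_PATH = re.compile(r"\d+\.\d+\.\d+|[.-][0-9a-f]{8,}\.")

ONE_YEAR_IN_SECONDS = 31536000


def cache_control_for(path: str) -> str:
    """Return a Cache-Control value for a self-hosted static asset path."""
    if VERSIONED_PATH.search(path):
        # The URL changes whenever the content does, so cache aggressively.
        return f"max-age={ONE_YEAR_IN_SECONDS}, immutable"
    # Unversioned files may change in place: store them, but revalidate first.
    return "no-cache"


if __name__ == "__main__":
    for path in ("/assets/bootstrap/4.3.1/css/bootstrap.min.css", "/assets/main.css"):
        print(path, "->", cache_control_for(path))
```

<p>The point of the split is that only you know which of your URLs are safe to
cache for a long time; a third-party host has to apply one policy to everyone.</p>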
<h2 id="myth-cross-domain-caching">Myth: Cross-Domain Caching</h2>
<p>A more interesting take is the power of cross-domain caching of assets. That is
to say, if lots and lots of sites link to the same CDN-hosted version of, say,
jQuery, then surely users are likely to have that exact file on their
machine already? Kinda like peer-to-peer resource sharing. This is one of the
most common arguments I hear in favour of using a third-party static asset
provider.</p>
<p>Unfortunately, there seems to be no published evidence that backs up these
claims. Conversely,
<a href="https://discuss.httparchive.org/t/analyzing-resource-age-by-content-type/1659">recent
research</a>
by <a href="https://twitter.com/paulcalvano">Paul Calvano</a> hints that the opposite might
be the case:</p>
<blockquote>
<p>There is a significant gap in the 1st vs 3rd party resource age of CSS and web
fonts. 95% of first party fonts are older than 1 week compared to 50% of 3rd
party fonts which are less than 1 week old! This makes a strong case for self
hosting web fonts!</p>
</blockquote>
<p>In general, third-party content seems to be less well cached than first-party
content.</p>
<p>Even more importantly, <a href="https://andydavies.me/blog/2018/09/06/safari-caching-and-3rd-party-resources/">Safari has completely disabled this
feature</a>
for fear of abuse where privacy is concerned, so the shared cache technique
cannot work for, at the time of writing, <a href="http://gs.statcounter.com/">16% of users
worldwide</a>.</p>
<p>In short, although nice in theory, there is no evidence that cross-domain
caching is in any way effective.</p>
<h2 id="myth-access-to-a-cdn">Myth: Access to a CDN</h2>
<p>Another commonly touted benefit of using a static asset provider is that they’re
likely to be running beefy infrastructure with CDN capabilities: globally
distributed, scalable, low-latency, high availability.</p>
<p>While this is absolutely true, if you care about performance, you should be
running your own content from a CDN already. With the price of modern hosting
solutions being what they are (this site is fronted by Cloudflare, which is
free), there’s very little excuse for not serving your own assets from one.</p>
<p>Put another way: if you think you need a CDN for your jQuery, you’ll need a CDN
for everything. Go and get one.</p>
<h2 id="self-host-your-static-assets">Self-Host Your Static Assets</h2>
<p>There really is very little reason to leave your static assets on anyone else’s
infrastructure. The perceived benefits are often a myth, and even if they
weren’t, the trade-offs simply aren’t worth it. Loading assets from multiple
origins is demonstrably slower. Take ten minutes over the next few days to audit
your projects, and bring any off-site static assets under your own control.</p>
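<p>One way to start that audit is sketched below in Python, using only the
standard library. It scans a page’s markup for scripts, stylesheets, and images
whose host differs from your own. The class and function names, the sample
markup, and the choice of which tags to inspect are all assumptions made for
this example.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class OffSiteAssetFinder(HTMLParser):
    """Collect script, stylesheet, and image URLs served from another host."""

    def __init__(self, own_host: str):
        super().__init__()
        self.own_host = own_host
        self.off_site: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in ("script", "img"):
            url = attr_map.get("src")
        elif tag == "link" and "stylesheet" in (attr_map.get("rel") or "").lower().split():
            url = attr_map.get("href")
        else:
            url = None
        if not url:
            return
        # Relative URLs have no hostname, so they are treated as self-hosted.
        host = urlsplit(url).hostname
        if host and host != self.own_host:
            self.off_site.append(url)


def find_off_site_assets(html: str, own_host: str) -> list[str]:
    """Return asset URLs in html whose host is not own_host."""
    finder = OffSiteAssetFinder(own_host)
    finder.feed(html)
    finder.close()
    return finder.off_site


if __name__ == "__main__":
    # Hypothetical page markup; www.example.com stands in for your own origin.
    page = (
        '<link rel="stylesheet" href="/css/site.css">'
        '<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"></script>'
        '<img src="https://www.example.com/logo.png" alt="">'
    )
    for url in find_off_site_assets(page, "www.example.com"):
        print(url)
```

<p>Anything this reports is a candidate for downloading and serving from your own
origin instead.</p>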