title: The offline cookbook url: http://jakearchibald.com/2014/offline-cookbook/ hash_url: 100b33530311027216b4e56029c7a20a
When AppCache arrived on the scene it gave us a couple of patterns to make content work offline. If those were the patterns you needed, congratulations, you won the AppCache lottery (the jackpot remains unclaimed), but the rest of us were left huddled in a corner rocking back & forth.
With ServiceWorker (intro) we gave up trying to solve offline, and gave developers the moving parts to go solve it themselves. It gives you control over caching and how requests are handled. That means you get to create your own patterns. Let's take a look at a few possible patterns in isolation, but in practice you'll likely use many of them in tandem depending on URL & context.
All code examples work today in Chrome 40 beta with the cache polyfill, unless otherwise noted. This will land in the stable version January/February 2015 barring any emergencies, so it won't be long until millions of real users can benefit.
For a working demo of some of these patterns, see Trained-to-thrill, and this video showing the performance impact.
ServiceWorker lets you handle requests independently from caching, so we'll look at them separately. First up: caching. When should it be done?
ServiceWorker gives you an install event. You can use this to get stuff ready, stuff that must be ready before you handle other events. While this happens any previous version of your ServiceWorker is still running & serving pages, so the things you do here mustn't disrupt that.
Ideal for: CSS, images, fonts, JS, templates… basically anything you'd consider static to that "version" of your site.
These are things that would make your site entirely non-functional if they failed to fetch, things an equivalent native-app would make part of the initial download.
```js
self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open('mysite-static-v3').then(function(cache) {
      return cache.addAll([
        '/css/whatever-v3.css',
        '/css/imgs/sprites-v6.png',
        '/css/fonts/whatever-v8.woff',
        '/js/all-min-v4.js'
        // etc
      ]);
    })
  );
});
```
`event.waitUntil` takes a promise to define the length & success of the install. If the promise rejects, the installation is considered a failure and this ServiceWorker will be abandoned (if an older version is running, it'll be left intact). `caches.open` and `cache.addAll` return promises. If any of the resources fail to fetch, the `cache.addAll` call rejects.
On trained-to-thrill I use this to cache static assets.
Similar to above, but won't delay install completing and won't cause installation to fail if caching fails.
Ideal for: Bigger resources that aren't needed straight away, such as assets for later levels of a game.
```js
self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open('mygame-core-v1').then(function(cache) {
      cache.addAll(
        // levels 11-100
      );
      return cache.addAll(
        // core assets & levels 1-10
      );
    })
  );
});
```
We're not passing the `cache.addAll` promise for levels 11-100 back to `event.waitUntil`, so even if it fails, the game will still be available offline. Of course, you'll have to cater for the possible absence of those levels & reattempt caching them if they're missing.
Also, the ServiceWorker may be killed while levels 11-100 download since it's finished handling events, but the download will continue in the background.
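Catering for that absence might look something like the sketch below. `cacheMissing` is a hypothetical helper (not part of any spec) you could call when the player nears the later levels; it checks which of the optional URLs are absent from the cache and fetches only those:

```js
// Hypothetical helper: given a cache and a list of optional URLs, work
// out which ones are missing and cache only those. You might call this
// when the player reaches level 10, for example.
function cacheMissing(cache, urls) {
  return Promise.all(
    urls.map(function(url) {
      // cache.match resolves with undefined on a miss
      return cache.match(url).then(function(response) {
        return response ? null : url;
      });
    })
  ).then(function(results) {
    var missing = results.filter(Boolean);
    return missing.length ? cache.addAll(missing) : Promise.resolve();
  });
}
```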
Ideal for: Clean-up & migration.
Once a new ServiceWorker has installed & a previous version isn't being used, the new one activates, and you get an `activate` event. Because the old version is out of the way, it's a good time to handle schema migrations in IndexedDB and also delete unused caches.
```js
self.addEventListener('activate', function(event) {
  event.waitUntil(
    caches.keys().then(function(cacheNames) {
      return Promise.all(
        cacheNames.filter(function(cacheName) {
          // Return true if you want to remove this cache,
          // but remember that caches are shared across
          // the whole origin
        }).map(function(cacheName) {
          return caches.delete(cacheName);
        })
      );
    })
  );
});
```
During activation, other events such as `fetch` are put into a queue, so a long activation could potentially block page loads. Keep your activation as lean as possible; only use it for things you couldn't do while the old version was active.
On trained-to-thrill I use this to remove old caches.
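For the IndexedDB half of activation, a minimal sketch of a schema migration might look like the following. The `'mysite'` database name, version number, and `articles` store are made up for illustration; substitute your own schema:

```js
// Sketch of an IndexedDB schema migration you might run during activate.
// The database name, version, and 'articles' store are hypothetical.
function migrateDatabase(idbFactory) {
  return new Promise(function(resolve, reject) {
    var request = idbFactory.open('mysite', 2);

    request.onupgradeneeded = function() {
      var db = request.result;
      // Create stores introduced by this version of the site
      if (!db.objectStoreNames.contains('articles')) {
        db.createObjectStore('articles', { keyPath: 'id' });
      }
    };

    request.onsuccess = function() { resolve(request.result); };
    request.onerror = function() { reject(request.error); };
  });
}

// In the activate handler you'd wrap this in event.waitUntil:
//   event.waitUntil(migrateDatabase(indexedDB));
```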
Ideal for: If the whole site can't be taken offline, you may allow the user to select the content they want available offline. E.g. a video on something like YouTube, an article on Wikipedia, a particular gallery on Flickr.
Give the user a "Read later" or "Save for offline" button. When it's clicked, fetch what you need from the network & pop it in the cache.
```js
document.querySelector('.cache-article').addEventListener('click', function(event) {
  event.preventDefault();

  var id = this.dataset.articleId;
  caches.open('mysite-article-' + id).then(function(cache) {
    fetch('/get-article-urls?id=' + id).then(function(response) {
      // /get-article-urls returns a JSON-encoded array of
      // resource URLs that a given article depends on
      return response.json();
    }).then(function(urls) {
      cache.addAll(urls);
    });
  });
});
```
The above doesn't work in Chrome yet, as we've yet to expose `fetch` and `caches` to pages (ticket #1, ticket #2). You could use `navigator.serviceWorker.controller.postMessage` to send a signal to the ServiceWorker telling it which article to cache, and handle the rest there.
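A sketch of that workaround follows. The `{ action, id }` message shape is a made-up convention for this example, and the handler mirrors the page code above. It returns the caching promise so the work is easy to track:

```js
// In the ServiceWorker. The message format ({ action, id }) is a
// hypothetical convention, mirroring the page code above.
function onCacheArticleMessage(event) {
  if (event.data.action != 'cacheArticle') return;

  var id = event.data.id;
  return caches.open('mysite-article-' + id).then(function(cache) {
    return fetch('/get-article-urls?id=' + id).then(function(response) {
      return response.json();
    }).then(function(urls) {
      return cache.addAll(urls);
    });
  });
}

// Register with: self.addEventListener('message', onCacheArticleMessage);
// The page sends: navigator.serviceWorker.controller.postMessage(
//   { action: 'cacheArticle', id: articleId });
```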
Ideal for: Frequently updating resources such as a user's inbox, or article contents. Also useful for non-essential content such as avatars, but care is needed.
If a request doesn't match anything in the cache, get it from the network, send it to the page & add it to the cache at the same time.
If you do this for a range of URLs, such as avatars, you'll need to be careful you don't bloat the storage of your origin — if the user needs to reclaim disk space you don't want to be the prime candidate. Make sure you get rid of items in the cache you don't need any more.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.open('mysite-dynamic').then(function(cache) {
      return cache.match(event.request).then(function(response) {
        return response || fetch(event.request.clone()).then(function(response) {
          cache.put(event.request, response.clone());
          return response;
        });
      });
    })
  );
});
```
To allow for efficient memory usage, you can only read a response/request's body once. In the code above, `.clone()` is used to create additional copies that can be read separately.
On trained-to-thrill I use this to cache Flickr images.
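One way to avoid bloating the cache, as warned above, is to cap the number of entries. The `trimCache` helper below is a hypothetical sketch; it relies on `cache.keys()` returning requests in insertion order, oldest first:

```js
// Hypothetical helper: delete the oldest entries so the cache holds at
// most maxItems. Relies on cache.keys() returning requests in insertion
// order, oldest first.
function trimCache(cache, maxItems) {
  return cache.keys().then(function(requests) {
    if (requests.length <= maxItems) return;
    var excess = requests.slice(0, requests.length - maxItems);
    return Promise.all(excess.map(function(request) {
      return cache.delete(request);
    }));
  });
}

// You might call this after cache.put in a fetch handler:
//   trimCache(cache, 50);
```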
Ideal for: Frequently updating resources where having the very latest version is non-essential. Avatars can fall into this category.
If there's a cached version available, use it, but fetch an update for next time.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.open('mysite-dynamic').then(function(cache) {
      return cache.match(event.request).then(function(response) {
        var fetchPromise = fetch(event.request.clone()).then(function(networkResponse) {
          cache.put(event.request, networkResponse.clone());
          return networkResponse;
        });
        return response || fetchPromise;
      });
    })
  );
});
```
This is very similar to HTTP's stale-while-revalidate.
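For comparison, the HTTP version (from RFC 5861) is expressed as a `Cache-Control` extension. With the following header the browser can use a response for up to 10 minutes, and for a further 30 seconds beyond that while it revalidates in the background:

```http
Cache-Control: max-age=600, stale-while-revalidate=30
```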
Note: Push isn't supported in Chrome yet.
The Push API is another feature built on top of ServiceWorker. This allows the ServiceWorker to be awoken in response to a message from the OS's messaging service. This happens even when the user doesn't have a tab open to your site, only the ServiceWorker is woken up. You request permission to do this from a page & the user will be prompted.
Ideal for: Content relating to a notification, such as a chat message, a breaking news story, or an email. Also infrequently changing content that benefits from immediate sync, such as a todo list update or a calendar alteration.
The common final outcome is a notification which, when tapped, opens/focuses a relevant page, but updating caches before this happens is extremely important. The user is obviously online at the time of receiving the push message, but they may not be when they finally interact with the notification, so making this content available offline is important. The Twitter native app, which is for the most part an excellent example of offline-first, gets this a bit wrong:
Without a connection, Twitter fails to provide the content relating to the push message. Tapping it does remove the notification however, leaving the user with less information than before they tapped. Don't do this!
This code updates caches before showing a notification:
```js
self.addEventListener('push', function(event) {
  if (event.data.text() == 'new-email') {
    event.waitUntil(
      caches.open('mysite-dynamic').then(function(cache) {
        return fetch('/inbox.json').then(function(response) {
          cache.put('/inbox.json', response.clone());
          return response.json();
        });
      }).then(function(emails) {
        registration.showNotification("New email", {
          body: "From " + emails[0].from.name,
          tag: "new-email"
        });
      })
    );
  }
});

self.addEventListener('notificationclick', function(event) {
  if (event.notification.tag == 'new-email') {
    // Assume that all of the resources needed to render
    // /inbox/ have previously been cached, e.g. as part
    // of the install handler.
    new WindowClient('/inbox/');
  }
});
```
Note: Background sync isn't supported in Chrome yet.
Background sync is another feature built on top of ServiceWorker. It allows you to request background data synchronisation as a one-off, or on an (extremely heuristic) interval. This happens even when the user doesn't have a tab open to your site, only the ServiceWorker is woken up. You request permission to do this from a page & the user will be prompted.
Ideal for: Non-urgent updates, especially those that happen so regularly that a push message per update would be too frequent, such as social timelines or news articles.
```js
self.addEventListener('sync', function(event) {
  if (event.id == 'update-leaderboard') {
    event.waitUntil(
      caches.open('mygame-dynamic').then(function(cache) {
        return cache.add('/leaderboard.json');
      })
    );
  }
});
```
Your origin is given a certain amount of free space to do what it wants with. That free space is shared between all origin storage: LocalStorage, IndexedDB, Filesystem, and of course Caches.
The amount you get isn't spec'd, it will differ depending on device and storage conditions. You can find out how much you've got via:
```js
navigator.storageQuota.queryInfo("temporary").then(function(info) {
  console.log(info.quota);
  // Result: <quota in bytes>
  console.log(info.usage);
  // Result: <used data in bytes>
});
```
However, like all browser storage, the browser is free to throw it away if the device comes under storage pressure. Unfortunately the browser can't tell the difference between those movies you want to keep at all costs, and the game you don't really care about.
To work around this, there's a proposed API, Storage Durability:
```js
// From a page:
navigator.requestStorageDurability().then(function() {
  // Hurrah, your data is here to stay!
});
```
Of course, the user has to grant permission. Making the user part of this flow is important, as we can now expect them to be in control of deletion. If their device comes under storage pressure, and clearing non-essential data doesn't solve it, the user gets to make a judgement call on which items to keep and remove.
For this to work, it requires operating systems to treat "durable" origins as equivalent to native apps in their breakdowns of storage usage, rather than reporting the browser as a single item.
It doesn't matter how much caching you do, the ServiceWorker won't use the cache unless you tell it when & how. Here are a few patterns for handling requests:
Ideal for: Anything you'd consider static to that "version" of your site. You should have cached these in the install event, so you can depend on them being there.
```js
self.addEventListener('fetch', function(event) {
  // If a match isn't found in the cache, the response
  // will look like a connection error
  event.respondWith(caches.match(event.request));
});
```
…although you don't often need to handle this case specifically, "Cache, falling back to network" covers it.
Ideal for: Things that have no offline equivalent, such as analytics pings, non-GET requests.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(fetch(event.request));
  // or simply don't call event.respondWith, which
  // will result in default browser behaviour
});
```
…although you don't often need to handle this case specifically, "Cache, falling back to network" covers it.
Ideal for: If you're building offline-first, this is how you'll handle the majority of requests. Other patterns will be exceptions based on the incoming request.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.match(event.request).then(function(response) {
      return response || fetch(event.request);
    })
  );
});
```
This gives you the "Cache only" behaviour for things in the cache and the "Network only" behaviour for anything not-cached (which includes all non-GET requests, as they cannot be cached).
Ideal for: Small assets where you're chasing performance on devices with slow disk access.
With some combinations of older hard drives, virus scanners, and faster internet connections, getting resources from the network can be quicker than going to disk. However, going to the network when the user has the content on their device can be a waste of data, so bear that in mind.
```js
// Promise.race is no good to us because it rejects if
// a promise rejects before fulfilling. Let's make a proper
// race function:
function properRace(promises) {
  // we implement it by inverting Promise.all
  return Promise.all(
    promises.map(function(promise) {
      // for each promise, cast it, then swap around rejection & fulfillment
      return Promise.resolve(promise).then(function(val) {
        throw val;
      }, function(err) {
        return err;
      });
    })
  ).then(function(errs) {
    // then swap it back
    throw Error("Proper race: none fulfilled");
  }, function(val) {
    return val;
  });
}

self.addEventListener('fetch', function(event) {
  event.respondWith(
    properRace([
      caches.match(event.request),
      fetch(event.request)
    ])
  );
});
```
Ideal for: A quick-fix for resources that update frequently, outside of the "version" of the site. E.g. articles, avatars, social media timelines, game leader boards.
This means you give online users the most up-to-date content, but offline users get an older cached version. If the network request succeeds you'll most-likely want to update the cache entry.
However, this method has flaws. If the user has an intermittent or slow connection they'll have to wait for the network to fail before they get the perfectly acceptable content already on their device. This can take an extremely long time and is a frustrating user experience. See the next pattern, "Cache then network", for a better solution.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    fetch(event.request).catch(function() {
      return caches.match(event.request);
    })
  );
});
```
Ideal for: Content that updates frequently. E.g. articles, social media timelines, game leaderboards.
This requires the page to make two requests, one to the cache, one to the network. The idea is to show the cached data first, then update the page when/if the network data arrives.
Sometimes you can just replace the current data when new data arrives (e.g. game leaderboard), but that can be disruptive with larger pieces of content. Basically, don't "disappear" something the user may be reading or interacting with.
Twitter adds the new content above the old content & adjusts the scroll position so the user is uninterrupted. This is possible because Twitter retains a mostly-linear order to content. I copied this pattern for trained-to-thrill to get content on screen as fast as possible, while still displaying up-to-date content once it arrives.
```js
var networkDataReceived = false;

startSpinner();

// fetch fresh data
var networkUpdate = fetch('/data.json').then(function(response) {
  return response.json();
}).then(function(data) {
  networkDataReceived = true;
  updatePage(data);
});

// fetch cached data
caches.match('/data.json').then(function(response) {
  if (!response) throw Error("No data");
  return response.json();
}).then(function(data) {
  // don't overwrite newer network data
  if (!networkDataReceived) {
    updatePage(data);
  }
}).catch(function() {
  // we didn't get cached data, the network is our last hope:
  return networkUpdate;
}).catch(showErrorMessage).then(stopSpinner);
```
Meanwhile, in the ServiceWorker, we always go to the network & update a cache as we go.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.open('mysite-dynamic').then(function(cache) {
      return fetch(event.request.clone()).then(function(response) {
        cache.put(event.request, response.clone());
        return response;
      });
    })
  );
});
```
The above doesn't work in Chrome yet, as we've yet to expose `fetch` and `caches` to pages (ticket #1, ticket #2). In trained-to-thrill I worked around this by using XHR instead of fetch, and abusing the `Accept` header to tell the ServiceWorker where to get the result from (page code, ServiceWorker code).
If you fail to serve something from the cache and/or network you may want to provide a generic fallback.
Ideal for: Secondary imagery such as avatars, failed POST requests, "Unavailable while offline" page.
```js
self.addEventListener('fetch', function(event) {
  event.respondWith(
    // Try the cache
    caches.match(event.request).then(function(response) {
      // Fall back to network
      return response || fetch(event.request);
    }).catch(function() {
      // If both fail, show a generic fallback:
      return caches.match('/offline.html');
      // However, in reality you'd have many different
      // fallbacks, depending on URL & headers.
      // Eg, a fallback silhouette image for avatars.
    })
  );
});
```
The item you fall back to is likely to be an install dependency.
If your page is posting an email, your ServiceWorker may fall back to storing the email in an IDB 'outbox' & respond letting the page know that the send failed but the data was successfully retained.
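That fallback could be sketched like this. `queueForLater` and the outbox store are hypothetical names; in practice the outbox would be an IndexedDB object store, and the synthetic 202 response tells the page the email was retained:

```js
// Hypothetical sketch: store the failed email in an 'outbox' (in practice
// an IndexedDB object store) and reply with a synthetic 202 response so
// the page knows the data was retained for sending later.
function queueForLater(outbox, email) {
  return outbox.add(email).then(function() {
    return new Response(JSON.stringify({ queued: true }), {
      status: 202, // Accepted: stored, but not yet sent
      headers: { 'Content-Type': 'application/json' }
    });
  });
}

// In the fetch handler, something like:
//   fetch(event.request.clone()).catch(function() {
//     return queueForLater(outbox, emailData);
//   });
```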
Ideal for: Pages that cannot have their server response cached.
Rendering pages on the server makes things fast, but that can mean including state data that may not make sense in a cache, e.g. "Logged in as…". If your page is controlled by a ServiceWorker, you may instead choose to request JSON data along with a template, and render that instead.
```js
importScripts('templating-engine.js');

self.addEventListener('fetch', function(event) {
  var requestURL = new URL(event.request.url);

  event.respondWith(
    Promise.all([
      caches.match('/article-template.html').then(function(response) {
        return response.text();
      }),
      caches.match(requestURL.pathname + '.json').then(function(response) {
        return response.json();
      })
    ]).then(function(responses) {
      var template = responses[0];
      var data = responses[1];

      return new Response(renderTemplate(template, data), {
        headers: { 'Content-Type': 'text/html' }
      });
    })
  );
});
```
You don't have to pick one of these methods, you'll likely use many of them depending on request URL. For example, trained-to-thrill uses cache-on-install for the static UI, cache-on-network-response for the Flickr images & data, and cache-then-network for the search results.
Just look at the request and decide what to do:
```js
self.addEventListener('fetch', function(event) {
  // Parse the URL:
  var requestURL = new URL(event.request.url);

  // Handle requests to a particular host specifically
  if (requestURL.hostname == 'api.example.com') {
    event.respondWith(/* some combination of patterns */);
    return;
  }

  // Routing for local URLs
  if (requestURL.origin == location.origin) {
    // Handle article URLs
    if (/^\/article\//.test(requestURL.pathname)) {
      event.respondWith(/* some other combination of patterns */);
      return;
    }

    if (/\.webp$/.test(requestURL.pathname)) {
      event.respondWith(/* some other combination of patterns */);
      return;
    }

    if (event.request.method == 'POST') {
      event.respondWith(/* some other combination of patterns */);
      return;
    }

    if (/cheese/.test(requestURL.pathname)) {
      event.respondWith(
        new Response("Flagrant cheese error", {
          status: 512
        })
      );
      return;
    }
  }

  // A sensible default pattern
  event.respondWith(
    caches.match(event.request).then(function(response) {
      return response || fetch(event.request);
    })
  );
});
```
…you get the picture.
If you come up with additional patterns, throw them at me in the comments!