title: Taking the web offline with service workers
url: https://mobiforge.com/design-development/taking-web-offline-service-workers
You’re probably already familiar with the idea of offline web apps: web apps that can continue to work in the face of intermittent network connectivity. This concept has been around for a while, and various technologies have been developed along the way to achieve it, such as (Google) Gears and AppCache, but none of these addressed the offline challenge quite as well as service workers do.
Service workers provide a way for webpages to run scripts in the background even when the page is not open. A service worker can act as a proxy to a web page, intercepting requests and controlling responses, and this makes them well suited to dealing with many exciting web features, such as offline capabilities, background syncing, and push notifications. Features like these have traditionally given native apps an edge over web apps, but with service workers, all kinds of previously impossible things are now possible on the web. This means service workers are a very big deal!
The term offline web app can mean different things to different people. A good description of what is meant by an offline web app is given in this article: What’s offline and why should I care? Sometimes called online-offline, what we’re really talking about in this article is a class of web app that can function, at least to some useful degree, with an intermittent network connection.
Two main reasons are availability and performance: an offline-capable app remains usable when the network is flaky or absent, and serving content from a local cache is typically much faster than fetching it over the network.
Now, with some idea of what an offline web app is, we can consider a few use cases.
A list of interesting offline projects and use cases is given on the W3C Web and Mobile (WebMob) Interest Group’s GitHub page.
However, there is no doubt that endowing a web app with offline capabilities makes it more complex than building one without them. There are cache policies to consider, issues around stale content, when to update, and how to notify users that there is newer content, all of which must be thought through so that the user experience does not accidentally suffer. For example, updating a piece of content while a user is reading it might be jarring if the user was not expecting it. As Michael Mahemoff points out, you should be able to justify the complexity of making your web app available offline.
In this article we’ll show how to add basic offline capabilities to a web app, which roughly corresponds to the offline magazine app suggestion on the WebMob Service Workers Demo page mentioned above, although what we develop here will be more of a proof of concept than a full-blown offline solution.
Sure, browsers cache stuff all the time. The advantages here are persistence and control. Browser caches are easily overwritten, while an application cache is more persistent. Badly configured servers, for example on hosting providers you don't control, often force the client to refetch things unnecessarily. Service workers, with a full caching API, give the client complete control, and can make smarter decisions about what and when to cache. Many of us have experienced boarding a plane with a page loaded in the browser, only to find that a browser setting or cache header forces a reload attempt when we go to read the page, a reload that has no hope of succeeding; the page is gone.
Let’s get the service worker basics out of the way.
Firstly, HTTPS is required for service workers. Since service workers control web pages and how they respond to a client, HTTPS is needed to prevent hijacking.
To register a service worker, you need to include a reference to the service worker JavaScript file in your web page. You should check for service worker support first:
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js').then(function() {
    // Success
  }).catch(function() {
    // Fail :(
  });
}
Scope refers to the pages the service worker can control. In the above, the service worker can control any page under the root and its subdirectories.
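The scope can also be set explicitly when registering. Here's a minimal sketch, using a hypothetical /app/ directory (the scope option is part of the register() API):

// Hypothetical paths: a worker registered with scope '/app/' can only
// control pages at /app/ and below
navigator.serviceWorker.register('/app/sw.js', { scope: '/app/' }).then(function(reg) {
  console.log('Registered with scope:', reg.scope);
});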
The lifecycle of a service worker is a little more complex than this, but the two main events we are interested in are install and activate.
Implicit in the above is that a service worker can be installed but not active. On first page load the service worker will be installed, but it won’t take control of the page until the next page request. This default behaviour can be overridden: a service worker can take control of a page immediately by calling the skipWaiting() and clients.claim() methods.
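As a rough sketch of that immediate-control pattern (the exact placement of these calls can vary):

self.addEventListener('install', function(event) {
  // Don't wait for an old worker to be released
  event.waitUntil(self.skipWaiting());
});

self.addEventListener('activate', function(event) {
  // Take control of any open pages straight away
  event.waitUntil(self.clients.claim());
});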
Service workers can be stopped and restarted as they are needed. This means that global variables don’t exist across restarts. If you need to preserve state then you should use IndexedDB, which is available in service workers.
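As a minimal sketch of persisting state with IndexedDB from a service worker (the database and store names here are our own invention):

// Hypothetical names: persist a flag so it survives worker restarts
var open = indexedDB.open('sw-state', 1);
open.onupgradeneeded = function() {
  open.result.createObjectStore('state');
};
open.onsuccess = function() {
  var tx = open.result.transaction('state', 'readwrite');
  tx.objectStore('state').put(true, 'optedIn');
};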
Two key components of getting service workers to build offline apps are the cache, and fetch APIs. The cache allows us to store requests and responses, while fetch allows us the intercept requests, and serve up cached responses. Using cache and fetch together enables us to concoct all kinds of interesting offline scenarios. See Jake Archibald’s Offline Cookbook for an excellent hands-on review of various caching policies you might use.
Service workers bring a new cache API to the table. Let’s have a quick look at it. With the CacheStorage object we can open (and name) and delete individual caches. Within an individual cache, typical operations are to add and remove items. So, let’s open a cache and store some pages.
// Add the addAll method
importScripts('sw-cache-addall.js');

// A list of paths to cache
var paths = [
  '/',
  '/css/main.css',
  '/js/main.js',
  '/about.htm'
];

// Open the cache (and name it)
caches.open('offline-v1').then(function(cache) {
  return cache.addAll(paths);
});
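Removal works along the same lines; a short sketch, using a hypothetical old cache name:

caches.open('offline-v1').then(function(cache) {
  return cache.delete('/about.htm'); // remove a single cached item
}).then(function() {
  return caches.delete('offline-v0'); // delete an entire (hypothetical) old cache
});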
By listening for fetch events from web pages under its control, a service worker can intercept, manipulate and respond to requests for those pages. For instance, the service worker can try to pull content from its cache, and if that fails, go to the network for the original request:
self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.match(event.request).then(function(response) {
      return response || fetch(event.request);
    })
  );
});
Service workers control pages, but they don’t have access to the DOM. However, service workers and web pages can communicate by sending messages to each other via the postMessage() method. To send a message from a web page to a service worker you could use:
navigator.serviceWorker.controller.postMessage({'command': 'say-hello!'})
Note that the structure and content of the message are entirely up to you.
And to send a message from service worker to a page that it controls, you could use:
client.postMessage({'message':'hello!'});
You’ll need to get a reference to the client page, as a service worker might control multiple pages.
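One way to get those references, sketched here, is clients.matchAll(), which resolves with every page the worker currently controls:

// Broadcast a message to every page this worker controls
self.clients.matchAll().then(function(clientList) {
  clientList.forEach(function(client) {
    client.postMessage({'message': 'hello!'});
  });
});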
To receive a message in either page or service worker, we need to set up a listener for the message event. In a webpage we set it up as:
navigator.serviceWorker.addEventListener('message', function(event) {
  // do something with the message, event.data
});
In the service worker we set it up with:
self.addEventListener('message', function(event) {
  // do something with the message, event.data
});
And for completeness: to send a message to a service worker from a page and receive a response back, you must also provide a message channel in the original message that the service worker can use to reply. You can see an example of this in the Service Worker two-way postMessage example article.
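A rough sketch of that two-way pattern (the message contents here are made up):

// Page side: pass one end of a MessageChannel along with the message
var channel = new MessageChannel();
channel.port1.onmessage = function(event) {
  console.log('Reply from service worker:', event.data);
};
navigator.serviceWorker.controller.postMessage(
  {'command': 'say-hello!'},
  [channel.port2]
);

// Service worker side: reply on the transferred port
self.addEventListener('message', function(event) {
  if (event.ports[0]) {
    event.ports[0].postMessage({'reply': 'hello back!'});
  }
});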
Additionally, since a service worker intercepts requests, it’s also possible to communicate with the service worker via HTTP request, with the service worker responding perhaps with a JSON response.
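For instance, here is a sketch in which the worker answers a hypothetical /sw-status path itself, with no server round trip:

// Hypothetical endpoint: the worker synthesizes a JSON response
self.addEventListener('fetch', function(event) {
  var url = new URL(event.request.url);
  if (url.pathname === '/sw-status') {
    event.respondWith(new Response(
      JSON.stringify({'offline-ready': true}),
      { headers: { 'Content-Type': 'application/json' } }
    ));
  }
});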
After that quick introduction to service workers, you should now have a rough idea of what they are and how to use them, and you’re hopefully convinced of the benefits of offline-capable web apps. Now it's time to build an offline web app with service workers.
As a working demo, we’ll add some offline capabilities to mobiForge so that it has opt-in offline reading: when a user opts in, the service worker caches the most recent articles so that they can still be read without a network connection.
The impatient reader can click through to the demo now; after opting in, kill the network on your device, and follow the link to the offline content.
First we register the service worker on our web page:
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/sw.js').then(function() {
    console.log('Service worker registered');
  }).catch(function() {
    console.log('Service worker registration failed');
  });
} else {
  console.log('Service worker not supported');
}
We’ll want to cache some static resources when the service worker is installed, but we’ll come to that later; let’s stick with the web page code for now.
Next, we’ll build the user interface. It’s just a simple checkbox for an all-or-nothing opt-in. A better interface might allow the user to specify exactly what content he or she wanted to make available offline, by topic or category, or by author, for example.
Because a service worker can be terminated and restarted as needed, you can’t rely on it to store state across restarts. If you need persistent data, then you can use IndexedDB. Since we only want to store a simple opt-in boolean in this example, to keep things simple we’ll just use localStorage to store this state; in general, if you have state information to store, it would probably be better to use IndexedDB.
<div id="go-offline"></div>
And the JavaScript:
var optedIn = false;

window.onload = function() {
  if ('serviceWorker' in navigator) {
    document.getElementById('go-offline').innerHTML =
      '<h4>mobiForge offline reading</h4>' +
      '<input type="checkbox" name="go-offline" id="go-offline-status" value="false" />' +
      '<label for="go-offline-status">Make recent articles available for offline reading</label>' +
      '<div id="off-line-msg"></div>';
    checkOptedIn();
    document.getElementById('go-offline-status').addEventListener('click', function() {
      console.log('start/stop fetching');
      optInOut();
    });
  } else {
    document.getElementById('go-offline').innerHTML = 'ServiceWorker not supported :-(';
  }
};
Note that we need to check the opt-in status with checkOptedIn. This just checks the value of the localStorage item with the following code, and sets the checkbox appropriately:
function checkOptedIn() {
  console.log('checking opted in...');
  if (!!localStorage.getItem('offlineoptin')) {
    optedIn = true;
    document.getElementById('go-offline-status').checked = true;
    console.log('opted in');
    return;
  } else {
    console.log('not opted in');
  }
}
Note also that we added a click handler for the checkbox, which calls optInOut. Recall from earlier that pages and service workers can communicate by posting messages to each other. To allow a user to opt in to offline reading, we provide a checkbox which, when checked, will post a message to the service worker to make the most recent content available offline. We could also have a Make this article available offline option associated with each article, in addition to, or instead of, our Make all recent content available offline checkbox. This is left as an exercise for the reader.
function optInOut() {
  if (!optedIn) {
    optedIn = true;
    localStorage.setItem('offlineoptin', '1');
    navigator.serviceWorker.controller.postMessage({'command': 'offline-opt-in'});
  } else {
    optedIn = false;
    localStorage.removeItem('offlineoptin');
    navigator.serviceWorker.controller.postMessage({'command': 'offline-opt-out'});
  }
}
So that’s the interface out of the way, and communication is set up with the service worker. Let’s look at the service worker code.
The service worker needs to cache some basic static resources. When the user opts in, it should also cache the most recent content. It will pull this from the home page of the site which contains a stream of the latest posts. We’ll cache this page, plus each of the latest posts, and their associated assets.
We cache some basic static resources, such as images, scripts, and CSS, that will be needed to display every page. These resources are cached during the install event mentioned earlier; waitUntil ensures that the pages are cached before installation completes:
// Add the missing addAll functionality
importScripts('sw-cache-addall.js');

// A list of paths to cache
var paths = [
  '/',
  '/css/main.css',
  '/js/main.js',
  '/img/logo.png',
  '/about.htm'
];

self.addEventListener('install', function(event) {
  event.waitUntil(
    caches.open('offline-v1').then(function(cache) {
      return cache.addAll(paths);
    })
  );
  event.waitUntil(self.skipWaiting());
});
Next, we’ll listen out for messages from the web page, using the message event. We saw how to do this earlier:
self.addEventListener('message', function(event) {
  // do something with the message, event.data
});
To make the recent content available offline, we’ll fetch and cache the home page, as well as each of the recent articles listed on that page. We’ll scrape the HTML to get these links. This is a bit ugly; ideally, instead of scraping, it would be better to have an API that we could simply query for content.
We can pull out the links we need with a regular expression. Because we’re in control of the site, we don't need to worry that it will be brittle.
For each URL we cache, we’ll also want to cache its image assets, or the page will look broken when served from cache. So we must take a look at the response and add any images included in the content to the cache. We do this in a function called fetchAndCache:
function fetchAndCache(url, cache) {
  return fetch(url).then(function(response) {
    if (response.status < 400) {
      console.log('got ' + url);
      // Cache a clone, since a response body can only be read once
      cache.put(url, response.clone());
    }
    return response.text();
  }).then(function(text) {
    // Scrape image asset URLs out of the page HTML and cache those too
    var pattern = /img src=(?:\'|\")\/((?:files|img)\/[^\'\"]+)\"/g;
    var assets = getMatches(text, pattern, 1);
    return cache.addAll(assets);
  });
}
The message handler that uses it is outlined below:
self.addEventListener('message', function(event) {
  console.log('SW got message: ' + event.data.command);
  switch (event.data.command) {
    case 'offline-opt-in':
      // Cache homepage, and parse the top links from it
      caches.open('static-v1').then(function(cache) {
        fetch('/').then(function(response) {
          fetchAndCache('/', cache);
          return response.text();
        }).then(function(text) {
          var pattern = /a href=\"([^\'\"]+)\" class=\"item-title\"/g;
          var urls = getMatches(text, pattern, 1);
          console.log('caching: ' + urls);
          for (var i = 0; i < urls.length; i++) {
            console.log('fetching ' + urls[i]);
            fetchAndCache(urls[i], cache);
          }
        });
      });
      break;
  }
});
Another helper function is used to help pull out page and image URLs:
// helper
function getMatches(string, regex, index) {
  index || (index = 1); // default to the first capturing group
  var matches = [];
  var match;
  while (match = regex.exec(string)) {
    matches.push(match[index]);
  }
  return matches;
}
Now it’s time to start intercepting requests. We saw how to do this earlier, by listening out for the fetch event. We need to think a bit about our caching policy at this point. For this example, we’ll keep it super simple: for the home page, we’ll implement network, then cache, then fallback. For all other pages, we’ll implement cache, then network, then fallback. The reasoning is that the home page is likely to change more often than article content, so we don’t want the user to miss any new content when a network connection is available. For the articles, we go straight to cache. A better solution would be to implement cache, then network, and notify the user if there is newer content. But we’ll keep it simple here; this is more a proof of concept than a full-fledged offline solution.
self.addEventListener('fetch', function(event) {
  var requestURL = new URL(event.request.url);

  // Network, then cache, then fallback for the home page
  if (requestURL.pathname === '/') {
    event.respondWith(
      fetch(event.request).catch(function() {
        // Network failed: fall back to the cached copy
        return caches.match(event.request);
      }).then(function(response) {
        return response || caches.match('/page/content-not-available-offline');
      })
    );
    return;
  }

  // Cache, then network, then fallback for other URLs
  event.respondWith(
    caches.match(event.request).then(function(response) {
      return response || fetch(event.request);
    }).catch(function() {
      return caches.match('/page/content-not-available-offline');
    })
  );
});
And that about wraps it up. Note that this implementation is not perfect; it’s a partial offline solution retrofitted onto an existing website, and it required some ugly scraping of links from the homepage. If we were building this from the ground up, it would be worth implementing a simple content API that could be queried for content without the HTML page template, and for new or updated content. We could also do a better job of ensuring necessary scripts and other assets are available offline, and we could consider hiding functionality that is not available offline, such as the social sharing buttons. Despite these shortcomings, it still serves as a relatively accessible proof of concept of the power of service workers and, in this case, their ability to enable compelling offline experiences.
Many applications will have unique caching requirements. Should you go to cache first, network first, or both at the same time? We chose a fairly simple approach for this example. Jake Archibald lists several useful patterns in his offline cookbook article. Every application has its own needs, so choose, or develop, a caching solution that’s appropriate for yours.
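As one example, here is a sketch of a cache-then-background-refresh (stale-while-revalidate) handler, reusing our offline-v1 cache name; it's illustrative only and, for instance, doesn't filter out non-GET requests:

self.addEventListener('fetch', function(event) {
  event.respondWith(
    caches.open('offline-v1').then(function(cache) {
      return cache.match(event.request).then(function(cached) {
        // Kick off a network fetch that refreshes the cache in the background
        var network = fetch(event.request).then(function(response) {
          cache.put(event.request, response.clone());
          return response;
        });
        // Serve the cached copy immediately if we have one
        return cached || network;
      });
    })
  );
});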
Two all-important URLs for debugging service workers in Chrome are chrome://serviceworker-internals/ and chrome://inspect/#service-workers.
You can also view what caches have been opened, and what URLs are currently cached in them.
Remote debugging device, showing cache contents
You can also view registered service workers right from the Chrome Developer Tools. However, sometimes the service worker information pane disappears from Chrome developer tools altogether—be prepared to shut down Chrome and restart to bring it back!
When you’re developing, you might run into trouble with the page loading the old service worker code. The trick here is that a reload won’t necessarily load the new service worker. Instead, close the current tab altogether, and then load the page again.
It can also be useful to unregister the registered service worker in the Developer Tools before reloading. And remember: unless you're using skipWaiting()/clients.claim(), your service worker won't take control until the next time the page is loaded.
Read more about updating service workers here: Updating service workers.
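On that note, a page can also explicitly ask the browser to re-check the worker script for changes; a small sketch:

navigator.serviceWorker.getRegistration().then(function(reg) {
  if (reg) return reg.update(); // re-fetches sw.js and installs it if it changed
});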
While service workers aren't universally supported yet, most of the main browsers are either working on support or have it on their roadmap. The notable exception is Safari.
Jake Archibald's Is ServiceWorker Ready? page gives a very detailed run down of the level of support in the main browsers. A quick summary would be that right now there are usable implementations in Chrome, Firefox, and Opera; Microsoft's Edge browser is likely to build support soon, but there are no signs right now that Apple will bring service worker support to Safari any time soon.
Service Worker browser support (adapted from Is ServiceWorker Ready?)
Despite only partial support, service workers are popping up in websites everywhere. Just take a look at chrome://serviceworker-internals/ and you might be surprised to see the number of service workers listed from various sites that you've recently visited. With well planned usage, a site can function without service workers where it's not supported, but will benefit from all that service workers have to offer where it is!
Once again, documentation by Matt Gaunt and Jake Archibald at Google has been indispensable!