title: Web Packaging Format Explainer
url: https://github.com/WICG/webpackage/blob/master/explainer.md
hash_url: 7fb7e8f5fb
This document describes use cases for a new package format for web sites and
applications and outlines such a format. It replaces the
W3C TAG's Web Packaging Draft.
It serves a similar role to the "Introduction", "Using", and other
non-normative sections of specs.
Some new use cases for Web technology have motivated thinking about a multi-resource packaging format. Those new opportunities include:
Local sharing is quite popular, especially in emerging markets, due to the cost of and limits on cellular data and relatively spotty WiFi availability. It is typically done over local Bluetooth/WiFi, either by built-in OS features like Android Beam or with popular third-party apps such as ShareIt or Xender. Typically, locally stored media files and apps (APK files for Android, for example) are shared this way. Extending sharing to bundles of content and web apps (Progressive Web Apps in particular) opens up new possibilities:
Any client can snapshot the page they're currently reading. This is currently done mostly by capturing a screenshot or saving into a browser-specific format like MHTML or Web Archive, but these only support at most a single page per archive, and the second set aren't supported by other browsers.
If the original origin signs a package, the contents can get full access to browser state and network requests for that origin. This lets people share full PWAs peer-to-peer.
Local sharing tends to be popular where connectivity to the web is expensive: each byte costs money. This means the client may be stuck with an outdated version of a package for a significant amount of time, including that package's vulnerabilities. It may be feasible to periodically check for OCSP notification that a package's certificate has been revoked. We also need to design a cheap notification that the package is critically vulnerable and needs to be disabled until it can be updated.
Beacons and other physical web devices often want to 'broadcast' various content locally. Today, they broadcast a URL and make the user's device go to a web site. This delivers trusted content to the user's browser (the user can observe the address bar to verify) and allows web apps to talk back to their services. It can be useful to be able to broadcast a package containing several pages or even a simple web app, even without an immediate Web connection, for example via Bluetooth. If combined with a signature from the publisher, the loaded pages may be treated as if they were loaded via a TLS connection with a valid certificate, in terms of the origin-based security model. For example, they can use fetch() against the publisher's service or use "Add To Homescreen" for the convenience of the user.
Physical web beacons may be located in a place that has no connectivity, and their owner may only visit to update them as often as their battery needs replacement: annually or less often. This leaves a large window during which a certificate could be compromised or a package could get out of date. We think there's no way to prevent a client from trusting its initial download of a package signed by a compromised certificate. When the client gets back to a public network, it should attempt to validate both the certificate and the package using the mechanisms alluded to under Local Sharing.
Content distributors can provide the service of hosting web content that should be delivered at scale. This includes hosting both subresources (JS libraries, images) and entire content (Google AMP) on a network of servers, often provided as a service by a third party. Unfortunately, the origin-based security model of the Web limits the ways third-party caches/servers can be used. For example, in the case of hosting JS subresources, the original document must explicitly trust the distributor's origin to serve the trusted script. The user agent must use protocol-based means to verify the subresource is coming from the trusted distributor. Another example is a content distributor that caches the whole content. Because the origin of the distributor is different from the origin of the site, the browser normally can't give the loaded content the origin treatment of the site. Look at how an article from USA Today is represented:
Note the address bar indicating google.com. Also, since the content of USA Today is hosted in an iframe, it can't use all the functionality typically afforded to a top-level document:
Packages served to content distributors can staple an OCSP response and have a short expiration time, avoiding the problems with outdated packages under "Local Sharing".
See https://wicg.github.io/webpackage/draft-yasskin-webpackage-use-cases.html#requirements
We propose to introduce a packaging format for the Web that would be able to contain multiple resources (HTML, CSS, images, media files, JS files, etc.) in a "bundle". That bundle can be distributed as a regular Web resource (over HTTP[S]) and also by non-Web means, which include local storage, local sharing, non-HTTP protocols like Bluetooth, etc. Being a single "bundle", it facilitates various modes of transfer. The packages may be nested, providing a natural way to represent the bundles of resources corresponding to different origins and sites.
In addition, that format would include optional signing of the resources, which can be used to verify authenticity and integrity of the content. Once verified (this may or may not require a network connection), the content can be afforded the treatment of the claimed origin - for example showing a "green lock" with the URL in a browser, or being able to send network requests to the origin's server. This disconnects the verification of the origin from the actual network connection and enables many new scenarios for consuming web content, including time-shifted delivery (when content is delivered by an opportunistic restartable download, for example), peer-to-peer sharing, or caching on local file servers.
Since the packaged "bundle" can be quite large (a game with a lot of resources or the content of multiple web sites), efficient access to that content becomes important. For example, it would often be prohibitively expensive to "unpack" or otherwise pre-process such a large resource on the client device. Unpacking, for example, may require twice the space in the device's storage, which can be a problem, especially on low-end devices. We propose an optional Content Index structure that allows the bundle to be consumed (browsed) efficiently as is, without unpacking - by adding an index-like structure which provides direct offsets into the package.
This is roughly based on the existing Packaging on the Web W3C TAG proposal, but we've made significant changes:
Note that this is just an explainer, not a specification. We'll add more precision when we translate it to a spec.
The package is a CBOR-encoded data item with MIME type application/package+cbor. It logically contains a flat sequence of resources represented as HTTP responses. The package also includes metadata such as a manifest and an index to let consumers validate the resources and access them directly.
The overall structure of the item is described by the following CDDL:
webpackage = [
  magic1: h'F0 9F 8C 90 F0 9F 93 A6',  ; 🌐📦 in UTF-8.
  section-offsets: { * (($section-name .within tstr) => offset) },
  sections: ({ * $$section }) .within ({ * $section-name => any }),
  length: uint,                        ; Total number of bytes in the package.
  magic2: h'F0 9F 8C 90 F0 9F 93 A6',  ; 🌐📦 in UTF-8.
]
; Offsets are measured from the first byte of the webpackage item to the first
; byte of the target item.
offset = uint
Each section-offset points to a section with the same key, by holding the byte offset from the start of the webpackage item to the start of the section's name item.
The length holds the total length in bytes of the webpackage item and must be encoded in the uint64_t format, which makes it possible to build self-extracting executables by appending a normal web package to the extractor executable.
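The trailing length and magic2 fields can be read from the end of a file. The following is a minimal Python sketch, not a definitive implementation. It assumes the package is the last thing in the file, that the top-level array uses a definite-length header for 5 items (byte 0x85), and that each magic value is encoded as an 8-byte CBOR byte string (prefix 0x48). The name locate_package is hypothetical.

```python
import struct

# UTF-8 bytes of 🌐📦: F0 9F 8C 90 F0 9F 93 A6
MAGIC = "🌐📦".encode("utf-8")

# Trailer layout assumed here: 0x1B + 8-byte big-endian length,
# followed by 0x48 + the 8 magic2 bytes.
TRAILER_LEN = 1 + 8 + 1 + len(MAGIC)


def locate_package(blob: bytes) -> bytes:
    """Return the web package that ends at the last byte of blob."""
    trailer = blob[-TRAILER_LEN:]
    if len(trailer) != TRAILER_LEN:
        raise ValueError("input is too short to hold a package trailer")
    if trailer[0] != 0x1B or trailer[9] != 0x48 or trailer[10:] != MAGIC:
        raise ValueError("no web package trailer found")
    (length,) = struct.unpack(">Q", trailer[1:9])
    if length > len(blob):
        raise ValueError("declared length exceeds the available bytes")
    package = blob[len(blob) - length:]
    # Assumption: definite-length array of 5 items, then magic1 as a bstr.
    if package[:2] != b"\x85\x48" or package[2:10] != MAGIC:
        raise ValueError("magic1 not found at the computed start")
    return package
```

Because the length is fixed-width and sits right before magic2, a consumer can find the start of the package without parsing whatever precedes it, such as an extractor executable.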
The defined section types are:
- "indexed-content": The only required section. Maps resource keys (URLs possibly extended with HTTP headers) to HTTP2 responses. The mapping uses byte offsets to allow random access to any resource.
- "manifest": Validates that resources came from the expected source. May refer to other manifests among the responses. If this section isn't provided, the resources are un-signed and can be loaded as untrusted data.

More sections may be defined later. If an unexpected section is encountered, it is ignored.
Note that this top-level information is not signed, and so can't be trusted. Only information in the manifest and below can be trusted.
The main content of a package is an index of HTTP requests pointing to HTTP responses. These request/response pairs hold the manifests of sub-packages and the resources in the package and all of its sub-packages. Both the requests and responses can appear in any order, usually chosen to optimize loading while the package is streamed.
$section-name /= "indexed-content"
$$section //= ("indexed-content" => [
  index: [* [resource-key, offset, ? length: uint] ],
  responses: [* [response-headers: http-headers, body: bstr]],
])
resource-key = uri / http-headers
; http-headers is a byte string in HPACK format (RFC7541).
; The dynamic table begins empty for each instance of http-headers.
http-headers = bstr
A uri resource-key is equivalent to an http-headers block with ":method" set to "GET" and with ":scheme", ":authority", and ":path" headers set from the URI as described in RFC7540 section 8.1.2.3.
As an optimization, the resource-keys in the index store relative instead of absolute URLs. Each entry is resolved relative to the resolved version of the previous entry.
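The two rules above can be sketched in Python using only the standard library. This is an illustration under assumptions, not the specified algorithm: it assumes the first index entry is an absolute URL (the text does not say what the first entry resolves against), and the function names uri_to_headers and resolve_index_uris are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def uri_to_headers(uri):
    """Expand an absolute URI into the equivalent GET pseudo-headers."""
    parts = urlsplit(uri)
    # RFC7540 8.1.2.3: an empty path is sent as "/", and the query
    # component is part of ":path".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return [
        (":method", "GET"),
        (":scheme", parts.scheme),
        (":authority", parts.netloc),
        (":path", path),
    ]


def resolve_index_uris(entries):
    """Resolve each entry against the resolved form of the previous one."""
    resolved = []
    for entry in entries:
        base = resolved[-1] if resolved else ""  # assumption: first is absolute
        resolved.append(urljoin(base, entry))
    return resolved
```

For example, the entries "https://example.com/index.html", "otherPage.html", "images/world.png", "logo.png" resolve so that the last one lands in the /images/ directory, because it is resolved against the previous resolved entry rather than against the first one.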
TODO: Consider random access into large indices.
In addition to the CDDL constraints:
- The index must not contain two resource-keys with the same header list after HPACK decoding.
- The resource-key must not contain any headers that aren't either ":method", ":scheme", ":authority", ":path", or listed in the response-headers' "Vary" header.
- The resource-key must contain at most one of each ":method", ":scheme", ":authority", ":path" header, in that order, before any other headers. Resolving the resource-key fills in any missing pseudo-headers from that set, ensuring that all resolved keys have exactly one of each.

The optional length field in the index entries is redundant with the length prefixes on the response-headers and body in the content, but it can be used to issue Range requests for responses that appear late in the content.
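The header constraints on a resource-key could be checked along these lines. This is a hedged Python sketch that works on already-decoded (name, value) pairs; HPACK decoding and parsing of the "Vary" header are left out, and check_resource_key is a hypothetical name.

```python
PSEUDO_ORDER = [":method", ":scheme", ":authority", ":path"]


def check_resource_key(headers, vary_names):
    """Return a list of constraint violations for one decoded resource-key.

    headers: list of (name, value) pairs, in encoded order.
    vary_names: lowercase header names listed in the response's Vary header.
    """
    errors = []
    last_rank = -1
    seen_regular = False
    for name, _value in headers:
        if name in PSEUDO_ORDER:
            rank = PSEUDO_ORDER.index(name)
            if seen_regular:
                errors.append(f"{name} appears after a regular header")
            # A strictly increasing rank enforces both "at most one of
            # each" and "in that order".
            if rank <= last_rank:
                errors.append(f"{name} is duplicated or out of order")
            last_rank = max(last_rank, rank)
        else:
            seen_regular = True
            if name not in vary_names:
                errors.append(f"{name} is not listed in Vary")
    return errors
```

A key may omit pseudo-headers, since resolution fills them in, so the check only rejects duplicates, misordering, and headers that the response does not vary on.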
TODO: Now that this no longer contains a manifest, consider renaming it to something like "authenticity".
A package's manifest contains some metadata for the package, hashes for all resources included in that package, and validity information for any sub-packages the package depends on. The manifest is signed, so that UAs can trust that it comes from its claimed origin.
$section-name /= "manifest"
$$section //= ("manifest" => signed-manifest)
signed-manifest = {
  manifest: manifest,
  certificates: [+ certificate],
  signatures: [+ signature]
}
manifest = {
  metadata: manifest-metadata,
  resource-hashes: {* hash-algorithm => [hash-value]},
  ? subpackages: [* subpackage],
}
manifest-metadata = {
  date: time,
  origin: uri,
  * tstr => any,
}
; From https://www.w3.org/TR/CSP3/#grammardef-hash-algorithm.
hash-algorithm /= "sha256" / "sha384" / "sha512"
; Note that a hash value is not base64-encoded, unlike in CSP.
hash-value = bstr
; X.509 format; see https://tools.ietf.org/html/rfc5280
certificate = bstr
signature = {
  ; RFC5280 says certificates can be identified by either the
  ; issuer-name-and-serial-number or by the subject key identifier. However,
  ; issuer names are complicated, and the subject key identifier only identifies
  ; the public key, not the certificate, so we identify certificates by their
  ; index in the certificates array instead.
  keyIndex: uint,
  ; Encoded as described in TLS 1.3,
  ; https://tlswg.github.io/tls13-spec/#signature-algorithms.
  signature: bstr,
}
The metadata must include an absolute URL identifying the origin vouching for the package and the date the package was created. It may contain more keys defined in https://www.w3.org/TR/appmanifest/.
The manifest is signed by a set of certificates, including at least one that is trusted to sign content from the manifest's origin. Other certificates can sign to vouch for the package along other dimensions, for example that it was checked for malicious behavior by some authority.
The signed sequence of bytes is the concatenation of the following byte strings. This matches the TLS 1.3 format to avoid cross-protocol attacks when TLS certificates are used to sign manifests.
- The manifest CBOR item.

Each signature uses the keyIndex field to identify the certificate used to generate it.
The TLS 1.3 signing algorithm is determined from the certificate's public key type:
The following key types are not supported, for the mentioned reason:
- ed25519 and ed448: The RFC for using these in certificates isn't yet final.

As a special case, if the package is being transferred from the manifest's origin under TLS, the UA may load it without checking that its own resources match the manifest. The UA still needs to validate resources provided by sub-manifests.
The signed-manifest.certificates array should contain enough X.509 certificates to chain from the signing certificates, using the rules in RFC5280, to roots trusted by all expected consumers of the package.
Sub-package manifests can contain their own certificates or can rely on certificates in their parent packages.
Requirements on the certificates' Key Usage and Extended Key Usage are TBD. It may or may not be important to prevent TLS serving certificates from being used to sign packages, in order to prevent cross-protocol attacks.
For a resource to be valid, then for each hash-algorithm => [hash-value] in resource-hashes, the resource's hash using that algorithm needs to appear in that list of hash-values. Like in Subresource Integrity, the UA will only check one of these, but it's up to the UA which one.
The hash of a resource is the hash of its Canonical CBOR encoding using the following CDDL. Headers are decompressed before being encoded and hashed.
resource = [
  request: [
    ':method', bstr,
    ':scheme', bstr,
    ':authority', bstr,
    ':path', bstr,
    * (header-name, header-value: bstr)
  ],
  response-headers: [
    ':status', bstr,
    * (header-name, header-value: bstr)
  ],
  response-body: bstr
]
; Headers must be lower-case ascii per
; http://httpwg.org/specs/rfc7540.html#rfc.section.8.1.2, and only
; pseudo-headers can include ":".
header-name = bstr .regexp "[\x21-\x39\x3b-\x40\x5b-\x7e]+"
This differs from SRI, which only hashes the body. Note: This will usually prevent a package from relying on some of its contents being transferred as normal network responses, unless its author can guarantee the network won't change or reorder the headers.
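To make the hashing rule concrete, here is a minimal Python sketch that encodes a resource as nested CBOR arrays of byte strings with shortest-form, definite-length headers and then hashes the result. It hand-rolls only the two CBOR types the resource structure needs. All function names are hypothetical, and choosing the strongest available algorithm is one possible UA policy, not something the text requires.

```python
import hashlib


def cbor_head(major, n):
    """Shortest-form CBOR head for a major type and a length/value."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("value too large for CBOR")


def cbor_bstr(data):
    return cbor_head(2, len(data)) + data


def cbor_array(encoded_items):
    return cbor_head(4, len(encoded_items)) + b"".join(encoded_items)


def encode_resource(request, response_headers, body):
    """request / response_headers: lists of (name, value) byte-string pairs,
    already HPACK-decoded, flattened into name, value, name, value, ..."""
    req = cbor_array([cbor_bstr(x) for pair in request for x in pair])
    resp = cbor_array([cbor_bstr(x) for pair in response_headers for x in pair])
    return cbor_array([req, resp, cbor_bstr(body)])


def resource_is_valid(request, response_headers, body, resource_hashes):
    """Check one algorithm from resource-hashes (here: the strongest present)."""
    encoded = encode_resource(request, response_headers, body)
    for algorithm in ("sha512", "sha384", "sha256"):
        if algorithm in resource_hashes:
            digest = hashlib.new(algorithm, encoded).digest()
            return digest in resource_hashes[algorithm]
    return False
```

Because the request and response headers are part of the hashed bytes, changing or reordering any header changes the digest, which is the difference from SRI noted above.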
A sub-package is represented by a manifest file in the "indexed-content" section, which contains hashes of resources from another origin. The sub-package's resources are not otherwise distinguished from the rest of the resources in the package. Sub-packages can form an arbitrarily-deep tree.
There are three possible forms of dependencies on sub-packages, of which we allow two. Because a sub-package is protected by its own signature, if the main package trusts the sub-package's server, it could avoid specifying a version of the sub-package at all. However, this opens the main package up to downgrade attacks, where the sub-package is replaced by an older, vulnerable version, so we don't allow this option.
subpackage = [
  resource: resource-key,
  validation: {
    ? hash: hashes,
    ? notbefore: time,
  }
]
If the main package wants to load either the sub-package it was built with or any upgrade, it can specify the date of the original sub-package:
[32("https://example.com/loginsdk.package"), {"notbefore": 1(1486429554)}]
Constraining packages with their date makes it possible to link together sub-packages with common dependencies, even if the sub-packages were built at different times.
If the main package wants to be certain it's loading the exact version of a sub-package that it was built with, it can constrain the sub-package with a hash of its manifest:
[32("https://example.com/loginsdk.package"),
{"hash": {"sha256": 22(b64'9qg0NGDuhsjeGwrcbaxMKZAvfzAHJ2d8L7NkDzXhgHk=')}}]
Note that because the sub-package may include sub-sub-packages by date, the top package may need to explicitly list those sub-sub-packages' hashes in order to be completely constrained.
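The two allowed constraint forms could be checked as in the following Python sketch. It makes several assumptions that the text does not settle: the hash is taken over the raw bytes of the sub-package's manifest item, dates are compared as Unix timestamps (matching the CBOR tag 1 values in the examples), and a validation map with neither key is rejected because unconstrained dependencies are not allowed. The name subpackage_acceptable is hypothetical.

```python
import hashlib


def subpackage_acceptable(validation, manifest_bytes, manifest_date):
    """Decide whether a candidate sub-package satisfies its constraints.

    validation: dict with optional "hash" ({algorithm: digest bytes}) and
                optional "notbefore" (Unix timestamp).
    manifest_bytes: assumed to be the raw bytes of the sub-package manifest.
    manifest_date: the date from the sub-package manifest's metadata.
    """
    has_hash = "hash" in validation
    has_date = "notbefore" in validation
    if not (has_hash or has_date):
        # An unconstrained dependency would permit downgrade attacks.
        return False
    if has_date and manifest_date < validation["notbefore"]:
        return False
    if has_hash:
        for algorithm, expected in validation["hash"].items():
            if hashlib.new(algorithm, manifest_bytes).digest() != expected:
                return False
    return True
```

With a "notbefore" constraint, any newer build of the sub-package passes; with a "hash" constraint, only the exact manifest the main package was built against passes.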
Following are some example usages that correspond to these additions. The packages are written in CBOR's extended diagnostic notation, with the extensions that:
- hpack({key:value,...}) is an hpack encoding of the described headers.
- DER(...) is the DER encoding of a certificate described partially by the contents of the "...".

All examples are available in the examples directory.
The example web site contains two HTML pages and an image. This is a straightforward case, demonstrating the following:
- The section-offsets section declares one main section starting 1 byte into the sections item. (The 1 byte is the map header for the sections item.)
- The index maps hpack-encoded headers for each resource to the start of that resource, measured relative to the start of the responses item.
- Each resource carries date/expires headers that specify when the resource can be used by the UA, similar to the HTTP 1.1 Expiration Model. The actual expiration model is TBD and to be reflected in the spec. Note that we haven't yet described a way to set an expires value for the whole package at once.
['🌐📦',
{"indexed-content": 1},
{"indexed-content":
[
[ # Index.
[hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /index.html
}), 1],
[hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /otherPage.html
}), 121],
[hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /images/world.png
}), 243]
],
[ # Resources.
[
hpack({
:status: 200
content-type: text/html
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'<body>\n <a href=\"otherPage.html\">Other page</a>\n</body>\n'
],
[
hpack({
:status: 200
content-type: text/html
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'<body>\n Hello World! <img src=\"images/world.png\">\n</body>\n'
], [
hpack({
:status: 200
content-type: image/png
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'... binary png image ...'
]
]
]
},
473, # Always 8 bytes long.
'🌐📦'
]
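The index offsets in this example point at individual [response-headers, body] pairs inside the responses item, which is what allows random access. A minimal Python sketch of reading one response at a given offset follows. It assumes definite-length CBOR encoding throughout, treats the headers as an opaque HPACK byte string without decoding them, and uses hypothetical function names.

```python
def read_head(buf, pos):
    """Decode one CBOR head at pos; return (major type, value, next pos)."""
    first = buf[pos]
    major, info = first >> 5, first & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("indefinite or reserved lengths are not handled here")
    size = 1 << (info - 24)  # 1, 2, 4 or 8 bytes of big-endian length
    return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size


def read_bstr(buf, pos):
    major, length, pos = read_head(buf, pos)
    if major != 2:
        raise ValueError("expected a byte string")
    return buf[pos:pos + length], pos + length


def read_response(responses, offset):
    """Read the [response-headers, body] pair that starts at offset,
    measured from the first byte of the responses item."""
    major, count, pos = read_head(responses, offset)
    if major != 4 or count != 2:
        raise ValueError("expected a two-element array")
    headers, pos = read_bstr(responses, pos)
    body, pos = read_bstr(responses, pos)
    return headers, body
```

The first response sits at offset 1 because the responses array's own one-byte header comes first, which matches the offset 1 of the first index entry in the example.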
The example web site contains an HTML page and pulls a script from the well-known location (different origin). Note that there's no need to distinguish the resources from other origins vs the ones from the main origin. Since none of them are signed, the browser won't treat any as same-origin with their claimed origin.
['🌐📦',
{"indexed-content": 1},
{"indexed-content":
[
[
[hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /index.html
}), 1],
[hpack({
:method: GET
:scheme: https
:authority: ajax.googleapis.com
:path: /ajax/libs/jquery/3.1.0/jquery.min.js
}), 179]
],
[
[
hpack({
:status: 200
content-type: text/html
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'<head>\n<script src=\"https://ajax.googleapis.com/ajax/libs/jquery/3.1.0/jquery.min.js\"></script>\n<body>\n...\n</body>\n'
],
[
hpack({
:status: 200
content-type: application/javascript
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'... some JS code ...\n'
]
]
]
},
396,
'🌐📦'
]
The example contains example.com/index.html. The package is signed by the example.com publisher, using the same private key that example.com uses for HTTPS. The signed package ensures the verification of the origin even if the package is stored in a local file or obtained via other insecure ways like HTTP, or hosted on another origin's server.
Some interesting things to notice in this package:
- The "manifest" map contains "certificates" and "signatures" arrays describing how the manifest is signed.
- The signature's keyIndex identifies the first element of "certificates" as the signing certificate.
- The elements of "certificates" are DER-encoded X.509 certificates. The signing certificate is trusted for example.com, and that certificate chains, using other elements of "certificates", to a trusted root certificate. The chain is built and trusted in the same way as TLS chains during normal web browsing.
- The signature algorithm is determined by the signing certificate's public key type, prime256v1, and isn't encoded separately in the signature block.
- The manifest contains a "resource-hashes" block, which contains the hashes, using the SHA384 algorithm in this case, of all resources in the package. Unlike in Subresource Integrity, the hashes include the request and response headers.
[
'🌐📦',
{
"manifest": 1,
"indexed-content": 2057
},
{
"manifest": {
"manifest": {
"metadata": {
"date": 1(1494583200),
"origin": 32("https://example.com")
},
"resource-hashes": {
"sha384": [
h'3C3A03F7C3FC99494F6AAA25C3D11DA3C0D7097ABBF5A9476FB64741A769984E8B6801E71BB085E25D7134287B99BAAB',
h'5AA8B83EE331F5F7D1EF2DF9B5AFC8B3A36AEC953F2715CE33ECCECD58627664D53241759778A8DC27BCAAE20F542F9F',
h'D5B2A3EA8FE401F214DA8E3794BE97DE9666BAF012A4B515B8B67C85AAB141F8349C4CD4EE788C2B7A6D66177BC68171'
]
}
},
"signatures": [
{
"keyIndex": 0,
"signature": h'3044022015B1C8D46E4C6588F73D9D894D05377F382C4BC56E7CDE41ACEC1D81BF1EBF7E02204B812DACD001E0FD4AF968CF28EC6152299483D6D14D5DBE23FC1284ABB7A359'
}
],
"certificates": [
DER(
Certificate:
...
Signature Algorithm: ecdsa-with-SHA256
Issuer: C=US, O=Honest Achmed's, CN=Honest Achmed's Test Intermediate CA
Validity
Not Before: May 10 00:00:00 2017 GMT
Not After : May 18 00:10:36 2018 GMT
Subject: C=US, O=Test Example, CN=example.com
Subject Public Key Info:
Public Key Algorithm: id-ecPublicKey
Public-Key: (256 bit)
pub:
...
ASN1 OID: prime256v1
...
),
DER(
Certificate:
...
Signature Algorithm: sha256WithRSAEncryption
Issuer: C=US, O=Honest Achmed's, CN=Honest Achmed's Test Root CA
Validity
Not Before: May 10 00:00:00 2017 GMT
Not After : May 18 00:10:36 2018 GMT
Subject: C=US, O=Honest Achmed's, CN=Honest Achmed's Test Intermediate CA
Subject Public Key Info:
Public Key Algorithm: id-ecPublicKey
Public-Key: (521 bit)
pub:
...
ASN1 OID: secp521r1
...
)
]
},
"indexed-content": [
[
[ hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /index.html
}), 1],
[ hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /otherPage.html
}), 121],
[ hpack({
:method: GET
:scheme: https
:authority: example.com
:path: /images/world.png
}), 243]
],
[
[ hpack({
:status: 200
content-type: text/html
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'<body>\n <a href=\"otherPage.html\">Other page</a>\n</body>\n'
],
[ hpack({
:status: 200
content-type: text/html
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'<body>\n Hello World! <img src=\"images/world.png\">\n</body>\n'
],
[ hpack({
:status: 200
content-type: image/png
date: Wed, 15 Nov 2016 06:25:24 GMT
expires: Thu, 01 Jan 2017 16:00:00 GMT
}),
'... binary png image ...'
]
]
]
},
2541,
'🌐📦'
]
The process of validation:
- When loading a resource, pick a hash algorithm from the "resource-hashes" map, and use that to hash the Canonical CBOR representation of its request headers, response headers, and body. Verify that the resulting digest appears in that array in the "resource-hashes" map.

Examples below here are out of date
Let's add signing to the example mentioned above where a page uses a cross-origin JS library, hosted on https://ajax.googleapis.com. Since this package includes resources from 2 origins, there are 2 packages, one of them nested. Both of them should be signed by their respective publishers, since for the main page to be validated as secure (green lock, origin access) all resources that comprise it must be signed/validated - the equivalent of them being loaded via HTTPS.
Important notes:
Package-Signature: NNejtdEjGnea4VTvO7A/x+5ucZm+pGPkQ1TD32oT3oKGhPWeF0hASWjxQOXvfX5+; algorithm=sha384; certificate=urn:uuid:f47ac10b-58cc-4372-a567-0e02b2c3d479
Content-Type: application/package
Content-Location: https://example.org/examplePack.pack
Link: </index.html>; rel=describedby
Link: <https://ajax.googleapis.com/packs/jquery_3.1.0.pack>; rel=package; scope=/ajax/libs/jquery/3.1.0
Link: <urn:uuid:d479c10b-58cc-4243-97a5-0e02b2c3f47a>; rel=index; offset=12014/2048

--j38n02qryf9n0eqny8cq0
Content-Location: /index.html
Content-Type: text/html

<head>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.1.0/jquery.min.js"></script>
<body>
...
</body>
--j38n02qryf9n0eqny8cq0
Package-Signature: A/xtdEjGnea4VTvNNejO7+5ucZm+pGPkQ1TD32oT3oKGhPWeF0hASWjx5+QOXvfX; algorithm=sha384; certificate=urn:uuid:7af4c10b-58cc-4372-8567-0e02b2c3dabc
Content-Location: https://ajax.googleapis.com/packs/jquery_3.1.0.pack
Content-Type: application/package
Link: <urn:uuid:aaf4c10b-58cc-4372-8567-0e02b2c3daaa>; rel=index; offset=12014/2048

--klhfdlifhhiorefioeri1
Content-Location: /ajax/libs/jquery/3.1.0/jquery.min.js
Content-Type: application/javascript

... some JS code ...
--klhfdlifhhiorefioeri1   (This is Content Index for ajax.googleapis.com subpackage)
Content-Location: urn:uuid:aaf4c10b-58cc-4372-8567-0e02b2c3daaa
Content-Type: application/package.index

/ajax/libs/jquery/3.1.0/jquery.min.js sha384-3dEjGnea4A/xtGPkQ1TDVTvNNejO7+5ucZm+pASWjx5+QOXvfX2oT3oKGhPWeF0h 102 3876
... other entries ...
--klhfdlifhhiorefioeri1
Content-Location: urn:uuid:7af4c10b-58cc-4372-8567-0e02b2c3dabc
Content-Type: application/pkix-cert

... certificate for ajax.googleapi.com ...
--klhfdlifhhiorefioeri1--
--j38n02qryf9n0eqny8cq0   (This is Content Index for example.com package)
Content-Location: urn:uuid:d479c10b-58cc-4243-97a5-0e02b2c3f47a
Content-Type: application/package.index

/index.html sha384-WeF0h3dEjGnea4ANejO7+5/xtGPkQ1TDVTvNucZm+pASWjx5+QOXvfX2oT3oKGhP 153 215
--j38n02qryf9n0eqny8cq0
Content-Location: urn:uuid:f47ac10b-58cc-4372-a567-0e02b2c3d479
Content-Type: application/pkcs7-mime

... certificate for example.com ...
--j38n02qryf9n0eqny8cq0--
The signing part of the proposal addresses integrity and authenticity aspects of the security. It is enough for the resource to be signed to validate it belongs to the origin corresponding to the certificate used. This, in turn allows the browsers and other user agents to afford the 'origin treatment' to the resources in the package, because there is a guarantee that those resources were not tampered with.
Indeed, as is the case with web browsers as well, certificate revocation is not instant on the web. In case of packages that are consumed while device is offline (maybe for a long period of time), the revocation of the certificate may not reach device promptly. But then again, if the web resources were stored in a browser cache, or if pages were Saved As, and used when device is offline, there would be no way to receive the CRL or use OCSP for real-time certificate validation as well. Once the device is online, the certificate should be validated using best practices of the user agent and access revoked if needed.
No, we don't use what commonly is called MAC here because the packages are not encrypted (and there is no strong use case motivating such encryption) so there is no symmetrical key and therefore the traditional concept of MAC is not applicable. However, the Package-Signature contains a Digital Signature which is a hash of the (Content Index + Package Header) signed with a private key of the publisher. The Content Index contains hashes for each resource part included in the package, so the Package-Signature validates each resource part as well.
Yes, the Package Header and Content Index are hashed and this hash, signed, is provided in the Package-Signature header. The Content Index, in turn, has hashes for all resources (Header+Body) so all bits of package are covered.
No. If a package contains subpackages, those subpackages are not covered by the package's signature or hashes and must have their own Package-Signature header if they need to be signed. This reflects the fact that subpackages typically group resources from a different origin, with their own certificate. The [sub]packages are the units that typically package resources from their respective origins and are therefore separately signed.
Since they are Version 4 UUIDs, the chances of them colliding are vanishingly small.
The expiration headers in the Package Headers section prescribe the 'useful lifetime' of the package, with the UA optionally indicating the 'stale' state to the user and asking to upgrade, or automatically fetching a new one. While offline, the expiration may be ignored (not unlike Cache-Control: no-cache), but once the user is online, the UA should verify both the certificate and whether the package's Content-Location contains an updated package (per the Package Headers section), and replace the package if necessary. In general, if the device is online, the package is expired, and the original location has an updated package, the UA should obtain a new one (details TBD).
This is due to two main use cases of package loading:
Not necessarily. Different devices have different sets of roots in their trust stores, so there is not a single "correct" set of certificates to send that will work best for all clients. Instead, for compatibility, packages may need to include a set of certificates from which chains can be built to multiple roots, or rely on clients to dynamically fetch additional intermediates when needed.
This becomes a tradeoff between the package size vs the set of clients that can validate the signature offline. We expect that packaging tools will allow their users to configure this tradeoff in appropriate ways.