- title: How Dat Works
- url: https://datprotocol.github.io/how-dat-works/
- hash_url: 87310ff690bc7663b397af17121a5a09
-
- <p id="introduction"><strong>Dat is a protocol for sharing data between computers.</strong> Dat’s strengths are that data is hosted and distributed by many computers on the network, that it can work offline or with poor connectivity, that the original uploader can add or modify data while keeping a full history and that it can handle large amounts of data.</p>
- <div>
- <p>Dat is compelling because the people working on it have a dedication to user experience and ease-of-use. The software around Dat brings publishing within reach for people with a wide range of skills, not just technical. Although first designed with scientific data in mind, the Dat community is testing the waters and has begun to use it for websites, art, music releases, peer-to-peer chat programs and many other experiments.</p>
- <p>This guide is an in-depth tour through the bits and bytes of the Dat protocol, starting from a blank slate and ending with being able to download and share files with other peers running Dat. There will be enough detail for readers who are considering writing their own implementation of Dat, but if you are just curious how it works or want to learn from Dat’s design then I hope you will find this guide useful too!</p>
- </div>
-
- <h1 id="urls" class="section"><span>URLs</span></h1>
- <p>To fetch a file in Dat you need to know its URL. Here is an example:</p>
- <svg class="smallmargin pagewidth">
- <text class="code" x="0" y="18"><tspan>dat://</tspan><tspan fill="#007fff">778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639</tspan><tspan>/dat_intro.gif</tspan></text>
- <path stroke="#93a0b6" strokewidth="1" fill="none" d="M0.5,24 v4.5 h60 v-4.5 m0,4.5 h640 v-4.5 m0,4.5 h140 v-4.5"/>
- <text y="44" text-anchor="middle"><tspan x="30">protocol</tspan><tspan x="30" dy="1.2em">identifier</tspan></text>
- <text y="44" text-anchor="middle"><tspan x="380">ed25519 public key</tspan><tspan x="380" dy="1.2em">(hexadecimal)</tspan></text>
- <text y="44" text-anchor="middle"><tspan x="770">optional suffix</tspan><tspan x="770" dy="1.2em">path to data within Dat</tspan></text>
- </svg>
- <ul>
- <li><strong>Protocol identifier.</strong> Makes Dat URLs easily recognizable. Dat-capable applications can register with the operating system to handle <code>dat://</code> links, like <a href="https://beakerbrowser.com/">Beaker</a> does. In Dat-specific applications the protocol identifier can be left off.</li>
- <li><strong>Public key.</strong> An ed25519 public key unique to this Dat, used by the author to create and update data within it. The public key enables you to discover other peers who have the data and also to verify that the data was not corrupted or tampered with as it passed through the network.</li>
- <li><strong>Suffix.</strong> Identifies specific data within this Dat. For most Dats which contain a directory of files, the suffix is a slash-separated file path. Dats can also contain data in structures that don’t use the concept of files or directories, in which case the suffix would use some other format as understood by the applications that handle that sort of data.</li>
- </ul>
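- <p>Here is a minimal sketch of splitting a Dat URL into the three parts described above. The function name <code>parse_dat_url</code> and the choice to return the suffix with its leading slash are illustrative assumptions, not part of any Dat library.</p>

```python
import re

# Optional "dat://" prefix, then 64 hexadecimal characters (a 32-byte
# ed25519 public key), then an optional suffix starting with a slash.
_DAT_URL = re.compile(r"^(?:dat://)?([0-9a-fA-F]{64})(/.*)?$")


def parse_dat_url(url):
    """Split a Dat URL into (public_key_bytes, suffix)."""
    match = _DAT_URL.match(url)
    if match is None:
        raise ValueError("not a Dat URL: %r" % url)
    public_key = bytes.fromhex(match.group(1))
    suffix = match.group(2) or ""
    return public_key, suffix


key, suffix = parse_dat_url(
    "dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639/dat_intro.gif"
)
```

- <p>The public key is returned as 32 raw bytes because that is the form the rest of the protocol works with.</p>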
-
- <h1 id="discovery" class="section"><span>Discovery</span></h1>
- <p>Dat clients use several different methods for discovering peers who they can download data from. Each discovery method has strengths and weaknesses, but combined they form a reasonably robust way of finding peers.</p>
-
- <h2 id="discovery-keys">Discovery keys</h2>
- <div>
- <p>Discovery keys are used for finding other peers who are interested in the same Dat as you.</p>
- <p>If you know a Dat’s public key you can easily calculate its discovery key, but if you only know a discovery key you cannot work backwards to find the corresponding public key. This prevents eavesdroppers from learning Dat URLs (and therefore being able to read their contents) by observing network traffic.</p>
- </div>
- <p>However, eavesdroppers who already know a Dat’s public key can confirm that peers are talking about that Dat and read all communications between those peers. Eavesdroppers who do not know the public key can still get an idea of how many Dats are popular on the network, their approximate sizes, which IP addresses are interested in them and potentially the IP address of the creator by observing handshakes, traffic timing and volumes. Dat makes no attempt to hide IP addresses.</p>
- <p>Calculate a Dat’s discovery key using the <strong>BLAKE2b</strong> hashing function, keyed with the public key (as 32 bytes, not 64 hexadecimal characters), to hash the word “hypercore”:</p>
- <aside class="basealign">Dat uses the BLAKE2b variant that accepts both a key and input to be hashed, returning 256 bits (32 bytes) of output.</aside>
- <img id="blake2b" src="png/blake2b.png"/>
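- <p>The calculation can be sketched with Python’s standard library, which includes keyed BLAKE2b. The function name <code>discovery_key</code> is an assumption made for this example.</p>

```python
import hashlib


def discovery_key(public_key: bytes) -> bytes:
    """Hash the word "hypercore" with BLAKE2b, keyed with the public key."""
    if len(public_key) != 32:
        raise ValueError("public key must be 32 raw bytes, not hexadecimal text")
    # digest_size=32 selects the 256-bit output described in the text.
    return hashlib.blake2b(b"hypercore", key=public_key, digest_size=32).digest()


public_key = bytes.fromhex(
    "778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639"
)
dkey = discovery_key(public_key)
```

- <p>Note that the public key is the hash <em>key</em> and “hypercore” is the hashed <em>input</em>, not the other way around.</p>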
- <aside>
- <h3>Byte notation</h3>
- <img src="png/notation_byte.png"/>
- <p>Throughout this guide bytes are shown as a number inside a square. The number is always in decimal (base‑10) and can range from 0 to 255.</p>
- </aside>
-
- <h2 id="local-network-discovery">Local network discovery</h2>
- <div>
- <p>Peers broadcast which Dats they are interested in via their local network.</p>
- <ul>
- <li><strong>Strengths.</strong> Fast, finds physically nearby peers, doesn’t need special infrastructure, works offline.</li>
- <li><strong>Weaknesses.</strong> Limited reach.</li>
- <li><strong>Deployment status.</strong> Currently in use, will be replaced by <a href="https://pfrazee.hashbase.io/blog/hyperswarm">Hyperswarm</a> in the future.</li>
- </ul>
- </div>
- <p>Local network discovery uses multicast DNS, which works like regular DNS except that queries are sent to the whole local network instead of to a nameserver, in the hope that another device on the network sees them and responds.</p>
- <figure id="mdns-request" class="pagewidth">
- <figcaption>Client asking for peers</figcaption>
- <img src="png/mdns_request.png"/>
- </figure>
- <p>Multicast DNS packets are sent to the special multicast MAC and IP addresses shown above. Both the source and destination ports are 5353.</p>
- <p>Essentially the computer is asking “Does anybody have any TXT records for the domain name <em>25a78aa81615847eba00995df29dd41d7ee30f3b.dat.local</em>?” Other Dat clients on the network will recognize requests following this pattern and know that the client who sent it is looking for peers.</p>
- <figure id="mdns-response" class="pagewidth">
- <figcaption>Peer reporting that they are also interested in this Dat</figcaption>
- <img src="png/mdns_response.png"/>
- </figure>
- <p id="mdns-peers-record">Responses contain two TXT records:</p>
- <ul>
- <li>The <strong>token</strong> record is a random value that makes it easier for clients to avoid connecting to themselves. If a client sees a response with the same token as a response they just sent out, they will know it came from them and ignore it.</li>
- <li>The <strong>peers</strong> record is a base64-encoded list of IP addresses and ports of peers interested in this Dat:</li>
- </ul>
- <div>
- <img src="png/mdns_peers_record.png"/>
- <p>The special IP address 0.0.0.0 means “use the address this mDNS response came from”. When discovering peers on the local network all mDNS responses will contain only one peer and will use the 0.0.0.0 address.</p>
- </div>
- <aside>
- <p>Base64 encoding in Dat uses the variant with plus <code>+</code> and slash <code>/</code> characters. Padding equals <code>=</code> characters are required.</p>
- <p>Only IPv4 addresses are supported by this discovery mechanism.</p>
- <h3>Multi-byte numbers</h3>
- <p>Port numbers go from 0 to 65,535 which is larger than can fit inside a single byte, so in this case two bytes are used.</p>
- <p>The first byte is how many 256’s there are and the second byte is how many ones there are:</p>
- <img src="png/notation_multi_byte_number.png"/>
- <p>In the Dat protocol multi-byte numbers are big-endian meaning the most significant byte comes first.</p>
- </aside>
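- <p>Here is a minimal sketch of decoding a <em>peers</em> record, assuming each peer occupies six bytes: a four-byte IPv4 address followed by a two-byte big-endian port. The function name <code>decode_peers</code>, the <code>sender_ip</code> parameter and the port number in the example are illustrative assumptions.</p>

```python
import base64
import ipaddress
import struct


def decode_peers(encoded: str, sender_ip: str):
    """Decode a base64 peers record into a list of (ip, port) tuples."""
    raw = base64.b64decode(encoded, validate=True)
    if len(raw) % 6 != 0:
        raise ValueError("peers record must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(raw), 6):
        # ">4sH" = 4 raw bytes, then an unsigned 16-bit big-endian integer.
        ip_bytes, port = struct.unpack_from(">4sH", raw, offset)
        ip = str(ipaddress.IPv4Address(ip_bytes))
        if ip == "0.0.0.0":
            # Special value: use the address the response came from.
            ip = sender_ip
        peers.append((ip, port))
    return peers


# One peer, address 0.0.0.0, port 12 * 256 + 210 = 3282.
peers = decode_peers("AAAAAAzS", sender_ip="192.168.1.20")
```

- <p>The port calculation in the comment mirrors the multi-byte number rule: the first byte counts 256’s and the second byte counts ones.</p>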
-
- <h2 id="centralized-dns-discovery">Centralized DNS discovery</h2>
- <p>Peers ask a server on the internet for other peers using a DNS-based protocol.</p>
- <ul>
- <li><strong>Strengths.</strong> Fast, global reach.</li>
- <li><strong>Weaknesses.</strong> Must be online, centralized point of failure, one server sees everyone’s metadata.</li>
- <li><strong>Deployment status.</strong> Currently in use, will be replaced by <a href="https://pfrazee.hashbase.io/blog/hyperswarm">Hyperswarm</a> in the future.</li>
- </ul>
- <p>Currently the server running this is <strong>discovery1.datprotocol.com</strong>. If that goes offline then <strong>discovery2.datprotocol.com</strong> can be used as a fallback.</p>
- <p>Here is a typical message flow between a Dat peer and the DNS discovery server:</p>
- <img class="pagewidth" src="png/centralized_dns_conversation.png"/>
- <p>To stay subscribed, peers should re-announce themselves every 60 seconds. The discovery server will also cycle its tokens periodically, so peers should remember the token they last received and update it when they receive a new one.</p>
- <p id="centralized-dns-peers-record">The <em>peers</em> record returned by the discovery server uses the same structure as in mDNS:</p>
- <img src="png/mdns_peers_record_multi.png"/>
- <aside>In this case the server sent back a list of five peers. DNS TXT records are limited to 255 characters so the server is limited to sending back 31 peers at a time. If the server knows more than this it will have to choose which to send, for example the most recent, longest lived or by picking at random.</aside>
- <p>Following are three examples showing how these DNS requests appear as bytes sent over the network:</p>
- <figure id="centralized-dns-request">
- <figcaption>Peer announce request to discovery server</figcaption>
- <img src="png/centralized_dns_request.png"/>
- </figure>
- <figure id="centralized-dns-response">
- <figcaption>Discovery server response to announce</figcaption>
- <img src="png/centralized_dns_response.png"/>
- </figure>
- <figure id="centralized-dns-srv">
- <figcaption>Discovery server SRV push notification</figcaption>
- <img src="png/centralized_dns_srv.png"/>
- </figure>
-
- <h1 id="wire-protocol" class="section"><span>Wire protocol</span></h1>
- <p>Once a peer has discovered another peer’s IP address and port number it will open a TCP connection to the other peer. Each half of the conversation is a sequence of messages with the following structure, repeated until the connection ends:</p>
- <img class="pagewidth" src="png/wire_protocol_structure.png"/>
- <ul>
- <li><strong>Length.</strong> Number of bytes until the start of the next length field.</li>
- <li><strong>Channel and type.</strong> A single number (up to 11 bits long) that encodes two sub-fields as:</li>
- <img src="png/channel_type_field.png"/>
- <ul>
- <li><strong>Channel.</strong> Peers can talk about multiple Dats using the same TCP connection. The channel number is 0 for the first Dat talked about, 1 for the next Dat and so on.</li>
- <li id="message-type"><strong>Type.</strong> Number that says what the purpose of the message is.</li>
- <table>
- <thead>
- <tr>
- <th>Type</th>
- <th>Name</th>
- <th>Meaning</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>0</td>
- <td><strong>Feed</strong></td>
- <td>I want to talk to you about this particular Dat</td>
- </tr>
- <tr>
- <td>1</td>
- <td><strong>Handshake</strong></td>
- <td>I want to negotiate how we will communicate on this TCP connection</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Info</strong></td>
- <td>I am either starting or stopping uploading or downloading</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Have</strong></td>
- <td>I have some data that you said you wanted</td>
- </tr>
- <tr>
- <td>4</td>
- <td><strong>Unhave</strong></td>
- <td>I no longer have some data that I previously said I had (alternatively: I didn’t store that data you just sent, please stop sending me data preemptively)</td>
- </tr>
- <tr>
- <td>5</td>
- <td><strong>Want</strong></td>
- <td>This is what data I want</td>
- </tr>
- <tr>
- <td>6</td>
- <td><strong>Unwant</strong></td>
- <td>I no longer want this data</td>
- </tr>
- <tr>
- <td>7</td>
- <td><strong>Request</strong></td>
- <td>Please send me this data now</td>
- </tr>
- <tr>
- <td>8</td>
- <td><strong>Cancel</strong></td>
- <td>Actually, don’t send me that data</td>
- </tr>
- <tr>
- <td>9</td>
- <td><strong>Data</strong></td>
- <td>Here is the data you requested</td>
- </tr>
- <tr>
- <td>10–14</td>
- <td/>
- <td><em>(Unused)</em></td>
- </tr>
- <tr>
- <td>15</td>
- <td><strong>Extension</strong></td>
- <td>Some other message that is not part of the core protocol</td>
- </tr>
- </tbody>
- </table>
- </ul>
- <li><strong>Body.</strong> Contents of the message.</li>
- </ul>
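- <p>The <em>channel and type</em> field can be sketched as simple bit packing. This example assumes the type occupies the four least significant bits (types range from 0 to 15) and the channel occupies the remaining seven bits of the 11-bit number. The function names are hypothetical.</p>

```python
TYPE_BITS = 4      # message types range from 0 to 15
CHANNEL_BITS = 7   # assumption: 11 bits in total, minus 4 bits for the type


def pack_header(channel: int, msg_type: int) -> int:
    """Combine a channel number and a message type into one number."""
    if not 0 <= msg_type < (1 << TYPE_BITS):
        raise ValueError("message type out of range")
    if not 0 <= channel < (1 << CHANNEL_BITS):
        raise ValueError("channel out of range")
    return (channel << TYPE_BITS) | msg_type


def unpack_header(header: int):
    """Split the combined number back into (channel, msg_type)."""
    return header >> TYPE_BITS, header & ((1 << TYPE_BITS) - 1)


# A data message (type 9) on the second Dat of this connection (channel 1).
header = pack_header(1, 9)
```

- <p>For the first Dat on a connection the channel is 0, so the combined number is simply the message type.</p>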
- <aside>
- <h3>Bit notation</h3>
- <p>In several parts of the Dat protocol multiple fields are packed into a single number. It helps to look at the number as a sequence of bits because this makes the fields visible.</p>
- <img src="png/notation_bits.png"/>
- <p>Throughout this guide bit sequences are shown as 1’s and 0’s in a box, grouped into fields. The most significant bit is always on the left.</p>
- <p>Eight bits make up a byte, but this number and many others are varints, which can be up to 64 bits long. The fields on the right are always a fixed number of bits but the leftmost field can take up as many of the remaining 64 bits as it needs.</p>
- </aside>
-
- <h2 id="varints">Varints</h2>
- <div>
- <p>The first two fields of each message (the length, and the channel and type) are encoded as variable-length integers, or varints, and therefore do not have a fixed size. You must read each field starting from the beginning to determine how long the field is and where the next field starts.</p>
- <p>The advantage of varints is that they only require a few bytes to represent small numbers, while still being able to represent large numbers by using more bytes. The disadvantage of varints is that they take more work to encode and decode compared to regular integers.</p>
- </div>
- <aside>
- <p>In Dat, varints are between 1 and 10 bytes long and represent integers from 0 to 2<sup>64</sup> - 1. Negative numbers aren’t used.</p>
- </aside>
- <p>Here’s how to decode a varint:</p>
- <img class="pagewidth" src="png/varint.png"/>
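- <p>Here is a minimal sketch of encoding and decoding varints, assuming the same scheme as Protocol Buffers: seven bits of the number per byte, least significant group first, with the top bit of each byte set when more bytes follow. The function names are chosen for this example.</p>

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer from 0 to 2**64 - 1 as 1 to 10 bytes."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value out of range")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)         # final byte
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, position of the byte after the varint)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint longer than 10 bytes")


encoded = encode_varint(300)
value, next_pos = decode_varint(encoded)
```

- <p>Returning the next position alongside the value makes it easy to continue reading whatever field comes after the varint.</p>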
-
- <h2 id="keepalive">Keepalive</h2>
- <p>Keepalive messages are empty messages containing no channel number, type or body. They are discarded upon being received. Sending keepalives is necessary when there is a network middlebox that kills TCP connections which haven’t sent any data in a while. In these cases each peer periodically sends keepalive messages when no other data is being sent.</p>
- <aside>How frequently to send keepalive messages is essentially a guess based on what types of middleboxes are commonly used on the internet today. Other TCP-based protocols typically send keepalives every 30 to 120 seconds.</aside>
- <p>Here’s an example of several keepalive messages interleaved with messages containing actual data. Each keepalive message is a single byte of zero:</p>
- <img class="pagewidth" src="png/wire_protocol_keepalive.png"/>
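- <p>Splitting a received byte stream into messages while discarding keepalives can be sketched as follows. The sketch assumes the stream has already been decrypted, that the type sits in the low four bits of the channel and type number, and that incomplete trailing bytes should be kept until more data arrives. The function names are hypothetical.</p>

```python
def read_varint(buf: bytes, pos: int):
    """Return (value, next_pos), or None if the varint is incomplete."""
    result = 0
    shift = 0
    while pos < len(buf):
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7
    return None


def split_messages(buf: bytes):
    """Return ([(channel, msg_type, body), ...], leftover_bytes)."""
    messages = []
    pos = 0
    while True:
        parsed = read_varint(buf, pos)
        if parsed is None:
            break                      # length field not fully received yet
        length, start = parsed
        end = start + length
        if end > len(buf):
            break                      # message not fully received yet
        if length == 0:
            pos = start                # keepalive: nothing to deliver
            continue
        parsed_header = read_varint(buf[:end], start)
        if parsed_header is None:
            raise ValueError("malformed channel and type field")
        header, body_start = parsed_header
        messages.append((header >> 4, header & 0x0F, buf[body_start:end]))
        pos = end
    return messages, buf[pos:]


stream = b"\x00" + b"\x03\x05\x08\x00" + b"\x00\x00" + b"\x02\x19"
messages, leftover = split_messages(stream)
```

- <p>In this example the stream holds a keepalive, a complete three-byte message, two more keepalives and the first two bytes of a message whose final byte has not arrived yet.</p>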
-
- <h2 id="message-structure">Message structure</h2>
- <p>Within each message body is a series of field tags and values:</p>
- <img class="pagewidth" src="png/wire_protocol_field_value_structure.png"/>
- <p id="message-structure-field-tag">The field tag is a varint. The most significant bits indicate which field within the message this is, for example: 1 = <em>discovery key</em>, 2 = <em>nonce</em>. This is needed because messages can have missing or repeated fields. The 3 least significant bits are the type of field.</p>
- <img src="png/field_type_field.png"/>
- <p id="message-field-types">The two types of field are:</p>
- <ul>
- <li><strong>Varint.</strong> The field tag followed by a varint value. Used for simple numeric values and booleans.</li>
- <img src="png/field_structure_varint.png"/>
- <li><strong>Length-prefixed.</strong> The field tag followed by a varint to say how many bytes the field contains, followed by the bytes themselves. Used for strings, bytes and embedded messages.</li>
- <img src="png/field_structure_length_prefixed.png"/>
- </ul>
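- <p>Here is a minimal sketch of reading the fields out of a message body. It assumes the field type numbers used by Protocol Buffers, where 0 means varint and 2 means length-prefixed. The function names are chosen for this example.</p>

```python
FIELD_VARINT = 0            # Protocol Buffers wire type 0
FIELD_LENGTH_PREFIXED = 2   # Protocol Buffers wire type 2


def read_varint(buf: bytes, pos: int):
    """Return (value, position of the byte after the varint)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def parse_fields(body: bytes):
    """Return a list of (field_number, value) pairs in the order received."""
    fields = []
    pos = 0
    while pos < len(body):
        tag, pos = read_varint(body, pos)
        number, field_type = tag >> 3, tag & 0x07
        if field_type == FIELD_VARINT:
            value, pos = read_varint(body, pos)
        elif field_type == FIELD_LENGTH_PREFIXED:
            size, pos = read_varint(body, pos)
            if pos + size > len(body):
                raise ValueError("field runs past the end of the message")
            value = body[pos:pos + size]
            pos += size
        else:
            raise ValueError("unsupported field type %d" % field_type)
        fields.append((number, value))
    return fields


fields = parse_fields(bytes([0x08, 0x05, 0x12, 0x03]) + b"abc")
```

- <p>A list of pairs is returned rather than a dictionary because fields can be repeated within a message.</p>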
- <aside class="basealign">
- <p>If you are familiar with the <a href="https://developers.google.com/protocol-buffers/">Protocol Buffers</a> data serialization format you might recognize that this message structure adheres to it. Dat only uses a small subset of the features available with Protocol Buffers, however if you are writing an implementation you may be interested in the <a href="https://github.com/mafintosh/hypercore-protocol/blob/master/schema.proto">.proto file</a> that describes the wire protocol.</p>
- <p>Over time new fields could be added to messages in the wire protocol. Field tags enable this to happen in a backwards-compatible way. If your implementation sees a field number that it doesn’t know about it can use the field type to find out how long the unknown field is and skip over it.</p>
- </aside>
-
- <h2 id="feed-message">Feed message</h2>
- <p>After opening the TCP connection the first message is always a <strong>feed</strong> message.</p>
- <p>Feed messages have two fields:</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Discovery key</strong></td>
- <td>Length-prefixed</td>
- <td>32-byte discovery key for this Dat.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Nonce</strong></td>
- <td>Length-prefixed</td>
- <td>24-byte random nonce generated for this TCP connection. Only present for the first feed message.</td>
- </tr>
- </tbody>
- </table>
- </div>
- <p>Putting everything together, this is how each side of the TCP connection begins:</p>
- <img class="pagewidth" src="png/feed_message.png"/>
- <p>Or, as the bytes actually sent over the wire:</p>
- <img id="feed-message-bytes" class="pagewidth" src="png/feed_message_bytes.png"/>
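- <p>Building the bytes of a feed message can be sketched as below. The sketch assumes Protocol Buffers field tags (field number shifted left by three bits, with type 2 for length-prefixed) and a channel and type number with the type in its low four bits. The helper names and the all-zero discovery key are placeholders for illustration.</p>

```python
import os


def encode_varint(value: int) -> bytes:
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def length_prefixed(field_number: int, data: bytes) -> bytes:
    """Field tag, then the length of the data, then the data itself."""
    return encode_varint(field_number << 3 | 2) + encode_varint(len(data)) + data


def feed_message(discovery_key: bytes, nonce=None, channel: int = 0) -> bytes:
    body = length_prefixed(1, discovery_key)
    if nonce is not None:
        # Only the first feed message on a connection carries a nonce.
        body += length_prefixed(2, nonce)
    header = encode_varint(channel << 4 | 0)  # type 0 = feed
    # The length counts every byte up to the next length field.
    return encode_varint(len(header) + len(body)) + header + body


message = feed_message(bytes(32), os.urandom(24))
```

- <p>With a 32-byte discovery key and a 24-byte nonce the result is 62 bytes: one length byte followed by the 61 bytes it counts.</p>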
-
- <h2 id="encryption">Encryption</h2>
- <div>
- <p>Each side of the TCP connection is encrypted starting from the second message and continuing until the end of the connection. This prevents network eavesdroppers from finding out what data a Dat contains unless they already know its public key.</p>
- <p><strong>XSalsa20</strong> is the encryption cipher used. Given a 32-byte key and a 24-byte nonce, XSalsa20 produces a never-ending stream of pseudorandom bytes called the keystream.</p>
- </div>
- <aside class="basealign">The encryption scheme is not authenticated, meaning a man-in-the-middle can flip bits to disrupt the connection. This is mitigated because data integrity in a Dat is verified using a separate mechanism. If there is an attacker who is in a position to modify bits on the network they have lots of other options for disrupting connections, for example by simply dropping packets.</aside>
- <img class="pagewidth" src="png/encryption.png"/>
- <p id="encryption-sender">The <strong>sender</strong> generates a random 24-byte value for the nonce and includes it in their first message (which is always a feed message). The Dat’s 32-byte public key is used as the XSalsa20 key. From the second message onwards all bytes they send are XORed with the keystream.</p>
- <img class="pagewidth" src="png/encryption_send.png"/>
- <p id="encryption-receiver">The <strong>receiver</strong> reads the nonce from the sender’s first message and uses this with their knowledge of the Dat’s public key to set up an identical XSalsa20 keystream. Then they XOR the keystream with the bytes received to decrypt the stream.</p>
- <img class="pagewidth" src="png/encryption_receive.png"/>
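- <p>The bookkeeping on each side can be sketched as a small class that remembers how far into the keystream it has got. <strong>This sketch does not implement XSalsa20.</strong> Python’s standard library has no XSalsa20, so SHAKE-256 stands in as a placeholder keystream generator purely to show the XOR mechanics. The class name and the assumption that the keystream position starts at zero with the second message are also illustrative.</p>

```python
import hashlib


class XorStream:
    """XORs bytes with a keystream, continuing where the last call stopped."""

    def __init__(self, key: bytes, nonce: bytes):
        if len(key) != 32 or len(nonce) != 24:
            raise ValueError("expected a 32-byte key and a 24-byte nonce")
        self._seed = key + nonce
        self._position = 0

    def _keystream(self, length: int) -> bytes:
        # PLACEHOLDER, NOT XSalsa20: a real implementation would produce
        # these bytes with XSalsa20(key, nonce) from a cryptography library.
        end = self._position + length
        return hashlib.shake_256(self._seed).digest(end)[self._position:]

    def update(self, data: bytes) -> bytes:
        """Encrypt or decrypt; XOR is its own inverse."""
        keystream = self._keystream(len(data))
        self._position += len(data)
        return bytes(a ^ b for a, b in zip(data, keystream))


key = bytes(range(32))    # stands in for the Dat's public key
nonce = bytes(24)         # stands in for the nonce from the feed message

sender = XorStream(key, nonce)
receiver = XorStream(key, nonce)
first = sender.update(b"handshake")
second = sender.update(b"want")
plain = receiver.update(first) + receiver.update(second)
```

- <p>Because the position carries over between calls, it does not matter how the TCP stream happens to be split into reads: both sides stay in step as long as every byte passes through <code>update</code> exactly once.</p>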
-
- <h2 id="handshake-message">Handshake message</h2>
- <p>After the initial feed message, the second message sent on each side of the TCP connection is always a <strong>handshake</strong> message.</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>ID</strong></td>
- <td>Length-prefixed</td>
- <td>Random ID generated by this peer upon starting, 32 bytes long. Used to detect multiple connections to the same peer or accidental connections to itself.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Live</strong></td>
- <td>Varint</td>
- <td>0 = End the connection when neither peer is downloading, 1 = Keep the connection open indefinitely (only takes effect if both peers set to 1).</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>User data</strong></td>
- <td>Length-prefixed</td>
- <td>Arbitrary bytes that can be used by higher-level applications for any purpose. Remember that any data put in here is not protected from tampering as it passes through the network. Additionally, any eavesdropper who knows the Dat’s public key can read this.</td>
- </tr>
- <tr>
- <td>4</td>
- <td><strong>Extensions</strong></td>
- <td>Length-prefixed</td>
- <td>Names of any extensions the peer wants to use, for example: “session-data”. This field can appear multiple times, one for each extension. Both peers need to request an extension in their handshake messages for it to become active.</td>
- </tr>
- <tr>
- <td>5</td>
- <td><strong>Acknowledge</strong></td>
- <td>Varint</td>
- <td>0 = No need to acknowledge each chunk of data received, 1 = Must acknowledge each chunk of data received.</td>
- </tr>
- </tbody>
- </table>
- </div>
-
- <h1 id="data-model" class="section"><span>Data model</span></h1>
- <p>After completing the handshake peers begin requesting data from each other. Dats contain a list of variable-sized chunks of bytes. New chunks can be added to the end by the Dat’s author, but existing chunks can’t be deleted or modified.</p>
- <aside>Boundaries between chunks are preserved. This can be useful for dividing up the Dat into a series of messages, for example one message per chunk.</aside>
- <img src="png/chunk_structure.png"/>
- <p>Hashes are used to verify the integrity of data within a Dat. Each chunk of data has a corresponding hash. There are also parent hashes which verify the integrity of two other hashes. Parent hashes form a tree structure. In this example the integrity of all the data can be verified if you know hash number 3:</p>
- <img id="chunk-structure-hashes" src="png/chunk_structure_hashes.png"/>
- <aside>
- <p>Chunk hashes are even-numbered. Parent hashes are odd-numbered.</p>
- <p>Hash trees allow downloaders to download and verify a specific chunk without needing to also download hashes of every other chunk.</p>
- </aside>
- <p>Each time the author adds new chunks they calculate a root hash and sign it with the Dat’s secret key. Downloaders can use the Dat’s public key to verify the signature, which in turn verifies the integrity of all the other hashes and chunks.</p>
- <img id="chunk-structure-signature" src="png/chunk_structure_signature.png"/>
- <p>Depending on the number of chunks, the root hash can have more than one input. The root hash combines as many parent or chunk hashes as necessary to cover all the chunks. Here is how the hash tree looks with different numbers of chunks:</p>
- <img id="hash-tree-1" class="pagewidth" src="png/hash_tree_1.png"/>
- <img id="hash-tree-2" class="pagewidth" src="png/hash_tree_2.png"/>
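- <p>Here is a sketch of the hash numbering, assuming an in-order layout that is consistent with the description above: chunk hashes take the even numbers, parents sit midway between their two children, and hash 3 covers the first four chunks. The function names are chosen for this example.</p>

```python
def depth(index: int) -> int:
    """0 for chunk hashes, 1 for their parents, 2 for grandparents, ..."""
    d = 0
    while index & 1:
        index >>= 1
        d += 1
    return d


def parent(index: int) -> int:
    d = depth(index)
    offset = index >> (d + 1)          # position among nodes of this depth
    return ((offset >> 1) * 2 + 1) * (1 << (d + 1)) - 1


def children(index: int):
    d = depth(index)
    if d == 0:
        return None                    # chunk hashes have no children
    half = 1 << (d - 1)
    return index - half, index + half


def root_inputs(chunk_count: int):
    """Hash numbers that together cover chunks 0 .. chunk_count - 1."""
    roots = []
    first_chunk = 0
    for bit in reversed(range(chunk_count.bit_length())):
        size = 1 << bit                # largest full subtree that still fits
        if chunk_count & size:
            roots.append(2 * first_chunk + size - 1)
            first_chunk += size
    return roots


inputs_for_six_chunks = root_inputs(6)
```

- <p>With six chunks the root hash has two inputs: hash 3, covering chunks 0 to 3, and hash 9, covering chunks 4 and 5.</p>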
-
- <h2 id="hashes-and-signatures">Hashes and signatures</h2>
- <div>
- <p>The three types of hash seen in the hash tree are:</p>
- <ul>
- <li><strong>Chunk hashes</strong>, which hash the contents of a single chunk.</li>
- <li><strong>Parent hashes</strong>, which hash two other hashes forming a tree structure.</li>
- <li><strong>Root hashes</strong>, which sit at the root of the tree and are signed by the Dat’s author.</li>
- </ul>
- </div>
- <p>Each type of hash has a specific way to construct it:</p>
-
-
- <h1 id="exchanging-data" class="section"><span>Exchanging data</span></h1>
- <p>Peers exchange chunks in a multi-step process where the downloader and uploader negotiate which chunks they want and have:</p>
- <img src="png/chunk_conversation.png"/>
- <aside>
- <p>In this example one peer is a downloader and the other is an uploader. In other cases both peers will be uploading and downloading at the same time as they each have some of the chunks but not all of them.</p>
- <p>Each request message can only request a single chunk. Downloaders may want to queue up several request messages at a time. When the uploader finishes sending a chunk this lets them immediately start sending the next one without waiting for a round-trip.</p>
- </aside>
-
- <h2 id="want-and-have">Want and have</h2>
- <p>Each peer remembers which chunks the other peer wants and has.</p>
- <ul>
- <li><strong>Wanting</strong> a chunk means “I want to download this chunk, please tell me if you have it”.</li>
- <li><strong>Having</strong> a chunk means “I know you want this chunk and I will send it if you ask for it”. If a peer tells you they are only interested in a small range of chunks then you only have to tell them about chunks within that range.</li>
- </ul>
- <p>As peers download (or even delete) data, the list of chunks they want and have will change. This state is communicated with four message types: <strong>want</strong>, <strong>unwant</strong>, <strong>have</strong> and <strong>unhave</strong>. Each of these four messages has the same structure which indicates a contiguous range of chunks:</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Start</strong></td>
- <td>Varint</td>
- <td>Number of the first chunk you want/unwant/have/unhave. Chunk numbering starts at 0.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Length</strong></td>
- <td>Varint</td>
- <td>1 = Just the <em>start</em> chunk, 2 = The <em>start</em> chunk and the next one, and so on. Omit this field to select all following chunks to the end of the Dat, including new chunks as they are added.</td>
- </tr>
- </tbody>
- </table>
- </div>
- <p>Here is an example showing typical use of have and want messages between two peers:</p>
- <img id="want-have-conversation-1" class="pagewidth" src="png/want_have_1.png"/>
- <img id="want-have-conversation-2" class="pagewidth" src="png/want_have_2.png"/>
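- <p>The shared structure of these four messages can be sketched as one encoder plus a helper for testing whether a chunk falls inside a range. The sketch assumes Protocol Buffers field tags and a channel and type number with the type in its low four bits. The function names are hypothetical.</p>

```python
HAVE, UNHAVE, WANT, UNWANT = 3, 4, 5, 6   # message types from the table


def encode_varint(value: int) -> bytes:
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def range_message(msg_type: int, start: int, length=None, channel: int = 0) -> bytes:
    """Encode a want, unwant, have or unhave message for a chunk range."""
    body = encode_varint(1 << 3 | 0) + encode_varint(start)
    if length is not None:
        # Omitting the length selects every chunk from start onwards.
        body += encode_varint(2 << 3 | 0) + encode_varint(length)
    header = encode_varint(channel << 4 | msg_type)
    return encode_varint(len(header) + len(body)) + header + body


def in_range(chunk: int, start: int, length=None) -> bool:
    """True if the chunk number is covered by the given range."""
    return chunk >= start and (length is None or chunk < start + length)


want_everything = range_message(WANT, 0)
have_five = range_message(HAVE, 10, 5)
```

- <p>Here <code>want_everything</code> asks for all chunks including future ones, while <code>have_five</code> announces chunks 10 to 14.</p>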
-
- <h2 id="have-bitfield">Have bitfield</h2>
- <div>
- <p>If you have lots of little, non-contiguous ranges of data it can take a lot of have messages to tell your peer exactly what you have. There is an alternate form of the have message for this purpose. It is efficient at representing both contiguous and non-contiguous ranges of data.</p>
- <p>This form of the have message only has one field:</p>
- </div>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>3</td>
- <td><strong>Bitfield</strong></td>
- <td>Length-prefixed</td>
- <td>A sequence of contiguous and non-contiguous chunk ranges.</td>
- </tr>
- </tbody>
- </table>
- </div>
- <p>For example, let’s look at the chunks this peer has. Normally this would take 11 messages to represent:</p>
- <img class="pagewidth" src="png/run_length_encoding_bitfield.png"/>
- <p>Instead, divide the chunks into ranges where each range is either contiguous (all chunks present or none present), or non-contiguous (some chunks present). The ranges must be multiples of 8 chunks long.</p>
- <aside>In this case the ranges alternate between contiguous and non-contiguous, but this does not always happen. It is possible for contiguous ranges to be next to each other if one has all chunks present and the other has no chunks present.</aside>
- <img class="pagewidth" src="png/run_length_encoding_bitfield_segment.png"/>
- <p id="have-bitfield-range-types">Ranges are encoded as:</p>
- <ul>
- <li><strong>Contiguous.</strong> A single varint that contains sub-fields saying how many 8-chunk spans there are and whether all the chunks are present or absent.</li>
- <img src="png/run_length_encoding_contiguous_range.png"/>
- <li><strong>Non-contiguous.</strong> A varint saying how many 8-chunk spans the following bitfield represents (which is also its length in bytes), followed by the bitfield.</li>
- <img src="png/run_length_encoding_non_contiguous_range.png"/>
- </ul>
- <aside>The first byte in the bitfield represents the first 8-chunk span. The most significant bit of each byte represents the first chunk of each 8-chunk span.</aside>
- <p>Putting everything together, here are the bits used to encode the chunks this peer has:</p>
- <img id="have-bitfield-bits" class="pagewidth" src="png/run_length_encoding_bitfield_bits.png"/>
- <p>And here is how the final have message would appear on the wire:</p>
- <img id="have-bitfield-bytes" class="pagewidth" src="png/have_message_bytes.png"/>
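- <p>The range encoding can be sketched in Python as follows. The exact sub-field layout is given in the diagrams rather than in the text, so this sketch assumes the following layout: a header varint whose lowest bit is 1 for a contiguous range (with the next bit saying present or absent and the remaining bits counting 8-chunk spans), and whose lowest bit is 0 for a non-contiguous range (with the remaining bits giving the bitfield length in bytes). All function names are illustrative.</p>

```python
from itertools import groupby


def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Return (value, position just after the varint)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def span_kind(byte: int) -> str:
    """Classify one 8-chunk span (one byte of the plain bitfield)."""
    if byte == 0x00:
        return "absent"
    if byte == 0xFF:
        return "present"
    return "mixed"


def rle_encode(bitfield: bytes) -> bytes:
    out = bytearray()
    for kind, group in groupby(bitfield, key=span_kind):
        spans = bytes(group)
        if kind == "mixed":
            # Non-contiguous: low bit 0, then the raw bitfield bytes.
            out += encode_varint(len(spans) << 1) + spans
        else:
            # Contiguous: low bit 1, next bit = all present (1) or absent (0).
            bit = 1 if kind == "present" else 0
            out += encode_varint((len(spans) << 2) | (bit << 1) | 1)
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    pos = 0
    while pos < len(data):
        header, pos = decode_varint(data, pos)
        if header & 1:
            fill = 0xFF if header & 2 else 0x00
            out += bytes([fill]) * (header >> 2)
        else:
            length = header >> 1
            out += data[pos:pos + length]
            pos += length
    return bytes(out)


def has_chunk(bitfield: bytes, index: int) -> bool:
    """Most significant bit of each byte is the first chunk of its span."""
    return bool(bitfield[index // 8] >> (7 - index % 8) & 1)


plain = bytes([0xFF, 0xFF, 0x00, 0xA5])  # chunks 0-15 present, 16-23 absent, 24-31 mixed
packed = rle_encode(plain)
assert rle_decode(packed) == plain
```

- <p>This sketch starts a new range whenever the kind of span changes, which is always valid but not necessarily the most compact choice; an implementation may prefer to fold a short contiguous run into a neighbouring bitfield.</p>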
-
- <h2 id="requesting-data">Requesting data</h2>
- <p>Once your peer has told you that they have a chunk you want, send a <strong>request</strong> message to ask them for it:</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Index</strong></td>
- <td>Varint</td>
- <td>Number of the chunk to send back. This field must be present even when using the <em>bytes</em> field below.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Bytes</strong></td>
- <td>Varint</td>
- <td>If this field is present, ignore the <em>index</em> field and send back the chunk containing this byte. Useful if you don’t know how big each chunk is but you want to seek to a specific byte.</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Hash</strong></td>
- <td>Varint</td>
- <td>0 = Send back the data in this chunk as well as hashes needed to verify it, 1 = Don’t send back the data in this chunk, only send the hashes.</td>
- </tr>
- <tr>
- <td>4</td>
- <td><strong>Nodes</strong></td>
- <td>Varint</td>
- <td>Used to request additional hashes needed to verify the integrity of this chunk. 0 = Send back all hashes needed to verify this chunk, 1 = Just send the data, no hashes. For other values that can be used to request specific hashes from the hash tree, see the <a href="https://github.com/pfrazee/DEPs/blob/dep-wire-protocol/proposals/0000-wire-protocol.md#block-tree-digest">Wire Protocol specification</a>.</td>
- </tr>
- </tbody>
- </table>
- </div>
- <p id="cancel-message">If you no longer want a chunk you requested, send a <strong>cancel</strong> message:</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Index</strong></td>
- <td>Varint</td>
- <td>Number of the chunk to cancel. This field must be present even when using the <em>bytes</em> field below.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Bytes</strong></td>
- <td>Varint</td>
- <td>If this field is present, ignore the <em>index</em> field and cancel the request for the chunk containing this byte.</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Hash</strong></td>
- <td>Varint</td>
- <td>Set to the same value as the <em>hash</em> field of the request you want to cancel.</td>
- </tr>
- </tbody>
- </table>
- </div>
- <p>Cancel messages can be used if you preemptively requested a chunk from multiple peers at the same time. Upon receiving the chunk from the fastest peer, send cancel messages to the others.</p>
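- <p>One way to keep track of such duplicate requests is sketched below. The <code>RequestTracker</code> class and its method names are hypothetical; it only decides which peers should receive a cancel and leaves the actual sending of messages to the caller.</p>

```python
from collections import defaultdict


class RequestTracker:
    """Remembers which peers were asked for each chunk (hypothetical helper)."""

    def __init__(self):
        self._pending = defaultdict(set)  # chunk index -> ids of peers asked

    def requested(self, index, peer):
        """Record that a request for chunk `index` was sent to `peer`."""
        self._pending[index].add(peer)

    def received(self, index, peer):
        """Record that `peer` delivered the chunk.

        Returns the other peers that were asked for the same chunk, which
        are the ones that should now be sent a cancel message.
        """
        others = self._pending.pop(index, set())
        others.discard(peer)
        return sorted(others)


tracker = RequestTracker()
for peer in ("alice", "bob", "carol"):
    tracker.requested(5, peer)

# Bob answers first, so the requests to Alice and Carol can be cancelled.
to_cancel = tracker.received(5, "bob")
```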
- <p id="data-message">When a peer has requested a chunk from you, send it to them with a <strong>data</strong> message:</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Index</strong></td>
- <td>Varint</td>
- <td>Chunk number.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Value</strong></td>
- <td>Length-prefixed</td>
- <td>Contents of the chunk. Do not set this field if the request had <em>hash</em> = 1.</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Nodes</strong></td>
- <td>Length-prefixed</td>
- <td>
- <p>This field is repeated for each hash that the requester needs to verify the chunk’s integrity.</p>
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Index</strong></td>
- <td>Varint</td>
- <td>Hash number.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Hash</strong></td>
- <td>Length-prefixed</td>
- <td>32-byte <a href="#chunk-hash">chunk hash</a> or <a href="#parent-hash">parent hash</a>.</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Size</strong></td>
- <td>Varint</td>
- <td>Total length of data in chunks covered by this hash.</td>
- </tr>
- </tbody>
- </table>
- </td>
- </tr>
- <tr>
- <td>4</td>
- <td><strong>Signature</strong></td>
- <td>Length-prefixed</td>
- <td>64-byte ed25519 signature of the root hash corresponding to this chunk.</td>
- </tr>
- </tbody>
- </table>
- </div>
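- <p>The nesting of the <em>nodes</em> field can be easier to follow in code. The Python sketch below decodes a data message body into a dictionary, again assuming protobuf-style tags (field number shifted left by 3, plus wire type 0 for varints or 2 for length-prefixed fields). The function names and the dictionary layout are illustrative, and no hash or signature verification is attempted.</p>

```python
def decode_varint(buf: bytes, pos: int):
    """Return (value, position just after the varint)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_fields(buf: bytes):
    """Yield (field number, value): int for varints, bytes for length-prefixed."""
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 7
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("length-prefixed field runs past end of buffer")
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield number, value


NODE_FIELDS = {1: "index", 2: "hash", 3: "size"}


def parse_data_message(buf: bytes) -> dict:
    msg = {"index": None, "value": None, "nodes": [], "signature": None}
    for number, value in parse_fields(buf):
        if number == 1:
            msg["index"] = value
        elif number == 2:
            msg["value"] = value  # stays None when only hashes were requested
        elif number == 3:
            # Field 3 repeats: each occurrence is one embedded hash record.
            node = {NODE_FIELDS[n]: v for n, v in parse_fields(value) if n in NODE_FIELDS}
            msg["nodes"].append(node)
        elif number == 4:
            msg["signature"] = value
    return msg
```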
-
- <h1 id="files-and-folders" class="section"><span>Files and folders</span></h1>
- <p>Dat uses two coupled feeds to represent files and folders. The <strong>metadata</strong> feed contains the names, sizes and other metadata for each file, and it’s typically quite small even when the Dat contains a lot of data. The <strong>content</strong> feed contains the actual file contents. The metadata feed points to where in the content feed each file is located, so you only need to fetch the contents of files you are interested in.</p>
- <aside>Folders aren’t created explicitly. Instead, files have slash-separated names, and every path segment except the last represents a folder that the file is inside. Dat can’t represent an empty folder or remember UIDs, permission modes or modification dates on folders.</aside>
- <img class="pagewidth" src="png/files_folders_overview.png"/>
- <p>The first chunk of the metadata feed is always an <strong>index</strong> chunk. Check that the <em>type</em> field contains the word “hyperdrive”. If so, the <em>content</em> field is the public key of the content feed.</p>
- <div class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Type</strong></td>
- <td>Length-prefixed</td>
- <td>What sort of data is contained in this Dat. For Dats using the concept of files and folders this is “hyperdrive”.</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Content</strong></td>
- <td>Length-prefixed</td>
- <td>32-byte public key of the content feed.</td>
- </tr>
- </tbody>
- </table>
- </div>
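- <p>The check described above might look like this in Python. The sketch assumes protobuf-style tags, where a length-prefixed field is introduced by the field number shifted left by 3 plus wire type 2, and it ignores any field it does not recognise. The function name <code>content_key_from_index</code> is illustrative.</p>

```python
def decode_varint(buf: bytes, pos: int):
    """Return (value, position just after the varint)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def content_key_from_index(chunk: bytes) -> bytes:
    """Return the content feed's public key from the first metadata chunk."""
    fields = {}
    pos = 0
    while pos < len(chunk):
        tag, pos = decode_varint(chunk, pos)
        if tag & 7 != 2:
            raise ValueError("index chunk should only hold length-prefixed fields")
        length, pos = decode_varint(chunk, pos)
        fields[tag >> 3] = chunk[pos:pos + length]
        pos += length

    if fields.get(1) != b"hyperdrive":
        raise ValueError("this Dat does not use files and folders")
    key = fields.get(2, b"")
    if len(key) != 32:
        raise ValueError("content key must be 32 bytes")
    return key
```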
- <p>After the index, all following chunks in the metadata feed are <strong>nodes</strong>, which store file metadata. Nodes have these fields:</p>
- <div id="node-fields" class="pagewidth msgfields">
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Name</strong></td>
- <td>Length-prefixed</td>
- <td>Slash-separated path and filename. Always begins with a slash. For example: “/src/main.c”</td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>Value</strong></td>
- <td>Length-prefixed</td>
- <td>
- <p>If this field is present the file is being created or updated. These sub-fields give the details of the new or updated file:</p>
- <table>
- <thead>
- <tr>
- <th>No.</th>
- <th>Name</th>
- <th>Type</th>
- <th>Description</th>
- </tr>
- </thead>
- <tbody>
- <tr>
- <td>1</td>
- <td><strong>Mode</strong></td>
- <td>Varint</td>
- <td>
- <p>Unix permissions. In practice one of these two common values depending on whether the file is executable or not:</p>
- <img src="png/files_folders_mode_flags.png"/>
- <p>Security-sensitive bits such as setuid and setgid might also be set. When extracting files from a Dat to the filesystem you might consider not honoring these bits.</p>
- </td>
- </tr>
- <tr>
- <td>2</td>
- <td><strong>UID</strong></td>
- <td>Varint</td>
- <td>Unix user ID. Alternatively, set to 0 to not expose the user’s ID.</td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>GID</strong></td>
- <td>Varint</td>
- <td>Unix group ID. Alternatively, set to 0 to not expose the user’s group ID.</td>
- </tr>
- <tr>
- <td>4</td>
- <td><strong>Size</strong></td>
- <td>Varint</td>
- <td>Size of the file in bytes.</td>
- </tr>
- <tr>
- <td>5</td>
- <td><strong>Blocks</strong></td>
- <td>Varint</td>
- <td>Number of chunks the file occupies in the content feed.</td>
- </tr>
- <tr>
- <td>6</td>
- <td><strong>Offset</strong></td>
- <td>Varint</td>
- <td>Chunk number of the first chunk in the content feed.</td>
- </tr>
- <tr>
- <td>7</td>
- <td><strong>Byte offset</strong></td>
- <td>Varint</td>
- <td>Size in bytes of all chunks in the content feed before this file. 0 for the first file.</td>
- </tr>
- <tr>
- <td>8</td>
- <td><strong>Mtime</strong></td>
- <td>Varint</td>
- <td>Time the file was last modified. Number of milliseconds since 1 January 1970 00:00:00 UTC.</td>
- </tr>
- <tr>
- <td>9</td>
- <td><strong>Ctime</strong></td>
- <td>Varint</td>
- <td>Time the file was created. Number of milliseconds since 1 January 1970 00:00:00 UTC.</td>
- </tr>
- </tbody>
- </table>
- <p>If the <em>value</em> field is absent (and therefore none of the sub-fields above are set) the file previously existing with this <em>name</em> is now deleted.</p>
- </td>
- </tr>
- <tr>
- <td>3</td>
- <td><strong>Paths</strong></td>
- <td>Length-prefixed</td>
- <td>Index that helps to traverse folders more efficiently. <a href="#paths-index">See below</a>.</td>
- </tr>
- </tbody>
- </table>
- </div>
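- <p>Two of the sub-fields above need a little care when extracting files. The Python sketch below shows one conservative way to handle them: it keeps only the read, write and execute bits of <em>mode</em>, which drops setuid, setgid and the sticky bit, and it converts the millisecond timestamps into timezone-aware datetimes. Both function names are illustrative, and masking to <code>0o777</code> is a policy choice rather than something the protocol requires.</p>

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def safe_mode(mode: int) -> int:
    """Keep only the rwx permission bits for user, group and other."""
    return mode & 0o777


def to_datetime(milliseconds: int) -> datetime:
    """Convert an mtime or ctime value to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=milliseconds)


# A regular file with the setuid bit set and rwxr-xr-x permissions.
print(oct(safe_mode(0o104755)))
print(to_datetime(1546300800000).isoformat())
```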
- <p>To find the latest version of a file, start from the end of the metadata feed and work backwards until you find a node with that file’s name. Even though files can be modified and deleted, previous versions can still be retrieved by searching back in the metadata feed.</p>
- <p>Here is an example showing how the <em>offset</em> and <em>blocks</em> fields refer to chunks in the content feed:</p>
- <img id="files-folders-feeds" class="pagewidth" src="png/files_folders_feeds.png"/>
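- <p>The backwards search and the use of <em>offset</em> and <em>blocks</em> can be sketched in Python. The <code>Node</code> and <code>Value</code> classes below are simplified stand-ins for already-decoded metadata nodes, holding only the fields this lookup needs; they are not part of any Dat library.</p>

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Value:
    offset: int  # first chunk of the file in the content feed
    blocks: int  # number of chunks the file occupies


@dataclass
class Node:
    name: str
    value: Optional[Value] = None  # None means the file was deleted


def latest_node(nodes: Sequence[Node], name: str) -> Optional[Node]:
    """Walk the metadata feed from the end until the name matches."""
    for node in reversed(nodes):
        if node.name == name:
            return node
    return None


def content_chunks(nodes: Sequence[Node], name: str) -> range:
    """Chunk numbers in the content feed that hold the current file."""
    node = latest_node(nodes, name)
    if node is None or node.value is None:
        raise FileNotFoundError(name)
    return range(node.value.offset, node.value.offset + node.value.blocks)


feed = [
    Node("/a.txt", Value(offset=0, blocks=2)),
    Node("/b.txt", Value(offset=2, blocks=1)),
    Node("/a.txt", Value(offset=3, blocks=3)),  # a.txt updated
    Node("/b.txt"),                             # b.txt deleted
]
print(list(content_chunks(feed, "/a.txt")))
```

- <p>Passing a shorter prefix of the feed to the same function retrieves an earlier version, which is how previous versions remain available.</p>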
-
- <h2 id="paths-index">Paths index</h2>
- <p>Scanning through the list of all files added, modified or deleted would be slow for Dats that contain lots of files or a long history. To make this faster, every node in the metadata feed contains extra information in the <strong>paths</strong> field to help traverse folders.</p>
- <aside class="impl">
- <img class="icon" src="png/spanner_icon.png"/>
- <h3>Implementations</h3>
- <p class="lang">JS</p>
-
- </aside>
- <img class="pagewidth" src="png/paths_index_storage.png"/>
- <p>To calculate a <em>paths</em> field, start by constructing a file hierarchy from the metadata feed:</p>
- <img id="paths-index-tree" src="png/paths_index_tree.png"/>
- <aside>
- <p>The first node is called node 0, which is stored in chunk 1 of the metadata feed.</p>
- <p>Node number = chunk number - 1</p>
- </aside>
- <p>For each file in the hierarchy, find the most recent entry for it in the metadata feed and remember the node number. For each folder, remember the highest node number among its children:</p>
- <img id="paths-index-versions" src="png/paths_index_versions.png"/>
- <p>Locate the file that was just added. Select that file and all its parent folders up to the root folder. Also select files and folders within those folders, but not their descendants. Ignore everything else.</p>
- <img id="paths-index-select" src="png/paths_index_select.png"/>
- <aside>In this example we are calculating the <em>paths</em> field for <strong>node 8</strong>; however, the same calculation will already have been done for all the previous nodes as they were added.</aside>
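- <p>The selection steps so far can be sketched in Python. This is a simplified model with several assumptions: the input is just the list of file names in node order (so the list index is the node number), every node creates or updates a file, and deletions are not handled. It stops before the byte encoding, which is defined by the diagrams. The function name <code>select_for_paths</code> is illustrative.</p>

```python
from typing import Dict, List, Tuple


def _split(name: str) -> List[str]:
    return name.strip("/").split("/")


def _folder(parts: List[str], depth: int) -> str:
    """Path of the folder that contains parts[depth]; '/' is the root."""
    return "/" + "/".join(parts[:depth])


def select_for_paths(names: List[str]) -> List[Tuple[str, Dict[str, int]]]:
    """For the newest node, list each folder from the root down to the file's
    parent, together with the node numbers of that folder's direct entries."""
    # Most recent node number for every file (later entries overwrite earlier).
    latest = {name: number for number, name in enumerate(names)}

    # A folder's number is the highest node number found anywhere beneath it.
    versions: Dict[str, Dict[str, int]] = {}
    for name, number in latest.items():
        parts = _split(name)
        for depth, child in enumerate(parts):
            entries = versions.setdefault(_folder(parts, depth), {})
            entries[child] = max(entries.get(child, -1), number)

    # Keep only the folders on the path to the file that was just added.
    newest = _split(names[-1])
    return [
        (_folder(newest, depth), versions[_folder(newest, depth)])
        for depth in range(len(newest))
    ]


names = ["/README.md", "/src/main.c", "/src/util.c", "/docs/index.md", "/src/main.c"]
for folder, entries in select_for_paths(names):
    print(folder, entries)
```

- <p>In this made-up feed, the entries of <code>/docs</code> are ignored because that folder is not on the path to the newest file, while <code>docs</code> itself still appears as an entry of the root folder.</p>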
- <p>Next, follow these steps to process the node numbers into bytes:</p>
- <img id="paths-index-encode" class="pagewidth" src="png/paths_index_encode.png"/>
- <p>So node 8 would appear in the metadata feed as:</p>
- <img src="png/paths_index_result.png"/>
- <p>The process for calculating the <em>paths</em> field after deleting a file is mostly the same as when adding a file:</p>
- <img id="paths-index-delete" class="pagewidth" src="png/paths_index_delete.png"/>
-
- <h1 id="future-of-dat" class="section"><span>Future of Dat</span></h1>
- <p>Dat was first released in 2013, which in terms of internet infrastructure is very recent. Parts of the protocol are still changing today so that Dat can handle bigger datasets, cope with more hostile network conditions and support new types of applications.</p>
- <p>This guide has described the Dat protocol as of January 2019. Here’s a brief summary of upcoming proposals to modify the Dat protocol in the near future:</p>
- <ul>
- <li>
- <p><strong><a href="https://github.com/datproject/planning#protocol-performance-and-features">Hyperdrive</a></strong> is the technical name of the files and folders system. Hyperdrive is going to be completely overhauled to make it faster for datasets containing millions of files. The new version will use a data structure called a prefix tree. There will also be refactoring changes so that files/folders and key/value databases use the same underlying storage system.</p>
- </li>
- <li>
- <p><strong><a href="https://pfrazee.hashbase.io/blog/hyperswarm">Hyperswarm</a></strong> is a new set of discovery mechanisms for finding other peers. It will replace the local network discovery and centralized DNS discovery mechanisms. It will also replace the BitTorrent distributed hash table mechanism that is in use today (but not documented in this guide).</p>
- <p>Hyperswarm is able to hole-punch through NAT devices on networks. This helps users on residential or mobile internet connections directly connect to each other despite not having dedicated IP addresses or being able to accept incoming TCP connections.</p>
- </li>
- <li>
- <p><strong><a href="https://www.datprotocol.com/deps/0008-multiwriter/">Multi-writer</a></strong> will allow Dats to be modified by multiple devices and multiple authors at the same time. Each author will have their own secret key and publish a Dat with their data in it. Multi-writer fuses all of these separate Dats into one “meta-Dat” that is a view of everyone’s data combined.</p>
- </li>
- <li>
- <p><strong><a href="https://noiseprotocol.org/">NOISE protocol</a></strong> is a cryptographic framework for setting up secure connections between computers on the internet. This will replace how handshakes and encryption currently work in the wire protocol.</p>
- <p>NOISE will fix weaknesses such as connections being readable by any eavesdropper who knows the Dat’s public key. It will allow connections to be authenticated to prevent tampering and also support forward secrecy so that past connections cannot be decrypted if the key is later stolen.</p>
- </li>
- </ul>
-
- <hr/>
-
- <p id="conclusion">This is the end of <em>How Dat Works</em>. We’ve seen all the steps necessary to download and share files using Dat. If you’d like to write an implementation then check out the <em><a href="https://datprotocol.github.io/book/">Dat Protocol Book</a></em> which offers guidance about implementation details.</p>
- <p>The focus of this guide has been on storing files, but this is just one use of Dat. The protocol is flexible enough to store arbitrary data that does not use the concept of files or folders. One example is <a href="https://www.datprotocol.com/deps/0004-hyperdb/">Hyperdb</a> which is a key/value database, but Dat can be extended to support completely different use cases too. Take a look at the formal <a href="https://www.datprotocol.com/deps/">protocol specifications</a> for more information.</p>
-