title: HyperDB architecture
url: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
hash_url: c3bb77e9a6fb2e55fdf41b57919ecfa0
  4. <article class="markdown-body entry-content" itemprop="text"><h1><a id="user-content-hyperdb-architecture" class="anchor" aria-hidden="true" href="#hyperdb-architecture"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>HyperDB Architecture</h1>
  5. <p>HyperDB is a scalable peer-to-peer key-value database.</p>
  6. <h2><a id="user-content-filesystem-metaphor" class="anchor" aria-hidden="true" href="#filesystem-metaphor"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Filesystem metaphor</h2>
<p>HyperDB is structured to be used much like a traditional hierarchical
filesystem. A value can be written and read at locations like <code>/foo/bar/baz</code>,
and the API supports querying or tracking values at subpaths: for example, watching
for changes on <code>/foo/bar</code> reports changes to both <code>/foo/bar/baz</code> and
<code>/foo/bar/19</code>.</p>
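<p>The subpath behaviour can be pictured as a component-wise prefix test. The helper below is a minimal sketch: <code>isUnderPrefix</code> is a hypothetical name used only for illustration and is not part of the hyperdb API.</p>

```javascript
// Hypothetical helper, not part of hyperdb: is `key` located at or below `prefix`?
function isUnderPrefix (prefix, key) {
  const split = (p) => p.split('/').filter(Boolean)
  const want = split(prefix)
  const have = split(key)
  // Compare whole components, so '/foo/bar' does not match '/foo/barn'.
  return want.length <= have.length && want.every((part, i) => part === have[i])
}

console.log(isUnderPrefix('/foo/bar', '/foo/bar/baz')) // true
console.log(isUnderPrefix('/foo/bar', '/foo/bar/19')) // true
console.log(isUnderPrefix('/foo/bar', '/foo/barn')) // false
```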
  12. <h2><a id="user-content-set-of-append-only-logs-feeds" class="anchor" aria-hidden="true" href="#set-of-append-only-logs-feeds"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Set of append-only logs (feeds)</h2>
<p>A HyperDB is fundamentally a set of
<a href="https://github.com/mafintosh/hypercore">hypercore</a>s. A <em>hypercore</em> is a secure
append-only log that is identified by a public key, and can only be written to
by the holder of the corresponding private key. Because it is append-only, old
values can be neither deleted nor modified. Because it is secure, a feed can be
downloaded from even untrustworthy peers and verified to be accurate. Any
modifications (malicious or otherwise) to the original feed data by someone
other than the author can be readily detected.</p>
<p>Each entry in a hypercore has a <em>sequence number</em> that increments by 1 with
each write, starting at 0 (<code>seq=0</code>).</p>
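<p>As a rough mental model, sequence numbers behave like indexes into an array that only ever grows. <code>ToyFeed</code> below is an assumed in-memory stand-in written for this illustration; it has no signing, persistence, or networking and is not the hypercore API.</p>

```javascript
// Toy in-memory stand-in for a hypercore feed (illustration only).
class ToyFeed {
  constructor () {
    this.entries = []
  }

  // Appends a value and returns the sequence number it was stored at.
  append (value) {
    this.entries.push(Object.freeze({ seq: this.entries.length, value }))
    return this.entries.length - 1
  }

  // Reads the entry stored at a sequence number.
  get (seq) {
    return this.entries[seq]
  }

  get length () {
    return this.entries.length
  }
}

const toyFeed = new ToyFeed()
const first = toyFeed.append("/foo/bar = 'baz'")
const second = toyFeed.append('/foo/2 = \'{ "some": "json" }\'')
console.log(first, second) // 0 1
```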
<p>HyperDB builds its hierarchical key-value store on top of these hypercore feeds,
and also provides facilities for authorizing and replicating those member
hypercores.</p>
  26. <h3><a id="user-content-directed-acyclic-graph" class="anchor" aria-hidden="true" href="#directed-acyclic-graph"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Directed acyclic graph</h3>
  27. <p>The combination of all operations performed on a HyperDB by all of its members
  28. forms a DAG (<em>directed acyclic graph</em>). Each write to the database (setting a
  29. key to a value) includes information to point backward at all of the known
  30. "heads" in the graph.</p>
  31. <p>To illustrate what this means, let's say Alice starts a new HyperDB and writes 2
  32. values to it:</p>
<pre><code>// Feed
0 (/foo/bar = 'baz')
1 (/foo/2 = '{ "some": "json" }')
// Graph
Alice: 0 &lt;--- 1
</code></pre>
  39. <p>Where sequence number 1 (the second entry) refers to sequence number 0 on the
  40. same feed (Alice's).</p>
<p>Now Alice <em>authorizes</em> Bob to write to the HyperDB. Internally, this means Alice
writes a special message to her feed saying that Bob's feed (identified by his
public key) should be read and replicated by other participants. Her feed
becomes:</p>
<pre><code>// Feed
0 (/foo/bar = 'baz')
1 (/foo/2 = '{ "some": "json" }')
2 ('' = '')
// Graph
Alice: 0 &lt;--- 1 &lt;--- 2
</code></pre>
  52. <p>Authorization is formatted internally in a special way so that it isn't
  53. interpreted as a key/value pair.</p>
  54. <p>Now Bob writes a value to his feed, and then Alice and Bob sync. The result is:</p>
<pre><code>// Feed
//// Alice
0 (/foo/bar = 'baz')
1 (/foo/2 = '{ "some": "json" }')
2 ('' = '')
//// Bob
0 (/a/b = '12')
// Graph
Alice: 0 &lt;--- 1 &lt;--- 2
Bob  : 0
</code></pre>
  66. <p>Notice that none of Alice's entries refer to Bob's, and vice versa. This is
  67. because neither has written any entries to their feeds since the two became
  68. aware of each other (authorized &amp; replicated each other's feeds).</p>
  69. <p>Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
  70. feed at seq 0.</p>
  71. <p>Next, Alice writes a new value, and her latest entry will refer to Bob's:</p>
<pre><code>// Feed
//// Alice
0 (/foo/bar = 'baz')
1 (/foo/2 = '{ "some": "json" }')
2 ('' = '')
3 (/foo/hup = 'beep')
//// Bob
0 (/a/b = '12')
// Graph
Alice: 0 &lt;--- 1 &lt;--- 2 &lt;--/ 3
Bob  : 0 &lt;-------------------/
</code></pre>
<p>Because Alice's latest feed entry refers to Bob's latest feed entry, there is
now only one "head" in the database: Alice's feed at seq 3. That means there is
enough information in Alice's seq=3 entry to find any other key in the database.
In the previous example there were two heads (Alice's seq=2 and Bob's seq=0), both
of which would need to be read internally in order to locate any key in the
database.</p>
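<p>The head counts in the two situations above can be checked with a small sketch: a head is any entry that no other entry points back at. The <code>links</code> field and the <code>findHeads</code> function are assumptions made for this illustration and do not reflect hyperdb's stored format.</p>

```javascript
// Each entry lists the entries it points back at. The `links` field is an
// assumption of this sketch, not hyperdb's on-disk representation.
function findHeads (entries) {
  const id = (e) => `${e.feed}@${e.seq}`
  const referenced = new Set()
  for (const entry of entries) {
    for (const link of entry.links) referenced.add(id(link))
  }
  // A head is an entry that nothing else refers to.
  return entries.filter((entry) => !referenced.has(id(entry))).map(id)
}

const beforeAliceWrites = [
  { feed: 'alice', seq: 0, links: [] },
  { feed: 'alice', seq: 1, links: [{ feed: 'alice', seq: 0 }] },
  { feed: 'alice', seq: 2, links: [{ feed: 'alice', seq: 1 }] },
  { feed: 'bob', seq: 0, links: [] }
]
const afterAliceWrites = beforeAliceWrites.concat([
  { feed: 'alice', seq: 3, links: [{ feed: 'alice', seq: 2 }, { feed: 'bob', seq: 0 }] }
])

console.log(findHeads(beforeAliceWrites)) // [ 'alice@2', 'bob@0' ]
console.log(findHeads(afterAliceWrites)) // [ 'alice@3' ]
```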
  90. <h2><a id="user-content-authorization" class="anchor" aria-hidden="true" href="#authorization"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Authorization</h2>
<p>The set of hypercores is <em>authorized</em> in that the original author of the first
hypercore in a hyperdb must explicitly denote in their append-only log that the
public key of a new hypercore is permitted to edit the database. Any authorized
member may authorize more members. There is currently no revocation or other
author management mechanism.</p>
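<p>Because any authorized member may authorize more members, the set of writers is the transitive closure of authorizations starting from the first feed. The sketch below illustrates that idea only; the <code>{ type: 'authorize', key }</code> message shape and the <code>authorizedKeys</code> function are hypothetical, since the internal encoding of authorization messages is not described here.</p>

```javascript
// `feedsByKey` maps a public key to that feed's entries. The entry shape
// `{ type: 'authorize', key }` is hypothetical and used for illustration.
function authorizedKeys (feedsByKey, rootKey) {
  const authorized = new Set([rootKey])
  const queue = [rootKey]
  while (queue.length > 0) {
    const key = queue.shift()
    for (const entry of feedsByKey[key] || []) {
      if (entry.type === 'authorize' && !authorized.has(entry.key)) {
        authorized.add(entry.key)
        queue.push(entry.key)
      }
    }
  }
  return [...authorized]
}

const feedsByKey = {
  alice: [{ type: 'put' }, { type: 'authorize', key: 'bob' }],
  bob: [{ type: 'authorize', key: 'carol' }],
  carol: [],
  // Mallory only vouches for herself, so nobody reachable from Alice trusts her.
  mallory: [{ type: 'authorize', key: 'mallory' }]
}

console.log(authorizedKeys(feedsByKey, 'alice')) // [ 'alice', 'bob', 'carol' ]
```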
  96. <h2><a id="user-content-incremental-index" class="anchor" aria-hidden="true" href="#incremental-index"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Incremental index</h2>
<p>HyperDB builds an <em>incremental index</em> with every new key/value pair ("node")
written. This means a separate data structure doesn't need to be maintained
elsewhere for fast writes and lookups: each node written has enough information
to look up any other key quickly and otherwise navigate the database.</p>
  101. <p>Each node stores the following basic information:</p>
  102. <ul>
  103. <li><code>key</code>: the key that is being created or modified. e.g. <code>/home/sww/dev.md</code></li>
  104. <li><code>value</code>: the value stored at that key.</li>
  105. <li><code>seq</code>: the sequence number of this entry in the owner's hypercore. 0 is the
  106. first, 1 the second, and so forth.</li>
<li><code>feed</code>: the ID of the hypercore writer that wrote this node</li>
  108. <li><code>path</code>: a 2-bit hash sequence of the key's components</li>
  109. <li><code>trie</code>: a navigation structure used with <code>path</code> to find a desired key</li>
  110. <li><code>clock</code>: vector clock to determine node insertion causality</li>
  111. <li><code>feeds</code>: an array of { feedKey, seq } for decoding a <code>clock</code></li>
  112. </ul>
  113. <h3><a id="user-content-vector-clock" class="anchor" aria-hidden="true" href="#vector-clock"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Vector clock</h3>
  114. <p>Each node stores a <a href="https://en.wikipedia.org/wiki/Vector_clock" rel="nofollow">vector clock</a> of
  115. the last known sequence number from each feed it knows about. This is what forms
  116. the DAG structure.</p>
  117. <p>A vector clock on a node of, say, <code>[0, 2, 5]</code> means:</p>
  118. <ul>
<li>when this node was written, the largest seq # in my local feed is 0</li>
<li>when this node was written, the largest seq # in the second feed I have is 2</li>
<li>when this node was written, the largest seq # in the third feed I have is 5</li>
  122. </ul>
  123. <p>For example, Bob's vector clock for Alice's seq=3 entry above would be <code>[0, 3]</code>
  124. since he knows of her latest entry (seq=3) and his own (seq=0).</p>
  125. <p>The vector clock is used for correctly traversing history. This is necessary for
  126. the <code>db#heads</code> API as well as <code>db#createHistoryStream</code>.</p>
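<p>To see how a vector clock expresses causality, the following sketch compares two clocks slot by slot. It assumes both clocks list feeds in the same order and treats a missing slot as "nothing seen yet" (<code>-1</code>); <code>compareClocks</code> is an illustrative name, not a hyperdb function.</p>

```javascript
// Returns 'before', 'after', 'equal', or 'concurrent'. Assumes both clocks use
// the same feed ordering; a missing slot counts as -1 (an assumption here).
function compareClocks (a, b) {
  let aAhead = false
  let bAhead = false
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = i < a.length ? a[i] : -1
    const y = i < b.length ? b[i] : -1
    if (x > y) aAhead = true
    if (y > x) bAhead = true
  }
  if (aAhead && bAhead) return 'concurrent'
  if (aAhead) return 'after'
  if (bAhead) return 'before'
  return 'equal'
}

console.log(compareClocks([2, 0], [3, 0])) // before
console.log(compareClocks([3, 0], [2, 1])) // concurrent
console.log(compareClocks([0, 2, 5], [0, 2, 5])) // equal
```

<p>Two nodes whose clocks compare as concurrent were written without knowledge of each other, which is the situation in which a database has more than one head.</p>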
  127. <h3><a id="user-content-prefix-trie" class="anchor" aria-hidden="true" href="#prefix-trie"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Prefix trie</h3>
  128. <p>Given a HyperDB with hundreds of entries, how can a key like <code>/a/b/c</code> be looked
  129. up quickly?</p>
  130. <p>Each node stores a <em>prefix <a href="https://en.wikipedia.org/wiki/Trie" rel="nofollow">trie</a></em> that
  131. assists with finding the shortest path to the desired key.</p>
<p>When a node is written, its <em>prefix hash</em> is computed. This is done by first
splitting the key into its components (<code>a</code>, <code>b</code>, and <code>c</code> for <code>/a/b/c</code>), and then
hashing each component into a 32-character hash, where each character is a 2-bit
value (0, 1, 2, or 3). The prefix hash for <code>/a/b/c</code>, stored as the node's <code>path</code>, is:</p>
  136. <div class="highlight highlight-source-js"><pre><span class="pl-smi">node</span>.<span class="pl-smi">path</span> <span class="pl-k">=</span> [
  137. <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
  138. <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>,
  139. <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>,
  140. <span class="pl-c1">4</span> ]</pre></div>
<p>Each component's hash is shown on its own line. <code>4</code> is a special value indicating the
end of the prefix.</p>
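<p>The shape of this computation can be sketched as follows. SHA-256 from Node's <code>crypto</code> module is used purely as a stand-in hash, so the digits it produces will not match the values shown above; <code>hashComponent</code> and <code>keyToPath</code> are illustrative names, and the bit ordering is an arbitrary choice for this example.</p>

```javascript
const crypto = require('crypto')

const TERMINATOR = 4

// Hashes one key component into 32 two-bit characters. SHA-256 is a stand-in
// chosen for this sketch, not necessarily the hash hyperdb uses.
function hashComponent (component) {
  const digest = crypto.createHash('sha256').update(component).digest()
  const chars = []
  for (let i = 0; i < 8; i++) {
    // Split each of the first 8 bytes into four 2-bit values, lowest bits first.
    for (let shift = 0; shift < 8; shift += 2) {
      chars.push((digest[i] >> shift) & 3)
    }
  }
  return chars
}

// Concatenates the component hashes and appends the end marker.
function keyToPath (key) {
  const components = key.split('/').filter(Boolean)
  return components.flatMap(hashComponent).concat(TERMINATOR)
}

const ab = keyToPath('/a/b')
const abc = keyToPath('/a/b/c')
console.log(ab.length, abc.length) // 65 97
```

<p>Because each component is hashed independently, keys that share leading components share the corresponding leading characters of their paths, which is the property the trie relies on.</p>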
  143. <h4><a id="user-content-example" class="anchor" aria-hidden="true" href="#example"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Example</h4>
  144. <p>Consider a fresh HyperDB. We write <code>/a/b = 24</code> and get back this node:</p>
  145. <div class="highlight highlight-source-js"><pre>{ key<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>/a/b<span class="pl-pds">'</span></span>,
  146. value<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>24<span class="pl-pds">'</span></span>,
  147. clock<span class="pl-k">:</span> [ <span class="pl-c1">0</span> ],
  148. trie<span class="pl-k">:</span> [],
  149. feeds<span class="pl-k">:</span> [ [<span class="pl-c1">Object</span>] ],
  150. feedSeq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
  151. feed<span class="pl-k">:</span> <span class="pl-c1">0</span>,
  152. seq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
  153. path<span class="pl-k">:</span>
  154. [ <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
  155. <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>,
  156. <span class="pl-c1">4</span> ] }</pre></div>
  157. <p>If you compare this path to the one for <code>/a/b/c</code> above, you'll see that the
  158. first 64 2-bit characters match. This is because <code>/a/b</code> is a prefix of <code>/a/b/c</code>.</p>
<p>Since this is the first entry, <code>seq</code> is 0. Since this is the only known feed,
<code>feed</code> is also 0. <code>feeds</code> is an array of entries of the form
<code>{ key: Buffer, seq: Number }</code> that lets you map the numeric value <code>feed</code> to a hypercore key and
its sequence number head. <code>feeds</code> isn't always set: it is only included when
it has changed compared to the previous node (<code>node.seq - 1</code>), in the interest of storing less data per
node.</p>
  164. <p>Now we write <code>/a/c = hello</code> and get this node:</p>
  165. <div class="highlight highlight-source-js"><pre>{ key<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>/a/c<span class="pl-pds">'</span></span>,
  166. value<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>hello<span class="pl-pds">'</span></span>,
  167. clock<span class="pl-k">:</span> [ <span class="pl-c1">0</span> ],
  168. trie<span class="pl-k">:</span> [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , [ , , [ { feed<span class="pl-k">:</span> <span class="pl-c1">0</span>, seq<span class="pl-k">:</span> <span class="pl-c1">0</span> } ] ] ],
  169. feeds<span class="pl-k">:</span> [],
  170. feedSeq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
  171. feed<span class="pl-k">:</span> <span class="pl-c1">0</span>,
  172. seq<span class="pl-k">:</span> <span class="pl-c1">1</span>,
  173. path<span class="pl-k">:</span>
  174. [ <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
  175. <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>,
  176. <span class="pl-c1">4</span> ] }</pre></div>
<p>As expected, this node has the same <code>feed</code> value as before (since we're only
writing to one feed). Its <code>seq</code> is 1, since the last was 0. Notice that <code>feeds</code>
is empty, because the mapping of the numeric <code>feed</code> value to a key hasn't
changed.</p>
  181. <p>Also, this and the previous node have the first 32 characters of their <code>path</code> in
  182. common (the prefix <code>/a</code>).</p>
<p>Notice though that <code>trie</code> is set. It's a long but sparse array. It has 35
entries, with the last one referencing the first node inserted (<code>/a/b</code>). Why?</p>
<p>(If it weren't stored as a sparse array, you'd actually see 64 entries (the
length of the <code>path</code>). But since the other 29 entries are also empty, hyperdb
doesn't bother allocating them.)</p>
  188. <p>If you visually compare this node's <code>path</code> with the previous node's <code>path</code>, how
  189. many entries do they have in common? At which entry do the 2-bit numbers
  190. diverge?</p>
  191. <p>At the 35th entry.</p>
<p>What this is saying is: "if the hash of the key you're looking for differs from
mine on the 35th entry, you want to travel to <code>{ feed: 0, seq: 0 }</code> to find the
node you're looking for."</p>
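<p>The divergence point can be confirmed directly from the two <code>path</code> arrays printed above. In this sketch the component hashes are copied from those nodes, and <code>firstDivergence</code> is a helper name introduced only for illustration.</p>

```javascript
// 32-character component hashes copied from the two nodes shown above.
const A = [1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2]
const B = [0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3]
const C = [0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0]

const pathAB = [...A, ...B, 4] // path of /a/b
const pathAC = [...A, ...C, 4] // path of /a/c

// Returns the zero-based index of the first differing character,
// or -1 when the two paths are identical.
function firstDivergence (p, q) {
  const n = Math.max(p.length, q.length)
  for (let i = 0; i < n; i++) {
    if (p[i] !== q[i]) return i
  }
  return -1
}

const index = firstDivergence(pathAB, pathAC)
console.log(index) // 34, which is the 35th entry when counting from 1
console.log(pathAB[index], pathAC[index]) // 2 1
```

<p>The value <code>2</code> that <code>/a/b</code> holds at that position lines up with the inner index at which <code>{ feed: 0, seq: 0 }</code> appears in the <code>trie</code> output above.</p>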
  195. <p>This is how finding a node works, starting at any other node:</p>
  196. <ol>
<li>Compute the 2-bit hash sequence of the key you're after (e.g. <code>/a/b</code>)</li>
<li>Look up the newest entry in the feed.</li>
  199. <li>Compare its <code>path</code> against the hash you just computed.</li>
  200. <li>If you discover that the <code>path</code> and your hash match, then this is the node
  201. you're looking for!</li>
  202. <li>Otherwise, once a 2-bit character from <code>path</code> and your hash disagree, note
  203. the index # where they differ and look up that value in the node's <code>trie</code>.
  204. Fetch that node at the given feed and sequence number, and go back to step 3.
  205. Repeat until you reach step 4 (match) or there is no entry in the node's trie
  206. for the key you're after (no match).</li>
  207. </ol>
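<p>The steps above can be sketched for a single feed as follows. This is a toy model under stated assumptions: paths are shortened to three characters plus the terminator <code>4</code>, the nodes and their <code>trie</code> contents are hand-built for the example, and <code>lookup</code> is an illustrative function rather than hyperdb's implementation.</p>

```javascript
// Toy single-feed database: toyNodes[seq] is the node written at that
// sequence number. Paths and tries are hand-built for this example.
const toyNodes = [
  { key: 'x', path: [1, 2, 0, 4], trie: [] },
  // 'y' differs from 'x' at index 1, where 'x' has the value 2.
  { key: 'y', path: [1, 3, 0, 4], trie: [undefined, [undefined, undefined, [{ feed: 0, seq: 0 }]]] },
  // 'z' differs from both at index 0, where they have the value 1.
  { key: 'z', path: [0, 1, 1, 4], trie: [[undefined, [{ feed: 0, seq: 1 }]]] }
]

function lookup (nodes, targetPath) {
  let node = nodes[nodes.length - 1] // step 2: start at the newest entry
  while (node) {
    // step 3: compare the node's path against the target hash
    const i = node.path.findIndex((c, idx) => c !== targetPath[idx])
    if (i === -1) return node // step 4: every character matched
    // step 5: follow the trie pointer for the diverging index and wanted value
    const bucket = node.trie[i]
    const pointers = bucket && bucket[targetPath[i]]
    if (!pointers || pointers.length === 0) return null // no match
    node = nodes[pointers[0].seq]
  }
  return null
}

console.log(lookup(toyNodes, [1, 2, 0, 4]).key) // x (reached via z, then y)
console.log(lookup(toyNodes, [2, 0, 0, 4])) // null
```

<p>Each hop moves to a node that shares a strictly longer prefix with the target, so the number of hops is bounded by the length of the path.</p>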
<p>What if there are multiple feeds in the HyperDB? The lookup algorithm changes
slightly. Replace step 2 above with:</p>
  210. <blockquote>
  211. <ol start="2">
  212. <li>Fetch the latest entry from <em>every</em> feed. For each head node, proceed to
  213. the next step.</li>
  214. </ol>
  215. </blockquote>
  216. <p>The other steps are the same as before.</p>
  217. </article>