|
- title: HyperDB architecture
- url: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
- hash_url: c3bb77e9a6fb2e55fdf41b57919ecfa0
-
- <article class="markdown-body entry-content" itemprop="text"><h1><a id="user-content-hyperdb-architecture" class="anchor" aria-hidden="true" href="#hyperdb-architecture"></a>HyperDB Architecture</h1>
- <p>HyperDB is a scalable peer-to-peer key-value database.</p>
- <h2><a id="user-content-filesystem-metaphor" class="anchor" aria-hidden="true" href="#filesystem-metaphor"></a>Filesystem metaphor</h2>
- <p>HyperDB is structured to be used much like a traditional hierarchical
- filesystem. A value can be written and read at locations like <code>/foo/bar/baz</code>,
- and the API supports querying or tracking values at subpaths, like how watching
- for changes on <code>/foo/bar</code> will report both changes to <code>/foo/bar/baz</code> and also
- <code>/foo/bar/19</code>.</p>
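- <p>To make the subpath idea concrete, here is a minimal sketch of the prefix test that such a watcher would need. The helper <code>isUnder</code> is hypothetical and is not part of the HyperDB API; it only illustrates which keys count as living under a watched path.</p>

```javascript
// Hypothetical helper (not a HyperDB function): does `key` live at or
// below `prefix` in the hierarchy? Comparison is per path component,
// so '/foo/barn' is NOT considered to be under '/foo/bar'.
function isUnder (prefix, key) {
  const split = (p) => p.split('/').filter(Boolean)
  const want = split(prefix)
  const have = split(key)
  return want.length <= have.length && want.every((part, i) => part === have[i])
}

console.log(isUnder('/foo/bar', '/foo/bar/baz')) // true
console.log(isUnder('/foo/bar', '/foo/bar/19')) // true
console.log(isUnder('/foo/bar', '/foo/barn')) // false
```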
- <h2><a id="user-content-set-of-append-only-logs-feeds" class="anchor" aria-hidden="true" href="#set-of-append-only-logs-feeds"></a>Set of append-only logs (feeds)</h2>
- <p>A HyperDB is fundamentally a set of
- <a href="https://github.com/mafintosh/hypercore">hypercore</a>s. A <em>hypercore</em> is a secure
- append-only log that is identified by a public key, and can only be written to
- by the holder of the corresponding private key. Because it is append-only, old
- values can be neither deleted nor modified. Because it is secure, a feed can be
- downloaded from even untrustworthy peers and verified to be accurate. Any
- modifications (malicious or otherwise) to the original feed data by someone
- other than the author can be readily detected.</p>
- <p>Each entry in a hypercore has a <em>sequence number</em> that increments by 1 with
- each write, starting at 0 (<code>seq=0</code>).</p>
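- <p>The sequence-number behaviour can be sketched with a toy in-memory log. The class <code>AppendOnlyLog</code> is an illustrative assumption, not hypercore's API, and it leaves out the signing and verification that make a real hypercore secure.</p>

```javascript
// Toy append-only log (an assumption for illustration; not hypercore).
// Entries are frozen once written, and seq starts at 0 and grows by 1.
class AppendOnlyLog {
  constructor () {
    this.entries = []
  }

  // Append a value and return the sequence number it was stored at.
  append (value) {
    const seq = this.entries.length
    this.entries.push(Object.freeze({ seq, value }))
    return seq
  }

  get (seq) {
    return this.entries[seq]
  }

  get length () {
    return this.entries.length
  }
}

const log = new AppendOnlyLog()
console.log(log.append("/foo/bar = 'baz'")) // 0
console.log(log.append("/foo/2 = '{ \"some\": \"json\" }'")) // 1
console.log(log.length) // 2
```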
- <p>HyperDB builds its hierarchical key-value store on top of these hypercore feeds,
- and also provides facilities for authorizing and replicating those member
- hypercores.</p>
- <h3><a id="user-content-directed-acyclic-graph" class="anchor" aria-hidden="true" href="#directed-acyclic-graph"></a>Directed acyclic graph</h3>
- <p>The combination of all operations performed on a HyperDB by all of its members
- forms a DAG (<em>directed acyclic graph</em>). Each write to the database (setting a
- key to a value) includes information to point backward at all of the known
- "heads" in the graph.</p>
- <p>To illustrate what this means, let's say Alice starts a new HyperDB and writes 2
- values to it:</p>
- <pre><code>// Feed
-
- 0 (/foo/bar = 'baz')
- 1 (/foo/2 = '{ "some": "json" }')
-
-
- // Graph
-
- Alice: 0 <--- 1
- </code></pre>
- <p>Where sequence number 1 (the second entry) refers to sequence number 0 on the
- same feed (Alice's).</p>
- <p>Now Alice <em>authorizes</em> Bob to write to the HyperDB. Internally, this means Alice
- writes a special message to her feed saying that Bob's feed (identified by his
- public key) should be read and replicated by other participants. Her feed
- becomes:</p>
- <pre><code>// Feed
-
- 0 (/foo/bar = 'baz')
- 1 (/foo/2 = '{ "some": "json" }')
- 2 ('' = '')
-
-
- // Graph
-
- Alice: 0 <--- 1 <--- 2
- </code></pre>
- <p>Authorization is formatted internally in a special way so that it isn't
- interpreted as a key/value pair.</p>
- <p>Now Bob writes a value to his feed, and then Alice and Bob sync. The result is:</p>
- <pre><code>// Feed
-
- //// Alice
- 0 (/foo/bar = 'baz')
- 1 (/foo/2 = '{ "some": "json" }')
- 2 ('' = '')
-
- //// Bob
- 0 (/a/b = '12')
-
-
- // Graph
-
- Alice: 0 <--- 1 <--- 2
- Bob : 0
- </code></pre>
- <p>Notice that none of Alice's entries refer to Bob's, and vice versa. This is
- because neither has written any entries to their feeds since the two became
- aware of each other (authorized and replicated each other's feeds).</p>
- <p>Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
- feed at seq 0.</p>
- <p>Next, Alice writes a new value, and her latest entry will refer to Bob's:</p>
- <pre><code>// Feed
-
- //// Alice
- 0 (/foo/bar = 'baz')
- 1 (/foo/2 = '{ "some": "json" }')
- 2 ('' = '')
- 3 (/foo/hup = 'beep')
-
- //// Bob
- 0 (/a/b = '12')
-
-
- // Graph
-
- Alice: 0 <--- 1 <--- 2 <--/ 3
- Bob : 0 <-------------------/
- </code></pre>
- <p>Because Alice's latest feed entry refers to Bob's latest feed entry, there is
- now only one "head" in the database. That means there is enough information in
- Alice's seq=3 entry to find any other key in the database. In the last example,
- there were two heads (Alice's seq=2 and Bob's seq=0), both of which would need
- to be read internally in order to locate any key in the database.</p>
- <p>Now there is only one "head": Alice's feed at seq 3.</p>
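- <p>The notion of a "head" can be sketched as "an entry that no other entry points back at". The entry shape below (<code>id</code> and <code>links</code>) is an assumption made for illustration; HyperDB encodes these back-references differently, as described in the sections that follow.</p>

```javascript
// Illustrative model: each entry lists the entries it points back at.
// The ids and the `links` field are assumptions for this sketch only.
const entries = [
  { id: 'alice@0', links: [] },
  { id: 'alice@1', links: ['alice@0'] },
  { id: 'alice@2', links: ['alice@1'] },
  { id: 'bob@0', links: [] }
]

// A head is any entry that is not referenced by another entry.
function heads (all) {
  const referenced = new Set(all.flatMap((e) => e.links))
  return all.filter((e) => !referenced.has(e.id)).map((e) => e.id)
}

// Before Alice's next write there are two heads.
console.log(heads(entries)) // [ 'alice@2', 'bob@0' ]

// Alice's seq=3 entry points at both previous heads, leaving one head.
const merged = [...entries, { id: 'alice@3', links: ['alice@2', 'bob@0'] }]
console.log(heads(merged)) // [ 'alice@3' ]
```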
- <h2><a id="user-content-authorization" class="anchor" aria-hidden="true" href="#authorization"></a>Authorization</h2>
- <p>The set of hypercores is <em>authorized</em> in that the original author of the first
- hypercore in a hyperdb must explicitly record in their append-only log that the
- public key of a new hypercore is permitted to edit the database. Any authorized
- member may authorize more members. There is currently no revocation or other
- author management.</p>
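- <p>Because any authorized member may authorize more members, the set of writers is the transitive closure of authorizations starting from the first feed. The sketch below assumes a made-up message shape (<code>{ authorize: key }</code>) and plain string keys; the real wire format is not described in this document.</p>

```javascript
// Sketch only: `feeds` maps a feed key to its messages, and an
// authorization is modelled as { authorize: <feed key> }. Both are
// assumptions for illustration, not HyperDB's actual encoding.
function authorizedWriters (rootKey, feeds) {
  const authorized = new Set([rootKey])
  const queue = [rootKey]
  while (queue.length > 0) {
    const key = queue.shift()
    for (const msg of feeds.get(key) || []) {
      if (msg.authorize && !authorized.has(msg.authorize)) {
        authorized.add(msg.authorize)
        queue.push(msg.authorize)
      }
    }
  }
  return authorized
}

const feeds = new Map([
  ['alice', [{ key: '/foo/bar', value: 'baz' }, { authorize: 'bob' }]],
  ['bob', [{ authorize: 'carol' }]],
  // Mallory was never authorized by anyone reachable from Alice,
  // so her self-authorization is ignored.
  ['mallory', [{ authorize: 'mallory' }]]
])

console.log([...authorizedWriters('alice', feeds)]) // [ 'alice', 'bob', 'carol' ]
```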
- <h2><a id="user-content-incremental-index" class="anchor" aria-hidden="true" href="#incremental-index"></a>Incremental index</h2>
- <p>HyperDB builds an <em>incremental index</em> with every new key/value pair ("node")
- written. This means a separate data structure doesn't need to be maintained
- elsewhere for fast writes and lookups: each node written has enough information
- to look up any other key quickly and otherwise navigate the database.</p>
- <p>Each node stores the following basic information:</p>
- <ul>
- <li><code>key</code>: the key that is being created or modified. e.g. <code>/home/sww/dev.md</code></li>
- <li><code>value</code>: the value stored at that key.</li>
- <li><code>seq</code>: the sequence number of this entry in the owner's hypercore. 0 is the
- first, 1 the second, and so forth.</li>
- <li><code>feed</code>: the ID of the hypercore writer that wrote this node</li>
- <li><code>path</code>: a 2-bit hash sequence of the key's components</li>
- <li><code>trie</code>: a navigation structure used with <code>path</code> to find a desired key</li>
- <li><code>clock</code>: vector clock to determine node insertion causality</li>
- <li><code>feeds</code>: an array of { feedKey, seq } for decoding a <code>clock</code></li>
- </ul>
- <h3><a id="user-content-vector-clock" class="anchor" aria-hidden="true" href="#vector-clock"></a>Vector clock</h3>
- <p>Each node stores a <a href="https://en.wikipedia.org/wiki/Vector_clock" rel="nofollow">vector clock</a> of
- the last known sequence number from each feed it knows about. This is what forms
- the DAG structure.</p>
- <p>A vector clock on a node of, say, <code>[0, 2, 5]</code> means:</p>
- <ul>
- <li>when this node was written, the largest seq # in my local feed was 0</li>
- <li>when this node was written, the largest seq # in the second feed I have was 2</li>
- <li>when this node was written, the largest seq # in the third feed I have was 5</li>
- </ul>
- <p>For example, Bob's vector clock for Alice's seq=3 entry above would be <code>[0, 3]</code>
- since he knows of her latest entry (seq=3) and his own (seq=0).</p>
- <p>The vector clock is used for correctly traversing history. This is necessary for
- the <code>db#heads</code> API as well as <code>db#createHistoryStream</code>.</p>
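- <p>As a rough illustration of how vector clocks express causality, the sketch below compares two clocks entry by entry. It assumes both clocks list feeds in the same order and treats a missing entry as "feed not yet known"; the function <code>compareClocks</code> is hypothetical and is not taken from HyperDB's source.</p>

```javascript
// Compare two vector clocks (assumption: same feed order in both).
// A missing entry is treated as -1, i.e. nothing seen from that feed,
// since sequence numbers start at 0.
function compareClocks (a, b) {
  let aAhead = false
  let bAhead = false
  const length = Math.max(a.length, b.length)
  for (let i = 0; i < length; i++) {
    const x = a[i] === undefined ? -1 : a[i]
    const y = b[i] === undefined ? -1 : b[i]
    if (x > y) aAhead = true
    if (y > x) bAhead = true
  }
  if (aAhead && bAhead) return 'concurrent' // neither node saw the other
  if (aAhead) return 'after'
  if (bAhead) return 'before'
  return 'equal'
}

console.log(compareClocks([2, 0], [3, 0])) // before
console.log(compareClocks([2, 0], [1, 1])) // concurrent
console.log(compareClocks([0, 2, 5], [0, 2, 5])) // equal
```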
- <h3><a id="user-content-prefix-trie" class="anchor" aria-hidden="true" href="#prefix-trie"></a>Prefix trie</h3>
- <p>Given a HyperDB with hundreds of entries, how can a key like <code>/a/b/c</code> be looked
- up quickly?</p>
- <p>Each node stores a <em>prefix <a href="https://en.wikipedia.org/wiki/Trie" rel="nofollow">trie</a></em> that
- assists with finding the shortest path to the desired key.</p>
- <p>When a node is written, its <em>prefix hash</em> is computed. This is done by first
- splitting the key into its components (<code>a</code>, <code>b</code>, and <code>c</code> for <code>/a/b/c</code>), and then
- hashing each component into a 32-character hash, where each character is a 2-bit
- value (0, 1, 2, or 3). The prefix hash for <code>/a/b/c</code>, stored in the node's <code>path</code>, is:</p>
- <div class="highlight highlight-source-js"><pre><span class="pl-smi">node</span>.<span class="pl-smi">path</span> <span class="pl-k">=</span> [
- <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
- <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>,
- <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>,
- <span class="pl-c1">4</span> ]</pre></div>
- <p>Each component's hash is shown on its own line. <code>4</code> is a special value indicating the
- end of the prefix.</p>
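- <p>The shape of this computation can be sketched as follows. This document does not say which hash function HyperDB uses, so the sketch substitutes SHA-256 truncated to 64 bits as a stand-in; the values it produces will therefore not match the arrays shown above, and the function names are hypothetical. What it does reproduce is the structure: 32 two-bit values per component, followed by the terminator <code>4</code>.</p>

```javascript
const crypto = require('crypto')

const TERMINATOR = 4

// Hash one path component into 32 two-bit values (64 bits in total).
// ASSUMPTION: truncated SHA-256 is a stand-in for the real hash.
function hashComponent (component) {
  const bytes = crypto.createHash('sha256').update(component).digest().subarray(0, 8)
  const out = []
  for (const byte of bytes) {
    // Each byte yields four 2-bit values, lowest bits first.
    for (let shift = 0; shift < 8; shift += 2) out.push((byte >> shift) & 3)
  }
  return out
}

// Split the key into components, hash each, and append the terminator.
function pathOf (key) {
  const components = key.split('/').filter(Boolean)
  return components.flatMap(hashComponent).concat(TERMINATOR)
}

const ab = pathOf('/a/b')
const abc = pathOf('/a/b/c')
console.log(ab.length, abc.length) // 65 97
// The first 64 values agree because /a/b is a prefix of /a/b/c.
console.log(abc.slice(0, 64).every((v, i) => v === ab[i])) // true
```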
- <h4><a id="user-content-example" class="anchor" aria-hidden="true" href="#example"></a>Example</h4>
- <p>Consider a fresh HyperDB. We write <code>/a/b = 24</code> and get back this node:</p>
- <div class="highlight highlight-source-js"><pre>{ key<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>/a/b<span class="pl-pds">'</span></span>,
- value<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>24<span class="pl-pds">'</span></span>,
- clock<span class="pl-k">:</span> [ <span class="pl-c1">0</span> ],
- trie<span class="pl-k">:</span> [],
- feeds<span class="pl-k">:</span> [ [<span class="pl-c1">Object</span>] ],
- feedSeq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
- feed<span class="pl-k">:</span> <span class="pl-c1">0</span>,
- seq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
- path<span class="pl-k">:</span>
- [ <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
- <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>,
- <span class="pl-c1">4</span> ] }</pre></div>
- <p>If you compare this path to the one for <code>/a/b/c</code> above, you'll see that the
- first 64 2-bit characters match. This is because <code>/a/b</code> is a prefix of <code>/a/b/c</code>.</p>
- <p>Since this is the first entry, <code>seq</code> is 0. Since this is the only known feed,
- <code>feed</code> is also 0. <code>feeds</code> is an array of entries of the form <code>{ key: Buffer, seq: Number }</code> that let you map the numeric value <code>feed</code> to a hypercore key and
- its sequence number head. <code>feeds</code> isn't always set: it only gets included when
- it changes compared to the previous node (<code>node.seq - 1</code>), in the interest of storing less data per
- node.</p>
- <p>Now we write <code>/a/c = hello</code> and get this node:</p>
- <div class="highlight highlight-source-js"><pre>{ key<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>/a/c<span class="pl-pds">'</span></span>,
- value<span class="pl-k">:</span> <span class="pl-s"><span class="pl-pds">'</span>hello<span class="pl-pds">'</span></span>,
- clock<span class="pl-k">:</span> [ <span class="pl-c1">0</span> ],
- trie<span class="pl-k">:</span> [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , [ , , [ { feed<span class="pl-k">:</span> <span class="pl-c1">0</span>, seq<span class="pl-k">:</span> <span class="pl-c1">0</span> } ] ] ],
- feeds<span class="pl-k">:</span> [],
- feedSeq<span class="pl-k">:</span> <span class="pl-c1">0</span>,
- feed<span class="pl-k">:</span> <span class="pl-c1">0</span>,
- seq<span class="pl-k">:</span> <span class="pl-c1">1</span>,
- path<span class="pl-k">:</span>
- [ <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>,
- <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">2</span>, <span class="pl-c1">0</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">1</span>, <span class="pl-c1">2</span>, <span class="pl-c1">1</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">3</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">2</span>, <span class="pl-c1">3</span>, <span class="pl-c1">0</span>, <span class="pl-c1">1</span>, <span class="pl-c1">0</span>,
- <span class="pl-c1">4</span> ] }</pre></div>
- <p>As expected, this node has the same <code>feed</code> value as before (since we're only
- writing to one feed). Its <code>seq</code> is 1, since the last was 0. Notice that <code>feeds</code>
- is empty, because the mapping of the numeric <code>feed</code> value to a key hasn't
- changed.</p>
- <p>Also, this and the previous node have the first 32 characters of their <code>path</code> in
- common (the prefix <code>/a</code>).</p>
- <p>Notice though that <code>trie</code> is set. It's a long but sparse array. It has 35
- entries, with the last one referencing the first node inserted (<code>/a/b</code>). Why?</p>
- <p>(If it weren't stored as a sparse array, you'd actually see 64 entries (the
- length of the <code>path</code>). But since the other 29 entries are also empty, hyperdb
- doesn't bother allocating them.)</p>
- <p>If you visually compare this node's <code>path</code> with the previous node's <code>path</code>, how
- many entries do they have in common? At which entry do the 2-bit numbers
- diverge?</p>
- <p>At the 35th entry.</p>
- <p>What this is saying is "if the hash of the key you're looking for differs from
- mine on the 35th entry, you want to travel to <code>{ feed: 0, seq: 0 }</code> to find the
- node you're looking for."</p>
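- <p>The divergence point can be checked mechanically. The sketch below copies the <code>path</code> values of the two nodes shown above and finds the first index at which they differ; the helper name <code>firstDivergence</code> is an assumption for illustration.</p>

```javascript
// Path values copied from the two example nodes above.
const a = [1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2]
const b = [0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3]
const c = [0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0]

const pathAB = [...a, ...b, 4] // path of /a/b
const pathAC = [...a, ...c, 4] // path of /a/c

// Return the first 0-based index where the paths differ, or -1 if equal.
function firstDivergence (p, q) {
  const length = Math.max(p.length, q.length)
  for (let i = 0; i < length; i++) {
    if (p[i] !== q[i]) return i
  }
  return -1
}

const index = firstDivergence(pathAB, pathAC)
console.log(index) // 34, i.e. the 35th entry
console.log(pathAB[index], pathAC[index]) // 2 1
```

- <p>Index 34 is the last slot of the 35-entry <code>trie</code> array, and the value 2 found there in the path of <code>/a/b</code> matches the position of the <code>{ feed: 0, seq: 0 }</code> pointer inside that slot.</p>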
- <p>This is how finding a node works, starting at any other node:</p>
- <ol>
- <li>Compute the 2-bit hash sequence of the key you're after (e.g. <code>/a/b</code>)</li>
- <li>Look up the newest entry in the feed.</li>
- <li>Compare its <code>path</code> against the hash you just computed.</li>
- <li>If you discover that the <code>path</code> and your hash match, then this is the node
- you're looking for!</li>
- <li>Otherwise, once a 2-bit character from <code>path</code> and your hash disagree, note
- the index # where they differ and look up that value in the node's <code>trie</code>.
- Fetch that node at the given feed and sequence number, and go back to step 3.
- Repeat until you reach step 4 (match) or there is no entry in the node's trie
- for the key you're after (no match).</li>
- </ol>
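- <p>The steps above can be sketched for a single feed as follows. This is a simplified reading of the algorithm, not HyperDB's implementation: paths are shortened to three values plus the terminator, the trie is modelled as <code>trie[index][value]</code> holding a list of <code>{ feed, seq }</code> pointers (inferred from the example node above), and only the first pointer in a bucket is followed. All names are hypothetical.</p>

```javascript
// First 0-based index where two paths differ, or -1 if they are equal.
function firstDivergence (p, q) {
  const length = Math.max(p.length, q.length)
  for (let i = 0; i < length; i++) {
    if (p[i] !== q[i]) return i
  }
  return -1
}

// Steps 3 to 5: walk from `head` towards the node whose path matches.
// `getNode(feed, seq)` is an assumed accessor for stored nodes.
function lookup (targetPath, head, getNode) {
  let node = head
  while (node) {
    const i = firstDivergence(node.path, targetPath)
    if (i === -1) return node // step 4: paths match
    const slot = node.trie[i]
    const bucket = slot && slot[targetPath[i]]
    if (!bucket || bucket.length === 0) return null // no trie entry: no match
    node = getNode(bucket[0].feed, bucket[0].seq) // step 5: follow the pointer
  }
  return null
}

// Toy feed with shortened paths (assumed values, not real hashes).
const nodes = [
  { key: '/a/b', value: '24', path: [1, 0, 2, 4], trie: {} },
  // Differs from /a/b at index 2, where /a/b has the value 2.
  { key: '/a/c', value: 'hello', path: [1, 0, 1, 4], trie: { 2: { 2: [{ feed: 0, seq: 0 }] } } }
]
const getNode = (feed, seq) => nodes[seq]
const head = nodes[nodes.length - 1] // step 2: newest entry in the feed

console.log(lookup([1, 0, 2, 4], head, getNode).key) // /a/b
console.log(lookup([1, 0, 1, 4], head, getNode).key) // /a/c
console.log(lookup([3, 3, 3, 4], head, getNode)) // null
```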
- <p>What if there are multiple feeds in the HyperDB? The lookup algorithm changes
- slightly. Replace step 2 above with:</p>
- <blockquote>
- <ol start="2">
- <li>Fetch the latest entry from <em>every</em> feed. For each head node, proceed to
- the next step.</li>
- </ol>
- </blockquote>
- <p>The other steps are the same as before.</p>
- </article>
|