A place to cache linked articles (think custom and personal wayback machine)

4 years ago
  1. title: What happens when...
  2. url: https://github.com/alex/what-happens-when/blob/master/README.rst
  3. hash_url: 46f440a287ca07d71b0fd3e7915a9a0c
  4. <article class="markdown-body entry-content" itemprop="text"><a name="user-content-what-happens-when"/>
  5. <h2><a id="user-content-what-happens-when" class="anchor" aria-hidden="true" href="#what-happens-when"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>What happens when...</h2>
  6. <p>This repository is an attempt to answer the age-old interview question "What
  7. happens when you type google.com into your browser's address box and press
  8. enter?"</p>
  9. <p>Except instead of the usual story, we're going to try to answer this question
  10. in as much detail as possible. No skipping out on anything.</p>
  11. <p>This is a collaborative process, so dig in and try to help out! There are tons
  12. of details missing, just waiting for you to add them! So send us a pull
  13. request, please!</p>
  14. <p>This is all licensed under the terms of the <a href="https://creativecommons.org/publicdomain/zero/1.0/" rel="nofollow">Creative Commons Zero</a> license.</p>
  15. <p>Read this in <a href="https://github.com/skyline75489/what-happens-when-zh_CN">简体中文</a> (simplified Chinese) and <a href="https://github.com/SantonyChoi/what-happens-when-KR">한국어</a> (Korean). NOTE: these
  16. have not been reviewed by the alex/what-happens-when maintainers.</p>
  19. <a name="user-content-the-g-key-is-pressed"/>
  20. <h3><a id="user-content-the-g-key-is-pressed" class="anchor" aria-hidden="true" href="#the-g-key-is-pressed"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>The "g" key is pressed</h3>
  21. <p>The following sections explain the physical keyboard actions
  22. and the OS interrupts. When you press the key "g" the browser receives the
  23. event and the auto-complete functions kick in.
  24. Depending on your browser's algorithm and whether you are in
  25. private/incognito mode, various suggestions will be presented
  26. to you in the dropdown below the URL bar. Most of these algorithms sort
  27. and prioritize results based on search history, bookmarks, cookies, and
  28. popular searches from the internet as a whole. As you are typing
  29. "google.com" many blocks of code run and the suggestions will be refined
  30. with each key press. It may even suggest "google.com" before you finish typing
  31. it.</p>
  32. <a name="user-content-the-enter-key-bottoms-out"/>
  33. <h3><a id="user-content-the-enter-key-bottoms-out" class="anchor" aria-hidden="true" href="#the-enter-key-bottoms-out"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>The "enter" key bottoms out</h3>
  34. <p>To pick a zero point, let's choose the Enter key on the keyboard hitting the
  35. bottom of its range. At this point, an electrical circuit specific to the enter
  36. key is closed (either directly or capacitively). This allows a small amount of
  37. current to flow into the logic circuitry of the keyboard, which scans the state
  38. of each key switch, debounces the electrical noise of the rapid intermittent
  39. closure of the switch, and converts it to a keycode integer, in this case 13.
  40. The keyboard controller then encodes the keycode for transport to the computer.
  41. This is now almost universally over a Universal Serial Bus (USB) or Bluetooth
  42. connection, but historically has been over PS/2 or ADB connections.</p>
  43. <p><em>In the case of the USB keyboard:</em></p>
  44. <ul>
  45. <li>The USB circuitry of the keyboard is powered by the 5V supply provided over
  46. pin 1 from the computer's USB host controller.</li>
  47. <li>The keycode generated is stored by internal keyboard circuitry memory in a
  48. register called "endpoint".</li>
  49. <li>The host USB controller polls that "endpoint" every ~10ms (minimum value
  50. declared by the keyboard), so it gets the keycode value stored on it.</li>
  51. <li>This value goes to the USB SIE (Serial Interface Engine) to be converted into
  52. one or more USB packets that follow the low-level USB protocol.</li>
  53. <li>Those packets are sent as a differential electrical signal over the D+ and D-
  54. pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID
  55. (Human Interface Device) is always declared to be a "low speed device"
  56. (USB 2.0 compliance).</li>
  57. <li>This serial signal is then decoded at the computer's host USB controller, and
  58. interpreted by the computer's Human Interface Device (HID) universal keyboard
  59. device driver. The value of the key is then passed into the operating
  60. system's hardware abstraction layer.</li>
  61. </ul>
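<p>As a rough illustration of the data the host reads when it polls the keyboard, the sketch below decodes an 8-byte input report in the USB HID boot-protocol layout (one modifier byte, one reserved byte, up to six key usage IDs). The report layout and the usage ID <code>0x28</code> for Enter are taken from the USB HID usage tables rather than from the text above, and the name <code>decode_boot_report</code> and the sample bytes are hypothetical.</p>

```python
# Sketch: decode a USB HID boot-protocol keyboard report (8 bytes).
# Assumed layout: [modifiers, reserved, key1, key2, key3, key4, key5, key6].
ENTER_USAGE_ID = 0x28  # "Keyboard Return (ENTER)" in the HID usage tables

# Bit 0 of the modifier byte is LeftCtrl, bit 7 is RightGUI.
MODIFIER_NAMES = [
    "LeftCtrl", "LeftShift", "LeftAlt", "LeftGUI",
    "RightCtrl", "RightShift", "RightAlt", "RightGUI",
]


def decode_boot_report(report: bytes) -> dict:
    """Return the pressed modifiers and key usage IDs from one report."""
    if len(report) != 8:
        raise ValueError("boot keyboard reports are 8 bytes long")
    modifiers = [name for bit, name in enumerate(MODIFIER_NAMES)
                 if report[0] & (1 << bit)]
    # A usage ID of 0 means "no key in this slot".
    keys = [code for code in report[2:] if code != 0]
    return {"modifiers": modifiers, "keys": keys}


# Hypothetical report captured while only Enter is held down.
sample = bytes([0x00, 0x00, ENTER_USAGE_ID, 0, 0, 0, 0, 0])
decoded = decode_boot_report(sample)
print(decoded)
```

<p>The HID usage ID is a separate numbering from the keycodes and virtual-key codes used higher up the stack; the driver layers described in the following sections translate between them.</p>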
  62. <p><em>In the case of Virtual Keyboard (as in touch screen devices):</em></p>
  63. <ul>
  64. <li>When the user puts their finger on a modern capacitive touch screen, a
  65. tiny amount of current gets transferred to the finger. This completes the
  66. circuit through the electrostatic field of the conductive layer and
  67. creates a voltage drop at that point on the screen. The
  68. <code>screen controller</code> then raises an interrupt reporting the coordinate of
  69. the key press.</li>
  70. <li>Then the mobile OS notifies the currently focused application of a press event
  71. in one of its GUI elements (which now is the virtual keyboard application
  72. buttons).</li>
  73. <li>The virtual keyboard can now raise a software interrupt for sending a
  74. 'key pressed' message back to the OS.</li>
  75. <li>This interrupt notifies the currently focused application of a 'key pressed'
  76. event.</li>
  77. </ul>
  78. <a name="user-content-interrupt-fires-not-for-usb-keyboards"/>
  79. <h3><a id="user-content-interrupt-fires-not-for-usb-keyboards" class="anchor" aria-hidden="true" href="#interrupt-fires-not-for-usb-keyboards"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Interrupt fires [NOT for USB keyboards]</h3>
  80. <p>The keyboard sends signals on its interrupt request line (IRQ), which is mapped
  81. to an <code>interrupt vector</code> (integer) by the interrupt controller. The CPU uses
  82. the <code>Interrupt Descriptor Table</code> (IDT) to map the interrupt vectors to
  83. functions (<code>interrupt handlers</code>) which are supplied by the kernel. When an
  84. interrupt arrives, the CPU indexes the IDT with the interrupt vector and runs
  85. the appropriate handler. Thus, the kernel is entered.</p>
  86. <a name="user-content-on-windows-a-wm-keydown-message-is-sent-to-the-app"/>
  87. <h3><a id="user-content-on-windows-a-wm_keydown-message-is-sent-to-the-app" class="anchor" aria-hidden="true" href="#on-windows-a-wm_keydown-message-is-sent-to-the-app"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>(On Windows) A <code>WM_KEYDOWN</code> message is sent to the app</h3>
  88. <p>The HID transport passes the key down event to the <code>KBDHID.sys</code> driver which
  89. converts the HID usage into a scancode. In this case the key corresponds to the virtual-key code
  90. <code>VK_RETURN</code> (<code>0x0D</code>). The <code>KBDHID.sys</code> driver interfaces with the
  91. <code>KBDCLASS.sys</code> (keyboard class driver). This driver is responsible for
  92. handling all keyboard and keypad input in a secure manner. It then calls into
  93. <code>Win32K.sys</code> (after potentially passing the message through 3rd party
  94. keyboard filters that are installed). This all happens in kernel mode.</p>
  95. <p><code>Win32K.sys</code> figures out what window is the active window through the
  96. <code>GetForegroundWindow()</code> API. This API provides the window handle of the
  97. browser's address box. The main Windows "message pump" then calls
  98. <code>SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)</code>. <code>lParam</code> is a bitmask
  99. that indicates further information about the keypress: repeat count (0 in this
  100. case), the actual scan code (can be OEM dependent, but generally wouldn't be
  101. for <code>VK_RETURN</code>), whether extended keys (e.g. alt, shift, ctrl) were also
  102. pressed (they weren't), and some other state.</p>
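<p>To make the <code>lParam</code> bitmask more concrete, here is a minimal sketch that unpacks it. The bit positions follow Microsoft's published layout for <code>WM_KEYDOWN</code> (repeat count in bits 0-15, scan code in bits 16-23, extended-key flag in bit 24, and so on). The function name <code>decode_keydown_lparam</code> and the sample value are hypothetical and chosen only for illustration.</p>

```python
def decode_keydown_lparam(lparam: int) -> dict:
    """Split a WM_KEYDOWN lParam into its documented bit fields."""
    return {
        "repeat_count": lparam & 0xFFFF,          # bits 0-15
        "scan_code": (lparam >> 16) & 0xFF,       # bits 16-23
        "is_extended": bool((lparam >> 24) & 1),  # bit 24
        "context_code": (lparam >> 29) & 1,       # bit 29
        "previous_state": (lparam >> 30) & 1,     # bit 30
        "transition_state": (lparam >> 31) & 1,   # bit 31
    }


# Hypothetical value: repeat count 1, scan code 0x1C (Enter on many PC
# keyboards), no other flags set.
sample_lparam = (0x1C << 16) | 1
info = decode_keydown_lparam(sample_lparam)
print(info)
```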
  103. <p>The Windows <code>SendMessage</code> API is a straightforward function that
  104. adds the message to a queue for the particular window handle (<code>hWnd</code>).
  105. Later, the main message processing function (called a <code>WindowProc</code>) assigned
  106. to the <code>hWnd</code> is called in order to process each message in the queue.</p>
  107. <p>The window (<code>hWnd</code>) that is active is actually an edit control and the
  108. <code>WindowProc</code> in this case has a message handler for <code>WM_KEYDOWN</code> messages.
  109. This code looks within the 3rd parameter that was passed to <code>SendMessage</code>
  110. (<code>wParam</code>) and, because it is <code>VK_RETURN</code>, knows the user has hit the ENTER
  111. key.</p>
  112. <a name="user-content-on-os-x-a-keydown-nsevent-is-sent-to-the-app"/>
  113. <h3><a id="user-content-on-os-x-a-keydown-nsevent-is-sent-to-the-app" class="anchor" aria-hidden="true" href="#on-os-x-a-keydown-nsevent-is-sent-to-the-app"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>(On OS X) A <code>KeyDown</code> NSEvent is sent to the app</h3>
  114. <p>The interrupt signal triggers an interrupt event in the I/O Kit kext keyboard
  115. driver. The driver translates the signal into a key code which is passed to the
  116. OS X <code>WindowServer</code> process. As a result, the <code>WindowServer</code> dispatches an
  117. event to any appropriate (e.g. active or listening) applications through their
  118. Mach port where it is placed into an event queue. Events can then be read from
  119. this queue by threads with sufficient privileges calling the
  120. <code>mach_ipc_dispatch</code> function. This most commonly occurs through, and is
  121. handled by, an <code>NSApplication</code> main event loop, via an <code>NSEvent</code> of
  122. <code>NSEventType</code> <code>KeyDown</code>.</p>
  123. <a name="user-content-on-gnu-linux-the-xorg-server-listens-for-keycodes"/>
  124. <h3><a id="user-content-on-gnulinux-the-xorg-server-listens-for-keycodes" class="anchor" aria-hidden="true" href="#on-gnulinux-the-xorg-server-listens-for-keycodes"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>(On GNU/Linux) the Xorg server listens for keycodes</h3>
  125. <p>When a graphical <code>X server</code> is used, <code>X</code> will use the generic event
  126. driver <code>evdev</code> to acquire the keypress. A re-mapping of keycodes to scancodes
  127. is made with <code>X server</code> specific keymaps and rules.
  128. When the scancode mapping of the key pressed is complete, the <code>X server</code>
  129. sends the character to the <code>window manager</code> (DWM, metacity, i3, etc), so the
  130. <code>window manager</code> in turn sends the character to the focused window.
  131. The graphical API of the window that receives the character prints the
  132. appropriate font symbol in the appropriate focused field.</p>
  133. <a name="user-content-parse-url"/>
  134. <h3><a id="user-content-parse-url" class="anchor" aria-hidden="true" href="#parse-url"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Parse URL</h3>
  135. <a name="user-content-is-it-a-url-or-a-search-term"/>
  136. <h3><a id="user-content-is-it-a-url-or-a-search-term" class="anchor" aria-hidden="true" href="#is-it-a-url-or-a-search-term"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Is it a URL or a search term?</h3>
  137. <p>When no protocol or valid domain name is given, the browser proceeds to feed
  138. the text given in the address box to the browser's default web search engine.
  139. In many cases the URL has a special piece of text appended to it to tell the
  140. search engine that it came from a particular browser's URL bar.</p>
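<p>A very rough sketch of this decision is shown below. Real browsers use far more elaborate heuristics; the regular expression, the function name <code>resolve_address_bar_input</code>, the search URL <code>https://search.example/</code> and its <code>client=addressbar</code> marker are all hypothetical stand-ins.</p>

```python
import re
from urllib.parse import quote_plus

# Simplified hostname pattern: dot-separated labels ending in a TLD.
HOSTNAME_RE = re.compile(
    r"^(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,63}$"
)

# Hypothetical search engine URL; "client" marks the query as coming
# from the address bar.
SEARCH_TEMPLATE = "https://search.example/?q={query}&client=addressbar"


def resolve_address_bar_input(text: str) -> str:
    """Return the URL to load for whatever was typed in the address bar."""
    text = text.strip()
    # An explicit protocol means the input is already a URL.
    if re.match(r"^[A-Za-z][A-Za-z0-9+.-]*://", text):
        return text
    # No protocol: treat it as a URL only if it looks like a hostname.
    host = text.split("/", 1)[0]
    if " " not in text and HOSTNAME_RE.match(host):
        return "http://" + text
    # Otherwise hand the text to the default search engine.
    return SEARCH_TEMPLATE.format(query=quote_plus(text))


print(resolve_address_bar_input("google.com"))
print(resolve_address_bar_input("what happens when"))
```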
  141. <a name="user-content-convert-non-ascii-unicode-characters-in-hostname"/>
  142. <h3><a id="user-content-convert-non-ascii-unicode-characters-in-hostname" class="anchor" aria-hidden="true" href="#convert-non-ascii-unicode-characters-in-hostname"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Convert non-ASCII Unicode characters in hostname</h3>
  143. <ul>
  144. <li>The browser checks the hostname for characters that are not in <code>a-z</code>,
  145. <code>A-Z</code>, <code>0-9</code>, <code>-</code>, or <code>.</code>.</li>
  146. <li>Since the hostname is <code>google.com</code>, there won't be any, but if there were,
  147. the browser would apply <a href="https://en.wikipedia.org/wiki/Punycode" rel="nofollow">Punycode</a> encoding to the hostname portion of the
  148. URL.</li>
  149. </ul>
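<p>The check and the conversion can be sketched with Python's built-in <code>idna</code> codec, which applies Punycode label by label. This illustrates the encoding only and is not a claim about how any particular browser implements it; the hostname <code>bücher.example</code> and the function name <code>to_ascii_hostname</code> are just examples.</p>

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII form of a hostname, Punycode-encoding if needed."""
    if hostname.isascii():
        # Nothing to convert, as is the case for google.com.
        return hostname
    # The "idna" codec encodes each non-ASCII label with an "xn--" prefix.
    return hostname.encode("idna").decode("ascii")


print(to_ascii_hostname("google.com"))      # google.com
print(to_ascii_hostname("bücher.example"))  # xn--bcher-kva.example
```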
  150. <a name="user-content-check-hsts-list"/>
  151. <h3><a id="user-content-check-hsts-list" class="anchor" aria-hidden="true" href="#check-hsts-list"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Check HSTS list</h3>
  152. <ul>
  153. <li>The browser checks its "preloaded HSTS (HTTP Strict Transport Security)"
  154. list. This is a list of websites that have requested to be contacted via
  155. HTTPS only.</li>
  156. <li>If the website is in the list, the browser sends its request via HTTPS
  157. instead of HTTP. Otherwise, the initial request is sent via HTTP.
  158. (Note that a website can still use the HSTS policy <em>without</em> being in the
  159. HSTS list. The first HTTP request to the website by a user will receive a
  160. response requesting that the user only send HTTPS requests. However, this
  161. single HTTP request could potentially leave the user vulnerable to a
  162. <a href="http://en.wikipedia.org/wiki/SSL_stripping" rel="nofollow">downgrade attack</a>, which is why the HSTS list is included in modern web
  163. browsers.)</li>
  164. </ul>
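<p>The lookup described above could be sketched as follows. The contents of <code>PRELOADED_HSTS</code> are made up for illustration (the real preload list ships with the browser), and the function names are hypothetical.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for the preloaded list.
# Maps host -> whether the policy also covers its subdomains.
PRELOADED_HSTS = {"google.com": True, "example.org": False}


def is_hsts_host(host: str) -> bool:
    """True if the host, or a parent that covers subdomains, is listed."""
    host = host.lower().rstrip(".")
    if host in PRELOADED_HSTS:
        return True
    labels = host.split(".")
    for i in range(1, len(labels)):
        parent = ".".join(labels[i:])
        if PRELOADED_HSTS.get(parent):
            return True
    return False


def apply_hsts(url: str) -> str:
    """Rewrite http:// to https:// for hosts covered by the list."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname and is_hsts_host(parts.hostname):
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


print(apply_hsts("http://google.com/"))
print(apply_hsts("http://example.net/"))
```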
  165. <a name="user-content-dns-lookup"/>
  166. <h3><a id="user-content-dns-lookup" class="anchor" aria-hidden="true" href="#dns-lookup"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>DNS lookup</h3>
  167. <ul>
  168. <li>The browser checks if the domain is in its cache. (To see the DNS cache in
  169. Chrome, go to chrome://net-internals/#dns.)</li>
  170. <li>If not found, the browser calls the <code>gethostbyname</code> library function (varies by
  171. OS) to do the lookup.</li>
  172. <li><code>gethostbyname</code> checks if the hostname can be resolved by reference in the
  173. local <code>hosts</code> file (whose location <a href="https://en.wikipedia.org/wiki/Hosts_%28file%29#Location_in_the_file_system" rel="nofollow">varies by OS</a>) before trying to
  174. resolve the hostname through DNS.</li>
  175. <li>If <code>gethostbyname</code> neither has it cached nor can find it in the <code>hosts</code>
  176. file then it makes a request to the DNS server configured in the network
  177. stack. This is typically the local router or the ISP's caching DNS server.</li>
  178. <li>If the DNS server is on the same subnet the network library follows the
  179. <code>ARP process</code> below for the DNS server.</li>
  180. <li>If the DNS server is on a different subnet, the network library follows
  181. the <code>ARP process</code> below for the default gateway IP.</li>
  182. </ul>
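<p>The last two bullets amount to a subnet membership test. A minimal sketch using Python's <code>ipaddress</code> module follows; the addresses and the function name <code>arp_target</code> are hypothetical.</p>

```python
import ipaddress


def arp_target(interface_cidr: str, dns_server: str, default_gateway: str) -> str:
    """Return the IP whose MAC address must be resolved to reach the DNS server."""
    local_network = ipaddress.ip_interface(interface_cidr).network
    if ipaddress.ip_address(dns_server) in local_network:
        # Same subnet: ARP for the DNS server itself.
        return dns_server
    # Different subnet: the frame goes to the default gateway instead.
    return default_gateway


# DNS served by the local router.
print(arp_target("192.168.1.23/24", "192.168.1.1", "192.168.1.1"))
# DNS server on another network, reached through the gateway.
print(arp_target("192.168.1.23/24", "8.8.8.8", "192.168.1.1"))
```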
  183. <a name="user-content-arp-process"/>
  184. <h3><a id="user-content-arp-process" class="anchor" aria-hidden="true" href="#arp-process"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>ARP process</h3>
  185. <p>In order to send an ARP (Address Resolution Protocol) broadcast the network
  186. stack library needs the target IP address to look up. It also needs to know the
  187. MAC address of the interface it will use to send out the ARP broadcast.</p>
  188. <p>The ARP cache is first checked for an ARP entry for our target IP. If it is in
  189. the cache, the library function returns the result: Target IP = MAC.</p>
  190. <p>If the entry is not in the ARP cache:</p>
  191. <ul>
  192. <li>The route table is looked up, to see if the Target IP address is on any of
  193. the subnets on the local route table. If it is, the library uses the
  194. interface associated with that subnet. If it is not, the library uses the
  195. interface that has the subnet of our default gateway.</li>
  196. <li>The MAC address of the selected network interface is looked up.</li>
  197. <li>The network library sends a Layer 2 (data link layer of the <a href="https://en.wikipedia.org/wiki/OSI_model" rel="nofollow">OSI model</a>)
  198. ARP request:</li>
  199. </ul>
  200. <p><code>ARP Request</code>:</p>
  201. <pre>Sender MAC: interface:mac:address:here
  202. Sender IP: interface.ip.goes.here
  203. Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)
  204. Target IP: target.ip.goes.here
  205. </pre>
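<p>For readers who want to see the bytes, the sketch below packs an ARP request into an Ethernet frame using the field layout from RFC 826. It makes one assumption that differs in presentation from the listing above: the broadcast address is placed in the Ethernet destination field, while the target hardware address inside the ARP body is left as zeros, as is common in practice. The MAC and IP values and the name <code>build_arp_request</code> are hypothetical.</p>

```python
import socket
import struct


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request for target_ip."""
    broadcast = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", broadcast, sender_mac, 0x0806)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: 1 = request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                  # target MAC is not known yet
        socket.inet_aton(target_ip),
    )
    return ethernet_header + arp_body


frame = build_arp_request(bytes.fromhex("020000000001"), "192.168.1.23", "192.168.1.1")
print(len(frame), frame.hex())
```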
  206. <p>Depending on what type of hardware is between the computer and the router:</p>
  207. <p>Directly connected:</p>
  208. <ul>
  209. <li>If the computer is directly connected to the router the router responds
  210. with an <code>ARP Reply</code> (see below)</li>
  211. </ul>
  212. <p>Hub:</p>
  213. <ul>
  214. <li>If the computer is connected to a hub, the hub will broadcast the ARP
  215. request out all other ports. If the router is connected on the same "wire",
  216. it will respond with an <code>ARP Reply</code> (see below).</li>
  217. </ul>
  218. <p>Switch:</p>
  219. <ul>
  220. <li>If the computer is connected to a switch, the switch will check its local
  221. CAM/MAC table to see which port has the MAC address we are looking for. If
  222. the switch has no entry for the MAC address it will rebroadcast the ARP
  223. request to all other ports.</li>
  224. <li>If the switch has an entry in the MAC/CAM table it will send the ARP request
  225. to the port that has the MAC address we are looking for.</li>
  226. <li>If the router is on the same "wire", it will respond with an <code>ARP Reply</code>
  227. (see below)</li>
  228. </ul>
  229. <p><code>ARP Reply</code>:</p>
  230. <pre>Sender MAC: target:mac:address:here
  231. Sender IP: target.ip.goes.here
  232. Target MAC: interface:mac:address:here
  233. Target IP: interface.ip.goes.here
  234. </pre>
  235. <p>Now that the network library has the MAC address of either our DNS server or
  236. the default gateway it can resume its DNS process:</p>
  237. <ul>
  238. <li>Port 53 is opened to send a UDP request to the DNS server (if the response size
  239. is too large, TCP will be used instead).</li>
  240. <li>If the local/ISP DNS server does not have it, then a recursive search is
  241. requested and that flows up the list of DNS servers until the SOA is reached,
  242. and if found an answer is returned.</li>
  243. </ul>
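<p>As a sketch of what travels in that UDP request, the code below assembles a minimal DNS query for an A record using the message layout from RFC 1035. The query ID and the function names are hypothetical, and <code>send_dns_query</code> is defined but not called, since it needs a reachable DNS server.</p>

```python
import socket
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of hostname."""
    # Header: ID, flags (only "recursion desired" set), 1 question,
    # 0 answer, 0 authority and 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE = A, QCLASS = IN
    return header + question


def send_dns_query(server_ip: str, hostname: str, timeout: float = 3.0) -> bytes:
    """Send the query over UDP to port 53 and return the raw response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(hostname), (server_ip, 53))
        response, _ = sock.recvfrom(512)
        return response


query = build_dns_query("google.com")
print(len(query), query.hex())
```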
  244. <a name="user-content-opening-of-a-socket"/>
  245. <h3><a id="user-content-opening-of-a-socket" class="anchor" aria-hidden="true" href="#opening-of-a-socket"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Opening of a socket</h3>
  246. <p>Once the browser receives the IP address of the destination server, it takes
  247. that and the given port number from the URL (the HTTP protocol defaults to port
  248. 80, and HTTPS to port 443), and makes a call to the system library function
  249. named <code>socket</code> and requests a TCP socket stream - <code>AF_INET/AF_INET6</code> and
  250. <code>SOCK_STREAM</code>.</p>
  251. <ul>
  252. <li>This request is first passed to the Transport Layer where a TCP segment is
  253. crafted. The destination port is added to the header, and a source port is
  254. chosen from within the kernel's dynamic port range (ip_local_port_range in
  255. Linux).</li>
  256. <li>This segment is sent to the Network Layer, which wraps an additional IP
  257. header. The IP address of the destination server as well as that of the
  258. current machine is inserted to form a packet.</li>
  259. <li>The packet next arrives at the Link Layer. A frame header is added that
  260. includes the MAC address of the machine's NIC as well as the MAC address of
  261. the gateway (local router). As before, if the kernel does not know the MAC
  262. address of the gateway, it must broadcast an ARP query to find it.</li>
  263. </ul>
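<p>From the application's point of view, all of the layering above hides behind a couple of calls. The sketch below creates the TCP stream socket described in this section. The helper <code>connect_to</code> is hypothetical and is left uncalled because it needs network access; when it runs, the kernel picks the source port and builds the headers.</p>

```python
import socket

# Request a TCP stream socket over IPv4 (AF_INET6 would select IPv6).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
family, kind = sock.family, sock.type
print(family, kind)
sock.close()


def connect_to(host: str, port: int = 80, timeout: float = 5.0) -> socket.socket:
    """Resolve host and open a TCP connection (port 80 is the HTTP default)."""
    # create_connection tries each resolved address until one connects.
    return socket.create_connection((host, port), timeout=timeout)
```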
  264. <p>At this point the packet is ready to be transmitted through one of the following kinds of connection:</p>
  265. <p>For most home or small business Internet connections the packet will pass from
  266. your computer, possibly through a local network, and then through a modem
  267. (MOdulator/DEModulator) which converts digital 1's and 0's into an analog
  268. signal suitable for transmission over telephone, cable, or wireless telephony
  269. connections. On the other end of the connection is another modem which converts
  270. the analog signal back into digital data to be processed by the next <a href="https://en.wikipedia.org/wiki/Computer_network#Network_nodes" rel="nofollow">network
  271. node</a> where the from and to addresses would be analyzed further.</p>
  272. <p>Most larger businesses and some newer residential connections will have fiber
  273. or direct Ethernet connections in which case the data remains digital and
  274. is passed directly to the next <a href="https://en.wikipedia.org/wiki/Computer_network#Network_nodes" rel="nofollow">network node</a> for processing.</p>
  275. <p>Eventually, the packet will reach the router managing the local subnet. From
  276. there, it will continue to travel to the autonomous system's (AS) border
  277. routers, other ASes, and finally to the destination server. Each router along
  278. the way extracts the destination address from the IP header and routes it to
  279. the appropriate next hop. The time to live (TTL) field in the IP header is
  280. decremented by one at each router the packet passes through. The packet will be dropped if
  281. the TTL field reaches zero or if the current router has no space in its queue
  282. (perhaps due to network congestion).</p>
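<p>The TTL rule can be illustrated with a toy simulation. The router names and the function <code>forward_packet</code> are hypothetical, and drops caused by full queues are not modeled.</p>

```python
def forward_packet(ttl, routers):
    """Walk a packet through routers, decrementing the TTL at each one."""
    for router in routers:
        ttl -= 1
        if ttl <= 0:
            # The router discards the packet instead of forwarding it.
            return {"delivered": False, "dropped_at": router, "ttl": 0}
    return {"delivered": True, "dropped_at": None, "ttl": ttl}


path = ["home-router", "isp-edge", "as-border", "destination-edge"]
print(forward_packet(64, path))  # reaches the destination with TTL 60
print(forward_packet(3, path))   # dropped at "as-border"
```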
  283. <p>This send and receive happens multiple times following the TCP connection flow:</p>
  284. <ul>
  285. <li>Client chooses an initial sequence number (ISN) and sends the packet to the
  286. server with the SYN bit set to indicate it is setting the ISN</li>
  287. <li><dl>
  288. <dt>Server receives SYN and if it's in an agreeable mood:</dt>
  289. <dd><ul>
  290. <li>Server chooses its own initial sequence number</li>
  291. <li>Server sets SYN to indicate it is choosing its ISN</li>
  292. <li>Server copies the (client ISN +1) to its ACK field and adds the ACK flag
  293. to indicate it is acknowledging receipt of the first packet</li>
  294. </ul>
  295. </dd>
  296. </dl>
  297. </li>
  298. <li><dl>
  299. <dt>Client acknowledges the connection by sending a packet:</dt>
  300. <dd><ul>
  301. <li>Increases its own sequence number</li>
  302. <li>Increases the receiver acknowledgment number</li>
  303. <li>Sets ACK field</li>
  304. </ul>
  305. </dd>
  306. </dl>
  307. </li>
  308. <li><dl>
  309. <dt>Data is transferred as follows:</dt>
  310. <dd><ul>
  311. <li>As one side sends N data bytes, it increases its SEQ by that number</li>
  312. <li>When the other side acknowledges receipt of that packet (or a string of
  313. packets), it sends an ACK packet with the ACK value equal to the next
  314. sequence number it expects to receive from the other</li>
  315. </ul>
  316. </dd>
  317. </dl>
  318. </li>
  319. <li><dl>
  320. <dt>To close the connection:</dt>
  321. <dd><ul>
  322. <li>The closer sends a FIN packet</li>
  323. <li>The other side ACKs the FIN packet and sends its own FIN</li>
  324. <li>The closer acknowledges the other side's FIN with an ACK</li>
  325. </ul>
  326. </dd>
  327. </dl>
  328. </li>
  329. </ul>
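<p>The sequence-number bookkeeping of the opening handshake can be sketched as plain arithmetic. This is a simulation, not a real TCP stack; the dictionary representation and the function name <code>three_way_handshake</code> are hypothetical.</p>

```python
import random

MODULUS = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three segments exchanged to open a TCP connection."""
    if client_isn is None:
        client_isn = random.getrandbits(32)
    if server_isn is None:
        server_isn = random.getrandbits(32)
    # 1. Client announces its initial sequence number.
    syn = {"flags": "SYN", "seq": client_isn}
    # 2. Server announces its own ISN and acknowledges client ISN + 1.
    syn_ack = {"flags": "SYN,ACK", "seq": server_isn,
               "ack": (client_isn + 1) % MODULUS}
    # 3. Client acknowledges server ISN + 1.
    ack = {"flags": "ACK", "seq": (client_isn + 1) % MODULUS,
           "ack": (server_isn + 1) % MODULUS}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```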
  330. <a name="user-content-tls-handshake"/>
  331. <h3><a id="user-content-tls-handshake" class="anchor" aria-hidden="true" href="#tls-handshake"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>TLS handshake</h3>
  332. <ul>
  333. <li>The client computer sends a <code>ClientHello</code> message to the server with its
  334. Transport Layer Security (TLS) version, list of cipher algorithms and
  335. compression methods available.</li>
  336. <li>The server replies with a <code>ServerHello</code> message to the client with the
  337. TLS version, selected cipher, selected compression methods and the server's
  338. public certificate signed by a CA (Certificate Authority). The certificate
  339. contains a public key that will be used by the client to encrypt the rest of
  340. the handshake until a symmetric key can be agreed upon.</li>
  341. <li>The client verifies the server's digital certificate against its list of
  342. trusted CAs. If trust can be established based on the CA, the client
  343. generates a string of pseudo-random bytes and encrypts this with the server's
  344. public key. These random bytes can be used to determine the symmetric key.</li>
  345. <li>The server decrypts the random bytes using its private key and uses these
  346. bytes to generate its own copy of the symmetric master key.</li>
  347. <li>The client sends a <code>Finished</code> message to the server, encrypting a hash of
  348. the transmission up to this point with the symmetric key.</li>
  349. <li>The server generates its own hash, and then decrypts the client-sent hash
  350. to verify that it matches. If it does, it sends its own <code>Finished</code> message
  351. to the client, also encrypted with the symmetric key.</li>
  352. <li>From now on the TLS session transmits the application (HTTP) data encrypted
  353. with the agreed symmetric key.</li>
  354. </ul>
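<p>The <code>Finished</code> check can be illustrated with a small sketch. It is
a simplification under stated assumptions: instead of encrypting the hash, it
keys the hash with HMAC-SHA256, and the function names
<code>finished_tag</code> and <code>verify_finished</code> are hypothetical.
Real TLS derives its keys and verification data differently.</p>

```python
import hashlib
import hmac


def finished_tag(symmetric_key: bytes, transcript: list) -> bytes:
    """Hash every handshake message seen so far, then key the hash."""
    digest = hashlib.sha256(b"".join(transcript)).digest()
    return hmac.new(symmetric_key, digest, hashlib.sha256).digest()


def verify_finished(symmetric_key: bytes, transcript: list, received: bytes) -> bool:
    """Recompute the tag locally and compare it in constant time."""
    expected = finished_tag(symmetric_key, transcript)
    return hmac.compare_digest(expected, received)
```

<p>Both sides feed in the same list of handshake messages. If an attacker
altered any message in transit, the two transcripts differ and the check
fails.</p>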
  355. <a name="user-content-http-protocol"/>
  356. <h3><a id="user-content-http-protocol" class="anchor" aria-hidden="true" href="#http-protocol"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>HTTP protocol</h3>
  357. <p>If the web browser used was written by Google, instead of sending an HTTP
  358. request to retrieve the page, it will send a request to try and negotiate with
  359. the server an "upgrade" from HTTP to the SPDY protocol.</p>
  360. <p>If the client is using the HTTP protocol and does not support SPDY, it sends a
  361. request to the server of the form:</p>
  362. <pre>GET / HTTP/1.1
  363. Host: google.com
  364. Connection: close
  365. [other headers]
  366. </pre>
  367. <p>where <code>[other headers]</code> refers to a series of colon-separated key-value pairs
  368. formatted as per the HTTP specification and separated by single new lines.
  369. (This assumes the web browser being used doesn't have any bugs violating the
  370. HTTP spec. This also assumes that the web browser is using <code>HTTP/1.1</code>,
  371. otherwise it may not include the <code>Host</code> header in the request and the version
  372. specified in the <code>GET</code> request will either be <code>HTTP/1.0</code> or <code>HTTP/0.9</code>.)</p>
  373. <p>HTTP/1.1 defines the "close" connection option for the sender to signal that
  374. the connection will be closed after completion of the response. For example,</p>
  375. <blockquote>
  376. Connection: close</blockquote>
  377. <p>HTTP/1.1 applications that do not support persistent connections MUST include
  378. the "close" connection option in every message.</p>
  379. <p>After sending the request and headers, the web browser sends a single blank
  380. line to the server indicating that the content of the request is done.</p>
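<p>As a rough sketch, the request above could be assembled like this. The
helper name <code>build_request</code> is hypothetical. Note that on the wire
HTTP/1.1 ends each line with CRLF (<code>\r\n</code>), so the terminating blank
line is an empty line followed by CRLF.</p>

```python
from typing import Dict, Optional


def build_request(host: str, path: str = "/",
                  headers: Optional[Dict[str, str]] = None) -> bytes:
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```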
  381. <p>The server responds with a status code denoting the outcome of the request,
  382. in a response of the form:</p>
  383. <pre>HTTP/1.1 200 OK
  384. [response headers]
  385. </pre>
  386. <p>This is followed by a single blank line and then a payload of the HTML content of
  387. <code>www.google.com</code>. The server may then either close the connection, or if
  388. headers sent by the client requested it, keep the connection open to be reused
  389. for further requests.</p>
  390. <p>If the HTTP headers sent by the web browser included sufficient information for
  391. the web server to determine if the version of the file cached by the web
  392. browser has been unmodified since the last retrieval (i.e. if the web browser
  393. sent the cached <code>ETag</code> value in an <code>If-None-Match</code> header), it may instead respond
  394. with a response of the form:</p>
  395. <pre>HTTP/1.1 304 Not Modified
  396. [response headers]
  397. </pre>
  398. <p>and no payload, and the web browser instead retrieves the HTML from its cache.</p>
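<p>A minimal sketch of the server-side decision, assuming the browser sends its
cached validator in an <code>If-None-Match</code> header. The function name
<code>respond</code> is hypothetical, and header names are matched exactly here
although real servers treat them case-insensitively.</p>

```python
def respond(request_headers: dict, current_etag: str, body: bytes):
    """Return (status line, response headers, payload) for a GET."""
    if request_headers.get("If-None-Match") == current_etag:
        # The cached copy is still current: no payload is sent.
        return "HTTP/1.1 304 Not Modified", {"ETag": current_etag}, b""
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return "HTTP/1.1 200 OK", headers, body
```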
  399. <p>After parsing the HTML, the web browser (and server) repeats this process
  400. for every resource (image, CSS, favicon.ico, etc.) referenced by the HTML page,
  401. except instead of <code>GET / HTTP/1.1</code> the request will be
  402. <code>GET /$(URL relative to www.google.com) HTTP/1.1</code>.</p>
  403. <p>If the HTML referenced a resource on a different domain than
  404. <code>www.google.com</code>, the web browser goes back to the steps involved in
  405. resolving the other domain, and follows all steps up to this point for that
  406. domain. The <code>Host</code> header in the request will be set to the appropriate
  407. server name instead of <code>google.com</code>.</p>
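<p>How a referenced resource maps to a host and a request line can be sketched
with Python's <code>urllib.parse</code>. The helper name
<code>request_for</code> is hypothetical. A different host in the result means
the browser has to go through name resolution and connection setup again.</p>

```python
from urllib.parse import urljoin, urlsplit


def request_for(page_url: str, reference: str):
    """Resolve a reference found in the page to (Host value, request line)."""
    target = urlsplit(urljoin(page_url, reference))
    path = target.path or "/"
    if target.query:
        path += "?" + target.query
    return target.netloc, f"GET {path} HTTP/1.1"
```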
  408. <a name="user-content-http-server-request-handle"/>
  409. <h3><a id="user-content-http-server-request-handle" class="anchor" aria-hidden="true" href="#http-server-request-handle"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>HTTP Server Request Handle</h3>
  410. <p>The HTTPD (HTTP Daemon) server is the one handling the requests/responses on
  411. the server side. The most common HTTPD servers are Apache or nginx for Linux
  412. and IIS for Windows.</p>
  413. <ul>
  414. <li>The HTTPD (HTTP Daemon) receives the request.</li>
  415. <li><dl>
  416. <dt>The server breaks down the request to the following parameters:</dt>
  417. <dd><ul>
  418. <li>HTTP Request Method (either <code>GET</code>, <code>HEAD</code>, <code>POST</code>, <code>PUT</code>,
  419. <code>DELETE</code>, <code>CONNECT</code>, <code>OPTIONS</code>, or <code>TRACE</code>). In the case of a URL
  420. entered directly into the address bar, this will be <code>GET</code>.</li>
  421. <li>Domain, in this case - google.com.</li>
  422. <li>Requested path/page, in this case - / (as no specific path/page was
  423. requested, / is the default path).</li>
  424. </ul>
  425. </dd>
  426. </dl>
  427. </li>
  428. <li>The server verifies that there is a Virtual Host configured on the server
  429. that corresponds with google.com.</li>
  430. <li>The server verifies that google.com can accept GET requests.</li>
  431. <li>The server verifies that the client is allowed to use this method
  432. (by IP, authentication, etc.).</li>
  433. <li>If the server has a rewrite module installed (like mod_rewrite for Apache or
  434. URL Rewrite for IIS), it tries to match the request against one of the
  435. configured rules. If a matching rule is found, the server uses that rule to
  436. rewrite the request.</li>
  437. <li>The server pulls the content that corresponds with the request. In our
  438. case it falls back to the index file, as "/" is the main file
  439. (some cases can override this, but this is the most common method).</li>
  440. <li>The server parses the file according to the handler. If Google
  441. is running on PHP, the server uses PHP to interpret the index file, and
  442. streams the output to the client.</li>
  443. </ul>
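<p>The first few steps above can be sketched as follows. Everything here is
illustrative: the <code>VIRTUAL_HOSTS</code> table, the
<code>INDEX_FILE</code> name and the <code>parse_request</code> and
<code>route</code> helpers are assumptions, not how Apache, nginx or IIS are
actually configured or implemented.</p>

```python
# Hypothetical configuration: host name -> methods that host accepts.
VIRTUAL_HOSTS = {"google.com": {"GET", "HEAD"}}
INDEX_FILE = "/index.html"  # assumed name of the fallback index file


def parse_request(raw: bytes):
    """Split a raw request into (method, host, path)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, headers.get("host", ""), path


def route(method: str, host: str, path: str) -> str:
    """Check the virtual host and method, then pick the file to serve."""
    allowed = VIRTUAL_HOSTS.get(host)
    if allowed is None:
        return "no matching virtual host"
    if method not in allowed:
        return "method not allowed"
    # "/" falls back to the index file.
    return INDEX_FILE if path == "/" else path
```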
  444. <a name="user-content-behind-the-scenes-of-the-browser"/>
  445. <h3><a id="user-content-behind-the-scenes-of-the-browser" class="anchor" aria-hidden="true" href="#behind-the-scenes-of-the-browser"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Behind the scenes of the Browser</h3>
  446. <p>Once the server supplies the resources (HTML, CSS, JS, images, etc.)
  447. to the browser, they undergo the following process:</p>
  448. <ul>
  449. <li>Parsing - HTML, CSS, JS</li>
  450. <li>Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree →
  451. Painting the render tree</li>
  452. </ul>
  453. <a name="user-content-browser"/>
  454. <h3><a id="user-content-browser" class="anchor" aria-hidden="true" href="#browser"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Browser</h3>
  455. <p>The browser's functionality is to present the web resource you choose, by
  456. requesting it from the server and displaying it in the browser window.
  457. The resource is usually an HTML document, but may also be a PDF,
  458. image, or some other type of content. The location of the resource is
  459. specified by the user using a URI (Uniform Resource Identifier).</p>
  460. <p>The way the browser interprets and displays HTML files is specified
  461. in the HTML and CSS specifications. These specifications are maintained
  462. by the W3C (World Wide Web Consortium) organization, which is the
  463. standards organization for the web.</p>
  464. <p>Browser user interfaces have a lot in common with each other. Among the
  465. common user interface elements are:</p>
  466. <ul>
  467. <li>An address bar for inserting a URI</li>
  468. <li>Back and forward buttons</li>
  469. <li>Bookmarking options</li>
  470. <li>Refresh and stop buttons for refreshing or stopping the loading of
  471. current documents</li>
  472. <li>Home button that takes you to your home page</li>
  473. </ul>
  474. <p><strong>Browser High Level Structure</strong></p>
  475. <p>The components of a browser are:</p>
  476. <ul>
  477. <li><strong>User interface:</strong> The user interface includes the address bar,
  478. back/forward buttons, bookmarking menu, etc. This is every part of the browser
  479. display except the window where you see the requested page.</li>
  480. <li><strong>Browser engine:</strong> The browser engine marshals actions between the UI
  481. and the rendering engine.</li>
  482. <li><strong>Rendering engine:</strong> The rendering engine is responsible for displaying
  483. requested content. For example if the requested content is HTML, the
  484. rendering engine parses HTML and CSS, and displays the parsed content on
  485. the screen.</li>
  486. <li><strong>Networking:</strong> The networking layer handles network calls such as HTTP requests,
  487. using different implementations for different platforms behind a
  488. platform-independent interface.</li>
  489. <li><strong>UI backend:</strong> The UI backend is used for drawing basic widgets like combo
  490. boxes and windows. This backend exposes a generic interface that is not
  491. platform specific.
  492. Underneath it uses operating system user interface methods.</li>
  493. <li><strong>JavaScript engine:</strong> The JavaScript engine is used to parse and
  494. execute JavaScript code.</li>
  495. <li><strong>Data storage:</strong> The data storage is a persistence layer. The browser may
  496. need to save all sorts of data locally, such as cookies. Browsers also
  497. support storage mechanisms such as localStorage, IndexedDB, WebSQL and
  498. FileSystem.</li>
  499. </ul>
  500. <a name="user-content-html-parsing"/>
  501. <h3><a id="user-content-html-parsing" class="anchor" aria-hidden="true" href="#html-parsing"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>HTML parsing</h3>
  502. <p>The rendering engine starts getting the contents of the requested
  503. document from the networking layer. This will usually be done in 8kB chunks.</p>
  504. <p>The primary job of the HTML parser is to parse the HTML markup into a parse tree.</p>
  505. <p>The output tree (the "parse tree") is a tree of DOM element and attribute
  506. nodes. DOM is short for Document Object Model. It is the object representation
  507. of the HTML document and the interface of HTML elements to the outside world
  508. like JavaScript. The root of the tree is the "Document" object. Prior to
  509. any manipulation via scripting, the DOM has an almost one-to-one relation to
  510. the markup.</p>
  511. <p><strong>The parsing algorithm</strong></p>
  512. <p>HTML cannot be parsed using the regular top-down or bottom-up parsers.</p>
  513. <p>The reasons are:</p>
  514. <ul>
  515. <li>The forgiving nature of the language.</li>
  516. <li>The fact that browsers have traditional error tolerance to support well
  517. known cases of invalid HTML.</li>
  518. <li>The parsing process is reentrant. For other languages, the source doesn't
  519. change during parsing, but in HTML, dynamic code (such as script elements
  520. containing document.write() calls) can add extra tokens, so the parsing
  521. process actually modifies the input.</li>
  522. </ul>
  523. <p>Unable to use the regular parsing techniques, the browser utilizes a custom
  524. parser for parsing HTML. The parsing algorithm is described in
  525. detail by the HTML5 specification.</p>
  526. <p>The algorithm consists of two stages: tokenization and tree construction.</p>
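<p>The two stages can be illustrated with Python's standard
<code>html.parser</code> module, which acts as the tokenizer here, feeding a
small hand-written tree builder. This is only a sketch: it is not the HTML5
algorithm, and the <code>Node</code> and <code>TreeBuilder</code> classes and
the short <code>VOID_ELEMENTS</code> set are assumptions made for the
example.</p>

```python
from html.parser import HTMLParser

VOID_ELEMENTS = {"br", "img", "input", "meta", "link", "hr"}  # partial list


class Node:
    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and builds a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.open_elements[-1].children.append(node)
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Error tolerance: close back to the nearest matching open element
        # and silently ignore end tags that match nothing.
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].name == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        if data.strip():
            text = Node("#text", [("value", data.strip())])
            self.open_elements[-1].children.append(text)


def parse(markup: str) -> Node:
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root
```

<p>Feeding it markup with an unclosed <code>&lt;b&gt;</code> or a stray
<code>&lt;/div&gt;</code> still yields a tree rather than an error, which
mirrors the forgiving behavior described above.</p>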
  527. <p><strong>Actions when the parsing is finished</strong></p>
  528. <p>The browser begins fetching external resources linked to the page (CSS, images,
  529. JavaScript files, etc.).</p>
  530. <p>At this stage the browser marks the document as interactive and starts
  531. parsing scripts that are in "deferred" mode: those that should be
  532. executed after the document is parsed. The document state is then
  533. set to "complete" and a "load" event is fired.</p>
  534. <p>Note there is never an "Invalid Syntax" error on an HTML page. Browsers fix
  535. any invalid content and go on.</p>
  536. <a name="user-content-css-interpretation"/>
  537. <h3><a id="user-content-css-interpretation" class="anchor" aria-hidden="true" href="#css-interpretation"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>CSS interpretation</h3>
  538. <ul>
  539. <li>Parse CSS files, <code>&lt;style&gt;</code> tag contents, and <code>style</code> attribute
  540. values using <a href="http://www.w3.org/TR/CSS2/grammar.html" rel="nofollow">"CSS lexical and syntax grammar"</a></li>
  541. <li>Each CSS file is parsed into a <code>StyleSheet object</code>, where each object
  542. contains CSS rules with selectors and objects corresponding to the CSS grammar.</li>
  543. <li>A CSS parser can be top-down or bottom-up when a specific parser generator
  544. is used.</li>
  545. </ul>
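<p>To give a feel for the result of this step, here is a toy parser that turns
a stylesheet into a list of rule objects. It is a regex-based sketch and does
not follow the CSS grammar: at-rules, nesting, and semicolons or braces inside
values are not handled. The name <code>parse_stylesheet</code> and the shape of
the returned dictionaries are assumptions.</p>

```python
import re

RULE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_stylesheet(css: str) -> list:
    """Return a list of {"selectors": [...], "declarations": {...}} rules."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # drop comments
    rules = []
    for selector_text, block in RULE.findall(css):
        declarations = {}
        for declaration in block.split(";"):
            name, sep, value = declaration.partition(":")
            if sep:
                declarations[name.strip()] = value.strip()
        selectors = [s.strip() for s in selector_text.split(",")]
        rules.append({"selectors": selectors, "declarations": declarations})
    return rules
```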
  546. <a name="user-content-page-rendering"/>
  547. <h3><a id="user-content-page-rendering" class="anchor" aria-hidden="true" href="#page-rendering"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Page Rendering</h3>
  548. <ul>
  549. <li>Create a 'Frame Tree' or 'Render Tree' by traversing the DOM nodes, and
  550. calculating the CSS style values for each node.</li>
  551. <li>Calculate the preferred width of each node in the 'Frame Tree' bottom up
  552. by summing the preferred width of the child nodes and the node's
  553. horizontal margins, borders, and padding.</li>
  554. <li>Calculate the actual width of each node top-down by allocating each node's
  555. available width to its children.</li>
  556. <li>Calculate the height of each node bottom-up by applying text wrapping and
  557. summing the child node heights and the node's margins, borders, and padding.</li>
  558. <li>Calculate the coordinates of each node using the information calculated
  559. above.</li>
  560. <li>More complicated steps are taken when elements are <code>floated</code>,
  561. positioned <code>absolutely</code> or <code>relatively</code>, or other complex features
  562. are used. See
  563. <a href="http://dev.w3.org/csswg/css2/" rel="nofollow">http://dev.w3.org/csswg/css2/</a> and <a href="http://www.w3.org/Style/CSS/current-work" rel="nofollow">http://www.w3.org/Style/CSS/current-work</a>
  564. for more details.</li>
  565. <li>Create layers to describe which parts of the page can be animated as a group
  566. without being re-rasterized. Each frame/render object is assigned to a layer.</li>
  567. <li>Textures are allocated for each layer of the page.</li>
  568. <li>The frame/render objects for each layer are traversed and drawing commands
  569. are executed for their respective layer. This may be rasterized by the CPU
  570. or drawn on the GPU directly using D2D/SkiaGL.</li>
  571. <li>All of the above steps may reuse calculated values from the last time the
  572. webpage was rendered, so that incremental changes require less work.</li>
  573. <li>The page layers are sent to the compositing process where they are combined
  574. with layers for other visible content like the browser chrome, iframes
  575. and addon panels.</li>
  576. <li>Final layer positions are computed and the composite commands are issued
  577. via Direct3D/OpenGL. The GPU command buffer(s) are flushed to the GPU for
  578. asynchronous rendering and the frame is sent to the window server.</li>
  579. </ul>
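<p>The width, height and coordinate passes described in the first few steps can
be sketched on a toy tree of vertically stacked blocks. This is a heavy
simplification under stated assumptions: the <code>Box</code> class is
hypothetical, margin, border and padding are lumped into a single
<code>edge</code> value per side, and text wrapping is reduced to dividing a
one-line text width by the available width.</p>

```python
import math
from dataclasses import dataclass, field


@dataclass
class Box:
    text_width: int = 0   # width of a leaf's text if laid out on one line
    line_height: int = 0
    edge: int = 0         # margin + border + padding per side, lumped together
    children: list = field(default_factory=list)
    preferred: int = 0
    width: int = 0
    height: int = 0
    x: int = 0
    y: int = 0


def preferred_width(box):
    # Bottom-up: children first, then add this node's horizontal edges.
    inner = sum(preferred_width(c) for c in box.children) or box.text_width
    box.preferred = inner + 2 * box.edge
    return box.preferred


def assign_width(box, available):
    # Top-down: a block takes the width offered by its parent
    # and offers its own content width to each child.
    box.width = available
    for child in box.children:
        assign_width(child, available - 2 * box.edge)


def compute_height(box):
    # Bottom-up: wrap text in leaves, sum child heights in containers.
    if box.children:
        inner = sum(compute_height(c) for c in box.children)
    else:
        content = max(box.width - 2 * box.edge, 1)
        lines = math.ceil(box.text_width / content) if box.text_width else 0
        inner = lines * box.line_height
    box.height = inner + 2 * box.edge
    return box.height


def place(box, x=0, y=0):
    # Coordinates follow from the sizes computed in the earlier passes.
    box.x, box.y = x, y
    cursor = y + box.edge
    for child in box.children:
        place(child, x + box.edge, cursor)
        cursor += child.height


def layout(root, viewport_width):
    preferred_width(root)
    assign_width(root, viewport_width)
    compute_height(root)
    place(root)
    return root
```

<p>Floats, positioning, layers and painting are left out entirely. The sketch
only shows why widths flow top-down while heights flow bottom-up.</p>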
  580. <a name="user-content-gpu-rendering"/>
  581. <h3><a id="user-content-gpu-rendering" class="anchor" aria-hidden="true" href="#gpu-rendering"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>GPU Rendering</h3>
  582. <ul>
  583. <li>During the rendering process the graphical computing layers can use the general
  584. purpose <code>CPU</code> or the graphical processor (<code>GPU</code>).</li>
  585. <li>When using the <code>GPU</code> for graphical rendering computations, the graphical
  586. software layers split the task into multiple pieces, so it can take advantage
  587. of the <code>GPU</code>'s massive parallelism for the floating point calculations required for
  588. the rendering process.</li>
  589. </ul>
  590. <a name="user-content-window-server"/>
  591. <h3><a id="user-content-window-server" class="anchor" aria-hidden="true" href="#window-server"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Window Server</h3>
  592. <a name="user-content-post-rendering-and-user-induced-execution"/>
  593. <h3><a id="user-content-post-rendering-and-user-induced-execution" class="anchor" aria-hidden="true" href="#post-rendering-and-user-induced-execution"><svg class="octicon octicon-link" viewbox="0 0 16 16" version="1.1" aria-hidden="true"><path fill-rule="evenodd" d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"/></svg></a>Post-rendering and user-induced execution</h3>
  594. <p>After rendering has completed, the browser executes JavaScript code as a result
  595. of some timing mechanism (such as a Google Doodle animation) or user
  596. interaction (typing a query into the search box and receiving suggestions).
  597. Plugins such as Flash or Java may execute as well, although not at this time on
  598. the Google homepage. Scripts can cause additional network requests to be
  599. performed, as well as modify the page or its layout, causing another round of
  600. page rendering and painting.</p>
  601. </article>