  1. title: What is a Web Framework?
  2. url: http://www.jeffknupp.com/blog/2014/03/03/what-is-a-web-framework/
  3. hash_url: 0d60b959f92d466e54c001a78b329e2b
  4. <p>Web application frameworks, or simply "web frameworks", are the de facto way to
  5. build web-enabled applications. From simple blogs to complex AJAX-rich applications, every
  6. page on the web was created by writing code. I've recently found that
  7. many developers interested in learning a web framework like Flask or Django
  8. don't really understand what a web framework <em>is</em>, what its purpose is, or how it
  9. works. In this article, I'll explore the oft-overlooked topic of web framework
  10. fundamentals. By the end of the article, you should have a solid understanding
  11. of what a web framework is and why such frameworks exist in the first place.
  12. This will make it <em>far</em> easier to learn a new web framework and make an informed
  13. decision regarding which framework to use.
  14. </p>
  15. <h2>How The Web Works</h2>
  16. <p>Before we talk about frameworks, we need to understand how the web "works". To
  17. do so, we'll delve into what happens when you type a URL into your browser and
  18. hit <code>Enter</code>. Open a new tab in your browser and navigate to
  19. <a href="http://www.jeffknupp.com">http://www.jeffknupp.com</a>. Let's talk about
  20. the steps your browser took in order to display the page (minus DNS lookups).</p>
  21. <h3>Web Servers and ... web ... servers...</h3>
  22. <p>Every web page is transmitted to your browser as <code>HTML</code>, a language used by
  23. browsers to describe the content and structure of a web page. The application
  24. responsible for sending <code>HTML</code> to browsers is called a <em>web server</em>.
  25. Confusingly, the machine this application resides on is also usually called a
  26. web server. </p>
  27. <p>The important thing to realize, however, is that at the end of the
  28. day, all a web application really does is send <code>HTML</code> to browsers. No matter how
  29. complicated the logic of the application, the final result is always <code>HTML</code>
  30. being sent to a browser (I'm purposely glossing over the ability for
  31. applications to respond with different types of data, like <code>JSON</code> or CSS files,
  32. as the concept is the same).</p>
  33. <p>How does the web application know <em>what</em> to send to the browser? <strong>It sends
  34. whatever the browser requests</strong>.</p>
  35. <h3>HTTP</h3>
  36. <p>Browsers download websites from <em>web servers</em> (or "application servers") using
  37. the <code>HTTP</code> <em>protocol</em> (a <em>protocol</em>, in the realm of programming, is a
  38. universally known data format and sequence of steps enabling communication
  39. between two parties). The <code>HTTP</code> protocol is based on a <code>request-response</code> model.
  40. The client (your browser) <em>requests</em> data from a web application that resides
  41. on a physical machine. The web application in turn <em>responds</em> to the request with
  42. the data your browser requested.</p>
  43. <p>An important point to remember is that communication is always initiated by the
  44. <em>client</em> (your browser). The <em>server</em> (web server, that is) has no way of
  45. initiating a connection to you and sending your browser unsolicited data. If you
  46. receive data from a web server, it is because your browser explicitly asked for
  47. it.</p>
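<p>To make the request-response cycle concrete, here is a minimal sketch of one complete exchange using only Python's standard library. The handler name <code>HelloHandler</code>, the helper <code>fetch_once()</code>, and the choice of a throwaway local server are my own assumptions for illustration. Notice that the client speaks first and the server only ever replies.</p>

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body><h1>Hello, World!</h1></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a local server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client initiates the exchange; the server never contacts us first.
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once()
print(status, body.decode())
```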
  48. <h4>HTTP Methods</h4>
  49. <p>Every message in the <code>HTTP</code> protocol has an associated <em>method</em> (or <em>verb</em>). The various <code>HTTP</code> methods
  50. correspond to logically different types of requests the client can send, which in turn
  51. represent different intentions on the client side. Requesting the HTML
  52. of a web page, for example, is logically different than submitting a form, so the
  53. two actions require the use of different methods.</p>
  54. <h5>HTTP GET</h5>
  55. <p>The <code>GET</code> method does exactly what it sounds like: gets (requests) data from the
  56. web server. <code>GET</code> requests are by far the most common type of <code>HTTP</code> request. During
  57. a <code>GET</code> request the web application shouldn't need to do anything more than
  58. respond with the requested page's HTML. Specifically, the web application should not
  59. alter the state of the application as a result of a <code>GET</code> request (for example,
  60. it should not create a new user account based on a <code>GET</code> request). For
  61. this reason, <code>GET</code> requests are usually considered "safe" since they don't
  62. result in changes to the application powering the website.</p>
  63. <h5>HTTP POST</h5>
  64. <p>Clearly, there is more to interacting with web sites than simply looking at
  65. pages. We are also able to <em>send</em> data to the application, e.g. via a form.
  66. To do so, a different type of request is required: <code>POST</code>. <code>POST</code> requests
  67. usually carry data entered by the user and result in some action being taken
  68. within the web application. Signing up for a web site by entering your
  69. information on a form is done by <code>POST</code>ing the data contained in the form to the
  70. web application.</p>
  71. <p>Unlike a <code>GET</code> request, <code>POST</code> requests usually result in the state of the
  72. application changing. In our example, a new user account is created when the
  73. form is <code>POST</code>ed. Unlike <code>GET</code> requests, <code>POST</code> requests do not always result in
  74. a new HTML page being sent to the client. Instead, the client uses the response's
  75. <em>response code</em> to determine whether the operation on the application was successful.</p>
  76. <h4>HTTP Response Codes</h4>
  77. <p>In the normal case, a web server returns a <em>response code</em> of 200, meaning, "I did
  78. what you asked me to and everything went fine". A <em>response code</em> is always a
  79. three-digit number. The web application must send one with each
  80. response to indicate what happened as a result of a given request. The response code <code>200</code>
  81. literally means "OK" and is the code most often used when responding to a <code>GET</code>
  82. request. A <code>POST</code> request, however, may result in code <code>204</code> ("No Content")
  83. being sent back, meaning "Everything went OK but I don't really have anything to
  84. show you."</p>
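<p>The difference between <code>GET</code>, <code>POST</code>, and their response codes can be sketched without any networking at all. In the snippet below, the <code>handle()</code> function, the <code>/users</code> and <code>/process_signup</code> paths, and the in-memory <code>users</code> list are all hypothetical stand-ins, not part of any real framework.</p>

```python
users = []  # in-memory stand-in for the application's state


def handle(method, path, form=None):
    """Return a (status_code, body) pair for a request."""
    if method == "GET" and path == "/users":
        # GET is "safe": it reads state but never changes it.
        items = "".join("<li>{}</li>".format(name) for name in users)
        return 200, "<ul>{}</ul>".format(items)
    if method == "POST" and path == "/process_signup":
        # POST changes state; there is no page to show, so reply 204.
        users.append(form["name"])
        return 204, ""
    return 404, "<h1>Not Found</h1>"


print(handle("GET", "/users"))
print(handle("POST", "/process_signup", {"name": "alice"}))
print(handle("GET", "/users"))
```

<p>The second <code>GET</code> returns a different page than the first, but only because the <code>POST</code> in between changed the application's state.</p>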
  85. <p>It's important to realize that <code>POST</code> requests are still sent
  86. to a specific URL, which may be different from the page the data was submitted
  87. from. Continuing our sign up example, the form may reside at
  88. <code>www.foo.com/signup</code>. Hitting <code>submit</code>, however, may result in a <code>POST</code> request
  89. with the form data being sent to <code>www.foo.com/process_signup</code>. The location a
  90. <code>POST</code> request should be sent to is specified in the form's <code>HTML</code>.</p>
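<p>To see where that location lives, here is a small sketch that pulls the method and target URL out of a form's HTML with the standard library's <code>html.parser</code>. The sign-up markup and the <code>FormFinder</code> class are made up for this example.</p>

```python
from html.parser import HTMLParser

# Hypothetical markup for the page served at www.foo.com/signup.
SIGNUP_PAGE = """
<form action="/process_signup" method="post">
  <input type="text" name="username">
  <input type="submit" value="Sign up">
</form>
"""


class FormFinder(HTMLParser):
    """Collects (METHOD, action) pairs for every <form> tag it sees."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attrs = dict(attrs)
            # Forms default to GET when no method attribute is given.
            method = attrs.get("method", "get").upper()
            self.forms.append((method, attrs.get("action", "")))


finder = FormFinder()
finder.feed(SIGNUP_PAGE)
print(finder.forms)
```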
  91. <h2>Web Applications</h2>
  92. <p>You can get quite far using only <code>HTTP</code> <code>GET</code> and <code>POST</code>, as they're the two most
  93. common <code>HTTP</code> methods by a wide margin. A web application, then, is responsible
  94. for receiving an <code>HTTP</code> request and replying with an <code>HTTP</code> response, usually
  95. containing HTML that represents the page requested. <code>POST</code> requests cause the
  96. web application to take some action, perhaps adding a new record in the
  97. database. There are a number of other <code>HTTP</code> methods, but we'll focus on <code>GET</code> and
  98. <code>POST</code> for now.</p>
  99. <p>What would the simplest web application look like? We could write an application
  100. that listened for connections on port <code>80</code> (the well-known <code>HTTP</code> port that
  101. almost all <code>HTTP</code> traffic is sent to). Once it received a connection it would
  102. wait for the client to send a request, then it might reply with some very simple
  103. HTML.</p>
  104. <p>Here's what that would look like:</p>
<div class="codehilite"><pre><span class="kn">import</span> <span class="nn">socket</span>
<span class="n">HOST</span> <span class="o">=</span> <span class="s">''</span>  <span class="c"># empty string: listen on all interfaces</span>
<span class="n">PORT</span> <span class="o">=</span> <span class="mi">80</span>
<span class="n">listen_socket</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="n">socket</span><span class="p">(</span><span class="n">socket</span><span class="o">.</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">socket</span><span class="o">.</span><span class="n">SOCK_STREAM</span><span class="p">)</span>
<span class="n">listen_socket</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="n">HOST</span><span class="p">,</span> <span class="n">PORT</span><span class="p">))</span>
<span class="n">listen_socket</span><span class="o">.</span><span class="n">listen</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="c"># Block until a single client connects, then read its request.</span>
<span class="n">connection</span><span class="p">,</span> <span class="n">address</span> <span class="o">=</span> <span class="n">listen_socket</span><span class="o">.</span><span class="n">accept</span><span class="p">()</span>
<span class="n">request</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">recv</span><span class="p">(</span><span class="mi">1024</span><span class="p">)</span>
<span class="c"># Sockets send bytes (hence the b prefix); a blank line separates headers from body.</span>
<span class="n">connection</span><span class="o">.</span><span class="n">sendall</span><span class="p">(</span><span class="s">b"""HTTP/1.1 200 OK</span>
<span class="s">Content-type: text/html</span>
<span class="s"></span>
<span class="s">&lt;html&gt;</span>
<span class="s"> &lt;body&gt;</span>
<span class="s"> &lt;h1&gt;Hello, World!&lt;/h1&gt;</span>
<span class="s"> &lt;/body&gt;</span>
<span class="s">&lt;/html&gt;"""</span><span class="p">)</span>
<span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
  122. <p>(If the above doesn't work, try changing the <code>PORT</code> to something like <code>8080</code>)</p>
  123. <p>This code accepts a single connection and a single request. Regardless of what
  124. URL was requested, it responds with an <code>HTTP 200</code> response (so it's not <em>really</em> a
  125. web server). The <code>Content-type: text/html</code> line represents a <em>header</em> field.
  126. <em>Headers</em> are used to supply meta-information about the request or response.
  127. In this case, we're telling the client that the data that follows
  128. is HTML (rather than, say, JSON).</p>
  129. <h3>Anatomy of a Request</h3>
  130. <p>If I look at the <code>HTTP</code> request I sent to test the program above, I find it looks
  131. quite similar to the response. The first line is <code>&lt;HTTP Method&gt; &lt;URL&gt; &lt;HTTP version&gt;</code>
  132. or, in this case, <code>GET / HTTP/1.1</code>. After the first line come a few headers like <code>Accept: */*</code>
  133. (meaning we will accept any type of content in a response). That's basically it.</p>
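<p>Since a request is just structured text, pulling it apart takes only a few lines. This is a rough sketch, not a complete <code>HTTP</code> parser; the <code>parse_request()</code> helper and the sample request are assumptions of mine.</p>

```python
def parse_request(raw):
    """Split raw request text into (method, url, version, headers)."""
    head, _, _body = raw.partition("\r\n\r\n")  # a blank line ends the headers
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ")  # e.g. "GET / HTTP/1.1"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers


raw = "GET / HTTP/1.1\r\nHost: localhost:8080\r\nAccept: */*\r\n\r\n"
method, url, version, headers = parse_request(raw)
print(method, url, version, headers)
```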
  134. <p>The reply we send has a similar first line, in the format <code>&lt;HTTP version&gt; &lt;HTTP
  135. Status-Code&gt; &lt;Status-Code Reason-Phrase&gt;</code> or <code>HTTP/1.1 200 OK</code> in our case. Next
  136. come headers, in the same format as the request headers. Lastly, the actual
  137. content of the response is included. Note that this can be encoded as a string
  138. or binary object (in the case of files). The <code>Content-type</code> header lets the
  139. client know how to interpret the response.</p>
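<p>Going the other direction, a response can be assembled from the same three pieces: first line, headers, content. The <code>build_response()</code> helper below is a hypothetical sketch. It also adds a <code>Content-Length</code> header, which our earlier example left out, so the client knows how many bytes of content to expect.</p>

```python
def build_response(status_code, reason, body, content_type="text/html"):
    """Assemble first line, headers, and content into raw response bytes."""
    body_bytes = body.encode("utf-8")
    head = (
        "HTTP/1.1 {} {}\r\n"
        "Content-Type: {}\r\n"
        "Content-Length: {}\r\n"
        "\r\n"  # blank line: end of headers, start of content
    ).format(status_code, reason, content_type, len(body_bytes))
    return head.encode("ascii") + body_bytes


response = build_response(200, "OK", "<h1>Hi</h1>")
print(response.decode())
```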
  140. <h3>Web Server Fatigue</h3>
  141. <p>If we were going to continue building on the example above as the basis for a
  142. web application, there are a number of problems we'd need to solve:</p>
  143. <ol>
  144. <li>How do we inspect the requested URL and return the appropriate page?</li>
  145. <li>How do we deal with <code>POST</code> requests in addition to simple <code>GET</code> requests?</li>
  146. <li>How do we handle more advanced concepts like sessions and cookies?</li>
  147. <li>How do we scale the application to handle thousands of concurrent connections?</li>
  148. </ol>
  149. <p>As you can imagine, no one wants to solve these problems each time they build a
  150. web application. For that reason, packages exist that handle the nitty-gritty
  151. details of the <code>HTTP</code> protocol and have sensible solutions to
  152. the problems above. Keep in mind, however, that at their core they function in much
  153. the same way as our example: listening for requests and sending <code>HTTP</code> responses with
  154. some HTML back.</p>
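<p>One concrete illustration of handing off the protocol details is WSGI, the standard interface that Python web frameworks such as Flask and Django build on. With it, a server does the socket work and <code>HTTP</code> parsing, then calls a plain function. The sketch below exercises such a function directly through the standard library's <code>wsgiref</code> helpers; the names <code>application</code> and <code>call_app</code> are my own.</p>

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI callable: the server has already parsed the request into environ."""
    path = environ.get("PATH_INFO", "/")
    body = "<h1>You asked for {}</h1>".format(path).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(path):
    """Invoke the app the way a server would, without opening a socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


status, body = call_app("/about")
print(status, body.decode())
```

<p>To serve it for real, <code>wsgiref.simple_server.make_server('', 8080, application).serve_forever()</code> would listen on port <code>8080</code> and call <code>application</code> for every request.</p>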
  155. <p><em>Note that <strong>client-side</strong> web frameworks are a much different beast and deviate significantly from the above description.</em></p>
  156. <h2>Solving The Big Two: Routing and Templates</h2>
  157. <p>Of all the issues surrounding building a web application, two stand out.</p>
  158. <ol>
  159. <li>How do we map a requested URL to the code that is meant to handle it?</li>
  160. <li>How do we create the requested HTML dynamically, injecting calculated values
  161. or information retrieved from a database?</li>
  162. </ol>
  163. <p>Every web framework solves these issues in some way, and there are many
  164. different approaches. Examples will be helpful, so I'll discuss Django
  165. and Flask's solutions to both of these problems. First, though, we need
  166. to briefly discuss the <em>MVC</em> pattern.</p>
  167. <h3>MVC in Django</h3>
  168. <p>Django makes use of the <em>MVC</em> pattern and requires code using the framework
  169. to do the same. <em>MVC</em>, or "Model-View-Controller" is simply a way of logically
  170. separating the different responsibilities of the application. Resources like
  171. database tables are represented by <em>models</em> (in much the same way a <code>class</code> in
  172. Python often models some real-world object). <em>Controllers</em> contain the business
  173. logic of the application and operate on models. <em>Views</em> are given all of
  174. the information they need to dynamically generate the HTML representation of the page.</p>
  175. <p>Somewhat confusingly, in Django, <em>controllers</em> are called <em>views</em> and <em>views</em>
  176. are called <em>templates</em>. Other than naming weirdness, Django is a pretty
  177. straightforward implementation of an <em>MVC</em> architecture.</p>
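<p>Stripped of any framework, the three roles might look like the sketch below. This is plain Python rather than Django code, and the <code>User</code> class, the <code>USERS</code> dictionary, and both function names are invented for illustration.</p>

```python
class User:
    """Model: represents one row of a hypothetical users table."""

    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name


USERS = {3: User(3, "alice")}  # stand-in for a database


def render_user(user):
    """View (a 'template' in Django's terms): turns data into HTML."""
    return "<h1>User {}: {}</h1>".format(user.user_id, user.name)


def display_user(user_id):
    """Controller (a 'view' in Django's terms): fetches the model, picks the view."""
    user = USERS[user_id]
    return render_user(user)


print(display_user(3))
```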
  178. <h3>Routing in Django</h3>
  179. <p><em>Routing</em> is the process of mapping a requested URL to the code responsible for
  180. generating the associated HTML. In the simplest case, <em>all</em> requests are handled
  181. by the same code (as was the case in our earlier example). Getting a little more
  182. complex, every URL could map 1:1 to a <code>view function</code>. For example, we could
  183. record somewhere that if the URL <code>www.foo.com/bar</code> is requested, the function
  184. <code>handle_bar()</code> is responsible for generating the response. We could build up
  185. this mapping table until all of the URLs our application supports are enumerated
  186. with their associated functions.</p>
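<p>Such a mapping table is little more than a dictionary. Here is a minimal sketch, assuming made-up view functions and a hypothetical <code>dispatch()</code> helper:</p>

```python
def handle_bar():
    return "<h1>Bar</h1>"


def handle_about():
    return "<h1>About us</h1>"


# One entry per supported URL path, each mapped to exactly one function.
ROUTES = {
    "/bar": handle_bar,
    "/info/about_us.html": handle_about,
}


def dispatch(path):
    """Look up the view for a path; unknown paths get a 404."""
    view = ROUTES.get(path)
    if view is None:
        return 404, "<h1>Not Found</h1>"
    return 200, view()


print(dispatch("/bar"))
print(dispatch("/missing"))
```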
  187. <p>However, this approach falls flat when the URLs contain useful data, such as the
  188. ID of a resource (as is the case in <code>www.foo.com/users/3/</code>). How do we map that
  189. URL to a view function, and at the same time make use of the fact that we want
  190. to display the user with ID <code>3</code>? </p>
  191. <p>Django's answer is to map URL <em>regular expressions</em> to view functions that can
  192. take parameters. So, for example, I may say that URLs that match
  193. <code>^users/(?P&lt;id&gt;\d+)/$</code> (Django strips the leading slash before matching) call the <code>display_user(id)</code> function, where the <code>id</code>
  194. argument is the captured group <code>id</code> in the regular expression. In that way, any
  195. <code>/users/&lt;some number&gt;/</code> URL will map to the <code>display_user</code> function. These
  196. regular expressions can be arbitrarily complex and include both keyword and
  197. positional parameters.</p>
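<p>The idea behind regex routing can be sketched with nothing but the <code>re</code> module. To be clear, this is not Django's actual URL configuration API, just the underlying mechanism. The <code>URL_PATTERNS</code> list and <code>dispatch()</code> function are assumptions, and real Django views also receive a request object as their first argument, which I've left out.</p>

```python
import re


def display_user(id):
    """Hypothetical view function for a single user."""
    return "<h1>User {}</h1>".format(id)


# Each entry pairs a compiled pattern with the view that handles it.
URL_PATTERNS = [
    (re.compile(r"^users/(?P<id>\d+)/$"), display_user),
]


def dispatch(path):
    """Return the matching view's output, or None if no pattern matches."""
    path = path.lstrip("/")  # match without the leading slash, as Django does
    for pattern, view in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments (always as strings).
            return view(**match.groupdict())
    return None


print(dispatch("/users/3/"))
print(dispatch("/users/abc/"))
```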
  198. <h3>Routing in Flask</h3>
  199. <p>Flask takes a somewhat different approach. The canonical method for hooking up
  200. a function to a requested URL is through the use of the <code>route()</code> decorator. The
  201. following Flask code will function identically to the regex and function listed
  202. above:</p>
<div class="codehilite"><pre><span class="nd">@app.route</span><span class="p">(</span><span class="s">'/users/&lt;int:id&gt;/'</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">display_user</span><span class="p">(</span><span class="nb">id</span><span class="p">):</span>
    <span class="c"># ...</span>
    <span class="k">pass</span>
</pre></div>
  207. <p>As you can see, the decorator uses a simplified, regex-like syntax
  208. to map URLs to arguments (one that implicitly uses <code>/</code> as a separator). Arguments are
  209. captured by including a <code>&lt;type:name&gt;</code> directive in the URL passed to <code>route()</code>.
  210. Routing to static URLs like <code>/info/about_us.html</code> is handled as you would
  211. expect: <code>@app.route('/info/about_us.html')</code></p>
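<p>To demystify the decorator a bit, here is a toy router that registers functions the same way. It is emphatically <em>not</em> how Flask is implemented: <code>TinyApp</code> and <code>compile_rule()</code> are invented for this sketch, and only integer placeholders are supported.</p>

```python
import re


def compile_rule(rule):
    """Turn a rule like '/users/<int:id>/' into a compiled regex."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"<int:(\w+)>", rule)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)  # literal text, matched exactly
        else:
            regex += r"(?P<{}>\d+)".format(part)  # integer placeholder
    return re.compile("^" + regex + "$")


class TinyApp:
    """Hypothetical stand-in for a Flask app object."""

    def __init__(self):
        self.routes = []

    def route(self, rule):
        pattern = compile_rule(rule)

        def decorator(view):
            self.routes.append((pattern, view))
            return view  # the function itself is left unchanged

        return decorator

    def dispatch(self, path):
        for pattern, view in self.routes:
            match = pattern.match(path)
            if match:
                kwargs = {k: int(v) for k, v in match.groupdict().items()}
                return view(**kwargs)
        return None


app = TinyApp()


@app.route('/users/<int:id>/')
def display_user(id):
    return "<h1>User {}</h1>".format(id)


@app.route('/info/about_us.html')
def about_us():
    return "<h1>About us</h1>"


print(app.dispatch('/users/3/'))
print(app.dispatch('/info/about_us.html'))
```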
  212. <h3>HTML Generation Through Templates</h3>
  213. <p>Continuing the example above, once we have the appropriate piece of code mapped
  214. to the correct URL, how do we dynamically generate HTML in a way that still
  215. allows web designers to hand-craft it? For both Django and Flask,
  216. the answer is through <em>HTML templating</em>.</p>
  217. <p><em>HTML Templating</em> is similar to using <code>str.format()</code>: the desired output is written
  218. with placeholders for dynamic values. These are later replaced by arguments to
  219. the <code>str.format()</code> function. Imagine writing an entire web page as a single string,
  220. marking dynamic data with braces, and calling <code>str.format()</code> at the end.
  221. Both <em>Django templates</em> and <a href="http://jinja.pocoo.org">jinja2</a>, the template engine
  222. Flask uses, are designed to be used in this way.</p>
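<p>Taken literally, the <code>str.format()</code> analogy looks like the sketch below. The template text and the <code>render()</code> helper are made up, and real template engines add loops, conditionals, and HTML escaping on top of this basic idea.</p>

```python
# The whole page is one string; braces mark the dynamic parts.
PAGE_TEMPLATE = """<html>
 <body>
  <h1>Hello, {name}!</h1>
  <p>You have {count} new messages.</p>
 </body>
</html>"""


def render(template, **context):
    """Fill the placeholders with the values computed by the application."""
    return template.format(**context)


html = render(PAGE_TEMPLATE, name="alice", count=3)
print(html)
```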
  223. <p>However, not all templating engines are created equal. While Django has
  224. rudimentary support for programming in templates, Jinja2 basically lets you execute
  225. arbitrary code (it doesn't <em>really</em>, but close enough). Jinja2 also <em>caches</em>
  226. compiled templates, so that subsequent requests for the same template skip the
  227. expensive parsing and compilation step instead of repeating it on every
  228. render.</p>
  229. <h3>Database Interaction</h3>
  230. <p>Django, with its "batteries included" philosophy, includes an <code>ORM</code>
  231. ("Object Relational Mapper"). The purpose of an <code>ORM</code> is two-fold: it maps Python
  232. classes to database tables and abstracts away the differences between various
  233. database engines (though the former is its primary role). No one loves <code>ORM</code>s
  234. (because the mapping between domains is never perfect); rather, they are
  235. tolerated. Django's is reasonably full-featured. Flask, being a
  236. "micro-framework", does not include one (though it is quite compatible with
  237. SQLAlchemy, the Django <code>ORM</code>'s biggest/only competitor).</p>
  238. <p>The inclusion of an <code>ORM</code> gives Django the ability to create a full-featured
  239. <code>CRUD</code> application. <code>CRUD</code> (<strong>C</strong>reate <strong>R</strong>ead <strong>U</strong>pdate <strong>D</strong>elete)
  240. applications seem to be the sweet spot for web frameworks (on the server side).
  241. Django (and Flask-SQLAlchemy) make the various <code>CRUD</code> operations for each model
  242. straightforward.</p>
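<p>To give a feel for what an <code>ORM</code> automates, here are the four <code>CRUD</code> operations written by hand against an in-memory SQLite database. This is neither Django's <code>ORM</code> nor SQLAlchemy; the <code>User</code> and <code>UserStore</code> classes and the <code>users</code> table are hypothetical.</p>

```python
import sqlite3


class User:
    """Hypothetical model class, mapped by hand to a users table."""

    def __init__(self, name, id=None):
        self.id = id
        self.name = name


class UserStore:
    """Translates between User objects and rows: the job an ORM does for you."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

    def create(self, user):
        cursor = self.db.execute(
            "INSERT INTO users (name) VALUES (?)", (user.name,))
        user.id = cursor.lastrowid
        return user

    def read(self, user_id):
        row = self.db.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)).fetchone()
        return User(name=row[1], id=row[0]) if row else None

    def update(self, user):
        self.db.execute(
            "UPDATE users SET name = ? WHERE id = ?", (user.name, user.id))

    def delete(self, user_id):
        self.db.execute("DELETE FROM users WHERE id = ?", (user_id,))


store = UserStore(sqlite3.connect(":memory:"))
alice = store.create(User("alice"))   # Create
alice.name = "alicia"
store.update(alice)                   # Update
print(store.read(alice.id).name)      # Read
```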
  243. <h2>Web Framework Round-Up</h2>
  244. <p>By now, the purpose of web frameworks should be clear: to hide the boilerplate
  245. and infrastructural code related to handling <code>HTTP</code> requests and responses. Just
  246. <em>how much</em> is hidden depends on the framework. Django and Flask represent two
  247. extremes. Django includes something for every situation, almost to its
  248. detriment. Flask bills itself as a "micro-framework" and handles the bare
  249. minimum of web application functionality, relying on third-party packages to do
  250. some of the less common web framework tasks.</p>
  251. <p>Remember, though, that at the end of the day, Python web frameworks all work the
  252. same way: they receive <code>HTTP</code> requests, dispatch code that generates HTML, and
  253. create an <code>HTTP</code> response with that content. In fact, <em>all</em> major server-side
  254. frameworks work in this way (excluding JavaScript frameworks). Hopefully, you're
  255. now equipped to choose between frameworks as you understand their purpose.</p>