|
- title: What is a Web Framework?
- url: http://www.jeffknupp.com/blog/2014/03/03/what-is-a-web-framework/
- hash_url: 0d60b959f92d466e54c001a78b329e2b
-
- <p>Web application frameworks, or simply "web frameworks", are the de facto way to
- build web-enabled applications. From simple blogs to complex AJAX-rich applications, every
- page on the web was created by writing code. I've recently found that
- many developers interested in learning a web framework like Flask or Django
- don't really understand what a web framework <em>is</em>, what its purpose is, or how it
- works. In this article, I'll explore the oft-overlooked topic of web framework
- fundamentals. By the end of the article, you should have a solid understanding
- of what a web framework is and why web frameworks exist in the first place.
- This will make it <em>far</em> easier to learn a new web framework and make an informed
- decision regarding which framework to use.
- </p>
- <h2>How The Web Works</h2>
- <p>Before we talk about frameworks, we need to understand how the web "works". To
- do so, we'll delve into what happens when you type a URL into your browser and
- hit <code>Enter</code>. Open a new tab in your browser and navigate to
- <a href="http://www.jeffknupp.com">http://www.jeffknupp.com</a>. Let's talk about
- the steps your browser took in order to display the page (minus DNS lookups).</p>
- <h3>Web Servers and ... web ... servers...</h3>
- <p>Every web page is transmitted to your browser as <code>HTML</code>, a language used by
- browsers to describe the content and structure of a web page. The application
- responsible for sending <code>HTML</code> to browsers is called a <em>web server</em>.
- Confusingly, the machine this application resides on is also usually called a
- web server. </p>
- <p>The important thing to realize, however, is that at the end of the
- day, all a web application really does is send <code>HTML</code> to browsers. No matter how
- complicated the logic of the application, the final result is always <code>HTML</code>
- being sent to a browser (I'm purposely glossing over the ability for
- applications to respond with different types of data, like <code>JSON</code> or CSS files,
- as the concept is the same).</p>
- <p>How does the web application know <em>what</em> to send to the browser? <strong>It sends
- whatever the browser requests</strong>.</p>
- <h3>HTTP</h3>
- <p>Browsers download websites from <em>web servers</em> (or "application servers") using
- the <code>HTTP</code> <em>protocol</em> (a <em>protocol</em>, in the realm of programming, is a
- universally known data format and sequence of steps enabling communication
- between two parties). The <code>HTTP</code> protocol is based on a <code>request-response</code> model.
- The client (your browser) <em>requests</em> data from a web application that resides
- on a physical machine. The web application in turn <em>responds</em> to the request with
- the data your browser requested.</p>
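- <p>To make the request-response cycle concrete, here is a minimal sketch of one round trip. It uses a connected pair of in-process sockets rather than a real network connection, and the names <code>handle_one_request</code> and <code>round_trip</code> (as well as the host <code>example.invalid</code>) are my own inventions for illustration.</p>

```python
import socket


def handle_one_request(server_end):
    """Server side: wait for a request, then answer it."""
    request = server_end.recv(1024)
    body = b"<h1>Hello</h1>"
    response = b"\r\n".join([
        b"HTTP/1.1 200 OK",
        b"Content-Type: text/html",
        b"Content-Length: " + str(len(body)).encode("ascii"),
        b"",  # blank line separates headers from the body
        body,
    ])
    server_end.sendall(response)
    return request


def round_trip():
    """Client side: send a request first, then read the server's reply."""
    client_end, server_end = socket.socketpair()
    try:
        # The client always speaks first ...
        client_end.sendall(b"GET / HTTP/1.1\r\nHost: example.invalid\r\n\r\n")
        # ... and the server only ever answers what it was asked.
        request = handle_one_request(server_end)
        response = client_end.recv(1024)
    finally:
        client_end.close()
        server_end.close()
    return request, response
```

- <p>Calling <code>round_trip()</code> returns the raw bytes of both messages, which is handy for seeing exactly what travels in each direction.</p>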
- <p>An important point to remember is that communication is always initiated by the
- <em>client</em> (your browser). The <em>server</em> (web server, that is) has no way of
- initiating a connection to you and sending your browser unsolicited data. If you
- receive data from a web server, it is because your browser explicitly asked for
- it.</p>
- <h4>HTTP Methods</h4>
- <p>Every request in the <code>HTTP</code> protocol has an associated <em>method</em> (or <em>verb</em>). The various <code>HTTP</code> methods
- correspond to logically different types of requests the client can send, which in turn
- represent different intentions on the client side. Requesting the HTML
- of a web page, for example, is logically different from submitting a form, so the
- two actions require the use of different methods.</p>
- <h5>HTTP GET</h5>
- <p>The <code>GET</code> method does exactly what it sounds like: gets (requests) data from the
- web server. <code>GET</code> requests are by far the most common <code>HTTP</code> request. During
- a <code>GET</code> request the web application shouldn't need to do anything more than
- respond with the requested page's HTML. Specifically, the web application should not
- alter the state of the application as a result of a <code>GET</code> request (for example,
- it should not create a new user account based on a <code>GET</code> request). For
- this reason, <code>GET</code> requests are usually considered "safe" since they don't
- result in changes to the application powering the website.</p>
- <h5>HTTP POST</h5>
- <p>Clearly, there is more to interacting with web sites than simply looking at
- pages. We are also able to <em>send</em> data to the application, e.g. via a form.
- To do so, a different type of request is required: <code>POST</code>. <code>POST</code> requests
- usually carry data entered by the user and result in some action being taken
- within the web application. Signing up for a web site by entering your
- information on a form is done by <code>POST</code>ing the data contained in the form to the
- web application.</p>
- <p>Unlike a <code>GET</code> request, <code>POST</code> requests usually result in the state of the
- application changing. In our example, a new user account is created when the
- form is <code>POST</code>ed. Unlike <code>GET</code> requests, <code>POST</code> requests do not always result in
- a new HTML page being sent to the client. Instead, the client uses the response's
- <em>response code</em> to determine if the operation on the application was successful.</p>
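- <p>The difference between the two methods can be sketched in a few lines. This is a toy, not how any particular framework does it: the <code>users</code> list stands in for a database, and the <code>handle</code> function, the <code>/signup</code> path, and the numeric codes it returns are all assumptions made for the example.</p>

```python
users = []  # application state (a stand-in for a real database)


def handle(method, path, form=None):
    """Return a (response code, HTML) pair for a request."""
    if method == "GET" and path == "/signup":
        # Safe: only reads, never changes application state.
        return 200, "<form method='post' action='/signup'>...</form>"
    if method == "POST" and path == "/signup":
        # Not safe: the application's state changes.
        users.append(form["username"])
        return 204, ""  # success, but no new page to show
    return 404, "<h1>Not Found</h1>"
```

- <p>A <code>GET</code> to <code>/signup</code> can be repeated any number of times without <code>users</code> changing; every <code>POST</code> adds an entry.</p>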
- <h4>HTTP Response Codes</h4>
- <p>In the normal case, a web server returns a <em>response code</em> of 200, meaning, "I did
- what you asked me to and everything went fine". A <em>response code</em> is always a
- three-digit number. The web application must send one with each
- response to indicate what happened as a result of a given request. The response code <code>200</code>
- literally means "OK" and is the code most often used when responding to a <code>GET</code>
- request. A <code>POST</code> request, however, may result in code <code>204</code> ("No Content")
- being sent back, meaning "Everything went OK but I don't really have anything to
- show you."</p>
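- <p>Python's standard library happens to ship the full table of response codes in <code>http.HTTPStatus</code>, which makes for a quick way to look one up. The helper name <code>describe</code> is just something I made up for this sketch.</p>

```python
from http import HTTPStatus


def describe(code):
    """Return the code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return "%d %s" % (status.value, status.phrase)


print(describe(200))  # 200 OK
print(describe(204))  # 204 No Content
```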
- <p>It's important to realize that <code>POST</code> requests are still sent
- to a specific URL, which may be different from the page the data was submitted
- from. Continuing our sign up example, the form may reside at
- <code>www.foo.com/signup</code>. Hitting <code>submit</code>, however, may result in a <code>POST</code> request
- with the form data being sent to <code>www.foo.com/process_signup</code>. The location a
- <code>POST</code> request should be sent to is specified in the form's <code>HTML</code>.</p>
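- <p>As a rough sketch of what the browser puts on the wire when the form is submitted, the function below builds the text of such a <code>POST</code> request. The <code>build_post_request</code> name and the <code>username</code> field are assumptions for illustration; the URL-encoded body format is the default one browsers use for simple forms.</p>

```python
from urllib.parse import urlencode, parse_qs


def build_post_request(host, path, form):
    """Build the raw text of a POST request carrying form data."""
    body = urlencode(form)  # e.g. "username=jeff"
    lines = [
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: %d" % len(body),
        "",  # blank line separates headers from the body
        body,
    ]
    return "\r\n".join(lines)


request = build_post_request("www.foo.com", "/process_signup", {"username": "jeff"})
# The receiving application decodes the body back into form fields.
fields = parse_qs(request.split("\r\n\r\n", 1)[1])
```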
- <h2>Web Applications</h2>
- <p>You can get quite far using only <code>HTTP</code> <code>GET</code> and <code>POST</code>, as they're the two most
- common <code>HTTP</code> methods by a wide margin. A web application, then, is responsible
- for receiving an <code>HTTP</code> request and replying with an <code>HTTP</code> response, usually
- containing HTML that represents the page requested. <code>POST</code> requests cause the
- web application to take some action, perhaps adding a new record in the
- database. There are a number of other <code>HTTP</code> methods, but we'll focus on <code>GET</code> and
- <code>POST</code> for now.</p>
- <p>What would the simplest web application look like? We could write an application
- that listened for connections on port <code>80</code> (the well-known <code>HTTP</code> port that
- almost all <code>HTTP</code> traffic is sent to). Once it received a connection it would
- wait for the client to send a request, then it might reply with some very simple
- HTML.</p>
- <p>Here's what that would look like:</p>
- <div class="codehilite"><pre><span class="kn">import</span> <span class="nn">socket</span>
-
- <span class="n">HOST</span> <span class="o">=</span> <span class="s">''</span>  <span class="c"># empty string: listen on all interfaces</span>
- <span class="n">PORT</span> <span class="o">=</span> <span class="mi">80</span>
- <span class="n">listen_socket</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="n">socket</span><span class="p">(</span><span class="n">socket</span><span class="o">.</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">socket</span><span class="o">.</span><span class="n">SOCK_STREAM</span><span class="p">)</span>
- <span class="n">listen_socket</span><span class="o">.</span><span class="n">bind</span><span class="p">((</span><span class="n">HOST</span><span class="p">,</span> <span class="n">PORT</span><span class="p">))</span>
- <span class="n">listen_socket</span><span class="o">.</span><span class="n">listen</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
- <span class="c"># Block until one client connects, then read its request</span>
- <span class="n">connection</span><span class="p">,</span> <span class="n">address</span> <span class="o">=</span> <span class="n">listen_socket</span><span class="o">.</span><span class="n">accept</span><span class="p">()</span>
- <span class="n">request</span> <span class="o">=</span> <span class="n">connection</span><span class="o">.</span><span class="n">recv</span><span class="p">(</span><span class="mi">1024</span><span class="p">)</span>
- <span class="c"># Sockets carry bytes, hence the b prefix (required on Python 3)</span>
- <span class="n">connection</span><span class="o">.</span><span class="n">sendall</span><span class="p">(</span><span class="s">b"""HTTP/1.1 200 OK</span>
- <span class="s">Content-type: text/html</span>
-
-
- <span class="s">&lt;html&gt;</span>
- <span class="s"> &lt;body&gt;</span>
- <span class="s"> &lt;h1&gt;Hello, World!&lt;/h1&gt;</span>
- <span class="s"> &lt;/body&gt;</span>
- <span class="s">&lt;/html&gt;"""</span><span class="p">)</span>
- <span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
- </pre></div>
-
-
- <p>(If the above doesn't work, try changing the <code>PORT</code> to something like <code>8080</code>; on most systems, binding to a port below 1024 requires administrator privileges.)</p>
- <p>This code accepts a single connection and a single request. Regardless of what
- URL was requested, it responds with an <code>HTTP 200</code> response (so it's not <em>really</em> a
- web server). The <code>Content-type: text/html</code> line represents a <em>header</em> field.
- <em>Headers</em> are used to supply meta-information about the request or response.
- In this case, we're telling the client that the data that follows
- is HTML (rather than, say, JSON).</p>
- <h3>Anatomy of a Request</h3>
- <p>If I look at the <code>HTTP</code> request I sent to test the program above, I find it looks
- quite similar to the response. The first line is <code>&lt;HTTP Method&gt; &lt;URL&gt; &lt;HTTP version&gt;</code>
- or, in this case, <code>GET / HTTP/1.1</code>. After the first line come a few headers like <code>Accept: */*</code>
- (meaning we will accept any type of content in a response). That's basically it.</p>
- <p>The reply we send has a similar first line, in the format <code>&lt;HTTP version&gt; &lt;HTTP
- Status-Code&gt; &lt;Status-Code Reason-Phrase&gt;</code> or <code>HTTP/1.1 200 OK</code> in our case. Next
- come headers, in the same format as the request headers. Lastly, the actual
- content of the response is included. Note that this can be encoded as a string
- or binary object (in the case of files). The <code>Content-type</code> header lets the
- client know how to interpret the response.</p>
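- <p>Since the format is so regular, pulling a request apart takes only a few lines. What follows is a simplified sketch (the <code>parse_request</code> name is mine, and it ignores plenty of corner cases a real server must handle) that splits a raw request into its first line, headers, and body.</p>

```python
def parse_request(raw):
    """Split raw request bytes into method, URL, version, headers, body."""
    # A blank line separates the head (first line + headers) from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


raw = b"GET / HTTP/1.1\r\nHost: www.jeffknupp.com\r\nAccept: */*\r\n\r\n"
method, target, version, headers, body = parse_request(raw)
```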
- <h3>Web Server Fatigue</h3>
- <p>If we were going to continue building on the example above as the basis for a
- web application, there are a number of problems we'd need to solve:</p>
- <ol>
- <li>How do we inspect the requested URL and return the appropriate page?</li>
- <li>How do we deal with <code>POST</code> requests in addition to simple <code>GET</code> requests?</li>
- <li>How do we handle more advanced concepts like sessions and cookies?</li>
- <li>How do we scale the application to handle thousands of concurrent connections?</li>
- </ol>
- <p>As you can imagine, no one wants to solve these problems each time they build a
- web application. For that reason, packages exist that handle the nitty-gritty
- details of the <code>HTTP</code> protocol and have sensible solutions to
- the problems above. Keep in mind, however, that at their core they function in much
- the same way as our example: listening for requests and sending <code>HTTP</code> responses with
- some HTML back.</p>
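- <p>As one illustration of such a package, Python's standard library includes <code>wsgiref</code>, a reference implementation of WSGI (an interface I haven't otherwise discussed here, but one that both Flask and Django applications can be served through). In the sketch below, the socket handling and request parsing are gone; the application is just a function that receives the already-parsed request. The <code>application</code> and <code>call_app</code> names are my own.</p>

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI application: the server has already parsed the request."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    body = ("<h1>%s %s</h1>" % (method, path)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the application directly with a fake request (no network)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD and friends
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

- <p>To serve it for real, something like <code>wsgiref.simple_server.make_server('', 8080, application).serve_forever()</code> should do.</p>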
- <p><em>Note that <strong>client-side</strong> web frameworks are a much different beast and deviate significantly from the above description.</em></p>
- <h2>Solving The Big Two: Routing and Templates</h2>
- <p>Of all the issues surrounding building a web application, two stand out.</p>
- <ol>
- <li>How do we map a requested URL to the code that is meant to handle it?</li>
- <li>How do we create the requested HTML dynamically, injecting calculated values
- or information retrieved from a database?</li>
- </ol>
- <p>Every web framework solves these issues in some way, and there are many
- different approaches. Examples will be helpful, so I'll discuss Django
- and Flask's solutions to both of these problems. First, though, we need
- to briefly discuss the <em>MVC</em> pattern.</p>
- <h3>MVC in Django</h3>
- <p>Django makes use of the <em>MVC</em> pattern and requires code using the framework
- to do the same. <em>MVC</em>, or "Model-View-Controller" is simply a way of logically
- separating the different responsibilities of the application. Resources like
- database tables are represented by <em>models</em> (in much the same way a <code>class</code> in
- Python often models some real-world object). <em>Controllers</em> contain the business
- logic of the application and operate on models. <em>Views</em> are given all of
- the information they need to dynamically generate the HTML representation of the page.</p>
- <p>Somewhat confusingly, in Django, <em>controllers</em> are called <em>views</em> and <em>views</em>
- are called <em>templates</em>. Other than naming weirdness, Django is a pretty
- straightforward implementation of an <em>MVC</em> architecture.</p>
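- <p>Stripped of any framework, the three roles might look something like the sketch below. None of this is Django code; the <code>User</code> class, the <code>USERS</code> dictionary, and both function names are assumptions made purely to show who is responsible for what.</p>

```python
class User:
    """Model: represents one row of a (hypothetical) users table."""

    def __init__(self, id, name):
        self.id = id
        self.name = name


USERS = {3: User(3, "Jeff")}  # stand-in for the database


def render_user(user):
    """View (a 'template' in Django's terms): turns data into HTML."""
    return "<h1>User %d: %s</h1>" % (user.id, user.name)


def display_user(id):
    """Controller (a 'view' in Django's terms): business logic."""
    user = USERS[id]          # operate on the model
    return render_user(user)  # hand the result to the view
```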
- <h3>Routing in Django</h3>
- <p><em>Routing</em> is the process of mapping a requested URL to the code responsible for
- generating the associated HTML. In the simplest case, <em>all</em> requests are handled
- by the same code (as was the case in our earlier example). Getting a little more
- complex, every URL could map 1:1 to a <code>view function</code>. For example, we could
- record somewhere that if the URL <code>www.foo.com/bar</code> is requested, the function
- <code>handle_bar()</code> is responsible for generating the response. We could build up
- this mapping table until all of the URLs our application supports are enumerated
- with their associated functions.</p>
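- <p>Such a mapping table is easy to picture as a plain dictionary. The sketch below is framework-free and the names <code>ROUTES</code>, <code>handle_index</code>, and <code>dispatch</code> are hypothetical; only <code>handle_bar()</code> comes from the example above.</p>

```python
def handle_index():
    return "<h1>Home</h1>"


def handle_bar():
    return "<h1>Bar</h1>"


# One entry per supported URL path, each mapped to exactly one function.
ROUTES = {
    "/": handle_index,
    "/bar": handle_bar,
}


def dispatch(path):
    """Look up the handler for a path and return (response code, HTML)."""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, "<h1>Not Found</h1>"
    return 200, handler()
```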
- <p>However, this approach falls flat when the URLs contain useful data, such as the
- ID of a resource (as is the case in <code>www.foo.com/users/3/</code>). How do we map that
- URL to a view function, and at the same time make use of the fact that we want
- to display the user with ID <code>3</code>? </p>
- <p>Django's answer is to map URL <em>regular expressions</em> to view functions that can
- take parameters. So, for example, I may say that URLs that match
- <code>^users/(?P&lt;id&gt;\d+)/$</code> call the <code>display_user(id)</code> function, where the <code>id</code>
- argument is the captured group <code>id</code> in the regular expression (Django strips the URL's leading <code>/</code> before matching). In that way, any
- <code>/users/&lt;some number&gt;/</code> URL will map to the <code>display_user</code> function. These
- regular expressions can be arbitrarily complex and include both keyword and
- positional parameters.</p>
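- <p>The idea can be sketched with nothing but the <code>re</code> module. To be clear, this is not Django's implementation: <code>URL_PATTERNS</code> and <code>dispatch</code> are names I've made up, and the sketch strips the leading slash before matching, as Django does. Note that a captured group always arrives as a string.</p>

```python
import re


def display_user(id):
    return "<h1>User %s</h1>" % id


# Each entry pairs a compiled regular expression with a view function.
URL_PATTERNS = [
    (re.compile(r"^users/(?P<id>\d+)/$"), display_user),
]


def dispatch(path):
    """Call the first view whose pattern matches; None if nothing matches."""
    path = path.lstrip("/")
    for pattern, view in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            # Named groups become keyword arguments of the view.
            return view(**match.groupdict())
    return None
```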
- <h3>Routing in Flask</h3>
- <p>Flask takes a somewhat different approach. The canonical method for hooking up
- a function to a requested URL is through the use of the <code>route()</code> decorator. The
- following Flask code will function identically to the regex and function listed
- above:</p>
- <div class="codehilite"><pre><span class="nd">@app.route</span><span class="p">(</span><span class="s">'/users/&lt;int:id&gt;/'</span><span class="p">)</span>
- <span class="k">def</span> <span class="nf">display_user</span><span class="p">(</span><span class="nb">id</span><span class="p">):</span>
-     <span class="c"># ...</span>
- </pre></div>
-
-
- <p>As you can see, the decorator uses a simplified, regex-like syntax
- to map URLs to arguments (one that implicitly uses <code>/</code> as a separator). Arguments are
- captured by including a <code>&lt;type:name&gt;</code> directive in the URL passed to <code>route()</code>.
- Routing to static URLs like <code>/info/about_us.html</code> is handled as you would
- expect: <code>@app.route('/info/about_us.html')</code>.</p>
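- <p>To see why this syntax amounts to a simplified regular expression, here is a rough sketch that translates a rule like <code>/users/&lt;int:id&gt;/</code> into an actual regex. It is emphatically not how Flask does it internally; <code>CONVERTERS</code>, <code>compile_rule</code>, and <code>match_rule</code> are hypothetical names, and only two converter types are supported.</p>

```python
import re

# Converter name -> (regex fragment, function applied to the captured text)
CONVERTERS = {
    "int": (r"\d+", int),
    "string": (r"[^/]+", str),
}
PLACEHOLDER = re.compile(r"<(?:(?P<converter>\w+):)?(?P<name>\w+)>")


def compile_rule(rule):
    """Translate a rule such as '/users/<int:id>/' into a compiled regex."""
    casts = {}
    pattern = ""
    position = 0
    for match in PLACEHOLDER.finditer(rule):
        # Literal text between placeholders must match exactly.
        pattern += re.escape(rule[position:match.start()])
        regex, cast = CONVERTERS[match.group("converter") or "string"]
        pattern += "(?P<%s>%s)" % (match.group("name"), regex)
        casts[match.group("name")] = cast
        position = match.end()
    pattern += re.escape(rule[position:])
    return re.compile("^" + pattern + "$"), casts


def match_rule(rule, path):
    """Return the converted arguments if path matches rule, else None."""
    regex, casts = compile_rule(rule)
    match = regex.match(path)
    if match is None:
        return None
    return {name: casts[name](text) for name, text in match.groupdict().items()}
```

- <p>Unlike the raw regex approach, the converter also turns the captured text into the right type, so <code>id</code> arrives as an integer rather than a string.</p>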
- <h3>HTML Generation Through Templates</h3>
- <p>Continuing the example above, once we have the appropriate piece of code mapped
- to the correct URL, how do we dynamically generate HTML in a way that still
- allows web designers to hand-craft it? For both Django and Flask,
- the answer is through <em>HTML templating</em>.</p>
- <p><em>HTML Templating</em> is similar to using <code>str.format()</code>: the desired output is written
- with placeholders for dynamic values. These are later replaced by arguments to
- the <code>str.format()</code> function. Imagine writing an entire web page as a single string,
- marking dynamic data with braces, and calling <code>str.format()</code> at the end.
- Both <em>Django templates</em> and <a href="http://jinja.pocoo.org">jinja2</a>, the template engine
- Flask uses, are designed to be used in this way.</p>
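- <p>Taking the <code>str.format()</code> analogy literally gives something like the following sketch. The <code>PAGE</code> string and <code>render</code> function are invented for the example. One thing real template engines add on top is escaping of user-supplied values (Django does this automatically, as does Jinja2 the way Flask configures it); here I approximate it with <code>html.escape</code>.</p>

```python
import html

# The whole page is one string; braces mark the dynamic parts.
PAGE = """<html>
  <body>
    <h1>Hello, {name}!</h1>
    <p>You have {count} new messages.</p>
  </body>
</html>"""


def render(name, count):
    """Fill in the placeholders, escaping text that may contain HTML."""
    return PAGE.format(name=html.escape(name), count=count)
```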
- <p>However, not all templating engines are created equal. While Django has
- rudimentary support for programming in templates, Jinja2 lets you write
- far more general expressions (not quite arbitrary code, but close enough). Jinja2 also <em>caches</em>
- compiled templates, so that a template is parsed and compiled once
- rather than on every request.</p>
- <h3>Database Interaction</h3>
- <p>Django, with its "batteries included" philosophy, includes an <code>ORM</code>
- ("Object Relational Mapper"). The purpose of an <code>ORM</code> is two-fold: it maps Python
- classes to database tables and abstracts away the differences between various
- database engines (though the former is its primary role). No one loves <code>ORM</code>s
- (because the mapping between domains is never perfect); rather, they are
- tolerated. Django's is reasonably full-featured. Flask, being a
- "micro-framework", does not include one (though it is quite compatible with
- SQLAlchemy, the Django <code>ORM</code>'s biggest/only competitor).</p>
- <p>The inclusion of an <code>ORM</code> gives Django the ability to create a full-featured
- <code>CRUD</code> application. <code>CRUD</code> (<strong>C</strong>reate <strong>R</strong>ead <strong>U</strong>pdate <strong>D</strong>elete)
- applications seem to be the sweet spot for web frameworks (on the server side).
- Django (and Flask-SQLAlchemy) make the various <code>CRUD</code> operations for each model
- straightforward.</p>
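- <p>To give a feel for what an <code>ORM</code> does under the hood, here is a deliberately tiny sketch built on the standard library's <code>sqlite3</code> module. It bears no resemblance to the real Django or SQLAlchemy APIs; the <code>User</code> and <code>UserTable</code> classes are hypothetical and handle exactly one table.</p>

```python
import sqlite3


class User:
    """A Python class whose instances correspond to rows in 'users'."""

    def __init__(self, name, id=None):
        self.name = name
        self.id = id


class UserTable:
    """Maps User objects to rows, one method per CRUD operation."""

    def __init__(self, connection):
        self.connection = connection
        connection.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)")

    def create(self, user):
        cursor = self.connection.execute(
            "INSERT INTO users (name) VALUES (?)", (user.name,))
        user.id = cursor.lastrowid  # the database assigns the ID
        return user

    def read(self, id):
        row = self.connection.execute(
            "SELECT id, name FROM users WHERE id = ?", (id,)).fetchone()
        return User(row[1], row[0]) if row else None

    def update(self, user):
        self.connection.execute(
            "UPDATE users SET name = ? WHERE id = ?", (user.name, user.id))

    def delete(self, user):
        self.connection.execute("DELETE FROM users WHERE id = ?", (user.id,))
```

- <p>Application code then works with <code>User</code> objects and never writes SQL itself, which is the whole point.</p>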
- <h2>Web Framework Round-Up</h2>
- <p>By now, the purpose of web frameworks should be clear: to hide the boilerplate
- and infrastructural code related to handling <code>HTTP</code> requests and responses. Just
- <em>how much</em> is hidden depends on the framework. Django and Flask represent two
- extremes. Django includes something for every situation, almost to its
- detriment. Flask bills itself as a "micro-framework" and handles the bare
- minimum of web application functionality, relying on third-party packages to do
- some of the less common web framework tasks.</p>
- <p>Remember, though, that at the end of the day, Python web frameworks all work the
- same way: they receive <code>HTTP</code> requests, dispatch code that generates HTML, and
- create an <code>HTTP</code> response with that content. In fact, <em>all</em> major server-side
- frameworks work in this way (excluding JavaScript frameworks). Hopefully, now that you
- understand their purpose, you're equipped to choose between frameworks.</p>
|