|
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186 |
- title: “Eppur si muove!”* – Dealing with Timezones in Python
- url: http://lucumr.pocoo.org/2011/7/15/eppur-si-muove/
- hash_url: 3caf29e374bca01d6b9cab9a618ee830
-
- <p>As a result of our world not being a flat disc but a rotating geoid and
- our solar system only having one sun, we have different time of days at
- different parts at precisely the same time. Everybody learns that in
- school these days and is well aware of the effects on human life (“Call
- your aunt over sea and she will pick up at an odd time”, jetlag etc.).
- But unfortunately that whole timezone thing is only partially based on
- constraints our world gave us and in computing we have to deal with these
- oddities as well.</p>
- <small><p>* “<a class="reference external" href="http://en.wikipedia.org/wiki/E_pur_si_muove!">and yet it moves</a>” is
- what people say Galileo Galilei uttered upon leaving the courtyard after
- being forced to recant his belief that the Earth rotates around the Sun.
- Which unfortunately is the case and gives us these wonderful timezone
- problems.</p>
- <p>What does this article have to do with Galileo? Not really much I am
- afraid because even if the world would be in the center of the universe
- you would still have timezones. Consider the title a mistake on my part
- which I cannot correct now, can I :-)</p>
- </small><div class="section" id="what-s-a-timezone">
- <h2>What's a Timezone?</h2>
- <p>What's your timezone? If you respond with “UTC+X” that will be correct
- for this very moment, but not necessarily true over time. If you look at
- the timezone info database you will find that Berlin and Vienna, even
- though they are both in “UTC+1” will have a different timezone
- (Europe/Berlin vs Europe/Vienna). Why that? The reason are differences
- in daylight saving time and historical dates. Even if those two countries
- and cities nowadays have the same DST configurations, a hundred years ago
- that was not the case. Both Austria and Germany for instance used to not
- have DST over periods of time. Austria stopped in 1920, Germany did in
- 1918. During WWII both countries unsurprisingly had the same DST
- configuration, but afterwards there are a few unsynchronized years again.
- Germany abolished DST in 1949 and reintroduced DST in 1979, Austria
- abolished it in 1948 and reintroduced it in 1980. What's worse is that
- they did not even select the same date for the switch.</p>
- <p>And this pattern is quite common all around the world. For computing DST
- is a huge problem. The reason for that is that we're usually assuming
- that time has a monotonic advancing. With daylight saving time, during
- that one hour of enabling/disabling each year we either get an hour twice
- or we skip an entire hour. Results are log entries that appear out of
- order if you log with local time for instance.</p>
- <p>To quote the pytz documentation:</p>
- <blockquote>
- For example, 1:30am on 27th Oct 2002 happened twice in the US/Eastern
- timezone when the clocks where put back at the end of Daylight Savings
- Time, similarly, 2:30am on 7th April 2002 never happened at all in the
- US/Eastern timezone, as the clocks where put forward at 2:00am
- skipping the entire hour</blockquote>
- <p>But timezones have more than just DST settings. Some countries are
- switching the means of time measuring altogether, in some cases even
- without entering or leaving DST. For example, in 1915 Warsaw switched
- from Warsaw time to Central European time. So at the stroke of midnight on
- August 5th 1915 the clocks were wound back 24 minutes. In neither case
- was DST active.</p>
- <p>Much fun can be had with timezones in general. There was at least one
- country that at one point had a timezone that differed per day because
- they synchronized 0:00 with the time of the sunrise.</p>
- </div>
- <div class="section" id="where-is-the-sanity">
- <h2>Where is the Sanity?</h2>
- <p>The sanity right now is called UTC. UTC is a timezone without daylight
- saving time and still a timezone without configuration changes in the
- past. However because our world is again this rotating geoid and
- something we don't really have under control, the problem of leap seconds
- will at one point show up. If UTC will then take leap seconds into
- account (which are irregular and with that problem for computing) or not
- (and each timezone will have sub-minute differences to UTC) is, as far as
- I know, nothing that was decided for sure yet.</p>
- <p>However right now, UTC is the safest bet. From UTC you can convert into
- any local time, however of course the reverse is not true due to what was
- shown above.</p>
- <p>So here the rule of thumb which never shall be broken:</p>
- <blockquote>
- <strong>Always measure and store time in UTC</strong>. If you need to record where
- the time was taken, store that separately. Do not store the local
- time + timezone information!</blockquote>
- </div>
- <div class="section" id="where-is-the-problem">
- <h2>Where is the Problem?</h2>
- <p>Now in theory that blog post should end here and we all go on with our
- lives. Unfortunately in Python there are a couple of more things to keep
- in mind due to some design decisions that were made a long ago that were
- not thought well through. The motivation was sound, the implications
- however were not.</p>
- <p>At one time the following decisions were apparently made for the datetime
- module in the standard library:</p>
- <ol class="arabic simple">
- <li>the datetime module should not ship timezone information because
- timeszones change too often.</li>
- <li>the datetime module however should provide an API to attach timezone
- information to a datetime object.</li>
- <li>It should provide these objects: date, time, date+time, timedelta</li>
- </ol>
- <p>Unfortunately a few things went wrong. The biggest problem is that a
- datetime object with timezone information attached and a datetime object
- without that timezone information don't work at all together:</p>
- <div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">import</span> <span class="nn">pytz</span><span class="o">,</span> <span class="nn">datetime</span>
- <span class="gp">>>> </span><span class="n">a</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span>
- <span class="gp">>>> </span><span class="n">b</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="n">tzinfo</span><span class="o">=</span><span class="n">pytz</span><span class="o">.</span><span class="n">utc</span><span class="p">)</span>
- <span class="gp">>>> </span><span class="n">a</span> <span class="o"><</span> <span class="n">b</span>
- <span class="gt">Traceback (most recent call last):</span>
- File <span class="nb">"<stdin>"</span>, line <span class="m">1</span>, in <span class="n"><module></span>
- <span class="gr">TypeError</span>: <span class="n">can't compare offset-naive and offset-aware datetimes</span>
- </pre></div>
- <p>Ignoring the horrible API you have to use to attach a timezone information
- to a datetime object this leads to quite a few problems. If you are
- dealing with datetime objects in Python you will sooner or later start
- attaching and removing tzinfo objects all over the place.</p>
- <p>Another problem is that there are two ways to create a datetime object for
- the current time in Python:</p>
- <div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span>
- <span class="go">datetime.datetime(2011, 7, 15, 8, 30, 55, 375010)</span>
- <span class="gp">>>> </span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">()</span>
- <span class="go">datetime.datetime(2011, 7, 15, 10, 30, 57, 70767)</span>
- </pre></div>
- <p>One gives the time in UTC, the other in local time. However it will not
- tell you what local time is (because it does not have a timezone
- information object, at least before 3.3), and it does not give you way to
- know which one was UTC.</p>
- <p>If you convert from a UNIX timestamp into a datetime object you also have
- to be very careful to use the <cite>datetime.datetime.utcfromtimestamp</cite> method
- because the normal one will assume the timestamp is in local time.</p>
- <p>On top of that, the library provides a <cite>time</cite> object and a <cite>date</cite> object,
- both of which are close to being useless when timezones are involved. The
- former cannot be shifted to other timezones because that would require the
- date component. The date itself also only makes any sense local to a
- timezone because what's today for me, could be tomorrow or yesterday for
- you thanks to the wonderful world of timezones.</p>
- </div>
- <div class="section" id="what-s-the-best-practice">
- <h2>What's the Best Practice?</h2>
- <p>Now we know where the culprits are. What should we do? If we ignore
- theoretical problems that won't show up anyways unless we deal with
- history times there are a few best practices that make your life easier.
- If you ever have the problem with historic dates, there is an alternative
- module called <a class="reference external" href="http://www.egenix.com/products/python/mxBase/mxDateTime/">mxDateTime</a> which
- generally follows a better design and supports multiple calendars as well
- (Gregorian and Julian).</p>
- <div class="section" id="internally-use-utc">
- <h3>Internally use UTC</h3>
- <p>This should be a given. When you take the current time, always use
- <cite>datetime.datetime.utcnow()</cite>. If you are taking in user input that is in
- local time, immediately convert it to UTC. If that conversion would be
- ambiguous let the user know. Do not blindly guess. I know every time the
- DST switch comes up I am setting a second analog clock and not just my
- phone because my iPhone failed with that conversion twice now.</p>
- </div>
- <div class="section" id="do-not-use-offset-aware-datetimes">
- <h3>Do not use offset aware datetimes</h3>
- <p>It might sound like a good idea to always attach a tzinfo object, but it's
- actually a much better idea to not do that. If you assume that every
- datetime object without a tzinfo object is in UTC, that's the better
- solution. You can actually take advantage of the fact that you cannot
- compare these two, similar to how you cannot mix bytes and unicode in
- Python 3. Use that “API weakness” to your advantage.</p>
- <ol class="arabic simple">
- <li>internally always use offset naive datetime objects and consider them
- UTC.</li>
- <li>When interfacing with the user, convert to and from local time.</li>
- </ol>
- <p>Why would you not want to attach an UTC tzinfo object? First of all
- because the majority of libraries are written with the assumption of
- <cite>tzinfo</cite> == None in mind. Secondly because it was a horrible idea to have
- this tzinfo object in the first place as the API is broken. If you look
- into the pytz library it has to provide alternative functions for the
- conversion because the intended API for timezone conversions is not
- flexible enough to represent the majority of timezones. By not using
- tzinfo objects there is a chance that we can one time change to something
- better.</p>
- <p>Another reason for not using offset aware datetimes is that the tzinfo
- object is implementation defined. There is no standard way to transport
- that timezone information (with the exception of the UTC offset in that
- very moment) to other languages or over HTTP etc. Also datetime objects
- with timezone often cause much larger pickles or broken pickles altogether
- depending on the implementation of that timezone object.</p>
- </div>
- <div class="section" id="rebase-for-formatting">
- <h3>Rebase for Formatting</h3>
- <p>If you then want to show the time in the user's local timezone take that
- UTC datetime object, attach the <cite>UTC</cite> timezone information, look up the
- user's timezone, rebase to local time and format. Do not do the
- conversion of the timezone with the tzinfo method which is known to be
- broken, but use the pytz one. Then throw away that filthy offset aware
- datetime object you've created for formatting and go on with your life.</p>
|