  1. title: “Eppur si muove!”* – Dealing with Timezones in Python
  2. url: http://lucumr.pocoo.org/2011/7/15/eppur-si-muove/
  3. hash_url: 3caf29e374bca01d6b9cab9a618ee830
  4. <p>As a result of our world not being a flat disc but a rotating geoid and
  5. our solar system only having one sun, we have different times of day in
  6. different places at precisely the same moment. Everybody learns that in
  7. school these days and is well aware of the effects on human life (“Call
  8. your aunt overseas and she will pick up at an odd time”, jetlag etc.).
  9. But unfortunately that whole timezone thing is only partially based on
  10. constraints our world gave us, and in computing we have to deal with these
  11. oddities as well.</p>
  12. <small><p>* “<a class="reference external" href="http://en.wikipedia.org/wiki/E_pur_si_muove!">and yet it moves</a>” is
  13. what people say Galileo Galilei uttered upon leaving the courtyard after
  14. being forced to recant his belief that the Earth rotates around the Sun.
  15. Which unfortunately is the case and gives us these wonderful timezone
  16. problems.</p>
  17. <p>What does this article have to do with Galileo? Not really much, I am
  18. afraid, because even if the world were at the center of the universe
  19. you would still have timezones. Consider the title a mistake on my part
  20. which I cannot correct now, can I :-)</p>
  21. </small><div class="section" id="what-s-a-timezone">
  22. <h2>What's a Timezone?</h2>
  23. <p>What's your timezone? If you respond with “UTC+X” that will be correct
  24. for this very moment, but not necessarily true over time. If you look at
  25. the timezone info database you will find that Berlin and Vienna, even
  26. though they are both in “UTC+1”, have different timezones
  27. (Europe/Berlin vs Europe/Vienna). Why is that? The reason is differences
  28. in daylight saving time and historical dates. Even if those two countries
  29. and cities nowadays have the same DST configurations, a hundred years ago
  30. that was not the case. Both Austria and Germany, for instance, went without
  31. DST for periods of time. Austria stopped in 1920, Germany did in
  32. 1918. During WWII both countries unsurprisingly had the same DST
  33. configuration, but afterwards there are a few unsynchronized years again.
  34. Germany abolished DST in 1949 and reintroduced DST in 1979, Austria
  35. abolished it in 1948 and reintroduced it in 1980. What's worse is that
  36. they did not even select the same date for the switch.</p>
  37. <p>And this pattern is quite common all around the world. For computing, DST
  38. is a huge problem. The reason is that we usually assume
  39. that time advances monotonically. With daylight saving time, during
  40. that one hour of enabling/disabling each year we either get an hour twice
  41. or we skip an entire hour. The result is log entries that appear out of
  42. order if you log with local time, for instance.</p>
  43. <p>To quote the pytz documentation:</p>
  44. <blockquote>
  45. For example, 1:30am on 27th Oct 2002 happened twice in the US/Eastern
  46. timezone when the clocks where put back at the end of Daylight Savings
  47. Time, similarly, 2:30am on 7th April 2002 never happened at all in the
  48. US/Eastern timezone, as the clocks where put forward at 2:00am
  49. skipping the entire hour</blockquote>
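<p>A minimal sketch of the repeated hour from the quote, using only fixed offsets from the standard library instead of pytz. The offsets (EDT as UTC-4, EST as UTC-5) and the two UTC instants are hand-picked for illustration; a real program would take them from the timezone database.</p>

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen by hand for this illustration (assumption):
# US/Eastern was on EDT (UTC-4) before the switch and on EST (UTC-5) after it.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two different instants, one hour apart in UTC.
first_utc = datetime(2002, 10, 27, 5, 30, tzinfo=timezone.utc)
second_utc = datetime(2002, 10, 27, 6, 30, tzinfo=timezone.utc)

# Strip the offset to get what a wall clock would have shown.
first_wall = first_utc.astimezone(EDT).replace(tzinfo=None)
second_wall = second_utc.astimezone(EST).replace(tzinfo=None)

# Both wall clocks read 01:30 although the instants differ.
print(first_wall, second_wall, first_wall == second_wall)
```

<p>A log that only records the wall clock value cannot tell these two instants apart, which is exactly how out-of-order entries appear.</p>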
  50. <p>But timezones involve more than just DST settings. Some countries have
  51. switched their means of measuring time altogether, in some cases even
  52. without entering or leaving DST. For example, in 1915 Warsaw switched
  53. from Warsaw time to Central European time. So at the stroke of midnight on
  54. August 5th 1915 the clocks were wound back 24 minutes. DST was active
  55. neither before nor after the switch.</p>
  56. <p>Much fun can be had with timezones in general. There was at least one
  57. country that at one point had a timezone that differed per day because
  58. they synchronized 0:00 with the time of the sunrise.</p>
  59. </div>
  60. <div class="section" id="where-is-the-sanity">
  61. <h2>Where is the Sanity?</h2>
  62. <p>The sanity right now is called UTC. UTC is a timezone without daylight
  63. saving time and also a timezone without configuration changes in the
  64. past. However, because our world is again this rotating geoid and
  65. something we don't really have under control, the problem of leap seconds
  66. will at one point show up. Whether UTC will then take leap seconds into
  67. account (they are irregular and therefore a problem for computing) or not
  68. (in which case each timezone will have sub-minute differences to UTC) is,
  69. as far as I know, not decided for sure yet.</p>
  70. <p>However, right now UTC is the safest bet. From UTC you can convert into
  71. any local time, but the reverse is not always possible because of what was
  72. shown above.</p>
  73. <p>So here is the rule of thumb which shall never be broken:</p>
  74. <blockquote>
  75. <strong>Always measure and store time in UTC</strong>. If you need to record where
  76. the time was taken, store that separately. Do not store the local
  77. time + timezone information!</blockquote>
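<p>One way the rule could look in code, as a rough sketch: the <cite>Measurement</cite> class and <cite>make_measurement</cite> helper are hypothetical names invented for this example, and the fixed +2 offset merely stands in for a real timezone lookup.</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Measurement:
    """Hypothetical record type used only for this illustration."""
    taken_at_utc: datetime  # naive datetime, UTC by convention
    zone_name: str          # where it was taken, stored separately


def make_measurement(aware_local: datetime, zone_name: str) -> Measurement:
    """Convert an offset-aware local time to UTC once, at the boundary."""
    if aware_local.tzinfo is None:
        raise ValueError("expected an offset-aware datetime")
    utc = aware_local.astimezone(timezone.utc).replace(tzinfo=None)
    return Measurement(taken_at_utc=utc, zone_name=zone_name)


# Assumption: a fixed +2 offset stands in for Vienna in summer.
vienna_summer = timezone(timedelta(hours=2))
m = make_measurement(
    datetime(2011, 7, 15, 10, 30, tzinfo=vienna_summer), "Europe/Vienna"
)
print(m)
```

<p>The zone name travels next to the UTC value as plain data, so later rule changes in the timezone database never alter the stored instant.</p>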
  78. </div>
  79. <div class="section" id="where-is-the-problem">
  80. <h2>Where is the Problem?</h2>
  81. <p>Now in theory this blog post should end here and we all go on with our
  82. lives. Unfortunately in Python there are a couple more things to keep
  83. in mind due to some design decisions that were made a long time ago and were
  84. not well thought through. The motivation was sound, the implications
  85. however were not.</p>
  86. <p>At one time the following decisions were apparently made for the datetime
  87. module in the standard library:</p>
  88. <ol class="arabic simple">
  89. <li>The datetime module should not ship timezone information because
  90. timezones change too often.</li>
  91. <li>The datetime module should, however, provide an API to attach timezone
  92. information to a datetime object.</li>
  93. <li>It should provide these objects: date, time, date+time, timedelta</li>
  94. </ol>
  95. <p>Unfortunately a few things went wrong. The biggest problem is that a
  96. datetime object with timezone information attached and a datetime object
  97. without that timezone information don't work together at all:</p>
  98. <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="kn">import</span> <span class="nn">pytz</span><span class="o">,</span> <span class="nn">datetime</span>
  99. <span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span>
  100. <span class="gp">&gt;&gt;&gt; </span><span class="n">b</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="n">tzinfo</span><span class="o">=</span><span class="n">pytz</span><span class="o">.</span><span class="n">utc</span><span class="p">)</span>
  101. <span class="gp">&gt;&gt;&gt; </span><span class="n">a</span> <span class="o">&lt;</span> <span class="n">b</span>
  102. <span class="gt">Traceback (most recent call last):</span>
  103. File <span class="nb">&quot;&lt;stdin&gt;&quot;</span>, line <span class="m">1</span>, in <span class="n">&lt;module&gt;</span>
  104. <span class="gr">TypeError</span>: <span class="n">can&#39;t compare offset-naive and offset-aware datetimes</span>
  105. </pre></div>
  106. <p>Ignoring the horrible API you have to use to attach timezone information
  107. to a datetime object, this leads to quite a few problems. If you are
  108. dealing with datetime objects in Python you will sooner or later start
  109. attaching and removing tzinfo objects all over the place.</p>
  110. <p>Another problem is that there are two ways to create a datetime object for
  111. the current time in Python:</p>
  112. <div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span>
  113. <span class="go">datetime.datetime(2011, 7, 15, 8, 30, 55, 375010)</span>
  114. <span class="gp">&gt;&gt;&gt; </span><span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">()</span>
  115. <span class="go">datetime.datetime(2011, 7, 15, 10, 30, 57, 70767)</span>
  116. </pre></div>
  117. <p>One gives the time in UTC, the other in local time. However, the latter
  118. will not tell you what local time is (because it does not have a timezone
  119. information object, at least before 3.3), and neither gives you a way to
  120. know which one was UTC.</p>
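<p>On newer Python versions, where <cite>utcnow()</cite> is deprecated (since 3.12), the same naive UTC value can be obtained by asking for an aware UTC time and dropping the tzinfo again. A small sketch, with variable names chosen only for this example:</p>

```python
from datetime import datetime, timezone

# Aware value: the object itself says that it is UTC.
aware_now = datetime.now(timezone.utc)

# Naive value, UTC by convention, for internal use.
naive_utc_now = aware_now.replace(tzinfo=None)

print(aware_now.tzinfo, naive_utc_now.tzinfo)
```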
  121. <p>If you convert from a UNIX timestamp into a datetime object you also have
  122. to be very careful to use the <cite>datetime.datetime.utcfromtimestamp</cite> method
  123. because the normal one, <cite>fromtimestamp</cite>, returns the result in local time.</p>
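<p>A sketch of a timestamp conversion that does not depend on the machine's local timezone. The helper name <cite>naive_utc_from_timestamp</cite> is made up for this example; it passes an explicit UTC tzinfo to <cite>fromtimestamp</cite> and then drops it, which also works on Python versions where <cite>utcfromtimestamp</cite> is deprecated.</p>

```python
from datetime import datetime, timezone


def naive_utc_from_timestamp(ts: float) -> datetime:
    """Return a naive datetime in UTC for a UNIX timestamp."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).replace(tzinfo=None)


epoch = naive_utc_from_timestamp(0)
billion = naive_utc_from_timestamp(1_000_000_000)
print(epoch, billion)
```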
  124. <p>On top of that, the library provides a <cite>time</cite> object and a <cite>date</cite> object,
  125. both of which are close to being useless when timezones are involved. The
  126. former cannot be shifted to other timezones because that would require the
  127. date component. The date itself also only makes any sense local to a
  128. timezone because what's today for me, could be tomorrow or yesterday for
  129. you thanks to the wonderful world of timezones.</p>
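<p>To illustrate why a bare date is only meaningful relative to a timezone, here is a small sketch with one instant and two hand-picked fixed offsets (UTC+13 and UTC-12, chosen as assumptions for the example rather than taken from a real timezone):</p>

```python
from datetime import datetime, timedelta, timezone

# One single instant in time.
instant = datetime(2011, 7, 15, 11, 30, tzinfo=timezone.utc)

# Two fixed offsets picked for illustration (assumption).
far_east = timezone(timedelta(hours=13))
far_west = timezone(timedelta(hours=-12))

date_utc = instant.date()
date_east = instant.astimezone(far_east).date()
date_west = instant.astimezone(far_west).date()

# The same instant falls on three different calendar days.
print(date_west, date_utc, date_east)
```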
  130. </div>
  131. <div class="section" id="what-s-the-best-practice">
  132. <h2>What's the Best Practice?</h2>
  133. <p>Now we know where the culprits are. What should we do? If we ignore
  134. theoretical problems that won't show up anyway unless we deal with
  135. historical times, there are a few best practices that make your life easier.
  136. If you ever have a problem with historic dates, there is an alternative
  137. module called <a class="reference external" href="http://www.egenix.com/products/python/mxBase/mxDateTime/">mxDateTime</a> which
  138. generally follows a better design and supports multiple calendars as well
  139. (Gregorian and Julian).</p>
  140. <div class="section" id="internally-use-utc">
  141. <h3>Internally use UTC</h3>
  142. <p>This should be a given. When you take the current time, always use
  143. <cite>datetime.datetime.utcnow()</cite>. If you are taking in user input that is in
  144. local time, immediately convert it to UTC. If that conversion would be
  145. ambiguous, let the user know. Do not blindly guess. I know every time the
  146. DST switch comes up I am setting a second analog clock and not just my
  147. phone because my iPhone failed with that conversion twice now.</p>
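<p>A hedged sketch of "convert immediately, but refuse to guess". The helper <cite>local_to_naive_utc</cite> is hypothetical and assumes a tzinfo implementation that honours the <cite>fold</cite> attribute from PEP 495 (for example <cite>zoneinfo.ZoneInfo</cite> on Python 3.9+). It is not suitable for pytz timezones, where <cite>localize(dt, is_dst=None)</cite> plays the same role. The demonstration below only uses a fixed offset, which is never ambiguous.</p>

```python
from datetime import datetime, timedelta, timezone, tzinfo


def local_to_naive_utc(naive_local: datetime, tz: tzinfo) -> datetime:
    """Convert user-supplied local time to naive UTC, refusing to guess.

    Assumes `tz` honours the PEP 495 `fold` attribute.
    """
    if naive_local.tzinfo is not None:
        raise ValueError("expected a naive local datetime")
    early = naive_local.replace(tzinfo=tz, fold=0)
    late = naive_local.replace(tzinfo=tz, fold=1)
    # Differing offsets mean the wall clock time is repeated or skipped.
    if early.utcoffset() != late.utcoffset():
        raise ValueError(f"{naive_local} is ambiguous or does not exist in {tz}")
    return early.astimezone(timezone.utc).replace(tzinfo=None)


# Assumption: a fixed +2 offset stands in for the user's timezone.
fixed = timezone(timedelta(hours=2))
utc_value = local_to_naive_utc(datetime(2011, 7, 15, 10, 30), fixed)
print(utc_value)
```

<p>The caller is expected to catch the <cite>ValueError</cite> and ask the user which of the two instants was meant.</p>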
  148. </div>
  149. <div class="section" id="do-not-use-offset-aware-datetimes">
  150. <h3>Do not use offset aware datetimes</h3>
  151. <p>It might sound like a good idea to always attach a tzinfo object, but it's
  152. actually a much better idea to not do that. If you assume that every
  153. datetime object without a tzinfo object is in UTC, that's the better
  154. solution. You can actually take advantage of the fact that you cannot
  155. compare these two, similar to how you cannot mix bytes and unicode in
  156. Python 3. Use that “API weakness” to your advantage.</p>
  157. <ol class="arabic simple">
  158. <li>Internally always use offset-naive datetime objects and consider them
  159. UTC.</li>
  160. <li>When interfacing with the user, convert to and from local time.</li>
  161. </ol>
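<p>The convention from the list above can be enforced at module boundaries with a tiny guard. This is only a sketch and <cite>require_naive_utc</cite> is a made-up name; it cannot check that a naive value really is UTC, only that no offset-aware object slipped in.</p>

```python
from datetime import datetime, timezone


def require_naive_utc(value: datetime) -> datetime:
    """Reject offset-aware datetimes; naive ones are UTC by convention."""
    if value.tzinfo is not None:
        raise TypeError("offset-aware datetime leaked into internal code")
    return value


ok = require_naive_utc(datetime(2011, 7, 15, 8, 30))
print(ok)
```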
  162. <p>Why would you not want to attach a UTC tzinfo object? First of all
  163. because the majority of libraries are written with the assumption of
  164. <cite>tzinfo</cite> == None in mind. Secondly because it was a horrible idea to have
  165. this tzinfo object in the first place, as the API is broken. If you look
  166. into the pytz library, it has to provide alternative functions for the
  167. conversion because the intended API for timezone conversions is not
  168. flexible enough to represent the majority of timezones. By not using
  169. tzinfo objects there is a chance that we can one day change to something
  170. better.</p>
  171. <p>Another reason for not using offset aware datetimes is that the tzinfo
  172. object is implementation defined. There is no standard way to transport
  173. that timezone information (with the exception of the UTC offset at that
  174. very moment) to other languages or over HTTP etc. Also, datetime objects
  175. with timezone often cause much larger pickles or broken pickles altogether
  176. depending on the implementation of that timezone object.</p>
  177. </div>
  178. <div class="section" id="rebase-for-formatting">
  179. <h3>Rebase for Formatting</h3>
  180. <p>If you then want to show the time in the user's local timezone, take that
  181. UTC datetime object, attach the <cite>UTC</cite> timezone information, look up the
  182. user's timezone, rebase to local time and format. Do not do the
  183. conversion of the timezone with the tzinfo method, which is known to be
  184. broken, but use the pytz one. Then throw away that filthy offset-aware
  185. datetime object you've created for formatting and go on with your life.</p>
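<p>The rebase steps can be sketched roughly as follows. <cite>format_for_user</cite> is a hypothetical helper, and the fixed +2 offset is an assumption standing in for a timezone object looked up from the user's settings (with pytz that would be a <cite>pytz.timezone(...)</cite> object).</p>

```python
from datetime import datetime, timedelta, timezone, tzinfo


def format_for_user(naive_utc: datetime, user_tz: tzinfo,
                    fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Attach UTC, rebase to the user's timezone, format, discard."""
    aware_utc = naive_utc.replace(tzinfo=timezone.utc)
    local = aware_utc.astimezone(user_tz)
    # Only the string leaves this function; the aware object is dropped.
    return local.strftime(fmt)


# Assumption: a fixed +2 offset stands in for the user's timezone.
user_tz = timezone(timedelta(hours=2), "CEST")
text = format_for_user(datetime(2011, 7, 15, 8, 30), user_tz)
print(text)
```

<p>Because the function returns a string, no offset-aware datetime escapes into the rest of the program.</p>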