Friday, 9 April 2010

Time zones, singletons, deployment options, names and source control

Yesterday (April 8th), the province of San Luis in Argentina declared that it wouldn't be observing a daylight saving transition after all. It was due to go back to standard time on April 11th. Thanks for the heads-up, guys. (The rest of Argentina isn't observing daylight saving at all at the moment - another decision which came with little warning.)

This highlights an important fact about time zones: they're more volatile than you might expect. This affects Noda Time in interesting ways.

Switching time zone provider

Obviously we ship with a time zone database (based on zoneinfo data) but when we eventually get to the stage of having public releases, we don't want to have to ship another release at the whim of politicians. We already have the code to load a time zone database (in our own compressed format) from a resource; we ought to expose the ability to do it from an arbitrary stream as well. (The code is there, we just need to expose it.)

So, we can then ship new time zone files and developers could use them... if they decide to care. There are two ways they could do this. First (either way) they need to load the database as an IDateTimeZoneProvider. After that, they could pass that around as a dependency everywhere that requires a time zone based on an ID. Alternatively, they could add it to the "system" time zone provider, which will make DateTimeZones.ForID(...) work.

Normally, I would absolutely say it's worth going with the dependency injection approach... but do you really want that dependency everywhere? It's not something you expect to need to mock out for tests (unlike a clock, for example). I'm torn between the "wrongness" of the singleton system provider list and the pragmatic approach it provides. Ultimately, I suspect we'll make sure that both options work (I believe we'll need to expose the built-in database as a provider, which I don't think we do at the moment) and let individual teams decide what works best for them. Even that feels slightly wrong though.

There's another awkward decision around practicality, too. If you're not going to use the built-in database, why do you even want it to be present?

Deployment options

It's extremely convenient that NodaTime comes as a single DLL with no extra dependencies. You can just drop it in, no problem. It doesn't need to read any files - just the built-in resources. That's handy in terms of sandboxing and permissions... but doesn't lend itself to having a split between code and data.

We could potentially have two different versions: a "skinny" build which doesn't include the time zone database, and a "fat" build which does. The skinny build would have automatically try to load the time zone database from some well-known place... but how?

It could be a dependent assembly - but that means it's got to be strongly named (as Noda Time itself is) which could make it hard for developers to build their own time zone database assembly if they want it to be trusted by a release build (which will be signed with a private snk file only available to core Noda Time developers).

It could be a satellite assembly, which comes with the same problems but may be slightly simpler in terms of the code to load it - and may even mean that there's no difference in the code used by the fat and skinny builds to load the "default" database.

It could be a file, but then we get back into the business of permissions and trust, as well as using reflection to find where the heck the assembly is in the first place (blech).

In short, it's all a bit of a mess - but I'm hoping we'll find some pleasant way of coping with it which makes it a breeze to replace the database without any inefficiency. Of course, even having two builds (per platform, don't forget!) is a pain to start with. More testing etc. That isn't even the limit of the problem...

Changing names

This isn't the first bit of time zone fun I've had this week. Without going into details, I've been caught out by the fact that Asia/Calcutta changed to Asia/Colombo at some point.

Currently, I don't believe Noda Time has any support for this sort of change. Depending on what's in the zoneinfo database, we may have some aliases - but I don't believe we've got any "translation" support; a sort of "if I've got this time zone name, what else might it be called?" piece of functionality.

At some point, I need to think about this further. The idea that your data could become invalid at any moment (and that new data may be unreadable by old systems) scares me.

Versioned time zone databases - and dates

All of this makes me think of source control. We're used to the idea of "a specific path, at a particular revision" (where a revision can be a number or a hash or whatever). The mess of changing time zones suggests we need to apply the same sort of idea to dates and times. So we might have a zoned instant represented by "2010-04-12T18:42:00 America/Argentina/San_Luis@2010-04-07T00:00:00Z". In other words, "the instant that we thought would be represented by April 12th 2010, 6:42pm in San Luis, when we considered it on April 7th".

Somehow, I can't see that taking off. It's worth reading Tony Finch's thoughts on the same issue. I don't know to what extent I agree with his suggestions, but they're interesting ideas.

We really have screwed things up, haven't we?

6 comments:

  1. FYI:
    "Russian time zones changed this year"
    http://news.yahoo.com/s/ap/eu_russia_killing_time

    As for the post itself, it undoubtedly needs much time to think))

    ReplyDelete
  2. My big wish is for a standard way of storing and referring to a timezone. Not just differences from UTC, but rules for when zones change.

    For example,
    UTC/Euro-DST (UK/Ireland/Portugal)
    UTC+1/Euro-DST (Most of EU)
    UTC-5/USA-DST-1987 (Eastern USA until 2006)
    UTC-5/USA-DST-2007 (Eastern USA now)

    So whenever some damn-fool politicians announce a new set of rules, they get a new ID.

    This way, I can store a UTC time-stamp and a timezone ID. Historical values will have the older timezone ID so something that's always been at "noon-local-time" won't suddenly change to 1pm when I browse the archives.

    I don't know if all the possible timezone rules that politicians might ever come up with could be fitted into 32 bits. Maybe there would need to be an IANA type organization allocating numbers. I don't know.

    But that's what I want, a standard way to refer to a timezone that won't change.

    I also want a pony.

    ReplyDelete
  3. That comment by "blog" was me, by the way.

    ReplyDelete
  4. What if you provide the current timezone database as some sort of a wiki page that can be imported locally whenever the developer chooses (on demand, only during the build etc..).

    ReplyDelete
  5. Cound you add the additional dll's into a main dll as "Embedded Resources"? You can then use the techninque that Jeffrey Richter talks about here http://blogs.msdn.com/microsoft_press/archive/2010/02/03/jeffrey-richter-excerpt-2-from-clr-via-c-third-edition.aspx

    ReplyDelete