Scott Hanselman

Back to Basics: Explore the Edge Cases or Date Math will Get You

January 3, '09 Comments [21] Posted in Back to Basics

Disclaimer: I don't work for the Zune team and I don't know anyone on the team. I think that Z2K9 was a bummer, but I don't have any inside knowledge. Everything here came from the public interweb.

Dates will get you every time. Furthermore, it's all about Edge Cases. This is one of the things you'll think about when doing Test Driven Development and it's one of the things that everyone learns in Computer Science 101. You really have to hit those edge cases, be they dates, or number overflows, or buffer overflows.

The news reports:

"A bug in the internal clock driver related to the way the device handles a leap year affected Zune users," said the company in a statement. "That being the case, the issue should be resolved over the next 24 hours as the time change moves to January 1, 2009."

Disclaimer*2: I have no idea if the following is 100% true, only that it seems quite plausible. I present it for educational purposes, nothing else. It's very interesting. Again, I'm just a caveman.

A Zune Fan poked around in the source code from the vendor (Freescale Semiconductor) that made the real time clock in the Zune (The vendor's source for rtc.c is here) and with the benefit of hindsight, noted that there's the opportunity to get stuck in an infinite loop.

The Zune 30 stores the date and time, as do many devices, as the number of days and seconds since January 1st, 1980 at midnight.

year = ORIGINYEAR; /* = 1980 */

while (days > 365)
{
    if (IsLeapYear(year))
    {
        if (days > 366)
        {
            days -= 366;
            year += 1;
        }
    }
    else
    {
        days -= 365;
        year += 1;
    }
}

The days variable is read out of the memory location managed by the Real Time Clock (RTC). When the value of days == 366, you break out of the inner loop, but you can never get out of the outer loop.

A number of folks have blogged about the bug, their analysis and how they'd fix it. (Programming Phases has a good post and folks have twittered suggestions.) The basic problem is that when the calculation reaches the year 2008 there are exactly 366 days remaining. It IS a leap year, but days is not > 366, so nothing is subtracted and the year is never advanced. The value of days stays stuck at 366, which is still > 365, so the while loop never exits.
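For illustration, here is one way the loop could be restructured so that the last day of a leap year terminates. This is a sketch, not the vendor's fix. The names is_leap_year and days_to_year are mine, the leap-year rule is the standard Gregorian one (the post doesn't show IsLeapYear), and I'm assuming days is a 1-based count, so that 366 remaining in a leap year means December 31.

```c
#include <assert.h>

#define ORIGINYEAR 1980

/* Standard Gregorian leap-year rule. Assumption: the vendor's IsLeapYear
   is not shown in the post, so this stands in for it. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Converts a 1-based day count since January 1, ORIGINYEAR into a year
   and a 1-based day of that year. Hypothetical helper, not vendor code. */
static void days_to_year(int days, int *year_out, int *day_of_year_out)
{
    int year = ORIGINYEAR;

    assert(days >= 1);

    for (;;) {
        int days_in_year = is_leap_year(year) ? 366 : 365;

        /* The remaining days fit inside this year, so we are done.
           This is the exit the original loop lacks when days == 366. */
        if (days <= days_in_year)
            break;

        days -= days_in_year;
        year += 1;
    }

    *year_out = year;
    *day_of_year_out = days;
}
```

Because the exit test compares against the length of the current year rather than a fixed 365, every pass either stops or subtracts something. Under the 1-based assumption, day 10593 comes out as year 2008, day 366, and day 10594 as year 2009, day 1.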

In working with banking software for years, I can tell you that dates'll get you. When dealing with dates and date math you can't underestimate the value of really good code coverage. Also, as I learned in my interview with Quetzal Bradley, even 100% coverage just tells you a line of code DID run; it doesn't tell you that it ran correctly.
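To make the coverage point concrete, here is a sketch that wraps the loop from the post in a testable function. The function name, the max_iterations cap and the return codes are my additions so a test can observe the hang instead of hanging itself, and is_leap_year is again an assumed stand-in for the vendor's IsLeapYear.

```c
#include <assert.h>

#define ORIGINYEAR 1980

/* Assumed stand-in for the vendor's IsLeapYear (Gregorian rule). */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* The loop from the post, unchanged except for a hypothetical iteration
   cap. Returns 0 and writes the year if the loop exits on its own,
   or -1 if the cap is hit (the loop would have spun forever). */
static int buggy_days_to_year(int days, int max_iterations, int *year_out)
{
    int year = ORIGINYEAR;
    int iterations = 0;

    assert(max_iterations > 0);

    while (days > 365) {
        if (++iterations > max_iterations)
            return -1;

        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }

    *year_out = year;
    return 0;
}
```

A single call with days = 800 walks through the leap-year branch (1980) and the non-leap branch (1981), returns 0 with year 1982, and touches every line of the original loop. A call with days = 366 returns -1. The first test alone gives full line coverage and a green result, and says nothing about the edge case.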

I noticed also when I visited my Live home page on Dec-31 that it said today was Jan-1, likely a Time Zone glitch. However, when I clicked the date, I was taken to a page with historical facts about Dec-31. ;)

Dates, especially when time zones are added in, are notoriously hard.

To this day there are a half-dozen bugs in DasBlog where we have a devil of a time with Time Zones. Omar spent weeks fighting with them before he just gave up. We have to reconcile the local time of the visitor, the time on the server, and GMT (the time we store everything in). It usually works, but when the Server isn't on GMT we get into all sorts of trouble. We also have problems with clients that call the XML-RPC APIs, some of which use UTC (GMT) and some of which use local time. Other than keeping an internal table of wrong-headed clients, there are no good solutions.
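The "store in GMT, convert at the edges" idea can be sketched with nothing but the standard C library. This is not DasBlog's code (DasBlog is .NET); format_utc and format_local are hypothetical helpers, and the sketch assumes time_t counts seconds since 1970-01-01 00:00:00 UTC, which holds on POSIX systems and Windows but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Formats an instant as an ISO 8601 UTC string for storage.
   Returns 0 on success, -1 on failure. Hypothetical helper. */
static int format_utc(time_t t, char *buf, size_t len)
{
    struct tm *utc;

    assert(buf != NULL && len > 0);

    utc = gmtime(&t);               /* broken-down time in UTC */
    if (utc == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", utc) > 0 ? 0 : -1;
}

/* Formats the same instant in the machine's local zone, for display only.
   The result depends on the machine's time zone settings, so it should
   never be written back to storage. */
static int format_local(time_t t, char *buf, size_t len)
{
    struct tm *local;

    assert(buf != NULL && len > 0);

    local = localtime(&t);          /* broken-down time in the local zone */
    if (local == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S %Z", local) > 0 ? 0 : -1;
}
```

Under the epoch assumption, format_utc with the value 1230952139 produces 2009-01-03T03:08:59Z regardless of where the server sits, while format_local produces whatever the machine's zone dictates. The trouble described above starts when the second kind of string is the one that gets stored or exchanged.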

In 2007 CNN talked to a Major General about a bug in some F-22s that caused them to malfunction when crossing the International Date Line. The Unofficial Apple Weblog had an interesting post on Apple Date/Time bugs through the years. If you're interested in working with and maintaining legacy code, I recommend Michael Feathers' Working Effectively with Legacy Code.

As a random slightly-related aside, I was over at Hollywood Video buying a used copy of Mirror's Edge yesterday and some folks were talking about the Zune problem. The manager said, "Wow, I'm running the music in here on a Zune right now," and produced his brown 30G Zune from behind the counter. Either he hadn't freshly turned it on that day or its clock was wrong, so he was able to weather the whole day without an incident. He seemed pretty pleased about his "survival."

I wonder how many other less widespread devices are running this real time clock and if any of them had trouble as well.

Do you have any personal Date/Time stories, Dear Reader? Please share in the comments!

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.

Saturday, January 03, 2009 3:08:59 AM UTC
Hmm. Maybe this is also why my alarm clock stopped working on January 1st.
Morten
Saturday, January 03, 2009 3:28:53 AM UTC
Leap second is another time problem. Many systems do not deal with it.
Saturday, January 03, 2009 4:25:45 AM UTC
Scott, minor point -- it's Michael Feathers' Working Effectively with Legacy Code, not Bob Martin.
John St. Clair
Saturday, January 03, 2009 6:05:59 AM UTC
You should put the alt text from the xkcd comic. That's always the best part!
Saturday, January 03, 2009 8:48:45 AM UTC
I didn't think anyone actually owned a 30G Zune. And apparently some people use them. Color me totally surprised.
The important point about this is that MSFT got more recognition for Zune from this event than all the other marketing activity they've done combined. Now people actually know what a Zune is.

Makes me think I should plan what bugs should be in the code to help market the product.
Saturday, January 03, 2009 12:03:08 PM UTC
When the value of days == 366, you break out of the inner loop, but you can never get out of the outer loop.


Nitpicking, but the inner part isn't a loop :)

I guess the classic case of this is the Y2K hysteria. Sweden has something like social security numbers that include the birth date but only with two-digit years. So I spent the better part of the 1st of January 2000 at my old job fixing calculations for people's ages.
Saturday, January 03, 2009 1:04:11 PM UTC
Oh no.... the leap second!

My personal date and time issue is related to my new laptop. When daylight saving time came around, first, Vista didn't seem to have the new date for this all set. I manually adjusted the clock and COULD NOT get it to stick with the new time... I changed the time in Windows, changed the time in the BIOS, synced with an NTP server... it would stay changed until I rebooted, or even just set back seemingly at random. I was grumbling about Microsoft and then I found I had the exact same problem with Linux on the system! And then I was messing with time zones and I ended up with my times off by an hour here, two hours there... an hour back, two ahead...

Finally I got it set right. Then, I went on a trip to another time zone... and was foolish enough to mess with the time. Now, Windows is an hour behind, Linux is an hour ahead, and I don't even know what the BIOS thinks the time is.
Saturday, January 03, 2009 1:31:37 PM UTC
A bug that has caught me out in the past is that the year before 1AD is 1BC.
This had caused leap years to be incorrectly computed (1BC and 5BC are leap years, not 4BC).

Luckily these cases have only ever appeared in unit tests.
Saturday, January 03, 2009 1:55:21 PM UTC
Not all calendars are Gregorian by default:
http://www.hanselman.com/blog/DontUnderestimateThePowerOfToStringIFormatProvider.aspx
RichB
Saturday, January 03, 2009 4:23:43 PM UTC
Times/dates are indeed a bitch. I work with code that used to assume that time zones are a whole number of hours from UTC. That was an easy fix.

Related code uses time stamps that are modulo 24 hours (to fit in 17 bits) which must be converted (based on some other hints and properties of the system) into epoch times. Complicating all this is the combination of system (server) time, user preferred time zone, and the time zone of the end user's computer. This often came to grief around daylight savings switches, but I may have finally nailed it.

The fundamental issue in our implementation was that while the standard C library has localtime() and gmtime(), which convert an epoch time to a tm structure (with elements for year, month, day, etc.), there is only mktime() to go the other way around, and it converts using the local machine settings. We needed one that didn't do this; effectively mkgmtime(). Microsoft provides _mkgmtime(), but the software was originally cross-platform. Hopefully next April we will see this problem gone for good.
Nico
Saturday, January 03, 2009 4:51:10 PM UTC
My Belkin wireless router stopped working (the wireless part) on Jan 1st. Actually, it was broadcasting SSID and all but wouldn't authenticate any client or time out when connecting via wifi. I decided to wait and see if it would fix itself in 24hrs (I thought it had zuned too) but it didn't. Yesterday I switched it off/on and it came back. Maybe something to do with timestamps, security tokens, etc.
Saturday, January 03, 2009 5:57:41 PM UTC
My personal programming goal is to avoid the project path that deals with date/time. It always, always, always sucks, takes longer than everyone else thought, and is frustrating as hell.

Other than that, no problem.
Sunday, January 04, 2009 12:09:52 AM UTC
I am on a mission to stop developers using numeric date formats.

In the days of the WORLD wide web, a date such as 10/12/08 can mean October 12th 2008 in some parts of the world, and 10th December 2008 in other parts of the world. Even if some developers use the local date format others do NOT and therefore you can never be quite sure (unless it is after the 12th of the month).

I suggest that developers use a format such as Oct-10-2008 or 10-Dec-2008. Of course this does not quite satisfy everyone as the three letter month format is in English. In Germany it is different, and in France it is different again. However, if someone now uses the local settings it is obvious and the date is no longer ambiguous. Perhaps a little flag could be displayed beside the date to signal the culture being used.
Sunday, January 04, 2009 2:43:16 AM UTC
Microsoft/Zune is not alone on this one.

It seems no one [at least not media] noticed, but some Sony Ericsson mobile phones died in a similar fashion on New Year's Eve. My wife's phone was one of them. It wouldn't boot so she bought a new one thinking the old one was bricked. Lo and behold, comes 2009 and the old phone works again... I wonder if Sony Ericsson used the same clock/driver as MSFT...?
Monday, January 05, 2009 12:25:05 AM UTC
Does anybody really know what time it is? Does anybody really care?
-Rich
Monday, January 05, 2009 4:20:37 AM UTC
Not so much a bug, but from working with financial software for years, we always run into problems with fiscal years. I can't tell you how often I have to add three months to get the "year" for government things, but then use a different addition for specific corporations whose FY ends in April or July. This always makes for a harder transition than necessary when bringing in new developers who aren't familiar with fiscal years.

This gets even more confusing when you're dealing with fiscal calculations against a service that is offered on a calendar year basis.
Monday, January 05, 2009 10:33:18 AM UTC
Check this out: http://siderite.blogspot.com/2006/05/weirdest-javascript-date-daylight.html. A bug that a year later I couldn't reproduce, mind you. Maybe some windows update fixed it, I don't know.
Monday, January 05, 2009 3:44:44 PM UTC
There are some funny bugs in the software of a Belgian Mobile Phone operator.

When daylight saving time is switched you get some nice results.

When 2AM becomes 3AM, and you start the call at 1:59AM and end it at 2:01 (which did not occur), which has in the meantime become 3:01, you get to pay for a call of 1 hour and 2 minutes.

When 3AM becomes 2AM, and you start a call at "the first" 2:50 AM and end it at "the second" 2:55 AM, you made a call of 1 hour 5 minutes, but only have to pay for 5 minutes. Even more funny: when you start the call at "the first" 2:50AM and end it at "the second" 2:05, you made a call of 15 minutes, but actually never have to pay anything. The software "thinks" you made a call with negative duration, so it gets filtered by the rating engine. Actually, if it didn't get filtered, you would be paid, so make sure you call a 0900 or so... unless they fixed the bug by now.
Thursday, January 08, 2009 9:28:15 AM UTC
Another time FAIL this one from Oracle:
http://www.theregister.co.uk/2009/01/07/oracle_leap_second/
Thursday, January 08, 2009 6:26:07 PM UTC
Wait till you start working with "Broadcast" calendars. Apparently Dec 31 '08 is in the Jan '09 broadcast month. ugh
Alex
Monday, January 12, 2009 7:56:06 AM UTC
I'm happy to learn that I am not the only one facing these issues. I run a news site that shows timestamps in Pakistan Standard Time, but the server hosting it is on EST and the development box is in a different time zone altogether, so testing and developing makes things even trickier (though it ensures the site is portable). When DST kicks in, matters get even worse, because the time zone for the published news items is Pakistan local time, which practises DST on a different set of dates. On top of that, Windows did not update some time zone tables for Pakistan's DST dates, so I had to manually figure that stuff out. Things got slightly tricky again when I represented the dates in human readable format (x hours ago, x days ago), because the 'x hours ago' needed to be the difference between Pakistan time normalized (after DST adjustments) to UTC and the web host's local time normalized to UTC.

Disclaimer: The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.