Leap year math blather
[Feb. 29th, 2012|11:43 am]
Here's a question that I have had to answer for work: if you want to compare the weather across multiple years, what do you do about leap days?
Frequently, people dodge around the problem by looking at monthly averages. The months already differ in length by a couple days, so an extra day in February here and there just vanishes into the noise.
But what if you want to look at timescales shorter than a month? What if you need to deal with daily data (which you do if you want to look at phenomena such as heat waves), and you know that decomposing the data into a static background rate (the climatology) plus a time-varying deviation (the anomaly relative to climatology) will make the math work a lot better?
The answer is: switch your (implicit) units from "day of the year" to "orbit angle" by tallying up the days since some base date, dividing by 365.25, and keeping only the fractional part of the result. (Multiply by 2π if you want radians.)
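As a rough sketch of that conversion in Python (the function names and the March 1st, 2000 base date are my own placeholders, not anything canonical; any fixed base date would do):

```python
from datetime import date
import math

YEAR_DAYS = 365.25  # the figure used in this post for one full orbit, in days


def orbit_fraction(day, base=date(2000, 3, 1)):
    """Fraction of an orbit (0 <= f < 1) completed since the base date.

    The base date is an arbitrary, illustrative choice.
    """
    elapsed = (day - base).days          # whole days since the base date
    return (elapsed / YEAR_DAYS) % 1.0   # keep only the fractional part


def orbit_angle(day, base=date(2000, 3, 1)):
    """The same quantity expressed in radians."""
    return 2 * math.pi * orbit_fraction(day, base)


print(orbit_fraction(date(2001, 3, 1)))  # one calendar year on: just short of a full orbit
print(orbit_fraction(date(2004, 3, 1)))  # four calendar years on: back to 0.0
```

Python's `%` returns a non-negative result for a positive divisor, so dates before the base date also land in the 0-to-1 range.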
This works because it takes very nearly 365.25 days for the Earth to travel once around the sun and return to the same point in its orbit. (The true figure is closer to 365.2422 days, but 365.25 matches the every-fourth-year leap rule, which holds without exception from 1901 through 2099.) We only count 365 days in a calendar year, so if you start on March 1st of one year and track the Earth all the way around its orbit, on March 1st of the next year it's actually a quarter-day behind in the track. After another year it's a half-day behind, and after four years, at midnight at the end of February 28th, the Earth is a full day behind where the calendar says it should be, so we intercalate an extra day to let it catch up and realign our March 1sts.
What this means is that if you're comparing days year-to-year, once you consider this lag there are actually four different sub-dates for any given date. You shouldn't compare June 30th of 1986 with June 30th of 1988 because they're not the same thing; the orbital position for June 30th in 1986 is actually halfway between the orbital positions for June 29th and June 30th in 1988. In effect, in 1986 you should actually be calling it June 29.5, not June 30. (Note -- it may be tempting to approximate orbital position by just subtracting 0.25 from the day-of-year in each successive year, but that's probably not the best idea. It will sort of work, but only if you count starting from the leap day as your baseline, and it's easy to mess up.)
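To check the June 30th arithmetic, here is a small sketch that measures each date's position in days along the orbit. The base date (March 1st, 1984, the day after a leap day) and the name `orbit_day` are just illustrative choices of mine:

```python
from datetime import date

YEAR_DAYS = 365.25
BASE = date(1984, 3, 1)  # illustrative base date: the day after a leap day


def orbit_day(day, base=BASE):
    """Position along the orbit, in days since the base date, modulo one orbit."""
    return (day - base).days % YEAR_DAYS


for year in range(1984, 1989):
    print(year, orbit_day(date(year, 6, 30)))
# 1984 121.0
# 1985 120.75
# 1986 120.5
# 1987 120.25
# 1988 121.0

print(orbit_day(date(1988, 6, 29)))  # 120.0
```

June 30th of 1986 comes out at 120.5, halfway between June 29th (120.0) and June 30th (121.0) of 1988, and the four distinct values 121.0, 120.75, 120.5, and 120.25 are the four sub-dates.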
It might seem that things are now an even bigger tangle, but really all you have to do is stop thinking of the problem as dropping values into successive bins and averaging them up, and instead view it as fitting a curve to a bunch of sampling points at arbitrary locations along the x-axis. Easy-peasy!
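One way the curve-fitting view could look in practice is sketched below. I haven't said which curve to fit; a low-order Fourier series fitted by least squares with NumPy is simply one convenient assumption, and every function name and the number of harmonics here are made up for illustration:

```python
import numpy as np

YEAR_DAYS = 365.25


def orbit_phase(days_since_base):
    """Fraction of an orbit (0 <= f < 1) for each day count since the base date."""
    return (np.asarray(days_since_base, dtype=float) / YEAR_DAYS) % 1.0


def design_matrix(phase, n_harmonics):
    """Columns: constant, then a cos/sin pair for each harmonic of the orbit."""
    phase = np.asarray(phase, dtype=float)
    cols = [np.ones_like(phase)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.cos(2 * np.pi * k * phase))
        cols.append(np.sin(2 * np.pi * k * phase))
    return np.column_stack(cols)


def fit_climatology(days_since_base, values, n_harmonics=2):
    """Least-squares fit of the harmonic curve to samples at arbitrary phases."""
    A = design_matrix(orbit_phase(days_since_base), n_harmonics)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(values, dtype=float), rcond=None)
    return coeffs


def evaluate_climatology(coeffs, phase):
    """Evaluate the fitted curve at the given orbit phases."""
    n_harmonics = (len(coeffs) - 1) // 2
    return design_matrix(phase, n_harmonics) @ coeffs
```

The anomaly for each day is then the observed value minus `evaluate_climatology(coeffs, orbit_phase(days))`. Because the fit only sees phases, the quarter-day offsets between years need no special handling.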
(If you really really wanted to use bins, you could still do it. Use four times as many bins, hit only a quarter of them each cycle, and then recombine them with a width-7 triangular window at the end to reduce them back down to 365 bins total.)
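Here is one possible reading of the bin version, sketched in plain Python. The details are assumptions on my part: 1461 quarter-day bins per orbit, window weights of 1-2-3-4-3-2-1 centred on every fourth fine bin, wrap-around at the ends of the year, and empty fine bins simply skipped.

```python
FINE_BINS = 1461                  # 4 * 365.25 quarter-day bins per orbit
WEIGHTS = (1, 2, 3, 4, 3, 2, 1)   # assumed shape of the width-7 triangular window


def fine_bin(days_since_base):
    """Quarter-day bin for an integer day count: 4 * (days mod 365.25)."""
    return (4 * days_since_base) % FINE_BINS


def binned_climatology(days, values):
    """Average into quarter-day bins, then smooth down to 365 daily bins."""
    sums = [0.0] * FINE_BINS
    counts = [0] * FINE_BINS
    for d, v in zip(days, values):
        b = fine_bin(d)
        sums[b] += v
        counts[b] += 1

    daily = []
    for j in range(365):
        num = den = 0.0
        for offset, w in zip(range(-3, 4), WEIGHTS):
            b = (4 * j + offset) % FINE_BINS   # wrap around the year boundary
            if counts[b]:                      # skip fine bins with no data
                num += w * sums[b] / counts[b]
                den += w
        daily.append(num / den if den else float("nan"))
    return daily
```

A single calendar year of daily data touches 365 or 366 of the 1461 fine bins, and a full four-year cycle of 1461 consecutive days touches each of them exactly once.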
I don't expect anybody besides me will find this particularly useful, but sometimes I find it helpful to write out explanations of these kinds of things, and I felt like sharing.