How many days have passed since 1900-01-01?
It depends…
Wait, how can the answer be that it depends?
The answer depends on who you ask, Python or Excel.
45728 vs 45729. But why is that?
It is because of a bug/feature in Excel (and some other apps).
The Leap Year Bug
Lotus 1-2-3 was a spreadsheet app from Lotus Software (later IBM). It was released on 26 January 1983 (15388 days ago, according to my spreadsheet), and IBM shut it down in 2013. It looked like this:
It was popular in the ‘80s, just like Excel is today. But there was a big issue with it.
Lotus 1-2-3 assumed that 1900 was a leap year. But it was not.
To define a leap year, we can use this logic:
Divisible by 4 → It’s a leap year
Except if it’s also divisible by 100 → it’s not a leap year
Unless it’s divisible by 400 → it is a leap year
Or logically:
(Y % 4 == 0) and ((Y % 100 != 0) or (Y % 400 == 0))
1900 is divisible by 4 and 100 but not by 400 → Not a leap year.
2000 is divisible by all of them, so it is a leap year.
The engineers forgot about this, so Lotus had this bug.
Excel deliberately replicated the bug to maintain compatibility with Lotus, and to maintain compatibility with previous versions, Excel still has the bug.
What about Python?
Python doesn’t have this bug, so we have the differences in the results above.
Let’s check what happens if we want to import a dataset into pandas with the fake leap year in it.
Book1 doesn’t contain the Excel bug, while Book2 does.
No issue with Book1 as it is a regular CSV, but for Book2, it cannot convert the column to a datetime datatype; it remains an object.
The difference is also visible on the outputs of the dfs.
Here is another example from Python:
from datetime import date
today = date.today()
later = today.replace(year = today.year + 1)
If today is 2025-03-14, then:
today
becomesdate(2025, 3, 14)
later
becomesdate(2026, 3, 14)
since we added one year. This code works until you run it on February 29.
Generally, the leap year bug has two categories.
Those that lead to error conditions, like the python code above.
Those that lead to incorrect data, bad aggregations, or “off by one” problems.
Code and apps may crash due to the errors raised by Category 1. They are hard to miss.
Category 2, however, is the silent killer; it will not crash apps but can be dangerous if not handled correctly:
Calculating “yearly averages” without accounting for that extra day in a leap year (366 vs. 365 days).
Querying a database for events between February 1 and February month end, but forgetting to involve February 29 for leap years.
Add “1 year” to February 29, and you land on March 1 or February 28. This can impact billing, invoices, etc.
The bug can cause huge issues.
In 2024, for example
Payment terminals at ICA grocery and pharmacy stores across Sweden failed to process transactions on February 29th.
Street lighting in Paris, France, was inadvertently turned off at midnight at the start of February 29th.
Some more examples here.
Why no correction?
According to Microsoft, the disadvantages of changing this behaviour outweigh the advantages.
With a change, all the dates would decrease by 1 day as they would take out a previous day from the system.
Since the compatible applications are “used to” this behaviour, changing it would require changes in all other programs as well.