Five Lessons Learned The Hard Way From The Global Blue Screen Of Death
DevOps.com, Thursday, July 25th, 2024
As a product management professional, I am trained to understand customer pain, especially for customers who struggle with downtime caused by poor software quality. Last Friday, though, I experienced the problem firsthand when I arrived at Heathrow Airport early, only to discover that the Microsoft outage had grounded flights across the world.
My flight from London was only delayed a couple of hours, but my connecting flight was canceled completely. Since it was the last flight of the day to my hometown, I had to find a hotel room for the night, but all of the rooms within 30 miles of the airport were sold out. All of this happened because of a bug in one small file, not much larger than this article, that impacted millions of people across the world.
Most software releases contain defects. I have never seen production code that didn't have issues, especially in enterprise-level applications. The defects that reach production are usually cosmetic issues or minor bugs that affect a seldom-used feature. Defects like the one that took down the airlines, however, are another situation entirely. Now it's time to consider what lessons we can learn from this event.