Tech Horror Stories
True stories from the trenches.
I Deployed on a Friday at 4:59 PM and Lived to Regret Every Life Choice That Led Me There
Friday 4:59 deploy. Homepage raw JSON. Database migration ran on one replica. 47-hour weekend recovery. Never again.
The Load Balancer Was Sending All Traffic to the Staging Server for 6 Months and Users Preferred Staging
Load balancer sent all traffic to staging for 6 months. Users loved staging. It had features we never shipped. Awkward.
Our Disaster Recovery Plan Was a Text File That Just Said Call Dave
DR plan was a text file: Call Dave. Dave changed his number. The new number was also in the text file. We updated neither.
The Database Index Was Missing for 3 Years and We Only Noticed Because the Table Got Too Small
Missing index discovered because we deleted so many rows the query planner accidentally used a different, correct index.
Our Error Logging Service Went Down So We Had No Idea Everything Else Was Down Too
Error logging silently went down. So did everything else. We found out when the CEO called because the website was white.
I Ran a Load Test Against Production Thinking It Was Staging and the Site Handled It Better Than Expected
Load tested production by accident. 47,000 concurrent users. Zero errors. Staging crashes at 47 users. We use staging less now.
Our Application Depends on a Package Maintained by a Single Developer in Belarus Who Has Gone Missing
The Belarusian dev maintains a package that handles auth for 47,000 apps. Last commit: 2018. The internet runs on faith.
The Entire Backend Was a Set of Excel Macros That a Finance Intern Wrote in 2014
Excel macros from a 2014 intern run production. The intern is now a VP at Google. The macros still work. Nobody touches them.
We Had Two Production Databases and Nobody Knew Which One Was Real
Two production databases. Different data. We flipped a coin during deploys. The coin was a d20 with 14 faces labeled staging.
The Cron Job That Was Supposed to Run Every Hour Has Been Running Every Second Since 2019
The cron job ran every second for 5 years. It sent 157 million reminder emails. Zero users clicked anything.