SharkintoshBlog

Tech Horror Stories

True stories from the trenches.

I Deployed on a Friday at 4:59 PM and Lived to Regret Every Life Choice That Led Me There

Friday 4:59 deploy. Homepage raw JSON. Database migration ran on one replica. 47-hour weekend recovery. Never again.

The Load Balancer Was Sending All Traffic to the Staging Server for 6 Months and Users Preferred Staging

Load balancer sent all traffic to staging for 6 months. Users loved staging. It had features we never shipped. Awkward.

Our Disaster Recovery Plan Was a Text File That Just Said Call Dave

DR plan was a text file: Call Dave. Dave changed his number. The new number was also in the text file. We updated neither.

The Database Index Was Missing for 3 Years and We Only Noticed Because the Table Got Too Small

Missing index discovered because we deleted so many rows the query planner accidentally used a different, correct index.

Our Error Logging Service Went Down So We Had No Idea Everything Else Was Down Too

Error logging silently went down. So did everything else. We found out when the CEO called because the website was white.

I Ran a Load Test Against Production Thinking It Was Staging and the Site Handled It Better Than Expected

Load tested production by accident. 47,000 concurrent users. Zero errors. Staging crashes at 47 users. We use staging less now.

Our Application Depends on a Package Maintained by a Single Developer in Belarus Who Has Gone Missing

The Belarusian dev maintains a package that handles auth for 47,000 apps. Last commit: 2018. The internet runs on faith.

The Entire Backend Was a Set of Excel Macros That a Finance Intern Wrote in 2014

Excel macros from a 2014 intern run production. The intern is now a VP at Google. The macros still work. Nobody touches them.

We Had Two Production Databases and Nobody Knew Which One Was Real

Two production databases. Different data. We flipped a coin during deploys. The coin was a d20 with 14 faces labeled staging.

The Cron Job That Was Supposed to Run Every Hour Has Been Running Every Second Since 2019

The cron job ran every second for 5 years. It sent 157 million reminder emails. Zero users clicked anything.