Minimizing Read-Write MySQL Downtime - Yelp
engineeringblog.yelp.com
An insightful story from Yelp's engineering team on how they manage MySQL failure detection and execute automated recoveries to minimize the downtime of their read-write MySQL traffic.
Replacing a primary server is sometimes necessary because of planned or unplanned events, such as an operating system upgrade, a database crash, or a hardware failure.
The story details the replacement procedure and the tools the team used to carry it out.