Database Outage
Incident Report for Retreaver
Resolved
We have identified the root cause and deployed a fix. A sudden surge of thousands of calls at 14:10 EST exposed an engineering deficiency in our call handling that caused our database to lock. We are now conducting a thorough code review to ensure that the same deficiency does not exist in other parts of the codebase.
Posted Feb 24, 2016 - 16:50 EST
Update
At approximately 14:10 EST we were alerted to a surge in CPU usage on our primary database server. Unable to locate the cause, we manually failed over to our backup server at 14:16 EST. The failover succeeded, and operations returned to normal 4 minutes later. We are working to identify the root cause and will update this incident within the next 2 hours.
Posted Feb 24, 2016 - 15:06 EST
Investigating
We're investigating an outage affecting our primary database server. We will provide updates shortly.
Posted Feb 24, 2016 - 14:27 EST