Monday, February 22, 2010

WestHost Data Center Outage


On Saturday, February 20th, at approximately 2:20 p.m. MDT, the server your account is on experienced a hardware failure as a result of an annual fire system inspection at our Data Center (DC). An inadvertent release of Inergen was triggered in the DC environment by an actuator that was not removed as required in the DC's pre-test checklist. As a result, the entire DC was impacted at some level, although not all servers were affected. This error was not the result of a mistake by WestHost employees, our hardware, or our systems; it was caused by a vendor's error. That said, we take responsibility and are doing everything we can to restore all systems completely.

Multiple servers were impacted by this outage: some have had damaged hardware replaced, while others need to be restored from backups. A large majority of the affected servers have since had all services restored and are back online. A few remaining servers are still experiencing problems, and we are doing everything we can to restore them as quickly as possible.

UPDATE: 6:12 p.m.
At present, we have restored service to all but 12 shared and 6 dedicated servers. To our great concern, your account resides on one of the servers still impacted by this outage. Our team of Systems Administrators and Engineers is continuing its work to restore the remaining servers and has been working non-stop for over 48 hours with little or no sleep. We continue to place the highest level of urgency on restoring these remaining servers. This is our number one priority at all levels within the organization.

