We have added another layer of monitoring via the third-party service NodePing, as discussed in the last announcement. You can view the status page at: https://nodeping.com/reports/status/RIP4WW2JRY This shows the total uptime % of a variety of backend services. Select production services are now on this service as well, being additionally ... Read more »
A small number of servers have gone down; it looks like an electrical issue / brownout on that section. Sometimes that causes a situation where remote reboot does not function: remote reboot still keeps standby power for the chipset etc., which can cause the server to enter a state where it does not remote reboot, as the power is not physically being ... Read more »
More than 33 servers were down for several hours. Initially we assumed a power distribution issue on one of the racks, which was partially correct. Upon closer inspection, the TOR switch for that rack had a failed PSU. That failed PSU has now been swapped out and the switch is back online; monitoring is reporting that all nodes are back online as ... Read more »
This time our billing system is being attacked via a DDoS intended for resource exhaustion. Hence our billing system is working a little slower than usual right now; we are working on it. First the spam attacks, now this. This type of thing always tends to happen when we are running specials. Current list of shame -- will update later, these are the ... Read more »