This is mostly a question for the PWE staff, but I'm sure the community will weigh in as well.

Now that the 5+ hour outage on 5/2/2013 has finally been resolved, could we get some comments about what happened, and some reassurance about what's being done to prevent a repeat?

Was it a piece of networking gear that failed? A server? A connectivity issue that was more or less outside of your control? Something that caused multiple cascading failures?

Also, as a general question, what is your uptime goal for STO? It's clearly not the "five nines" (99.999% uptime) standard often cited for major online services, since that allows only a little over 5 minutes of downtime per year, and the weekly maintenance alone is far more than that.
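For anyone who wants to check the arithmetic behind the "five nines" figure, here is a rough sketch in Python. It assumes a 365-day year, and the function names are my own, not anything PWE uses. It only counts the single 5-hour outage, since I don't know the actual length of the weekly maintenance windows.

```python
# Rough uptime arithmetic. Assumes a non-leap year (365 days).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(target_pct: float) -> float:
    """Minutes of downtime per year permitted by an uptime target (in percent)."""
    return MINUTES_PER_YEAR * (1 - target_pct / 100)


def availability_from_downtime(downtime_minutes: float) -> float:
    """Yearly uptime percentage implied by a total amount of downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    # Downtime budget for the common "N nines" targets.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows {minutes:.2f} minutes/year")

    # Uptime for the year if the 5-hour outage were the ONLY downtime.
    print(f"One 5-hour outage: {availability_from_downtime(5 * 60):.4f}% uptime")
```

By this math, a single 5-hour outage on its own already uses up more than the entire yearly budget of a 99.99% target, before any planned maintenance is counted.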

I understand that you run a business, and while you could theoretically have 5 independent data centers all over the globe, any one of which is capable of running the entire system, that would be prohibitively expensive. But neither are you just running the server on a single box plugged into a cable modem. There's a balance point somewhere in between, and it's different for every company and situation.

My question is: how much downtime (planned and unplanned) do you consider "acceptable"? I really hope that a 5-hour unplanned outage is as unacceptable to you as it is to the players, as reflected by the 200 posts/hour on the two different threads about the outage.