This is mostly a question for the PWE staff, but I'm sure the community will weigh in as well.
Now that the 5+ hour long outage on 5/2/2013 has finally been resolved, could we get some comments about what happened, and some reassurances about what's being done to prevent it in the future?
Was it a piece of networking gear that failed? A server? A connectivity issue that was more or less outside of your control? Something that caused multiple cascading failures?
Also, as a general question, what is your uptime goal for STO? It's clearly not the "five nines" (99.999% uptime) standard for major online services, as that only allows about 5 minutes of downtime per year, and the weekly maintenance alone is way more than that.
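For anyone who wants to check the arithmetic behind that figure, here is a rough sketch of the downtime budget implied by a given uptime percentage. The function names are made up for illustration, and the two hours of weekly maintenance is only an assumed placeholder, not a number PWE has published.

```python
# Yearly downtime budget implied by an availability target.
# All names here are illustrative; nothing below comes from PWE.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; ignores leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed by a given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_from_downtime(downtime_minutes: float) -> float:
    """Uptime percentage over one year, given total minutes of downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


# Downtime budgets for the usual "nines" targets.
for target in (99.0, 99.9, 99.99, 99.999):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% uptime allows {budget:.1f} minutes of downtime per year")

# ASSUMPTION: two hours of planned maintenance every week (placeholder value).
ASSUMED_WEEKLY_MAINTENANCE_MINUTES = 2 * 60
planned = 52 * ASSUMED_WEEKLY_MAINTENANCE_MINUTES

# The unplanned outage discussed in this thread, taken as 5 hours.
unplanned = 5 * 60

print(f"Planned only: {availability_from_downtime(planned):.3f}% uptime")
print(f"Planned + outage: {availability_from_downtime(planned + unplanned):.3f}% uptime")
```

Five nines works out to a little over 5 minutes per year, and a single 5-hour outage on its own already caps a year at roughly 99.94% uptime, before any maintenance windows are counted.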
I understand that you run a business, and while you could theoretically have 5 independent data centers all over the globe, any one of which is capable of running the entire system, that would be prohibitively expensive. But neither are you just running the server on a single box plugged into a cable modem. There's a balance point somewhere in between, and it's different for every company and situation.
My question is: how much downtime (planned and unplanned) do you consider "acceptable?" I really hope that the 5-hour unplanned outage is as unacceptable to you as it is to the players, as reflected by the 200 posts/hour on the two different threads about the outage.
Not likely, as each game is hosted on different servers. But what if the client was the issue? It's the launcher: if the launcher goes down, so do all the games, since it's just the client's access to the game.
|| Open Door Policy ||
| Dues Ex Mechina |
I for one am HAPPY they stayed for overtime to fix the issue at hand.
Stayed for overtime? You do realize that when the crash happened, any 9-5 employee was already in the middle of dinner. The servers are in California and the crash happened around 6:30 PM Eastern time.
I give kudos to the employees who fixed the server catastrophe, but none to the company itself. There is no excuse for server crashes in this day and age when you release a new IP, and it was the release of the Neverwinter open beta that pushed the system past its limits. It is well known that server overload happens on release day. They could rent servers or a whole farm to prevent such things and still make a huge profit.
It looks like it's yet another of countless data centre issues, because the forums reached through startrekonline are unreachable but perfectworld is reachable. The data centre they are using has some very serious connectivity issues that are driving players off; they need to pack up and move out.