Star Trek Online


squigish 05-03-2013 12:00 AM

outage debrief and general availability policy question
 
This is mostly a question for the PWE staff, but I'm sure the community will weigh in as well.

Now that the 5+ hour long outage on 5/2/2013 has finally been resolved, could we get some comments about what happened, and some reassurances about what's being done to prevent it in the future?

Was it a piece of networking gear that failed? A server? A connectivity issue that was more or less outside of your control? Something that caused multiple cascading failures?

Also, as a general question, what is your uptime goal for STO? It's clearly not the "five nines" (99.999% uptime) standard for major online services, as that only allows about 5 minutes of downtime per year, and the weekly maintenance alone is way more than that.
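To put numbers on that, here is a rough sketch of the arithmetic behind an availability target. The function names are just illustrative, and it assumes a 365.25-day year and that every minute of downtime counts, planned or not.

```python
# Minutes in an average year (assumption: 365.25 days).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent):
    """Downtime budget, in minutes per year, for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(downtime_minutes):
    """Best-case yearly availability after a given amount of downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = downtime_minutes_per_year(target)
        print(f"{target}% uptime allows {budget:.1f} minutes of downtime per year")

    # A single 5-hour outage, ignoring weekly maintenance entirely.
    print(f"One 5-hour outage caps the year at {availability_percent(5 * 60):.3f}%")
```

Five nines works out to a bit over 5 minutes a year, and a single 5-hour outage on its own already caps the year at roughly 99.94%, before any weekly maintenance is counted.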

I understand that you run a business, and while you could theoretically have 5 independent data centers all over the globe, any one of which is capable of running the entire system, that would be prohibitively expensive. But neither are you just running the server on a single box plugged into a cable modem. There's a balance point somewhere in between, and it's different for every company and situation.

My question is: how much downtime (planned and unplanned) do you consider "acceptable?" I really hope that the 5 hour unexpected downtime is as unacceptable to you as it is to the players, as reflected by the 200 posts/hour on the two different threads about the outage.

srspells 05-03-2013 12:03 AM

Haven't played many MMOs, have you? No MMO has 99% uptime; they all go down a lot, and for lots of reasons, some as dumb as a missing image.

I for one am HAPPY they stayed for overtime to fix the issue at hand.

Thank you Cryptic.
-Spells

momaw 05-03-2013 01:19 AM

Who wants to bet that recent shenanigans in STO are related to the infrastructure for Neverwinter getting some hardcore stress testing with the opening of that game's beta?

srspells 05-03-2013 01:44 AM

Not likely, as each game is hosted on different servers. But what if the client was the issue? It's the launcher, and if the launcher goes down, so do all the games, since it's just the client's access to the game.
-Spells

jetwtf 05-03-2013 03:51 AM

Quote:

Originally Posted by srspells (Post 9633351)
I for one am HAPPY they stayed for overtime to fix the issue at hand.

Stayed for overtime? You do realize that when the crash happened any 9-5 employee was in the middle of dinner already. The servers are in California and the crash happened around 6:30 PM Eastern time.

I give kudos to the employees who fixed the server catastrophe, but none to the company itself. There is no excuse for server crashes in this day and age when you release a new IP, and it was the release of the Neverwinter open beta that pushed the system past its limits. It is well known that server overload happens on release day. They could rent servers or a whole farm to prevent such things and still make a huge profit.

srspells 05-03-2013 03:54 AM

Neverwinter may have overloaded the client server, but not STO's. Essentially it was the client that put Cryptic's games out of order.
-Spells

nicha0 05-03-2013 05:00 AM

It looks like it's yet another of countless data centre issues, because the forums reached through startrekonline are unreachable but the ones reached through perfectworld are fine. The data centre they are using has some very serious connectivity issues that are driving players off; they need to pack up and move out.

nyxadrill 05-03-2013 05:49 AM

Being in Europe I must have been in bed when the outage happened :P however it looks like it affected Neverwinter as well:

http://nw.perfectworld.com/news/?p=880781

and so no doubt other games as well. So I'd guess the problem was systemic, probably in the network and possibly outside of Cryptic's control.

anazonda 05-03-2013 05:52 AM

99.999% works great for a webpage, or even cloud services... but not for games that need constant updates, additions, and bugfixes.

It's not like you can simply stage new content without at least shutting down every now and then.

ussultimatum 05-03-2013 06:16 AM

Quote:

Originally Posted by nyxadrill (Post 9636261)
and so no doubt other games as well. So I'd guess the problem was systemic, probably in the network and possibly outside of Cryptic's control.

They still need to make this their top priority to address and see that this does not happen again.

It's pretty clear that they have all sorts of issues, just from small examples like the massive lag with items in/out of bank/account bank.


It would be exceptionally foolish to invest millions developing a new game with a big title like NW and not also invest to make sure the launch is as smooth as possible.


All times are GMT -7.