Please tell me they aren't using some dumb Windows attempt at a server. Though it would clear things up, since a Windows server can't handle any kind of strain.
There's no way they're using Windows; even newer servers running Server 2008 can't handle this load. It's almost certainly a Linux cluster of some flavor, maybe with some Sun or SGI servers, and a buttload of load-balancing servers (which tend to run Linux).
Windows is great for a small/medium business. Linux tends to scale at the high end much better; at the very least it's more adaptable. Windows just hits a brick wall and that's that; 2008 scales much better than previous versions but carries a heavier footprint than Linux.
Anyhoo, if the servers are running Linux, that partially explains the extended downtime. Crashes can be tougher to track down and reboots can take forever. Even Windows servers can take 30+ minutes to reboot, not including any time for file/RAID verification or cluster re-sync after a crash.