Basics: We shut down the servers at 2014 UTC on July 8 to replace a failed battery unit. Two of the servers did not come back up. It appears that the room overheated and some components got baked. Unfortunately, the two machines were the ones that handle the heaviest load. We had enough stand-by hardware to compensate for one, but not for two. One or more of us have been working on the machines ever since, mostly nate, Swap, Kurt, and alex.

We restored some middlin' excuse for service around 0300 UTC on the 10th and then took it down again. Right now, and until we scrounge up some permanent replacements, we have two loaners from Kurt, one of which is acting as the main database server. We're obviously still slow, since we're working on what was an unconfigured system. We will quite likely bounce up and down numerous times before we achieve decent performance. At 1200 UTC things were looking a bit better but not yet stable, and we returned to the Word Galaxy after a few hours of trying. Halfway stable service was established some time around 0800 UTC, when we flipped the switch on some significant database changes. We still have some work to do on the performance side of things.