Those of you following me on Twitter or Facebook will have seen that the past week or so has been a bit stressful, mainly due to naughty servers misbehaving. At work we have a set of redundant systems in production, so failures don't have too much impact (you still have to fix them of course, or your redundancy vanishes). It's in the staging and test areas where we have the most naughty-server fun, as we're trying out new features, newer versions of software or pushing systems to their limits to get a feel for how they will behave under load. However, on our test setup we don't have the redundancy, because they aren't critical systems and there's no great loss if they fall over.
This is just as well really, as we've had some interesting issues with Amazon's EC2 service, which provides the cloud computing infrastructure for some of our systems. Amazon used to host Wikileaks, and has been under 'cyber-attack' by groups trying to take revenge for Wikileaks being kicked off.
The problem can be summed up with this little spot-the-difference game:
Both charts are of data traffic into and out of our systems: the green area is incoming data, and the blue is outgoing data to our data processing cluster.
Chart #1 shows what the traffic is supposed to look like: a gentle sinusoidal wave which rises and falls with demand. That's our production server. The outgoing data wobbles if we restart systems further down the processing chain - there's a trough when they stop consuming data, and a peak when the buffered data is sent on.
Chart #2 shows the main interface to our test servers: they tap into the live data stream but are isolated by a 'siphon' script, which usually provides an identical copy of our data, but disconnects in the event of any problems, so as to protect the live systems from any errant behaviour in testing. Thus, the green part of both graphs should be broadly the same shape if everything works correctly (the axes are different, as the blue line varies depending on the number of data consumers we have running).
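For the curious, the siphon idea can be sketched in a few lines of Python. This is not our actual script, just a minimal illustration of the behaviour described above, and the names (`Siphon`, `live_sink`, `test_sink`) are made up for the example. The assumption is that each sink is simply a function which accepts one record.

```python
from typing import Callable

Sink = Callable[[bytes], None]


class Siphon:
    """Copy a live stream to a test consumer, detaching the copy on the first failure."""

    def __init__(self, live_sink: Sink, test_sink: Sink) -> None:
        self.live_sink = live_sink
        self.test_sink = test_sink
        self.connected = True  # whether the test side is still receiving copies

    def forward(self, record: bytes) -> None:
        # The live path always runs first, so the test side cannot hold it up.
        self.live_sink(record)
        if not self.connected:
            return
        try:
            self.test_sink(record)
        except Exception:
            # Any problem on the test side: stop copying rather than retry,
            # so errant test behaviour never reaches the live systems.
            self.connected = False
```

While the test side behaves, both sinks see identical data. The first time it throws, the copy stops and the live stream carries on untouched. Reconnecting is deliberately left out of this sketch.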
Clearly, we had a problem. It looks like a two-year-old child has scribbled on it with crayons (to be fair, my drawing isn't much better). There are gaps, peaks and troughs all over the shop. Gaps indicate that our monitoring systems were unable to contact the server. Because the server has been out of contact or slow for random periods, the incoming data ceases to be a smooth line: the system 'catches up' with the data stream, bursting stored data over to the test system.
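To see why an unreachable server turns a smooth curve into spikes, here's a toy model in Python. It assumes data arrives in fixed ticks and that anything which can't be delivered is buffered and sent in one go when the link comes back. The function name and the numbers are invented for illustration, not taken from our monitoring.

```python
from typing import List


def delivered_per_tick(arrivals: List[int], reachable: List[bool]) -> List[int]:
    """Return how much data reaches the test side on each tick.

    arrivals[i] is the data produced at tick i; reachable[i] says whether
    the link to the test server was working at that tick.
    """
    backlog = 0
    delivered = []
    for amount, ok in zip(arrivals, reachable):
        backlog += amount
        if ok:
            # Link is up: send everything buffered so far in a single burst.
            delivered.append(backlog)
            backlog = 0
        else:
            # Link is down: nothing arrives, and the backlog keeps growing.
            delivered.append(0)
    return delivered


steady = [10] * 6
link = [True, True, False, False, True, True]
print(delivered_per_tick(steady, link))  # [10, 10, 0, 0, 30, 10]
```

A perfectly steady input of 10 per tick comes out as a trough of zeros followed by a spike of 30, which is roughly the shape scribbled all over the second chart.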
It all points to some pretty horrible latency on the network, and it coincides with the attempts to bring Amazon EC2 down in retaliation for Amazon removing Wikileaks.
Ouch.
Spinning up a new EC2 box has given us a nice waveform again, so hopefully the new one is in a better neighbourhood.
I haven't randomly rambled for a while, so here's an anecdote from work. My office is Apple-only, so we have a proliferation of ergonomically-questionable white and silver "designer" hardware around, including several instances of Apple's Magic Mouse. Aside from it being slightly too small to be comfortable in my hand, I've started to appreciate how it works - I'd just like the next size up, please.
However, over the past couple of weeks the Magic Mouse we use to run the Spotify box in the office (it does other things, but none so important as playing music) began to work intermittently. The Magic Mouse would randomly turn itself off, and take several attempts at fiddling with the switch, removing batteries, re-inserting batteries and hurling obscenities before it turned back on. Then on Friday, mine started doing the same thing, dropping out regularly until it wouldn't turn back on at all.
By lunchtime Monday we had 3 'Magic' Mice which had been reduced to worthless shiny pebbles. Just as the decision had been made to take them to the Apple store and get some corded mice instead, we put two and two together and realised that the problems on all three mice had started after we'd changed the batteries, from the Energizers they had in them originally to Duracell batteries from the newsagent across the road.
That couldn't be it, surely? After a hunt around Soho for somewhere which sold a different brand, sure enough, it was the Duracells which the Magic Mice didn't like.
There's only one thing I could think of which might have upset them - we were using Duracell Plus batteries with "Super Conductive Graphite Technology", or "magic pencil ring" as I've come to refer to it. You can tell these batteries - they look like someone's taken a thick pencil and drawn a ring around both ends. I'm pointing my finger at the 'super conductive graphite' for now...
It seems like a week of battery woes - I ordered some 7dayshop.com AAA batteries for my portable mouse and keyboard, but the knobby bit isn't large enough so you have to stuff some foil in between the battery and connections to make them work correctly. On Friday I resorted to buying Polos to use the foil wrapper for this purpose (and had to scrape said foil off the plastic backing)...