Software Quality — the Whole not the Parts

Ben Higginbottom makes a significant point right at the top of this question about the Knight Capital fiasco:

Was the Knight Capital fiasco related to Release Management? On August 1, 2012, Knight Capital Group had a very bad day, losing $440 million in forty-five minutes. More than two weeks later, there has been no official detailed explanation of …

Ben Higginbottom: Nightmare scenarios like this are not the result of a failing in any one discrete component that can be ‘fixed’ simply and sweetly by improving the process, but of small failures in multiple domains (I’m kind of reminded of a fatal accident investigation). So: a coding error, a gap in the unit testing, weak end-to-end testing and poor post-release testing, coupled with a lack of operational checks when live. I can understand the desire for a ‘silver bullet’ fix, but in any complex system I’ve never known one to exist.

Sometimes, it’s the whole, not the parts, that needs fixing.

Weave The People: Technology for Connecting People

Weave The People is a cute bit of technology, led by a friend of mine, that helps you get the most out of meetings by getting people to interact and think before the meeting starts.

Pragmatically it gives you a flying start on the business of your meeting; socially it gives you a chance to get to know people early. Like all things ground-breaking, it’s a bit hard to categorise. Try it and see.

Highly Unlikely — The Mark of the Beast Bug

In 1991, amongst a series of theorems about rounding in floating point arithmetic calculations, the author parenthetically noted that rounding a number to 53 significant digits, and then rounding it again to 52 significant digits, might produce a different answer from rounding it to 52 digits in one go. He all but apologised for the parenthesis, noting it was “highly unlikely to affect any practical program adversely.”

Nineteen years and ten months later, a researcher discovered that the “Mark of the Beast Bug” could freeze almost any computer in the world, and that millions of web servers could be taken down by it almost at the touch of a button. The cause was an error when rounding a very, very small number, first to 53 significant digits, and then to 52.
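The double-rounding effect itself is easy to reproduce. What follows is a minimal sketch, not a reconstruction of the actual bug: it assumes "significant digits" means binary digits and that ties round to even, as in IEEE 754 arithmetic. The helper `round_to_bits` and the test value `x` are made up for illustration, and exact fractions are used so that the computer's own floating point rounding does not interfere.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round a positive exact value to `bits` significant binary digits.

    Hypothetical helper for illustration; ties are rounded to even.
    """
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    # Find e such that 2**(bits - 1) <= x / 2**e < 2**bits.
    e = x.numerator.bit_length() - x.denominator.bit_length() - bits
    while x / Fraction(2) ** e >= 2 ** bits:
        e += 1
    while x / Fraction(2) ** e < 2 ** (bits - 1):
        e -= 1
    scaled = x / Fraction(2) ** e
    q = scaled.numerator // scaled.denominator  # integer part
    rem = scaled - q                            # fractional part, 0 <= rem < 1
    half = Fraction(1, 2)
    if rem > half or (rem == half and q % 2 == 1):
        q += 1                                  # round up, or break a tie to even
    return q * Fraction(2) ** e


# An odd 52-bit integer plus 3/8: just under halfway to the next integer.
A = 2 ** 51 + 1
x = A + Fraction(3, 8)

direct = round_to_bits(x, 52)                      # 3/8 < 1/2, so rounds down to A
twice = round_to_bits(round_to_bits(x, 53), 52)    # A + 1/2 first, then a tie up to A + 1

print("once :", direct)
print("twice:", twice)
```

The first rounding pushes the value up onto an exact halfway point, and the second rounding then has to break a tie it should never have seen. Rounding once gives `A`; rounding twice gives `A + 1`.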