Monday 23 August 2021

The Survivorship Bias - a story and the lesson

This bias is best illustrated by a story (much of this is quoted from a super piece by Vivek Kaul in Mint: https://www.livemint.com/mint-top-newsletter/easynomics20082021.html).


While fighting the Second World War, the British Royal Air Force (RAF) ended up with a very strange problem. It needed to attach heavy plating to its fighter jets to protect them from gunfire from the German fighter planes and their anti-aircraft guns. The trouble was that, since the plating was heavy, it had to be used sparingly at the right points of the aircraft, where the Germans were most likely to attack.
Jordan Ellenberg writes about this in How Not to Be Wrong: The Hidden Maths of Everyday Life: “The damage [of the German bullets] wasn’t uniformly distributed across the [British] aircraft. There were more bullet holes in the fuselage, not so many in the engines.”

What did this data suggest? It suggested that the chances of the plane’s fuselage being attacked were higher, and that was the part of the plane that seemed to be the most vulnerable. 
QED. 

Thankfully, the British didn’t go by what the basic reading of data suggested because they would have been totally wrong. Instead, they chose to hear out Abraham Wald, a statistician.
As Tim Harford writes in How to Make the World Add Up: “Wald’s written response was highly technical, but the key idea is this: we only observe damage in the planes that return. What about the planes that were shot down?”
This data wasn’t available to the British RAF. As Ellenberg writes: “The armour, said Wald, doesn’t go where bullet holes are. It goes where bullet holes aren’t: on the engines. Wald’s insight was simply to ask: where are the missing holes? The ones that would have been all over the engine casing if the damage had been spread equally all over the plane. The missing bullet holes were on the missing planes. The reason planes were coming back with fewer hits to the engine is that planes that got hit in the engine weren’t coming back.” They simply crashed. Hence, the planes were plated around the engine and not the fuselage as the data had originally suggested.
The original data in the British case had a survivorship bias built into it. It captured the bullet patterns of only those planes that made it back to the air force base and not every British plane that got hit by a German bullet.
Ellenberg explains this in another way: “If you go to the recovery room at the hospital, you’ll see a lot more people with bullet holes in their legs than people with bullet holes in their chests. But that’s not because people don’t get shot in the chest; it’s because the people who get shot in the chest don’t recover.”
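For anyone who likes to see the effect in numbers, here is a rough toy simulation of the plane story. Everything in it is made up for illustration: the four sections, the number of hits per plane and the chances of a hit bringing a plane down are my assumptions, not historical data. Hits are spread evenly over the plane, but an engine hit is assumed to be far more likely to be fatal.

```python
import random
from collections import Counter

# All names and numbers below are hypothetical, chosen only for illustration.
SECTIONS = ["engine", "fuselage", "wings", "tail"]
# Assumed chance that a single hit to each section brings the plane down.
P_DOWN = {"engine": 0.6, "fuselage": 0.1, "wings": 0.1, "tail": 0.1}


def simulate(n_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on returned planes) as Counters."""
    rng = random.Random(seed)
    all_hits = Counter()
    survivor_hits = Counter()
    for _ in range(n_planes):
        # Every section is equally likely to be hit.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]
        # The plane returns only if none of its hits turns out to be fatal.
        survived = all(rng.random() >= P_DOWN[s] for s in hits)
        all_hits.update(hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


def share(counts, section):
    """Fraction of the recorded hits that landed on the given section."""
    total = sum(counts.values())
    return counts[section] / total if total else 0.0


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for s in SECTIONS:
        print(f"{s:9s} all planes: {share(all_hits, s):.2f}   "
              f"returned only: {share(survivor_hits, s):.2f}")
```

Under these assumptions the engine takes about a quarter of all hits, yet shows up far less often on the planes that come back, which is exactly the pattern that would tempt you to armour the fuselage instead.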

When someone (generally a smoker) defends smoking by pointing to an eighty-year-old who smokes every day, that is the survivorship bias at work. When people hold up Steve Jobs and the rest of the start-up star line-up as proof that it can be done... well, there we go again.
The way to eliminate the bias: for every plane that returned, count one that did not.
