This was a 5-minute talk on deep learning for the very excellent @chesterdevs. Like others talking about deep learning, I took visuals and the face-learning example from the landmark 2012 Quoc Le/Google/Andrew Ng paper, “Building High-level Features Using Large Scale Unsupervised Learning.”
Only afterwards did I notice that the images which their system shows as “most like a face” from their test set are 90% male and 90% white, as is the prototypical face that the machine outputs.
And so we have a neat demonstration of unintended algorithmic bias: their input was 10 million randomly chosen YouTube videos; the output was white and male. I bet they didn’t expect that.
A salutary reminder that—as the hard-working statistician will tell you—“random selection” does not mean “unbiased”.
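
To make that last point concrete, here is a minimal sketch of my own (not anything from the paper). It assumes a hypothetical pool in which 90% of items belong to one group, then draws from it uniformly at random. The 0.9 figure and the `sample_share` helper are invented for illustration; they are not measurements of YouTube or of the paper's data.

```python
import random


def sample_share(majority_share: float, n: int, seed: int = 0) -> float:
    """Draw n items at random from a skewed pool and return the
    fraction of the sample that belongs to the majority group."""
    rng = random.Random(seed)
    # Each draw is independent and fair: the sampler favours no item.
    # The skew lives entirely in the pool (majority_share), not the sampler.
    draws = [rng.random() < majority_share for _ in range(n)]
    return sum(draws) / n


# Hypothetical pool: 90% of items are in the majority group (assumed figure).
share = sample_share(majority_share=0.9, n=10_000)
print(f"majority share in the random sample: {share:.3f}")
```

The sampler is perfectly fair, yet the sample comes out about as lopsided as the pool it was drawn from. Random selection guards against bias introduced by the person choosing; it does nothing about bias already present in what is being chosen from.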