In his keynote address, Nate Silver made a convincing case for thinking probabilistically and for staying alert to the biases that can cloud a data analyst's judgment. He emphasized the importance of keeping an open and nimble mind in a changing data landscape.
He shared examples of data gone wrong, including a GPS-guided taxi sent on a wild goose chase through Central Park and a levee that missed the high-water mark because the 9-foot margin of error was not taken into account.
In addressing how to avoid these sorts of data mistakes, Silver displayed a neon sign depicting Bayes' theorem, the algebraic expression that plays such a central role in his book “The Signal and the Noise: Why So Many Predictions Fail but Some Don’t”.
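For reference, the theorem in its standard textbook form is shown below. The symbols are chosen here for illustration and are not taken from the talk: H stands for a hypothesis and E for a new piece of evidence.

$$
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}
$$

Here P(H) is the prior (what you believed before seeing the evidence) and P(H | E) is the posterior (what you should believe afterwards).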
“The attitude behind Bayes' theorem is quite important. What it does is allow you to weigh new information in the context of what you already knew before. People aren’t always that good at doing this. In political campaigns, for example, you have thousands of polls that come out every year. Any one new poll doesn’t mean that much in the scheme of things but people want to get a story out of it, so they over-react to the new data.” He acknowledged the counter-case that sometimes people under-react to new data, but he added that “because people have so much more information and data now they are prone to over-react to the information that comes over the wire.”
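The polling point can be made concrete with a rough sketch. The example below is not from the talk: it assumes a simple Beta-binomial model, and every number in it is hypothetical. Earlier polls are pooled into a prior worth about 10,000 responses at 52% support, and then a single new poll of 1,000 people comes in at 47%.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate Beta-binomial update: add observed counts to the prior's pseudo-counts."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: earlier polls pooled into ~10,000 responses, 52% in favor.
prior_a, prior_b = 5200, 4800

# Hypothetical new poll: 1,000 respondents, 47% in favor.
post_a, post_b = update_beta(prior_a, prior_b, successes=470, failures=530)

prior_mean = beta_mean(prior_a, prior_b)
post_mean = beta_mean(post_a, post_b)

print(f"prior estimate:     {prior_mean:.3f}")
print(f"new poll alone:     {470 / 1000:.3f}")
print(f"posterior estimate: {post_mean:.3f}")
```

Under these assumptions the estimate moves only part of the way from the prior toward the new poll, which is the behavior Silver describes: the new data counts, but in proportion to everything already known. Pooling polls as if they were one large sample is a deliberate simplification; it ignores differences in methodology and timing between pollsters.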
Silver offered best practices to avoid misinterpreting the increasing volumes of data:
- Think probabilistically.
- Know where you are coming from (identify your biases).
- Survey the data landscape.
- Try, and err: make your predictions and test them out in the real world.
He said, “Sometimes the progress of research is slower than it needs to be.” He cited Google as a company that runs thousands of small experiments with real customers every year and then adjusts its products accordingly.