Puzzles, Mysteries, and Big Data
Randall Bolten, longtime Silicon Valley CFO, author of “Painting with Numbers: Presenting Financials and Other Numbers So People Will Understand You,” and adjunct professor at U.C. Berkeley Extension
Last week I led a workshop on management reporting at the IMA Northern Lights Council’s annual seminar in Minneapolis. While there, I had the opportunity to sit in on several excellent presentations. One of them was Toby Groves’s overview of big data, a powerful software tool that has rightfully gotten much attention but also has inherent limitations. It sometimes takes real wisdom and willpower to see when we should stop using big data.
In discussing the ideal applications for big data, Groves observed a distinction between a “puzzle” and a “mystery.” A puzzle is something that can be solved, and often the solution comes quickly with the discovery of just a little more data. A mystery has no definitive answer, and the outcome may depend on many different interacting factors.
The puzzle/mystery distinction was famously addressed by Malcolm Gladwell in a New Yorker article and by Gregory Treverton in Smithsonian. Treverton cites as examples questions like “Where is Osama bin Laden?” (a puzzle, which hadn’t yet been solved when he wrote the article) and “What will happen in Iraq after we invade?” (a mystery, at least when we’re not looking at it in retrospect). Gladwell’s article, centered on the trial of former Enron CEO Jeff Skilling, ponders whether truly understanding the Enron fiasco should be treated as a puzzle or as a mystery – to Gladwell, the answer wasn’t obvious.
To solve a puzzle, we just keep collecting data, and eventually the right pieces will fall into place. But to solve a mystery, the right strategy may be to stop collecting more data. Not only is the additional data just more noise in the system, but the effort to collect and assimilate it also interferes with our ability to think creatively and to put ourselves in the place of the protagonists in the mystery (that is, if the mystery is, say, about what people are going to do next).
Here’s where the distinction relates to big data: the availability of a powerful tool tempts us to collect and organize immense quantities of data, simply because we can. But that effort can be unproductive or even destructive when the underlying problem is a mystery and not a puzzle.
The trick, of course, is knowing the difference.