From James Fisher, VP Solution Marketing, Analytics
At the Gartner BI and Analytics Summit in Barcelona this week I found myself in an interesting debate over whether the growing army of business users, in this new age of pervasive business intelligence and analytics, needs any education and training in analytics, or whether the tools should simply be easier to use. Gartner themselves posed the question directly during a panel discussion: should we improve users' skills rather than simply the tools? The premise seemed to be that by making analytics easier to use (for example, through guided discovery and intuitive, easy-to-use interfaces) there is absolutely nothing users need to know before they apply statistical packages, visualization templates and predictive algorithms to uncover the gems of insight buried in the data. It feels a little like we're suggesting that even though we are living in a world with a burgeoning surfeit of data, as long as you have the right analytics tools, the data will simply tell its own story.
Me, I'm not so sure. I've always found that if I just wade into a mass of information without any prior hypotheses about the kinds of relationships and trends I might find, it is difficult to separate the useful data from the misleading or confusing data that is simply noise. I guess I need to go through a process similar to classical market research: first developing testable hypotheses through qualitative focus groups and face-to-face interviews to surface the issues, then collecting and analysing quantitative data to measure which hypotheses are true and important enough to act on. The parallel in business is having a dialogue with colleagues and customers before diving into the data.
Chris Anderson challenged this accepted approach in an article he wrote for Wired back in 2008, 'The End of Theory: The Data Deluge Makes the Scientific Method Obsolete', in which he suggested that in the petabyte world of big data the traditional hypothesize, model, test approach is obsolete, and that 'the new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world.' Many people have criticized such optimism, pointing out that there is so much 'noise' in the data about planetary weather that trends in global warming would likely never have been uncovered without researchers starting from a prior hypothesis.
It seems to me that such a bold statement also denies us, the humans, any role in analytics. While I don't doubt that some data-driven predictions can succeed, I'm convinced that a questioning mind, a better knowledge of the maths and an appreciation of the common misconceptions people typically hold will always result in better assessments, and will reduce some of the inherent risk of poor business decisions. So the point I was trying to make, albeit in 140 characters on Twitter, is this: yes, we need to make tools easier to use, but we (vendors, organizations and, increasingly, our education systems) have a responsibility to ensure that anyone using analytic tools is given a general grounding in simple statistics and research methods, so they develop a questioning approach to using data, and by extension analytics, to support the business decisions they make in their working lives. For instance:
- Questioning the validity of the data they are working with. For instance, we know that cancer patients can be misdiagnosed: some who are told they have the disease do not (a false positive), while others who are told they do not have the disease in fact do (a false negative). You can bet we encounter the same issue in business all the time and never even think about it!
- Teaching probability and significance, so we appreciate that findings and predictions are generally range-based rather than a single data point, and how this affects results such as elections or sales wins, which have binary outcomes: win all or lose all.
- Helping us become 'diligent sceptics' when it comes to data. A colleague tells a great story from their time in insurance, when a business analyst from the special lines division merged some previously siloed data sets, found considerable crossover between customers taking out one or more of the firm's policies (cover for pets, musical instruments, extended warranties) and was adamant this was evidence of 'significant customer loyalty'. The reality was that the insurer was the market leader in special lines, all of which were sold under different brand names through different channels, so the crossover was inevitable rather than evidence of the likely success of a concerted cross-selling campaign.
- Showing students how best to present and visualize data so it is easy to understand and can be quickly digested.
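The false-positive point above is easy to make concrete with a little base-rate arithmetic. Here is a minimal sketch in Python, using purely hypothetical numbers (1% prevalence, 90% sensitivity, a 9% false-positive rate; these are illustrative, not real clinical figures):

```python
# Illustrative base-rate arithmetic for the false-positive point above.
# All numbers are hypothetical, chosen only to make the effect visible.
prevalence = 0.01           # 1% of the population actually has the disease
sensitivity = 0.90          # 90% of true cases test positive
false_positive_rate = 0.09  # 9% of healthy people also test positive

# Bayes' theorem: P(disease | positive test)
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(positive test)      = {p_positive:.4f}")
print(f"P(disease | positive) = {p_disease_given_positive:.3f}")
```

With these numbers, fewer than one positive result in ten corresponds to a real case, simply because the healthy population is so much larger. Swap in a marketing model flagging 'likely churners' and the same arithmetic applies, which is exactly why a grounding in simple statistics matters before acting on the output.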
Unless we do this, we risk entering an age of pervasive analytics that could rapidly become dysfunctional, with people acting on insights that are beautifully presented but entirely misplaced. Analytics, forecasts and predictions always have some degree of error, and everything we can do to minimize that error will improve the decisions we make and benefit our businesses. Even the black belts get it wrong occasionally. Nate Silver, the statistician who became famous for predicting the swing-state outcomes in the 2008 and 2012 US Presidential elections through sophisticated statistical analysis, fouled up predicting the result of the Super Bowl last weekend. If experts can get it wrong, where does that leave the rest of us?