People have been talking about Big Data since the late 1990s. When originally coined, the phrase addressed the continuously growing boundaries of computing. Technologists were making everything bigger—from CPU processing power to storage to networks. What that led to was a phenomenal explosion in the availability of data. In fact, 90% of all the data in the world today has been generated over the last two years.
The promise of Big Data is that you can look at huge data sets collected over time, across an enterprise, from the so-called Internet of Things, and so on. By using a very large computing system, you can join and aggregate data from multiple sources in arbitrary ways, enabling deeper analyses than any one system can provide. The relatively low cost enabled by technologies like Hadoop, which uses clusters of commodity-priced computers to provide huge capacity, adds to the allure of Big Data.
But the “big” part of Big Data is seriously misleading, both about what is really important and about what is actually going on out there. Crunching a petabyte of data does not guarantee you will find ways to help the business or help your C-suite colleagues make better business decisions. The real reason people are making such a big deal about Big Data is that it allows more ad hoc, curiosity-driven analysis, as well as the ability to perform analytics on content such as textual data.
Don’t let big data magnify risk
However, the desire to gain new and actionable business insights from this kind of analysis, even on all the data collected by your business, structured or not, poses some risks for IT. Talk to any professional statistician, and they will say that analyzing data for purposes other than those for which it was originally collected puts you in very hazardous territory. Everyone has had this type of experience in his or her career: making a poor recommendation because the database being analyzed lacked a key piece of qualifying data. This would lead even the most diligent professional to draw seriously wrong conclusions.
Even when you have your database structured for a certain purpose, getting the right answers can be difficult. But there are far more ways of getting the wrong answer out of Big Data once you have collected it. It can be very hard even to tell whether your queries are properly structured.
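To make that last point concrete, here is a minimal sketch of one common way a query can look reasonable and still be wrong: joining two sources before aggregating. The data, field names, and function names are all hypothetical, invented only for illustration. One order shipped in two parcels, so the joined result contains that order twice and the naive total counts its amount twice.

```python
# Hypothetical data: two orders, one of which shipped in two parcels.
orders = [
    {"order_id": 1, "amount": 100},
    {"order_id": 2, "amount": 50},
]
shipments = [
    {"order_id": 1, "parcel": "A"},
    {"order_id": 1, "parcel": "B"},
    {"order_id": 2, "parcel": "C"},
]


def naive_revenue(orders, shipments):
    """Join orders to shipments, then sum amounts (one row per parcel)."""
    joined = [
        order
        for order in orders
        for shipment in shipments
        if shipment["order_id"] == order["order_id"]
    ]
    return sum(row["amount"] for row in joined)


def correct_revenue(orders, shipments):
    """Count each shipped order once, however many parcels it has."""
    shipped_ids = {shipment["order_id"] for shipment in shipments}
    return sum(o["amount"] for o in orders if o["order_id"] in shipped_ids)


print(naive_revenue(orders, shipments))    # 250: order 1 is counted twice
print(correct_revenue(orders, shipments))  # 150
```

Neither function raises an error, and both return a plausible-looking number. With three rows the discrepancy is obvious; across billions of rows from sources that were never designed to be joined, it may not be.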
A second risk arises when the organization lacks a culture of honest inquiry. In this case, Big Data can easily become a fancier version of what everybody who has ever calculated an ROI has done since the dawn of time: solving for the “right” answer.
This makes it important for you as an IT leader not only to mitigate these risks but also to make sure your business partners understand them. Without the right processes in place, or without a culture of honest inquiry, Big Data can simply magnify a flaw in analysis in very, very large volume.