Dealing with Big Data: Change Your Thinking, Not Just Your Technology
Contributed by Mark Albala
Finding business insights in a flood of “big data” requires organizations to change their processes, not just their technology.
Organizations have access to more data, from more sources, in more formats, than ever before. This “big data” may contain insights that could save money, uncover customer needs or predict market changes. Until now, the sheer complexity of storing and indexing large data stores, and of the information models required to access them, has made it difficult for organizations to convert this data into insight. Uncovering those insights requires prioritizing and organizing this “big data” based on its ability to deliver business value.
Big Data Challenges
Big data refers to data sets so large they become awkward to capture, store, search, analyze and visualize using conventional tools. Much of this data is in the unstructured form of documents, videos, or text that is difficult to fit into traditional databases. It also contains “multiple versions of the truth” in the form of data organized for different purposes at different times, or similar data obtained from different sources.
Before analysis, users must validate data created at different times, for different purposes and by different sources to determine which version is most accurate. Delays may also result from recovering data from “unofficial” locations such as users' desktops. Finally, the increase in the volume of data objects has made access schemes overly complex, so that finding the data that matters is akin to finding a needle in a haystack.
As a result, existing storage management technologies and processes cannot make all this information available in a neatly organized warehouse accessible when the business needs it.
Value Creation Roadmap
Organizations must enhance their people, processes and technology to derive the most value from “big data”.
On the people side, organizations must train their staff in the databases, technologies and ontologies required to manage “big data”. Staff must ensure that data is accessible in a timely way and facilitate the use of automated algorithms and other innovative ways to improve decision making. Understanding which data deserves focus from a security and governance standpoint should be part of the overall governance charter.
Technology initiatives include ensuring that the tools needed to navigate big data are usable by the intended audience, and that the architecture and supporting network, technology and software infrastructures are capable of supporting big data.
Organizations should also develop detailed metrics to assess their big data management programs, including the time required to turn data into insight, to integrate new and existing information sources and to manage the data, as well as the value derived from the data.
From a process perspective, organizations should consider horizontal partitioning, which segments data in a way that prioritizes the information required for value extraction, origination and capture. Traditional information lifecycle management approaches stratified the physical layout of the data to optimize performance. This is inadequate for big data, which needs a more value-oriented segmentation:
- Information directly related to the creation, extraction or capture of value.
- Supporting information that helps define a strategy to create, extract or capture value.
- Information required for business operations but not necessarily related to value creation, extraction or capture.
- Information required for regulatory activities but not necessarily related to the creation, extraction or capture of value.
- Historical supporting information.
- Historical information that once supported value, regulatory or other functions but is now kept only because it might be useful in the future.
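As a rough illustration, the six segments above could be recorded as tiers in a data catalog, so that data sets can be grouped and handled in value order. The Python sketch below is only one possible way to express this. The tier names, the DataSet record, the partition_by_tier helper and the sample data sets are all hypothetical, and treating the order of the list as a priority ranking is an assumption rather than something the segmentation itself prescribes.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import IntEnum


class ValueTier(IntEnum):
    """Hypothetical names for the six segments; lower number = handled first (assumed)."""
    DIRECT_VALUE = 1            # directly creates, extracts or captures value
    VALUE_STRATEGY_SUPPORT = 2  # helps define a value strategy
    OPERATIONAL = 3             # needed for business operations
    REGULATORY = 4              # needed for regulatory activities
    HISTORICAL_SUPPORT = 5      # historical supporting information
    HISTORICAL_ARCHIVE = 6      # kept only because it might be useful later


@dataclass(frozen=True)
class DataSet:
    """Illustrative catalog entry: a data set name and its assigned tier."""
    name: str
    tier: ValueTier


def partition_by_tier(datasets):
    """Group data set names by tier and return the groups in tier order."""
    groups = defaultdict(list)
    for ds in datasets:
        groups[ds.tier].append(ds.name)
    return {tier: groups[tier] for tier in sorted(groups)}


# Made-up catalog entries, for illustration only.
catalog = [
    DataSet("audit_filings", ValueTier.REGULATORY),
    DataSet("customer_orders", ValueTier.DIRECT_VALUE),
    DataSet("market_research", ValueTier.VALUE_STRATEGY_SUPPORT),
    DataSet("orders_archive", ValueTier.HISTORICAL_ARCHIVE),
    DataSet("web_clickstream", ValueTier.DIRECT_VALUE),
]

segments = partition_by_tier(catalog)
for tier, names in segments.items():
    print(f"{tier.value}. {tier.name}: {', '.join(names)}")
```

In practice the tier assigned to each data set would come from the business and governance teams rather than from code. The point of the sketch is only that the segmentation is driven by business value, not by the physical layout of the data.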
Big Data: Competitive Necessity
Only eight years ago, a 300 to 400 terabyte data warehouse was considered an outlier. Today, multi-petabyte warehouses are common. Failure to derive the most insight at the least cost from the information pouring into the enterprise will become an ever-larger competitive disadvantage.
Read the complete white paper Dealing with Big Data: Planning for and Surviving the Petabyte Age (PDF) or learn more about our data warehousing, business intelligence and performance management services.