You say you want to play in the world of Facebook, Instagram or Amazon, creating data-rich, customized user experiences that draw tens of millions of users? Do you dream of crushing industry giants with individualized, online recommendations the way Netflix and Pandora do?
If so, to quote the police chief in Jaws when he sees the famous shark, "You're gonna need a bigger boat," in this case to hold all the data you'll need. More importantly, you're going to need a smarter boat, one that can help you find the meaning in the giant lakes of data created by everything from social media to the evolving Internet of Things. In other words, you need a fish-finder for identifying the business insights hidden in the volumes of data continuously generated by people, products, processes and organizations. We call the data surrounding each of them a Code Halo™.
Consider what we call the "Trillion-Dollar Club," which consists of companies that together generated more than $1 trillion in market value over the last decade: Apple, Amazon, Google, Facebook, Netflix and Pandora. These businesses upended entire industries by analyzing and acting on the Code Halos generated "only" by the 10 billion devices connected to the Internet and mostly used by people.1
Coming soon, to an industry near you, are the billions of devices in the Internet of Things. When everything from smart fitness wristbands to smart cars to jet engines and shipping pallets is constantly and automatically generating information about its operation and its users' activities, the term "big data" will seem hopelessly quaint. As the number of connected devices grows ten-fold to 100 billion, data volume is expected to double every two years, reaching 44 zettabytes, or 44 trillion gigabytes, by 2020.2
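To put those figures in perspective, here is a minimal sketch of the arithmetic behind them. It assumes decimal (SI) units and a perfectly steady doubling every two years, which is a simplification. The function names are hypothetical, and the back-projected yearly volumes are derived only from the 44-zettabyte figure above, not taken from the cited source.

```python
# Assumption: decimal (SI) units, so 1 zettabyte = 10**21 bytes
# and 1 gigabyte = 10**9 bytes.
BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


def implied_volume_zb(target_zb: float, target_year: int, year: int,
                      doubling_years: float = 2.0) -> float:
    """Volume in `year` implied by steady doubling that reaches
    `target_zb` in `target_year` (illustrative assumption only)."""
    doublings_remaining = (target_year - year) / doubling_years
    return target_zb / 2 ** doublings_remaining


if __name__ == "__main__":
    # 44 zettabytes expressed in gigabytes (44 followed by 12 zeros).
    print(zettabytes_to_gigabytes(44))

    # Work backward from 44 ZB in 2020, halving every two years.
    for year in (2014, 2016, 2018, 2020):
        print(year, implied_volume_zb(44, 2020, year), "ZB")
```

Under these assumptions, each two-year step adds as much data as existed in total at the start of that step, which is why storage planning based on today's volumes falls behind so quickly.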
This data growth is not only inevitable but also essential to creating and refining the all-important algorithms that deliver better, more personalized experiences for customers. It is this data that you will need to store, manage and extract meaning from, if you are to avoid an "extinction event." This data might fuel a mobile app that guides customers to parking spaces near your store rather than a competitor's, based on historical turnover at Internet-enabled parking meters. Or it might underpin a corporate app that orders inventory for neighborhood drugstores based on usage reports from local smart insulin monitors, combined with area Web searches for cold remedies.
Your mission (whether or not you accept it) is not only to manage the sheer bulk of data, but also to draw meaning from the bits and bytes. This requires going way beyond traditional data repositories to what we call the data lake. You won't be able to afford the time, effort and cost of loading all this data into a big data repository, nor will you be able to easily find and use the data you need once it is there.
Semantic technology lets you build on and extend your data warehousing and big data investments to derive far more powerful insights, more quickly, from a much broader data set.
Jump in the Lake
Think of a data warehouse as a dusty, expensive building filled with papers in static file folders, all organized in a rigid classification system that was obsolete as soon as it was created. That's your classic data model, and it won't let you fully exploit the Code Halos that you need to succeed.
Think instead of all the data from all your sources, internal and external, old and new, flowing into a massive "data lake." As the lake gets bigger and bigger, with more and different types of data, how do you identify and gather the data you need without going broke or getting lapped by your competitors?
The data itself has to tell you. What you need is the data equivalent of a fish-finder that can peer into the murky darkness of the data lake and tell you which ghostly image is an old sunken tree and which is a school of prized game fish.