Big data is a term that is thrown around frequently but often misunderstood. While it will certainly play a major role in the future of information management, the value of big data lies in the opportunities it can provide. Big data is nothing without the ability to find the insights that matter and serve them to the audience who can make use of them.
Big data is often hailed as the next big thing, but what is it really, how does it differ from small data, and how is it useful? It's easiest to define with the three Vs: volume, velocity, and variety. There's a lot of it, it's produced rapidly, and it comes in all shapes and sizes. To make the most of it, you need a technology solution that can handle all three Vs and then turn them into something useful, i.e. big data into big insight. But that's just the first step: big insight is a big waste of time if it isn't used properly. For it to be of any value within your organization, your solution must also be able to get those insights in front of the people who need them most, in a format that meets their needs.
What is big data?
The Gartner definition states, “Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation.”
While the definition is useful in defining the ‘what’ of big data, it is the second part of that definition that requires the most attention: “…that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation”. Big data is useful. But big data is only as useful as your ability to process and draw actionable insights from it.
So what is the difference between data and 'big data'? In some regards it's a redundant question, because the answer is mostly… nothing. It's all still data. What makes 'big data' big relates to the first part of the Gartner definition: volume, velocity, and variety, aka the 'three Vs of big data'.
- Volume: 'Information overload' has been a common phrase for generations, yet it has never been more relevant than now. In the last two years, we created 90% of all data… ever.
- Velocity: Data is growing exponentially. Some estimates have global data production increasing from 4.5 billion GB per day to over 450 billion GB per day within the next five years. What we think of as big data today will be minuscule in a few years.
- Variety: It is easy to think of data as one indistinct pool of information, but that's not quite true. Data comes in many shapes and sizes, from the basic split between structured and unstructured all the way down to the difference between words and numbers. With big data comes a big variety of data types.
From big data to big insight
The thing with big data is that if you don't have a system in place to draw insights and value from these data sets, they are largely useless. The ability to turn complex data sets from noise into information, and finally into insights, is critical for any organization looking to truly harness the power of information. You need technology in place that can handle the volume, velocity, and variety of your data sets. And while technology cannot match a human's experience or ability to pick up on nuances in information, it is unparalleled in its ability to handle big data sets.
But how do you make the jump from data to insight? Here’s how we do it:
- Integrate: The volume and velocity that define big data require a solution that can identify and ingest information from a wide variety of user-specified sources in real time.
- Normalize: The data that has been integrated is then transformed into a normalized XML format. This improves the interoperability of your data so that it can be accessed across multiple technology systems and software applications.
Simply put, the V for variety means that data comes in many different shapes and sizes – to make any use of it, you need to transform this data into a unified format.
- Enrich: This is where the data truly gets "smarter". Our powerful enrichment process includes document preprocessing, entity recognition, concept disambiguation, and semantic indexing.
Essentially, we take the raw, unified data sets and add meaning to them.
- Surface: The last piece of the 'big insight' puzzle is being able to draw the key insights to the surface of the data pool. And what counts as a 'key insight' for one organization can be entirely different for another. That's why augmented intelligence is particularly beneficial here: by combining machine learning models, such as Relevancy, with a human expert's 'touch', you can ensure that the information most critical to you and your organization is identified.
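To make the four steps concrete, here is a minimal sketch of the pipeline in Python. Everything in it is hypothetical: the source formats, field names (`title`, `body`), entity list, and the overlap-counting relevancy score are stand-ins for whatever a real platform would use, chosen only to show how data flows from integration through normalization and enrichment to surfacing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of the integrate -> normalize -> enrich -> surface
# pipeline described above. Field names and scoring are illustrative only.

def integrate(sources):
    """Ingest records from heterogeneous sources into one stream."""
    for source in sources:
        yield from source

def normalize(record):
    """Transform a raw record (a dict here) into a unified XML document."""
    doc = ET.Element("document")
    for key, value in record.items():
        ET.SubElement(doc, key).text = str(value)
    return doc

def enrich(doc, known_entities):
    """Naive entity recognition: tag known names that appear in the body."""
    text = (doc.findtext("body") or "").lower()
    entities = ET.SubElement(doc, "entities")
    for name in known_entities:
        if name.lower() in text:
            ET.SubElement(entities, "entity").text = name
    return doc

def surface(docs, interests, top_n=3):
    """Rank documents by overlap between tagged entities and a user's interests."""
    def relevancy(doc):
        tagged = {e.text for e in doc.iter("entity")}
        return len(tagged & interests)
    return sorted(docs, key=relevancy, reverse=True)[:top_n]

# Usage: two toy "feeds" flow through all four steps.
feed_a = [{"title": "Merger news", "body": "Acme acquires Globex."}]
feed_b = [{"title": "Weather", "body": "Rain expected tomorrow."}]

docs = [enrich(normalize(r), ["Acme", "Globex"])
        for r in integrate([feed_a, feed_b])]
top = surface(docs, {"Acme"}, top_n=1)
print(top[0].findtext("title"))  # -> Merger news
```

The real systems behind each step (streaming ingestion, NLP-based entity recognition, learned relevancy models) are far more sophisticated, but the shape of the flow is the same: many raw sources in, one unified format, added meaning, and a personalized ranking out.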
From big insight to big opportunity
Ok – we've turned this big data into something useful. Time to call it a day? No! Surfacing the insights is one of the most challenging steps in handling big data, but those insights are largely useless in isolation unless they can be put in front of the key stakeholders who can use them to their fullest.
Too often we see organizations invest time and resources into finding the insights among their information sources, only to leave those insights locked away in information silos or distribute them ineffectively to their users. Much like the famed 'average' pilot's seat (the US Air Force spent years working out the average size of its pilots to develop a one-size-fits-all cockpit seat, only to find that the seat did not fit a single pilot), information isn't one-size-fits-all either.
Consider the needs of the various stakeholders within your organization: the research teams who need access to high-quality, comprehensive sources without getting bogged down in misdirection and noise; the commercial teams who need the latest competitive data in real time; the C-suite executives who need succinct summaries of key developments across the organization and the wider industry. Each has a distinct, personalized set of needs and preferences, so to be effective an information solution needs to accommodate all of them.
How do you build a personalized solution for everyone, you ask? Well, that's a blog for another day, but the crux of it is this: if you want to make the most of your big data, which you've invested in turning into big insight, you need an information solution that can be tailored to the needs of the many individuals within your organization.