Organizations rely on a vast range of information, and its increasing amount, variety, and complexity can pose a significant challenge when developing an information infrastructure for the future. But for an organization pursuing its digital transformation journey, having a solution that can cope with this increasing variety of information is critical.
We sat down with InfoDesk’s COO, Lynn Epstein, to discuss some of the challenges that information presents when developing a long-lasting information solution and what an organization needs to consider to get it right. In this interview, Lynn identifies the challenge of integrating and normalizing different types of content and why this is so critical in designing a solution for the future.
What are the different types of information that organizations rely on?
Lynn: There are many types, varieties, sources (you pick the noun and it’s there) of information. It’s not necessarily the range of information ‘types’ that poses the biggest challenge. It is the polar differences between these ‘types’ that cause the headaches.
Think about the different types of data you might use in an organization. Typically, I’ve found you can categorize them into one (if not all) of these pairs: internal or external, i.e. information within your organization or information sourced from outside it; premium or open-source, i.e. requiring a subscription or readily available online; quantitative or qualitative, i.e. numbers or words; and structured or unstructured, i.e. ordered nicely in a table with labeled columns (think a well-ordered spreadsheet) or everything else. I’m not saying this is an exhaustive list by any means, but for the most part, ‘business information’ will fit nicely into at least one of the descriptions.
Why is it important that organizations rely on multiple content sources for their business information?
Lynn: Well, what happens when you use multiple information sources to formulate an idea? You get a much better understanding of the concept as a whole. Say you’re looking into trends in the Automotive industry. You could use solely structured quantitative sources: financial data, stock prices, and the like. Or you could use them in conjunction with qualitative sources that capture the ‘feel’ of the industry… suddenly you have twice the data but thrice the understanding of the Automotive industry.
It’s not that one source might be incorrect – it’s that each publisher provides a different perspective, and the different perspectives together provide a much more detailed picture. One of my biggest concerns as I see organizations relying on fewer and fewer information sources is that this bigger picture is being lost. What’s my point? I guess it’s to emphasize the fact that organizations should be relying upon a number of different information sources to be confident that they are making accurate and informed decisions.
This is even more important when you consider the recent rise of ‘fake news’. Fake news is a term we’re all becoming (unfortunately) familiar with. It is a direct result of people not relying on verified information sources. I’m not just talking about mainstream media either; I mean it at every level of information. Because anyone in their basement has a platform to become an author, this type of information is becoming more and more prevalent.
How does an organization then integrate all of these various information types into its technology?
Lynn: There are three steps I like to use to break down the process of getting the information an organization needs: Acquiring, Integrating, and Normalizing. Let me caveat this: it’s not enough just to get the information. The processes that make it as valuable as possible are just as critical.
Starting at the top: Acquiring is essentially the process of identifying and then accessing the information you need, whether it be competitor information, financial information, pipeline information… the list goes on (and on, and on). In some cases, this is as simple as finding the RSS feed; in others, it is necessary to build complex API connectors to content sources – this is where vendor relationships are key.
Integrating is then the process of bringing all of your information together into one central database that can then either be accessed by employees in your organization or, more often than not, is used to feed your information solutions and tools.
And then finally, Normalizing – where the magic starts to happen (and the headache intensifies). The easiest way to explain this is to imagine all of these information types being written in different languages. Normalization is the process of translating these languages into one universal language that captures the meaning and context of all the original languages.
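As a rough illustration of the three steps described above, the sketch below acquires two differently shaped payloads (an RSS snippet and a JSON-style API response), normalizes each into one shared record format, and integrates them into a single time-ordered collection. Everything in it is an assumption made for illustration: the `Record` schema, the field names `headline`, `ts`, and `text`, the source labels, and the sample content are hypothetical, and the payloads are inlined where a real system would make network calls. It is not a description of how InfoDesk's platform works.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Record:
    """Hypothetical common schema ("universal language") for all sources."""
    source: str
    title: str
    published: datetime  # timezone-aware, converted to UTC
    body: str


# Acquire: stand-ins for content fetched from an RSS feed and a vendor API.
RSS_SAMPLE = """<rss><channel><item>
  <title>Automaker reports record quarter</title>
  <pubDate>Mon, 06 Jan 2025 10:00:00 GMT</pubDate>
  <description>Revenue rose on strong demand.</description>
</item></channel></rss>"""

API_SAMPLE = [
    {
        "headline": "Supplier announces new plant",
        "ts": "2025-01-06T09:00:00+00:00",
        "text": "Production is planned to start next year.",
    }
]


def normalize_rss(xml_text: str, source: str) -> list[Record]:
    """Translate RSS <item> elements into the common Record schema."""
    records = []
    for item in ET.fromstring(xml_text).iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        records.append(
            Record(
                source=source,
                title=item.findtext("title", "").strip(),
                published=published.astimezone(timezone.utc),
                body=item.findtext("description", "").strip(),
            )
        )
    return records


def normalize_api(items: list[dict], source: str) -> list[Record]:
    """Translate the hypothetical API's JSON objects into Records."""
    return [
        Record(
            source=source,
            title=entry["headline"].strip(),
            published=datetime.fromisoformat(entry["ts"]).astimezone(timezone.utc),
            body=entry["text"].strip(),
        )
        for entry in items
    ]


def integrate(*batches: list[Record]) -> list[Record]:
    """Combine normalized batches into one collection, oldest first."""
    combined = [record for batch in batches for record in batch]
    return sorted(combined, key=lambda record: record.published)


if __name__ == "__main__":
    central = integrate(
        normalize_rss(RSS_SAMPLE, source="news_rss"),
        normalize_api(API_SAMPLE, source="vendor_api"),
    )
    for record in central:
        print(record.published.isoformat(), record.source, record.title)
```

In this sketch, normalization happens per source before integration, so the central collection only ever holds records in one shape. The interview lists integration before normalization; either ordering can work, and this one is simply the shorter to show.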
How does having all of this information centralized and normalized help an organization?
Lynn: The benefits are huge! Think about what value you add by integrating all of your information sources into a single location… Firstly, you no longer need to constantly search across multiple sources for the insights your organization relies on.
Imagine you look at a website at 9am and then something gets published at 10am – you’d miss it until the next time you chose to check that source of information. By bringing it together in real time and in one place, you can monitor a single source for all of your critical information – massively reducing the chance of missing a key insight. Secondly, you’re reducing information silos across your organization drastically (if not completely). This means you significantly improve your ROI on content subscriptions, internal resources, and information management across the organization.
But remember, this is only step one in the process. A centralized archive of normalized data is wildly beneficial but it’s just the start. Where the real value is added is the ability to deliver these critical insights to the user: how they need them, when they need them.
Earlier you mentioned vendor relationships. Are these important, and if so, why?
Lynn: Vendor relationships are about trust! I call a content vendor and they know they can trust us with their information. Over the last 20 years, we’ve cultivated this trust and it’s something we really pride ourselves on. Vendors can be assured that a) they won’t see their premium content anywhere it shouldn’t be, and b) our content neutrality means we’ll never promote one vendor’s content over another’s.
It’s also extremely important for the client! Imagine the vendor suddenly makes drastic changes to how their data feeds work. We have the experience and technical knowledge to handle these changes for our clients, giving them the confidence that they aren’t going to miss critical insights or get ‘lumped’ with a system that costs a significant amount of time and resources to fix.
There are also a number of benefits that probably aren’t immediately obvious to the client (and may never be). The pre-existing relationships we have with such a large number of content vendors mean that chances are we already work with the content supplier the client uses. So, when it comes to ingesting the content they rely upon, we can easily reach out – and if we haven’t established a relationship already, our reputation makes that conversation so much easier for ourselves and the content vendors.
How can you build a solution that can cope with the amount and variety of information available now, and into the future?
Lynn: We all know about the exponential increases predicted in data. By 2025 we’re expecting to see five times the global data production per day. It’s becoming increasingly clear that to meet these needs we’re going to have to rely on technology – primarily AI technology.
Now, this doesn’t mean for a second that the role of the human is diminished or any less important. In fact, I’d argue that it is even more critical. AI will massively reduce the amount of manual input necessary in identifying and collating key insights, but it will be hard-pressed to replace a human’s intuition and ability to analyze these insights.
The future of information management isn’t one of sole machine occupancy. It is one of human augmentation, utilizing AI where it is strongest and giving humans more capacity to add value to the data.
What I will say is that the content that feeds your information solution is key. It is the low-hanging fruit of most information solutions, but get it wrong here and you’ve made a critical mistake. The sentiment ‘garbage in (or rubbish in – depending on where you’re reading this from), garbage out’ is true for any information solution. If you want one that is long-lasting and effective, make sure you’re feeding it the right content.