At their core, business intelligence (BI) and discovery are two mental models that share the same purpose: to derive value from data. What really separates BI from discovery, however, boils down to a simple change in how we think about the data. Rather than relying on business people to tell us how the business works, as BI does, discovery relies on real data to show us, and to give us insight into, what’s really going on in and around the business.
Traditionally, enterprise BI has focused on how to create systems that move information in and around and up and down the organization while maintaining its business context. It focuses on keeping that Very Important Context wrapped tightly around the data so that the business consumer doesn’t need to supply it. It is “rely and verify”: a framework wherein the analyst role is embedded within the system and the end user doesn’t have to be a data expert, but only has to be able to rely on the data presented to verify that a business need is met. The end goal of BI is to create and pre-define easily digestible information that is plattered up and served to the end user to be consumed as-is.
When we talk about BI versus discovery, what we’re talking about is having the ability – the willingness – to iterate and explore the data without the assumptions and biases of pre-definitions. Consider this example: IT (or a BI team) asks the business what it needs to know. The business, in turn, answers with a metric – not what it needs to know, but rather what it needs to measure. This, by the way, is how we’ve come up with things like OLAP cubes and other slice-and-dice approaches to understanding and interpreting data to achieve a business goal. Whole generations of BI have fixated on understanding and defining how data needs to map into a metric. But here’s the rub: things like OLAP are only as good as what you pre-define – if you only pre-define five dimensions, you won’t discover the other twenty hiding in the data. Hence the need for discovery.
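To make the pre-definition problem concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the order records, the field names (`region`, `channel`, `weekday`, `revenue`), and the idea of ranking dimensions by the gap between their group averages. It is not how any particular OLAP or discovery tool works, just a toy contrast between asking one pre-defined question and scanning every field.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order records; every field name and value is invented for illustration.
ORDERS = [
    {"region": "East", "channel": "web", "weekday": "Mon", "revenue": 100},
    {"region": "East", "channel": "store", "weekday": "Tue", "revenue": 300},
    {"region": "West", "channel": "web", "weekday": "Tue", "revenue": 120},
    {"region": "West", "channel": "store", "weekday": "Mon", "revenue": 280},
]


def average_by(records, dimension, measure="revenue"):
    """Average the measure for each distinct value of one dimension."""
    groups = defaultdict(list)
    for row in records:
        groups[row[dimension]].append(row[measure])
    return {value: mean(amounts) for value, amounts in groups.items()}


def spread(averages):
    """Gap between the highest and lowest group average."""
    return max(averages.values()) - min(averages.values())


# BI-style: the only dimension the business asked for up front.
PREDEFINED = ["region"]
predefined_view = {dim: average_by(ORDERS, dim) for dim in PREDEFINED}


def rank_dimensions(records, measure="revenue"):
    """Discovery-style: try every non-measure field and rank it by spread."""
    candidates = [field for field in records[0] if field != measure]
    scored = [(dim, spread(average_by(records, dim, measure))) for dim in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print("Pre-defined view:", predefined_view)
    print("Ranked dimensions:", rank_dimensions(ORDERS))
```

With this toy data, the pre-defined `region` view shows identical averages for East and West, while the scan over all fields surfaces `channel` as the dimension where revenue actually differs. The spread of averages is only one crude way to rank candidates; a real discovery workflow would likely lean on richer statistics and visualization.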
Discovery begins with a goal to achieve within the business, but it accepts that we simply don’t know what the metrics are or what data we need (or have) to meet that goal. Discovery is living inside – getting all up close and personal with – the data. It’s exploring, playing, visualizing, picking apart and mashing the data back together in an iterative process to discover relationships in the data itself. We already know the context, but the goal is to build new models to uncover relationships that we don’t already know, and then figure out how that information can provide value back to the business.
In this context, discovery is as much about prediction as it is about iteration. Analysts with an inherent knowledge of the data can look at the context and see that it’s not quite right – that it doesn’t join with an established metric quite as anticipated. They can predict that, and they already have a plan in mind for what to try next. Then, they can go forth and discover. They can iterate. It’s agile, yes, but it misses some of the discipline that makes BI, well…BI.
To go full-on discovery mode requires this give-and-take ability to predict and iterate – to not be satisfied with one answer and to keep on searching for new information. We want to be able to fail fast – to take advantage of short shelf lives and get the most out of our information when and how we can – and then move on. And that kind of iterative ability necessitates self-sufficiency, a new-and-improved breed of the old “self-service.” Analysts now need more than access to data; they need to be able to create and consume on the fly, that is, without having to go and ask for help. They need discovery tools. This is part of IT’s new role – enablement – and part of a larger shift we’re going to start seeing happen in the industry.
But do discovery and BI – and the tools that go along with them – need to be kept separate? Well, maybe. Historically, many of the data analysis tools cobbled together within an organization are just that – a kludge of technologies that have been folded together into one portfolio. Today, vendors in the space still tend to focus on “either/or” solutions, with a few tackling “both.” And “both” does make sense: ultimately, the idea of data experts being able to experiment with data relationships off the same infrastructure that the broader business community interacts with seems like a no-brainer. It seems like these two models of interacting with data would create a cohesive framework to support the organization. But that’s not exactly the case.
Of course, there’s a flip side. While discovery is good – and necessary – that doesn’t mean it’s a free-for-all. With intuitive, robust tools and wide-open access, we face the new challenges of properly governing roles, responsibilities, and how data is used in the business. We don’t want to return to the chaos of reports that don’t match, garbage data, or mismatched information systems. Discovery is as strategic a process as any other in BI.