Independent Benchmarks

SQL-on-Hadoop Performance Benchmark – Q2 2016

With the number of new and rapidly evolving technologies for big data analytics, how do you know which SQL engine to use to optimize queries and workloads in your Hadoop environment? This independent benchmark, conducted by Radiant Advisors and sponsored by Teradata, tested the performance of the latest generally available versions of Hive, Impala, Presto, and SparkSQL at the time of testing, and also measured the impact of Hadoop distributions, file formats, data volumes, and the degree of query customization required. Our intent was to derive pragmatic applications of each SQL engine that help companies implement and optimize analysis and reporting in their Hadoop environments. Click here to access the report from Teradata’s website.

Sponsored By Teradata. For more information, visit

Open-Source SQL-on-Hadoop Performance

This independent performance benchmark focused on two performance dimensions, speed and SQL capabilities, by comparing SQL-on-Hadoop options that are open source and free to download for existing Apache, Hortonworks, and Cloudera distributions. We also illustrated the relationship to available data file encodings — such as ORC (Optimized Row Columnar), SequenceFile, Parquet, or InfiniDB — for compression, performance, and openness. Click here to download the report. Miss the webinar? Click here to watch the recording!

Sponsored By InfiniDB. InfiniDB empowers organizations to solve problems and create new solutions with powerful Big Data analytics. For more information, to join the community, and to download software, visit and follow @InfiniDB.



Three Checkpoints for Governed Data Discovery

For successful data discovery, analysts must be able to move quickly and iteratively through the discovery process with as little friction – and as much IT-independence – as possible. With self-sufficient people involved, ensuring data and analytics are both trustworthy and protected becomes more difficult and more imperative. This becomes a careful balance of freedom versus control, and brings the role of governance to the forefront of the discovery conversation. But discovery and governance are seemingly at odds — how is it possible to create an environment that facilitates discovery while providing secure, self-sufficient access to data and insights? In this new e-book, learn what governed data discovery is and how to institute checkpoints within the discovery process to reduce friction between analysts and IT and enable governance and sharing of trusted, valid data and insights. Click here to download the e-book. Or, click here to read the full research report.

Sponsored By Pentaho. Pentaho enables discovery of any and all data, shaping it for analytics and turning it into insights. To learn more about Pentaho, visit

Research Reports


Data Lake Adoption and Maturity Survey Findings Report

By surveying both current and potential adopters in the industry, this study documents key perceptions, challenges and successes by focusing on data organization, integration, security, and definitional clarification to address key areas of concern and interest in ongoing data lake adoption.

The study sheds light on how companies perceive and are addressing critical data lake success factors, including rethinking data for the long term, establishing governance first, and tackling security needs up front.

Click here to download the report.

Sponsored By Teradata, Hortonworks, Attunity, and HP Data Security. Research conducted in partnership with Unisphere Research.

The Data Visualization Competency Center™

Data visualization offers a tremendous opportunity to reach insights from data by leveraging our intrinsic hard-wiring to understand complex information visually. However, successful data visualization requires using the right kind of graphicacy to correctly interpret and analyze the data, as well as employing the right combination of design principles to curate a meaningful story. This report introduces the role of the Data Visualization Competency Center (DVCC)™ to support the use of effective self-service data visualization by providing best practices, standards, and education on how these information assets should be designed, created, and leveraged in the business.

  • Educate users on visual design principles and key cognitive elements affected by data visualization
  • Provide best practices and proven standards for understanding types of data and how to visually present them
  • Foster a culture of communication, collaboration, and collective learning that enables a review network for newly created data visualizations

Click here to download the report. Miss the webinar? Click here to watch the recording!

Sponsored By GoodData. GoodData is an industry-leading Insights as a Service provider, pushing beyond traditional BI by guiding users through the use of Collective Learning. Learn more about GoodData at or follow @gooddata on Twitter.

Enabling Governed Data Discovery

Today, data discovery and self-service are making data governance a charged topic. As business-driven data discovery continues to become fundamental, ensuring data and analytics are trustworthy and protected becomes more difficult and imperative. This research explains how to manage the barriers and risks of self-service and enable agile data discovery across the organization by extending existing data governance framework concepts to the data-driven and discovery-oriented business.

  • Understand the “freedom vs. control” paradox
  • How to design for iterative, “frictionless” discovery
  • Key governance checkpoints in data discovery

Click here to download the report.

Sponsored By Pentaho. Built from an open source heritage, Pentaho’s unified data integration and analytics platform is comprehensive, completely embeddable, and delivers governed analytics with any data in any environment. For more information, visit

The Definitive Guide to the Data Lake

It would be an understatement to say that the hype surrounding the data lake is causing confusion in the industry. Today’s newcomer to the data world vernacular – the “data lake” – is a term that has endured the scrutiny of pundits who harp on the risk of digging a data swamp, as well as inspired the vision of those who see the concept’s potential to have a profound impact on enterprise data architecture. As the data lake term begins to come off its hype cycle and face the pressures of pragmatic IT and business stakeholders, the demand for clear data lake definitions, use cases, and best practices continues to grow. This paper aims to clarify the data lake concept by combining fundamental data and information management principles with the experiences of existing implementations to explain how current data architectures will transform into enterprise data operating systems. While the data lake is a metaphor for this transformation, enterprise data management will continue to evolve the data lake according to established principles, drivers, and best practices that will quickly emerge as companies apply hindsight.

Click here to download the report.

Miss the webinar? Click here to watch the recording!

Sponsored By Hortonworks, MapR, Teradata, and Voltage Security. Developed in partnership with Unisphere Research.  

Overcoming Barriers to Data Virtualization Adoption

The challenges of data management are getting exponentially harder. With the ever-increasing quantities, sources, and structures of data – as well as the influx of new tools and techniques for storing, analyzing, and deriving deeper insights from this information – data-driven companies continue to evaluate and explore data management technologies that better integrate, consolidate, and unify data in a way that offers tangible business value. Data virtualization is compelling because it addresses business demands for data unification and supports high iteration and fast response times, all while enabling self-service user access and data navigability. However, adopting data virtualization is not without its barriers. Primarily, these relate to building a business case that can articulate the value of data virtualization in terms of speed of integration alongside the ability to manage ever-growing amounts of data in a timely, cost-efficient way. Supported by Cisco Data Virtualization, Radiant Advisors had the opportunity to further explore and understand the barriers experienced by companies considering data virtualization adoption, and then to pose these questions to companies that have already adopted data virtualization to glean their insights, best practices, and lessons learned. Together, the two halves of this research facilitate a practicable, independent, and unscripted “cross-talk” to fill information gaps and assist companies in overcoming barriers to data virtualization adoption. Click here to download the report.

Sponsored By To learn more about Cisco Data Virtualization, visit

White Papers

Why Spark Matters

Spark is quickly becoming a standard for writing deep analytics that leverage in-memory performance, streaming data, machine learning libraries, SQL, and graph analytics. The Spark environment provides big data developers and data scientists a quicker way to build advanced analytics programs, with its ability to overcome the shortcomings of MapReduce and to meet the demand for faster and more powerful processing across the full data pipeline.

  • The rise of Spark
  • Overview of what Spark is
  • The drivers for its adoption
  • What to expect as the ecosystem continues to evolve
  • Five questions that are top-of-mind when considering adoption

Click here to download the paper.

 Sponsored By To learn more about how MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses, visit

The Visual Design Checklist: Balance, Emphasis and Unity

User-friendly features and functionality of modern BI, analytics and visualization tools allow more people to independently discover and communicate insights within organizations. However, this freedom requires understanding of proper design principles in order to create meaningful, accurate visualizations in ways our brains are naturally wired to perceive. The new Visual Design Checklist teaches emerging strategies in visual design in order to facilitate effective communication of trends and insights within data. This checklist is a guide to creating effective data visualizations and designing for visual dialogue. Key concepts explored include:

  • Emphasis, balance and unity between design elements
  • Understanding the picture superiority effect — and why it matters
  • Operating within the “triangle of forces” of visualization constraints

Click here to download the paper.

 Sponsored By MicroStrategy is a leading worldwide provider of enterprise software platforms. Learn more about MicroStrategy at

Driving the Next Generation Data Architecture with Hadoop Adoption

Hadoop has been a phenomenon, both as a framework for big data workloads and for its operational capabilities. Major initiatives brought Hadoop from its batch-oriented roots to the interactive capabilities that are delivering improved performance in SQL engines and with distributed in-memory engines. Operational analytics are leading the way as “one of the first” steps towards operationalizing Hadoop as a platform. There are core data management principles that will guide Hadoop adoption; however, there is also a change in mindset needed to rethink the role of Hadoop beyond a big data and analytic platform. This paper examines the emergence of Hadoop as an operational data platform, and how complementary data strategies and increasing year-over-year adoption can accelerate consolidation and realize business value in agility and reduced development effort. Click here to download the paper. Miss the webinar? View it here.

 Sponsored By To learn more about how MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses, visit

The Power of Strategic Partnerships in a Thriving Hadoop Ecosystem

Aligning the organization’s enterprise technology strategy and existing technologies with a Hadoop distribution vendor’s partner alliances and product ecosystem is critical on the journey to a successful data lake. In this white paper, we classify the hierarchy of technology vendor-partnerships to help companies adopting Hadoop recognize the key relationships that will matter most. Then, we apply these classifications to several of Hortonworks’ strategic vendor-partner alliances to highlight the significant collaborations and shared-vision commitments that provide the most unique competitive advantage for customers. Click here to download the paper.

 Sponsored By To learn more about Hortonworks and its ecosystem of partners visit

Key Considerations for Analytic Solutions for Life Sciences

From pharmaceuticals to global health to the environment, twenty-first century life sciences companies are transforming into data-driven life sciences companies, leveraging vast amounts (and new forms) of data. A strong emphasis on analytics and data discovery for new insights is introducing challenges in how data is woven into the fabric of life sciences organizations. Today’s analytic challenges for life sciences companies, then, can be separated into three distinct categories: the integration challenge, the management challenge, and the discovery challenge. However, the answer to these challenges isn’t the development of new tools or technologies; instead, life sciences companies should turn to collaborative and transformative solutions that already exist. Download this white paper to see how, by embracing a data unification strategy (the adoption, continued refinement, and governance of a semantic layer that enables agility, access, and virtual federation of data) and by incorporating solutions that take advantage of scalable, cloud-based technologies providing advanced analytic and discovery capabilities, including visualization, life sciences companies can continue to become even more data-capable organizations. Click here to download the paper. Miss the webinar? Click here to watch the recording.

Sponsored By To learn more about Birst and how it is helping life sciences organizations to think fast, visit

Enabling Competitive Advantage with Modern Data Platforms

While understanding of the term “big data” continues to vary, its underlying business value proposition is clear: the ability to affordably store all the data you can imagine, and work with it in ways never before possible. Big data has opened the door to the next true revolution – the Age of Data. Many talk about big data as today’s booming oil wells, with so much crude oil available that anyone who can refine it into valuable, consumable, data-driven products will have a successful business model. Businesses today can aspire to leverage data in ways previously only attainable by data giants like Google, Facebook, Amazon, and LinkedIn. However, it’s the experience – giving consumers the information they want in a simple, intuitive, and instantaneous manner – that makes services like Google, Yahoo, or Bing truly valuable. Providing consumers with correct information is important, and accuracy increases statistically as more data becomes available to work with. Simplicity is the result of a product or service’s ability to mask back-end complexity for the user. Being instantaneous, in turn, comes from having the right technologies and platform to deliver the right information in the fleeting “moment of opportunity” for users. Successfully creating new, competitive business value in the Age of Data requires equal parts big data, ultra-performance, and relevant context.

Sponsored By Actian Corporation enables organizations to transform big data into business value with data management solutions to connect, analyze, and take automated Action across their business operations. Visit to learn more.

Unleashing Business Processes with SAP HANA In-Memory Database

Learn about the SAP® HANA® in-memory database and discover how the technology can help transform and optimize your business processes. Review best practices, analytic capabilities, and a business process framework to guide your strategy and planning. HANA is a fast, agile, and scalable database solution that uses columnar database technology to compress data efficiently. It leverages advanced features such as vector processing and single instruction, multiple data (SIMD) parallelization. HANA implements three best-of-breed database engines — columnar, text analytic, and graph — in a single in-memory system. To get the most out of HANA, you need to implement, integrate, and coordinate across multiple business and technology domains. You also need to identify business processes that are good candidates for analytic enablement and enrichment. The following information can help you learn more about HANA and take advantage of its speed, flexibility, and analytic power:

  • Business process and decision framework enabled by HANA technology and its features
  • HANA technology convergence and its analytical capabilities
  • Business Process and Technology Maturity Index

Sponsored By As HANA evolves rapidly, Dell can keep you informed and help you translate upcoming advances into actionable frameworks. Visit to learn more.

Solution Briefs


iVEDiX: Making Visual Discovery Mobile-First 

Over the past few years, iVEDiX has emerged as a proven solution differentiated by its highly customizable visual analytics platform. As a true mobile-first platform that focuses on intuitive user interaction to drive engagement, iVEDiX embraces the mobile paradigm with unique, cutting-edge visual analytics in a compelling, meaningful, and collaborative way. Click here to download the solution brief.

 Sponsored By To learn more about iVEDiX, visit

Dundas BI: From Dashboard to Self-Service Platform

Over its more than twenty years in the data visualization industry, Dundas has progressed on a product roadmap from developer components to a compelling dashboard framework. Today, with the Dundas BI platform, Dundas is introducing an enterprise-class BI solution designed to deliver a self-service experience on one flexible platform. Click here to download the solution brief.

 Sponsored By To learn more about Dundas BI, visit

Hortonworks and Microsoft: Strategic Partnerships in Thriving Hadoop Ecosystems

The opportunities of big data are well worth its challenges. The ability to gain insights from new forms of data (and from previously difficult-to-work-with data) is now a matter of choice in architected deployment options. With the democratization of big data, too, everyone can enjoy its benefits. The strategic Hadoop ecosystem partnership between Hortonworks and Microsoft delivers a new set of architected solutions with key benefits for more companies. Since it was first announced in 2011, the partnership has demonstrated a shared vision for democratizing big data. Because of the combined capabilities of both companies in data technologies, this partnership of more than two years has significant benefits for the enterprise. Click here to download the solution brief. Miss the webinar? Click here to watch the recording!

Sponsored By To learn more about how Hortonworks and Microsoft have partnered to bring the benefits of Apache Hadoop to Windows, visit



Modern Data Platform Playbook Series

CIOs and their chief enterprise and information architects are quickly realizing that the reference architectures and best practices of the past have served them well, but are now challenged to meet the business demands of today’s data-intensive and analytical environments. Companies require economical scalability for big data, but also want high performance; they require information discovery and flexibility, but want to govern semantics and enterprise consistency; they want to benefit from advanced analytics and unstructured data, but also want broad accessibility with SQL. Despite all the recent mega-hype of big data, business analytics, and data scientist headlines, there is real business value and competitive advantage to be realized in these data technologies and skills. Based on the same fundamental set of data management principles that created past reference architectures, the Modern Data Platform (MDP) from Radiant Advisors is a framework that envisions a new reference architecture able to meet today’s challenges. The MDP strategy incorporates accepted and emerging technologies that allow existing data warehousing (DW) and business intelligence (BI) environments to transform in an agile, iterative process of adopting, integrating, and growing a powerful data platform — one that companies can drive at their own pace according to their own needs.

Dell Information Management

As one of the world’s largest software companies, Dell’s Information Management strategy focuses on the challenges facing IT organizations that result from an increasingly heterogeneous data environment. A highly-optimized or “best of breed” architecture, like MDP, doesn’t come without its trade-offs, such as managing a more complex environment and integrating the environment to be a singular platform for users, applications, and devices. By focusing on end-to-end solution domains and key partnerships, Dell’s portfolio offers solutions directly targeted at the ongoing struggle to balance IT standardization and control, while unlocking the business value of innovation and empowering users. Download


Kognitio Analytical Platform

By aligning the strengths and unique differentiators of the Kognitio Analytical Platform with the MDP framework and principles, enterprise and information architects can cultivate a strategy that enables big data and advanced analytics capabilities for the business in ways that are clear and planned within their roadmap. This Kognitio Playbook for Modern Data Platforms focuses on understanding the MDP, the Kognitio technology, and its role within a big data strategy to enable today’s companies to transform into tomorrow’s competitive data-driven organizations. Download

Insight Series

Visual Series #1: The Science of Data Visualization

The best data visualizations are designed to properly take advantage of “pre-attentive features” – visual properties hard-wired into our visual systems that help facilitate quick understanding. Thus, to create the most effective visuals, it’s important to understand the science behind visual cognition. In this first brief in a four-part series we take a high-level look at:

  • An introduction to the science of data visualization
  • Key cognitive ingredients to have a visual dialogue with data
  • How to curate meaning in data through visual cues

Click here to download the paper.

 This Series was originally published by the International Institute of Analytics. Learn more about IIA at 

Visual Series #2: The Building Blocks of Visual Design

Visual elements like lines, textures, shapes, colors, and typography help us organize information in a way that quickly facilitates meaning. But do we fully understand how to best use them as we design visualizations?

As we continue this four-part series originally published by the International Institute of Analytics (IIA), author Lindy Ryan teaches about these fundamental building blocks of visual discovery and how they work together to maximize the visual capacity of data visualization.

Click here to download the paper.

 This Series was originally published by the International Institute of Analytics. Learn more about IIA at 

Visual Series #3: Designing for Experience

Part 3 of this visual design series moves beyond the premise of achieving balance in art and science, to understanding how to create a visual experience for learning complex information through the lens of data visualization.

Click here to download the paper.

 This Series was originally published by the International Institute of Analytics. Learn more about IIA at 

Visual Series #4: Designing for Influence

This final edition of the visual series explores the importance of viewer perception and how this can be leveraged to influence the user by uniting an idea with emotion for effective data storytelling.

Click here to download the paper.

 This Series was originally published by the International Institute of Analytics. Learn more about IIA at 

Big Data Total Cost of Ownership: Evaluating Hard Costs and Options

Big Data platforms are extending data infrastructures, enabling capacity and performance increases in ways that are becoming more and more economical and attainable for today’s data-driven companies. However, estimating the total cost of ownership (TCO) for Hadoop can be challenging, with many costs hidden or not well understood when Hadoop environments are initially built and deployed. In this Insight paper, we explore the total cost of ownership from a tangible (hard) costs perspective. We analyze and illuminate the actual and hidden hard costs of implementing a Hadoop environment by option (including self-managed clusters and service-based environments). This includes hardware and software costs, as well as the support staffing and skills involved. Click here to download the paper.

 Sponsored By Learn more about how Treasure Data provides an easier way to collect, manage and analyze Big Data at

Big Data Total Cost of Ownership: Evaluating Soft Costs and Options

Big Data platforms are extending data infrastructures, enabling capacity and performance increases in ways that are becoming more and more economical and attainable for today’s data-driven companies. However, estimating the total cost of ownership (TCO) for Hadoop can be challenging, with many costs hidden or not well understood when Hadoop environments are initially built and deployed. In this Insight paper, we explore the total cost of ownership of Hadoop Big Data environments from a soft (or hidden) costs perspective. We analyze and discuss “soft costs” – those skills and resources necessary to build, run, and maintain a Hadoop environment – including the costs of opportunity and time to market, staffing Hadoop knowledge and experience, and executing analytics. Click here to download the paper.

 Sponsored By Learn more about how Treasure Data provides an easier way to collect, manage and analyze Big Data at

Self-Sufficient Data Discovery by Design

In today’s emerging discovery culture, business users demand more independence to acquire, analyze, and sustain new insights from their data. As discovery continues to reshape how we earn insights from our data, discovery tools must continue to balance user intuition and self-service capabilities with the high performance needed for sharable, actionable insights across the organization. This paper reviews the elements that enable discovery by design, and discusses how disruptive discovery tools allow companies to truly capitalize on the business process of discovery. Click here to download the paper.

Sponsored By Learn more about how SiSense is designed for self-service business analytics at

From Self-Service to Self-Sufficiency: How Discovery is Driving the Business Shift

In the past several years, “self-service” has come to be understood as users having self-service access to the information they need. Today, that definition is being redefined. Now, self-service is less about access and much more about ability: it’s a fundamental shift from being able to consume something that has been predefined and provided to being able to develop it – to discover it – yourself. With the advent of increasingly robust technologies, there is no shortage of self-service tools on the market today. More important, these tools – BI and beyond – are good. In fact, they are more than good: these next-generation tools are the catalyst enabling business users to become increasingly self-sufficient from IT in their data needs – if they choose. This paper identifies the key aspects empowering new, savvy business users with the BI/DW capabilities and roles traditionally reserved for IT via intuitive and powerful enabling tools, and how that shift will change IT forever. Download

Editorial Reports

All About Analytics

These days, few terms seem more meaningless than “analytics.” As a predicate, “analytics” gets applied to a confusing diversity of assets or resources – from banal operational reports to a machine analysis involving terabytes of information and thousands of predictive models. The confusion is regrettable, but understandable: the truth is that there’s simply a surfeit of analytic technologies, starting with bread-and-butter multidimensional assets – i.e., reports, dashboards, scorecards, and the like. Even in an era of so-called “big data analytics,” these assets aren’t going anywhere. Increasingly, they’re being buttressed by analytic insights from a host of other sources. Advanced practices such as analytic discovery and “investigative computing” – this last describes the methodological application of machine learning (also known as predictive analytics) at massive scale – involve different tools, different methods, and (to some extent, anyway) very different kinds of thinking. This raises a question: how do you meaningfully distinguish between analytic categories and technologies? How do you grow – or establish – a richly varied analytic practice? What must you change in your existing data warehouse environment to support or enable more sophisticated analytic practices? What can – and probably should – stay the same? Sponsored By Dell Software. Visit to learn more. Download

Get In Touch

Boulder, CO USA

About Radiant

Radiant Advisors is a leading strategic research and advisory firm that delivers innovative, cutting-edge research and thought-leadership to transform today's organizations into tomorrow's data-driven industry leaders.