The performance of modern databases is critical to supporting business intelligence and decision support in organizations. Database vendors therefore often rely on the industry-standard TPC-DS (Decision Support) benchmark to test and demonstrate their SQL capabilities and performance. As an objective third-party research organization, Radiant Advisors often performs independent validations of this kind on behalf of database vendors.
Due to the rigors of independent TPC-DS testing and database SQL development, our approach is to independently validate vendors’ published SQL statements and performance results. We install and configure the databases ourselves and run the TPC-DS data generator program to produce the data load files. This benchmark was sponsored by Kinetica and assesses Kinetica and ClickHouse, via the open-source vendor Altinity, on Azure.
Scale Factor and Environment: Comparative database benchmarks must use an equivalent environment and data scale factor (SF). We set the TPC-DS data generator scale factor to 200 for generating data files. Note that many vendors’ internal TPC-DS tests use a low SF, which demonstrates SQL execution capability but gives a misleading picture of response times. Our goal was to test a 1 billion-row fact table in TPC-DS. With SF=200, we loaded the store_sales table with 1.4 billion records of generated data, the catalog_sales table with 720 million records, and the web_sales table with 360 million records. Kinetica and ClickHouse benchmarking used the same hardware configuration of four Azure virtual machines: E48s v4 (48 vCPU, 384 GB RAM) with 2 TB premium SSD.
Kinetica Benchmark Results: We accessed the Kinetica TPC-DS GitHub repository for their published list of TPC-DS SQL statements to test independently. The Kinetica database executed 98 of 99 queries in each of our six sequential test runs (SQL 1 through 99) and showed consistent timings with very low deviation. With the SF=200 data, 22 queries ran in less than 1 second, and 80 of the 98 completed in less than 10 seconds, which are impressive results. Only four queries ran over one minute, and the longest-running query took 189 seconds, which we still consider excellent performance.
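To make the consistency and threshold counts above concrete, here is a minimal Python sketch of how per-query timings from several runs could be summarized. It is an illustration only, not the tooling used in this benchmark: the function names and the sample timings are hypothetical, and the coefficient of variation is just one reasonable way to express "low deviation."

```python
from statistics import mean, pstdev


def summarize_runs(timings):
    """Summarize run times per query.

    timings: dict mapping query number -> list of elapsed seconds, one per run.
    Returns a dict mapping query number -> mean seconds and coefficient of variation.
    """
    summary = {}
    for query, runs in timings.items():
        avg = mean(runs)
        # Coefficient of variation: spread relative to the mean (0 means identical runs).
        cv = pstdev(runs) / avg if avg > 0 else 0.0
        summary[query] = {"mean_s": avg, "cv": cv}
    return summary


def count_under(summary, threshold_s):
    """Count queries whose mean run time is below threshold_s seconds."""
    return sum(1 for stats in summary.values() if stats["mean_s"] < threshold_s)


# Hypothetical timings for three queries over six runs (NOT measured results).
sample = {
    1: [0.8, 0.9, 0.8, 0.8, 0.9, 0.8],
    2: [5.0, 5.2, 5.1, 4.9, 5.0, 5.1],
    3: [70.0, 71.0, 69.5, 70.5, 70.0, 70.2],
}

summary = summarize_runs(sample)
print("under 1 s:", count_under(summary, 1.0))
print("under 10 s:", count_under(summary, 10.0))
```

Applied to the real logs, the same two counts would correspond to the "less than 1 second" and "less than 10 seconds" figures reported above.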
ClickHouse Benchmark Results: We accessed the Altinity GitHub repository for ClickHouse SQL statements. Its readme.md file states that their environment used data scale 1 (1 GB of data). The readme.md explains that 21 of their TPC-DS queries work out of the box and that another 20 queries are “fixable.” Altinity also describes why 78 queries failed to execute. Because these categories overlap (as shown in our spreadsheet analysis), we expected 24 runnable ClickHouse queries: the 21 working queries plus three listed as fixable but not as failed. However, in our tests only 8 of these 24 queries ran; the other 16 failed with a database error on SQL statements that included the 1.4 billion-row store_sales table. The sizeable “fact” tables were loaded as sharded tables in the distributed database environment, and these substantial operations repeatedly failed in our tests.
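The overlap reasoning above is simple set arithmetic: take the queries listed as working, then add those listed as fixable that are not also listed as failed. A small Python sketch follows. The query numbers are placeholders chosen for illustration and are not Altinity's actual lists, and the function name is hypothetical.

```python
def expected_runnable(working, fixable, failed):
    """Return queries expected to run: working ones plus fixable ones not listed as failed."""
    fixable_not_failed = fixable - failed
    return working | fixable_not_failed


# Placeholder query numbers (NOT the actual lists from the Altinity readme.md).
working = {1, 2, 3}
fixable = {4, 5, 6}
failed = {5, 6, 7, 8}

candidates = expected_runnable(working, fixable, failed)
print(sorted(candidates))  # [1, 2, 3, 4]
```

With the real lists, the same calculation is what yields the 24 candidate queries (21 working plus three fixable and not failed) described above.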
Comparative Performance Analysis: Only the eight queries that ClickHouse could run can be compared with Kinetica’s results for the same queries. Kinetica outperformed ClickHouse on each of them. The performance advantage ranged from 42% faster on SQL #28 to 23,425% faster on SQL #99. (See Table below.) Of the eight queries, ClickHouse’s longest-running query took 365 seconds, and Kinetica’s longest-running took 14 seconds. As noted above, none of Kinetica’s 98 executed queries ran longer than 189 seconds.
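For readers who want to reproduce a "percent faster" figure from two run times, one common definition is the slower time divided by the faster time, minus one, expressed as a percentage. The sketch below assumes that definition; this report does not state the exact formula behind its figures, so treat the function and the sample timings as hypothetical.

```python
def percent_faster(slow_seconds, fast_seconds):
    """Percent by which the faster system beats the slower one.

    Assumed definition: (slow / fast - 1) * 100.
    Under this definition, 100% faster means half the elapsed time.
    """
    if slow_seconds <= 0 or fast_seconds <= 0:
        raise ValueError("run times must be positive")
    return (slow_seconds / fast_seconds - 1.0) * 100.0


# Hypothetical timings in seconds (NOT measured results from this benchmark).
print(round(percent_faster(14.2, 10.0), 1))  # 42.0
```

Other definitions exist, such as the reduction in elapsed time relative to the slower system, and they give different numbers for the same pair of timings, so it is worth checking the published results table for the raw run times.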
In this Radiant Advisors benchmark, conducted with MResult database engineers, we could only compare 8 TPC-DS queries between Kinetica and ClickHouse. The complete TPC-DS benchmark results (SQL capability, performance times, and consistency) for the Kinetica database at scale factor 200 are available for review and assessment. The TPC-DS testing logs can be found in the Radiant Advisors public GitHub repository. We invite others to review our work and post comments and questions regarding this benchmark and how other modern databases with the same environment configuration would compare.
Query Results TPC-DS SF200 Kinetica and ClickHouse Analysis (Download results table here.)
John O’Brien is Principal Advisor and CEO of Radiant Advisors. A recognized thought leader in data strategy and analytics, John’s unique perspective comes from the combination of his roles as a practitioner, consultant and vendor CTO.
Kinetica helps many of the world’s largest companies solve complex problems across time and space. Organizations use Kinetica to simultaneously ingest and analyze fast-moving spatiotemporal data to build the next generation of advanced IoT solutions.