CloverETL Performance

We are confident that CloverETL delivers leading performance. The benchmark data shown below has not been chosen to show off the fastest Clover components. Instead, we have used industry-standard database benchmarks developed independently by the Transaction Processing Performance Council (TPC) and implemented them as ETL data transformation graphs.

We have included Pentaho and Talend by way of comparison, but we are performance leaders no matter whom you compare us with.
 



Areas of High Performance

The benchmarks below do not focus on the very fastest aspects of Clover. Instead, we are looking to show real-world performance for bread-and-butter data transformations. However, many parts of Clover can be tuned to increase performance further. Some examples of where Clover is extremely fast are:

  • File-based and memory-based lookups
  • Clustering mode, which splits data transformations across multiple servers and in most cases delivers a near-linear performance improvement as you add servers
  • Hash-based merging and aggregating
  • High-performance sorting
     

TPC-H Benchmark (Q1)

The following list gives a synopsis of the transformation process. Full details of the TPC-H benchmarks are discussed immediately after the benchmark results.

  • Raw data is supplied by TPC-H and contains 59,986,052 Line Items (7.24 GB) of order data
  • Filter out any order items older than a certain date
  • Calculate the discount value by applying a discount percentage to the line item value for each item
  • For each order (and the line items within that order), calculate the number of order items, the total value of order items, the total discount, and the difference in days between the last item and the first item
  • Sort the data by the time taken to ship an order (the difference in days between the dates on which the first and last items of the order shipped)
  • Write the result to a CSV file
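
The steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Clover graph used in the benchmark: it holds everything in memory, so it would not cope with the full 7.24 GB data set. The field names (order_key, ship_date, price, discount), the cutoff date and the sample rows are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Output columns; the names are hypothetical, chosen for this sketch.
FIELDS = ["order_key", "item_count", "total_value", "total_discount", "ship_span_days"]


def summarize_orders(line_items, cutoff):
    """Filter line items, aggregate them per order, sort by shipping span."""
    groups = defaultdict(list)
    for item in line_items:
        # Drop line items older than the cutoff date.
        if item["ship_date"] < cutoff:
            continue
        groups[item["order_key"]].append(item)

    rows = []
    for order_key, items in groups.items():
        dates = [i["ship_date"] for i in items]
        rows.append({
            "order_key": order_key,
            "item_count": len(items),
            "total_value": sum(i["price"] for i in items),
            # Discount value = line item value * discount percentage.
            "total_discount": sum(i["price"] * i["discount"] for i in items),
            # Days between the first and last shipped item of the order.
            "ship_span_days": (max(dates) - min(dates)).days,
        })

    # Sort by the time taken to ship the order (shortest first, an assumption).
    rows.sort(key=lambda r: r["ship_span_days"])
    return rows


def write_csv(rows, out):
    """Write the aggregated rows to any file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Tiny made-up sample; the real input has 59,986,052 line items.
sample = [
    {"order_key": 1, "ship_date": date(2020, 1, 1), "price": 100.0, "discount": 0.5},
    {"order_key": 1, "ship_date": date(2020, 1, 11), "price": 200.0, "discount": 0.25},
    {"order_key": 2, "ship_date": date(2020, 1, 5), "price": 40.0, "discount": 0.0},
    {"order_key": 3, "ship_date": date(2019, 1, 1), "price": 10.0, "discount": 0.0},
]
buffer = io.StringIO()
write_csv(summarize_orders(sample, date(2020, 1, 1)), buffer)
print(buffer.getvalue())
```

A streaming ETL tool would typically replace the in-memory grouping with an external sort or a hash aggregation that spills to disk, which is where most of the measured time is likely to go.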


TPC-H Benchmark (Q3)

The following list gives a synopsis of the transformation process. Full details of the TPC-H benchmarks are discussed immediately after the benchmark results.

  • Raw data is supplied by TPC-H and contains 59,986,052 Order Line Items (7.24 GB) from 15,000,000 Orders (1.63 GB) placed by 150,000 customers (230.2 MB)
  • Out of date items are filtered out from Orders and Line Items
  • Only Customers of a certain customer type are accepted; others are filtered out
  • Customers, Orders and Line Items are joined on common keys
  • Data is aggregated on Revenue per Order
  • Data is sorted by Order Total and Order Date
  • Write results to CSV file
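
As a rough illustration of the filter, join, aggregate and sort steps listed above, here is a minimal in-memory sketch in Python. It is not the Clover graph used in the benchmark. The field names, the customer type value and the sample rows are hypothetical. The date predicates (orders placed before the cutoff, line items shipped after it), the revenue formula price * (1 - discount) and the sort direction (highest revenue first, then earliest order date) are assumptions that follow the TPC-H Q3 query definition rather than anything stated in the list.

```python
from collections import defaultdict
from datetime import date


def revenue_per_order(customers, orders, line_items, segment, cutoff):
    """Return (order_key, revenue, order_date) tuples for the accepted orders."""
    # Hash lookup of accepted customers (filter on customer type).
    accepted = {c["cust_key"] for c in customers if c["segment"] == segment}

    # Filter orders by date and join them to customers through the lookup.
    kept_orders = {
        o["order_key"]: o
        for o in orders
        if o["order_date"] < cutoff and o["cust_key"] in accepted
    }

    # Filter line items by date, join them to orders, aggregate revenue.
    revenue = defaultdict(float)
    for li in line_items:
        if li["ship_date"] > cutoff and li["order_key"] in kept_orders:
            revenue[li["order_key"]] += li["price"] * (1 - li["discount"])

    rows = [
        (key, total, kept_orders[key]["order_date"])
        for key, total in revenue.items()
    ]
    # Highest revenue first; ties broken by earliest order date.
    rows.sort(key=lambda r: (-r[1], r[2]))
    return rows


# Tiny made-up sample data.
customers = [
    {"cust_key": 1, "segment": "BUILDING"},
    {"cust_key": 2, "segment": "AUTOMOBILE"},
]
orders = [
    {"order_key": 10, "cust_key": 1, "order_date": date(2020, 1, 1)},
    {"order_key": 11, "cust_key": 1, "order_date": date(2020, 1, 5)},
    {"order_key": 12, "cust_key": 2, "order_date": date(2020, 1, 2)},
    {"order_key": 13, "cust_key": 1, "order_date": date(2020, 3, 1)},
]
line_items = [
    {"order_key": 10, "ship_date": date(2020, 2, 10), "price": 100.0, "discount": 0.5},
    {"order_key": 10, "ship_date": date(2020, 1, 15), "price": 1000.0, "discount": 0.0},
    {"order_key": 11, "ship_date": date(2020, 2, 5), "price": 200.0, "discount": 0.25},
    {"order_key": 12, "ship_date": date(2020, 2, 5), "price": 500.0, "discount": 0.0},
    {"order_key": 13, "ship_date": date(2020, 3, 5), "price": 300.0, "discount": 0.0},
]
result = revenue_per_order(customers, orders, line_items, "BUILDING", date(2020, 2, 1))
for row in result:
    print(row)
```

Building the two dictionaries first and then streaming the largest input (the line items) past them is the usual shape of a hash join: the small sides are held in memory and the big side is read once.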



The TPC-H Benchmarks

We feel you should be skeptical about benchmarks. They are usually hard to interpret and difficult to relate back to your own data. Human nature also dictates that vendors will always pick the benchmark that suits them best. For this reason, we have chosen an independent, industry-standard benchmark developed by the Transaction Processing Performance Council (TPC) that is commonly used to measure database performance. More information about this can be found at http://www.tpc.org/tpch. The actual benchmarks we have implemented (TPC-H Q1 and TPC-H Q3) can be found in this PDF document: http://www.tpc.org/tpch/spec/tpch2.9.0.pdf.

These benchmarks are valuable and valid because they do not highlight one particularly fast component that we have chosen in order to show off Clover. They represent typical business data transformations that you are likely to encounter in practice. As such, they are a reasonably balanced measure of performance. We accept that this does not make them perfect, and we invite interested parties to submit data sets to us if they require a different benchmark to be put together.


Hardware Used

The machine we used to run these tests had the following server configuration:

  • 2 x 2.33 GHz Intel Xeon quad-core, 12 MB L2 cache (8 cores in total)
  • 8 GB RAM
  • Windows Server 2003 Enterprise Edition SP1, 32-bit


Your Own Personal Benchmark

We are very confident about our ability to perform and outperform. If you have already benchmarked your data, just send us the data and the full details of your benchmarks, and we will help you set up the exact same benchmark in Clover. You can then execute the full benchmark on your own hardware and perform a proper side-by-side comparison. If you are interested in this, please complete a sales inquiry form and provide some details.