Big Time with Big Records

Those of you who have tried to process big records with CloverETL have already learned that it takes some tweaking and special care to make it run smoothly and efficiently. In some cases, CloverETL could get too greedy with memory for a graph run, which made it quite cumbersome to set up. With CloverETL 3.2 we have introduced improved memory management in the runtime layer that optimizes memory usage when running graphs with big records. [Continue reading]

Performance Optimization of Metrics in CloverETL Data Profiler

The first beta version of CloverETL Data Profiler was released in October, and since then we have been working on improvements for the second beta version, which was released at the end of last year. Besides fixing bugs and adding a few new features, we also worked on performance optimization of profiling metrics. This article describes that improvement and how profiling is interconnected with the CloverETL Engine. [Continue reading]

Meet Joda and get 30% more power!

Joda… don’t get too excited - Clover has (most probably) not contracted a Jedi master to squeeze a portion of extra power into the product. In the real world, Joda is quite a useful third-party library for handling date and time operations. It has been in CloverETL for some time as an alternative to the standard Java date implementation. Although it lacks the superpowers of the aforementioned sci-fi character, it is well worth befriending, and using it wisely might give your data transformations a noticeable punch in terms of performance. [Continue reading]

Sorting Data: ExtSort vs. FastSort – which one is better for me? (Part 2)

In my previous post I focused on tips for tweaking the FastSort component, the high-performance sort component available in all commercial CloverETL editions. Today, I would like to touch on the original ExtSort component, which has been in CloverETL for a while and is available in both commercial and free (Community, open-source engine) editions. [Continue reading]

Parallel Data Processing Comparison – CloverETL vs. Talend vs. Pentaho (Part 3)

As promised, here is a comprehensive comparison of ETL tools: CloverETL, Talend and Pentaho.

A short summary of my previous posts: for testing I used two transformations based on the TPC-H test, with input data generated by the dbgen utility. The transformations were run on my laptop with Windows Vista Home Premium. For detailed information, see part 1 and part 2. [Continue reading]

Parallel Data Processing Comparison – CloverETL vs. Talend vs. Pentaho

On Oct. 21 OpenSys released a new version of its ETL tool, CloverETL Designer version 2.8.1. It is mainly a bugfix version, but it also brings a new component, ParallelReader, that makes processing of delimited data files (CSV) faster than ever before. [Continue reading]

ParallelReader Component: Performance Boost in Data Processing

In the October release 2.8.1 of Clover we introduced a new component that should definitely attract your attention: the ParallelReader. The name itself already suggests the goal of the component, which is to improve reading speed by going parallel. The component is very similar in function to the Universal Data Reader: it reads delimited flat files such as CSV and tab-delimited files, so not much has changed here. But the real difference lies under the hood. [Continue reading]