ComplexDataReader is a powerful new component in CloverETL meant for reading elaborate heterogeneous data. However, not all data can be read easily, even if you spend a lot of time configuring the component. Sometimes you need to think ahead: what if you come across unknown metadata that you have not handled? Normally, the graph crashes.
Users often need information that the data source itself does not contain but that is easy to define. It is therefore important to be able to add information to your records that is not already present in the file (e.g. a time stamp or the name of the Excel sheet). Such additional information can simplify further data processing.
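To illustrate the idea outside of CloverETL, here is a minimal Python sketch that attaches source information to every record while reading. It is not how CloverETL does it; the function name `read_with_source_info` and the added field names `source_name` and `read_at` are assumptions made up for this example.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Dict, Iterator, TextIO


def read_with_source_info(
    stream: TextIO, source_name: str, read_at: datetime
) -> Iterator[Dict[str, str]]:
    """Yield each CSV record with two extra fields that the file itself lacks.

    `source_name` and `read_at` are hypothetical field names chosen for
    this sketch (e.g. the sheet or file name and the time of reading).
    """
    for row in csv.DictReader(stream):
        enriched = dict(row)  # copy so the original record is not modified
        enriched["source_name"] = source_name
        enriched["read_at"] = read_at.isoformat()
        yield enriched


if __name__ == "__main__":
    data = io.StringIO("id,name\n1,Alice\n2,Bob\n")
    for record in read_with_source_info(data, "Sheet1", datetime.now(timezone.utc)):
        print(record)
```

Downstream steps can then filter or group on the added fields without having to go back to the source.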
As promised, here is a comprehensive comparison of ETL tools: CloverETL, Talend and Pentaho.
A short summary of my previous posts: for testing I used two transformations based on the TPC-H benchmark, with input data generated by the dbgen utility. The transformations were run on my laptop with Windows Vista Home Premium. For details, see part 1 and part 2.
Before we release a complete comparison of open source ETL tools, and following the success of my previous blog post, I decided to publish the second transformation that we used in the comparison.
On Oct. 21 OpenSys released a new version of its ETL tool, CloverETL Designer version 2.8.1. It is mainly a bugfix release, but it also brings a new component, ParallelReader, that makes delimited data file (CSV) processing faster than ever before.
In the October release of Clover, 2.8.1, we introduced a new component that definitely deserves your attention: the Parallel Reader. The name itself suggests the goal of the component, which is to improve reading speed by going parallel. In function, the component is very similar to Universal Data Reader: it reads delimited flat files such as CSV or tab-delimited files, so not much has changed there. The real difference is under the hood.
CloverETL provides a very useful feature: multiple delimiters. When you parse a delimited file (e.g. CSV), you can specify a different delimiter for each field. This is not surprising for daily CloverETL users, but it may be for users of other ETL tools. What might be less well known is that in CloverETL you can even define multiple delimiters for one field (also called a "mutable delimiter"), and CloverETL chooses the right one. This opens up new ways of processing files with irregular structure in CloverETL. I believe this functionality isn't provided by any other ETL tool on the market. If I am wrong, you can leave me a message in the comments. I'm always happy to find "hidden features" of other ETL tools.
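The idea of per-field delimiters with several alternatives can be sketched in a few lines of Python. This is only an illustration of the concept, not CloverETL's parser or its metadata syntax; the function `split_record` and the rule "the earliest matching candidate wins" are assumptions of this sketch.

```python
from typing import List, Sequence


def split_record(line: str, field_delimiters: Sequence[Sequence[str]]) -> List[str]:
    """Split one record into fields.

    field_delimiters[i] lists the candidate delimiters that may end field i.
    The candidate that occurs first in the remaining text is used; if two
    candidates start at the same position, the longer one is preferred.
    """
    fields: List[str] = []
    pos = 0
    for i, candidates in enumerate(field_delimiters):
        best_idx, best_len = -1, 0
        for delim in candidates:
            idx = line.find(delim, pos)
            if idx == -1:
                continue  # this candidate does not occur in the rest of the line
            if best_idx == -1 or idx < best_idx or (idx == best_idx and len(delim) > best_len):
                best_idx, best_len = idx, len(delim)
        if best_idx == -1:
            raise ValueError(f"no delimiter found for field {i} in {line!r}")
        fields.append(line[pos:best_idx])
        pos = best_idx + best_len  # continue right after the chosen delimiter
    return fields


if __name__ == "__main__":
    # Field 1 ends with ';', field 2 with either ',' or '|', field 3 with a newline.
    layout = [[";"], [",", "|"], ["\n"]]
    print(split_record("1;Alice|NY\n", layout))  # ['1', 'Alice', 'NY']
    print(split_record("2;Bob,LA\n", layout))    # ['2', 'Bob', 'LA']
```

Both records are parsed with the same layout even though the second field is terminated differently in each, which is what makes irregular files manageable.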