Tip

To make graph editing more comfortable, functions such as Undo, Copy, Paste and Zoom are available.

CloverETL Components Overview

CloverETL's architecture is component-based. Individual components perform transformations ranging from simple to complex. Combined, they can deliver complex data transformations.

The following groups of components are a standard part of the CloverETL distribution:

Readers

Writers

Transformers

Joiners

Other Components

Components for Data Cleansing

Readers

UniversalDataReader reads data from delimited or fixed-length files and sends data records to all connected output ports.
ParallelReader (Commercial Component) reads data from delimited files using multiple threads and sends data records to its output port.
DataGenerator generates data records according to a defined pattern and sends them to all connected output ports.
CloverDataReader reads data from files written in the internal binary Clover data format and sends data records to all connected output ports.
XLSDataReader reads specified sheets of Excel files, converts their contents to data records and sends them to all connected output ports. The mapping of sheet columns to output metadata must be defined.
DBFDataReader reads data from dBase files and sends data records to all connected output ports.
DBInputTable connects to a database using a JDBC driver, executes the specified query, extracts the rows returned by the query and sends them as data records to all connected output ports. Output metadata must precisely describe the structure of the extracted rows.
XMLExtract reads data from XML files, converts elements matched by the mapping, along with their children, to data records and sends them to different connected output ports as defined by the mapping.
XMLXPathReader reads data from XML files, converts elements matched by the XPath expressions defined in the mapping, along with their direct children, to data records and sends them to different connected output ports as defined by the mapping.
JMSReader receives JMS messages, converts them to data records and sends them to all connected output ports. Implements the JmsMsg2DataRecord interface.
LDAPReader reads information from LDAP directory trees, converts it to data records and sends them to all connected output ports.
MultiLevelReader (Commercial Component) reads data from flat files containing heterogeneous data structures, identifies data records using a pluggable selector and sends individual groups of data records to the corresponding connected output ports. The selector implements the MultiLevelSelector interface. By default, PrefixMultiLevelSelector is used.
QuickBaseRecordReader (Commercial Component) reads data records from the QuickBase online database (http://quickbase.intuit.com). The records to read are those specified in the component itself or those whose IDs are received through the optional input port. The records are sent out through the connected first output port. Information about rejected data records can be sent out through the optional second output port if connected. Wraps the API_GetRecordInfo HTTP interaction (https://www.quickbase.com/up/6mztyxu8/g/rc7/en/#_Toc126580046).
QuickBaseQueryReader (Commercial Component) reads data records from the QuickBase online database (http://quickbase.intuit.com) and sends them out through the connected output port. Wraps the API_DoQuery HTTP interaction (https://www.quickbase.com/up/6mztyxu8/g/rc7/en/#_Toc126579999).
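
The difference between delimited and fixed-length input, as handled by a reader such as UniversalDataReader, can be illustrated with a minimal Python sketch. This is not CloverETL code or its API: the function names, the field lists standing in for metadata, and the `;` delimiter are all assumptions made for illustration.

```python
import csv
import io


def read_delimited(text, field_names, delimiter=";"):
    """Parse delimited text into records (dicts) using hypothetical metadata."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip empty lines; pair each value with its field name.
    return [dict(zip(field_names, row)) for row in reader if row]


def read_fixed_length(text, fields):
    """Parse fixed-length text; `fields` is a list of (name, width) pairs."""
    records = []
    for line in text.splitlines():
        if not line:
            continue
        record, pos = {}, 0
        for name, width in fields:
            # Each field occupies a fixed slice of the line; padding is trimmed.
            record[name] = line[pos:pos + width].strip()
            pos += width
        records.append(record)
    return records


delimited = read_delimited("1;Ann\n2;Bob\n", ["id", "name"])
fixed = read_fixed_length("1  Ann  \n2  Bob  \n", [("id", 3), ("name", 5)])
```

In both cases the metadata (field names, and widths for fixed-length data) determines how a line of text becomes a data record.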

Writers

UniversalDataWriter receives data through the connected input port and writes data records to delimited or fixed-length files based on input metadata.
Trash receives data through the connected input port and discards all of it. However, the data can be written to a specified file for debugging.
CloverDataWriter receives data through the connected input port and writes data records to output files in the internal binary Clover data format.
XLSDataWriter receives data through the connected input port and writes data records to specified sheets of Excel files.
StructuredDataWriter receives data through the connected input port(s) and writes data records to output files according to a defined structure consisting of a header, body and footer. The header and footer are optional.
EmailSender (Commercial Component) receives data through the connected input port, composes e-mail messages and sends them out via the specified SMTP server.
DBOutputTable receives data through the connected input port, connects to a database using a JDBC driver, performs the database operation and loads data records into the specified DB table. The correspondence between input metadata and DB table columns must be defined.
DB2DataWriter receives data through the connected input port or reads it from an input file, connects to a DB2 database using the db2 utility and loads data records into the specified database table. The correspondence between input metadata and DB columns must be defined. Information about rejected records can be sent out through the optional output port if connected. It works faster than DBOutputTable.
InfobrightDataWriter loads data into an Infobright database. Only the root user can insert data into the database with this component. To run this component on Windows, infobright_jni.dll must be present in the Java library path (it can be downloaded at www.infobright.org/downloads/contributions/infobright-core-2_7.zip). The component can connect to both local and remote servers. To connect to a remote server, a remote agent must be listening on the specified port number.
InformixDataWriter receives data through the connected input port or reads it from an input file, connects to an Informix database using the dbload or load2 utility and loads data records into the specified database table. Rejected records, along with information about the error, are sent out through the optional output port if connected. It works faster than DBOutputTable.
MSSQLDataWriter receives data through the connected input port or reads it from an input file, connects to an MSSQL database using the bcp utility and loads data records into the specified database table. Rejected records, along with information about the error, are sent out through the optional output port if connected. It works faster than DBOutputTable.
MySQLDataWriter receives data through the connected input port or reads it from an input file, connects to a MySQL database using the mysql utility and loads data records into the specified database table. Information about rejected records can be sent out through the optional output port if connected. It works faster than DBOutputTable.
OracleDataWriter receives data through the connected input port or reads it from an input file, connects to an Oracle database using the sqlldr utility and loads data records into the specified database table. The loading process can be logged to a Log file, and incorrect and rejected records can be written to a Bad file and a Discard file, respectively. It works faster than DBOutputTable.
PostgreSQLDataWriter receives data through the connected input port or reads it from an input file, connects to a PostgreSQL database using the psql utility and loads data records into the specified database table. It works faster than DBOutputTable.
XMLWriter receives data through all connected input ports, converts data records to XML elements based on the defined mapping and writes the resulting tree structure of elements to XML files.
JMSWriter receives data through the connected input port, converts data records to JMS messages and sends them out. Implements the DataRecord2JmsMsg interface.
LDAPWriter receives data through the connected input port and converts data records to information in LDAP directory trees. Rejected records are sent to the optional output port if connected.
QuickBaseRecordWriter (Commercial Component) receives data records through the input port and writes them into the QuickBase online database (http://quickbase.intuit.com). If the optional output port is connected, rejected records, along with information about the errors, are sent out through it. Wraps the API_AddRecordInfo HTTP interaction (https://www.quickbase.com/up/6mztyxu8/g/rc7/en/#_Toc126579962).
QuickBaseImportCSV (Commercial Component) receives data records through the input port and writes them into the QuickBase online database (http://quickbase.intuit.com). It generates record IDs for successfully written records and sends them out through the optional first output port if connected. The first field on this output port must be of the string data type; the generated record IDs are written into this field. Information about rejected data records can be sent out through the optional second output port if connected. Wraps the API_ImportFromCSV HTTP interaction (https://www.quickbase.com/up/6mztyxu8/g/rc7/en/#_Toc126580055).
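
The header/body/footer structure described for StructuredDataWriter can be sketched in Python as follows. This is only an illustration of the idea, not the component's implementation: the function name `write_structured` is hypothetical, and Python `str.format` placeholders are assumed here in place of whatever mask syntax the component itself uses.

```python
import io


def write_structured(records, body_mask, header="", footer="", out=None):
    """Write an optional header, one body entry per record, and an optional footer."""
    out = out if out is not None else io.StringIO()
    out.write(header)  # written once, before any record
    for record in records:
        # The body mask is filled in with the fields of each record.
        out.write(body_mask.format(**record))
    out.write(footer)  # written once, after the last record
    return out


records = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
result = write_structured(
    records,
    body_mask="<row id='{id}'>{name}</row>\n",
    header="<rows>\n",
    footer="</rows>\n",
).getvalue()
```

Omitting `header` and `footer` leaves only the repeated body, which mirrors the statement that both are optional.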

Transformers

SimpleCopy receives data through the connected input port and copies data records to all connected output ports.
ExtSort receives data records through the connected input port, sorts them according to the specified sort key and sends them to all connected output ports. The sort key is the name of a field, or a combination of field names, of the incoming records. The sort order can be either Ascending (default) or Descending. Any number of records can be sorted. If the internal buffer is full, external sorting is performed.
FastSort (Commercial Component) receives data records through the connected input port, sorts them according to the specified sort key and sends them to the output port. The sort key is the name of a field, or a combination of field names, of the incoming records. The sort order can be either Ascending (default) or Descending. Any number of records can be sorted. If the internal buffer is full, external sorting is performed. FastSort can be up to 2.5 times faster than ExtSort and works well with international characters.
SortWithinGroups receives data records through the connected input port and sorts them according to the specified sort key within groups of data records. The groups are defined by a previous sorting of the records by the group key; each group consists of records considered equal when sorted by the group key. The sorted data records are sent to all connected output ports. The sort key (as well as the group key) is the name of a field, or a combination of field names, of the incoming data records. The sort order of each key field can be either Ascending (default) or Descending and can differ between key fields. Any number of records can be sorted. If the internal buffer is full, external sorting is performed.
Dedup receives sorted data records through the connected input port and removes records that are duplicates with respect to the specified key values. It keeps a defined number of records from either the start (First) or the end (Last) of each group with the same key value. If desired, only unique records are kept. The dedup key is the name of a field, or a combination of field names, of the incoming records. Rejected records are sent to the optional second output port if connected.
ExtFilter receives data records through the connected input port, removes some of them depending on the defined filter expression and sends the rest to the connected first output port. Rejected records are sent to the optional second output port if connected.
EmailFilter (Commercial Component) reads records from the input port and parses e-mail addresses in the specified fields. It is capable of both simple syntax validation and advanced methods such as domain existence check, SMTP session validation and mail sending. Valid records are sent out through output port 0, rejected records through output port 1.
Concatenate receives data through all connected input ports in turn. It gets all records from each input port, sends them to the output port and continues with the next input port. Ports without incoming data are skipped. It terminates when all incoming records have been received and sent to the connected output port.
SimpleGather receives data through all connected input ports in turn. It gets only one record from each input port, sends it to the output port and continues with the next input port. Ports without incoming data are skipped. When the last input port is reached, it continues with the first input port that has incoming data. It terminates when all incoming records have been received and sent to the connected output port. Implements inverse RoundRobin.
Merge receives sorted data through all connected input ports. It gets all incoming records and sends them to the connected output port sorted in the same way. It terminates when all incoming records have been received and sent to the connected output port.
Partition receives data through the connected input port, splits the incoming data flow into several flows and sends each of them to a different connected output port.
DataIntersection receives sorted data flows through both input ports, finds data records with the same key values, converts these records to outgoing records according to the defined transformation and sends the resulting records to the second output port. Data records contained only in the first or second data flow are sent to the first or third output port, respectively.
KeyGenerator receives data through the connected input port, generates one new field based on other specified fields, adds this new field to the outgoing data records and sends them to all connected output ports.
Aggregate receives data through the connected input port, aggregates information about groups of adjacent records with the same aggregate key value, creates outgoing records from this information along with other input fields according to output metadata and sends the resulting records to all connected output ports. It can be used similarly to GROUP BY in SQL.
Reformat receives data through the connected input port, transforms it in a user-specified way and sends the new outgoing data records to all connected output ports. Implements the org.jetel.component.RecordTransform interface.
Denormalizer receives data through the connected input port, composes several adjacent incoming records with the same key value into one new outgoing record and sends all of the resulting records to all connected output ports.
Normalizer receives data through the connected input port, splits each data record into several records and sends all resulting records to all connected output ports.
Rollup receives data through the connected input port, groups incoming records by their key values and lets a user transformation create new data records from each group by aggregating, composing or splitting. It sends these new data records through the connected output port(s) as specified by the transformation. Rollup's functionality is a superset of that of the Normalizer and Denormalizer components.
XSLTransformer transforms data records between one input port and one output port.
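
The behaviour of Concatenate, SimpleGather, Merge and Dedup, as described in the list above, can be compared in a short Python sketch. It is a simplified model, not CloverETL code: ports are plain lists, all function names and parameters are hypothetical, and Dedup's "only unique records" option is modelled as `keep="unique"`.

```python
import heapq
from itertools import chain, groupby


def concatenate(*ports):
    """Drain each input port completely before moving to the next one."""
    return list(chain(*ports))


def simple_gather(*ports):
    """Take one record from each port in turn, skipping exhausted ports."""
    iterators = [iter(port) for port in ports]
    output = []
    while iterators:
        still_active = []
        for it in iterators:
            try:
                output.append(next(it))
            except StopIteration:
                continue  # this port has no more data
            still_active.append(it)
        iterators = still_active
    return output


def merge(*ports, key=None):
    """Combine already sorted ports into one flow sorted in the same way."""
    return list(heapq.merge(*ports, key=key))


def dedup(sorted_records, key, keep="first", count=1):
    """Split records sorted by key into (kept, rejected) lists."""
    kept, rejected = [], []
    for _, group in groupby(sorted_records, key=key):
        group = list(group)
        if keep == "first":
            kept += group[:count]
            rejected += group[count:]
        elif keep == "last":
            kept += group[-count:]
            rejected += group[:-count]
        elif keep == "unique":
            # Keep a record only if its key value occurs exactly once.
            (kept if len(group) == 1 else rejected).extend(group)
        else:
            raise ValueError(f"unknown keep mode: {keep}")
    return kept, rejected
```

With the ports `[1, 2, 3]`, `[10]` and `[20, 21]`, `concatenate` yields the records port by port, while `simple_gather` interleaves them one record at a time. `dedup` relies on sorted input because `groupby` only groups adjacent records, which is the same reason the component expects sorted data.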

Joiners

ApproximativeJoin receives data through two input ports and, for each matching key value contained in the driver data records (port 0), searches for corresponding slave data records (port 1) with the same key value. For every pair of driver and slave records, conformity based on the join key is computed as the Levenshtein distance. Pairs with conformity higher than the specified value are joined and the resulting records are sent to the first output port. Pairs with lower conformity are joined and the resulting records are sent to the second output port. Driver records without a slave are sent to the third output port if connected. Slave records without a driver are sent to the fourth output port if connected.
ExtHashJoin receives data through all connected input ports. For every connected slave input port (all input ports except the first), it creates a hash table from the data records incoming through that port; for this reason, the number of incoming slave data records must be sufficiently small. For each join key value contained in the data records incoming through the first input port (drivers), it looks up every hash table and searches for corresponding slave data records with the same key value. Every tuple of a driver and its corresponding slaves is sent to a transformation class that joins them to form the outgoing data flow. The resulting records are sent to the connected output port.
ExtMergeJoin receives sorted data through all connected input ports and, for each join key value contained in the data records incoming through the first input port (drivers), searches for corresponding data records incoming through the other port(s) (slaves) with the same key value. Every combination of a driver and its corresponding slaves is sent to a transformation class that joins them to form the outgoing data flow. The resulting records are sent to the connected output port.
LookupJoin receives data through the connected input port (drivers) and, for each join key value contained in the driver data records, searches for corresponding data records with the same key value in a lookup table (slaves). Every pair of driver and slave records is sent to a transformation class that joins them to form the outgoing data flow. The resulting data records are sent to the first output port. Driver records without a slave are sent to the second output port if it is connected and if an inner join (default) is performed.
DBJoin receives data through the connected input port (drivers), connects to a database and, for each join key value contained in the driver data records, searches for corresponding data records with the same key value in a database table (slaves). Every pair of driver and slave records is sent to a transformation class that joins them to form the outgoing data flow. The resulting data records are sent to the first output port. Driver records without a slave are sent to the second output port if it is connected and if an inner join (default) is performed.
RelationalJoin (Commercial Component) receives sorted data through both connected input ports and, for each join key value contained in the data records incoming through the first input port (drivers), searches for corresponding data records incoming through the second input port (slaves) whose key values satisfy one of the five possible join relation operators (e.g., non-equal, greater than, etc.). Thus, joining is based on these relations instead of the exact matching required by the other joiners. For each join relation, the sort order is strictly defined and must be the same for both inputs. Every combination of a driver and its corresponding slaves is sent to a transformation class that joins them to form the outgoing data flow. The resulting records are sent to the connected output port.
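
The hash-table approach described for ExtHashJoin can be sketched in Python as below. This is a simplified model with a single slave port and an inner join only; `hash_join`, `key` and `transform` are hypothetical names, with `transform` standing in for the transformation class that joins a driver with its slave.

```python
from collections import defaultdict


def hash_join(drivers, slaves, key, transform):
    """Inner join: build a hash table from the slaves, then probe it per driver."""
    table = defaultdict(list)
    for slave in slaves:
        # The whole slave input is held in memory, hence it must be small.
        table[key(slave)].append(slave)

    joined = []
    for driver in drivers:
        # .get() avoids creating empty entries for unmatched driver keys.
        for slave in table.get(key(driver), []):
            joined.append(transform(driver, slave))
    return joined


drivers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
slaves = [{"id": 1, "city": "Oslo"}, {"id": 1, "city": "Rome"}, {"id": 2, "city": "Lima"}]

result = hash_join(
    drivers,
    slaves,
    key=lambda record: record["id"],
    transform=lambda driver, slave: {**driver, **slave},
)
```

The driver input is streamed and never needs to be sorted, while the slave input is fully buffered. A merge join makes the opposite trade-off, requiring sorted inputs on every port but no hash table.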

Others

SystemExecute executes system commands and sends their output to the optional output port or writes it to an output file. If a command requires input data, it can be received through the optional input port.
JavaExecute executes Java commands using a runnable transformation.
DBExecute executes the specified commands (SQL/DML) on the specified database table. If a stored procedure is used, it receives records through the connected input port and sends the return value and/or output parameters to the connected output port.
RunGraph runs the specified graph(s) in a selected instance of Clover. The name(s) of one or more graphs can be received through the connected input port, or the name of a single graph can be set as an attribute of the component. If graph names are received through the input port, information about the execution of the graphs is sent to the connected first output port (the second output port is not connected). If the graph name is set as an attribute of the component, the input port is not connected and information about the execution of the graph is sent to the first output port if the graph succeeded, or to the second output port if it failed.
HTTPConnector receives an HTTP request through the connected input port, reads it from an input file or takes it directly from the graph, and sends it to the HTTP server. It takes the response from the server and sends it out through the connected output port or writes it to the output file. Optionally, the output file URLs can also be sent out through the connected output port.
WebServiceClient receives data through a single connected input port, processes it and sends the request to the web server. It gets the response from the server, processes it and sends it out through the connected output ports according to the mapping. The response is processed as an XML file to one or more output ports.
CheckForeignKey checks foreign key values against a table of primary key values. Duplicate primary keys are ignored. Invalid foreign key values are replaced by the default foreign key value. The resulting foreign data records are sent to all connected output ports. Invalid data records are sent to the optional second output port if connected.
SequenceChecker checks the sort order of incoming data records. If all records are sorted properly for the specified sort key, the component continues; otherwise it fails (thus aborting the graph). The sort key is the name of a field, or a combination of field names, of the incoming records. The sort order is either Ascending (default) or Descending, but it is always the same for all key fields. If the sort order is correct, the incoming records are sent out through the output port(s) if connected.
LookupTableReaderWriter receives data records through the connected input port and writes them to the specified lookup table (output port is not connected), or reads data records from the lookup table and sends them out through all connected output ports (input port is not connected). The lookup table can also be updated before data records are read and sent out (both input port and output ports are connected).
SpeedLimiter receives data through the connected input port, delays each incoming data record by a defined number of milliseconds and copies all data records to all connected output ports.
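
The CheckForeignKey behaviour described above can be modelled with a small Python sketch. It is an interpretation of the description, not the component's code: the function name, the field-name parameters and the dict-based records are all hypothetical.

```python
def check_foreign_key(foreign_records, primary_records, fk_field, pk_field, default_fk):
    """Return (resulting_records, invalid_records) for the two output ports."""
    # A set makes duplicate primary keys irrelevant.
    valid_keys = {record[pk_field] for record in primary_records}

    resulting, invalid = [], []
    for record in foreign_records:
        if record[fk_field] in valid_keys:
            resulting.append(record)
        else:
            invalid.append(record)  # original record, for the second port
            # Copy of the record with the invalid key replaced by the default.
            resulting.append({**record, fk_field: default_fk})
    return resulting, invalid


customers = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 2}]
orders = [{"order": "A", "cust": 1}, {"order": "B", "cust": 9}]

resulting, invalid = check_foreign_key(
    orders, customers, fk_field="cust", pk_field="cust_id", default_fk=0
)
```

Here order `B` refers to a customer that does not exist, so it appears on the first port with the default key `0` and unchanged on the second port.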

Data Cleansing

AddressDoctor receives data records through the connected input port, validates the incoming records using DB files and sends the resulting data records through the connected output port. Serves to validate and correct geographical addresses (countries, cities, streets, postal codes, etc.).
AddressDoctorTransliteration receives data records through the connected input port, transliterates data fields from one encoding to another and sends the resulting data records in the new encoding through the connected output port.
Group1 receives data records through the connected input port, establishes a connection with the Group1 database, calls the specified service, validates incoming data fields depending on the service and sends the resulting data records through the connected output port. Serves to validate and correct geographical addresses (countries, cities, streets, postal codes, etc.).
Trillium receives data records through the connected input port, establishes a connection with the Trillium server, cleans incoming data records using the server and sends the resulting data records through the connected output port. Serves to validate and correct geographical addresses (countries, cities, streets, postal codes, etc.).