CloverETL Cluster Edition
Clover Cluster delivers the very best in scalability and price/performance, not just within our own product line but also within the industry. Clover Cluster serves organizations that have to process huge volumes of data in a limited period of time. Whether you have one large dataset or a large number of smaller datasets, Clover Cluster aims to provide near-linear scalability as you add servers to your cluster.
Clover Cluster Demonstration
6 million records processed and transformed in under 30 seconds!

Clover Cluster runs on your in-house servers or on a cloud-based server platform. In the 10-minute video demonstration, we fire up 4 Amazon EC2 server instances and add them to our Clover Cluster. We then run a transformation that performs the following operations
Clover Cluster uses the same Designer as all Clover ETL products. The Clover Server also has exactly the same web-based interface for managing and configuring your server nodes.

Cluster Configuration
You can add as many nodes to the system as you need, and nodes can be added or removed without any complex setup. You designate one of your servers as the Master Node. That node then assumes responsibility for distributing tasks among the Worker Nodes.
Master and Worker Nodes
The idea behind our Cluster approach is to divide and conquer a data source, allowing different nodes to read and process different portions of it simultaneously. The Master Node analyzes the data source to be loaded and then instructs each Worker Node to process a specific section of that data source. Worker Nodes can be added dynamically, which, when coupled with a cloud environment, allows for efficient and cost-effective scaling up and down.

Non-Parallel Processing
Certain graph operations need to work on data in a way that cannot genuinely be achieved in parallel. The Master Node is aware of these situations and orchestrates the data processing accordingly.
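The divide-and-conquer idea can be illustrated with a minimal sketch. This is not Clover's implementation: the function `plan_partitions`, the `Assignment` type and the node names are hypothetical, and the sketch assumes the data source is a flat file that can be split by byte offset. A real planner would also need to align each boundary to the nearest record delimiter.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """One worker's share of the data source (hypothetical type)."""
    worker: str
    start: int  # inclusive byte offset
    end: int    # exclusive byte offset


def plan_partitions(total_bytes: int, workers: list[str]) -> list[Assignment]:
    """Split [0, total_bytes) into contiguous, near-equal ranges, one per worker."""
    if not workers:
        raise ValueError("at least one worker is required")
    base, extra = divmod(total_bytes, len(workers))
    plan = []
    offset = 0
    for i, worker in enumerate(workers):
        # The first `extra` workers each take one additional byte,
        # so the ranges cover the source exactly with no gaps.
        size = base + (1 if i < extra else 0)
        plan.append(Assignment(worker, offset, offset + size))
        offset += size
    return plan


if __name__ == "__main__":
    # Illustrative numbers only: an 850 MB file shared by four workers.
    for a in plan_partitions(850_000_000, ["node1", "node2", "node3", "node4"]):
        print(a)
```

Because each range is derived only from the file size and the worker list, the plan can be recomputed whenever a Worker Node joins or leaves.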
Figure: Performance increase starting from a desktop machine and then moving to an Amazon EC2 cluster with 2, 3 and 4 nodes.
Remote Data Sources
Your data source may be at a remote location relative to your cluster. Clover can read flat data sources over FTP and HTTP by streaming data from the source, processing it and then writing it out, all in parallel.

Scalability & Performance
Clover Cluster delivers the very best in scalability and price/performance, not just within our own product line but also within the industry.

Resiliency & Redundancy
Another benefit of the Cluster Edition is redundancy. Should any of your cluster nodes fail, your transformations will still run normally, just a little slower. The intelligent load balancing automatically apportions processing across as many servers as are operational within the cluster.
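As a rough illustration of apportioning work across whichever servers are operational, the sketch below spreads partitions round-robin over the live nodes. It is an assumption-laden example, not Clover's load-balancing algorithm: the function `rebalance`, the node names and the up/down flags are all hypothetical.

```python
from __future__ import annotations


def rebalance(partitions: list[int], nodes: dict[str, bool]) -> dict[str, list[int]]:
    """Assign partition ids round-robin to the nodes currently marked as up."""
    live = sorted(name for name, is_up in nodes.items() if is_up)
    if not live:
        raise RuntimeError("no operational nodes in the cluster")
    assignment: dict[str, list[int]] = {name: [] for name in live}
    for i, partition in enumerate(partitions):
        assignment[live[i % len(live)]].append(partition)
    return assignment


if __name__ == "__main__":
    work = list(range(8))
    # All four nodes operational: two partitions each.
    print(rebalance(work, {"node1": True, "node2": True, "node3": True, "node4": True}))
    # node3 has failed: the same work is spread over the remaining three nodes.
    print(rebalance(work, {"node1": True, "node2": True, "node3": False, "node4": True}))
```

With one node down, every partition is still processed; each surviving node simply carries a larger share, which matches the "just a little slower" behaviour described above.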
Cluster Transformation Example
The upper of the two graphs shown here is configured to run in the Desktop or Enterprise Editions. Click on the video icon to see this graph processing an industry-standard benchmark that reads an 850 MB file with 6 million records. The lower graph shows exactly the same transformation configured to run in a Clover Cluster. The graph is essentially the same, but it takes into account that data processed by multiple cluster nodes has to be merged into a single dataset before the final stages of processing can logically be carried out. Please watch the video at the top of the page for a more detailed presentation of this graph and to see it running.
Figure (upper): Normal graph running on Desktop or Enterprise Editions.
Figure (lower): The same transformation configured to take advantage of the Cluster.
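The merge step in the cluster graph can be sketched as follows. This is only an illustration under stated assumptions: each worker is assumed to emit its records already sorted by key, records are modelled as `(key, value)` tuples, and the function `merge_worker_outputs` is a hypothetical name. It uses the standard library's `heapq.merge`, not any Clover component.

```python
from __future__ import annotations

import heapq
from typing import Iterable


def merge_worker_outputs(outputs: Iterable[list[tuple[int, str]]]) -> list[tuple[int, str]]:
    """Combine per-worker sorted record lists into one sorted dataset."""
    # heapq.merge lazily interleaves already-sorted inputs, so no
    # worker's output needs to be re-sorted before the final stages.
    return list(heapq.merge(*outputs, key=lambda record: record[0]))


if __name__ == "__main__":
    worker_a = [(1, "alpha"), (4, "delta")]
    worker_b = [(2, "bravo"), (3, "charlie")]
    print(merge_worker_outputs([worker_a, worker_b]))
```

Once the partial outputs are back in a single ordered stream, any stage that needs the whole dataset can run on it as it would in the single-machine graph.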





