Querying Twitter in CloverETL

These days, social networks are pervasive. It’s virtually impossible to avoid some kind of interaction with at least a few of them. Not only that, but the mere fact that so many people use them means there’s a ton of interesting data available within.

A typical example of such a popular network is Twitter, with more than 500 million tweets sent each day. Wouldn’t it be useful if you could automatically find the tweets you want and then process them in bulk? The capacity to dig through heaps of social interactions in an effective manner is one of the core promises of Big Data – and it’s a valuable one. In this blog post, I will show you how to do it with CloverETL.

First of all, you need to register an application with Twitter so that you can use it to access the API later. Log in to https://dev.twitter.com/apps and select “Create new application” to set it up. Fill in the name, description, and website if you want, leave the Callback URL field empty, and submit the form.

After submitting, you’ll get to a page with application details. There is an OAuth settings section on this page where you can find “Consumer key” and “Consumer secret.” You’ll need these to connect from CloverETL.

Further down on the page, there is a “Your access token” section. Click the “Create my access token” button. This might take some time, so wait a few seconds and then reload the page. You should see your “Access token” and “Access token secret” there. These two values will be used in CloverETL too.

With that, you’re done working on the Twitter side. Let’s now proceed to CloverETL.

We’re going to use Twitter’s REST API, so we’ll essentially be performing HTTP requests. The best component for this job is the HTTPConnector.

To configure the HTTPConnector component, you need to specify these five attributes:

  • URL
  • OAuth Consumer key
  • OAuth Consumer secret
  • OAuth Access token
  • OAuth Access token secret
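HTTPConnector takes care of signing the request for you, so nothing below is needed inside CloverETL. Purely as an illustration of what the four OAuth values are used for, here is a minimal Python sketch of an OAuth 1.0a HMAC-SHA1 signature as described in RFC 5849. The function names and the placeholder credentials are made up for this example, and it is not a description of how HTTPConnector is implemented.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # OAuth 1.0a leaves only letters, digits and - . _ ~ unencoded.
    return quote(value, safe="-._~")


def sign_request(method, base_url, params, consumer_secret, token_secret):
    """Return the base64 HMAC-SHA1 signature for one request.

    `params` must hold the query parameters plus the oauth_* parameters
    (consumer key, token, nonce, timestamp, signature method, version).
    """
    # Encode every key and value, then sort the pairs and join them.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    # Signature base string: METHOD & encoded URL & encoded parameter string.
    base_string = "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(param_string)]
    )
    # The signing key combines the two secrets; the key and token travel in `params`.
    signing_key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Placeholder values only; real ones come from your Twitter application page.
example_params = {
    "q": "@CloverETL",
    "oauth_consumer_key": "my-consumer-key",
    "oauth_token": "my-access-token",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1380000000",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
}
signature = sign_request(
    "GET",
    "https://api.twitter.com/1.1/search/tweets.json",
    example_params,
    "my-consumer-secret",
    "my-access-token-secret",
)
```

The consumer key and access token identify the application and the user, while the two secrets never leave your machine and are only used to compute the signature.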

All OAuth attributes are taken from the registered Twitter application (see above). The URL depends on the REST API method you want to use. For example, https://api.twitter.com/1.1/search/tweets.json?q=%40CloverETL searches for tweets related to @CloverETL.
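Note that the `@` in the query has to be percent-encoded as `%40` in the URL. If you generate URLs outside of CloverETL, a small helper can do the encoding for you. The sketch below is in Python, and the helper name `build_search_url` is just an illustration.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query: str, **extra_params: str) -> str:
    """Build a search URL with the query (and any extra parameters) percent-encoded."""
    params = {"q": query, **extra_params}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("@CloverETL")
```

The resulting string is what you would paste into the URL attribute of the HTTPConnector.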

The result is returned in JSON format. You can either store it in a file (the Output file URL attribute of the component) or map the response content to an output port and process it with downstream components (e.g. JSONReader).
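To give an idea of what the downstream parsing has to deal with, here is a rough Python sketch that pulls a few attributes out of a search response. The sample payload is hand-written and heavily trimmed; it assumes the tweets sit in a `statuses` array with `id_str`, `created_at`, `text` and `user.screen_name` fields, and the output field names are my own choice.

```python
import json

# Hand-written, trimmed-down stand-in for a real search response.
SAMPLE_RESPONSE = """
{
  "statuses": [
    {
      "id_str": "1",
      "created_at": "Mon Sep 24 03:35:21 +0000 2012",
      "text": "Trying out @CloverETL",
      "user": {"screen_name": "example_user"}
    }
  ],
  "search_metadata": {"count": 15}
}
"""


def extract_tweets(response_text: str) -> list:
    """Return one flat dictionary per tweet found in the response."""
    payload = json.loads(response_text)
    return [
        {
            "id": status["id_str"],
            "created_at": status["created_at"],
            "author": status["user"]["screen_name"],
            "text": status["text"],
        }
        for status in payload.get("statuses", [])
    ]


tweets = extract_tweets(SAMPLE_RESPONSE)
```

In a graph, JSONReader plays the role of `extract_tweets`: it maps the nested JSON fields to flat record fields.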

The attached example graph queries the current Twitter trends and the tweets related to them. It then parses the returned JSON for tweet attributes and stores them in an XML file.
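If you are curious what the last step amounts to, here is a small Python sketch of writing tweet attributes to XML. The element and attribute names are invented for this illustration and do not necessarily match the layout produced by the example graph.

```python
import xml.etree.ElementTree as ET

# Made-up records standing in for tweets parsed from the JSON response.
sample_tweets = [
    {"id": "1", "author": "example_user", "text": "Trying out @CloverETL"},
    {"id": "2", "author": "another_user", "text": "Big Data & social networks"},
]


def tweets_to_xml(tweets: list) -> str:
    """Serialize tweet dictionaries into a simple XML document string."""
    root = ET.Element("tweets")
    for tweet in tweets:
        node = ET.SubElement(root, "tweet", id=tweet["id"])
        ET.SubElement(node, "author").text = tweet["author"]
        # ElementTree escapes special characters such as & on output.
        ET.SubElement(node, "text").text = tweet["text"]
    return ET.tostring(root, encoding="unicode")


xml_output = tweets_to_xml(sample_tweets)
```

Writing `xml_output` to a file gives you roughly the kind of result the graph produces at the end.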

Download Example: QueryingTwitterInCloverETL.zip

And with that, you’ve now waded through the noise to find exactly what you’re looking for.


  1. Johnathon says

    What version of ETL is this? How can you connect to Twitter in 3.1? As far as I can tell, the option to include OAuth Token and Token Secret has been removed…

    • says

      Hello Johnathon,

      The blog post example uses CloverETL version 3.5. We added OAuth-related attributes to HTTPConnector in 3.5, which makes accessing services like Twitter or LinkedIn much easier. I’m afraid there is no simple solution for CloverETL version 3.1.


  2. Snehotosh says

    How can I read more than 100 tweets per page? Also, I would like to know about parsing multiple pages in Clover.

    • Lubos Imriska says

      Hello Snehotosh,

      100 tweets per page is a limitation of the Twitter API, not CloverETL. Unfortunately, we cannot do anything about limits set directly by Twitter.

      What would you like to know about parsing multiple pages? CloverETL is able to parse any number of pages. You just have to send records containing your queries to HTTPConnector, one request per record (see the Input port reading documentation). But be careful about the other limits Twitter has.
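As a rough illustration of the "one request per record" idea, the Python sketch below works out the URL of the following page from a response that has already been parsed. It assumes the search response carries a `search_metadata.next_results` query string when more results are available; the sample payload and the helper name `next_page_url` are made up for this example.

```python
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def next_page_url(payload: dict):
    """Return the URL of the next result page, or None if there is none."""
    # Assumption: next_results already starts with "?" and is percent-encoded.
    next_results = payload.get("search_metadata", {}).get("next_results")
    if not next_results:
        return None
    return SEARCH_ENDPOINT + next_results


# Hand-written stand-in for one parsed response page.
sample_page = {
    "statuses": [],
    "search_metadata": {"next_results": "?max_id=249279667666817023&q=%40CloverETL&count=100"},
}
following_url = next_page_url(sample_page)
```

Each URL obtained this way would become one input record for HTTPConnector, and the loop ends when no further page is reported.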

  3. Indra says

    Hi Lubos,

    We are facing an issue: when we run the graph in SSL mode, the request for tweets fails with the error “unable to find valid certification path to requested target”.

    javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

    Could you guide us on how to resolve this issue?

    • Lubos Imriska says

      Hi Indra,

      Have you always been unable to connect to Twitter, or did the issue appear after some time of using the Twitter API? Which version of CloverETL do you use, please? Have you modified the Java cacerts file somehow? What exactly have you set in HTTPConnector? Have you set anything related to SSL in CloverETL? May I see your graph, please?

      I do not think that Twitter requires client certificates by default so it seems the issue is in your trust store. If you use your own trust store, set the following as VM arguments in CloverETL Designer -> Window -> Preferences -> CloverETL -> ETL Runtime:


      And if you want to speed the communication up, contact us at support@cloveretl.com, please. Blog comments are not checked regularly.

