Querying Twitter in CloverETL

These days, social networks are pervasive. It’s virtually impossible to avoid some kind of interaction with at least a few of them. Not only that, but the mere fact that so many people use them means there’s a ton of interesting data available within.

A typical example of such a popular network is Twitter, with more than 500 million tweets sent each day. Wouldn’t it be useful if you could automatically find the tweets you want and then process them in bulk? The capacity to dig through heaps of social interactions effectively is one of the core promises of Big Data, and it’s a valuable one. In this blog post, I will show you how to do it with CloverETL.

First of all, you need to register an application with Twitter so that you can access the API later. Log in to https://dev.twitter.com/apps and select “Create new application” to set up your application. Fill in the name, description, and website if you want, leave the Callback URL field empty, and submit the form.

After submitting, you’ll get to a page with application details. There is an OAuth settings section on this page where you can find “Consumer key” and “Consumer secret.” You’ll need these to connect from CloverETL.

Further down on the page, there is a “Your access token” section. Click the “Create my access token” button. This might take some time, so wait a few seconds and then reload the page. You should see your “Access token” and “Access token secret” there. These two values will be used in CloverETL too.

With that, you’re done working on the Twitter side. Let’s now proceed to CloverETL.

We’re going to use the Twitter REST API, so we’ll basically be performing HTTP requests. The best component for this job is the HTTPConnector.

To configure the HTTPConnector component, you need to specify these five attributes:

  • URL
  • OAuth Consumer key
  • OAuth Consumer secret
  • OAuth Access token
  • OAuth Access token secret

All OAuth attributes are taken from the registered Twitter application (see above). The URL depends on the REST API method you want to use. For example, https://api.twitter.com/1.1/search/tweets.json?q=%40CloverETL searches for tweets related to @CloverETL (the “@” in the query is percent-encoded as %40).
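To make the roles of the URL and the four OAuth values more concrete, here is a minimal Python sketch of what happens outside CloverETL: the query is percent-encoded into the URL, and the two secrets are used to sign the request with HMAC-SHA1 as OAuth 1.0a describes. The HTTPConnector presumably handles this for you; the function names, the placeholder credentials, and the fixed nonce and timestamp below are illustrative assumptions, not part of CloverETL or of the example graph.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode

SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query):
    """Append the percent-encoded query to the search endpoint."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query})


def percent_encode(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    return quote(str(value), safe="")


def oauth_signature(method, base_url, params, consumer_secret, token_secret):
    """Compute an OAuth 1.0a HMAC-SHA1 signature for the request."""
    # Encode every key and value, then sort them into one normalized string.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    # The signature base string is METHOD&URL&PARAMS, each part encoded again.
    base_string = "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(normalized)]
    )
    # The signing key joins the two secrets; the key and token travel as parameters.
    signing_key = percent_encode(consumer_secret) + "&" + percent_encode(token_secret)
    digest = hmac.new(
        signing_key.encode("ascii"), base_string.encode("ascii"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values only (assumptions); real ones come from the Twitter application page.
EXAMPLE_PARAMS = {
    "q": "@CloverETL",
    "oauth_consumer_key": "CONSUMER_KEY",
    "oauth_token": "ACCESS_TOKEN",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1380000000",
    "oauth_nonce": "example-nonce",
    "oauth_version": "1.0",
}

if __name__ == "__main__":
    print(build_search_url("@CloverETL"))
    print(oauth_signature("GET", SEARCH_ENDPOINT, EXAMPLE_PARAMS,
                          "CONSUMER_SECRET", "ACCESS_TOKEN_SECRET"))
```

The first line printed is the same URL as in the example above. In a real request the timestamp and nonce would be generated fresh each time, which is one reason it is convenient to leave the signing to the component.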

The result is returned in JSON format. You can either store it in a file (the Output file URL attribute of the component) or map the response content to an output port and process it with downstream components (e.g. JSONReader).
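For a feel of what the downstream parsing step has to do, the following Python sketch pulls a few fields out of a search response. It assumes the v1.1 search response layout, where tweets sit in a top-level "statuses" array; the sample payload is made up for illustration, and the choice of fields is an assumption rather than what the example graph extracts.

```python
import json

# A made-up, heavily trimmed response body; real responses carry many more fields.
SAMPLE_RESPONSE = """
{
  "statuses": [
    {
      "id_str": "1",
      "created_at": "Mon Sep 30 10:00:00 +0000 2013",
      "text": "Trying out @CloverETL",
      "user": {"screen_name": "example_user"}
    }
  ]
}
"""


def extract_tweets(response_body):
    """Return one flat record per tweet found in the search response."""
    payload = json.loads(response_body)
    return [
        {
            "id": status.get("id_str"),
            "created_at": status.get("created_at"),
            "user": status.get("user", {}).get("screen_name"),
            "text": status.get("text"),
        }
        for status in payload.get("statuses", [])
    ]


if __name__ == "__main__":
    for tweet in extract_tweets(SAMPLE_RESPONSE):
        print(tweet["user"], "-", tweet["text"])
```

Flattening the nested user object into a single record mirrors what a reader component does when it maps JSON fields to output metadata.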

The attached example graph queries for the current Twitter trends and the tweets related to them. It then parses the returned JSON for tweet attributes and stores them in an XML file.
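The graph itself is not reproduced here, but the same flow can be sketched in Python: read trend names from a trends response, turn each into a search URL, and write the resulting tweet records as XML. This is a rough analogue under stated assumptions, not the graph's actual logic: the trends payload is assumed to be a list of objects each holding a "trends" array, the sample data is invented, and the XML element names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"

# Invented sample of a trends response (assumed layout).
SAMPLE_TRENDS = '[{"trends": [{"name": "#BigData"}, {"name": "ETL"}]}]'


def trend_search_urls(trends_body):
    """Build one search URL per trend name found in the trends response."""
    payload = json.loads(trends_body)
    names = [trend["name"] for block in payload for trend in block.get("trends", [])]
    return [SEARCH_ENDPOINT + "?" + urlencode({"q": name}) for name in names]


def tweets_to_xml(tweets):
    """Serialize tweet records (dicts with id, user, text) to an XML string."""
    root = ET.Element("tweets")  # element names are hypothetical
    for tweet in tweets:
        node = ET.SubElement(root, "tweet", id=str(tweet["id"]))
        ET.SubElement(node, "user").text = tweet["user"]
        ET.SubElement(node, "text").text = tweet["text"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for url in trend_search_urls(SAMPLE_TRENDS):
        print(url)
    print(tweets_to_xml([{"id": "1", "user": "example_user", "text": "ETL & more"}]))
```

Each generated URL would be requested with the same OAuth settings as before, and the parsed tweets from every response would be appended to one XML document.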

Download Example: QueryingTwitterInCloverETL.zip

And with that, you’ve now waded through the noise to find exactly what you’re looking for.

Comments

  1. Johnathon says

    What version of ETL is this? How can you connect to Twitter in 3.1? As far as I can tell, the option to include OAuth Token and Token Secret has been removed…

    • says

      Hello Johnathon,

      The example in this blog post uses CloverETL version 3.5. We added OAuth-related attributes to the HTTPConnector in 3.5, which makes accessing services like Twitter or LinkedIn much easier. I’m afraid there is no simple solution for CloverETL version 3.1.
