Dillon Buchanan

Software Engineer

Hello there! I'm Dillon Buchanan, a software developer and all-around programming enthusist working in Boston. I love creating great software!


Node.js + Twitter Streaming API + Socket.io = Twitter Cashtag Heatmap

My friends and I have always tossed around the idea of using real-time Tweets to determine the popularity of a specific company. The vision, in it's entirety, is pretty grandiose but today I made a small step towards that goal. With the introduction of Twitter's Streaming API comes a world of possibilities! When I saw this I knew I had to create a project with it. At the same time I've been ramping up on Node.js so I thought: what better than to create a project with both technologies, plus Socket.io for good measure. Thus, I present the Twitter Cashtag Heatmap!

Demo and Source Code

As always, I've provided the demo code here: Demo which is generously (freely) hosted by Heroku. In addition, the project is open sourced so please feel free to take a look at the code! Remember, if you clone the repository make sure you run an 'npm install' to download all the dependencies.

Creating the Back-end

Let's first start by covering what is happening in the back-end of the code. The entire server is coded up using Node.js with Express. The server is responsible for serving up a single 'index.html' which will include our only page with the necessary Javascript required to communicate to the server to retrieve live data. In addition, the server opens a streaming connection with Twitter and requests that it be informed whenever any of the cash tags we specified are mentioned. When the server receives data from Twitter it processes it sends it to every client via Socket.io.

Lets look at some code:

/**
 * Module dependencies.
 */
var express = require('express')
  , io = require('socket.io')
  , http = require('http')
  , twitter = require('ntwitter')
  , cronJob = require('cron').CronJob
  , _ = require('underscore')
  , path = require('path');

//Create an express app
var app = express();

//Create the HTTP server with the express app as an argument
var server = http.createServer(app);

// Twitter symbols array
var watchSymbols = ['$msft', '$intc', '$hpq', '$goog', '$nok', '$nvda', '$bac', '$orcl', '$csco', '$aapl', '$ntap', '$emc', '$t', '$ibm', '$vz', '$xom', '$cvx', '$ge', '$ko', '$jnj'];

//This structure will keep the total number of tweets received and a map of all the symbols and how many tweets received of that symbol
var watchList = {
    total: 0,
    symbols: {}
};

//Set the watch symbols to zero.
_.each(watchSymbols, function(v) { watchList.symbols[v] = 0; });

//Generic Express setup
app.set('port', process.env.PORT || 3000);
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(express.logger('dev'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
app.use(require('stylus').middleware(__dirname + '/public'));
app.use(express.static(path.join(__dirname, 'public')));

//We're using bower components so add it to the path to make things easier
app.use('/components', express.static(path.join(__dirname, 'components')));

// development only
if ('development' == app.get('env')) {
  app.use(express.errorHandler());
}

I won't go into too much detail here since it's commented pretty heavily, but just note for those whom are shaky at Node.js/Express, this is a typical setup for the two. If you generated a new Express project you'd get most of this code except for the 'watchSymbols' and 'watchList' variables which I use to hold data for the application. Everything else is pretty standard.

Up until now, we have created a server, configured it, and setup some structures that we'll use to hold our data. Lets start serving up some data.

//Our only route! Render it with the current watchList
app.get('/', function(req, res) {
	res.render('index', { data: watchList });
});

Again, nothing special here. We only have one page to serve up: index.html. The second render parameter is passing the data to the view layer so we can render the current 'watchList'. While data is streamed live to the client sometimes it's nice to have immediate gratification. Also, it allows client's with the inability to use websockets to see the fruits of our labor.

Next, we'll setup Socket.io to work with our application.

//Start a Socket.IO listen
var sockets = io.listen(server);

//Set the sockets.io configuration.
//THIS IS NECESSARY ONLY FOR HEROKU!
sockets.configure(function() {
  sockets.set('transports', ['xhr-polling']);
  sockets.set('polling duration', 10);
});

//If the client just connected, give them fresh data!
sockets.sockets.on('connection', function(socket) { 
    socket.emit('data', watchList);
});

Wow, that was easy. One thing to note here is that I had to make some adjustments for Heroku since it doesn't support websockets. However, if you're running it locally you can simply remove the sockets.configure call entirely. It's only needed for Heroku. Finally, the last function tells Socket.io to send the 'watchList' whenever a new connection is established.

Next, lets hook up Twitters Streaming API:

//Instantiate the twitter component
//You will need to get your own key. Don't worry, it's free. But I cannot provide you one
//since it will instantiate a connection on my behalf and will drop all other streaming connections.
//Check out: https://dev.twitter.com/
var t = new twitter({
    consumer_key: '',           // <--- FILL ME IN
    consumer_secret: '',        // <--- FILL ME IN
    access_token_key: '',       // <--- FILL ME IN
    access_token_secret: ''     // <--- FILL ME IN
});

//Tell the twitter API to filter on the watchSymbols 
t.stream('statuses/filter', { track: watchSymbols }, function(stream) {

  //We have a connection. Now watch the 'data' event for incomming tweets.
  stream.on('data', function(tweet) {

    //This variable is used to indicate whether a symbol was actually mentioned.
    //Since twitter doesnt why the tweet was forwarded we have to search through the text
    //and determine which symbol it was ment for. Sometimes we can't tell, in which case we don't
    //want to increment the total counter...
    var claimed = false;

    //Make sure it was a valid tweet
    if (tweet.text !== undefined) {

      //We're gunna do some indexOf comparisons and we want it to be case agnostic.
      var text = tweet.text.toLowerCase();

      //Go through every symbol and see if it was mentioned. If so, increment its counter and
      //set the 'claimed' variable to true to indicate something was mentioned so we can increment
      //the 'total' counter!
      _.each(watchSymbols, function(v) {
          if (text.indexOf(v.toLowerCase()) !== -1) {
              watchList.symbols[v]++;
              claimed = true;
          }
      });

      //If something was mentioned, increment the total counter and send the update to all the clients
      if (claimed) {
          //Increment total
          watchList.total++;

          //Send to all the clients
          sockets.sockets.emit('data', watchList);
      }
    }
  });
});

When you run this locally, make sure you acquire a Twitter API key for yourself. I can't divulge mine since Twitter only allows one streaming connection for API key and drops all other connections if multiples are detected.

Finally, lets setup a 'reset' cron, and start the server.


//Reset everything on a new day!
//We don't want to keep data around from the previous day so reset everything.
new cronJob('0 0 0 * * *', function(){
    //Reset the total
    watchList.total = 0;

    //Clear out everything in the map
    _.each(watchSymbols, function(v) { watchList.symbols[v] = 0; });

    //Send the update to the clients
    sockets.sockets.emit('data', watchList);
}, null, true);

//Create the server
server.listen(app.get('port'), function(){
  console.log('Express server listening on port ' + app.get('port'));
});

That's it for the server. That's all that's needed! We can now run 'node app.js' and watch as the server begins execution. Now we need a client.

Enter The Client

The client could not be easier. We've already done most of the heavy lifting in the server so our client is pretty thin. Infact, it only consists of a single page and some Javascript. Let's take a look at the markup for the index page first which is coded in the Jade template engine.

doctype 5
html
    head
        title Node.js, Twitter Stream, Socket.io
        link(rel='stylesheet', href='/stylesheets/style.css')
        script(src="http://code.jquery.com/jquery-1.9.1.min.js")
        script(src="components/socket.io-client/dist/socket.io.min.js")
        script(src="/javascripts/script.js")
    body
        .container
            .header
                h1 Twitter Symbol Heatmap
                small 
                    | This application uses Node.js to create a streaming connection between it and Twitter. 
                    | Any mentions of the following symbols are received by the Node application and broadcasted
                    | to any clients using Socket.io.
                small
                    i
                        | Last updated:
                        span(id="last-update") Never
            ul
                each val, key in data.symbols
                    li(data-symbol="#{key}")= key

Remember in the server when we request the '/' route we render the 'index' and some data? Well this is where that data comes into play. The last two lines loop through every symbol and create a list item for it. This will create a HTML element for each one with a data tag so we can refer to it from our Javascript. Speaking of which, let's look at that code now.

$(function() {
    var socket = io.connect(window.location.hostname);
    socket.on('data', function(data) {
        var total = data.total;
        for (var key in data.symbols) {
            var val = data.symbols[key] / total;
            if (isNaN(val)) {
                val = 0;
            }

            $('li[data-symbol="' + key + '"]').each(function() {
                $(this).css('background-color', 'rgb(' + Math.round(val * 255) +',0,0)');
            });
        }
        $('#last-update').text(new Date().toTimeString());
    });
});

Here's where things get interesting. First thing to node is that I'm using jQuery here with the Socket.io client library. In the first line I instantiate a connection to the remote host (the server). Next, whenever data arrives on the connection we'll loop through all of the data's symbols and pull out the number of times they've been tweeted about. We'll do a little math to create a normalized value based on how often that symbol was mentioned over the total amount for every symbol listed within the app. Finally, we'll set the background color for each HTML element to the calculated normalized value. This produces the heatmap effect.

Conclusion

The application is far from perfect but I think provides a much cooler example than the generic chat program with Node.js.

comments powered by Disqus