Real Time Dashboards 1 – Introduction

Based on my presentation at BarCamp Orlando 2010, I’ve decided to do a series of posts generalizing my work.  Hopefully this will help me remember the skills used and, more importantly, help someone else in turn.

The dashboard built for my presentation consisted of an Adobe Flex application as the front end, a Java back end, and BlazeDS (also from Adobe) in the middle to connect the two.  I had developed it as a tech demo for work, where the tool would provide real-time operational insight into the progress of a long-running task.  More specifically, this operational dashboard would relay progress, in real time, on an Extract, Transform and Load (ETL from now on) job we needed to monitor.

This ETL job is a combination of Java tools and different databases, where data is extracted, curated and then processed to be reused in different ways.  It’s a thing of beauty in execution but a bit intimidating to someone like me who lacks a thorough understanding of what I do for a living 🙂

I thought it would be nice to build a tool to monitor this process.  It would have to be non-intrusive to my team and to the ETL job, and completely independent of the tool it was keeping tabs on.  Thinking this would be a fun way to use Flex and BlazeDS (I miss working with Flex), I set out to build such a tool.  Unbeknownst to me, the task would prove to be a bit more complicated than I thought it would be (ignorance keeps me bold).  In the end, some of the most daunting requirements were easily met with a completely free set of tools, and the end result was not bad looking either.  It even works!

I used quite a few tools to complete a working example and will try to break posts down into main components that can be completed independently.  Feel free to stay tuned (or not) for the juicy parts. Hopefully, I can at least save you some time showing you what not to do.

Briefly, the dashboard was constructed with the following technologies, all freely available:
For the software parts, Adobe Flex (ActionScript 3), Java and Hibernate.
For the server components, I used Tomcat and Adobe BlazeDS. Oh yeah, you are going to need a database server of your choice.  I will be using MySQL.
Any text editor would do as well; I am most comfortable in Eclipse but you may use anything you want.

Movielens – Date Dimension

Using the book The Data Warehouse Toolkit as a reference (thanks Peter), I’ve revised the date dimension.  I figure having a better base in our data will give me more options when creating reports. In this project, I am only a few steps away from diving into reporting.

Previously, our date dimension looked like this:

While not bad (I think it looks perfect), I admit the book does a much better job of preparing a date dimension, with a lot more information.  Essentially, data warehousing initiatives tend to do a lot of the work ahead of time in order to save cycles on report creation.  Per the book, we could revise our dimension to be better represented as follows:

Although subtle and seemingly unnecessary, it is common practice to store all these (and lots more) derived values in our database in order to ease report construction.  Clearly, we could derive all of these recently added bits of information at report run time.  The point is to avoid that work, since doing it at run time does nothing but diminish report performance and complicate report creation.  Taking things a bit further, we could replace our date dimension with a dateStamp and be on our way, but this would be a chore.  The same holds true, I admit, for other date information.  Hard drive space is cheap so I am not going to complain.
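
To make the idea of precomputing derived values a bit more concrete, here is a minimal Java sketch of how one row of such a dimension might be built ahead of time. The class name DateDimensionRow and every field name in it are hypothetical, picked only for illustration; they are not the columns from the book or from my actual table. It assumes Java 8 or later and uses nothing beyond java.time.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.time.temporal.IsoFields;
import java.util.Locale;

// Hypothetical holder for one date dimension row; field names are illustrative only.
final class DateDimensionRow {
    final int dateKey;       // surrogate key in yyyyMMdd form, e.g. 20100403
    final LocalDate fullDate;
    final String dayName;    // "Monday" .. "Sunday"
    final int dayOfMonth;
    final int dayOfYear;
    final int isoWeek;       // ISO-8601 week of the week-based year
    final String monthName;
    final int month;
    final int quarter;       // 1..4
    final int year;
    final boolean weekend;

    private DateDimensionRow(LocalDate d) {
        DayOfWeek dow = d.getDayOfWeek();
        this.dateKey = d.getYear() * 10000 + d.getMonthValue() * 100 + d.getDayOfMonth();
        this.fullDate = d;
        this.dayName = dow.getDisplayName(TextStyle.FULL, Locale.ENGLISH);
        this.dayOfMonth = d.getDayOfMonth();
        this.dayOfYear = d.getDayOfYear();
        this.isoWeek = d.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        this.monthName = d.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH);
        this.month = d.getMonthValue();
        this.quarter = d.get(IsoFields.QUARTER_OF_YEAR);
        this.year = d.getYear();
        this.weekend = dow == DayOfWeek.SATURDAY || dow == DayOfWeek.SUNDAY;
    }

    // Derive every attribute once, at load time, instead of at report run time.
    static DateDimensionRow of(LocalDate d) {
        return new DateDimensionRow(d);
    }

    @Override
    public String toString() {
        return dateKey + " " + dayName + " " + monthName + " Q" + quarter
                + " week=" + isoWeek + " weekend=" + weekend;
    }

    public static void main(String[] args) {
        // Print one week of rows; a real loader would insert them into the table.
        LocalDate start = LocalDate.of(2010, 4, 1);
        for (int i = 0; i < 7; i++) {
            System.out.println(DateDimensionRow.of(start.plusDays(i)));
        }
    }
}
```

A loader could call DateDimensionRow.of for every day in the range it cares about and insert the results once, so that reports simply read the stored values instead of recomputing them on each run.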

What Movie Now?

Finally, I was able to secure low-cost hosting for trying out the skills from the book.

After tons of inconveniences, I’ve launched whatmovienow.com. This is a work in progress and I will try to add features based on the collective intelligence book as best I can. The site mostly employs the ranking algorithms from the book. It does not re-evaluate movies based on rankings from site visitors, as that would take too many resources. Updates should be a lot quicker now that the hard work is done :).

Getting the site off the ground has taken more time than I had intended.

First, I had to change from MySQL to MSSQL. I was going to use Dreamhost for hosting the MySQL database, but performance was very irregular and sluggish.

Second, I originally wrote the website in ColdFusion, using the Model-Glue framework. This worked fine on my local computer; however, the hosting provider had some restrictions which further delayed deployment. I ended up with two slightly different versions of the site, one for local development and one for live :(. I intend to configure my local computer to better reflect the production server.

How apologetic… the only thing that matters is that the site is out and I can resume writing the blog.