Picking this topic up from the last post, I focus here on enriching the released data to allow further exploration.
Let's use the schema from the previous post as our starting point; it left us well positioned for the task at hand. The records were stored in a table as shown in Figure 1.
Figure 1 – Table of license plate readings.
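The figure itself does not come through here, so below is a minimal sketch of the shape that table might take. The table and column names are my assumptions for illustration, not the post's actual schema:

```sql
-- Hypothetical shape of the readings table from the previous post;
-- actual names and types may differ.
CREATE TABLE plate_readings (
    reading_id   INTEGER PRIMARY KEY,
    plate_hash   CHAR(64)     NOT NULL,  -- anonymized tag, not the raw plate
    read_at      TIMESTAMP    NOT NULL,  -- date and time of the reading
    latitude     DECIMAL(9,6) NOT NULL,
    longitude    DECIMAL(9,6) NOT NULL,
    site         VARCHAR(100),           -- where the reading was taken
    source_file  VARCHAR(100)            -- originating CSV file
);
```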
Browsing Hacker News, I recently found out that the City of Oakland released almost 3 million records of license plate reader data. The conversation there is far better than any blurb I could come up with, but this is a neat opportunity to mine the data as an academic exercise.
The source hosts a list of CSV files with various bits of information. Common to all files, and of critical importance, are the date and time of each tag reading and its latitude and longitude. Supplemental information, such as the site of the reading and its source, is often given as well. Most worrisome is the fact that the data has not been cleansed: it includes the actual license tag for each reading instead of some anonymized ID. That would be the first thing to fix before the data is re-shared and used here.
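One way to do that, as a sketch: replace the raw tag with a one-way hash at load time. SHA2 below is MySQL syntax (other databases expose similar functions under different names), and the raw_readings staging table and its columns are assumptions for illustration:

```sql
-- Sketch: anonymize raw plates before re-sharing by storing a one-way
-- hash of the tag instead of the tag itself (MySQL's SHA2).
SELECT SHA2(license_tag, 256) AS plate_hash,  -- 64-char hex digest
       read_at,
       latitude,
       longitude,
       site
FROM   raw_readings;
```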
Do police departments across the US (or the world?) have the bandwidth to pore over crime reports in order to spot trends and mitigate crime using all the available information? Given the ever-increasing amount of data that is now the norm, it will be increasingly difficult to make the best use of it. Applied to crime, using this data effectively should improve everyone's quality of life.
For week 40, commercial burglaries in sector 2 increased from an expected count of 7.89 to an actual count of 16, just over twice the expected volume. (Click the image to try the analysis, or here for a static image.)
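As a rough sketch of how such a flag could be computed, assuming a weekly_counts table holding expected and actual counts per sector, week, and crime type (names invented here, not the post's schema):

```sql
-- Flag sector/week combinations where actual counts run well above
-- expected. Assumes expected_count > 0 for every row.
SELECT sector,
       week,
       crime_type,
       expected_count,
       actual_count,
       actual_count / expected_count AS ratio  -- e.g. 16 / 7.89 ≈ 2.03
FROM   weekly_counts
WHERE  actual_count / expected_count >= 2      -- arbitrary alert threshold
ORDER  BY ratio DESC;
```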
Simply put, cohort analysis is a technique for analyzing activity over time by a common characteristic. Used mostly in sales and marketing, it can help analyze customer loyalty, customer acquisition cost, marketing campaign effectiveness, and many other aspects of sales.
I am using the superstore sales data created by Michael Martin, found here or here. This Excel file contains three sheets, of which only the first one, Orders, will be used in this analysis.
The store providing its sales data runs monthly advertising campaigns and wants to track the impact these campaigns have on the number of orders placed over time, using this information to evaluate the different campaigns and improve its efforts.
Given the superstore sales data and the requirements, let's present the number of orders placed per customer join date. Presenting orders per join date will show the effectiveness of the advertising campaigns leading up to that date.
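A minimal sketch of that query, assuming the Orders sheet has been loaded into an orders table with order_id, customer_id, and order_date columns (my assumed names, not necessarily the workbook's):

```sql
-- Cohort query sketch: count all orders by each customer's join date,
-- where the join date is taken to be the date of the first order.
WITH cohorts AS (
    SELECT customer_id,
           MIN(order_date) AS join_date        -- first order = join date
    FROM   orders
    GROUP  BY customer_id
)
SELECT c.join_date,
       COUNT(o.order_id) AS orders_placed      -- all orders from that cohort
FROM   cohorts c
JOIN   orders  o ON o.customer_id = c.customer_id
GROUP  BY c.join_date
ORDER  BY c.join_date;
```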
Any database server can be used to follow along; the code here can easily be revised to work on any vendor's product, such as MySQL. For visualization, Tableau can just as easily be replaced by LibreOffice or similar.
Having recently spent an evening with Tableau, I decided to share my notes. Many of the features highlighted were new to me. Here's a short list of cool things to do in Tableau; all of the visuals link to the actual Tableau examples used.
This blog post is about OBIEE reporting. Specifically, it is about skipping the data warehouse and reporting from the transactional database instead. Oracle's OBIEE, like most BI reporting tools, is designed to report from star/snowflake schemas as the underlying structures. Additionally, OBIEE's metadata layer is rich and extremely well thought out, allowing for a great deal of flexibility. Oracle's metadata tool (the Admin Tool) lets us leverage that flexibility to bridge the gap between an OLTP and an OLAP model. I am not negating the need for a data warehouse; I am just wondering whether every BI reporting project merits one.
So the question to ask is: Is OBIEE up to the task?
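For illustration only: one common way to bridge that gap, whether as database views or within the Admin Tool's logical table sources, is to flatten normalized OLTP tables into fact- and dimension-shaped structures. A hedged sketch, with invented table and column names:

```sql
-- Sketch: present normalized OLTP tables as a star-like structure
-- using a view (all names here are invented for illustration).
CREATE VIEW fact_order_lines AS
SELECT ol.order_line_id,
       o.order_date,                          -- feeds a time dimension
       o.customer_id,                         -- key into a customer "dimension"
       ol.product_id,                         -- key into a product "dimension"
       ol.quantity,
       ol.quantity * ol.unit_price AS amount  -- additive measure
FROM   order_lines ol
JOIN   orders      o ON o.order_id = ol.order_id;
```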
If you read any Stephen Few for more than a minute, you'll realize he emphatically stresses clear, simple visualizations. He certainly does in the one O'Reilly book of his I own, where he spends a considerable amount of time pointing out what NOT to do.
With this in mind, and looking for an excuse to keep playing with the awesome Tableau software, I decided to take another look at the Centers for Medicare and Medicaid Services' healthcare cost data discussed previously.
This time, I restricted the data to the state of Florida (I'm here) and decided to drop the map. The intent of this visualization is to provide the needed information as quickly as possible. As before, I cannot embed it here, so I link to it instead; just click on the images to play with the visual.
And for perspective, this is what Stephen Few did.