Data Mining Orange County (Orlando Area) Crime

Let's look at recent crime in Orlando and adjacent cities.  What can we find out by exploring Orlando's crime data?  Is the Central Florida area a relatively safe place?  Can we tell from 90 days' worth of data?  Are some areas safer than others?  Do we have any false ideas about places in Orlando?

At the very least, let's attempt to get a better understanding of the crime issue, which affects us everywhere.

About The Data

The data used in this post can be obtained here.

Orange County publishes 90 days of property crimes and robberies online.  The set in this post was fetched on Oct 24th, 2014.  Orange County does not provide records for any other crimes online.  Phone calls inquiring about other crime types went unanswered.

Primary Data – The following primary information is provided in this dataset.

1. Zone – Orange County is geographically divided into 6 Patrol Sectors.  Each of these sectors is, in turn, divided into Zones.
2. Case Number – This is the number assigned to the case for identification.
3. Crime – This is the type of crime reported.
4. Location – This is the location where the crime took place in the form of “Street Number Range – Street Name”.
5. Date – This is the date the crime is reported to have occurred.

Derived Data – From these, we can derive the following without much trouble.

1. Patrol Sector – These are groupings of patrol zones.  Some information about each zone can be found here.


2. Zip Codes for Orange County – These will be used to define the geographical area of Orange County.  Go here for the zip code/city list used.


These zip codes define both the patrol zones above and Orange County, Florida, as a whole.


3. Day Of Week – From the date of the crime, we can obtain the day of the week the crime occurred.
4. Street Address, Approximate – Approximate address has been obtained by choosing the midpoint between the street number range given in Location.
5. State – Florida.
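The Day Of Week and approximate address derivations could be sketched in Python roughly as follows.  This is only an illustration, not the steps actually used for this post: the Location format ("1200-1299 MAIN ST") is an assumption about how the range is written, and the function names are made up for the example.

```python
import re
from datetime import date

# Assumed Location format: "<low>-<high> <street name>", e.g. "1200-1299 MAIN ST".
# The real export may use a different separator; adjust the pattern if so.
RANGE_PATTERN = re.compile(r"^\s*(\d+)\s*[-–]\s*(\d+)\s+(.+?)\s*$")

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def approximate_address(location: str, state: str = "Florida") -> str:
    """Replace a street number range with its midpoint and append the state."""
    match = RANGE_PATTERN.match(location)
    if match is None:
        # No range found: keep the location as-is and only append the state.
        return f"{location.strip()}, {state}"
    low, high = int(match.group(1)), int(match.group(2))
    midpoint = (low + high) // 2  # integer midpoint of the range
    return f"{midpoint} {match.group(3)}, {state}"


def day_of_week(crime_date: date) -> str:
    """Return the English weekday name for a crime date."""
    return WEEKDAYS[crime_date.weekday()]
```

For example, under the assumed format, approximate_address("1200-1299 MAIN ST") gives "1249 MAIN ST, Florida".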

Feeding the approximate street address into Google Refine, we invoke the Google Maps API to obtain geospatial information for that address.  The API returns an object containing a wealth of information such as geographical coordinates, names, etc.

First off, let's create a new column in Google Refine to hold this object.  From ‘Approximate Address (+ State)’, let's add a new column by fetching URLs, using the following expression:

'http://maps.googleapis.com/maps/api/geocode/json?' + 'sensor=false&' + 'address=' + escape(value, 'url')
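Outside of Google Refine, the same request URL could be assembled in Python along these lines.  This is a minimal sketch that only builds the URL shown above; the function name is made up for the example, and the endpoint and the sensor parameter are copied from this post, so the service's requirements (for instance an API key) may differ today.

```python
from urllib.parse import urlencode

# Endpoint as used in this post; treat it as an assumption if you reuse it.
GEOCODE_ENDPOINT = "http://maps.googleapis.com/maps/api/geocode/json"


def geocode_url(address: str) -> str:
    """Build the geocoding request URL for one approximate address."""
    # urlencode escapes spaces, commas, etc. so the address is URL-safe.
    query = urlencode({"sensor": "false", "address": address})
    return f"{GEOCODE_ENDPOINT}?{query}"
```

The address value here plays the role of `value` in the Refine expression, and urlencode does the job of escape(value, 'url').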

Note 1 – Watch your limits.  Depending on the number of records to be processed, you may run into Google Maps API limits, as I did.

{
   "error_message" : "You have exceeded your daily request quota for this API.",
   "results" : [],
   "status" : "OVER_QUERY_LIMIT"
}

Setting up DataScienceToolkit for next time!
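If you process the responses outside of Refine, one way to spot rows that hit the quota is to check the status field of each stored response.  The sketch below assumes each response is kept as raw JSON text; the helper names are hypothetical.

```python
import json


def geocode_status(raw_response: str) -> str:
    """Return the 'status' field of a stored geocoding response."""
    try:
        payload = json.loads(raw_response)
    except (TypeError, ValueError):
        # Empty cells or truncated text are not valid JSON.
        return "INVALID_JSON"
    if not isinstance(payload, dict):
        return "INVALID_JSON"
    return payload.get("status", "UNKNOWN")


def hit_quota(raw_response: str) -> bool:
    """True when the response is the over-quota error shown above."""
    return geocode_status(raw_response) == "OVER_QUERY_LIMIT"
```

Rows where hit_quota is true can be set aside and fetched again later, once the daily quota has reset.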

Note 2 – Almost a third of the data came back incomplete or needed further cleanup to be usable; YMMV.

From this new column, we simply extract the fields we need using the expressions below.

6. Longitude

value.parseJson().results[0].geometry.location.lng

7. Latitude

value.parseJson().results[0].geometry.location.lat

8. Zip code

value.parseJson().results[0].address_components[6].long_name

9. Complete Address

value.parseJson().results[0].formatted_address
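For comparison, here is a rough Python equivalent of the four expressions above.  It is a sketch, not what was run for this post.  One deliberate difference: instead of reading the zip code from a fixed position (address_components[6]), it looks for the component whose types include "postal_code", since the position of the zip code can vary from one address to another.  The function names are made up for the example.

```python
import json
from typing import Optional


def component_by_type(result: dict, wanted_type: str) -> Optional[str]:
    """Return long_name of the first address component of the given type."""
    for component in result.get("address_components", []):
        if wanted_type in component.get("types", []):
            return component.get("long_name")
    return None


def extract_fields(raw_response: str) -> Optional[dict]:
    """Pull longitude, latitude, zip code and full address from a response."""
    payload = json.loads(raw_response)
    results = payload.get("results") or []
    if not results:
        # Over-quota or no-match responses carry an empty results list.
        return None
    first = results[0]
    location = first.get("geometry", {}).get("location", {})
    return {
        "longitude": location.get("lng"),
        "latitude": location.get("lat"),
        "zip_code": component_by_type(first, "postal_code"),
        "complete_address": first.get("formatted_address"),
    }
```

A missing zip code comes back as None rather than raising an error, which makes incomplete rows easier to find during cleanup.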

The end result looks like this.


Here's a download link for this data.  Enjoy.

Exploring The Data

Crimes By Day Of Week – The distribution of crimes over the days of the week looks roughly uniform.  Friday is the most active day, with 15.5% of crimes committed, and Saturday the least active, with 12.8%.
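The charts in this post came from a visualization tool, but the same percentages could be computed directly from the crime dates.  A minimal sketch, assuming the dates have already been parsed into Python date objects (the function name is hypothetical):

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_shares(crime_dates: Iterable[date]) -> Dict[str, float]:
    """Percentage of crimes falling on each day of the week."""
    counts = Counter(d.weekday() for d in crime_dates)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Days with no crimes are reported as 0.0 rather than left out.
    return {
        WEEKDAYS[i]: round(100 * counts.get(i, 0) / total, 1)
        for i in range(7)
    }
```

With a perfectly uniform distribution, every day would come out at about 14.3%, which is the baseline to compare the Friday and Saturday figures against.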


The second chart, however, clearly shows that Friday, Monday and Tuesday are the days of the week with the highest number of crimes.  We can also see that Saturday and Sunday are the days with the fewest crimes committed.

Could we say crooks are exhausted from all the Friday activity?  Similarly, Wednesday and Thursday show a decline in the number of crimes following the higher activity of the previous two days.

Crime Activity By Crime Type – For property crimes (remember the county does not willingly share other crimes), this is how crime frequency looks for the last 90 days of crime activity in Orange County.


We can also see the crime types by city for all of Orange County.  The number on the chart is the actual count for the crime type shown in that color on the legend.


In general, though, 68% of crimes are either residential or auto burglaries.


Cities Without Crime – It is interesting to report which city areas did not report any property crimes for the set provided.  This list gives the zip codes for each city where we obtained no crime reports.
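One simple way to build such a list is a set difference between the county's zip codes and the zip codes that appear in the geocoded crime records.  The sketch below assumes both are available as plain strings; the function name is hypothetical.  Keep in mind that incomplete geocoding results can make a zip code look crime-free when its records simply failed to resolve.

```python
from typing import Iterable, List, Optional


def zip_codes_without_crime(
    county_zip_codes: Iterable[str],
    crime_zip_codes: Iterable[Optional[str]],
) -> List[str]:
    """Zip codes in the county list that never appear in the crime records."""
    # Skip records whose zip code is missing (e.g. failed geocoding).
    reported = {z.strip() for z in crime_zip_codes if z}
    return sorted(set(county_zip_codes) - reported)
```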


Crimes Over Time – I thought it would be interesting to see crimes over time so I grabbed this from Tableau with, clearly, limited success.  I am not sure this brings any value to this post…

In Conclusion

This is the Orlando area, in a nutshell…


We can obtain a lot of information from the crime data that is already available for cities around the world.  This was but a sampling of what we can do with readily available tools.

Thanks for reading.
