Crime vs. Temperature

As the temperature in Chicago rises, does the crime rate rise with it?

It turns out that, after crunching 5,000,000+ publicly available crime reports (spanning 2001 to the present day) and comparing them to historical weather data, this conventional wisdom generally pans out.

Like many other civic-minded hackers in the city, I was pretty excited to see the recent launch of the Chicago Police Department’s open data API. And, after a couple of weekend hacks, I’m also pretty excited to announce my own attempt at a really basic visualization.

Roughly, here’s what I did:

  1. Take all of the days since 2001 and group them by the reported high temperature
  2. Take all of the crimes committed on those days and add them together
  3. Divide the number of crimes committed at a given temperature by the number of days at that temperature to get an average
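
For illustration, the three steps above can be sketched in a few lines of Python. This is a minimal sketch and not the script behind the site: the function name and the input shapes (a dict of date to high temperature, plus a list with one date per reported crime) are assumptions made for the example.

```python
from collections import defaultdict


def average_crimes_by_high_temp(daily_highs, crime_dates):
    """Average number of crimes per day, grouped by that day's high temperature.

    daily_highs: dict mapping a date string to the reported high (assumed shape).
    crime_dates: iterable with one date string per reported crime (assumed shape).
    """
    # Count how many crimes were reported on each day.
    crimes_per_day = defaultdict(int)
    for day in crime_dates:
        crimes_per_day[day] += 1

    # Steps 1 and 2: group days by high temperature and total their crimes.
    days_at_temp = defaultdict(int)
    crimes_at_temp = defaultdict(int)
    for day, high in daily_highs.items():
        days_at_temp[high] += 1
        crimes_at_temp[high] += crimes_per_day.get(day, 0)

    # Step 3: divide total crimes by number of days at each temperature.
    return {t: crimes_at_temp[t] / days_at_temp[t] for t in days_at_temp}
```

Days with no reported crimes still count toward the number of days at their temperature, so they pull the average down rather than being skipped.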

To accomplish this I downloaded the full CSV dump of all 5,000,000+ reported crimes in Chicago since 2001, headed over to the NOAA and got a dump of weather data since 2001, loaded them both into MongoDB and started to see what I could see. Technically, there were really not too many hurdles to overcome. MongoDB ships with a handy dandy tool for importing CSV data directly into a database, so, with a little patience (the crime data took around 5 hours to import), I was able to get everything loaded pretty easily. I made a couple of indexes to make things work a bit quicker and, for the crime data, created a “Location” field that contained a GeoJSON object (’cause that’s how MongoDB does things these days).
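
To give a rough idea of what that “Location” field looks like, here is a small Python sketch that builds a GeoJSON Point from one row of the crime data. The column names "Latitude" and "Longitude" and the helper name are assumptions for the example. The one thing to watch is that GeoJSON puts longitude first.

```python
def to_geojson_point(row):
    """Build a GeoJSON Point from a crime row, or None if coordinates are missing.

    Assumes the row is a dict with "Latitude" and "Longitude" keys
    (hypothetical column names for this sketch).
    """
    lat = row.get("Latitude")
    lng = row.get("Longitude")
    # Some reports have no coordinates; skip those rather than guess.
    if lat in (None, "") or lng in (None, ""):
        return None
    # GeoJSON coordinate order is [longitude, latitude].
    return {"type": "Point", "coordinates": [float(lng), float(lat)]}
```

The returned dict can be stored on the document as its “Location” value; rows without coordinates simply go without one.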

But really, all I wanted to do was get this on the web and, because I’m doing this with very few (aka no) resources, I didn’t want to have to worry about any kind of hosted backend setup, which kinda means my database needs to stay on my local machine. With that in mind, the persistence layer of this site (including the map stuff) is all powered by flat JSON files. This way, I can run the app entirely from an S3 bucket without any server-side anything. I’m not sure how far I’ll be able to push this, but it works for now. There are obvious downsides, like not being able to query stuff very well on the fly, but for now I just wanted something that offered kind of a fire hose view that you could browse through and maybe pull some threads out of.
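
As a sketch of the flat-file idea, the snippet below dumps a set of precomputed averages to a static JSON file that a browser could fetch straight from a bucket. The file name `averages.json`, the record layout, and the function name are all hypothetical; the real site's files may look quite different.

```python
import json
from pathlib import Path


def write_flat_json(averages, out_dir):
    """Write {temperature: average crimes} to a static JSON file.

    The file name and record layout are assumptions for this sketch.
    """
    out_path = Path(out_dir) / "averages.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Sort by temperature so the front end can plot the records in order.
    payload = [
        {"temp": temp, "avg_crimes": round(avg, 2)}
        for temp, avg in sorted(averages.items())
    ]
    out_path.write_text(json.dumps(payload))
    return out_path
```

Everything the page needs has to be computed ahead of time and written out like this, which is exactly why on-the-fly queries are off the table.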

My intention is to update this site every day when there is new crime data available. I’m hoping that I can make this a bit more automated but, for now, I’ve just written a couple of scripts that handle the heavy lifting and need to be triggered daily. The other sorta lame thing is that the MongoDB instance is in a VM on my laptop. I’d love to share it with someone and/or figure out an easy way to back it up that won’t be too pricey (although I’ll probably just suck it up and put it in S3 or something).

Anyways, I’m interested in feedback. Features, ideas, etc. are all welcome.
