Prerequisites

Instruction

1. Synopsis

The purpose of this group session is to introduce you to the world of open data. The US City Open Data Census is a good place to start looking for open data from hundreds of US cities. The census ranks each city by its open data readiness across 20 categories and tells you where to get the data.

Categories of open data in various cities

  • Budget
  • Crime Reports
  • Parcels
  • Construction Permits
  • Zoning
  • Service Requests
  • Code Violations
  • Employee Salaries
  • Business Listings
  • Spending
  • Restaurant Inspections
  • Public Facilities
  • Traffic Crashes
  • Property Assessment
  • Procurement Contracts
  • Emergency Calls
  • Lobbyist Activity
  • Police Use-of-Force
  • Property Transfers
  • Website Analytics

2. Each group is assigned one city and explores its open data readiness for crime incidents

  • Group 1: Boston, MA (link)
  • Group 2: Los Angeles, CA (link)
  • Group 3: Los Angeles, CA (link)
  • Group 4: San Francisco, CA (link)
  • Group 5: Washington, DC (link)
  • Group 6: New York City, NY (link)
  • Group 7: New York City, NY (link)
  • Group 8: Philadelphia, PA (link)
  • Group 9: Detroit, MI (link)
  • Group 10: Detroit, MI (link)

3. Assess quality and usability of crime data

  • Data quality
    • How many years of data are available?
    • What are the variables in the data? Does it have these four key variables? 1) Date and time; 2) Location (coordinates or addresses); 3) Incident type; 4) Narrative information
  • Usability
    • How easy is it to explore the data?
    • Can you understand the variables?
    • Can you visualize the data?
    • Can you download the data and open it in Excel?
  • Data wrangling (cleaning) needs assessment
    • How clean is the data? Do you see any weird characters or data cleaning needs?
    • How clean are the variable (column) names? Any spaces or weird characters in the variable names?
    • (extra points) What are the variable types? Do you need to change the types?
  • For advanced students only: 2 extra points on your assignment 2
    • Use the base R code to access the Chicago crime API endpoint
    • Modify the code to access the API endpoint of your group’s crime data
    • If the “year” parameter is not available, use another parameter, e.g. date. Hint: read the API docs
    • Send your R code along with your assignment 2 to get the extra points
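
The column-name check, the key-variable check, and the API step above can be sketched in code. The sketch below is in Python rather than the course's R and only illustrates the ideas; it is not the base R code for the assignment. Several things in it are assumptions to verify against your city's portal and API docs: the Chicago endpoint URL, the keyword hints used to spot the four key variables, the field names `year` and `date`, and the use of a Socrata-style API that accepts `$where` and `$limit` query parameters. Other portals may name their parameters differently.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint for Chicago's crime dataset on its Socrata portal.
# Confirm the URL in the portal's API docs before relying on it.
CHICAGO_CRIMES = "https://data.cityofchicago.org/resource/ijzp-q8t5.json"

# Hypothetical keyword hints for the four key variables; real column
# names differ from city to city, so adjust these after reading the data.
KEY_VARIABLE_HINTS = {
    "date and time": ("date", "time"),
    "location": ("latitude", "longitude", "address", "block", "location"),
    "incident type": ("type", "offense", "category"),
    "narrative": ("description", "narrative"),
}


def unclean_column_names(columns):
    """Return the names that contain spaces or other non-alphanumeric
    characters, or that do not start with a letter."""
    pattern = r"[A-Za-z][A-Za-z0-9_]*"
    return [name for name in columns if not re.fullmatch(pattern, name)]


def missing_key_variables(columns):
    """Return the key variables for which no column name contains a hint."""
    lowered = [name.lower() for name in columns]
    return [
        key
        for key, hints in KEY_VARIABLE_HINTS.items()
        if not any(hint in name for name in lowered for hint in hints)
    ]


def build_query_url(endpoint, year, year_field="year", date_field=None, limit=1000):
    """Build a query URL for one calendar year of records.

    If date_field is None, filter on the year field directly. Otherwise
    filter on a date range, for datasets that have no year field.
    """
    if date_field is None:
        params = {year_field: str(year)}
    else:
        # Half-open range: from January 1 of `year` up to, but not
        # including, January 1 of the following year.
        where = (
            f"{date_field} >= '{year}-01-01T00:00:00' "
            f"AND {date_field} < '{year + 1}-01-01T00:00:00'"
        )
        params = {"$where": where}
    params["$limit"] = str(limit)
    # safe="$" keeps the "$" in parameter names readable in the URL.
    return endpoint + "?" + urlencode(params, safe="$")


if __name__ == "__main__":
    columns = ["Date", "Primary Type", "Block", "Case Number"]
    print(unclean_column_names(columns))
    print(missing_key_variables(columns))
    print(build_query_url(CHICAGO_CRIMES, 2015))
    print(build_query_url(CHICAGO_CRIMES, 2015, date_field="date", limit=5))
```

The keyword matching is deliberately crude: a column named "Block" counts as a location and anything containing "type" counts as an incident type, so treat the result as a prompt to look at the data, not as a verdict. The date-range branch corresponds to the case where the “year” parameter is not available.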

Tell your story