Prerequisites

Part 1

Synopsis

The purpose of this final group project is to learn how to create a basic data science report. You will need to wrangle messy data that come in a variety of formats. You will also need to merge different datasets and conduct an analysis to test your hypotheses or make recommendations based on your findings. We will go through the first three parts of your group project in class, but your group will need to work together to complete the final report.

  • In part 1, you will first be given a snapshot of crime data for 2016, and you will dive into data wrangling procedures.

  • In part 2, you will merge the crime data with other datasets from the open data platform.

  • In part 3, you will conduct an exploratory data analysis (EDA) and apply some statistical learning techniques to extract useful information from the dataset you created.
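The three parts can be pictured with a very small sketch. It uses only the Python standard library and invented toy records, not the real 2016 crime data. The field names (`date`, `district`), the `population` table, and all function names are assumptions made for illustration, and the tools used in class may differ.

```python
from collections import Counter

# Hypothetical toy records, invented purely for illustration (not real crime data).
raw_crimes = [
    {"date": "2016-01-03", "district": " north "},
    {"date": "2016-01-04", "district": "North"},
    {"date": "2016-02-10", "district": "SOUTH"},
    {"date": "", "district": "south"},  # missing date: dropped during wrangling
]

# Hypothetical second dataset, keyed by district name.
population = {"north": 1000, "south": 2000}


def wrangle(records):
    """Part 1: drop rows with a missing date and normalize district names."""
    return [
        {"date": r["date"], "district": r["district"].strip().lower()}
        for r in records
        if r["date"]
    ]


def merge_counts(records, population):
    """Part 2: count crimes per district and join with population by district."""
    counts = Counter(r["district"] for r in records)
    return {
        d: {"crimes": counts[d], "population": population[d]}
        for d in counts
        if d in population  # keep only districts present in both datasets
    }


def crimes_per_1000(merged):
    """Part 3: one simple summary statistic to start the exploration."""
    return {d: 1000 * v["crimes"] / v["population"] for d, v in merged.items()}


clean = wrangle(raw_crimes)
merged = merge_counts(clean, population)
rates = crimes_per_1000(merged)
```

The point of the sketch is the order of the steps: clean first, then merge on a shared key, then summarize. Normalizing the key (here, the district name) before merging matters, because " north " and "North" would otherwise be treated as different districts.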

Group project folder

  • [IMPORTANT] Create a group project folder to save your files

  • Windows: My Documents/vsp_bigdata/group_project

  • Mac: Documents/vsp_bigdata/group_project
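If you prefer to create the folder from code rather than by hand, a minimal sketch in Python follows. The function name `make_project_folder` is hypothetical, and the sketch assumes that your documents folder is passed in as `base`; on many systems that is `Path.home() / "Documents"`, but the exact location can differ, especially on Windows.

```python
from pathlib import Path


def make_project_folder(base: Path) -> Path:
    """Create <base>/vsp_bigdata/group_project and return its path.

    `base` is assumed to be your documents folder. Existing folders are
    left untouched, so calling this more than once is safe.
    """
    folder = base / "vsp_bigdata" / "group_project"
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For example, `make_project_folder(Path.home() / "Documents")` would create the folder under the current user's Documents directory, if that directory is where your documents live.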

Each group is assigned one city for this group project.

Remember the five elements of good storytelling

  • Issue at hand: What are the issues? Which one is the most troubling?

  • Supporting data: For this project, you are given the crime data. Your job is to merge the crime data with other useful data to complete your story.

  • Relationship: What is the relationship between X and Y? As X increases, does Y go up, go down, or stay the same?

  • Interpretation: Why do you think the relationship between X and Y exists? Do some research. Read newspapers, and use your common sense and knowledge to try to understand the observed relationship.

  • Summary and conclusions: Summarize what you’ve learned and draw a conclusion.
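The "Relationship" element above can be made concrete with a correlation coefficient. The sketch below computes Pearson's r by hand and turns its sign into a plain-language direction. The function names and the 0.1 cutoff are assumptions chosen for illustration, not a rule from this course, and a correlation alone does not explain why a relationship exists.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when X or Y is constant")
    return sxy / sqrt(sxx * syy)


def direction(r, threshold=0.1):
    """Describe how Y moves as X increases; the threshold is an arbitrary assumption."""
    if r > threshold:
        return "goes up"
    if r < -threshold:
        return "goes down"
    return "stays about the same"
```

For example, `direction(pearson_r([1, 2, 3], [2, 4, 6]))` describes a Y that rises with X. The "Interpretation" element is still your job: the number tells you the direction, not the reason.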