Urban Big Data Analytics

Vancouver Summer Program 2019

Course Syllabus [pdf]

Course Information

  • Instructor: Andy Hong, The George Institute for Global Health, University of Oxford
  • Teaching Assistants: Tom Park, Julian Ho
  • Course inquiries: Please send your course-related questions to urbanbigdata2019@gmail.com
  • Term: July 15 to August 8, 2019
  • Day/Time: 9:00 – 12:00 M-Th
  • Location: Room 110, West Mall Annex

Course Description

With the advent of the open data movement, the knowledge and skills needed to collect and analyze big data are becoming increasingly important for urban planners. This course teaches students how to harness big data by understanding how it is collected, organized, and analyzed to support better decision making in an urban planning context. Students will learn the basic tools needed to manipulate large datasets derived from various open-data platforms, covering data collection, storage, and approaches to analysis. Students will be able to capture data, build data structures, and perform basic queries to extract key metrics and insights. In addition, students will learn how to use data analytics tools, such as Gapminder and Exploratory, to analyze and visualize data. The course will also give students some exposure to statistical programming with R and introduce them to basic machine learning techniques.

Course Requirement

Course References

Grading Scheme

Your final grade for the course will be based on the following three items, plus a bonus pop quiz:

  • Course participation (3% for each of the 10 classes): 30%
  • Four assignments (10% for each assignment): 40%
  • Final group project (on-line submission): 30%
  • Pop quiz (one random in-class quiz, closed book): 3 extra percentage points
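
As a rough illustration of how the weights above add up, here is a minimal Python sketch. The function name, the input conventions (assignment and project scores given as fractions between 0 and 1), and the choice not to cap the total at 100 are all assumptions made for this example; they are not part of the official grading policy.

```python
def final_grade(classes_attended, assignment_scores, project_score, quiz_bonus=0.0):
    """Combine the syllabus weights into a percentage.

    Assumed conventions (not stated in the syllabus):
      - assignment_scores: four fractions in [0, 1]
      - project_score: a fraction in [0, 1]
      - quiz_bonus: extra points in [0, 3]
    """
    # 3% per class, counted for at most 10 classes (30% maximum)
    participation = 3.0 * min(classes_attended, 10)
    # 10% per assignment (40% maximum for four assignments)
    assignments = sum(10.0 * score for score in assignment_scores)
    # Final group project is worth 30%
    project = 30.0 * project_score
    return participation + assignments + project + quiz_bonus


# Hypothetical student: full attendance, strong assignments, full quiz bonus
example = final_grade(10, [1.0, 0.9, 0.8, 1.0], 0.9, quiz_bonus=3.0)
print(example)
```

The late-submission penalty for the final project is left out because the syllabus does not say whether it applies to the project score or to the final grade.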



Course Schedule

I. Data Science Basics

Lecture 1 - Introduction to urban big data - Jul 17 (Wed) 9:00-12:00
Lecture 1 slides: [link]
Lecture 1 group session: [link]
TA: Tom Park

  1. Course overview and syllabus
  2. Introduction to data science and big data
  3. Emergence of urban data science and open data initiatives
  4. Linking data, city management, and policy making
  5. Overview of the assignments and group project

ASSIGNMENT #1 OUT [link]


Lecture 2 - Data acquisition through open-data platforms - Jul 18 (Thu) 9:00-12:00
Lecture 2 slides: [link]
Lecture 2 group session: [link]
Final group project part 1: [link]
TA: Julian Ho

  1. Introduction to Exploratory
  2. Vancouver open data catalogue
  3. NYC open data
  4. Chicago open data – OpenGrid
  5. Group project topic assignment

ASSIGNMENT #1 DUE / ASSIGNMENT #2 OUT [link]


Lecture 3 - Data wrangling + Special guest speaker - Jul 19 (Fri) 9:00-12:00
Lecture 3 slides: [link]
Lecture 3 group session: [link]
TA: Tom Park

  1. Data wrangling principles: NICE(R)
  2. Data types, conversion, and categorization
  3. Data filtering, sorting, and reordering
  4. Summarizing and joining data
  5. Hands-on group session
  6. Guest speaker: Akansha Vashisth, MSc in Data Science, UBC (11:30 - 12:00)

II. Database & Computing

Lecture 4 - Database and SQL - Jul 22 (Mon) 9:00-12:00
Lecture 4 slides: [link]
Lecture 4 group session: [link]
TA: Tom Park

  1. Introduction to database systems
  2. Difference between tabular and text-based data
  3. Choosing which tools to use for which purpose
  4. Loading a database using SQL
  5. Importing data into SQLite
  6. Querying and filtering data
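
To give a flavour of the import-then-query workflow listed above, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical stand-ins for an open-data extract; they are not taken from the course materials, and the lecture itself may use different tools.

```python
import sqlite3

# Hypothetical sample rows standing in for an open-data extract
rows = [
    ("Downtown", "Theft", 2018),
    ("Kitsilano", "Mischief", 2018),
    ("Downtown", "Theft", 2019),
    ("Downtown", "Mischief", 2019),
]

# An in-memory database keeps the example self-contained
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (neighbourhood TEXT, category TEXT, year INTEGER)"
)

# Import the rows with a parameterized INSERT
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?)", rows)

# Filter by category and count incidents per neighbourhood
theft_counts = conn.execute(
    "SELECT neighbourhood, COUNT(*) FROM incidents "
    "WHERE category = ? GROUP BY neighbourhood ORDER BY neighbourhood",
    ("Theft",),
).fetchall()

print(theft_counts)
conn.close()
```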

ASSIGNMENT #2 DUE / ASSIGNMENT #3 OUT [link]


Lecture 5 - Spatial data and GeoJSON data - Jul 23 (Tue) 9:00-12:00
Lecture 5 slides: [link]
Lecture 5 group session: [link]
TA: Tom Park

  1. Basic cartography and projection
  2. Spatial data basics: point, line, polygon, and raster
  3. Reading conventional spatial data: shapefiles and geodatabases
  4. Reading GeoJSON data
  5. Merging tabular data with spatial data
  6. Hands-on group session


Lecture 6 - Cloud computing and Google BigQuery - Jul 24 (Wed) 9:00-12:00
Lecture 6 slides: [link]
Lecture 6 group session / Final group project part 2: [link]
TA: Tom Park

  1. What is big data and why do we need to know about it?
  2. Basic intro to cloud database systems
  3. Differences among Google BigQuery, Amazon Web Services, and Microsoft Azure
  4. Basic data querying steps
  5. SQL basics and examples
  6. Group project midterm check-in


Lecture 7 - Exploratory data analysis (EDA) - Jul 25 (Thu) 9:00-12:00
Lecture 7 slides: [link]
Lecture 7 group session: [link]
TA: Julian Ho

  1. Intro to exploratory data analysis
  2. Intro to R and basic programming skills
  3. Summary statistics and data types
  4. Basic plotting and correlations
  5. Handling missing data and outliers

ASSIGNMENT #3 DUE



III. Advanced Analytics

Lecture 8 - Data visualization and web mapping - Jul 29 (Mon) 9:00-12:00
Lecture 8 slides: [link]
Lecture 8 group session: [link]
TA: Julian Ho

  1. Intro to data visualization
  2. Tufte’s 10 rules
  3. Data visualization with ggplot2
  4. Interactive web mapping with leaflet
  5. Hands-on group session

ASSIGNMENT #4 OUT [link]


Lecture 9 - Statistical analysis with Exploratory - Jul 30 (Tue) 9:00-12:00
Lecture 9 slides: [link]
Lecture 9 group session / Final group project part 3: [link]
TA: Tom Park

  1. Basic probability and statistics – signal-to-noise ratio
  2. Statistical inference and modeling
  3. Linear regression with continuous data
  4. Data transformation
  5. Hands-on group session


Lecture 10 - Advanced statistical analysis - Jul 31 (Wed) 9:00-12:00
Lecture 10 slides: [link]
Lecture 10 group session: [link]
TA: Julian Ho

  1. Regression with skewed data
  2. Poisson and negative binomial regression
  3. Logistic regression and interpretation
  4. Model selection and goodness-of-fit
  5. Final group project working session

ASSIGNMENT #4 DUE


Lecture 11 - Basic machine learning and future of urban data science - Aug 1 (Thu) 9:00-12:00
Lecture 11 slides: [link]
Lecture 11 group session: [link]
TA: Tom Park

  1. Intro to machine learning
  2. Supervised vs. unsupervised learning
  3. Data clustering with k-means
  4. Decision trees, random forest, bagging, and boosting
  5. Future of urban data science
  6. Final group project presentation prep


Lecture 12 - Final group presentation + Special guest speaker - Aug 6 (Tue) 9:00-12:00

  • Each group will have 12 minutes to present its project.
  • Guest speaker: Sayanti Ghosh, MSc in Data Science, UBC (11:30 - 12:00)


Final Project

The final project is DUE BY Aug 8 (Thu), 11:59 PM
Submit your final group project to the course email (urbanbigdata2019@gmail.com) by Aug 8 (Thu)
(-3% for each day of late submission)

Please use the following email subject format:
VSP BigData Final Project - [group number] - [project name]
e.g., VSP BigData Final Project - Group 1 - Boston Crime Analytics