Urban Big Data Analytics
Lecture 1
Course Introduction
July 17, 2019
Instructor: Andy Hong, PhD
Lead Urban Health Scientist
The George Institute for Global Health
University of Oxford
A bit about me
▸ Lead Urban Health Scientist
▸ Co-founder of Healthy Cities Network
▸ Studied informatics, geography, and public policy
▸ Bike commuter and hiker
▸ A father of two kids
Course overview
- Updated course syllabus (link)
- 4 assignments
- 11 group sessions
- 1 final group project
- Extra perks:
  - Two TAs: Tom and Julian
  - Two guest speakers: MSc in Data Science
This course is about
- An introduction to urban data science
- Learning to think and talk like data scientists
- Learning how to wrangle, explore, and analyze data
This course is NOT about
- Learning how to program
- Learning statistics and math
- Learning GIS and mapping skills
Let's break the ice
- Tell us your name and where you are from
- What is your major of study?
- What do you want to get out of this course?

Data scientist
Data is the new oil
From data to insights
What is data science?
What is urban data science?
What are the urban issues?
- Health issues: air pollution, noise pollution
- Environmental issues: climate change (coastal cities), extreme heat
- Traffic congestion, safety, crime
- Housing, income inequality, racial segregation
Urban "big" data
- Every day, we create 2.5 quintillion bytes of data
- 90% of the data today has been created in the last 2 years alone.
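To put the first figure in perspective, a quick unit conversion helps. The sketch below assumes decimal (SI) prefixes, where one quintillion is 10^18 and one exabyte is 10^18 bytes. The names `BYTES_PER_DAY`, `UNITS`, and `convert` are illustrative and not part of the course material.

```python
# Convert "2.5 quintillion bytes per day" into more familiar storage units.
# Assumption: decimal (SI) prefixes, so 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
BYTES_PER_DAY = 25 * 10**17  # 2.5 quintillion = 2.5 x 10^18 bytes

UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18}


def convert(n_bytes, unit):
    """Return n_bytes expressed in the given unit (GB, TB, PB, or EB)."""
    return n_bytes / UNITS[unit]


# Print the daily figure in each unit, from gigabytes up to exabytes.
for unit in UNITS:
    print(f"{convert(BYTES_PER_DAY, unit):,.1f} {unit} per day")
```

Under these assumptions, 2.5 quintillion bytes works out to 2.5 exabytes, or 2,500 petabytes, per day.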
What is big data?
- Three Vs: Volume, Velocity, Variety
- Very large data that cannot be handled with typical software, such as Excel
- Data volume continues to increase, e.g., sensor data (IoT) and crowdsourced data
- Everything as data, not just text but also images, videos, tweets, etc.
Big data = Structured + Unstructured data
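As a rough illustration of the difference, the sketch below contrasts a structured record with a piece of unstructured text. All values are made up for illustration (the sensor IDs, the PM2.5 readings, and the post are hypothetical), and the regular expression is just one simple way to parse free text.

```python
import re

# Structured data: fixed fields with known types (hypothetical sensor readings).
readings = [
    {"sensor_id": "A1", "pm25": 12.0},
    {"sensor_id": "B2", "pm25": 35.5},
]

# Unstructured data: free text with no schema (made-up example post).
post = "Air feels awful downtown today, PM2.5 must be over 40!"

# Structured data can be filtered directly by field name.
high = [r["sensor_id"] for r in readings if r["pm25"] > 25]

# Unstructured data must be parsed first; here a regex extracts the number after "over".
match = re.search(r"over (\d+)", post)
mentioned = int(match.group(1)) if match else None

print(high, mentioned)
```

The structured records can be queried as they are, while the text needs an extra parsing step before any number can be analyzed.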
Big data trend
Smart cities and Internet of Things
Real-time sensors = Big data
City Sensor
Rise of spatial big data
Age of cloud computing
Pre-cloud days
- Downloading and patching together satellite images
- Long time to load gigabytes of satellite images
- Takes several days to process large raster data
Post-cloud days
- All the satellite data stored on a public cloud
- No need to install software; everything runs in a browser
- Google Earth Engine: data processing on the fly
Group session
https://www.gapminder.org/tools
- Group 1: Income x Life expectancy
- Group 2: Income x Life expectancy
- Group 3: Income x CO2
- Group 4: Income x CO2
- Group 5: Income x Babies per woman
- Group 6: Income x Babies per woman
- Group 7: Income x Child mortality
- Group 8: Income x Child mortality