Urban Big Data Analytics

Lecture 1
Course Introduction

July 17, 2019

Instructor: Andy Hong, PhD
Lead Urban Health Scientist
The George Institute for Global Health
University of Oxford

A bit about me


▸ Lead Urban Health Scientist
▸ Co-founder of Healthy Cities Network
▸ Studied informatics, geography, and public policy
▸ Bike commuter and hiker
▸ A father of two kids


Course overview


  • Updated course syllabus
  • 4 assignments
  • 11 group sessions
  • 1 final group project
  • Extra perks:
    • Two TAs: Tom and Julian
    • Two guest speakers: MSc in Data Science
This course is about
  • Introduction to urban data science
  • Learn to think and talk like data scientists
  • Learn how to wrangle, explore, and analyze data

This course is NOT about
  • Learn how to program
  • Learn statistics and math
  • Learn GIS and mapping skills

Let's break the ice

  • Tell us your name and where you are from
  • What is your major of study?
  • What do you want to get out of this course?

Data scientist

Data is the new oil

From data to insights

What is data science?

What is urban data science?

What are the urban issues?

  • Health issues: air pollution, noise pollution
  • Environmental issues: climate change (coastal cities), extreme heat
  • Traffic congestion, safety, crime
  • Housing, income inequality, racial segregation

Using data science to solve urban heat island issues

https://youtube.com/embed/WN-nY1l0VM0

Urban "big" data


  • Every day, we create 2.5 quintillion bytes of data
  • 90% of the data today has been created in the last 2 years alone.

What is big data?

  • Three Vs: Volume, Velocity, Variety
  • Very large data that cannot possibly be handled with typical software, such as Excel
  • Data volume continues to increase, e.g., sensor data (IoT), crowdsourced data
  • Everything as data, not just text but also images, videos, tweets, etc.
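
To make the "too large for Excel" point concrete, here is a minimal Python sketch of processing a stream of readings one record at a time, so the full data set never has to fit in memory or in a worksheet. The sensor readings are synthetic, and the names sensor_readings and streaming_mean are hypothetical helpers made up for this illustration. The row limit is that of a modern Excel worksheet.

```python
import random
from typing import Iterable, Iterator, Tuple

# Row limit of a modern Excel worksheet (2**20 rows).
EXCEL_MAX_ROWS = 1_048_576


def sensor_readings(n: int, seed: int = 0) -> Iterator[Tuple[int, float]]:
    """Yield n synthetic (sensor_id, temperature) pairs, one at a time.

    The values are random placeholders, not real measurements.
    """
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.randrange(10), rng.gauss(20.0, 5.0)


def streaming_mean(readings: Iterable[Tuple[int, float]]) -> Tuple[int, float]:
    """Return (count, mean) in a single pass using constant memory."""
    count = 0
    mean = 0.0
    for _, value in readings:
        count += 1
        # Incremental update: no list of all values is ever stored.
        mean += (value - mean) / count
    return count, mean


# One row more than a single Excel worksheet can hold.
n_rows = EXCEL_MAX_ROWS + 1
count, mean = streaming_mean(sensor_readings(n_rows))
print(f"Processed {count:,} rows; mean temperature is about {mean:.2f}")
```

The same single-pass pattern carries over to reading a large file line by line. Tools built for big data apply this idea at much larger scale and across many machines.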

Big data = Structured + Unstructured data
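
A small Python sketch may help show the difference. Structured data has named fields that can be computed on directly, while unstructured data, such as a tweet, needs extra work before it yields anything usable. All values and the tweet text below are invented for illustration only.

```python
import csv
import io
import re

# Structured: every value sits in a named column (toy values, not real data).
structured = "sensor_id,pm25\nA1,12.5\nB2,30.1\n"
rows = list(csv.DictReader(io.StringIO(structured)))
mean_pm25 = sum(float(row["pm25"]) for row in rows) / len(rows)

# Unstructured: free text with no schema (an invented example tweet).
unstructured = "Air feels awful near the station today, PM2.5 must be around 40 #smog"
hashtags = re.findall(r"#(\w+)", unstructured)
numbers = re.findall(r"\d+(?:\.\d+)?", unstructured)

print("Mean PM2.5 from the table:", round(mean_pm25, 1))
print("Hashtags found in the text:", hashtags)
print("Numbers found in the text:", numbers)
```

Note that the naive number pattern also picks up the "2.5" inside "PM2.5", which is a label rather than a measurement. Ambiguity like this is why unstructured data usually takes more cleaning than a table does.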

Big data trend

Smart cities and Internet of Things

Real-time sensors = Big data

City Sensor

Rise of spatial big data

Age of cloud computing

Pre-cloud days
  • Downloading and patching together satellite images
  • Long time to load gigabytes of satellite images
  • Taking several days to process large raster data

Post-cloud days
  • All the satellite data stored on a public cloud
  • No need to install software. Everything in a browser
  • Google Earth Engine: data processing on the fly

Google Earth Engine Demo

Earth Engine Code Editor: https://code.earthengine.google.com/

Group session

Instruction

Group session

https://www.gapminder.org/tools
  • Group 1: Income x Life expectancy
  • Group 2: Income x Life expectancy
  • Group 3: Income x CO2
  • Group 4: Income x CO2
  • Group 5: Income x Babies per woman
  • Group 6: Income x Babies per woman
  • Group 7: Income x Child mortality
  • Group 8: Income x Child mortality

Assignment 1

Instruction

Any questions?

For all the course materials, go to urbanbigdata.github.io