Urban Big Data Analytics
Lecture 5
Spatial Data
July 23, 2019
Instructor: Andy Hong, PhD
Lead Urban Health Scientist
The George Institute for Global Health
University of Oxford
Any questions about Assignment 3?
- Help you get familiar with R coding
- If you are new to R, it will be difficult
- If you are new to coding, it will be extremely difficult
- Don't worry, we will cover it tomorrow
- Remember, coding is like talking to a computer
What are spatial data?
- Data that have geographic information
- Coordinates, addresses, postal codes
- Spatial data are multi-dimensional: x, y, z ...
- Long history: geography, forestry ...
Why spatial data?
- Locational data are valuable
- Satellites, mobile devices, drones, vehicles, etc.
- Tracking the movement of people
- Tracking the movement of goods
- Tracking the movement of animals
- Big companies: Google, Apple, Microsoft, Baidu, Uber, Didi
Four use cases
- Vegetation health
- Global Fishing Watch
- Extreme weather events
- Terrorist incidents
Types of spatial data
Key elements
Projection! Projection! Projection!
- Most important information
- The earth is not flat
- If the projection is wrong, map layers won't line up
- Most maps use WGS84 (EPSG:4326); strictly, it is a geographic coordinate system (degrees), not a projection
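To make the point about mismatched maps concrete, here is a minimal sketch in Python that converts one WGS84 longitude/latitude pair into Web Mercator (EPSG:3857), the projected system used by many web map tiles. The function name and the sample point (an approximate location for Oxford) are illustrative assumptions, not part of the lecture, and the spherical formula is only valid away from the poles.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, in metres


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 longitude/latitude (degrees) to Web Mercator x/y (metres).

    Illustrative helper: uses the spherical Web Mercator formula.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# The same place expressed in two coordinate systems (approximate, for illustration)
lon, lat = -1.2577, 51.7520
x, y = wgs84_to_web_mercator(lon, lat)

print(f"WGS84:        lon={lon:.4f}, lat={lat:.4f} (degrees)")
print(f"Web Mercator: x={x:.1f}, y={y:.1f} (metres)")
```

The two pairs of numbers describe the same location but differ by several orders of magnitude, which is why layers stored in different coordinate systems do not overlap until one of them is reprojected.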
Spatial data files
- Esri (ArcGIS) shapefiles: *.shp, plus companion files such as *.shx, *.dbf, *.prj
- Google Earth/Maps files (KML): *.kml, *.kmz (zipped KML)
- GeoJSON: spatial version of JSON (JavaScript Object Notation)
- Text: Longitude (X), Latitude (Y) information
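As a rough illustration of how the text and GeoJSON formats relate, the sketch below reads a small longitude/latitude table from text and writes it out as a GeoJSON FeatureCollection, using only Python's standard library. The column names, the helper `rows_to_geojson`, and the two sample rows are hypothetical.

```python
import csv
import io
import json

# Hypothetical text data: one point per row, longitude (X) and latitude (Y)
text = """name,longitude,latitude
Oxford,-1.2577,51.7520
Sydney,151.2093,-33.8688
"""


def rows_to_geojson(rows):
    """Convert rows with 'name', 'longitude', 'latitude' keys to GeoJSON."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON stores coordinates as [longitude, latitude]
                "coordinates": [float(row["longitude"]), float(row["latitude"])],
            },
            "properties": {"name": row["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = rows_to_geojson(csv.DictReader(io.StringIO(text)))
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Note the coordinate order: GeoJSON puts longitude (X) first and latitude (Y) second, which matches the X, Y convention above but is the reverse of how coordinates are often spoken ("lat, long").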
Coordinates (points)
Joining data
- Need a common column
- Need to match data types
- Mostly left joins, sometimes inner joins, rarely right joins
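The three points above can be sketched in plain Python. This is a minimal illustration, not a library API: the tables, their values, and the helpers `left_join` and `inner_join` are all hypothetical, and the helpers assume each key appears at most once in the right-hand table.

```python
# Hypothetical tables that share a "postcode" column
neighbourhoods = [
    {"postcode": "2000", "name": "Sydney CBD"},
    {"postcode": "2010", "name": "Surry Hills"},
    {"postcode": "2042", "name": "Newtown"},
]
population = [
    {"postcode": 2000, "population": 100},  # made-up values; postcode is an int here
    {"postcode": 2010, "population": 200},
]


def left_join(left, right, key):
    """Keep every left row; fill right-hand columns with None when unmatched."""
    lookup = {row[key]: row for row in right}
    right_columns = [c for c in right[0] if c != key]
    joined = []
    for row in left:
        match = lookup.get(row[key])
        extra = {c: (match[c] if match else None) for c in right_columns}
        joined.append({**row, **extra})
    return joined


def inner_join(left, right, key):
    """Keep only left rows that have a match in the right table."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


# Mismatched data types: "2000" (text) never equals 2000 (number)
print(inner_join(neighbourhoods, population, "postcode"))

# Convert the key to a common type before joining
population_text = [{**row, "postcode": str(row["postcode"])} for row in population]

left_result = left_join(neighbourhoods, population_text, "postcode")
inner_result = inner_join(neighbourhoods, population_text, "postcode")
print(left_result)
print(inner_result)
```

The left join keeps all three neighbourhoods and leaves the population empty where there is no match, while the inner join drops the unmatched row. That is usually why a left join is the safer default: no spatial units silently disappear from the map.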