Urban Big Data Analytics
					
					Lecture 5
					Spatial Data
					
					
					 July 23, 2019
					
					Instructor: Andy Hong, PhD
					
					Lead Urban Health Scientist
					The George Institute for Global Health
					University of Oxford
					
					
				
				
				
					Any questions about Assignment 3?
					
						- Helps you get familiar with R coding
- If you are new to R, it will be difficult
- If you are new to coding, it will be extremely difficult
- Don't worry, we will cover it tomorrow
- Remember, coding is like talking to a computer
What are spatial data?
					
						- Data that have geographic information
- Coordinates, addresses, postal codes
- Spatial data are multi-dimensional: x, y, z ...
- Long history: geography, forestry ...
Why spatial data?
					 
					
						- Locational data are valuable
- Satellites, mobile devices, drones, vehicles, etc.
- Tracking people movement
- Tracking goods movement
- Tracking animal movement
- Big companies: Google, Apple, Microsoft, Baidu, Uber, Didi
Four use cases
					
						- Vegetation health
- Global Fishing Watch
- Extreme weather events
- Terrorist incidents
Types of spatial data
					 
					  
				
				
					
					Key elements
					 
					  
				
				
					
					Projection! Projection! Projection!
					
						- Most important information
- The earth is not flat
- If the projection is wrong, map layers won't line up
- Most maps use WGS84 (strictly a geographic coordinate system in longitude/latitude, not a projection)
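
To make the point about projections concrete, here is a minimal sketch of how WGS84 longitude/latitude can be converted to spherical Web Mercator (EPSG:3857) metres, the projection many online maps use for display. It is written in Python with the standard library only, purely for illustration; the function name and the sample point are assumptions, not part of the lecture.

```python
import math

# Semi-major axis of the WGS84 ellipsoid in metres, used as the sphere
# radius by the Web Mercator projection (EPSG:3857).
EARTH_RADIUS_M = 6378137.0

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 longitude/latitude (degrees) to Web Mercator x/y (metres)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Hypothetical point (illustrative coordinates only).
lon, lat = -1.2577, 51.7520
x, y = lonlat_to_web_mercator(lon, lat)
print(f"lon/lat (degrees):     ({lon}, {lat})")
print(f"Web Mercator (metres): ({x:.1f}, {y:.1f})")
```

The same location has very different numbers in the two systems, which is why layers stored in different coordinate systems do not overlap until one is reprojected.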
Spatial data files
					
						- ArcGIS Shapefiles: *.shp
- Google map files: *.kml, *.kmz
- GeoJSON: spatial version of JSON (JavaScript Object Notation)
- Text: Longitude (X), Latitude (Y) information
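
As a rough illustration of how two of these formats relate, the sketch below turns plain text with longitude (X) and latitude (Y) columns into GeoJSON points. It uses Python's standard library only; the column names, site names, and coordinates are made up for the example.

```python
import csv
import io
import json

# Hypothetical text data: one point per row, longitude (X) then latitude (Y).
text_data = """name,lon,lat
Site A,-1.2577,51.7520
Site B,-0.1276,51.5072
"""

def rows_to_geojson(text):
    """Turn CSV text with name/lon/lat columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        features.append({
            "type": "Feature",
            # GeoJSON stores positions as [longitude, latitude], i.e. X first.
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {"name": row["name"]},
        })
    return {"type": "FeatureCollection", "features": features}

collection = rows_to_geojson(text_data)
print(json.dumps(collection, indent=2))
```

Note the coordinate order: swapping longitude and latitude is a common mistake when moving between text files and GeoJSON.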
Coordinates (points)
		  		  	 
				
				
					
					Joining data
					
					 
					
						- Need a common column
- Need to match data types
- Mostly left joins, sometimes inner joins, rarely right joins
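
The points above can be sketched in a few lines of plain Python. This is an illustrative toy, not the method used in the course: the tables, column names, and the `join` helper are all hypothetical, and it assumes each key appears at most once in the right-hand table.

```python
# Hypothetical tables, each a list of dict rows.
neighbourhoods = [
    {"postcode": "2000", "name": "Central"},
    {"postcode": "2010", "name": "East"},
    {"postcode": "2020", "name": "South"},
]
# Here the key is stored as an integer, so it will not match the strings
# above until the data types are aligned.
clinics = [
    {"postcode": 2000, "clinics": 4},
    {"postcode": 2020, "clinics": 1},
    {"postcode": 2999, "clinics": 7},
]

def join(left, right, key, how="left"):
    """Join two lists of dict rows on `key`; `how` is "left" or "inner"."""
    right_cols = [c for c in right[0] if c != key]
    # Cast keys to str on both sides so the data types match.
    lookup = {str(row[key]): row for row in right}
    joined = []
    for row in left:
        match = lookup.get(str(row[key]))
        if match is None and how == "inner":
            continue  # an inner join drops left rows with no match
        # A left join keeps unmatched rows and fills the new columns with None.
        values = match if match is not None else dict.fromkeys(right_cols)
        joined.append({**row, **{c: values[c] for c in right_cols}})
    return joined

left_result = join(neighbourhoods, clinics, "postcode", how="left")
inner_result = join(neighbourhoods, clinics, "postcode", how="inner")
print("left join: ", left_result)
print("inner join:", inner_result)
```

The left join keeps every neighbourhood, including the one with no clinic record, which is usually what you want when attaching attributes to a map layer. The inner join keeps only the rows found in both tables.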