library(ggplot2)
library(ggmap)
library(readr)
library(tmaptools)

We have explored creating a map and putting additional data “on top of” the map. In this assignment we will do the same, with one additional requirement: “zoom” into the region of the United States where the data of interest is located.

Data cleaning and preparation

First, read in the dataset (Crowdsourced ports using OpenMaps). You can import the data by pointing the import function to the link (hint: it’s a CSV). Be sure to include the header = FALSE argument when you import the data: the file has no header row, so this makes R treat the first line as data rather than as column names.

# Import port dataset
ports <- read.csv("https://raw.githubusercontent.com/jpatokal/openflights/master/data/airports-extended.dat", header = FALSE)
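To see what header = FALSE changes, here is a minimal sketch that reads two inline rows instead of the real file. The `sample_csv` string is made up for illustration and has only three fields per row; the point is the difference in row counts, not the data.

```r
# Two made-up rows in the same headerless style as the ports file
sample_csv <- "1,Goroka Airport,Goroka\n2,Madang Airport,Madang"

# header = FALSE: both lines are data; columns get default names V1, V2, V3
no_header <- read.csv(text = sample_csv, header = FALSE)

# header = TRUE (the read.csv default): the first line is used as column names
with_header <- read.csv(text = sample_csv, header = TRUE)

nrow(no_header)    # both rows kept
nrow(with_header)  # one row fewer, the first line became the header
names(no_header)
```

With the default header = TRUE, the first port would silently disappear from the data, which is why the argument matters for this file.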

Inspect the data. You’ll notice you need to change the column names. Here are the column names:

"ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon", 
                        "altitude", "timezone", "DST","location","type","source"
colnames(ports) <- c("ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon", 
                        "altitude", "timezone", "DST","location","type","source")

Check the data using head to make sure the columns line up.

head(ports)
##   ID                                        name         city
## 1  1                              Goroka Airport       Goroka
## 2  2                              Madang Airport       Madang
## 3  3                Mount Hagen Kagamuga Airport  Mount Hagen
## 4  4                              Nadzab Airport       Nadzab
## 5  5 Port Moresby Jacksons International Airport Port Moresby
## 6  6                 Wewak International Airport        Wewak
##            country IATA_FAA ICAO       lat     lon altitude timezone DST
## 1 Papua New Guinea      GKA AYGA -6.081690 145.392     5282       10   U
## 2 Papua New Guinea      MAG AYMD -5.207080 145.789       20       10   U
## 3 Papua New Guinea      HGU AYMH -5.826790 144.296     5388       10   U
## 4 Papua New Guinea      LAE AYNZ -6.569803 146.726      239       10   U
## 5 Papua New Guinea      POM AYPY -9.443380 147.220      146       10   U
## 6 Papua New Guinea      WWK AYWK -3.583830 143.669       19       10   U
##               location    type      source
## 1 Pacific/Port_Moresby airport OurAirports
## 2 Pacific/Port_Moresby airport OurAirports
## 3 Pacific/Port_Moresby airport OurAirports
## 4 Pacific/Port_Moresby airport OurAirports
## 5 Pacific/Port_Moresby airport OurAirports
## 6 Pacific/Port_Moresby airport OurAirports

Creating a subset of ports

Now that you have the data in the proper format, you can begin mapping it. There are 12,304 observations across many countries. Explore the data and choose a city you want to plot, preferably one with more than 20 observations. Create a subset of the data containing only the observations you plan to plot (e.g., if you were plotting data from Chicago you’d set the condition to city == "Chicago"). Explore your subset using head() so you know you have the correct dataset.

# subset the data to get only observations where the city is berlin
berlin.stations <- ports[which(ports$city == "Berlin"),]
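One way to find cities with enough observations is to tabulate the city column before subsetting. The sketch below uses a small invented data frame (`toy_ports`) and a hypothetical threshold `min_obs` so that it runs on its own; neither is part of the assignment.

```r
# Toy stand-in for the ports data frame (values invented for illustration)
toy_ports <- data.frame(
  name = c("A", "B", "C", "D", "E"),
  city = c("Berlin", "Berlin", "Chicago", "Berlin", "Chicago"),
  stringsAsFactors = FALSE
)

# Count observations per city and sort from most to fewest
city_counts <- sort(table(toy_ports$city), decreasing = TRUE)

# Keep only cities with at least min_obs observations (hypothetical threshold)
min_obs <- 3
candidates <- names(city_counts[city_counts >= min_obs])

# Subset to the candidate cities; which() drops rows where the test is NA
toy_subset <- toy_ports[which(toy_ports$city %in% candidates), ]
head(toy_subset)
```

On the real data the same idea would apply to ports$city, with the threshold raised to match the "more than 20 observations" guideline.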

Obtaining a map

Next, you need to retrieve a map for the city you plan to plot the points over. The code to do so is here:

mapSimple.CITYNAME <- ggmap(get_stamenmap(rbind(as.numeric(paste(geocode_OSM("CITYNAME")$bbox))), zoom = ZOOMNUMBER))

You need to change two parameters above: CITYNAME and ZOOMNUMBER. The CITYNAME parameter is simply the name of the city you wish to retrieve a map for, and ZOOMNUMBER is how closely you want the map to zoom. I’d recommend a value between 8 and 11 (the zoom range is between 1 and 18, where higher numbers are closer to the map). Print the map to make sure it’s correct.
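To get a feel for what a zoom number means, the following sketch works through the arithmetic of standard web-map tiling, in which each zoom level doubles the number of tiles per side. It assumes 256-pixel tiles and the usual Web Mercator constant for the equator; the helper names `tiles_at_zoom` and `metres_per_pixel` are made up for illustration and are not ggmap functions.

```r
# Number of tiles covering the whole world at a given zoom level
# (2^zoom tiles per side, so 4^zoom tiles in total)
tiles_at_zoom <- function(zoom) 4^zoom

# Approximate ground resolution at the equator in metres per pixel,
# assuming 256-pixel tiles in the Web Mercator projection
metres_per_pixel <- function(zoom) 156543.03 / 2^zoom

# Compare a few zoom levels in the recommended range
zooms <- c(8, 10, 11)
data.frame(
  zoom     = zooms,
  tiles    = tiles_at_zoom(zooms),
  m_per_px = round(metres_per_pixel(zooms), 1)
)
```

Each step up in zoom halves the metres covered by one pixel, so larger zoom values show a smaller area in more detail and require more tiles to download.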

#use ggmap to obtain the Berlin map and set the zoom to size 10
mapSimple.berlin <- ggmap(get_stamenmap(rbind(as.numeric(paste(geocode_OSM("Berlin")$bbox))), zoom = 10))
mapSimple.berlin
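If the geocoded bounding box for a city turns out to be wider or narrower than the points you want to show, an alternative is to derive the box from the data itself. The sketch below is one possible approach, not the assignment's prescribed method: `bbox_from_points` is a hypothetical helper, `pad` is an assumed padding fraction, and the coordinates are invented.

```r
# Build a bounding box (left, bottom, right, top) around a set of points,
# padded by a fraction of the lon/lat span so points do not sit on the edge
bbox_from_points <- function(lon, lat, pad = 0.1) {
  lon_range <- range(lon, na.rm = TRUE)
  lat_range <- range(lat, na.rm = TRUE)
  lon_pad <- diff(lon_range) * pad
  lat_pad <- diff(lat_range) * pad
  c(left   = lon_range[1] - lon_pad,
    bottom = lat_range[1] - lat_pad,
    right  = lon_range[2] + lon_pad,
    top    = lat_range[2] + lat_pad)
}

# Invented coordinates roughly around Berlin, for illustration only
toy_lon <- c(13.3, 13.5, 13.4)
toy_lat <- c(52.4, 52.6, 52.5)
bbox_from_points(toy_lon, toy_lat)
```

The result follows the left, bottom, right, top order that ggmap's tile functions accept for their bbox argument, so with the real subset it could be built from the lon and lat columns and passed in place of the geocoded box.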