We have explored creating a map and putting additional data “on top of” the map. In this assignment, we will do that again, with a few additional requirements: select a city of interest from the dataset and “zoom” into the region where the data is of interest.

# Data cleaning and preparation

First, read in the dataset (Crowdsourced ports using OpenMaps). You can import the data by pointing the import function to the link (hint: it’s a CSV). Be sure to include the header = FALSE argument when you import the data, so that the first line is read as data rather than as column names.

Inspect the data. You’ll notice you need to change the column names. Here are the column names:

c("ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon","altitude", "timezone", "DST","location","type","source")

Check the data using head() to make sure the columns line up.
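
The import and renaming steps above can be sketched as follows. This is a minimal sketch, not the official solution: the two rows in sample_csv are made up so the example runs offline, and the names ports, port_cols, and sample_csv are hypothetical. In the assignment you would pass the dataset link to read.csv() instead of the text argument.

```r
# Column names given in the assignment.
port_cols <- c("ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon",
               "altitude", "timezone", "DST", "location", "type", "source")

# Made-up stand-in rows (NOT from the real dataset), used only so this runs offline.
sample_csv <- '1,"Example Field","Exampleville","Nowhere","EXF","XEXF",10.5,20.25,100,1,"U","Etc/UTC","airport","Made-up"
2,"Example Station","Exampleville","Nowhere","","",10.6,20.3,90,1,"U","Etc/UTC","station","Made-up"'

# header = FALSE keeps the first line as data; columns arrive named V1..V14.
ports <- read.csv(text = sample_csv, header = FALSE, stringsAsFactors = FALSE)

# Replace the default V1..V14 names with the assignment's column names.
colnames(ports) <- port_cols

# Check that the values line up with the new names.
head(ports)
```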

Now that you have the data in the proper format you can begin mapping the data. There are 12,304 observations across many countries. Explore the data and choose a city you want to plot, preferably one with more than 20 observations (use the table(x$city) function to get a count of the observations for each city). Create a subset of the data containing only observations you plan to plot (e.g., if you were plotting data from Chicago you’d set the condition to city == "Chicago"). Explore your data using head() so you know you have the correct dataset.

# Getting a base map

Next, you need to retrieve a map for the city you plan to plot the points over. The code to do so is below. You’ll need the tmaptools package, which contains the geocode_OSM() function, and the ggmap package, which contains the ggmap() and get_stamenmap() functions.

mapSimple.CITYNAME <- ggmap(get_stamenmap(rbind(as.numeric(paste(geocode_OSM("CITYNAME")$bbox))), zoom = ZOOMNUMBER))
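
Before the base map is requested, the counting and subsetting step described above can be sketched as follows. The small ports data frame here is a made-up stand-in for the full dataset, and the names city_counts and ports_city are hypothetical.

```r
# Made-up stand-in for the full dataset (only the columns needed here).
ports <- data.frame(
  name = c("A", "B", "C", "D", "E"),
  city = c("Chicago", "Chicago", "Chicago", "Springfield", "Springfield"),
  stringsAsFactors = FALSE
)

# Count observations per city, most frequent first.
city_counts <- sort(table(ports$city), decreasing = TRUE)
head(city_counts)

# Keep only the rows for the chosen city.
ports_city <- subset(ports, city == "Chicago")

# Confirm the subset contains only the chosen city.
head(ports_city)
```

Sorting the counts makes it easier to spot cities with enough observations to plot.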

You need to change two parameters above: CITYNAME and ZOOMNUMBER. The CITYNAME parameter is simply the name of the city you wish to retrieve a map for, and ZOOMNUMBER is how close you want the map to zoom. I’d recommend a value between 8 and 11 (the zoom range is between 1 and 18, where higher numbers zoom in closer). Print the map to make sure it’s correct.
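
The nested base-map call can be easier to follow when the bounding-box part is unpacked. The sketch below runs offline and only illustrates the conversion steps: city_bbox is a hypothetical stand-in for geocode_OSM("CITYNAME")$bbox, its numbers are made up, and the xmin/ymin/xmax/ymax layout is an assumption about what that function returns.

```r
# Hypothetical bounding box standing in for geocode_OSM("CITYNAME")$bbox.
# The values are made up; the xmin/ymin/xmax/ymax layout is an assumption.
city_bbox <- c(xmin = -88.0, ymin = 41.6, xmax = -87.5, ymax = 42.0)

# as.numeric(paste(...)) drops the names and leaves plain numbers.
bbox_numeric <- as.numeric(paste(city_bbox))

# rbind() on a single vector gives a one-row matrix, the form passed on
# to the map-retrieval function in the assignment's code.
bbox_matrix <- rbind(bbox_numeric)
dim(bbox_matrix)
```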

# Including Plots

You will use ggplot2 to map out (‘draw’) the ports as points on the map. For this assignment, you need to create three different, but related, maps:

• A map with ‘points’ for each record in the data. Set the points to the color blue.
• A ‘density’ map showing the same information. Here’s a resource on Contours of a 2d density estimate.
• A map with ‘points’ for each record in the data. Set the color of the points to the type of port represented. type is an attribute of each record in the data (e.g., NULL, airport, station, unknown).
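
The three maps listed above can be sketched with ggplot2 layers as follows. This is a rough sketch under stated assumptions, not the expected solution: ports_city holds synthetic coordinates and types generated for illustration, and an empty ggplot() stands in for the base map so the example runs offline. In the assignment, the same layers would be added to mapSimple.CITYNAME instead of base_plot.

```r
library(ggplot2)

# Synthetic stand-in for the city subset (made-up coordinates and types).
set.seed(1)
ports_city <- data.frame(
  lon  = rnorm(30, mean = -87.7, sd = 0.1),
  lat  = rnorm(30, mean = 41.9, sd = 0.1),
  type = sample(c("airport", "station", "unknown"), 30, replace = TRUE)
)

# Stand-in for the base map; in the assignment use mapSimple.CITYNAME here.
base_plot <- ggplot()

# Map 1: one blue point per record (a constant color goes outside aes()).
map_points <- base_plot +
  geom_point(data = ports_city, aes(x = lon, y = lat), color = "blue")

# Map 2: contours of a 2d density estimate of the same points.
map_density <- base_plot +
  geom_density_2d(data = ports_city, aes(x = lon, y = lat))

# Map 3: point color mapped to the type column (a mapping goes inside aes()).
map_types <- base_plot +
  geom_point(data = ports_city, aes(x = lon, y = lat, color = type))

print(map_points)
print(map_density)
print(map_types)
```

The difference between maps 1 and 3 is where color is set: outside aes() it is a fixed value for every point, while inside aes() it is mapped to a column and produces a legend.
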
##### Learning Goals for this activity:
• Consider how a simple data mining technique can be applied to a variety of kinds of source data.
• Practice using ggmap and ggplot commands.
• Increase familiarity with bringing external data sets into R.