IST 687 Introduction to Visualization

Corey Jackson

2020-02-12 19:08:20

Today’s Agenda

Announcements

Exam Logistics

Question distribution

Week  Topic                                # Questions
2     Using R to manipulate data           8
3     Descriptive statistics & functions   5
4     Inferential statistics               4
6     Introduction to visualization        1
7     Working with map data                1
8     Linear modeling                      2

Exam office hours: Monday, February 24th, 5 pm to 8 pm ET on Zoom, and on Slack

Exam interface

Week 5 - Connecting using different data sources

Book (CH 11) and asynchronous topics covered

Week 6 - Introduction to visualization

Book (CH 12) and asynchronous topics covered

##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3

Suppose we have the airquality data and are asked to create a histogram of Ozone. Load ggplot2 with library(ggplot2), then use the geom_histogram() function from that package.

library(ggplot2)
ggplot(airquality, #data
    aes(x=Ozone)) + #aesthetics 
geom_histogram(color="white", fill="black") # geom and more aesthetics

Note that the “+” is required to add each layer, and it must come at the end of a line so that R knows the expression continues.
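
As a hedged sketch of how layers accumulate, the plot can be stored in a variable and extended one “+” at a time. The variable name `p`, the `binwidth = 10` value, and the use of `na.rm = TRUE` are illustrative assumptions, not part of the lab.

```r
library(ggplot2)

# Base layer: data and aesthetic mapping only; no geom has been added yet
p <- ggplot(airquality, aes(x = Ozone))

# Add a histogram layer. binwidth = 10 is an arbitrary choice for illustration,
# and na.rm = TRUE drops the missing Ozone values without a warning.
p <- p + geom_histogram(binwidth = 10, color = "white", fill = "black",
                        na.rm = TRUE)

# The "+" ends the line, so R keeps reading the next line as part of the plot
p <- p +
  xlab("Ozone")

print(p)
```

Storing the plot in a variable makes it easy to try out one layer at a time before settling on the final version.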

Breakouts - Lab 6 and Project Updates (60 minutes)

Breakouts - Lab 6

Steps for completing lab 6

ggplot(airquality, aes(x=Ozone)) +    
geom_histogram(color="white", fill="black") +  
ylab("No. Observations") + xlab("Ozone")

Breakouts - Lab 6 (40 minutes)

Week 6 Homework tips

Week 6 Homework tips: Wide vs. long data formats

Long format (one row per country and year):

##   country year avgtemp
## 1  Sweden 1994       6
## 2 Denmark 1994       9
## 3  Norway 1994       5
## 4  Sweden 1995      11
## 5 Denmark 1995       7
## 6  Norway 1995      11

Wide format (one row per country, one column per year):

##   country avgtemp.1994 avgtemp.1995 avgtemp.1996
## 1  Sweden            6           11            8
## 2 Denmark            9            7            7
## 3  Norway            5           11            4

Week 6 Homework tips: Converting between long and wide data

Converting wide to long using melt() (in the reshape2 package)

##   country avgtemp.1994 avgtemp.1995 avgtemp.1996
## 1  Sweden            6           11            8
## 2 Denmark            9            7            7
## 3  Norway            5           11            4
library(reshape2)
country_long <- melt(country_wide, id=c("country"))
##   country     variable value
## 1  Sweden avgtemp.1994     6
## 2 Denmark avgtemp.1994     9
## 3  Norway avgtemp.1994     5
## 4  Sweden avgtemp.1995    11
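
A minimal, self-contained sketch of the wide-to-long step, rebuilding the wide table from the values shown above. The `variable.name` and `value.name` arguments and the follow-up step that strips the `avgtemp.` prefix are additions for illustration; the slide itself keeps the default `variable` and `value` column names.

```r
library(reshape2)

# Rebuild the wide table shown on the slide
country_wide <- data.frame(
  country      = c("Sweden", "Denmark", "Norway"),
  avgtemp.1994 = c(6, 9, 5),
  avgtemp.1995 = c(11, 7, 11),
  avgtemp.1996 = c(8, 7, 4)
)

# Wide to long; variable.name and value.name replace the default
# "variable" and "value" column names
country_long <- melt(country_wide, id.vars = "country",
                     variable.name = "year", value.name = "avgtemp")

# Strip the "avgtemp." prefix so that year becomes an integer
country_long$year <- as.integer(
  sub("avgtemp.", "", as.character(country_long$year), fixed = TRUE)
)

head(country_long, 4)
```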

Converting long to wide using dcast()

##   country year avgtemp
## 1  Sweden 1994       6
## 2 Denmark 1994       9
## 3  Norway 1994       5
## 4  Sweden 1995      11
## 5 Denmark 1995       7
country_wide <- dcast(country_long, country ~ year, value.var="avgtemp")
##   country 1994 1995 1996
## 1  Sweden    6   11    8
## 2 Denmark    9    7    7
## 3  Norway    5   11    4
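
The reverse direction can be sketched the same way, assuming a long table with the `country`, `year`, and `avgtemp` columns shown above (the 1996 values are taken from the wide table). Passing `value.var` explicitly is a choice made here so that dcast() does not have to guess which column holds the values.

```r
library(reshape2)

# Rebuild the long table from the values shown on the slides
country_long <- data.frame(
  country = rep(c("Sweden", "Denmark", "Norway"), times = 3),
  year    = rep(c(1994, 1995, 1996), each = 3),
  avgtemp = c(6, 9, 5, 11, 7, 11, 8, 7, 4)
)

# Long to wide: one row per country, one column per year,
# filled with the avgtemp values
country_wide <- dcast(country_long, country ~ year, value.var = "avgtemp")

country_wide
```

In the formula `country ~ year`, the left side names the columns that identify a row and the right side names the column whose values become the new column headers.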

Week 6 Homework tips: Extracting and combining columns from an existing dataframe

##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
aq1 <- data.frame(airquality$Ozone, airquality$Solar.R)
##   airquality.Ozone airquality.Solar.R
## 1               41                190
## 2               36                118
## 3               12                149
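
Note that data.frame() built from `airquality$Ozone` prefixes the column names with `airquality.`. Two possible ways to keep the original names are sketched below; the names `aq2` and `aq3` are hypothetical and used only for this example.

```r
# Option 1: index by column name; the original names are preserved
aq2 <- airquality[, c("Ozone", "Solar.R")]

# Option 2: name each column explicitly inside data.frame()
aq3 <- data.frame(Ozone = airquality$Ozone, Solar.R = airquality$Solar.R)

names(aq2)
names(aq3)
```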

Week 6 Homework tips: Working with dates in R

##   Month Day
## 1     5   1
## 2     5   2
## 3     5   3

We need to create a date that R can interpret. As a first step, paste() combines the month, the day, and a year into a single character string:

sessiondates$Date <- paste(sessiondates$Month,
                           sessiondates$Day, 2018, sep="/")

##   Month Day     Date
## 1     5   1 5/1/2018
## 2     5   2 5/2/2018

The new Date column is still a character vector (chr), so it has to be converted to an R Date object:

## 'data.frame':    153 obs. of  3 variables:
##  $ Month: int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Date : chr  "5/1/2018" "5/2/2018" "5/3/2018" "5/4/2018" ...
## NULL

Convert the character dates to R Date objects using as.Date(), with a format string that matches the text (month/day/four-digit year):

sessiondates$Date <- as.Date(sessiondates$Date,
                             "%m/%d/%Y")
## 'data.frame':    153 obs. of  3 variables:
##  $ Month: int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Date : Date, format: "2018-05-01" "2018-05-02" ...
## NULL
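
The two steps can also be combined, as in the sketch below. It assumes that `sessiondates` holds the Month and Day columns of airquality (which matches the 153 observations shown above); the `DaysElapsed` column is a hypothetical addition that illustrates why a real Date object is useful.

```r
# Assumption: sessiondates is the Month and Day columns of airquality
sessiondates <- airquality[, c("Month", "Day")]

# Build the date string and parse it in one step;
# 2018 is the year used on the slides
sessiondates$Date <- as.Date(
  paste(sessiondates$Month, sessiondates$Day, 2018, sep = "/"),
  format = "%m/%d/%Y"
)

class(sessiondates$Date)   # "Date"
range(sessiondates$Date)   # first and last dates in the data

# Date arithmetic now works, e.g. days elapsed since the first observation
sessiondates$DaysElapsed <- as.numeric(sessiondates$Date - min(sessiondates$Date))
```

If the format string does not match the text, as.Date() returns NA values, so checking the result with str() or range() is a cheap safeguard.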

Next week