Due date: the homework is due before class on Thursday.
Submission process: submit both the R Markdown file and the corresponding HTML file on Canvas. Please submit both the .Rmd and the .html files separately and do not zip the two files together.
Download the R Markdown file with these homework instructions to use as a template for your work. Make sure to replace “Your Name” in the YAML with your name.
The data this week comes from The Wall Street Journal. The data set includes immunization rate data for schools across the U.S. The accompanying article is published here.
library(dplyr)
library(ggplot2)
library(readr)
measles <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-25/measles.csv')
mmr to answer this question. Only consider schools with a rate > 0 for the remainder of the homework.
## your answer here
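As a reminder of how a row restriction like "rate > 0" can be expressed, here is a minimal sketch on made-up data. The data frame `toy` and its columns `school` and `rate` are hypothetical and only illustrate the filtering step; they are not part of the measles data.

```r
library(dplyr)

# Hypothetical toy data; `school` and `rate` are illustrative names only.
toy <- data.frame(
  school = c("A", "B", "C", "D"),
  rate   = c(97.5, -1, 0, 88.0)
)

# Keep only the rows with a strictly positive rate.
toy_positive <- toy %>%
  filter(rate > 0)

toy_positive
```

Saving the filtered result under a new name is one possible choice; it keeps the unfiltered data available in case you need to compare row counts.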
Using mutate(), reorder the levels of the variable state according to the median MMR vaccination rate. Then “pipe” your results into ggplot and create box plots of MMR vaccination rates for each state. Map the variable state to color, include the parameter show.legend = FALSE within geom_boxplot(), and flip the coordinates. Interpret.
## your answer here
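The reordering step can be sketched on made-up data as follows. This assumes base R's reorder() is an acceptable way to sort factor levels by a group median; the data frame `toy` and the columns `group` and `value` are hypothetical names chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data; `group` and `value` are illustrative names only.
toy <- data.frame(
  group = rep(c("a", "b", "c"), each = 3),
  value = c(5, 6, 7, 1, 2, 3, 8, 9, 10)
)

# reorder() returns a factor whose levels are sorted (ascending) by the
# median of `value` within each level of `group`.
toy_ordered <- toy %>%
  mutate(group = reorder(group, value, FUN = median))

levels(toy_ordered$group)

# Pipe the reordered data into ggplot; the box plots follow the new level order.
p <- toy_ordered %>%
  ggplot(aes(x = group, y = value, color = group)) +
  geom_boxplot(show.legend = FALSE) +
  coord_flip()

p
```

Because reorder() changes only the factor levels and not the rows, the effect shows up in the order of the boxes rather than in the data itself.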
Using mutate() and case_when(), introduce a new variable mmr_threshold into the data set, where the value is “above” when mmr is greater than 95 and “below” otherwise. Is there a relationship between the type of school and the proportion of schools that did not reach that threshold? For each type of school, calculate the mean MMR vaccination rate. On how many responses are the averages based? Show these numbers together with the averages. Additionally, calculate the percentage of schools that did not reach that threshold. Arrange your results from greatest percentage to lowest. Comment on your results.
## your answer here
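The pattern of flagging rows with case_when() and then summarising by group can be sketched on made-up data. The data frame `toy` and the names `kind`, `score`, `flag`, `mean_score`, and `pct_below` are hypothetical; this is one possible arrangement of the steps, not the only one.

```r
library(dplyr)

# Hypothetical toy data; `kind` and `score` are illustrative names only.
toy <- data.frame(
  kind  = c("x", "x", "x", "y", "y"),
  score = c(96, 90, 99, 80, 97)
)

toy_summary <- toy %>%
  # Label each row relative to a cutoff of 95.
  mutate(flag = case_when(
    score > 95 ~ "above",
    TRUE       ~ "below"
  )) %>%
  group_by(kind) %>%
  summarise(
    mean_score = mean(score),                 # group average
    n          = n(),                         # number of rows behind the average
    pct_below  = 100 * mean(flag == "below")  # share of rows labelled "below"
  ) %>%
  arrange(desc(pct_below))

toy_summary
```

Taking the mean of a logical vector gives the proportion of TRUE values, which is why `100 * mean(flag == "below")` yields a percentage.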
Use dplyr functions to:
year, city, state, name, type, enroll, and mmr (there are duplicates in the data)
mutate() use weighted.mean() to calculate the mean MMR vaccination rates weighted by the enrollment. Name this new variable state_avg.
## your answer here
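A weighted average inside a grouped mutate() can be sketched on made-up data as follows. Two assumptions are made here: that duplicated rows are handled with distinct(), and that the average is computed per group. The data frame `toy` and the names `region`, `unit`, `size`, `value`, and `region_avg` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data with one duplicated row (unit "u1" appears twice).
toy <- data.frame(
  region = c("n", "n", "n", "s", "s"),
  unit   = c("u1", "u1", "u2", "u3", "u4"),
  size   = c(100, 100, 300, 50, 150),
  value  = c(90, 90, 98, 80, 100)
)

toy_avg <- toy %>%
  select(region, unit, size, value) %>%
  distinct() %>%                       # assumption: drop exact duplicate rows
  group_by(region) %>%
  # weighted.mean() computes sum(value * size) / sum(size) within each group;
  # mutate() repeats that group value on every row of the group.
  mutate(region_avg = weighted.mean(value, w = size)) %>%
  ungroup()

toy_avg
```

Using mutate() rather than summarise() keeps one row per unit, so the group average can later be mapped to an aesthetic alongside the unit-level values.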
ggplot(aes( )). Describe and summarise the chart.
question6_data %>%
  ggplot(aes( )) +
  geom_hline(yintercept = 95, linetype = "dashed", size = 0.25, color = "grey40") +
  geom_point(size = 2, alpha = .3) +
  scale_color_gradient(low = "red", high = "blue", limits = c(88, 96), oob = scales::squish,
                       guide = guide_colorbar(direction = "horizontal", title.position = "top",
                                              title = "State average immunization rate", barwidth = 15, barheight = 0.25,
                                              ticks = FALSE, title.hjust = 0.5)) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  ggtitle("MMR immunization rates at schools grouped across US cities") +
  labs(subtitle = "According to data collected by The Wall Street Journal",
       x = "Student Enrollment", y = "") +
  scale_x_continuous(labels = scales::comma)
## your answer here