Submission Details

Due date: the homework is due before class on Thursday.

Submission process: submit both the R Markdown file and the corresponding HTML file on Canvas. Upload the .Rmd and .html files separately; do not zip them together.


Weekly box office data

  1. Download the R Markdown file with these homework instructions to use as a template for your work. Make sure to replace “Your Name” in the YAML with your name.

  2. For this homework we use the data set box from the classdata package, which contains the weekly box office gross for movies from the last five years.

# devtools::install_github("haleyjeppson/classdata")
library(classdata)

head(box)
##   Rank Rank.Last.Week             Movie        Distributor    Gross Change
## 1    1              1             Joker       Warner Bros. 55861403    -42
## 2    2             NA The Addams Family     United Artists 30300007     NA
## 3    3             NA        Gemini Man Paramount Pictures 20552372     NA
## 4    4              2        Abominable          Universal  6072235    -49
## 5    5              3     Downton Abbey     Focus Features  4881075    -39
## 6    6              4          Hustlers  STX Entertainment  3887018    -39
##   Thtrs. Per.Thtr. Total.Gross Week       Date
## 1   4374     12771   193590190    2 2019-10-11
## 2   4007      7562    30300007    1 2019-10-11
## 3   3642      5643    20552372    1 2019-10-11
## 4   3496      1737    47873585    3 2019-10-11
## 5   3019      1617    82668665    4 2019-10-11
## 6   2357      1649    98052357    5 2019-10-11
  1. In class we discussed two instances where a movie was released under the same name as an earlier, different movie. Identify at least one more such instance. Report the name of the movie and search online for additional information. Describe the strategy you used to identify this movie and report the code involved.
## your answer here
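One possible starting point, sketched on made-up toy data rather than on box itself: flag titles that appear with more than one distributor, or whose weekly records contain a long gap between consecutive dates. The toy data frame, the 365-day threshold, and the names n_distributors, max_gap_days, and suspects are all assumptions for illustration, and neither heuristic is guaranteed to catch every case.

```r
# Toy data (made up for illustration; not taken from classdata::box)
toy <- data.frame(
  Movie = c("Alpha", "Alpha", "Alpha", "Beta", "Beta"),
  Distributor = c("Studio A", "Studio A", "Studio B", "Studio C", "Studio C"),
  Date = as.Date(c("2015-01-09", "2015-01-16", "2018-06-01",
                   "2016-03-04", "2016-03-11")),
  stringsAsFactors = FALSE
)

# Heuristic 1: number of distinct distributors per title
n_distributors <- sapply(split(toy$Distributor, toy$Movie),
                         function(d) length(unique(d)))

# Heuristic 2: largest gap (in days) between consecutive records per title
max_gap_days <- sapply(split(toy$Date, toy$Movie), function(d) {
  d <- sort(d)
  if (length(d) < 2) 0 else max(as.numeric(diff(d), units = "days"))
})

# Titles flagged by either heuristic (365 days is an arbitrary threshold)
suspects <- names(max_gap_days)[max_gap_days > 365 | n_distributors > 1]
suspects
```

Any title flagged this way still needs a manual check, since a re-release of the same film can produce the same pattern as two different films sharing a name.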
  1. Re-derive variables: Change (percent change in gross income from last week), Rank.Last.Week, Per.Thtr. (as gross per theater), and Total.Gross (as the cumulative sum of weekly gross).
## your answer here
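The derivations can be sketched in base R on a small made-up data frame. This is only an illustration under stated assumptions: rows for a movie are consecutive weeks with no gaps, Change is rounded to a whole percent, and the helper lag1 and the ".new" column names are hypothetical. The real data may violate these assumptions, which is part of what the next question asks you to explore.

```r
# Toy data (made up); column names mirror those in box
toy <- data.frame(
  Movie  = c("Alpha", "Alpha", "Alpha", "Beta", "Beta"),
  Date   = as.Date(c("2019-10-04", "2019-10-11", "2019-10-18",
                     "2019-10-11", "2019-10-18")),
  Rank   = c(1, 2, 2, 1, 1),
  Gross  = c(100, 50, 25, 80, 60),
  Thtrs. = c(10, 10, 5, 8, 6),
  stringsAsFactors = FALSE
)

# Order by movie and date so that "previous row" means "previous week"
toy <- toy[order(toy$Movie, toy$Date), ]

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

# Previous week's gross within each movie
gross_last <- ave(toy$Gross, toy$Movie, FUN = lag1)

# Percent change in gross from last week (rounding is an assumption)
toy$Change.new <- round(100 * (toy$Gross - gross_last) / gross_last)

# Last week's rank within each movie
toy$Rank.Last.Week.new <- ave(toy$Rank, toy$Movie, FUN = lag1)

# Gross per theater
toy$Per.Thtr.new <- toy$Gross / toy$Thtrs.

# Cumulative sum of weekly gross within each movie
toy$Total.Gross.new <- ave(toy$Gross, toy$Movie, FUN = cumsum)

toy
```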
  1. For the variables Per.Thtr. and Change, compare the original variables with the newly derived ones. Are there differences? If so, where? Try to describe any patterns you find.
## your answer here
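A comparison of this kind might start from the row-wise difference between the original and the re-derived column. The sketch below uses made-up numbers; the tolerance of 1 (meant to absorb rounding) and the column names Per.Thtr.new, Diff, and differs are assumptions.

```r
# Toy data (made up): Per.Thtr. plays the role of the original variable
toy <- data.frame(
  Gross     = c(1000, 505, 300),
  Thtrs.    = c(10, 10, 4),
  Per.Thtr. = c(100, 50, 80)
)

# Re-derived value and its difference from the original
toy$Per.Thtr.new <- toy$Gross / toy$Thtrs.
toy$Diff <- toy$Per.Thtr.new - toy$Per.Thtr.

# Flag rows that differ by more than rounding could explain, or are missing
toy$differs <- is.na(toy$Diff) | abs(toy$Diff) > 1

summary(toy$Diff)
toy[toy$differs, ]
```

Looking at which rows are flagged, and what they have in common, is one way to begin describing patterns.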
  1. Is the original variable Total.Gross strictly increasing?
## your answer here
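One way to read this question is "within each movie, does Total.Gross go up every week?". Under that interpretation, which is an assumption, a check could look like the following sketch on made-up data, where strictly_increasing is a hypothetical name.

```r
# Toy data (made up): Beta's total stalls between weeks 1 and 2
toy <- data.frame(
  Movie       = c("Alpha", "Alpha", "Alpha", "Beta", "Beta", "Beta"),
  Week        = c(1, 2, 3, 1, 2, 3),
  Total.Gross = c(100, 150, 175, 80, 80, 140),
  stringsAsFactors = FALSE
)

# Order by movie and week so diff() compares consecutive weeks
toy <- toy[order(toy$Movie, toy$Week), ]

# TRUE for a movie only if every week-to-week difference is positive
strictly_increasing <- sapply(split(toy$Total.Gross, toy$Movie),
                              function(x) all(diff(x) > 0))
strictly_increasing
```

Replacing `> 0` with `>= 0` would test for a non-decreasing sequence instead, which can help separate ties from actual drops.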
  1. Identify the three top grossing movies for each year. Plan of attack:
    • Extract the year from the Date variable.
    • Summarize the total gross for each movie and each year.
    • Find the rank of movies by total gross in each year.
    • Filter the top three movies.
## your answer here
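The plan of attack above can be sketched in base R on made-up data. The toy values and the names Year, totals, Rank, and top3 are assumptions; summing weekly Gross by calendar year is only one reading of "total gross for each movie and each year", and it splits a movie that runs across a year boundary.

```r
# Toy data (made up for illustration)
toy <- data.frame(
  Movie = c("Alpha", "Alpha", "Beta", "Gamma", "Delta", "Alpha", "Beta"),
  Date  = as.Date(c("2018-01-05", "2018-01-12", "2018-02-02", "2018-03-02",
                    "2018-04-06", "2019-01-04", "2019-02-01")),
  Gross = c(10, 5, 20, 8, 1, 7, 3),
  stringsAsFactors = FALSE
)

# Step 1: extract the year from Date
toy$Year <- as.integer(format(toy$Date, "%Y"))

# Step 2: total gross for each movie in each year
totals <- aggregate(Gross ~ Movie + Year, data = toy, FUN = sum)

# Step 3: rank within each year, largest gross first (negate to rank descending)
totals$Rank <- ave(-totals$Gross, totals$Year,
                   FUN = function(g) rank(g, ties.method = "min"))

# Step 4: keep the top three per year
top3 <- totals[totals$Rank <= 3, ]
top3 <- top3[order(top3$Year, top3$Rank), ]
top3
```

With ties.method = "min", movies tied for third place would all be kept, so a year can return more than three rows.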