class: center, middle, inverse, title-slide # Working with
dates & time --- ## deceptively tricky At first glance, dates and times seem simple, but consider these questions: 1. Does every year have 365 days? 2. Does every day have 24 hours? 3. Does every minute have 60 seconds? </br></br> Dates and times are hard because they have to reconcile: - two physical phenomena (the rotation of the Earth and its orbit around <br>the sun) - a whole raft of geopolitical phenomena including months, time zones, <br>and daylight savings time. ??? I’m sure you know that not every year has 365 days, but do you know the full rule for determining if a year is a leap year? (It has three parts.) You might have remembered that many parts of the world use daylight savings time (DST), so that some days have 23 hours, and others have 25. You might not have known that some minutes have 61 seconds because every now and then leap seconds are added because the Earth’s rotation is gradually slowing down. --- background-image: url(https://pbs.twimg.com/media/CEwVKWbWgAAzFS2.png) background-size: 600px background-position: 25% 65% ## Comprehensive map of all countries that use the <br>MM-DD-YYYY format .footnote[ From [https://twitter.com/donohoe/status/597876118688026624](https://twitter.com/donohoe/status/597876118688026624). ] --- background-image: url(https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/lubridate.png) background-size: 120px background-position: 92% 5% ## `lubridate` package for working with dates and times defines different classes of time: instants, periods, intervals, durations defines converter and accessor functions, enables time calculations .center[ <img src="images/lubridate_ymd.png" height="400"> ] --- ## date/time data Three types of date/time data: dates, times, and date-times. - If you can use a date instead of a date-time, you should. (it's simpler) - Date-times are substantially more complicated because of the need to handle time zones. ??? - time instants: one (absolute) moment in time, e.g. `now()`, `Jan-1-2000` --- ## creating date/times To get the current date or date-time you can use `today()` or `now()`: ```r today() ``` ``` ## [1] "2020-04-02" ``` ```r now() ``` ``` ## [1] "2020-04-02 09:39:43 CDT" ``` </br></br> Otherwise, there are three ways you’re likely to create a date/time: 1. From a string. (use a converter function) 2. From individual date-time components. 3. From an existing date/time object. --- ## From strings (or numbers) Want to convert strings or numbers to date-times. - easy-to-use converter functions: - date: `ymd()`, `mdy()`, `dmy()`, ... - time: `hm()`, `hms()`, ... - date & time: `ymd_hms()`, `mdy_hm()`, ... - order of letters determines how strings are parsed - separators are automatically determined, then assumed to be the same ??? - identify the order in which year, month, and day appear in your dates - arrange "y", "m", and "d" in the same order - this will gives you the name of the lubridate function that will parse your date --- ## Converter function Examples ```r ymd("2020-03-26") ``` ``` ## [1] "2020-03-26" ``` ```r mdy("03-26-2020") ``` ``` ## [1] "2020-03-26" ``` ```r mdy("March 26th, 2020") ``` ``` ## [1] "2020-03-26" ``` ```r ymd(20200326) ``` ``` ## [1] "2020-03-26" ``` --- ## Converter function Examples ```r mdy_hm("03/26/2020 09:00") ``` ``` ## [1] "2020-03-26 09:00:00 UTC" ``` <br/><br/> You can also force the creation of a date-time from a date by supplying a timezone: ```r ymd("2020-03-26", tz = "UTC") ``` ``` ## [1] "2020-03-26 UTC" ``` --- class: yourturn # Your Turn - Create date objects for today's date by typing the date in text format and converting it with one of the `lubridate` converter functions. - Try different formats of writing the date and compare the end results. - Use the appropriate converter function to parse each of the following dates: <br><br> `d1 <- "January 1, 2010"`<br> `d2 <- "2015-Mar-07"`<br> `d3 <- "06-Jun-2017"`<br> `d4 <- c("August 19 (2015)", "July 1 (2015)")`<br> `d5 <- "12/30/14"`<br> --- ## From individual components Functions to create date/time from individual components spread across multiple columns: `make_date()`, `make_datetime()` ```r flights %>% select(year, month, day, hour, minute) %>% mutate(departure = make_datetime(year, month, day, hour, minute)) %>% head() ``` ``` ## # A tibble: 6 x 6 ## year month day hour minute departure ## <int> <int> <int> <dbl> <dbl> <dttm> ## 1 2013 1 1 5 15 2013-01-01 05:15:00 ## 2 2013 1 1 5 29 2013-01-01 05:29:00 ## 3 2013 1 1 5 40 2013-01-01 05:40:00 ## 4 2013 1 1 5 45 2013-01-01 05:45:00 ## 5 2013 1 1 6 0 2013-01-01 06:00:00 ## 6 2013 1 1 5 58 2013-01-01 05:58:00 ``` --- ## From a date/time object Functions to switch between a date-time and a date: `as_date()`, `as_datetime()` .pull-left[ ```r today() ``` ``` ## [1] "2020-04-02" ``` ```r now() ``` ``` ## [1] "2020-04-02 09:39:44 CDT" ``` ] .pull-right[ ```r as_datetime(today()) ``` ``` ## [1] "2020-04-02 UTC" ``` ```r as_date(now()) ``` ``` ## [1] "2020-04-02" ``` ] --- class: yourturn # Your Turn For this question use the `flights` data from the `nycflights13` package. - Use `make_datetime()` to create a date-time variable for `dep_time` and `arr_time`. <br><br> **Hint**: use modular arithmetic `%/%` for hour and `%%` for minute --- ## Accessor functions Want to get or assign a component of a date/time object. - Use accessor functions to get a component: `year()`, `month()`, `week()`, `wday()`, `mday()`, `yday()`, `hour()`, `minute()`, ... - Assign into an accessor functionsto set elements of date and time, e.g. `hour(x) <- 12` ??? You can pull out individual parts of the date with the accessor functions --- ## Accessor function examples ```r month(now()) ``` ``` ## [1] 4 ``` ```r wday(now(), label = TRUE) ``` ``` ## [1] Thu ## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat ``` ```r wday(now(), label = TRUE, abbr = FALSE) ``` ``` ## [1] Thursday ## Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < Friday < Saturday ``` ??? For month() and wday() you can set label = TRUE to return the abbreviated name of the month or day of the week. Set abbr = FALSE to return the full name. --- ## Accessor function examples ```r flights_dt %>% mutate(wday = wday(dep_time, label = TRUE)) %>% ggplot(aes(x = wday)) + geom_bar() ``` ![](01_dates-and-time_files/figure-html/unnamed-chunk-10-1.png)<!-- --> --- ## Accessor function examples ```r (date <- now()) ``` ``` ## [1] "2020-04-02 09:39:45 CDT" ``` ```r year(date) <- 1920 date ``` ``` ## [1] "1920-04-02 09:39:45 LMT" ``` ```r hour(date) <- hour(date) + 2 date ``` ``` ## [1] "1920-04-02 11:39:45 LMT" ``` --- class: yourturn # YOUR TURN For this question, use the `flights_dt` data created in the last your turn. - Use an accessor function to calculate the average departure delay by minute within the hour. - Use ggplot2 to plot your results. --- ## Intervals, Durations, & PERIODS Three important classes that represent time spans: - **Periods** track changes in clock times, which ignore time line irregularities - make a period with the name of a time unit plyralized, e.g. `years()`, `months()` - **Durations** are potentially of relative length (months, leap year, <br>leap second, ...) - make a duration with the name of the perios prefixed with a d, e.g. `dyears()`, `dmonths()` - **Intervals** have a start and an end date/time: absolute difference - make an interval with `interval()` or `%--%` ??? periods work with "human" times, likes days and months intervals represent specific intervals of the timeline, bounded by start and end date-times durations track the passage of physical time, which deviates from clock time when irregularities occur --- ## Time span examples ```r years(1) ``` ``` ## [1] "1y 0m 0d 0H 0M 0S" ``` ```r dyears(1) ``` ``` ## [1] "31536000s (~52.14 weeks)" ``` ```r (tomorrow <- today() + ddays(1)) ``` ``` ## [1] "2020-04-03" ``` ```r # A leap year ymd("2016-01-01") + dyears(1) ``` ``` ## [1] "2016-12-31" ``` ```r ymd("2016-01-01") + years(1) ``` ``` ## [1] "2017-01-01" ``` --- ## Time span examples When you subtract two dates, you get a difftime object: ```r # How old am I? h_age <- today() - ymd(19900327) h_age ``` ``` ## Time difference of 10964 days ``` <br> As a **duration**: ```r as.duration(h_age) ``` ``` ## [1] "947289600s (~30.02 years)" ``` --- ## Time span examples ```r next_year <- today() + years(1) today() %--% next_year ``` ``` ## [1] 2020-04-02 UTC--2021-04-02 UTC ``` ```r (today() %--% next_year) / ddays(1) ``` ``` ## [1] 365 ``` ```r (today() %--% next_year) / dhours(1) ``` ``` ## [1] 8760 ``` To find out how many periods fall into an interval, you need to use integer division: ```r (today() %--% next_year) / dweeks(1) ``` ``` ## [1] 52.14286 ``` ```r (today() %--% next_year) %/% dweeks(1) ``` ``` ## Note: method with signature 'Timespan#Timespan' chosen for function '%/%', ## target signature 'Interval#Duration'. ## "Interval#ANY" would also be valid ``` ``` ## [1] 52 ``` --- ## Example: Movies ```r library(classdata) summary(box$Date) # date variable: allows date arithmetic ``` ``` ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## "2013-03-08" "2014-10-24" "2016-06-17" "2016-06-12" "2018-01-26" "2019-10-11" ``` --- ## Example: Movies (cont'd) ```r library(tidyverse) box %>% ggplot(aes(x = Date)) + geom_histogram(binwidth=7) ``` ![](01_dates-and-time_files/figure-html/unnamed-chunk-18-1.png)<!-- --> --- ## Example: Movies (cont'd) Is there a seasonal effect in the number of movies in the box office? ```r library(lubridate) box %>% ggplot(aes(x = month(Date, label=TRUE))) + geom_bar() ``` ![](01_dates-and-time_files/figure-html/unnamed-chunk-19-1.png)<!-- --> --- class: yourturn # Your Turn - inspect the `budget` data set from the `classdata` package - make sure the variable `Release Date` is a date format - plot a histogram of the variable - merge (`join`) budget and box office data (by movie name) - is the time between the release of a movie and the date is equal to the number of weeks in theaters? --- ## Resources - reference/document: https://lubridate.tidyverse.org/index.html - RStudio cheat sheet for [lubridate](https://rawgit.com/rstudio/cheatsheets/master/lubridate.pdf) - Artwork by [@allison_horst](https://twitter.com/allison_horst?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor)