Bike rentals in DC

  1. Download the RMarkdown file with these homework instructions to use as a template for your work. Make sure to replace “Your Name” in the YAML with your name.

  2. The data include daily bike rental counts (by members and casual users) of Capital Bikeshare in Washington, DC in 2011 and 2012 as well as weather information on these days. The original data sources are http://capitalbikeshare.com/system-data and http://www.freemeteo.com. Using the command below, read in the spotify data set into your R session.

bikes <- read.csv("https://raw.githubusercontent.com/Stat480-at-ISU/Stat480-at-ISU.github.io/master/homework/data/bikes.csv")
  1. Recode the variable holiday to be logical variables with 0 as FALSE and 1 as TRUE.
bikes$holiday <- as.logical(bikes$holiday)
  1. Create a variable workingday in that is FALSE if it is a holiday or the weekend (use weekday where 1 = Sunday, 2 = Monday, etc.). You may find De Morgan’s laws helpful here. Use ggplot to create a scatterplot comparing the number of registered bike rentals with the number of casual bike rentals. Map workingday to color. Interpret the result.
#first I define workingday to have TRUE value and then change the logical value false if the condition true
bikes$workingday <- TRUE
#sunday is 1 ansd saturday is 7
bikes$workingday[bikes$holiday==TRUE | bikes$weekday==7 |bikes$weekday==1] <- FALSE

ggplot(bikes,aes(x=bikes$registered,y=bikes$casual,color=bikes$workingday))+
  geom_point()

Interpret the Result:

number of casual rented bikes are greater than registered bikes in holidays and weekends while number of registered bikes are greater than casual rented bikes in weekdays.

  1. Recode the year variable so that the value 0 becomes 2011 and the value 1 becomes 2012.
bikes$year[bikes$year==0] <- 2011
bikes$year[bikes$year==1] <- 2012
  1. For each observation, verify that the variable count is equal to casual plus registered. You should be able to verify this without having to print out the columns. (Hint: one option is to use the function any())
any(bikes$count != bikes$registered+bikes$casual)
## [1] FALSE
  1. How does the number of casual riders renting bikes compare across the months? Use ggplot2 to draw side-by-side boxplots of casual by month. Interpret the result.
ggplot(bikes, aes(x=as.factor(bikes$month),y=bikes$casual))+
  geom_boxplot()

Interpret the Result:

during spring and summer( 4,5,6,7,8,9,10 Aprill to October) people rented more bikes for casual in average compared to other month in Winter and Automn (1,2,3,11,12) between all months, Augest(8) has the highest number of casual bikes rented by people in average(mean)

  1. How does the number of rentals compare for different weather conditions? Recode the variable weather to be a factor with 1 - clear, 2 - mist, 3 - light_precip. Use ggplot2 to draw side-by-side boxplots of count by weather colored by weather. Interpret the result.
bikes$weather <- as.factor(bikes$weather)

levels(bikes$weather) <- c("clear","mist","light_precip")

ggplot(bikes, aes(x=bikes$weather,y=bikes$count,fill=bikes$weather))+
  geom_boxplot()

Interpret the Result:

the average number of bikes rented in clear weather is greater than in mist weather and in light_precip weather. Also the average number of bikes rented in mist weather is greater than in light_precip weather