class: center, middle, inverse, title-slide # Intro to R --- class: inverse ## Outline R is a calculator five commands to look at objects extracting pieces a first glimpse at data --- background-image: url(https://raw.githubusercontent.com/allisonhorst/stats-illustrations/master/rstats-artwork/r_first_then.png) background-size: 500px background-position: 0% 27% .pull-left[] .pull-right[ ## The R language - Learning a new language is **hard**! - grammar, vocabulary - Loading, examining, summarizing data - Creating data - Getting help - Miscellaneous useful stuff ] --- ## R syntax Basic algebra is the same as calculator/mathematics - use explicit operators: `2*x` not `2x`, `2^p` instead of `2p` <br/> Functions are (most often) verbs, followed by what they will be applied to in parantheses: - `do_this(to_this)` - `do_that(to_this, to_that, with_those)` In my notes, I will try to refer to objects as `objects` and functions as `function()` <br/> Making and saving a variable, use `<-` or `=` - Remember: everything in R is a vector - Vectors can be constructed using concatenation (the `c()` function) --- ## Examples .pull-left[ **Math** ##### Assignments `\(x = \frac{2}{3}\)` ##### Functions `\(\sqrt{x}\)` ##### Vectors `\(y = \left( 1, 2, 3, 5\right)^t\)` ##### Indices `\(y_1\)` ##### Mathematical Operators `\(\sum_{i=1}^{4} y\)` ##### `\(2y\)` ] .pull-right[ **Code** #### `x <- 2/3` #### `sqrt(x)` #### `y <- c(1,2,3,5)` #### `y[1]` #### `sum(y)` #### `2*y` ] --- class: yourturn .center[ # Your Turn (5 min) ] - Introduce vector `\(x\)` defined as `\(x = (4, 1, 3, 9)^t\)` - Introduce vector `\(y\)` defined as `\(y = (1, 2, 3, 5)^t\)` - Calculate the Euclidean distance between the two vectors `\(x\)` and `\(y\)`, defined as `$$d = \sqrt{\sum_{i=1}^4 (x_i - y_i)^2}$$` --- ## Vocabulary - What verbs (=functions) do you need to know? - Loading data - Accessing parts of things - Statistical summaries - ... --- background-image: url(https://image.slidesharecdn.com/short-refcard-160608175856/95/reference-card-for-r-1-638.jpg?cb=1465408833) background-size: 700px background-position: 50% 170% ## R Reference Card - Download the R Reference Card from http://cran.r-project.org/doc/contrib/Short-refcard.pdf - Open/Print it so that you can glance at it while working --- ## R essentials #### Getting help within R If you want to know what a specific `command` is doing: `?command` `help(command)` `help.search(command)` #### Getting out `q()` Quits out of the console --- # R packages: <br>installing & loading Packages are installed with the install.packages function and loaded with the library function, once per session: `install.packages("package_name")` `library(package_name)` --- ## Loading class data - Some R packages have in-built datasets - For this class, there is an R package available on github - Installing/Updating `classdata` package (once every so often): ```r devtools::install_github("haleyjeppson/classdata") ``` ``` ## Skipping install of 'classdata' from a github remote, the SHA1 (f07d5f48) has not changed since last install. ## Use `force = TRUE` to force installation ``` - Make the data available (every time you start R): ```r library(classdata) ``` --- class: yourturn .center[ # Your Turn (5 min) ] Install the package `classdata` on your machine Make the package active in your current R session: `library(classdata)` Check the R help on the dataset `fbi`<br> What happens if you just type in the name of the dataset? --- ## Alternative method ... not able to install devtools or the classdata package? ```r fbi <- read.csv("https://raw.githubusercontent.com/Stat480-at-ISU/materials-2020/master/02_r-intro/data/fbi.csv") ``` --- ## Inspecting objects for object `x`, we can try out the following commands: - `x` - `head(x)` - `summary(x)` - `str(x)` - `dim(x)` <br/> Try these commands out for yourself on the `fbi` data. --- ## The fbi data - `fbi` is a data frame with columns (variables) and rows (records) ```r str(fbi) ``` ``` ## 'data.frame': 23256 obs. of 7 variables: ## $ State : Factor w/ 52 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ... ## $ Abb : Factor w/ 52 levels "AK","AL","AR",..: 2 2 2 2 2 2 2 2 2 2 ... ## $ Year : int 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 ... ## $ Population : int 3302000 3358000 3347000 3407000 3462000 3517000 3540000 3566000 3531000 3444165 ... ## $ Type : Factor w/ 8 levels "Aggravated.assault",..: 6 6 6 6 6 6 6 6 6 6 ... ## $ Count : int 427 316 340 316 395 384 415 421 485 404 ... ## $ Violent.crime: logi TRUE TRUE TRUE TRUE TRUE TRUE ... ``` --- ## Extracting parts of objects For object `x`, we can extract parts in the following manner: ``` x$variable x[, "variable"] x[rows, columns] x[1:5, 2:3] x[c(1,5,6), c("State","Year")] x$variable[rows] ``` `rows` and `columns` are vectors of indices. <br/> Try these commands out for yourself on the `fbi` data. --- ## Statistical summaries Elements of the five point summary: `mean()`, `median()`, `min()`, `max()`, `quartiles()` <br/> Other summary statistics: `range()`, `sd()`, `var()` <br/> Summaries of two variables: `cor()`, `cov()` --- class: yourturn .center[ # Your Turn ] - Look at the first 10 data records of the `fbi` data - Compute mean and standard deviation for the number of counts. Why do you get NAs? (read `?NA`) - Advanced: Read `?mean` and `?sd`, and fix missing value problem --- class:yourturn .center[ ## Your turn - practice to ask questions ] Write down questions that you could answer with this data 4 minutes by yourself, then pair up for another 3 minutes, and we'll write ideas on the board The types of crimes are defined on the UCR website, see http://www.ucrdatatool.gov/offenses.cfm --- ## Resources [R Reference Card](http://cran.r-project.org/doc/contrib/Short-refcard.pdf) Artwork by [@allison_horst](https://twitter.com/allison_horst?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor)