The objectives of this tutorial are to:
The first step is to download and install R, which is freely available from the Comprehensive R Archive Network (CRAN). You can choose any of the “mirrors” you want, but it is best to use the one that is closest to you geographically (e.g., the one at Oregon State University).
The second step is to download and install RStudio, an integrated development environment (IDE) for R. In a nutshell, it makes working with R much easier. You will want to install the RStudio Desktop version that is appropriate for your operating system. After you have installed R and RStudio, you can proceed to the next step.
R is a programming language that was developed to help researchers analyze quantitative data (e.g., do statistics). As such, R can be used for both simple arithmetic and complex statistical analyses. See below for some of the simple things that you can do with R. Note that anything following a “#” on a line is treated as a comment and ignored by R; in the examples below, lines beginning with “##” show the output that R returns.
1+2
## [1] 3
5*4
## [1] 20
18/32
## [1] 0.5625
5^2
## [1] 25
sqrt(25)
## [1] 5
In short, you can use R as a calculator if you wish.
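As a small extension of the calculator examples above, the sketch below shows how parentheses change the order of operations. It also uses two operators that are not covered elsewhere in this tutorial, %% (remainder) and %/% (integer division); both are standard R operators, included here only for illustration.

```r
# Multiplication is evaluated before addition unless parentheses say otherwise
1 + 2 * 3     # 2 * 3 is evaluated first
(1 + 2) * 3   # 1 + 2 is evaluated first

# Remainder and integer division
17 %% 5       # what is left over after dividing 17 by 5
17 %/% 5      # how many whole times 5 goes into 17
```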
We can also save values (or other objects) by assigning them to an arbitrary variable. We do this using “<-”.
VarName1 <- 5^2
Once we have saved a value (or other object), we can use it later in various ways.
print(VarName1) #display the value using the "print" function
## [1] 25
VarName1 - 5
## [1] 20
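To illustrate how saved values can be combined, here is a minimal sketch. The variable names (width, height, area, area_updated) are hypothetical and chosen only for this example. It also shows that assigning a new value to an existing name replaces the old value, while objects computed earlier are not updated automatically.

```r
# Hypothetical variable names chosen for illustration
width <- 4
height <- 2.5

# Combine stored values in a new calculation and save the result
area <- width * height
print(area)

# Assigning to an existing name replaces its old value
width <- 10

# area still holds the earlier result, so we compute a new object
area_updated <- width * height
print(area_updated)
```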
A large number of helpful packages, many of which include datasets, are available for R. To load a package so that we can use its functions and datasets, we use the function library(), which takes a package name as an argument.
library("psych")
If you haven’t installed the package “psych” yet, this command will produce an error. You can install the package using the function install.packages(), which takes the name of the package you want to install as an argument.
install.packages("psych")
After installing the package, we can then load it. Note that we do NOT have to install packages each time we use R (we only have to do that once). We do, however, have to load the package each time we use R.
library("psych")
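Because installing is needed only once but loading is needed every session, some people wrap both steps in a small helper. The sketch below is one possible way to do that, not a required part of this tutorial. The function name load_or_install() is hypothetical; requireNamespace(), install.packages(), and library() are standard R functions.

```r
# load_or_install() is a hypothetical helper name, not part of R or psych
load_or_install <- function(pkg) {
  # requireNamespace() returns TRUE when the package is already installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE tells library() that pkg holds the name as a string
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so nothing needs to be installed in this call
load_or_install("stats")
```

Calling load_or_install("psych") would install psych only if it is missing and then load it.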
If we want to get details regarding the use of a particular package, we can use the help() function, which will open the documentation for the chosen package.
help(psych)
R comes with a number of datasets pre-installed. Later on, we will load our own datasets, but for now, let’s play with one that comes with R, called “mtcars”. First, let’s see what kind of data mtcars comprises. We can do this using the help() function.
help(mtcars)
After running this code, you should see a description of the dataset in a separate window. As noted in that window, mtcars comprises data extracted from the 1974 Motor Trend US magazine, covering 32 cars (1973-74 models). We can get a statistical summary of this data by using the function summary().
summary(mtcars)
##       mpg             cyl             disp             hp
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0
##       drat             wt             qsec             vs
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000
##        am              gear            carb
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000
##  Median :0.0000   Median :4.000   Median :2.000
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
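Besides summary(), a few other base R functions give a quick look at a dataset. They are not used elsewhere in this tutorial and are shown here only as a minimal sketch of how one might inspect mtcars before analyzing it.

```r
dim(mtcars)       # number of rows (cars) and columns (variables)
names(mtcars)     # the variable names
head(mtcars, 3)   # the first three rows of the data
str(mtcars)       # compact overview of each variable's type and first values
```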
As we can see above, the dataset includes 11 characteristics for the included 32 cars. We can easily plot the relationship between some of these characteristics using the function plot(). Note that we can access particular variables in our data by using the dataframe name (e.g., mtcars) followed by a dollar sign ($) and the variable name.
plot(mtcars$mpg,mtcars$hp)
This plot seems to show a negative relationship between a car’s horsepower and its fuel efficiency (MPG), which is likely what we would expect.
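One way to put a number on the relationship seen in the plot is the correlation coefficient. The sketch below uses the base R function cor(), which is not introduced elsewhere in this tutorial; the variable name mpg_hp_cor is hypothetical.

```r
# Pearson correlation between fuel efficiency (mpg) and horsepower (hp)
mpg_hp_cor <- cor(mtcars$mpg, mtcars$hp)

# A value below zero indicates a negative relationship
print(mpg_hp_cor)
```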
One particularly useful package that we will be using a lot this term is ggplot2. The commands for installing and loading ggplot2 are included below.
install.packages("ggplot2")
library("ggplot2")
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
Once ggplot2 is installed and loaded, we can quickly make nicer (and more sophisticated) plots. Below is an example of a rather simple one: a scatter plot with a line of best fit. We will learn more about ggplot2 in upcoming classes!
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula 'y ~ x'
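The line of best fit drawn above comes from a linear model of y on x (the formula 'y ~ x' in the message). As a rough sketch of what is being fitted, the same line can be estimated directly with the base R function lm(). The object name fit is hypothetical, and lm() will be new to readers of this tutorial.

```r
# Fit the same straight line that geom_smooth() draws with method "lm":
# horsepower (y) predicted from miles per gallon (x)
fit <- lm(hp ~ mpg, data = mtcars)

# Intercept and slope of the line of best fit
coef(fit)
```

A negative slope for mpg corresponds to the downward-sloping line in the plot.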
This is the end of our introduction to R for now. More will follow in upcoming classes.