R Basics

Intro: R objects

  1. The sentences below describe three very useful objects in R that were discussed in our first meeting.
  1. A ………. is the equivalent to a spreadsheet in Excel. It almost always has two dimensions, namely, ………. and ………. This is what you will be working with most of the time.

  2. Each ………. in a ………. is equivalent to a vector. As a result, a typical column contains only one type of data: a column cannot hold both numbers and strings (unlike a row). If it does, the numbers will be coerced to character.

  3. ………. are one of the most flexible objects in R. They can have multiple dimensions as well as different types of data. As a result, you can hold both numbers and strings in a single variable.

  4. To create a vector, you use the ………. command. To create a data frame from scratch, you use ………. If you want to turn an object into a data frame (e.g., a matrix), you use ………. To create lists, you type ……….
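
The coercion mentioned in item 2 can be seen in a minimal sketch. The vector x below is hypothetical and is built with the colon operator; assigning a string to one of its elements is just one way to force numbers and strings into the same vector.

```r
# Hypothetical integer vector built with the colon operator
x <- 1:3
class(x)        # "integer"

# Assigning a string to one element coerces the whole vector to character
x[2] <- "two"
class(x)        # "character"
x               # "1"   "two" "3"
```

Because a data frame column is a vector, the same thing happens when numbers and strings end up in one column.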



  1. How would you create the following data frame from scratch and assign it to a new variable, myData? (You won’t normally do that in R, but it’s good practice.)
item    sentence                   type     condition   version       RT
1       The nurse was nervous      filler   pauseN2     a             4.628606
2       I walk every day           filler   fallingInt  b             3.510744
3       She never speaks Japanese  target   risingInt   a             2.851694
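
One possible answer is sketched below, not the only one. The variable name myData is taken from the next exercise; the values are copied from the table above.

```r
# Each argument becomes one column; all columns must have the same length
myData <- data.frame(
  item      = c(1, 2, 3),
  sentence  = c("The nurse was nervous",
                "I walk every day",
                "She never speaks Japanese"),
  type      = c("filler", "filler", "target"),
  condition = c("pauseN2", "fallingInt", "risingInt"),
  version   = c("a", "b", "a"),
  RT        = c(4.628606, 3.510744, 2.851694)
)

myData
```

Whether the string columns are stored as characters or as factors depends on your R version: since R 4.0.0, strings are kept as characters by default.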



  1. In myData, we anticipate that R will treat item as a number, not as a factor. As a result, if you use summary(myData) (or mean(myData$item)), R will return the mean for item, which makes no sense. Using item as an example, check its class and change it into a factor.
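
A minimal sketch of the idea, using a hypothetical stand-alone vector called item rather than the column in myData (the same functions should work on myData$item):

```r
# Hypothetical item identifiers stored as numbers
item <- c(1, 2, 3)
class(item)            # "numeric"
mean(item)             # 2, a meaningless "average item"

# Convert to a factor so R treats the values as labels, not quantities
item <- as.factor(item)
class(item)            # "factor"
levels(item)           # "1" "2" "3"
```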



Packages

  1. How do you install and load a package in R?



Loading your own (hypothetical) data

  1. You have just opened R. How do you check which objects are loaded in your workspace?
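
As a rough illustration, the sketch below creates two hypothetical objects (x and greeting) and then lists what is in the workspace. In a fresh session the list would simply be empty.

```r
# Two hypothetical objects, created only for this illustration
x <- 10
greeting <- "hello"

ls()              # names of the objects in the workspace, including "greeting" and "x"

rm(x)             # remove one object
"x" %in% ls()     # FALSE: x is no longer in the workspace
```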



  1. Now you want to load your data file, myFile.csv. Assume it’s located in a particular folder on your laptop: /Users/yourName/Documents/files/myFile.csv. Which command(s) could you use to load the file and assign it to the variable myData?
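
Since myFile.csv is hypothetical, the sketch below first writes a tiny stand-in CSV to a temporary location (the columns item and RT are made up) and then reads it back. With a real file, you would pass its full path, such as the one given above, to read.csv().

```r
# Stand-in for myFile.csv: write a tiny CSV to a temporary location
path <- tempfile(fileext = ".csv")
writeLines(c("item,RT", "1,4.63", "2,3.51"), path)

# read.csv() takes the path to the file as its first argument
myData <- read.csv(path)
myData
```

An alternative is to set the working directory with setwd() to the folder that contains the file and then pass only the file name to read.csv().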



Working with real data

  1. We will use the danish data set, which comes with the languageR package. Load the data and assign danish to a new (shorter) variable, dan.



  1. This question has three parts. Normally, the first thing you want to do when you load your data file is to get a general sense of its structure and dimensions (so you know the file is correct, for example). How do you: (a) visualize the first 10 rows in your data? (b) check the class of each variable? (c) print basic stats for all variables?
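
The sketch below uses mtcars, a data set that ships with base R, as a stand-in so that it runs without languageR. The same functions should apply to dan; the names d, first10 and classes are only illustrative.

```r
# mtcars ships with base R and stands in for your own data frame here
d <- mtcars

first10 <- head(d, n = 10)    # (a) first 10 rows
first10

classes <- sapply(d, class)   # (b) class of each column, as a named vector
classes

str(d)                        # (b) alternative: compact overview of the structure

summary(d)                    # (c) basic statistics for every column

dim(d)                        # number of rows and number of columns
```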



  1. A couple of things: (a) How do you print the number of columns dan has? (b) How do you print the names of all the columns? Finally, (c) create a subset that only contains the following columns: Subject, Word, LogRT, Sex, LogWordFreq, LogUP. Assign this subset to new and visualize the first rows of new.
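
Again with mtcars as a stand-in for dan, a minimal sketch of counting columns, listing their names, and keeping only some of them. The column names mpg, cyl and hp belong to mtcars, not to danish, and the names keep and small are only illustrative.

```r
d <- mtcars

ncol(d)        # (a) number of columns: 11 for mtcars
names(d)       # (b) names of all the columns

# (c) select columns by name with a character vector inside [ , ]
keep  <- c("mpg", "cyl", "hp")
small <- d[, keep]

head(small)    # first rows of the subset
```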



  1. Now that we have a simpler data frame, export new as a csv file named output.
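
A sketch of exporting a data frame with write.csv(). It assumes a small stand-in data frame taken from mtcars and writes to R's temporary directory so that nothing is left in your working folder; in practice you would pass a file name such as "output.csv".

```r
# Small stand-in data frame
small <- head(mtcars[, c("mpg", "cyl")])

# Write to a temporary folder; row.names = FALSE leaves out the row labels
out <- file.path(tempdir(), "output.csv")
write.csv(small, file = out, row.names = FALSE)

file.exists(out)   # TRUE if the file was written
```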



  1. Note that we have a column for word frequency (LogWordFreq), which has been log-transformed. To backtransform it, we can take the exponential of LogWordFreq using the exp() function. Create a new column called WordFreq that backtransforms LogWordFreq.
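
To see why exp() undoes the transformation, here is a small sketch with made-up frequencies. It assumes the natural logarithm, which is what log() computes by default; the data frame d and its column names are hypothetical.

```r
# Made-up raw frequencies
d <- data.frame(freq = c(1, 10, 100))

d$logFreq  <- log(d$freq)      # log-transform (natural log)
d$backFreq <- exp(d$logFreq)   # backtransform with exp()

all.equal(d$freq, d$backFreq)  # TRUE: we recover the original values
```

New columns are created simply by assigning to a name that does not exist yet, as with d$backFreq here.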



  1. One very useful function in R is ifelse(), which has three arguments. The first argument is the condition; the second, the result in case the condition is met; the third refers to what needs to be done if the condition is not met (i.e., the else bit).

For example:

x = 10
ifelse(x > 5, "x is greater than 5", "x is not greater than 5")
## [1] "x is greater than 5"

Because x is 10, the first argument (x > 5) evaluates to TRUE. As a result, the second argument is returned (the third argument is not evaluated in this case).
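
The example above tests a single number, but ifelse() is vectorized: given a vector, it checks the condition element by element and returns a vector of the same length. A minimal sketch with a hypothetical vector called scores:

```r
# Hypothetical numeric vector
scores <- c(2, 7, 5, 9)

# The condition is checked for each element in turn
labels <- ifelse(scores > 5, "high", "low")
labels           # "low"  "high" "low"  "high"
class(labels)    # "character"
```

This is what makes ifelse() convenient for building a new column from an existing one.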

Now, create a new column in new called isFreq. This column will have two levels: yes if the log-transformed frequency of a word is greater than 5, and no otherwise. Note that R will likely think your new column is a character, not a factor. So you also need to transform it (you can actually do it all at once by embedding functions).