R is a software language for carrying out complicated (and simple) statistical analyses. It includes routines for data summary and exploration, graphical presentation and data modelling. The aim of this document is to provide you with a basic fluency in the language. It is suggested that you work through this document at the computer, having started an R session. Type in all of the commands that are printed, and check that you understand how they operate. Then try the simple exercises at the end of each section.
When you work in R you create objects that are stored in the current workspace (sometimes called the image). Each object created remains in the image unless you explicitly delete it. At the end of the session the workspace will be lost unless you save it. You can save the workspace at any time by clicking on the disc icon at the top of the control panel.
Commands written in R are saved in memory throughout the session. You can scroll back to previous commands by using the 'up' arrow key (and 'down' to scroll forward again). You can also 'copy' and 'paste' using standard Windows editor techniques (for example, using the 'copy' and 'paste' dialog buttons). If at any point you want to save the transcript of your session, click on 'File' and then 'Save History', which will enable you to save a copy of the commands you have used for later use. As an alternative you might copy and paste commands manually into a notepad editor or something similar.
You finish an R session by typing
> q()
at which point you will also be prompted as to whether or not you want to save the current workspace. If you do not, it will be lost.
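The workspace can also be saved and restored from the command line rather than through the disc icon. The following is a minimal sketch: the file name my_session.RData and the use of a temporary directory are assumptions made purely for illustration.

```r
# Create an object so that there is something to save
x <- 6

# Hypothetical file name, placed in a temporary directory for illustration
f <- file.path(tempdir(), "my_session.RData")

# Save every object that ls() currently reports.
# save.image(f) is the usual shorthand for saving the whole workspace.
save(list = ls(), file = f)

# Remove the object, then restore it from the saved file
rm(x)
load(f)
x
```

Typing q(save = "no") or q(save = "yes") ends the session without the prompt, discarding or saving the workspace respectively.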
R stores information and operates on objects. The simplest objects are scalars, vectors and matrices. But there are many others: lists and data frames, for example. In advanced use of R it can also be useful to define new types of object, specific to a particular application. We will stick with just the most commonly used objects here.
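As a rough illustration of the object types just named, the sketch below creates one of each and asks R what it is. All of the object names and values are made up for this example, and the functions c, matrix, list and data.frame are used here ahead of any detailed explanation.

```r
s <- 3.5                           # a scalar (stored as a vector of length 1)
v <- c(5, 9, 1, 0)                 # a numeric vector
m <- matrix(1:6, nrow = 2)         # a matrix with 2 rows and 3 columns
l <- list(name = "a", values = v)  # a list can hold objects of different types
d <- data.frame(id = 1:3,          # a data frame: named columns of equal length
                score = c(6, 8, 9))

# class() reports the type of each object
class(s)
class(v)
class(m)
class(l)
class(d)
```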
An important feature of R is that it will do different things on different types of objects. For example, type:
> 4+6
The result should be
[1] 10
So, R does scalar arithmetic, returning the scalar value 10. (In actual fact, R returns a vector of length 1, hence the [1] denoting the first element of the vector.) We can assign objects values for subsequent use. For example:
> x<-6
> y<-4
> z<-x+y
would do the same calculation as above, storing the result in an object called z. We can look at the contents of the object by simply typing its name:
> z
[1] 10
At any time we can list the objects which we have created:
> ls()
[1] "x" "y" "z"
Notice that ls is actually an object itself. Typing ls would result in a display of the contents of this object, in this case, the commands of the function. The use of parentheses, ls(), ensures that the function is executed and its result (in this case, a list of the objects in the workspace) is displayed. More commonly a function will operate on an object, for example
> sqrt(16)
[1] 4
calculates the square root of 16. Objects can be removed from the current workspace with the rm function:
> rm(x,y)
for example. There are many standard functions available in R, and it is also possible to create new ones.
Vectors can be created in R in a number of ways. We can describe all of the elements:
> z<-c(5,9,1,0)
Note the use of the function c to concatenate or 'glue together' individual elements. This function can be used much more widely, for example
> x<-c(5,9)
> y<-c(1,0)
> z<-c(x,y)
would lead to the same result by gluing together two vectors to create a single vector. Sequences can be generated as follows:
> x<-1:10
while more general sequences can be generated using the seq command. For example:
> seq(1,9,by=2)
[1] 1 3 5 7 9
and
> seq(8,20,length=6)
[1]  8.0 10.4 12.8 15.2 17.6 20.0
These examples illustrate that many functions in R have optional arguments, in this case, either the step length or the total length of the sequence (it doesn’t make sense to use both). If you leave out both of these options, R will make its own default choice, in this case assuming a step length of 1. So, for example,
> x<-seq(1,10)
also generates a vector of integers from 1 to 10.
At this point it’s worth mentioning the help facility. If you don’t know how to use a function, or don’t know what the options or default values are, type help(function name), where function name is the name of the function you are interested in. This will usually help and will often include examples to make things even clearer.
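As a short sketch of the help facility in use, the commands below look up seq and then check its defaults. The function seq.default is the method R normally dispatches to for numeric arguments. The full name of the length argument is length.out, and the shorter form length used earlier works only because R accepts unambiguous abbreviations of argument names.

```r
# Open the help page for seq (typing ?seq does the same thing)
help(seq)

# List the arguments of the default seq method, with their default values
args(seq.default)

# With only a start and an end value, seq assumes a step length of 1
identical(seq(1, 10), 1:10)

# The same sequence as seq(8,20,length=6), with the argument name in full
seq(8, 20, length.out = 6)
```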
Another useful function for building vectors is the rep command for repeating things. For example
> rep(0,100)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0or
> rep(1:3,6)
[1] 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
Notice also a variation on the use of this function
> rep(1:3,c(6,6,6))
[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3
which we could also write more compactly as
> rep(1:3,rep(6,3))
[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3
As explained above, R will often adapt to the objects it is asked to work on. For example:
> x<-c(6,8,9)
> y<-c(1,2,4)
> x+y
[1]  7 10 13
and
> x*y
[1]  6 16 36
showing that R uses component-wise arithmetic on vectors. R will also try to make sense if objects are mixed. For example,
> x<-c(6,8,9)
> x+2
[1]  8 10 11
though care should be taken to make sure that R is doing what you would like it to in these circumstances.
Two particularly useful functions worth remembering are length which returns the length of a vector (i.e. the number of elements it contains) and sum which calculates the sum of the elements of a vector.
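The following sketch shows length and sum in use, and then illustrates why care is needed when vectors of different lengths are mixed: R reuses ('recycles') the shorter vector to match the longer one. The vectors a and b are made up for this example.

```r
x <- c(6, 8, 9)
length(x)            # number of elements: 3
sum(x)               # 6 + 8 + 9 = 23
sum(x) / length(x)   # the average, the same value as mean(x)

# Mixing lengths: the shorter vector is reused to match the longer one
a <- 1:6
b <- c(10, 20)
a + b                # 11 22 13 24 15 26
```

If the length of the longer vector is not a multiple of the length of the shorter one, R still carries out the calculation but issues a warning.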