Chemistry 260

Chem 351 How To... Pages

This page contains brief descriptions of various commands for using the program R. More detailed descriptions of these commands and examples are available from the Archives page.

The following convention is used to display these R commands. Required text is in bold, text entered by the user is in italics and optional arguments and text are underlined.

Getting Help with R

help(topic) – provides an explanation for the requested topic, including proper formatting and examples

help.start( ) – launches the searchable html version of help

example(topic) – executes examples demonstrating the behavior of R functions

Identifying Available Objects and Files in R

objects( ) – provides a list of all objects currently in use; also can use ls( )

dir( ) – provides a list of all files in the current directory; the identity of the current directory is determined using the command getwd( ) and set using the command setwd( ), or from the menu bar (File:Change dir… when using Windows XP or Misc:Change Working Directory… when using Mac OS X).
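
As a brief illustration of the commands above, the following sketch creates two objects and then lists the workspace and the files in the current directory. The object names x and y are made up for this example, and the files listed depend on your own working directory.

```r
# Create two objects, then list what is in the workspace
x = c(1.2, 3.4)
y = "sample A"
workspace = ls()          # objects() returns the same list
print(workspace)

# Identify the current working directory and list its files
current_dir = getwd()
print(current_dir)
files = dir()             # character vector of file names
print(files)
```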

Entering Data Into R

V = c(x, y,…) – creates the vector V whose elements are x, y, etc.

V = scan( ) – allows a vector’s elements to be entered one at a time.  R requests the next element using the notation “1:” with each entry followed by a return (Enter); a final return completes the data entry.

V = scan(file = "filename") – reads the identified file, which must be in the current directory, and assigns its elements to the vector V

M = matrix(V, nrow = nr, ncol = nc, byrow = TRUE) – creates a matrix M with nr rows and nc columns, and fills the matrix with the elements from the vector V.  Setting byrow to TRUE fills the matrix by rows; this argument is optional and defaults to FALSE if not specified, in which case the matrix is filled by columns.

DF = read.csv("filename.csv") – creates the data frame DF and fills it with the elements stored in the designated file, which must be in the current directory; this is a convenient way to import data to R using a .csv (comma separated values) file created with Excel.

DF = read.table("filename.txt", header = TRUE) – creates the data frame DF and fills it with elements stored in the designated file, which must be in the current directory.  Setting header to TRUE indicates that the first row contains names for the data frame’s columns; this argument is optional and defaults to FALSE.

load(file = "filename.RData") – loads the identified R data file, which must be in the current directory, creating the file’s corresponding objects.
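
The data-entry commands above might be used as in the following sketch. The file names and the values in the files are invented for illustration; the sketch writes them to a temporary location with tempfile( ) so that it runs without any existing files.

```r
# Build a vector and reshape it into a 2 x 3 matrix, filling by rows
V = c(1, 2, 3, 4, 5, 6)
M = matrix(V, nrow = 2, ncol = 3, byrow = TRUE)
print(M)

# Write a small .csv file to a temporary location, then read it back
csv_file = tempfile(fileext = ".csv")    # hypothetical file, for illustration
writeLines(c("conc,signal", "1.0,0.12", "2.0,0.25", "3.0,0.37"), csv_file)
DF = read.csv(csv_file)
print(DF)

# Read whitespace-separated numbers from a text file into a vector
txt_file = tempfile(fileext = ".txt")    # hypothetical file, for illustration
writeLines("0.12 0.25 0.37", txt_file)
S = scan(file = txt_file)
print(S)
```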

Displaying an Object's Values

X – prints the elements of the object X

V[n] – prints the nth element in the vector V. The term n can refer to a specific element in the vector, [1]; a range of elements, [1:3]; all elements but a specified element, [-2]; or all elements satisfying a requirement, [V < 3].

M[r,c] – prints the element in the rth row and cth column in the matrix M. The term r,c can be modified to specify all elements in a given row, [1,] or all elements in a given column, [, 2].
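
The indexing forms described above can be tried with a small vector and matrix. The values below are arbitrary and chosen only for illustration.

```r
V = c(5, 1, 8, 2, 9)
V[1]        # first element: 5
V[1:3]      # first three elements: 5 1 8
V[-2]       # everything except the second element: 5 8 2 9
V[V < 3]    # elements smaller than 3: 1 2

M = matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
M[2, 3]     # element in row 2, column 3: 6
M[1, ]      # all of row 1: 1 2 3
M[, 2]      # all of column 2: 2 5
```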

Saving Data to Your Computer

save.image(file = "filename.RData") – creates an R data file in the current directory and saves all objects in the workspace to the file; files created using save.image can be read by R but not by other programs.

save(X, Y…, file = "filename.RData") – creates an R data file in the current directory and saves the specified objects to the file; files created using save can be read by R but not by other programs.

write(X, file = "filename", ncolumns = n, append = TRUE) – creates a text file in the current directory and writes the elements of object X to the file using the specified number of columns (which defaults to five columns for numeric data if not specified); setting append to TRUE adds the object to the end of the file, allowing multiple objects to be saved to the same file, whereas the default value of FALSE overwrites any existing file.

write.csv(X, file = "filename.csv") – creates a ‘comma separated values’ file in the current directory and writes the elements for object X to the file; such files provide a way to export data to Excel.

pdf("filename.pdf") – opens a pdf file as a graphics output device; use the command dev.set(index) to select the file and dev.off(index) to close and save the file.
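
A minimal sketch of saving and restoring data follows. The object names, values, and file names are invented for illustration, and the files go to a temporary folder from tempdir( ) rather than the current directory so that the example leaves no clutter behind.

```r
X = c(1.5, 2.5, 3.5)
DF = data.frame(conc = c(1, 2, 3), signal = c(0.12, 0.25, 0.37))

# Hypothetical file names inside a temporary folder
out_dir = tempdir()
rdata_file = file.path(out_dir, "example.RData")
csv_file = file.path(out_dir, "example.csv")
txt_file = file.path(out_dir, "example.txt")

save(X, DF, file = rdata_file)                      # readable only by R
write.csv(DF, file = csv_file, row.names = FALSE)   # readable by Excel
write(X, file = txt_file, ncolumns = 3)             # plain text, one row

# Remove the objects, then restore them from the .RData file
rm(X, DF)
load(file = rdata_file)
print(X)
print(DF)
```

Setting row.names to FALSE in write.csv is a choice made here to keep R's row numbers out of the exported file; omit it if you want them included.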

Modifying Objects

rm(object) – deletes an object from the workspace, which is a useful way to avoid clutter.

V[n] = x – changes the value of the nth element of the vector V to the value given by x. The term n can refer to a specific element in the vector, [1]; a range of elements, [1:3]; all elements but a specified element, [-2]; or all elements satisfying a requirement, [V < 3].

M[r,c] = x – changes the value of the element in the rth row and cth column of the matrix M to the value given by x. The term r,c can be modified to specify all elements in a given row, [1,] or all elements in a given column, [, 2].
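
The following sketch shows these modifications on a small vector and matrix. The values and the object name scratch are arbitrary and used only for illustration.

```r
V = c(5, 1, 8, 2, 9)
V[1] = 10          # V is now 10 1 8 2 9
V[V < 3] = 0       # V is now 10 0 8 0 9

M = matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
M[2, 3] = 60       # replace a single element
M[1, ] = 0         # replace every element in row 1

scratch = 42
rm(scratch)        # scratch is no longer in the workspace
```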

Descriptive Statistics

mean(object) – provides the mean of the object’s elements

median(object) – provides the median of the object’s elements

var(object) – provides the sample variance of the object’s elements

sd(object) – provides the sample standard deviation of the object’s elements

IQR(object) – provides the object’s interquartile range; note – this value may differ slightly from that provided by other programs because there is no single accepted definition for the upper and lower fourths, FU and FL

pnorm(value, mean, standard deviation) – for a normal distribution, returns the probability of obtaining a result smaller than the stated value

qnorm(probability, mean, standard deviation) – for a normal distribution, returns the value for which the probability of obtaining a smaller result equals the stated probability; this is the inverse of pnorm
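
As a quick check on these functions, the sketch below applies them to a small, made-up data set whose results are easy to verify by hand. The IQR shown is the value from R's default quantile method.

```r
x = c(2, 4, 4, 4, 5, 5, 7, 9)
mean(x)      # 5
median(x)    # 4.5
var(x)       # sample variance: 32/7, or about 4.57
sd(x)        # square root of var(x), about 2.14
IQR(x)       # 1.5 with R's default quantile method

# For a standard normal distribution (mean 0, standard deviation 1)
pnorm(0, 0, 1)        # 0.5, half of all results fall below the mean
pnorm(1.96, 0, 1)     # about 0.975
qnorm(0.975, 0, 1)    # about 1.96, reversing the previous line
```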

Visual Displays of Data

hist(object) – creates a histogram of the object’s elements with the number of bins chosen by R.

boxplot(object 1, object 2…, names, horizontal = TRUE) – creates a boxplot of the object’s elements (for multiple objects, a boxplot is drawn for each); names is a vector containing the names of the objects, which adds labels on the x-axis when plotting more than one boxplot.  Setting horizontal to TRUE (the default value is FALSE) creates a horizontal boxplot.

qqnorm(object); qqline(object) - produces a QQ probability plot for the object and displays the normal distribution line.

layout(matrix(n:m, nr, nc, byrow = TRUE)) – partitions the graphics window into an nr x nc matrix of cells with index values from n to m; byrow determines whether the cells are filled by row or by column.
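
One way to combine these plotting commands is sketched below. The data are simulated with rnorm( ) purely for illustration, and the plots are sent to a temporary pdf file (a hypothetical output file) so that the sketch also runs where no graphics window is available.

```r
# Simulated data, for illustration only
set.seed(1)
a = rnorm(50, mean = 10, sd = 1)
b = rnorm(50, mean = 12, sd = 2)

plot_file = tempfile(fileext = ".pdf")   # hypothetical output file
pdf(plot_file)

# Split the graphics window into a 2 x 2 grid, filled by rows
layout(matrix(1:4, 2, 2, byrow = TRUE))
hist(a)
hist(b)
boxplot(a, b, names = c("a", "b"), horizontal = TRUE)
qqnorm(a); qqline(a)

dev.off()   # close and save the pdf file
```

To draw in the usual graphics window instead, leave out the pdf( ) and dev.off( ) lines.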

Creating User-Defined Functions in R

name = function(argument 1, argument 2…) {expression}
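
As an example of this pattern, the sketch below defines a function for the relative standard deviation, in percent. The function name rsd and the data are made up for illustration.

```r
# Relative standard deviation of a vector, as a percentage of the mean
rsd = function(x) {
  100 * sd(x) / mean(x)
}

results = c(9.8, 10.1, 10.0, 10.2, 9.9)
rsd(results)    # about 1.58
```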

Creating a Data Frame

DF = data.frame(object 1, object 2, object 3…)
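
For instance, two vectors of equal length can be combined into a data frame whose columns take the vectors' names. The names and values below are invented for illustration.

```r
conc = c(0, 1, 2, 3)
signal = c(0.01, 0.12, 0.25, 0.37)
DF = data.frame(conc, signal)
print(DF)

DF$signal    # the signal column, selected by name
DF[2, ]      # the second row, using the same [r, c] indexing as a matrix
```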

Miscellaneous Commands

rep(x, times) – repeats the term x for a total of times times; thus c(rep(5, 5), 1) creates a vector with elements of (5, 5, 5, 5, 5, 1)

tapply(object, index, function) - applies the function to the object after using the index to divide the object into groups

sapply(object, function) – applies the function to each element of a vector or list; for a data frame, the function is applied to each column.
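
The three commands above are illustrated in the following sketch. The results, analyst labels, and column names are invented for this example.

```r
# rep(): five copies of 5 followed by a 1
c(rep(5, 5), 1)                               # 5 5 5 5 5 1

# tapply(): mean result for each analyst
result = c(10.1, 9.9, 10.4, 10.6, 9.5, 9.7)
analyst = c("A", "A", "B", "B", "C", "C")
group_means = tapply(result, analyst, mean)   # A = 10.0, B = 10.5, C = 9.6
print(group_means)

# sapply(): mean of each column in a data frame
DF = data.frame(trial1 = c(1, 2, 3), trial2 = c(4, 5, 6))
col_means = sapply(DF, mean)                  # trial1 = 2, trial2 = 5
print(col_means)
```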

last modified on February 3, 2007
send comments to David Harvey (harvey@depauw.edu)