Chem 351 How To... Pages
This page contains brief
descriptions of various commands for using the program R. More detailed
descriptions of these commands and examples are available from the
Archives page.
The following convention
is used to display these R commands. Required text is in
bold, text
entered by the user is in italics and optional arguments
and text are underlined.
Getting Help with R
help(topic) – provides an explanation for the requested topic, including proper formatting and examples
help.start( ) – launches the searchable HTML version of help
example(topic) – executes examples demonstrating the behavior of R functions
Identifying Available Objects and Files in R
objects( ) – provides a list of all objects currently in the workspace; ls( ) does the same
dir( ) – provides a list of all files in the current directory; the current directory is identified using the command getwd( ) and set using the command setwd( ), or from the menu bar (File:Change dir… when using Windows XP or Misc:Change Working Directory… when using Mac OS X).
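As a minimal sketch of these commands, the lines below create two objects, list the workspace, and then report and change the working directory. The object names and the use of tempdir() are assumptions made only so the example runs on any computer.

```r
# Create two illustrative objects (the names are hypothetical)
conc = c(0.10, 0.20, 0.30)
signal = c(1.2, 2.3, 3.5)

# List the objects in the workspace; objects() and ls() are equivalent
ls()

# Report the current working directory
getwd()

# Change the working directory and list its files; setwd() returns the
# previous directory, which is kept so it can be restored afterwards
old_dir = setwd(tempdir())
dir()

# Return to the original working directory
setwd(old_dir)
```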
Entering Data Into R
V = c(x, y,…) – creates the vector V whose elements are x, y, etc.
V = scan( ) – allows a vector's elements to be entered one at a time. R prompts for each element using the notation "1:", "2:", and so on, with each entry followed by a return (Enter); a final return on an empty line completes the data entry.
V = scan(file = "filename") – reads the identified file, which must be in the current directory, and assigns its elements to the vector V.
M = matrix(V, nrow = nr, ncol = nc, byrow = TRUE) – creates a matrix M with nr rows and nc columns, and fills the matrix with the elements from the vector V. Setting byrow to TRUE fills the matrix by rows; this argument is optional and defaults to FALSE if not specified, in which case the matrix is filled by columns.
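The difference between filling by rows and filling by columns is easiest to see with a small vector. This is an illustrative sketch; the values and the names M_row and M_col are hypothetical.

```r
# A vector with six elements
V = c(1, 2, 3, 4, 5, 6)

# Fill a 2 x 3 matrix by rows: the first row is 1 2 3
M_row = matrix(V, nrow = 2, ncol = 3, byrow = TRUE)
M_row

# Omit byrow (default FALSE) to fill by columns: the first row is 1 3 5
M_col = matrix(V, nrow = 2, ncol = 3)
M_col
```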
DF = read.csv("filename.csv") – creates the data frame DF and fills it with the elements stored in the designated file, which must be in the current directory; this is a convenient way to import data to R using a .csv (comma separated values) file created with Excel.
DF = read.table("filename.txt", header = TRUE) – creates the data frame DF and fills it with elements stored in the designated file, which must be in the current directory. Setting header to TRUE indicates that the first row contains names for the data frame's columns; this argument is optional and defaults to FALSE.
load(file = "filename.RData") – loads the identified R data file, which must be in the current directory, creating the file's corresponding objects.
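The sketch below first writes three small files and then reads them back with scan, read.csv and read.table. The file names and contents are hypothetical, and tempdir() stands in for the current directory so the example does not leave files behind.

```r
# Write a short file of numbers and read it into a vector with scan()
num_file = file.path(tempdir(), "values.txt")
writeLines("1.5 2.5 3.5", num_file)
V = scan(file = num_file)

# Write a comma-separated file with a header row and read it with read.csv()
csv_file = file.path(tempdir(), "example.csv")
writeLines(c("conc,signal", "0.1,1.2", "0.2,2.3", "0.3,3.5"), csv_file)
DF = read.csv(csv_file)
DF

# Write a space-separated file and read it with read.table();
# header = TRUE uses the first row as the column names
txt_file = file.path(tempdir(), "example.txt")
writeLines(c("conc signal", "0.1 1.2", "0.2 2.3"), txt_file)
DF2 = read.table(txt_file, header = TRUE)
DF2
```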
Displaying an Object's Values
X – prints the elements of the object X
V[n] – prints the nth element in the vector V. The term n can refer to: a specific element in the vector, [1]; a range of elements, [1:3]; all elements but a specified element, [-2]; or all elements satisfying a requirement, [V < 3].
M[r,c] – prints the element in the rth row and cth column in the matrix M. The term r,c can be modified to specify all elements in a given row, [1,], or all elements in a given column, [, 2].
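Each form of indexing can be tried on a small vector and matrix. The values below are hypothetical and chosen so the results are easy to check by eye.

```r
V = c(5, 1, 4, 2, 3)

V[1]       # a specific element: 5
V[1:3]     # a range of elements: 5 1 4
V[-2]      # all elements but the second: 5 4 2 3
V[V < 3]   # all elements smaller than 3: 1 2

# A 2 x 3 matrix filled by columns
M = matrix(1:6, nrow = 2, ncol = 3)

M[2, 3]    # the element in row 2, column 3: 6
M[1, ]     # all elements in row 1: 1 3 5
M[, 2]     # all elements in column 2: 3 4
```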
Saving Data to Your Computer
save.image(file = "filename.RData") – creates an R data file in the current directory and saves all objects in the workspace to the file; files created using save.image can be read by R but not by other programs.
save(X, Y…, file = "filename.RData") – creates an R data file in the current directory and saves the specified objects to the file; files created using save can be read by R but not by other programs.
write(X, file = "filename", ncolumns = n, append = TRUE) – creates a text file in the current directory and writes the elements of object X to the file using the specified number of columns (which defaults to five columns for numeric data if not specified); setting append to TRUE adds the object to the end of the file, allowing multiple objects to be saved to the same file.
write.csv(X, file = "filename.csv") – creates a 'comma separated values' file in the current directory and writes the elements for object X to the file; such files provide a way to export data to Excel.
pdf("filename.pdf") – opens a pdf file as a graphics output device; use the command dev.set(index) to select the file and dev.off(index) to close and save the file.
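A short round trip shows how these commands fit together: objects are saved, removed, and restored, a data frame is exported for Excel, and a plot is written to a pdf file. This is only a sketch; the object names, file names, and the use of tempdir() in place of the current directory are assumptions.

```r
# Two illustrative objects
X = c(1.1, 2.2, 3.3)
Y = data.frame(conc = c(0.1, 0.2), signal = c(1.2, 2.3))

# Save both objects to an R data file, remove them, then restore them
rdata_file = file.path(tempdir(), "example.RData")
save(X, Y, file = rdata_file)
rm(X, Y)
load(file = rdata_file)

# Export the data frame as a .csv file; row.names = FALSE leaves out
# the column of row numbers that write.csv otherwise adds
csv_out = file.path(tempdir(), "example_out.csv")
write.csv(Y, file = csv_out, row.names = FALSE)

# Open a pdf graphics device, draw a plot, then close and save the file
pdf_file = file.path(tempdir(), "example.pdf")
pdf(pdf_file)
plot(X)
dev.off()
```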
Modifying Objects
rm(object) – deletes an object from the workspace, which is a useful way to avoid clutter.
V[n] = x – changes the value of the nth element of the vector V to the value given by x. The term n can refer to: a specific element in the vector, [1]; a range of elements, [1:3]; all elements but a specified element, [-2]; or all elements satisfying a requirement, [V < 3].
M[r,c] = x – changes the value of the element in the rth row and cth column of the matrix M to the value given by x. The term r,c can be modified to specify all elements in a given row, [1,], or all elements in a given column, [, 2].
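The same index notation used for displaying values also works for changing them. The sketch below uses hypothetical values to show each case.

```r
V = c(5, 1, 4, 2, 3)

V[1] = 10       # change the first element; V is now 10 1 4 2 3
V[V < 3] = 0    # change every element smaller than 3; V is now 10 0 4 0 3
V

M = matrix(1:6, nrow = 2, ncol = 3)

M[2, 3] = 99    # change the element in row 2, column 3
M[1, ] = 0      # change every element in row 1
M

# Remove an object that is no longer needed (tmp is a hypothetical name)
tmp = 1
rm(tmp)
```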
Descriptive Statistics
mean(object) – provides the mean of the object's elements
median(object) – provides the median of the object's elements
var(object) – provides the sample variance of the object's elements
sd(object) – provides the sample standard deviation of the object's elements
IQR(object) – provides the object's interquartile range; note that this value may differ slightly from that provided by other programs because there is no single accepted definition for the upper and lower fourths (quartiles), FU and FL
pnorm(value, mean, standard deviation) – for a normal distribution, returns the probability of obtaining a result smaller than the stated value
qnorm(probability, mean, standard deviation) – for a normal distribution, returns the value for which the probability of obtaining a smaller result equals the stated probability; it is the inverse of pnorm
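These functions can be checked against a small data set whose statistics are easy to work out by hand. The data values are hypothetical, and the pnorm and qnorm calls use a standard normal distribution (mean of 0, standard deviation of 1) as an assumption.

```r
x = c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)     # 5
median(x)   # 4.5
var(x)      # sample variance, 32/7 or about 4.57
sd(x)       # square root of the sample variance, about 2.14
IQR(x)      # 1.5 using R's default definition of the quartiles

# Probability of a result smaller than 1.96 for a standard normal
# distribution, about 0.975
pnorm(1.96, 0, 1)

# The inverse calculation: the value below which 97.5% of results fall,
# about 1.96
qnorm(0.975, 0, 1)
```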
Visual Displays of Data
hist(object) – creates a histogram of the object's elements with the number of bins chosen by R.
boxplot(object 1, object 2…, names = names, horizontal = TRUE) – creates a boxplot of the object's elements (for multiple objects, a boxplot is drawn for each); names is a vector containing the names of the objects, which adds a label for each boxplot when plotting more than one. Setting horizontal to TRUE (the default value is FALSE) creates a horizontal boxplot.
qqnorm(object); qqline(object) – produces a normal QQ probability plot for the object and displays the normal distribution line.
layout(matrix(n:m, nr, nc, byrow = TRUE)) – partitions the graphics window into an nr x nc matrix of cells with index values from n to m; byrow determines whether the cells are filled by row or by column.
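The plotting commands can be combined on a single page by calling layout first. The sketch below is illustrative only: the simulated data, the object names a and b, and the choice to draw to a pdf file in tempdir() are all assumptions.

```r
# Simulate two sets of 50 normally distributed results
set.seed(1)
a = rnorm(50, mean = 10, sd = 1)
b = rnorm(50, mean = 12, sd = 2)

# Send the plots to a pdf file so the example also runs without a screen
plot_file = file.path(tempdir(), "plots.pdf")
pdf(plot_file)

# Divide the page into a 2 x 2 grid of cells, filled by row
layout(matrix(1:4, 2, 2, byrow = TRUE))

hist(a)                                                  # cell 1
boxplot(a, b, names = c("a", "b"), horizontal = TRUE)    # cell 2
qqnorm(a); qqline(a)                                     # cell 3
qqnorm(b); qqline(b)                                     # cell 4

# Close the device to save the file
dev.off()
```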
Creating User-Defined Functions in R
name = function(argument 1, argument 2…) {expression}
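As one possible illustration of this template, the sketch below defines a function that returns the percent relative standard deviation of a set of results. The function name rsd and its digits argument are hypothetical and are not part of R.

```r
# Percent relative standard deviation, rounded to the requested digits
rsd = function(x, digits = 2) {
  round(100 * sd(x) / mean(x), digits)
}

# The standard deviation of these results is 0.2 and the mean is 10,
# so the function returns 2
rsd(c(9.8, 10.0, 10.2))
```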
Creating a Data Frame
DF = data.frame(object 1, object 2, object 3…)
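A brief sketch, using two hypothetical vectors of equal length, shows that each object becomes a column named after the object.

```r
conc = c(0.1, 0.2, 0.3)
signal = c(1.2, 2.3, 3.5)

# Combine the two vectors into a data frame with columns conc and signal
DF = data.frame(conc, signal)
DF

names(DF)    # the column names
DF$signal    # a single column, extracted by name
```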
Miscellaneous Commands
rep(x, times) – repeats the term x for a total of times times; thus c(rep(5, 5), 1) creates a vector with elements of (5, 5, 5, 5, 5, 1)
tapply(object, index, function) – applies the function to the object after using the index to divide the object into groups
sapply(object, function) – applies the function to each column in a data frame.
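The three commands above can be tried on small, made-up data. The names mass, group, and DF are hypothetical.

```r
# Repeat the value 5 five times, then append a 1
c(rep(5, 5), 1)    # 5 5 5 5 5 1

# Four results divided into two groups by an index vector
mass = c(1.0, 1.2, 2.0, 2.2)
group = c("A", "A", "B", "B")

# Mean of each group: 1.1 for A and 2.1 for B
group_means = tapply(mass, group, mean)
group_means

# Mean of each column in a data frame: 2 for x and 5 for y
DF = data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
column_means = sapply(DF, mean)
column_means
```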