ChemoSpec
ChemoSpec is an R package
for the chemometric analysis of spectra.
It consists of functions for plotting spectra (NMR, IR, etc.)
and carrying out various forms of exploratory data analysis, such as HCA
and PCA. The design allows comparison of data from samples which fall into
groups such as treatment vs. control. Robust methods appropriate for this
type of high-dimensional data are available. ChemoSpec is designed to be
very user friendly for people with limited background in R. Considerable
effort was made to ensure consistency across the various functions and plots.
You can access the tarball and source files for ChemoSpec at GitHub. The
tarball is also available at R-Forge (note: I am having Subversion issues
at R-Forge, so use GitHub for now).
ChemoSpec is composed of only R source files; nothing is compiled. Hence,
it should be platform independent.
Some of the plots that ChemoSpec can create are shown
here. These
were created using a built-in data set of IR spectra of plant cuticles. In
addition to creating plots such as these, the data sets may be edited to remove
particular samples or particular frequency ranges. Binning
of the frequency data is also provided. For more information, download
and install the package and check out the documentation. For questions,
comments or suggestions, please e-mail.
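As a rough illustration of what removing a frequency range and binning
involve, here is a minimal sketch in base R. It does not use ChemoSpec's own
functions; the data, the variable names and the bin width are all made up for
the example.

```r
# Synthetic stand-in for a spectral data set: 4 samples x 12 frequency points
# (hypothetical values, not the plant cuticle data)
freq <- seq(1000, 1110, by = 10)
spectra <- matrix(1:48, nrow = 4, byrow = TRUE)

# Remove a frequency range (here 1040-1070) assumed to carry no information
keep <- freq < 1040 | freq > 1070
freq_cut <- freq[keep]
spectra_cut <- spectra[, keep, drop = FALSE]

# Bin adjacent points in pairs: average the frequencies, sum the intensities.
# With 4 points on each side of the gap, no bin straddles the removed region.
bin_width <- 2
bin_id <- ceiling(seq_along(freq_cut) / bin_width)
freq_binned <- as.numeric(tapply(freq_cut, bin_id, mean))
spectra_binned <- t(apply(spectra_cut, 1, function(y) tapply(y, bin_id, sum)))

print(freq_binned)
print(spectra_binned)
```

Summing intensities within a bin is one common convention; averaging is
another, and which one suits a data set depends on the later analysis.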
Plotting Spectra
Spectra may be plotted offset or overlaid. The offset
can be specified, as can the vertical magnification. The location
of the sample name label can be controlled, including not plotting it
at all. The color coding is generated automatically during the
import of the data, according to the user's specification.
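The idea of an offset plot can be sketched in base R as follows. This is not
ChemoSpec's plotting function; the spectra are synthetic Gaussian peaks, and
the names offset, amplify and cols are hypothetical choices for the example.

```r
set.seed(42)
freq <- seq(400, 4000, length.out = 200)

# Three synthetic "spectra": one Gaussian peak each plus a little noise
centers <- c(1200, 1700, 2900)
spectra <- sapply(centers, function(cen) {
  exp(-((freq - cen) / 80)^2) + rnorm(length(freq), sd = 0.01)
})

offset  <- 0.5  # vertical shift between successive spectra; 0 gives an overlay
amplify <- 1.0  # vertical magnification applied before shifting

# Add 0, offset, 2 * offset, ... to columns 1, 2, 3, ...
shifted <- sweep(spectra * amplify, 2, offset * (seq_along(centers) - 1), "+")

cols <- c("red", "blue", "darkgreen")  # one color per spectrum
matplot(freq, shifted, type = "l", lty = 1, col = cols,
        xlab = "Frequency", ylab = "Intensity (offset)")
```

Setting offset to 0 overlays the spectra on a common baseline.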
Principal Components Analysis (PCA) Score Plots
Either classical or robust PCA can be carried out using various scaling
options. The PC to plot on each axis can be specified. Ellipses correspond
to the groupings specified during data import and use the same color
scheme as the samples. The ellipses may be drawn based upon classical or
robust methods (not to be confused with how the PCA is conducted), or
they may be omitted. Each point can be labeled with its sample number. The
labeling can be controlled to label all points or just the most extreme
points. A key and information about the data processing are automatically
generated.
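For readers new to PCA in R, a score plot of this general kind can be
sketched with base R's prcomp(). This is classical PCA only and is not
ChemoSpec's own interface; the two-group data set, the group names and the
colors are invented for the example, and no ellipses are drawn.

```r
set.seed(1)
n_per_group <- 10
p <- 50  # number of frequency points

# Synthetic data: two groups that differ in the first 10 variables
group <- factor(rep(c("control", "treatment"), each = n_per_group))
X <- matrix(rnorm(2 * n_per_group * p), ncol = p)
X[group == "treatment", 1:10] <- X[group == "treatment", 1:10] + 3

# Classical PCA; scale. = TRUE autoscales each variable to unit variance,
# scale. = FALSE would give centering only
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Choose which PCs go on the x and y axes
pcs <- c(1, 2)
cols <- c(control = "blue", treatment = "red")
plot(pca$x[, pcs], col = cols[as.character(group)], pch = 19,
     xlab = paste0("PC", pcs[1]), ylab = paste0("PC", pcs[2]))
legend("topright", legend = names(cols), col = cols, pch = 19)
```

Changing pcs to, say, c(2, 3) plots a different pair of components from the
same fit.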
Scree Plot of PCA Results
This is the typical means of determining how many PCs are needed to
describe the data. Both the individual and cumulative contributions
of each PC to the variance explained are plotted. A dotted line marks
the 95% level, as many researchers consider this a good explanatory
threshold. A notation is made about the data processing history.
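The quantities shown in a scree plot can be computed directly from a PCA
fit. The sketch below uses base R's prcomp() on random data rather than
ChemoSpec's own function; the variable names are chosen for the example.

```r
set.seed(2)
X <- matrix(rnorm(20 * 50), nrow = 20)  # 20 synthetic samples, 50 variables
pca <- prcomp(X, center = TRUE)

# Percent of total variance carried by each PC, and the running total
var_ind <- 100 * pca$sdev^2 / sum(pca$sdev^2)
var_cum <- cumsum(var_ind)

# First PC at which the cumulative variance reaches 95%
n_pc_95 <- which(var_cum >= 95)[1]

plot(var_cum, type = "b", ylim = c(0, 100),
     xlab = "Principal component", ylab = "Percent variance")
lines(var_ind, type = "b", col = "red")
abline(h = 95, lty = 3)  # dotted 95% reference line
```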
Bootstrap Analysis of the Number of Principal Components
A bootstrap or cross-validation approach is taken in which some samples
are used to compute the PCs and others are used to check the results. This
is an alternate means of deciding how many PCs should be kept for further
work. Bootstrap analysis is only available for classical PCA.
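The general idea of computing PCs from some samples and checking them on
others can be sketched with a single hold-out split in base R. This is a
simplified stand-in, not necessarily the procedure ChemoSpec uses; the
synthetic data (built with three underlying components) and all names are
assumptions made for the example.

```r
set.seed(1)
n <- 40
p <- 40

# Synthetic data with 3 real components plus a little noise
latent <- matrix(rnorm(n * 3), n, 3) %*% matrix(rnorm(3 * p), 3, p)
X <- latent + matrix(rnorm(n * p, sd = 0.1), n, p)

# Use 30 samples to compute the PCs and hold out the other 10
train <- sample(n, 30)
fit <- prcomp(X[train, ], center = TRUE, scale. = FALSE)

# Center the held-out samples with the training means
X_test <- scale(X[-train, ], center = fit$center, scale = FALSE)

# Reconstruction error of the held-out samples using the first k PCs
press <- sapply(1:10, function(k) {
  V <- fit$rotation[, 1:k, drop = FALSE]
  resid <- X_test - X_test %*% V %*% t(V)
  sum(resid^2)
})

print(round(press, 2))
```

With this simple scheme the error can never increase as k grows, so the
useful signal is where the curve levels off rather than where it reaches a
minimum.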
PCA Diagnostics
Two different plots are available for PCA diagnostics, as a means of
identifying potential outliers.
Loadings Plots
Loadings to be plotted may be specified, as can a reference spectrum. The
gap that you see here in the loadings is due to some data being
removed from regions in the IR spectrum that don't carry any information.
Hierarchical Cluster Analysis (HCA)
HCA can be performed using the clustering options available in R. The
result is plotted with the same color coding by sample as in the other
plots. The clustering method employed is also noted on the plot.
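The clustering options available in R that this refers to are typically
those of dist() and hclust(). A minimal base R sketch on synthetic two-group
data follows; it is not ChemoSpec's own function, the sample labels are
invented, and the dendrogram labels are not color coded here.

```r
set.seed(3)
group <- rep(c("control", "treatment"), each = 5)

# Synthetic data: 10 samples x 30 variables, treatment shifted upward
X <- matrix(rnorm(10 * 30), nrow = 10)
X[group == "treatment", ] <- X[group == "treatment", ] + 2
rownames(X) <- paste0(substr(group, 1, 1), 1:10)  # labels such as c1, t6

# Distance matrix and agglomerative clustering
d <- dist(X, method = "euclidean")
hc <- hclust(d, method = "complete")

# Dendrogram, with the clustering method noted under the plot
plot(hc, main = "HCA", xlab = "",
     sub = paste("clustering method:", hc$method))

# Cut the tree into two clusters and compare with the known groups
clusters <- cutree(hc, k = 2)
print(table(group, clusters))
```

Other linkage choices, such as "average" or "ward.D2", can be passed to the
method argument of hclust().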
3D display of PCA Scores
ChemoSpec can display 3D plots of scores, with a simple mechanism
to change the view. Better still, ChemoSpec data can be passed to
the interactive display system GGobi for tours of any number of
PCs, including projection pursuit methods. The resulting views
can be saved to a graphics file.
Projection pursuit from GGobi
This page maintained by Bryan Hanson, Dept of Chemistry & Biochemistry.
Last update: October 22, 2009