-
Introduction
-
What is a
Response Surface? -
Searching
Algorithms -
Empirical
Models -
Case
Studies
-
Practice
Problems -
Further
Study
Glossary
A qualitative spot test for vanadium is the formation of a reddish brown color in the presence of H2O2 and H2SO4. Suppose you wish to develop a quantitative spectrophotometric method for vanadium. Adapting the spot test into a quantitative method requires studying how the amounts of H2O2 and H2SO4 affect the absorbance at a particular wavelength. The optimum condition is a strong absorbance that is not particularly sensitive to a small uncertainty in the measurement of H2O2 or H2SO4. Finding this optimum condition can be challenging, but the application of response surface methodology can provide a relatively straightforward solution.
The goal of this learning module is to explore several approaches to solving this problem. Topics covered include:
- What is a response surface?
- Searching algorithms, including one-factor-at-a-time and simplex optimization.
- Building an empirical model of a response surface.
- Case studies providing practical examples.
- Problems that provide practice in the basic calculations.
- Resources for further study, including texts, articles, and lab experiments.
Use the quick links at the top to move through the module's major sections. A glossary provides definitions for terms in red.

Response Surface Methodology by David Harvey is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
Terminology is important, so before turning to the specifics of response surface methodology we need a common vocabulary. A response, for example, is the property of a system that we wish to study. Consider the problem of optimizing a quantitative method for vanadium. Because we plan to monitor the solution's absorbance, this is our response. The absorbance, in turn, depends on the amounts of H2O2 and H2SO4 that we use. We identify hydrogen peroxide and sulfuric acid as factors and their amounts are called factor levels.
A plot showing how the response changes with respect to changes in the factor levels is a response surface, a typical example of which is shown here as a three-dimensional perspective plot:


Although we cannot easily display a response surface when it is a function of three or more factors, the approaches in this module are general and easily applied to more complex systems.
Independent and Dependent Factors
An important issue in response surface methodology is whether factors are independent or dependent. Two factors are independent when a change in the level of one factor does not change how the other factor affects the response. An example of two independent factors is shown below where the parallel lines indicate that the effect of factor A does not depend on the level of factor B.

Exercise
The file "Response Surfaces" contains examples of six response surfaces, some of which involve independent factors and some of which are a function of dependent factors. Open the file and provide answers to the following:
- For each response surface, indicate the factors
affecting the response and whether the factors are independent or dependent.
- In the case of dependent factors, the interaction between
the factors can be positive (a higher level for both factors gives a greater
response) or negative (a higher level for both factors gives a smaller
response). For those response surfaces with dependent factors, indicate
whether the interaction is positive or negative.
- For one of the six response surfaces, the response depends
on the square of each factor's level.
Identify which of the six responses surfaces shows this effect and explain
the reasons for your choice.
- Using your answers to questions 1 - 3 as a starting point, write a brief description of the properties of independent and dependent response surfaces. Your answer should allow anyone to look at a response surface and to classify it properly.
Finding the Optimum Response
The goal in any analysis of a response surface is to find factor levels producing an optimum response, or at least a response that is acceptable. Often, as is the case for the analysis of vanadium, the goal is to find the response surface's maximum point. In other cases, however, we may be seeking a minimum response, such as the smallest error. Having a three-dimensional perspective plot or a contour plot makes it relatively easy to find the optimum and to evaluate its suitability, but we are unlikely to know exactly how the response varies with respect to the factors without some experimentation. There are two broad experimental strategies for locating the optimum on a response surface: searching methods, which use an algorithm to systematically search through the response surface, and modeling methods, in which we build a suitable mathematical model of the response surface.
To continue, use the links at the top of the page.
Imagine that you want to climb to the top of a mountain. Because the mountain is covered with trees that obscure its shape, the summit is not visible. Nevertheless, you can reach the summit if each step moves you in the direction of the steepest slope. Your path to the summit is an example of a searching algorithm. Because your searching algorithm is not the only possible algorithm, a hiker using a different algorithm may find a different path to the summit. The final result, however, is the same.
Knowing that there are multiple paths to the mountain's summit suggests that not all paths are equally desirable. Some paths may be too long, while others may be too steep. Before examining different types of searching algorithms, we will consider two important ways to evaluate them: effectiveness (Did you reach the top of the mountain?) and efficiency (Did you quickly get to the top of the mountain?).
Effectiveness and Efficiency
An effective searching algorithm finds the factor levels corresponding to the response surface's global optimum, or at least a point close to the global optimum. An ineffective searching algorithm, however, stops before reaching the response surface's maximum response. When climbing a mountain, for example, there are several problems, that may prevent you from reaching the summit:
A poorly designed algorithm – An algorithm that allows you to move only north, south, east or west may force you to stop at a false optimum if the next step must follow a steep ridge to the northeast.
Uncertainty in measuring the response – If you are moving across rough terrain, where the small scale variations in elevation over the distance of a single step exceed the large scale variation over several steps, then you may become stuck far from the global optimum.
The presence of a local optimum – What if the mountain you are climbing has two peaks? If you climb the smaller of the peaks (a local optimum), your searching algorithm may not allow you to descend from this peak and to search for the global optimum.
An efficient searching algorithm finds the global optimum in less time and while using fewer resources. Algorithms that allow for larger moves across the response surface will arrive at the optimum more quickly.
There is a trade-off between efficiency and effectiveness. In general, algorithms that take smaller steps require more cycles to reach the optimum; taking larger steps allows for a more efficient approach to the optimum, but at the expense of possibly arriving at a local optimum rather than the global optimum
One-Factor-at-a-Time Optimization
Perhaps the simplest, and most common searching algorithm, is one in which we change the level of one factor while holding constant the levels of all other factors. We might, for example, begin climbing a mountain by moving north or south until we reach a point where neither direction leads to a higher elevation. Then, we try moving east or west until that, too, no longer leads to a higher elevation. Next, we try moving north or south and continue in the same manner until we can no longer move in any direction. This is called a one-factor-at-a-time optimization.
Simplex Optimization
One way to increase the efficiency of a searching algorithm is to change simultaneously the levels of several factors. For two factors, such as longitude (north/south position) and latitude (east/west position), we begin by measuring the response (elevation) at three locations that form the vertices of a triangle.

To move toward the optimum, we reject the vertex with the smallest response and replace it with a new vertex based on a predetermined set of rules. We again compare the response of the three locations, reject the smallest and select a new location, continuing the process until no further increase in response is possible. We call the triangular shape formed by the three vertices a simplex and the algorithm is a simplex optimization.
Fixed-Size Simplex Optimization
The basic simplex optimization maintains the size of the original simplex (and, for obvious reasons, is called a fixed-size simplex optimization). First, we rank the vertices in terms of their respective responses, identifying them as Worst, Next-to-Worst (or Next), and Best. After rejecting the vertex with the smallest response, we create a new vertex by reflecting through the centroid of the remaining two vertices. In the figure below, for example, the vertex labeled Reflection replaces the vertex labeled Worst, which has the lowest response. The new simplex is shown by the solid lines.

Next, we evaluate the new simplex, replacing the vertex with the lowest response with a new vertex, as shown below.
The process continues until the simplex rotates about a single vertex (which, as shown below, is the optimum), or otherwise reaches a position where no further movement is possible.

Variable-Size Simplex Optimization
Although more effective and efficient than a one-factor-at-a-time optimization, the size of the initial simplex limits the efficiency of a fixed-sized simplex optimization. To improve efficiency, we can modify the algorithm and allow the simplex to expand or contract in size based on the relative change in the response (thus, the name variable-size simplex optimization). For example, suppose we reject the vertex labeled Worst and replace it with a new vertex, as we did earlier.
If the response of the new vertex (Reflection) falls between the responses for the vertices Best and Next, then we continue using the new simplex . Suppose, however, that the response at the new vertex is worse than all three of the original vertices. Reflecting back to to the vertex labeled Worst will not help as we already know that its response is unacceptable. Instead, we contract the new simplex by reflecting the new vertex back to the vertex labeled Cw, giving a simplex with vertices of Best, Next, and Cw. If the newly reflected vertex is worse than the vertices Best and Next, but better than the vertex Worst, then we contract the simplex Cr and use the vertices Best, Next, and Cr as the new simplex.

On the other hand, if the response for the reflected vertex is better than the response for the vertex Best, we expand the simplex, as shown below, and use this as the new simplex.

Boundary Conditions
When optimizing experimental conditions there usually are limitations, or boundary conditions, on the factors. For example, in the analysis for vanadium, the concentrations of hydrogen peroxide and sulfuric acid cannot be less than 0 M. For obvious reasons, a vertex requiring a concentration of -0.30 M H2O2 is impossible.
For a one-factor-at-a-time optimization, reaching a boundary forces the next move to be in a direction that is 90o or 270o to the previous direction.
During a simplex optimization, when reflection or expansion leads to a vertex that lies beyond the boundaries of the response surface, the simplex must move in another direction. For a fixed-size simplex this can be handled by returning to the original simplex and treating the next-to-worst response as if it is the worst response. This forces the simplex to move in a different direction. In a variable-size simplex optimization, if reflection moves beyond the boundaries the reflected point can be treated as if its response is worse than the response of the other vertices and the simplex contracted, as described above. If expansion of the initial reflection goes beyond the boundaries, then the reflected point simply is retained.
Handling an Oscillating Simplex
One problem with a fixed-size simplex optimization is that it will oscillate back-and-forth when the reflected point has the worst response in its simplex. This is not a problem when the simplex has found the response surface's optimum value (see below). It is a problem, however, when the simplex becomes stuck on a ridge. To prevent the simplex from oscillating, when the response for the reflected vertex is less than the vertex it is replacing, the original simplex is retained and the next-to-worst vertex is treated as having the worst response.
When to Stop the Optimization
A fixed-size simplex optimization eventually will reach a position where the simplex rotates around a fixed point or oscillates back-and-forth, both of which naturally suggest a stopping point. A variable-size simplex, however, can continue indefinitely to expand and contract unless the user stops the optimization. An easy way to do this is to select a desired threshold response; once that response is achieved, the optimization is stopped.
Finding the Global Optimum
One problem with any optimization strategy is ensuring that it finds the response surface's global optimum and not a local optimum. For this reason, it is a good idea to begin the optimization from several different starting positions and/or with different step-sizes or using simplexes with different initial sizes. If all optimizations end in the same general location, then that location most likely is the global optimum; however, if the optimizations end in different locations, the relative responses will indicate the general location of the global optimum.
Exercise
The following exercise will help you visualize the progress of all three searching algorithms (one-factor-at-a-time, fixed-size simplex, and variable-sized simplex) and help you evaluate their respective effectiveness and efficiency. Click on this link to open a java applet in a new window. Scroll down and read the applet's instructions and then experiment with the controls to gain a feel for how the applet works. When you are comfortable with the applet's interface, answer the following:
- Describe the mathematical form of the response surfaces. Do any of the response surfaces have dependent factors and, if so, is the interaction positive or negative? Do any of the response surfaces include second-order effects in one or more factors?
- How sensitive is each optimization algorithm to the starting point for the optimization and the size of either the initial simplex or the size of a step? In developing an answer, consider both effectiveness and efficiency.
- Are there response surfaces for which the optimizations are particularly sensitive to the initial position? If so, what are the characteristics of these response surfaces?
- One strategy for working with a fixed-size simplex is to run two or more passes. For the first pass use a large initial simplex. For the second pass, begin with a smaller simplex located near the optimum from the first pass. If desired, try a third pass using an even smaller initial simplex. Compare this algorithm to the variable-size simplex algorithm.
Mathematical Details
The mathematical details behind the fixed-size and variable-size simplex optimization are relatively straightforward, although carrying out the calculations by hand can be tedious. Consider the simplex shown here using solid lines and vertices identified as worst (W), next-to-worst (N), and best (B), each with associated xy-coordinates.

Reflecting W through the centroid of the remaining vertices (shown as the black dot on the line between B and N), gives the reflected vertex R. The difference between the y-coordinate of the worst vertex and the y-coordinate of the centroid is
The y-coordinate for R, which is equidistant from the centroid, is
The calculation for the x-coordinate of R follows the same form; thus
As discussed above, for a variable-size simplex optimization, we also consider expanding or contracting the simplex, as shown here.

The equations for the coordinates of E, Cr, and Cw are:
A more detailed derivation of these equations can be found using the references found on the Further Study tab.
To continue, use the tabs at the top of the page.
A searching algorithm provides an effective and efficient method for finding the optimum, but it does not provide much information about the response surface’s shape, particularly in regions unexplored by the simplex. Finding the optimum is important, but so is knowing whether the response at the optimum is particularly sensitive to small changes in factor levels. Imagine, for example, that you are looking for the perfect position for taking a panoramic picture from the top of a mountain. Standing on a large, fairly flat surface probably is a better choice than standing on a small ledge that drops off sharply on all sides.
A mathematical model of a response surface provides more information because we can visualize the response surface’s shape. To create the model we collect values of the response for a range of factor levels and then use the data to build a mathematical model of the response surface. There are two common types of models: theoretical models, in which the underlying theoretical relationship between the response and the factors is well known, and empirical models, in which we try to find the simplest polynomial equation that adequately explains the data. We will consider empirical models only in this module.
2k Factorial Experimental Designs
Building an empirical model requires an appropriate experimental design. The minimum requirement is to measure the response at two levels for each factor. For convenience these levels are labeled high or "+," and low or "-." All together, 2k experiments, where k is the number of factors, are needed for a 2k factorial design.
In a factorial design, each factor’s high and low level is paired with the high and low levels of each other factor. With two factors, for example, a minimum of four experiments is needed
|
![]() |
||||||||||||||||||
A 22 factorial design, therefore, can fit an empirical model of the form
Response = bo + baA + bbB + babAB
where bo, ba, bb, and bab are the model's coefficients and A and B are the levels for the two factors. Note that the number of experiments equals the number of coefficients, which means that there is just enough information to solve for the four factors, a point to which we will return. For a system with three factors, a 23 factorial design requires a minimum of eight experiments
|
![]() |
||||||||||||||||||||||||||||||||||||||||
and gives an empirical model of
Response = bo + baA + bbB + bcC+ babAB + bacAC + bbcBC + babcABC
Coded vs. Uncoded Factor Levels
With 2k measurements and 2k coefficients it is easy to calculate by hand the exact solution for an empirical model using a 2k factorial design (see section below on mathematical details). This task is easier, however, if we first transform the factor levels into coded form by defining each factor's high level as +1 and its low level as -1. Coding also has the advantage of scaling the factors to the same magnitude, which makes it easier to evaluate each factor's relative importance over the range of factor levels being explored. Furthermore, when using coded factor levels the intercept, bo, of the resulting model corresponds to the center of the experimental design, which typically is a more useful piece of information than an intercept that is far removed from the factor levels and that may represent experimental conditions that are impractical. When using coded factor levels, the empirical model is said to be in coded form.
It is easy to covert back-and-forth between uncoded factor levels and coded factor levels. For example, if in the spot test for vanadium we are exploring the effect of the concentration of H2O2 we might let a coded level of -1 represent an actual concentration of 1% (v/v) and let +1 represent an actual concentration of 2% (v/v). A coded level of 0, therefore, corresponds to a concentration of 1.5% (v/v), and a concentration of 0.5% (v/v) corresponds to a coded level of -2.
Exercise
The following exercise will help you visualize the relationship between the factor levels in a 22 factorial design and the resulting response surface. Click on this link to open a java applet in a new window. Scroll down and read the applet's instructions and then experiment with the controls to gain a feel for how the applet works. When you are comfortable with the applet's interface, answer the following:
- Consider the following responses: HA-HB 100, HA-LB 80, LA-HB 60, LA-LB 40. Make a prediction about whether the factors are dependent or independent, and predict the shape of the response surface. Check your predictions by entering the responses and running the model. Based on this model, what combination of factor levels corresponds to the optimum?
- Consider the following responses: HA-HB 80, HA-LB 80, LA-HB 60, LA-LB 20. Make a prediction about whether the factors are dependent or independent, and predict the shape of the response surface. Check your predictions by entering the responses and running the model. Based on this model, what combination of factor levels corresponds to the optimum?
- In a study of the aluminum's influence on the determination of calcium by atomic absorption, the 22 factorial design shown below was used. If you have a sample that is 6.00 ppm in Ca2+, what is the predicted response in the absence of aluminum and in the presence of 30.0 ppm Al3+? This data comes from Koscielniak, P.; Parczewski, A. Anal. Chim. Acta 1983, 153, 111-119.
[Ca2+] (ppm) |
[Al3+] (ppm) |
Response (arb. units) |
10.00 |
160.0 |
54.29 |
10.00 |
0 |
98.44 |
4 |
160.0 |
19.18 |
4 |
0 |
38.53 |
Central Composite Experimental Designs
One limitation to a 2k factorial design, and it is an important limitation, is that it includes only first-order effects for the factors. We can use a 22 factorial design, for example, to model the equation
Response = bo + baA + bbB + babAB
which includes first-order terms for factors A and B, and a first-order interaction between A and B. We do not have sufficient information, however, to include second-order terms for factors A and B
Response = bo + baA + bbB + baaA2 + bbbB2 + babAB
because four experiments do not provide sufficient information to determine the model's six coefficients. Testing a factor at only two unique levels allows for a first-order model's linear relationship between response and factor, but if does not allow for the curvature of a second-order polynomial model. To include such curvature in a model, we must gather data for at least three levels of each factor. An efficient approach for accomplishing this is the central composite design shown below, which allows for five levels for each factor, is shown below. Typically a is chosen so that all points, other than the center point, are equidistant from the center point; for two factors this requires an a of 1.414 and for three factors an a of 1.682. In some cases a takes on other values due to experimental limitations in setting the factor levels.
|
![]() |
|||||||||||||||||||||||||||||||||
Other experimental approaches, such as Box-Behnken designs and Doehlart designs, can be used in place of a central composite design; consult the references on the tab for Further Study for additional details.
Exercise
The following exercise will help you visualize the relationship between the factor levels in a two-factor central composite design and the resulting response surface. Click on this link to open a java applet in a new window. Scroll down and read the applet's instructions and then experiment with the controls to gain a feel for how the applet works. When you are comfortable with the applet's interface, answer the following:
- Consider the following responses: HA-LB 21, HA-HB 19, LA-LB 15, LA-HB 5, Top-Star 22.07, Bottom-Star 7.93, Right-Star 10.76, Left-Star 19.24, Center 15.00. Make a prediction about whether the model will show curvature in the response surface. Check your predictions by entering the responses and running the model. Based on this model, what combination of factor levels corresponds to the optimum?
- Consider the following responses: HA-LB 28, HA-HB 26, LA-LB 22, LA-HB 12, Top-Star 21.07, Bottom-Star 6.93, Right-Star 25.76, Left-Star 34.24, Center 20.00. Make a prediction about whether the model will show curvature in the response surface. Check your predictions by entering the responses and running the model. Based on this model, what combination of factor levels corresponds to the optimum?
- The following results were obtained in a study of reaction temperature and reactant concentration on the yield of a chemical reaction: HA-LB 78, HA-HB 73, LA-LB 43, LA-HB 69, Top-Star 76, Bottom-Star 48, Right-Star 74, Left-Star 74, Center 80. The +1 and -1 levels for temperature are 250o C and 200o C, and the +1 and -1 levels for the reactant's concentration are 25% and 15%. If you carry out the reaction at a temperature of 200o C and a reactant concentration of 22%, what is the expected yield? This data comes from Myers, R. H.; Montgomery, D. C. Response Surface Methodology, Wiley-Interscience: New York, 2nd Edition, 2002.
Mathematical Details
The calculations for a 2k factorial design are straightforward when using coded factor levels. For example, in a 22 factorial design the four experiments give the following four equations:
R1 = bo + ba + bb + bab
R2 = bo + ba - bb - bab
R3 = bo - ba + bb - bab
R4 = bo - ba - bb + bab
where Ri is the response for the ith experiment. Note the terms ba, bb, and bab each appear in two equations with multiplicative factors of +1 and in two equations with multiplicative factors of -1. Because of this it is possible to add and subtract the equations in a manner than eliminates all terms but one. For example, the value of bo is found by adding together the four equations and dividing by 4
![]()
and the value for ba is found by adding together the first two equations, subtracting the last two equations, and dividing the result by 4
![]()
In the same manner, the equations for bb and bab are
![]()
![]()
We can generalize this approach to any 2k factorial design as
![]()
where bf is the coefficient for factor F, and Ri is the response and Fi is the coded factor level for the ith experiment.
The calculations for a central composite experimental design require the use of a regression analysis. You can do this in an Excel spreadsheet by creating columns for A, B, A2, B2, AB, and the response
A |
B |
A2 |
B2 |
AB |
Response |
+1 |
-1 |
+1 |
+1 |
-1 |
R1 |
+1 |
+1 |
+1 |
+1 |
+1 |
R2
|
-1 |
-1 |
+1 |
+1 |
+1 |
R3 |
-1 |
+1 |
+1 |
+1 |
-1 |
R4 |
+1.414 |
0 |
2 |
0 |
0 |
R5 |
-1.414 |
0 |
2 |
0 |
0 |
R6 |
0 |
+1.414 |
0 |
2 |
0 |
R7 |
0 |
-1.414 |
0 |
2 |
0 |
R8 |
0 |
0 |
0 |
0 |
0 |
R9 |
Select Data Analysis... from the Tools menu and then Regression from the pop-up window. From the Regression window, place the cursor in the box for Input Y Range and then click and drag over the cells containing the responses. Next, place the cursor in the box for Input X Range and then click and drag over the cells containing the columns for A, B, A2, B2, and AB. If you included the row of labels in your clicking and dragging, then check the box for Labels. Click in the box for the Output Range and click on an empty cell in your spreadsheet below your data table. Click on the button labeled OK and the results are displayed in your spreadsheet as a series of tables, the last of which provides the names of each coefficient and its value. There is a great deal of additional information in these tables, including probability values that help in determining whether the individual coefficients are significant; coefficients with p-values of less than 0.05 generally are considered significant. Consult the references on the tab for Further Study for additional details.
Evaluating the Model
The calculations outlined here yield a values for each coefficient in the empirical model. Just because we can calculate a value for a coefficient, however, does not mean that the coefficient is important in predicting the response. For 2k factorial designs, where the number of experiments equals the number of factors, there is insufficient information to set a firm threshold for when a coefficient is significant; nevertheless, when working with coded factor levels the relative magnitudes of the coefficients can be used to evaluate the relative importance of the coefficients. In the case of a central composite design, there is enough information for a statistical evaluation of a coefficient's significance. Consult the references on the tab for Further Study for additional details.
An important limitation to an empirical model of a response surface is the assumption that the model provides an appropriate explanation of how the factors affect the response. One simple test of an empirical model is to run an experiment at factor levels that fall within your design and compare the experimental result to that predicted by the model using a t-test. If the agreement between the two is acceptable then your confidence in the model increases. For a 2k factorial design, for example, which models first order effects only, running several experiments with each factor at a coded level of 0 and comparing the results to bo tests for the presence of possible second-order effects. Extrapolating an empirical model to factor levels outside the range of your experimental design is risky. Consult the references on the tab for Further Study for additional details.
To continue, use the tabs at the top of the page.
The analytical literature is rich with papers in which optimizating a response surface plays an important role. Three representative case studies are presented and discussed here.
Case Study 1: Variable-Sized Simplex Optimization
"Simplex Optimization of Ion-Pair Reversed-Phase High Performance Liquid Chromatographic Analysis of Some Heavy Metals"
S. Srijaranai, R. Burakham, R. L. Deming, T. Khammeng, Talanta 2002, 56, 655-661.
An important analytical method for determining the concentration of metal ions is ion-pair reverse-phase HPLC with UV-visible detection. To enhance the sensitivity of the analysis, the metal ions are first complexed with a chelating agent, such as PAR, a heterocyclic azo dye that forms strong, highly absorbing complexes with many metal ions. Separation of the resulting metal ion-PAR complexes is accomplished using a reverse-phase HPLC separation in which the mobile phase includes an ion-pairing agent. For a given column, mobile phase parameters such as pH, buffer concentration, organic modifier concentration, and the concentration of the ion-pairing agent are expected to affect the quality of the separation. The goal of the research presented in this paper is to explore the utility of a variable-size simplex optimization as an efficient means for identifying the optimum mobile phase conditions for the separation of a mixture of Co2+, Ni2+, and Cr3+ when using tetrabutylammonium bromide, TBABr, as an ion-pairing reagent.
Preliminary studies that three mobile phase factors were important: the percentage of acetonitrile, CH3CN, used as an organic modifier, the total concentration of acetic acetic acid and acetate used to buffer the mobile phase pH, and the concentration of TBABr. The pH of the mobile phase was not an important factor within the range of 5.0 to 6.3. The quality of the separation was judged using a chromatographic response function, CRF, which, for this study, considers the resolution between adjacent peaks, the total number of peaks detected, and whether the metal ions elute within specified time limits. Larger values for the CRF correspond to a better response.
Because the study includes three factors, the simplex is a tetrahedron instead of a triangle, as seen earlier for optimizations involving two factors. The initial vertices and response are shown here:
Vertex |
%CH3CN |
[buffer] (mM) |
[TBABr] (mM) |
CRF |
1 |
30.0 |
3.0 |
1.0 |
25.54 |
2 |
33.0 |
3.5 |
1.2 |
23.00 |
3 |
31.0 |
5.0 |
1.2 |
23.60 |
4 |
31.0 |
3.5 |
2.0 |
27.20 |
Vertex 2 has the poorest response and is replaced with a new vertex V5. The calculation for the coordinate on the axis representing %CH3CN is shown here

Note that this calculation is a logical extension of the equations presented earlier for a two-factor simplex optimization. The remaining coordinates for the new vertex are 4.17 for the concentration of buffer and 1.6 for the concentration of TBABr.
The progress of the simplex optimization is shown below. Small differences between the experimental vertices and the calculated vertices presumably represent routine experimental variability. Note that the pathway includes expansions, one of which fails, and contractions. When evaluating the simplex of vertices 5, 7, 9, and 11, the worst next-to-worst vertex is rejected to avoid rejecting the most recently retained vertex. The optimization ends with the addition of the 19th vertex, at which point the simplexes show signs of rotating around the 9th vertex.
Simplex (Vertices) |
Rejected Vertex |
New |
%CH3CN |
[buffer] |
[TBABr] |
CRF |
Comments |
1 |
30.0 |
3.0 |
1.0 |
25.54 |
|||
2 |
33.0 |
3.5 |
1.2 |
23.00 |
|||
3 |
31.0 |
5.0 |
1.2 |
23.60 |
|||
4 |
31.0 |
3.5 |
2.0 |
27.20 |
|||
1,2,3,4 |
2 |
5 |
28.4 |
4.1 |
1.6 |
30.25 |
|
6 |
6 |
26.1 |
4.4 |
1.8 |
26.49 |
expansion of 5 to 6 fails |
|
1,3,4,5 |
1 |
7 |
28.6 |
2.0 |
1.8 |
29.29 |
|
1,4,5,7 |
3 |
8 |
29.0 |
3.4 |
2.6 |
32.13 |
|
8 |
9 |
28.5 |
3.6 |
3.4 |
34.41 |
expansion of 8 to 9 is ok |
|
4,5,7,9 |
4 |
10 |
26.2 |
2.9 |
2.6 |
29.24 |
|
10 |
11 |
27.4 |
3.1 |
2.5 |
26.91 |
must contract to Cr |
|
5,7,9,11 |
7 |
12 |
27.2 |
5.2 |
3.2 |
26.39 |
reject worst from last simplex |
12 |
13 |
28.6 |
2.8 |
2.2 |
30.52 |
must contract to Cw |
|
5,9,11,13 |
11 |
14 |
29.6 |
3.9 |
2.3 |
28.32 |
|
14 |
15 |
29.1 |
3.7 |
2.4 |
29.82 |
must contract to Cr |
|
5,9,13,15 |
5 |
16 |
29.0 |
2.7 |
3.8 |
33.50 |
|
9,13,15,16 |
15 |
17 |
28.3 |
2.3 |
3.8 |
33.88 |
|
9,13,16,17 |
13 |
18 |
28.6 |
3.0 |
5.2 |
34.25 |
|
9,16,17,18 |
19 |
29.1 |
3.9 |
4.4 |
33.19 |
stop optimization |
The maximum response is obtained at vertex 9, although the authors eventually settled on using vertex 18 as the CRF values in the vicinity of this vertex are more similar in value than those for vertices in the vicinity of vertex 9. Choosing this set of factor levels makes for a more rugged analysis that is less sensitive to small changes in the factor levels.
Case Study 2: Application of a 23 Factorial Design
"Optimization of Pb(II) Biosorption by Robinia Tree Leaves Using Statistical Design of Experiments"
J. Zolgharnein, A. Shahmoradi, M. R. Sangi, Talanta 2008, (in press) doi: 10.1016/j.talanta.2008.03.039.
Pollution from toxic heavy metals, such as Pb(II), is a significant problem. The removal of heavy metals from water can be accomplished in a number of ways, one of which is using biomass as an absorbent. In this paper the authors explore the utility of using leaves from the Robinia tree as a biosorbent. Interest in the leaves results from their wide availability, their low cost, and because they have no commercial use.
Factors known to affect metal ion uptake by biosorbents include the mass of biosorbent, the pH of the water sample, the contact time between the metal ion solution and the biosorbent, the initial concentration of metal ion, and the speed of shaking. Because adsorption occurs fairly quickly and reaches equilibrium in 15-25 minutes, the contact time for all trials was set to 25 min and the affect of shaking time was not considered. Three main factors were considered - biosorbent mass, initial pH, and initial concentration of Pb(II) - and the high and low factor levels are shown in the following table.
Factor |
Low Level |
High Level |
| biosorbent mass (g) | 0.1 |
0.3 |
| initial pH | 2 |
5 |
| [Pb(II)] (mg/L) | 40 |
200 |
With three factors a 23 factorial design was used. Each of the eight combinations of factor levels was run twice, with runs occurring on separate days. Results of the individual trials, where the response is the percent of Pb(II) absorbed by the biosorbent, are shown in the following table. The day of the analysis is included as an additional factor.
| Day | pH |
[Pb(II)] |
biosorbent mass |
Response |
1 |
5 |
40 |
0.3 |
96.92 |
1 |
2 |
40 |
0.3 |
95.47 |
1 |
2 |
40 |
0.1 |
74.82 |
1 |
2 |
200 |
0.3 |
90 |
1 |
5 |
200 |
0.1 |
81.17 |
1 |
5 |
200 |
0.3 |
92.7 |
1 |
5 |
40 |
0.1 |
95 |
1 |
2 |
200 |
0.1 |
50.33 |
2 |
2 |
40 |
0.3 |
95.6 |
2 |
5 |
200 |
0.3 |
92.7 |
2 |
2 |
40 |
0.1 |
74.7
|
2 |
2 |
200 |
0.1 |
50.5 |
2 |
5 |
40 |
0.1 |
95 |
2 |
5 |
40 |
0.3 |
97 |
2 |
2 |
200 |
0.3 |
90.2 |
2 |
5 |
200 |
0.1 |
81.17 |
The authors fit the data to a model that considers first-order effects for all four factors, and interactions between pH, the concentration of Pb(II), and the amount of biosorbent. Using coded factor levels for pH, for example, gives its coefficient as

Values for all the coefficients are shown here:
Term |
Value |
t |
p |
| constant | 84.58 |
4471.36 |
0.000 |
| day | -0.029 |
-1.52 |
0.172 |
| pH | 6.878 |
363.58 |
0.000 |
| [Pb(II)] | -5.984 |
-316.33 |
0.000 |
| biosorbent mass | 9.244 |
488.68 |
0.000 |
| pH x [Pb(II)] | 1.461 |
77.25 |
0.000 |
| pH x biosorbent mass | -5.871 |
-310.39 |
0.000 |
| [Pb(II)] x biosorbent mass | 3.56 |
188.2 |
0.000 |
| pH x [Pb(II)] x biosorbent mass | -1.168 |
-61.72 |
0.000 |
Because the model includes fewer coefficients than experiments, there is sufficient information to determine a standard error for the coefficients and, therefore, to determine values of t and p for the null hypothesis that a coefficient's value is equal to zero. The results in the table show that the day of analysis does not appear to be a significant factor (a p-value of 0.172); all other factors and interactions are significant.
Interpreting the results is straightforward. For the first-order effects, less acidic pH levels, lower initial concentrations of Pb(II), and a greater mass of biosorbent lead to a greater recovery of the lead by the biosorbent. The interactions terms suggest that there is a negative interaction between pH and the mass of biosorbent. This can be explained by recognizing that there is a competition between Pb(II) and H+ for absorption sites within the biosorbent. When the pH is more acidic (factor level of -1), more biosorbent (factor level of +1) is needed to remove the same relative amount of Pb(II).
Case Study 3: Central-Composite Design
"A Multivariate Study of the Performance of a Microwave-Assisted Soxhelt Extractor for Olive Seeds"
L. E. Garcia-Ayuso, M. D. Luque de Castro, Anal. Chim. Acta 1999, 382, 309-316.
To the manufacturers of olive oil, one of the most important characteristic of a feed stock is the total content of edible oil in the olives. The traditional method for determining the oil content is to grind the seeds and then leach them with an organic solvent. The difference in mass before and after the leaching gives the mass of oil. The most common method for completing the extraction is a Soxhelt extraction, which, although easy to carry out and relatively inexpensive, is time consuming. One solution to this problem is using microwave radiation to speed up extraction.
Preliminary studies indicated that three microwave parameters were important factors: irradiation power, irradiation time, and the number of cycles. Furthermore, the preliminary studies indicated that extraction efficiency increases with the number of cycles, and that there are strong interactions between the three factors. The factor levels and the central composite experimental design shown in the following two tables were used to model the importance of these factors.
Level |
Power (%) |
Time (s) |
Cycles (number) |
-a = +1 |
30 |
20 |
5 |
-1 |
30 |
20 |
5 |
0 |
45 |
25 |
6 |
+1 |
60 |
30 |
7 |
+a = +1 |
60 |
30 |
7 |
Run Order |
Power |
Time |
Cycle |
% Oil Extracted |
1 |
-1 |
+1 |
+1 |
48.58 |
2 |
-a |
0 |
0 |
49.18 |
3 |
+1 |
+1 |
+1 |
43.03 |
4 |
0 |
0 |
0 |
50.51 |
5 |
0 |
-a |
0 |
49.22 |
6 |
+1 |
+1 |
-1 |
42.01 |
7 |
0 |
0 |
+a |
49.93 |
8 |
0 |
0 |
0 |
49.33 |
9 |
-1 |
+1 |
-1 |
45.51 |
10 |
0 |
0 |
0 |
49.01 |
11 |
+1 |
-1 |
-1 |
42.55 |
12 |
0 |
0 |
0 |
49.93 |
13 |
0 |
+a |
0 |
47.89 |
14 |
-1 |
-1 |
-1 |
46.64 |
15 |
-1 |
-1 |
+1 |
47.23 |
16 |
+a |
0 |
0 |
44.59 |
17 |
0 |
0 |
-a |
48.93 |
18 |
+1 |
-1 |
+1 |
44.68 |
19 |
0 |
0 |
0 |
49.63 |
20 |
0 |
0 |
0 |
50.54 |
Normally a three-factor central composite design sets a to 1.682; however, because the number of cycles must be an integer value and because the microwave's power can be changed in steps of only ±5%, the value of a was set to 1. Note that the center point was run five times to provide replication, which helps in the statistical evaluation of the regression results.
The authors fit a second-order polynomial equation, including first-order interactions, to the results from the central composite design, obtaining the following model
% oil extracted = 49.85 - 2.03P - 0.33T + 0.78C - 0.30PT - 0.064PC + 0.17TC - 3.01P2 - 1.34T2 - 0.465C2
where P is the irradiation power, T is the irradiation time, and C is the number of cycles. The optimum conditions for maximizing the amount of extracted oil are coded factor levels of P = -0.345, T = -0.030, and C = 0.859. Translating these into uncoded values gives P = 39.82%, T = 24.85 s, and C = 6.86 cycles. Given the limitations of the microwave, the optimum conditions are an irradiation power of 40%, an irradiation time of 25 s, and 7 cycles. Analysis of seven samples using these conditions gave an average value of 50.52% oil with a standard deviation of 0.76, a result that compared favorably with the standard Soxhlet extraction.
To continue, use the tabs at the top of the page.
Problem 1. For each of the following equations determine the optimum response using a one-factor-at-a-time algorithm. Begin the search at (0,0) with factor A, and use a step-size of 1 for both factors. The boundary conditions for each response surface are 0≤A≤10, and 0≤B≤10. Continue the search through as many cycles as necessary until you reach a maximum response. Compare your maximum response for each equation to the actual optimum (which is given in parentheses as A, B).
- R = 1.68 + 0.24A + 0.56B - 0.04A2 - 0.04B2 (optimum is 3, 7)
- R = 4.0 - 0.4A + 0.08B (optimum is 10,10)
- R = 3.264 + 1.537A + 0.5664B - 0.1505A2 - 0.02734B2 - 0.05785AB (optimum is 3.91,6.22)
Note: These equations are from Deming, S. N.; Morgan, S. L. Experimental Design: A Chemometric Approach (Elsevier: Amsterdam, 1987), and pseudo-three dimensional plots of the response surfaces can be found in their Figures 11.4, 11.5 and 11.14.
Problem 2. Determine the optimum response for each of the equations in Problem 1 using a fixed-size simplex algorithm and the same boundary conditions. Begin each search with vertices of (0,0), (0,2), and (1, 1.74). Continue the search through as many cycles as necessary until you reach a maximum response. Compare your maximum response for each equation to the actual optimum (which is given in parentheses as A, B).
Problem 3. Determine the optimum response for each of the equations in Problem 1 using the variable-size simplex algorithm and the same boundary conditions. Begin each search with vertices of (0,0), (0,5), and (1, 3.48). Continue the search through as many cycles as necessary until you reach a maximum response. Compare your maximum response for each equation to the actual optimum (which is given in parentheses as A, B).
Problem 4. Pharmaceutical tablets coated with lactose often develop a brown discoloration. The factors that primarily affect the discoloration are temperature, relative humidity, and the presence of a base acting as a catalyst. The following factor levels
Factor |
High (+1) Level |
Low (-1) Level |
X: benzocaine |
present |
absent |
Y: temperature |
40o C |
20o C |
Z: relative humidity |
75% |
50% |
and results for a 23 factorial design have been reported.
Run |
X |
Y |
Z |
Color |
1 |
- |
- |
- |
1.55 |
2 |
+ |
- |
- |
5.40 |
3 |
- |
+ |
- |
3.50 |
4 |
+ |
+ |
- |
6.75 |
5 |
- |
- |
+ |
2.45 |
6 |
+ |
- |
+ |
3.60 |
7 |
- |
+ |
+ |
3.05 |
8 |
+ |
+ |
+ |
7.10 |
If coefficients of less than ±0.5 are insignificant, what main effect and interaction terms in the coded equation are important? The data in this problems is from Armstrong, N. A.; James, K. C. Pharmaceutical Experimental Design and Interpretation. Taylor and Francis: London, 1996 (as cited in González, A. G. Anal. Chim. Acta 1998, 360, 227-241).
Problem 5. The following information is available for a 23 factorial design used to optimize the yield of a chemical process.
Factor |
High (+1) Level |
Low (-1) Level |
X: Temperature |
140o C |
120o C |
Y: Catalyst |
Type B |
Type A |
Z: [Reactant] |
0.50 M |
0.25 M |
Note that one of the factors (Y: catalyst) has non-numeric factor levels. In this case it is possible to determine the relative importance of the factors, but it is not possible to generate a response surface since intermediate factors levels for factor Y. Results for the eight experiments are shown here:
Run |
X |
Y |
Z |
% Yield |
1 |
- |
- |
- |
28 |
2 |
+ |
- |
- |
17 |
3 |
- |
+ |
- |
41 |
4 |
+ |
+ |
- |
34 |
5 |
- |
- |
+ |
56 |
6 |
+ |
- |
+ |
51 |
7 |
- |
+ |
+ |
42 |
8 |
+ |
+ |
+ |
36 |
If coefficients of less than ±1 are insignificant, what main effect and interaction terms in the coded equation are important? Is one of the catalysts, A or B, a better choice, or does the choice depend upon the temperature or concentration of reactant? What is the yield using catalyst A if the temperature is set to 100o C and the concentration of the reactant is 1.0 M? The data in this problem is from Strange, R. S. J. Chem. Educ. 1990, 67, 113-115.
Problem 6. The following data for a 23 factorial design were collected during a study of the effect of temperature, pressure and residence time on the % yield of a reaction.
Factor |
High (+1) Level |
Low (-1) Level |
X: temperature |
200o C |
100o C |
Y: pressure |
0.6 MPa |
0.2 Mpa |
Z: residence time |
20 min |
10 min |
Run |
X |
Y |
Z |
% Yield |
1 |
- |
- |
- |
2 |
2 |
+ |
- |
- |
6 |
3 |
- |
+ |
- |
4 |
4 |
+ |
+ |
- |
8 |
5 |
- |
- |
+ |
10 |
6 |
+ |
- |
+ |
18 |
7 |
- |
+ |
+ |
8 |
8 |
+ |
+ |
+ |
12 |
If coefficients of less than ±1 are insignificant, which main effect and which interaction terms are important? Three runs were made at the center of the factorial design (temperature = 150o C, pressure = 0.4 MPa, residence time = 15 min), giving yields of 12, 8, 9, and 8.8. Using a t-test, determine if there is any evidence that a first-order empirical model is inappropriate for this system at a = 0.05. The data in this problem is from Akhnazarova, S.; Kafarov, V. Experimental Optimization in Chemistry and Chemical Engineering MIR Publishers: Moscow, 1982 (as cited in González, A. G. Anal. Chim. Acta 1998, 360, 227-241).
Problem 7. Duarte, et al. used a factorial design to optimize the determination of penicillin using flow-injection analysis with potentiometric detection. Three factors were studied -- reactor length, carrier flow rate, and sample volume, with the high and low values summarized in the following table.
Factor |
High (+1) Level |
Low (-1) Level |
X: reactor length (cm) |
1.5 |
2.0 |
Y: carrier flow rate (mL/min) |
1.6 |
2.2 |
Z: sample volume (mL) |
100 |
150 |
An optimum response was defined as the greatest analytical sensitivity, defined as DE for a standard solution of penicillin, and the greatest sampling rate. The results of the optimization studies are tabulated below.
Run |
X |
Y |
Z |
DE (mV) |
Samples/hr |
1 |
- |
- |
- |
37.45 |
21.5 |
2 |
+ |
- |
- |
31.70 |
26.0 |
3 |
- |
+ |
- |
32.10 |
30.0 |
4 |
+ |
+ |
- |
27.20 |
33.0 |
5 |
- |
- |
+ |
39.85 |
21.0 |
6 |




