By Matthew Barsalou
Recently, I attempted to give several engineers a 30-second explanation of what design of experiments (DoE) is and what it could do. The results were what an experienced DoE practitioner might expect from such an exercise: a total failure. Perhaps a 30-second introduction to DoE is unrealistic, but providing a short and concise explanation is possible. Having a paper helicopter on hand helps.
The late statistician George E. P. Box, along with Soren Bisgaard and Conrad Fung, used a paper helicopter to teach statistics. The idea originated with Kip Rogers of Digital Equipment and is useful for demonstrating fractional factorial designs. Decades after Box, Bisgaard and Fung’s publication, the DoE helicopter has become a useful staple of DoE training.
The paper helicopter provides a way to quickly explain basic DoE concepts. It also offers an easy-to-do experiment you can analyze using Minitab.
To perform a DoE with a paper helicopter, we first need to identify the desired output, which will be our response variable. We can’t just declare that we want a high-quality helicopter; quality must be clearly defined.
A good helicopter is one that stays in the air longer, so the response variable will be flight time, measured from the moment the helicopter is dropped from a height of 2 meters until it hits the floor. Without defined test conditions, sample helicopters might be dropped from different heights, and our DoE results would not be valid.
Test factors that influence flight time must also be identified. For the helicopter experiment, the factors are paper type, rotor length, leg length, leg width and paper clip. The helicopter experiment levels are varied by using two different types of paper, using longer or shorter leg and rotor lengths and adding or removing a paper clip.
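To make the scale of the problem concrete, here is a short Python sketch that enumerates every combination of the five factors at two levels each. The level labels are illustrative assumptions, not values from the article; the point is simply that a full factorial would require 2^5 = 32 distinct helicopters.

```python
from itertools import product

# Hypothetical two-level settings for each helicopter factor
# (the specific labels are assumptions for illustration).
factors = {
    "paper type":   ["light", "heavy"],
    "rotor length": ["short", "long"],
    "leg length":   ["short", "long"],
    "leg width":    ["slim", "wide"],
    "paper clip":   ["off", "on"],
}

# A full factorial tests every combination of all five factors.
runs = list(product(*factors.values()))
print(len(runs))  # 2**5 = 32 combinations
```

A fractional factorial design, discussed next, deliberately tests only a carefully chosen subset of these 32 combinations.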
Here’s how to make the paper helicopters.
Figure 1: The helicopter plan
Figure 2: The finished helicopter
Table 1: Helicopter factors
Statisticians and Six Sigma black belts should know how to set up and perform the calculations in a designed experiment by hand; however, computer programs make DoE a much simpler task, particularly for people who need to perform experiments only occasionally.
To create a fractional factorial design in Minitab Statistical Software, go to Stat > DoE > Factorial > Create Factorial Design and select the desired design.
For this experiment, we will use a 2-level factorial design, which can handle from 2 to 15 factors. To select the desired design in Minitab, enter 5 for the Number of factors, then click Designs to select the desired design and resolution level.
Resolution is the degree to which effects are aliased with other effects. In other words, aliased effects are mixed together and can’t be estimated separately. This can also be referred to as confounding, and it results from not testing every possible combination of factors. This is a disadvantage of a fractional factorial design; however, not testing every possible combination can be a significant advantage in time and expense over a full factorial design.
In the quality realm, we typically use three levels of resolution: resolution III, IV and V. None of these three resolution types confounds main effects with each other; however, in a resolution III design, main effects are confounded with 2-factor interactions. In a resolution IV design, main effects are not confounded with 2-factor interactions, but 2-factor interactions are aliased with other 2-factor interactions, and main effects are confounded with 3-factor interactions.
We try to use resolution IV designs instead of resolution III designs when possible because they have less aliasing, but still require fewer experimental runs than higher resolution experiments.
Resolution V designs have the added advantage that no 2-factor effects are confounded with other 2-factor effects; however, 2-factor effects are aliased with 3-factor effects, and main effects are aliased with 4-factor effects.
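The mechanics of a fraction can be sketched in a few lines of Python. For five factors, a 2^(5-1) resolution V half-fraction takes all 16 sign combinations of four base factors A–D and sets the fifth column with the generator E = ABCD, so E is aliased with the ABCD interaction. (This is a generic textbook construction, not output from Minitab.)

```python
from itertools import product

# Base design: all 16 +/-1 combinations of factors A, B, C, D.
base = list(product([-1, 1], repeat=4))

# Generator E = ABCD defines the half-fraction: the fifth column
# is the product of the first four, aliasing E with ABCD.
design = [(a, b, c, d, a * b * c * d) for a, b, c, d in base]

print(len(design))  # 16 runs instead of the full 32

# Defining relation I = ABCDE: the product of all five columns
# is +1 in every run, which is what creates the alias structure.
print(all(a * b * c * d * e == 1 for a, b, c, d, e in design))  # True
```

Because the shortest "word" in the defining relation has five letters, main effects are aliased only with 4-factor interactions and 2-factor interactions only with 3-factor interactions, which is exactly the resolution V property described above.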
The confounding problem can be eliminated by performing a full factorial design; however, this requires more experimental runs, which might be prohibitive in terms of both time and money.
As you set up the experiment, Minitab also asks for the number of blocks. Blocks are simply homogeneous groupings of measurements that can be used to account for variation. The default value is one block; ideally, everything is homogeneous.
The helicopter experiment will be set up so that there is only one experimental block: each type of paper will come from the same source; the helicopters will all be built by the same person using the same scissors and ruler. If we had a paper clip shortage that forced us to use paper clips from two manufacturers, then we would need blocks to account for potential variation in the paper clips. Fortunately, this is not the case.
After you select your design, click the “Factors” button to enter the names and levels of the variables in your experiment. To change the name of a factor, simply type the name of the factor over the letter in the name field. The factor settings can also be renamed by replacing the default values of -1 and 1 with the actual factor levels.
When you’ve completed the dialog box, Minitab creates the experimental design and displays it in a Minitab worksheet. The Session Window above the worksheet provides a description of the selected design with the resulting alias structure.
In the resulting Minitab worksheet shown above, the experimental results are entered into column C10. We can name the column “Flight time” because that is our experimental response variable.
A randomized run order is provided in the “RunOrder” column. Without randomization there is a risk that the experimental results will reflect unknown changes in the test system over time. For example, in the helicopter experiment, the scissors may become dull over time, resulting in slightly different cuts as each new helicopter is prepared.
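Randomizing the run order is trivial to do by hand or in code. The sketch below shuffles the standard order of a 16-run design; the fixed seed is only there so the sketch is reproducible, not something the article prescribes.

```python
import random

# Standard (Yates) order of a 16-run design.
std_order = list(range(1, 17))

# Shuffle into a random run order so that time-dependent drifts
# (e.g., the scissors dulling) are not confounded with the factors.
run_order = std_order[:]
random.seed(42)  # fixed seed only for reproducibility of this sketch
random.shuffle(run_order)

print(run_order)  # every run appears exactly once, in random order
```

The key property is that every run still appears exactly once; only the order in which the helicopters are built and flown changes.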
Minitab’s default setting for a designed experiment is one replicate. If you observe a lot of variation in the process or the resulting measurements, you can use Stat > DoE > Modify Design to add replicates to your design. Suppose the person making the helicopters had difficulty cutting a straight line so all edges are not uniform; the differences in results may reflect this variation. Replicating runs minimizes the effects of this kind of unanticipated variation.
Variability can have a major impact on experimental results, so take steps to reduce the variability. Have the same person make all of the helicopters using the same scissors and ruler. Drop the helicopters from a height of 2 meters, and identify the drop point clearly to ensure consistency. A higher or lower starting point would affect flight time, and this could throw off the results. The helicopters must also be held and released the same way, or variation in our data might be the effect of the release method and not the design of the helicopter.
The Minitab worksheet below contains the experimental results listed under “Flight time” in column C10.
After running the experiment and entering the collected data in the Minitab worksheet, select Stat > DoE > Factorial > Analyze Factorial Design…
Significant factors are those that influence the response as they change from one setting to another. When you click OK, Minitab provides an ANOVA table as well as a Pareto chart of effects, which make it very easy to identify significant factors.
In an ANOVA table, those factors with a P-value less than 0.05 are statistically significant. However, the ANOVA table for this model doesn’t include any P-values!
This is because with all of our factors included in the model, we have no degrees of freedom left for Error, and you need at least 1 degree of freedom to calculate P-values. But while we can’t accept this model based on the ANOVA results, we can use the normal plot or Pareto chart to identify factors and interactions that are not significant.
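The effect estimates behind that normal plot or Pareto chart are simple contrasts: the average response at a factor's high setting minus the average at its low setting. The sketch below computes one such main effect; the settings and flight times are invented for illustration and are not the article's experimental data.

```python
# Coded settings (-1/+1) for one factor, e.g. rotor length,
# and the corresponding flight times in seconds (made-up data).
rotor = [-1, -1, 1, 1, -1, -1, 1, 1]
times = [1.4, 1.5, 2.1, 2.2, 1.3, 1.6, 2.0, 2.3]

# Main effect = mean(response at +1) - mean(response at -1).
high = [t for s, t in zip(rotor, times) if s == 1]
low = [t for s, t in zip(rotor, times) if s == -1]
effect = sum(high) / len(high) - sum(low) / len(low)

print(round(effect, 3))  # 0.7: longer rotors add ~0.7 s flight time here
```

Effects that stand far from zero on the normal plot (or past the reference line on the Pareto chart) are the candidates to keep in the model even when no P-values are available.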
At this point, the experimenter would typically begin eliminating these factors, rerunning the analysis until only significant factors and interactions are left. This is usually referred to as “reducing the model.” As factors are removed from the model, additional degrees of freedom become available for the calculation of P-values. The number of models you need to evaluate depends on the number of factors in your analysis.
The stepwise regression feature makes it simple and fast to select the optimal model for your data by automatically removing factors to find the model that best fits your data. You can choose from three stepwise analysis methods: Stepwise, Forward selection, and Backward elimination. In Backward elimination, all factors are included in the initial analysis, and then non-significant factors are removed one-by-one.
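The backward elimination loop itself is straightforward, as the toy sketch below shows. The P-values here are invented placeholders purely to drive the loop; in a real analysis, the software refits the model and recomputes every P-value after each removal, which this simplified version does not do.

```python
# Toy backward elimination: drop the least significant term until
# every remaining term is significant at alpha = 0.05.
alpha = 0.05

# Placeholder P-values for five main effects (invented for illustration).
pvalues = {"paper": 0.010, "rotor": 0.002, "leg length": 0.030,
           "leg width": 0.400, "paper clip": 0.600}

model = dict(pvalues)
while model and max(model.values()) >= alpha:
    worst = max(model, key=model.get)   # least significant remaining term
    del model[worst]                    # real software would now refit

print(sorted(model))  # ['leg length', 'paper', 'rotor'] survive here
```

Forward selection runs the same idea in reverse, starting from an empty model and adding the most significant candidate term at each step.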
Regardless of the stepwise method you use, the model Minitab selects contains the same significant factors shown below:
To help you interpret your results, Minitab can also provide main effects and interaction plots. Select Stat > DoE > Factorial > Factorial Plots… Since we have already analyzed the results, Minitab automatically selects the factors used in our model:
Clicking OK gives us plots of the significant main effects and interactions. The main effects plot shows the results of changing from one setting to another for each factor.
The interaction plot shows the interactions between the factors.
Finally, we can use the Response Optimizer to find the combination of factor settings that will give us the longest flight time. Select Stat > DoE > Factorial > Response Optimizer…
The optimizer produces the following graph showing the optimal factor settings in red, and the predicted response for helicopters made with those settings in blue:
For the data we collected, our analysis with Minitab indicates the optimal helicopter settings are lighter paper, longer rotor length, shorter leg length, slimmer leg width, and no paper clip on the leg.
To design an even better helicopter, we could repeat the entire DoE using even lighter paper and even longer rotors. But a 50 cm rotor may be bigger without being better. You may be able to predict the ideal settings from a DoE result, but always be cautious when extrapolating beyond the data set, or the result may be a crashing helicopter.