Minitab Inc
 

Individual Distribution Identification

Identify the Distribution of Your Data

Knowing the distribution of your data is essential to choosing the right statistical method. Suppose you need to assess the capability of your process. If you conduct an analysis that assumes the data follow a normal distribution when in fact they are nonnormal, your results can be inaccurate. To avoid this costly error, determine the distribution of your data before you choose the analysis.

So, how do you determine the distribution? Minitab's new Individual Distribution Identification is a simple way to find the distribution of your data so you can select the appropriate analysis. You can use it to:

  • Verify that a distribution used previously is still valid for the current data
  • Choose the right distribution when you’re not sure which to use
  • Transform your data to follow a normal distribution

Three Ways to Use Individual Distribution Identification

If you want to confirm that a certain distribution fits your data

In most cases, your process knowledge helps you identify the distribution of your data. In these situations, you can use Individual Distribution Identification to confirm that this distribution fits the current data.

Suppose you want to perform a capability analysis to ensure that the weight of ice cream-filled containers from your production line meets specifications. In the past, these data have been normal, but you want to confirm that the current data are normal too. Here’s how you use Individual Distribution Identification to quickly assess the fit.

  1. Choose Stat > Quality Tools > Individual Distribution Identification.
  2. Specify the column of data to analyze and the distribution you want to check it against. Click OK.
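  3. A given distribution is a good fit if:
    • The data points on the probability plot roughly follow a straight line
    • The p-value is greater than 0.05, meaning the test finds no significant evidence against the fit at the 0.05 level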
[Figure: Indivi13.gif]

In this case, the ice cream weight data appear to follow a normal distribution, so you can justify the use of normal capability analysis.
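For readers who want to see the idea behind the p-value check outside Minitab, here is a minimal Python sketch of an Anderson-Darling normality test. It is an illustration under stated assumptions, not necessarily the calculation Minitab performs: the function name `ad_normality_test`, the sample data, and the use of a commonly cited piecewise approximation for the p-value are all choices made for this example.

```python
import math
from statistics import NormalDist, mean, stdev


def ad_normality_test(data):
    """Return (AD statistic, approximate p-value) for a normal fit.

    Hypothetical helper for illustration; parameters are estimated
    from the sample itself.
    """
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    tiny = 1e-12  # keeps log() finite for extreme points
    u = [min(max(fitted.cdf(v), tiny), 1 - tiny) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1 - u[n - 1 - i]))
        for i in range(n)
    )
    ad = -n - total / n
    # Small-sample adjustment, then a piecewise p-value approximation
    # (assumed here; check your software's documentation for its method).
    adj = ad * (1 + 0.75 / n + 2.25 / n**2)
    if adj >= 0.6:
        p = math.exp(1.2937 - 5.709 * adj + 0.0186 * adj**2)
    elif adj > 0.34:
        p = math.exp(0.9177 - 4.279 * adj - 1.38 * adj**2)
    elif adj > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * adj - 59.938 * adj**2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * adj - 223.73 * adj**2)
    return ad, min(max(p, 0.0), 1.0)


# Made-up fill weights in grams (evenly spaced normal quantiles),
# used only to exercise the function.
weights = [NormalDist(500, 4).inv_cdf((i + 0.5) / 30) for i in range(30)]
ad, p = ad_normality_test(weights)
print(f"AD = {ad:.3f}, p = {p:.3f}")
```

A p-value above 0.05 from this sketch corresponds to the "good fit" rule in step 3. Always inspect the probability plot as well.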

If you are not sure which distribution fits your data

Suppose you have successfully used more than one distribution in the past. You can use Individual Distribution Identification to help you decide which distribution best fits your current data. For example, you want to assess whether a particular weld strength is meeting customers' requirements. A number of distributions have been used to model this type of data in the past. Here’s how you use Individual Distribution Identification to choose the distribution that best fits your data.

  1. Choose Stat > Quality Tools > Individual Distribution Identification.
  2. Specify the column of data to analyze and the distributions you want to check it against. Click OK.
  3. Choose the distribution with data points that roughly follow a straight line and the highest p-value.
[Figure: Indivi14.gif]

In this case, the lognormal distribution is a better fit than the others because the data points roughly follow a straight line and its p-value is the highest.
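The comparison of several candidate distributions can be sketched in Python as follows. This is a simplified illustration, not Minitab's procedure: it ranks three candidates by their Anderson-Darling statistic (smaller means the fitted curve is closer to the data) rather than by p-value, because p-values for nonnormal fits need distribution-specific tables. The function names, the choice of candidates, and the simple parameter estimates are assumptions made for this example.

```python
import math
from statistics import NormalDist, mean, stdev


def ad_statistic(data, cdf):
    """Anderson-Darling statistic of data against a fitted CDF."""
    x = sorted(data)
    n = len(x)
    tiny = 1e-12  # keeps log() finite for extreme points
    u = [min(max(cdf(v), tiny), 1 - tiny) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def rank_fits(data):
    """Return (name, AD) pairs sorted from best to worst fit.

    Hypothetical helper; requires strictly positive data because of
    the lognormal and exponential candidates.
    """
    if min(data) <= 0:
        raise ValueError("data must be strictly positive")
    logs = [math.log(v) for v in data]
    normal = NormalDist(mean(data), stdev(data))
    log_normal = NormalDist(mean(logs), stdev(logs))  # normal fit on ln(x)
    rate = 1 / mean(data)  # exponential rate estimated from the mean
    candidates = {
        "normal": normal.cdf,
        "lognormal": lambda v: log_normal.cdf(math.log(v)),
        "exponential": lambda v: 1 - math.exp(-rate * v),
    }
    scores = {name: ad_statistic(data, cdf) for name, cdf in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up, right-skewed strength values used only for illustration.
strengths = [
    math.exp(3 + 0.5 * NormalDist().inv_cdf((i + 0.5) / 40)) for i in range(40)
]
for name, ad in rank_fits(strengths):
    print(f"{name:12s} AD = {ad:.3f}")
```

As in step 3, a numerical ranking is only half of the decision; the probability plot for the chosen distribution should still look roughly linear.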

Note

You can evaluate up to 14 different distributions, including 1-, 2-, and 3-parameter distributions. When you fit your data with both a 2-parameter distribution and its 3-parameter counterpart, the 3-parameter distribution often appears to be a better fit. However, because it adds a parameter and is therefore more complex, you would only want to use a 3-parameter distribution if it offers a significantly better fit. See Minitab Help for information about using likelihood-ratio test (LRT) p-values to choose between them.

If you know your data are nonnormal but you want to use a normal statistical technique

While Minitab offers various options for working with nonnormal data, many users simply prefer to use the broader palette of normal statistical techniques. The good news is that, in addition to finding the true distribution of your data, Minitab's Individual Distribution Identification can transform your nonnormal data to follow a normal distribution using the Box-Cox method. You can then use the transformed data with any tool that assumes data follow a normal distribution.

  1. Choose Stat > Quality Tools > Individual Distribution Identification.
  2. Specify the column of data to analyze, choose Box-Cox transformation, and check any other distributions you want to compare it with. Click OK in each dialog box.
  3. For the transformed data, check whether data points roughly follow a straight line and the p-value is greater than 0.05.
[Figure: Indivi17.gif]

In this case, the probability plot and p-value suggest that the data are successfully transformed to follow a normal distribution. You can now use the transformed data for further analysis.

Note

Transforming data does not always result in normal data. You must check the probability plot and p-value to assess whether the normal distribution fits the transformed data well.
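To make the Box-Cox idea concrete, here is a minimal Python sketch. It uses the textbook form of the transformation, (x^λ − 1)/λ for λ ≠ 0 and ln(x) for λ = 0, and picks λ from a coarse grid by maximizing the profile log-likelihood. The function names, the grid from −2 to 2 in steps of 0.1, and the selection criterion are assumptions for this example; Minitab's exact form of the transformation and its method for choosing λ may differ.

```python
import math
from statistics import pvariance


def box_cox(data, lam):
    """Apply the textbook Box-Cox transformation for a given lambda."""
    if lam == 0:
        return [math.log(v) for v in data]
    return [(v**lam - 1) / lam for v in data]


def best_lambda(data, grid=None):
    """Pick lambda from a grid by maximizing the profile log-likelihood.

    Hypothetical helper; the default grid is an assumption of this sketch.
    """
    if min(data) <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if grid is None:
        grid = [k / 10 for k in range(-20, 21)]
    n = len(data)
    sum_logs = sum(math.log(v) for v in data)

    def log_likelihood(lam):
        # Variance of the transformed data plus the Jacobian term
        var = pvariance(box_cox(data, lam))
        return -0.5 * n * math.log(var) + (lam - 1) * sum_logs

    return max(grid, key=log_likelihood)


# Made-up right-skewed measurements used only for illustration.
values = [math.exp(3 + 0.25 * k) for k in range(-8, 9)]
lam = best_lambda(values)
transformed = box_cox(values, lam)
print(f"chosen lambda = {lam}")
```

The transformed values still need the probability plot and p-value check described in the note above before you treat them as normal.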

Putting Individual Distribution Identification to Use

It is always good practice to know the distribution of your data before you analyze them. Minitab's Individual Distribution Identification is an easy-to-use tool that helps you identify the distribution of your data and avoid the errors and wasted time that result from using an inappropriate analysis. You can use this feature to check the fit of a single distribution, or to compare the fits of several distributions and select the one that fits best. If you prefer to work with normal data, you can even use Individual Distribution Identification to transform your nonnormal data to follow a normal distribution.


Visit Accessing the Power of Minitab for additional tutorials on the many time-saving features and functionality available in Minitab Statistical Software.

Learn more about Minitab 15.

Copyright ©2008 Minitab Inc. All rights reserved. See Legal Page.