Evaluating the effectiveness of a measurement system relies on statistical techniques. Statistics is an integral part of every Six Sigma project, and in this article we will cover some of its most fundamental concepts, especially those essential for understanding how data is checked for errors and anomalies. The goal of any measurement system analysis is to confirm that the data is accurate, precise, and stable.
Statistics is the science of using mathematical tools to organize and analyze data. As we have discussed in previous chapters, most of the information gathered in a Six Sigma project, especially information about a company's customers, must be numeric in origin. If it is not numeric, the Black Belts on the team must find a uniform method for deriving numeric values from the data.
Description, Prediction, and Probability
Using statistics in a Six Sigma project meets two main objectives of any process improvement project: description and prediction.
Data gathered in a Six Sigma project are used to describe the current state of a process or a product. The product's characteristics are studied and all of this data is converted to variables. Variables are descriptive elements about a product. For example, a product's average useful lifetime is a variable. Other variables might include its color, price, availability, and so on. There are countless variables that can be established about a product, and these variables are used to establish a baseline at the beginning of the project.
Prediction is also a vital part of a Six Sigma project, and of statistics in general. Because the goal of most Six Sigma projects is to improve a process or product, statistics are used to predict how that process or product will improve over time, given the changes being enacted.
Foretelling the future is certainly difficult, but if sufficient data is collected about a process, a pattern usually emerges. The underlying patterns that are discovered can be generalized, which means they can be predicted, with a high level of confidence, to occur again. There is a fundamental difference between certainty and probability. Six Sigma Black Belts cannot be certain of anything when predicting a future business environment, but the aim of statistics is to reach conclusions with a very high degree of probability.
Probability is usually described as a percentage chance that something will occur, and estimates drawn from a sample are reported with a margin of error. For example, in most election polls, or polls in general, researchers will usually issue a statement such as "Mrs. Smith has a 40% chance of being reelected, with a margin of error of 3%." The methods used to calculate the margin of error involve algebraic formulas that relate mainly to the sample size (the number of people actually surveyed) and, when the sample is a large share of it, to the size of the entire population (the total number of people in a given community). Margin of error applies in both the positive and negative directions. In this instance, Mrs. Smith's probability of being reelected is in the range of 37% to 43%, which is 3 percentage points above and 3 percentage points below the official prediction.
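As a rough illustration of the algebra involved, the sketch below uses the common normal-approximation formula for the margin of error of a sample proportion, z * sqrt(p(1 - p) / n), with an optional finite population correction. The function name, the z value of 1.96 (which corresponds to roughly a 95% confidence level), and the sample size of 1,024 are assumptions chosen for illustration; they are not taken from the poll described above.

```python
import math


def margin_of_error(p, n, population=None, z=1.96):
    """Approximate margin of error for a sample proportion.

    p          -- proportion observed in the sample (0.40 for 40%)
    n          -- number of people actually surveyed
    population -- total population size, or None if it is very large
    z          -- z-score for the confidence level (1.96 is about 95%)
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: shrinks the error when the sample
        # is a large share of the whole population.
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error


# Hypothetical poll: a 40% result among 1,024 people surveyed.
moe = margin_of_error(0.40, 1024)
print(f"{0.40 - moe:.1%} to {0.40 + moe:.1%}")
```

With these assumed inputs the margin comes out at about 3 percentage points, which matches the 37% to 43% range in the example. Note that the sample size drives the result; the population size only matters when the sample covers a large part of it.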
Sampling is the process that Black Belts use to pick a manageable section of the population. They do this either by random sampling or by matching. Random sampling is the process of picking people at random from the general population to participate in the survey. It offers the greatest probability that the results of the survey will represent the thoughts of the entire population, although the demographics (characteristics) of the sample should still be checked against the demographics of the entire population.
Matching involves selecting specific individuals from the population that closely match certain characteristics. While it might seem to some people that matching would give a more accurate result, this is not the case in most circumstances. Random sampling is generally considered the better approach for most surveys because it avoids the bias that hand-picking participants can introduce, and it is the method used by most Six Sigma teams.
Regardless of the method chosen to select survey participants, Black Belts must determine how large the sampling error is before predictions can be made.
It should be noted that every survey that does not involve the entire population will have some sampling error. The goal in Six Sigma is to reduce the error to as close to zero as possible. If a classroom has 100 students and 99 of them are asked if they like the teacher, a strong prediction can be made, but the prediction is not 100% certain because one person was not asked the question.
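The classroom example can be simulated to see how sampling error shrinks as the sample approaches the whole population. This is only a sketch: the split of 70 students who like the teacher and 30 who do not, the function name, and the random seed are all assumptions made for illustration.

```python
import random


def sampling_error(population, sample_size, rng):
    """Absolute gap between a random sample's 'yes' share and the true share."""
    true_share = sum(population) / len(population)
    sample = rng.sample(population, sample_size)  # draws without replacement
    return abs(sum(sample) / sample_size - true_share)


# Hypothetical classroom: 100 students, 70 of whom like the teacher (1 = yes).
classroom = [1] * 70 + [0] * 30
rng = random.Random(42)  # fixed seed so repeated runs give the same draws

for size in (10, 50, 99, 100):
    print(size, round(sampling_error(classroom, size, rng), 4))
```

Asking all 100 students gives an error of exactly zero, while asking 99 leaves a small but nonzero error, because the answer of the one student left out is unknown.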
We just discussed the concept of margin of error. In our example, the conclusion was that Mrs. Smith's probability of being reelected is between 37% and 43%. There is an additional element to this prediction that must be considered, and that is the confidence level. A confidence level is a measurement of how "sure" a team of researchers is that its survey conclusion is correct. More precisely, a 95% confidence level means that if the same survey were repeated many times, about 95% of the resulting ranges would contain the true value for the whole population. The confidence level is chosen in advance, and a specific mathematical formula then determines how wide the margin of error must be. The goal of every Six Sigma team should be to have a confidence level of 95%, which is the level most commonly used by professional statisticians. The team working on Mrs. Smith's campaign can then say, with 95% confidence, that their reported range captures the beliefs of the entire population.
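One way to see how the confidence level feeds into the margin of error is through the multiplier (the z-score) that the normal approximation assigns to each level. The sketch below uses Python's standard `statistics.NormalDist`; the helper name `z_score` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided z-score for a confidence level such as 0.95."""
    # Split the leftover probability evenly between the two tails.
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_score(level):.3f}")
```

A higher confidence level calls for a larger multiplier, so for the same sample the reported range becomes wider. The 95% level corresponds to a multiplier of about 1.96.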
Value Stream Analysis
Analysis begins with looking at the way processes are currently handled and comparing the current state to the desired way to run these processes. Analysis also identifies ways to eliminate the discrepancy between the two. This is called Value Stream Analysis.
The concept of the value stream is a simple one: the path of creating a product or offering a service, from the time the customer places an order or request to the time the product or service is delivered, is called a value stream. A value stream includes both value added and non-value added activities.
The Six Sigma approach to the value stream attempts to reduce and eliminate as much waste in the stream as possible. Waste includes physical waste, such as useless byproducts of a manufacturing process, defects in products, and excess inventory. It also includes intangibles, such as unnecessary paper processing, unnecessary moving of items or people, and unnecessary waiting.
This is an important concept in the Six Sigma program because customers will pay for value. Every step in the value stream has a cost, but waste does not add value. Customers are willing to pay for the cost of value added activities but not for non-value added activities.
Value stream mapping is a process through which waste in the value stream is identified. Work can be either logical or physical, and the value stream map includes the steps taken in the value stream as well as how information flows to facilitate the work in the value stream.
Making a flow chart is a recommended way to map the value stream. The request or order is the first box, and the delivery is the last box. In between are boxes that identify the steps in the value stream as well as the decisions that affect the workflow. Once every step in the value stream is mapped, it is time to identify which steps directly add value to the finished product or service, and which steps do not.
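The same mapping exercise can be sketched in code by recording each box of the flow chart along with a flag for whether it adds value. Everything in this example is hypothetical: the step names, the times, and the helper functions are assumptions made for illustration, not part of any standard Six Sigma tool.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    minutes: float
    adds_value: bool


# Hypothetical order-to-delivery value stream; names and times are made up.
value_stream = [
    Step("Receive order", 5, False),
    Step("Assemble product", 40, True),
    Step("Wait for inspection", 120, False),
    Step("Inspect", 15, False),
    Step("Package and ship", 20, True),
]


def value_added_ratio(steps):
    """Share of total time spent on value added steps."""
    total = sum(step.minutes for step in steps)
    value_added = sum(step.minutes for step in steps if step.adds_value)
    return value_added / total


def non_value_added(steps):
    """Names of the steps flagged as not adding value, in stream order."""
    return [step.name for step in steps if not step.adds_value]


print(f"Value added share of time: {value_added_ratio(value_stream):.0%}")
print("Candidates for reduction:", non_value_added(value_stream))
```

In this made-up stream only 60 of 200 minutes add value, and the waiting step stands out as the largest candidate for reduction.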
Often, the non-value added steps cannot be eliminated immediately or without consequence. For example, some industries have regulations that require certain quality control or testing steps that do not add value but must be performed. In other cases, general audits and inspections are part of the process. There may also be processing of information that doesn't directly add value but is important to record to expedite a future process or for billing purposes. However, non-value added activities can still be eliminated if there are no consequences in doing so, and those that cannot be eliminated immediately can perhaps be reduced through process improvement or eliminated later.
It is important to note that in eliminating waste in the value stream, there may be resistance to change, particularly from employees and areas that are identified as part of non-value added activities because they may view their jobs or tasks as threatened. However, there is a responsibility to shareholders to eliminate as much waste as possible since customers do not pay for it; the organization does. If employees are assured that they will receive fair treatment, there is generally more cooperation. Additionally, when waste is eliminated and value is redefined, there is usually an increase in customers and orders that can even outpace the conversion from old value streams to new ones, and this can often comfort and encourage employees through the process.
Variation Analysis
When conducting a Value Stream Analysis within a Six Sigma project, dealing with variation in data is crucial. A variation is a result that was not expected, and it needs to be investigated and explained. The Measure stage of the project reveals where variations occur as well as how to respond to them. Each variation has its own effect on the overall value stream. Statisticians measure variation in a number of ways.
Regardless of how data is analyzed, the end result should give some prediction of where data will fall depending on particular causes. One way to develop such a prediction is to use a point estimate, which is a single value calculated from a sample that stands in for a population parameter. The sample mean and the sample standard deviation are common examples.
A range of values within which the population parameter is expected to fall is called a confidence interval. This can be of greater use than a single value because it shows how large the sample's margin of error is, as discussed earlier. Practically all samples contain some error, and an interval is a way to help interpret it.
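To make the difference between a single estimated value and a range concrete, the sketch below computes the sample mean and sample standard deviation for a small data set and then builds a range around the mean. The lifetime figures are invented for illustration, and the range uses a normal approximation as a simplifying assumption; for a sample this small, a t-distribution would normally give a slightly wider and more appropriate range.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical sample: useful lifetime of ten units, in months.
lifetimes = [23.1, 24.8, 22.5, 25.3, 24.0, 23.7, 26.1, 22.9, 24.4, 23.6]

# Single-value estimates standing in for the population parameters.
sample_mean = mean(lifetimes)
sample_sd = stdev(lifetimes)  # sample standard deviation (n - 1 denominator)


def mean_confidence_interval(data, confidence_level=0.95):
    """Normal-approximation range for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    margin = z * stdev(data) / math.sqrt(len(data))
    center = mean(data)
    return center - margin, center + margin


low, high = mean_confidence_interval(lifetimes)
print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
print(f"95% range for the mean: {low:.2f} to {high:.2f}")
```

The single value answers "what is our best guess?", while the range answers "how far off might that guess be?".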
Sometimes, statistical inference is the best way to deal with data variations. In these cases, there is usually a scientific inference or hypothesis that needs to be supported or refuted. Experiments are conducted, and if the resulting data varies, the Six Sigma team creates a hypothesis, or a theory about why the data varies.
When working with a hypothesis, the first step is to state a hypothesis about the entire population. Next, samples are collected from the population. Third, statistics are calculated on these samples. Finally, the hypothesis is rejected or retained based on those statistics.
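The four steps can be sketched with a simple two-sided z-test on a proportion. This is one of many possible tests and is used here only as an illustration; the defect rate of 5%, the sample of 400 units with 32 defects, the 0.05 significance threshold, and the function name are all assumptions.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-sided z-test of the hypothesis that the population proportion is p0."""
    p_hat = successes / n                        # statistic from the sample
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha           # True means "reject"


# Step 1: hypothesis about the population: the defect rate is 5%.
# Step 2: a hypothetical sample of 400 units contains 32 defects.
# Steps 3 and 4: compute the statistic and decide.
z, p_value, reject = proportion_z_test(successes=32, n=400, p0=0.05)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject = {reject}")
```

With these made-up numbers the observed defect rate of 8% is far enough from 5% that the hypothesis is rejected. A sample with exactly 20 defects would give no grounds to reject it.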
Because errors are possible in any human-driven process, resampling the data is a way to help reduce these errors. In a nutshell, a computer redraws samples from the same data hundreds or thousands of times and repeats the analysis each time to see if the results are consistent.
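One common form of resampling is the bootstrap, in which the computer repeatedly draws new samples, with replacement, from the original data and recalculates the statistic each time. The sketch below applies it to a sample mean. The cycle times, the number of resamples, the seed, and the function names are assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_means(data, resamples=2000, seed=0):
    """Means of many samples drawn, with replacement, from the same data."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    return sorted(
        mean(rng.choices(data, k=len(data))) for _ in range(resamples)
    )


def percentile_interval(sorted_values, confidence_level=0.95):
    """Middle share of the resampled values (simple percentile method)."""
    tail = (1 - confidence_level) / 2
    low_index = int(tail * len(sorted_values))
    high_index = int((1 - tail) * len(sorted_values)) - 1
    return sorted_values[low_index], sorted_values[high_index]


# Hypothetical cycle times in minutes.
cycle_times = [12.1, 11.8, 13.4, 12.9, 12.2, 14.0, 11.5, 12.7, 13.1, 12.4]
low, high = percentile_interval(bootstrap_means(cycle_times))
print(f"Resampled 95% range for the mean: {low:.2f} to {high:.2f}")
```

If the resampled means cluster tightly, the original result is consistent; a wide spread suggests the conclusion depends heavily on which observations happened to be collected.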
Regression and Correlation Analysis
In dealing with variables in data in a Six Sigma project, it is often useful to see how these variables relate to and influence one another. This is also helpful in determining the effects of variations within the Value Stream Analysis. Both Regression and Correlation Analysis will help.
Regression Analysis is used when there are one or more independent variables and one dependent variable. The independent variables are the ones that can be controlled or chosen, and the dependent variable is the one that changes in response. The analysis considers how the dependent variable is distributed when the other variables are held steady.
On the other hand, Correlation Analysis looks for a linear relationship among variables. There are no variables controlled as in Regression Analysis; instead, variables simply react to one another or to other forces. It is important to remember that in Correlation Analysis, causation cannot be proven by statistics alone.
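The strength of a linear relationship is commonly summarized by the Pearson correlation coefficient, which runs from -1 (a perfect downward line) through 0 (no linear relationship) to 1 (a perfect upward line). The sketch below computes it by hand; the training and defect figures are invented for illustration, and the function name is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient, from -1 to 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Hypothetical data: training hours per operator vs. defects per 1,000 units.
training_hours = [2, 4, 6, 8, 10]
defects = [19, 17, 12, 10, 6]
print(f"r = {pearson_r(training_hours, defects):.3f}")
```

A strongly negative value here would show that defects tend to fall as training rises, but it would not by itself show that training is the cause.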
Scatter diagrams can be effective at analyzing relationships between variables. Using X- and Y-axes, points are plotted to show where one variable falls in relation to another. After the points are plotted, patterns of the points can reveal relationships and trends between the variables.
If the points in a scatter diagram fall along a straight line, the variables have a linear relationship. However, errors in data can scatter the points around that line. A least-squares fitting model uses mathematical formulas to find the line that best fits the data, choosing the line that minimizes the sum of the squared vertical distances between the points and the line.
Ultimately, analyzing variation in data and relationships between variables is imperative in a Six Sigma project because it points to likely cause-and-effect relationships. Understanding causes for certain effects in the Value Stream Analysis will also help minimize errors and variations that create waste.
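For a single independent variable, the least-squares line has a closed-form solution: the slope is the summed products of the x and y deviations divided by the summed squared x deviations, and the intercept follows from the two means. The sketch below applies those formulas to a handful of invented points; the data and the function name are assumptions made for illustration.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing squared vertical errors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scatter-diagram points that sit close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

None of the five points lies exactly on the fitted line, yet the line captures the overall trend, which is the sense in which the fit takes errors into consideration.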