## Statistical Process Control

Read this chapter on the basics of statistical process control (SPC). SPC is a standard tool for monitoring whether a process is performing as expected and, if not, where problems occur. While reading, consider how this kind of tool factors in process capacity management.

### X-Bar, R-Charts, and S-Charts

Three types of control charts are used to determine whether data is out of control: $x$-bar charts, $r$-charts, and $s$-charts. An $x$-bar chart is often paired with either an $r$-chart or an $s$-chart to give a complete picture of the same set of data.

#### Pairing $X$-Bar with $R$-Charts

$X$-Bar (average) charts and $R$ (range) charts are often paired together. The $X$-Bar chart displays the centerline, which is calculated using the grand average, and the upper and lower control limits, which are calculated using the average range. Future experimental subsets are plotted against these values, which shows the centering of the subset values. The $R$-chart plots the average range and the limits of the range. Again, future experimental subsets are plotted relative to these values. The $R$-chart displays the dispersion of the subsets. Because $X$-Bar and $R$-charts plot subgroup statistics, they should only be used when subgroups really make sense. For example, in a Gage R&R study, when operators are testing in duplicates or more, each subgroup really represents the same group.

#### Pairing $X$-Bar with $S$-Charts

Alternatively, $X$-Bar charts can be paired with $S$-charts (standard deviation). This is typically done when the size of the subsets is large. For larger subsets, the range is a poor statistic for estimating the spread of the subsets, so the standard deviation is used instead. In this case, the $X$-Bar chart will display control limits that are calculated using the average standard deviation. $S$-charts are similar to $R$-charts; however, instead of the range, they track the standard deviation of multiple subsets.

#### Smoothing Data with a Moving Average

If smoother data is desired, the moving average method is one option. This method involves taking the average of a number of points and using that average for the middle data point. From this point on, the data is treated the same as any normal group of $k$ subsets. Though this method will produce a smoother curve, it introduces a lag in detecting points, which may be problematic if the points are out of the acceptable range. This time lag keeps the control system from reacting to the problem until after the average is found. For this reason, moving average charts are appropriate mainly for slower processes that can handle the lag.

For example, let us calculate a smoothed value for a set of data that is sampled every second. We will use an average of 10 points, although in practice there is no set number of data points that should be used. For the point $t = 50$, we must wait until data has been collected through $t = 54$. The points for $t = 45$ through $t = 54$ are then averaged and used as the function value at $t = 50$. For the next point, $t = 51$, the average of the points for $t = 46$ through $t = 55$ is used, and so on. If this is still confusing, please see moving average for a more detailed explanation.
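The 10-point example above can be sketched in Python. This is only an illustration: the function name `centered_moving_average` and the test signal are hypothetical, and the window is assumed to cover five points before $t$ and four points after it, matching the $t = 45$ through $t = 54$ window used for $t = 50$.

```python
def centered_moving_average(values, window=10):
    """Return {t: average}, where each average covers the points around t."""
    before = window // 2         # points taken before t (5 when window is 10)
    after = window - before - 1  # points taken after t (4 when window is 10)
    smoothed = {}
    # Only points with a complete window on both sides get a smoothed value.
    for t in range(before, len(values) - after):
        chunk = values[t - before : t + after + 1]
        smoothed[t] = sum(chunk) / window
    return smoothed


# Hypothetical signal: one sample per second, value equal to the time index.
signal = [float(t) for t in range(60)]
smoothed = centered_moving_average(signal, window=10)

print(smoothed[50])  # average of the samples at t = 45 .. 54
print(smoothed[51])  # average of the samples at t = 46 .. 55
```

Note that the last smoothed point trails the newest sample by four seconds, which is the detection lag described above.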

Control charts can determine whether a process is behaving in an "unusual" way.

Note: The upper and lower control limits are calculated using the grand average and either the average range or the average sigma. Example calculations are shown in the Creating Control Charts section.

The quality of the individual points of a subset is determined unstable if any of the following occurs:

• Rule 1: Any point falls beyond $3\sigma$ from the centerline (this is represented by the upper and lower control limits).
• Rule 2: Two out of three consecutive points fall beyond $2\sigma$ on the same side of the centerline.
• Rule 3: Four out of five consecutive points fall beyond $1\sigma$ on the same side of the centerline.
• Rule 4: Nine or more consecutive points fall on the same side of the centerline.
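Rules 1 through 4 can be checked mechanically. The sketch below is a minimal illustration, not a standard library routine: the function name `check_rules` and the sample data are hypothetical, and `sigma` is assumed to be one third of the distance from the centerline to the upper control limit.

```python
def check_rules(points, center, sigma):
    """Return (index, rule) pairs for each point at which a rule is triggered."""
    # Signed distance of every point from the centerline, in units of sigma.
    z = [(p - center) / sigma for p in points]
    violations = []
    for i in range(len(z)):
        # Rule 1: a single point beyond 3 sigma on either side.
        if abs(z[i]) > 3:
            violations.append((i, 1))
        # Rules 2-4 are checked separately above (+1) and below (-1) the centerline.
        for side in (1, -1):
            last9 = [side * v for v in z[max(0, i - 8) : i + 1]]
            # Rule 2: two of the last three points beyond 2 sigma.
            if i >= 2 and sum(v > 2 for v in last9[-3:]) >= 2:
                violations.append((i, 2))
            # Rule 3: four of the last five points beyond 1 sigma.
            if i >= 4 and sum(v > 1 for v in last9[-5:]) >= 4:
                violations.append((i, 3))
            # Rule 4: nine consecutive points on the same side.
            if i >= 8 and all(v > 0 for v in last9):
                violations.append((i, 4))
    return violations


# Hypothetical points, already expressed relative to center = 0 and sigma = 1.
sample = [0.5, -0.2, 3.5, 0.1, 2.2, 0.3, 2.4]
print(check_rules(sample, center=0.0, sigma=1.0))
```

Each pair in the result gives the index of the point that completes the pattern and the number of the rule that fired.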

The quality of a subset is determined unstable according to the following rules:

1. Any subset value is more than three standard deviations from the centerline.

2. Two consecutive subset values are more than two standard deviations from the centerline and are on the same side of the centerline.

3. Three consecutive subset values are more than one standard deviation from the centerline and are on the same side of the centerline.

#### Creating Control Charts

There are a number of methods for establishing upper and lower control limits on control charts. We will discuss the method for subsets in which the number of components, $n$, is no more than $15$. For methods involving $n > 15$ and other techniques, see Process Control and Optimization, Liptak, 2.34. The table of constants for computing limits and the limit equations are presented below.

Please note that Table A below does NOT contain data for a sample problem. Any time you make a control chart, you refer to this table. The values in the table are used in the equations for the upper control limit (UCL), lower control limit (LCL), etc. This will be explained in the examples below. If you are interested in how these constants were derived, there is a more detailed explanation in Control Chart Constants.

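| Subgroup size $n$ | A2 ($x$-bar chart, using $R_A$) | A3 ($x$-bar chart, using $S_A$) | B3 ($S$-chart) | B4 ($S$-chart) | D3 ($R$-chart) | D4 ($R$-chart) |
|---|---|---|---|---|---|---|
| 2 | 1.886 | 2.659 | 0 | 3.267 | 0 | 3.268 |
| 3 | 1.023 | 1.954 | 0 | 2.568 | 0 | 2.574 |
| 4 | 0.729 | 1.628 | 0 | 2.266 | 0 | 2.282 |
| 5 | 0.577 | 1.427 | 0 | 2.089 | 0 | 2.114 |
| 6 | 0.483 | 1.287 | 0.03 | 1.97 | 0 | 2.004 |
| 7 | 0.419 | 1.182 | 0.118 | 1.882 | 0.076 | 1.924 |
| 8 | 0.373 | 1.099 | 0.185 | 1.815 | 0.136 | 1.864 |
| 9 | 0.337 | 1.032 | 0.239 | 1.761 | 0.184 | 1.816 |
| 10 | 0.308 | 0.975 | 0.284 | 1.716 | 0.223 | 1.777 |
| 11 | 0.285 | 0.927 | 0.322 | 1.678 | 0.256 | 1.744 |
| 12 | 0.266 | 0.886 | 0.354 | 1.646 | 0.283 | 1.717 |
| 13 | 0.249 | 0.85 | 0.382 | 1.619 | 0.307 | 1.693 |
| 14 | 0.235 | 0.817 | 0.407 | 1.593 | 0.328 | 1.672 |
| 15 | 0.223 | 0.789 | 0.428 | 1.572 | 0.347 | 1.653 |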

Table A: Table of Constants

#### Determining the Value of $n$, the Subgroup Size

To determine the upper (UCL) and lower (LCL) control limits for the $x$-bar chart, you need to know the subgroup size ($n$) of your data. Once you know the value of $n$, you can look up the correct constants (A2, A3, etc.) to complete your control chart. This can be confusing when you first attempt to create an $x$-bar control chart. The value of $n$ is the number of readings within each data point (subgroup), not the number of data points. For example, if you are taking temperature measurements every minute and there are three temperature readings per minute, then the value of $n$ would be $3$. If the same experiment were taking four temperature readings per minute, then the value of $n$ would be $4$. Here are some examples with different tables of data to help you further in determining $n$:

| Subset # | Values (kg) |
|---|---|
| 1 (control) | 1.02, 1.03, 0.98, 0.99 |
| 2 (control) | 0.96, 1.01, 1.02, 1.01 |
| 3 (control) | 0.99, 1.02, 1.03, 0.98 |
| 4 (control) | 0.96, 0.97, 1.02, 0.98 |
| 5 (control) | 1.03, 1.04, 0.95, 1.00 |
| 6 (control) | 0.99, 0.99, 1.00, 0.97 |
| 7 (control) | 1.02, 0.98, 1.01, 1.02 |
| 8 (experimental) | 1.02, 0.99, 1.01, 0.99 |
| 9 (experimental) | 1.01, 0.99, 0.97, 1.03 |
| 10 (experimental) | 1.02, 0.98, 0.99, 1.00 |
| 11 (experimental) | 0.98, 0.97, 1.02, 1.03 |

Example 1: $n= 4$ since there are four readings of kg.

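| Time (hours) | pH reading 1 | pH reading 2 | pH reading 3 | pH reading 4 |
|---|---|---|---|---|
| 1 | 7.00 | 7.30 | 6.99 | 7.00 |
| 2 | 7.12 | 7.25 | 7.12 | 7.20 |
| 3 | 7.20 | 7.16 | 7.20 | 7.16 |
| 4 | 6.98 | 7.00 | 6.94 | 7.00 |
| 5 | 6.99 | 6.99 | 6.99 | 6.98 |
| 6 | 7.00 | 6.93 | 7.02 | 6.93 |
| 7 | 6.92 | 7.00 | 6.92 | 7.02 |
| 8 | 6.88 | 6.82 | 6.94 | 6.99 |
| 9 | 7.10 | 7.00 | 7.00 | 7.00 |
| 10 | 7.21 | 7.02 | 7.21 | 7.04 |
| 11 | 7.01 | 6.86 | 7.01 | 6.90 |
| 12 | 6.86 | 6.98 | 6.90 | 6.98 |
| 13 | 6.90 | 7.00 | 6.87 | 7.00 |
| 14 | 7.01 | 7.04 | 7.01 | 7.05 |
| 15 | 7.00 | 6.95 | 7.00 | 6.99 |
| 16 | 7.09 | 7.20 | 7.03 | 7.20 |
| 17 | 6.89 | 7.14 | 6.87 | 7.15 |
| 18 | 6.98 | 6.80 | 6.98 | 6.89 |
| 19 | 7.00 | 6.90 | 7.00 | 6.90 |
| 20 | 7.20 | 7.00 | 7.23 | 7.00 |
| 21 | 7.04 | 7.03 | 7.08 | 7.00 |
| 22 | 6.90 | 6.92 | 6.98 | 6.92 |
| 23 | 7.00 | 7.00 | 7.00 | 7.00 |
| 24 | 7.00 | 6.97 | 7.01 | 6.98 |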

Example 2: $n= 4$ since there are four readings of pH.

| Time (min) | T1 | T2 | T3 |
|---|---|---|---|
| 0 | 305.1578 | 311.1926 | 303.0032 |
| 1 | 308.6441 | 299.2898 | 307.9012 |
| 2 | 304.4789 | 308.7662 | 312.273 |
| 3 | 303.2384 | 303.7872 | 308.4915 |
| 4 | 316.6728 | 303.9563 | 303.3419 |
| 5 | 297.3459 | 308.0937 | 306.353 |
| 6 | 310.0358 | 304.9309 | 304.5568 |
| 7 | 302.2579 | 304.0973 | 317.315 |
| 8 | 305.5338 | 308.5081 | 308.1174 |
| 9 | 311.6743 | 302.4106 | 305.5727 |
| 10 | 303.535 | 312.9508 | 305.1281 |
| 11 | 307.5137 | 312.0491 | 307.6593 |
| 12 | 310.6001 | 305.5229 | 311.1861 |
| 13 | 307.6121 | 313.0331 | 313.4924 |
| 14 | 313.2346 | 312.1953 | 297.2964 |
| 15 | 306.0061 | 301.9239 | 298.6282 |
| 16 | 310.8455 | 308.7776 | 300.404 |
| 17 | 306.6952 | 299.0904 | 304.7548 |
| 18 | 305.2398 | 307.3239 | 297.1759 |
| 19 | 303.3781 | 305.8241 | 306.5276 |
| 20 | 309.3113 | 316.0451 | 309.9065 |

Example 3: $n= 3$ since there are three readings of temperature.

After creating multiple control charts, determining the value of n will become quite easy.

#### Calculating UCL and LCL

For the $X$-Bar chart the following equations can be used to establish limits, where $X_{G A}$ is the grand average, $R_{A}$ is the average range, and $S_{A}$ is the average standard deviation.

Calculating Grand Average, Average Range and Average Standard Deviation

To calculate the grand average, first find the average of the $n$ readings at each time point. The grand average is the average of the averages at each time point.

To calculate the average range, first determine the range of the $n$ readings at each time point. The average range is the average of the ranges at each time point.

To calculate the average standard deviation, first determine the standard deviation of the $n$ readings at each time point. The average standard deviation is the average of the standard deviations at each time point.

Note: You will need to calculate either the average range or the average standard deviation, not both.
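As an illustration, the three quantities can be computed for the seven control subsets of Example 1. This is a sketch under one stated assumption: the per-subset standard deviation is taken to be the sample standard deviation (divisor $n - 1$), which the text does not specify. The variable names are chosen here for illustration.

```python
from statistics import mean, stdev

# Control subsets 1-7 from Example 1 (values in kg); each subset has n = 4 readings.
subsets = [
    [1.02, 1.03, 0.98, 0.99],
    [0.96, 1.01, 1.02, 1.01],
    [0.99, 1.02, 1.03, 0.98],
    [0.96, 0.97, 1.02, 0.98],
    [1.03, 1.04, 0.95, 1.00],
    [0.99, 0.99, 1.00, 0.97],
    [1.02, 0.98, 1.01, 1.02],
]

subset_means = [mean(s) for s in subsets]
subset_ranges = [max(s) - min(s) for s in subsets]
# Assumption: sample standard deviation (divisor n - 1) within each subset.
subset_stdevs = [stdev(s) for s in subsets]

x_ga = mean(subset_means)   # grand average
r_a = mean(subset_ranges)   # average range
s_a = mean(subset_stdevs)   # average standard deviation

print(f"X_GA = {x_ga:.4f} kg, R_A = {r_a:.4f} kg, S_A = {s_a:.4f} kg")
```

For this data the grand average comes out to roughly 0.9989 kg and the average range to roughly 0.0543 kg.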

For $X$-bar charts, the UCL and LCL may be determined as follows:

$$\begin{aligned}&\text{Upper Control Limit (UCL)} = X_{GA} + A_{2} R_{A} \\ &\text{Lower Control Limit (LCL)} = X_{GA} - A_{2} R_{A}\end{aligned}$$

Alternatively, $S_{A}$ can be used as well to calculate UCL and LCL:

$\text { Upper Control Limit (UCL) }=X_{G A}+A_{3} S_{A}$

$\text {Lower Control Limit (LCL) }=X_{G A}-A_{3} S_{A}$

The centerline is simply $X_{G A}$.

For $R$-charts, the UCL and LCL may be determined as follows:

$$\begin{aligned}\mathrm{UCL} &= D_{4} R_{A} \\ \mathrm{LCL} &= D_{3} R_{A}\end{aligned}$$

The centerline is the value $R_{A}$.

For $S$-charts, the UCL and LCL may be determined as follows:

$$\begin{aligned}\mathrm{UCL} &= B_{4} S_{A} \\ \mathrm{LCL} &= B_{3} S_{A}\end{aligned}$$

The centerline is $S_A$.
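The limit equations above translate directly into code. The sketch below copies three rows of Table A into a dictionary; the function names and the `TABLE_A` layout are hypothetical, and the inputs are approximately the grand average and average range of the seven control subsets in Example 1 ($n = 4$).

```python
# Rows of Table A keyed by subgroup size n (only n = 3, 4, 5 are copied here).
TABLE_A = {
    3: {"A2": 1.023, "A3": 1.954, "B3": 0.0, "B4": 2.568, "D3": 0.0, "D4": 2.574},
    4: {"A2": 0.729, "A3": 1.628, "B3": 0.0, "B4": 2.266, "D3": 0.0, "D4": 2.282},
    5: {"A2": 0.577, "A3": 1.427, "B3": 0.0, "B4": 2.089, "D3": 0.0, "D4": 2.114},
}


def xbar_limits_from_range(x_ga, r_a, n):
    """Return (LCL, centerline, UCL) for the x-bar chart using the average range."""
    c = TABLE_A[n]
    return x_ga - c["A2"] * r_a, x_ga, x_ga + c["A2"] * r_a


def xbar_limits_from_sigma(x_ga, s_a, n):
    """Return (LCL, centerline, UCL) for the x-bar chart using the average sigma."""
    c = TABLE_A[n]
    return x_ga - c["A3"] * s_a, x_ga, x_ga + c["A3"] * s_a


def r_chart_limits(r_a, n):
    """Return (LCL, centerline, UCL) for the R-chart."""
    c = TABLE_A[n]
    return c["D3"] * r_a, r_a, c["D4"] * r_a


def s_chart_limits(s_a, n):
    """Return (LCL, centerline, UCL) for the S-chart."""
    c = TABLE_A[n]
    return c["B3"] * s_a, s_a, c["B4"] * s_a


# Approximate values for the control subsets of Example 1 (kg), with n = 4.
x_ga, r_a = 0.9989, 0.0543
print(xbar_limits_from_range(x_ga, r_a, n=4))
print(r_chart_limits(r_a, n=4))
```

Only one of the two $x$-bar functions is needed for a given chart pairing: the range version with an $R$-chart, the sigma version with an $S$-chart.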

The following flow chart demonstrates the general method for constructing an $X$-bar chart, $R$-chart, or $S$-chart:

#### Calculating Region Boundaries

To determine if your system is out of control, you will need to section your data into regions A, B, and C, below and above the grand average. These regions are shown in Figure III. To calculate the boundaries between these regions, you must first calculate the UCL and LCL. The boundaries are evenly spaced between the UCL and LCL. One way to calculate the boundaries is shown below.

Boundary between A and B above $X_{GA}$: $X_{GA} + \left(\mathrm{UCL} - X_{GA}\right) \cdot 2/3$

Boundary between B and C above $X_{GA}$: $X_{GA} + \left(\mathrm{UCL} - X_{GA}\right) \cdot 1/3$

Boundary between A and B below $X_{GA}$: $\mathrm{LCL} + \left(X_{GA} - \mathrm{LCL}\right) \cdot 1/3$

Boundary between B and C below $X_{GA}$: $\mathrm{LCL} + \left(X_{GA} - \mathrm{LCL}\right) \cdot 2/3$
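A short sketch of the boundary calculation follows. It relies on the statement that the boundaries are evenly spaced between the centerline and each control limit, and it assumes region A is the outermost band on each side (next to the control limit) and region C is the innermost. The function name and the example numbers are hypothetical.

```python
def region_boundaries(x_ga, ucl, lcl):
    """Return the four region boundaries, evenly spaced on each side of x_ga."""
    upper_step = (ucl - x_ga) / 3  # width of one region above the centerline
    lower_step = (x_ga - lcl) / 3  # width of one region below the centerline
    return {
        "A/B above": x_ga + 2 * upper_step,
        "B/C above": x_ga + upper_step,
        "B/C below": x_ga - lower_step,
        "A/B below": x_ga - 2 * lower_step,
    }


# Hypothetical chart with centerline 10, UCL 13, and LCL 7.
for name, value in region_boundaries(10.0, 13.0, 7.0).items():
    print(name, value)
```

Since the control limits sit three standard deviations from the centerline, these boundaries correspond to the $1\sigma$ and $2\sigma$ lines used in the stability rules.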