CONTROL CHARTS FOR STATIONARY VECTOR ARMA PROCESSES

In practice, there are many quality control situations where a product under consideration may have two or more interrelated quality characteristics and observations of each characteristic are serially correlated. One of the objectives of management is to investigate whether or not all these characteristics of the product simultaneously satisfy the required specifications. To the author's best knowledge, no concrete attempts have been made so far to construct the control charts for such situations, particularly when the data arise from vector autoregressivemoving average (VARMA) processes. It is this problem that has been addressed in this paper. A few methods are suggested for constructing the control charts. When assumptions about independence and normality break down, a bootstrap method, perhaps for the first time, is suggested to attack the problem. Some illustrative examples are discussed. OPSOMMING In die praktyk is daar vele kwaliteitbeheersituasies waar 'n betrokke produk een of twee onderling verbonde kenmerke kan hê en waarnemings van elke kenmerk serie gekorreleer is. Een van die doelwitte van bestuur is om te ondersoek of hierdie kenmerke van die produk gelyktydig aan die vereiste spesifikasies voldoen al dan nie. Na die outeur se beste wete, is daar tot dusver geen daadwerklike pogings aangewend om die beheergrafieke vir sodanige situasies op te stel nie, veral waar die data ontstaan uit vektor outoregressief bewegende gemiddelde (VARMA) prosesse. Hierdie probleem geniet in hierdie artikel aandag. 'n Aantal metodes vir die opstel van die beheergrafieke word voorgestel. Wanneer aannames oor selfstandigheid en normaliteit faal, word 'n skoenlusmetode voorgestel om die probleem die hoof te bied. 'n Aantal voorbeelde ter toeligting word bespreek. 1 Visiting as a research professor on retirement from Monash University, Clayton Vic3168, Melbourne, Australia. SA Journal of Industrial Engineering 2003 14(1): 27-37 http://sajie.journals.ac.za


INTRODUCTION
In many quality control situations, especially in modern industries, the product under examination may have two or more interrelated quality characteristics and, furthermore, observations of each characteristic may be autocorrelated.The objective of the management is to investigate whether or not all these characteristics simultaneously satisfy the required specifications.For instance, a cylinder has both an inner diameter (X 1 ) and an outer diameter (X 2 ) which together determine the usefulness of the item.If both characteristics are uncorrelated and the observations of both are independent, the process may be controlled by applying the well-known Shewhart charts.However, if the variables are interdependent, the use of the Shewhart charts would be misleading.
In particular, a standard multivariate quality control problem is to consider whether or not an observed vector of measurements X = (X 1 , X 2 , ..., X p )' of a certain item exhibits any evidence of a location shift from a set of given mean values : 0 = (: 1 , ..., : p )'.When the vector variables are correlated and the observations of them are independent, the multivariate control charts have been developed by several authors, including Hotelling [7], Jackson [8], Montgomery [9] and Hayter et al. [6] (see References).However, for most processes, it is unrealistic to expect the observations to be independent.Instead, the quality measurements X i1 , X i2 , ..., X in of the i-th variable X i , i = 1, ..., p may be autocorrelated (see Montgomery [9]).
It has been found in practice that if observations of quality characteristics exhibit even low levels of correlation, the conventional control charts mislead in the form of too many false alarms.Examples include chemical processes where consecutive measurements of product characteristics are often highly correlated for a variety of reasons.In fact, in all manufacturing processes, when the interval of sampling is narrowed down, observations of the process become correlated over time.
In the univariate case, the effect of autocorrelated observations of control charts has been studied by several authors (see Wardell et al. [11]).In the multivariate situation, however, no concrete results are available as yet, to the author's best knowledge (see Alt [1,2]).It is this problem which is studied in this paper and some interesting results are obtained.Three approaches are suggested, namely (i) X charts for dependent observations of correlated quality characteristics.(ii) Fitting of a model to the data and application of traditional methods of residuals.(iii) Bootstrap methods.
The plan of the paper is as follows: Section 2 gives a brief account of multivariate control charts when observations of each variable are independent.In Section 3, we state and prove some useful theorems.In Section 4, we discuss the problem and suggest X charts for autocorrelated observations of correlated quality characteristics.Section 5 deals with the fitting of a vector ARMA model to the data and studying residuals.When the usual normality assumptions do not hold good, a bootstrap method is suggested in Section 6.

REVIEW OF MULTIVARIATE CONTROL CHARTS WHEN OBSERVATIONS ARE INDEPENDENT
In the univariate case when a single quality characteristic X is assumed to be normally distributed with mean µ 0 and variance , the Shewhart control chart has the following control limits: where n is the sample size and Suppose that the p-quality characteristics are jointly distributed as a p-variate normal with a known mean 0 µ and covariance matrix 0 ∑ and that a random sample of size n is available from the process.The likelihood ratio test 0 :

=
is the 1 × p vector of sample means and is the χ corresponding to the significance level α with p degrees of freedom.Then the control chart is formed by plotting the values of Q on the chart with UCL = and LCL = 0.If a value falls above the UCL, the process is considered to be out of control and the assignable causes are sought.This is referred to as a χ If µ and Σ are unknown, they can be estimated from preliminary samples taken when the process was in control.Let there be m such samples each of size n.The sample means variances and covariances can be calculated from these m samples as usual.Let X and S denote respectively the pooled estimates of the vector mean µ and the covariance matrix Σ.
Then the test statistic is which is the Hotelling T 2 -statistic, X is the sample mean of new observations.The control chart formed by this statistic is known as the T 2 chart.If µ and Σ are estimates using large samples (say n ≥25), it is customary to use the UCL = as the upper control limit on the T 2 chart, otherwise the UCL and LCL are based on the F-distribution which is closely related to the T 2 -distribution by where p and n-p denote the degrees of freedom for the statistic F. In that case, the (1-α)100% UCL is given by and the LCL by

MULTIVARIATE CONTROL CHARTS WHEN OBSERVATIONS ARE INTERDEPENDENT
We notice from §2 that the multivariate quality control methodology is based on the assumption that the serially generated data are independent and multinormally distributed.If these assumptions are relaxed, the presence of serial correlation in observations greatly affects the distribution of statistic Q in (2) and/or T 2 in (3).A great deal of data in business, economics, engineering and the natural sciences occur in the form of time series where observations are no longer independent.The presence or absence of autocorrelation in a time series or cross-correlation between two time series can be tested using the well-known test statistics available in the literature.Then the problem is what to do once the dependence among observations is confirmed.In answering this, let us first consider the following theorems to be used later.

Theorem 3.2
Let {Y t } be a zero-mean, stationary p Η 1 vector process and let Y 1 , Y 2 , ..., Y n be a sequence of random observations from this process.Then the autocovariance function of the mean For simplicity, we develop the control charts for vector MA(1) and vector AR(1) processes in §4.1 and §4.2 respectively.It is worth pointing out at this stage that most of the univariate, as well as the multivariate, time series in practice are very well represented by lower-order ARMA processes such as MA(1), AR (1) or ARMA(1,1) processes.

Control charts for the vector MA(1) process
Example 4.1 For simplicity, let {X t }denote a bivariate MA(1) process defined by It is assumed that Θ and Σ are known.Suppose that z 1 , ..., z n is a realization of size n.Then from Theorems 3.1 and 3.2, it follows that where ∆ stands for "distributed approximately as", ( ) 2 were calculated using (12) from these (skipping a few values) 200 observations.Note that the autocorrelations at lags higher than one are all equal to zero.Hence one group of n = 5 values can be treated as uncorrelated with either the preceding or the succeeding group.
The calculated twenty χ 2 values were plotted in the χ 2 chart given below with UCL = χ 2 2,0.05 = 5.99.It may be noted that all points fall within limits, suggesting that the process is under control.

Example 4.2
Consider a vector AR(1) model defined by where δ = (δ 1 , δ 2 ) is a vector of constants, Φ is the coefficient matrix, Z t and ε t have been defined earlier.It can easily be seen that Let z 1 , z 2 , ..., z n be a realization of size n.Then from Theorems 3.1 and 3.2, it follows that where the process mean µ is given by µ = (I -Φ) -1 δ, z is the sample mean and ( ) , from ( 14) and (15).
For given Φ and Σ , Γ 0 may be obtained from where Vec(B) is a vector made up of the columns of B stacked one over the other in succession, starting with the first column and Φ⊗Φ is the Kronecker product.

Numerical example 4.2
Suppose that we are given It may be noticed from Figure 4.2 that only one point falls outside the 95% upper limit which is, of course, expected.

Remark 4.1
The numerical examples 4.1 and 4.2 are based on the simulated data for illustrative purposes.However, the methodology can very well be applied with equal ease to any real-world data.

Remark 4.2
When Φ and Σ are unknown, they can be replaced by their estimates (see Jenkins and Alavi (1981)).

MODELLING OF VECTOR TIME SERIES
Suppose that we are given a bivariate time series.When we plot the individual values on a three-dimensional graph, they may display systematic nonrandom patterns reflecting common causes which may be present throughout the data.In such a case, a casual inspection would not be able to separate the special and common causes.
A natural solution to this problem would be to model the systematic nonrandom patterns by a bivariate model (see Jenkins and Alavi [5], Singh and Peiris [10]).After fitting an optimum bivariate model to the given data, residuals may be determined.To these residuals, all the traditional methods such as the Hotelling T²-charts can be applied.If they are consistent with the randomness, the process may be considered to be in control and there is no need to look for any special causes.
Note 5.1 Since the higher-than-two-dimensional time series cannot be plotted, we can immediately proceed to modelling the series and then the corresponding residuals can be determined to which any traditional method of §2 can be applied.
Practical emphasis should be placed on trying to gain a better understanding of the process.The process can be improved by identifying the model and understanding the common causes making for autocorrelated and cross-correlated behaviour.Fitting a VARMA model would enable us to study the residuals and isolate the departures from control that may be attributed to special cases, otherwise these departures would be confounded with the dominant autoregressive behaviour of the data (ITSM can be used for fitting VAR(1)and VMA(1)models).

BOOTSTRAP SAMPLING METHODS
We noticed from Sections 4 and 5 that the upper and lower control limits are determined under the normality assumption.However, in practice, this assumption seldom holds good and many data display abnormality.In such cases, we advocate the use of bootstrap sampling methods for developing the control charts for nonnormal multivariate time series, particularly in view of the advent of fast computers.A procedure described below considers a bivariate AR(1) model for illustrative purposes.The procedure can easily be extended to vector ARMA models of higher orders, if required.
Let z' = (x, y) = {(x 1 , y 1 ), (x 2 , y 2 ), …, (x m , y m )} be a preliminary sample from the recent past from a bivariate AR(1) process under control defined at (13), that is, To generate a bootstrap realization of the time series, we arrange for convenience the preliminary sample z′ = (x, y) of size, say, m vertically (or horizontally) and consider all possible contiguous moving blocks of size, say, l, (l « m).In all, there will be m -(l -1) blocks.We then sample with replacement from these blocks and arrange them together as they occur to form a bootstrap time series.Just enough blocks are sampled to obtain a series of roughly the same length as the original series.That is, if the block length is l, we choose k blocks so that m ≈ lk (see Efron and Tibshirani [4]).An illustrative example is given below: Consider a bivariate AR(1) time series of length m = 15 arranged vertically (can be arranged horizontally as well) and divided into contiguous moving blocks each of size 5.There are 11 blocks in all.We then choose a random sample of three blocks with replacement, say 2, 9 and 7 and arrange them together vertically as they occur (see Table 6

JUSTIFICATION FOR MOVING BLOCK BOOTSTRAP
Since the observations are not independent, we cannot simply resample randomly from the individual observations in the original sample, as this would destroy autocorrelations (even at smaller lags) which we are trying to retain.With the moving block bootstrap, the idea is to choose a block size, say l, large enough so that the observations l units apart will be nearly independent (Theorem 3.1).By sampling the blocks of size l, one can retain the correlation present in observations less than l units apart.Another advantage of moving block bootstrap is that it is less mode-dependent, unlike the bootstrapping of individuals.

PROCEDURE FOR CONSTRUCTING UPPER AND LOWER CONTROL LIMITS
Having obtained a moving block bootstrap realization z * = (x * , y * ), we can determine the upper and the lower control limits.There are two cases, viz.(i) the parameters ψ = (Φ, Σ) in model ( 19) are known and (ii) ψ = (Φ, Σ) are unknown.In either case we assume that the process (19) is zero-mean and stationary.where Ω 0 is defined at (17).For a given bootstrap realization z * = (x * , y * ) and known Ω 0 , Q 1 can be calculated using (20).The entire procedure of generating the bootstrap sample from the original sample and then calculating Q 1 from (20) can be repeated a large number of times on any fast computer, say, 1 000 times or so and then the empirical distribution of Q 1 can be determined, which will give us Q 1 percentiles corresponding to % 100 (ii) When ψ = (Φ, Σ) are unknown.For a given preliminary sample, these parameters can be determined using a package such as S-plus or ITSM and then we calculate

2 αZ
is the Z percentile corresponding to 2 α .For the successive random samples of size n, this control chart can be viewed as the repeated tests of significance of the likelihood ratio test rejection region.This equivalence of hypothesis testing and the control chart methodology lays the necessary foundation for extending the univariate to the multivariate.
which will be the upper and lower control limits respectively.