Methodology and Application of Oneway ANOVA

Statistical software usually adds a column with the p-value to this table. The p-value is the probability, computed under the assumption that the null hypothesis holds, of obtaining a value of the test statistic at least as extreme as the one observed. If $p < \alpha$, where α is the chosen significance level, the null hypothesis is rejected with confidence of at least $(1-\alpha)\cdot 100$ %.

3. Post Hoc Comparison Procedures

One possible approach to the multiple comparison problem is to make each comparison independently using a suitable statistical procedure. For example, a statistical hypothesis test could be used to compare each pair of means, $\mu_i$ and $\mu_l$, $i, l = 1, 2, \ldots, k$; $i \neq l$, where the null and alternative hypotheses are of the form

$$H_0: \mu_i = \mu_l, \qquad H_1: \mu_i \neq \mu_l. \tag{17}$$

An alternative way to test for a difference between $\mu_i$ and $\mu_l$ is to calculate a confidence interval for $\mu_i - \mu_l$. A confidence interval is formed using a point estimate, a margin of error, and the formula

$$\text{point estimate} \pm \text{margin of error}. \tag{18}$$

The point estimate is the best guess for the value of $\mu_i - \mu_l$ based on the sample data. The margin of error reflects the accuracy of the guess based on variability in the data. It also depends on a confidence coefficient, which is often denoted by $1-\alpha$. The interval is calculated by subtracting the margin of error from the point estimate to get the lower limit and adding the margin of error to the point estimate to get the upper limit ^[6].

If the confidence interval for $\mu_i - \mu_l$ does not contain zero (thereby ruling out that $\mu_i = \mu_l$), then the null hypothesis is rejected and $\mu_i$ and $\mu_l$ are declared different at level of significance α.

The multiple comparison tests for population means, as well as the F-test, have the same assumptions.

There are many different multiple comparison procedures that deal with these problems. Some of these procedures are as follows: Fisher’s method, Tukey’s method, Scheffé’s method, Bonferroni’s adjustment method, and the Dunn-Šidák method. Some require equal sample sizes, while others do not. The choice of a multiple comparison procedure used with an ANOVA will depend on the type of experimental design used and the comparisons of interest to the analyst ^[8].

The Fisher (LSD) method essentially does not correct for the type 1 error rate for multiple comparisons and is generally not recommended relative to other options.

The Tukey (HSD) method controls type 1 error very well and is generally considered an acceptable technique. A modification of the test for situations where the number of subjects is unequal across cells is called the Tukey-Kramer test.

The Scheffé test can be used for the family of all pairwise comparisons but will always give longer confidence intervals than the other tests ^[6]. Scheffé’s procedure is perhaps the most popular of the post hoc procedures, the most flexible, and the most conservative.

There are several different ways to control the experimentwise error rate. One of the easiest is to use the Bonferroni correction. If we plan on making m comparisons or conducting m significance tests, the Bonferroni correction is to simply use $\alpha/m$ as our significance level rather than α. This simple correction guarantees that our experimentwise error rate will be no larger than α. Notice that these results are more conservative than with no adjustment. The Bonferroni correction is probably the most commonly used post hoc test, because it is highly flexible, very simple to compute, and can be used with any type of statistical test (e.g., correlations), not just post hoc tests with ANOVA.

The Šidák method has a bit more power than the Bonferroni method. So from a purely conceptual point of view, the Šidák method is always preferred.
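The difference between the two corrections can be seen numerically. The following Python sketch (the function names are illustrative and not taken from any library) computes the per-comparison significance level under each correction for the m = k(k − 1)/2 pairwise comparisons among k groups.

```python
def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-comparison level under the Bonferroni correction: alpha / m."""
    return alpha / m


def sidak_alpha(alpha: float, m: int) -> float:
    """Per-comparison level under the Sidak correction: 1 - (1 - alpha)**(1/m)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


alpha = 0.05
for k in (3, 5, 10):
    m = k * (k - 1) // 2  # number of pairwise comparisons among k groups
    print(k, m, bonferroni_alpha(alpha, m), sidak_alpha(alpha, m))
```

For every m > 1 the Šidák level is slightly larger than the Bonferroni level, which is where its small gain in power comes from.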

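The confidence interval for $\mu_i - \mu_l$ is calculated using the formula:

$$\bar{y}_i - \bar{y}_l \pm t_{1-\alpha/2;\,N-k}\sqrt{MS_e\left(\frac{1}{n_i}+\frac{1}{n_l}\right)}, \tag{19}$$

where $\bar{y}_i$ and $\bar{y}_l$ are the sample means of the two groups, $MS_e$ is the within-group mean square from the ANOVA table, $N$ is the total number of observations, and $t_{1-\alpha/2;\,N-k}$ is the quantile of the Student’s t-probability distribution, by the Fisher method (LSD: Least Significant Difference);

$$\bar{y}_i - \bar{y}_l \pm \frac{q_{1-\alpha;\,k,\,N-k}}{\sqrt{2}}\sqrt{MS_e\left(\frac{1}{n_i}+\frac{1}{n_l}\right)}, \tag{20}$$

where $q_{1-\alpha;\,k,\,N-k}$ represents the quantile for the Studentized range probability distribution, by the Tukey-Kramer method (HSD: Honestly Significant Difference);

$$\bar{y}_i - \bar{y}_l \pm \sqrt{(k-1)\,F_{1-\alpha;\,k-1,\,N-k}}\,\sqrt{MS_e\left(\frac{1}{n_i}+\frac{1}{n_l}\right)}, \tag{21}$$

where $F_{1-\alpha;\,k-1,\,N-k}$ is the quantile of the F-distribution, by the Scheffé method;

$$\bar{y}_i - \bar{y}_l \pm t_{1-\alpha^{*}/2;\,N-k}\sqrt{MS_e\left(\frac{1}{n_i}+\frac{1}{n_l}\right)}, \tag{22}$$

where $\alpha^{*} = \alpha/m$ and $m = \binom{k}{2}$ is the number of pairwise comparisons in the family, by the Bonferroni method;

$$\bar{y}_i - \bar{y}_l \pm t_{1-\alpha^{*}/2;\,N-k}\sqrt{MS_e\left(\frac{1}{n_i}+\frac{1}{n_l}\right)}, \tag{23}$$

where $\alpha^{*} = 1-(1-\alpha)^{1/m}$ and $m = \binom{k}{2}$, by the Dunn-Šidák method ^[2].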
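All five intervals share the same structure: the difference of two sample means plus or minus a critical value times the standard error of that difference. The Python sketch below illustrates this with the training-time data of the example in section 5 (values as entered in the MATLAB matrix X there). The helper names are illustrative, and the critical value 2.131 is an assumed table value of the t-quantile for 0.975 and 15 degrees of freedom, which corresponds to the Fisher (LSD) interval at α = 0.05.

```python
import math
from itertools import combinations

# Training times in hours for the three systems (data of the example in section 5).
groups = {
    1: [20, 17, 15, 19, 14, 13],
    2: [18, 17, 14, 20, 13, 12],
    3: [23, 25, 20, 21, 19, 20],
}


def mean(xs):
    return sum(xs) / len(xs)


def within_mean_square(groups):
    """Within-group mean square: pooled sum of squares divided by N - k."""
    ss = sum(sum((x - mean(xs)) ** 2 for x in xs) for xs in groups.values())
    n_total = sum(len(xs) for xs in groups.values())
    return ss / (n_total - len(groups))


def pairwise_interval(xi, xl, ms_e, crit):
    """Interval (mean_i - mean_l) -/+ crit * sqrt(ms_e * (1/n_i + 1/n_l))."""
    diff = mean(xi) - mean(xl)
    margin = crit * math.sqrt(ms_e * (1 / len(xi) + 1 / len(xl)))
    return diff - margin, diff + margin


ms_e = within_mean_square(groups)
T_CRIT = 2.131  # assumed table value: t quantile for 0.975 with 15 degrees of freedom

intervals = {
    (i, l): pairwise_interval(groups[i], groups[l], ms_e, T_CRIT)
    for i, l in combinations(groups, 2)
}
for pair, (lower, upper) in intervals.items():
    verdict = "differ" if lower > 0 or upper < 0 else "no significant difference"
    print(pair, round(lower, 2), round(upper, 2), verdict)
```

The other four methods differ only in the critical value passed as `crit`; the point estimate and the standard error stay the same.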

4. Tests for Homogeneity of Variances

Many statistical procedures, including analysis of variance, assume that the different populations have the same variance. The test for equality of variances is used to determine if the assumption of equal variances is valid.

We will be interested in testing the null hypothesis

$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 \tag{24}$$

against the alternative hypothesis

$$H_1: \sigma_i^2 \neq \sigma_l^2 \ \text{ for at least one pair } (i, l),\ i \neq l. \tag{25}$$

There are many tests of homogeneity of variances. Commonly used tests are the Bartlett (1937), Hartley (1940, 1950), Cochran (1941), Levene (1960), and Brown and Forsythe (1974) tests. The Bartlett, Hartley and Cochran tests are technically tests of homogeneity. The Levene and Brown and Forsythe methods actually transform the data and then test for equality of means.

Note that Cochran's and Hartley's tests assume that there are equal numbers of participants in each group.

The tests of Bartlett, Cochran, Hartley and Levene may be applied when the number of samples is $k > 2$. In this situation the power of the tests differs. When the assumption of the normal distribution holds, the tests may be ranked by decreasing power as follows: Cochran ≻ Bartlett ≻ Hartley ≻ Levene. This order also holds when the normality assumption is violated. An exception concerns samples from distributions with heavier tails than the normal distribution. For example, for samples from the Laplace distribution the Levene test turns out to be slightly more powerful than the three others ^[7].

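Bartlett’s test has the following test statistic:

$$B = \frac{1}{C}\left[(N-k)\ln s_p^2 - \sum_{i=1}^{k}(n_i-1)\ln s_i^2\right], \tag{26}$$

where the constant $C = 1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{n_i-1} - \frac{1}{N-k}\right)$, $s_i^2$ is the sample variance of the i-th group, $s_p^2 = \frac{1}{N-k}\sum_{i=1}^{k}(n_i-1)s_i^2$ is the pooled variance (the within-group mean square), and the meaning of all the other symbols is evident (see section 2). The hypothesis H₀ is rejected at significance level α when

$$B > \chi^2_{1-\alpha;\,k-1}, \tag{27}$$

where $\chi^2_{1-\alpha;\,k-1}$ is the critical value of the chi-square distribution with $k-1$ degrees of freedom.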
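As an illustration, Bartlett's statistic can be computed directly from its definition. The Python sketch below uses only the standard library; the function names are illustrative, and the data are the training times of the example in section 5. Because that example has k = 3 groups, the reference chi-square distribution has 2 degrees of freedom, and its upper tail has the closed form exp(−x/2); this shortcut is an assumption tied to k = 3 and does not generalize to other k.

```python
import math


def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def bartlett_statistic(samples):
    """Bartlett's statistic B for a list of samples (group sizes may differ)."""
    k = len(samples)
    dfs = [len(xs) - 1 for xs in samples]  # n_i - 1
    df_total = sum(dfs)                    # N - k
    variances = [sample_variance(xs) for xs in samples]
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances)
    )
    c = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / c


samples = [
    [20, 17, 15, 19, 14, 13],
    [18, 17, 14, 20, 13, 12],
    [23, 25, 20, 21, 19, 20],
]
b = bartlett_statistic(samples)
# Upper tail of the chi-square distribution with 2 degrees of freedom (k = 3 only).
p_value = math.exp(-b / 2)
print(round(b, 3), round(p_value, 3))
```

For these data B is roughly 0.5, far below the 5 % critical value of the chi-square distribution with 2 degrees of freedom (about 5.99).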

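Cochran’s test is one of the best methods for detecting cases where the variance of one of the groups is much larger than that of the other groups. This test uses the following test statistic:

$$G = \frac{\max_{i} s_i^2}{\sum_{i=1}^{k} s_i^2}, \tag{28}$$

where $s_i^2$ is the sample variance of the i-th group. The hypothesis H₀ is rejected at significance level α when

$$G > G_{\alpha}(k, n-1), \tag{29}$$

where the critical value $G_{\alpha}(k, n-1)$, for k groups of common size n, is found in special statistical tables.

Hartley’s test uses the following test statistic:

$$F_{\max} = \frac{\max_{i} s_i^2}{\min_{i} s_i^2}. \tag{30}$$

The hypothesis H₀ is rejected at significance level α when

$$F_{\max} > F_{\max,\alpha}(k, n-1), \tag{31}$$

where the critical value $F_{\max,\alpha}(k, n-1)$ is found in special statistical tables ^[2].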
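Both statistics are simple ratios of group variances. A minimal Python sketch follows, with illustrative function names and the training-time data of the example in section 5. The critical values still have to be taken from the special tables mentioned above and are not computed here.

```python
def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def _variances_equal_n(samples):
    # Both tests assume the same number of observations in every group.
    if len({len(xs) for xs in samples}) != 1:
        raise ValueError("Cochran's and Hartley's tests require equal group sizes")
    return [sample_variance(xs) for xs in samples]


def cochran_statistic(samples):
    """Largest group variance divided by the sum of all group variances."""
    variances = _variances_equal_n(samples)
    return max(variances) / sum(variances)


def hartley_statistic(samples):
    """Largest group variance divided by the smallest group variance."""
    variances = _variances_equal_n(samples)
    return max(variances) / min(variances)


samples = [
    [20, 17, 15, 19, 14, 13],
    [18, 17, 14, 20, 13, 12],
    [23, 25, 20, 21, 19, 20],
]
g = cochran_statistic(samples)
f_max = hartley_statistic(samples)
print(round(g, 4), round(f_max, 4))
```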

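Originally Levene’s test was defined as the one-way analysis of variance on the absolute residuals $z_{ij} = |y_{ij} - \bar{y}_i|$, $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$, where $y_{ij}$ is the j-th observation in the i-th group, $\bar{y}_i$ is the mean of the i-th group, k is the number of groups and $n_i$ the sample size of the i-th group. The test statistic has, under H₀, approximately Fisher’s distribution with $k-1$ and $N-k$ degrees of freedom and is given by:

$$F = \frac{(N-k)\sum_{i=1}^{k} n_i\left(\bar{z}_i - \bar{z}\right)^2}{(k-1)\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(z_{ij} - \bar{z}_i\right)^2}, \tag{32}$$

where $N = \sum_{i=1}^{k} n_i$, $\bar{z}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} z_{ij}$, $\bar{z} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} z_{ij}$.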

To apply the ANOVA test, several assumptions must be verified, including normal populations, homoscedasticity, and independent observations. The absolute residuals do not meet any of these assumptions, so Levene’s test is an approximate test of homoscedasticity ^[5].

Brown and Forsythe subsequently proposed using the absolute deviations from the median $\tilde{y}_i$ of the i-th group instead, so that $z_{ij} = |y_{ij} - \tilde{y}_i|$.
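The two variants differ only in the centre used for the absolute deviations, which a short Python sketch makes explicit. The function name and its `center` parameter are illustrative; the data are the training times of the example in section 5.

```python
import statistics


def levene_statistic(samples, center=statistics.fmean):
    """One-way ANOVA F statistic computed on z_ij = |y_ij - center(group i)|.

    center=statistics.fmean gives Levene's original test,
    center=statistics.median gives the Brown and Forsythe variant.
    """
    z = [[abs(y - center(xs)) for y in xs] for xs in samples]
    k = len(z)
    n_total = sum(len(zi) for zi in z)
    z_means = [statistics.fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    return (n_total - k) * between / ((k - 1) * within)


samples = [
    [20, 17, 15, 19, 14, 13],
    [18, 17, 14, 20, 13, 12],
    [23, 25, 20, 21, 19, 20],
]
f_levene = levene_statistic(samples)
f_brown_forsythe = levene_statistic(samples, center=statistics.median)
print(round(f_levene, 3), round(f_brown_forsythe, 3))
```

Either value is then compared with the quantile of the F-distribution with k − 1 and N − k degrees of freedom.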

5. Example from Technical Practice

What follows is an example of the one-way ANOVA procedure using the statistical software package, MATLAB.

One important factor in selecting software for word processing and database management systems is the time required to learn how to use a particular system. In order to evaluate three database management systems, a firm devised a test to see how many training hours were needed for six of its word processing operators to become proficient in each of three systems ^[9]. The data from this experiment are given in Table 2. Using a 5 % significance level, is there any difference between the training time needed for the three systems?

Table 2. Experiment data in hours

5.1. Testing the Assumption of Normality

One of the first steps in using the one-way ANOVA test is to test the assumption of normality. Even if the distribution is somewhat different from normal, one-way ANOVA can still work well if the sample sizes are large enough. However, when sample sizes are small, one-way ANOVA can be unreliable if the data in one or more of the groups come from a highly non-normal distribution.

Normality can be evaluated with graphical and statistical methods. For example, a normal probability plot is a graph specifically designed to check for normality: if the data come from a normal distribution, the points should fall approximately on a straight line. The statistical methods include diagnostic hypothesis tests for normality, where the null hypothesis is that there is no significant departure from normality for each of the groups/levels. The alternative hypothesis is that there is a significant departure from normality. The main tests for the assessment of normality are the Kolmogorov-Smirnov (K-S) test, the Lilliefors test (corrected K-S test), the Shapiro-Wilk test, the Anderson-Darling test, the Cramér-von Mises test, the D’Agostino test and the Jarque-Bera test.

For the above example we use the following MATLAB functions ^[4]:

• [h,p]=lillietest(x,0.05,'norm') for the Lilliefors test,

• [h,p]=swtest(x,0.05) for the Shapiro-Wilk test.

For example, the Shapiro-Wilk test at significance level 0.05 does not reject the null hypothesis of normality for system 1, system 2, or system 3.

We therefore conclude that the data at each level of the independent variable can be treated as normally distributed.
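The graphical check mentioned above can also be summarized by a single number: the correlation between the ordered sample and the corresponding standard normal quantiles, which is close to 1 when the points of the normal probability plot lie near a straight line. The Python sketch below is only a numerical companion to the plot, not an implementation of the Lilliefors or Shapiro-Wilk tests; the function name and the choice of plotting positions (i − 0.375)/(n + 0.25) are assumptions made for illustration, and no formal critical value is attached to the result.

```python
import math
from statistics import NormalDist, fmean


def probability_plot_correlation(xs):
    """Correlation between the sorted sample and standard normal quantiles
    evaluated at the plotting positions (i - 0.375) / (n + 0.25)."""
    n = len(xs)
    ordered = sorted(xs)
    quantiles = [
        NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    mx, mq = fmean(ordered), fmean(quantiles)
    sxq = sum((x - mx) * (q - mq) for x, q in zip(ordered, quantiles))
    sxx = sum((x - mx) ** 2 for x in ordered)
    sqq = sum((q - mq) ** 2 for q in quantiles)
    return sxq / math.sqrt(sxx * sqq)


# Training times in hours for systems 1 to 3 (data of this example).
samples = [
    [20, 17, 15, 19, 14, 13],
    [18, 17, 14, 20, 13, 12],
    [23, 25, 20, 21, 19, 20],
]
correlations = [probability_plot_correlation(xs) for xs in samples]
print([round(r, 3) for r in correlations])
```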

5.2. Testing the Assumption of Homogeneity of Variances

We seek to test equality of variances (see part 4) and have run Bartlett’s test in MATLAB:

X=[20,17,15,19,14,13;18,17,14,20,13,12;23,25,20,21,19,20]';   % columns: systems 1 to 3, rows: operators
[p,stats]=vartestn(X)

According to this analysis in MATLAB, the p-value for Bartlett’s test exceeds the significance level 0.05 (the default in MATLAB).

Therefore, we would fail to reject the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$.

5.3. Hypothesis Testing Using ANOVA

Letting $\mu_1$, $\mu_2$, and $\mu_3$ be the means for the three systems, the null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3$. The alternative is $H_1: \mu_i \neq \mu_l$ for at least one i, l pair ($i \neq l$).

In MATLAB we use the command:

[p,tbl,stats]=anova1(X)

This will return an ANOVA table, showing the value of the F-statistic and the p-value, and a boxplot of the three groups. The results of the calculations for this case are summarized in Table 3 and Figure 1.

Table 3. Summary table of the one-way ANOVA for experiment data

Since the p-value is less than the given significance level 0.05 for this problem, we reject the null hypothesis. There is a difference between the mean learning times for at least two of the three database management systems.
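The F-statistic can be cross-checked outside MATLAB with a few lines of Python that follow the sums-of-squares decomposition directly. This is a sketch with an illustrative function name, not a replacement for anova1. The p-value line relies on a closed form for the upper tail of the F-distribution that holds only when the numerator has 2 degrees of freedom, as it does here with three groups.

```python
def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(samples)
    n_total = sum(len(xs) for xs in samples)
    grand_mean = sum(sum(xs) for xs in samples) / n_total
    means = [sum(xs) / len(xs) for xs in samples]
    ss_between = sum(
        len(xs) * (m - grand_mean) ** 2 for xs, m in zip(samples, means)
    )
    ss_within = sum((x - m) ** 2 for xs, m in zip(samples, means) for x in xs)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


samples = [
    [20, 17, 15, 19, 14, 13],
    [18, 17, 14, 20, 13, 12],
    [23, 25, 20, 21, 19, 20],
]
f_stat, df_between, df_within = one_way_anova(samples)
# Upper tail of F(2, df_within); this closed form is valid only for df_between == 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
print(round(f_stat, 3), df_between, df_within, round(p_value, 4))
```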

Figure 1. Boxplot of the three groups

5.4. Pairwise Comparison

When the null hypothesis is rejected using the F-test in ANOVA, we want to know where the difference among the means is. To determine which pairs of means are significantly different, and which are not, we can use the multiple comparison tests.

MATLAB implements, for example, the Tukey-Kramer procedure, the Bonferroni procedure and the Dunn-Šidák procedure, and reports the results in terms of confidence intervals.

Now we can construct the 95 % confidence intervals for the differences $\mu_i - \mu_l$ of each pair of population group means, $i, l = 1, 2, 3$; $i \neq l$.

In MATLAB we use the following series of commands:

multcompare(stats,'alpha',.05,'ctype','tukey-kramer')

multcompare(stats,'alpha',.05,'ctype','bonferroni')

multcompare(stats,'alpha',.05,'ctype','dunn-sidak')

The statistical outputs are, respectively, shown in Table 4, Table 5, Table 6 and Figure 2.

Table 4. Results using Tukey-Kramer method

Table 5. Results using Bonferroni method

Table 6. Results using Dunn-Šidák method

Figure 2 is an interactive figure. Clicking on a group symbol displays, in the bottom part of the figure, the groups from which the selected group differs significantly.

Figure 2. The interactive figure
