Lance Armstrong’s Era of Performance – Part III: Demonstrating the Post Hoc Fallacy

1.2. Research Question: Demonstrating the Post Hoc Fallacy

The latter findings imply that the wins of the American ex–cyclist cannot be regarded as atypical or ‘abnormal.’ Besides, the slight amounts of variation explained by the (un)adjusted rider main effects in the two studies suggest that the time trial achievements of all riders are quite comparable and that, therefore, Armstrong did not strongly benefit from his doping use. However, this conclusion may be criticized, since both studies revealed a significant yearly progress in riders’ speed. Armstrong won his trials between 1999 and 2005. Hence, the linear relationships could lead one to conclude that Armstrong indeed raced faster than his predecessors and that the doping agents he resorted to might have strongly contributed to these achievements ^{[2, 3]}. Although we rebutted this criticism extensively in foregoing papers ^{[5, 6, 7, 8]}, in the current study we will comment once more on this kind of circular reasoning, because it might reflect the logic underlying the post hoc fallacy. The fallacy is illustrated in Figure 1. Armstrong’s victories can be seen as winning performances (W_P) in Figure 1. He also used modern, ergogenic doping aids, labeled doping use in a certain year (D_Y) in Figure 1. The post hoc fallacy implies that W_P (“after this”) is caused by D_Y (“because of this”). This reasoning generalizes to the often heard opinion in discussions about effects of doping, suggesting that the progress in speed over the years in professional road racing (W_P) can be attributed to the use of progressively powerful and advanced banned substances (D_Y) ^{[5-10]}. The current study purports to show that this reasoning may be false by examining the relationship between the year in which riders competed (D_Y) and their winning performances in mountain time trials (W_P). If the relationship between D_Y and W_P proves to be nonexistent, then the logic employed in the post hoc fallacy is refuted.

The foregoing Armstrong studies deliberately assessed his individual time trial achievements as the criterion to appraise his performances over the years, rather than focusing on his final km/h achievements in the seven Tours he won (1999–2005). Several historical studies ^{[7, 8]}, which examined all first–ranking performances of riders in the Tour, Giro, and Vuelta from 1903 to 2011 (N = 256), indicated once more that the accomplishments of the American racer were not exceptional. However, these studies may suffer from a methodological flaw. Cyclists’ ultimate km/h performances, realized after three weeks of competition in multi–stage cycling races, cannot be considered individual performances, because they may be strongly affected by the (often inestimable) athletic efforts exerted by the bunch of riders participating in the races (the ‘peloton’). In individual time trials, riders’ efforts are not influenced by these confounding, coordinated peloton forces. In a time trial riders individually race for the fastest time, making it impossible to benefit from the joint labors of cooperating riders in the race through drafting (profiting from the slipstream of other riders). Thus, in a time trial riders can only rely on their own athletic capabilities ^{[5, 6, 11]}. Therefore, we maintain that an assessment of these individual performances will increase the chances to (indirectly) explore the impact of ergogenic doping agents on Armstrong’s cycling feats.

Figure 1. Illustration of the post hoc fallacy to account for riders’ winning performances over time

Time trial racing constitutes one of the most demanding disciplines in professional road racing and, within this discipline, mountain time trials are a class of their own ^[11]. To critically evaluate findings and conclusions pertaining to the two previous Armstrong studies and to investigate the post hoc fallacy, the present research contrasted Armstrong’s winning performances in these demanding races with victories of other riders. The logic of the fallacy suggests an increase in riders’ speed over time. Moreover, given the proposed powerful ergogenic effects of Armstrong’s doping aids ^{[2, 3]}, it further assumes that his km/h performances will be far superior to the achievements of the other winners.

2. Method

2.1. Design and Sample

Mountain time trials are rarely scheduled in the three tours ^{[11, 12]} and the archival records of Magnier and co–workers ^[12] revealed that, all in all, nineteen of these trials occurred in the Tour de France in the years following WW II (1958-2004). Records relating to the Giro and Vuelta supplied by Magnier et al. ^[12] were incomplete, forcing us to restrict the examination of our research questions to riders’ performances in these nineteen trials. Magnier et al. ^[12] provided information concerning the winners of the trials, their mean km/h and time performances, as well as the names of the mountains and their altitudes, but they did not give details relating to the slopes of the climbs. We succeeded in retrieving this information from Codifava ^[13] and cross–checked it with data provided by Van Lonkhuyzen ^[14] and Ejnes et al. ^[15]. Table 2 presents descriptive statistics of the variables we measured.

2.2. Measurements

We took notice of the day on which the trials were scheduled during the Tours. The mean stage number indicates that, on average, the trials took place at the end of the second week of the three–week races (M = 14, SD = 4). The years of competition ranged from 1958 to 2004. In 1958, famous climber Charly Gaul won the first trial to the top of Mont Ventoux in the Provence. As Table 2 shows, Armstrong won the last two races uphill in 2001 and 2004. The mean distance of the trials amounted to M = 28.85 km. Note that there is a large between–trial variation in these distances (SD = 13.14 km), which we discuss below. French five–time Tour winner Bernard Hinault faced the longest and Spanish climber Federico Bahamontes the shortest trial.

The altitudes of the climbs varied between 0.857 km (Cote d’Engins) and 1.909 km (Mont Ventoux) with M = 1.606 km (SD = 0.349). However, considering our research questions, we doubted the validity of these altitudes, because cyclists actually climbed smaller height differences. Most of the trials are located at high mountains in the French Alps or Pyrenees, meaning that the riders did not start their races at sea level. However, the altitudes of the climbs are measured from sea level ^[14]. For instance, it is documented ^[15] that Charly Gaul in his 1958 trial to Mont Ventoux started his race at Bédoin, a village located at the foot of the climb with an elevation of 327 m. The altitude of Gaul’s trial thus amounted to 1.582 km and not 1.909 km. To validly evaluate riders’ achievements, we therefore accounted for the elevation at which riders began their trial. After scrutinizing various sources relating to the history of climbs in professional road racing ^{[12, 13, 14, 15]}, we estimated the corrected altitudes of the climbs. They were operationalized as the difference in km between the documented base of the climb (where riders started the race) and the top (where the finish line was drawn). Table 2 indicates that the corrected altitudes (M = 1.091 km, SD = 0.292) are substantially lower, by 515 m on average, than the uncorrected heights.
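The correction described above is a simple subtraction, which the following minimal Python sketch illustrates using the figures documented for Gaul’s 1958 trial. The function name `corrected_altitude_km` is our own, hypothetical label and does not stem from the cited sources.

```python
def corrected_altitude_km(summit_km: float, base_km: float) -> float:
    """Height actually climbed: summit elevation minus start elevation (both in km)."""
    if base_km > summit_km:
        raise ValueError("base elevation cannot exceed summit elevation")
    return summit_km - base_km


# Gaul's 1958 trial: Mont Ventoux summit at 1.909 km, start in Bédoin at 0.327 km.
gaul_1958 = corrected_altitude_km(summit_km=1.909, base_km=0.327)
print(f"Corrected altitude: {gaul_1958:.3f} km")
```

Applying the same subtraction to every trial would presumably yield the corrected altitudes summarized in Table 2.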

The average slope of the climbs is defined as the rise over the run: (rise of the climb / horizontal distance) × 100. The mean slope amounted to M = 6% (SD = 1.37). Spanish rider and former Tour winner Pedro Delgado as well as Dutch time trial specialist Erik Breukink faced the ‘easiest’ climbs (Cote d’Engins, M = 3.3%). In 2004, Armstrong faced the steepest climb to the top of l’Alpe d’Huez (M = 7.8%).

As regards the dependent variable, across the sample riders’ mean km/h performances amounted to M = 28.87 (SD = 5.56). On average, it took riders approximately one hour to finish the trials (M = 58:01, SD = 18:20). Note that both performance measures show a large variability, owing to the variation in the distances of the time trials, to which we will attend in the next section.

2.2.1. Climbing Index

Surprisingly, preliminary analyses revealed that larger distances of the trials are strongly associated with faster km/h performances, r = .83 (p ≤ .001). This can be explained by the fact that longer races may involve extended stretches of relatively flat roads, enabling riders to maintain higher speeds compared to trials in which riders instantly have to climb. Moreover, the correlation between the corrected altitude and riders’ km/h performances was significant, r = -.56 (p = .01), whilst the correlation with the uncorrected altitudes was not, r = -.38 (p = .11). The corrected altitudes thus appear to have better discriminatory value. Additionally, the strong distance–km/h performance relationship entails that we should control for this variable in order to validly address our research questions. Inspired by the research of El Helou and co–workers ^[9], we therefore developed a climbing index (Cl_I), also operationalized as the rise over the run. The rise concerns the corrected altitude in km and the run the total distance of the trial in km, and the index is expressed in percentages:

Cl_I = (corrected altitude in km / total distance of the trial in km) × 100

Since longer trials involve relatively fewer climbing kilometers, it is (relatively) less difficult for riders to maintain higher speeds in them. Therefore, we assume that higher values of the Cl_I designate more demanding trials in terms of riders’ instant climbing efforts. Table 2 presents descriptive statistics of the index: M = 4.36%, SD = 2.26. The strong interrelationships of the index with other variables make clear that the Cl_I accounts for the influence of the variation in distances of the trials on riders’ performances (r = -.85, p ≤ .001) as well as for the influence of the corrected altitudes (r = .54, p ≤ .05) and the slopes of the climbs (r = .64, p ≤ .01). Besides, the index correlates nearly perfectly with riders’ speed (r = -.97, p ≤ .001), explaining 94% of the variation between riders. For these reasons, we chose to attend to our research questions using two variables: the year in which riders competed, permitting an evaluation of Armstrong’s performances in 2001 and 2004, and the Cl_I.
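As a minimal sketch, the index described above (corrected altitude over total trial distance, in percent) can be computed as follows. The function name `climbing_index` is hypothetical, and the inputs are merely the sample means reported in the text, used here for illustration only. Because the index is computed per trial, the value obtained from the two means need not equal the reported mean index of 4.36%.

```python
def climbing_index(corrected_altitude_km: float, distance_km: float) -> float:
    """Climbing index in percent: corrected altitude (km) over total trial distance (km)."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return corrected_altitude_km / distance_km * 100.0


# Illustrative inputs only: the sample means of corrected altitude and distance.
example_index = climbing_index(corrected_altitude_km=1.091, distance_km=28.85)
print(f"Climbing index: {example_index:.2f}%")

# For a fixed climb, a longer trial lowers the index (less instant climbing effort).
short_trial = climbing_index(1.091, 15.0)
long_trial = climbing_index(1.091, 45.0)
print(short_trial > long_trial)
```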

Table 2. Descriptive Statistics of Riders’ Performances in Mountain Time Trials in the Tour de France (N = 19)

2.3. Analyses

We examined our research questions using a mediation regression model (OLS, ^{[16, 17]}) with riders’ mean km/h performances as the criterion and competition year and the Cl_I as predictor variables. The results of these analyses allow an evaluation of Armstrong’s doping–induced wins: “Are his performances predicted by competition year and the Cl_I or not, and do they constitute outliers?” If competition year proves to be the main predictor variable and Armstrong’s performances are statistical outliers as well, this will render the reasoning underlying the post hoc fallacy plausible. However, if these expectations are disconfirmed, the logic will be disproved. Consistent with previous studies ^{[5, 6]}, we further used the stringent 68%– and 95%–CIs (± 1SD or ± 2SD from the sample mean, respectively) as the criterion to determine outliers. Conventionally, however, performances beyond ± 3SD are considered outliers ^[18]. Last, we assessed whether influential cases biased the regression model. Analyses were conducted using IBM–SPSS (v. 20).
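The outlier criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the function `band` and the residual values are hypothetical, and the bands are taken as ± 1SD, ± 2SD and ± 3SD around the mean of the values supplied.

```python
from statistics import mean, stdev


def band(value: float, centre: float, sd: float) -> str:
    """Classify a value by its distance from the centre in SD units."""
    z = abs(value - centre) / sd
    if z > 3:
        return "beyond 3 SD"
    if z > 2:
        return "beyond 95% band"
    if z > 1:
        return "beyond 68% band"
    return "within 68% band"


# Hypothetical residuals in km/h (NOT the study's data).
residuals = [0.4, -1.6, 0.7, -0.2, 3.1, -0.9, 0.1, -0.5, 0.3, -1.4]
centre, sd = mean(residuals), stdev(residuals)
labels = [band(r, centre, sd) for r in residuals]

for r, label in zip(residuals, labels):
    print(f"{r:+.1f} km/h: {label}")
```

Under the stringent criterion a value is flagged as soon as it leaves the 68% band, whereas under the conventional criterion only values beyond ± 3SD would count as outliers.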

3. Results

3.1. Km/h Performances

Following guidelines for mediation analyses ^{[16, 17]}, the regression analyses we conducted involved four steps. Figure 2 summarizes findings of the analyses. The first step in the analyses showed that the year in which cyclists competed is negatively related to the Cl_I, b = -0.076 (∆R² = .201), indicating that the trials became less brutal over time. The second step revealed that competition year is positively related to riders’ mean km/h performances, b = 0.201 km/h (∆R² = .234). That is, riders’ mean speed increased by 0.201 km/h (201 m per hour) per year. The third step indicated that the index has a strong negative influence, showing that riders race 2.302 km/h slower per unit of the index, b = -2.302 km/h (∆R² = .932). In the fourth step we simultaneously entered competition year and the index in the regression equation. This step not only allows a test of the mediation effect of the index on the competition year–performance relationship, but also permits an estimation of the degree to which the influence of competition year on riders’ achievements is affected by the index. The lower panel in Figure 2 presents the results of the fourth step. The resulting unstandardized b = 0.026 km/h (∆R² = .003) indicates that riders’ mean speed increased by only 0.026 km/h (26 m per hour) per year. This relationship is not significant (p = .38) and explains 0.3% of the variation in riders’ km/h performances. The fourth step further revealed that the Cl_I fully mediated the competition year–performance relationship (z = 2.10, p ≤ .05) and accounted for a substantial part of this relationship, b = 0.175 km/h. Thus, the mediating effect of the index ultimately resulted in the adjusted, nonsignificant influence of competition year on the dependent variable, i.e., 0.201 - 0.175 = 0.026 km/h. The two variables jointly explain 92.7% of the variation in riders’ performances, to which the index contributed by far the most.
To conclude, as there is no solid evidence that riders race faster over time, our findings cast serious doubt on the logic underlying the post hoc fallacy.
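The arithmetic of the mediated effect can be checked with a short Python sketch based on the coefficients reported above. Two assumptions apply. First, the step–three coefficient of the index (b = -2.302) is used as a stand–in for the index coefficient in the joint model, which the text does not report separately. Second, `sobel_z` is our own, hypothetical helper implementing the common Sobel formula; the standard errors it requires appear in Figure 2 and are not reproduced here, so the function is shown without being applied to the study’s data.

```python
import math

# Unstandardized coefficients as reported in the text.
a = -0.076        # step 1: competition year -> climbing index
c_total = 0.201   # step 2: competition year -> km/h (unadjusted)
b = -2.302        # step 3: climbing index -> km/h (assumed stand-in, see lead-in)
c_direct = 0.026  # step 4: competition year -> km/h, adjusted for the index

# Two equivalent ways of expressing the mediated (indirect) effect.
indirect_product = a * b                   # product of the two paths
indirect_difference = c_total - c_direct   # total minus direct effect

print(f"a * b  = {indirect_product:.3f} km/h per year")
print(f"c - c' = {indirect_difference:.3f} km/h per year")


def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
    """Sobel test statistic for an indirect effect a*b, given both standard errors."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
```

Both routes arrive at roughly 0.175 km/h per year, which is the part of the unadjusted yearly gain that runs through the climbing index.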

3.2. Outliers and Normality

The analyses further yielded information concerning riders’ predicted and residual performances, which are presented in the last two columns of Table 2. The correlation between the two variables, r = .00, indicates a very good fit of the regression model. Residuals indicate which performances are not predicted by the variables included in the regression model and, therefore, permit us to evaluate which performances may constitute outliers. Figure 3A and Figure 3B present regression plots of riders’ predicted performances with 68%– and 95%–CIs. Table 2 and the figures show that one of Armstrong’s wins fell outside the 68%–CI: his 2001 trial to Chamrousse in the French Alps. However, in this trial he did not perform faster, but 1.51 km/h slower than predicted. In his 2004 trial to l’Alpe d’Huez he raced 0.69 km/h (690 m per hour) faster than predicted. Yet, this performance did not even exceed the 68%–bandwidth. Hence, regarding our research questions, we conclude that the achievements of the American racer do not constitute outliers.

Table 2 and Figure 3A and Figure 3B further indicate that four other riders exceeded the 68%–CI. Three of them performed slower than predicted: French riders Laurent Fignon (1984) and Jeff Bernard (1987) and Dutch cyclist Steven Rooks (1989). In 1978, Dutch rider and former Tour winner Joop Zoetemelk performed faster than predicted on his climb of Puy de Dome in the Massif Central.

Figure 2. Mediation model of variables explaining differences in riders’ km/h performances. Presented are unstandardized regression weights (b) and associated standard errors (SE_b) in parentheses and standardized weights (β). As to km/h performances, b and SE_b are in kilometers per hour per year, or in kilometers per hour per unit of the climbing index. The unadjusted relationship between year of competition and km/h performances is in boldface. The broken arrow indicates a mediated relationship

Figure 3. Plots of riders’ mean km/h performances regressed on year of competition and the climbing index. Dotted lines indicate the 68%–CI and broken lines the 95%–CI. Open dots (○) present Berzin’s performance beyond the 95%–CI and double daggers (‡) Armstrong’s performances. Negative values of the centred mean km/h indicate slower performances

Only one of the nineteen cyclists went beyond the 95%–CI in his performance: Russian rider Yevgeni Berzin in the 1996 trial to Val d'Isère in the Alps. Although we fully realize that the number of observations is far too low to reliably estimate whether riders’ observed, predicted and residual time performances depart from normality, we did evaluate these performances. They all appeared to be normally distributed (Kolmogorov–Smirnov tests, zs ≤ 0.71, ps ≥ .70), indicating there are no signs of any ‘abnormally’ fast or slow performances among the riders, including Armstrong’s. To give examples, Figure 4 illustrates the normal probability plots of riders’ observed and residual performances.

Figure 4. Normal probability plots of riders’ observed and residual km/h performances
