Tuesday, July 21, 2009

Should Unprecedented Tour Success Warrant Suspicion?

In recent years at the Tour de France we’ve seen breakthrough results from riders like Floyd Landis and Bernhard Kohl nullified by doping convictions. In this light, it is tempting to view the current success of Bradley Wiggins with skepticism (see here for a generally civil discussion). Wiggins, now third on GC, is a multiple gold medalist and world champion on the track, but has never had much success in road or stage racing. He credits his new climbing ability to substantial weight loss paired with a new focus on road racing, but a hardcore skeptic who believes Bernhard Kohl is not going to be so easily convinced.

Of course, at the moment there is no way we can know if Wiggins is on the good juice or not. But we can look at his pattern of improvement and ask how extraordinary it actually is. Similarly, we can look at dopers and determine if their results were entirely unprecedented. To this end, I will consider two quantities:
  • R: A rider’s mean placing in previous years minus their mean placing in a given year. Larger numbers mean greater improvement.

  • P: The likelihood that this difference is simply a result of random fluctuations rather than a real change (statistical significance). Smaller numbers mean greater significance.
Data, as usual, are from the indispensable Cycling Quotient. When computing P, I first take the logarithm of all results in order to enhance the value of top placings and minimize differences between mid-pack finishes (e.g. the difference between 2nd and 12th is much more important than the difference between 102nd and 112th). See below for more exciting technical notes.
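To make the two ideas above concrete, here is a minimal sketch of how R and the log transform could be computed. The placings are made-up numbers, not Cycling Quotient data, and the function names are my own. I take R as the previous-years mean minus the given-year mean, since that is the sign that makes larger numbers mean improvement, and I use the natural logarithm, although the choice of base only rescales the values and does not change a t-test.

```python
import math
from statistics import mean

def improvement_r(previous, current):
    """R: mean placing in previous years minus mean placing in the given year."""
    return mean(previous) - mean(current)

def log_placings(placings):
    """Log of each placing (natural log is an assumption; any base works)."""
    return [math.log(p) for p in placings]

# Hypothetical placings for illustration only, not real results.
previous = [120, 95, 130, 102, 88]
current = [60, 45, 80, 70, 75]

r = improvement_r(previous, current)

# On the log scale, ten places near the front count for far more
# than ten places in the middle of the pack.
gap_top = math.log(12) - math.log(2)
gap_mid = math.log(112) - math.log(102)

print(f"R = {r}")
print(f"log gap 2nd vs 12th:    {gap_top:.3f}")
print(f"log gap 102nd vs 112th: {gap_mid:.3f}")
```

The two printed gaps show why the transform matters: the same ten-place difference is many times larger on the log scale when it happens near the top of the results sheet.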

Here are the career results for Bradley Wiggins:

Wiggins’s results so far in 2009 are definitely the best of his road career, with a mean placing of 66 compared to 108 over previous years. This corresponds to R = 42.5. The likelihood of this result arising from random fluctuations is P = 0.00012, suggesting that the improvement is indeed statistically significant. Here is how Wiggins compares to a few other riders:

Sastre’s results appear to have improved slightly, but the P-value tells us this is not significant. Haussler is having a very good season but not to the extent that Wiggins is. Vande Velde was impressive in 2008, but Wiggins has shown an even greater improvement in 2009.

So how rare is this magic season of Bradley Wiggins? Looking back across the CQ data, I identified other riders who have had a season with an improvement greater than Wiggins’s, defined as an R greater than 42.5 and P less than 0.00012:

Should we be skeptical of Wiggins based on this company? There are a few suspicious characters there, but overall this list is notably light on convicted dopers. We can compare these riders with some of the recent bad boys:

These improvements are generally more modest. Kashechkin, Landis, and Basso are the only riders in this lot whose gains were in the neighborhood of Wiggins's 2009. Three of these riders actually had a drop in results when doping (negative R), although these were not significant.

The data fairly clearly contradict the idea that significant performance improvements are necessarily, or even likely, the result of doping. As always, we should apply the standard caveat that the absence of a positive test does not always imply clean riding. However, without evidence to the contrary I think we should conclude that it is common for riders to significantly improve their performances without the aid of doping. Likewise, it is common for dopers not to have significantly improved results.

Technical notes: Most race results on Cycling Quotient are partial, often listing only the top 10 or 20 riders. For calculation of R and P, only races with 100 or more listed riders were used. Calculations were therefore based on about 900 races from 2000 to 2009, with the later years having more races listed. Roughly half of the races were stages from grand tours, and the remainder were mostly the major one-day races and lesser stage races. To avoid small sample sizes, R and P were only computed if the rider at hand had more than 10 race results for the year in question and more than 10 results prior to that year. P is the probability of seeing a difference at least this large between a rider’s mean results in a given year and their mean results in previous years if there were no true difference between the two. It was computed using Student’s t-test without assuming known or equal variances between the two samples (the unequal-variance form often called Welch’s t-test).
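For anyone who wants to reproduce the calculation, here is a rough sketch of the unequal-variance t-test on log placings, using only the Python standard library. It is not the code used for this post. The function names are mine, the two-sided form of the test is an assumption (the notes above do not say one-sided or two-sided), and the p-value comes from a simple numerical integration of the t density rather than a statistics package.

```python
import math
from statistics import mean, variance

MIN_RESULTS = 10  # each sample needs more than 10 results

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def t_pdf(x, df):
    """Density of Student's t distribution (lgamma avoids overflow)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)

def season_p(previous, current):
    """P for one rider-season, or None when either sample is too small."""
    if len(previous) <= MIN_RESULTS or len(current) <= MIN_RESULTS:
        return None
    log_prev = [math.log(x) for x in previous]
    log_cur = [math.log(x) for x in current]
    t, df = welch_t(log_prev, log_cur)
    return two_sided_p(t, df)
```

If SciPy is available, `scipy.stats.ttest_ind(log_prev, log_cur, equal_var=False)` performs the same unequal-variance test and is the more sensible choice for real work, particularly for very small p-values where the simple integration here loses precision.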
