If It Ain't Broke, Don't Fix It:
An Analysis of the Figure Skating Scoring System

By Sandra Loosemore (April, 1997)

In recent months, there has been increasing controversy over the ordinal-based scoring system used in ISU ("amateur" or Olympic-eligible) figure skating competitions. The ordinal system is not well understood by the general public, and this has led to some results at recent competitions that seem "wrong" (or at least surprising) to people who are not familiar with the details of how figure skating is scored. The argument I make in this article is that the problem is not so much with the scoring system itself -- indeed, from a statistical point of view it is technically superior to the alternatives that have been proposed -- but with the failure of the ISU and the media to adequately educate the public.

How the ordinal system works

The pairs and men's and ladies' singles events each consist of two phases, a short program and a free skating program. In dance, there are two compulsory dances, an original dance, and a free dance. The placements of the skaters within each phase are determined independently of the other phases, and then combined by multiplying each phase's placement by a weighting factor and adding the results to arrive at the overall standings in the competition.

The first critical point to understand about the figure skating scoring system is that the marks given to the skaters (on a scale of 0 to 6) have no intrinsic meaning in themselves. Instead, each judge uses the total of their technical and presentation marks to indicate a ranking -- first, second, third, and so on -- for each skater relative to the other skaters in the event. It is these rankings, called ordinals, which form the basis of the scoring system.

In other words, what matters is not how a judge's marks compare to the marks given by the other judges for a given skater, but how the marks compare to the marks given to other skaters by the same judge.
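As a rough illustration of this point, the sketch below converts one judge's marks into ordinals. The marks and skater names are hypothetical, and the function name is an invention for this example; ties between equal totals are not modelled.

```python
def ordinals_for_judge(marks):
    """Turn ONE judge's marks into ordinals (1 = best).

    marks maps skater -> (technical, presentation).
    Ties between equal totals are not handled in this sketch.
    """
    totals = {s: round(t + p, 1) for s, (t, p) in marks.items()}
    ranked = sorted(totals, key=totals.get, reverse=True)
    return {skater: place for place, skater in enumerate(ranked, start=1)}

# Hypothetical marks: judge Y marks every skater lower than judge X does.
judge_x = {"A": (5.8, 5.9), "B": (5.6, 5.7), "C": (5.9, 5.7)}
judge_y = {"A": (5.5, 5.6), "B": (5.2, 5.3), "C": (5.4, 5.5)}

print(ordinals_for_judge(judge_x))
print(ordinals_for_judge(judge_y))
```

Although judge Y's marks are lower across the board, both judges rank the skaters in the same order, so they contribute identical ordinals.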

Once the ordinals from all the judges have been computed, they need to be combined in some way to reduce the nine sets of rankings into one set. In figure skating, this is done by majority vote.

For each competitor, the lowest-numbered place for which that competitor has a majority of votes from the judges is determined. For example, if there are 7 judges and a skater was given 2 first-place ordinals, 2 second-place ordinals, and 3 third-place ordinals, this skater would have a majority of 4 for second place or better. A skater who has a majority for a lower-numbered place is ranked ahead of any skaters whose majority is for a higher-numbered place.
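The majority rule just described can be sketched in a few lines. This is only an illustration of the rule as stated here, with a function name chosen for the example; it does not cover tie-breaking.

```python
def majority(ordinals):
    """Return (place, votes): the lowest-numbered place for which a
    majority of judges ranked the skater at that place or better."""
    needed = len(ordinals) // 2 + 1
    for place in range(1, max(ordinals) + 1):
        votes = sum(1 for o in ordinals if o <= place)
        if votes >= needed:
            return place, votes

# The 7-judge example from the text: two 1sts, two 2nds, three 3rds.
example = [1, 1, 2, 2, 3, 3, 3]
print(majority(example))  # (2, 4): a majority of 4 for second place or better
```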

It's really as simple as that. Where it can seem complicated is in the rules for breaking ties, when there is more than one skater with a majority for the same place. But keep in mind that these rules come into play only when the judges have not drawn clear distinctions among the skaters and/or reached a clear consensus about their placements.

In practice, people who take the trouble to work through one or two examples have no problems understanding the method and being able to work out placements by hand.

What happened at 1997 Europeans

The situation which was largely responsible for the current controversy over the ordinal system was the men's competition at the 1997 European Championships. Part of the confusion over the outcome of this competition was due to the fact that nearly all of the top contenders made several mistakes in their free-skating programs and there was no clear consensus among the judges about how to rank them in that part of the competition. Because the competition was very close and the judges were divided on where to place each successive skater compared to the ones who had gone before, their relative placements in that phase of the competition changed as they gained or lost majorities. In turn, this caused similar flip-flops in the total factored placements for the event as a whole.

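Let us first consider the ordinal-related situation. With one skater left to compete, the ordinals and standings for the top group of competitors were as follows. (The nine columns are the ordinals from the nine judges. A notation such as "5/3" means that five judges placed the skater third or better. TOM is the total of the ordinals that make up the majority, and TO is the total of all nine ordinals; both are used only to break ties.)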

1. Alexei Urmanov           1  1  1  1  1  1  1  1  1  9/1
2. Viacheslav Zagorodniuk   5  5  4  4  2  3  2  2  3  5/3, TOM=12, TO=30
3. Philippe Candeloro       3  2  5  2  3  2  5  5  5  5/3, TOM=12, TO=32
4. Ilya Kulik               2  4  2  3  5  4  3  4  4  8/4
5. Alexei Yagudin           4  3  3  5  4  7  4  3  2  7/4

The key to the "surprising" outcome of this competition is that Zagorodniuk and Candeloro were so close together at this point that it came down to the last tie-breaking rule to separate them. Neither skater had a majority of votes for second place, and both of them held a majority of votes for third place by the slimmest of margins. Also key to the eventual outcome is the fact that Kulik and Yagudin were also very closely placed by the judges; both held majorities for fourth place at this point, with Kulik being ahead by virtue of having a larger majority. As a further note on how close the competition was at this point, both Zagorodniuk and Candeloro ranked behind Kulik in terms of the number of votes for fourth place and higher!

Now let us look at the ordinals and placements after the final competitor, Andrejs Vlascenko, had skated.

1. Alexei Urmanov           1  1  1  1  1  2  1  1  1  8/1
2. Philippe Candeloro       3  2  5  2  3  3  5  6  6  5/3
3. Viacheslav Zagorodniuk   5  5  4  4  2  4  2  2  3  7/4, TOM=21
4. Alexei Yagudin           4  3  3  6  4  8  4  3  2  7/4, TOM=23
5. Ilya Kulik               2  4  2  3  6  5  3  4  5  6/4
6. Andrejs Vlascenko        7  7  6  5  5  1  6  5  4  5/5

So, what has happened? The previous "tie" between Candeloro and Zagorodniuk has now been definitely broken, as Zagorodniuk has lost his majority of third-place votes. Moreover, Kulik has also lost some of his fourth-place votes, and dropped behind both Zagorodniuk and Yagudin in terms of those with a majority for fourth place. However, the subsequent reshuffling of the placements seems counterintuitive only if one does not realize how very close the standings were prior to Vlascenko's skate, and how little consensus there was among the judges as to the placement of any skater other than Urmanov.
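Both sets of standings can be reproduced from the ordinals in the two tables. The sketch below assumes the tie-breaks are applied in the order the tables suggest: lower majority place first, then the larger majority, then the lower total of ordinals in the majority (TOM), then the lower total of all ordinals (TO). The function names are inventions for this example.

```python
def sort_key(ordinals):
    """(majority place, -size of majority, TOM, TO); smaller sorts first."""
    needed = len(ordinals) // 2 + 1
    for place in range(1, max(ordinals) + 1):
        in_majority = [o for o in ordinals if o <= place]
        if len(in_majority) >= needed:
            return (place, -len(in_majority), sum(in_majority), sum(ordinals))

def rank(panel):
    """Order skaters by the assumed tie-break sequence above."""
    return sorted(panel, key=lambda name: sort_key(panel[name]))

# Ordinals copied from the two tables above.
before = {
    "Urmanov":     [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Zagorodniuk": [5, 5, 4, 4, 2, 3, 2, 2, 3],
    "Candeloro":   [3, 2, 5, 2, 3, 2, 5, 5, 5],
    "Kulik":       [2, 4, 2, 3, 5, 4, 3, 4, 4],
    "Yagudin":     [4, 3, 3, 5, 4, 7, 4, 3, 2],
}
after = {
    "Urmanov":     [1, 1, 1, 1, 1, 2, 1, 1, 1],
    "Candeloro":   [3, 2, 5, 2, 3, 3, 5, 6, 6],
    "Zagorodniuk": [5, 5, 4, 4, 2, 4, 2, 2, 3],
    "Yagudin":     [4, 3, 3, 6, 4, 8, 4, 3, 2],
    "Kulik":       [2, 4, 2, 3, 6, 5, 3, 4, 5],
    "Vlascenko":   [7, 7, 6, 5, 5, 1, 6, 5, 4],
}

print(rank(before))
print(rank(after))
```

Under these assumptions the first call puts Zagorodniuk second on total ordinals alone (30 against 32), and the second call puts Candeloro second because he is the only one of the group left with a majority for third place.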

The fact that one of the judges chose to place Vlascenko first, ahead of Urmanov, was actually completely irrelevant to the outcome of this competition. The thing that caused the flip-flop in placements between Zagorodniuk and Candeloro was the simple fact that this judge placed Vlascenko ahead of Zagorodniuk -- as did two other judges on the panel. It is also not at all accurate or fair to point a finger at this one judge for altering Zagorodniuk's placement, which was the result of the ordinals given by the majority of the judges.

The public would doubtless not have gotten the impression that Zagorodniuk was somehow "robbed" of the championship if the media had explained that in the free skate, he did not have a majority of votes for first, second, or even third place, but instead found himself in a virtual tie for fourth place with two other skaters. By trying to "simplify" things and thinking in terms of only the resulting placements and not looking at the majorities involved, critical information about the closeness of the competition was lost.

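Finally, let's look at the situation involving the factored placements and the overall outcome of the competition. Each skater's total is the short program placement multiplied by 0.5 plus the free skating placement multiplied by 1.0, and the lowest total wins. Before Vlascenko skated, the overall standings of the competitors who had already skated were as follows.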

1. Viacheslav Zagorodniuk   2*0.5 + 2*1.0 = 3.0
2. Alexei Urmanov           6*0.5 + 1*1.0 = 4.0
3. Ilya Kulik               1*0.5 + 4*1.0 = 4.5
4. Philippe Candeloro       4*0.5 + 3*1.0 = 5.0
5. Alexei Yagudin           5*0.5 + 5*1.0 = 7.5

The final standings worked out to be:

1. Alexei Urmanov           6*0.5 + 1*1.0 = 4.0
2. Philippe Candeloro       4*0.5 + 2*1.0 = 4.0
3. Viacheslav Zagorodniuk   2*0.5 + 3*1.0 = 4.0
4. Ilya Kulik               1*0.5 + 5*1.0 = 5.5
5. Alexei Yagudin           5*0.5 + 4*1.0 = 6.5
6. Andrejs Vlascenko        3*0.5 + 6*1.0 = 7.5

Just as the placements in the free skating portion of the competition were complicated by a mixed bag of performances within that phase, the overall placements in the competition as a whole were also complicated by the fact that skaters who had done well in the short program dropped out of contention in the free skate, and skaters who had made errors in the short program placed higher in the free skate.

In particular, observe that the final outcome between Urmanov, Candeloro, and Zagorodniuk was a virtual tie, broken only by the rule that the placement in the free skating part of the competition takes precedence. Once again, by trying to "simplify" matters by looking only at the overall standings instead of at the details of the factored placement totals for these skaters, critical information about the closeness of the competition has been lost.
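The arithmetic behind the final standings can be checked with a short sketch. The placements are taken from the table above; the function name and the data layout are inventions for this example.

```python
SHORT_FACTOR = 0.5
FREE_FACTOR = 1.0

def overall_standings(placements):
    """placements maps skater -> (short program place, free skate place).

    Lowest factored total wins; equal totals are separated by the
    free skate placement.
    """
    def key(name):
        short, free = placements[name]
        return (short * SHORT_FACTOR + free * FREE_FACTOR, free)
    return sorted(placements, key=key)

final = {
    "Urmanov":     (6, 1),
    "Candeloro":   (4, 2),
    "Zagorodniuk": (2, 3),
    "Kulik":       (1, 5),
    "Yagudin":     (5, 4),
    "Vlascenko":   (3, 6),
}

for name in overall_standings(final):
    short, free = final[name]
    print(name, short * SHORT_FACTOR + free * FREE_FACTOR)
```

The top three all total 4.0, so their order is decided entirely by their free skate placements.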

Is the ordinal system really "broken"?

It appears that the primary problem with the ordinal system is simply the lack of understanding about how it works among the general public, media, and indeed among ISU officials. I would argue, however, that this confusion is not an intrinsic property of the ordinal system itself; that the problem is not that the system is inexplicable, but that nobody has been making a determined effort to explain it. Rather than joining in the general hysteria and making ill-informed statements about the inadequacy of the current scoring system, surely the ISU's primary responsibility should be to promote a deeper understanding of the sport of figure skating among the general public.

Aside from the general confusion about how the system works, the main technical objection that has been made against the current ordinal-based scoring system has to do with the fact that relative placements of skaters can change or "flip-flop" throughout the event as a result of the ordinals given to other skaters. Technically, there are two properties of the ordinal system that contribute to this situation.

The real problem, of course, is that the current figure skating scoring system was never intended to be used to give incremental standings throughout the event. It has been in place since before the days of computers, when the final standings had to be computed by hand on paper at the conclusion of the event.

There is also a non-technical issue that contributes to the perceived problem of flip-flops in the standings while an event is in progress. As noted above in the discussion of 1997 Europeans, "simplifying" the incremental results presented to the spectators and media by showing only overall standings for all phases of the event without indicating the factored placement totals, or only the standings in the free skating phase without any indication of the ordinals or majorities, obscures critical details about the closeness of the competition and lends those standings a definiteness that they do not actually possess. If this information were provided to the public, and an effort were made to educate people about what the information meant, the flip-flops in the standings would seem far less mysterious. In fact, many serious skating fans find watching the ordinal computations unfold during a competition to be fascinating and something that adds significantly to the suspense of the event! ("That's not a bug, it's a feature!")

Advantages of the ordinal system

On the positive side, the current figure skating system has other properties which make it very attractive. The foremost of these advantages is that it is statistically sound. There have been a number of papers published in statistics journals which have compared the figure skating scoring system to the systems used in other sports, such as gymnastics or diving, and the conclusion seems to be that the figure skating system does work extremely well.

(These published articles include: "Rating Skating" by Gilbert Bassett and Joseph Persky, in the Journal of the American Statistical Association, Sept. 94, Vol 89 #427, pp. 1075-1079; and "Amateur Figure Skating: Is the Ranking System Out of Date?" by Edmund L. Russell, III, in the 1995 Proceedings of the Section on Statistics in Sports, American Statistical Association.)

One of the reasons why the figure skating scoring system works so well is that it is based on majority vote. Since it takes a majority of judges to decide the placement of each skater, the chances for error, bias, or deliberate manipulation of scores by some of the judges to affect the overall results of the competition are minimized. From a philosophical point of view, the ordinal system is also attractive because it recognizes that figure skating is a complex sport, with many aspects that must be judged simultaneously, and individual judges may legitimately have differences of opinion in deciding how to weigh the various factors. The judges' individual rankings are combined by a process that determines the overall consensus of the panel, rather than arbitrarily asserting that, for example, the judges on the high and low end of the range are in error.

Another point in favor of the current system is that it is neutral to the order in which the competitors skate. As we shall see, some of the alternatives that have been discussed informally do not have this property.

Finally, another advantage of the current scoring system is that of familiarity. This system has been in place for years. Judges are used to giving marks that reflect relative placements of the skaters rather than an absolute measure of performance. Accountants and referees already understand how the system works, and the computer software is familiar and well-tested. We know both that the system works, and how it works.

Comparison to alternative scoring systems

Alternative 1: modifications to the ordinal system

One of the alternative judging systems that has been discussed informally is a modified ordinal system that uses a different method to combine the ordinals, namely a sort based on pairwise comparison of relative placements: if a majority of judges place skater A ahead of skater B, then skater A remains ahead of skater B no matter whether later competitors are placed ahead, between, or behind them.

Unfortunately, this system doesn't work. Here's a simple example involving three skaters that illustrates a case where it breaks down.

A  1  1  1  2  2  2  3  3  3
B  2  2  2  3  3  3  1  1  1
C  3  3  3  1  1  1  2  2  2

It's obvious that these skaters ought to be tied, even though if you look at each pair of skaters individually you'll see that a clear majority of judges (6/9) placed skater A before B, B before C, and C before A.

A slightly more complicated example would be:

A  1  1  1  1  2  2  2  3  3
B  2  2  2  2  3  3  3  1  1
C  3  3  3  3  1  1  1  2  2

Now, we still have a majority of judges putting A before B (7/9), B before C (6/9), and C before A (5/9), but is this still a tie? Probably not; the correct ordering would be A, B, C. Moreover, if C and A skated before B, there would still be the same "flip-flop" situation we now have with the current ordinal system: the relative placements of C and A would reverse as a result of the scores given to the third skater B.
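The pairwise counts in both examples are easy to verify. The sketch below simply counts, for each pair, how many judges placed the first skater ahead of the second; the function name is an invention for this example.

```python
def votes_ahead(a, b):
    """Number of judges who gave skater a a better (lower) ordinal than b."""
    return sum(1 for x, y in zip(a, b) if x < y)

# First example: a perfectly symmetric three-way cycle.
A1 = [1, 1, 1, 2, 2, 2, 3, 3, 3]
B1 = [2, 2, 2, 3, 3, 3, 1, 1, 1]
C1 = [3, 3, 3, 1, 1, 1, 2, 2, 2]

# Second example: still a cycle, but with unequal majorities.
A2 = [1, 1, 1, 1, 2, 2, 2, 3, 3]
B2 = [2, 2, 2, 2, 3, 3, 3, 1, 1]
C2 = [3, 3, 3, 3, 1, 1, 1, 2, 2]

for a, b, c in ((A1, B1, C1), (A2, B2, C2)):
    print(votes_ahead(a, b), votes_ahead(b, c), votes_ahead(c, a))
```

In both panels every one of the three counts is at least 5 of 9, so a pairwise sort has no consistent ordering to lock in.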

Another suggested modification to the ordinal-based system would attempt to solve the problem of flip-flops in the relative placements of skaters caused by the ordinals given to a third skater by "locking in" the relative placements of skaters. Lacking any more specific details of how this would work, I can say only that I am very skeptical that this could be accomplished in a way that does not bias the overall results of the competition by the order in which the competitors skate. Since the skate order is decided by a random draw, such a system would be manifestly unfair to the athletes.

Alternative 2: trimmed-mean system

The most commonly proposed alternative to the current ordinal-based system is a trimmed-mean system similar to that used in gymnastics and some professional skating events: essentially, "throw out the high and low marks, and add the rest".

Statistically, the problem with an approach based on the mean or average is that an average represents a compromise rather than a consensus. In other words, the average mark may not represent the opinion of the majority of the judges.

Arbitrarily throwing out the high and low marks is also a remarkably poor solution to the problems of national bias or human error that arise from time to time on judging panels, because it assumes that these judges are necessarily mistaken and the others are "correct". As a very common kind of example where this would be obviously the wrong assumption, consider a case where two judges seated at the far end of the panel don't have a good view of a jump placed at the opposite end of the rink and fail to see that the skater lands it on two feet. The other seven judges see the mistake and give the "correct" marks that include a required deduction. Now, what happens under the trimmed-mean system is that only one of the mistaken marks is discarded, along with one of the correct ones! To make the situation even worse, consider what would happen if a third judge happened to sneeze at the wrong moment and also missed the skater's mistake. In short, if there's more than one judge out of line at the same end of the scale, then the incorrect marks will end up affecting the skaters' overall placement even if the majority of judges are in agreement. The ordinal system does not have this problem.
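To put numbers on this scenario, here is a small sketch with entirely hypothetical marks, using one combined mark per judge for simplicity. A rival skater receives 5.3 from every judge. The skater who two-footed the jump receives 5.2 from each judge who saw the error and 5.6 from each judge who missed it. The function names are inventions for this example.

```python
def trimmed_mean(marks):
    """Drop one highest and one lowest mark, then average the rest."""
    kept = sorted(marks)[1:-1]
    return sum(kept) / len(kept)

def judges_preferring(a, b):
    """Number of judges who gave skater a a higher mark than skater b."""
    return sum(1 for x, y in zip(a, b) if x > y)

# Hypothetical marks from a nine-judge panel.
rival = [5.3] * 9
two_missed = [5.2] * 7 + [5.6] * 2    # two judges miss the two-footed landing
three_missed = [5.2] * 6 + [5.6] * 3  # a third judge sneezes

for marks in (two_missed, three_missed):
    print(
        judges_preferring(rival, marks),
        round(trimmed_mean(marks), 3),
        round(trimmed_mean(rival), 3),
    )
```

With two mistaken judges the trimmed mean is pulled upward but the rival stays ahead. With three, the trimmed mean puts the skater who made the error ahead of the rival, even though six of the nine judges marked the rival higher.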

What the ISU and media can do

Let's start out by talking about what the ISU shouldn't do: rush into a decision to institute an ill-considered, stopgap system that has no overall technical advantage over the current scoring system. In addition to whatever problems are specific to such a new scoring system, in the short term, there would be two additional problems to deal with. First, a radically different system would be more prone to errors resulting from the unfamiliarity of judges and other officials with the system. Second, we can be sure that situations will arise in the future that are at least as controversial as those that have led to the current re-examination of the scoring system. The controversy surrounding these situations would be made all the worse by the inevitable complaints that "this wouldn't have happened with the old scoring system". (As an example of a parallel situation where this is now happening, observe the unintended consequences of the ISU's new qualifying rules for the World Championships.)

Surely it does not seem like a good idea to use a competition as important as the Olympic Games as the first major test of a new scoring system. It also does not seem like a good idea to further jeopardize the credibility of figure skating as a legitimate sport in the eyes of the public by replacing the current scoring system with a different system that is likely to lead to results that are equally confusing or manifestly unfair to the athletes.

It may be that there is a better scoring system that fixes the "flip-flop" problem while also remaining neutral to the skating order and avoiding the statistical problems with the mean or trimmed-mean approaches. However, until such a system is devised and has been demonstrated to work, the ISU has nothing to gain and very much to lose by instituting a change to some other system that is not a clear improvement over the current system.

I believe that the current problems can be minimized without replacing the scoring system, and that the ISU's immediate mission ought to be simply to educate the public, and to assist the media in educating the public, about the rules of our sport, rather than to change the rules merely for the sake of changing them. Some specific suggestions for how to approach this include:


Sandra Loosemore is a longtime skating fan and a regular contributor of commentary and reviews to the skating discussion groups on the Internet. She publishes The Figure Skating Page at http://www.frogsonice.com/skateweb/ and is the author of the Competitive Figure Skating Frequently Asked Questions List, a collection of reference and tutorial material about the sport that is regularly updated and published on the net. She is also a recreational figure skater.
