Aadityan Ganesh, Haran Mouli and Hari Ramasubramanian
In 2 lines: Not surprisingly, we argue that ranking players according to points at the end of a double round-robin tournament increases the likelihood of ordering the players in decreasing order of strengths. However, this leads to a surprising corollary, conditioned on two players ending up with the same number of points, no tie-break that relies purely on the tournament results identifies the stronger among the two players with probability better than half.
Designing effective tie-breaking protocols is a critical challenge in competitive settings. Formally, a tie-breaking protocol can be viewed as a mapping from all available information in a tournament to an ordered ranking of participants. The simplest and most widely used method is ranking by total points, but tie-breaks can become much more intricate. In this article, we consider designing tie-breaking protocols aimed at enhancing the likelihood of the stronger player finishing higher in the rankings.
Tie-breaks often rely on external metrics that go beyond the match results themselves. For example, cricket and football use measures like net run rate or goal difference to capture the margins of victories and defeats—details not apparent from results alone. We say such tie-breaks are exogenous. On the other hand, many sports, like chess, use tie-breakers derived entirely from match outcomes, which we call endogenous tie-breaks. Ranking players by points is a straightforward example of an endogenous tie-break. Chess also employs more nuanced methods, such as the Sonneborn-Berger score (which factors in the points of defeated and drawn opponents) or counting the number of wins, particularly those achieved with Black pieces.
Interestingly, chess relies on endogenous tie-breakers even in double-round robin tournaments, where players play each other once with both white and black pieces. Famously, in the 2013 Candidates Tournament (the tournament to decide the challenger for the World Champion crown), Carlsen and Kramnik finished with equal points, but Carlsen was declared the winner due to more wins. One might argue that the result to be fair and Carlsen’s higher number of wins to be a sign of dominance. On the other hand, one can also argue that Kramnik’s fewer losses reflect his consistency. Such disagreements are inherent to any endogenous tie-breaking system. For example, breaking ties based on a better head-to-head score in the tournament can be seen as favouring the player who had a poorer tournament against all other opponents. We formalize this intuition to argue that all endogenous tie-breaking rules are effectively equivalent to a coin toss: the probability of the stronger player ranking higher, given equal points, is always one-half, regardless of the tie-breaker used.
Theorem 1: In a double round-robin tournament for a symmetric game where two players finish with the same number of points, the probability of any endogenous tie-break ranking the stronger player above the weaker player is exactly half.
The Davidson Model
We adopt the Davidson model to assign probabilities for various outcomes in a tournament. Intuitively, it is not guaranteed that a stronger player will always beat a weaker player. A player is considered strong when they have a large probability of winning. The Davidson model captures this intuition.
The Davidson model assumes the existence of an inherent strength s for each player in the tournament. For a symmetric game with three possible results — win, loss and draw, and two players with strengths and
,
- player 1 wins with probability
,
- the game ends in a draw with probability
, and,
- player 2 wins with probability
,
for some constant .
is said to be the draw constant of the game. For example, the draw constant in chess is expected to be much more larger than T20 cricket, where it is extremely improbable for the game to end in a tie.
For example, the Elo rating system used in chess, go and football is a variant of the Davidson model. The rating of a player is used to approximate the strength
for some scale factor
(for example,
can be chosen so that the odds of a player rated 400 points more than the other winning is 10:1). In other words, if the “correct” Elo ratings have been discovered by the rating updates, a player rated 400 points more than their opponent should win roughly 10 out of every 11 games after an extremely large number of games have been played between them.
The Davidson model can be extended to games with a bias between the players. Traditionally, the player with the white pieces has held a slight advantage over the player with the black pieces in the highest levels of chess. Such bias can be accommodated by multiplying the first player’s strength by a bias constant
and calculating the Davidson probabilities with strengths
and
. While our results continue to hold for double round-robin tournaments where each player plays the other twice (once with the bias favouring them and once with the bias against them), we will restrict our attention to games without bias for this post.
Preparing the Final Standings
While we will not be able to learn the exact strengths of the players at the end of the tournament, we can still hope to find a ranking algorithm that increases the likelihood of ranking the players in decreasing order of strengths. Formally, a tournament is a collection of ordered triples with
if
and
played against each other which resulted in
. For a tournament
and standings
of the
players output by the ranking algorithm, we would like to maximize the likelihood function
of the tournament given by
By Bayes rule, we have . Note that
is the probability of a particular set of results happening in the tournament and is independent of the ranking procedure. Further, the probability
is the probability that, at the start of the tournament,
is the strongest player, followed by
, all the way to
. However, at the start of the tournament, we do not have any information about the players’ strengths and thus, player
being stronger than
is equally as player
being stronger than
. The probability that
equals
as is also independent of the ranking algorithm. Thus, the ranking algorithm should output a final standing
that maximizes
Not surprisingly, ranking players by points (assuming 1 point for a win, 1/2 for a draw and none for a loss) maximizes the above probability in a round-robin tournament. In the theorem below, we will argue that the optimal ranking algorithm should always rank a player with higher points than player
above
Theorem 2: For a player with more points than
,
.
Proof: We will pretend we know the strengths of all the
players and calculate
. Then, for all
and any fixed strengths
of the other players, we will prove that
. Thus, it must be the case that the tournament
is much more likely to occur when
is a stronger player than
.
Let and
respectively when
beats
,
loses to
and when the game results in a draw. Further, let
be the total points scored by player
. Then,
From the above equation, it is clear that for
. The denominators
are identical in both the probabilities irrespective of whether
or
. The numerators
are identical too, except for
, which is indeed larger when
and
. The theorem follows.
Theorem 1 that claims tie-breaking is pointless when two players are tied on the same number of points follows immediately follows immediately from the proof of Theorem 2. When , we have
for all
, and thus,
. As a consequence,
. Guessing the player with the higher strength is a coin flip, and any ranking algorithm will guess the stronger player with probability exactly
.
However, tie-breakers can be extremely useful despite what Theorem 1 suggests. For that matter, Theorem 1 observes that conditioned on two players being tied on points, no tie-break can perform strictly worse than another. Thus, beyond the goal of ranking players according to their strengths, the tie-break can be used to elicit an intended behaviour from the players. For example, tournament organizers might want to reward players for playing more entertaining chess and as a result, can consider breaking ties in favour of the player with a higher number of wins.