Choosing a mathematical model to rate players

We are designing a simple game (details on the release) in which players play a match, receive a positive or negative score, and are rated on it. This post describes the mathematical model we use to calculate the players' ranks and the volatility of a team.

What is volatility of the team?

It's a measure of the standard deviation of the team's performance; in practical terms, it is the uncertainty in the team's peak performance.
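As a small illustration, assuming a team's performance is recorded as a list of per-match scores (a hypothetical representation, as is the function name `team_volatility`), the volatility could be computed as a population standard deviation:

```python
from statistics import pstdev


def team_volatility(match_scores):
    """Population standard deviation of a team's per-match scores.

    `match_scores` is an assumed representation: one number per match.
    """
    if not match_scores:
        raise ValueError("need at least one match score")
    return pstdev(match_scores)


# A team with swingy results is more volatile than a steady one.
print(team_volatility([10, -5, 20, 0]))
print(team_volatility([5, 5, 5]))
```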

What is the algorithm/mathematical model used?

Well, we did a LOT of brainstorming over it. Initially we came up with a model of our own. The rating of a player was a function of his:

  1. instantaneous skill set (a measure of current performance)

  2. weight (the player's experience, depending on how frequently he has been competing)

  3. deviation in performance, with respect to the volatility of the opponent.

All three combined and normalized for all the players on a logarithmic scale would give the player's current rating, with respect to his competitors.

But that had way too many flaws to be put to the test, so we gave up on it. The obvious choice to start off with was the Elo rating system, widely used in chess and in football team rankings. But Age of Empires explains why they moved from Elo to Power Rating here; to us, their Power Rating felt similar to Glicko, developed by Mark Glickman.

Glicko has since been improved upon by Glicko-2, and even Microsoft uses a similar idea in its TrueSkill ranking system, which ranks players on Xbox Live based on mu (skill) and sigma (deviation). We implemented Glicko's model and tested it on our server, but it turned out to be more resource-intensive than we had estimated, and we wanted something simpler but still standard. Later, from Evan Miller's article, we learned that reddit, Yelp and Digg used the Wilson score interval.

Randall Munroe, creator of xkcd, explains it in detail here. It was safe enough to modify the Wilson score interval slightly in the context of the game, and it suits our need. Here is sample output for four players whose all-time positive and negative scores have been summed:

Player    Positive Score    Negative Score    Factor
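As a rough illustration of how a factor like this could be computed, here is a minimal sketch of the lower bound of the Wilson score interval, as described in Evan Miller's article. It is the standard, unmodified formula, not our slightly modified version; the function name `wilson_lower_bound` and the default `z = 1.96` (roughly 95% confidence) are assumptions made for the example.

```python
from math import sqrt


def wilson_lower_bound(positive, negative, z=1.96):
    """Lower bound of the Wilson score interval for positive / (positive + negative).

    `z` is the normal quantile for the chosen confidence level
    (1.96 is roughly 95%); the default is an assumption for this sketch.
    """
    n = positive + negative
    if n == 0:
        return 0.0  # no matches played yet, so no evidence either way
    p = positive / n                      # observed positive fraction
    z2 = z * z
    centre = p + z2 / (2 * n)
    margin = z * sqrt((p * (1 - p) + z2 / (4 * n)) / n)
    return (centre - margin) / (1 + z2 / n)


# Same 9:1 ratio, different amounts of evidence.
print(wilson_lower_bound(9, 1))
print(wilson_lower_bound(90, 10))
```

With the same 9:1 ratio, the player with the larger total ends up with the higher factor, because more evidence narrows the interval.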

The factor gives a fair idea of how well a player has performed, but it does not take into account the time for which a player has not competed. Hence we add another factor, gravity, which causes a player's rating to decay if he does not participate. In short, upvote:downvote :: positive_score:negative_score, with an added factor of gravity over the outcome to encourage consistent participation in matches.

The game is purely risk-based and does not involve any measure of skill, so to start off with, the above model seems to be a good fit. Let's see how it goes. We are actually past the deadline for the game and will release it pretty soon, stay tuned.
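One way the gravity idea could be sketched is a power-law decay on the time since a player's last match. The function name `decayed_rating`, the `days_inactive` parameter and the default `gravity` exponent of 1.8 are all hypothetical, chosen only for illustration, and are not the values used in the game.

```python
def decayed_rating(factor, days_inactive, gravity=1.8):
    """Shrink a rating factor the longer a player stays away.

    Assumed form: factor / (1 + days_inactive) ** gravity, so a player
    who competed today keeps the full factor and idle players sink.
    """
    if days_inactive < 0:
        raise ValueError("days_inactive must be non-negative")
    return factor / (1.0 + days_inactive) ** gravity


# The same factor after 0, 1 and 7 days without a match.
for days in (0, 1, 7):
    print(days, decayed_rating(0.8, days))
```

A larger `gravity` punishes absence more sharply; setting it to 0 switches the decay off.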

