A geeky question/suggestion about the ratings algorithm.


Re: A geeky question/suggestion about the ratings algorithm.

Postby Groo » 16 Jun 2017, 15:44

nopunin10did wrote:
Though I'm sure their Elo has a lot more mess going on behind-the-scenes, I have been able to glean that it's roughly based on a model like this:

  • -20 for a loss
  • 120 for a solo
  • 50 each for 2-way
  • 26.667 each for 3-way
  • 15 each for 4-way
  • 8 each for 5-way
  • 3.333 each for 6-way
  • 0 for everyone in 7-way


I'm not sure exactly what you're referring to with this model, but I'm pretty sure ratings on PlayDip don't currently work this way: I have only one ranked game, a solo victory, and I got around 400 points for it. So please explain :)

I glanced at Dixiecon, and I don't find it satisfying enough either. My idea was not to create a usable Elo system for PlayDip at this moment, but to "improve" the existing system a tad. It should encourage players to go for the solo even more: only the second-placed player in a solo would get some points or a rating shield. For example, if you're the only survivor besides the victor, you get some points, let's say 10% of what the victor gains; if you're second in a solo with more survivors, you get a rating shield and lose no points.
I don't see the need for a zero-sum environment at all if we're talking about scores on this site alone. If we're talking about a "global" Dip Elo ranking system, then that should have a separate thread and be discussed much more thoroughly.
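
A minimal sketch of the runner-up rule proposed above, just to make the two cases concrete. The function name, its parameters, and the way "second place" is passed in are all hypothetical; the 10% figure is only the example value from the proposal, not an existing site rule.

```python
def runner_up_adjustment(victor_gain, survivors_besides_victor, bonus_share=0.10):
    """Rating change for the second-placed player when someone else solos.

    Hypothetical helper: survivors_besides_victor counts every surviving
    player other than the soloist, including the runner-up.
    """
    if survivors_besides_victor < 1:
        raise ValueError("the runner-up must be one of the survivors")
    if survivors_besides_victor == 1:
        # Only survivor besides the victor: a small share of the victor's gain.
        return bonus_share * victor_gain
    # Second place among several survivors: rating shield, so no points lost.
    return 0.0


print(runner_up_adjustment(120, 1))  # sole survivor besides the victor
print(runner_up_adjustment(120, 3))  # shielded runner-up
```

Both cases hand out points (or waive a loss) without taking them from anyone else, so a rule like this would not be zero-sum on its own.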
Groo
 
Posts: 12
Joined: 14 Nov 2016, 18:13
Class: Diplomat
Rating: 1405
Timezone: GMT

Postby nopunin10did » 16 Jun 2017, 16:38

Wethalon wrote:The scoring system depends on rating differences. See here for chess:

https://en.wikipedia.org/wiki/Elo_ratin ... al_details

There are no standard changes for certain results. But yes it's zero-sum.


That's true. What I'm describing is the base case, where you assume equally skilled opponents and no variation in K-factors.
NoPunIn10Did
PBEM & PBF GM

Variant Designer & Collaborator
nopunin10did
Premium Member
 
Posts: 611
Joined: 17 Aug 2011, 00:17
Location: North Carolina
Class: Ambassador
Rating: 1345
Timezone: GMT-5

Postby asudevil » 16 Jun 2017, 16:46

Groo wrote:I don't see the need for zero sum environment at all, if we're talking about score on this site alone. If we're talking about a "global" dip elo ranking system...then this should have a separate thread and be discussed much more thoroughly.


You have to have a zero-sum environment; otherwise inflation or deflation skews the ratings, and people who have been here longer end up with higher ratings by default, based not on skill but on the number of games played.
Captain FANG, forum team championships WINNER
Part of the surviving nations of WW4/Haven

Unless I am in the cheater's subforum. 99% of what I say is NOT as a mod.

asudevil
Premium Member
 
Posts: 15360
Joined: 18 Jul 2011, 02:20
Class: Star Ambassador
Rating: 1536
Timezone: GMT-7

Postby nopunin10did » 16 Jun 2017, 16:58

Groo wrote:I'm not sure what exactly are you referring to with this model, but I'm pretty sure ratings on playDip now are not working this way, because I have only one ranked game, and a solo victory and I got around 400 points, so please explain :)

I don't see the need for zero sum environment at all, if we're talking about score on this site alone. If we're talking about a "global" dip elo ranking system...then this should have a separate thread and be discussed much more thoroughly.


As mentioned in a prior reply, the numbers I listed are not the whole story; they're the base case. Your 400 is likely the result of your K-factor, a multiplier that shrinks or magnifies a win, draw, or loss independently of the other players. Here, K-factors appear to start high, pushing your rating further up or down in your earliest games (as a means of measuring your skill faster), and then shrink to a constant over time.
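
As a rough check on that 400, here is a sketch that works backwards from an observed gain to the K-factor it would imply. It assumes the usual Elo form (change = K times actual share minus expected share) and, as a simplification, seven equally rated players, so the expected share is 1/7. A brand-new player's opponents are unlikely to be equally rated, so the number is only a ballpark. The function name is made up for illustration.

```python
def implied_k(observed_gain, actual_share, expected_share):
    """Solve change = K * (actual - expected) for K (hypothetical helper)."""
    return observed_gain / (actual_share - expected_share)


# A solo takes the whole result (share 1.0); with seven equally rated
# players the expected share is 1/7. Both values are assumptions.
k = implied_k(observed_gain=400, actual_share=1.0, expected_share=1 / 7)
print(round(k, 1))
```

Under those assumptions a first-game gain of about 400 points would correspond to a K-factor somewhere in the mid 400s, well above the value the base-case numbers imply.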

The other side is your rating compared to your opponents' ratings before the game's points are applied. In the assumed base case, you're expected to get 20 points out of 140 (an equal one-seventh share). Your adjustment is your actual result minus your expectation, which is why a solo lands at 120 (140 - 20) and a loss at -20 (0 - 20).
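
To show where the whole base-case list comes from, here is a small sketch that splits 140 points among the winners and subtracts the 20-point expectation. The function and constant names are mine; the 140 and the seven players are the base-case assumptions described above.

```python
K = 140       # total points at stake in the base case
PLAYERS = 7   # standard board


def base_case_change(winners, k=K, players=PLAYERS):
    """Rating change for one player; winners=0 means this player lost."""
    expected = k / players                  # 20 when everyone is equal
    actual = k / winners if winners else 0  # equal split among the winners
    return actual - expected


print("loss:", base_case_change(0))
for n in range(1, PLAYERS + 1):
    print(f"{n}-way:", round(base_case_change(n), 3))
```

Running it reproduces the list: 120 for a solo, 50 for a 2-way, down to 0 for a 7-way, and -20 for a loss.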

But let's say one player was three times as good as you, and the other five were twice as good as you. Then the Elo calculation would put your expected share at 10, the best player's at 30, and each of the other five at 20. For you, a loss would still be negative, but only half as much (-10). And if the best player landed in a large enough draw (a 5-way or larger, worth 28 or less each), they would lose points.
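
The same arithmetic for that unequal table, as a sketch. The strengths list simply encodes "1x, 3x, 2x" as relative weights, which is an illustrative stand-in for whatever the real system derives from ratings; all names here are hypothetical.

```python
K = 140
# Relative strengths: you, the best player, then five players twice as good as you.
strengths = [1, 3, 2, 2, 2, 2, 2]
total = sum(strengths)
expected = [K * s / total for s in strengths]


def change(player, winners):
    """Actual share of K (split equally among winners) minus expected share."""
    actual = K / len(winners) if player in winners else 0
    return actual - expected[player]


print(expected)
print(change(0, {1}))              # you lose to a solo by the best player
print(change(1, {1, 2, 3, 4}))     # best player in a 4-way draw
print(change(1, {1, 2, 3, 4, 5}))  # best player in a 5-way draw
```

The best player still gains from a 4-way draw (35 against an expectation of 30) but loses points in a 5-way (28 against 30).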

And you absolutely need a near-zero-sum environment for any ratings system where you don't want "I play more games" to become a trivial means of increasing your rating.

Postby Groo » 16 Jun 2017, 19:11

OK, thanks for the explanation, guys :D

Postby Wethalon » 18 Jun 2017, 14:20

Here is my best guess as to how the system actually works:

1) For each player (i = 1,2,...,7), divide their rating by 400 and exponentiate: p[i] = exp(Rating[i]/400)
2) Normalize these numbers by dividing each by their sum: ExpectedResult[i] = p[i]/(sum_j p[j]). This is a number between 0 and 1 representing each player's expected result.
3) ActualResult[i] is 1 divided by the number of winners if you are a winner, 0 otherwise. So it would be 1/3 for each member of a three-way draw.
4) Rating change for player i is K * (ActualResult[i] - ExpectedResult[i]).

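This system is guaranteed to be zero-sum because both ActualResult and ExpectedResult sum to 1 over all players. That holds only if everyone in the game shares the same K. With two players the system also reduces to the same form as the Elo system in chess (chess uses base 10 rather than e, which only rescales the 400).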
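
The four steps can be sketched in a few lines of Python. This is only the guessed model, not the site's actual code: the function name is made up, the defaults for k and scale are placeholders for the unknown values, the sample ratings are arbitrary, and a single K shared by all players is assumed (per-player K values would break the zero-sum property).

```python
import math


def rating_changes(ratings, winners, k=140.0, scale=400.0):
    """Guessed multi-player Elo update.

    ratings: pre-game ratings, one per player.
    winners: set of player indices sharing the win (one index for a solo).
    k, scale: placeholder values; the site's real constants are unknown.
    """
    weights = [math.exp(r / scale) for r in ratings]    # step 1
    total = sum(weights)
    expected = [w / total for w in weights]             # step 2
    share = 1.0 / len(winners)
    actual = [share if i in winners else 0.0            # step 3
              for i in range(len(ratings))]
    return [k * (a - e) for a, e in zip(actual, expected)]  # step 4


equal = rating_changes([1000] * 7, winners={0})
print([round(c, 3) for c in equal])

mixed = rating_changes([1900, 1500, 1400, 1300, 1000, 1000, 1000], winners={0, 1})
print([round(c, 3) for c in mixed], round(sum(mixed), 9))
```

With equal ratings this reproduces the 120 / -20 base case for a solo, and with mixed ratings the changes still sum to zero up to floating-point rounding.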

Unknowns:
- The 400 in Step 1 is the value used in chess. I don't know if PD uses the same value.
- The K-factor in Step 4. This was discussed two posts above. It is set by the site and depends on how much you have played and/or your current rating. It likely starts at 400 and decreases in steps, probably always being a round number.

If someone is really dedicated, they could try to determine the unknowns from recently finished games (tricky: you'd need to know the ratings going in) 8-)
Wethalon
Premium Member
 
Posts: 40
Joined: 14 Jun 2015, 04:35
Class: Star Ambassador
Rating: 1926
Timezone: GMT-5

Postby asudevil » 18 Jun 2017, 15:19

Wethalon wrote:This system is guaranteed to be zero-sum because both ActualResult and ExpectedResult sum to 1 over all players.


Also, each individual game isn't zero-sum, because a solo gets more than a 2-way draw, but losing to a solo doesn't "cost" you more points than losing to a 2-way draw.

And there are surrenders, which take full losses immediately, and ratings shields, which DON'T cost you points when you lose.

So it's not entirely zero-sum for each game, although we try to keep it close. It's pretty close to zero-sum for the site as a whole, hence our inflation of less than 0.1% over the last 3 years.
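
A quick sketch of why a ratings shield alone makes a single game non-zero-sum. How shields are really applied is not described here, so this assumes the simplest reading: a shielded player's loss is clamped to zero. The helper name and the base-case numbers (120 for the soloist, -20 for each loser) are illustrative.

```python
def apply_shields(changes, shielded):
    """Clamp losses to zero for shielded players (assumed shield behaviour)."""
    return [max(0.0, c) if i in shielded else c
            for i, c in enumerate(changes)]


# Base-case solo: the winner gains 120, six losers drop 20 each.
base = [120.0] + [-20.0] * 6
with_shield = apply_shields(base, shielded={3})

print(sum(base))         # net points created by the unshielded game
print(sum(with_shield))  # net points created once one loser is shielded
```

The shielded loser's 20 points are never collected, so the game injects 20 points into the pool; that is the kind of small per-game leak that shows up as site-wide inflation.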

Postby nopunin10did » 18 Jun 2017, 17:28

Wethalon wrote:Unknowns:
- The 400 in Step 1 is the value used in chess. I don't know if PD uses the same value.
- The K-factor in Step 4. This was discussed two posts above. It is set by the site and depends on how much you have played and/or your current rating. It likely starts at 400 and decreases in steps, probably always being a round number.


Based on the data I've been able to collect, the "400" value seems to be either 1200 or 1400. The K-factor changes based on the number of games you've played, but it appears to shrink to approximately 140 eventually. That value yields the base case mentioned earlier: -20 for a loss, 120 for a solo.

For Ancient Med games, where there are 5 players instead of 7, I think the K-factor is 100 instead. A base-case loss there still costs 20, but a solo is worth 80, a 2-way 30, a 3-way 13.333, and a 4-way 5.
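
Both variants fit one pattern: if a base-case loss is always 20, then K is 20 times the number of players (140 for 7, 100 for 5). The sketch below builds the table from that assumption; the function name and the fixed 20-point loss are my framing of the numbers above, not a confirmed site rule.

```python
def base_case_table(players, loss=20.0):
    """Base-case gain for each draw size, assuming K = loss * players."""
    k = loss * players
    return {winners: round(k / winners - loss, 3)
            for winners in range(1, players + 1)}


print(base_case_table(7))  # standard board
print(base_case_table(5))  # Ancient Med
```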