{"id":505,"date":"2012-01-19T14:45:25","date_gmt":"2012-01-19T14:45:25","guid":{"rendered":"http:\/\/blog.warlight.net\/?p=505"},"modified":"2014-09-02T16:49:46","modified_gmt":"2014-09-02T16:49:46","slug":"trueskill","status":"publish","type":"post","link":"https:\/\/war.app\/blog\/index.php\/2012\/01\/trueskill\/","title":{"rendered":"TrueSkill"},"content":{"rendered":"<p>I&#8217;ve been experimenting with <a href=\"http:\/\/en.wikipedia.org\/wiki\/TrueSkill\">TrueSkill<\/a> as a potential replacement for the Bayesian ELO that the ladders use.<\/p>\n<p>I&#8217;ve put together a small sample app (download at the bottom of this post) that will calculate the ratings of players using the TrueSkill algorithm.  This can be used to compare the results of the algorithms side-by-side.  Below is the top 30 results of the 1v1 ladder, as of the date of this blog post.<\/p>\n<table>\n<tr align=\"center\">\n<td>TrueSkill<\/td>\n<td>Bayesian ELO<\/td>\n<\/tr>\n<tr>\n<td>\n<pre>\r\nRank Player                  Rating     Wins Losses\r\n---------------------------------------------------\r\n1    zaeban                  2421.536   41   6\r\n2    Gui                     2372.826   15   2\r\n3    AceWindu                2354.958   17   3\r\n4    Rubik87                 2308.389   32   8\r\n5    unknownsoldier          2287.282   20   7\r\n6    Heyheuhei               2264.112   62   23\r\n7    Eitz                    2258.755   31   11\r\n8    NuckLuck                2254.831   16   5\r\n9    ????V                   2210.151   29   11\r\n10   ????Chaos               2205.284   20   2\r\n11   chas                    2203.465   36   15\r\n12   Oliebol                 2187.576   21   8\r\n13   20AquaHolic             2158.945   61   31\r\n14   Yeon                    2151.634   18   6\r\n15   13CHRIS37               2143.782   44   16\r\n16   DrTypeSomething         2098.338   18   6\r\n17   WMMekBlaze              2098.327   18   4\r\n18   TheEmperorCornInMyTight 2079.722   61   
29\r\n19   MonsenhorChacina        2075.971   9    3\r\n20   Mian                    2072.797   23   10\r\n21   Hroptatyr               2071.86    26   12\r\n22   PaniX                   2069.742   16   9\r\n23   bytjie                  2037.808   20   11\r\n24   Xyphistor               2018.273   71   39\r\n25   alababi                 2010.574   18   8\r\n26   REGLMentysh             2005.368   28   17\r\n27   WMDazedInsane           1992.604   24   9\r\n28   20TheWindowCleaner      1988.566   15   5\r\n29   JimH                    1985.033   33   18\r\n30   Tor                     1978.475   24   17\r\n<\/pre>\n<\/td>\n<td>\n<pre>\r\nRank Name                        Elo\r\n------------------------------------\r\n   1 AceWindu                   2176\r\n   2 Gui                        2163\r\n   3 zaeban                     2130\r\n   4 ????Chaos                  2085\r\n   5 NuckLuck                   2060\r\n   6 Rubik87                    2042\r\n   7 unknownsoldier             2039\r\n   8 MonsenhorChacina           2029\r\n   9 zibik21                    2021\r\n  10 WMMekBlaze                 2004\r\n  11 ????V                      1995\r\n  12 Yeon                       1992\r\n  13 Eitz                       1989\r\n  14 Oliebol                    1981\r\n  15 20TheWindowCleaner         1977\r\n  16 Heyheuhei                  1963\r\n  17 DrTypeSomething            1961\r\n  18 chas                       1954\r\n  19 PaniX                      1950\r\n  20 Troll                      1943\r\n  21 13CHRIS37                  1935\r\n  22 TheImpaller                1933\r\n  23 Mian                       1927\r\n  24 fwiw                       1917\r\n  25 alababi                    1907\r\n  26 Hroptatyr                  1904\r\n  27 LilEitz                    1892\r\n  28 bytjie                     1886\r\n  29 WMDazedInsane              1882\r\n  30 Fizzer                     1879\r\n<\/pre>\n<\/td>\n<\/tr>\n<\/table>\n<p>The ratings are not important, just 
the ordering of the players.  Wins and losses appear only in the left table, but both tables cover the same games, so the numbers apply to both sides.  <\/p>\n<p>Even though Bayesian ELO is what the site uses now, you might notice some differences between the right table and what WarLight.net shows today.  They&#8217;re not identical since WarLight doesn&#8217;t give ranks to players who have left the ladder, don&#8217;t have 10 games yet, or are on vacation.  <\/p>\n<h1>Algorithm Differences<\/h1>\n<p>Each algorithm has advantages and disadvantages.  The biggest difference is that the ripple effect that Bayesian ELO uses does not exist in TrueSkill.  That is, when a game ends, Bayesian ELO applies the biggest changes to the players who played that game, but also applies smaller adjustments to everyone who has played either of those players, and so on.<\/p>\n<p>The nice thing about TrueSkill is that when a game ends, you immediately know how many rating points you gained or lost.  Your rating also changes only when you finish a game.  This means that you can see exactly how your rating reached its current value, as each game can show its effect on your rating.<\/p>\n<p>The disadvantage of this is that *when* you defeat an opponent matters.  Say player A rises from #30 to #1 on the ladder.  If you defeat player A when they&#8217;re at #1, you&#8217;ll get a much bigger ratings boost than you would have if you defeated them when they were #30.  This isn&#8217;t true in Bayesian ELO, since the rating points you earned from defeating player A increase as they climb the ladder.<\/p>\n<p>This is most visible in contrived examples.  Say player A defeats B, B defeats C, and then C defeats A.  In Bayesian ELO, all three players would be tied, as their victories form a perfect triangle, evening each other out.  
In TrueSkill, player C would be the highest ranked, since they defeated A, who was the #1 ranked player at the time of their game.<\/p>\n<h1>Running TrueSkill Simulations<\/h1>\n<p>As mentioned above, I wrote a command-line tool that allows you to run simulations of WarLight ladders with the TrueSkill algorithm.<\/p>\n<p>You can download this tool from this link: <a href=\"http:\/\/data.warlight.net\/WLTrueSkill.exe\">WLTrueSkill.exe<\/a>.<\/p>\n<p>On Windows, this requires the .NET 4.0 runtime to be installed.  On Mac or Linux, the tool should work fine under a recent version of Mono.<\/p>\n<p>To use the tool, simply feed it one of the Bayeselo logs linked from <a href=\"http:\/\/wiki.warlight.net\/index.php\/Ladder_Ranks_and_Ratings#Run_your_own_Ladder_Simulations\">the wiki<\/a>.  For example:<\/p>\n<pre>WLTrueSkill < BayeseloLog0.txt<\/pre>\n<p>By running the program with an argument of \/?, you can see some additional options.<\/p>\n<h1>Feedback<\/h1>\n<p>I'm considering using TrueSkill for Season II as a trial run.  Let me know your thoughts!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve been experimenting with TrueSkill as a potential replacement for the Bayesian ELO that the ladders use. I&#8217;ve put together a small sample app (download at the bottom of this post) that will calculate the ratings of players using the TrueSkill algorithm. This can be used to compare the results of the algorithms side-by-side. 
Below &hellip; <a href=\"https:\/\/war.app\/blog\/index.php\/2012\/01\/trueskill\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;TrueSkill&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/posts\/505"}],"collection":[{"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=505"}],"version-history":[{"count":16,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/posts\/505\/revisions"}],"predecessor-version":[{"id":1071,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/posts\/505\/revisions\/1071"}],"wp:attachment":[{"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=505"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=505"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/war.app\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=505"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}