dumpster_ wrote:Is there some kind of adjustment based on how balanced the teams are?
Even the greatest players can sometimes perform atrociously, not necessarily through any fault of their own.
Even if one were to play perfectly it's still possible to end up 0-5.
I've seen it happen to every top tier player without exception.
No, there is no such adjustment. Each of the four ratings (farm/kills/deaths/assists) is a flat ratio of your hero's stat per game duration to the community's average for that hero. The four are then averaged with equal weight to produce the general rating.
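To make the arithmetic concrete, here is a minimal sketch of that calculation. The function names, the per-minute normalisation, and the sample averages are my assumptions for illustration, not the actual implementation. The deaths ratio is used as-is, since nothing above says whether it gets inverted.

```python
from statistics import mean

CATEGORIES = ("farm", "kills", "deaths", "assists")

def category_ratio(player_value, duration_min, community_avg_per_min):
    """Player's per-minute stat divided by the community per-minute average
    for the same hero. Returns 0.0 when the inputs are unusable."""
    if duration_min <= 0 or community_avg_per_min <= 0:
        return 0.0
    return (player_value / duration_min) / community_avg_per_min

def general_rating(stats, duration_min, hero_avg_per_min):
    """Equal-weight average of the four category ratios.
    Assumption: deaths is treated like the other ratios (not inverted)."""
    ratios = [
        category_ratio(stats[c], duration_min, hero_avg_per_min[c])
        for c in CATEGORIES
    ]
    return mean(ratios)

# Hypothetical game: 40 minutes, made-up community averages per minute.
stats = {"farm": 200, "kills": 10, "deaths": 5, "assists": 8}
hero_avg = {"farm": 4.0, "kills": 0.2, "deaths": 0.125, "assists": 0.25}
rating = general_rating(stats, 40, hero_avg)
```

A rating of 1.0 in this sketch means "exactly community average on that hero". Team balance never enters the calculation.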
dumpster_ wrote:It seems to me like there's enough data to do some machine learning, for example to predict the winner based on the players and hero selection.
I think it's pretty likely that you could achieve an accuracy of over 90%. (edit: 90 is too ambitious I guess, but 75% seems reasonable)
If one is successful with that, then the next step could be to evaluate how much a given player's presence in a game affects its expected outcome.
If you're really ambitious, you could even build an entire new rating system based on that.
Doesn't even have to be hooked into ENT in any way; it can just exist on its own for entertainment purposes.
I already implemented the system you're describing for another community, a long time ago. It was used both to predict game outcomes and to balance teams, and it completely replaced the ELO system.
The ELO rating system is supposed to be an outcome prediction system. For any ELO difference, it gives the probability of winning.
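For reference, the standard ELO expected-score formula can be sketched as below. The 400-point scale is the conventional default and is an assumption here, as is averaging each team's ELO for a team game; the community in question may have used different constants.

```python
def elo_win_probability(rating_a, rating_b, scale=400.0):
    """Standard ELO expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def team_win_probability(team_a, team_b, scale=400.0):
    """Assumption: a team is represented by the mean ELO of its players."""
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    return elo_win_probability(avg_a, avg_b, scale)

# Equal ratings give 0.5; a 400-point lead gives 10/11.
p_even = elo_win_probability(1000, 1000)
p_lead = elo_win_probability(1400, 1000)
```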
In the community where the new rating system was implemented, I ran an extensive simulation of the pre-existing ELO system there, to compare how well ELO had actually been predicting the outcome of DotA games. The simulation was performed against 14,000 played games, and the prediction was compared to the actual outcome. Players had 2 separate ELOs there, a "Solo ELO" when playing alone and a "Team ELO" when playing in pre-made teams. The results were pretty disappointing:

[Attachment: elo_winrate_en.png]
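A comparison like this can be sketched as a calibration check: group games by predicted win probability and compare each group's prediction with the win rate actually observed. The code below is only an illustration of that idea, not the original simulation; the function name, the bucket count, and the input format are assumptions.

```python
from collections import defaultdict

def calibration_table(games, n_buckets=10):
    """games: iterable of (predicted_probability, won) pairs for one side.
    Returns {bucket_lower_bound: (game_count, observed_win_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # bucket index -> [games, wins]
    for predicted, won in games:
        # Clamp so that a prediction of exactly 1.0 lands in the top bucket.
        idx = min(int(predicted * n_buckets), n_buckets - 1)
        counts[idx][0] += 1
        counts[idx][1] += int(won)
    return {
        idx / n_buckets: (n, wins / n)
        for idx, (n, wins) in sorted(counts.items())
    }

# Tiny made-up sample; a real run would use thousands of games.
sample = [(0.55, True), (0.58, False), (0.52, True), (0.95, True), (1.0, True)]
table = calibration_table(sample)
```

For a well-calibrated predictor, the observed win rate in each bucket stays close to the predicted probabilities that fall into it.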
The same simulation was then performed for the new Rating System, used as an ELO replacement for outcome prediction, on a sample of 83,654 played games. The results were impressive. Not only did the new system perfectly replace ELO, it actually performed much closer to what ELO was originally designed for, in DotA:

[Attachment: rating_winrate2_en.png]
dumpster_ wrote:A big obstacle towards both of those goals is the prevalence of smurfs, but I think it shouldn't be too hard to accurately determine who's who with some machine learning.
Of course one could trick it if they really want to, e.g. by changing their hero picks, the time they play, and the way they talk.
But why would anyone bother?
It shouldn't be too hard to reasonably accurately determine who's who after even a dozen games.
After >50 games I think you could achieve close to 100% accuracy.
You could also ask the admins to share the aliases with you in private to make training the model so much easier.
The new Rating system was actually also used for balancing, where judging a player's skills was critical for correct balance.
While the ELO system needs at least 15 games to somewhat describe a new player's skill accurately, the Rating system was able to give a fair estimation with just 1 game! Thus smurfs were quickly balanced correctly. No machine learning or identification attempts were needed.
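As an illustration of rating-based balancing, here is a brute-force sketch that tries every split and keeps the one with the smallest difference in summed rating. This is an assumed approach for the sake of example, not necessarily how the balancer actually worked; the function name and sample ratings are hypothetical.

```python
from itertools import combinations

def balance_teams(ratings, team_size=5):
    """ratings: {player_name: rating}. Returns (team_a, team_b, difference),
    choosing the split whose summed ratings are closest."""
    names = list(ratings)
    total = sum(ratings.values())
    best = None
    for team_a in combinations(names, team_size):
        sum_a = sum(ratings[n] for n in team_a)
        diff = abs(total - 2 * sum_a)  # |sum_a - sum_b|
        if best is None or diff < best[0]:
            best = (diff, team_a)
    diff, team_a = best
    team_b = tuple(n for n in names if n not in team_a)
    return team_a, team_b, diff

# Made-up 2v2 example; a 5v5 lobby only has 252 splits to check.
lobby = {"a": 1.5, "b": 1.0, "c": 1.0, "d": 0.5}
team_a, team_b, diff = balance_teams(lobby, team_size=2)
```

With a rating that is usable after a single game, a smurf enters this kind of search with a realistic number almost immediately.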
dumpster_ wrote:I've been thinking about this for years but the chance I actually do any of it is close to nil.
Do you have any grand plans like this?
It has already been done, widely tested in thousands of games afterwards, and was pretty successful.
However, hooking it up on ENT is an entirely different story.
There is a bug on ENT where some heroes are not recorded. None of the official pages display what hero you were playing:
https://entgaming.net/findstats.php?id=12257041
http://storage.entgaming.net/replay/vie ... 257041.w3g
I can't do much about the source data if it is incomplete.
The rating compares stats to a specific hero, so if there's no hero there can't be a proper rating. It is all zeroed out.