
Using Machine Learning to Analyze Dota Heroes and Predict Matches

Hey everyone! I've been working on an ML project on and off for the past few months, where I use ML (neural networks, specifically) to analyze Dota heroes (learn hero embeddings) and predict the result of matches. If you prefer to consume the following content in video form, feel free to check out this YouTube video: https://www.youtube.com/watch?v=OI1rYJPQ_-U

Dataset

I used OpenDota (https://www.opendota.com/) to collect data. I collected ranked games with average MMR > 4k on patch 7.30.

  • Train Set: ~15,000 games
  • Val Set: ~4,000 games
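For anyone who wants to build a similar dataset, the three filters above could be applied like this. This is only a sketch: the field names `is_ranked`, `avg_mmr`, and `patch` are hypothetical placeholders and not OpenDota's actual response schema, and the sample records are made up purely to exercise the filter.

```python
MIN_AVG_MMR = 4000   # "average MMR > 4k" from the post
PATCH = "7.30"       # patch named in the post

def keep_match(match):
    """Return True if a match record passes the three filters from the post.

    The keys used here are hypothetical; map them to whatever your data source provides.
    """
    avg_mmr = match.get("avg_mmr") or 0   # treat a missing MMR as failing the filter
    return (
        match.get("is_ranked") is True
        and avg_mmr > MIN_AVG_MMR
        and match.get("patch") == PATCH
    )

# Made-up records, only to show the filter in action.
sample_matches = [
    {"match_id": 1, "is_ranked": True, "avg_mmr": 4500, "patch": "7.30"},
    {"match_id": 2, "is_ranked": False, "avg_mmr": 5200, "patch": "7.30"},
    {"match_id": 3, "is_ranked": True, "avg_mmr": 3800, "patch": "7.30"},
    {"match_id": 4, "is_ranked": True, "avg_mmr": 4700, "patch": "7.29"},
]

kept = [m for m in sample_matches if keep_match(m)]
print([m["match_id"] for m in kept])
```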

Model Details

This is the Dota subreddit, not the ML one, so I'll keep this relatively brief. For more details, feel free to check out the video linked at the top.

  • Model Architecture: Embedding Layer -> Attention Layer -> MLP Output Layers
  • Train Task: Given 10 heroes as input, predict:
    • Result of game (classification – radiant long win, radiant medium win, radiant fast win, dire long win, dire medium win, dire fast win)
    • Per Hero:
      • Gold/XP per min
      • Ratio of team's last hits
      • Hero Damage per min
      • Ratio of damage that is right click
      • Damage Taken per min
      • Stun (caused) duration
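To make the Embedding -> Attention -> output pipeline concrete, here is a rough, untrained sketch in plain Python. It is not the actual model from this project. The embedding size, the single attention head, the team embedding, the "first five are radiant" ordering, and the single linear layers standing in for the MLP heads are all assumptions made for illustration.

```python
import math
import random

NUM_HEROES = 150   # assumption: any size larger than the highest hero ID
EMBED_DIM = 8      # assumption: the post does not give the embedding size
NUM_CLASSES = 6    # radiant/dire x fast/medium/long
NUM_STATS = 7      # per-hero targets listed above, GPM and XPM counted separately

rng = random.Random(0)  # fixed seed so the untrained sketch is reproducible

def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(xs):
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

hero_embedding = rand_matrix(NUM_HEROES, EMBED_DIM)
team_embedding = rand_matrix(2, EMBED_DIM)       # assumption: row 0 = radiant, row 1 = dire
w_q, w_k, w_v = (rand_matrix(EMBED_DIM, EMBED_DIM) for _ in range(3))
w_result = rand_matrix(EMBED_DIM, NUM_CLASSES)   # linear stand-in for the result MLP
w_stats = rand_matrix(EMBED_DIM, NUM_STATS)      # linear stand-in for the per-hero MLP

def forward(hero_ids):
    """hero_ids: 10 ints, first five radiant, last five dire (assumed ordering)."""
    # Embedding layer: hero vector plus a team vector so the model can tell sides apart.
    x = [
        [h + t for h, t in zip(hero_embedding[hero], team_embedding[i // 5])]
        for i, hero in enumerate(hero_ids)
    ]
    # Single-head self-attention: every hero attends to all ten heroes.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(EMBED_DIM) for s in row]) for row in scores]
    context = matmul(weights, v)
    # Output heads: one stat vector per hero, one result distribution per game.
    per_hero_stats = matmul(context, w_stats)
    pooled = [sum(col) / len(context) for col in zip(*context)]
    result_probs = softmax(matmul([pooled], w_result)[0])
    return result_probs, per_hero_stats

probs, stats = forward(list(range(1, 11)))
print(len(probs), len(stats), len(stats[0]))
```

With random weights the outputs are meaningless; the point is only the shape of the computation: ten hero IDs in, six result probabilities and a 10 x 7 table of per-hero stats out.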

Model Per-Hero Predictions

Here are the (model predictions / truths) for a test game – TI 2021 Grand Finals Game 1. I only included the radiant hero outputs (Team Spirit) to avoid clutter.

| Hero | Last Hit Team Ratio | Hero Damage Per Min | Damage Taken Per Min | Right Click Damage Ratio | Gold Per Min | XP Per Min |
| Elder Titan | 0.14 / 0.02 | 590 / 236 | 594 / 132 | 0.34 / 0.30 | 342 / 270 | 460 / 328 |
| Naga Siren | 0.28 / 0.49 | 801 / 653 | 808 / 297 | 0.42 / 0.76 | 470 / 228 | 600 / 679 |
| Void Spirit | 0.25 / 0.24 | 791 / 625 | 798 / 508 | 0.34 / 0.18 | 470 / 495 | 610 / 667 |
| Lion | 0.11 / 0.03 | 594 / 171 | 598 / 204 | 0.30 / 0.45 | 348 / 214 | 472 / 353 |
| Tidehunter | 0.22 / 0.22 | 663 / 296 | 668 / 361 | 0.37 / 0.06 | 391 / 449 | 509 / 504 |

  • The ordering of the magnitudes of the predictions largely makes sense, given the positions the heroes play.
  • The predictions tend to sit closer to the average than the true values, which makes sense because the model only sees the draft and is not conditioned on the result of the game when making them.
  • The damage values seem high overall – a place for improvement.
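A quick way to put a number on "the damage values seem high" is to average prediction minus truth per stat over the five heroes in the table. The values below are copied from that table; the short stat keys are just labels chosen for this snippet.

```python
# (prediction, truth) pairs copied from the table above, radiant heroes only.
rows = {
    "Elder Titan": {"hero_dmg": (590, 236), "dmg_taken": (594, 132), "gpm": (342, 270), "xpm": (460, 328)},
    "Naga Siren":  {"hero_dmg": (801, 653), "dmg_taken": (808, 297), "gpm": (470, 228), "xpm": (600, 679)},
    "Void Spirit": {"hero_dmg": (791, 625), "dmg_taken": (798, 508), "gpm": (470, 495), "xpm": (610, 667)},
    "Lion":        {"hero_dmg": (594, 171), "dmg_taken": (598, 204), "gpm": (348, 214), "xpm": (472, 353)},
    "Tidehunter":  {"hero_dmg": (663, 296), "dmg_taken": (668, 361), "gpm": (391, 449), "xpm": (509, 504)},
}

def mean_signed_error(stat):
    """Average of (prediction - truth); positive means the model overshoots."""
    diffs = [pred - truth for pred, truth in (row[stat] for row in rows.values())]
    return sum(diffs) / len(diffs)

for stat in ("hero_dmg", "dmg_taken", "gpm", "xpm"):
    print(f"{stat}: {mean_signed_error(stat):+.1f}")
```

On this one game, hero damage per min is overshot by about 292 on average and damage taken per min by about 393, while GPM (about +73) and XPM (about +24) are much closer. Five heroes from a single match is a tiny sample, so treat this as a sanity check rather than an evaluation.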

Model Result Predictions

Here are the model result predictions for Games 3, 4, and 5 of TI Grand Finals in a mixed up order. I'll actually leave the answer out here for now, and if you're interested, you can make a guess in the comments and explain why (so which of A, B, and C corresponds to each game). I'll edit or comment with the answer later.

| Game | Radiant Heroes | Dire Heroes |
| Game 3 | Disruptor, PA, Invoker, Dark Willow, Magnus | Spectre, Tinker, Bloodseeker, Rubick, Undying |
| Game 4 | Luna, Kunkka, Magnus, Undying, Bane | Winter Wyvern, Spectre, TA, Lion, Axe |
| Game 5 | Winter Wyvern, TB, Ember, Bane, Magnus | Tiny, Kunkka, Lycan, Skywrath, Ench |

| | Radiant Win <35 min | Radiant Win 35-50 min | Radiant Win >50 min | Dire Win <35 min | Dire Win 35-50 min | Dire Win >50 min |
| Probs A | 0.24 | 0.24 | 0.08 | 0.18 | 0.17 | 0.10 |
| Probs B | 0.15 | 0.31 | 0.12 | 0.18 | 0.14 | 0.10 |
| Probs C | 0.22 | 0.24 | 0.08 | 0.22 | 0.19 | 0.06 |
  • Might be confirmation bias, but there seem to be explainable factors for the matching.
  • If you're interested in hearing discussion on this from 2k-7k players, there's a section in the video linked at the top – I guess the answers are also there if you can't wait to find out lol
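If you want something easier to reason about while guessing, the six class probabilities can be collapsed into two simpler numbers: the overall chance radiant wins, and the chance the game ends before 35 minutes. The numbers are copied from the table above (they are rounded, so a row may not sum to exactly 1); the function names are just for this snippet.

```python
LABELS = [
    "Radiant <35", "Radiant 35-50", "Radiant >50",
    "Dire <35", "Dire 35-50", "Dire >50",
]

# Rows copied from the probability table above.
PROBS = {
    "A": [0.24, 0.24, 0.08, 0.18, 0.17, 0.10],
    "B": [0.15, 0.31, 0.12, 0.18, 0.14, 0.10],
    "C": [0.22, 0.24, 0.08, 0.22, 0.19, 0.06],
}

def radiant_win_prob(probs):
    """Sum of the three radiant classes."""
    return round(sum(probs[:3]), 2)

def short_game_prob(probs):
    """Chance the game ends in under 35 minutes, whichever side wins."""
    return round(probs[0] + probs[3], 2)

for name, probs in PROBS.items():
    print(name, "radiant win:", radiant_win_prob(probs), "under 35 min:", short_game_prob(probs))
```

This gives radiant win chances of 0.56, 0.58, and 0.54 for A, B, and C, and under-35-minute chances of 0.42, 0.33, and 0.44. So all three drafts are rated close to even, and B stands out mainly as the one predicted to run longer.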

Learned Embeddings

Hero Value Embeddings

Here is a 2D visualization of the embeddings the model learned. If you are unfamiliar with embeddings, they are learned representations of the heroes, with the goal that similar heroes end up with embeddings that are close to each other. Some things that I immediately noticed:

  • Medusa and Spectre on bottom left – quintessential late game carries
  • Sven and Kunkka towards top right – melee cores with cleave and burst
  • A lot of supports grouped towards bottom and bottom right

If you see something interesting or weird, feel free to leave a comment!
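For readers wondering what "close to each other" means in practice, one common way to compare two embeddings is cosine similarity. The sketch below uses tiny made-up vectors with placeholder names, not the real learned embeddings, and cosine similarity is an assumed choice of metric rather than something stated in this post.

```python
import math

def cosine_similarity(a, b):
    """1.0 means same direction, 0.0 means unrelated, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(name, embeddings):
    """Rank every other entry by similarity to `name`, most similar first."""
    target = embeddings[name]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in embeddings.items()
        if other != name
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Placeholder vectors for illustration only; real embeddings come from the trained model.
toy_embeddings = {
    "hero_a": [0.9, 0.1, 0.0],
    "hero_b": [0.8, 0.2, 0.1],
    "hero_c": [-0.1, 0.9, 0.3],
}

print(most_similar("hero_a", toy_embeddings))
```

With real embeddings, running `most_similar` on a hero like Medusa would be one way to check whether the clusters in the 2D plot also hold up in the full embedding space.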

Summary

This project was largely for fun – I definitely think the model was able to learn things about Dota and can spark some interesting discussion, but I wouldn't try to productionize/sell this model in its current state. I do think the potential is there though – with more and richer data, tinkering with the model architecture, and more effort spent tuning and evaluating, it should be possible to create a very high quality model.
