Quantifying performance in Counter-Strike: Global Offensive

I sometimes watch Counter-Strike: Global Offensive (CSGO). I am not a fan of a specific team, but I like to watch good Counter-Strike. It is the combination of strategy and fast-paced gameplay that I find great (most likely the same reason I enjoy watching bullet chess). I have recently been interested in how to quantify and measure the performance of players and teams in CSGO.

In CSGO, a first-person shooter video game, two teams (Terrorists, Ts, and Counter-Terrorists, CTs) compete against each other over a series of maps, each played as a best-of-30 rounds (i.e., the first team to win 16 rounds wins the map). In brief, the maps are virtual worlds (I know maps like Dust2 and Inferno better than my own neighborhood). Each team has five players, and every player starts each round with 100 health points (HP). When a player reaches 0 HP, the player is gone for the rest of the round. The Ts can win a round by either killing all CTs or successfully detonating the bomb. The CTs can win by killing all Ts, defusing the bomb, or letting the round time run out. Before each round begins, the players can buy equipment (such as guns and armor).

Before two teams even compete, they have to decide what maps to play with a pick and ban process. That is, there is a map pool (of seven maps), and the teams pick the maps that they are best at and ban the ones that their opponents are better at. Of course, this selection is anything but random and the teams should consider how to increase their odds of a win.

Petri et al. (2021) provide a study of the map selection process for a best-of-three match (see the figure below for an illustration of the process). In their study, they use data on 8,753 games (165 professional teams playing a total of 3,595 matches) to show how using machine learning to pick maps can increase the expected win probability by ~10 percentage points.

Notice that a good team is able to not only consider what maps they are best at and what maps their opponent is best at, but also to take into account the expectations of what maps will be played. A good team can then surprise the opponent by preparing a map they usually do not play (thereby making it more difficult for the opponent to prepare a strategy against the team, i.e., an anti-strat).

Once the two teams have decided what maps to play, it is all about winning the maps. The most straightforward way to win a map is to eliminate the enemy team. To value a player, we can therefore look at how many kills that specific player got relative to his or her deaths, i.e., the kill-death ratio (KDR).

$$ KDR = \frac{Kills}{Deaths} $$

The more kills a player gets (relative to his or her number of deaths), the better the KDR. The main limitation is that it captures very little of what is actually going on in a round. You can in theory be the most influential player on the team and deal 495 damage in a round (leaving all five opponents on 1 HP) and still have a KDR of 0. One alternative is to consider the average damage per round (ADR).

$$ ADR = \frac{Total\;damage}{Rounds} $$
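To make the contrast between the two measures concrete, here is a minimal Python sketch. The `PlayerStats` class and its field names are made up for illustration, and the handling of zero deaths (dividing by 1 instead) is an assumption, since the ratio is undefined in that case.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    """Hypothetical totals for one player (field names are illustrative)."""
    kills: int
    deaths: int
    total_damage: int
    rounds: int


def kdr(stats: PlayerStats) -> float:
    # Kills divided by deaths. With zero deaths the ratio is undefined;
    # dividing by 1 instead is an assumed convention, not an official rule.
    return stats.kills / max(stats.deaths, 1)


def adr(stats: PlayerStats) -> float:
    # Total damage dealt divided by the number of rounds played.
    if stats.rounds <= 0:
        raise ValueError("rounds must be positive")
    return stats.total_damage / stats.rounds


# One round in which the player leaves all five opponents on 1 HP
# (5 * 99 = 495 damage), gets no kill, and dies.
unlucky = PlayerStats(kills=0, deaths=1, total_damage=495, rounds=1)
print(kdr(unlucky))  # 0.0
print(adr(unlucky))  # 495.0
```

The same round that looks worthless on KDR looks exceptional on ADR, which is why the two measures are usually read together.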

The main limitation is that it is still a very simplistic measure that does not take a series of other influential factors into account. For one, it does not take into account how often you survive or the type of damage you are doing. One metric that takes these factors into account is KAST, which looks at the rounds in which a player got a kill, got an assist, survived, or was traded (a player is traded when a teammate avenges his or her death shortly after he or she is killed). Dividing the number of such rounds by the total number of rounds gives the KAST%.

$$ KAST\% = \frac{Rounds\;with\;a\;kill,\;assist,\;survival,\;or\;trade}{Rounds} $$
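A small Python sketch of how a KAST percentage could be computed from round-level data. It assumes the reading of KAST that I believe HLTV uses, where a round counts once if the player had a kill, had an assist, survived, or was traded, no matter how many of those happened. The `RoundRecord` class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RoundRecord:
    """Hypothetical per-round record for one player (names are illustrative)."""
    kills: int
    assists: int
    survived: bool
    traded: bool  # assumed meaning: the player died and a teammate avenged it


def kast_percent(rounds: list[RoundRecord]) -> float:
    # Share of rounds with at least one of: kill, assist, survival, trade.
    if not rounds:
        raise ValueError("need at least one round")
    contributing = sum(
        1 for r in rounds
        if r.kills > 0 or r.assists > 0 or r.survived or r.traded
    )
    return 100.0 * contributing / len(rounds)


rounds = [
    RoundRecord(kills=2, assists=0, survived=True, traded=False),
    RoundRecord(kills=0, assists=0, survived=False, traded=True),
    RoundRecord(kills=0, assists=0, survived=False, traded=False),
    RoundRecord(kills=0, assists=1, survived=False, traded=False),
]
print(kast_percent(rounds))  # 75.0
```

Note that the first round (two kills and a survival) counts exactly as much as the fourth (a single assist).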

The problem with KAST is that it does not distinguish between the different inputs: surviving a round counts just as much as getting a kill. In addition, there are various factors still not being addressed.

In order to provide a more nuanced rating, HLTV, the world’s leading CSGO site, introduced their HLTV Rating 1.0 back in 2010. The rating for an individual player is given by this formula (RWMK stands for Rounds With Multiple Kills):

$$ Rating\;1.0 = \frac{Kill\;rating + 0.7 \times Survival\;rating + RWMK\;rating}{2.7} $$

By construction, the rating cannot go below 0, and a player with exactly average values on all three components gets 1.0; in practice, ratings mostly fall between 0 and 3. The kill rating is calculated as Kills/Rounds/AverageKPR, where AverageKPR is 0.679 (average kills per round). The survival rating is calculated as (Rounds-Deaths)/Rounds/AverageSPR, where AverageSPR is 0.317 (average survived rounds per round). The RWMK rating is calculated as (1K + 4*2K + 9*3K + 16*4K + 25*5K)/Rounds/AverageRMK, where nK is the number of rounds with exactly n kills and AverageRMK is 1.277 (average value calculated from rounds with multiple kills). Accordingly, it is better to kill five players in one round than one player every round for five rounds (you can find more information on the rating here).
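The formula and constants above can be put together in a few lines of Python. This is only a sketch of the published Rating 1.0 formula as described here; the function name and the `multikill_rounds` argument (a mapping from a kill count to the number of rounds with exactly that many kills) are my own choices.

```python
AVERAGE_KPR = 0.679  # average kills per round
AVERAGE_SPR = 0.317  # average survived rounds per round
AVERAGE_RMK = 1.277  # average value from rounds with multiple kills


def rating_1_0(kills, deaths, rounds, multikill_rounds):
    """Sketch of HLTV Rating 1.0.

    multikill_rounds maps k (1 to 5) to the number of rounds in which
    the player got exactly k kills (hypothetical input format).
    """
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    kill_rating = kills / rounds / AVERAGE_KPR
    survival_rating = (rounds - deaths) / rounds / AVERAGE_SPR
    # 1K + 4*2K + 9*3K + 16*4K + 25*5K: each round is weighted by k squared.
    rmk_value = sum(k * k * n for k, n in multikill_rounds.items())
    rmk_rating = rmk_value / rounds / AVERAGE_RMK
    return (kill_rating + 0.7 * survival_rating + rmk_rating) / 2.7


# Same kills (5), deaths (3), and rounds (5); only the distribution differs.
ace = rating_1_0(kills=5, deaths=3, rounds=5, multikill_rounds={5: 1})
spread = rating_1_0(kills=5, deaths=3, rounds=5, multikill_rounds={1: 5})
print(round(ace, 2))     # 2.32
print(round(spread, 2))  # 1.16
```

With everything else held equal, the single five-kill round doubles the rating compared to one kill in each of five rounds, purely because of the squared weights in the RWMK term.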

The HLTV Rating 1.0 is an improvement as it takes into account additional factors and puts more emphasis on kills vis-a-vis survivals (given the fact that the survival rating is multiplied by 0.7). However, it did not take some of the measures introduced above into account, such as KAST and damage. For that reason, HLTV introduced its HLTV Rating 2.0 in 2017. Here is a visual comparison between Rating 1.0 and Rating 2.0:

We are now looking at five separate inputs: KAST rating, kill rating, survival rating, impact rating, and damage rating. The blue and orange colours indicate that these ratings are calculated for a player both on the T side and the CT side of the map (because the expected values differ for the two sides). The impact rating includes various types of impactful actions on a map, such as multi-kills, opening kills, and 1onX wins. The main limitation of the 2.0 rating is that the formula is not publicly available and it is, for that reason, impossible for others to replicate the results.

Another limitation is that there are still important aspects that are not taken into account. For example, some kills are more important in certain contexts, and the in-game economy decisions are also crucial for the performance of a player and a team. Unsurprisingly, scientists have tried to improve the models used to evaluate the performance of players.

Xenopoulos et al. (2020), for example, study the Win Probability Added (WPA) per round. They explore 4,682 matches with 70 million unique in-game events. The results confirm that player outcomes are heavily dependent on the context, as some game situations are harder or easier than others, and equipment value increases the win probability the most (more than remaining HP). That is, simply by looking at the economy of a team and what they are able to buy, we can make good predictions about the likely outcome of a round.
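To illustrate the basic idea of Win Probability Added, here is a toy Python sketch: estimate the win probability of the game state before and after an event, and credit the difference to the player. Everything in it is an assumption for illustration. The `GameState` fields, the logistic form, and the coefficients are invented and are not the model or the estimates from Xenopoulos et al. (2020).

```python
import math
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical snapshot of a round from one team's perspective."""
    equipment_diff: float  # own minus enemy equipment value (in-game dollars)
    hp_diff: float         # own minus enemy total remaining HP
    players_diff: int      # own minus enemy players alive


def win_probability(state: GameState) -> float:
    # Toy logistic model; the coefficients are made up, not fitted to data.
    score = (
        0.0003 * state.equipment_diff
        + 0.004 * state.hp_diff
        + 0.6 * state.players_diff
    )
    return 1.0 / (1.0 + math.exp(-score))


def win_probability_added(before: GameState, after: GameState) -> float:
    # Change in the team's win probability caused by an event.
    return win_probability(after) - win_probability(before)


even = GameState(equipment_diff=0.0, hp_diff=0.0, players_diff=0)
after_kill = GameState(equipment_diff=4000.0, hp_diff=100.0, players_diff=1)
print(win_probability(even))  # 0.5
print(win_probability_added(even, after_kill) > 0)  # True
```

In a sketch like this, the same kill is worth more or less depending on the state it happens in, which is the context dependence the study points to.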

Xenopoulos et al. (2021) further look at how teams allocate their in-game dollars on equipment and the different strategies teams can use in different situations. They estimate a game-level win probability model with a measure of ‘Optimal Spending Error’ in order to rank teams by how much their actual spending decisions deviate from the optimal decisions.

There is a lot of data that can be explored in CSGO, including spatiotemporal data, exploring where different players are on the map and how that captures the performance of individual players and teams. However, if you are familiar with the above-mentioned metrics, you should be able to get a sense of why some players are better ranked than others in CSGO.