Famine's Domain

Developer / Street Fighter / Beast

Analyzing Video Game Data

Getting a chance to analyze some video game data is a good opportunity to use your skill set on a domain that is fun. That’s what I got to do with a side project I have been working on called “Devil’s Silence”.

Come play with us! (dsmud.com port 5000)

Devil’s Silence is an online text-based video game of the kind known as a Multi-User Dungeon (MUD). If you have ever played a Massively Multiplayer Online Game (MMOG), MUDs are the predecessors that made MMOGs what they are today. They were free, open-source, text-based games, commonly developed as role-playing games (RPGs) for nerds who needed their Dungeons & Dragons fix whenever they wanted.

Video Game Data

The game I’m going to analyze is a well-established game that has been online since the ’90s. I actually played this game when I was around 16 years old. Today, I’m 33 years old and helping as the programmer. So, you could say I have a long history here.

The data itself is not easily accessible. It’s stored in flat files, has very little standardization around its structure, is broken up into many segments that feed the game engine and will likely be a little dirty. We will have to leverage our skill set and tools to extract, transform and clean the data to fit our needs.

Data Methodology

The methodology is simple. We will answer these key high-level questions with our data analysis:

  1. What will we take away?
  2. What are our recommendations?
  3. What is the impact?

We will answer those questions by defining a key set of business questions to ask of the data and then answering them the best way we know how: identifying, extracting, analyzing and reporting on the data.

Data Questions

I’m about to embark on some game balancing of both classes and races in Devil’s Silence. The balance is focused on the player-versus-player (PvP) system for the game. It’s not as fun for the players right now because the classes, races and even the item system have gone astray. Therefore, my questions will center on the classes and races first, maybe some of the code data second and some of the PvP results last.

General Questions

I always start out my analysis with two types of questions from my main methodology: general and deep insight. General questions are those that help me get a feel for the data, and deep insight questions are those that help me drive actionable and meaningful insights.

  • What is the most popular and least popular class in the game?
  • What is the most popular and least popular race in the game?
  • What is the most popular and least popular race and class combo in the game?

Deep Insight Questions

For my deep insight questions, I generally start out with one, but towards the end of the analysis, I end up with a few more based on my discoveries. It’s good not to limit yourself right off the bat because who knows what you may find when you start down this path of data discovery, right?

  • What is truly driving popularity in our classes/races and why?

General Overview

To answer these questions, I’m going straight to the player files of the game. They have structure, but do not follow a standard across all code systems. I will need to develop a script that rips through every player file (one file per character), extracts the data I need to answer my questions and spits out the results to a single file that I can analyze in another tool.

Go, Go, Python!

I took a brute-force approach here. Sue me.

This did the trick for reading a player file that just had one attribute per line with non-standard terminators and formatting. The script went from top to bottom in each file while applying a very rough rules system to catch whatever it could. When data was found, the script appended it to an array and then slammed it into Pandas. Easy peasy.
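The original script is not reproduced here, so the following is only a minimal sketch of that brute-force idea. The file layout (one `key value` attribute per line, with stray terminators such as `~` or `;`), the attribute names `name`, `class` and `race`, and the helper names are all assumptions made for illustration, not the game's actual format. It also writes the CSV directly with the standard library rather than going through Pandas.

```python
import csv
import re
from pathlib import Path

# Hypothetical attribute names; the real player files use their own keys.
WANTED = ("name", "class", "race")

# Rough rule: a key, a separator (whitespace, ':' or '='), then a value.
LINE_RULE = re.compile(r"^\s*(\w+)\s*[:=\s]\s*(.+?)\s*$")


def parse_player(text):
    """Walk one player file top to bottom and keep the wanted attributes."""
    record = {}
    for line in text.splitlines():
        match = LINE_RULE.match(line)
        if not match:
            continue  # not an attribute line, skip it
        key = match.group(1).lower()
        # Strip the terminators and quotes this sketch assumes may appear.
        value = match.group(2).strip("~;\"' ")
        if key in WANTED and key not in record:
            record[key] = value  # first occurrence wins
    return record


def extract_players(player_dir, out_csv):
    """Parse every file in player_dir and write one CSV row per character."""
    rows = []
    for path in sorted(Path(player_dir).iterdir()):
        if path.is_file():
            rows.append(parse_player(path.read_text(errors="replace")))
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=WANTED, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call such as `extract_players("players/", "players.csv")` (both paths hypothetical) would then produce the single file to analyze. Attributes missing from a player file are written as empty cells, so dirty records remain visible in the output instead of being dropped.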

Player Class Overview

Okies, let’s jump into IPython, or whatever they call it these days. Forgive me for not having an IPython-enabled site; I hope to add this soon to make things easier.

At this stage, we’re loading some basic modules we may or may not use later. For now, we just want to get our data loaded from our CSV file into a Pandas data frame for analysis. The first task is to take a quick peek at how many players are playing which classes in our game. We can visualize that really easily with Seaborn.
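As a rough sketch of that step (not the exact notebook code), the counting and plotting might look like the following. The column name `class`, the file name `players.csv` and the function names are assumptions, and the three rows are made up purely to show the call.

```python
import pandas as pd


def class_counts(players):
    """Count characters per class, most popular first."""
    return players["class"].value_counts()


def plot_class_counts(counts):
    """Draw the counts as a bar chart (needs seaborn and matplotlib)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Turn the Series into a two-column frame that seaborn can plot by name.
    frame = counts.rename_axis("class").reset_index(name="players")
    ax = sns.barplot(data=frame, x="class", y="players")
    plt.xticks(rotation=45, ha="right")
    plt.tight_layout()
    return ax


# Made-up rows standing in for pd.read_csv("players.csv").
players = pd.DataFrame({"class": ["Morph", "Morph", "Ninja"]})
counts = class_counts(players)
```

With the real export, `players` would come from `pd.read_csv(...)` instead of the toy frame, and `plot_class_counts(counts)` would draw the chart.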

Top Player Classes (Tier 2)

Devil's Silence Total Players By Class

It seems we’ve spotted our first issue with the data. Wizard, Mage, Shaman and Gladiator are all classes from other tiers. They used to be tier 2 and tier 1, but they were switched just recently, and we know the tiers on the player files will not adjust until those players log into the game. Nonetheless, it doesn’t change the fact of who they are. We can just overlook them for now.
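If overlooking them by eye gets tedious, one option is to filter those four classes out before counting. This is only a sketch: the column name `class` and the helper name are assumptions, and the list contains just the classes named above.

```python
import pandas as pd

# The classes named above as belonging to other tiers.
OTHER_TIER_CLASSES = ["Wizard", "Mage", "Shaman", "Gladiator"]


def drop_other_tiers(players, class_column="class"):
    """Return only the rows whose class is not in OTHER_TIER_CLASSES."""
    keep = ~players[class_column].isin(OTHER_TIER_CLASSES)
    return players[keep].reset_index(drop=True)
```

The original frame is left untouched, so the stale records are still available once the tiers on those player files catch up.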

Our top 5 classes are pretty clear. Morphs are dominating the popularity for tier 2. Ninjas and Paladins, on the other hand, are likely our least popular classes in the game. I can make the assumption that Morphs are only popular at this time because they were just added and players are experimenting with them. I can also make the assumption that Ninjas and Paladins are not popular because they are not powerful in PvP.

However, these are only assumptions. They should be treated as such until we can back them up with facts from our data.

Player Race Overview

Same thing as before, but we switch to races and swap our axes to make the chart easier to read.

Top Player Races (Tier 2)

Devil's Silence Top Player Races

Here we see that our dragon races are pretty popular. Sonic, Storm, Swamp and Dracolich are topping the charts. Our legacy races such as Treant, Banshee, Drow and so forth are not doing too hot. With the recent changes, we can make the assumption that the new introduction of dragon races has sparked a lot of interest. We can also assume that due to the insanely powerful stats, racial abilities and so forth with dragons, they will likely stay that way, especially for Sonic and Storm Dragons.

Player Class/Race Overview

Now we take a look at the player class and race numbers by creating a pivot table with our raw player counts. This should give us a better picture of the popular and not-so-popular race and class combinations in the game.
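A minimal sketch of that table, assuming the same hypothetical `race` and `class` columns as the CSV export: it uses `pandas.crosstab`, which builds a count-style pivot table directly and may differ from the call used in the original notebook. The function names are illustrative.

```python
import pandas as pd


def class_race_table(players):
    """Raw player counts with races as rows and classes as columns."""
    return pd.crosstab(players["race"], players["class"])


def plot_class_race_heatmap(table):
    """Draw the table as a heatmap (needs seaborn and matplotlib)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    # A sequential colormap makes the larger counts show up as darker cells.
    ax = sns.heatmap(table, annot=True, fmt="d", cmap="Blues")
    ax.set(xlabel="Class", ylabel="Race")
    plt.tight_layout()
    return ax
```

Combinations that no character plays appear as 0 rather than as missing values, which keeps the heatmap grid complete.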

Top Player Class/Races (Tier 2)

Devil's Silence Top Player Class/Races

Right off the bat, we can see that the higher values here (the darker cells) belong to the dragon races and to the classes that our previous charts showed as popular. Morphs are obviously popular, as in the first chart, but not only that, they are popular in the Sonic and Storm Dragon races too. Something triggered players to pick one of those two races. But why? We cannot answer that question with a simple heatmap.

When we look at our legacy races (anything not Dragon), we can see there is some distribution across different classes. We can see a Cyclopes Wardancer, a Beast Blade, a Dwarf Battlemage and so on. The actual number of players is still low. Why are these players still playing not-so-popular races with popular to semi-popular classes?

Taking a look at two of our not-so-popular classes, Elder and Paladin, we can see that their players are all sporting the new dragon races. This means those players are fairly active and are trying a new race with an old class that is not really popular among the players. Why did they choose those classes? What about them keeps these players playing where so many others have not? Again, we can’t really answer those questions with this heatmap or the other charts. We have to dig deeper to understand.

Conclusion

We have just scratched the surface here. We got our player data extracted, cleaned it up a little bit and dived into some of the numbers around classes and races. We have a decent overview of what is popular and not-so-popular. Yet, we really don’t have a decent understanding of why. We will need to dig deeper into the other features of the dataset to unearth some deeper insights into why these classes and races are popular and not-so-popular.

As we dive into PvP and other aspects of the data, I think we might find a correlation. Maybe we can compare kill-to-death ratios, item values and more with our races and classes to see if they explain why. Maybe we can dive into the code and extract raw skills, spells and other statistics to understand what each race and class has to offer, and then compare that to other features that may form a stronger link to why players are making these choices.

We will do all of that in our next installment of Analyzing Video Game Data.

Glen 'Famine' Swan
Glen is an 8-year video game industry professional with over 10 AAA accredited titles under his belt. Currently, he is a practicing developer and data scientist within the digital marketing industry.