Dealing With Evil Gamers
This was originally written shortly after leaving Catapult Entertainment, Inc., the makers of the XBAND Video Game Modem and Network. In short, the device allowed you to play certain console games over the phone line. Catapult sold devices for the Sega Genesis and Super Nintendo Entertainment System (SNES).
All games were 1-on-1, played directly against each other rather than through a central server. The games of the day (e.g. Mortal Kombat, NBA Jam) were "twitch" games, requiring fast reflexes, and connecting over anything but a direct phone call would have introduced unacceptable latencies.
This document describes the lessons learned while trying to make XBAND a happy place for gamers. The most significant barrier to gamer happiness was the ability of the losing player to pull out the phone cord or reset their console, thus invalidating the game and effectively "robbing" the winning player of victory. This turned out to be a very difficult problem to solve.
Most of the online game communities that have formed in the years since have had to confront similar problems, and have met with varying degrees of success, often after costly periods of player-induced chaos.
This paper had been languishing on my hard drive for about five years before I finally decided to post it in 2002.
Handling Unsportsmanlike Conduct in Online Gaming
Copyright (c) 1996-97 by Andy McFadden. All rights reserved.
Last Revised September 1, 1997
All online game services face a common problem, but few have taken steps to deal with it. The problem addressed in this paper involves users who will reset their game console or PC, or disrupt modem or Internet communications, in an effort to deny victory to an opponent or guarantee victory for themselves. Most online game companies have underestimated the significance of this issue and the difficulty of dealing with it effectively.
This paper explains the problem, analyzes approaches that have been tried in the past, and presents a model for handling troublesome customers. Much of what is written applies to 1-on-1 games rather than multi-player games, but most of the ideas apply to both.
I refer to three different kinds of customers:
- Evil - these are customers who deliberately disrupt the game for their own benefit or their opponents' detriment.
- Victim - customers on the other side of the connection from Evil customers.
- Unfortunate - people whose computers tend to lock up on their own, have relatives who pick up phone extensions at inopportune moments, suffer from noisy phone lines, or have other ailments largely beyond their control.
A single person can be any or all of these three, but not all at the same time. It is possible, but unlikely, for two simultaneous Evil acts to occur if the game environment is such that a line drop would benefit both parties, perhaps by awarding a win to the guy who was ahead, but forgoing a loss to the guy who was behind. In this case, both parties might drop the line, making both Evil and neither a Victim.
III. Why is this a problem?
At first glance this seems an unimportant issue. What difference does it make if the game doesn't play out? Plenty.
A. Catapult Entertainment and XBAND
Catapult Entertainment developed the XBAND Video Game Modem and Network. The user plugged an inexpensive device into the cartridge slot of a Sega Genesis or Super Nintendo Entertainment System (SNES), and plugged the cartridge in on top of it. The user plugged a telephone line into the side of the XBAND unit, which would automatically dial into the online service. The games were played through direct connections between two XBAND units. The service provided e-mail, daily news, win/loss statistic gathering and reporting, and firmware upgrades, but relaying game traffic through it would have added latency unacceptable for the "twitch" gaming popular on game consoles. Most games were 1-on-1, but some games allowed multiple players per side, so 2-on-1 or 2-on-2 was possible.
During beta testing of the product it became clear that game disruptions were going to be a problem. Typically the person on the losing side would hit the reset button on the game console to kill the game. The game would freeze, and after a minute or so the Victim's game would realize that the connection was broken and restart. The next time the XBAND units connected to the service, each would post game results that indicated - to the best of the box's ability - what went wrong. The service was able to conduct a limited amount of detective work based on what it found.
There were three ways of disrupting a game: calling yourself (if you have call waiting), hitting reset, and pulling the phone cord out. Call waiting was supported by the box, so the service was able to identify users who called themselves and told the box, "I'm done with the call, but I don't want to continue the game." Unfortunately, when the game restarted after a call was finished, it either started completely over, or picked up from the beginning of the last period (e.g. the end of the last quarter for NBA Jam). There were reports of users who would call themselves repeatedly, and offer to continue the game, until they either outscored their opponent during the quarter or their opponent gave up in frustration and declined to continue. The user who declined to continue was given a loss if they were behind at the time they gave up, or nothing if they were ahead.
Hitting reset was the most popular method in the early days of the XBAND Network. The XBAND units had hardware that enabled them to patch out pieces of the cartridges used on the game platforms. This is how multi-player game support was added; the games themselves weren't aware that a modem was involved. Each game had a separate "game patch" that was sent down by the service. Early versions of the game patches tended to be unstable, and would lock up the console, so it was plausible that a user might hit reset after the game locked up. Later game patches were more stable, and some attempts were made to identify lockups, but there wasn't enough support on the box to be 100% certain for every game (a clock chip would've worked wonders here). For the games that were stable, though, an automated mechanism was put in place that would penalize people who hit reset or powered the box off, and award a win to their opponent, depending on the scores that the box posted to the service.
The effectiveness of the reset-detection mechanism led directly to the third method of game disruption, disrupting the phone line. The reset detector sent mail to each user when they next logged in, informing them of what had happened and how their statistics had been affected. The first and most obvious response to this change was a flood of customer service calls with excuses about cats hitting power buttons and highly localized power outages, along with petulant demands for win/loss stat restitution. The second response was a shift from hitting reset to disrupting the modem, either by pulling the cord out of the unit or the wall, or by picking up an extension. This was much harder to detect, because the disruption looked the same from both sides. An attempt was made to measure the power level on the telephone line, based on the average level observed at the end of a server connection (a time when we were almost guaranteed that the line was plugged in), but it was never reliable enough to use.
For all the best efforts of the Catapult engineers, the three biggest complaints about XBAND were that there weren't enough users (the service peaked at around 15,000), not enough games were supported (adding more would've diluted the sparse user base even further), and "XBAND doesn't care about resetters and cord-pullers". The last was usually spoken the loudest.
One specific example of a Victim should illustrate the problem. The Victim had been playing a football game called "Madden '95" against an opponent of similar but lesser skill for about 40 minutes. (The twitch combat games were usually over in 5-10 minutes, but some of the sports simulations could go for 45.) They were near the end of the game, and the Victim was ahead in points and in control. He had won. With a few seconds remaining, the game froze, and when the game patch's communication routines timed out, his box restarted. On connecting to the service, he learned that the game was a wash; no points would be awarded, and no credits would be deducted. (Later versions of the service would have awarded a win to the Victim because he was ahead, but would not have awarded a loss to the Evil user. Customer service costs became so high that Victims were awarded wins to keep them quiet, and Evil users were left alone to keep them quiet. The lack of apparent enforcement didn't curb the rate of game disruption, unfortunately.)
From a rational point of view, the user got all the benefit of a successful game except for the last few seconds, and didn't even have it charged against his account. But from the Victim's point of view, the game was worthless. A complete waste of time. XBAND was a rip-off, playing games with the service was pointless, he would've been better off playing against the computer. His annoyed message to customer service was more of an emotional venting than a statement of clear fact, but for some period of time the user believed these things, and came away with a profoundly negative impression of other players on XBAND and the service's ability to police itself. These opinions were often shared with other people - many of them potential customers - on Usenet newsgroups and other forums.
In a highly competitive environment, the goal is to beat the other player. By disrupting the game, the Evil player denied a complete victory to the Victim. Even though the game had been won, the feat would go unrecorded, and thus total victory had been denied. A simple message to the player that said, "we all know who won that game, congratulations, he didn't get away with it" would have left the game network in a much more positive light.
Playing (and being Evil) in Netrek is considerably different from the situations seen in XBAND. With 8 players on a side who can come and go at any time, being evil was more a matter of avoiding unpleasant situations than screwing over a single individual. (The only exception I can think of was "chunging", named after Greg Chung. A player being pursued would head to the nearest wall, and fire a point-blank-range plasma torp, killing themselves and denying victory to the opponents. While it would have been fairly easy to fix this, the entertainment value was simply too great.)
A common case would be a starbase that sees an "ogg wave" forming (consult the official jargon dictionary for the definition of the word "ogg"). If they were too damaged to reasonably expect to handle the attackers, they would kill their Netrek client and launch a new one. The "ghost" player would time out after a minute, and the "ghostbuster" would reset the slot. When they came back in, they could use the excuse that their network connection had frozen up on them.
Statistics in Netrek were updated constantly, and you didn't get a break for being ghostbusted, so player stats were never a cause for being Evil.
There are three basic motivations for interrupting a game: emotion, frustration, and statistics.
Most people don't like to lose. The kind of people who participate in highly competitive online gaming services like it even less. After losing a tough match against a skillful opponent, a sportsmanlike player will congratulate the victor, exchange compliments, and maybe solicit some playing tips. Many people will become so caught up in the fever of the moment that losing is a serious blow to their ego. In cases where it's possible to show "personality" during a game, it's possible to develop a genuine dislike for the individual on the other side of the modem. Some playing styles, such as using a simple but effective ("cheap") technique over and over again during a fighting game, can be extremely annoying. Whatever the cause, in a rush of emotion the Evil player takes the only action they know of that will deny complete victory to their opponent.
In some cases they don't care what the consequences are. They were going to lose anyway, and being charged an extra connection credit on XBAND became meaningless when Catapult switched to an all-you-can-eat plan. (The idea of always matching Evil players to other Evil players was suggested, but there weren't enough customers to make it work.)
It's also not uncommon for players off to a bad start to reset out of the game, then claim that the game froze. This is a clumsy (and often ineffective) way of saving face.
An easily overlooked issue is what happens when a very good player and a very poor one get matched up. The XBAND network tried to match players of equal skill, but in areas with few players or games with a small following, the set of possible opponents was small.
Some players - including a magazine reviewer who was profoundly against such things - would hit reset in a game of moderate or long duration if it looked like they were going to spend the next 15 to 30 minutes getting slaughtered. People who don't like to lose also don't like to be stomped on, taunted, and generally destroyed for an extended period. On the other side of the phone cord, skillful players quickly tire of trashing the unskilled or inexperienced. Without some way to signal surrender, the only way out is to be Evil.
Many game services will track statistics such as the number of games played, win/loss, average points scored in a game, and so on. Some will even feature player rankings or "top 10" lists based on these stats. While these provide rewards for winning, recognition of persistent and successful game play, and incentives to play more often on pay-per-play networks, they also provide a rational justification for disrupting the game. With some players, the stats become more important than the gaming experience.
Broad statistics can be as troublesome as focused statistics like wins and losses. In Netrek, which is more of a multi-player online sport than a game, some players would only do things that would improve their statistics, even if it hurt their team more than it helped. The kind of tuning and game balancing needed to avert this behavior would be an appropriate subject for a second paper.
V. Example Solutions
A complete solution requires the online service to reduce or remove the incentives for disrupting the game, and provide adequate retribution mechanisms for situations where the incentives are unavoidable.
Here are some things that Catapult tried. These were added over the course of several months.
- Wins and losses were displayed, but the weekly "top 10" was computed based only on total wins. If you were denied a win you suffered, but losses didn't count against you.
- Automatic detection and handling of games disrupted by call waiting, or by hitting reset or losing power. The players were notified of the disposition of their game by the service after they reconnected.
- Automatic detection and handling of phone disconnects. Since some failures due to line noise are inevitable, the unit that initiated the call in the first place would attempt to redial the other player to re-establish the connection. If successful, the game would pick up where it left off. This helped in games against Unfortunate opponents.
- Win/loss disposition determined by time remaining in game and relative score. Code was added to the service that looked at how far along a game was and how far apart the scores were. The closer the game was to the end, and the more disparate the scores, the more likely it was that the leader would have won anyway, and so should be awarded the victory. The loser wasn't given a loss or notified that the other player had won, to avoid customer service complaints, and to avoid providing a reference point by which players could determine the limits of the system.
- Attempted detection of local phone line disruption. Pulling out the phone cord or picking up an extension on the same line would both cause the power level on the line to drop. Because of differences in phone systems and load from passive devices on the same line, the "normal" power level had to be computed for each individual user, and recorded in the service. This turned out to be somewhat unreliable, however, partly because the information from the modem chip wasn't very consistent.
- During Catapult's first tournament, the results from all failed games were carefully scrutinized by the game patch authors. This was an arduous task with mixed results: even though players were notified in advance that they would be ejected from the tournament if they performed Evil acts, many preferred to bail out rather than accept a loss. (The tournament wasn't single elimination, so there was no reason to bail at the first sign of trouble.)
- The box that was calling the opponent would display "Dialing Thrasher..." while dialing. If "Thrasher" was particularly good, and the player dialing didn't want to lose to him, the Evil user would disrupt the dialing process. This was rather unpleasant, because it made our dialing stats look bad, reduced our successful match rate, and left Thrasher waiting for an opponent for ten minutes. (The person waiting for a call had no way of knowing that the attempt to dial them had been aborted; the service would have had to dial Thrasher via an outbound modem bank when the Evil user reached the service again.) To fix this, the box was changed to display "Dialing opponent" instead.
Even with all of this, and a customer service department that refused to adjust player stats, calls and e-mail complaints continued. Some users had been reset on and wanted the stats adjusted, others claimed they had been unjustly accused of hitting reset and demanded that the loss be removed from their stats. In the end Catapult stopped penalizing the Evil users, which reduced the complaints but did nothing to discourage Evil users.
Mpath Interactive, the company that Catapult merged into after going bankrupt, removed some of the problems but not all. They decided to not make win/loss statistics available, because, according to one report, "they made the system too competitive." While some would argue that competition is the heart of gaming, they did remove one of the significant reasons for game disruption.
Sega's Heat service is based on Mpath, but they put the stats back in and even sponsored tournaments.
Not even multi-user dungeons (MUDs) are free of problems with Evil users. A game on CompuServe, called Island of Kesmai, was somewhat popular in the late 1980s. They had a rather severe way of dealing with modem carrier loss: all of your player's possessions ended up on the ground immediately. If you weren't able to connect back in quickly enough, other players could steal your things. The goal was to prevent players from dropping out every time the going got rough, but for people with shaky phone lines the results were unpleasant. This was a case where being Evil didn't harm other users, except that by being Evil you could have potentially advanced faster than others by being reckless. There are better ways of dealing with such situations, but that's outside the scope of this paper.
Self-ghostbusting never became a huge problem in Netrek, so we never actively searched for a solution. Weighted-average ping times were readily available for all opponents, though, so it was pretty easy to tell which players were likely to have network seizures.
VI. Solutions and Options
Put simply, the goal is to penalize every Evil customer and give retribution to every Victim, while not penalizing any Unfortunate customers. Removing the sources of Evilness is a valid approach, but some approaches (such as doing away with statistics) may put an online service at a competitive disadvantage.
The solutions that have been tried so far failed to completely address the problem, mainly because they were implemented entirely in the service when they should have been partly implemented in the games themselves.
A successful online service should include many of the following elements.
A. Display player statistics that encourage good behavior
Simplistic statistics like win/loss ratio impose tremendous penalties on the losing player. It's okay to track wins and losses so long as you also track the games that failed to complete. A full set of statistics should include:
- Number of games completed
- Incomplete games in which player was ahead
- Incomplete games in which player was behind
- Incomplete games which were too close to call (e.g. tie game, or game hadn't really started yet)
All of these statistics should be available to other players, so there's no way to hide. A more sophisticated system would show failure averages, by game, for the user's calling area and for the system as a whole. There is potential for abuse here, e.g. an online "club" decides that they will disrupt the game whenever NastyBob is playing so as to give him the appearance of being Evil, but this should be rare enough to not pose a significant problem.
Tracking "points scored for" and "points scored against" is counter-productive. Players will look for blowout victories.
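The statistics listed above could be kept in a small per-player record. The following is only a sketch under stated assumptions: the class name, the field names, and the `close_margin` parameter are invented here for illustration and do not describe how any real service stored its data.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    """Hypothetical per-player counters matching the four categories above."""
    completed: int = 0
    incomplete_ahead: int = 0
    incomplete_behind: int = 0
    incomplete_too_close: int = 0

    def record(self, finished: bool, margin: int, close_margin: int = 0) -> None:
        """Record one game.

        margin is this player's score minus the opponent's score at the
        moment the game ended or was disrupted. close_margin is an assumed
        per-game tuning value: leads of that size or less are "too close".
        """
        if finished:
            self.completed += 1
        elif abs(margin) <= close_margin:
            self.incomplete_too_close += 1
        elif margin > 0:
            self.incomplete_ahead += 1
        else:
            self.incomplete_behind += 1

    def failure_rate(self) -> float:
        """Fraction of all recorded games that failed to complete."""
        incomplete = (self.incomplete_ahead + self.incomplete_behind
                      + self.incomplete_too_close)
        total = self.completed + incomplete
        return 0.0 if total == 0 else incomplete / total
```

A per-player `failure_rate()` is the kind of number that could be shown next to the average for the calling area or the whole system, so that an unusually high value stands out.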
B. Define "win" and "loss" separately for each game
Not all disrupted games should be scored as "incomplete". If the game is close enough to finishing that there is no doubt of the victor, a real win and loss should be assigned.
Simply adding up the points at the end of the game might work for successful games, but for failed games a different approach must be taken. It is reasonable to assign victory (i.e. a real win, not an incomplete win) to the player ahead 14-7 in the 4th quarter of a football simulation, but to do the same in the 1st quarter of a 14-7 basketball game would be a mistake.
It may be necessary to make the criteria somewhat random. If players can determine the exact point spread necessary, they may pull the plug as soon as they're that far ahead.
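One way to picture a per-game rule with a slightly random threshold is sketched below. Everything in it is an assumption made for illustration: the function name, the linear shrinking of the required lead, the `decisive_margin` tuning value, and the jitter range are not taken from any real service.

```python
import random


def judge_disrupted_game(leader_margin, fraction_played, decisive_margin,
                         jitter=0.15, rng=random):
    """Return "real" if the leader should get a true win, else "incomplete".

    leader_margin:    points by which the leader was ahead at the disruption
    fraction_played:  0.0 at the start of the game, 1.0 at the very end
    decisive_margin:  assumed per-game value: the lead treated as safe
                      at the very start of the game
    jitter:           the required lead is scaled by a random factor in
                      [1 - jitter, 1 + jitter] so players cannot learn
                      the exact cutoff
    """
    if leader_margin <= 0:
        return "incomplete"
    # The lead needed to call the game shrinks as the game nears its end.
    required = decisive_margin * (1.0 - fraction_played)
    # Blur the cutoff so it cannot be gamed precisely.
    required *= 1.0 + rng.uniform(-jitter, jitter)
    return "real" if leader_margin > required else "incomplete"
```

With made-up tuning values, a 7-point lead late in a football game, `judge_disrupted_game(7, 0.9, 28)`, is scored as a real win, while the same lead early in a basketball game, `judge_disrupted_game(7, 0.1, 40)`, stays incomplete. Both examples are far enough from the cutoff that the jitter cannot change the answer.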
C. Games should be restartable when the connection drops
Legitimate failures (where the phone company or a wayward cat were responsible) should be recoverable. The players involved should be able to pick up the game right where it left off, not at some arbitrary earlier checkpoint. This is difficult for tightly-synchronized LAN games like DOOM, but would not be impossible had the designers included the appropriate technology.
D. Games whose state can be held in the service should be restartable for a brief period
The state of some games can be expressed in a file uploaded to the service. Retaining such a file for a brief period, perhaps a week, should not impose undue storage requirements on the service.
The service should make a summary decision on the outcome of the failed game for each player. If "real" wins and losses are assigned, it's unlikely that the winner will wish to continue, and the stats will stand (which is why it's important that the winner be well ahead). If the leader has an "incomplete" win, they will be motivated to secure a true victory, and the player with the "incomplete" loss will be interested in continuing the game so as not to look Evil.
The player matching process (I'm assuming one-on-one games here) must allow, and should encourage, the option of restarting a previously incomplete game. It may be difficult for some users to find each other again, but with a little infrastructure it can be made relatively painless for users who legitimately want to play each other but got interrupted.
The key to this feature is that the game has to support it. It is unreasonable to expect the service to be able to extract all necessary state for a complex game.
E. Have an online "appeal" process
Sometimes users don't want to accept the wins and losses handed to them by the service. The process for appealing these arbitrary (and probably entirely automated) decisions should be clearly defined. Even if the appeal process does nothing but cause an automatic "sorry, the results stand" message to get sent a day or two later, the user will have some feeling that his gripes are being heard.
F. Make stats optional
It is reasonable to have competitive and "just for fun" games going on at the same time. Sometimes users just want to play a quick couple of games without having to worry about how their statistics will be affected. On most existing systems this is done by changing to a different pseudonym, but this results in players who don't care playing against players who do care, which either isn't fun, or results in players going against their non-stat-tracking friends in an attempt to run up their own stats.
This is related to a separate issue: should players be allowed to reset their play statistics? Allowing it makes it much easier to have two users play against each other in an attempt to boost the stats of one. After deliberately losing several games, the designated loser just resets his stats and starts over. This kind of "stat laundering" can be countered with, you guessed it, more statistics.
On XBAND, you got a number of "XBAND Points" for every victory. The amount you got depended on your relative skill level compared to your opponent. A player with a lot of victories, playing against a player with zeroed stats, would not have advanced as quickly as a player going up against experienced opponents.
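The exact XBAND Points formula is not given here, so the following is not it. It is a sketch of the general idea using an Elo-style expected score, with the function name, the rating scale, and `max_points` all chosen for illustration, and with the assumption that a player with zeroed stats carries a low rating.

```python
def points_for_win(winner_rating: float, loser_rating: float,
                   max_points: float = 32.0, scale: float = 400.0) -> float:
    """Points awarded for a win, smaller when the winner was the favorite.

    Uses the Elo expected-score curve as a stand-in for "relative skill";
    this is an assumption, not the formula any real service used.
    """
    # Probability, before the game, that the eventual winner would win.
    expected = 1.0 / (1.0 + 10.0 ** ((loser_rating - winner_rating) / scale))
    # An expected win earns little; an upset earns close to max_points.
    return max_points * (1.0 - expected)
```

Under these assumptions, beating an evenly matched opponent earns half of `max_points`, while a veteran beating a freshly reset account earns almost nothing, which is what makes stat laundering unrewarding.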
Another way to deal with the problem would be to track the number of different opponents that the player has gone up against. This can be expensive to track, however.
The important part is to allow users to play without having to stress out over how their stats will be affected.
G. Have a forfeit option
Allow users to bow out politely. There are two situations where this is useful: when you're getting killed, and when mom says dinner is ready and you have to come to the table Right Now. If you're destroying somebody and are bored to tears, it might also be good to have a call for an immediate judgement. The game itself would decide if one player had guaranteed victory for himself, and if so would allow the game to end as if it had completed successfully.
For the first case, a forfeit should be a loss for you and a victory for the opponent. You lost fair and square. This might provide a faster way to run up stats of course, so the service should track - if only internally - the number of victories by forfeit.
In the second case, if the game can be restarted at a later time, the forfeiting player should be allowed to stop and restart it later, using the standard rules. One modification would be to always assign an "incomplete while losing" to the forfeiting player, as a mild disincentive to using the forfeit switch.
H. Try to identify the Evil players
The "real" vs. "incomplete" statistic gathering is a neutral way of expressing the facts. In many cases, however, it will be possible to go beyond this and know definitively which player was the Evil one. If such a determination can be made, it should be recorded, and the general level of Evilness should be factored into other decisions (e.g. Evil players should be matched up against other Evil players whenever possible).
It can be dangerous to display this to the user, however. Nobody wants to be recognized by other players as a cheater, and users will clog customer support lines with excuses, pleas, and demands for adjustments.
I. Examine both sides of the game
Some people will figure out how to post phony game results that make them look good or their opponents look bad. Comparing the game results from different players will identify problems, and a little service-side cleverness will show which of the players has been cheating.
It can also be very useful to see who thought the game locked up, and when. By comparing the results from two different players you might be able to see who is trying to cheat. Also, if one player resets their game, they may not have any game results to send to the service afterward, so determining whether they should be assigned a win or a loss can only be done by looking at their opponents' scores.
This implies a time delay. If player A and B play each other, player B resets and goes to sleep, and player A logs in showing an incomplete game result, you won't be able to give A any results until B comes back (or a set period of time expires). This has to be explained to the players in a way that doesn't leave them confused about why judgement has been delayed.
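The comparison and the waiting period might be sketched as follows. The record layout, the status strings, and the 48-hour default are all assumptions made for illustration; a real game patch would report whatever state that particular game can expose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GameReport:
    """Hypothetical result record uploaded by one player's box."""
    reporter: str
    my_score: int
    opp_score: int
    finished: bool  # True if this side saw the game run to completion


def reconcile(a: Optional[GameReport], b: Optional[GameReport],
              waited_hours: float, max_wait_hours: float = 48.0) -> str:
    """Return "pending", "one_sided", "consistent", or "mismatch"."""
    if a is None and b is None:
        return "pending"
    if a is None or b is None:
        # Only one side has reported: hold judgement until the other
        # player logs in or the waiting period runs out.
        return "one_sided" if waited_hours >= max_wait_hours else "pending"
    agree = (a.my_score == b.opp_score
             and a.opp_score == b.my_score
             and a.finished == b.finished)
    # A mismatch means at least one report deserves a closer look.
    return "consistent" if agree else "mismatch"
```

A "one_sided" result is the case where the service has to judge from the remaining player's scores alone. The strict equality check is a simplification: in a disrupted game the two boxes could legitimately disagree by a few points, so a real comparison would probably need some tolerance.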
VII. Scattered notes
[Some notes that didn't get incorporated elsewhere, but shouldn't be overlooked.]
- Some players would go easy during the first quarter or half, then turn it up toward the end, to keep people from bailing out early.
- Evil person must care about the consequences. This has to balance against the accuracy of determining who is Evil.
- Client/server and "pick up" games don't have hard problems with stats, because they're ongoing. Similar things can be done with games like "doom" (see how "quake" was done and extend it). Must avoid local corruption; either continually report stats, or allow other players to pseudo-rat on the others. Best if service can track it, so that you can't screw with your own client to make yourself look better or others look worse.
- Allow reconnects whenever possible. Provide a means to rejoin the game. Easy in client/server, slightly more difficult but not impossible in synchronized games like doom.
Catapult tried not to show win/loss stats. Top 10 was based solely on wins - which was a good thing since it encouraged players to play more, bring in more revenue, provide more matches for others, etc. Some players found a hole in a game patch that allowed both sides to win...
Things Catapult did: auto-redial after disruption. Call-waiting detection. Reset detection. Attempts at phone-cord-line pull detect. Problem: users adapt to solutions. Method must be clearly explained, but can't have obvious workarounds. Sometimes you need to see data from both sides, which introduces a time delay.