sniffnoy: (Chu-Chu Zig)
[personal profile] sniffnoy
So here's a question for you: How does one generate a tier list from a matchup chart?

A matchup chart, obviously, contains way more information than a tier list. If you know how literally every character matches up against every other character, you should certainly be able to say which characters are, on the whole, better than others. But how? If you have two characters you want to compare, and one dominates the other, then this is easy; but what if neither dominates?

(I'm assuming here that a "tier list" means just a list of all characters sorted by rank; I'm ignoring any further information, such as numerical ratings or actual divisions into tiers.)

This is the method I came up with: First, we treat the matchup chart as a (zero-sum, symmetric) game. (Since matchup charts are usually presented as being out of 10, you may have to subtract 5 from each entry before you can feed it into your standard tools.) Then, compute the optimal strategy for this game. (I'll assume this is unique, as it will be in all cases analyzed here; I'm not going to worry about what happens if it isn't.) Finally, rank each character by how well they do against this optimal strategy. (There might be ties; I'm just not going to worry about that for now.)

EDIT 6 August 2016: Welp, I've now actually encountered the case of the optimal strategy being nonunique; I decided to handle it by taking the centroid of the set of optimal strategies and using that. How meaningful that is, I don't know, but I couldn't think of anything better. (Note that for a two-player zero-sum game, the set of optimal strategies must be convex.)

Originally I had left out the final step, instead thinking that how often a character occurs in the optimal strategy should be used as the method of ranking. But later I realized this makes no sense -- suppose you have two characters, A and B, which are very similar, but A slightly dominates B. Then B should rank just a little bit below A, but this method would instead put B in last place. I don't think that's the intent of a tier list. We can, I suppose, use this as a tiebreaker for the actual method I suggest though.

Now obviously there are some problems with this. In real life, not everyone is actually capable of playing every character to its greatest extent. Still, I think it is an interesting model. What does it result in when applied to Super Smash Bros.? And do the results match the more usual tier lists?

Now, it's quite a bit more work to produce a matchup chart than it is to produce a tier list, so matchup charts are not always so up to date. As such, we'll compare each matchup chart not to the current tier list but to the one closest in time to when the matchup chart was produced.

SSBWiki lists four matchup charts: One for the original Super Smash Bros., one for Melee, one for Brawl, and one for the Japanese version of the original Super Smash Bros (various things were altered for the more well-known international version). (For obvious reasons, there is as yet no Super Smash Bros. 4 matchup chart; indeed, there isn't even yet a tier list. Since the game is still actively being patched, producing one would be premature. Also, without a Super Smash Bros. 4 Back Room, it's not clear who would produce an "authoritative" one.)

The SSB chart will be compared against the 3rd SSB tier list; the SSBM chart against the 10th SSBM tier list; and the SSBB chart against the 8th (current) SSBB tier list. (Again, this is based on the dates the matchup charts are from.) SSBWiki doesn't list a separate Japanese SSB tier list, but there seems to be one implicit in the Japanese SSB matchup charts, i.e., the ordering of the characters used in the chart, so that's what we'll compare it to.

Actually, before I continue, let me observe -- it's not obvious that my method should make sense that for the SSBM and SSBB matchup charts listed here, since rather than state how often a given character wins a given matchup, they just give abstract descriptions such as "slight advatage", "large advantage", "Close to unloseable", etc. Nonetheless, as we'll see, this will prove not to be an issue.

The thing about the SSB, SSBM, and SSBB matchup charts is that all three are dominance-solvable; the optimal strategy, it turns out, isn't any sort of mixed strategy at all, but simply to always play the best character (i.e., Pikachu for SSB, Fox for SSBM, and Meta Knight for SSBB). This is why I said it doesn't matter that the SSBM and SSBB charts don't actually give matchup splits, since only qualitative comparisons are needed to establish this result.

In the SSB case, Pikachu dominates all characters but Kirby and Fox; but once we restrict to those three, Pikachu dominates.

In the SSBM case, the characters not dominated by Fox are Falco, Sheik, Marth, Jigglypuff, Peach, Captain Falcon, Ice Climbers, Samus, Ganondorf, Link, and Donkey Kong. Once we restrict to those, Fox dominates everyone but Sheik, Marth, Jigglypuff, Peach, Captain Falcon, Ganondorf, Link, and Donkey Kong. And once we restrict to those, Fox dominates everyone but Sheik and Jigglypuff. Once we restrict to those three, Fox dominates.

In the SSBB case, the characters not dominated by Meta Knight are Ice Climbers, Olimar, Diddy Kong, Marth, Falco, Pikachu, Wario, Lucario, King Dedede, Donkey Kong, and Sheik. Once we restrict to those, Meta Knight dominates everyone but Ice Climbers, Olimar, Diddy Kong, Marth, Pikachu, Lucario, and King Dedede. And once we restrict to those, Meta Knight dominates everyone but Ice Climbers, Olimar, Diddy Kong, Marth, and Lucario. Once we restrict to those six, Meta Knight dominates.

In all three cases, then, my method says that for these games, the tier list should be ordered purely based on how each character does against the unique best character (we'll ignore the question of how ties are broken). As you can easily check, this is not how they actually are ordered.

This leaves the case of DSB (Japanese SSB). Here the optimal strategy is in fact mixed! Let's start with a dominance analysis. Pikachu dominates everyone but Kirby, Captain Falcon, and Ness; and once we restrict to those 4, Ness is dominated by Kirby. But once we restrict attention to the remaining three, we find that none of them dominates. In fact, the unique optimal strategy is a mix of 1/4 Pikachu, 1/4 Kirby, and 1/2 Captain Falcon.

This is surprising because it suggests that in some sense Captain Falcon is the best character, rather than Pikachu! I'm not sure how believable that really is, but that's what it says.

This case is more interesting than the others, so let's do a full listing. I'm going to rank characters by how they do against this optimal strategy, as discussed above. I'll then break ties by how often they appear in this optimal strategy. Finally I'll break the remaining ties by their rank in the (inferred) Japanese tier list, i.e., I'll try to keep the resulting list as close to the usual tier list as possible within the bounds stated above.

If we do this, we get the following results:
  Games won Games used Usual rank Resulting rank Difference
Captain Falcon 5 5 3 1 +2
Pikachu 5 2.5 1 2 -1
Kirby 5 2.5 2 3 -1
Ness 4.75 0 5 4 +1
Fox 4.25 0 4 5 -1
Jigglypuff 4.25 0 8 6 +2
Mario 4 0 6 7 -1
Luigi 3.5 0 11 8 +3
Samus 3.25 0 7 9 -2
Link 3 0 9 10 -1
Yoshi 3 0 10 11 -1
Donkey Kong 1.75 0 12 12 0

Here "games won" is how many games out of 10 they win against the optimal strategy above, "games used" is how many games out of 10 they're used in the optimal strategy above, "usual rank" is their rank in the usual Japanese tier list (that I've inferred), "resulting rank" is their resulting rank in this new ordering, and "difference" is, well, the difference.

So, that's a little different, with Luigi jumping up 3 whole spots. Most of the rest isn't too different, but remember, I deliberately broke ties in a way designed to keep it similar.

Is any of this really meaningful? I have no idea. Captain Falcon getting the edge over Pikachu certain seems wrong... but then, that was on tiebreaker, by a different method. Still, I thought it was interesting to see how it would result. If anyone wants to try it on other games and report the results, I'd be interested to hear!


Ranking is hard

Date: 2015-10-26 04:12 pm (UTC)
From: [identity profile]
Ranking in general is hard, for reasons similar to Arrow's theorem.

Re: Ranking is hard

Date: 2015-10-26 06:45 pm (UTC)
From: [identity profile]
Oh man, I hadn't even thought of this in terms of voting! I suppose you could think of this as a voting system, too -- a candidate's "matchup" against another candidate is how many people vote for the one candidate over the other. (Except that you'd have to normalize, because otherwise it might not be zero-sum, if some people only ranked some of the candidates.) It would be an odd voting system, that's for sure. I'm not sure it would be Condorcet. (Maybe it is? Is it in general true that if there's character with no disadvantageous matchups, and at least one advantageous one, then the game is dominance-solvable with them as the winner? I hadn't considered that. I'll have to think about that, or look it up...)

EDIT: OK, that's definitely false, because we have a counterexample above: In SSBM, Fox and Falco both have no disadvantageous matchups, but obviously the optimal strategy can't both be "only play Fox" and "only play Falco"! Of course, that's a pretty complicated counterexample, I should come up with a simpler one.

Perhaps more of a problem is that when snarls occur -- which of course should be rare, but which is where all this different voting systems actually differ from one another -- this one seems prone to producing ties (at least if you don't use the tiebreaker). It's easy to see that in a zero-sum symmetric game, all of the pure strategies that occur in the optimal strategy must do equally well against this optimal strategy. So in fact the first ranking doesn't matter at all, it's just based on how often each occurs in the optimal strategy. Which seems less prone to producing ties. Hm. Still, I don't really see why one would want to use this as a voting system in the first place; it just seems kind of arbitrary in that context.

One could also do the reverse, and take matchups as votes and use standard voting systems to yield a result. (It would have to be a system that worked with just that information, obviously; there isn't enough information to do, say, a Borda count.) That said, I don't see any particular motivation for doing this other than that you can. Whereas, like, my method actually makes sense for ranking the characters in a world where everyone could actually play them all optimally.
Edited Date: 2015-10-26 06:55 pm (UTC)

Re: Ranking is hard

Date: 2015-10-26 09:25 pm (UTC)
From: [identity profile]
OK, here's a simple example: Suppose you have four strategies, A, B, C, and D. A beats the others by 1, while B, C, and D exist in a rock-paper-scissors relation, each beating the other by 2. A has only positive matchups, but nothing dominates anything else.

However, plugging this into a solver, apparently the optimal strategy is in fact pure A. So finding one where it's mixed will be harder (assuming that's possible, which I don't see why it wouldn't be). Still, the SSBM example illustrates that in general having no bad matchups doesn't mean you're part of the optimal strategy.

Re: Ranking is hard

Date: 2015-11-21 11:25 pm (UTC)
From: [identity profile]
OK, here's a nice small sort-of-example, with four strategies. Actually, it's trivial to come up with examples if you allow two of the strategies to be identical (and then you can do it with three), but let's assume we don't allow that.

Just take my example above and modify it so that A, rather than beating all the others by 1, ties with one of the others; call it B. Now there's no longer a unique best strategy; anything from "pure A" to "2/3 A, 1/3 B" is optimal.

Unfortunately this is only a sort-of example, since pure A still is an optimal strategy, just not a unique one. So I don't have any better example there than the Fox/Falco example for now. And that one has both Fox and Falco with no negative matchups; I'd like one where ideally only one character had this. Hm -- actually, if you just have 3 strategies, A, B, and C, and A and B tie against each other but each beat C (let's say A beats C by more), then that's a Fox/Falco example right there. OK, so that one's silly.

This still leaves the question: Does one character having only strictly positive matchups -- or possibly just being the unique character with no negative matchups -- mean the optimal strategy consists of only that character? I haven't found a counterexample so far, or a proof, but I assume an answer is known to this. I think I'll go ask on Math.SE.

EDIT: Nevermind; this paper answers the question -- though I don't understand the reasoning? Oh well, assuming it's correct, a symmetric zero-sum game has an optimal pure strategy iff there is a strategy with only nonnegative payoffs. So if one pure strategy has all positive payoffs, or more generally is the unique one with no negative ones, it is the unique optimal strategy.
Edited Date: 2015-11-21 11:57 pm (UTC)

Re: Ranking is hard

Date: 2015-10-26 11:14 pm (UTC)
From: [identity profile]
The pairwise comparison voting system can be though of as actually taking voting preferences and moving that to rankings. But it is probably true that games can exhibit much weirder behavior than what one would get from listing out all the pair comparisons from some set of voting preferences.

September 2017

17181920 212223
Page generated Sep. 26th, 2017 07:17 am
Powered by Dreamwidth Studios