Last year, as many of you know, I worked on a list of the 25 Best Creatures. I really enjoyed the project, but I ultimately did not find a place to publish it. This time around I am going to do things differently. I will use all of the old criteria, but this time I am also going to do some number crunching to give my conclusions more weight and objectivity. Below is the method I am using to calculate a creature's power based on the metagame. As I stated last year, the efficiency method (power and toughness, plus abilities, minus casting cost and drawbacks) is helpful but ultimately lacking, because of the power of the metagame in Magic. Thus I have tried to calculate a creature's presence in the metagame by looking at all of the Top 8 decklists I can get my hands on. I have not done any calculations as of yet, so this is all preliminary. I want some feedback on the method. Are there any flaws you can see? Any additions or deletions? Does the method work at all? What about its limitations?
Note that the numeric criteria will be only one of a number of criteria.
The Numeric Criteria
The first version of this list used a number of objective criteria to judge the power of a given creature. The criteria, in general, were pretty useful, and they will be used in this attempt to rank creatures, but they will be supplemented with empirical data. That data will be based on Top 8 appearances. It will be compiled by looking at the Top 8 decks from major, well attended events as far back as such data is available. For the purposes of this inquiry these events include: Masters level events, Pro Tours, Grand Prix standings, reported qualifiers for major events (a.k.a. “meatgrinder” tournaments), Nationals and Continental championships, Worlds, Amateur Championships, and the various Constructed formats used at the Invitational event held every year. These events are chosen for two reasons, one practical and the other more logical. First, these are the events with the most complete documentation. Thus they serve as good snapshots of a given metagame. Second, because these events are attended by Pros, they tend to produce the most well developed decks. Thus these tournaments are both well recorded and the peak of a given metagame. In instances where Wizards’ archives are incomplete, I will use The Dojo’s “Decks to Beat” as representative of the major winning archetypes at the time. In Vintage, where there have been few high profile, sanctioned events, I will use Top 8s from major tournaments around the world like Duelman and Waterbury. These lists are taken from www.morphling.de, a clearinghouse of Vintage information. As was said in the introduction, no Limited information will be looked at.
The Raw Score
This is the simplest score to explain. It is the number of times a given card appears in a Top 8 or equivalent record since the beginning of the game. For example, if Birds of Paradise has been in 100 Top 8 decks since the game began and has been a four-of in each of those decks, Birds of Paradise’s raw score would be 400. The raw score is handy because it shows long-term trends. A high raw score likely indicates that a creature is one of the best creatures of all time. The raw score ignores format entirely.
The raw score, however, is flawed in a number of ways. First, and most obviously, older cards will have a higher raw score. Birds of Paradise, a creature released in Alpha, will likely have the highest raw score (I am writing this before I compile the scores to prevent bias) because it is a good card AND because it has been around since the beginning of the game. This is not entirely unfair, as Air Elemental has been around equally long and will rank nowhere near as high, indicating in a rough way that Birds of Paradise is far better than Air Elemental, an equally old creature. However, a simple raw score means that newer creatures like Psychatog and Wild Mongrel will rank far too low. My bet is that a card like Shivan Dragon will have about the same raw score as Psychatog, though we know Psychatog is a far better creature. Thus, raw score serves the same purpose that years of play serve in determining great sports players. If a given player played for twenty years, it is likely that he is at least very good. It does not necessarily mean, however, that he is great. Sandy Koufax is proof enough of this point.
In order to combat this “old creature bias” I will also calculate a time indexed raw score. This number is the raw score divided by the number of years (half years rounded up) the card has been in print. It may help reduce the bias, but I do not believe that it will get rid of it entirely, hence the other scores.
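The raw score and its time indexed version can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the decklist layout (a dictionary of card name to copies), the function names, and the sample numbers are hypothetical, and I read "half years rounded up" as rounding to the nearest whole year with exact halves going up.

```python
import math

def raw_score(decklists, card):
    """Total copies of `card` across every recorded Top 8 decklist.

    Each decklist is assumed to be a dict mapping card name -> copies played.
    """
    return sum(deck.get(card, 0) for deck in decklists)

def whole_years_in_print(years):
    """Round to whole years, half years going up (assumed reading), minimum 1."""
    return max(1, math.floor(years + 0.5))

def time_indexed_raw_score(score, years):
    """Raw score divided by the rounded number of years in print."""
    return score / whole_years_in_print(years)

# Hypothetical data: 100 Top 8 decks, each running four Birds of Paradise.
decks = [{"Birds of Paradise": 4, "Wild Mongrel": 4}] * 100

birds = raw_score(decks, "Birds of Paradise")   # 400, as in the example above
indexed = time_indexed_raw_score(birds, 9.5)    # 9.5 years rounds to 10 -> 40.0
```

The 9.5 years used here is a made-up figure to show the rounding, not a claim about any real card.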
Format Score
This is a creature’s raw score in a given format over the history of that format. Vintage and Extended scores will be higher because old creatures don’t get rotated as often, if ever. Block scores will be incredibly interesting to look at because the shallowness of the card pool means that broken creatures will get used much more often, given the limited number of possible strategies. Thus, Flametongue Kavu will likely have an incredibly high IBC score. Note that each Block format is seen as an independent format. Standard will likely have creatures with scores closer to Extended scores than Block scores, given that some creatures, like Birds of Paradise, have been in the format since its inception. Note that lists that were taken from events before the split of Magic into Vintage and Standard will be treated as Vintage Top 8s, since decks from that era more closely resemble Vintage decks than Standard decks. Even in that era the Power Nine had a stranglehold on Magic. Also note that I will calculate a time indexed format score for all non-block formats, again to combat old card bias.
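As a rough sketch of how the per-format tally might work, here is one possible Python version. The record layout (a format label paired with a decklist), the label "Pre-split", and the sample decks are all hypothetical choices of mine; the only rules taken from the text are that each Block counts as its own format and that pre-split lists are counted as Vintage.

```python
from collections import defaultdict

def format_scores(records, card):
    """Raw score of `card` broken out by format.

    `records` is assumed to be a list of (format_label, decklist) pairs,
    where a decklist maps card name -> copies played.
    """
    totals = defaultdict(int)
    for fmt, deck in records:
        # Lists from before the Vintage/Standard split count as Vintage.
        if fmt == "Pre-split":
            fmt = "Vintage"
        copies = deck.get(card, 0)
        if copies:
            totals[fmt] += copies
    return dict(totals)

# Hypothetical records; each Block keeps its own label.
records = [
    ("Standard", {"Flametongue Kavu": 4}),
    ("Block: Invasion", {"Flametongue Kavu": 4}),
    ("Block: Invasion", {"Flametongue Kavu": 3}),
    ("Pre-split", {"Serra Angel": 2}),
]

kavu = format_scores(records, "Flametongue Kavu")
angel = format_scores(records, "Serra Angel")
```

Here `kavu` holds separate Standard and Invasion Block totals, and the Serra Angel list lands under Vintage.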
Presence Percentage
This score represents the percentage of Top 8s in which there was at least one of a given creature present. For example, if Psychatog made an appearance in at least one deck that made the Top 8 in every tournament for the entire time that it was legal in Standard, then Psychatog’s Presence Percentage would be 100%. This score is designed to find those creatures that were good in hard to use decks that were powerful but faced simpler/cheaper competition. The idea here is that a given deck, while incredibly powerful, even perhaps the best deck in the format, may be underrepresented because the deck is either hard to play, expensive to build, or both. For example, when Standard consisted of Tempest and Urza’s Saga blocks, Living Death was a powerful metagame force, but it was significantly harder to play than the Sligh decks of the same era. Thus, Pros, the ever cautious types that they are, might opt for consistency and ease of use over complex power when faced with the prospect of three straight days of Magic. Thus, while Jackal Pup shows up all the time in this era, it is unfair to say that it is outright superior to Birds of Paradise (a staple in Living Death) in that metagame, even though there were comparatively more Jackal Pups than Birds of Paradise. This score is primarily useful for shifting formats like Block and Standard. It has some usefulness in Extended, but hardly any in Vintage.
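To make the definition concrete, here is a minimal Python sketch. The data layout (each Top 8 is a list of decklists, each decklist a dictionary of card name to copies) and the four sample tournaments are hypothetical; only the rule itself, one or more copies anywhere in the Top 8 counts as present, comes from the description above.

```python
def presence_percentage(top8s, card):
    """Percentage of Top 8s containing at least one copy of `card`.

    `top8s` is assumed to be a list of tournaments, each a list of
    decklists (dicts mapping card name -> copies played).
    """
    if not top8s:
        return 0.0
    hits = sum(
        1 for top8 in top8s
        if any(deck.get(card, 0) > 0 for deck in top8)
    )
    return 100.0 * hits / len(top8s)

# Hypothetical data: four tournaments, Psychatog present in three of them.
tog_deck = {"Psychatog": 4}
other_deck = {"Wild Mongrel": 4}
top8s = [
    [tog_deck, other_deck],
    [other_deck, tog_deck, tog_deck],
    [other_deck],
    [tog_deck],
]

tog_presence = presence_percentage(top8s, "Psychatog")  # 75.0
```

Note that the second tournament still counts only once, even though two of its decks ran Psychatog. That is the difference between this score and the raw score.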
Weighted and Indexed Score
This is my best attempt at an overall numeric ranking of creatures. It is based on comparing the weighted total of a creature’s various format specific scores. The format scores, however, are not all equally weighted. Here are the format score multipliers: Extended = 3 multiplier, Vintage and Standard = 2 multiplier, Block = 1 multiplier. Here is how the weighted score will be calculated, noting that it will use a time indexed version of the given format’s score:
  Extended (1.x) Score x 3
  Vintage Score        x 2
  Standard Score       x 2
+ Block Score          x 1
--------------------------
  Weighted Score
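The same sum can be written as a short Python sketch. The multipliers are the ones given above; the function name, the dictionary layout, and the sample scores are hypothetical. Since each Block is its own format and the text does not say how several Block scores combine, this sketch simply assumes a single "Block" entry.

```python
# Multipliers from the text: Extended 3, Vintage and Standard 2, Block 1.
FORMAT_MULTIPLIERS = {"Extended": 3, "Vintage": 2, "Standard": 2, "Block": 1}

def weighted_score(format_scores):
    """Sum of each format score times its multiplier.

    `format_scores` is assumed to map format name -> score (time indexed
    for the non-block formats). Formats a creature never appeared in can
    simply be left out.
    """
    return sum(
        FORMAT_MULTIPLIERS[fmt] * score
        for fmt, score in format_scores.items()
    )

# Hypothetical scores for one creature.
scores = {"Extended": 10.0, "Vintage": 5.0, "Standard": 8.0, "Block": 12.0}
total = weighted_score(scores)  # 10*3 + 5*2 + 8*2 + 12*1 = 68.0
```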
Here are the reasons why I weighted the various formats as I did. I will look at the formats in descending order.
Extended has a number of things going for it. First, it and Block are the only two regularly supported Constructed formats on the Pro Tour. Standard has not been played since Chicago of 2000, and I believe there has been only one Vintage Pro Tour. Thus, there are more players with better resources and knowledge of the game playing Extended than any other Constructed format. As such, it seems likely that all of the truly great creatures in the various Extended eras have been discovered and used in numbers that indicate their overall power. Second, unlike Vintage, where the cardpool is problematic, creatures can be and have been used in nearly every Extended event since the format’s inception. This means that there is more of a focus on creatures, instead of powerful swing cards, allowing them to play a major role in the game. Finally, Extended seems to be the format most closely watched by Wizards. They regularly ban cards from the Extended cardpool, doing away with the “swingyness” that plagues other formats. Thus, I believe that a time indexed Extended score is the most accurate numeric representation of a creature’s power, hence the 3 multiplier.
While Extended has no real flaws, Vintage does. First, Vintage has a cardpool dominated by a few cards that are concentrated in only a few of the five colors. These cards are so powerful that they artificially constrict the size of the cardpool. So while in theory Vintage has the largest cardpool to choose from, this is not true in practice. Countermagic provides an excellent illustration of this point. In Vintage the format is so fast and the other cards are so powerful that over time it has become clear that there are really only two counter type spells worth playing as a four-of in a control deck. These two cards are Force of Will and Mana Drain. They do so much for so little that they are many times more powerful than the plain vanilla Counterspell. Mana Leak occasionally makes an appearance in a Vintage deck as a 9th-12th counterspell when the deck can use it, as in Ophidian, where Mox mana can allow the splashable Mana Leak to be more useful early than a harder to cast Counterspell. Similarly, Stifle and Misdirection also have niche uses. But by and large, if a Vintage deck is running countermagic, counters 1-4 are Force of Will and counters 5-8 are Mana Drain. Anything less powerful than Mana Drain is not worth using. This has profound impacts on the cardpool. Think of it this way: in each Block there are about 3 counters. They will never print another counter as good as Mana Drain (I believe the exact words were “We’ll reprint Mana Drain when R&D gets hit by a bus.”). Thus, we have at least 3 cards that are at the very best only marginally useful in Vintage printed in every block. Apply this same logic to creatures, burn spells, spot removal, and everything else, and it becomes clear that the King of the Formats has a squire’s sized cardpool. Granted, that cardpool contains the best cards ever printed, but this constriction eliminates a vast number of cards, including many good creatures. This means that creatures are dramatically underrepresented in Vintage.
This flaw does have a useful corollary: those creatures that are used in Vintage are amazingly powerful. Another side effect of Vintage’s supercharged cardpool is the rise of combo decks. Combo decks have traditionally been the enemy of aggro, creature-based decks, and thus creatures are, for another reason, underrepresented in Vintage. Finally, given the format’s dramatic lack of high profile support and the small number of people who do professional grade preparation, Vintage, even today, remains an undeveloped wilderness of deckbuilding potential. This untapped potential is complicated by the format’s naturally conservative fans (who but a traditionalist would cling to a format that has never had a Pro Tour event?). Keeper for many years dominated simply because no one had taken the time or effort to develop another top flight deck idea. Recently this has begun to change, but the sheer number of Keeper decks in any given major event indicates that Vintage players aren’t exactly the Lewis and Clark of Magic. Certain players have eschewed this mantle and gone exploring, reaping tremendous benefits, but by and large, Vintage fans don’t want things to shift dramatically. And given the cardpool it is unlikely things ever will. Blue, for example, has an almost untouchable position as the format’s best color. Thus, with the shallow de facto cardpool, the lack of Pro players, and the format’s naturally conservative nature, there are a number of creatures that are likely powerful but unused. Developments in the past two years (decks like GAT, TnT, Mask, and Dragon) all prove this point. However, the format does show us, in a best of the best competition, which creatures make an appearance. Thus it gets a 2 multiplier for its format score in the weighted score.
Many of the problems that afflict Vintage also afflict Standard: a shallow cardpool (though it is de jure shallow and not de facto shallow), overly swingy cards, and little support or exploration. While Standard is played at Pro level events more often, and garners more grassroots support than Vintage, it is nowhere near the “standard format.” The last Standard Pro Tour was Chicago in 2000. Nationals and Worlds both have Standard events. However, these events are heavily informed by Regionals testing. For Worlds, Pros are faced with a daunting array of skill testing formats. The decks used in the Standard portion are often tweaks on Nationals decks, Nationals being the only event in a regular year for which Pros actively explore the format. So Standard is largely unexplored by the Pros. It is slightly more explored by the average gamer, as the format of choice for States, a favorite among the people, and Regionals, a feeder to Nationals and Worlds. However, this grassroots support, at least from the vantage point of New England, is waning. Any given month presents at least two and as many as four major Vintage events. There are no regular prize awarding Standard tournaments outside of Friday Night Magic, which is almost exclusively a casual format. Thus, Standard is a format that gets explored by people only twice a year. Add to this the quickly rotating card pool and the relatively high price of essential cards like Birds of Paradise and Wrath of God, and it becomes clear that Standard is not the best source of data for finding out which creatures are the best. It was the default format for quite a while, though, so some of that data is useful.
Block represents the worst format to look at to discover the power of a given creature. First, the format has an incredibly shallow card pool, smaller even than Standard’s. Second, design mistakes, because of this small card pool, radically taint Block formats. For example, Odyssey Block was all but a two-deck ordeal thanks to the overpowered nature of the Madness mechanic and the nearly all Black Torment. These design experiments, while not damaging to a larger format, ruin Block formats quickly. This danger was seen again in Onslaught Block, when R/W and Goblins represented most of the competitive decks. The third problem with Block is that niche cards are more powerful. In Standard, where Wrath of God reigns supreme, creatures like Cabal Interrogator stand little chance of seeing play. In Block, however, there are no such cutthroat cards, and creatures that are subpar can make an impact. Looking through the ranks, it becomes clear that this trend is present in every Block format. Tempest had its Bottle Gnomes and Mogg Raiders, Urza’s Block its Horseshoe Crabs, and so on. Every Block format allows niche cards to shine when basic abilities are removed from the format. Venice’s land of the giants proved this true. Without a cheap Wrath-effect or a solid counterspell, behemoths were worthwhile. But when these decks moved into a larger format, efficient cards buzzsawed these decks to pieces. Also, because Block cards are only relevant for a year, at most, they receive less exploration than cards in slow rotating formats. Thus a creature’s prominence in Block is not a useful indicator of power, unless that creature can make the transition into Standard.
Thus, a card’s Extended score will rank highest, receiving a multiplier of 3, its Vintage and Standard scores will receive a multiplier of 2, and its Block score will receive a multiplier of 1, which is to say no multiplier at all.
The Flaw with Numbers
This idea of ranking creatures empirically is tantalizing. We will get a number, something easy to compare and contrast. However, there are two problems with the scores, even once they have been weighted and indexed. The first problem is that of under representation. In certain formats/eras creature decks have been terrible. This is part of the cycle of the game. In early Vintage, creatures were simply too weak to use. The Abyss, Swords to Plowshares, and Mana Drain were central pieces to many decks. Thus the ability to “rush” a control deck was seriously curtailed. Add to this the general paucity of good small to medium priced creatures and it is clear that creatures were not worth the time and effort. Serra Angel, Erhnam Djinn, and Mishra’s Factories represent the few creature-type permanents that were used. Thus, even though Serra Angel and Erhnam Djinn were quite good for their format/era, they were under represented in tournament Top 8s. One could say that these creatures, while good compared to other creatures in the metagame, just don’t stack up to modern creatures. And to an extent this is true. However, there are formats/eras in which even the best of creatures just aren’t good enough. The first good example of this is Combo Winter. At the time, Magic was so dominated by powerful and fast combo decks that playing creatures, even incredibly high quality ones, was a losing prospect. For a time even creatures like the revered Jackal Pup were not being played, in favor of Donate, High Tide, and Tolarian Academy.
Another good example of this is the recent Long/MUD dominated Vintage, where again creatures were under represented because of the power of combo and non-creature (or creature light) decks. These same conditions existed, for the most part, at New Orleans in 2003. The crucial problem relates to the format’s fundamental turn (FT). With Tinker and Long having respective FTs at 2 and 1, no creature deck, no matter how good, can win reliably in those circumstances. Thus, creatures which existed in these environments are under represented by their numeric score. A sneaking suspicion tells me that had Jackal Pup not suffered through two or even three eras of combo in Extended, it would have the highest score of all time, by a great deal. Other forms of systemic under representation include creatures that are not used in four-of configurations, such as Morphling in old school Keeper decks and Psychatog in GAT decks, and creatures whose heyday came at the end of a rotation. In tabulating the final rankings I will try to keep in mind under represented creatures and make a note of them.
The inverse problem, that of over representation, is far trickier. While obvious metagame factors, such as fast combo decks, give rise to under representation, I still have lingering doubts as to whether over representation is a problem at all. I can think of one classic example of over representation: Gorilla Shaman. While the final tally has not been compiled as I write this, I have a feeling that Gorilla Shaman will be the most used creature in the history of Vintage. A number of factors cause this. First, Gorilla Shaman is an older card, having been printed in Alliances. Second, Gorilla Shaman is easily splashable, with minimal color requirements. Third, Gorilla Shaman is a staple in the format’s most played and favorite deck, Keeper. Fourth, Gorilla Shaman easily corrects one of the format’s biggest design mistakes (and most played cards): the Moxen. So the question becomes: is Gorilla Shaman a good metagame creature or a balance against a flagrant flaw in the format’s cardpool? The answer, of course, is both. The two are cause and effect. The format’s design flaws cause Gorilla Shaman to be a good metagame creature.
But it seems unfair to say that Gorilla Shaman is the best creature in Vintage. After you eat your opponent’s five Moxen and Lotus, Gorilla Shaman is merely a plain vanilla 1/1 (most of the time). Destroying these resources is, of course, a huge boost in tempo, but Gorilla Shaman’s abilities are so narrowly useful that I can’t help but believe that Gorilla Shaman is useful only because the Moxen are so incredibly good. Proof of this is Gorilla Shaman’s record in other formats. Few if any decks outside of Moxen laden Vintage have used the Shaman, even as a sideboard card. This sharp drop off in use is a testament to a creature being used in an over representative way. Other examples of over represented creatures include: Morphling, Phyrexian Dreadnought, Worldgorger Dragon, Uktabi Orangutan (in the pre-January Extended), Wildfire Emissary, and Erhnam Djinn. Most of the time over representation occurs for two reasons, though they are much more amorphous than the reasons for under representation.
The first reason is the “lock and key” phenomenon. The best examples of this are Morphling and Dreadnought. In their respective decks, Keeper and Mask, these two creatures, in their given era, were amazingly well suited to the decks that used them. Morphling was a flying, pumping, offensive/defensive, Abyss dodging Swiss Army knife of a creature. These abilities meant that Morphling was good, but it was not great. In other formats, like Extended, Morphling was used much less frequently. In truth, Morphling fit so well into the deck because of Mana Drain, the Moxen, and The Abyss. It was the perfect key for the lock that Keeper represented. Dreadnought and Mask have a similar relationship, as do Dragon, Bazaar, and animate dead enchantments.
The second reason for over representation relates to a similar phenomenon: cards that break a given metagame. In the Standard Mirage/Ice Age era, Swords to Plowshares was the dominant card. It was so good, in fact, that White has yet to recover from its loss, some 7 years later. Swords to Plowshares defined the format and the decks that were useful. Lightning Bolt served a similar purpose. It, too, was undercosted, and dramatically warped the environment. So when Wildfire Emissary was printed, it fit into the metagame perfectly. It was immune to both Swords and Bolt. Thus it became a staple creature of the era. But, much like Shaman is good because the Moxen are ridiculous, Emissary was good because the removal of the era was insane. In a different era, with different conditions, Emissary and Shaman are both just okay creatures, and discovering okay creatures is not the purpose of this list. Tinker decks made Uktabi Orangutan insanely good, and multi-colored lands like Gemstone Mine and Undiscovered Paradise both made Erhnam Djinn easily splashable and negated its drawback. As such, the other kind of over represented creatures are creatures that are good because of a flaw or imbalance in the metagame. I will try to make note of over represented creatures while making the final rankings.
I am trying to identify creatures that are great regardless of circumstance. Sometimes this is impossible, but I want to make sure to keep the problems of over and under representation in focus. Otherwise, the numeric data will overwhelm the results of the other inquiries and make the rankings inaccurate in a way that is immediately obvious. So this means that while the various scores the creatures receive may be useful, they are not the final analysis. They are one of many criteria that will be used to compile the ranking. Also keep in mind that I have chosen the criteria and how the scores will be calculated prior to tabulating the Top 8 data. I say this to dispel notions of bias. I am relying on things other than numeric data not so that I can “keep my hand on the scale” and choose my “favorite” creature, but because I believe that the numbers only tell part of the story.