Continuing from Part One… Again, our data source (WWF/WWE PPV 1985-2013 Star Ratings).

We’re still trying to tackle the quandary of how one should distill a myriad of individual match ratings into a single score for each event and/or each year.

Let’s review some options:


(a) Weight every single rated match equally

(b) Weight every single rated match that received a rating above DUD equally


(c) Weight every single rated match according to its placement on the card

(d) Weight every single rated match that received a rating above DUD according to its placement on the card


(e) Weight every single rated match according to its length in minutes

(f) Weight every single rated match that received a rating above DUD according to its length in minutes


(g) Weight every single rated match according to importance of the competitors and titles involved

(h) Weight every single rated match that received a rating above DUD according to importance of the competitors and titles involved


(i) Weight every single rated match according to several factors

(j) Weight every single rated match that received a rating above DUD according to several factors


If you’ve followed my work previously, you know we’re going to settle on the very last choice; still, it may be comforting to know that options a-h don’t really provide vastly different pictures.

In yesterday’s post, we explored options a & b (equal weighting, or as I called it, “unweighted all ratings” and “unweighted positive ratings”).
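For anyone who wants to replay options a & b at home, here is a minimal sketch in Python. It is not the spreadsheet behind the charts; the function name and the sample card are made up for illustration, and it assumes a DUD is stored as 0.0 and negative stars as negative numbers.

```python
from statistics import mean


def unweighted_score(ratings, positive_only=False):
    """Plain average of one card's star ratings.

    Assumption: DUD is 0.0 and negative-star matches are negative numbers,
    so "above DUD" simply means rating > 0.
    """
    kept = [r for r in ratings if r > 0] if positive_only else list(ratings)
    # A card with nothing left to average has no score.
    return mean(kept) if kept else None


# Hypothetical five-match card (not real data).
card = [2.5, 0.0, -1.0, 3.0, 1.5]

all_ratings = unweighted_score(card)                            # option (a)
positive_ratings = unweighted_score(card, positive_only=True)   # option (b)
print(all_ratings, positive_ratings)
```

The gap between the two numbers is the "delta" you get from throwing out the DUDs and negative-star matches.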


As noted before, 1985 and 1986 improve from the bottom third to the middle third when you eliminate those pesky “negative” star ratings. Think of Dave’s infamous disdain for the Hogan/Andre main event at Wrestlemania III, which he awarded [bonus points for reading in voice of Bryan Alvarez shouting] MINUS FOUR STARS.


While Hogan and Andre have both received negative star scores on multiple occasions, the real grand champions of negative scores are the Brothers of Destruction: Undertaker and Kane. They’ve done it all.

They earned negative scores for wrestling big stiffs (Mabel, Giant Gonzalez, Big Show, Khali, Kamala), wrestling each other (WWF In Your House 25: Judgment Day), teaming together (against KroniK) and even wrestling themselves (Undertaker vs. Undertaker: Summerslam 1994). Dennis Knight (Phineas I. Godwinn/Mideon) and Santino Marella have nothing on those two.


Next up are options c & d: weighting by card placement. Under this methodology, we number all of the matches on the card and weight each match heavier than the preceding one. In this fashion, we’re emphasizing the importance of the “main event”, or at least whatever match went on last.  There are some obvious drawbacks to applying this method, not the least of which is that wrestling cards aren’t always structured cleanly from least important to most important match.

In fact, we’re all familiar with the “bathroom break match”; they’re often used as buffers between hot main events.  (For further exploration, I encourage you to check out the mini-study about the “Viscera Slot” utilizing 1993-2013 Raw data.)  Simply assuming the important stuff goes on late in the show and the minor stuff in the beginning is certainly a fallacy.

Even so, over-weighting the main events (on a ten-match card, half of the score for the card would be driven by the last three matches) doesn’t dramatically alter the annual rankings.
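Here is a rough sketch of what card-placement weighting could look like. The big assumption is that the weights are simply 1, 2, 3… in card order; the post only says each match is weighted heavier than the one before it, but linear weights do land on the "last three matches drive half the card" figure (27 out of 55 on a ten-match card). The function names are hypothetical, and keeping a match's original slot number when DUD/negative matches are dropped is also my assumption.

```python
def position_weights(n_matches):
    """Assumed linear weights: opener gets 1, main event gets n_matches."""
    return list(range(1, n_matches + 1))


def position_weighted_score(ratings, positive_only=False):
    """Weighted average of ratings listed in card order (opener first).

    Assumption: DUD is 0.0, negative stars are negative numbers, and a
    dropped match does not cause the remaining slots to be renumbered.
    """
    pairs = list(zip(position_weights(len(ratings)), ratings))
    if positive_only:
        pairs = [(w, r) for w, r in pairs if r > 0]
    total_weight = sum(w for w, _ in pairs)
    if total_weight == 0:
        return None
    return sum(w * r for w, r in pairs) / total_weight


# Share of a ten-match card carried by the last three matches.
weights = position_weights(10)
last_three_share = sum(weights[-3:]) / sum(weights)  # (8 + 9 + 10) / 55
print(round(last_three_share, 3))
```

A steeper scheme (squared positions, say) would tilt the score even further toward the main event; the linear version is just the simplest thing that matches the description.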

Including All Ratings

The biggest change is that 1999 fares better when you overweight “main events”, climbing from 26th place to 22nd place.  Meanwhile, 1990 slips from 24th to 26th.  Otherwise, you’re just shuffling the candidates all around.

Including Only Positive Ratings (ignore DUD and negative star ratings)

As before, it’s largely the same years just slightly shuffled around.  The main difference in this example is that 2011 dropped from 3rd place to 6th place and 2005 jumped up from 6th to 5th.

So, as with many intricate plans, simplicity prevails.  Moving on…


@steenalized put it nicely: “My gut reaction is to include time. Five minutes of -** (minus two stars) is less damaging than 20 minutes of * (one star).”  What happens if we just evaluate the card using a weighting based on the match lengths (options e & f)?

The first thing you may notice is that the “space” between the lines is much smaller than in the previous examples.  Whereas previously eliminating the DUD/negative stars made a difference (the “delta”) of between 0.37 and 0.41 stars, in our time-weighted example that delta drops to only 0.30.

Essentially, the implication is that while some matches are terrible, they’re usually short. And this method underweights short matches (regardless of where those short matches took place on the card).  Now, there’s a lot to be said about this and it will be further explored in a later installment when I get to the complex relationship between match lengths and star ratings.

(If you want a sneak preview, look at the piece I wrote in October with special attention to the WWF Star Ratings vs. Time graph.)  Your average negative star match is 6 minutes 54 seconds.  Your average “DUD” is 6:34.  Your average positive star rating match is 13:16.  Essentially, your average positive star match (i.e. “good”) is going to be weighted about twice as heavily as a stinker (DUD) or downright terrible match (negative stars).
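To make the time weighting concrete, here is a small sketch where each match's weight is just its length in seconds. The function names and the two-match card are invented for illustration; only the three average lengths come from the figures quoted above.

```python
def to_seconds(mmss):
    """Convert a "minutes:seconds" string such as "13:16" to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_weighted_score(matches, positive_only=False):
    """Average of star ratings weighted by match length.

    matches: list of (rating, "mm:ss") tuples.
    Assumption: DUD is 0.0 and negative stars are negative numbers.
    """
    pairs = [(to_seconds(length), rating) for rating, length in matches]
    if positive_only:
        pairs = [(w, r) for w, r in pairs if r > 0]
    total_weight = sum(w for w, _ in pairs)
    if total_weight == 0:
        return None
    return sum(w * r for w, r in pairs) / total_weight


# Average lengths quoted in the post.
avg_negative = to_seconds("6:54")
avg_dud = to_seconds("6:34")
avg_positive = to_seconds("13:16")

# How much heavier the average positive match is weighted.
ratio_vs_dud = avg_positive / avg_dud
ratio_vs_negative = avg_positive / avg_negative
print(round(ratio_vs_dud, 2), round(ratio_vs_negative, 2))

# Hypothetical two-match card: a short stinker and a long one-star match.
example = time_weighted_score([(-2.0, "5:00"), (1.0, "20:00")])
print(example)
```

Because the long match carries four times the weight of the short one, the made-up card still lands in positive territory despite the minus-two-star opener.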

Including All Ratings

Top five years get reshuffled, but no big surprises. On the other end, 1992 gets a bit of a break as it climbs up out of the bottom five pit as 1997 spills downward into 26th place.

Including Only Positive Ratings

Again, all that work, and not a lot to show for it. The most dramatic move is 2012 jumping up from 9th place to 4th place while 2008 slips to 7th.  At the bottom, the list is dreadfully similar.

That brings us to the next options (g & h, weighting by importance), which will be explored in the next installment.



As usual, a simple idea (weight the matches according to their importance) will introduce a lot more questions: How do we identify the “most important” matches?  Is it tied to the type of match or titles involved or the people involved?  How does card placement influence the ranking of this importance?  And finally, can we combine all of this stuff and produce any kind of an answer that’s defensible yet materially different than the same five years on top and the same five years on the bottom which we keep getting?  Tune in next time…