Wednesday, December 2, 2009

Plusses and minuses of alternative stats

Warning: Shameless plug for a guy who says he likes my blog -- see, I can be bought for just a few kind words (dollars don't hurt, though)!

Looking over Luis's new QB rating system, as well as his article on the NY Times' Fifth Down, I like it and I get what it's saying, but I find it -- I don't know, not confusing, per se, but complex. Which is, naturally, how any system to rank a quarterback is probably going to be, including both traditional passer rating and my system. But I tend to think of my own system as relatively simple. Is that because I made it myself and I'm intimately familiar with it? Without trying to sound boastful, I naturally think my system is good, and not just because I spent hours coming up with it and think it's some sort of statistical masterpiece.

I feel that any new statistical measure, if it's going to achieve resonance with the masses, should be both 1) something the masses can compute with minimal effort and 2) something where they can get a concept of what the value means. By "something they can compute," I mean that it should be something the average fan could figure out, like third-down conversion rate in football or WHIP in baseball. My second point means that the value should have meaning; the fan should understand what they're looking at. Third-down conversion rate is just that, and needs no additional definition. So is WHIP; it's essentially how many baserunners a pitcher gives up in an average inning. Something like OPS is a little more squirrely, but if you know the component parts of it (OBP and SLG), you can say someone has a .900 OPS and get an idea that he probably has around a .400 OBP and .500 SLG, and you know what those mean.
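To show how little arithmetic these "easy" stats take, here's a rough sketch in Python. The function names and the sample numbers are made up for illustration; the formulas are just the standard definitions (WHIP is walks plus hits per inning pitched, OPS is OBP plus SLG).

```python
def third_down_conversion_rate(conversions, attempts):
    """Fraction of third-down attempts that were converted."""
    return conversions / attempts


def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched: roughly, baserunners per inning."""
    return (walks + hits) / innings_pitched


def ops(obp, slg):
    """On-base percentage plus slugging percentage."""
    return obp + slg


# Made-up numbers, purely for illustration.
print(third_down_conversion_rate(6, 15))  # 6 conversions in 15 tries
print(whip(50, 150, 200))                 # 200 baserunners in 200 innings
print(round(ops(0.400, 0.500), 3))        # the .900 OPS example from above
```

Each one is a single line of arithmetic, which is about the level of effort I'd expect the average fan to put up with.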

Most quarterback rating systems fail the first test. Unless the formula is mind-bogglingly simple and involves very few variables, it's relatively indecipherable to the common fan. The second aspect -- comprehension of what the number means -- can be a little easier to wrangle. Traditional passer rating at least lets you think that a "100" is good, and people like round numbers. A lot of alternate QB systems (mine included) use some form of yards per attempt (including or not including sacks, interceptions, fumbles, TDs, and so on in some way) as their result and that, too, is something most people can grasp. (Passer rating has, I think, become mainstream simply because it was the first attempt to quantify the many aspects of a QB's stats.)
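For comparison, here's a sketch of traditional NFL passer rating as the formula is commonly published, next to plain yards per attempt. The function names and the sample stat line are my own inventions for illustration; this isn't Luis's system or mine, just the contrast between a four-part formula and a single division.

```python
def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """Traditional NFL passer rating, on its 0 to 158.3 scale."""

    def clamp(value):
        # Each of the four components is limited to the range 0 to 2.375.
        return max(0.0, min(value, 2.375))

    a = clamp((completions / attempts - 0.3) * 5)        # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)             # yards per attempt
    c = clamp((touchdowns / attempts) * 20)              # touchdown rate
    d = clamp(2.375 - (interceptions / attempts) * 25)   # interception rate
    return (a + b + c + d) / 6 * 100


def yards_per_attempt(yards, attempts):
    """Plain yards per attempt: one division, nothing else."""
    return yards / attempts


# A made-up stat line: 20 of 30 for 240 yards, 2 TDs, 1 INT.
print(round(passer_rating(30, 20, 240, 2, 1), 1))
print(yards_per_attempt(240, 30))
```

The first function needs four scaled-and-capped pieces before it spits out a number; the second is something you can do in your head.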

The other thing I tend to dislike about alternate statistical systems is any "imaginary" aspect, simply because, to me, it seems like mostly blind guesswork and highly subjective. Usually, these come in the form of strength-of-schedule adjustments or, in the case of certain baseball stats like xFIP, what stats the player or team "would have" accumulated if he'd played with a league-average defense (or pitching staff or running game or whatever). Those are, IMHO, fun to look at, but are ultimately unreliable as definitive measures. I understand that Brett Favre's great numbers this season are due, in part, to his playing against a relatively weak schedule, but how good would he be against a league-average schedule? 95% as good? 80% as good? 71.6% as good? Nobody knows. It's just speculation, and I prefer to use "real" stats in my arguments, not guesses. If I can't tell where the numbers are coming from, the average fan probably can't either, and that's going to hurt the acceptance of any new stat. Any "imaginary" stat almost certainly fails both point 1) (easy to compute) and point 2) (understandability) -- and don't get me started on "intangibles."

This brings me to the subject of this post and something I almost always dislike seeing in any statistical system: negative numbers. They usually crop up in stats that try to say a player or team is better or worse than average and, in the process, fail both of my criteria for a stat that's "acceptable" to the masses:

1) Something the masses can compute. This may come as a surprise to us statheads, but, speaking as someone who's comfortable with math and has had to work with people who aren't, I can tell you that the average joe has trouble working with negative numbers.

2) Something where people can understand what the value means. With very few exceptions (negative yardage comes to mind), all stats accumulate in the positive. What will someone understand better: that Adrian Peterson averages 4.7 yards per carry or that he averages +0.6 yards per carry above average? Both are true, but one is what actually happens in the game (he gains yardage) and one is just a stat (he gains more than the average back).
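Here's a quick sketch of the two ways of stating the same thing. The 4.1 league average is simply what the 4.7 and +0.6 figures above imply; the rushing totals and function names are made up for illustration.

```python
# League-average yards per carry implied by the 4.7 and +0.6 figures above.
LEAGUE_AVERAGE_YPC = 4.1


def yards_per_carry(yards, carries):
    """What happens on the field: yards gained per rushing attempt."""
    return yards / carries


def yards_above_average(ypc, league_average=LEAGUE_AVERAGE_YPC):
    """The relative version: negative for any below-average back."""
    return ypc - league_average


# Made-up totals for two hypothetical backs with 100 carries each.
for name, yards in [("Back A", 470), ("Back B", 350)]:
    ypc = yards_per_carry(yards, 100)
    print(name, round(ypc, 1), round(yards_above_average(ypc), 1))
```

Back A shows up as 4.7 and 0.6, which is fine either way. Back B shows up as 3.5 and -0.6, and it's that second number, the negative one, that I think loses the average fan.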

I might be wrong in all of this. Maybe the issues I have with "new" stats are just my issues and not something that most people have. The thing is, "we" -- meaning those of us who try to innovate with new stats and can understand how complex stats are computed -- sometimes get lost in our own heads and can't see how others wouldn't understand our glorious ideas. I'm not bashing anyone's stats, and I know my own ideas need refinement; rather, I'm typing all this because I think these are issues we'll all need to address if we want "our" stats to achieve widespread use. Maybe in a hundred years, passer rating, abstruse as it is, will fall out of vogue with football fans and some other system will supplant it as the standard by which quarterbacks are rated (wins notwithstanding). But it'll have to be something that's palatable not to statheads like us, but to Joe Six-pack.
