Stathead Trivia & Esoterica: #2
THE REVOLUTION WILL BE DIGITIZED
One of the great lessons that history can teach us is that it’s surprisingly difficult to identify sea changes as they’re happening. If you’re old enough to read this, that means you can probably remember a time when some people openly doubted the internet’s staying power, or at least questioned the viability of a medium like blogging. You don’t see those arguments anymore, because history has proved them to be laughably wrong, all within a window of 15 years or so. Those poor people couldn’t tell the difference between a fad and a ground-shifting, life-changing explosion in the history of human communication, but it wasn’t because they were stupid, per se: it was because humans are inherently conservative creatures, slow to integrate new ideas into their lives, and they need time to adapt to even the biggest changes to their world.
What helps us move this process along, however, is a narrative: a convenient story we can recite, that tells one version of what happened in the past, and can give our culture a framework for evaluating the present. What helps this narrative work is that it’s told in a common language: a shared set of words, ideas, concepts, benchmarks, and standards. After all, if we can agree on what’s come before us — where we succeeded and where we failed — then we can probably agree on the importance of what’s happening in front of us right now.
I have seen the new narrative in baseball analysis, my friends. I have seen the ground shift below our feet, but if you weren’t paying attention on Monday, you might not have noticed. You might have thought it was any other day. But it wasn’t. Because Monday, May 17, 2010, was the day Sean Forman added historical Wins Above Replacement data to baseball-reference.com, and the common language available to the public about Major League Baseball leapt from 900 words to about 900,000. You might think I’m kidding, or that I’m exaggerating for effect, but you’d be wrong. Where you’re standing used to be the Grand Canyon. But Sean Forman moved mountains, and now the world is flat.
A couple weeks ago I wrote about the importance of using the right measurements to define value. What Forman did by adding Sean Smith’s version of WAR to the most popular baseball statistics site on the web was an act that takes that idea seriously, and it’s to the sport’s benefit. By giving his user-base a thoroughly modern yardstick for judging value, Forman also provides the definitive re-write of history, in many ways, as the stathead community’s basic framework for looking at the world — in terms of wins, based on context-neutral performance, derived from a player’s ability to help his team outscore its opponent — has now found its way in front of millions of eyeballs per month. That’s a big deal, folks. If stupid baseball opinions were carbon emissions, this is the equivalent of the government handing out free electric cars.
THE DIRTY WORK
Bill James was a successful analyst for many reasons. First, he was an undeniably great writer, with a voice — curious, prodding, iconoclastic — that would have a tremendous influence over every sports pundit with a “statistical bent” to come after him. He was also a creative thinker. I don’t know Bill, but I suspect his Myers-Briggs personality I.D. is something like an I.N.T.P., which means he excels at looking at problems from an uncommon distance, and is drawn to arguments that ask his audience to re-think their entire notion of what they’re doing. Not everyone is adept at seeing the world this way, but it seems to come naturally to Bill, and it’s one of the reasons, I think, his arguments are frequently so compelling. He was operating outside the box before people even knew there was a box in the first place.
But the main reason Bill found success, I think, is much more mundane, but no less important than the other strengths he brought to the table. Bill James was successful because he did the dirty work. When tasked with answering the question “Does clutch hitting exist?” he didn’t just wax on for 2,000 words about the definition of “clutch,” or whether that was a useful concept on its face. He went through the goddamn box scores and collected the data to see what he could see. He created new information — new knowledge, not just trivia — and then he made that information available for other people to consume. He changed the norms by organizing facts in an easily accessible form, where they could be referenced and memorized and enter the existing conversation about baseball like fluoride dripped into our tap water. Of course, only a small niche of baseball fans were actually reading Bill’s Abstracts when they reached bookstores every year, but that didn’t matter because the type of information he was giving them could live on past the usefulness of his essays about the Royals, or the Hall of Fame, or Duane Kuiper. He was giving you the keys. All you had to do was turn the ignition.
While the rise of baseball analysis on the internet can be traced back to any number of worthy sources — from Baseball Prospectus, to Rob Neyer (who has directly influenced more writers than James ever did), to Peter Gammons, to Moneyball — it’s Forman’s baseball-reference.com that has been the real driver all along. For one, bb-ref.com is a non-partisan source. Unlike most of the personalities in the “sabermetric community,” Forman has never been obsessed with advocacy, and as a result, he’s never managed to rile up any opponents. His site is about giving his audience as much information as possible, as quickly as possible, with a minimum of braying commentary. To those of us who have been paying attention, it has always been clear that Forman was a stathead’s stathead — and his site has always contained subtle suggestions for how to best interpret the data he’s giving you — but he never used his platform to explicitly sell ideas. Like Bill James, he was comfortable doing the dirty work, and letting the data speak for itself.
Consider, after all, the timeline for bb-ref.com over the past decade, and the way data-driven baseball analysis has entered the mainstream on a broader scale:
YEAR    EVENT
2000    Baseball-Reference.com launches
2001    Detailed, historical splits added from Retrosheet
2002*   OPS+ and ERA+ appear on every player page
2004    Basic defensive stats added
2005    Player seasons broken down by team on page
2006*   Season stats now update daily, including splits
2007    Users can adjust stats to a neutral context
2007*   BB-Ref's Play Index is born
2009    Upgraded defensive data with TotalZone
2010    10-for-10 upgrades, including WAR, etc.
* From memory, because Forman’s too modest to list everything on his site history page.
It would be silly to pretend that baseball-reference.com was the only site really innovating during this period. That would be the opposite of the truth, in fact. But no other site in bb-ref.com’s league of non-partisan information sources has increased the sophistication of its offering nearly this much, or this often, thereby redefining what it even means to know what “the numbers” say. And I don’t think it’s a pure coincidence that an increased understanding of basic concepts like park illusions and defense has gained traction shortly after Forman made them standard issue on every single one of his player pages, for all the world to see. Sure, it’s nice when something like PECOTA makes it onto SportsCenter for a brief segment during spring training, but that’s only an example of increased publicity. What Forman does by placing OPS+ next to a hitter’s traditional stat-line is much more effective, because it encourages increased acceptance. It’s like he’s giving his audience a brand new word to use amongst themselves. Sooner or later, they’re going to start integrating it into their language, and they’ll forget the time when they didn’t know what it meant. Its definition becomes common, and that commonality allows it to slip seamlessly into the narrative.
And that “new” narrative is judging players by how many wins they create beyond (or below) the production of a good Triple-A hitter or pitcher. Of course, it’s not new because the concept is new. It’s new because the average fan visiting baseball-reference.com in 2010 will see it alongside all the other information he was looking for in the first place. It’s new because there are now cleanly organized leaderboards for this metric, for every season since 1901. It’s new because the stathead’s version of history — the “sabermetric narrative” about Major League Baseball — can finally be found in one place, and read in one coherent, comprehensive argument that rejects the false idols of the past and exalts new ones in a manner that’s more difficult to dismiss than ever before.
It’s easy to see a revolution when the walls are coming down around you. It’s much more difficult to perceive a revolution when the weapons-of-choice are ideas. By adding Wins Above Replacement to baseball-reference.com’s suite of statistics — not to mention its draft pages, where they are perhaps most valuable — Sean Forman has provided the stathead community — the fact-based community, really — with a tactical advantage in the war against ignorance like nothing we’ve seen in 30 years. The bar for common knowledge has just been raised, my friends. This might not impact you directly — you already knew about Fangraphs and Rally’s database, after all — but it’s going to affect your ability to persuade other people that you’re right, even though the quality of your argument didn’t change a bit. Forman just gave your opponents the vocabulary they needed to understand your point of view, maybe for the first time. We should all be thankful for that, and remember May 17, 2010 as the day when the world changed just a little bit, even if only a few of us noticed.