What It Takes to Build An Advanced Hockey Stats Site

We talked to one of the founders of the now-defunct War on Ice, as well as Manny Perry of Corsica Hockey, about the behind-the-scenes efforts that went into getting their sites off the ground.
July 7, 2016, 10:01pm

This article originally appeared on VICE Sports Canada.

Whether you're a casual fan or a credentialed reporter, statistics are essential to your understanding of sports. Sure, there may be some disagreement over the value of certain metrics, but there's an expectation of infallibility; if the box score says Alex Ovechkin scored last night, you're inclined to believe that he did without thinking twice.

Unfortunately, that's not the case for the NHL. In fact, visiting the league's official stats page is like looking into a funhouse mirror and seeing twisted versions of what's actually happening on the ice.

"It's extremely troubling for the obvious reasons," said TSN's Travis Yost. "Everything the NHL does is seen as credible. The problem is their entire stats database is either randomly generated, incorrectly calculated, or a combination of the two. Most of the site maintains the same data integrity issues as it did one year ago despite the analytics community repeatedly pointing it out."

Thankfully for hockey fans, War on Ice became a meticulous counterpoint to the league's incompetence; just about any publicly cited analytical tool, from possession numbers to shift charts and shooting maps, could be traced back to its database. But what grew into an invaluable resource began much more humbly.

"As it happened, we'd already built a lot of the mechanics to collect the play-by-play data beforehand," A.C. Thomas, one of the site's co-founders, explained via email. "Sam [Ventura, another co-founder] and I were on a hockey panel at a statistics conference and we all wished out loud we had far better exposure for our research than the academic sphere offered, so we all resolved that we were going to make a much bigger push to release things publicly. At the time we didn't know what that would be."

Roughly a week after that conference, the stats website Extra Skater suddenly went dark. Shortly after, news broke that the Toronto Maple Leafs had poached its creator, Darryl Metcalf. That was exactly the push Thomas and Ventura needed. The first prototype of War on Ice was ready within three days and the page went public about two weeks later.

But even after launch, the effort was far from finished. "We probably did about 15-20 hours a week each for the next month getting everything smoothed over, adding new features, and so forth, but there was also time spent during the games keeping an eye on Twitter for questions, getting more feedback, and so forth," Thomas said. "That went down over the next couple of months and rocketed back up again when CapGeek went offline and we tried to expand into that area."

In January 2016, news broke that War on Ice would be the latest casualty. Thomas and Alexandra Mandrycky, who had come to the site a year earlier, joined the Minnesota Wild organization, while Ventura was hired by the Pittsburgh Penguins. The site that stepped into the gap left by Extra Skater would be going down the same path as its predecessor. And while the site managed to stay active through the conclusion of the season, it was clear that someone else would need to shoulder the load.

One of the sites that can directly fill that void is Corsica Hockey, which was launched in late February by Manny Perry, who previously operated an advanced stats website about the Ottawa Senators. The platform's name even comes from an interaction with Sens forward Bobby Ryan, who referred to popular advanced stat Corsi as "Corsica."

"Screw your nerd stats," Bobby Ryan exclaimed after scoring a goal earlier this season. Photo by Jayne Kamin-Oncea-USA TODAY Sports

While the site wasn't originally conceived as a major resource, the possibility that War on Ice might go dark prompted an adjustment.

"Initially, it was just going to be a really small-scale thing. I wanted a site where I could have these apps I had been working on; if anyone was interested, they could access them," Perry explained. "When the War on Ice announcement was made, people were scrambling to see whether or not there would be a replacement. I had already started working on the site—I was working on the scraper, I was working on the database—so the decision had to be made quickly as to what scale I was looking at."

Once the decision was made to aim for a larger database, plenty more effort was needed to make the site functional for the masses.

"It's been quite a bit of work," Perry said. "I had already started on some of the stuff before that, so I wouldn't want people to think that when the announcement was made I just started working on this, but I did accelerate things and start putting more time into it. I can't really ballpark the number of hours, but I've spent a lot of my free time working on stuff.

"Building the apps—when you access the site and are using the tools to play around with it—those aren't the most difficult things. It's mostly engineering the database and getting the data in place. You need to decide on a schema ahead of time because once you start compiling the database, you're bound to this structure. That whole process is probably the longest."
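To make Perry's point about committing to a schema concrete, here is a minimal sketch of what a play-by-play database might look like. It is not Corsica's actual design: the table names, column names, event types, and sample rows are all hypothetical, and it assumes each event is credited to the team that took the attempt. It uses Python's built-in sqlite3 module and counts shot attempts per team, the kind of tally that possession stats such as Corsi are built on.

```python
import sqlite3

# Hypothetical schema for illustration only; not Corsica's real structure.
SCHEMA = """
CREATE TABLE games (
    game_id   INTEGER PRIMARY KEY,
    season    TEXT NOT NULL,
    home_team TEXT NOT NULL,
    away_team TEXT NOT NULL
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    game_id    INTEGER NOT NULL REFERENCES games(game_id),
    period     INTEGER NOT NULL,
    seconds    INTEGER NOT NULL,   -- seconds elapsed in the period
    team       TEXT NOT NULL,      -- assumed: team taking the attempt
    event_type TEXT NOT NULL
        CHECK (event_type IN ('GOAL', 'SHOT', 'MISS', 'BLOCK', 'HIT', 'FACEOFF'))
);
"""

# Made-up events: (game_id, period, seconds, team, event_type)
SAMPLE_EVENTS = [
    (1, 1, 35, "OTT", "SHOT"),
    (1, 1, 80, "TOR", "MISS"),
    (1, 1, 120, "OTT", "BLOCK"),
    (1, 2, 15, "OTT", "GOAL"),
    (1, 2, 300, "TOR", "HIT"),
]


def build_db(rows):
    """Create an in-memory database and load one made-up game plus its events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO games VALUES (1, '2015-16', 'OTT', 'TOR')")
    conn.executemany(
        "INSERT INTO events (game_id, period, seconds, team, event_type) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def shot_attempts(conn, game_id):
    """Count shot attempts (goals, saved shots, misses, blocks) per team."""
    query = """
        SELECT team, COUNT(*)
        FROM events
        WHERE game_id = ?
          AND event_type IN ('GOAL', 'SHOT', 'MISS', 'BLOCK')
        GROUP BY team
        ORDER BY team
    """
    return dict(conn.execute(query, (game_id,)).fetchall())


if __name__ == "__main__":
    conn = build_db(SAMPLE_EVENTS)
    print(shot_attempts(conn, 1))  # {'OTT': 3, 'TOR': 1}
```

Even in a toy version like this, every query depends on the column names and event categories chosen up front, which is why changing the structure after seasons of data have been loaded is so costly.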

By the time War on Ice shut down last month, the focus had shifted. The hardest work was done and the initial pressure of starting a new site had subsided; now it's all about improving.

Over the summer, Perry plans to add more tools to help fans go beyond possession data. There's a goals added model, somewhat like Wins Above Replacement, on the cards, along with a player comparison tool similar to Domenic Galamini's HERO charts. Corsica could also boast its own predictive models by next season, something that SAP and the NHL have been attempting to offer with little success.

Those efforts are already being noticed. There's been an uptick in traffic since War on Ice went down, and Ryan Lambert referred to Corsica as "the logical successor, with a bunch of other options and a clean, usable interface" in his June 15th edition of the Puck Daddy Countdown.

A good deal of the positive reaction likely stems from the context in which the site came to life. Following in the footsteps of a successful predecessor is never easy, but Corsica has yet to falter.

It's understandable that the end of War on Ice feels significant. The site was seamless, efficient, and coincided with a move toward the mainstream for hockey analytics. If you were curious about the numerical side of the game anytime during the past few seasons, odds are that you'd be directed to War on Ice. Whether you were in the press box and wanted to check how many scoring chances the first line generated or sitting on your couch looking for a flatlining shot chart to confirm your favourite team was hanging on for dear life through the entire third period, one site was your go-to source.

That influence, however, is not simply felt by bloggers and reporters—it extends to Corsica, too.

"I've never said or felt like this was a pitch to replace War on Ice. I just hope to offer some of the features that people would otherwise miss," Perry noted. "I think War on Ice is the best hockey stats site that there has ever been. But the big thing is not so much replicating the site or the layout or the features, but replicating their philosophy. What I really enjoyed most was the openness; their modus operandi was to share everything."

Manny Perry, plugging away at the newest and nerdiest stats. Photo courtesy Manny Perry

But even beyond the death of one site and the birth of another, there are countless resources out there.

From Hockey Stats' live in-game stats to Micah Blake McCurdy's data visualizations and everything in between, all sorts of analytical tools are available to anyone who wants to better understand the game. If anything, the end of War on Ice and the emergence of Corsica have confirmed what was already known: The stats community is in good hands, no matter who's putting in the innumerable hours behind the scenes. After all, there's nothing more central to sports than the numbers.

"I can't emphasize enough how important it is to have one of these sites active," Yost said. "Imagine a world where NHL.com was the only resource? The only people left would be live game trackers and, I guess, people trying to start up new sites. No one would ever use NHL.com.

"I can't imagine the work that goes into getting one of these sites live," he continued. "The cool thing is I think a lot of people do seem to comprehend the challenges these sites face, and people have helped in myriad ways: straight donations, ad-hoc troubleshooting in beta phases, recommendations for site adds. Still, it's an incredibly daunting task. I have crazy respect for the time that people have dumped into Behind the Net, Extra Skater, War on Ice, Hockey Analysis, [and] Corsica Hockey."