PHOTO CREDIT: DAVID BANKS – USA TODAY SPORTS
I, like many hockey nuts across Canada, have a subscription to the Athletic. It’s one of the few sports publications I’ve been willing to pay for, and that has a lot to do with their lineup of writers. Analytically oriented writers Tyler Dellow and Dom Luszczyszyn played a major role in that, and their landing Corey Pronman allowed me to discontinue my ESPN Insider subscription. Another major factor was the hiring of Justin Bourne, whose most recent gig was as a video coach for the AHL’s Toronto Marlies.
What got me excited about the Bourne addition is that his focus has previously been on scouting and systems analysis, an area in which I’d like to improve. I was genuinely obsessed with the systems articles that he used to produce for The Score before taking the job with the Marlies in 2015.
While I’m still looking forward to what Bourne has to say about hockey strategies and video analysis for both NHL and non-NHL players, one of his recent articles for The Athletic suggests to me that his knowledge of prospect statistics and analytics is lagging a bit behind. The piece, titled “When Evaluating Players, Gauging Opportunity They’ve Been Given is Crucial,” laments the lack of available statistics for prospects beyond raw points. (Paywall, obviously.)
What you’ll almost always look at first, is points. As you move down the ladder away from the NHL, it makes the most sense to start there, as they’re basically all we have to look at. For better or worse, we’re stuck with points when it comes to player evaluation.
As someone who has spent the last couple of years developing statistical models for prospects, and as someone who learned from predecessors on this site who were doing it years before me, this strikes a nerve. Essentially, it feels like I’m paying for an article to tell me that the work that I’ve been doing for the past two years doesn’t exist, which, needless to say, is an odd feeling.
Without a doubt, prospect statistics have moved far beyond looking at mere point totals, and they did so quite a while ago. Today I’ll be looking at the history of prospect analytics: how far they’ve come, and where they’re going. I realize that the regular readership at Canucks Army probably doesn’t need any convincing of this, as they’ve seen endless posts on prospect data already. But apparently that hasn’t extended all across the league yet. I don’t know how many people in the industry are on the same page as Justin Bourne, but for all those folks in the hockey community who aren’t aware, it’s time to bring you up to date on the world of prospect analytics.
For those, like myself, who came onto the hockey analytics scene long after it started, it seems like a great many of the early advances were achieved by a small number of people. Names like Benjamin Wendorf, Rob Vollman, Iain Fyffe, and Gabriel Desjardins dominate the oldest references, and prospect analysis is no different.
One of the earliest forms of statistical analysis of prospects is The Projectinator, a model created by Iain Fyffe for Hockey Prospectus way back in 2009. In the original post, Fyffe began to explore the idea of era-adjusting Canadian Major Junior statistics. Era adjustments had been around for quite a while. Every hockey fan at some point gets into a debate about how many points Wayne Gretzky would score nowadays, or how many points Sidney Crosby would have scored in the mid-1980s. Era adjustments – performed in their simplest manner by dividing the current goals per game rate by that of any desired era, and applying the resulting coefficient to the player in question’s point rate – allowed us to compare the 1980s and the 2010s objectively. The same can be done for prospects, so long as you can capture the data.
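The simplest form of era adjustment described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Fyffe's actual method; the function name and the league scoring rates are made up for the example.

```python
def era_adjust(points_per_game, era_goals_per_game, target_goals_per_game):
    """Scale a point rate from the player's own era to a target era.

    The coefficient is the target era's league scoring rate divided by
    the scoring rate of the era the player actually played in.
    """
    coefficient = target_goals_per_game / era_goals_per_game
    return points_per_game * coefficient


# Hypothetical league-wide scoring rates (goals per game), for illustration only.
high_scoring_era = 8.0
low_scoring_era = 6.0

# A 2.0 points-per-game player from the high-scoring era,
# expressed in the low-scoring era's scoring environment.
adjusted = era_adjust(2.0, high_scoring_era, low_scoring_era)
print(round(adjusted, 2))
```

The same function works in either direction: swap the two league rates to translate a modern player back into a higher-scoring era.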
Up and Coming became a regular series on Hockey Prospectus, and Fyffe continued to evolve his ideas. Over the course of that first year, Fyffe explored concepts that would eventually become integral to prospect analysis, such as height bias, within-year age, and using comparables, as well as taking numerous cracks at projecting players, and even goaltenders. The Projectinator has reappeared numerous times over the years, and had its very own chapter in Rob Vollman’s guide to hockey analytics, Stat Shot (which you absolutely should own a copy of).
A couple of years later, Gabe Desjardins, then of Arctic Ice Hockey, used a cohort of OHL players born between 1965 and 1985 to estimate approximate likelihoods of success for Mark Scheifele and Sean Couturier. He also brought visualization to the table with this wonderful chart.
The Legacy of Canucks Army
Canucks Army has been at the forefront of prospect analysis for quite some time, going back long before I began to write here. Much of the original work was done by former editor Rhys Jessop (now employed by the Florida Panthers). His work was built upon by former Jets Nation site editor Garret Hohl (now of HockeyData Inc.), and their work became the foundation for PCS, the brainchild of Cam Lawrence and Josh Weissbock (now both of the Florida Panthers), and the predecessor of our current cohort model, pGPS.
The 2014 draft was huge for the Canucks, who were picking 6th overall, their highest draft slot since selecting Cody Hodgson 10th overall in 2008. There was a lot on the line for the Canucks, and the members of Canucks Army at the time were highly involved in their own search for the best prospect available at that position. It was at this time that Jessop applied some fairly basic statistical principles to publicly available data and ventured into the realm of adjustments.
Moreover, Jessop introduced age adjustments as an attempt to account for the apparent discrepancy in ability between the average October-born player (very early for their draft year) and August-born player (very late for their draft year). In leagues like the CHL, which span approximately five years of ages and include major development years, the differences within a single calendar year can be large. They can also be quantified.
After exploring the extreme inefficiency of the Canucks drafting under Ron Delorme, and after demonstrating that a made-up summer intern constrained by the simplest of rules could, in theory, outperform many of the NHL’s scouting departments, Jessop kicked things up a notch, following Fyffe on the path of assessing players based on what similar players had achieved in the past – but with a new degree of accuracy.
Cohorts are used in all manner of social sciences, predicting future behaviour based on studies of what has happened before. Applying such a concept to sports seems like a no-brainer now, but at the time it was truly groundbreaking. Cam Lawrence, along with enigmatic programmer and prospect junkie Josh Weissbock, turned Jessop’s experiments into a full-scale statistical model, complete with Euclidean math and testable results. The system, in addition to the promise of what it could become in the future, opened the door for them to work for an actual NHL front office, and go from writing about prospects to actually playing a role in drafting some of them.
Modern Prospect Analytics
It’s been a couple of years since we lost PCS, and I’ve done my best to fill that void with my own prospect cohort model. pGPS has been used hundreds of times on this website since being unveiled in the spring of 2016, including in the Nation Network’s prospect rankings for both the 2016 and 2017 drafts, and for the vast majority of prospect articles published here since then.
pGPS barely resembles the last public incarnation of PCS, or the first public version of pGPS for that matter, apart from the very foundation of the concept and the use of Euclidean distance. There is nothing in the formula that hasn’t had adjustments applied to it, and what used to be a three-factor formula now contains nearly three times that many, all with various weights applied to them.
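For anyone who hasn't seen a cohort model before, the general idea can be sketched as follows. To be clear, this is not the pGPS or PCS formula: the three features, the weights, the distance cutoff, and the historical players are all invented for illustration, and a real model would scale its inputs and use far more data.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature tuples."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def cohort_success_rate(target, history, weights, max_distance):
    """Share of sufficiently similar historical players who became NHLers.

    `history` is a list of (features, became_nhler) pairs.
    Returns (success_rate, cohort_size); the rate is None for an empty cohort.
    """
    cohort = [
        became_nhler
        for features, became_nhler in history
        if weighted_distance(target, features, weights) <= max_distance
    ]
    if not cohort:
        return None, 0
    return sum(cohort) / len(cohort), len(cohort)


# Hypothetical features: (age, height in inches, era-adjusted points per game).
history = [
    ((17.5, 72.0, 1.10), True),
    ((17.6, 73.0, 1.05), False),
    ((17.4, 71.0, 1.20), True),
    ((18.4, 76.0, 0.40), False),
]
target = (17.5, 72.0, 1.12)
weights = (1.0, 0.1, 4.0)  # assumed weights, not taken from any real model

rate, size = cohort_success_rate(target, history, weights, max_distance=0.5)
print(size, rate)
```

The weights are where most of the judgment lives: they decide how much a gap in scoring counts relative to a gap in age or height when deciding who belongs in the cohort.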
But I think the biggest advancement I’ve made has been visual. The bubble charts generated by the pGPS program have proven quite popular, as have others such as year-to-year progression charts and cohort line distribution graphs.
Of course, pGPS is just one way we look at prospects here. Our arsenal of prospect statistical models now also contains SEAL-adjusted statistics (adapted and improved from Garret Hohl’s original concept, with his permission and guidance), and the use of game sheets for a variety of purposes, including on-ice data, ice time estimation, rate stat estimation, teammate adjustments, quality of competition, and more.
And that’s just us at Canucks Army. We can’t talk about modern prospect analytics without making reference to prospect-stats.com, currently the most comprehensive free database for prospects from the CHL, the USHL, and the AHL. Then there are those who are out there performing research on prospects en masse, like Namita from Hockey Graphs, or on a more individualistic basis, like Josh Khalfin from Blue Seat Blogs.
Assessing and Accounting for Opportunity
While it should be incredibly clear at this point that prospect statistics have evolved far beyond mere points, we haven’t yet addressed the other point of Bourne’s article: the need for assessing and accounting for opportunity.
I agree with everything that Bourne says about the nuances of opportunity and how they relate to fluctuating point totals, which include but are not limited to:
- Ice time
- Quality of teammates
- Power play time
While he’s also right about some non-NHL stats being notoriously unreliable (or outright absent), we do have the ability to estimate each of the above factors, again using pretty simple statistical techniques.
Game sheet analytics, which I touched on above, are something that Dylan Kirkby (our resident programmer) and I were able to get working just before the NHL draft this year, but others have been using them for quite a while. Prospect-stats.com is entirely based on them, as was the now-defunct chl-stats.com.
The value of game sheets is in the fact that, for many leagues, they list the players who were on the ice for each goal that was scored. In small samples, this is incredibly volatile and largely influenced by extraneous variables, but over the course of a season, you can get some pretty reliable conclusions.
The distribution of situational scoring is one of the main factors included in SEAL adjustments (the ‘S’ is for situational), and the system has built-in adjustment coefficients for each type of point (from 5-on-5 goal to 5-on-4 secondary assist) that are based on which rates correlate strongly with eventual NHL production.
Ice time may be estimated by determining the percentages of team events (goals for plus goals against) a player was on the ice for, and multiplying that by the amount of available ice time. This can be done for 5-on-5 time, as well as for special teams.
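As a rough sketch of that estimate, assuming the on-ice goal counts have already been pulled from the game sheets (the function name and every number below are hypothetical):

```python
def estimate_toi(player_events, team_events, available_minutes):
    """Estimate ice time from a player's share of on-ice team goal events.

    player_events: goals for plus goals against with the player on the ice
    team_events: the team's total goals for plus goals against
    available_minutes: minutes the team played in that game state
    """
    if team_events == 0:
        return 0.0  # no events recorded, so nothing to base an estimate on
    return player_events / team_events * available_minutes


# Hypothetical season: a player was on the ice for 60 of his team's
# 180 5-on-5 goal events, and the team averaged 48 minutes of 5-on-5 play.
estimated_5v5 = estimate_toi(60, 180, 48.0)
print(round(estimated_5v5, 1))
```

Running the same function on power play or penalty kill events, with the matching pool of available minutes, gives the special teams estimates.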
Frequent linemates can also be found by keeping track of how often two players are on the ice for events together. We can also then determine things like goal shares and point rates for when two players are together and apart. The resulting WOWY charts have seen plenty of use on this site, including during our recent Top 20 Prospect Rankings series.
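A bare-bones version of the with-or-without-you calculation might look like the sketch below. The event format (a flag for whether the goal was for or against, plus the set of a team's players on the ice) is an assumption about how parsed game sheet data could be stored, and the players and goals are invented.

```python
def goal_share(events):
    """Fraction of goal events that were goals for; None if there are no events."""
    if not events:
        return None
    goals_for = sum(1 for is_goal_for, _ in events if is_goal_for)
    return goals_for / len(events)


def wowy(events, a, b):
    """Goal shares for players a and b together and apart.

    Each event is (is_goal_for, on_ice), where on_ice is the set of the
    team's players on the ice when the goal was scored.
    """
    together = [e for e in events if a in e[1] and b in e[1]]
    a_without_b = [e for e in events if a in e[1] and b not in e[1]]
    b_without_a = [e for e in events if b in e[1] and a not in e[1]]
    return {
        "together": goal_share(together),
        "a_without_b": goal_share(a_without_b),
        "b_without_a": goal_share(b_without_a),
    }


# Invented goal events for a single team.
events = [
    (True, {"A", "B", "C"}),
    (True, {"A", "B", "D"}),
    (False, {"A", "B", "C"}),
    (True, {"A", "D"}),
    (False, {"A", "C"}),
    (False, {"B", "C"}),
]
result = wowy(events, "A", "B")
print(result)
```

With only a handful of goals the shares swing wildly, which is the small-sample volatility mentioned earlier; over a full season they start to say something about who is driving results.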
It doesn’t stop there though, as game sheet data can be manipulated further to estimate quality of teammates, quality of competition, rate stats, and more.
But if you’re the type that simply can’t shake the skepticism involved in estimating stats like this, there’s a solution for that too.
HockeyData Inc. and the Future of Prospect Analysis
Enter HockeyData Inc., the creation of the aforementioned Garret Hohl and business partner Cole Gawenda. The two have created an analytics company that prominently features prospect stats, with a team of employees who watch endless amounts of hockey games and track all the minutiae. That includes actual time on ice, on-ice shot differentials (Corsi), and shot and pass location data that allows for expected goal and assist statistics that go far beyond what is publicly available for the NHL, let alone leagues below it.
HockeyData’s collection of statistics is unfortunately unavailable for public consumption. So time-consuming is the work involved that it only makes sense to charge money for it, and so valuable is the output that plenty of teams are willing to pay. News of a contract with the Washington Capitals made the rounds shortly before the summer, but the company had plenty of NHL clients before then, and they’ll continue to pick more up as time goes on.
Because HockeyData’s information isn’t freely available, perhaps it doesn’t really fall into the category that Bourne was discussing in his article – hockey fans googling players that their teams have picked up, or simply shown interest in. Even without proprietary information like that though, the research and data currently out there for non-NHL prospects are pretty astounding, and they’re not something that you should turn a blind eye to. While some may not be satisfied by their lack of certainty, the amount of context that they can provide is undeniable, and they help to answer a lot of the questions and concerns that Bourne put forth. It’s time to get on board with this stuff, because it’s come a very long way, and it’s only getting better year after year.