An Evaluation of the Total Hockey Rating (THoR), Part III

This is the third part of a series that looks in more depth at the Total Hockey Ratings (THoR) that we developed. In Part I, I looked at the year to year correlations for THoR, Gabe Desjardins’ Corsi Rel and (HART). Part II discussed validity, that is, the relationship between the response in the THoR model, NP20, and winning. In this part, we’re going to take a look at some individual players through the lens of THoR. We’ll look at some players that have come to be associated with THoR and at some others for whom THoR gives us some clarity about their value. Let’s start with the two players most associated with THoR: Alexander Steen and Tyler Kennedy. We caught plenty of flak for rating these two players highly when THoR came out.

Here we’re going to use the value of a player over 4000 even strength plays. That’s about the number that Zach Parise played in his last full year with New Jersey, or that Ryane Clowe played with San Jose in 2011-12, the last full NHL season. It’s about what we would expect from a 2nd line player over the course of a full NHL season. Note also that the outcome of THoR is a per play value for each player that is adjusted for factors such as the other players on the ice with a given player. We inflate that value to a total of 4000 plays to get numbers that are meaningful and easier to interpret. There will be some fluctuation from year to year in player performance, and factors such as player aging also come into play. We’re also going to assess players on their wins above replacement value. Note that a win is worth approximately $2MM.

We’ll start with Tyler Kennedy. The original THoR paper had him as a top player for 2010-12. He’s generally been a third liner with Pittsburgh until this year. He was traded to San Jose prior to the NHL draft for a 2nd round pick. The line for Kennedy over the four years for which we have THoR is: 0.8, 2.2, 4.0, 2.1 wins above replacement per year, where we’re using half a win as the difference between average and replacement. Those four numbers are the even strength wins above replacement for the 2009-10, 2010-11, 2011-12 and 2012-13 NHL seasons. Obviously the 2012-13 season is based upon less data than the others. Note that even strength is not the whole picture of player value, but it is one that gives us some consistency. In calculating these values we are assuming the same number of plays, 4000, for each player each year. The performance here for Kennedy has been relatively consistent and top end since the 2009-10 season. A high-end THoR player can account for 3 or 4 wins above replacement at even strength using this methodology. He ranked highly on our radar for the middle two years. The most he played was in 2010-11, when he had just under 4000 plays. I can’t say it is a surprise that his play in 2012-13 regressed. We expect some year to year regression, though THoR does have high year to year correlation. The Penguins were obviously disappointed with him, making him a healthy scratch for good portions of the NHL playoffs. As with most metrics, we should expect that Kennedy going forward will produce something like the average of his THoR values to date.
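
As a rough illustration of that last point, the sketch below averages Kennedy’s four even strength values and also converts the result to wins above average using the half-win gap between average and replacement. The function names are illustrative and not part of THoR, and the plain unweighted mean is an assumption (it ignores that the 2012-13 value rests on less data).

```python
from statistics import mean

# Even strength wins above replacement per 4000 plays,
# 2009-10 through 2012-13 (values quoted in the text).
KENNEDY_WAR = [0.8, 2.2, 4.0, 2.1]

# The text uses half a win as the gap between average and replacement.
REPLACEMENT_GAP = 0.5


def naive_projection(war_by_season):
    """Unweighted mean of past seasons (an assumption, not THoR's method)."""
    return mean(war_by_season)


def wins_above_average(war):
    """Convert wins above replacement to wins above average."""
    return war - REPLACEMENT_GAP


projection = naive_projection(KENNEDY_WAR)
print(round(projection, 3))                      # naive going-forward estimate
print(round(wins_above_average(projection), 3))  # same estimate relative to average
```

A weighted mean that discounts the shortened 2012-13 season would be a natural refinement, but the weights would be a further assumption.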

Next up is Alexander Steen. We had him very high, and recently a small sample survey of NHL players by ESPN found him to be the most underrated player in the NHL. THoR has him at 1.3, 3.1, 4.5, 0.9 wins above replacement for each of the last four seasons. Steen has had some injuries and so, like Kennedy, has never played 4000 plays in a single season. Now one might suggest that the numbers for these two players are a result of smaller sample sizes, but the ridge regression that we use in the model has shrinkage, which deflates the ratings of players with smaller sample sizes. So that’s not it. Steen has gotten a good deal of attention of late since he has started the season at a torrid scoring rate. THoR is not affected by this, since we give a shot the same value (the probability that it goes in) whether or not it results in a goal. There is definitely finishing ability at the NHL level, but it is hard to determine that from a single year’s worth of data. We’ll continue to look at finishing, but for now the model does not include it.
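
To see why shrinkage works against small-sample flukes, consider a deliberately simplified one-player version of ridge regression. This is a sketch and not the THoR model: it assumes a single player indicator with no other covariates, and the penalty `lam` and the per-play outcomes are made-up numbers. In that setting the ridge estimate has a closed form, the raw average multiplied by n / (n + lam), so fewer plays means a stronger pull toward zero.

```python
def ridge_player_estimate(outcomes, lam):
    """Ridge estimate for a single player's indicator coefficient.

    Minimising sum((y - beta)**2) + lam * beta**2 over the player's plays
    gives beta = sum(y) / (n + lam), i.e. the raw mean times n / (n + lam).
    """
    n = len(outcomes)
    return sum(outcomes) / (n + lam)


# Hypothetical per-play outcomes: both players average 0.02 per play.
small_sample = [0.02] * 500    # player with few plays
large_sample = [0.02] * 4000   # player with a full season of plays
lam = 1000.0                   # hypothetical penalty strength

small_estimate = ridge_player_estimate(small_sample, lam)
large_estimate = ridge_player_estimate(large_sample, lam)

print(small_estimate)  # pulled strongly toward zero
print(large_estimate)  # pulled much less
```

With identical raw averages, the player with 500 plays ends up with a much smaller rating than the player with 4000, which is the sense in which shrinkage deflates small-sample ratings.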

So far we have looked at players that were surprising from the original THoR paper. And, no doubt, those need to be assessed. But, to paraphrase something I’ve seen attributed to Bill James, any new metric that is worthwhile should be mostly things we already know but also some surprises. Steen is no longer a surprise; Kennedy does well at getting pucks to the net. If we look at some of the other top players from THoR, we can see the year to year consistency of this metric. From A. Kopitar (1.6, 2.3, 3.2, 3.5) to H. Zetterberg (2.8, 2.3, 2.5, 2.3) to Z. Parise (5.1, NA, 4.8, 4.0) to D. Sedin (3.2, 4.1, 3.7, 4.3) to J. Toews (2.5, 2.7, 3.3, 3.8), there is good consistency in who some of the elite players in the NHL are. The NA for Parise in the ’10-’11 season is because he did not appear in enough games (due to injury) to have a rating.
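
One quick way to eyeball that consistency is to compute each player’s mean and the spread between his best and worst rated seasons. The sketch below does this with the values quoted above. Using max minus min as the measure of spread is a crude, illustrative choice and not part of THoR, and the missing Parise season is simply skipped.

```python
from statistics import mean

# Even strength wins above replacement per 4000 plays, 2009-10 through 2012-13.
# None marks a season without a rating (Parise, 2010-11).
RATINGS = {
    "A. Kopitar": [1.6, 2.3, 3.2, 3.5],
    "H. Zetterberg": [2.8, 2.3, 2.5, 2.3],
    "Z. Parise": [5.1, None, 4.8, 4.0],
    "D. Sedin": [3.2, 4.1, 3.7, 4.3],
    "J. Toews": [2.5, 2.7, 3.3, 3.8],
}


def summarize(values):
    """Return (mean, max - min) over the seasons that have a rating."""
    rated = [v for v in values if v is not None]
    return mean(rated), max(rated) - min(rated)


SUMMARY = {player: summarize(values) for player, values in RATINGS.items()}

for player, (avg, spread) in SUMMARY.items():
    print(f"{player:14s} mean={avg:.2f} spread={spread:.1f}")
```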

The next player we’ll consider is Rob Scuderi. (Sorry to kick a guy when he is injured.) Some Penguins folks weren’t happy when I said, in an interview with a Boston newspaper this past fall, that his signing stood out. Scuderi will make an AAV of $3.375MM per year for four years, which means he ought to be worth a couple of wins per year. Scuderi’s THoR values come in at 0.5, -0.8, -0.1, 0.0. Those negative values mean he is performing below replacement level according to THoR. As some critics have pointed out, Scuderi had positive Corsi values while with the Kings, but if you account for who he was playing with (a very high possession Kings team) as well as who he played against, Scuderi was not adding to the bottom line. Hence his THoR numbers. Of course, he had better numbers than Douglas Murray, whom the Pens also acquired. Murray’s numbers were: -0.1, -0.5, -3.4, -2.6. Maybe pairing Scuderi with Orpik (-0.7, -0.7, -1.7, -2.8) will improve them both. I guess regression to the mean beckons. Meanwhile Pittsburgh seems to have demoted Kris Letang (3.4, 3.0, 1.7, 3.9).
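
The contract arithmetic for Scuderi can be made explicit. The sketch below divides his AAV by the roughly $2MM-per-win figure used in this series and compares the result with the mean of his THoR values. The helper name is illustrative, the linear dollars-per-win conversion is a simplifying assumption, and the THoR values cover even strength only.

```python
DOLLARS_PER_WIN_MM = 2.0              # approximate value of a win, in $MM
SCUDERI_AAV_MM = 3.375                # average annual value of the contract, in $MM
SCUDERI_WAR = [0.5, -0.8, -0.1, 0.0]  # even strength, 2009-10 through 2012-13


def wins_implied_by_salary(aav_mm, dollars_per_win_mm=DOLLARS_PER_WIN_MM):
    """Wins per year a salary pays for under a linear dollars-per-win assumption."""
    return aav_mm / dollars_per_win_mm


implied = wins_implied_by_salary(SCUDERI_AAV_MM)
observed = sum(SCUDERI_WAR) / len(SCUDERI_WAR)

print(f"implied wins per year: {implied:.2f}")
print(f"mean THoR wins above replacement: {observed:.2f}")
print(f"gap: {implied - observed:.2f}")
```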

As I was preparing this series on THoR, Scott Cullen had a Twitter discussion about the value of Robyn Regehr.

@MacSapintosh That’s evident in his Corsi alone, but they play him against toughest opponents and he gets most D-zone starts of Kings’ D.

— Scott Cullen (@tsnscottcullen) November 17, 2013

That discussion caught my eye. Essentially Cullen was saying that it is hard to judge Regehr since he is playing the toughest ice time and often starting in his own zone. Now we know that both of those things (zone starts and quality of opponents) matter for evaluating a player. One of the great things about THoR is that we can use our methodology to account for just those factors (and the others in the model). If we do that we get the following for Regehr: -1.2, 0.9, 0.4, -1.2. Not exactly lighting the world on fire.

So we’ve looked at some individual players using even strength THoR. THoR ratings are going to vary from year to year but there is relatively high correlation in these player ratings. For more details see Part I of this series. In mid-December we’ll have enough data from the current season and the previous one to release the latest THoR. Meanwhile we’ll keep hammering away.
