So the Hockey Analysis Group over on Yahoo has been having a discussion concerning hockey’s plus/minus values and how they can be used. We, the SSC group and a student, Dennis Lock, have done an analysis that advances the plus/minus idea. Our premise is that every play should tell us about the players on the ice. Therefore, we looked at a more general model that goes beyond goals and accounts for every play from the ’06-’07 season. (There were 359,322 plays.) We determined from the NHL play-by-play files and time on ice (TOI) files who was on the ice for each play. (There were over 1000 different players who took the ice in ’06-’07.) Note that this was a ton of work on the programming side to get the data to work out. Thanks to CK for that!
Formally our model is :
where Y(i) is the value of play i and q(t) is the rating of player q on team t (t=h for the home team and t=a for the away team) who was on the ice for play i. This can be viewed as a linear regression (with a ton of indicator variables for who is on the ice), and the resulting q(t) for each player gives us a rating for that player that adjusts for the other players on the ice. (This is similar to the model given by 82 games for basketball.)
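To make the "regression with a ton of indicator variables" idea concrete, here is a minimal sketch, not our actual code. It assumes one particular coding that is consistent with the description above: each play's value is modeled as the sum of the ratings of the home players on the ice minus the sum of the ratings of the away players, plus noise. The function name, the `plays` layout and the integer player ids are all made up for illustration.

```python
import numpy as np

def fit_adjusted_plus_minus(plays, n_players):
    """Least-squares player ratings from play values (illustrative sketch).

    plays: list of (value, home_ids, away_ids), where value is Y(i) and the
    id lists hold the (hypothetical) integer ids of the players on the ice.
    """
    X = np.zeros((len(plays), n_players))
    y = np.zeros(len(plays))
    for i, (value, home_ids, away_ids) in enumerate(plays):
        X[i, list(home_ids)] = 1.0    # indicator: on the ice for the home team
        X[i, list(away_ids)] = -1.0   # assumed coding: away players enter with -1
        y[i] = value
    # lstsq returns the minimum-norm solution when X is rank deficient
    ratings, *_ = np.linalg.lstsq(X, y, rcond=None)
    return ratings
```

One thing this sketch makes visible: when both teams always have the same number of players on the ice, every row of the matrix sums to zero, so the ratings are only determined relative to each other. The minimum-norm solution that `lstsq` returns is one reasonable way to pin them down.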
We used the NHL definition of a play, which includes: shot, blocked shot, hit, takeaway, giveaway, faceoff win/loss in each zone, etc. We assigned values to plays based upon the probability that the play led to a goal for the home team within the next 10 seconds minus the probability that it led to a goal for the away team within the next 10 seconds. That is,
Y(i) = P(event i leads to a goal for Home Team in next 10 secs)
- P(event i leads to a goal for Away Team in next 10 secs).
Goals were assigned a value of Y(i) = 1 or -1, and taking a penalty was assigned the negative of the league-average power play scoring percentage. Working with an NCAA Div I hockey coach, we chose 10 seconds because there was evidence that after about 8 seconds an event no longer impacts the probability of a goal.
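The play values above could be estimated empirically along the following lines. This is a rough sketch under assumptions of our own choosing, not the code we ran: events and goals are given as simple lists of timestamps in seconds, event types are already labeled from the home team's perspective (names like `"home_shot"` are hypothetical), and the value of an event type is just the observed frequency of a home goal in the following 10 seconds minus the observed frequency of an away goal.

```python
from collections import defaultdict

WINDOW = 10.0  # seconds, the look-ahead window used in the post

def estimate_play_values(events, goals, window=WINDOW):
    """Estimate Y for each event type from raw frequencies (illustrative).

    events: list of (time_sec, event_type)
    goals:  list of (time_sec, team) with team equal to "home" or "away"
    """
    # per event type: [occurrences, followed by home goal, followed by away goal]
    counts = defaultdict(lambda: [0, 0, 0])
    for t, kind in events:
        c = counts[kind]
        c[0] += 1
        if any(team == "home" and t < g <= t + window for g, team in goals):
            c[1] += 1
        if any(team == "away" and t < g <= t + window for g, team in goals):
            c[2] += 1
    # P(home goal within window) - P(away goal within window)
    return {kind: (h - a) / n for kind, (n, h, a) in counts.items()}
```

The double loop is quadratic and only fine for toy data; with a full season of plays you would sort the goals and use a binary search or a merge instead.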
Thus, we have ratings for each player that say how much a given player contributes, having accounted for the players they were on the ice with. In particular, the ADJUSTED +/- Rating tells us how much a player increases (or decreases) the likelihood that a goal will be scored in the next 10 seconds if they were paired with an average lineup playing against an average lineup. Note that for a randomly selected 10 second interval the probability of scoring a goal is 0.008.
Some player ratings
| Ranking | Player | Team | Adjusted +/- Rating |
Above we’ve presented a table with the Top 20 players (and some other notables) who had more than 1000 minutes of TOI, ranked by this Adjusted +/- for the 2006-07 season. Again, this methodology takes account of the value of all the plays that a player was on the ice for and then adjusts their rating based upon who they were on the ice with. If Selanne was on the ice with an average lineup for him and an average lineup against him, he would still increase the likelihood of a goal in the next 10 seconds by 0.02671. Chelios, on the other hand, barely adds any value above an average player, and Robitaille is decreasing the likelihood of scoring in a given 10 second interval. Note that Crosby, who won the Hart Trophy this year, was only considered to be 5th best in these rankings. We note, however, that small differences here can have a large impact given the total number of 10 second intervals that players played.
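To see why small differences matter, here is a back-of-the-envelope calculation. It is only a rough illustration: it chops the 1000-minute TOI cutoff into non-overlapping 10 second intervals, treats the rating as simply additive across them, and uses a made-up rating gap of 0.001 between two players.

```latex
\begin{align*}
\text{intervals in 1000 minutes} &= \frac{1000 \times 60\ \text{s}}{10\ \text{s}} = 6000 \\
\text{goals at the baseline rate} &\approx 6000 \times 0.008 = 48 \\
\text{extra goals from a gap of } 0.001 &\approx 6000 \times 0.001 = 6
\end{align*}
```

So under these simplifying assumptions a gap in the third decimal place is already worth several goals over 1000 minutes of ice time.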
Staal at the top of the list is a surprise, but what it means is that better events happened when he was on the ice relative to what we would have expected given who he was on the ice with. Thus, he adds significant value to a team.
Again, this rating is what a player contributes after adjusting for who they played with. Thus we see some other surprises in this list, Moen, Talbot and Staal among them. What is likely happening here is that the model is adjusting for who they were on the ice with and against.
We think this is a better way to assess the overall performance of a player because it deals not just with goals but with all of the events associated with a player and the likelihood that those events lead to a goal. This approach also accounts for being on a good line. The rating is based upon how much you contribute over and above the lines that you were paired with and paired against.
At the moment there are some drawbacks to this set of rankings. Currently all events are treated equally, and we should probably deal with power plays in a different manner. We can update that by running separate regressions for power plays. We are slashing (pun intended) perhaps a little too heavily for being on the ice during a penalty. We’d like to differentiate between individual penalties and team penalties and charge individual players when appropriate. Additionally, using the result of a play minus its expected result would be an alternative and possibly better play value.
The NHL changed their format for reporting play-by-play and TOI for the 07-08 season, so we’ll have to go back and rewrite a ton of code to get these ratings for that season.
FINAL NOTE: These results were recently presented, along with a group of other results by students, at the Joint Statistical Meetings and were awarded Best Poster by the Section on Statistics in Sports.