The conference will be held January 16th, 2016 at Porter Hall/231 University Centre, Carleton University, in Ottawa, Ontario. The conference webpage (including registration) can be found here:

http://statsportsconsulting.com/ottanalytics16/

Michael Schuckers, professor of statistics at St. Lawrence, who will speak at the conference, says hockey analytics has become a way to better understand the factors that impact the outcome of hockey games and hockey seasons. “The conference is an opportunity to meet and hear from some of the top minds in hockey analytics,” he said. “This year, we are especially excited to have IBM Analytics as a sponsor and presenter and to have a session on player tracking, which will cover work on capturing player location data during games. For the casual fan, this is an opportunity to learn more about the sorts of analytics that are being used to evaluate teams and players.”

One of the highlights of the conference will be a session on player tracking data and its role in the future of analytics.

This year the lead sponsors for the conference will be Carleton University and the Ottawa Senators. The other sponsors for the conference are IBM Analytics, St. Lawrence University (Canton, NY), the American Statistical Association Section on Statistics in Sports, Carleton University Athletics and the Statistical Society of Ottawa.

There will be a social event for conference attendees at Sens House the night before the conference.

This 2nd annual Ottawa Hockey Analytics Conference will consist of talks on the analysis of hockey along with time for socializing and networking. The event will be similar to hockey analytics events previously held in Calgary and Pittsburgh.

The webpage for last year’s version of this conference can be found at: Ottawa Hockey Analytics Conference 2015.

]]>So there has been a bunch of discussion lately about tanking in the NHL. Seems pretty clear that this has happened before and is happening again this year. Puck Daddy doesn’t mind it. Grantland’s Sean McIndoe calls it a ‘disastrous situation’ and has some suggestions including one (Option 11) for a postseason tournament. The idea of a postseason tournament is nice in theory but I wanted to sit down and see if it was possible to build a tournament which gave worse teams better odds of winning the tournament. So I set out to put together a concrete proposal for a tournament. Though it would clearly need NHLPA approval, that tournament could run concurrently with the Stanley Cup playoffs. It might even be good to have it start after a small layoff when there are fewer games per night. Sponsorship might be fun: The Second Cup, anyone? Or the Last Chance Cup sponsored by the New York Lottery.

Below is the full proposal for a single elimination tournament to determine the first round NHL entry draft order. The idea here is to award the first pick in the NHL draft to the team that wins a tournament. I’ve laid out the full proposal in the pdf below. The key to the process is to design a tournament that will have the highest probability of winning going to the worst team. That takes some rethinking of the usual tournament design. This proposed tournament is represented in the graph at right where the worst team is listed as 14, the second worst is 13 and the best team (to not make the playoffs) is 1.

- Draft order is determined by how far a team progresses in the tournament.
- There are guarantees that the worst team will get no worse than the 3rd pick, the 2nd worst team no worse than the 4th pick.
- Games are played at the rink of the lower seed, again to provide worst teams with better odds of progressing through the tournament.
- Draft ordering could be different for rounds beyond the first.
- It is possible to extend the model to tournaments for 15 or 16 teams when/if expansion occurs.
- There would be additional revenue from ticket sales and TV rights.
- Playoff OT rules apply though I would be open to the possibility of using the shootout to determine a winner if it would benefit worse teams and I suspect that it does (unless you’re NJD circa 2014).
- Player eligibility for the tournament is the same as for the Stanley Cup playoffs.

As McIndoe noted in his Grantland piece, there are some potential issues, including the questionable integrity of games. One remedy might be to consider monetary incentives (new Hondas for every member of the winning team, maybe), created by rewarding players and teams financially for continuing to win. There are some tiers here for teams and so it is possible that a team in the 8th worst spot might ‘tank’ for the 9th spot.

Here is a pdf of the proposal with some additional verbiage.

NHLDraftSingleEliminationTournament

]]>

Last night I did a webinar on Statistical Methods for the Analysis of Hockey for St. Lawrence University Alumni and friends. I had a blast. There were lots of good questions and lots of interest. Exactly what I would expect from a bunch of Laurentians.

Below is a link to the slides for the talk.

]]>Additional Details can be found on the workshop website below:

Workshop website: HERE

]]>

Our aim is to build a model which accounts for the relative differences between rinks on the events that are recorded. The rest of this post is a summarized version of our paper on this topic which is linked below. **The focus of this work is in making data recorded for the following events comparable from rink to rink: Blocks, Giveaways, Hits, Missed Shots, Shots, and Takeaways.** We also look at the recording of aggregated events that count as Corsi events, Fenwick events and Turnovers. The last of these was created by Schuckers and Curro to account for the home bias of Takeaways and Giveaways as part of the THoR paper.

The data that we use for this analysis comes from the nhlscrapr R package created by Thomas and Ventura and includes 6858 games from six regular seasons.

We used a statistical regression to model counts of events per game with several predictors **including team factors, average score differential, and rink.** The estimated rink effects that we derive can be used to reweight recorded events so that we can have comparable counts of events across rinks. Applying our methodology to data from six regular seasons (2007-08 through 2012-13), we find that for the most part NHL rinks (and the individuals therein) do a reasonably consistent job of recording events. This is especially true of the recording of SHOTs, which has the fewest rinks with significant recording issues and has the smallest rink effects that we found. We found that only St. Louis and Florida have rates for the recording of shots that differ significantly from other rinks. Florida counts shots at a rate that is about 3% higher than other rinks while St. Louis counts shots at a rate that is 4.5% lower than the rest of the league.

However, there are some rinks with rink effects that are significant and consistent across these seasons for other events. Zeroing in on blocks, hits and misses, there were four rinks that consistently inflated or deflated the counts of those events: Edmonton, Los Angeles, New Jersey and Toronto. There were also six rinks that were not significantly different from the rest of the league on those events. These were: Buffalo, Nashville, Pittsburgh, St. Louis, Tampa Bay and Vancouver.

While event counts are impacted by rink effects, ratios of events such as Corsi For Percentage remain relatively unaffected. The table below demonstrates how small the impact of rink effects is on the Corsi For %. This is despite our estimation that Boston, Columbus, Edmonton, Los Angeles, New Jersey, and Toronto all have significant rink effects for the counting of Corsi events. New Jersey is particularly egregious, undercounting Corsi events by about 16%. The reason that ratios are not impacted as much as counts is that the impact of a given rink is felt in both the numerator and the denominator for the home team.

**Table 1: Comparison of Corsi For % and Adjusted Corsi For % for 2012-13 NHL Regular Season**

Top 5 and Bottom 5 teams on Corsi For Pct.

| Team | Corsi For Pct. | Adjusted Corsi For Pct. |
| --- | --- | --- |
| L.A. | 0.5630 | 0.5628 |
| N.J. | 0.5592 | 0.5592 |
| BOS | 0.5433 | 0.5430 |
| CHI | 0.5414 | 0.5420 |
| DET | 0.5366 | 0.5365 |
| … | … | … |
| CBJ | 0.4711 | 0.4706 |
| NSH | 0.4668 | 0.4662 |
| BUF | 0.4512 | 0.4513 |
| EDM | 0.4458 | 0.4445 |
| TOR | 0.4408 | 0.4398 |
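The cancellation behind the stability of ratios can be illustrated with a small sketch. The totals and the rink effect below are made-up numbers chosen only for illustration, and the sketch assumes a rink scales both teams' recorded attempts by the same factor; it is not the model from the paper.

```python
def corsi_for_pct(cf, ca):
    """Corsi For % as a fraction: attempts for / (attempts for + attempts against)."""
    return cf / (cf + ca)

# Hypothetical "true" totals for one team's home games (made-up numbers).
true_cf, true_ca = 1100.0, 900.0

# Hypothetical rink effect: this rink records 16% fewer Corsi events than average.
rink_effect = 0.84

# Assumption: the rink's scorers apply the same bias to both teams' attempts.
recorded_cf = true_cf * rink_effect
recorded_ca = true_ca * rink_effect

true_pct = corsi_for_pct(true_cf, true_ca)
recorded_pct = corsi_for_pct(recorded_cf, recorded_ca)

# The counts shrink, but the ratio is unchanged because the factor cancels.
print(recorded_cf + recorded_ca, true_cf + true_ca)
print(round(true_pct, 4), round(recorded_pct, 4))
```

Under this assumption the home-game ratio is untouched. One plausible reason the season-long figures in Table 1 still shift slightly is that home and road games are recorded in different rinks, so the two sets of counts are scaled differently before they are combined.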

While ratios are not affected in a major way by rink effects, the counts of individual events recorded as part of the NHL’s RTSS system are. To illustrate this we looked at the blocked shots recorded by players during the 2012-13 NHL regular season. For those rinks where there are rink effects for blocked shots, we take each block and weight it by 1 divided by the rink effect. So for Nassau Coliseum, home of the New York Islanders (NYI), each block counts as 1/1.208 or 0.828 of a block since that rink overcounts blocks by about 20.8%. Similarly, since the rink effect for Anaheim is 0.721, each block in that rink counts as 1/0.721 or 1.387 of a block.
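The reweighting just described can be sketched in a few lines. The two rink effects are the ones quoted above; the function names, the default weight of 1 for rinks without a listed effect, and the example player are assumptions made for illustration.

```python
# Rink effects quoted in the text; values above 1 mean the rink overcounts blocks.
RINK_EFFECTS = {"NYI": 1.208, "ANA": 0.721}

def block_weight(rink, effects=RINK_EFFECTS):
    """Weight for one recorded block: 1 / rink effect.

    Rinks without a listed effect are assumed to need no adjustment (weight 1).
    """
    return 1.0 / effects.get(rink, 1.0)

def adjusted_blocks(blocks_by_rink, effects=RINK_EFFECTS):
    """Sum a player's recorded blocks, each weighted by the rink where it occurred."""
    return sum(n * block_weight(rink, effects) for rink, n in blocks_by_rink.items())

# Hypothetical player: 60 blocks recorded in Anaheim, 50 in rinks with no listed effect.
example_player = {"ANA": 60, "OTHER": 50}

print(round(block_weight("NYI"), 3))  # weight of a block recorded on Long Island
print(round(block_weight("ANA"), 3))  # weight of a block recorded in Anaheim
print(round(adjusted_blocks(example_player), 1))
```

Because each block is weighted by the rink where it was recorded, a player's adjustment depends on the mix of rinks he played in, not only on his home rink.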

We adjusted every block during the 2012-13 regular season for the rink in which it occurred, and the top players on adjusted blocks are given in the table below. Unlike the case for Corsi For %, there are substantial changes for the counts of block events. Here the top player on the revised list, Francois Beauchemin, was originally ranked 10th on the list of players with the most blocks. There are other large changes for players from the Islanders (Andrew MacDonald) and the Capitals (John Carlson) due to the relative counting of BLOCKs in those rinks. MacDonald and Carlson move from tied for third in raw BLOCKs to eighth and ninth, respectively, after our adjustment.

**Table 2: Comparison of Raw Block Counts and Adjusted Block Counts for 2012-13 NHL Regular Season**

Top 10 players based on Adjusted Block Counts

| Player | Team | Adjusted Blocks | Raw Blocks | Differential |
| --- | --- | --- | --- | --- |
| F. Beauchemin | ANA | 133.3 | 111 | 22.3 |
| G. Zanon | COL | 127.2 | 124 | 3.2 |
| D. Girardi | NYR | 120.7 | 125 | -4.3 |
| R. Hainsey | WPG | 120.5 | 123 | -2.5 |
| D. Seidenberg | BOS | 119.8 | 115 | 4.8 |
| L. Smid | EDM | 114.7 | 119 | -4.3 |
| B. Orpik | PIT | 110.3 | 114 | -3.7 |
| A. MacDonald | NYI | 109.8 | 123 | -13.2 |
| J. Carlson | WSH | 109.4 | 123 | -13.6 |

The full paper can be found HERE.

Note: Brian’s contributions to this project, with the exception of minor edits of the paper, were made while he was an Associate Professor in the Department of Mathematical Sciences at the United States Military Academy, West Point, NY, prior to joining the Florida Panthers.

]]>*Recently there has been a good deal of interest in the use of statistical methods and statistical thinking in sports. Work of this kind has been going on for many many years though often behind the scenes. With the publication of Michael Lewis’ Moneyball and the subsequent movie of the same name, the interest in this kind of work has grown extensively. Further evidence of this can be found in the increased number of conferences devoted to the topic and the increased use of these topics in the sports media. In this talk, I will begin with an overview of sports analytics including a discussion of the general approaches and methods. Having done that, I will discuss two examples from my own recent work: Rink effects in the NHL and NHL player ratings.*

Thanks to Shirley Mills at Carleton University for the invitation to speak and for lunch. Thanks to her students for some enlightening conversations. It was also nice to see some folks I know from the Canada Border Services Agency in the audience.

Here are the slides from that talk.

]]>A week ago (on August 6th), I appeared on a panel at the Joint Statistical Meetings in Boston along with Brian Macdonald, Andrew Thomas, Sam Ventura and Kevin Mongeon. The panel was a good one on some of the advanced statistical methods that are being used in hockey. Fluto Shinzawa of the Boston Globe attended the panel and wrote up a piece on it that can be found at this link. A rough audio recording of the panel can be found at Andrew Thomas’ website: www.acthomas.ca/?p=62. Andrew was the organizer of the session.

I spoke for about 5 minutes on a model for Rink Effects that Brian Macdonald and I developed. Rink effects are something that has been known about in Hockey Analytics for some time. The issue is that there has not been a way to correct for the differences between the recording of RTSS events at various rinks. This issue has been cited as one of the hurdles to adoption of hockey analytics. Here is an article with a quote from Chuck Fletcher of the Wild on data inconsistencies: Sportsnet.ca article. Fletcher’s wrong about the recording of events in Detroit relative to Minnesota, though Minnesota does undercount shots relative to the rest of the league. The bigger picture is that we need a way to adjust event counts. Thus, the goal of our paper is to develop tools to allow analysts to adjust RTSS events (Shots, Hits, Misses, Blocks) so that they are consistent from rink to rink. We’re not aiming for absolute truth (i.e. that what is counted as a hit is genuinely a hit) but rather for relative truth, meaning that events are counted, on average, the same from rink to rink.

My slides from a short presentation at the JSM Panel are found below. We, Brian Macdonald and I, are working on finishing the final paper before I present the full set of results at the Royal Statistical Society’s International Conference in Sheffield, UK at the beginning of September.

(This article was edited 8/28/14 to add the link to the Sportsnet.ca article.)

]]>

I’ve updated the Total Hockey Ratings (THoR) for the 2013-14 season. Both the Even Strength and the All Events ratings can be found on the THoR Page. Recall that THoR evaluates each play from the NHL’s RTSS system for its likelihood to lead to a goal in the subsequent 20 seconds. We then get a rating for players by accounting for their impact after adjusting for quality of teammates, quality of competition, zone starts, score effects, etc. The outcome metric here is Wins above replacement (WAR) relative to position. Details on the methodology can be found on the THoR Page. Below are some of the highlights from the posted files, which include the top 50 players.

**Even Strength**

Some interesting results here. I’ll start with EV first. The top ten here are A. Kopitar, M. Niskanen, M-E Vlasic, R. Suter, A. Sekera, J. Pavelski, J. Jagr, L. Couture, J. Hudler, T. Vanek. Sekera and Hudler might be a bit of a surprise but none of the others would seem so. Suter has not always been so highly ranked by THoR but this year he is evaluated well. Couple of other notes on this. At even strength, THoR is a big fan of Gustav Nyquist of the Red Wings. And THoR also likes Tampa Bay’s Ondrej Palat as the best candidate for the Calder Trophy but he is closely followed by Hampus Lindholm of the Ducks. The total impact of Palat and Lindholm is probably very similar but Palat was on the ice for fewer events. MacKinnon is not rated as highly. (Note that THoR does not take account of PDO as part of its evaluation of players so Nyquist’s high shooting percentage is not relevant to THoR but his high rate of shots per time on the ice is.)

In this analysis, we also find that the bottom five include: A. Ovechkin, A. Edler, A. MacDonald, N. Grossman, N. Yakupov. Again, these are not the worst rated players in the league but those that have cost their teams the most due to the amount of ice time they are getting in the THoR methodology.

**All Events**

The top ten players based upon all events are: R. Suter, C. Kunitz, J. Jagr, M. Niskanen, M. Giordano, A. Markov, T. Brouwer, J. Carlson, D. Doughty and O. Ekman-Larsson.

Seven defensemen are on that list and that is due to the fact that we are doing two things. One, we are normalizing average player value to the position and, two, we are taking that average relative value and multiplying by the number of plays for which the player was on the ice. Since top D tend to play more minutes, they are given more value under this assessment. This, in particular, helps Suter, who was on the ice for just about 25% more plays than Giordano. On a per play basis Giordano had a greater average impact but Suter had a bigger overall impact on the season. As with any statistical metric, we can derive variability in these estimates.

Hampus Lindholm, a defenseman for the Anaheim Ducks, is the highest rated rookie on the list at 11th. He is not a Calder Trophy finalist. The finalists are Ondrej Palat, Tyler Johnson and Nathan MacKinnon, all forwards, who are rated 38th, 75th and 94th, respectively.

Again, we see some of the usual names at the top of the THoR list: Doughty, Steen, Hossa, Hornqvist, Couture, Karlsson, Kopitar and Subban. THoR is a two-way player metric and so this is not a surprise given the high year to year correlation for THoR.

Of note is that Ovechkin (not in the Top 250) compensates somewhat for his really poor even strength play with PP play that makes him a replacement level player for this year.

At the bottom of the ratings (not in the file for download), are A. Edler, R. Regehr, A. Ference, N. Nystrom and J. Cowen. Again, they were not the worst players but by virtue of their amount of ice time, they hurt their teams the most.

Also of note here is that Kunitz gets very high marks and higher marks than Crosby. This is a result of multicollinearity in these data, with Crosby and Kunitz playing nearly 4/5 of their 5v5 ice time together (Source: www.behindthenet.ca). While THoR uses ridge regression to deal with this, it is currently optimized to provide high predictive reliability.
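To see why two players who share most of their ice time are hard to separate, and what ridge regression does about it, here is a toy sketch. It is not the THoR model: the data are simulated, the "true" impacts of 0.03 per play, the 80% overlap and the penalty of 50 are all made-up values, and the helper `ridge_2d` is a hypothetical name for the standard closed-form ridge solution with two predictors.

```python
import random

def ridge_2d(rows, y, lam):
    """Closed-form ridge for two predictors: solve (X'X + lam*I) beta = X'y."""
    s11 = sum(r[0] * r[0] for r in rows) + lam
    s22 = sum(r[1] * r[1] for r in rows) + lam
    s12 = sum(r[0] * r[1] for r in rows)
    t1 = sum(r[0] * v for r, v in zip(rows, y))
    t2 = sum(r[1] * v for r, v in zip(rows, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

random.seed(0)
rows, y = [], []
for _ in range(500):
    # Player A is on the ice for about half of the simulated plays.
    a = float(random.random() < 0.5)
    # Player B has the same on/off status as A on about 4 of every 5 plays.
    b = a if random.random() < 0.8 else 1.0 - a
    rows.append((a, b))
    # Made-up true impact of 0.03 per play for each player, plus noise.
    y.append(0.03 * a + 0.03 * b + random.gauss(0.0, 0.5))

ols = ridge_2d(rows, y, lam=0.0)     # lam = 0 is ordinary least squares
ridge = ridge_2d(rows, y, lam=50.0)  # the penalty shrinks the estimates

print("least squares:", ols)
print("ridge:        ", ridge)
```

The penalty pulls the two estimates toward zero, which stabilizes them when the predictors overlap heavily. That helps prediction, but it does not by itself identify which of the two players deserves the credit.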

Alfredsson, Jagr, Markov, Moulson, Niskanen, Timonen, Vanek and Vrbata are all in the THoR Top 100 for 2013-14 and they are all UFAs this summer. They should have plenty of good offers this offseason (except for Jagr who has re-signed with NJD). That, along with a higher salary cap, should make for an interesting summer.

One interesting note for long-time hockey analytics followers is the appearance of Sean Couturier and Mark Scheifele in the top 250 players, though Couturier is worth over one win more than Scheifele this year based upon THoR. Also, David Perron of the Oilers, who was obtained in a trade from the Blues, shows up in the top 100.

Note that we use a different THoR model for Even Strength than for All Events; the latter accounts for differences between 5v5, 5v4 and 5v3, among others.

]]>

I love this idea that more NHL teams aren’t investing in analytics because the price tag reaches into the hundreds of thousands of dollars. This is a league in which a team still employs Colton Orr for close to a million dollars per season.

Maybe teams would rather not spend $250,000 to find out that they shouldn’t pay guys like him $925,000.

So this got me thinking about whether we could estimate the amount of return that spending $250k on analytics would yield. I decided to do a bit of back of the envelope computations to estimate what would be the benefit from having an analytics staff. No new analyses will be presented here, just some crude approximations. Almost all of this is low hanging fruit and no doubt there are improvements that can be made. The list is not comprehensive. I simply want to make the point that to say that hockey analytics have not proven their worth is ridiculous and I don’t need to be too thorough to do that.

I’ve broken down what follows into three categories: drafting, strategy and player acquisition. Within each category, I’ve estimated the return via goals, points or wins. Following some previous work by Gabe Desjardins, I’m assuming that a win is worth 2 points and roughly $2MM. Consequently, a point is worth $1MM. (Note the new CBA prompted me to round down slightly but that rounding should likely go up given what is expected for the salary cap next year.) We’re also going to look at things on a yearly basis, so you might only get to sign a free agent goalie every fourth year and we’ll take 1/4 of the value from that in this assessment.

**Drafting**

The first area we’ll check is drafting. There are two aspects to this: evaluating players to draft and determining the value of draft picks. The former is a good bit more difficult to evaluate since it is harder to get a sense of how much can be gained in this regard. I also think analytics can help in this area both in predicting the future and in evaluating mistakes of the past.

Some recent work by Eric Tulsky is suggestive of the data that might be mined from leagues outside the NHL and might be useful for projection of player performance in the NHL. This work helps to determine how a player is being used by their current team and how difficult is the quality of opponents with whom they share the ice. Such supplemental information might yield additional information that would be useful. If a players is generating 25 goals in 75 games but is playing sheltered minutes then we may want to draft others who are producing . This work is relatively new and so it is harder to evaluate how they will perform. However, it is likely safe to say that such methods are worth a couple of goals a year. Let’s say 2. That’s a 1/3 of a win or about $670k.

One area of drafting that has seen a good deal of work is the idea of league equivalencies. The idea of these approaches is to look at the performance of a player who spent year 1 in a non-NHL league (e.g. SM-Liiga in Finland) and year 2 in the NHL. By looking at the relationship between points in these two years, we can create a measure of the quality of the non-NHL league but also a method for estimating production once a player has joined the NHL. The most famous example of the application of this sort of work comes from the decision by the Winnipeg Jets in 2011 to draft Mark Scheifele over Sean Couturier. The projections for Couturier were clearly higher than those for Scheifele, as was noted at the time. Couturier has gone on to perform very well for the Flyers and is probably worth a couple of wins a year to them. Such circumstances are not going to crop up every year but when they do they are valuable. I’ll assume that such circumstances occur once every 4 years. Using Hockey-Reference.com’s Point Shares metric, the value of Couturier has been about 2.7 points over Scheifele over three years. That’s about 0.9 points per year. Or about 0.9/4 = 0.225 points or $225k per year.

So conservatively the total for drafting is about $895k per year. This excludes something like the creation and use of a value pick chart, for example this one or these.

**Strategy**

There is a good deal of evidence that pulling the goalie more often will yield some results. David Beaudoin and Tim Swartz have done a nice study that suggests that teams should pull their goalies more often. This mathematical approach uses goal scoring rates under a variety of circumstances and uses these rates to estimate the average number of points under those circumstances. They conclude by suggesting that on average a point per season can be gained with this approach. As with any strategic innovation, its success will lead to (being copied and eventually) its demise. Pulling the goalie faces many of the same difficulties as going for it on fourth down in the NFL. It is less likely to be adopted because of the costs. Nonetheless it is clear that analytics has suggested an improved way to play the game. Assuming such an advantage would be adopted by other teams, we might lose our advantage after a year and so we’ll round this down to $100K in value. I do this here not because I don’t think it will work (I’ve seen first hand how well it works, as my collaborator Chris Wells has led the NCAA in minutes with the goalie pulled for several consecutive years) but rather because I think it is less likely to be adopted by coaches.

Another strategic move that has been shown to have an impact is carrying the puck more and dump-and-chase-ing less. Eric Tulsky has gotten a good deal of attention as the lead researcher in this area. His paper on this was a poster at last year’s MIT Sloan Sports Analytics Conference. The idea here is that carrying the puck into the offensive zone leads to better possessions, which leads to more goals. It is hard to get a sense of the value here but let’s try. We can estimate about 50 zone entries at 5v5 per game (source here and here). If we can increase the number of shots using this method by 0.5 per game, that would roughly be 3 goals per year or about $1MM. As above we spread out this gain over several years to get $250k per year.

So the numbers here suggest that it is possible to get $350k per year in value. Value from the analyses given here is the most likely to be transient but it is also possible that analysts will find new areas to exploit.

**Trades and Free Agency**

Next we’ll look at player acquisition through trades or free agency. One of the most useful aspects of analytics has been noted by Phil Birnbaum which is that they prevent stupid. As the article argues, there is more to be gained by not being stupid than there is by being smart. That is useful and a big part of analytics.

The litany of large NHL contracts that might have appropriately received another look had analytics been involved is long and includes: Bryzgalov’s signing by Philadelphia, Douglas Murray’s trade acquisition by Pittsburgh from San Jose, or his signing with Montreal, or trading for Robyn Regehr. Not every team signs or tries to sign a big free agent every year. Let’s say one every four years that overpays by $1MM per year. That’s a savings of $250k per year. And we haven’t even looked at the contracts of Pekka Rinne, or David Clarkson, or Brad Richards, or the aforementioned Colton Orr. As the salary cap grows this is likely to be an even bigger value.

**Summary**

There are lots of ways that hockey analytics can impact a team. I’m an analytics guy so I’ve tried to be conservative here about the value that analytics can bring. Overselling analytics won’t help its adoption. The cost of a reasonable analytics department might be around $300k. You could get a good deal of consulting input alone for $100k. I think it’s pretty easy to see that analytics could yield (at a minimum) benefits for an NHL team of over $1MM from the areas listed above. As I stated previously, these are estimates that are low and they are not exhaustive. Hopefully from this, it should be pretty clear that cost is not one of the factors holding back analytics in the NHL.
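For anyone who wants to check the arithmetic, the rough figures from the three sections can be tallied as in the sketch below. The dollar values come from the text; the conversion of 6 goals per win is an assumption implied by treating 2 goals as 1/3 of a win, and the variable and category names are made up for illustration.

```python
DOLLARS_PER_POINT = 1_000_000  # from the text: a win is 2 points and roughly $2MM
GOALS_PER_WIN = 6              # assumption implied by "2 goals is 1/3 of a win"

def goals_to_dollars(goals):
    """Convert goals to dollars via wins (2 points per win)."""
    return goals / GOALS_PER_WIN * 2 * DOLLARS_PER_POINT

# Yearly estimates, spread over four years where the text does so.
estimates = {
    "drafting: usage data": goals_to_dollars(2),
    "drafting: league equivalencies": 2.7 / 3 / 4 * DOLLARS_PER_POINT,
    "strategy: pulling the goalie": 100_000,
    "strategy: zone entries": goals_to_dollars(3) / 4,
    "trades and free agency": 1_000_000 / 4,
}

total = sum(estimates.values())
for name, value in estimates.items():
    print(f"{name:32s} ${value:>12,.0f}")
print(f"{'total':32s} ${total:>12,.0f}")
```

The unrounded total comes out a few thousand dollars below the roughly $1.5MM obtained by adding the rounded figures in the text ($895k + $350k + $250k), and either way it sits well above the $300k cost of a department.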

*Addendum*

*My friend Steve Argeris suggested the following: I’d think about it a different way. What are you spending your money on? Presumably you can just have your cap guy or some $50k per year guy just stay up on what’s on the blogs and use the public websites reasonably well, so that’s really your baseline, not “nothing at all.” So say to hire a PhD in statistics (ahem) and move him from his tenure-track position to a more expensive city, subscribe to various data sets, and pay freelancers (analysts, coders, database guys) $200k per year. For that amount of money, what would it take every year to just break even?*

* It would take:*

*1) not blowing one draft pick in the first three rounds every three years (merely moving one pick in three years from “never playing an NHL game” to playing, say, 60 games at a minimum salary) OR*

*2) finding a league minimum guy that can do what a $900k guy can do once per year OR*

*3) finding a $1M player who can do what a $2M player can do every three years OR*

*4) recognizing a tactical or strategic decision that improves a team by three marginal goals over three years ONCE *

* In other words, if ANY of those four things happen, you’ve paid for yourself. The likelihood of ANY of those things happening is pretty close to 100%. The likelihood of more substantial benefit is far far higher.*

]]>

So last week during the Hockey Analytics panel at the MIT Sloan Sports Analytics Conference, Eric Tulsky referenced a study that Michael Schuckers and his student Lauren Brozowski did on referees in the NHL. While this work has been available publicly on the conference website, due to some unknown oversight we did not post it here. So here it is. The paper is based upon two NHL seasons’ worth of data. The data don’t let us know who among the referees made the call, just who was on the ice for the call. Most of the results are pretty obvious. The later you are in a tight game, the less likely it is that a penalty is called. Home teams are less likely to be called for penalties than visitors. There also seems to be a great deal of consistency among the referees in their rates of penalties after adjusting for a variety of factors including score, period, the teams involved, etc.

Link to slides from 2011 JSM Talk

Photo by Mark Canter, http://en.wikipedia.org/wiki/File:Dmitry_Kulikov_Panthers_Shane_Heyer.jpg

]]>