So picture this. You're hanging out with some friends and a debate starts: "What's more important, goals or primary assists?" There are many ways to go about answering this question, but many in hockey analytics will look at how predictive each metric is, or how highly correlated each one is with future on-ice goals for. So you pull up Corsica to answer that question. First, you grab the counting stats of all the forwards who played at least 500 5v5 minutes in 2016-17, then you grab the adjusted on-ice relative to teammate goals for per hour of all the forwards who played at least 500 5v5 minutes in 2017-18. Next you delete all the players who didn't play 500 minutes in both seasons, which leaves a decent sample of over 250 forwards. Finally, you line the two up against each other, check the correlations for how predictive each metric was, and use that to argue you know what the best counting stat is. However, if you do this, the results will surprise you.
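For anyone who wants to try this at home, here is a minimal sketch of the process described above. The column names (`toi`, `g60`, `rel_gf60`) and the four toy players are made up for illustration, they are not Corsica's export format or real data.

```python
from math import sqrt

MIN_TOI = 500  # minimum 5v5 minutes required in each of the two seasons


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def predictive_r(season_one, season_two, stat, target="rel_gf60"):
    """Correlate a year-one stat with the year-two target.

    Only players who cleared MIN_TOI in both seasons are kept.
    Returns the correlation and the number of players used.
    """
    keep = [
        p for p in season_one
        if p in season_two
        and season_one[p]["toi"] >= MIN_TOI
        and season_two[p]["toi"] >= MIN_TOI
    ]
    xs = [season_one[p][stat] for p in keep]
    ys = [season_two[p][target] for p in keep]
    return pearson_r(xs, ys), len(keep)


# Hypothetical toy data: goals per hour in year one, RelT GF/60 in year two.
season_one = {
    "A": {"toi": 900, "g60": 1.0},
    "B": {"toi": 800, "g60": 0.6},
    "C": {"toi": 700, "g60": 0.8},
    "D": {"toi": 300, "g60": 1.5},  # dropped: under 500 minutes in year one
}
season_two = {
    "A": {"toi": 850, "rel_gf60": 0.4},
    "B": {"toi": 900, "rel_gf60": -0.2},
    "C": {"toi": 650, "rel_gf60": 0.2},
    "D": {"toi": 900, "rel_gf60": 0.9},
}

r, n_players = predictive_r(season_one, season_two, "g60")
```

Running the same function once per counting stat gives you the numbers to line up against each other.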
If all I gave you was goals and primary assists, this could look pretty reasonable. Goals slightly outperformed primary assists in 16-17. However, the addition of secondary assists is where it gets weird. Secondary assists were actually slightly more predictive than primary assists. The more familiar you are with hockey statistics, the more surprising this probably is. Eric Tulsky showed how secondary assists aren't nearly as repeatable as primary assists back when I was in middle school, so secondary assists being inferior to primary assists hasn't been a controversial take for a while now.
So what happened? Well, the obvious answer is noise. Everybody understands that hockey statistics are noisy, but I'm not sure everyone understands just how noisy they can be. Even with a 250+ player sample, the signal is overwhelmed by the noise. This becomes obvious when you extend the sample to every pair of back-to-back seasons since 2011. (If you're unfamiliar with some of the stats below, I talk about them in my BPM explainer.)
With the extended sample, the results make a lot more sense. Primary assists are the most predictive, then goals, and finally a massive drop-off to secondary assists. So, even though you can technically use evidence to show that secondary assists are more predictive than primary assists, you would be wrong. And the problem is, this happens way more than you might think.
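To get a feel for how often a single season can flip the ranking, here is a small simulation sketch. Everything in it is an assumption for illustration: the "true" correlations of 0.35 and 0.25 are invented, not measured from Corsica, and the function name `simulate_upset_rate` is mine. It asks how often the truly weaker metric shows the higher sample correlation when you only look at one 250-player season.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_upset_rate(rho_strong, rho_weak, n_players=250, n_seasons=200, seed=1):
    """Share of simulated seasons where the weaker metric looks more predictive.

    Each metric is built as rho * target + sqrt(1 - rho^2) * noise, so its
    true correlation with the target is rho by construction.
    """
    rng = random.Random(seed)
    upsets = 0
    for _ in range(n_seasons):
        target = [rng.gauss(0, 1) for _ in range(n_players)]
        strong = [rho_strong * t + sqrt(1 - rho_strong ** 2) * rng.gauss(0, 1)
                  for t in target]
        weak = [rho_weak * t + sqrt(1 - rho_weak ** 2) * rng.gauss(0, 1)
                for t in target]
        if pearson_r(weak, target) > pearson_r(strong, target):
            upsets += 1
    return upsets / n_seasons


# Hypothetical true correlations: 0.35 for the better stat, 0.25 for the worse.
upset_rate = simulate_upset_rate(0.35, 0.25)
```

Shrinking the gap between the two assumed correlations, or the number of players, pushes the upset rate up, which is the same effect the single-season results show.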
With help from Bill Comeau there's now an interactive Tableau showing how predictive each metric has been over the years, so you can see for yourself how much noise there is in the predictive value of hockey statistics. It looks like a mess altogether, but if you highlight secondary assists you can see that even though they are the "worst" metric in aggregate, they have only been the worst metric 2 out of 6 times.
I chose secondary assists because they are generally viewed as useless in the analytics community, but as you can see, this happens with other metrics too. Look at goals and primary assists for another example.
This works well because goals have outperformed primary assists in two straight seasons with hundreds of players in the sample, at which point people will likely become very confident in the correlations. Thankfully with these counting stats we can easily cite a larger sample of data and show people why they are still probably better off weighing primary assists more heavily than goals, but unfortunately, we are not always going to have that luxury.
I'm referring to what feels like the new wave in hockey analytics, micro-statistics (I'm talking about hand-tracked data, but the general message will apply to the first few seasons of the NHL's tracking data too). More and more often it feels like we are seeing people hand track data to learn about the game, from macro-scale projects like Corey Sznajder and company's All Three Zones Project to more micro-scale projects like Harman Dayal tracking the Vancouver Canucks. This sort of data seems to be where the analytics community is trending. Of course, the more projects like these the better. It's awesome that we now have the ability to dig deeper into transition data to show how well Erik Karlsson exits the zone or to quantify some players' ability to apply pressure on the forecheck. This analysis might just be the future of hockey analytics, and hey, the best way to learn is to try. But as more and more of this data is collected, we will inevitably zero in on how predictive each of the metrics proves to be.
This has started already thanks to CJ Turtoro's RITSAC presentation, where he used the All Three Zones Project's data on the Blue Jackets, Stars and Flyers to show that the public micro stats were more predictive of future goals than traditional metrics like Corsi and xG (for defenders). It's great to see a signal like this, but it's important to urge caution even if we continue to see promising results like these in the future, because even full seasons' worth of data on the entire league can lead you astray, just like secondary assists would have in 2016-17. Sadly, tracking games takes hours from extremely committed people, so we likely won't have the luxury of league-wide data sets for a while, making the already difficult task even more challenging.
So yes, all this new data is exciting and it's awesome that the early results are promising, but it's going to take a much longer time than most people would like before we can be confident which metrics are most predictive of future results, and until then it's good to be cautious with testing results.
Today's the day the William Nylander saga finally ends. He will presumably sign somewhere, and once he does, it raises the question: how good will he be once he finally gets on the ice? Today I took an alternative approach to projecting William Nylander's future on-ice performance.
To do so I've used a catch-all statistic called C-Score, which improves upon my old goals above replacement (GAR). Basically the goal of C-Score is to condense all the things a player does into a single number. From there you can judge a player based off their total output rather than just, say, points. With C-Score I've developed a system to come up with historical comparables for each player. It does this by taking how similar each of their inputs are and weighting them by importance. So, for example, a similar offensive style at 5v5 is more important than winning a similar percentage of your face-offs, because 5v5 offence accounts for a much larger percentage of a player's overall output.
Finally there is an age adjustment. Rather than comparing William Nylander against everyone to play in the past decade, the projection will only compare him to fellow 21 year olds (his age last season). So who are Willy's closest comparables at the same age? Let's take a look. Note that these are ranked, so Eberle is the #1 comp, Ehlers is #2 and so on. (Also, random fun fact: without the age adjustment I actually have Willy's closest comparable as a 22 year old Phil Kessel, in his first season in Toronto.)
William Nylander's camp has likely been citing Leon Draisaitl's contract to get more money, and the Maple Leafs are likely citing Nikolaj Ehlers's deal to keep Willy's AAV down, so it's interesting they show up as his second and third closest comparables. Hopefully this gives more credit to the idea neither side is right or wrong, it's just business.
More generally, it's hard to overstate how impressive William Nylander's performance up to age 21 is. Over the past decade, his comparables include all-stars like Patrick Kane, Nathan MacKinnon, Aleksander Barkov, Steven Stamkos, Tyler Seguin, David Pastrnak, Johnny Gaudreau, Jamie Benn, Filip Forsberg and Nikita Kucherov.
That's 10 unquestionable superstars! And there were only 25 non-Nylander players used as comparables. Those are very, very good odds, and the number might have even been higher if we knew how Ehlers, Draisaitl and Larkin's next few seasons will pan out. Rather than looking at a general age curve to project Nylander, I wanted to project him based off these comparables, and how they aged into their 22 year old season. The hope is that this might give Leafs Nation a better idea of what to expect from Nylander in the NHL this season. To get this idea, I created 3 projections for Nylander which should outline his expected range of outcomes. Let's start with the mean projection.
The greenish dot right below the blue line shows Nylander's mean projection. This is what I would consider the most realistic projection for Nylander based off his comparables, and it sits right inside the top 30 NHL forwards. That is incredibly impressive for a 22 year old forward, and would significantly boost the Leafs' expected winning percentage every night he's in the lineup. Also, it's worth noting that the 30th highest paid forward in the NHL right now is Ryan Kesler, making $6,850,000. (Thanks Cap Friendly.) Thanks to inflation and the likelihood of continued growth from Nylander, he will likely be worth an even higher cap hit over the long run.
Of course, age curves are very noisy. Not everyone follows the average trend, so I looked at Nylander's projection if he is one standard deviation above his comps' average to get an idea of what the ceiling is for this player.
The purple dot above the "Top 15 Forward Line" is the upper bound for Nylander. This would drive him close to a top 10 forward in the NHL. Is this terribly likely? No, in fact it may be even less likely than usual because he's sat for so long. But age curves are much noisier than we give them credit for, and this represents Nylander's ceiling. Basically the sky is the limit for a player as young and talented as he is. Hopefully fans have not forgotten that. Again it's worth noting, the 15th highest paid player in the NHL today makes $8,250,000, giving any realistic extension the potential to be a massive bargain right away.
Finally we have the more pessimistic projection. This is if Nylander drops by one standard deviation and represents his floor, or worst reasonable scenario.
The green dot lying below the "average first liner" is the lower bound of Nylander's projection range. Nobody likes to think that their young players will be the ones who get worse, but the reality is it happens all the time. If Nylander does lose a step, the Leafs should be prepared for him to fall as far as the 70th best forward in hockey. This represents some of the risks we often forget about with young players: some of them do get worse. Luckily for Nylander, the lower bound I have projected is still a first liner, and worth about $5,750,000 today. Not bad for a floor.
Altogether the question of "How good is William Nylander going to be?" is probably a lot more uncertain than most people think. Age curves are weird, so a single number projection probably isn't ideal. However, it's reasonable to expect that once Nylander gets back into the NHL, he should be around the top 30 forwards, while acknowledging he could realistically climb as high as the top 15, or fall as far as the 70th.
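The three projections above boil down to a mean plus or minus one standard deviation. Here is a rough sketch of that arithmetic, assuming the projection is built from how each comparable's rating changed between age 21 and age 22. The function name, the starting value of 10.0 and the list of changes are all invented for illustration, they are not actual C-Score numbers.

```python
from statistics import mean, stdev


def projection_range(current_value, comp_changes):
    """Floor / mean / ceiling projection from the comparables' year-over-year changes.

    comp_changes holds each comparable's change from age 21 to age 22.
    The floor and ceiling sit one sample standard deviation either side of the mean.
    """
    avg_change = mean(comp_changes)
    spread = stdev(comp_changes)
    return {
        "floor": current_value + avg_change - spread,
        "mean": current_value + avg_change,
        "ceiling": current_value + avg_change + spread,
    }


# Hypothetical changes for five comparables (some improved, some got worse).
comp_changes = [2.0, -1.0, 0.5, 3.0, -0.5]
projection = projection_range(10.0, comp_changes)
```

The wider the spread in how the comparables aged, the further apart the floor and ceiling end up, which is why a single number hides so much.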
Just like Sidney Crosby was for a long time, Connor McDavid is significantly better than anyone else in the world at hockey. This could be true for like a decade, and will probably get boring. Cue talking heads on TV, or someone looking for a hot take on Twitter, saying how someone is actually close to the best player on earth. It happened with Toews for Crosby, and yesterday it erupted about Matthews and McDavid. This left me wondering, who is Connor McDavid's best comparable?
To answer this I used my new metric, C-Score. I don't have a write-up yet, but think of it like a greatly improved version of my old GAR (so much so that every time I say C-Score you may just want to think GAR). It tries to take all of a player's important stats and boil their output down into one number. There is a small explainer on the C-Score sheet, and I can help answer questions about it on Twitter or in the comments. From C-Score the goal was to find a similarity rating for each player. To do this I started by creating a Z-score for each input into overall C-Score. This measures how many standard deviations away from the mean a player is at each component of the game, which is a great way to compare skills across different years. With those Z-scores you can easily see how close certain players are in each skill. From there I took the absolute value of the sum of the differences in Z-scores, and weighted them by how much each metric is worth overall. So, for example, BPM, which represents weighted counting stats at 5v5, is worth far more to a player's C-Score than face-offs, so having a similar BPM rating means way more to the overall similarity score than having a similar face-off rating. With that, it's time to look at Connor McDavid's comparable seasons from the past 3 years, meaning I'm talking recent history for this post.
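Here is a minimal sketch of how a similarity score like this could be put together. It is one reading of the description above, not the actual sheet: I'm assuming the score is a weighted sum of absolute Z-score differences, and the player names, the two components and the 0.9 / 0.1 weights are all made up.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list: how many standard deviations each value is from the mean."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def standardize(players, components):
    """Return {player: {component: z}} with each component standardized across players."""
    names = list(players)
    table = {name: {} for name in names}
    for comp in components:
        zs = z_scores([players[name][comp] for name in names])
        for name, z in zip(names, zs):
            table[name][comp] = z
    return table


def similarity(z_a, z_b, weights):
    """Weighted sum of absolute Z-score differences. Smaller means more similar."""
    return sum(w * abs(z_a[comp] - z_b[comp]) for comp, w in weights.items())


# Hypothetical inputs: BPM matters far more than face-offs.
weights = {"bpm": 0.9, "faceoffs": 0.1}
players = {
    "Star": {"bpm": 3.0, "faceoffs": 50.0},
    "Close": {"bpm": 2.6, "faceoffs": 42.0},
    "FaceoffTwin": {"bpm": 0.5, "faceoffs": 50.0},
    "Depth": {"bpm": 0.1, "faceoffs": 46.0},
}

z_table = standardize(players, weights)
ranked = sorted(
    ((name, similarity(z_table["Star"], z_table[name], weights)) for name in players),
    key=lambda pair: pair[1],
)
```

In this toy example the player with a nearly identical BPM comes out as the closest comparable even though his face-off number is far off, while the player with an identical face-off number and a much worse BPM lands well down the list. That is the weighting doing its job.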
Connor's Closest Comps
So with a method to test this, I put Connor McDavid's numbers into the sheet and it spat out his closest comparables. Here are the top 10 forwards comparable to McDavid's 17-18 season. (The smaller the "similarity score" the better.)
His own season is obviously perfectly comparable with itself, but maybe people are on to something. Connor McDavid's closest comparable last year was actually Auston Matthews. The problem is that not all closest comparables are equal. Matthews and McDavid differ by a score of about 1. This likely means nothing to anyone, so I'll show you on a line graph.
Matthews was technically McDavid's closest comparable, but McDavid is playing one game, and everyone else is playing another. However, if you look at Prospect Cohort analysis you will find that it is usually harder to find comparables for top tier players, and this is true to an extent. More pedestrian players do have much stronger comparables, but even other elite ones aren't nearly as far away from the pack as McDavid. To illustrate this, we'll do the same thing for Matthews, and to give him his best shot at looking one of a kind, I'll use his 17-18 season too, which was his best so far.
If you look closely, you'll notice the scale here is much smaller. Matthews' closest comparable is the ghost of his new teammate, John Tavares, with a collection of all-stars filling out the top ten, but no McDavid. Looking at it through Matthews' lens, there were 33 player seasons more comparable to him than McDavid was last year. In fact Matthews was more similar to Marchessault than McDavid last year, despite the fact Matthews is technically McDavid's best comparable. For those who like the scatter plot view, here is the same thing but for Matthews, with McDavid highlighted.
As you can see, relative to McDavid's plot, there is a much larger group of players who are much more comparable to Matthews than anyone is to McDavid. So please, stop comparing players to McDavid. You're doing a disservice to the player by doing so. They are probably special in their own way, but there simply aren't any players like McDavid in today's NHL, and barring some drastic change, there probably won't be for a while.
Secondary assists are significantly less repeatable than goals or primary assists. As a result, they're generally ignored inside hockey analytics circles. I understand the logic behind this, but I think there's a better way of looking at things. My philosophy here is directly inspired by Dawson Sprigings' old WAR model. Rather than cutting out secondary assists altogether, he used a "BPM" or box plus minus metric. The general idea here is not to throw counting stats out, but to combine them all into one stat, assigning each one an appropriate weight. Sprigings' (predictive) weights ended up looking like this. (Blue is forwards, which I will be focusing on for this post.)
I've always been a fan of this concept, and since Sprigings' work is no longer in the public sphere, I've set out to do something similar. My goal was to take public counting stats, and combine them to predict future on-ice goals for as well as possible (RelT GF/60 from Corsica, where all the stats in this post will be from. Also, I will be referring to year to year correlations for the remainder of the post). The metrics I set out to use were:
Goals- Individual Goals scored
A1- Primary assists
A2- Secondary assists
ESA- Estimated shot assists, from @loserpoints
IFF- Individual Fenwick for (the unblocked shots a player takes)
ixFSh%- The percentage of times a league average shooter would score given that player's IFF
iXGF- The number of goals a league average finisher would score given that player's (unblocked) shots
Gives- Number of times a player gives the puck away. Yes, giveaways are going to be a good thing in this model
Takes- Amount of times a player takes the puck away from the other team
(All Per Hour of 5v5 Ice time)
Each of these metrics predicts future goals to varying degrees. Here's each one's R squared with future on ice goals for.
This shows us a few things. First, secondary assists aren't necessarily useless, they just aren't nearly as useful as other stats. Second, as Sprigings already showed, looking at primary points probably isn't optimal, because primary assists are worth more than goals. Finally, individual shot quality doesn't add much, if anything, to the equation. So with the basic correlations revealed, I used a linear regression model to combine most of the stats shown above into one. (Anything including shot quality added nothing to the model and was therefore removed, as was ESA1 due to multicollinearity issues with ESA.) Here are the resulting weights.
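For readers who haven't seen how a linear regression turns several stats into one weighted number, here is a bare-bones sketch using ordinary least squares. To be clear, this is not the model from this post: the function name `fit_weights`, the two inputs and every number below are invented, and the toy target is built from known weights so you can see the regression recover them.

```python
def fit_weights(rows, target):
    """Ordinary least squares with an intercept, solved via the normal equations.

    rows is a list of tuples of year-one stats, target is the matching list of
    year-two outcomes. Returns [intercept, weight_1, weight_2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build X^T X and X^T y.
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, target)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [row + [b] for row, b in zip(xtx, xty)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def combined_stat(row, coefs):
    """Apply the fitted weights to one player's stats."""
    return coefs[0] + sum(w * v for w, v in zip(coefs[1:], row))


# Hypothetical (A1/60, G/60) pairs and a target built from known weights.
rows = [(1.0, 0.5), (0.4, 0.9), (0.8, 0.2), (0.2, 0.3), (0.6, 1.1)]
true_coefs = [0.1, 0.5, 0.3]  # intercept, primary assist weight, goal weight
target = [combined_stat(r, true_coefs) for r in rows]

coefs = fit_weights(rows, target)
```

With real data the target is noisy, so the fitted weights are estimates rather than exact recoveries, and correlated inputs (like ESA1 and ESA) make them unstable, which is the multicollinearity issue mentioned above.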
Most of the weights aren't surprising. Primary assists are the most important. Then takeaways, goals, and shots are just behind. After that, shot assists and giveaways provide less value, and finally secondary assists add the least. Now to test the model's actual goal, its ability to predict future on-ice goals for.
An R squared of 0.197 doesn't seem like much on its own, so let's compare it to normal counting stats.
This new metric is a significant improvement over using any of the inputs on their own. Defencemen will be coming soon, but for now you can find the forwards data here. And finally, if you're not convinced yet, I noticed some fun quirks with the stat. The target variable was on-ice, relative to teammate goals for per hour, but this new number is actually more predictive of most of the metrics we care about than they are of themselves. For example, here's the auto-correlation of individual goals per hour compared to this metric's correlation with future individual goals for per hour. (All the counting stats about to be shown are standardized.)
Same goes for primary assists.
Even something like relative to teammate expected goals for... which is weird.
This new metric isn't a WAR model in itself, but I think based on the testing we can conclude it is a very useful stat. For one final thought, you may have noticed that I haven't yet referred to this stat by an actual name. That's because in hockey analytics, people often complain about nomenclature, saying that the names of some of our current stats don't help our cause. So, I'm asking you to help me come up with a name. If you can think of a more descriptive name than the "BPM" it was inspired by, then I'll gladly name the stat whatever you come up with. If you have any ideas for the name, feel free to comment, or DM me @CMhockey66 on twitter, thanks for making it this far!
The 2018 NHL entry draft is just around the corner. As picks are traded up down and around, one of the things you're likely to see is draft pick value charts. These have been done before by many people, and the general idea is to put one number value on each draft slot. From there, it's easy to compare which picks are likely to yield more value relative to the other picks. Since this has been done before, today let's look at draft pick value charts through a different lens, creating a draft pick value chart strictly for strong links.
The Importance of Strong Links
Many people are likely unfamiliar with the term "strong link" or why they are important. Luckily Alex Novet has gone in great depth here, but I'll provide a quick synopsis of his findings.
Basically, there are 2 types of sports, strong and weak link. In a strong link game, star power drives success, while in a weak link game, teams with superior depth are more likely to win. To answer the question, what kind of game is hockey? Alex used this picture.
The vertical axis is the amount of points each team earned throughout the season, and on the horizontal axis is the value of the team's best and worst players (worst in blue, best in red). The first picture shows us there is no relationship between the weak link value and a team's point total. In a weak link game (like soccer), the stronger the weakest link, the better the team would be, but no such relationship exists in hockey. The second graph shows us there is a relationship between a team's best player and their team's point total. Generally, the stronger the team's best player, the more points that team ended up with. This means even though the Vegas Golden Knights are in the Cup final and Connor McDavid's team didn't make the playoffs, hockey is a strong link game, driven by the best players.
How Teams Acquire Strong Links
Since hockey is a strong link game, every NHL front office should have one goal above all else, acquire stars. For our purposes, strong links are going to be defined as the top 30 forwards (31 this year because of the Vegas expansion) in goals above replacement each season since 2008-09 (Data Here). This is because, in a perfectly competitive market, these players would each be the strong link on their team. How have teams generally acquired their strong links? Let's break it down.
There are three different ways teams have obtained their stars. First is signing them in free agency. Only seven percent of strong link seasons have been acquired this way. Players peak around age 24, meaning they almost always enter free agency past their prime. As a result, teams generally don't get elite talent on the open market.
The next way teams have added superstars to their roster is in trades. About 19% of strong link seasons have been obtained in a trade. This means it's possible, but not all that likely for a team to get their best players in a trade.
Finally, we have the draft, where an overwhelming majority of stars are acquired. Sure, every armchair GM loves to mock trade the next Hall for Larsson, or dream that the undrafted free agent their team signed might become the next Artemi Panarin, but the reality is that NHL teams should plan to acquire their strong links through the draft. It's where about 75% of them come from.
Re-Thinking Draft Pick Value Charts
So thanks to Alex Novet's work and my findings we know two things. First, good NHL teams are driven by strong links. Second, most teams will need to acquire these players through the draft. This gives us an alternative way of viewing an old concept, draft pick value charts.
Traditionally these charts aim to value each pick by its ability to produce NHL players, but instead, let's look at each pick based solely off its ability to produce strong links. Using the parameters set above, here's a look at the cumulative strong link goals above replacement (GAR) by each draft position since 2008-09.
This is a rough look at what's to come. Obviously the 45th pick isn't better than the 44th just because Patrice Bergeron was drafted there, so here is the same thing smoothed, which will represent our strong link draft pick value chart.
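If you want to see what smoothing does to a bumpy pick-by-pick chart, here is a small sketch using a centered moving average. This is just one common way to smooth a series and I'm using it as a stand-in, not as the exact method behind the chart above. The raw values are invented.

```python
def moving_average(values, window=5):
    """Centered moving average. Windows are truncated at the ends of the list."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical cumulative strong link GAR by pick, with one lucky spike at index 6.
raw_value = [100, 60, 50, 0, 10, 0, 40, 0, 5, 0]
smooth_value = moving_average(raw_value)
```

The spike from one great player gets spread across the neighbouring picks, so a single Bergeron-type outlier no longer makes one slot look far better than the ones beside it.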
This is the approximate value of each draft pick if your only goal is to draft stars. As mentioned above, I'm not the first person to create a value chart. Traditionally, these value charts look at how each pick yields any type of NHL player, not just strong links. So what's important about my findings is the difference between the value chart above and a more traditional one. To see the differences, here's my strong link value chart plotted against Dawson Sprigings' (DTM) draft pick value chart.
When plotted against a more traditional value chart, it's easy to see the dramatic differences in relative pick value. DTM's chart sees a 12% drop after the first pick, then each drop gets smaller and smaller from there, and relative value levels off after the first round or so. In contrast, the strong link chart holds the top three picks to a much higher standard. Each lottery pick provides significantly more value than the next, and from pick four onwards every pick is easily replaceable.
Applications of the Strong Link Pick Value Chart
Admittedly, when I first had this idea I wasn't sure how many practical applications there would be, but there is one very important takeaway from this, albeit in a small niche. The 4th overall pick has very little value relative to the lottery picks, and is very similar to the picks below it. This has one major application because of the NHL draft lottery, specifically for teams that are:
A) Bad enough to be in the lottery to begin with, and
B) Unlucky enough to lose the lottery and fall out of the top 3
These teams are likely going to be desperate to acquire star talent, and they're probably planning to acquire that talent through the draft. The problem is, bottom-feeding teams that are knocked out of the top 3 are left with a pick which isn't significantly better at producing superstars than the picks after it. This brings us to the biggest lesson to be learned from this exercise: don't be afraid to get creative with your first round pick, even if it's a really high one. The plethora of picks a team would receive in exchange for, say, the 4th overall pick gives them a much better chance at drafting a strong link than that singular pick. After the top three, look to trade down, prioritizing quantity over quality.
An alternative way to get creative with top non-lottery picks can be emulating Arizona last draft. Arizona, a rebuilding team, traded their 7th overall pick in exchange for Derek Stepan and Antti Raanta. It's generally frowned upon for a team in the Coyotes' position to trade their first round pick, however using the strong link value chart this looks like a fantastic trade. They still need star talent, but at pick 7 they probably weren't getting it anyways, so they flipped the pick for a first line center and a budding star in goal. These two assets are probably going to provide more value than a lottery ticket at 7. It takes a lot of guts to trade a pick that high, but teams shouldn't be nearly as skeptical of doing it as they currently are. Non-lottery picks probably aren't as valuable as people think.
The draft position data used in this post is thanks to Rob Vollman's super spreadsheet, and for the years for which he does not have draft position, the rest came from Hockey Reference. For those interested, here's a link to the actual individual strong link pick values. Of course, remember that these values are the approximate relative value of each pick, not a rule-book written in stone to follow at all times. If you have any questions, comments or concerns about this, feel free to comment or reach out to me on Twitter @CMhockey66!
For our purposes, GAR is an acronym which stands for Goals Above Replacement. It measures the total amount of goals a player adds to his team relative to a replacement level player, and tries to do so by taking everything a player does into account, then getting it down into one number. So theoretically, if a break-even goal differential team gained a +20 GAR player, their goal differential would increase to +20.
Edit: I'm in the process of greatly improving upon this idea. Check out the first step towards my new and improved GAR here.
If you're only looking for a specific section of the stat, here's a quick table of contents, if not, let's dive in.
1) The Value of GAR
2) Replacement Level
3) Even Strength Offence
4) Even Strength Defense
5) Power-play Offence
6) Penalty Differential
9) Testing and Results
1) The Value of GAR
Before diving into how it's calculated, first let's address the question, why is this important/necessary? Well, you may have seen great resources around Twitter that look something like this. (Data viz from Bill Comeau.)
These charts are a great way to consume tons of information in a small amount of time. You can see tons of statistics in one glance, from their point production to their ability to drive play, and even their context, it's all there. The problem is when you have to choose between two players with similar production, like Hall and MacKinnon for example. Above you can see MacKinnon had more points, but Hall had better shot and expected goal metrics. MacKinnon played tougher competition, but with much better teammates. What about the fact MacKinnon had a better penalty differential? Or the additional face-off and power-play data that isn't even included here? With all of that, who was better overall?
You can try to weigh everything in your head, but the odds of you being able to come to consistent conclusions without being bogged down by the same personal affinities and biases that we turn to numbers to get away from in the first place are very low. This is why we can turn to GAR, as a framework for how to weight things. Also, just as a general tool, having player output in one number can be incredibly useful. It's not going to be perfect, but it's a great starting point for player talent.
2) Replacement Level
The next question around GAR models is why are they "above replacement level"? For a simple explanation, let's assume individual goals are all a player brings to the table. Then imagine a theoretical hockey team called the Computer Boys. During the 2017 off-season the Computer Boys lose Eric, their top line left winger, to free agency. Over a full season, Eric always scores 22 goals. With Eric gone, most people think that the Computer Boys have lost 22 goals from their lineup, however that's not the case. If the team was to just leave a void on first line left wing, then I guess they would lose 22 goals. But of course, they won't do that. The second line left wing will step into Eric's minutes, the third will take the second's and so on. This shifting of the line-up will replace some, but not all, of Eric's 22 goals. This is why it's worth comparing player contributions to replacement level rather than zero.
So what is replacement level? That's a question which I don't have a perfect answer to. However, for the purpose of this model it has to be something. In his book Stat Shot, Rob Vollman cites the 75th percentile as replacement level. He also notes that it will drop after an expansion draft. Since the Vegas Golden Knights made the NHL about 3% bigger, I'm using the 78th percentile as replacement level. This means you have to be better than 22% of the league at a given skill to provide above replacement level value at said skill. With replacement level defined, let's dive into the inputs that give us goals.
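As a rough sketch of what "better than 22% of the league" means in practice, here is one way to pull a replacement level out of a list of player values and measure a player against it. The nearest-rank cut-off and the toy league of 100 players are my own simplifications for illustration, not the exact calculation used in the model.

```python
def replacement_level(values, share_below_pct=22):
    """Value that share_below_pct percent of the league falls below (nearest rank)."""
    ordered = sorted(values)
    index = (share_below_pct * len(ordered)) // 100
    return ordered[min(index, len(ordered) - 1)]


def above_replacement(value, level):
    """How much a player adds beyond what a replacement level player would."""
    return value - level


# Hypothetical league: 100 players whose value at some skill runs from 0 to 99.
league = list(range(100))
level = replacement_level(league)
added_value = above_replacement(40, level)
```

A player sitting right at the cut-off adds zero above replacement, and anyone below it comes out negative, which is the whole point of measuring against replacement instead of against zero.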
3) Even Strength Offence
The next part of GAR (Goals) is to try and encapsulate everything a hockey player does to help his team win games. To cover this, my WAR has 5 main inputs. Since about 70% of NHL goals are scored at even strength, most of the goals come from even strength play. Let's start with offence.
Ask the average hockey fan what players do to contribute offensively, and they will immediately cite points, which leads into the first half of the offensive equation. To account for players point production there's C-BPM. Which is built around Dawson Sprigining's BPM. Here's his inputs, weighted based on their ability to predict future goals.
This basically represents the weights for C-BPM. The longer bars show the more important inputs for each position. Some things of note here. For forwards, primary points are king. For defencemen all points are relatively equal, with slightly more focus on assists. And for either position, shot quality (ixFSh%) is slightly more important than shot quantity (iFF/60). I did make some small modifications (C-BPM likes goals slightly more relative to primary assists than BPM), but this is a good breakdown of the weights. Since the goal here is to encompass point production, I compared CBPM to points per hour.
There are a few outliers, which are mainly guys with high shooting percentages or low TOI. Other than that, C-BPM lines up closely with point production and does a good job encompassing players' score-sheet stats. For a quick look at who excels in this metric, here are the best C-BPM players from the 2017-18 season (minimum 1000 minutes played).
I mentioned above that point production (C-BPM) is only half of the equation. That's because there's another key input to players' offensive output: their ability to drive play. Some players (especially defencemen) can be elite offensive players without massive point totals. They achieve this by driving shots and scoring chances towards the other team's net, making sure their team generates goals even though they may not be directly picking up points. To account for players' ability to drive play offensively I use C-XPM (again based on Dawson Sprigings' WAR, and his XPM metric). This uses 2 key metrics: relative to teammate Corsi for, and relative to teammate expected goals for. If you're already familiar with those metrics, the next 2 paragraphs aren't for you. If you're not, here's a quick synopsis.
First is relative to teammate corsi for (RelT CF). This starts with the basic idea of corsi, which is a fancy way of saying a shot attempt. Each shot attempt a player's team takes while he is on the ice counts as a corsi for. Once we have a player's Corsi for, we can adjust for the quality of his teammates. For an example, let's bring back Eric. Imagine Eric generates 65 shots per hour of ice time, while his line-mates generate an average of 60 shots per hour. In this scenario, take Eric's 65 shots, minus his line-mates' average of 60 shots (65-60), and Eric would have a +5 relative to teammate corsi for per hour. (The equation is more complex than this, and if you're interested you can read more in depth here, but that's the general idea.)
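The Eric example can be sketched in a couple of lines. This is the simplified version only: it uses a plain average of line-mate rates, the function name is made up, and the 58 and 62 are just one hypothetical way to get a line-mate average of 60. The real relative to teammate formula is more involved.

```python
def rel_teammate_rate(player_rate, teammate_rates):
    """Player's on-ice rate minus the plain average of his line-mates' rates."""
    teammate_avg = sum(teammate_rates) / len(teammate_rates)
    return player_rate - teammate_avg

# Eric generates 65 shot attempts per hour; his line-mates average 60.
eric_relt_cf = rel_teammate_rate(65.0, [58.0, 62.0])
print(eric_relt_cf)  # 5.0
```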
After reading about corsi everyone thinks the same thing, "but not all shots are created equal". Of course a shot from the blue-line is far less valuable than one from the slot, which is where Corsica's expected goals comes into play. Expected goals adjusts each shot for quality by taking into account the shot's type, angle and distance. This way if two players generate the same number of shots, the one who generates higher quality will be recognized as the superior play driver (the same relative to teammate formula is applied here too).
With the 2 main metrics defined, C-XPM is simple. Since it's a combination of the two metrics above, the first step is to get them on the same scale. To get corsi on the same scale as (expected) goals, think about goals as a function of shots. Every shot attempt over the past three years has had a 4.08% chance of going in, so just multiply the corsi events by 0.0408055, and suddenly it's on the same scale as expected goals. (This number will change every year, but you get the idea). Once they are on the same scale, this quick equation combines them together.
(((RelT CF/Min - RL CF/Min)*TOI) + ((RelT xGF/Min - RL xGF/Min)*TOI))
This gives us the framework for C-XPM. Originally I had corsi and expected goals weighted equally, however with some testing I found xG to be way more volatile (especially for defencemen). As a result, the ratings are skewed to include both but favor corsi.
The next addition to C-XPM is a quality of competition adjustment. There are 2 different ways to look at quality of competition. The first is by looking at the time on ice percentage of the competition, and the second is by their shot and expected goal metrics. Since corsi and xG can be split into offensive and defensive sides, I went with those for the adjustments here. The idea is that the better a player's opponents are at suppressing shots and expected goals, the more value there is in that player's shot and chance generation. Here's the formula I used to adjust.
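Here's a minimal sketch of the framework above, assuming the corsi term is converted to goals with the 0.0408055 figure before the two pieces are added. The function name and the `corsi_weight` and `xg_weight` parameters are placeholders: the defaults of 1.0 reproduce the equal-weight framework, and the actual skew towards corsi isn't spelled out here. The player numbers are invented.

```python
SHOT_TO_GOAL = 0.0408055  # share of shot attempts that became goals (from the text)

def cxpm_offence(relt_cf_per_min, rl_cf_per_min,
                 relt_xgf_per_min, rl_xgf_per_min,
                 toi, corsi_weight=1.0, xg_weight=1.0):
    """Offensive play-driving value, in goals above replacement level."""
    # Corsi above replacement, converted from shot attempts to goals.
    corsi_goals = (relt_cf_per_min - rl_cf_per_min) * toi * SHOT_TO_GOAL
    # Expected goals above replacement, already on the goal scale.
    xg_goals = (relt_xgf_per_min - rl_xgf_per_min) * toi
    return corsi_weight * corsi_goals + xg_weight * xg_goals

# Hypothetical player: 1000 minutes, comfortably above replacement on both metrics.
equal = cxpm_offence(0.05, -0.03, 0.002, -0.001, 1000)
# Same player with made-up weights that lean on corsi.
skewed = cxpm_offence(0.05, -0.03, 0.002, -0.001, 1000,
                      corsi_weight=1.2, xg_weight=0.8)
print(round(equal, 2), round(skewed, 2))
```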
(((QoC CA/Min - Average CA/Min)*TOI) + ((QoC XGA/Min - Average XGA/Min)*TOI))
Again this is the framework, with the weights skewed slightly to favor corsi. This adjustment is nothing major. In extreme cases it swings players results by about 1.5 goals or so. This is probably the most controversial debate in hockey analytics today, so I'm especially open to feedback on improving this part of the model, but in the meantime, the context adjustments did make the total more repeatable, which is a good sign. Again, to get an idea of who excels in this metric, here are the top 5 forwards and defencemen at C-XPM from the 2017-18 season.
4) Even Strength Defense
Even strength defense uses the same 2 metrics mentioned above, corsi and expected goals. This time it's relative to teammate corsi against (RelT CA) and relative to teammate expected goals against (RelT xGA). It's the same thing, just the ability to suppress shots and chances relative to your teammates rather than generate them. So the basic equation is just the inverse of everything mentioned above.
(((RelT CA/Min - RL CA/Min)*-TOI) + ((RelT xGA/Min - RL xGA/Min)*-TOI))
Again I found corsi to be more consistent than expected goals. Furthermore, players don't have the ability to influence their goalie's save percentage, so preventing quantity against (corsi) is more important than players' ability to suppress quality (xG). For the context adjustment, it's more of the same.
(((QoC CF/Min - Average CF/Min)*-TOI) + ((QoC XGF/Min - Average XGF/Min)*-TOI))
This adjustment is applied the exact same way as the offensive context adjustment. Once again, here are the leaders in this metric from the 2017-18 season.
5) Power Play Offence
The next components of the model deal with special teams. About 30% of goals in the NHL are scored on special teams, so roughly 30% of the value in this model is distributed between these upcoming sections. First up on special teams is power-play offence. Just like C-BPM, power play value comes from point production. And to weight all of the stats, again I turned to Dawson Sprigings' BPM.
Power play production has some weirder results. People generally put a premium on primary points, however based on the BPM weights being used, either type of assist provides significantly more value than a goal. This sounds counterintuitive. And while Dawson used machine learning techniques beyond my pay-grade to find these weights, I decided to look at the repeatability of goals, primary assists and secondary assists per hour on the power-play, and it started to make more sense. (Players with a minimum of 50 minutes played on the power-play in each season.)
So that's a lot of information, but there is one big takeaway. Many are likely puzzled by why goals for forwards have so little value, and while I can't tell you how Dawson got the weights, it's worth noting that goals per hour among forwards in recent history has been incredibly noisy, noisier than any of the other point-based metrics. Furthermore, Ryan Stimson's work illustrates that passing data is more predictive of future goals than shot data at even strength, and since power-play offence is especially reliant on passing to open up the opposing penalty kill, the gap might be even larger on the power-play. These two things help explain why assists are king on the power-play, not primary points. And once again for fun, here are the top power-play performers from the 2017-18 season.
6) Penalty Differential
If an individual player's rating looks out of whack to you, this section is likely the reason why. We love to rave about a great power-play or complain about a team's terrible penalty kill, however there is more to special teams than that. A significant part of special teams is deciding how much time is spent playing each one. For example, the 2017-18 Maple Leafs had an amazing power-play, generating the second most goals per hour in the NHL. Sadly they didn't get the full benefit from that power-play because the team struggled to draw penalties. As a result, the league's second most efficient power-play only scored the 9th most goals last season.
Of course the flip side is the penalty kill. Take the Carolina Hurricanes: they gave up the 4th most goals per hour on the P.K., an absolutely terrible result. But in spite of their bad penalty kill, they tied for the 9th fewest goals surrendered while short-handed. How? They masked their P.K. woes by being the least penalized team in the NHL.
On top of its importance at the team level, penalty differential has proven to be repeatable at the player level, so it's an important input into the goals above replacement formula. For the model, it's split into two separate categories. First is players' ability to draw penalties, and second is players' ability to stay out of the box. To calculate the value of each, the formulas are similar (penalties taken first, then penalties drawn).
((RL PIM Taken/Minute*TOI) - PIM Taken) *0.1207
(PIM Drawn - (RL PIM Drawn/Minute*TOI)) *0.06885
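The two formulas above translate almost directly into code. Here's a minimal sketch: the weights are the ones in the formulas, while the function name, the replacement level rates and the player totals are invented for illustration.

```python
TAKEN_WEIGHT = 0.1207   # weight on penalty minutes avoided (from the formula above)
DRAWN_WEIGHT = 0.06885  # weight on penalty minutes drawn (from the formula above)

def penalty_value(pim_taken, pim_drawn, toi, rl_taken_per_min, rl_drawn_per_min):
    """Return (value from staying out of the box, value from drawing penalties)."""
    # Fewer minutes taken than a replacement level player would take is positive value.
    taken_value = (rl_taken_per_min * toi - pim_taken) * TAKEN_WEIGHT
    # More minutes drawn than a replacement level player would draw is positive value.
    drawn_value = (pim_drawn - rl_drawn_per_min * toi) * DRAWN_WEIGHT
    return taken_value, drawn_value

# Hypothetical skater: 1200 minutes, 20 PIM taken, 40 PIM drawn.
# The replacement level rates (0.03 and 0.02 PIM per minute) are made up.
taken_value, drawn_value = penalty_value(20, 40, 1200, 0.03, 0.02)
print(round(taken_value, 2), round(drawn_value, 2))
```

In this made-up case the player is 16 penalty minutes better than replacement on both sides, but staying out of the box is worth nearly twice as much because of the heavier weight.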
Conventional wisdom from @EvolvingWild states that a penalty is worth 0.17 goals, however my weights are different. The first reason is that using 0.17 goals results in a sub-optimal distribution of goals. Using 0.17 resulted in more than 40% of the goals being attributed to special teams, which is far too high, so the numbers were scaled down. It's important to note here that the raw values don't matter much; the values matter relative to each other. I could have easily used 0.17 goals, there would just be more goals to go around in the other categories too.
The second reason they are different relative to each other is to make up for a shortcoming of the model. I have no way of accounting for penalty kill prowess, so instead I shifted more credit to players' ability to stay out of the box. It's not perfect and I'm always open to suggestions (especially about P.K. data), but since I was unable to quantify penalty killing at the skater level, staying out of the box is the best proxy for helping your team short-handed. With the equations defined, here are last season's leaders in the NHL's most underrated skill.
For the finishing touches of the model, we have face-offs. The mainstream media tends to treat face-offs as some sort of defensive WAR, and the pushback from people generally revolves around "face-offs don't matter". Neither of those is true. Face-offs are a valuable input into a centreman's output. Michael Schuckers did a study and found these to be the values for each type of face-off.
In a perfect world each face-off would be weighted based on all the information listed above, however for simplicity's sake focus on the top number. Altogether, it takes 76.5 face-off wins to be worth one goal. Take 1 divided by that 76.5, and you get 0.013071895425 goals per win (about 1.3% of a goal). From there, it's really easy to derive the formula for face-off value. (F = Face-offs)
(F Wins - (F Taken*0.45))*0.013071895425
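As a quick sketch of that formula, dividing by 76.5 is the same as multiplying by 0.013071895425. The function name and the 550-for-1000 center are hypothetical.

```python
WINS_PER_GOAL = 76.5          # face-off wins needed to be worth one goal (from the text)
REPLACEMENT_WIN_RATE = 0.45   # replacement level win rate in the circle

def faceoff_value(wins, taken):
    """Goals above replacement from face-offs."""
    wins_above_replacement = wins - taken * REPLACEMENT_WIN_RATE
    return wins_above_replacement / WINS_PER_GOAL

# Hypothetical center who wins 550 of 1000 draws: 100 wins above replacement.
value = faceoff_value(550, 1000)
print(round(value, 2))  # 1.31
```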
In this formula, a 45% win rate in the circle is replacement level. And the amount of value that comes from face-offs is likely more than people expect. Since there are no defencemen here, let's look at the top 5 and bottom 5 face-off value players from the 2017-18 season instead.
Turns out there is actually something Connor McDavid isn't good at. The final extra to the model is the addition of a prior to both even strength offence and defense, meaning any single season value uses some of the previous season's data too. This makes the model slightly worse for talking about, say, awards voting, but gives a better idea of players' true talent. Altogether, the inputs I've described are added together to make GAR!
As I've mentioned before, any GAR model, including mine, is going to have weaknesses. And it's important to understand those weaknesses when working with the data. First is the inability to quantify short-handed impact. Ian Tulloch's zone start adjustment is a step in the right direction, but I was still unable to come up with a penalty kill metric that adds anything but noise. So for this model, the best penalty killers just stay out of the box.
The second one is about the prior. When a player has no prior history (rookies), their history is considered to be average. This by definition means that the value of a rookie's season will be underrated if he's above average, and overrated if he's below average. Also with the prior, if a player dramatically outperforms their past performance, they will be scaled back towards their career norm. This helps find "true talent" because generally, the more extreme a player's results, the more likely luck played a significant role. On top of those, there are a few other major weaknesses of all models, which have been explained well already by others.
The first is the Sedin problem. This issue arises when two players play almost entirely together and models struggle to distribute credit between the two. You can read Matt Cane's thread about it here. Then there is the problem where elite players can do weird things to their teammates' results, explained by Ian Tulloch here. And finally there is one weakness in many public models which has yet to be addressed.
8A) The more important QoC
One of the biggest puzzles in hockey analytics today is adjusting for context, and specifically QoC, which generally refers to quality of competition. I have a small adjustment for that in the model, however I don't adjust for the more important QoC, quality of coach. This is something Dawson Sprigings noted on the Hockey Graphs podcast back when he was in the public sphere. He spoke about how in his WAR model, he found adjusting for quality of coach to be more important than quality of teammate.
Many people (including myself at first) are likely highly skeptical of this, however let me present a case study. When working with the first model I made (very similar to Gamescore) I noticed something really weird when creating the age curve. From the 2014-15 season to the 2015-16 season, a cluster of players all saw improvements in their results, and it caught my eye. All of these players had one thing in common: they all played for the Toronto Maple Leafs. Using Dom's Gamescore from Corsica, you can see the results for yourself.
Gamescore is primarily an offensive metric, and the Leafs' shooting percentage dropped from 7.51% to 6.31%, their goals for cratered, and yet every single core piece of the team saw their results improve. So what changed? They transitioned from some of the worst coaching in the NHL to among the best, and benefited from the Babcock effect. And to show this wasn't just a weird year, I included the two years before Babcock in blue, and his first two seasons behind the bench in orange. Over the 4 year sample, every single player had their best two seasons under Babcock, and their worst two before he took over.
This is probably one of the most extreme examples I could cite (Bruce Boudreau and Mike Sullivan appeared to have massive effects too), however it shows the huge effect the quality of a coach can have. That being said, I don't have a method to account for this, and this is a topic not often discussed in the public sphere, so it will likely be a while before adjustments in this field are made again. So for now, the inability to adjust for quality of coach will remain a weakness.
9) Testing and Results
With the calculations out of the way, we can finally get to the fun stuff, the results! Throughout the summer I plan to be adding to the data set as far back as I can go, but for now we have the past 5 years of data for forwards, and 4 for defencemen. Let's start by looking at the best GAR seasons at each position in recent history.
In the end the sniff test is meaningless, but it is nice to see the model recognizing the best seasons to be from superstars. Peak Crosby is the king, and McDavid is going to truly be something special. For defencemen, Karlsson has led the way in literally every season I have data for (and has somehow only won the Norris once in that time). Then Hedman, Doughty, OEL, Burns and Subban also make appearances in the top 10. The full list of results is available on this Google Doc. It's only the past 2 seasons for now, but in the upcoming week or so I'll be adding more seasons, and different filters like age, draft position, salary and anything I find interesting for people to play around with too (thanks to Rob Vollman's data). And just to show that all of the metrics I've presented you with aren't just noise, here is a chart showing the year to year repeatability of each metric.
If you've made it this far, thanks for reading! That makes goals above replacement, a quick snapshot of player talent boiled down into one number. It's not as complex as some others, but it does combine all of the numbers people are most likely to cite anyways when discussing why player X is better than player Y. If you have any questions, comments or concerns, you're welcome to comment or reach out to me on Twitter @CMhockey66. I'm always open to discussing why numbers are the way they are, improvements going forward, or anything really. And finally, thanks to Manny and his website Corsica, which is where all of the data in this post came from!
A question I have always had is what impact coaches actually have on NHL teams. Usually, their impact is simply mentioned anecdotally, with nothing backing it up other than winning records. I used to think coaching impact wouldn't be high, but the 2015-16 season made me rethink this theory. That season, two teams were possibly the most extreme case studies of the impact a coach can have. First, there was Babcock and the Leafs. In the 2014-15 season Toronto had a 46.32% Corsi under coaches Randy Carlyle and Peter Horachek, good enough for fourth last in the league. But then that summer they landed the biggest coaching fish, Mike Babcock, and skyrocketed up to a 50.59% Corsi, good enough for roughly league average (despite bleeding talent that summer and throughout the season). Early in that same season, the Penguins were in a tailspin, like a lite version of the 2014-15 Leafs. Crosby wasn't scoring, and the team was a disappointing fifth in the Metropolitan Division. This was not simply luck, as the star-powered team only had a 48.34% Corsi. But then it all changed when they fired Mike Johnston in favor of their AHL coach, Mike Sullivan. Under Sullivan, they lit the league on fire, running the table with a 55% Corsi en route to their first of back to back Stanley Cups. Ever since watching both of these dramatic swings, it has seemed to me like the coach could be even more valuable than the players on these teams. So these coach MVPs left me asking one question: how much impact do coaches really have on shot rates?
The method I used was fairly simple. First off, teams were all put into two columns, with year X on the first side and year X+1 right beside it. For example, the 2012-13 Penguins went on the left side and they would be compared to the 2013-14 Penguins on the right. This way we could look at how teams' shot rates changed from season to season.
The next step was to attempt to isolate the coaching variable. This was done by taking all these teams and binning them into 3 categories. The first category was teams that made midseason coaching changes, and these teams were removed from the data. For example, because the 2014-15 Leafs fired Carlyle for Horachek, the 14-15 vs. 15-16 data for Toronto was not included. There are already a lot of moving parts when using this method to compare Babcock to his predecessors, so including data with a third coach seemed to be unnecessarily adding another extraneous variable.
Next there are the two bins actually used. First was the same coach bin: teams that had the same coach for year X and the year after that. For example, the 15-16/16-17 Hurricanes used Bill Peters for the entirety of both seasons and therefore were in this bin. The second was the "Coach Change" bin: teams that used two different coaches, each for the entirety of one of the two seasons. For example, in the summer of 2016 the Anaheim Ducks fired Bruce Boudreau and hired Randy Carlyle, so the Ducks' 2015-16 data was compared against their 2016-17 data in this bin. From there we can examine how shot rates change from year to year for both same coach teams and coaching change teams, and infer that the difference is probably largely related to coaching. This results in bins of 99 same coach teams since 2011-12 and 22 summer coach change teams to work with. Now that there is a method, let's see how they differed. (Data from Corsica.Hockey)
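The binning can be sketched roughly like this. It's a minimal sketch rather than the spreadsheet work I actually did: the `TeamSeason` type and `bin_season_pairs` function are made-up names, and the Hurricanes and Ducks CF% values are placeholders (only the Leafs numbers come from the text above).

```python
from dataclasses import dataclass

@dataclass
class TeamSeason:
    team: str
    season: int      # starting year, e.g. 2015 means 2015-16
    coaches: tuple   # every head coach who coached the team that season
    cf_pct: float

def bin_season_pairs(seasons):
    """Split year X / year X+1 pairs into same-coach and coach-change bins."""
    by_key = {(s.team, s.season): s for s in seasons}
    same_coach, coach_change = [], []
    for s in seasons:
        nxt = by_key.get((s.team, s.season + 1))
        if nxt is None:
            continue  # no following season in the data
        if len(s.coaches) > 1 or len(nxt.coaches) > 1:
            continue  # midseason coaching change: pair is dropped
        pair = (s.cf_pct, nxt.cf_pct)
        (same_coach if s.coaches == nxt.coaches else coach_change).append(pair)
    return same_coach, coach_change

# Coaches are from the examples above. Leafs CF% values are from the text;
# the Hurricanes and Ducks CF% values are placeholders, not real data.
demo = [
    TeamSeason("TOR", 2014, ("Carlyle", "Horachek"), 46.32),
    TeamSeason("TOR", 2015, ("Babcock",), 50.59),
    TeamSeason("CAR", 2015, ("Peters",), 51.0),
    TeamSeason("CAR", 2016, ("Peters",), 52.0),
    TeamSeason("ANA", 2015, ("Boudreau",), 52.0),
    TeamSeason("ANA", 2016, ("Carlyle",), 50.0),
]
same_coach, coach_change = bin_season_pairs(demo)
print(len(same_coach), len(coach_change))  # 1 1
```

The Leafs pair drops out because of the midseason change, the Hurricanes land in the same coach bin, and the Ducks land in the coach change bin.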
Repeat-ability of CF% With Same Coach
First off, let's see how shot rates compared year to year for the teams who kept the same coach. To see what the relationship is, I simply created a scatter plot of the data and checked the correlation. To the left are the results of that process. A glance at the chart shows us an R^2 of 0.53 (a correlation of roughly 0.73). What this means is there is a moderately strong relationship between the back to back CF% of NHL teams who have the same coach in both years. This seems logical. While there are always moving parts, teams generally don't change enough on a year to year basis to make a huge swing in CF%. Using the equation of the line of best fit, there is another interesting thing I found. If you sub 50 in for x in the equation y=0.7708x+11.269, you get a value of 49.809. What this means is that average teams who stick with their coach actually tend to get slightly worse the next year. This is probably just regression to the mean, however maybe the whole "coaches' voices go stale" narrative has a little bit of truth to it. So now that we know how results fluctuate for teams who keep their coach, let's compare them to teams who change their coach.
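For anyone who wants to reproduce this kind of chart, here's a minimal sketch of an ordinary least squares line of best fit and its R^2 in plain Python. The `fit_line` helper is a made-up name (any spreadsheet trendline does the same job); the only real numbers used are the slope and intercept quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable fit, R^2 is the squared correlation between x and y.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# The same-coach line quoted above, evaluated for a team with a 50% CF% in year X.
predicted = 0.7708 * 50 + 11.269
print(round(predicted, 3))  # 49.809
```

Feeding the year X CF% values in as `xs` and the year X+1 values in as `ys` gives the slope, intercept and R^2 shown on the chart.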
Repeat-ability of CF% with Different Coaches
Before we begin, it should be noted that the coach change bin is much smaller, with only 22 teams, so that could increase noise. With that in mind, it's time to examine the new coach bin. The difference is substantial compared to the same coach graph and you can see it at first glance. The chart shows us an R^2 of 0.34 (a correlation of roughly 0.58). This means there is a noticeably weaker relationship between year to year CF% among teams that changed coaches between said seasons. From here we can take the difference between the two (0.53-0.34), which gives us 0.19. This means the previous season's CF% explains 19 percentage points less of the variance for teams that change coaches in the off-season than for teams that keep the same coach for two years. From here I make the cautious conclusion that coaching can be responsible for up to 19% of the variance in a team's CF%. Also from this line of best fit there is another interesting observation. When you sub 50 in for x in the equation y=0.4179x+29.454, you get a y of 50.349. This means that average teams that fired their coaches actually saw their shot rates improve the next year. Again, this is probably just regression to the mean, but I thought it was interesting. So now that we can estimate that coaches are responsible for up to nearly a fifth of the variance in a team's shot rates, let's look at what specifically changes.
Coaching Impact on Shot Generation
Now that we can see the impact coaches have on Corsi For Percentage, we can look more specifically at which side of the puck coaches seem to influence more. This can be done using the same simple method as above, just splitting CF% into its two components: Corsi for per 60 minutes and Corsi against per 60 minutes. First, let's look at shot generation with the control group (same coach) to the left. The repeatability gives us an R^2 of 0.5027, slightly less repeatable than overall CF%. Now that we know how repeatable offense is among same coach teams, let's see how they stack up relative to coaching changes and estimate their impact from there.
Coaching Impact on Shot Suppression
Hockey coaches are infamous for putting way too much emphasis on the defensive side of the puck. Seemingly every team has a horror story of the coach not playing a gifted offensive player enough because of perceived defensive weakness. We are about to see that there is a reason for this. To the left, we can see the repeatability of shot suppression among same coach teams. The R^2 is 0.5881, which is the strongest relationship yet. That is about 0.085 larger than in the shot generation sample (0.5027). Roughly 8.5 percentage points more explained variance in the shot suppression sample is a clue that coaches may have a larger impact on defense than offense, but to be sure we can use the same test as above.
To the left is the sample where teams had a coaching change, and the R^2 comes out as 0.2297. This means that there is only a weak relationship in year to year shot suppression for teams that changed coaches. One last time we can take the same coach R^2 minus the coach change R^2 (0.5881-0.2297) and we get a value of 0.3584. This would appear to mean that coaches can be responsible for up to roughly 36% of the variance in shot suppression. The idea that over a third of a team's defense (when it's defined as shot suppression) could simply be the result of the team's coach is an impressively high number. I would wager that very few, if any, single players have such a massive sway on a team's shot rates. So, bringing this back to a point made above, this could illustrate why coaches seem to love defense so much. It appears they have about twice the impact on what happens without the puck as with it. Knowing this, it makes intuitive sense that they focus on defense so much, as it could be a result of either focusing on what you control or a choice-supportive bias. Trying to figure out which one leads to a sort of chicken vs. egg debate, depending on whether you think coaches focus on defense because they can control it, or control defense because they chose to focus on it. But the point remains that coaches can drive as much as 16% of shot generation and roughly 36% of shot suppression. Since such a large portion of shot rates can be driven by coaches, I think coaches are maybe the most underrated part of a hockey team's on-ice results.
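The subtraction used throughout can be written out as a tiny helper. The R^2 values are the ones quoted above; the `r2_gap` name is made up, and treating the gap as an upper bound on coaching impact is my own cautious reading rather than a formal variance decomposition, especially with only 22 teams in the coach change bin.

```python
def r2_gap(same_coach_r2, coach_change_r2):
    """Drop in year to year R^2 when a team changes coaches in the off-season."""
    return same_coach_r2 - coach_change_r2

# R^2 values quoted in the text for overall CF% and for shot suppression.
cf_pct_gap = r2_gap(0.53, 0.34)
suppression_gap = r2_gap(0.5881, 0.2297)
print(f"CF%: {cf_pct_gap:.2f}")                    # CF%: 0.19
print(f"Shot suppression: {suppression_gap:.4f}")  # Shot suppression: 0.3584
```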