Secondary assists are significantly less repeatable than goals or primary assists. As a result, they're generally ignored in hockey analytics circles. I understand the logic behind this, but I think there's a better way of looking at things. My philosophy here is directly inspired by Dawson Sprigings' old WAR model. Rather than cutting out secondary assists altogether, he used a "BPM," or box plus-minus, metric. The general idea is not to throw counting stats out, but to combine them all into one stat, assigning each one an appropriate weight. Sprigings' (predictive) weights ended up looking like this (blue is forwards, which I will be focusing on for this post).
I've always been a fan of this concept, and since Sprigings' model is no longer in the public sphere, I've set out to do something similar. My goal was to take public counting stats and combine them to predict future on-ice goals for as well as possible (RelT GF/60 from Corsica, which is where all the stats in this post come from; all correlations in the rest of the post are year to year). The metrics I set out to use were:
Goals- Individual Goals scored
A1- Primary assists
A2- Secondary assists
ESA- Estimated shot assists, from @loserpoints
IFF- Individual Fenwick for (the unblocked shots a player takes)
ixFSh%- The percentage of the time a league average shooter would score given that player's IFF
iXGF- The number of goals a league average finisher would score given that player's (unblocked) shots
Gives- Number of times a player gives the puck away. Yes, giveaways are going to be a good thing in this model
Takes- Number of times a player takes the puck away from the other team
(All Per Hour of 5v5 Ice time)
Each of these metrics predicts future goals to varying degrees. Here's each one's R squared with future on ice goals for.
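If you want to run this kind of check yourself, here's a minimal sketch of a year-to-year R squared calculation. The data layout (a dict keyed by player and season), the column names `a1` and `rel_gf60`, and the numbers themselves are all assumptions made up for illustration; they are not Corsica's format or real player data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_to_year_r2(seasons, metric, target="rel_gf60"):
    """R^2 between `metric` in season N and `target` in season N+1.

    `seasons` maps (player, year) -> dict of per-hour 5v5 stats.
    Only players with data in both seasons contribute a pair.
    """
    xs, ys = [], []
    for (player, year), stats in seasons.items():
        next_season = seasons.get((player, year + 1))
        if next_season is not None:
            xs.append(stats[metric])
            ys.append(next_season[target])
    return pearson_r(xs, ys) ** 2

# Made-up numbers purely to show the call shape (hypothetical players A, B, C).
toy = {
    ("A", 2016): {"a1": 0.9, "rel_gf60": 0.5},
    ("A", 2017): {"a1": 0.8, "rel_gf60": 0.4},
    ("B", 2016): {"a1": 0.4, "rel_gf60": -0.2},
    ("B", 2017): {"a1": 0.5, "rel_gf60": -0.1},
    ("C", 2016): {"a1": 0.6, "rel_gf60": 0.1},
    ("C", 2017): {"a1": 0.7, "rel_gf60": 0.2},
}
r2_a1 = year_to_year_r2(toy, "a1")
print(f"A1 -> next-season RelT GF/60 R^2: {r2_a1:.3f}")
```

Running the same function once per metric gives the kind of comparison described here. With only three player pairs the toy result means nothing; a real run would need every qualifying forward season.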
These correlations show us a few things. First, secondary assists aren't necessarily useless; they just aren't nearly as useful as other stats. Second, as Sprigings already showed, looking at primary points probably isn't optimal, because primary assists are worth more than goals. Finally, individual shot quality doesn't add much, if anything, to the equation. So with the basic correlations revealed, I used a linear regression model to combine most of the stats shown above into one. (Anything including shot quality added nothing to the model and was therefore removed, as was ESA1 due to multicollinearity issues with ESA.) Here are the resulting weights.
Most of the weights aren't surprising. Primary assists are the most important, with takeaways, goals, and shots just behind. After that, shot assists and giveaways provide less value, and secondary assists add the least. Now to test the model against its actual goal: its ability to predict future on-ice goals for.
An R squared of 0.197 doesn't seem like much on its own, so let's compare it to normal counting stats.
This new metric is a significant improvement over using any of the inputs on their own. Defencemen will be coming soon, but for now you can find the forwards data here. And finally, if you're not convinced yet, I noticed some fun quirks with the stat. The target variable was on-ice, relative to teammate goals for per hour, but this new number is actually more predictive of most of the metrics we care about than those metrics are of themselves. For example, here's the auto-correlation of individual goals per hour compared to this metric's correlation with future individual goals for per hour. (All the counting stats about to be shown are standardized.)
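For anyone curious what the regression step looks like in code, here's a rough sketch using ordinary least squares in NumPy. This is not my actual pipeline: the feature names are just labels for the seven inputs that stayed in the model, the data is randomly generated, and `true_w` is an arbitrary set of weights chosen only so the example has something to recover.

```python
import numpy as np

# Labels for the inputs kept in the model (per hour of 5v5 ice time).
FEATURES = ["goals", "a1", "a2", "esa", "iff", "gives", "takes"]

def fit_weights(X, y):
    """Ordinary least squares with an intercept; returns (intercept, weights)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data: in a real run, each row of X would be a forward's
# season-N stats and y would be that forward's season N+1 RelT GF/60.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(FEATURES)))
true_w = np.linspace(0.1, 0.7, len(FEATURES))  # arbitrary, NOT the post's weights
y = X @ true_w + rng.normal(scale=1.0, size=500)

intercept, weights = fit_weights(X, y)
fit_r2 = r_squared(y, intercept + X @ weights)

for name, w in zip(FEATURES, weights):
    print(f"{name:>6}: {w:+.3f}")
print(f"R^2: {fit_r2:.3f}")
```

The fitted weights are what get multiplied against each player's counting stats to produce the combined number. Highly correlated inputs (like ESA and ESA1) make these weights unstable, which is why one of each such pair gets dropped.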
Same goes for primary assists.
Even something like relative to teammate expected goals for... which is weird.
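To make the comparison above concrete, here's a small sketch of how a stat's auto-correlation can be set against the combined metric's correlation with that same future stat. The three lists are made-up values for five hypothetical players, and `composite_now` simply stands in for the new metric; none of it is real data.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def correlation(xs, ys):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = standardize(xs), standardize(ys)
    return mean([a * b for a, b in zip(zx, zy)])

# Made-up per-hour values for five hypothetical players.
goals_now = [0.9, 0.5, 0.7, 0.3, 1.1]       # individual goals, season N
composite_now = [1.4, 0.6, 1.0, 0.2, 1.3]   # combined metric, season N
goals_next = [0.8, 0.6, 0.9, 0.4, 0.9]      # individual goals, season N+1

auto_corr = correlation(goals_now, goals_next)
cross_corr = correlation(composite_now, goals_next)
print(f"goals -> future goals:     {auto_corr:.3f}")
print(f"composite -> future goals: {cross_corr:.3f}")
```

Standardizing first puts every stat on the same scale, so the two correlations can be compared directly no matter what units the inputs started in.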
This new metric isn't a WAR model in itself, but based on the testing I think we can conclude it is a very useful stat. For one final thought, you may have noticed that I haven't yet referred to this stat by an actual name. That's because in hockey analytics, people often complain about nomenclature, saying that the names of some of our current stats don't help our cause. So, I'm asking you to help me come up with a name. If you can think of a more descriptive name than the "BPM" it was inspired by, then I'll gladly name the stat whatever you come up with. If you have any ideas for the name, feel free to comment, or DM me @CMhockey66 on Twitter. Thanks for making it this far!