I have posted a ton of NHL draft research over the past few months. My focus has been on draft biases. I have found various metrics, all available to teams at the time of past drafts, that are correlated with success independent of draft position. For example, high scoring players tend to outperform their draft position, while taller players tend to underperform their draft position. The fact that these relationships persist shows NHL teams have failed to properly account for this information.

A good question after finding these various draft biases is, do they still exist today? The honest answer is that nobody can be sure. Nevertheless, I think I have a cool way to test this. I am going to look at my sample to see if teams got better at accounting for this information during the sample. If teams were rapidly improving during my sample, it would strongly suggest that the inefficiencies I found are likely gone by now. If teams didn't get noticeably better at accounting for this information during my sample, it will make me incredibly skeptical of anyone arguing, at least with a high degree of certainty, that these biases are gone today.

My Prior

I should be up front about this: I have a relatively strong prior here. I am reasonably certain these loopholes are not gone at the moment. At least not all of them. Additionally, I am somewhat skeptical teams got significantly better over the sample. The biases I found will eventually be gone, but somebody would need good evidence to convince me they already are. I think once these inefficiencies have been found, everyone should have the prior that they exist until it is proven otherwise.

Additionally, I am not neutral here. I doubt anyone in my shoes would be, to be honest. I found a bunch of biases and I think it would be a lot cooler if my findings still exist in the modern NHL. It would be somewhat depressing if the verdict on my findings here were just, yup, it's all gone.
Either way though, we will let the data decide.

NHL Draft Bias Index

To keep this post from being too long, we are going to look at all the biases I have found at the same time rather than individually. To do this, I am going to use an algorithm called Principal Component Analysis (PCA). PCA is an algorithm used to make indices, which aim to show you how some numbers move together. It's usually used when some variables are highly related, so you look at them all at once, because putting highly related variables in a model separately will mess with your model. I put the big draft blind spots I found in a PCA, and these were the results. (Ignore the word Stata before everything.)

For those who aren't sure what's going on, don't worry. On their own, PCA results literally don't mean anything. They just show you how some variables tend to move together. As a result, I will need to convince you that what I am using actually represents what I want to measure.

Focus on the first dimension (Dim 1) here. Dim 1 increases when the prospect:

1) Is taller
2) Takes more penalties
3) Is a defender drafted high

while the same dimension decreases when the prospect:

1) Scores more
2) Is an over-age prospect

Now, note the direction of the variables I have found to be improperly valued. I found that high scoring players do relatively well, and over-age players do relatively well. On the flip side, tall players do relatively worse, players who take a lot of penalties do relatively poorly, and finally highly drafted defenders tend to underperform. Directionally, Dimension 1 decreases when a player has characteristics that are associated with being undervalued, while the opposite is true for characteristics shown to have been overvalued. (Technically the signs are all flipped, but the direction doesn't matter; it just means negative represents players likely to be undervalued instead of positive.)

For an example, here is our first dimension for each first round prospect in the 2014 NHL draft.
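
For anyone who wants to see roughly how an index like Dim 1 comes together, here is a minimal sketch in plain Python. It is not the Stata output from this post: the prospect numbers are made up, the function names (`standardize`, `first_component`) are hypothetical, only four of the characteristics are included (the "defender drafted high" flag is left out to keep it short), and the first component is found with simple power iteration rather than whatever routine Stata uses.

```python
import math

# Made-up prospects (NOT real data): height in cm, penalty minutes,
# points per game, over-age flag (1 = over-age).
prospects = [
    (193, 95, 0.55, 0),
    (190, 80, 0.70, 0),
    (188, 60, 0.85, 0),
    (183, 40, 1.10, 0),
    (180, 35, 1.20, 1),
    (178, 20, 1.45, 1),
    (175, 25, 1.60, 1),
    (185, 70, 0.90, 0),
]

def standardize(rows):
    """Z-score each column so every variable is on the same scale."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_component(rows, iters=200):
    """Return (loadings, scores) for the first principal component."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized variables.
    corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]
    # Power iteration. Starting on the first variable (height) pins the
    # sign so that taller players get higher scores.
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Each prospect's index value is his standardized row times the loadings.
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    return v, scores

loadings, scores = first_component(prospects)
for name, value in zip(["height", "penalties", "points", "over-age"], loadings):
    print(name, round(value, 2))
```

In this toy data the tall, penalty-heavy, low-scoring players end up with the highest scores and the small over-age scorers with the lowest, which is the same direction as Dim 1 described above.
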
What this means is that this number, Dimension 1, is doing a reasonably good job of approximating all of the variables being examined at the same time. So, rather than looking at each metric individually, we use this Dimension 1 variable to account for all our under / overvalued characteristics at the same time. Since it represents all our metrics at once, I am going to call this our NHL draft bias index for the remainder of the post.

NHL Draft Bias Index Against Time

Now that we have 1 metric to represent a bunch of NHL draft biases, we can test to see if the NHL draft has been improving over time by testing this metric in a model. If the relationship between our draft bias index and draft success is shown to have been declining over time, the biases I have found are likely gone today.

To do this, I am going to be looking at success rates. Success rates start from my NHL draft pick value curve. This chart shows how many Goals Above Replacement (GAR) a player is expected to produce based on their draft position. From there each player is given a 1 (success) or a 0 (failure). They get that 1 if they produced more value than expected based on their draft position, and a 0 otherwise. Note highly drafted players are more likely to get a 1 due to the heteroskedastic nature of the draft, and that will be accounted for in all the analysis below. To do this, all the relationships below will include our draft bias index, draft position and draft position squared. If there is still any relationship between our draft bias index and pick success rates, it will show that the NHL draft market failed to properly account for the information in our index.

The logistic regression model will look like this, where we predict if each player in a given draft is a success based on our index and their draft position.
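
In other words, the model is roughly logit(P(success)) = b0 + b1 * index + b2 * pick + b3 * pick^2, and b1 is the number we care about. Below is a minimal sketch of that setup in plain Python. Everything in it is an assumption for illustration: `expected_value` is a made-up stand-in for my pick value curve, the players are simulated rather than real, and the fit uses basic gradient ascent instead of a stats package.

```python
import math
import random

def expected_value(pick):
    """Hypothetical pick value curve: expected GAR falls with draft slot."""
    return 40.0 * math.exp(-pick / 40.0)

def label_success(actual_value, pick):
    """1 if the player beat the value expected at his draft slot, else 0."""
    return 1 if actual_value > expected_value(pick) else 0

def fit_logit(X, y, lr=0.3, iters=1000):
    """Logistic regression via gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        for row, target in zip(X, y):
            z = sum(b * x for b, x in zip(beta, row))
            prob = 1.0 / (1.0 + math.exp(-z))
            for j in range(p):
                grad[j] += (target - prob) * row[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Simulated draft class (NOT real data): players with a lower bias index
# are built to beat their draft slot more often.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    pick = rng.randint(1, 210)
    index = rng.gauss(0.0, 1.0)
    actual = expected_value(pick) * math.exp(rng.gauss(0.0, 1.0) - 0.5 * index)
    scaled = pick / 100.0  # rescaled only to keep the fit numerically stable
    X.append([1.0, index, scaled, scaled ** 2])
    y.append(label_success(actual, pick))

# beta = [intercept, bias index, draft position, draft position squared]
beta = fit_logit(X, y)
print("coefficient on bias index:", round(beta[1], 2))
```

A negative coefficient on the index means players with low index values beat their draft slot more often than draft position alone would predict. Fitting the same model one draft year at a time is what produces a coefficient per draft.
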
Of course we know they have failed to properly account for this information when looking at the sample as a whole, but the more interesting question is whether the relationship between our index and success rates has been changing over time. If the relationship between our index and success rates is trending towards 0 in our sample, it will show that NHL teams began adjusting and better accounting for this information. If the relationship is just random over time, it will show NHL teams didn't actually make any progress in accounting for these biases over nearly a decade's worth of drafts.

What happens when we examine the relationship over time? Well, there is a trend. Remember the undervalued players by our index are negative in this model, so coefficients closer to -1 indicate the league was particularly poor at accounting for the biases in that year. On the flip side, if the coefficient sits within error bars of zero, it means the league was properly accounting for the biases we found in that draft. Here are the results.

A few things pop out. First, get used to the trend where our NHL draft bias index did not predict success independent of draft position in 2012. Although, given 2012 is considered a historically bad draft for the league too, we can probably give the model a pass for that season. Weird stuff happens sometimes.

Beyond 2012, there is a trend. Altogether, as expected, players who have really low numbers in our NHL draft bias index were far more likely to be successful draft picks. We expected to see this though. What's more important is the small trend showing the index did worse towards the end of the sample. That said, the cumulative point estimate is well within range of each draft's error bars. So, maybe the falling predictive power of our index isn't as meaningful as it may look.

I have another reason to be skeptical the decline you're seeing above is as big as it looks, because there is more than one way to define pick success.
Using other metrics, the trend looks more cyclical than anything.

New Target Variables

The next metric up to define pick success is NHL games played. I applied the same draft pick value model, except this time used games played to arrive at the following curve. Then I repeated the same process to measure success rates. Did our bias index do noticeably worse at predicting success rates when using games played rather than GAR as our measure of prospect success? Not really.

With a new target variable, the same high level finding holds. As expected, players who, based on our index, we believe are likely to be undervalued have generally outperformed their draft position, suggesting they were undervalued. Although this time, when we look at the predictive power of our index against time, it just seems to be fluctuating randomly. Like, did teams get worse at drafting, then better, then worse again? Probably not. The trend above is more likely to be random variation.

Of course, we can do this with more than just games played. This time let's use points to measure player output, applying the same pick value and success rate methodology as before. Aside from 2012, which is again an outlier here, if anything, teams actually seem to have gotten worse at accounting for the biases we have looked at during the sample. Again, it is important to respect the large error bars which will be present whenever looking at individual drafts. We are probably looking at 2012 as a weird year, then a bunch of random variation.

Of course, we can even get into more niche measures of success too. This time I will attempt to account for opportunity given to players. This is important because if you look above you will notice our index did the worst job at finding inefficiencies when targeting games played. In other words, when drafting, NHL teams are best at maximizing games played. Of the three variables above, games played is of course the one NHL teams have the most control over.
They get to control who is or is not in the lineup each night. This is strong evidence of how much of a self-fulfilling prophecy the draft can be. So, to better adjust for different levels of opportunity, let's look at GAR per game played to measure success. Note values have been regressed towards replacement level up until a player hit 82 games played. Again, I doubt teams got worse, then better, then worse. This is likely just more random variation, with 2012 as an outlier.

Finally, let's do the same thing with GAR per minute played. This time, values are regressed towards replacement level up until 1500 minutes played. Again we are seeing a familiar cyclical looking pattern.

Here is the coefficient on our index graphed against time for each different target metric. Remember, as these numbers trend towards 0 it means teams are getting better at drafting. Maybe there is a small trend of improvement. Although more than anything it just appears as though 2012 is a draft our index would want back. This is of course true for much of the league too, so I am inclined to give it a pass there.

So did teams actually get any better at correcting for league-wide biases in my sample? It doesn't really look like it to me. I'm confident they didn't get worse, but an overarching idea of improvement is far from obvious in my opinion.
Author: Chace - Shooters Shoot
Archives: November 2021