NFL Big Data Bowl 2022: Punt Return Yards Over Expected

Sean
Jan 2, 2022

Howdy! Welcome back to the URAM Analytics blog. 2022 figures to be a busy year for us, so what better way to kick off the year than with our NFL Big Data Bowl 2022 submission? For those of you who are unfamiliar, NFL Football Operations has been holding a contest for four years now that they have dubbed the “Big Data Bowl”. Per their website, “the annual sports analytics contest from the NFL Football Operations challenges members of the analytics community…to contribute to the NFL’s continuing evolution of the use of advanced analytics. The crowdsourced competition uses data and technology to spur innovation that results in creating new insights, making the game more exciting for fans and protecting players from unnecessary risk” (NFL Football Operations website). This year’s focus was on special teams: the NFL provided their Next Gen Stats data and Pro Football Focus (PFF) provided their scouting data. With these sources, I was able to take a crack at creating an insightful analysis for NFL special teams.

I took a long, hard look at the dataset and spent quite a while brainstorming until I landed on a project idea: how can punt returns be evaluated beyond the yards gained by a returner? I admit that isn’t an entirely novel idea, and there has been work done to do the same sort of evaluation on rushing performance (hyperlink). Getting back to the project idea - we know that punt returns are evaluated on how many yards the returner gained, but how many yards SHOULD they have gained? We certainly have seen some wonky punt returns where a player should have gained more yards than they did, but how does one put an actual value on that? That’s where Punt Return Yards Over Expected (PRYOE) comes into play.

Punt Return Yards Over Expected (PRYOE) is the byproduct of modeling a punt return and comparing the number of yards the model predicted for a return to the number of yards actually gained. A positive value indicates that the player gained more yards than expected, which suggests that they did something specific to add value to the return. A negative value suggests the opposite: that the player gained fewer yards than expected and did something that took value away from the return. Using the data provided by the NFL from the Big Data Bowl, I was able to create a model to predict return yards and generate the PRYOE metric. The model focused on the returner and contained information describing their position on the field, how fast they were moving, the direction and orientation of their body and movements, and how close the other 21 players on the field were to the returner. For my fellow data nerds out there: the model was trained on 2018 and 2019 data and subsequently run on 2020 data to generate PRYOE results for the 2020 season. Now that we have an understanding of what PRYOE is, let’s evaluate the results!
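For the data nerds, here is a minimal sketch of the definition above: PRYOE is simply actual yards minus the model's expected yards, with training seasons kept separate from the scoring season. The field names (`season`, `actual`, `expected`) and the sample plays are made up for illustration and are not the project's actual code or data.

```python
TRAIN_SEASONS = {2018, 2019}  # seasons used to fit the model
SCORE_SEASON = 2020           # season the metric is reported for


def split_by_season(plays):
    """Separate plays into a training set and a scoring set by season."""
    train = [p for p in plays if p["season"] in TRAIN_SEASONS]
    score = [p for p in plays if p["season"] == SCORE_SEASON]
    return train, score


def pryoe(actual_yards, expected_yards):
    """Yards over expected: positive means the returner beat the model."""
    return actual_yards - expected_yards


# Hypothetical plays; "expected" stands in for a model prediction
plays = [
    {"season": 2018, "actual": 7.0, "expected": 6.0},
    {"season": 2020, "actual": 12.0, "expected": 8.5},
    {"season": 2020, "actual": 3.0, "expected": 6.0},
]
train, score = split_by_season(plays)
values = [pryoe(p["actual"], p["expected"]) for p in score]
print(values)
```

Splitting by season rather than at random keeps the 2020 results from leaking into training, which is the setup described above.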

PRYOE Results

PRYOE can be used to evaluate performance on a player level and special teams unit level. Let’s start at the player level and evaluate which punt returners were great at adding yards to their returns and which were poor at adding yards.

The results above only feature returners with at least 8 punts returned. Evaluating the results on a per return basis, Kalif Raymond (Tennessee Titans), Andre Roberts (Buffalo Bills), and Diontae Spencer (Denver Broncos) are in a tier of their own, gaining nearly 0.5 more yards per return than expected.

Jabrill Peppers (New York Giants), Pharoh Cooper (Carolina Panthers), Marquez Callaway (New Orleans Saints), Gunner Olszewski (New England Patriots), and Jaydon Mickens (Tampa Bay Buccaneers) were in the next tier, gaining nearly 0.3 more yards per return than expected.

The players with the worst PRYOE per play were Christian Kirk (Arizona Cardinals), Mecole Hardman (Kansas City Chiefs), and Jakeem Grant (Miami Dolphins).
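The per return ranking described above can be sketched roughly as follows. This assumes each play has already been reduced to a (returner, PRYOE) pair; the returner names and values are placeholders, and only the 8-return threshold comes from the analysis itself.

```python
from collections import defaultdict

MIN_RETURNS = 8  # qualification threshold used in the results above


def pryoe_per_return(plays, min_returns=MIN_RETURNS):
    """Average PRYOE per returner, keeping only qualified returners."""
    by_returner = defaultdict(list)
    for returner, value in plays:
        by_returner[returner].append(value)
    averages = {
        name: sum(vals) / len(vals)
        for name, vals in by_returner.items()
        if len(vals) >= min_returns
    }
    # Sort from best (highest average) to worst
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder data: Returner C has too few returns to qualify
plays = (
    [("Returner A", 0.5)] * 8
    + [("Returner B", -0.25)] * 10
    + [("Returner C", 4.0)] * 3
)
ranking = pryoe_per_return(plays)
print(ranking)
```

The threshold matters: without it, a returner with a handful of big returns would top the list on a tiny sample.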

Knowing that Raymond was one of the best, when evaluated on a PRYOE per return basis, and that Grant was one of the worst - what made one better than the other? Let’s take a deep dive into their data and see if we can come away with anything insightful.

On a high level, one would think that Grant was the better punt returner. Grant had a superior average return yardage (12 yards vs Raymond’s 9 yards). But these results are slightly skewed by Grant’s 86-yard punt return for a touchdown. If you look at the 75th percentile for their returns, their performance is nearly identical (Raymond’s 15 yards vs Grant’s 14 yards). This view only provides so much information, so it is important to take it a step further.
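To see why a single long return skews an average far more than it moves the 75th percentile, here is a small sketch. The yardages are made up for illustration and are not either player's real returns; the percentile method (`"inclusive"`) is also just one reasonable choice.

```python
import statistics


def summarize(return_yards):
    """Mean and 75th percentile of a list of return yardages."""
    mean = statistics.mean(return_yards)
    p75 = statistics.quantiles(return_yards, n=4, method="inclusive")[2]
    return mean, p75


# Made-up yardages, with and without one 86-yard return added
typical = [2, 5, 7, 9, 10, 12, 14, 15]
with_long_td = typical + [86]

mean_a, p75_a = summarize(typical)
mean_b, p75_b = summarize(with_long_td)
mean_shift = mean_b - mean_a
p75_shift = p75_b - p75_a
print(f"mean shift: {mean_shift:.2f}, 75th percentile shift: {p75_shift:.2f}")
```

In this toy sample the one long return moves the mean by several yards while the 75th percentile barely budges, which is the same effect at play in the comparison above.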

When looking at how “crowded” each player was on a return, which is a metric that counts how many more teammates than defenders are within 5 yards of the returner, Grant’s crowdedness ranked 7th and Raymond’s ranked 30th out of 34 qualified returners. This suggests that Raymond typically had a less favorable return “situation” than Grant and may hint at Raymond having better returning “vision” than Grant.
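A rough sketch of the crowdedness idea for a single moment in a play: count teammates within 5 yards, count defenders within 5 yards, and take the difference. The coordinates are invented, treating "within 5 yards" as inclusive is an assumption, and how the per-moment values are rolled up to a player level is not shown here.

```python
import math

RADIUS_YARDS = 5.0


def crowdedness(returner_xy, teammates_xy, defenders_xy, radius=RADIUS_YARDS):
    """Teammates minus defenders within `radius` yards of the returner."""

    def count_near(points):
        return sum(
            1
            for x, y in points
            if math.hypot(x - returner_xy[0], y - returner_xy[1]) <= radius
        )

    return count_near(teammates_xy) - count_near(defenders_xy)


# Invented field positions (x, y) in yards
returner = (30.0, 25.0)
teammates = [(32.0, 25.0), (33.0, 28.0), (45.0, 25.0)]
defenders = [(34.0, 25.0), (50.0, 10.0)]
score = crowdedness(returner, teammates, defenders)
print(score)
```

A positive score means more blockers than defenders nearby (a friendlier situation); a negative score means the returner is outnumbered.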

Adding another cut to this, I counted the number of frames (a frame is a snapshot of the situation on the field, captured every split second) that had at least one defender within 5 yards of the returner. This provides insight into how much pressure a returner faces on a typical play. The measurement is slightly inflated if a returner is known to "run around" quite a bit and accumulate more frames within a play, so take this with a grain of salt. But, with that said, Grant ranks 5th highest for frames with defenders within 5 yards per play and Raymond ranks 15th.
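The frame count described above could look something like this sketch, where each frame is a pair of the returner's position and the defenders' positions. The frame layout and coordinates are assumptions for illustration, not the structure of the tracking data.

```python
import math


def pressured_frames(frames, radius=5.0):
    """Count frames with at least one defender within `radius` yards."""
    count = 0
    for returner_xy, defenders_xy in frames:
        if any(math.dist(returner_xy, d) <= radius for d in defenders_xy):
            count += 1
    return count


# Three invented frames: (returner position, list of defender positions)
frames = [
    ((30.0, 25.0), [(40.0, 25.0)]),                # nearest defender 10 yards away
    ((31.0, 25.0), [(35.0, 25.0)]),                # defender 4 yards away
    ((32.0, 25.0), [(33.0, 25.0), (50.0, 25.0)]),  # defender 1 yard away
]
n_pressured = pressured_frames(frames)
print(n_pressured)
```

Because this is a raw count, a longer play naturally produces more frames, which is the inflation caveat mentioned above.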

These two measurements present conflicting insight on a high level, but provide insight nonetheless. Grant typically had more teammates than defenders within 5 yards than Raymond did, but Grant also typically had more frames with defenders within 5 yards of him. This may be an indicator of why Grant underperformed so badly when evaluated with PRYOE. On the other hand, Raymond typically had more defenders within 5 yards of him than teammates, but he was middle of the pack regarding the number of frames with defenders within 5 yards of him. Given that Raymond tended to have a tougher "situation" and that he outperformed his expectation, it is reasonable to ponder what his punt returning performance would be with a better blocking scheme.

PRYOE Brought To Life

In an effort to bring PRYOE to life, I animated two punt returns by Jakeem Grant. One went for a touchdown and is a good example of what a return with positive PRYOE looks like. The other did not and serves as a good example of what a return with negative PRYOE looks like. The size of the returner’s bubble reflects the predicted yards gained in a frame.

This line chart shows the PRYOE for the animated play above. You can see that the model expected Grant to gain more yards after initially catching the punt. Frames 100 - 120 are where it gets interesting. According to the model, this is where Grant is really outperforming expectations. Near frame 140 is where Grant made the punter and another defender miss and from there, it was off to the races.

This line chart shows the PRYOE for the animated play above. You can see that the model is expecting Grant to gain more yards than he did for most of this play. His performance really cratered in the range of frames 100 - 130. Looking at the animated chart, this is where Grant begins to run more to the left (losing yards) and down on the y axis. It is interesting to see the model pick up a better PRYOE around frame 120 when Grant made a defender miss a tackle.

Team Level Evaluations

To evaluate PRYOE on a team level, I aggregated each team's results for their returning plays and their punting plays. The returning results mostly align with the player level - which makes sense given the small sample size of punts that are returned. The punt defending results are interesting and begin to hint at units that are superior at limiting strong returns.

Evaluating the scatter plot, we should use the origin (0, 0) as our reference point. Anything to the right of 0 on the x axis indicates a positive PRYOE per play as the returning team (good). Anything to the left indicates a negative value (not ideal). Any value above 0 on the y axis indicates a positive PRYOE per play as the kicking team (not ideal) and anything below it indicates a negative value (good).

Where teams would want to be is the bottom-right quadrant (positive PRYOE as a returning team and negative PRYOE as a kicking team). Using these criteria, the Buffalo Bills are one of the strongest punting units. Their PRYOE per play as a returning unit is slightly above 0.4 yards and as a kicking team, it is about -1.5 yards. Other teams with strong unit performance include the New England Patriots, Indianapolis Colts, and Philadelphia Eagles.
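The quadrant reading of the scatter plot can be written down as a tiny helper. This is only a sketch of the interpretation described above; the function name is made up, and the example uses the approximate Bills values quoted in the text (about 0.4 as a returning unit and about -1.5 as a kicking unit).

```python
def quadrant(return_pryoe, kicking_pryoe):
    """Locate a team on the scatter plot.

    x axis: PRYOE per play as the returning team (positive is good).
    y axis: PRYOE per play as the kicking team (negative is good).
    """
    if return_pryoe == 0 or kicking_pryoe == 0:
        return "on an axis"
    horizontal = "right" if return_pryoe > 0 else "left"
    vertical = "top" if kicking_pryoe > 0 else "bottom"
    return f"{vertical} {horizontal}"


# Approximate values quoted for the Buffalo Bills
print(quadrant(0.4, -1.5))  # prints "bottom right"
```

Bottom right is the ideal quadrant; top left is the one to avoid.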

Value Discussion, Limitations, and Next Steps

Now that the model has been explained and the results discussed, I think it is important to discuss the value of this metric. Should this metric take over in place of actual return yards gained? Probably not. But, I do think that it possesses value for a few different areas of relevance.

Value Proposition 1: Post Performance Evaluation

PRYOE provides insight into how a returner (or kicking unit) performed relative to how they would be expected to perform given the situation of the punt return. As shown in the above section, it can help show whether a returner deserves more credit for their overperformance or whether their blocking unit provided a beneficial boost to the return. It is a method that begins to get at the individual contribution that the returner has on a play.

Value Proposition 2: In Game Strategy

Teams would be able to layer in this metric to help guide in-game strategic decision making. For example, perhaps the Chicago Bears are playing the Buffalo Bills and know that the Bills have one of the best units on a kicking defense PRYOE per play basis. Knowing this, the Special Teams Coordinator may encourage their returner to take a fair catch or allow a punt to drop in order to maximize starting position for the ensuing drive.

I also think that this PRYOE metric is one piece of a decision optimization problem: what is the optimal choice on a punt return (return, fair catch, or let bounce)? A very robust PRYOE model tackles the first part of this decision-making equation. Models would also need to be developed to estimate how a punt would bounce; combined, they would provide the expected return yards, fair catch location, and bounced punt location on a given punt, in order to determine whether a decision was optimal. I am aware of a few BDB projects doing this and am excited to see their finished products.
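As a thought experiment, the decision problem above could be framed like the sketch below: estimate the starting field position for each option and pick the best one. Everything here is hypothetical, including the function name and inputs; the expected return yards and bounce location would have to come from models that this project does not include, and the sketch ignores risks such as muffs and penalties.

```python
def best_punt_decision(catch_yardline, expected_return_yards, expected_bounce_yardline):
    """Pick the option with the best expected starting field position.

    Yard lines are measured as yards from the returning team's own goal
    line, so a larger number is better for the returning team.
    """
    options = {
        "return": catch_yardline + expected_return_yards,
        "fair catch": catch_yardline,
        "let bounce": expected_bounce_yardline,
    }
    choice = max(options, key=options.get)
    return choice, options


# Hypothetical punt: caught at the 20, model expects a 6-yard return,
# and a bounce model expects the ball to roll back to the 12
choice, options = best_punt_decision(20.0, 6.0, 12.0)
print(choice, options)
```

With different inputs, say a punt fielded deep with little expected return and a bounce likely to carry forward, the same comparison would favor letting the ball go.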

As I mentioned in the value discussion above, there is a limitation to the PRYOE metric due to how narrow its scope is.

In terms of a next step, I think it would be worth expanding some of the player tracking features. Perhaps including features that provide information on how close players are to each other (and not just the returner) would aid in the accuracy of the model. This sort of robust feature set could also be a launching point for quantifying the impact of individual players on a given play. For example, perhaps we'd be able to quantify a kicking team defender's impact on limiting return yards or we could quantify a returning team player's impact on blocking and creating more return yards. However, a discussion would need to be had on how much impact this would truly bring to a team and the sports analytics community. Essentially, is the juice worth the squeeze given the number of punt returns that happen within a given season? Regardless, it would set the wheels in motion for similar analyses that could be performed on other actions within a football game.

Closing Thoughts

It was a ton of fun to participate in my first NFL Big Data Bowl contest and I hope to continue to participate in the future! Special thanks to my girlfriend, Sarah, for putting up with me spending too much time "beep, bop, booping" at the computer. Additional shout outs to Ben Draus and Tej Seth for serving as sounding boards for ideas!

GitHub: Click Here

Kaggle Notebook: Click Here