Principal Component Analysis to Inform Draft Selection for Fantasy Football Teams

Author

Tomi Akisanya

Around this time every year, if you’re like me, you’ve already started to mentally prepare for 80% of your conversations to be on the topic of football. For those who would not classify themselves as fanatics, it’s probably annoying - it’s annoying for all of us. Pretty consistently there will be behavior from fans that ranges from DMs to athletes and verbal assaults on close friends to damaged property and public humiliation. Who realistically has the mental endurance to discuss a single topic over an extended period of time for something they’re not physically involved in? Those with passion, and although I am not passionate about football (nor do I justify the aforementioned behavior), this will be my first of hopefully many Fantasy Football leagues. If there’s one thing I am passionate about, it’s tilting the odds in my favor. Usually that’s in the form of taking a creative line on the felt on a 2/5 reg by extracting max value with 67s on a low connected board in a 4! pot as the preflop aggressor. Balance and discipline is the name of the game though, and who are we if we don’t apply the same approach to all areas in life…

Data Sets

The data contains in-game and Fantasy Football Points per Reception (PPR) stats by NFL player from 2017 - 2023 for all 17 games of the regular season. Most leagues use a points-per-reception-based metric, FantasyPointsPPR, to calculate fantasy points. Before converting to fantasy points, in-game stats may be weighted or counted differently. My league adopted the following criteria:

Variable                 Scoring rule                          Points
Completions              Typically not scored (0 points)         0.00
Attempts                 Typically not scored (0 points)         0.00
PassingYards             1 point per 25 passing yards            0.04
PassingTDs               4 points per passing touchdown          4.00
Interceptions            -2 points per interception             -2.00
RushingAttempts          Typically not scored (0 points)         0.00
RushingYards             1 point per 10 rushing yards            0.10
RushingTDs               6 points per rushing touchdown          6.00
Targets                  Typically not scored (0 points)         0.00
Receptions               1 point per reception (PPR format)      1.00
ReceivingYards           1 point per 10 receiving yards          0.10
ReceivingTDs             6 points per receiving touchdown        6.00
Fumbles                  -2 points per fumble (if lost)         -2.00
FumblesLost              -2 points per fumble lost              -2.00
CompletionsPerAttempt    Typically not scored (0 points)         0.00
TDsPerAttempt            Typically not scored (0 points)         0.00
InterceptionsPerAttempt  Typically not scored (0 points)         0.00
TDsPerReception          Typically not scored (0 points)         0.00
FumblesLostPerFumble     Typically not scored (0 points)         0.00
FantasyPointsPPR         NA                                        NA
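As a minimal sketch of how the table translates into weighted variables, the snippet below multiplies a stat line by the point values listed above. The function name `scoring_columns` and the example stat line are hypothetical and used only for illustration; the `scoring_` prefix follows the naming described later in this post. Because the table assigns -2 to both Fumbles and FumblesLost, the sum of these weighted values is not necessarily equal to the FantasyPointsPPR reported in the source data.

```python
# Point values copied from the scoring table; stats not listed here score 0.
SCORING_WEIGHTS = {
    "PassingYards": 0.04,
    "PassingTDs": 4.0,
    "Interceptions": -2.0,
    "RushingYards": 0.10,
    "RushingTDs": 6.0,
    "Receptions": 1.0,
    "ReceivingYards": 0.10,
    "ReceivingTDs": 6.0,
    "Fumbles": -2.0,
    "FumblesLost": -2.0,
}


def scoring_columns(stats):
    """Return the weighted scoring_ variables for one player's season stats."""
    return {
        f"scoring_{name}": weight * stats.get(name, 0)
        for name, weight in SCORING_WEIGHTS.items()
    }


# Hypothetical stat line, for illustration only (not a real player).
example = {
    "PassingYards": 4000, "PassingTDs": 30, "Interceptions": 10,
    "RushingYards": 250, "RushingTDs": 2, "Fumbles": 3, "FumblesLost": 1,
}
weighted = scoring_columns(example)
total = sum(weighted.values())
print(weighted)
print(round(total, 2))
```

Keeping the weights in a single dictionary makes it easy to swap in another league's rules without touching the rest of the pipeline.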

Two different data sets are used, with a focus on three distinct NFL regular seasons - 2022, 2023, and 2024. Both data sets have been scraped but differ in source, purpose, and underlying information present:

Data Set 1 - Historical - 2022 & 2023: This data set contains historical data from 2017-2023 for both relevant in-game statistics and fantasy scoring for the regular NFL season. This project primarily focuses on 2022 and 2023. Each observation or row in the data set holds one NFL athlete’s relevant information for a season, such as position, team, completions, attempts, interceptions per attempt, etc. Since most leagues exclude defensive players from their fantasy team, those have been removed from the data set. The key feature of this data set is the retroactive fantasy rankings/scoring. The total fantasy points, overall rank, and rank by position are available for each player. This enables the direct comparison of calculated rankings from models to their actual rank.

Players that were on two or more teams in a given season are not assigned a team - but are instead given a makeshift name to highlight this. For example, you may see Baker Mayfield’s registered team as 2TM. One notable variable contained in the data set is ADP, or average draft position, representing the average pick number at which the player was drafted across all recorded leagues before the start of the respective season (a lower ADP means an earlier pick). Additional in-game stats were calculated afterwards. These were YardsPerRushAttempt, CompletionsPerAttempt, TDsPerAttempt, InterceptionsPerAttempt, TDsPerReception, and FumblesLostPerFumble. My fantasy league’s scoring methodologies were also factored into a set of new variables. These variables use the scoring_ prefix.

Years 2017-2022 and 2023 are taken from the same source but were scraped separately. Player IDs assigned

Data Set 2 - Projected - 2024: This data set contains all player match ups for the upcoming regular 2024 NFL season. The in-game statistics recorded for each player are projections based on those teams and match ups. One benefit of this data set is being able to use these projections as inputs to our model to determine which players obtain the most fantasy points, rank them in ascending order, and draft them accordingly. The drawback is that no additional information on how these projections were calculated is known, so the accuracy of these projections cannot be confirmed.

Methods

The current methodology is to use principal component analysis, computed via singular value decomposition, to create scores on new variables (components) that can be attributed to each player’s overall performance. Their projected overall performance would be used to rank each player in ascending order (potentially by position) to inform our draft decision. Starting with the 2022 season, the overall rank for each player is calculated and then compared to the actual ranks of that same year. Accuracy will be measured in three ways:

  1. Difference in overall rank
  2. Difference in position rank
  3. Difference in fantasy points, obtained by taking the difference of fantasy points using our model’s draft order with the fantasy points using the ideal draft order.

If the model is accurate, the same 2022 projections will then be tested on 2023. This method is obviously not foolproof. Many things change between seasons in professional sports, but the objective is to quantify the model’s ability to generalize to future seasons. If it can, the same process will be done starting with the 2023 data set, testing it against itself, then using those scores for 2024. If it cannot, principal component scores will be calculated using the 2024 data set only, and only those will inform our draft order. This is not the ideal scenario, since it inherently trusts the projected data.

Processing

Missing values in completions per attempt, TDs per attempt, interceptions per attempt, TDs per reception, and fumbles lost per fumble were the result of a zero denominator, which leaves the ratio undefined. These missing values are directly related to the player and position. For example, QBs will rarely record receptions and will therefore have undefined values for touchdowns per reception, a statistic better suited to measuring the performance of wide receivers. All missing values in these cases were replaced with zero. The same approach was applied to yards per attempt and yards per reception with two notable exceptions. Foster Moreau had zero rushing attempts but two rushing yards during the 2022 NFL season, which is difficult to interpret considering rushing yards are a function of rushing attempts. In the same vein, Joe Flacco had -3 receiving yards but zero receptions. Both of these players were removed from the data set.
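The processing rules above can be sketched as follows. This is an illustration under stated assumptions, not the code used for this post: the helper names `safe_ratio` and `is_inconsistent` are hypothetical, and the three rows are made-up players that only mirror the zero-denominator patterns described in the text.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator != 0 else 0.0


def is_inconsistent(row):
    """True when yards are recorded without any attempts or receptions behind them."""
    rushing = row["RushingAttempts"] == 0 and row["RushingYards"] != 0
    receiving = row["Receptions"] == 0 and row["ReceivingYards"] != 0
    return rushing or receiving


# Hypothetical rows for illustration; they are not taken from the data set.
rows = [
    {"Player": "Player A", "RushingAttempts": 0, "RushingYards": 0,
     "Receptions": 40, "ReceivingYards": 500, "ReceivingTDs": 4},
    {"Player": "Player B", "RushingAttempts": 0, "RushingYards": 2,
     "Receptions": 30, "ReceivingYards": 300, "ReceivingTDs": 2},
    {"Player": "Player C", "RushingAttempts": 10, "RushingYards": 45,
     "Receptions": 0, "ReceivingYards": 0, "ReceivingTDs": 0},
]

cleaned = []
for row in rows:
    if is_inconsistent(row):
        continue  # drop rows like "yards without attempts"
    enriched = dict(row)
    enriched["YardsPerReception"] = safe_ratio(row["ReceivingYards"], row["Receptions"])
    enriched["TDsPerReception"] = safe_ratio(row["ReceivingTDs"], row["Receptions"])
    cleaned.append(enriched)

for row in cleaned:
    print(row["Player"], row["YardsPerReception"], row["TDsPerReception"])
```

Player B is dropped because it has rushing yards without rushing attempts, while Player C keeps a ratio of zero because it has no receptions at all.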

[1] 1367

YoY PPR Trend for Top NFL Teams

2022 NFL season

Data Exploration

The data was subset to the 2022 season and then skimmed to review distributions, counts, and other elements within the data set.

Data summary
Name fantasy
Number of rows 575
Number of columns 43
_______________________
Column type frequency:
character 2
factor 2
numeric 39
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Player 0 1 8 24 0 575 0
PlayerID 0 1 8 8 0 575 0

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
Team 0 1 FALSE 34 2TM: 28, ARI: 20, DEN: 20, LAC: 20
FantasyPosition 0 1 FALSE 4 WR: 218, RB: 162, TE: 113, QB: 82

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
Rank 0 1.00 289.17 166.96 1.00 144.50 290.00 433.50 577.00 ▇▇▇▇▇
Age 0 1.00 26.19 3.23 21.00 24.00 26.00 28.00 45.00 ▇▇▁▁▁
Games 0 1.00 11.60 5.16 1.00 8.00 13.00 16.00 17.00 ▂▂▂▂▇
GamesStarted 0 1.00 5.56 5.81 0.00 0.00 3.00 11.00 17.00 ▇▂▂▂▂
Completions 0 1.00 19.98 72.11 0.00 0.00 0.00 0.00 490.00 ▇▁▁▁▁
Attempts 0 1.00 31.08 110.35 0.00 0.00 0.00 0.00 733.00 ▇▁▁▁▁
PassingYards 0 1.00 219.00 789.12 0.00 0.00 0.00 0.00 5250.00 ▇▁▁▁▁
PassingTDs 0 1.00 1.30 5.04 0.00 0.00 0.00 0.00 41.00 ▇▁▁▁▁
Interceptions 0 1.00 0.72 2.40 0.00 0.00 0.00 0.00 15.00 ▇▁▁▁▁
RushingAttempts 0 1.00 25.65 56.65 0.00 0.00 2.00 17.00 349.00 ▇▁▁▁▁
RushingYards 0 1.00 114.51 261.73 -15.00 0.00 5.00 73.50 1653.00 ▇▁▁▁▁
YardsPerAttempt 0 1.00 2.62 3.69 -7.50 0.00 1.75 4.46 40.00 ▇▇▁▁▁
RushingTDs 0 1.00 0.85 2.14 0.00 0.00 0.00 1.00 17.00 ▇▁▁▁▁
Targets 0 1.00 29.95 37.05 0.00 3.00 14.00 44.00 184.00 ▇▂▁▁▁
Receptions 0 1.00 20.11 24.93 0.00 2.00 9.00 30.00 128.00 ▇▂▁▁▁
ReceivingYards 0 1.00 220.03 304.02 -10.00 11.00 96.00 313.50 1809.00 ▇▂▁▁▁
YardsPerReception 0 1.00 8.67 6.23 -6.00 5.00 8.90 12.11 42.00 ▃▇▂▁▁
ReceivingTDs 0 1.00 1.30 2.12 0.00 0.00 0.00 2.00 14.00 ▇▁▁▁▁
Fumbles 0 1.00 1.04 1.98 0.00 0.00 0.00 1.00 16.00 ▇▁▁▁▁
FumblesLost 0 1.00 0.48 0.98 0.00 0.00 0.00 1.00 9.00 ▇▁▁▁▁
FantasyPointsPPR 0 1.00 78.41 85.43 -2.90 11.50 43.40 116.35 417.40 ▇▂▂▁▁
PositionRank 0 1.00 81.82 55.22 1.00 37.00 73.00 118.50 218.00 ▇▇▅▃▂
Year 0 1.00 2022.00 0.00 2022.00 2022.00 2022.00 2022.00 2022.00 ▁▁▇▁▁
ADP 429 0.25 71.90 42.07 1.30 36.42 70.65 107.15 153.80 ▇▇▇▇▅
CompletionsPerAttempt 0 1.00 0.10 0.24 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
TDsPerAttempt 0 1.00 0.01 0.08 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
InterceptionsPerAttempt 0 1.00 0.01 0.08 0.00 0.00 0.00 0.00 1.00 ▇▁▁▁▁
TDsPerReception 0 1.00 0.05 0.11 0.00 0.00 0.00 0.07 1.00 ▇▁▁▁▁
FumblesLostPerFumble 0 1.00 0.20 0.36 0.00 0.00 0.00 0.25 1.00 ▇▁▁▁▂
scoring_PassingYards 0 1.00 8.76 31.56 0.00 0.00 0.00 0.00 210.00 ▇▁▁▁▁
scoring_PassingTDs 0 1.00 5.18 20.16 0.00 0.00 0.00 0.00 164.00 ▇▁▁▁▁
scoring_Interceptions 0 1.00 -1.44 4.80 -30.00 0.00 0.00 0.00 0.00 ▁▁▁▁▇
scoring_RushingYards 0 1.00 1.15 2.62 -0.15 0.00 0.05 0.74 16.53 ▇▁▁▁▁
scoring_RushingTDs 0 1.00 5.08 12.82 0.00 0.00 0.00 6.00 102.00 ▇▁▁▁▁
scoring_Receptions 0 1.00 20.11 24.93 0.00 2.00 9.00 30.00 128.00 ▇▂▁▁▁
scoring_ReceivingYards 0 1.00 22.00 30.40 -1.00 1.10 9.60 31.35 180.90 ▇▂▁▁▁
scoring_ReceivingTDs 0 1.00 7.79 12.72 0.00 0.00 0.00 12.00 84.00 ▇▁▁▁▁
scoring_Fumbles 0 1.00 -2.09 3.96 -32.00 -2.00 0.00 0.00 0.00 ▁▁▁▁▇
scoring_FumblesLost 0 1.00 -0.96 1.96 -18.00 -2.00 0.00 0.00 0.00 ▁▁▁▁▇

An important categorical variable to note outside of Team and Player is the Fantasy Position, which takes the values QB, WR, TE, and RB. Defensive positions are out of scope and have been removed from the data set. The objective of this project is to create a fantasy team that has the highest likelihood of obtaining the most Fantasy Points (PPR) for the upcoming season based on prior seasons. The next logical question becomes, does this likelihood vary by team? By position? Intuitively, the likelihood will vary simply based on how many players of each position are in the game at a given time. The table below shows the distribution of positions in the 2022 NFL season.

 FantasyPosition   n   percent
              WR 218 0.3791304
              RB 162 0.2817391
              TE 113 0.1965217
              QB  82 0.1426087

The largest share of NFL players in 2022 for fantasy purposes were wide receivers (38%), while the least common position was quarterback (14%). For the continuous variables, the important things to note are the means, standard deviations, counts, distributions via the histograms, and any missing values. In-game stats such as Yards Per Attempt, Yards Per Reception, and the additional ratio variables (listed below ADP) were undefined for a lot of players before the zero replacement described in Processing, as expected, and ADP itself is still missing for 429 of the 575 players. The undefined values are likely due to a number of factors such as the position of the player and the number of games each played. The number of games played by each player ranges from one to 17, the latter representing all 17 games in the regular season. The descriptive summary for Fantasy Points (PPR) is:

   vars   n  mean    sd median trimmed   mad  min   max range skew kurtosis
X1    1 575 78.41 85.43   43.4      64 57.08 -2.9 417.4 420.3 1.38      1.4
     se
X1 3.56

With mean = 78.4 (sd = 85.43), it is obvious that there is high variability. The histogram of PPR shows how this variability is distributed.

Depending on the fantasy league, PPR scoring will be different in the sense that there is typically a PPR threshold per game for a player in order for their PPR to be recorded for that given week. If the player is below the baseline, the PPR may be zero. Interestingly enough, the distribution of PPR in 2022 is strongly right-skewed, resembling a log-normal shape. The observed PPR for most players was approximately zero. This is logical, given the extreme difficulty of being a top NFL performer. Most players are benched, aren’t playing games, or aren’t starters. Most of the continuous variables follow the same trend, with observed frequencies concentrated near zero and relatively few players posting the highest values for each variable. There are three notable exceptions:

  1. Average Draft Position - Uniform Distribution: For players that have this data available, ADP is uniformly distributed across most players. This implies that the parent population drafts players evenly across the board and there isn’t a strong concentration of players being picked predominantly. To reiterate, this is specifically for players that have ADP data available, which may be a combination of the most popular or the best players.

  2. Fumbles Lost per Fumble - Bimodal distribution: Frequency of Fumbles Lost per Fumble have peaks at both zero and one, revealing that most players either fumble and lose possession of the ball, resulting in a turnover, or don’t fumble at all.

  3. Yards Per Reception - Normal distribution: Players average about 9 Yards Per Reception (mean 8.67, median 8.90) with a majority of players within the range of two standard deviations from the mean. Although there is a slight right skew, this is one of the few variables that approximately follows a Gaussian distribution.

The first thing we want to understand is what position, if any, we should be more inclined to draft first, and their likelihood of obtaining the most fantasy points. I started by seeing what the public likes to draft first. ADP is the average pick at which a player is taken, so a lower value means an earlier pick. In 2022, running backs had the lowest ADP at around pick 62 on average, making them the earliest-drafted position, followed by wide receivers (74), tight ends (84), and quarterbacks (86). Acknowledging that ADP data is not available for all players, I wanted to better understand if quarterbacks are the biggest factor in regards to PPR. A comparison of ADP and PPR by position begins to paint the picture. On average, quarterbacks had 105 fantasy points in 2022, followed by wide receivers, running backs, then tight ends.

# A tibble: 4 × 3
  FantasyPosition AverageDraftPosition AveragePPR
  <fct>                          <dbl>      <dbl>
1 QB                              85.8      105. 
2 WR                              73.5       81.6
3 RB                              61.5       75.7
4 TE                              83.9       56.5

This seems to be consistent at a higher level when the top three teams for the 2022 regular season are compared.

However, properly accounting for the total number of games played by each player yields different results. Quarterbacks are disproportionately the most efficient in regard to fantasy points on a per game basis at .164 per game. Running backs, tight ends, and wide receivers follow in that order but are significantly behind. Among those three positions, however, the per-game values differ by less than 0.01 (0.0316 to 0.0393). The logical question becomes: what are the underlying causes of this variation? One possibility is that there are typically more wide receivers on an offensive play than any other position. Another possibility is the extreme routes and distances wide receivers run, making them injury prone and therefore less efficient.

# A tibble: 4 × 2
  FantasyPosition PPRperGame
  <fct>                <dbl>
1 QB                  0.164 
2 RB                  0.0393
3 TE                  0.0372
4 WR                  0.0316

Analysis SVD PCA

Principal component analysis (PCA), computed here through an eigenvalue decomposition, is an unsupervised machine learning method typically used for dimensionality reduction on mostly unlabeled data. The same approach is used here, with the intent to reduce the data to components that measure performance in some aspect. The best way to think about this is in the form of a recipe. A cookbook’s recipe for chicken cordon bleu will have elaborate concoctions and mixes of different food. When applied in this context, it would reduce the recipe to its core components, 3/4 chicken, 1/8 cheese, 1/8 ham let’s say. It then becomes much easier to make chicken cordon bleu while keeping most of the taste. While an oversimplification, the approach is essentially the same, with the goal of capturing as much of the underlying variation as possible using linear combinations of variables. At its core, there are some latent underlying variables that combinations of the original variables measure. What those underlying variables measure and their relevance is on us to define. These are the principal components. The principal components are made up of the original variables, and how much each variable contributes to the underlying variable (i.e. principal component) is given by the eigenvectors, or loadings. Loadings can be positive (the component increases with the variable) or negative (the component decreases as the variable increases). How strongly the variables load helps define what that new underlying variable is. To define the inclusion criteria, any variable with a loading of at least .7 in absolute value will be considered as loading well, and those variables alone will be used to define the underlying variable/component. This empirical threshold is a very conservative approach.

Eigenvectors describe a mathematical phenomenon such that

\[ A v = \lambda v \]

where A is a square matrix, v is an eigenvector, and λ is a scalar (numerical value), the eigenvalue associated with v. In this application, A is the correlation matrix of the original data. This mechanism works because only linear transformations are applied, so the relationships within the data do not change: the proportions of all variables and the direction in which they move remain the same. The data gets centered at the origin after scaling, and a best-fitting line is found that goes through the origin and maximizes the variance of the data projected onto it. Conceptually, the algorithm starts with a line through the origin, projects the points onto it, and rotates the line until the sum of squared distances from the projected points to the origin is as large as possible. The resulting line of best fit gives the eigenvector for the principal component, and the eigenvalue is the variance of the data along that line (the maximized sum of squares divided by n - 1).
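To make the eigenvector relation concrete, here is a minimal sketch that uses power iteration on a hypothetical 2 x 2 correlation matrix with correlation r = 0.8. Power iteration is a swapped-in technique chosen for readability and is not necessarily what the software behind this post uses; the function names are illustrative. For this matrix the leading eigenvalue is 1 + r and the eigenvector points along the diagonal.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=200):
    """Approximate the dominant eigenvalue and eigenvector of a symmetric matrix."""
    # Arbitrary starting direction; it must not be orthogonal to the answer.
    vector = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient v^T A v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


r = 0.8  # hypothetical correlation between two standardized variables
A = [[1.0, r], [r, 1.0]]
eigenvalue, v = power_iteration(A)

left = mat_vec(A, v)                 # A v
right = [eigenvalue * x for x in v]  # lambda v
print(round(eigenvalue, 6), [round(x, 6) for x in v])
```

The two sides `left` and `right` agree to numerical precision, which is exactly the statement A v = λ v. The eigenvalue of 1.8 out of a total variance of 2 means the first component carries 90% of the variation in this toy example.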

Parallel analysis was used to determine the number of components to retain. The test performs the same decomposition on simulated data of the same size and graphs the results. The point where the simulated and actual eigenvalue curves intersect is the cutoff for the number of components to retain. The results of the test suggest three components. The y-axis plots the eigenvalues, each of which is the amount of variation explained by a component. In a simpler sense, an eigenvalue can be thought of as the number of original variables accounted for by that component (hence the horizontal line separating values less than one). Three principal components were obtained, acknowledging that principal component one (PC1) should account for approximately six variables, PC2 around 5, and PC3 around 3. The other scree plot better highlights the components as a percentage of total variability explained. Keep in mind that PC1 only accounts for 30% of the total variability and the first three components cumulatively account for 52% of total variability. It’s likely that the post-hoc tests described in the methods section will not be sufficient for our goal since about half of the total variation is still not accounted for by these components.
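The idea behind parallel analysis can be sketched in a few lines. This is a simplified variant under stated assumptions: it compares observed eigenvalues against the mean eigenvalues of random normal data of the same shape, it uses a basic Jacobi rotation routine to get eigenvalues, and the input data is synthetic (two hypothetical latent factors driving six variables). The software used for this post may differ in details such as resampling or quantile cutoffs.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlation matrix of a list of rows (observations x variables)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
           for j in range(p)]
    z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def eigenvalues_symmetric(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    p = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(p) for j in range(p) if i != j)
        if off < tol:
            break
        for i in range(p - 1):
            for j in range(i + 1, p):
                if abs(a[i][j]) < 1e-15:
                    continue
                if a[i][i] == a[j][j]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[i][j] / (a[i][i] - a[j][j]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(p):  # rotate columns i and j
                    aki, akj = a[k][i], a[k][j]
                    a[k][i], a[k][j] = c * aki + s * akj, -s * aki + c * akj
                for k in range(p):  # rotate rows i and j
                    aik, ajk = a[i][k], a[j][k]
                    a[i][k], a[j][k] = c * aik + s * ajk, -s * aik + c * ajk
    return sorted((a[i][i] for i in range(p)), reverse=True)


def parallel_analysis(data, simulations=20, seed=0):
    """Count components whose eigenvalue exceeds the mean simulated eigenvalue."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    observed = eigenvalues_symmetric(correlation_matrix(data))
    totals = [0.0] * p
    for _ in range(simulations):
        noise = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
        for k, value in enumerate(eigenvalues_symmetric(correlation_matrix(noise))):
            totals[k] += value
    simulated = [t / simulations for t in totals]
    retained = 0
    for obs, sim in zip(observed, simulated):
        if obs <= sim:
            break
        retained += 1
    return observed, simulated, retained


# Synthetic data: two hypothetical latent factors, three noisy variables each.
rng = random.Random(42)
data = []
for _ in range(200):
    f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
    data.append([f1 + 0.5 * rng.gauss(0, 1) for _ in range(3)]
                + [f2 + 0.5 * rng.gauss(0, 1) for _ in range(3)])

observed, simulated, retained = parallel_analysis(data)
print([round(x, 2) for x in observed])
print([round(x, 2) for x in simulated])
print(retained)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing each eigenvalue by that count gives the share of total variability a component explains, which is what the second scree plot shows.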

Scores are calculated for each individual player. Depending on how the components are defined, players can be ranked in ascending order.

Principal Components - In-Game Stats

rotation: maximum variance

PC1 Definition: High-Volume Performing QBs
Attributes: Completions, Attempts, Passing Yards, Passing TDs, Interceptions, Fumbles

  1. PC1 would be attributed to high-volume QBs, since the above categories load extremely well on it. The first inclination was to attribute PC1 to high-performing quarterbacks; however, that statement alone would be unjustifiable considering that interceptions and fumbles also load extremely well on this component. High-volume quarterbacks would be a more fitting description. These quarterbacks are performing extremely well in some regard, since the component loads high for completions, passing yards, and passing touchdowns. We can reason that these quarterbacks are also able to consistently get the ball out of their hands. High-volume quarterbacks will also load high on interceptions and fumbles. The more throws and attempts made, the more likely that fumbles and interceptions will occur.

PC2 Definition: Offensive Long Range Efficiency
Attributes: Games Started, Targets, Receptions, Receiving Yards, Receiving TDs

  1. PC2 can be attributed to total overall offensive efficiency given that targets, receptions, receiving yards, and receiving touchdowns load extremely high on it. Players that typically fit this category are wide receivers, and PC2 is measuring yardage efficiency. Efficiency is important in this context given that targets also load high, and even though targets are not an inclusion criterion in fantasy scoring, they speak to the aggressiveness on the offensive side. The key distinction to make here is that this describes the overall long range efficiency only, since rushing yards and touchdowns are not accounted for in this component. Additionally, overall long range efficiency is justified since this must be a combination of quarterbacks and the offensive line. Wide receivers, running backs, and tight ends will generally only score more touchdowns and have more yards with a good quarterback.

PC3 Definition: Offensive Driving Efficiency
Attributes: Rushing Attempts, Rushing Yards, Rushing TDs

  1. PC3 would be attributed to mainly RBs and TEs that are elite drivers since they load high for rushing stats.

Ideally, we’d want to draft players that score high on all three components. Considering the nature of football, depending on a player’s primary position, they will naturally perform better in certain stats or categories than others. There are multiple approaches to account for this. Actually, along every step I find there are ways that our paths diverge, but more on that later. One approach is to use only principal component one, and the players that score high on that component, to pick our quarterback. Principal components two and three would then be used for all other positions. I started with that approach, but here comes the other divergence - how I choose to calculate the scores. There are two options under consideration.

  1. Include all variables in the principal component computation, with the benefit of providing a more comprehensive score but the drawback of added complexity.
  2. Include only the variables that load high in the principal component score computation, with the benefit of exclusively calculating how good they are at being good but the drawback of missing nuanced information captured in less significant variables.
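The two options above can be sketched as a weighted sum of standardized stats, either over every variable or only over those that clear the .7 loading threshold. All loadings and z-scores below are hypothetical numbers chosen for illustration, and the function names are not from any library.

```python
LOADING_THRESHOLD = 0.7


def high_loading_variables(loadings, threshold=LOADING_THRESHOLD):
    """Keep only the variables whose absolute loading meets the threshold."""
    return {name: w for name, w in loadings.items() if abs(w) >= threshold}


def component_score(z_scores, loadings):
    """Weighted sum of a player's standardized stats using the given loadings."""
    return sum(w * z_scores[name] for name, w in loadings.items())


# Hypothetical loadings for one component (not the values from this analysis).
pc_loadings = {"PassingYards": 0.95, "PassingTDs": 0.90, "Interceptions": 0.80,
               "RushingYards": 0.20, "Receptions": -0.10}

# Hypothetical standardized (z-scored) stats for three made-up players.
players = {
    "QB A": {"PassingYards": 2.0, "PassingTDs": 1.5, "Interceptions": 1.0,
             "RushingYards": 0.5, "Receptions": -0.4},
    "QB B": {"PassingYards": 1.0, "PassingTDs": 1.2, "Interceptions": 0.2,
             "RushingYards": 2.5, "Receptions": -0.4},
    "WR C": {"PassingYards": -0.3, "PassingTDs": -0.3, "Interceptions": -0.3,
             "RushingYards": -0.2, "Receptions": 2.0},
}

strong = high_loading_variables(pc_loadings)
full_scores = {p: component_score(z, pc_loadings) for p, z in players.items()}
restricted_scores = {p: component_score(z, strong) for p, z in players.items()}

# Rank players from the highest restricted score to the lowest.
ranking = sorted(restricted_scores, key=restricted_scores.get, reverse=True)
print(sorted(strong))
print(ranking)
```

In this toy case both options rank the players identically; the two would diverge only when low-loading variables carry enough weight to reorder players with similar high-loading totals.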

I tested the model starting with the second approach. First, I reviewed how the top ten quarter backs performed in 2022 by looking at their overall rank, position rank, the player, and the total fantasy points they had.

# A tibble: 10 × 4
    Rank PositionRank Player          FantasyPointsPPR
   <dbl>        <dbl> <chr>                      <dbl>
 1     1            1 Patrick Mahomes             417.
 2     2            2 Josh Allen                  396.
 3     3            3 Jalen Hurts                 378 
 4     7            4 Joe Burrow                  351.
 5    13            5 Geno Smith                  304.
 6    17            6 Justin Fields               296 
 7    18            7 Trevor Lawrence             296.
 8    19            8 Kirk Cousins                292.
 9    20            9 Daniel Jones                289 
10    21           10 Jared Goff                  284.

Then we use the eigenvectors (loadings) to calculate the principal component scores for each player, only including variables that loaded high. The players with the highest scores would be the highest performing QBs predicted for the 2022 season. Two things to note. Firstly, the position and overall rank will be the same here since we’ve defined our first principal component as attributes of quarterbacks only. Secondly, the model’s fantasy points (PPR) would not be known. The purpose here is to calculate scores and draft in ascending order. We can, however, calculate the difference in fantasy points had we taken the model’s picks. In the table, the model’s fantasy points are the same as actual fantasy points to make this calculation easier. The results are shown below.

Methods of measuring model performance:

  1. Percentage of total players the model accurately selects. If we were selecting quarter backs, we would use the first principal component scores to obtain the top 10 quarter backs in ascending order. We would then compare the results against the actual top 10 quarter backs for that season. In these results, 70% (7 out of 10 QBs) were accurately selected as being in the top 10 for total fantasy points.

  2. Difference in total fantasy points of the top 20 players. The top 20 players for each position are isolated using their associated principal component scores. The sum of the total fantasy points for the model’s top 20 players is then subtracted from the actual total fantasy points of the true top 20 players at each position to obtain the delta.

  3. Absolute difference in position rank by player. Each player will have the net difference in their position rank between the model and their actual rank for that season. For example, the model selects Justin Herbert as the #2 QB for fantasy, but he actually finished #11 after the regular season, making the net difference 2 - 11 = -9.

Now using the model’s picks for the top ten quarterbacks, 70% of those selected in the top 10 were actually in the top ten during the 2022 regular season. The total fantasy points for the quarterbacks picked by the model were 3131.2. The total fantasy points the top ten quarterbacks actually had was 3302, which means the model was off by about 5% (170.8 of 3302 points) in regard to quarterback selection. The differences in rank by position are shown below. Players with a negative delta are those that were ranked higher by the model but came in lower after the season. Conservative thresholds are used to judge the model: a result of MAE > 5, Top10 < 70%, or Fantasy Error > 5% indicates that different measures should be taken to improve the model.

FantasyPosition MAE Top10 FantasyError
QB 4 70% 5.1%
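The three measures can be sketched as small helper functions. The fantasy point totals below are the ones reported in this post (the ten actual values sum to 3302.0, and 3131.2 is the model total quoted above); the placeholder player labels in the overlap example are hypothetical, and the function names are illustrative.

```python
def top_n_overlap(model_top, actual_top):
    """Share of the actual top-N players that the model also selected."""
    return len(set(model_top) & set(actual_top)) / len(actual_top)


def fantasy_error(model_total, actual_total):
    """Relative shortfall in fantasy points from drafting the model's picks."""
    return (actual_total - model_total) / actual_total


def mean_absolute_rank_error(model_ranks, actual_ranks):
    """Mean absolute difference between model and actual position ranks."""
    diffs = [abs(model_ranks[p] - actual_ranks[p]) for p in model_ranks]
    return sum(diffs) / len(diffs)


# Actual 2022 fantasy points of the top ten QBs, as listed in this post.
actual_top10_points = [417.4, 395.5, 378.0, 350.7, 303.9,
                       296.0, 295.6, 291.6, 289.0, 284.3]
actual_total = sum(actual_top10_points)
model_total = 3131.2  # total for the model's ten QB picks, from the text
error = fantasy_error(model_total, actual_total)

# Hypothetical labels: the model shares 7 of 10 players with the actual top ten.
actual_top = [f"QB{i}" for i in range(1, 11)]
model_top = [f"QB{i}" for i in range(1, 8)] + ["QB11", "QB12", "QB13"]
overlap = top_n_overlap(model_top, actual_top)

# Single-player rank error using the Justin Herbert example from the text.
herbert_error = mean_absolute_rank_error({"Justin Herbert": 2}, {"Justin Herbert": 11})

print(round(actual_total, 1), round(error, 4), overlap, herbert_error)
```

The relative shortfall works out to roughly 5.2% before rounding, which sits right at the 5% threshold used here.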

Principal Components - Fantasy Weights

While the results are seemingly great, this was when I recalled that these principal components only account for half of the total variation. Considering that the goal of principal component analysis via singular value decomposition is to capture as much of the total variation in the data as possible, I had to think we can do much better. The first principal component, which was attributed to quarterbacks, loads high for interceptions and fumbles. While it is justifiable to reason that high-volume quarterbacks will naturally throw interceptions and fumble more often solely as a function of volume, would the best quarterbacks really load high for those? To better understand this, the exact same process above was done to calculate new component scores, this time only including variables that are used for fantasy scoring. I first created these additional variables by multiplying each stat by its value in my league’s point system. For example, rushing touchdowns were multiplied by six and became the new variable used for the SVD. Those variables were passing yards, passing touchdowns, interceptions, rushing yards, rushing touchdowns, receptions, receiving yards, receiving touchdowns, fumbles, and fumbles lost.

Results

Parallel analysis suggested that three components were sufficient to explain most of the variation in the data.

The first three components alone account for 88.1% of the total variation within the data set, much better than the roughly 50% obtained previously. The mean item complexity = 1.1. This means that each individual variable included in the principal components loads significantly on essentially one component. This is the more ideal scenario since it makes defining the components much easier. Previously we had a mean item complexity of 1.5, meaning that a typical variable spread its loading across more than one component. The first three components are then defined using a loading threshold of .7.
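For intuition on item complexity, here is a minimal sketch assuming the reported figure follows Hofmann's index of complexity, which is the definition I believe common PCA software reports; treat that as an assumption rather than a confirmed detail of this analysis. For a variable with loadings a_1, ..., a_k, the index is (sum of a_j^2)^2 divided by the sum of a_j^4. The loading rows below are hypothetical.

```python
def item_complexity(loadings_row):
    """Hofmann's complexity index: 1 means the variable loads on one component."""
    squared = [a * a for a in loadings_row]
    return sum(squared) ** 2 / sum(s * s for s in squared)


# Hypothetical loading rows across three components.
clean_item = [0.9, 0.0, 0.0]   # loads on a single component
split_item = [0.6, 0.6, 0.0]   # loads equally on two components
mostly_one = [0.9, 0.1, 0.1]   # small cross-loadings

complexities = [item_complexity(row) for row in (clean_item, split_item, mostly_one)]
mean_complexity = sum(complexities) / len(complexities)
print([round(c, 3) for c in complexities], round(mean_complexity, 3))
```

A value near 1 indicates a variable that belongs to a single component, and a value near 2 indicates a variable split evenly across two, so a mean close to 1.1 points to a clean, easily interpreted structure.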

Dimension Definitions

rotation: maximum variance

PC1 Definition: Low Performing QBs
Attributes: (-) Passing Yards, (-) Passing Touchdowns, (+) Fumbles, (+) Interceptions

  1. PC1 would be attributed to quite literally the least performing quarter backs. The significantly negative loadings for passing yards and touch downs mean that quarter backs that load high to this component are unable to score touch downs. Additionally, they load extremely high for fumbles and interceptions, a confirmation of their under performance in relevant categories. PC1 would only be attributed to quarter backs since these in-game stats are generally relevant to them alone.

PC2 Definition: High-Performing Distance Efficiency
Attributes: + Receiving Touchdowns, + Receiving Yards, + Receptions

  1. PC2 can be attributed to total overall offensive efficiency given that we load extremely high for receiving touchdowns, receiving yards, and receptions. Players that load high to this category are likely wide receivers since wide receivers are more used for long range plays. The key distinction to make here is that this describes the overall long range efficiency only, since rushing yards and rushing touch downs are not accounted for in this component. Additionally, overall long range efficiency would better describe this component, since this must be a combination of quarter backs and the offensive line. Wide receivers, running backs, and tight ends will generally only score more touch downs and have more yards with a better quarter back.

PC3 Definition: High-Performing Driving Efficiency
Attributes: + Rushing Touchdowns, + Rushing Yards

  1. PC3 would be attributed to overall driving efficiency in the same fashion. Likewise, this is also a combination of the offensive line and the quarterback, considering that high-performing drivers will still be unable to score touchdowns if their quarterback cannot perform. I would expect running backs and tight ends to load high on this component.

The goal is to isolate players in these areas.

The coordinate plane shows the first component on the x axis and the second on the y axis. The scoring_ labels and associated arrows are the eigenvectors in this principal component space. An increase along the x axis, or the first principal component, means an increase in under-performance. We would want to obtain players that negatively contribute to this component, i.e. quadrant three. In the same fashion, an increase along the y axis means an increase in long range efficiency. The plot can only show the first two components. Based on the above, we’d use the first component for quarterbacks, the second for wide receivers, and the third for tight ends and running backs. However, since WRs, TEs, and RBs are much more similar to each other in position than to QBs (which the model confirms via the boxed region in quadrant I), those three positions were included and ranked for both PC2 and PC3. This allows us to more effectively see the primary position and players the model decides to pick for each category. In summary, principal component one was used for quarterbacks, and principal components two and three were used for all other positions at first. An overall score was then calculated using PC2 and PC3 only. Players with the highest overall scores would score significantly well on both PC2 and PC3. These players would be the best of the best in both long-range and driving efficiency. This allows us to see what positions the model picks for long-range (PC2) and driving (PC3) efficiency. Note that this tells us what position would be the best at both but does not tell us if individual players are the best at both.

# A tibble: 3 × 2
  FantasyPosition       n
  <fct>             <dbl>
1 RB               0.455 
2 WR               0.0738
3 TE              -0.287 

We see that running backs and wide receivers on average have positive scores, meaning that they contribute positively to the second and third components (long-range and driving efficiency). Surprisingly, tight ends on average are negative in both regards. The model rarely selects tight ends across all three of these dimensions. The results from the model show that the best overall QBs, via the lowest component scores, should be selected for PC1, the best wide receivers for PC2, and the best RBs for PC3. Only a select few tight ends are chosen for PC2 and PC3. Of the top 30 highest-performing players in long-range efficiency (PC2), three were tight ends and only one of them was in the top 10: Travis Kelce. Of the top 30 highest-performing players in driving efficiency (PC3), only one tight end, Taysom Hill, made the cut. Since fantasy league members must select players for every position, the best approach would be to select QBs, WRs, and RBs only for PC1, PC2, and PC3 respectively and then address tight ends afterwards. That same process was followed to compare the model’s picks against the actuals for 2022. The results for the three positions are shown below.

Model Picks

PC1 - QBs

Actual_OvrRank Actual_PositionRank Player Actual_FantasyPointsPPR PositionRank Model_PositionRank Model_Player Model_FantasyPointsPPR Model_PCscore
1 1 Patrick Mahomes 417.4 1 1 Patrick Mahomes 417.4 -385.8120
2 2 Josh Allen 395.5 2 2 Josh Allen 395.5 -350.7159
3 3 Jalen Hurts 378.0 4 3 Joe Burrow 350.7 -339.5820
7 4 Joe Burrow 350.7 8 4 Kirk Cousins 291.6 -325.2073
13 5 Geno Smith 303.9 5 5 Geno Smith 303.9 -316.3577
17 6 Justin Fields 296.0 11 6 Justin Herbert 281.3 -311.7454
18 7 Trevor Lawrence 295.6 10 7 Jared Goff 284.3 -309.3991
19 8 Kirk Cousins 291.6 12 8 Tom Brady 271.7 -304.4806
20 9 Daniel Jones 289.0 7 9 Trevor Lawrence 295.6 -299.4071
21 10 Jared Goff 284.3 13 10 Aaron Rodgers 239.2 -280.7368

The results are nearly identical to the previous method (on the original variables). The model accurately picks seven of the top ten players.

PC2 - WR

Actual_OvrRank Actual_PositionRank Player Actual_FantasyPointsPPR PositionRank Model_PositionRank Model_Player Model_FantasyPointsPPR Model_PCscore
5 1 Justin Jefferson 368.7 1 1 Justin Jefferson 368.7 343.4864
8 2 Tyreek Hill 347.2 3 2 Davante Adams 335.5 321.2936
9 3 Davante Adams 335.5 2 3 Tyreek Hill 347.2 319.6640
11 4 Stefon Diggs 316.6 4 4 Stefon Diggs 316.6 303.8524
15 5 CeeDee Lamb 301.6 6 5 A.J. Brown 299.6 291.2316
16 6 A.J. Brown 299.6 5 6 CeeDee Lamb 301.6 284.9864
26 7 Amon-Ra St. Brown 267.6 8 7 Jaylen Waddle 259.2 248.4996
27 8 Jaylen Waddle 259.2 7 8 Amon-Ra St. Brown 267.6 248.0896
28 9 DeVonta Smith 254.6 9 9 DeVonta Smith 254.6 246.5056
32 10 Amari Cooper 246.0 10 10 Amari Cooper 246.0 237.7820

PC3 - RB

Actual_OvrRank Actual_PositionRank Player Actual_FantasyPointsPPR PositionRank Model_PositionRank Model_Player Model_FantasyPointsPPR Model_PCscore
4 1 Austin Ekeler 372.7 13 1 Jamaal Williams 225.9 106.06931
6 2 Christian McCaffrey 356.4 4 2 Derrick Henry 302.8 88.05878
10 3 Josh Jacobs 328.3 3 3 Josh Jacobs 328.3 83.52732
14 4 Derrick Henry 302.8 6 4 Nick Chubb 281.4 82.29713
22 5 Saquon Barkley 284.0 1 5 Austin Ekeler 372.7 82.07121
23 6 Nick Chubb 281.4 22 6 Ezekiel Elliott 185.8 76.05968
29 7 Rhamondre Stevenson 249.1 15 7 Miles Sanders 216.7 74.20003
30 8 Tony Pollard 248.8 5 8 Saquon Barkley 284.0 68.97659
31 9 Aaron Jones 248.6 18 9 Kenneth Walker III 202.5 60.82183
35 10 Joe Mixon 239.5 8 10 Tony Pollard 248.8 60.40856

Model Accuracy

FantasyPosition MAE Top10 FantasyError
QB 2.2 70% 5.1%
RB 5.8 60% 4.5%
WR 0.6 100% 2.3%
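The MAE and Top10 columns above can be reproduced from the rank columns printed in the three model-pick tables. The sketch below is one way to do that in Python, assuming MAE is the mean absolute gap between each model pick's actual position rank (PositionRank) and its Model_PositionRank, and that Top10 is the share of model picks whose actual position rank is 10 or better. The function names are illustrative and are not taken from the original analysis.

```python
# Rank pairs transcribed from the 2022 model-pick tables:
# (actual position rank of the model's pick, model position rank).
picks = {
    "QB": [(1, 1), (2, 2), (4, 3), (8, 4), (5, 5),
           (11, 6), (10, 7), (12, 8), (7, 9), (13, 10)],
    "RB": [(13, 1), (4, 2), (3, 3), (6, 4), (1, 5),
           (22, 6), (15, 7), (5, 8), (18, 9), (8, 10)],
    "WR": [(1, 1), (3, 2), (2, 3), (4, 4), (6, 5),
           (5, 6), (8, 7), (7, 8), (9, 9), (10, 10)],
}


def rank_mae(pairs):
    """Mean absolute difference between actual and model position ranks."""
    return sum(abs(actual - model) for actual, model in pairs) / len(pairs)


def top_n_hit_rate(pairs, n=10):
    """Share of model picks that actually finished in the top n at the position."""
    return sum(1 for actual, _ in pairs if actual <= n) / len(pairs)


accuracy = {pos: (rank_mae(p), top_n_hit_rate(p)) for pos, p in picks.items()}
for pos, (mae, hit) in accuracy.items():
    print(f"{pos}: MAE={mae:.1f}, Top10={hit:.0%}")
```

Under these assumptions the printed values match the MAE and Top10 columns of the table. The FantasyError column is not reproduced here because its exact definition is not given.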

The table shows model accuracy against 2022 actual data. The mean absolute error (MAE) measures the average difference between the model’s position rank and the actual position rank by total fantasy points. On average, across the three positions, the model is off by about 3 rank positions, which is very good for fantasy purposes. Even those who know the sport extremely well will draft players based on emotion, sentiment, or loyalty in some fashion, and having these insights as a potential edge (if there is evidence that the scores generalize well to future seasons) will definitely be an advantage. There are still some areas for improvement: the model accurately selects only 7 of the top 10 quarterbacks in fantasy points, and 6 of the top 10 running backs. Normally, the process steps would be reviewed for ways to improve the model, but in the interest of time that will not be done here.

Now that I’m in a position where I like the results, what does the next step look like, and how can we apply this to the 2024 draft given that draft day is in 24 hours? This is the hardest part. PCA (computed via SVD) is commonly done after the exploratory phase as part of an ensemble of methods, with predictive models or machine learning methods layered on top of it depending on the goal of the research. With my goal of creating the fantasy team with the highest likelihood (albeit unknown) of obtaining the most fantasy points, I outlined different approaches one could take given where I am in the process and knowing that I only have 24 hours left.

1. Using 2022 PCA Scores to Rank Players and Compare to 2023 Actual

Process

PCA on 2022 Data: Perform PCA on 2022 in-game stats. Obtain principal component (PC) scores for each player, focusing on the first few principal components (e.g., PC1, PC2). Rank players based on their PC scores (e.g., higher scores on PC1 may indicate better performance).

Ranking and Testing: Compare the PCA-derived ranks from 2022 to the actual fantasy points scored by each player in 2023. Evaluate the accuracy of these ranks by calculating metrics like precision or correlation between PCA ranks and actual points.

Generalization: If 2022 PCA ranks generalize well to 2023 performance, use the same method on 2023 data. Apply this PCA-based ranking approach to the 2024 player pool to predict future performance.
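As a rough illustration of the "correlation between PCA ranks and actual points" check, the sketch below computes a Spearman rank correlation in plain Python. The inputs are the ten wide receivers from the 2022 PC2 table, paired with their FantasyPointsPPR2023 values as printed in the 2023 WR table later in this post. The helper names are made up for this example, and ten players is far too small a sample to draw conclusions from. It only shows the mechanics.

```python
def ranks(values):
    """1-based ranks with the largest value ranked 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# (player, 2022 PC2 score, 2023 FantasyPointsPPR), transcribed from the tables.
wr = [
    ("Justin Jefferson", 343.4864, 202.2), ("Davante Adams", 321.2936, 265.4),
    ("Tyreek Hill", 319.6640, 376.4), ("Stefon Diggs", 303.8524, 273.8),
    ("A.J. Brown", 291.2316, 289.6), ("CeeDee Lamb", 284.9864, 403.2),
    ("Jaylen Waddle", 248.4996, 198.6), ("Amon-Ra St. Brown", 248.0896, 330.9),
    ("DeVonta Smith", 246.5056, 227.6), ("Amari Cooper", 237.7820, 227.0),
]
rho = spearman([s for _, s, _ in wr], [p for _, _, p in wr])
print(f"Spearman correlation, 2022 PC2 score vs 2023 points: {rho:.2f}")
```

A value near 1 would suggest the 2022 ordering carried over to 2023, while a value near 0 would suggest it did not. In practice the check would be run on the full player pool rather than a top-10 slice.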

2. Combining 2022 and 2023 Data for PCA and Regression

Process

Data Preparation: Combine the 2022 and 2023 data sets, including in-game stats and calculated fantasy points. Standardize the data to ensure comparability.

PCA on Combined Data: Perform PCA on the combined data set to capture the underlying structure across both years. Extract PC scores for each player, focusing on the first few components (e.g., PC1, PC2, PC3).

Regression Analysis: Use the PC scores as features in a regression model along with other in-game stats (e.g., games started, targets). Train the model on 2022 and 2023 data to predict total fantasy points.

Projection for 2024: Input 2024 projection data into the trained regression model to predict total fantasy points for each player in 2024. Rank players based on these predictions.

3. Using 2024 Projection Data to Calculate PCA Scores

Process

PCA on 2024 Projection Data: Perform PCA on the 2024 projected in-game stats. Obtain PC scores for each player based on the projections.

Ranking Based on Projections: Rank players based on their PC scores from the 2024 projection data. Use these rankings to guide draft decisions.

Assumptions: This approach assumes that the 2024 projections are accurate enough to reflect actual performance, so PC scores from projections should correlate with final fantasy points.

4. Weighted PCA Combining Historical and Projected Data

Process

Weighting and Combining Data: Assign weights to the 2022, 2023, and 2024 (projected) data. For example, use a higher weight for more recent years like 2023. Combine the data sets into a single matrix, with projections and historical data weighted accordingly.

PCA on Weighted Data: Perform PCA on the weighted data set to capture the combined effect of historical and projected performance. Obtain PC scores for each player, reflecting a blend of past performance and future projections.

Ranking and Drafting: Rank players based on the weighted PC scores. Use these ranks to inform drafting decisions.

2023 | 2022 NFL seasons

The weighted PCA approach on both the ’22 and ’23 NFL regular seasons was used to select my fantasy team. After matching corresponding players in 2023, both years were joined such that statistics from both years were available for each player. A weighted average of the scoring_ variables was then calculated, consisting of one third of their statistics from ’22 and two thirds from ’23, emphasizing the more recent season at a 2:1 ratio. This weight only applies to athletes that played at least one game in both years. A total of 39 athletes that had season-ending injuries in ’22 but played in ’23 were omitted from the data. No weight was applied to the 142 rookie athletes in 2023: 100% of their ’23 NFL season was incorporated into their component scores, given that there is no other history for a baseline. Using combine data was considered, but was not used in the interest of time. Exploratory analysis showed that both the mean and spread of the underlying distributions for all in-game statistics were extremely similar between ’23 and ’24. For ’23 fantasy points, the mean was 83.1 (sd = 87.3).
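The 2:1 weighting described above can be sketched as follows. This is a minimal illustration, assuming each player's scoring_ variables are held in a dictionary per season and that a rookie is represented by a missing (None) 2022 record. The variable names and the numbers are hypothetical, not taken from the data set.

```python
W_2022, W_2023 = 1 / 3, 2 / 3  # recent season weighted 2:1


def weighted_stats(stats_2022, stats_2023):
    """Blend two seasons of scoring_ variables; rookies keep 100% of 2023."""
    if stats_2022 is None:  # rookie: no 2022 history to blend in
        return dict(stats_2023)
    return {
        name: W_2022 * stats_2022.get(name, 0.0) + W_2023 * value
        for name, value in stats_2023.items()
    }


# Hypothetical fantasy-point contributions for one veteran and one rookie.
veteran = weighted_stats(
    {"scoring_RushingYards": 90.0, "scoring_RushingTDs": 30.0},
    {"scoring_RushingYards": 120.0, "scoring_RushingTDs": 60.0},
)
rookie = weighted_stats(None, {"scoring_RushingYards": 100.0, "scoring_RushingTDs": 24.0})
print(veteran)
print(rookie)
```

For the veteran, 90 rushing-yard points in ’22 and 120 in ’23 blend to 110, while the rookie's values pass through unchanged.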

Results

Parallel and scree tests were performed on the weighted averages of the variables used for FantasyPointsPPR. The parallel test showed three components were sufficient, but as many as four could be used according to the scree test. The calculated eigenvalues of the first three components cumulatively explain 90.1% of the total variation within the data.
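For readers unfamiliar with how a cumulative figure like this is derived, each eigenvalue is divided by the sum of all eigenvalues and the shares are accumulated. The sketch below uses made-up eigenvalues purely for illustration. They are not the eigenvalues from this analysis, and the function names are hypothetical.

```python
def cumulative_variance(eigenvalues):
    """Cumulative share of total variance explained by the leading components."""
    total = sum(eigenvalues)
    running, shares = 0.0, []
    for value in eigenvalues:
        running += value
        shares.append(running / total)
    return shares


def components_needed(eigenvalues, target):
    """Smallest number of leading components whose cumulative share reaches target."""
    for k, share in enumerate(cumulative_variance(eigenvalues), start=1):
        if share >= target:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues, sorted from largest to smallest.
eigenvalues = [4.5, 2.7, 1.8, 0.6, 0.4]
shares = cumulative_variance(eigenvalues)
print([round(s, 3) for s in shares])
print(components_needed(eigenvalues, 0.85))
```

This is only the arithmetic behind the cumulative percentage. It does not replace the parallel or scree tests, which decide how many components to keep.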

The standardized loadings were then plotted on a pattern correlation matrix to define the components. The loading scores for each component were extremely similar in value to the ones obtained in the Fantasy Weights section, so the components were defined using the same definitions.

rotation: maximum variance

PC1 Definition: Low Performing QBs
Attributes: (-) Passing Yards, (-) Passing Touchdowns, (+) Fumbles, (+) Interceptions

  1. PC1 would be attributed to the worst overall performing quarterbacks. The significantly negative loadings for passing yards and touchdowns mean that quarterbacks that load high on this component are unable to score touchdowns. Additionally, they load extremely high for fumbles and interceptions, a confirmation of their under-performance in relevant categories. PC1 would only be attributed to quarterbacks since these in-game stats are generally relevant to them alone. When calculating scores for PC1, the highest-performing or best overall quarterbacks would have the lowest scores, since they contribute negatively to this component.

PC2 Definition: High-Performing Distance Efficiency
Attributes: + Receiving Touchdowns, + Receiving Yards, + Receptions

  1. PC2 can be attributed to overall long-range offensive efficiency, given that it loads extremely high for receiving touchdowns, receiving yards, and receptions. Players that load high on this component are likely wide receivers, since wide receivers are used more for long-range plays. The key distinction to make here is that this describes long-range efficiency only, since rushing yards and rushing touchdowns are not accounted for in this component. Additionally, ‘overall’ long-range efficiency better describes this component because it must reflect a combination of the quarterback and the offensive line: wide receivers, running backs, and tight ends will generally only score more touchdowns and gain more yards with a better quarterback.

PC3 Definition: High-Performing Driving Efficiency
Attributes: + Rushing Touchdowns, + Rushing Yards

  1. PC3 would be attributed to overall driving efficiency in the same fashion. Likewise, this is also a combination of the offensive line and the quarterback, considering that high-performing drivers will still be unable to score touchdowns if their quarterback cannot perform. I would expect running backs and tight ends to load high on this component.

The annotations show which regions in each component would isolate the best (or worst) quarterbacks for PC1 and the best long-range players for PC2. The benefit of plotting components as functions of each other is to understand the variability between two dimensions individually. Comparing PC1 and PC2 in this manner may not be necessary considering how we’ve defined the components, but visualizing it this way highlights any outliers and the spread of the data. The ellipses are the regions that represent the 95% confidence interval for each fantasy position. Looking at PC1 and its associated eigenvectors, it is evident that the best quarterbacks would be those that have high values along the x-axis (note that quarterbacks are the only group spread differently than the others, evidence that this component captures attributes specific to them). Although the individual component scores for these quarterbacks will be negative (as they contribute negatively to this component), these are orthogonal projections, so the fact that they appear ‘positive’ on the x axis here is meaningless. The interpretation would be identical if we flipped the plot over the y axis. The three quarterbacks we definitely want to obtain are the three outliers closest to the top of the annotated box. These quarterbacks would be something akin to the ‘best of the best’.

The other important part of this relates to PC2. The underlying variables that contribute the most to this component are receptions, receiving yards, and receiving touchdowns. This is the ‘long-range’ category (rushing yards and TDs are not accounted for here). In the previous biplot, wide receivers completely dominated this region when it came to individual component scores of athletes. While that is somewhat the case here, notice that running backs are not far behind. The athletes with the highest y values are still the wide receivers, but there is a sizable region of running backs, around [2, 2.5] on the x axis, that contribute just as well as wide receivers.

Understanding the variability between two dimensions individually is most beneficial for PC2 and PC3, since it easily highlights players that are the best at both. The dynamic nature of football and the physical build of players make it extremely difficult to be both an efficient driving scorer and an efficient passing (receiving) scorer. Running backs tend to do well in the driving aspect; they have stockier frames, making it easier to drive through defenses and gain yardage that way. Wide receivers do well on the receiving end; they have slimmer frames, leading to elusive plays that gain the most yardage (Hail Marys as an extreme example). Players that contribute equally well to PC2 and PC3 are wraps for fantasy (granted they have a good quarterback).

The same trend identified in the component definitions is seen here. The region with the highest x values belongs to wide receivers (the tip of their 95% CI extends out the most), with a handful of outliers near the max. The region with the highest y values, representing PC3, is almost entirely reserved for running backs (with a couple of exceptions). Notice that almost no players outside of running backs contribute positively to PC3; I can count by hand the six players that contribute positively to PC3 that are not running backs. There are two parts of this graph to pay attention to. Firstly, the red box indicates the top of the 95% confidence interval for running backs, i.e. the best of the best for PC3. Players in this region should be selected in the fantasy draft. More importantly, there is one player identified that contributes extremely well to both PC2 and PC3. The vertical line separates the top six NFL athletes for long-range efficiency, who are wide receivers (using a weighted average of the ’22 and ’23 seasons), but there is only one running back in the entire NFL offense that is in this category while also being one of the best in driving efficiency. Take a guess who that is.

Model Picks

There are two different ways of calculating overall scores (i.e. principal component scores) for each player to identify the players that should be drafted this season. The first is to use only the variables whose loadings meet the empirical threshold (+/-.75). Usually, the empirical threshold is only used to define what the principal components are or represent. The second is to use all variables. Generally this is preferred when calculating scores, the benefit being that it captures the combined variance of all the variables, so the overall score is a more accurate reflection. The one drawback with this approach is that it complicates the model. The benefit of using only the variables that meet the empirical threshold is that it simplifies the model and makes it easier to interpret. Note that when interpreting the visualizations in this analysis, specifically the biplots, all component scores need to be included in the calculation. I will show the picks using both methods.
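The difference between the two scoring methods can be sketched in a few lines. This is an illustration only: it assumes a component score is the loading-weighted sum of a player's scoring_ variables, and that the threshold method simply drops variables whose absolute loading is below .75. The loadings and player values are hypothetical and are not the ones estimated in this analysis.

```python
THRESHOLD = 0.75  # empirical loading threshold named in the text


def component_score(stats, loadings, threshold=None):
    """Loading-weighted sum of stats; optionally skip loadings below the threshold."""
    total = 0.0
    for name, loading in loadings.items():
        if threshold is not None and abs(loading) < threshold:
            continue  # variable does not meet the empirical threshold
        total += loading * stats.get(name, 0.0)
    return total


# Hypothetical PC3 loadings and one player's (weighted) scoring_ values.
pc3_loadings = {
    "scoring_RushingYards": 0.9,
    "scoring_RushingTDs": 0.8,
    "scoring_Receptions": 0.1,
}
player = {
    "scoring_RushingYards": 100.0,
    "scoring_RushingTDs": 60.0,
    "scoring_Receptions": 50.0,
}

score_all = component_score(player, pc3_loadings)
score_thresholded = component_score(player, pc3_loadings, THRESHOLD)
print(score_all, score_thresholded)
```

With these made-up numbers the thresholded score ignores the small receptions loading, so it is slightly lower than the all-variables score but depends only on the variables that define the component.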

Approach 1 - Empirical Threshold

QBs

ModelRank Rank2023 PositionRank2023 Team2023 Player2023 FantasyPointsPPR2023 pc1 pc2 pc3 ovr
1 3 1 BUF Josh Allen 392.6 -336.8850 0.0000000 75.2816532 412.1666
2 50 8 KAN Patrick Mahomes 280.2 -331.6486 0.5168941 11.1670530 343.3325
3 40 7 DET Jared Goff 289.1 -323.6598 0.4842071 7.8787255 332.0227
4 66 11 MIA Tua Tagovailoa 270.4 -316.1377 0.0000000 0.7021111 316.8398
5 8 3 DAL Dak Prescott 342.8 -302.7733 0.0000000 11.5304138 314.3037
6 76 12 JAX Trevor Lawrence 262.5 -296.4182 0.0000000 27.5229754 323.9412
7 5 2 PHI Jalen Hurts 356.8 -277.1566 0.0000000 87.0594829 364.2161
8 58 9 HOU C.J. Stroud 276.0 -275.1716 0.9623171 18.5073453 294.6413
9 80 16 NOR Derek Carr 241.1 -270.6667 0.0000000 0.5861661 271.2528
10 83 19 SEA Geno Smith 227.3 -270.3990 0.5107969 7.8084478 278.7183
11 60 10 TAM Baker Mayfield 274.1 -251.3108 0.0000000 6.9678469 258.2787
12 81 17 LAC Justin Herbert 234.2 -250.4118 1.6099590 13.2045966 265.2264
13 78 13 DEN Russell Wilson 256.9 -248.3591 0.3534593 19.9824227 268.6950
14 31 6 SFO Brock Purdy 295.6 -247.5300 0.0000000 10.3548609 257.8848
15 10 4 BAL Lamar Jackson 331.2 -241.6355 0.0000000 32.1511112 273.7866
16 89 24 MIN Kirk Cousins 149.7 -233.3930 0.0000000 4.2276148 237.6206
17 79 15 LAR Matthew Stafford 243.1 -225.1303 0.0000000 2.3247628 227.4550
18 90 25 CIN Joe Burrow 147.2 -222.2560 0.0000000 10.7799923 233.0360
19 18 5 GNB Jordan Love 319.1 -218.7511 0.0000000 16.6044933 235.3556
20 82 18 CHI Justin Fields 230.2 -200.9537 0.0000000 37.9466259 238.9003

WRs

ModelRank Rank2023 PositionRank2023 Team2023 Player2023 FantasyPointsPPR2023 pc1 pc2 pc3 ovr
1 4 2 MIA Tyreek Hill 376.4 -2.8242470 352.7382 2.0767694 357.6392
2 2 1 DAL CeeDee Lamb 403.2 -5.6484940 344.4119 8.3875950 358.4480
3 7 3 DET Amon-Ra St. Brown 330.9 -2.2389478 298.3283 0.4605591 301.0279
4 20 8 PHI A.J. Brown 289.6 -6.7168433 289.3462 0.0000000 296.0631
5 14 5 LAR Puka Nacua 298.5 -1.7558977 282.0636 0.8599250 284.6794
6 34 14 LVR Davante Adams 265.4 -0.5852992 281.3340 -0.0032207 281.9161
7 29 12 BUF Stefon Diggs 273.8 -3.9948454 280.2885 0.0225449 284.3059
8 9 4 TAM Mike Evans 282.5 0.0000000 256.9248 0.0000000 256.9248
9 33 13 CIN Ja’Marr Chase 262.7 -3.2292872 250.8549 -0.0128828 254.0713
10 70 26 MIN Justin Jefferson 202.2 -2.6767197 249.3281 1.8770865 253.8819
11 15 6 CHI D.J. Moore 286.5 -2.2389478 245.1010 4.0601387 251.4000
12 25 10 SFO Brandon Aiyuk 249.2 -3.3584216 237.1919 0.0740759 240.6244
13 54 22 PHI DeVonta Smith 227.6 -3.3584216 232.4062 0.0000000 235.7646
14 27 11 LAC Keenan Allen 278.9 -8.3019302 231.6001 0.0644139 239.9664
15 62 25 IND Michael Pittman Jr. 250.2 -5.1143193 230.7656 0.0966208 235.9765
16 42 16 CLE Amari Cooper 227.0 -4.0363499 228.1088 0.0000000 232.1452
17 39 15 SEA D.K. Metcalf 225.4 -2.2389478 221.4407 0.0000000 223.6796
18 47 19 MIN Jordan Addison 221.3 0.0000000 215.5321 0.0193242 215.5515
19 57 24 NOR Chris Olave 231.3 -2.2389478 213.8417 0.0000000 216.0807
20 103 32 MIA Jaylen Waddle 198.6 -1.1194739 212.2413 0.1610346 213.5218

RBs

Christian McCaffrey is favored as the first pick outside of drafting quarterbacks. Although he is the #3 pick for running backs, ideal picks for offensive players (although rare) would load high on principal components two and three. His combined component score for those dimensions is the highest of any offensive player included in the data.

ModelRank Rank2023 PositionRank2023 Team2023 Player2023 FantasyPointsPPR2023 pc1 pc2 pc3 ovr
1 6 2 MIA Raheem Mostert 267.7 -6.8702170 59.71549 82.59469 149.18040
2 16 5 TEN Derrick Henry 246.7 -11.8040821 55.54816 81.92272 149.27497
3 1 1 SFO Christian McCaffrey 391.3 -7.9271136 168.60418 80.64146 257.17276
4 32 9 DET David Montgomery 207.2 -5.6484940 41.11002 67.30746 114.06597
5 35 10 DET Jahmyr Gibbs 242.1 -5.1143193 86.91137 65.44326 157.46895
6 53 14 BAL Gus Edwards 187.0 -6.7679679 19.46584 61.04759 87.28140
7 12 3 JAX Travis Etienne 282.4 -4.5290201 93.80611 60.79753 159.13267
8 17 6 CIN Joe Mixon 267.0 0.0000000 107.29194 56.20920 163.50113
9 59 16 SEA Kenneth Walker III 199.4 -1.1705985 53.51327 56.13834 110.82221
10 93 25 LVR Josh Jacobs 181.1 -6.8702170 73.16355 55.55920 135.59296
11 65 20 PIT Najee Harris 195.5 -6.2852958 56.23899 53.17002 115.69431
12 13 4 LAR Kyren Williams 255.0 -5.6484940 51.13492 52.86670 109.65011
13 69 21 MIA De’Von Achane 190.7 -3.3584216 62.95150 52.77974 119.08966
14 43 13 NYG Saquon Barkley 223.2 -5.0631947 89.62998 51.71806 146.41124
15 99 31 LAC Austin Ekeler 185.4 -14.6554096 132.87294 50.16511 197.69347
16 63 18 DAL Tony Pollard 222.6 -4.5801447 86.13688 49.13565 139.85267
17 52 15 ARI James Conner 201.5 -2.2900723 62.47610 48.63644 113.40261
18 61 17 KAN Isiah Pacheco 213.9 -5.6484940 60.44341 44.36051 110.45242
19 122 37 CLE Kareem Hunt 118.5 -0.5852992 35.16709 43.57351 79.32590
20 97 28 IND Jonathan Taylor 156.4 -4.5290201 39.76985 41.33364 85.63251

Top 75

Based on the functional nature of football and the analysis thus far, a top 75 list wouldn’t be very meaningful, since each component and position is unique. For those that are curious, however, such a ranking can, in theory, be created by combining all component scores.

ModelRank Rank2023 PositionRank2023 Team2023 FantasyPosition2023 Player2023 FantasyPointsPPR2023 pc1 pc2 pc3 ovr
1 3 1 BUF QB Josh Allen 392.6 -336.8849856 0.0000000 75.2816532 412.1666
2 5 2 PHI QB Jalen Hurts 356.8 -277.1566447 0.0000000 87.0594829 364.2161
3 2 1 DAL WR CeeDee Lamb 403.2 -5.6484940 344.4118627 8.3875950 358.4480
4 4 2 MIA WR Tyreek Hill 376.4 -2.8242470 352.7382300 2.0767694 357.6392
5 50 8 KAN QB Patrick Mahomes 280.2 -331.6485727 0.5168941 11.1670530 343.3325
6 40 7 DET QB Jared Goff 289.1 -323.6597758 0.4842071 7.8787255 332.0227
7 76 12 JAX QB Trevor Lawrence 262.5 -296.4181804 0.0000000 27.5229754 323.9412
8 66 11 MIA QB Tua Tagovailoa 270.4 -316.1377077 0.0000000 0.7021111 316.8398
9 8 3 DAL QB Dak Prescott 342.8 -302.7732618 0.0000000 11.5304138 314.3037
10 7 3 DET WR Amon-Ra St. Brown 330.9 -2.2389478 298.3283464 0.4605591 301.0279
11 20 8 PHI WR A.J. Brown 289.6 -6.7168433 289.3462135 0.0000000 296.0631
12 58 9 HOU QB C.J. Stroud 276.0 -275.1716287 0.9623171 18.5073453 294.6413
13 14 5 LAR WR Puka Nacua 298.5 -1.7558977 282.0636273 0.8599250 284.6794
14 29 12 BUF WR Stefon Diggs 273.8 -3.9948454 280.2885390 0.0225449 284.3059
15 34 14 LVR WR Davante Adams 265.4 -0.5852992 281.3340147 -0.0032207 281.9161
16 83 19 SEA QB Geno Smith 227.3 -270.3990214 0.5107969 7.8084478 278.7183
17 10 4 BAL QB Lamar Jackson 331.2 -241.6354945 0.0000000 32.1511112 273.7866
18 80 16 NOR QB Derek Carr 241.1 -270.6666575 0.0000000 0.5861661 271.2528
19 78 13 DEN QB Russell Wilson 256.9 -248.3591059 0.3534593 19.9824227 268.6950
20 81 17 LAC QB Justin Herbert 234.2 -250.4118266 1.6099590 13.2045966 265.2264
21 60 10 TAM QB Baker Mayfield 274.1 -251.3108188 0.0000000 6.9678469 258.2787
22 31 6 SFO QB Brock Purdy 295.6 -247.5299791 0.0000000 10.3548609 257.8848
23 1 1 SFO RB Christian McCaffrey 391.3 -7.9271136 168.6041824 80.6414638 257.1728
24 9 4 TAM WR Mike Evans 282.5 0.0000000 256.9248291 0.0000000 256.9248
25 33 13 CIN WR Ja’Marr Chase 262.7 -3.2292872 250.8548747 -0.0128828 254.0713
26 70 26 MIN WR Justin Jefferson 202.2 -2.6767197 249.3280744 1.8770865 253.8819
27 15 6 CHI WR D.J. Moore 286.5 -2.2389478 245.1009581 4.0601387 251.4000
28 26 3 KAN TE Travis Kelce 219.4 -3.3584216 246.1584535 0.0161035 249.5330
29 25 10 SFO WR Brandon Aiyuk 249.2 -3.3584216 237.1919041 0.0740759 240.6244
30 27 11 LAC WR Keenan Allen 278.9 -8.3019302 231.6000597 0.0644139 239.9664
31 82 18 CHI QB Justin Fields 230.2 -200.9537028 0.0000000 37.9466259 238.9003
32 89 24 MIN QB Kirk Cousins 149.7 -233.3929601 0.0000000 4.2276148 237.6206
33 62 25 IND WR Michael Pittman Jr. 250.2 -5.1143193 230.7656078 0.0966208 235.9765
34 54 22 PHI WR DeVonta Smith 227.6 -3.3584216 232.4062271 0.0000000 235.7646
35 18 5 GNB QB Jordan Love 319.1 -218.7511326 0.0000000 16.6044933 235.3556
36 90 25 CIN QB Joe Burrow 147.2 -222.2560458 0.0000000 10.7799923 233.0360
37 42 16 CLE WR Amari Cooper 227.0 -4.0363499 228.1088277 0.0000000 232.1452
38 11 1 DET TE Sam LaPorta 239.3 0.0000000 228.7718784 0.0386483 228.8105
39 79 15 LAR QB Matthew Stafford 243.1 -225.1302799 0.0000000 2.3247628 227.4550
40 39 15 SEA WR D.K. Metcalf 225.4 -2.2389478 221.4406898 0.0000000 223.6796
41 112 36 NYJ WR Garrett Wilson 213.2 -6.1826686 211.3563377 0.0128828 217.5519
42 57 24 NOR WR Chris Olave 231.3 -2.2389478 213.8417338 0.0000000 216.0807
43 28 4 MIN TE T.J. Hockenson 219.0 -3.3584216 212.5474634 0.0000000 215.9059
44 47 19 MIN WR Jordan Addison 221.3 0.0000000 215.5321435 0.0193242 215.5515
45 77 14 WAS QB Sam Howell 257.5 -191.5508111 0.9030403 22.4547598 214.9086
46 72 27 KAN WR Rashee Rice 212.5 -5.1143193 209.1896872 -0.0289862 214.2750
47 36 6 JAX TE Evan Engram 230.3 -5.6484940 208.0803575 0.0418690 213.7707
48 103 32 MIA WR Jaylen Waddle 198.6 -1.1194739 212.2412612 0.1610346 213.5218
49 107 34 SEA WR Tyler Lockett 202.4 -2.2389478 207.0999236 0.0000000 209.3389
50 102 31 CAR WR Adam Thielen 231.0 -2.3411969 205.6832038 0.0515311 208.0759
51 104 33 TAM WR Chris Godwin 209.2 -2.2389478 200.7186832 4.0150490 206.9727
52 75 29 WAS WR Terry McLaurin 209.2 -0.5852992 205.3121344 0.0934001 205.9908
53 88 23 CAR QB Bryce Young 156.4 -200.2969227 0.0000000 2.4445059 202.7414
54 45 18 GNB WR Jayden Reed 217.2 -3.5117954 186.4197583 12.4123061 202.3439
55 51 21 LVR WR Jakobi Meyers 218.6 -5.0590601 189.2252456 7.6275115 201.9118
56 23 9 HOU WR Nico Collins 260.4 -1.1705985 199.0064355 0.0450897 200.2221
57 22 2 SFO TE George Kittle 203.2 -1.1194739 197.7416374 0.0128828 198.8740
58 99 31 LAC RB Austin Ekeler 185.4 -14.6554096 132.8729430 50.1651129 197.6935
59 21 7 SFO WR Deebo Samuel 243.7 -3.9948454 166.8070814 26.5986366 197.4006
60 49 20 TEN WR DeAndre Hopkins 223.6 -1.7047731 194.5227328 0.0579725 196.2855
61 101 30 BAL WR Zay Flowers 206.4 0.0000000 187.6528827 6.1723358 193.8252
62 55 23 PIT WR George Pickens 208.8 -3.5117954 185.2630141 2.0703280 190.8451
63 74 28 DEN WR Courtland Sutton 190.2 -6.7168433 179.0934789 0.0161035 185.8264
64 30 5 CLE TE David Njoku 201.2 -5.5973694 180.1365787 -0.0257655 185.7082
65 134 47 JAX WR Christian Kirk 150.3 -5.5716181 178.6363682 0.0740759 184.2821
66 85 21 IND QB Gardner Minshew II 196.2 -165.8224506 0.0000000 13.7934059 179.6159
67 86 22 ATL QB Desmond Ridder 177.1 -154.1840391 0.2493013 20.2201764 174.6535
68 121 42 ATL WR Drake London 174.4 -3.9249500 170.5222128 0.0000000 174.4472
69 120 41 LAR WR Cooper Kupp 164.4 -1.7047731 168.0109081 2.0252383 171.7409
70 38 12 ATL RB Bijan Robinson 246.3 -11.8311626 127.1046309 31.9552264 170.8910
71 84 20 2TM QB Joshua Dobbs 200.7 -144.9644778 0.0000000 25.3785715 170.3430
72 118 31 NWE QB Mac Jones 106.4 -166.9169863 0.0000000 2.8239702 169.7410
73 46 8 CHI TE Cole Kmet 181.1 -1.7558977 164.9200433 0.0418690 166.7178
74 113 38 BUF WR Gabriel Davis 161.4 -3.3584216 161.5473649 -0.0128828 164.8929
75 91 26 ARI QB Kyler Murray 146.4 -144.1053422 0.0000000 19.8117259 163.9171
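The printed ovr values in the table above appear consistent with pc2 + pc3 - pc1, i.e. PC1 is subtracted because a more negative PC1 score indicates a better quarterback. That formula is inferred from the printed numbers rather than stated in the text, so treat the sketch below as an assumption. It recomputes ovr for a handful of rows copied from the table and sorts them.

```python
# (player, pc1, pc2, pc3) copied from rows of the Top 75 table.
rows = [
    ("Christian McCaffrey", -7.9271136, 168.6041824, 80.6414638),
    ("Tyreek Hill", -2.8242470, 352.7382300, 2.0767694),
    ("Josh Allen", -336.8849856, 0.0000000, 75.2816532),
    ("CeeDee Lamb", -5.6484940, 344.4118627, 8.3875950),
    ("Jalen Hurts", -277.1566447, 0.0000000, 87.0594829),
]


def overall_score(pc1, pc2, pc3):
    """Assumed combination: reward high PC2 and PC3, reward low (negative) PC1."""
    return pc2 + pc3 - pc1


ranked = sorted(
    ((name, overall_score(pc1, pc2, pc3)) for name, pc1, pc2, pc3 in rows),
    key=lambda item: item[1],
    reverse=True,
)
for rank, (name, ovr) in enumerate(ranked, start=1):
    print(rank, name, round(ovr, 4))
```

Sorting by this value reproduces the relative order of these five players in the table: Josh Allen, Jalen Hurts, CeeDee Lamb, Tyreek Hill, then Christian McCaffrey.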