Principal Component Analysis to Inform Draft Selection for Fantasy Football Teams

Author

Tomi Akisanya

Around this time every year, if you’re like me, you’ve already started to mentally prepare for 80% of your conversations to be on the topic of football. For those individuals who would not classify themselves as ffanatics, it’s probably annoying - it’s annoying for all of us. Pretty consistently there will be behavior from fans that ranges from DMs to athletes and verbal assaults on close friends to damaged property and public humiliation. Who realistically has the mental endurance to discuss a single topic over an extended period of time for something they’re not physically involved in? It’s those with passion, and although I am not passionate about football (nor do I justify the aforementioned behavior), this will be my first of hopefully many Fantasy Football leagues. If there’s one thing I am passionate about, it’s tilting the odds in my favor. Usually that’s in the form of taking a creative line on the felt on a 2/5 reg by extracting max value with 67s on a low connected board in a 4! pot as the preflop aggressor. Balance and discipline are the name of the game though, and who are we if we don’t apply the same approach to all areas in life…

Data Sets

The data contains in-game and Fantasy Football Points per Reception stats by NFL player from 2017 - 2023 for all 17 games of the regular season. Most leagues use a points per reception based metric to calculate fantasy points, or FantasyPointsPPR. Before converting to fantasy points, in-game stats may be weighted or counted differently. My league adopted the following criteria:

Variable Name	Scoring Rule	Points per Unit
Completions	Typically not scored (0 points)	0.00
Attempts	Typically not scored (0 points)	0.00
PassingYards	1 point per 25 passing yards	0.04
PassingTDs	4 points per passing touchdown	4.00
Interceptions	-2 points per interception	-2.00
RushingAttempts	Typically not scored (0 points)	0.00
RushingYards	1 point per 10 rushing yards	0.10
RushingTDs	6 points per rushing touchdown	6.00
Targets	Typically not scored (0 points)	0.00
Receptions	1 point per reception (PPR format)	1.00
ReceivingYards	1 point per 10 receiving yards	0.10
ReceivingTDs	6 points per receiving touchdown	6.00
Fumbles	-2 points per fumble (if lost)	-2.00
FumblesLost	-2 points per fumble lost	-2.00
CompletionsPerAttempt	Typically not scored (0 points)	0.00
TDsPerAttempt	Typically not scored (0 points)	0.00
InterceptionsPerAttempt	Typically not scored (0 points)	0.00
TDsPerReception	Typically not scored (0 points)	0.00
FumblesLostPerFumble	Typically not scored (0 points)	0.00
FantasyPointsPPR	NA	NA

Two different data sets are used, with a focus on three distinct NFL regular seasons - 2022, 2023, and 2024. Both data sets have been scraped but differ in source, purpose, and underlying information present:

Data Set 1 - Historical - 2022 & 2023: This data set contains historical data from 2017-2023 for both relevant in-game statistics and fantasy scoring for the regular NFL season. This project primarily focuses on 2022 and 2023. Each observation or row in the data set is an NFL athlete’s record for a season, with fields such as position, team, completions, attempts, interceptions per attempt, etc. Since most leagues exclude defensive players from their fantasy team, those have been implicitly removed from the data set. The key feature of this data set is the retroactive fantasy rankings/scoring. The total fantasy points, overall rank, and rank by position are available for each player. This enables the direct comparison of calculated rankings from models to their actual rank.

Players that were on two or more teams in a given season are not assigned a team but are instead given a placeholder team code to highlight this. For example, you may see Baker Mayfield’s registered team as 2TM. One notable variable contained in the data set is ADP, or average draft position, representing the average pick number at which the player was drafted across all recorded leagues before the start of the respective season (a lower ADP means an earlier pick). Additional in-game stats were calculated afterwards. These were YardsPerRushAttempt, CompletionsPerAttempt, TDsPerAttempt, InterceptionsPerAttempt, TDsPerReception, and FumblesLostPerFumble. My fantasy league’s scoring methodologies were also factored into a set of new variables. These variables carry the scoring_ prefix.

Years 2017-2022 and 2023 are taken from the same source but were scraped separately. Player IDs assigned

Data Set 2 - Projected - 2024: This data set contains all player match ups for the upcoming regular 2024 NFL season. The in-game statistics recorded for each player are projections based on those teams and match ups. One benefit of this data set is being able to use these projections as inputs to our model to determine which players obtain the most fantasy points, rank them in ascending order, and draft them accordingly. The drawback is that no additional information on how these projections were calculated is known, so their accuracy cannot be confirmed.

Methods

The current methodology is to use principal component analysis, computed through a singular value (eigen) decomposition, to create scores on new variables that can be attributed to each player’s overall performance. Each player’s projected overall performance would then be used to rank players in ascending order (potentially by position) to inform our draft decision. Starting with the 2022 season, the overall rank for each player is calculated and then compared to the actual ranks of that same year. Precision will be measured in three ways:

Difference in overall rank
Difference in position rank
Difference in fantasy points, obtained by taking the difference of fantasy points using our model’s draft order with the fantasy points using the ideal draft order.

If the model is precise, the same 2022 projections will then be tested on 2023. This method is obviously not foolproof. Many things change between off seasons of professional sports, but the objective is to quantify the model’s ability to generalize onto future seasons. If it can, the same process will be done starting with the 2023 data set, testing it against itself, then using those scores for 2024. If it cannot, principal component scores will be calculated using the 2024 data set only, and only those will inform our draft order. This is not the ideal scenario, since it inherently trusts the projected data.

Processing

Missing values in completions per attempt, TDs per attempt, interceptions per attempt, TDs per reception, and fumbles lost per fumble were the result of a zero in the denominator, which leaves the ratio undefined. These missing values are directly related to the player’s position. For example, QBs will rarely record a reception and will therefore have undefined values for touchdowns per reception, a statistic better suited to measuring the performance of wide receivers. All missing values in these cases were replaced with zero. The same approach was applied to yards per attempt and yards per reception with two notable exceptions. Foster Moreau had zero rushing attempts but two rushing yards during the 2022 NFL season, which is difficult to interpret considering rushing yards are a function of rushing attempts. In the same vein, Joe Flacco had -3 receiving yards but zero receptions. Both of these players were removed from the data set.
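The zero-denominator handling described above can be sketched in a few lines. This is a minimal Python illustration rather than the code used for the analysis; the function names and the three example rows are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


def is_inconsistent(yards, attempts):
    """Flag rows that report non-zero yards on zero attempts (or receptions)."""
    return attempts == 0 and yards != 0


# Hypothetical rows: (player, rushing_attempts, rushing_yards)
rows = [
    ("Player A", 10, 45),  # ordinary case
    ("Player B", 0, 0),    # undefined ratio, replaced with zero
    ("Player C", 0, 2),    # yards without attempts, dropped
]

cleaned = [
    (name, safe_ratio(yards, attempts))
    for name, attempts, yards in rows
    if not is_inconsistent(yards, attempts)
]
print(cleaned)
```

Dropping the inconsistent rows before computing the ratio keeps a player with yards but no attempts from silently receiving a zero.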

[1] 1367

YoY PPR Trend for Top NFL Teams

2022 NFL season

Data Exploration

The data was subset to the 2022 season and then skimmed to review distributions, counts, and other elements within the data set.

Data summary
Name	fantasy
Number of rows	575
Number of columns	43
_______________________
Column type frequency:
character	2
factor	2
numeric	39
________________________
Group variables	None

Variable type: character

skim_variable	n_missing	complete_rate	min	max	empty	n_unique	whitespace
Player	0	1	8	24	0	575	0
PlayerID	0	1	8	8	0	575	0

Variable type: factor

skim_variable	n_missing	complete_rate	ordered	n_unique	top_counts
Team	0	1	FALSE	34	2TM: 28, ARI: 20, DEN: 20, LAC: 20
FantasyPosition	0	1	FALSE	4	WR: 218, RB: 162, TE: 113, QB: 82

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
Rank	0	1.00	289.17	166.96	1.00	144.50	290.00	433.50	577.00	▇▇▇▇▇
Age	0	1.00	26.19	3.23	21.00	24.00	26.00	28.00	45.00	▇▇▁▁▁
Games	0	1.00	11.60	5.16	1.00	8.00	13.00	16.00	17.00	▂▂▂▂▇
GamesStarted	0	1.00	5.56	5.81	0.00	0.00	3.00	11.00	17.00	▇▂▂▂▂
Completions	0	1.00	19.98	72.11	0.00	0.00	0.00	0.00	490.00	▇▁▁▁▁
Attempts	0	1.00	31.08	110.35	0.00	0.00	0.00	0.00	733.00	▇▁▁▁▁
PassingYards	0	1.00	219.00	789.12	0.00	0.00	0.00	0.00	5250.00	▇▁▁▁▁
PassingTDs	0	1.00	1.30	5.04	0.00	0.00	0.00	0.00	41.00	▇▁▁▁▁
Interceptions	0	1.00	0.72	2.40	0.00	0.00	0.00	0.00	15.00	▇▁▁▁▁
RushingAttempts	0	1.00	25.65	56.65	0.00	0.00	2.00	17.00	349.00	▇▁▁▁▁
RushingYards	0	1.00	114.51	261.73	-15.00	0.00	5.00	73.50	1653.00	▇▁▁▁▁
YardsPerAttempt	0	1.00	2.62	3.69	-7.50	0.00	1.75	4.46	40.00	▇▇▁▁▁
RushingTDs	0	1.00	0.85	2.14	0.00	0.00	0.00	1.00	17.00	▇▁▁▁▁
Targets	0	1.00	29.95	37.05	0.00	3.00	14.00	44.00	184.00	▇▂▁▁▁
Receptions	0	1.00	20.11	24.93	0.00	2.00	9.00	30.00	128.00	▇▂▁▁▁
ReceivingYards	0	1.00	220.03	304.02	-10.00	11.00	96.00	313.50	1809.00	▇▂▁▁▁
YardsPerReception	0	1.00	8.67	6.23	-6.00	5.00	8.90	12.11	42.00	▃▇▂▁▁
ReceivingTDs	0	1.00	1.30	2.12	0.00	0.00	0.00	2.00	14.00	▇▁▁▁▁
Fumbles	0	1.00	1.04	1.98	0.00	0.00	0.00	1.00	16.00	▇▁▁▁▁
FumblesLost	0	1.00	0.48	0.98	0.00	0.00	0.00	1.00	9.00	▇▁▁▁▁
FantasyPointsPPR	0	1.00	78.41	85.43	-2.90	11.50	43.40	116.35	417.40	▇▂▂▁▁
PositionRank	0	1.00	81.82	55.22	1.00	37.00	73.00	118.50	218.00	▇▇▅▃▂
Year	0	1.00	2022.00	0.00	2022.00	2022.00	2022.00	2022.00	2022.00	▁▁▇▁▁
ADP	429	0.25	71.90	42.07	1.30	36.42	70.65	107.15	153.80	▇▇▇▇▅
CompletionsPerAttempt	0	1.00	0.10	0.24	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▁
TDsPerAttempt	0	1.00	0.01	0.08	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▁
InterceptionsPerAttempt	0	1.00	0.01	0.08	0.00	0.00	0.00	0.00	1.00	▇▁▁▁▁
TDsPerReception	0	1.00	0.05	0.11	0.00	0.00	0.00	0.07	1.00	▇▁▁▁▁
FumblesLostPerFumble	0	1.00	0.20	0.36	0.00	0.00	0.00	0.25	1.00	▇▁▁▁▂
scoring_PassingYards	0	1.00	8.76	31.56	0.00	0.00	0.00	0.00	210.00	▇▁▁▁▁
scoring_PassingTDs	0	1.00	5.18	20.16	0.00	0.00	0.00	0.00	164.00	▇▁▁▁▁
scoring_Interceptions	0	1.00	-1.44	4.80	-30.00	0.00	0.00	0.00	0.00	▁▁▁▁▇
scoring_RushingYards	0	1.00	1.15	2.62	-0.15	0.00	0.05	0.74	16.53	▇▁▁▁▁
scoring_RushingTDs	0	1.00	5.08	12.82	0.00	0.00	0.00	6.00	102.00	▇▁▁▁▁
scoring_Receptions	0	1.00	20.11	24.93	0.00	2.00	9.00	30.00	128.00	▇▂▁▁▁
scoring_ReceivingYards	0	1.00	22.00	30.40	-1.00	1.10	9.60	31.35	180.90	▇▂▁▁▁
scoring_ReceivingTDs	0	1.00	7.79	12.72	0.00	0.00	0.00	12.00	84.00	▇▁▁▁▁
scoring_Fumbles	0	1.00	-2.09	3.96	-32.00	-2.00	0.00	0.00	0.00	▁▁▁▁▇
scoring_FumblesLost	0	1.00	-0.96	1.96	-18.00	-2.00	0.00	0.00	0.00	▁▁▁▁▇

An important categorical variable to note besides Team and Player is Fantasy Position. The positions are QB, WR, TE, and RB. All defensive positions are out of scope and have been explicitly removed from the data set. The objective of this project is to create a fantasy team that has the highest likelihood of obtaining the most Fantasy Points (PPR) for the upcoming season based on prior seasons. The next logical question becomes, does this likelihood vary by team? By position? Intuitively, the likelihood will vary simply based on how many players at each position are in the game at a given time. The table below shows the distribution of positions in a given NFL season.

 FantasyPosition   n   percent
              WR 218 0.3791304
              RB 162 0.2817391
              TE 113 0.1965217
              QB  82 0.1426087

The largest share of NFL players in 2022 for fantasy purposes were wide receivers (38%), while the least common position was quarterback (14%). For the continuous variables, the important things to note are the means, standard deviations, their counts and distribution via the histograms, and any missing values. In-game stats such as Yards Per Attempt, Yards per Reception, and the additional variables (below ADP) were missing for a lot of players, as expected, before being replaced with zero. The missing values are likely due to a number of factors such as the position of the player and the number of games each played. The number of games played by each player ranges from 1 to 17, with 17 being the total number of games in the regular season. The descriptive summary for Fantasy Points (PPR) is:

   vars   n  mean    sd median trimmed   mad  min   max range skew kurtosis
X1    1 575 78.41 85.43   43.4      64 57.08 -2.9 417.4 420.3 1.38      1.4
     se
X1 3.56

With mean = 78.4 (sd = 85.43), it is obvious that there is high variability. The histogram of PPR shows how this variability is distributed.

Depending on the fantasy league, PPR scoring will differ in that there is typically a PPR threshold per game that a player must meet for their PPR to be recorded for that week. If the player is below the baseline, the PPR may be zero. Interestingly enough, the distribution of PPR in 2022 is strongly right-skewed, resembling a log-normal distribution. The observed PPR for a large share of players was close to zero. This is logical, given the extreme difficulty of being a top NFL performer. Most players are benched, aren’t playing games, or aren’t starters. Evidently, most of the continuous variables follow the same trend, with most observations near zero and relatively few players posting the highest values for each variable. There are three notable exceptions:

Average Draft Position - Uniform Distribution: For players that have this data available, ADP is uniformly distributed across most players. This implies that the parent population drafts players evenly across the board and there isn’t a strong concentration of players being picked predominantly. To reiterate, this is specifically for players that have ADP data available, which may be a combination of the most popular or the best players.
Fumbles Lost per Fumble - Bimodal distribution: The frequency of Fumbles Lost per Fumble has peaks at both zero and one, revealing that most players either fumble and lose possession of the ball, resulting in a turnover, or don’t fumble at all.
Yards Per Reception - Normal distribution: Players average roughly 9 Yards Per Reception, with a majority of players within two standard deviations of the mean. Although there is a slight right skew, this is one of the few variables that approximately follows a Gaussian distribution.

The first thing we want to understand is what position, if any, we should be more inclined to draft first, and their likelihood of obtaining the most fantasy points. I started by seeing what the public likes to draft first. Because ADP is the average pick at which a player is taken, a lower value means an earlier pick. In 2022, running backs with ADP data went earliest on average (around pick 62), followed by wide receivers (around pick 74), while tight ends (around pick 84) and quarterbacks (around pick 86) went latest. Acknowledging that ADP data is not available for all players, I wanted to better understand if quarterbacks are the biggest factor in regards to PPR. A comparison of ADP and PPR by position begins to paint the picture. On average, quarterbacks had 105 fantasy points in 2022, followed by wide receivers, running backs, then tight ends.

# A tibble: 4 × 3
  FantasyPosition AverageDraftPosition AveragePPR
  <fct>                          <dbl>      <dbl>
1 QB                              85.8      105. 
2 WR                              73.5       81.6
3 RB                              61.5       75.7
4 TE                              83.9       56.5

This seems to be consistent at a higher level when the top three teams for the 2022 regular season are compared.

However, properly accounting for the total number of games played by each player yields different results. Quarterbacks are disproportionately the most efficient in regard to fantasy points on a per-game basis at .164 per game. Running backs, tight ends, and wide receivers follow in that order but are significantly behind, and the values for those three positions differ from one another by less than 0.01. The logical question becomes: what are the underlying causes of this variation? One possibility is that there are typically more wide receivers on an offensive play than any other position. Another possibility is the extreme routes and distances wide receivers run, making them injury prone and therefore less efficient.

# A tibble: 4 × 2
  FantasyPosition PPRperGame
  <fct>                <dbl>
1 QB                  0.164 
2 RB                  0.0393
3 TE                  0.0372
4 WR                  0.0316

Analysis SVD PCA

Principal component analysis (PCA), computed here through an eigenvalue decomposition, is an unsupervised machine learning method typically used for dimensionality reduction on mostly unlabeled data. The same approach is used here, with the intent to reduce the data to components that measure performance in some aspect. The best way to think about this is in the form of a recipe. A cookbook’s recipe for chicken cordon bleu will have elaborate concoctions and mixes of different food. When applied in this context, PCA would reduce the recipe to its core components: 3/4 chicken, 1/8 cheese, 1/8 ham, let’s say. It then becomes much easier to make chicken cordon bleu while keeping most of the taste. While an oversimplification, the approach is essentially the same, with the goal of capturing as much of the underlying variation as possible using linear combinations of variables.

At its core, there are some latent underlying variables that combinations of the original variables measure. What those underlying variables measure, and how relevant they are, is on us to define. These are the principal components. The principal components are made up of the original variables, and how much each variable contributes to the underlying variable (i.e. principal component) is given by the eigenvectors, or loadings. Loadings can be positive (contributes strongly) or negative (contributes adversely). How well the variables load helps define what that new underlying variable is. As the inclusion criterion, any variable with a loading of at least .7 in absolute value is considered as loading well, and those variables alone are used to define the underlying variable/component. This empirical threshold is a very conservative approach.

Eigenvectors describe a mathematical phenomenon such that

\[ A v = \lambda v \]

where A is a square matrix, v is an eigenvector, and λ is a scalar (numerical value), the eigenvalue associated with v. In this application, matrix A is the correlation matrix of the original data. This mechanism works because only linear transformations are applied to the data, meaning the data does not inherently change. The proportions of all variables and the direction in which they move remain the same. The data gets centered at the origin after scaling, and a best-fitting line is calculated that goes through the origin and maximizes the variance in the data. Conceptually, the algorithm does this by fitting a line through the origin, projecting the points onto the line, and rotating the line until the sum of squared distances from the projected points to the origin is as large as possible. The direction of the resulting line is the eigenvector for the principal component, and that maximized sum of squared distances (the variance along the line, once divided by n - 1) is its eigenvalue.
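To make the eigenvector relation concrete, here is a minimal Python sketch that finds the leading eigenpair of a small correlation matrix by power iteration and checks that A v equals λ v. The 2 × 2 matrix with correlation 0.8 is a made-up example, and power iteration is used only for illustration; it is not necessarily how the statistical software behind this analysis computes the decomposition.

```python
import math


def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration(A, iterations=200):
    """Approximate the largest eigenvalue of A and its unit eigenvector."""
    v = [1.0] + [0.0] * (len(A) - 1)  # arbitrary starting direction
    for _ in range(iterations):
        w = mat_vec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Av gives the eigenvalue for a unit vector v
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(A, v)))
    return eigenvalue, v


# Hypothetical correlation matrix for two variables with correlation 0.8
A = [[1.0, 0.8], [0.8, 1.0]]
lam, v = power_iteration(A)
print(lam, v)
```

For a 2 × 2 correlation matrix with correlation r the eigenvalues are 1 + r and 1 - r, so the leading eigenvalue here is 1.8 and the eigenvector weights both variables equally.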

Parallel analysis was used to decide how many components to retain. It performs the same decomposition on simulated data of the same size and graphs the results. The point where the simulated and actual eigenvalue curves intersect is the cutoff for the number of components to retain. The results suggest three components. The y-axis plots the eigenvalues, each of which is the amount of variation explained by that component. In a simpler sense, an eigenvalue can be thought of as the number of original variables’ worth of variation accounted for by that component (hence the horizontal line separating values less than one). Three principal components were obtained, acknowledging that principal component one (PC1) should account for approximately six variables, PC2 around 5, and PC3 around 3. The other scree plot better highlights the components as a percentage of total variability explained. Keep in mind that PC1 only accounts for 30% of the total variability and the first three components cumulatively account for 52% of total variability. It’s likely that the post-hoc tests described in the methods section will not be sufficient for our goal since roughly half of the total variation is still not accounted for in these components.
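The link between eigenvalues and percentage of variability explained can be sketched as follows. The eigenvalues below are hypothetical (ten standardized variables, so they sum to ten) and are not the values from this analysis. The sketch shows only the eigenvalue-greater-than-one rule marked by the horizontal line, not the parallel analysis itself.

```python
def variance_explained(eigenvalues):
    """Share of total variance carried by each component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def count_above_one(eigenvalues):
    """Number of components that explain more than one variable's worth of variance."""
    return sum(1 for value in eigenvalues if value > 1.0)


# Hypothetical eigenvalues of a 10-variable correlation matrix
eigenvalues = [4.0, 3.0, 1.5, 0.6, 0.4, 0.2, 0.1, 0.1, 0.05, 0.05]

shares = variance_explained(eigenvalues)
cumulative_first_three = sum(shares[:3])
print(shares[:3], cumulative_first_three, count_above_one(eigenvalues))
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing each eigenvalue by that total gives the percentage shown on the second scree plot.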

Scores are calculated for each individual player. Depending on how the components are defined, players can be ranked in ascending order.

Principal Components - In-Game Stats

rotation: maximum variance

PC1 Definition: High-Volume Performing QBs
Attributes: Completions, Attempts, Passing Yards, Passing TDs, Interceptions, Fumbles

PC1 would be attributed to high-volume QBs, as the above categories load extremely well on it. The first inclination was to attribute PC1 to high-performing quarterbacks; however, that statement alone would be unjustifiable considering that interceptions and fumbles also load extremely well on this component. High-volume quarterbacks is a more fitting description. These quarterbacks are performing extremely well in some regard, since the component loads high for completions, passing yards, and passing touchdowns. We can reason that these quarterbacks are also able to consistently get the ball out of their hands. High-volume quarterbacks will also load high on interceptions and fumbles. The more throws and attempts made, the more likely that fumbles and interceptions will occur.

PC2 Definition: Offensive Long Range Efficiency
Attributes: Games Started, Targets, Receptions, Receiving Yards, Receiving TDs

PC2 can be attributed to overall offensive efficiency given that it loads extremely high for targets, receptions, receiving yards, and receiving touchdowns. Players typically described by this category would be wide receivers, and PC2 is measuring yardage efficiency. Efficiency is important in this context given that the component also loads high on targets; even though targets are not an inclusion criterion in fantasy scoring, they speak to the aggressiveness on the offensive side. The key distinction to make here is that this describes long range efficiency only, since rushing yards and touchdowns are not accounted for in this component. Additionally, overall long range efficiency is a justified label since this must be a combination of quarterbacks and the offensive line. Wide receivers, running backs, and tight ends will generally only score more touchdowns and have more yards with a good quarterback.

PC3 Definition: Offensive Driving Efficiency
Attributes: Rushing Attempts, Rushing Yards, Rushing TDs

PC3 would be attributed to mainly RBs and TEs that are elite drivers since they load high for rushing stats.

Ideally, we’d want to draft players that score high on all three components. Considering the nature of football, depending on a player’s primary position, they will naturally perform better in certain stats or categories than in others. There are multiple approaches to account for this. Actually, along every step I find there are ways that our paths diverge, but more on that later. One approach is to use only principal component one, and the players that score high on that component, to pick our quarterback. Principal components two and three would then be used for all other positions. I started with that approach, but here comes the other divergence - how I choose to calculate the scores. There are two options under consideration.

Include all variables in the principal component computation, with the benefit of providing a more comprehensive score but the drawback of added complexity.
Include only the variables that load high in the principal component score computation, with the benefit of exclusively calculating how good they are at being good but the drawback of missing nuanced information captured in less significant variables.
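The two options above can be sketched side by side. This is a simplified Python illustration that assumes a component score is the loading-weighted sum of standardized variables; the software used for the analysis may weight the variables differently. The three players, their stats, and the loadings are all hypothetical.

```python
import statistics


def standardize(column):
    """Convert a list of raw values to z-scores."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def component_scores(data, loadings, threshold=0.0):
    """Loading-weighted sum of z-scores, keeping variables with |loading| >= threshold."""
    kept = [name for name in loadings if abs(loadings[name]) >= threshold]
    z = {name: standardize(data[name]) for name in kept}
    n_players = len(next(iter(data.values())))
    return [sum(loadings[name] * z[name][i] for name in kept) for i in range(n_players)]


# Hypothetical stats for three players and hypothetical loadings on one component
data = {
    "PassingYards": [4000, 3000, 2000],
    "PassingTDs": [30, 20, 10],
    "RushingYards": [100, 300, 200],
}
loadings = {"PassingYards": 0.9, "PassingTDs": 0.85, "RushingYards": 0.1}

all_variables = component_scores(data, loadings)              # option 1
high_loading_only = component_scores(data, loadings, 0.7)     # option 2
print(all_variables, high_loading_only)
```

With the .7 threshold, the weakly loading rushing yards drop out of the score, which is the trade-off between the two options.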

I tested the model starting with the second approach. First, I reviewed how the top ten quarterbacks performed in 2022 by looking at their overall rank, position rank, name, and total fantasy points.

# A tibble: 10 × 4
    Rank PositionRank Player          FantasyPointsPPR
   <dbl>        <dbl> <chr>                      <dbl>
 1     1            1 Patrick Mahomes             417.
 2     2            2 Josh Allen                  396.
 3     3            3 Jalen Hurts                 378 
 4     7            4 Joe Burrow                  351.
 5    13            5 Geno Smith                  304.
 6    17            6 Justin Fields               296 
 7    18            7 Trevor Lawrence             296.
 8    19            8 Kirk Cousins                292.
 9    20            9 Daniel Jones                289 
10    21           10 Jared Goff                  284.

Then we use the eigenvectors (loadings) to calculate the principal component scores for each player, only including variables that loaded high. The players with the highest scores would be the highest-performing QBs predicted for the 2022 season. Two things to note. Firstly, the position and overall rank will be the same here since we’ve defined our first principal component as attributes of quarterbacks only. Secondly, the model’s fantasy points PPR would not be known. The purpose here is to calculate scores and draft in ascending order. We can, however, calculate the difference in fantasy points had we taken the model’s picks. In the table, the model’s fantasy points are the same as actual fantasy points to make this calculation easier. The results are shown below.

Methods of measuring model performance:

Percentage of total players the model accurately selects. If we were selecting quarter backs, we would use the first principal component scores to obtain the top 10 quarter backs in ascending order. We would then compare the results against the actual top 10 quarter backs for that season. In these results, 70% (7 out of 10 QBs) were accurately selected as being in the top 10 for total fantasy points.
Difference in total fantasy points of the top 20 players. The top 20 players for each position are isolated using their associated principal component scores. The sum of the total fantasy points for the model’s top 20 players is then subtracted from the total fantasy points that the actual top 20 players in each position had to obtain the delta.
Absolute difference in position rank by player. Each player will have the net difference in their position rank between the model and their actual rank for that season. In the above table, for example, the model selects Justin Herbert as the #2 QB for fantasy, but he was actually #11 after the regular season, making the net -9.

Now using the model’s picks for the top ten quarterbacks, 70% of those selected in the top 10 were actually in the top ten during the 2022 regular season. The total fantasy points for the quarterbacks picked by the model were 3131.2. The total fantasy points the top ten quarterbacks actually had was 3302, which means the model was off by about 5% in regard to quarterback selection. The differences in rank by position are shown below. Players with a negative delta are those that were ranked higher in the model but came in lower after the season. As conservative thresholds, MAE > 5, Top10 < 70%, or Fantasy Error > 5% will be treated as a sign of poor performance. If a result crosses its threshold, different measures should be taken to improve the model.

FantasyPosition	MAE	Top10	FantasyError
QB	4	70%	5.1%
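The three performance measures can be sketched in Python as below. The fantasy point totals (3302 actual, 3131.2 for the model's picks) come from the text; the player labels and rank lists are hypothetical placeholders, and the function names are my own rather than anything from the original analysis.

```python
def top_n_hit_rate(model_picks, actual_top):
    """Share of the actual top-N players that the model also selected."""
    return len(set(model_picks) & set(actual_top)) / len(actual_top)


def fantasy_error(model_total, actual_total):
    """Relative shortfall in fantasy points from drafting the model's picks."""
    return (actual_total - model_total) / actual_total


def mean_absolute_rank_error(model_ranks, actual_ranks):
    """Average absolute gap between model and actual position ranks."""
    gaps = [abs(m - a) for m, a in zip(model_ranks, actual_ranks)]
    return sum(gaps) / len(gaps)


# Totals reported in the text for the top ten quarterbacks
qb_error = fantasy_error(model_total=3131.2, actual_total=3302.0)

# Hypothetical picks and ranks, for illustration only
hit_rate = top_n_hit_rate(["A", "B", "C", "D"], ["A", "B", "C", "E"])
mae = mean_absolute_rank_error([1, 2, 3, 4], [1, 2, 4, 8])

print(qb_error, hit_rate, mae)  # qb_error is roughly 0.05, i.e. about 5%
```

Keeping the three measures as separate functions makes it easy to apply the same checks to wide receivers and running backs later.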

Principal Components - Fantasy Weights

While the results are seemingly great, this was when I recalled that these principal components only account for about half of the total variation. Considering that the goal of principal component analysis via singular value decomposition is to capture as much of the total variation in the data as possible, I had to think we can do much better. The first principal component, which was attributed to quarterbacks, loads high for interceptions and fumbles. While it is justifiable to reason that high-volume quarterbacks will naturally throw more interceptions and fumble more often solely as a function of volume, would the best quarterbacks really load high for those? To better understand this, the exact same process above was done to calculate new component scores, this time only including variables that are used for fantasy scoring. I first created these additional variables by multiplying each in-game stat by its value in my league’s point system. For example, rushing touchdowns were multiplied by six and became the new variable used for the SVD. Those variables were passing yards, passing touchdowns, interceptions, rushing yards, rushing touchdowns, receptions, receiving yards, receiving touchdowns, fumbles, and fumbles lost.
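A minimal Python sketch of that weighting step is shown here, using the point values from the league scoring table earlier in the post. The function name and the example stat line are hypothetical, and this is not the original analysis code.

```python
# Point values taken from the league scoring table (zero-weighted stats omitted)
LEAGUE_WEIGHTS = {
    "PassingYards": 0.04,
    "PassingTDs": 4.0,
    "Interceptions": -2.0,
    "RushingYards": 0.10,
    "RushingTDs": 6.0,
    "Receptions": 1.0,
    "ReceivingYards": 0.10,
    "ReceivingTDs": 6.0,
    "Fumbles": -2.0,
    "FumblesLost": -2.0,
}


def add_scoring_columns(stats):
    """Return scoring_ variables: each raw stat multiplied by its league weight."""
    return {
        f"scoring_{name}": stats.get(name, 0) * weight
        for name, weight in LEAGUE_WEIGHTS.items()
    }


# Hypothetical quarterback season, for illustration only
stat_line = {
    "PassingYards": 4000, "PassingTDs": 30, "Interceptions": 10,
    "RushingYards": 250, "RushingTDs": 2, "Fumbles": 4, "FumblesLost": 2,
}
scored = add_scoring_columns(stat_line)
print(scored)
```

Stats missing from a player's line default to zero, which matches how a quarterback with no receptions would contribute nothing to the receiving variables.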

Results

The parallel analysis again suggested that three components were sufficient to capture most of the variation in the data.

The first three components alone account for 88.1% of the total variation within the data set, much better than the 52% obtained previously. The mean item complexity = 1.1. This means that, on average, each individual variable included in the principal components loads significantly on only one component. This is the more ideal scenario since it makes defining the components much easier. Previously we had a mean item complexity of 1.5, meaning that a typical variable loaded significantly on somewhere between one and two components. The first three components are then defined using a loading threshold of .7.

Dimension Definitions

rotation: maximum variance

PC1 Definition: Low Performing QBs
Attributes: (-) Passing Yards, (-) Passing Touchdowns, (+) Fumbles, (+) Interceptions

PC1 would be attributed to quite literally the lowest-performing quarterbacks. The significantly negative loadings for passing yards and touchdowns mean that quarterbacks who score high on this component are unable to score touchdowns. Additionally, the component loads extremely high for fumbles and interceptions, a confirmation of their underperformance in relevant categories. PC1 would only be attributed to quarterbacks since these in-game stats are generally relevant to them alone.

PC2 Definition: High-Performing Distance Efficiency
Attributes: (+) Receiving Touchdowns, (+) Receiving Yards, (+) Receptions

PC2 can be attributed to overall offensive efficiency given that it loads extremely high for receiving touchdowns, receiving yards, and receptions. Players that score high on this component are likely wide receivers, since wide receivers are used more for long range plays. The key distinction to make here is that this describes long range efficiency only, since rushing yards and rushing touchdowns are not accounted for in this component. Additionally, overall long range efficiency better describes this component, since this must be a combination of quarterbacks and the offensive line. Wide receivers, running backs, and tight ends will generally only score more touchdowns and have more yards with a better quarterback.

PC3 Definition: High-Performing Driving Efficiency
Attributes: (+) Rushing Touchdowns, (+) Rushing Yards

PC3 would be attributed to overall driving efficiency in the same fashion. Likewise, this is also a combination of the offensive line and the quarterback, considering that high-performing drivers will still be unable to score touchdowns if their quarterback cannot perform. I would expect running backs and tight ends to score high on this component.

The goal is to isolate players in these areas.

The coordinate plane shows the first component on the x axis and the second on the y axis. The scoring_ labels and associated arrows are the eigenvectors in this principal component space. An increase on the x axis, or the first principal component, means an increase in underperformance. We would want to obtain players that negatively contribute to this component, i.e. quadrant three. In the same fashion, an increase on the y axis means an increase in long range efficiency. This plot can only show the first two components. Based on the above, we’d use the first component for quarterbacks, the second for wide receivers, and the third for tight ends and running backs. However, since WRs, TEs, and RBs are much more similar to one another in position than they are to QBs (which the model concurs with via the boxed region in quadrant I), those three positions were included and ranked for both PC2 and PC3. This allows us to more effectively see the primary position and players the model decides to pick for each category.

In summary, principal component one was used for quarterbacks, and principal components two and three were used for all other positions at first. An overall score was then calculated using PC2 and PC3 only. Players with the highest overall scores would score well on both PC2 and PC3. These players would be the best of the best in both long-range and driving efficiency. This allows us to see what positions the model picks for long-range (PC2) and driving (PC3) efficiency. Note that this tells us which positions tend to score well on both components, not whether any individual player is the best at both.

# A tibble: 3 × 2
  FantasyPosition       n
  <fct>             <dbl>
1 RB               0.455 
2 WR               0.0738
3 TE              -0.287

We see that running backs and wide receivers on average have positive overall scores, meaning that they contribute positively to the second and third components (long-range and driving efficiency). Surprisingly, tight ends on average are negative. The model rarely selects tight ends across all three of these dimensions. The results from the model show that the best overall QBs, identified via the lowest component scores, should be selected for PC1, the best wide receivers for PC2, and the best RBs for PC3. Only a select few tight ends are chosen for PC2 and PC3. Of the top 30 highest-performing players in long-range efficiency (PC2), 3 were tight ends and only 1 of them was in the top 10 - Travis Kelce. Of the top 30 highest-performing players in driving efficiency (PC3), only one tight end, Taysom Hill, made the cut. Since fantasy league members must select players for every position, the best approach would be to select QBs, WRs, and RBs only for PC1, PC2, and PC3 respectively and then address tight ends afterwards. That same process was followed to compare the model’s picks against their actuals for 2022. The results for the three positions are shown below.

Model Picks

PC1 - QBs

Actual_OvrRank	Actual_PositionRank	Player	Actual_FantasyPointsPPR	PositionRank	Model_PositionRank	Model_Player	Model_FantasyPointsPPR	Model_PCscore
1	1	Patrick Mahomes	417.4	1	1	Patrick Mahomes	417.4	-385.8120
2	2	Josh Allen	395.5	2	2	Josh Allen	395.5	-350.7159
3	3	Jalen Hurts	378.0	4	3	Joe Burrow	350.7	-339.5820
7	4	Joe Burrow	350.7	8	4	Kirk Cousins	291.6	-325.2073
13	5	Geno Smith	303.9	5	5	Geno Smith	303.9	-316.3577
17	6	Justin Fields	296.0	11	6	Justin Herbert	281.3	-311.7454
18	7	Trevor Lawrence	295.6	10	7	Jared Goff	284.3	-309.3991
19	8	Kirk Cousins	291.6	12	8	Tom Brady	271.7	-304.4806
20	9	Daniel Jones	289.0	7	9	Trevor Lawrence	295.6	-299.4071
21	10	Jared Goff	284.3	13	10	Aaron Rodgers	239.2	-280.7368

The results are nearly identical to the previous method (on the original variables). The model accurately picks seven of the actual top ten quarterbacks.

PC2 - WR

Actual_OvrRank	Actual_PositionRank	Player	Actual_FantasyPointsPPR	PositionRank	Model_PositionRank	Model_Player	Model_FantasyPointsPPR	Model_PCscore
5	1	Justin Jefferson	368.7	1	1	Justin Jefferson	368.7	343.4864
8	2	Tyreek Hill	347.2	3	2	Davante Adams	335.5	321.2936
9	3	Davante Adams	335.5	2	3	Tyreek Hill	347.2	319.6640
11	4	Stefon Diggs	316.6	4	4	Stefon Diggs	316.6	303.8524
15	5	CeeDee Lamb	301.6	6	5	A.J. Brown	299.6	291.2316
16	6	A.J. Brown	299.6	5	6	CeeDee Lamb	301.6	284.9864
26	7	Amon-Ra St. Brown	267.6	8	7	Jaylen Waddle	259.2	248.4996
27	8	Jaylen Waddle	259.2	7	8	Amon-Ra St. Brown	267.6	248.0896
28	9	DeVonta Smith	254.6	9	9	DeVonta Smith	254.6	246.5056
32	10	Amari Cooper	246.0	10	10	Amari Cooper	246.0	237.7820

PC3 - RB

Actual_OvrRank	Actual_PositionRank	Player	Actual_FantasyPointsPPR	PositionRank	Model_PositionRank	Model_Player	Model_FantasyPointsPPR	Model_PCscore
4	1	Austin Ekeler	372.7	13	1	Jamaal Williams	225.9	106.06931
6	2	Christian McCaffrey	356.4	4	2	Derrick Henry	302.8	88.05878
10	3	Josh Jacobs	328.3	3	3	Josh Jacobs	328.3	83.52732
14	4	Derrick Henry	302.8	6	4	Nick Chubb	281.4	82.29713
22	5	Saquon Barkley	284.0	1	5	Austin Ekeler	372.7	82.07121
23	6	Nick Chubb	281.4	22	6	Ezekiel Elliott	185.8	76.05968
29	7	Rhamondre Stevenson	249.1	15	7	Miles Sanders	216.7	74.20003
30	8	Tony Pollard	248.8	5	8	Saquon Barkley	284.0	68.97659
31	9	Aaron Jones	248.6	18	9	Kenneth Walker III	202.5	60.82183
35	10	Joe Mixon	239.5	8	10	Tony Pollard	248.8	60.40856

Model Accuracy

FantasyPosition	MAE	Top10	FantasyError
QB	2.2	70%	5.1%
RB	5.8	60%	4.5%
WR	0.6	100%	2.3%
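As a rough check, the MAE and Top10 columns can be reproduced from the PositionRank column of the three pick tables above, which holds the actual 2022 position rank of each model pick. Here the mean absolute error is the average absolute gap between the model's position rank (1 through 10) and that actual rank, and Top10 is the share of model picks whose actual rank was 10 or better. The sketch is in Python with illustrative function names, and it does not attempt to reproduce the FantasyError column, whose exact formula is not stated here.

```python
# Actual 2022 position rank of each model pick, listed in model order 1..10.
# Values are copied from the PositionRank column of the pick tables above.
actual_rank_of_model_pick = {
    "QB": [1, 2, 4, 8, 5, 11, 10, 12, 7, 13],
    "WR": [1, 3, 2, 4, 6, 5, 8, 7, 9, 10],
    "RB": [13, 4, 3, 6, 1, 22, 15, 5, 18, 8],
}


def rank_mae(actual_ranks):
    """Average absolute gap between model rank (1..n) and actual rank."""
    model_ranks = range(1, len(actual_ranks) + 1)
    gaps = [abs(actual - model) for actual, model in zip(actual_ranks, model_ranks)]
    return sum(gaps) / len(gaps)


def top_n_hit_rate(actual_ranks, n=10):
    """Share of model picks whose actual position rank was n or better."""
    hits = sum(1 for actual in actual_ranks if actual <= n)
    return hits / len(actual_ranks)


accuracy = {
    position: (rank_mae(ranks), top_n_hit_rate(ranks))
    for position, ranks in actual_rank_of_model_pick.items()
}

# Simple average of the three position-level MAE values.
overall_mae = sum(mae for mae, _ in accuracy.values()) / len(accuracy)

for position, (mae, hit_rate) in accuracy.items():
    print(f"{position}: MAE={mae:.1f}, Top10={hit_rate:.0%}")
print(f"Average MAE across positions: {overall_mae:.2f}")
```

Averaging the three position-level MAE values is what gives the "off by about 3" figure discussed in this section.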

The table shows model accuracy against 2022 actual data. The mean absolute error (MAE) measures the average difference between the model’s ranking and the actual ranking by total fantasy points. On average, across the three positions, the model is off by about 3 ranking positions - very good for fantasy purposes. Even those who know the sport extremely well will draft players based on emotion, sentiment, or loyalty in some fashion, and having these insights and a potential edge (if there is evidence that the scores generalize well to future seasons) will definitely be an advantage. There are still some areas for improvement, such as the model only accurately selecting 7 of the top 10 players in fantasy points for quarterbacks, and 6 of 10 for running backs. Normally, the process steps would be reviewed for different ways to improve the model, but in the interest of time that will not be done here.

Now that I’m in a position where I like the results, what does the next step look like, and how can we apply this to the 2024 draft given that draft day is in 24 hours? This is the hardest part. PCA via SVD is commonly done in the post-exploratory phase as part of an ensemble of methods in which predictive models/machine learning methods are then layered on top of it, depending on the goal of the research. With my goal of creating the fantasy team with the highest likelihood (albeit unknown) of obtaining the most fantasy points, I outlined different approaches one could take given where I am in the process and knowing that I only have 24 hours left.

1. Using 2022 PCA Scores to Rank Players and Compare to 2023 Actual

Process

PCA on 2022 Data: Perform PCA on 2022 in-game stats. Obtain principal component (PC) scores for each player, focusing on the first few principal components (e.g., PC1, PC2). Rank players based on their PC scores (e.g., higher scores on PC1 may indicate better performance).

Ranking and Testing: Compare the PCA-derived ranks from 2022 to the actual fantasy points scored by each player in 2023. Evaluate the accuracy of these ranks by calculating metrics like precision or correlation between PCA ranks and actual points.

Generalization: If 2022 PCA ranks generalize well to 2023 performance, use the same method on 2023 data. Apply this PCA-based ranking approach to the 2024 player pool to predict future performance.
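One way to put a number on the correlation between PCA ranks and actual points is a Spearman rank correlation. The sketch below, in Python with illustrative function names, pairs the ten 2022 PC2 scores from the wide receiver table earlier with the same players' 2023 FantasyPointsPPR from the 2023 wide receiver table later in this article. Ten players who were already the model's top picks is far too small and too selective a sample to judge generalization, so treat this purely as a demonstration of the calculation.

```python
from statistics import mean

# (player, 2022 PC2 score, 2023 FantasyPointsPPR), copied from the WR tables.
wide_receivers = [
    ("Justin Jefferson", 343.4864, 202.2),
    ("Davante Adams", 321.2936, 265.4),
    ("Tyreek Hill", 319.6640, 376.4),
    ("Stefon Diggs", 303.8524, 273.8),
    ("A.J. Brown", 291.2316, 289.6),
    ("CeeDee Lamb", 284.9864, 403.2),
    ("Jaylen Waddle", 248.4996, 198.6),
    ("Amon-Ra St. Brown", 248.0896, 330.9),
    ("DeVonta Smith", 246.5056, 227.6),
    ("Amari Cooper", 237.7820, 227.0),
]


def average_ranks(values):
    """Rank from largest (rank 1) to smallest, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


pc2_scores_2022 = [score for _, score, _ in wide_receivers]
points_2023 = [points for _, _, points in wide_receivers]
rho = spearman(pc2_scores_2022, points_2023)
print(f"Spearman correlation: {rho:.3f}")
```

A value near 1 would mean the 2022 ordering carried over to 2023 almost unchanged, while a value near 0 would mean it carried over hardly at all.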

2. Combining 2022 and 2023 Data for PCA and Regression

Process

Data Preparation: Combine the 2022 and 2023 data sets, including in-game stats and calculated fantasy points. Standardize the data to ensure comparability.

PCA on Combined Data: Perform PCA on the combined data set to capture the underlying structure across both years. Extract PC scores for each player, focusing on the first few components (e.g., PC1, PC2, PC3).

Regression Analysis: Use the PC scores as features in a regression model along with other in-game stats (e.g., games started, targets). Train the model on 2022 and 2023 data to predict total fantasy points.

Projection for 2024: Input 2024 projection data into the trained regression model to predict total fantasy points for each player in 2024. Rank players based on these predictions.

3. Using 2024 Projection Data to Calculate PCA Scores

Process

PCA on 2024 Projection Data: Perform PCA on the 2024 projected in-game stats. Obtain PC scores for each player based on the projections.

Ranking Based on Projections: Rank players based on their PC scores from the 2024 projection data. Use these rankings to guide draft decisions.

Assumptions: This approach assumes that the 2024 projections are accurate enough to reflect actual performance, so PC scores from projections should correlate with final fantasy points.

4. Weighted PCA Combining Historical and Projected Data

Process

Weighting and Combining Data: Assign weights to the 2022, 2023, and 2024 (projected) data. For example, use a higher weight for more recent years like 2023. Combine the data sets into a single matrix, with projections and historical data weighted accordingly.

PCA on Weighted Data: Perform PCA on the weighted data set to capture the combined effect of historical and projected performance. Obtain PC scores for each player, reflecting a blend of past performance and future projections.

Ranking and Drafting: Rank players based on the weighted PC scores. Use these ranks to inform drafting decisions.

2023 | 2022 NFL seasons

The weighted PCA approach on both the ’22 and ’23 NFL regular seasons was used to select my fantasy team. After matching corresponding players in 2023, both years were joined such that statistics from both years were available for each observation per player. A weighted average of the scoring_ variables was then calculated, consisting of a third of their statistics from ’22 and two-thirds from ’23, emphasizing the more recent season at a 2:1 ratio. This weight only applies to the athletes that played at least one game in both years. A total of 39 athletes that had season-ending injuries in ’22 but played in ’23 were omitted from the data. No weight was applied to the 142 rookies in 2023: 100% of their ’23 NFL season was incorporated into their component scores, given that there is no other history for a baseline. Using combine data was considered, but was not used in the interest of time. Exploratory analysis showed that both the mean and spread of the underlying distributions for all in-game statistics were extremely similar between ’23 and ’24. For ’23 fantasy points, the mean was 83.1 (sd = 87.3).
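The 2:1 weighting described above can be sketched as follows. This is a minimal Python illustration rather than the code used for the analysis: the function name, the stat names, and the numbers are hypothetical, and it assumes each player's season is a mapping from scoring_ variable to value, with rookies having no 2022 entry.

```python
def blend_seasons(stats_2022, stats_2023, weight_2022=1 / 3, weight_2023=2 / 3):
    """Blend two seasons of scoring_ variables for one player.

    stats_2022 is None for rookies, in which case the 2023 season is used
    as-is (100% weight on 2023). Players who were omitted from the data are
    assumed to have been filtered out before this function is called.
    """
    if stats_2022 is None:
        return dict(stats_2023)
    return {
        name: weight_2022 * stats_2022[name] + weight_2023 * stats_2023[name]
        for name in stats_2023
    }


# Hypothetical veteran: one third of 2022 plus two thirds of 2023.
veteran = blend_seasons(
    {"scoring_ReceivingYards": 90.0, "scoring_Receptions": 60.0},
    {"scoring_ReceivingYards": 120.0, "scoring_Receptions": 90.0},
)

# Hypothetical rookie: no 2022 history, so 2023 carries all of the weight.
rookie = blend_seasons(None, {"scoring_ReceivingYards": 75.0, "scoring_Receptions": 40.0})

print(veteran)
print(rookie)
```

Keeping the weights as parameters makes it easy to try a different emphasis on the most recent season without touching the rest of the pipeline.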

Results

Parallel and scree tests were performed on the weighted averages of the variables used for FantasyPointsPPR. The parallel test showed that three components were sufficient, but as many as four could be used according to the scree test. The eigenvalues of the first three components cumulatively explain 90.1% of the total variation within the data.
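For readers less familiar with how a cumulative figure like this is derived: each component's share of the variation is its eigenvalue divided by the sum of all eigenvalues, and the shares are accumulated from the largest eigenvalue down. The Python sketch below uses made-up eigenvalues, not the ones from this analysis, and the function names are illustrative.

```python
def cumulative_variance_explained(eigenvalues):
    """Cumulative share of total variance, from the largest eigenvalue down."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    shares = []
    running = 0.0
    for value in ordered:
        running += value
        shares.append(running / total)
    return shares


def components_needed(eigenvalues, target):
    """Smallest number of components whose cumulative share reaches target."""
    shares = cumulative_variance_explained(eigenvalues)
    for count, share in enumerate(shares, start=1):
        if share >= target:
            return count
    return len(shares)


# Hypothetical eigenvalues for illustration only.
hypothetical_eigenvalues = [4.2, 2.9, 1.9, 0.5, 0.3, 0.2]

shares = cumulative_variance_explained(hypothetical_eigenvalues)
for number, share in enumerate(shares, start=1):
    print(f"PC1-PC{number}: {share:.1%}")
print("Components needed for 85%:", components_needed(hypothetical_eigenvalues, 0.85))
```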

The standardized loadings were then plotted on a pattern correlation matrix to define the components. The loading scores for each component were extremely similar in value to the ones obtained in the Fantasy Weights section, so the components were defined using the same definitions.

rotation: maximum variance

PC1 Definition: Low Performing QBs
Attributes: (-) Passing Yards, (-) Passing Touchdowns, (+) Fumbles, (+) Interceptions

PC1 would be attributed to the worst overall performing quarterbacks. The significantly negative loadings for passing yards and touchdowns mean that quarterbacks that load high on this component are unable to score touchdowns. Additionally, they load extremely high for fumbles and interceptions, a confirmation of their underperformance in relevant categories. PC1 would only be attributed to quarterbacks since these in-game stats are generally relevant to them alone. When calculating scores for PC1, the highest-performing or best overall quarterbacks would have the lowest scores, since they contribute negatively to this component.

PC2 Definition: High-Performing Distance Efficiency
Attributes: + Receiving Touchdowns, + Receiving Yards, + Receptions

PC2 can be attributed to overall offensive efficiency, given the extremely high loadings for receiving touchdowns, receiving yards, and receptions. Players that load high on this component are likely wide receivers, since wide receivers are used more for long-range plays. The key distinction to make here is that this describes long-range efficiency only, since rushing yards and rushing touchdowns are not accounted for in this component. Additionally, ‘overall’ long-range efficiency better describes this component, since it must reflect a combination of the quarterback and the offensive line. Wide receivers, running backs, and tight ends will generally only score more touchdowns and gain more yards with a better quarterback.

PC3 Definition: High-Performing Driving Efficiency
Attributes: + Rushing Touchdowns, + Rushing Yards

PC3 would be attributed to overall driving efficiency in the same fashion. Likewise, this is a combination of the offensive line and the quarterback, considering that high-performing drivers will still be unable to score touchdowns if their quarterback cannot perform. I would expect running backs and tight ends to load high on this component.

The annotations show which regions in each component would isolate the best (or worst) quarterbacks for PC1 and the best long-range players for PC2. The benefit of plotting components against each other is to understand the variability between two dimensions. Comparing PC1 and PC2 in this manner may not be necessary considering how we’ve defined the components, but visualizing it this way highlights any outliers and the spread of the data. The ellipses are the regions that represent the 95% confidence interval for each fantasy position. Looking at PC1 and its associated eigenvectors, it is evident that the best quarterbacks are those with high values along the x axis (note that quarterbacks are the only group spread differently than the others, evidence that this component captures attributes specific to them). Although the individual component scores for these quarterbacks will be negative (as they contribute negatively to this component), these are orthogonal projections, so the fact that they appear ‘positive’ on the x axis here is meaningless. The interpretation would be identical if we flipped the plot over the y axis. The three quarterbacks we definitely want to obtain are the three outliers closest to the top of the annotated box. These quarterbacks would be something akin to the ‘best of the best’.

The other important part of this plot relates to PC2. The underlying variables that contribute the most to this component are receptions, receiving yards, and receiving touchdowns. This is the ‘long-range’ category (rushing yards and TDs are not accounted for here). In the previous biplot, wide receivers completely dominated this region when it came to the individual component scores of athletes. While that is somewhat the case here, notice that running backs are not far behind. The athletes with the highest y values are still the wide receivers, but there is a good region of running backs, [2,2.5] on the x axis, that contribute just as well as wide receivers.

Understanding the variability between two dimensions would be most beneficial for PC2 and PC3, since it would easily highlight players that are the best at both. The dynamic nature of football and the physical build of players make it extremely difficult to be both an efficient driving scorer and an efficient passing scorer (receiving). Running backs tend to do well in the driving aspect; they have stockier frames, making it easier to drive through defenses and gain yardage that way. Wide receivers do well on the receiving end - they have slimmer frames, leading to elusive plays that gain the most yardage (Hail Marys as an extreme example). Players that contribute equally well to PC2 and PC3 are wraps for fantasy (provided they have a good quarterback).

The same trend identified in the component definitions is seen here. The region with the highest x values is made up of wide receivers (the tip of their 95% CI extends out the most), with a handful of outliers near the max. The region with the highest y values, representing PC3, is nearly reserved for running backs (with a couple of exceptions). Notice that almost no players other than running backs contribute positively to PC3: I can count by hand the six players that contribute positively to PC3 and are not running backs. There are two parts of this graph to pay attention to. Firstly, the red box indicates the top of the 95% confidence interval for running backs, i.e., the best of the best for PC3. Players in this region should be selected in the fantasy draft. More importantly, there is one player identified who contributes extremely well to both PC2 and PC3. The vertical line separates the top six NFL athletes for long-range efficiency, who are wide receivers (using a weighted average of the ’22 and ’23 seasons), but there is only one running back out of the entire NFL offense who is in this category and is also one of the best in driving efficiency. Take a guess who that is.

Model Picks

There are two different ways of calculating overall scores (i.e., principal component scores) for each player to identify who should be drafted this season. The first is to use only the variables whose loadings meet the empirical threshold (+/-.75). Usually, the empirical threshold is only used to define what the principal components are or represent. The second is to use all loadings. Generally, all loadings should be used when calculating scores, the benefit being that the combined variance of all the variables is captured and the overall score is a more accurate reflection. The one drawback with this approach is that it complicates the model. The benefit of using only the variables that meet the empirical threshold is that it simplifies the model and makes it easier to interpret. Note that when interpreting the visualizations in this analysis, specifically the biplots, all loadings need to be included in the calculation. I will show the picks using both methods.
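The difference between the two ways of scoring can be sketched in a few lines of Python. This assumes a component score is the sum of each scoring_ variable multiplied by its loading, which is my reading of the description above rather than a confirmed detail. The loadings, stat names, and values below are hypothetical.

```python
def component_score(stats, loadings, threshold=None):
    """Sum of loading * value over a player's scoring_ variables.

    With threshold=None every loading is used. With a threshold, variables
    whose absolute loading falls below it are dropped from the sum.
    """
    total = 0.0
    for name, value in stats.items():
        loading = loadings.get(name, 0.0)
        if threshold is not None and abs(loading) < threshold:
            continue
        total += loading * value
    return total


# Hypothetical loadings for a receiving-style component.
hypothetical_loadings = {
    "scoring_ReceivingYards": 0.9,
    "scoring_Receptions": 0.8,
    "scoring_RushingYards": 0.1,
}

# Hypothetical season totals, already converted to fantasy points.
hypothetical_player = {
    "scoring_ReceivingYards": 100.0,
    "scoring_Receptions": 50.0,
    "scoring_RushingYards": 20.0,
}

thresholded = component_score(hypothetical_player, hypothetical_loadings, threshold=0.75)
all_loadings = component_score(hypothetical_player, hypothetical_loadings)
print(f"Empirical threshold: {thresholded:.1f}")
print(f"All loadings: {all_loadings:.1f}")
```

In this toy case the two scores differ only by the small rushing contribution, which is the trade-off described above: the thresholded score is simpler to interpret, while the full score keeps every variable's contribution.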

Approach 1 - Empirical Threshold

QBs

ModelRank	Rank2023	PositionRank2023	Team2023	Player2023	FantasyPointsPPR2023	pc1	pc2	pc3	ovr
1	3	1	BUF	Josh Allen	392.6	-336.8850	0.0000000	75.2816532	412.1666
2	50	8	KAN	Patrick Mahomes	280.2	-331.6486	0.5168941	11.1670530	343.3325
3	40	7	DET	Jared Goff	289.1	-323.6598	0.4842071	7.8787255	332.0227
4	66	11	MIA	Tua Tagovailoa	270.4	-316.1377	0.0000000	0.7021111	316.8398
5	8	3	DAL	Dak Prescott	342.8	-302.7733	0.0000000	11.5304138	314.3037
6	76	12	JAX	Trevor Lawrence	262.5	-296.4182	0.0000000	27.5229754	323.9412
7	5	2	PHI	Jalen Hurts	356.8	-277.1566	0.0000000	87.0594829	364.2161
8	58	9	HOU	C.J. Stroud	276.0	-275.1716	0.9623171	18.5073453	294.6413
9	80	16	NOR	Derek Carr	241.1	-270.6667	0.0000000	0.5861661	271.2528
10	83	19	SEA	Geno Smith	227.3	-270.3990	0.5107969	7.8084478	278.7183
11	60	10	TAM	Baker Mayfield	274.1	-251.3108	0.0000000	6.9678469	258.2787
12	81	17	LAC	Justin Herbert	234.2	-250.4118	1.6099590	13.2045966	265.2264
13	78	13	DEN	Russell Wilson	256.9	-248.3591	0.3534593	19.9824227	268.6950
14	31	6	SFO	Brock Purdy	295.6	-247.5300	0.0000000	10.3548609	257.8848
15	10	4	BAL	Lamar Jackson	331.2	-241.6355	0.0000000	32.1511112	273.7866
16	89	24	MIN	Kirk Cousins	149.7	-233.3930	0.0000000	4.2276148	237.6206
17	79	15	LAR	Matthew Stafford	243.1	-225.1303	0.0000000	2.3247628	227.4550
18	90	25	CIN	Joe Burrow	147.2	-222.2560	0.0000000	10.7799923	233.0360
19	18	5	GNB	Jordan Love	319.1	-218.7511	0.0000000	16.6044933	235.3556
20	82	18	CHI	Justin Fields	230.2	-200.9537	0.0000000	37.9466259	238.9003

WRs

ModelRank	Rank2023	PositionRank2023	Team2023	Player2023	FantasyPointsPPR2023	pc1	pc2	pc3	ovr
1	4	2	MIA	Tyreek Hill	376.4	-2.8242470	352.7382	2.0767694	357.6392
2	2	1	DAL	CeeDee Lamb	403.2	-5.6484940	344.4119	8.3875950	358.4480
3	7	3	DET	Amon-Ra St. Brown	330.9	-2.2389478	298.3283	0.4605591	301.0279
4	20	8	PHI	A.J. Brown	289.6	-6.7168433	289.3462	0.0000000	296.0631
5	14	5	LAR	Puka Nacua	298.5	-1.7558977	282.0636	0.8599250	284.6794
6	34	14	LVR	Davante Adams	265.4	-0.5852992	281.3340	-0.0032207	281.9161
7	29	12	BUF	Stefon Diggs	273.8	-3.9948454	280.2885	0.0225449	284.3059
8	9	4	TAM	Mike Evans	282.5	0.0000000	256.9248	0.0000000	256.9248
9	33	13	CIN	Ja’Marr Chase	262.7	-3.2292872	250.8549	-0.0128828	254.0713
10	70	26	MIN	Justin Jefferson	202.2	-2.6767197	249.3281	1.8770865	253.8819
11	15	6	CHI	D.J. Moore	286.5	-2.2389478	245.1010	4.0601387	251.4000
12	25	10	SFO	Brandon Aiyuk	249.2	-3.3584216	237.1919	0.0740759	240.6244
13	54	22	PHI	DeVonta Smith	227.6	-3.3584216	232.4062	0.0000000	235.7646
14	27	11	LAC	Keenan Allen	278.9	-8.3019302	231.6001	0.0644139	239.9664
15	62	25	IND	Michael Pittman Jr.	250.2	-5.1143193	230.7656	0.0966208	235.9765
16	42	16	CLE	Amari Cooper	227.0	-4.0363499	228.1088	0.0000000	232.1452
17	39	15	SEA	D.K. Metcalf	225.4	-2.2389478	221.4407	0.0000000	223.6796
18	47	19	MIN	Jordan Addison	221.3	0.0000000	215.5321	0.0193242	215.5515
19	57	24	NOR	Chris Olave	231.3	-2.2389478	213.8417	0.0000000	216.0807
20	103	32	MIA	Jaylen Waddle	198.6	-1.1194739	212.2413	0.1610346	213.5218

RBs

Christian McCaffrey is favored as the first pick outside of drafting quarterbacks. Although he is the #3 pick for running backs, ideal picks for offensive players (although rare) would load high on principal components two and three. His combined component score for those dimensions is the highest of any offensive player included in the data.

ModelRank	Rank2023	PositionRank2023	Team2023	Player2023	FantasyPointsPPR2023	pc1	pc2	pc3	ovr
1	6	2	MIA	Raheem Mostert	267.7	-6.8702170	59.71549	82.59469	149.18040
2	16	5	TEN	Derrick Henry	246.7	-11.8040821	55.54816	81.92272	149.27497
3	1	1	SFO	Christian McCaffrey	391.3	-7.9271136	168.60418	80.64146	257.17276
4	32	9	DET	David Montgomery	207.2	-5.6484940	41.11002	67.30746	114.06597
5	35	10	DET	Jahmyr Gibbs	242.1	-5.1143193	86.91137	65.44326	157.46895
6	53	14	BAL	Gus Edwards	187.0	-6.7679679	19.46584	61.04759	87.28140
7	12	3	JAX	Travis Etienne	282.4	-4.5290201	93.80611	60.79753	159.13267
8	17	6	CIN	Joe Mixon	267.0	0.0000000	107.29194	56.20920	163.50113
9	59	16	SEA	Kenneth Walker III	199.4	-1.1705985	53.51327	56.13834	110.82221
10	93	25	LVR	Josh Jacobs	181.1	-6.8702170	73.16355	55.55920	135.59296
11	65	20	PIT	Najee Harris	195.5	-6.2852958	56.23899	53.17002	115.69431
12	13	4	LAR	Kyren Williams	255.0	-5.6484940	51.13492	52.86670	109.65011
13	69	21	MIA	De’Von Achane	190.7	-3.3584216	62.95150	52.77974	119.08966
14	43	13	NYG	Saquon Barkley	223.2	-5.0631947	89.62998	51.71806	146.41124
15	99	31	LAC	Austin Ekeler	185.4	-14.6554096	132.87294	50.16511	197.69347
16	63	18	DAL	Tony Pollard	222.6	-4.5801447	86.13688	49.13565	139.85267
17	52	15	ARI	James Conner	201.5	-2.2900723	62.47610	48.63644	113.40261
18	61	17	KAN	Isiah Pacheco	213.9	-5.6484940	60.44341	44.36051	110.45242
19	122	37	CLE	Kareem Hunt	118.5	-0.5852992	35.16709	43.57351	79.32590
20	97	28	IND	Jonathan Taylor	156.4	-4.5290201	39.76985	41.33364	85.63251

Top 75

Based on the functional nature of football and the analysis thus far, a top 75 list wouldn’t be meaningful, since each component and position is unique. For those who are curious, however, a ranking such as this can, in theory, be created by combining all component scores.

ModelRank	Rank2023	PositionRank2023	Team2023	FantasyPosition2023	Player2023	FantasyPointsPPR2023	pc1	pc2	pc3	ovr
1	3	1	BUF	QB	Josh Allen	392.6	-336.8849856	0.0000000	75.2816532	412.1666
2	5	2	PHI	QB	Jalen Hurts	356.8	-277.1566447	0.0000000	87.0594829	364.2161
3	2	1	DAL	WR	CeeDee Lamb	403.2	-5.6484940	344.4118627	8.3875950	358.4480
4	4	2	MIA	WR	Tyreek Hill	376.4	-2.8242470	352.7382300	2.0767694	357.6392
5	50	8	KAN	QB	Patrick Mahomes	280.2	-331.6485727	0.5168941	11.1670530	343.3325
6	40	7	DET	QB	Jared Goff	289.1	-323.6597758	0.4842071	7.8787255	332.0227
7	76	12	JAX	QB	Trevor Lawrence	262.5	-296.4181804	0.0000000	27.5229754	323.9412
8	66	11	MIA	QB	Tua Tagovailoa	270.4	-316.1377077	0.0000000	0.7021111	316.8398
9	8	3	DAL	QB	Dak Prescott	342.8	-302.7732618	0.0000000	11.5304138	314.3037
10	7	3	DET	WR	Amon-Ra St. Brown	330.9	-2.2389478	298.3283464	0.4605591	301.0279
11	20	8	PHI	WR	A.J. Brown	289.6	-6.7168433	289.3462135	0.0000000	296.0631
12	58	9	HOU	QB	C.J. Stroud	276.0	-275.1716287	0.9623171	18.5073453	294.6413
13	14	5	LAR	WR	Puka Nacua	298.5	-1.7558977	282.0636273	0.8599250	284.6794
14	29	12	BUF	WR	Stefon Diggs	273.8	-3.9948454	280.2885390	0.0225449	284.3059
15	34	14	LVR	WR	Davante Adams	265.4	-0.5852992	281.3340147	-0.0032207	281.9161
16	83	19	SEA	QB	Geno Smith	227.3	-270.3990214	0.5107969	7.8084478	278.7183
17	10	4	BAL	QB	Lamar Jackson	331.2	-241.6354945	0.0000000	32.1511112	273.7866
18	80	16	NOR	QB	Derek Carr	241.1	-270.6666575	0.0000000	0.5861661	271.2528
19	78	13	DEN	QB	Russell Wilson	256.9	-248.3591059	0.3534593	19.9824227	268.6950
20	81	17	LAC	QB	Justin Herbert	234.2	-250.4118266	1.6099590	13.2045966	265.2264
21	60	10	TAM	QB	Baker Mayfield	274.1	-251.3108188	0.0000000	6.9678469	258.2787
22	31	6	SFO	QB	Brock Purdy	295.6	-247.5299791	0.0000000	10.3548609	257.8848
23	1	1	SFO	RB	Christian McCaffrey	391.3	-7.9271136	168.6041824	80.6414638	257.1728
24	9	4	TAM	WR	Mike Evans	282.5	0.0000000	256.9248291	0.0000000	256.9248
25	33	13	CIN	WR	Ja’Marr Chase	262.7	-3.2292872	250.8548747	-0.0128828	254.0713
26	70	26	MIN	WR	Justin Jefferson	202.2	-2.6767197	249.3280744	1.8770865	253.8819
27	15	6	CHI	WR	D.J. Moore	286.5	-2.2389478	245.1009581	4.0601387	251.4000
28	26	3	KAN	TE	Travis Kelce	219.4	-3.3584216	246.1584535	0.0161035	249.5330
29	25	10	SFO	WR	Brandon Aiyuk	249.2	-3.3584216	237.1919041	0.0740759	240.6244
30	27	11	LAC	WR	Keenan Allen	278.9	-8.3019302	231.6000597	0.0644139	239.9664
31	82	18	CHI	QB	Justin Fields	230.2	-200.9537028	0.0000000	37.9466259	238.9003
32	89	24	MIN	QB	Kirk Cousins	149.7	-233.3929601	0.0000000	4.2276148	237.6206
33	62	25	IND	WR	Michael Pittman Jr.	250.2	-5.1143193	230.7656078	0.0966208	235.9765
34	54	22	PHI	WR	DeVonta Smith	227.6	-3.3584216	232.4062271	0.0000000	235.7646
35	18	5	GNB	QB	Jordan Love	319.1	-218.7511326	0.0000000	16.6044933	235.3556
36	90	25	CIN	QB	Joe Burrow	147.2	-222.2560458	0.0000000	10.7799923	233.0360
37	42	16	CLE	WR	Amari Cooper	227.0	-4.0363499	228.1088277	0.0000000	232.1452
38	11	1	DET	TE	Sam LaPorta	239.3	0.0000000	228.7718784	0.0386483	228.8105
39	79	15	LAR	QB	Matthew Stafford	243.1	-225.1302799	0.0000000	2.3247628	227.4550
40	39	15	SEA	WR	D.K. Metcalf	225.4	-2.2389478	221.4406898	0.0000000	223.6796
41	112	36	NYJ	WR	Garrett Wilson	213.2	-6.1826686	211.3563377	0.0128828	217.5519
42	57	24	NOR	WR	Chris Olave	231.3	-2.2389478	213.8417338	0.0000000	216.0807
43	28	4	MIN	TE	T.J. Hockenson	219.0	-3.3584216	212.5474634	0.0000000	215.9059
44	47	19	MIN	WR	Jordan Addison	221.3	0.0000000	215.5321435	0.0193242	215.5515
45	77	14	WAS	QB	Sam Howell	257.5	-191.5508111	0.9030403	22.4547598	214.9086
46	72	27	KAN	WR	Rashee Rice	212.5	-5.1143193	209.1896872	-0.0289862	214.2750
47	36	6	JAX	TE	Evan Engram	230.3	-5.6484940	208.0803575	0.0418690	213.7707
48	103	32	MIA	WR	Jaylen Waddle	198.6	-1.1194739	212.2412612	0.1610346	213.5218
49	107	34	SEA	WR	Tyler Lockett	202.4	-2.2389478	207.0999236	0.0000000	209.3389
50	102	31	CAR	WR	Adam Thielen	231.0	-2.3411969	205.6832038	0.0515311	208.0759
51	104	33	TAM	WR	Chris Godwin	209.2	-2.2389478	200.7186832	4.0150490	206.9727
52	75	29	WAS	WR	Terry McLaurin	209.2	-0.5852992	205.3121344	0.0934001	205.9908
53	88	23	CAR	QB	Bryce Young	156.4	-200.2969227	0.0000000	2.4445059	202.7414
54	45	18	GNB	WR	Jayden Reed	217.2	-3.5117954	186.4197583	12.4123061	202.3439
55	51	21	LVR	WR	Jakobi Meyers	218.6	-5.0590601	189.2252456	7.6275115	201.9118
56	23	9	HOU	WR	Nico Collins	260.4	-1.1705985	199.0064355	0.0450897	200.2221
57	22	2	SFO	TE	George Kittle	203.2	-1.1194739	197.7416374	0.0128828	198.8740
58	99	31	LAC	RB	Austin Ekeler	185.4	-14.6554096	132.8729430	50.1651129	197.6935
59	21	7	SFO	WR	Deebo Samuel	243.7	-3.9948454	166.8070814	26.5986366	197.4006
60	49	20	TEN	WR	DeAndre Hopkins	223.6	-1.7047731	194.5227328	0.0579725	196.2855
61	101	30	BAL	WR	Zay Flowers	206.4	0.0000000	187.6528827	6.1723358	193.8252
62	55	23	PIT	WR	George Pickens	208.8	-3.5117954	185.2630141	2.0703280	190.8451
63	74	28	DEN	WR	Courtland Sutton	190.2	-6.7168433	179.0934789	0.0161035	185.8264
64	30	5	CLE	TE	David Njoku	201.2	-5.5973694	180.1365787	-0.0257655	185.7082
65	134	47	JAX	WR	Christian Kirk	150.3	-5.5716181	178.6363682	0.0740759	184.2821
66	85	21	IND	QB	Gardner Minshew II	196.2	-165.8224506	0.0000000	13.7934059	179.6159
67	86	22	ATL	QB	Desmond Ridder	177.1	-154.1840391	0.2493013	20.2201764	174.6535
68	121	42	ATL	WR	Drake London	174.4	-3.9249500	170.5222128	0.0000000	174.4472
69	120	41	LAR	WR	Cooper Kupp	164.4	-1.7047731	168.0109081	2.0252383	171.7409
70	38	12	ATL	RB	Bijan Robinson	246.3	-11.8311626	127.1046309	31.9552264	170.8910
71	84	20	2TM	QB	Joshua Dobbs	200.7	-144.9644778	0.0000000	25.3785715	170.3430
72	118	31	NWE	QB	Mac Jones	106.4	-166.9169863	0.0000000	2.8239702	169.7410
73	46	8	CHI	TE	Cole Kmet	181.1	-1.7558977	164.9200433	0.0418690	166.7178
74	113	38	BUF	WR	Gabriel Davis	161.4	-3.3584216	161.5473649	-0.0128828	164.8929
75	91	26	ARI	QB	Kyler Murray	146.4	-144.1053422	0.0000000	19.8117259	163.9171
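The ovr column in the tables above is numerically consistent with flipping the sign of pc1 and adding pc2 and pc3. For example, Josh Allen's 336.8850 + 0 + 75.2817 gives 412.1666. The Python sketch below applies that inferred formula to five rows copied from the Top 75 table and sorts them. The formula is inferred from the published numbers rather than stated anywhere in the analysis, and the function name is illustrative.

```python
# (player, pc1, pc2, pc3) copied from the Top 75 table above.
rows = [
    ("Christian McCaffrey", -7.9271136, 168.6041824, 80.6414638),
    ("Tyreek Hill", -2.8242470, 352.7382300, 2.0767694),
    ("Josh Allen", -336.8849856, 0.0, 75.2816532),
    ("CeeDee Lamb", -5.6484940, 344.4118627, 8.3875950),
    ("Jalen Hurts", -277.1566447, 0.0, 87.0594829),
]


def overall_score(pc1, pc2, pc3):
    """Inferred overall score: PC1 measures under-performance, so flip its sign."""
    return -pc1 + pc2 + pc3


# Sort from the highest overall score to the lowest.
ranked = sorted(
    ((name, overall_score(pc1, pc2, pc3)) for name, pc1, pc2, pc3 in rows),
    key=lambda item: item[1],
    reverse=True,
)

for model_rank, (name, ovr) in enumerate(ranked, start=1):
    print(model_rank, name, round(ovr, 4))
```

Sorting on this combined score is what lets quarterbacks, who score almost entirely on pc1, and receivers, who score almost entirely on pc2, appear in a single list.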