DFS Soccer Strategy: DraftKings Goalkeeper Analysis

Updated on January 15, 2024 1:44PM EST

Goalkeeper is easily the most frustrating position in soccer DFS. I've heard many theories on how to select a goalkeeper on DraftKings. Always pay for the most expensive option. Never roster Manchester City's Ederson. Play the cheapest home goalkeeper. Play a goalkeeper who will face a lot of shots from outside the box. Or, my go-to, play whoever you want.

I wanted to look at some macro-level trends in goalkeeper scoring to help make decisions.

For detailed stats and odds, check out RotoWire's DraftKings Cheat Sheets.

METHODOLOGY

My data set includes all Premier League matches from the 2020/21 season until the present. I used closing moneyline odds for the home and away teams, along with shot totals and closing over/under 2.5 lines from sharp sportsbook Pinnacle (available at Football-Data). With this information, I calculated each team's goalkeeper saves and whether the goalkeeper received the win and clean sheet bonuses. My data is split into home and away goalkeepers, and I kept that distinction throughout my analysis.
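
The article does not say how the moneyline odds were turned into implied win probabilities. As a minimal sketch, assuming decimal home/draw/away odds and a simple proportional removal of the bookmaker's margin (other de-vigging methods exist), the conversion could look like this. The function name and the example odds are hypothetical.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal 1X2 odds into probabilities that sum to 1.

    Assumption: the bookmaker margin is removed proportionally.
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    overround = sum(raw)  # slightly above 1 because of the margin
    return tuple(p / overround for p in raw)


# Hypothetical closing odds, for illustration only.
home, draw, away = implied_probabilities(2.10, 3.50, 3.60)
print(f"home {home:.3f}, draw {draw:.3f}, away {away:.3f}")
```

The home and away values are what would feed the win-probability axis and buckets used later on.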

My data notably omits some items that affect goalkeeper scoring on DraftKings, including accurate passes, fouls, yellow cards and penalty saves. While accurate passes are important on the margins, 17 of the 20 goalkeepers who started at least half of their team's matches in the 2022/23 Premier League season averaged between 24 and 35 accurate passes, the exceptions being David Raya (38.8), Nick Pope (17.8) and Lukasz Fabianski (21.47). That range accounts for a difference of around 0.3 fantasy points. Only three goalkeepers registered more than three yellow cards that season, and 48 yellow cards were given to goalkeepers in total, which works out to roughly one in every 19 games. The other events range from very uncommon (fouls suffered) to rare (penalty saves and assists).

Since I consider the above omissions a combination of minor and difficult or impossible to project, I focused on what I believe to be the two main drivers of goalkeeper scoring:

Goalkeeper Floor: This is two times the difference between saves and goals conceded by the team's goalkeeper (two points gained per save, two points lost per goal conceded).

Goalkeeper Total: This is the Goalkeeper Floor plus five points each for the win bonus and the clean sheet bonus, if achieved.
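
The two definitions above can be sketched as a pair of small functions. This assumes DraftKings' goalkeeper scoring of +2 per save, -2 per goal conceded and +5 each for a win and a clean sheet, and it reads the floor as two times (saves minus goals conceded). The function names are my own, and the point values are exposed as parameters in case you want to test a different reading.

```python
def goalkeeper_floor(saves, goals_conceded, save_pts=2.0, goal_pts=-2.0):
    """Points from saves and goals conceded only (no bonuses)."""
    return save_pts * saves + goal_pts * goals_conceded


def goalkeeper_total(saves, goals_conceded, won, bonus_pts=5.0):
    """Floor plus the win bonus and the clean sheet bonus, if achieved."""
    clean_sheet = goals_conceded == 0
    total = goalkeeper_floor(saves, goals_conceded)
    if won:
        total += bonus_pts
    if clean_sheet:
        total += bonus_pts
    return total


# Hypothetical stat lines: (saves, goals conceded, team won?)
for saves, conceded, won in [(4, 1, True), (3, 0, True), (2, 3, False)]:
    print(saves, conceded, won, goalkeeper_total(saves, conceded, won))
```

Under these assumptions, a loss with two saves and three goals conceded lands at -2, which is the kind of result that shows up in the "Negative %" figures later on.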

How Efficient is DraftKings Goalkeeper Pricing?

Goalkeeper is a unique position on DraftKings since the price of the goalkeeper is almost completely correlated with their team's implied win probability. It's not a perfect correlation and depends on the specific slate. Sometimes a $5,900 goalkeeper will be 60 percent likely to win and sometimes 80 percent. However, I can still investigate the relationship between Goalkeeper Total points and implied win probability, and then dive into the trends.

After a quick glance, things appear random. The x-axis represents the implied win probability, while the y-axis records the Goalkeeper Total for these matches. I ran a linear regression on both home and away Goalkeeper Total to see how well a linear relationship based on team win probability explains Goalkeeper Total. That trendline is in orange.

There is (unsurprisingly) a positive correlation, but it is not a strong one. The R-squared value, a number between 0 and 1 that represents how well the regression line approximates Goalkeeper Total (with 1 being a perfect fit), is approximately 0.025. That's very low. The regression suggests that Goalkeeper Total increases by around 0.5 points for each 10 percentage points of win probability added, but the variance around the regression line is evidently massive.
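
For readers who want to reproduce this kind of fit, here is a minimal sketch of an ordinary least squares line and its R-squared, written in plain Python. It is not the author's actual code, and the sample data is hypothetical and only there to make the snippet runnable.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical (win probability, Goalkeeper Total) pairs.
win_prob = [0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75]
totals = [4, -2, 11, 6, 2, 14, 7]
slope, intercept, r2 = fit_line(win_prob, totals)
print(f"slope {slope:.2f}, intercept {intercept:.2f}, R^2 {r2:.3f}")
```

With win probability on a 0 to 1 scale, a slope of about 5 corresponds to the roughly 0.5 points per 10 percentage points quoted above.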

What about Goalkeeper Floor?

Goalkeeper Floor only takes into account the difference between total saves and total goals conceded. It is the primary driver of goalkeeper scoring before the win and clean sheet bonuses are applied. It's assumed underdogs will face more shots on goal than favorites, but do they save them enough to have a Goalkeeper Floor comparable to the favorites?

I again ran a linear regression on both data sets and the resulting R-squared values were even worse (0.01 for home and 0.003 for away). There is a slight negative relationship between implied win probability and Goalkeeper Floor. The standard error, which measures on average how far the data drifts from the regression line, is about 0.5 points less for home goalkeepers than away goalkeepers.

To me, the data suggests that from the perspective of "floor" on a macro level, there is little reason to prefer any goalkeeper over another. Home goalkeeper has slightly less variance, but that could be noise. Big underdogs might project for a point or two more in floor, but the difference is minimal. Micro-level takes can and should definitely be applied on a case-by-case basis, but at a zoomed-out level, goalkeeper seems to be a battle for a combination of outlier save performances and achieving the win and clean sheet bonus.

Segmenting Out Goalkeeper Performances

A few things struck me while combing through the data. First, there are a lot of negative Goalkeeper Total scores even for favorites. Second, I expected there to be visual evidence of slightly less variance in scoring from favored goalkeepers, but I didn't observe anything obvious. Finally, the linear relationship between win probability and Goalkeeper Total is quite weak.

So, how can we get a better feel for the goalkeeper range of outcomes? I decided to put goalkeepers in "buckets" corresponding to their implied win probability and made some simple statistical measurements on Goalkeeper Total within each of those.

Win %: Team's implied probability to win

Games: Number of matches that fall into each bucket

Mean: The average (add all scores and divide by number of games)

Median: The 50th percentile outcome (literally, the point in the middle)

Q1/Q3: The 25th and 75th percentile outcome. 25 percent of performances are less than or equal to Q1, 25 percent are better than or equal to Q3.

Negative %: The percentage of games in which a goalkeeper finishes with a Goalkeeper Total of 0 or less.

StdDev: The standard deviation of Goalkeeper Total in each bucket. This measures the spread of the data. When data is reasonably distributed, we expect around 68 percent of the outcomes to fall within one standard deviation of the mean.
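
The measurements defined above can be sketched with Python's standard library. Several details are assumptions on my part, since the article does not specify them: which bucket a probability sitting exactly on an edge falls into, the quartile interpolation method, and the use of the sample (rather than population) standard deviation. The names and the sample scores are hypothetical.

```python
import bisect
import statistics

# Upper edges of the home win-probability buckets used in the tables.
HOME_EDGES = [0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
HOME_LABELS = ["0-20%", "20-30%", "30-40%", "40-50%",
               "50-60%", "60-70%", "70-80%", "80-100%"]


def bucket_label(win_prob):
    """Map an implied win probability (0 to 1) to its bucket label."""
    # Assumption: a value exactly on an edge goes to the higher bucket.
    return HOME_LABELS[bisect.bisect_right(HOME_EDGES, win_prob)]


def bucket_summary(totals):
    """Summary statistics for the Goalkeeper Totals in one bucket."""
    q1, median, q3 = statistics.quantiles(totals, n=4, method="inclusive")
    return {
        "games": len(totals),
        "mean": statistics.mean(totals),
        "q1": q1,
        "median": median,
        "q3": q3,
        # "Negative %" counts scores of zero or less.
        "negative_pct": 100.0 * sum(t <= 0 for t in totals) / len(totals),
        "stdev": statistics.stdev(totals),  # needs at least two games
    }


# Hypothetical Goalkeeper Totals for one bucket, for illustration only.
sample = [-2, 0, 4, 7, 11]
print(bucket_label(0.45), bucket_summary(sample))
```

Different quartile conventions can shift Q1 and Q3 slightly, so small mismatches against the tables below would not necessarily indicate an error.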

Home Goalkeeper Scoring
Win %	0-20%	20-30%	30-40%	40-50%	50-60%	60-70%	70-80%	80-100%
Games	162	199	248	230	178	147	100	74
Mean	4.50	6.04	5.86	6.91	7.96	7.16	8.21	8.16
Q1	0	0	2	2	2	2	4	4
Median	4	4	4	7	7.5	7	7	9.5
Q3	8	11	11	12	14	12	12	12
Negative %	30%	29%	24%	21%	20%	20%	14%	12%
StdDev	6.56	7.43	6.83	6.63	7.16	6.32	6.72	5.78

Away Goalkeeper Scoring
Win %	0-20%	20-30%	30-40%	40-50%	50-60%	60-70%	70-100%
Games	371	277	244	180	119	81	65
Mean	5.14	5.89	6.80	6.30	7.82	9.15	7.86
Q1	0	0.5	2	2	2	4	3
Median	4	4	6	5.5	7	10	9
Q3	10	11	11.5	11	12	14	12
Negative %	29%	25%	21%	24%	14%	14%	15%
StdDev	6.99	6.79	6.87	6.38	6.36	6.29	6.48

There are some intuitive results here, but I'd exercise caution. Goalkeeper variance is very high and this sample is solid but not particularly large. That away goalkeepers with 60-70% implied win probability have the highest mean and median score is more indicative of the variance of the position than a new secret to succeeding at soccer DFS. It's better to focus on general trends.

That said, I think there are some actionable observations to take away.

Goalkeeper Total is "right-skewed" for teams with a win probability between 0 and 40 percent. This is reflected in the means in each of those buckets being higher than the medians. I think this makes sense. Outlier negative goalkeeper performances require a combination of few saves and many goals conceded. It is easier for a goalkeeper to pile up "easy" saves from poor attempts at goal than it is for the attacking team to be unusually efficient (or for the goalkeeper to be unusually inefficient).
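
As a quick check of the right-skew claim, the gap between mean and median can be computed for the 0-40% buckets directly from the two tables above. A positive gap is only a rough indicator of right skew, not a formal skewness test, and the helper name here is hypothetical.

```python
# (mean, median) per bucket, copied from the tables above.
home = {"0-20%": (4.50, 4), "20-30%": (6.04, 4), "30-40%": (5.86, 4)}
away = {"0-20%": (5.14, 4), "20-30%": (5.89, 4), "30-40%": (6.80, 6)}


def mean_median_gaps(buckets):
    """Return mean minus median for each bucket, rounded to 2 decimals."""
    return {label: round(mean - median, 2)
            for label, (mean, median) in buckets.items()}


for side, buckets in (("home", home), ("away", away)):
    print(side, mean_median_gaps(buckets))
```

Every gap is positive, ranging from 0.5 points (home, 0-20%) to about 2 points (home, 20-30%).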

Predictably, scores less than or equal to zero from goalkeepers are more common when win probability is lower. Still, goalkeepers who are slight favorites in the match score zero or less slightly over 20 percent of the time. Often, the difference in price for goalkeepers in this range is around $600 to $800 on DraftKings. I'm more likely to spend down at goalkeeper and use the extra salary for a better outfielder moving forward, especially in cash games. Combined with the right-skew of the scoring, I think there might be a psychological bias against the lowest-priced goalkeepers.

There is no trend of reduced variance throughout the price buckets, except for exceedingly high home favorites. Even then, the sample is not particularly large and the variance is not much smaller compared to the rest, which surprised me. High variance is quite consistent through all ranges of win probabilities.

Finally, with my data, I was able to observe that at the higher win probability end of the spectrum, goalkeepers are more likely to have a so-called bimodal distribution. This is a distribution of outcomes that has two bumps instead of one, and this makes sense, since I think it reflects the five-point bump between getting a win and getting the additional clean sheet.

Compare this to goalkeepers with lower win probabilities. A win and clean sheet are less likely to contribute to the Goalkeeper Total, so the distribution looks closer to normal, with a skew toward the larger point totals.

CONCLUSIONS

I don't think much of this is shocking, but I think it's useful to do a more quantitative analysis to see if our qualitative observations are accurate. It struck me how the 75th percentile and better range of outcomes for most win probability buckets was 11 points or more. In GPPs of slates with three matches or more, I think you can truly play whoever you want at goalkeeper and just wait for the variance of the position to work in your favor. Of course, it's even better if the goalkeeper you roster comes with low rostership.

In cash games, it's helpful to have a number to compare cheap goalkeepers to expensive ones. When considering 2v2s involving a favored versus underdog goalkeeper, do you think you can get five points more in projection, plus some upside to match a possible clean sheet?

I didn't attempt to make any micro observations, but I still think those are important to look into. I can't quantify goalkeeper skill, but maybe some models can. Do teams that shoot outside the box concede more saves to the opposing goalkeeper? Or are they too often off target? I think you can use the data above as a baseline, then make small adjustments based on any extra information you think might be useful.

If you made it all the way here, thanks for reading! I am hardly a data scientist or statistician, so hit me up on Twitter/X (@JackBurkart) or on Discord (hsmyyt.com/chat) and let me know your thoughts, especially if I made any mistakes. Otherwise, I wish you all the best in terms of goalkeeper variance... unless you scooped one of my head to heads.