A Quick Summation of My Approach to Daily Fantasy Baseball

Much of my approach to daily fantasy sports, particularly baseball, centers on embracing uncertainty and trying to put myself in a position not only to avoid being harmed by it, but also to benefit from it.

A big part of that comes down to understanding that I probably don’t know as much as I think I do. I think it’s pretty natural for people to act with certainty about their beliefs, and sometimes that’s beneficial; the underdog who believes they can win a game they otherwise probably wouldn’t win is an example of a self-fulfilling prophecy in which strong belief, even if irrational, leads to success.

For our purposes, though, ignoring uncertainty and volatility can be harmful. If we never account for the fact that we could just be wrong, we’re creating a fragile system that opens itself up to being harmed by variance. That’s one reason that traditional projections and values can be very misleading; if you think you can consistently tell the difference between a projection of 10.0 fantasy points for Player X and 10.2 fantasy points for Player Y, you’re going to get yourself into trouble. It’s fine to make projections, but it’s vital to incorporate uncertainty and to ask how consistently we can trust them: “Here’s what I think this player will do, but how confident can I be that I’m right?”
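To put a rough number on how little a 0.2-point projection gap means, here is a minimal sketch. It assumes each player's nightly score is independent and roughly normal, and it uses a made-up standard deviation of 8 fantasy points; both the function name and those figures are illustrative assumptions, not measured values.

```python
from math import sqrt
from statistics import NormalDist


def prob_y_outscores_x(mean_x, mean_y, sd_x, sd_y):
    """Chance that Y beats X on a single night.

    Assumes the two scores are independent and normally distributed,
    so their difference is normal with the combined variance.
    """
    diff = NormalDist(mean_y - mean_x, sqrt(sd_x ** 2 + sd_y ** 2))
    return 1.0 - diff.cdf(0.0)


# Hypothetical nightly standard deviation of 8 fantasy points per player.
p = prob_y_outscores_x(10.0, 10.2, 8.0, 8.0)
print(f"P(Y outscores X) = {p:.3f}")
```

Under those assumptions the "better" projection wins only slightly more often than a coin flip, which is the sense in which a 10.0 versus 10.2 distinction can mislead.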

And the more uncertainty, the less value there is in following the herd—one reason value matters more in basketball (a game in which players have a narrow range of fantasy outcomes on a nightly basis) than baseball (a game in which the daily fantasy outcomes are distributed over a very wide range). The more variance, the more we can benefit from using game theory (depending on the league type) as a path to creating an antifragile plan of attack.

None of this is to say that I believe baseball or any sport is totally random, or even mostly random—it’s not like I could go out there and hit dingers—but rather that there’s just more variance than most people would like to admit. As I mentioned, that limits the importance we can place on short-term data, but it increases the importance of uncovering long-term signals.

Basically, the more variance, the more we can expect numbers to regress toward the mean. Figuring out that “mean,” whether it’s league-wide or on the level of a single player, is key. If I’m projecting Adam Wainwright’s strikeouts per nine innings this year, for example, it’s important to know that he had only 7.1 last year after four straight seasons above 8.0, but it would also be useful to know how pitchers at his age typically perform relative to their previous peak production. Maybe pitchers normally don’t see a major decline in strikeouts until age 35, for example, in which case Wainwright could be in for a positive regression.
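One common way to express regression toward the mean is a weighted average that pulls a recent observation toward a baseline. The sketch below is that generic shrinkage formula, not a description of my actual projection process. The 7.1 figure comes from the example above; the baseline of 8.0, the 200 innings, and the weight of 150 "prior innings" are all hypothetical placeholders.

```python
def regress_toward_mean(observed, baseline, sample_size, prior_weight):
    """Blend an observed rate with a baseline rate.

    The larger the sample relative to prior_weight, the more the
    estimate trusts the observation; a small sample stays near the baseline.
    """
    total = sample_size + prior_weight
    return (observed * sample_size + baseline * prior_weight) / total


# Observed K/9 of 7.1 from the text; every other number is an assumption.
estimate = regress_toward_mean(
    observed=7.1,      # last season's strikeouts per nine innings
    baseline=8.0,      # hypothetical age-adjusted expectation
    sample_size=200,   # hypothetical innings pitched last season
    prior_weight=150,  # hypothetical innings' worth of trust in the baseline
)
print(f"Regressed K/9 estimate: {estimate:.2f}")
```

The estimate lands between last year's number and the baseline, and the real work is in choosing that baseline, which is where the aggregate aging data comes in.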

Thus, a big part of what I do is try to get my hands on as much aggregate data as I can. I want to know the timeless (or at least semi-timeless) sort of information: how left-handed finesse pitchers age compared to right-handed power hurlers, or exactly how valuable it is to play at Coors Field, for example.

The numbers allow me to develop general heuristics that I think make for a nice daily fantasy decision-making foundation. I don’t follow such heuristics at all times (there are plenty of occasions when it’s smart to deviate from the rule), but in general, I think it makes sense to have aggregate data as a backbone.

Again, this relates back to variance and our inability to consistently and accurately identify exceptions. In the NFL, for example, there’s a very low success rate for running backs who check in slower than about 4.50 in the 40-yard dash. Are there times when it’s acceptable to draft a running back who runs a 4.55? Sure. It’s not like those guys never succeed. But teams consistently miss on those sorts of players because identifying the traits that might allow a player to be an exception to the rule is really challenging. Overall, data has shown NFL teams would be better off drafting running backs based on a single naïve heuristic (faster is better) than on whatever voodoo they’ve been using in the past.

I think we can have the best of both worlds: data-driven heuristics and logical, subjective decision-making. Actually, the heuristics act as a foundation from which we can make smart subjective decisions and adapt as daily fantasy players. That evolution is key.

My goal is to take as scientific an approach to daily fantasy sports as I possibly can. I want to create falsifiable theories, test them, and alter my opinions when necessary. I want to use data to inform my decisions, implementing heuristics as a foundation and stats as a path through which I can improve my subjective decision-making. I want to constantly adapt to new information.
