If intelligent extraterrestrials came to Earth having never heard of baseball, but from a planet that also has DFS, they could become profitable MLB DFS players using the FantasyLabs Correlation tool.
You may be familiar with other tools we have with similar names, like the NFL version. However, the MLB tool works slightly differently. Instead of showing the correlations between players, this one shows the correlation between specific metrics and fantasy performance.
For those unfamiliar, correlation in statistics refers to how much a movement in one measure relates to a movement in another. Things that are perfectly correlated – say sales of right and left shoes – get a score of 1.0. Things that are perfectly inversely correlated get a score of -1.0. Positive numbers mean that when one measure goes up, so does the other. Negative numbers mean that when one goes up, the other goes down, and numbers near zero mean there's little relationship at all.
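To make that scale concrete, here is a minimal sketch of the standard Pearson correlation calculation in Python. The sales figures are made up for illustration, and nothing here is meant to describe how the FantasyLabs tool computes its numbers internally.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Made-up daily sales figures, purely for illustration.
left_shoes = [10, 12, 9, 15, 11]
right_shoes = list(left_shoes)  # shoes sell in pairs
# Every pair sold comes out of a starting stock of 50 pairs.
pairs_in_stock = [50 - sold for sold in left_shoes]

print(round(pearson(left_shoes, right_shoes), 3))     # perfectly correlated
print(round(pearson(left_shoes, pairs_in_stock), 3))  # perfectly inverse
```

The first call lands at 1.0 and the second at -1.0. Real baseball metrics fall somewhere in between, which is what makes the tool worth studying.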
I’m being intentionally vague about what “fantasy performance” means here, for reasons we’ll get into shortly. However, this is a far more powerful tool than the NFL version. (While the data on how players correlate is valuable, it’s mostly accounted for by the market.)
The MLB tool is a one-stop shop for figuring out what matters for MLB DFS in our MLB Player Models.
This piece is designed to give you some tricks for navigating this data. The type (and size) of the contest you’re playing is a significant factor. Additionally, we’ll take a quick look at some possibly surprising data points that might help you rethink your lineup-building process.
First though, a quick look at what the tool tells us.
What’s Inside the Tool
The first thing to note is all of the various statistics included in this tool. While I won’t be going over each of them, nearly 30 different baseball metrics are included. What’s more important is what these metrics correlate with.
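The “fantasy performance” I referenced above takes shape here. Each statistic is compared against several outcomes, including the three this piece leans on: actual fantasy points, Plus/Minus, and tournament ownership.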
In addition, various filters are available for you to tinker with. These include Vegas data, ballpark, and lineup order. There’s also a time filter, in case you’re curious how a certain metric has fared recently or in the distant past.
Of course, there are separate pages for pitchers and hitters as well.
Finally, there’s a page for lineup order correlations.
This is more similar to the NFL product and shows how much scoring from a certain player relates to his teammates. This is beneficial for building stacks since it allows you to find correlations that may be under-accounted for in terms of ownership.
(You could also select only AL ballparks in order to remove pitchers from the sample. Or, select only NL to see how much pitchers hurt the correlation between the eighth hitter and the leadoff man.)
What Are You Solving For?
That’s the most important question when looking at this data. Your answer will vary significantly between cash games and tournaments and slightly between large and small-field tournaments.
Let’s start with cash games.
Regardless of the type of cash game (head-to-head, 50/50, or double up), the mission is the same: maximize expected scoring, regardless of ownership. To that end, we want to target players who rank highly in stats that correlate with actual points, as well as Plus/Minus.
Remember, Plus/Minus is a FantasyLabs metric that relates scoring to salary.
A positive figure means more points than we would expect from a player at a given salary.
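As a rough sketch of that idea, Plus/Minus can be thought of as actual points minus the points we would expect at a salary. The flat points-per-$1,000 baseline below is a hypothetical stand-in chosen for illustration. It is not the expectation FantasyLabs actually uses, and the function names are my own.

```python
def expected_points(salary, points_per_1k=2.0):
    """Hypothetical baseline: a flat rate of points per $1,000 of salary.

    This is an assumption for illustration only, not the FantasyLabs formula.
    """
    return salary / 1000 * points_per_1k


def plus_minus(actual_points, salary, points_per_1k=2.0):
    """Actual points minus the salary-based expectation."""
    return actual_points - expected_points(salary, points_per_1k)


# A $4,000 hitter who scores 10.5 points beats the 8.0-point expectation.
cheap_hitter = plus_minus(actual_points=10.5, salary=4000)
# A $5,500 hitter who scores 9.0 points falls short of the 11.0 expectation.
pricey_hitter = plus_minus(actual_points=9.0, salary=5500)

print(cheap_hitter, pricey_hitter)
```

Under this toy baseline the cheaper hitter comes out at +2.5 and the pricier one at -2.0, even though their raw scores are close.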
My personal preference is to target hitters who excel from a Plus/Minus standpoint. A few things stand out in that department. First, a hitter’s Plus/Minus relates very strongly to the quality of the opposing starting pitcher – but not in the way you think.
Actual points are diminished when facing better opponents, but Plus/Minus actually goes up. This suggests that pricing reacts too strongly to the projected starter on the mound. Intuitively, that makes sense.
The average starter dipped below five innings last season. That means roughly half of a given hitter’s at-bats come against someone other than the starter. With fantasy sites still adjusting salaries based on opponent, there’s a slight edge in targeting good players in tough matchups – if the salary supports it.
This brings us to the next strongest (positive) correlation for hitters – monthly salary change. This one is fairly simple: players whose salaries are lower than they were a month ago tend to outperform their current salary.
Since a month is such a short span in a high-variance sport like baseball, it’s likely that the prior expectation (which was presumably based on a career’s worth of data) is more accurate than the current expectation.
Put simply: target hitters whose prices have declined.
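If you keep your own salary history, that filter is easy to sketch. The hitter names, salaries, and dictionary keys below are all hypothetical, and this is just one simple way to do it.

```python
def salary_decliners(hitters):
    """Return (name, change) pairs for hitters priced below their salary
    from a month ago, biggest drop first."""
    changes = [
        (h["name"], h["salary_now"] - h["salary_month_ago"]) for h in hitters
    ]
    # Keep only the negative changes; sorting ascending puts the
    # largest price drop at the front.
    return sorted((c for c in changes if c[1] < 0), key=lambda c: c[1])


# Hypothetical salaries, purely for illustration.
hitters = [
    {"name": "Hitter A", "salary_now": 4200, "salary_month_ago": 4800},
    {"name": "Hitter B", "salary_now": 5100, "salary_month_ago": 4900},
    {"name": "Hitter C", "salary_now": 3600, "salary_month_ago": 3900},
]

decliners = salary_decliners(hitters)
print(decliners)
```

Here Hitter A (down $600) and Hitter C (down $300) make the list, while Hitter B, whose price went up, drops out.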
With cash game pitchers, I’m not worried as much about salary. On all but the biggest slates, there’s a limited number of pitchers who put up strong scores. So particularly on DraftKings, where we need two pitchers, I’m willing to pay up for the raw points while finding value from my hitters.
To that end, my biggest focus on the correlations page is on actual scoring. Park Factor immediately stands out. It makes sense that the dimensions of the park matter more for pitchers – the larger sample size is where a larger or smaller park comes into play. Any given fly ball might get out of the deepest stadiums, but most of them will drop in.
(Interestingly, Park Factor barely correlates with Plus/Minus. It seems fantasy pricing adequately considers where the game takes place.)
Beyond that, the strongest correlations are various measurements of how deep pitchers go into the game. Innings pitched percentile and pitch counts (over the season and the last 15 days) are major factors. This is fairly obvious since better pitchers get to stay in the game longer. Finally, Strike % and Pitch Speed show strong correlations as well.
Nothing groundbreaking here, but for cash games, it’s a clear message. Don’t get cute with pitcher selections, as the standard measures of effective pitching all correlate strongly with actual scoring.
Next, let’s look at how to use the Correlations tool to optimize our lineups for GPPs.
Things get a bit more interesting for tournaments. We aren’t just trying to score the most points. We’re trying to score the most points that our opponents don’t. (It’s not what you know, it’s what you know that other people don’t.)
That means we’ll be looking not only for factors that correlate strongly with scoring, but for those that correlate more heavily with fantasy scoring than they do with ownership.
Extra points if Plus/Minus shows a positive correlation as well.
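One way to sketch that screen: take each metric’s correlation with points and with ownership, keep the ones where scoring wins, and rank them by the gap. The metric names and correlation values below are placeholders, not numbers from the tool, and the `MetricCorrelation` structure is my own invention for the example.

```python
from dataclasses import dataclass


@dataclass
class MetricCorrelation:
    name: str
    points: float      # correlation with actual fantasy points
    ownership: float   # correlation with tournament ownership
    plus_minus: float  # correlation with Plus/Minus


def leverage_metrics(rows):
    """Metrics that track scoring more closely than ownership, largest gap first."""
    keep = [r for r in rows if r.points > 0 and r.points > r.ownership]
    return sorted(keep, key=lambda r: r.points - r.ownership, reverse=True)


# Placeholder values, purely for illustration.
rows = [
    MetricCorrelation("Metric A", points=0.30, ownership=0.10, plus_minus=0.05),
    MetricCorrelation("Metric B", points=0.05, ownership=0.20, plus_minus=-0.02),
    MetricCorrelation("Metric C", points=0.25, ownership=0.20, plus_minus=0.01),
]

for row in leverage_metrics(rows):
    bonus = " (Plus/Minus bonus)" if row.plus_minus > 0 else ""
    print(f"{row.name}: gap {row.points - row.ownership:.2f}{bonus}")
```

Metric B drops out because the field values it more than the scoring does, which is exactly the kind of stat we want to fade.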
What stands out from the hitters’ side is all the factors that are the opposite of what we’re looking for: strong correlations with ownership but negative correlations with scoring. Of course, that means we can target hitters who rank poorly in those stats to achieve the opposite effect.
Again, this comes down primarily to opponent statistics. For example, “Opponent WHIP percentile” correlates slightly negatively with actual scoring but has a .10 correlation to tournament ownership. To unpack that a bit further, hitters facing bad pitchers score fewer points but at higher ownership.
Clearly, we should be rostering – or at least not avoiding – hitters facing tough competition for tournaments.
Additionally, ISO (isolated slugging) splits exhibit a similar effect. (The split is a hitter’s ISO against pitchers of the same handedness as the one he’s facing, minus his ISO against pitchers of the other hand.)
This has a strong negative correlation with actual points but a noticeable correlation with ownership. That suggests it’s mostly noise, but we shouldn’t be afraid to roster hitters on the wrong side of their splits.
As discussed before, this is mainly because hitters generally don’t face opposing starters for all of their at-bats. The field hasn’t adapted fully to that notion, though.
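For anyone who wants to see the ISO split from above in numbers, here is a small sketch. ISO itself is the standard extra-bases-per-at-bat formula (slugging percentage minus batting average). The stat line and the `iso_split` helper are hypothetical.

```python
def iso(doubles, triples, home_runs, at_bats):
    """Isolated slugging: extra bases per at-bat (equivalent to SLG minus AVG)."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (doubles + 2 * triples + 3 * home_runs) / at_bats


def iso_split(vs_hand, pitcher_hand):
    """ISO against today's pitcher handedness minus ISO against the other hand."""
    if pitcher_hand not in ("L", "R"):
        raise ValueError("pitcher_hand must be 'L' or 'R'")
    other = "L" if pitcher_hand == "R" else "R"
    return iso(**vs_hand[pitcher_hand]) - iso(**vs_hand[other])


# Hypothetical stat line, purely for illustration.
hitter = {
    "R": {"doubles": 20, "triples": 2, "home_runs": 12, "at_bats": 300},
    "L": {"doubles": 4, "triples": 0, "home_runs": 2, "at_bats": 100},
}

print(round(iso_split(hitter, "R"), 3))  # facing a righty: positive split
print(round(iso_split(hitter, "L"), 3))  # facing a lefty: negative split
```

This hitter has a .200 ISO against righties and a .100 ISO against lefties, so he is on the right side of his splits against a right-hander and the wrong side against a left-hander.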
Pitchers for tournaments are a bit of a mixed bag strategy-wise. Sometimes a pitcher or two stands so far above the field that we should eat the ownership and look to differentiate somewhere else. Most of the time, though, we still want to be sensitive to leverage spots. Like with hitters, let’s look at a few different stats that correlate positively with scoring but negatively (or less strongly) with ownership.
WHIP, strikeouts, and innings pitched percentile rankings all fit the bill. Keep in mind, these aren’t the pitcher’s raw numbers for those statistics. Rather, they’re based on where he stacks up among all the pitchers on the day’s slate. All of these correlate nearly twice as strongly with actual scoring as they do with ownership – sometimes we can keep it simple.
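To illustrate the difference between a raw number and a slate-relative ranking, here is a rough sketch. The percentile definition used (the share of other pitchers on the slate that a pitcher beats) is an assumption on my part, since the exact FantasyLabs calculation isn’t spelled out here, and the pitcher values are made up.

```python
def slate_percentiles(values, higher_is_better=True):
    """Map each pitcher to the percentage of the other slate pitchers he beats.

    Assumed definition for illustration; not necessarily the tool's formula.
    """
    if len(values) < 2:
        raise ValueError("need at least two pitchers to rank against each other")
    result = {}
    for name, value in values.items():
        if higher_is_better:
            beaten = sum(1 for other in values.values() if other < value)
        else:
            beaten = sum(1 for other in values.values() if other > value)
        result[name] = 100 * beaten / (len(values) - 1)
    return result


# Made-up slate, purely for illustration.
strikeouts_per_nine = {"A": 11.2, "B": 8.4, "C": 9.7, "D": 6.9, "E": 10.1}
whip = {"A": 1.05, "B": 1.30, "C": 1.18}

k_pct = slate_percentiles(strikeouts_per_nine)
whip_pct = slate_percentiles(whip, higher_is_better=False)  # lower WHIP is better
print(k_pct)
print(whip_pct)
```

The same raw strikeout rate can grade out very differently from one slate to the next, depending on who else is pitching that day.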
Park Factor – as mentioned for cash games – has a strong .54 correlation with scoring, but its correlation with ownership is much weaker at only .20.
Interestingly, the quality of a pitcher’s opposition (as measured by wOBA and strikeout rate) has little to do with his actual success.
Good pitching beats good hitting, as the saying goes.
This isn’t an exhaustive list, but FantasyLabs subscribers can cross-reference the data at any point.
While our projections do most of the work, a tool like this is super useful in breaking close ties between similar players.
It’s also a great starting point for people who are new to MLB DFS, so make sure you study up before building your lineups.