Labyrinthian: Berkson’s Paradox and Daily Fantasy Sports

“tl;dr version: it’s because of Berkson’s Paradox.”
— Brian Burke, Twitter

I thought about starting this piece with a quotation from E. M. Forster — a respected British writer of the last 150 years whose work I’ve never bothered to read — but I’m not sure if the quotation I want to use is actually his. I found the quotation on the internet, and internet quotations are often misattributed and/or misquotations. Also, even if it is his, I’m not sure about the original context of the quotation. Did he himself say it in a speech or write it in an essay? Or was it written by one of his narrators in a story or spoken by one of his characters? (Context matters.)

Anyway, here’s the quotation loosely associated with Forster: “Spoon feeding in the long run teaches us nothing but the shape of the spoon.”

That’s pretty f*cking good — but is it really any better than Burke’s tweet?

This piece is about Berkson’s Paradox and the need not to spoon-feed oneself. By the way, I had no idea what Berkson’s Paradox was until about five minutes before I started writing this piece, so I’m the perfect person to tell you all about it.

Life of Brian

  1. A movie that used to be overrated and is now probably underrated.
  2. The existence of a guy who used to have a cool analytics site before joining ESPN.

Burke is a former Navy pilot who started the site Advanced Football Analytics and eventually leveraged that into a job with ESPN’s Stats & Information Group as a senior football analytics specialist.

I’ve been lucky enough to speak with Brian a couple of times on a podcast — and by “speak with Brian,” I mean “he spoke and I fumbled through sentence fragments as I interviewed him” — and anything I’ve ever read of his or heard him say has always seemed reasonable and researched. He’s a dude who definitely is devoted to facts and data.

Today as I was wasting time researching, I came across a tweet of his . . .

. . . after which was another tweet — the one with which I started the article. Because Burke is smart and I’m interested in paradoxes, riddles, fallacies, etc. — as well as the NFL combine — I read Burke’s article.

Full disclosure: I read it hoping I’d learn about Berkson’s Paradox and be able to use it in a piece. Appearances to the contrary, The Labyrinthian doesn’t write itself — although it might be better if it did.

Burke’s Thoughts on Combine Numbers

In “Why You Should Take Those Combine Numbers With a Grain of Salt,” Burke doesn’t say that combine data is meaningless, but he does say this (which I’ve excerpted):

It may surprise you just how difficult it is to find meaningful connections between combine performance and success in the pros. It’s much harder to detect the signal among the noise than you might expect.

The tremendous difficulty of finding a connection between athletic abilities and performance in the NFL is something of a mystery. After all, if you and I and other random fans were invited to the combine, we’d likely do quite poorly at the drills.

But that’s just it! You and I aren’t invited to the combine. That’s the key to unlocking the mystery.

To be invited to the combine, a prospect must have measurable talent, nonmeasurable skills, or likely some combination of both, and must be strong enough to play well in college and catch the eye of NFL scouts.

The fact that a prospect is invited creates a selection bias in the combine results. The guys with relatively low measurables tend to have higher nonmeasurables, or else they likely wouldn’t have been invited. And the guys with low nonmeasurables tend to have high measurables for the same reason. Measurable talent and nonmeasurable skills tend to be cross-wired, given the fact that someone has been invited to the combine.

Here’s what I take from this: Data from the combine is best used when we remember that 1) it’s from the combine, 2) to be at the combine a prospect must be selected to be there, and 3) many combine attendees have some unquantifiable attributes.

Berkson’s Paradox

For Burke, the question of combine data comes down to a “phenomenon known as Berkson’s paradox.”

If you look up Berkson’s paradox (or Berkson’s bias or Berkson’s fallacy) on Wikipedia, you’ll see some formulas that only people like Burke and Colin Davy understand, such as . . .

if 0 < P(A) < 1 and 0 < P(B) < 1

and P(A|B) = P(A)

then P(A|B,C) < P(A|C) where C = A∪B (i.e. A or B)

I guess I could lie and tell you that I understand all of that, but we’d both know I’m lying . . . and then it really wouldn’t be a lie, would it?

Essentially, Berkson’s paradox is an ascertainment bias: When two independent events are seen as conditionally dependent because of the constraints of the experiment — when the person studying data can’t properly see the facts because (s)he has gathered a sample that is inherently skewed — that’s Berkson’s paradox.
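
For anyone who, like me, would rather count than read formulas, here is a minimal sketch that checks the claim on two fair coin flips. The coins and the names A, B, and C are my own assumptions for illustration (they aren't from Burke's article): A is "first flip is heads," B is "second flip is heads," and C is "at least one of them is heads," which plays the role of the skewed sample.

```python
from fractions import Fraction
from itertools import product

# Two independent fair coin flips, listed as four equally likely (a, b) pairs.
outcomes = list(product([True, False], repeat=2))

def prob(event, given=lambda a, b: True):
    """Exact P(event | given), found by counting equally likely outcomes."""
    pool = [(a, b) for a, b in outcomes if given(a, b)]
    hits = [(a, b) for a, b in pool if event(a, b)]
    return Fraction(len(hits), len(pool))

A = lambda a, b: a          # first flip is heads
B = lambda a, b: b          # second flip is heads
C = lambda a, b: a or b     # the sample only keeps cases where A or B happened

p_a = prob(A)                                                   # 1/2
p_a_given_b = prob(A, B)                                        # 1/2: B tells us nothing about A
p_a_given_c = prob(A, C)                                        # 2/3
p_a_given_b_and_c = prob(A, lambda a, b: B(a, b) and C(a, b))   # 1/2

print(p_a, p_a_given_b, p_a_given_c, p_a_given_b_and_c)
```

In the full set of flips, knowing B changes nothing about A. Inside the sample C, knowing B drops the chance of A from 2/3 to 1/2, so the two flips look negatively related even though they are independent.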

Men Are Pretty Much Horrible — Except When They’re Ugly

Jordan Ellenberg (a professor of mathematics) has created the go-to explanation of Berkson’s paradox. (By the way, I previously mentioned Ellenberg in my Labyrinthian on the overvaluation of premium DFS assets.)

In an article entitled “Why Are Handsome Men Such Jerks?” Ellenberg drops some truth bombs, which I’ve excerpted:

Suppose you’re a person who dates men. You may have noticed that, among the men in your dating pool, the handsome ones tend not to be nice, and the nice ones tend not to be handsome.

Behold the Great Square of Men.

[Image: the Great Square of Men]

Now, let’s take as a working hypothesis that men are in fact equidistributed all over this square. In particular, there are nice handsome ones, nice ugly ones, mean handsome ones, and mean ugly ones, in roughly equal numbers.

But niceness and handsomeness have a common effect: They put these men in the group of people that you notice. Be honest — the mean uglies are the ones you never even consider. So inside the Great Square is a Smaller Triangle of Acceptable Men:

[Image: the Smaller Triangle of Acceptable Men]

Now the source of the phenomenon is clear. The handsomest men in your triangle, over on the far right, run the gamut of personalities, from kindest to (almost) cruelest. On average, they are about as nice as the average person in the whole population, which, let’s face it, is not that nice. And by the same token, the nicest men are only averagely handsome. The ugly guys you like, though — they make up a tiny corner of the triangle, and they are pretty darn nice. They have to be, or they wouldn’t be visible to you at all. The negative correlation between looks and personality in your dating pool is absolutely real. But the relation isn’t causal. If you try to improve your boyfriend’s complexion by training him to act mean, you’ve fallen victim to Berkson’s fallacy.

Just to make sure that this is clear: Berkson’s fallacy is the result of looking at the metaphorical triangle of data and not the whole square. And this part’s important: It’s a triangle you have created. (I’m using “you” generically, sort of.)
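
The square-versus-triangle idea can be simulated in a few lines. This is a rough sketch under my own assumptions, not Ellenberg's method: looks and niceness are each drawn uniformly between 0 and 1, and the "noticed" rule (combined score above 1.0) is a hypothetical cutoff I picked to carve out a triangle.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# The Great Square: looks and niceness are independent, each uniform on [0, 1].
men = [(rng.random(), rng.random()) for _ in range(20_000)]

# The Triangle: assumed rule that you only notice men whose combined score clears 1.0.
noticed = [(looks, nice) for looks, nice in men if looks + nice > 1.0]

r_square = pearson(*zip(*men))
r_triangle = pearson(*zip(*noticed))

print(f"whole square: r = {r_square:+.2f}")
print(f"triangle:     r = {r_triangle:+.2f}")
```

The correlation in the whole square should land near zero, while the correlation inside the triangle should come out clearly negative (roughly -0.5 for this particular cutoff). Nothing about the men changed; only the filter did.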

Berkson’s Paradox and DFS

DFS players can commit Berkson’s fallacy when performing two main tasks: Researching players and reviewing performance.

Researching Players

Let’s say you want to research by using the Trends tool. (Wise move.) And now let’s say you have some ideas about what types of players are acceptable to begin with, so you use the filters to create a cohort of qualified candidates upon which you can conduct various trend tests with the cohort data providing a baseline.

On the one hand, it’s smart to have a baseline. On the other hand, the baseline — created through a combination of criteria — could be skewed. On top of that, the sample is purposely restricted. As a result, it’s likely unrepresentative of the larger population. After all, the sample was made to exclude those who are deemed unacceptable.

What this means is that any research done on the cohort — and any insights derived from that research — could mislead. For instance, you might see that two factors seem correlated when in fact they’re correlated only in the restricted sample you’ve created. You might notice that adjusting a particular filter results in an elevated Plus/Minus for the matching players — but you might fail to realize that for most players that filter has little impact.

It’s fine to narrow the group of players on which you conduct research, as long as you keep in mind these two facts:

  1. Because the cohort is smaller than it could be, the results of the research are likely to be less reliable than they otherwise might be.
  2. What is true for the cohort might not be true for players not in the cohort.

If you don’t keep these points in mind, your research could be tainted by Berkson’s bias.
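
The first of those two points can be put into rough numbers with the standard error of a mean, which shrinks with the square root of the sample size. The figures below are hypothetical and chosen only to make the arithmetic easy; they are not FantasyLabs data, and the sketch assumes the observations are independent.

```python
import math

def standard_error(std_dev, n):
    """Standard error of a sample mean: std_dev / sqrt(n)."""
    return std_dev / math.sqrt(n)

# Hypothetical numbers for illustration only.
plus_minus_std_dev = 8.0   # assumed spread of Plus/Minus across matching players
full_pool_size = 1600      # assumed number of matches before the extra filters
cohort_size = 100          # assumed number of matches after the extra filters

se_full = standard_error(plus_minus_std_dev, full_pool_size)   # 8 / 40 = 0.2
se_cohort = standard_error(plus_minus_std_dev, cohort_size)    # 8 / 10 = 0.8

print(f"full pool: +/- {se_full:.2f}   cohort: +/- {se_cohort:.2f}")
```

With these made-up inputs, cutting the sample to one-sixteenth of its size makes the average four times noisier. That covers only the first point; the second point (the cohort may simply behave differently from everyone else) is a selection problem that no sample size fixes.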

Reviewing Performance

I mentioned earlier that this piece in part is about the need not to spoon-feed oneself. When it comes to reviewing performance in DFS, a lot of people spoon-feed themselves. They do this in two ways:

  1. They review their winning lineups.
  2. They review their lineups.

It’s nice to review lineups that win (kmills8 is probably enjoying his personal lineup breakdown), but there’s only so much that can be learned from analyzing successful rosters because they represent a small (and perhaps biased) portion of your total portfolio. If you analyze yourself by looking only at your best performances, the conclusions you reach are likely to be skewed.

For instance, you might notice that in a high percentage of your winning lineups you employ a specific strategy. That observation could lead you to use that strategy in an overwhelming majority of your lineups. Of course, if you also looked at your losing lineups — and the ratio of your winning/losing lineups and the types of slates in which you win/lose — then you might realize that your ‘winning strategy’ applied only in particular situations.
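
Here is a small worked example of that trap. The lineup counts are hypothetical numbers I made up for illustration; the point is only the difference between "how many of my winners used the strategy" and "how often the strategy wins."

```python
# Hypothetical lineup counts for illustration only.
lineups = {
    # (used_strategy, won): count
    (True, True): 8,
    (True, False): 72,
    (False, True): 2,
    (False, False): 18,
}

def count(used=None, won=None):
    """Total lineups matching the given strategy/win flags (None means either)."""
    return sum(
        n for (u, w), n in lineups.items()
        if (used is None or u == used) and (won is None or w == won)
    )

share_of_wins_using_strategy = count(used=True, won=True) / count(won=True)   # 8 / 10
win_rate_with_strategy = count(used=True, won=True) / count(used=True)        # 8 / 80
win_rate_without_strategy = count(used=False, won=True) / count(used=False)   # 2 / 20

print(f"winners that used the strategy: {share_of_wins_using_strategy:.0%}")
print(f"win rate with the strategy:     {win_rate_with_strategy:.0%}")
print(f"win rate without the strategy:  {win_rate_without_strategy:.0%}")
```

In this invented portfolio, 80 percent of the winning lineups used the strategy, which looks impressive until you see that the strategy was in 80 percent of all lineups. The win rate is 10 percent with it and 10 percent without it, and you can only see that by counting the losers too.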

Additionally, it’s probably not sufficient to look at just your lineups, because they reflect and are shaped by biases and habits of which you might not even be aware. Just as combine data is of limited value because it’s based on a restricted player pool that has been selectively created, so too are your lineups of restricted utility (as instructive tools) because they are only some of the lineups in the DFS universe and were constructed in a manner particular to you.

And don’t forget — as the combine reminds us — that we live in a world in which not yet everything is quantifiable. Because FantasyLabs Co-Founders Jonathan Bales and Peter Jennings (CSURAM88) have sustained records of quantifiable success, there’s value in watching the premium lineup reviews (available for free!) that they offer — but really you should watch these to hear what Bales and CSURAM88 have to say, since not all wisdom is numerical.

While it’s great to hear Peter talk about adjustments to his Player Models, it’s even better to learn about the logic behind those changes and the perspective Peter generally has. It’s awesome to hear Bales talk about ownership percentages — which Pro subscribers should review in the DFS Ownership Dashboard after each slate has locked — but it’s even better to understand how Bales approaches various slates and when he thinks contrarianism is most beneficial.

You should review your winning lineups, because they do have significance — they provide insight into who you are as a player — but you shouldn’t give them priority over other kinds of rosters.

Sometimes the best way to change what’s in the mirror is to focus on what isn’t there.

Stretching a Double Into a Touchdown

I probably should’ve ended this piece with the sentence right before this one.

The Labyrinthian: 2017.22, 117

This is the 117th installment of The Labyrinthian, a series dedicated to exploring random fields of knowledge in order to give you unordinary theoretical, philosophical, strategic, and/or often rambling guidance on daily fantasy sports. Consult the introductory piece to the series for further explanation. Previous installments of The Labyrinthian can be accessed via my author page.

About the Author

Matthew Freedman is the Editor-in-Chief of FantasyLabs. The only edge he has in anything is his knowledge of '90s music.