Andrew McCutchen starting slow and then picking it up in mid-summer is a story we all know by now. In fact, our own Corey March wrote about it a couple of weeks ago. In that article, he showed graphs of McCutchen’s ISO and wOBA over the last couple of years, organized by month. The point was that this is now an established trend: McCutchen ramps up as the season progresses.

Why? In other sports, a slow start is sometimes just a matter of getting into playing shape, but I don’t think that’s the issue here. This may not be the final answer, but I found Daren Willman’s recent tweet illuminating, and I think it raises a broader question: should we be using PITCHf/x data more in our DFS research?


Drew Dinkmeyer and I discussed this in our most recent Fantasy Labs podcast. Drew said that he looks at any data he can get his hands on, and while it may not be a part of his model or formula to rate players each day, it’s definitely something that he believes is part of the process. I agree.

When researching your DFS plays, you’re going to come across weird stuff like McCutchen’s monthly splits. But instead of just knowing that those splits exist, I think it’s immensely more valuable to know why they exist. Has it just been bad luck in the early months? Was he facing tougher pitchers earlier in the season? Were the one-two hitters in the order getting on base less often, and did that affect his play?

Instead, thanks to the cool PITCHf/x data and sites like Daren’s, we can see that McCutchen wasn’t making good contact or hitting the ball hard. Knowing that, just like knowing BABIP data, lets you anticipate when a player’s results are likely to regress. If you see that McCutchen is starting to pound the ball again, it may not lead to immediate results, but you can expect his fantasy points to eventually mirror that. Being ahead of the public on that knowledge by even one day could mean hitting a GPP.
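To make the regression idea concrete, here is a minimal Python sketch of the BABIP version of that check. It uses the standard BABIP formula, (H - HR) / (AB - K - HR + SF). Everything else is an assumption for illustration: the function names, the 30-point threshold, and the stat line are hypothetical and are not McCutchen’s actual numbers. A contact-quality check built on PITCHf/x data could follow the same pattern of comparing a recent sample to a longer baseline.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play in this sample")
    return (hits - home_runs) / balls_in_play


def regression_flag(recent_babip, baseline_babip, threshold=0.030):
    """Compare a recent BABIP to a longer-term baseline.

    The 30-point threshold is an arbitrary illustrative cutoff,
    not a value taken from the article.
    """
    gap = recent_babip - baseline_babip
    if gap <= -threshold:
        return "due to improve"
    if gap >= threshold:
        return "due to cool off"
    return "in line"


# Hypothetical stat line for a slow first month (not real data).
recent = babip(hits=20, home_runs=2, at_bats=90, strikeouts=20, sac_flies=1)
# Hypothetical career baseline.
print(round(recent, 3), regression_flag(recent, baseline_babip=0.330))
```

A flag like this is only a starting point. The sample sizes in a single month are small, so it works best alongside the contact data rather than in place of it.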

This PITCHf/x data will probably be at the forefront of my articles over the next couple of weeks, as I really believe this is an important topic. I’m guilty myself of accepting “surface data” at face value without digging into why it looks the way it does. The surface data is merely descriptive; understanding the why is what makes it predictive. I’m going to look into this, and I would encourage you to do the same: don’t accept surface data. Learn why.