One in 49 Million

– The K Zone –
December 3rd, 2018
One in 49 Million, by Ian Joffe

The hitting streak is among the most exciting phenomena in the game of baseball. We like to think of hitting streaks as incredible feats, accomplished only by a unique combination of mental and physical skills manifesting themselves over a month-long period. There is another view, however, on the creation of hitting streaks: that they are actually statistical likelihoods which are all but bound to occur within a given period of time, controlled by data’s randomness alone. Both explanations seem reasonable. The perfectly robotic sabermetrician would argue for the latter, for in a game driven by statistics, things like the hitting streak can be predicted rather perfectly using data and probability. But the first argument, too, has logical merit. Players are human, and it’s very possible that they are able to get “locked in” to some mechanical or psychological state that increases their odds of getting hits in each game.

To determine which argument is true, and whether hitting streaks exist as anything more than statistical illusions, I compared data from Baseball Reference’s Play Index about real hitting streaks to simulated data from a Python program I wrote that determines the odds of certain hitting streaks occurring over a given time period. If the real MLB data matches the statistically expected data, it is reasonable to assume that real hitting streaks are based in nothing more than statistical probabilities, but if the MLB data is distinguishable from the expected results, it would appear that there is something special going on with players who have lengthy streaks.

To find the number of expected streaks in a given period of time, one must apply a geometric distribution, which is based on a string of events, each of which is labeled a success or a failure. A success, whose probability is denoted by $p$, is in this case a game played without a hit. A failure, then, is a game with a hit. To find the probability that a batter’s first hitless game is game number $x$ (that is, that the batter hits safely in $x - 1$ consecutive games and then fails to get a hit), one applies two conditions. First, the batter must fail to get a hit in the game in question ($p$), and second, the hitter must get a hit in all previous games ($(1 - p)^{x-1}$), where $1 - p$ is the probability of a hit (or more specifically, the odds of not not getting a hit), and $x$ is the number of games in the sequence, the last game being the one without a hit. So, the probability that both conditions occur is the product of the two, or $(1 - p)^{x-1}p$. The data that I used extends from 2000-2018, over which the MLB batting average was .260. The average player had 3.134 at bats per game during that period (although this is a very, very slight overestimate because in order to avoid adding too many games without at bats for players like AL pitchers, I had to purge from my data players with fewer than one at bat per game on average). So $p$, the probability of not getting a hit in a game, equals $(1 - .260)^{3.134}$, or 0.389. From this, I was able to plug in and find the expected number of each length of hitting streak.
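As a rough sketch of the calculation described above (not the original script), the following Python computes p from the batting average and at bats per game quoted in the article and then evaluates the geometric formula. The function names are made up for illustration, and the sketch assumes every at bat is an independent trial with the same .260 chance of a hit.

```python
# Illustrative sketch only; function names are hypothetical, not from the original script.

LEAGUE_AVG = 0.260    # MLB batting average, 2000-2018 (from the article)
AB_PER_GAME = 3.134   # average at bats per game (from the article)


def hitless_game_prob(avg: float, ab_per_game: float) -> float:
    """Probability of going hitless in a game, assuming independent at bats."""
    return (1 - avg) ** ab_per_game


def streak_prob(p: float, x: int) -> float:
    """Geometric probability that the first hitless game is game x,
    i.e. a hit in each of the previous x - 1 games."""
    return (1 - p) ** (x - 1) * p


p = hitless_game_prob(LEAGUE_AVG, AB_PER_GAME)
print(f"p = {p:.3f}")  # about 0.389, matching the figure in the text

# Probability that a run of games with a hit lasts exactly k games
for k in (1, 5, 10, 20):
    print(k, streak_prob(p, k + 1))
```

Multiplying each probability by the number of trials in the sample would give an expected count for that streak length; that number is not stated in the article, so the sketch stops at probabilities.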

To find the real number of hitting streaks since 2000, I wrote a script that put together data from Baseball Reference’s Play Index. The longest hitting streak in my data for that period is Dan Uggla’s 33-game streak back in 2011, so I calculated the odds of each streak length up to there. Here were the results:

Looking at the shorter streaks where length < 9, the expected values are actually greater than the observed values, which suggests that getting in a short groove has no psychological or mechanical advantage. Having a three-game hitting streak does not make a player any more likely to have a four-game hitting streak. So, where did the extra frequencies go? For starters, the observed number of one-game streaks is much higher than the expected number, which is strange. I have no explanation for that. But a lot of frequencies went to longer streaks as well. Here’s the graph zoomed in on lengths > 10:

There’s a critical point after about 10 games where the observed frequencies overtake the expected frequencies, and they do so by a very significant amount. The chi-square P-value was way under 0.001. That’s probably because this effect becomes even more exaggerated as the hitting streaks get longer. Here’s the data for hitting streaks longer than 20 games:

The observed values start to lose their perfect exponential curve because of the smaller sample, but the effects are still very clear. Very, very few hitting streaks over 20 games are expected. Yet, many occurred. In total, the model expected 10.28 hitting streaks longer than 20 games in the 19-year period. We got 81 – an increase by nearly a factor of eight. The model predicted 1.49 hitting streaks of 23 games. The actual value: 14. The odds of a hitting streak like Dan Uggla’s occurring during the new millennium were just over 1 in 100. I would say we should consider ourselves lucky to be able to see such incredible statistical feats – and we are – but this is clearly more than luck. There is no way that so many of these lengthy hitting streaks occurred in a non-mental, non-physical game of randomness. While there is little evidence to suggest a 4-game hitting streak is any more likely than expected, it is clear that players are far more likely to go on hitting streaks over 20 games than statistics would expect. A player who already has a hit in 22 games is much more likely than expected to get a hit in the 23rd. This is probably because there’s little pressure involved on a short streak. I doubt a hitter would even be aware that they have a hit for four games in a row. But as the streaks climb above 10 and 20 and the media starts to pay attention, it’s impossible not to be aware of them. The players who perform well on the big stage start to improve. Based on the data, we can be all but certain that the mental factor is there.

I found this a rather relieving conclusion. Some of my previous articles, like those about taking revenge on old teams, or players on their birthdays, found little evidence for a mental factor in baseball. They suggested that the game is perfectly predictably random. This data, however, suggests otherwise. It shows that there is an element to how hitters perform above the statistics. It’s still incredibly scientific – my opinion is that psychology and next level sports medicine will be the next Moneyball-esque breakthrough in the game – but it shows that players are more than numbers. I love statistics, which you know because you just read my article, but it’s still nice to think that players operate on a field above the random, and from this, one can argue that they do. 

Of course, I couldn’t finish an article about hitting streaks without mentioning Joe DiMaggio. His 56-gamer in 1941 is still the gold standard for hitting streaks, and feels as unbreakable as a record gets. The purely statistical odds of any player having such a streak since the dead ball era are 1 in 49,000,000. In other words, he did something in one short century that should have taken about five billion years, roughly the age of the Earth, to accomplish. Yeah, DiMaggio was pretty great.

 

If you found this article interesting, make sure to follow The K Zone on Twitter and be the first to know when we post brand new research and interviews. Thanks!

 

Sources Cited:
Fangraphs
Baseball Reference
Ms. Christine Robbins
Statistics How To

Image Attributed to:
The Associated Press


Evaluating the Myth of Postseason Experience

– The K Zone –

October 1st, 2018


Evaluating the Myth of Postseason Experience, by Ian Joffe

We’ve all heard MLB commentators, especially the old-school ones, complain about a team’s postseason chances because its players lack experience. The idea is that younger or less experienced players (I’ve heard both versions) are more anxious in the postseason, and are therefore more likely to choke on the grand stage. It’s undeniable that there are major psychological effects going on in the postseason. From Clemens to Kershaw (arguably), there are some players who just seem overwhelmed by the bright lights, and no matter how good their regular season was, they collapse when it matters. The questionable part, though, is if this correlates with age or experience. There may just be some players who, no matter their age, cannot figure out the playoffs, or players that may actually improve in October once given a chance to get used to it.

As a 17-year-old, I have always argued in favor of the youth. I don’t think there’s anything about being younger that makes one choke under pressure. A lot of that judgement is based on stereotypes that have little or no basis in hard evidence. Furthermore, one could just as easily put together an argument that youth should be better in the postseason because their energy can match the hype. That’s my problem with a lot of psychological arguments: it’s easy to make one up that will go either way, and they usually take the path more traveled by – that is, they tend to be prone to confirmation bias of stereotypes that most people already believe. Once again, there’s no denying that psychology is a science and that mastery of it can provide a tremendous advantage in sports. But, if it is a science, there must be scientifically gathered evidence for any conclusion to be valid. So that’s what I set out to do; this is my quest for evidence for the myth of postseason experience.

There are four buckets that I looked into to see if experience or age could impact postseason performance: experience for hitters, experience for pitchers, age for hitters, and age for pitchers. According to my numbers, all of which come from the incredible Lahman Database and which I sliced up using Python, 1,470 batters have played in the postseason since 2000, which is as far back as my data goes. That makes 42,369 plate appearances, or a pretty good sample. Of those PA’s, 13,997 occurred during a player’s first postseason. The other 28,372 took place during some later postseason. The total wOBA (an explanation of which is linked here) of the batters playing in their first postseason was .316. The total wOBA of players in later postseasons was .314. It appears solely based on these numbers that there are no advantages to having experience, and that batters are basically the same in their first and later postseasons. However, the average age (weighted by number of PA’s) of batters in their first postseason is 27.6 years old, while the average age of players in later postseasons is 31.4. Here is the basic wOBA aging curve, which I got from Fangraphs:

[Figure: wRC+ aging curve, from Fangraphs]

The graphic actually splits up the curve by time period in modern baseball history, which is helpful, but my main point is that a 31-year-old is expected to have a wRC+ about 10 points lower than a 28-year-old. That translates to roughly a 10% difference in wOBA, meaning if we adjust the 28-year-old to 31-year-old status, the first-postseason group’s .316 becomes a wOBA of around .284, significantly lower than the .314 of the older, more experienced group. So, that would suggest that experience actually plays a large role in postseason performance.
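The age adjustment above is simple enough to check by hand, but a short Python sketch makes the step explicit. The 10% penalty is the article’s own rough estimate from the aging curve, and the function name is hypothetical.

```python
# Illustrative arithmetic check; the 10% figure is the article's rough estimate.

FIRST_POSTSEASON_WOBA = 0.316   # batters in their first postseason (avg. age 27.6)
LATER_POSTSEASON_WOBA = 0.314   # batters in later postseasons (avg. age 31.4)
AGING_PENALTY = 0.10            # assumed fractional wOBA decline from age 28 to 31


def age_adjust(woba: float, penalty: float) -> float:
    """Scale a wOBA down by the expected fractional decline from aging."""
    return woba * (1 - penalty)


adjusted = age_adjust(FIRST_POSTSEASON_WOBA, AGING_PENALTY)
gap = LATER_POSTSEASON_WOBA - adjusted
print(f"age-adjusted first-postseason wOBA: {adjusted:.3f}")
print(f"gap to experienced batters: {gap:.3f}")
```

The size of the gap depends entirely on the assumed penalty, so a smaller aging effect would shrink the apparent benefit of experience accordingly.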

Let’s look at a progression now, rather than just a player’s first postseason vs. later postseason. For the sake of sample size, I grouped players by sets of two series of postseason experience, so there’s a group with 0-2 series, a group with 2-4 series, and groups all the way up to 9+ series. Here are all the wOBA numbers from each individual:

[Figure: wOBA of each player-series by postseason experience]

On the x-axis is years of experience, and on the y-axis is wOBA. Each dot is an individual in one series, and the red line is the average wOBA (weighted, of course, by number of PA’s). There are obviously a lot of outliers and a lot of small samples there, so let’s zoom in on the red line.
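For readers curious how a PA-weighted average like the red line might be computed, here is a minimal Python sketch. The rows are made-up numbers purely for illustration (they are not from the Lahman data), and the function name is hypothetical.

```python
# Illustrative sketch with made-up rows; not the article's data or original script.
from collections import defaultdict

# (experience bucket, plate appearances, wOBA) for hypothetical player-series
rows = [
    ("0-2", 20, 0.300),
    ("0-2", 5, 0.500),
    ("2-4", 10, 0.280),
    ("2-4", 30, 0.320),
]


def weighted_woba_by_bucket(rows):
    """Return {bucket: PA-weighted mean wOBA}."""
    num = defaultdict(float)   # sum of wOBA * PA per bucket
    den = defaultdict(int)     # sum of PA per bucket
    for bucket, pa, woba in rows:
        num[bucket] += woba * pa
        den[bucket] += pa
    return {bucket: num[bucket] / den[bucket] for bucket in num}


print(weighted_woba_by_bucket(rows))
```

Weighting by plate appearances keeps a tiny sample, like the 5-PA line above, from pulling the bucket average as hard as it would in a simple mean.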

[Figure: average wOBA by postseason experience]

Once again, it appears there is a strong possibility that experience helps players in the postseason. There are two potential competing forces that make up this almost parabolic progression. The aging curve (which we know is a factor) pushes a player’s wOBA down, while experience (which we don’t know is a factor – that’s what’s being tested) may push a player’s wOBA up. Based on the graph, it looks like the power of the aging curve starts out more influential than experience, but after a player’s 6th postseason series or so, they reach a threshold where that pattern reverses. Suddenly, it appears that experience matters so much that it reverses aging in the postseason, which is actually supposed to accelerate as one gets older. Don’t underestimate the gravity of that conclusion; according to it, the power of postseason experience can, at some point, reverse natural aging in the batter’s box. So, from the data we have collected on hitters, it seems that experience may actually be a notable factor in postseason performance.

Now let’s look at pitchers. My data holds 10,276.1 total innings from 652 pitchers. A whole 8,160 of those occurred in that pitcher’s first postseason, while the other 2,116.1 innings took place in later postseasons. The FIP of pitchers in their first postseason is 3.91. That number goes down in later postseasons, to 3.82. Age, obviously, went in the other direction, up from 28.7 to 33.8. Here’s the pitcher aging curve. Focus on FIP, as that’s the stat that I will use.

[Figure: pitcher aging curves for starters]

FIP increases by about 10% between those two ages, adjusting the first-postseason 3.91 all the way up to 4.30, which is not even close to the 3.82 of pitchers in later postseasons. This analysis, like the one with hitters, suggests that pitchers do get better with postseason experience. We can look at it progressively, too. I didn’t make the graph with all the individual dots this time, as it didn’t really show us anything last time.

[Figure: average FIP by postseason experience]

This looks similar to the graph for batters, but the threshold at which the benefits of experience overtake the drawbacks of old age seems to be even earlier, around the pitcher’s 4th series. From this data, it seems like both hitters and pitchers are positively affected by postseason experience.

So, based on all that, it appears there is strong evidence that experience is a factor in the postseason. Now let’s look at age. Keeping in mind the regular season aging curves that have already been presented (treat those like a control group), this is how hitters and pitchers progressed as they aged in the postseason.

[Figures: postseason aging curves for hitters and for pitchers]

Despite the large samples, the postseason aging curves for batters and hurlers alike appear to be almost random, jolting up and down at unpredictable times. They don’t go directly down like they’re supposed to, but at the same time, they don’t go up, nor do they start going down and then go up. Every researcher hates to say it, but these charts are inconclusive. There’s no way of saying that age does or does not affect postseason play based on the data provided.

Overall, my attempt to defend youth is probably not accurate, as the data does suggest that players improve in the postseason with experience. It is possible that other factors contributed to those results. For example, strong young players can be brought up by any team, but good, older free agents are usually (see: Hosmer, Eric) only signed by teams that are already in postseason contention. However, if that were the sole factor in this correlation, there would be a pattern in the age chart too, which there is not. So, in conclusion, while it does not appear there is any correlation between older age and better postseason performance, I will no longer be calling foul when experience is cited as a factor in evaluating teams for a World Series run. The evidence is here.

Be sure to follow us on Twitter and be the first to know when we post new research!

Works Cited:
The Lahman Database
Fangraphs
Baseball-Reference

Images Attributed to:
Fangraphs
USA Today
Hub Pages

The NL Cy Young Race: Contenders, Chances, and Predictions

The K Zone

by Maddie Marriott

August 27th, 2018

As the 2018 Major League Baseball season begins to wind down, the race for the coveted Cy Young Award is just heating up. Named after Cy Young, the winningest pitcher of all time, the award is meant to honor the best pitcher in each of the American and National Leagues at the conclusion of each season. This year’s National League race is as competitive as ever, with four standout candidates based on several statistical categories: Jacob deGrom of the New York Mets, Max Scherzer of the Washington Nationals, Aaron Nola of the Philadelphia Phillies, and Patrick Corbin of the Arizona Diamondbacks.

What makes this award so fascinating and hotly debated is the subjectivity of the term “best pitcher.” There are numerous factors that determine if a pitcher should be considered for this award, and what one voter thinks is important may be different from what the next does. The winner of the award is voted on by the Baseball Writers’ Association of America, and has been since the award’s introduction in 1956 by then commissioner Ford Frick. For the first eleven seasons, the award was given to only one pitcher across both leagues, but starting in 1967, after Commissioner Frick’s retirement, it was changed to honor a pitcher from each league.


Don Newcombe was the first winner of the Cy Young Award with the Brooklyn Dodgers, the year before the franchise moved to Los Angeles. Unlike some awards, it is not uncommon for players to be honored with the Cy Young Award multiple times in their career. Sandy Koufax was the first to win it more than once, in 1963, 1965, and 1966, and it has been done sixteen times since then. Roger Clemens has the most of these awards with an impressive seven, his first in 1986 and his last in 2004. Max Scherzer is the only one of this year’s top choices to have a previous win, as he earned the honor in 2013, 2016, and 2017. I mention this trend only to clarify that previous wins do not exclude Max Scherzer from consideration this year.


In order to determine which of these candidates should take home the trophy, I’ll use a point system for each factor.  The top pitcher in each category will earn four points, the second best will earn three, the third will earn two, and the last will earn one.  In the event of a tie, each pitcher will get the higher number of points.  I’ll keep track along the way and use these numbers to determine who should win the award, although no promises it will match my prediction.  The pitcher with the most points after taking these categories into consideration is who should win based on these stats.  My prediction for the winner will include a few other factors. 

In no particular order, the first consideration for each of the pitchers is ERA, or earned run average. LaMarr Hoyt had the highest ERA of any Cy Young winner with 3.66 in 1983. The average ERA in the American League that year was 4.06. The average ERA of previous Cy Young winners is 2.51. Leading the pack of this year’s top options is deGrom with 1.71. Scherzer and Nola are tied for second with 2.13, and Corbin follows at 3.17, still well within the range of past winners.

deGrom: 4 / Scherzer: 3 / Nola: 3 / Corbin: 1


The next factor to consider is how many walks the pitcher gives up. Walks are a way to assess a pitcher’s accuracy on the mound. Jacob deGrom is best in this category as well, issuing 2.1 walks per nine innings. Corbin and Scherzer tie for second with 2.2, and Nola comes in last with 2.4.

deGrom: 8 / Scherzer: 6 / Nola: 4 / Corbin: 4


Next: strikeouts.  While strikeouts aren’t the only effective way to make outs, they are an important tool to show the value of a pitcher in making his own outs without the aid of his defense.  Scherzer leads this category with 12.1 strikeouts per nine innings. deGrom comes in second with 11.1 strikeouts per nine innings, followed by Corbin with 11.  Nola brings up the rear with 9. 

 

 

deGrom: 11 / Scherzer: 10 / Nola: 5 / Corbin: 6


More specifically, swing and miss percentage indicates how well pitchers can hit tough spots and fool batters.  Scherzer leads this category with 16.1%, meaning batters swung and missed at 16.1% of his pitches.  Corbin and deGrom are tied for second at 15%, and Nola comes in fourth at 12.1%.

deGrom: 14 / Scherzer: 14 / Nola: 6 / Corbin: 9


Ground balls are another effective way to get outs. Ground ball percentage is an important statistic for a pitcher because grounders do not have the potential to leave the park and do not result as often in extra base hits. This excerpt from fangraphs.com shows the importance of the statistic: “In general, ground balls go for hits more often than fly balls (although they don’t result in extra base hits as often). But the higher a pitcher’s ground ball rate, the easier it is for their defense to turn those ground balls into outs. In other words, a pitcher with a 55% ground ball rate will have a lower BABIP on grounders than a pitcher with a 45% ground ball rate.” Aaron Nola leads this category with a 50% ground ball rate, followed by Corbin with 48.4%, deGrom with 44.8%, and Scherzer with 36%. Something interesting to note is that with these four candidates, ground ball percentage and strikeouts per nine innings have an inverse relationship: the order of the four pitchers by ground ball rate is exactly the reverse of their order by strikeout rate. This means that, for example, although Aaron Nola doesn’t strike out as many batters as Max Scherzer, he converts many more ground ball outs.

deGrom: 16 / Scherzer: 15 / Nola: 10 / Corbin: 12


Next we’ll move onto home runs.  MLB players are on pace to hit almost 5,700 home runs this year, a benchmark only passed once in history in the 2017 season when players combined for a whopping 6,105 home runs.  As teams come to rely on the home run more and more, pitchers’ ability to keep balls out of the stands has become more important.  Once again, deGrom comes in first place in preventing the home run, giving up 0.41 home runs every nine innings.  Nola follows closely behind with 0.43.  Then comes Corbin with 0.61, and finally Scherzer with 0.89. 

deGrom: 20 / Scherzer: 16 / Nola: 13 / Corbin: 14


In order to measure command and control on the mound, we’ll take a look at wild pitches.  deGrom has impressively not thrown a single wild pitch in the 2018 season.  Nola comes in second with 3, Scherzer follows with 4, and Corbin brings up the rear with 6. Catching also contributes to wild pitches, but each team’s wild pitches do not correlate with the above order enough to be a compelling factor in the pitchers’ performance.

deGrom: 24 / Scherzer: 18 / Nola: 16 / Corbin: 15


FIP, or fielding independent pitching, measures a pitcher’s effectiveness at preventing home runs, walks, and HBP’s, and causing strikeouts. These stats are important to measure because they are an indication of how a pitcher works without the involvement of the defense. FIP is scaled to match ERA, meaning theoretically a 2.00 FIP would indicate the same amount of talent as a 2.00 ERA. deGrom is on top again with 2.07, followed by Corbin with 2.37, Scherzer with 2.63, and Nola with 2.66.

deGrom: 28 / Scherzer: 20 / Nola: 17 / Corbin: 18


The final category to look at for these pitchers is WAR, or wins above replacement. Essentially, WAR sums up a player’s total contribution to his team.  To read a more in-depth explanation of WAR, check out Ian Joffe’s article by clicking here. Scherzer is at the top of the list with 8.8, followed by Nola with 8.6, deGrom with 7.7, and Corbin with 3.8. 

deGrom: 30 / Scherzer: 24 / Nola: 20 / Corbin: 19
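The running totals above can be reproduced with a short Python sketch of the point system. This is only an illustration (the variable and function names are made up), using the figures quoted in each category; deGrom’s FIP is entered as 2.07, the value consistent with his first-place ranking in that category. Ties are handled the way the article describes, with each tied pitcher receiving the higher score.

```python
# Illustrative tally of the article's point system; names here are hypothetical.

PITCHERS = ["deGrom", "Scherzer", "Nola", "Corbin"]

# (category, higher_is_better, values in PITCHERS order), as quoted in the article
CATEGORIES = [
    ("ERA",          False, [1.71, 2.13, 2.13, 3.17]),
    ("BB/9",         False, [2.1, 2.2, 2.4, 2.2]),
    ("K/9",          True,  [11.1, 12.1, 9.0, 11.0]),
    ("SwStr%",       True,  [15.0, 16.1, 12.1, 15.0]),
    ("GB%",          True,  [44.8, 36.0, 50.0, 48.4]),
    ("HR/9",         False, [0.41, 0.89, 0.43, 0.61]),
    ("Wild pitches", False, [0, 4, 3, 6]),
    ("FIP",          False, [2.07, 2.63, 2.66, 2.37]),  # deGrom entered as 2.07
    ("WAR",          True,  [7.7, 8.8, 8.6, 3.8]),
]


def category_points(values, higher_is_better):
    """Score each pitcher as 4 minus the number of pitchers who are strictly
    better, so tied pitchers share the higher score."""
    def better(a, b):
        return a > b if higher_is_better else a < b
    return [len(values) - sum(better(other, v) for other in values) for v in values]


totals = dict.fromkeys(PITCHERS, 0)
for name, higher_is_better, values in CATEGORIES:
    for pitcher, pts in zip(PITCHERS, category_points(values, higher_is_better)):
        totals[pitcher] += pts

print(totals)  # {'deGrom': 30, 'Scherzer': 24, 'Nola': 20, 'Corbin': 19}
```

Because every category carries the same weight, a narrow lead in walks counts as much as a wide lead in ERA; a voter who weights categories differently could reach a different order.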


Based on these categories, there seems to be a clear winner in Jacob deGrom. He is the current leader in five out of the nine measured categories and does not fall last in any of the other four. However, he is not my pick for the award. This is because of his 8-8 record on the season. While I am very aware that there are many factors outside of deGrom’s control when it comes to his record, this is still a poor reflection on him and will probably turn away voters that place importance in the win-loss category. Only two players have ever won a Cy Young with a winning percentage of 50% or less, Bruce Sutter in 1979 and Eric Gagne in 2003, but they were both closers. To clarify, I would vote for deGrom because of his obvious advantage in the above categories. However, I do not believe the voters will do the same because they will take record into account more than I would.

Despite the statistical support in favor of deGrom, I believe the 2018 NL Cy Young Award will go to Max Scherzer for a few reasons. He leads the league in WAR among pitchers and is tied for the most wins with 16. He has also pitched the most innings out of any pitcher, meaning he goes deep into games, a skill that is becoming more rare. Furthermore, he has played for the most seasons by far out of any of the top candidates and, as mentioned above, is already a three-time winner of the Cy Young Award. This means that he has the respect of the voters as a veteran who has proven his skill time and time again.

Check out my articles about The Phillies’ Odúbel Herrera and top prospect Sixto Sánchez, or take your pick of tons of interviews and articles here.  Follow The K Zone Blog on Twitter to get updates on baseball and notifications about new articles.

All credit for images goes to original owners.

 

The Sport of Revenge

– The K Zone –


The Sport of Revenge, by Ian Joffe

August 23rd, 2018

We all know the story: a player gets traded or DFA’ed by their team, and they can’t stand it. They join another team and, in their frustration with their old team or themselves or just the world in general, they start crushing the ball. Suddenly, the old unrosterable player is the player of the week and the month, until they eventually cool down. After that they may return to who they used to be, or be slightly better, until they come back to face their former team. As a Red Sox fan, I always remember David Ortiz punishing the Twins, especially in Minnesota. It seems like every team has players like that, but just because a few players crush their old teams doesn’t mean everyone does. To find out whether the effect holds in general, we need to look at a lot more players.

There are multiple reasons why a player would do better against their old team. The simplest reason is the psychological effect. I think that traditional sabermetrics (which would argue players do not do better against their former teams) often overlooks the science of psychology in places where data shows there could be a trend (for example, closers pitching poorly before the ninth). Players may feel angry or that they have something to prove to the club that abandoned them, and that could somehow affect their performance on the field. Second, they may have extra non-public knowledge about the pitchers on the team that they’re now facing. The same would apply in the inverse with the pitchers knowing more about the hitter in question, but because hitters tend to improve during in-game at bats against a pitcher as the innings pass by, it’s possible that hitters will have the advantage in this team-switch case as well. Third, there could be a park factor, where players are more comfortable in a ballpark that they used to play in, or fans motivate them by cheering or booing. That third reason is probably the least likely, as it only applies in the away ballpark and there’s little proof that fan interaction has a psychological effect, but it’s worth noting.

Since 2000, batters in MLB have had 107,790 plate appearances against teams they used to be on (I’m not sure, but I think that should be a large enough sample). Using a Python program to compile Fangraphs data, I compared how players did in those “Revenge PA’s” with how they did in normal plate appearances. Here are the results:

Stat     Average Player     Players vs. Former Teams
AVG      .260               .258
HR%      2.7%               2.8%
K%       18.5%              18.4%
BB%      8.4%               8.7%
wOBA     .323               .324

 

In part due to the large sample, there is almost zero difference between the everyday player and players against their former teams. From this study, the simple, albeit disappointing conclusion is that the average player is no better against a club that they used to play on.

An interesting side note here, apart from the study itself, is the question of why I (and I assume many readers) had the misconception that many or most players will do better against their former teams. When players do well against their old teams, it creates drama which compels media to cover it, and the media covers it because they think people will be attracted to the apparent drama. And, they’re right. I wrote a whole article about this because it interested me. My point is that we have this misconception because the media over-covers stories that seem dramatic. They would be less likely to cover, say, Mookie Betts hitting .350 against the Giants because he’s never played for San Francisco, so there’s no drama there. We need to be aware of biases like this one in what the media covers, and realize that stories about a few players that get covered for dramatic reasons do not necessarily reflect on everyone in the average population pool, whether that be baseball players or people in general. But I digress…

Having concluded that the average, aggregate player may not experience a change, I wanted to look into individual players. After all, everyone is affected differently by psychological factors and there may still be an abnormally high number of players who perform well against their former teams. So, filtering out players with fewer than 20 plate appearances, I checked how many players hit .400 against their old teams – there were only 11. Then I looked at how many hit .350 – only 30. That would be a lot of players to reach those marks over a full season, but in the very small samples that I’m looking at, those numbers are very normal. They could be explained purely through luck in statistics, without psychology. Still, I wanted to see if there was any way I could show a real psychological difference for those few players.
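To get a feel for how easily luck alone produces a high average in a small sample, here is a rough Python sketch using a binomial model. It assumes a league-average .260 hitter and treats every plate appearance as an independent at bat, which is a simplification (walks and the like are ignored); the function name is made up for illustration.

```python
# Illustrative sketch; treats every plate appearance as an independent at bat,
# which is a simplifying assumption (walks, etc. are ignored).
from math import ceil, comb


def prob_avg_at_least(target_avg: float, at_bats: int, true_avg: float) -> float:
    """Binomial probability that a hitter with the given true average
    hits at least target_avg over the given number of at bats."""
    min_hits = ceil(target_avg * at_bats - 1e-9)  # guard against float round-up
    return sum(
        comb(at_bats, k) * true_avg**k * (1 - true_avg) ** (at_bats - k)
        for k in range(min_hits, at_bats + 1)
    )


for target in (0.350, 0.400):
    chance = prob_avg_at_least(target, at_bats=20, true_avg=0.260)
    print(f"P(AVG >= {target:.3f} in 20 AB) = {chance:.3f}")
```

Multiplying these probabilities by the number of players who cleared the 20-PA filter (a figure not given in the article) would give the count expected from chance alone, to set against the 30 and 11 observed.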

Theoretically, if random streaks explain why these few players improved against their former teams, the increase in batting average would be independent of other statistical changes. A change in batting average would lead to a small change in other stats directly (for example, if BA goes up by .010, out rate will go down .010), but if there consistently are other effects than that, it can be shown that there is something unusual going on. In other words, if more than one statistic is drastically changing, this may not just be randomness and luck playing out. In order to look at a few different samples, I made three groups:

“The 30” are the 30 players who hit .350 in at least 20 PA’s

“The 11” are the 11 players who hit .400 in at least 20 PA’s

“The 5” are the 5 players who hit .350 in at least 60 PA’s

Here’s the data on a few stats from each group:

Stat     The 30     The 11     The 5
HR%      4.3%       4.6%       6.7%
K%       14.4%      14.2%      11.9%
BB%      8.6%       7.1%       10.1%

 

Comparing these numbers to the league averages you can see in the earlier table, there are clear changes. Even after factoring in the direct changes (the pure change in batting average for The 30 and The 11 should put K Rate around 16%, for example), players improved significantly in power and in avoiding strikeouts. The only place that we don’t see a consistent story is in walks, which is weird but can probably be chalked up to the small samples that I’m dealing with. Looking just at strikeouts and home runs though, we can see clear improvements. That tells me that there could be more than randomness going on in the BA improvements for these players. It looks like they actually are motivated to be better against their former teams. Furthermore, the decrease in strikeouts tells us that there is no conscious change in approach. None of the groups are more aggressive at the plate, which would lead to an increase in strikeouts and consistent decrease in walks. This leads me to believe that the change is less conscious and harder to explain in baseball language (such as “more aggressive” or “faster swings”) – it’s more purely psychological.

In the end, the results of this study are that revenge psychology factors little into the performance of major leaguers. Very few players are better against their former team than against any other team. There is, however, a very small, select group of players who seem to be legitimately better against their former teams, but it’s hard to explain exactly what they do differently. When asked about his performance against the Mets, Daniel Murphy, one of The 5 and the subject of one of the first K Zone articles, said, “I don’t know. They’re a division rival.” There’s little information from other players. While some may not want to openly admit their feelings on the subject of their old team, for the most part it seems that these players are not particularly conscious of their performance at all, at least while it’s happening. This is more the work of deeper psychological effects in a small portion of the population. Perhaps one day we will have neurological information on draftees or free agents so that teams can predict whether they will be unconscious vengeance seekers. For now, though, it’s safe to assume that a random player will not come back to bite.

Oh, and while I was writing this article this happened. We’ll see if Murphy goes on to terrorize the Nationals as a Cubbie.

Sources:
Fangraphs
The New York Post

Images Attributed to:
USA Today

If you liked this article, you may also be interested in this short piece about whether or not players hit any better on their birthdays. Or, you can follow The K Zone on Twitter and be the first to know when we publish research, interviews, or other great new content.

Birthday Bashing

– The K Zone –


Birthday Bashing, by Ian Joffe

July 4th, 2018

In honor of America’s 242nd birthday, we’re going to take a look at how players hit on their own birthdays. A strict “clutch hitting does not exist” model would argue that there should be no difference in batting outcomes, but a more psychologically based view could see significant changes in hitters’ approaches and results. It certainly seems as if some players consistently hit well on their birthday, but that could be by random chance, or only apply to some. The question is whether a larger trend exists.

To test the two theories against each other, I compiled data from The Lahman Database and Retrosheet (two excellent sites and sources, by the way) using Python. In total, there were 478 times in 2017 in which a player made a plate appearance on his birthday, which is not a huge sample size, but it should be enough for most stats. We will turn those 478 PA’s into a conglomerate player, whom we’ll call “The Birthday Boy.” Here’s a comparison between The Birthday Boy’s numbers and the league averages from the 2017 season:

       League Average   Birthday Boy
AVG    .255             .263
HR%    3.3%             3.6%
K%     21.6%            22.2%
BB%    8.5%             9.4%

At least last season, there was no indication that players did better on their birthdays than any other day. The small differences in, say, BB% do not come close to holding significance when a t-test is applied. Not only do players not improve in power on contact by a statistically significant amount, but it seems their approach does not change either. Players are not more anxious and aggressive, nor are they more nervous and passive, according to strikeout and walk rates.
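For readers who want to check the significance claim, here is a minimal Python sketch. It is not the t-test from the original analysis: it swaps in a simpler one-sample proportion z-test (normal approximation), and it assumes that the league rate can be treated as a fixed baseline and that all 478 birthday PA's are the denominator for each rate. The function name is illustrative, and AVG is left out because its denominator is at bats, which the article does not give.

```python
import math

def one_sample_proportion_z(observed_rate, baseline_rate, n):
    """z statistic and two-sided p-value for an observed rate vs. a fixed baseline."""
    # Standard error of a proportion under the baseline (null) rate.
    se = math.sqrt(baseline_rate * (1 - baseline_rate) / n)
    z = (observed_rate - baseline_rate) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

N_PA = 478  # birthday plate appearances in 2017, from the article

# (Birthday Boy rate, league rate), taken from the table above.
rates = {"HR%": (0.036, 0.033), "K%": (0.222, 0.216), "BB%": (0.094, 0.085)}

for stat, (birthday, league) in rates.items():
    z, p = one_sample_proportion_z(birthday, league, N_PA)
    print(f"{stat}: z = {z:.2f}, p = {p:.2f}")
```

Under these assumptions, even the largest gap (BB%) gives a z statistic well under 1, nowhere near the usual 1.96 cutoff, which agrees with the conclusion in the text.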

All in all, it looks like the robot within us has this one locked up. Don’t go picking up players in DFS just because it’s their birthday, and don’t count on your 25th man to come up clutch just because it’s his special day. There will be plenty of opportunities for clutch hitting down the stretch, when the team needs it most.

Who Got Figured Out?

– The K Zone –

June 16, 2018

Who Got Figured Out, by Ian Joffe

The hot start…and the extended slowdown. The rookie sensation…and the “sophomore slump.” Baseball is made of up and down streaks, and figuring out what causes them can be a secret to understanding the game. Sometimes they come from simple luck of defensive positioning and batted ball location. Usually those kinds of changes in offense will be associated with changes in BABIP. Other times, hitters can make legitimate changes to their swing or approach, although it’s smart to be suspicious of any supposed changes until there’s a large enough sample to confirm that the change made a real difference. In other cases, pitchers start to approach batters in different ways, and the batters become, at least temporarily, befuddled. A pitcher has two weapons at his disposal: Where to pitch the ball and how to pitch the ball. I’m going to focus mostly on the latter, in terms of pitch types, but I’ll also look at simple zone percent to see if a batter has stopped getting balls or strikes.

Often, pitchers will change their approach in response to a hitter becoming surprisingly hot, especially when there’s no clear indication of luck. To see which hitters received major adjustments from pitchers this season, one can look at the difference between the share of pitches of a certain type they received in April vs. in May of this year. Here are all the qualified hitters whose opposing arsenal changed by at least 7% in at least one pitch, by PITCHf/x data (note that these are changes in percentage points out of the total arsenal, not relative changes in the individual pitch; for example, +10% curveballs means the batter used to receive 15% curves and now gets 25%):

Mike Trout: -7.6% Sinkers
Rhys Hoskins: +9.3% Changeups (who Mike interviewed just about a year ago)
Jed Lowrie: -8.8% Sinkers
Freddie Freeman: +10.3% Sliders
Kris Bryant: +8.2% Sliders
J.D. Martinez: +7.8% Changeups, -10.3% Curveballs
Javier Baez: +12.3% Sliders
Odubel Herrera: +8.8% Sinkers
Lorenzo Cain: -10.1% Sinkers
Mike Moustakas: -7.8% Curveballs
Kevin Pillar: +8.7% Sinkers (who we also happened to have interviewed)
Jose Martinez: -7.1% Sinkers
Jose Ramirez: +7.8% Curveballs
Nick Ahmed: +7.6% Sliders
Mallex Smith: +12.5% Four-Seamers
C.J. Cron: +8.1% Sinkers
Nicholas Castellanos: -7.4% Sliders
Evan Longoria: +14.1% Four-Seamers, -10.7% Curveballs
Trea Turner: +7.3% Four-Seamers
Yonder Alonso: -8.9% Sliders, +7.3% Curveballs
Alex Bregman: +7.7% Cutters
Buster Posey: +8.0% Four-Seamers
Scooter Gennett: +8.6% Four-Seamers
Matt Joyce: +10.3% Changeups
Gary Sanchez: +11.4% Curveballs, -8.2% Four-Seamers
Giancarlo Stanton: +9.8% Four-Seamers
Tucker Barnhart: -8.3% Sinkers
Jose Peraza: +10.6% Four-Seamers
Kyle Seager: +7.1% Sliders
Marwin Gonzalez: -8.9% Four-Seamers
Michael Taylor: -8.2% Four-Seamers
Victor Martinez: -8.2% Four-Seamers
Justin Upton: -7.4% Sinkers
Miguel Rojas: +7.1% Changeups
Brett Gardner: +7.5% Curveballs
Adam Jones: +10.4% Changeups
Adam Duvall: -11.9% Sinkers, +7.7% Four-Seamers
Edwin Encarnacion: -7.3% Four-Seamers
Billy Hamilton: +10.3% Four-Seamers, -9.6% Sinkers
Ian Desmond: -7.0% Four-Seamers, +8.4% Sliders
Brandon Crawford: -8.2% Sliders, +8.2% Strikes
Lewis Brinson: +9.4% Four-Seamers
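
The filter behind the list above can be sketched in a few lines of Python. This is only an illustration of the idea, not the original program: the function name and the April/May arsenal shares below are made up, and the 7-point threshold is the one stated in the text.

```python
def pitch_mix_shifts(april_pct, may_pct, threshold=7.0):
    """Return {pitch: change} for month-to-month shifts of at least `threshold`.

    Inputs map pitch type -> share of all pitches seen (in percent), so the
    change is in percentage points of the total arsenal.
    """
    shifts = {}
    for pitch in set(april_pct) | set(may_pct):
        change = may_pct.get(pitch, 0.0) - april_pct.get(pitch, 0.0)
        if abs(change) >= threshold:
            shifts[pitch] = round(change, 1)
    return shifts

# Hypothetical arsenal shares for one hitter (not real data).
april = {"Four-Seamer": 35.0, "Sinker": 20.0, "Slider": 15.0,
         "Curveball": 15.0, "Changeup": 15.0}
may = {"Four-Seamer": 33.0, "Sinker": 18.0, "Slider": 25.0,
       "Curveball": 10.0, "Changeup": 14.0}

print(pitch_mix_shifts(april, may))
```

For this made-up hitter, only the slider (15% to 25%) clears the threshold; the 5-point drop in curveballs does not.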

Pitchers, catchers, and pitching coaches may choose to alter their selection against a hitter because his data shows a past weakness against that specific pitch, or because they notice a potential hole in sequencing that can be used to take advantage of him. Surprisingly, however, very few of these changes were shown to have effects. Filtering out the batters whose offensive changes were based on BABIP luck, there were only a few hitters who lost at least 20 wRC+ (the best tell-all offensive metric that we have) between April and May:

Javier Baez was unable to keep up in sequences with sliders, losing 46 wRC+ but only 0.016 of BABIP when his rate of sliders increased 12%. Don’t look for too much from the potentially promising young Cub.

Lorenzo Cain’s changes draw concern based on his recent league change. Pitchers in the NL took a little time to adjust, as Cain had a strong showing in April, but when they realized his weakness against the sinker, they capitalized, and his wRC+ dropped by 33 while his K-Rate rose by 4.3%. His specific case related to the league change is certainly worrying.

Mike Moustakas, who returned to the Royals on a one-year deal after a rough free agency, was quickly put in his place by opponents’ curves, whose increase cost him 37 wRC+ in May. If there’s one player here I’m less worried about, though, it’s Moustakas; I think we know, and pitchers already knew, who he is.

Jose Martinez was a surprise story in April, but a barrage of sliders has slowly started to chip away at his stats, and he, like Moose, lost 37 wRC+ but also experienced a 5.8% rise in strikeouts. Look for more regression as pitchers continue to figure out the young Cardinal.

Matt Joyce fell victim to the changeup this May, which isn’t surprising given his history and swing. Don’t hold your breath for Matt’s career year. In addition to the drop in wRC+, he struck out 10% more often in May than in April.

Marwin Gonzalez showed plenty of signs of potential regression last year, especially on Statcast metrics, and as soon as pitchers stopped throwing him fastballs this year, his season collapsed in the form of 32 wRC+ and poor batted ball metrics.

Michael A. Taylor, the Nationals speedster who has stepped in and stepped up in place of several injured Nats, looked strong at first, but couldn’t keep up the pace in response to an increase in fastballs and a decrease in offspeed stuff. If he wants to stay in the starting lineup, he’ll have to learn to keep up with velocity.
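
As a rough sketch of how a shortlist like the one above could be produced, the Python below flags hitters whose wRC+ fell by at least 20 without a correspondingly large BABIP drop. The 20-point wRC+ cutoff comes from the text; the 0.030 BABIP cutoff, the function name, and the second hitter are assumptions made for illustration, since the article does not say how "BABIP luck" was filtered out.

```python
def figured_out(wrc_change, babip_change, min_wrc_drop=20, max_babip_drop=0.030):
    """True if wRC+ fell by at least `min_wrc_drop` while BABIP fell by less
    than `max_babip_drop` (assumed cutoff), i.e. the slump is not mostly luck."""
    return wrc_change <= -min_wrc_drop and babip_change > -max_babip_drop

# (change in wRC+, change in BABIP) from April to May.
hitters = {
    "Javier Baez": (-46, -0.016),           # changes quoted in the article
    "Hypothetical Hitter": (-30, -0.080),   # made up: big drop, but BABIP-driven
}

shortlist = [name for name, (wrc, babip) in hitters.items()
             if figured_out(wrc, babip)]
print(shortlist)
```

With these inputs only Baez survives the filter, because the hypothetical hitter's slump comes with an 80-point BABIP drop.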

While there is some worry to be cast on the players above for the rest of the season, it’s even more surprising how resilient the vast majority of hitters were. Most seemed unfazed by the new way they were being treated after early success, and any changes were much more often associated with luck than real factors, like being pitched to differently. And, even for the seven who made the shortlist of concern, there is hope. Repeating this exercise on the 2017 and 2016 seasons showed that there is little reason to worry even for them.

Last season, in 2017, the shortlist consisted of Michael Brantley, Mike Moustakas (again), Salvador Perez, Tim Beckham, Yasiel Puig, and Randal Grichuk. Inconveniently, only three of those six qualified in the second half (Moustakas, Beckham, and Puig), and none of them had problems. Puig had the second best half of his career, posting a 136 wRC+, and Moose put up a fairly solid 106 mark. Both maintained down-to-earth strikeout rates too. Beckham also had a great half, with a 113 wRC+, although a higher K-rate and a .353 BABIP suggest that it may not be entirely natural. Either way, if the 2017 crop is any evidence, some guys may just take a while to get used to changes in how they’re pitched to, but will recover eventually.

When I ran the same program for 2016, it churned out eight names. That list included Anthony Rizzo, Neil Walker, Starling Marte, Chris Davis, Curtis Granderson, Corey Dickerson, and Billy Burns. Four of the eight (Rizzo, Davis, Granderson, and Dickerson) qualified in the second half, and once again, they had a strong showing. Other than Davis, who, to be blunt, isn’t that good anyway, the players each had a wRC+ over 108, topped off, of course, by Rizzo’s 121. As in 2017, the 2016 slow-adjusters figured it out eventually. Based on what’s happened in the prior two years, I see little reason to worry about this season’s seven. They each may have individual concerns, but the simple fact that they are taking a while to learn how to hit after pitchers realized they were decent has not, historically, been telling of anything. They may be figured out for now, but just wait a month and see.

The Secret of Ray Searage

– The K Zone –

January 25th, 2017

The Secret of Ray Searage, by Ian Joffe


Some call him the pitch whisperer. Others say he’s just fun to hang out with. Either way, Pirates pitching coach Ray Searage has turned around countless pitching careers, and I set out to figure out why. While Searage is not the biggest fan of the term, many call his various turnarounds “reclamation projects,” and while they don’t always work out (just ask Ryan Vogelsong and his 5.00 FIP), they tend to be a relatively safe bet. Pitchers who get traded to Pittsburgh warrant at least a watch list add in fantasy, or a weekly stat check on Baseball Reference. In the past six seasons, some of the biggest reclamation projects have been Francisco Liriano (2013), reliever Mark Melancon (2013), Edinson Volquez (2014), J.A. Happ (traded at the 2015 deadline), Ivan Nova (from the 2016 deadline) and A.J. Burnett (twice, in 2012 and 2015). Counting Burnett’s two stints separately, those are the seven cases which I will use in my study.

My first attempt at cracking the Searage code was to look through interviews. I had no luck – turns out that Ray guards his secrets pretty closely. Jared Hughes even joked about his organization’s secrecy with Sports Illustrated when they tried to talk to him, from “Uncle Ray” to his high-beets diet. In a 2013 interview with MLB.com, Liriano was asked why he was having so much success in his new ballpark. He starts, “I don’t know, I don’t know what to say.” He continues to say that he feels under control and needs to not make mistakes. I never would have guessed. On the topic of Searage, Nova vaguely states to nj.com that “he’s a good pitching coach. He’s been a good pitching coach forever. He has a lot of guys who have pitched great games for him.” Nova goes on to suggest that perhaps people are just very comfortable with him, leading them to pitch better. J.A. Happ and Jason Grilli add to this, telling USA Today that it’s all about trust, and that people trust Searage because of his enthusiasm and the level of research he puts into every game. Nova also tries to take a little credit away from Searage. “It’s not all the time about the pitching coach.” Melancon also claims that catcher Russell Martin had the biggest influence on him. After all, both Martin and Melancon were once on a team with Mariano Rivera, to whom Melancon credits his cutter. The players are not lying; there is more that goes into a pitcher’s game than just advice from the pitching coach (from beets to former assistant pitching coach Jim Benedict’s advice), but the wide array of reporting on Searage and his high count of success stories, no matter who else is on the team to co-inform at the moment, don’t lie either. There is something going on in Ray’s mind and the Pirates organization, so as I so often do, I turned to Brooks Baseball to find out.

For those who are unfamiliar with Brooks, brooksbaseball.net has the most comprehensive pitch tracking data on the internet. It was difficult to find any pattern among my pitcher sample, but eventually I reached something. In 6 out of the 7 cases, after being acquired by Pittsburgh, the pitchers threw far fewer classic four-seam fastballs than before, replacing them with moving fastballs such as sinkers and cutters. In some cases, like Liriano’s, the pitcher eliminated the four-seamer from his arsenal altogether. The one exception to this rule is J.A. Happ, who actually threw more fastballs than ever after joining the Pirates. However, according to Fangraphs, Happ’s regular fastball actually moves a lot, the 11th most in baseball and just as much as any of Happ’s other pitches. So, one could say that Happ’s straight fastball is more like another pitcher’s moving fastball.

For as long as baseball has been played, the four-seam fastball has been the most important pitch. It’s the first pitch kids are taught (which, I understand, is for health reasons, not necessarily for skill-building reasons) and the first pitch scouts will look for. A pitcher who throws more of another pitch than a fastball is a rare occurrence, especially among starters. But baseball is in an era of change. Perhaps Ray Searage is trying to pioneer a change of his own, which, when you really think about it, makes perfect sense. According to The Hardball Times, the straight fastball allows for the most flyballs (32%), the most line drives (19%), and the fewest grounders (39%) of any pitch type. Moving fastballs, on the other hand, lead to the fewest flyballs (19%), the fewest line drives (17%), and the most grounders (59%). An emphasis on keeping the ball on the ground is especially important, and its merits are more proven than ever, in this Statcast era, where hitters are focusing on launch angle and fly balls (or, as Daniel Murphy likes to put it, “high line drives”) more than ever.

Four-seam fastballs are important for command and speed differentiation. But, moving fastballs are only a few miles per hour slower, and exhibit much better results. Perhaps it’s time to take advice from Searage and start to question the value of the fastball as the primary MLB pitch. It could be far more advantageous as a secondary option, to blow a hitter away once they’re used to timing breaking pitches, or waiting for the cutter, sinker, or even splitter to move last-second. I’m not the very first to suggest this; Sports Illustrated first pointed out that the Yankees took a similar approach to a decrease in fastball use this year (although hopefully I’m near the front of the S-curve). They quote Yankees pitching coach Larry Rothschild, who is far more open than the Pirates have been. “Fastballs get hit,” he says bluntly, especially if they are not commanded well. The addition of the Pirates to this anti-fastball revolution (shall we call it #ForgetTheFastball?), however, certainly adds credibility to the idea that this could work for many types of pitchers on many different teams. Searage definitely does more than make pitchers throw fewer fastballs; what I have discovered is probably less than 10% of his plan for each pitcher. He works on mechanics, makes each pitch better, and has different game plans every day. This is especially notable in the case of Edinson Volquez, who did throw fewer fastballs after joining Pittsburgh, but saw his biggest drop in four-seam use the year before his signing. But, in this post-Moneyball decade, with teams looking to question everything to gain any slight advantage, Ray Searage has shown us that combating the hitter-friendly fly ball trend through a decline in straight fastball use could be a secret to pitching success.

Images attributed to The Associated Press and the Pittsburgh Post-Gazette

The Road Rockies

The K Zone


December 7th, 2017

by Ian Joffe

I often find myself pondering two what-if scenarios in regard to the Colorado Rockies. My questions rise out of the well-documented fact that the Rockies play half their games a mile in the sky, at their home in Coors Field. The thin air allows baseballs to be driven harder and with more consistency than in any other ballpark. Considering that my first fantasy, Giancarlo Stanton being traded to Colorado, was killed earlier this offseason, in this article I will turn to the question of what kind of team the Rockies would be in a normal ballpark. As expected, in 2017, the Rockies fell in the middle of the pack in terms of runs allowed, but scored the third most runs in baseball, and the most in the National League. This combination led them to the 8th best record in MLB, and a wild card berth.

Looking strictly from a team-wide perspective, “the Coors Field Effect” appeared to have a positive impact on the Rockies. The team went just over .500 on the road (41-40), ranking 25th in total runs scored. Colorado’s pitching rose up to 9th in baseball, but this was not enough to remedy the weakened hitting. Under these changes, the road Rockies compare extremely well to the Tampa Bay Rays, who went 80-82 over the regular season – not a bad team, but certainly not a playoff roster either. Based on this information, it’s easy to claim that the Rockies are just lucky to make Coors Field home, but I would instead give credit to their front office. They figured out how to build a good team based on the conditions they were given, which is not an easy task.

Just as interesting as the performance of the overall team, however, is the performance of individual Rockies players on the road. Looking at the hitters, the first one who stands out is Charlie Blackmon. Most recently, Blackmon was subject to controversy over MVP balloting. Some argued that his park should be counted against him, while others wanted to let his numbers be. Those who argued that his stats were inflated by his park certainly had a point. Twelve of Blackmon’s 13 triples were at home, a huge driver of his .601 total slugging percentage, which fell under .450 on the road. Blackmon still put up respectable numbers, but not MVP level. His 102 wRC+ suggests he is more of a league average hitter, and his defense has never been spectacular. Only one Rocky (or is it Rockie?) ended up being an above average offensive player on the road: Nolan Arenado. The third baseman’s wRC+ only fell by 4 points when away, a trend he has exhibited throughout his career.

Five Rockies starting pitchers threw at least 50 innings on the road. Strangely, Kyle Freeland got worse, while German Marquez improved only a little. Antonio Senzatela experienced significant improvement on the road, but remained a league average pitcher. Jon Gray is the first of the important Rockies pitchers on the road. His 10 K/9 would put him in the upper tier of starting pitching, as would his low walk and home run rates. These total up to a 3.05 road FIP, certainly an enviable mark. Even more impressive, however, are Tyler Chatwood’s numbers. Disappointingly, Chatwood finished 2017 with only mediocre road numbers, but for his career, he owns a 3.31 road ERA with a 0.71 HR/9. In 2016, Chatwood boasted a mere 1.69 road ERA, and in 2013 he put up a 2.72 mark (Tyler missed most of 2014 and 2015 with injuries). Both home and away, Chatwood is an extreme ground baller, approaching a 60% rate, which, combined with a 24% soft hit rate and 26% hard hit rate on the road, is likely responsible for the elite road years. It is very worthy of note that none of Chatwood’s more sabermetric numbers stand out on the road, but it would still be interesting to see how he would fare in a whole season. And, the best part is, soon, we will be able to see. Chatwood was a free agent this offseason, and just last week he signed with the Cubs, so the dream of seeing a full road season out of Tyler Chatwood will, at long last, become a reality.

 

Sources:

Fangraphs

MLB.com

Images Attributed to:

ESPN

stadiumparkingguides.com

A Look at Luck: Hitter Edition

-The K Zone-

July 6th, 2016


A Look At Luck: Hitter Edition by Ian Joffe

Today is June 6th. It’s an arbitrary day – I chose it because today I happened to be bored – but this time always proves to be fascinating for player evaluation. With a couple of months and change under most players’ belts, we have a large enough sample size to start judging which breakout stars are for real, but a small enough one that a ton of luck is still involved.

As in all statistics, a larger sample size means more accuracy. In baseball, that sample size generally applies to plate appearances (at-bats exclude walks and such) and innings pitched. The inverse of this is that a smaller sample size creates more randomness, or luck. In baseball, luck can refer to a multitude of factors, from ballpark dimensions to weather, but the most important factor in luck is opposing defense. Defensive luck is a critical component of early and current sabermetrics, as statistical studies by early SABR experts have suggested that the outcomes of balls in play are almost entirely up to the defense, rather than the hitter. This is not to suggest that balls in play are entirely random, but rather that most are, and the very hard hit balls tend to even out with the very soft hit balls. Based on this, we can conclude that most hitters should have a similar BABIP, or Batting Average on Balls in Play, because the defenses they face over a large sample will be similar to one another. That BABIP is expected to be around .300. There are certainly exceptions, including incredibly speedy hitters and power hitters, but even the exceptions shouldn’t have a BABIP over .330, and even Hall of Fame level exceptions will not achieve a BABIP over .350. Thus, if a player has a BABIP far over the .300-.330 range, they have experienced high defensive luck, and we can expect that luck to even out over a larger sample size, as the luck of a coin flip would, causing their numbers to regress. Similarly, a batter who has encountered great defense may have a BABIP far below the .270-.300 range, meaning their stats will likely improve, or experience positive regression (I know, it’s an oxymoron, but that’s the proper term). In this article, I will examine the highest and lowest BABIPs in the league, and attempt to determine how sustainable that makes their current statistics. Whose hot starts can you trust, and whose will wash away?
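
For reference, BABIP is conventionally computed as (H - HR) / (AB - K - HR + SF). The short Python sketch below applies that standard formula and then labels the result using the .270 and .330 bounds discussed above. The function names and the sample stat line are hypothetical, and the labels are only a rough reading of the ranges in the text, not a formal projection.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls in Play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play

def luck_flag(value, low=0.270, high=0.330):
    """Rough label based on the typical BABIP range described in the article."""
    if value > high:
        return "likely to regress"
    if value < low:
        return "likely to improve"
    return "in the normal range"

# Hypothetical early-season stat line (not a real player).
value = babip(hits=60, home_runs=10, at_bats=200, strikeouts=50, sac_flies=2)
print(f"BABIP = {value:.3f}: {luck_flag(value)}")
```

Home runs are removed from both the numerator and the denominator because they never reach a fielder, which is why a slugger's batting average and BABIP can diverge sharply.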

Hitting BABIP Leaders:

  1. Miguel Sano (.465)
  2. Ryan Zimmerman (.411)
  3. Aaron Judge (.408)
  4. Avisail Garcia (.396)
  5. Jean Segura (.395)
  6. Zack Cozart (.393)
  7. Xander Bogaerts (.387)
  8. Corey Dickerson (.387)
  9. Keon Broxton (.385)
  10. Matt Kemp (.379)

Miguel Sano has amazing peripherals so far. His soft hit rate is 7%, with a hard hit rate over 50% and league leading 96.6 average exit velocity to accompany it. It is those peripherals that make his BABIP somewhat believable for the time being, but they are also what makes his numbers unsustainable. He won’t finish the season with those kinds of batted ball metrics, and considering he is currently hitting barely over .300, his batting average could turn out dismal. But, if he continues his three-outcomes approach, I would not be surprised if his massive power continues, along with big walks and really big strikeouts.

Ryan Zimmerman is perhaps the most exciting player on this list. His 2016 return from/return to injury was disappointing on the surface, but deeper Statcast analysis told a more complex story. Zimmerman was killing the baseball, but hitting it on the ground almost 50% of the time, with one of the worst average launch angles in baseball. This year, however, he has embraced the new, league wide ball-elevation mentality, and has had terrific results as the potential league MVP so far. Thanks to his high current BABIP, Zimmerman will not maintain his current, ridiculous 1.111 OPS, but should maintain good numbers throughout the season thanks to his new alchemy of velocity and launch angle. I would expect to see a batting average around .300 with good power.

Avisail Garcia is the most unfortunate standout of the group. After falling from top prospect status years ago, he has disappointed scouts and fans alike. Yet, in 2017, he got off to a fiery start, all on the heels of a nearly .500 BABIP. And, he did all of that despite hitting 50% ground balls with normal exit velocities. In May, the BABIP slowly began to regress, and so has the batting average along with it. The two will continue to decrease together until Garcia reaches a more natural .300 BABIP, which should, for him, match a low average (I’m talking .220 or so) with few walks.

Aaron Judge is one of the greatest success stories of the year, getting off to an incredibly hot start after largely failing in last year’s small sample. He hit nearly .400 in April and continues to lead the league in home runs, but his batting average has since fallen to the low .300s. That batting average will continue to fall, even if his good batted ball numbers allow him to maintain a somewhat higher BABIP. He may hit .250 at the end of the year, but with good on-base instincts and continued power (maybe not to the extent he has hit so far, but he could go deep 45 times total), expect Judge to keep high value and be a Rookie of the Year favorite.

Jean Segura had similar BABIP issues last year, when he hit well over .300 in April, but he somehow managed to maintain a strong .319 batting average throughout the season. Segura has the speed to maintain an above average BABIP, but Usain Bolt wouldn’t get close to .400 over a large sample, and neither will Segura. I would expect his current .341 batting average to drop significantly, to below .290 by the end of the year, unless, like last year, he can manage to defy the odds. If he does continue to hit, it would throw an interesting punch in the face of DIPS theory.

Zack Cozart might be a pleasant surprise in terms of walk rate, but like Garcia, his batting average seems doomed. He has shown us nothing to prove that his BABIP can even remain above average, telling me it should fall about 100 points, bringing his batting average to the mid-.200s. That would make sense, considering it would match his career mean.

Xander Bogaerts’ high batting average may have canceled out some Boston fans’ fears over the lack of power, but I would worry about both hitting tools. Xander has not hit the ball hard; in fact, he is hitting it particularly softly, and he has only good speed. Like last year, he could go on a massive cold spell at any moment, and his batting average may drop below .280, dare I say .270.

Corey Dickerson seemed like half of a terrible trade for both teams, when the Rockies dealt him to Tampa Bay for Jake McGee a couple years back. Critics blamed park factors for his miserable 2015, when his OBP dipped below .300, but he seems to have adjusted at this point. Much of his .330 batting average is BABIP fueled, but he has shown us the ability to hit for a moderately high BABIP and average before, and may end the year hitting a very productive .280 or so.

Keon Broxton may need a demotion after the BABIP regresses, making his story one of the more disappointing ones on this list. After a hot start last year (although that may have been BABIP-fueled as well), he’s hitting only .240 despite the luck, and may be batting somewhere in the .100s without the inflated BABIP. Additionally, he has struck out 40% of the time. Broxton should not be expected to hit for power either. He will steal his 30 bases, but he’s not fast enough to sustain such a high BABIP. If I were a fantasy owner, I would try to trade him while I still could.

Matt Kemp, the former MVP candidate, was traded to San Diego and later Atlanta after falling from stardom through numerous injuries, but seems to be putting up respectable production once again. His average could fall to .275 or so, but like Dickerson, Kemp has shown the ability to put up an above average BABIP before. Kemp’s days of speed and defense are over, but he could still be a middle-of-the-order bat for the rebuilding Braves, with good power.

Along with looking at BABIP to see who will regress (and who may keep a reasonably high average), one could also use the stat to encourage people to not give up on certain players who have had really bad luck. Rizzo (.216) and Machado (.229) immediately come to mind. Don’t worry, Cubs and Orioles fans, they will bounce back to All-Star production when their BABIP improves. The young Kyle Schwarber and Dansby Swanson also have BABIPs under .230. Matt Carpenter will get better, as his BABIP is .238. I also want to say that fantasy owners and White Sox fans shouldn’t worry about the ToddFather (his BABIP is under .200), but the fact that he managed to keep a .200 BABIP all last year, and continues to do so this year, makes him appear to be a candidate to be a reverse Segura-type. Only time will tell. Padres hitter Ryan Schimpf, who has the worst BABIP in the league, is little-known among everyday fans, but to communities that embrace the three true outcomes approach and launch angle strategy (Schimpf proudly owns a 64.6% launch angle), he may be the face of the movements. His batting average is in the mid-.100s despite 14 home runs so far. He has far too many strikeouts and fly balls for the batting average to improve very much, but it should rise at least above .210 by the end of the season.

If you liked this, you may want to check out my look at the WAR statistic, or you can look at any of Mike’s player interviews. Follows on Twitter and Instagram are greatly appreciated, and you’ll be the first to know when new content comes out!