The Flaws of Batting Average
Although batting average has rightly been used less and less in recent years, many people still treat it as their primary measure of a hitter's performance. However, batting average may be one of the worst stats for that purpose. Batting average was created in 1887 as a means of player assessment. To find it, divide a player's hits by their at bats. More than 130 years later, the stat is still widely used by baseball fans.
However, while batting average accurately reflects how many hits a player collects per at bat, it fails to track two of the most important things a hitter can provide to their team. One is the ability to get on base, not just through hits but also through walks, hit by pitches, and other means. The other is extra base hits. Extra base hits (doubles, triples, and home runs) are clearly more valuable than singles, but batting average counts every hit the same, so an extra base hit raises a player's batting average no more than a single does.
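To make that gap concrete, here is a small Python sketch comparing two made-up hitters who share the same batting average. The stat lines and names are invented purely for illustration, and the formulas are the standard definitions of AVG, OBP, and SLG rather than anything drawn from the data used later in this piece.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    """Season totals for one hitter. All values used below are invented."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sac_flies: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs


def batting_average(s: StatLine) -> float:
    # Every hit counts once, whether it is a single or a home run.
    return s.hits / s.at_bats


def on_base_percentage(s: StatLine) -> float:
    # Standard OBP: times on base divided by the plate appearances that count.
    reached = s.hits + s.walks + s.hit_by_pitch
    chances = s.at_bats + s.walks + s.hit_by_pitch + s.sac_flies
    return reached / chances


def slugging_percentage(s: StatLine) -> float:
    # Total bases per at bat, so extra base hits are weighted more heavily.
    total_bases = s.singles + 2 * s.doubles + 3 * s.triples + 4 * s.home_runs
    return total_bases / s.at_bats


# Two hypothetical hitters with 150 hits in 500 at bats each.
singles_hitter = StatLine(500, singles=140, doubles=8, triples=1, home_runs=1,
                          walks=20, hit_by_pitch=2, sac_flies=3)
power_hitter = StatLine(500, singles=85, doubles=30, triples=3, home_runs=32,
                        walks=80, hit_by_pitch=5, sac_flies=5)

for label, line in [("singles hitter", singles_hitter), ("power hitter", power_hitter)]:
    print(f"{label}: AVG {batting_average(line):.3f}  "
          f"OBP {on_base_percentage(line):.3f}  "
          f"SLG {slugging_percentage(line):.3f}")
```

Both invented hitters bat .300, but the second reaches base far more often (.398 OBP against .328) and hits for much more power (.564 SLG against .326). Batting average alone cannot tell them apart.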
As Fangraphs says, “In baseball, we care about run scoring (and prevention) and so when looking at offensive statistics, we want to find statistics that tell you something about how much a player contributes to the run scoring process. Batting average leaves out walks and walks play a major role in run scoring” (https://library.fangraphs.com/stats-to-avoid-batting-average/). Thus, there are far more effective and valuable stats that can be used instead to evaluate player performance and their impact on their teams.
To measure the value of AVG compared to other stats, I pulled player data from 2010 - 2020, including batting average, on base percentage, WAR, slugging percentage, wRC+ (weighted runs created plus), and plenty more. For this comparison, I used WAR as my dependent variable and the other stats as the independent variables. Wins above replacement (WAR) is a player value assessment that estimates the total wins a player contributed to their team, and as Fangraphs explains, it answers the question of “If this player got injured and their team had to replace them with a freely available minor leaguer or a AAAA player from their bench, how much value would the team be losing?” (https://library.fangraphs.com/misc/war/). Because WAR is one of the more value based stats, I chose it to visualize how closely stats such as batting average track it.
To compare batting average with stats that could replace it as the standard for player analysis, I created a few correlation graphs. They show that batting average may not be as useful a stat as it once was, and often still is, considered to be.
Many people may not even need a visualization to know this; analytically minded fans, already aware of its faults, have begun moving away from batting average. Still, to provide more evidence for why and how batting average falls short, I will walk through a basic but effective visualization of its value.
Since WAR is a cumulative stat, I filtered out players who did not have more than 1000 at bats from 2010 - 2020, as they would automatically have a much lower WAR given their limited playing time. With that said, this is the correlation between batting average and WAR, which is rather low at 0.297.
However, as seen here, the correlation between on base percentage (OBP) and WAR is considerably stronger, at 0.596.
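For readers who want to reproduce this kind of comparison, the filter and correlation step can be sketched in a few lines of Python. This is a minimal sketch rather than the exact method used here: it assumes the reported figures are ordinary Pearson correlation coefficients, which the analysis does not state, and the player rows and field names (`at_bats`, `avg`, `war`) are hypothetical placeholders, not real data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented rows standing in for 2010 - 2020 player totals (not real players).
players = [
    {"name": "Player A", "at_bats": 3200, "avg": 0.281, "war": 18.4},
    {"name": "Player B", "at_bats": 2750, "avg": 0.302, "war": 9.1},
    {"name": "Player C", "at_bats": 1900, "avg": 0.248, "war": 11.7},
    {"name": "Player D", "at_bats": 1450, "avg": 0.266, "war": 3.2},
    {"name": "Player E", "at_bats": 640, "avg": 0.310, "war": 1.0},
    {"name": "Player F", "at_bats": 210, "avg": 0.190, "war": -0.4},
]

# Keep only players with more than 1000 at bats, as described above.
MIN_AT_BATS = 1000
qualified = [p for p in players if p["at_bats"] > MIN_AT_BATS]

r_avg_war = pearson_r([p["avg"] for p in qualified],
                      [p["war"] for p in qualified])
print(f"{len(qualified)} qualified players, AVG vs WAR correlation: {r_avg_war:.3f}")
```

Swapping the `avg` field for OBP or wRC+ would give the other comparisons. With only a handful of invented rows, the number this prints means nothing; the point is the shape of the calculation.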
Thus, it can be inferred through this data that although batting average certainly has a positive relationship with WAR, the relationship between OBP and WAR is stronger.
Furthermore, the correlation between WAR and wRC+, one of the more advanced player contribution stats, is higher still at 0.619. As seen in the graph, the more runs a player creates, the higher their WAR tends to be.
Now that we’ve covered how weakly batting average tracks player value compared with other stats, let’s explore its relationship with team success.
Based on the graph below of team batting averages and winning percentages for all 30 teams across the 2010 – 2020 seasons, there is clearly very little correlation between batting average and winning.
At a correlation of just 0.178, there is only a weak relationship between batting average and teams winning. The points on the graph are widely scattered, reflecting the low correlation between the two variables. A team could have a batting average as low as .245 and still post a very solid winning percentage of around .540, or a batting average as high as .267 with a winning percentage as low as .460, as seen in the graph.
There is, however, a much stronger correlation between OBP and winning percentage. This makes sense, considering OBP takes into account not only the hits a player accumulates but also the other ways they reach base, which, as I mentioned, is one of the most crucial things a hitter can provide.
With a correlation of 0.708 and far less scatter among the data points, we can see that OBP has a much stronger relationship with winning.
Slightly stronger is wRC+’s relationship with team winning percentage, at 0.718.
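One hedged way to read these three team-level numbers is to square them. Assuming they are Pearson correlations and the relationship is roughly linear, the square of the correlation approximates the share of variation in winning percentage that each stat accounts for on its own. The short Python sketch below uses only the figures quoted above; the variable names are mine.

```python
# Team-level correlations with winning percentage, as reported above.
team_correlations = {"AVG": 0.178, "OBP": 0.708, "wRC+": 0.718}

# For a one-variable linear fit, r squared is the share of variance explained.
variance_explained = {stat: r ** 2 for stat, r in team_correlations.items()}

for stat, r_squared in variance_explained.items():
    print(f"{stat}: r = {team_correlations[stat]:.3f}, r^2 = {r_squared:.3f}")
```

On this reading, batting average accounts for roughly 3% of the variation in team winning percentage, compared with about 50% for OBP and about 52% for wRC+.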
Of course, this only factors in the 2010 - 2020 seasons, and it is a very simple visualization of the correlations. However, it was still very interesting to see how little relationship batting average has with winning percentage. Keep in mind that the best thing a player can do is contribute to team success, which is measured by winning, but judging by the relationships we saw in the graphs, batting average does not correlate well with that success.
While many baseball fans may not even need these visualizations to confirm that batting average is a poor gauge of player value, see this as the ultimate confirmation. With all that having been explored, it is clear that batting average is only weakly correlated with two of the most important aspects of player contribution: scoring runs and helping their teams win games. Thus, our criteria for MVP candidates should shift away from batting average. Instead, we should focus primarily on wRC+, or on simpler stats such as OBP, OPS, or SLG. Clearly, batting average is a poor basis for arguments about player and team performance. While the arrival of new analytical stats has begun to erode the use of batting average, it remains the most commonly used stat in MLB. As MLB All Star voting commences, I would encourage voters to stay away from batting average when considering who to vote for.