#965 | 11-17-2021, 07:21 PM
Snowman (Travis Trail)

Quote:
Originally Posted by AndrewJerome
This is a fascinating subject. I really enjoy analyzing baseball history and player performance.

There seem to be a few disconnects in this debate.

One disconnect is how much weight to place on counting stats. Pro-Spahn posters in this thread rely on his longevity and counting stats, along with what appears to be a decent peak and a pretty good ERA+ across his peak years. Anti-Spahn posters believe he was a pretty average pitcher in terms of “stuff,” since his K/9 doesn’t blow your hair back and wins are team-dependent. He pitched a lot of innings over a lot of years, but innings eaters can’t get to GOAT status if they don’t provide elite innings. Essentially, Spahn’s peak is not enough to make him the best lefty ever, even with all the counting stats.

Koufax’s stats are obviously much different: one very good year, five off-the-charts years, some mediocre years, an early retirement, and nowhere near Spahn’s overall counting stats. Anti-Koufax posters essentially dismiss him outright because his lack of counting stats eliminates him from lefty GOAT status; he simply didn’t pitch long enough to be in the conversation. I tend to agree that the weaknesses of both Spahn and Koufax, as described above, eliminate them from lefty GOAT status. Both were clearly great pitchers, though.

Another disconnect here is how to compare players across eras. Snowman appears to be arguing that Grove’s pitching competition was weak and that his stats should therefore be discounted a great deal; the ERA titles, ERA+, etc., are tainted by weak pitching competition. Essentially, Grove was much better than his pitching peers, but since those peers were very bad, his being much better than them is not as impressive as the stats make it appear. I have always wondered about this, but I have no way of figuring out how to crunch the numbers to argue one way or the other. Batting averages in the 1920s and early 1930s went nuts. Hitters went crazy. How much of that was a result of bad pitching during those years?

Anyway, Snowman, I am curious how stats can help us figure out which time periods were strong and which were weak. It has always been something of a mystery to me. On a similar note, WAR is a bit misleading to me, since it measures value relative to a replacement level that is determined differently every year. The value of a replacement-level player could be very different in a period where the overall quality of play is very high than in a period where it was lower. But how in the world can we figure out relative quality of play?
First, I have to say thank you for actually reading my posts and summarizing my views in a way that I can sign off on. You're the first person here who has even made an attempt to understand what I've said without intentionally trying to distort it.

As far as how it could be calculated, there are several options. My preferred approach would be to build a hierarchical mixed-effects model, which estimates fixed effects and random effects simultaneously. These models are extremely powerful. You could create time blocks for periods where something of note happened (like 1942-1946, when the talent pool was heavily diluted by players leaving for WW2, or the pre-1950 and 1950-62 periods with their different strike-zone definitions, etc.), hard-code those into your data, and treat them as fixed effects.

We could also control for the offensive efficiency of each era, for example by measuring how often runners were left in scoring position (offenses were considerably less efficient when Grove was pitching), among countless other ways. We could control for a pitcher's command across eras by including K/BB ratios and capturing the interaction of that metric with K/HR, since strikeout rates are a function both of how well a pitcher pitches and of the strategies employed by hitters. If that relationship is non-linear, we could apply a mathematical transformation (square root, cube root, log, etc.) that creates a linear relationship, which would then have predictive power in a model like this. Worth noting is that there is an extremely strong correlation over time between strikeout rates and HRs, because swinging for the fences results in striking out more often. I would also include several rate stats that contrast the ratio of batting average to OPS over time, as this has a measurable effect on pitching statistics across eras. Also worth including is the relationship between league-wide ERA and WHIP over time, looking for gaps in that ratio. If WHIP values were high relative to ERAs, that would indicate that pitchers' ERAs benefited from inefficient offenses (something Grove and his peers on the mound surely benefited from, perhaps tremendously).

Something else worth noting (and I suppose this is a hint of sorts at something I referenced earlier) is that it's more important to know a pitcher's strikeouts per plate appearance than his K/9. There are also differences in approach over time. Ted Williams talked about just "putting the bat on the ball" and how that made him a "better hitter" than he would have been if he had tried to hit home runs. While it gave him a better batting average, we now know that this isn't what makes someone a "better hitter," at least not in the sense of producing more runs and winning more games.

We would also need to control for mound heights at each ballpark over time. We could treat the individual players' performances as random effects while treating the other metrics we are interested in estimating as fixed effects, all while simultaneously adjusting for age. We could also look at how the slopes of the aging-curve calculations have changed over time: the flatter the curve, the less skilled the peers; the steeper the curve, the stronger the opposition.
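To make that a little more concrete, here's a bare-bones sketch of what the skeleton of such a model could look like in Python with statsmodels. To be clear, the file name and the column names (era_plus, era_block, k_per_pa, kbb_ratio, mound_height, age, player_id) are hypothetical placeholders, and a real version would need far more covariates (park factors, the ERA/WHIP gap terms, etc.) and far more care:

```python
# A minimal sketch (not a finished model) of the hierarchical mixed-effects
# setup described above, using statsmodels. File and column names are
# hypothetical placeholders for whatever pitcher-season dataset gets assembled.
import pandas as pd
import statsmodels.formula.api as smf

# One row per pitcher-season (hypothetical file).
df = pd.read_csv("pitcher_seasons.csv")

# Era blocks (WW2 years, strike-zone regimes, etc.) enter as categorical
# fixed effects; individual pitchers are the random-effect grouping.
model = smf.mixedlm(
    "era_plus ~ C(era_block) + k_per_pa + kbb_ratio + mound_height + age + I(age**2)",
    data=df,
    groups=df["player_id"],
    # re_formula="~age" would additionally give each pitcher his own age slope.
)
result = model.fit()
print(result.summary())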
The beauty of using this approach with hierarchical multilevel models, as opposed to standard regression or econometric-type models, is that they are fit with iterative algorithms that can estimate different slopes AND intercepts for each cohort, rather than forcing every cohort to share the same slope with different intercepts, as you'd get from a multiple regression model. The overlap of players who played across different eras (in aggregate, not just cherry-picking one or two players) allows us to measure the differences in the overall skill level of each time period we are interested in (again, adjusting for age and all of the other factors simultaneously).

One thing worth keeping in mind is that it's not so much that hitters from the 1930s were "worse" hitters in the sense that they were less capable (although surely this is also true), but rather that they were "worse" hitters in the sense that they employed sub-par hitting strategies (e.g., they bunted too often and just tried to "get a bat on the ball" rather than swinging from their heels like Babe Ruth and Lou Gehrig did).

We would also want to adjust for the overall talent pool of players in the league and the populations from which they were drawn. Professional athletes are sampled from the right tail of a Gaussian (or "normal") distribution; they are the best of the best. The ratio of the number of players in the league to the total number of possible baseball players from whom they could have been drafted is extremely important, as it effectively tells us where along that normal distribution the league's talent cutoff lies. The larger that ratio, the further to the left it sits on the distribution; the smaller the ratio, the further to the right. And the further a league sits to the left on that curve, the less skilled it is as a whole. If one era is three standard deviations to the left, we can be extremely confident that we're effectively watching something that amounts to single-A ball today with a handful of star players sprinkled in.

A prime example: I played varsity basketball in high school. The reason I was able to make the team wasn't that I was some elite athlete, but that there were only about 200 students in my school. Had I attended a much larger school, I might not have even made the JV squad. There were probably only one or two kids on my entire team, if any, who could have made the team at a much larger school, and their stats would certainly have gone down if they had. They might have averaged 20 ppg and 8 rebounds on my team, but only 12 ppg and 5 rebounds on a team with better players and stronger opponents. Baseball is no different. Player talent pools grew over time, and the earlier years, while still fun and nostalgic, were simply nowhere near as strong as the game is today. Just watch some of the available footage from that era. Half those pitchers look like Weeble Wobbles on the mound with their "deliveries." Those guys were not throwing heat.
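And to put a rough number on the talent-pool point, the selection cutoff falls straight out of the roster-to-population ratio. The counts below are purely illustrative placeholders (I haven't looked up the real figures), but the mechanics are the point:

```python
# Back-of-the-envelope version of the talent-pool argument above. The roster
# and population counts below are purely illustrative placeholders, not
# researched figures.
from scipy.stats import norm

def selection_cutoff_z(roster_spots: int, eligible_population: int) -> float:
    """z-score of the marginal big leaguer, assuming players are the top
    roster_spots / eligible_population fraction of a normal talent curve."""
    ratio = roster_spots / eligible_population
    return norm.ppf(1.0 - ratio)

# Smaller, mostly domestic pool vs. a much larger modern/international pool.
z_old = selection_cutoff_z(400, 20_000_000)     # illustrative 1930s-style numbers
z_new = selection_cutoff_z(780, 200_000_000)    # illustrative modern numbers
print(f"old pool cutoff: z = {z_old:.2f}")      # lower cutoff, weaker marginal player
print(f"new pool cutoff: z = {z_new:.2f}")      # higher cutoff, stronger marginal player
```

The exact numbers don't matter here; what matters is that the cutoff slides left as the roster-to-pool ratio grows, which is the whole point.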

I often use these sorts of models when I'm building predictions for NFL games. If a team's starting center is injured and will miss the game on Sunday, I can use these types of models to predict the impact his absence will have on the spread (hint: it's more than you'd probably think).
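For what it's worth, a stripped-down version of that setup could look like the sketch below; the file and column names are made up for illustration:

```python
# Rough sketch of how an injury indicator could enter a game-level
# mixed-effects model of this kind. The file and column names (margin,
# center_out, opp_strength, home, team) are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

games = pd.read_csv("nfl_games.csv")  # hypothetical game-level dataset

injury_model = smf.mixedlm(
    "margin ~ center_out + opp_strength + home",  # fixed effects
    data=games,
    groups=games["team"],                         # team-level random intercepts
)
fit = injury_model.fit()

# The coefficient on center_out estimates the expected shift in point margin
# (and therefore the spread) when the starting center sits out.
print(fit.params["center_out"])
```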

We could also make retrodictions about things like how fast pitchers threw in the 1920s by looking at the progression of similar sports for which we actually do have data. One option would be to look at the history of Olympic javelin and discus records over time, see how well that progression correlates with pitching stats during the periods for which we have data for both, and then regress pitching stats retrodictively against those throwing sports to yield directionally accurate estimates for the eras before radar guns (rough sketch below). While I haven't run the numbers yet, I'm extremely confident that there's no way in hell anyone in the 1920s was throwing a baseball 100 mph.

It's worth pointing out that all of the anecdotal stories about players saying that Walter Johnson (or pick your favorite hero) was the hardest thrower they ever saw don't really mean all that much. The plural of anecdote is not data. When I was in middle school, I played against pitchers who were throwing ~70-75 mph. I still vividly remember going to the batting cages during that time, stepping into the 90 mph cage, and laughing to myself, "How the hell am I supposed to hit that?" Speeds are all relative. Walter Johnson throwing the ball 10 mph faster than the second-fastest guy doesn't mean much when we don't know how hard the other players were throwing; everyone just knows that he throws "heat" relative to what they're accustomed to. He very well might have been throwing a mere 90 mph, but it "felt like 100 mph" to anyone standing at the plate who was used to swinging at 80 mph "fastballs."
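Sketched in code, the javelin/discus regression idea above could look something like this (again, the input files are hypothetical stand-ins, not data I actually have in hand):

```python
# Directionally-illustrative sketch of the retrodiction idea. Both input
# files are hypothetical stand-ins for whatever paired data (Olympic throwing
# records and measured fastball velocities by year) actually gets assembled.
import numpy as np
import pandas as pd

throws = pd.read_csv("olympic_throwing_records.csv")  # columns: year, javelin_m
speeds = pd.read_csv("measured_fastball_speeds.csv")  # columns: year, top_mph

# Fit a simple linear map from throwing-record progression to top fastball
# velocity over the years where both series exist.
both = throws.merge(speeds, on="year")
slope, intercept = np.polyfit(both["javelin_m"], both["top_mph"], deg=1)

# Retrodict: plug the early-era throwing records back into the fitted line to
# get rough, directionally accurate estimates for the pre-radar-gun years.
early = throws[throws["year"] < 1930]
estimated_mph = slope * early["javelin_m"] + intercept
print(estimated_mph.describe())
```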