Best lefty of all time? My vote is Koufax! - Page 23

G1911 · #**1101** 11-20-2021, 09:25 PM

Quote:

Originally Posted by Peter_Spaeth

Maddux's success was the result of dumb luck!! 21 years of it, well, maybe 16 of which really defined him.

Well, he pitched for 21 years which seems like a lot, but it was only 5,008 innings. The sample is just too small. Maddux was lucky. Also a bum because only K pitchers who played after Spahn, except for Koufax who is exempted because I don't know, are any good.

Lorewalker · #**1102** 11-20-2021, 10:24 PM

Quote:

Originally Posted by G1911

Well, he pitched for 21 years which seems like a lot, but it was only 5,008 innings. The sample is just too small. Maddux was lucky. Also a bum because only K pitchers who played after Spahn, except for Koufax who is exempted because I don't know, are any good.

Snowman is always right. Just ask him.

BobC · #**1103** 11-20-2021, 10:52 PM

Quote:

Originally Posted by Lorewalker

Snowman is always right. Just ask him.

He even has a statistical algorithm to prove it. But don't ask him to show you, because he hasn't actually created it yet. And he doesn't really have the time to do it right now, unless you want to pay him. But even if you do, and then he does, it probably doesn't matter because he'll tell you you're too ignorant to understand it anyway.

Snowman · #**1104** 11-21-2021, 01:23 AM

Quote:

Originally Posted by Peter_Spaeth

No they are 9 points lower, I already posted that.

If your thesis is that Greg Maddux' career (after all he was not a dominant strikeout pitcher with 6 K/9) was just the result of dumb luck, you have pretty much disqualified yourself as knowing anything about baseball, however good you are with data.

My instincts about Maddux's BABIP were wrong. You're right, Maddux did beat the league average BABIP, particularly between 1992-1998 (see plot below). But you appear to be misunderstanding what I'm saying. I'm not saying Greg Maddux was just a "lucky" pitcher. He was an excellent pitcher. I'm saying that people conflate his remarkable ability to control the ball with him having the ability to also control where the ball goes after someone puts it into play. The extent to which pitchers actually have this ability is minuscule at best. It's probably at least an order of magnitude less than people are thinking of when they make that claim. Maddux rarely walked hitters. He led the league in BB/9 9 times, and was probably in the top 3 15 times or more. This was his superpower. As I mentioned earlier, there is some research (which I'd have to read again, it's been a while) that suggests a really strong pitcher may have a small, but measurable effect on their BABIP, but that estimate is only something like 5 points worth of BABIP, which is to say out of every 1,000 times a ball gets put into play, an elite pitcher is able to prevent an additional 5 of those from becoming hits compared with his peers (hence I said it's a tough sell). However, a pitcher's BABIP can often fluctuate 70 or 80 points from one season to the next. Even if 5 of those points are within their control, that still leaves 65 to 75 points worth of variance or "luck" which is completely outside of their control.

BABIP is a very useful statistic for putting other stats into context. It is influenced primarily by luck, but also by the defensive talent of the players on the field, the skill of the batters, and by the ballpark. Hitters have a fair amount of control over their BABIP numbers (though they are also very much subject to luck in the short term) as exit velocity is highly correlated to BABIP values. The harder you hit the ball, the more likely it is to drop in for a hit. But pitchers face an approximately uniform (top of the order inflated) distribution of batters, so hitting talent mostly evens out for them with some minor exceptions (e.g., pitching in the NL yields a slightly lower BABIP than the AL because of the DH spot, and pitching in a division that is stacked with good hitters can deflate your BABIP if you have a higher than average number of starts against strong offensive teams than your peers). But these effects are fairly small. The overwhelming majority of the variance in BABIP values is simply due to random chance. And this variance is actually pretty wide from season to season, and it correlates highly with the fluctuations you see with other stats that are highly subject to luck as well (like ERA and WHIP).
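As a rough illustration of how far chance alone can move a single-season BABIP, here is a minimal Python sketch. It treats every ball in play as an independent trial with the same hit probability, which is a simplification, and the .300 true rate and 600 balls in play per season are assumed round numbers rather than figures from this post.

```python
import math


def babip_noise_points(true_babip: float, balls_in_play: int) -> float:
    """One standard deviation of binomial noise in BABIP, in points (thousandths)."""
    sd = math.sqrt(true_babip * (1.0 - true_babip) / balls_in_play)
    return sd * 1000.0


# Assumed inputs (hypothetical): a .300 true rate and 600 balls in play per season.
one_season = babip_noise_points(0.300, 600)

# The gap between two independent seasons is noisier by a factor of sqrt(2).
season_to_season = one_season * math.sqrt(2.0)

print(f"one season: about {one_season:.0f} points of noise")
print(f"season to season: about {season_to_season:.0f} points of noise")
```

Under these assumptions, swings of a few dozen points from one year to the next need no explanation beyond chance, which is several times larger than a 5-point skill effect.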

A pitcher like Maddux had a few things going for him which should have helped him outperform the league average BABIP numbers. He pitched in the NL, was in a pitcher's park, and had Andruw Jones chasing down balls for him in CF. I'm not sure exactly how much each of those factors weighs in exactly off the top of my head, but they do have a measurable impact. But even if it is true that a pitcher as great as Maddux is capable of "beating" the BABIP line, the evidence shows that it would only be to the tune of a few balls out of 1,000. That's certainly not what people who promote the idea that he can control ball flights with his pitching style mean when they make such claims. If playing in a pitcher's park is worth 1 or 2 balls per 1,000, and having Andruw Jones running down fly balls is worth 1 or 2 per 1,000, and pitching in the NL is worth 2-3 balls per 1,000 and having god-like control is worth 3-5 balls per 1,000, that would add up to someone like Maddux beating the BABIP line by 9 points.
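The arithmetic in that last sentence can be checked with a short Python sketch. The ranges are the ones stated in the paragraph above; treating the four effects as independent and simply additive is an assumption of the sketch, not something the research is claimed to show.

```python
# (low, high) estimates in BABIP points, as stated in the paragraph above.
effects = {
    "pitcher's park": (1, 2),
    "Andruw Jones in CF": (1, 2),
    "pitching in the NL": (2, 3),
    "elite control": (3, 5),
}

# Assumption: the effects are independent and add up linearly.
low = sum(lo for lo, _ in effects.values())
high = sum(hi for _, hi in effects.values())
midpoint = (low + high) / 2

print(f"combined edge: {low} to {high} points (midpoint {midpoint})")
```

A 9-point edge falls inside the combined range, with only 3 to 5 of those points attributed to the pitcher himself.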

If you haven't read it before, this is worth a read. It has a pretty good explanation of BABIP and why it's important.

https://library.fangraphs.com/pitching/babip/

And since I was wrong and am happy to admit when I'm wrong, here's a plot of Maddux vs the league average BABIP showing that he did in fact beat the league for a good several-year run in the 90s (note the blue line is MLB average, not NL average, which would be slightly lower).

Lorewalker · #**1105** 11-21-2021, 01:41 AM

Quote:

Originally Posted by BobC

He even has a statistical algorithm to prove it. But don't ask him to show you, because he hasn't actually created it yet. And he doesn't really have the time to do it right now, unless you want to pay him. But even if you do, and then he does, it probably doesn't matter because he'll tell you you're too ignorant to understand it anyway.

Pretty much sums up our snowman on every thread, Bob. Since I have been here, this is the fastest I have seen someone overstay their welcome; however, he is very amusing because of how seriously he takes himself.

Snowman · #**1106** 11-21-2021, 03:05 AM

Quote:

Originally Posted by G1911

Well, he pitched for 21 years which seems like a lot, but it was only 5,008 innings. The sample is just too small. Maddux was lucky. Also a bum because only K pitchers who played after Spahn, except for Koufax who is exempted because I don't know, are any good.

Maddux also pitched in the NL, in a pitcher's park, and with one of the greatest defensive center fielders of all time catching balls for him. His BABIP would be expected to be lower than MLB average. If you look at Smoltz and Glavine's numbers during the same time, they also both beat league average MLB BABIP.

Perhaps you should read up on BABIP? I somewhat excuse the level of ignorance on these topics by the non data savvy people in this thread because it's not exactly their job to understand numbers. But if you are serious about being a data analyst, your perpetual ignorance displayed throughout the entirety of this thread with respect to just basic statistics and simple statistical concepts is remarkably embarrassing. You should be ashamed of yourself. Go read a book. Or three.

Snowman · #**1107** 11-21-2021, 03:18 AM

Quote:

Originally Posted by G1911

Well, he pitched for 21 years which seems like a lot, but it was only 5,008 innings. The sample is just too small. Maddux was lucky. Also a bum because only K pitchers who played after Spahn, except for Koufax who is exempted because I don't know, are any good.

I'll give you $1k right now if you can repeat my arguments in a way I'll sign off on. Good luck.

Snowman · #**1108** 11-21-2021, 03:21 AM

Quote:

Originally Posted by BobC

He even has a statistical algorithm to prove it. But don't ask him to show you, because he hasn't actually created it yet. And he doesn't really have the time to do it right now, unless you want to pay him. But even if you do, and then he does, it probably doesn't matter because he'll tell you you're too ignorant to understand it anyway.

And I'll give you $1k right now if you can explain in detail why a pitcher's win totals and ERA from any given season should not be used to evaluate pitching performance. And you can't just say "sample size". Good luck.

BobC · #**1109** 11-21-2021, 03:29 AM

Quote:

Originally Posted by Lorewalker

Pretty much sums up our snowman on every thread, Bob. Since I have been here, this is the fastest I have seen someone overstay their welcome; however, he is very amusing because of how seriously he takes himself.

LOL

Hopefully things will change, but the fact he got bounced off Blowout makes the question others have asked as to whether or not he's a troll, more possible than not I guess. He's a smart guy, just wish he'd be a little more open minded and realize he's not always going to be right. Oh well. Guess we'll wait to see what happens. I just put him on "Ignore" myself and don't read his posts anymore. It's better that way.

Aquarian Sports Cards · #**1110** 11-21-2021, 06:40 AM

from MLB.com

"The formula

(H - HR)/(AB - K - HR + SF)

Why it's useful

BABIP can be used to provide some context when evaluating both pitchers and hitters. The league average BABIP is typically around .300. Pitchers who have allowed a high percentage of hits on balls in play will typically regress to the mean, and vice versa. In other words, over time, they'll see fewer (or more) balls in play fall for hits, and therefore experience better (or worse) results in terms of run prevention. The same applies for batters who have seen a high or low percentage of their balls in play drop in for hits.

That said, skill can play a role in BABIP, as some pitchers are adept at generating weak contact, while some hitters excel at producing hard-hit balls. For example, Clayton Kershaw finished the 2019 season with a lifetime .270 BABIP allowed, while Mike Trout ended the campaign with a career .348 BABIP."
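For anyone who wants to run the numbers themselves, the quoted formula translates directly into a few lines of Python. This is only a sketch: the function name and the sample stat line are made up for illustration and do not belong to any real pitcher.

```python
def babip(h: int, hr: int, ab: int, k: int, sf: int) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (h - hr) / balls_in_play


# Hypothetical stat line allowed by a pitcher (illustrative numbers only).
sample = babip(h=150, hr=15, ab=700, k=180, sf=5)
print(f"BABIP allowed: {sample:.3f}")
```

Home runs and strikeouts drop out of both sides because neither one gives a fielder a chance to make a play.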

My Thoughts:

The all-time leader in BABIP for starters with over 1,000 innings is Babe Ruth at .241; over 2,000 innings, Andy Messersmith at a slightly higher .241; over 3,000 innings, Catfish Hunter at .243. Those are all fine pitchers but none of them are in the running for all-time greatest status. So clearly BABIP, even to the degree it is controllable, isn't a perfect stat either.

Peter_Spaeth · #**1111** 11-21-2021, 09:11 AM

Quote:

Originally Posted by Snowman

My instincts about Maddux's BABIP were wrong. You're right, Maddux did beat the league average BABIP, particularly between 1992-1998 (see plot below). But you appear to be misunderstanding what I'm saying. I'm not saying Greg Maddux was just a "lucky" pitcher. He was an excellent pitcher. I'm saying that people conflate his remarkable ability to control the ball with him having the ability to also control where the ball goes after someone puts it into play. The extent to which pitchers actually have this ability is minuscule at best. It's probably at least an order of magnitude less than people are thinking of when they make that claim. Maddux rarely walked hitters. He led the league in BB/9 9 times, and was probably in the top 3 15 times or more. This was his superpower. As I mentioned earlier, there is some research (which I'd have to read again, it's been a while) that suggests a really strong pitcher may have a small, but measurable effect on their BABIP, but that estimate is only something like 5 points worth of BABIP, which is to say out of every 1,000 times a ball gets put into play, an elite pitcher is able to prevent an additional 5 of those from becoming hits compared with his peers (hence I said it's a tough sell). However, a pitcher's BABIP can often fluctuate 70 or 80 points from one season to the next. Even if 5 of those points are within their control, that still leaves 65 to 75 points worth of variance or "luck" which is completely outside of their control.

BABIP is a very useful statistic for putting other stats into context. It is influenced primarily by luck, but also by the defensive talent of the players on the field, the skill of the batters, and by the ballpark. Hitters have a fair amount of control over their BABIP numbers (though they are also very much subject to luck in the short term) as exit velocity is highly correlated to BABIP values. The harder you hit the ball, the more likely it is to drop in for a hit. But pitchers face an approximately uniform (top of the order inflated) distribution of batters, so hitting talent mostly evens out for them with some minor exceptions (e.g., pitching in the NL yields a slightly lower BABIP than the AL because of the DH spot, and pitching in a division that is stacked with good hitters can deflate your BABIP if you have a higher than average number of starts against strong offensive teams than your peers). But these effects are fairly small. The overwhelming majority of the variance in BABIP values is simply due to random chance. And this variance is actually pretty wide from season to season, and it correlates highly with the fluctuations you see with other stats that are highly subject to luck as well (like ERA and WHIP).

A pitcher like Maddux had a few things going for him which should have helped him outperform the league average BABIP numbers. He pitched in the NL, was in a pitcher's park, and had Andruw Jones chasing down balls for him in CF. I'm not sure exactly how much each of those factors weighs in exactly off the top of my head, but they do have a measurable impact. But even if it is true that a pitcher as great as Maddux is capable of "beating" the BABIP line, the evidence shows that it would only be to the tune of a few balls out of 1,000. That's certainly not what people who promote the idea that he can control ball flights with his pitching style mean when they make such claims. If playing in a pitcher's park is worth 1 or 2 balls per 1,000, and having Andruw Jones running down fly balls is worth 1 or 2 per 1,000, and pitching in the NL is worth 2-3 balls per 1,000 and having god-like control is worth 3-5 balls per 1,000, that would add up to someone like Maddux beating the BABIP line by 9 points.

If you haven't read it before, this is worth a read. It has a pretty good explanation of BABIP and why it's important.

https://library.fangraphs.com/pitching/babip/

And since I was wrong and am happy to admit when I'm wrong, here's a plot of Maddux vs the league average BABIP showing that he did in fact beat the league for a good several-year run in the 90s (note the blue line is MLB average, not NL average, which would be slightly lower).

Two points: one, you at first presented your "instincts" about Maddux as if they were facts you already knew. Read the language of your post.

Two, Kershaw's BABIP is 27 points below the ML average for his career. What's your take on that which obviously can't be explained by NL alone?

Carter08 · #**1112** 11-21-2021, 09:18 AM

Many probably know this but I'll repeat it here. Simply amazing:

Maddux faced 20,421 batters during his time in the league. In those 20,421 plate appearances, only 310 hitters saw a 3-0 count. Out of those 310 3-0 counts, 177 were intentional walks.
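To put those counts in percentage terms, here is a quick Python sketch. It uses only the three numbers quoted above and takes them at face value; the variable names are just for illustration.

```python
# Counts as quoted in the post above (taken at face value, not verified here).
batters_faced = 20_421
three_oh_counts = 310
intentional_walks = 177

# 3-0 counts that were not part of an intentional walk.
unintentional_three_oh = three_oh_counts - intentional_walks

share_all = three_oh_counts / batters_faced
share_unintentional = unintentional_three_oh / batters_faced

print(f"3-0 counts overall: {share_all:.2%} of batters faced")
print(f"3-0 counts excluding intentional walks: "
      f"{unintentional_three_oh} ({share_unintentional:.2%})")
```

On these figures, well under one batter in a hundred reached a 3-0 count without it being an intentional walk.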

Peter_Spaeth · #**1113** 11-21-2021, 09:54 AM

Until he ended it with an intentional pass, Maddux once went 72 straight innings without a walk.

cardsagain74 · #**1114** 11-21-2021, 10:37 AM

Quote:

Originally Posted by Peter_Spaeth

I also question some "oh it's too small a sample size" arguments. Those always seem to me to reflect cherry-picking, to dismiss inconvenient stats that don't fit the theory. We used to see that argument all the time here to rebut the theory that Kershaw was not a good post-season pitcher; his lousy performances were just random events and couldn't possibly reflect that he wilted under pressure. Of course after a full season worth of postseason outings there's still a huge disparity so maybe that argument has been retired.

Of course when the stats do fit the theory, we don't see the sample size argument so much.

And sometimes people don't look at it a little deeper (when they want to dismiss an argument). Like you said, it's not always just as simple as "postseason doesn't matter" because of the sample size.

Willie Mays hit one homer in 134 postseason plate appearances, and that was at the end of a game that was 8-1. So, basically none.

If you assume that 100 of Mays' 134 postseason PAs were relevant (for lack of a better word), the chance of him hitting no homers in those (given his lifetime HR rate) is around .005.

Even if you factor in how it's tougher to hit homers against the quality of championship-level pitching, that's still way too out there on the bell curve to assume that it's just random statistical noise.
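The .005 figure can be sanity-checked with a short Python sketch. It assumes every plate appearance is an independent trial at a constant career home run rate, which ignores pitcher quality, ballparks and aging. The career totals used here (660 home runs in roughly 12,500 plate appearances) are rounded values supplied for the sketch, not numbers from this thread.

```python
def prob_no_homers(hr_rate: float, plate_appearances: int) -> float:
    """Chance of zero home runs in n independent plate appearances."""
    return (1.0 - hr_rate) ** plate_appearances


# Assumed career totals for Mays (rounded, supplied for illustration).
career_hr = 660
career_pa = 12_500
hr_rate = career_hr / career_pa

p_100 = prob_no_homers(hr_rate, 100)  # the 100 "relevant" PAs from the post
p_134 = prob_no_homers(hr_rate, 134)  # all postseason PAs, for comparison

print(f"HR rate per PA: {hr_rate:.4f}")
print(f"P(no HR in 100 PA): {p_100:.4f}")
print(f"P(no HR in 134 PA): {p_134:.5f}")
```

Under those assumptions the 100-PA figure comes out a little under .005, the same ballpark as the estimate above.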

G1911 · #**1115** 11-21-2021, 10:44 AM

Quote:

Originally Posted by Snowman

Maddux also pitched in the NL, in a pitcher's park, and with one of the greatest defensive center fielders of all time catching balls for him. His BABIP would be expected to be lower than MLB average. If you look at Smoltz and Glavine's numbers during the same time, they also both beat league average MLB BABIP.

Perhaps you should read up on BABIP? I somewhat excuse the level of ignorance on these topics by the non data savvy people in this thread because it's not exactly their job to understand numbers. But if you are serious about being a data analyst, your perpetual ignorance displayed throughout the entirety of this thread with respect to just basic statistics and simple statistical concepts is remarkably embarrassing. You should be ashamed of yourself. Go read a book. Or three.

The only person being embarrassed in this thread is you. You've progressed into actually having some points beyond claiming to be infallible and have a statistical model you can't show that proves your claims, but any good point in it is lost by the constant insults of everyone else here and the childish immaturity of your 'over the top brag - insult' pattern that never ceases. I'm well aware of what BABIP is and already said the defense behind the pitcher needs to be adjusted for. Regardless of what you claim, great contact pitchers find success at not giving up many runs, often equal to or even better than great K pitchers. Dismissing all non K centric pitchers, which seems to be your implied basis for ignoring Spahn but including his exact contemporary Koufax, is not supported by the data. It does not appear to be random luck, and they tend to have lower BABIP's over large sample sizes.

But I'm illiterate and homeless, among many other things.

Lorewalker · #**1116** 11-21-2021, 11:35 AM

Quote:

Originally Posted by BobC

LOL

Hopefully things will change, but the fact he got bounced off Blowout makes the question others have asked as to whether or not he's a troll, more possible than not I guess. He's a smart guy, just wish he'd be a little more open minded and realize he's not always going to be right. Oh well. Guess we'll wait to see what happens. I just put him on "Ignore" myself and don't read his posts anymore. It's better that way.

He seems to double down and then resort to putting everyone down in every thread in which his theories, which he presents as facts, are successfully challenged. He might be smart but he is not that bright.

Carter08 · #**1117** 11-21-2021, 12:30 PM

Maybe not the thread for this but why didn't the Yanks trot Ruth out to pitch more often? I assume it's the obvious - to keep him healthy and batting and if it ain't broke don't fix it.

Snowman · #**1118** 11-21-2021, 12:36 PM

Quote:

Originally Posted by Peter_Spaeth

Until he ended it with an intentional pass, Maddux once went 72 straight innings without a walk.

And when asked about his scoreless innings streak, his response was "honestly, it was mostly luck."

Carter08 · #**1119** 11-21-2021, 12:51 PM

Quote:

Originally Posted by Snowman

And when asked about his scoreless innings streak, his response was "honestly, it was mostly luck."

Humble guy. An admirable quality usually displayed by people confident they are good.

Mark17 · #**1120** 11-21-2021, 01:01 PM

Quote:

Originally Posted by Carter08

Maybe not the thread for this but why didn't the Yanks trot Ruth out to pitch more often? I assume it's the obvious - to keep him healthy and batting and if it ain't broke don't fix it.

From what I understand, Ruth didn't want to do both.

He should've said, "Pay me 2 salaries and I'll be a starting pitcher, and an outfielder." Of course, he ended up being paid more than the President of the country anyway......

Snowman · #**1121** 11-21-2021, 01:08 PM

Quote:

Originally Posted by Peter_Spaeth

Two points: one, you at first presented your "instincts" about Maddux as if they were facts you already knew. Read the language of your post.

Two, Kershaw's BABIP is 27 points below the ML average for his career. What's your take on that which obviously can't be explained by NL alone?

Again, you're conflating. The facts I already know are that a pitcher's BABIP regresses to the mean and each pitcher has little to no control over their values. What I was wrong about was that Maddux's values were 9 points below league average. But that still doesn't mean he is able to control his BABIP. If you look up his teammates, they too all beat the league average BABIP. In other words, the ballpark, pitching in the NL, and the defense behind him were responsible for most, if not all, of his ability to beat it.

As far as Kershaw goes, it appears to be the same thing. I just looked up 5 or 6 of his teammates over the years in LA to check their BABIP values. Greinke, Urias, Buehler, Jansen, Baez, all of them are 20 to 40 points below league average BABIP. Again, this means it is their defense, the fact that they all pitch in the NL, and the ballpark that account for the differences, not some magical ability that Kershaw possesses.

Snowman · #**1122** 11-21-2021, 01:10 PM

Quote:

Originally Posted by Carter08

Humble guy. An admirable quality usually displayed by people confident they are good.

Or perhaps more likely, he was simply a realist.

Peter_Spaeth · #**1123** 11-21-2021, 01:24 PM

Quote:

Originally Posted by Snowman

Again, you're conflating. The facts I already know are that a pitcher's BABIP regresses to the mean and each pitcher has little to no control over their values. What I was wrong about was that Maddux's values were 9 points below league average. But that still doesn't mean he is able to control his BABIP. If you look up his teammates, they too all beat the league average BABIP. In other words, the ballpark, pitching in the NL, and the defense behind him were responsible for most, if not all, of his ability to beat it.

As far as Kershaw goes, it appears to be the same thing. I just looked up 5 or 6 of his teammates over the years in LA to check their BABIP values. Greinke, Urias, Buehler, Jansen, Baez, all of them are 20 to 40 points below league average BABIP. Again, this means it is their defense, the fact that they all pitch in the NL, and the ballpark that account for the differences, not some magical ability that Kershaw possesses.

Is it also the case that pitchers regress to the mean in extra base hits and home runs against?

Snowman · #**1124** 11-21-2021, 01:28 PM

Quote:

Originally Posted by G1911

The only person being embarrassed in this thread is you. You’ve progressed into actually having some points beyond claiming to be infallible and have a statistical model you can’t show that proves your claims, but any good point in it is lost by the constant insults of everyone else here and the childish immaturity of your ‘over the top brag - insult’ pattern that never ceases. I’m well aware of what BABIP is and already said the defense behind the pitcher needs to be adjusted for. Regardless of what you claim, great contact pitchers find success at not giving up many runs, often equal to or even better than great K pitchers. Dismissing all non K centric pitchers, which seems to be your implied basis for ignoring Spahn but including his exact contemporary Koufax, is not supported by the data. It does not appear to be random luck, and they tend to have lower BABIP’s over large sample sizes.

But I’m illiterate and homeless, among many other things.

In one breath, you claim to understand BABIP and its implications, and in the very next breath you use the completely nonsensical term of "great contact pitchers" as if such a thing exists. This is what I'm trying to tell you. There is no such thing as a "great contact pitcher". They are the Loch Ness Monster of baseball. A myth. If you don't understand this, then you don't understand BABIP and why it is important.

This isn't exactly news either. Every franchise in the league today knows this. You might find some old school uneducated managers here and there who still reject it, but the front offices and owners across the league all accept this fundamental truth. It's been well known for the better part of 20 years now.

You should read this. It's a link to the original research article by the guy who discovered this fundamental truth about pitchers not being able to control contact after the pitch.

https://www.baseballprospectus.com/n...-hurlers-have/

Snowman · #**1125** 11-21-2021, 01:33 PM

Quote:

Originally Posted by Peter_Spaeth

Is it also the case that pitchers regress to the mean in extra base hits and home runs against?

No. They will regress to their own individual expected means, but not to the league averages. Bad pitchers serve up more meatballs than good pitchers. This is not contradictory to the discussion above.

Peter_Spaeth · #**1126** 11-21-2021, 01:39 PM

Quote:

Originally Posted by Snowman

No. They will regress to their own individual expected means, but not to the league averages. Bad pitchers serve up more meatballs than good pitchers. This is not contradictory to the discussion above.

If a pitcher like Maddux was better at keeping the ball in the park, and/or could limit extra base hits better, then that seems at least some evidence he could in fact control where/how hard the ball was hit against him, even if not reflected in batting average itself. Do you agree?

Take a hypothetical at bat, a bad pitcher hangs a curve and the batter hits it over the wall. Maddux paints the corner with a slider and the batter gets a bloop single off the end of the bat. Same BABIP but different (in most cases) outcome.

Bigdaddy · #**1127** 11-21-2021, 02:00 PM

In Sandy's own words:

"I became a good pitcher when I stopped trying to make them miss the ball and started trying to make them hit it."

And the whole idea of 'weak contact' is within the pitcher's control. Are they consistently ahead or behind in the count? Are they grooving a ball down the middle of the plate, or painting the corners? Are they disrupting a batter's timing? Great pitchers consistently pitch ahead in the count, paint the corners and keep batters off balance - and induce weak contact.

Snowman · #**1128** 11-21-2021, 02:03 PM

Quote:

Originally Posted by Peter_Spaeth

If a pitcher like Maddux was better at keeping the ball in the park, and/or could limit extra base hits better, then that seems at least some evidence he could in fact control where/how hard the ball was hit against him, even if not reflected in batting average itself. Do you agree?

Take a hypothetical at bat, a bad pitcher hangs a curve and the batter hits it over the wall. Maddux paints the corner with a slider and the batter gets a bloop single off the end of the bat. Same BABIP but different (in most cases) outcome.

I don't know about doubles and triples. I missed that part of your question. I'd have to look at that. My gut would tell me that they likely regress. But HR rates definitely do not regress to league averages.

Snowman · #**1129** 11-21-2021, 02:05 PM

Quote:

Originally Posted by Bigdaddy

In Sandy's own words:

"I became a good pitcher when I stopped trying to make them miss the ball and started trying to make them hit it."

That's also when his BB/9 rate fell though. And when his strike zone grew.

Peter_Spaeth · #**1130** 11-21-2021, 02:11 PM

Quote:

Originally Posted by Snowman

I don't know about doubles and triples. I missed that part of your question. I'd have to look at that. My gut would tell me that they likely regress. But HR rates definitely do not regress to league averages.

What we need, if it doesn't already exist, is a slugging average for balls in play stat.

On plain old SLG against, Maddux over his career was some 55 points below the average. That sounds meaningful? Especially since his BA against was 14 points better than average. A non-statistician would conclude from that he was limiting extra base hits pretty well.

AndrewJerome · #**1131** 11-21-2021, 02:30 PM

Great stuff guys. This is a fun thread.

A few things:

Snowman, if that model could be created it would be pretty cool. Obviously it would take a lot of work. I have a practical question. Sorry, I don't know all the terminology, and I really have no idea how a model like that works. If that model were to be created, how would information get processed through the model? For say 1953 or whatever year, would every stat for that year have to be manually input into the model?

The idea of how athletes evolve is interesting. Of course humans have slowly gotten bigger, faster, stronger etc over the past 130 years. However, for quality of play in baseball, I'm not sure it is as simple as every year we go forward the quality of play gets a little better. Obviously, there have been social changes that impact this greatly. Quality of play clearly went down during the war years of the early 1940s, and clearly went up in the late 1940s with integration. This is only a guess, but it seems to me, just brainstorming, that quality of play seems especially strong in the 1950s / early 1960s, and also from the late 1980s to around 2000. A high number of very elite players entered MLB in the 1950s. Mantle, Mays, Aaron, Clemente, Jackie Robinson, Frank Robinson, Snider, Berra, Campanella, Banks, Mathews, Koufax, Gibson etc. The upper tier HOFers are seemingly endless for the 1950s and moving to the 1960s for the end of their careers. But it seems like there were far fewer upper tier HOFers starting out in the 1960s. Brock, Rose, Morgan types are not nearly as impressive as the 1950s list. Similarly, upper tier HOFers starting out near 1970 and early to mid 1980s are not nearly as impressive as the 1950s list. 1970s you have Reggie, Schmidt, Brett, early to mid 1980s you have Rickey Henderson, Ripken, etc. but nowhere near the top end talent starting out in the 1950s. But then in the mid to late 1980s you add Bonds, Clemens, Griffey, Randy Johnson, Maddux, Pedro, Arod, Jeter, Frank Thomas etc., just a lot of top tier HOFers and it would seem like very high level of play. I guess my question is how much impact do high end HOFers have on the level of play for a time period? The flip side of the argument would be that the "average" type players increased in skill greatly over time, and the "average" players in the league getting better over time could be more impactful than the amount of top end talent at any one time. Anyway, fun stuff to think about.

Finally, my understanding is that a high or low BABIP generally is a lucky/unlucky stat. An unusually high (and out of line) BABIP for a pitcher would entail bad luck where a bunch of line drives and grounders happen to get hits. And an unusually low BABIP for a pitcher would be good luck where line drives seem to be hit right at guys etc. How much of BABIP is "good situational pitching" or "good situational defense"? Who knows. But this being said, Maddux is a fascinating pitcher. His control is obviously elite and close to best of all time for control. And not just throwing strikes, but the ability to nibble at the edges of the strike zone. This makes it very hard to make solid contact and should equate to a lower BABIP. That's just the eye test from watching him. Strikes that are on the corners are difficult to hit hard. If you rarely throw a meat ball and get lots of strikes on the corners then you'd think stats should follow the eye test, just because Maddux was so good with his control.

BobC · #**1132** 11-21-2021, 05:25 PM

Quote:

Originally Posted by Lorewalker

He seems to double down and then resort to putting everyone down in every thread in which his theories, which he presents as facts, are successfully challenged. He might be smart but he is not that bright.

Well, that is where the troll reference may become applicable. LOL It would seem he deep down must really get into these back and forths at some level, unfortunately, maybe more so than just the desire and interest in discussing such topics themselves that we typically end up doing on here from time to time. In other words, maybe he comes on sites like this looking for the arguments because that is what his psyche wants and needs, and doing so over the internet, he can still remain removed, somewhat anonymous, and thus feel safe. Which is kind of the definition for being a troll when you think about it. LOL

For example, it was much earlier in this thread that he appeared to get frustrated when people pushed back and didn't simply accept what he was saying, or the implied or overt insults. So he clearly and emphatically stated he was done with this, which pretty much every intelligent, normal person would take to mean he was done with responding and interacting with everyone on this thread anymore. Had he actually stuck to his word, I wonder if he wouldn't have garnered a little more respect from the crowd on here. But instead, it was just a few posts later, and he was right back at it without missing a beat. So does that point to some deeper, psychological urge or need, who knows?

On the positive side, even though I simply ignore and no longer waste my time reading his posts, in looking at what others are posting and saying in this thread, it appears he's finally admitting that he may have made some errant statements and that his statistical assumptions and conclusions may not in fact always be infallible. And if I'm right, good for him. He does have and makes some very intelligent and interesting points and comments. It's just that he doesn't seem to realize, or doesn't want to admit, that as good as statistical analysis can appear to be, in the end it is nothing more than a tool to hopefully allow someone to more accurately predict an outcome, like who's going to win the Super Bowl. Unfortunately, when their ability to predict outcomes like the winner of a Super Bowl begins to have some success, such people may then try to extend that tool to possibly use it for something else that is not a totally objective question, like deciding who the greatest lefty pitcher of all time is. That is clearly not an objective question, and has no absolutely certain outcome against which we can then actually measure the effectiveness that some statistical analysis may have in predicting it, at least not like knowing there will be an actual Super Bowl winner. And also extremely important (and maybe the MOST important thing of all), everyone knows, AND AGREES on, exactly how the Super Bowl winner is defined and decided. In the case of the greatest lefty of all time, we haven't even begun to decide on the correct definition of "greatest" yet, let alone the actual measures we will then use to POSSIBLY decide an answer, if it can even be done. And until that has been determined, everything is just someone's opinion, INCLUDING someone's statistical analysis.

And in regards to referring to statistics as just a tool.........

A statistician's wife has been bugging him for weeks to replace a light fixture on the ceiling, and he's finally going to get around to doing it (And without her having to pay him to do so, go figure!). Unfortunately, he needs a screwdriver to remove a few screws to get the job done, but doesn't have one. Well, he's up on the ladder already, so before getting down and then having to drive all the way to the store to buy a screwdriver, he goes digging around in his pocket and finds his penknife, and promptly uses that to remove the screws and complete the task. So he gets the job done using a tool that wasn't actually meant for what he ended up using it for. But he took a chance on guessing it might work and got lucky, like he got lucky to also just happen to have the penknife in his pocket when he most needed it to begin with. But before you go applauding the statistician for his fine work in completing the given task, and he triumphantly goes riding off into the sunset on his noble, white steed, with his beautiful and now forever grateful wife astride behind him, I have to finish the rest of the story.

Turns out that for maybe what little the statistician knew about tools, he knew even less about electricity. For while using his penknife to remove the screws and then replace the light fixture, he accidentally nicked some wires in the ceiling and unknowingly got them crossed. So once he had the fixture replaced, he joyously called his wife to come and flip the switch to see the new fixture working, and what a great job he had done. Unfortunately, the nicked and crossed wires created a short, which blew out the fuse box, and resulted in having to call in an electrician to fix everything, at a very hefty cost. And as a result, our woebegone hero ended up sleeping on the couch for the rest of the week. So much for our happy ending!

And as for statistics always being able to measure and actually predict human nature and outcomes, go read some Asimov!

Lorewalker · #**1133** 11-21-2021, 06:06 PM

Quote:

Originally Posted by BobC

Well, that is where the troll reference may become applicable. LOL It would seem he deep down must really get into these back and forths at some level, unfortunately, maybe more so than just the desire and interest in discussing such topics themselves that we typically end up doing on here from time to time. In other words, maybe he comes on sites like this looking for the arguments because that is what his psyche wants and needs, and doing so over the internet, he can still remain removed, somewhat anonymous, and thus feel safe. Which is kind of the definition for being a troll when you think about it. LOL

For example, it was much earlier in this thread that he appeared to get frustrated when people pushed back and didn't simply accept what he was saying, or the implied or overt insults. So he clearly and emphatically stated he was done with this, which pretty much every intelligent, normal person would take to mean he was done with responding and interacting with everyone on this thread anymore. Had he actually stuck to his word, I wonder if he wouldn't have garnered a little more respect from the crowd on here. But instead, it was just a few posts later, and he was right back at it without missing a beat. So does that point to some deeper, psychological urge or need, who knows?

On the positive side, even though I simply ignore and no longer waste my time reading his posts, in looking at what others are posting and saying in this thread, it appears he's finally admitting that he may have made some errant statements and that his statistical assumptions and conclusions may not in fact always be infallible. And if I'm right, good for him. He does have and makes some very intelligent and interesting points and comments. It's just that he doesn't seem to realize, or doesn't want to admit, that as good as statistical analysis can appear to be, in the end it is nothing more than a tool to hopefully allow someone to more accurately predict an outcome, like who's going to win the Super Bowl. Unfortunately, when their ability to predict outcomes like the winner of a Super Bowl begins to have some success, such people may then try to extend that tool to possibly use it for something else that is not a totally objective question, like deciding who the greatest lefty pitcher of all time is. That is clearly not an objective question, and has no absolutely certain outcome against which we can then actually measure the effectiveness that some statistical analysis may have in predicting it, at least not like knowing there will be an actual Super Bowl winner. And also extremely important (and maybe the MOST important thing of all), everyone knows, AND AGREES on, exactly how the Super Bowl winner is defined and decided. In the case of the greatest lefty of all time, we haven't even begun to decide on the correct definition of "greatest" yet, let alone the actual measures we will then use to POSSIBLY decide an answer, if it can even be done. And until that has been determined, everything is just someone's opinion, INCLUDING someone's statistical analysis.

And in regards to referring to statistics as just a tool.........

A statistician's wife has been bugging him for weeks to replace a light fixture on the ceiling, and he's finally going to get around to doing it (And without her having to pay him to do so, go figure!). Unfortunately, he needs a screwdriver to remove a few screws to get the job done, but doesn't have one. Well, he's up on the ladder already, so before getting down and then having to drive all the way to the store to buy a screwdriver, he goes digging around in his pocket and finds his penknife, and promptly uses that to remove the screws and complete the task. So he gets the job done using a tool that wasn't actually meant for what he ended up using it for. But he took a chance on guessing it might work and got lucky, like he got lucky to also just happen to have the penknife in his pocket when he most needed it to begin with. But before you go applauding the statistician for his fine work in completing the given task, and he triumphantly goes riding off into the sunset on his noble, white steed, with his beautiful and now forever grateful wife astride behind him, I have to finish the rest of the story.

Turns out that for maybe what little the statistician knew about tools, he knew even less about electricity. For while using his penknife to remove the screws and then replace the light fixture, he accidentally nicked some wires in the ceiling and unknowingly got them crossed. So once he had the fixture replaced, he joyously called his wife to come and flip the switch to see the new fixture working, and what a great job he had done. Unfortunately, the nicked and crossed wires created a short, which blew out the fuse box, and resulted in having to call in an electrician to fix everything, at a very hefty cost. And as a result, our woebegone hero ended up sleeping on the couch for the rest of the week. So much for our happy ending!

And as for statistics always being able to measure and actually predict human nature and outcomes, go read some Asimov!

He loves the attention. He will take any side of a debate simply to argue and be the contrarian. On many threads he has ended up having to back down, back off or admit he was wrong. It is truly hysterical. For him it is the sport of it. He does not care who he annoys or even how he comes off. Even our talking about him gets his juices flowing. An absolutely massive ego and has narcissism down pat.

Sure he is bright but he has no people skills and I would guess that it is not just here but out in the wild too. I cannot ignore him...I admit I have a weakness for his rants. Endlessly amusing to watch him carry on the same exact way each time.

G1911 · #**1134** 11-21-2021, 06:20 PM

Quote:

Originally Posted by Snowman

In one breath, you claim to understand BABIP and its implications, and in the very next breath you use the completely nonsensical term of "great contact pitchers" as if such a thing exists. This is what I'm trying to tell you. There is no such thing as a "great contact pitcher". They are the Loch Ness Monster of baseball. A myth. If you don't understand this, then you don't understand BABIP and why it is important.

This isn't exactly news either. Every franchise in the league today knows this. You might find some old school uneducated managers here and there who still reject it, but the front offices and owners across the league all accept this fundamental truth. It's been well known for the better part of 20 years now.

You should read this. It's a link to the original research article by the guy who discovered this fundamental truth about pitchers not being able to control contact after the pitch.

https://www.baseballprospectus.com/n...-hurlers-have/

And yet, throughout the entirety of baseball history, we have great pitchers who are not strikeout pitchers (and thus getting their outs on contact) having very long careers and performing far above most pitchers. If there is no such thing as a great contact pitcher, how are pitchers like Maddux great? Or do you think Maddux and the numerous other pitchers like him are all sheer luck?

I'm familiar with McCracken's article and Bill James' positive take on it. I think some of the points are true indeed. But I also am aware that some contact pitchers have high inning careers of greatness. These sample sizes seem unreasonable to chalk up to sheer dumb luck. If it was purely the team defense behind them, pitchers like Maddux and the number 5 starter on the team who isn't a strikeout pitcher would chalk up about the same numbers on the whole. Maddux is a good example, he wasn't a great K pitcher. He pitched to contact. And he won 4 ERA crowns, 4 FIP crowns, led the league in fewest hits per 9 once. How do we explain his 5,000IP career if contact pitchers are all bad or mediocre?

Are you capable of making any argument whatsoever without insulting anyone? I think you've actually started to bring up good points that can coalesce into a coherent, rational argument, but your absurd egotism and propensity to just resort to the ad hominem at every single turn obscures even your good points.

Carter08 · #**1135** 11-21-2021, 06:46 PM

Quote:

Originally Posted by G1911

And yet, throughout the entirety of baseball history, we have great pitchers who are not strikeout pitchers (and thus getting their outs on contact) having very long careers and performing far above most pitchers. If there is no such thing as a great contact pitcher, how are pitchers like Maddux great? Or do you think Maddux and the numerous other pitchers like him are all sheer luck?

I'm familiar with McCracken's article and Bill James' positive take on it. I think some of the points are true indeed. But I also am aware that some contact pitchers have high inning careers of greatness. These sample sizes seem unreasonable to chalk up to sheer dumb luck. If it was purely the team defense behind them, pitchers like Maddux and the number 5 starter on the team who isn't a strikeout pitcher would chalk up about the same numbers on the whole. Maddux is a good example, he wasn't a great K pitcher. He pitched to contact. And he won 4 ERA crowns, 4 FIP crowns, led the league in fewest hits per 9 once. How do we explain his 5,000IP career if contact pitchers are all bad or mediocre?

Are you capable of making any argument whatsoever without insulting anyone? I think you've actually started to bring up good points that can coalesce into a coherent, rational argument, but your absurd egotism and propensity to just resort to the ad hominem at every single turn obscures even your good points.

Plus one. And without looking at stats I will just say the eye test can tell a great pitcher. It’s fun to watch a guy where no one can touch the ball - thinking deGrom when he’s actually healthy - but it’s also fun to watch a guy that paints corners and throws junk down the middle that ends up with dribblers.

earlywynnfan · #**1136** 11-21-2021, 06:57 PM

Quote:

Originally Posted by Snowman

Again, you're conflating. The facts I already know are that a pitcher's BABIP regresses to the mean and each pitcher has little to no control over their values. What I was wrong about was that Maddux's values were 9 points below league average. But that still doesn't mean he is able to control his BABIP. If you look up his teammates, they too all beat the league average BABIP. In other words, the ballpark, pitching in the NL, and the defense behind him was responsible for most, if not all, of his ability to beat it.

As far as Kershaw goes, it appears to be the same thing. I just looked up 5 or 6 of his teammates over the years in LA to check their BABIP values. Greinke, Urias, Buehler, Jansen, Baez, all of them are 20 to 40 points below league average BABIP. Again, this means it is their defense, the fact that they all pitch in the NL, and the ballpark that account for the differences, not some magical ability that Kershaw possesses.

Back to the original Grove vs. Koufax line, can you please use your statistics to explain Koufax's widely disparate home vs. away records??

cardsagain74 · #**1137** 11-21-2021, 07:10 PM

Quote:

Originally Posted by G1911

I'm familiar with McCracken's article and Bill James' positive take on it. I think some of the points are true indeed. But I also am aware that some contact pitchers have high inning careers of greatness. These sample sizes seem unreasonable to chalk up to sheer dumb luck.

The article did have some good points, but I agree that its whole "FIP is all that matters" conclusion is too simplistic and goes too far. And some of the points were really grasping at straws; the quotes from Maddux and Pedro were an especially poor attempt to help prove the merits of the study (of course a long scoreless innings streak will have a lot of luck...what does that have to do with that specific discussion?)

I've noticed that when it comes to sports and gambling, statisticians love to claim as many "this is completely random" findings as they possibly can. A lot of that probably has to do with being the devil's advocate about the general public's often faulty attempts to find reason in trends or insufficient statistics.

And with having such a passion to do so, it's easy for them to go too far in the other direction (and be too quick to dismiss the possible meaning in some numbers)

BobC · #**1138** 11-21-2021, 07:56 PM

Quote:

Originally Posted by AndrewJerome

Great stuff guys. This is a fun thread.

A few things:

Snowman, if that model could be created it would be pretty cool. Obviously it would take a lot of work. I have a practical question. Sorry, I don’t know all the terminology, and I really have no idea how a model like that works. If that model were to be created, how would information get processed through the model? For say 1953 or whatever year, would every stat for that year have to be manually input into the model?

The idea of how athletes evolve is interesting. Of course humans have slowly gotten bigger, faster, stronger etc over the past 130 years. However, for quality of play in baseball, I’m not sure it is as simple as every year we go forward the quality of play gets a little better. Obviously, there have been social changes that impact this greatly. Quality of play clearly went down during the war years of the early 1940s, and clearly went up in the late 1940s with integration. This is only a guess, but it seems to me, just brainstorming, that quality of play seems especially strong in the 1950s / early 1960s, and also from the late 1980s to around 2000. A high number of very elite players entered MLB in the 1950s. Mantle, Mays, Aaron, Clemente, Jackie Robinson, Frank Robinson, Snider, Berra, Campanella, Banks, Mathews, Koufax, Gibson etc. The upper tier HOFers are seemingly endless for the 1950s and moving to the 1960s for the end of their careers. But it seems like there were far fewer upper tier HOFers starting out in the 1960s. Brock, Rose, Morgan types are not nearly as impressive as the 1950s list. Similarly, upper tier HOFers starting out near 1970 and early to mid 1980s are not nearly as impressive as the 1950s list. 1970s you have Reggie, Schmidt, Brett, early to mid 1980s you have Rickey Henderson, Ripken, etc. but nowhere near the top end talent starting out in the 1950s. But then in the mid to late 1980s you add Bonds, Clemens, Griffey, Randy Johnson, Maddux, Pedro, Arod, Jeter, Frank Thomas etc., just a lot of top tier HOFers and it would seem like a very high level of play. I guess my question is how much impact do high end HOFers have on the level of play for a time period? The flip side of the argument would be that the “average” type players increased in skill greatly over time, and the “average” players in the league getting better over time could be more impactful than the amount of top end talent at any one time. Anyway, fun stuff to think about.

Finally, my understanding is that a high or low BABIP generally is a lucky/unlucky stat. An unusually high (and out of line) BABIP for a pitcher would entail bad luck where a bunch of line drives and grounders happen to get hits. And an unusually low BABIP for a pitcher would be good luck where line drives seem to be hit right at guys etc. How much of BABIP is “good situational pitching” or “good situational defense”? Who knows. But this being said, Maddux is a fascinating pitcher. His control is obviously elite and close to best of all time for control. And not just throwing strikes, but the ability to nibble at the edges of the strike zone. This makes it very hard to make solid contact and should equate to a lower BABIP. That’s just the eye test from watching him. Strikes that are on the corners are difficult to hit hard. If you rarely throw a meatball and get lots of strikes on the corners then you’d think stats should follow the eye test, just because Maddux was so good with his control.

Andrew,

Some very insightful points. In particular about the measure of "luck" in regards to BABIP. Kind of like predicting the outcome of flipping a coin and whether it lands heads or tails. That outcome is always a 50/50 probability. And so over time, and all other things constant and equal and assuming a sufficient sample size, anyone flipping coins would eventually expect to see them ending up with exactly half heads, and half tails. To me, I've always thought of this as kind of what is meant by "regressing to the mean", in this case ending up 50/50 on heads or tails. But what is interesting is say you start out flipping coins to test this, and everything being constant and nothing abnormal with the coin, the first 9 flips all come out tails. Now the absolute probability of a head or a tail is still just 50/50 on that next, 10th flip, or is it? Since over a large enough sample size we expect the number of heads or tails to come up to regress to that expected mean of 50/50 for each of the two possible outcomes, if in starting out with getting tails 9 times in a row, you know you eventually have to start flipping heads, but the probability of each and every single flip is still always going to be just 50/50. So now you have somewhat of a paradox on what the actual probability of flipping a head or tail on all future attempts should be, at least it seems like one to me.
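The coin-flip puzzle above can be checked with a short sketch. It assumes a fair coin and independent flips, and the function names are illustrative rather than anything from the thread. Under those assumptions the 10th flip after nine tails is still 50/50; the overall share of heads drifts back toward one half only because later flips dilute the early streak, not because heads become more likely.

```python
import random


def expected_heads_fraction(initial_tails, later_flips):
    """Expected share of heads after a streak of tails plus more fair flips.

    The streak contributes zero heads; each later flip contributes 0.5 on average.
    """
    return (later_flips / 2) / (initial_tails + later_flips)


def tenth_flip_heads_rate(trials, seed=0):
    """Simulate many sequences and look only at those that open with 9 tails.

    Returns (share of heads on the 10th flip, number of such sequences).
    """
    rng = random.Random(seed)
    streaks = 0
    heads_on_tenth = 0
    for _ in range(trials):
        # Treat a draw below 0.5 as tails; all() stops at the first head.
        if all(rng.random() < 0.5 for _ in range(9)):
            streaks += 1
            if rng.random() >= 0.5:
                heads_on_tenth += 1
    if streaks == 0:
        raise ValueError("no 9-tail streaks observed; increase trials")
    return heads_on_tenth / streaks, streaks


if __name__ == "__main__":
    for n in (10, 100, 1000, 100000):
        print(n, round(expected_heads_fraction(9, n), 4))
    rate, streaks = tenth_flip_heads_rate(400_000)
    print(f"heads on 10th flip after 9 tails: {rate:.3f} over {streaks} streaks")
```

After nine tails and 1,000 further flips the expected share of heads is 500/1009, or about 0.4955: close to one half without the coin ever "owing" anyone heads.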

So now back to BABIP. The fact that you have some pitchers that appear to consistently be above or below the league average BABIP, all the time, leads me to believe there is something other than simple "luck" involved with them being able to do that. At what point (ie: sample size) will a statistician be comfortable in finally admitting there may be some other factor(s) or variable(s) that they haven't been able to effectively measure, quantify, and account for, and as a result just refer to it as "luck"? For wouldn't it be true that if they had been able to somehow measure and include all the pertinent factors and variables in their formulas, such as a pitcher like Maddux's ability to have batters consistently not hit the ball hard or cleanly, that those formulas would in fact show where all things do eventually regress to a mean? Just like they do in the case of flipping coins where it will eventually always come back to show a 50/50 heads or tails probability. In other words, in the case of BABIP, if the statisticians could effectively factor in ALL variables and factors, there would be no outliers, like a Maddux maybe, sitting significantly outside the mean, unless explainable by some other variable or factor, like a lack of a sufficient sample size. But to just simply explain these outliers by attributing those differences to such an amorphous concept or idea as luck, leads me to believe there is an inability, or unwillingness, on the part of those performing the statistical analysis to effectively be able to find and include all the pertinent variables and factors in their formulas. Thus making BABIP maybe the best statistical tool for its intended purpose they can do for now, but ultimately not the best and closest statistical measure or tool currently out there for use that it could be.
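The sample-size question can be made concrete with a rough sketch. It treats every ball in play as an independent trial with the same hit probability, which is a simplification. The league BABIP of 0.290 and the ball-in-play counts (500 for a season, 15,000 for a long career) are illustrative assumptions, not figures from the thread; only the 9-point gap comes from the discussion above.

```python
import math


def babip_standard_error(league_babip, balls_in_play):
    """Binomial standard error of an observed BABIP over a number of balls in play."""
    return math.sqrt(league_babip * (1 - league_babip) / balls_in_play)


def gap_in_standard_errors(gap, league_babip, balls_in_play):
    """How many standard errors a given BABIP gap represents at this sample size."""
    return gap / babip_standard_error(league_babip, balls_in_play)


LEAGUE_BABIP = 0.290  # assumed league average, for illustration only
GAP = 0.009           # the 9-point gap discussed in the thread

if __name__ == "__main__":
    # Ball-in-play counts below are hypothetical placeholders.
    for label, bip in [("one season", 500), ("long career", 15000)]:
        se = babip_standard_error(LEAGUE_BABIP, bip)
        z = gap_in_standard_errors(GAP, LEAGUE_BABIP, bip)
        print(f"{label}: SE = {se:.4f}, 9-point gap = {z:.2f} standard errors")
```

Under these assumptions a single season has a standard error of about 20 points, so a 9-point gap is well inside the noise, while over 15,000 balls in play the same gap is roughly 2.4 standard errors. The calculation says nothing about whether the credit belongs to the pitcher, the defense, or the park.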

BobC · #**1139** 11-21-2021, 08:19 PM

Quote:

Originally Posted by Lorewalker

He loves the attention. He will take any side of a debate simply to argue and be the contrarian. On many threads he has ended up having to back down, back off or admit he was wrong. It is truly hysterical. For him it is the sport of it. He does not care who he annoys or even how he comes off. Even our talking about him gets his juices flowing. An absolutely massive ego and has narcissism down pat.

Sure he is bright but he has no people skills and I would guess that it is not just here but out in the wild too. I cannot ignore him...I admit I have a weakness for his rants. Endlessly amusing to watch him carry on the same exact way each time.

Chase,

Agree, agree, agree. Plus, you just made a very enlightening comment I had thought about as well, but hadn't shared yet. You mentioned a possible lack of people skills, which can often go along with other factors, like sitting in an office all day just running numbers and never really interacting with anyone to ever be able to develop such people skills. Isn't it often true that people will tend to gravitate towards work and professions that most often mirror, or at least coincide with a large part of, their personalities? Assuming so, maybe he just needs to get out more. The next Net54 dinner/get together at the upcoming National would be perfect, don't you think?

Mark17 · #**1140** 11-21-2021, 09:15 PM

Quote:

Originally Posted by BobC

Chase,

Agree, agree, agree. Plus, you just made a very enlightening comment I had thought about as well, but hadn't shared yet. You mentioned a possible lack of people skills, which can often go along with other factors, like sitting in an office all day just running numbers and never really interacting with anyone to ever be able to develop such people skills. Isn't it often true that people will tend to gravitate towards work and professions that most often mirror, or at least coincide with a large part of, their personalities? Assuming so, maybe he just needs to get out more. The next Net54 dinner/get together at the upcoming National would be perfect, don't you think?

If he's as good at building predictive models as he says, I'd rather take him to Vegas.

Carter08 · #**1141** 11-21-2021, 09:18 PM

Very true

Lorewalker · #**1142** 11-21-2021, 09:22 PM

Quote:

Originally Posted by BobC

Chase,

Agree, agree, agree. Plus, you just made a very enlightening comment I had thought about as well, but hadn't shared yet. You mentioned a possible lack of people skills, which can often go along with other factors, like sitting in an office all day just running numbers and never really interacting with anyone to ever be able to develop such people skills. Isn't it often true that people will tend to gravitate towards work and professions that most often mirror, or at least coincide with a large part of, their personalities? Assuming so, maybe he just needs to get out more. The next Net54 dinner/get together at the upcoming National would be perfect, don't you think?

Bob I think he should be a guest speaker. He can berate, mock and chastise everyone in attendance. His message really gets lost because of his method of delivery. Not sure what happened on Blowout but shortly after arriving here he made it clear he was only here to mix it up. One thing is absolute and that is he is great for page views. Not a single thread he has posted on has been boring.

BobC · #**1143** 11-21-2021, 09:24 PM

Quote:

Originally Posted by Mark17

If he's as good at building predictive models as he says, I'd rather take him to Vegas.

LOL

I don't know if he can work them up fast enough for each hand of blackjack, roll of the dice, or spin of the roulette wheel though. Plus, he'll probably want you to pay him up front, win or lose.

BobC · #**1144** 11-21-2021, 09:29 PM

Quote:

Originally Posted by Lorewalker

Bob I think he should be a guest speaker. He can berate, mock and chastise everyone in attendance. His message really gets lost because of his method of delivery. Not sure what happened on Blowout but shortly after arriving here he made it clear he was only here to mix it up. One thing is absolute and that is he is great for page views. Not a single thread he has posted on has been boring.

O------M------G!!!!!!

That's it. He must have been hired to increase posts and site hits. That's brilliant!

Snowman · #**1145** 11-21-2021, 11:27 PM

Quote:

Originally Posted by earlywynnfan

Back to the original Grove vs. Koufax line, can you please use your statistics to explain Koufax's widely disparate home vs. away records??

His away numbers are worse, for sure, but I don't know that I'd call them "widely disparate". There's an expected disparity for all pitchers when pitching at home and on the road. Part of the "home field advantage" in baseball comes from an umpire's subconscious bias in calling balls and strikes, just like in basketball with fouls. Even when they are trying their best to be neutral, it is somehow still human nature to call the games more favorably for the home team than the away team. The effect is small, but measurable over the course of a career. When you look at Koufax's career Home vs Away numbers, they don't really look all that out of line to me when you consider the fact that he pitched in a pitcher's park. Here's what I see. Note he had almost identical IPs for both. Also, ERA values are much more reliable over the course of a career with 1,000+ IP, so it's fair to look at those in the context of a career of this length, whereas it wouldn't be from season to season.

Home IP: 1158.0
Away IP: 1166.1

ERA Home: 2.48
ERA Away: 3.04

BB/9 Home: 2.9
BB/9 Away: 3.4

K/9 Home: 9.5
K/9 Away: 9.1

WHIP Home: 1.045
WHIP Away: 1.167

HR% Home: 2.2%
HR% Away: 2.1%

BABIP Home: 0.252
BABIP Away: 0.266

When I look at those numbers, the most interesting difference to me is the BB/9 rate. That's a significant gap, and one that definitely has an impact on his WHIP delta. Why was he walking more batters outside of LA? That's not a park effect. Some small disparity exists from umpire subconscious bias as I mentioned, but not that much, I wouldn't think. The differences in BABIP are probably entirely explainable through park differences and his BB/9 & K/9 rates. I don't think there's much delta attributable to luck over that sample size, and the delta is narrow enough that it is within expectation. There is an expectation also though of a player's general discomfort level when on the road. People just perform better at home. I definitely acknowledge he was better at home than on the road, but I don't see anything that looks wildly out of line with expectations. The BB/9 rate is the most interesting part to me though. Pitching in Dodger Stadium definitely helped too.
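To put a rough number on how much of the WHIP gap the walk rate accounts for, here is a small sketch using only the home/away splits listed above. WHIP is walks plus hits per inning, so BB/9 divided by 9 is the walk component of WHIP. The inputs are rounded, so the resulting share is approximate, and the variable names are illustrative.

```python
# Home/away splits as quoted in the post above.
HOME = {"bb9": 2.9, "whip": 1.045}
AWAY = {"bb9": 3.4, "whip": 1.167}


def walks_per_inning(bb9):
    """Convert a walks-per-nine-innings rate into walks per inning."""
    return bb9 / 9


# Total gap in baserunners (walks + hits) per inning between road and home.
whip_gap = AWAY["whip"] - HOME["whip"]

# Part of that gap that comes from walks alone.
walk_gap = walks_per_inning(AWAY["bb9"]) - walks_per_inning(HOME["bb9"])

# The remainder is extra hits per inning on the road.
hit_gap = whip_gap - walk_gap

walk_share = walk_gap / whip_gap

if __name__ == "__main__":
    print(f"WHIP gap:  {whip_gap:.3f} baserunners per inning")
    print(f"from BB:   {walk_gap:.3f} ({walk_share:.0%} of the gap)")
    print(f"from hits: {hit_gap:.3f}")
```

On these rounded figures the extra walks account for a bit under half of the road WHIP gap, with the rest coming from additional hits.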

Snowman · #**1146** 11-22-2021, 12:32 AM

Quote:

Originally Posted by cardsagain74

The article did have some good points, but I agree that its whole "FIP is all that matters" conclusion is too simplistic and goes too far. And some of the points were really grasping at straws; the quotes from Maddux and Pedro were an especially poor attempt to help prove the merits of the study (of course a long scoreless innings streak will have a lot of luck...what does that have to do with that specific discussion?)

I've noticed that when it comes to sports and gambling, statisticians love to claim as many "this is completely random" findings as they possibly can. A lot of that probably has to do with being the devil's advocate about the general public's often faulty attempts to find reason in trends or insufficient statistics.

And with having such a passion to do so, it's easy for them to go too far in the other direction (and be too quick to dismiss the possible meaning in some numbers)

The underlying problem is that every statistic you read really should come with a confidence interval attached to it. But of course that's just too confusing for most people, and it would probably just annoy everyone. Plus, it's just impractical. But the reality for most of these statistics is that they are actually estimates of the athlete's underlying "true" abilities. Mike Trout's "true" batting average is some unknowable number, but we can estimate it using statistics. And that's precisely what we do.

After the first game, he goes 3 for 4, so we estimate it to be 0.750. Well, that's not going to fool anyone, because nobody hits 0.750, so we wait for more data. After a month, he's still hitting 0.414 though. Hell, by the all-star break, he's still hitting 0.392. That's after nearly 100 games and 400 at-bats! Surely, that's a large sample, right? Has he turned a corner? Rumors start spreading about him "putting in work in the off-season". They say he's "really focused now", etc.

But none of this fools the statistician, because we don't read his batting average as 0.392. We understand that 0.392 is just an estimate of his "true" batting average and that we can calculate a 95% confidence interval around this estimate by looking at the standard deviation and sample size associated with it. So, instead of reading it as being 0.392, we more appropriately read it as something like 0.392 +/- 0.130. In other words, his "true" batting average is 95% likely to be in the range of 0.262 to 0.522, which ultimately just isn't all that helpful. Because we know this, we are hesitant to say things like "Trout is a better hitter this season than Harper since Trout is hitting 0.392 and Harper is only hitting 0.333 at the all-star break". The truth is, we just don't have enough data to make that determination. The sample sizes are simply too small, the standard deviation is too large, and thus the confidence intervals are too wide to be able to make claims "with confidence" about that statistic.
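Here's a minimal sketch of how such an interval can be computed, assuming each at-bat is an independent trial with the same hit probability (the textbook normal approximation to a binomial proportion). That assumption is mine, not necessarily the method behind the figures above, and the function name is made up for illustration.

```python
from math import sqrt


def batting_avg_interval(avg, at_bats, z=1.96):
    """Approximate 95% interval for a batting average.

    Assumes independent at-bats with a constant hit probability
    (normal approximation to the binomial). Returns (low, high, half_width).
    """
    std_err = sqrt(avg * (1.0 - avg) / at_bats)
    half_width = z * std_err
    return avg - half_width, avg + half_width, half_width


def overlaps(a, b):
    """True if two (low, high, ...) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


trout = batting_avg_interval(0.392, 400)
harper = batting_avg_interval(0.333, 400)

for name, (low, high, half) in (("Trout", trout), ("Harper", harper)):
    print(f"{name}: ({low:.3f}, {high:.3f}), +/- {half:.3f}")
print("intervals overlap:", overlaps(trout, harper))
```

Under that simple model, 400 at-bats gives a half-width closer to 0.05 than to the illustrative 0.130 above, so treat the specific figure as a placeholder. The Trout versus Harper conclusion is the same either way, because the two intervals still overlap.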

The same is true for something like ERA from season to season. It is a highly volatile statistic. When we say something like "it has too much variance", we mean that literally. Mathematically speaking, variance is the square of the standard deviation. Some statistics have extremely wide standard deviations, like ERA, batting average, OBP, etc., whereas other statistics have MUCH lower variance/standard deviations. Stats like FIP vary far less than ERA. This means we can compare two pitchers at the all-star break with much greater confidence by comparing their FIPs than we can by comparing their ERAs. It is a mathematical property of the inherent differences between those statistics.

The same is true of K/9 and BB/9. They have lower variance than ERA, and thus have much narrower confidence intervals. A statistician might be able to read Koufax's K/9 rate at the all-star break with a fairly narrow confidence interval because of this. So they might read his K/9 of 10.1 as being something like 10.1 +/- 0.4, making comparisons against other pitchers much more possible. If two pitchers' confidence intervals do not overlap, then you can say with 95% confidence that one is better than the other. For example, Koufax is a better strikeout pitcher because his 10.1 +/- 0.4 K/9, or read as an interval, (9.7, 10.5), sits entirely above some other pitcher's K/9 confidence interval of (8.8, 9.6). Note the bottom of Koufax's range (9.7) exceeds the top of the other pitcher's range (9.6), so we can state with confidence that he is indeed better.

However, this is rarely possible to say with ERAs. The confidence intervals with those are just absolutely massive, even after an entire season. One pitcher's ERA of 2.60 may look quite a bit better than someone else's 3.05, but we just can't state that with confidence because their intervals might be something like 2.60 +/- 0.75 and 3.05 +/- 0.65, resulting in ranges of (1.85, 3.35) and (2.40, 3.70). And since those intervals overlap, we cannot state with confidence that they are truly different or that one is clearly better than the other. This is why an asshole like myself says something along the lines of, "that doesn't mean shit", whereas someone more tolerant might say something like, "the standard deviations of that statistic are too wide and the sample sizes are too small for us to be able to make a determination about the differences between those two data points".

One of the most fascinating aspects about baseball, which is probably a big part of why I love the game as much as I do, is that the game truly is subject to a MASSIVE amount of variance. Great hitters can hit 0.348 one season and 0.274 the next. People will come up with all sorts of explanations about what is causing the slump, whether his home life is affecting him too much, if he's injured or just experiencing a mental lapse, etc. However, the informed fan knows that this is simply within expectations, and looks to statistics like BABIP to help shed light on what the actual underlying cause is. Maybe the guy just got some lucky bounces last season and some unfavorable ones this season. Or perhaps he didn't. Perhaps his BABIPs are the same, and there actually really is something going on in his personal life or he really is injured. But variance/luck needs to be ruled out first, because if it's present, then you already have your answer.

This is also precisely why I stated earlier that I see no reason to believe that Randy Johnson was tanking games in Seattle in 1998 before being traded to Houston that season. At first glance, his numbers appear to tell a significantly different story (ERA of 4.33 in Seattle and 1.28 in Houston). But when you dig in closer and look at the confidence intervals associated with those deltas, and look at his FIP, K/9, and BABIP values, and the confidence intervals around those, you'll see that they all overlap. We simply don't have enough data to say that those numbers are truly different, even though they certainly appear to be, and read that way to the non-statistician.
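The overlap test described above is simple enough to write down. This is only a sketch using the illustrative K/9 and ERA intervals from this post; the helper names are made up, and the half-widths are the "something like" figures, not measured values.

```python
def interval(estimate, half_width):
    """Turn 'estimate +/- half_width' into a (low, high) pair."""
    return (estimate - half_width, estimate + half_width)


def overlaps(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# K/9 example: Koufax at 10.1 +/- 0.4 versus another pitcher at (8.8, 9.6).
koufax_k9 = interval(10.1, 0.4)
other_k9 = (8.8, 9.6)

# ERA example: 2.60 +/- 0.75 versus 3.05 +/- 0.65.
era_a = interval(2.60, 0.75)
era_b = interval(3.05, 0.65)

print("K/9 intervals overlap:", overlaps(koufax_k9, other_k9))
print("ERA intervals overlap:", overlaps(era_a, era_b))
```

The K/9 intervals stay apart, so the strikeout comparison holds up, while the ERA intervals overlap and the comparison stays inconclusive.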

But these things do in fact matter. This isn't just some statistician's "opinion". We can actually calculate these things mathematically. We can also calculate the precise probability that pitcher A will have a lower ERA than pitcher B by the end of the season based on their differences at the all-star break. And if the formula says that pitcher A is 50% likely to have a higher ERA than pitcher B, based on their current ERAs and the confidence intervals associated with them, and if we run those comparisons for all pitchers in the league, we really will be "wrong" on 50% of them at the end of the season because these confidence intervals are real-world probabilities that will play out in the future. That's the beauty of the discipline of statistics. It's all based on sound theory that has been proven mathematically.
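As a rough sketch of that kind of calculation, here is one way to turn two estimates and their 95% half-widths into a probability that pitcher A's "true" ERA is lower than pitcher B's. It assumes both estimates are independent and approximately normal, which is my simplification and not necessarily the exact formula meant above. It reuses the illustrative ERA figures from earlier in this post, and the function names are hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_a_lower(est_a, half_a, est_b, half_b, z95=1.96):
    """P(true value of A < true value of B) under a normal approximation.

    half_a and half_b are 95% half-widths, so dividing by z95 recovers
    each standard error. Assumes the two estimates are independent.
    """
    se_a = half_a / z95
    se_b = half_b / z95
    z = (est_b - est_a) / sqrt(se_a ** 2 + se_b ** 2)
    return normal_cdf(z)


# Illustrative figures: 2.60 +/- 0.75 versus 3.05 +/- 0.65.
p = prob_a_lower(2.60, 0.75, 3.05, 0.65)
print(f"P(A's true ERA is lower than B's) is about {p:.2f}")
```

With these inputs the answer is clearly better than a coin flip but well short of certainty, which is the practical meaning of "the intervals overlap".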

Snowman · #**1147** 11-22-2021, 12:47 AM

Quote:

Originally Posted by BobC

Andrew,

Some very insightful points. In particular about the measure of "luck" in regards to BABIP. Kind of like predicting the outcome of flipping a coin and whether it lands heads or tails. That outcome is always a 50/50 probability. And so over time, and all other things constant and equal and assuming a sufficient sample size, anyone flipping coins would eventually expect to see them ending up with exactly half heads, and half tails. To me, I've always thought of this as kind of what is meant by "regressing to the mean", in this case ending up 50/50 on heads or tails. But what is interesting is say you start out flipping coins to test this, and everything being constant and nothing abnormal with the coin, the first 9 flips all come out tails. Now the absolute probability of a head or a tail is still just 50/50 on that next, 10th flip, or is it? Since over a large enough sample size we expect the number of heads or tails to come up to regress to that expected mean of 50/50 for each of the two possible outcomes, if in starting out with getting tails 9 times in a row, you know you eventually have to start flipping heads, but the probability of each and every single flip is still always going to be just 50/50. So now you have somewhat of a paradox on what the actual probability of flipping a head or tail on all future attempts should be, at least it seems like one to me.

This is referred to as the "Gambler's Fallacy".

From Wikipedia - The gambler's fallacy, also known as the Monte Carlo fallacy or the fallacy of the maturity of chances, is the incorrect belief that, if a particular event occurs more frequently than normal during the past, it is less likely to happen in the future (or vice versa), when it has otherwise been established that the probability of such events does not depend on what has happened in the past. Such events, having the quality of historical independence, are referred to as statistically independent. The fallacy is commonly associated with gambling, where it may be believed, for example, that the next dice roll is more than usually likely to be six because there have recently been fewer than the usual number of sixes.
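A quick simulation makes the point concrete. This sketch assumes a fair coin with independent flips and uses Python's built-in pseudo-random generator; the function name and the default sizes are arbitrary choices for illustration.

```python
import random


def next_flip_after_streak(streak_len=9, n_sequences=1_000_000, seed=0):
    """Estimate P(heads) on the flip right after `streak_len` tails in a row.

    Simulates many independent sequences of fair coin flips and keeps only
    those that start with `streak_len` consecutive tails.
    Returns (number_of_streaks_found, share_of_heads_on_next_flip).
    """
    rng = random.Random(seed)

    def is_tails():
        return rng.random() < 0.5

    streaks = 0
    heads_next = 0
    for _ in range(n_sequences):
        # all() stops at the first heads, so most sequences end early.
        if all(is_tails() for _ in range(streak_len)):
            streaks += 1
            if not is_tails():
                heads_next += 1
    share = heads_next / streaks if streaks else float("nan")
    return streaks, share


streaks, share_heads = next_flip_after_streak()
print(f"{streaks} streaks of 9 tails; heads on the next flip: {share_heads:.3f}")
```

The share of heads on the 10th flip should land near one half. The long-run proportion drifts toward 50/50 because later flips dilute the early streak, not because the coin makes up for it.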

Snowman · #**1148** 11-22-2021, 01:28 AM

Quote:

Originally Posted by Mark17

If he's as good at building predictive models as he says, I'd rather take him to Vegas.

Quote:

Originally Posted by BobC

LOL

I don't know if he can work them up fast enough for each hand of blackjack, roll of the dice, or spin of the roulette wheel though. Plus, he'll probably want you to pay him up front, win or lose.

As others have pointed out previously, there's a reason my nickname is "Rainman". I was a math savant as a kid, and I'm autistic. Perhaps ironically, I also spent the first half of my adult life as a professional "gambler" before joining the corporate world.

Snowman · #**1149** 11-22-2021, 03:13 AM

Quote:

Originally Posted by BobC

So now back to BABIP. The fact that you have some pitchers that appear to consistently be above or below the league average BABIP, all the time, leads me to believe there is something other than simple "luck" involved with them being able to do that. At what point (i.e., sample size) will a statistician be comfortable in finally admitting there may be some other factor(s) or variable(s) that they haven't been able to effectively measure, quantify, and account for, and as a result just refer to it as "luck". For wouldn't it be true that if they had been able to somehow measure and include all the pertinent factors and variables in their formulas, such as a pitcher like Maddux's ability to have batters consistently not hit the ball hard or cleanly, that those formulas would in fact show where all things do eventually regress to a mean. Just like they do in the case of flipping coins where it will eventually always come back to show a 50/50 heads or tails probability. In other words, in the case of BABIP, if the statisticians could effectively factor in ALL variables and factors, there would be no outliers, like a Maddux maybe, sitting significantly outside the mean, unless explainable by some other variable or factor, like a lack of a sufficient sample size. But to just simply explain these outliers by attributing those differences to such an amorphous concept or idea as luck, leads me to believe there is an inability, or unwillingness, on the part of those performing the statistical analysis to effectively be able to find and include all the pertinent variables and factors in their formulas. Thus making BABIP maybe the best statistical tool for its intended purpose they can do for now, but ultimately not the best and closest statistical measure or tool currently out there for use that it could be.

While it may be true that certain pitchers consistently outperform league average BABIP values, that doesn't mean that the pitchers themselves are responsible. As I mentioned earlier, pitching BABIP values revert to the mean, but if you want to get more specific, they actually revert to their current team's unknowable true mean BABIP against, which we can estimate with sample data. I say current team because that team's value depends on their defensive capabilities and/or motivations (the "dog days of summer" is a very real thing, and teams out of contention behave in strange ways, but that's for another thread) at any given point in time. The ballpark is also a factor, as well as playing in the NL or AL as mentioned previously.

Doing some "back of the napkin" math, here's a quick breakdown of how those numbers might look. I say "might" because I'm just using quick maximum likelihood estimates using means here from the past 3 seasons to break these values apart, but it's probably at least directionally accurate with at least a handful of outliers. We could get a much more precise breakdown of how these BABIP values vary by team with more data and by solving for it with a system of equations using linear algebra by setting them all up in a matrix and inverting it, but I'm too lazy to do that. Well, that and it takes more time. But these values were easy to find for each team over the past 3 seasons for both home and away stats, so I did some quick math to break the numbers out into various attributions.

Notes/caveats - The standard deviation in this sample data is ~0.018, or 18 BABIP points, so these are loose estimates, and the true values will vary. But it's still a useful exercise in at least understanding how some pitchers can seemingly be able to "beat the system", when in actuality, they are just benefitting from being in the right park on the right teams. We could test this theory by looking at how pitchers perform before vs after being traded (you'd have to look at the population of all traded pitchers as a whole though, not just a few of them). The column "3Y_Delta" represents the delta between a team's 3-year average BABIP and the MLB average BABIP.

Anyhow, some interesting takeaway approximations from these loose estimates are:

Pitching in the NL appears to be worth a mere -0.002 of BABIP, or 2 points (not sure I buy this, I'd like more data)
Home field advantage is worth around -0.006 of BABIP, or 6 points

Note, the data sufficiently explains the "Koufax effect". And while I didn't run the numbers for the full league back when Maddux was pitching, I did spot check several of the other pitchers on his team during that era, and it appears to sufficiently explain the "Maddux effect" as well.

Per my rough calculations, it appears as though the Dodgers' advantage is more attributable to their defensive abilities than it is to the ballpark. While not all of the values should pass the "smell test" (sample sizes, confidence intervals, blah blah blah), you'll notice that many/most? of these do (e.g., Colorado and Boston have terrible park BABIP effects while Seattle, St Louis, and San Diego all show as being clear pitcher's parks). Also worth noting is that the teams that do not pass the smell test are more likely to be the teams whose actual BABIP values varied wildly over the past 3 seasons (like the Chicago White Sox, whose ballpark factor of -0.030 does not pass the smell test, but whose BABIP values were all over the place the past 3 seasons with 0.292, 0.268, and 0.303).
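To put a number on "all over the place", here's a small sketch using the three White Sox season values quoted above. The league average is NOT given in this post, so `LEAGUE_AVG_PLACEHOLDER` is a made-up value included only to show how a column like "3Y_Delta" could be computed; swap in the real figure before reading anything into the delta.

```python
from statistics import mean, stdev

# Chicago White Sox BABIP against over the past 3 seasons, as quoted above.
white_sox = [0.292, 0.268, 0.303]

# Hypothetical placeholder, NOT a real league figure.
LEAGUE_AVG_PLACEHOLDER = 0.295


def three_year_delta(team_values, league_avg):
    """Team's 3-year average BABIP minus the league average."""
    return mean(team_values) - league_avg


three_year_avg = mean(white_sox)
season_spread = stdev(white_sox)  # sample standard deviation across seasons
delta = three_year_delta(white_sox, LEAGUE_AVG_PLACEHOLDER)

print(f"3-year average: {three_year_avg:.3f}")
print(f"season-to-season standard deviation: {season_spread:.3f}")
print(f"delta vs placeholder league average: {delta:+.3f}")
```

With only three seasons per team, a swing of that size in a single season can move the 3-year average by several points, which is one reason the individual team estimates should be read loosely.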

I'm not well-tuned to the current defensive abilities of each team though. Perhaps someone paying closer attention can look at these numbers and see if they look directionally accurate as a group. Although also worth pointing out is that these defensive BABIP values are made up of both the players' abilities and team strategies like when to play the shift and where to place your players. Teams that are heavily invested in analytics certainly outperform other teams that are not, with respect to these defensive BABIP values.

See my "back of the napkin math" table below which rank orders teams by their overall 3-year average BABIP performances. Note, we would expect pitchers on these teams to have BABIPs that regress to their team averages more so than to the league average.

Aquarian Sports Cards · #**1150** 11-22-2021, 07:57 AM

Quote:

Originally Posted by Snowman

His away numbers are worse, for sure, but I don't know that I'd call them "widely disparate". There's an expected disparity for all pitchers when pitching at home and on the road. Part of the "home field advantage" in baseball comes from an umpire's subconscious bias in calling balls and strikes, just like in basketball with fouls. Even when they are trying their best to be neutral, it is somehow still human nature to call the games more favorably for the home team than the away team. The effect is small, but measurable over the course of a career. When you look at Koufax's career Home vs Away numbers, they don't really look all that out of line to me when you consider the fact that he pitched in a pitcher's park. Here's what I see. Note he had almost identical IPs for both. Also, ERA values are much more reliable over the course of a career with 1,000+ IP, so it's fair to look at those in the context of a career of this length, whereas it wouldn't be from season to season.

Home IP: 1158.0
Away IP: 1166.1

ERA Home: 2.48
ERA Away: 3.04

BB/9 Home: 2.9
BB/9 Away: 3.4

K/9 Home: 9.5
K/9 Away: 9.1

WHIP Home: 1.045
WHIP Away: 1.167

HR% Home: 2.2%
HR% Away: 2.1%

BABIP Home: 0.252
BABIP Away: 0.266

When I look at those numbers, the most interesting difference to me is the BB/9 rate. That's a significant gap, and one that definitely has an impact on his WHIP delta. Why was he walking more batters outside of LA? That's not a park effect. Some small disparity exists from umpire subconscious bias as I mentioned, but not that much, I wouldn't think. The differences in BABIP are probably entirely explainable through park differences and his BB/9 & K/9 rates. I don't think there's much delta attributable to luck over that sample size, and the delta is narrow enough that it is within expectation. There is also an expectation of a player's general discomfort level when on the road. People just perform better at home. I definitely acknowledge he was better at home than on the road, but I don't see anything that looks wildly out of line with expectations. The BB/9 rate is the most interesting part to me though. Pitching in Dodger Stadium definitely helped too.

If you know you are a little more hittable away from your home park, might you not be inclined to nibble a bit more?
