Quote:
Originally Posted by Peter_Spaeth
What if Josh Gibson's batting average was substantially lower in the 50 percent (or maybe it's higher) of games for which there are no stats? Leaving aside all the equality arguments, the incompleteness of the stats makes comparisons dubious.
From a statistical perspective, it is highly unlikely that his batting average in the missing games is significantly different from his average in the games we do have records for. My stats are a bit rusty, since I haven’t used them since studying for my Lean Six Sigma Green Belt last year, but if we assume we only have details on 50% of his at-bats and treat those as a random sample, a sample size of 2526 yields a margin of error of 2.57% at a 99% confidence level. To put it in more colloquial terms, there is a 99% chance Gibson’s real batting average is within plus or minus .026 of the known average of .373.
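For anyone who wants to check the arithmetic, here is a rough Python sketch of the usual normal-approximation margin of error for a proportion. It assumes the recorded at-bats behave like a simple random sample of independent hit-or-no-hit trials, which is a big assumption, and the function name is just something I made up for illustration. With the worst-case proportion of 0.5 it lands close to the 2.57% figure; plugging in .373 itself gives a slightly narrower range.

```python
import math
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.99):
    """Normal-approximation margin of error for a sample proportion.

    Assumes n independent trials drawn as a simple random sample
    (an assumption for this sketch, not something the records guarantee).
    """
    # Two-sided critical value, about 2.576 for 99% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)


n = 2526  # sample size from the post

# Worst case (p = 0.5), which is what most online calculators default to
worst_case = margin_of_error(0.5, n)

# Using the known .373 average as the proportion instead
at_373 = margin_of_error(0.373, n)

print(f"worst case (p=0.5): +/- {worst_case:.4f}")
print(f"at p=0.373:         +/- {at_373:.4f}")
```

Either way the margin rounds to about two and a half percentage points, so the ballpark conclusion does not change.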
I am undoubtedly oversimplifying the math a bit here, so it is a darn shame we don’t have a credentialed data scientist around here to help.