OT: Seeking Data Sources for a Newbie!

12-14-2011, 12:02 AM

Figured I'd try posting this to Net54, as I seem to get so much else from this forum!

I am looking for a very easy-to-use data source for baseball stats. For example, I want to easily download seasonal data for the pitchers who threw the most innings in 1909. In addition, I'd like to download career info for those same players, including stats like black ink, gray ink, and Hall of Fame Standards/Monitor.

I am not a programmer! I prefer simplicity and am willing to pay $ for a source that is easier to access. I plan to bring the data into Excel, as it is the analytical tool I can use well.

I have been through a number of sites. Baseball-reference.com, for example, has the data I need, but I have yet to figure out how to download it easily. Through baseball-reference.com I found Lahman's database: http://baseball1.com/statistics/. This data set seems a bit bulky for me to use (or is it as good as it gets and I need to figure it out?).

Any and all help is most appreciated!


12-14-2011, 08:40 AM

Nice to see someone post about some quantitative baseball research!

I just took my first look at Lahman's database--pretty nice actually. I don't know him, but I would think that Lahman has software designed to extract data for purposes such as yours, so maybe if you contact him and offer a donation he would provide it.

Otherwise, it seems you would have two choices.

1) Go through all the pitchers' data manually: search the pitching data for 1909 and extract the rows you need by hand. To get career stats you would also have to extract all the other seasons for the pitchers who played in 1909, and then sum these stats in a spreadsheet. This is cumbersome, fraught with possible errors and frustration, but possible.

2) Find someone to write a program that can extract the desired data, perform calculations and export it into files. This second option is not trivial either, but would be best because the program could be used not just for this particular application but almost any other application you might want in the future.
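
For what it's worth, here is a rough idea of what such a program might look like in Python. It is only a sketch under some assumptions: that the pitching data has been exported as a CSV file, and that it has columns named playerID, yearID, W, L, IPouts (outs recorded, i.e. innings times three) and SO, which is how I understand the Lahman pitching table to be laid out. The player names and numbers in SAMPLE are made up purely so the example runs on its own; they are not real stats.

```python
import csv
import io
from collections import defaultdict


def load_rows(fileobj):
    """Read pitching rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(fileobj))


def top_pitchers_by_innings(rows, year, n):
    """Return the n (playerID, outs) pairs with the most outs in `year`."""
    outs = defaultdict(int)
    for r in rows:
        if int(r["yearID"]) == year:
            # A pitcher traded mid-season has several rows; add them up.
            outs[r["playerID"]] += int(r["IPouts"] or 0)
    return sorted(outs.items(), key=lambda kv: kv[1], reverse=True)[:n]


def career_totals(rows, player_ids, fields=("W", "L", "IPouts", "SO")):
    """Sum the chosen columns over every season for the given players."""
    totals = {pid: dict.fromkeys(fields, 0) for pid in player_ids}
    for r in rows:
        pid = r["playerID"]
        if pid in totals:
            for f in fields:
                totals[pid][f] += int(r[f] or 0)  # treat blanks as zero
    return totals


def write_career_csv(totals, fileobj, fields=("W", "L", "IPouts", "SO")):
    """Write one row per player in a form Excel can open directly."""
    writer = csv.writer(fileobj)
    writer.writerow(["playerID", *fields])
    for pid, stats in totals.items():
        writer.writerow([pid, *(stats[f] for f in fields)])


# Made-up sample rows (hypothetical players, not real statistics).
SAMPLE = """playerID,yearID,W,L,IPouts,SO
pitcherA,1908,20,10,900,150
pitcherA,1909,25,8,1000,180
pitcherB,1909,15,12,800,120
pitcherB,1910,18,9,850,130
pitcherC,1909,5,5,300,40
"""

rows = load_rows(io.StringIO(SAMPLE))
leaders = top_pitchers_by_innings(rows, 1909, 2)
career = career_totals(rows, [pid for pid, _ in leaders])

out = io.StringIO()
write_career_csv(career, out)
print(out.getvalue())
```

To run it on the real data you would replace io.StringIO(SAMPLE) with an opened file, e.g. open("Pitching.csv", newline="") (that file name is my guess at what the export is called), and write the result to a .csv file that Excel can open. Stats like black ink and gray ink are not handled here and would need extra work.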

I used to write many programs similar to what you would need. However, I now have some rather severe limitations. If you cannot find better help, send me a message and I might be able to guide you further along.


12-14-2011, 08:11 PM

Thanks so much for the note! I appreciate it and your verifying that Lahman's data source seemed good. I played with it a bit today and did make some progress.

Much appreciated!