NFL play by play data? XML format?


Fezmid

Has anybody found a source for NFL play by play data in XML format? I've found a couple of sites that seem to sell that data, but I was hoping for free (even if it's years old). Not sure if anybody here has come across anything like that or not.

 

The reason I ask is that I'm preparing to build a data warehouse of all play by play data so that you can ask interesting questions like, "How often do the Bills run the ball on 3rd and less than 2 on the road when trailing by 14 points in the snow?"

 

Ambitious, but we'll see what happens.

 

Thanks!

 

(EDIT: I did find this site, and it contains plays in CSV. Might be able to use it, but I think XML fits into the project better).

 

http://www.armchairanalysis.com/nfl-play-by-play-data.php


You can get free 2000-2010 play-by-play data here in CSV format. There are a number of CSV-to-XML converters available, like this one.
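If you would rather not rely on an online converter, a CSV-to-XML conversion is only a few lines with Python's standard library. This is a minimal sketch, not any particular converter's method: the column names (`offense`, `down`, `play_type`, and so on) and the `plays`/`play` tag names are made up for illustration, and the real CSV columns may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="plays", row_tag="play"):
    """Convert CSV text to an XML string with one element per CSV row."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        play = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Each column header becomes a child tag, so headers must be
            # valid XML names (no spaces, no leading digits).
            ET.SubElement(play, column).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample rows; the real play-by-play CSV will have other columns.
sample = (
    "game_id,offense,defense,down,yards_to_go,play_type\n"
    "1,BUF,NE,3,1,RUSH\n"
    "1,BUF,NE,1,10,PASS\n"
)
xml_text = csv_to_xml(sample)
print(xml_text)
```

For a full season file you would read from disk with `open(path, newline="")` instead of `io.StringIO`, and probably write the tree out in chunks rather than building one large string.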


The NFL has agreed to give me a week's worth of data, although I'm asking if they can give me 16 weeks' worth of an individual team's data instead of one week of everyone's data. We'll see what they say, but I was impressed that the NFL wrote me back. :)


  • 4 weeks later...

Well, the NFL sent me a season of data in XML for the Panthers... Unfortunately they just sent it a couple of days ago so I didn't have time to use it in my project. :(

 

The good news is that I used the data from ArmChairAnalysis.com and was able to get my data warehouse up and running with data from every game back to the year 2000! Now to come up with cool queries for it.

 

For example, did you know that since 2000, the Bills have run the ball 55.28% of the time on the first play of a drive? However, if you only take home games into account, that number rises to 58.09%, and it is only 53.61% on the road. I would've thought they'd run the ball more on the road to start a drive, but I guess not.
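A home/road split like that comes down to one grouped query. The sketch below uses an in-memory SQLite database with a made-up `plays` table; the table layout, column names, and the handful of sample rows are assumptions for illustration, not the actual warehouse schema or the ArmChairAnalysis data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE plays (
        offense    TEXT,     -- team with the ball
        is_home    INTEGER,  -- 1 if the offense is the home team, else 0
        drive_play INTEGER,  -- 1 for the first play of a drive
        play_type  TEXT      -- 'RUSH' or 'PASS'
    )
    """
)

# Hypothetical sample rows, only here to make the query runnable.
rows = [
    ("BUF", 1, 1, "RUSH"),
    ("BUF", 1, 1, "RUSH"),
    ("BUF", 1, 1, "PASS"),
    ("BUF", 1, 2, "PASS"),  # not a first play, so it is filtered out
    ("BUF", 0, 1, "RUSH"),
    ("BUF", 0, 1, "PASS"),
    ("NE", 1, 1, "PASS"),   # other team, filtered out
]
conn.executemany("INSERT INTO plays VALUES (?, ?, ?, ?)", rows)

# In SQLite a comparison evaluates to 0 or 1, so SUM(...) counts rushes.
query = """
    SELECT is_home,
           ROUND(100.0 * SUM(play_type = 'RUSH') / COUNT(*), 2) AS run_pct
    FROM plays
    WHERE offense = ? AND drive_play = 1
    GROUP BY is_home
    ORDER BY is_home
"""
run_pct = dict(conn.execute(query, ("BUF",)).fetchall())

for is_home, pct in run_pct.items():
    print("home" if is_home else "road", pct)
```

Adding more conditions (down, distance, score margin, weather) is just a matter of extra columns and extra `WHERE` clauses, assuming the source data records them.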

 

I can't wait to play around with this data. I have 473,621 plays in this database across 2,921 games! :)


I actually programmed a PBP parser for a football game I was developing that sort of did the same thing...

 

Initially that's how I wanted to get my data: by parsing the PBP logs on NFL.com. However, what they publish on their site doesn't follow any sort of standard, so parsing would be very difficult and rife with errors. :(

 

I have a decent data set now though -- can't wait to play with it. :)


I copied them to doc files, standardized the language, and made heavy use of regex.

 

It worked perfectly fine: I pulled all the pertinent data, stuffed it into a SQL DB, and then ran all kinds of sorts and stored procedures.
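To give a rough idea of what that regex-then-database approach can look like, here is a minimal sketch. The line format ("Q1 14:55 1-10 RUSH 4") is invented for the example and stands in for whatever the standardized text actually looked like; the pattern, the `parse_play` helper, and the table layout are likewise assumptions, not the code described above.

```python
import re
import sqlite3

# Hypothetical standardized line:
#   Q<quarter> <clock> <down>-<yards to go> <play type> <yards gained>
PLAY_RE = re.compile(
    r"Q(?P<quarter>\d) (?P<clock>\d{1,2}:\d{2}) "
    r"(?P<down>[1-4])-(?P<to_go>\d+) "
    r"(?P<play_type>RUSH|PASS) (?P<yards>-?\d+)"
)


def parse_play(line):
    """Return a tuple of play fields, or None if the line does not match."""
    match = PLAY_RE.fullmatch(line.strip())
    if match is None:
        return None
    f = match.groupdict()
    return (int(f["quarter"]), f["clock"], int(f["down"]),
            int(f["to_go"]), f["play_type"], int(f["yards"]))


log = """Q1 14:55 1-10 RUSH 4
Q1 14:20 2-6 PASS 0
garbled line that should be skipped
Q1 13:41 3-6 PASS -2
"""

# Keep only the lines the pattern recognizes.
plays = [p for p in map(parse_play, log.splitlines()) if p is not None]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plays (quarter INTEGER, clock TEXT, down INTEGER, "
    "to_go INTEGER, play_type TEXT, yards INTEGER)"
)
conn.executemany("INSERT INTO plays VALUES (?, ?, ?, ?, ?, ?)", plays)

avg_pass_yards = conn.execute(
    "SELECT AVG(yards) FROM plays WHERE play_type = 'PASS'"
).fetchone()[0]
print(len(plays), "plays loaded; average pass gain:", avg_pass_yards)
```

Returning `None` for unmatched lines makes it easy to log and review whatever the pattern misses, which matters when the source text is not consistently formatted.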

Edited by matter2003
