SECRETS REVEALED - The process used when researching CFB History.
Several people have asked me about the process I use to audit each school's and conference's college football history, so I decided to write a basic step-by-step guideline of the entire process. As detailed as this may appear, do note that I'm NOT tipping my hand on everything, nor am I explaining each step in depth.
As an example, I take a School's media guide (I'll use Eastern Michigan - Linked PDF Media guide showing year by year results) from PDF all the way through to Eastern Michigan's GridironHistory.com profile (Linked Profile here on GH).
It's important to note that I'm using what I call "Short Names". This allows for efficient naming conventions when the URL is called up in a web browser. Something like "Saint Mary's of California" could break the URL in some browsers, so I use "Saint Marys" as the Short Name. On the actual school profile page, the full name of the University is used whenever possible.
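To illustrate why a Short Name is friendlier in a URL than a full name, here is a minimal Python sketch. It is not the site's actual code (the site runs on PHP/MySQL); the function name `url_slug` and the lowercase, hyphen-separated output format are assumptions made purely for illustration.

```python
import re
import unicodedata
from urllib.parse import quote


def url_slug(short_name: str) -> str:
    """Turn a Short Name into a URL-safe slug (hypothetical format)."""
    # Strip accents and any non-ASCII characters (e.g. curly apostrophes).
    text = unicodedata.normalize("NFKD", short_name)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop straight apostrophes so "Mary's" becomes "Marys".
    text = text.replace("'", "")
    # Collapse every remaining run of non-alphanumerics into one hyphen.
    text = re.sub(r"[^A-Za-z0-9]+", "-", text)
    return text.strip("-").lower()


# The full name needs percent-encoding before it can appear in a URL:
print(quote("Saint Mary's of California"))
# The Short Name maps to a clean path segment:
print(url_slug("Saint Marys"))
```

The full name still lives in the database for display; only the Short Name (or a slug derived from it) would need to appear in the address bar.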
You should also be aware that the school names that we use in our database are the CURRENT names. At some point, I will begin researching the history of each school and allow people to track a School's name changes over the years. See the second post after this one for a few examples.
Here's the process I use:
Audit Stage 1 - Individual member schools:
- Depending on the amount of work in the first two steps, I can complete Audit Stage 1 in anywhere from 2 to 8 hours per school.
Audit Stage 2 - Playing with data at a Conference level.
- Depending on the number of corrections, this can take anywhere from 4 to 16 hours to complete. I average about 6 hours.
Audit Stage 3 - Bring the Stage 2 data into the previously audited records from other conferences.
- Much like Audit Stage 2, Stage 3 can take anywhere from 4 to 16 hours to verify.
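One kind of check that a cross-conference stage like this could include is confirming that a game listed in one school's record matches the same game in its opponent's previously audited record. The sketch below is only an assumption about what such a check might look like; the post does not describe the actual steps, and the `Game` fields, the `find_conflicts` name, and the sample schools and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Game:
    date: str         # ISO date, e.g. "1930-10-04"
    team: str         # school whose record this row came from
    opponent: str
    team_score: int
    opp_score: int


def find_conflicts(new_games, audited_games):
    """Return (new, audited) pairs whose mirrored scores disagree."""
    # Index the audited rows by (date, team, opponent) for quick lookup.
    audited = {(g.date, g.team, g.opponent): g for g in audited_games}
    conflicts = []
    for game in new_games:
        # The opponent's row for the same game has team/opponent swapped.
        mirror = audited.get((game.date, game.opponent, game.team))
        if mirror is None:
            continue  # opponent has no audited row for this game
        if (mirror.team_score, mirror.opp_score) != (game.opp_score, game.team_score):
            conflicts.append((game, mirror))
    return conflicts


# Hypothetical rows: the two records disagree on School B's score.
new_row = Game("1930-10-04", "School A", "School B", 14, 7)
old_row = Game("1930-10-04", "School B", "School A", 6, 14)
for new, old in find_conflicts([new_row], [old_row]):
    print(f"Conflict on {new.date}: {new.team} vs {new.opponent}")
```

Any pair this flags would still need a human to decide which source is right, which is where the media guides and other reference sites come in.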
I've put nearly 19 months and over 3,500 hours of work into researching nearly 135 schools. When I started with the SEC, it took me nearly 3 months just to complete all of the steps above, because everything was done manually, without any scripts to help automate the audit process. As you can imagine, doing all of the verifications was quite a headache.
People frequently ask where I get this information, since I'm not physically travelling to each School's library to look at their microfiche. 95% of the information comes directly from each School's media guide (I have nearly 180 media guides and archived HTML pages that came directly from the schools themselves). I also use Wikipedia as a reference point, especially when researching a School's name changes over the years, and up to 6 other CFB History websites to cross-reference conflicting information. The key is to know WHAT you're searching for when using Bing or Google. I even have complete histories at my disposal that aren't on any of the other CFB History websites.
Does "Florida Southern University" sound familiar in terms of Collegiate Football? No? Not surprising. They only fielded a team from 1923 to 1933.
As I've worked through this process, and since I fancy myself a PHP/MySQL Developer, I started writing several scripts to help automate the steps above. The Admin tools on GridironHistory.com are powerful and complex under the hood, yet intuitive and *very* simple for even a beginner to work through.
I do have to be careful, though. There is a LOT of reading and things you have to absolutely pay attention to, or you can pretty much obliterate an entire conference's data, or quite possibly the entire database itself.
With the tools that are now in place (and are constantly being adjusted to introduce new algorithms), I can now audit a 12-team conference in as little as 10 days, IF I'm left to work uninterrupted (HAHAHA!). I tend to average approximately 15-20 days per conference.
Most of you won't even care about what it takes to be able to research the history of your favorite school. All you see, at the end of the day, is some pretty damn amazing stats.
If you managed to read through ALL of this... thank you! I hope this gives you some insight into how detailed the process is and how much work it actually takes to deliver the information you see on GridironHistory.com.
I hope you enjoyed learning about our audit process!
Lead Historian & Researcher for GH