D3Football.com released their Preseason Top 25 poll today, and I pretty much immediately started digesting the information therein. A few things I noticed:
- Voters really like teams that showed a lot of improvement last season, especially if they return a lot of production (Berry, Hendrix, C-M-S, Albright, Whitworth, W & L, ETBU)
- No love for the IIAC
- More love for the East than we've seen in other (probably less informed) preseason polls
- Who voted for Amherst???
Seriously, who voted for Amherst? This is like a couple years ago when NDSU was receiving votes in the FBS Top 25 after they demolished Iowa State. I'm not saying a team in the NESCAC can't be one of the 25 best Division III teams in the country, but what's the point?
To digest the first couple bullet points, I compared my ratings & rankings to the Top 25:
Relative Dif measures how "over-" or "under-rated" a team is in the poll compared to my system, based on the vote share and national ranking (I'm not saying these teams actually are over-rated, so don't crucify me). The teams I mentioned above are among the most over-rated by the D3Football.com voters.
A lot of the difference in rankings can be explained by how differently human voters and my system perceive conference and schedule strength. Take Rose-Hulman for example. My model gives them a preseason ranking of #95, but their human vote share would slot them at #40 (yes, I know the ranking isn't intended to slot teams outside of the Top 25, but if they didn't want me to do it, they shouldn't list their vote shares). The Fighting Engineers were only 8 points away from an undefeated regular season last year, which is difficult no matter your conference, and their conference record ended up at 6-2. Guess what my model is projecting their conference record to be next year? Yup, 6.0-2.0 on the nose.
Some of the other differences are due to the human tendency to extrapolate, while a statistical model tends to regress to the mean. In my preseason ratings, about 75% of a team's rating comes from their finish in the previous season, and the other 25% is calculated based on how they finished in the seasons before that. If a team has a very large improvement from one year to the next, they tend to revert to their long-term averages somewhat in subsequent seasons. Think of MIT two seasons ago; they went 10-0 in the regular season, and then last year they went 2-8. This sort of regression isn't always the case, though. Sometimes something tangible (a coaching change, a few stellar players) explains the improvement, and humans can see what a model can't.
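To make the 75/25 split concrete, here's a rough sketch in Python. This is not my actual model code: the function name `preseason_rating` is made up for illustration, and treating the "long-term" piece as a plain average of earlier seasons is an assumption on my part for the example.

```python
def preseason_rating(last_season, earlier_seasons, recent_weight=0.75):
    """Blend last season's rating with a long-term average.

    ASSUMPTION: the long-term component is a simple mean of earlier
    seasons' ratings; the real calculation may weight them differently.
    """
    if not earlier_seasons:
        # No history to regress toward, so last season is all we have.
        return float(last_season)
    long_term = sum(earlier_seasons) / len(earlier_seasons)
    return recent_weight * last_season + (1 - recent_weight) * long_term


# A team that jumped to +20.0 after averaging 0.0 in earlier seasons
# gets pulled a quarter of the way back toward its history.
breakout_team = preseason_rating(20.0, [0.0, 4.0, -4.0])
print(breakout_team)
```

A human voter looking at that same breakout team might extrapolate and expect +20 or better, which is where a lot of the gap comes from.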
The last major cause of differences between my model and the human poll is how I built the model. My model is based pretty much entirely on point differentials, and there's no inherent boost for winning games (other than scoring more points than the opponent). Some of the teams under-rated by my model relative to the human poll, such as C-M-S, won a lot of close games last year. The model isn't very impressed by close wins, especially at home. Because home-field advantage works out to about 3 points per game, a 1-point win at home is essentially treated as a loss. But... as a former coach and player, I do firmly believe there is an inherent skill involved in winning close games. Yes, there's luck involved in a lot of cases, but I've heard it said before that good teams make their own luck. If a young team is winning a lot of close games, it's not folly to assume they'll improve as they gain more experience.
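Here's the home-field arithmetic as a quick Python sketch. Again, this is a simplified illustration and not the model itself: the name `adjusted_margin` and the `site` labels are hypothetical, and I'm assuming a flat 3.0-point home-field adjustment applied symmetrically to road games.

```python
HOME_FIELD_ADVANTAGE = 3.0  # approximate points per game, per the text above


def adjusted_margin(points_for, points_against, site):
    """Return the scoring margin after removing home-field advantage.

    ASSUMPTION: a flat adjustment, subtracted at home and added on the
    road; neutral-site games are left alone.
    """
    raw = float(points_for - points_against)
    if site == "home":
        return raw - HOME_FIELD_ADVANTAGE
    if site == "away":
        return raw + HOME_FIELD_ADVANTAGE
    if site == "neutral":
        return raw
    raise ValueError(f"unknown site: {site!r}")


# A 21-20 win at home comes out negative, so it reads like a narrow loss.
print(adjusted_margin(21, 20, "home"))
# The same score on the road looks like a solid win.
print(adjusted_margin(21, 20, "away"))
```

Notice there's no bonus anywhere for the W itself, which is exactly why teams that win a lot of one-score games look worse to the model than to the voters.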
As a fun side-note: Even though the Purple Raiders received 18 of a possible 25 first-place votes, my model thinks the humans are under-rating them. I'm sure that might change once I'm able to input returning starter info though.