This system was designed to predict the outcomes of Division III football games using rating metrics that a casual fan can easily understand. The model uses only the score and location of each game as inputs. Unlike most other models, mine does not discount the results of blowouts or outlier games; I tested variations of this model that did account for those factors, but my results were less accurate.
Ratings are on a 0-1 scale. The value represents each team's expected win probability against a hypothetical team with an average offense and an average defense that plays at an average pace. Essentially, this value is a team's opponent-adjusted true winning percentage.
A team with a rating of 0.500 is average; a team rated above roughly 0.875 is probably in the Top 25; teams generally need ratings above 0.900 to receive Pool C bids (though somehow Illinois College got one in 2011 with a rating of 0.544); and teams rated above 0.990 have a legitimate shot at winning the Stagg Bowl. Mount Union, Whitewater, and St. Thomas are the only teams since 2005 to have ratings greater than 0.9995.
Adjusted Offense (AdjO) & Adjusted Defense (AdjD):
Adjusted offense is the expected number of points each team would score against an average defense, while adjusted defense is the expected number of points each team would give up to an average offense. I like to think of these metrics as a team's "Opponent-Adjusted Points per Game."
For AdjO & AdjD, the average that each team is compared to is relative to every other team in the nation for that week. In 2005 the national average was around 24.5 points per game, and in 2015 it was 26.5, so offenses and defenses are compared to that standard. This method keeps the average rating of all teams hovering around 0.500, but means that an "average offense" changes depending on the week you're investigating. Check out the Historical Ratings Interactive to see how each team has compared to the national average.
Generally speaking, an offense with a rating above 42 is probably in the Top 10 in the country, with 55 being the average rating of the nation's best offense. Mount Union's 2014 squad has the best offense since 2005, with a rating of around 66.
For defenses an AdjD rating below 10 is the typical benchmark for a Top 10 team. Whitewater and Mount Union are the only teams since 2005 to have a rating below 0, with Whitewater peaking at nearly -5 after their 2013 Stagg Bowl victory.
Estimated Offense (EstO) & Estimated Defense (EstD):
This is the predicted score for each game, obtained by adding Team A's AdjO to Team B's AdjD and subtracting the nationwide average. The home team also receives a 1.5-point addition to their EstO and a 1.5-point subtraction from their EstD.
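The calculation above can be sketched as follows. This is a minimal illustration, not the model's actual code; the function name and the 26.5 example average are my own assumptions.

```python
def estimated_score(adj_o_a, adj_d_b, national_avg, home=False):
    """Expected points for Team A against Team B.

    EstO = Team A's AdjO + Team B's AdjD - national average,
    with a 1.5-point home-field bump added to the home team's EstO
    (and, symmetrically, subtracted from their EstD).
    """
    est = adj_o_a + adj_d_b - national_avg
    if home:
        est += 1.5
    return est

# Example: a 35-point offense hosting a team that allows 20 points,
# with a 26.5-point national average:
est_o = estimated_score(35.0, 20.0, 26.5, home=True)  # 35 + 20 - 26.5 + 1.5 = 30.0
```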
Win Probability (Win%):
The win probability based on the point spread, with normally distributed error and a standard deviation based on the historical accuracy of the model.
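A spread-to-probability conversion of this kind can be sketched with a normal CDF. The sigma value below is a placeholder for illustration; the model fits its standard deviation to historical accuracy, and the actual figure is not stated here.

```python
import math

def win_probability(est_o, est_d, sigma=14.0):
    """Win% from the predicted point spread, assuming normally
    distributed error around the spread. sigma=14.0 is an assumed
    placeholder, not the model's fitted value.
    """
    spread = est_o - est_d
    # Normal CDF via the error function: P(margin > 0)
    return 0.5 * (1.0 + math.erf(spread / (sigma * math.sqrt(2.0))))
```

Note that the two teams' probabilities are complementary by construction: a team favored by the spread gets a Win% above 0.5, and its opponent gets one minus that value.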
Strength of Schedule (SoS):
The rating of each team's opponents, based on their average AdjO & AdjD at the time they played. The model makes better predictions when only considering results from the most recent games (and not updating based on the recent performance of previous opponents), so the strength of schedule metric also adheres to this trend.
Conference strength is found by averaging the AdjO & AdjD of each team in the conference, and using those marks to determine the rating relative to average.
This is the same method used for averaging opponents in the strength of schedule metric. This method is used instead of simply averaging the ratings of each team because averaging the ratings would diminish the impact of the best (and worst) teams in each conference. Ratings are determined assuming a normal distribution, which means a team needs a larger increase in point differential to improve their rating from 0.900 to 0.950 than they would to increase from 0.500 to 0.550.
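The average-then-rate method described above might look like the sketch below. Function names, the normal-CDF conversion detail, and the sigma placeholder are my assumptions; the point is that AdjO & AdjD are averaged first and only then converted to a 0-1 rating.

```python
import math

def rating_from_avg(avg_adj_o, avg_adj_d, national_avg, sigma=14.0):
    """Convert averaged AdjO/AdjD into a 0-1 rating versus an average
    team, assuming normally distributed outcomes. sigma is an assumed
    placeholder for the model's fitted standard deviation."""
    margin = (avg_adj_o - national_avg) - (avg_adj_d - national_avg)
    return 0.5 * (1.0 + math.erf(margin / (sigma * math.sqrt(2.0))))

def conference_strength(teams, national_avg):
    """Average each team's AdjO & AdjD first, then rate the averages --
    rather than averaging the teams' individual 0-1 ratings, which
    would flatten the impact of the best and worst teams."""
    avg_o = sum(t["adj_o"] for t in teams) / len(teams)
    avg_d = sum(t["adj_d"] for t in teams) / len(teams)
    return rating_from_avg(avg_o, avg_d, national_avg)
```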
EstW & EstL:
This metric is the estimated win-loss record for each team based on their pre-game win probabilities. EstW is simply the sum of a team's Win% across its games, and EstL is the difference between the total number of games and EstW. Essentially, these numbers can be used to determine which teams are "better than their record," and vice versa.
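The summation is straightforward; a quick sketch:

```python
def estimated_record(win_probs):
    """EstW is the sum of pre-game win probabilities;
    EstL is the games played minus EstW."""
    est_w = sum(win_probs)
    est_l = len(win_probs) - est_w
    return est_w, est_l

# A team with pre-game Win% of 0.9, 0.7, 0.5, and 0.2 projects
# to an estimated record of about 2.3 - 1.7:
est_w, est_l = estimated_record([0.9, 0.7, 0.5, 0.2])
```

A team that went 3-1 against those odds would be "better than its record" suggests by about 0.7 wins, and vice versa.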
Efficiency & Explosiveness Statistics (AKA Yards, 1st Downs, & Points, or Y1P+):
All of the efficiency and explosiveness metrics are adjusted for opponent, which means the numbers displayed are what each team would be expected to produce against an average squad.
Yards per Play (YPP)
Yards per play is the same basic metric that you would see on any site. The only differences are that these numbers are adjusted for opponent, and that sacks are counted as passing attempts instead of rushing attempts. YPP is meant to be the "explosiveness" portion of "Efficiency and Explosiveness."
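The sack reclassification can be sketched as below. This is my assumed bookkeeping, not the author's code: NCAA box scores fold sacks and sack yardage into rushing, so the sketch moves both to the passing side.

```python
def split_ypp(rush_yards, rush_att, pass_yards, pass_att, sacks, sack_yards):
    """Rushing/passing yards per play with sacks reclassified as pass
    plays. Assumes rush_yards and rush_att include sacks (NCAA-style),
    with sack_yards given as positive yards lost."""
    rush_ypp = (rush_yards + sack_yards) / (rush_att - sacks)  # strip sacks out
    pass_ypp = (pass_yards - sack_yards) / (pass_att + sacks)  # fold sacks in
    return rush_ypp, pass_ypp
```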
First Downs per Opportunity (1DnRt)
First downs per opportunity is my attempt at quantifying a team's efficiency. It is a rate stat, expressing how frequently a team is able to move the chains on each new set of downs. Ideally this would be a play-by-play rate (i.e. "staying ahead of the chains") rather than a set-of-downs-by-set-of-downs rate, but DIII doesn't have that level of data available. The equation is below. (First downs due to penalties are purposely excluded; safeties and end-of-half results are excluded due to a lack of information, but should not affect the final results in a meaningful way.)
1DnRt = (Rush1Dn + Pass1Dn + RushTD + PassTD) / (Rush1Dn + Pass1Dn + RushTD + PassTD + FGA + Punts + Interceptions + Fumbles_Lost + TO_On_Downs)
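In code, the rate is successes (first downs plus touchdowns) over total opportunities (successes plus every other way a set of downs can end); a minimal sketch:

```python
def first_downs_per_opportunity(rush_1dn, pass_1dn, rush_td, pass_td,
                                fga, punts, interceptions,
                                fumbles_lost, to_on_downs):
    """1DnRt: first downs and TDs divided by total opportunities.
    Penalty first downs, safeties, and end-of-half results are
    excluded, per the definition above."""
    successes = rush_1dn + pass_1dn + rush_td + pass_td
    opportunities = (successes + fga + punts + interceptions
                     + fumbles_lost + to_on_downs)
    return successes / opportunities
```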
Points per Drive (PPD)
Points per drive is currently included as a catch-all for other factors (special teams, average starting field position) that I don't yet have a good way to analyze or estimate. The equation is similar to 1DnRt:
PPD = (6 * (RushTD + PassTD) + 3 * FGM + 2 * 2PtConv + PATMade) / (RushTD + PassTD + FGA + Punts + Interceptions + Fumbles_Lost + TO_On_Downs)
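The same formula as a sketch, with the denominator counting drives by how they ended:

```python
def points_per_drive(rush_td, pass_td, fgm, fga, two_pt_conv, pat_made,
                     punts, interceptions, fumbles_lost, to_on_downs):
    """PPD: total points scored divided by the number of drives,
    where each drive is counted by its ending event (TD, FG attempt,
    punt, or turnover)."""
    points = 6 * (rush_td + pass_td) + 3 * fgm + 2 * two_pt_conv + pat_made
    drives = (rush_td + pass_td + fga + punts + interceptions
              + fumbles_lost + to_on_downs)
    return points / drives
```

Note that missed field goals still count as drives because FGA (attempts, not makes) appears in the denominator.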
Yards, 1st Downs, & Points (Y1P+)
Y1P+ is the combined metric for the three stats above, meant to be on the same scale as AdjO and AdjD. The coefficients for the three stats and the method of regression will change over the course of the season as I get more data into the model and minimize its average error.
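One plausible way to fit such coefficients is ordinary least squares against AdjO; this is purely a hypothetical sketch, since the text says the regression method and coefficients are still in flux, and the function name and design matrix are my assumptions.

```python
import numpy as np

def fit_y1p_coefficients(ypp, dnrt, ppd, adj_o):
    """Hypothetical least-squares fit of
    AdjO ~ a*YPP + b*1DnRt + c*PPD + intercept.
    Returns the coefficient vector (a, b, c, intercept)."""
    X = np.column_stack([ypp, dnrt, ppd, np.ones(len(ypp))])
    coefs, *_ = np.linalg.lstsq(X, np.asarray(adj_o), rcond=None)
    return coefs
```

Refitting weekly as new box scores arrive would naturally shift the coefficients in the direction that minimizes the model's average prediction error.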