It's football season again, hooray! Every year for my friends' football pool I try out a different algorithm. Invariably, my picks are around 60% accurate. Not terrible, but according to NFL Pickwatch (archive, current season), the best pickers get to 68 or 69%. So, an amazing performance—my upper bound—is just under 70%, and the lower bound for a competitive model—the FiveThirtyEight baseline—is 60%.

I've been modeling NFL outcomes for a couple of years, and running linear (predicting point spread) and logistic (predicting win probability) regressions given various team and player data. My best year so far incorporated the Vegas spread into the model, and my biggest disaster so far was an aggressive lasso model on every player in every offensive line, with team defenses lumped as a group. Attempting to track injuries, suspensions, and other changes to the starting lineup was not sustainable for the amount of time I wanted to spend.

Enter Nate Silver's awesome NFL Elo rankings, the aspirational target for this year. What's impressive is that he gets something like 60% accuracy out of literally no information but home field advantage and past scores. I particularly love that it updates weekly to incorporate the new information—this immediately says "Bayesian" and in fact is a lot how people using their intuition are making their picks anyway. A system like his—but with a more straightforward Bayesian model—is the goal of this post.

Markov-Chain Monte Carlo (MCMC) methods are a category of numerical technique used in Bayesian statistics. They numerically estimate the distribution of a variable (the posterior) given two other distributions: the prior and the likelihood function, and are useful when direct integration of the likelihood function is not tractable.

I am new to Bayesian statistics, but became interested in the approach partly from exposure to the PyMC3 library, and partly from FiveThirtyEight's promoting it in a commentary soon after the time of the p-hacking scandals a few years back (Simmons et. al. coin 'p-hacking' in 2011, and Head et. al. quantify the scale of the issue in 2014).

Until the 1980's, it was not realistic to use Bayesian techniques except when analytic solutions were possible. (Here's Wikipedia's list of analytic options. They're still useful.) MCMC opens up more options.

The Python library pymc3 provides a suite of modern Bayesian tools: both MCMC algorithms and variational inference. One of its core contributors, Thomas Wiecki, wrote a blog post entitled MCMC sampling for dummies, which was the inspiration for this post. It was enthusiastically received, and cited by people I follow as the best available explanation of MCMC. To my dismay, I didn't understand it; probably because he comes from a stats background and I come from engineering. This post is for people like me.

What's so great about Knime?

Bayesian updating and the NFL

Modeling property tax assessment in Cook County, IL

MCMC and the Ising Model