Article originally posted on Data Science Central.
Last time, I wrote about wrangling data from a PDF file to assemble a data set of D1 college athletic performance in the Learfield Directors’ Cup competition. In this blog, I embellish that data, calculating individual school ranks from scores for the fall, winter, and spring seasons. I then aggregate and contrast conference performance for the total year as well as for the individual seasons. It turns out that of the 32 D1 conferences, the real competition for best boils down to just the top 5 big football or FBS “leagues”: the Pac 12, SEC, ACC, Big Ten, and Big 12.
The methodology is to derive conference performance ranks for the total year and for the fall, winter, and spring seasons by aggregating individual school ranks. For example, the Pac 12 schools' rankings for the total year, fall, winter, and spring would be combined into Pac 12 total, fall, winter, and spring scores, using formulae that would no doubt be debated. The conferences are then compared on these scores, and “winners” are derived for each of the four categories. The rankings are ultimately visualized with violin plots, descendants of the trusty box and whiskers.
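As a rough illustration of the aggregation idea described above, here is a minimal base R sketch. The school names, scores, and column names are invented for the example, and the use of the mean school rank as the conference score is an assumption, just one of the debatable formulae; it is not necessarily the one used in the full analysis.

```r
# Toy data: schools and scores are hypothetical, for illustration only.
schools <- data.frame(
  school     = c("A", "B", "C", "D", "E", "F"),
  conference = c("Pac 12", "Pac 12", "SEC", "SEC", "ACC", "ACC"),
  fall       = c(300, 150, 250, 100, 200, 60),
  winter     = c(400, 260, 300, 200, 350, 150),
  spring     = c(250, 300, 350, 100, 150, 200),
  stringsAsFactors = FALSE
)

# Total-year score is the sum of the three seasonal scores.
schools$total <- schools$fall + schools$winter + schools$spring

categories <- c("total", "fall", "winter", "spring")
rank_cols  <- paste0(categories, "_rank")

# Rank schools within each category; rank 1 is the highest score.
for (i in seq_along(categories)) {
  schools[[rank_cols[i]]] <- rank(-schools[[categories[i]]], ties.method = "min")
}

# Conference score = mean rank of its schools (assumed formula; lower is better).
conf_scores <- aggregate(schools[rank_cols],
                         by  = list(conference = schools$conference),
                         FUN = mean)

# The "winner" of each category is the conference with the lowest mean rank.
winners <- sapply(rank_cols, function(col) {
  conf_scores$conference[which.min(conf_scores[[col]])]
})

print(conf_scores)
print(winners)
```

Swapping `mean` for `median`, or averaging only each conference's top few schools, would give alternative conference scores and possibly different winners, which is why the choice of formula is open to debate.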
Below is the R code for the final data building and analyses. The technology used is Microsoft R Open 3.4.4 running in JupyterLab Beta.