As before, the data comes from the ergast API:
#Load in the core utility functions to access ergast API
source('ergastR-core.R')

#Get the standings after each of the first 17 rounds of the 2014 season
df=data.frame()
for (j in seq(1,17)){
  dft=seasonStandings(2014,j)
  dft$round=j
  df=rbind(df,dft)
}

#Data is now in: df
The returned data contains the championship standing, and points to date, for each driver at the end of each round. We can derive further data elements from it:
#plyr provides arrange(), desc() and ddply()
library(plyr)

#Sort the data by ascending round and position
df=arrange(df,round,pos)
#Points difference to the driver ahead (zero for the leader, otherwise zero or negative)
df=ddply(df,.(round),transform,diffbehind=diff(c(points[[1]],points)))

#Sort by ascending round and descending position
df=arrange(df,round,desc(pos))
#Points difference to the driver behind (zero for the last placed driver, otherwise zero or positive)
df=ddply(df,.(round),transform,diff=diff(c(points[[1]],points)))

#Derive how many points each driver scored in each race
#(relies on the rows still being in ascending round order)
df=ddply(df,.(driverId,year),transform,racepoints=diff(c(0,points)))
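To see what the diff(c(points[[1]],points)) idiom produces, here is a minimal sketch on a made-up standings table for a single round. It uses base R only; the driver ids, the points values and the column names gap_to_ahead and gap_to_behind are illustrative assumptions, not part of the ergast data.

```r
# Toy standings for one round (made-up values, not real results)
toy <- data.frame(driverId = c("a", "b", "c"),
                  pos      = c(1, 2, 3),
                  points   = c(50, 43, 25))

# Ascending position: prepending the leader's own points makes the
# first difference zero; every other row is "my points minus the
# points of the driver ahead"
toy <- toy[order(toy$pos), ]
toy$gap_to_ahead <- diff(c(toy$points[1], toy$points))    # 0, -7, -18

# Descending position: the same trick now gives "my points minus the
# points of the driver behind", with zero for the last placed driver
toy <- toy[order(-toy$pos), ]
toy$gap_to_behind <- diff(c(toy$points[1], toy$points))   # c: 0, b: 18, a: 7

# Back to ascending position for display
toy <- toy[order(toy$pos), ]
print(toy)
```

The sort order before each diff() call decides which neighbour a driver is compared against, so it is worth checking the sign of the result on a small case like this before applying it to the full data frame.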
As before, we can generate a base chart:
library(ggplot2)
library(directlabels)

#The base chart
g=ggplot(df,aes(x=round,y=pos,group=driverId))

charter=function(g) {
  g=g+geom_line()
  #Remove axis labels and colour legend
  g=g+ylab(NULL)+xlab(NULL)+guides(color="none")
  #Add a title
  g=g+ggtitle("F1 Drivers' Championship Race, 2014")
  #Add the line labels, resized (cex), and with an x-value offset
  #(the positioning method is passed by name via method=)
  g=g+geom_dl(aes(label=driverId),method=list("last.points",cex=0.7,dl.trans(x=x+0.2)))
  #Add right hand side padding to the chart so the labels don't overflow
  g=g+scale_x_continuous(limits=c(1,20))
  g
}

g=charter(g)
Let's annotate the chart - firstly with data showing the number of points gained at each race. As previously, crossed lines show changes in championship standing between consecutive rounds:
g+geom_text(data=df,aes(label=racepoints),vjust=-0.4,size=3)
That's okay as far as it goes, but we could also colour each label by the number of points scored in that race, so that the higher values stand out a little more clearly.
g+geom_text(data=df,aes(label=racepoints,col=racepoints),vjust=-0.4,size=3)
The default colour scheme scales from a very dark blue (almost black) to light blue. The higher values look a little washed out to me, so it might be worth exploring other colour mappings that make them stand out more.
Annotating the chart with points scored per race helps us see how well each driver fared in a particular race, but the chart does not give us a sense of how many points separate drivers in the championship standings at the end of each round. We can address this by using the total number of championship points scored to date as the text label, while preserving an indication of the number of points awarded in each race through the colour dimension.
g+geom_text(data=df,aes(label=points,col=racepoints),vjust=-0.4,size=3)+scale_color_continuous(high='red')
Looking down a column, we can compare the number of points separating drivers in the Drivers' Championship at the end of each round. From the colour field we can see how drivers placed next to each other compared in terms of points awarded in each round. Looking along a line, we can (if necessary) calculate the number of points a driver obtained in a particular round by subtracting their total at the end of the previous round from their total at the end of that round.
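The two ways of reading the chart can be sketched on a toy table of cumulative points. This is a base R illustration under stated assumptions: the values are made up, and gap_to_leader and scored_r2 are hypothetical names introduced here rather than columns in the ergast data.

```r
# Toy cumulative standings for two rounds (made-up values, not real results)
toy <- data.frame(
  round    = c(1, 1, 1, 2, 2, 2),
  driverId = c("a", "b", "c", "a", "b", "c"),
  points   = c(25, 18, 15, 43, 43, 27)
)

# Reading down a column: gap from each driver to the leader in that round
toy$gap_to_leader <- ave(toy$points, toy$round,
                         FUN = function(p) max(p) - p)

# Reading along a line: points scored in round 2 are the total after
# round 2 minus the total after round 1, matched by driver
r1 <- toy[toy$round == 1, ]
r2 <- toy[toy$round == 2, ]
scored_r2 <- r2$points - r1$points[match(r2$driverId, r1$driverId)]
names(scored_r2) <- r2$driverId

print(toy)
print(scored_r2)
```

Matching on driverId rather than relying on row order keeps the subtraction correct even if the two rounds list the drivers in a different order.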
Elements of this recipe may form part of a forthcoming chapter in the Wrangling F1 Data With R book.