This blog shares some tips and tricks for working with data - particularly aimed at ecologists and conservation biologists.

I am a conservation biologist, so I use GIS and data analysis tools a lot for reports and articles. Other people may on occasion read these publications, and I would like anyone to be able to reproduce my work, which is why I tend to use “free” software for processing spatial data and running analyses. I have Windows on my laptop and Linux (openSUSE) on a desktop, which does most of the heavier analysis, so I need programs that run on both.

Friday, November 19, 2010

R circular graphics

Temporal patterns are often drawn on a linear axis, but this is not always appropriate. For example, a linear graph suggests that an event at midnight is far away from an event at 1am (a). A circular representation (b) is more realistic. I am going to illustrate how to produce this circular graph using R.
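To make the midnight versus 1am point concrete, here is a minimal sketch in base R. The helper circular_hour_distance() is a hypothetical name introduced for illustration; it is not part of the original analysis.

```r
# Hypothetical helper (not from the original post): shortest distance
# between two hours when the 24-hour clock is treated as a circle.
circular_hour_distance <- function(h1, h2, period = 24) {
  d <- abs(h1 - h2) %% period
  pmin(d, period - d)
}

# Hour 24 (midnight) and hour 1 (1am):
linear_gap   <- abs(24 - 1)                    # distance along a linear axis
circular_gap <- circular_hour_distance(24, 1)  # distance around the clock face

print(c(linear = linear_gap, circular = circular_gap))
```

On a linear axis the two events are 23 hours apart; around the clock face they are 1 hour apart.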




The code was used in Figure 2 of Norris et al. 2010 “Habitat patch size modulates terrestrial mammal activity patterns in Amazonian forest fragments”; Journal of Mammalogy; doi: 10.1644/09-MAMM-A-199.1.
First, some example data: the number and proportion of events for each hour on a 24-hour clock.

R code:

ahour<-c(1:24)
acount<-c(8,5,9,10,0,0,0,1,0,2,3,0,0,0,1,2,0,3,0,4,5,6,7,4)
proportions<-acount/sum(acount)
sum(proportions) # check total is 1!
df.hours<-data.frame(ahour,acount,proportions)
df.hours$ahour<-as.factor(df.hours$ahour) #convert to factor.
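The counts above are typed in by hand. If your data instead start as one record per event, the same table can be built by tabulating the hours. The sketch below assumes a hypothetical vector event_hours (made-up values, hours coded 1 to 24 as in the example); the name df.events is also just for illustration.

```r
# Hypothetical raw data: the hour (1-24) at which each event was recorded.
event_hours <- c(1, 1, 2, 4, 4, 4, 23, 24, 24)

# Giving factor() all 24 levels keeps hours without events as zero counts.
hour_counts <- table(factor(event_hours, levels = 1:24))

df.events <- data.frame(
  ahour       = factor(1:24),
  acount      = as.vector(hour_counts),
  proportions = as.vector(hour_counts) / length(event_hours)
)
head(df.events)
```

Keeping the empty hours matters for the plots that follow: without them the bars would no longer line up with positions on the clock.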


Now to plot some graphs. I tried a number of plotting functions from various packages and found the "ggplot2" package to be the most effective for what I wanted. Full details of the ggplot2 functions are available on the package website, which also has a user mailing list and a book (ggplot2: Elegant Graphics for Data Analysis) to answer further questions.

R code:
library(ggplot2)
adata <- ggplot(df.hours, aes(x=ahour,y=proportions)) #the data to be used
graph.a<-adata + geom_bar(width = 1, stat = "identity") # simple barchart
print(graph.a)



This standard representation suggests a bimodal pattern with two peaks, one around 4 and the other around 23. Compare this with a circular representation.

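To make a circular representation, add the ggplot2 coordinate system coord_polar() to the bar chart. Note that coord_polar() is a coordinate system rather than a "geom": it changes how the existing bars are drawn, not what is drawn.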


R code:

graph.b<-graph.a+coord_polar() # circular graph based on ggplot2 defaults
print(graph.b)

A circular representation shows an entirely different pattern: a single period of activity that spans midnight, with a general increase from 19 hrs and a peak at 4am.
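One way to check this reading numerically is to add up the events in a window that wraps around midnight. This is a small base R sketch using the example counts; the choice of window (19 hrs to 4 hrs) is my own illustration, not something taken from the publication.

```r
# Example counts from above, one value per hour (1-24).
acount <- c(8,5,9,10,0,0,0,1,0,2,3,0,0,0,1,2,0,3,0,4,5,6,7,4)

# A window that wraps around midnight: 19, 20, ..., 24, then 1, ..., 4.
night_hours <- c(19:24, 1:4)

night_share <- sum(acount[night_hours]) / sum(acount)
print(round(night_share, 2))
```

With these example data, 58 of the 70 events (about 83%) fall inside that single window, which is what the circular graph shows as one continuous block.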

Now to change the fonts and colours to something like what was used in the publication.
This uses a custom theme, "theme_pub", whose code is given further down. Run that code first, so that the function exists before you make the graph.

R code:

graph.pub<-graph.a+coord_polar(start=0.11)+theme_pub()
print(graph.pub)
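A note on the start argument used above: in coord_polar(), start is the offset of the starting point from 12 o'clock, given in radians. With 24 hourly bars, each bar covers 2*pi/24 radians, so 0.11 is a little less than half a bar. The sketch below works through that arithmetic and rotates the plot by exactly half a bar; that value is chosen purely for illustration and is not the one used in the publication. The names df.demo and graph.rotated are hypothetical.

```r
library(ggplot2)

# Angle covered by one of the 24 hourly bars, in radians.
bin_width_rad <- 2 * pi / 24
half_bin_rad  <- bin_width_rad / 2   # slightly more than the 0.11 used above
print(c(bin = bin_width_rad, half_bin = half_bin_rad))

# Example data from the post, rebuilt so this block runs on its own.
acount  <- c(8,5,9,10,0,0,0,1,0,2,3,0,0,0,1,2,0,3,0,4,5,6,7,4)
df.demo <- data.frame(ahour = factor(1:24),
                      proportions = acount / sum(acount))

# Rotate the plot by half a bar (illustrative value only).
graph.rotated <- ggplot(df.demo, aes(x = ahour, y = proportions)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar(start = half_bin_rad)
print(graph.rotated)
```

Trying a few values of start and comparing the plots is probably the quickest way to see where the hour labels end up.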



Below is the code for a new theme, which is one way to change font sizes and colours in ggplot2.

R code:

theme_pub <- function(base_size = 12) {
  structure(list(
    axis.line = theme_blank(),
    axis.text.x = theme_text(size = base_size * 2.4, face = "bold", lineheight = 0.4, vjust = 0),
    axis.text.y = theme_text(size = base_size * 2.4, lineheight = 0.9, hjust = 1),
    axis.ticks = theme_segment(colour = "black", size = 0.5),
    axis.title.x = theme_blank(),
    axis.title.y = theme_blank(),
    axis.ticks.length = unit(0.3, "lines"),
    axis.ticks.margin = unit(0.1, "lines"),
    legend.background = theme_rect(colour = NA),
    legend.key = theme_rect(colour = "grey80"),
    legend.key.size = unit(1.2, "lines"),
    legend.text = theme_text(size = base_size * 1.0),
    legend.title = theme_blank(),
    legend.position = "right",
    panel.background = theme_rect(fill = "white", colour = NA),
    panel.border = theme_rect(fill = NA, colour = "grey50"),
    panel.grid.major = theme_line(colour = "grey50", size = 0.2),
    panel.grid.minor = theme_line(colour = "grey50", size = 0.5),
    panel.margin = unit(0.1, "lines"),
    strip.background = theme_blank(),
    strip.label = theme_blank(),
    strip.text.x = theme_blank(),
    strip.text.y = theme_blank(),
    plot.background = theme_rect(colour = NA),
    plot.title = theme_blank()
  ), class = "options")
}
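The theme code above uses the ggplot2 theme functions that were available in 2010 (theme_blank(), theme_text(), theme_segment(), theme_rect(), theme_line()). Later ggplot2 releases replaced these with theme() and the element_*() functions, so the original will not run on a current installation. Below is a rough translation, offered as a sketch rather than an exact reproduction of the published figure. Assumptions: ggplot2 3.4.0 or later (for the linewidth argument); theme_bw() as the starting point; panel.spacing standing in for the old panel.margin; axis.ticks.margin and strip.label left out because I know of no direct counterpart. The name theme_pub_modern is hypothetical.

```r
library(ggplot2)

# Hypothetical modern counterpart of theme_pub(); assumes ggplot2 >= 3.4.0.
theme_pub_modern <- function(base_size = 12) {
  theme_bw(base_size = base_size) +
    theme(
      axis.line = element_blank(),
      axis.text.x = element_text(size = base_size * 2.4, face = "bold",
                                 lineheight = 0.4, vjust = 0),
      axis.text.y = element_text(size = base_size * 2.4,
                                 lineheight = 0.9, hjust = 1),
      axis.ticks = element_line(colour = "black", linewidth = 0.5),
      axis.title.x = element_blank(),
      axis.title.y = element_blank(),
      axis.ticks.length = unit(0.3, "lines"),
      legend.background = element_rect(colour = NA),
      legend.key = element_rect(colour = "grey80"),
      legend.key.size = unit(1.2, "lines"),
      legend.text = element_text(size = base_size),
      legend.title = element_blank(),
      legend.position = "right",
      panel.background = element_rect(fill = "white", colour = NA),
      panel.border = element_rect(fill = NA, colour = "grey50"),
      panel.grid.major = element_line(colour = "grey50", linewidth = 0.2),
      panel.grid.minor = element_line(colour = "grey50", linewidth = 0.5),
      panel.spacing = unit(0.1, "lines"),   # was panel.margin
      strip.background = element_blank(),
      strip.text.x = element_blank(),
      strip.text.y = element_blank(),
      plot.background = element_rect(colour = NA),
      plot.title = element_blank()
    )
}

# Usage with the example data, rebuilt so this block runs on its own.
acount <- c(8,5,9,10,0,0,0,1,0,2,3,0,0,0,1,2,0,3,0,4,5,6,7,4)
df.demo <- data.frame(ahour = factor(1:24),
                      proportions = acount / sum(acount))
graph.modern <- ggplot(df.demo, aes(x = ahour, y = proportions)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar(start = 0.11) +
  theme_pub_modern()
print(graph.modern)
```

Starting from theme_bw() means any element not listed here takes that theme's default, so the result may differ in small details from the 2010 output.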
