Thursday, July 5, 2012

pd.resemble: an R function for calculating the pairwise resemblance in phylogenetic diversity of ecological samples

Here's a function I have written for the R statistical environment that calculates the pairwise resemblance (i.e. either similarity or dissimilarity) in Phylogenetic Diversity of multiple samples. I am providing it for free and without warranty under the GNU General Public License. You need to be familiar with R to use this function. The function also requires that the ape package be installed. To load the function, place the file in your working folder and type: source("pdresemble.R")

Latest version: 7th March 2011.

Please note that this function was previously called “phylosim”. However, an R package now has that name, so I changed the name of my function. This version of the function also allows for the calculation of either similarity or dissimilarity while previous versions only calculated similarity, although conversion between the two is trivial.

pd.resemble(x, phy, incidence = T, method = "sorensen", dissim = T)
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.
incidence is a logical indicating whether the data are to be treated as incidence (binary presence-absence) or abundance.
method indicates the particular form of the resemblance index you wish to use. Current options are: "sorensen" (default; 2a/(2a+b+c)), "jaccard" (a/(a+b+c)), "simpson" (a/(a+min{b,c})) and "faith" ((a+0.5d)/(a+b+c+d)).
dissim is a logical indicating whether the pairwise resemblance scores should be returned as dissimilarity (the default) or as similarity.
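
As a rough illustration of the four index forms listed above, here is a minimal sketch in base R. It is not the code inside pd.resemble. The helper name pd_similarity is hypothetical, and I am assuming the usual reading of the symbols: a is the branch length shared by the two samples, b and c are the branch lengths unique to each sample, and d is the branch length in the tree found in neither. Sørensen is written in its standard form, 2a/(2a+b+c), and dissimilarity is assumed to be 1 minus similarity.

```r
# Hypothetical helper: similarity from the branch-length components a, b, c, d.
# a = shared by both samples, b and c = unique to each sample,
# d = present in the tree but in neither sample (assumed definitions).
pd_similarity <- function(a, b, c, d = 0, method = "sorensen") {
  switch(method,
    sorensen = 2 * a / (2 * a + b + c),
    jaccard  = a / (a + b + c),
    simpson  = a / (a + min(b, c)),
    faith    = (a + 0.5 * d) / (a + b + c + d),
    stop("unknown method: ", method)
  )
}

# Made-up branch lengths for one pair of samples.
sim <- pd_similarity(a = 4, b = 1, c = 3, method = "jaccard")  # 4 / 8
dis <- 1 - sim  # assumed conversion from similarity to dissimilarity
```

Note that only "faith" uses d, which is why it defaults to 0 in this sketch.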
pd.resemble takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates the resemblance in Phylogenetic Diversity (PD-resemblance) of all pairwise combinations of samples/sites. The principles for calculating PD-resemblance on incidence data are discussed by Ferrier et al. (2007). I have extended this approach to include abundance data (Nipperess et al. 2010).
pd.resemble returns a dist object giving the PD-resemblance of all pairwise combinations of samples/sites in x.
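
To show how the pieces fit together, here is a small usage sketch. The community table, site names and tree are toy data invented for illustration. The call to pd.resemble only runs if ape is installed and pdresemble.R is in the working folder, so treat it as an outline of a typical session rather than a tested recipe.

```r
# Toy community table: samples/sites as rows, species/OTUs as columns.
comm <- matrix(
  c(1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("site1", "site2", "site3"), c("A", "B", "C", "D"))
)

# Rooted tree with branch lengths (Newick); tip labels match the column names.
newick <- "((A:1,B:1):1,(C:1,D:1):1);"

# Run only when ape is installed and the function file is available.
if (requireNamespace("ape", quietly = TRUE) && file.exists("pdresemble.R")) {
  library(ape)
  source("pdresemble.R")
  tree <- read.tree(text = newick)
  pd_dis <- pd.resemble(comm, tree, incidence = TRUE,
                        method = "sorensen", dissim = TRUE)
  print(pd_dis)  # dist object: one value per pair of sites
}
```

Because the result is a dist object, it should be usable wherever R expects a distance matrix, for example as input to hclust.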
Ferrier S, Manion G, Elith J & Richardson K. 2007. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity & Distributions 13: 252-264.
Nipperess DA, Faith DP & Barton K. 2010. Resemblance in phylogenetic diversity among ecological assemblages. Journal of Vegetation Science 21: 809-820.
