About my research

I am a computational biologist: I address longstanding biological questions by developing new techniques. Currently, my research follows two complementary strands:

  1. Understanding variation in the tree of life. I develop and apply statistical approaches to understand why the tree of life is so uneven: some groups of species are far more diverse than others, and species traits are unevenly distributed.
  2. Understanding the processes that generate and maintain diversity. I am using mathematical and simulation approaches to understand the processes that prevent a "single winner" from dominating ecosystems.

Highlights from selected publications

How much of the world is woody?

with Matt Pennell, Amy Zanne, Peter Stevens, Dave Tank and Will Cornwell

Journal of Ecology, 2014. doi: 10.1111/1365-2745.12260

We tried to answer the simple question of what fraction of the world's plant species are woody. Surprisingly, the answer is not known. We surveyed researchers and found a huge range of estimates. We then used a trait database that spanned 12% of plant diversity, and found that, because woodiness is unevenly distributed across taxonomic groups, estimates taken directly from the database were extremely biased. We used Monte Carlo (sampling) methods to correct this bias and found that just under half the world's vascular plant species are woody. Surprisingly, this was much higher than the estimates from researchers, even those with significant botanical training.
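
The bias correction can be illustrated with a minimal sketch. This is not the paper's code: the grouping by genus, the binomial draw, the function names and the toy counts are all assumptions made for illustration. The idea is that each poorly sampled group is filled in using its own observed fraction of woody species, rather than the fraction across the whole database.

```python
import random

# Toy data (hypothetical): one large, poorly sampled, mostly herbaceous genus
# and one small, well sampled, entirely woody genus.
TOY_GENERA = [
    {"n_species": 1000, "n_woody": 1, "n_herbaceous": 9},
    {"n_species": 100, "n_woody": 50, "n_herbaceous": 0},
]


def naive_fraction(genera):
    """Woody fraction among sampled species only, ignoring who was sampled."""
    woody = sum(g["n_woody"] for g in genera)
    sampled = sum(g["n_woody"] + g["n_herbaceous"] for g in genera)
    return woody / sampled


def simulate_woody_fraction(genera, n_draws=1000, seed=0):
    """Monte Carlo estimates of the overall woody fraction.

    Assumes every genus has at least one sampled species. Each unsampled
    species is drawn as woody with probability equal to the fraction
    observed in its own genus.
    """
    rng = random.Random(seed)
    total_species = sum(g["n_species"] for g in genera)
    estimates = []
    for _ in range(n_draws):
        woody = 0
        for g in genera:
            sampled = g["n_woody"] + g["n_herbaceous"]
            unsampled = g["n_species"] - sampled
            p_woody = g["n_woody"] / sampled
            # Known woody species plus simulated states for unsampled species.
            woody += g["n_woody"]
            woody += sum(rng.random() < p_woody for _ in range(unsampled))
        estimates.append(woody / total_species)
    return estimates


estimates = simulate_woody_fraction(TOY_GENERA)
print("naive:", naive_fraction(TOY_GENERA))
print("corrected mean:", sum(estimates) / len(estimates))
```

In this toy data the naive estimate is dominated by the well sampled woody genus, so the corrected value is much lower. The direction of the bias depends entirely on which groups are over-sampled, and the toy numbers say nothing about the real result.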

All code and data to reproduce this paper are available on GitHub, and an automatically generated analysis complete with code and figures is available here. This project uses continuous integration to ensure that it is fully reproducible.

diversitree: Comparative phylogenetic analyses of diversification in R

Methods in Ecology and Evolution, 2012. doi:10.1111/j.2041-210X.2012.00234.x

This is the companion paper to my package diversitree. It outlines the overall design goals of the package. I introduce a new method, "MuSSE", for partitioning the effects that multiple traits may have on diversification. I also describe two new algorithms for fitting evolutionary models: a linear-time algorithm for fitting continuous trait models based on Brownian motion (always faster than the traditional cubic-time algorithm based on covariance matrices), and an algorithm for fitting discrete traits with many levels (linear in the number of levels, rather than quadratic or worse).
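
To give a sense of how a Brownian motion likelihood can be computed in linear time, here is a minimal sketch of one well-known approach: a single post-order pass over the tree in the style of Felsenstein's independent contrasts. It is an illustration only, not the diversitree implementation. The nested-dictionary tree format and the function name `bm_contrast_loglik` are assumptions, and the function returns the log-likelihood of the contrasts, which does not depend on the root state.

```python
import math


def bm_contrast_loglik(node, sigma2):
    """Return (mean, extra_variance, loglik) for the subtree rooted at `node`.

    A tip is a dict with "value" and "length"; an internal node is a dict
    with "children" (exactly two) and, except at the root, "length".
    Each node is visited once, so the cost is linear in the number of tips.
    """
    if "value" in node:
        # A tip is observed exactly: no extra variance, no likelihood term.
        return node["value"], 0.0, 0.0

    left, right = node["children"]
    m1, v1, ll1 = bm_contrast_loglik(left, sigma2)
    m2, v2, ll2 = bm_contrast_loglik(right, sigma2)

    # Effective branch lengths: the branch itself plus the uncertainty
    # accumulated below it.
    s1 = left["length"] + v1
    s2 = right["length"] + v2
    total = s1 + s2

    # The contrast between the two subtrees is Normal(0, sigma2 * total).
    contrast = m1 - m2
    ll = -0.5 * (math.log(2 * math.pi * sigma2 * total)
                 + contrast ** 2 / (sigma2 * total))

    # Precision-weighted mean at this node, and the variance it carries upward.
    mean = (m1 * s2 + m2 * s1) / total
    extra_variance = s1 * s2 / total
    return mean, extra_variance, ll1 + ll2 + ll


# Hypothetical three-tip tree: ((A:1, B:1):1, C:2)
tree = {"children": [
    {"length": 1.0, "children": [
        {"value": 1.2, "length": 1.0},
        {"value": 0.8, "length": 1.0},
    ]},
    {"value": 2.5, "length": 2.0},
]}
print(bm_contrast_loglik(tree, sigma2=0.5)[2])
```

The covariance-matrix approach instead builds an n-by-n matrix for n tips and inverts or factorises it, which is where the cubic cost comes from. The recursion above never forms that matrix.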

All code and data to replicate the analysis are available on GitHub. The package is available on CRAN and on GitHub. Diversitree has now been used in over 100 studies, to answer questions I never dreamt of.

Blog posts covering the biological question and challenges in making the analysis reproducible.

Quantitative traits and diversification

Systematic Biology, 2010. doi:10.1093/sysbio/syq053

I develop a new method, QuaSSE, to infer the effect of continuous traits such as body size on speciation and extinction rates, while simultaneously modelling the evolution of these traits. I show that under ideal situations with simulated data, the method can infer patterns of diversification deep in the past, even while using only extant species data. Using a tree of all primates, I tested the long-standing hypothesis that species tend to become larger over time while large species diversify less rapidly than smaller species. I found little support for this hypothesis.

This method is implemented within diversitree as make.quasse. The analysis from the paper is replicated within the tutorial (pp. 23-30, source). This paper won the Systematic Biology "Publishers Award for Excellence in Systematic Research".

Estimating trait-dependent speciation and extinction rates from incompletely resolved phylogenies

with Wayne Maddison and Sally Otto

Systematic Biology, 2009. doi:10.1093/sysbio/syp067

In this paper I present ideas, algorithms and code for detecting the effect of species traits on rates of speciation and extinction. The paper generalises the then-new BiSSE method to cases where only part of the structure of the phylogeny is known. This relaxes key assumptions in the original method and allows BiSSE to be used on much broader classes of data than was previously possible.

This method is implemented within diversitree as make.bisse. The analysis from the paper is replicated within the tutorial (pp. 15-20, source).