Statistical methods for alerts

The alert system page contains a general overview of how the alert system works. More detailed information is given below about the statistical methods used to estimate population indices, population changes and their confidence intervals.

General structure of the data

The data for all of the schemes reported here consist of annual counts made over a period of years at a series of sites. They can thus be summarised as a data matrix of sites x years, within which a proportion of the cells contain missing values because not all of the sites are covered every year. Such data can be represented as a simple model:

log (count) = site effect + year effect

Each site has a single site-effect parameter. These site parameters are not usually of biological interest but they are important because abundance is likely to differ between sites. The main parameters of interest are the year effects. These can be modelled either with the same number of parameters as years (an annual model), or with a smaller number of parameters, representing a smoothed curve.

A simple annual model would be fitted as a generalised linear model with Poisson errors and a log link function. This is the main model provided by the program TRIM (Pannekoek & van Strien 1996), which is widely used for population monitoring.
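As a minimal sketch of how such an annual model can be fitted (this is not TRIM's own algorithm), the Poisson likelihood equations for a sites x years table can be solved by alternately rescaling the site and year effects until the fitted totals match the observed totals for every site and every year, using observed cells only. The function name fit_site_year, the example counts and the choice to scale the first year to 1 are assumptions made for illustration.

```python
def fit_site_year(counts, n_iter=200):
    """Fit count = site effect * year effect (log-linear Poisson model).

    counts: one row per site, one entry per year; None marks a missing visit.
    Returns (site_effects, year_effects), with the first year scaled to 1.
    Assumes every site and every year has at least one positive count.
    """
    n_sites = len(counts)
    n_years = len(counts[0])
    site = [1.0] * n_sites
    year = [1.0] * n_years

    for _ in range(n_iter):
        # Update each site effect so fitted and observed site totals agree.
        for i in range(n_sites):
            observed = sum(c for c in counts[i] if c is not None)
            expected = sum(year[j] for j in range(n_years)
                           if counts[i][j] is not None)
            site[i] = observed / expected
        # Update each year effect so fitted and observed year totals agree.
        for j in range(n_years):
            observed = sum(counts[i][j] for i in range(n_sites)
                           if counts[i][j] is not None)
            expected = sum(site[i] for i in range(n_sites)
                           if counts[i][j] is not None)
            year[j] = observed / expected

    # Express year effects relative to the first year (an arbitrary choice).
    base = year[0]
    year = [y / base for y in year]
    site = [s * base for s in site]
    return site, year


# Hypothetical data: 3 sites x 4 years, with three missing cells.
example_counts = [
    [10, 12, None, 15],
    [20, None, 16, 30],
    [5, 6, 4, None],
]
site_effects, year_effects = fit_site_year(example_counts)
```

The year effects returned here play the role of the annual index. A production analysis would normally use an established GLM routine, which also provides standard errors.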

Fitting smoothed trends

Our preferred method for generating a smoothed population trend is to fit a smoothed curve to the data directly using a generalised additive model (GAM) (Hastie & Tibshirani 1990, Fewster et al. 2000). Thus the model from the previous section becomes:

log (count) = site effect + smooth (year)

where smooth (year) represents some smoothing function of the year effect. It was not straightforward to fit GAMs to the bird census data and we have therefore fitted smoothed curves with a similar degree of smoothing to the annual indices (details below).

The non-parametric smoothed curve fitted in our models is based on a smoothing spline. The degree of smoothing is specified by the number of degrees of freedom (df). A simple linear trend has df = 1, whereas the full annual model has df = t-1, where t is the number of years in the time series. Here we set df to be approximately 0.3 times the number of years in the time series (Fewster et al. 2000). The degrees of freedom used for the main data sets presented in this report are summarised below.

 
Years        Length of time series   df for smoothed index
1966–2010    45                      14
1974–2010    37                      11
1994–2010    17                      5
1928–2010    83                      25
1983–2010    28                      8

Note that the numbers of years shown here differ from those available for calculating change measures, for two reasons: we use the whole time series available for analysis (i.e. before the end points are truncated), and we count the number of years in the time series rather than the number of annual change measures.
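The tabulated degrees of freedom are consistent with counting both end years of each series and rounding 0.3 times that count to the nearest whole number. The sketch below checks this arithmetic; the function names are hypothetical, and the rounding rule is an assumption, since the text says only "approximately 0.3 times".

```python
def series_length(first_year, last_year):
    """Number of years in a time series, counting both end years."""
    return last_year - first_year + 1


def smoothing_df(n_years):
    """0.3 * n_years rounded to the nearest integer (assumed rule).

    Integer arithmetic avoids floating-point surprises at exact halves.
    """
    return (3 * n_years + 5) // 10


# Time series listed in the table above.
for first, last in [(1966, 2010), (1974, 2010), (1994, 2010),
                    (1928, 2010), (1983, 2010)]:
    n = series_length(first, last)
    print(f"{first}-{last}: {n} years, df = {smoothing_df(n)}")
```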

CBC/BBS, WBS/WBBS and BBS trends

The model fitted to the combined CBC/BBS and WBS/WBBS data is that historically employed for the BBS – a generalised linear model with counts assumed to follow a Poisson distribution and a logarithmic link function. Standard errors were calculated via a bootstrapping procedure involving 199 replications. For presentation in the figures, both the population trend and its confidence limits were also subsequently smoothed using a thin-plate smoothing spline. The overall result is a smoothed trend that is mathematically equivalent to that produced from a generalised additive model.

Heronries Census trends

The Heronries Census data were analysed using a modified sites x years model based on ratio estimation which incorporates information about new colonies (sites) that have been established and other colonies from the sample that are known to have become extinct. The method was developed by Thomas (1993) specifically in relation to the heronries data set. Since then the heronries database has been substantially upgraded and the method has been applied to the full data set (Marchant et al. 2004).

The above method of analysis cannot be easily applied within a GAM framework. Therefore we fitted a smooth curve to the annual indices. This was done using PROC TPSPLINE of SAS (SAS 2009). This procedure should give very similar estimates to a GAM analysis but it does not provide confidence intervals for the smoothed population trend or the change measures derived from it. This is not a serious limitation as there are presently no alerts for Grey Heron, whose populations have generally been increasing.

Constant Effort Sites trends

GAMs were fitted to the CES data for catches of adults and juveniles separately, with the addition of an offset to correct for missing visits. Confidence limits were estimated using a bootstrap technique to avoid restrictive assumptions about the distribution of the data. Bootstrap samples were drawn from the data by sampling plots with replacement. We generated 199 bootstrap samples from each data set and fitted a GAM to each of them. Confidence limits for the smoothed population indices (85% confidence limits) and change measures (90% confidence limits) were determined by taking the appropriate percentiles from the distributions of the bootstrap estimates, in a similar manner to that employed for the WBS/WBBS trends.
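The resampling scheme can be illustrated with a minimal sketch. Plots are resampled with replacement, a change measure is recomputed for each of 199 bootstrap samples, and the confidence limits are read off the sorted estimates. To keep the example short, the GAM fit is replaced by a much simpler stand-in statistic (the ratio of the summed last-year count to the summed first-year count); the function name, the stand-in statistic and the rule for picking ranks are all assumptions, not the procedure used for the report.

```python
import random


def bootstrap_change_limits(plot_counts, n_boot=199, level=0.90, seed=1):
    """Percentile bootstrap limits for a simple change measure.

    plot_counts: one sequence of annual counts per plot (no missing values,
                 and positive first-year counts, in this simplified sketch).
    Returns (lower, upper) limits at the requested confidence level.
    """
    rng = random.Random(seed)
    n_plots = len(plot_counts)
    estimates = []
    for _ in range(n_boot):
        # Resample whole plots with replacement, keeping each plot's series intact.
        sample = [plot_counts[rng.randrange(n_plots)] for _ in range(n_plots)]
        first = sum(series[0] for series in sample)
        last = sum(series[-1] for series in sample)
        estimates.append(last / first)  # stand-in for a GAM-based change measure
    estimates.sort()

    # With 199 replicates, (n_boot + 1) * tail probability is a whole number
    # for common levels: ranks 10 and 190 for 90%, ranks 15 and 185 for 85%.
    alpha = (1.0 - level) / 2.0
    lower_rank = round((n_boot + 1) * alpha)
    upper_rank = round((n_boot + 1) * (1.0 - alpha))
    return estimates[lower_rank - 1], estimates[upper_rank - 1]


# Hypothetical counts for five plots over four years.
plots = [(12, 10, 9, 8), (30, 28, 31, 25), (7, 9, 6, 6), (15, 14, 12, 13), (22, 20, 18, 15)]
lower, upper = bootstrap_change_limits(plots, level=0.90)
```

Resampling whole plots, rather than individual counts, preserves the dependence between years within a plot, which is why no distributional assumption about the counts is needed.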