Abstract from BTO Research Report No 226:
Stephen R Baillie & Mark M Rehfisch
(eds.) (2006)
National and site-based alert systems for UK birds
ISBN 1-904870-78-3
EXECUTIVE SUMMARY AND RECOMMENDATIONS
This report presents the conclusions of a working group that met
in 1998 to consider alert systems for bird populations in the United
Kingdom. Inputs to the working group were provided by a series of
research and evaluation projects reported here, together with a
workshop involving both statisticians and ecologists. The members
of the working group were as follows:
David Gibbons (RSPB, Chairman), Stephen Baillie (BTO), Mark O’Connell
(WWT), Stephen Freeman (BTO), Rhys Green (RSPB), Richard Gregory
(BTO), Phil Grice (EN), Mark Rehfisch (BTO), Susan Davies (JNCC)
and David Stroud (JNCC).
NATIONAL AND REGIONAL ALERTS
There are great advantages in having a system which has a consistent
basis across both terrestrial breeding birds and wintering waterbirds
and which can eventually be extended to all species with appropriate
monitoring data. We have therefore presented our conclusions with
this end in mind. Details will need to vary in relation to the available
monitoring data and the analytical methods that can be applied to
specific types of data.
Overall approach
1. Thresholds and time periods
Alerts should flag up population declines of 25-50% and of 50% or
more measured over the whole data series, 25 years, 10 years and
5 years.
For terrestrial breeding birds the start of the maximum CBC period
over which changes will be measured will be taken as 1968. This
avoids problems relating to the severe winter of 1962/63 and early
changes in CBC methods. When fitting GAMs it will be preferable
to include data from 1966 and 1967 in the fit, but to
calculate changes from 1968 so as to avoid endpoint effects (below).
For counts of wintering waterbirds the winter of 1969/70 will be
treated as the start of the index series. Note that counts of ducks
and geese go back for some years before this but in the absence
of major declines prior to 1969/70 it seems better to adopt a single
start date for all waterbirds.
2. Measurement of population changes using smoothed population
indices
Population changes will be measured as the percentage change (on
an arithmetic scale) between two points on a population index curve
that has been smoothed to remove short-term environmental fluctuations.
3. Significance tests
90% confidence intervals for population changes will be calculated.
Alerts will only be flagged up when these confidence intervals do
not overlap zero change (i.e. the change can be said to be statistically
significant). Formal significance tests against the 25% or 50% threshold
levels will not be conducted.
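The rule set out in items 1 to 3 can be sketched in a few lines of
Python. This is a minimal illustration, not the working group's
code: the function names are hypothetical, changes are expressed as
percentages, and the interval passed in is assumed to be the 90%
confidence interval for the change. Significance is judged against
zero change only, not against the 25% or 50% thresholds.

```python
def percent_change(index_start: float, index_end: float) -> float:
    """Percentage change (arithmetic scale) between two smoothed index values."""
    return 100.0 * (index_end - index_start) / index_start


def classify_alert(change: float, lower: float, upper: float) -> str:
    """Classify a change given its 90% confidence limits (all in percent).

    Hypothetical helper: an alert is flagged only when the whole
    interval lies below zero change.
    """
    if upper >= 0.0:  # interval overlaps zero change: not significant
        return "no alert"
    if change <= -50.0:  # decline of 50% or more
        return "rapid decline"
    if change <= -25.0:  # decline of 25-50%
        return "moderate decline"
    return "no alert"


# Illustrative values only, not taken from any real index series.
change = percent_change(100.0, 40.0)
print(change, classify_alert(change, lower=-75.0, upper=-40.0))
```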
4. Description of data quality
Alerts should be assessed for all data sets which are likely to
produce useful information, even where these fall short of the sample
sizes and representativeness that would be ideal. Use of confidence
limits will largely guard against flagging spurious alerts from
small samples, but they will not guard against problems of unrepresentativeness.
The sources of information used to calculate alerts and known limitations
on the extent to which the information may not be fully representative
should be as transparent as possible. All alert reports should include
the number of sites included in the analysis. Where the data may
not be fully representative the known limitations should be recorded,
even if this is only possible in descriptive terms (e.g. Table 3.1).
A method of assessing the likely representativeness of CBC indices
using data from the 1988-91 Breeding Atlas was devised by Gibbons
et al. (1996) as part of the Birds of Conservation Concern listing
process. The ratio of mean frequency index in all Atlas squares
to mean frequency index in squares with CBC plots was inspected
and CBC data were then not used for species which had higher frequencies
in areas without CBC plots. We propose that this ratio should be
included in alert tables based on CBC data. Similar approaches could
potentially be developed for other schemes where sampling is not
fully representative.
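A minimal sketch of this ratio, assuming that a frequency index
is available for each Atlas square. The function name and the
example values are hypothetical, and the exact criterion applied
by Gibbons et al. (1996) is not reproduced here. Because squares
with CBC plots are a subset of all squares, a ratio above 1 means
that frequencies are higher in squares without CBC plots.

```python
from statistics import mean


def representativeness_ratio(freq_all_squares, freq_cbc_squares):
    """Mean frequency index in all Atlas squares divided by the mean
    frequency index in squares that contain CBC plots."""
    return mean(freq_all_squares) / mean(freq_cbc_squares)


# Illustrative values only: the CBC squares are the two lowest-frequency squares.
ratio = representativeness_ratio([0.2, 0.4, 0.6, 0.8], [0.2, 0.4])
print(round(ratio, 2), "above 1" if ratio > 1 else "1 or below")
```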
For wetland birds it may often be possible to provide an estimate
of the proportion of the population that is included in the counts.
Data deficient species should be flagged separately. These will
typically be species where the change measures have very wide confidence
limits but where we cannot be certain that the population is not
in serious decline. For example a species with change confidence
limits of +5% to -60% would not flag a formal alert but is most
likely to be in at least moderate decline. Such species are likely
to be candidates for increased monitoring or special surveys.
5. Presentation of alert results
When considering alerts for any particular grouping (e.g. widespread
UK terrestrial breeding birds) change measures for all species for
which data are available should be presented. These tabulations
should include summary information on data quality and representativeness,
as notes against each species, with additional footnotes if necessary.
They should record the population changes for all time periods of
interest and which of these are statistically significant. Significant
changes of 25-50% (moderate declines) and 50% or more (rapid declines)
should be flagged. It would be possible to divide the alerts into
many more categories based on time periods, consistency of declines,
data quality and so on. However, we do not think this would aid
interpretation, as such categories may be difficult to understand
for those who are not familiar with the system. It would therefore
be better just to describe particular alerts appropriate to any
specific context, based on the information given in the alerts tables
(e.g. “The Corn Bunting shows a rapid decline but the data
are mainly from southern Britain”). Data deficient species
should be flagged separately (above).
6. Status and dissemination of alerts information
National, country and regional alerts will be advisory and are intended
to act as triggers for closer scrutiny of results and potential
further investigation by interested parties. They will be released
to the ornithological/conservation communities annually, at agreed
times which fit into the analysis and reporting timetables of the
schemes involved. The information will be made available to all
interested parties and will be in the public domain. It is important
that there should be close co-ordination between relevant conservation
and research bodies to ensure that publicity and interpretation
of the results presents a coherent picture. Details of timing of
publication and of the co-ordination of publicity and interpretation
are best dealt with as part of the management arrangements for the
schemes and partnerships involved (i.e. WeBS, BBS and the JNCC/BTO
Partnership). In addition to producing alerts for the United Kingdom
there is interest in eventually producing all-Ireland alerts in
collaboration with colleagues from the Republic of Ireland.
7. Data to be analysed
Alerts for terrestrial breeding birds will be calculated from Common
Birds Census data for the whole UK, although these have known bias
towards southern Britain. Alerts will also be calculated from BBS
data as they become available for particular time periods. BBS will
allow separate alerts to be calculated for Scotland, England, Wales
and Northern Ireland as well as for the UK as a whole. Alerts for
individual countries will cover fewer species than those for the
whole of the UK. Work to combine indices from CBC and BBS, where
possible, is advanced but falls outside the present project.
Alerts based on WeBS data will be produced annually for the UK,
Scotland, England, Wales and Northern Ireland (and also for individual
designated sites - below).
Technical issues
1. Choice of statistical models
Generalized Additive Models with site and time effects provide the
preferred means of calculating long-term population changes (section
2.3). They have the advantage of placing the analyses within a coherent
statistical framework and yet avoiding any restrictive assumptions
about the shape of the trend curves. Furthermore, models providing
annual population indices (i.e. a sites x years model within log-linear
Poisson regression, section 2.2 or the standard Underhill sites
x years x months model, sections 4 and 5) are just special cases
of a GAM. The modified Underhill index, smoothed over a 3-year moving
window, provides an alternative method for waterbirds. Results will
be similar to a GAM but the smoothing is less refined (section 4).
We recommend that trend analyses for terrestrial birds should be
carried out using GAMs. The long-term aim should be to apply these
to the waterbirds data as well. However, due to computing requirements
(see Practicalities of computing below) the Underhill model with a 3-year moving window should
be adopted as an interim solution. Where GAMs are used it is important
that the precise smoothing algorithm and level of smoothing (below)
should be reported.
2. Degree of smoothing
Degrees of freedom (i.e. amount of smoothing) appropriate for trend
modelling using GAMs are examined in section 3.2 for terrestrial
species and in section 5.4 for waterbirds. Plots showing trend curves
with varying numbers of degrees of freedom are shown in figures
3.1 and 5.1. Our general recommendation for both types of data is
to follow Fewster et al. (2000) and use 0.3 x t degrees of freedom,
where t is the number of years in the time-series.
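As a worked example of the 0.3 x t rule, the sketch below computes
the degrees of freedom for a series running from 1966 to 1998.
The function name is hypothetical, and rounding to the nearest
whole number is an assumption made here for illustration; the
summary does not say how fractional values are handled.

```python
def gam_degrees_of_freedom(first_year: int, last_year: int, factor: float = 0.3):
    """Return (t, df) where t is the number of years in the series
    and df is factor * t rounded to the nearest integer (assumed)."""
    t = last_year - first_year + 1
    return t, round(factor * t)


t, df = gam_degrees_of_freedom(1966, 1998)
print(t, df)  # 33 years gives 0.3 * 33 = 9.9, rounded to 10
```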
Underhill proposes calculating changes between counts averaged
over three or five years (section 4). He has not compared averaging
over different numbers of years in detail. In the absence of further
work we propose that the data should be averaged over three years.
There is a special problem for a few terrestrial species, notably
the Wren, where the above degrees of freedom leave fluctuations
that will trigger some “false alarms” (section 3.2).
In the medium term it may be possible to develop models incorporating
weather covariates in order to deal with this problem. For the
time being we recommend applying the same level of smoothing to
all species and dealing with such problems through interpretation
of the alerts.
3. Treatment of endpoints
All methods of trend estimation are likely to give less reliable
estimates for the endpoints of the series because they are based
on fewer data. Problems of endpoint estimation within Generalized
Additive Models are addressed for terrestrial birds in section 3.2
and for waterbirds in section 5.7. Inspection of graphs based on
fitting GAMs to different numbers of years of data indicates that
occasional endpoints do differ noticeably from the long-term trend.
These could in principle trigger “false alarms”. We
therefore recommend that population changes used to evaluate alerts
should be calculated from year t-1, where t is the last year of
the index series. While the change is measured from t-1, all years
of data up to and including year t would be used to fit the GAM on which the change estimates
are based. A similar principle should be applied to calculations
based on the beginning of the time series. Deviations of endpoints
from the long-term trend have not been evaluated for the Underhill
method. Note, however, that the use of three-year means to calculate
population changes from this method will result in a measure of
change from t-1 as proposed above.
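The endpoint rule can be sketched as follows, assuming a smoothed
index is already available for every fitted year. The function
name, the period labels and the convention that an n-year change
runs from year t-1-n to year t-1 are assumptions made for
illustration; the smoothed values are synthetic.

```python
def alert_changes(smoothed, first_measured_year, periods=(25, 10, 5)):
    """Percentage changes ending at year t-1, where t is the last fitted year.

    smoothed maps year -> smoothed index for ALL fitted years, so the
    final year still informs the fit but is not used as an endpoint.
    """
    end = max(smoothed) - 1  # measure to t-1, not t
    starts = {"whole series": first_measured_year}
    for n in periods:
        if end - n >= first_measured_year:  # skip periods longer than the series
            starts[f"{n} years"] = end - n
    return {
        label: 100.0 * (smoothed[end] - smoothed[start]) / smoothed[start]
        for label, start in starts.items()
    }


# Synthetic index fitted from 1966 to 1998, with changes measured from 1968.
index = {year: 200.0 - 2.0 * (year - 1966) for year in range(1966, 1999)}
for label, change in alert_changes(index, first_measured_year=1968).items():
    print(label, round(change, 1))
```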
4. Confidence and consistency intervals
All the methods proposed calculate confidence intervals for smoothed
index values and population changes using a bootstrapping technique
where the data are resampled by site. This technique does not assume
that the data are described by any theoretical statistical distribution,
but instead regards the distribution of the data as approximating
that of the population from which they were sampled. Regional indices
may sometimes necessarily be based on small numbers of sites. The
statistical theory on which bootstrapping is based is only well
developed in certain conditions and for “large” samples.
The extent to which it is robust for small samples is unknown. It
is therefore not possible to specify any particular sample size
below which bootstrapping will not give reliable results, but any
results for sample sizes of less than 10 should certainly be treated
with particular caution (S.N. Freeman pers. comm.). Interpretation
of the confidence limits calculated in this way for many waterbirds,
particularly estuarine waders, differs somewhat from that of conventional
confidence intervals because a high proportion of the total population
is counted each year. Thus Underhill has often referred to these
as “consistency intervals” because they measure the
extent to which changes on different sites are consistent. We believe
that such consistency intervals are still helpful for interpreting
alerts, particularly because counts are subject to considerable
counting error even where a high proportion of the population is
covered. It may be useful to distinguish those waterbird populations
where most of the population is counted from those where the counts
are more like a sample survey, and for which confidence limits are
therefore more appropriate. We suggest that 70% of the population
being included in the counts would be a suitable threshold for this.
Terminology describing these intervals should be standardized
as far as possible. We recommend that the intervals for both terrestrial
birds and waterbirds should be referred to as confidence limits
within alerts reports as this term is widely understood. We may
wish to have a standard footnote which explains the interpretation
of these confidence limits for some waterbirds.
The GAM analyses for both terrestrial breeding birds (section
3) and waterbirds (section 5) use 95% confidence limits, while Underhill
has always used 90% limits. The selection of 90% or 95% limits is
arbitrary but we strongly recommend that a standard approach should
be adopted for both groups. We recommend that 90% confidence limits
should be used on the basis that we are only testing for declines
(i.e. we are doing a one-tailed test).
Ideally confidence limits would be calculated from about 1000
bootstrap replicates. In practice, however, the number of replicates
is limited by computing power, particularly for GAMs. In this report
GAMs for terrestrial species had 119 bootstrap replicates while
GAMs for waterbirds were generally run with 199 bootstrap replicates.
Analyses of Dunlin data with different numbers of bootstrap replicates
(Figure 5.2) indicate that 100-200 replicates should provide adequate
confidence intervals. Confidence intervals for the Underhill model
were based on 500 bootstrap replicates. Normally 119 bootstrap replicates
should be sufficient but if significance is marginal then more replicates
should be undertaken.
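The resampling-by-site idea can be sketched as below. To keep the
example short and self-contained, the GAM (or Underhill) index is
replaced by a much simpler stand-in: the summed count across
sites in each of two years. That substitution, the function name
and the percentile-rank rule for the limits are assumptions made
for illustration, and every site is assumed to have a count in
both years. With 119 replicates the 90% limits fall on whole
ranks, the 6th smallest and 6th largest values, since
(119 + 1) x 0.05 = 6.

```python
import random


def site_bootstrap_limits(counts, start, end, replicates=119, level=0.90, seed=1):
    """Point estimate and percentile limits for the percentage change
    between two years, resampling whole sites with replacement.

    counts maps site -> {year: count}. The summed count stands in for
    a properly modelled index (illustrative simplification).
    """
    rng = random.Random(seed)
    sites = list(counts)

    def change(sample):
        total_start = sum(counts[s][start] for s in sample)
        total_end = sum(counts[s][end] for s in sample)
        return 100.0 * (total_end - total_start) / total_start

    boot = sorted(
        change(rng.choices(sites, k=len(sites))) for _ in range(replicates)
    )
    # Rank of the lower limit among the sorted replicates (6 for 119 at 90%).
    rank = max(1, round((replicates + 1) * (1.0 - level) / 2.0))
    return change(sites), boot[rank - 1], boot[-rank]


# Synthetic counts for four sites in two years.
data = {
    "A": {1987: 50, 1997: 20},
    "B": {1987: 20, 1997: 18},
    "C": {1987: 10, 1997: 5},
    "D": {1987: 30, 1997: 33},
}
print(site_bootstrap_limits(data, 1987, 1997))
```

With so few sites the limits are wide and unstable, which
illustrates why results from fewer than 10 sites call for
particular caution.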
5. Practicalities of computing
A large amount of computer time is needed to fit GAMs. Individual
GAM models fitted to a single dataset typically take between 3 and
30 minutes of CPU time on a powerful Unix computer. While fitting
individual models requiring this amount of time is not a particular
problem, calculating bootstrap confidence intervals from even 119
replicates consumes a great deal of CPU time. Improvements in computer
hardware and software will make it possible to run these analyses
more quickly in the future, but at present it is not practical to
apply GAMs with bootstrapped confidence limits to all data sets
on a routine basis. We therefore propose the following approaches
for the immediate future.
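To give a sense of scale for the CPU times quoted above, the
sketch below multiplies the time for a single fit by the number
of bootstrap replicates. It assumes that each replicate costs
about as much as one fit of the original model, which is a
simplification; the function name is hypothetical.

```python
def bootstrap_cpu_hours(minutes_per_fit: float, replicates: int = 119) -> float:
    """Approximate CPU hours for one species, assuming every bootstrap
    replicate takes as long as a single model fit."""
    return minutes_per_fit * replicates / 60.0


for minutes in (3, 30):
    print(minutes, "min per fit ->", bootstrap_cpu_hours(minutes), "CPU hours")
```

Under that assumption a single species needs roughly 6 to 60 CPU
hours for 119 replicates.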
Analyses of CBC data for terrestrial birds will be undertaken
using GAMs. However, confidence limits may only be calculated when
a population change is sufficiently large to trigger an alert if
it were significant. The computer resources needed to calculate confidence
limits for BBS data, which include many more sites than the CBC,
have yet to be evaluated.
Waterbirds analyses will be undertaken using the revised version
of the Underhill method. This will be applied within a standardized
alerts framework as outlined above.
SITE-BASED ALERTS
Overall approach
1. Species coverage
Site-based alerts will only be implemented for wintering waterbirds
at present, although the system outlined here may be extended to
other species in the future.
2. Site coverage
It is intended that one third of the Ramsar and Special Protection
Areas classified for non-breeding waterbirds will be assessed, commented
upon and reported on every year in rotation, so that all such sites
are covered every three years. One sixth of SSSIs and ASSIs classified
for non-breeding waterbirds will be assessed, commented upon and
reported on every year in rotation, so that all such sites are covered
every six years. These analyses will be based on WeBS data.
3. Change measures
Changes on individual sites will be measured from smoothed population
trends using methods similar to those outlined for national and
regional trends above. Changes will be measured over the whole time
series (from 1969/70), 25 years, 10 years and 5 years. Declines
of 50% or more over any of these periods will be used to flag alerts.
In addition, declines of 25-50% over the full time series or 25
years will be used to flag alerts.
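The site-based rule differs from the national one in that the
25-50% band applies only to the two longest periods. A minimal
sketch, with a hypothetical function name and period labels; the
"moderate" and "rapid" wording is borrowed from the national
alerts section, and the example changes are invented.

```python
LONG_PERIODS = {"whole series", "25 years"}


def site_alerts(changes):
    """Return the periods that flag a site-based alert.

    changes maps a period label to a percentage change measured from
    the smoothed site trend. No significance test is applied.
    """
    flags = {}
    for period, change in changes.items():
        if change <= -50.0:  # 50% or more: flagged for any period
            flags[period] = "rapid decline"
        elif change <= -25.0 and period in LONG_PERIODS:
            flags[period] = "moderate decline"  # 25-50%: long periods only
    return flags


example = {"whole series": -30.0, "25 years": -10.0, "10 years": -30.0, "5 years": -55.0}
print(site_alerts(example))
```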
4. Confidence limits
Confidence limits cannot be calculated for changes on individual
sites so bootstrap significance testing will not form part of the
procedure for identifying site-based alerts.
5. Comparisons with national and regional trends
Site-based alerts will be compared with national and regional trends,
obtained from the national and regional alerts system outlined above,
for purposes of interpretation. However, formal testing of trends
or change measures from individual sites against national or regional
figures will not form part of the system for flagging site-based
alerts.
Technical issues
1. Measurement of population changes for individual sites
This should probably be done using data for the site of interest
only. Incorporating imputed values may bias estimates if the trend
at the site of interest differs from the national or regional data
set from which the imputing was carried out.
2. Endpoints
Changes should be calculated to year t-1, where year t is the final
year of the time-series. This follows the procedure outlined above
for national and regional alerts.
3. Confidence limits
Confidence limits of population changes at individual sites
cannot be calculated from a bootstrap procedure because there is
no replication of sites. Some large sites can be sub-divided into
a number of smaller areas but there will not usually be enough of
these to obtain useful confidence limits by bootstrapping. In the
future it might be possible to develop suitable tests using a jack-knife
procedure involving the dropping of counts for individual years
from the time series but this is not a priority at present.
4. Statistical comparisons between the trends for individual sites
and national or regional trends
In principle it is possible to test whether the GAM trend for a
particular site or region differs from the rest of the data set
(Section 5). There are two problems with this. The first is that
the distributions of the deviances from GAMs may not be approximated
well by the chi-square distribution. This problem may be circumvented
to some extent by using an F-test approximation, which is recommended
for overdispersed data, as outlined for regional CBC trends by Fewster
et al. (2000). This F-test approach is the standard procedure
for tests between Generalized Linear Models with overdispersion.
It is thought to be also applicable to GAMs, although further statistical
validation is needed. Many of the WeBS counts show high levels of
overdispersion. The second problem, however, is one of interpretation.
Two smooth curves from GAMs may be highly significantly different
because they have different shapes, even if their endpoints are
identical. While such information may be important for interpreting
alerts, it is difficult to see how it could be used to flag them
up directly. We therefore recommend that such analyses should not
be included within the formal alert system, although they may form
a useful part of follow up investigations.
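For reference, one common form of the F statistic for comparing
nested models under overdispersion divides the drop in deviance
per extra degree of freedom by an estimate of the dispersion.
The sketch below computes that statistic only. The function name
and the numbers are hypothetical, the dispersion estimate is
supplied by the caller, and whether this matches the exact
procedure of Fewster et al. (2000) is not confirmed here.

```python
def deviance_f_statistic(dev_small, df_small, dev_large, df_large, dispersion):
    """F statistic for nested models with overdispersion.

    dev_* are residual deviances and df_* residual degrees of freedom;
    'small' is the model with a common trend, 'large' the model with a
    separate trend for the site or region. Degrees of freedom may be
    non-integer for GAMs.
    """
    extra_df = df_small - df_large
    return ((dev_small - dev_large) / extra_df) / dispersion


# Invented values: a deviance drop of 60 on 10 extra degrees of freedom.
f_stat = deviance_f_statistic(500.0, 120.0, 440.0, 110.0, dispersion=4.0)
print(f_stat)
```

The statistic would then be referred to an F distribution with
(df_small - df_large, df_large) degrees of freedom using a
statistics library.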