It is crucial to build a manual verification step into any acoustic project.
We strive to make our classifiers as accurate as possible. However, all automated systems have the capacity to make mistakes, especially in novel situations or when compromised by poor recording quality.
This page features guidelines on auditing identifications returned by the Acoustic Pipeline, with the assistance of Acoustic Pipeline Tools, a free bespoke R Shiny app for Pipeline users.
Acoustic Pipeline probability values
For ultrasonic recordings, the BTO Acoustic Pipeline returns a probability value for each species detected in a recording. This value is the true positive rate, or the probability that the Pipeline has correctly assigned the identification to species.
The Pipeline returns results for all recordings with a probability greater than 0.1. This allows users to decide which records to inspect and manually audit.
How to facilitate rapid auditing
We recommend that, after assigning them to a species, users of the Pipeline copy recordings into three main folders:
- Folder 1: Copy of recordings into separate species folders where probability >=0.5 (50%)
- Folder 2: Copy of recordings into separate species folders where probability <0.5 (<50%)
- Folder 3: A random sample of recordings where no species were identified
Note that, because the Pipeline returns results for more than one bat species when several are present in a single wav file, a copy of that wav file may go into more than one of the species folders. Auditing should therefore check only whether the species ‘assigned’ to that folder is correct. This is easier than checking every species identification in a recording that contains multiple species.
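As a minimal sketch of the folder scheme above, the Python below maps each identification row to a destination folder using the 0.5 threshold. The column names (`filename`, `species`, `probability`), the folder names and the sample rows are assumptions made for illustration, not the Pipeline's actual output format, so adjust them to match your own results file. Folder 3 (recordings with no species identified) is not covered here.

```python
import csv
import io

THRESHOLD = 0.5  # cut-off between Folder 1 (>= 0.5) and Folder 2 (< 0.5)

# Hypothetical results excerpt; real column names may differ.
SAMPLE_RESULTS = """filename,species,probability
a.wav,Pipistrellus pipistrellus,0.92
a.wav,Myotis nattereri,0.31
b.wav,Pipistrellus pipistrellus,0.50
"""


def folder_for(species, probability, threshold=THRESHOLD):
    """Return the relative audit folder for one identification."""
    top = "Folder1_high" if probability >= threshold else "Folder2_low"
    return f"{top}/{species}"


def plan_copies(csv_text):
    """Return (filename, folder) pairs, one per identification row.

    A recording with several species appears once per species, so the
    same wav file can be planned into more than one species folder.
    """
    plan = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        folder = folder_for(row["species"], float(row["probability"]))
        plan.append((row["filename"], folder))
    return plan


if __name__ == "__main__":
    for filename, folder in plan_copies(SAMPLE_RESULTS):
        print(filename, "->", folder)
```

In this sketch `a.wav` is planned into two species folders, which matches the note above about recordings that contain more than one species.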
All recordings in Folder 1 should be checked.
The exception is where there are tens of thousands of recordings of a species. In this case, a random sample should be subject to manual auditing to quantify the error rate for the species.
- For a large dataset, checking a random sample of 1,000 recordings for a species should be sufficient to obtain a robust estimate of the identification error rate.
- It may be possible to justify auditing a smaller sample of recordings where it has been demonstrated that the error rate for the species is very low and / or, depending on the project, where errors have few implications (e.g. potentially for Common Pipistrelle).
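One way to draw such a random sample is sketched below in Python. The function name, the default sample size of 1,000 and the fixed seed are illustrative assumptions. The seed is there only so that the same sample can be redrawn later if the audit needs to be repeated or documented.

```python
import random


def sample_for_audit(filenames, sample_size=1000, seed=0):
    """Return the recordings to audit for one species.

    If there are no more than `sample_size` distinct recordings, all of
    them are returned. Otherwise a random sample without replacement is
    drawn, reproducibly for a given seed.
    """
    files = sorted(set(filenames))  # remove duplicates, fix the order
    if len(files) <= sample_size:
        return files
    rng = random.Random(seed)
    return sorted(rng.sample(files, sample_size))


if __name__ == "__main__":
    # Hypothetical file names standing in for one species' recordings.
    recordings = [f"rec_{i:05d}.wav" for i in range(25000)]
    chosen = sample_for_audit(recordings)
    print(len(chosen), "recordings selected for manual checking")
```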
In all cases, the error rate should be presented. For example, “5 of 1,000 Common Pipistrelle recordings (0.5% of recordings) were assigned to the wrong species”.
If the error rate is high when checking a random sample of recordings, and there are important implications for misidentifying this species, it may be necessary to audit all recordings of the species.
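The error rate from an audited sample is simple arithmetic, sketched below in Python. The Wilson score interval is an addition that this guidance does not prescribe. It is included only as one common way of showing how much uncertainty surrounds a rate estimated from a sample. The function name and the 95% level (`z = 1.96`) are assumptions.

```python
import math


def error_rate(wrong, checked, z=1.96):
    """Return (rate, lower, upper) for an audited sample.

    `rate` is wrong / checked. `lower` and `upper` are the bounds of a
    Wilson score interval (95% when z = 1.96).
    """
    if checked <= 0 or not 0 <= wrong <= checked:
        raise ValueError("need checked > 0 and 0 <= wrong <= checked")
    p = wrong / checked
    denom = 1 + z * z / checked
    centre = (p + z * z / (2 * checked)) / denom
    half = z * math.sqrt(
        p * (1 - p) / checked + z * z / (4 * checked * checked)
    ) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    rate, low, high = error_rate(5, 1000)
    print(f"5 of 1,000 recordings ({rate:.1%}) were assigned to the wrong species")
    print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

For 5 errors in 1,000 checked recordings the rate is 0.5%, and the interval under these assumptions runs from roughly 0.2% to 1.2%.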
The BTO Acoustic Pipeline recommends that recordings in Folder 2 are discarded.
However, depending on the project, recordings from Folder 2 may be checked (or a sample checked), to demonstrate or provide peace of mind that the classifier is missing very little that could be assigned by any means to species.
To reiterate, the thresholds used and the focus here are on species identification. If you are instead interested in identifying e.g. all vocalisations of any Myotis species, regardless of whether these can be assigned to species, you may want to check all low-confidence identifications, at the cost of more time spent auditing.
As above for Folder 1, if there are a large number of low-confidence identifications for a species, a random sample may be selected for auditing. The error rate should be presented. For example, “5 of 1,000 Common Pipistrelle recordings (0.5% of recordings) were assigned to the wrong species”.
If the error rate is high, and there are important implications for misidentifying this species, it may be necessary to audit all recordings of the species.
Depending on the project, a random sample from Folder 3 could be checked, to demonstrate that the classifier is missing very little that could be assigned to species by any means (and to be able to quantify the error rate / the false negative rate).
How to use the R Shiny app to automate copying files
The BTO Acoustic Pipeline offers an R Shiny app that automates copying recordings into species folders for easier auditing.
Specifically, it reads in the Pipeline’s CSV results file(s) and offers several strategies for copying none, all, high-scoring, or random samples of recordings into folders for manual checking.
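For readers who want to see what this kind of copying step involves, the Python below is a rough sketch only. It is not the app's code and does not reproduce its options. The function name and the `(filename, folder)` plan format are assumptions. Files are copied rather than moved, so the original recordings stay in place.

```python
import shutil
from pathlib import Path


def copy_for_audit(plan, source_dir, audit_dir):
    """Copy recordings into audit folders and return the paths created.

    `plan` is a list of (filename, relative_folder) pairs. The same
    filename may appear more than once when a recording has been
    assigned to several species.
    """
    copied = []
    for filename, folder in plan:
        src = Path(source_dir) / filename
        if not src.is_file():
            continue  # skip rows whose recording cannot be found
        dest_dir = Path(audit_dir) / folder
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / filename
        shutil.copy2(src, dest)  # copy2 also preserves file timestamps
        copied.append(dest)
    return copied
```

A call such as `copy_for_audit(plan, "recordings", "audit")` would create one sub-folder per entry in the plan under `audit`, where both directory names are placeholders.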
Treatment of rare or unexpected identification records
If the results include species identifications that are rare or unexpected for the location or region, we recommend that these identifications are supported with further evidence that defines the basis on which the identification was made.
- Our report on bats in the Bailiwick of Guernsey provides a good example of this.
Importantly, a number of cryptic species can be difficult for an experienced analyst to assign to species in areas where their ranges overlap.
For bats in Europe, this includes:
- Myotis mystacinus and M. brandtii
- Myotis nattereri, M. crypticus and M. escalerai
- Myotis myotis and M. blythii
- Nyctalus leisleri and Vespertilio murinus
- Pipistrellus nathusii and P. kuhlii.
Our own approach is to treat results for the species in each of these groups together, unless manual auditing of the recordings provides evidence (e.g. the presence of diagnostic social calls and / or feeding buzzes) for assigning an identification to one species over another.
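The grouping described above can be sketched as a simple lookup in Python. The species groups come from the list on this page. The function name, the `confirmed` flag and the " / " label format are illustrative assumptions, not part of the Pipeline.

```python
# Groups of species that are hard to separate where their ranges overlap.
CRYPTIC_GROUPS = [
    ("Myotis mystacinus", "Myotis brandtii"),
    ("Myotis nattereri", "Myotis crypticus", "Myotis escalerai"),
    ("Myotis myotis", "Myotis blythii"),
    ("Nyctalus leisleri", "Vespertilio murinus"),
    ("Pipistrellus nathusii", "Pipistrellus kuhlii"),
]

# Map each species to a combined label for its group.
GROUP_LABEL = {
    species: " / ".join(group)
    for group in CRYPTIC_GROUPS
    for species in group
}


def reporting_label(species, confirmed=False):
    """Return the name under which an identification is reported.

    `confirmed` means manual auditing found diagnostic evidence for this
    particular species, so it can be reported on its own.
    """
    if confirmed:
        return species
    return GROUP_LABEL.get(species, species)


if __name__ == "__main__":
    print(reporting_label("Myotis brandtii"))
    print(reporting_label("Myotis brandtii", confirmed=True))
```

Species outside these groups are returned unchanged.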
Can’t find the answer you’re looking for?
See our other Support guides
If the information you’re looking for isn’t on this page, please return to the Support Hub to browse our other guides.
If you can’t find what you’re looking for in the Support Hub, please email us at acoustic.pipeline [at] bto.org.