Variability in ground-truth data sets and the performance of two automated detectors for Antarctic blue whale calls in different soundscape conditions
Automated detectors are important tools for processing large passive acoustic databases. Assessing the performance of a given method can be challenging, and the results need to be interpreted in light of the overall purpose of the analysis. Performance evaluation usually involves comparing the detector output with a ground-truth data set, which is typically produced by manual analysis of the data. Such analyses can be subjective, depending on, e.g., interfering background noise conditions. In this study, we investigated the variability between two analysts in the detection of Antarctic blue whale (Balaenoptera musculus intermedia) Z-calls, as well as the intra-analyst variability, in order to understand how this variability affects the creation of a ground truth and the assessment of detector performance. Analyses were conducted on two test data sets reflecting two basins and different situations of call abundance and background noise conditions. Using a ground truth based on the combined results of both analysts, we evaluated the performance of two automated detectors, one using spectrogram correlation and the other using a subspace-detection strategy. This evaluation shows how recording sites, vocal activity, and interfering sounds affect detector performance, and it highlights the advantages and limitations of each method as well as possible solutions to overcome the main limitations.
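As an illustration of the kind of comparison described above, the following minimal Python sketch scores a detector against a ground truth built from two analysts' annotations. It is not the procedure used in the study: the abstract does not say how the analysts' results were combined or how detections were matched, so the union rule, the greedy matching, the tolerance `tol`, the function names `merge_annotations` and `score`, and all call times are assumptions made for the example.

```python
from typing import List, Tuple


def merge_annotations(a: List[float], b: List[float], tol: float) -> List[float]:
    """Combine two analysts' call times (seconds) into one ground truth.

    Assumption: the ground truth is the union of both annotation sets, and
    times closer than `tol` seconds are treated as the same call.
    """
    merged: List[float] = []
    for t in sorted(a + b):
        # Keep a time only if it is not within `tol` of the last kept call.
        if not merged or t - merged[-1] > tol:
            merged.append(t)
    return merged


def score(detections: List[float], truth: List[float], tol: float) -> Tuple[float, float]:
    """Return (precision, recall) using greedy one-to-one matching in time."""
    det, tru = sorted(detections), sorted(truth)
    i = j = true_pos = 0
    while i < len(det) and j < len(tru):
        if abs(det[i] - tru[j]) <= tol:
            true_pos += 1      # detection matches a ground-truth call
            i += 1
            j += 1
        elif det[i] < tru[j]:
            i += 1             # unmatched detection: false positive
        else:
            j += 1             # unmatched ground-truth call: miss
    precision = true_pos / len(det) if det else 0.0
    recall = true_pos / len(tru) if tru else 0.0
    return precision, recall


# Hypothetical call times in seconds, for illustration only.
analyst_a = [10.0, 75.0, 140.0]
analyst_b = [10.5, 140.4, 205.0]
detector_output = [10.2, 76.0, 120.0, 204.0]

ground_truth = merge_annotations(analyst_a, analyst_b, tol=2.0)
precision, recall = score(detector_output, ground_truth, tol=2.0)
print(ground_truth, precision, recall)
```

Replacing the union with an intersection (keeping only calls marked by both analysts) would give a stricter ground truth and generally different precision and recall values, which is one way inter-analyst variability can propagate into the reported detector performance.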