Investigation of hidden parameters influencing the automated object detection in images from the deep seafloor of the HAUSGARTEN observatory
Detecting objects in underwater image sequences and video frames automatically requires the application of selected algorithms in consecutive steps. Most of these algorithms are controlled by a set of parameters that need to be calibrated for an optimal detection result. These parameters determine the effectiveness and efficiency of an algorithm, and their impact is usually well known. There are, however, further non-algorithmic impact factors (or hidden parameters) that bias both the training of a machine learning system and the subsequent detection process, and thus need to be well understood and taken into account. In benthic imaging, one dominant hidden parameter is the distance of the image acquisition device above the seafloor. Variations in this distance lead to variations in the size of the captured benthic area, the relative size and position of an object within an image, and the effect of the artificial light source and thus the recorded color spectrum. Image processing techniques that model the induced variations can be used to compensate for these effects and thus allow the exploration of initially biased data. These processing techniques in turn require algorithmic parameters, which are themselves influenced by the hidden parameters contained within the initial data. In supervised machine learning architectures, further challenges arise from the inclusion of the human expert knowledge used to train the learning algorithm. Relying on the knowledge of only one expert can conceal information needed for the generalization capability of an automated semantic image annotation system. Utilizing the knowledge of several experts requires explicit instruction of the participants so that they produce comparable results. The fusion of individual expert knowledge introduces further hidden parameters that impact the supervised learning architecture.
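As a rough illustration of how the acquisition distance acts as a hidden parameter, the following sketch models the seafloor footprint of a downward-facing camera and normalizes an object's apparent pixel size to a reference altitude. The field-of-view angles and the flat-seafloor assumption are illustrative only and are not taken from the iSIS system:

```python
import math

def footprint_area(distance_m, fov_h_deg, fov_v_deg):
    """Seafloor area captured by a downward-facing camera.

    Assumes a flat seafloor and a rectangular field of view; the area
    grows with the square of the altitude above ground.
    """
    width = 2.0 * distance_m * math.tan(math.radians(fov_h_deg) / 2.0)
    height = 2.0 * distance_m * math.tan(math.radians(fov_v_deg) / 2.0)
    return width * height

def rescale_object_size(pixel_size, distance_m, reference_m):
    """Normalize an object's apparent size (in pixels) to a reference altitude.

    Apparent size scales inversely with distance, so an object imaged
    at distance_m would appear pixel_size * distance_m / reference_m
    pixels large at the reference altitude.
    """
    return pixel_size * distance_m / reference_m
```

For example, doubling the altitude quadruples the captured area while halving an object's apparent size, which is why uncorrected training data from varying altitudes biases a learned detector.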
These could be an expert's object-specific expertise or the tendency to annotate with more or less self-criticism, which together can be expressed as the expert's trustworthiness. In the context of megafauna detection in benthic images, we investigate the effects of some of these parameters on our machine-learning-based detection system iSIS [1], which consists of four successive steps: imaging, expert annotation, training, and detection (see Figure 1). The images to be analyzed were taken at the deep-sea, long-term observatory HAUSGARTEN, and five experts created an annotation gold standard. We found that the hidden parameters from imaging, as well as those from the fusion of expert knowledge, could partly be compensated, and we achieved detection performances of 67% precision and 87% recall. Despite the efforts to compensate the hidden parameters, the detection performance still varied across the image transect, which suggests the presence of further hidden parameters not yet taken into account. Here, we correlate the distance of the acquisition device with the image-wise detection results (see Figure 2 A). We also show the conformity of the automated detection results to the outcome of the manual detection consensus of the human experts (see Figure 2 B). Finally, we show the impact of hidden parameters on subsequent steps by means of the effect of image illumination on the human expert annotation.
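The evaluation quantities used above follow standard definitions; the sketch below, a minimal illustration with made-up example values rather than the study's data, shows how precision and recall are computed from detection counts and how a per-image acquisition distance can be correlated with a per-image detection score:

```python
import math

def precision_recall(tp, fp, fn):
    """Standard detection metrics from true positives, false positives,
    and false negatives."""
    return tp / (tp + fp), tp / (tp + fn)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up example: camera altitude (m) vs. a per-image detection score.
altitudes = [1.2, 1.4, 1.6, 1.8, 2.0]
scores = [0.92, 0.90, 0.86, 0.83, 0.80]
r = pearson(altitudes, scores)  # negative for this toy data
```

A strong correlation between altitude and per-image performance, as probed here, would indicate that the distance compensation leaves a residual bias.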