Deep Learning for mapping retrogressive thaw slumps across the Arctic
Retrogressive thaw slumps (RTS) are typical landscape features of thawing and degrading permafrost. Their distribution and dynamics remain almost completely undocumented across many regions of the permafrost domain, partly because suitable data and monitoring techniques were lacking in the past. We address this gap with a deep-learning-based semantic segmentation framework that detects RTS from multi-spectral PlanetScope imagery, derived topographic data (ArcticDEM), and multi-temporal Landsat Trend data. We created a highly automated processing pipeline designed to produce reproducible results and to accommodate multiple input features. The workflow is built on the PyTorch deep-learning framework and supports several segmentation architectures (UNet, UNet++, DeepLabV3) and backbones, as well as common data transformation techniques such as augmentation and normalization. We trained and validated our models in six regions of 100 to 300 km² each across Canada (Banks Island, Tuktoyaktuk, Horton, Herschel Island) and Siberia (Kolguev, Lena). We performed a regional cross-validation (five regions for training, one region for validation) to test the spatial robustness and transferability of the algorithm. Furthermore, we tested different architectures, backbones, and loss functions to identify the best-performing and most robust parameter sets. To train the models, we created a database of manually digitized and validated RTS polygons. Model performance varied strongly between regions, with maximum Intersection over Union (IoU) scores between 0.15 and 0.58. This strong regional variation emphasizes the need for sufficiently large training datasets that are representative of the wide variety of RTS. However, creating good training data proved challenging because of the fuzzy definition and delineation of RTS, particularly in their lower parts.
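The regional cross-validation and the IoU score described above can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function names `leave_one_region_out` and `iou` are hypothetical, the masks are toy examples rather than real model output, and treating two empty masks as a perfect match is an assumed convention.

```python
from __future__ import annotations

from typing import Iterator, Sequence

# Region names as listed in the text.
REGIONS = ["Banks Island", "Tuktoyaktuk", "Horton", "Herschel Island", "Kolguev", "Lena"]


def leave_one_region_out(regions: Sequence[str]) -> Iterator[tuple[list[str], str]]:
    """Yield (training_regions, validation_region), holding out one region per fold."""
    for held_out in regions:
        train = [r for r in regions if r != held_out]
        yield train, held_out


def iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over Union of two binary masks of identical shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    # Six folds: five regions for training, one for validation.
    for train, val in leave_one_region_out(REGIONS):
        print(f"validate on {val}; train on {', '.join(train)}")

    # Toy masks: 2 pixels overlap, 4 pixels are covered by either mask.
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(iou(pred, target))  # 0.5
```

Holding out an entire region, rather than random image tiles, means the validation score reflects how well a model transfers to a landscape it has not seen during training.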
We have recently expanded our analysis to several RTS-rich regions across the Arctic (Fig. X) for the year 2021, and to annual analyses (2018-2021) for RTS hot spots such as Banks Island and the Peel Plateau. First model inference runs are promising for detecting RTS but still strongly overestimate their number and area because of an excessive number of false positives. Model performance, however, varies strongly between regions. Given the strong variability of landscapes with RTS, we expect model performance to improve with more numerous and more widely distributed training datasets. The community-driven formation of the IPA Action Group RTSIn, which aims to create standardized RTS digitization protocols and training datasets for deep- and machine-learning purposes, will be a great boost for this work. Our standardized processing pipeline (preprocessing, training, inference) allows further features to be added depending on user interest and data availability. We also tested the workflow on surface water and pingos, using a mixture of publicly available data (Jones et al.) and digitized data (Grosse pingos, Nitze water). These tests produced very good results and showed that the workflow is transferable beyond the segmentation of RTS. In the near future, we aim to integrate the community-based training data and to gradually improve our training database. Within the framework of the ML4Earth project, we will create a temporal, pan-Arctic monitoring system for RTS based on our highly automated processing chain. This will enable us to better understand pan-Arctic RTS dynamics, their influencing factors, and their consequences. Combining these spatio-temporal datasets with information on volumetric change and carbon stocks will enable us to better quantify the consequences of thaw slumping across the permafrost domain.