Harmonizing heterogeneous multi-proxy data from lake systems


Contact
gregor.pfalz [ at ] awi.de

Abstract

When performing spatiotemporal investigations of multiple lake systems, geoscientists face the challenge of dealing with complex and heterogeneous data of different types, structures, and formats. To make such data comparable, it is necessary to transform them into a uniform format that is syntactically and semantically consistent. This paper presents a data science approach for transforming research data from different lake sediment cores into a coherent framework. For this purpose, we collected published and unpublished data from paleolimnological investigations of Arctic lake systems. Our approach adapted methods from the database field, such as developing entity-relationship (ER) diagrams, to understand the conceptual structure of the data independently of the source. We demonstrated the feasibility of our approach by transforming our ER diagram into a database schema for PostgreSQL, a popular database management system (DBMS). We validated our approach by conducting a comparative analysis on a set of acquired data, focusing on the comparison of total organic carbon and bromine content in eight selected sediment cores. Still, we encountered serious obstacles in the development of the ER model. Heterogeneous structures within the collected data made automatic data integration impossible. Additionally, we realized that missing error information hampers the development of a conceptual model. Despite the strong initial heterogeneity of the original data, our harmonization yields comparable datasets, enabling numerical inter-proxy and inter-lake comparison.
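As an illustration of the idea of mapping an ER model to a relational schema and then comparing two proxies per core, here is a minimal sketch. It is not the schema from the paper: the table and column names, the unit strings, and all inserted values are hypothetical placeholders, and Python's built-in sqlite3 module stands in for PostgreSQL so that the example runs without a database server.

```python
import sqlite3

# Hypothetical, simplified schema. Entity and attribute names are
# illustrative assumptions, not the ER model published by the authors.
SCHEMA = """
CREATE TABLE lake (
    lake_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE core (
    core_id INTEGER PRIMARY KEY,
    lake_id INTEGER NOT NULL REFERENCES lake(lake_id),
    label   TEXT NOT NULL UNIQUE
);
CREATE TABLE proxy (
    proxy_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    unit     TEXT NOT NULL
);
CREATE TABLE measurement (
    core_id  INTEGER NOT NULL REFERENCES core(core_id),
    proxy_id INTEGER NOT NULL REFERENCES proxy(proxy_id),
    depth_cm REAL NOT NULL,
    value    REAL NOT NULL,
    error    REAL,  -- NULL when the source reports no uncertainty
    PRIMARY KEY (core_id, proxy_id, depth_cm)
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO lake VALUES (1, 'Lake A')")
    conn.execute("INSERT INTO core VALUES (1, 1, 'A-1')")
    conn.executemany(
        "INSERT INTO proxy VALUES (?, ?, ?)",
        [(1, "TOC", "%"), (2, "Br", "cps")],  # unit strings are placeholders
    )
    conn.executemany(
        "INSERT INTO measurement VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 0.5, 4.2, 0.1),     # TOC at 0.5 cm
            (1, 2, 0.5, 310.0, None),  # Br at 0.5 cm, no error reported
            (1, 1, 1.5, 3.8, 0.1),     # TOC at 1.5 cm
            (1, 2, 1.5, 280.0, None),  # Br at 1.5 cm, no error reported
            (1, 1, 2.5, 3.0, 0.1),     # TOC only: no Br at this depth
        ],
    )
    return conn


def paired_toc_br(conn):
    """Return (core, depth, TOC, Br) rows for depths where both proxies exist."""
    query = """
        SELECT c.label, t.depth_cm, t.value AS toc, b.value AS br
        FROM measurement AS t
        JOIN measurement AS b
          ON b.core_id = t.core_id AND b.depth_cm = t.depth_cm
        JOIN core  AS c  ON c.core_id = t.core_id
        JOIN proxy AS pt ON pt.proxy_id = t.proxy_id AND pt.name = 'TOC'
        JOIN proxy AS pb ON pb.proxy_id = b.proxy_id AND pb.name = 'Br'
        ORDER BY c.label, t.depth_cm
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in paired_toc_br(build_demo_db()):
        print(row)
```

Storing every proxy in one long-format measurement table, rather than one table per source file, is one possible way to make inter-proxy and inter-lake queries uniform; the nullable error column reflects the missing error information mentioned in the abstract.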



Item Type: Article
Authors: Pfalz, G., Diekmann, B., Freytag, J. C., Biskaborn, B. K.
Helmholtz Cross Cutting Activity (2021-2027): N/A
Peer revision: Peer-reviewed, Web of Science / Scopus
Publication Status: Published
Eprint ID: 55325
DOI: 10.1016/j.cageo.2021.104791

Cite as
Pfalz, G., Diekmann, B., Freytag, J. C. and Biskaborn, B. K. (2021): Harmonizing heterogeneous multi-proxy data from lake systems, Computers & Geosciences, 153, p. 104791. doi: 10.1016/j.cageo.2021.104791


Campaigns
Arctic Land Expeditions > RU-Land_2016_Keperveem
Arctic Land Expeditions > RU-Land_2016_Lena
Arctic Land Expeditions > RU-Land_2018_Chukotka
Arctic Land Expeditions > RU-Land_2019_Kisi
Arctic Land Expeditions > RU-Land_2020_Chukotka
Arctic Land Expeditions > RU-Land_2020_Khamra
Arctic Land Expeditions > RU-Land_2013_Yakutia

