O2A - Data Flow Framework from Sensor Observations to Archives
The Alfred Wegener Institute (AWI) coordinates German polar research and is one of the most productive polar research institutions worldwide, with scientists working in both polar regions. This task can only succeed with excellent infrastructure and logistics. Conducting research in the Arctic and Antarctic requires research stations staffed throughout the year as the basis for expeditions and data collection. It also requires research vessels, aircraft and long-term observatories for large-scale measurements, as well as sophisticated technology. The AWI provides this infrastructure and competence to national and international partners as well. To meet this challenge, the AWI has been progressively developing and sustaining an e-infrastructure for coherent discovery, visualization, dissemination and archival of scientific information and data. Most of the data originate from research activities carried out on a wide range of sea-, air- and land-based research platforms. The intended final step is archival and publication in the PANGAEA repository, with a DOI assigned to each individual dataset. Within AWI, a workflow for data acquisition from vessel-mounted devices, along with ingestion procedures for the raw data into the institutional archives, is well established. However, the increasing number of ocean-based stations and their sensors, along with heterogeneous project-driven requirements for satellite communication, sensor monitoring, quality control and validation, processing algorithms, visualization and dissemination, has recently led us to build a more generic and cost-effective framework, hereafter named O2A (observations to archives). The main strengths of our framework (https://www.awi.de/en/data-flow) are the seamless flow of sensor observations to archives and its compliance with internationally used OGC standards (e.g. SOS/SWE, WMS, WFS), which assures interoperability in an international context.
O2A comprises several extensible and exchangeable modules (e.g. controlled vocabularies and gazetteers, file type and structure validation, aggregation solutions, processing algorithms) as well as various interoperability services. We provide integrated tools for standardized platform, device and sensor descriptions following SensorML (https://sensor.awi.de), automated near-real-time and “big data” streams supporting SOS and O&M, and dashboards that allow data specialists to monitor their data streams for trends and to detect sensor malfunctions early (https://dashboard.awi.de). In addition, in the context of the “Helmholtz Data Federation” and with an outlook towards the European Open Science Cloud, we are developing a cloud-based workspace that provides user-friendly solutions for petabyte-scale data storage and state-of-the-art computing solutions (Hadoop, Spark, Notebooks, rasdaman, etc.) to support scientists in collaborative data analysis and visualization activities, including geo-information systems (http://maps.awi.de). Our affiliated repositories offer archival and long-term preservation as well as publication solutions for data, data products, publications, presentations and field reports (https://www.pangaea.de, https://epic.awi.de).
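
As an illustration of the SOS interface mentioned above, the following minimal Python sketch builds a GetObservation request using the key-value-pair (HTTP GET) binding of SOS 2.0. The endpoint URL, the offering identifier and the observed-property URI are hypothetical placeholders chosen for this example; they are not actual O2A identifiers, and a real deployment may expose a different binding.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint for illustration only; not an actual AWI service URL.
SOS_ENDPOINT = "https://sos.example.org/service"


def build_get_observation_url(endpoint, offering, observed_property, start, end):
    """Build a SOS 2.0 GetObservation request URL (KVP binding)."""
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": offering,
        "observedProperty": observed_property,
        # Restrict results to a time period, given as ISO 8601 start/end.
        "temporalFilter": f"om:phenomenonTime,{start}/{end}",
    }
    # urlencode percent-escapes reserved characters in the parameter values.
    return f"{endpoint}?{urlencode(params)}"


# Placeholder identifiers (assumptions, not taken from the O2A framework).
url = build_get_observation_url(
    SOS_ENDPOINT,
    offering="example_station_ctd",
    observed_property="http://example.org/def/sea_water_temperature",
    start="2018-01-01T00:00:00Z",
    end="2018-01-02T00:00:00Z",
)
print(url)

# Decode the query string again to inspect the parameters that would be sent.
query = parse_qs(urlsplit(url).query)
print(query["temporalFilter"][0])
```

A client would send this URL with an ordinary HTTP GET and receive the matching observations encoded in O&M, provided the service supports the KVP binding.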