Techniques for Increased Automation of Aquatic Sensor Data Post Processing in Python: Video Presentation


Authors:
Owners:
Resource type: Composite Resource
Storage: The size of this resource is 351.0 MB
Created: Sep 07, 2021 at 10:46 p.m.
Last updated: Sep 07, 2021 at 11:16 p.m.
Citation: See how to cite this resource
Sharing Status: Public
Views: 59
Downloads: 2
+1 Votes: Be the first one to 
 this.
Comments: No comments (yet)

Abstract

This resource contains a video recording for a presentation given as part of the National Water Quality Monitoring Council conference in April 2021. The presentation covers the motivation for performing quality control for sensor data, the development of PyHydroQC, a Python package with functions for automating sensor quality control including anomaly detection and correction, and the performance of the algorithms applied to data from multiple sites in the Logan River Observatory.

The initial abstract for the presentation:
Water quality sensors deployed to aquatic environments make measurements at high frequency and commonly include artifacts that do not represent the environmental phenomena targeted by the sensor. Sensors are subject to fouling from environmental conditions, often exhibit drift and calibration shifts, and report anomalies and erroneous readings due to issues with datalogging, transmission, and other unknown causes. The suitability of data for analyses and decision making often depend on subjective and time-consuming quality control processes consisting of manual review and adjustment of data. Data driven and machine learning techniques have the potential to automate identification and correction of anomalous data, streamlining the quality control process. We explored documented approaches and selected several for implementation in a reusable, extensible Python package designed for anomaly detection for aquatic sensor data. Implemented techniques include regression approaches that estimate values in a time series, flag a point as anomalous if the difference between the sensor measurement exceeds a threshold, and offer replacement values for correcting anomalies. Additional algorithms that scaffold the central regression approaches include rules-based preprocessing, thresholds for determining anomalies that adjust with data variability, and the ability to detect and correct anomalies using forecasted and backcasted estimation. The techniques were developed and tested based on several years of data from aquatic sensors deployed at multiple sites in the Logan River Observatory in northern Utah, USA. Performance was assessed based on labels and corrections applied previously by trained technicians. In this presentation, we describe the techniques for detection and correction, report their performance, illustrate the workflow for applying to high frequency aquatic sensor data, and demonstrate the possibility for additional approaches to help increase automation of aquatic sensor data post processing.

Subject Keywords

Deleting all keywords will set the resource sharing status to private.

Content

References

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name Award Title Award Number
National Science Foundation Collaborative Research: Elements: Advancing Data Science and Analytics for Water (DSAW) 1931297

How to Cite

Jones, A. S., J. S. Horsburgh, T. Jones (2021). Techniques for Increased Automation of Aquatic Sensor Data Post Processing in Python: Video Presentation, HydroShare, http://www.hydroshare.org/resource/bc5c616426214b60b068352ae028d963

This resource is shared under the Creative Commons Attribution CC BY.

 http://creativecommons.org/licenses/by/4.0/
CC-BY

Comments

There are currently no comments

New Comment

required