
Techniques for Increased Automation of Aquatic Sensor Data Post Processing in Python: Video Presentation

Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI for information on this resource.
Type: Resource
Storage: The size of this resource is 351.0 MB
Created: Sep 07, 2021 at 10:46 p.m.
Last updated: Sep 07, 2021 at 11:16 p.m.
Sharing Status: Public
Views: 509
Downloads: 11


This resource contains a video recording for a presentation given as part of the National Water Quality Monitoring Council conference in April 2021. The presentation covers the motivation for performing quality control for sensor data, the development of PyHydroQC, a Python package with functions for automating sensor quality control including anomaly detection and correction, and the performance of the algorithms applied to data from multiple sites in the Logan River Observatory.

The initial abstract for the presentation:
Water quality sensors deployed to aquatic environments make measurements at high frequency and commonly include artifacts that do not represent the environmental phenomena targeted by the sensor. Sensors are subject to fouling from environmental conditions, often exhibit drift and calibration shifts, and report anomalies and erroneous readings due to issues with datalogging, transmission, and other unknown causes. The suitability of data for analyses and decision making often depends on subjective and time-consuming quality control processes consisting of manual review and adjustment of data. Data-driven and machine learning techniques have the potential to automate identification and correction of anomalous data, streamlining the quality control process. We explored documented approaches and selected several for implementation in a reusable, extensible Python package designed for anomaly detection for aquatic sensor data. Implemented techniques include regression approaches that estimate values in a time series, flag a point as anomalous if the difference between the sensor measurement and the estimate exceeds a threshold, and offer replacement values for correcting anomalies. Additional algorithms that scaffold the central regression approaches include rules-based preprocessing, thresholds for determining anomalies that adjust with data variability, and the ability to detect and correct anomalies using forecasted and backcasted estimation. The techniques were developed and tested based on several years of data from aquatic sensors deployed at multiple sites in the Logan River Observatory in northern Utah, USA. Performance was assessed based on labels and corrections applied previously by trained technicians.
In this presentation, we describe the techniques for detection and correction, report their performance, illustrate the workflow for applying them to high-frequency aquatic sensor data, and demonstrate the possibility for additional approaches to help increase automation of aquatic sensor data post processing.
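
The estimate, compare, flag, and replace pattern described in the abstract can be sketched in a few lines of plain Python. This is only an illustration and is not the PyHydroQC API: the function names, the sample readings, and the fixed threshold are all hypothetical, and a rolling median of neighboring points stands in for the regression models the abstract refers to.

```python
from statistics import median


def rolling_median_estimate(values, window=5):
    """Estimate each point as the median of its neighbors, excluding the point itself.

    Assumption: a rolling median is used here as a simple stand-in estimator;
    it is not one of the regression models from the presentation.
    """
    half = window // 2
    estimates = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        neighbors = values[lo:i] + values[i + 1:hi]
        estimates.append(median(neighbors))
    return estimates


def detect_and_correct(values, window=5, threshold=1.0):
    """Flag points whose distance from the estimate exceeds the threshold.

    Returns (flags, corrected), where flagged points are replaced by the estimate.
    """
    estimates = rolling_median_estimate(values, window)
    flags = [abs(v - e) > threshold for v, e in zip(values, estimates)]
    corrected = [e if f else v for v, e, f in zip(values, estimates, flags)]
    return flags, corrected


# Hypothetical water temperature readings (deg C) with one spike at index 4.
observed = [10.0, 10.1, 10.2, 10.1, 25.0, 10.3, 10.2, 10.4, 10.3]
flags, corrected = detect_and_correct(observed, window=5, threshold=1.0)
```

With these made-up readings, only the spike at index 4 is flagged, and it is replaced by the median of its four neighbors. A threshold that adjusts with data variability, as the abstract mentions, could be approximated by scaling the threshold with a local measure of spread instead of using a constant.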

Funding Agencies

This resource was created using funding from the following sources:
Agency Name | Award Title | Award Number
National Science Foundation | Collaborative Research: Elements: Advancing Data Science and Analytics for Water (DSAW) | 1931297

How to Cite

Jones, A. S., J. S. Horsburgh, T. Jones (2021). Techniques for Increased Automation of Aquatic Sensor Data Post Processing in Python: Video Presentation, HydroShare,

This resource is shared under the Creative Commons Attribution CC BY.

