Hydroinformatics Instruction Module Example Code: Introduction to Machine Learning with Residential Water Use Data

Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI for information on this resource.
Type: Resource
Storage: The size of this resource is 584.0 KB
Created: Jan 28, 2022 at 9:27 p.m.
Last updated: Feb 17, 2022 at 3:10 p.m.
Sharing Status: Public
Views: 485
Downloads: 55


This resource contains Jupyter Notebooks with examples that introduce machine learning classification using residential water use data. The resource is part of a set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn.

This resource consists of four example notebooks and a data file.

1. Example 1: Data import and exploration
2. Example 2: Implementing a first machine learning model
3. Example 3: Comparing multiple machine learning models
4. Example 4: Model optimization by hyperparameter tuning
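
To give a feel for the workflow the examples cover (fit a classifier, score it, then vary a hyperparameter), here is a minimal sketch in plain Python. It is not taken from the notebooks, whose libraries and models are not described on this page. It swaps in a hand-written k-nearest-neighbours classifier, and the feature values in `TRAIN` and `TEST` are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical (Duration in minutes, Volume in gallons) features with labels.
# These numbers are made up for illustration, not drawn from the data file.
TRAIN = [
    ((0.3, 0.4), "faucet"), ((0.5, 0.6), "faucet"), ((0.2, 0.3), "faucet"),
    ((1.0, 1.6), "toilet"), ((1.1, 1.5), "toilet"), ((0.9, 1.7), "toilet"),
    ((8.0, 16.0), "shower"), ((7.0, 14.5), "shower"), ((9.0, 18.0), "shower"),
]
TEST = [((0.4, 0.5), "faucet"), ((1.0, 1.5), "toilet"), ((7.5, 15.0), "shower")]


def knn_predict(train, features, k):
    """Label an event by majority vote among its k nearest training events."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test events whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# A very small hyperparameter search: score several values of k, keep the best.
scores = {k: accuracy(TRAIN, TEST, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In practice the hyperparameter would be chosen on a separate validation split or by cross-validation rather than on the test events, and a library implementation would normally replace the hand-written classifier.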

Data files:
The data are contained in a single flat file: a record of water use events from one residential property, with manually applied labels that classify each event. Columns are:
- StartTime: Start date and time of each individual event. Format: 'YYYY-MM-DD HH:MM:SS'
- EndTime: End date and time of each individual event. Format: 'YYYY-MM-DD HH:MM:SS'
- Duration: Duration of each individual event (end time - start time). Unit: Minutes
- Volume: Volume of water used in each individual event. Unit: Gallons
- FlowRate: Average flow rate of each individual event. Unit: Gallons per minute
- Peak: Maximum value observed in a 4-second period within each event. Unit: Gallons
- Mode: Most frequent value observed in an event. Unit: Gallons
- Label: Event classification. Values: faucet, toilet, shower, clotheswasher, bathtub
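
The column definitions above can be checked with a short script. The sketch below uses only the Python standard library and a few invented sample rows in the described layout, since the data file's name is not given on this page. It parses the timestamps, recomputes Duration from EndTime minus StartTime, and compares FlowRate with Volume divided by Duration. That last relationship is an assumption about how the average flow rate was derived.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows in the column layout described above.
# The values are invented for illustration; they are not from the data file.
SAMPLE_CSV = """StartTime,EndTime,Duration,Volume,FlowRate,Peak,Mode,Label
2019-01-01 06:00:00,2019-01-01 06:08:00,8.0,16.0,2.0,0.15,0.13,shower
2019-01-01 06:30:00,2019-01-01 06:31:00,1.0,1.6,1.6,0.12,0.11,toilet
2019-01-01 07:00:00,2019-01-01 07:00:30,0.5,0.6,1.2,0.09,0.08,faucet
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches 'YYYY-MM-DD HH:MM:SS'


def load_events(text):
    """Parse CSV text into a list of dicts with typed values."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        start = datetime.strptime(row["StartTime"], TIME_FORMAT)
        end = datetime.strptime(row["EndTime"], TIME_FORMAT)
        events.append({
            "start": start,
            "end": end,
            "duration_min": float(row["Duration"]),
            "volume_gal": float(row["Volume"]),
            "flow_gpm": float(row["FlowRate"]),
            "label": row["Label"],
            # Duration recomputed from the timestamps, in minutes.
            "computed_duration_min": (end - start).total_seconds() / 60.0,
        })
    return events


events = load_events(SAMPLE_CSV)
label_counts = Counter(e["label"] for e in events)  # events per class

for e in events:
    # Assumed relationship: average flow rate = volume / duration.
    computed_flow = e["volume_gal"] / e["duration_min"]
    print(e["label"], e["computed_duration_min"], round(computed_flow, 2))
```

To run this against the real file, replace `io.StringIO(text)` with an open file handle for the data file in the resource.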


This resource is part of a HydroLearn module for Hydroinformatics and Water Data Science.

Instructions for running code in the CUAHSI JupyterHub:

  1. Click on "Open with" at the top right
  2. Select CUAHSI JupyterHub and agree to terms of use
  3. Select Python as the Server Option
  4. Once JupyterHub opens, click to open the *.ipynb file for the example of interest
  5. Use the Jupyter tools and run code in each cell to retrieve data, generate plots, etc.
  6. Once the CUAHSI JupyterHub is launched, additional files associated with the resource may be opened directly (File -> Open)

Related Resources

This resource belongs to the following collections:
Title: Hydroinformatics Instruction Modules Example Code
Owners: Amber Jones
Sharing Status: Public & Shareable
My Permission: Open Access


Funding Agencies

This resource was created using funding from the following sources:
Agency Name: National Science Foundation
Award Title: Collaborative Research: Elements: Advancing Data Science and Analytics for Water (DSAW)
Award Number: 1931297

How to Cite

Bastidas Pacheco, C. J., A. S. Jones (2022). Hydroinformatics Instruction Module Example Code: Introduction to Machine Learning with Residential Water Use Data, HydroShare,

This resource is shared under the Creative Commons Attribution CC BY.

