EarthCube2021 An Approach for Creating Immutable and Interoperable End-to-End Hydrological Modeling Computational Workflows


Authors:
Owners: This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (help@cuahsi.org) for information on this resource.
Type: Resource
Storage: The size of this resource is 99.5 KB
Created: Apr 29, 2021 at 5:10 p.m.
Last updated: May 16, 2021 at 6:43 p.m.
Citation: See how to cite this resource
Sharing Status: Public
Views: 1193
Downloads: 33

Abstract

This HydroShare resource provides the Jupyter Notebooks created for the study "An Approach for Creating Immutable and Interoperable End-to-End Hydrological Modeling Computational Workflows", led by Young-Don Choi and submitted to the Notebook Sessions of the 2021 EarthCube Annual Meeting.

For instructions on how to run the Jupyter Notebooks, please refer to the README file provided in this resource.

For completeness, the abstract of the study submitted to the EarthCube session is reproduced below:

"Reproducibility is a fundamental requirement to advance science. Creating reproducible hydrological models that include all required data, software, and workflows, however, is often burdensome and requires significant work. Computational hydrology is a rapidly advancing field with fast-evolving technologies to support increasingly complex computational hydrologic modeling. The growing model complexity in terms of variety of software and cyberinfrastructure capabilities makes achieving computational reproducibility extremely challenging. Through recent reproducibility research, there have been efforts to integrate three components: 1) (meta)data, 2) computational environments, and 3) workflows. However, each component is still separate, and researchers must interoperate between these three components. These separations make verifying end-to-end reproducibility challenging. Sciunit was developed to assist scientists, who are not programming experts, with encapsulating these three components into a container to enable reproducibility in an immutable form. However, there were still limitations to support interoperable computational environments and apply end-to-end solutions, which are an ultimate goal of reproducible hydrological modeling. Therefore, the objective of this research is to advance the existing Sciunit capabilities to not only support immutable, but also interoperable computational environments and apply an end-to-end modeling workflow using the Regional Hydro-Ecologic Simulation System (RHESSys) hydrologic model as an example. First, we create an end-to-end workflow for RHESSys using pyRHESSys on the CyberGIS-Jupyter for Water platform. Second, we encapsulate the aforementioned three components and create configurations that include lists of encapsulated dependencies using Sciunit. 
Third, we create two HydroShare resources, one for immutable reproducibility evaluation using Sciunit and the other for interoperable reproducibility evaluation using library configurations created by Sciunit. Finally, we evaluate the reproducibility of Sciunit in MyBinder, which is a different computational environment, using these two resources. This research presents a detailed example of a user-centric case study demonstrating the application of an open and interoperable containerization approach from a hydrologic modeler’s perspective."

Content

Readme.md

How to run the RHESSys end-to-end workflows and create a Sciunit Container

This Readme file provides a step-by-step guide to running the developed notebooks.
The steps are explained below in the order in which they need to be taken.

STEP 0: Preliminary step

In this step, researchers make sure that they have access to the content files of this resource and to the required compute platform.
- To run the Jupyter notebooks, researchers first need a HydroShare account.
- Researchers who do not yet have access to CyberGIS-Jupyter for Water (CJW) need to submit an access request to the platform.

STEP 1: Execute 'YD_01_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb' using CJW

  • To run this notebook:
    1. Click the OpenWith button in the upper-right corner of this HydroShare resource webpage;
    2. Select "CyberGIS-Jupyter for Water";
    3. Open the notebook and follow its instructions.

In this notebook, users first run the RHESSys end-to-end modeling workflows to make sure that these workflows work properly. Then, they create a Sciunit container to encapsulate 1) data, 2) computational environments, and 3) modeling workflows. After that, users create MyBinder configuration files using Sciunit. Finally, users create two HydroShare resources, one for each case study, to evaluate reproducibility. These two HydroShare resources are used to create the MyBinder computational environments for case study 1 (STEP 2) and case study 2 (STEP 3).
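
The Sciunit part of this notebook can be pictured as a short sequence of command-line calls. The sketch below only assembles those calls as argument lists and runs them on request. It assumes the `sciunit` command-line client with its `create`, `exec`, `list`, and `repeat` subcommands; the project name `rhessys_demo` and the script `run_workflow.sh` are hypothetical placeholders, not names taken from the notebooks.

```python
import shutil
import subprocess


def sciunit_commands(project, workflow_cmd):
    """Return the Sciunit CLI calls for one capture-and-repeat cycle.

    `project` and `workflow_cmd` are placeholders supplied by the caller.
    """
    return [
        ["sciunit", "create", project],      # start a new, empty sciunit
        ["sciunit", "exec", *workflow_cmd],  # run the workflow and capture it as e1
        ["sciunit", "list"],                 # list the captured executions
        ["sciunit", "repeat", "e1"],         # re-run the first captured execution
    ]


def run_all(commands):
    """Execute the commands in order, stopping at the first failure."""
    if shutil.which("sciunit") is None:
        raise RuntimeError("the sciunit client is not installed on this machine")
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration; nothing is executed here.
    for cmd in sciunit_commands("rhessys_demo", ["./run_workflow.sh"]):
        print(" ".join(cmd))
```

Keeping the command list separate from `run_all` makes it possible to inspect the planned calls before anything is captured.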

STEP 2: Create and Run the MyBinder Computational Environment for Case Study 1 Using the HydroShare Resource Created for It

In this step, users run the second notebook, YD_02_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb, in a MyBinder computational environment to evaluate reproducibility with the immutable Sciunit container.
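
One generic way to check that a repeated run reproduces the original is to compare checksums of the output files from both runs. The sketch below is an illustration under that assumption; the notebooks may evaluate reproducibility differently, and the directory arguments are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digests(directory):
    """Map each file's path (relative to `directory`) to its SHA-256 digest."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_outputs(original_dir, repeated_dir):
    """Return the relative paths that are missing or differ between two runs."""
    original = file_digests(original_dir)
    repeated = file_digests(repeated_dir)
    return sorted(
        name
        for name in original.keys() | repeated.keys()
        if original.get(name) != repeated.get(name)
    )
```

An empty list from `compare_outputs` means that every output file is byte-identical in both runs. Outputs that embed timestamps or other run-specific values would need a looser comparison.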

STEP 3: Create and Run the MyBinder Computational Environment for Case Study 2 Using the HydroShare Resource Created for It

In this step, users run the third notebook, YD_03_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb, in a MyBinder computational environment to evaluate reproducibility in interoperable computational environments, such as CyberGIS-Jupyter for Water, for extendable computational research.
Note that the third notebook is the same as the first notebook, except that the Sciunit process has been removed to simplify the workflows and reduce the runtime.
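
MyBinder builds its environment from configuration files stored with the content, such as a `requirements.txt` file. As a rough illustration of what a library configuration might contain, the sketch below renders a list of dependencies as pinned `requirements.txt` lines. The package names and versions are made-up placeholders, and the sketch does not show how Sciunit itself generates its configuration files.

```python
from pathlib import Path


def pinned_requirements(packages):
    """Render {name: version} as sorted, pinned requirements.txt text."""
    return "".join(
        f"{name}=={version}\n" for name, version in sorted(packages.items())
    )


def write_requirements(packages, path="requirements.txt"):
    """Write the pinned requirements to `path` and return the written text."""
    text = pinned_requirements(packages)
    Path(path).write_text(text)
    return text


if __name__ == "__main__":
    # Placeholder packages and versions, for illustration only.
    print(pinned_requirements({"pandas": "1.1.5", "numpy": "1.19.5"}), end="")
```

Pinning exact versions keeps the rebuilt environment close to the one in which the workflow originally ran, while still allowing the libraries to be changed later.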

Related Resources

The content of this resource is derived from http://www.hydroshare.org/resource/d2a469fe56714715bad849a5dfc380bc

Credits

Funding Agencies

This resource was created using funding from the following sources:
Agency Name: National Science Foundation
Award Title: EarthCube Data Capabilities: Collaborative Research: Integration of Reproducibility into Community CyberInfrastructure
Award Number: ICER-1928369, ICER-1928315

How to Cite

Choi, Y., J. Goodall, I. Maghami, R. Ahmad, T. Malik, L. Band, Z. Li, S. Wang, D. Tarboton (2021). EarthCube2021 An Approach for Creating Immutable and Interoperable End-to-End Hydrological Modeling Computational Workflows, HydroShare, http://www.hydroshare.org/resource/b42bb18c5fcf4d04a6910877d5d2b222

This resource is shared under the Creative Commons Attribution (CC BY) license.

http://creativecommons.org/licenses/by/4.0/
