EarthCube2021 An Approach for Creating Immutable and Interoperable End-to-End Hydrological Modeling Computational Workflows
|This resource does not have an owner who is an active HydroShare user. Contact CUAHSI (email@example.com) for information on this resource.
|The size of this resource is 99.5 KB
|Apr 29, 2021 at 5:10 p.m.
|May 16, 2021 at 6:43 p.m.
This HydroShare resource provides the Jupyter Notebooks created for the study "An Approach for Creating Immutable and Interoperable End-to-End Hydrological Modeling Computational Workflows", led by researcher Young-Don Choi and submitted to the Notebook Sessions of the 2021 EarthCube Annual Meeting.
For instructions on how to run the Jupyter Notebooks, please refer to the README file provided in this resource.
For completeness, the abstract of the study submitted to the EarthCube session is reproduced below:
"Reproducibility is a fundamental requirement to advance science. Creating reproducible hydrological models that include all required data, software, and workflows, however, is often burdensome and requires significant work. Computational hydrology is a rapidly advancing field with fast-evolving technologies to support increasingly complex computational hydrologic modeling. The growing model complexity in terms of variety of software and cyberinfrastructure capabilities makes achieving computational reproducibility extremely challenging. Through recent reproducibility research, there have been efforts to integrate three components: 1) (meta)data, 2) computational environments, and 3) workflows. However, each component is still separate, and researchers must interoperate between these three components. These separations make verifying end-to-end reproducibility challenging. Sciunit was developed to assist scientists, who are not programming experts, with encapsulating these three components into a container to enable reproducibility in an immutable form. However, there were still limitations to support interoperable computational environments and apply end-to-end solutions, which are an ultimate goal of reproducible hydrological modeling. Therefore, the objective of this research is to advance the existing Sciunit capabilities to not only support immutable, but also interoperable computational environments and apply an end-to-end modeling workflow using the Regional Hydro-Ecologic Simulation System (RHESSys) hydrologic model as an example. First, we create an end-to-end workflow for RHESSys using pyRHESSys on the CyberGIS-Jupyter for Water platform. Second, we encapsulate the aforementioned three components and create configurations that include lists of encapsulated dependencies using Sciunit. 
Third, we create two HydroShare resources, one for immutable reproducibility evaluation using Sciunit and the other for interoperable reproducibility evaluation using library configurations created by Sciunit. Finally, we evaluate the reproducibility of Sciunit in MyBinder, which is a different computational environment, using these two resources. This research presents a detailed example of a user-centric case study demonstrating the application of an open and interoperable containerization approach from a hydrologic modeler’s perspective."
How to run the RHESSys end-to-end workflows and create a Sciunit Container
This README file provides users with a step-by-step guide to successfully running the two developed notebooks.
The steps, in the order they need to be taken, are explained in what follows.
STEP 0: Preliminary step
In this step, researchers make sure that they have access to the content files of this resource and to the required compute platform.
- To run the Jupyter notebooks, researchers first need a HydroShare account.
- Researchers who do not yet have access to CyberGIS-Jupyter for Water (CJW) need to submit an access request to the CyberGIS-Jupyter for Water platform.
STEP 1: Execute 'YD_01_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb' using CJW
- To run this notebook:
  - Click the OpenWith button in the upper-right corner of this HydroShare resource webpage;
  - Select "CyberGIS-Jupyter for Water";
  - Open the notebook and follow the instructions.
In this notebook, users first run the RHESSys end-to-end modeling workflows to make sure these workflows are working properly. Then, they create a Sciunit container to encapsulate 1) data, 2) computational environments, and 3) modeling workflows. After that, users create MyBinder configuration files using Sciunit. Finally, users create two HydroShare resources, one for each case study, to evaluate reproducibility. These two HydroShare resources are used to create the MyBinder computational environments for case study 1 (step 2) and case study 2 (step 3).
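The Sciunit part of this step can be pictured as a short sequence of command-line calls. The sketch below only assembles and prints those calls; it does not run them. It assumes the standard Sciunit subcommands `create`, `exec`, `list`, and `repeat`. The project name `rhessys_demo` and the script `run_workflow.py` are hypothetical placeholders, not names taken from the notebook, and the notebook itself may invoke Sciunit differently.

```python
import shlex


def sciunit_commands(project, workflow_cmd):
    """Return the Sciunit CLI calls for capturing and then repeating one run.

    `project` and `workflow_cmd` are illustrative placeholders supplied by
    the caller; they are not names used in the original notebook.
    """
    return [
        ["sciunit", "create", project],      # start a new Sciunit project
        ["sciunit", "exec", *workflow_cmd],  # run the workflow and capture it (first capture is e1)
        ["sciunit", "list"],                 # list the captured executions
        ["sciunit", "repeat", "e1"],         # re-run the first captured execution
    ]


# Print the commands so they can be reviewed or pasted into a terminal.
for cmd in sciunit_commands("rhessys_demo", ["python", "run_workflow.py"]):
    print(shlex.join(cmd))
```

Building the commands as lists keeps quoting explicit; each list could be passed to `subprocess.run` on a machine where Sciunit is installed.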
STEP 2: Create and Run MyBinder Computational Environment for the Case Study-1 using the HydroShare Resource Created for It
In this step, users run the second notebook,
YD_02_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb, in a MyBinder computational environment to evaluate reproducibility with the immutable Sciunit container.
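A MyBinder environment for a HydroShare resource is normally opened through a launch URL. The sketch below builds such a URL from a resource link. It assumes that HydroShare resource IDs are 32 hexadecimal characters and that the launch pattern is `<binder host>/v2/hydroshare/<resource id>/`; it is worth confirming the result against the form on the MyBinder home page. The function name `binder_url` and the ID in the example are hypothetical.

```python
import re

# Assumption: a HydroShare resource ID is 32 lowercase hexadecimal characters.
_RESOURCE_ID = re.compile(r"[0-9a-f]{32}")


def binder_url(hydroshare_resource, binder_host="https://mybinder.org"):
    """Build a Binder launch URL from a HydroShare resource URL or bare ID."""
    match = _RESOURCE_ID.search(hydroshare_resource)
    if match is None:
        raise ValueError("no 32-character HydroShare resource ID found")
    return f"{binder_host}/v2/hydroshare/{match.group(0)}/"


# Placeholder ID for illustration only; substitute the ID of the resource
# created for the case study in step 1.
print(binder_url("https://www.hydroshare.org/resource/0123456789abcdef0123456789abcdef/"))
```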
STEP 3: Create and Run MyBinder Computational Environment for the Case Study-2 using the HydroShare Resource Created for It
In this step, users run the third notebook,
YD_03_An_Approach_for_Creating_Immutable_and_Interoperable_End_to_End_Hydrological_Modeling_Computational_Workflows.ipynb, in a MyBinder computational environment to evaluate reproducibility in interoperable computational environments, such as CyberGIS-Jupyter for Water, for extendable computational research.
Note that the third notebook is the same as the first notebook, except that the Sciunit process has been removed to simplify the workflows and reduce the runtime.
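One simple way to compare a run on CJW with a run on MyBinder is to hash the output files of both and list the ones that differ. This is an illustrative sketch only: the README does not describe how the notebooks evaluate reproducibility, and the function names and directory arguments are hypothetical. Byte-for-byte comparison is also strict, since outputs that embed timestamps or differ in floating-point rounding will be reported as different.

```python
import hashlib
from pathlib import Path


def file_digests(directory):
    """Map each file's path (relative to `directory`) to its SHA-256 digest."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(original_dir, repeated_dir):
    """Return relative paths that are missing on one side or differ in content."""
    original = file_digests(original_dir)
    repeated = file_digests(repeated_dir)
    names = original.keys() | repeated.keys()
    return sorted(name for name in names if original.get(name) != repeated.get(name))
```

An empty list from `differing_files` means both output directories contain the same files with identical contents.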
This resource was created using funding from the following sources:
|National Science Foundation
|EarthCube Data Capabilities: Collaborative Research: Integration of Reproducibility into Community CyberInfrastructure
How to Cite
This resource is shared under the Creative Commons Attribution CC BY license. http://creativecommons.org/licenses/by/4.0/