Cloud-based Jupyter Notebooks for Water Data Analysis
|Authors:||Anthony Castronova · Liza Brazil · Martin Seul|
|Resource type:||Composite Resource|
|Created:||Dec 07, 2017 at 8:06 p.m.|
|Last updated:||Jul 18, 2018 at 11:32 a.m. by Anthony Castronova|
The development and adoption of technologies by the water science community to improve our ability to openly collaborate and share workflows will have a transformative impact on how we address the challenges associated with collaborative and reproducible scientific research. Jupyter notebooks offer one solution by providing an open-source platform for creating metadata-rich toolchains for modeling and data analysis applications. Adoption of this technology within the water sciences, coupled with publicly available datasets from agencies such as USGS, NASA, and EPA, enables researchers to easily prototype and execute data-intensive toolchains. Moreover, implementing this software stack in a cloud-based environment extends its native functionality, giving researchers a mechanism to build and execute toolchains that are too large or computationally demanding for typical desktop computers. Additionally, this cloud-based solution enables scientists to disseminate data processing routines alongside journal publications in an effort to support reproducibility. For example, these data collection and analysis toolchains can be shared, archived, and published using the HydroShare platform, or downloaded and executed locally to reproduce scientific analyses. This work presents the design and implementation of a cloud-based Jupyter environment and its application for collecting, aggregating, and munging various datasets in a transparent, shareable, and self-documented manner. The goals of this work are to establish a free and open-source platform for domain scientists to (1) conduct data-intensive and computationally intensive collaborative research, (2) use high-performance libraries, models, and routines within a pre-configured cloud environment, and (3) enable dissemination of research products.
This presentation will discuss recent efforts towards achieving these goals, and describe the architectural design of the notebook server in an effort to support collaborative and reproducible science.
This was presented as an EPoster at the 2017 American Geophysical Union and can be found at:
How to cite
This resource is shared under the Creative Commons Attribution CC BY. http://creativecommons.org/licenses/by/4.0/
|Liza Brazil||CUAHSI||(339) 221-5400 x204|
|Martin Seul||CUAHSI||Massachusetts, US||(339) 933-4656|