Tao Wen

Syracuse University | Assistant Professor

Subject Areas: Hydrogeochemistry, Environmental Data Sciences

 Recent Activity

ABSTRACT:

River and groundwater geochemistry reflect the integrated results of natural and anthropogenic biogeochemical processes that generate and transport solutes from rocks and soil to rivers and aquifers. These solutes can be derived from multiple sources, such as the weathering of silicate, carbonate, and evaporite rocks, meteoric precipitation, and human pollution. Delineating the contribution of each source to measured solute chemistry is critical for understanding biogeochemical processes and for helping policymakers improve management strategies to safeguard water resources under a changing climate.

End Member Mixing Analysis (EMMA) refers to a group of methods that are increasingly used to identify solute sources and quantify their contributions to water chemistry. EMMA techniques include statistical methods, inverse models, and emerging tools based on machine learning such as non-negative matrix factorization (NMF). As the availability of large hydrogeochemistry datasets and computational resources improves, there is an increased need for the hydrogeochemistry community to become fluent in a variety of EMMA approaches. In this half-day workshop, we will teach how to apply EMMA techniques with genuine river and groundwater hydrogeochemistry datasets in R and MATLAB to solve Earth and environmental sciences problems.

Workshop attendees will use the CUAHSI JupyterHub and MATLAB online cloud computing environment to complete workshop activities. This computing environment will be pre-configured with all necessary software to maximize engagement in end member mixing analysis workflows. Attendees are required to bring a personal laptop, but all software libraries and hardware requirements will be pre-configured for them, so no software installation will be necessary.
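As a rough illustration of the NMF-based mixing analysis mentioned in this abstract, the sketch below factors a non-negative sample-by-solute matrix into source contributions and source compositions using the standard multiplicative update rules. It is not the workshop's code: the function name `nmf_multiplicative`, the synthetic end members, and all numbers are hypothetical, and the workshop itself is taught in R and MATLAB rather than Python.

```python
import numpy as np


def nmf_multiplicative(X, n_sources, n_iter=1000, seed=0, eps=1e-12):
    """Approximate X (samples x solutes, non-negative) as W @ H.

    W (samples x sources) holds each source's contribution to each sample;
    H (sources x solutes) holds each source's chemical signature.
    Uses multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_solutes = X.shape
    W = rng.random((n_samples, n_sources)) + eps
    H = rng.random((n_sources, n_solutes)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic data (hypothetical): 50 samples mixed from two made-up end members.
rng = np.random.default_rng(42)
true_H = np.array([[10.0, 1.0, 0.5, 8.0],   # hypothetical end member A
                   [1.0, 6.0, 4.0, 0.5]])   # hypothetical end member B
true_W = rng.random((50, 2))
X = true_W @ true_H

W, H = nmf_multiplicative(X, n_sources=2)

# Relative reconstruction error of the factorization.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Share of each sample's reconstructed total solute load attributed to each source.
contribution = W * H.sum(axis=1)
fractions = contribution / contribution.sum(axis=1, keepdims=True)
```

NMF solutions are not unique, so recovered factors generally need to be compared against known end-member chemistry before they are interpreted as physical sources.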


ABSTRACT:

High methane and salt levels in groundwater have been the most widely cited water impairments related to unconventional oil and gas development (UOGD). The attribution of these contaminants to UOGD is usually complex, especially in regions with mixed land uses. Here, we compiled a large hydrogeochemistry dataset containing 13 geochemical analytes for 17,794 groundwater samples from rural northern Appalachia, i.e., 19 counties located on the boundary between Pennsylvania (PA; UOGD is permitted) and New York (NY; UOGD is banned). With this dataset, we explored whether statistical and geospatial tools can help shed light on the sources of inorganic solutes and methane in groundwater in regions with mixed land uses. Traditional Principal Component Analysis (PCA) indicates that salts in NY and PA groundwater are mainly from the Appalachian Basin Brine (ABB). In contrast, the machine learning tool Non-negative Matrix Factorization (NMF) highlights that road salts (in addition to ABB) account for 36%–48% of total chloride in NY and PA groundwaters. PCA fails to identify road salts as a water/salt source, likely because of their geochemical similarity to ABB. Neither PCA nor NMF detects a regional impact of UOGD on groundwater quality. Our geospatial analyses further corroborate that (1) road salting is the major salt source in groundwater, and its impact is enhanced in proximity to highways; and (2) UOGD-related groundwater quality deterioration is limited to only a few localities in PA.
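To make the idea of a source "accounting for" a share of total chloride concrete, the sketch below shows one way such a share could be computed from the two NMF factor matrices. This is an assumption about the bookkeeping, not the study's published procedure: the function `analyte_share_by_source`, the source labels, and every number are hypothetical and do not reproduce the 36%–48% result.

```python
import numpy as np


def analyte_share_by_source(W, H, analyte_index):
    """Fraction of one analyte's reconstructed total attributed to each source.

    W: samples x sources contributions; H: sources x analytes signatures.
    """
    # Amount of the analyte that each source supplies to each sample.
    per_sample = W * H[:, analyte_index]
    # Sum over samples, then normalize across sources.
    totals = per_sample.sum(axis=0)
    return totals / totals.sum()


analytes = ["Cl", "Na", "Br"]
H = np.array([[100.0, 60.0, 1.0],   # hypothetical brine-like signature
              [50.0, 32.0, 0.0]])   # hypothetical road-salt-like signature
W = np.array([[1.0, 0.0],           # sample fed only by the first source
              [0.5, 1.0],           # mixed sample
              [0.0, 2.0]])          # sample fed only by the second source

shares = analyte_share_by_source(W, H, analytes.index("Cl"))
# shares -> array([0.5, 0.5]): each hypothetical source supplies half of total Cl
```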


ABSTRACT:

As natural gas has grown in importance as a global energy source, leakage of methane (CH4) from wells has sometimes been noted. Leakage of this greenhouse gas is important because it affects groundwater quality and, when emitted to the atmosphere, climate. We hypothesized that streams might be most contaminated by CH4 in the northern Appalachian Basin in regions with the longest history of hydrocarbon extraction activities. To test this, we searched for CH4-contaminated streams in the basin. Methane concentrations ([CH4]) for 529 stream sites are reported in New York, West Virginia and (mostly) Pennsylvania. Despite targeting contaminated areas, the median [CH4], 1.1 μg/L, was lower than a recently identified threshold indicating potential contamination, 4.0 μg/L. [CH4] values were higher in a few streams because they receive high-[CH4] groundwaters, often from upwelling seeps. By analogy to the more commonly observed type of groundwater seep known as abandoned mine drainage (AMD), we introduce the term, “gas leak discharge” (GLD) for these waters where they are not associated with coal mines. GLD and AMD, observed in all parts of the study area, are both CH4-rich. Surprisingly, the region of oldest and most productive oil/gas development did not show the highest median for stream [CH4]. Instead, the median was statistically highest where dense coal mining was accompanied by conventional and unconventional oil and gas development, emphasizing the importance of CH4 contamination from coal mines into streams.


ABSTRACT:

This study presents measurements of the bulk gas composition, stable isotopes, and noble gas volume fractions and isotopes for shale gas samples collected from gas wells in the Wufeng-Longmaxi shale, southern Sichuan Basin, China. These gas wells are divided into two groups, forelimb and backlimb, based on their locations relative to the anticline. The dryness [C1/(C2 + C3)], ranging from 166.3 to 251.2, combined with δ13C1 and δDC1 data that vary from –28.83 to –27.26 ‰ and –152.5 to –144.6 ‰, respectively, points to a late mature thermogenic origin of the gas. 3He/4He ratios of gas samples are mostly around 0.01 times the air value, suggesting a dominant crust-derived He. 21Ne/22Ne and 40Ar/36Ar ratios of many gas samples are higher than the corresponding air values, indicating the mixing of terrigenic and atmospheric noble gases. In addition, forelimb samples present the highest 21Ne/22Ne and 40Ar/36Ar ratios, indicating a larger contribution of terrigenic noble gas in these wells. Elemental ratios of air-derived noble gas isotopes (22Ne/36Ar, 84Kr/36Ar, and 132Xe/36Ar) are compared to the recharge water values, pointing to the interactions of oil, gas, and water phases in the shale over geologic time. The study of terrigenic noble gases further suggests the addition of crust-derived noble gases from deeper formations. Unlike backlimb samples, all of the forelimb samples display ages older than 45 Ma, the age of the major tectonic exhumation event in the study area, likely indicating a larger flux of external radiogenic 4He due to the higher density of deep faults in the forelimb area caused by the basement-involved deformation. Meanwhile, the basement-involved deformation causes pore collapse, especially in the forelimb, leading to lower porosity, which might in turn allow better preservation of noble gases in the shale by reducing the recharge of younger groundwater into the shale.


 Contact

Mobile 7347308814
Website http://jaywen.com/

 Author Identifiers

Resources
Resource
Sliding Window Geospatial Tool
Created: June 26, 2020, 7:32 a.m.
Authors: Wen, Tao

ABSTRACT:

This resource collects teaching materials that were originally created for the in-person course 'GEOSC/GEOG 497 – Data Mining in Environmental Sciences' at Penn State University (co-taught by Tao Wen, Susan Brantley, and Alan Taylor) and then refined and revised by Tao Wen for use in the online teaching module 'Data Science in Earth and Environmental Sciences' hosted on the NSF-sponsored HydroLearn platform.

This resource includes both R Notebooks and Python Jupyter Notebooks to teach the basics of R and Python coding, data analysis and data visualization, as well as building machine learning models in both programming languages by using authentic research data and questions. All of these R/Python scripts can be executed either on the CUAHSI JupyterHub or on your local machine.

This resource is shared under the CC-BY license. Please contact the creator Tao Wen at Syracuse University (twen08@syr.edu) for any questions you have about this resource. If you identify any errors in the files, please contact the creator.

Resource
Python and R Basics for Environmental Data Sciences
Created: July 1, 2020, 7:11 p.m.
Authors: Wen, Tao

ABSTRACT:

This resource collects teaching materials that were originally created for the in-person course 'GEOSC/GEOG 497 – Data Mining in Environmental Sciences' at Penn State University (co-taught by Tao Wen, Susan Brantley, and Alan Taylor) and then refined and revised by Tao Wen for use in the online teaching module 'Data Science in Earth and Environmental Sciences' hosted on the NSF-sponsored HydroLearn platform.

This resource includes both R Notebooks and Python Jupyter Notebooks to teach the basics of R and Python coding, data analysis and data visualization, as well as building machine learning models in both programming languages by using authentic research data and questions. All of these R/Python scripts can be executed either on the CUAHSI JupyterHub or on your local machine.

This resource is shared under the CC-BY license. Please contact the creator Tao Wen at Syracuse University (twen08@syr.edu) for any questions you have about this resource. If you identify any errors in the files, please contact the creator.

Resource
Intelligent Earth Collective Notebooks
Created: Aug. 6, 2020, 8:27 p.m.
Authors: Bandaragoda, Christina

ABSTRACT:

The structure of our course is designed to mirror the structure of interactive learning in a Jupyter Notebook. It is just like a chemistry lab: we squeeze the Earth into a web-browser-shaped beaker and turn on the team science Bunsen burner. Each Section in the module is based on one experiment (or computational workflow) that replicates results available in a published journal article, in collaboration with publication coauthors. Sub-sections have various themes, research question motivations, datasets, and models, but each has one assessment focusing on skills to address Nested Learning Objectives identified by each Summary question. Sub-sections include: Introduction, Theoretical Background, Methods, Results, Discussion, Conclusion.
In the Experiments, we distinguish between cyberinfrastructure, data science, and geoscience domain methods. We also introduce Team Science (Convergence, Diversity, Inclusion, Equity) and Information Science elements in each Section (Findable, Accessible, Interoperable, Reusable).

Resource
Archived Dataset for Epuna et al. (2022) in Water Research
Created: July 2, 2022, 8:49 p.m.
Authors: Wen, Tao
