ML23166A112
ML23166A112 | |
Person / Time | |
---|---|
Issue date: | 06/14/2023 |
From: | Gundimada S, Matthew Homiack, Raj Iyengar, Lubars J, Starr M, Verzi S Office of Nuclear Regulatory Research, Sandia |
To: | |
References | |
SAND2023-06170PE | |
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
Artificial Intelligence and Machine Learning Support for Probabilistic Fracture Mechanics Analysis
Stephen J. Verzi, Joseph P. Lubars, Satyanadh Gundimada, Michael J. Starr (Sandia National Laboratories)
Matthew Homiack, Raj Iyengar (U.S. Nuclear Regulatory Commission)
Machine Learning / Deep Learning Workshop, July 17-20, 2023
SAND2023-06170PE
This presentation was prepared as an account of work sponsored by an agency of the U.S. Government. Neither the U.S. Government nor any agency thereof, nor any of their employees, makes any warranty, expressed or implied, or assumes any legal liability or responsibility for any third party's use, or the results of such use, of any information, apparatus, product, or process disclosed in this report, or represents that its use by such third party would not infringe privately owned rights.
The views expressed in this presentation are not necessarily those of the U.S. Nuclear Regulatory Commission.
Abstract
In this research, artificial intelligence and machine learning (ML) methods are used to search an uncertain parameter space more efficiently for the most important inputs with respect to response sensitivities and then to construct, train, and test low-fidelity surrogate models. These methods are applied to the Extremely Low Probability of Rupture (xLPR) probabilistic fracture mechanics code used at the U.S. Nuclear Regulatory Commission in support of nuclear regulatory research. This presentation will show two separate but related efforts: 1) ranking important uncertain input features with respect to target outputs as determined by convergence in the confidence intervals for increasing sample sizes using simple random sampling; and 2) implementation of a reduced-order surrogate model for fast, approximate sample generation. Unoptimized, readily available off-the-shelf ML models were used in both efforts.
The results show that ML models can assist analysts in conducting sensitivity and uncertainty analyses with respect to typical xLPR use cases. The ML models help to reduce the number of random realizations needed in the xLPR simulation by focusing on the most important input parameters (as part of the first effort) and augment xLPR output generation by providing quick approximate time-series simulation (as part of the second effort). This research involved several different kinds of ML models, including linear, random forest, gradient boosting, and multi-layer perceptron regression techniques.
Even though not all models perform well in each task or scenario, especially when data is scarce (i.e., low probabilities of leak and rupture), the results show that there are cases where each ML model can perform well. Future efforts can focus on hyperparameter optimization, when appropriate. Both efforts were intended to augment xLPR simulations, which is what the results show.
Outline
- Problem Space
  - Probabilistic Fracture Mechanics (PFM)
  - Extremely Low Probability of Rupture (xLPR) Code
- Solution Space
  - Artificial Intelligence (AI) and Machine Learning (ML)
  - Sensitivity Analysis with xLPR
    - Importance Sampling
    - Determining Appropriate Sample Size
  - Surrogate Model for xLPR Quantities of Interest (QoIs)
    - Time of 1st Leak, if any
    - Crack Propagation via Normalized Depth
- Results
- Potential Future Work
- Acknowledgments and Contacts
Problem Space - Probabilistic Fracture Mechanics (PFM) Analysis
This figure illustrates a simplistic PFM analysis. The curve on the left represents the distribution of crack driving force or applied stress intensity factor (SIF), which depends on the uncertainties in stress and crack size. The curve on the right represents the toughness distribution or critical (i.e., allowable) SIF of the material. When the two distributions overlap, there is a finite probability of failure, which is indicated by the shaded area. Time-dependent crack growth, such as from fatigue or stress-corrosion cracking or both, can be considered by applying the appropriate growth laws to the crack distribution. Crack growth can cause the applied SIF distribution to shift to the right with time, thereby increasing the probability of failure.
Figure from U.S. NRC Technical Letter Report, TLR-RES/DE/REB-2022-13.
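The overlap argument on this slide can be sketched numerically. The following is a minimal Monte Carlo illustration, not xLPR: the normal distributions, their means and standard deviations, and the `shift` parameter standing in for crack growth are all hypothetical values chosen only to make the overlap visible.

```python
import math
import numpy as np

def failure_probability(n_samples, rng, shift=0.0):
    """Estimate P(applied SIF > critical SIF) by simple random sampling."""
    # Hypothetical distributions (units of MPa*sqrt(m)); not taken from xLPR.
    k_applied = rng.normal(loc=40.0 + shift, scale=8.0, size=n_samples)
    k_critical = rng.normal(loc=70.0, scale=10.0, size=n_samples)
    # A realization "fails" when the driving force exceeds the toughness.
    return float(np.mean(k_applied > k_critical))

rng = np.random.default_rng(0)
p_mc = failure_probability(200_000, rng)

# For two independent normals the margin K_applied - K_critical is also
# normal, so the overlap probability has a closed form to check against.
margin_mean = 40.0 - 70.0
margin_std = math.hypot(8.0, 10.0)
p_exact = 0.5 * math.erfc(-margin_mean / (margin_std * math.sqrt(2.0)))

# Shifting the applied-SIF distribution to the right (a stand-in for crack
# growth over time) increases the estimated failure probability.
p_shifted = failure_probability(200_000, rng, shift=10.0)

print(f"Monte Carlo: {p_mc:.4f}  closed form: {p_exact:.4f}  shifted: {p_shifted:.4f}")
```

With rare failures, most realizations contribute nothing to the estimate, which is one way to see why the sample-size question raised later in the deck matters.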
Problem Space - Extremely Low Probability of Rupture (xLPR) Code Analysis Architecture
Figure from U.S. NRC Technical Letter Report, TLR-RES/DE/REB-2022-13.
Sensitivity Analysis with xLPR
Given input (uncertain parameter) distributions and associated Monte Carlo outputs, can we use AI/ML models to determine and rank the importance of inputs while finding the proper sample size with respect to the output set?
Figure: inputs mapped to ranked features.
NRC, Technical Letter Report TLR-RES/DE/REB-2021-14-R1, "Probabilistic Leak-Before-Break Evaluations of Pressurized-Water Reactor Piping Systems using the Extremely Low Probability of Rupture Code," April 2022, ADAMS Accession No. ML22088A006
Surrogate Modeling using ML for xLPR
Given input (uncertain parameter) distributions and associated Monte Carlo outputs, can we use AI/ML models to train a surrogate model with respect to the output set?
Outputs or Quantities of Interest (QoIs):
- cc_depth_normalized
- cc_ID_length_normalized
- cc_OD_length_normalized
- is_leaking
- is_ruptured
- total_leak_rate
Potential surrogate models:
- Leak time by sample index (492 of 2000 samples)
- Normalized crack depth versus simulation time (240 time steps for each of 2000 samples)
Results
- Using random forest regressor (scikit-learn)
  - Mean decrease in impurity (MDI)
  - Permutation importance values
- Using linear regression (scikit-learn)
Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011. https://dl.acm.org/doi/abs/10.5555/1953048.2078195
Figure: inputs mapped to outputs or Quantities of Interest (QoIs): cc_depth_normalized, cc_ID_length_normalized, cc_OD_length_normalized, is_leaking, is_ruptured, total_leak_rate.
Results - Using Random Forest Regressor
Multi-variate output set (6) given input set (65) w/ 200 samples
Input Variable | Permutation Importance |
---|---|
WRS_axial_premitigation_pt01 | 0.5973 |
WRS_axial_premitigation_pt26 | 0.0680 |
WRS_axial_premitigation_pt24 | 0.0545 |
WRS_axial_premitigation_pt22 | 0.0501 |
WRS_axial_premitigation_pt05 | 0.0351 |
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.0280 |
WRS_axial_premitigation_pt21 | 0.0255 |
weld_material_PWSCC_growth_activation_energy_Qg | 0.0202 |
WRS_axial_premitigation_pt02 | 0.0158 |
WRS_axial_premitigation_pt07 | 0.0133 |
WRS_axial_premitigation_pt17 | 0.0107 |
Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.
Results - Using Random Forest Regressor
Multi-variate output set (6) given input set (65) w/ 2000 samples
Input Variable | Permutation Importance |
---|---|
WRS_axial_premitigation_pt01 | 0.9740 |
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.2029 |
weld_material_PWSCC_growth_within_component_variability_factor_fflaw | 0.1503 |
WRS_axial_premitigation_pt14 | 0.0301 |
WRS_axial_premitigation_pt26 | 0.0268 |
WRS_axial_premitigation_pt22 | 0.0230 |
WRS_axial_premitigation_pt02 | 0.0176 |
WRS_axial_premitigation_pt23 | 0.0163 |
WRS_axial_premitigation_pt24 | 0.0162 |
WRS_axial_premitigation_pt07 | 0.0158 |
Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.
Results - Using Random Forest Regressor
Multi-variate output set (6) given input set (65) w/ 20000 samples
Input Variable | Permutation Importance |
---|---|
WRS_axial_premitigation_pt01 | 1.2146 |
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.3489 |
weld_material_PWSCC_growth_within_component_variability_factor_fflaw | 0.2416 |
WRS_axial_premitigation_pt02 | 0.1115 |
initial_cc_full_length | 0.0323 |
WRS_axial_premitigation_pt25 | 0.0226 |
weld_material_PWSCC_growth_activation_energy_Qg | 0.0205 |
WRS_axial_premitigation_pt26 | 0.0177 |
WRS_axial_premitigation_pt24 | 0.0162 |
left_pipe_material_yield_strength | 0.0155 |
Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.
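One possible reading of the ranking rule above is sketched below with scikit-learn: a pure-noise column is appended to the inputs, a multi-output random forest is fit, and only inputs whose permutation importance exceeds the noise column's mean importance plus two standard deviations are kept. The synthetic data, the names `input_0` through `input_7` and `random_feature`, and the choice to score on the training set are assumptions for illustration; the slides do not specify how the random feature was constructed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n_samples, n_inputs = 400, 8

# Synthetic stand-in for xLPR inputs and outputs: only columns 0 and 1 matter.
X = rng.normal(size=(n_samples, n_inputs))
Y = np.column_stack([
    3.0 * X[:, 0] + 0.1 * rng.normal(size=n_samples),
    X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=n_samples),
])

# Append a pure-noise column as the reference "random feature" (assumption).
X_aug = np.column_stack([X, rng.normal(size=n_samples)])
names = [f"input_{i}" for i in range(n_inputs)] + ["random_feature"]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_aug, Y)

# Mean decrease in impurity (MDI) comes for free from the fitted forest.
mdi = model.feature_importances_

# Permutation importance: drop in the model score (R^2 averaged over the
# outputs) when one column is shuffled, repeated n_repeats times.
result = permutation_importance(model, X_aug, Y, n_repeats=10, random_state=0)

rand_idx = len(names) - 1
threshold = result.importances_mean[rand_idx] + 2.0 * result.importances_std[rand_idx]

# Keep and rank only the real inputs that clear the noise threshold.
ranked = sorted(
    ((names[i], float(result.importances_mean[i]))
     for i in range(n_inputs)
     if result.importances_mean[i] > threshold),
    key=lambda item: item[1],
    reverse=True,
)

for name, importance in ranked:
    print(f"{name:16s} {importance:.4f}")
```

Because the score drop is measured in units of R^2, which can fall below zero when an important column is shuffled, an importance above 1.0 (as in the 20000-sample table) is possible for a dominant input.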
Results - Using Random Forest Regressor
Determining appropriate sample size (all 6 QoIs): confidence intervals for 200, 2000, and 20000 samples
Figure: six panels, one per QoI (cc_depth_normalized, cc_ID_length_normalized, cc_OD_length_normalized, is_leaking, is_ruptured, total_leak_rate), each plotting the mean with the 95% CI high and low bounds (value) against the number of samples (200, 2000, 20000).
2000 samples are sufficient
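The convergence check described on this slide can be illustrated with a short sketch. It assumes a normal-approximation 95% confidence interval on the sample mean (the slides do not state which interval was used) and a synthetic 0/1 indicator whose rate of 0.246 simply mirrors the 492-of-2000 leak fraction reported elsewhere in the deck; `mean_confidence_interval` is a hypothetical helper name.

```python
import numpy as np

def mean_confidence_interval(values, z=1.96):
    """Return (low, mean, high) for a normal-approximation CI on the mean."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    half_width = z * values.std(ddof=1) / np.sqrt(values.size)
    return mean - half_width, mean, mean + half_width

rng = np.random.default_rng(0)
widths = []
for n in (200, 2000, 20000):
    # Synthetic stand-in for an indicator QoI such as is_leaking.
    sample = rng.random(n) < 0.246
    low, mean, high = mean_confidence_interval(sample)
    widths.append(high - low)
    print(f"n={n:6d}  low={low:.4f}  mean={mean:.4f}  high={high:.4f}")
```

The interval narrows roughly as one over the square root of the sample size, so each tenfold increase in samples buys about a threefold reduction in width; judging when that is "sufficient" remains an analyst decision.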
Results - Surrogate Model
Can we predict time of first leak?
- A) Training process: xLPR inputs and the time at which the xLPR QoI is_leaking changes from 0 to 1 are used in training to produce a trained model
- B) Use in prediction: xLPR inputs are passed to the trained model to obtain the predicted xLPR QoI
- 492 out of 2000 samples result in leak
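The training target described above, the time at which is_leaking changes from 0 to 1, can be extracted from a per-sample indicator time series as in the following sketch. The array layout (samples by time steps), the helper name `first_leak_step`, and the use of -1 for samples that never leak are assumptions for illustration, not the xLPR output format.

```python
import numpy as np

def first_leak_step(is_leaking):
    """Index of the first time step where the indicator is 1, or -1 if never.

    is_leaking: array of shape (n_samples, n_steps) holding 0/1 values.
    """
    is_leaking = np.asarray(is_leaking).astype(bool)
    # argmax returns the first True along the time axis (0 if there is none).
    first = is_leaking.argmax(axis=1)
    never = ~is_leaking.any(axis=1)
    return np.where(never, -1, first)

demo = np.array([
    [0, 0, 1, 1],  # leak first appears at step 2
    [0, 0, 0, 0],  # never leaks, so it is excluded from training
    [1, 1, 1, 1],  # leaking from the first step
])
steps = first_leak_step(demo)
leaked = steps >= 0
print(steps, leaked)
```

Only the samples flagged in `leaked` carry a usable regression target, which is why the leak-time models in the following slides train on 492 of the 2000 samples.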
Results - Using Linear Regression
- 492 (out of 2000) samples, 75/25 % training/testing split
- Seems like we do not have enough training data
- mse = 662.7
- Figure: predicted leak time (in months) for test set by sample index (axes: test sample index, leak time in months)
Results - Using Linear Regression
- 492 (out of 2000) samples, 75/25 % training/testing split
- Seems like we do not have enough training data
- mse = 662.7
- Figure: ground truth versus predicted for test set by sample (axes: predicted leak time, ground truth)
Results - Using Random Forest Regression
- 492 (out of 2000) samples, 75/25 % training/testing split
- Seems to miss both high and low values (we probably do not have enough training data)
- mse = 707.5
- Figure: ground truth versus predicted for test set by sample (axes: predicted leak time, ground truth)
Results - Using Random Forest Regression
- 492 (out of 2000) samples, 100/0 % training/testing split
- Significant drop in mse
- mse = 108.6
- Figure: predicted leak time (in months) for train set by sample index (axes: sample index, leak time in months)
Results - Using Random Forest Regression
- 492 (out of 2000) samples, 100/0 % training/testing split
- Significant drop in mse, but the model is still not capturing the high and low ends well
- mse = 108.6
- Figure: ground truth versus predicted for train set by sample (axes: predicted leak time, ground truth)
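The gap between the 75/25 and 100/0 results can be reproduced in miniature. The sketch below fits scikit-learn's linear regression and random forest regressor to a synthetic leak-time target and reports mean squared error on both the training and the held-out data; the data-generating formula, the 10 input columns, and the noise level are hypothetical and chosen only so the effect is visible.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_inputs = 492, 10

# Synthetic stand-in for xLPR inputs and a noisy leak time in months.
X = rng.normal(size=(n_samples, n_inputs))
y = 120.0 + 30.0 * X[:, 0] - 20.0 * X[:, 1] ** 2 + 25.0 * rng.normal(size=n_samples)

# 75/25 % training/testing split, as on the slides.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = {
        "train_mse": mean_squared_error(y_train, model.predict(X_train)),
        "test_mse": mean_squared_error(y_test, model.predict(X_test)),
    }
    print(name, scores[name])
```

A random forest scored on the data it was fit to will usually report a much lower error than on held-out data, so an MSE from a 100/0 split is not directly comparable to one from a 75/25 split.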
Results - Surrogate Model
Can we predict normalized crack depth at the next time step? Use case: leak occurs (when normalized crack depth is 1.0).
- A) Training process: xLPR inputs and the xLPR QoI normalized_crack_depth at time t are used in training, with the xLPR QoI normalized_crack_depth at time t+1 as the target, to produce a trained model
- B) Use in prediction: xLPR inputs and the xLPR QoI normalized_crack_depth at time t are passed to the trained model to obtain the predicted normalized_crack_depth at time t+1
- 2000 samples with 240 time steps each
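The one-step-ahead setup above can be sketched as follows: each sample's time series is unrolled into (inputs, depth at t) to depth at t+1 training pairs, and a fitted model is then applied repeatedly, feeding each prediction back in as the next depth. The synthetic growth curves, the single `rate` input, and the `rollout` helper are hypothetical; xLPR has many more inputs and different crack-growth behavior.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_samples, n_steps = 50, 240

# Synthetic stand-in: one uncertain input (a growth rate) per sample, with
# normalized depth growing over time and capped at 1.0 (through-wall).
rate = rng.uniform(0.002, 0.02, size=n_samples)
t = np.arange(n_steps)
depth = np.minimum(1.0, 0.1 * np.exp(rate[:, None] * t[None, :]))

# Training pairs: features are (input, depth at t), target is depth at t+1.
features = np.column_stack([
    np.repeat(rate, n_steps - 1),  # input repeated for every step of a sample
    depth[:, :-1].ravel(),         # depth at time t
])
targets = depth[:, 1:].ravel()     # depth at time t+1

model = LinearRegression().fit(features, targets)
one_step_mse = float(np.mean((model.predict(features) - targets) ** 2))

def rollout(model, inputs, depth0, n_steps):
    """Predict a full time series by feeding each prediction back in."""
    path = [float(depth0)]
    for _ in range(n_steps - 1):
        x = np.array([[*inputs, path[-1]]])
        path.append(float(model.predict(x)[0]))
    return np.array(path)

predicted = rollout(model, [rate[0]], depth[0, 0], n_steps)
print(f"one-step mse={one_step_mse:.2e}, final predicted depth={predicted[-1]:.3f}")
```

A small one-step error does not guarantee a good rollout, because errors compound across the 240 steps, and nothing in a plain regression rollout keeps the predicted depth at or below 1.0.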
Results - Using Linear Regression
- 2000 samples, 75/25 % training/testing split
- Seems to lose performance for larger normalized crack depth values
- mse = 2.7e-5
- Figure: xLPR depth versus predicted for test set by sample (axis labels: Predicted leak time, xLPR depth)
Results - Using Linear Regression
- 2000 samples, 75/25 % training/testing split
- Attempts to extrapolate beyond unit normalized crack depth
- Figure: xLPR depth versus predicted time-series for single sample initial conditions (legend: Predicted depth, xLPR depth)
Results - Using Random Forest Regression
- 2000 samples, 75/25 % training/testing split
- Captures larger normalized crack depth values and more of the curvature
- mse = 2.0e-6
- Figure: xLPR depth versus predicted for test set by sample (axis labels: xLPR depth, Predicted leak time)
Results - Using Random Forest Regression
- 2000 samples, 75/25 % training/testing split
- Captures similar curvature, but predicts leak sooner than xLPR
- Figure: xLPR depth versus predicted time-series for single sample initial conditions (legend: Predicted depth, xLPR depth)
Results - Using Random Forest Regression
- 2000 samples, train and test on 100% of data
- Captures larger normalized crack depth, but with increased uncertainty
- mse = 2.5e-7
- Figure: xLPR depth versus predicted for train set by sample (axis labels: Predicted leak time, xLPR depth)
Results - Using Random Forest Regression
- 2000 samples, train and test on 100% of data
- Captures similar curvature, but predicts leak even sooner than xLPR or the model trained on 75% of the data, which indicates overtraining
- Figure: xLPR depth versus predicted time-series for single sample initial conditions (legend: Predicted depth, xLPR depth)
Results - Using Random Forest Regression
- 2000 samples, 25/75 % training/testing split
- Captures larger normalized crack depth values but with increased uncertainty
- mse = 2.7e-6
- Figure: xLPR depth versus predicted for test set by sample (axis labels: Predicted leak time, xLPR depth)
Results - Using Random Forest Regression
- 2000 samples, 25/75 % training/testing split
- Captures similar curvature, but predicts leak sooner than xLPR; not as overtrained
- Figure: xLPR depth versus predicted time-series for single sample initial conditions (legend: Predicted depth, xLPR depth)
Potential Future Work
- Efficiently identify response sensitivities from an uncertain input parameter space
  - Sensitivity analysis
  - Sensitivity studies
  - Uncertainty analysis
- Identify methods of creating tiered surrogate models (machine-learning/data-driven) with comparable accuracy to the physics-based xLPR model, including characterization of the increased computational efficiency of the potential surrogate models
  - Start with one output and use neural network to predict
  - Determine level of effort needed
Our Team - Sandia
- Michael Starr (PI)
- Stephen Verzi (staff, ML)
- Joseph Lubars (staff, ML/RL & statistics)
- Satyanadh Gundimada (staff, ML/DL)
U.S. Nuclear Regulatory Commission
- Matthew Homiack
- Raj Iyengar
Thank You
Questions?