ML23166A112

Presentation on Artificial Intelligence and Machine Learning Support for Probabilistic Fracture Mechanics Analysis
Issue date: 06/14/2023
From: Satyanadh Gundimada, Matthew Homiack, Raj Iyengar, Joseph Lubars, Michael Starr, Stephen Verzi
Office of Nuclear Regulatory Research; Sandia National Laboratories
Reference: SAND2023-06170PE

SAND2023-06170PE

Artificial Intelligence and Machine Learning Support for Probabilistic Fracture Mechanics Analysis

Stephen J. Verzi, Joseph P. Lubars, Satyanadh Gundimada, Michael J. Starr, Sandia National Laboratories
Matthew Homiack, Raj Iyengar, U.S. Nuclear Regulatory Commission

Machine Learning / Deep Learning Workshop, July 17-20, 2023

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

This presentation was prepared as an account of work sponsored by an agency of the U.S. Government. Neither the U.S. Government nor any agency thereof, nor any of their employees, makes any warranty, expressed or implied, or assumes any legal liability or responsibility for any third party's use, or the results of such use, of any information, apparatus, product, or process disclosed in this report, or represents that its use by such third party would not infringe privately owned rights. The views expressed in this presentation are not necessarily those of the U.S. Nuclear Regulatory Commission.

2 Abstract

In this research, artificial intelligence (AI) and machine learning (ML) methods are used to search an uncertain parameter space more efficiently for the most important inputs with respect to response sensitivities, and then to construct, train, and test low-fidelity surrogate models. These methods are applied to the Extremely Low Probability of Rupture (xLPR) probabilistic fracture mechanics code used at the U.S. Nuclear Regulatory Commission in support of nuclear regulatory research. This presentation shows two separate but related efforts: 1) ranking important uncertain input features with respect to target outputs, as determined by convergence of the confidence intervals for increasing sample sizes using simple random sampling; and 2) implementation of a reduced-order surrogate model for fast, approximate sample generation. Unoptimized, readily available off-the-shelf ML models were used in both efforts.

The results show that ML models can assist analysts in conducting sensitivity and uncertainty analyses with respect to typical xLPR use cases. The ML models help to reduce the number of random realizations needed in the xLPR simulation by focusing on the most important input parameters (as part of the first effort) and augment xLPR output generation by providing quick approximate time-series simulation (as part of the second effort). This research involved several different kinds of ML models, including linear, random forest, gradient boosting, and multi-layer perceptron regression techniques.

Even though not all models perform well in every task or scenario, especially when data are scarce (i.e., low probabilities of leak and rupture), the results show that there are cases where each ML model performs well. Future efforts can focus on hyperparameter optimization where appropriate. Both efforts were intended to augment xLPR simulations, and the results show that they do.

3 Outline

Problem Space
  • Probabilistic Fracture Mechanics (PFM)
  • Extremely Low Probability of Rupture (xLPR) Code
Solution Space
  • Artificial Intelligence (AI) and Machine Learning (ML)
Sensitivity Analysis with xLPR
  • Importance Sampling
  • Determining Appropriate Sample Size
Surrogate Model for xLPR
  • Quantities of Interest (QoIs)
  • Time of 1st Leak, if any
  • Crack Propagation via Normalized Depth
Results
Potential Future Work
Acknowledgments and Contacts

4 Problem Space - Probabilistic Fracture Mechanics (PFM) Analysis

This figure illustrates a simplistic PFM analysis. The curve on the left represents the distribution of crack driving force, or applied stress intensity factor (SIF), which depends on the uncertainties in stress and crack size. The curve on the right represents the toughness distribution, or critical (i.e., allowable) SIF of the material. Where the two distributions overlap, there is a finite probability of failure, indicated by the shaded area. Time-dependent crack growth, such as from fatigue, stress-corrosion cracking, or both, can be considered by applying the appropriate growth laws to the crack distribution. Crack growth can cause the applied SIF distribution to shift to the right with time, thereby increasing the probability of failure.

Figure from U.S. NRC Technical Letter Report TLR-RES/DE/REB-2022-13.
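The overlap-area argument above can be sketched numerically with a simple Monte Carlo estimate. This is an illustration only, not taken from the presentation: the normal distributions and their parameters below are invented for demonstration.

```python
import numpy as np

# Illustrative sketch (assumed distributions, not xLPR data): estimate the
# failure probability as P(applied SIF > critical SIF) by Monte Carlo.
rng = np.random.default_rng(0)
n = 100_000
applied_sif = rng.normal(loc=40.0, scale=8.0, size=n)    # crack driving force
critical_sif = rng.normal(loc=70.0, scale=10.0, size=n)  # material toughness

p_fail = np.mean(applied_sif > critical_sif)  # the shaded overlap area
print(f"Estimated failure probability: {p_fail:.4f}")
```

Shifting the applied-SIF distribution to the right (e.g., increasing `loc` to model crack growth over time) increases the overlap and thus the estimated failure probability.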

5 Problem Space - Extremely Low Probability of Rupture (xLPR) Code Analysis Architecture

Figure from U.S. NRC Technical Letter Report TLR-RES/DE/REB-2022-13.

6 Sensitivity Analysis with xLPR

Given input (uncertain parameter) distributions and associated Monte Carlo outputs, can we use AI/ML models to determine and rank the importance of inputs while finding a proper sample size with respect to the output set?

[Figure: uncertain inputs mapped to ranked features]

NRC, Technical Letter Report TLR-RES/DE/REB-2021-14-R1, "Probabilistic Leak-Before-Break Evaluations of Pressurized-Water Reactor Piping Systems using the Extremely Low Probability of Rupture Code," April 2022, ADAMS Accession No. ML22088A006.

7 Surrogate Modeling using ML for xLPR

Given input (uncertain parameter) distributions and associated Monte Carlo outputs, can we use AI/ML models to train a surrogate model with respect to the output set?

Outputs or Quantities of Interest (QoIs):
  • cc_depth_normalized (normalized crack depth)
  • cc_ID_length_normalized
  • cc_OD_length_normalized
  • is_leaking (leak time)
  • is_ruptured
  • total_leak_rate

[Figure: 2000 Monte Carlo samples, each a time series of 240 steps; 492 of 2000 samples result in a leak. Outputs, indexed by sample and simulation time, feed potential surrogate models.]

8 Results

Using random forest regressor (scikit-learn):
  • Mean decrease in impurity (MDI)
  • Permutation importance values
Using linear regression (scikit-learn)

Inputs mapped to Outputs or Quantities of Interest (QoIs): cc_depth_normalized, cc_ID_length_normalized, cc_OD_length_normalized, is_leaking, is_ruptured, total_leak_rate

Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011. https://dl.acm.org/doi/abs/10.5555/1953048.2078195

9 Results - Using Random Forest Regressor

Multi-variate output set (6) given input set (65) with 200 samples

Input Variable | Permutation Importance
WRS_axial_premitigation_pt01 | 0.5973
WRS_axial_premitigation_pt26 | 0.0680
WRS_axial_premitigation_pt24 | 0.0545
WRS_axial_premitigation_pt22 | 0.0501
WRS_axial_premitigation_pt05 | 0.0351
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.0280
WRS_axial_premitigation_pt21 | 0.0255
weld_material_PWSCC_growth_activation_energy_Qg | 0.0202
WRS_axial_premitigation_pt02 | 0.0158
WRS_axial_premitigation_pt07 | 0.0133
WRS_axial_premitigation_pt17 | 0.0107

Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.

10 Results - Using Random Forest Regressor

Multi-variate output set (6) given input set (65) with 2000 samples

Input Variable | Permutation Importance
WRS_axial_premitigation_pt01 | 0.9740
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.2029
weld_material_PWSCC_growth_within_component_variability_factor_fflaw | 0.1503
WRS_axial_premitigation_pt14 | 0.0301
WRS_axial_premitigation_pt26 | 0.0268
WRS_axial_premitigation_pt22 | 0.0230
WRS_axial_premitigation_pt02 | 0.0176
WRS_axial_premitigation_pt23 | 0.0163
WRS_axial_premitigation_pt24 | 0.0162
WRS_axial_premitigation_pt07 | 0.0158

Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.

11 Results - Using Random Forest Regressor

Multi-variate output set (6) given input set (65) with 20000 samples

Input Variable | Permutation Importance
WRS_axial_premitigation_pt01 | 1.2146
weld_material_PWSCC_growth_component_to_component_variability_factor_fcomp | 0.3489
weld_material_PWSCC_growth_within_component_variability_factor_fflaw | 0.2416
WRS_axial_premitigation_pt02 | 0.1115
initial_cc_full_length | 0.0323
WRS_axial_premitigation_pt25 | 0.0226
weld_material_PWSCC_growth_activation_energy_Qg | 0.0205
WRS_axial_premitigation_pt26 | 0.0177
WRS_axial_premitigation_pt24 | 0.0162
left_pipe_material_yield_strength | 0.0155

Ranked permutation importance for those inputs with values greater than the random feature mean + 2 standard deviations.

12 Results - Using Random Forest Regressor

Determining appropriate sample size (all 6 QoIs): 95% confidence intervals for 200, 2000, and 20000 samples

[Figure: six panels, one per QoI (cc_depth_normalized, cc_ID_length_normalized, cc_OD_length_normalized, is_leaking, is_ruptured, total_leak_rate), each plotting the mean with 95% high/low bounds versus the number of samples (200, 2000, 20000).]

2000 samples are sufficient.
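The sample-size check summarized above can be sketched as follows: compute a normal-approximation 95% confidence interval on the mean of a QoI at each sample size and watch it tighten. The Bernoulli data and leak fraction below are synthetic stand-ins chosen to mimic the is_leaking QoI (roughly 492 of 2000 samples leaking).

```python
import numpy as np

# Synthetic stand-in for a binary QoI such as is_leaking; p_true is assumed.
rng = np.random.default_rng(0)
p_true = 0.25
widths = []
for n in (200, 2000, 20000):
    qoi = rng.binomial(1, p_true, size=n)
    half = 1.96 * qoi.std(ddof=1) / np.sqrt(n)  # 95% CI half-width on the mean
    widths.append(2.0 * half)
    print(f"n={n:>5}: mean={qoi.mean():.3f}, 95% CI half-width={half:.4f}")
```

The interval width shrinks roughly as 1/sqrt(n); the convergence criterion is that further increases in n no longer change the interval meaningfully, which is how the slide justifies stopping at 2000 samples.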

13 Results - Surrogate Model

Can we predict the time of first leak?

[Diagram: A) Training process - xLPR inputs and the xLPR QoI (the time at which is_leaking changes from 0 to 1) train the model. B) Use in prediction - xLPR inputs feed the trained model, which predicts the xLPR QoI.]

492 out of 2000 samples result in a leak.
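The training/prediction workflow (A and B above) can be sketched with scikit-learn. The inputs, the linear leak-time relation, and the noise level below are invented stand-ins for the 492 leaking xLPR samples, not the real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 492 "leaking" samples with 10 uncertain inputs and an
# assumed linear relation between inputs and leak time (in months).
rng = np.random.default_rng(0)
X = rng.normal(size=(492, 10))
leak_time = 300 + 40 * X[:, 0] + rng.normal(scale=25, size=492)

# A) Training process: 75/25 train/test split, fit on the training portion.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, leak_time, test_size=0.25, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)

# B) Use in prediction: evaluate on the held-out test set.
mse = mean_squared_error(y_te, model.predict(X_te))
print(f"test MSE: {mse:.1f}")
```

With a held-out test set, the MSE reflects generalization; training and testing on the same data (as in the 100/0 slides that follow) drives the MSE down but says little about performance on new samples.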

14 Results - Using Linear Regression

492 (out of 2000) samples with a 75/25 % training/testing split

[Figure: predicted leak time (in months) for the test set by sample index; mse = 662.7]

Seems like we do not have enough training data.

15 Results - Using Linear Regression

492 (out of 2000) samples with a 75/25 % training/testing split

[Figure: ground truth versus predicted leak time for the test set by sample; mse = 662.7]

Seems like we do not have enough training data.

16 Results - Using Random Forest Regression

492 (out of 2000) samples with a 75/25 % training/testing split

[Figure: ground truth versus predicted leak time for the test set by sample; mse = 707.5]

Seems to miss both high and low values (we probably do not have enough training data).

17 Results - Using Random Forest Regression

492 (out of 2000) samples with a 100/0 % training/testing split

[Figure: predicted leak time (in months) for the training set by sample index; mse = 108.6]

Significant drop in mse.

18 Results - Using Random Forest Regression

492 (out of 2000) samples with a 100/0 % training/testing split

[Figure: ground truth versus predicted leak time for the training set by sample; mse = 108.6]

Significant drop in mse, but the model is still not capturing the high and low ends well.

19 Results - Surrogate Model

Can we predict the normalized crack depth at the next time step? Use case: a leak occurs when the normalized crack depth reaches 1.0.

[Diagram: A) Training process - xLPR inputs and normalized_crack_depth at time t train the model against normalized_crack_depth at time t+1. B) Use in prediction - xLPR inputs and the depth at time t feed the trained model, which predicts the depth at time t+1.]

2000 samples with 240 time steps each.
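The autoregressive scheme above can be sketched as a one-step model rolled forward in time. The crack-growth law, the single "rate" input, and all sizes below are invented for illustration; xLPR uses physics-based growth laws and many more inputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic "truth": depth grows each step by an assumed law, capped at 1.0
# (leak). One uncertain input (a growth rate) stands in for the xLPR inputs.
rng = np.random.default_rng(0)
n_samples, n_steps = 200, 50
rate = rng.uniform(0.01, 0.05, size=n_samples)
depth = np.zeros((n_samples, n_steps))
for t in range(1, n_steps):
    depth[:, t] = np.minimum(1.0, depth[:, t - 1] + rate * (1 + depth[:, t - 1]))

# A) Training process: pairs (inputs, depth_t) -> depth_{t+1}.
X = np.column_stack([np.repeat(rate, n_steps - 1), depth[:, :-1].ravel()])
y = depth[:, 1:].ravel()
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# B) Use in prediction: roll the one-step model forward from depth 0
# for one assumed initial condition (rate = 0.03).
d, traj = 0.0, [0.0]
for _ in range(n_steps - 1):
    d = float(model.predict([[0.03, d]])[0])
    traj.append(d)
print(f"final predicted normalized depth: {traj[-1]:.3f}")
```

Because each prediction is fed back as the next input, small one-step errors compound over the trajectory, which is consistent with the slides' observation that the surrogate predicts the leak sooner or later than xLPR.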

20 Results - Using Linear Regression

2000 samples with a 75/25 % training/testing split

[Figure: xLPR depth versus predicted depth for the test set by sample; mse = 2.7e-5]

Seems to lose performance for larger normalized crack depth values.

21 Results - Using Linear Regression

2000 samples with a 75/25 % training/testing split

[Figure: xLPR depth versus predicted time series for a single sample's initial conditions]

Attempts to extrapolate beyond unit normalized crack depth.

22 Results - Using Random Forest Regression

2000 samples with a 75/25 % training/testing split

[Figure: xLPR depth versus predicted depth for the test set by sample; mse = 2.0e-6]

Captures larger normalized crack depth values and more of the curvature.

23 Results - Using Random Forest Regression

2000 samples with a 75/25 % training/testing split

[Figure: xLPR depth versus predicted time series for a single sample's initial conditions]

Captures similar curvature, but predicts the leak sooner than xLPR.

24 Results - Using Random Forest Regression

2000 samples, trained and tested on 100% of the data

[Figure: xLPR depth versus predicted depth for the training set by sample; mse = 2.5e-7]

Captures larger normalized crack depth, but with increased uncertainty.

25 Results - Using Random Forest Regression

2000 samples, trained and tested on 100% of the data

[Figure: xLPR depth versus predicted time series for a single sample's initial conditions]

Captures similar curvature, but predicts the leak even sooner than xLPR or the 75% training data, which indicates overtraining.

26 Results - Using Random Forest Regression

2000 samples with a 25/75 % training/testing split

[Figure: xLPR depth versus predicted depth for the test set by sample; mse = 2.7e-6]

Captures larger normalized crack depth values, but with increased uncertainty.

27 Results - Using Random Forest Regression

2000 samples with a 25/75 % training/testing split

[Figure: xLPR depth versus predicted time series for a single sample's initial conditions]

Captures similar curvature and predicts the leak sooner than xLPR, but is not as overtrained.

28 Potential Future Work

  • Efficiently identify response sensitivities from an uncertain input parameter space
  • Sensitivity analysis
  • Sensitivity studies
  • Uncertainty analysis
  • Identify methods of creating tiered surrogate models (machine-learning/data-driven) with comparable accuracy to the physics-based xLPR model, including characterization of the increased computational efficiency of the potential surrogate models
  • Start with one output and use a neural network to predict it
  • Determine the level of effort needed

29 Our Team

Sandia National Laboratories:
  • Michael Starr (PI)
  • Stephen Verzi (staff, ML)
  • Joseph Lubars (staff, ML/RL & statistics)
  • Satyanadh Gundimada (staff, ML/DL)

U.S. Nuclear Regulatory Commission:
  • Matthew Homiack
  • Raj Iyengar

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

30 Thank You

Questions?