
xLPR Metamodeling
ML21230A364
Person / Time
Issue date: 08/18/2021
From: NRC/RES/DE
To: Homiack M
Shared Package: ML21230A354
Download: ML21230A364 (10)


Text

Global Sensitivity Analysis of xLPR using Metamodeling (Machine Learning)

xLPR User Group Meeting, August 18, 2021

Background

  • As part of applying xLPR to production analyses and further validating the model, sensitivity analyses were conducted

- Sensitivity studies can be used to assess the impacts of uncertain parameters and analysis assumptions on the results

- Sensitivity analysis is a useful tool for identifying important uncertain model inputs that explain a large degree of the uncertainty in a quantity of interest

  • Reasons to perform a sensitivity analysis:

- Identify inputs that warrant the greatest level of scrutiny, validation, and further sensitivity analysis

- Identify inputs that are key to the results

- Model validation

- Improve understanding of model behavior

- Reduction of model complexity (e.g., set unimportant inputs to constant values)

- Inform advanced Monte Carlo sampling strategies (e.g., importance sampling)

  • Available techniques (see TLR-RES/DE/CIB-2021-11; ML21133A485):

- One-at-a-time

- Local partial derivatives (e.g., Adjoint Modeling)

- Variance-based (e.g., Sobol method)

- Linear regression (see the sketch after this list)

- Metamodels
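As a toy illustration of the linear-regression technique listed above (not taken from the presentation), standardized regression coefficients can serve as sensitivity measures; the inputs, output, and coefficient values below are purely hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy model: the output depends strongly on x1, weakly on x2, and not on x3.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=1000)

# Standardized regression coefficients (SRCs): z-score inputs and output so the
# coefficient magnitudes become comparable importance measures.
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
yz = (y - y.mean()) / y.std()
src = LinearRegression().fit(Xz, yz).coef_
for name, coef in zip(["x1", "x2", "x3"], src):
    print(f"{name}: SRC = {coef:+.3f}")
```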

Sensitivity Analysis using Metamodels

  • Advantages of the metamodel approach:

- Can handle correlated inputs

- Accurately reflects non-monotonicity, non-linearity, and interactions

- Importance measures reflect the whole input space

- Several machine learning models automatically generate sensitivity metrics and down-select input variables based on information gained as part of the model fitting process

- Fitted model can be used in place of the original model to compute quantitative sensitivity measures at lower computational cost

  • Focus of this presentation: using built-in sensitivity metrics generated during fitting
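A minimal sketch of this idea, assuming the sampled inputs and a binary outcome (e.g., leak / no leak) are already available as arrays X and y; the data generated here are hypothetical stand-ins, and only the use of scikit-learn's built-in feature_importances_ attribute reflects the approach described above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical stand-ins for probabilistic-code results: one column of X per
# uncertain input, y a binary outcome such as "leak occurred".
rng = np.random.default_rng(1)
X = rng.uniform(size=(2000, 5))
y = (X[:, 0] + 0.3 * X[:, 1] + 0.05 * rng.normal(size=2000) > 0.8).astype(int)

# Fitting the classifier produces the importance metrics as a by-product.
clf = GradientBoostingClassifier(random_state=0).fit(X, y)
names = [f"input_{i}" for i in range(X.shape[1])]
for name, imp in sorted(zip(names, clf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: importance = {imp:.3f}")
```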

Metamodeling Analysis Workflow

  • Run the probabilistic code and collect results
  • Implement metamodeling code

- Import results from probabilistic code runs

- Transform results to prepare for input to metamodel fitting (e.g., accounting for spatially sampled variables)

- Fit the metamodel, including parameter optimization using cross-validation (sketched after this workflow list)

- Extract and report input importance metrics

  • Evaluate

- Examine goodness of fit metrics

- Compare importance ranking results from alternate metamodels

- Compare importance ranking results across different outputs of interest

  • Iterate

- Collect more inputs

- Analyze different outputs

- Run different discrete configurations of the probabilistic code

- Use different metamodels / different metamodel parameters
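The fitting, evaluation, and importance-extraction steps of this workflow could look roughly like the sketch below. The feature matrix X, binary output y, input names, and the parameter grid are all hypothetical placeholders, not the settings used in the actual study.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

def fit_and_rank(X, y, names):
    """Fit a gradient-boosted classifier with cross-validated parameter selection
    and return the inputs ranked by the fitted model's importance metric."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    # Cross-validated search over an illustrative parameter grid.
    grid = {"n_estimators": [100, 300],
            "max_depth": [2, 3, 4],
            "learning_rate": [0.05, 0.1]}
    search = GridSearchCV(GradientBoostingClassifier(random_state=0), grid, cv=5)
    search.fit(X_train, y_train)

    # Goodness-of-fit checks: cross-validation score and held-out accuracy.
    print("best CV score:", search.best_score_)
    print("held-out accuracy:", search.best_estimator_.score(X_test, y_test))

    # Extract and report the built-in importance metrics.
    ranking = pd.Series(search.best_estimator_.feature_importances_, index=names)
    return ranking.sort_values(ascending=False)
```

Comparing the rankings returned for alternate metamodels or different outputs of interest then supports the "Evaluate" and "Iterate" steps above.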

Model Implementation

  • Python 3.6 using the scikit-learn package*

- Gradient Boosting Decision Trees

- Random Forest Decision Trees

- Linear Support Vector Machines

  • All models used are classifiers (as opposed to regressors) because the outcomes are binary (yes/no). Regressor models would be used for scalar outputs.
  • All models include metrics for feature selection / feature importance (illustrated in the sketch below)
  • Initial work focused on a subset of 60 inputs:

- Inputs that are expected to have high importance

- Distributed inputs

- Constant inputs given uniform distributions from 0.8 to 1.2 times the constant value

  • Outputs analyzed:

- Occurrence leak

- Occurrence rupture (with and without inservice inspection (ISI))

* Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011
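The three model types expose their importance metrics differently: the two tree ensembles provide feature_importances_, while a linear support vector classifier exposes coefficient magnitudes. A hedged sketch, with hypothetical X, y, and names:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

def importance_table(X, y, names):
    """Collect per-input importance metrics from the three classifier types."""
    gbc = GradientBoostingClassifier(random_state=0).fit(X, y)
    rfc = RandomForestClassifier(random_state=0).fit(X, y)
    # Scale inputs so the linear SVM coefficient magnitudes are comparable across inputs.
    svm = make_pipeline(StandardScaler(), LinearSVC(dual=False)).fit(X, y)
    svm_coef = np.abs(svm.named_steps["linearsvc"].coef_).ravel()
    return {name: {"gradient_boosting": float(gbc.feature_importances_[i]),
                   "random_forest": float(rfc.feature_importances_[i]),
                   "linear_svm_abs_coef": float(svm_coef[i])}
            for i, name in enumerate(names)}
```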

Spatially Distributed Inputs / Outputs

  • Pipe section split into 19 subunits that can potentially crack
  • Some inputs sampled on a subunit basis
  • Some outputs also available on a subunit basis
  • Aggregation methodology for subunit inputs / outputs (both strategies are sketched below)

- Pipe subunit inputs and outputs: Analyze each pipe subunit and crack direction separately and average feature importance metrics

- Pipe subunit inputs and global outputs: Average inputs across all pipe subunits (and crack types) and perform a single analysis to determine feature importance

- This method may cause underreporting of importance metrics in comparison to alternative methods
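The two aggregation strategies above might be implemented roughly as follows, assuming a pandas DataFrame of sampled inputs whose columns are named like "input@subunit" and a helper such as the fit_and_rank sketch shown earlier; all names and data layouts here are hypothetical.

```python
import pandas as pd

def average_importance_over_subunits(frames_by_subunit, y_by_subunit, names):
    """Strategy 1: analyze each subunit (and crack direction) separately,
    then average the resulting feature importance metrics."""
    rankings = [fit_and_rank(df[names].values, y_by_subunit[key], names)
                for key, df in frames_by_subunit.items()]
    return pd.concat(rankings, axis=1).mean(axis=1).sort_values(ascending=False)

def average_inputs_over_subunits(df, input_names, subunits, y_global):
    """Strategy 2: average each sampled input over all subunits,
    then perform a single analysis against the global output."""
    averaged = pd.DataFrame({name: df[[f"{name}@{s}" for s in subunits]].mean(axis=1)
                             for name in input_names})
    return fit_and_rank(averaged.values, y_global, input_names)
```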

Results: Leak Output

  • Output: Leak (through wall crack) in any pipe subunit
  • Analyzed using Gradient Boosted Trees Classifier (GBC)
  • Allows comparison between averaging subunit inputs and averaging subunit analysis outputs
  • Top importance parameters for averaged subunit inputs:

- Primary water stress-corrosion cracking (PWSCC) initiation parameters

- PWSCC growth parameters

- Operating Temp./Pressure

- Pipe outside diameter / thickness

- Welding Residual Stresses (WRS) - Hoop

- Pipe yield strength

Results: Rupture Output

  • Rupture is a full-model output (not on a subunit basis)
  • Best prediction accuracy and cross-validation (CV) score obtained using the Gradient Boosted Trees Classifier
  • General agreement between all three fitted models
  • Top importance parameters consistent with leak parameters

- PWSCC initiation

- Axial WRS ranked above hoop WRS (the opposite of the leak results)

Changes in Importance Rankings

  • Importance factor results may be compared between different scenarios/cases to show changes in the relative ordering of inputs

(Figure: comparison of importance rankings across cases. Callouts: "Most important inputs consistently drive result"; "Scatter indicates low confidence in relative ranking (in the noise)")

  • Useful for:

- Comparison between alternate metamodeling approaches (a rank-comparison sketch follows this list)

- Determining differences in sensitivity between different outputs of interest

- Comparing runs with different model settings (e.g., different ISI intervals)
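One simple, hedged way to quantify how stable the relative ordering is between two cases (not described in the presentation) is a rank correlation between the two importance series, for example:

```python
from scipy.stats import spearmanr

def compare_rankings(importance_a, importance_b):
    """Spearman rank correlation between two importance series (e.g., pandas Series
    indexed by input name, from two metamodels or two ISI settings). Values near 1
    indicate a stable ordering; low values suggest the ranking is in the noise."""
    common = importance_a.index.intersection(importance_b.index)
    return spearmanr(importance_a[common], importance_b[common])
```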

Conclusions

  • Key findings

- Relative comparisons (e.g., axial vs. circumferential, rupture with/without ISI) are very useful for sanity checking the model

- Relatively high confidence in the identification of highest-impact inputs but low confidence in ordering of low-impact inputs

  • General challenges

- Input distributions need to be selected carefully to get informative results

  • A default real-world analysis input set is probably not sufficient

- Special consideration needed for inputs that are not continuous variables (e.g., settings flags)

  • xLPR-specific challenges

- Prediction of simulation-wide outcomes using subunit-level sampled values

- Consideration of all inputs would be time-intensive (labor to extract sampled values and simulation time to adequately cover full input space)

  • Potential future improvements

- Include more inputs in the machine learning model

- Examine other outputs of interest (e.g., leak rate jump indicator)

- Examine alternate configurations that can't be covered automatically using input distributions

- Use more advanced methods to improve on the relative rank importance metric (e.g., variance decomposition)
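Following the earlier point that the fitted metamodel can stand in for the original code, one way to pursue the variance-decomposition idea would be to compute Sobol indices on the surrogate. The sketch below assumes the SALib package, a fitted classifier clf with a predict_proba method, and uniform input bounds; none of these choices come from the presentation.

```python
from SALib.sample import saltelli
from SALib.analyze import sobol

def sobol_on_surrogate(clf, names, bounds, n_base=1024):
    """Variance decomposition of the surrogate's predicted probability of the outcome."""
    problem = {"num_vars": len(names), "names": names, "bounds": bounds}
    X_s = saltelli.sample(problem, n_base)   # Saltelli sample over the input box
    y_s = clf.predict_proba(X_s)[:, 1]       # cheap surrogate evaluations
    si = sobol.analyze(problem, y_s)
    # First-order (S1) and total-order (ST) Sobol indices per input.
    return {name: (s1, st) for name, s1, st in zip(names, si["S1"], si["ST"])}
```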