ML24064A212

Artificial Intelligence and Machine Learning in Nondestructive Examination and In-Service Inspection Activities

Issue date: 03/06/2024
From: Stephen Cumblidge (NRC/NRR/DNRL), Carol Nove (NRC/RES/DE)

Artificial Intelligence and Machine Learning in Nondestructive Examination and In-Service Inspection Activities

Carol A. Nove: RES/DE Stephen Cumblidge: NRR/DNRL

ACRS Full Committee Briefing, March 6, 2024

Outline

  • Introduction and Background
  • Research Program

- Evaluation of commercially available automated data analysis

- Evaluation of machine learning for ultrasonic NDE

  • Research Program Outcomes

2 Acronyms

  • ADA - automated data analysis
  • ASME Code - American Society of Mechanical Engineers Boiler and Pressure Vessel Code
  • CASS - cast austenitic stainless steel
  • CNN - convolutional neural network
  • CS - carbon steel
  • DMW - dissimilar metal weld
  • DNN - deep neural network
  • DR - detection rate
  • EPRI - Electric Power Research Institute
  • FPR - false positive rate
  • ISI - inservice inspection
  • ML - machine learning
  • NDE - nondestructive examination
  • ORNL - Oak Ridge National Laboratory
  • POD - probability of detection
  • PNNL - Pacific Northwest National Laboratory
  • ROC - receiver operating characteristic
  • RVUH - reactor vessel upper head
  • TFC - thermal fatigue crack
  • TPR - true positive rate
  • UT - ultrasonic testing (ultrasonics, ultrasonic examination, etc.)
  • UV - UltraVision
  • VP - VeriPhase
  • WSS - wrought stainless steel

3 Introduction and Background

4 Nondestructive Examination (NDE) in Nuclear Power Plants

  • 10 CFR 50.55a(b) incorporates by reference the American Society of Mechanical Engineers Boiler and Pressure Vessel Code (ASME Code), Section III, Rules for Construction of Nuclear Facility Components, and Section XI, Rules for Inservice Inspection of Nuclear Power Plant Components
  • NDE needed for timely detection of service-induced flaws
  • Plant aging increases likelihood of service-induced flaws
  • Accurate and reliable NDE increasingly important due to industry trends to reduce:

- Inspection time during outages

- Radiation exposure

- Number of examinations

5 Drivers for Automated Data Analysis (ADA)

  • Section XI, Appendix VIII, Performance Demonstration for Ultrasonic Examination Systems, provides requirements for performance demonstration for ultrasonic examination procedures, equipment and personnel to detect and size flaws
  • Industry projecting potential shortage in NDE technicians with proper skillsets to conduct NDE to meet future fleet needs (ML24026A087)
  • Some UT inspections such as upper head exams yield large quantities of data that must be reviewed by multiple qualified inspectors during the outage period (EPRI 3002023718)

- High level of focus required for long periods of time

- Human factors related to fatigue and momentary loss of focus can challenge reliability of results

6 ADA Is Coming

  • Widely available, open-source ML tools have enabled the development and application of ML algorithms for many uses
  • These tools are becoming more powerful and easier to use over time
  • The nuclear industry is funding work to use these tools for automated data analysis algorithms to analyze NDE data

7 ADA/ML Use Cases for Ultrasonic NDE

  • Near term

- Analysis of encoded (recorded) data

- Screening: Identify regions that are indication-free

- Classification: Identify regions that contain flaws

- Quality Control for NDE Examinations

  • Longer term

- Data compression

- Generate NDE reports

- Real-time data analysis of unencoded data

- Synthetic data generation for training

Graphic: Flaw Screening (Hypothetical Example)

8 Two Ways of Using ADA

  • ADA-Assisted Examination: A fully-qualified inspector uses hints or highlighted areas to analyze the data, but the qualified individual makes the final calls
  • Fully-Automated Examination: The ADA algorithm makes the calls without human input

9 Automated Data Analysis-Assisted Procedures

  • One approach suggested by EPRI is for an ADA algorithm to flag areas with flaws; the algorithm must find all flaws in the qualification set
  • The algorithm can produce more false calls than allowed in the given supplement
  • It will be up to the inspector to determine which of the areas flagged by the algorithm contain flaws, and ultimately the inspector is responsible for the results

Graphics from EPRI 3002023718

10 Automated Data Analysis - Possible Benefits

ADA has the potential to improve detection of flaws and improve the human factors of an examination.

  • In-service flaws are rare in the nuclear industry. Computers can maintain vigilance in cases where humans struggle.
  • Humans and computers make different types of mistakes, and a qualified analyst paired with an analysis run by ML gives the best of both worlds.
  • Reduced dose to inspectors if ML used to support manual UT examinations

Graphic adapted from NUREG/CR-7295

11 Automated Data Analysis - Possible Hazards

  • ADA has the potential to introduce common-cause failures of inspections across the fleet
  • Licensees may not understand the capabilities and limitations of ADA, which could lead to improper use of ADA
  • ADA assistance may allow people to pass Appendix VIII qualification testing without the skills to recognize unknown degradation in the field
  • ML algorithms can be challenging to train and retrain, possibly making the ML algorithms unreliable
  • ML algorithms require a new class of experts to support UT examinations

12 Automated Data Analysis - Expect the Unexpected

  • As plants age and new reactors are designed, it is almost certain that new degradation mechanisms will emerge, and flaws will appear in unexpected places
  • ADA methods can be very good at handling known problems but may not work on new forms of degradation

13 Research Program

14 Research Program on Automated Data Analysis

User Need Request for Evaluating the Reliability of Nondestructive Examinations (NRR-2022-007), Task 4, Automated Data Analysis, requests that RES provide a technical basis describing current capabilities of machine learning and automated data analysis for nondestructive examination (NDE).

RES activity to address UNR request:

  • Evaluate commercially available automated data analysis platforms including rule-based and ML-based systems - Pacific Northwest National Laboratory (PNNL)

Graphic: variation in data (probe/mode)

15 Automated Data Analysis - Types of Algorithms

Rule-based

  • Decisions made based on explicit rules
  • Easy to determine why specific decisions are made

Learning-based

  • Decisions based on training data
  • Difficult to determine why specific decisions are made

Analysis

  • Assisted - ADA provides analyst with flagged dataset
  • Automated - No analyst

Vrana and Singh, 2021, https://doi.org/10.1007/s10921-020-00735-9

16 Evaluation of ADA for UT

  • Objectives

- Assess current capabilities of ADA for improving NDE reliability

- Provide technical basis to support regulatory decisions and Code actions related to ADA for NDE

  • Expected outcomes

- Identify capabilities and limitations of ADA for UT NDE applications

- Identify factors influencing ADA performance and their impact on NDE reliability

- Recommend verification and validation approaches and methods for qualifying ML (and ADA, as appropriate) for nuclear power NDE

- Identify gaps in existing Codes and Standards relative to ADA for UT NDE

17 Assessment of Rule-Based ADA - Takeaways from Literature Review

  • Almost all recent publications are dealing with learning-based analysis
  • Rule-based ADA is usually used for flaw detection and signal processing

- An amplitude threshold can be used to identify flaw signals above the noise floor

- Signal processing can help improve signal-to-noise ratio

  • Rule-based ADA can achieve high detection rates but also high false call rates

- Not able to consistently distinguish between geometric and flaw responses
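The amplitude-threshold idea above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; the function name and the 6 dB margin are assumptions chosen for the example.

```python
import numpy as np

def threshold_detect(ascan, noise_floor, margin_db=6.0):
    """Flag samples whose amplitude exceeds the noise floor by margin_db.

    ascan       : 1-D array of rectified UT amplitudes (linear scale)
    noise_floor : estimated noise amplitude (linear scale)
    margin_db   : decision margin above the noise floor (assumed value)
    """
    threshold = noise_floor * 10 ** (margin_db / 20.0)  # dB margin -> linear ratio
    return np.flatnonzero(np.abs(ascan) > threshold)

# Synthetic A-scan: low-level noise plus one strong reflector at sample 120.
rng = np.random.default_rng(0)
ascan = 0.05 * rng.standard_normal(256)
ascan[120] += 1.0

hits = threshold_detect(ascan, noise_floor=0.05)
```

Because any strong reflector crosses the threshold, geometric responses fire alongside flaws - the high-false-call behavior noted above.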

18 Assessment of Rule-Based ADA - Empirical Evaluation of Commercial ADA Systems

  • Data analysis with two different commercial ADA software packages compared to analysis by qualified Level III UT analyst
  • Statistical analysis of results using established methodologies
  • Rule-based ADA is likely not fit for nuclear pipe inspections on its own
  • Rule-based ADA could potentially be used alongside learning-based methods depending on the use case

19 Assessment of Machine Learning (ML) Algorithms

  • Limited to ultrasonic NDE classification problems with data from weld inspections

- Materials: steel (austenitic stainless steel, DMW, etc.)

- Flaw types: saw cuts, EDM notches, thermal fatigue, stress corrosion cracking, weld fabrication flaws

- Inspection procedure assumed to be appropriate for weld inspections

Graphics: weld defect types (slag inclusion, porosity, crack, lack of fusion, lack of penetration) from https://www.zetec.com/blog/destructive-and-nondestructive-testing-of-welds-how-ndt-ensures-quality/; example A-scan, TOFD scan, and B-scan (flaw top, flaw bottom, weld root, right side of the weldment) from https://www.olympus-ims.com/en/applications/introduction-to-time-of-flight-diffraction-for-weld-inspection/

20 Empirical ML Research Objectives

  • Determine capabilities and limitations of ML for NDE
  • Identify factors influencing applicability to other inspections (CASS, DMW, RVUH, etc.)
  • Assess effects of data augmentation, including using simulated data
  • Establish methods to quantify confidence in ML results
  • Assess capabilities for flaw size quantification from UT data

Graphic: transducer on steel inspecting a weld with a defect.

21 Generic Workflow for Assessment of ML for UT NDE

1. Collect ultrasonic NDE data from a variety of materials with multiple probe designs, frequencies and wave modes
2. Pre-process the data to remove noise and outliers
3. Train a machine learning algorithm on the preprocessed data
4. Use the trained algorithm to analyze new ultrasonic data
5. Assess the results using multiple metrics

Graphic: flaw size distribution (TFC, saw cut, EDM notch; flaw lengths roughly 10-110 mm) for four stainless steel and two DMW specimens.
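The five workflow steps above can be sketched end to end. Everything here is illustrative: the data are synthetic stand-ins for UT scans, and a plain logistic-regression classifier stands in for the CNN/DNN models discussed later in the briefing.

```python
import numpy as np

rng = np.random.default_rng(42)

# 1. "Collect" data: synthetic 1-D scans; flawed scans carry an extra echo.
n, length = 400, 64
X = 0.1 * rng.standard_normal((n, length))
y = rng.integers(0, 2, size=n)
for i in np.flatnonzero(y):
    X[i, rng.integers(10, 50)] += 1.0  # flaw echo at a random position

# 2. Pre-process: rectify, then normalize each scan to unit peak amplitude.
X = np.abs(X)
X /= X.max(axis=1, keepdims=True)

# 3. Train a simple classifier (logistic regression by gradient descent).
X_tr, y_tr, X_te, y_te = X[:300], y[:300], X[300:], y[300:]
w, b = np.zeros(length), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w + b)))  # sigmoid
    w -= 0.1 * X_tr.T @ (p - y_tr) / len(y_tr)
    b -= 0.1 * (p - y_tr).mean()

# 4. Analyze held-out scans with the trained model.
pred = ((X_te @ w + b) > 0).astype(int)

# 5. Assess with multiple metrics.
tpr = np.mean(pred[y_te == 1] == 1)  # true positive rate
fpr = np.mean(pred[y_te == 0] == 1)  # false positive rate
acc = np.mean(pred == y_te)
```

The same skeleton applies whatever the model: only step 3 changes when a CNN replaces the stand-in classifier.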

22 Overview of Empirical Assessment

Pipeline: original B-scan, preprocessing (crop, normalization, downsampling, flip), B-scan input to ML model, ML modeling (CNN/DNN with convolution, pooling, flatten, and fully connected layers), output (flaw / non-flaw), visualize results.

Example assessment metrics - accuracy, true/false positive rate (TPR, FPR), confusion matrix, and ROC curves:

                        Actual flaw   Actual non-flaw
Predicted as flaw           339             14
Predicted as non-flaw        67            274

Accuracy = 0.88, TPR = 0.83, FPR = 0.05
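The accuracy, TPR, and FPR quoted on this slide follow directly from the confusion-matrix counts; a quick arithmetic check:

```python
# Counts from the confusion matrix shown on this slide.
tp, fp = 339, 14    # predicted as flaw:     actual flaw / actual non-flaw
fn, tn = 67, 274    # predicted as non-flaw: actual flaw / actual non-flaw

accuracy = (tp + tn) / (tp + fp + fn + tn)  # (339 + 274) / 694, about 0.88
tpr = tp / (tp + fn)                        # 339 / 406, about 0.83
fpr = fp / (fp + tn)                        # 14 / 288, about 0.05
```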

23 Examples of Results

Low true positive rate on flaws close to weld centerline and on smaller TFC flaws

24 Transfer Learning Example

Test results using the retrained model

Retraining

Retraining and incorporating transfer learning methods may help to improve the performance when the model encounters new data.
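One common form of transfer learning - retraining an already-fitted model on a small amount of new data - can be sketched as follows. The data, model, and step counts are all assumptions for illustration, not the setup used in the research program.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_scans(n, echo_amp, rng):
    """Synthetic rectified scans; every other scan carries an echo at sample 16."""
    X = np.abs(0.1 * rng.standard_normal((n, 32)))
    y = np.arange(n) % 2
    X[y == 1, 16] += echo_amp
    return X, y

def fit(X, y, w=None, b=0.0, steps=400, lr=0.5):
    """Logistic regression by gradient descent; w and b may come from a prior fit."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def accuracy(X, y, w, b):
    return np.mean(((X @ w + b) > 0) == y)

# Base model trained on "old" data with strong echoes.
X_old, y_old = make_scans(200, echo_amp=1.0, rng=rng)
w, b = fit(X_old, y_old)

# New data distribution (weaker echoes): retrain starting from the old
# weights, using only a small amount of new data.
X_new, y_new = make_scans(60, echo_amp=0.4, rng=rng)
w2, b2 = fit(X_new, y_new, w=w, b=b, steps=200)
```

Warm-starting from the old weights is what lets a small new dataset suffice; training from scratch on 60 scans would be far noisier.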

25 Findings to Date: ML

  • Capable of high TP rates with low FP and FN rates
  • May be able to learn key signatures using data from simple flaws (e.g., saw cuts) and generalize well to other flaw types (e.g., TFC)

- Generalization capability may vary with flaw size and location

  • Transfer learning techniques may be useful for improving accuracy with new data sets
  • Model type (for instance, NN vs DNN) may not significantly change results

ML, if used with care, can be used for NDE data classification

26 Findings to Date: Data

  • Training data should be representative of the types of data expected during testing

- Expanded training data sets may allow ML to accommodate nominal weld geometrical variances and associated noise

  • High accuracy possible if test data is in distribution relative to training data

- Consistency across training and test data sources important for high classification accuracy

27 Findings to Date: Metrics

  • Desired performance thresholds likely dependent on use case
  • Commonly used metrics: TPR, FP, FN

- Low FP and FN rates, high TPR desirable

- Zero FN, low FP, high (100%) TPR for screening?

  • Other useful measures

- Receiver operating characteristic (ROC) curves - TPR vs FPR

- ML training curves - can indicate overfitting and potential poor classification accuracy if deployed
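A ROC curve is simply TPR plotted against FPR as the decision threshold sweeps across the classifier's scores. A minimal sketch (the scores and labels below are invented for illustration):

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays as the decision threshold sweeps the scores."""
    order = np.argsort(-np.asarray(scores))  # descending score order
    labels = np.asarray(labels)[order]
    tps = np.cumsum(labels == 1)             # true positives at each cutoff
    fps = np.cumsum(labels == 0)             # false positives at each cutoff
    return fps / max(fps[-1], 1), tps / max(tps[-1], 1)

# Invented scores: flaws (label 1) tend to score higher than non-flaws (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   1,   0,   0]

fpr, tpr = roc_points(scores, labels)
# Area under the curve by the trapezoid rule; 1.0 means perfect ranking.
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
```

The area under the curve summarizes ranking quality independent of any single threshold, which is why ROC curves complement the fixed-threshold TPR/FPR numbers above.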

28 Findings to Date: Best Practices

  • Consistency in preprocessing procedures (crop, normalization, down-sampling, etc.)
  • Review and correct, if necessary, output labels
  • Tuning and selecting parameters that control the learning method
  • Retraining a trained network with additional data to improve performance and tune ML to site-specific data

Graphics: B-scan of saw cut and B-scan of TFC, normalized to 40 dB; original and cropped B-scan images showing the flaw.
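The consistency point above can be made concrete by routing every image through one preprocessing function. The function name, the 40 dB dynamic range, and the down-sampling factor are assumptions for this sketch (the 40 dB value mirrors the "normalized to 40 dB" label on this slide).

```python
import numpy as np

def preprocess_bscan(bscan, crop, target_db=40.0, step=2):
    """Apply identical crop / normalize / down-sample steps to every image.

    bscan     : 2-D array of rectified amplitudes (scan position x time)
    crop      : (row_slice, col_slice) region of interest around the weld
    target_db : dynamic range for normalization (assumed 40 dB)
    step      : down-sampling factor along both axes
    """
    img = np.abs(bscan[crop])
    # Express amplitudes in dB relative to the peak, clip to the target
    # dynamic range, and rescale into [0, 1].
    db = 20.0 * np.log10(np.maximum(img, 1e-12) / img.max())
    db = np.clip(db, -target_db, 0.0)
    return (db + target_db)[::step, ::step] / target_db

raw = np.abs(np.random.default_rng(3).standard_normal((64, 128)))
out = preprocess_bscan(raw, (slice(8, 56), slice(16, 112)))
```

Keeping these steps in one function means training and field data cannot silently diverge in normalization - the consistency issue flagged in the findings above.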

29 Status of RES Program - Assessment of Commercially-available Algorithms/Systems

  • Technical Letter Report entitled Evaluation of Commercial Rule-Based Assisted Data Analysis is in the RES/NRR review cycle

  • Confirmatory analysis of the commercial ML system being tested by industry in field trials has recently begun

- Focus on upper head examinations

- Mockups being designed and fabricated

- Assessment will include:

  • Pre-trained algorithm tested with vendor-collected UT data on NRC-owned mockups
  • Training and testing with PNNL/ORNL data with comparison of results to ORNL ML algorithm results

30 Status of RES Program - ML for UT NDE Ongoing Research

  • Impact of ML on POD, and comparison of ML results with manual analysis performed by a qualified analyst (including comparison of ML performance against Appendix VIII requirements)
  • AI-Assisted vs Fully-Automated analysis: Detection and sizing of degradation that the ML system has not been trained on, validation/qualification requirements, and essential variables
  • Qualification of ML

- Training, test, validation data requirements, and benchmark data sets

- Acceptable performance thresholds and requalification processes

  • Methods for establishing confidence in ML results

- Verification and validation of data and methods

- Uncertainty quantification, ML interpretability, and related criteria (if any) for qualification

31 Status of RES Program - ML for UT NDE

  • Technical letter report entitled An Assessment of Machine Learning Applied to Ultrasonic Nondestructive Evaluation (ORNL/SPR-2023/3245) published February 2024 (ML24046A150)

  • Other publications

- H. Sun, R. Jacob, and P. Ramuhalli, Classification of Ultrasonic B-Scan Images from Welding Defects Using a Convolutional Neural Network, Proc. 13th NPIC&HMIT 2023, pp. 272-281, ISBN 978-0-89448-791-0 (ML23241A961)

- H. Sun, P. Ramuhalli, and R. Jacob, Machine Learning for Ultrasonic Nondestructive Examination of Welding Defects: A Systematic Review, Ultrasonics, Vol. 127, January 2023, Article 106854 (ML22284A071)

32 Research Program Outcome

- providing the technical basis to answer...

33 Potential Qualification Pathways for ADA (including ML)

ADA for classification (flaw detection):

  • Can adopt approach similar to existing Section XI, Appendix VIII for performance demonstration
  • Assumed standard for performance: greater than or equal to current practice (i.e., human performance)
  • Could adopt similar acceptance criteria for performance demonstration

ADA for screening (excluding unflawed regions from evaluation):

  • Can adopt approach similar to existing Section XI, Appendix VIII for performance demonstration
  • Biased toward calling detections

- Goal is to have no misses

- Tolerance for high false call rate

  • Qualified UT analyst responsible for all calls
  • Acceptance criteria should reflect the bias toward detection
  • Do training/qualification specimens need to incorporate non-flaw features intended to generate a detection response with the algorithm?

If ML-based ADA has the potential to be better than current practice, then should ADA be held to a higher performance standard?

34 Initial Qualification Requirements for ADA-Assisted Examinations

  • A UT procedure that uses ADA-assistance can currently be qualified using Appendix VIII as the user of the procedure is a UT Level II
  • How should the qualification requirements specified in Section XI, Appendix VIII be updated?

- Currently only covers encoded data

- There are many complexities associated with training ML algorithms not captured in current rules

35 Implications Related to Retraining ADA Algorithms

  • If an ML algorithm is retrained, the algorithm has been altered and is a change of an essential variable in the procedure
  • In ASME Code Section XI Appendix VIII, a procedure must be requalified via a successful personnel qualification if an essential variable is changed
  • The NRC understands the potential benefits of changing the ASME Code to allow for field-friendly implementation of ML (e.g., requalifying a retrained ML algorithm on-site)

36 Paths to the Future for ADA

Diagram: two paths from current practice to the near future on the current trajectory.

  • Best outcome - skilled inspectors, engaged AI experts, and optimized procedures: inspections improved, new procedures can be developed, and new degradation can be found
  • Worst outcome - unskilled inspectors, no AI experts, and unoptimized procedures: current inspections may be OK, but new degradation may not be found and new procedures may be challenging
  • Paths to the worst outcome: new inspectors become overly dependent on AI tools, a lack of AI experts prevents effective development and retraining of algorithms, and AI experts move on to new tasks

Care must be taken to prevent this outcome.

37 Avoiding Future Problems

  • Industry needs to build the infrastructure to allow for the effective use of ADA
  • Create rules for requalifying an algorithm after modification that do not require a person to pass a personnel test

- e.g. Finds all flaws in qualification data without too many additional false calls

  • Requirements for personnel to use ADA-assisted procedures to assure that they have appropriate skills

- e.g. Pass an Appendix VIII test for the same Supplement without ADA assistance
