ML18033A124
ML18033A124 | |
Person / Time | |
---|---|
Issue date: | 03/14/2018 |
From: | Weber M NRC/RES/DRA |
To: | Brian Holian Office of Nuclear Reactor Regulation |
C. Hunter | |
Shared Package | |
ML18033A123 | List: |
References | |
NRC Response to Public Comments on Draft NUREG-XXXX, Common-Cause Failure Analysis in Event and Condition Assessment: Guidance and Research, Draft Report for Comment
Draft NUREG-XXXX, Common-Cause Failure Analysis in Event and Condition Assessment: Guidance and Research, Draft Report for Comment (Agencywide Documents Access and Management System (ADAMS) Accession No. ML111890290) was provided for public comment (see Federal Register Notice published on November 2, 2011). Formal comments were received from the following:
- PWR Owners Group (ADAMS Accession No. ML12062A072)
- Nuclear Energy Institute (ADAMS Accession No. ML12075A173)
- Exelon Generation Company (ADAMS Accession No. ML12083A057)
- KNF Consulting Services (ADAMS Accession No. ML121170479)
- Palisades Nuclear Plant (ADAMS Accession No. ML12006A067)¹
The attached table provides a summary of these comments and the associated NRC staff response. Note that the NUREG has been significantly revised. Based on significant feedback received from industry stakeholders and meetings with internal stakeholders, NRC staff decided that the scope of the NUREG should be limited to providing the basis for the treatment of common-cause failure potential in SDP risk assessments.
In addition to the restructuring and revision of text, the material provided in Sections 3 and 4 and Appendices A and B of the draft NUREG has been eliminated. Given these changes, many of the comments received have been rendered moot or no longer apply. However, the general intent of some of these comments still may be applicable and, therefore, the staff still tried to provide a response in these cases.
¹ Comments/feedback on the draft NUREG were provided by Palisades as part of a Significance Determination Process (SDP) risk assessment regarding the failure of a service water pump coupling that occurred in 2011.
Public Comments on Draft CCF NUREG and NRC Response
PWROG General Comments: The purpose of the draft NUREG is to assist the staff in dealing with modeling considerations of CCF in the context of an ECA.
This draft NUREG is intended to replace the text that was removed in the RASP Handbook that was related to treatment of CCFs. Section 1 establishes the philosophical underpinnings supplemented by three ground rules and some examples. Thus, Section 2 provides the recommended approaches to the NRC staff.
If these approaches were sufficiently robust and representative of a consensus of PRA practitioners, the draft NUREG would end there; however, Section 3 discusses a variety of issues related to current CCF modeling approaches (especially in a SPAR model) in the context of ECA. Section 4 reinforces these issues by citing the research that is currently being (or planned to be) performed to address some of the identified issues, with some of the timeframes being long-term (to be completed in more than five years). This suggests in the context of an ECA that the currently used CCF models are not sufficiently robust to support the application at hand. Perhaps some simpler, more grounded approaches are needed to deal with CCF in the context of ECA at this time. When the stated research resolves some of the identified issues, then the concept presented in this draft NUREG will be more viable.
NRC Response: We agree that the scope of the draft NUREG was too broad and resulted in the appearance of conflicting information about the current state-of-practice in CCF modeling. Therefore, the sections covered by the comment (e.g., Sections 3 and 4) in the draft NUREG have been eliminated in the revised NUREG.
The scope of the revised NUREG has been limited to material relevant to the treatment of CCF potential in Significance Determination Process (SDP) risk assessments and, therefore, we believe provides the simpler, more grounded approach referenced in the comment.
PWROG General Comments (cont.): Starting in Section 1.1 and continuing throughout Sections 1 and 2, especially the discussion in Section 1.4, is the concept that the underlying performance deficiency (e.g., poor maintenance practice) should be the basis to consider the full impact of CCF in the retrospective assessments. In a presentation made by NRC at the EPRI's Configuration Risk Management Forum (CRMF) on this draft CCF NUREG, this concept was discussed. The concept, as proposed in the draft NUREG, sets a dangerous precedent, as there is apparently no line established that would limit the CCF impact to the CCCG. While the NRC recognizes the limitation of this concept, the final NUREG needs to be explicit about this limitation. Otherwise, with a performance deficiency of poor maintenance, which could potentially affect any component, and without some explicit limitations in the draft CCF NUREG, the scope of the CCF impact could extend beyond the normally considered intra-system CCF effects to include a wide range of inter-system CCF impacts that would potentially result in conservative insights, which are not supported by CCF data.
NRC Response: The identification of CCCG boundaries is outside the scope of the NUREG. However, the CCCG boundaries in the base PRA should be selected in a manner consistent with the current-state-of-practice and applicable PRA standards and are not expected to require adjustment for the treatment of potential CCF in SDP risk assessments. As described in the revised NUREG, the impact of potential CCF is limited among redundant components within the same CCCG.
See Section 2.1.5 in the revised report.
PWROG General Comments (cont.): The text supporting Figure 1 needs to be clear that A1, B1, A2, B2, etc. represent subcomponents of a single modeled component in the PRA, i.e., A1, B1, etc. are not separately modeled in the PRA. Second, it must be clear that Components A and B are in the same CCCG.
Without these explicit constraints, there would be no bounds on the impact of subsequent CCFs.
NRC Response: Figure 1 and the supporting text have been modified and moved to Section 2.1.2 in the revised NUREG to address the comment.
PWROG General Comments (cont.): With these conditions established, there is potential for being overly conservative. Using the example of poor maintenance process as the performance deficiency, there is no differentiation of how the maintenance practices have contributed to the observed failure. The draft CCF NUREG suggests that if Subcomponent A2 was over-torqued (leading to its failure) due to a poor maintenance procedure or poorly trained technician, then all other subcomponents are subject to failure due to poor maintenance and the entire value of CCF for all the components in the CCCG must remain in the model.
This ignores the fact that there are varied forms of maintenance to which the component may be subject. Not all of these forms involve torqueing (e.g., calibration, filling an oil reservoir, cleaning, etc.). Further, these similar and dissimilar maintenance activities may be performed by different maintenance personnel, with different levels of training and experience. Another consideration is that not all maintenance procedures are created equal. Thus, to consider the full impact of the CCF (e.g., not reducing the alpha factor or beta factor) in light of the uncommon aspects of the underlying performance deficiency will over-estimate the impact of CCF on the risk metric results. Adjusting the CCF factors to account for these issues is permitted by the definition of the failure memory approach, where it states that "failure probabilities ... are conditioned as necessary to reflect the details of the event."
NRC Response: The purpose of the SDP risk assessments is to determine the risk significance of the observed licensee performance deficiency, which is normally defined at a more general organizational or performance level (rather than at a specific piece-part or detailed activity level). The performance deficiency establishes the shared proximate cause for CCF potential. The staff agrees with the commenter's observation that a performance deficiency impacting equipment within a common cause component group (CCCG) could be manifested in different ways on different redundant components within the CCCG. However, the staff does not believe that this modeling approach is necessarily conservative, and in fact it may be non-conservative (e.g., when the performance deficiency is associated with strong linkage factors within the CCCG). Furthermore, operating experience data collection and CCF parameter estimation are consistent with the proximate cause as normally defined under the SDP. Although defining the performance deficiency in this manner could imply that there is CCF potential for equipment outside the affected CCCG, as discussed in the NUREG, CCF potential is limited to only those redundant components within the same CCCG.
PWROG General Comments (cont.): Focusing on the common elements, there is also an issue of extent of condition. The draft CCF NUREG states that the failure memory concept does not acknowledge subsequent verification of extent of condition. This is overly conservative if there are some uncommon factors present (e.g., different inspectors, different maintenance personnel, experience and training, etc.).
NRC Inspection Manual Chapter 308, Attachment 3, Section 3.C specifies ten specific attributes and principles for risk-informed SDP tools. Principle 10 states that: "All technical judgments made by the staff within any probabilistic-based SDP tool should have bases that are clearly observable as reasonable, as well as reasoned, based on best available information, and not purposefully biased in a conservative manner simply because of uncertainties which are applicable in both conservative and non-conservative directions. As a corollary, this also requires that staff technical or probabilistic judgments not be traded off within a risk model by allowing a conservative bias in one modeling factor simply because another factor is believed to be non-conservatively biased."
It should be ensured that this principle is maintained in the guidance for treatment of CCF in an ECA by ensuring that the process is sufficiently reasonable and reasoned, and not biased in a conservative manner to account for perceived uncertainties in the probabilistic treatment of CCF.
NRC Response: Without supporting data, it is difficult to quantitatively credit the uncommon factors presented. The SDP is a risk-informed process and, therefore, factors not explicitly considered in the quantitative risk assessment can be qualitatively considered during the enforcement process (i.e., the numerical result should not be adjusted to reflect non-quantifiable factors). However, many CCF defenses (e.g., staggered maintenance) were in place during observed CCF events and are already represented in parameter estimates; therefore, qualitative credit should only be considered for unique defenses not already factored into the CCF data. The staff's treatment of CCF (e.g., maintaining observed successes at their nominal likelihood consistent with the CCF model) is consistent with the failure memory approach, which is state-of-practice for event and condition assessment (ECA).
PWROG Comment-Missing Expressions: There are several missing expressions on Page 7, Lines 21-23, which make it difficult to follow the discussion.
NRC Response: The discussion on page 7 (independent failure) is not within the scope of the revised NUREG and as such has been eliminated.
PWROG Comment-Modeling of CCF in PRA Models: The draft CCF NUREG Section 1.1 states that "Inter-components dependencies, which are not captured explicitly in the PRA models, span a wide range ..." This type of inter-component dependency refers specifically to CCF. The statement appears to contradict subsequent discussion in that paragraph, and in Section 1.2, which acknowledges that CCF is included in the PRA. Such a statement may be misleading.
NRC Response: This apparent contradiction has been removed. Note that the text regarding intercomponent dependencies has been modified (see Section 1.2 in the revised NUREG for additional information).
PWROG Comment-Common Cause Failure Definition: The definition of CCF (Section 1.3) states that the failure mechanisms do not have to be shared. This is true to a certain extent. For example, an earthquake that causes different components to fail via different mechanisms is clearly identified as a CCF with the same cause (i.e., the earthquake). If one component fails because the manufacturer used an under-specified sub-component and a second component in the same CCCG fails due to a faulty maintenance procedure, these failures are not considered to be a CCF event. While the mechanisms differ, there is no shared cause to generate those mechanisms.
NRC Response: The discussion on shared cause and coupling factor has been modified (see Section 2.1.2 in the revised NUREG for additional information). For multiple failures of redundant components within the same CCCG to be considered a CCF, the cause must be shared. In CCF data collection, the determination of whether the cause is shared is at the proximate-cause level. Therefore, the example provided in the comment would likely be classified as two (separate) independent failures.
PWROG Comment-Common-Mode Failure Definition: The draft CCF NUREG, as stated in Section 1.3, does not encourage the use of the term common mode failure, which was first used in WASH-1400. The SPAR models used by the NRC to support the SDP are less detailed than the industry's PRA models and may not explicitly include failures caused by shared components or latent human errors. Such failures are included in the CCF database and used to estimate CCF parameters. The PWROG has an ongoing program to identify such failures when the CCF data sets are reviewed.
NRC Response: SPAR models are designed to effectively capture risk-significant contributors and are benchmarked with licensee PRA models and have been subject to peer review. In fact, the SPAR CCF modeling approach is better suited to addressing CCF contribution for ECA activities than many licensee PRAs.
The data collection effort uses component boundaries that are consistent with SPAR model component representations and as such, captures the impact of dependencies and latent human errors described by the commenter.
PWROG Comment-CCF Parameter Update: Failure probabilities are conditioned as necessary to reflect the details of an event associated with the performance deficiency as part of an ECA. As noted in the first ground rule (Section 1.4), such performance deficiencies are usually identified in the NRC's inspection reports.
In the majority of cases, such events are classified as potential CCFs that have not yet entered into the CCF database. The calculation of conditional CCF probabilities (i.e., alpha factors) is based on the CCF parameters derived from events already included in the CCF database. The draft CCF NUREG does not discuss whether a Bayesian update of the CCF parameters should be performed to include the event associated with the performance deficiency. The NRC should not be updating CCF parameters on the basis of a single failure and a retrospective assessment (to determine what might have happened). The CCF database must be developed and maintained on the basis of CCF events that have actually occurred.
NRC Response: We agree that ECA activities and data collection/analysis are two distinct activities and note that data collection and CCF parameter updates are beyond the scope of the revised NUREG. However, it is important to note that SDP evaluations utilize existing CCF parameters to calculate the change in risk (i.e., a Bayesian update of CCF parameters is generally not performed to support an SDP assessment). As such, the circumstances associated with the event being analyzed by the SDP are not reflected in the CCF parameter estimates supporting that analysis.
PWROG Comment-Deviations from Ground Rules: This section (Section 1.4.1) provides a caution when revising CCCG boundaries, because typical performance deficiencies, which reflect organizational problems, such as poor maintenance, can couple the EDGs despite design differences.
NRC Response: The text concerning these deviations has been modified (see Section 3.3 in the revised NUREG for additional information). SDP evaluations focus on the proximate cause, and there is potential for a cause at this level to affect redundant components despite design differences. The consideration of CCF potential in SDP risk assessments is limited to redundant components within the same CCCG. The base PRA should consider whether the design differences (and other factors per the ASME/ANS PRA standard) sufficiently de-couple the redundant components to where they should not be contained within the same CCCG.
PWROG Comment-CCF Examples: In Section 1.5 (first paragraph), there is a footnote that indicates that the events from this section would be included in future revisions of the database. These events each involve the failure of a single component; there is no CCF to put into the database. For the purpose of an ECA, a CCF is assumed to be able to occur in the future. The database should be reserved for cases in which a CCF event actually occurred.
NRC Response: This footnote has been deleted.
PWROG Comment-Basic Principles of CCF Treatment in ECA (Section 2.1): These principles, particularly item (6), convey that in the context of ECA, all failures are dependent failures unless proof can be shown of failure independence. This is not reflective of operating/failure experience, and in the context of ECA will be overly conservative with respect to the number (and nature) of dependent failures.
NRC Response: The NUREG describes the process of considering the CCF potential for redundant components within the same CCCG (i.e., components with shared coupling factors), given an observed failure of a component and the shared (proximate) cause (i.e., the observed performance deficiency). The treatment of whether an event is classified as independent or is added in the CCF database is a data collection and coding matter and is outside the scope of this NUREG.
PWROG Comment-Failure Memory Approach: In discussing the guidance in Section 2.1 on Basic Principles of CCF Treatment in ECA, the draft CCF NUREG indicates that in using the failure memory approach in ECA no credit is given to observed successful equipment operation. To determine the potential for CCF, given that one or more components within the CCCG were observed to be incapable of performing their intended function, testing of the redundant components within the CCCG may be performed. Certain limiting conditions for operation (LCOs) in the Technical Specifications (TS) (e.g., those for EDGs) require the performance of additional tests when one component within the CCCG is observed to be in a degraded condition. Successful operability tests give assurance that the component will perform its intended function when demanded. The failure memory approach appears to be in conflict with certain LCOs. It appears that some credit should be given for successful equipment operation in the reduction of CCF potential.
NRC Response: The failure memory approach, which is the state-of-practice for ECA, accounts for the possibility that equipment that functioned successfully during the actual event may fail to function during a future event. Therefore, when assessing the potential for CCF, subsequent demonstrations of functionality (e.g., to verify TS operability) do not change the potential CCF likelihood (which instead is determined by the parameter estimates based on operating experience).
If subsequent testing of redundant components resulted in additional equipment failures, this would likely indicate the existence of an actual CCF rather than a potential CCF.
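The conditioning rule described in this response can be sketched in a few lines. This is only an illustration of the failure memory idea as stated above (observed failures become certain, observed successes stay at their nominal likelihood); the function name, component names, and probabilities are hypothetical and are not taken from the NUREG or from SAPHIRE.

```python
def apply_failure_memory(nominal, observed_failures):
    """Condition basic-event probabilities on an observed event.

    Observed failures are set to 1.0. Every other event, including equipment
    that was observed to succeed, keeps its nominal failure probability.
    """
    unknown = set(observed_failures) - set(nominal)
    if unknown:
        raise KeyError(f"not in model: {sorted(unknown)}")
    return {name: 1.0 if name in observed_failures else p
            for name, p in nominal.items()}


# Hypothetical nominal failure probabilities for two redundant trains.
nominal = {"EDG-A": 2.0e-2, "EDG-B": 2.0e-2}

# EDG-A failed; EDG-B later passed an operability test. The successful test
# does not lower EDG-B's probability under this rule.
conditioned = apply_failure_memory(nominal, observed_failures={"EDG-A"})
print(conditioned)
```

The sketch covers only the independent basic events; the CCF contribution for the group is governed separately by the CCF parameter estimates.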
PWROG Comment-ECA Workspace: In Section 2.2, CCF Treatment Categories, the draft CCF NUREG indicates that examples are provided in Appendix C that illustrate the ECA Workspace of SAPHIRE 8. Appendix C was not included as part of the draft CCF NUREG.
NRC Response: The scope of the revised NUREG is now limited to how CCF potential is treated in SDP risk assessments. All text related to the details of the probabilistic treatment within SAPHIRE (including Appendix C) has been eliminated.
PWROG Comment-BPM Symmetric Assumption (Section 3.1.2): The BPM assumes that each component in the CCCG has the same failure rate or failure probability. This assumption is invalidated if one of the components within the CCCG is degraded and has a failure rate or failure probability that differs from the others. SAPHIRE 8 addresses the treatment of the degraded component, but the draft CCF NUREG provides no explanation on how the degraded condition is modeled in SAPHIRE 8 when performing an ECA.
NRC Response: We agree with the comment; however, the material covered by this comment related to SAPHIRE's treatment of CCF in the draft NUREG has been eliminated from the revised NUREG.
PWROG Comment-Treatment of Shared Components and Latent Human Errors (Section 3.2.1): The treatment of shared components and latent human errors can be a source of uncertainty in the ECA. The analyst must determine whether such treatment is implicit or explicit. To make such a determination, the analyst should obtain necessary information from the utility. This can be a source of uncertainty that may or may not be recognized by the ECA analyst. The draft CCF NUREG has not discussed the potential sources of uncertainty that may be encountered in performing an ECA.
NRC Response: Sources of uncertainty within ECA are outside the scope of the revised NUREG. However, failure rate data used to develop CCF parameter estimates for SPAR models includes human error contributions. Therefore, explicit treatment of latent human error leading to CCF would introduce significant double counting. Furthermore, the SDP includes opportunities for engagement between the licensee and the NRC risk analyst; these interactions provide additional confidence that the SDP appropriately represents the failure event. See Section 3.3 of the revised NUREG for additional information.
PWROG Comment-Prior Distribution of Alpha Factors: The prior distributions for alpha factors are currently estimated using data from the CCF database for the 1995-2005 timeframes, as noted in this section. The latest released version of the CCF database includes events up through 2010. The NRC has an ongoing program that collects CCF events, which are used to update the CCF database.
This section (Section 3.2.2) contains the following: "Because it was felt that the number of complete CCF events may be under-represented, especially for large group sizes, a statistical model was developed to estimate the number of missing complete events, and these were then added to the partial event counts for each group size."
This statement is not particularly strong in conviction. What is the basis for the feeling that the number of CCF events may be underestimated? What is the basis for a statistical program to fill in missing events? This appears to be an unsubstantiated and ad hoc process to develop a prior distribution.
NRC Response: The material covered by this comment related to the topics in Section 3 of the draft NUREG has been eliminated from the revised NUREG. It is recognized that the selection of a prior distribution often depends on judgment and experience. For this application, we believe that the formulation of the prior is appropriate. However, we also note that as operating experience is accumulated, potential biasing caused by the prior selection will decrease.
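The diminishing influence of the prior can be illustrated numerically. The following is a minimal sketch, assuming a Dirichlet prior on the alpha factors that is updated with event counts (a common conjugate choice for this kind of parameter); the prior parameters and counts are hypothetical and are not drawn from the CCF database.

```python
def posterior_mean_alphas(prior_params, event_counts):
    """Posterior mean of each alpha factor for a Dirichlet prior.

    prior_params[k-1] is the prior pseudo-count for events failing k components;
    event_counts[k-1] is the observed number of such events.
    """
    total = sum(prior_params) + sum(event_counts)
    return [(a + n) / total for a, n in zip(prior_params, event_counts)]


# Hypothetical prior for a group of two: prior mean alpha_2 = 0.5 / 10 = 0.05.
prior_params = [9.5, 0.5]

# Hypothetical data in which 2% of events fail both components.
for scale in (1, 10, 100):
    counts = [98 * scale, 2 * scale]
    alpha_2 = posterior_mean_alphas(prior_params, counts)[1]
    print(f"{sum(counts):>6d} events: posterior mean alpha_2 = {alpha_2:.4f}")
```

As the number of observed events grows, the posterior mean moves away from the prior mean and toward the observed fraction, which is the behavior the response describes.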
PWROG Comment-Treatment of Staggered Testing: The discussion in this section (Section 3.2.4) indicates a question about what, if anything, should be done.
A paper is referenced that states that both of these equations may be in error and, further, that the net impact on ECA of these equations being incorrect remains to be examined. This reinforces the discussion in the general comment that the premise upon which the draft CCF NUREG is based raises a number of questions about the validity of the methods proposed.
NRC Response: Material covered by this comment related to the topics in Section 3 of the draft NUREG has been eliminated from the revised NUREG.
PWROG Comment-Conditional CCF Probability: Appendix A (first paragraph) of the draft CCF NUREG states that Appendix B to the SAPHIRE 8 technical reference includes details of the conditional CCF calculations. These calculations are not available to the draft CCF NUREG reader.
NRC Response: The appendices in the draft NUREG have been eliminated in the revised NUREG. However, information on the SAPHIRE computer code can be found in NUREG/CR-7039.
PWROG Comment-Table 2 of Rounding Errors: The values provided in this table and the appropriate expression from Equation A.8 are used to calculate the basic event probabilities in Table 3 of Appendix A. The values provided in Table 3 of Appendix A are rounded to two significant digits after the decimal point.
The rounded values slightly over-estimate the probabilities, for cut sets with multiple basic events, provided in Table 4 of Appendix A. Using the values provided in Table 2 of Appendix A, the calculated probability for cut set {A-S, B-R, C-R} is 6.579E-07, which is slightly lower than the calculated probability of 6.607E-07 that was obtained using the rounded values in Table 3 of Appendix A. Depending on the number of cut sets, the overall conditional probability can also be over-estimated. For the basic event unavailability cases considered in Appendix A, the calculated probabilities should be based on the actual values provided in Table 2 of Appendix A.
NRC Response: Appendix A (including Table 2) has been eliminated in the revised NUREG.
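The rounding effect raised in this comment can be reproduced with a small sketch. The basic event values below are hypothetical, chosen so that each one rounds upward; they are not the Table 2 or Table 3 values from the draft NUREG, which are not reproduced here. A cut set probability is taken as the product of its basic event probabilities.

```python
from math import prod


def round_sig(x, sig):
    """Round x to `sig` significant figures (e.g. 1.236e-2 -> 1.24e-2 for sig=3)."""
    return float(f"{x:.{sig - 1}e}")


# Hypothetical unrounded basic event probabilities for a three-event cut set.
exact = [1.236e-2, 5.678e-3, 9.446e-3]

# Values as they would appear in a table printed with two digits after the decimal point.
rounded = [round_sig(p, 3) for p in exact]

p_exact = prod(exact)
p_rounded = prod(rounded)
rel_diff = (p_rounded - p_exact) / p_exact

print(f"exact   = {p_exact:.3e}")
print(f"rounded = {p_rounded:.3e}")
print(f"relative difference = {rel_diff:.2%}")
```

Rounding can push a product in either direction; it over-estimates here only because every hypothetical input rounds up. Carrying unrounded values through the calculation and rounding only the final result avoids the effect.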
NEI-General Comments: The Nuclear Energy Institute (NEI), on behalf of the nuclear energy industry, appreciates the opportunity to comment on the subject draft NUREG, "Common-Cause Failure Analysis in Event and Condition Assessment: Guidance and Research, Draft Report for Comment," as the main objective of this NUREG is to improve the consistency and accuracy of event condition assessments performed in support of the SDP. The treatment of conditional CCF probability estimates in SDP evaluations has been particularly problematic for the past several years, and the effort to address this via the work documented in this NUREG is an important step towards improving SDP evaluations as a whole.
However, as written, this NUREG does not sufficiently support this objective. Specifically, the document endorses the use of the alpha factor as a proxy for the conditional probability of CCF. Such an approach is not appropriate for evaluation of a specific event, and the document should be revised to better guide those performing SDP evaluations towards accurate characterization of conditional probability of CCF. Detailed comments regarding specific approaches that should be discussed in the NUREG in lieu of inappropriately using alpha factors in event-specific assessments are included in the attachment to this letter.
Given the extensive revisions that would be needed to support inclusion of such information, the industry believes that a public meeting on the content of the draft NUREG would be beneficial, and further suggests that another draft of this document be released for public comment prior to publication of the final NUREG.
Additionally, we urge the NRC to strongly consider comments submitted by utilities and other stakeholder organizations in evaluating the content of the draft NUREG as it currently stands.
NRC Response: We thank you for your organization's feedback and comments on the draft CCF NUREG. In regard to this specific comment, the fact that conditional probability results can be expressed as a function of alpha factors follows from the math in Appendix E of NUREG/CR-5485. The alpha factor model is a state-of-practice method used within industry and is referred to in the ASME/ANS PRA standard as a Capability Category III CCF method. Alpha factors are generally specific to the system and component being evaluated; however, they are estimated based on aggregate data without regard to a specific failure mechanism and are not specific to the inspection finding being evaluated. The procedure for estimating alpha factors and the supporting data reflect the current state-of-practice and, therefore, are appropriate for ECA activities. The NRC staff rigorously considered industry's comments, which resulted in significant improvements to the NUREG, including a reduction in the scope of the report. A separate public meeting associated with this NUREG was not held because the applicable CCF issues have been discussed during Reactor Oversight Process public meetings when the guidance was incorporated into the RASP Handbook [see the public meeting summary for the May 13, 2013 meeting on revised RASP Handbook guidance (ADAMS Accession No. ML13175A198) for additional information] and are well understood by NRC staff. More recently, public meetings to discuss CCF modeling were held on May 2, 2017 (ML17123A195) and September 19, 2017 (ML17264A624).
NEI-Inaccuracies Associated with Simplistic Approach to Estimating CCF: The draft NUREG proposes the use of the alpha factor in developing event-specific CCF estimates, which is a simple approach that supports more rapid evaluations to support the SDP. While the desire to pursue rapid evaluations is understandable given NRC's expectations for timely completion of a final SDP evaluation, the loss of accuracy due to the fact that the alpha factor considers failure events from all observed causes is not appropriate, and pursuing more accurate estimates should be a priority.
NRC Response: The use of the alpha factor model is a state-of-practice CCF approach that is fully consistent with the PRA standard (see Supporting Requirement DA-D5 in ASME/ANS RA-Sb-2013). The alpha factor is not simplified relative to other standard CCF methods (e.g., MGL or basic parameters approaches); it is an effective way to count failure data given current data collection practices.
NEI-Insufficient Discussion on Consideration of Failure Cause: The draft NUREG recommends using the alpha factor as the conditional failure probability for redundant components given that one component in the CCCG fails, even though the alpha factor is not a conditional failure probability, but rather a correlation factor of actual and potential failures to all failures. It recommends defaulting to this approach without regard to the specific attributes of the performance deficiency cause that could manifest in a CCF event.
The NUREG does not acknowledge that the CCF probabilities are estimated based on causes that did not affect the component that failed. For example, if the cause is a deficient maintenance program, common cause due to environment and design do not apply. The NUREG goes on to note that conservative and non-conservative estimates can result from this approach, but does not address alternative approaches that could address this issue.
NRC Response: See response to previous comment. The CCF approach described in the revised NUREG report does not use alpha factors as the conditional failure probability, as stated by the commenter; instead, the alpha factor model is conditioned on the observed failure. Thus, the CCF likelihood (as conditioned on the observed failure) is a mathematical function of the alpha factor parameters and component failure likelihood. Although this function can, under certain circumstances, be simplified to a single alpha factor, it is not correct to say that the alpha factor is the conditional failure probability. Note that other basic parameter CCF models (e.g., MGL) can be used in a similar manner to obtain the CCF likelihood conditioned on the observed failure event. The alpha factor model is the preferred NRC approach for ECA applications because it has a number of computational advantages when estimating model parameters based on operating experience data (e.g., better alignment with operating experience data for PRA applications and the ability to estimate CCF parameters independent of failure rate). We acknowledge that more refined causal-parameter estimates could be developed to better estimate the causal aspects of CCF potential given an observed failure due to a specific (proximate) cause. However, such methods are beyond the current state-of-practice and would require a significant amount of operating experience data (beyond what is currently available) to provide reasonable estimates of model parameters.
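The distinction between an alpha factor and a conditional CCF likelihood can be illustrated for a group of two components. The sketch below assumes the alpha factor basic-event expressions as commonly presented (for example, in NUREG/CR-5485) for staggered and non-staggered testing, and conditions on one observed failure with a simple rare-event ratio. It is not the procedure in the revised NUREG or the SAPHIRE implementation, and the parameter values are hypothetical.

```python
from math import comb


def basic_event_probabilities(alphas, q_total, staggered=False):
    """Return [Q_1, ..., Q_m] for a group of size m = len(alphas).

    Q_k is the probability of a basic event failing k specific components.
    Staggered:     Q_k = alpha_k * Q_t / C(m-1, k-1)
    Non-staggered: Q_k = k * alpha_k * Q_t / (C(m-1, k-1) * alpha_t),
                   with alpha_t = sum(k * alpha_k).
    """
    m = len(alphas)
    if staggered:
        return [a * q_total / comb(m - 1, k - 1)
                for k, a in enumerate(alphas, start=1)]
    alpha_t = sum(k * a for k, a in enumerate(alphas, start=1))
    return [k * a * q_total / (comb(m - 1, k - 1) * alpha_t)
            for k, a in enumerate(alphas, start=1)]


def conditional_ccf_two_components(alphas, q_total, staggered=False):
    """Approximate P(both fail from a shared cause | one component failed).

    Rare-event approximation for a group of two: Q_2 / (Q_1 + Q_2).
    """
    q1, q2 = basic_event_probabilities(alphas, q_total, staggered)
    return q2 / (q1 + q2)


alphas = [0.95, 0.05]   # hypothetical alpha_1, alpha_2
q_total = 1.0e-3        # hypothetical total failure probability per component

print(conditional_ccf_two_components(alphas, q_total, staggered=True))
print(conditional_ccf_two_components(alphas, q_total, staggered=False))
```

Under the staggered-testing expressions the ratio reduces to alpha_2 itself, which is the kind of special case in which the function collapses to a single alpha factor. Under the non-staggered expressions it becomes 2*alpha_2 / (alpha_1 + 2*alpha_2), so the alpha factor and the conditional likelihood are not the same quantity in general.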
NEI-Inaccuracies Due to Use of the Alpha Factor as CCF Probability: The report states that the full conditional CCF probability should be applied to all components in the group with the failed component, regardless of the details or cause associated with the failure even though the alpha factors used in PRA and SPAR models include all inter-component dependencies not captured explicitly in the models. Because of the relatively high values for CCF probabilities, it is likely that CCF will be a significant contributor and artificially drive the results into higher action categories. For example, incipient failures are included in the CCF probability calculations. Events that do not represent an actual failure, but may have been a CCF, are used in the CCF probabilities. Although not assigned a full failure, they bias the results.
The draft NUREG acknowledges these complications, as Section 3.1 specifically notes that use of alpha factors in event condition assessments can result in conservative or non-conservative estimates, yet this approach is still endorsed in the draft NUREG. This is because the approach supports rapid completion of SDP evaluations. As noted above, the potential negative effects of this approach do not justify the slight reduction in analysis time, and, as written, the document will not result in more precise treatment of CCF in SDP evaluations.
The industry acknowledges that properly evaluating the conditional failure probability of redundant components given failure of one component due to a performance deficiency that has potential CCF implications is not a simple task. This does not change the fact that the use of the alpha factor as a surrogate for the conditional probability of failure of the redundant components due to a plant-specific performance deficiency is entirely inappropriate.
A more appropriate approach would be to use the CCF database criteria in combination with expert elicitation to calculate the impact vectors for the plant specific conditions, taking into account all specific information about the condition, including successful runs or tests as well as the degree of degradation observed in redundant components. This could then be used to estimate a conditional probability of CCF that would be more representative of the actual cause.
NRC Response: The approach described in the revised NUREG is based on a state-of-practice CCF method (i.e., the alpha factor model) and the scope of the SDP evaluations. It is standard practice in SDP risk assessments that credit is not provided for equipment that successfully operates (either during the event or subsequent testing) for its PRA mission time (see Section 2.1.3 in the revised NUREG for additional information). Furthermore, the methods described in the revised NUREG do not constitute any change in staff position or practice for the SDP. The state-of-practice approach described in the NUREG is consistent with the failure memory approach and has been a long-standing practice in event and condition assessment.
Exelon-General Comments: Exelon believes that the suggested use of the conditional CCF (i.e., the alpha-factor) in ECA might be flawed. The draft NUREG states that the full conditional CCF probability should be applied to all components in the group with the failed component, regardless of the details or cause associated with the failure. Exelon's concern with this approach is that the conditional CCF probabilities (i.e., alpha factors) used in PRA and SPAR models include all inter-component dependencies not captured explicitly in the models.
As discussed in Section 1.1, PRA Treatment and Dependent Failure, the CCF parameters include multiple failures of components that: "span a wide range, and may include common design, manufacture, testing, maintenance, environment, and many others." The draft NUREG attempts to provide a supporting position to justify that the potential shared failure is at a causal level (e.g., a deficiency in a maintenance process) as opposed to a failure mechanism. While there is merit in this argument, the draft NUREG does not acknowledge that the other causes included in the CCF probabilities did not affect the component that failed. For example, if the cause is a deficient maintenance program, common-cause due to environment and design do not apply. However, the CCF probabilities include all these causes. The draft NUREG mentions that the impact of this assumption may be conservative or non-conservative, but does not seem to justify why the entire CCF probability should be used. While the draft NUREG describes that the industry is incorrect in trying to define the cause too, it suggests a method that assumes all causes of CCFs are applicable. Exelon does not necessarily agree with this assumption, since this might tend to increase the impact of common-cause.
In using the full CCF probability, the ECA penalizes licensees in cases where the CCF probability is actually lower than the value used in the PRA model. Because of the relatively high values (see note below) for CCF probabilities, it is likely that CCF will be a significant contributor and drive the results into higher action categories (e.g., WHITE, YELLOW). Even if the CCF probability used in the ECA is non-conservative (i.e., there is a stronger causal link than average, due to the nature of the deficiency and the historic events), the underestimation of risk is expected to be less than the overestimation using the suggested approach discussed in the draft NUREG. This is because of the relatively high alpha-factor values and the corresponding numerical impacts of CCFs on the final results.
NOTE: There appears to be some inconsistency between Section 1.1, Definitions and Discussions (page 6), first paragraph, and the information discussed above. This is based on the statement in the draft NUREG definition that the "potential for failure does not need to be high." CCF probabilities on the order of <0.05 should not be considered low, when compared to most independent failure probabilities (e.g., diesel generators are an exception) which are on the order of <0.001. In fact, when comparing alpha-factors to independent component failure probabilities, many alpha-factors would be considered "high."
Furthermore, there are several items discussed in the draft NUREG which result in higher CCF probabilities, making application of the full CCF probability to all components in the CCCG conservative:
- The inclusion of incipient failures in the CCF probability calculations. Events that do not represent an actual failure, but may have been a CCF, are used in the CCF probabilities. Although not assigned a full failure, they bias the results in a conservative direction.
- The prior distributions are statistically manipulated to account for what is perceived to be fewer complete CCF events than represented by the data (referenced in Section 3.2.2). This results in the CCF probabilities being larger than the historical data indicates.
Finally, several statistical issues are raised on the "accuracy" of the CCF probabilities, which may result in conservative or non-conservative assessments. As discussed previously, the conservatisms will tend to have more impact on the results than the non-conservatisms. This is unfortunate in a "risk-informed" process that results in enforcement actions on licensees, where the result of the SDP calculation is generally used, without consideration for other factors, to determine the significance of the event.
NRC Response: The alpha factor model is a state-of-practice CCF approach and is consistent with the PRA standard (see Supporting Requirement DA-D5 in ASME/ANS RA-Sb-2013). Although the staff agrees that CCF methods could be improved to better reflect causal factors in ECA analyses, the practical application of such causal methods is beyond the current state-of-practice and not adequately supported by operating experience data. However, as noted in other comment responses, factors not explicitly considered in the quantitative risk assessment can be qualitatively considered during the enforcement process.
Karl Fleming-Use of Alpha Factors in ROP: From my experience in supporting several utilities in the ROP of several specific events and in the review of this draft report, I think there is a need to clarify what it is we are trying to do in the ROP. I believe it would be helpful to make a clear distinction between two distinct steps that I believe are muddled together and confused in this draft report and in the application of the concepts in this report to actual events.
Step 1 is to achieve a risk characterization of the event, i.e. to describe what happened in factual terms including how it did or did not impact plant performance or material condition of the plant. Such questions as the following should be addressed in Step 1:
- What was the root cause, and how the event is or is not represented in the existing relevant PRA models?
- Was it an initiating event or did it increase the likelihood of an initiating event?
- What was the sequence of events?
- Did it involve failure or degradation of one or more components according to some well-defined success criteria?
- What was the role of the plant operators and how did they respond?
I believe this step needs to be very well defined before any attempt is made to evaluate its risk significance. Also, in principle this step should be largely deterministic. It is recognized that there may be uncertainties in this step, but let's first start with the facts, and those should include the facts documented by the plant owner in the event report and the results of any root cause evaluations that may have been performed.
Step 2 is to evaluate the risk significance of the event. In my mind, what this involves is a process where one asks the question: If the same event or condition were to re-occur in the future at the same plant, perhaps under different circumstances, what would be the risk impact? In this step it is very important that the reoccurrence of the event be characterized as it is in Step 1 while using the existing PRA model to quantify the appropriate risk metric, which in some cases might be the conditional probability of core damage or large early release, change in annual average CDF, or incremental conditional core damage probability, depending on the nature of the event.
In my view these two steps are muddled together in the draft report and in the way the concepts in this draft report have been used in recent significance determinations. When we are in Step 1 and are evaluating whether the event in question is a CCF or not, this process should be no different than the process used to code impact vectors in the CCF data base according to existing coding guidelines. The probability that any number of components did in fact fail or were degraded in a short period of time has to be determined by the facts of the event in Step 1. The impact vector method allows for the expression of uncertainties by using a probabilistic impact vector. Application of some generic alpha factor from the CCF data base for this purpose is just plain wrong. I may have misunderstood the report in this regard, but that was the impression I was getting from the description in the report. The existing generic alpha factor estimates are averages over prior historical events in the industry and do not have anything to do with a particular new event. The only meaningful role of alpha factors from previously analyzed data should be confined to Step 2 and that would be to calculate the probability of some other CCF events that are independent of the event being analyzed but part of the existing PRA model used to calculate risk significance. The distinction between these two very different kinds of CCF treatments in Step 1 and Step 2 is completely obscured in this draft report. This makes it impossible for us to understand how the CCF models are being used in this application.
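The probabilistic impact vector mentioned in this comment can be sketched as a weighted average over alternative hypotheses about what happened in the event. The sketch below is a simplified illustration under that assumption; the function name, hypotheses, and weights are hypothetical and are not taken from the CCF database coding guidelines.

```python
def average_impact_vector(hypotheses):
    """Combine alternative event interpretations into one impact vector.

    hypotheses is a list of (weight, impact_vector) pairs. Each impact
    vector is [P_0, P_1, ..., P_m] for a CCCG of size m, where P_k is the
    probability that exactly k components failed from the shared cause.
    The weights are assumed to be analyst-assigned and must sum to 1.
    """
    total_weight = sum(w for w, _ in hypotheses)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("hypothesis weights must sum to 1")
    size = len(hypotheses[0][1])
    if any(len(vec) != size for _, vec in hypotheses):
        raise ValueError("all impact vectors must have the same length")
    return [sum(w * vec[k] for w, vec in hypotheses) for k in range(size)]

# Hypothetical two-component group: the analyst is 90% confident that only
# one component failed and 10% confident that both failed from a shared cause.
impact = average_impact_vector([
    (0.9, [0, 1, 0]),  # hypothesis 1: single failure
    (0.1, [0, 0, 1]),  # hypothesis 2: both components failed
])
```

The resulting vector expresses the analyst's uncertainty about the specific event, in contrast to a generic alpha factor, which is an average over many prior events.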
NRC Response: Within the context of the SDP, step 1 is completed when the performance deficiency has been identified. The scope of the SDP risk assessment is to determine the risk significance at the performance deficiency level (i.e., proximate cause). The observed failure is needed to enter the SDP process (i.e., evidence of the performance deficiency); however, the CCF potential is conditioned on the performance deficiency and not the observed failure mechanism.
Further, the methods described in the NUREG report do not constitute a change in staff position and are consistent with long-standing practices for addressing CCF in ECA.
Karl Fleming-Definition of Common Cause Failure: There are some issues with the definitions of CCF in the draft report and when combined with some new concepts such as the failure memory approach appear to be leading to a redefinition of CCF which I believe may lead to gross distortions in the evaluation of risk significance. The draft report starts with the definition of common mode failure from WASH-1400 and some definitions of dependent failures from NUREG/CR-2300, but curiously does not mention the definition from the ASME/ANS PRA Standard which the NRC has endorsed in RG 1.200. That definition is:
Common cause failure: A failure of two or more components during a short period of time as a result of a single shared cause.
This definition has been fully vetted by the consensus standards process and is the definition that best matches the common cause models currently being used in PRAs including MGL and Alpha Factor. The definition given in this draft report at the beginning of Section 1.3 is:
Common cause failure: When two or more components fail within the PRA mission time window as a result of a shared cause.
The introduction of a new definition that is different than the one in the standard is not helpful as it does not offer an improvement. Rather, it introduces a subtle yet significant change in the meaning. Ambiguity is introduced by the use of the term mission time. If the mission time is not short, the definition is not valid. It does not work with normally operating systems whose failure may cause or contribute to an initiating event and for which there is no short mission time defined. Mission time, if it is reasonably short such as 24 hours or less, may be ok for a kind of rule of thumb for analyzing data but the original concept is that the failures must occur in a sufficiently short period of time so that the occurrence cannot be satisfactorily explained as an unfortunate combination of random independent failures.
When there is an argument about how short is short, the argument can be easily settled by calculating the probability of observing multiple independent failures. In some cases the condition being evaluated in the ROP can be best explained as a condition where the failure rate for independent events has increased relative to that used in the base PRA. There needs to be a part of Step 1 that resolves whether this is the case without prematurely invoking a common cause description. The draft NUREG seems to convey a bias towards assuming that the event is a CCF whether it meets the definition or not. Also the use of a PRA model construct of mission time to define a physical event does not contribute to getting the facts straight in Step 1. The ASME/ANS standard definition should be used unless the authors have an improved definition that can be considered for incorporation into the standard and then subjected to the consensus vetting that other changes to the standard are subjected to.
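The calculation referred to in this comment, the probability of observing multiple independent failures within a given window, can be sketched as follows. The sketch assumes identical components that fail independently at a constant rate (an exponential model); the function name and the rate and window values are hypothetical.

```python
from math import comb, expm1

def prob_at_least_k_independent(n, k, rate_per_hour, window_hours):
    """P(at least k of n identical, independent components fail in the window)."""
    # Single-component failure probability within the window: 1 - exp(-rate * t).
    p = -expm1(-rate_per_hour * window_hours)
    # Binomial tail: sum the probabilities of exactly j failures for j = k..n.
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

# Hypothetical example: two components, each with a failure rate of 1e-4 per
# hour, both failing independently within the same 24-hour window.
p_both = prob_at_least_k_independent(n=2, k=2, rate_per_hour=1e-4, window_hours=24.0)
```

If the probability computed this way is very small compared with how often such coincident failures are actually seen, independent causes alone are an unlikely explanation, which is the test the commenter proposes.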
NRC Response: The NUREG has been revised to include the ASME/ANS PRA standard definition of common-cause failure. However, the NRC staff believes that the phrase "a short period of time" contained in the ASME/ANS PRA standard definition is vague. Component failures have to be within the PRA mission time to be meaningful and we have found no reason that common-cause failures should be any different. As such, Section 2.1.3 in the revised NUREG states that a short period of time is synonymous with the mission time, if it is sufficiently short (e.g., 24 hours). From a practical perspective, the application of mission time is more of a consideration for data collection and analysis and is beyond the scope of this report.
Karl Fleming-Expanding CCF models to Non-Common Cause Events: My biggest concern with the concepts discussed in this report, and the reason for the large disconnects we are seeing between NRC-performed and industry-performed significance determinations, is the apparent use of CCF models to characterize events that are not common-cause events. When I read Section 1.4, in the description of how the ECA process is applied to an event involving a single (independent) failure, I cannot determine whether I am in Step 1 or Step 2 of the process I outlined in the beginning of my letter. If an event only involves a single failure it is clearly not a CCF. If there are repeated failures of the same component it may suggest a problem with the maintenance process and perhaps an indication of an increased failure rate but this is not a CCF. Even if the event or conditions involve two or more failures, if they are not sufficiently close in time, such that they are satisfactorily explained by multiple independent failures, they are not CCFs. Much of the discussion in this section appears to describe non-common cause events as CCFs. In those cases that are not CCF, when I do Step 1 the reoccurrence should be modeled as an independent failure or failures and the possibility of a CCF should only be considered in Step 2, that is, the probability that there are some other failures (independent or common cause) that are in the PRA model and could occur INDEPENDENT of the Step 1 event. So in these cases where there is no CCF in Step 1 the only valid use of a CCF model would be to model some other unrelated CCF events that happen to be in the PRA model. Conversely, if there is some potential for CCF in Step 1, the methodology for treating that probabilistically is not the CCF models (e.g. MGL, Alpha) but rather the impact vector method that was introduced in NUREG/CR-4780 and later refined and used at INL to code events into the CCF database.
A related concern is the discussion about proximate causes and programmatic actions and how CCF models are being used to address that. The CCF models that are being manipulated here were not designed and are not capable of assessing the increased probability of CCF that may be perceived in the review of the causes of the event, e.g. poor maintenance practices. In the past the NRC funded research to investigate whether one could quantify the impact of organizational factors in PRA and the conclusion of that research that I recall (I was on the peer review team for that) was that this is well beyond the state of the art of PRA. I do not think that situation has changed. As a principal author of many of these CCF models and NUREG/CR-4780 I can emphatically state that these models and the data that has been collected for use with them are not capable of quantifying the increase in CCF probability due to perceived organizational weaknesses.
Certainly any existing generic alpha factor estimates derived from service data do not accomplish that. Furthermore, if one were able to do that, which we are not, it would be necessary to baseline this increase against some kind of industry average organization capabilities. In my personal opinion, this baseline is reflected in the service data.
As a final comment I want to bring to the authors' attention some insights about the nature of causes of CCF and how those causes compare with those of independent component failures. In the early stages of the EPRI research on CCF that was conducted in the early to mid-1980s we published our first report on our analysis of experience data involving dependent events. In our report Classification and Analysis of Reactor Operating Experience Involving Dependent Events, prepared for Electric Power Research Institute, PLG-0334, January, 1984, we identified events involving dependent and independent failures and classified their causes. What we found upon collecting and analyzing several thousand events is that the causes of independent failures and CCFs were essentially the same. That is, the root causes of CCF are not unique or distinct from the root causes of independent failure.
One way to characterize the causes of a CCF event is to identify the root causes of the event and the coupling mechanisms that may exist to link the cause of failure to two or more components at the same time or in the same time frame. What is unique about CCF is not the root cause but rather the existence of a coupling mechanism that helps explain why the multiple failures occur at the same time. Hence the approach described in the draft NUREG of just looking for shared or common causes is not sufficient to identify CCF potential. If you are not identifying the coupling mechanisms that link the root causes to the equipment and synchronize the multiple failures, it is unlikely that focusing on shared causes will be fruitful. One can take any two failures that occur in a nuclear power plant and find shared causes, but that does not make them CCFs. Essentially all the incandescent light bulb failures that have ever occurred have been due to the same degradation mechanism, thermal fatigue of the filament. That does not mean that they are all CCFs. Absent a coupling mechanism two or more light bulb failures seldom if ever occur in a short interval of time. If they did there must be a coupling mechanism for them to be classified as a CCF.
NRC Response: We appreciate the commenter's observation that the root causes of CCF are not unique or distinct from the root causes of non-CCFs and the associated importance of coupling mechanisms for CCF. The purpose of the SDP risk assessment is to determine the risk significance of the performance deficiency, which is associated with the higher level organizational factors that led to the observed failure. The definition of common cause component groups (CCCGs) is beyond the scope of this NUREG, but the CCCGs are defined based on the existence of linkage factors between the redundant components within the CCCG. As such, if a single component is determined to be failed due to a licensee performance deficiency, the risk assessment will include the CCF potential of redundant components within the same CCCG since the organizational factors leading to the performance deficiency can propagate to the redundant components in the CCCG via the pre-existing coupling factors. The determination of whether the failure is independent or a common-cause failure is part of the parameter estimation process and is beyond the scope of the revised NUREG. If a single performance deficiency (i.e., shared cause/coupling factors) results in the failure of multiple, redundant components, then a CCF has occurred and will be treated as such in the SDP risk assessment. See NRC response to NEI comment on Insufficient Discussion on Consideration of Failure Cause for additional information.
Palisades-Foreword: This initial discussion of the proposed approach to conduct common cause analysis under the premise that it not be constrained to the same piece part or subcomponent or the same failure mechanism, is contrary to any existing PRA methodology for conducting common cause analysis.
This document attempts to move the concept of cross cutting issues into the PRA model which may or may not be consistent with the approach to implement common cause contribution in the model. A principal issue is the guidance provided does not include any direction to provide a basis to support the conclusion that the performance deficiency is directly tied to the component failures being observed and that all CCFs identified would be expected to occur within the PRA mission time of a single event. The approach is to elevate the definition of performance deficiency to the broadest definition that can be shown to encompass the event in question. The elevated description then makes the assertion that many other CCFs beyond those identified would be possible and not necessarily constrained to the common cause group of the component(s) failed in the event under consideration.
For example, the following is from IM 0308 Attachment 3:
The staff is responsible to define licensee performance deficiencies. Where the proximate cause of multiple degraded conditions is the same, there is likely to be only one finding (based on the identified performance deficiency related to the proximate cause) and the risk impact of the collective degraded conditions (including any overlapping conditions) is then appropriately used as the basis for the SDP result. However, this concept could be taken to an extreme of defining all licensee performance deficiencies as management weakness or something similarly fundamental. Doing so would then cause all degraded conditions to be manifestations of a single and possibly never-ending finding, would make unnecessary the need for an Action Matrix, and may require the staff to devise a continuous risk meter or similar substitute for the Action Matrix. Thus, a floor was set for the implementation of this concept that is consistent with the ROP framework, in that no performance deficiency should be defined at a level associated with the ROP cross-cutting issues (i.e., human performance, safety-conscious work environment, and problem identification and resolution) or more fundamentally. Although artificially setting this floor may create a philosophical inconsistency with use of a probabilistic thinking framework (i.e., if there is really a known common-cause effect taking place, then it should be explicitly acknowledged in a probabilistic model), it remains necessary for practical reasons as long as the Action Matrix continues in its present form. Concerns about possible insufficient regulatory responses arising from this approach are also mitigated as noted below.
It is considered that this document's premise is inconsistent with the above.
This document's approach goes on to state that, given the definition of the performance deficiency, the current assignments in the PRA model for common cause grouping may no longer be applicable, and the failure mechanisms considered as the common element of group failure are no longer a constraint on the number of components that would be the target group for CCF. The issue is that the definition of the performance deficiency at this high level (e.g., poor maintenance process) represents an unbounded characterization of commonality among a group of components. This would allow cross system groupings, and grouping of dissimilar components, into much larger common cause groups. There is no guidance that mandates the development of a technical basis that would establish the connection of the specific observed failure(s) to the entire common cause group.
NRC Response: The treatment of potential CCF in SDP risk assessments is limited to redundant components within the same CCCG (see Section 2.1.5 in the revised NUREG). The fact that components are within the same CCCG indicates that dependencies between these redundant components do exist. In an example of inadequate maintenance process that led to a failure of a component, the SDP risk assessment would be limited to determining the risk significance of the performance deficiency of the failed component including the CCF potential of any redundant components in the same CCCG. The evaluation of CCF potential would not be expanded to cover all components that maintenance is performed on. However, the evaluation of CCF potential would not be limited to the piece-part that caused the observed failure (i.e., CCF potential exists even if the redundant components do not have the affected piece-part).
Palisades-Foreword: The summary discussion in the foreword also states that it describes technical issues with the consensus CCF model used in PRAs, and the associated parameter estimates and data upon which they are based. The principal issue with this description is that the industry has developed several standards (ASME/ANS) to establish a baseline consistent methodology of implementing risk assessment.
Plants are required to undergo review by external organizations to establish the degree of implementation/compliance with these standards. The authors herein have determined that the current standards are inadequate and infer that the implementation of this approach provides a method of quantifying risk assessments that will correct these deficiencies.
The guidance provided is a proposed means of correcting issues with the current consensus model without having been subjected to the same process of development as the current standards. Moreover, any issues with the current standards should be resolved within the standards process prior to issuing contrary guidance. If there are legitimate issues with the ASME/ANS standard process, it needs to be corrected first.
NRC Response: The summary discussion in the Foreword section has been modified in the revised NUREG to provide further clarity. This NUREG is only applicable to the treatment of CCF potential in a specific application (SDP risk assessments). It does not provide guidance on CCF modeling (including parameter estimation), which the ASME/ANS PRA standard addresses. However, the use of the alpha factor model is fully consistent with the PRA standard (see Supporting Requirement DA-D5 in ASME/ANS RA-Sb-2013).
Palisades-Section 1, First Two Paragraphs: This discussion is somewhat vague. When an event happens and core damage does not occur, the conditional probability of core damage is zero. What is being computed is the likelihood that, if such an event or similar event were to occur again under the same boundary conditions that existed when the actual event occurred, additional failures would occur to produce core damage. The key is that the probability of what happened is not being evaluated, but what could happen if the event were to occur again.
NRC Response: We agree with the comment that the purpose of event and condition risk assessment is to assess the probability of core damage should the event or condition occur in the future. In order to better focus the NUREG, the paragraphs referred to in the comment have been eliminated in the revised NUREG.
The long-standing ECA practice uses the concept of the failure memory approach, which specifically models failures in the analysis (remembered); however, successes are treated probabilistically (typically by the base PRA failure probabilities).
Palisades-Section 1, 3rd Paragraph: When an event occurs and the cause of the event is determined, the conditional probability of it being a common cause failure or independent failure is either 1 or 0. There may be uncertainty in determining this, so one might assign some probability that it was a common cause using engineering judgment, but this should not be compared with CCF model parameters.
NRC Response: In SDP risk assessments, equipment observed to be failed is set to TRUE (i.e., probability of 1.0), and equipment that performed successfully is kept at its nominal failure likelihood and handled consistently with the underlying reliability modeling. The use of the failure memory approach does not credit successful operation of components (whether from an actual demand or post-event testing). As previously noted by the commenter, crediting successes would cause the risk significance of an event and condition assessment to be zero if core damage did not occur. The conditional CCF probability of redundant components within the same CCCG is not set to TRUE unless an actual CCF event occurred. In addition, the conditional CCF probability is not set to FALSE, even if the components operate successfully during the event or subsequent testing. In SDP risk assessments, the CCF of the redundant components within the same CCCG (i.e., shared coupling factors) is conditioned on the observed failure and a shared cause (i.e., the performance deficiency). See Appendix E of NUREG/CR-5485 for additional information on CCF equation formulations for ECA. Volume 2 of NUREG/CR-7039, SAPHIRE 8 Technical Reference, provides additional information on the SAPHIRE software manipulations performed as part of CCF probability calculations.
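The failure memory treatment described in this response can be illustrated with a minimal sketch for a two-train system in which train A was observed failed. The function name, the numerical values, and the simple combination of independent and common-cause contributions are assumptions made for illustration only; they are not taken from the NUREG or from SAPHIRE.

```python
def remaining_train_failure(p_indep_b, p_ccf_conditional, credit_success=False):
    """Probability that train B also fails, given train A was observed failed.

    p_indep_b:         nominal independent failure probability of train B
                       (hypothetical value supplied by the caller)
    p_ccf_conditional: CCF probability conditioned on the observed failure
                       of train A (hypothetical value supplied by the caller)
    credit_success:    if True, the observed success of train B is credited,
                       which is NOT what the failure memory approach does
    """
    if credit_success:
        # Crediting the observed success removes train B from the analysis,
        # so the conditional risk collapses to zero.
        return 0.0
    # Failure memory approach: train A is TRUE (probability 1.0), while
    # train B stays probabilistic. B fails if either the conditional CCF
    # or its own independent failure occurs.
    return 1.0 - (1.0 - p_indep_b) * (1.0 - p_ccf_conditional)


# Illustrative (assumed) inputs, not plant data.
failure_memory = remaining_train_failure(1.0e-3, 5.0e-2)
success_credited = remaining_train_failure(1.0e-3, 5.0e-2, credit_success=True)
print(failure_memory, success_credited)
```

Under these assumed inputs, the conditional CCF term dominates the result when the failure memory approach is used, whereas crediting the observed success drives the result to zero, which is the outcome the response cautions against.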
Palisades-Section 1, 4th Paragraph: The guidance above states that, while common cause contribution has been shown to be a significant contributor in past ECAs, there have been issues in not appropriately characterizing the common cause potential perceived to be associated with the observed events. The approach states the problem is related to being overly specific in the statement of the performance deficiency which restricts the focus of the risk assessment.
Therefore, the deficiency description should be elevated and broadened to a level commensurate with the definition of the cornerstone or the general requirements of the Quality Assurance Program Elements. While this is appropriate in the context of determining the possible association of several different events into a depiction of broader organizational issues, it also raises two concerns:
- 1. Statements of performance deficiencies at this level result in unbounded issues which makes it difficult to impossible to demonstrate issue resolution.
- 2. The association of component failures from several different events that have been encompassed by this broadened deficiency definition may not have been shown to be connected by a direct common cause.
The issue is that deficiency definitions at this level are self-fulfilling with respect to any group one would choose to create. The definitions become so vague that anything can be postulated to belong to the group.
Almost all equipment failures that have ever occurred could be lumped into a single group as long as we are willing to discuss causes at the proposed level (e.g., poor maintenance processes). In addition, the depiction of the deficiency in this broader characterization to assess several different events that occurred over some extended period of time ignores any correlation that would have established the probability of the different events occurring within a single event response.
PRA models have not been developed to accommodate this type of assessment of organizational issues and there is no data to support the quantification, as is being proposed in this document. This type of assessment has historically been a qualitative determination of the level of significance of the possible impacts of several disparate but similar events.
Current PRA models are not developed with the capability to perform this type of assessment. To now provide a methodology that would superimpose this type of assessment onto a PRA would be subject to subjective determinations, and gross over or underestimation of the risk contribution.
NRC Response: SDP assessments are based on the actual failures observed during the event or condition of interest. Although the performance deficiency may indicate potential to impact equipment outside the failed component's CCCG, current state-of-practice CCF models are not equipped to treat inter-system CCF. Given this limitation, the treatment of CCF potential is limited to redundant components within the same CCCG.
Palisades-Section 1, 4th Paragraph: It is unclear what is meant by proximate cause. This should be better defined.
NRC Response: Revised text in the NUREG provides additional description on what is meant by the term proximate cause in both the SDP (i.e., IMC 0308, ) and its relation to the root cause (as defined in NUREG/CR-5485). See Sections 1.3 and 2.1.1 in the revised NUREG for additional information.
Palisades-Section 1.1, CCF Definition: Per the ASME/ANS PRA standard, CCF is defined as:
Common cause failure: a failure of two or more components during a short period of time as a result of a single shared cause.
This definition brings in the concept of short time, which is only implied in the WASH-1400 definition. Note that the term was changed from common mode to common cause because the cause was the key to defining the failures in the same short time interval; failure modes can be common, but if the failures occur at different times they are not CCFs.
NRC Response: The definition contained in the draft NUREG is now replaced by the ASME/ANS definition for common-cause failure.
Palisades-Section 1.1, 4th Paragraph: Failure at the piece part level is not the same as a failure mechanism. The confusion occurs in the use of CCF models for the purposes they were not intended for.
NRC Response: Per NUREG/CR-5485, CCFs require only that the component failure be linked by a shared cause, not a shared failure mechanism, to redundant components within the same CCCG (and grouping components within the same CCCG indicates the components have shared coupling factors). A shared failure mechanism (which usually is exhibited on the same piece-parts) is not required to be present for there to be a potential for CCF.
Palisades-Section 1.1, Last Paragraph: This discussion does not make it clear that the times of the multiple failures must be synchronized. A failure due to poor maintenance practice and noting that the maintenance practice is shared by redundant components does not meet the definition of common cause. Poor maintenance practice could just as easily lead to higher independent failure rates than increased CCF potential.
NRC Response: This text has been modified and moved to Section 2.1.2 in the revised NUREG. We agree that multiple observed failures with a shared cause do not mean a common-cause failure has occurred unless the failures occur within the PRA mission time. However, the determination of whether failures could occur within the mission time is associated with data collection and parameter estimation. Therefore, the CCF parameters used for the SDP assessment already account for this mission time consideration. Note that data collection, analysis, and parameter estimation are beyond the scope of this NUREG.
Palisades-Section 1.1, Last Paragraph: In the discussion of Section 1.1, the argument is made that once the definition of the performance deficiency is elevated to a broader scope description, this is sufficient basis for expanding the existing common cause grouping to include any number of diverse components because they can be shown to be encompassed by the all-inclusive definition. Creating deficiency descriptions at this level creates a condition in which almost any failure that ever occurred could be considered part of the group because the over generalized cause statement cannot be proven incorrect. Consequently, this allows the focus to be shifted away from the actual component failures and their direct causes. Attempts to over generalize these conditions to estimate the risk of organizational weakness has not been the purview of PRA modeling and should not be.
The PRA model focus has been, and should continue to be, on maintaining the reliability of components credited in mitigating analyzed events. At no point has there been any discussion of the need to develop a basis for the connection of these events under one common theme. While it is appropriate to characterize events similar to the examples provided as poor maintenance processes as an example for the purpose of aggregating against the ROP cornerstones, or the broadly defined QA areas, it is not necessarily true that the elements of the maintenance processes are all necessarily failed or failed to the same degree.
Also, the failure to correctly implement a procedural requirement one time does not guarantee failure on the next occurrence. This must be demonstrated by providing evidence that the procedural requirement is routinely violated and that evidence exists in implementation of other procedures as well. Even in the case of additional examples, any suspect increase in risk should be restricted to the cases where the evidence is provided. Otherwise, the generalized statements of performance deficiency result, as was done in this document, in an overall indictment of an entire process which was not supported by any factual information.
This is the very issue that was raised in the SDP process to be avoided because of the likely gross over estimation of the risk significance.
NRC Response: The purpose of the SDP is to estimate the risk significance of the observed failure or condition, including the potential for the performance deficiency to propagate to redundant equipment. However, the treatment of CCF potential in SDP risk assessment is limited to redundant components within the same CCCG (see Section 2.1.5 in the revised NUREG for additional information).
Palisades-Section 1.1, Figure 1: This model would apply equally well to maintenance causing increased independent failure rates or increased common cause potential. The model should include the time element.
NRC Response: The CCF model includes the time element, which appears in the coding requirements for cataloging CCF events. See NUREG/CR-6268 for additional information on the CCF coding guidance.
Palisades-Section 1.2, 1st Paragraph: An equally, or more important reason, is that causes are too numerous to mention and difficult to codify. The causes described here are general cause categories and are not defined sufficiently to determine the type of cause.
NRC Response: Causes are assigned (both specific and group) to all failure events. The current state-of-practice is that the data coder assigns a cause and independent reviewer(s) concur, consistent with operating experience data coding guidelines.
Palisades-Section 1.2, 2nd Paragraph: The idea of a conditional CCF probability is not carefully defined. Conditional CCF probability is not related to MGL or Alpha factors. Alpha factors are correlative and should not be used as surrogate conditional probability values. Moreover, as cited in this document ALPHA factors can be conservative or non-conservative. So if Alpha factors are applied as surrogate conditional probabilities, the conclusion is unclear.
NRC Response: The alpha factors are not surrogates for conditional probabilities; they are basic parameters of the underlying CCF model (similar to the factors in the MGL model). While there is a direct correspondence between alpha factors and the parameters of an MGL formulation, the alpha factor model was developed to be better aligned with data collection activities and ECA analysis. Different formulations of conditional CCF probability could be obtained using alternate CCF parameterization models.
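The correspondence between alpha factors and MGL parameters mentioned in this response can be sketched for a CCCG of size three. The sketch assumes the non-staggered testing form of the alpha factor model, in which Q_k = [k / C(m-1, k-1)] (alpha_k / alpha_t) Q_t with alpha_t = sum of k * alpha_k, and the usual MGL relations for m = 3. The function name and the numerical alpha values are hypothetical.

```python
def alpha_to_mgl_size3(alpha1, alpha2, alpha3):
    """Convert alpha factors for a CCCG of size 3 into MGL beta and gamma.

    Assumes a non-staggered testing scheme. The alpha factors must sum
    to 1.0, since each is the fraction of failure events involving
    exactly k components.
    """
    if abs((alpha1 + alpha2 + alpha3) - 1.0) > 1e-9:
        raise ValueError("alpha factors must sum to 1.0")
    # alpha_t weights each alpha factor by the number of failed components.
    alpha_t = alpha1 + 2.0 * alpha2 + 3.0 * alpha3
    # beta: fraction of a component's total failure probability that is
    # shared with at least one other component.
    beta = (2.0 * alpha2 + 3.0 * alpha3) / alpha_t
    # gamma: given a shared failure, fraction shared with both other components.
    gamma = 3.0 * alpha3 / (2.0 * alpha2 + 3.0 * alpha3)
    return beta, gamma


# Hypothetical alpha factors, chosen only to exercise the conversion.
beta, gamma = alpha_to_mgl_size3(0.95, 0.03, 0.02)
print(beta, gamma)
```

The conversion shows that the two parameter sets carry the same information under the stated testing assumption; a staggered testing assumption would lead to different expressions.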
Palisades-Section 1.3, Dependent Failure Definition: Stating the dependent failures must occur within the mission time window is somewhat vague. There are situations where the mission time might be long (i.e., months). If the mission time is short and the independent failure rate is high, multiple failures in short time intervals are not necessarily CCF.
NRC Response: The evaluation of potential CCF focuses on the PRA mission time window (i.e., failures of redundant equipment must occur within the PRA mission time to be of concern). Failures of redundant components within short time intervals are not necessarily a CCF; these failures must have a shared cause to be classified as an actual CCF (components within the same CCCG have already been determined to share the same coupling factors).
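The point that coincident failures within a mission time are not necessarily CCF can be illustrated by computing how often two redundant components would both fail within the mission time purely by chance. The sketch below assumes a constant failure rate (exponential time to failure) and identical, independent components; the function name and the numerical values are hypothetical.

```python
import math


def prob_both_fail_independently(failure_rate_per_hour, mission_time_hours):
    """Probability that two identical, independent components both fail
    within the mission time, assuming exponential times to failure."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    p_single = -math.expm1(-failure_rate_per_hour * mission_time_hours)
    return p_single ** 2


# Hypothetical rate and mission times, chosen only for illustration.
for hours in (24.0, 720.0):
    print(hours, prob_both_fail_independently(1.0e-3, hours))
```

Under these assumptions, a longer mission time or a higher independent failure rate raises the chance of coincident independent failures, which is why a shared cause, and not timing alone, is needed to classify an event as an actual CCF.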
Palisades-Section 1.4, Rule 1: This rule does not address the time element. If redundant components share a deficiency, it does not mean that the deficiency will increase the likelihood of a CCF. For example, the incandescent light bulb:
- All light bulbs share the same deficiency which explains why almost 100% of failures occur due to the same failure mechanism - thermal fatigue of the filament.
- However 99.999+% of all cases of light bulb failure due to this mechanism are independent failures. Many cases of shared deficiency can be explained by an increased failure rate.
NRC Response: The ground rules have been revised as key principles (see Section 3.2 in the revised NUREG). The time element is considered during data coding and parameter estimation activities, rather than during the SDP evaluation. With regard to the commenter's example, thermal fatigue of the filament is not a performance deficiency of the kind on which SDP evaluations are focused. However, if a maintenance program sets the mean time to failure (MTTF) as the time limit for replacing the light bulb, but this maintenance program was not followed, then a performance deficiency exists. This performance deficiency (i.e., failure to follow the maintenance program to replace the light bulb according to the MTTF) is the shared cause and, therefore, the CCF potential would exist for the other light bulbs.
Palisades-Section 1.4, Rule 1: The guidance states that the performance deficiency is not the degraded condition itself but is the proximate cause of the degraded condition. Note that degraded condition has crept into the guidance. PRAs do not typically analyze the impact of degraded conditions. More importantly, the guidance states that the determination of cause does not need to be based on rigorous root cause evaluation but can be based on reasonable assessment and judgment of the staff. Given the possible implications of the findings associated with the performance deficiency, a statement that rigorous evaluation is not required is not consistent with the potential consequences of such a judgment.
NRC Response: The NUREG does not redefine the performance deficiency. This paragraph is referenced from IMC 0308, Attachment 3, Significance Determination Process Basis Document. The purpose of SDP risk assessments is to determine the risk at the performance deficiency level and not the root cause. Therefore, once the performance deficiency is known, the SDP risk assessment can be evaluated and completed.
Palisades-Section 1.4, Rule 1: The guidance states that given the failure of one component in a CCCG, the analyst will use the conditional probability of CCF, given the observed component failure. It is recognized that the absence of additional failures during an event is not a guarantee that additional common cause failures could not have occurred. This guidance precludes any consideration of facts that could discount or substantially reduce the probability of common cause failure.
In addition, consideration of possible random failure of components that were known to be successful during the event response is not considered the same as arriving at the conclusion that the conditions necessary for a common cause failure of multiple components is present.
NRC Response: The ground rules have been revised as key principles (see Section 3.2 in the revised NUREG). The potential for CCF of redundant components within the CCCG is satisfied by the presence of a shared cause. For SDP risk assessments, the shared cause is at the performance deficiency level. The base PRA defines redundant components within the same CCCG if these components have common coupling factors.
Palisades-Section 1.4, Rule 2: This is counter to the ASME/ANS standard definition of CCF. The time element is key to what makes a failure common cause.
NRC Response: The ground rules have been revised as key principles (see Section 3.2 in the revised NUREG). While an actual CCF event requires the failures of redundant components (within the same CCCG) to occur within the same mission time, the treatment of CCF potential in SDP risk assessment is done probabilistically. Consistent with the long-standing ECA practice to not credit the successful running of components (i.e., failure memory approach), the successful testing or running of redundant components is not credited in eliminating potential increase in CCF. Note that CCF data coding guidelines (which support parameter estimation used for the SDP) consider timing factors when determining if an actual CCF event occurred; however, data collection, coding, and analysis are outside the scope of this NUREG.
Palisades-Section 1.4, Rule 2: The guidance regarding the impact of the time window for CCF or chance conditions states that consideration of the time window for CCF is irrelevant and contrary to the failure memory concept. Further, the guidance states that simply testing redundant components cannot provide proof that multiple dependent failures would not occur within the mission time window.
However, it has been long standing practice that upon discovery of a failed risk significant component, that redundant component(s) be immediately tested to verify that the failure is not present in those components. Implicit in this evaluation is an assumption that the tested components are available for the mission time. If we are to accept the premise that redundant components cannot be proven to not be subject to common cause failure during the mission time, then it is unclear why plants are not required to shut down immediately upon discovery of a failure of a risk significant component that can be characterized as a cause which can result in CCF.
Further, the guidance discounts the benefit of staggered testing which is a planned evolution based on the premise that such testing provides CCF cause failure.
Again the impact of staggered testing is left to the judgment of the analyst to decide whether CCFs could have occurred during the PRA mission time.
NRC Response: This ground rule has been revised into the third key principle (see Section 3.2.3 in the revised NUREG) to acknowledge that operability testing of redundant components within the same CCCG proves that a CCF did not occur; however, it does not eliminate the potential for CCF for the purposes of the SDP assessment.
Palisades-Section 1.4, Rule 2: For normally operating systems, judgments about time windows can be made when the failures are self announced.
NRC Response: See response to previous comment.
Palisades-Section 1.4.1, 1st Paragraph: The guidance recognizes the potential for over-estimating the risk significance of CCF by applying a particular failure mode to a CCCG in a PRA model where the failure does not apply to all components within the CCCG. It is not unusual for common cause groupings to be present in a PRA model for a limited set of failure mechanisms that apply to the group, but a full set of failure mechanisms typical of the component type may not be applicable to all components within the group.
However, the guidance cautions against alteration of CCCG boundaries to accommodate these design differences as the characterization of the performance deficiency as a broader based problem can couple the components despite any design differences that could preclude CCF of components within the group.
The Palisades model has separate CCCGs for which design differences come into play.
NRC Response: Because the scope of SDP risk assessments is at the performance deficiency (i.e., proximate cause) level, design differences will not typically affect whether the proximate cause is shared between redundant components within the same CCCG. Note that CCF potential is only considered for components within the same CCCG. If the design differences (along with other factors per the ASME/ANS standard) are sufficient to de-couple redundant components, then these components should not be in the same CCCG in the PRA model of record and, therefore, CCF potential would not be considered between these components in an SDP risk assessment. Note that NRC staff performing the SDP risk assessment will review the licensee's basis for the separation of the (similar) redundant components from the same CCCG to determine that they are sufficiently different. See Sections 2.1.5 and 3.2.4 in the revised NUREG for additional information.
Palisades-Section 1.5, 1st Paragraph: The guidance here suggests that events described in the examples provided are not currently considered CCF in the current version of the NRC CCF database and that they will be added in a future update. This represents another example of the implication of deficiencies in other processes that are theorized to underestimate the actual CCF parameters.
Moreover, the guidance suggests that analyses of events that involve these deficiencies will be within the SDP process via implementation of this NUREG process without having first addressed the issues in the underlying processes (ASME/ANS).
NRC Response: This text has been revised. These events (see Section 2.2 in the revised NUREG) are provided as examples of cases with CCF potential (because causes and coupling align), not as CCF events that have occurred nor to be cataloged in the CCF database as actual CCF events.
Palisades-Section 1.5, Hatch Example: There is not enough information in this example to properly evaluate. Were the failures observed here used to revise the failure rate estimate? Evaluating the CCF parameter must also include a look at the failure rate as it is used before judging the adequacy of the model.
NRC Response: This text has been changed (see Section 2.2.3 in the revised NUREG). Because a shared cause is present (at the performance deficiency level) and the redundant EDGs are within the same CCCG (i.e., have shared coupling factors), the calculation of the CCF potential is appropriate (given the scope of SDP risk assessments).
Palisades-Section 1.5, Dresden Example: In the discussion of the Dresden event, the argument by the licensee is discredited based on the elevation of the performance deficiency description to inadequate material control. This description allows for the arbitrary inclusion of a broader scope of components that results in substantially higher risk significance. At issue in this example is whether any factual information exists to support a conclusion that the broader group of components was subject to an elevated level of risk from specific performance deficiencies that could impact their performance.
The original deficiency was appropriately characterized as a failure to order proper parts and failure to detect the problem during receipt inspection (i.e., a material control deficiency in the broader context) with respect to one diesel generator. However, all other diesel generators had the appropriate part installed and no other material control deficiencies were identified with respect to any of the other diesel generators. In fact, the example states that because of the elevated description of the deficiency all other diesel generators become suspect and assigned increased probability of failure as a group.
The example argues that because of the elevated description, the issue is now about any other possible failure mechanisms that could result from inadequate material control not just lube oil strainers (the original issue) without having any evidence that material control deficiencies currently exist that could result in failure of the remaining diesel generators. There is a certain level of guilt by association and implication in this approach.
NRC Response: This text has been changed (see Section 2.2.2 in the revised NUREG for additional information). The point that an inadequate control of purchased material could extend beyond the EDG CCCG is possible, but the NUREG limits consideration of CCF to components within the same CCCG. Note that SDP risk assessment is focused on the performance deficiency (e.g., inadequate control of purchased material) and not the observed failure mechanism.
Palisades-Section 1.5, Calvert Cliffs Example: In the Calvert Cliffs example, a diesel generator experienced a failure due to a design feature unique to that diesel. This diesel incorporates a fan cooled radiator for engine cooling. The other diesel engines are water cooled and do not experience the additional in-rush current from the radiator fans on diesel start. This design feature represents a diversity of design that offsets at least some level of contribution from CCF as a consequence of a support system failure.
However, the guidance establishes the position that while the particular mechanism is not shared among the component group, it is sufficient to argue that other failure mechanisms exist between components within the group, and therefore that treating this uncorrelated failure as a basis to elevate the risk impact of CCF of the group is justifiable. If this approach holds, then any benefit from diversity of design is negated. This type of argument represents a self-fulfilling prophecy.
One can always argue that while a particular failure mechanism is not shared within the CCCG the presence of other failure mechanisms that could be considered shared is an appropriate basis for increasing the CCF probability of the group as a consequence of the existence of this unshared failure mechanism.
NRC Response: The Calvert Cliffs example has been eliminated in the revised NUREG. Note that design differences between redundant components within the same CCCG may not be sufficient (by themselves) to eliminate CCF potential at the performance deficiency (i.e., proximate cause) level. For example, redundant equipment with some different design features included within the same CCCG could still be vulnerable to shared maintenance or engineering design deficiencies.
Note that the CCCG should be designated in a manner consistent with the ASME PRA standard, which requires a review that considers similarity in the following factors: (1) service conditions, (2) environment, (3) design or manufacturer, (4) maintenance.
Palisades-Section 1.5, Comanche Peak Example: In the Comanche Peak example, a diesel generator had been painted immediately after a successful surveillance test and apparently failed the next surveillance test due to a failure to assure that the painting did not impact the functionality of the painted components. The performance deficiency was failure to adequately implement maintenance procedure(s). This approach implements the chance argument.
The information provided only identifies a single occurrence of the failure to implement this aspect of the procedure. The chance argument is predicated on the possibility that the other diesel might have been painted between subsequent diesel tests.
Given painting of diesel components is not considered a routine activity, this appears to be an overestimation of the probability of the second diesel becoming subject to the same condition. While it cannot be argued that this is impossible, the question is how probable was it? But, the process discounts conditions which would support lower probability of a common cause event. The argument provided also assumes that the isolated occurrence of the failure to implement the procedure represents a condition that guarantees future failure, and no credit can be considered for the procedure to prevent a second occurrence of the failure.
NRC Response: This text has been changed (see Section 2.2.1 in the revised NUREG). Because the redundant EDGs are within the same CCCG and there is a shared cause (i.e., the performance deficiency), the SDP risk assessment will consider the CCF potential as described in Appendix E of NUREG/CR-5485.
Palisades-Sections 1.4-1.6: Sections 1.4-1.6 of the draft NUREG describe the ECA ground rules for treatment of a component failure in a CCCG. The general approach described is that any component that fails will have some impact on CCCG failure probability regardless of the cause group failure probability.
For example, Pump A fails due to Cause X, and Pump B, a redundant pump, fails five years later due to Cause Y. Following the draft NUREG guidance, these failures could be treated as a common cause due to any performance deficiency. The performance deficiencies could be that the same preventive maintenance was performed, or the pumps have a similar design, or they are in the same room, or they operate at the same temperature, etc.
A further example could be a group of redundant components have been installed in a plant for 30 years. Over that period there have been three failures spaced 10 years apart. This data would typically be used to update the random failure probability for the components, but does not result in evidence that warrants an increase in the CCF probability simply because the components have the same design characteristics or operating environment.
NUREG/CR-6268 and NUREG/CR-4780 when defining common cause factors state:
The concept of a shared cause of malfunction or change in component state is the key aspect of a CCF event. The use of the word shared implicitly includes the concept of coupling factor or mechanism. In addition, the reference to a time interval between failures acknowledges the reliability significance of these events. Multiple component failures from a shared cause, but without affecting mission requirements, in a period required for performance are of little or no significance from a reliability point of view. It is the correlation of failure times and their simultaneity in reference to the specified mission time that carries their reliability significance. Often when the same cause is acting on multiple components, failure times are also closely correlated.
NUREG/CR-6268 further defines the timing factor for announced failures as within three times the PRA mission time.
There is no discussion in the draft NUREG of when plant specific evidence may be applied to update random failure probability when performing an ECA. The failures in the above examples would be more appropriately treated as an increase in the random failure probability. The components may constitute a CCCG, but if they are unreliable, this will be reflected in their individual random failure probabilities.
NRC Response: If redundant components are in the same CCCG within the base PRA, then coupling factors already exist between them. If one of the components fails and a performance deficiency associated with this failure is identified, then a dependency evaluation that determines the failure probability of the remaining components must be performed. Considering the pump example provided, two separate risk assessments would be performed (given a performance deficiency was identified for each), each of which would consider the potential CCF of redundant components within the same CCCG at the time of the observed failure. Therefore, an SDP risk assessment would be performed when pump A fails, which would consider the potential for the identified performance deficiency to result in a CCF of redundant components within the same CCCG (including pump B). Likewise, an SDP risk assessment would be performed when pump B fails, which would consider the potential for the identified performance deficiency to result in a CCF of redundant components within the same CCCG (including pump A). Because these pumps did not fail due to the same cause and were not in a failed state during the same time period, neither of these failures would be considered as CCF events in the CCF database, consistent with our data coding guidelines. Under the current methodology, the same alpha factors would be used regardless of the identified cause. In addition, the evidence could be used to inform the failure rate estimates. However, it is very rare for circumstances to justify new parameter estimates. Typically, when revised parameter estimates are sought, a Bayesian process is used that relies on an industry-average prior that is almost never influenced by a single event, or even by a single plant's entire operating history. Manipulations of the parameter estimates are considered beyond the scope of this NUREG.
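The Bayesian process mentioned in this response can be sketched with a conjugate beta-binomial update of a failure-on-demand probability. This is only an illustration of the mechanics: the prior parameters, the plant evidence, and the function name are hypothetical, and the response does not specify which prior distribution or update method the staff uses.

```python
def beta_binomial_update(prior_a, prior_b, failures, demands):
    """Update a Beta(prior_a, prior_b) prior on failure-on-demand probability
    with plant-specific evidence (failures observed in a number of demands).

    Returns the posterior parameters and the posterior mean.
    """
    if failures < 0 or demands < failures:
        raise ValueError("need 0 <= failures <= demands")
    post_a = prior_a + failures
    post_b = prior_b + (demands - failures)
    return post_a, post_b, post_a / (post_a + post_b)


# Hypothetical industry-average prior with a mean near 1e-3, and
# hypothetical plant evidence of one failure in 50 demands.
prior_a, prior_b = 5.0, 5000.0
prior_mean = prior_a / (prior_a + prior_b)
post_a, post_b, post_mean = beta_binomial_update(prior_a, prior_b, 1, 50)
print(prior_mean, post_mean)
```

How far the posterior mean moves from the prior mean depends on how much evidence the assumed prior represents relative to the plant-specific data, so the numbers above should not be read as representative of any actual parameter estimate.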
Palisades-Section 2.2.1: There is no basis for the approach of using the alpha factor in the baseline PRA model as an estimate for the conditional probability that an event is a CCF. The CCF model is being implemented in a manner that it was never intended to be used for. The alpha factor is correlative and not a conditional probability. Again this surrogate use of an alpha factor may be driven by the goal of creating a fast and dirty conditional probability value but any conclusion is suspect.
NRC Response: The alpha factors are parameters of an accepted CCF model that is widely used by NRC and within industry, and is fully consistent with the PRA standard (see Supporting Requirement DA-D5 in ASME/ANS RA-Sb-2013). Algebraic reduction of the conditional probability expression for the CCCG affected by an observed failure leaves a complicated function of alpha factors, but that in no way implies the alpha factors are to be treated as conditional probabilities. See Appendix E of NUREG/CR-5485 for additional information.
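The distinction drawn in this response, that a conditional probability is a function of the alpha factors and not an alpha factor itself, can be illustrated for the simplest case of a CCCG of size two. The sketch assumes the non-staggered testing form of the alpha factor model and takes the conditional CCF probability to be Q_2 / Q_t; both are simplifying assumptions made here, and the formulations in Appendix E of NUREG/CR-5485 may differ. The function name and alpha values are hypothetical.

```python
def conditional_ccf_size2(alpha1, alpha2):
    """Illustrative conditional probability that the second component in a
    CCCG of size 2 fails from the shared cause, given one observed failure.

    Assumes non-staggered testing, where
        Q_1 = (alpha1 / alpha_t) * Q_t
        Q_2 = (2 * alpha2 / alpha_t) * Q_t
        alpha_t = alpha1 + 2 * alpha2
    and takes the conditional probability as Q_2 / Q_t, so Q_t cancels.
    """
    if abs((alpha1 + alpha2) - 1.0) > 1e-9:
        raise ValueError("alpha factors must sum to 1.0")
    alpha_t = alpha1 + 2.0 * alpha2
    return 2.0 * alpha2 / alpha_t


# Hypothetical alpha factors.
alpha1, alpha2 = 0.97, 0.03
conditional = conditional_ccf_size2(alpha1, alpha2)
print(alpha2, conditional)
```

Even in this small case the result differs from alpha2 itself under the stated assumptions; for larger groups the expression involves more terms.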
Palisades-Section 2.2.1: This treatment arbitrarily brings in some knowledge to update the PRA model and excludes others. If it is determined that the remaining components are not affected by the cause of the first failure, it should be modeled as an independent event.
NRC Response: Given the scope of the SDP, if the proximate cause at the performance deficiency level exists between redundant components within the same CCCG, then calculation of the CCF dependency is justified. The long-standing ECA practice uses the concept of the failure memory approach, which specifically models failures in the analysis (remembered); however, successes are treated probabilistically (typically by the base PRA failure probabilities). Therefore, a successful demonstration of operability for redundant components does not rule out the potential for CCF failure in ECA.