ML12006A067
| ML12006A067 | |
| Person / Time | |
|---|---|
| Site: | Palisades |
| Issue date: | 09/28/2011 |
| From: | Lucius Pitkin |
| To: | Blind A Entergy Nuclear Operations, NRC/RGN-III |
| References | |
| EA-11-241, Meeting Notice - ML113530619, EA-PSA-SDP-P7C-11-06, F11358-LR-001, Rev. 0 | |
Boston Area Office 36 Main Street, Amesbury, MA 01913 Tel: 978-517-3100 Fax: 978-517-3110 www.lpiny.com Boston, MA New York, NY Richland, WA TM Lucius Pitkin, Inc.
September 28, 2011 LPI Ref. F11358-LR-001, Rev. 0 Mr. Alan Blind Entergy Nuclear Operations, Inc.
Palisades Nuclear Plant 27780 Blue Star Memorial Highway Covert, MI 49043
SUBJECT:
Past Operability Assessment of Service Water Pumps P-7A and P-7B associated with As-found Evaluation of Pump Shaft Couplings - Palisades Nuclear Plant LPI Project No. F11358 Entergy Contract No. 10325528
Dear Mr. Blind:
Lucius Pitkin, Inc. (LPI) is presently supporting Entergy Palisades Nuclear Plant (PLP) with an evaluation and assessment of Service Water System (SWS) pump couplings, following the failure of in-line pump shaft coupling No. 7, for Pump P-7C, as described in condition report CR-PLP-2011-03902 [2]¹.
Palisades has requested a past operability assessment of service water pumps P-7A and P-7B, relative to the ability of the pump's shaft couplings to perform their function for a mission time of 30 days. At this time, LPI has examined couplings No. 4, 5, 6 and 7 removed from Pumps P-7A and P-7B, with respect to the failure assessment described within LPI Report F11358-R-001 [3] associated with the failure of pump P-7C coupling No. 6.
1.0 BACKGROUND
Failure of two pump couplings, No. 7 coupling in 2009 (referred to herein as 09-P7C-7F)² and No. 6 coupling in 2011 (11-P7C-6F), occurred on the P-7C pump, as described in [1 and 2]. The couplings were fabricated of ASTM A582 Type 416 stainless steel (SS) material. A failure evaluation of these couplings was performed as described in [3], which identified the failure mode as intergranular stress corrosion cracking (IGSCC), resulting from a susceptible material (a martensitic steel found to have relatively low fracture toughness) operating in a corrosive environment and subjected to a threshold tensile stress. The report [3] also identified a crack approximately one-quarter through the wall in another coupling from the P-7C pump: coupling No. 7 (11-P7C-7K) that exhibited similar stress corrosion cracking (SCC) characteristics.
¹ Numbers in brackets, [xx], refer to References listed in Section 5.0.
² Coupling naming convention used herein and in [3] refers to year of failure or examination-pump identity-coupling identifier. The F or K term is added if the identified coupling failed or cracked, respectively. Thus the 2009 failure of pump P-7C coupling number 7 is identified as 09-P7C-7F.
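The coupling naming convention in footnote 2 can be read mechanically. The following is a minimal sketch, assuming identifiers always take the form year-pump-coupling with an optional F or K suffix; the names `CouplingId` and `parse_coupling_id` are hypothetical and are not part of the LPI report.

```python
import re
from typing import NamedTuple, Optional


class CouplingId(NamedTuple):
    year: int               # two-digit year of failure or examination
    pump: str               # pump identity as written in the identifier, e.g. "P7C"
    coupling: int           # coupling number, 1 through 8
    status: Optional[str]   # "F" = failed, "K" = cracked, None = no suffix


# Pattern assumed from the examples in the text, e.g. "09-P7C-7F" or "11-P7A-4".
_PATTERN = re.compile(r"^(\d{2})-(P7[ABC])-([1-8])([FK]?)$")


def parse_coupling_id(text: str) -> CouplingId:
    """Split an identifier such as '09-P7C-7F' into its parts."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a coupling identifier: {text!r}")
    year, pump, number, suffix = match.groups()
    return CouplingId(int(year), pump, int(number), suffix or None)


print(parse_coupling_id("09-P7C-7F"))
print(parse_coupling_id("11-P7A-4"))
```

Under this reading, 09-P7C-7F is coupling 7 of pump P-7C, failed, 2009, and an identifier without a suffix denotes a coupling with no reported failure or crack.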
Each SW pump features eight (8) couplings, numbered 1 through 8. The present assessment has been focused on coupling Nos. 5, 6, and 7 for all service water pumps, since they are subjected to wet and dry cycles. Couplings 6 and 7 are subjected to wet and dry cycles dependent only upon pump usage state (i.e. on or off). Coupling 5 at approximately elevation 579' is dependent upon pump usage and water level (see Figure 1-1). Depending upon the time of year, water level in the service water basin (i.e. Lake Michigan level) ranges from elevation 576' to 580'. It is postulated in LPI report F11358-R-001 [3] that the wet/dry cycle enables the chlorides in the service water to concentrate at the thread roots of the coupling when the water drains from the coupling. This postulate is supported by fluorescent magnetic particle inspection testing (MT) of coupling 4 of each pump resulting in no indications. Unlike couplings 5, 6 and 7, coupling 4 is continuously submerged and does not typically experience wet/dry cycles.
Chlorides are present in the raw service water (water from Lake Michigan) in concentrations of approximately 9.7 ppm. The service water system is chlorinated on a daily basis to control algae and other microbes. After chlorination, the chloride levels increase to approximately 10 ppm. Thus, chlorination has little impact on the chloride levels in the service water. Chlorides in a high-humidity, oxygen-rich environment are known to be corrosive agents resulting in IGSCC of martensitic stainless steels such as 416 SS.
2.0 EXAMINATION/TEST RESULTS
2.1 Pump P-7A Couplings
Pump P-7A coupling numbers 4, 5, 6 and 7 (coupling identifiers: 11-P7A-4, 11-P7A-5, 11-P7A-6, and 11-P7A-7, respectively) were submitted to LPI for examination on August 30, 2011. A photograph of the as-received coupling 11-P7A-7 is provided in Figure 2-1. Couplings 4 through 7 were visually examined and inspected by fluorescent magnetic particle inspection testing (MT). Couplings 5 through 7 were hardness tested, tensile tested, Charpy V-Notch (CVN) impact tested and analyzed for material composition using methods as described in [3] for the P-7C pump couplings.
The results of the visual examination and MT inspection are shown in Figures 2-2 and 2-3, respectively. The threads of the P-7A couplings were found to be coated with lubricant, which has previously been identified by PLP maintenance as Neolube [4]. Following cleaning, the threads of couplings 4 through 7 were MT inspected and did not exhibit indications of linear flaws.
The couplings were tested with results summarized for: CVN impact energy in Table 2-1; Tensile Strength in Table 2-2; Material Composition in Table 2-3; Surface Hardness in Table 2-4, and Through Thickness Hardness in Table 2-5.
2.2 Pump P-7B Couplings
Pump P-7B coupling numbers 4, 5, 6 and 7 (coupling identifiers: 11-P7B-4, 11-P7B-5, 11-P7B-6, and 11-P7B-7, respectively) were submitted to LPI for examination on September 2, 2011. A photograph of the as-received couplings 11-P7B-4 through 11-P7B-7 is shown in Figure 2-4. Couplings 11-P7B-4 through 11-P7B-7 were split longitudinally for fluorescent magnetic particle inspection (MT). The as-split coupling 11-P7B-4 appeared to have been cleaned and exhibited a dye liquid penetrant residue on the threads (see Figure 2-5). The presence of the liquid penetrant on the thread surface is consistent with efforts by Palisades to examine this coupling for possible re-use due to procurement issues with the replacement couplings fabricated from 17-4PH material. As-split couplings 11-P7B-5 through 11-P7B-7 are shown in Figure 2-6 through Figure 2-8. Neolube is present on the threaded surfaces of couplings 11-P7B-5 to 11-P7B-7. However, an apparent band of corrosion product was observed at the center two-to-three threads of couplings 11-P7B-6 and 11-P7B-7. Coupling 11-P7B-5 also exhibited some corrosion at the center threads but not to the extent of couplings 11-P7B-6 and 11-P7B-7 (see Figure 2-6).
Following cleaning, couplings 11-P7B-4 through 11-P7B-7 were MT inspected. MT revealed indications at the center (near the location of corrosion products) of couplings 11-P7B-5, 11-P7B-6 and 11-P7B-7. An indication was also found at the motor end of coupling 11-P7B-5. No indication was found on coupling 11-P7B-4.
Metallographic examination of the MT indications on 11-P7B-5K through 11-P7B-7K revealed a network of branched cracks initiating from pits at the thread roots (see Figure 2-9, Figure 2-10 and Figure 2-11). The branching network of cracks is indicative of SCC. A summary of the as-found cracks on the P-7B couplings follows:
| Coupling | Crack Location | Crack Depth (in) | Crack Length (in) |
|---|---|---|---|
| B5 | Center | 0.065 | 1.25 |
| B5 | Motor End | 0.02 | 0.25 |
| B6 | Center | 0.132 | 0.5 |
| B7 | Center | 0.043 | 0.5 |

Couplings 11-P7B-5K through 11-P7B-7K were tested with results summarized for: CVN impact energy in Table 2-1; Tensile Strength in Table 2-2; Material Composition in Table 2-3; Surface Hardness in Table 2-4, and Through Thickness Hardness in Table 2-5.
3.0 EVALUATION
3.1 Pump Run Time
The matrix below summarizes the SWS pump coupling service history at time of extraction.

SWS Pump Coupling Life

| Pump | Date Installed | Date Extracted | Installed Time (hrs) | Run Time (hrs) | Start/Stops | Notes |
|---|---|---|---|---|---|---|
| P-7A | 4/4/09 | 8/28/11 | 21,024 | 16,259 | 148 | 1 |
| P-7B | 5/12/10 | 9/1/11 | 11,391 | 9,073 | 70 | 2 |
| P-7C | 6/12/09 | 9/29/09 | 2,616 | 2,414 | 13 | 3 |
| P-7C | 10/1/09 | 8/8/11 | 16,224 | 14,115 | 95 | 4 |
Notes:
1) Run hours and stops and starts based on total presented in Palisades response to NRC RFI 43 [5] plus average monthly hours from 4/10 to 9/10 times 6 months.
2) Information provided in Figure 3-0.
3) Information provided in Figure 3-1.
4) Run hours and stops and starts based on total presented in Palisades response to NRC RFI 43 [5].
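As an illustration of how the installed-time column relates to the installation and extraction dates, the sketch below recomputes installed hours from whole calendar days and the fraction of installed time each pump was running. It is not part of the LPI evaluation; the function names are hypothetical and whole-day arithmetic is an assumption. On that assumption three rows reproduce the tabulated hours exactly, while the P-7B row comes out at 11,448 hours against the tabulated 11,391, a difference that may reflect clock times not shown in the matrix.

```python
from datetime import date

# (pump, installed, extracted, installed hours as tabulated, run hours as tabulated)
SERVICE_HISTORY = [
    ("P-7A", date(2009, 4, 4), date(2011, 8, 28), 21024, 16259),
    ("P-7B", date(2010, 5, 12), date(2011, 9, 1), 11391, 9073),
    ("P-7C", date(2009, 6, 12), date(2009, 9, 29), 2616, 2414),
    ("P-7C", date(2009, 10, 1), date(2011, 8, 8), 16224, 14115),
]


def installed_hours(installed: date, extracted: date) -> int:
    """Installed time in hours, ASSUMING whole calendar days (no clock times)."""
    return (extracted - installed).days * 24


def run_fraction(run_hours: float, installed_hrs: float) -> float:
    """Share of installed time during which the pump was running."""
    return run_hours / installed_hrs


for pump, start, end, tabulated, run in SERVICE_HISTORY:
    computed = installed_hours(start, end)
    print(f"{pump}: computed {computed} h, tabulated {tabulated} h, "
          f"run fraction {run_fraction(run, tabulated):.2f}")
```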
3.2 Visual Inspection
3.2.1 Coupling P-7A
Visual inspection of the P-7A pump couplings 11-P7A-4 through -7 identified the threads to be well coated with lubricant (as-received). This was in contrast to observations of the P-7C couplings in the as-received condition, where the threads were not observed to be as well coated with lubricant. A comparison is shown in Figure 3-2.
The coating of lubricant on the P-7A couplings could enhance pitting resistance.
3.2.2 Coupling P-7B
Visual inspection of the P-7B pump couplings 11-P7B-4, 11-P7B-5K, 11-P7B-6K, and 11-P7B-7K showed the couplings to be generally well coated with Neolube. However, the amounts were generally less than those observed on couplings 11-P7A-4 through 11-P7A-7.
Stereomicroscopy of the coupling thread roots indicated that couplings exhibiting cracks tend to contain pitting, whereas couplings without indications contained less or no pitting. That is, pits or cracks were not observed in couplings 11-P7A-4 through 11-P7A-7 and 11-P7B-4, whereas couplings 11-P7B-5 through 11-P7B-7 contained pits and cracks. Figure 3-3 presents a representative comparison of a coupling without pits (and with no indications) and couplings that exhibit pits and contain cracks.
Based on the amount of Neolube on the coupling of P-7A and the absence of observed pits or cracks at the thread roots, it is postulated that the Neolube provided a protective coating that enhanced the corrosion resistance of the P-7A couplings.
3.3 Metallurgical / Environmental
Chemical composition and tensile testing of all tested P-7A and P-7B couplings indicate that the couplings are within specification for ASTM A582 Type 416 martensitic stainless steel. CVN impact energy test results ranged from 3 ft-lb to 16 ft-lb for tests at 32°F. ASTM A582 does not specify an impact energy requirement. CVN impact energy, tensile properties and chemical composition did not directly correlate with cracked and un-cracked couplings. Coupling 11-P7A-6 had low impact energy (3 ft-lb at 32°F) and did not exhibit cracks, whereas coupling 11-P7B-7 had higher impact energy (11 to 14 ft-lb at 32°F) and contained cracks.
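The lack of correlation between impact energy and cracking can be checked against the 32°F rows of Table 2-1. The sketch below is illustrative only: the values are transcribed from that table, the cracked set follows the MT and metallographic findings in Section 2.2, and the helper name `energy_range` is hypothetical.

```python
# Absorbed energy (ft-lb) at 32 °F, transcribed from Table 2-1.
CVN_32F = {
    "11-P7A-5": [7, 7],
    "11-P7A-6": [3, 3],
    "11-P7A-7": [11, 9],
    "11-P7B-5": [14, 16, 12.5, 13],
    "11-P7B-6": [8, 5, 6, 6],
    "11-P7B-7": [12, 11, 11, 14],
}

# Couplings in which cracks were found (Section 2.2).
CRACKED = {"11-P7B-5", "11-P7B-6", "11-P7B-7"}


def energy_range(couplings):
    """Return (min, max) absorbed energy over the given couplings."""
    values = [v for c in couplings for v in CVN_32F[c]]
    return min(values), max(values)


uncracked = sorted(set(CVN_32F) - CRACKED)
cracked = sorted(CRACKED)

lo_u, hi_u = energy_range(uncracked)
lo_c, hi_c = energy_range(cracked)

# Overlapping ranges mean impact energy alone does not separate the two groups.
ranges_overlap = lo_u <= hi_c and lo_c <= hi_u
print(f"uncracked: {lo_u} to {hi_u} ft-lb, cracked: {lo_c} to {hi_c} ft-lb, "
      f"overlap: {ranges_overlap}")
```

The two ranges overlap, which is consistent with the statement that impact energy did not directly correlate with cracking.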
Based on composition and CVN impact energy test results, all examined couplings are considered within the range of susceptibility to SCC since the environment to which the couplings are subjected is postulated to be the same. This postulation is based on the pumps extracting service water from the same basin. However, the observation of SCC in examined couplings from pumps P-7C and P-7B, but not examined couplings from P-7A, could be attributed to the Neolube and/or the third criterion for SCC, tensile stress.
Also, based on the run time data provided in Section 3.1, it can be seen that the P-7A pump and couplings have been in service longer and experienced more run time and starts and stops than the other two service water pumps, yet the examined P-7A couplings are free of cracks. Since the environment³ and material susceptibility are essentially the same for all service water pumps, the other contributor to SCC, applied tensile stress, is investigated. Tensile stress in couplings of P-7C, P-7B and P-7A could differ due to thread form fit between the shaft and couplings.
3.4 Tensile Stress
The coupling and shafts are assembled to ensure equal threading of the two shafts within a coupling by the use of an alignment aid inserted in the 1/8" hole on the side of the coupling. Once the shafts touch the alignment aid it is removed and motor torque is relied upon to tighten up the shaft-coupling assembly.
With application of motor torque the two shafts will tighten and bear against each other within the coupling (see figure to right) which will induce compression on the shaft. The shaft compression is reacted as tension across the coupling. Also when the shafts butt up against each other within the coupling, a circumferential gap between the shaft and coupling is created due to the end geometry of the two shafts. The gap in conjunction with the alignment hole would enable deposits to collect at the exposed thread roots of the shaft intersecting plane.
To estimate the tensile stress across the coupling, a finite element analysis (FEA) model of a coupling was developed. A half FEA model of an intact coupling was developed using ANSYS and consists of the steel body, alignment hole and threads. The model was constructed of the eight-node brick element, SOLID45 (see Figure 3-4). The symmetric boundary condition, Uz=0 and U =0, is applied on the inner surface as shown in Figure 3-5.
ASTM A582 Type 416 stainless steel material property for the coupling FEA model is as follows:
Young's modulus: 29.2 x 10^6 psi
Poisson's ratio: 0.3
³ Although as outlined in Section 3.2.1, the presence of liberal amounts of Neolube on the P-7A couplings could play a significant role in protecting those threads from the corrosive environment.
Coupling threads are 2-3/16, 8 TPI, which is not a common thread form. Specific thread properties are not available in the Machinery's Handbook [11]. Therefore, internal thread properties of the coupling are taken to be the average internal diameter of 2-1/4, 8 TPI and 2-1/16, 8 TPI in the Machinery's Handbook [11].
Loading on the coupling model consists of the weight of components below the coupling, hydraulic thrust and motor torque. These loads are extracted from HydroAire calculation NQ5940 [12]. Motor torque is transmitted across the coupling by bearing of the shaft ends against each other within the coupling.
The resulting stress distribution across the wall of the coupling at the middle thread root of the coupling, as determined from the FEA model, is presented in Figure 3-6. The Figure 3-6 stress distribution is based on reacting the applied shaft end compression load from the applied motor torque across three threads⁴ per shaft end. Considering the two shafts meeting approximately in the center of the coupling, the three threads below the centerline react the lower shaft compression loading, and the three threads above the centerline react the compression loading in the upper shaft. However, tolerances in machining of the threads could translate into different load and stress distribution across the threads. If the load is distributed to a fewer or greater number of threads than the three assumed, then the tensile stress at the thread root could range according to the load distribution to the threads.
For example, if the shaft bearing loads were distributed across six threads instead of three, the maximum tensile stress would be on the order of 30 ksi.
Based on the FEA and depending on the number of threads sharing the load, it is not unreasonable for tensile stresses to range from approximately 20 ksi to 80 ksi at the thread root.
⁴ Based on extensive testing, as presented in various literature sources [9], the threads nearest the plane of load application carry the majority of the applied loading.
3.5 SCC Growth Evaluation
The time to failure of a susceptible material in a given environment is dependent on the applied tensile stress, as can be seen in Figure 3-7. The plot compares applied stress or load to the logarithm of exposure time in an environment and illustrates that the time to failure increases significantly with decreasing applied stress. The crack propagation time, tcp, is taken to be the difference between the time of failure, tf, and the time of initiation, tin. The time at failure is typically known. However, the time of initiation is highly alloy-environment and applied stress dependent and thus is an unknown without specific test data. The initiation time is also highly dependent upon pre-existing flaws that may have been introduced during heat treatment or thread fabrication. Therefore, predicting initiation time is difficult. Unless there are preexisting flaws, a distribution of 80% initiation and 20% propagation is considered reasonable for the life of a component subject to the SCC process, as suggested by Figure 3-8.
The SCC process usually occurs in three stages:
1) crack initiation and stage 1 propagation,
2) stage 2 or steady-state propagation (independent of stress intensity), and
3) stage 3 crack propagation or final fracture.
A typical plot of crack growth rate (da/dt) versus stress intensity illustrating the three stages of SCC propagation is presented in Figure 3-9. The figure illustrates a threshold stress-intensity, K1SCC, for SCC initiation and stage 1 propagation. The threshold stress-intensity is dependent upon interaction of the alloy and environment (alloy-environment). Stage 1 propagation is followed by Stage 2 crack propagation, where the crack growth velocity is independent of stress intensity. Stage 2 crack growth velocity is limited by the alloy-environment interaction, such as the mass transfer of corrosive environmental elements up the crack to the crack tip. Stage 3 propagation is dependent upon stress intensity and continues until the critical level, K1c, is reached, producing mechanical overload of the remaining ligament. The crack propagation time is the sum of the time at each stage, tcp = t1 + t2 + t3.
A plot of crack growth rate (da/dt) versus stress intensity for 12% chromium and 0.2% carbon alloy at various tempering temperatures per [7] is provided in Figure 3-10. The generic categorization of 12% Cr and 0.2% C would cover the 416 SS coupling material. Based on tempering heat traces for the P-7A and P-7B couplings presented in Figure 3-11, the 550°C (1022°F) curve is appropriate. Figure 3-10 shows that the threshold stress-intensity, K1SCC, for the 550°C curve is approximately and the stage 2, stress intensity independent crack growth rate is approximately 2.3E-4 in/hr per [7].
For a stress distribution of 20 to 80 ksi, the stress intensity at the thread root (without pits) of the coupling can range from to as demonstrated below.
Therefore, based on three threads taking the load and resulting tensile stress of 70 ksi (see Figure 3-6), the stress intensity of would be sufficient to initiate a crack. If the tensile stress at the thread root is 55 ksi or less then the stress intensity would fall below the threshold stress intensity, K1SCC of and SCC initiation would not be expected. However, if a pit were to form at the thread root, the stress intensity would be higher for a given tensile stress. For example, a pit 0.01" deep and tensile stress of 55 ksi would result in a stress intensity of approximately (see below), which is greater than the threshold stress intensity for SCC initiation.
Once a crack initiates, the stress intensity would increase with increasing crack length; however, the crack growth rate is limited to the stage 2 propagation rate until the critical fracture stress intensity, K1c, is reached, resulting in failure by overload.
To determine whether the couplings extracted from P-7A and P-7B would have survived a mission time of 30 days, an appropriate crack growth rate (CGR) is required. The following three cases evaluate crack growth rates based on the life of failed couplings 09-P7C-7F and 11-P7C-6F and the stress intensity independent crack growth rate from Figure 3-10.
Case 1: CGR based on Figure 3-10
Using the stage 2 stress intensity independent crack growth rate (CGR) of 2.3E-4 in/hr from Figure 3-10, and applying it to the as-found cracks of examined couplings from P-7B, it would require approximately 66 days to propagate through the wall⁵ of the deepest flaw found on the P-7B couplings, i.e. 11-P7B-6K (evaluation is shown in Figure 3-12). Using the same CGR, and assuming a crack initiated at the time the examined P-7A couplings were removed from service, it would require 90 days to propagate through wall. Applying this simplistic approach to couplings 09-P7C-7F and 11-P7C-6F would result in the same 90 days to propagate through thickness (see Figure 3-12).
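The Case 1 arithmetic can be sketched as remaining wall divided by a constant growth rate. This is a minimal illustration, not the Figure 3-12 evaluation itself. The coupling wall thickness is not stated in this letter, so the value below is back-calculated from the 90-day through-wall figure and is an assumption, as is the function name.

```python
CGR_STAGE2 = 2.3e-4   # in/hr, stage 2 rate from Figure 3-10 as quoted in the text
HOURS_PER_DAY = 24

# ASSUMPTION: wall thickness implied by the quoted 90 days to grow through wall
# from zero depth at CGR_STAGE2 (about 0.497 in). Not a value stated in the report.
WALL_IN = 90 * HOURS_PER_DAY * CGR_STAGE2

# As-found center crack depths (in) from the P-7B crack summary in Section 2.2.
AS_FOUND_DEPTH_IN = {"11-P7B-5K": 0.065, "11-P7B-6K": 0.132, "11-P7B-7K": 0.043}


def days_to_through_wall(crack_depth_in, cgr_in_per_hr=CGR_STAGE2, wall_in=WALL_IN):
    """Days for a crack of the given depth to reach the wall at a constant rate."""
    remaining_in = wall_in - crack_depth_in
    return remaining_in / cgr_in_per_hr / HOURS_PER_DAY


for coupling, depth in AS_FOUND_DEPTH_IN.items():
    print(f"{coupling}: {days_to_through_wall(depth):.0f} days")
```

With the back-calculated wall, the deepest flaw (11-P7B-6K, 0.132 in) gives roughly 66 days, in line with the value quoted above.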
Based on the loading mechanism discussed in Section 3.4, coupling tensile stress would be generated at pump start-up following coupling assembly and is independent of pump starts and stops. Therefore coupling life is based upon the time of installation. Using the 90 days of propagation life, the initiation time would be approximately 19 days for coupling 09-P7C-7F and 586 days for 11-P7C-6F (see Figure 3-12).
It is apparent that there is quite a disparity between initiation time and life of these two couplings. That is, the initiation time of 09-P7C-7F comprises only 17% of its life whereas the initiation time of 11-P7C-6F is 87%. The disparity could be explained by differences between impact energy test results (3 to 4 ft-lb for 09-P7C-7F vs. 6 to 8 ft-lb for 11-P7C-6F at 32°F) and/or stress levels and/or a preexisting flaw in 09-P7C-7F. A preexisting flaw combined with the low impact energy in 09-P7C-7F would result in a shorter initiation time. Also, based purely on the concept provided in Figure 3-7, coupling 09-P7C-7F would have been subjected to higher stresses than 11-P7C-6F to produce the initiation time disparity.
⁵ Examination of the fracture surface of coupling 11-P7C-6F in [3] indicates SCC propagation through wall prior to final overload.
Case 2: CGR based on 50/50 Life Split of 09-P7C-7F
Assuming a 50/50 split for initiation and propagation in the life of coupling 09-P7C-7F would result in a propagation rate of about 3.81E-4 in/hr.
Applying a propagation rate of 3.81E-4 in/hr to the examined couplings would result in an initiation time of about 622 days and propagation time of 55 days for 11-P7C-6F (see below). Since SCC propagation rate is alloy-environment dependent and stress independent in stage 2 (plateau velocity), applying the same crack velocity to the SWS pump couplings is considered reasonable. Using a CGR of 3.81E-4 in/hr would result in initiation making up 92% of the life of coupling 11-P7C-6F.
Case 3: CGR based on 0/100 Life Split of 09-P7C-7F
Assuming a preexisting flaw and propagation life to be the total life of coupling 09-P7C-7F would result in a CGR of about 1.91E-4 in/hr.
Applying this propagation rate to coupling 11-P7C-6F would result in an initiation life about 84% of the total life. This CGR results in reasonable initiation and propagation distribution for failed coupling 11-P7C-6F and the cracked couplings, as shown below.
| Case | Coupling | Life (days) | CGR (in/hr) | Initiation (days) | Propagation (days) | Init/Life (%) |
|---|---|---|---|---|---|---|
| 1 | 09-P7C-7F | 110 | 2.3E-4 | 19 | 90 | 17% |
| 1 | 11-P7C-6F | 676 | 2.3E-4 | 586 | 90 | 87% |
| 2 | 09-P7C-7F | 110 | 3.81E-4 | 55 | 55 | 50% |
| 2 | 11-P7C-6F | 676 | 3.81E-4 | 621 | 55 | 92% |
| 3 | 09-P7C-7F | 110 | 1.91E-4 | 0 | 110 | 0% |
| 3 | 11-P7C-6F | 676 | 1.91E-4 | 566 | 110 | 84% |

Note(s)
1. Life is based on time of installation.
2. CGR = Crack Growth Rate

Based on the above assessment, a reasonable crack growth rate for the SWS pump couplings is in the range of 1.91E-4 in/hr to 3.81E-4 in/hr. This range encompasses the stress intensity independent CGR of 2.3E-4 in/hr for 12%Cr, 0.5%C steel tempered at 550°C in distilled water per [7] (see Figure 3-10). However, the CGR of 1.91E-4 in/hr is the most reasonable in terms of initiation and propagation distribution life for failed coupling 11-P7C-6F and all cracked couplings. This means that coupling 09-P7C-7F is anomalous (e.g. preexisting flaw) and is postulated to have propagated shortly after installation. Barring specific CGR for the alloy-environment interaction of the SWS pump couplings, using a CGR of 2.3E-4 in/hr is considered a reasonable mean value and 3.81E-4 in/hr is considered a bounding value for this evaluation.
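The initiation and propagation split tabulated for coupling 11-P7C-6F follows from subtracting each case's through-wall propagation time from the 676-day installed life. The sketch below only reproduces that bookkeeping; the function name is hypothetical, and the propagation times are taken as given from the table rather than derived.

```python
LIFE_DAYS = 676  # 11-P7C-6F, installation to failure (Note 1: life is based on time of installation)

# Through-wall propagation time (days) for each case, as tabulated above.
PROPAGATION_DAYS = {1: 90, 2: 55, 3: 110}


def initiation_split(life_days, propagation_days):
    """Return (initiation days, initiation as a whole percent of life)."""
    initiation = life_days - propagation_days
    return initiation, round(100 * initiation / life_days)


for case, prop in PROPAGATION_DAYS.items():
    init_days, pct = initiation_split(LIFE_DAYS, prop)
    print(f"Case {case}: initiation {init_days} days ({pct}% of life)")
```

All three cases place initiation at 84% to 92% of the life of 11-P7C-6F, close to the 80/20 distribution described in Section 3.5.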
4.0 SUMMARY
Pump shaft couplings from PLP SWS Pumps P-7A and P-7B were submitted to LPI for examination and metallurgical testing. Examination of the couplings revealed cracks in couplings 5, 6 and 7 (these are subjected to wet/dry cycles) from pump P-7B, but cracks were not observed in examined P-7A couplings.
Visual examination of couplings 4 through 7 from pump P-7A identified them to be well coated with Neolube to a greater degree than on examined P-7B and P-7C couplings. Very little to none was observed on couplings 11-P7C-6F and 11-P7C-7K. It is postulated that the presence of liberal amounts of Neolube on the threads of examined P-7A couplings enhanced its pitting resistance by providing a coating that protected the thread roots from corrosive agents in the service water basin environment.
Considering the coupling material and environment, a stress intensity, K independent crack growth rate (CGR) of 3.81E-4 in/hr is considered a reasonable bounding value for this evaluation. This CGR encompasses the stress intensity independent CGR of 2.3E-4 in/hr for 12%Cr, 0.5%C steel tempered at 550°C (1022°F) in distilled water per [7] (see Figure 3-10). Using a CGR of 3.81E-4 in/hr, in conjunction with the as-found flaws on the P-7B couplings, it would require approximately 40 days to propagate through the wall for the deepest flaw found on coupling 11-P7B-6K. Since no flaws were found on examined P-7A couplings, it is concluded that it would require approximately 54 days for a flaw to propagate through wall if, conservatively, a crack initiated at the time they were removed from service. Also since pits are a precursor to cracks (as observed on the examined P-7B and P-7C couplings that exhibit cracks) and no cracks were found on the examined P-7A coupling, it is postulated that the life of examined couplings extracted from P-7A could be greater than 54 days to allow for pit formation.
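The comparison against the 30-day mission time can be illustrated at the bounding growth rate. This is a sketch under stated assumptions rather than the report's calculation: the wall thickness is not given in this letter and is back-calculated here from the 90-day, 2.3E-4 in/hr figure in Section 3.5, and the function name is hypothetical.

```python
MISSION_DAYS = 30
CGR_BOUNDING = 3.81e-4   # in/hr, bounding crack growth rate from the summary
HOURS_PER_DAY = 24

# ASSUMPTION: wall thickness implied by 90 days through wall at 2.3E-4 in/hr.
WALL_IN = 90 * HOURS_PER_DAY * 2.3e-4


def margin_days(crack_depth_in):
    """Return (days to through-wall at the bounding rate, margin over mission time)."""
    days = (WALL_IN - crack_depth_in) / CGR_BOUNDING / HOURS_PER_DAY
    return days, days - MISSION_DAYS


# Deepest as-found flaw on P-7B (11-P7B-6K) and an unflawed P-7A coupling.
for label, depth in (("11-P7B-6K", 0.132), ("P-7A, no flaw", 0.0)):
    days, margin = margin_days(depth)
    print(f"{label}: {days:.0f} days to through-wall, margin {margin:.0f} days")
```

Both cases retain a positive margin over the 30-day mission time, matching the approximately 40-day and 54-day values stated above.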
LPI concludes that the couplings removed from service water pump P-7A and P-7B would have continued to perform their design function for at least an additional 30 days of operation from the time of extraction.
5.0 REFERENCES
1. PLP Condition Report Entergy Palisades Root Cause Evaluation for CR-PLP-2009-04519, "Service Water Pump P-7C Failure to Provide Discharge Pressure", 3/4/2010, Rev. 1
2. Entergy Palisades Root Cause Evaluation Report for CR-PLP-2011-03902, "Service Water Pump 7-C Line Shaft Coupling Failure"
3. LPI Report No. F11358-R-001, Rev. Draft G, "Metallurgical and Failure Analysis of SWS Pump P-7C Coupling #6"
4. PLP Maintenance Procedure WI-SWS-M-03 and WI-SWS-M-04.
5. PLP Response to NRC RFI # 43, "Operating Service History Since Last Coupling Failure"
6. Entergy Contract No. 10325528
7. M.O. Speidel, "Corrosion Fatigue in Fe-Ni-Cr Alloys", NACE-5 Stress Corrosion Cracking and Hydrogen Embrittlement of Iron Base Alloys, National Association of Corrosion Engineers, Houston, 1977, p. 1071 to 1094.
8. ASM Handbook, Volume 13A, "Stress-Corrosion Cracking", R.H. Jones, Battelle Pacific Northwest National Laboratory.
9. Pilkey, Walter, "Peterson's Stress Concentration Factors", 2nd Edition, © 1997 John Wiley & Sons
10. API 579-1/ASME FFS Fitness-for-Service, API/ASME 2007
11. Oberg, E., et al., "Machinery's Handbook", 25th Ed., Industrial Press
12. HydroAire Calculation NQ5940, "Maximum combined shear stress calculation for threaded coupling", Rev. 3
13. ASM Handbook, Volume 13A, "Evaluating Stress-Corrosion Cracking", S.D. Cramer, B.S. Covino, Jr, Revised by Bopinder Phull.
14. R.W. Hertzberg, "Deformation and Fracture Mechanics of Engineering Materials", 2nd Edition, John Wiley & Sons.
15. ANSYS Inc., LPI Report No. V&V-ANSYS-11, Rev. 3, "Verification and Validation of ANSYS Software Program"
6.0 QUALITY ASSURANCE
The outlined work has been performed in accordance with the requirements of Entergy Purchase Order 10325528 [6]. The Approver of this document attests that all project examinations, inspections, tests and analysis (as applicable) have been conducted using approved LPI Procedures and are in conformance to the contract/purchase order. Ref. 3 is in draft form at this time, but does not impact the conclusions reached in this letter report, relative to estimated life of extracted couplings from pump P-7A and P-7B.
| Rev | Date | Prepared | Checked | Design Verified | Approved |
|---|---|---|---|---|---|
| 0 | 9/28/11 | S. Yim | J. Mills, Ph.D. | G. Zysk | P. Bruck |
Table 2-1: CVN IMPACT TEST RESULTS

| Coupling | Specimen Identification | Test Temperature (°F) | Absorbed Energy (ft-lb) | Lateral Expansion (in.) | Percent Shear (%) |
|---|---|---|---|---|---|
| 11-P7A-5 | 5-A3 | 32 | 7 | 0.003 | <10 |
| 11-P7A-5 | 5-A4 | 32 | 7 | 0.002 | <10 |
| 11-P7A-5 | 5-A1 | 75 | 8 | 0.006 | 10 |
| 11-P7A-5 | 5-A2 | 75 | 9 | 0.005 | 10 |
| 11-P7A-5 | 5-A5 | 100 | 13 | 0.009 | 30 |
| 11-P7A-6 | 6-A3 | 32 | 3 | 0.002 | <10 |
| 11-P7A-6 | 6-A4 | 32 | 3 | 0.002 | <10 |
| 11-P7A-6 | 6-A1 | 75 | 6 | 0.006 | 10 |
| 11-P7A-6 | 6-A2 | 75 | 6 | 0.008 | 10 |
| 11-P7A-6 | 6-A5 | 100 | 9 | 0.012 | 10 |
| 11-P7A-7 | 7-A3 | 32 | 11 | 0.005 | <10 |
| 11-P7A-7 | 7-A4 | 32 | 9 | 0.004 | <10 |
| 11-P7A-7 | 7-A1 | 75 | 12 | 0.009 | 20 |
| 11-P7A-7 | 7-A2 | 75 | 12 | 0.010 | 20 |
| 11-P7A-7 | 7-A5 | 100 | 18 | 0.015 | 50 |
| 11-P7B-5 | B5-9 | 0 | 9 | 0.006 | 10 |
| 11-P7B-5 | B5-10 | 0 | 10 | 0.007 | 10 |
| 11-P7B-5 | B5-1 | 32 | 14 | 0.013 | 20 |
| 11-P7B-5 | B5-2 | 32 | 16 | 0.014 | 40 |
| 11-P7B-5 | B5-7 | 32 | 12.5 | 0.011 | 20 |
| 11-P7B-5 | B5-8 | 32 | 13 | 0.011 | 20 |
| 11-P7B-5 | B5-3 | 76 | 22 | 0.018 | >90 |
| 11-P7B-5 | B5-4 | 76 | 29 | 0.018 | >90 |
| 11-P7B-5 | B5-5 | 100 | 28 | 0.016 | >90 |
| 11-P7B-5 | B5-6 | 150 | 26 | 0.017 | >90 |
| 11-P7B-6 | B6-9 | 0 | 5 | 0.004 | <10 |
| 11-P7B-6 | B6-10 | 0 | 5 | 0.004 | <10 |
| 11-P7B-6 | B6-1 | 32 | 8 | 0.004 | <10 |
| 11-P7B-6 | B6-2 | 32 | 5 | 0.005 | <10 |
| 11-P7B-6 | B6-7 | 32 | 6 | 0.005 | <10 |
| 11-P7B-6 | B6-8 | 32 | 6 | 0.003 | <10 |
| 11-P7B-6 | B6-3 | 76 | 11 | 0.008 | 10 |
| 11-P7B-6 | B6-4 | 76 | 10 | 0.009 | 10 |
| 11-P7B-6 | B6-5 | 100 | 21 | 0.015 | 80 |
| 11-P7B-6 | B6-6 | 150 | 21 | 0.015 | >90 |
| 11-P7B-7 | B7-9 | 0 | 8 | 0.007 | 10 |
| 11-P7B-7 | B7-10 | 0 | 9 | 0.007 | 10 |
| 11-P7B-7 | B7-1 | 32 | 12 | 0.008 | 20 |
| 11-P7B-7 | B7-2 | 32 | 11 | 0.009 | 20 |
| 11-P7B-7 | B7-7 | 32 | 11 | 0.009 | 10 |
| 11-P7B-7 | B7-8 | 32 | 14 | 0.013 | 20 |
| 11-P7B-7 | B7-3 | 76 | 11 | 0.015 | 50 |
| 11-P7B-7 | B7-5 | 100 | 22 | 0.016 | >90 |
| 11-P7B-7 | B7-6 | 150 | 32 | 0.022 | >90 |

Table 2-2: TENSILE TEST RESULTS

| Coupling | Specimen Identification | Yield Strength (ksi) | Tensile Strength (ksi) | Elongation (%) |
|---|---|---|---|---|
| 11-P7A-5 | 5-1 | 131.8 | 146.1 | 14.1 |
| 11-P7A-5 | 5-2 | 136.0 | 152.6 | 14.6 |
| 11-P7A-6 | 6-1 | 111.5 | 126.3 | 16.2 |
| 11-P7A-6 | 6-2 | 108.1 | 123.2 | 16.1 |
| 11-P7A-7 | 7-1 | 136.3 | 150.5 | 12.8 |
| 11-P7A-7 | 7-2 | 135.9 | 150.2 | 15.4 |
| 11-P7B-5 | 5-1 | 117.5 | 135.6 | 17.9 |
| 11-P7B-5 | 5-2 | 117.7 | 135.6 | 14.6 |
| 11-P7B-6 | 6-1 | 130.5 | 143.8 | 14.2 |
| 11-P7B-6 | 6-2 | 126.1 | 139.6 | 15.2 |
| 11-P7B-7 | 7-1 | 114.2 | 129.3 | 17.3 |
| 11-P7B-7 | 7-2 | 114.5 | 129.1 | 16.5 |
Table 2-3: COMPOSITION OF COUPLINGS (WT%)

| Element | Coupling | Coupling | Coupling | ASTM A582 [7] |
|---|---|---|---|---|
| C | 0.14 | 0.10 | 0.13 | 0.15 max |
| Cr | 13.06 | 12.16 | 12.97 | 12.00 to 14.00 |
| Cu | 0.071 | 0.077 | 0.073 | ns |
| Mn | 0.74 | 0.83 | 0.65 | 1.25 max |
| Mo | 0.07 | 0.12 | 0.06 | 0.60 max |
| Ni | 0.22 | 0.43 | 0.22 | ns |
| P | 0.018 | 0.021 | 0.017 | 0.060 max |
| S | 0.35 | 0.22 | 0.35 | 0.15 min |
| Si | 0.42 | 0.34 | 0.42 | 1.00 max |

ns = not specified

| Element | 11-P7B-5 | 11-P7B-6 | 11-P7B-7 | ASTM A582 |
|---|---|---|---|---|
| C | 0.13 | 0.12 | 0.12 | 0.15 max |
| Cr | 12.1 | 12.0 | 12.0 | 12.00 to 14.00 |
| Cu | 0.082 | 0.079 | 0.078 | ns |
| Mn | 0.72 | 0.70 | 0.72 | 1.25 max |
| Mo | 0.080 | 0.078 | 0.078 | 0.60 max |
| Ni | 0.28 | 0.27 | 0.27 | ns |
| P | 0.02 | 0.02 | 0.02 | 0.060 max |
| S | 0.37 | 0.29 | 0.34 | 0.15 min |
| Si | 0.37 | 0.37 | 0.37 | 1.00 max |
Table 2-4: SURFACE HARDNESS OF COUPLINGS

| Coupling | End | Average (HRC) | Measurements (HRC) |
|---|---|---|---|
| 11-P7A-5 | top | 32.5 | 32.5, 32.0, 32.5, 32.0, 33.0, 33.0 |
| | bottom | 29.9 | 28.0, 31.5, 31.5, 30.0, 30.5, 28.0 |
| 11-P7A-6 | top | 25.3 | 25.5, 25.5, 25.0, 25.5, 25.0, 25.5 |
| | bottom | 23.8 | 24.5, 23.5, 23.5, 25.0, 22.0, 24.0 |
| 11-P7A-7 | top | 31.5 | 31.0, 31.5, 32.0, 32.0, 32.5, 31.0 |
| | bottom | 28.7 | 28.0, 28.0, 27.0, 30.0, 29.0, 30.0 |
| 11-P7B-5 | top | 27.6 | 28.0, 28.0, 27.5, 28.5, 26.5, 27.0 |
| | bottom | 26.1 | 25.0, 27.0, 26.5, 25.0, 27.0, 26.0 |
| 11-P7B-6 | top | 29.9 | 30.0, 30.0, 30.5, 30.0, 30.0, 29.0 |
| | bottom | 28.5 | 27.5, 29.0, 28.0, 28.5, 28.5, 29.5 |
| 11-P7B-7 | top | 28.3 | 28.0, 28.0, 28.5, 29.0, 28.5, 28.0 |
| | bottom | 26.1 | 25.5, 27.0, 26.0, 27.0, 24.5, 26.5 |

Table 2-5: THROUGH THICKNESS HARDNESS OF COUPLINGS

| Coupling | Location | Measurements from OD to ID (HRC) |
|---|---|---|
| 11-P7A-5 | 1 | 33.0, 29.9, 33.2, 33.0, 33.6, 32.2 |
| | 2 | 32.9, 32.9, 33.0, 32.6, 32.8, 33.1 |
| 11-P7A-6 | 1 | 24.0, 24.0, 24.0, 24.0, 24.0, 24.0 |
| | 2 | 24.5, 24.0, 24.0, 24.0, 24.5, 24.0 |
| 11-P7A-7 | 1 | 31.0, 31.5, 31.0, 31.5, 31.0, 31.0 |
| | 2 | 32.0, 31.5, 32.0, 31.0, 31.0, 31.0 |
| 11-P7B-5 | 1 | 27.0, 27.5, 27.5, 28.0, 27.9, 27.5 |
| | 2 | 28.5, 28.0, 28.6, 28.0, 27.7, 28.1 |
| 11-P7B-6 | 1 | 28.8, 29.0, 29.1, 29.4, 29.8, 29.8 |
| | 2 | 29.2, 29.9, 30.0, 30.0, 30.1, 30.2 |
| 11-P7B-7 | 1 | 28.2, 28.0, 27.9, 27.9, 28.0, 28.5 |
| | 2 | 28.1, 28.8, 28.7, 28.5, 28.4, 28.6 |
Figure 1-1: SWS Pump Sketch

Figure 2-1: As-Received Coupling No. 11-P7A-7
Pump P-7A coupling Nos. 1 through 8 were submitted to LPI. Coupling Nos. 5, 6 and 7 (11-P7A-5, 11-P7A-6, 11-P7A-7) were selected for analysis. Coupling No. 7 (11-P7A-7) is shown in its as-received condition. Note the neolube on the threads.

Figure 2-2: Visual Observation of Coupling No. 11-P7A-5
As-received couplings exhibited a greasy grey coating on the threads and some surface deposits near the center of the coupling. A cleaned surface is shown for coupling No. 11-P7A-5. No cracks along the thread root were visually observed for any of the cleaned couplings (11-P7A-4, 5, 6, 7).

Figure 2-3: MT Examination of Coupling No. 11-P7A-6
Pump P-7A coupling Nos. 11-P7A-5, 6, 7 were sectioned longitudinally, as shown in the figure. The threaded inner diameter regions of coupling Nos. 11-P7A-4, 5, 6, and 7 were examined for cracks or discontinuities by fluorescent magnetic particle testing (MT). No indications were readily visible in these couplings.

Figure 2-4: As-Received Couplings 11-P7B-4 through 11-P7B-7
Figure 2-5: As-Split Coupling 11-P7B-4
Photo labels: B4, B5, B6, B7

Figure 2-6: As-Split Coupling 11-P7B-5
Figure 2-7: As-Split Coupling 11-P7B-6
Figure 2-8: As-Split Coupling 11-P7B-7
Figure 2-9: Cracks in 11-P7B-5 Initiating at a Pit
Figure 2-10: Cracks in 11-P7B-6 Initiating at a Pit
Figure 2-11: Cracks in 11-P7B-7 Initiating at a Pit

Figure 3-0: P-7B Run Time
Figure 3-1: P-7C Run Time

Figure 3-2: Contrast Thread Coating (As-Received) on P-7C Failed and Cracked Couplings to P-7A Coupling
Panels: Coupling No. 11-P7C-6F; 11-P7C-7K; 11-P7A-7.
11-P7C-7K in the as-received condition does not exhibit the same level of neolube on the threaded surface as 11-P7A-7. 11-P7A-7 in the as-received condition exhibits significant amounts of neolube on the thread surface.

Figure 3-3: Contrast Thread Root Pitting
Panels: Representative Image of 11-P7A Coupling; 11-P7B Couplings; 11-P7C Couplings.
Some corrosion product was observed near the middle of the coupling; however, no pitting was observed during stereomicroscope examination of couplings A5 through A7.
Figure 3-4: Half FEA model of coupling
Figure 3-5: Cross-section of half FEA coupling model
Axes shown: Z, X, Y. Local coordinate system numbered 11 is a cylindrical coordinate system. Symmetric boundary condition: Uz=0 U =0

Figure 3-6: Tensile Stress Distribution across Wall Thickness of Coupling (Based on three threads reacting out applied load)

Figure 3-7: Time to Failure vs Applied Stress [8]
Figure 3-8: SCC Process [13]
Figure 3-9: Crack Growth Rate (da/dt) versus stress intensity w/Three Stages of Crack Growth [8]
Figure 3-10: Effects of Tempering Temperature and Applied Stress-Intensity factor on Velocity of Stress-Corrosion Cracking [7]
Figure 3-11: SWS Pump Coupling Temperature Trace [3]
Figure 3-12: Coupling Life Estimate for Crack Propagation Across Wall
DESIGN VERIFICATION CHECKLIST

Document No(s) (1): F11358-LR-001
Rev.: 0
Review Method: [X] Design Review  [ ] Alternate Calculation  [ ] Test

| No. | Criteria | DV (2) |
|---|---|---|
| 1 | Were the inputs correctly selected and incorporated into design? | GZ |
| 2 | Are assumptions necessary to perform the design activity adequately described and reasonable? Where necessary, are the assumptions identified for subsequent re-verifications when the detailed design activities are completed? If applicable, has an as built verification been performed and reconciled? | GZ |
| 3 | Are the appropriate quality and quality assurance requirements specified? | GZ |
| 4 | Are the applicable codes, standards and regulatory requirements including issue and addenda properly identified and are their requirements for design met? | N/A |
| 5 | Have applicable construction and operating experience been considered, including operation procedures? | N/A |
| 6 | Have the design interface requirements been satisfied? | GZ |
| 7 | Was an appropriate design method used? | GZ |
| 8 | Is the output reasonable compared to inputs? | GZ |
| 9 | Are the specified parts, equipment, and processes suitable for the required application? | N/A |
| 10 | Are the specified materials compatible with each other and the design environmental conditions to which the material will be exposed? | N/A |
| 11 | Have adequate maintenance features and requirements been specified? | N/A |
| 12 | Are accessibility and other design provisions adequate for performance of needed maintenance and repair? | N/A |
| 13 | Has adequate accessibility been provided to perform the in-service inspection expected to be required during the plant life? | N/A |
| 14 | Has the design properly considered radiation exposure to the public and plant personnel? | N/A |
| 15 | Are the acceptance criteria incorporated in the design documents sufficient to allow verification that design requirements have been satisfactorily accomplished? | GZ |
| 16 | Have adequate pre-operational and subsequent periodic test requirements been appropriately specified? | N/A |
| 17 | Are adequate handling, storage, cleaning and shipping requirements specified? | N/A |
| 18 | Are adequate identification requirements specified? | N/A |
| 19 | Are requirements for record preparation review, approval, retention, etc., adequately specified? | GZ |
| 20 | Has an internal design review been performed for applicable design projects? Have comments from the Internal Design Review been appropriately considered/addressed? | N/A |

(1) Include any drawings developed from reviewed documents, or include separate checklist sheet for drawings.
(2) Design Verifier shall initial indicating review and mark N/A where not applicable.

DV Completed By: Printed Name: G. Zysk  Signature:  Date: 9/28/11
Page 1 of 1 Total Pages (Include DV Checklist and Comment Resolution sheets in page count)
DOCUMENT SOFTWARE RECORD
(Include Separate Sheet for Each Software Package Utilized)

- 1. Computer Software Used (Code/Version): ANSYS Version 11.0
- 2. Software Supplier: ANSYS, Inc.
- 3. Software Update Review: Error notices; describe: Reviewed error reports for elements used. Other; describe:
- 4. Nuclear Safety Related Software: NO / YES(1)
  (1) If YES: Hardware identification # used for execution: Desktop Serial #: J2WTBM1. Basis for V & V: [15]
- 5. Input Listing(s): Input listing(s) attached: / Not attached; identify File/Disc ID*: Coupling Pump Bearing & Bending.txt, Coupling Pump Bearing.txt, Coupling Pump No Bearing.txt
  * A CD with input listings and output data to be provided on project completion.
- 6. Output results attached: / Not attached; identify File/Disc ID*:
  * A CD with input listings and output data to be provided on project completion.
- 7. Output Identifier(s)*: (see 6 above)
  * e.g., run date/time; use for reference, as appropriate, within body of calculation
- 8. Comments:
- 9. Keywords**: SOLID45, Static
  ** For use in describing software features used in this calculation; use common terms based on software user manual.
- 10. Project Manager Name: S. Yim

If computer software was used on project, complete form with required information.
Update the LPI Computer Software Use List per LPI Procedure 13.1 requirements.
DOCUMENT INSTRUMENT RECORD

| No. | Instrument Used | Instrument Description / Serial No. | Calibration Due Date |
|---|---|---|---|
| 1 | Tensile Testing Machine (120 kips) | Baldwin 37205 | 4/7/12 |
| 2 | Extensometer (1 in) | 2620-824/1033 | 4/7/12 |
| 3 | Charpy Impact Tester | Satec Model SI-1K/1306 | 6/17/12 |
| 4 | Hardness Tester | Wilson 5YR/58 | 4/7/12 |
| 5 | Thermocouple | Omega 650 J/8320 | 7/12/12 |
| 6 | Caliper | Fowler 6'/7082002 | 6/21/12 |
| 7 | Magnetic Yoke | Magnaflux Y-6/43530 | Per use calibration |
| 11 | SEM/Oxford EDS | 17218-118-01 | Per use calibration |

Project Manager Name: S. Yim

For instrument(s) used on the project, identify instrument and include the instrument calibration due date.
Update the LPI Instrument Use List per LPI Procedure 13.1 requirements.
EA-PSA-SDP-P7C-11-06 0 Page 1 of 8

Evaluation of Service Water Pumps P-7A and P-7B Failure Rates Following Failure of Pump P-7C

1.0 Introduction

Palisades experienced failure of couplings on the P-7C service water pump on two occasions. The first occurred about 110 days after installation of couplings made of a material that was susceptible to intergranular stress corrosion cracking (IGSCC). After the couplings were replaced with similar material, another failure occurred approximately 670 days later. Post-failure analysis by Lucius Pitkin, Inc. (LPI) determined the failure mechanism and indicated that the other service water pump couplings were susceptible to similar failure mechanisms. Shortly thereafter, all service water pump couplings were replaced with a material less susceptible to IGSCC failure. The couplings removed from the P-7A and P-7B pumps were tested to determine their as-found condition and to estimate their expected lifetime had they been allowed to run to failure.

This evaluation examines the conditions discovered by that inspection and the assessment of potential crack growth rates (CGR) for the P-7A and P-7B pump couplings. Based on the as-found condition information provided in the LPI report [1], an estimate of the degraded failure-to-run rate for each pump was developed. A convolution distribution for the joint probability of failure was derived from the individual pump failure-to-run rates. This represents the probability of failure of both pumps as a function of time since the couplings were initially installed.

Following failure of pump P-7C, the probability that the two remaining pumps would fail to run can be assessed for any particular time period by subtracting the convolution failure probability at the time of the P-7C failure from the convolution failure probability at some later time. The key time period of interest is either the repair time for the P-7C pump or the Technical Specification (TS) allowed outage time, whichever is smaller. In this case, the TS allowed outage time of 3 days was used even though the actual time to repair the pump was shorter. This represents a minor conservatism in the analysis that is not expected to affect the result or conclusions appreciably.
2.0 Evaluation

There are three stages to failure of a coupling due to IGSCC: crack initiation, crack growth to a through-wall condition, and critical fracture (failure of the material). The last of these occurs rapidly compared to the other two and is conservatively assumed to occur immediately upon crack growth to a through-wall state. The crack growth rate (CGR) is constant over time once a crack has initiated, while the crack initiation rate varies over time.
As noted in the LPI report [1]:
The time to failure of a susceptible material in a given environment is dependent on the applied tensile stress, as can be seen in Figure 3-7 (not shown here). The plot compares applied stress or load to the logarithm of exposure time in an environment and illustrates that the time to failure increases significantly with decreasing applied stress. The crack propagation time, tcp, is taken to be the time of failure, tf, minus the time of initiation, tin. The time at failure is typically known. However, the time of initiation is highly alloy-environment and applied stress dependent and thus is an unknown without specific test data. The initiation time is also highly dependent upon pre-existing flaws that may have been introduced during heat treatment or thread fabrication. Therefore, predicting initiation time is difficult. Unless there are pre-existing flaws, a distribution of 80% initiation and 20% propagation is considered reasonable for the life of a component subject to the SCC process, as suggested by Figure 3-8 (not shown here).
To determine the expected life of the non-failed couplings due to IGSCC, three cases for the CGR were postulated in the LPI report [1] based on the P-7C pump history for the first failure (the shortest time between installation and failure which leads to the highest CGR estimation).
Case 1 involved using the crack growth rate based on generic data associated with crack growth rates for this type of material in distilled water (2.3E-04 in/hr). This produced an expected life of about 90 days if a crack was initiated at the time of installation. Case 2 assumes that half of the observed life of the first P-7C coupling failure was spent in crack initiation and half of the time was spent in crack growth. This case results in a CGR of 3.81E-04 in/hr. Case 3 involved the assumption that a pre-existing crack had occurred at the time of coupling installation for the first P-7C coupling failure (the one with the shortest time to failure) and resulted in a CGR of 1.91E-04 in/hr.
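The arithmetic behind the three cases can be checked with a short calculation. The following is only a sketch under stated assumptions, not the LPI method: it assumes a constant CGR, a crack that must cross the full 0.5 inch wall quoted in this evaluation, and a first P-7C failure at exactly 110 days (the text says "about 110 days"). The function names are illustrative.

```python
HOURS_PER_DAY = 24.0
WALL_THICKNESS_IN = 0.5  # total wall thickness quoted in this evaluation (assumed to apply here)


def days_to_through_wall(cgr_in_per_hr, start_depth_in=0.0, wall_in=WALL_THICKNESS_IN):
    """Days for a crack growing at a constant rate to become through-wall."""
    remaining_in = wall_in - start_depth_in
    return remaining_in / cgr_in_per_hr / HOURS_PER_DAY


def implied_cgr(growth_days, wall_in=WALL_THICKNESS_IN):
    """Constant CGR (in/hr) implied if the full wall is crossed in growth_days."""
    return wall_in / (growth_days * HOURS_PER_DAY)


# Case 1: generic distilled-water CGR, crack assumed present at installation.
case1_life_days = days_to_through_wall(2.3e-4)

# Case 2: half of the ~110-day life spent in crack growth.
case2_cgr = implied_cgr(110.0 / 2.0)

# Case 3: crack assumed pre-existing, so all ~110 days spent in crack growth.
case3_cgr = implied_cgr(110.0)

print(f"Case 1 life: {case1_life_days:.1f} days")
print(f"Case 2 CGR:  {case2_cgr:.2e} in/hr")
print(f"Case 3 CGR:  {case3_cgr:.2e} in/hr")
```

Under these assumptions Case 1 gives roughly the 90 days stated above, and the Case 2 and Case 3 rates come out within about 1% of the quoted 3.81E-04 and 1.91E-04 in/hr; the small differences are consistent with the failure time being slightly shorter than the rounded 110 days.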
The LPI report [1] indicates that the P-7C pump couplings failed initially after about 110 days of operation with the new coupling material in place and the second failure occurred at about 670 days after the first failure with similar coupling material in place. As the CGRs in this evaluation are all based on the shortest time to failure for the P-7C pump, using these CGRs to estimate the remaining life of the as-found coupling for P-7A and P-7B represents a conservative estimate based on the as-found conditions.
To evaluate the failure-to-run rate for the P-7B pump, the actual condition of the worst coupling was used to estimate the remaining life. The worst coupling exhibited cracks with a depth of 0.132 inches in a total wall thickness of 0.5 inches. Using the CGRs from the three cases noted above, the LPI report estimated the remaining life for this coupling. A Generalized Gamma distribution was fit to these estimates and is shown below. The Weibull++ 7 software package was used to fit the data [2].
For the P-7A couplings, examination of the material after the failure of the P-7C coupling showed no indications of crack initiation. As a conservative estimate of the remaining life of these couplings, the LPI report [1] assumed that a crack would initiate at the time the coupling was removed for inspection. Thus, the estimated remaining life is based on the CGRs from the three cases noted above and on the assumption that cracks existed which were not observed in the as-found condition of the couplings. Using this conservative assumption, a Generalized Gamma distribution was fit to these estimates and is shown below. The Weibull++ 7 software package was used to fit the data [2].
To estimate the probability that both the P-7A and P-7B couplings would fail within the 3-day time window allowed by Technical Specifications, a convolution of these two failure rates produced the joint failure probability curve shown below. The difference between the joint failure probability at the time of the P-7C failure and the joint failure probability 3 days later is the probability that both P-7A and P-7B would fail some time during that 3-day period. That value is 2.65E-05. The figure below shows the convolution curve. The convolution was simulated using OpenBUGS [3], and the OpenBUGS result was fitted to a closed-form distribution using the EasyFit software package [4].
The convolution curve generated above is based on the fact that while the mechanisms that cause a higher than normal failure rate for service water pump couplings are common, the rate at which they affect each pump is not. That is, while each pump has a higher than normal failure rate, the failure rates differ and are statistically independent. The worst case scenario would involve complete dependence between the failure rates for both pumps. This is tantamount to using the worst failure rate for either pump as the rate at which both pumps fail.

The difference in time between the failure of the P-7C couplings and the as-found conditions of the P-7A and P-7B couplings demonstrates that this worst case scenario is not valid. Nevertheless, assuming complete dependence between the two failure rates results in a probability of total loss of service water during the 3-day allowed outage time of 2.61E-03. The figure below shows the "delta" curve for the convolution curve and the complete dependence curve as a function of time.
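The two "delta" values quoted above follow directly from the cumulative probabilities tabulated in Appendix A. The sketch below simply subtracts the value at the time of the P-7C failure (day 0) from each later value; the list and function names are illustrative, and the inputs are the rounded table entries, so the complete-dependence result differs from the quoted 2.61E-03 in the last digit.

```python
# Cumulative failure probabilities from Appendix A, indexed by days after
# the second P-7C failure (rounded values as tabulated).
days_after_failure = [0, 3, 10, 30, 60, 90]
f_pump_7b = [2.02e-2, 2.28e-2, 3.02e-2, 6.57e-2, 1.99e-1, 5.64e-1]
f_convolution = [4.45e-5, 7.10e-5, 1.93e-4, 2.01e-3, 2.77e-2, 1.90e-1]


def delta_curve(cdf_values):
    """Probability of failure accrued since day 0: F(t) - F(0)."""
    baseline = cdf_values[0]
    return [value - baseline for value in cdf_values]


# Complete dependence: both pumps fail at the rate of the worst pump (P-7B).
delta_dependent = dict(zip(days_after_failure, delta_curve(f_pump_7b)))
# Independence: joint failure taken from the convolution curve.
delta_independent = dict(zip(days_after_failure, delta_curve(f_convolution)))

for day in days_after_failure:
    print(f"day {day:2d}: dependent {delta_dependent[day]:.2e}, "
          f"independent {delta_independent[day]:.2e}")
```

For the 3-day window the two assumptions differ by roughly two orders of magnitude, which is the contrast the paragraph above draws.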
3.0 Summary

Using the as-found condition of the P-7A and P-7B pump couplings and conservative assumptions about the crack growth rate (based on the shortest time to failure of the P-7C pump), an estimate of the remaining life for these couplings was provided by the LPI report [1]. From that information, a distribution for the failure-to-run rate was produced by fitting a Generalized Gamma distribution to the data.
A convolution of the resulting failure rate curves produced a curve representing the probability of failure of both the P-7A and P-7B couplings as a function of time after the couplings were initially installed. Comparing the probability at the time of P-7C failure and the probability three days later (based on the TS allowed outage time) demonstrates that the likelihood of a total loss of service water during that interval was small.
The figure below is a combination of the degraded failure rates based on as-found conditions along with the convolution curve for those failure rates. It also includes the "delta" curve, which shows the difference between the convolution curve value at the time of P-7C failure and the convolution curve at various times after P-7C failure. This evaluation indicates that the likelihood of total loss of service water following failure of the P-7C pump was low for a considerable period of time following the failure of the P-7C pump, even with degraded failure rates in the remaining pump couplings.
4.0 References
- 1. F11358-LR-001 Rev. 0, "Past Operability Assessment of Service Water Pumps P-7A and P-7B associated with As-found Evaluation of Pump Shaft Couplings - Palisades Nuclear Plant", Lucius Pitkin, Inc., September 28, 2011
- 2. Weibull++ 7 website, http://www.reliasoft.com/Weibull/index.htm
- 3. OpenBUGS website http://www.openbugs.info/w/FrontPage
- 4. EasyFit website, http://www.mathwave.com/
5.0 Appendices Appendix A, Convolution Input and Output (7 pages)
EA-PSA-SDP-P7C-11-06 0 Appendix A Page 1 of 7

Development of probability of failure curve of pump 7A and 7B couplings

Estimated time to failure used to develop the probability of failure curve of the pump 7B coupling

Input for Weibull++ 7 (Pump 7B)

| Coupling | Case | Time to Failure |
|---|---|---|
| 11-P7B-6K | Case 1 | 552.5 |
| 11-P7B-6K | Case 2 | 515 |
| 11-P7B-6K | Case 3 | 574 |

Curve Fitting Result to Generalized Gamma Distribution for Pump 7B

Output from Weibull++ 7: Generalized Gamma Distribution with μ=6.3173, σ=0.0231 and λ=2.3501

[Figure: ReliaSoft Weibull++ 7 probability plot, Generalized Gamma (G-Gamma-3P, NLRR SRM MED FM, F=3/S=0); probability versus Time (t), with data points and probability line. Krisnandito Hardjoko, Erin Engineering, 12/1/2011 12:44:18 PM]
StatAssist, part of the EasyFit software package, was used to calculate the cumulative probability of failure.

| | F(857) | F(860) | F(867) | F(887) | F(917) | F(947) |
|---|---|---|---|---|---|---|
| Day since the failure of Pump 7-C for the 2nd time | 0 | 3 | 10 | 30 | 60 | 90 |
| Generalized Gamma for Pump 7B, where F(t)=FB(t-403) | 2.02E-02 | 2.28E-02 | 3.02E-02 | 6.57E-02 | 1.99E-01 | 5.64E-01 |

[Figure: Cumulative Probability of Failure for Pump 7B. Generalized Gamma fit of Pump 7B; probability (log scale, 1.E-05 to 1.E+00) versus day after 2nd failure of Pump 7C (0 to 100).]

Input for Weibull++ 7 (Pump 7A)

| Coupling | Case | Time to Failure |
|---|---|---|
| 11-P7A-6K | Case 1 | 966.3 |
| 11-P7A-6K | Case 2 | 931 |
| 11-P7A-6K | Case 3 | 985 |
Curve Fitting Result to Generalized Gamma Distribution for Pump 7A

Output from Weibull++ 7: Generalized Gamma Distribution with μ=6.8863, σ=0.0231 and λ=1.5429

[Figure: ReliaSoft Weibull++ 7 probability plot, Generalized Gamma (G-Gamma-3P, NLRR SRM MED FM, F=3/S=0); probability versus Time (t), with data points and probability line. Krisnandito Hardjoko, Erin Engineering, 11/29/2011 11:21:31 AM]

[Figure: Cumulative Probability of Failure for Pump 7A. Generalized Gamma fit of Pump 7A; probability (log scale, 1.E-05 to 1.E+00) versus day after 2nd failure of Pump 7C (0 to 100).]
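The Weibull++ fit values (6.3173, 0.0231, 2.3501 for Pump 7B and 6.8863, 0.0231, 1.5429 for Pump 7A) and the OpenBUGS inputs listed later in this appendix (r, mu, beta) appear to describe the same distributions in two parameterizations. The sketch below is an inference, not something the source states: it assumes the Weibull++ values are the location, scale and shape parameters (μ, σ, λ) of the ReliaSoft generalized gamma, and that OpenBUGS `dggamma(r, mu, beta)` has density proportional to x^(βr-1)·exp(-(μx)^β). Matching the two cumulative distributions then gives r = 1/λ², β = λ/σ and μ_bugs = exp(-μ)·λ^(-2/β).

```python
import math


def weibullpp_to_openbugs(mu, sigma, lam):
    """Map assumed Weibull++ (mu, sigma, lambda) to assumed OpenBUGS (r, mu, beta).

    Derived by equating the two CDFs under the parameterizations assumed in
    the accompanying text; valid only for lambda > 0.
    """
    r = 1.0 / lam ** 2
    beta = lam / sigma
    mu_bugs = math.exp(-mu) * lam ** (-2.0 / beta)
    return r, mu_bugs, beta


# Fit values reported in this appendix.
pump_7a = weibullpp_to_openbugs(6.8863, 0.0231, 1.5429)
pump_7b = weibullpp_to_openbugs(6.3173, 0.0231, 2.3501)

print("Pump 7A (r, mu, beta):", pump_7a)
print("Pump 7B (r, mu, beta):", pump_7b)
```

With the rounded fit values, the converted parameters agree with the OpenBUGS input list (ra=0.42, mua=0.001008503, betaa=66.73, rb=0.181, mub=0.001775, betab=101.81) to within about 0.1%, which supports the assumed correspondence without proving it.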
StatAssist, part of the EasyFit software package, was used to calculate the cumulative probability of failure.

| | F(857) | F(860) | F(867) | F(887) | F(917) | F(947) |
|---|---|---|---|---|---|---|
| Day since the failure of Pump 7-C for the 2nd time | 0 | 3 | 10 | 30 | 60 | 90 |
| Generalized Gamma for Pump 7A | 1.89E-02 | 2.09E-02 | 2.62E-02 | 4.96E-02 | 1.26E-01 | 3.07E-01 |

Convolution of Pump 7A and Pump 7B

- Simulate the Pump 7A time to failure and the Pump 7B time to failure + 403 days.
- For each generated sample, take the longer of the Pump 7A and Pump 7B times to failure.
- Generate 100000 samples.
- Use OpenBUGS to perform the convolution.
- Fit the 100000 convolution results to a curve.
Input for OpenBUGS

model{
  ta ~ dggamma(ra, mua, betaa)
  tb ~ dggamma(rb, mub, betab)
  tnb <- tb + 403
  tcomb <- max(ta, tnb)
}
list(ra=0.42, mua=0.001008503, betaa=66.73, rb=0.181, mub=0.001775, betab=101.81)

| | mean | sd | MC_error | val2.5pc | median | val97.5pc | start | sample |
|---|---|---|---|---|---|---|---|---|
| ta | 956.8 | 35.83 | 0.1106 | 865.7 | 964.5 | 1004.0 | 1001 | 100000 |
| tb | 532.8 | 27.96 | 0.09087 | 459.6 | 540.3 | 565.4 | 1001 | 100000 |
| tcomb | 966.0 | 22.23 | 0.07175 | 917.2 | 966.4 | 1004.0 | 1001 | 100000 |
| tnb | 935.8 | 27.96 | 0.09087 | 862.6 | 943.3 | 968.4 | 1001 | 100000 |

Output of OpenBUGS

| | mean | sd | MC_error | val2.5pc | median | val97.5pc | sample |
|---|---|---|---|---|---|---|---|
| time to failure pump 7A | 956.8 | 35.83 | 0.1106 | 865.7 | 964.5 | 1004 | 100000 |
| time to failure pump 7B | 532.8 | 27.96 | 0.09087 | 459.6 | 540.3 | 565.4 | 100000 |
| time to failure pump 7B + 403 days | 935.8 | 27.96 | 0.09087 | 862.6 | 943.3 | 968.4 | 100000 |
| time to failure convolution pump 7A & 7B | 966 | 22.23 | 0.07175 | 917.2 | 966.4 | 1004 | 100000 |
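The OpenBUGS model above can be approximated with a plain Monte Carlo simulation. This is a hedged sketch rather than a reproduction of the OpenBUGS run: it assumes that `dggamma(r, mu, beta)` means (mu·x)^beta follows a Gamma(r, 1) distribution, so a sample is drawn as g^(1/beta)/mu with g from a standard gamma generator. The helper names and the seed are illustrative.

```python
import random
import statistics

# Parameters from the OpenBUGS input list in this appendix.
RA, MUA, BETAA = 0.42, 0.001008503, 66.73
RB, MUB, BETAB = 0.181, 0.001775, 101.81
P7B_OFFSET_DAYS = 403.0  # shift applied to the Pump 7B time to failure


def sample_ggamma(rng, r, mu, beta):
    """Draw x such that (mu * x) ** beta ~ Gamma(shape=r, scale=1) (assumed form)."""
    g = rng.gammavariate(r, 1.0)
    return g ** (1.0 / beta) / mu


def simulate(n_samples=100_000, seed=1):
    """Return lists of ta, tnb = tb + 403, and tcomb = max(ta, tnb)."""
    rng = random.Random(seed)
    ta, tnb, tcomb = [], [], []
    for _ in range(n_samples):
        a = sample_ggamma(rng, RA, MUA, BETAA)
        b = sample_ggamma(rng, RB, MUB, BETAB) + P7B_OFFSET_DAYS
        ta.append(a)
        tnb.append(b)
        tcomb.append(max(a, b))  # both pumps have failed only once the later one fails
    return ta, tnb, tcomb


ta, tnb, tcomb = simulate()
mean_ta = statistics.fmean(ta)
mean_tnb = statistics.fmean(tnb)
mean_tcomb = statistics.fmean(tcomb)
print(f"mean ta = {mean_ta:.1f}, mean tnb = {mean_tnb:.1f}, mean tcomb = {mean_tcomb:.1f}")
```

If the assumed parameterization is right, the sample means should land near the tabulated OpenBUGS means (956.8, 935.8 and 966.0). With 100000 samples the far-left tail that drives the 3-day probability is represented by only a handful of draws, which is presumably why the appendix fits a closed-form distribution to the simulated results instead of reading the tail directly.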
The OpenBUGS simulation output points were used as input for EasyFit. The Weibull (3P) distribution fit the convolution result best.

Curve Fitting Result to Weibull (3P) Distribution for Convolution Result

Output from EasyFit: Weibull (3P) with parameters α=8.630243967, β=172.4086991, and γ=803.0072566

[Figure: Probability Density Function; histogram of the convolution result with the Weibull (3P) fit, x from 840 to 1020.]
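The fitted curve can be evaluated directly to check the tabulated cumulative probabilities. The sketch below assumes the three EasyFit values are, in order, the shape, scale and location of a three-parameter Weibull, so that F(t) = 1 - exp(-((t - location)/scale)^shape) for t above the location; the constant and function names are illustrative.

```python
import math

# EasyFit Weibull (3P) fit values from this appendix; the shape/scale/location
# roles are an assumption based on the order in which they are listed.
SHAPE = 8.630243967
SCALE = 172.4086991
LOCATION = 803.0072566


def weibull3p_cdf(t):
    """Cumulative probability of failure at time t (days since installation)."""
    if t <= LOCATION:
        return 0.0
    x = ((t - LOCATION) / SCALE) ** SHAPE
    return -math.expm1(-x)  # 1 - exp(-x), accurate for very small x


for day in (857, 860, 867, 887, 917, 947):
    print(f"F({day}) = {weibull3p_cdf(day):.2e}")
```

Evaluated this way, F(857) and F(860) come out at about 4.45E-05 and 7.10E-05, matching the summary table for the convolution, which supports the assumed parameter roles.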
Cumulative Probability of Failure for Convolution of Pump 7A and Pump 7B

Summary of Cumulative Probability of Failure for Pump 7A, Pump 7B and Convolution Result

| | F(857) | F(860) | F(867) | F(887) | F(917) | F(947) |
|---|---|---|---|---|---|---|
| Day since the failure of Pump 7-C for the 2nd time | 0 | 3 | 10 | 30 | 60 | 90 |
| Generalized Gamma for Pump 7A | 1.89E-02 | 2.09E-02 | 2.62E-02 | 4.96E-02 | 1.26E-01 | 3.07E-01 |
| Generalized Gamma for Pump 7B | 2.02E-02 | 2.28E-02 | 3.02E-02 | 6.57E-02 | 1.99E-01 | 5.64E-01 |
| Weibull (3P) fit for convolution of Pump 7A and Pump 7B | 4.45E-05 | 7.10E-05 | 1.93E-04 | 2.01E-03 | 2.77E-02 | 1.90E-01 |

[Figure: Weibull (3P) fit for convolution of Pump 7A and Pump 7B; probability (log scale, 1.E-05 to 1.E+00) versus day after 2nd failure of Pump 7C (0 to 100).]
Probability that Both Pump 7A and Pump 7B Fail After Pump 7-C Failed for the 2nd Time

The blue curve assumes complete dependence on the worst pump (Pump 7B); the purple curve assumes Pumps 7-A and 7-B are completely independent.

| | F(857) | F(860) | F(867) | F(887) | F(917) | F(947) |
|---|---|---|---|---|---|---|
| Day since the failure of Pump 7-C for the 2nd time | 0 | 3 | 10 | 30 | 60 | 90 |
| ΔFB(t) = FB(t)-FB(857) | 0.00E+00 | 2.61E-03 | 9.98E-03 | 4.55E-02 | 1.79E-01 | 5.44E-01 |
| ΔFconv(t) = Fconv(t)-Fconv(857) | 0.00E+00 | 2.65E-05 | 1.48E-04 | 1.97E-03 | 2.77E-02 | 1.90E-01 |

[Figure: ΔFB(t) = FB(t)-FB(857) and ΔFconv(t) = Fconv(t)-Fconv(857); probability (log scale, 1.E-05 to 1.E+00) versus day after 2nd failure of Pump 7C (0 to 100).]
EA-PSA-SDP-P7C-11-06 1, PAL Comments on Draft NUREG, Page 1 of 29

Common-Cause Failure Analysis in Event and Condition Assessment: Guidance and Research

Song-Hua Shen, NRC
Don Marksberry, NRC
Gary DeMoss, NRC
Kevin Coyne, NRC
Dale M. Rasmuson (NRC, retired)
Dana L. Kelly, INL
John A. Schroeder, INL
Curtis L. Smith, INL

ABSTRACT

Event and condition assessment is an application of probabilistic risk assessment in which observed equipment failures, degradations, and outages are mapped into the risk model to obtain a numerical estimate of risk significance, which can then be used in other applications, such as the Significance Determination Process. Past experience has shown that conditional common-cause failure (CCF) probability is often a substantial contributor to the risk significance of a performance deficiency.
However, guidance for assessing CCF potential has been lacking. Because of this lack of guidance, considerable resources have been expended in efforts to demonstrate an absence of CCF potential, often by scrutinizing piece-part differences across redundant trains instead of focusing on the higher organizational or programmatic issues that were the real cause of the observed failure. Piece parts have often been the object of scrutiny in efforts to declare an observed failure independent, meaning that no potential existed for CCF of redundant components. Such scrutiny of piece parts is counter to the Significance Determination Process guidance in Inspection Manual Chapter 0308, Reactor Oversight Process (ROP) Basis Document, which states, The performance deficiency should most often be identified as the proximate cause of the degradation. In other words, the performance deficiency is not the degraded condition itself, it is the proximate cause of the degraded condition.
This NUREG offers guidance for assessing CCF potential at the level of the observed performance deficiency, provides essential definitions of technical terms, and describes the treatment of CCF for a number of categories of component failures and outages. It also describes technical issues with both the consensus CCF model used in probabilistic risk assessments conducted in the United States and the associated parameter estimates and the data upon which they are based. The NUREG closes with a summary of future research intended to address these issues.
CONTENTS

INTRODUCTION AND MOTIVATION
  1.1 PRA Treatment of Dependent Failure
  1.2 ECA Philosophy Regarding CCF
  1.3 Definitions and Discussion
  1.4 ECA Ground Rules for CCF Treatment
    1.4.1 Deviations from Ground Rules
  1.5 CCF Examples
  1.6 Summary of Guidance
DETAILED GUIDANCE FOR TREATMENT OF CCF
  2.1 Basic Principles of CCF Treatment in ECA
  2.2 CCF Treatment Categories
    2.2.1 Observed Failure with Loss of Function of One Component in the CCCG
    2.2.2 Observed Failures with Loss of Function of Two or More Components in the CCCG
    2.2.3 Observed Failure with Loss of Function of One Component in the CCCG - Component not in SPAR Model
    2.2.4 Degradation in One or More Components in CCCG without Observed Failure
    2.2.5 Observed Unavailability of One or More Components in CCCG Due to Testing or Planned Maintenance
    2.2.6 Observed Loss of Function of Components in CCCG Caused by the State of Other Components Not in the CCCG
    2.2.7 Observed Loss of Function of One or More Components in CCCG as a Result of Environmental Stress Caused by Failure or Degradation of Other Components outside Affected CCCG
    2.2.8 No Observed Failure or Degradation in the Affected CCCG
  2.3 Cases where Guidance Might not Apply
    2.3.1 CCCG Boundary Issues
Issues with Current CCF Modeling and Data Analysis Relevant to ECA
  3.1 Issues with CCF Model
    3.1.1 CCF Model is not Causal
    3.1.2 BPM Employs a Symmetry Assumption
    3.1.3 CCF is not Modeled Across Component or System Boundaries
    3.1.4 Impact on More than One Failure Mode not Captured
    3.1.5 Conditional CCF Calculations in Models for Support System Initiating Events (SSIE)
  3.2 Issues with Alpha-Factor Estimates
    3.2.1 Treatment of Shared Components and Latent Human Errors
    3.2.2 Prior Distributions for Alpha Factors
    3.2.3 Estimates of Alpha Factors are not Plant-Specific
    3.2.4 Treatment of Staggered Testing
Future Research
  4.1 Causal Failure Models
  4.2 SSIE Models
  4.3 Enhancements to NRC CCF Database
    4.3.1 Prior Distribution for Alpha-Factors
    4.3.2 Plant-Specific Alpha-Factor Estimates
    4.3.3 Adjusting Alpha-Factor Estimates
    4.3.4 Effects of Testing Schemes on Estimators
REFERENCES
31 Appendix A................................................................................................................................................... 33 Conditional Common-Cause Failure Probability Calculations...........................................................................33 A.1 Review of Basic Parameter Model and Alpha-Factor Parameterization........................................ 33 A.2 Calculating Conditional Common-Cause Failure Probability........................................................... 35 A.2.1 Failure To Start......................................................................................................................... 37 A.2.2 Independent Failure To Start....................................................................................................... 38 A.2.3 Test or Preventive Maintenance Outage.................................................................................... 39 Appendix B............................................................................................................................................................... 40 Effects of Testing Schemes on Common-Cause Failure Parameters................................................................. 40 B.1 Common-Cause Failure Model Parameters............................................................................................ 40 B.2 Ways of Collecting Data........................................................................................................................... 41 B.2.1 Estimators when All m Components Are Demanded on Each Test........................................ 42 B.2.3 Estimators with Staggered Testing............................................................................................. 44 B.3 Effect of Using the Wrong Formula........................................................................................................ 
47 B.4 Proofs of Selected Results......................................................................................................................... 48 FIGURES Figure 1 Illustration of difference between cause of failure and failure mechanism....................................... 3 Figure 2 Plot of 500 dependent failure times for two components from Marshall-Olkin shock model, showing simultaneous failure times caused by shared shocks occurring randomly in time........................... 5
EA-PSA-SDP-P7C-11-06 1, PAL Comments on Draft NUREG, Page 5 of 29 viii Figure 3 Scatterplot of 100 dependent failure times for two components, showing positive correlation of failure times, but without the simultaneous failures that would be produced from a shock model, taken from (Kelly D. L., 2007)................................................................................................................................................. 6 Figure 4 Circumferential crack in EDG flexible coupling............................................................................. 11 Figure 5 Lube oil lead from Y-strainer end cap on Dresden EDG 2/3............................................................ 12 Figure 6 Failed plastic end cap on Dresden EDG 2/3 lube oil strainer......................................................... 12 Figure 7 Reliability block diagram for two pumps in parallel, one normally running and one in standby 20 Figure 8 Markov model for two-pump configuration in Figure 7.................................................................... 21 Figure 9 Reduced Markov model for two-pump CCCG, given that pump B has failed with potential for CCF of pump A........................................................................................................................................................ 22 Figure 10 Example of Bayesian network causal model..................................................................................... 29 Figure 11 Example fault tree for two failure modes and three components................................................. 36 TABLES Table 1 CCF Data Used To Develop Alpha-Factor Prior Distribution............................................................ 24 Table 2 Component Failure Probabilities for Three-Component Example..................................................... 36 Table 3 Basic Event Probabilities for Three-Component Example.................................................................. 
36 Table 4 Quantified Minimal Cut Sets for Three-Component Example........................................................... 37 Table 5 Quantified Minimal Cut Sets and Conditional Probabilities for Three-Component Example, Given Observed Failure To Start of Component A............................................................................................ 37 Table 6 Quantified Minimal Cut Sets and Conditional Probabilities for Three-Component Example, Given Observed Independent Failure To Start of Component A..................................................................... 39
FOREWORD
The U.S. Nuclear Regulatory Commission's Division of Risk Analysis in the Office of Nuclear Regulatory Research develops and manages research programs relating to probabilistic risk assessments (PRAs), human factors, and human reliability analysis. The division assesses U.S. operational safety data and reliability information to determine risk-significant insights and trends, which allows us to focus on the risks most important to protecting public health and safety.
A general conclusion from PRAs of commercial nuclear power plants is that common-cause failures (CCFs) are significant contributors to the unavailability of safety systems. Especially in event and condition assessment (ECA), an observed performance deficiency has the potential to fail multiple components in a relatively short time period. CCF in ECA has to be analyzed at the level of the observed performance deficiency. In other words, the assessment of the potential for multiple dependent equipment failures in ECA should not be constrained by an assumption that CCF requires failure of the same piece part or subcomponent because of the same failure mechanism. Instead, this assessment should focus on the higher organizational or programmatic issues that were the real cause of the observed failure.
This NUREG offers guidance for assessing CCF potential at the level of the observed performance deficiency, provides essential definitions of technical terms, and describes the treatment of CCF for a number of categories of component failures and outages. It also describes technical issues with both the consensus CCF model used in PRAs conducted in the United States and the associated parameter estimates and the data upon which they are based. The NUREG closes with a summary of ongoing and future research intended to address these issues.
Richard Correia, Director
Division of Risk Analysis
Office of Nuclear Regulatory Research
U.S. Nuclear Regulatory Commission
1.) This initial discussion proposes conducting common cause analysis under the premise that it not be constrained to the same piece part, the same subcomponent, or the same failure mechanism. That premise is contrary to any existing PRA methodology for conducting common cause analysis.
This document attempts to move the concept of cross-cutting issues into the PRA model, which may or may not be consistent with how common cause contribution is implemented in the model. A principal issue is that the guidance does not direct the analyst to provide a basis for concluding that the performance deficiency is directly tied to the observed component failures, or that all common cause failures identified would be expected to occur within the PRA mission time of a single event. The approach elevates the definition of the performance deficiency to the broadest definition that can be shown to encompass the event in question. The elevated description is then used to assert that many other common cause failures beyond those identified would be possible, and not necessarily constrained to the common cause group of the component(s) that failed in the event under consideration.
For example, the following is from IM 0308 Attachment 3:
The staff is responsible to define licensee performance deficiencies. Where the proximate cause of multiple degraded conditions is the same, there is likely to be only one finding (based on the identified performance deficiency related to the proximate cause) and the risk impact of the collective degraded conditions (including any overlapping conditions) is then appropriately used as the basis for the SDP result. However, this concept could be taken to an extreme of defining all licensee performance deficiencies as management weakness or something similarly fundamental. Doing so would then cause all degraded conditions to be manifestations of a single and possibly never-ending finding, would make unnecessary the need for an Action Matrix, and may require the staff to devise a continuous risk meter or similar substitute for the Action Matrix. Thus, a floor was set for the implementation of this concept that is consistent with the ROP framework, in that no performance deficiency should be defined at a level associated with the ROP cross-cutting issues (i.e., human performance, safety-conscious work environment, and problem identification and resolution) or more
fundamentally. Although artificially setting this floor may create a philosophical inconsistency with use of a probabilistic thinking framework (i.e., if there is really a known common-cause effect taking place, then it should be explicitly acknowledged in a probabilistic model), it remains necessary for practical reasons as long as the Action Matrix continues in its present form. Concerns about possible insufficient regulatory responses arising from this approach are also mitigated as noted below.
This document's premise is considered inconsistent with the above.
This document's approach goes on to state that, given the definition of the performance deficiency, the current assignments in the PRA model for common cause grouping may no longer be applicable, and the failure mechanisms considered as the common element of group failure are no longer a constraint on the number of components that would be the target group for common cause failure. The issue is that defining the performance deficiency at this high level (POOR MAINTENANCE PROCESS) represents an unbounded characterization of commonality among a group of components. This would allow cross-system groupings, and grouping of dissimilar components, into much larger common cause groups. There is no guidance that mandates the development of a technical basis that would establish the connection of the specific observed failure(s) to the entire common cause group.
2.) The summary discussion in the foreword also states that the NUREG describes technical issues with the consensus CCF model used in PRAs, and with the associated parameter estimates and the data upon which they are based. The principal issue with this description is that the industry has developed several standards (ASME/ANS) to establish a baseline, consistent methodology for implementing risk assessment.
Plants are required to undergo review by external organizations to establish the degree of implementation of, and compliance with, these standards. The authors herein have determined that the current standards are inadequate and imply that the implementation of this approach provides a method of quantifying risk assessments that will correct these deficiencies.
The guidance provided is a proposed means of correcting issues with the current consensus model without having been subjected to the same process of development as the current standards. Moreover, any issues with the current standards should be resolved within the standards process prior to issuing contrary guidance. If there are legitimate issues with the ASME/ANS standards process, they need to be corrected first.
ACKNOWLEDGMENTS The authors would like to acknowledge the technical contributions of Dr. Corwin Atwood to Appendix B.
ABBREVIATIONS
ASP      accident sequence precursor
BPM      basic parameter model
CCF      common-cause failure
CCCG     common-cause component group
CFR      Code of Federal Regulations
ECA      event and condition assessment
EDG      emergency diesel generator
FTR      failure to run
FTS      failure to start
HFE      human failure event
INL      Idaho National Laboratory
IR       inspection report
MCC      motor control center
MLE      maximum likelihood estimate
NRC      U.S. Nuclear Regulatory Commission
PORV     power-operated relief valve
PRA      probabilistic risk assessment
SAPHIRE  Systems Analysis Programs for Hands-On Integrated Risk Evaluations
SDP      significance determination process
SPAR     standardized plant analysis risk
SSIE     support system initiating event
Common-Cause Failure Analysis in Event and Condition Assessment
1. INTRODUCTION AND MOTIVATION
Event and condition assessment (ECA)^a is an application of probabilistic risk assessment (PRA) in which observed equipment failures, degradations, and outages are mapped into the risk model to obtain a numerical estimate of risk significance. Such an assessment can be either prospective, as when utilities use PRA as an aid in planning and scheduling equipment maintenance, or retrospective, such as in the Nuclear Regulatory Commission's Significance Determination Process (SDP) and Accident Sequence Precursor (ASP) Program. In this report, we focus on retrospective assessments intended to estimate the risk significance of degraded conditions, such as equipment failure caused by a deficiency in a maintenance process.
However, it is important to understand that the analyst is estimating a conditional risk metric (e.g., the conditional probability of core damage) for the event. Because the actual event did not lead to core damage, the event is not modeled exactly as it transpired because this would lead to a conditional core damage probability of zero. Instead, observed failures are mapped into the PRA model, but successes are treated probabilistically; the analyst accounts for the possibility that equipment that functioned successfully might, with some probability, fail to function, and that equipment that was not demanded in the actual event could have been demanded for some scenarios and also have a probability of failure.
Thus, failure probabilities are left at their nominal values or are conditioned as necessary to reflect the details of the event.
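The mapping described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the event names, the nominal probabilities, and the two cut sets are hypothetical and are not taken from any plant model. It shows why modeling the event exactly as it transpired yields zero, while setting only the observed failure to 1.0 and leaving the successes at their nominal probabilities yields a nonzero conditional core damage probability.

```python
from math import prod

# Hypothetical nominal basic-event probabilities (illustrative values only).
NOMINAL = {
    "EDG_A": 0.05,        # diesel generator A fails
    "EDG_B": 0.05,        # diesel generator B fails
    "TDP": 0.10,          # backup pump fails (not demanded in the actual event)
    "OP_RECOVERY": 0.50,  # operators fail to recover power in time
}

# Hypothetical minimal cut sets for core damage, given the initiating event.
CUT_SETS = [
    ("EDG_A", "EDG_B", "TDP"),
    ("EDG_A", "EDG_B", "OP_RECOVERY"),
]


def core_damage_probability(cut_sets, probs):
    """Min-cut upper bound: 1 - product over cut sets of (1 - P(cut set))."""
    no_damage = 1.0
    for cut_set in cut_sets:
        no_damage *= 1.0 - prod(probs[event] for event in cut_set)
    return 1.0 - no_damage


# Baseline: every basic event at its nominal probability.
baseline = core_damage_probability(CUT_SETS, NOMINAL)

# Event modeled exactly as it transpired: A failed, B and the pump succeeded.
as_transpired = core_damage_probability(
    CUT_SETS, {**NOMINAL, "EDG_A": 1.0, "EDG_B": 0.0, "TDP": 0.0}
)

# ECA treatment: observed failure set to 1.0, successes stay probabilistic.
conditioned = core_damage_probability(CUT_SETS, {**NOMINAL, "EDG_A": 1.0})

print(f"baseline      = {baseline:.6f}")
print(f"as transpired = {as_transpired:.6f}")
print(f"conditioned   = {conditioned:.6f}")
```

The min-cut upper bound is used here only because it is simple; a real assessment would quantify the full model.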
3.) This discussion is somewhat vague. When an event happens and core damage does not occur, the conditional probability of core damage is zero. What is being computed is the likelihood that, if the same or a similar event were to occur again under the boundary conditions that existed during the actual event, additional failures would occur and produce core damage. The key point is that the analysis evaluates not the probability of what happened, but what could happen if the event were to occur again.
For example, many, if not most, retrospective ECAs in the U.S. are done as part of the NRC's SDP, which is intended to evaluate the risk significance of inspection findings that pertain to an observed deficiency in licensee performance. Therefore, the ECA's main purpose is to quantify the risk significance of the event caused by the deficiency, using a PRA model (U.S. Nuclear Regulatory Commission, 2006). As an example, if the deficiency that led to an observed failure were poor quality control, then the ECA would estimate the risk significance of the observed equipment failure caused by this deficiency, and the analysis would include a probabilistic treatment of failures that were not actually observed, but could have been, because multiple equipment items could have been impacted by the deficiency (decreased reliability due to susceptibility to the same cause). Because nuclear plants utilize redundant safety equipment, the risk significance of such a deficiency will often be strongly influenced by the potential for dependent failure of this redundant equipment. Thus, a crucial term in the ECA risk equation is the conditional probability that remaining redundant components could fail, given that one or more such components were failed as a result of the identified performance deficiency. This probability is obtained by calculating a conditional common-cause failure (CCF) probability using the inputs to the PRA model.
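As a rough illustration of the conditional probability described above, the sketch below works through a two-component group. It assumes the standard alpha-factor parameterization for non-staggered testing (the contents list an alpha-factor review in Appendix A.1, but the draft's exact formulas are not reproduced here), and the total failure probability and alpha factor are hypothetical values chosen only to show the arithmetic.

```python
def basic_event_probs_m2(q_total, alpha_2):
    """Alpha-factor parameterization for a group of two (non-staggered testing).

    Returns (q_ind, q_ccf): the independent failure probability of one
    component and the probability of the common-cause event failing both.
    """
    alpha_1 = 1.0 - alpha_2
    alpha_t = alpha_1 + 2.0 * alpha_2
    q_ind = (alpha_1 / alpha_t) * q_total
    q_ccf = (2.0 * alpha_2 / alpha_t) * q_total
    return q_ind, q_ccf


def conditional_failure_prob(q_ind, q_ccf):
    """P(B fails | A failed), treating the basic events as independent.

    A fails if its independent event or the shared CCF event occurs;
    both fail if the CCF event occurs or both independent events occur.
    """
    p_a = 1.0 - (1.0 - q_ind) * (1.0 - q_ccf)
    p_a_and_b = 1.0 - (1.0 - q_ccf) * (1.0 - q_ind * q_ind)
    return p_a_and_b / p_a


# Hypothetical inputs, for illustration only.
Q_TOTAL = 0.01   # total failure probability of one component
ALPHA_2 = 0.05   # fraction of failure events that involve both components

q_ind, q_ccf = basic_event_probs_m2(Q_TOTAL, ALPHA_2)
p_b_given_a = conditional_failure_prob(q_ind, q_ccf)

print(f"nominal P(B fails)      = {Q_TOTAL:.4f}")
print(f"P(B fails | A failed)   = {p_b_given_a:.4f}")
```

With these assumed inputs the conditional probability is roughly ten times the nominal value, which is why this term can dominate the result. The calculation conditions only on the fact that A failed, not on a determined cause.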
4.) When an event occurs and the cause of the event is determined, the conditional probability of it being a common cause failure or independent failure is either 1 or 0. There may be uncertainty in determining this, so one might assign some probability that it was a common cause using engineering judgment, but this should not be compared with CCF model parameters.
Past SDP experience has shown that conditional CCF probability is often a significant contributor to the risk significance of the deficiency. In addition, guidance for assessing CCF potential has been lacking. Due to this lack of guidance, considerable resources have been expended in efforts to demonstrate an absence of CCF potential, often by scrutinizing differences among subcomponents and piece parts across redundant trains instead of focusing on the higher organizational or programmatic issues that were the real cause of the observed failure. Piece parts have often been the object of scrutiny in efforts to declare the observed failure independent, meaning there was no potential for CCF of
^a We will not distinguish, except where it is technically necessary to do so, between an event, which involves the occurrence of a PRA initiating event, and a condition, which involves degradation of components for some period of time, but without the occurrence of an initiating event.
5.) The guidance above states that, while common cause contribution has been shown to be a significant contributor in past Event and Condition Assessments (ECA), there have been issues in not appropriately characterizing the common cause potential perceived to be associated with the observed events. The approach states the problem is related to being overly specific in the statement of the performance deficiency which restricts the focus of the risk assessment.
Therefore, the deficiency description should be elevated and broadened to a level commensurate with the definition of the cornerstone or the general requirements of the Quality Assurance Program Elements. While this is appropriate in the context of determining the possible association of several different events into a depiction of broader organizational issues, it also raises two concerns:
- 1. Statements of performance deficiencies at this level result in unbounded issues, which makes it difficult, if not impossible, to demonstrate issue resolution.
- 2. The association of component failures from several different events that have been encompassed by this broadened deficiency definition may not have been shown to be connected by a direct common cause.
The issue is that deficiency definitions at this level are self-fulfilling with respect to any group one would choose to create. The definitions become so vague that anything can be postulated to belong to the group.
Almost all equipment failures that have ever occurred could be lumped into a single group as long as we are willing to discuss causes at the proposed level (e.g. poor maintenance processes). In addition, the depiction of the deficiency in this broader characterization to assess several different events that occurred over some extended period of time ignores any correlation that would have established the probability of the different events occurring within a single event response.
PRA models have not been developed to accommodate this type of assessment of organizational issues and there is no data to support the quantification, as is being proposed in this document. This type of assessment has historically been a qualitative determination of the level of significance of the possible impacts of several disparate but similar events.
Current PRA models are not developed with the capability to perform this type of assessment. A methodology that superimposes this type of assessment onto a PRA would be subject to subjective determinations and to gross over- or underestimation of the risk contribution.
redundant components. Such scrutiny of piece parts is counter to the SDP guidance in (U.S. Nuclear Regulatory Commission, 2006), which states, The performance deficiency should most often be identified as the proximate cause of the degradation. In other words, the performance deficiency is not the degraded condition itself, it is the proximate cause of the degraded condition. The inspector should exercise care to ensure that the performance deficiency is not focused too specifically on a particular subcomponent or piece part. For example, a deficiency related to an inadequate vendor design basis for short-time inrush current on a circuit breaker might better be worded to refer simply to an inadequate vendor design basis, without adding the specific details as to the components that were affected.
One of the primary motivations for this NUREG is to elaborate on the performance deficiency issue, and to provide clear guidance for conditioning CCF probability in ECA.
6.) It is unclear what is meant by proximate cause. This should be better defined.
1.1 PRA Treatment of Dependent Failure
Since the publication of WASH-1400, PRA studies have recognized the importance of dependent failure as a means of defeating designed-in redundancy and diversity. In treating dependent failure of hardware components, WASH-1400 employed the term common mode failure, defined as:
Multiple failures which are dependent, thereby causing the joint failure probability to increase. The multiple failures are common mode or dependent because they result from a single initiating cause, where "cause" is used in its broadest context.
7.) Per the ASME/ANS PRA standard, CCF is defined as:
common cause failure (CCF): a failure of two or more components during a short period of time as a result of a single shared cause.
This definition brings in the concept of a short time period, which is only implied in the WASH-1400 definition. Note that the term was changed from common mode to common cause because the cause is the key to defining failures that occur in the same short time interval; failures can share a mode, but if they occur at different times they are not common cause failures.
WASH-1400 elaborates on what can constitute a cause of dependent failure, noting that a cause can be one of a number of possibilities: a common property, a common process, a common environment, or a common external event. While WASH-1400 used the term common mode failure to encompass all types of dependence, later PRAs categorized dependent failures largely based on how they were treated in PRA models. The PRA Procedures Guide (U.S. Nuclear Regulatory Commission, 1983) describes nine different types of dependent failure, summarizing these in three categories: common cause initiating events (now called external hazards), inter-system, and inter-component dependencies. External hazards (e.g., earthquake) produce dependent failures of equipment through spatial interactions; such dependencies are treated through special analysis techniques. An example of inter-system dependence, which is generally captured in the PRA fault trees and event trees, is dependence of front-line systems on shared support systems. These can be thought of as hard-wired dependencies that are a result of system design. Inter-component dependencies, which are not captured explicitly in the PRA models, span a wide range, and may include common design, manufacture, testing, maintenance, environment, and many others. The PRA Procedures Guide (U.S. Nuclear Regulatory Commission, 1983) referred to this last dependence category as common cause failure (CCF), and U.S. PRAs since that time have followed this convention, defining CCF as the failure of multiple redundant components, within the mission time window of the PRA, as a result of a shared cause. However, as noted by (Bedford & Cooke, 2001),
dependent failure treatment in PRA remains an issue around which much confusion and misleading terminology exist.
Part of the confusion, particularly with respect to ECA, stems from a focus on the manner in which the cause is manifested at a piece-part level, that is, the failure mechanism.
8.) Failure at the piece-part level is not the same as a failure mechanism. The confusion arises from using CCF models for purposes for which they were not intended.
As discussed above, such a focus is counter to existing SDP guidance in (U.S. Nuclear Regulatory Commission, 2006). For example, if the shared cause of failure is a deficiency in a maintenance process, the manner in which the deficiency is manifested across components may vary. In other words, two components could fail in the same mode due to CCF, with the shared cause being the deficiency in the maintenance process, but the failure mechanism at the subcomponent or piece-part level might not be the same. CCF does not require that the failure mechanism be identical, only that the cause of failure is shared. This concept is illustrated graphically in Figure 1.
9.) This discussion does not make clear that the times of the multiple failures must be synchronized. A failure due to poor maintenance practice, together with the observation that the practice is shared by redundant components, does not by itself meet the definition of common cause. Poor maintenance practice could just as easily lead to higher independent failure rates as to increased CCF potential.
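The distinction drawn in this comment can be made concrete with a small numerical sketch. It assumes exponential failure times and a Marshall-Olkin style shared-shock model (the model named in the caption of Figure 2); the failure rate and the 24-hour mission time are hypothetical. Each component has the same total failure rate in both cases, so single-component reliability is identical, yet the probability that both fail within the mission time differs widely depending on whether the rate is independent or shared.

```python
from math import exp


def p_both_fail_independent(rate, mission_time):
    """Both components fail within the mission time, failures independent."""
    p_one = 1.0 - exp(-rate * mission_time)
    return p_one * p_one


def p_both_fail_shared_shock(rate_ind, rate_shock, mission_time):
    """Both fail within the mission time when a shared shock fails both at once.

    Each component fails from its own independent process (rate_ind) or from
    the shared shock process (rate_shock). Uses inclusion-exclusion on the
    survival probabilities.
    """
    survive_one = exp(-(rate_ind + rate_shock) * mission_time)
    survive_both = exp(-(2.0 * rate_ind + rate_shock) * mission_time)
    return 1.0 - 2.0 * survive_one + survive_both


# Hypothetical values, for illustration only.
TOTAL_RATE = 1.0e-3   # per-hour failure rate seen by each component
MISSION_TIME = 24.0   # hours

# Case 1: the whole rate is independent (elevated, but unsynchronized).
independent = p_both_fail_independent(TOTAL_RATE, MISSION_TIME)

# Case 2: half of the same total rate comes from a shared shock.
shared = p_both_fail_shared_shock(TOTAL_RATE / 2, TOTAL_RATE / 2, MISSION_TIME)

print(f"independent only: {independent:.3e}")
print(f"half shared:      {shared:.3e}")
print(f"ratio:            {shared / independent:.1f}")
```

Which case applies to a given deficiency is exactly the question that evidence about failure timing would have to answer.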
10.) In Section 1.1, PRA Treatment of Dependent Failure, the argument is made that once the definition of the performance deficiency is elevated to a broader scope, this is sufficient basis for expanding the existing common cause grouping to include any number of diverse components, because they can be shown to be encompassed by the all-inclusive definition. Deficiency descriptions at this level create a condition in which almost any failure that ever occurred could be considered part of the group, because the overgeneralized cause statement cannot be proven incorrect. Consequently, the focus shifts away from the actual component failures and their direct causes. Attempts to overgeneralize these conditions to estimate the risk of organizational weakness have not been the purview of PRA modeling and should not be.
The PRA model focus has been, and should continue to be, on maintaining the reliability of components credited in mitigating analyzed events. At no point has there been any discussion of the need to develop a basis for connecting these events under one common theme. While it is appropriate to characterize events similar to the examples provided as poor maintenance processes for the purpose of aggregating against the ROP cornerstones or the broadly defined QA areas, it is not necessarily true that all elements of the maintenance processes have failed, or failed to the same degree.
Also, the failure to correctly implement a procedural requirement one time does not guarantee failure on the next occurrence. This must be demonstrated by providing evidence that the procedural requirement is routinely violated and that similar evidence exists in the implementation of other procedures as well. Even where additional examples exist, any suspected increase in risk should be restricted to the cases for which evidence is provided. Otherwise, generalized statements of performance deficiency result, as was done in this document, in an overall indictment of an entire process that is not supported by factual information. This is the very issue that the SDP process sought to avoid because of the likely gross overestimation of the risk significance.
Poor Maintenance Process is the observed performance deficiency.
A2 is the observed failure mechanism of a specific sub-component due to the observed performance deficiency.
B2 is the same failure mechanism of the same sub-component of component B.
A1, A3, A4, ... and B1, B3, B4, ... are the other failure mechanisms that may be caused by the same performance deficiency.
11.) This model would apply equally well to maintenance causing increased independent failure rates or increased common cause potential. The model should include the time element.
Figure 1 Illustration of difference between cause of failure and failure mechanism
1.2 ECA Philosophy Regarding CCF
CCF is included in the PRA because analysts have long recognized that many factors, such as the poor maintenance process in the previous example, which are not modeled explicitly in the PRA, can defeat redundancy or diversity and make failures of multiple similar components more likely than would be the case if these factors were absent. The effect of these factors on risk can be significant. For practical reasons related to data availability, the PRA community has estimated the CCF probability of similar components using ratios of failure counts collected at the component level, without regard to failure cause. Since CCF probabilities are thus based on composite parameters from a cause perspective, the baseline risk estimate of PRAs is felt to be correct in an average sense. While use of conditional CCF probability in ECA can be imprecise because of lack of specificity at the cause level, this document will go on to show that it is important to consider (rather than ignore) this conditional probability, and suggest ways that the approach can be improved.
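The phrase "ratios of failure counts ... without regard to failure cause" can be illustrated with a minimal sketch of alpha-factor point estimation. It assumes the usual maximum likelihood form, in which each alpha factor is the fraction of failure events that involved exactly k components. The event list and its cause labels are hypothetical; the point is that the cause label is carried in the data but never used by the estimator.

```python
from collections import Counter


def alpha_factor_mle(events, group_size):
    """Point estimates alpha_k = n_k / sum_j n_j for k = 1..group_size.

    `events` is a list of (components_failed, cause) pairs. The cause label
    is deliberately ignored, so events from all causes are pooled.
    """
    counts = Counter(k for k, _cause in events)
    total = sum(counts[k] for k in range(1, group_size + 1))
    if total == 0:
        raise ValueError("no failure events to estimate from")
    return {k: counts[k] / total for k in range(1, group_size + 1)}


# Hypothetical event data for a two-component group (illustrative only).
EVENTS = (
    [(1, "maintenance")] * 12
    + [(1, "environment")] * 7
    + [(2, "maintenance")] * 1
)

alphas = alpha_factor_mle(EVENTS, group_size=2)
print(alphas)
```

Because the estimate pools all causes, the same alpha factors would be applied whether the observed failure came from maintenance, environment, or anything else, which is the composite-parameter limitation the paragraph describes.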
12.) An equally or more important reason is that causes are too numerous to mention and difficult to codify.
The causes described here are general cause categories and are not defined sufficiently to determine the type of cause.
Often, factors such as poor maintenance processes are part of the environment in which the components are embedded, and are not intrinsic properties of the components themselves. The conditioning of CCF probability on observed failures in ECA allows the PRA to provide an approximate insight as to the risk significance of these implicit environmental or organizational factors; CCF is the principal means (human reliability analysis being the other) by which current PRAs can assess the impact of organizational factors on risk, however approximate the assessment may be. As was mentioned above, CCF is modeled parametrically (i.e., the treatment is statistical rather than explicit) in PRA, using what can be considered to be a consensus model, as defined in (Drouin, Parry, Lehner, Martinez-Guridi, LaChance, & Wheeler, 2009). In this consensus model, the CCF parameter values are estimated from a combination of past events, which had a variety of causes. Thus, the CCF parameter values are not specific to a single cause, such as a poor maintenance process. As a result, the conditional CCF probability, which is a function of the baseline CCF parameter values, might be either conservative or nonconservative, depending on the specific situation being modeled. In addition, while some causal factors, such as poor maintenance processes, can impact multiple systems, the state of the practice in U.S. PRA does not include models of intersystem CCF, only CCF within a redundant group of components in a single system. From this perspective, the conditional CCF probability could be nonconservative.
[Figure 1 diagram labels: Performance Deficiency (Poor Maintenance Process); Component A Failure and Component B Failure; Sub-Component and Failure Mechanism A1, A2, A3, A4 and B1, B2, B3, B4]
13.) The idea of a conditional CCF probability is not carefully defined. Conditional CCF probability is not related to MGL or ALPHA factors. ALPHA factors are correlative and should not be used as surrogate conditional probability values. Moreover, as cited in this document ALPHA factors can be conservative or nonconservative. So if ALPHA factors are applied as surrogate conditional probabilities, the conclusion is unclear.
1.3 Definitions and Discussion
Because of the confusion in terminology mentioned by (Bedford & Cooke, 2001), we will define some necessary terms as clearly as possible. These definitions are in general accord with wider PRA usage.
Dependent failure: The joint probability of two or more components failing is not equal to the product of the individual probabilities of failure when the failures are dependent. For hardware, the joint probability of failure is typically larger in the case of dependence than the product of the individual probabilities, and this is the reason for concern with dependent failure in the PRA. Note: to be of concern in the calculation of risk, multiple failures have to occur within the mission time window; however, dependent failures do not have to be simultaneous.
14.) Stating that dependent failures must occur within the mission time window is somewhat vague. There are situations where the mission time might be long (months). If the mission time is short and the independent failure rate is high, multiple failures in short time intervals are not necessarily CCF.
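The point made in comment 14 can be illustrated numerically. The sketch below, in Python, assumes a constant (exponential) failure rate for each component and full independence between the two components. The function names and the example values are hypothetical and are not taken from the Palisades PRA or the draft NUREG.

```python
import math


def p_fail_within(rate_per_hour: float, mission_hours: float) -> float:
    """Probability that one component fails within the mission time,
    assuming a constant failure rate (exponential failure times)."""
    return 1.0 - math.exp(-rate_per_hour * mission_hours)


def p_both_fail_independent(rate_per_hour: float, mission_hours: float) -> float:
    """Probability that two identical, fully independent components both
    fail within the mission time. No shared cause is assumed."""
    p = p_fail_within(rate_per_hour, mission_hours)
    return p * p


if __name__ == "__main__":
    # Hypothetical failure rate, compared over a 24-hour and a 30-day window.
    for hours in (24.0, 720.0):
        print(hours, p_both_fail_independent(1.0e-3, hours))
```

Under these assumptions, a longer mission time or a higher failure rate raises the probability that both components fail within the window by chance alone, so failures that are close together in time are not by themselves evidence of a shared cause.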
Common cause failure: When two or more components fail within the PRA mission time window as a result of a shared cause. The failure mechanisms do not have to be shared. In other words, the subcomponent or piece part that fails does not have to be the same; it is the cause of failure that is shared.
Various stochastic models of CCF have been developed over several decades. A few of these, such as the Marshall-Olkin model (Marshall & Olkin, 1967) and the binomial failure rate adaptation of that model (Vesely, 1977), are shock models, meaning that dependent failures are due to shocks that affect multiple components simultaneously. The common cause shocks are randomly distributed in time in these models.
In such models, the failure times of components affected by common cause shocks are simultaneous, as shown in Figure 2, taken from (Kelly D. L., 2007). Thus, shock models cannot, without modification, represent more general causes of dependent failure that do not lead necessarily to simultaneous failures.
Figure 2 Plot of 500 dependent failure times for two components from Marshall-Olkin shock model, showing simultaneous failure times (those along the line t2 = t1) caused by shared shocks occurring randomly in time, from (Kelly D. L., 2007)
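The simultaneity produced by a shock model can be reproduced with a small simulation. The following Python sketch assumes the bivariate Marshall-Olkin construction, in which each component fails at the earlier of its own independent shock and a shared shock, all exponentially distributed. The rates, sample size, seed, and function names are illustrative assumptions and do not reproduce the data behind Figure 2.

```python
import random


def simulate_marshall_olkin(n, lam1, lam2, lam12, seed=0):
    """Draw n pairs of failure times (t1, t2) from a bivariate
    Marshall-Olkin shock model with individual shock rates lam1 and lam2
    and shared shock rate lam12 (all values hypothetical)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x1 = rng.expovariate(lam1)    # shock affecting component 1 only
        x2 = rng.expovariate(lam2)    # shock affecting component 2 only
        x12 = rng.expovariate(lam12)  # shared shock affecting both
        pairs.append((min(x1, x12), min(x2, x12)))
    return pairs


def fraction_simultaneous(pairs):
    """Fraction of pairs in which both components fail at the same instant,
    which happens when the shared shock arrives before either individual shock."""
    return sum(1 for t1, t2 in pairs if t1 == t2) / len(pairs)


if __name__ == "__main__":
    sample = simulate_marshall_olkin(500, 1.0, 1.0, 1.0, seed=1)
    print(fraction_simultaneous(sample))
```

Under these assumptions the expected fraction of simultaneous failures is lam12 / (lam1 + lam2 + lam12); the remaining pairs fail at different times.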
However, the most commonly used PRA models of CCF (e.g., alpha-factor model) are not shock models.
Furthermore, the requirement that failures be simultaneous in order to count as CCF is overly restrictive; a shared cause such as poor maintenance might generally cause the failure times of affected components to be positively correlated, without giving rise to exactly simultaneous failures, as shown in Figure 3, taken from (Kelly D. L., 2007).
Figure 3 Scatterplot of 100 dependent failure times for two components, showing positive correlation of failure times, but without the simultaneous failures that would be produced from a shock model, taken from (Kelly D. L., 2007)
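A shared cause that raises failure rates without acting as a shock can be sketched in a similar way. The Python example below assumes, purely for illustration, that both components sit in a common environment that is either nominal or poor (for example, poor maintenance), and that, given the environment, the two failure times are independent exponentials. This is not the model used to generate Figure 3; the rates, probabilities, and function names are hypothetical.

```python
import random


def simulate_shared_environment(n, base_rate, poor_factor, p_poor, seed=0):
    """Draw n pairs of failure times for two components that share an
    environment. With probability p_poor the environment is poor and both
    failure rates are multiplied by poor_factor (all values hypothetical)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        rate = base_rate * (poor_factor if rng.random() < p_poor else 1.0)
        # Given the shared environment, the two failure times are independent.
        pairs.append((rng.expovariate(rate), rng.expovariate(rate)))
    return pairs


def pearson_correlation(pairs):
    """Sample Pearson correlation of the two failure times."""
    n = len(pairs)
    mean1 = sum(t1 for t1, _ in pairs) / n
    mean2 = sum(t2 for _, t2 in pairs) / n
    cov = sum((t1 - mean1) * (t2 - mean2) for t1, t2 in pairs)
    var1 = sum((t1 - mean1) ** 2 for t1, _ in pairs)
    var2 = sum((t2 - mean2) ** 2 for _, t2 in pairs)
    return cov / (var1 * var2) ** 0.5


if __name__ == "__main__":
    sample = simulate_shared_environment(100, 1.0, 10.0, 0.5, seed=2)
    print(pearson_correlation(sample))
```

In this sketch the failure times are positively correlated because both components see the same environment, yet exactly simultaneous failures essentially never occur.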
Common-cause component group (CCCG): This is defined in (Mosleh, Fleming, Parry, Paula, Worledge, & Rasmuson, 1988) as "A group of (usually similar) components that are considered to have a high potential of failing due to the same cause." We would revise this definition slightly to say simply that the components share a potential for failing due to the same cause; the potential for failure does not need to be high. In fact, typical CCF probabilities are quite low; in a CCCG of size two, if we have observed a failure of one component, the conditional probability that the second component fails due to the same cause is < 0.05.
Assignment of components to CCCGs is part of the qualitative analysis of CCF done for the PRA. A description of the qualitative analysis can be found in (Mosleh, Fleming, Parry, Paula, Worledge, & Rasmuson, 1988). If components have been placed into a CCCG as part of the PRA model development, one can assume that a potential for dependent failure exists among the components. As a result, if a component in a CCCG fails as a result of a performance deficiency, potential for CCF exists with other components in the CCCG unless the boundaries of the CCCG can be shown to be inaccurate.
Common mode failure: Term used by WASH-1400 to describe dependent failures of all kinds. This term was later applied to a variety of different conditions, and this has led to confusion. One usage of the term that is heard occasionally today, which we do not encourage, is for failures caused by a shared component, or by a latent human error, where this dependence is not modeled explicitly in the PRA.
Because the NRC Standardized Plant Analysis Risk (SPAR) models contain less detail than a typical licensee PRA and the primary function of the NRC CCF database is to support the SPAR models with CCF parameter estimates, such events are classified as common cause failures in the NRC CCF database.
This is due to the boundary definitions of PRA components developed for the NRC CCF database and the fact that few latent human errors are included explicitly in the SPAR models. For more information on the NRC CCF database, see (Wierman et al., 2007).
Independent failure: In one sense, an independent failure is just a failure whose probability is not influenced in any way by other failures or successes that may have occurred. Thus, the joint probability of failure of two or more components can be written as the product of the individual (more formally, marginal) failure probabilities; this is the mathematical definition of stochastic independence. However, independent failure is a term that is sometimes applied in a more specialized sense when estimating parameters of CCF models, such as the alpha-factor model. The parameter estimates of these models are based fundamentally on observed (and inferred) failure counts. For example, a commonly used estimate in the alpha-factor model is
$$\alpha_k = \frac{n_k}{\sum_{j=1}^{m} n_j}$$
In this equation, $n_k$ is the number of events involving the failure of k redundant components in a CCCG of size m. For k = 1, the events involve only one component; such events are referred to as independent counts. If the presence of a shared failure cause guaranteed failure of all components sharing that cause, the term would be apt, as observed single failures would by necessity imply the absence of a shared cause. However, this is not the case in reality; a shared cause might only increase the joint probability of multiple failures (per the definition of dependent failure above). Hence, a more appropriate term for $n_1$ might be individual failures.
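A count-based estimate of this kind can be sketched in a few lines of Python. The sketch assumes the simple ratio form, in which each alpha factor is the number of events involving k failed components divided by the total number of events in the CCCG; the function name and the event counts are hypothetical and are not taken from the NRC CCF database.

```python
def alpha_factor_estimates(counts):
    """Return point estimates [alpha_1, ..., alpha_m] from event counts.

    counts[k - 1] is n_k, the number of events involving failure of exactly
    k components in a CCCG of size m = len(counts). Assumes the simple ratio
    estimate alpha_k = n_k / (n_1 + ... + n_m).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one failure event is required")
    return [n_k / total for n_k in counts]


if __name__ == "__main__":
    # Hypothetical counts for a CCCG of size two:
    # 95 single-component events and 5 events involving both components.
    print(alpha_factor_estimates([95, 5]))
```

Because the counts are pooled over all failure causes, the resulting estimates are not specific to any single cause such as a poor maintenance process.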
Failure memory approach: The aim of using PRA for ECA is to assess probabilistically what else could have happened in the event, but which did not happen, that would have resulted in core damage. So failures are remembered and successes are forgotten. Thus, observed failures are mapped into the PRA model, but successes are treated probabilistically: the analyst accounts for the possibility that equipment that functioned successfully or was not demanded in the actual event might, with some probability, fail to function. Thus, failure probabilities are left at their nominal values or are conditioned as necessary to reflect the details of the event. For more details see App. A to Vol. 1 of the RASP Handbook and (Hulsmans, De Gelder, Asensio, & Gomez, 2001).
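The bookkeeping implied by the failure memory approach can be sketched as follows. This Python example is a minimal illustration under stated assumptions: basic events are held in a dictionary of nominal failure probabilities, observed failures are set to 1.0, and everything else is left at its nominal value rather than set to zero. The function name, the basic event names, and the probabilities are hypothetical, and the sketch does not perform any conditioning of CCF probabilities.

```python
def apply_failure_memory(nominal, observed_failures):
    """Return event-specific basic event probabilities.

    nominal: dict mapping basic event name to nominal failure probability.
    observed_failures: names of basic events observed to fail in the event.
    Observed failures are "remembered" (probability 1.0); successes are
    "forgotten" (left at their nominal probability, not set to zero).
    """
    observed = set(observed_failures)
    unknown = observed - set(nominal)
    if unknown:
        raise KeyError(f"unknown basic events: {sorted(unknown)}")
    return {
        name: (1.0 if name in observed else prob)
        for name, prob in nominal.items()
    }


if __name__ == "__main__":
    # Hypothetical basic events and probabilities.
    nominal = {"PUMP-A-FTR": 2.0e-3, "PUMP-B-FTR": 2.0e-3, "PUMP-C-FTR": 2.0e-3}
    print(apply_failure_memory(nominal, ["PUMP-C-FTR"]))
```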
1.4 ECA Ground Rules for CCF Treatment
This section presents the fundamental approach for treating CCF of redundant components in an ECA risk calculation, when one or more of the redundant components are not available due to a deficiency in licensee performance. In past analyses, assessments of CCF were made in varying ways due to the lack of clear guidance for making the assessment. This section attempts to eliminate or at least significantly reduce this variability by presenting clear and simple guidelines for the assessment, which are consistent with what CCF represents in terms of dependent failure, as described above. We also discuss some deviations from the guidelines, which address limitations in current PRA modeling. A more detailed discussion can be found in Sec. 2.
There are three basic ground rules for treatment of CCF in ECA. We list these rules, along with some discussion. Some examples are provided later in this Section.
(1) The shared cause is the deficiency identified in the Inspection Report, which led to the observed equipment failure. For example, if the deficiency were poor quality control, which could affect other redundant components in the CCCG, then a potential for CCF would be judged to exist.
The shared cause of failure at this level is the quality control deficiency. Note that the shared cause should be considered in a broad sense, and not necessarily limited to what was written in the Inspection Report, in order to ensure that the potential for CCF is considered appropriately.
15.) This rule does not address the time element. If redundant components share a deficiency, it does not mean that the deficiency will increase the likelihood of a CCF.
For example, consider the incandescent light bulb. All light bulbs share the same deficiency, which explains why almost 100% of failures occur due to the same failure mechanism: thermal fatigue of the filament. However, 99.999+% of all cases of light bulb failure due to this mechanism are independent failures. Many cases of shared deficiency can be explained by an increased failure rate.
(U. S. Nuclear Regulatory Commission, 2006) has the following to say regarding a performance deficiency: ...it is important to recognize that discernable (sic) risk increases come from degraded plant conditions, both material and procedure/process in nature [emphasis added], and that the performance deficiency should most often be identified as the proximate cause of this degradation. In other words, the performance deficiency is not the degraded condition itself, it is the proximate cause of the degraded condition. This determination of cause does not need to be based on a rigorous root-cause evaluation (which might require a licensee months to complete), but rather on a reasonable assessment and judgement of the staff.
16.) The guidance states that the performance deficiency is not the degraded condition itself but is the proximate cause of the degraded condition. Note that degraded condition has crept into the guidance. PRAs do not typically analyze the impact of degraded conditions. More importantly, the guidance states that the determination of cause does not need to be based on rigorous root cause evaluation but can be based on reasonable assessment and judgment of the staff. Given the possible implications of the findings associated with the performance deficiency, a statement that rigorous evaluation is not required is not consistent with the potential consequences of such a judgment.
Emphasis was placed on degradations in procedures and processes in this quote to help discourage overly narrow descriptions of a performance deficiency, focusing on specific subcomponents or piece parts and thus diluting the larger impact of such a deficiency on risk. It is preferable to state that a performance deficiency was "poor maintenance practices" rather than "poor maintenance practices associated with installing bearings in the correct orientation." Other examples: "failure to correct a condition adverse to quality" instead of "failure to correct a condition adverse to quality associated with lube oil systems"; "the licensee failed to develop and implement scheduled preventive maintenance, as required by Technical Specifications" instead of "the licensee failed to develop and implement scheduled preventive maintenance, as required by Technical Specifications for Agastat E7000 series time delay relays in the emergency diesel generator (EDG) 2B protective logic"; "failure to identify the cause of a significant condition adverse to quality" instead of "failure to identify corrosion on the turbine-driven auxiliary feedwater pump governor control valve stem."
The effect of this ground rule is that the analyst will use the conditional probability of CCF, given the observed component failure. In applying this ground rule, we are treating the shared cause probabilistically. This treatment is consistent with how other events are handled in ECA, relying on the failure memory approach. For example, if we observe failure of one component in a CCCG of size two, we know that the other component did not fail during the event, but still we do not set its failure probability to zero, as it might have failed; likewise, unless conditions exist that would prevent the cause (e.g., good maintenance practice) of failure from being shared, a potential for CCF is judged to be present.
17.) The guidance states that given the failure of one component in a common cause group, the analyst will use the conditional probability of CCF, given the observed component failure. It is recognized that the fact that one or more additional failures did not occur during an event is not a guarantee that additional common cause failures could not have occurred. This guidance precludes any consideration of facts that could discount or substantially reduce the probability of common cause failure.
In addition, consideration of possible random failure of components that were known to be successful during the event response is not considered the same as arriving at the conclusion that the conditions necessary for a common cause failure of multiple components are present.
One can question whether this treatment gives a best estimate of risk or a conservatively large estimate. Continuing with the example of a poor maintenance practice, which might be shared among components but which resulted in the failure of only one component in the event, failing to consider the potential (i.e., probabilistic) impact of observed poor maintenance on the other components in the CCCG by not calculating a conditional CCF probability is the same mistake as setting the failure probabilities of the other components to zero, on the basis that they did not fail during the event. Doing so would be inconsistent with the failure memory approach employed in ECA. Likewise, if it was only chance that prevented multiple components from being impacted by the common maintenance practice, then declaring the observed failure to have no potential for shared common cause would not give a best estimate of risk; it would give a nonconservatively low estimate because the impact of the poor maintenance practice (i.e., its potential to defeat redundancy) is not properly reflected in the risk model.
The estimate provided under the assumption of an observed independent failure gives full probabilistic credit to the remaining redundant components, which might not be warranted.
(2) For ECA, arguments about the time window are irrelevant, and essentially go against the failure memory concept. Simply testing the redundant components cannot provide proof that multiple dependent failures would not occur within the mission time window, as the failure memory concept does not allow credit for successful operation, and there is no guarantee that multiple components could not fail during the mission time window. To put it another way, chance alone cannot be relied upon to eliminate CCF potential from an ECA, based on failure timing. Of course, for data analysis purposes, CCF failures are of concern when they occur during the mission time of the PRA, which for internal hazard groups is generally 24 hours. This is the time window used in estimating parameters in CCF models, in which observed failure counts are used.
18.) This is counter to the ASME/ANS standard definition of CCF. The time element is key to what makes a failure common cause.
19.) The guidance regarding the impact of the time window for common cause failure or chance conditions states that consideration of the time window for common cause failure is irrelevant and contrary to the failure memory concept.
Further, the guidance states that simply testing redundant components cannot provide proof that multiple dependent failures would not occur within the mission time window.
However, it has been long-standing practice that, upon discovery of a failed risk-significant component, the redundant component(s) be immediately tested to verify that the failure is not present in those components. Implicit in this evaluation is an assumption that the tested components are available for the mission time. If we are to accept the premise that redundant components cannot be proven to NOT be subject to common cause failure during the mission time, then it is unclear why plants are not required to shut down immediately upon discovery of a failure of a risk-significant component whose cause can be characterized as one that can result in common cause failure.
Further, the guidance discounts the benefit of staggered testing which is a planned evolution based on the premise that such testing provides for early detection of conditions and implementation of corrective actions to minimize the potential for common cause failure. Again the impact of staggered testing is left to the judgment of the analyst to decide whether common cause failures could have occurred during the PRA mission time.
The potential for multiple dependent failures within the mission time window is taken into account in the NRC CCF data collection effort, via a timing factor. This data collection effort examines past events involving failures of components in a SPAR CCCG. Because most failures are discovered during testing, and much testing is done on a staggered basis, multiple dependent failures may be separated in calendar time by weeks or months, a period of time much longer than the PRA mission time. In these cases, judgment is applied as to whether such failures would have occurred during the mission time window, had the failures been the result of an actual PRA demand.
20.) For normally operating systems, judgments about time windows can be made when the failures are self announced.
In the data collection effort, the judgment about timing produces a conditional probability that the failures would have occurred within the mission time window, given that the failures were the result of an actual PRA demand. In contrast, ECA is usually focused on a single failure, and is concerned with the probability that multiple dependent failures during a recurrence of this event could fall within the PRA mission time window.
(3) Credit for programmatic actions to mitigate CCF potential (staggering equipment modifications, etc.) should be applied qualitatively during the enforcement process and not incorporated into the numerical risk result. In other words, strong defenses against coupling factors can mitigate CCF potential, but such mitigation is to be addressed qualitatively during the enforcement process rather than quantitatively in the ECA. Qualitative consideration of such factors might allow, for example, a low White finding to be changed to Green.
1.4.1 Deviations from Ground Rules
Because typical PRAs do not model components to the piece-part level, it is possible that some failure causes cannot be shared among components that are redundant from the perspective of the PRA model.
In other words, from a high level perspective such as component type and function, components may be placed into the same CCCG in the PRA, but there may be differences at a lower level that are important to take into consideration. Conditioning CCF probability on the assumption of a potential shared cause in this case could produce an unnecessarily conservative estimate of risk. As an example, consider the failure of the EDG 1A feeder breaker described in Calvert Cliffs Inspection Report (IR) 2006012. While the NRC SPAR model for Calvert Cliffs places the EDGs in the same CCCG, EDG 1A, being air-cooled, is a unique design from the other EDGs, and employs radiator cooling fans, which contribute to the inrush current. The other EDGs are water-cooled and have no corresponding short-time overcurrent trip on the breakers feeding the auxiliary MCCs. Thus, depending on the specifics of the ECA, the analyst might need to treat EDG 1A separately from the other EDGs. However, caution should be exercised in revising CCCG boundaries, because typical performance deficiencies, which reflect organizational problems, such as poor maintenance, can couple the EDGs despite the design differences.
21.) The guidance recognizes the potential for over-estimating the risk significance of common cause failure by applying a particular failure mode to a common cause group in a PRA model where the failure does not apply to all components within the common cause group. It is not unusual for common cause groupings to be present in a PRA model for a limited set of failure mechanisms that apply to the group, but a full set of failure mechanisms typical of the component type may not be applicable to all components within the group.
However, the guidance cautions against alteration of common cause group boundaries to accommodate these design differences as the characterization of the performance deficiency as a broader based problem can couple the components despite any design differences that could preclude common cause failure of components within the group.
The Palisades model has separate common cause groupings where design differences come into play.
A second category where the ground rules may not strictly apply is also related to the level of detail in the PRA model. In this case, the licensee PRA may have explicit treatment of some dependencies that are treated implicitly via CCF in the associated NRC SPAR models. Two examples are shared equipment and latent (pre-initiator) human failure events (HFE). For example, the deficiency might apply to a power supply control processor that is shared among all steam generator power-operated relief valves (PORV), where this dependency is modeled explicitly in the fault trees of the licensee's PRA, but is not included in the fault trees of the associated SPAR model. In this case, an event in which a failure of the shared control processor led to multiple dependent PORV failures would be captured in the NRC CCF data collection effort, and would contribute to alpha-factor estimates used to calculate CCF probability of the PORVs in the SPAR model. Rather than calculating a conditional CCF probability for the remaining PORVs, an alternative treatment of a deficiency involving the shared control processor might be to modify the SPAR fault trees to capture this dependency explicitly, allowing better comparison to the licensee PRA. The other generic example related to level of detail is latent HFEs (e.g., miscalibrations).
These are not treated comprehensively in the SPAR models, and again are captured indirectly by including such events in the alpha-factor estimates.
1.5 CCF Examples
We include in this chapter some selected examples of CCF that have resulted in an ECA. Note that none of these events are coded as CCF events in the current version of the NRC CCF database (see footnote a). This is because each event involved failure of only a single component. Failures of single components are added to the CCF database only infrequently, when there is judged to be incipient failure of at least one other component in the CCCG. Such events are assigned a fractional count in estimating the alpha factors for the associated CCCG.
22.) The guidance here suggests that events described in the examples provided are not currently considered common cause failure in the current version of the NRC CCF database and that they will be added in a future update. This represents another example of the implication of deficiencies in other processes that are theorized to underestimate the actual CCF parameters.
Moreover, the guidance suggests that analyses of events that involve these deficiencies will be within the SDP process via implementation of this (NUREG) process without having first addressed the issues in the underlying processes (ASME/ANS).
Hatch EDG Coupling Failure (EA-09-054, SIR 2008008)
Cracking in the engine-to-generator flexible coupling for EDG 1B caused severe vibration during a 24-hour load run. The EDG was secured and declared to be inoperable following troubleshooting.
Similar cracking was found in the flexible couplings of the other EDGs, and as a result EDG 1C was also declared inoperable. An example of the cracking is shown in Figure 4. Such cracking had been first observed by the utility in 1988, 20 years before the event. At that time, the cracking was not viewed as being indicative of coupling degradation. In fact, the observations of cracking were not even documented. No consideration was given to industry experience with cracking of EDG flexible couplings, and no condition report was written for the cracking observed in 1988. Taking all of this into account, the performance deficiency in the IR for the 2008 event was against 10 CFR 50 App. B, Criterion XVI, which requires that measures be established to promptly identify and correct conditions adverse to quality.
The utility determined the root cause of the cracking to be age-related hardening of the rubber in the flexible coupling between the engine and the generator. This cause was shared with the couplings in the other EDGs because they were similar in age, manufacture, operational and environmental conditions, and were subject to the same maintenance and testing program. Despite the shared cause, an argument for not treating this as a CCF was presented. This argument was based on a claimed low likelihood that multiple couplings would fail within the PRA mission time window for EDG operation. The low likelihood portion of the argument was based in turn on predicted failure times using a regression model developed from ex situ testing of the EDG couplings, and on the lower cumulative run times of the other EDGs.
23.) There is not enough information in this example to properly evaluate. Were the failures observed here used to revise the failure rate estimate? Evaluating the CCF parameter must also include a look at the failure rate as it is used before judging the adequacy of the model.
There are numerous problems with this argument, but regardless of these problems the argument is counter to the guidance provided above because the potential for CCF exists at the level of the identified performance deficiency. Furthermore, the failure memory approach does not credit successful tests. Thus, an ECA of this event should use conditional probabilities of CCF for the remaining EDGs, given the observed failure to run.
a As a result of discussion during a March 2011 meeting, these events will be included in future revisions of the database.
Figure 4 Circumferential crack in EDG flexible coupling
Dresden EDG Strainer Plug Failure (IR 2009005)
EDG 2/3 was 25 minutes into a monthly surveillance run in June 2009 when an oil leak on the turbo lube oil system Y-strainer end cap required EDG shutdown. The leak is shown in Figure 5. This was caused by failure to replace a plastic shipping plug on the recently installed strainer with a metal end cap (see Figure 6). The utility had not ordered the metal cap, and receipt inspection and subsequent maintenance activities failed to identify and correct the problem. The strainer on EDG 2/3 had been replaced in 2008 because of wear on the strainer blowdown caps, which may have been caused by the use of improper tools. The root causes of the failure were judged to be failure to order proper parts and failure to detect the problem during receipt inspection and subsequent maintenance and testing (i.e., a material control deficiency).
Inspection found that the other four EDGs had metal end caps installed on their Y-strainers, which had not been replaced, and this was the basis for an argument that this constituted an independent failure of EDG 2/3. This argument conflates the cause of failure (inadequate material control, and not just of lube oil strainers) with the manifestation of the cause (failed plastic shipping plug). As discussed above, CCF does not require failure of identical piece parts, only that the cause of failure be shared, which it was in this case. A material control deficiency can affect multiple items in EDGs and other important plant components, increasing the joint failure probability of these components.
In this example, the material control deficiency was discovered because of the strainer failure. It is possible that other important components in the plant were impacted by this deficiency, but focusing too much on the manifestation of the cause at the piece-part level can underestimate the risk significance of the deficiency.
24.) In the discussion of the Dresden event, the argument by the licensee is discredited based on the elevation of the performance deficiency description to inadequate material control. This description allows for the arbitrary inclusion of a broader scope of components that results in substantially higher risk significance. At issue in this example is whether any factual information exists to support a conclusion that the broader group of components was subject to an elevated level of risk from specific performance deficiencies that could impact their performance.
The original deficiency was appropriately characterized as a failure to order proper parts and failure to detect the problem during receipt inspection (i.e., a material control deficiency in the broader context) with respect to one diesel generator. However, all other diesel generators had the appropriate part installed and no other material control deficiencies were identified with respect to any of the other diesel generators. In fact, the example states that, because of the elevated description of the deficiency, all other diesel generators become suspect and are assigned an increased probability of failure as a group.
The example argues that, because of the elevated description, the issue is now about any other possible failure mechanisms that could result from inadequate material control, not just lube oil strainers (the original issue), without any evidence that material control deficiencies currently exist that could result in failure of the remaining diesel generators. There is a certain level of guilt by association and implication in this approach.
Figure 5 Lube oil leak from Y-strainer end cap on Dresden EDG 2/3
Figure 6 Failed plastic end cap on Dresden EDG 2/3 lube oil strainer
Farley Residual Heat Removal (RHR) Motor-Operated Valve (MOV) Failure (EA-07-173)
An encapsulated MOV in the suction line from the containment sump to RHR pump 2A failed to stroke fully open on two occasions in 2006 and 2007. The specific root cause of failure could not be identified clearly, but the performance deficiency was failure to promptly identify and correct conditions adverse to quality (10 CFR 50, App. B, Criterion XVI). One candidate root cause was corrosion, caused by high humidity in the valve encapsulation. Another was hammerblow forces on the valve torque switch, with this cause not being shared among the redundant valves, because they had different actuator configurations.
At the level of the performance deficiency, which was failure to adequately identify an equipment problem and take appropriate corrective actions, the argument at the piece-part level is no longer relevant. The identified deficiency, which is an organizational problem, can lead to an increased probability of multiple equipment failing. In this specific instance, the problem manifested itself in failure of a single MOV in the RHR system, but from a probabilistic perspective the problem could have been manifested in multiple dependent failures, so the risk evaluation would use the conditional CCF probability for the other valves in the CCCG.
Calvert Cliffs EDG Failure (IR 2006012)
The feeder breaker to EDG 1A tripped in 2006 due to a low design set point. The performance deficiency was identified as an inadequate vendor design basis for short-time inrush current on the feeder breaker to EDG 1A auxiliary motor control centers (MCC). The EDG arrangement at Calvert Cliffs is somewhat unique in that EDG 1A is a very different design than the other three. EDG 1A, being air-cooled, uses radiator cooling fans, whereas the other three EDGs are water-cooled. The radiator fans on EDG 1A contribute to the short-time inrush current; the other EDGs do not even have a short-time overcurrent trip on the feeder breakers to their auxiliary MCCs.
However, despite these differences, the defined EDG common cause component group in the Calvert Cliffs PRA includes both the single air-cooled diesel and the three water-cooled diesels. Although the specific failure manifestation of the performance deficiency was associated with a unique feature of the EDG 1A, all four EDGs share numerous common attributes that can introduce the potential for common cause failure, including EDG support systems other than cooling and maintenance and testing practices. For this reason, the performance deficiency should be viewed in the broader sense of a failure of the licensee to adequately verify design basis information supplied by a vendor, rather than in the more narrow sense of only being associated with breaker current trip settings. In this case, the specific regulatory violation was cited against 10 CFR 50, Appendix B, Criterion III for inadequate design control, a deficiency that has potential to propagate to other EDG trains.
Therefore, this event had the potential to affect other trains within the common cause component group and the risk assessment should include consideration of CCF potential.
25.) In the Calvert Cliffs example, a diesel generator experienced a failure due to a design feature unique to that diesel. This diesel incorporates a fan cooled radiator for engine cooling. The other diesel engines are water cooled and do not experience the additional in-rush current from the radiator fans on diesel start. This design feature represents a diversity of design that offsets at least some level of contribution from common cause failure as a consequence of a support system failure.
However, the guidance establishes the position that, while the particular mechanism is NOT shared among the component group, it is sufficient to argue that other failure mechanisms exist between components within the group, and that using this uncorrelated failure as a basis to elevate the risk impact of common cause failure of the group is therefore justifiable. If this approach holds, then any benefit from diversity of design is negated. This type of argument represents a self-fulfilling prophecy. One can always argue that, while a particular failure mechanism is not shared within the common cause group, the presence of other failure mechanisms that could be considered shared is an appropriate basis for increasing the common cause probability of the group as a consequence of the existence of this unshared failure mechanism.
Comanche Peak EDG Failure Due to Paint (IR 2007008, EA-08-028)
EDG 1-02 failed to start during a monthly surveillance test in 2007. Troubleshooting revealed two fuel rack linkage/metering rods bound by paint. The painting had occurred immediately after the last successful surveillance test. The performance deficiency was failure to adequately implement maintenance procedures, which require post-painting verification of equipment functionality. One might argue that this was a random independent failure because the redundant EDG had not been painted. However, this appeared to be pure chance; because of the procedural breakdown (the procedure is a CCF defense mechanism), the other EDG could easily have been painted before finding the problem on EDG 1-02. Preventing CCF requires deliberate defenses against coupling factors among redundant components; in this case the defense mechanism was the maintenance procedure, which was not correctly implemented.
26.) In the Comanche Peak example, a diesel generator had been painted immediately after a successful surveillance test and apparently failed the next surveillance test due to a failure to assure that the painting did not impact the functionality of the painted components. The performance deficiency was failure to adequately implement maintenance procedure(s). This approach implements the chance argument. The information provided only identifies a single occurrence of the failure to implement this aspect of the procedure. The chance argument is predicated on the possibility that the other diesel might have been painted between subsequent diesel tests.
Given that painting of diesel components is not considered a routine activity, this appears to be an overestimation of the probability of the second diesel becoming subject to the same condition. While it cannot be argued that this is impossible, the question is how probable it was. However, the process discounts conditions that would support a lower probability of a common cause event. The argument provided also assumes that the isolated occurrence of the failure to implement the procedure represents a condition that guarantees future failure, and that no credit can be considered for the procedure to prevent a second occurrence of the failure.
1.6 Summary of Guidance
In summary, the three ECA ground rules are:
(1) The shared cause is the performance deficiency identified in the Inspection Report.
(2) For ECA, arguments about the time window are irrelevant, and essentially go against the failure memory concept.
(3) Defenses against CCF are to be addressed qualitatively during the enforcement process rather than quantitatively in the ECA.
CCF is the principal means by which PRA can quantify the contribution to risk of crosscutting organizational issues. As noted above, with current CCF models this quantification is an approximation, and might either under- or overestimate the actual contribution in specific situations.
If the performance deficiency is related to issues rooted in the organization, such as procedures or control of maintenance, some level of dependence is likely among affected components (more than the failed component is affected), and the CCF probability for these components in the PRA should be conditioned on the observed failure and the presence of a shared cause of dependent failure.
Looking for differences in how a failure cause is manifested at a subcomponent or piece-part level, in addition to consuming resources, is almost certain to underestimate the risk associated with such a deficiency; differences can always be found at a low enough level. The only case where these general ground rules might not apply would appear to be issues with the CCCG boundary. However, if the shared cause is rooted in an organizational performance deficiency, the ground rules are expected to be generally applicable.
27.) Sections 1.4 - 1.6 of the draft NUREG describe the ECA (Event and Condition Assessment) ground rules for treatment of a component failure in a common cause group. The general approach described is that any component that fails will have some impact on the common cause group failure probability regardless of the timing of the failures or the failure mechanism. This implies that essentially all failures may impact common cause group failure probability.
For example, pump A fails due to cause X, and pump B, a redundant pump, fails 5 years later due to cause Y. Following the draft NUREG guidance, these failures could be treated as a common cause due to any performance deficiency. The shared factor cited could be that the same preventive maintenance was performed, that the pumps have a similar design, that they are in the same room, that they operate at the same temperature, etc.
As a further example, suppose a group of redundant components has been installed in a plant for 30 years and, over that period, there have been 3 failures spaced 10 years apart. This data would typically be used to update the random failure probability for the components, but it is not evidence that warrants an increase in the common cause failure probability simply because the components have the same design characteristics or operating environment.
NUREG/CR-6268 and NUREG/CR-4780 when defining common cause factors state:
The concept of a shared cause of malfunction or change in component state is the key aspect of a CCF event. The use of the word shared implicitly includes the concept of coupling factor or mechanism. In addition, the reference to a time interval between failures acknowledges the reliability significance of these events. Multiple component failures from a shared cause, but without affecting mission requirements, in a period required for performance are of little or no significance from a reliability point of view. It is the correlation of failure times and their simultaneity in reference to the specified mission time that carries their reliability significance. Often when the same cause is acting on multiple components, failure times are also closely correlated.
NUREG/CR-6268 further defines the timing factor for announced failures as within three times the PRA mission time.
There is no discussion in the draft NUREG of when plant specific evidence may be applied to update random failure probability when performing an ECA. The failures in the above examples would be more appropriately treated as an increase in the random failure probability. The components may constitute a common cause group, but if they are unreliable, this will be reflected in their individual random failure probabilities.
2. DETAILED GUIDANCE FOR TREATMENT OF CCF
In this section we expand upon the high-level guidance introduced above. We provide guidance for CCF treatment within SAPHIRE 8 of a number of cases that can be encountered with varying frequency in ECA. We also give more detailed discussion of deviations from this guidance that may be encountered. However, as stated above, such deviations are expected to be infrequent.
2.1 Basic Principles of CCF Treatment in ECA
We begin with the basic principles that underlie the detailed guidance that is to follow.
(1) CCF represents all implicit intercomponent dependencies. To reiterate from earlier, CCF is included in the PRA because many factors, such as poor maintenance practices, which are not modeled explicitly in the PRA, can defeat redundancy and make failures of multiple redundant components more likely than would be the case if these factors were absent. These factors often reside in the organizational environment in which the components are embedded, and are not intrinsic properties of the components themselves.
(2) The treatment of such intercomponent dependencies is probabilistic. The unconditional probabilities associated with CCF basic events in the PRA represent joint dependent failure probabilities of similar components that have been placed into a CCCG on the basis that shared causes of dependent failure exist. ECA requires these CCF probabilities to be conditioned upon observed failures of components in the CCCG. As a simple example, assume a CCCG contains three redundant valves. If one valve is already failed, the unconditional probability of three valves failing dependently must be replaced by the conditional probability that the remaining two valves fail dependently, given that one valve has already failed. This conditional probability of two valves failing will obviously be higher than the unconditional probability that all three valves fail.
(3) The parameters in the probabilistic CCF models are not estimated for specific causes of failure.
CCF parameter estimates are based on counting component failures, with these failures due to an array of different causes. Thus, the alpha-factor estimates in the NRC CCF database are based on events spanning a range of causes, and there are currently no means to condition a CCF probability upon a particular cause of failure using these estimates. Thus, with the current consensus CCF modeling approach, the best that can be done is to condition CCF probability upon failures observed in the event being analyzed.
(4) All identified performance deficiencies (typically in an IR) that result in failure of one or more components in a CCCG have the potential for CCF. Exceptions to this principle are expected to be infrequent in practice, and will most often be related to issues with the CCCG boundaries. We discuss potential exceptions in more detail below.
(5) ECA relies on the failure memory approach, and gives no credit to observed successful equipment operation. This treatment of CCF is consistent with how other events are handled in ECA. This principle governs the issues of testing and mission time window. A successful operability test of redundant components in the CCCG in which a failure was observed does not reduce the conditional CCF probability of the remaining components to zero. For ECA, arguments about the time window are irrelevant, and essentially go against the failure memory concept. Again, the failure memory concept does not allow credit for successful operation, and there is no guarantee that multiple components could not fail during the mission time window.
(6) With respect to defenses against CCF, such as staggering maintenance on redundant components, such defenses must be deliberate and effective in order to break dependence. To put it another way, chance alone cannot be relied upon to eliminate CCF potential from an ECA. The effect of such defenses on risk is to be considered qualitatively during the enforcement phase rather than quantitatively in the ECA.
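The three-valve example in principle (2) can be illustrated numerically. The sketch below is not the alpha-factor calculation performed by SAPHIRE 8; it assumes a much simpler beta-factor model, and both inputs (`q_total` and `beta`) are hypothetical values chosen only to show why the conditional probability exceeds the unconditional one.

```python
def beta_factor_ccf(q_total: float, beta: float) -> dict:
    """Beta-factor sketch for a redundant group (hypothetical inputs).

    q_total: total failure probability of one component (assumed value).
    beta:    fraction of that probability attributed to a shared cause (assumed value).
    """
    if not 0.0 < q_total <= 1.0:
        raise ValueError("q_total must be in (0, 1]")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be in [0, 1]")
    # Unconditional probability that the shared cause fails every component.
    unconditional_all = beta * q_total
    # Given that one component has already failed, the probability that the
    # failure was the shared cause (and so takes the others) is the ratio.
    conditional_rest = unconditional_all / q_total
    return {
        "unconditional_all": unconditional_all,
        "conditional_rest": conditional_rest,
    }


# Hypothetical numbers, for illustration only.
result = beta_factor_ccf(q_total=1.0e-3, beta=0.05)
print(result)
```

With these assumed inputs the conditional value equals `beta`, several orders of magnitude above the unconditional value, which is the qualitative point made in principle (2). How large the conditional value should be in a real assessment is exactly what the comments in this document dispute.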
2.2 CCF Treatment Categories
In this section we describe categories of events that are expected to span the spectrum of cases encountered in ECA. Guidance is given for CCF treatment in each of these categories using the ECA Workspace in SAPHIRE 8. See App. C for examples that illustrate the ECA Workspace. SAPHIRE 8 calculates conditional CCF probabilities for each category, using the approach described in (Rasmuson & Kelly, 2008).
2.2.1 Observed Failure with Loss of Function of One Component in the CCCG
28.) There is no basis for the approach of using the alpha factor in the baseline PRA model as an estimate for the conditional probability that an event is a CCF. The CCF model is being implemented in a manner for which it was never intended. The alpha factor is correlative and not a conditional probability. Again, this surrogate use of an alpha factor may be driven by the goal of creating a fast and dirty conditional probability value, but any conclusion is suspect.
In this category, which is probably the most commonly encountered one in ECA, the event being analyzed involves the observed failure of one component in a CCCG. The corresponding basic event in the SPAR model is set to Failure with potential shared cause in the ECA Workspace in SAPHIRE 8.
This will be the case regardless of the outcome of an operability test on the other components in the CCCG. This treatment is also generally independent of differences among the components in the CCCG at the subcomponent or piece-part level; exceptions will be infrequent and will typically involve CCCG boundary issues, which should be resolved via modifications to the SPAR model, with assistance from the INL, before carrying out the conditional CCF calculation, as discussed below.
29.) This treatment arbitrarily brings in some knowledge to update the PRA model and excludes other knowledge. If it is determined that the remaining components are not affected by the cause of the first failure, it should be modeled as an independent event.
NO ADDITIONAL COMMENTS
EA-PSA-SDP-P7C-11-06 2 Comments on SDP Phase 3 Analysis Page 1 of 5 1
ATTACHMENT 1 SDP PHASE 3 ANALYSIS
To calculate the exposure time, Section 2.1 of Volume 1 of the Risk Assessment of Operational Events (RASP) Handbook was used. The RASP Handbook states that the exposure time is the duration period of the failed or degraded structure, system, or component (SSC) being assessed that is reasonably known to have existed and includes repair time. For the P-7C pump, the exposure time was one year (the maximum allowable time used in risk analyses), based upon the new stainless steel material for the couplings for all three SW pumps being in place since at least mid-May 2010. The P-7C pump failed on August 9, 2011, at 1201 hours and was returned to service on August 12, 2011, at 0309 hours following successful post-maintenance surveillance testing. Thus, the repair time was approximately 63 hours.
There is no recovery credit in this analysis.
This analysis divided the exposure time into two segments:
The exposure time with the P-7C SW pump not failed (for approximately one year), but with an increased failure-to-run (FTR) rate for all three SW pumps.
The exposure time when the P-7C SW pump was failed (approximately 63 hours), with an increased FTR rate for SW pumps P-7A and P-7B.
The SRA used the Palisades Standardized Plant Analysis Risk (SPAR) model (version 8.17 dated June 20, 2011), and the SAPHIRE 8 version 8.0.17 software.
The Palisades SPAR model was modified using the Events and Conditions Assessment (ECA) workspace with the following changes:
A revised FTR rate for the three SW pumps was obtained using statistical analysis, i.e., a Bayesian update with a Jeffreys non-informative prior. The two observed failures of the SW pumps over a total run-time of 40509 hours for the three SW pumps (from installation of the new stainless steel couplings until the failure of the P-7C pump on August 9, 2011) were used. The revised FTR rate for the three SW pumps was 6.17E-5/hour.
A revised initiating event frequency (IEF) for a loss of service water (IE-LOSW) was obtained based on an approximate method recommended by Idaho National Laboratory (INL). To estimate a new IEF for the LOSW event, the existing SW system fault tree was solved with the nominal SW pump FTR failure rate (3.9E-6/hour), and then again with the new FTR rate (6.17E-5/hour). The ratio of the SW system unavailability with the new rate to that of the system unavailability with the old rate was then calculated. The IEF for the LOSW was then increased by this ratio.
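The two model changes described above can be reproduced with a few lines of arithmetic. The sketch below assumes the Bayesian update uses the posterior mean of a gamma distribution with shape n + 0.5 and rate T, which is the usual result for a Poisson failure process with a Jeffreys prior; the function names are illustrative and are not taken from SAPHIRE. The fault tree unavailabilities are not reported in the analysis, so the scaling step is shown in normalized form using the factor of 3.23 that the Phase 3 analysis reports for the one-year segment.

```python
def jeffreys_rate_mean(failures: int, exposure_hours: float) -> float:
    """Posterior mean failure rate for a Poisson process with a Jeffreys prior.

    The posterior is Gamma(failures + 0.5, exposure_hours), whose mean is
    (failures + 0.5) / exposure_hours.
    """
    if failures < 0 or exposure_hours <= 0:
        raise ValueError("failures must be >= 0 and exposure_hours > 0")
    return (failures + 0.5) / exposure_hours


def scale_ief(base_ief: float, unavail_nominal: float, unavail_revised: float) -> float:
    """Scale an initiating event frequency by a ratio of system unavailabilities."""
    if unavail_nominal <= 0:
        raise ValueError("unavail_nominal must be > 0")
    return base_ief * (unavail_revised / unavail_nominal)


# Two failures in 40509 pump run-hours, as stated in the Phase 3 analysis.
ftr_rate = jeffreys_rate_mean(failures=2, exposure_hours=40509)
print(f"Revised FTR rate: {ftr_rate:.2e} per hour")

# Only the ratio (3.23) is reported, so the unavailabilities are normalized
# here to 1.0 and 3.23; the actual fault tree results are not in the source.
revised_ief = scale_ief(base_ief=2.50e-4, unavail_nominal=1.0, unavail_revised=3.23)
print(f"Revised LOSW IEF: {revised_ief:.3e} per year")
```

The first result matches the 6.17E-5/hour quoted above. The second comes out slightly above the 8.06E-4/year reported in the analysis, presumably because the reported factor of 3.23 is rounded.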
1.) The approach for revising the IE-LOSW takes the ratio of unavailabilities calculated from a mitigation system fault tree and multiplies it by the existing IE frequency. This method would not be capable of meeting ASME/ANS PRA Standard Supporting Requirements IE-C9 and IE-C10, as it combines fault trees for system unavailability with a model for initiating event analysis. However, it is recognized that SPAR modeling utilizes conservative techniques in order to efficiently assess equipment reliability issues, and that is likely the case here. But such an approach can overstate the significance of equipment failure.
Fault tree models that are developed for establishing the unavailability of a system in response to an initiating event cannot be manipulated this way to produce a correct estimate of the initiating event frequency. Both the structure of the tree and the computational algorithm must be modified to provide an appropriate model. This is the motivation behind SRs IE-C9 and IE-C10. In addition, the success criteria and mission time assumptions are fundamentally different.
2.) The SW system has a different configuration during normal operation than is the case following most initiating events. In normal operation there are two normally operating pumps and one pump in standby, which may or may not be in maintenance at the time of the initiating event. The assignment of pumps to each mode is rotated periodically. After most initiating events, the configuration is changed by various signals, yielding a symmetrical configuration. The common cause models, success criteria, and mission times all need to be modified when converting from one configuration to another.
3.) The calculation of IE frequency is a point estimate that does not include uncertainty. The Palisades analysis accounts for uncertainty. This is important for the run-run cutsets due to the state of knowledge correlation.
4.) The approach used to calculate the IE frequency does not adequately isolate the contributions from pump related and non-pump related failure causes, whereas the Palisades analysis does. This is critical to the question of how much changes in pump performance affect the LOSW IE frequency.
For the exposure time of approximately 1 year with all three SW pumps having an increased FTR rate, the IEF for the LOSW event increased by a factor of 3.23 (from 2.50E-4/year to 8.06E-4/year).
5.) The SDP model evaluates the CDF over a one year period, whereas the Palisades analysis covers the entire period when the wrong SS material was installed, which is about 2.5 years.
The configurations covered by the SDP analysis only look at one of the pump failures whereas the Palisades analysis covered both pump failures and other periods of pump maintenance unavailability.
For the exposure time from August 9, to 11, 2011, of 63 hours (with the P-7C SW pump failed - True), and with an increased FTR rate for P-7A and P-7B SW pumps to 6.17E-5/hour, the IEF for the LOSW event increased by a factor of 1590 (from 2.50E-04/year to 0.40/year).
A common cause failure (CCF) potential associated with the performance deficiency for all three SW pumps was assumed. Consistent with the RASP Handbook, a component failure should only be modeled as an independent failure if the cause is well understood and there is no possibility that the same circumstance exists in other components in the same common-cause component group. Based on this, it was assumed that there was a CCF potential associated with all three SW pumps.
6.) There is insufficient information to understand how the common cause potential is modeled.
A reasonable way to do this would be to assess an impact vector for each event in the same format as is done when CCF events are coded into the INL CCF database. If it was assumed that the two failure events were common cause failures of all three pumps, that would be inconsistent with the engineering evaluations that were performed by Palisades. Each event involved failure of a single pump. Such an impact vector would express the probability that, if similar failures occurred in the future, the other SW pumps would also be failed at the same time or in the same time frame. The probability that recurrence of a pump failure would have resulted in failures of one or both additional pumps must be extremely low.
In summary, the method and weight given to the common cause potential is not available to review. In the Palisades report, common cause failures dominate the estimated change in CDF and the assumptions behind this are documented.
The change in core damage frequency (CDF) risk was evaluated for each of the two segments of the exposure time and the results were added together to get a total internal events CDF risk.
Case 1: P-7C SW pump not failed (approximately 1 year), but with an increased FTR rate for all three SW pumps.
For the exposure time of approximately 1 year with all three SW pumps having an increased FTR rate of 6.17E-5/hour and an increased IEF of 8.06E-4/year, the CDF was calculated to be 7.6E-7/year.
Case 2: P-7C SW pump failed (approximately 63 hours), and with an increased FTR rate for P-7A and P-7B SW pumps.
For the exposure time from August 9, to 11, 2011, of 63 hours with the P-7C SW pump failed and with an assumed increase in the FTR rate for P-7A and P-7B SW pumps of 6.17E-5/hour and an increased IEF to 0.40/year, the CDF was calculated to be 3.9E-6/year.
The total internal events CDF is the sum of the two CDFs calculated above or 4.7E-06/year.
7.) It is not clear if the analysis is calculating the change in the average CDF due to pump issues.
This is evidenced by the fact two different CDF cases for two different pump alignments are added together, but this does not consider the fraction of time in each alignment. Adding up two configuration specific CDF estimates does not appear appropriate. When estimating the change in CDF, both CDF estimates should be on the same basis. This concern may be due to insufficient details provided to explain how the numbers were calculated.
The dominant sequences involved a loss of service water system initiating event, failure of reactor coolant pump (RCP) seal cooling, failure of SW system recovery, and containment cooling failure cutsets.
Since the total estimated change in core damage frequency was greater than 1.0E-7/year, IMC 0609, Appendix A, Attachment 3, was used to assess external event contributions.
The fire risk contribution was estimated using information from the licensee's Individual Plant Examination for External Events (IPEEE), Revision 1, dated May 22, 1996. From Section 4.0.4 of the IPEEE, the core damage frequency from fires is 3.31E-5/year.
In Table 4.12-2, "Risk Significant Operator Actions for the Fire Analysis," a Risk Achievement Worth (RAW) value for "failure to align alternate suction source to auxiliary feedwater (AFW) upon depletion of the Condensate Storage Tank (CST)" is given the value of 3.6. The increase in the failure probability of the SW system (used as a suction source to the AFW system) due to the performance deficiency was calculated to be 3.44E-3.
An estimate of the CDF for the fire risk was obtained as:
CDF(fire) = [RAW - 1] x [Increase in Failure Probability of SW] x [CDF for fires]
= [3.6 - 1] x [3.44E-3] x [3.31E-5/year]
= 3.0E-7/year
The total estimated CDF from fires is thus 3.0E-7/year.
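The fire estimate above can be checked directly. The sketch below reads the bracketed term as RAW minus 1, which is the reading that reproduces the stated result and matches the "RAW-1" wording in comment 8; the function name is illustrative only.

```python
def fire_delta_cdf(raw: float, delta_failure_prob: float, fire_cdf: float) -> float:
    """Estimate the fire contribution as (RAW - 1) x delta P x baseline fire CDF."""
    if raw < 1.0:
        raise ValueError("RAW is expected to be >= 1")
    return (raw - 1.0) * delta_failure_prob * fire_cdf


# Inputs as quoted from the IPEEE and the Phase 3 analysis.
fire_cdf_increase = fire_delta_cdf(raw=3.6, delta_failure_prob=3.44e-3, fire_cdf=3.31e-5)
print(f"Fire contribution: {fire_cdf_increase:.2e} per year")
```

The product is about 2.96E-7/year, which rounds to the 3.0E-7/year stated in the analysis.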
8.) The equation used to estimate impact on fire CDF using RAW-1 would only yield the temporary risk impact with one train out of service. Since the comparison is made to an annual CDF, it would be more appropriate to use the Fussell-Vesely importance.
The seismic risk contribution was estimated using information from the licensee's IPEEE, Revision 1, dated May 22, 1996. From Table 3.6-3 of the IPEEE, the total core damage frequency from Class IA and Class IB seismic events was 6.16E-6/year.
Failure of secondary heat removal requires the loss of the AFW system. AFW pumps P-8A and P-8B take suction from the fire protection system (FPS) after the condensate storage tank (CST) is depleted. AFW pump P-8C is the only one of the three AFW pumps that can take suction from the SW system after the CST is depleted. The failure rate of AFW pump P-8C is proportional to the loss of secondary heat removal during a seismic event, and thus is proportional to the total core damage frequency from Class IA and Class IB seismic events. Per Section 3.6.5.3.3 of the IPEEE, there are two dominant random event groups that contribute to the failure of AFW pump P-8C which gives a failure rate of 6.01E-2/year for AFW pump P-8C.
However, with the performance deficiency associated with the SW pump couplings, the increase in the IEF for a LOSW event represented an additional failure rate for AFW pump P-8C. With the performance deficiency, the total failure rate for AFW pump P-8C was 6.35E-2/year (6.01E-2/year + 3.44E-3/year). The fractional increase in the failure rate for AFW pump P-8C due to the performance deficiency was 5.7 percent.
An estimate of the CDF for the seismic risk was obtained as follows:
CDF(seismic) = [Fractional increase in failure rate for AFW pump P-8C] x [CDF for Class IA and Class IB seismic events]
= [0.057] x [6.16E-6/year]
= 3.5E-7/year
The total estimated CDF from seismic events is thus 3.5E-7/year.
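The seismic estimate above follows from two steps: the fractional increase in the AFW pump P-8C failure rate, then scaling the seismic CDF by that fraction. A minimal sketch using only the values quoted in the text is shown below; the variable names are illustrative assumptions.

```python
# Inputs as quoted in the text (names are illustrative, not from the source).
BASE_RATE_P8C = 6.01e-2      # AFW pump P-8C failure rate, per year
ADDED_RATE_LOSW = 3.44e-3    # additional rate attributed to the SW coupling deficiency, per year
CDF_SEISMIC = 6.16e-6        # CDF for Class IA and Class IB seismic events, per year

# Total failure rate with the performance deficiency.
degraded_rate_p8c = BASE_RATE_P8C + ADDED_RATE_LOSW

# Fractional increase relative to the base failure rate.
fractional_increase = ADDED_RATE_LOSW / BASE_RATE_P8C

# Seismic delta CDF obtained by scaling the seismic CDF.
seismic_delta_cdf = fractional_increase * CDF_SEISMIC

print(f"Degraded P-8C rate:  {degraded_rate_p8c:.2e} per year")
print(f"Fractional increase: {fractional_increase:.1%}")
print(f"Seismic delta CDF:   {seismic_delta_cdf:.1e} per year")
```

The text rounds the fraction to 0.057 before multiplying; carrying the unrounded fraction gives the same result to two significant figures.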
Internal flood risk contributions were screened using IMC 0609, Appendix A, Table 3.1, Plant Specific Flood Scenarios. The guidance lists SSCs important to internal flooding and there are no SSCs listed for Palisades.
The total estimated delta CDF for external events is the sum of the fire risk contribution (3.0E-7/year) and the seismic risk contribution (3.5E-7/year), or 6.5E-7/year.
The total estimated delta CDF is the sum of the internal events contribution (4.7E-6/year) and the external events contribution (6.5E-7/year), or 5.4E-6/year.
The SRAs used IMC 0609 Appendix H, "Containment Integrity Significance Determination Process," to determine the potential risk contribution due to LERF. Palisades Nuclear Plant is a 2-loop Combustion Engineering Pressurized Water Reactor (PWR) with a large, dry containment.
Sequences important to LERF include steam generator tube rupture events and inter-system loss of coolant accident (LOCA) events. These were not the dominant core damage sequences for this finding and thus the risk significance due to LERF was evaluated to be of very low safety significance.
In summary, the conclusion of the Phase 3 analysis was an estimated change in core damage frequency of 5.4E-6/year (WHITE). The licensee has not yet provided the results of a risk evaluation for the finding.
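The roll-up behind the summary value can be sketched as follows. The contribution values come from the quoted text; the color bands are the commonly cited SDP delta CDF thresholds (GREEN below 1E-6, WHITE from 1E-6 to 1E-5, YELLOW from 1E-5 to 1E-4, RED at or above 1E-4) and are an assumption here, since this document does not state them. Function and variable names are illustrative.

```python
# Contributions quoted in the Phase 3 text, per year.
INTERNAL_EVENTS = 4.7e-6
FIRE = 3.0e-7
SEISMIC = 3.5e-7

external_events = FIRE + SEISMIC
total_delta_cdf = INTERNAL_EVENTS + external_events


def sdp_color(delta_cdf):
    """Map a delta CDF (per year) to an SDP color band.

    Thresholds are assumed, not taken from this document.
    """
    if delta_cdf < 1e-6:
        return "GREEN"
    if delta_cdf < 1e-5:
        return "WHITE"
    if delta_cdf < 1e-4:
        return "YELLOW"
    return "RED"


print(f"External events: {external_events:.2e} per year")
print(f"Total delta CDF: {total_delta_cdf:.2e} per year")  # reported as 5.4E-6 in the text
print(f"Band: {sdp_color(total_delta_cdf)}")
```

Under these assumed bands, both the Phase 3 total and the internal events contribution alone fall in the same band, so the external events terms do not change the color of the result.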
9.)
A parameter comparison of the Palisades analysis to the Phase 3 SDP analysis is provided in the table below.
Table A12-1, Comparison of Palisades Analysis to Phase 3 SDP Analysis

| Parameter | Palisades Analysis | Phase 3 SDP Analysis |
|---|---|---|
| SW pump failure rate base case per hour | 3.91E-06 | Not provided |
| SW pump failure rate in degraded period per hour | 6.04E-05 | 6.15E-05 |
| Prior used for degraded state failure rate estimate | Jeffreys non-informative | Jeffreys non-informative |
| Evidence used for Bayes' update | 2 failures in 41,429 hrs | 2 failures in 40,505 hrs |
| Period over which change in CDF is evaluated | 3.5 years | 1 year |
| Base CDF per RCY | 2.83E-05 | Not provided |
| Base CDF due to LOSW IE per RCY | 3.27E-06 | Not provided |
| Base CDF due other IE per RCY | 2.50E-05 | Not provided |
| CCDP given LOSW IE Base | 2.68E-03 | Not provided |
| CCDP given LOSW IE in degraded period | 2.68E-03 | Not provided |
| Base LOSW IE Frequency (average) per RCY | 1.22E-03 | 2.50E-04 |
| Base LOSW IE Frequency due pumps per RCY | 1.31E-05 | Not provided |
| Base LOSW IE Frequency due non pump related causes per RCY | 1.21E-03 | Not provided |
| Base LOSW IE Frequency with 3rd pump OOS per RCY | 1.99E-03 | Not provided |
| Base LOSW IE Frequency with 3rd pump in service per RCY | 1.21E-03 | Not provided |
| Degraded LOSW IE Frequency per RCY | 1.58E-03 | Not provided but can be estimated at 3.68E-03 |
| Increase in LOSW IE Frequency in degraded period per RCY | 3.65E-04 | Not provided but can be estimated at 3.43E-03 |
| Degraded LOSW IE Frequency with 3rd pump OOS per RCY | 1.71E-02 | 4.00E-01 |
| Degraded LOSW IE Frequency with 3rd pump in service per RCY | 1.25E-03 | 8.06E-04 |
| Common Cause Treatment | Beta factor for two running pumps assumed to be the same as for the base case unavailability model | Some potential is assessed but how this is quantified is unknown |
| Change in CDF due to degraded SW couplings | 8.98E-07 | 4.70E-06 |
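Several entries in Table A12-1 can be cross-checked against each other. The sketch below assumes that the degraded failure rates are posterior means of a Jeffreys non-informative prior for a Poisson process, i.e. (failures + 0.5) / exposure hours; the table names the prior but not the estimator, so this is an assumption. All names are illustrative.

```python
def jeffreys_posterior_mean(failures, hours):
    """Posterior mean rate for a Poisson process with a Jeffreys prior (assumed estimator)."""
    return (failures + 0.5) / hours


# Degraded-period SW pump failure rates, per hour, from the tabulated evidence.
palisades_rate = jeffreys_posterior_mean(2, 41_429)
phase3_rate = jeffreys_posterior_mean(2, 40_505)

# Palisades: base CDF due to LOSW = base LOSW IE frequency x CCDP given LOSW.
palisades_base_losw_cdf = 1.22e-3 * 2.68e-3

# Palisades: base CDF = LOSW contribution + contribution from other initiating events.
palisades_base_cdf = 3.27e-6 + 2.50e-5

# Phase 3: estimated increase in LOSW IE frequency = degraded estimate - base.
phase3_losw_increase = 3.68e-3 - 2.50e-4

# Ratio of the two reported changes in CDF.
delta_cdf_ratio = 4.70e-6 / 8.98e-7

print(f"Palisades degraded rate: {palisades_rate:.2e} per hour (table: 6.04E-05)")
print(f"Phase 3 degraded rate:   {phase3_rate:.2e} per hour (table: 6.15E-05)")
print(f"Palisades base LOSW CDF: {palisades_base_losw_cdf:.2e} per RCY")
print(f"Palisades base CDF:      {palisades_base_cdf:.2e} per RCY")
print(f"Phase 3 LOSW increase:   {phase3_losw_increase:.2e} per RCY")
print(f"Delta CDF ratio:         {delta_cdf_ratio:.1f}")
```

Under the assumed estimator the computed degraded rates are close to, but not identical with, the tabulated values; the small differences may reflect rounding or a different estimator in the original analyses. The frequency and CDF roll-ups reproduce the tabulated entries to three significant figures.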