ML20206B987

Summary of 881102 Meeting W/Numarc & EPRI in Rockville,Md Re Emergency Diesel Generator Reliability Monitoring & Reliability Programs.Attendee List Encl
ML20206B987
Person / Time
Site: Crystal River
Issue date: 11/08/1988
From: Serkiz A
NRC OFFICE OF NUCLEAR REGULATORY RESEARCH (RES)
To: Kniel K
NRC OFFICE OF NUCLEAR REGULATORY RESEARCH (RES)
Shared Package
ML20206B990 List:
References
FRN-57FR14514, REF-GTECI-B-56, REF-GTECI-EL, RTR-NUREG-CR-5078, TASK-B-56, TASK-OR AE06-1-010, AE6-1-10, NUDOCS 8811160066

UNITED STATES
NUCLEAR REGULATORY COMMISSION
WASHINGTON, D.C. 20555

MEMORANDUM FOR: Karl Kniel, Chief
Reactor and Plant Safety Issues Branch
Division of Safety Issue Resolution, RES

FROM: Aleck W. Serkiz, Senior Task Manager
Reactor and Plant Safety Issues Branch
Division of Safety Issue Resolution, RES

SUBJECT: SUMMARY OF MEETING WITH NUMARC AND EPRI ON EMERGENCY DIESEL GENERATOR (EDG) RELIABILITY MONITORING AND RELIABILITY PROGRAMS

Meeting Date: November 2, 1988

Location: U.S. NRC, 5650 Nicholson Lane, Rm. NL/5-013, Rockville, Md. 20852

Purpose of Meeting:

The general purpose of this meeting was to continue an exchange of information with NUMARC and EPRI related to estimating and monitoring EDG reliability levels and to obtain insights into current industry practices concerning diesel generator reliability.

Attendees:

See attached attendees list.

Summary:

This meeting was a follow-up to the July 27, 1988 meeting with NUMARC and EPRI. The following topics were discussed:

1) EPRI's proposed approach to combining individual EDG reliabilities to calculate a plant-specific EDG reliability level (Enclosure 1).
2) EPRI's proposed approach to maintaining EDG reliability and "graded actions" for responding to EDG failure trigger exceedances (Enclosures 2 and 3).

In addition, EPRI provided a copy of an Erin Engineering and Research, Inc. report entitled "Investigation of an Emergency Diesel Generator Reliability Program: A Case Study of Crystal River Unit 3," October 1988 (Enclosure 4).

3) Feedback on NUMARC's survey of EDG reliability and maintenance programs being utilized (Enclosure 5).

Discussions related to combining individual EDG reliabilities for estimating plant-specific levels were not concluded at this meeting. Outstanding questions related to accounting for common cause effects, how to incorporate "swing" diesels, variability of plant-specific cross-connections, etc. require follow-up. A calculational template (or set of equations) for expected configurations needs to be developed and agreed upon.

Considerable discussion was devoted to EPRI's presentation of an industry process for maintaining EDG reliability and a "graded response" EDG reliability program (see Enclosures 2 and 3). The following items identify where follow-up discussions are needed:

1) Making more extensive use of a "20 tests" sample
2) Utilization of INPO's performance indicators tracking program, identification of the level of information preserved at the plant site for an EDG failure, and timely reporting of such data. Currently data is reported to INPO on a quarterly basis and INPO takes another quarter to review and provide feedback to the industry.
3) Identification of the time period wherein corrective action will be taken to arrest a degrading EDG reliability picture and verification that such corrective action has been effective.
4) Actions that might compensate for any degradation of reliability in the interim before corrective measures are taken and verified.

In the course of these discussions, Hoopingarner (PNL) presented his preliminary views on action levels (see Enclosure 6) derived from his work for RES's aging program. [illegible] is Devonrue's overview of current industry practice and experience concerning diesel generator reliability. Questions related to these slides identified a need to review the back-up information used to develop them, in order to identify the specific monitoring and surveillance practices which are effective and most important.

To facilitate follow-up discussions related to surveillance techniques, NRC is providing a DRAFT copy of "A Reliability Program for Emergency Diesel Generators at Nuclear Power Plants," NUREG/CR-5078, Volume 2 (Enclosure 7). This draft has received both RES and SNL technical review approval and is scheduled to be published in late November 1988.

It is my understanding that NUMARC will provide the NRC with comparable back-up information for review in advance of follow-up meetings to facilitate identification of the most important surveillance and monitoring items.

NRC comments forwarded to NUMARC (see A. Serkiz to A. Marion transmittal dated September 19, 1988) were discussed and for the most part appear to have been resolved, or will be pursued further (i.e., making use of the "20 tests" sample).

The SELB representative noted that accelerated testing is necessary to determine whether a major degradation in diesel generator reliability is indicated, and to provide assurance that corrective actions taken in response to recent failures have been effective. The accelerated testing (monthly to weekly and recovery therefrom) should be based on failures in the last 20 demands.

The detailed position is provided in Section 10.2.2 of the proposed Rev. 3 of RG 1.9.

This position does not imply that diesel generator reliability is to be calculated on the basis of the last 20 demands.

NUMARC representatives stated that they are not in agreement with the above SELB position and that NUMARC would comment further during the FOR COMMENT period associated with RG 1.9, Proposed Revision 3.

E. Butcher (NRR/OTSB) provided some insights into recent NRC activities related to technical specification improvements which are underway. During this discussion NUMARC was asked to consider participating in a pilot program which would use the EDG reliability program (as described in NUREG/CR-5078, Vol. 1) and a revised RG 1.9 to develop and implement EDG technical specification requirements. NUMARC representatives agreed to raise the question with the NUMARC executive structure. This subject will be on the agenda for a follow-up meeting.

In conclusion, it can be fairly stated that these discussions were forthright and that a considerable exchange of views took place. A tentative follow-up meeting date of December 15, 1988 was identified to further pursue the items noted above. A. Serkiz (NRC) and A. Marion (NUMARC) will develop an agenda prior to such a meeting.

Aleck W. Serkiz, Senior Task Manager
Reactor and Plant Safety Issues Branch
Division of Safety Issue Resolution, RES

Enclosures: As stated

cc:
E. Beckjord
T. Speis
R. W. Houston
W. Minners
F. Gillespie
A. Thadani
F. Rosa
Attendees
PDR
Central Files
M. Vagins

MEETING ATTENDEES
November 2, 1988

SUBJECT: NUMARC/EPRI APPROACH TO EDG RELIABILITY MONITORING

Attendees                   Affiliation                  Phone Number
Al Serkiz                   NRC/RES/RPSIB                301-492-3923
Arthur Payne                SNL                          505-846-3588
Ernie Lufgren               SAIC                         703-821-4492
John Gaertner               EPRI                         415-855-2933
John Flack                  NRC/RES/ARGIB                301-492-3741
Robert J. Colmar            NRR/GAIS/PTSB                301-492-3076
Jack Burns                  NRC/RES/EMEB                 301-492-3845
R. J. Deese                 Duke Eng. & Service          704-373-4740
Stuart Lindsay              Duke Power/Nuclear Maint.    704-373-8768
Matthew Chiramal            NRC/AEOD                     301-492-4441
Mike McGarry                BCP&R/NUMARC                 202-371-5733
Harvey L. Wyckoff           EPRI                         415-855-2393
Om P. Chopra                NRR/SELB                     301-492-0835
Karl Kniel                  NRR/RES/DSIR/RPSIB           301-492-3950
Warren Minners              NRR/RES/DSIR                 301-492-3980
Alex Marion                 NUMARC                       202-872-1280
Chuck Ondash                DEVONRUE                     617-426-4550
David Shum                  NRC/NRR/SPLB                 301-492-0860
Ken Hoopingarner            PNL (Battelle)               509-376-4643
Paul Norian                 NRR/RES/DSIR/RPSIB           301-492-3910
Ed Butcher / Bob Giardina   NRR/OTSB                     301-492-1183

ENCLOSURE 1, Ref. 11-2-88 NUMARC/EPRI/NRC Meeting

SUBJECT: EDG RELIABILITY

ELEMENT 5

THE CASE FOR COMBINING THE FAILURE EXPERIENCE OF A NUCLEAR UNIT'S EDGs

IS EMERGENCY POWER RELIABILITY BEST PORTRAYED AND MOST ACCURATE IF ITS DETERMINATION IS KEYED ON:

- THE RELIABILITY PERFORMANCE OF EACH EDG
- THE COMBINED FAILURE EXPERIENCE OF A NUCLEAR UNIT'S EDGs

HISTORICALLY, EACH EDG HAS BEEN EVALUATED SEPARATELY

- THIS MAKES THE RELIABILITY PERFORMANCE OF EACH EDG STAND OUT MORE SHARPLY

ON THE OTHER HAND THE INDIVIDUAL APPROACH HAS SIGNIFICANT DISADVANTAGES

- THE DATA CAN REACH BACK UP TO 8 YEARS (FOR A 100 DEMAND SAMPLE), REMOVING RELEVANCE
- OR ALTERNATELY, REACH-BACK CAN BE LIMITED BY CHOOSING SMALL SAMPLE SIZES, BUT THIS CAUSES GREATER STATISTICAL UNCERTAINTY

THERE IS MUCH TO RECOMMEND COMBINING THE FAILURE EXPERIENCE OF A UNIT'S EDGs AND JUDGING THE OVERALL EDG RELIABILITY AS AN ENTITY

- THE REACH-BACK CAN BE CUT IN HALF, OR
- LARGER SAMPLE SIZES CAN BE USED, OR
- SOME COMBINATION
- THE OUTCOME IS A MORE RELEVANT AND CONFIDENT RELIABILITY DETERMINATION
- REFLECTS THAT OVERALL PLANT RISK IS DEPENDENT ON THE COMBINED RELIABILITY OF THE EDGs

CONCERN IS SOMETIMES EXPRESSED THAT BY COMBINING THE FAILURE EXPERIENCE OF A NUCLEAR UNIT'S EDGs, A RELIABLE EDG CAN MASK ONE WITH POOR RELIABILITY

- THIS IS NOT THE CASE

CONSIDER:

- TWO EDGs EACH WITH 5 FAILURES IN 100 DEMANDS
  THE PAIR'S UNRELIABILITY IS: 0.05 X 0.05 = 0.0025
- ONE EDG WITH 1 FAILURE IN 100 DEMANDS AND ONE WITH 9 FAILURES IN 100 DEMANDS
  THE PAIR'S UNRELIABILITY IS: 0.01 X 0.09 = 0.0009
- SAME AS PRECEDING, BUT THE FAILURE EXPERIENCE IS COMBINED
  THE PAIR'S UNRELIABILITY WOULD BE DETERMINED TO BE: 0.0025

COMBINED FAILURE EXPERIENCE IS A CONSERVATIVE MEASURE OF A NUCLEAR UNIT'S OVERALL EDG RELIABILITY

IT ALSO IS IN CONCERT WITH THE PHILOSOPHY TO APPLY RELIABILITY IMPROVEMENT ACTIONS TO ALL EDGs

A COMBINED GROUP SHOULD INCLUDE ONLY EDGs THAT PROTECT THE SAME NUCLEAR UNIT OR UNITS

ELEMENT 5: COMBINING THE FAILURE EXPERIENCE OF A NUCLEAR UNIT'S EDGs

A nuclear unit typically has at least two EDGs to provide on-site emergency power.

In deciding how to characterize the plant's assurance of having on-site emergency power available when needed, an important judgment must be made. On balance, is the overall reliability of the plant's on-site emergency power best portrayed, most accurate, and most meaningful if 1) the determination is keyed on the individual reliability performance of each EDG, or 2) the failure experience of all of the unit's EDGs is combined and judged as an entity?

Historically, the reliability of each EDG has been determined and considered separately. This approach has the sole advantage of making the reliability performance of individual EDGs stand out more sharply. But this advantage comes at the price of having smaller sample sizes to base conclusions on.

This can result in long reach-back periods and large statistical uncertainties.

Individual EDGs at some plants load-run as few as a dozen times a year. This results in a reach-back of up to 8 years for a 100 demand sample and 4 years for a 50 demand sample. As explained in Element 4, a 20 demand sample is statistically too uncertain to be usable. An 8 year reach-back (100 demand sample) is too long to be relevant and a 4 year reach-back (50 demand sample) is relevant only if conditions have remained relatively unchanged.

Because of accuracy and relevance considerations, there is much to recommend combining the failure experience of all of a nuclear unit's EDGs and judging overall EDG reliability as an entity. Combining the failure experience causes the reach-back period to be cut by at least half, or it allows the use of larger sample sizes, or some combination of the two. The benefit of a larger database is less uncertainty in the EDG reliability determination along with the use of more recent and relevant data.
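The reach-back arithmetic above can be checked with a short sketch. It assumes a steady demand rate that is the same for every EDG in the group; the function and parameter names are illustrative and do not come from the EPRI material.

```python
def reach_back_years(sample_size: int,
                     demands_per_edg_per_year: float,
                     n_edgs: int = 1) -> float:
    """Years of history needed to accumulate `sample_size` demands.

    Assumption: every EDG in the group is demanded at the same steady rate.
    """
    if sample_size <= 0 or demands_per_edg_per_year <= 0 or n_edgs <= 0:
        raise ValueError("all inputs must be positive")
    return sample_size / (demands_per_edg_per_year * n_edgs)


if __name__ == "__main__":
    # 12 demands per EDG per year is the "dozen times a year" case in the text.
    for n_edgs in (1, 2):
        for sample in (50, 100):
            years = reach_back_years(sample, 12, n_edgs)
            print(f"{n_edgs} EDG(s), {sample}-demand sample: {years:.1f} years")
```

With one EDG the 100 demand sample reaches back a little over 8 years; pooling two EDGs gives the same sample in half the time, which is the "cut by at least half" statement in the text.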

There is no debate that it is the combined reliability of a nuclear unit's EDGs that determines their overall contribution to reducing unit risk.

However, the concern is sometimes expressed that by combining the failure experience of a nuclear unit's EDGs, a highly reliable EDG can mask an EDG with poor reliability. This concern is addressed in the paragraphs that follow.

Before presenting numerical examples of why a highly reliable EDG cannot mask one with poor performance, it is worthwhile to recall certain fundamentals of EDG reliability statistics. These are described in Elements 1 thru 4.

The key fundamental is that an EDG's underlying reliability cannot be directly measured or exactly known. It is only possible to get indications of reliability at moments in time. Because of the effects of the sliding sample methodology and normal statistical variations, these momentary indications of reliability have constantly changing, wide variations from the actual underlying reliability. To deal with these statistical phenomena EPRI has proposed the use of failure triggers. These triggers relate to the number of failures in the past 50 demands and past 100 demands; they have been carefully selected to accurately account for the many statistical complexities that are at work.

The numerical example that is presented below, to show why a highly reliable EDG cannot mask one with poor performance, uses the simple ratio of number of failures to number of demands as the measure of unreliability. This does not account for the complexities that actually exist and must be dealt with. The simple ratio is used so that the example stays visible above the subtleties and so that the complexities of using actual underlying unreliability are avoided.

The example shows that overall EDG reliability calculated using combined values will always be equal to or lower than, and usually lower than, the reliability calculated using individual EDG values. Moreover, the greater the spread between the reliability of the better and poorer units, the more the calculation will understate the reliability of the combined units.

The following example shows why this is true.

1) Consider two EDGs that each have 5 failures in 100 demands. This is 10 failures in 200 demands. The calculated unreliability is:

$0.05 \times 0.05 = 0.0025 = 2.5 \times 10^{-3}$

2) Next consider two EDGs, one of which has 1 failure in 100 demands, and one of which has 9 failures in 100 demands. As in the first example, this too is 10 failures in 200 demands. Their calculated unreliability is:

$0.01 \times 0.09 = 0.0009 = 9 \times 10^{-4}$

3) Next consider how the unreliability would be calculated for the above case if the failure experience of the 2 EDGs is combined. Their combined calculated unreliability would be:

$\left(\dfrac{1\ \text{failure} + 9\ \text{failures}}{200\ \text{demands}}\right)^{2} = (0.05)^{2} = 0.0025 = 2.5 \times 10^{-3}$

The calculated overall unreliability using the individual EDG values is $9 \times 10^{-4}$, which is lower than the calculated unreliability using combined values.
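The pattern in these three cases follows from a one-line identity. As a sketch, write $p_1$ and $p_2$ for the simple failure ratios of the two EDGs and $\bar{p} = (p_1 + p_2)/2$ for the combined ratio; these symbols are introduced here for illustration and assume both EDGs have the same number of demands.

$$
\bar{p}^{\,2} - p_1 p_2
= \frac{p_1^2 + 2 p_1 p_2 + p_2^2}{4} - p_1 p_2
= \left(\frac{p_1 - p_2}{2}\right)^{2} \;\ge\; 0
$$

The combined value can therefore never fall below the product of the individual values, and the gap grows with the square of the spread between the two EDGs. For $p_1 = 0.01$ and $p_2 = 0.09$ the gap is $(0.04)^2 = 0.0016 = 0.0025 - 0.0009$.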

As can be seen, if the EDG failure experience of several EDGs is combined, the determined unreliability will be greater than the unreliability calculated using individual EDG values.

The above comparisons are for independent failures. In this example the 0.0025 calculated combined failure experience unreliability is a factor of 2.8 larger than the unreliability using individual EDG values. The general effect of an allowance for common cause failures is to add a constant to each of the calculated unreliabilities. This will tend to reduce the ratio. In the above example it might drop from 2.8 to 1.8. However, even with an allowance for common cause failures, the unreliability using combined values will always appear larger than the unreliability using individual EDG values.
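The ratios quoted above can be reproduced with a minimal sketch. The function names are illustrative, and the common cause constant of 0.0011 is an assumption back-solved here so that the ratio lands on 1.8; the source does not state the value it had in mind.

```python
def individual_unreliability(failures_a: int, failures_b: int, demands: int) -> float:
    """Product of the two EDGs' simple failure ratios (independent failures)."""
    return (failures_a / demands) * (failures_b / demands)


def combined_unreliability(failures_a: int, failures_b: int, demands: int) -> float:
    """Square of the pooled failure ratio over both EDGs' demands."""
    pooled = (failures_a + failures_b) / (2 * demands)
    return pooled ** 2


def conservatism_ratio(failures_a: int, failures_b: int, demands: int,
                       common_cause: float = 0.0) -> float:
    """Combined / individual unreliability, each with the same additive
    common cause allowance (an assumed constant, not a source value)."""
    denominator = individual_unreliability(failures_a, failures_b, demands) + common_cause
    if denominator == 0:
        raise ValueError("ratio undefined when the individual estimate is zero")
    numerator = combined_unreliability(failures_a, failures_b, demands) + common_cause
    return numerator / denominator


if __name__ == "__main__":
    print(f"independent failures: {conservatism_ratio(1, 9, 100):.1f}")
    # 0.0011 is a hypothetical allowance chosen to illustrate the drop.
    print(f"with common cause:    {conservatism_ratio(1, 9, 100, 0.0011):.1f}")
```

Because the same constant is added to numerator and denominator, the ratio moves toward 1 as the allowance grows but stays at or above 1.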

This approach for assuring EDG reliability focuses on the combined ability of the EDGs to maintain overall plant risk at acceptable levels. For this reason, if EDG reliability must be improved, the improvement actions should be implemented on all EDGs uniformly. This would be the case even if the need for improvements stems from poor experience with only one EDG. With the philosophy to apply any EDG reliability improvement actions to all EDGs, it is appropriate to make decisions based on the combined failure experience of the EDGs.

When combining the failure experience of EDGs, it must be determined how the groupings should be made. The paragraphs that follow offer guidance in this regard.

Fundamentally, a combined EDG group should include only EDGs that all supply the same nuclear unit (or in a few cases, units). In the case of a single nuclear unit, the group would include all of its EDGs. If a site has two nuclear units and each has its own dedicated EDGs, the EDGs for each nuclear unit would be grouped separately. In these typical cases, the EDGs would be combined on a nuclear unit basis.

Plants that share one (or more) EDGs between two nuclear units must be considered on a case by case basis, using the underlying criteria. A prime example of such a situation is plants that have three EDGs for two nuclear units. At these plants, there is one EDG dedicated to each nuclear unit. The third (swing) EDG can provide power to either nuclear unit, if its dedicated EDG fails on demand.

Here, three EDGs have the commonality that they are supplying two nuclear units. Even if there is an EDG whose reliability performance is slightly poorer, each nuclear unit also is covered by an EDG whose reliability performance is commensurately better. And as previously explained, the actual combined reliability will be better than the calculated combined reliability. Thus, for a plant having two nuclear units and three EDGs, the EDGs' failure experience can be combined on a plant-wide basis.

HLW: 3849N58

ENCLOSURE 2, Ref. 11-2-88 NUMARC/EPRI/NRC Meeting

SUBJECT: EDG RELIABILITY

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

John Gaertner
Electric Power Research Institute

for

NRC/NUMARC Meeting on EDG Reliability Program
November 1988

GRADED RESPONSE EDG RELIABILITY PROGRAM

Desirable Features of Regulatory Required Program

- Consistent with many features of NUREG/CR-5078, Vol. 1
- Simple to implement; clear requirements
- Expected to maintain good EDG performance and to improve poor performance
- No punitive or counterproductive requirements
- Graded response dependent on EDG performance

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

[Flowchart: PERIODIC TESTING leads to EVALUATE PERFORMANCE, which has three outcomes:

- ACCEPTABLE PERFORMANCE
- MARGINAL PERFORMANCE: IDENTIFY RELIABILITY IMPROVEMENTS BASED ON PAST FAILURES
- IMPROVEMENT NEEDED: IDENTIFY RELIABILITY IMPROVEMENTS BASED ON PAST FAILURES AND/OR EXTENSIVE EVALUATION]

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

[Flowchart: PERIODIC TESTING leads to COMPARE WITH TRIGGER VALUES, which branches three ways:

- FOR EACH PLANT THAT EXCEEDS NO TRIGGERS: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES
- FOR EACH PLANT THAT EXCEEDS ONE TRIGGER: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES; ADDRESS ALL OF ITS PAST FAILURES; ASSESS CRITICAL REVIEW ELEMENTS IN TERMS OF EDG FAILURE EXPERIENCE
- FOR EACH PLANT THAT EXCEEDS TWO TRIGGERS: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES; ADDRESS ALL OF ITS PAST FAILURES; ADDRESS DOMINANT EDG FAILURE MODES THROUGH STUDY OF POTENTIAL FAILURES; ASSESS CRITICAL REVIEW ELEMENTS IN TERMS OF EDG FAILURE EXPERIENCE AND POTENTIAL FAILURE MODES

The one-trigger and two-trigger branches continue to IDENTIFY PROGRAM CHANGES and ESTIMATE EXPECTED IMPROVEMENT, followed by an OK / NOT OK check.]

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

Applicable Critical Review Elements for Programmatic Evaluation (NUREG/CR-5078)

- EDG Surveillance Needs
- EDG Performance Monitoring
- EDG Maintenance Program
- EDG Failure Analysis and Root Cause Investigation
- Problem Close-out
- Data System

EXAMPLE FOR THE IDENTIFICATION OF IMPLICATED CRITICAL REVIEW ELEMENTS (1)

Implicated Critical Review Elements (table columns): Surveillance Needs; Performance Monitoring; Maintenance Program; Failure/Root Cause; Problem Close-out; Data Systems

Failure/Failure Mode (the column in which each "X" falls is not legible in this copy):

1) Poor PM procedure led to failure of lube oil pump. X
2) Oxidation on timing relay. X
3) Failed bearing in governor. X
4) Fuel oil pump shaft decoupled. X
5) Cause of failure unknown. X X X
6) Fuel oil pump shaft decoupled.

(1) For each failure or failure mode, any Critical Review Element implicated in the response to the questions will be marked with an "X".

Critical Review Elements implicated in more than one failure or failure mode will be evaluated on a programmatic basis.

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

Address Dominant EDG Failure Modes Through Analysis of Potential Failures

Dominant failure modes include those which fail an important function

- and have occurred at the plant
- have occurred at other plants and could occur at this plant with reasonable likelihood
- have not occurred but which could fail an important function at this plant with reasonable likelihood

Potential failure mode analyzed through systematic engineering evaluation such as:

- FMEA with logic tree analysis
- Detailed fault tree or other system model
- Systematic expert review sessions

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

Reporting Requirements

All Plants:

- Unit average EDG performance compared to triggers
- Report of failure causes for new failures
- Analysis of recurring, common cause and high consequence failures in last twenty demands

Plants Exceeding One or Both Triggers:

- Analysis of past (and potential) failure modes, results of critical review element assessment, and list of potential improvements
- Description of selected improvements
- Schedule for implementation of improvements
- Estimated effectiveness of improvements

INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

Post Exceedence Actions

- Exceedence status continues for 2 years or until trigger values are met for last 50 and 100 demands, whichever comes first
- Planned improvements must be implemented regardless of EDG performance
- New failures must be evaluated against improvement actions to ensure past conclusions continue to be valid

REV. 1 10/21/88

ENCLOSURE 3, Ref. 11-2-88 NUMARC/EPRI/NRC Meeting

SUBJECT: EDG RELIABILITY

ELEMENTS 6 AND 7

GRADED ACTIONS FOR RESPONDING TO EDG FAILURE TRIGGER EXCEEDENCES

BACKGROUND

Overall, the reliability of emergency diesel generators at U.S. nuclear plants is excellent. The national average is about 98%. However, from time to time, an EDG at a nuclear plant will give an indication of a decrease in reliability. These indications usually come from a momentary measure of reliability from a single sample of past demands. Frequently these indications are the result of normal statistical variations and do not reflect an actual underlying change in reliability. These downside variations typically are but a few percent, but a variation can take an EDG below its target reliability. These slippages can be the result of real problems, can be caused by purely statistical variations, or can be the result of some combination.

In a given year, about 10% of the EDGs have had a momentary reliability indication that was below 95%. About 99% of these 10% have been temporary and the reliability of these EDGs has been acceptable the next year. This is an indication that most momentary drops in reliability are quickly reversed and may be the result of normal statistical variations. In net, only 2 or 3 EDGs at U.S. nuclear plants have experienced unacceptable reliability for 2 years in a row.

In mid-1988, the NRC issued a new set of regulations (10 CFR 50.63 and Regulatory Guide 1.155) to further reduce the probability of a nuclear power plant experiencing a loss of all AC power. One provision of R.G. 1.155 calls for a further formalization of the way that EDG reliability is monitored and kept at acceptable levels. This is being pursued under Generic Issue B-56, Diesel Generator Reliability. The intent is to assure that any EDGs that do exceed unreliability targets are recognized; and further, that there is a consensus on appropriate actions for returning these EDGs to acceptable levels.

R.G. 1.155 sets forth criteria that nuclear plants can use to select target EDG unit reliabilities. A plant can select a 95% EDG reliability target, or if it meets certain criteria, it can choose a 97.5% target. However, the plant must maintain the chosen target reliability.

This document describes graded actions for responding to EDG failure trigger exceedences. The failure trigger methodology is described in the documentation for EPRI Elements 1 through 5.

In these documents it is demonstrated that the use of a sliding sample, small sample sizes and long reach-back times results in momentary large statistical variations of EDG reliability. In addition, it is shown that even EDGs with excellent underlying reliability can expect to have exceedences purely from the cumulative probability of statistical variations. A further important factor is that plant risk is not significantly impacted by small variations in EDG reliability.
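The point about chance exceedences can be illustrated with a binomial tail calculation. This is a sketch under stated assumptions: demands are treated as independent with a constant failure probability, and the threshold of 5 failures in 50 demands is a hypothetical placeholder, since the EPRI trigger values are not given in this enclosure.

```python
from math import comb


def prob_at_least(k: int, n: int, p_fail: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p_fail): the chance of seeing k or
    more failures in n independent demands."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability")
    return sum(comb(n, i) * p_fail ** i * (1.0 - p_fail) ** (n - i)
               for i in range(max(k, 0), n + 1))


if __name__ == "__main__":
    # Hypothetical threshold: 5 or more failures in the last 50 demands.
    for true_reliability in (0.95, 0.975):
        chance = prob_at_least(5, 50, 1.0 - true_reliability)
        print(f"underlying reliability {true_reliability}: "
              f"P(>=5 failures in 50 demands) = {chance:.4f}")
```

A single look at one sample understates the effect described in the text, because a sliding sample is re-examined after every demand and each look is another opportunity for a chance exceedence.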

These realities argue that there is little basis for a knife edge division between acceptable and unacceptable EDG reliability. One cannot say, for example, that a snap-shot indication of 95% reliability is acceptable but a momentary indication of 94.9% requires drastic actions. As with most feedback control systems, some kind of proportional response to variations is the best answer.

A graded response avoids the need to take immediate drastic actions based on determinations of momentary reliability that are known to be highly uncertain. While one cannot prevent statistical variations or stop cumulative probability from triggering an incorrect exceedence, the graded response approach calls for initial response actions that should be highly effective, yet should not prove unduly burdensome. Equally important, the initial indication of a problem is a signal. The signal permits modest actions and provides an interval in which to understand the situation and to determine if the problem is real or a normal statistical variation. It also provides a chance for plant personnel to take steps to correct any problem and to make unnecessary the more major corrective actions associated with exceeding both triggers.

The graded response approach requires a varied level of response by the utility, depending on the observed failure rate. With this approach, exceedence of only the 50 demand failure trigger, or only the 100 demand failure trigger, signifies marginal EDG reliability performance and justifies certain additional utility actions. Exceedence of both the 50 demand failure trigger and 100 demand failure trigger justifies more extensive actions. The exceedence of either the 50 or 100 demand trigger will provide an initial indication that a problem may exist. This will give the plant a chance to come to grips with the situation and to take necessary actions to prevent exceeding both the 50 and 100 demand triggers. A flowchart describing the proposed program elements is shown in Figure 1.
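The three-way grading described above can be sketched as a small classifier. The trigger values passed in are hypothetical placeholders, since this enclosure does not state them, and the class and function names are illustrative. A count equal to the trigger is treated as an exceedence, on the reading that only plants with fewer failures than the trigger values fall in the routine category.

```python
from enum import Enum


class ResponseLevel(Enum):
    ROUTINE = "no trigger exceeded: minimum reliability assurance obligations"
    SINGLE_TRIGGER = "one trigger exceeded: marginal performance, additional actions"
    BOTH_TRIGGERS = "both triggers exceeded: more extensive actions"


def graded_response(failures_last_50: int, failures_last_100: int,
                    trigger_50: int, trigger_100: int) -> ResponseLevel:
    """Classify a unit by how many failure triggers it exceeds.

    Assumption: a failure count at or above the trigger value counts as
    an exceedence.
    """
    exceeded = int(failures_last_50 >= trigger_50) + int(failures_last_100 >= trigger_100)
    if exceeded == 0:
        return ResponseLevel.ROUTINE
    if exceeded == 1:
        return ResponseLevel.SINGLE_TRIGGER
    return ResponseLevel.BOTH_TRIGGERS


if __name__ == "__main__":
    # Trigger values 5 and 8 are placeholders for illustration only.
    for f50, f100 in [(2, 4), (5, 6), (3, 8), (6, 9)]:
        level = graded_response(f50, f100, trigger_50=5, trigger_100=8)
        print(f"{f50} in last 50, {f100} in last 100 -> {level.name}")
```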

Figure 1 INDUSTRY PROCESS FOR MAINTAINING EDG RELIABILITY

[Flowchart: PERIODIC TESTING leads to COMPARE WITH TRIGGER VALUES, which branches three ways:

- FOR EACH PLANT THAT EXCEEDS NO TRIGGERS: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES
- FOR EACH PLANT THAT EXCEEDS ONE TRIGGER: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES; ADDRESS ALL OF ITS PAST FAILURES; ASSESS CRITICAL REVIEW ELEMENTS IN TERMS OF EDG FAILURE EXPERIENCE
- FOR EACH PLANT THAT EXCEEDS TWO TRIGGERS: DETERMINE CAUSE OF NEW FAILURES; ADDRESS ITS RECENT RECURRING, COMMON CAUSE AND SEVERE CONSEQUENCE FAILURES; ADDRESS ALL OF ITS PAST FAILURES; ADDRESS DOMINANT EDG FAILURE MODES THROUGH STUDY OF POTENTIAL FAILURES; ASSESS CRITICAL REVIEW ELEMENTS IN TERMS OF EDG FAILURE EXPERIENCE AND POTENTIAL FAILURE MODES

The one-trigger and two-trigger branches continue to IDENTIFY PROGRAM CHANGES and ESTIMATE EXPECTED IMPROVEMENT, followed by an OK / NOT OK check.]

ACTIONS FOR PLANTS THAT DO NOT EXCEED EITHER TRIGGER

Plants whose observed number of failures in the past 50 and the past 100 demands are less than the trigger values established for their plant reliability goal (i.e. 0.95 or 0.975) would only have to meet certain minimum reliability assurance obligations. These minimum obligations would be aimed at ensuring a continued acceptable level of reliability.

Operating U.S. nuclear plants have numerous activities which, when considered together, are intended to achieve the same objectives as a formal EDG reliability program; i.e., maintain an acceptably high reliability of the EDGs in performing their function. This fact has been demonstrated by the NUMARC sponsored "Review of Current Practices Concerning Diesel Generator Reliability" (draft 8/88). The fact that a plant consistently has no target exceedence is evidence that these activities are being effective. Plants would be expected to continue activities which have been successful in the past.

The minimum reliability assurance obligations that would be required include 1) determining the cause of each new failure, and 2) reviewing the recent operating history relative to 3 classes of failures. The cognizant utility personnel would be responsible for ensuring that these basic reviews are kept current.

The following paragraphs elaborate on the routine obligations that are referred to above. All plants would have these obligations.

Determine Cause of New Failures

Determining the cause of failures is an essential step in assuring acceptable reliability of EDGs. Accurate failure cause data can provide insight into the most effective steps for maintaining acceptable reliability. Plants currently issue Licensee Event Reports (LER) for EDG failures. These LERs typically include a description of the cause of the failure and the corrective actions that are taken.

A quality root cause analysis capability is generally agreed to be an effective part of the failure analysis process. This program does not specify the details of such a capability, but generally requires that:

1) the cause of failures be investigated in sufficient detail to classify all catastrophic, incipient, and degraded failures with appropriate cause codes for tracking [illegible], and
2) the cause of all functional failures be determined to the highest level at which they can be addressed by an applicable and effective maintenance task, testing task, procedure change, operations change, or design modification.

If no such task is appropriate, the analysis would be done to the depth required to make such a determination.

Address Recurring, Common Cause, and Severe Consequence Failures

The plant staff should review the recent operating history (past 20 EDG demands) to determine if any indications exist that suggest the presence of one of three classes of failures. The three classes are:

Failures with common cause potential (e.g., with the potential to cause failure of more than one diesel at a time).

Failures with a recurring root cause within a short interval.

Failures with severe (or potentially severe) consequences (e.g., substantial equipment damage and long repair times).

In carrying out the above review, plant personnel would pay special attention to closely spaced failures. These could be an indication of degradation or, if on different EDGs, of a common cause. If two or more failures occurred within this 20 demand interval, they would be evaluated to determine if they were due to the same failure cause. If they were, the cause of the failures and appropriate corrective action would have to be determined. This step would assure that a recent declining trend in reliability performance would be addressed at an early stage. The results of these reviews would be reported on a quarterly basis.
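The 20 demand review described above could be sketched as follows. This is only an illustration: the `Demand` record, its field names, and the use of a plant-assigned cause code as the test for "same failure cause" are assumptions, not part of the proposed program.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Demand:
    edg_id: str        # which diesel was demanded (hypothetical label)
    failed: bool       # True if the demand ended in a failure
    cause: str = ""    # plant-assigned cause code (hypothetical)


def review_recent_failures(history, window=20):
    """Group failures in the last `window` demands by cause code.

    `history` is ordered oldest first. Returns {cause: sorted EDG ids}
    for every cause shared by two or more failures in the window.
    """
    recent = history[-window:]
    by_cause = defaultdict(list)
    for demand in recent:
        if demand.failed:
            by_cause[demand.cause].append(demand.edg_id)
    return {c: sorted(ids) for c, ids in by_cause.items() if len(ids) >= 2}


def spans_multiple_edgs(edg_ids):
    """A repeated cause seen on different EDGs hints at a common cause."""
    return len(set(edg_ids)) > 1
```

A repeated cause on a single EDG would point toward degradation or a recurring root cause, while the same cause on different EDGs would point toward common cause potential.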


ACTIONS FOR PLANTS THAT EXCEED A SINGLE TRIGGER

Plants that exceed the past 50 demand failure trigger or the past 100 demand failure trigger, but not both, would be required to take actions that are in addition to the routine reliability assurance obligations that are described in the previous section. The additional actions would focus on uncovering and correcting the cause of the decrease in reliability, based on the actual failures that had occurred at the plant.

The first additional action would be to review all failures over an appropriate history. While all plants must review recent failures that fall in any of the three classes delineated in the previous section, a plant that has exceeded one failure trigger would address all past failures. With the failure review in hand, the next step would be to prepare a prioritized list of possible corrective actions. To make certain that any programmatic deficiencies also are identified and dealt with, six critical review elements (i.e., surveillance, performance monitoring, maintenance, failure root cause analysis, problem close-out methods, and EDG data systems) also would be examined relative to the failures that had occurred. The goal here would be to identify programmatic problems and possible solutions. These would be added to the list of possible corrective actions. With these resources, a group of plant experts would select the most effective and practical corrective actions to implement. Finally, an estimate would be made of the expected reliability improvement, based on the specific actions identified. This would involve either 1) identifying how the actions taken would have precluded sufficient previous failures to have avoided exceedance of the target or 2) performing an alternate predictive reliability analysis. Each of these actions is discussed below.
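The distinction between no exceedance, a single trigger exceedance, and exceedance of both triggers could be sketched as below. The trigger values are not stated in this section, so they are passed in as parameters; treating "exceed" as a strictly greater failure count is an assumption of this sketch.

```python
def count_failures(outcomes, window):
    """Count failures in the most recent `window` demands.

    `outcomes` is a list of booleans, oldest first; True marks a failure.
    """
    return sum(outcomes[-window:])


def trigger_status(outcomes, trigger_50, trigger_100):
    """Return 'none', 'single', or 'both'.

    trigger_50 and trigger_100 are failure counts supplied by the caller.
    The strictly-greater comparison is an assumption, not a stated rule.
    """
    over_50 = count_failures(outcomes, 50) > trigger_50
    over_100 = count_failures(outcomes, 100) > trigger_100
    if over_50 and over_100:
        return "both"
    if over_50 or over_100:
        return "single"
    return "none"
```

A result of "single" would call for the actions in this section, and "both" for the broader actions described for plants that exceed both triggers.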

Address All Past Failures

The review of observed EDG failures experienced at the plant would be undertaken to identify specific improvements in EDG testing, maintenance, and operational practices that would restore an EDG's reliability to an acceptable level. The scope of the investigation would encompass all applicable observed EDG failures. This investigation would strive to understand the failure modes and the underlying reasons for the failures. For this review all failure modes actually experienced are considered to be dominant modes. With this information it would be possible to specify actions that could be taken to preclude or minimize the recurrence of many of the observed failures. The product of this task would be a list of potential applicable and effective reliability program changes that could be implemented. At this point, the list should not be constrained by practical considerations, such as cost, but rather should be a composite list of improvements.

Plants that have exceeded one trigger also would implement an EDG corrective maintenance (CM) tracking program. This program would provide cognizant plant personnel with additional information that would be useful in identifying precursors to further reliability degradation. As part of this tracking program, each CM related to an EDG system component failure would be evaluated and categorized in four important areas: severity of failure, functions affected, EDG subsystem involved, and failure cause classification. The severity of each CM would be classified in accordance with the IEEE-500 severity levels: catastrophic, incipient and degraded. The functions affected would identify the system function lost due to the component failure (e.g., failure to start, failure to load-run, etc.).

A sample format for tracking EDG CMs is provided in Figure 2. After implementing the CM tracking program, plant personnel would have available regular summaries of the CM data to assist in monitoring and evaluating EDG performance.
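A CM tracking record along the lines of Figure 2 could be represented as in the sketch below. The class and field names are hypothetical, and the summary shown (simple tallies by subsystem and severity) is only one possible form of the regular summaries mentioned above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # The three severity levels named in the text
    CATASTROPHIC = "catastrophic"
    INCIPIENT = "incipient"
    DEGRADED = "degraded"


@dataclass(frozen=True)
class CMRecord:
    cm_number: str             # work request or work authorization identifier
    component: str             # equipment piece number(s)
    subsystem: str             # e.g. "Fuel", "Starting Air", "Cooling"
    severity: Severity
    functions_affected: tuple  # e.g. ("failure to start",)
    description: str
    cause_code: str = ""       # hypothetical standard cause code


def summarize(records):
    """Tally CMs by subsystem and by severity for a periodic summary."""
    return {
        "by_subsystem": Counter(r.subsystem for r in records),
        "by_severity": Counter(r.severity for r in records),
    }
```

Keeping the cause code as a separate field would make it straightforward to evaluate CM histories on the basis of cause.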

Figure 2

Potential Categories for Use in Tracking Corrective Maintenance

Column headings:

(1) CM No.   (2) Component Involved   (3) Subsystem   (4) Catastrophic/Incipient/Degraded   (5) Function(s) Affected   (6) Description of Failure

Heading Definitions

(1) CM No. - A unique identifier for the work request or work authorization which was initiated in response to the failure.

(2) Component Involved - The unique equipment piece number(s) for the component(s) involved in the failure.

(3) Subsystem - The EDG subsystem affected by this failure (i.e. Fuel, Starting Air, Engine, Generator, Cooling, Exhaust, Lubrication, or I&C).

(4) Catastrophic/Incipient/Degraded - Classification of the failure according to the IEEE-500 severity index.

(5) Function(s) Affected - Identification of the function(s) of the EDG impacted by the failure (e.g., starting, loading, continued operations, shutdown, etc.).

(6) Description of Failure - A brief description of the failure addressed by the CM, including cause. The identification of failure cause could utilize a standard cause code index to facilitate evaluation of CM histories on the basis of cause.

Assess Failure History Against Critical Review Elements

Once the specific failures have been reviewed, and potential program improvements have been identified, an evaluation would be carried out to determine if there are programmatic deficiencies. This would be done by determining whether the observed reliability problems are being caused by deficiencies in any of six critical review element (CRE) areas. This would be achieved by assessing each of the observed failures with a few key questions aimed at identifying potential problems within the scope of each critical review element. Examples of these questions are given in Attachment A, Figures A-1 through A-6, for each of the six critical review elements.

To facilitate the review of critical review elements, a matrix would be developed to correlate each failure to the CREs identified in the review. A sample matrix is provided in Figure 3.

Upon completion of the matrix, those CREs which were identified by two or more of the failures would be selected for further evaluation. This evaluation could lead to the need for additional, broader changes to the reliability program. For each CRE which is identified for further evaluation, some disposition would have to be developed to describe why actions were (or were not) taken to improve the reliability program in that area.
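The selection rule for the matrix (CREs identified by two or more failures) could be sketched as follows. The CRE labels are taken from the box names in Attachment A; the dictionary layout and function name are assumptions made for illustration.

```python
CRES = (
    "Surveillance Needs",
    "Performance Monitoring",
    "Maintenance Program",
    "Failure/Root Cause",
    "Problem Close-out",
    "Data System",
)


def cres_for_further_evaluation(matrix, minimum=2):
    """Select CREs marked by at least `minimum` failures.

    `matrix` maps a failure identifier to the set of CREs marked for it.
    The result keeps the order of CRES.
    """
    counts = {cre: 0 for cre in CRES}
    for marked in matrix.values():
        for cre in marked:
            counts[cre] += 1  # a KeyError here flags an unknown CRE name
    return [cre for cre in CRES if counts[cre] >= minimum]
```

For example, if three failures are reviewed and two of them mark "Maintenance Program", only that CRE would be carried forward for further evaluation and a written disposition.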

Identify Program Changes

Through the study of actual present and past failures, and the examination of the critical review elements in light of these failures, a comprehensive list of potential improvements would be available. This list would be evaluated by a plant team composed of knowledgeable personnel from engineering, operations and maintenance. Their mission would be to select those improvements which would most effectively address the observed failures. To facilitate this process, the complete list of potential improvements would be prioritized on the basis of their expected effectiveness. Cost, operational impact and other practical considerations should be factors in this prioritization. The product of this review would be a shorter list of improvements that would be implemented by the plant personnel.

Expected Reliability Improvement

Once specific actions have been identified, the estimated reliability improvement of the EDGs could be assessed. This evaluation would focus on the number of previous failures which the improvement program would have eliminated (had it been previously implemented). If this evaluation indicated that the actions being taken would have precluded enough failures to avoid the original exceedance, no additional actions would be required. If insufficient past failures have been addressed by the actions (to have avoided target exceedance), additional actions could be proposed or a more detailed alternate EDG reliability improvement estimate could be made. This alternate determination would take into account the probability that each improvement would eliminate specific failures that had reduced EDG reliability below target values.

[Figure 3. Sample matrix correlating EDG failures to critical review elements (illegible in source).]

The reliability improvement estimate is based on the information available at the time, combined with good engineering judgment. It is recognized that a reasonable period will be necessary to implement improvements. It also is recognized that conditions affecting EDG reliability have uncertainty, so the estimate must not be used as a prediction.
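One way the alternate determination could weigh the probability that each improvement eliminates a specific failure is sketched below. The text does not give a formula, so the combining rule here (improvements acting independently on a given failure) and the function name are assumptions, offered only as an illustration.

```python
from math import prod


def expected_remaining_failures(elimination_probs):
    """Estimate how many past failures would still have occurred.

    `elimination_probs` holds one list per past failure. Each list gives
    the estimated probability that each selected improvement would have
    eliminated that failure. An empty list means no improvement addresses
    the failure, so it counts fully.

    Assumes improvements act independently on a given failure.
    """
    remaining = 0.0
    for probs in elimination_probs:
        # Chance the failure still occurs despite every improvement
        remaining += prod(1.0 - p for p in probs)
    return remaining
```

The result could then be compared with the applicable trigger value, keeping in mind the caution above that such an estimate is a judgment aid and not a prediction.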

ACTIONS FOR PLANTS THAT EXCEED BOTH TRIGGERS

Plants whose EDGs exceed both the 50 demand and the 100 demand failure triggers would take additional actions beyond those required of plants exceeding a single trigger value. The same basic actions discussed earlier also would be required of these plants. However, the scope of failures to be evaluated would be increased beyond those that actually had occurred by also including potential dominant failure modes.

Address Potential EDG Failure Modes From Study of Potential Failures

In addition to the evaluation of actual observed failures, a systematic identification of potential dominant failure modes would be carried out to ensure completeness.

Dominant failure modes are defined to be those which can fail an important EDG function and which 1) are expected to occur frequently or 2) are unlikely but have such significant consequences that they should be prevented from occurring even once. Such a determination involves the engineering judgement of an experienced and knowledgeable engineer. Such judgements are routinely made in reliability analyses.

Since this would be a required regulatory program, the following guidance is provided to assist utility engineers in their identification of dominant failure modes. This is to assure that the analysis is adequate but not unnecessarily detailed. Dominant failure modes are those that are relatively likely to occur and result in an actual failure of the machine to fulfill a clearly stated intended function. Thus, those failures which would not preclude adequate performance can be omitted. Similarly, those failures that would not contribute to the unreliability on the basis of their likelihood may be omitted from consideration. This requires judgement from the utility reliability analysts. Guidance to assist utility reliability analysts in making this judgement could include a threshold which is well below the expected unreliability. For example, a threshold of 10^-4 per demand would screen out those that contribute less than 1% of the unreliability of a 0.99 reliable diesel. All failure modes which actually occurred at the plant are dominant failure modes. A comprehensive evaluation of the dominant failure modes should include an assessment of the likelihood and consequence, to determine overall importance. This can be done in a number of ways:

a failure modes and effects analysis (FMEA) in conjunction with a logic tree analysis

detailed fault tree or GO analysis

other systematic methodologies such as expert review sessions

Key inputs to these analyses would include EDG related plant maintenance histories, test data, LERs, NPRDS data for similar EDGs, and other industry data on failure modes and frequencies. The product of these analyses, in conjunction with the evaluation of actual failures, would be a list of the EDG dominant failure modes.
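The screening guidance above could be sketched as follows. A 0.99 reliable diesel has an unreliability of 0.01 per demand, and 1% of that is 10^-4 per demand. The `FailureMode` record, its field names, and the decision to always retain severe-consequence modes alongside observed ones are assumptions made to match the definition of dominant failure modes given earlier.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    rate_per_demand: float   # estimated failures per demand
    observed: bool = False   # actually occurred at the plant
    severe: bool = False     # unlikely but significant consequences


def screening_threshold(target_reliability=0.99, fraction=0.01):
    """Threshold as a fraction of the target unreliability."""
    return fraction * (1.0 - target_reliability)


def dominant_modes(modes, threshold):
    """Keep observed or severe modes, plus any at or above the threshold."""
    return [
        m.name
        for m in modes
        if m.observed or m.severe or m.rate_per_demand >= threshold
    ]
```

The threshold only removes unobserved, low-consequence modes; the engineering judgement called for in the text would still decide which modes are flagged as severe.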

Normally, many of the dominant failure modes already are addressed through existing plant programs or can be addressed through minor changes to existing programs. For those failure modes which are not, a decision must be made whether to implement maintenance program changes, modify the design to preclude the failure, or accept the failure because no cost-effective change can be identified.

After the dominant failure modes and potential improvements are identified, the critical review elements would be assessed in terms of these failure modes to see if any programmatic weaknesses exist. The questions in Figures A-1 through A-6 apply to both observed failures and potential failure modes. Thus, the format for documenting the results of this evaluation would be the same as for single exceedances, as shown in Figure 3.

When the complete list of potential improvements is identified from the dominant failure mode analysis and critical review element assessment, the list would be evaluated by a plant team. The team would select those changes that would be most effective in ensuring EDG reliability. Following the selection of the improvements to be implemented, an estimate of expected reliability improvement would be made. The method would be similar to that described for single exceedances. The objective would be to identify and implement enough changes that address actual failures to give promise that a target exceedance would be unlikely to recur.

If the number of failures addressed is greater than the difference between the number of failures experienced and the target values, these actions (if implemented originally) should have prevented target exceedance.

In the event this estimate of reliability does not demonstrate compliance, a quantitative evaluation could be made that is based on both observed and potential failure modes. This would indicate projected failure mode frequencies after the modifications are in place.
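The counting rule stated above (failures addressed greater than the difference between failures experienced and the target value) could be written as the short check below. The function and parameter names are hypothetical, and the target value is supplied by the caller because it is not given in this section.

```python
def actions_would_have_prevented_exceedance(experienced, addressed, target):
    """Apply the counting rule: addressed > experienced - target.

    experienced: failures that occurred in the demand window
    addressed:   failures the selected actions would have precluded
    target:      target (trigger) failure count, supplied by the caller
    """
    if addressed > experienced:
        raise ValueError("addressed failures cannot exceed experienced failures")
    return addressed > experienced - target
```

With six failures experienced and a target of five, addressing two failures would satisfy the rule, while addressing only one would not.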

REPORTING REQUIREMENTS

It would appear appropriate to report the following information to the NRC and INPO on a quarterly basis.

All Plants:

The unit average EDG failure performance as compared to the appropriate 50 demand and 100 demand failure triggers.

A description of the failure cause identified for each failure.

The results of the important failures review.

For Plants Exceeding One or Both Triggers:

A description of EDG reliability program improvements in response to exceeding a trigger.

The results of the critical review element assessment.

The schedule for implementing improvements.

An estimate of the effectiveness of actions being taken in terms of the numbers of failures addressed (or the results of an alternate reliability evaluation).

POST EXCEEDANCE ACTIONS

Plants which exceed one or both failure trigger values would continue to monitor their actual performance versus the trigger values. They would not revert to a no-exceedance status until an exceedance no longer exists in the last 50 or the last 100 demands, or 2 years from the last failure while in an exceedance, whichever occurs first. However, before a plant could revert to a no-exceedance status, all planned improvement actions would be required to have been completed.

Should a plant continue in an exceedance because of new failures, these failures would have to be evaluated against the improvement actions identified for implementation. The purpose of this evaluation would be to assess whether the failure should have been addressed by the identified improvements and if any of the conclusions of the previous evaluations should be modified based on the occurrence of the new failure.

3850-6REV
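The reversion conditions described above could be sketched as a single check. Reading "no longer exists in the last 50 or the last 100 demands" as meaning neither window shows an exceedance is an interpretation made for this sketch, as are the function and parameter names.

```python
def may_revert_to_no_exceedance(exceeds_50, exceeds_100,
                                years_since_last_failure,
                                actions_complete):
    """Return True if the plant could revert to a no-exceedance status.

    Assumed reading: the exceedance is cleared when neither the 50 demand
    nor the 100 demand window is over its trigger, or when 2 years have
    passed since the last failure, whichever occurs first. In either case
    all planned improvement actions must be complete.
    """
    cleared = not exceeds_50 and not exceeds_100
    timed_out = years_since_last_failure >= 2.0
    return actions_complete and (cleared or timed_out)
```

Under this reading, a plant with outstanding improvement actions would remain in an exceedance status even after its failure counts fall back below both triggers.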

Attachment A

Figure A-1

ASSESSMENT OF EDG SURVEILLANCE NEEDS

1) Does the failure relate to equipment previously excluded from consideration as part of the EDG system?

2) Could the failure be prevented by a change in surveillance practices (i.e. scheduling or content)?

3) If the failure is one which proceeds from a degraded state to a failed state, could it be identified by a surveillance before the failure occurs?

4) If the failure is related to aging, could surveillance detect the aged condition?

5) If the failure has common cause failure potential, could the common cause potential be identified through surveillance?

6) Does the severity of this failure warrant a change in surveillance practices?

7) Would the presence of a surveillance plan preclude this failure?

For observed failures only:

1) Should the existing surveillance program have identified this failure before it caused EDG failure?

If the answer to any of these questions is yes, then mark the "Surveillance Needs" box for this failure.

Figure A-2

ASSESSMENT OF EDG PERFORMANCE MONITORING

1) Are there parameters that could be monitored which would preclude this failure?

2) Could the existence of alert levels or corrective action levels preclude this failure?

3) If the condition monitoring procedures were improved, would this failure be precluded?

4) Is the condition monitoring frequency inadequate to detect this failure?

5) Could surveillance practices be changed to improve condition monitoring for this failure?

For observed failures only:

1) Is there a condition monitoring task in place which should have detected this failure before it caused an EDG failure?

2) Should existing alert levels or corrective action levels have precluded this failure?

If the answer to any of these questions is yes, then mark the "Performance Monitoring" box for this failure.

Figure A-3

ASSESSMENT OF EDG MAINTENANCE PROGRAM

1) If maintenance response to this failure is not based on the severity of the failure, then should it be?

2) If maintenance response to this failure is not based on the expected time to repair, then should it be?

3) Could a PM preclude this failure?

For observed failures only:

1) Did maintenance response to this repair adversely affect EDG reliability?

2) If maintenance was not involved in the failure and root cause analysis of this failure, should they have been?

3) Did spare parts play a role in this failure?

If the answer to any of these questions is yes, then mark the "Maintenance Program" box for this failure.

Figure A-4

ASSESSMENT OF FAILURE ANALYSIS & ROOT CAUSE INVESTIGATIONS

1) Is the necessary information routinely gathered to assess the root cause of this failure?

For observed failures only:

1) Would an improved failure cause and root cause investigation have precluded this failure?

2) Would improved accessibility and retrievability of failure cause and root cause data have prevented this failure?

If the answer to any of these questions is yes, then mark the "Failure/Root Cause" box for this failure.

Figure A-5

ASSESSMENT OF EDG PROBLEM CLOSE-OUT

For observed failures only:

1) Did inadequate problem close-out contribute to this failure?

2) Could improved problem close-out procedures have precluded this failure?

3) Would improved problem close-out criteria have precluded this failure?

4) Would special monitoring to enhance problem close-out have precluded this failure?

If the answer to any of these questions is yes, then mark the "Problem Close-out" box for this failure.

Figure A-6

ASSESSMENT OF EDG DATA SYSTEMS

1) Would an improved data system serve to preclude this failure?

For observed failures only:

1) Did an inadequate data system cause this failure (i.e. due to the inaccessibility of either plant data or generic data)?

If the answer to any of these questions is yes, then mark the "Data System" box for this failure.
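Each of Figures A-1 through A-6 closes with the same rule: a single "yes" answer marks that element's box for the failure. A minimal sketch of that rule is given below; the dictionary layout and function name are assumptions made for illustration.

```python
def boxes_to_mark(answers):
    """Return the CRE boxes to mark for one failure.

    `answers` maps a box name (e.g. "Surveillance Needs") to the list of
    yes/no answers for that figure's questions, with True meaning yes.
    A box is marked if any answer for it is yes.
    """
    return sorted(box for box, replies in answers.items() if any(replies))
```

The marked boxes for each failure would form one row of the matrix used to correlate failures with critical review elements.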