ML19332E874
| ML19332E874 | |
| Person / Time | |
|---|---|
| Issue date: | 09/17/1989 |
| From: | Peranich M Office of Nuclear Reactor Regulation |
| To: | |
| References | |
| NUDOCS 8912130096 | |

SIXTEENTH ANNUAL NATIONAL ENERGY DIVISION CONFERENCE
Date & Place:
September 17 - 20, 1989
Ft. Lauderdale, Florida
SESSION P - ROOT CAUSE ANALYSIS
THE OPERATING REACTOR INSPECTION PROGRAM AND GUIDANCE FOR INSPECTION OF ROOT CAUSE
Mark W. Peranich, Chief
Inspection and Licensing Program Development Section
Inspection and Licensing Program Branch
Program Management, Policy Development and Analysis Staff
Office of Nuclear Reactor Regulation
United States Nuclear Regulatory Commission
Washington, D.C. 20555
ABSTRACT
This paper briefly reviews the Nuclear Regulatory Commission's (NRC's) operating reactor inspection program to provide a basis for the subject of this session: root cause analysis.
This paper focuses on NRC inspection policy, requirements, and guidance for pursuing the root cause of identified violations of NRC regulations or deviations from licensee commitments identified in the Final Safety Analysis Report (FSAR).
The NRC approach for reviewing licensee root cause and corrective action determinations, and for examining the effectiveness of corrective actions implemented to prevent the recurrence of identified problems, is discussed.
Training given to NRC inspectors on how to review root cause analysis and some current inspection experience with specific industry problems and associated root causes are also discussed.
1. BACKGROUND
The NRC inspection program is oriented toward performing audits and does not necessarily examine every activity or item, but verifies, through carefully selected samples, that activities are being properly conducted to operate nuclear facilities safely.
The NRC formally evaluates licensee performance by implementing the program described in NRC Manual Chapter 0516, "Systematic Assessment of Licensee Performance" (SALP).
Consistent with the SALP program, the NRC inspection program emphasizes those areas of the licensee's activities that are most important to reactor safety and recognizes licensee performance in these areas as the basis for managing inspection resources.
The SALP functional areas that are inspected under the operating reactor inspection program are:
- Plant Operations
- Radiological Areas
- Maintenance / Surveillance
- Emergency Preparedness
- Security
- Engineering / Technical Support
- Safety Assessment / Quality Verification
- Other Areas as Needed
2. THE OPERATING REACTOR INSPECTION PROGRAM
The operating reactor inspection program consists of a fundamental inspection program and a number of additional programs.
By direct observation and verification of licensee activities, the operating reactor inspection program obtains sufficient information on licensee performance to ascertain whether the facility is being operated safely, whether the management control program is effective, and whether regulatory requirements are being satisfied.
In addition, information is gathered to support SALP evaluations.
The inspection program takes a balanced look at a cross-section of licensee activities important to plant safety and reliability, and also looks at specific licensee activities that may need additional attention.
2.1 Fundamental Inspection Program
The fundamental program, consisting of the core inspection program and the mandatory team inspection program, is performed at all reactor sites.
Very few plants are limited to only the conduct of the fundamental program; most require additional inspection effort consistent with their performance ratings in the various SALP areas.
Core Inspection Program
As part of the fundamental inspection program, a specific group of inspection procedures was designated for incorporation into the core inspection program.
The core inspection program is designed to ensure a balanced look at a cross-section of plant activities considered important to maintaining safety, to confirm adequate licensee performance, and to identify potential operational problems in the early stages.
The core inspection must be completed at every plant at a prescribed interval and is performed by resident and regionally based specialist inspectors.
Mandatory Team Inspection Program
The second part of the fundamental inspection program is the mandatory team inspection program.
This mandatory team inspection focuses on an area of plant operations that has been selected for inspection emphasis.
The area currently selected is maintenance.
The selection is based on the NRC identification of an emerging safety concern or an area requiring increased emphasis because of a history of longstanding or recurring problems for which industry root cause analysis and long-term corrective action are in question.
2.2 Additional Inspection Programs
Some programs are performed as required to follow up on operational events and safety issues and to further investigate the root causes and corrective actions related to fundamental inspection program findings. These additional programs are discussed below.
Regional Initiative Inspections
The regional staff initiates some inspections to follow up on safety concerns identified by the fundamental inspection program or as a result of information from other plant experience.
When problems with licensee performance are identified, such follow-up inspections will focus on examining the root cause of identified problems and apparent failures in licensee management controls that allowed the problems to occur.
In general, those plant activities that have not been given a SALP Category I rating will require additional regional initiative inspections.
Regional Reactive Inspections
A regional reactive inspection is generally an unplanned, onsite inspection that is initiated almost immediately in response to an operational event or incident and before a licensee event report (LER) is issued.
The LER would be subject to review and inspection under the core inspection program.
The resident inspectors provide the major onsite NRC presence for direct observation and verification of licensee activities and hence usually will perform the greater part of the initial, event-related reactive inspection effort.
However, this effort may be augmented by other inspectors, depending on the type of event and expertise required.
Special Team Inspection Program
The special team inspection program can be initiated by headquarters or regional staff; it consists of an independent, in-depth, and balanced examination of licensee performance to assess the adequacy of specific functional technical activities that ensure safe operations.
Special team inspections are encouraged whenever it is considered necessary to conduct an in-depth or multidisciplined examination of a particular licensee activity and also to focus on the apparent root cause of previously identified problems.
Special team inspections include, but are not limited to, a safety system functional inspection (SSFI), a safety systems outage modification inspection (SSOMI), and an operational safety team inspection (OSTI).
Safety Issues Program
The safety issues program is implemented through the issuance of individual Temporary (Inspection) Instructions (TI) and is the means used to verify that the licensee has implemented requirements imposed by the NRC to resolve a specific generic safety issue.
Such inspections are generally conducted to assess how well a licensee has implemented NRC requirements contained in NRC bulletins or generic letters.
3. INSPECTION OF ROOT CAUSE
The NRC inspection and pursuit of licensee root cause analysis for an identified safety issue is provided for through the general policy and specific inspection requirements of the operating reactor inspection program. The inspection program recognizes that immediate corrective action taken by a licensee to resolve a significant operational problem may not address the root cause of an identified problem.
Therefore, until this matter is adequately addressed, the NRC inspector will continue to pursue licensee root cause analysis and corrective action required to prevent recurrence of similar problems.
Thus, the NRC inspection of licensee overall corrective action is accomplished sequentially over a period of time starting with the initial identification of a violation of NRC requirements.
For each Notice of Violation identified by the NRC, the licensee is requested to provide a response to the following:
- Acceptance or Denial of the Violation
- Reason for Its Occurrence
- Corrective Action Taken and Results Achieved
- Corrective Action Taken To Prevent Recurrence
- Date When Full Compliance Will Be Achieved
Thus, in addition to the initial corrective action usually taken to allow for continued operation of an item, system, or activity, the licensee must specifically address corrective action needed to prevent recurrence of an identified safety problem. Although the current wording of the Notice of Violation does not specifically request that the licensee state the root cause of the identified violation, the NRC review and any inspection follow-up deemed necessary place particular emphasis on determining how adequately the licensee has identified the root cause of the safety problem.
In summary, the NRC inspection program is designed to identify plant operational safety issues at an early stage, and to provide for additional review and follow-up of identified problems to gain a full understanding of their scope and impact on safe operation.
Also, the inspection programs will continuously assess how effectively a licensee corrects a problem and the timeliness of such corrective action, the accuracy of the licensee's analysis of the related root cause, and any additional corrective action the licensee may need to take to prevent recurrence of similar problems.
To this end, the core inspection program provides for a review of each licensee response to an NRC-identified finding and for a further inspection of significant safety issues.
In particular, Inspection Procedure 92702, "Followup on Corrective Actions for Violations or Deviations," requires the staff to follow up on each issue by evaluating the adequacy of the licensee's planned corrective action, root cause determination, evaluation of generic implications, and actions taken to determine the need to change the governing quality assurance program policy and procedures.
In addition, Inspection Procedure 35502, "Evaluation of Licensee Quality Assurance Program Implementation," requires the NRC regional staff to perform a periodic summary evaluation of the effectiveness of the licensee's quality assurance program by reviewing licensee performance in all areas of plant operations before the SALP evaluation, so that perceived problems can be identified early.
This evaluation is performed to determine whether NRC-identified findings, operational events, and other information on plant experience indicate a fundamental weakness in the structure or implementation of the overall quality assurance program.
Where such weaknesses are believed to exist, a special regional initiative inspection is performed to focus on the perceived problem areas.
Such inspections would generally result in a further review of NRC- and licensee-identified findings and root cause determinations for the areas of concern.
Similar inspection requirements are included in inspection procedures utilized for regional initiative and special team inspections.
Examples of such typical requirements included in the operational safety team inspection (OSTI) are:
- Verify that an effective root cause determination is made for all safety-significant deficiencies.
- Verify that corrective action bounds the effects of any identified deficiency on operational safety.
- Determine if corrective action is structured to emphasize safety as well as compliance.
- Verify that the licensee has established an effective program for documenting and correcting identified deficiencies and for raising significant deficiencies to an appropriate level of management control.
Also, under the OSTI examination of licensee management oversight, the inspector is required to evaluate management's concern for keeping the plant in operation against its concern for safe operation of the plant.
This is done by assessing, for example, the thoroughness of management reviews conducted before plant restart (i.e., post-trip and event reviews), including management involvement in root cause analyses of significant equipment failures and corrective actions.
4. INSPECTION FINDINGS
It is appropriate during this session on root cause analysis to discuss some of the more current NRC inspection initiatives, to give examples of identified problems, and to offer NRC views on the associated root causes.
4.1 Emergency Operating Procedure (EOP) Inspections
First, under the Safety Issues Program the NRC issued Temporary Instruction (TI) 2515/92, "Emergency Operating Procedures."
The results of the first series of these inspections are discussed in NUREG-1358, "Lessons Learned from the Special Inspection Program for Emergency Operating Procedures."
The great majority of problems that were identified by the inspections resulted from inadequate or incomplete implementation of EOP programs.
Although the inspections focused on the emergency operating procedures themselves, the kinds of problems that were identified led the staff to examine the programmatic weaknesses responsible for those problems and for allowing them to go uncorrected.
The root cause of the widespread program weaknesses is that licensees have generally not followed the published guidance regarding the upgrading of EOPs.
It appears from the inspection findings that, rather than intentionally disregarding NRC guidance, licensees do not understand the principles included in that guidance.
The most significant programmatic problems are: lack of a multidisciplinary team approach, especially a lack of human factors expertise; lack of an independent review to assure that EOPs are correct and can be performed; lack of a systematic process for ensuring that the quality of EOPs does not degrade over time; and lack of adequate management commitment, which means that too low a priority has been assigned to the EOP program within the organization.
4.2 Maintenance Inspection
As I have said, the current area of emphasis under the Mandatory Team Inspection Program is maintenance.
These inspections are conducted as directed under TI 2515/97, " Maintenance Inspection."
Approximately one-third of these inspections have been performed.
Examples of NRC inspection findings follow, drawn from regional inspection reports and from a briefing of the NRC Commissioners on the results and status of these inspections.
The most common weaknesses in nuclear industry maintenance programs found during the maintenance team inspections have been summarized. They include:
- poor root cause analysis
- insufficient engineering support
- inefficient spare parts procurement
- ineffective trending of equipment failures / histories
- poor control of contractors
- failure to use procedures properly
- significantly different treatment of non-TS equipment in terms of management sensitivity to significance of problems, corrective actions, trending, procedures, and documentation.
During the May 2, 1989 briefing of the NRC Commission, the staff summarized the weaknesses in the engineering support and trending of maintenance.
Engineering Support
Engineering personnel did not perform adequate root cause analysis for equipment failures.
The repetitive failures of equipment were not identified as a basis for changes in the scope of the preventive maintenance program.
Engineering involvement in the resolution of problems noted on work orders during the performance of the job was not clearly evident. In addition, some preventive maintenance activities recommended by vendors were not conducted, and engineering performed no technical evaluation to support these exclusions.
In some cases, it took Engineering up to two years to resolve problems.
Trending
Although there are indications that licensees have established or are starting to establish trending programs, implementation is severely lagging.
Some licensee trending programs were not capable of identifying repetitive failures over a long period of time. They did not identify subtle trends or individual component failure trends.
At some sites, the Nuclear Plant Reliability Data System (NPRDS) was not being used to identify component failure trends.
Also, information documented on completed work packages was not adequate to assist in root cause analysis and failure trend analysis.
Some programs were fragmented.
Not all the failure information was available for review.
System engineers only saw preventive maintenance work packages and not corrective maintenance work packages.
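As a rough illustration of the kind of trending discussed here, the following is a minimal sketch of counting repetitive corrective maintenance on individual components over a long review window. The record layout, component identifiers, dates, and threshold are all hypothetical and are not taken from any licensee program or from the NPRDS; the point is only that the review must see corrective as well as preventive work packages and must span enough time to reveal repeats.

```python
from collections import Counter
from datetime import date

# Hypothetical work-package records: (component_id, completion_date, work_type).
# These values are invented for illustration only.
WORK_ORDERS = [
    ("AFW-PUMP-1A", date(1987, 3, 2), "corrective"),
    ("AFW-PUMP-1A", date(1988, 1, 15), "corrective"),
    ("AFW-PUMP-1A", date(1988, 11, 30), "corrective"),
    ("MOV-112", date(1988, 6, 1), "preventive"),
    ("MOV-112", date(1988, 9, 9), "corrective"),
]


def repetitive_failures(work_orders, start, end, threshold=2):
    """Return {component: count} for components with at least `threshold`
    corrective work packages completed between `start` and `end` inclusive."""
    counts = Counter(
        component
        for component, completed, work_type in work_orders
        if work_type == "corrective" and start <= completed <= end
    )
    return {component: n for component, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    # A two-year window is an assumed choice; a short window can hide repeats.
    print(repetitive_failures(WORK_ORDERS, date(1987, 1, 1), date(1988, 12, 31)))
```

A component flagged this way is only a candidate for root cause review; the count by itself says nothing about why the failures recur.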
5. TRAINING OF NRC INSPECTORS
The NRC recognizes that achieving an effective process for root cause analysis requires the involvement of individuals who are technically competent and have the experience and training needed to fully understand the various programs and activities that could be the root cause of an identified problem.
We also recognize that, even with this capability, it is not easy in some cases to identify the true root cause of a problem.
To give NRC inspectors a full appreciation for root cause analysis followup, the NRC provides each inspector with general training on matters to be considered when reviewing licensee actions in this regard.
Some of the guidance given on this subject follows.
5.1 Generic Implications
Once a problem has been identified, it is very important to determine the generic implications of that problem in order to prevent the failure of a similar component elsewhere in the plant.
Once the root cause is established, the licensee may find that the failure mechanism can be applied to other components or systems that are not necessarily similar to the component or system in which the failure first occurred.
A root cause determination of a programmatic failure such as the use of improperly calibrated equipment or the work of poorly trained craftsmen can have far reaching generic implications and can require extensive corrective action.
5.2 Root Cause Determination and Action Needed To Prevent Recurrence
The most important and most misunderstood aspect of the root cause and corrective action evaluation processes is that of determining the actions to be taken to prevent recurrence of a problem.
The reviewer must not limit the review to the immediate problem in question.
It should be recognized that the identification of the root cause for the immediate problem may dictate a continued evaluation to determine if the error was an isolated instance, or if the error resulted from a latent root cause of an undiscovered deficiency in the quality control system that allowed the problem to occur.
This aspect of the evaluation process requires a critical look into the inner workings of the responsible organization and a frank realization of the weaknesses that led to the problem.
The determination of the root cause of a problem requires much introspection on the part of the reviewer.
The reviewer must be objective in evaluating why the processes and/or people failed to prevent the problem.
The most common failure in this area is that of not doing an evaluation that is of sufficient depth.
If, for example, a deficiency occurred because a written procedure was not followed or because an individual did not perform in accordance with accepted practices, the cause may be listed as personnel error, and the corrective action might typically be to counsel the individual.
This may not adequately address the root cause of the problem.
If the individual involved failed to follow a written procedure, a more detailed evaluation might determine that one or more of the following may be the actual cause.
- 1. The written procedure was not provided to or was not available to the individual.
- 2. The individual did not know about the procedure or did not understand it.
- 3. The individual was aware the procedure existed and the procedure was available, but because of distractions, pressures, or other factors, several steps in the procedure were missed or overlooked.
- 4. The individual was not mentally or physically fit to perform the activity.
- 5. The individual was aware the procedure existed, but because of an attitude problem chose to ignore it.
In each of these cases, the categorization of personnel error would fit, but the real cause, the root cause, is quite different in each case.
In the first case, management failed to provide the individual with the tools (that is, the procedure) to do the job properly.
This may also indicate that the individual was not adequately trained about the need for using procedures and so did not have the knowledge or motivation to ask for a procedure.
In the second case, the individual clearly had not received adequate training to perform the assigned tasks.
This could be an indication of a weak training program and/or a weak supervisor who did not provide sufficient oversight of the individual's performance.
In the third case, the individual may not have been provided with a suitable work environment to perform the job safely.
This case may also indicate a failing on the part of management to provide the kind of resources (people and equipment) necessary to do high-quality work.
In the fourth case, management had not been effective in assuring that employees were physically or mentally capable of performing at the attention level required in a nuclear power plant.
This could indicate an inadequate fitness-for-duty program and/or a supervisor who was not observant and sensitive to the telltale signs of an employee's inability to function well in the job.
The last case is one of the hardest to evaluate.
The " attitude" of the individual could possibly reflect the attitude of management, and this may be a difficult reality for a reviewer to face.
This last case could also indicate that the individual's supervisors would not, or could not, deal with an attitude problem.
If, in another case, the problem is one in which defective procedures or an inadequate program is involved, the reviewer needs to determine why or how the originator allowed the problem or defect to get into the procedure or program, why the various reviews (by managers and committees) did not catch the defect, and whether other procedures or programs are subject to a similar defect.
If an operational component fails because of a defective part, the apparent cause may be a "design or manufacturing error."
However, the reviewer must also ask whether there is anything that should have identified the problem with the component before it failed and became an operational problem.
This, of course, requires a full understanding of the quality assurance program requirements placed on the purchase, installation and test of the item and what particular element of those programs should have identified the defective part.
In such cases, it may be prudent for the root cause analysis to also examine the generic implications of why existing quality controls failed to identify the problem.
This review should consider the need to address such things as the adequacy of the original design, or of the vendor's and owner's quality control programs for post-manufacturing, installation, and operational testing of the part or component.
5.3 Fixing the Root Cause
If the system has provided exceptional means for identifying and evaluating problems but does not follow through in assuring that the problem is fixed, the program is worth very little.
If one is to truly prevent recurrence of a problem, the root cause of the problem must be fixed in such a manner that the problem does not resurface in the near future.
Once a problem has been identified and evaluated, the problem is usually fixed by using a maintenance request, design change, or other document to make the fix.
Some plants have been known to close out a problem in the corrective action system when the maintenance work request was issued.
This approach does not provide assurance that the problem will get fixed, since the maintenance work request, or other document, could get lost or cancelled.
This approach also fails to assure that the proposed corrective action is appropriate and that the problem has been truly fixed.
To do it right, the corrective action system should assure that the evaluator and, if possible, the originator review and approve the document that orders the corrective actions to be taken, including that intended to prevent recurrence.
This requires time; however, the correction of a problem should be well planned and not rushed.
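The closure rule described above can be sketched as a simple check. This is only an illustration under stated assumptions: the class name, field names, and the `may_close` function are hypothetical and do not describe any plant's corrective action system. It shows that issuing the work request is not, by itself, a basis for closing the item.

```python
from dataclasses import dataclass


@dataclass
class CorrectiveActionItem:
    """Hypothetical record for one problem in a corrective action system."""
    work_request_issued: bool = False      # maintenance request or design change written
    evaluator_approved: bool = False       # evaluator reviewed and approved the work document
    work_completed: bool = False           # the fix was actually carried out
    fix_verified: bool = False             # the problem was confirmed to be fixed
    recurrence_action_done: bool = False   # action to prevent recurrence is in place


def may_close(item: CorrectiveActionItem) -> bool:
    """Allow closure only when the fix is approved, done, verified, and
    the action to prevent recurrence has been completed."""
    return all(
        (
            item.evaluator_approved,
            item.work_completed,
            item.fix_verified,
            item.recurrence_action_done,
        )
    )


if __name__ == "__main__":
    issued_only = CorrectiveActionItem(work_request_issued=True)
    print(may_close(issued_only))  # issuing the request alone does not close the item
```

Note that `work_request_issued` is deliberately absent from the closure test: a request that is later lost or cancelled would otherwise leave a closed item with no fix behind it.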
In reviewing the proposed action to fix the problem, the evaluator should review the appropriateness of the work procedure.
The evaluator should also determine whether the individuals performing the work need training (or retraining) before doing the work.
Consideration must also be given to the need to verify the operability of associated redundant systems.
The key is to assure that the corrective action is done right so it will neither have to be done over, nor cause a more severe problem.
The same deliberate controls must apply to the actions to prevent recurrence.
For example, if the corrective action involves additional training for certain individuals to assure they are aware of some specific requirement, a training session for current employees will not suffice.
The one-time training session will not assure that the employee who is hired next month or next year will be aware of the problem, and that employee could thus cause the same problem in the future.
When training is needed, it must be factored into the training program to assure that future employees are properly trained in the problem area and that current employees are given periodic refresher training to remind them of the potential problem.
In some cases an alternate to repetitive training is to design the problem away by changing the physical plant so it is no longer vulnerable to the problem.
This approach should be considered when repetitive training is needed to ensure that periodic checks are performed to verify that a marginally designed component or system is not operated beyond its design limit.
A second example: When the corrective action involves changes to procedures or program documents, the licensee's commitment tracking system should record that the specific changes were made to correct an identified problem.
This will assure that future questions regarding the reason for a specific requirement in a procedure or program document can be addressed, and that the requirement will not be inadvertently dropped and result in a repeat of the initial problem.
Finally, the history record associated with the correction of the problem must include an assembly of, or reference to, all appropriate identification, evaluation, corrective action, work, tracking, and trending documents related to the issue.
6. CONCLUSION
On the basis of my NRC experience, I am encouraged by what I have seen to date in the general industry's performance relative to root cause analysis. However, it is also apparent that significant improvements in this area can be achieved through a better understanding of the importance of problem root cause analysis to reactor safety.
This session on root cause analysis should be helpful to the nuclear industry in gaining a better understanding of this subject.