ML21274A021

Final Report SEC Subcommittee Report: Review of the NCNR Event Response and Technical Working Group Root Cause Analysis and Corrective Action Plan
Site: National Bureau of Standards Reactor
Issue date: 08/12/2021
From: US Dept of Commerce, National Institute of Standards & Technology (NIST)
To: Office of Nuclear Reactor Regulation
Shared Package: ML21274A018

SEC Subcommittee Review 1

FINAL Report

SEC Subcommittee Report: Review of the NCNR Event Response and Technical Working Group Root Cause Analysis and Corrective Action Plan

Submitted to NCNR Director August 12, 2021

From the Safety Evaluation Committee (SEC) Subcommittee:

Event Response and Corrective Action Subcommittee Members:

Elizabeth Mackey, NIST Chief Safety Officer, SEC Vice Chair
Donald Pierce, NIST, NCNR Engineer, SEC Chair
Amber Johnson, University of Maryland, SEC Member
Timothy Barvitskie, NIST, NCNR Health Physicist, SEC Member
James Adams, NIST, Chief Radiation Physics Division

Table of Contents

LIST OF ABBREVIATIONS
EXECUTIVE SUMMARY
1. BACKGROUND AND CONTEXT
2. SCOPE OF REVIEW AND EVALUATION
3. METHODS USED TO CONDUCT THE REVIEW AND EVALUATION
4. MATERIALS REVIEWED
5. REVIEW OF NCNR RESPONSE TO EVENT ON FEB 3
6. REVIEW OF TWG REPORT: CIRCUMSTANCES, CAUSES AND CORRECTIVE ACTION PLAN
7. MANAGEMENT SYSTEMS
8. QUALIFICATION AND TRAINING PROGRAM
9. PROCEDURES
10. INSTRUMENTS, EQUIPMENT AND TOOLS
REFERENCES

LIST OF ABBREVIATIONS

AP: annunciator procedure
CA: corrective action
CARRI: corrective actions and reactor recovery items
CF: contributing factor
EAL: emergency action level
ECS: emergency control station
EI: emergency instruction
ERCAS: Event Response and Corrective Action Subcommittee
IET: instruments, equipment and tools
MS: management systems
NBSR: National Bureau of Standards Reactor, i.e., the NIST test reactor
NCNR: NIST Center for Neutron Research
NI: nuclear instrument
NIST: National Institute of Standards and Technology
NOUE: notice of unusual event
NRC: US Nuclear Regulatory Commission
PR: procedure
QAPM: Quality Assurance Program Manager
QT: qualifications and training
RC: root cause
RM: radiation monitor
SEC: Safety Evaluation Committee
SPI: suggested program improvement
SRO: Senior Reactor Operator
TWG: Technical Working Group

EXECUTIVE SUMMARY

On Feb. 3, 2021, shortly after commencing a normal reactor start up, the NBSR (NIST test reactor) experienced a rapid drop in power. Several radiation monitors exceeded setpoints and one of these caused a major scram. The NCNR declared an Alert condition and informed the NRC. The Alert was downgraded to Notice of Unusual Event and terminated the same day. The NCNR convened a Technical Working Group (TWG) to perform a root cause analysis and develop a corrective action plan. The TWG determined that an unlatched fuel element was the direct cause of the event, identified five root causes and developed a corrective action plan. The results of the investigation were reported to the NRC in accordance with requirements of NBSR Technical Specification 6.7.2.

At the request of the NCNR Director, the Safety Evaluation Committee (SEC) chair convened a subcommittee to fulfill requirements of Technical Specification 6.2.3(2) which requires the SEC to review events involving special reports to the NRC. This subcommittee, the Event Response and Corrective Action Subcommittee (ERCAS) was charged with reviewing the circumstances of the event and the measures planned to preclude recurrence by conducting an evaluation of the NCNR response to the event and reviewing the causes, contributing factors, and corrective actions identified by the TWG. The subcommittee fulfilled the charge by:

1. Reviewing event-related reports to the NRC, the NRC's interim special inspection report, specific applicable procedures (i.e., administrative rules, operation instructions, emergency instructions, health physics instructions) and relevant sections of the emergency plan; and
2. Interviewing individuals who were present and responsible for responding to the event, as well as other members of the reactor operations, reactor engineering and health physics (HP) groups, and those who oversee and manage those groups. The ERCAS interviewed ten reactor operators (three of whom are supervisors), four HP staff, four engineers, and three managers of supervisors.

Based on analysis of the NCNR response to this event, the ERCAS determined that safety systems functioned as intended, defined roles were fulfilled quickly and correctly, and defined processes and procedures were implemented. There are no recommended corrective actions regarding the NCNR response to the event. There were lessons identified, captured as observations, with specific suggested program improvements provided for consideration by NCNR management. These include: developing a checklist to assist with implementing the Emergency Plan; improving the emergency control station, including formalizing use of reactor at your desktop software to monitor conditions remotely; developing guidelines to more easily assess radiological conditions associated with notifications; and expanding the scope of drills to include a wider variety of emergency conditions to exercise critical thinking and practice responses. This review and analysis and a list of suggested program improvements are included in Section 5.

The ERCAS evaluated the TWG report and found the description of the event including the timeline, analysis of precursor circumstances, and determination of the direct cause (an unlatched fuel element) to be complete, correct, and consistent with information obtained by this subcommittee. The ERCAS concurs with the TWG's list of five (5) root causes (RCs) and fifteen (15) corrective actions (CAs) and added two (2) additional root causes and eight (8) corrective actions recommended to preclude recurrence of this type of event. The root causes and contributing factors fell into one or more of the following categories: management systems (MS); qualification and training (QT); procedure adequacy and use (PR); and instruments, equipment, and tools (IET). An analysis of the circumstances that led to this event, the root causes, contributing factors, and recommended corrective actions are presented in sections 6-10. These sections also include observations and suggested program improvements which are not deemed necessary to prevent recurrence of this or similar events but are provided for consideration by NCNR management. A brief discussion of each root cause is included below, followed by a summary list of root causes and recommended corrective actions. For completeness those identified by the TWG and ERCAS are combined and presented in this one summary list.

1. Root Cause ERCAS-MS-RC1. Robust management systems typically include processes to manage change. The NCNR has well-established and expertly implemented processes to manage changes to procedures, experiments, parts and instruments that have the potential to affect reactor operations and specifically, safety of operations. However, recent changes in (1) staffing levels and crew proficiency associated with staff attrition, (2) oversight associated with turnover in key supervisory positions, (3) shift rotation alignment with the reactor schedule (changed some years ago, which contributed to lack of proficiency), and (4) tools (minor changes) associated with tool wear and replacement were identified as contributing factors to this event. Based on analysis of these contributing factors, the ERCAS determined one root cause to be an insufficient change management program and supports the corrective actions identified by the TWG (listed below as MS-CA2, -CA3, -CA4) and also recommends development of a more comprehensive framework to manage change (MS-CA1).

2. Root Cause TWG-MS-RC2. Consistent and adequate oversight of reactor operators is essential to ensuring consistency, quality and safety of operations. Inadequate oversight of refueling operations was identified as a root cause of this event by the TWG. The ERCAS concurs with this assessment. Interviews with crews pointed to inconsistent supervision and inadequate training on how to supervise. The TWG corrective actions (MS-CA5, MS-CA6) are sufficient to address this root cause.
3. Root Cause ERCAS-MS-RC3. Crew chiefs, supervisors and managers noted that a culture of complacency was prevalent in the reactor operations group. This resulted in a lack of ownership of processes and procedures, and the perception that there was no expectation for operators to proactively identify necessary improvements. NCNR management has taken several actions to improve staff engagement and has communicated the expectation that staff must take responsibility for improving reactor operations. NCNR managers have established multidisciplinary teams in which reactor operators collaborate with engineers and others on developing corrective actions to address root causes of this event and to improve operations.

The ERCAS supports this action and also recommends that NCNR managers develop and implement a standing preventive action program that encourages and rewards proactive efforts by staff to improve quality, safety, and efficiency of operations (MS-CA7).

4. Root Cause TWG-QT-RC1. The TWG identified inadequacy of the training and qualification program as a root cause of this event. The ERCAS concurs with this assessment. Crews were not proficient in latch-check operations and did not fully understand how to detect an unlatched fuel element. The ERCAS endorses the corrective actions (QT-CA1, -CA2, -CA3) and more specifically recommends development of a more structured training program (QT-CA4) that makes use of materials that reflect well-defined objectives (QT-CA3). ERCAS also recommends improvements to the reactor operator qualification program, specifically, the development of clearly defined requirements for task proficiencies, and consistency in evaluating those proficiencies (QT-CA5).
5. Root Cause TWG-PR-RC1. The TWG identified inadequacies in latch-checking procedures as a root cause of this event and described four corrective actions to address this root cause (PR-CA1, -CA2, -CA3, -CA4). The ERCAS concurs with the assessment of root cause and endorses these four recommended corrective actions as sufficient to address this root cause. The ERCAS found that the processes used to revise, review, and reissue procedures were well-documented, well-implemented, and well-understood by reactor operations staff. No additional corrective actions are recommended.
6. Root Cause TWG-PR-RC2. The TWG found that procedural compliance was not enforced and identified this as a root cause of this event and recommended two corrective actions to address this root cause. The ERCAS concurs with this determination and believes the planned actions are sufficient. The first corrective action, to revise requirements for and training on procedure adherence (PR-CA5), was accomplished prior to the writing of this report. The NCNR held a stand-down to discuss expectations with employees and revised AR 1.0, Conduct of Operations, on 3/23/2021; additional improvements are planned. The second corrective action recommended by the TWG, to revise procedures to be consistent with INPO 11-003, Guidelines for Excellence in Procedure and Work Instruction Use and Adherence (PR-CA6), is underway but will take more time to fully implement. No additional corrective actions are recommended by the ERCAS.

7. Root Cause TWG-IE-RC1. The TWG identified deficiencies in the fidelity of latch determination equipment and tools as a root cause, and the ERCAS concurs. Only the rotational latch check, using the rotation tool, gave a definite indication of latch status, and it was later discovered that it was possible for the element to become unlatched if the fuel head was bumped by the tool after the latch check. The corrective actions identified by the TWG (IE-CA1) require instituting visual (latch) checks and documenting that such checks provide adequate defense against unlatching (IE-CA2). The ERCAS endorses this corrective action and recommends that the new method specify use of a camera or video device to examine fuel head features that provide indisputable evidence of latch status. The subcommittee concurs with modifying the index plate and providing clear fiducial marks (IE-CA3) as a means to ensure the index plate is consistently positioned. The ERCAS additionally recommends instituting administrative controls that prevent tool contact with fuel heads (IE-CA5) once the final visual verification has been accomplished, and in light of the new visual verification methods, recommends that the NCNR consider discontinuing height checks as a means of fuel element latch verification (IE-CA4). The TWG noted a lack of opportunities for staff to become proficient in latch-checking; the ERCAS recommends that NCNR management ensure opportunities are provided by either increasing access to the reactor top for training or providing a new or modified test stand for practice that simulates the latch-checking experience (IE-CA6).


Summary of Root Causes (RC) and Recommended Corrective Actions (CA)

1. Change management program needs improvement. (ERCAS-MS-RC1)
a. ERCAS-MS-CA1: Develop and implement a change management framework to evaluate sufficiency of existing change management processes, identify gaps and areas for improvement.
b. TWG-MS-CA2: Develop system for knowledge and skills management in the presence of personnel attrition.
c. TWG-MS-CA3: Assess efficacy of all tools and determine necessary improvements.
d. TWG-MS-CA4: Prioritize and elevate the Aging Reactor Management program emphasizing oversight of communications between groups and ensuring that maintenance and other issues identified are resolved.
2. There was inadequate management oversight of refueling staffing. (TWG-MS-RC2)
a. TWG-MS-CA5: Develop program for robust qualification of supervisors overseeing refueling operations.
b. TWG-MS-CA6: Require training for supervisors on oversight.
3. There was a culture of complacency, lack of staff ownership of continuous improvement. (ERCAS-MS-RC3)
a. ERCAS-MS-CA7: Develop a plan for involving staff in continuous improvement of reactor operations, through participation in a preventive action program that encourages and rewards proactive efforts to improve quality, safety, and efficiency of operations.
4. Training and qualification program for operators was not on par with programmatic needs. (TWG-QT-RC1)
a. TWG-QT-CA1: Require proficiency training for personnel prior to all refuelings, emphasizing the importance of latching and procedural compliance.
b. TWG-QT-CA2: Develop program for robust qualification of operators and candidates in moving fuel.
c. ERCAS-QT-CA3: Training materials, such as qualification cards and experience with use of fuel handling stand, should reflect learning objectives.
d. ERCAS-QT-CA4: Provide consistent and structured training and immediate and continual feedback to trainees during on-the-job training to ensure comprehension of performance expectations.
e. ERCAS-QT-CA5: Develop consistent standard by which all supervisors evaluate qualifications.
5. Procedures as written did not capture necessary steps to assure latched elements. (TWG-PR-RC1)
a. TWG-PR-CA1: Rewrite OI 6.1 and OI 6.2 to capture detail of fuel and latch movements to align with training.
b. TWG-PR-CA2: Reinstitute requirement for latch checks prior to final pump restart; modify OI 2.1.
c. TWG-PR-CA3: Institute method of visual checks.
d. TWG-PR-CA4: Institute a redundant rotation latch check, performed by a second individual. (TWG)
6. Procedural compliance was not enforced. (TWG-PR-RC2)
a. TWG-PR-CA5: Update procedures to require training for all personnel on procedure adherence.
b. TWG-PR-CA6: Revise procedures to be consistent with INPO 11-003. (TWG)
7. Inadequacies existed in the fidelity of latch determination equipment and tools. (TWG-IE-RC1)
a. TWG-IE-CA1/PR-CA3: Institute a method of visual checks. ERCAS clarification: Specify use of a camera or video camera to provide indisputable proof that each element is fully latched.
b. TWG-IE-CA2: Document that improved latching and latch check processes provide adequate defense against unlatching.
c. ERCAS-IE-CA4: Consider discontinuing use of height checks to verify latching (ERCAS)
d. ERCAS-IE-CA5: Put administrative controls in place (procedures) to assure no tool contact with fuel head following final visual latch verification prior to reactor startup (ERCAS)
e. ERCAS-IE-CA6: Increase access to the reactor top for training purposes or redesign/modify existing test stand to better simulate reactor top fuel loading/latching/latch checking experience (ERCAS)
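
The summary list above can be cross-checked against the counts given earlier in this Executive Summary (five TWG root causes with fifteen corrective actions, plus two ERCAS root causes with eight corrective actions). The following is a minimal sketch of such a tally, not part of the subcommittee's method. The identifiers are transcribed from the list as written; the variable and function names are illustrative assumptions. The entry listed as TWG-IE-CA1/PR-CA3 is counted once under each root cause where it appears, which is an assumption about how the totals were reached.

```python
from collections import Counter

# Root cause -> corrective actions, transcribed from the summary list.
# The prefix (TWG or ERCAS) records which group identified each item.
SUMMARY = {
    "ERCAS-MS-RC1": ["ERCAS-MS-CA1", "TWG-MS-CA2", "TWG-MS-CA3", "TWG-MS-CA4"],
    "TWG-MS-RC2": ["TWG-MS-CA5", "TWG-MS-CA6"],
    "ERCAS-MS-RC3": ["ERCAS-MS-CA7"],
    "TWG-QT-RC1": ["TWG-QT-CA1", "TWG-QT-CA2", "ERCAS-QT-CA3",
                   "ERCAS-QT-CA4", "ERCAS-QT-CA5"],
    "TWG-PR-RC1": ["TWG-PR-CA1", "TWG-PR-CA2", "TWG-PR-CA3", "TWG-PR-CA4"],
    "TWG-PR-RC2": ["TWG-PR-CA5", "TWG-PR-CA6"],
    # TWG-IE-CA1 is listed as "TWG-IE-CA1/PR-CA3"; it is counted here as its
    # own entry (assumption), in addition to TWG-PR-CA3 above.
    "TWG-IE-RC1": ["TWG-IE-CA1", "TWG-IE-CA2", "ERCAS-IE-CA4",
                   "ERCAS-IE-CA5", "ERCAS-IE-CA6"],
}


def originator(item_id: str) -> str:
    """Return the group prefix (TWG or ERCAS) of an identifier."""
    return item_id.split("-", 1)[0]


def tally(summary: dict) -> tuple:
    """Count root causes and corrective actions per originating group."""
    rc_counts = Counter(originator(rc) for rc in summary)
    ca_counts = Counter(
        originator(ca) for cas in summary.values() for ca in cas
    )
    return rc_counts, ca_counts


rc_counts, ca_counts = tally(SUMMARY)
print("Root causes:", dict(rc_counts))
print("Corrective actions:", dict(ca_counts))
```

Under these assumptions the tally agrees with the counts stated in the text. IE-CA3 (index plate fiducial marks) is discussed in the narrative but does not appear as an entry in the summary list, so it is not included here.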

1. BACKGROUND AND CONTEXT

On February 3, 2021, during commencement of normal reactor startup, there was a sudden drop in power and rapid increases on several radiation monitors (RMs), including the fission products monitor (RM 3-2) and the stack monitor (RM 4-1). At approximately 0909, the stack monitor reached its setpoint of 50,000 counts per minute. This initiated a major scram, which resulted in a reactor scram and immediate activation of the confinement isolation system, sealing off confinement to prevent or limit release of radioactive material from the facility. In accordance with the emergency instructions, the National Institute of Standards and Technology (NIST) Center for Neutron Research (NCNR) declared an Alert condition and notified the U.S. Nuclear Regulatory Commission (NRC) on or about 0929. Based on analysis of radiological samples taken at the 400-meter site boundary and the reactor confinement exhaust stack, NCNR determined that radiological conditions for an Alert were not met and downgraded to a Notification of an Unusual Event (NOUE) on or about 1532. Through analysis of additional samples, NCNR determined that they no longer met the radiological conditions for a NOUE and terminated the event on or about 1935 the same day. Details of the circumstances of this event are included in NCNR's 14-day report to the NRC dated February 16, 2021 (Ref. 1) and a subsequent addendum, dated March 4, 2021, to correct estimates of the activity released during the event (Ref. 2).
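
For readers who want the elapsed time between the milestones above, the following is a minimal sketch that computes the intervals from the clock times given in this section. The times are the approximate ("on or about") values from the text; the milestone labels and function names are illustrative assumptions, and all times are taken to fall on February 3, 2021.

```python
from datetime import datetime

# Approximate 24-hour clock times from the narrative, all on February 3, 2021.
MILESTONES = [
    ("Stack monitor setpoint reached, major scram", "0909"),
    ("Alert declared, NRC notified", "0929"),
    ("Downgraded to NOUE", "1532"),
    ("Event terminated", "1935"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two same-day HHMM clock times."""
    fmt = "%H%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def intervals(milestones: list) -> list:
    """Elapsed minutes between each consecutive pair of milestones."""
    return [
        (earlier[0], later[0], minutes_between(earlier[1], later[1]))
        for earlier, later in zip(milestones, milestones[1:])
    ]


for start_label, end_label, minutes in intervals(MILESTONES):
    hours, rem = divmod(minutes, 60)
    print(f"{start_label} -> {end_label}: {hours} h {rem} min")

total = minutes_between(MILESTONES[0][1], MILESTONES[-1][1])
print(f"Setpoint reached to termination: {total // 60} h {total % 60} min")
```

Because the source times are approximate, the computed intervals (roughly 20 minutes to the Alert notification and about ten and a half hours to termination) should be read as approximate as well.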

The NCNR investigation of the direct cause of this event commenced soon after radiological conditions were evaluated and conditions for work to commence were established. It was determined by visual inspection that a single fuel element was damaged, indicating that the fuel temperature safety limit was exceeded. This was reported to the NRC on March 5, 2021 (Ref. 3).

On March 10, the NCNR Director convened a Technical Working Group (TWG) to evaluate the circumstances that led to the event, determine the direct, contributing and root causes and develop a corrective action plan. The TWG reported that the failure of the element was likely the result of this element not being properly latched at the time of startup, although this condition was not known at that time. This finding was reported to the NRC on May 13, 2021 (Ref. 4). The TWG issued the report, Root Cause Analysis of February 2021 Fuel Failure, Revision 1 to the NCNR Director on May 13, 2021 (Ref. 5). A subsequent addendum to the TWG report dated June 3, 2021 (Ref. 6) noted the possibility of inadvertently unlatching a fuel element by a relatively small impulse force from dropping the pickup tool onto the element head without any rotational force being applied.

The NRC chartered a Special Inspection Team on February 8, 2021 and conducted inspections from February 9 through April 9, 2021. An interim special inspection report was issued on April 14, 2021 (Ref. 7). NRC activities are ongoing as of the issue of this report.

There were several actions taken by NIST and the NCNR to ensure adequate response to this event. The key components of NIST's response to this event are listed below.

1. February 3, 2021: NCNR Reactor Operations Staff implemented the Emergency Plan to secure the facility and ensure safety of the staff and the public;
2. February 3, 2021: NIST established a NIST-level Incident Response Team, chaired jointly by the NCNR Director and NIST Chief Safety Officer, to ensure adequate resources and to coordinate organizational support for the ongoing response to the event and through resumption of normal operations;
3. February 5, 2021: NCNR instituted daily stand-up meetings for reactor recovery to plan and review work; frequency of meetings was modified as appropriate to needs;
4. February 23, 2021: The Safety Evaluation Committee (SEC) chartered a subcommittee, the Incident Recovery Review Subcommittee, to conduct safety reviews of new procedures developed as part of the recovery work;
5. March 10, 2021: NCNR Director convened a TWG to investigate the circumstances and conditions that led to this event, determine the contributing and root causes and recommend appropriate corrective actions;
6. May 2021: Corrective Actions and Reactor Recovery Items (CARRI) teams were convened by NCNR leadership to flesh out detailed corrective action plans based on the corrective actions identified in the TWG report and input from team members and to begin implementation of action plans;
7. May 14, 2021: The NCNR Director and SEC Chair chartered a subcommittee, the Event Response and Corrective Action Subcommittee (ERCAS), to review and evaluate:
a. NCNR response on the day of the event, including implementation of the Emergency Plan; and
b. The TWG report, and provide recommendations on the corrective actions planned.
8. Planned: NIST Acting Director and NCNR Director plan to engage external Subject Matter Experts to evaluate independently: the TWG report and corrective action plan; the SEC Subcommittee Report and recommendations; and NIST's organizational response to the event.

This document is the report issued by the SEC Subcommittee, ERCAS, associated with Item 7 above, in accordance with the scope and charge issued by the NCNR Director; see section 2.

2. SCOPE OF REVIEW AND EVALUATION

The Event Response and Corrective Action Subcommittee (ERCAS) was convened by the NCNR Safety Evaluation Committee (SEC) chair at the request of the NCNR Director. The scope of this subcommittee's review was issued by the NCNR Director in a Memorandum to the SEC Chair on May 13, 2021 (Ref. 8), consistent with the SEC responsibilities delineated in the Technical Specifications for the NIST Test Reactor (NBSR) (Ref. 9) and SEC charter (Ref. 10). The relevant text from this memo is copied below.

In accordance with Tech Spec 6.2.3(2), I am directing the NCNR Safety Evaluation Committee to review the circumstances of this incident and measures planned to preclude a recurrence. The SEC findings and recommendations shall be provided in a written report to me.

The SEC report should include:

1. An evaluation of the NCNR response to the incident;
2. An independent review of the cause(s) of the incident and any contributing factors and corrective and preventive actions identified in the investigation report by the NCNR Technical Working Group on the Incident; and
3. Any recommendations for actions by me to ensure safe operation of the NBSR.

3. METHODS USED TO CONDUCT THE REVIEW AND EVALUATION

Methods used to perform the required evaluation and review and develop recommendations were structured around the two-part charge to the subcommittee; these are described below.

Evaluation of NCNR Response. The first part of the charge was to evaluate the adequacy of NCNR's response to the event. This subcommittee considered the response to the event by Reactor Operations and Health Physics staff from reactor startup on February 3, 2021, through declaration of an Alert, downgrade to Notification of Unusual Event (NOUE), and termination of the event, as well as the short-term recovery activities that followed the termination of the event until normal ventilation was restored on February 6, 2021. The ongoing recovery activities necessary to remediate contamination in the confinement building and to return the reactor to normal operating condition are not within the scope of this review.

As a starting point, this subcommittee reviewed the NRC interim special inspection report dated April 14, 2021 (Ref. 7). This subcommittee accepts the NRC's conclusions summarized in the excerpts copied below.

Licensee Response to the Event:

Based on interviews and document review, the inspectors determined that the licensee followed the NRC-approved emergency plan and the licensee's approved emergency plan implementing procedures during the initial response to the event.

Consequences of Event:

Based on interviews and document review, the inspectors found that doses to members of the public and occupational workers were a small fraction of the regulatory limits established in 10 CFR Part 20. The inspectors also found that the air ECs were a small fraction of the limits specified in Appendix B of 10 CFR Part 20.

The NCNR Director requested that the ERCAS independently review the adequacy of NCNR's response to the event and the consequences associated with it and provide recommendations as appropriate. The ERCAS interviewed several reactor operators and health physics staff who were on duty or called to duty and responsible for responding to the event on February 3, 2021, and reviewed documents and records to assess:

1. Implementation of the emergency plan
2. Functioning of safety systems
3. Fulfillment of designated roles and responsibilities
4. Process for assessing reactor and facility conditions
5. Adequacy of equipment and instruments
6. Communication among involved staff

Based on assessment of these actions and conditions, the subcommittee evaluated the adequacy of the response and developed suggestions for program improvements. These are presented in section 5.

Independent Review of Technical Working Group Report. The second part of the charge was to provide an independent evaluation of the event causes and associated corrective action plan identified by the TWG, and to make recommendations, as necessary and appropriate, for additional actions to ensure safety and prevent recurrence of this type of event. As a starting point, this subcommittee reviewed the TWG report and addendum (Refs. 5, 6). This subcommittee accepts the conclusion of the TWG that the direct cause of the event (defined as elevated radiation levels following reactor startup on February 3, 2021) was failure of a fuel element due to this element being unlatched during startup.

To independently evaluate the contributing factors (CFs) and root causes (RCs) that led to the unlatched fuel element (that caused this event), the subcommittee used the following process to better understand the relevant circumstances and conditions, and develop recommendations.


1. Determine the categories of potential root causes and contributing factors of this event based on TapRooT and National Safety Council incident investigation methods. The potential contributing factor categories were determined to be:
a. Management structure, systems, oversight, and employee relations;
b. Training and qualification of reactor operators;
c. Procedure adequacy and use (associated with refueling); and
d. Equipment and tools (used for refueling) and instruments (indicators of off-normal fuel conditions).
2. Develop a list of questions to address gaps in knowledge of the event and relevant circumstances under each category of contributing factors.
3. Conduct interviews of health physics staff, reactor operators, reactor engineers, supervisors, and crew chiefs, and NCNR management.
4. Observe reactor operators conducting movement and latching of fuel elements.
5. Observe the test stand and dummy fuel element used for training on refueling.
6. Determine relevant contributing factors, root causes, and develop recommendations for actions that would help prevent recurrence of this type of event.

This subcommittee's independent review of the TWG's determination of root causes, circumstances and causal factors is presented in Section 6. Recommended corrective actions (CAs) developed by this subcommittee are presented together with those provided by the Technical Working Group. Summaries of planned and recommended CAs grouped by CF categories are presented in Sections 7-10. Corrective actions are intended to prevent recurrence of this type of event. Related observations and associated suggested program improvements (SPIs) are provided for consideration by NCNR management and intended to strengthen NCNR's safety management system and associated processes.

4. MATERIALS REVIEWED
a. Reports to the NRC (Refs. 1-4)
b. Technical Working Group Report, Revision 1 and associated addendum (Refs. 5,6)
c. US NRC Interim Special Inspection Report (Ref. 7)
d. Technical Specifications, relevant sections (Ref. 9)
e. Emergency Plan (Ref. 11)
f. Emergency Instructions
g. INPO Guideline for Excellence in Procedure and Work Instruction Use and Adherence, Revision 0, INPO 11-003, June 2011. (Ref. 13)
h. Administrative Rule 1.0 Conduct of Operations, Revision C (Ref. 14)
i. Administrative Rule 5.0 Procedures and Manuals (Ref. 15)
j. Operation Instruction 6.1 Fueling and Defueling Procedures, Revision E (Ref. 16)

5. REVIEW OF NCNR RESPONSE TO EVENT ON FEB 3

During normal reactor startup on February 3, the NBSR experienced a major scram during the approach to full power. Upon receipt of the major reactor scram signal, the reactor safety systems functioned as designed (i.e., reactor automatically shut down, confinement isolation system was initiated, and ventilation realigned from the normal operating mode to emergency mode). Interviewees noted that safety systems worked as intended and without operator intervention.
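
The automatic response described above can be pictured as a simple setpoint trip. The following is a minimal illustrative sketch only and does not represent the NBSR protection system design. The 50,000 counts-per-minute stack monitor setpoint is the value reported in Section 1; the class, field, and function names, and the use of a greater-than-or-equal comparison, are assumptions made for illustration.

```python
from dataclasses import dataclass, replace

# Stack monitor (RM 4-1) setpoint as reported in Section 1.
STACK_MONITOR_SETPOINT_CPM = 50_000


@dataclass(frozen=True)
class FacilityState:
    """Hypothetical summary of the three responses named in the text."""
    reactor_shut_down: bool = False
    confinement_isolated: bool = False
    ventilation_mode: str = "normal"


def major_scram(state: FacilityState) -> FacilityState:
    """Apply the three automatic responses listed in the report."""
    return replace(
        state,
        reactor_shut_down=True,
        confinement_isolated=True,
        ventilation_mode="emergency",
    )


def check_stack_monitor(counts_per_minute: int,
                        state: FacilityState) -> FacilityState:
    """Trip a major scram when the reading reaches the setpoint (assumed >=)."""
    if counts_per_minute >= STACK_MONITOR_SETPOINT_CPM:
        return major_scram(state)
    return state


print(check_stack_monitor(49_999, FacilityState()))
print(check_stack_monitor(50_000, FacilityState()))
```

The sketch takes no operator input, which mirrors the interviewees' observation that the safety systems acted without operator intervention.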

Following confirmation of status of safety systems and based on increases on the fission monitor and stack alarm, reactor operators sounded the building evacuation alarm, began evacuating the confinement building, and declared an Alert. Prior to evacuating, operators took vital actions to ensure safety, i.e., verified the reactor was shut down and primary coolant pumps were running to maintain cooling. Health Physics personnel in the confinement building ensured non-essential personnel had evacuated prior to exiting confinement to obtain samples from the confinement exhaust stack. Interviews with the operators and health physics staff confirmed that the emergency plan (Ref. 11) and associated emergency instructions (EIs) and annunciator procedures (APs), e.g., the helium sweep activity high, were used. This was confirmed in the NRC interim special inspection report (Ref. 7). The EIs have general guidelines for evacuation but not a definitive list of systems to check prior to evacuating, so operators created their own checklist to verify the state of certain systems prior to evacuation.

A suggested program improvement is to develop a checklist of systems to check in the event of an emergency that requires evacuation of the control room [ER-O1].

Securing the CO2 (secondary concern) was missed, and the potential for CO2 levels to build up in low-lying areas and present an O2 deficiency hazard was not recognized until reentry into confinement on February 4. This omission was investigated separately; a suggested program improvement is to implement and communicate the corrective actions developed as part of the root cause analysis specific to this condition (Ref. 12) [ER-O2].

After all personnel (a total of 9) had evacuated confinement, the reactor was monitored by operators from the emergency control station (ECS) located outside of the confinement building. Six reactor operations staff working in the ECS who had been in the control room or near the reactor top at the time of the event informed Health Physics staff that they had become externally contaminated. Health Physics staff determined the extent of the contamination and, to minimize personnel exposure and remove surface contamination, instructed operations staff to doff their clothing, provided them with coveralls, and instructed them to proceed to the B-Wing showers to decontaminate themselves by removing the remainder of the surface contamination. Two Health Physics staff and another member of the NCNR staff present inside C-100 at the initiation of the event did not become contaminated and therefore decontamination was not necessary. Later the same day it was determined that subsequent re-entry into the Confinement Building was necessary to take helium sweep gas and primary samples and initiate shutdown cooling. A control point was set up at the north vestibule entrance to confinement, and four individuals were briefed on the radiological conditions and, wearing protective clothing, entered the confinement building at approximately 1205 to collect samples and initiate shutdown cooling.

Upon returning from collecting samples and initiating shutdown cooling, individuals doffed their protective clothing and were sent to the B-Wing showers to decontaminate themselves. Ten NCNR staff were instructed to go to building 245 the next morning for bioassay measurements.

Reactor operators noted that it was difficult to monitor the reactor from the ECS because the only way to view the console was with a security camera in the control room. It was noted that the camera did not have a view of the leak detector system. It is recommended that the ECS be enhanced to improve utility during an emergency [ER-O3]. The current monitoring system, Reactor at Your Desktop, was identified as a useful tool to monitor conditions and to see if actions taken reduced radiation levels.

During the event and through short-term recovery efforts, NCNR staff collected air samples at the 400-meter site boundary and confinement exhaust stack. Based on analysis of these samples, NCNR determined that the radiological conditions were less than emergency action levels (EALs) in the Emergency Plan (Ref. 11) for both an alert and notification of unusual event, and terminated the event on 2/3/21 at 1935. Health Physics and Reactor Operations staff noted that the Emergency Plan and EIs specify EALs which may be used as thresholds for initiating appropriate emergency measures. However, it was noted that EIs do not specify how to make measurements, interpret results, and perform calculations necessary to downgrade or upgrade emergency classes. A subsequent detailed analysis determined that recommendations to downgrade could have actually been made earlier. It is recommended that the NCNR develop guidelines that outline methods for making measurements, interpreting results and performing calculations and dose projections used as the basis for radiological protective actions and to upgrade and downgrade emergency classes [ER-O4].

The subcommittee also discussed training on emergency response with Operators and Health Physics personnel. Operators train on the emergency plan (Ref. 11) as part of their initial licensing process. Health physics staff and operators continuously develop emergency response skills through drills and exercises. Additional tabletop drills are conducted monthly at the crew level. After-action reports have been issued following exercises, but it was noted that distribution of the reports and debriefing could be improved [ER-O5]. Some staff have prior experience outside of the NCNR that helps prepare them for responding to an emergency.

Interviewees also suggested that training should be more adaptive and test critical thinking skills [ER-O6]. Practical drills are needed, e.g., simulating audible alarms that could go off and induce panic. It may be helpful if individuals were more comfortable running drills.

In summary, the event that occurred on February 3, 2021 put to the test several, if not all, of Reactor Operations and Health Physics processes, procedures, and equipment. The safety systems functioned as intended, defined roles were fulfilled quickly and correctly, and defined processes and procedures were implemented. This event provided the opportunity to observe emergency response in action. Several lessons learned from this event have been captured as observations with specific suggestions for program improvements in the table below.

Table 1. Observations and Suggested Program Improvements Associated with Event Response

| Observation | Description | Suggested Program Improvement | Source |
|---|---|---|---|
| ER-O1 | No list of items or systems to check prior to evacuation of the control room. | Develop a definitive checklist for use during an evacuation. | ERCAS |
| ER-O2 | The potential for CO2 levels to build up in low-lying areas and present an O2 deficiency hazard was not identified prior to the incident. This was investigated separately. | Communicate and implement the corrective actions identified as part of a separate root cause evaluation of the incident (IRIS 21-IG-0017; Ref. 12). | ERCAS |
| ER-O3 | Emergency Control Station (ECS) monitoring and control capabilities need improvement; there were difficulties monitoring specific reactor systems following the event. | Develop and implement a plan to improve monitoring and control capabilities to improve the utility and usability of the ECS; consider formalizing the use of the Reactor at Your Desktop system for monitoring and assessing plant conditions. | ERCAS |
| ER-O4 | Emergency plan and associated EIs specify emergency action levels and who makes the decision to declare and downgrade an emergency class. They do not specify how to make measurements, interpret results, perform calculations, and make on- and off-site dose estimates. | Develop guidelines that outline methods for making measurements, interpreting results, performing calculations, and making dose projections (e.g., dose projections that are used as the basis for radiological protective action recommendations and those used to upgrade and downgrade emergency classes). Note: The term "guidelines" is used and not "procedure" as it should provide suggested strategies and implementation methods. Within the guidelines, staff would have the latitude to respond as necessary to unpredictable or dynamic situations. | ERCAS |
| ER-O5 | Emergency drills and exercises are held and follow-up critiques are conducted. Many staff are unaware how deficiencies identified during follow-up critiques are tracked and used to form the basis for training and procedure updates. | Develop a process to communicate and track deficiencies identified during follow-up critiques; ensure corrective and preventive actions are assigned appropriately and tracked for timely resolution. | ERCAS |
| ER-O6 | Emergency drills and exercises need improvement. | Ensure emergency drills and exercises are rigorous, and include diverse (and not only predictable) scenarios, implemented in a way that adequately challenges the emergency response organization and helps to identify, correct and communicate performance deficiencies with the goal of enhancing performance during an actual emergency. | ERCAS |

6. REVIEW OF TWG REPORT: CIRCUMSTANCES, CAUSES AND CORRECTIVE ACTION PLAN

The TWG convened by the NCNR Director evaluated the circumstances of the event, determined the direct, contributing and root causes and developed a corrective action plan designed to mitigate these causes and prevent recurrence of this type of event. This work is described in the TWG Report (Ref. 5). The TWG provided an addendum to the report, dated June 3, 2021 (Ref. 6), based on a subsequent evaluation of the tools used to perform latch checks. This work demonstrated that it was possible to inadvertently unlatch a fuel element with a relatively small impulse force generated by dropping the pickup tool onto the element head without any rotational force being applied. The TWG noted that this finding did not alter their root cause analysis or the associated corrective action plan.

This subcommittee (ERCAS) evaluated the TWG report and found the description of the event, including the timeline, the analysis of precursor circumstances and determination of the direct cause (an unlatched fuel element), to be complete, correct, and consistent with information obtained by this subcommittee through review of documents and interviews with management, supervisors, reactor operators and reactor engineers.

This subcommittee concurs with the TWG's list of five (5) root causes and fifteen (15) corrective actions, adds two (2) additional root causes, and recommends eight (8) additional corrective actions.

The ERCAS analysis and recommendations are presented in Sections 7-10. Each section contains a narrative description of the analysis and a table of root causes, contributing factors, and associated recommended corrective actions. This subcommittee endorses the TWG's root causes and corrective actions and, for completeness, has included these in the tables together with those developed by this subcommittee. The ERCAS also included relevant observations and suggested program improvements for consideration by NCNR management.

Suggested program improvements (SPIs) are not corrective actions; that is, the subcommittee does not believe the suggested improvements are necessary to prevent recurrence of this type of event.

7. MANAGEMENT SYSTEMS

In determining the contributing factors to this event, the following management-related causal factor categories were evaluated: organizational structure; management oversight; employee relations and communications; administrative policies related to change management.

Organizational Structure. The organizational structure of the NCNR was described as relatively flat. The Deputy Director of the NCNR is the Chief of Reactor Operations and Engineering (CROE) and oversees the Chief of Reactor Engineering (CRE) and the Chief of Reactor Operations (CRO). The CRO supervises the Crew Chiefs, who supervise reactor operators. Oversight of daily tasks performed by reactor operators is provided by Reactor or Shift Supervisors. Interviews conducted with individuals at all levels within the organization revealed no structural barriers to safe operations, safety-related communications or raising safety-related concerns. Reactor operators noted that they had the ability to phone anyone in their chain-of-command as necessary and stated that they had access all the way to the top. There were no CAs or SPIs regarding the organizational structure.

Responsibilities of staff and supervisors are generally well understood, but greater clarification is needed on the requirements to become a crew chief (see MS-SPI2).

Reactor engineering staff and reactor operations staff reside in different working groups; these groups collaborate on development of tools and equipment used by operators. This cross-divisional collaboration works well for tool development but could be improved by extending the collaboration to the full life-cycle of engineered items used by reactor operators (See MS-CA4 and MS-SPI4).

Management and Supervisor Oversight. The TWG found inadequate management oversight to be a root cause of this event. The ERCAS agrees with this assessment. The following corrective actions recommended by the TWG are sufficient to address this root cause:

  • Develop a program for robust qualification of supervisors overseeing refueling operations (MS-CA5)
  • Require training for supervisors on oversight (MS-CA6)
  • Revise procedures to be consistent with INPO 11-003, Ref. 13 (PR-CA6)

While responsibilities of crew chiefs and reactor (shift) supervisors are clear and understood by all staff, several interviewees reported inconsistencies in how supervision is exercised. The level of guidance provided to trainees differs from one crew to the next. Providing oversight and mentoring of crew chiefs and supervisors to improve consistency is a suggested program improvement (MS-SPI6; see also QT-CA5).

Employee Relations. In general, the staff of the reactor operations group respect each other, their supervisors, and their managers. In recent years, staff morale has been adversely impacted by rate of attrition and inability to gain proficiency in specific tasks due to shift-locking. The current shift rotations are such that individuals do not rotate through all of the different operational tasks required to gain required proficiencies. The ERCAS believes the corrective actions recommended by the TWG (specifically, QT-CA1, QT-CA2) will address this insufficiency. Interviewees also expressed a desire for greater engagement of top managers in daily operations, and improved communications on management decisions that may affect reactor operations. ERCAS suggests NCNR enhance engagement of managers with the crews to improve employee relations. (See MS-SPI5.) The NCNR should also consider implementing the NIST Safety Management Observation Program, or otherwise practice management by walking around to establish an ongoing dialog on operational safety.

Complacency was identified as a root cause of this event (MS-RC3). Several interviewees, including managers, supervisors, and staff, described the culture as one characterized by complacency, noting that employees did not believe that it was their responsibility to improve operations, but rather just to get the job done. Employees did not report difficulties with tools or procedures, because they were not expected to do so. Many actions are underway to improve staff engagement and ERCAS commends these actions. Following this event, processes have been implemented to:

  • Ensure attention to detail, including recent implementation of the revised AR 1.0 (Ref. 14);
  • Ensure operator ownership of procedures (including emphasis on existing processes for updating and revising procedures and planned improvements to these processes);
  • Enhance staff participation via participation in CARRI teams convened to recommend corrective actions and program improvements to address root and contributing causes to this event.

ERCAS commends and endorses these actions and recommends continuing to involve staff in the continuous improvement of reactor operations through participation in a standing preventive action program that encourages and rewards proactive efforts to improve quality, safety, and efficiency of operations (MS-CA7).

Administrative Policies: Change Management. A robust and comprehensive change management program can help prevent safety-related incidents and events by ensuring that changes in organizational structure, staffing, processes, procedures, tools and equipment are planned, evaluated, reviewed, and communicated to affected parties; that is, changes are implemented in a controlled manner. The NCNR has several very good processes in place to manage changes including processes to manage procedure revisions and changes to equipment and tools, but there are some gaps in change management processes (discussed below). One contributing factor pointed to inadequate management of change in staffing proficiency. Other observations point to possible improvements in processes such as review of procedure changes and managing the life-cycle of tools and equipment. Therefore, ERCAS identified inadequate change management as a root cause of this event and recommends development of a change management framework to evaluate sufficiency of existing change management processes and to identify gaps and areas for improvement (MS-CA1).

There was no formal process to help manage changes in staffing. Turnover in the position of Chief of Reactor Operations (CRO), attrition of reactor operator staff, changes in the level of experience and proficiency of new hires, and changes in operator schedules were noted in the TWG report. The CRO position is integral to operational consistency, quality, efficiency, and safety. This person oversees the crew chiefs and is the second level supervisor for all reactor operators. The CRO serves as the conduit between operators and upper management. Changes in this position have the potential to affect consistency of operations. The current Acting CRO is highly competent and well-respected, but is serving in an acting capacity, so there may be additional turnover. A suggested program improvement is to develop a process to manage CRO changes that ensures duties of this position are fulfilled during transitions and effectively transferred to new CROs (MS-SPI1).

Given that the nature of the work is specific to the NBSR, reactor operators are trained in-house, typically by more experienced operators. Attrition, specifically the loss of experienced staff, affected crew proficiency. Unlike most hires from previous decades, many new hires are not former Navy and have varying levels of experience and task proficiency. The training program must evolve to ensure efficacy for all new hires. ERCAS believes that corrective action MS-CA2 recommended by the TWG, "Develop system for knowledge and skills management in the presence of personnel attrition," will adequately address this insufficiency. The ERCAS suggests that the system developed as part of this corrective action include analysis of the types of positions and staffing levels needed to support reactor safety and reliability. It was stated that a change from 28-d shifts to 56-d shifts limited the ability to gain proficiency in specific tasks. However, the Acting Chief of Reactor Operations noted that reactor operators work 28-d shifts and the reactor is operated on a 38-d run cycle with 18 days between cycles, and that this is what contributes to shift-lock such that operators do not rotate through all tasks required to gain needed proficiencies. The system to manage skills should provide alternatives to shift locking to ensure staff gain and maintain proficiency in all required tasks.

Differences in replacement tools affected the accuracy of height checks and gradual changes in tools (wear) affected operator proficiency with tools. In addition to TWG corrective actions MS-CA3 and MS-CA4 which address tool efficacy and maintenance, ERCAS suggests as a possible program improvement, a process to manage engineered equipment and tools over the full life-cycle of the items. (See MS-SPI4.)

There are several processes by which operators may update or modify procedures (Ref. 15). They may contact the Quality Assurance Program Manager, initiate changes using the Trouble Ticket system or, with reactor supervisor approval and when necessary and appropriate, make pen-and-ink changes in real time to address specific situations. Some of these procedures involve use of engineered items. Reactor Engineering Group staff are not necessarily part of the review and approval process, and there is no requirement to inform this group of changes to procedures involving use of items they design and fabricate. A suggested program improvement is to ensure that Reactor Engineering Group staff are informed of changes to procedures that specify use of engineered items to accomplish the task. (See MS-SPI3.)

In accordance with Tech Spec 6.2, and specifically 6.2.5, the Safety Assessment Committee, a committee of experts external to NIST and appointed by the NCNR Director, is tasked with providing an annual independent review or audit of reactor operations. The SAC reports in writing to the NCNR Director. NCNR management receives SAC recommendations and may implement changes in response to these SAC recommendations.

Interviewees noted that while NCNR management informs the SEC of the results of the SAC review, operations staff are often unaware of the SAC recommendations and NCNR actions taken in response to these recommendations. A suggested program improvement is to develop a process to communicate SAC recommendations and associated NCNR actions to affected staff. This could be accomplished by tasking the SEC with tracking these items and communicating status of actions to those impacted (see MS-SPI7).

Table 2. Management Systems Related Root Causes, Contributing Factors, Recommended Corrective Actions, Related Observations and Suggested Program Improvements.

| Item | Description | Recommended Corrective Actions / Suggested Program Improvements | Source |
|---|---|---|---|
| Root Cause MS-RC1 | Change Management Program needs improvement. | MS-CA1: Develop and implement a change management framework to evaluate sufficiency of existing change management processes, identify gaps and areas for improvement. | ERCAS |
| Contributing Factor MS-CF1 | Management of staffing changes (people and shifts) needs improvement to ensure adequate proficiency. | MS-CA2: Develop system for knowledge and skills management in the presence of personnel attrition. | TWG |
| Observation MS-O1 | Management of changes in key management and supervisory positions needs improvement. | MS-SPI1: Develop a process to manage CRO transitions based on specific requirements and duties of this position. | ERCAS |
| Observation MS-O2 | Career path to achieve the supervisory position of crew chief is unclear to staff, viewed as time in grade rather than competency-based. | MS-SPI2: Define the career path to becoming a crew chief, including defining the skills required to fulfill role and responsibilities. | ERCAS |
| Observation MS-O3 | Processes used to modify procedures do not expressly require review by or notice to Reactor Engineering staff when engineered items are used in the procedure. | MS-SPI3: Review processes for modifications to procedures that involve use of engineered items and communicate changes to Reactor Engineering staff. | ERCAS |
| Observation MS-O4 | Processes for managing the full life-cycle of engineered equipment, tools and parts need improvement. | MS-CA3: Assess efficacy of all tools and determine necessary improvements. MS-CA4: Prioritize and elevate the Aging Reactor Management program emphasizing oversight of communications between groups and ensuring that maintenance and other issues identified are resolved. MS-SPI4: Develop a process for RO/RE to collaboratively manage engineered items over the full lifecycle, from development, acceptance testing, through use, routine PM and replacement that includes stakeholder engagement and feedback. | TWG (MS-CA3), TWG (MS-CA4), ERCAS (MS-SPI4) |
| Root Cause MS-RC2 | There was inadequate management oversight of refueling staffing. | MS-CA5: Develop program for robust qualification of supervisors overseeing refueling operations. MS-CA6: Require training for supervisors on oversight. | TWG (MS-CA5), TWG (MS-CA6) |
| Observation MS-O5 | Management engagement needs improvement. | MS-SPI5: Increase management engagement with reactor operators, i.e., establish an ongoing dialog on practices, issues and planned changes. | ERCAS |
| Observation MS-O6 | Supervisor oversight is implemented inconsistently. | MS-SPI6: Provide oversight and mentoring for supervisors to ensure greater consistency in practices among crew chiefs. | ERCAS |
| Root Cause MS-RC3 | Culture of complacency characterized by a lack of staff ownership of continuous improvement. | MS-CA7: Develop a plan for involving staff in continuous improvement of reactor operations, through participation in a preventive action program that encourages and rewards proactive efforts to improve quality, safety, and efficiency of operations. | ERCAS |
| Observation MS-O7 | Recommendations from the SAC are reviewed, dispositioned, and tracked by management. Many staff are unaware of these actions. | MS-SPI7: Develop a process to communicate SAC recommendations and NCNR actions taken to address recommendations; consider establishing an SEC subcommittee to track corrective and preventive actions implemented in response to recommendations from SAC and other external review committees. | ERCAS |

8. QUALIFICATION AND TRAINING PROGRAM

Reactor operations staff are essential to the safe and reliable operation of the NBSR. The TWG report correctly identified the lack of a robust training and qualification program as a root cause. During interviews, reactor operators stressed the need for hands-on training, e.g., with a test stand designed to allow practice not only on fuel movement but also on element latch checks, and a robust written procedure rather than reliance on experience and feel to get it right. Corrective actions associated with training to prevent a recurrence of an unlatched element are satisfactory [QT-CA1 and QT-CA2; see also IE-CA6].

The current iteration of the NCNR reactor operator training program relies heavily on self-motivation of trainees to prepare themselves for the NRC senior reactor operator (SRO) exam. Previously, the training program was structured more as an apprenticeship. Crews should be encouraged to work as a team to transfer knowledge, best practices, and actively coach new trainees. Operators identified inconsistent use of qualification cards across crews and a training stand that is nothing like the reactor top. Training materials should be developed with clear learning objectives [QT-CA3]. The training experience must be consistent across crews such that trainees are able to demonstrate comprehension and meet performance expectations [QT-CA4] when evaluated by supervisors [QT-CA5].

The crew performing the latch checks on January 4 had either limited or no experience as reactor-top crew members. Managers are encouraged to identify the knowledge and skills needed for inclusion in the training program to ensure each crew has sufficiently qualified personnel for all tasks required [QT-SPI1]. The focus of the fuel handling training was on the shuffle due to risks associated with dropping an element. Operators stated that they had not understood that an unlatched element may result in a failure of the cladding and a fission product release. The training program should include well-defined learning objectives that cover normal and off-normal conditions [QT-SPI2]. The input and experience of licensed operators and trainees should be used to continuously improve the training program [QT-SPI3]. The entire NCNR benefits from a well-defined training program that creates knowledgeable and technically competent SROs.

Table 3. Qualification and Training Related Root Causes, Contributing Factors, Recommended Corrective Actions, Related Observations and Suggested Program Improvements.

| Item | Description | Recommended Corrective Actions / Suggested Program Improvements | Source |
|---|---|---|---|
| Root Cause QT-RC1 | The training and qualification program for operators was not on par with programmatic needs. | QT-CA1: Require proficiency training for personnel prior to all refuelings, emphasizing the importance of latching and procedural compliance. QT-CA2: Develop program for robust qualification of operators and candidates in moving fuel. | TWG |
| Contributing Factor QT-CF1 | Some training materials used for latch checking are ineffective and others are not consistently used (e.g., qualification cards). | QT-CA3: Training materials, such as qual cards and experience with use of fuel handling stand, should reflect learning objectives. | ERCAS |
| Contributing Factor QT-CF2 | Training experience differs greatly among crews. | QT-CA4: Provide consistent and structured training and immediate and continual feedback to trainees during OTJ training to ensure comprehension of performance expectations. QT-CA5: Develop consistent standard by which all supervisors evaluate qualifications. | ERCAS |
| Observation QT-O1 | Job descriptions and expectations are unclear. | QT-SPI1: Develop an understanding of the knowledge and skills for inclusion in the training program to meet the job performance expectations. | ERCAS |
| Observation QT-O2 | Insufficient training on normal vs. off-normal conditions. | QT-SPI2: Identify learning objectives for the training program to support successful job performance. | ERCAS |
| Observation QT-O3 | Communication of lessons learned not consistently shared across crews. | QT-SPI3: Continuously evaluate and revise the training based on the performance of licensed SROs on the job. | ERCAS |

9. PROCEDURES

Review of NBSR Procedures. The ERCAS reviewed various aspects of the procedures used for operating the NBSR including the overall structure for NBSR procedures; the processes by which procedures are maintained, updated, and reviewed; and the attitude of the operations staff with regard to following procedures. The review consisted of conducting interviews with operations staff and management personnel and reviewing selected documents (Refs. 14, 15, 16). Regarding the NBSR procedures, the ERCAS identified no additional root causes beyond those identified by the TWG.

Structure of NBSR Procedures. The NBSR operates under a comprehensive set of procedures including: Operating Instructions, Technical Specification Procedures, Surveillance Procedures, Maintenance Procedures, Annunciator Instructions, Emergency Instructions, Administrative Rules, Health Physics Procedures, and Health Physics Instructions. NBSR Procedures are grouped according to type. Operating Instructions are broken down by systems and sub-systems to assist the operators. A master procedure lists all systems that need to be operable for reactor operations to commence. Maintenance Procedures are used when servicing equipment for repair or preventive maintenance, and not when operating the equipment to support reactor operations. Sub-categories of Maintenance Procedures include troubleshooting, routine maintenance, and preventive maintenance.

The taxonomy, purpose, and applicability of rules, procedures, and instructions are clear to users and no changes or improvements are recommended in these areas.

An important root cause identified by the Technical Working Group (TWG) and endorsed by ERCAS is that procedures as written do not capture necessary steps in assuring fuel elements are latched [PR-RC1]. The TWG recommended four specific corrective actions [PR-CA1-4]. A detailed review of the procedure for latching fuel elements (Ref. 16) and analysis of associated equipment and tools resulted in recommendations for additional corrective actions. These are presented in Section 10, Instruments, Equipment, and Tools.

Procedure Revisions and Version Control. Changes to procedures are typically identified when executing a procedure, during operator training or via an audit. Administrative Rule 5.0 (Ref. 15) permits three ways to update a procedure: initiating a trouble ticket, notifying the Quality Assurance Program Manager (QAPM) and requesting they make the change, and self-initiated change. All methods result in the revised version being routed for review and approval. Some operators expressed the opinion that self-initiated changes to procedures proceed more expeditiously than the other methods.

Operators are permitted to make pen and ink changes to procedures for non-substantive changes that do not alter reactor safety, scope or intent of the procedure, and as necessary to reflect minor changes or deviations from practices in real time. Reactor Shift Supervisors and Crew Chiefs are authorized to approve pen and ink changes. Pen and ink changes to procedures may be incorporated as permanent changes by revising the procedure following AR 5.0 or when the procedure is up for formal review.

The Engineering Change Notice (ECN) process may also result in changes to procedures, and every ECN contains a place to list all procedures that need to be updated as a result of the ECN. Staff do not use the ECN mechanism solely as a means to institute a change to procedures, as this is not the purpose of the ECN process and would be cumbersome. Operator staff are mindful of the need to avoid creating barriers to needed changes; it was noted that a complicated process is a disincentive for making them.

The QAPM is responsible for ensuring that updated procedures are reviewed and approved, that an official version of the approved procedure is posted to a designated on-line location, and that a current hard copy is maintained in the control room. Uncontrolled spare copies are not retained. Reviewers are selected from among those who did not make the change but who are associated with the team principally responsible for the affected system, along with other knowledgeable persons. All revisions are approved by the Chief of Reactor Operations (CRO); when safety-significant changes are made, review by the SEC and the Chief of Reactor Operations and Engineering is required. Once procedures are revised, the CRO notifies all parties impacted by the change. If personnel need a copy of a procedure, they know that the R-drive is the designated repository of the most recent controlled version of the procedure.

The ERCAS finds that the existing processes to revise, review and reissue procedures are fully adequate and ensure provisions of 10 CFR 50.59 are met. Although the existing processes are fully adequate, NCNR staff are making changes to improve them. The new process will, among other things, show the history of changes so that users and reviewers are better informed of a given procedure's genesis and evolution prior to implementing a suggested change. It was noted by some interviewees that it is difficult to anticipate all the downstream impacts of changes on other procedures, particularly for changes made outside of the ECN process. Changes to review processes should be designed to include evaluation of potential downstream impacts.

The process for making changes to procedures, including the 10 CFR 50.59 process, is addressed during operator requalification but is not included in initial training. A suggestion for program improvement is to provide this training earlier in the operator qualification phase so that operators better understand the importance of adherence to procedures. This observation is neither a root cause of the event nor a contributing factor, and the associated suggested program improvement is provided for consideration to help improve the overall conduct of operations.

Adherence to Procedures. Prior to commencement of activities to be performed on nuclear safety systems, the Crew Chief conducts a pre-job briefing on the process, and associated procedures are reviewed with staff prior to performing the evolution. Staff are given the opportunity to ask questions and seek clarification on issues prior to commencement of work. The Crew Chief ensures that proper staffing and resources are available for planned operational activities scheduled to take place on shift; if not in place, the Crew Chief will render a decision about whether to proceed with plans. By and large, the interviewees expressed confidence in the new processes used to help ensure adherence to procedures at the NBSR and that these processes fully address the operational aspects, even as changes are periodically identified.

When a shift ends before operators have completed a procedure, the shift turnover can be difficult. To facilitate mid-procedure hand-off to the next shift, improve clarity and minimize confusion, operators now use turnover logs to document where they were within a procedure. Additionally, the newly implemented Circle - X process is used to document completion of specific procedural steps on a step-by-step basis. Operators remarked that the new process makes things less stressful and that they are more comfortable executing procedures.

Formal classroom training used in operator qualification focuses on reactor systems, which may reference procedures, but the training is focused on systems, not on procedures. Once the trainee has qualified, they receive on-the-job instruction on the use of procedures. However, a root cause identified by the TWG [PR-RC2] and endorsed by this subcommittee is that procedural compliance was not enforced. The corrective actions identified by the TWG require integration of instruction on compliance with and proper use of procedures into the formal operator training [PR-CA5] and revising procedures to be consistent with INPO 11-003 [PR-CA6].

ERCAS believes that these TWG corrective actions are sufficient to address this root cause.

Table 4. Procedure-Related Root Causes, Contributing Factors, Recommended Corrective Actions, Related Observations and Suggested Program Improvements.

Root Causes and Contributing Factors

Root Cause PR-RC1: Procedures as written do not capture necessary steps in assuring elements are latched.
    Recommended Corrective Actions:
    PR-CA1: Rewrite OI 6.1 and OI 6.2 to capture detail of fuel and latch movements to align with training.
    PR-CA2: Reinstitute requirement for latch checks prior to final pump restart and modify OI 2.1.
    PR-CA3: Institute method of visual checks.
    PR-CA4: Institute a redundant rotation latch check, performed by a second individual.
    Source: TWG

Root Cause PR-RC2: Procedural Compliance was not enforced.
    Recommended Corrective Actions:
    PR-CA5: Update procedures to require training for all personnel on procedure adherence.
    PR-CA6: Revise procedures to be consistent with INPO 11-003.
    Source: TWG

Related Observations

Observation PR-O1: Training on the process to revise procedures and associated 10 CFR 50.59 requirements is included in requalification but not initial training.
    Suggested Program Improvement:
    PR-SPI1: Consider providing this training on initial qualification so that operators understand better the importance of adherence to procedures.
    Source: ERCAS

10. INSTRUMENTS, EQUIPMENT AND TOOLS

The ERCAS, like the TWG, conducted a detailed review of the key elements of instruments, equipment, and tools used in the procedure for latching fuel elements. The interaction of the design and condition of these elements, and the procedures used to latch fuel elements was examined to determine any causal or contributory roles in the February 3 event.

The TWG focus on instruments, equipment and tools identified inadequacies in the fidelity of latch determination equipment and tools as a root cause. This led to the inability of operators to make an indisputable assessment of the configuration of the latch mechanism. Moreover, it was discovered post-incident that the use of these tools can impart an impulse to the fuel element head causing it to become unlatched.

Other aspects of instruments, equipment and tools were recognized as either contributing directly to the event or needing improvement to address another root cause. The TWG and the ERCAS review also shed light on opportunities for improvements to instruments, equipment and tools that could yield positive safety impacts.

The ERCAS concurs with the TWG assessment that inadequacies in the fidelity of latch determination equipment and tools are a root cause associated with the February 3 event [IE-RC1]. Prior to this event, the only ways to ascertain fuel element latch status were to use the rotation check tool to perform a rotation latch check, which provides a direct indication of latch status, or to use other checks of height measurements or comparison to height references to make inferences about the latch status. The height-check tools and methods, however, lack the accuracy needed to distinguish the small height difference between a fully-latched element and one that is not fully-latched [IE-CF2.1]. Furthermore, the height-measurement methods gave a false sense of security about latch status. ERCAS suggests the NCNR consider discontinuing the use of height checks to verify fuel element latching [IE-CA4].

During the TWG analysis, it was discovered that the fuel element head design stores torsional energy introduced during the latching. The stored energy is such that it tends to produce a counter-rotation that takes the fuel element out of the fully-latched condition. If not disturbed, a fully latched element cannot spontaneously unlatch because the latch bar is captured in a recess. However, if an impulse is imparted to the fuel element head, the result can be a release of torsional energy and (counter) rotation of the latch bar to a partially-latched state. While a properly executed rotation check yields a valid confirmation of a fully-latched element and allows detection (and latching) of a partially-latched element, friction and difficulty in using the tools can cause an operator to bump the fuel head. This action can impart an impulse that results in the latch bar disengaging from the retaining recess and a fuel element that is not fully latched. There is no safeguard to prevent a partially latched fuel element from becoming unlatched and gaining the freedom to rise out of the lower grid plate and disrupt or block primary cooling flow through the element [IE-CF2.2].

The TWG and ERCAS both recommend instituting a method of visual latch checks. The ERCAS further clarifies this as the use of a camera or video device to provide indisputable proof of latching (e.g., visual inspection of a portion of the latch bar monolith assembly that would indicate unambiguously its rotational position) [IE-CA1/PR-CA3]. Once developed, TWG and ERCAS recommend documenting the effectiveness of visual latch checks in preventing unlatched elements [IE-CA2]. Implementing a visual check as the last word on latch status prior to the final pump start before reactor startup, along with procedures and administrative controls allowing no tool contact with the fuel element head [IE-CA5], are necessary to prevent a recurrence of the February 3 event.

(NOTE: It may seem like an obvious corrective action to make design changes to the fuel head to eliminate the ability to store torsion in the spring. While solutions can be envisaged to accomplish this and effect an engineering control (e.g., thrust bearings and non-coil springs), great care and consideration must be taken before making changes to the existing design to avoid unanticipated adverse consequences. A design change should only be considered as a long-term approach involving extensive testing before implementation. The ERCAS believes that procedural [IE-CA1] and administrative [IE-CA5] controls outlined in the corrective actions provide an adequate safeguard against spontaneous unlatching of a fuel element.)

Difficulty in tool use was seen as a contributing factor because it degrades the operator's sense of feel that is necessary when loading fuel, makes tool movements more abrupt (increasing the chance for bumping the fuel head and unlatching an element), and increases operator fatigue. A combination of all of these is seen as contributing to the February 3 event. Imprecise alignment of the index plate has a direct effect on tool-use difficulty [IE-CF1]. Because the index plate serves as the upper guide for the fueling tools, any mismatch with reactor features below the index plate (upper and lower grid plates and reactor plug) can add a side load and additional friction to the tools. In addition, there should be fiducial marks on the top of the index plate at each fuel position to provide a rotational reference for tool maneuvers, but marks for this purpose were never placed or have long since faded. The original design featured pins to preserve the original alignment with the reactor plug and upper and lower grid plates during cycles of removal and replacement, but these were damaged, subsequently removed, and never replaced. The TWG and ERCAS both recognize the need for alignment and consistent positioning of the index plate and recommend modifications so that it is consistently positioned, and the addition of clear rotational fiducial marks at each fuel position [IE-CA3].

The TWG investigated the potential impact that differences between old fueling tools and new replacements might have had on the event. Interviews with operators and engineers indicate agreement that, while tools manufactured at tolerance extremes might add some difficulty in use, this was not a contributing factor. Nonetheless, it brought to light the fact that the engineering best practice of careful inspection using a coordinate measurement machine to document the as-built tool configuration was lacking. This quality assurance capability is universal for manufacturers of precision equipment such as fueling tools. ERCAS suggests that the NCNR require tool manufacturers to provide accurate dimensional inspection reports for comparison of the as-built article to drawing specifications [IE-SPI2].
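The comparison of an as-built article to drawing specifications suggested under IE-SPI2 can be illustrated with a minimal sketch. It assumes an inspection report is reduced to a list of measured features, each with a nominal drawing dimension and a tolerance band. The feature names, dimensions, tolerances, and the `InspectedFeature` and `out_of_tolerance` names are hypothetical and are not taken from any NCNR drawing, tool, or vendor report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InspectedFeature:
    """One line of a hypothetical dimensional inspection report."""
    name: str
    nominal: float    # drawing dimension
    tol_minus: float  # allowed deviation below nominal (positive number)
    tol_plus: float   # allowed deviation above nominal (positive number)
    measured: float   # as-built value reported by the inspection

    def deviation(self) -> float:
        """Signed difference between the as-built and drawing dimension."""
        return self.measured - self.nominal

    def in_tolerance(self) -> bool:
        """True when the as-built value lies inside the drawing tolerance band."""
        return -self.tol_minus <= self.deviation() <= self.tol_plus


def out_of_tolerance(features):
    """Return the names of features whose as-built value violates the drawing."""
    return [f.name for f in features if not f.in_tolerance()]


# Hypothetical report values, for illustration only.
report = [
    InspectedFeature("shaft_diameter", 25.0, 0.05, 0.05, 25.02),
    InspectedFeature("overall_length", 1000.0, 0.5, 0.5, 1000.8),
]
print(out_of_tolerance(report))
```

A report of this kind would also make it possible to see which features sit near a tolerance extreme, which is the condition interviewees associated with added difficulty in tool use.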

A root cause recognized by both the TWG and ERCAS was a sub-par training and qualification program (see QT section). The ERCAS sees the lack of opportunity to teach, practice and test proficiency in fuel handling and latching as a contributing factor because it impedes operators from gaining the needed experience and proficiency in this area [IE-CF3]. Operators have very limited access to the reactor top and the existing test stand only allows them to practice the fuel transfer operation. Ideally, operators would have a training platform designed to closely approximate the feel of tool use and fuel element movement and latching at the reactor top.

The ERCAS recommends increased access to the reactor for training purposes or the redesign or modification of the existing test stand to better simulate reactor top fuel loading, latching, and latch-checking experience [IE-CA6].

Because of the stationary nature of the fuel dislocation on February 3, deviations from the normal nuclear instrument channel signal pattern were too small to be discerned by reactor console operators. As configured, the nuclear instruments do not have sufficient sensitivity to detect a problem of this nature [IE-O1]. However, later inspection did show small deviations at the onset of the incident (due to boiling and moderator voiding within the fuel element flow channel). These small deviations might provide an opportunity for early detection and warning of an unusual circumstance. The ERCAS suggests that the NCNR explore nuclear instrument signal analysis tools capable of providing early detection and alarm notification to the console operator [IE-SPI1].
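As an illustration of the kind of signal analysis suggested under IE-SPI1, the following minimal sketch flags samples that depart from a trailing baseline by more than a set number of standard deviations. It is a generic rolling-baseline check, not a description of any NCNR instrument or analysis tool. The function name `detect_deviations`, the window length, the threshold, and the synthetic signal are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_deviations(samples, window=20, threshold=4.0):
    """Return indices of samples that deviate from the trailing-window mean
    by more than `threshold` standard deviations of that window.

    `window` and `threshold` are illustrative defaults, not tuned values.
    """
    history = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                flagged.append(i)
                continue  # keep the flagged sample out of the baseline
        history.append(x)
    return flagged


# Synthetic channel reading (hypothetical values): steady noise, then a step.
signal = [100.0, 100.2] * 20 + [103.0]
print(detect_deviations(signal))
```

Excluding flagged samples from the baseline is a design choice made here so that a sustained deviation continues to be flagged rather than being absorbed into the reference window. A practical tool would need thresholds validated against recorded channel data.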

Table 5. Instruments, Equipment and Tool-Related Root Causes, Contributing Factors, Recommended Corrective Actions, Related Observations and Suggested Program Improvements.

Root Cause and Contributing Factors

Root Cause IE-RC1: Inadequacies existed in the fidelity of latch determination equipment and tools.
    Recommended Corrective Actions:
    IE-CA1/PR-CA3: Institute a method of visual checks. (Source: TWG)
    ERCAS clarification: Specify use of a camera or video camera to provide indisputable proof that each element is fully latched. (Source: ERCAS)
    IE-CA2: Document that improved latching and latch check processes provide adequate defense against unlatching. (Source: TWG)

Contributing Factor IE-CF1: Imprecise alignment of index plate causes difficulty in tool use.
    Recommended Corrective Actions:
    IE-CA3: Modify index plate so that it is consistently positioned in the same place and rotational fiducial marks are clear. (Source: TWG)

Contributing Factor IE-CF2: Height check tools lacked fidelity.
    Recommended Corrective Actions:
    IE-CA4: Consider discontinuing use of height checks to verify latching. (Source: ERCAS)

Contributing Factor IE-CF3: Subsequent study showed that inadvertently bumping the fuel head with a tool can cause unlatching.
    Recommended Corrective Actions:
    IE-CA5: Put administrative controls in place (procedures) to assure no tool contact with fuel head following final visual latch verification prior to reactor startup.

Contributing Factor IE-CF4: Lack of hands-on training contributes to operator inexperience in loading/latching fuel.
    Recommended Corrective Actions:
    IE-CA6: Increase access to the reactor top for training purposes or redesign/modify existing test stand to better simulate reactor top fuel loading/latching/latch checking experience. (Source: ERCAS)

Related Observations

Observation IE-O1: No detectable indication of fuel dislocation to reactor console operator prior to fuel damage during climb to power.
    Suggested Program Improvements:
    IE-SPI1: Explore NI signal analysis tools capable of providing early detection/alarm of abnormal behavior. (Source: ERCAS)

Observation IE-O2: Dimensional differences between old tools and replacement tools potentially cause difficulty and confusion during use.
    Suggested Program Improvements:
    IE-SPI2: Require tool manufacturer to provide accurate dimensional inspection reports for comparison of as-built condition to drawing specifications. (Source: ERCAS)

REFERENCES

1. Letter from R. Dimeo and T. Newton to U.S. Nuclear Regulatory Commission. Report of NCNR declaration of Alert, dated February 16, 2021. ADAMS Accession No. ML21048A149.
2. Letter from T. Newton to U.S. Nuclear Regulatory Commission. Addendum to event report, dated March 4, 2021. ADAMS Accession No. ML21070A183.
3. Letter from R. Dimeo and T. Newton to U.S. Nuclear Regulatory Commission. Report of NCNR safety limit exceeded, dated March 5, 2021. ADAMS Accession No. ML21064A523.
4. Letter from T. Newton to U.S. Nuclear Regulatory Commission. Follow-up to event report, dated May 13, 2021. ADAMS Accession No. ML21133A266.
5. NCNR Technical Working Group, Root Cause Investigation of February 2021 Fuel Failure Revision 1, dated May 13, 2021.
6. NCNR Technical Working Group, Addendum to Root Cause Investigation of February 2021 Fuel Failure, dated June 3, 2021.
7. U.S. Nuclear Regulatory Commission, Interim Special Inspection Report No. 05000184/2021201, dated April 14, 2021. ADAMS Accession No. ML21077A094.
8. Letter from R. Dimeo to D. Pierce, Charge to SEC on the unplanned shutdown of February 3, 2021, dated May 13, 2021.
9. Technical Specifications for the NIST Test Reactor (NBSR), ADAMS Accession No. ML14204A628, Amendment No. 12, dated September 21, 2020.
10. NCNR Reactor Safety Evaluation Committee (SEC) Charter dated April 21, 2020.
11. NBSR Emergency Plan December 2008 as amended July 01, 2017.
12. NIST Incident Reporting & Investigation System, Case 21-IG-0017 Exposure to Hazardous Atmosphere, occurred February 4, 2021.
13. INPO 11-003, Guideline for Excellence in Procedure and Work Instruction Use and Adherence Rev. 0, dated June 2011.
14. Administrative Rules 1.0 Conduct of Operations, Rev C.
15. Administrative Rules 5.0 Procedures and Manuals, Rev A.
16. Operation Instruction 6.1, Fueling and Defueling Procedures, Rev E.