05000388/LER-2012-002

LER-2012-002, Unit 2 Manual Scram Due to Loss of the Integrated Control System
	Susquehanna Steam Electric Station Unit 2
Event date:	November 9, 2012
Report date:
Reporting criterion:	10 CFR 50.73(a)(2)(iv)(A), System Actuation
Initial Reporting
ENS 48496	10 CFR 50.72(b)(2)(iv)(B), RPS System Actuation, 10 CFR 50.72(b)(3)(iv)(A), System Actuation, 10 CFR 50.72(b)(2)(iv)(A), System Actuation - ECCS Discharge
LER closed by
	IR 05000387/2013011 (18 November 2013)
	3882012002R01 - NRC Website

CONTINUATION SHEET

Prior to the event, Susquehanna Unit 2 was operating in Mode 1 at approximately 90% power. Prior to the scram, the Human Machine Interface (HMI) for ICS became non-responsive and indicated unavailable data (cyan color data fields). The demand signal to the Reactor Feed Pump Turbine (RFPT) speed control system [EIIS System Code SJ] froze at the last known good value in accordance with the system configuration. Without a dynamic control signal, the reactor vessel level began to lower at a rate of approximately one inch/minute. The RFPT CONTROL SIGNAL FAILURE annunciator was received in the control room and was indicative of a loss of network communications affecting control system operation.
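As a rough illustration of the behavior described above, the following sketch models a demand output that holds its last good value when no valid data arrives, and estimates how long a steadily lowering level takes to cover a given span. The class and function names are hypothetical, the constant rate is a simplifying assumption, and the 33-inch and 25-inch figures are the pre-transient and operator-action levels quoted elsewhere in this report. It is not a representation of the actual ICS logic.

```python
from typing import Optional


class HeldDemand:
    """Hypothetical demand output that freezes at the last good value."""

    def __init__(self, initial: float) -> None:
        self.value = initial

    def update(self, new_value: Optional[float]) -> float:
        # None stands in for "no valid data received" (an assumption of this sketch).
        if new_value is not None:
            self.value = new_value
        return self.value


def minutes_to_reach(start_in: float, target_in: float, rate_in_per_min: float) -> float:
    """Minutes for level to fall from start to target at a constant rate."""
    if rate_in_per_min <= 0:
        raise ValueError("rate must be positive")
    if target_in > start_in:
        raise ValueError("target must not be above start for a falling level")
    return (start_in - target_in) / rate_in_per_min


demand = HeldDemand(initial=75.0)  # 75.0 is an arbitrary illustrative demand value
demand.update(76.5)                # normal update from the control system
frozen = demand.update(None)       # communications lost: the value is held

# Approximate time from about 33 inches to +25 inches at about 1 inch/minute.
estimate_min = minutes_to_reach(33.0, 25.0, 1.0)
```

Under these assumptions the estimate is on the order of eight minutes, which is only an order-of-magnitude figure since the actual rate was approximate.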

Operations attempted to take manual control via the control room HMI, but the HMIs remained non-responsive (system design did not at the time include a hard-wired manual control option). No other equipment was inoperable at the start of the event that contributed to the event.

Description of the Event

At approximately 0117 hours on November 9, 2012, Susquehanna Steam Electric Station Unit 2 was scrammed by plant operators due to a loss of the Integrated Control System (ICS) [EIIS System Code: JB], which is the system that controls the reactor feed and reactor recirculation systems.

The reactor operator placed the mode switch in shutdown when reactor water level reached +25 inches and lowering and then tripped reactor feed pump turbines A, B, and C. All control rods inserted and both reactor recirculation pumps [EIIS System Code: AD] tripped at -38 inches. Reactor water level lowered to -52 inches causing Level 3 (+13 inches) and Level 2 (-38 inches) isolations. High Pressure Coolant Injection (HPCI) [EIIS System Code BJ] and Reactor Core Isolation Cooling (RCIC) [EIIS System Code BN] both automatically initiated. HPCI was overridden prior to injection and RCIC was utilized to restore reactor water level to the normal band. All isolations and initiations at this level occurred as expected. No steam relief valves opened. Pressure was controlled via turbine bypass valve operation. All safety systems operated as expected.

Following the manual scram due to the ICS failure, operators established level control using RCIC as a source of injection. During this time, ICS Narrow Range (NR) Reactor Pressure Vessel (RPV) level indication was locked up at the pre-transient level of approximately 33 inches. A second scram signal was received at approximately 0420 hours on November 9, 2012 due to low water level during recovery from the initial scram. Reactor water level was +15 inches at the time of the trip.

Reporting Criteria

The initial scram was: 1) a reactor protection system (RPS) actuation with the reactor critical [10 CFR 50.72(b)(2)(iv)(B)], 2) a valid RPS actuation [10 CFR 50.72(b)(3)(iv)(A)], and 3) an emergency core cooling system (ECCS) injection [10 CFR 50.72(b)(2)(iv)(A)]. It was reported in accordance with 10 CFR 50.72 in EN 48496 at 0303 on November 9, 2012. The second scram signal that occurred at 0419 was not initially recognized as a reportable event and was not reported in accordance with 10 CFR 50.72(b)(3)(iv)(A) until 1620 on November 10, 2012 in EN 48500. These events are also reportable as an LER in accordance with 10 CFR 50.73(a)(2)(iv)(A).

The ICS is an implementation of an Invensys Intelligent Automation (I/A) Series Distributed Control System (DCS). The ICS utilizes six fault-tolerant pairs of Field Control Processors. These processors and their related input/output (I/O) subsystems are used to control reactor recirculation pumps, reactor feed pump turbine speeds, and reactor vessel water level. To accomplish these control functions, the processor pairs must communicate with other processor pairs. This communication takes place over a digital communication network referred to as the "Mesh Control Network" or just the "mesh." The mesh utilizes network cabling and multiport network switches to implement a digital communications network that provides multiple communication paths between any two devices on the network. If there is a communications problem between two network devices, the mesh should automatically establish a different connection utilizing a different path. This is accomplished via a software algorithm called Rapid Spanning Tree Protocol (RSTP), which manages the network traffic, eliminating system loops, minimizing data packet collisions, and providing fast switchover if a fault occurs.
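The redundancy idea in the paragraph above can be sketched with a small graph search. This is not RSTP, which builds a loop-free spanning tree by blocking redundant ports; it is a plain breadth-first search showing that an alternate path exists when one link is lost. The four-link topology and the device names (`FCP1`, `FCP2`, `SW_A`, `SW_B`) are hypothetical and do not describe the actual Susquehanna network.

```python
from collections import deque


def find_path(links, src, dst, failed=frozenset()):
    """Return one shortest path from src to dst, or None, ignoring failed links."""
    # Build an undirected adjacency map, skipping any failed link.
    adj = {}
    for a, b in links:
        if frozenset((a, b)) in failed:
            continue
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    # Breadth-first search, carrying the path taken so far.
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical topology: two processor pairs, each cabled to two switches.
LINKS = [("FCP1", "SW_A"), ("FCP1", "SW_B"), ("FCP2", "SW_A"), ("FCP2", "SW_B")]

normal = find_path(LINKS, "FCP1", "FCP2")
rerouted = find_path(LINKS, "FCP1", "FCP2", failed={frozenset(("FCP1", "SW_A"))})
isolated = find_path(LINKS, "FCP1", "FCP2", failed={frozenset(link) for link in LINKS})
```

With one link failed the search returns a path through the other switch; with every link failed it returns `None`, the analogue of the processors losing communication entirely.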

Background Information Associated with Monitoring RPV Level

With the ICS NR level indication locked up, operators were uncertain as to the reliability of the hard-wired instrumentation on the standby information panel (SIP) and relied on wide range (WR) instrumentation. At normal operating pressure, WR indicates approximately 10 inches below the NR instrument. As the plant cools down, these indications converge and then diverge, with WR reading almost 10 inches above NR.

Following RPV stratification, the water level band was changed from 45-54 inches to 13-30 inches in accordance with plant procedures. The second scram signal was received with NR levels greater than 15 inches (i.e., the scram signal occurred within allowable tolerance). WR indication was reading 23 inches when the scram signal was received.

CAUSE OF THE EVENT

Initial Scram

The direct cause of the initial scram was as follows:

C2 Series Switches had a Latent Design Deficiency. The ICS Core Switches (C2 Series) had repeated reliability issues over the product life, including numerous firmware issues dating back to May 2009, that could not be resolved by repeated firmware updates.

The root cause of the initial scram was as follows:

Less Than Adequate Evaluation of Deficiencies Associated with ICS Core Switch Design, Testing and Mitigating Strategies Resulted in Delayed Resolution without Understanding Risk Implications.

Multiple sources of information, including vendor, regulator, and industry OE, were not adequately evaluated by personnel responsible for oversight of the design and testing requirements of the ICS Core Switches, which led to non-conservative decision making and a known failure mechanism causing SSES ICS to fail and a Unit 2 manual scram.

Procedural guidance directed operators to a reactor water level band that could potentially cause an RPV level 3 SCRAM while maintaining procedural compliance. Emphasis was not placed on maintaining adequate margin to the RPV level 3 RPS actuation point following initial level recovery post SCRAM.
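The margin concern can be shown with simple arithmetic. The sketch below compares the bottom of the post-scram level band (13 inches) with the Level 3 setpoint (+13 inches), both taken from this report. The function name is hypothetical, and the 2-inch tolerance is an illustrative assumption drawn from the report's statement that the second scram signal occurred near 15 inches, within allowable tolerance.

```python
def band_margin(band_low_in, trip_setpoint_in, tolerance_in=0.0):
    """Margin (inches) between the bottom of a level band and the highest
    level at which the trip could actuate (setpoint plus tolerance)."""
    return band_low_in - (trip_setpoint_in + tolerance_in)


LEVEL_3_IN = 13.0  # Level 3 setpoint, inches, as stated in the report

# Bottom of the 13-30 inch band against the nominal setpoint.
margin_nominal = band_margin(13.0, LEVEL_3_IN)

# Same comparison with an assumed 2-inch instrument tolerance.
margin_with_tol = band_margin(13.0, LEVEL_3_IN, tolerance_in=2.0)
```

A zero or negative result means an operator can be inside the procedural band and still receive the trip, which is the condition the paragraph above describes.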

ANALYSIS/SAFETY SIGNIFICANCE

Actual Consequences:

Loss of ICS resulted in operators manually scramming Unit 2. Reactor vessel stratification challenged Operations; however, no Technical Specification limits were exceeded. Reactor Water Cleanup (RWCU) Letdown [EIIS System Code CE] was not available due to its tie to ICS. In accordance with NEI 99-02, this was classified as an Unplanned Scram with Complications due to loss of Feedwater.

Although the scram challenged the operators, the safety consequences of the event were well controlled.

The reactor operator placed the mode switch in shutdown when reactor water level reached +25 inches and lowering. All control rods inserted and both reactor recirculation pumps tripped at -38 inches. Reactor water level lowered to -52 inches causing Level 3 (+13 inches) and Level 2 (-38 inches) isolations. HPCI and RCIC both automatically initiated. HPCI was overridden prior to injection and RCIC was utilized to restore reactor water level to the normal band. All isolations and initiations at this level occurred as expected. No steam relief valves opened. Pressure was controlled via turbine bypass valve operation. All safety systems operated as expected.

The ICS responded as expected with the following exception: A High/Low Level alarm was received when level as indicated at the Standby Information Panel (SIP) was 25 inches. This alarm was expected at 30 inches. The alarm is driven off of selected level, which was impacted by the communication failure. By design, the indicators at the SIP panel do not go through the mesh network.

The secondary RPS actuation on low level following achievement of All Rods In was non-dominant from a PRA standpoint because the Manual SCRAM inserted three hours prior achieved All Rods In. Re-entry into the EOPs for an otherwise non-limiting event does not adversely affect the probabilistic risk results.

The secondary low level SCRAM was considered during the post-scram PRA analysis but was not specifically modeled in the timeline because it was not by itself a risk significant contributor in the PRA model.

Potential Consequences:

Had operators not taken manual control and had safety systems not responded as designed, additional challenges and complications could have arisen requiring the use of additional safety systems and potential entry into additional emergency operating procedures.

The efforts to restore feedwater and the need to re-enter the EOPs during an event in progress also represent an additional operator burden that could adversely affect overall crew performance.

From a qualitative standpoint, the additional tasks and alarms associated with post SCRAM complications, particularly use of the suppression pool as a heat sink, represent potential work environment and task demand error precursors for the crew. As an example, this could increase the potential of an operator taking an incorrect EOP action based upon an unknown instrument malfunction.

Although the scram challenged the operators, the safety consequences of the event were well controlled.

All safety systems operated as expected; therefore safety significant consequences of these events were mitigated.

Assessment of the risk associated with the initiating event determined that the increase in risk to Unit 2 was less than the NRC IMC 609 Appendix K Green/White Threshold of less than 1E-06 ICDP and less than 1E-07 ILERP.

CORRECTIVE ACTIONS

Key corrective actions include:

1. Unit 1 and Unit 2 ICS core switches were replaced with newer version switches (C5) and loop protection algorithms have been enabled.

2. NSEP-QA-0004, "Station Engineering Surveillance and Technical Procedures Preparation and Performance Guidelines," was revised to include risk considerations and expected responses to unexpected results in the development of test procedures.

3. NDAP-00-1600, "Technical Task Risk/Managed Defenses Assessment, Pre-Job Brief, Independent Third Party Review, and Post-Job Brief," was revised to include possible negative consequences from first time or non-routine evolutions and a process risk factor for testing that requires prediction of plant response.

4. MFP-QA-2310, "Engineering Change Testing," was revised to require factory acceptance testing to include dynamic testing of new equipment with significant risk impacts.

5. Technical Conscience training was provided to station senior leaders, engineering leaders, and engineers. The effectiveness of the training has been monitored by performing periodic rollup of coaching results for plusses and deltas, and the results of other processes that assess the quality of engineering deliverables.

6. The Unit 1 and Unit 2 Off-Normal procedures for reactor scrams were revised to change the level control strategy and improve guidance associated with level control.

The inability to restore feedwater necessitates the use of the more risk significant systems of RCIC or HPCI for high pressure injection / pressure control and represents the need to use the next level of defense in depth systems beyond normal feed. Both HPCI and RCIC exhaust to the suppression pool, which will increase heat load in the primary containment, requiring additional operator actions beyond those if feedwater remained available.

8. Operator training that included classroom training on Operator Fundamentals (Phase 1) and ICS (Phase 2), and simulator sessions designed to practice Operator Fundamental and Conservative Decision Making Concepts (Phase 3) was provided and has been institutionalized in the Operator License Training program.

9. The ICS system was modified to allow hard-wired single element control of the "B" Reactor Feed Pump in the event the mesh network is lost.

PREVIOUS SIMILAR EVENTS

Susquehanna has had three previous scrams associated with ICS. These events were as follows:

the Digital Feedwater Integrated Control System"

Susquehanna also had two recent LERs that involved a similar cause (switch failures):

One of the causes of this event was a design deficiency in a chiller circulation pump control switch.

The causes of this event included foreign material from the manufacturing process that prevented an ammeter switch from closing and design of the protective relay scheme that included a shared metering function.
