05000206/LER-1998-001, on 980114, Loss of Security Computers Was Noted. Caused by Equipment Failure. Rebooting of Primary or Backup Servers Will Be Manually Monitored to Prevent Failure

ML20202F455
Person / Time
Site: San Onofre 
Issue date: 02/12/1998
From: Krieger R
SOUTHERN CALIFORNIA EDISON CO.
To:
Shared Package
ML20202F266 List:
References
LER-98-001, LER-98-1, NUDOCS 9802190203


LER-1998-001, on 980114, Loss of Security Computers Was Noted. Caused by Equipment Failure. Rebooting of Primary or Backup Servers Will Be Manually Monitored to Prevent Failure
Event date: 01/14/1998
Report date: 02/12/1998
Reporting criterion: 10 CFR 50.73(a)(2)(iv), System Actuation

10 CFR 50.73(a)(2)(v), Loss of Safety Function

10 CFR 50.73(a)(2)(vii), Common Cause Inoperability

10 CFR 50.73(a)(2)(i)

10 CFR 50.73(a)(2)(viii)(A)

10 CFR 50.73(a)(2)

10 CFR 50.73(a)(2)(viii)

10 CFR 50.73(a)(2)(iii)
2061998001R00 - NRC Website


LICENSEE EVENT REPORT (LER)

Facility Name (1): San Onofre Nuclear Generating Station Unit 1
Docket Number (2): 05000206
Page (3): 1 of 3
Title (4): Loss of Security Computers
Event Date (5): 01/14/98
LER Number (6): 98-001-00
Report Date (7): 02/12/98
Other Facilities Involved (8): Unit 2, Docket Number 05000361; Unit 3, Docket Number 05000362
Operating Mode (9):
Power Level (10): 000

This report is submitted pursuant to the requirements of 10 CFR (check one or more of the following) (11):
Unchecked: 20.402(b), 20.405(c), 20.405(a)(1)(i), 20.405(a)(1)(ii), 20.405(a)(1)(iii), 20.405(a)(1)(iv), 20.405(a)(1)(v), 50.36(c)(1), 50.36(c)(2), 50.73(a)(2)(i), 50.73(a)(2)(ii), 50.73(a)(2)(iii), 50.73(a)(2)(iv), 50.73(a)(2)(v), 50.73(a)(2)(vii), 50.73(a)(2)(viii)(A), 50.73(a)(2)(viii)(B), 50.73(a)(2)(x), 73.71(b), 73.71(c)
Checked: Other (specify in abstract below and in text): 73.71(d)

Licensee Contact for this LER (12):
Name: R. W. Krieger, Vice President, Nuclear Generation
Telephone Number: Area Code 714; remaining digits partly illegible in this copy

Complete one line for each component failure described in this report (13): no entries (columns: Cause, System, Component, Manufacturer, Reportable to NPRDS)

Supplemental Report Expected (14): No
Expected Submission Date (15):

Abstract (Limit to 1400 spaces, i.e., approximately fifteen single-space typewritten lines.) (16)

On January 14, 1998 (discovery date), Southern California Edison (SCE) prepared to install a chart recorder on the primary security computer for system diagnostic testing. At about 9:25 a.m., before starting the installation, SCE had conservatively posted compensatory guards for the appropriate plant areas, as specified in Station Procedures SO123-IV-6.8, "Protected Area and Vital Area Barrier Patrols," for a complete loss of security computers. SCE switched to the backup security computer, removed the primary computer from service, and installed the chart recorder. When returning the primary computer to service, a computer network server software error occurred, causing the primary computer to initialize incorrectly. At about 10:26 a.m., the backup computer also failed as a result of this error.

The primary and backup computers were restarted at about 10:32 a.m. and 10:36 a.m., respectively.

The cause of this event was an equipment failure.

During the reboot of the primary computer, the network server function for the security computers did not start.

However, the "boot" sequence continued until the main security program started on the primary computer. Without the network server function, the two computers could not completely communicate and, consequently, could not fully function.

The main security program was not capable of recognizing that the network server function had not started and tried to regain the primary role in the security monitoring system.

As a result, a conflict arose and the backup program became unstable and failed to function.

Since the primary had no network server function, it could not communicate properly, leaving both primary and backup down.

SCE is evaluating modifications to the program software to eliminate this problem.

In the interim, any required rebooting of the primary or backup servers will be manually monitored to prevent this type of failure.

9802190203 980212 PDR ADOCK 05000206 S PDR

LICENSEE EVENT REPORT (LER) TEXT CONTINUATION
San Onofre Nuclear Generating Station Unit 1, Docket Number 05000206, LER Number 1-98-001, Page 2 of 3

Plant: San Onofre Nuclear Generating Station (SONGS) Units 2 and 3
Reactor Vendor: Combustion Engineering
Event Date: January 14, 1998
Event Time: 10:26
Mode: Unit 1 - Permanently shutdown
Unit 2 - Mode 1 - Power Operation
Unit 3 - Mode 1 - Power Operation
Power: Units 2 and 3 - approximately 100 percent

Background:

In LER 1-97-003, Southern California Edison (SCE) reported four instances in 1997 where the primary and backup security computers were out of service, as follows: May 20 (23 minutes), July 29 (21 minutes), October 30 (93 minutes), and December 19 (20 minutes). As stated in LER 1-97-003, SCE is completing an engineering review of the computer systems to determine required corrective actions to improve system reliability.

Description of Event:

On January 14, 1998 (discovery date), SCE prepared to install a chart recorder on the primary security computer [IA] for system diagnostic testing. At about 9:25 a.m., before starting the installation, SCE had conservatively posted compensatory guards for the appropriate plant areas, as specified in Station Procedures SO123-IV-6.8, "Protected Area and Vital Area Barrier Patrols," for a complete loss of security computers.

SCE switched to the backup security computer, removed the primary computer from service, and installed the chart recorder. When returning the primary computer to service, a computer network server software error occurred, causing the primary computer to initialize incorrectly. At about 10:26 a.m., the backup computer also failed as a result of this error. The primary and backup computers were restarted at about 10:32 a.m. (six minutes later) and 10:36 a.m. (ten minutes later), respectively.
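The intervals quoted above can be checked with a short calculation. This is only an illustrative sketch: the clock times are the approximate ones given in the report, and the helper names (`at`, `minutes_between`) are invented here for the example.

```python
from datetime import datetime

# Reported event date; all clock times below are the approximate
# ("at about") times stated in the report.
EVENT_DAY = "1998-01-14"


def at(hhmm: str) -> datetime:
    """Build a timestamp on the event day from an HH:MM string."""
    return datetime.strptime(f"{EVENT_DAY} {hhmm}", "%Y-%m-%d %H:%M")


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


guards_posted = at("09:25")      # compensatory guards in place
backup_failed = at("10:26")      # both computers down from this point
primary_restarted = at("10:32")
backup_restarted = at("10:36")

primary_gap = minutes_between(backup_failed, primary_restarted)
backup_gap = minutes_between(backup_failed, backup_restarted)
guards_lead = minutes_between(guards_posted, backup_failed)

print(primary_gap, backup_gap, guards_lead)
```

On these figures the compensatory guards had been posted roughly an hour before the complete loss of the security computers began.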

During an NRC Security Inspection Exit interview on November 21, 1997, the NRC indicated that establishing compensatory measures within ten minutes would not be sufficient to preclude reporting a complete loss of security computers to the NRC within one hour.

Consequently, to address the NRC's concerns, SCE revised procedure SO123-IV-11.2, "Reporting Safeguards Events," to require a report to be made within one hour, even when compensatory measures are taken within ten minutes.

Therefore, a 1-hour report was conservatively made to the NRC on January 14, 1998, even though, as a precaution, prestaged compensatory measures were in place prior to the security computer system failure.

Subsequently, SCE is providing this 30-day follow-up report as required by 10 CFR 73.71(d).

Cause of the Event

The cause of this event was an equipment failure.

During the reboot of the primary computer, the network server function for the security computers did not start.

However, the "boot" sequence continued until the main security program started on the primary computer. Without the network server function, the two computers could not completely communicate and, consequently, could not fully function.

The main security program was not capable of recognizing that the network server function had not started and tried to regain the primary role in the security monitoring system.

As a result, a conflict arose and the backup program became unstable and failed to function.

Since the primary had no network server function, it could not communicate properly, leaving both primary and backup down.
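The failure sequence described above can be pictured with a small model. The sketch below is hypothetical and is not SCE's security software: the `Server` class, the state names, and the guarded variant are assumptions made only for illustration. It contrasts a reboot that claims the primary role without checking the network server function (the reported behaviour) with one possible safeguard that keeps the rebooted machine out of service until that function is confirmed to be running.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Server:
    name: str
    state: str                      # "active", "standby", "out_of_service", "failed"
    network_server_up: bool = True


def reboot_unguarded(rebooted: Server, peer: Server) -> None:
    """Model of the reported behaviour: the main program starts and claims
    the primary role without checking the network server function."""
    rebooted.state = "active"
    if not rebooted.network_server_up:
        # The two programs cannot coordinate: the peer becomes unstable and
        # fails, and the rebooted machine cannot communicate properly.
        peer.state = "failed"
        rebooted.state = "failed"
    else:
        peer.state = "standby"


def reboot_guarded(rebooted: Server, peer: Server) -> None:
    """Hypothetical safeguard (an assumption, not the licensee's fix):
    do not claim the primary role unless the network server function is up."""
    if not rebooted.network_server_up:
        rebooted.state = "out_of_service"   # peer keeps running undisturbed
        return
    rebooted.state = "active"
    peer.state = "standby"


def simulate(guarded: bool, network_server_up: bool = False) -> Tuple[str, str]:
    """Return (primary state, backup state) after the primary is rebooted."""
    primary = Server("primary", "out_of_service", network_server_up)
    backup = Server("backup", "active")
    (reboot_guarded if guarded else reboot_unguarded)(primary, backup)
    return primary.state, backup.state


print(simulate(guarded=False))  # both machines end up failed
print(simulate(guarded=True))   # backup stays active
```

In this model the unguarded reboot takes both machines down when the network server function is missing, while the guarded reboot leaves the backup in service. The interim corrective action of manually monitoring reboots plays a comparable role by having a person confirm the startup before the rebooted machine resumes its duties.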


Corrective Actions:

SCE is evaluating modifications to the program software to eliminate this problem.

In the interim, any required rebooting of the primary or backup servers will be manually monitored to prevent this type of failure.


Safety Significance:

During the time the computers were off-line, there was no reduction in detection capabilities, as compensatory measures (security posts) were already in place. As such, there was no safety significance to this event.

Additional Information:

SCE reported similar instances in LER 1-97-003.

The event reported herein occurred during diagnostic testing to correct the events reported in LER 1-97-003, and consequently, could not have been prevented.