ML17299B324


Preliminary Case Study Rept, Effects of Ambient Temp on Electronic Equipment in Safety-Related Instrumentation & Control Sys
ML17299B324
Person / Time
Site: McGuire (Duke Energy)
Issue date: 11/30/1985
From: Chiramal M
NRC OFFICE FOR ANALYSIS & EVALUATION OF OPERATIONAL DATA (AEOD)
To:
Shared Package
ML17299B323 List:
References
REF-GTECI-A-44, REF-GTECI-EL, TASK-A-44, TASK-OR AB38-1-120, TAC-59183, TAC-61289, NUDOCS 8511190535

Preliminary Case Study Report

Effects of Ambient Temperature on Electronic Equipment in Safety Related Instrumentation and Control Systems

Reactor Operations Analysis Branch
Office for Analysis and Evaluation of Operational Data
November 1985

Prepared by: Matthew Chiramal

NOTE:

This report documents the preliminary results of an ongoing study by the Office for Analysis and Evaluation of Operational Data with regard to a number of operating events.

This report is issued for review and comment as part of the "peer review" process used for AEOD case studies.

Since the study is ongoing, the content, findings and recommendations are preliminary and may not represent the final position of AEOD, the responsible program office for the Nuclear Regulatory Commission.

Table of Contents

Summary
1. Introduction
2. Discussion
3. Findings and Conclusions
4. Recommendations
5. Table 1

EXECUTIVE SUMMARY

On June 4, 1984, while both units at the McGuire Station were operating at 100% power, a total loss of the control area ventilation system occurred.

Approximately 45 minutes into the event, certain safety related instrumentation of Unit 1 started behaving erratically.

An hour into the event the ventilation system was declared inoperable and one hour later the control operators started to reduce power on both units as required by technical specifications.

During this period, numerous instrument alarms were being received on both units.

At approximately two hours into the event, instrument cabinet doors were opened; control room doors and computer room doors were opened; and portable fans and ducts were placed in the doorways to move cool air from the air conditioned computer room to the control area.

About three hours after the initial ventilation system trip, the control area ventilation system was returned to service and declared operable and the control operators stopped reducing power with each unit having reached 97%.

Both McGuire units have previously experienced failures of the instrumentation system due to increasing ambient temperatures causing overheating of electronic components in the Process Control System (PCS 7300) and the Solid State Protection System (SSPS) cabinets.

Thus, the operators anticipated the problem with the instrumentation and the consequences.

The plant was operated safely during the event due to the extraordinary efforts of the operators in providing cooling to the instrumentation cabinets and in not being confused by all the spurious alarms.

If such actions had not been taken, the electronic equipment failures due to high temperatures, like the ones previously experienced, would likely have led to transients and trips of both units, inadvertent actuations and/or loss of various safety related systems, and spurious annunciation and erroneous indications in the control room.

During the event, which occurred at about 8 p.m., the maximum ambient temperature in the control room area was approximately 90°F, which is well below the technical specification and equipment specification value of 120°F.

(Note that these temperatures apply to the ambient room temperature, but do not address allowable operating temperatures of the components or their enclosures.)

Hence, the instrumentation should not have been affected.

Following the event, the licensee monitored the temperatures inside the PCS cabinet with the highest heat load and found them to be much higher (42°F to 52°F above ambient) than the anticipated 25°F to 30°F above ambient.

When the event occurred, the outside temperature was between 72°F and 76°F.

Had it occurred when the outside temperature was higher, the control room and instrument cabinets could have experienced higher temperatures faster.

After the incident, maintenance personnel rebalanced the airflow in the control area ventilation system to provide additional cooling to the instrumentation cabinets.

The licensee has determined, based on failure data of printed circuit cards, that the additional cooling has improved reliability of electronic components in these cabinets.

(Thirty-five card failures had occurred in the five months prior to the event, and thirteen card failures in the five months following rebalancing of air flow.)

This report documents the review and evaluation of the event at the McGuire Station.

A search of the Sequence Coding and Search System (SCSS) was also conducted to identify similar events at other operating Westinghouse PWR units.

Based on the review of the data thus obtained, the problem of high ambient temperatures affecting electronic components in safety related Instrumentation and Control Systems was determined to be potentially generic to other Westinghouse PWR units and to all operating nuclear units which utilize heat sensitive solid state electronic components.

Technical Specifications regarding area ventilation cooling systems and instrumentation systems were also reviewed and found to be inadequate.

A review of the staff's evaluation of the Station Blackout issue regarding environmental effects on plant I&C equipment found certain assumptions to be potentially deficient.

Based on the review and evaluation, this report provides findings, conclusions and recommendations which address the concerns raised.

1. INTRODUCTION

On June 4, 1984, at about 8 p.m., while McGuire Unit 1 and Unit 2 were operating at 100% power with the control area ambient temperature at about 75°F, the train B chiller of the control area ventilation system tripped due to low oil level.

Prior to this, train A of the system had been removed from service for maintenance on its air handling unit.

Approximately 20 minutes later the tripped chiller was restarted, but shortly thereafter it tripped again on low oil level.

The temperature in the control room started to increase.

Approximately 45 minutes after the initial trip of the control area ventilation system, the control room received numerous repeated alarms on Unit 1 indicating high Reactor Coolant Loop C Tavg.

Alarms were also received from a pressurizer level channel in Unit 1.

Both alarms subsequently proved to be spurious.

The operators again attempted to start train B chiller at 9:00 p.m., but were unsuccessful.

An attempt to start train A chiller using train B air handling unit also failed and at 9:05 p.m. the control area ventilation system was declared inoperable.

At about 10 p.m., control room doors and computer room doors were opened and portable fans were placed in the doorways to move cool air from the computer room to the control room.

The Westinghouse Process Control System (PCS 7300) cabinets, whose internal electronic components were, from previous experience, known to be sensitive to increasing ambient temperature, were also opened.

At 10:05 p.m., the operators started to reduce power on Units 1 & 2 as required by the plant technical specifications.

At about 10:30 p.m., oil was added to the chiller and the chiller restarted.

At 10:55 p.m., train B control area ventilation system was declared operable and power reductions of both units were stopped.

The units were returned to 100% power by 11:12 p.m.

When normal cooling was restored in the control room area, the malfunctioning instrumentation resumed normal operation.
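The clock times in the narrative above can be lined up as elapsed minutes from the first chiller trip. The sketch below is illustrative only: the report gives "about 8 p.m." and "about 10:30 p.m." as approximate times, so treating them as exact is an assumption, and the helper name `minutes_since_start` is hypothetical.

```python
from datetime import datetime

# Clock times reported for the evening of June 4, 1984.
# ASSUMPTION: the approximate times ("about 8 p.m.", "about 10:30 p.m.")
# are treated as exact for the purpose of this sketch.
EVENTS = [
    ("20:00", "train B chiller trips on low oil level"),
    ("21:05", "control area ventilation system declared inoperable"),
    ("22:05", "power reduction started on Units 1 & 2"),
    ("22:30", "oil added and chiller restarted"),
    ("22:55", "train B declared operable; power reduction stopped"),
    ("23:12", "both units back at 100% power"),
]


def minutes_since_start(events):
    """Return (elapsed_minutes, description) pairs relative to the first event."""
    start = datetime.strptime(events[0][0], "%H:%M")
    result = []
    for clock, text in events:
        delta = datetime.strptime(clock, "%H:%M") - start
        result.append((int(delta.total_seconds() // 60), text))
    return result


for elapsed, text in minutes_since_start(EVENTS):
    print(f"T+{elapsed:3d} min  {text}")
```

Under these assumed clock times, the ventilation system was out of service for a little under three hours, which agrees with the "about three hours" stated in the summary.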

At the McGuire Station, numerous failures of printed circuit cards involving reactor trips or spurious instrument indications have occurred.

The failures appear to be directly attributable to overheating in the PCS 7300 cabinets.

In some cases, the spurious indications automatically corrected themselves when adequate ventilation was restored to the cabinets.

However, in many cases the licensee has observed that erratic instrumentation signals continued for over a month after ventilation was restored.

The recommended operating ambient temperature for the Westinghouse PCS 7300 cabinets and associated Solid State Protection System (SSPS) cabinets is 75°F ± 10°F.

Qualification specifications call for operation in an ambient range of 40°F to 120°F; however, there is no requirement that relates to actual temperatures experienced by the components in these cabinets.

During the event of June 4, 1984, the maximum ambient temperature near the PCS cabinets, as remembered by the operators, was about 90°F, which was well below the technical specification limit of 120°F.

However, following the event, the licensee installed temperature monitors at various locations in the PCS cabinet with the maximum heat load.

With the ambient temperature at about 72°F near the cabinet, the measured internal air temperatures ranged from 73°F at the bottom of the cabinet to a maximum of 109°F at the top air space in the cabinet.

The thermocouples, which were placed directly on the instrument racks on which the printed circuit boards are mounted, measured 115°F at the middle rack to a maximum of 125°F at the top rack.

The measured temperature difference between the ambient and the inside of the cabinet was greater than expected.

(It should be noted that the ambient temperature was at 90°F during the event vice the 72°F at which these measurements were made.

Temperatures at the time of this event are estimated to have been 15 to 20 degrees higher than the temperatures measured during the test.)
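One simple way to reproduce an estimate of this kind is to assume that each location's rise above ambient stays constant as the ambient temperature increases. The sketch below applies that offset model to the measurements quoted above. The constant-rise assumption and the function name `project_internal_temp` are illustrative; the report does not say how its 15 to 20 degree estimate was obtained.

```python
# Measurements reported for the PCS cabinet with the maximum heat load (deg F).
TEST_AMBIENT_F = 72.0
MEASURED_F = {
    "top air space": 109.0,
    "middle rack": 115.0,
    "top rack": 125.0,
}
EVENT_AMBIENT_F = 90.0  # maximum ambient near the cabinets during the event


def project_internal_temp(measured_f, test_ambient_f, new_ambient_f):
    """Shift a measured internal temperature by the change in ambient.

    ASSUMPTION: the rise above ambient is independent of ambient temperature.
    This is a simplification made for illustration only.
    """
    rise = measured_f - test_ambient_f
    return new_ambient_f + rise


for location, temp in MEASURED_F.items():
    projected = project_internal_temp(temp, TEST_AMBIENT_F, EVENT_AMBIENT_F)
    print(f"{location}: rise {temp - TEST_AMBIENT_F:.0f} F above ambient, "
          f"projected {projected:.0f} F at {EVENT_AMBIENT_F:.0f} F ambient")
```

Under this assumption every location shifts up by the 18°F difference in ambient, which falls within the 15 to 20 degree range estimated in the report.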

The event occurred during the evening hours of early June when the outside temperature was between 72°F and 76°F.

Had it occurred when the outside temperature was higher, the control room and cabinets could have experienced higher temperatures faster.

Following this event, the plant maintenance personnel rebalanced the air flow in the control area ventilation system to provide additional cooling at the cabinet locations.

Modifications providing heat sinks on the PCS cards have been completed on both units.

The rebalancing of the air flow has already provided some benefits.

In the five months prior to the rebalancing, 35 card failures had occurred; and in the five months after rebalancing, only 13 card failures have occurred.

(It should be noted here that the majority of these card failures had not been reported to the NRC.)

The licensee has concluded that additional cooling of the cabinets has improved the reliability of the cards.
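The before and after card failure counts can be put on a common per-month basis. The sketch below only restates the two figures given above; the helper names are hypothetical, and no statistical significance is implied by two five-month windows, particularly since the report notes that most card failures were not reported to the NRC.

```python
def monthly_rate(failures, months):
    """Average number of card failures per month over a period."""
    if months <= 0:
        raise ValueError("months must be positive")
    return failures / months


def percent_reduction(before, after):
    """Percentage drop from the 'before' rate to the 'after' rate."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Figures from the report: 35 failures in the five months before
# rebalancing, 13 failures in the five months after.
rate_before = monthly_rate(35, 5)
rate_after = monthly_rate(13, 5)

print(f"before: {rate_before:.1f} failures/month")
print(f"after:  {rate_after:.1f} failures/month")
print(f"reduction: {percent_reduction(rate_before, rate_after):.0f}%")
```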

The Westinghouse PCS cabinets house instrument components which process the analog signals provided by transducers that measure the necessary primary and secondary plant parameters such as temperature, pressure, flow and level, and provide outputs to plant protection systems (SSPS), control circuits, control room indicators and recorders, annunciators and other output devices.

Although these cabinets are purported to be qualified for operating in an ambient ranging from 40°F to 120°F, the as-installed cabinets, with their full complement of electronic components, located in the control area adjacent to other cabinets, could be subjected to local temperature conditions not considered during the qualification process.

Since the PCS and SSPS at the McGuire Station are similar to the ones used at other Westinghouse PWR units, it is expected that similar temperature related problems could be experienced by others.

To identify such problems, searches of the Sequence Coding and Search System (SCSS) involving failures of printed circuit cards and other solid state components at Westinghouse PWR units with SSPS were conducted.

The SCSS data base contains licensee event reports (LERs) for the period extending from 1981 to the end of 1984.

Based on the search, 124 LERs involving printed circuit cards and other solid state components were obtained.

(As noted above many card failures are not reportable to the NRC, and the number of events obtained are also limited by the search strategies used.)

A list of these LERs with the name of the plant involved, the date of the event and a brief description of the event is provided in Table 1.

Searches of the NPRDS data base for electronic component failures in I&C systems of PWRs and BWRs were also conducted.

The searches focused on instrument bistables/switches, integrators/comparators, electronic power supplies, and isolation devices.

Three hundred and ninety-three events were obtained.

Narrowing the search to 1984 identified two hundred and fifty-seven events.

To make the review tractable, the search was further narrowed to events at McGuire 1 and 2 for the year 1984.

The failure characteristics of components identified in the NPRDS data were similar to those in the LERs of McGuire and other plants.

Based on this very limited review, it was concluded that the NPRDS data findings corroborate the findings of the LERs.

This conclusion is felt to be valid even though only the events at McGuire were reviewed in detail.

2. DISCUSSION

Both the McGuire units have experienced many failures of printed circuit cards and other solid state components due to overheating in the PCS 7300 and SSPS cabinets.

Table 1 lists 16 events (11 at Unit 1 and 5 at Unit 2) involving such failures.

(LER 84-016 which describes the event of June 4, 1984 mentions 48 card failures in a ten-month period.)

Hence the plant operations personnel were well aware of the possible consequences of loss of control area cooling.

The personnel also knew of the oil level problem associated with the chillers.

When the chiller is in operation, it is normal for a portion of the oil to mix with the refrigerant and cycle with the refrigerant.

The oil which accumulates in the evaporator is returned to the oil reservoir by the velocity of the refrigerant.

The main reason for the chillers tripping is apparently that the chillers at McGuire are only loaded to 80% of capacity, whereas they were designed to be operated at 100%.

Since the chiller is not fully loaded, the refrigerant is travelling at a slower velocity resulting in some oil not being returned to the reservoir.

When enough oil is trapped in the evaporator, the oil level drops to a point where the chiller trips.

After oil has been added and the chiller restarted, the trapped oil in the evaporator is returned to the reservoir where there is now an excess of oil.

The excess oil must be removed or the chiller will eventually trip on high bearing temperature.

Normally the excess oil is removed several hours after addition of oil and restart of the chiller.

When the chiller trips, there is a fifteen minute timer which must time out before the chiller can be restarted.

During the event of June 4, 1984, the operators attempted to restart the chiller twice before declaring it inoperable.

It was only at this time that five gallons of oil were added to the chiller and the chiller restarted.

(In this instance, three gallons of oil were removed several hours after oil had been added.)

The licensee is continuing its review of operation of the chillers to determine further corrective actions.

The chilled water system has been reviewed to determine the feasibility of using components from both trains to keep one train operating; a corresponding procedure change of the control area ventilation and chilled water system has been implemented to include this capability.

During the event of June 4, 1984, the control room area was without normal cooling for about three hours.

The erratic behavior of two channels of safety related instrumentation occurred approximately 45 minutes into the event.

The operators were aware of the potential for erratic behavior of plant instrumentation (especially the PCS instrumentation), as they had been exposed to similar problems at the McGuire units before.

Two hours into the event the operators initiated supplemental cooling of the area by means of portable fans etc., to prevent further instrumentation problems.

(The delay in initiating additional cooling was based on the operators' anticipation of imminent restart of the cooling system.)

In spite of the prolonged loss of cooling, in this instance, both units continued to operate at nearly full power.

Although only one channel of instrumentation (Loop C T average) had to be bypassed on Unit 1 during the event, several alarms were received on both units during the period normal cooling was lost.

During the event, the maximum ambient temperature in the control room was about 85°F and near the PCS and SSPS cabinets was approximately 90°F.

These ambient temperatures are well below the technical specification limit of 120°F which, by definition, is the allowable temperature limit for continuous operation of equipment and instrumentation in the control area and to maintain the control room habitable for operations personnel.

However, the experience at McGuire has been, as seen during the event of June 4, that control room area temperature in the 85°F - 90°F range will begin to produce instabilities and abnormalities in safety related instrumentation systems.

Although the Westinghouse PCS and SSPS cabinets and components are designed to operate in an ambient range of 40°F to 120°F, the internal component and cabinet temperatures which could be reached have apparently not been adequately addressed.

(The event occurred during the evening of June 4, 1984 and the outside temperature at the plant site was between 72°F and 76°F at the time of the event.)

Although the McGuire Station has experienced numerous failures of solid state electronic components due to control room temperature increase, the licensee had not, until after the June 4th event, measured the temperatures inside the instrument cabinets of concern.

As mentioned earlier, the temperature rise in the cabinet with the maximum heat load during normal operation and with normal ambient temperature was found to be in the range of 42°F to 52°F above ambient.

This temperature rise was higher than the 25°F to 30°F expected.

Based on this finding, the licensee rebalanced the air flow in the control area ventilation system to provide additional cooling to the PCS and SSPS cabinets.

Although the licensee did not measure the cabinet internal temperatures after rebalancing the air flow, the reduction of the number of printed circuit card failures since rebalancing indicates that additional cooling of the cabinets has improved the reliability of the cards.

The licensee's modifications that provided heat sinks on the PCS cards should further improve the reliability of the cards.

In reviewing the 16 other LERs describing events at the McGuire Station involving card failures (see Table 1 for a brief description of each event), the following types of failures were seen.

(Some failures occurred more than once.)

o Erroneous permissive block indications.

o Overtemperature-delta T (OT-dT) instrumentation channel out of calibration.

o Signal isolator card failure in the PCS cabinet.

o RCS Loop B OT-dT indicator failed high due to control room chiller problems.

Indication returned to normal when normal cooling was reestablished.

o Loop D differential temperature indicator failed low due to a lead/lag (NLL) card failure in the PCS cabinet.

o Steam generator water level channel inoperable due to failure of a signal comparator (NAL) card.

o Loop B OT-dT indicator out of tolerance due to malfunction of heat sensitive zener diodes and operational amplifiers.

o Universal logic card failure in SSPS-A.

o Digital rod position indication (DRPI) failure due to detector/encoder card failure.

o Instrument loop power supply (NLP) card failure possibly due to overheating in PCS cabinet.

In many of the cases at McGuire, the licensee has established the root cause of card failures to be excessive temperatures in the cabinets where the components are located.

Corrective actions taken by the licensee to increase cooling of the cabinets have resulted in reducing the failure rate of the cards.

Based on the types and characteristics of card failures seen at the McGuire Station, a review of events involving failure of printed circuit cards and other electronic components at other Westinghouse PWR units was conducted.

The review found that the types and characteristics of card failures experienced by the McGuire units are evident at Salem-1 & 2, D.C. Cook-1 & 2, Sequoyah-1 & 2, Beaver Valley-1, North Anna-1 & 2, Trojan, Farley-1 & 2, and Summer.

(See Table 1 for the number of events and details of the events.)

At Salem-2 and Summer the licensees had identified inadequate cooling of instrument cabinets and ambient temperature changes as the cause of some of the card failures experienced.

However, in the majority of the other events the licensees had not determined the root cause of the card failures.

Based on the similarity between the types and characteristics of card failures experienced at the McGuire Station and at these other plants, it is apparent that high temperatures in the cabinets in which the cards are located could be a root cause or a contributory cause of the failures.

As seen at McGuire, the temperatures seen by the electronic components in the cabinets can be much higher than the ambient temperature of the space in which the cabinets are located.

Depending on the location of the cabinet, its heat load, the ventilation airflow, etc., the temperature in the cabinets could be above the design limit even when the ambient room temperature is within specified limits.

Since internal temperatures are location specific, actual measurements should be taken to assure that design temperatures of components are not exceeded.

As far as we have been able to determine from the review of the events in Table 1, none of the licensees (except McGuire) has measured the temperatures inside the cabinets.

Even at the McGuire Station where high temperature was known to be a factor affecting card failures since 1981, actual measurements were taken only after the June 4th event, and then, the measurements were taken in only one cabinet.

After the ventilation system was rebalanced, no further measurements were taken.

Thus, although the licensee has seen a decrease in the rate of failure of printed circuit cards at McGuire since rebalancing the control area ventilation system, the actual internal temperatures are not known.

Hence, the licensee is not aware of the temperatures to which the components of concern are exposed, or what margin exists between normal operating temperatures and design temperature limits.

That is, if a total loss of control area ventilation system were to occur now, the licensee still does not know how much margin exists before safety related instrumentation will respond erratically due to high internal cabinet temperatures.

The effects of high temperatures on heat sensitive electronic components in safety related instrumentation cabinets are

o increased failure rate of printed circuit cards and other heat sensitive electronic components, and

o the potential for common cause failure of redundant instrumentation channels upon extended loss of normal area ventilation cooling.

Such failures can lead to malfunctioning of control systems, inoperability of instrumentation channels in protection systems, inadvertent actuations and/or failures of safety systems and erroneous indications and alarms to plant operators.

If actual cabinet internal temperatures are monitored, both of the above items can be remedied.

Failure rates are correlated to actual temperatures and, depending on the margin between actual and maximum design temperatures, improvement in reliability can be achieved by increasing cooling in the cabinet.

If the rate of change of internal temperature during a loss of control area ventilation system is obtained, then the data can be analyzed and extrapolated to obtain the time to failure of redundant instrumentation (based on the conservative assumption that instrumentation failures will occur when internal temperature reaches the maximum design temperature of the components).

Variations of plant site outside air temperature due to seasonal and daily cycles could also influence the rate at which the control room ambient temperature increases following a total loss of the cooling and ventilation system.

This, too, should be factored into the analysis.
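The extrapolation described above can be sketched as a straight-line fit to internal temperature readings taken during a loss of ventilation, solved for the time at which the fitted line reaches the component design limit. Everything numeric below is hypothetical: the report gives neither a heat-up curve nor a maximum design temperature for the components, so the sample readings and the 140°F limit are placeholders, and a linear heat-up is itself an assumption (a real cabinet would be expected to level off toward an equilibrium temperature).

```python
def fit_line(times_min, temps_f):
    """Ordinary least-squares slope (deg F/min) and intercept (deg F)."""
    n = len(times_min)
    if n < 2 or n != len(temps_f):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times_min) / n
    mean_y = sum(temps_f) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("samples must span more than one time")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, temps_f))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def minutes_to_limit(times_min, temps_f, design_limit_f):
    """Minutes (from time zero of the samples) until the fitted line
    reaches the design limit.

    Returns None if the fitted temperature is not rising, and 0.0 if the
    fitted line is already at or above the limit at time zero.
    """
    slope, intercept = fit_line(times_min, temps_f)
    if slope <= 0:
        return None
    return max(0.0, (design_limit_f - intercept) / slope)


# HYPOTHETICAL readings: minutes after loss of ventilation, rack temp in deg F.
times = [0, 15, 30, 45]
temps = [110.0, 116.0, 122.0, 128.0]
HYPOTHETICAL_DESIGN_LIMIT_F = 140.0  # placeholder, not a value from the report

eta = minutes_to_limit(times, temps, HYPOTHETICAL_DESIGN_LIMIT_F)
print(f"estimated time to design limit: {eta:.0f} min")
```

To account for seasonal and daily variation in outside air temperature, the same fit could be repeated on readings taken under different outside conditions and the shortest resulting time used as the bounding case.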

Technical specifications at McGuire and other operating nuclear plants require actions to be initiated within one hour following the loss of control area cooling system to place the unit in the hot standby mode within the next six hours.

During the June 4, 1984 event, the safety related instrumentation at McGuire-1 started to be erratic 45 minutes into the event and only by extraordinary means were the plant operators able to cool instrumentation cabinets and continue to operate both units until normal cooling was restored two and a half hours later.

Both units were in the process of reducing power from 100% and had reached 97%.

Prior experience at McGuire aided the operators in this instance, and the operators were able to safely control both units. If the operators had not been able to maintain cooling on the instrumentation cabinets, the event could have been much more serious.

The event occurred during the evening hours of early June when the outside temperature was between 72°F and 76°F.

Had it occurred when the outside temperature was higher, the control room and the cabinets could have experienced higher temperatures faster.

Erratic operation of safety related instrumentation could have led to transients or trips of both units with the added complication of erratic indications to the operators.

This event suggests that the time periods allowed by plant technical specifications for plant operation following a loss of control area cooling should be reconsidered to include the internal cabinet temperatures and their effects on safety related instrumentation systems.
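The mismatch between the allowed times and the observed behavior can be stated as simple arithmetic. The sketch below uses only the figures quoted above; the function name is hypothetical, and it assumes for illustration that cooling is never restored and that the unit reaches hot standby at the latest permitted time.

```python
# Figures from the report, in hours after loss of control area cooling.
TS_TIME_TO_INITIATE_ACTION_H = 1.0  # action must be initiated within one hour
TS_TIME_TO_HOT_STANDBY_H = 6.0      # hot standby within the next six hours
OBSERVED_ONSET_H = 0.75             # Unit 1 instrumentation erratic at 45 minutes


def hours_with_degraded_instrumentation(onset_h, initiate_h, standby_h):
    """Hours between the first erratic instrumentation and the latest
    permitted hot-standby time.

    ASSUMPTION: cooling is not restored and the full allowed time is used.
    """
    return (initiate_h + standby_h) - onset_h


exposure = hours_with_degraded_instrumentation(
    OBSERVED_ONSET_H, TS_TIME_TO_INITIATE_ACTION_H, TS_TIME_TO_HOT_STANDBY_H
)
print(f"instrumentation onset precedes the one-hour action time: "
      f"{OBSERVED_ONSET_H < TS_TIME_TO_INITIATE_ACTION_H}")
print(f"hours of operation after onset if the full allowance is used: {exposure}")
```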

As stated above, the operators at the McGuire Station were able to operate the units safely through a two and a half-hour loss of control area cooling, mainly due to prior experience with the problem and a knowledge of its consequences.

This may not be the case at other operating nuclear plants. As seen by our review of the operational events in Table 1, only two other plants had identified high ambient temperature as a possible cause of printed circuit card and other electronic component failures, and none of them addressed the internal cabinet or component temperatures.

We also conducted a search of the SCSS LER data base for events involving total loss of control area cooling system at the plants listed in Table 1.

No other events were found.

Since none of the other Westinghouse PWR units appears to have experienced a sustained loss of control area cooling event, and given that the potential exists for the solid state electronic components to be affected adversely by the subsequent increase in ambient temperature if such an event were to occur, it can be assumed that unless the operators at other operating plants have been specifically trained for such an event, they could have difficulties in coping with it.

The time periods allowed by the technical specifications to accommodate the control area ventilation system, as at McGuire, would generally be inadequate during such an event.

In fact, the times allowed could give the operators a false sense of confidence since they would assume that the unit would continue to operate and be safely shut down without any complications.

Although the review and evaluation were conducted of events at operating Westinghouse designed PWRs only, the problem of high ambient temperatures and their effect on solid state electronic components in safety related instrumentation and control systems applies equally to all operating nuclear plants which utilize such components.

The electronic modules used in safety related I&C systems in PWRs and BWRs are very similar and are subject to similar failure modes.

They are, in general, designed to be installed and operated in controlled mild environments.

Hence, events involving loss of cooling to such components have generic implications to all nuclear plants.

In the ongoing evaluation of total loss of all ac power transients (USI A-44, Station Blackout), one item of consideration is the unavailability of normal heating, ventilation and air conditioning in the plant.

The evaluation, as documented in NUREG-1032, states that equipment needed to operate during a station blackout and equipment required for recovery from a station blackout would have to maintain operability in the environmental conditions (e.g., temperature, pressure, humidity) that could occur as a result of the event.

Otherwise, failures of necessary equipment could lead to loss of core cooling and decay heat removal during the event or failure to recover from the event upon recovery of ac power.

Instrumentation and control elements of components required during station blackout are likely to be impacted by adverse environments.

This is evident from the June 4th event at McGuire where high environmental temperature did adversely affect several safety related instrument channels.

The NUREG report goes on to say that only a limited set of equipment located in the control room would need to be operable, thus reducing equipment generated heat loads in that location.

Based on our review of the McGuire event and others in Table 1, our conclusion is that the heat loads in the control and instrument cabinet rooms, except for heating and lighting supplied by alternating current, do not change much with plant operating conditions.

The report further states that for control rooms and auxiliary buildings, opening doors should provide adequate relief to maintain equipment in an acceptable operating environment.

(In permitting opening and keeping open various doors during such an event, requirements of plant security, area isolation and fire protection should be considered.)

The event at McGuire showed that this may not be the case.

The operators at McGuire had to take several measures, including opening cabinet doors, computer room doors and control room doors, and providing fans (ac powered) with ducts to blow air conditioned air from the computer room to the control area to maintain instrumentation systems operable.

It should be noted that at McGuire Unit 1 instrumentation channels began to be adversely affected 45 minutes following loss of cooling even though the ambient room temperatures were well within operating limits.

Based on these observations, we believe that more consideration must be given to the effects of environmental conditions on instrumentation and control system elements, especially heat sensitive electronic components, in the evaluation of USI A-44 concerns.

In particular, the actual temperature and condition of components located inside cabinets or other enclosures should be considered and evaluated.

3. FINDINGS AND CONCLUSIONS

Based on the review and evaluation of the event at McGuire and the operating experiences at other operating plants as discussed above, the following findings and conclusions were obtained.

(1)

The total loss of control area ventilation system cooling at the McGuire Station on June 4, 1984 caused certain Unit 1 safety related instrumentation systems to be adversely affected by the increasing ambient temperature 45 minutes into the event.

Two hours into the event, several instrument channels on both units were behaving erratically and operators had to initiate extraordinary measures to provide cooling to the affected instrumentation cabinets to keep the instrumentation system operable.

(2)

The maximum ambient temperature in the control room during the event was approximately 90°F.

The technical specification limit for the control area is 120°F.

The design specification ambient temperature requirement for the Westinghouse process and protection system equipment is 40°F - 120°F.

However, during the event when the control room ambient temperature was in the 85°F-90°F range, certain safety related instrumentation channels were adversely affected.

Thus, significant degradation of the system occurred even though the ambient temperature during the event was well below the design and technical specification limits.

(3)

Neither the technical specifications nor the design specifications adequately address the internal temperatures in the instrumentation cabinets.

Although, prior to this event, the McGuire units had experienced ventilation system problems and temperature related problems of instrumentation systems, it was only after this event that temperatures inside an instrument cabinet were measured.

(4)

Following the event, the licensee measured the internal temperatures in a PCS cabinet of Unit 1 and found that with an ambient temperature of 72°F the temperatures ranged from 73°F to 109°F in the air space in the cabinet.

The temperatures of the instrument racks on which the printed circuit cards and other electronic components are mounted measured 115°F to 125°F.

The temperature increases in the cabinet were higher than the expected 25°F - 30°F rise.
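The comparison in finding (4) can be made explicit with a short calculation. The sketch below uses only the readings quoted above (in degrees Fahrenheit); the variable and function names are illustrative, and it assumes the expected 25 to 30 degree rise is measured relative to room ambient, which the report does not state explicitly.

```python
# Readings quoted in finding (4), in degrees Fahrenheit.
AMBIENT_F = 72
CABINET_AIR_F = (73, 109)    # air space inside the PCS cabinet (min, max)
RACK_F = (115, 125)          # instrument racks holding the circuit cards (min, max)
EXPECTED_RISE_F = (25, 30)   # rise the report says was expected

def rise_over_ambient(readings, ambient=AMBIENT_F):
    """Return the (min, max) temperature rise above room ambient."""
    return tuple(t - ambient for t in readings)

air_rise = rise_over_ambient(CABINET_AIR_F)
rack_rise = rise_over_ambient(RACK_F)

# Amount by which the rack rise exceeds the top of the expected band
# (assumption: the expected rise is referenced to room ambient).
rack_excess = tuple(r - EXPECTED_RISE_F[1] for r in rack_rise)

print("cabinet air rise:", air_rise)
print("rack rise:", rack_rise)
print("rack rise above expected maximum:", rack_excess)
```

Under that assumption, even the coolest rack reading sits above the expected band, which is consistent with the finding that the cabinet temperature increases were higher than expected.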

(5)

The licensee rebalanced the airflow provided by the control area ventilation system to provide additional cooling to the instrument cabinets of concern.

This modification, since its implementation, has improved the reliability of printed circuit cards in the cabinets.

Other modifications to improve the cooling system and reduce heat loads of the cards are also planned.

However, following the rebalancing of airflow, the licensee has not measured the cabinet's internal temperatures.

Hence, if a total loss of control area ventilation system were to occur now, the licensee still does not know how much time will elapse before safety related instruments are adversely affected.

(6)

Technical specifications at McGuire require actions to be initiated one hour following the total loss of control area cooling to place the unit in the hot standby mode within six hours.

During the event of June 4, the safety related instrumentation of one unit began to be adversely affected 45 minutes into the event and by two hours several channels of both units were involved.

Thus, the technical specifications are inadequate and have not fully considered the effects of loss of cooling on safety related instrumentation.
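The timing argument in finding (6) can be laid out as a simple timeline. The sketch below uses the event times given in this report; the names are illustrative, and it assumes the six hour period to reach hot standby runs from the point at which action is initiated, which is an interpretation rather than a quotation of the technical specifications.

```python
# Times in minutes after the total loss of control area ventilation,
# taken from the description of the June 4, 1984 McGuire event.
FIRST_CHANNELS_AFFECTED = 45    # Unit 1 instrumentation first affected
TS_ACTION_INITIATED = 60        # technical specifications require action to begin
BOTH_UNITS_AFFECTED = 120       # several channels on both units affected

# Assumption: the six hours to reach hot standby start when action is initiated.
HOT_STANDBY_DEADLINE = TS_ACTION_INITIATED + 6 * 60

# Degradation began this many minutes before any action was required.
lead_of_degradation = TS_ACTION_INITIATED - FIRST_CHANNELS_AFFECTED

# Time the plant could be maneuvering with degraded instrumentation
# if the full allowed period were used.
allowed_time_degraded = HOT_STANDBY_DEADLINE - FIRST_CHANNELS_AFFECTED

print(lead_of_degradation, "min of degradation before required action")
print(allowed_time_degraded, "min allowed with instrumentation already affected")
```

Under this reading, degradation starts before the technical specification clock calls for any action, and most of the allowed shutdown period falls after instrumentation has already been affected.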

(7)

The review of events at other operating Westinghouse PWRs found failures similar to the types experienced by the printed circuit cards and other heat sensitive electronic components at the McGuire units.

Only two other units, Salem-2 and Summer, have identified high ambient temperature in the instrument cabinets as a potential root cause of the failures.

In the majority of the events, the licensees have not determined the root cause of failure.

As far as we are aware, the internal cabinet temperatures of safety related instrumentation cabinets are not being measured at any plant.

Since internal temperatures are location specific, actual temperature measurements in cabinets at the plants involved are needed to correlate card failures to temperature effects.

(8)

Although the review of events was confined to Westinghouse designed PWRs, the problem of high temperatures adversely affecting solid state electronic components in safety related instrumentation and control systems is generic to a majority of operating BWRs and PWRs since the electronic components used and their application in I&C systems are similar.

(9)

The technical specifications regarding control room cooling and ventilation systems at other operating plants are similar to the standard technical specifications at the McGuire Station.

Thus, the inadequacies in the plant technical specifications regarding operability and loss of cooling and ventilation systems and the consequent effects of increased ambient and local temperatures on safety related instrumentation are generic to all operating nuclear plants.

(10)

The operators at the McGuire Station anticipated the effects of loss of ventilation and cooling on safety related instrumentation because they had experienced similar events before.

Unless operators at other plants have been specifically trained for such an event, they could have difficulties in coping with it.

The time periods allowed by the plant technical specifications would generally be inadequate during such an event; the time allowed could give the operators a false sense of confidence since they would assume that the plant would continue to operate safely for that duration.

(11)

In the ongoing evaluation of USI A-44, Station Blackout, loss of normal heating, ventilation and air conditioning and the effects of the ensuing environmental conditions on instrumentation and control systems are being addressed.

However, some of the assumptions made in the evaluation, such as the possibility of heat loads in the control and cabinet rooms being reduced and the possibility of maintaining adequate cooling of instrument components by opening doors of cabinets and rooms, may not be correct.

4. RECOMMENDATIONS

Based on the above discussion, findings and conclusions, the following recommendations are provided for consideration in addressing the problem of high temperature and its effects on heat sensitive electronic components in nuclear plant protection and control systems.

(1)

Procedures at operating nuclear plants related to loss of heating, ventilation and cooling systems supplying instrumentation and control system equipment rooms should be revised to specifically include steps to be taken to cope with the possibility of erratic response and/or failure of safety related instrumentation in the plant's protection, control and indication systems due to increasing temperatures.

In the interim, an IE information notice could be issued to alert operating nuclear plant operators that on a loss of control room/instrument room cooling and ventilation, erratic behavior of instrumentation systems can occur before specification ambient temperature limits are reached.

(2)

All operating nuclear plants that utilize solid state electronic components in safety related instrumentation and control systems should monitor the environmental conditions (primarily temperature) in the instrumentation cabinets for all modes of plant operation for the expected range of outside temperatures and ambient room temperatures to determine if

o the technical specification and design ambient temperature limits for the control rooms/instrument rooms are adequate, and

o existing ventilation and cooling systems for these areas are adequate.

(3)

Based on actual measurements of ambient and local temperatures, equipment heat load, cooling flow or pressure drop, analyses should be performed to determine the time to failure of instrumentation in a cabinet following the total loss of cooling in the area where the cabinet is located.

The time to failure should be determined for the range of ambient temperatures expected.

The effects of outside air temperatures depending on the seasonal and daily [illegible]

thus obtained, when considering all instrument cabinets, should be utilized in applicable sections of the plant's technical specifications and applicable procedures.
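The kind of time-to-failure analysis called for in recommendation (3) can be illustrated with a simple lumped heat balance. The sketch below is not the method used by the licensee or the staff; it assumes a single well-mixed cabinet with constant heat load, a constant room ambient temperature, and an optional residual heat loss path to the room. The function name and all numeric values in the example are hypothetical placeholders, not plant data; an actual analysis would use the measured temperatures, heat loads and flows the recommendation describes.

```python
import math

def time_to_limit_min(t_start_f, t_limit_f, heat_w, cap_j_per_f,
                      ua_w_per_f=0.0, t_ambient_f=None):
    """Minutes for a lumped cabinet to heat from t_start_f to t_limit_f.

    Model (an assumption): C * dT/dt = Q - UA * (T - T_ambient), with
    Q = heat_w, C = cap_j_per_f and UA = ua_w_per_f all constant.
    Returns math.inf if the limit is never reached.
    """
    if t_start_f >= t_limit_f:
        return 0.0
    if ua_w_per_f <= 0.0:
        # No heat loss at all: temperature rises linearly (most pessimistic case).
        return cap_j_per_f * (t_limit_f - t_start_f) / heat_w / 60.0
    if t_ambient_f is None:
        raise ValueError("t_ambient_f is required when ua_w_per_f > 0")
    # Temperature the cabinet would settle at if left indefinitely.
    t_steady = t_ambient_f + heat_w / ua_w_per_f
    if t_steady <= t_limit_f:
        return math.inf
    tau_s = cap_j_per_f / ua_w_per_f
    return tau_s * math.log((t_steady - t_start_f) / (t_steady - t_limit_f)) / 60.0

# Hypothetical example values, chosen only to exercise the function.
no_loss = time_to_limit_min(115.0, 140.0, heat_w=500.0, cap_j_per_f=50_000.0)
some_loss = time_to_limit_min(115.0, 140.0, heat_w=500.0, cap_j_per_f=50_000.0,
                              ua_w_per_f=5.0, t_ambient_f=90.0)
print(f"no heat loss: {no_loss:.1f} min, residual heat loss: {some_loss:.1f} min")
```

Because room ambient temperature itself rises after a loss of area cooling, a constant ambient is optimistic; repeating the calculation over the expected range of ambient temperatures, as the recommendation states, would bound the result.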

(4)

The ongoing staff evaluation of USI A-44, Station Blackout, should consider in more detail the environmental conditions and their effects, including common mode failures, on instrumentation and control systems needed to operate during and for recovery from a station blackout.

Special attention should be given to heat sensitive electronic components in these systems.

Operating experiences and actual plant data should be utilized to verify all assumptions made in the evaluation.

Specifically, actual heat loads in control rooms and instrument rooms during all modes of plant operation and temperature measurements of ambient and local temperatures of control rooms and instrument cabinets during all modes of area cooling system operation including loss of the system should be obtained and used in evaluating the Station Blackout issue.

(5)

Consideration should be given to the need to establish equipment qualification requirements for critical instruments even though they are located in "mild" environments.


UNITED STATES NUCLEAR REGULATORY COMMISSION
WASHINGTON, D. C. 20555

NN 13 BN

Docket No.: 50-528

LICENSEE: Arizona Public Service Company

FACILITY: Palo Verde Unit 1

SUBJECT: SUMMARY OF SITE VISIT TO EVALUATE RECENT EVENTS DURING POWER ASCENSION TEST PROGRAM

An NRC staff team visited the Palo Verde site on October 9 and 10, 1985 to review recent events at Palo Verde Unit 1 during the power ascension test program.

The team members consisted of representatives of NRR (E. A. Licitra, L. Harsh and I. Ahmed), IE (G. Lanik) and AEOD (M. Chiramal).

Attendees for the entrance meeting held on October 9 at the Palo Verde site are listed on Enclosure 1.

On October 11, 1985, an exit meeting was held in the licensee's office in Phoenix, Arizona.

Attendees for the exit meeting are listed on Enclosure 2.

The background for and a summary of the visit are discussed below.

Background

Palo Verde Unit 1 is a first of a kind plant (CESSAR System 80) and is the first nuclear plant to be operated by the Arizona Public Service Company.

Prior to licensing, Palo Verde Unit 1 had experienced a number of component problems which involved the reactor coolant pumps, LPSI pumps, control element assembly shroud, thermal sleeves and thermowells.

After licensing and during the power ascension test program, Palo Verde Unit 1 experienced the following three significant events within a period of less than a month:

(1)

September 12, 1985 (From About 50% Power)

During a loss-of-load test, the plant experienced a turbine trip, reactor trip, partial loss of offsite power, safety injection, containment isolation and gas binding of all three charging pumps.

(2)

October 2, 1985 (From About 50% Power)

Complete loss of offsite power, reactor trip, loss of switchyard instrumentation and loss of control room operation of the switchyard breakers (loss of offsite power was attributed to a malfunctioning multiplexer).

(3)

October 7, 1985 (While Shutdown to Evaluate October 2 Event)

Complete loss of offsite power (i.e., a repeat of the October 2 loss of offsite power occurrence).
