EHPG Paper - Generalization of Halden Experimental Results Using IDHEAS-G

Use of IDHEAS General Methodology to Generalize Operator Performance Data in Halden Experiments for Human Factors and Reliability Analysis Applications

Jing Xing, Ph.D. and Niav M. Hughes, Ph.D.

U.S. Nuclear Regulatory Commission, Washington, DC, USA
301-415 (2410; 2362), (Jing.Xing; Niav.Hughes)@nrc.gov

Abstract

The Integrated Human Event Analysis System (IDHEAS), a human reliability analysis method developed by the U.S. Nuclear Regulatory Commission, provides a hierarchical structure to analyze and assess the reliability of human actions. The method is based on cognitive science and is capable of incorporating human performance data to support the estimation of human error probabilities. IDHEAS models human performance in five macrocognitive functions: Detection, Understanding, Decision-making, Action execution, and Teamwork. IDHEAS defines a set of cognitive failure modes for each function to describe the various ways of failing to perform the function. IDHEAS analyzes an event in progressively more detailed levels:

event scenario, human actions, critical tasks of the actions, macrocognitive functions and cognitive failure modes of the tasks, and performance influencing factors. This structure provides an intrinsic interface to generalize and integrate various sources of human error data for applications in human factors and human reliability analysis. We analyzed and generalized two Halden Reactor Project operator performance experiments: the 2003-2004 task complexity study and the microtask study on human-system interface evaluation. This paper presents the IDHEAS-G structure along with a demonstration of generalizing the experimental results of the two studies. The data, once sufficiently populated, can inform applications in human factors and human reliability.

1. Introduction

The Halden Reactor Project (HRP) has been conducting human performance experiments with nuclear power plant (NPP) simulators as part of its Man Technology Organization (MTO) program for decades. The experimental results provide a technical basis for understanding operators, organizations, and technologies in NPPs. To highlight a few past experiments of interest, HAMMLAB has hosted thorough lines of investigation into a number of topics, including group-view display functionality [1][2][3][4][5], human-automation interaction (well summarized in HPR-387 [6]), teamwork and work practices [7][8][9][10][11], and staffing strategies for highly automated plants [12]. More recently, some projects have leveraged access to training simulators outside of HAMMLAB as a way of testing operators in their home plant, using their own procedures and interface, such as in the 2016 investigation of operator response to the failure of a computerized procedure system [13].

Each HRP human performance experiment addresses specific issues regarding new technologies, design concepts, or conduct of operations. To explore the issues, HRP researchers design and perform experiments with specific systems or technologies (e.g., computerized procedures), human-system interfaces (e.g., large overview displays), or concepts of operation (e.g., reduced control room staffing level). Operation-based scenarios are designed to allow testing of the issues. Over the years, HRP has developed many human performance measures, such as operator performance indices, task performance accuracy, task completion time, and situational awareness. The experiments use those measures to examine operators' performance as the operators perform required human actions in the scenarios. Researchers draw conclusions about the issues based on the human performance measures and document the experiments in Halden Work Reports (HWRs).

By design, the results of an experiment are applicable to the specific experiment scope, i.e., the systems, human-system interfaces (HSIs), scenarios, and concepts of operation specified in the study. Yet, the results should be able to provide further insights into general principles about how systems, HSIs, tasks in the scenarios, and concepts of operation may affect human performance. Such insights would allow using the results beyond the experimental scope. Moreover, while individual experiments address specific issues, it is desirable for the HRA and HF communities to learn the state of the art of human performance research from the full mosaic of MTO experiments: What have we learned about human performance, and what more do we need to learn?

Given the specific scope and design of each experiment, gaining these insights about the principles and learning the state of the art more broadly requires that individual experiments be generalized in a common framework. Any human performance experiment should be able to be generalized into the common framework, and the framework should be tied to fundamental principles of human performance. This paper documents our pilot study of generalizing HRP human performance experiment results using a cognitive framework.

The General Methodology of the Integrated Human Event Analysis System (IDHEAS-G), a human reliability analysis (HRA) method developed by the U.S. Nuclear Regulatory Commission (NRC), provides a hierarchical structure to analyze human events [14]. The method is based on a cognitive framework. It models human performance in five macrocognitive functions: Detection, Understanding, Decision-making, Action execution, and Teamwork. IDHEAS-G defines a set of processors for each function to describe the ways of achieving the function. IDHEAS-G analyzes an event in progressively more detailed levels: event scenario, human actions, critical tasks of the actions, macrocognitive functions and cognitive failure modes (CFMs) of the tasks, and performance influencing factors (PIFs). This structure provides an intrinsic interface to generalize various sources of human performance data and to elucidate the general human performance principles underlying the data. Previously, the NRC staff reviewed a large body of human error data from the literature and available human error databases, generalized the data into the IDHEAS-G structure, and used the generalized dataset to inform expert judgment of human error probabilities (HEPs) for nuclear power plant human actions outside the control rooms. Thus, we chose to use the IDHEAS-G cognitive structure as a framework to generalize HRP human performance experiment results.

Two HRP experiments were selected for piloting the generalization. The first is HWR-758 [15], The Task Complexity Experiment 2003/2004. The purpose of this experiment was to explore how additional tasks added to base case scenarios affected the operators' performance of the main tasks. The experiment ran five types of complicated scenarios. The second is HWR-1212 [16], Evaluation of Design Features in the HAMBO Operator Displays. The purpose of that study was to evaluate whether innovative design features in the HAMMLAB Boiling Water Reactor (HAMBO) displays show any performance impacts, compared to a conventional presentation of the same information, using HRP's microtask methodology. The experiment required operators to retrieve information from static or dynamic displays of several scenarios in the Conventional and Innovative designs. These two studies represent the simulation of very complicated operational scenarios versus the simplest isolated detection tasks. The motivation for this pilot study was threefold: 1) demonstrating the generalization of experimental results from these two types of studies into a common cognitive framework, 2) gaining new insights pertaining to general human performance principles from the reported results, and 3) exploring a systematic way to communicate experimental results and interpret them in ways pertinent to human factors engineering and human reliability analysis.

2. Method

2.1 Overview of the IDHEAS Cognitive Framework

The IDHEAS-G cognition framework includes a model of macrocognition that describes the brain processes (i.e., cognitive mechanisms) associated with the success or failure of a task, and a performance influencing factor (PIF) model that describes how various factors affect the success or failure of tasks.

2.1.1 The Macrocognitive Model

The macrocognitive model elucidates the cognitive process of human performance in applied work domains, where human tasks are complex and often involve multiple individuals or teams. The model is described as follows:

  • Macrocognition consists of five functions: Detection, Understanding, Decision-making, Action Execution, and Teamwork. The first four functions may be performed by an individual, a group, or a team, and the Teamwork function is performed by multiple groups or teams.
  • Any human task is achieved through these functions; complex tasks typically involve all five functions;
  • Each macrocognitive function is processed through a series of basic cognitive processors; failure of a cognitive processor could lead to the failure of the macrocognitive function;
  • Each processor is reliably achieved through one or more cognitive mechanisms; errors may occur in a cognitive processor if the cognitive mechanisms are challenged;
  • PIFs challenge the capacity limits of cognitive mechanisms and can lead to ineffectiveness of the mechanisms.

Table 1 shows the basic cognitive elements for the macrocognitive functions. The cognitive mechanisms are not presented due to the space limits of the paper.

Table 1: Macrocognitive Functions and Their Basic Elements

Detection
  • D1 - Initiate detection: establish the mental model and criteria
  • D2 - Identify and attend to sources of information
  • D3 - Perceive, recognize, and classify information
  • D4 - Verify the information acquired
  • D5 - Retain or communicate the acquired information

Understanding
  • U1 - Assess/select data
  • U2 - Select/adapt/develop the mental model
  • U3 - Integrate data with the mental model to maintain situational awareness, diagnose problems, or resolve conflicts
  • U4 - Verify, revise, and iterate the understanding
  • U5 - Communicate the understanding

Decision-making
  • DM1 - Manage the goals
  • DM2 - Adapt/develop the decision model
  • DM3 - Acquire/select information
  • DM4 - Make judgment or plans
  • DM5 - Simulate the decision
  • DM6 - Communicate and authorize the decision

Action execution
  • E1 - Assess/adapt the action plan
  • E2 - Develop/modify action scripts
  • E3 - Synchronize, supervise, and coordinate action implementation
  • E4 - Implement action scripts
  • E5 - Verify and adjust actions

Teamwork
  • T1 - Establish or adapt the teamwork infrastructure
  • T2 - Manage information
  • T3 - Maintain common ground
  • T4 - Manage resources
  • T5 - Plan inter-team collaborative activities
  • T6 - Implement decision/command
  • T7 - Verify, modify, and control the implementation
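To make the hierarchy in Table 1 easier to work with, the sketch below represents the five macrocognitive functions and their basic processors as a small Python structure that an analyst could use to look up candidate failure points for a task. This is purely an illustration: the dictionary layout, the name MACROCOGNITIVE_FUNCTIONS, and the helper processors_for are our own assumptions, not an IDHEAS-G data format.

```python
# Illustrative sketch: the five macrocognitive functions and their basic
# processors from Table 1, keyed by the identifiers used in the paper.
# This is an aid for tagging tasks, not an official IDHEAS-G artifact.
MACROCOGNITIVE_FUNCTIONS = {
    "Detection": {
        "D1": "Initiate detection - establish the mental model and criteria",
        "D2": "Identify and attend to sources of information",
        "D3": "Perceive, recognize, and classify information",
        "D4": "Verify the information acquired",
        "D5": "Retain or communicate the acquired information",
    },
    "Understanding": {
        "U1": "Assess/select data",
        "U2": "Select/adapt/develop the mental model",
        "U3": "Integrate data with the mental model",
        "U4": "Verify, revise, and iterate the understanding",
        "U5": "Communicate the understanding",
    },
    "Decision-making": {
        "DM1": "Manage the goals",
        "DM2": "Adapt/develop the decision model",
        "DM3": "Acquire/select information",
        "DM4": "Make judgment or plans",
        "DM5": "Simulate the decision",
        "DM6": "Communicate and authorize the decision",
    },
    "Action execution": {
        "E1": "Assess/adapt the action plan",
        "E2": "Develop/modify action scripts",
        "E3": "Synchronize, supervise, and coordinate action implementation",
        "E4": "Implement action scripts",
        "E5": "Verify and adjust actions",
    },
    "Teamwork": {
        "T1": "Establish or adapt the teamwork infrastructure",
        "T2": "Manage information",
        "T3": "Maintain common ground",
        "T4": "Manage resources",
        "T5": "Plan inter-team collaborative activities",
        "T6": "Implement decision/command",
        "T7": "Verify, modify, and control the implementation",
    },
}


def processors_for(function_name: str) -> dict[str, str]:
    """Return the basic processors (potential failure points) of a function."""
    return MACROCOGNITIVE_FUNCTIONS[function_name]


# Example: the critical task analyzed later in Table 4, "start the boron
# system manually", mainly exercises Detection and Action execution.
if __name__ == "__main__":
    for fn in ("Detection", "Action execution"):
        print(fn, "->", list(processors_for(fn)))
```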

2.1.2 The Performance Influencing Factor Model

PIFs affect cognitive mechanisms and increase the likelihood of macrocognitive function failure. We developed a PIF model that is independent of HRA applications and links to cognitive mechanisms.

The model systematically organizes PIFs to minimize inter-dependency or overlapping of the factors.

The PIF structure has four layers:

PIF category: PIFs are classified into three categories corresponding to characteristics of systems, tasks, and personnel.

PIFs: Each category has high-level PIFs describing specific aspects of the systems, tasks, or personnel. Table 2 shows the PIFs within the three categories.

Table 2: Performance Influencing Factors in IDHEAS-G

System- and environment-related PIFs
  • Transparency of systems and instrument & control
  • Human-system interface (HSI)
  • Work location accessibility and habitability
  • Tools and equipment
  • Environmental factors

Task-related PIFs
  • Information availability and reliability
  • Scenario familiarity
  • Multi-tasking, interruptions and distractions
  • Cognitive complexity
  • Mental fatigue and stress
  • Physical demands

Personnel-related PIFs
  • Staffing
  • Training
  • Procedures, guidelines, instructions
  • Teamwork factors
  • Work process

PIF attributes: These are the specific traits of a performance influencing factor. A PIF attribute represents a poor PIF state that challenges cognitive mechanisms and increases the likelihood of errors in cognitive processes. Table 3 shows some example attributes of the PIF Information availability and reliability.

Table 3: Example Attributes of the PIF Information Availability and Reliability

Nominal state of Information availability and reliability: Information is needed for personnel to perform tasks. Information is expected to be complete, reliable, unambiguous, and available to personnel in a timely manner.

  • Inadequate updates of information (e.g., a party receives information but fails to inform another party).
  • Information of different sources is not synchronized.
  • Conflicts in information

Every PIF attribute challenges one or several cognitive mechanisms. IDHEAS-G provides links between PIF attributes and cognitive mechanisms, synthesized and inferred from the literature.
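A minimal sketch, assuming a simple dataclass representation that is not part of IDHEAS-G itself, of the layered PIF structure just described: a PIF belongs to a category, carries attributes, and each attribute records the cognitive mechanisms it challenges. The mechanism strings below are placeholders, not entries from the IDHEAS-G mechanism tables.

```python
from dataclasses import dataclass, field


@dataclass
class PIFAttribute:
    # A poor PIF state and the cognitive mechanisms it challenges
    # (placeholder mechanism names, for illustration only).
    description: str
    challenged_mechanisms: list[str] = field(default_factory=list)


@dataclass
class PIF:
    name: str
    category: str  # "System/environment", "Task", or "Personnel"
    attributes: list[PIFAttribute] = field(default_factory=list)


# The PIF used as the example in Table 3, with its attributes.
information_availability = PIF(
    name="Information availability and reliability",
    category="Task",
    attributes=[
        PIFAttribute("Inadequate updates of information",
                     ["cue detection", "working-memory refresh"]),
        PIFAttribute("Information of different sources is not synchronized",
                     ["integration of data with the mental model"]),
        PIFAttribute("Conflicts in information",
                     ["conflict resolution", "verification"]),
    ],
)

# An analyst would mark which attributes are present in a scenario and use the
# mechanism links to judge which cognitive failure modes become more likely.
present = [a for a in information_availability.attributes
           if "synchronized" in a.description]
print([a.challenged_mechanisms for a in present])
```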

2.1.3 Implementation of the IDHEAS-G Cognition Framework for Human Reliability Analysis

IDHEAS-G implements the cognition model in the general HRA process, which includes qualitative analysis and HEP quantification. It begins with an event and progressively analyzes more detailed elements of the event: operational narratives and context of the scenarios, human actions in the scenarios, critical tasks in the human actions, macrocognitive functions, cognitive failure modes (CFMs) of the critical tasks, the PIF states, and human error probabilities. Analysis of these elements is carried out through the following process (a structural sketch follows this list):

  • Scenario analysis: Develop the operational narrative for the scenarios and identify the human performance context
  • Identification and definition of human actions: Identify the key human actions pertinent to the mission of the event and define the human actions
  • Task analysis: Analyze the tasks required for the human action and characterize the critical tasks for HEP quantification
  • Time uncertainty analysis: Analyze time uncertainties in the human action and quantify the HEP attributable to time uncertainties
  • Cognition failure analysis: Identify the cognitive failure modes of every critical task in a human action and estimate the HEP attributable to failures of macrocognitive functions for the critical tasks
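As an illustration of the progressive decomposition described above, the sketch below arranges one event into the IDHEAS-G levels of analysis. It is a minimal, hypothetical representation: the class names and fields are our own, the example values are taken from the HWR-758 generalization in Table 4, and HEP quantification is deliberately omitted.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalTask:
    description: str
    macrocognitive_functions: list[str]
    cognitive_failure_modes: list[str]   # e.g., "D4", "DM2", "E3", "E4"
    pif_states: dict[str, str]           # PIF -> assessed impact


@dataclass
class HumanAction:
    name: str
    critical_tasks: list[CriticalTask] = field(default_factory=list)


@dataclass
class Event:
    scenario_narrative: str
    context: dict[str, str]
    human_actions: list[HumanAction] = field(default_factory=list)


# Example drawn from the HWR-758 generalization in Table 4 (Scenario 4).
scram_event = Event(
    scenario_narrative=("Feedwater leakage and incomplete scram; "
                        "the boron system must be started manually"),
    context={"system": "multiple failures, non-responsive controls",
             "task": "primary and secondary tasks in the same period of time"},
    human_actions=[HumanAction(
        name="Action A: start the boron system manually",
        critical_tasks=[CriticalTask(
            description="Start the boron system manually",
            macrocognitive_functions=["Detection", "Decision-making",
                                      "Action execution"],
            cognitive_failure_modes=["D4", "DM2", "E3", "E4"],
            pif_states={"Multi-tasking, interruptions and distractions": "low",
                        "Mental fatigue and stress": "low to moderate"},
        )],
    )],
)
```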

2.2 Generalization of HRP Human Performance Experiment Results with the IDHEAS-G Framework

We took the following process to generalize the results in the two selected HWRs:

1. Thoroughly read the HWRs and other related reports (e.g., HRP has several reports on the same topic as that of HWR-758);

2. Apply the IDHEAS-G process to analyze the experiment and results, i.e., represent the experimental results in IDHEAS-G elements: operational narrative and context of the scenarios, important human actions, task analysis, macrocognitive functions and applicable failure modes, states of the PIFs relevant to the operators' tasks, and error rates of the operators' task performance and other performance measures;
3. Analyze the results to gain insights into general human performance principles with respect to human reliability analysis and human factors engineering;
4. Document the description of the study (including the HWR abstract) and the information generalized in steps 2 and 3;

5. Verify the analysis with peers.

The piloting results of generalization in this report were verified by the two authors. The next step is to verify our analysis with the HRP researchers who performed the studies. Thus, the information reported in the RESULTS section of this paper is only for the purpose of demonstration.

3. Results

We documented the results in Tables 4 and 5, one for each of the studies. Each table has three main sections following the abstract of the original HWR: documentation of the experimental design (Part I), generalization of the design and results to the IDHEAS-G framework (Part II), and insights and lessons learned from the analysis (Part III).

Table 4: Analysis of results of the task complexity experiment 2003/2004

Reference: Halden HWR-758, The Task Complexity Experiment 2003/2004

The purpose of this experiment was to explore how additional tasks added to base case scenarios affected the operators' performance of the main tasks. These additional tasks were placed in different scenario variants intended to cause high time pressure, high information load, and high masking.

The experiment was run in the Halden Man-Machine Laboratory's Boiling Water Reactor simulator.

Seven crews participated, each for one week. There were three operators in each crew. Five main types of scenarios and 20 scenario variants were run. The data from the experiment were analyzed by completion time for important actions and by in-depth qualitative analyses of the crews' communications. The results showed that high time pressure decreased some of the crews' performance in the scenarios. When a crew had problems in solving a task for which the time pressure was high, they had even more problems in solving other important tasks. High information load did not affect the operators' performance much, and, in general, the crews were very good at selecting the most important tasks in the scenarios. The scenarios that included both high time pressure and high information load resulted in further reduced performance for the crews compared to the scenarios that only included high time pressure. The total amount of tasks to do and information load to attend to seemed to affect the crews' performance. To solve the scenarios with high time pressure well, it was important to have good communication and good allocation of tasks within the crew. Furthermore, the results showed that scenarios with an added complex, masked task created problems for some crews when solving a relatively simple main task. Overall, the results confirmed that complicating but secondary tasks that are not normally taken into account when modelling the primary tasks in a Probabilistic Risk Analysis (PRA) scenario can adversely affect the performance of the main tasks modelled in the PRA scenario.

Part I. Documentation of experimental design

Scenario: Five PRA scenarios, each with four variations
Subjects: 7 crews
Variables: Base scenario; base scenario + additional task; base scenario + additional information load; base scenario + additional task + additional information load
Measures: Task completion time; number of tasks correctly completed; observation of communication; observation of teamwork; individual crews' performance stories

Part II. Generalization of the design and results for Experiment 1: Scenario 4 - Incomplete Scram / Start of the Boron System

Scenario analysis

Scenario definition: In this scenario there is a leakage in the main feedwater system. The leakage leads to isolation of the feedwater system and reactor scram. Normally, at reactor scram, 169 control rods are inserted into the core within 4 seconds. In this scenario, 12 nearby control rods are stuck, and the rods will not be inserted by the back-up control rod drive system either. The operators are supposed to start the boron system to reduce the reactor power.

Initial conditions: Production at full power. An ordinary maintenance period in safety system train C is ongoing. Emergency pumps and diesel generator in train C are blocked. One of six main cooling water pumps (441 PC1) has a failure; the pump is in operation as usual but the current measurement indicates 0%. The current measurement is an input to the logic for the cooling function of the main condenser. The fast transfer function in train D is in manual mode due to a failure. No other known problems.

Initiating event: A failure in the reactor protection system leads to alarm "354 KB701." To reduce the reactor power, operators have to start the boron system manually. There are procedures for this situation.

Boundary conditions: At the plant, there is a push button to start the boron system in one of the two trains (train C and D). In the simulator, the push button is placed to the left in the process picture "351 PC1_1 Borinpumping". It is also possible to start the system by maneuvering the components one by one.

Event context:
  Environmental context: None
  System context: Multiple failures, non-responsive controls
  Personnel context: Normal crew, good training; procedures are available for the primary task but may not be detailed enough for the secondary tasks
  Task context: The primary task and multiple secondary tasks are going on within the same period of time; time available is adequate but personnel are under time pressure due to multiple failures; key information is masked or misleading in some scenario variants

Critical actions: Action A - start the boron system manually

Task analysis of Action A

Time uncertainty: Adequate time available
Critical tasks: Start the boron system manually

Cognitive activities:
  Detect the cue: alarm "354 KB701 532 V group B1 malfunction"
  Diagnosis and decision-making: straightforward per the alarm response procedures
  Action execution: start the boron system by pressing a button
  Special requirements for teamwork: None

Parallel tasks:

Base scenario 4.1 - Several additional tasks take additional time but they are not intermingled multitasks and not necessarily interruptions.

Scenario 4.2 - Additional system failure cues are onset before the initiating event; operators need to take care of the secondary tasks.

Scenario 4.3 - Additional information load (indications/alarms) requires the crew to detect and understand it.

Scenario 4.4 - 4.1 + 4.2 + 4.3

Applicable cognitive failure modes:
  Failure of detection: D4 - Key alarm "354 KB701 532 V group B1 malfunction" not attended to
  Failure of decision-making: DM2 - Incorrectly prioritize competing goals in Scenarios 4.2, 4.3, and 4.4 (not performing the critical task first)
  Failure of execution: E3 - Action not initiated
  Failure of execution: E4 - Failure of a simple action (start the boron system)

PIF states:
  All scenarios: Distraction - secondary tasks may distract crews (low impact); Competing goals - crews may choose to attend to secondary tasks first (low impact)
  Scenario 4.2: Time pressure due to added tasks (low impact)
  Scenario 4.3: Distraction - added information load causes more distraction (low to moderate impact)
  Scenario 4.4: Added tasks and information load cause i) time pressure, ii) more distraction, and iii) possible mental fatigue (low to moderate impact)

Likelihood of failure:
  Scenario 4.1 - All CFMs are highly unlikely
  Scenario 4.2 - All CFMs are highly unlikely
  Scenario 4.3 - All CFMs are highly unlikely
  Scenario 4.4 - DM2 possible: crew may attend to other tasks first; E3 possible: crew may forget to initiate the action execution of the primary task due to distraction

Experimental results (the observed rates are tallied in a short sketch after this table)

Number of incorrectness and CFMs:
  0/28 - Seven crews in four scenarios all correctly started the boron system. There were errors at the CFM level, but the errors were recovered.

  1/28 D4 - Key alarm "354 KB701 532 V group B1 malfunction" not attended to in Scenario 4.1. The shift supervisor did not seem to know that some control rods other than those belonging to scram group C1 were not inserted. He realized this later by asking the reactor operator.

3/28 DM2 - Incorrectly prioritize competing goals in Scenario 4.4. After feedwater isolation, the shift supervisor first suggested that the reactor operator should open a scram valve in the scram system (354), then said that the reactor operator should start the boron system.

  1/28 E4 - Incorrectly executed a simple action in Scenario 4.4. The shift supervisor said that the reactor operator should try to start the boron system. The shift supervisor tried several times to start the boron pump on the man-machine interface (instead of starting the boron system, he tried to start the boron pump and did not manage to start the boron system). The shift supervisor asked the reactor operator to look at what he had done wrong on the boron system, and the reactor operator started the boron system in train C.

Number of failures in non-critical actions and added tasks:
  5/14 E4 - Incorrectly executed a simple action in Scenarios 4.2 and 4.4. Crews did not close the open pressure relief valve (314 VA2); they forgot to open the pressure control valve (314 VA17) before they closed the isolation valve (314 VA23).
  4/20 E4 - Failed to open the scram valve in group B1 (354 VB1). Two crews lacked knowledge for doing this task and failed it in all the scenarios.
  1/28 E4 - Failed to insert the 531 detectors in Scenario 4.2.

Part III. Performance stories and insights

Dependency: Two of the crews that did not close the pressure relief valve (314 VA2) did not start enough auxiliary feedwater pumps to keep the level in the reactor tank.

Error of commission: The crew monitored that the low-pressure injection pumps started to pump water to the reactor tank. Then the reactor operator said that they had to be careful not to dilute all the boron in the reactor tank and asked whether they should plan to stop the low-pressure injection pumps. The shift manager called and discussed the situation with the plant management (simulated), and they agreed to stop the low-pressure injection pumps and to stop the boron in one train, so that they could keep that boron tank in reserve until they had regained control over the level in the reactor tank. (One of the two boron tanks is enough to make the reactor subcritical.) The reactor operator then stopped the low-pressure injection pumps when the level in the reactor tank was about 6 meters. Both of the crews that got a low level in the reactor tank activated the TB switch to get fast depressurization, and they also stopped the low-pressure injection system so they did not dilute the boron too much. Both of these actions were very critical and important for safety.

Change of conduct of operation: In the scenarios with high time pressure there were many failures, and the crews had relatively little time to solve them. The crews who solved these scenarios well managed to divide the tasks in the scenario: the shift supervisor worked on some of the tasks and delegated other tasks to the reactor operator. The shift supervisor also divided the tasks between the reactor and the turbine operators. This was a change from the three-way communication. The safety impact of the change is not assessed here.

Work process: In the scenarios with high time pressure, the work practice with first checks did not work well. The reactor operator had many failures to work on, and the shift supervisor also handled some of them. There did not seem to be time to do the first checks and to report them within 10 minutes. A safety analysis should be performed to assess the work process for handling situations like this.
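The counts reported in Table 4 (e.g., 3/28 for DM2) can be reduced to simple observed rates. The short sketch below does only that arithmetic; it is not the IDHEAS-G quantification step, which relies on expert judgment informed by such data, and the variable names are ours.

```python
# Observed cognitive-failure counts from Table 4: (failures, opportunities).
# Opportunities are 7 crews x 4 scenario variants = 28 unless the table
# states otherwise (e.g., 14 for failures limited to Scenarios 4.2 and 4.4).
observed = {
    "D4 - key alarm not attended to": (1, 28),
    "DM2 - incorrect prioritization of competing goals": (3, 28),
    "E4 - incorrectly executed simple action (boron system)": (1, 28),
    "E4 - relief valve 314 VA2 not closed (Scenarios 4.2/4.4)": (5, 14),
    "E4 - scram valve 354 VB1 not opened": (4, 20),
}

for cfm, (failures, opportunities) in observed.items():
    rate = failures / opportunities
    print(f"{cfm}: {failures}/{opportunities} = {rate:.2f}")
```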

Table 5: Analysis of the microtask evaluation of innovative versus conventional design features

Reference: Halden HWR-1212, Evaluation of Design Features in the HAMBO Operator Displays

A current challenge within the nuclear industry is to assess the impact of computerized interfaces on human performance. The purpose of this study was to evaluate whether innovative design features in the HAMmlab BOiling water reactor (HAMBO) displays show any performance impacts, compared to a conventional presentation of the same information: Is there added value in having innovative design features in addition to the numerical information? We used a within-subject experimental approach where nine participants responded to the same blocks of questions in two conditions: with innovative features such as mini-trends, pie-charts and bar graphs; and a control condition where the process information was presented through numerical information only. Overall, the performance results showed that the participants were more accurate in the innovative condition and showed equivalent response times in both conditions. Pie charts, mini-trends and pictorial elements eliminated the need to recall values from memory such as nominal and previous process parameters. Bar graphs were advantageous for checking the status of multiple components or systems. For questions that required verification of parameters, there were no differences in accuracy between the conditions, but the operators tended to answer quicker in the conventional displays, suggesting that the innovative features might have acted as a distractor when they were not actively relevant for the question. Eye tracking analysis of the questions with the largest differences between conditions showed that dwell times and fixation counts tended to be lower in the innovative condition. The heat maps also suggested more focused attention in the innovative condition. These findings suggest that the participants were able to locate and identify relevant information more effectively in the innovative displays. The analysis of the operators confidence ratings showed a tendency for overconfidence in incorrect responses that is interesting to explore in future studies.

Part I. Documentation of experimental design

Scenario: No real scenario. Operators individually observe the simulator displays and answer a questionnaire.
Methodology: Microtask
Subjects: Nine Swedish operators
Variables: Conventional display (numerical information only) vs. innovative display (mini-trends, pie charts, pictorial elements, and other features such as bar graphs and trend diagrams); static screenshots vs. 5-minute dynamic scenario observation; types of questions in the questionnaire (type of cognitive task: 1) assessing trends, 2) multiple parameters, 3) checking values)

Measures: Accuracy of answering the questionnaire; response time of answering the questionnaire; eye tracking

Part II. Generalization of the design and results for questions on static and dynamic scenarios

Scenario analysis: Not applicable

Task analysis

Time uncertainties: Adequate time, with time pressure (answer the questions as accurately and as quickly as possible)

Critical tasks: Answer every question in the questionnaire
Cognitive activities: Acquire information. Note: the task is to acquire information even for the dynamic scenarios, because the operators had no task while the scenario was playing.

Applicable failure modes: Failure of detection:
  D1 - Incorrect or no mental model: Not applicable. The question asked establishes the mental model of what to detect.
  D2 - Attend to wrong sources: Applicable
  D3 - Incorrect perception, recognition, or classification: Applicable
  D4 - Failure of verification, peer-checking, or supervision: Applicable
  D5 - Failure of retaining or communicating detected information: Not applicable

PIF states (Conventional vs. Innovative):
  Work process - Conventional: no peer-checking, no supervision, poor verification (because of the instruction "as quick as possible"), no feedback. Innovative: same as Conventional.
  Information - Conventional: incomplete information for trending questions with the conventional display. Innovative: nominal.
  HSI - Conventional: nominal to low impact (parameters needed for a task are spatially distributed; saliency may be less than nominal; more clutter). Innovative: nominal to low impact (presentation of parameters is complicated and needs interpretation).

Prediction of likelihood of failure (Conventional vs. Innovative):
  D2 - Conventional: unlikely. Innovative: unlikely.
  D3 - Conventional: low for most questions (should be the same as Innovative because the HSI is nominal); very high or infeasible for questions about trending or other questions that required retrospective memory, because operators would not pay attention to trending even with dynamic scenarios (they did not have the task when the information was available, and the trending information was no longer available when they performed the detection tasks). Innovative: low for most questions; high for verifying the parameters because of clutter and the complicated graphic representation.
  D4 - Conventional: moderate likelihood of errors due to lack of peer-checking/supervision and possibly not much verification. Innovative: same as Conventional.
  Pt - Very low because of adequate time.

Experimental results

Correctness (proportion of correct answers, Conventional vs. Innovative):
  Trending: 0.67 vs. 0.87
  Multiple parameters: 0.78 vs. 0.89
  Check values: 0.94 vs. 0.88
  Overall: 0.86 vs. 0.91

Table 11 in HWR-1212 shows the questions that revealed the lowest accuracy in the dynamic scenarios.

Question 12: The pressure in the RPV has exceeded 7.3 MPa - 323 Pictorial - Conventional 0.44, Innovative 0.67
Question 22: The highest temperature in the reactor containment was 58 °C - 323 Pictorial - Conventional 0.22, Innovative 0.89
Question 35: The condensate flow 462 KB301 has increased - 462 Mini-trend - Conventional 0.11, Innovative 1.0
Question 40: The level in 462 TD1 has been constant - 462 Mini-trend - Conventional 0.11, Innovative 0.89

Response time: Overall, the response time for Innovative is several seconds shorter than that for Conventional.

Eye tracking: Eye tracking analysis of the questions with the largest differences between conditions showed that dwell times and fixation counts tended to be lower in the innovative condition. The heat maps also suggested more focused attention in the innovative condition. These findings suggest that the participants were able to locate and identify relevant information more effectively in the innovative displays.

Part III. Performance stories and insights

Innovative HSI: Innovative HSI design features have the potential to improve human performance by reducing the negative impact of some PIFs; they also have the potential to deteriorate human performance by increasing the negative impact of other PIFs. For example, the graphic features improve on the HSI attribute that related parameters are spatially distributed, yet they introduce the new HSI attribute that parameter representations are complicated and need interpretation.

Conventional HSI: While the Innovative HSI resulted in higher accuracy for questions asking for trending (0.87) and assessing multiple parameters (0.89), the highest accuracy reported was 0.94 for checking parameter values with Conventional, though the accuracy for Innovative was still in the high range (0.88). This result suggests that when the cognitive activity required is checking straight values without additional cognitive manipulations, the numerical-information-only condition is sufficient and possibly even preferred. Caution is suggested when applying this interpretation, however, in that when performing a real scenario, simply checking a value in isolation (i.e., not in combination with other cognitive activities) is rare. Thought should be given to how indications will be used, so as to avoid unnecessary duplication or use of a conventional HSI that neglects the need for additional cognitive manipulations.

ISV: While the mean accuracy is 0.86 for Conventional and 0.91 for Innovative, Table 11 shows that the accuracies for some questions are very poor with Conventional, and the accuracies for Question 12 are poor for both Conventional and Innovative. This result implies that some safety issues can be hidden behind a high average accuracy.

ISV: The conclusion of the study is that operators performed better with Innovative than with Conventional, as indicated by the mean accuracy and response time. However, being a few seconds faster in response time does not make a significant difference in operators' job performance. For accuracy, removing the 4-5 questions in Table 11 whose accuracies are extremely low would result in roughly equal accuracies for Conventional and Innovative. Those questions were infeasible for Conventional because they all asked for information that no longer existed when the operators performed the detection tasks. In other words, when the data (accuracies for individual questions) fall in bimodal or multi-modal distributions, simply averaging them can be misleading (see the sketch after this table).

Work process / conduct of operation: The highest accuracy reported was 0.94, for checking parameter values with Conventional. The corresponding error rate is still high compared to the error rates observed when operators perform real scenarios as a crew. The only non-nominal performance influencing factor here was the absence of peer-checking/supervision/feedback. The result may therefore provide evidence for the effect of peer-checking/supervision/feedback.

Confidence ratings: The results of the confidence ratings indicated overconfidence in incorrect responses. As noted in the HWR, the phenomenon is established in cognitive psychology [17]; however, it is as yet unexplored in the context of nuclear operators in nuclear power plant main control rooms (NPP MCRs) (see [18] for previous Halden work using a self-rating bias measurement). The phenomenon demonstrated could be considered from a methodological standpoint by researchers working in NPP MCRs, as it has the potential to influence results and conclusions. It should be further explored and/or considered when choosing measures and methodologies for studying expert operators in the nuclear domain. Furthermore, as mentioned in the HWR, confidence ratings can help us to better understand decision making [19] and, again from a methodological standpoint, might aid in understanding differences between novice and expert operators [20].

Eye tracking: Eye tracking analysis of the questions with the largest differences between conditions showed that dwell times and fixation counts tended to be lower in the innovative condition. The heat maps also suggested more focused attention in the innovative condition. These findings suggest that the participants were able to locate and identify relevant information more effectively in the innovative displays.

Eye tracking / methodology: We also explored the usefulness of eye tracking methodologies within the context of interface assessment for nuclear process control. The findings showed that eye metrics could mediate the interpretation of specific performance outcomes by highlighting monitoring patterns and styles during the tasks. The operators considered the eye-tracking glasses comfortable throughout the study and reported that they did not interfere with their tasks, which is a relevant advantage of the equipment.

Microtask / methodology: The microtask method was used as a procedure to evaluate HSI design features in the HAMBO operator displays for the first time. The method was flexible enough to enable data collection and analysis according to multiple variables such as operator roles, types of tasks, questions, design features, or displays. The data collection was efficient, with a short session providing a significant amount of data.
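To illustrate the averaging caveat raised in the ISV row of Table 5, the sketch below uses a hypothetical set of per-question accuracies; only the four lowest values are taken from the Table 11 excerpt, and the rest are invented for the example. It shows how a respectable mean can coexist with near-zero accuracy on individual questions.

```python
import statistics

# Hypothetical per-question accuracies for one display condition.
# The four lowest values mirror the Table 11 excerpt (0.44, 0.22, 0.11, 0.11);
# the remaining values are invented for illustration only.
accuracies = [1.0, 1.0, 1.0, 0.94, 0.94, 0.89, 0.89, 0.89, 0.44, 0.22, 0.11, 0.11]

mean_acc = statistics.mean(accuracies)
worst = min(accuracies)
poor = [a for a in accuracies if a < 0.5]

print(f"mean accuracy: {mean_acc:.2f}")      # looks acceptable on its own
print(f"worst question: {worst:.2f}")        # reveals an effectively infeasible question
print(f"questions below 0.5: {len(poor)}")   # the cluster that the mean hides
```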

4. Conclusions

IDHEAS-G is a general HRA methodology built on a cognitive framework. Its layered structure allows for the generalization of human performance data of different formats and various levels of detail. This pilot study demonstrated the feasibility of using the IDHEAS-G cognitive framework to generalize the experimental results of very complicated, full scenarios run by crews as well as the results of very simple microtasks performed by individual operators. Moreover, generalizing the results in a cognitive framework allows for identifying insights into general human performance principles that are applicable beyond the specific experiment scope. The generalization provides a systematic understanding of human performance by elucidating the contextual factors challenging human performance, delineating how personnel perform tasks for the required human actions, and revealing how and why personnel may fail to perform required actions, as represented through the applicable cognitive failure modes and the associated performance influencing factors. Together, these enhance the technical basis for improving system design, new technologies, concepts of operation, procedures, and training in NPP control room operations.
5. Disclaimer

This paper presents a research project conducted by the staff of the U.S. Nuclear Regulatory Commission. It does not represent an NRC official position. Although the NRC staff may suggest a course of action in the paper, these suggestions are not legally binding, and the regulated community may use other approaches to satisfy regulatory requirements.
6. References

[1] Hurlen, L., Skraaning, G., Myers, B., Carlsson, H., and Jamieson, G. (2014). The Plant Panel: Feasibility Study of an Interactive Large Screen Concept for Process Monitoring and Operation. (HWR-1129). OECD Halden Reactor Project, Halden, Norway.

[2] Skraaning, G., Hurlen, L., Le Darz, P. & Jamieson, G. (2016) Feasibility Study of an Interactive Large Screen Concept for Automated Plant Startup. (HWR-1179). OECD Halden Reactor Project, Halden, Norway.

[3] Massaiu, S. and Holmgren, L. (2016). Preliminary Results from the 2013 Resilient Procedure Use Study with Swedish Operators. (HWR-1122). OECD Halden Reactor Project, Norway.

[4] Massaiu, S. and Holmgren, L. (2017). The 2013 Resilient Procedure Use Study with Swedish Operators: Final Results. (HWR-1216). OECD Halden Reactor Project, Halden, Norway.

[5] Eitrheim, M., Svengren, H., and Fernandes, A. (2017). Evaluation of the Human System Interface Concept for Near-Term Applications. (HWR-1211). OECD Halden Reactor Project, Norway.

[6] Skraaning, G. and Jamieson, G. (2017). Twenty Years of HRP Research on Human Automation Interaction: Insights on Automation Transparency and Levels of Automation. (HPR-387). OECD Halden Reactor Project, Halden, Norway.

[7] Strand, S., Kaarstad, M., Svengren, H., Karlsson, T. & Nihlwing, C. (2010). Work Practices 2009 HAMMLAB Study: Team Transparency in Near-Future Computer-Based Control Rooms. (HWR-952). OECD Halden Reactor Project, Halden, Norway

[8] Kaarstad, M., Strand, S. & Nihlwing, C. (2012). Work Practices and New Technologies - iPad as a Tool for the Shift Supervisor to monitor Process Information. (HWR-996.) OECD Halden Reactor Project, Halden, Norway.

[9] Kaarstad, M., Strand, S. & Nihlwing, C., Holmgren, L. & Berntsson, O. (2014). Control Room and Field Operator Collaboration - Use of a Handheld Tool. (HWR-1124). OECD Halden Reactor Project, Halden, Norway

[10] Skraaning, G. (2016). A Reanalysis of the Work Practice Experiments in HAMMLAB (2009-2014). (HWR-1194). OECD Halden Reactor Project, Halden, Norway.

[11] Skjerve, A.B.M., Nihlwing, C., Nystad, E. (2008). Lessons learned from the extended teamwork study. (HWR-867). Halden, Norway: OECD Halden Reactor Project.

[12] Eitrheim, M. H., Skraaning, G., Lau, N., Karlsson, T., Nihlwing, C., Hoffmann, M., & Farbrot, J. E. (2010). Staffing Strategies in Highly Automated Future Plants: Results from the 2009 HAMMLAB Experiment. (HWR-938). OECD Halden Reactor Project, Halden, Norway.

[13] Taylor, C., Hildebrandt, M., Hughes, N. & McDonald, R. (2016). Operator response to failure of a computerized procedure system. Results from a training simulator study. (HWR-1198). OECD Halden Reactor Project, Halden, Norway.

[14] U.S. Nuclear Regulatory Commission (2019). The General Methodology of an Integrated Human Event Analysis System (IDHEAS-G). NUREG-2198 (in preparation).

[15] Laumann, K., Braarud, P. O., & Svengren, H. (2005). The Task Complexity Experiment 2003/2004. (HWR-758). OECD Halden Reactor Project, Halden, Norway.

[16] Eitrheim, M. H., Fernandes, A., and Svengren H. (2017). Evaluation of design features in the HAMBO operator displays. (HWR-1212). OECD Halden Reactor Project, Halden, Norway.

[17] Klayman, J., Soll, J. B., Gonzalez-Vallejo, C., & Barlas, S. (1999). Overconfidence: It Depends on How, What, and Whom You Ask. Organizational Behavior and Human Decision Processes, 79(3), 216-247.

[18] Massaiu, S., Skjerve, A.B.M., Skraaning Jr., G., Strand, S., Wrø, I. (2004). Studying Human-Automation Interactions: Methodological Lessons Learned from the Human-Centred Automation Experiments 1997-2001. (HWR-760). OECD Halden Reactor Project.

[19] Ratcliff, R., & Starns, J. J. (2013). Modeling confidence judgments, response times, and multiple choices in decision making: Recognition memory and motion discrimination. Psychological Review, 120(3), 697-719. http://doi.org/10.1037/a0033152.

[20] Spence, M. T., & Brucks, M. (1997). The Moderating Effects of Problem Characteristics on Experts' and Novices' Judgments. Journal of Marketing Research, 34(2), 233.