ML24059A402

ROP Data Strategy Evaluation Report and Strategic Recommendations
Person / Time
Issue date: 03/19/2024
From: Patrick Finney
NRC/NRR/DRO/IRAB
To: Russell Felts
NRC/NRR/DRO


Text

March 19, 2024

MEMORANDUM TO: Russell N. Felts, Director
               Division of Reactor Oversight
               Office of Nuclear Reactor Regulation

FROM:          Patrick Finney, Chief
               Reactor Assessment Branch
               Division of Reactor Oversight
               Office of Nuclear Reactor Regulation

SUBJECT:       EVALUATION OF REACTOR OVERSIGHT PROCESS DATA STRATEGY

In a charter dated August 8, 2023 (Agencywide Documents Access and Management System (ADAMS) Accession No. ML23216A132), you directed the formation of a working group tasked with evaluating the current Reactor Oversight Process (ROP) data strategy and recommending any strategic changes, along with an initial implementation plan for those changes. The working group was tasked to review the current status of ROP data, focusing on the following four areas:

1) ROP data sources, storage, quality, and accessibility
2) ROP data ownership/stewardship and responsibilities
3) Internally-facing ROP data visualization and analytical tools, and
4) Externally-facing ROP data visualization and analytical tools.

The staff has completed that evaluation and has documented in the enclosure the results of the evaluation, six best practices for ROP data, a description of an idealized dataflow for ROP data processes, and five specific strategic recommendations, along with an initial implementation plan for each recommendation. The working group's working files, ROP Datasets (non-public) and ROP Data Tools (non-public), are also available for your reference.

Enclosure:

1. ROP Data Strategy Evaluation Report and Strategic Recommendations

CONTACT: Nicole E. Fields, NMSS/REFS/RRPB
         (630) 829-9570

Digitally signed by Patrick Finney, 2024.03.20

ML24059A402
OFFICE  NMSS/REFS/RRPB  NRR/DRO/IRAB/BC  NRR/DRO/D   NRR/DRO/IRAB/BC
NAME    NFields         PFinney          RFelts      PFinney
DATE    02/28/2024      02/29/2024       03/19/2024  03/19/2024

ROP Data Strategy Evaluation Report and Strategic Recommendations

Executive Summary

The ROP data strategy working group evaluated the current status of ROP data and found at a high level that, in terms of datasets, data tools, and overall data processes, ROP data is in a very respectable state. This is especially noteworthy, as this evaluation is the first holistic review of ROP data that has been performed by staff. Recent and ongoing efforts by staff over the past several years to enhance ROP data analysis capabilities using dashboards and to improve overall ROP data infrastructure have paid significant dividends. This is true both in terms of moving ROP data processes towards the consensus best practices and idealized dataflow as described in this report, and in terms of improving the overall quality, accessibility, and accuracy of ROP data. The working group also found that individual NRC staff members involved in ROP data processes are knowledgeable, responsible, and conscientious, and are actively working on improving the data processes in which they are involved.

In support of this holistic review of ROP data, the working group has defined key data terminology that had not previously been clearly defined at the agency, so that a common language and understanding can be used to discuss ROP data. These terms are dataset, individual data owner, and data tool; their definitions are included in this report. The working group also agreed on six Best Practices for ROP Data and on an Idealized Dataflow for ROP Data. See those respective sections of this report for details.

While the working group is not recommending any specific future actions as a result of these best practices or the consensus idealized dataflow, the working group believes that documenting these practices and their underlying intent will support a common understanding of ROP data processes and the accomplishment of any future strategic goals for those processes. In addition, this common understanding of data definitions, data best practices, and dataflow may be generalizable and useful to other groups.

As a result of this evaluation, the working group agreed on five specific strategic ROP data recommendations, along with an initial implementation plan for each. For easy reference, each of these recommendations is briefly summarized below. For more details, see the section Specific Strategic ROP Data Recommendations of this report.

Recommendation 1: Delete seven specific ROP data tools that are redundant to other existing data tools.

Recommendation 2: Increase accessibility of internal ROP data tools by adding links to the Operating Experience Hub and ROP Digital City.

Recommendation 3: Create a centralized list of ROP datasets and data tools, including the owners.

Recommendation 4: Incorporate an awareness of ROP datasets and data tools into the ROP qualification programs.

Recommendation 5: Add ROP data duties and skills to relevant regional and DRO position descriptions in a high-level and consistent way.

Table of Contents

Executive Summary
Table of Contents
Background
Working Group Members and Observers
    Working Group
    Observers
Evaluation
    ROP Datasets
        Data Storage and Format
        Dataset Information Sensitivity
        Data Limitations and Quality Issues
        Dataset Accessibility
        Dataset Ownership
    ROP Data Ownership and Responsibilities
        Positions with ROP Data Responsibilities
        Ownership of Newly-Created Datasets and Data Tools
        Staff Knowledge, Training, and Access
        Staff Position Descriptions
    ROP Data Tools
        Data Tool Description, Function, Complexity, and Platform
        Data Tool Ownership
        Internal Data Tool Accessibility
        External Data Tool Accessibility
        External Data Tool Usability
        Feasibility of Creating New External Data Tools
        Data Tool Maintenance and Data Updating
        Data Tool Gaps and Redundancy
Recommendations
    Best Practices for ROP Data
    Idealized Dataflow for ROP Data
    Specific Strategic ROP Data Recommendations
Conclusions


Background

As described in the charter for this working group (ML23216A132), the ROP has been a data-driven, risk-informed oversight program since its inception. Throughout the history of the ROP, NRC staff have continued to make ongoing improvements to ROP data, particularly in the areas of data accessibility and data trending.

On March 10, 2023, the Commission directed the staff in SRM-SECY-22-0086 (ML23069A093) to continue to enhance the Operating Reactor Analytics website's usability and the accessibility of Reactor Oversight Process information in a manner that allows a reasonably informed member of the public to easily locate and interpret Reactor Oversight Process Action Matrix data and supporting information. In response, this working group has holistically reviewed the entire ROP data lifecycle to support the agency's data goals for the ROP.

This working group, which was formed in August 2023, reviewed the current status of ROP data, focusing on four strategic data areas, while also specifically considering the guidance for each strategic data area as laid out in the charter.

Working Group Members and Observers

Working Group

Nicole Fields, Team Lead, formerly the ROP Self-Assessment Lead in NRR/DRO/IRAB
David Aird, Significance Determination Process Lead, NRR/DRO/IRAB
Ron Cureton, Performance Indicator Lead, NRR/DRO/IRAB
Dan Merzke, Assessment Lead, NRR/DRO/IRAB
Greg Stock, NRR/DRO/IRAB
Jason Carneal, NRR/DRO/IOEB
Manuel Crespo, RPS-Inspections Product Owner, NRR/DRO/IRIB
Madeleine Arel, Inspection Manual Coordinator and RPS IP Management Lead, NRR/DRO/IRIB

Observers

Linh Tran and Juan Arellano, DANU
Boyan Ignatov, OCIO
Austin Chandler, Department of Energy detailee to IRAB

Evaluation

The director of the Division of Reactor Oversight (DRO) tasked the working group to review the following four strategic areas for ROP data: 1) ROP data sources, storage, quality, and accessibility; 2) ROP data ownership/stewardship and responsibilities; 3) internally-facing ROP data visualization and analytical tools; and 4) externally-facing ROP data visualization and analytical tools, each as described in the charter. That evaluation follows.

ROP Datasets

This section describes the working group's review of the current sources of ROP data. For the purposes of this evaluation, the working group has defined a dataset as a series of closely-related data that are stored together, processed together, and are used for a particular purpose. As the use of agency data evolves, it is possible that what were once considered several independent datasets may effectively be combined into a single dataset. It is also possible to have reasonable differences of opinion on what constitutes sufficiently closely-related data to form a single dataset versus data being categorized as multiple but related datasets. With this in mind, as the working group identified and evaluated ROP datasets, it paid attention to ensuring that the majority of routinely-used ROP data was captured in the list of datasets, and that the datasets as identified did not have significant overlap with each other.

The working group identified a total of 39 ROP-related datasets. For each dataset, the working group reviewed the storage location of the dataset, the format of the dataset, whether the data in the dataset are non-sensitive, and any particular limitations of or quality issues with the dataset. The working group also attempted to identify a specific individual owner and an organizational owner for each dataset. The working group reviewed the accessibility of each dataset for data entry, data review, and data corrections. In addition, the working group categorized each of the identified datasets as being only an ROP dataset (for example, Action Matrix data) or being mostly or partially an ROP dataset (for example, inspection manual feedback forms and public meeting data).

Data Storage and Format

The working group found that approximately half of the identified ROP datasets are exclusively stored in authoritative agency databases (including RPS-Inspections, RPS-ROP, RPS-Oversight, CACS, and PMNS). The other half are at least partially stored outside of an authoritative agency database, such as on SharePoint sites, on the OneDrive of individual staff members, on the shared office G drive, on Nuclepedia, or in one case, in an Access database.

Correspondingly, those datasets are at least partially stored in formats such as SharePoint lists, Excel spreadsheets, Word files, text files, and PDF files. Word and PDF files are generally not considered machine-readable formats. Only two datasets were identified that were not in a machine-readable format: the pre-decisional monthly Assessment at a Glance, and the ROP historical performance data, which is only available in PDF.

Dataset Information Sensitivity

The working group found that approximately two thirds of the identified ROP datasets are generally non-sensitive, and therefore can be publicly released. However, the staff may still need to consider specific limitations, such as the timing of making data public (for example, only once an inspection activity has been completed and documented) and not releasing security-related information (for example, the details of security findings). Approximately one third of the identified ROP datasets are generally sensitive, due to being proprietary, pre-decisional, allegation-related, or solely for internal agency use, such as staffing and personnel data. In some of these cases, roll-up data can still be publicly released, if specific details are not provided.

Data Limitations and Quality Issues

The working group identified specific limitations for approximately two thirds of the ROP datasets, and quality issues for approximately half of the datasets. The relationship between a dataset limitation and a dataset quality issue warrants a brief note. A dataset limitation is something that limits the ability to use that dataset in some desired analysis. Typically, any analysis performed with such datasets would either need to overcome those limitations or would need to include a discussion of the dataset limitations, which in turn could limit the analytical conclusions. The staff may be able to address dataset limitations, such as by incorporating a significant amount of legacy data into a dataset, but at the cost of significant staff effort.

Dataset quality issues could be relatively isolated (such as a few missing or incorrect Accession Numbers, or a few inspection reports that were not closed out when they should have been) and affect only a limited number of records in the dataset. Isolated quality issues also mean that quality standards are generally being met, and that there is not a significant dataset limitation. However, significant or widespread quality issues in a dataset may lead to a significant limitation of that dataset (such as a field that is null for hundreds or thousands of records, or specific data fields that are very inconsistent or inaccurate). Also note that data quality standards may evolve over time, as datasets have new data fields added, as datasets are combined or linked together, and as programmatic changes are made that affect those datasets.

Specific dataset limitations identified included data only being available after a specific year, incompleteness of datasets, widespread data inconsistencies or errors which require extensive manual processing to address, limitations on connections/consistency between different datasets, changes in data formatting or dataset content over time, large size of datasets, and limitations in adding or correcting legacy data. Dataset quality issues identified included inaccuracies or inconsistencies due to duplication or data transfer/import, inaccuracies due to data needing to be manually updated, and errors or incompleteness of data records, particularly those records associated with legacy data.
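To make the distinction between isolated quality issues and dataset-wide limitations concrete, the following minimal sketch (in Python, using pandas) illustrates the kind of quality scan described above. The file name and column names are hypothetical and do not correspond to actual RPS fields.

```python
# Illustrative sketch only; the file and column names are assumptions.
import pandas as pd

records = pd.read_csv("rop_dataset_export.csv")  # hypothetical export file

# Isolated quality issue: a few records missing an ADAMS accession number.
missing_accession = records[records["accession_number"].isna()]
print(f"{len(missing_accession)} records with no accession number")

# Potential dataset limitation: a field that is null for a large share of records.
null_share = records.isna().mean()
print("Fields null for more than 25 percent of records (possible limitation):")
print(null_share[null_share > 0.25])

# Inconsistency introduced by duplication or data transfer/import.
duplicates = records[records.duplicated(subset=["accession_number"], keep=False)]
print(f"{len(duplicates)} records sharing an accession number")
```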

Dataset Accessibility

The working group looked at the accessibility of each dataset for data entry, data review, and data corrections, focusing in particular on accessibility for knowledgeable staff to review the data. Note that the term accessibility, when related to data and websites, often refers to compliance with Section 508 of the Rehabilitation Act. This is obviously an important consideration, specifically for publicly available datasets and data tools; however, the working group in this context did not specifically look at Section 508 compliance, although there may be some overlap with what the working group assessed. The NRC is committed to making every possible effort to ensure that all information on its website is accessible (see https://www.nrc.gov/site-help/access.html); Section 508 compliance at the NRC is generally under the purview of the Office of the Chief Information Officer (OCIO).

Approximately a quarter of the datasets are, at least in part, available for review through the NRC Data Warehouse (DW), which is a central data repository managed by OCIO. In general, datasets in the DW originate from an authoritative agency database and are accessible to be used in subsequent data visualizations/data analytics, often colloquially referred to as dashboards. Approximately another quarter of the datasets are also widely available for review, as the data are shared through some other type of data tool, which could be, for example, a dashboard, a shared spreadsheet, or RRPS Reports. The remaining half of the datasets have anywhere from limited to extremely limited review access, although a handful of cases were noted where development of a data tool for review of that dataset is in progress.

In general, for data entry and data corrections, controls were applied either through user roles for a database or through edit/version controls applied to the dataset using password or other restrictions. The working group did not identify any instances in which editing controls were not being reasonably applied to the datasets. However, in some cases, in order to provide appropriate editing controls, viewing ability may be unnecessarily limited.

Dataset Ownership

The working group categorized each identified ROP-related dataset as to how much of the dataset was specifically ROP data. For 19 of the identified datasets, all of the data in the dataset was related to the ROP and to the currently-operating fleet of large light-water reactors. The working group determined that the remaining 20 datasets were mostly or partially ROP data and contained data that may belong to other business lines or organizations within the agency.

Seventeen of the ROP-only datasets and 13 of the remaining ROP-related datasets are owned or managed by a specific branch in DRO and have a specifically-identified individual data owner or point of contact (for those datasets that are accessible to but not created by NRC staff).

The two datasets that are exclusively ROP data but are not owned by a specific branch in DRO are worth mentioning here. The first is the Accident Sequence Precursor dataset, which is owned by the Office of Nuclear Regulatory Research (RES). As with the datasets owned by DRO, there is an identified individual data owner, and there is close coordination and communication between RES and DRO as related to these data. The second dataset contains the operating reactor units/sites and closely-related data, such as location, operational status, licensee data, reactor type data, etc. The working group was unable to determine where the agency's authoritative source for these data resides, or which organization within the agency is responsible for updating and maintaining this dataset. These data are often used across the agency in many contexts, for example, in licensing, public affairs, oversight, and emergency response functions, and this may contribute to why there is not necessarily a single, clear organizational owner for this dataset.

For the remaining seven ROP-related datasets, not owned by a specific branch in DRO, the working group attempted to determine the organizational owner, and whether there was a specifically-identified individual owner. While the working group was generally able to determine a responsible organization (or organizations), determining if there was an individual owner, and if so who, was challenging for these datasets. Coordination between DRO and these other organizations will be important in ensuring that the ROP portions of these datasets remain accessible and accurate.

ROP Data Ownership and Responsibilities

This section describes the working group's review of the current structure of ROP data ownership and ROP data responsibilities. For the purposes of this evaluation, the working group has defined an individual data owner as a designated, knowledgeable member of the technical staff who is cognizant of their designated program dataset and/or data tool and associated data processes, and supports the use, accessibility, improvement, and correction of their dataset and/or data tool. More specific details about Dataset Ownership and Data Tool Ownership are included in the respective sections of this report. The working group was not able to find any formal agency definitions for data owner or data steward, and the use of data owner in the context of this working group was focused on the program aspects of the dataset or data tool, rather than aspects related to the supporting infrastructure. Particularly for data and data processes that use authoritative agency databases, centralized agency IT resources, and widely-available data platforms, the infrastructure maintenance and controls are under the purview of OCIO, rather than the technical or program organizations.

Positions with ROP Data Responsibilities

The working group reviewed the positions and organizations that have ROP-specific data responsibilities. Generally, the regions are responsible for the majority of ROP data entry and data corrections. The working group noted that across the regions the specific positions with ROP data responsibilities vary. In some cases, data entry is being done by resident and regional inspectors, in some cases by administrative staff located either at the sites or in the region, and in some cases by regional project engineers either in the reactor projects branches or in the technical support groups. Although this inconsistency is not necessarily an issue, it did make it somewhat challenging for the working group to generalize across the regions.

Generally, DRO staff are responsible for ROP data analysis and are designated as data owners for ROP datasets and data tools. As part of their responsibility as dataset and data tool owners, DRO staff are typically responsible for general maintenance of any data visualizations (often referred to as dashboards) and any datasets that they own. OCIO, as mentioned above, is responsible for the authoritative agency databases, centralized agency IT resources, and widely-available data platforms. When widespread ROP data issues arise, data maintenance and correction is typically a partnership between the regions, DRO, and OCIO.

Ownership of Newly-Created Datasets and Data Tools

The creation of ROP datasets and data visualizations is worth discussing here in some detail.

The creation of ROP datasets happens somewhat infrequently and has historically been done by DRO staff when there is a particular need or request, either internally or externally. If similar ROP data are needed from across the regions, DRO, as the program office, typically takes on the role of collating those data and creating a dataset. There may be cases in which each region is independently keeping similar region-specific ROP data, and these are opportunities for DRO to work with the regions to create a single, cohesive agency-wide dataset that is available to all the regions.

On the other hand, as the desire to use data to support agency decision-making and the desire to perform data analytics across the agency increase, new data tools are increasingly being created. For ROP data tools specifically, these may be created by DRO staff, by regional staff, by EMBARK staff, by OCIO staff, by NRC IT contractors, or by some combination thereof. One pitfall the working group identified in this flexible model for the creation of ROP data tools is that there is not always a clear or sustainable path to data tool ownership and dedicated maintenance resources after the initial creation of the tool. This is discussed further under the Data Tool Ownership and Data Tool Maintenance and Data Updating sections of this report.

Staff Knowledge, Training, and Access

The working group reviewed whether those staff in regional and DRO positions with ROP data responsibilities have the appropriate knowledge, skills, training, and access to perform their data functions. As OCIO, EMBARK, and the IT contractors that support them are responsible for much more than just ROP data systems and data tools, the working group chose to limit its review of staff positions to those positions only or primarily responsible for ROP data.

Generally, the working group concluded that the staff in positions with ROP data responsibilities have the appropriate knowledge and skills to perform those functions. In particular, many DRO staff were hired into their positions, in part, based on their extensive experience with data analysis and tools. The working group identified only one notable skills gap: DRO staff did not have the ability to update or maintain the Operating Reactor Analytics application, although the data for this application could still be updated. This was a somewhat unique case, in that the beta (i.e., trial) application was a custom CSS and JS (Cascading Style Sheets and JavaScript) application built by a staff member in EMBARK with specialized application-building skills, who subsequently left the agency in mid-2022. One staff member in DRO has recently obtained the requisite training in JavaScript development to fill this skills gap.

Both informal and formal training have been made available to staff on various topics such as Power BI, SharePoint, Azure, Python, and RPS, and staff have taken those trainings as needed.

Five current DRO employees have completed advanced external training in data analytics, and they are all currently assigned to IOEB. When RPS undergoes significant upgrades, DRO typically provides specific user training or guidance communications to the regional staff. Also, several DRO staff have recently gone on rotations to other agency groups, such as OCIO, NMSS, and EMBARK, to further hone their agency-specific data skills.

When reviewing training for staff in positions with ROP data responsibilities, the working group did note that none of the relevant qualification programs included a formal component with a general discussion of ROP datasets and data tools at large. In some cases, there were references to specific sets of ROP data, such as performance indicator data or specific operating experience data, but not a general, high-level discussion of ROP data. The working group reviewed the following qualification program documents as part of this review: ADM-504, Appendix G and Appendix H, dated August 23, 2023 (non-public); IMC 1245, Qualification Program for Reactor Inspectors (ML23129A847); and IMC 1245, Appendices A, B, and C1 (ML23030B602, ML23037A835, and ML23030A607, respectively), which were last revised in mid-2023. For further discussion on qualification programs, see also Recommendation 4 in section Specific Strategic ROP Data Recommendations.

The working group also reviewed the process for getting staff with data responsibilities access to edit and view the datasets, data systems, and data tools they need. Although the process can vary somewhat based on the specifics, typically both view and edit access is controlled by some combination of OCIO staff and the dataset and data tool owners. A few examples of data access that staff may need to perform their data responsibilities include various levels of edit and view access for the RPS-Inspections module, access to the RPS-ROP module, read access to the NRC DW, credentials to utilize the Power BI gateways, and read or edit access to SharePoint lists. In general, current staff have the appropriate access to perform their designated data functions, and new staff can get that access, as needed.

Staff Position Descriptions

The working group requested and reviewed 12 distinct position descriptions for regional and DRO positions with ROP data responsibilities. This included five positions in DRO, which covers the DRO positions held by working group members, and seven regional positions, which included resident inspector positions, project engineer positions, and administrative assistant positions.

In general, the position descriptions did an inadequate job of describing the data skills and responsibilities of the positions reviewed. In one case, there was no mention whatsoever of data, and in several other cases data was mentioned only in the context of time and labor data, the general ability to use database management, or the use of data in decision-making. There was only one instance in which the working group believes that the position description, despite being dated in 2012, generally captured the relevant data responsibilities of the position.

Keeping in mind that each position description can be applied in some cases to dozens or more staff members across the agency, it is understandable that these position descriptions are written in a general way. However, the working group believes that it is important that the position descriptions accurately reflect the data skills and responsibilities of those positions, especially as ROP datasets and data tools become an even more important part of staff work.

Position descriptions that capture the full scope of data skills and responsibilities can help to ensure that staff have a clear expectation of their responsibilities and maintain the appropriate knowledge and skills to perform those functions. For further discussion on data responsibilities in position descriptions, see also Recommendation 5 in section Specific Strategic ROP Data Recommendations.

ROP Data Tools

This section describes the working group's review of the current suite of ROP data visualization and analytical tools, both internally-facing and externally-facing. For the purposes of this evaluation, the working group has defined a data tool as any tool that takes data or information and makes them easier for the user to understand, analyze, or access. These data tools can range from something as simple as a static list of documents to a sortable table to a database query or reporting tool to a fully-fledged interactive dashboard. In most cases, the data tool is technically separated from the dataset itself; however, the working group identified that for some datasets stored, for example, in Excel spreadsheets or in SharePoint lists, the data tool function can be an integrated part of the dataset storage. In the few cases in which there was both an external and an internal version of a similar data tool or a planned external version of an internal tool, the working group evaluated those together as a single data tool.

The working group identified a total of 55 ROP-related data tools. For each data tool, the working group reviewed the location of the data tool; the location of any underlying data-tool files; whether the tool was internally-facing or externally-facing or both; and the platform, the complexity, and the function of the tool. The working group also attempted to identify a specific individual as well as an organizational owner for each data tool.

The working group reviewed the accessibility of each data tool. For internal data tools, the working group reviewed whether the tools are widely available to staff, and whether the tools are linked within the Operating Experience Hub or within ROP Digital City. Both the Operating Experience Hub and ROP Digital City are internal SharePoint sites commonly used by ROP staff to find useful resources. For external data tools, the working group reviewed whether the tools are easily found without using bookmarks, whether the tool has at least basic filtering, sorting, and searching capabilities, and whether the tool requires exact information (e.g., day, year, etc.) to return useful results. For external data tools, the working group also determined if the tool provides data in a machine-readable format, and whether the tool provides links to Agencywide Documents Access and Management System (ADAMS) documents, as applicable.

For internal data tools without an external version, the working group reviewed whether the tool or a closely-related version of the tool could be made publicly available. For all the identified data tools, the working group reviewed the amount of manual data processing performed by the staff in order to update the data and in order to update the capabilities and functionalities of the tool. The working group also identified which of the ROP datasets are associated with which of the ROP data tools to help identify any duplication or gaps in the available data tools.
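As an illustration of the attributes captured during this review, and of the kind of centralized list of datasets and tools with their owners contemplated in Recommendation 3, a minimal Python sketch follows. The field names and the example entry's dataset association are assumptions, not an actual agency schema.

```python
# Illustrative sketch only; field names and the example entry are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DataTool:
    name: str
    platform: str                        # e.g., "Power BI", "html + JSON", "Excel"
    facing: str                          # "internal", "external", or "both"
    complexity: str                      # e.g., "simple list", "dashboard"
    org_owner: Optional[str] = None      # e.g., "NRR/DRO/IRAB"
    individual_owner: Optional[str] = None
    datasets: list = field(default_factory=list)  # associated ROP datasets

inventory = [
    DataTool("Operating Reactor Scrams Dashboard", "Power BI", "both",
             "dashboard", org_owner="NRR/DRO",
             datasets=["operating reactor scram data"]),  # hypothetical association
]

# Flag tools with no identified individual owner (see Data Tool Ownership below).
unowned = [tool.name for tool in inventory if tool.individual_owner is None]
print("Tools without an identified individual owner:", unowned)
```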

Data Tool Description, Function, Complexity, and Platform

Of the 55 ROP-related data tools that the working group identified, 23 tools are internally-facing, 27 tools are externally-facing, 3 tools are internally-facing with plans for an external version, and 2 tools have an internal and an external version. In the case of the Operating Reactor Scrams Dashboard, the internal version integrates data from several data sources and displays more information than the external version. In the case of the Inspection Manual Search Tool, since the externally-facing tool does not currently work as desired, an internal tool was created to temporarily restore this search capability for NRC staff. The working group also found that the primary function of externally-facing data tools was to increase transparency, and that the primary function of most internally-facing data tools was to support staff decision-making.

Internally-facing data tools also supported error checking data, tracking various metrics, and improving internal accountability. In many cases, the NRC staff, in addition to being users of internally-facing data tools, also make extensive use of externally-facing tools, along with interested members of the public.

The working group categorized approximately half of the data tools it evaluated as complex tools. Seventeen of these complex tools are data visualizations built in Power BI. Most of the Power BI tools function as complete dashboards, but a few are used primarily as search tools or to display very large lists of data. Five of these complex tools are data visualizations built in Tableau, all of which function as complete dashboards. One of these complex tools is the custom application Operating Reactor Analytics, written in JavaScript (JS) and Cascading Style Sheets (CSS), which also functions as a complete dashboard. Two of these complex tools are database reporting tools that are built into the associated authoritative agency database (RPS and CACS, respectively). Typically, database reporting tools require significant in-depth knowledge and effort on the part of the user to use the tool well and efficiently. The complex tools identified generally allow users to perform more complex data analyses and are used in large part by the staff to support decision-making. Approximately three quarters of these complex data tools are internal only, with no current plans by staff to create external versions.

See also the discussion on complexity in section Feasibility of Creating New External Data Tools.

Three externally-facing tools identified by the working group primarily display reactor location data in map form. These tools are somewhat unique among the data tools evaluated by the working group and ranged in complexity and utility from completely static and out-of-date images (see Map of Power Reactor Sites and NRC Maps of Power Reactors) to a zoomable, interactive map of Operating Nuclear Power Reactors.

The remaining half of the identified data tools are significantly less complex than dashboards; they are either lists, tables, basic search tools, charts, graphs, or a combination thereof. The vast majority of these data tools are externally-facing. For the five internally-facing tools in this category, those tools all took advantage of the built-in capabilities of either Excel or SharePoint (Lists, Search, or Libraries), to provide the data functionality needed by staff.

Several of the less complex externally-facing tools evaluated by the working group are each actually a collection of data tables, typically in Excel spreadsheet format (see NRC Datasets, NRC Datasets at Data.gov, and NRC's High-Value Datasets). The purpose of these data tools is to increase transparency. The working group found some significant overlap between these collections of data tables, and there is a high potential for combining related data tables and removing redundant information. For further discussion on consolidation of data tools, see also Best Practice 6 in section Best Practices for ROP Data. The working group also noted that although Excel spreadsheets can be set up with built-in filtering and sorting capabilities, only some of these spreadsheets did so without needing additional data manipulation by the user. Additionally, some of these data tables have companion data dictionaries; ideally, for publicly-available data tables, a reasonably informed user should not need a data dictionary if the data are well formatted and the fields are clearly labeled. The working group also identified that for the NRC Datasets at Data.gov, the information on when the data were last updated was not accurate. For further discussion on accurate last updated dates, see also Best Practice 3 in section Best Practices for ROP Data.

Four of the less complex externally-facing tools evaluated by the working group have a basic search component. (See Event Notification Search, Inspection Report Search, Licensee Event Report Search, and Public Meeting Schedule Search.) The first three of these tools are search functions that are contracted to Idaho National Laboratory (INL). For each of these tools, the search component is built using a combination of html with hardcoded search parameters and JavaScript, and the results displayed are hardcoded in html. Although these tools have a search function, the search results are not filterable or sortable by users and may not be displayed in a logical order. For the three tools built by INL, although it appears that users should be able to export the search results to Excel, which would be machine-readable, the working group was not able to successfully use that functionality. The public meeting schedule tool does not have an obvious option to export machine-readable data. These data tools are good candidates for improvements in usability. The staff is currently planning to develop an externally-facing Power BI dashboard for Event Notification data, in an effort to improve the public usability of those data.

Six of the less complex externally-facing tools are a simple list or table or a closely-related series of simple lists or tables where the data are hardcoded in html. (See Power Reactor Status Reports, Part 21 Reports, Morning Reports, Preliminary Notification Reports, Inspection Manual, and List of Power Reactor Units.) For these tools, as the data are hardcoded in html, users do not have the capability to filter, sort, or search the data, and the data are also not available in a machine-readable format. Additionally, to update these data, webpage updates need to be manually performed by staff. These simple data tools are good candidates either for improvements in the usability of the data and in minimizing the amount of staff effort needed to update the data or for deletion if the data are no longer useful. The staff is currently planning to develop externally-facing Power BI dashboards to display Power Reactor Status and Part 21 Report data, in an effort to improve the public usability of those data.

Finally, there are the eight externally-facing tools that are built using a combination of html and JSON data files to create lists, tables, and color-coded charts and graphs. (See Inspection Reports for Operating Power Reactors, Assessment Letters, Inspection Findings (Plant Issues Matrix), Performance Indicators, Plant Summaries, Cross Cutting Issues, Action Matrix, and Historical Performance since 2018.) For these tools, the JSON data files are updated periodically using the ROP module of RPS, and the html webpages read the appropriate JSON data file and display the data. Similarly to the tools which use hardcoded html data, users do not currently have the capability to filter, sort, or search the data. Additionally, although JSON is a machine-readable data format and the underlying JSON files are publicly available from the NRC public website, even some members of the working group were not aware of how to access these files, and no links or directions for access are provided to guide external users.

Also, for these tools, the displayed page last reviewed date shows the last update of the html, and not the last update of the underlying JSON, such that users do not have an accurate understanding of when the data were last updated. These data tools are good candidates for improvement in the usability of the data and in minimizing the amount of staff effort needed to update the data. The staff is currently planning to develop an externally-facing Power BI dashboard to display Inspection Report and Assessment Letter data, in an effort to improve the public usability of those data.
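As a hedged illustration of one way to address the last-updated issue described above, the following sketch shows a data file that carries its own update stamp, which a webpage could display alongside the data. The file name and fields are hypothetical, not the actual RPS-generated JSON structure.

```python
# Illustrative sketch only; the file name and fields are assumptions.
import json
from datetime import datetime, timezone

rows = [
    {"plant": "Example Plant, Unit 1", "column": "Licensee Response", "asof": "2024-01-01"},
]
payload = {
    # Stamp the data themselves, so the webpage can display when the data
    # (and not just the html) were last updated.
    "data_last_updated": datetime.now(timezone.utc).isoformat(),
    "rows": rows,
}
with open("action_matrix.json", "w") as f:  # hypothetical output file
    json.dump(payload, f, indent=2)
```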

Data Tool Ownership

For each data tool, the working group attempted to identify an organizational owner as well as a specific individual owner. For approximately two thirds of the identified data tools, the working group was able to identify an organizational owner, and for those data tools, it was also generally possible to identify a specific individual owner. Approximately a quarter of the ROP-related data tools are owned by organizations outside of NRR; the working group believes this is reasonable, as these specific data tools also tend to be widely used across the agency in addition to having ROP-specific uses.

The working group did note that for larger organizations, such as NRR, the appropriate level for organizational ownership is likely at the branch or division level, versus the office level. For the three externally-facing data tools with NRR as the identified organizational owner, with no sub-organizational owner identified, the working group was unable to determine the individual data tool owners.

The working group also found that for five ROP data tools originally created by EMBARK (see also the discussion above in the section Ownership of Newly-Created Datasets and Data Tools), the current organizational ownership and individual ownership was not completely clear. These five tools included four internally-facing Tableau dashboards built as part of the Mission Analytics Portal (MAP) program, which builds internal data tools, and the externally-facing Operating Reactor Analytics application. Additionally, in a few of these cases, the owner as listed on the internal data tool is an NRC contractor and not an NRC staff member. For the remaining approximately 15 percent of the identified data tools, the working group was unable to conclusively determine an organizational or specific individual owner.

As previously mentioned in the section Ownership of Newly-Created Datasets and Data Tools, there is not always a clear or sustainable path to data tool ownership and dedicated maintenance resources after the initial creation of the tool. This gap may also occur during a transition between individual data owners if there is not a clear organizational owner of the data tool. Without clear ownership of data tools by knowledgeable data owners, these tools become susceptible to inaccuracies, lack of maintenance, and eventually partial or full loss of utility. Ultimately, a clearly identified and knowledgeable individual and organizational owner, together with the appropriate resources dedicated to maintaining each data tool, helps to ensure that the tool maintains its utility. For further discussion on ownership of data tools, see also Best Practice 1 and Recommendation 3 in sections Best Practices for ROP Data and Specific Strategic ROP Data Recommendations.

Internal Data Tool Accessibility

For the internally-facing data tools, the working group reviewed whether the tools were widely available to staff, and whether they were linked within either the Operating Experience Hub or within ROP Digital City. The working group did not identify any instances in which internal ROP data tools were not widely available. If an internal data tool was associated with a particular database or application (e.g., RPS, CACS, or PMNS), access to those tools may likewise be limited to users of that application. The working group did find that approximately half of the internal ROP data tools are not currently linked within either the Operating Experience Hub or ROP Digital City, such that they were not easy to find if one did not already know the location of the tool. For further discussion on the accessibility of internal data tools, see also Recommendation 2 in section Specific Strategic ROP Data Recommendations.

External Data Tool Accessibility

For the current externally-facing data tools, the working group reviewed whether the tools were easily accessible for a member of the public. For approximately half of the tools, the working group identified that the tools were difficult to find using the navigational structure provided on nrc.gov, often requiring three, four, or more clicks to access the tool. Additionally, the working group found that the search function on nrc.gov did not always pull up the desired tool in the first few search results, and sometimes the search results returned historical content or seemingly unrelated information. Often, using a Google search of nrc.gov was more successful in finding the desired data tools. The staff has a general practice of bookmarking the external data tools they frequently use rather than navigating or searching the public website.

Using Google Analytics reports, the working group was also able to review webpage view data for selected public webpages. The OCIO web team can make these reports available to other NRC staff, upon request. The working group believes these data should be used to inform staff of which externally-facing data tools are seeing more traffic, and therefore could be good candidates for future improvements, and which externally-facing data tools are seeing very little traffic, and are therefore either not very accessible to users or not currently of interest to users. For example, between January 1, 2023, and October 25, 2023, the three most viewed webpages with the URL root of www.nrc.gov/reactors/operating were a Map of Power Reactor Sites, a List of Power Reactor Units, and Inspection Reports for Operating Power Reactors, each with more than 30,000 page views over that time period. On the other hand, Operating Reactor Analytics and Cross Cutting Issues each had fewer than 1,000 page views over that time period. The Google Analytics reports typically do not include bot/crawler web traffic, such that the page views do represent real human user activity. However, these reports are not able to distinguish or filter out webpage views from NRC staff, so pages that have high page views may not necessarily translate to a high level of public interest.
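For illustration only, a minimal sketch of how such page-view data might be used to rank tools follows. The CSV export format shown is an assumption; the actual reports are provided by the OCIO web team.

```python
# Illustrative sketch only; the export file and column names are assumptions.
import pandas as pd

views = pd.read_csv("ga_pageviews_export.csv")  # assumed columns: page_url, page_views
operating = views[views["page_url"].str.startswith("www.nrc.gov/reactors/operating")]
ranked = operating.sort_values("page_views", ascending=False)

print("Most-viewed pages (candidates for usability improvements):")
print(ranked.head(3))
print("Least-viewed pages (candidates for consolidation or retirement):")
print(ranked.tail(3))
```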

External Data Tool Usability

For the current externally-facing data tools, the working group reviewed whether the tool had at least basic filtering, sorting, and searching capabilities, and whether the tool required exact information (e.g., day, year, etc.) to return useful results. As previously described in section Data Tool Description, Function, Complexity, and Platform, the ability to filter, sort, and search the data can be inherently limited based on the choice of platform, such as by using hardcoded html or static images. The working group found that this was applicable to approximately two thirds of the external data tools. For six of the external data tools, the usability of the tool was further impeded by requiring the user to know exact information (in three cases down to the exact calendar date) to retrieve the desired data. This is again due to the choice of platform by using hardcoded html, and by needing a way to organize the data without giving the user basic filtering or sorting functionality.

In the course of this evaluation, the working group found a good example of a simple yet effective data tool that includes basic sorting and searching capabilities, although it is not one of the ROP external data tools. (See Standards Incorporated by Reference.) Similarly to the html and JSON ROP data tools, this tool and its capabilities are built in html, and the tool reads a separate data text file. This is a relatively recent NRC data tool, but the working group thinks that these types of basic sorting and searching capabilities should be the minimum standard of usability for a simple external list or table going forward.

For externally-facing data tools, the working group also determined if the tool provides data in a machine-readable format. Generally, the working group found that, except for the data tools that were essentially collections of data tables in Excel spreadsheet format, the vast majority of externally-facing data tools did not provide machine-readable data files. Even for the html and JSON tools, which are built using machine-readable JSON files, some members of the working group were not aware of how to access the underlying files, although those files are publicly available from the NRC public website. The working group also found that for public-facing Power BI dashboards, the ability to export data from the tools has been deliberately disabled due to security concerns. This limitation does not apply to internal Power BI dashboards. However, this does mean that for externally-facing Power BI dashboards, if the staff wish to make machine-readable data publicly available, they will have to do so separately from the dashboard itself.

Finally, for externally-facing data tools, the working group determined if the tool provides links to ADAMS documents, as applicable. ADAMS is the official agency recordkeeping system, so the working group believes it is important to direct users to the official agency records from the data tools which are provided as a user convenience. The working group found that approximately one third of the external data tools included links to ADAMS documents. In the case of the Operating Reactor Scrams Dashboard, links were instead provided to the relevant Event Notification Report on the public NRC website. For the three data tools hosted by INL, links are not provided to ADAMS documents, but instead links to individual data files hosted on the INL servers are provided. In the case of Morning Reports, the working group found that although ADAMS Accession Numbers were often provided, links to ADAMS were not. For five data tools, the working group found that there were not applicable ADAMS documents to which links could have been provided.

Feasibility of Creating New External Data Tools

The primary function of externally-facing data tools is to increase transparency and, as a result, in accordance with the agency's strategic goals in NUREG-1614, to inspire stakeholder confidence in the NRC. Over time, the ROP has made a significant amount of reactor oversight-related data available to the public. This is supported by the 27 externally-facing ROP data tools identified by the working group. The previous sections External Data Tool Accessibility and External Data Tool Usability have discussed the working group's evaluation of the accessibility and usability of the current externally-facing data tools. In addition, the staff currently has plans to create external versions of three of the currently internally-facing Power BI dashboards. The staff maintains its commitment to Openness as both one of the Principles of Good Regulation and one of the agency's Organizational Values.

The working group considered the following factors when evaluating whether the creation of a new external data tool is feasible based on the existing 23 internally-facing data tools. First, in order for an external tool to even be considered, the datasets displayed in that tool need to be generally non-sensitive; see also the previous discussion in section Dataset Information Sensitivity. The timing of making data publicly available is especially relevant for inspection and inspection finding related data, which can be pre-decisional; for large and frequently updated datasets, this can be particularly challenging. Many of the internal data tools incorporate at least some preliminary, draft, or pre-decisional information; some security-related information; or some staffing information; which would need to be reviewed and removed if an external version of the tool were to be created.

The working group also reviewed the complexity of these internal tools and found that 17 were built in Power BI or Tableau, and the vast majority of those were dashboards built to meet the complex data needs of the staff. Also included in this group of internal tools are the two complex database reporting tools that are built into internal agency systems. While the staff supports making ROP-related data available publicly, where feasible, the staff is also cognizant of ensuring that all publicly available data are straightforward to interpret and have the appropriate context for a reasonably informed member of the public, so as not to hinder Clarity, which is also one of the Principles of Good Regulation. Therefore, some care needs to be taken by staff when creating complex data tools for public consumption. One good example of a publicly available complex ROP-related data tool is the Accident Sequence Precursor Dashboard. Not only has the staff made this dashboard publicly available, but it has also carefully provided context and explanation for the overall program and the data as displayed in the dashboard.

The working group did not identify any additional internal data tools that are currently good candidates for external data tools. In general, if an internal tool is useful to the staff, contains primarily non-sensitive data, and the complexity of the tool can be well managed, the staff should consider making a publicly-available version of that tool. As the treatment of ROP data continues to evolve, it is possible there will be future opportunities for new external ROP data tools.

Data Tool Maintenance and Data Updating

For the identified ROP data tools, the working group reviewed the amount of staff effort needed to update the capabilities and functionalities of each tool and to update its data. Note that this discussion is related to but distinct from the level of knowledge and effort required on the part of users and any usability concerns with the data tools. Those discussions, where applicable, can be found in sections Data Tool Description, Function, Complexity, and Platform and External Data Tool Usability above.

Regardless of the overall level of staff effort needed to maintain and update a tool, it is essential to have dedicated data tool maintenance resources, especially after the initial creation of the data tool. Even a tool that requires very little maintenance will need a designated data tool owner to ensure accuracy and continued functionality (see also Data Tool Ownership).

The working group found that the amount of effort to maintain and update ROP data tools varied considerably. For very simple tools with a limited amount of data, data updates were typically completely manual, but these updates tended to be infrequent, and maintenance for the tool was generally minimal overall; however, the tools still need to be hosted, and their platforms need to be globally maintained.

For more customized and complicated tools, tools that display large amounts of data, or tools whose data need frequent updating, the level of staff effort could be much larger, especially if the data updating processes are still manual or require significant manual actions or reviews. It is these tools where a significant amount of staff effort can be alleviated. The working group found that the two primary ways to alleviate the amount of maintenance needed by staff were, first, to make the tools simple and straightforward while still meeting the data needs, and second, to automate the data updates. For further discussion on automation of data updates, see also Automated Refresh in section Idealized Dataflow for ROP Data.

Data Tool Gaps and Redundancy

The working group reviewed which ROP datasets are associated with which ROP data tools to help identify any gaps or redundancy in the available data tools. The working group found that all ROP datasets that are generally non-sensitive (see also the section Dataset Information Sensitivity) are associated with at least one data tool. In addition, three quarters of the ROP datasets that are generally sensitive also had at least one associated data tool. In one case, for the inspector qualification dataset, while there is not an existing data tool, the staff is creating a new data tool to view the dataset. In a few cases, the only associated data tools identified for a dataset were internal RRPS Reports, which are established and maintained database queries built into RPS. As described in section Data Tool Description, Function, Complexity, and Platform, some of the database reporting tools can require significant in-depth knowledge and effort on the part of the user, so consideration should be given to whether the existing database reports are meeting current user needs and whether it may be useful to create supplemental data tools. For further discussion on idealized data tools, see also Data Visualization in Power BI in section Idealized Dataflow for ROP Data. The working group did not identify any other gaps which would be good candidates for the creation of new ROP data tools.

The working group also reviewed the datasets associated with each data tool to identify any possible redundancy or duplication between tools. The working group found that a handful of ROP data tools were not specifically associated with any ROP datasets. Approximately a quarter of the ROP data tools combined data from multiple ROP datasets. For these data tools, if the same dataset is used in multiple tools, that does not necessarily mean that the tools are completely redundant; however, this may be an indicator that these tools should be more carefully examined by the data tool owners as candidates for either deletion or consolidation. For further discussion on consolidation of data tools, see Best Practice 6 in section Best Practices for ROP Data. The working group identified two such tools as redundant and as good candidates for deletion, including Operating Reactor Analytics. For further discussion on deletion of data tools, see Best Practice 3 and Recommendation 1 in sections Best Practices for ROP Data and Specific Strategic ROP Data Recommendations.

For data tools that were associated with only a single ROP dataset, the working group evaluated each grouping of tools, again to identify any possible redundancy or duplication. In these cases, it was often easier to determine if the tools were duplicative, or if they served sufficiently different purposes. See also the discussions in sections Data Tool Description, Function, Complexity, and Platform and Feasibility of Creating New External Data Tools about data tool purposes. The working group identified four such tools as redundant and as good candidates for deletion, including the Map of Power Reactor Sites and NRC Maps of Power Reactors. For further discussion on deletion of data tools, see Best Practice 3 and Recommendation 1 in sections Best Practices for ROP Data and Specific Strategic ROP Data Recommendations.

Recommendations

In the course of the working group's evaluation of ROP datasets, ROP data tools, and ROP data ownership above, the working group agreed that documenting some best data practices and a description of an idealized dataflow for ROP data would be a useful tool for future data work. Along with the working group's specific strategic ROP data recommendations, these best practices and idealized dataflow description are included in this section.

Best Practices for ROP Data

This section describes the working group's consensus on the general best practices for treating ROP data and for ROP data processes. The staff should strive to consistently implement these best practices for ROP data, wherever possible. These best practices for ROP data may also be generalizable and useful to other groups across the agency. The working group considers these best practices to be a current snapshot in time. As the data tools, platforms, and techniques that are available to NRC staff evolve and change, the specifics of the best practices may change; however, the underlying impetus behind each of the best practices will likely remain intact. A description of each of the six consensus best practices is listed below.

Best Practice 1: Each dataset and data tool should have a clearly identified and knowledgeable individual and organizational owner. Although in some cases contractors have supported building and maintaining ROP datasets or data tools, as previously discussed in sections Ownership of Newly-Created Datasets and Data Tools and Data Tool Ownership, it is still important that the ownership responsibility remain with the NRC staff. Having a designated individual owner helps to ensure that each dataset and data tool is being maintained, that the data are kept up-to-date, and that issues or improvements from users are being considered. Having a designated organizational owner helps to ensure smooth transitions of data responsibilities from one individual staff member to another and helps to ensure that dedicated resources are made available to maintain and update the datasets and data tools. In support of this best practice, the working group also recommends that, for data tools in particular, the individual owner and organizational owner be labeled on the tool itself. This helps to make clear to users whom to contact with issues or improvements and helps keep organizations accountable for designating individual data tool owners. For external data tools, this is also important, as it ensures that the public can contact a specific knowledgeable individual, rather than needing to use a generic Contact Us page, a shared email address, or an email resource box, which may result in a less direct and less transparent path to the appropriate staff member.

Best Practice 2: Implement data quality controls, with an emphasis on having widely available data visualizations. As described in the section Data Limitations and Quality Issues, significant or widespread quality issues in a dataset may in turn be a significant dataset limitation, so it is important to have robust data quality controls. The following are some best practice data quality controls:

1) Have knowledgeable staff perform peer checks or spot checks for data entry accuracy.
2) Implement data entry controls and data validation when possible. For example, this could include requiring a consistent date format, requiring a specific format for an Accession Number, or using a choice or a Boolean field as opposed to a free text field. Although data entry controls are typically more common for data stored in agency databases, they should also be considered for data stored in Excel or SharePoint lists. A sketch of such validation checks follows this list.
3) Ensure that visualizations of data are widely available internally to knowledgeable staff. These visualizations should include as wide a span of data as possible. By making aggregate data available that are filterable, sortable, and searchable, knowledgeable staff will be able to identify data-related issues and data errors more easily, so that they can then be corrected. The working group has often found that when building new data visualizations for the purpose of trending data, the first issues to be identified are data errors.
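As a concrete illustration of data entry control 2) above, the following is a minimal sketch, in Python, of record-level validation checks. The field names, the date format, and the Accession Number pattern shown here are illustrative assumptions, not actual RPS or SharePoint schema requirements.

    # Minimal sketch of record-level data validation (illustrative only).
    import re
    from datetime import datetime

    # Assumed pattern for an ADAMS Accession Number, e.g., ML24059A402.
    ACCESSION_PATTERN = re.compile(r"^ML\d{5}[A-Z]\d{3}$")

    def validate_record(record: dict) -> list[str]:
        """Return a list of validation errors for one hypothetical data record."""
        errors = []
        # Require a consistent date format (MM/DD/YYYY assumed here).
        try:
            datetime.strptime(record.get("issue_date", ""), "%m/%d/%Y")
        except ValueError:
            errors.append("issue_date must use the MM/DD/YYYY format")
        # Require a specific format for the Accession Number field.
        if not ACCESSION_PATTERN.match(record.get("accession_number", "")):
            errors.append("accession_number does not match the expected pattern")
        # Use a Boolean field rather than free text for yes/no data.
        if not isinstance(record.get("publicly_available"), bool):
            errors.append("publicly_available must be True or False")
        return errors

    # An empty list means the record passed all checks.
    print(validate_record({"issue_date": "03/19/2024",
                           "accession_number": "ML24059A402",
                           "publicly_available": True}))  # -> []

Checks such as these can also be run over an entire exported dataset, so that errors are caught in aggregate rather than one record at a time.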

Best Practice 3: Delete data tools that are no longer needed. If a tool is redundant, is no longer being used by anyone, is not being maintained, has data that are not being regularly updated, or has served a specific one-time purpose with documented results, that tool should be deleted. Data tools that are redundant (see also the discussion in section Data Tool Gaps and Redundancy) or are no longer being used still have maintenance costs but are not providing a commensurate benefit. Data tools with out-of-date data can be confusing to users and can also result in perceived data inconsistencies, even if there are not actually any inconsistencies. Individual data tool owners are likely in the best position to determine if their tools are still useful or are good candidates for deletion. Of note, since externally facing data tools are typically provided by the agency only as a convenience to users (i.e., ADAMS is the official agency recordkeeping system), any external data tool provided should be more useful to users than a combination of ADAMS and other widely available tools (e.g., Google search). In support of this best practice, the working group also recommends that accurate dates for the last data update, and the last update of the tool itself, if possible, be prominently displayed on the tool, so that users understand the limitations and accuracy of the data they are viewing.

Best Practice 4: Data should be stored in the appropriate authoritative agency database. Acknowledging that it is not always practical in the near term to store data in an authoritative agency database, the working group recommends the following additional best practices for any data stored outside of an authoritative agency database:

1) Data should at a minimum be stored on a group SharePoint site, such that the data storage is not reliant on a single staff member (e.g., using an individual's OneDrive) and such that there is built-in version control and history when using Excel or SharePoint lists, as opposed to using a shared drive (e.g., G drive). If there is not a more preferred SharePoint site to store the data in question, the Operating Reactor Analytics SharePoint site can be used as a general ROP data storage location.
2) There should still be only one data source for each dataset, and that data should be stored in a machine-readable format. If the same data are stored in multiple locations and/or there is no clear authoritative data source, this can create versioning issues, duplication issues, formatting issues, and potential data inaccuracies and inconsistencies. See also the previous discussion of the operating reactor units/sites dataset in section Dataset Ownership.
3) Consideration should be given to implementing additional data integrity controls, such as editing or formatting restrictions, although there is sometimes a balance between integrity controls and ensuring sufficient user access.

Best Practice 5: Maintain defense-in-depth for the ownership of datasets and data tools. It is good practice to ensure that dataset and data tool owners have clear directions and documentation for their data processes, including the location of all necessary data tool files, how to access the data sources, and labeling or commenting on any code or tools. This documentation should also cover any data integrity controls in place. If a desk guide exists for a given position, it is a best practice to include or reference relevant data process information as part of that documentation. All data-related files, cloud workflows, code, and directions should be available to backup staff and stored in a location that is easy to locate and shared with the owner's backup and branch chief, at a minimum. Supporting cloud workflows or applications (e.g., Microsoft Forms, PowerApps, or PowerAutomate) must be shared, transferred to a new owner, or transferred to a branch organizational account prior to the departure of any data owners, as these resources are deleted and not recoverable when a departing employee's IT account is disabled. When creating cloud resources, such as Power BI dashboards, best practice includes assigning owner rights to the owner's backup and branch chief, at a minimum. Any custom code supporting a deployed product should be well documented and should include specific requirements for the packages, drivers, or supporting software necessary to execute the code, as well as instructions on how to obtain and install any necessary non-standard workstation software.
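In support of this best practice, the following is a minimal sketch of a documented code header for a hypothetical deployed product; the file names, package versions, and personnel names are placeholders, not actual ROP resources.

    """Trend report generator for a hypothetical ROP dataset (illustrative).

    Requirements, documented with the code per Best Practice 5:
        Python 3.9+
        pandas (pip install pandas)
        openpyxl (required by pandas to read .xlsx files)

    Data source:
        findings.xlsx on the group SharePoint site; the path is shared
        with the owner's backup and branch chief.

    Owner: J. Doe (backup: A. Smith) -- placeholder names.
    """
    import pandas as pd

    def load_findings(path: str) -> pd.DataFrame:
        # Read the single authoritative data file; no local copies are kept.
        return pd.read_excel(path)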

Best Practice 6: Strive to consolidate and cross-reference relevant datasets and data tools. As part of this evaluation, the working group found that there were a large number of ROP datasets and data tools, and that this could be overwhelming both for dataset and data tool owners, particularly from a maintenance perspective, and for both internal and external data tool users. Combining related datasets, minimizing data duplication, and improving linkages between datasets can reduce the amount of maintenance needed and improve overall dataset quality. Adding onto or combining existing data tools, rather than creating separate standalone tools, can help mitigate issues with the creation of new data tools and their ownership and can minimize the amount of maintenance needed. See also the discussions in sections Ownership of Newly-Created Datasets and Data Tools, Data Tool Maintenance and Data Updating, and Data Tool Gaps and Redundancy. Consolidation can also make it easier for users to find the data tools that they need. In some cases, the working group found that an existing data tool had been modified slightly to meet the needs of additional groups of users. When data tool consolidation is not reasonable, including cross-references to other relevant tools and having a single one-stop shop for users to go to for data tools are also best practices. In support of this best practice, the working group recommends that a centralized list of ROP datasets and data tools be created and made widely available internally; for more details, see Recommendation 3 in section Specific Strategic ROP Data Recommendations.

Idealized Dataflow for ROP Data

This section describes the working group's consensus on an idealized dataflow for ROP data processes. Although this dataflow may not be feasible to fully implement for existing data processes, aspects of this dataflow should still be considered as ways to improve current ROP data processes. In addition, this idealized dataflow should be strongly considered for the implementation of future ROP data processes, wherever possible. The overarching considerations for this idealized dataflow may also be generalizable and useful to other groups across the agency. Similar to the best practices described above, the working group considers this idealized dataflow to be a current snapshot in time. Although some of the specific details may change in the future, the underlying reasoning behind the idealized dataflow will likely remain intact.

The idealized dataflow for ROP data is to store the underlying dataset in RPS, to provide read access to the relevant data through DW views, to perform data visualization in a Power BI dashboard where the data are automatically refreshed, and to have an external, publicly available version of the Power BI tool. A detailed breakdown of the idealized dataflow concepts follows, including a discussion of some key dataflow dependency considerations.

Dataset Storage in RPS: As described above in Best Practice 4 in section Best Practices for ROP Data, data should be stored in the appropriate authoritative agency database. For the vast majority of ROP-related data, RPS is the appropriate authoritative data source. Having the data stored in a single location limits the possibility of data duplication and inconsistencies. If data errors are discovered, or if data need to be updated, the authoritative data source should be updated accordingly. If multiple databases rely on the same data, it should be clear which is the authoritative source for that data. Any secondary sources should be automatically synced to the authoritative source, rather than manually copying the data or getting manual reports or bulk data transfers from the database, such that any changes made to the authoritative database appropriately propagate to all other needed data sources. As syncing may take some time and happen at some periodicity, it is also worth ensuring that users understand those time frames accurately, if possible.
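The following is a minimal sketch of an automated sync from an authoritative source to a secondary location; the file paths are hypothetical placeholders, and in practice such a job would run on a schedule (e.g., via Power Automate or a scheduled task) rather than by hand.

    # Minimal sketch of syncing a secondary data location from the
    # authoritative source instead of copying data manually (illustrative).
    import shutil
    from pathlib import Path

    AUTHORITATIVE = Path(r"\\example-share\rop\authoritative\findings.csv")  # placeholder
    SECONDARY = Path(r"\\example-share\rop\secondary\findings.csv")          # placeholder

    def sync_secondary() -> None:
        # Copy only when the authoritative file is newer, so changes to the
        # authoritative source propagate to the secondary location.
        if (not SECONDARY.exists()
                or AUTHORITATIVE.stat().st_mtime > SECONDARY.stat().st_mtime):
            shutil.copy2(AUTHORITATIVE, SECONDARY)

    sync_secondary()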

Data Accessibility through the Data Warehouse: Data from the appropriate authoritative agency databases should be available for staff to view and access through the NRC DW views for use in analysis and visualizations. The DW was discussed previously in more detail in the section Dataset Accessibility. Having the data available through the DW helps to ensure that the loads on the transactional databases (e.g., RPS) are well managed. As data in the DW are read-only to standard users, this also limits the possibility of inadvertently editing the underlying data sources. Additionally, by using the DW views, any manipulations, cuts, cleaning, or basic calculations being performed on the data are consistently and reliably available to staff who need to use those data for analysis or visualization. Using a processed DW view for a visualization is preferable to using the unprocessed tables from a transactional database (e.g., RPS), even if those tables are in the DW; using the unprocessed tables can vastly increase the complexity of accessing the desired data, of any subsequent downstream data manipulations, and of data tool maintenance and upgrades. Generally, if access to data is desired to support a data tool, those data should also be added to the DW, if they are not already available there.
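As an illustration, the following is a minimal sketch of reading from a processed DW view rather than from the unprocessed transactional tables; the server, database, and view names are hypothetical placeholders, not actual NRC DW objects.

    # Minimal sketch of read-only access to a processed DW view (illustrative).
    import pandas as pd
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=dw.example.gov;DATABASE=DW;Trusted_Connection=yes;"  # placeholders
    )
    # The view already encapsulates the cleaning and basic calculations,
    # so every consumer sees the same consistently processed data.
    findings = pd.read_sql("SELECT * FROM rop.vw_InspectionFindings", conn)
    conn.close()

Because the view is read-only to standard users, an analysis or visualization built this way cannot inadvertently modify the underlying transactional data.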

Data Visualization in Power BI: The working group agrees that data visualization in Power BI is the generally preferred data tool option. Data visualizations tend to be intuitive to use and can provide users many options to view, filter, and trend the same set or related sets of data. In a single tool, individual users can filter down to their data of interest, so a single data tool can serve multiple groups or purposes with minimal changes or improvements. Additionally, having a single data tool requires less maintenance than maintaining multiple data tools. As discussed in section Data Tool Description, Function, Complexity, and Platform, there are three currently in-use options for complex data visualizations for ROP data: Power BI, Tableau, and a custom JS/CSS dashboard. Custom code typically requires significantly more maintenance and relies heavily on specialized skills, so the working group does not generally recommend this option. See also the discussion in section Staff Knowledge, Training, and Access. Both Tableau and Power BI are commercially available products for dashboard creation, and there are pros and cons to both. The working group recommends the use of Power BI, first, because Power BI is relatively easy to use for the creation and maintenance of data tools, and second, because Power BI is well integrated into the Microsoft ecosystem, which the agency has already adopted. There are also cost considerations to licensing Tableau for both dashboard creation and dashboard viewing.

Automated Refresh: As discussed in section Data Tool Maintenance and Data Updating, one of the drivers of staff effort while updating more complex data tools can be extensive manual data processing. By using a Power BI visualization, with data available through the DW, it is possible and recommended to set up a completely automated periodic data refresh process. The working group agrees that this is the ideal data process for internal ROP data tools. Although to an untrained user the data tool results may not look significantly different whether the data updates are manual or automatic, an automatic data refresh gives users the most up-to-date data with minimal delay while also minimizing the ongoing effort needed by the data tool owner simply to update the data displayed. For dataflows that differ from this idealized dataflow for ROP data, other methods to automate data updating should also be considered.
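Beyond the scheduled refresh that can be configured directly in the Power BI service, a refresh can also be triggered programmatically through the Power BI REST API, as in the minimal sketch below. The workspace and dataset identifiers and the token acquisition are placeholders, and for many tools the built-in scheduled refresh alone may suffice.

    # Minimal sketch of triggering a Power BI dataset refresh via the
    # Power BI REST API (illustrative; IDs and token are placeholders).
    import requests

    WORKSPACE_ID = "<workspace-guid>"   # placeholder
    DATASET_ID = "<dataset-guid>"       # placeholder
    TOKEN = "<azure-ad-access-token>"   # placeholder; obtained via Azure AD in practice

    url = (f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
           f"/datasets/{DATASET_ID}/refreshes")
    response = requests.post(url, headers={"Authorization": f"Bearer {TOKEN}"})
    response.raise_for_status()  # HTTP 202 indicates the refresh was queued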

Publicly Available Data Tools: Finally, the idealized dataflow for ROP data concludes with a high quality, publicly available data tool for non-sensitive datasets. See also the discussion in section Dataset Information Sensitivity. The working group agrees that in cases in which there is an internal Power BI tool, there are advantages to having the external tool also be Power BI based, as this minimizes staff effort in maintaining multiple tools on different platforms. For simpler externally facing data tools, building a new Power BI data visualization may be unnecessarily complex. However, the external tools should still be, at a minimum, filterable, sortable, and searchable, and they should be intuitive for a reasonably informed user, as discussed in section External Data Tool Usability. In addition, external data tools should provide clearly labeled machine-readable data files, such that a reasonably informed member of the public could perform their own data analysis, if desired. The external data tools should also provide links to official agency records in ADAMS, as applicable.
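As an illustration of the machine-readable file recommendation, the following is a minimal sketch of exporting a clearly labeled public data file; the column names and values are hypothetical examples, not an actual ROP dataset.

    # Minimal sketch of exporting a clearly labeled, machine-readable
    # public data file (illustrative; columns and values are hypothetical).
    import pandas as pd

    data = pd.DataFrame({
        "plant_name": ["Example Station, Unit 1"],
        "finding_color": ["Green"],
        "issue_date": ["2024-03-19"],  # ISO 8601 dates ease reuse
        "adams_accession_number": ["ML24059A402"],
    })
    # A descriptive file name and clear column headers allow a reasonably
    # informed member of the public to perform their own analysis.
    data.to_csv("rop_inspection_findings_public.csv", index=False)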

Key Dataflow Dependencies: When following the idealized dataflow for ROP data, as laid out above, there are some important dataflow dependencies to consider. In particular, the working group found that modifications to authoritative data sources (e.g., RPS, CACS, etc.) should be coordinated not only with the dataset owners but also with the DW, such that any modifications are adequately communicated and any needed changes to the DW data cleaning and processing can be made. This is especially important if the proposed modification to the data source removes any data fields. Subsequently, if there are changes to the DW views, those changes should be coordinated with the owners of the data tools that use those views. Note that if the dataflow differs from the idealized dataflow for ROP data, there may be different dependencies instead, and those should be considered as well. As described earlier in Best Practice 1 in section Best Practices for ROP Data, each dataset and data tool should have an owner who is knowledgeable of the associated data processes, which generally helps to ensure continued interconnectivity between related datasets and data tools. Another consideration is to explicitly list all data sources on data tools, such that the dataflow dependencies are transparent both to users and to individual data tool owners for any future changes in data sources.

Specific Strategic ROP Data Recommendations

In addition to the best practices and the idealized dataflow discussed in the previous sections, the working group has also provided five specific strategic recommendations. Rather than providing general ongoing guidance for the treatment of ROP data, these recommendations are discrete actions that can be completed in a specified amount of time. For each of the following recommendations, the working group has provided a notional time frame for action completion, which could serve as an implementation plan should these recommendations be accepted.

Recommendation 1: Delete specific ROP data tools. Following Best Practice 3 in section Best Practices for ROP Data, as part of this ROP data evaluation, the working group identified seven ROP data tools as candidates for deletion. These data tools are all redundant to other existing data tools (see the previous discussion in section Data Tool Gaps and Redundancy), and often either the data were not being kept up-to-date or the tool itself was burdensome or difficult to update. This recommendation specifically includes the deletion of the Operating Reactor Analytics application, as it is burdensome for staff to maintain, the data are redundant to existing ROP data webpages, and the beta application did not meet the mark for usability and accuracy. This recommendation also includes deletion of the static, out-of-date, non-interactive maps (see Map of Power Reactor Sites and NRC Maps of Power Reactors) and deletion of the no longer used Morning Reports, as discussed in the section Data Tool Description, Function, Complexity, and Platform. The remaining three tools recommended for deletion are internal. The notional timeline for staff to accomplish these deletions could be six months or less, which includes adding warning banners for users as needed, deleting any other associated IT resources, removing links to these tools, and implementing appropriate redirects to updated information for users who may have bookmarked these tools.

Recommendation 2: Increase accessibility of internal ROP data tools. As discussed in the section Internal Data Tool Accessibility, approximately half of internal data tools are not currently linked within either the Operating Experience Hub or ROP Digital City. By adding such links, the staff can increase internal awareness and use of these tools, which may also in turn improve these tools and the data displayed. The working group recommends adding links to all internal data tools to both the Operating Experience Hub and ROP Digital City, in addition to any other places that may be appropriate (e.g., DRO SharePoint site, branch SharePoint sites, etc.). This recommendation also supports Best Practice 6 in section Best Practices for ROP Data, in that internal users will have a single one-stop shop to go to for data tools. The notional timeline for staff to accomplish the addition of these links could be three months or less and should also be coordinated with the implementation of Recommendation 3, below.

Recommendation 3: Create a centralized and comprehensive list of ROP datasets and data tools, which includes the designated owners, and make this information widely available to staff internally. In support of Best Practice 6 in section Best Practices for ROP Data, having an up-to-date list of ROP datasets and data tools and their respective owners will ensure awareness of the other related datasets and data tools, both for the owners and for the staff at large. This will also ensure that the owner of each dataset and data tool is clearly identified and up-to-date, in support of Best Practice 1 in section Best Practices for ROP Data. This can also contribute to staff's ability to create, delete, and combine ROP datasets and data tools in a strategic manner, which supports Best Practice 3 and Best Practice 6 in section Best Practices for ROP Data. The notional timeline for staff to accomplish this recommendation could be six months or less. The working files from this working group could be used as a starting point. The lists of datasets and data tools could be stored on ROP Digital City, and the lists should be updated by DRO staff on a periodic basis, as new ROP data tools are often being created.

Recommendation 4: Incorporate an awareness of ROP datasets and data tools into the ROP qualification programs. As discussed in the section Staff Knowledge, Training, and Access, staff generally have or can get the appropriate access to ROP datasets and data tools, but an awareness of ROP data has not been formally incorporated as part of any of the relevant qualification programs. As data analysis plays such an important role in the ROP today, the working group thinks there should be a more formalized introduction to ROP data, especially for newly qualifying staff. In coordination with Recommendation 3, above, the working group recommends incorporating brief, high-level information and references to the current ROP datasets and data tools, as well as to the ROP data evaluation performed in this report, into the qualification programs for ROP inspectors and engineers. The notional timeline for staff to accomplish these additions could be three months or less, after or in parallel with the implementation of Recommendation 3.

Recommendation 5: Add ROP data duties and skills to relevant position descriptions. As described in section Staff Position Descriptions, the regional and DRO position descriptions that the working group reviewed were generally inadequate with respect to ROP data responsibilities. Although the detailed specifics of working with ROP data may change, the working group thinks it is important that the general data skills and responsibilities of a position be captured in a high-level way that is consistent across the agency, where applicable. In accordance with Management Directive 10.37, "Position Evaluation and Benchmarks" (ML18073A270), the description of the position should be prepared by a supervisor most familiar with the work assigned. However, since updating many position descriptions could be a significant effort, the working group also recommends considering assigning this task to a group of staff. The notional timeline for staff to accomplish these updates could be a few years, as this activity is not expected to be the highest priority.

Conclusions

The ROP data strategy working group has concluded that overall, ROP data is in a very respectable state. Over the past several years, efforts by the staff have moved ROP data processes towards the consensus best practices and idealized dataflow, as described in this report. There have also been significant recent improvements in the overall quality, accessibility, and accuracy of ROP data. The working group has defined dataset, individual data owner, and data tool; documented six best practices; described an idealized dataflow; and made five specific strategic ROP data recommendations for DRO management to consider. The working group believes that although some specific areas for improvement were identified, the current state of ROP data, along with the staff's determination to continue making data improvements and improving transparency, supports the agency's goal of enhancing the usability and accessibility of ROP data for both the staff and, ultimately, the American public.