AccessGUDID Data Validation

Section 1: Use Case Identifiers

Use Case ID: HHS-NIH-00002
Agency: HHS
Op Div/Staff Div: NIH
Use Case Topic Area: Government Services (includes Benefits and Service Delivery)
Is the AI use case found in the below list of general commercial AI products and services?
None of the above.
What is the intended purpose and expected benefits of the AI?
The artificial intelligence (AI) system for AccessGUDID will improve data quality assurance by detecting anomalies in medical device metrics, such as inconsistent units of measurement or unusual device dimensions. It will analyze both structured data, such as numerical measurements, and unstructured data, such as free-text entries. The AI will not modify data directly; it will only flag potential issues for human review. Integrated into the U.S. Food and Drug Administration (FDA) processing workflow, it will use standard codes to group records precisely and language models to process unstructured data. Deployed within the regulatory compliance and data management areas of the National Library of Medicine (NLM), the system aims to streamline data validation, reduce manual workload, and maintain the integrity of the AccessGUDID database, while leaving all decisions that require judgment to human reviewers.

The key problem the AI aims to address is inconsistent or incorrect data entries, such as mismatched units of measurement or outlier device dimensions, which can lead to regulatory compliance issues and potential risks in device usage. By automating anomaly detection, the AI will make data validation more efficient and reduce the manual effort required to identify and correct errors. This is expected to improve the accuracy of the data and to speed up the processing and integration of new submissions into the AccessGUDID database. Improved data integrity supports better decision-making by manufacturers, healthcare providers, and regulators, and helps ensure that medical devices on the market meet safety and performance standards. Ultimately, the expected outcome is increased trust in the medical device information provided to end users, which contributes to the broader goals of improving public health and patient safety.
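
The entry does not specify how anomalies will be detected. As an illustration only, the following minimal sketch groups records by a standard code, flags units that differ from the most common unit in the group, and flags outlier values with a modified z-score (median absolute deviation). The field names (device_id, group_code, value, unit), the 3.5 threshold, and the choice of statistic are assumptions made for this example, not details of the planned system.

```python
from collections import Counter, defaultdict
from statistics import median


def flag_anomalies(records, threshold=3.5):
    """Return (device_id, label) flags for human review.

    The input records are only read, never modified.
    All field names here are hypothetical.
    """
    # Group records by their (hypothetical) standard code.
    groups = defaultdict(list)
    for rec in records:
        groups[rec["group_code"]].append(rec)

    flags = []
    for recs in groups.values():
        # Treat the most common unit in the group as the expected unit.
        expected_unit = Counter(r["unit"] for r in recs).most_common(1)[0][0]
        for r in recs:
            if r["unit"] != expected_unit:
                flags.append((r["device_id"], "unit_mismatch"))

        # Outlier check on records that share the expected unit,
        # using a modified z-score based on the median absolute deviation.
        same_unit = [r for r in recs if r["unit"] == expected_unit]
        values = [r["value"] for r in same_unit]
        med = median(values)
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            continue  # no spread in the group, so nothing to score
        for r in same_unit:
            score = 0.6745 * (r["value"] - med) / mad
            if abs(score) > threshold:
                flags.append((r["device_id"], "outlier_value"))
    return flags
```

In this sketch a record entered in cm within a group that is otherwise in mm would receive a unit_mismatch flag, and a length of 100 mm in a group clustered around 10 mm would receive an outlier_value flag. Both remain untouched in the data and are left for a human reviewer to resolve.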
Describe the AI system's outputs.
For inputs, the system handles structured data, such as numerical metrics (e.g., device dimensions, units of measurement), and unstructured data, such as free-text entries describing medical devices. This data is sourced from the AccessGUDID database, which aggregates information submitted by manufacturers to the FDA. The frequency of data processing aligns with the batch submissions from the FDA, typically occurring daily or whenever new data is received. The system is intended to scale with growing datasets so that it can process large volumes of data.

For outputs, the AI generates predictions and classifications that identify potential anomalies in the data, such as incorrect units of measurement or outlier device specifications. The results are presented as flags or labels indicating the presence of an anomaly, which are then reviewed by human analysts. The AI produces these results during each batch processing run, so that every data submission is evaluated promptly before being fully integrated into the AccessGUDID database. The output helps streamline the data validation process and improve overall data quality.
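
As a rough illustration of the output described above, the sketch below runs a set of checks over one batch and returns flag records for human review without changing the submitted data. Everything in it is hypothetical: the AnomalyFlag fields, the record keys (record_id, description, unit), and the function names are assumptions for this example. The regular expression is a simple stand-in for the language-model processing of free text that the entry mentions, not a description of it.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AnomalyFlag:
    """One flag for a human analyst; hypothetical structure."""
    record_id: str
    label: str
    detail: str


# Stand-in for free-text analysis: find "<number> <unit>" in a description.
UNIT_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s*(mm|cm|in)\b", re.IGNORECASE)


def check_text_unit(record):
    """Flag a record whose free-text unit differs from its structured unit."""
    match = UNIT_PATTERN.search(record["description"])
    if match and match.group(1).lower() != record["unit"].lower():
        detail = f"text says {match.group(1)!r}, field says {record['unit']!r}"
        return AnomalyFlag(record["record_id"], "unit_conflict", detail)
    return None


def process_batch(batch, checks):
    """Run every check on every record and collect the resulting flags.

    Records are only read; any correction is left to the human reviewer.
    """
    flags = []
    for record in batch:
        for check in checks:
            flag = check(record)
            if flag is not None:
                flags.append(flag)
    return flags
```

Calling process_batch(batch, [check_text_unit]) on a daily batch would return a list of AnomalyFlag objects that could be placed in a review queue. Additional checks can be added to the list without changing the batch loop.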
Stage of Development: Initiated
Is the AI use case rights-impacting, safety-impacting, both, or neither?
Neither

Section 2: Use Case Summary

Date Initiated: 03/2024
Date when Acquisition and/or Development began: N/A
Date Implemented: N/A
Date Retired: N/A
Was the AI system involved in this use case developed (or is it to be developed) under contract(s) or in-house?
N/A
Provide the Procurement Instrument Identifier(s) (PIID) of the contract(s) used.
N/A
Is this AI use case supporting a High-Impact Service Provider (HISP) public-facing service?
N/A
Does this AI use case disseminate information to the public?
N/A
How is the agency ensuring compliance with Information Quality Act guidelines, if applicable?
N/A
Does this AI use case involve personally identifiable information (PII) that is maintained by the agency?
N/A
Has the Senior Agency Official for Privacy (SAOP) assessed the privacy risks associated with this AI use case?
Ongoing

Section 3: Data and Code

Do you have access to an enterprise data catalog or agency-wide data repository that enables you to identify whether or not the necessary datasets exist and are ready to develop your use case?
N/A
Describe any agency-owned data used to train, fine-tune, and/or evaluate performance of the model(s) used in this use case.
N/A
Is there available documentation for the model training and evaluation data that demonstrates the degree to which it is appropriate to be used in analysis or for making predictions?
N/A
Which, if any, demographic variables does the AI use case explicitly use as model features?
N/A
Does this project include custom-developed code?
N/A
Does the agency have access to the code associated with the AI use case?
N/A
If the code is open-source, provide the link for the publicly available source code.
N/A

Section 4: AI Enablement and Infrastructure

Does this AI use case have an associated Authority to Operate (ATO) for an AI system?
N/A
System Name: N/A
How long have you waited for the necessary developer tools to implement the AI use case?
N/A
For this AI use case, is the required IT infrastructure provisioned via a centralized intake form or process inside the agency?
N/A
Do you have a process in place to request access to computing resources for model training and development of the AI involved in this use case?
N/A
Has communication regarding the provisioning of your requested resources been timely?
N/A
How are existing data science tools, libraries, data products, and internally-developed AI infrastructure being re-used for the current AI use case?
N/A
Has information regarding the AI use case, including performance metrics and intended use of the model, been made available for review and feedback within the agency?
N/A