Creation of synthetic Survey-like Insurance Names for use in coding NHIS private insurance responses

Section 1: Use Case Identifiers

Use Case ID: HHS-CDC-00051
Agency: HHS
Op Div/Staff Div: CDC
Use Case Topic Area: Health & Medical
Is the AI use case found in the below list of general commercial AI products and services?
None of the above.
What is the intended purpose and expected benefits of the AI?
This solution affects the process of manually coding open-text responses and reviewing them for errors in the National Health Interview Survey (NHIS).

Previous efforts to augment the manual coding have proven ineffective. Because insurance programs are complex, more than 200 FTE hours a year are spent on just the initial coding of open-text insurance fields, and additional hours are spent reviewing that coding for errors. Creating survey-like insurance responses could theoretically save more than 100 hours of staff labor a year, greatly increasing efficiency and timeliness.
Describe the AI system's outputs.
The National Health Interview Survey calculates insurance coverage rates among the US non-institutionalized resident population, including statistics on the percentage of people with private insurance coverage. Information on insurance coverage is collected in two parts. First, survey respondents self-identify the type of insurance they have. Second, to help verify the insurance type, respondents also provide the name of their health plan or program in an open text field. The text entered in that field may include misspellings and acronyms, and it rarely matches the true, complete insurance plan name exactly.

This leads to a time-intensive process: staff may need to decipher the plan name that the respondent gave in the open text field, or that the interviewer abbreviated, and then determine whether that plan name confirms the type of insurance initially indicated during the interview. Creating AI-generated versions of known acronyms as data points (for example, Blue Cross Blue Shield of Alabama rendered as BCBS of AL and BC/BS of AL) gives staff a more representative set of information to use when coding, potentially saving up to hundreds of FTE hours spent reviewing responses. This also works in combination with existing efforts to augment manual coding with AI and machine learning to increase efficiency.
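
The Blue Cross Blue Shield of Alabama example above can be illustrated with a minimal sketch. This record does not describe how the AI system produced its variants, so the code below is only a simple rule-based stand-in, not the retired system. The lookup tables and the function name generate_variants are hypothetical and exist purely for illustration.

```python
# Hypothetical lookup tables (assumptions for illustration only).
# Each known phrase maps to the spellings a respondent or interviewer might use.
CARRIER_FORMS = {
    "Blue Cross Blue Shield": ["Blue Cross Blue Shield", "BCBS", "BC/BS"],
}
STATE_FORMS = {
    "Alabama": ["Alabama", "AL"],
}


def generate_variants(plan_name):
    """Return survey-like abbreviated spellings of a full plan name."""
    variants = {plan_name}
    for table in (CARRIER_FORMS, STATE_FORMS):
        for phrase, forms in table.items():
            expanded = set()
            for name in variants:
                if phrase in name:
                    # Substitute every known spelling of this phrase.
                    expanded.update(name.replace(phrase, form) for form in forms)
                else:
                    expanded.add(name)
            variants = expanded
    # Keep only the spellings that differ from the official name.
    variants.discard(plan_name)
    return sorted(variants)


if __name__ == "__main__":
    for variant in generate_variants("Blue Cross Blue Shield of Alabama"):
        print(variant)
```

A table-driven approach like this only covers abbreviations someone has already listed. It would not produce the misspellings that also appear in open-text responses.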

Stage of Development: Retired
Is the AI use case rights-impacting, safety-impacting, both, or neither?
Neither

Section 2: Use Case Summary

Date Initiated: N/A
Date when Acquisition and/or Development began: N/A
Date Implemented: N/A
Date Retired: 06/2024
Was the AI system involved in this use case developed (or is it to be developed) under contract(s) or in-house?
N/A
Provide the Procurement Instrument Identifier(s) (PIID) of the contract(s) used.
N/A
Is this AI use case supporting a High-Impact Service Provider (HISP) public-facing service?
N/A
Does this AI use case disseminate information to the public?
N/A
How is the agency ensuring compliance with Information Quality Act guidelines, if applicable?
N/A
Does this AI use case involve personally identifiable information (PII) that is maintained by the agency?
N/A
Has the Senior Agency Official for Privacy (SAOP) assessed the privacy risks associated with this AI use case?
Ongoing

Section 3: Data and Code

Do you have access to an enterprise data catalog or agency-wide data repository that enables you to identify whether or not the necessary datasets exist and are ready to develop your use case?
N/A
Describe any agency-owned data used to train, fine-tune, and/or evaluate performance of the model(s) used in this use case.
N/A
Is there available documentation for the model training and evaluation data that demonstrates the degree to which it is appropriate to be used in analysis or for making predictions?
N/A
Which, if any, demographic variables does the AI use case explicitly use as model features?
N/A
Does this project include custom-developed code?
N/A
Does the agency have access to the code associated with the AI use case?
N/A
If the code is open-source, provide the link for the publicly available source code.
N/A

Section 4: AI Enablement and Infrastructure

Does this AI use case have an associated Authority to Operate (ATO) for an AI system?
N/A
System Name: N/A
How long have you waited for the necessary developer tools to implement the AI use case?
N/A
For this AI use case, is the required IT infrastructure provisioned via a centralized intake form or process inside the agency?
N/A
Do you have a process in place to request access to computing resources for model training and development of the AI involved in this use case?
N/A
Has communication regarding the provisioning of your requested resources been timely?
N/A
How are existing data science tools, libraries, data products, and internally-developed AI infrastructure being re-used for the current AI use case?
N/A
Has information regarding the AI use case, including performance metrics and intended use of the model, been made available for review and feedback within the agency?
N/A