DCIPHER (SEDRIC) AIP for Advanced Foodborne Outbreak Investigation (AI Summarization)

Section 1: Use Case Identifiers

Use Case ID: HHS-CDC-00039
Agency: HHS
Op Div/Staff Div: CDC
Use Case Topic Area: Mission-Enabling (internal agency support)
Is the AI use case found in the below list of general commercial AI products and services?
None of the above.
What is the intended purpose and expected benefits of the AI?
This AI solution supports the investigation of foodborne disease outbreaks. These investigations require cooperation among CDC staff, FDA, USDA, and local agencies, and the AI system is accessed through a centralized data platform, the System for Enteric Disease Response, Investigation, and Coordination (SEDRIC). For more information on SEDRIC, see: https://www.cdc.gov/foodborne-outbreaks/php/foodsafety/tools/index.html

SEDRIC's AIP use case enables CDC epidemiologists to accelerate their investigations of multi-state foodborne disease outbreaks by making better use of rich data sources, such as grocery store receipts, that otherwise require extensive time and human effort to parse. In addition, this workflow would free up epidemiologists' time and could increase how often both CDC and state, tribal, local, and territorial (STLT) partners use shopper cards, receipts, and free-text responses to support investigations. The summarization capability automates the time-consuming process of mapping the common names of food items, greatly reducing the staff time needed to generate dashboards on current foodborne investigations that inform outbreak response decisions.
Describe the AI system's outputs.
The Artificial Intelligence Platform (AIP) available within SEDRIC enables CDC epidemiologists to accelerate their investigations of multi-state foodborne disease outbreaks. AIP summarizes information previously extracted from shopper receipts and other records to provide insights for investigators. Because an ingredient can appear in multiple food products, and some ingredients, such as the herb coriander/cilantro, go by multiple names or are reported in multiple languages, the summarization tool provides a faster way to gather summary information from receipts on food items that may be part of a foodborne investigation.
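The name-mapping problem described above can be illustrated with a minimal sketch. This is not AIP's method: AIP performs AI summarization, whereas the code below substitutes a plain lookup table to show what it means to map variant names to one item and then summarize across cases. The synonym table, the function names, and the case identifiers are all hypothetical.

```python
from collections import Counter

# Hypothetical synonym table (an assumption for illustration only):
# lowercase variant name -> canonical item name.
SYNONYMS = {
    "cilantro": "coriander/cilantro",
    "coriander": "coriander/cilantro",
    "fresh coriander": "coriander/cilantro",
    "coriandre": "coriander/cilantro",  # French name
}


def canonical_name(raw):
    """Normalize case and whitespace, then map known variants to one name."""
    key = " ".join(raw.lower().split())
    return SYNONYMS.get(key, key)


def count_cases_by_item(case_items):
    """Count how many cases reported each canonical item.

    case_items maps a case identifier to the list of item names found on
    that case's receipts. An item is counted once per case, even if it
    appears under several names or on several receipts.
    """
    counts = Counter()
    for items in case_items.values():
        counts.update({canonical_name(item) for item in items})
    return counts


if __name__ == "__main__":
    # Made-up example input, not real investigation data.
    example = {
        "case-1": ["Cilantro", "coriander"],
        "case-2": ["CORIANDRE ", "onion"],
    }
    for item, n in count_cases_by_item(example).most_common():
        print(item, n)
```

Counting each item once per case, rather than once per receipt line, keeps a single case that lists the same herb under two names from inflating the total.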
Stage of Development: Operation and Maintenance
Is the AI use case rights-impacting, safety-impacting, both, or neither?
Neither

Section 2: Use Case Summary

Date Initiated: 10/2023
Date when Acquisition and/or Development began: 10/2023
Date Implemented: 02/2024
Date Retired: N/A
Was the AI system involved in this use case developed (or is it to be developed) under contract(s) or in-house?
Developed with contracting resources.
Provide the Procurement Instrument Identifier(s) (PIID) of the contract(s) used.
N/A
Is this AI use case supporting a High-Impact Service Provider (HISP) public-facing service?
N/A
Does this AI use case disseminate information to the public?
No
How is the agency ensuring compliance with Information Quality Act guidelines, if applicable?
N/A
Does this AI use case involve personally identifiable information (PII) that is maintained by the agency?
No
Has the Senior Agency Official for Privacy (SAOP) assessed the privacy risks associated with this AI use case?
Ongoing

Section 3: Data and Code

Do you have access to an enterprise data catalog or agency-wide data repository that enables you to identify whether or not the necessary datasets exist and are ready to develop your use case?
No
Describe any agency-owned data used to train, fine-tune, and/or evaluate performance of the model(s) used in this use case.
Data are used in outbreak response scenarios, such as foodborne illness outbreak response. The data used depend on the situation and outbreak, and may be owned by CDC, FDA, USDA, State Health Departments, Tribal Health Departments, Local Health Departments, Territorial Health Departments, or other entities.
Is there available documentation for the model training and evaluation data that demonstrates the degree to which it is appropriate to be used in analysis or for making predictions?
Documentation has been partially completed
Which, if any, demographic variables does the AI use case explicitly use as model features?
N/A
Does this project include custom-developed code?
Yes
If the code is open-source, provide the link for the publicly available source code.
N/A

Section 4: AI Enablement and Infrastructure

Does this AI use case have an associated Authority to Operate (ATO) for an AI system?
Yes
System Name: 1 CDC Data Platform (1CDP)
How long have you waited for the necessary developer tools to implement the AI use case?
Less than 6 months
For this AI use case, is the required IT infrastructure provisioned via a centralized intake form or process inside the agency?
Yes
Do you have a process in place to request access to computing resources for model training and development of the AI involved in this use case?
Yes
Has communication regarding the provisioning of your requested resources been timely?
Yes
How are existing data science tools, libraries, data products, and internally-developed AI infrastructure being re-used for the current AI use case?
Use of existing data platforms
Has information regarding the AI use case, including performance metrics and intended use of the model, been made available for review and feedback within the agency?
Documentation has been published