Best Match: New relevance search for PubMed

Section 1: Use Case Identifiers

Use Case ID: HHS-NIH-00016
Agency: HHS
Op Div/Staff Div: NIH
Use Case Topic Area: Government Services (includes Benefits and Service Delivery)
Is the AI use case found in the below list of general commercial AI products and services?
None of the above.
Describe the AI system's outputs.
Best Match ranks PubMed search results by relevance, as an alternative to the date-sort order used in many traditional search engines. The technique combines the intelligence of our users with machine-learning technology: it is trained on past user searches and uses dozens of relevance-ranking factors. The Best Match algorithm demonstrates state-of-the-art retrieval performance and an improved user experience.
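As a rough illustration of the difference between date-sort order and relevance ranking, the sketch below orders the same small result list both ways. It is not the Best Match implementation: the article records, the factor names (term_match, past_clicks, recency), and the hand-set weights are all hypothetical, and a learned ranker would fit its weights from training data rather than use fixed values.

```python
from dataclasses import dataclass


@dataclass
class Article:
    pmid: str        # hypothetical identifier, not a real PMID
    year: int        # publication year
    features: dict   # hypothetical relevance-ranking factors in [0, 1]


# Hand-set weights for illustration only; a trained model would learn these.
WEIGHTS = {"term_match": 2.0, "past_clicks": 1.0, "recency": 0.5}


def score(article):
    """Weighted sum of the article's ranking factors."""
    return sum(w * article.features.get(name, 0.0) for name, w in WEIGHTS.items())


def date_sort(articles):
    """Newest first, ignoring how well the article matches the query."""
    return sorted(articles, key=lambda a: a.year, reverse=True)


def relevance_sort(articles):
    """Highest combined relevance score first."""
    return sorted(articles, key=score, reverse=True)


articles = [
    Article("A", 2023, {"term_match": 0.2, "past_clicks": 0.1, "recency": 1.0}),
    Article("B", 2015, {"term_match": 0.9, "past_clicks": 0.8, "recency": 0.2}),
    Article("C", 2019, {"term_match": 0.5, "past_clicks": 0.3, "recency": 0.6}),
]

by_date = [a.pmid for a in date_sort(articles)]
by_relevance = [a.pmid for a in relevance_sort(articles)]
```

In this toy data the newest article matches the query poorly, so it leads the date-sorted list but falls to the bottom of the relevance-sorted one.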
Stage of Development: Operation and Maintenance
Is the AI use case rights-impacting, safety-impacting, both, or neither?
Neither

Section 2: Use Case Summary

Date Initiated: 06/2016
Date when Acquisition and/or Development began: 06/2016
Date Implemented: 01/2023
Date Retired: N/A
Was the AI system involved in this use case developed (or is it to be developed) under contract(s) or in-house?
Developed in-house.
Provide the Procurement Instrument Identifier(s) (PIID) of the contract(s) used.
N/A
Is this AI use case supporting a High-Impact Service Provider (HISP) public-facing service?
N/A
Does this AI use case disseminate information to the public?
Yes
How is the agency ensuring compliance with Information Quality Act guidelines, if applicable?
This use case is operated in compliance with HHS's agencywide Information Quality Act policy and procedures.
Does this AI use case involve personally identifiable information (PII) that is maintained by the agency?
No
Has the Senior Agency Official for Privacy (SAOP) assessed the privacy risks associated with this AI use case?
Ongoing

Section 3: Data and Code

Do you have access to an enterprise data catalog or agency-wide data repository that enables you to identify whether or not the necessary datasets exist and are ready to develop your use case?
No
Describe any agency-owned data used to train, fine-tune, and/or evaluate performance of the model(s) used in this use case.
The PubMed literature collection includes articles relevant to biomedicine and the life sciences, broadly defined to encompass the information needs of those working in healthcare and the life sciences. The Best Match algorithm was trained on user interaction data: clicks on the lists of articles returned when users search the PubMed collection.
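One common way to turn click interaction data into training signal is to aggregate it into a click-through rate for each query and article pair. The sketch below shows that idea under stated assumptions: the log format (query, article ID, clicked flag), the function name click_labels, and the sample records are hypothetical, and this entry does not specify how Best Match actually derives its training labels.

```python
from collections import defaultdict


def click_labels(log):
    """Aggregate (query, article_id, clicked) impressions into click-through rates."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for query, article_id, was_clicked in log:
        shown[(query, article_id)] += 1
        if was_clicked:
            clicked[(query, article_id)] += 1
    # Click-through rate = clicks / impressions for each (query, article) pair.
    return {key: clicked[key] / shown[key] for key in shown}


# Hypothetical search log: each row is one article shown for one search.
log = [
    ("brca1 mutation", "111", True),
    ("brca1 mutation", "111", False),
    ("brca1 mutation", "222", False),
    ("brca1 mutation", "333", True),
]

labels = click_labels(log)
```

Labels of this kind could then serve as relevance targets when fitting a ranking model. Raw click rates are biased toward results shown near the top of the list, so a production system would typically need to correct for position.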
Is there available documentation for the model training and evaluation data that demonstrates the degree to which it is appropriate to be used in analysis or for making predictions?
Documentation is widely available
Which, if any, demographic variables does the AI use case explicitly use as model features?
N/A
Does this project include custom-developed code?
Yes
If the code is open-source, provide the link for the publicly available source code.
https://github.com/ncbi-nlp/PubMed-Best-Match

Section 4: AI Enablement and Infrastructure

Does this AI use case have an associated Authority to Operate (ATO) for an AI system?
No
System Name: N/A
How long have you waited for the necessary developer tools to implement the AI use case?
6-12 months
For this AI use case, is the required IT infrastructure provisioned via a centralized intake form or process inside the agency?
Yes
Do you have a process in place to request access to computing resources for model training and development of the AI involved in this use case?
Yes
Has communication regarding the provisioning of your requested resources been timely?
Yes
How are existing data science tools, libraries, data products, and internally-developed AI infrastructure being re-used for the current AI use case?
Re-uses production-level code from a different use case
Has information regarding the AI use case, including performance metrics and intended use of the model, been made available for review and feedback within the agency?
Limited documentation for review