Assess the documentation of cognitive tests and biomarkers in electronic health records via natural language processing for Alzheimer's disease and related dementias

Int J Med Inform. 2023 Feb:170:104973. doi: 10.1016/j.ijmedinf.2022.104973. Epub 2022 Dec 21.

Abstract

Background: Cognitive tests and biomarkers are key information for assessing the severity and tracking the progression of Alzheimer's disease (AD) and AD-related dementias (AD/ADRD), yet both are often documented only in the clinical narratives of patients' electronic health records (EHRs). In this work, we aim to (1) assess the documentation of cognitive tests and biomarkers in EHRs that can be used as real-world endpoints, and (2) identify, extract, and harmonize the different commonly used cognitive tests from clinical narratives into categorical AD/ADRD severity using natural language processing (NLP) methods.

Methods: We developed a rule-based NLP pipeline to extract the cognitive tests and biomarkers from clinical narratives in AD/ADRD patients' EHRs. We aggregated the extracted results to the patient level and harmonized the cognitive test scores into severity categories using cutoffs determined based on both relevant literature and domain knowledge of AD/ADRD clinicians.
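
The extract, aggregate, and harmonize steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the regular expressions, the function names, the rule of keeping the lowest score per patient, and above all the cutoff values are hypothetical placeholders chosen for illustration, since the abstract does not report the actual rules or cutoffs.

```python
import re
from typing import Optional

# Hypothetical patterns: a test name followed, within a few non-digit
# characters, by a one- or two-digit score (optionally written as "NN/30").
PATTERNS = {
    "MMSE": re.compile(r"\bMMSE\b\D{0,20}?(\d{1,2})(?:\s*/\s*30)?", re.I),
    "MoCA": re.compile(r"\bMoCA\b\D{0,20}?(\d{1,2})(?:\s*/\s*30)?", re.I),
}

# Placeholder cutoffs (lower bound, label), highest band first.
# These are NOT the cutoffs used in the study.
ILLUSTRATIVE_CUTOFFS = {
    "MMSE": [(21, "mild"), (11, "moderate"), (0, "severe")],
    "MoCA": [(18, "mild"), (10, "moderate"), (0, "severe")],
}


def extract_scores(note: str) -> list[tuple[str, int]]:
    """Return (test, score) pairs found in one clinical note."""
    found = []
    for test, pattern in PATTERNS.items():
        for match in pattern.finditer(note):
            score = int(match.group(1))
            if 0 <= score <= 30:  # both tests are scored out of 30
                found.append((test, score))
    return found


def to_severity(test: str, score: int) -> Optional[str]:
    """Map a score to a severity label using the placeholder cutoffs."""
    for lower_bound, label in ILLUSTRATIVE_CUTOFFS.get(test, []):
        if score >= lower_bound:
            return label
    return None


def patient_severity(notes: list[str]) -> dict[str, Optional[str]]:
    """Aggregate note-level scores to the patient level.

    Assumption: keep the lowest score per test across the patient's notes.
    """
    lowest: dict[str, int] = {}
    for note in notes:
        for test, score in extract_scores(note):
            lowest[test] = min(score, lowest.get(test, score))
    return {test: to_severity(test, score) for test, score in lowest.items()}


if __name__ == "__main__":
    notes = ["MMSE score of 22/30 today.", "Follow-up: MoCA: 17"]
    print(patient_severity(notes))
```

A production rule set would also need to handle negation, dates and other numbers near the test name, and spelled-out test names, none of which this sketch attempts.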

Results: We identified an AD/ADRD cohort of 48,912 patients from the University of Florida (UF) Health system and identified 7 measurements (6 cognitive tests and 1 biomarker) that are frequently documented in our data. Our NLP pipeline achieved an overall F1-score of 0.9059 across the 7 measurements. Among the 6 cognitive tests, we were able to harmonize 4 cognitive test scores into severity categories, and we described the population characteristics of patients at each severity level. We also identified several factors related to the availability of this documentation in EHRs.

Conclusion: This study demonstrates that our NLP pipeline can accurately extract cognitive tests and biomarkers of AD/ADRD for downstream studies. Although the documentation of cognitive tests and biomarkers in EHRs appears to be low, real-world data (RWD) remain an important resource for AD/ADRD research. Nevertheless, a standardized approach to documenting cognitive tests and biomarkers in EHRs is warranted.

Keywords: Alzheimer’s Disease and Related Dementias; Cognitive Tests; Natural Language Processing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alzheimer Disease* / diagnosis
  • Biomarkers
  • Documentation
  • Electronic Health Records
  • Humans
  • Natural Language Processing

Substances

  • Biomarkers