Phenotype clustering in health care: A narrative review for clinicians

Tyler J Loftus; Benjamin Shickel; Jeremy A Balch; Patrick J Tighe; Kenneth L Abbott; Brian Fazzone; Erik M Anderson; Jared Rozowsky; Tezcan Ozrazgat-Baslanti; Yuanfang Ren; Scott A Berceli; William R Hogan; Philip A Efron; J Randall Moorman; Parisa Rashidi; Gilbert R Upchurch Jr; Azra Bihorac

doi:10.3389/frai.2022.842306

Front Artif Intell. 2022 Aug 12:5:842306. doi: 10.3389/frai.2022.842306. eCollection 2022.

Authors

Tyler J Loftus¹ ² ³, Benjamin Shickel³ ⁴, Jeremy A Balch¹, Patrick J Tighe⁵, Kenneth L Abbott¹, Brian Fazzone¹, Erik M Anderson¹, Jared Rozowsky¹, Tezcan Ozrazgat-Baslanti² ³ ⁴, Yuanfang Ren² ³ ⁴, Scott A Berceli¹, William R Hogan⁶, Philip A Efron¹, J Randall Moorman⁷, Parisa Rashidi² ³ ⁸, Gilbert R Upchurch Jr¹, Azra Bihorac² ³ ⁴

Affiliations

¹ Department of Surgery, University of Florida Health, Gainesville, FL, United States.
² Precision and Intelligent Systems in Medicine (PrismaP), University of Florida, Gainesville, FL, United States.
³ Intelligent Critical Care Center, University of Florida, Gainesville, FL, United States.
⁴ Department of Medicine, University of Florida Health, Gainesville, FL, United States.
⁵ Departments of Anesthesiology, Orthopedics, and Information Systems/Operations Management, University of Florida Health, Gainesville, FL, United States.
⁶ Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, United States.
⁷ Department of Medicine, University of Virginia, Charlottesville, VA, United States.
⁸ Departments of Biomedical Engineering, Computer and Information Science and Engineering, and Electrical and Computer Engineering, University of Florida, Gainesville, FL, United States.

Abstract

Human pathophysiology is occasionally too complex for unaided hypothetico-deductive reasoning and the isolated application of additive or linear statistical methods. Clustering algorithms use input data patterns and distributions to form groups of similar patients or diseases that share distinct properties. Although clinicians frequently perform tasks that may be enhanced by clustering, few receive formal training in it, and clinician-centered literature on clustering is sparse. Adding value to clinical care and research through clustering requires a thorough understanding of how to process and optimize data, select features, weigh the strengths and weaknesses of different clustering methods, select the optimal clustering method, and apply clustering methods to solve problems. These concepts and our suggestions for implementing them are described in this narrative review of published literature. All clustering methods share the weakness of finding potential clusters even when natural clusters do not exist, underscoring the importance of applying data-driven techniques as well as clinical and statistical expertise to clustering analyses. When applied properly, patient and disease phenotype clustering can reveal obscured associations that can help clinicians understand disease pathophysiology, predict treatment response, and identify patients for clinical trial enrollment.

Keywords: artificial intelligence; cluster; endotype; endotyping; machine learning.

Publication types

Review
