Athletic Signature: Predicting the Next Game Lineup in Collegiate Basketball

Document Type

Peer-Reviewed Article

Publication Date

2024

Abstract

Advances in machine learning (ML) tools and techniques have enabled the non-intrusive collection and rapid analysis of massive amounts of data on athletes in competitive collegiate sports. These advances have facilitated the development of services that coaches can use to turn these data into actionable insights for designing training schedules and effective strategies that maximize an athlete’s performance while minimizing injury risk. Collegiate sports programs use data to gain a competitive advantage. While game statistics are publicly available, combining more than one form of data can help reveal patterns. We developed a framework that considers various modalities and creates an athletic signature for each athlete to predict future performance. Our research studies 42 distinct features that quantify the internal and external stressors athletes face, and uses ML algorithms to characterize and estimate athletic readiness (in the form of the reactive strength index modified, RSImod). The study, conducted over 26 weeks with 17 collegiate women’s basketball athletes, first performed sensitivity analysis using a hybrid approach that combines the strengths of filter-based, wrapper-based, and embedded feature importance techniques to identify the features that most significantly affect athlete readiness. These features were then categorized into four moderating variables (MVs, i.e., factors): sleep, cardiac rhythm, training strain, and travel schedule. Further, we used factor analysis to enhance interpretability and reduce computational complexity. A hybrid boosted-decision-trees model, built around athlete clusters, predicted future athletic readiness from the MVs with a mean squared error (MSE) of 0.0102. Partial dependence plots (PDPs) helped qualitatively characterize the relationship between the moderating variables and the RSImod score.
Athletic signatures, which uniquely define athlete-specific MV patterns, account for intra-individual variability and, in combination with model predictions, offer a better statistical basis for predicting the game lineup (green/yellow/red card assignment). SHAP (SHapley Additive exPlanations) values indicate the contributing MVs in order of significance for each prediction, enabling coaches to make informed decisions about training adjustments and athlete well-being, leading to performance improvement. Using the fingerprint mechanism, we created green (within 1 standard deviation (SD)), yellow (> 1 SD and < 2 SD), and red (> 2 SD) card zones for athlete readiness assessment. While this study was conducted on Division I (D-I) women’s basketball, the modalities apply to several other sports, such as soccer, volleyball, football, and ice hockey. The framework allows coaches to understand their athletes' dynamics from a strictly data-driven perspective, which, combined with their personal experience and interactions with the team, helps them strategize their next moves.
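
The green/yellow/red zoning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `readiness_zone`, the baseline values, the use of the sample standard deviation, and the use of absolute deviation from the athlete's own baseline mean are all assumptions made for illustration. The abstract does not say how scores at exactly 1 SD or 2 SD are handled; here they are assigned to the milder zone.

```python
from statistics import mean, stdev


def readiness_zone(baseline, new_score):
    """Classify new_score against an athlete's own baseline readiness scores.

    Assumptions (not from the source): absolute deviation from the baseline
    mean, sample SD, and boundary values assigned to the milder zone.
    """
    mu = mean(baseline)
    sd = stdev(baseline)  # sample SD; requires at least two baseline values
    if sd == 0:
        raise ValueError("baseline has no variability; zones are undefined")
    deviation_in_sd = abs(new_score - mu) / sd
    if deviation_in_sd <= 1:
        return "green"   # within 1 SD of the athlete's own mean
    if deviation_in_sd <= 2:
        return "yellow"  # between 1 and 2 SD
    return "red"         # beyond 2 SD


# Hypothetical RSImod-like baseline for one athlete (made-up numbers).
baseline = [0.40, 0.42, 0.38, 0.41, 0.39]
for score in (0.405, 0.425, 0.45):
    print(score, readiness_zone(baseline, score))
```

Because each athlete is compared against their own baseline rather than a team-wide one, the same raw score can fall in different zones for different athletes, which is the intra-individual variability the signature is meant to capture.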

DOI

10.1007/s00521-024-10383-z

