Reconsidering Survival Analysis: Strategies to Generate Survival Curves for Your Model
One machine learning approach to time-to-event problems, including those with censored observations, recasts the analysis as a series of binary classification tasks. This makes the problem simpler and more accessible for data-driven companies that already work with standard classifiers.
The method begins by discretizing the continuous survival time into distinct time intervals, such as months or years. For each interval, a binary label is defined, indicating whether the event occurred within that interval or not.
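The discretization step can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any particular paper: the function name `expand_to_intervals` and the convention of dropping the partly observed final interval for censored subjects are assumptions made for this example.

```python
from typing import List, Tuple


def expand_to_intervals(time: float, event: bool, width: float) -> List[Tuple[int, int]]:
    """Turn one subject's follow-up into (interval_index, label) rows.

    Interval k covers [k * width, (k + 1) * width).
    """
    last = int(time // width)  # interval in which follow-up ended
    # Every interval the subject fully survived gets label 0.
    rows = [(k, 0) for k in range(last)]
    if event:
        # The event happened in the final interval, so that interval gets label 1.
        rows.append((last, 1))
    # Assumption: a censored subject gets no row for the partly observed final interval.
    return rows


# Follow-up measured in months, with one-month intervals.
print(expand_to_intervals(2.5, True, 1.0))   # [(0, 0), (1, 0), (2, 1)]
print(expand_to_intervals(2.5, False, 1.0))  # [(0, 0), (1, 0)]
```

Stacking these rows across all subjects, together with their covariates, gives an ordinary table of binary labels that any classifier can consume.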
Using these interval-specific binary labels, separate or joint binary classifiers are trained to estimate the probability of the event occurring in each interval, given survival up to that interval. This conditional probability is the discrete-time hazard. Multiplying the complements of the hazards across successive intervals then yields an estimated survival curve over time.
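To make the aggregation step concrete, the sketch below estimates a hazard per interval and chains the hazards into a survival curve. It assumes the simplest possible stand-in for a classifier, one with no covariates, so the predicted probability in each interval is just the fraction of at-risk rows labelled 1. The names `empirical_hazards` and `survival_curve` are hypothetical, and the input rows are invented for illustration.

```python
from typing import List, Sequence, Tuple


def empirical_hazards(rows: Sequence[Tuple[int, int]]) -> List[float]:
    """Per-interval event frequency from (interval_index, label) rows.

    Assumes interval indices start at 0 and every interval up to the
    largest index has at least one row.
    """
    n_intervals = max(k for k, _ in rows) + 1
    at_risk = [0] * n_intervals
    events = [0] * n_intervals
    for k, label in rows:
        at_risk[k] += 1
        events[k] += label
    return [e / n for e, n in zip(events, at_risk)]


def survival_curve(hazards: Sequence[float]) -> List[float]:
    """Probability of surviving past each interval: running product of (1 - hazard)."""
    curve = []
    surviving = 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


# Four subjects at risk in interval 0 (one event), two in interval 1 (one event).
rows = [(0, 1), (0, 0), (0, 0), (0, 0), (1, 1), (1, 0)]
hazards = empirical_hazards(rows)
print(hazards)                  # [0.25, 0.5]
print(survival_curve(hazards))  # [0.75, 0.375]
```

With a real classifier, `hazards` would instead hold the model's predicted probabilities for one subject across the intervals, and `survival_curve` would be applied unchanged.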
This approach, often referred to as discrete-time survival modeling or multi-task binary classification for survival, offers several advantages. It leverages standard binary classification algorithms and software. It also simplifies the handling of censored data: a censored patient contributes labels only for the intervals covered by their follow-up, and the labels for later intervals are masked out. Moreover, it enables flexible horizon-specific predictions and the use of common machine learning tools.
In clinical contexts, it is common and computationally practical to frame event occurrence within a specified window, such as "the risk of event within 7 days," as a binary outcome, which allows direct use of binary classifiers like random forests.
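A fixed-window label of this kind can be built as follows. This is a sketch under one common convention, which is an assumption here rather than something the cited work prescribes: subjects censored before the horizon have an unknown outcome and are left out of training. The function name `horizon_label` is hypothetical.

```python
from typing import Optional


def horizon_label(time: float, event: bool, horizon: float) -> Optional[int]:
    """Binary label for "event within `horizon`", or None if the outcome is unknown."""
    if event and time <= horizon:
        return 1  # event observed inside the window
    if time >= horizon:
        return 0  # followed for the whole window without an event inside it
    # Assumption: censored before the horizon, so the outcome is unknown and the row is dropped.
    return None


# Times in days, 7-day window.
print(horizon_label(3, True, 7))    # 1
print(horizon_label(10, True, 7))   # 0
print(horizon_label(3, False, 7))   # None
```

Rows labelled `None` are filtered out before fitting, and the remaining 0/1 labels can be passed to any binary classifier.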
While more complex survival models, such as the Cox proportional hazards model or its neural network extensions, use continuous time and require specialized methods, they can be complemented by discretized binary formulations for practicality and interpretability.
However, compared with survival models designed for continuous hazard estimation, the binary formulation sacrifices some fine-grained time-to-event modeling capability: event timing is resolved only to the width of the chosen intervals.
This strategy is based on the research papers by Zhong and Tibshirani (arXiv 2019) and Yu, Greiner, Lin, and Baracos (Advances in Neural Information Processing Systems 2011).
The multiple-classifier strategy is not limited to machine learning research. It applies equally to health-related settings such as the analysis of medical data, where it yields window-specific predictions like "the risk of event within 7 days" using only standard binary classification algorithms and software.