Abstract
Cervical cancer remains a significant public health challenge worldwide. The disease originates mainly from the epithelial cells of the cervix (the lower part of the uterus, which connects to the vagina), and persistent infection with high-risk human papillomavirus (HPV) is its primary cause. According to the 2020 Global Cancer Statistics Report, cervical cancer is the fourth leading cause of cancer-related death among women globally and accounts for approximately 6.5% of new cancer cases in women worldwide. For women of childbearing age, diagnosis and treatment are key aspects of clinical management. In recent years, cervical cancer in China has increasingly been diagnosed in younger patients. With the adjustment of fertility policies and stronger individual fertility intentions, the proportion of cervical cancer patients who strongly wish to preserve fertility is expected to rise further in the coming years. Fertility-sparing treatment (FST) for cervical cancer has therefore attracted increasing attention in clinical research. FST encompasses several surgical approaches that differ in the extent of parametrial resection and in surgical technique, and these differences lead to different perinatal outcomes. Its core objective is to preserve fertility while ensuring oncological safety: achieving tumor control comparable to that of radical surgery, reducing the risk of preterm birth, and increasing the likelihood of a healthy live birth. Close monitoring of oncological outcomes after FST is therefore crucial, particularly assessment of the risk of postoperative recurrence in patients with early-stage cervical cancer.
However, no model has yet been designed specifically to predict the risk of recurrence after FST in patients with early-stage cervical cancer. This study therefore aims to identify the key factors influencing recurrence after FST in early-stage cervical cancer by analyzing multi-center data with multiple machine learning (ML) algorithms. We further seek to develop an interpretable predictive model that captures these risk factors and supports clinical decision-making by guiding individualized treatment strategies and preventive interventions. The model has been validated on real-world data, shows strong practical utility, and has been deployed as an online calculator: by entering the relevant clinical parameters, users can rapidly obtain a personalized prediction of recurrence risk.
Missing-value analysis was conducted on 210 samples, of which 11 had missing values for some variables (missing rate 0.2%). Multiple imputation was performed with the mice package using three methods, random forest (RF), regression classification, and the package default, each run for 50 iterations. The three methods were compared by sensitivity analysis using forest plots to select the optimal imputation strategy, and the best of the five imputed datasets was chosen on the basis of the minimum AIC and variance difference. After imputation, the data were centered, standardized, and Yeo-Johnson transformed, and near-zero-variance variables were removed. To address class imbalance, four resampling techniques were compared: downsampling, upsampling, synthetic minority over-sampling technique (SMOTE), and random over-sampling examples (ROSE). For each resampled dataset, the treebag method in the caret package was evaluated with five repeats of 10-fold cross-validation, and the best resampling technique was determined by area under the curve (AUC). To improve model stability, variables were selected by recursive feature elimination (RFE), a wrapper method, using four algorithms: RF, support vector machine (SVM), k-nearest neighbor (KNN), and logistic regression (LR). Accuracy, kappa, and their standard deviations served as evaluation metrics. The screening process was presented as a line chart and variable importance as a bar chart. In parallel, the Boruta method was used to screen the 22 candidate factors independently.
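The upsampling step described above can be illustrated with a minimal Python sketch. This is not the study's code (which used R and the caret package); the function name `upsample_minority` and the toy data are hypothetical. It simply draws minority-class rows with replacement until every class matches the size of the largest class.

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Randomly resample smaller classes (with replacement) up to the largest class size.

    Hypothetical helper for illustration only; not the study's implementation.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw the shortfall with replacement (nothing is drawn for the largest class)
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data: four non-recurrence cases (0) and two recurrence cases (1)
rows = [[0.1], [0.4], [0.5], [0.9], [1.2], [2.0]]
labels = [0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = upsample_minority(rows, labels, seed=42)
```

In a cross-validated workflow, resampling of this kind is normally applied only to the training portion of each fold, so that duplicated cases do not leak into the held-out data.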
A total of 15 ML methods were used for model development: RF, extreme gradient boosting (XGBoost), SVM, LR, KNN, partial least squares discriminant analysis (PLS-DA), gradient boosting machine (GBM), neural network (NNET), naive Bayes (NB), linear discriminant analysis (LDA), lasso regression (LASSO), adaptive boosting (AdaBoost.M1), decision tree (DT), categorical boosting (CatBoost), and light gradient boosting machine (LightGBM). The first 13 methods were implemented in the caret framework and tuned with five repeats of 10-fold cross-validation to determine the optimal hyperparameters. For CatBoost, the maximum number of iterations was set to 500, the learning rate to 0.05, and the maximum tree depth to 6. For LightGBM, the maximum number of leaf nodes was set to 31, the maximum depth to 6, and the learning rate to 0.05, with 80% of the features randomly sampled to improve generalization. All models were evaluated and externally validated using the confusion matrix, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score, Youden index, AUC, decision curve analysis (DCA), and root mean square residual (RMSR).
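Most of the threshold-based metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is zero
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    ppv = ratio(tp, tp + fp)           # positive predictive value (precision)
    npv = ratio(tn, tn + fn)           # negative predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * tp, 2 * tp + fp + fn)     # harmonic mean of PPV and sensitivity
    youden = sensitivity + specificity - 1   # Youden index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "youden": youden,
    }


# Illustrative counts only (not the study's results)
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```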
Among the four resampling techniques, upsampling performed best, with both AUC and specificity approaching 1 and sensitivity reaching 0.911. Consistent screening across the five feature selection methods identified six common influencing factors. In the external validation set, the CatBoost model achieved an AUC of 0.899 (95% CI: 0.835-0.963), an accuracy of 0.851, and an F1 score of 0.870, outperforming the other 14 comparison models in all metrics. Although its PPV was 0.770, its NPV approached 1, significantly outperforming the remaining 13 models. Furthermore, DCA showed that the model achieved the highest clinical net benefit (0.495), equivalent to that of the full-intervention strategy, at a threshold probability of 0.61. The RMSR was lower than the median residual, further supporting the model's robustness in predicting recurrence risk after FST for early-stage cervical cancer in the external validation cohort.
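For readers unfamiliar with DCA, the net benefit at a threshold probability p_t is conventionally defined as TP/n - (FP/n) × p_t/(1 - p_t), and the full-intervention (treat-all) strategy is the special case in which every patient is classified as positive. The following sketch applies that standard definition to toy data; the function names and inputs are hypothetical, and it does not reproduce the study's figures.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions at a given threshold probability."""
    n = len(y_true)
    # Patients flagged as high risk are those with predicted probability >= threshold
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the full-intervention strategy (everyone treated as positive)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes (1 = recurrence) and predicted probabilities; illustrative only
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.7, 0.2, 0.1, 0.65, 0.3, 0.4]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the model's net benefit against the treat-all and treat-none (net benefit 0) strategies.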
In conclusion, we have developed an interpretable ML-based predictive model that provides individualized predictions of recurrence risk after FST for early-stage cervical cancer from patients' clinical data. The model offers good transparency and clinical practicability and is available as an online calculator at https://cnsdqlcsep.shinyapps.io/app3. In future work, it will be further verified in prospective studies, and additional multi-center samples will be included to continue improving its robustness and generalization ability, providing strong support for clinicians in formulating individualized treatment plans.
