    Citation: SHEN Jinhong, XIAO Zhiliang, WANG Xitao, ZHAO Yan. Construction and validation of a predictive model for urinary tract infection stones based on machine learning[J]. Journal of Xuzhou Medical University, 2023, 43(11): 816-821. DOI: 10.3969/j.issn.2096-3882.2023.11.007

    Construction and validation of a predictive model for urinary tract infection stones based on machine learning

      Abstract: Objective To construct a machine learning model that predicts, before surgery, the risk of infection stones in patients with urinary calculi, in order to improve the preoperative management of these patients. Methods Patients who presented to Xuzhou Central Hospital with urinary calculi from August 2018 to March 2023 were selected, and their clinical data were analyzed retrospectively. Using the "caret" R package, the patients were randomly divided into a training set and a test set at a ratio of 3:1. Predictors were screened in the training set by Lasso regression analysis and then fitted with nine machine learning models. The performance of these models was evaluated by the area under the receiver operating characteristic curve (ROC-AUC), the area under the precision-recall curve (PR-AUC), accuracy, precision, F1 score, calibration curves, and clinical decision curves. Results A total of 350 patients were included, of whom 108 had infection stones and 242 had non-infection stones. Lasso regression analysis with 10-fold cross-validation selected 11 clinical variables: urinary pH, blood uric acid, urinary nitrite, age, urinary crystals, lymphocytes, urinary protein, sex, hydronephrosis status, smoking, and urine bacterial culture. Nine machine learning models were built on these variables, and the random forest model performed best (accuracy: 0.83; F1 score: 0.69; PR-AUC: 0.77; precision: 0.77; ROC-AUC: 0.87, 95% confidence interval (CI): 0.78-0.94). The calibration curves further showed that the random forest model was well calibrated, with the smallest Brier score of all models (0.13). The clinical decision curves showed that the random forest model yielded the largest net benefit of all models at threshold probabilities of 0.38-0.71. Conclusions The random forest model is an effective machine learning model for predicting infection stones before surgery, with urinary pH, blood uric acid, and urinary nitrite as its most important predictors.
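
For readers less familiar with the metrics named in the abstract, the following is a minimal Python sketch of three of them: the recall implied by a given precision and F1 score, the Brier score, and the net benefit at a single threshold probability as used in decision curve analysis. It is not the authors' code (the study used R), the function names are hypothetical, and the toy labels and probabilities are invented for illustration only. Because the abstract reports rounded values, the implied recall of roughly 0.63 for the random forest model is an approximation derived here, not a figure reported by the authors.

```python
from typing import Sequence


def recall_from_precision_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for recall R, given precision P."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability pt:
    TP/n - FP/n * pt / (1 - pt), treating p >= pt as a positive call."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


if __name__ == "__main__":
    # Rounded precision and F1 reported in the abstract for the random forest.
    print(round(recall_from_precision_f1(0.77, 0.69), 2))

    # Hypothetical toy data, NOT taken from the study.
    labels = [1, 0, 1, 0, 0, 1, 0, 0]
    probs = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.2]
    print(brier_score(labels, probs))
    print(net_benefit(labels, probs, threshold=0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing each model against the "treat all" and "treat none" strategies; the abstract's 0.38-0.71 range is where the random forest curve lay above the other models.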

       
