
    Construction and decision curve analysis of a random forest prediction model for refractory Mycoplasma pneumoniae pneumonia based on serum inflammatory factors

    • Abstract: Objective To construct a random forest prediction model for refractory Mycoplasma pneumoniae pneumonia (RMPP) based on serum inflammatory factors, and to evaluate the model by decision curve analysis. Methods A total of 990 children with Mycoplasma pneumoniae pneumonia (MPP) who were hospitalized in the Department of Pediatrics of the Maternal and Child Health Hospital of Shijiazhuang from January 2021 to February 2023 were included, and their clinical characteristics and serum inflammatory factor levels were collected. The children were randomly divided at a ratio of 7∶3 into a training set (n=693) and a validation set (n=297) using the sample software package of R 4.1.3. In R 4.1.3, the training set cases were grouped as RMPP or general Mycoplasma pneumoniae pneumonia (GMPP), coded GMPP=0 and RMPP=1. The independent variables in the training set were ranked by feature importance using the random forest algorithm, and variable importance (VIMP) combined with the minimal depth method was used to select the optimal variable combination for constructing the random forest prediction model of RMPP. The model was evaluated on the validation set and by decision curve analysis. Results The optimal variable combination selected by the random forest algorithm for the RMPP prediction model was interleukin (IL)-6, D-dimer (DD), lactate dehydrogenase (LDH), and IL-10. Decision curve analysis showed that clinical intervention for children with MPP may yield the greatest benefit at a threshold probability of 6%. Conclusions The optimal variable combination selected by the random forest algorithm for the RMPP prediction model is IL-6, DD, LDH, and IL-10, and the random forest prediction model built on these indicators shows good predictive performance.
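    The decision curve analysis mentioned in the abstract compares strategies by their net benefit at a chosen threshold probability. The following is a minimal Python sketch of that standard calculation, not the authors' R code: the function names and the toy labels and risks are hypothetical and do not come from the study. It assumes the usual definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability, and uses the 6% threshold reported in the abstract only as an example input.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on every case whose predicted risk >= threshold.

    y_true: 0/1 outcome labels (here 1 = RMPP, 0 = GMPP, as coded in the abstract)
    y_prob: predicted probabilities of the positive class
    threshold: threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among cases flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy 'intervene on everyone'."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data for illustration only (not study data).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.80, 0.10, 0.03, 0.55, 0.20, 0.02, 0.04, 0.30, 0.05, 0.01]

nb_model = net_benefit(y_true, y_prob, 0.06)
nb_all = net_benefit_treat_all(y_true, 0.06)
nb_none = 0.0  # 'intervene on no one' has zero net benefit by definition

print(f"model: {nb_model:.4f}  treat-all: {nb_all:.4f}  treat-none: {nb_none:.4f}")
```

    A full decision curve repeats this calculation over a range of threshold probabilities and plots the model against the treat-all and treat-none reference strategies; the model is useful at thresholds where its net benefit exceeds both.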

       
