Estimating with DataExpansionFitter

Estimation

In the following, we apply the estimation method of Lee et al. (2018). Note that the dataframe passed to fit() must not contain a column named 'C'; the call below drops it together with 'T'.

from pydts.fitters import DataExpansionFitter
fitter = DataExpansionFitter()
fitter.fit(df=patients_df.drop(['C', 'T'], axis=1))

fitter.print_summary()


Model summary for event: 1
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                    j_1   No. Observations:               536780
Model:                            GLM   Df Residuals:                   536745
Model Family:                Binomial   Df Model:                           34
Link Function:                  Logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -78272.
Date:                Tue, 02 Aug 2022   Deviance:                   1.5654e+05
Time:                        16:47:21   Pearson chi2:                 5.35e+05
No. Iterations:                     7   Pseudo R-squ. (CS):            0.01509
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
C(X)[1]       -0.9459      0.033    -28.924      0.000      -1.010      -0.882
C(X)[2]       -1.1780      0.035    -33.675      0.000      -1.247      -1.109
C(X)[3]       -1.3158      0.037    -35.614      0.000      -1.388      -1.243
C(X)[4]       -1.3671      0.039    -35.452      0.000      -1.443      -1.291
C(X)[5]       -1.4895      0.041    -36.429      0.000      -1.570      -1.409
C(X)[6]       -1.4702      0.042    -35.004      0.000      -1.553      -1.388
C(X)[7]       -1.5688      0.044    -35.325      0.000      -1.656      -1.482
C(X)[8]       -1.5724      0.046    -34.301      0.000      -1.662      -1.483
C(X)[9]       -1.6733      0.049    -34.334      0.000      -1.769      -1.578
C(X)[10]      -1.6693      0.050    -33.240      0.000      -1.768      -1.571
C(X)[11]      -1.6748      0.052    -32.246      0.000      -1.777      -1.573
C(X)[12]      -1.6825      0.054    -31.287      0.000      -1.788      -1.577
C(X)[13]      -1.8026      0.058    -31.121      0.000      -1.916      -1.689
C(X)[14]      -1.7319      0.058    -29.610      0.000      -1.847      -1.617
C(X)[15]      -1.8695      0.064    -29.319      0.000      -1.994      -1.745
C(X)[16]      -1.7987      0.064    -27.960      0.000      -1.925      -1.673
C(X)[17]      -1.8400      0.068    -27.122      0.000      -1.973      -1.707
C(X)[18]      -1.9016      0.072    -26.333      0.000      -2.043      -1.760
C(X)[19]      -1.7936      0.072    -24.918      0.000      -1.935      -1.653
C(X)[20]      -1.8749      0.077    -24.232      0.000      -2.027      -1.723
C(X)[21]      -1.9294      0.082    -23.424      0.000      -2.091      -1.768
C(X)[22]      -1.8858      0.084    -22.362      0.000      -2.051      -1.721
C(X)[23]      -1.7888      0.085    -21.123      0.000      -1.955      -1.623
C(X)[24]      -2.0205      0.098    -20.568      0.000      -2.213      -1.828
C(X)[25]      -1.9474      0.100    -19.500      0.000      -2.143      -1.752
C(X)[26]      -1.8743      0.102    -18.373      0.000      -2.074      -1.674
C(X)[27]      -1.9588      0.112    -17.518      0.000      -2.178      -1.740
C(X)[28]      -2.0736      0.125    -16.608      0.000      -2.318      -1.829
C(X)[29]      -1.9838      0.128    -15.552      0.000      -2.234      -1.734
C(X)[30]      -2.1912      0.151    -14.550      0.000      -2.486      -1.896
Z1             0.1930      0.026      7.495      0.000       0.143       0.244
Z2            -1.1306      0.026    -42.971      0.000      -1.182      -1.079
Z3            -1.1237      0.026    -42.515      0.000      -1.176      -1.072
Z4            -0.8986      0.026    -34.377      0.000      -0.950      -0.847
Z5            -0.6720      0.026    -25.869      0.000      -0.723      -0.621
==============================================================================


Model summary for event: 2
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                    j_2   No. Observations:               536780
Model:                            GLM   Df Residuals:                   536745
Model Family:                Binomial   Df Model:                           34
Link Function:                  Logit   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -41269.
Date:                Tue, 02 Aug 2022   Deviance:                       82537.
Time:                        16:47:22   Pearson chi2:                 5.39e+05
No. Iterations:                     8   Pseudo R-squ. (CS):           0.006763
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
C(X)[1]       -1.7207      0.049    -35.253      0.000      -1.816      -1.625
C(X)[2]       -1.9635      0.053    -36.941      0.000      -2.068      -1.859
C(X)[3]       -1.8726      0.054    -34.671      0.000      -1.978      -1.767
C(X)[4]       -1.9732      0.057    -34.515      0.000      -2.085      -1.861
C(X)[5]       -1.9804      0.059    -33.427      0.000      -2.096      -1.864
C(X)[6]       -2.0393      0.062    -32.819      0.000      -2.161      -1.918
C(X)[7]       -2.0853      0.065    -32.085      0.000      -2.213      -1.958
C(X)[8]       -2.0027      0.066    -30.546      0.000      -2.131      -1.874
C(X)[9]       -2.1411      0.071    -30.347      0.000      -2.279      -2.003
C(X)[10]      -2.1014      0.072    -29.209      0.000      -2.242      -1.960
C(X)[11]      -2.2544      0.078    -28.862      0.000      -2.408      -2.101
C(X)[12]      -2.1354      0.078    -27.505      0.000      -2.288      -1.983
C(X)[13]      -2.1257      0.080    -26.538      0.000      -2.283      -1.969
C(X)[14]      -2.1671      0.084    -25.786      0.000      -2.332      -2.002
C(X)[15]      -2.2224      0.089    -24.964      0.000      -2.397      -2.048
C(X)[16]      -2.1811      0.091    -24.026      0.000      -2.359      -2.003
C(X)[17]      -2.1826      0.094    -23.134      0.000      -2.368      -1.998
C(X)[18]      -2.3342      0.104    -22.438      0.000      -2.538      -2.130
C(X)[19]      -2.1546      0.101    -21.382      0.000      -2.352      -1.957
C(X)[20]      -2.1133      0.103    -20.467      0.000      -2.316      -1.911
C(X)[21]      -2.3724      0.119    -19.867      0.000      -2.606      -2.138
C(X)[22]      -2.2038      0.116    -18.983      0.000      -2.431      -1.976
C(X)[23]      -2.4194      0.133    -18.207      0.000      -2.680      -2.159
C(X)[24]      -2.3982      0.139    -17.275      0.000      -2.670      -2.126
C(X)[25]      -2.3070      0.140    -16.480      0.000      -2.581      -2.033
C(X)[26]      -2.2794      0.146    -15.630      0.000      -2.565      -1.994
C(X)[27]      -2.3684      0.160    -14.774      0.000      -2.683      -2.054
C(X)[28]      -2.3635      0.170    -13.926      0.000      -2.696      -2.031
C(X)[29]      -2.1045      0.161    -13.103      0.000      -2.419      -1.790
C(X)[30]      -2.1030      0.172    -12.215      0.000      -2.440      -1.766
Z1             0.0411      0.038      1.074      0.283      -0.034       0.116
Z2            -1.1128      0.039    -28.419      0.000      -1.190      -1.036
Z3            -1.4255      0.040    -35.870      0.000      -1.503      -1.348
Z4            -1.1106      0.039    -28.398      0.000      -1.187      -1.034
Z5            -0.6620      0.039    -17.135      0.000      -0.738      -0.586
==============================================================================
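
The summaries above report 536,780 observations and a dependent variable named j_1 or j_2, which is consistent with fitting on an expanded dataset that holds one row per subject per discrete time point at risk. The following is a minimal sketch of such an expansion, not the PyDTS implementation. It assumes that X holds the observed discrete time and that J holds the event type, with 0 denoting censoring; the toy dataframe and the helper name expand_discrete_time are hypothetical.

```python
import pandas as pd


def expand_discrete_time(df, event_types, event_col="J", time_col="X"):
    """Return one row per subject per time point t = 1..X (hypothetical helper)."""
    rows = []
    for record in df.to_dict("records"):
        observed_time = int(record[time_col])
        for t in range(1, observed_time + 1):
            row = dict(record)
            row[time_col] = t
            for j in event_types:
                # Indicator is 1 only in the last row of a subject who had event j.
                row[f"j_{j}"] = int(t == observed_time and record[event_col] == j)
            rows.append(row)
    return pd.DataFrame(rows)


# Toy data (assumed): pid 0 has event 1 at t=2, pid 1 is censored at t=3,
# pid 2 has event 2 at t=1.
toy_df = pd.DataFrame({
    "pid": [0, 1, 2],
    "J": [1, 0, 2],
    "X": [2, 3, 1],
    "Z1": [0.5, 0.1, 0.9],
})
expanded_df = expand_discrete_time(toy_df, event_types=[1, 2])
print(expanded_df)
```

Under these assumptions, a binomial GLM with a logit link can be fitted to each indicator column separately, with the time point entering as a categorical term. This matches the C(X)[t] rows in the summaries above.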

Standard Errors

import pandas as pd

def coef_table(event_model):
    # Parse the coefficient table of a fitted event model's summary into a DataFrame.
    csv_rows = event_model.summary().tables[1].as_csv().split('\n')
    table_df = pd.DataFrame([row.split(',') for row in csv_rows])
    table_df.columns = table_df.iloc[0]
    return table_df.iloc[1:].set_index(table_df.columns[0])

# The last five rows hold the covariate coefficients (Z1-Z5).
beta1_summary = coef_table(fitter.event_models[1]).iloc[-5:]
beta2_summary = coef_table(fitter.event_models[2]).iloc[-5:]
beta2_summary
         coef  std err        z  P>|z|  [0.025  0.975]
Z1     0.0411    0.038    1.074  0.283  -0.034   0.116
Z2    -1.1128    0.039  -28.419  0.000  -1.190  -1.036
Z3    -1.4255    0.040  -35.870  0.000  -1.503  -1.348
Z4    -1.1106    0.039  -28.398  0.000  -1.187  -1.034
Z5    -0.6620    0.039  -17.135  0.000  -0.738  -0.586
from pydts.examples_utils.plots import plot_first_model_coefs
plot_first_model_coefs(models=fitter.event_models, times=fitter.times, train_df=patients_df, n_cov=5)
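
To see how the coefficients translate into a hazard, the sketch below applies the inverse of the logit link reported in the model summary. It is an illustration rather than the library's prediction code: the coefficient values are copied from the summary for event 1, and the covariate values are those of the individual with pid 0 as reported in the Prediction section.

```python
import math

# Coefficients copied from the "Model summary for event: 1" table.
alpha_1_t1 = -0.9459                                   # C(X)[1]: time-specific term at t = 1
beta_1 = [0.1930, -1.1306, -1.1237, -0.8986, -0.6720]  # Z1..Z5

# Covariates Z1..Z5 of the individual with pid 0 (from the prediction output).
z = [0.548814, 0.715189, 0.602763, 0.544883, 0.423655]

# Linear predictor on the logit scale, then inverse logit to get the hazard.
linear_predictor = alpha_1_t1 + sum(b * x for b, x in zip(beta_1, z))
hazard_j1_t1 = 1.0 / (1.0 + math.exp(-linear_predictor))
print(round(hazard_j1_t1, 4))
```

The result agrees with the hazard_j1_t1 value listed for ID=0 in the prediction output (0.043097), up to the rounding of the printed coefficients.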

Prediction

Full prediction is provided by the predict_cumulative_incident_function() method.

The input is a pandas.DataFrame containing, for each observation, the covariate columns that were used in the fit() method (Z1-Z5 in our example).

The following columns will be added:

  1. The overall survival at each time point \(t\)
  2. The hazard for each failure type \(j\) at each time point \(t\)
  3. The probability of event type \(j\) at time \(t\)
  4. The Cumulative Incidence Function (CIF) of event type \(j\) at time \(t\)
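
The four quantities are linked, and the relationship can be checked numerically. The sketch below is an illustration and not the library's code. It takes the hazards of ID=0 at the first three time points from the output table in this section and derives the other three quantities, assuming that the overall survival is the running product of one minus the summed hazards.

```python
import numpy as np

# Hazards for ID=0 at t = 1, 2, 3, copied from the output table in this section.
hazards = np.array([
    [0.043097, 0.034478, 0.030175],  # hazard_j1_t1..t3
    [0.014218, 0.011188, 0.012239],  # hazard_j2_t1..t3
])

# Overall survival: probability of no event of any type up to and including t.
overall_survival = np.cumprod(1.0 - hazards.sum(axis=0))

# Survival just before each time point (1 before the first one).
survival_before = np.concatenate(([1.0], overall_survival[:-1]))

# Probability of event j exactly at t, and its running sum over time (the CIF).
prob_at_t = hazards * survival_before
cif = np.cumsum(prob_at_t, axis=1)

print(np.round(overall_survival, 4))
print(np.round(cif, 4))
```

Up to rounding, the results reproduce the overall_survival_t1 to overall_survival_t3, cif_j1_at_t1 to cif_j1_at_t3 and cif_j2_at_t1 to cif_j2_at_t3 values shown for ID=0.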

In the following, we provide predictions for the individuals with ID values (pid) 0, 1 and 2. The output is transposed for easier viewing.

pred_df = fitter.predict_cumulative_incident_function(
    patients_df.drop(['J', 'T', 'C', 'X'], axis=1).head(3)).set_index('pid').T
pred_df.index.name = ''
pred_df.columns = ['ID=0', 'ID=1', 'ID=2']
plot_example_pred_output(pred_df)
pred_df
ID=0 ID=1 ID=2
Z1 0.548814 0.645894 0.791725
Z2 0.715189 0.437587 0.528895
Z3 0.602763 0.891773 0.568045
Z4 0.544883 0.963663 0.925597
Z5 0.423655 0.383442 0.071036
overall_survival_t1 0.942684 0.960628 0.932938
overall_survival_t2 0.899636 0.930545 0.883002
overall_survival_t3 0.861480 0.903726 0.839277
overall_survival_t4 0.827201 0.879254 0.800236
overall_survival_t5 0.797018 0.857533 0.766167
overall_survival_t6 0.768048 0.836364 0.733620
overall_survival_t7 0.742313 0.817381 0.704914
overall_survival_t8 0.716876 0.798470 0.676759
overall_survival_t9 0.694881 0.781915 0.652527
overall_survival_t10 0.673241 0.765482 0.628832
overall_survival_t11 0.653276 0.750077 0.607015
overall_survival_t12 0.633323 0.734605 0.585385
overall_survival_t13 0.615405 0.720649 0.566115
overall_survival_t14 0.597401 0.706425 0.546797
overall_survival_t15 0.581732 0.693972 0.530095
overall_survival_t16 0.565528 0.680961 0.512891
overall_survival_t17 0.550207 0.668566 0.496713
overall_survival_t18 0.536576 0.657400 0.482353
overall_survival_t19 0.521450 0.644937 0.466518
overall_survival_t20 0.507307 0.633237 0.451823
overall_survival_t21 0.495118 0.622977 0.439151
overall_survival_t22 0.482190 0.612062 0.425808
overall_survival_t23 0.469586 0.601189 0.412767
overall_survival_t24 0.459059 0.592128 0.401980
overall_survival_t25 0.447933 0.582483 0.390629
overall_survival_t26 0.436435 0.572414 0.378937
overall_survival_t27 0.426143 0.563323 0.368512
overall_survival_t28 0.416810 0.555060 0.359122
overall_survival_t29 0.406209 0.545669 0.348543
overall_survival_t30 0.397051 0.537568 0.339489
hazard_j1_t1 0.043097 0.031017 0.051717
hazard_j1_t10 0.021381 0.015290 0.025774
hazard_j1_t11 0.021267 0.015208 0.025637
hazard_j1_t12 0.021106 0.015092 0.025444
hazard_j1_t13 0.018764 0.013408 0.022632
hazard_j1_t14 0.020110 0.014376 0.024248
hazard_j1_t15 0.017570 0.012551 0.021198
hazard_j1_t16 0.018835 0.013460 0.022717
hazard_j1_t17 0.018086 0.012922 0.021818
hazard_j1_t18 0.017025 0.012160 0.020542
hazard_j1_t19 0.018930 0.013528 0.022831
hazard_j1_t2 0.034478 0.024750 0.041448
hazard_j1_t20 0.017476 0.012484 0.021085
hazard_j1_t21 0.016565 0.011830 0.019989
hazard_j1_t22 0.017291 0.012351 0.020862
hazard_j1_t23 0.019019 0.013592 0.022938
hazard_j1_t24 0.015144 0.010811 0.018280
hazard_j1_t25 0.016275 0.011622 0.019640
hazard_j1_t26 0.017487 0.012491 0.021097
hazard_j1_t27 0.016094 0.011491 0.019422
hazard_j1_t28 0.014374 0.010258 0.017352
hazard_j1_t29 0.015702 0.011211 0.018951
hazard_j1_t3 0.030175 0.021634 0.036308
hazard_j1_t30 0.012799 0.009130 0.015457
hazard_j1_t4 0.028709 0.020575 0.034555
hazard_j1_t5 0.025486 0.018248 0.030696
hazard_j1_t6 0.025969 0.018596 0.031275
hazard_j1_t7 0.023589 0.016880 0.028423
hazard_j1_t8 0.023506 0.016820 0.028323
hazard_j1_t9 0.021298 0.015231 0.025675
hazard_j2_t1 0.014218 0.008355 0.015345
hazard_j2_t10 0.009761 0.005725 0.010538
hazard_j2_t11 0.008387 0.004917 0.009056
hazard_j2_t12 0.009438 0.005535 0.010190
hazard_j2_t13 0.009528 0.005588 0.010287
hazard_j2_t14 0.009146 0.005363 0.009875
hazard_j2_t15 0.008657 0.005076 0.009348
hazard_j2_t16 0.009020 0.005289 0.009739
hazard_j2_t17 0.009006 0.005281 0.009724
hazard_j2_t18 0.007749 0.004542 0.008368
hazard_j2_t19 0.009260 0.005430 0.009998
hazard_j2_t2 0.011188 0.006566 0.012077
hazard_j2_t20 0.009647 0.005658 0.010415
hazard_j2_t21 0.007461 0.004372 0.008057
hazard_j2_t22 0.008819 0.005171 0.009522
hazard_j2_t23 0.007121 0.004172 0.007690
hazard_j2_t24 0.007273 0.004261 0.007853
hazard_j2_t25 0.007961 0.004666 0.008596
hazard_j2_t26 0.008182 0.004796 0.008835
hazard_j2_t27 0.007490 0.004389 0.008088
hazard_j2_t28 0.007527 0.004411 0.008128
hazard_j2_t29 0.009730 0.005707 0.010505
hazard_j2_t3 0.012239 0.007186 0.013211
hazard_j2_t30 0.009745 0.005716 0.010521
hazard_j2_t4 0.011081 0.006503 0.011962
hazard_j2_t5 0.011003 0.006457 0.011878
hazard_j2_t6 0.010379 0.006089 0.011205
hazard_j2_t7 0.009917 0.005817 0.010707
hazard_j2_t8 0.010762 0.006315 0.011618
hazard_j2_t9 0.009384 0.005504 0.010132
prob_j1_at_t1 0.043097 0.031017 0.051717
prob_j1_at_t2 0.032501 0.023776 0.038668
prob_j1_at_t3 0.027146 0.020132 0.032060
prob_j1_at_t4 0.024733 0.018594 0.029001
prob_j1_at_t5 0.021082 0.016044 0.024564
prob_j1_at_t6 0.020698 0.015947 0.023962
prob_j1_at_t7 0.018118 0.014118 0.020852
prob_j1_at_t8 0.017449 0.013749 0.019965
prob_j1_at_t9 0.015268 0.012161 0.017376
prob_j1_at_t10 0.014857 0.011956 0.016819
prob_j1_at_t11 0.014318 0.011641 0.016121
prob_j1_at_t12 0.013788 0.011321 0.015445
prob_j1_at_t13 0.011884 0.009850 0.013248
prob_j1_at_t14 0.012376 0.010360 0.013727
prob_j1_at_t15 0.010497 0.008867 0.011591
prob_j1_at_t16 0.010957 0.009341 0.012042
prob_j1_at_t17 0.010228 0.008799 0.011190
prob_j1_at_t18 0.009367 0.008130 0.010204
prob_j1_at_t19 0.010157 0.008893 0.011013
prob_j1_at_t20 0.009113 0.008051 0.009836
prob_j1_at_t21 0.008404 0.007491 0.009032
prob_j1_at_t22 0.008561 0.007694 0.009162
prob_j1_at_t23 0.009171 0.008319 0.009767
prob_j1_at_t24 0.007112 0.006499 0.007545
prob_j1_at_t25 0.007471 0.006881 0.007895
prob_j1_at_t26 0.007833 0.007276 0.008241
prob_j1_at_t27 0.007024 0.006578 0.007360
prob_j1_at_t28 0.006125 0.005779 0.006395
prob_j1_at_t29 0.006545 0.006223 0.006806
prob_j1_at_t30 0.005199 0.004982 0.005387
prob_j2_at_t1 0.014218 0.008355 0.015345
prob_j2_at_t2 0.010546 0.006308 0.011267
prob_j2_at_t3 0.011010 0.006687 0.011665
prob_j2_at_t4 0.009546 0.005877 0.010040
prob_j2_at_t5 0.009101 0.005677 0.009505
prob_j2_at_t6 0.008272 0.005222 0.008585
prob_j2_at_t7 0.007617 0.004865 0.007855
prob_j2_at_t8 0.007989 0.005162 0.008190
prob_j2_at_t9 0.006727 0.004394 0.006857
prob_j2_at_t10 0.006783 0.004477 0.006877
prob_j2_at_t11 0.005647 0.003764 0.005695
prob_j2_at_t12 0.006165 0.004152 0.006185
prob_j2_at_t13 0.006035 0.004105 0.006022
prob_j2_at_t14 0.005628 0.003865 0.005590
prob_j2_at_t15 0.005172 0.003586 0.005111
prob_j2_at_t16 0.005247 0.003670 0.005162
prob_j2_at_t17 0.005093 0.003596 0.004987
prob_j2_at_t18 0.004264 0.003036 0.004156
prob_j2_at_t19 0.004969 0.003570 0.004822
prob_j2_at_t20 0.005030 0.003649 0.004859
prob_j2_at_t21 0.003785 0.002769 0.003640
prob_j2_at_t22 0.004366 0.003221 0.004181
prob_j2_at_t23 0.003434 0.002554 0.003274
prob_j2_at_t24 0.003415 0.002562 0.003242
prob_j2_at_t25 0.003655 0.002763 0.003456
prob_j2_at_t26 0.003665 0.002794 0.003451
prob_j2_at_t27 0.003269 0.002513 0.003065
prob_j2_at_t28 0.003208 0.002485 0.002995
prob_j2_at_t29 0.004056 0.003168 0.003773
prob_j2_at_t30 0.003958 0.003119 0.003667
cif_j1_at_t1 0.043097 0.031017 0.051717
cif_j1_at_t2 0.075599 0.054792 0.090385
cif_j1_at_t3 0.102745 0.074924 0.122445
cif_j1_at_t4 0.127478 0.093518 0.151447
cif_j1_at_t5 0.148560 0.109563 0.176011
cif_j1_at_t6 0.169258 0.125510 0.199972
cif_j1_at_t7 0.187375 0.139628 0.220824
cif_j1_at_t8 0.204824 0.153376 0.240789
cif_j1_at_t9 0.220092 0.165537 0.258165
cif_j1_at_t10 0.234950 0.177493 0.274983
cif_j1_at_t11 0.249267 0.189135 0.291105
cif_j1_at_t12 0.263055 0.200455 0.306550
cif_j1_at_t13 0.274939 0.210305 0.319798
cif_j1_at_t14 0.287314 0.220665 0.333525
cif_j1_at_t15 0.297811 0.229531 0.345116
cif_j1_at_t16 0.308768 0.238872 0.357158
cif_j1_at_t17 0.318996 0.247671 0.368348
cif_j1_at_t18 0.328364 0.255801 0.378552
cif_j1_at_t19 0.338521 0.264694 0.389565
cif_j1_at_t20 0.347634 0.272745 0.399401
cif_j1_at_t21 0.356038 0.280236 0.408432
cif_j1_at_t22 0.364599 0.287931 0.417594
cif_j1_at_t23 0.373770 0.296250 0.427361
cif_j1_at_t24 0.380881 0.302749 0.434907
cif_j1_at_t25 0.388352 0.309630 0.442802
cif_j1_at_t26 0.396185 0.316906 0.451043
cif_j1_at_t27 0.403209 0.323484 0.458403
cif_j1_at_t28 0.409334 0.329263 0.464797
cif_j1_at_t29 0.415879 0.335485 0.471603
cif_j1_at_t30 0.421078 0.340468 0.476990
cif_j2_at_t1 0.014218 0.008355 0.015345
cif_j2_at_t2 0.024765 0.014663 0.026612
cif_j2_at_t3 0.035775 0.021350 0.038278
cif_j2_at_t4 0.045321 0.027227 0.048317
cif_j2_at_t5 0.054422 0.032905 0.057822
cif_j2_at_t6 0.062695 0.038126 0.066407
cif_j2_at_t7 0.070311 0.042992 0.074262
cif_j2_at_t8 0.078300 0.048154 0.082451
cif_j2_at_t9 0.085027 0.052548 0.089308
cif_j2_at_t10 0.091810 0.057025 0.096185
cif_j2_at_t11 0.097457 0.060789 0.101880
cif_j2_at_t12 0.103622 0.064940 0.108065
cif_j2_at_t13 0.109657 0.069046 0.114087
cif_j2_at_t14 0.115285 0.072911 0.119677
cif_j2_at_t15 0.120457 0.076496 0.124789
cif_j2_at_t16 0.125704 0.080167 0.129951
cif_j2_at_t17 0.130797 0.083763 0.134938
cif_j2_at_t18 0.135061 0.086799 0.139095
cif_j2_at_t19 0.140029 0.090369 0.143917
cif_j2_at_t20 0.145060 0.094018 0.148776
cif_j2_at_t21 0.148845 0.096787 0.152416
cif_j2_at_t22 0.153211 0.100008 0.156598
cif_j2_at_t23 0.156645 0.102561 0.159872
cif_j2_at_t24 0.160060 0.105123 0.163114
cif_j2_at_t25 0.163714 0.107886 0.166569
cif_j2_at_t26 0.167379 0.110680 0.170020
cif_j2_at_t27 0.170648 0.113192 0.173085
cif_j2_at_t28 0.173856 0.115677 0.176081
cif_j2_at_t29 0.177912 0.118845 0.179853
cif_j2_at_t30 0.181870 0.121964 0.183520