Data Preparation
Data Generation
For simplicity of presentation, we considered \(M=2\) competing events, although PyDTS can handle any number of competing events, provided there are enough observed failures of each failure type at each discrete time point.
Here, we used \(d=30\) discrete time points, \(n=50,000\) observations, and a covariate vector \(Z\) with 5 covariates. Failure times of observations were generated based on the model:
\[\lambda_{j}(t|Z) = \frac{\exp(\alpha_{jt}+Z^{T}\beta_{j})}{1+\exp(\alpha_{jt}+Z^{T}\beta_{j})}, \quad j=1,2,\]
with
\(\alpha_{1t} = -1 -0.3 \log(t)\),
\(\alpha_{2t} = -1.75 -0.15\log(t)\), \(t=1,\ldots,d\),
\(\beta_1 = (-\log 0.8, \log 3, \log 3, \log 2.5, \log 2)\),
\(\beta_{2} = (-\log 1, \log 3, \log 4, \log 3, \log 2)\).
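To make these values concrete, the following sketch evaluates the intercepts and coefficient vectors listed above. It assumes a logit-link hazard, \(\lambda_j(t|Z) = \mbox{expit}(\alpha_{jt} + Z^T\beta_j)\). The names `alpha`, `hazard`, `BETA`, and the covariate vector `z0` are illustrative and not part of the PyDTS API.

```python
import math

D = 30  # number of discrete time points, d

# Covariate effects as listed above, one vector per event type.
BETA = {
    1: [-math.log(0.8), math.log(3), math.log(3), math.log(2.5), math.log(2)],
    2: [-math.log(1), math.log(3), math.log(4), math.log(3), math.log(2)],
}


def alpha(j, t):
    """Time-varying intercept alpha_{jt} for event type j at time t >= 1."""
    if j == 1:
        return -1.0 - 0.3 * math.log(t)
    if j == 2:
        return -1.75 - 0.15 * math.log(t)
    raise ValueError("j must be 1 or 2")


def hazard(j, t, z):
    """Assumed logit-link hazard: expit(alpha_{jt} + z . beta_j)."""
    eta = alpha(j, t) + sum(b * x for b, x in zip(BETA[j], z))
    return 1.0 / (1.0 + math.exp(-eta))


# Baseline hazards (all covariates zero) at the first and last time points.
z0 = [0.0] * 5
for j in (1, 2):
    print(j, round(hazard(j, 1, z0), 3), round(hazard(j, D, z0), 3))
```

Because both intercepts decrease in \(\log(t)\), the baseline hazard of each event type declines as \(t\) grows.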
The censoring time of each observation was sampled from a discrete uniform distribution, i.e. \(C_i \sim \mbox{Uniform}\{1,\ldots,d+1\}\).
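The censoring mechanism can be sketched with the standard library alone. The seed and the variable names below are arbitrary choices for illustration, not taken from PyDTS.

```python
import random

D = 30      # number of discrete time points, d
N = 50_000  # number of observations, n

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility

# randint is inclusive on both ends, so values fall in {1, ..., d+1}.
censoring_times = [rng.randint(1, D + 1) for _ in range(N)]

print(min(censoring_times), max(censoring_times))
```

With NumPy, `numpy.random.Generator.integers(1, D + 2, size=N)` should draw from the same support, since its upper bound is exclusive by default.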
Our goal is to estimate \(\{\alpha_{11},\ldots,\alpha_{1d},\beta_1^T,\alpha_{21},\ldots,\alpha_{2d},\beta_2^T\}\) (\(2(d+5)=70\) parameters in total) along with the standard errors of the estimators.
Checking the Data
Both estimation methods require enough observed failures of each failure type, at each discrete time point. Therefore, the first step is to make sure this is in fact the case with the data at hand.
As shown below, in our example, the data comply with this requirement.
Preprocessing suggestions for cases in which the data do not comply with this requirement are shown in the Data Regrouping Example.
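One way to carry out such a check is to tabulate the observed failures by event type and time point and flag sparse cells. The sketch below uses only the standard library. The function names, the coding of 0 as censored, and the threshold `min_events` are illustrative assumptions, since the text does not specify how many failures count as "enough".

```python
from collections import Counter


def events_per_time(times, event_types, n_events, d):
    """Count observed failures for each (event type, time) pair.

    times[i] is the observed discrete time of observation i and
    event_types[i] is its event type, with 0 meaning censored.
    """
    counts = Counter(
        (j, t) for t, j in zip(times, event_types) if j != 0
    )
    # Include zero counts explicitly so that empty cells are visible.
    return {
        (j, t): counts.get((j, t), 0)
        for j in range(1, n_events + 1)
        for t in range(1, d + 1)
    }


def sparse_cells(table, min_events):
    """Return the (event type, time) cells with fewer than min_events failures."""
    return sorted(cell for cell, n in table.items() if n < min_events)


# Toy data: 3 time points, 2 event types, 0 = censored.
times = [1, 1, 2, 2, 3, 3, 1, 2]
event_types = [1, 2, 1, 2, 1, 0, 1, 0]

table = events_per_time(times, event_types, n_events=2, d=3)
print(sparse_cells(table, min_events=1))  # prints [(2, 3)]
```

In the toy data, no failure of type 2 is observed at time 3, so that cell is flagged. For data held in a pandas DataFrame, `pandas.crosstab` applied to the observed-time and event-type columns should give an equivalent table.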