Evaluation Measures¤

Let

\[ \pi_{ij}(t) = \widehat{\Pr}(T_i=t, J_i=j \mid Z_i) = \widehat{\lambda}_j (t \mid Z_i) \widehat{S}(t-1 \mid Z_i) \]

and

\[ D_{ij} (t) = I(T_i = t, J_i = j) \]

The cause-specific incidence/dynamic area under the receiver operating characteristics curve (AUC) is defined and estimated in the spirit of Heagerty and Zheng (2005) and Blanche et al. (2015) as the probability of a random observation with observed event \(j\) at time \(t\) having a higher risk prediction for cause \(j\) than a randomly selected observation \(m\), at risk at time \(t\), without the observed event \(j\) at time \(t\). Namely,

\[ \mbox{AUC}_j(t) = \Pr (\pi_{ij}(t) > \pi_{mj}(t) \mid D_{ij} (t) = 1, D_{mj} (t) = 0, T_m \geq t) \]

In the presence of censored data and under the assumption that the censoring is independent of the failure time and observed covariates, an inverse probability censoring weighting (IPCW) estimator of \(\mbox{AUC}_j(t)\) becomes

\[ \widehat{\mbox{AUC}}_j (t) = \frac{\sum_{i=1}^{n}\sum_{m=1}^{n} D_{ij}(t)(1-D_{mj}(t))I(X_m \geq t) W_{ij}(t) W_{mj}(t) \{I(\pi_{ij}(t) > \pi_{mj}(t))+0.5I(\pi_{ij}(t)=\pi_{mj}(t))\}}{\sum_{i=1}^{n}\sum_{m=1}^{n} D_{ij}(t)(1-D_{mj}(t))I(X_m \geq t) W_{ij}(t) W_{mj}(t)} \]

And can be simplified as:

\[ \widehat{\mbox{AUC}}_j (t) = \frac{\sum_{i=1}^{n}\sum_{m=1}^{n} D_{ij}(t)(1-D_{mj}(t))I(X_m \geq t) \{I(\pi_{ij}(t) > \pi_{mj}(t))+0.5I(\pi_{ij}(t)=\pi_{mj}(t))\}}{\sum_{i=1}^{n}\sum_{m=1}^{n} D_{ij}(t)(1-D_{mj}(t))I(X_m \geq t)} \]

where

\[ W_{ij}(t) = \frac{D_{ij}(t)}{\widehat{G}_C(T_i)} + I(X_i \geq t)\frac{1-D_{ij}(t)}{\widehat{G}_C(t)} = \frac{D_{ij}(t)}{\widehat{G}_C(t)} + I(X_i \geq t)\frac{1-D_{ij}(t)}{\widehat{G}_C(t)} = I(X_i \geq t) / \widehat{G}_C(t) \]

and \(\widehat{G}_C(\cdot)\) is the estimated survival function of the censoring (e.g., the Kaplan-Meier estimator). Interestingly, the IPCWs have no effect on \(\widehat{\mbox{AUC}}_j (t)\).

An integrated cause-specific AUC can be estimated as a weighted sum by

\[ \widehat{\mbox{AUC}}_j = \sum_t \widehat{\mbox{AUC}}_j (t) w_j (t) \]

and we adopt a simple data-driven weight function of the form

\[ w_j(t) = \frac{N_j(t)}{\sum_t N_j(t)} \]

A global AUC can be defined as

\[ \widehat{\mbox{AUC}} = \sum_j \widehat{\mbox{AUC}}_j v_j \]

where

\[ v_j = \frac{\sum_{t} N_j(t)}{ \sum_{j=1}^M \sum_{t} N_j(t) } \]

Another well-known performance measure is the Brier Score (BS). In the spirit of Blanche et al. (2015) we define

\[ \widehat{\mbox{BS}}_{j}(t) = \frac{1}{Y_{\cdot}(t)} {\sum_{i=1}^n W_{ij}(t) \left( D_{ij}(t) - \pi_{ij}(t)\right)^2} \, . \]

An integrated cause-specific BS can be estimated by the weighted sum

\[ \widehat{\mbox{BS}}_{j} = \sum_t \widehat{\mbox{BS}}_{j}(t) w_j(t) \]

and an estimated global BS is given by

\[ \widehat{\mbox{BS}} = \sum_j \widehat{\mbox{BS}}_{j} v_j \, . \]