Urethra contouring on computed tomography urethrogram versus magnetic resonance imaging for stereotactic body radiotherapy in prostate cancer

Highlights
• Urethra is increasingly recognized as an organ-at-risk for prostate SABR.
• Accurate urethra contouring is critical to reduce GU toxicities.
• We compared urethra contouring on CT-urethrogram and T2-weighted MRI.
• There is better agreement and less variability in urethra contouring on CT-urethrogram.


Introduction
There is increasing recognition that the urethra is an important organ at risk in stereotactic body radiotherapy (SBRT) for prostate cancer [1], and that increased radiation dose to the urethra is associated with a higher risk of genitourinary toxicities [2]. This is particularly important if a simultaneous boost to the dominant intra-prostatic lesion (DIL) is given, as the urethra could be receiving escalated doses [3]. As such, several prostate SBRT protocols have called for urethra-sparing measures [4][5][6][7][8], with suggested urethra dose constraints for different prostate SBRT fractionations [9,10].
However, this is contingent on accurate urethra contouring. Currently there is no consensus guideline for urethra contouring [11]. Visualisation of the prostatic urethra on computed tomography (CT) is poor, making delineation challenging and fraught with uncertainty. A Foley catheter can be inserted to help localize and visualize the urethra on CT [4]. However, this can deform the urethra, and the catheter insertion needs to be repeated at each SBRT fraction, which can be invasive and introduce the risk of infection [12,13]. An earlier study investigated the use of a nickel-titanium stent that is left in situ in the prostate for the course of radiotherapy and can be used to guide urethra contouring [14]; however, this is an invasive procedure and there is a risk of stent dislocation [15]. An alternative option is to perform a urethrogram at the time of planning CT [6], which relies on radiopaque contrast to enhance the appearance of the urethra on CT. This strategy has been used in our institutional prospective trials, including the 5STAR [16] and 2SMART [17] trials. Magnetic resonance imaging (MRI), which provides superior soft tissue contrast, may improve accuracy in urethra contouring, as the urethra appears hyperintense on T2-weighted MRI [11,18]. In this study, we aim to evaluate the inter-observer variation and agreement in urethra contouring based on CT-urethrogram and T2-weighted MRI.

Study cohort
This is a retrospective study using data from ten patients enrolled in the 2SMART trial (NCT03588819), a phase 2 prospective trial of two-fraction prostate SBRT with focal boost to the MRI-defined DIL [17]. All patients in the 2SMART trial had a retrograde urethrogram done at the time of planning CT. The patient was positioned supine with the rectal immobilisation device (GU-Lok [19]) inserted. The penile tip was cleaned and approximately 2 cc of iodinated contrast with 10 cc of lidocaine jelly was slowly injected into the penile urethra. A penile clamp was applied to reduce the leakage of contrast. A planning CT scan with 2 mm slice thickness was performed. A separate multi-parametric MRI (mpMRI, which included T1- and T2-weighted images, diffusion-weighted images, and dynamic contrast enhancement) was obtained, without GU-Lok or urethrogram. The CT images were fused with mpMRI for target volume delineation.

Urethra contouring
For each of the ten cases, five genitourinary radiation oncologists ('observers'), with a median of 14 years in practice (range: 2-23 years), independently contoured the prostatic urethra in MIM 7.2.8 (MIM Software Inc) on the anonymized CT dataset with urethrogram and on the MRI datasets (including T2-weighted axial, sagittal, and coronal). As per the 2SMART trial protocol, the urethra was contoured with a minimum 6 mm pearl. For each case, a consensus contour was generated using the Simultaneous Truth and Performance Level Estimation (STAPLE) function within the Computational Environment for Radiotherapy Research (CERR) in MATLAB version 2019b (MathWorks, Natick, MA) [20] to provide a probabilistic estimate of the 'reference contour' representing the 'true' urethra anatomy. A STAPLE (i.e., reference) urethra contour was generated separately from the contours done on CT-urethrogram and from those done on MRI for each case.
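To illustrate how a STAPLE-style consensus can be derived from several observers' contours, the following is a minimal sketch of the binary expectation-maximisation scheme, assuming each contour has been rasterised to a binary mask on a common grid. It is not the CERR implementation used in this study; the function name `staple_binary`, the initial sensitivity and specificity of 0.9, the global prior, and the toy masks are all assumptions made for illustration.

```python
import numpy as np


def staple_binary(masks, n_iter=100, tol=1e-8):
    """Simplified binary STAPLE (illustrative sketch, not CERR's code).

    masks: list of same-shaped binary arrays, one per observer.
    Returns the per-voxel probability of belonging to the 'true'
    structure, plus each observer's estimated sensitivity and specificity.
    """
    shape = np.asarray(masks[0]).shape
    # Decision matrix D: one row per voxel, one column per observer.
    D = np.stack([np.asarray(m, dtype=bool).ravel() for m in masks], axis=1)
    D = D.astype(float)
    prior = D.mean()                      # assumed global foreground prior
    p = np.full(D.shape[1], 0.9)          # assumed initial sensitivities
    q = np.full(D.shape[1], 0.9)          # assumed initial specificities
    eps = 1e-6
    for _ in range(n_iter):
        # E-step: posterior probability that each voxel is truly foreground.
        a = prior * np.prod(np.where(D == 1, p, 1 - p), axis=1)
        b = (1 - prior) * np.prod(np.where(D == 1, 1 - q, q), axis=1)
        W = a / (a + b + 1e-300)
        # M-step: re-estimate each observer's performance parameters.
        p_new = (W[:, None] * D).sum(axis=0) / (W.sum() + 1e-300)
        q_new = ((1 - W)[:, None] * (1 - D)).sum(axis=0) / ((1 - W).sum() + 1e-300)
        p_new = np.clip(p_new, eps, 1 - eps)
        q_new = np.clip(q_new, eps, 1 - eps)
        change = max(np.abs(p_new - p).max(), np.abs(q_new - q).max())
        p, q = p_new, q_new
        if change < tol:
            break
    return W.reshape(shape), p, q


# Toy example: four observers agree, a fifth is shifted by one voxel.
truth = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0], dtype=bool)
shifted = np.roll(truth, 1)
prob, sens, spec = staple_binary([truth, truth, truth, truth, shifted])
consensus = prob >= 0.5   # threshold the probability map to get a contour
```

In this toy case the thresholded consensus follows the four agreeing observers, and the shifted observer receives a lower estimated sensitivity, which is the mechanism by which outlying contours are down-weighted.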

Metrics evaluation
Each observer's contours were compared to the STAPLE contours to assess interobserver variability, separately for the CT-urethrogram dataset and the MRI dataset. In addition, the STAPLE contours on CT-urethrogram vs MRI for each case were compared to assess the variation and/or agreement between the two imaging modalities. Two overlap metrics, Dice Similarity Coefficient (DSC) and Jaccard Index (JI), and two distance metrics, the Hausdorff distance (HD) and mean distance to agreement (MDA), were computed for each comparison.
• Dice Similarity Coefficient (DSC) is a widely used metric to evaluate spatial overlap between two contours in radiation oncology settings, by comparing twice the volume of their intersection with the sum of their volumes [21]. The dice similarity coefficient ranges from 0 (no overlap) to 1 (complete overlap). A dice similarity coefficient of > 0.70 has been reported as demonstrating 'good' spatial and volumetric similarity [21].
• Jaccard index provides a measure of overlap between two contours by dividing their intersection by their union (i.e., percent overlap), with a value of 0 indicating completely separate contours and 1 indicating identical contours. The dice similarity coefficient emphasizes the intersection of contours, while the Jaccard index penalizes contour differences more heavily.
• Hausdorff distance is a measure of the greatest distance from a point on one contour to the closest point on another contour and is an indication of how far apart the two contours are (or rather, the maximum discrepancy between contours). A higher value represents more variability in contouring. Since Hausdorff distance gives the maximum distance between contours, it is sensitive to outliers.
• Mean distance to agreement is a measurement of the average overall distance between the contours, with a lower value representing better agreement in contouring. Of note, small mean distance to agreement values obtained from averaging random errors may not be distinguishable from small systematic differences in the overall volume and position.
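The four metrics above can be sketched in a few lines of NumPy. This is an illustrative implementation on binary masks, not the software used in this study; the function names, the voxel-based surface extraction, the symmetric averaging used for the mean distance to agreement (definitions vary between tools), and the toy 10 × 10 masks are all assumptions.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())


def jaccard(a, b):
    """Jaccard index: |A and B| / |A or B|."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch the background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        # A voxel is interior only if both neighbours along every axis are set.
        interior &= np.roll(padded, 1, axis=axis) & np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    surface = mask & ~interior[core]
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def nearest_distances(src, dst):
    """Distance from every point in src to its closest point in dst."""
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def hausdorff_and_mda(a, b, spacing):
    """Symmetric Hausdorff distance and mean distance to agreement."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    d_ab, d_ba = nearest_distances(pa, pb), nearest_distances(pb, pa)
    hd = max(d_ab.max(), d_ba.max())           # worst-case discrepancy
    mda = np.concatenate([d_ab, d_ba]).mean()  # average discrepancy
    return hd, mda


# Toy example: a 4 x 4 'observer' square and a 'reference' shifted by one voxel.
observer = np.zeros((10, 10), dtype=bool)
observer[2:6, 2:6] = True
reference = np.zeros((10, 10), dtype=bool)
reference[2:6, 3:7] = True

dsc = dice(observer, reference)
ji = jaccard(observer, reference)
hd, mda = hausdorff_and_mda(observer, reference, spacing=(1.0, 1.0))
```

For this toy pair the overlap is 12 of 16 voxels per contour, giving a DSC of 0.75 and a Jaccard index of 0.60, which shows how the Jaccard index yields the lower value for the same disagreement. The pairwise-distance approach is only practical for small structures such as the urethra; larger volumes would need a distance transform or a spatial index.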

Results
Fig. 1 shows examples of individual observers' urethra contours (in blue) and the STAPLE contour (in red) on the CT-urethrogram dataset (Fig. 1a) and the MRI dataset (Fig. 1b).
When comparing the five observers' urethra contours against the STAPLE contours for the ten cases, the mean and median dice similarity coefficient were 0.81 (SD = 0.03) and 0.82 (range = 0.76-0.86) in the CT-urethrogram dataset and 0.62 (SD = 0.06) and 0.61 (range = 0.52-0.73) in the MRI dataset (Table 1), indicating better agreement of the urethra contours using the CT-urethrogram dataset. The mean and median Jaccard index were 0.66 (SD = 0.04) and 0.67 (range = 0.58-0.70) in the CT-urethrogram dataset and 0.46 (SD = 0.06) and 0.45 (range = 0.38-0.45) in the MRI dataset, again indicating better agreement in urethra contouring using the CT-urethrogram dataset. The mean and median Hausdorff distance were 2.85 mm (SD = 0.60) and 2.75 mm (range = 2.08-4.21) in the CT-urethrogram dataset and 4.83 mm (SD = 0.67) and 4.71 mm (range = 3.76-5.69) in the MRI dataset, indicating that the observers' urethra contours were further from the STAPLE contour in the MRI dataset. Supporting this, the mean and median distance to agreement were 0.49 mm (SD = 0.09) and 0.52 mm (range = 0.37-0.60) in the CT-urethrogram dataset, and 1.06 mm (SD = 0.27) and 1.05 mm (range = 0.70-1.57) in the MRI dataset.
In addition, when comparing the STAPLE contours between CT-urethrogram and MRI, the mean dice similarity coefficient was 0.46 (SD = 0.15), the mean Jaccard index was 0.30 (SD = 0.12), the mean Hausdorff distance was 5.4 mm (SD = 1.62) and the mean distance to agreement was 1.34 mm (SD = 0.50).
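As a reading aid for the paired overlap values, the two overlap metrics are linked by a fixed identity for any single pair of contours A and B. The short derivation below uses the standard set definitions; the symbols D (dice similarity coefficient) and J (Jaccard index) are introduced here for illustration only.

```latex
\begin{align}
D &= \frac{2\,|A \cap B|}{|A| + |B|}, \qquad
J = \frac{|A \cap B|}{|A \cup B|}, \qquad
|A \cup B| = |A| + |B| - |A \cap B| \\
J &= \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
   = \frac{D}{2 - D}
\end{align}
```

For example, D = 0.46 corresponds to J = 0.46 / 1.54 ≈ 0.30, in line with the mean values reported for the comparison between STAPLE contours. Because the mapping is nonlinear, means taken over several cases need not satisfy the identity exactly.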

Discussion
There is increasing recognition of the need to limit radiation dose to the urethra in prostate SBRT, and hence accurate and reproducible urethra contouring is critical. In this study, we compared the agreement and variability in urethra contouring based on CT-urethrogram and T2-weighted MRI. While there have been several studies evaluating urethra contouring on different imaging modalities and MRI sequences [18,22,23], to our knowledge, this is the first study that compared CT-urethrogram with MRI. Despite some degree of variability between the five observers in all ten cases in this contouring study, all evaluated metrics consistently point towards better agreement and less variation in urethra contouring on CT-urethrogram than on T2-weighted MRI.
The findings of our study have important clinical implications. Currently, the prostate SBRT planning process generally requires a planning CT scan, which is then fused with MRI to guide contouring of the prostate. Our findings showed that CT-urethrogram allows excellent urethra visualization with high agreement in urethra contouring (mean dice similarity coefficient of > 0.8). An earlier, similar study by Richardson et al. showed much lower agreement in urethra contouring using CT alone (without urethrogram), with a mean dice similarity coefficient of 0.47 [22]. The use of urethrogram in prostate radiotherapy planning is not new. It was used to identify the inferior border or apex of the prostate in the days of field-based radiotherapy for prostate cancer and is generally well-tolerated [24]. While the urethrogram is less likely to 'deform' the urethra compared to the use of a Foley catheter [12], the extent to which the per-urethral contrast injection may affect the intraprostatic urethra position is unknown and has never been quantified.
While T2-weighted MRI provides better soft tissue contrast and is commonly used to guide prostate contouring, our findings showed that the urethra is in fact better visualized, with less variability in contouring, on CT-urethrogram compared to T2-weighted MRI. Reassuringly, the inter-observer variability in urethra contouring on MRI in our study (mean dice similarity coefficient of 0.61) is not dissimilar to prior studies (mean dice similarity coefficient of 0.61) [22], suggesting our findings are not due to 'poorer' urethra contouring on MRI among the observers. However, as we move towards an MRI-only workflow [25] or treatment on MR-Linac (i.e., Linac with on-board MRI) [26], a planning CT may no longer be performed as part of prostate SBRT planning. In this situation, additional urethra-optimized MRI sequences (in addition to the standard T2-weighted MRI used for prostate contouring) will be required to better visualize the urethra. In the study by Richardson et al., the use of urethra-optimized T2-weighted Sampling Perfection with Application optimized Contrasts using different flip angle Evolution (T2-SPACE) improved agreement in urethra contouring, raising the dice similarity coefficient to 0.78 [22]. Other approaches such as the 3D half Fourier acquisition single-shot turbo spin echo (3D-HASTE) sequence [18], a post-urination MRI sequence [23], and micturating urethrography [27] have also been investigated.
It is important to acknowledge that there is no consensus on the best methodology for contouring comparison [28]. A major strength of our study is the use of a combination of different metrics, including overlap and distance metrics, to assess the variation and agreement in urethra contouring. Clinically important differences in the spatial position of contour boundaries may not be fully captured by indices such as the dice similarity coefficient and Jaccard index. This is especially the case with relatively small volume contours such as the urethra. Also, a mis-identification of position in a portion of the urethra contour may be more meaningful than overall overlap agreement, particularly in the context of dose-painting in the vicinity of the urethra.
There are several limitations in this study. The 'reference' contour was generated using the STAPLE method, taking into account contours by all five observers, including outliers, and this may result in a biased 'reference' contour. In fact, when we compared the STAPLE contours generated on the CT-urethrogram dataset vs the MRI dataset, there was also high variability. However, STAPLE is a well-established and commonly used methodology to generate a 'reference' or 'consensus' contour in other contouring studies [29,30]. A T2-weighted MRI sequence was used in this study, which may not necessarily be optimal for urethra visualisation. However, it is important to note that not every centre has the experience or resources for dedicated urethra-optimised MRI sequences, and the MRI sequence used in this study is reflective of common MRI sequences acquired as part of diagnostic prostate MRI. Also, a relatively small number of cases (n = 10) was included in this study. The findings of our study are reflective of the contouring by the five observers (of varying experience) and may not generalize to other observers. Nonetheless, the consistent findings across all ten patients compared using multiple metrics suggest that increasing the number of cases or observers is unlikely to lead to a change in the findings.

Conclusion
Our study showed that CT-urethrogram allows for better visualization of the urethra, resulting in better agreement and less variability in urethra contouring compared to the standard T2-weighted MRI used in prostate contouring. With the increasing move towards urethra-sparing prostate SBRT, we believe that CT-urethrogram should be considered as part of planning CT in prostate SBRT to guide urethra delineation. At the same time, with the shift towards MRI-only radiotherapy planning workflows, additional MRI protocols and sequences will be required to improve urethra visualisation on MRI for more accurate and consistent delineation of the urethra.

Fig. 1. Sagittal example dataset with observer urethra contours (blue line) and reference contour generated based on STAPLE (red line) on CT-urethrogram (a) and mpMRI (b). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)