Inter-clinician delineation variation for a new highly-conformal flank target volume in children with renal tumors: A SIOP-Renal Tumor Study Group international multicenter exercise
Princess Máxima Center for Pediatric Oncology, Utrecht, the Netherlands
Department of Radiation Oncology, University Medical Center Utrecht, Utrecht, the Netherlands
• Recently, highly-conformal target volumes for flank delineation were defined.
• Delineation variation of this target volume was assessed in an international setting.
• Ten radiation oncologists delineated the GTV and CTV of six individual cases.
• ‘Unacceptable’ delineation variation was found in a large number of participants.
• This indicates the need for central target volume review before radiotherapy onset.
Abstract
Background and purpose
Recently, the SIOP-RTSG developed a highly-conformal flank target volume definition for children with renal tumors. The aims of this study were to evaluate the inter-clinician delineation variation of this new target volume definition in an international multicenter setting and to explore the necessity of quality assurance.
Materials and methods
Six pediatric renal cancer cases were transferred to ten radiation oncologists from seven European countries (‘participants’). These participants delineated the pre- and postoperative Gross Tumor Volume (GTVpre/post), and Clinical Target Volume (CTV) during two test phases (case 1–2 and 3–4), followed by guideline refinement and a quality assurance phase (case 5–6). Reference target volumes (TVref) were established by three experienced radiation oncologists. The Dice Similarity Coefficient between the reference and participants (DSCref/part) was calculated per case. Delineations of case 5–6 were graded by four independent reviewers as ‘per protocol’ (0–4 mm), ‘minor deviation’ (5–9 mm) or ‘major deviation’ (≥10 mm) from the delineation guideline using 18 standardized criteria. Also, a major deviation resulting in underestimation of the CTVref was regarded as an unacceptable variation.
Results
A total of 57/60 delineation sets were completed. The median DSCref/part for the CTV was 0.55 without improvement after sequential cases (case 3–4 vs. case 5–6: p = 0.15). For case 5–6, a major deviation was found for 5/18, 12/17, 18/18 and 4/9 collected delineations of the GTVpre, GTVpost, CTV-T and CTV-N, respectively. An unacceptable variation from the CTVref was found for 7/9 participants for case 5 and 6/9 participants for case 6.
Conclusion
This international multicenter delineation exercise demonstrates that the new consensus for highly-conformal postoperative flank target volume delineation leads to geometrical variation among participants. Moreover, standardized review showed an unacceptable delineation variation in the majority of the participants. These findings strongly suggest the need for additional training and centralized pre-treatment review when this target volume delineation approach is implemented on a larger scale.
1. Introduction
Most children with renal tumors who are treated according to the Renal Tumor Study Group (RTSG) protocols of the International Society for Pediatric Oncology (SIOP) receive upfront chemotherapy followed by nephrectomy. Data from the recent SIOP-2001 trial show that 20–25% of these patients require postoperative flank irradiation as part of their first-line treatment [Omission of doxorubicin from the treatment of stage II-III, intermediate-risk Wilms’ tumour (SIOP WT 2001): An open-label, non-inferiority, randomised controlled trial]. For flank irradiation, two conventional opposing Anterior-Posterior/Posterior-Anterior (AP/PA) photon beams have been considered the gold standard since the SIOP-1 trial (1971–1974). However, renal tumors arise from the retroperitoneal area and displace the organs anterior to the tumor. During surgery, the tumor is removed with limited risk of (intraperitoneal) tumor spill or macroscopic residual disease, and the surrounding organs shift into the surgical cavity [Immediate nephrectomy versus preoperative chemotherapy in the management of non-metastatic Wilms’ tumour: results of a randomised trial (UKW3) by the UK Children’s Cancer Study Group]. Consequently, the volume irradiated by AP/PA photon beams includes a large amount of normal tissue.
Nowadays, advanced Image-Guided Radiotherapy (IGRT) techniques allow us to treat complex target volumes with high conformity. To exploit these favorable dose-volume characteristics, radiation oncologists affiliated with the SIOP-RTSG developed a consensus statement on highly-conformal flank target volume delineation for pediatric renal tumors. As a result, the risk of inter-clinician variation is more substantial: underestimation of the target volume may increase locoregional failures, whereas overestimation of the target volume will limit the ability of modern IGRT techniques to spare healthy tissue. To assess the locoregional control of new flank target volumes combined with highly-conformal radiotherapy (RT) techniques, the SIOP-RTSG intends to launch a prospective multicenter study. It is expected that during this study, prospective RT quality assurance by centralized review of target volumes and dosimetry will be compulsory to tackle the issue of inter-clinician variation, given earlier experiences with conventional flank delineation and in line with other recently launched pediatric cancer trials [Radiotherapy practice for paediatric brain tumours across Europe and quality assurance initiatives: current situation, international survey and future perspectives]. However, the estimated inter-clinician delineation variation and, subsequently, the need for centralized review of the new flank target volume have not been determined. Therefore, the development of the consensus on highly-conformal flank delineation was accompanied by a multicenter delineation exercise, during which the consensus guideline was continuously optimized based on the experiences of each delineation phase.
The aim of this study was to evaluate the inter-clinician variation of the new highly-conformal flank target volume delineation approach in an international multicenter setting using geometrical analyses and reviewing criteria in order to explore the necessity of centralized pre-treatment quality assurance.
2. Materials and methods
This exercise was reported according to the Guidelines for Reporting Reliability and Agreement Studies.
Six unique cases with a pediatric renal tumor eligible for flank irradiation based on the criteria defined in the SIOP-RTSG UMBRELLA 2016 protocol were selected for this delineation exercise (institutional review board approval number: 17-729/C). For each case, after preoperative chemotherapy, T1-weighted Magnetic Resonance Imaging (MRI) scans (Achieva 1.5T, Philips Medical Systems, Best, The Netherlands; slice thickness: 1.5 mm) with and without gadolinium contrast agent were acquired, together with a postoperative planning Computed Tomography (CT) scan in RT treatment position (Brilliance, Philips Medical Systems, Best, The Netherlands; slice thickness: 2.0 mm). Essential clinical data to determine the extent of the area at risk were extracted from the radiology, surgery and pathology reports (Supplementary Table 1). Clinical data and imaging in Digital Imaging and Communications in Medicine (DICOM) format were anonymized and transferred from the coordinating center (University Medical Center Utrecht) to the participating centers using encrypted data exchange.
2.2 Procedure
2.2.1 Preparation phase
Between May 2016 and May 2017, expert pediatric radiation oncologists of the SIOP-RTSG board (‘coordinators’) translated the conventional flank target volumes described in the ongoing UMBRELLA SIOP-RTSG-2016 protocol into a ‘preliminary’ highly-conformal flank delineation guideline during three live meetings. Afterwards, radiation oncologists (‘participants’) from ten different centers in seven countries across Europe were invited to participate in a delineation exercise. Participants were asked to delineate the pre- and postoperative Gross Tumor Volume (GTVpre/post), as well as the Clinical Target Volume of the tumor bed and involved lymph node area when indicated (CTV-T and CTV-N, respectively) for all preselected cases using treatment contouring systems available at their institute. For each case, the contralateral kidney, spleen, liver, heart, lungs and vertebrae were delineated by a coordinating pediatric radiation oncologist (GJ) in order to reduce the total delineation time for the participants. The pancreas and intestine were delineated by the participants, since they are closely related to the construction of the target volumes. The delineation exercise was divided into three phases: two test phases and a quality assurance phase (Fig. 1).
Fig. 1. Flowchart depicting the procedure of the delineation exercise. Abbreviations: SIOP-RTSG, International Society for Pediatric Oncology – Renal Tumor Study Group; GTVpre/post, pre- and postoperative Gross Tumor Volume, respectively; CTV, Clinical Target Volume; DSC, Dice Similarity Coefficient.
2.2.2 Test phases
During the first test phase (January 2018–April 2018), participants delineated the target volumes of case 1 and 2. The preoperative and postoperative scans of these cases had been co-registered in advance at the coordinating center. However, after all delineations were collected by the coordinating center, it was revealed that the rigid co-registration had been overruled by the delineation software at the participants’ departments. For this reason, detailed instructions on co-registration were added to the ‘preliminary’ delineation guideline. Hence, in the second test phase (May 2018–May 2019), participants performed the co-registration themselves and delineated the target volumes of case 3 and 4. At the end of each test phase, a video meeting was organized between participants and coordinators to discuss inconsistencies between participants and to evaluate the need for refinement of the ‘preliminary’ delineation guideline.
2.2.3 Quality assurance phase
At the beginning of the quality assurance phase (April 2019–July 2019), the ‘preliminary’ delineation guideline was refined by adding new recommendations and detailed illustrations of the delineation approach (Supplementary Table 2). In this phase, participants performed co-registration and delineated case 5 and 6 using the refined delineation guideline. The purpose of the quality assurance phase was to determine the inter-clinician variation using a standardized procedure to review the target volumes in addition to the geometrical analysis of the volumes.
2.3 Analysis
2.3.1 Geometrical data analysis
Data analysis was limited to cases 3–6 due to the co-registration mismatch in case 1 and 2. Before each phase, a reference target volume (TVref), consisting of the GTVpre, GTVpost and CTV, was established for each case by one of the coordinators (GJ), and subsequently validated by the other coordinators (PM, CR). The TVref was based on the ‘preliminary’ delineation guideline for case 3 and 4 and on the ‘refined’ delineation guideline for case 5 and 6. Afterwards, the volume of contours, the Dice Similarity Coefficient (DSC) and the percentage of the TVref not delineated by participants were calculated using an in-house developed software tool.
The volumes of the contours (in mL) were calculated per participant, per case and per target volume. The DSC was used to determine the variation between two volumes and was calculated as twice the volume of the intersection (TVintersect) divided by the sum of the two target volumes (TV1, TV2). The DSC ranges from 0 (no overlap between volumes) to 1 (perfect agreement between volumes).
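The DSC definition above can be illustrated with a minimal sketch. It assumes that each target volume is represented as a set of voxel indices on a common grid; the function name `dice_similarity` and the toy volumes are hypothetical and are not taken from the in-house software tool used in the study.

```python
def dice_similarity(tv1, tv2):
    """Dice Similarity Coefficient: 2 * |TV1 ∩ TV2| / (|TV1| + |TV2|).

    Both volumes are given as collections of voxel indices (an assumption
    made for this illustration only).
    """
    tv1, tv2 = set(tv1), set(tv2)
    total = len(tv1) + len(tv2)
    if total == 0:
        raise ValueError("DSC is undefined when both volumes are empty")
    return 2 * len(tv1 & tv2) / total


# Toy example: two 16-voxel volumes on one slice that share 8 voxels.
reference = {(x, y, 0) for x in range(0, 4) for y in range(4)}
participant = {(x, y, 0) for x in range(2, 6) for y in range(4)}

print(dice_similarity(reference, participant))  # 0.5
print(dice_similarity(reference, reference))    # 1.0
```

Because the intersection is counted twice in the numerator, two volumes of equal size that share half of their voxels yield a DSC of 0.5, not 0.25.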
DSCs were calculated in a pairwise fashion between each participant and the reference (DSCref/part), as well as between the participants only (DSCpart/part) for each target volume per case. The percentage of TVref not delineated by a participant was calculated for each target volume per case to reflect the amount of underestimated treatment volume. Zero percent indicated that all of the TVref was included in the corresponding target volume of the participant, while 100% indicated that none of the TVref was delineated by the participant.
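As a sketch of the underestimation metric, the example below takes the measure literally as the share of TVref voxels that lie outside the participant's volume. The voxel-set representation and the name `percent_reference_missed` are assumptions made for illustration; they do not describe the study's in-house tool.

```python
def percent_reference_missed(tv_ref, tv_part):
    """Percentage of the reference volume not covered by the participant.

    Volumes are collections of voxel indices (illustrative assumption).
    Voxels delineated outside the reference do not affect this metric.
    """
    tv_ref, tv_part = set(tv_ref), set(tv_part)
    if not tv_ref:
        raise ValueError("the reference volume must not be empty")
    missed = tv_ref - tv_part
    return 100.0 * len(missed) / len(tv_ref)


# Toy example: the participant covers 12 of 16 reference voxels
# and adds 8 voxels outside the reference.
reference = {(x, y, 0) for x in range(0, 4) for y in range(4)}
participant = {(x, y, 0) for x in range(1, 6) for y in range(4)}

print(percent_reference_missed(reference, participant))  # 25.0
```

Unlike the DSC, this measure is insensitive to overestimation: a participant volume that encloses the entire reference scores 0% however large it is.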
2.3.2 Target volume review
Target volume review according to the ‘refined’ delineation guideline was performed for case 5 and 6 only using a maximum of 18 standardized criteria depending on the clinical situation. These criteria cover the five major steps in the delineation process: one for co-registration, one for GTVpre, seven for GTVpost, six for CTV-T and three for CTV-N.
For the first part of the review, delineations per case per participant were graded by two independent reviewers (BH, PvR) and one reviewer with prior involvement in the delineation exercise (JM). Since a deviation occurring in each delineation step may cause a systematic error in the succeeding steps, each delineation step was reviewed separately. Subsequently, every deviation was attributed to the violation of a specific criterion. Deviations from the criteria were measured in the axial view using a point-to-point distance tool and categorized as either per protocol (0–4 mm), minor deviation (5–9 mm) or major deviation (≥10 mm). Deviations were only graded as minor or major when present in 3 or more consecutive axial slices. Major deviations were subdivided into deviations leading to a potential underestimation or overestimation of the target volume. Discrepancies between reviewers were resolved collectively.
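The grading rule described above can be sketched as follows. The sketch assumes one point-to-point distance (in mm) per axial slice, listed in slice order, and treats any distance below 5 mm as per protocol; the function names and the handling of values between the stated whole-millimeter categories are assumptions, not part of the review protocol.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def grade_deviation(distances_mm, min_slices=3):
    """Grade a deviation from per-slice distances (mm) in axial slice order.

    A grade of minor or major is only assigned when the corresponding
    threshold is exceeded in at least `min_slices` consecutive slices.
    """
    if longest_run(d >= 10 for d in distances_mm) >= min_slices:
        return "major deviation"
    if longest_run(d >= 5 for d in distances_mm) >= min_slices:
        return "minor deviation"
    return "per protocol"


print(grade_deviation([0, 2, 11, 12, 13, 3]))  # major deviation
print(grade_deviation([0, 6, 12, 7, 1]))       # minor deviation
print(grade_deviation([12, 0, 12, 0, 12]))     # per protocol
```

In the last example, distances of 12 mm occur only in isolated slices, so the three-consecutive-slice requirement is not met and the delineation is graded as per protocol.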
For the second part of the review, a reference pediatric radiation oncologist (GJ) and two independent reviewers (BH, PvR) graded deviations from the CTVref by each participant in six directions of the CTV (anterior, posterior, medial, lateral, cranial and caudal) using automated expansions of the CTVref. A major deviation in one direction of the CTV resulting in underestimation with potential increased risk of locoregional failure was regarded as an unacceptable variation. All minor deviations and major deviations leading to an overestimation were considered acceptable.
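The acceptability rule for the CTV can be expressed as a small decision function. In this sketch each of the six directions carries a grade and, for major deviations, an effect label; the dictionary layout, the label strings and the name `is_unacceptable` are hypothetical choices made for illustration.

```python
DIRECTIONS = ("anterior", "posterior", "medial", "lateral", "cranial", "caudal")


def is_unacceptable(findings):
    """Return True if any direction shows a major deviation that
    underestimates the reference CTV.

    `findings` maps a direction to a (grade, effect) pair, where grade is
    "per protocol", "minor" or "major", and effect is "underestimation",
    "overestimation" or None (illustrative encoding).
    """
    unknown = set(findings) - set(DIRECTIONS)
    if unknown:
        raise ValueError(f"unknown direction(s): {sorted(unknown)}")
    return any(
        grade == "major" and effect == "underestimation"
        for grade, effect in findings.values()
    )


# Toy example: a major underestimation on the lateral side.
participant_a = {
    "anterior": ("per protocol", None),
    "lateral": ("major", "underestimation"),
    "cranial": ("minor", None),
}
# Toy example: only a major overestimation and a minor deviation.
participant_b = {
    "medial": ("major", "overestimation"),
    "caudal": ("minor", None),
}

print(is_unacceptable(participant_a))  # True
print(is_unacceptable(participant_b))  # False
```

A single qualifying direction is sufficient, which matches the rule that one major underestimation makes the whole CTV delineation unacceptable.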
2.4 Statistical analysis
The medians of the volumes, the DSCref/part, the DSCpart/part and the TVref not delineated by participants were calculated. The One-Sample Wilcoxon signed-rank test was used to test the difference between the size of the CTVpart and the CTVref for each case. The Wilcoxon signed-rank test was used to test whether a significant increase of the DSCref/part was obtained between the mean of case 3 and 4 (‘second test phase’) and the mean of case 5 and 6 (‘quality assurance phase’). The Related-Samples Friedman's Two-Way Analysis of Variance by Ranks with the Wilcoxon signed-rank test as post-hoc analysis was used to test the difference of CTVref not delineated by the participants between cases, and to test the difference between the mean DSCref/part of the GTVpre versus the GTVpost versus the CTV of all cases combined. Additionally, the difference between the DSCref/part and the DSCpart/part was tested using the Mann-Whitney U test. A p-value of <0.05 was chosen to indicate statistical significance. Data were analyzed using statistical software SPSS
3. Results
3.1 Data collection
At the end of the quality assurance phase, a total of 57/60 delineation sets had been collected by the coordinating center. One participating center was unable to delineate case 1, 5 and 6 within the given timeframes.
3.2 Geometrical data analysis
Table 1 shows the absolute volume of the GTVpre, GTVpost and CTV of each participant compared to the reference target volumes for case 3–6. For all cases, the CTVpart was not significantly different from the CTVref. Considering each individual participant, the difference in size of the CTVpart compared to the CTVref ranged from −68 mL to +234 mL.
Table 1. Volumetric analysis of case 3–6 for a total of 38/40 completed delineation sets by 10 participants.

| Case | Target volume | Reference volume (mL) | N | Participants’ volumes, median mL (min–max) | Mean Δ(part – ref), mL (%) | P-value* |
|---|---|---|---|---|---|---|
| Case 3 | GTVpre | 39.4 | 9 | 38.6 (29.8–94.0) | 3.8 (8.9) | |
| Case 3 | GTVpost | 8.9 | 9 | 12.9 (8.4–78.5) | 16.2 (64.6) | |
| Case 3 | CTV | 71.8 | 10 | 82.8 (62.3–262.5) | 42.4 (37.1) | 0.20 |
| Case 4 | GTVpre | 299.9 | 10 | 271.0 (227.9–295.1) | −34.1 (−12.9) | |
| Case 4 | GTVpost | 59.8 | 10 | 41.5 (24.1–111.7) | −12.5 (−26.5) | |
| Case 4 | CTV | 124.1 | 10 | 94.3 (56.4–231.2) | −1.6 (−1.3) | 0.72 |
| Case 5 | GTVpre | 133.2 | 8 | 147.2 (123.9–162.0) | 13.2 (9.0) | |
| Case 5 | GTVpost | 15.2 | 8 | 28.6 (12.0–54.1) | 15.5 (50.5) | |
| Case 5 | CTV | 175.2 | 9 | 217.5 (149.9–409.7) | 63.2 (26.5) | 0.05 |
| Case 6 | GTVpre | 164.7 | 9 | 163.3 (148.1–172.2) | −4.3 (−2.7) | |
| Case 6 | GTVpost | 17.3 | 9 | 20.6 (10.6–42.1) | 6.5 (27.6) | |
| Case 6 | CTV | 83.8 | 9 | 108.3 (49.2–193.7) | 34.2 (29.0) | 0.14 |

*For each case, the difference between the mean CTVpart and the CTVref was tested using a One-Sample Wilcoxon signed-rank test. A p-value of <0.05 was chosen to indicate statistical significance.
The boxplots in Fig. 2 illustrate the variation in DSC between the reference and the participants (DSCref/part), the variation between the participants only (DSCpart/part) for each target volume per case and the percentage of TVref not delineated by a participant. All cases combined, the DSCref/part was better for the GTVpre (median = 0.87) compared to the GTVpost (median = 0.39, p = 0.03) and CTV (median = 0.55, p = 0.02). No significant difference in DSCref/part for the CTV was observed between the ‘test phase’ and the ‘quality assurance phase’ (case 3/4 vs. case 5/6: p = 0.15, standard error = 8.43). For the CTV of each case, the DSCref/part and the DSCpart/part were not significantly different (case 3: p = 0.84; case 4: p = 0.59; case 5: p = 0.84; case 6: p = 0.32). The percentage of CTVref not delineated by the CTV of all participants for case 3–6 ranged between 11% and 73% (median = 35%) and did not significantly differ between cases (p = 0.17) (Fig. 3; Supplementary Table 3).
Fig. 2. Boxplots for DSCref/part, DSCpart/part and TVref not delineated for case 3–6 per target volume. For a total of 38 delineation sets (2 missing) that were completed by the participants, the boxplots depict the DSCref/part (upper row), DSCpart/part (middle row) and TVref not delineated (lower row) of case 3–6 per target volume. Abbreviations: TVref, reference target volume; DSC, Dice Similarity Coefficient; GTVpre/post, pre- and postoperative Gross Tumor Volume, respectively; CTV, Clinical Target Volume.
Fig. 3. Count map of case 6 (female, 2 years old, stage II WT-HR): the degree of agreement between the CTVpart (n = 9/10), alongside the CTVref (n = 1). For a total of 9/10 participants, overlap between the CTVpart for case 6 alongside the CTVref (white) is shown on the axial (3A), coronal (3B) and sagittal (3C) postoperative CT. Red and blue zones depict 100% and 10% agreement between participants only, respectively. Abbreviations: WT, Wilms’ tumor; HR, High-Risk; CTV, Clinical Target Volume. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
3.3 Target volume review
Firstly, case 5 and 6 were reviewed by grading each step in the delineation process separately. One or more major deviations were found in 2/18, 5/18, 12/17, 18/18 and 4/9 delineations for co-registration, GTVpre, GTVpost, CTV-T and CTV-N, respectively (Supplementary Table 4). The criteria with the highest number of major deviations were CTV-T criterion 3 (‘healthy-appearing kidney’, n = 14/18), CTV-T criterion 2 (‘organs at risk’, n = 9/18) and GTVpost criterion 3 (‘healthy-appearing kidney’, n = 8/17) (Table 2; Supplementary Table 2). Twenty-nine of the 44 observed major deviations were the result of an overestimation, while 15 of the 44 observed major deviations were caused by an underestimation.
Table 2. Review of case 5 and 6: total number of deviations per criterion.

| Delineation step | Criterion | Delineations received | Per protocol (0–4 mm) | Minor (5–9 mm) | Major (≥10 mm) (under-/overestimation) |
|---|---|---|---|---|---|
| Co-registration | Criterion 1: co-registration | 18 | 15 | 1 | 2 (n.a./n.a.) |
| GTVpre | Criterion 1: macroscopic tumor | 18 | 11 | 2 | 5 (0/5) |
| GTVpost | Criterion 1: OARs | 17 | 17 | 0 | 0 (n.a./0) |
| GTVpost | Criterion 2: contact zone of GTVpre | 17 | 15 | 1 | 1 (0/1) |
| GTVpost | Criterion 3: healthy-appearing kidney | 17 | 8 | 1 | 8 (0/8) |
| CTV-T | Criterion 1: isotropic margin | 18 | 17 | 0 | 1 (0/1) |
| CTV-T | Criterion 2: OARs | 18 | 8 | 1 | 9 (n.a./9) |
| CTV-T | Criterion 3: healthy-appearing kidney | 18 | 3 | 1 | 14 (13/1) |
| CTV-T | Criterion 4: posterior wall | 18 | 16 | 0 | 2 (0/2) |
| CTV-N* | Criterion 1: area around AA/IVC | 9 | 4 | 2 | 3 (1/2) |
| CTV-N* | Criterion 2: cranial border | 9 | 7 | 0 | 2 (2/0) |
| CTV-N* | Criterion 3: caudal border | 9 | 7 | 2 | 0 (0/0) |

*The delineation of a CTV-N was only indicated for case 5.
Abbreviations: n.a., not applicable; GTVpre/post, pre- and postoperative Gross Tumor Volume; OAR, organs at risk; CTV-T/N, Clinical Target Volume of the tumor/involved lymph node area; AA, abdominal aorta; IVC, inferior vena cava.
For the second part of the review, each CTVpart was graded by the deviation from the CTVref. An unacceptable variation from the CTVref was found in 7/9 participants for case 5 and 6/9 participants for case 6 (Fig. 4).
Fig. 4. Three examples of an unacceptable variation by a participant observed during target volume review of case 5 and case 6. The postoperative CTs show the CTVref (green) compared to the CTV of three different participants (blue) who were found to have an unacceptable variation measured with a point-to-point ruler (red). For case 5, participant number 1 underestimates the lateral margin of the CTV-T by 32 mm (4A, axial slice), and participant number 2 underestimates the cranial border of the CTV-N by 24 mm (4B, sagittal slice). For case 6, participant 3 underestimates the CTV-T at the medial side towards the transition zone (yellow) (4C, axial slice). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
4. Discussion
In the current study, ten radiation oncologists from seven European countries delineated the pre- and postoperative GTV, as well as the CTV, of six unique renal tumor cases in order to evaluate the inter-clinician variation of a new flank target volume delineation approach. The median DSC, expressing the overlap between the CTV of participants and the reference CTV, was 0.55, while the median underestimation of the reference CTV by the participants ranged between 29% and 47%. Additionally, standardized review of the delineations showed that an unacceptable underestimation of a reference CTV was present in 7/9 participants for case 5 and 6/9 participants for case 6.
Volume and measures of overlap, like the DSC and the generalized conformity index, are commonly used metrics to determine inter-clinician delineation variation. In a nationwide French study, the CTV agreement for conventional flank irradiation, as defined in the SIOP-2001 protocol, ranged from 0.50 to 0.64 between five RT teams. In our study, the median DSCref/part for the CTV ranged from 0.53 to 0.62. Despite two consensus meetings with the participants and refinement of the preliminary delineation guideline, no significant improvement of inter-clinician variation was observed during this study. This lack of improvement might be caused by the complexity of postoperative tumor reconstruction, the diversity of clinical presentations and the rarity of pediatric (renal) tumors in general. Moreover, the participants did not receive any training prior to this study and no feedback was provided regarding their individual performance during the study. Such training and feedback might have reduced the number of errors, as demonstrated in other clinical settings. Finally, it is also important to consider that the DSC is generally more severely affected by variation in case of small, concave volumes like the postoperative tumor bed, compared to larger, spherical volumes like the GTVpre, as also demonstrated in this study [Inter-observer variability in target volume delineations of benign and metastatic brain tumours for stereotactic radiosurgery: results of a national quality assurance programme].
In order to evaluate the effect of variation on the potential clinical outcome of patients, a standardized review of all delineations was performed using objective criteria that reflect the recommendations from the refined delineation guideline. When each step in the delineation process was graded independently during the first review, major deviations predominantly occurred for the margins towards the healthy-appearing kidney tissue, the removal of uninvolved OARs from the CTV-T, and the cranial margin of the CTV-N. This indicates where the delineation guideline could be improved or where additional attention during the reviewing process is appropriate.
The second review showed an unacceptable deviation from the CTVref (i.e. leading to significant underestimation) in the majority of the participants. In adult cancer, it is known that RT protocol violation may increase the risk of treatment failure. The Radiation Therapy Oncology Group (RTOG) revealed that failure to adhere to RT guidelines was associated with an increased risk of locoregional failure during a phase III study for pancreatic cancer, as well as during a multi-institutional trial for early-stage gastrointestinal cancer using Intensity-Modulated Radiotherapy (IMRT) [Failure to adhere to protocol specified radiation therapy guidelines was associated with decreased survival in RTOG 9704 – a phase III trial of adjuvant chemotherapy and chemoradiotherapy for patients with resected adenocarcinoma of the pancreas]. Also, a large phase III trial of advanced head and neck cancers using prospective quality assurance in 81 Australian centers found a statistically significant difference in 2-year locoregional control rate: 54% versus 78% for patients with and without major deviations, respectively. Less is known about the negative impact of RT protocol violation on treatment outcome for pediatric cancers. Carrie et al. reviewed the treatment plans of 174 medulloblastoma patients and demonstrated a strong correlation between the number of major target volume deviations and the risk of tumor relapse. While the rate of protocol deviations found in our study is based upon a carefully established reference target volume, the true effect of underestimation can only be determined when comparing clinical outcome and target volume review. However, given the low numbers of locoregional failure for WT compared to medulloblastoma patients, it will be more challenging to demonstrate the impact of major deviations on outcome when this new RT technique is introduced on a larger scale [Omission of doxorubicin from the treatment of stage II-III, intermediate-risk Wilms’ tumour (SIOP WT 2001): An open-label, non-inferiority, randomised controlled trial; Results of a randomized study of preradiation chemotherapy versus radiotherapy alone for nonmetastatic medulloblastoma: the International Society of Paediatric Oncology/United Kingdom Children’s Cancer Study Group PNET-3 study; Relapse of Wilms’ tumour and detection methods: a retrospective analysis of the 2001 Renal Tumour Study Group-International Society of Paediatric Oncology Wilms’ tumour protocol database]. Overestimation of the target volume was not regarded as an unacceptable variation in our analysis. However, the degree of overestimation should also be evaluated within central target volume review in order to prevent unnecessary violation of normal tissue constraints for organs like the spleen, tail of the pancreas or the heart.
The design of this study was chosen to mimic daily clinical practice with cases representing a wide range of clinical situations. Also, ten radiation oncologists from seven different countries in Europe participated in this study, reflecting the inter-clinician variability in an international multicenter setting. Furthermore, this study implemented a review approach similar to modern quality assurance initiatives [Radiotherapy practice for paediatric brain tumours across Europe and quality assurance initiatives: current situation, international survey and future perspectives]. However, the use of multiple review criteria and the establishment of reference target volumes might not be preferable for real-time pre-treatment quality assurance, since this is complex and time-consuming for reviewers and RT for renal cancers has to start shortly after surgery. Nevertheless, the criteria generated for this delineation exercise could be a good frame of reference since they reflect all recommendations from the consensus statement on the new flank target volume definition. Since this study aimed to evaluate inter-clinician delineation variation only, dosimetric analyses were not included in this study, but are normally part of the RT quality assurance.
In conclusion, this international multicenter delineation exercise demonstrates that this new approach for flank target volume delineation leads to geometrical variation among clinicians. Standardized review using a reference CTV shows that major deviations leading to an underestimation of the reference CTV occurred in the majority of the participants. These findings strongly suggest the need for additional training and centralized pre-treatment review when this highly-conformal target volume delineation approach is implemented during a SIOP-RTSG endorsed prospective multicenter study.
5. Data availability statement
Research data are available upon request to the corresponding author.
Funding source
KiKa (Children Cancer-free) Foundation, grant number 328.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendix A. Supplementary data
The following are the Supplementary data to this article:
References
Omission of doxorubicin from the treatment of stage II-III, intermediate-risk Wilms’ tumour (SIOP WT 2001): An open-label, non-inferiority, randomised controlled trial.
Immediate nephrectomy versus preoperative chemotherapy in the management of non-metastatic Wilms’ tumour: results of a randomised trial (UKW3) by the UK Children’s Cancer Study Group.
Radiotherapy practice for paediatric brain tumours across Europe and quality assurance initiatives: current situation, international survey and future perspectives.
Inter-observer variability in target volume delineations of benign and metastatic brain tumours for stereotactic radiosurgery: results of a national quality assurance programme.
Failure to adhere to protocol specified radiation therapy guidelines was associated with decreased survival in RTOG 9704 – a phase III trial of adjuvant chemotherapy and chemoradiotherapy for patients with resected adenocarcinoma of the pancreas.
Results of a randomized study of preradiation chemotherapy versus radiotherapy alone for nonmetastatic medulloblastoma: the International Society of Paediatric Oncology/United Kingdom Children’s Cancer Study Group PNET-3 study.
Relapse of Wilms’ tumour and detection methods: a retrospective analysis of the 2001 Renal Tumour Study Group-International Society of Paediatric Oncology Wilms’ tumour protocol database.