Introduction
Several candidate treatment plans are generated for each patient in order to establish the optimal radiation treatment plan. The final plan is selected by combining a qualitative analysis of the delineated shapes of the planning target volume and organs at risk (OARs) with a quantitative analysis based on the dose volume histogram (DVH).
However, there is no guarantee that a plan selected in this way will not cause radiotherapy side effects in the patient. If radiation oncologists and medical physicists also consider historical clinical data on complications alongside typical treatment plan factors, such as the DVH and the OAR dose constraint range, they can establish a plan that lowers the normal tissue complication probability (NTCP) while raising the tumour control probability (TCP).
Diagnostic and therapeutic technologies are advancing through tools that combine oncology, diagnostics, genetics and computer science, with the goal of improving the quality of life of patients after treatment. Clinical decision support systems are likewise being researched using clinical big data and prediction modelling based on machine learning algorithms. In a radiation treatment decision support system, the role of the prediction model is to help maximise tumour control and minimise side effects after treatment, and to determine whether a plan offers acceptable dose risk, by applying classification and regression methods to dose-volume data from an existing clinical database.Reference Meldolesi, van Soest and Damiani 1
With this aim, Zhang et al. recently studied the prediction of radiation therapy complications using machine learning.Reference Zhang, D’Souza, Shi and Meyer 2 In addition, Guidi et al. are applying machine learning algorithms to predict criticality in particular groups, such as patients with head and neck cancer.Reference Guidi, Maffei and Vecchi 3
However, no known published studies have integrated machine learning algorithms and dosimetrical and biological index analysis functions into a radiation treatment planning decision support system to determine the optimal plan.
Cao et al. performed integrated analysis studies of dosimetrical and biological index data using prostate cancer cases.Reference Cao, Lee and Chang 4 , Reference Cao, Lee and Chang 5 In addition, big data analysis studies of prostate cancer using machine learning approaches have been performed.Reference Çınar, Engin, Engin and Atesçi 6 – Reference Mohammed and Wagner Meira 8 According to Çınar et al., prostate cancer is currently the most common cancer in men after lung cancer.Reference Çınar, Engin, Engin and Atesçi 6 Therefore, we used prostate cancer cases as the model system.
The aim of this preliminary study was to develop a predictive model that uses support vector machine and decision tree algorithms to predict the OAR complication level and to classify OAR dose-volume values, and to combine this function with an in-house developed treatment decision support system.
Material and Methods
Patient group
The target population was 12 male patients with adenocarcinoma of the prostate, for whom 12 treatment plans had been established. The patient characteristics are as follows: average age, 72 years; average weight, 75·68 kg; tumour-node-metastasis (TNM) stage, T1c-T3b, N0 and M0 (Table 1). The treatment planning system used was TomoTherapy® (Accuray Incorporated, Sunnyvale, CA, USA).
Table 1 Characteristics of the patients with prostate cancer
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-11192-mediumThumb-S1460396916000583_tab1.jpg?pub-status=live)
Note: To establish original treatment plans for patients (n=12).
Abbreviations: PD, prescription dose; FD, fractional dose; OP, operation; TNM, tumour-node-metastasis.
The DVHs for the OARs of bladder and rectum are shown in Figure 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-49988-mediumThumb-S1460396916000583_fig1g.jpg?pub-status=live)
Figure 1 Established dose volume histogram (DVH) of bladder and rectum for 12 patients with prostate cancer (n=12).
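Dose-volume quantities such as the dose to 25% or 50% of an organ can be read from a cumulative DVH. The following is a minimal sketch of that lookup, assuming the organ is represented as a list of equal-volume voxel doses. The function names `dose_at_volume` and `volume_at_dose` and the sample doses are hypothetical and are not taken from the study.

```python
import math


def dose_at_volume(voxel_doses_gy, volume_percent):
    """Return the minimum dose (Gy) received by the hottest
    `volume_percent` of the organ, assuming equal-volume voxels."""
    if not voxel_doses_gy:
        raise ValueError("need at least one voxel dose")
    if not 0 < volume_percent <= 100:
        raise ValueError("volume_percent must be in (0, 100]")
    ranked = sorted(voxel_doses_gy, reverse=True)  # hottest voxel first
    # Number of voxels that make up the requested volume fraction.
    n_voxels = max(1, math.ceil(len(ranked) * volume_percent / 100))
    return ranked[n_voxels - 1]


def volume_at_dose(voxel_doses_gy, dose_gy):
    """Return the percentage of the organ receiving at least `dose_gy`."""
    hits = sum(1 for d in voxel_doses_gy if d >= dose_gy)
    return 100.0 * hits / len(voxel_doses_gy)


# Hypothetical bladder voxel doses (Gy); illustrative values only.
bladder = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
d25 = dose_at_volume(bladder, 25)   # dose to 25% of the bladder
d50 = dose_at_volume(bladder, 50)   # dose to 50% of the bladder
v50 = volume_at_dose(bladder, 50.0)  # volume receiving at least 50 Gy
```

A clinical DVH would normally be exported by the treatment planning system with its own binning and volume weighting, so this lookup only illustrates the definition.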
We developed an in-house planning decision support programme that takes DVH information from the treatment plan as input, and we integrated the results of this study into that system. As an example, we used OAR dose-volume data as constraint values for predicting complications (Table 2).
Table 2 Dose-volume constraints for bladder and rectum as organs at risk
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170522043816853-0343:S1460396916000583:S1460396916000583_tab2.gif?pub-status=live)
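One way to use dose-volume constraints in a decision support programme is to compare each plan against a table of limits and report the limits that are exceeded. The sketch below assumes constraints of the form "dose to x% of the organ must not exceed a limit". The limit values, the `CONSTRAINTS` table and the `check_plan` function are hypothetical placeholders; they are not the values of Table 2, which may express its constraints differently.

```python
# Hypothetical placeholder constraints: organ -> {volume %: maximum dose in Gy}.
# These numbers are NOT the values of Table 2.
CONSTRAINTS = {
    "bladder": {25: 70.0, 50: 60.0},
    "rectum": {25: 65.0, 50: 55.0},
}


def check_plan(plan_doses, constraints=CONSTRAINTS):
    """Return (organ, volume %, dose, limit) for every exceeded limit."""
    violations = []
    for organ, limits in constraints.items():
        for volume_percent, limit_gy in limits.items():
            dose_gy = plan_doses[organ][volume_percent]
            if dose_gy > limit_gy:
                violations.append((organ, volume_percent, dose_gy, limit_gy))
    return violations


# Hypothetical plan: dose (Gy) received by 25% and 50% of each organ.
plan = {
    "bladder": {25: 66.0, 50: 48.0},
    "rectum": {25: 68.5, 50: 52.0},
}
violations = check_plan(plan)
```

In this made-up plan only the rectum limit at 25% volume is exceeded, so `violations` holds a single entry.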
Predictive modelling using machine learning algorithm
The machine learning algorithm for the radiation treatment planning decision support system can be used to predict complications in OARs that are exposed to radiation because they lie close to the target during radiation therapy.Reference El Naqa, Li and Murphy 9 That is, the predictive modelling algorithm calculates its results from comprehensive data describing the characteristics of the patients and their treatment plans.
Accordingly, there is a need to verify the results of late toxicity through a decision tree model or dose-volume data analysis based on current knowledge and historical clinical outcomes.
A prediction model can be applied using index data such as age, TNM stage, gender, prescribed dose, tumour control probability and survival rate.Reference Lambin, van Stiphout and Starmans 10 In addition, the support vector machine (SVM) algorithm can be applied to classify the different OAR dose constraints.Reference Zhang, D’Souza, Shi and Meyer 2 The DVH of the patients during radiation therapy is a significant predictive indicator.Reference Lambin, van Stiphout and Starmans 10
Therefore, we used the DVH data of patients as input parameters for the application of clinical big data, and applied the SVM and decision tree machine learning techniques as described previously for complication prediction.Reference Zhang, D’Souza, Shi and Meyer 2
Support vector machine
An SVM selects the classifier that best separates two groups: the separating hyperplane with the largest margin between the groups. For nonlinear problems, the kernel method maps the data into a higher-dimensional feature space in which a linear machine can separate them.Reference El Naqa, Li and Murphy 9
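The kernel idea can be illustrated with a degree-2 polynomial kernel, in line with the quadratic SVM reported in the Results. The sketch below assumes the common form K(x, y) = (x·y + 1)², which the source does not specify, and checks that it equals an ordinary dot product in an explicit six-dimensional feature space for two-dimensional inputs. The function names and input values are hypothetical.

```python
import math


def quadratic_kernel(x, y):
    """K(x, y) = (x . y + 1)^2, a degree-2 polynomial kernel (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + 1.0) ** 2


def feature_map(x):
    """Explicit feature map for 2-D input whose dot product equals K."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0)


# Two hypothetical 2-D input points.
x = (1.0, 2.0)
y = (3.0, 4.0)

k_implicit = quadratic_kernel(x, y)
k_explicit = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
```

Both values agree up to floating-point rounding, which is why a linear separator in the feature space corresponds to a quadratic boundary in the original dose-volume space.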
A hyperplane is defined as the set of all points $\mathbf{x} \in \mathbb{R}^d$ that satisfy $h(\mathbf{x}) = 0$, where $h(\mathbf{x})$ is the hyperplane function in $d$ dimensions, given in equation (1):Reference De Bari, Vallati and Gatta 7 , Reference Mohammed and Wagner Meira 8
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170522043816853-0343:S1460396916000583:S1460396916000583_eqnU1.gif?pub-status=live)
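As a small illustration of how a hyperplane function separates two classes, the sketch below assumes the usual linear form h(x) = w·x + b and assigns a class according to the sign of h(x). The weights, bias, dose values and the mapping of the positive side to complication are hypothetical choices for illustration, not fitted values from this study.

```python
def make_hyperplane(weights, bias):
    """Return h(x) = w . x + b for the given weight vector and bias."""
    def h(x):
        if len(x) != len(weights):
            raise ValueError("x must have the same dimension as the weights")
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return h


def classify(h, x):
    """Label a point by the side of the hyperplane it falls on.
    Mapping the positive side to 'C' is an arbitrary illustrative choice."""
    return "C" if h(x) > 0 else "NC"


# Hypothetical hyperplane in a 2-D dose space (Gy): x1 + x2 - 100 = 0.
h = make_hyperplane((1.0, 1.0), -100.0)

low_dose_plan = (40.0, 50.0)    # h = -10, negative side
high_dose_plan = (60.0, 55.0)   # h = +15, positive side
on_plane = (50.0, 50.0)         # h = 0, lies on the hyperplane
```

Points with h(x) = 0 lie exactly on the hyperplane; this sketch groups them with the NC side, which is a convention rather than a property of the method.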
In this study, we modelled the SVM algorithm using dose-volume input for the training and test models, as shown in the flow chart in Figure 2. Figure 2a shows the entire analysis system, from the treatment planning data to the results: quantitative analysis of dosimetrical indices (homogeneity index, conformity index and conformation number) and biological indices (TCP and NTCP), together with a big data-based prediction algorithm. The predictive algorithm component is detailed in Figure 2b, which shows how the dose-volume data of every patient are used as input to the SVM training and test processes for classification to reach the final outcome.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-00980-mediumThumb-S1460396916000583_fig2g.jpg?pub-status=live)
Figure 2 Model of a radiation treatment planning decision support system (a) and its predictive modelling flow chart for the support vector machine algorithm (b) of this study. Abbreviations: PITV, prescription isodose to target volume; CI, conformity index; HI, homogeneity index; TCI, target coverage index; MHI, modified homogeneity index; CN, conformity number; COSI, critical organ scoring index; QF, quality factor
A total of 100 model plans were generated based on 12 treatment plans for analysis of the support vector machine algorithm (Table 3).
Table 3 Classification of bladder and rectum complications by 100 modelled plans for the support vector machine algorithm (n=100)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-72775-mediumThumb-S1460396916000583_tab3.jpg?pub-status=live)
Abbreviation: NC, non-complication.
Decision tree
A decision tree reaches a final value by passing through a sequence of decision points, each of which tests a specific condition and selects the outgoing branch accordingly. This can be formalised as a simple pattern and programmed as a machine learning algorithm.
The decision point $X_j \le v$ divides the input data space, $R$, into two regions: $R_Y$ and $R_N$. The division of $R$ into $R_Y$ and $R_N$ also induces a binary partition of the corresponding input data points, $D_{\mathrm{Input}}$. This means that a split point of the form $X_j \le v$ partitions the data into two groups, as in equations (2) and (3).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170522043816853-0343:S1460396916000583:S1460396916000583_eqnU2.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20170522043816853-0343:S1460396916000583:S1460396916000583_eqnU3.gif?pub-status=live)
where $D_Y$ is the group of data points lying in region $R_Y$, and $D_N$ is the group of data points lying in $R_N$.Reference Mohammed and Wagner Meira 8
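The partition induced by a split point can be sketched in a few lines. The code below assumes that points with x_j ≤ v go to the "yes" group and the rest to the "no" group, and it scores candidate thresholds with the Gini impurity. The Gini criterion is an assumption borrowed from common decision tree practice, since the source does not state which criterion was used, and the plan data and function names are hypothetical.

```python
def split(points, j, v):
    """Partition labelled points into D_Y (x[j] <= v) and D_N (x[j] > v)."""
    d_yes = [p for p in points if p[0][j] <= v]
    d_no = [p for p in points if p[0][j] > v]
    return d_yes, d_no


def gini(points):
    """Gini impurity of the class labels in a group (0.0 when pure)."""
    if not points:
        return 0.0
    n = len(points)
    counts = {}
    for _, label in points:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(points, j):
    """Try midpoints between sorted distinct values of feature j and
    return (threshold, weighted impurity) of the best candidate."""
    values = sorted({features[j] for features, _ in points})
    best_v, best_score = None, float("inf")
    for low, high in zip(values, values[1:]):
        v = (low + high) / 2.0
        d_yes, d_no = split(points, j, v)
        score = (len(d_yes) * gini(d_yes) + len(d_no) * gini(d_no)) / len(points)
        if score < best_score:
            best_v, best_score = v, score
    return best_v, best_score


# Hypothetical plans: ((dose to 25% bladder, dose to 50% rectum) in Gy, label).
plans = [
    ((40.0, 30.0), "NC"), ((45.0, 35.0), "NC"), ((50.0, 40.0), "NC"),
    ((60.0, 50.0), "C"), ((65.0, 55.0), "C"), ((70.0, 60.0), "C"),
]
threshold, impurity = best_split(plans, j=1)  # split on the rectum dose
d_yes, d_no = split(plans, j=1, v=threshold)
```

With this toy data the rectum dose separates the two labels cleanly, so the best split leaves both groups pure. Real plan data would rarely separate this neatly.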
To analyse the decision tree algorithm, we calculated an additional eight plans based on the 12 original treatment plans, expanding the analysis to 20 plans. Table 4 shows the doses (Gy) to 25% of the bladder, 50% of the bladder, 25% of the rectum and 50% of the rectum used for this prediction.
Table 4 Complication prediction for bladder and rectum using 20 representative plans for the decision tree algorithm (n=20)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-11818-mediumThumb-S1460396916000583_tab4.jpg?pub-status=live)
Abbreviations: NC, non-complication; C, complication.
Results
After the training process on the 100 modelled plans, the SVM analysis showed 91·0% accuracy.
In addition, the decision tree analysis showed that potential OAR risk could be classified as complication (C) or non-complication (NC) based on the doses to 25% of the bladder, 50% of the rectum and 30% of the bowel. We could therefore combine a programme containing these machine learning algorithms with our in-house developed planning decision support system to allow complication predictions for patients based on clinical big data.
Predictive modelling analysis results
SVM
Figure 3 shows the results of the classification analysis for bladder and rectum for the 100 modelled plans. The quadratic SVM separated the NC cases as follows: red dots in Figure 3 indicate plans correctly classified as NC, and red crosses indicate plans misclassified as NC. The true positive rate and false positive rate were obtained to assess the performance of the SVM classifier, giving an area under the curve (AUC) of 0·8107 (Figure 4).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-73110-mediumThumb-S1460396916000583_fig3g.jpg?pub-status=live)
Figure 3 Scatter plot of patient organs at risk (OARs) with the support vector machine for 100 modelled plans. Note: red dot (·): correctly classified as NC; red cross (x): misclassified as NC. (a) Bladder scatter plot with modelled plans; (b) rectum scatter plot with modelled plans. Abbreviation: NC, non-complication.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-51044-mediumThumb-S1460396916000583_fig4g.jpg?pub-status=live)
Figure 4 Receiver operating characteristic (ROC) of the classifier with the support vector machine. Note: Area under curve (AUC)=0·8107, (a)=positive class for complication, (b)=positive class for non-complication.
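For readers unfamiliar with how an ROC curve and its AUC are obtained from classifier scores, the sketch below sweeps a threshold from the highest score downwards, records the false positive rate and true positive rate at each step, and integrates with the trapezoidal rule. The scores and labels are hypothetical, the scores are assumed to be distinct, and the result is unrelated to the AUC reported in this study.

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), sweeping from the highest score down.
    labels: 1 = positive class, 0 = negative class. Assumes distinct scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical classifier scores and true labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
```

The area equals the fraction of positive-negative pairs in which the positive case receives the higher score, which is 8 of 9 pairs for this toy data.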
In addition, confusion matrix analysis was performed to calculate the error matrix, showing 91% agreement between the predicted class and the true class (Figure 5).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-56845-mediumThumb-S1460396916000583_fig5g.jpg?pub-status=live)
Figure 5 Confusion matrix for support vector machine analysis and the predicted class for non-complication (NC) with 91·0% accuracy.
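Accuracy follows directly from a confusion matrix as the share of plans whose predicted class matches the true class. The sketch below shows that calculation on ten hypothetical labels; the label lists and function names are illustrative and do not reproduce the counts of Figure 5.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true class, predicted class) pairs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Fraction of cases on the diagonal of the confusion matrix."""
    correct = sum(n for (true, pred), n in matrix.items() if true == pred)
    return correct / sum(matrix.values())


# Hypothetical true and predicted classes for ten plans.
true_labels = ["NC", "NC", "NC", "C", "C", "NC", "C", "NC", "NC", "C"]
pred_labels = ["NC", "NC", "C", "C", "C", "NC", "NC", "NC", "NC", "C"]

matrix = confusion_matrix(true_labels, pred_labels)
acc = accuracy(matrix)

# Rates for the complication class, treated here as the positive class.
tpr = matrix[("C", "C")] / (matrix[("C", "C")] + matrix[("C", "NC")])
fpr = matrix[("NC", "C")] / (matrix[("NC", "C")] + matrix[("NC", "NC")])
```

Accuracy alone can hide an imbalance between classes, which is why the true positive rate and false positive rate are usually reported alongside it.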
Decision tree
As a result of the decision tree analysis, complication prediction was possible based on the doses to 25% of the bladder, 50% of the rectum and 30% of the bowel. When radiation oncologists and medical physicists decide the final treatment plans before radiation therapy, the dose constraint for every OAR makes it complicated to determine an optimal plan; thus, a method that considers these complex factors together would be a useful analytical tool for predicting complications (Figure 6).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-07729-mediumThumb-S1460396916000583_fig6g.jpg?pub-status=live)
Figure 6 Decision tree for grade 2 rectal complication classification for 100 plans with 25% bladder, 50% rectum, 30% bowel.
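Once a tree has been fitted, predicting the class of a new plan is a walk from the root to a leaf. The sketch below encodes a small tree over the three dose features named above as nested dictionaries. The tree structure, the thresholds and the feature names are hypothetical placeholders and are not the fitted tree of Figure 6.

```python
# Hypothetical tree: thresholds (Gy) are placeholders, not fitted values.
TREE = {
    "feature": "rectum_50", "threshold": 45.0,
    "yes": {"label": "NC"},
    "no": {
        "feature": "bladder_25", "threshold": 65.0,
        "yes": {
            "feature": "bowel_30", "threshold": 40.0,
            "yes": {"label": "NC"},
            "no": {"label": "C"},
        },
        "no": {"label": "C"},
    },
}


def predict(node, plan):
    """Follow 'yes' when plan[feature] <= threshold, else 'no', until a leaf."""
    while "label" not in node:
        branch = "yes" if plan[node["feature"]] <= node["threshold"] else "no"
        node = node[branch]
    return node["label"]


# Hypothetical plans: dose (Gy) to 25% bladder, 50% rectum and 30% bowel.
plan_a = {"bladder_25": 70.0, "rectum_50": 40.0, "bowel_30": 50.0}
plan_b = {"bladder_25": 70.0, "rectum_50": 50.0, "bowel_30": 30.0}
plan_c = {"bladder_25": 60.0, "rectum_50": 50.0, "bowel_30": 35.0}
plan_d = {"bladder_25": 60.0, "rectum_50": 50.0, "bowel_30": 45.0}

results = [predict(TREE, p) for p in (plan_a, plan_b, plan_c, plan_d)]
```

Because each prediction is a short chain of threshold comparisons, the path through the tree can be shown to the clinician as the reason for the predicted class.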
Integration with SMARTRT
SMARTRT is an in-house radiation treatment planning decision support system (PDSS) that was developed to produce a final score, the overall quality factor, from the DVH information of the patient's treatment plan together with dosimetrical and biological index analysis. If a toxicity prediction function based on clinical big data is added to SMARTRT and linked to comprehensive clinical side-effect data for complication prediction, we might be able to achieve the optimal patient-specific PDSS (Figure 7).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-33906-mediumThumb-S1460396916000583_fig7g.jpg?pub-status=live)
Figure 7 Integrated flow chart of toxicity prediction, dosimetric biological index analysis and overall factors for SMARTRT. Abbreviations: DVH, dose volume histogram; PTV, planning target volume; OAR, organ at risk; TCP, tumour control probability; NTCP, normal tissue complication probability; PITV, prescription isodose to target volume; CI, conformity index; HI, homogeneity index; TCI, target coverage index; MHI, modified homogeneity index; CN, conformity number; COSI, critical organ scoring index; RO, radiation oncology; DB, database; RTOG, radiation therapy oncology group; EUD, equivalent uniform dose.
Discussion and Conclusion
To improve the quality of life of the patient after treatment, more accurate patient treatment plans are needed in the field of radiation oncology. This should allow more accurate prognosis of patient outcomes after treatment. This issue is being addressed by planning decision support system research using fundamental DVH analysis, as well as dosimetrical and biological indices with TCP and NTCP.Reference Cao, Lee and Chang 4 , Reference Sanchez-Nieto and Nahum 11 – Reference Bentzen, Constine and Deasy 13
Treatment plans need to be compared more accurately using an optimal patient-specific PDSS that includes a predictive model-based function; linking machine learning research to clinical big data in this way could represent a significant advance in patient care (Figure 7). Figure 8 presents the overall artificial intelligence-based integrated clinical decision support system of this study. This system includes an intelligent clinical decision support algorithm that uses machine learning and deep learning on clinical big data and could be further expanded.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20170522045123-69364-mediumThumb-S1460396916000583_fig8g.jpg?pub-status=live)
Figure 8 The artificial intelligence (AI)-based integrated clinical decision support system of this study. Abbreviation: DICOM RT, Digital Imaging and Communications in Medicine Radiation Therapy.
Machine learning-based studies of clinical cases in radiation oncology are under way.Reference El Naqa, Bradley, PE, Hope and Deasy 14 – Reference Coates, Souhami and El Naqa 18 As more patient cases and multi-institutional studies are compiled, the amount of training data should increase and provide more accurate results. This will be the foundation for the development of an optimal patient-specific PDSS for prognosis.
Acknowledgement
This work was supported by a Korea University grant.
Conflicts of Interest
None.