Project dispute prediction by hybrid machine learning techniques.
Chou, Jui-Sheng ; Tsai, Chih-Fong ; Lu, Yu-Hsin et al.
Introduction
During the last decade, many public-private partnership (PPP) projects were not as successful as
expected due to project disputes occurring during the build, operate,
and transfer (BOT) phase. According to the Taiwan Public Construction
Commission (TPCC), the dispute rate was 23.6% during 2002-2009 (PCC
2011). These disputes were resolved by mediation and non-mediation
procedures. Non-mediation procedures include arbitration, litigation,
negotiation, and administrative appeals. In Taiwan, up to 84% of PPP
project disputes are settled by mediation or negotiation within 1-9
months (PCC 2011). Notably, arbitration or litigation costs all
parties considerably more time and money than mediation or
negotiation.
Most research has focused on predicting litigation outcomes
(Arditi, Tokdemir 1999a, b; Arditi, Pulket 2005, 2010; Arditi et al.
1998; Chau 2007; Pulket, Arditi 2009a, b) rather than providing a
proactive dispute warning. Additionally, most studies examined the
relationship between the project owner and general contractor; however,
PPP projects involve many stakeholders, including the government,
participating private investors, and financial institutions. This study
intends to provide early dispute warnings by predicting when disputes
will occur based on preliminary project information.
For effective control of PPP projects and to design proactive
dispute management strategies, early knowledge of PPP project dispute
propensity is essential to provide the governmental PPP taskforce with
the information needed to implement a win-win resolution strategy and
even prevent disputes. Further, depending on possible dispute outcomes,
precautionary measures can be implemented proactively during project
execution. Additional preparation in preventive actions can prove
beneficial once a dispute occurs by reducing future effort, time, and
cost to multiple parties during dispute settlement processes.
To achieve this goal, this study compares different prediction
models using a series of machine learning techniques for predicting PPP
dispute likelihood and thereby eliminates future adverse impacts of
disputes on project delivery, operation, and transfer. Particularly,
this study uses single and hybrid machine learning techniques. The
single machine learning models are based on neural networks, decision
trees (DTs), support vector machines (SVMs), the naive Bayes classifier,
and k-nearest neighbor (k-NN). Two hybrid learning models are developed,
one combining clustering and classification techniques and the other
combining multiple classification techniques.
The rest of this paper is organized as follows. Section 1
thoroughly reviews artificial intelligence (AI) literature and its
accuracy in predicting conventional construction disputes and litigation
outcomes. Section 2 then introduces the single and hybrid
machine-learning schemes. Next, Section 3 discusses the experimental
setup and results from comparing the single and hybrid machine learning
techniques for dispute outcome prediction. Conclusions are drawn in the
final section, along with recommendations for future research.
1. Literature review
Management personnel typically benefit when the taskforce has a
decision-support tool for estimating dispute propensity and for early
planning of how disputes should be resolved before project initiation
(Marzouk et al. 2011). Several studies have attempted to minimize
construction litigation by predicting the outcomes of court decisions.
In Arditi et al. (1998), a network was trained using data from Illinois
appellate courts, and 67% prediction accuracy was obtained. Arditi et
al. (1998) argued that if the parties in a dispute know with some
certainty how a case will be resolved in court, the number of disputes
can be reduced markedly.
In another series of studies, AI techniques achieved superior
prediction accuracy with the same dataset: 83.33% in a case-based
reasoning study (Arditi, Tokdemir 1999b), 89.95% with boosted DTs
(Arditi, Pulket 2005), and 91.15% by integrated prediction modeling
(Arditi, Pulket 2010). These studies used AI to enhance prediction of
outcomes in conventional construction procurement litigation.
However, Chau (2007) determined that, other than in the above case
studies, AI techniques are rarely applied in the legal field. Thus, Chau
(2007) applied AI techniques based on particle swarm optimization to
predict construction litigation outcomes, a field in which new data
mining techniques are rarely applied. The network achieved an 80%
prediction accuracy rate, much higher than mere chance. Nevertheless,
Chau (2007) suggested that additional case factors, such as cultural,
psychological, social, environmental, and political factors, be used in
future studies to improve accuracy and reflect the real world.
For construction disputes triggered by change orders, Chen (2008)
applied a k-NN pattern classification scheme to identify potential
lawsuits based on a nationwide study of US court records. Chen (2008)
demonstrated that the k-NN approach achieved a classification accuracy
of 84.38%. Chen and Hsu (2007) further applied a hybrid artificial
neural network and case-based reasoning (ANN-CBR) model to a disputed
change order dataset to obtain early warning information about
construction claims. The classifier attained a prediction rate of 84.61%
(Chen, Hsu 2007).
Whereas numerous studies have used CBR and its variations to
identify similar dispute cases for use as references in dispute
settlements, Cheng et al. (2009) refined and improved the conventional
CBR approach by combining fuzzy set theory with a novel similarity
measurement that combines Euclidean distance and cosine angle distance.
Their model successfully extracted the knowledge and experience of
experts from 153 historical construction dispute cases collected
manually from multiple sources.
Generally, all previous studies focused on either specific change
order disputes or on conventional contracting projects using a single
accuracy performance measure. Characteristics and environments of
construction projects under the PPP strategy, however, differ markedly
from the general contractor and owner relationships and require machine
learning techniques with rigorous model performance measures to assist
governmental agencies in predicting disputes with excellent accuracy.
Since disputes always involve numerous complex and interconnected
factors and are difficult to rationalize, machine learning techniques are
now among the most effective methods for identifying hidden
relationships between available or accessible attributes and
dispute-handling methods (Arditi, Pulket 2005, 2010; Arditi, Tokdemir
1999a; El-Adaway, Kandil 2010; Kassab et al. 2010; Pulket, Arditi
2009b). Approaches based on machine learning are related to computer
system designs that attempt to resolve problems intelligently by
emulating human brain processes (Lee et al. 2008) and are typically used
to solve prediction or classification problems.
Researchers in various scientific and engineering fields have
recently combined different learning techniques to increase their
efficacy. Numerous studies have demonstrated that hybrid schemes are
promising applications in various industries (Arditi, Pulket 2010; Chen
2007; Chou et al. 2010, 2011; Kim, Shin 2007; Lee 2009; Li et al. 2005;
Min et al. 2006; Nandi et al. 2004; Wu 2010; Wu et al. 2009). However,
selecting the most appropriate combinations is difficult and time
consuming, such that further attempts are not worthwhile unless
significant improvements in accuracy are achieved. This study constructs
PPP project dispute-prediction models using single and hybrid machine
learning techniques.
2. Machine learning techniques
2.1. Classification techniques
2.1.1. Artificial neural networks
An ANN consists of information-processing units that resemble neurons
in the human brain, except that a neural network consists of artificial
neurons (Haykin 1999). Generally, a neural network is a group of neural
and weighted nodes, each representing a brain neuron; connections among
these nodes are analogous to synapses between brain neurons (Malinowski,
Ziembicki 2006).
Multilayer perceptron (MLP) neural networks are standard neural
network models. In an MLP network, the input layer contains a set of
sensory input nodes, one or more hidden layers contain computation
nodes, and an output layer contains computation nodes.
In a multilayer architecture, input vector $x$ passes through the
hidden layer of neurons in the network to the output layer. The weight
connecting input element $i$ to hidden neuron $j$ is $W_{ji}$, and the
weight connecting hidden neuron $j$ to output neuron $k$ is $V_{kj}$. The
net input of a neuron is derived by calculating the weighted sum of its
inputs, and its output is determined by applying a sigmoid function.
Therefore, for the $j$th hidden neuron:
$\mathrm{net}^h_j = \sum_{i=1}^{N} W_{ji} x_i$ and $y_j = f(\mathrm{net}^h_j)$, (1)
and for the $k$th output neuron:
$\mathrm{net}^o_k = \sum_{j=1}^{J+1} V_{kj} y_j$ and $o_k = f(\mathrm{net}^o_k)$. (2)
The sigmoid function $f(\mathrm{net})$ is the logistic function:
$f(\mathrm{net}) = \frac{1}{1 + e^{-\lambda\,\mathrm{net}}}$, (3)
where $\lambda$ controls the function gradient.
For a given input vector, the network produces an output $o_k$.
Each response is then compared to the known desired response of each
neuron, $d_k$. Weights in the network are modified continuously to
correct or reduce errors until total error from all training examples
stays below a pre-defined threshold.
For the output layer weights V and hidden layer weights W, update
rules are given by Eqs (4) and (5), respectively:
$V_{kj}(t+1) = V_{kj}(t) + c\lambda\,(d_k - o_k)\,o_k(1 - o_k)\,y_j(t)$; (4)
$W_{ji}(t+1) = W_{ji}(t) + c\lambda^2\,y_j(1 - y_j)\,x_i(t) \times \left(\sum_{k=1}^{K} (d_k - o_k)\,o_k(1 - o_k)\,V_{kj}\right)$. (5)
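As a rough illustration of Eqs (1)-(5), the following pure-Python sketch runs a forward pass and repeated weight updates on a tiny network. The layer sizes, initial weights, the value of the learning constant c, and all function names are illustrative assumptions rather than the configuration used in this study; bias nodes are omitted for brevity.

```python
import math

def sigmoid(net, lam=1.0):
    # Eq. (3): logistic function; lam controls the gradient
    return 1.0 / (1.0 + math.exp(-lam * net))

def forward(W, V, x, lam=1.0):
    # Eq. (1): hidden outputs y_j; Eq. (2): network outputs o_k
    y = [sigmoid(sum(w * xi for w, xi in zip(row, x)), lam) for row in W]
    o = [sigmoid(sum(v * yj for v, yj in zip(row, y)), lam) for row in V]
    return y, o

def update(W, V, x, d, c=0.5, lam=1.0):
    y, o = forward(W, V, x, lam)
    # Output error terms (d_k - o_k) o_k (1 - o_k)
    delta = [(dk - ok) * ok * (1.0 - ok) for dk, ok in zip(d, o)]
    # Eq. (5): hidden-layer update, computed with the old V
    new_W = [[W[j][i] + c * lam ** 2 * y[j] * (1.0 - y[j]) * x[i]
              * sum(delta[k] * V[k][j] for k in range(len(V)))
              for i in range(len(x))] for j in range(len(W))]
    # Eq. (4): output-layer update
    new_V = [[V[k][j] + c * lam * delta[k] * y[j] for j in range(len(y))]
             for k in range(len(V))]
    return new_W, new_V

def squared_error(W, V, x, d):
    _, o = forward(W, V, x)
    return sum((dk - ok) ** 2 for dk, ok in zip(d, o))

W = [[0.1, -0.2], [0.4, 0.3]]   # hypothetical: 2 inputs -> 2 hidden neurons
V = [[0.2, -0.1]]               # hypothetical: 2 hidden -> 1 output neuron
x, d = [1.0, 0.5], [1.0]        # one training example and its desired response
before = squared_error(W, V, x, d)
for _ in range(20):
    W, V = update(W, V, x, d)
after = squared_error(W, V, x, d)
```

In this sketch the squared error on the single training example shrinks as the updates are repeated, which is the behavior the stopping threshold described above relies on.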
2.1.2. Decision trees
DTs have a top-down tree structure, which splits data to create
leaves. In this study, the C4.5 classifier, a successor of the ID3
algorithm (Quinlan 1993), is used to construct a DT for classification.
A DT is constructed in which each internal node denotes a test of an
attribute and each branch represents a test outcome. Leaf nodes
represent classes or class distributions. The top-most node in a tree is
the root node with the highest information gain. After the root node,
the remaining attribute with the highest information gain is then chosen
as the test for the next node. This process continues until all
attributes are compared or no remaining attributes exist on which
samples may be further partitioned (Huang, Hsueh 2010; Tsai, Chen 2010).
Assume one case is selected randomly from a set $S$ of cases and
belongs to class $C_j$. The probability that an arbitrary sample
belongs to class $C_j$ is estimated by:
$p_j = \mathrm{freq}(C_j, S) / |S|$, (6)
where $\mathrm{freq}(C_j, S)$ is the number of samples in $S$ that belong
to class $C_j$ and $|S|$ is the number of samples in set $S$; thus, the
information it conveys is $-\log_2 p_j$ bits.
Suppose a probability distribution $P = \{p_1, p_2, \ldots, p_n\}$ is
given. The information conveyed by this distribution, also called
entropy of $P$, is then:
$\mathrm{Info}(P) = \sum_{i=1}^{n} -p_i \log_2 p_i$. (7)
If a set $T$ of samples is partitioned based on the value of a
non-categorical attribute $X$ into sets $T_1, T_2, \ldots, T_m$, then
the information needed to identify the class of an element of $T$
becomes the weighted average of information needed to identify the
class of an element of $T_i$, that is, the weighted average of
$\mathrm{Info}(T_i)$:
$\mathrm{Info}(X, T) = \sum_{i=1}^{m} \frac{|T_i|}{|T|} \times \mathrm{Info}(T_i)$. (8)
Information gain, $\mathrm{Gain}(X, T)$, is then derived as:
$\mathrm{Gain}(X, T) = \mathrm{Info}(T) - \mathrm{Info}(X, T)$. (9)
This equation represents the difference between information needed
to identify an element of T and information needed to identify an
element of T after the value of attribute X has been determined. Thus,
it is the gain in information due to attribute X.
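The entropy and information gain computations in Eqs (7)-(9) can be sketched in a few lines of Python. The toy attribute values and labels below are hypothetical and serve only to make the arithmetic visible; the function names are likewise assumptions of this sketch.

```python
import math
from collections import Counter

def info(labels):
    # Eq. (7): entropy of the class distribution in a set of samples
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_x(values, labels):
    # Eq. (8): weighted average entropy after partitioning on attribute X
    n = len(labels)
    parts = {}
    for v, y in zip(values, labels):
        parts.setdefault(v, []).append(y)
    return sum(len(p) / n * info(p) for p in parts.values())

def gain(values, labels):
    # Eq. (9): information gain due to attribute X
    return info(labels) - info_x(values, labels)

# Hypothetical toy data: PPP strategy versus dispute outcome
strategy = ["BOT", "BOT", "OT", "OT", "ROT", "ROT"]
dispute = ["yes", "yes", "no", "no", "yes", "no"]
g = gain(strategy, dispute)
```

Here the class labels are evenly split (entropy of 1 bit), the BOT and OT partitions are pure, and only the ROT partition remains mixed, so the gain works out to 1 - 1/3 = 2/3 bit.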
2.1.3. Support vector machines
SVMs, which were introduced by Vapnik (1998), perform binary
classification, that is, they separate a set of training vectors for two
different classes $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, where
$x_i \in R^d$ denotes vectors in a $d$-dimensional feature space and
$y_i \in \{-1, +1\}$ is a class label. The SVM model is generated by
mapping input vectors onto a new higher dimensional feature space
denoted as $\Phi : R^d \rightarrow H^f$, where $d < f$. In classification
problems, SVM identifies a separating hyperplane that maximizes the
margin between two classes. Maximizing the margin is a quadratic
programming problem, which can be solved from its dual problem by
introducing Lagrangian multipliers (Han, Kamber 2001; Tan et al. 2006;
Witten, Frank 2005). An optimal separating hyperplane in the new feature
space is then constructed by a kernel function $K(x_i, x_j)$, which is
the inner product of the mapped input vectors $x_i$ and $x_j$, where
$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$.
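The relation between a kernel and its feature map can be checked numerically. The sketch below uses a degree-2 polynomial kernel on two-dimensional inputs, chosen purely as an assumption because its feature map is small enough to write out; the radial basis function kernel is included for comparison. The function names and input vectors are hypothetical.

```python
import math

def phi(x):
    # Explicit feature map R^2 -> R^3 for the degree-2 polynomial kernel
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    # K(x, z) = (x . z)^2, computed without forming phi(x) or phi(z)
    return dot(x, z) ** 2

def rbf_kernel(x, z, gamma=1.0):
    # Radial basis function kernel with width parameter gamma
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product in the mapped space
implicit = poly_kernel(x, z)     # same value from the input space
```

Both routes give the same value, which is why the classifier never needs to compute the mapping explicitly.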
2.1.4. Naive Bayes classifier
The naive Bayes classifier requires all assumptions be explicitly
built into models that are then utilized to derive 'optimal'
decision/classification rules. This classifier can be used to represent
the dependence between random variables (features) and to generate a
concise and tractable specification of a joint probability distribution
for a domain (Witten, Frank 2005). The classifier is constructed using
training data to estimate the probability of each class, given feature
vectors of a new instance. For an example represented by feature vector
X, the Bayes theorem provides a method for computing the probability
that X belongs to class $C_i$, which is denoted as $p(C_i \mid X)$:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (10)
That is, the naive Bayes classifier determines the conditional
probability of each attribute $x_j$ ($j = 1, 2, \ldots, N$) of X given
class label $C_i$. Therefore, the classification problem can be stated
as follows: given a set of observed features $x_j$ from an instance X,
classify X into one class $C_i$.
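Because Eq. (10) is not reproduced in the source, the sketch below falls back on the standard textbook form of the naive Bayes rule, in which the score of class $C_i$ is proportional to $p(C_i) \prod_{j=1}^{N} p(x_j \mid C_i)$; this may differ in detail from the expression used by the authors. Add-one smoothing, the toy attributes, and the function names are all assumptions of this illustration.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    # Count class frequencies and per-class attribute value frequencies
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    values_seen = defaultdict(set)        # attribute index -> distinct values
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, c)][v] += 1
            values_seen[j].add(v)
    return class_counts, value_counts, values_seen

def predict_nb(model, row):
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                       # prior p(C_i)
        for j, v in enumerate(row):
            # p(x_j | C_i) with add-one smoothing (an assumption of this sketch)
            score *= (value_counts[(j, c)][v] + 1) / (n_c + len(values_seen[j]))
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical nominal attributes: (PPP strategy, government level)
rows = [("BOT", "central"), ("BOT", "central"), ("OT", "local"),
        ("OT", "municipal"), ("ROT", "local"), ("BOT", "local")]
labels = ["dispute", "dispute", "no dispute",
          "no dispute", "no dispute", "dispute"]
model = train_nb(rows, labels)
pred = predict_nb(model, ("BOT", "central"))
```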
2.1.5. k-Nearest neighbor
In pattern classification, the k-NN classifier is a conventional
non-parametric classifier (Bishop 1995). To classify an unknown instance
represented by a feature vector as a point in a feature space, the
k-NN classifier calculates distances between the point and points in a
training dataset. It then assigns the point to the majority class among
its k nearest neighbors (where k is an integer).
The k-NN classifier differs from the inductive learning approach
described previously; thus, it has also been called instance-based
learning (Mitchell 1997) or a lazy learner. That is, without off-line
training (i.e. model generation) the k-NN algorithm only needs to search
all examples of a given training dataset to classify a new instance.
Therefore, the primary computation of the k-NN algorithm is online
scoring of training examples to find the k-NNs of a new instance.
According to Jain et al. (2000), 1-NN can be conveniently used as a
benchmark for all the other classifiers since it achieves reasonable
classification performance in most applications.
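A minimal k-NN sketch follows. Euclidean distance and majority voting are assumptions here, since the text does not state which distance measure was used; the training points and the function name are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    # Rank training points by Euclidean distance to the query point
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    # Majority vote among the k nearest neighbors
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training points
train_X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
           (1.0, 1.0), (0.9, 1.2), (1.1, 0.8)]
train_y = ["no dispute", "no dispute", "no dispute",
           "dispute", "dispute", "dispute"]
near_origin = knn_predict(train_X, train_y, (0.1, 0.1), k=3)
near_one = knn_predict(train_X, train_y, (1.0, 0.9), k=1)
```

No model is built in advance: all the work happens inside `knn_predict` at query time, which is the lazy-learning behavior described above.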
2.2. Hybrid classification techniques
According to the literature, hybridization can improve the performance of single
classifiers. Hybrid systems can address relatively more complex tasks
because they combine different techniques (Lenard et al. 1998).
Generally, hybrid models are based on combining two or more machine
learning techniques (e.g. clustering and classification techniques).
According to Tsai and Chen (2010), two methods can be applied to
construct hybrid models for classification--the sequential combination
of clustering and classification techniques and the sequential
combination of different classification techniques. These two methods
are described as follows.
2.2.1. Clustering+Classification techniques
The method combining clustering and classification techniques uses
one clustering algorithm as the first component of the hybrid system.
This study uses the k-means clustering algorithm in combination with
classification techniques.
The k-means clustering algorithm, a simple and efficient clustering
algorithm, iteratively updates the means of data items in a cluster; the
stabilized value is then regarded as representative of that cluster. The
basic algorithm has the following steps (Hartigan, Wong 1979):
1. Randomly select k data items as cluster centers;
2. Assign each data item to the group that has the closest centroid;
3. When all data items have been assigned, recalculate the positions
of k centroids;
4. If no further change exists, end the clustering task; otherwise,
return to step 2.
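The four steps of the basic k-means algorithm can be sketched as follows. The seeded random generator, the empty-cluster handling, the iteration cap, and the toy points are assumptions added to keep the example reproducible; they are not details taken from this study.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: randomly select k data items as initial cluster centers
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each data item to the closest centroid
        new_assignment = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                          for p in points]
        # Step 4: stop when the assignments no longer change
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recalculate the positions of the k centroids
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return assignment, centers

# Hypothetical data with two well-separated groups
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
assignment, centers = kmeans(points, k=2)
```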
Therefore, clustering can be used as a pre-processing stage to
identify pattern classes for subsequent supervised classification.
Restated, the clustering result can be used for pre-classification of
unlabelled collections and to identify major populations in a given
dataset.
Alternatively, clustering can be used to filter out
unrepresentative data. That is, the data that cannot be clustered
accurately can be considered noisy data. Consequently, representative
data, which are not filtered out by the clustering technique, are used
during the classification stage.
Next, the classification stage is the same as that for training or
constructing a classifier. The clustering result becomes the training
dataset to train a classifier. After the classifier is trained, it can
classify new (unknown) instances.
Given a training dataset $D$, which contains $m$ training examples, the
aim of clustering is to "preprocess" $D$ for data reduction.
That is, the data correctly clustered by the clustering algorithm are
collected as $D'$, where $D'$ contains $n$ examples ($n < m$ and
$D' \subset D$). Then, $D'$ is used to train the classifier. Hence,
given a test dataset, the classifier can provide better classification
results than single classifiers trained with the original dataset $D$.
2.2.2. Classification + Classification techniques
Another hybrid approach combines multiple classification techniques
sequentially; that is, multiple classifiers are cascaded. As with the
combination of clustering and classification techniques, the first
classifier can be used to reduce the amount of data.
Two classification techniques are cascaded as follows: a training
dataset $D$, which contains $m$ training examples, is used to train and
test the first classifier. Notably, 100% classification accuracy is
generally not achieved. Therefore, the data correctly classified by the
first classifier are collected as $D'$, where $D'$ contains $o$
examples ($o < m$ and $D' \subset D$). Then, $D'$ is utilized to train
the second classifier. Again, the hybrid classifier could provide better
classification results than single classifiers trained with the original
dataset $D$ over a given test dataset.
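The cascade can be sketched generically: train the first classifier on D, keep only the examples it labels correctly (D'), and train the second classifier on D'. The two toy classifiers (nearest centroid and 1-NN), the data, and all names below are hypothetical stand-ins, not the models evaluated in this study.

```python
import math

def train_centroid(X, y):
    # Nearest-centroid classifier: one mean vector per class
    centroids = {}
    for label in set(y):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = tuple(sum(col) / len(members)
                                 for col in zip(*members))
    return lambda q: min(centroids, key=lambda c: math.dist(q, centroids[c]))

def train_1nn(X, y):
    # 1-NN classifier: return the label of the closest training point
    return lambda q: min(zip(X, y), key=lambda pair: math.dist(pair[0], q))[1]

def cascade(train_first, train_second, X, y):
    first = train_first(X, y)
    # Keep only the examples that the first classifier labels correctly (D')
    kept = [(x, t) for x, t in zip(X, y) if first(x) == t]
    X_kept = [x for x, _ in kept]
    y_kept = [t for _, t in kept]
    return train_second(X_kept, y_kept), len(kept)

# Hypothetical data; the last point is a mislabeled outlier
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
     (1.0, 1.0), (0.9, 1.1), (0.05, 0.1)]
y = ["no dispute", "no dispute", "no dispute",
     "dispute", "dispute", "dispute"]
second, n_kept = cascade(train_centroid, train_1nn, X, y)
single = train_1nn(X, y)
```

In this toy case the first stage drops the mislabeled outlier, so the cascaded 1-NN classifies a nearby query such as `(0.05, 0.12)` as "no dispute", whereas a single 1-NN trained on all of D follows the outlier.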
3. Modeling experiments
3.1. Experimental setup and design
3.1.1. The dataset
To demonstrate the accuracy and efficiency of the dispute
classification schemes, this study used PPP project data collected by
the TPCC, the authority overseeing infrastructure construction in
Taiwan, to construct classification models to predict dispute
likelihood. The study database contains 584 PPP projects overseen by the
TPCC during 2002-2009. Of 584 surveys issued, 569 were returned
completed, for a response rate of 97.4%. The questionnaire included
items to collect social demographic data of respondents, background
information, project characteristics, and project dispute resolutions.
Several projects had more than one dispute--one project had nine
disputes--at various project stages. Thus, the overall dataset comprised
data for $N = 645$ cases (i.e. $N_2 = 493$ cases without disputes and
$N_1 = 152$ dispute cases). Through expert feedback, project
attributes and their derivatives that were clearly relevant to the
prediction output of interest were identified by survey items. However,
quantitative techniques were still needed to construct and validate
hidden relationships between selected project predictors and the
response (output) variable.
Table 1 summarizes the statistical profile of categorical labels
and numerical ranges for study samples. For PPP-oriented procurement,
59.5% of projects were overseen by the central government. Over the last
eight years, most public construction projects have been for cultural
and education facilities (25.3%), sanitation and medical facilities
(20.8%), transportation facilities (18.1%), and major tourist site
facilities (10.5%). In accordance with economic planning and development
policy, 48.5% of projects were located in northern Taiwan.
Based on the standard industry definition, most private sector investment
was in the industrial (38.6%) and service (50.7%) sectors. In most cases
(91.0%), the government provided land and planned the facility to
attract investors.
The three major PPP strategies for delivering public services are
BOT (23.7%); operate and transfer (OT) (52.7%); and rehabilitate,
operate, and transfer (ROT) (23.6%). Specifically, the World Bank Group
(WBG 2011) defines the BOT scheme as a strategy in which a private
sponsor builds a new facility, operates the facility, and then transfers
the facility to the government at the end of the contract period. The
government typically provides revenue guarantees through long-term
take-or-pay contracts. When a private sponsor renovates an existing
facility, and then operates and maintains the facility at its own risk
for the contract period, the PPP strategy is ROT, according to WBG
(2011) classifications. Projects involving only management and lease
contracts are classified as OT projects.
Further, flagship infrastructure projects refer to those that are
important and generally large. Average project value was approximately
New Taiwan Dollar (NTD) 841 million (1 USD is approximately equal
to 30 NTD). Based on collected data, the overall procurement amount via
PPP was roughly NTD 543 billion. Mean capital investment by the
government and private sector per project was NTD 63.5 million and NTD
777.8 million, respectively. Notably, the average private capital
investment ratio was as high as 91.4%. The mean duration of licensed
facility operations was about 12 years (maximum, 60 years).
To assess the dependencies between categorized data, contingency
table analyses were compared between particular predictors and the
response variable via chi-square testing to infer relationships (Table
2).
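For readers unfamiliar with contingency table analysis, the Pearson chi-square statistic can be computed as in the following sketch. The 2 x 2 counts are hypothetical and are not taken from Table 2; the function name is an assumption.

```python
def chi_square(table):
    # Pearson chi-square statistic for an r x c contingency table
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows = predictor category, columns = dispute / no dispute
table = [[30, 10],
         [20, 40]]
stat, dof = chi_square(table)
# Critical value of the chi-square distribution at alpha = 0.05 with 1 degree of freedom
significant = stat > 3.841
```

With these counts every expected cell is 20 or 30, the statistic is 50/3 (about 16.67) on one degree of freedom, and the null hypothesis of independence would be rejected at the 5% level.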
All tests obtained statistically significant results at the 5%
alpha level except for two variables (planning and design; private
capital investment ratio, PCIR), for which the null hypothesis could not
be rejected, that is, no relationship was observed between the row
variable (input variable) and column variable (output variable). For
instance, among the dispute cases ($N_1 = 152$), the central
government had a higher probability of encountering disputes (67.1%
probability) than municipal (15.1%) and local governments (17.8%).
Particularly, in Nos. 1, 6, 7, 10, 11, 20 in type of public
construction and facility of Table 1, disputes occurred in 76.4% of
projects. Data show that 85.5% of disputes occurred in northern and
southern Taiwan. Interestingly, 92.1% of disputes occurred when the
government provided land and planned the facility, while only 2%
occurred when private investors provided land and designed the facility.
Among the three PPP strategies, the probability of disputes was higher
with BOT (49.3%) than with OT (32.2%) and ROT (18.4%). Notably, once a
project was legally promoted as a major infrastructure project, the
likelihood of a PPP dispute was 38.8%, lower than that for non-major
infrastructure projects (61.2%).
Moreover, once project value exceeded NTD 50 million, dispute
propensity was 4.33 times higher than that for projects valued at NTD
5-50 million and less than NTD 5 million. However, when private sector
investment exceeded 75%, dispute likelihood increased to 92.8%. Notably,
dispute patterns were not significantly related to licensed operating
period. Table 2 summarizes statistical results of cross-analysis.
3.1.2. Single baseline model construction
The single baseline models using classification techniques are
based on C4.5 DTs, the naive Bayes classifier, SVMs, neural network
classifier, and k-NN classifier.
Parameter settings for constructing the five baseline prediction
models are described as follows:
* DTs. The C4.5 DT is established and the confidence factor for
pruning the tree is set at 0.25. Parameters for the minimum number of
instances per leaf and amount of data used to reduce pruning errors are
2 and 3, respectively;
* ANN. This study uses the MLP classifier. To avoid overtraining,
this study constructs an MLP classifier by examining different parameter
settings to obtain an average accuracy for further comparisons.
Therefore, this study considers five different numbers of hidden nodes
and learning epochs. The numbers of hidden nodes are 8, 12, 16, 24, and
32 and those of learning epochs are 50, 100, 200, 300, and 500;
* Naive Bayesian classifier. In building the naive Bayes
classifier, this study uses supervised discretization to convert
numerical attributes into nominal attributes, which can increase model
accuracy. Additionally, the kernel estimator option is set as false
because some attributes are nominal;
* SVM. The complexity parameter, C, and tolerance parameter are set to
1.0 and 0.001, respectively. For the kernel function, the radial basis
function with a gamma value of 1 is used;
* k-NN classifier. Different k values are assessed in this study,
starting at 1 and increasing until the minimum error rate is reached.
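The settings listed above can be collected in a small configuration structure. The numeric values are quoted from the list; the key names are hypothetical, and pairing every hidden-node count with every epoch count is an assumption about how the MLP settings were combined.

```python
from itertools import product

# Values quoted from the parameter settings above; key names are hypothetical
settings = {
    "c45": {"confidence_factor": 0.25,
            "min_instances_per_leaf": 2,
            "pruning_data_amount": 3},
    "svm": {"C": 1.0, "tolerance": 0.001, "kernel": "rbf", "gamma": 1.0},
    "mlp": {"hidden_nodes": [8, 12, 16, 24, 32],
            "learning_epochs": [50, 100, 200, 300, 500]},
}

# Assumption: every hidden-node count is paired with every epoch count
mlp_grid = list(product(settings["mlp"]["hidden_nodes"],
                        settings["mlp"]["learning_epochs"]))

def average_accuracy(accuracies):
    # Mean accuracy across the evaluated MLP settings
    return sum(accuracies) / len(accuracies)
```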
When comparing the predictive performance of two or more methods,
researchers often use k-fold cross-validation to minimize bias
associated with random sampling of training and holdout data samples. As
cross-validation requires random assignment of individual cases into
distinct folds, a common practice is to stratify the folds. In
stratified k-fold cross-validation, the proportions of response labels
in the folds should approximate those in the original dataset.
Empirical studies show that, compared to traditional k-fold
cross-validation, stratified cross-validation reduces bias in comparison
results (Han, Kamber 2001). Kohavi (1995) further demonstrated that
10-fold cross-validation was optimal in terms of computing time and
variance.
Thus, this study uses stratified 10-fold cross-validation to assess
model performance. The entire dataset was divided into 10 mutually
exclusive subsets (or folds), with class distributions approximating
those of the original dataset (stratified). The subsets were extracted
using the following five steps:
1. Randomize the dataset;
2. Extract one tenth of the original dataset from the randomized
dataset (single fold);
3. Remove extracted data from the original dataset;
4. Repeat steps (1)-(3) eight times;
5. Assign the remaining portion of the dataset to the last fold
(10th fold).
After applying this procedure to obtain 10 distinct folds, each
fold was used once to test the single flat and hybrid classification
models while the remaining nine folds were used to train them, which
yielded 10 independent performance estimates. The cross-validation
estimate of overall accuracy was calculated by averaging the 10
individual accuracy measures.
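One common way to build stratified folds and average the per-fold accuracy is sketched below. It deals each class round-robin into the folds, which is an assumption and may differ from the extraction steps listed above; the class sizes are those reported for the study dataset, while the function names and the `evaluate` callback are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold keeps the class mix
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % k].append(idx)
        offset += len(indices)
    return folds

def cross_validate(folds, evaluate):
    # evaluate(train_idx, test_idx) returns the accuracy on the held-out fold
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, f in enumerate(folds) if j != i for idx in f]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / len(scores)

# Class sizes reported for the dataset: 152 dispute and 493 no-dispute cases
labels = ["dispute"] * 152 + ["no dispute"] * 493
folds = stratified_folds(labels)
```

Each fold then holds 64 or 65 cases, of which 15 or 16 are dispute cases, close to the 23.6% dispute share of the full dataset.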
3.1.3. Hybrid model construction
For the hybrid models combining clustering and classification
techniques, the k-means clustering algorithm is applied first as the
clustering stage. Notably, the k value was set to 3, 4, 5, and 6. As
dispute and no-dispute groups exist, there are two clusters out of k
corresponding to these two groups, which provide higher accuracy rates
than the other clusters. Then, they are selected as the clustering
result.
For the example of k-means (k = 4), four clusters are produced and
represented by $C_1$, $C_2$, $C_3$, and $C_4$ based on a
training dataset. According to the ground truth answer in the training
dataset, one can identify two of the four clusters, which can be well
'classified' into the dispute and no-dispute groups. The other
two clusters whose data are not well classified or difficult to classify
by k-means clustering are filtered out.
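The cluster selection step can be sketched as follows. Using cluster purity as the criterion for "well classified" and keeping only the examples whose label matches their selected cluster are assumptions of this illustration; the text does not specify the exact rule. The cluster assignments, labels, and function name are hypothetical.

```python
def select_representative(assignment, labels,
                          classes=("dispute", "no dispute")):
    clusters = sorted(set(assignment))
    kept_cluster = {}
    for cls in classes:
        def purity(c):
            # Share of cluster c whose ground-truth label is cls
            members = [l for a, l in zip(assignment, labels) if a == c]
            return members.count(cls) / len(members)
        # Pick the not-yet-used cluster that is purest for this class
        candidates = [c for c in clusters if c not in kept_cluster.values()]
        kept_cluster[cls] = max(candidates, key=purity)
    # Keep the examples that fall in the cluster selected for their class
    return [i for i, (a, l) in enumerate(zip(assignment, labels))
            if kept_cluster.get(l) == a]

# Hypothetical k = 4 clustering result for ten training examples
assignment = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
labels = ["dispute", "dispute", "dispute",
          "no dispute", "no dispute", "no dispute",
          "dispute", "no dispute", "no dispute", "dispute"]
kept = select_representative(assignment, labels)
```

In this toy case clusters 0 and 1 are retained and the mixed clusters 2 and 3 are filtered out, leaving six examples for the classification stage.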
Once the best k-means model is found, its clustered data (i.e. the
clustering result) are used to train the five single classifiers.
Notably, one specific clustering model for the 10 training datasets (by
10-fold cross validation) will yield 10 different clustering results.
That is, data in the two representative clusters, which can best
recognize the dispute and no-dispute groups using the 10 training
datasets, are not duplicated. Therefore, the final clustering result of
each k-means model is based on the union method for selecting dispute
and no-dispute data. The clustering result is then used as the new
training dataset to train the five baseline models.
Conversely, for the cascaded hybrid classifiers, the best baseline
classification model is identified after performing 10-fold cross
validation, that is, one of the C4.5 DT, naive Bayes classifier, SVM
classifier, k-NN classifier, and neural network classifier. The correctly
predicted data from the training set by the best baseline model are used
as new training data to train the five single baseline models.
3.1.4. Evaluation methods
To assess the performance of these single and hybrid prediction
models, prediction accuracy and Type I and II errors, that is,
false-positive and false-negative errors, are examined. Table 3 shows a
confusion matrix for calculating accuracy and error rates, which are
commonly used measures for binary classification (Ferri et al. 2009;
Horng 2010; Kim 2010; Sokolova, Lapalme 2009).
Prediction accuracy, which is defined as the percentage of records
predicted correctly by a model relative to the total number of records
among classification models, is a primary evaluation criterion. The
classification accuracy is derived by:
$\mathrm{Accuracy} = \frac{a + d}{a + b + c + d}$. (11)
Conversely, the Type I error is the error of rejecting a null
hypothesis when it is the true state. In this study, a Type I error
occurs when the model classifies a case from the event group into the
non-event group. The Type II error is defined as the error of not
rejecting a null hypothesis when an alternative hypothesis is the true
state; here, it occurs when the model classifies a case from the
non-event group into the event group.
Moreover, the Receiver Operating Characteristic (ROC) curves
reflect the ability of a classifier to avoid false classification. The
ROC curve can be summarized by a single value, the area under the curve
(AUC), in the analysis of model performance. As the distance between the
curve and reference line increases, test accuracy increases. The AUC,
sometimes referred to as balanced accuracy (Sokolova, Lapalme 2009), is
derived easily by Eq. (12):
$\mathrm{AUC} = \frac{1}{2}\left[\frac{a}{a + b} + \frac{d}{c + d}\right]$. (12)
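Eqs (11) and (12) translate directly into code. The sketch assumes that a and b are the correctly and incorrectly classified counts of one actual class and that d and c play the same roles for the other class, which is what the form of Eq. (12) implies; the layout of Table 3 is not reproduced here, so which class is the event group is left open. The counts and function names are hypothetical.

```python
def accuracy(a, b, c, d):
    # Eq. (11): share of all records predicted correctly
    return (a + d) / (a + b + c + d)

def balanced_accuracy(a, b, c, d):
    # Eq. (12): mean of the two per-class correct-classification rates
    return 0.5 * (a / (a + b) + d / (c + d))

def per_class_error_rates(a, b, c, d):
    # Share of each actual class that is misclassified
    return b / (a + b), c / (c + d)

# Hypothetical counts, not taken from the study's tables
a, b, c, d = 40, 10, 5, 45
acc = accuracy(a, b, c, d)
auc = balanced_accuracy(a, b, c, d)
err_first, err_second = per_class_error_rates(a, b, c, d)
```

With these counts the two per-class rates are 0.8 and 0.9, so both the accuracy and the balanced accuracy equal 0.85; the two measures diverge once the classes are of unequal size.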
3.2. Experimental results
Table 4 lists the prediction performance of the five single
classifiers, including their prediction accuracy, Type I and II errors,
and the ROC curve. Experimental results indicate that the DT classifier
performs best, providing the highest prediction accuracy at 83.72% and
the lowest Type II error rate at 5.07%. The MLP classifier performs
second best in prediction accuracy at 82.33%. Notably, the differences
among the individual models are significant at the 95% or 99%
confidence level by t-test for all the performance measures. Therefore,
for the hybrid models combining multiple classification techniques, the
DT and MLP classifiers are chosen as the first classifiers for
comparison.
Table 5 shows the prediction performance of the hybrid models
combining clustering and classification techniques; the performance
differences are significant at the 95% or 99% level by t-test. Notably,
k-means with four clusters (i.e. k = 4) is combined with the five
classifiers, since this setting performs best.
Analytical results demonstrate that the prediction models by hybrid
learning techniques perform better than the corresponding single
classification techniques in terms of prediction accuracy and the Type
II error (Tables 4 and 5). In particular, k-means + the DT classifier
performs best. However, the prediction accuracies of k-means + the MLP
and k-means + k-NN classifiers are very close to that of k-means + the
DT classifier; that is, the performance differences are less than 1
percentage point.
For hybrid models combining multiple classification techniques,
Tables 6 and 7 show the prediction performance of the MLP and the DT,
respectively, combined with each of the five classification techniques.
For all the techniques, the performance differences are significant at
the 95% or 99% level by t-test.
When using the MLP classifier as the first classifier, the MLP +
MLP classifier performs best in terms of prediction accuracy, Type I and
II errors, and the ROC curve, followed by the MLP + DT classifier. On
the other hand, when the DT classifier is used as the first classifier,
the DT + DT classifier achieves the highest prediction accuracy, lowest
Type I and II error rates, and best ROC curve. Again, these hybrid
models combining multiple classification techniques outperform single
classifiers.
To determine which method is superior, the best single and hybrid
models are compared statistically via analysis of variance (ANOVA).
Tables 8-10 present the ANOVA of average accuracy, Type I error, and
Type II error. The p-values indicate that the single, cluster +
classifier, and classifier + classifier models are statistically
different at the 1% or 5% alpha level, except for the p-value between
cluster + classifier and classifier + classifier. Notably, the F-values
show that the three models differ statistically in the performance
measures at either the 1% or 5% alpha level.
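For readers unfamiliar with the F-value, the minimal Python sketch below computes a one-way ANOVA F statistic as the ratio of between-group to within-group mean squares. It is an illustration under stated assumptions rather than the authors' procedure: the text does not say which observations entered the ANOVA or how the pairwise p-values were obtained, and the function name `one_way_anova_f` and the sample values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical accuracy samples (%) for three model families; not study data.
samples = [[81.0, 82.0, 83.0], [82.0, 83.0, 84.0], [85.0, 86.0, 87.0]]
f_value = one_way_anova_f(samples)
print(f"F = {f_value:.3f}")
```

With these hypothetical samples, the between-group mean square is 13 and the within-group mean square is 1, so F is about 13. The statistic would then be compared against an F distribution with 2 and 6 degrees of freedom to obtain a p-value.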
Moreover, Figures 1-4 compare the best single and hybrid learning
models in terms of prediction accuracy, Type I and Type II errors, and
the ROC curve, respectively. According to these comparison results, the
MLP + MLP classifier is the best prediction model, achieving the highest
prediction accuracy rate, lowest Type I and II error rates, and highest
ROC curve, followed by the DT + DT model. These results indicate that
hybrid learning models perform better than single learning models, and
that multiple classification techniques combined outperform clustering
and classification techniques combined.
[FIGURE 1 OMITTED]
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
Conclusions
Based on the spirit of partnership, Taiwan's governments
function as promoters by building and operating public infrastructure or
buildings with minimal out-of-pocket expense but full administrative
support. For government agencies, the advantages of identifying dispute
propensity early include reducing the time and effort needed to prepare
a rule set to prevent disputes by improving the understanding of
governments, private investors, and financial institutions of each side
in a potential dispute.
This study compares 20 different classifiers using single and
hybrid machine learning techniques. The best single model is the DT,
achieving a prediction accuracy of 83.72%, followed by the MLP at
82.33%. For hybrid models, the combination of the k-means clustering
algorithm and DT outperforms the combination of k-means and the other
single classification techniques, including SVMs, the naive Bayes
classifier, and k-NN by achieving a prediction accuracy of 85.05%.
Notably, all hybrid models (clustering + classification) perform better
than single models.
Moreover, the hybrid models combining multiple classification
techniques perform even better than the model combining k-means and a
DT. Specifically, the combinations of multiple MLP classifiers and of
multiple DT classifiers outperform the other hybrid models, achieving
prediction accuracies of 97.08% and 95.77%, respectively. Additionally,
combining MLP classifiers yields the best hybrid model, with the highest
prediction accuracy, lowest Type I and II error rates, and best ROC
curve.
This study comprehensively compared the effectiveness of various
machine learning techniques. Future work can focus on integration of
proactive strategy deployment and preliminary countermeasures in early
warning systems for PPP project disputes. Another fertile research
direction is the development of a second model for use once dispute
likelihood is identified. For dispute cases, such a model is needed to
predict which dispute category and which resolution methods are likely
to be used during which phases of a project's lifecycle by mapping
hidden classification or association rules.
doi: 10.3846/13923730.2013.768544
References
Arditi, D.; Pulket, T. 2005. Predicting the outcome of construction
litigation using boosted decision trees, Journal of Computing in Civil
Engineering ASCE 19(4): 387-393.
http://dx.doi.org/10.1061/(ASCE)0887-3801(2005)19:4(387)
Arditi, D.; Pulket, T. 2010. Predicting the outcome of construction
litigation using an integrated artificial intelligence model, Journal of
Computing in Civil Engineering ASCE 24(1): 73-80.
http://dx.doi.org/10.1061/(ASCE)0887-3801(2010)24:1(73)
Arditi, D.; Tokdemir, O. B. 1999a. Comparison of case-based
reasoning and artificial neural networks, Journal of Computing in Civil
Engineering ASCE 13(3): 162-169.
http://dx.doi.org/10.1061/(ASCE)0887-3801(1999)13:3(162)
Arditi, D.; Tokdemir, O. B. 1999b. Using case-based reasoning to
predict the outcome of construction litigation, Computer-Aided Civil and
Infrastructure Engineering 14(6): 385-393.
http://dx.doi.org/10.1111/0885-9507.00157
Arditi, D.; Oksay, F. E.; Tokdemir, O. B. 1998. Predicting the
outcome of construction litigation using neural networks, Computer-Aided
Civil and Infrastructure Engineering 13(2): 75-81.
http://dx.doi.org/10.1111/0885-9507.00087
Bishop, C. M. 1995. Neural networks for pattern recognition.
Oxford: Oxford University Press. 504 p.
Chau, K. W. 2007. Application of a PSO-based neural network in
analysis of outcomes of construction claims, Automation in Construction
16(5): 642-646. http://dx.doi.org/10.1016/j.autcon.2006.11.008
Chen, J.-H. 2008. KNN based knowledge-sharing model for severe
change order disputes in construction, Automation in Construction 17(6):
773-779. http://dx.doi.org/10.1016/j.autcon.2008.02.005
Chen, J.-H.; Hsu, S. C. 2007. Hybrid ANN-CBR model for disputed
change orders in construction projects, Automation in Construction
17(1): 56-64. http://dx.doi.org/10.1016/j.autcon.2007.03.003
Chen, K.-Y. 2007. Forecasting systems reliability based on support
vector regression with genetic algorithms, Reliability Engineering &
System Safety 92(4): 423-432.
http://dx.doi.org/10.1016/j.ress.2005.12.014
Cheng, M.-Y.; Tsai, H.-C.; Chiu, Y.-H. 2009. Fuzzy case-based
reasoning for coping with construction disputes, Expert Systems with
Applications 36(2): 4106-4113.
http://dx.doi.org/10.1016/j.eswa.2008.03.025
Chou, J.-S.; Chiu, C.-K.; Farfoura, M.; Al-Taharwa, I. 2011.
Optimizing the prediction accuracy of concrete compressive strength
based on a comparison of data mining techniques, Journal of Computing in
Civil Engineering ASCE 25(3): 242-253.
http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.0000088
Chou, J.-S.; Tai, Y.; Chang, L.-J. 2010. Predicting the development
cost of TFT-LCD manufacturing equipment with artificial intelligence
models, International Journal of Production Economics 128(1): 339-350.
http://dx.doi.org/10.1016/j.ijpe.2010.07.031
El-Adaway, I. H.; Kandil, A. A. 2010. Multiagent system for
construction dispute resolution (MAS-COR), Journal of Construction
Engineering and Management ASCE 136(3): 303-315.
http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000144
Ferri, C.; Hernandez-Orallo, J.; Modroiu, R. 2009. An experimental
comparison of performance measures for classification, Pattern
Recognition Letters 30(1): 27-38.
http://dx.doi.org/10.1016/j.patrec.2008.08.010
Han, J.; Kamber, M. 2001. Data mining: concepts and techniques. San
Francisco: Morgan Kaufmann Publishers. 744 p.
Hartigan, J. A.; Wong, M. A. 1979. Algorithm AS 136: a k-means
clustering algorithm, Applied Statistics 28(1): 100-108.
http://dx.doi.org/10.2307/2346830
Haykin, S. 1999. Neural networks: a comprehensive foundation. 2nd
ed. New Jersey: Prentice Hall. 842 p.
Horng, M.-H. 2010. Performance evaluation of multiple
classification of the ultrasonic supraspinatus images by using ML, RBFNN
and SVM classifiers, Expert Systems with Applications 37(6): 4146-4155.
http://dx.doi.org/10.1016/j.eswa.2009.11.008
Huang, C. F.; Hsueh, S. L. 2010. Customer behavior and decision
making in the refurbishment industry-a data mining approach, Journal of
Civil Engineering and Management 16(1): 75-84.
http://dx.doi.org/10.3846/jcem.2010.07
Jain, A. K.; Duin, R. P. W.; Mao, J. 2000. Statistical pattern
recognition: a review, IEEE Transactions on Pattern Analysis and Machine
Intelligence 22(1): 4-37. http://dx.doi.org/10.1109/34.824819
Kassab, M.; Hegazy, T.; Hipel, K. 2010. Computerized DSS for
construction conflict resolution under uncertainty, Journal of
Construction Engineering and Management ASCE 136(12): 1249-1257.
http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000239
Kim, H.-J.; Shin, K.-S. 2007. A hybrid approach based on neural
networks and genetic algorithms for detecting temporal patterns in stock
markets, Applied Soft Computing 7(2): 569-576.
http://dx.doi.org/10.1016/j.asoc.2006.03.004
Kim, Y. S. 2010. Performance evaluation for classification methods:
a comparative simulation study, Expert Systems with Applications 37(3):
2292-2306. http://dx.doi.org/10.1016/j.eswa.2009.07.043
Kohavi, R. 1995. A study of cross-validation and bootstrap for
accuracy estimation and model selection, in The International Joint
Conference on Artificial Intelligence, Montreal, Quebec, Canada: Morgan
Kaufmann, 1137-1143.
Lee, J.-R.; Hsueh, S.-L.; Tseng, H.-P. 2008. Utilizing data mining
to discover knowledge in construction enterprise performance records,
Journal of Civil Engineering and Management 14(2): 79-84.
http://dx.doi.org/10.3846/1392-3730.2008.14.2
Lee, M.-C. 2009. Using support vector machine with a hybrid feature
selection method to the stock trend prediction, Expert Systems with
Applications 36(8): 10896-10904.
http://dx.doi.org/10.1016/j.eswa.2009.02.038
Lenard, M. J.; Madey, G. R.; Alam, P. 1998. The design and
validation of a hybrid information system for the auditor's going
concern decision, Journal of Management Information Systems 14(4):
219-237.
Li, L.; Jiang, W.; Li, X.; Moser, K. L.; Guo, Z.; Du, L.; Wang, Q.;
Topol, E. J.; Wang, Q.; Rao, S. 2005. A robust hybrid between genetic
algorithm and support vector machine for extracting an optimal feature
gene subset, Genomics 85(1): 16-23.
http://dx.doi.org/10.1016/j.ygeno.2004.09.007
Malinowski, P.; Ziembicki, P. 2006. Analysis of district heating
network monitoring by neural networks classification, Journal of Civil
Engineering and Management 12(1): 21-28.
Marzouk, M.; El-Mesteckawi, L.; El-Said, M. 2011. Dispute
resolution aided tool for construction projects in Egypt, Journal of
Civil Engineering and Management 17(1): 63-71.
http://dx.doi.org/10.3846/13923730.2011.554165
Min, S.-H.; Lee, J.; Han, I. 2006. Hybrid genetic algorithms and
support vector machines for bankruptcy prediction, Expert Systems with
Applications 31(3): 652-660.
http://dx.doi.org/10.1016/j.eswa.2005.09.070
Mitchell, T. 1997. Machine learning. New York: McGraw Hill. 432 p.
Nandi, S.; Badhe, Y.; Lonari, J.; Sridevi, U.; Rao, B. S.; Tambe, S.
S.; Kulkarni, B. D. 2004. Hybrid process modeling and optimization
strategies integrating neural networks/support vector regression and
genetic algorithms: study of benzene isopropylation on Hbeta catalyst,
Chemical Engineering Journal 97(2-3): 115-129.
http://dx.doi.org/10.1016/S1385-8947(03)00150-5
PCC. 2011. Engineering evaluation forum of PPP strategy (in
Chinese) [online]. Public Construction Commission, Executive Yuan [cited
5 May 2011]. Available from Internet:
http://ppp.pcc.gov.tw/PPP/frontplat/search/showViews.do?indexID=0&PK=1002.
Pulket, T.; Arditi, D. 2009a. Construction litigation prediction
system using ant colony optimization, Construction Management and
Economics 27(3): 241-251. http://dx.doi.org/10.1080/01446190802714781
Pulket, T.; Arditi, D. 2009b. Universal prediction model for
construction litigation, Journal of Computing in Civil Engineering ASCE
23(3): 178-187.
http://dx.doi.org/10.1061/(ASCE)0887-3801(2009)23:3(178)
Quinlan, J. R. 1993. C4.5: programs for machine learning. San
Francisco: Morgan Kaufmann. 302 p.
Sokolova, M.; Lapalme, G. 2009. A systematic analysis of
performance measures for classification tasks, Information Processing
and Management 45(4): 427-437.
http://dx.doi.org/10.1016/j.ipm.2009.03.002
Tan, P.-N.; Steinbach, M.; Kumar, V. 2006. Introduction to data
mining. London: Pearson Education, Inc. 769 p.
Tsai, C.-F.; Chen, M.-L. 2010. Credit rating by hybrid machine
learning techniques, Applied Soft Computing 10(2): 374-380.
http://dx.doi.org/10.1016/j.asoc.2009.08.003
Vapnik, V. N. 1998. Statistical learning theory. New York: John
Wiley and Sons. 736 p.
WBG. 2011. [online]. The World Bank Group [cited 5 April 2011].
Available from Internet:
http://ppi.worldbank.org/resources/ppi_glossary.aspx.
Witten, I. H.; Frank, E. 2005. Data mining: practical machine
learning tools and techniques. 2nd ed. San Francisco: Morgan Kaufmann.
664 p.
Wu, C.-H.; Tzeng, G.-H.; Lin, R.-H. 2009. A novel hybrid genetic
algorithm for kernel function and parameter optimization in support
vector regression, Expert Systems with Applications 36(3): 4725-4735.
http://dx.doi.org/10.1016/j.eswa.2008.06.046
Wu, Q. 2010. The hybrid forecasting model based on chaotic mapping,
genetic algorithm and support vector machine, Expert Systems with
Applications 37(2): 1776-1783.
http://dx.doi.org/10.1016/j.eswa.2009.07.054
Jui-Sheng Chou (a), Chih-Fong Tsai (b), Yu-Hsin Lu (c)
(a) Department of Construction Engineering, National Taiwan
University of Science and Technology, 43, Sec. 4, Keelung Rd, Taipei,
106, Taiwan (R.O.C.)
(b) Department of Information Management, National Central
University, No 300, Jhongda Rd Jhongli City, Taoyuan County, 32001,
Taiwan
(c) Department of Accounting, Feng Chia University, 100, Wenhwa Rd.
Seatwen, Taichung 40724, Taiwan
Received 27 Jul. 2011; accepted 20 Jan. 2012
Corresponding author: Jui-Sheng Chou
E-mail: jschou@mail.ntust.edu.tw
Jui-Sheng CHOU. He received his Bachelor's and Master's
degrees from National Taiwan University, and PhD in Construction
Engineering and Project Management from The University of Texas at
Austin. Chou is a professor in the Department of Construction
Engineering at National Taiwan University of Science and Technology. He
has over a decade of practical experience in engineering management and
consulting services for the private and public sectors. He is a member
of several international and domestic professional organizations. His
teaching and research interests primarily involve Project Management
(PM) related to knowledge discovery in databases (KDD), data mining,
decision, risk & reliability, and cost management.
Chih-Fong TSAI. He received a PhD at School of Computing and
Technology from the University of Sunderland, UK in 2005. He is now an
associate professor at the Department of Information Management,
National Central University, Taiwan. He has published more than 50
technical publications in journals, book chapters, and international
conference proceedings. He received the Highly Commended Award (Emerald
Literati Network 2008 Awards for Excellence) from Online Information
Review, and the award for top 10 cited articles in 2008 from Expert
Systems with Applications. His current research focuses on multimedia
information retrieval and data mining.
Yu-Hsin LU. She received her PhD in Accounting and Information
Technology from National Chung Cheng University, Taiwan. She is an
assistant professor at the Department of Accounting, Feng Chia
University, Taiwan. Her research interests focus on data mining
applications and financial information systems.
Table 1. Project attributes and their descriptive statistics
Attribute: Data range, categorical label or statistical description
Input variables
Type of government agency in charge: Central authority (59.5%); Municipality (11.5%); Local government (29%)
Type of public construction and facility:
  1: Transportation facilities (18.1%);
  2: Common conduit (0%);
  3: Environmental pollution prevention facilities (2.3%);
  4: Sewerage (1.1%);
  5: Water supply facilities (0.5%);
  6: Water conservancy facilities (2.5%);
  7: Sanitation and medical facilities (20.8%);
  8: Social welfare facilities (3.9%);
  9: Labor welfare facilities (1.2%);
  10: Cultural and education facilities (25.3%);
  11: Major tour-site facilities (10.5%);
  12: Power facilities (0%);
  13: Public gas and fuel supply facilities (0%);
  14: Sports facilities (3.3%);
  15: Parks facilities (2.5%);
  16: Major industrial facilities (0.5%);
  17: Major commercial facilities (1.9%);
  18: Major hi-tech facilities (0.2%);
  19: New urban development (0%);
  20: Agricultural facilities (5.6%)
Project location: North (48.5%); Center (21.2%); South (24.5%); East (5.3%); Isolated island (0.5%)
Executive authority: Central authority (36.0%); Municipality (36.1%); Local government (27.9%)
Type of invested private sector: Standard industry classification-Primary (0.2%); Secondary (38.6%); Tertiary (50.7%); Quaternary (10.5%)
Planning and design unit: Government provides land and plans facility (91.0%); Government provides land and private investor designs facility (5.9%); Private provides land and designs facility (3.1%)
PPP contracting strategy: BOT (23.7%); OT (52.7%); ROT (23.6%)
Major public infrastructure/facility: Promoted as major public infrastructure/facility in PPP Act (80.1%); Not major infrastructure/facility (19.9%)
Project scale: Range: 0-60,000,000; Sum: 5.43E8; Mean: 841337.1776; Standard deviation: 3.52061E6 (Thousand NTD; USD:NTD is about 1:30 as of Apr. 2011)
Government capital investment: Range: 0-9,600,000; Sum: 40,975,392.41; Mean: 63527.7402; Standard deviation: 5.11192E5 (Thousand NTD)
Private capital investment amount: Range: 0-60,000,000; Sum: 5.02E8; Mean: 777809.4374; Standard deviation: 3.32433E6 (Thousand NTD)
Private capital investment ratio (PCIR): Range: 0-100; Mean: 91.4729; Standard deviation: 25.42269 (%)
Licensed operations duration: Range: 0-60; Mean: 11.9778; Standard deviation: 13.39007 (Year)
Output variable
Dispute propensity: No dispute occurred (76.4%); Dispute occurred (23.6%)
Table 2. Contingency table and chi-square test results for dispute cases
Project attributes                                                p-value  Dispute occurred (%)
Agency                                                            0.002
  Central authority                                                        67.1
  Municipality                                                             15.1
  Local government                                                         17.8
Type of public construction                                       0.000
  Transportation facilities                                                10.5
  Water conservancy facilities                                             9.9
  Sanitation and medical facilities                                        17.1
  Cultural and education facilities                                        13.2
  Major tour-site facilities                                               14.5
  Agricultural facilities                                                  11.2
Planning and design                                               0.657
  Government provides land and plans facility                              92.1
  Government provides land and private investor designs facility           5.9
  Private investor provides land and designs facility                      2.0
PPP strategy                                                      0.000
  BOT                                                                      49.3
  OT                                                                       32.2
  ROT                                                                      18.4
Major public infrastructure                                       0.000
  No                                                                       61.2
  Yes                                                                      38.8
Project scale (Thousand NTD)                                      0.000
  < 5,000                                                                  15.8
  5,000-50,000                                                             15.8
  > 50,000                                                                 68.4
PCIR (%)                                                          0.057
  < 25                                                                     3.3
  25-50                                                                    0.0
  50-75                                                                    3.9
  > 75                                                                     92.8
LOD (Year)                                                        0.000
  < 5                                                                      19.7
  5-10                                                                     23.0
  10-15                                                                    5.9
  15-20                                                                    13.8
  > 20                                                                     37.5
Table 3. Confusion matrix
                 Predicted positive  Predicted negative
Actual positive  a (tp)              b (fn)
Actual negative  c (fp)              d (tn)
Table 4. Prediction accuracy of single classifiers
Model        Accuracy (%)  Type I error (%)  Type II error (%)  ROC curve  Ranking by accuracy
MLP          82.33         44.08             9.53               0.781      2
DT           83.72         52.63             5.07               0.712      1
Naive Bayes  78.91         63.82             7.91               0.720      4
SVMs         79.53         69.74             5.27               0.625      5
k-NN         80.93         29.17             13.59              0.768      3
t-value      91.62 **      7.20 **           5.27 *             26.24 **
* Represents the level of significance is higher than 95% by t-test.
** Represents the level of significance is higher than 99% by t-test.
Table 5. Prediction performance of combined clustering and classification techniques
Model                  Accuracy (%)  Type I error (%)  Type II error (%)  ROC curve  Ranking by accuracy
k-means + MLP          84.66         59.78             5.67               0.749      2
k-means + DT           85.05         64.13             4.26               0.692      1
k-means + Naive Bayes  82.72         68.48             6.14               0.720      5
k-means + SVM          82.33         94.56             0.95               0.522      4
k-means + k-NN         84.66         41.30             9.69               0.764      2
t-value                149.06 **     7.65 **           3.77 *             15.80 **
* Represents the level of significance is higher than 95% by t-test.
** Represents the level of significance is higher than 99% by t-test.
Table 6. Prediction performance of the MLP and classification techniques combined
Model              Accuracy (%)  Type I error (%)  Type II error (%)  ROC curve  Ranking by accuracy
MLP + MLP          97.08         8.82              2.08               0.987      1
MLP + DT           97.08         16.18             1.04               0.923      2
MLP + Naive Bayes  91.61         35.29             4.58               0.918      5
MLP + SVM          96.53         13.24             2.08               0.923      4
MLP + k-NN         96.90         10.29             2.08               0.946      3
t-value            90.22 **      3.49 *            4.04 *             73.08 **
* Represents the level of significance is higher than 95% by t-test.
** Represents the level of significance is higher than 99% by t-test.
Table 7. Prediction performances of the DT and classification techniques combined
Model             Accuracy (%)  Type I error (%)  Type II error (%)  ROC curve  Ranking by accuracy
DT + MLP          93.12         32.53             2.48               0.826      3
DT + DT           95.77         16.87             2.07               0.957      1
DT + Naive Bayes  88.36         51.81             4.75               0.853      4
DT + SVM          87.83         61.45             3.72               0.674      5
DT + k-NN         94.89         18.72             2.89               0.907      2
t-value           55.75 **      4.09 *            6.66 **            17.58 **
* Represents the level of significance is higher than 95% by t-test.
** Represents the level of significance is higher than 99% by t-test.
Table 8. ANOVA analysis of average accuracy of three methods (p value)
Method                    Cluster + Classifiers  Classifier + Classifiers  Single classifiers
Cluster + Classifiers     1.000                  0.000 *                   0.112
Classifier + Classifiers                         1.000                     0.000 *
Single classifiers                                                         1.000
F-value                   82.689 *
* Represents the level of significance is higher than 99% by t-test or
F-test.
Table 9. ANOVA analysis of Type I error of three methods (p value)
Method                    Cluster + Classifiers  Classifier + Classifiers  Single classifiers
Cluster + Classifiers     1.000                  0.001 **                  0.412
Classifier + Classifiers                         1.000                     0.014 *
Single classifiers                                                         1.000
F-value                   12.824 **
* Represents the level of significance is higher than 95% by t-test.
** Represents the level of significance is higher than 99% by t-test
or F-test.
Table 10. ANOVA analysis of Type II error of three methods (p value)
Method                    Cluster + Classifiers  Classifier + Classifiers  Single classifiers
Cluster + Classifiers     1.000                  0.290                     0.299
Classifier + Classifiers                         1.000                     0.021 *
Single classifiers                                                         1.000
F-value                   5.427 *
* Represents the level of significance is higher than 95% by t-test or
F-test.