Model for predicting the quality of a recruit in the BPO sector.
Mani, Vijaya
Introduction
For high growth organizations, attracting, hiring and retaining the
right talent is critical. Induction of a wrong person could adversely
affect the progress of an organization and impair the public image of
that company. The recruitment specialist in the Human Resource (HR)
department should be aware of the status and objectives of the company
and is responsible for assessing the suitability of a hire and for
retaining quality talent and productive recruits. Quality talent could
be defined as an employee who possesses valuable knowledge and has the
ability to apply his or her skills to meet the needs of the company.
Quality talent has always been scarce, even during the employers'
market of the past 50 years. Different types of people excel at
different companies, and not all workers want the same ambience. Many
organizations evaluate their hiring performance using cost-per-hire as
a metric. However, this statistic provides only a skewed and inaccurate
view of success. Hired manpower should not only be inexpensive; the
individual should also fit the assigned role and grow in it, which is
what makes the hire meaningfully inexpensive. Cost-per-hire does not
reflect the latter aspect and is rather insensitive to the quality of
the recruit. Hence, other metrics that measure the quality of the hire,
such as performance reviews, hiring manager satisfaction, retention and
productivity, should be used as indicators. Thus, in order to index the
quality of a hire, many parameters need to be taken into consideration.
In principle, a composite index into which these factors are built
could be developed using the method of discriminant analysis.
Discriminant Analysis
Discriminant analysis was originally developed in 1936 by R.A.
Fisher. It is a simple classification method, and the accuracy of its
predictions is comparable with that of more complex modern methods. It
can be used only for classification (i.e., with a categorical target
variable), not for regression. The target variable may have two or more
categories. In the procedure called Fisher Linear Discriminant Analysis
a "two-group discriminant analysis" is carried out. In this procedure
group membership, coded as 1 and 2, is treated as the dependent
variable in a multiple regression analysis on the predictors. The
results obtained are analogous to those obtained in the simple
discriminant analysis. It is general practice to fit the two-group case
to the linear equation (1),
Group = a + b1*x1 + b2*x2 + ... + bm*xm (1)
where a is a constant, b1 to bm are regression coefficients and x1 to
xm are the predictor variables. The interpretation is straightforward:
the variables with the largest (standardized) regression coefficients
are the ones that contribute most to the prediction of group
membership.
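As a minimal sketch of the two-group case, the discriminant weights can also be computed directly in Fisher's form, as the inverse of the pooled within-group covariance matrix multiplied by the difference of the group means. The two-predictor data below (a voice score and a grammar score) are invented purely for illustration and are not the study's records; the function names are likewise hypothetical.

```python
def mean_vector(rows):
    """Column means of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 within-group sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(group1, group2):
    """Weights w = inverse(pooled covariance) * (mean1 - mean2)."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter(group1, m1), scatter(group2, m2)
    dof = len(group1) + len(group2) - 2
    pooled = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]


def mean_score(rows, w):
    """Average discriminant score w1*x1 + w2*x2 over a group."""
    return sum(w[0] * r[0] + w[1] * r[1] for r in rows) / len(rows)


# Invented toy data: (voice score, grammar score) for each hire.
good = [(16, 8), (15, 7), (17, 9), (14, 8)]
bad = [(13, 6), (14, 7), (12, 7), (13, 5)]

w = fisher_direction(good, bad)
print("weights:", [round(v, 3) for v in w])
print("mean score, good:", round(mean_score(good, w), 3))
print("mean score, bad:", round(mean_score(bad, w), 3))
```

The weights play the role of b1 ... bm in equation (1) up to a scaling factor, and the group with the higher mean score is the one coded first.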
In this work such a discriminant analysis is used to classify 220
hired candidates from a BPO in Chennai into two groups, potentially
successful or unsuccessful, that is, into the values of a dichotomous
categorical dependent variable.
Methodology
This study addresses the problem of hired candidates failing to
show the expected performance and not entering the productive
workforce. The project set out to gather advance knowledge about hire
quality so that the organization can plan its resources for grooming
new entrants. It required a thorough understanding of the process that
an employee (a voice based customer service executive) goes through,
from entering the organization as a trainee until he/she enters the
productive workforce. The parameters available at the time of the
candidate's entry were used. Company records and manuals were the
source of the secondary data. The responses obtained from 224 voice
based customer service executives of a BPO's Technical Help desk
process servicing a U.K. telecom giant were analyzed. These were then
used to build a predictive discriminant equation. The research design
used for the study was descriptive. The major purpose of such a design
is to describe the state of affairs as it prevails. The main
characteristic of this method is that it is not influenced by the
subjectivity of the researcher. Based on this analysis, a predictive
model was developed to classify a candidate as a good or bad hire.
Data Analysis and Interpretation
Classification
The given sample was classified into good hires and bad hires by
assigning them code 1 and code 2 respectively. A good hire was taken
to be a member who successfully completes the training and becomes a
productive member of the organization, while a bad hire was one who
fails to clear the training process. About 70 percent of the total
sample surveyed was used to build the model, which was then tested on
the remaining 30 percent.
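A 70/30 split of this kind can be sketched as follows. The article does not say how cases were assigned to the two parts, so the random shuffle, the seed and the function name are assumptions; the sample size of 224 is the figure given in the Methodology section.

```python
import random


def split_sample(case_ids, build_fraction=0.70, seed=0):
    """Shuffle the case ids and cut them into a model-building part and a holdout part."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(len(shuffled) * build_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical case ids 1..224, one per surveyed executive.
build_cases, holdout_cases = split_sample(range(1, 225))
print(len(build_cases), len(holdout_cases))
```

With 224 cases the model-building part holds 156 cases, which equals the number of cases tallied in Table-VIII.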
Discriminating Variables
The traits that were used as selection criteria at the time of
recruitment were used as the discriminating variables. These
independent variables, called predictors, were source of recruitment,
age, gender, educational qualification, relevant experience, voice
evaluation, technical score, grammar, aptitude and technical evaluation.
Description of Variables
The source of recruitment is the variable that indicates the
various sources that were used to identify potential candidates for
the recruitment and selection procedure. The age of the candidates
analyzed varied from 21 years to 35 years. Since the job demands
significant proficiency in speaking and a clear voice, the evaluation
of voice was considered to be an important discriminating variable. The
work experience of the employee at the time of hiring was reflected by
the relevant experience variable. The academic proficiency of the
employee was indexed using the score obtained in the written technical
test. The language skills of the hire were evaluated by performance in
the test on grammar. The intelligence quotient of the hire was
evaluated at the time of recruitment using an aptitude test, and this
variable was also used as an evaluation parameter. The specific
knowledge possessed by a hire in a given area was judged using the
technical evaluation. Table-I displays descriptive statistics for each
variable across groups and for the total sample. Since discriminant
analysis assumes equal variances, the standard deviations should not
vary greatly across groups. The table shows that this condition is
satisfied, as the standard deviations of the two groups are similar.
Table-II displays the eigenvalues, the percentage of variance, the
cumulative percentage, and the canonical correlation for each canonical
variable (or canonical discriminant function). The eigenvalue, also
called the characteristic root of each discriminant function, reflects
the relative importance of the dimensions that classify cases of the
dependent variable. For two-group DA, there is one discriminant function
and one eigenvalue (0.371), which accounts for 100 percent of the
explained variance. Canonical correlation values close to 1 indicate a
strong correlation between the discriminant scores and the groups. The
value obtained here (0.520) indicates a moderate correlation.
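The two figures in Table-II are linked: for a single discriminant function the canonical correlation equals the square root of eigenvalue / (1 + eigenvalue). The short check below uses only the eigenvalue reported in Table-II; the variable names are illustrative.

```python
import math

eigenvalue = 0.371  # Table-II

# Canonical correlation implied by the eigenvalue of a discriminant function.
canonical_correlation = math.sqrt(eigenvalue / (1.0 + eigenvalue))

# Its square is the share of score variance explained by group membership.
explained_share = canonical_correlation ** 2

print(round(canonical_correlation, 3), round(explained_share, 3))
```

The result agrees with the tabulated 0.520 and shows that group membership accounts for roughly 27 percent of the variance in the discriminant scores.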
Table-III reports the Wilks' Lambda, which is used to test the
significance of the discriminant function as a whole. The "Sig." column
gives the significance level of that test. The researcher wants a
finding of significance, and the smaller the lambda, the more likely it
is significant. A significant lambda means one can reject the null
hypothesis that the two groups have the same mean discriminant function
scores and conclude that the model is discriminating. A small
significance value (less than, say, 0.10) indicates that the group
means differ. Here the lambda is 0.730 with a significance of 0.000, so
the means of the Good hire and Bad hire groups differ.
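The Wilks' Lambda in Table-III can be cross-checked against the eigenvalue in Table-II, and its significance can be approximated with Bartlett's chi-square statistic. This is a sketch under stated assumptions: the analysis sample is taken to be the 156 cases tallied in Table-VIII, the ten predictors listed earlier are all in the function, and the article does not say which test its software applied.

```python
import math

eigenvalue = 0.371       # Table-II
reported_lambda = 0.730  # Table-III
n_cases = 156            # assumption: cases tallied in Table-VIII
n_predictors = 10
n_groups = 2

# With a single discriminant function, lambda = 1 / (1 + eigenvalue).
lambda_from_eigenvalue = 1.0 / (1.0 + eigenvalue)

# Bartlett's approximation: chi2 = -(N - 1 - (p + g) / 2) * ln(lambda).
chi_square = -(n_cases - 1 - (n_predictors + n_groups) / 2) * math.log(reported_lambda)
degrees_of_freedom = n_predictors * (n_groups - 1)


def chi_square_upper_tail(x, even_df):
    """Upper-tail probability of a chi-square variable; closed form valid for even df only."""
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(even_df // 2))
    return math.exp(-half) * terms


p_value = chi_square_upper_tail(chi_square, degrees_of_freedom)
print(round(lambda_from_eigenvalue, 3), round(chi_square, 2), degrees_of_freedom, p_value)
```

Under these assumptions the p-value is far below 0.001, which is consistent with the "Sig." entry of 0.000.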
The discriminant analysis algorithm requires an a priori probability
(the probability, before analysis) of a given case belonging to each of
the groups. Table-IV gives the probabilities assigned according to the
group sizes in the sample data: 0.622 for a good hire and 0.378 for a
bad hire. Table-V displays the coefficients of the canonical variable.
These coefficients are used to compute canonical variable scores for
each case.
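Priors proportional to group size can be reproduced from the group counts. The counts used here (97 good hires and 59 bad hires) are the row totals of Table-VIII; the dictionary layout is only illustrative.

```python
# Observed group sizes, taken from the "Total" column of Table-VIII.
group_counts = {"Good Hire": 97, "Bad Hire": 59}

total_cases = sum(group_counts.values())

# Prior probability of each group = its share of the sample.
priors = {group: count / total_cases for group, count in group_counts.items()}

for group, prior in priors.items():
    print(group, round(prior, 3))
```

The rounded shares match the priors listed in Table-IV.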
The decision rule obtained from Table-V is as follows:
Y = 1.231 * (Source of Recruitment) - 0.071 * (Age) +
0.165 * (Gender) - 0.337 * (Edu Qual) + 0.270 * (Rel Ex)
+ 0.287 * (Voice) + 0.058 * (Tech) + 0.190 * (Grammar) -
0.061 * (Aptitude) + 0.116 * (Tech Eval) - 8.754 (2)
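The decision rule can be evaluated for any case once its ten predictor values are known. As a sketch, the code below scores a profile made of the Good-group means from Table-I, so no candidate data are invented. The coefficients are copied from Table-V, including the positive sign that the table shows for Tech evaluation; the dictionary keys are illustrative labels.

```python
# Canonical discriminant function coefficients as listed in Table-V.
COEFFICIENTS = {
    "source_of_recruitment": 1.231,
    "gender": 0.165,
    "education": -0.337,
    "relevant_exp": 0.270,
    "voice_evaluation": 0.287,
    "technical_score": 0.058,
    "grammar": 0.190,
    "aptitude": -0.061,
    "tech_evaluation": 0.116,
    "age": -0.071,
}
CONSTANT = -8.754

# Means of the Good group as listed in Table-I.
GOOD_GROUP_MEANS = {
    "source_of_recruitment": 1.47,
    "gender": 2.05,
    "education": 1.79,
    "relevant_exp": 15.54,
    "voice_evaluation": 15.73,
    "technical_score": 34.14,
    "grammar": 7.72,
    "aptitude": 52.73,
    "tech_evaluation": 3.72,
    "age": 24.44,
}


def discriminant_score(case):
    """Y = constant + sum of coefficient * predictor value."""
    return CONSTANT + sum(COEFFICIENTS[name] * case[name] for name in COEFFICIENTS)


score_at_good_means = discriminant_score(GOOD_GROUP_MEANS)
print(round(score_at_good_means, 3))
```

The score at the Good-group means comes out at about 0.43, close to the Good hire centroid of 0.472 in Table-VI; the small gap is what one would expect from coefficients and means rounded to two or three decimals.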
Functions at group centroids are the mean discriminant scores for
each category of the dependent variable on each discriminant function.
Two-group discriminant analysis has two centroids, one for each group.
Here the two groups are Good hire and Bad hire. The distance between
the centroids indicates how well the model is discriminating
(Table-VI). The classification table, also called a classification
matrix, or a confusion, assignment, or prediction matrix, is used to
assess the performance of the discriminant analysis. In Table-VIII the
rows are the observed categories of the dependent variable and the
columns are the predicted categories. To obtain a classification score
for each case for each group, each coefficient in Table-VII is
multiplied by the value of the corresponding variable, the products are
summed, and the sum is added to the constant. A case is assigned to the
group for which its classification score is largest.
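The assignment rule based on classification functions can be sketched with the coefficients of Table-VII. The profile classified here is again the set of Good-group means from Table-I, used only because it is the one complete profile the article reports; the names are illustrative.

```python
VARIABLES = [
    "source_of_recruitment", "gender", "education", "relevant_exp",
    "voice_evaluation", "technical_score", "grammar", "aptitude",
    "tech_evaluation", "age",
]

# Classification function coefficients and constants from Table-VII.
CLASSIFICATION_FUNCTIONS = {
    "Good": ([19.651, -0.334, 26.982, 26.803, 2.941,
              -0.143, 1.853, 2.024, 6.959, 5.574], -409.305),
    "Bad": ([18.116, -0.540, 27.402, 26.466, 2.582,
             -0.216, 1.617, 2.100, 6.815, 5.662], -399.071),
}

# Good-group means from Table-I, in the same order as VARIABLES.
good_group_means = [1.47, 2.05, 1.79, 15.54, 15.73, 34.14, 7.72, 52.73, 3.72, 24.44]


def classification_scores(values):
    """One score per group: constant + sum of coefficient * value."""
    scores = {}
    for group, (coefficients, constant) in CLASSIFICATION_FUNCTIONS.items():
        scores[group] = constant + sum(c * v for c, v in zip(coefficients, values))
    return scores


def classify(values):
    """Assign the case to the group with the largest classification score."""
    scores = classification_scores(values)
    return max(scores, key=scores.get)


scores = classification_scores(good_group_means)
print({group: round(score, 2) for group, score in scores.items()})
print(classify(good_group_means))
```

For this profile the Good function exceeds the Bad function by a little over one point, so the case is assigned to the Good group.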
Table-VIII presents the degree of success of the classification for
this sample. The number and percentage of cases correctly classified and
misclassified are displayed. Overall, 76.9 percent of the cases are
correctly classified. The results for cross-validated cases are given
below the original classification results; here 73.7 percent of the
cases are correctly classified. The original results may provide overly
optimistic estimates. Cross-validation attempts to remedy this problem.
With cross-validation, each case in the analysis is classified by the
functions derived from all cases other than that case.
Figure 1 shows the distribution of employees across the recruitment
sources. One-fourth (25 percent) of the recruitment is done through job
fairs. Figure 2 shows the age groups of the employees in our
population. Three-fourths (75 percent) of the population belong to the
age group of 21-25 years. Figure 3 shows the gender split of the
population considered for this study: 59 percent male and 41 percent
female, so the gender divide is modest. Figure 4 shows the
qualifications the employees possess. In the population studied, the
largest group of candidates (47.8 percent) are engineering graduates,
whereas holders of other bachelor's degrees (28.6 percent) number about
three-fifths of the engineering graduates. Figure 5 shows the share of
candidates with prior work experience in the same industry. The
majority (82 percent) of the candidates have no prior work experience,
whereas only a few (18 percent) do.
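The hit rates quoted for Table-VIII follow directly from the counts in that table. The sketch below recomputes the overall and per-group percentages for the original and the cross-validated classification; the nested-dictionary layout and function names are illustrative choices.

```python
# Counts from Table-VIII: outer key = observed group, inner key = predicted group.
original = {"Good": {"Good": 82, "Bad": 15}, "Bad": {"Good": 21, "Bad": 38}}
cross_validated = {"Good": {"Good": 80, "Bad": 17}, "Bad": {"Good": 24, "Bad": 35}}


def overall_hit_rate(matrix):
    """Percentage of all cases whose predicted group equals the observed group."""
    correct = sum(matrix[group][group] for group in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return 100.0 * correct / total


def group_hit_rates(matrix):
    """Percentage correctly classified within each observed group."""
    return {group: 100.0 * row[group] / sum(row.values()) for group, row in matrix.items()}


print(round(overall_hit_rate(original), 1), group_hit_rates(original))
print(round(overall_hit_rate(cross_validated), 1), group_hit_rates(cross_validated))
```

In both versions good hires are recognized more reliably than bad hires, which is worth keeping in mind when the model is used to flag candidates who may need extra support.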
Conclusion
A predictive model to identify a good hire was developed. It has
an accuracy of 76.9 percent. The influence of each of the predictors,
or independent variables, in constructing the decision rule was found.
The model can be used to identify a good or a bad hire in a new case.
The decision rule obtained is as follows:
Y = 1.231 * (Source of Recruitment) - 0.071 * (Age) + 0.165 *
(Gender) - 0.337 * (Edu Qual) + 0.270 * (Rel Ex) +
0.287 * (Voice) + 0.058 * (Tech) + 0.190 * (Grammar) - 0.061 *
(Aptitude) + 0.116 * (Tech Eval) - 8.754
This model can be used by the organization to improve the quality
of hire. The predictions from the model give an idea of the quality of
the resources in hand. The result can be used in forming teams for
training. The outcome, i.e. the prediction of the model, can be used as
a basis for focusing on different candidates according to their
predicted performance. The analysis can be extended with a larger
amount of data to arrive at a more robust model that can be implemented
in the organization.
Vijaya Mani
Professor,
SSN School of Management and
Computer Applications,
Kalavakkam,
Tamil Nadu.
Table--I
Group Statistics
Good/Bad  Discriminating Variable  Mean   Std. Deviation
Good      Source of recruitment     1.47  0.502
          Gender                    2.05  1.460
          Education                 1.79  0.407
          Relevant exp             15.54  0.890
          Voice evaluation         15.73  2.124
          Technical score          34.14  5.150
          Grammar                   7.72  1.463
          Aptitude                 52.73  4.438
          Tech evaluation           3.72  1.935
          Age                      24.44  2.327
Bad       Source of recruitment     1.19  0.393
          Gender                    1.85  1.186
          Education                 1.81  0.393
          Relevant exp             15.39  0.616
Table--II
Eigen Values
Function  Eigen Value  Percent of Variance  Cumulative Percent  Canonical Correlation
1         0.371        100.0                100.0               0.520
Table--III
Wilks' Lambda
Test of Function(s)  Wilks' Lambda  Sig.
1                    0.730          0.000
Table--IV
Prior Probabilities for Groups
Good/Bad   Prior
Good Hire  0.622
Bad Hire   0.378
Total      1.000
Table--V
Canonical discriminant function
coefficients
Discriminant Variable Function
Source of recruitment 1.231
Gender .165
Education -.337
Relevant exp .270
Voice evaluation .287
Technical score .058
Grammar .190
Aptitude -.061
Tech evaluation .116
Age -.071
Constant -8.754
Table--VI
Functions at Group Centroids
Good/Bad Function
Good hire .472
Bad hire -.776
Table--VII
Classification Function Coefficients
Independent Variables    Good      Bad
Source of recruitment    19.651    18.116
Gender                   -.334     -.540
Education                26.982    27.402
Relevant exp             26.803    26.466
Voice evaluation         2.941     2.582
Technical score          -.143     -.216
Grammar                  1.853     1.617
Aptitude                 2.024     2.100
Tech evaluation          6.959     6.815
Age                      5.574     5.662
(Constant)               -409.305  -399.071
Table--VIII
Classification of results
                                        Predicted Group Membership
                              Good/Bad  Good   Bad    Total
Original             Count    Good      82     15     97
                              Bad       21     38     59
                     Percent  Good      84.5   15.5   100.0
                              Bad       35.6   64.4   100.0
Cross-validated (a)  Count    Good      80     17     97
                              Bad       24     35     59
                     Percent  Good      82.5   17.5   100.0
                              Bad       40.7   59.3   100.0
Figure--1
Source Recruitment
Advertisement      16%
Consultancy        18%
Employee referral  17%
Jobfair            25%
Campus             14%
Networking         1%
Walkin             9%
Note: Table made from pie chart.
Figure--2
Age Group
31-35 years  4%
26-30 years  21%
21-25 years  75%
Note: Table made from pie chart.
Figure--3
Gender of Employees
Female  41%
Male    59%
Note: Table made from pie chart.
Figure--4
Education Qualification
B.E/B. TECH              48%
BCOM, BCA, BSC, BBA, BA  29%
MSC                      8%
MCA                      8%
MBA                      4%
OTHER PGs                1%
DIPLOMA                  2%
Note: Table made from pie chart.
Figure--5
Relevant Experience
yes 18%
No 82%
Note: Table made from pie chart.