Microsegmentation in telecom market: data mining approach.
Bach, M. Pejic; Simicevic, V.; Leskovic, D. et al.
1. Introduction
The development of many industries would not have flourished
without the support of information and communication technology.
The telecommunication industry uses information and communication technology
as a support for providing telecommunication services but also as a
support for business processes. The support of business processes is
realized in the form of: (1) transaction information systems which
follow regular business activities and generate standardized reports,
and (2) support systems for the decision-making process which enable
intelligent use of data stored in the databases with the aim of making
quality decisions. Data mining, as a part of the support system for the
decision-making process, enables many applications in the field of
telecommunications. The most frequent are the following:
telecommunication market analysis (Costea, 2006), preventing clients
from shifting to other companies (Lejeune, 2001; Hung et al., 2006),
sale of additional services to existing customers (Malabocchia et al.,
1998), assessment of the client's values (Daskalaki et al., 2003),
as well as market segmentation.
In telecommunication companies, for the purpose of segmentation of
the industrial market, the most frequently used variables include the
location and the size of the revenue realized from the sale of
telecommunication services. The aim of this paper is to present a case
study on the segmentation of the industrial market in a
telecommunication company by means of cluster analysis. The business
users' data were applied as a sample and the approach of dynamic
market microsegmentation is suggested on the basis of the data for each
individual client.
The paper is organized as follows. After the introduction, the
second section explains data mining, while the cluster analysis
methodology is presented in section three. The discovery of market
segments in a telecommunication company is covered in section four,
which is followed by the concluding remarks.
2. Data mining
Data mining is the process of discovering new knowledge from
existing databases of an organisation by using statistical methods and
methods of artificial intelligence. Technically, it is the process of
finding correlations or patterns among dozens of fields in large
relational databases. Its application started in the nineties when
powerful enough computer processors as well as memories for the
implementation of the data mining process became affordable.
The data mining process consists of four steps (Baragoin et al.,
2001), shown in Figure 1.
In the first step, a business problem is defined. The second step
is data preparation and it consists of determining the necessary data,
transformation and sampling as well as data evaluation. The third step
is modelling, which relates to the choice of the mining method and model
construction and evaluation. The fourth step consists of implementation,
which includes interpretation and the use of data. The data mining
process is iterative, which means that at any moment it is possible to
return to one of the previous steps. Such a "jump back" will
be the rule rather than the exception, because the most important task
in data mining is to define the problem well and to choose and prepare
the data appropriately, which can be difficult to achieve at first.
On the other hand, during the data mining process the knowledge of the
business problem and of the data deepens, and the resulting "revised"
definition of the business problem is often better than the original
one.
[FIGURE 1 OMITTED]
3. Cluster analysis methodology
The objective of conducting a cluster analysis is to discover if
members of the dataset can be classified as pertaining to one of a small
number of types. This can be especially important for marketing managers
who want to discover what constitutes a market segment in a
telecommunication company.
Cluster analysis is conducted with the aim of assigning data
points (sequences) to reasonably homogeneous groups (clusters). The
main task in cluster analysis is to determine how many clusters are
to be used (Cattrell, 1998). If the number of clusters is too high,
the dissimilarity within each cluster will be low, but the clusters
might be very specific, so the result of such an analysis cannot be
easily interpreted and generalized. If the number of clusters is too
low, the dissimilarity within each cluster will be high and such
clusters cannot produce new and useful information. We therefore
have to be aware that there is no single correct number of clusters, yet
a decision on the number of clusters still has to be made. In order to
additionally determine in what way the identified clusters differ from
each other, descriptive statistics methods and techniques were used.
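The trade-off between the number of clusters and the dissimilarity within them can be illustrated with a minimal sketch. The nine one-dimensional values and the three hand-made partitions are hypothetical and are not taken from the study. Using the within-cluster sum of squares as the dissimilarity measure is likewise an assumption.

```python
from statistics import fmean

def within_cluster_ss(clusters):
    """Sum of squared deviations of every point from its own cluster mean."""
    total = 0.0
    for cluster in clusters:
        centre = fmean(cluster)
        total += sum((x - centre) ** 2 for x in cluster)
    return total

# Hypothetical one-dimensional data, partitioned by hand into 1, 3 and 9 clusters.
values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
partitions = {
    1: [values],
    3: [values[0:3], values[3:6], values[6:9]],
    9: [[v] for v in values],
}
wcss = {k: within_cluster_ss(p) for k, p in partitions.items()}

for k, value in wcss.items():
    print(f"{k} cluster(s): within-cluster sum of squares = {value:.1f}")
```

With one cluster the dissimilarity is large, and with nine clusters it drops to zero but every "segment" is a single point. The three-cluster partition sits between the two extremes.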
In order to classify the users into segments, the non-hierarchical
clustering method "K-means" was applied. The analysis of variance
(ANOVA) was performed in order to determine whether the differences
between the average values of Internet revenue and revenue from fixed
telephony across the individual clusters are statistically
significant. In order to determine between which clusters the
statistically significant difference exists, a post-hoc analysis by means
of the Scheffe test was applied.
4. Discovery of market segments in a telecommunication company
To describe the discovery of market segments in databases, a case
study involving a telecommunication operator is used. It presents the
segmentation approach used so far as well as the proposed approach,
which is based on the discovery of market segments in databases. The
segmentation of the industrial market is analysed.
4.1 Existing criteria in the industrial market segmentation
The telecommunication operator from the case study uses a basic
market segmentation based on two demographic criteria: location
and the size of the user (the total annual revenue from the user). Based
on the location criterion, the market of the Republic of Croatia is
divided into four geographic regions. The industrial market is divided
into five market segments based on the users' size, measured by the
total annual revenue gained. The market segmentation is
implemented once a year. One should note that a calendar year is too
long a period for static segments to remain valid. In the course of a
year, a large number of legal entities register with the company, which
means a large number of new users of telecommunication services in
both the private and the business sector. Additionally, the market for
new services is very dynamic. New services are offered and some existing
ones lose their importance. The users buy new services and new
solutions, thus changing their position towards the telecommunication
operator.
4.2 Approach to the dynamic microsegmentation of industrial market
The presented approach to industrial market segmentation, which
changes only once per calendar year, is not dynamic enough to capture
either the changes in the business activities of business subjects
or the changes in the telecommunications market. An analysis that uses
variables beyond the location and the size of the user revenue is
therefore presented. The analysis is based
on the following variables: (1) total telecommunications revenue from
the users, (2) coefficient of revenue size from users, (3) potential of
the user's branch of economic activity, (4) ICT potential, (5)
compactness of the relationship between a user and the telecommunication
operator and (6) loyalty coefficient. A database of 2000 business users
was analysed. Descriptive statistics methods and techniques were used.
4.2.1 Total telecommunications revenue from the users (APRUnet)
The total telecommunications revenue from the user's company
(APRUnet) is defined as the sum of the values of all the transactions
that an individual user realizes with a telecommunications operator
during one calendar year (the price it pays for all the
products/services).
The average revenue that the company realized from the users in the
sample amounts to KN 29.869,49, with a standard deviation of KN
65.417,02, which is more than twice the average. Such a high standard
deviation indicates that the average is not representative of the total
revenue from the users. The median, KN 15.742,76, is therefore a more
suitable measure: half of the users generate total revenue below the
median and half generate total revenue above it. The minimum
total revenue from a user is KN 0, which means that the
database also includes users that no longer use the company's
services, and the maximum revenue from a user in the sample amounts to
KN 1.216.892.
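Why the median is preferred over the average for such skewed revenue data can be seen in a small sketch. The seven revenue figures below are hypothetical and are not drawn from the operator's database. They only mimic the pattern of many small users and one very large account.

```python
from statistics import mean, median, stdev

# Hypothetical annual revenues (KN) for seven users; a single very large
# account skews the distribution, as in the sample described above.
revenues = [0.0, 4_000.0, 9_000.0, 15_000.0, 18_000.0, 25_000.0, 600_000.0]

avg = mean(revenues)    # pulled upwards by the single large account
med = median(revenues)  # unaffected by the size of the largest value
sd = stdev(revenues)    # sample standard deviation

print(f"average = {avg:.2f}, median = {med:.2f}, standard deviation = {sd:.2f}")
```

In this sketch the standard deviation exceeds the average, and the median stays close to what a typical user spends.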
4.2.2 Potential of the user's branch of economic activity
For each activity defined by the National Classification of
Activities (NKD), an assessment has been performed by the
telecommunication operator on a scale from 0 to 5, with
the following values assigned:
0--no data on the company's activity
1--The activity does not represent a potential for the
telecommunication operator at all
2--The activity does not represent a potential for the
telecommunication operator
3--The activity represents a medium potential for the
telecommunication operator
4--The activity represents a high potential for the
telecommunication operator
5--The activity represents a very high potential for the
telecommunication operator
In the sample of analysed companies, 5.26% perform
activities with very low potential, 14.20% activities with low
potential, 52.05% activities with medium potential, 13.04%
activities with high potential and 15.11% activities with very high
potential. The data on the activity are missing for only 0.35% of the
companies (Table 2).
4.2.3 ICT potential
ICT potential is defined on the basis of the ICT coefficient. The ICT
coefficient is an indicator of the level of development of a particular
user in the field of information technology and it is based on the
elements listed after equation (1).
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (1)
The elements of the ICT coefficient are: (1)
k_ICT--coefficient of information and communication technology, (2)
n_p--number of branch offices, (3) n_z--number of employees,
(4) ARPU_net--total telecommunication revenue from the company,
(5) ARPU_fix--total revenue from fixed technology and (6) vc--total
number of voice channels. The ICT potential expresses the previously
defined ICT coefficient as a grade from 0 to 5, using the ranges
presented in Table 3. This variable was used even though half of
the companies have no defined ICT potential, because it indicates
the potential for using advanced telecommunication services, which are
attractive from the point of view of profit. In total, 9.9% of the
companies have a very low potential, 13.1% of the companies have a low
potential, 5.5% of the companies have a medium potential, 8.7% of the
companies have a high potential and 12.7% of the companies have a very
high potential. The results are presented in Table 3.
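The grading of the ICT coefficient can be sketched as a simple lookup. The function name `ict_potential` is hypothetical. The thresholds follow the ranges of Table 3, under the assumption that the last range ("100>") means a coefficient of 100 or more, and that a coefficient of 0 or a missing value means "no data".

```python
def ict_potential(k_ict):
    """Map an ICT coefficient to a grade from 0 to 5 using the Table 3 ranges.

    Assumptions: None or 0 means "no data"; the top range starts at 100;
    non-integer coefficients fall into the range below the next threshold.
    """
    if k_ict is None or k_ict <= 0:
        return 0  # no data
    if k_ict < 2:
        return 1  # very low (coefficient 1)
    if k_ict < 9:
        return 2  # low (2-8)
    if k_ict < 20:
        return 3  # medium (9-19)
    if k_ict < 100:
        return 4  # high (20-99)
    return 5      # very high (100 or more)

grades = [ict_potential(k) for k in (0, 1, 8, 9, 19, 20, 99, 100)]
print(grades)
```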
4.2.4 Compactness of the relationship between a user and the
telecommunication operator
The stability of the relationship between a user and the
telecommunication operator is mostly based on the data from the history
of their relation. In principle, this element is a product of
multiplication of all services and the days of their use:
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (2)
The following elements were used for the calculation of the
compactness of the relationship: (1) f_1--the first service used
by the user, (2) t_1--the period of use of the first service
(in days) and (3) n--the total number of telecommunication services used
by the user. It is important to note that signing an agreement is
compulsory for almost all telecommunication services, so the databases
contain, for each user, the starting date of the use of individual
services.
Table 4 contains the data on the compactness of the relationship
between a user and the telecommunication operator from the sample. For
10.6% of the users the data are missing, for 5.4% of the users the
compactness is very low, for 11.3% of the users the compactness is low,
in 52.3% of the cases the compactness is average, in 18.6% of the cases
the compactness is high and for only 1.8% of the users the compactness
of the relationship is very high.
4.2.5 Loyalty coefficient
The loyalty coefficient is the ratio of the number of voice
channels possessed by the competitive companies (vcc) and the total
number of the voice channels used by the individual user (vc):
loyalty coefficient = vcc / vc (3)
Table 5 contains the data on loyalty coefficients. Loyalty
has been defined in the following way: (1) the most loyal users have a
loyalty coefficient of 0, with 100% of channels with this operator and no
channels with the competition, (2) quite loyal users have a loyalty
coefficient in the range from 0.01 to 0.25, (3) averagely loyal users
have a loyalty coefficient in the range from 0.26 to 0.60, (4) hardly
loyal users in the range from 0.61 to 0.89 and (5) unloyal ones in the
range from 0.90 to 1.00, meaning that they have between 90% and 100% of
channels with the competition.
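The loyalty coefficient and the loyalty levels defined above can be expressed as a minimal sketch. The function names are hypothetical. Rounding the coefficient to two decimals before applying the ranges, and treating a user with no recorded voice channels as "no data", are assumptions not stated in the paper.

```python
def loyalty_coefficient(vcc, vc):
    """Share of a user's voice channels held by competitors (vcc / vc)."""
    if vc <= 0:
        return None  # assumption: no recorded channels means missing data
    return vcc / vc

def loyalty_level(coefficient):
    """Classify a loyalty coefficient using the ranges listed in the text."""
    if coefficient is None:
        return "no data"
    c = round(coefficient, 2)  # assumption: ranges apply to two-decimal values
    if c == 0:
        return "most loyal"
    if c <= 0.25:
        return "quite loyal"
    if c <= 0.60:
        return "averagely loyal"
    if c <= 0.89:
        return "hardly loyal"
    return "unloyal"

# Hypothetical users: (channels with the competition, total channels).
examples = [(0, 10), (1, 10), (5, 10), (8, 10), (9, 10), (0, 0)]
levels = [loyalty_level(loyalty_coefficient(vcc, vc)) for vcc, vc in examples]
print(levels)
```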
The vast majority of the sample (79.6%) consists of the most loyal
users, possessing 100% of channels with this operator and no channels
with the competition. Hardly loyal are 2.1% of the users (the loyalty
coefficient is in the range from 0.61 to 0.89), averagely loyal are 1.3%
and only 0.5% of the users are quite loyal (the loyalty coefficient in
the range from 0.01 to 0.25), while 4.5% of them are unloyal. The data
are missing for 12.0% of the users.
4.3 Construction and evaluation of the market segmentation model
In order to classify the users into segments a method of
non-hierarchical clustering "K-means" was applied. This method
of classifying objects into groups is especially appropriate for the
discovery of market segments, whereby the objects are grouped on the
basis of measured characteristics.
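The idea behind K-means can be shown with a minimal pure-Python sketch of the standard iterative procedure (Lloyd's algorithm). It is not the SPSS implementation used in the study, whose initialisation and stopping rules are not reproduced here. The six users, their two grades and the initial centres are hypothetical.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, recompute each centroid as
    the mean of its members, and repeat until the assignment stops changing."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical users described by (ICT potential, compactness) grades.
users = [(1, 1), (1, 2), (2, 1), (5, 4), (4, 5), (5, 5)]
labels, centres = kmeans(users, centroids=[users[0], users[3]])
print(labels, centres)
```

The number of clusters is fixed in advance by the number of initial centres. This is why the choice of that number is discussed as a separate decision.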
In applying cluster analysis it is necessary to decide on two
important issues: (1) the variables which will be included in the
analysis and (2) the number of clusters. The programme package SPSS 17.0
and the option "Classify - K-means Cluster" were used, and the mentioned
variables were chosen together with the values for all companies
contained in the database. In segmentation, the decision on the chosen
variables and the number of clusters is based on ANOVA as well
as on the subjective assessment of an expert who knows the situation in
the market. The cluster analysis is implemented in three steps, which
are described in detail.
4.3.1 Cluster analysis with chosen variables in three clusters
In this step a cluster analysis was applied with the previously chosen
variables in three clusters. The average values of the variables in the
individual clusters are shown in Table 6. Cluster 1 contains the
companies with the highest total annual revenue, the highest coefficient
of the revenue size, the highest ICT potential and the highest
compactness of the relationship; their potential of the branch of
economic activity is approximately the same as in Cluster 2. Cluster
2 contains the companies with the lowest total annual revenue, but these
companies have the highest potential of the branch of economic activity.
Cluster 3 comprises the companies belonging to the "golden
mean": the values of almost all the variables are higher than
those in Cluster 2 but lower than those in Cluster 1.
4.3.2 Cluster analysis with chosen variables in four clusters
In this step, a cluster analysis was performed with the previously
chosen variables in four clusters (Table 7). The companies with the
highest total revenue, the highest coefficient of the revenue size, ICT
potential and compactness of the relationship are contained in
Cluster 1. The companies with the lowest total annual revenue are
contained in Cluster 4, and these companies also have the lowest
average values of all other variables, except the potential of the
branch of economic activity. Cluster 3 contains the companies which have
about half of the total annual revenue of the companies from
Cluster 1, while the average values of their other variables are quite
similar to those of the companies in Cluster 1. Cluster 2
contains the companies which have considerably lower total annual
revenue than Cluster 1 and Cluster 3. The average values of their other
variables are also lower, but still higher than the values of the
companies in Cluster 4.
4.3.3 Cluster analysis in four clusters and without variables
"Total revenue" and "Potential of the branch of economic
activity"
Due to its extremely high values, the variable "Total
revenue" decreased the influence of the other variables, and the
variable "Potential of the branch of economic activity" has
approximately the same values for all the existing clusters. Therefore,
in this step a cluster analysis was performed in four clusters, whereby
the two previously mentioned variables were omitted. The average values
of the remaining variables in the individual clusters are presented in
Table 8.
Cluster 1 contains the companies which have an average compactness
of the relationship, very low revenue and low ICT potential. Cluster 2
represents the companies with high compactness of the relationship but
also with high revenue and average ICT potential. Cluster 3 includes the
companies with low ICT potential as well as low compactness of the
relationship and low revenue. Cluster 4 contains the companies with
the highest revenue and low ICT potential as well as low compactness of
the relationship.
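The reason given in this step for omitting "Total revenue" can be checked with simple arithmetic. The sketch below takes the Cluster 1 and Cluster 2 averages from Table 6 and computes how much of the squared Euclidean distance between them comes from the revenue variable alone. Treating the two rows of averages as two points, and using the squared Euclidean distance, are assumptions made for illustration.

```python
# Cluster averages from Table 6 (three-cluster solution), in the order:
# total annual revenue (KN), coefficient of the revenue size, potential of
# the branch of economic activity, ICT potential, compactness.
cluster_1 = [956195.64, 5.00, 3.00, 5.00, 5.00]
cluster_2 = [22711.67, 1.92, 3.21, 1.50, 2.84]

# Squared difference per variable; the revenue term is in KN squared,
# while the remaining terms are differences of grades between 0 and 5.
squared_diffs = [(a - b) ** 2 for a, b in zip(cluster_1, cluster_2)]
revenue_share = squared_diffs[0] / sum(squared_diffs)

print(f"share of the squared distance due to revenue: {revenue_share:.10f}")
```

Since revenue is measured in kuna and the other variables are grades from 0 to 5, the distance is determined almost entirely by revenue, which matches the observation that it suppressed the influence of the other variables.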
In order to additionally determine in what way the identified
clusters differ from each other, descriptive statistics were calculated
for two further variables: the average values and standard deviations of
the Internet revenue and the revenue from fixed telephony
of the companies in the individual clusters (Table 9). The data showed
that clusters with higher average values of the variables used for the
cluster analysis also tend to have higher average
values of Internet revenue and revenue from fixed telephony, and
vice versa. For example, the companies from Cluster 2, which have
the highest average ICT potential and compactness of the relationship
and a high coefficient of the revenue size, also have the highest
average Internet revenue and revenue from fixed telephony.
The analysis of variance (ANOVA) was performed in order to
determine whether the differences between the average values of Internet
revenue and revenue from fixed telephony across the individual clusters
are statistically significant. ANOVA determines whether there is a
statistically significant difference between at least one pair of
clusters. The data revealed that this is the case for both groups of
revenue at the 0.1 probability level. The results are presented in
Table 10.
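The F-values in Table 10 follow directly from the reported sums of squares and degrees of freedom. A minimal sketch of that calculation is given here, with the figures copied from Table 10. The function name `f_statistic` is hypothetical.

```python
def f_statistic(ss_between, df_between, ss_within, df_within):
    """One-way ANOVA F value: mean square between groups divided by
    mean square within groups."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

# Sums of squares and degrees of freedom as reported in Table 10.
f_internet = f_statistic(34_747_704_784.397, 3, 804_110_363_161.419, 1974)
f_fixed = f_statistic(625_413_318_862.137, 3, 4_832_460_832_867.110, 1974)

print(f"Internet revenue:             F = {f_internet:.3f}")
print(f"Revenue from fixed telephony: F = {f_fixed:.3f}")
```

The degrees of freedom are 3 between groups (four clusters minus one) and 1974 within groups (1978 companies minus four clusters).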
In order to determine between which clusters the statistically
significant difference exists, a post-hoc analysis by means of the
Scheffe test was performed (Table 11). The data revealed that for
Internet revenue there is a statistically significant difference for all
pairs of Cluster 2 and other clusters at the 0.1 probability level. For
the revenue from fixed telephony a statistically significant difference
exists for all pairs at the 0.1 probability level except for Cluster 3
and Cluster 4.
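The pairwise comparison can be sketched from the summary figures alone. The Scheffe statistic for a pair of clusters is the squared difference of their averages divided by the within-group mean square times (1/n_i + 1/n_j); the exact test refers this value, divided by the number of groups minus one, to an F distribution. As a simplifying assumption, the sketch replaces the F(3, 1974) distribution with its large-sample chi-square(3) counterpart, which has a closed-form tail probability, so the resulting p-values are approximations rather than the SPSS output.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def scheffe_p(mean_i, n_i, mean_j, n_j, ms_within):
    """Approximate Scheffe p-value for one pair out of four groups.

    The exact test compares stat / 3 with F(3, N - 4); with N - 4 = 1974 the
    chi-square(3) distribution of `stat` is used here as a stand-in.
    """
    stat = (mean_i - mean_j) ** 2 / (ms_within * (1 / n_i + 1 / n_j))
    return chi2_sf_3df(stat)

# Internet revenue: cluster averages and sizes from Table 9,
# within-group mean square from Table 10.
MS_WITHIN = 407_350_741.217
clusters = {1: (3482.95, 831), 2: (13896.70, 266), 3: (346.53, 198), 4: (1041.68, 683)}

p_values = {
    (i, j): scheffe_p(*clusters[i], *clusters[j], MS_WITHIN)
    for i in clusters for j in clusters if i < j
}
for pair, p in p_values.items():
    print(pair, f"{p:.3f}")
```

Under this approximation only the pairs involving Cluster 2 yield very small p-values for Internet revenue, in line with Table 11.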
4.3.4 Profiling of final market segments
The analysis of variance and the Scheffe post-hoc analysis showed that
the cluster analysis presented in Table 8 is acceptable and that it
resulted in determining the market segments of the analysed
telecommunication operator. The experts in the telecommunication company
interpreted the segments presented in Table 8 in the
following way:
Cluster 1 represents the companies with a very low coefficient of the
revenue size. These companies annually spend less than KN 10.000,00 on
telecommunication services. The data related to their ICT potential
suggest that these companies have low ICT potential. The ICT potential
points to the companies which might in the future have the
need for additional telecommunication solutions. The companies from
Cluster 1 also have an average level of compactness of the relationship
with the telecommunication operator. These companies have been clients
of this operator for quite some time. This cluster might therefore be
named SOHO (small office home office).
Cluster 2 includes the companies with a high level of compactness
of the relationship and of ICT potential and a somewhat lower level of
revenue. It is undoubtedly the most profitable market segment, to which
the most attention should be paid. These companies are steady clients
who will most probably have the need to expand their business, and they
can be named LA (large account).
Cluster 3 represents the companies with extremely low ICT
potential and compactness of the relationship, and with revenue
slightly above the lowest. It is the least rewarding market
segment, with a tendency to transfer to the competition. These companies
have not been the company's clients for a long time and they do not have
the need to develop their own ICT. The best name for this market segment
could be SI (Silver).
Cluster 4 represents the companies with the highest revenue but at the
same time with low ICT potential and compactness of the relationship.
This group could be named SME (small and medium enterprises).
5. Conclusions
Modern information and communication systems enable the storage
of a large amount of transaction data. By mining transaction data
it is possible to gain new knowledge about the users of a
company's products, services and solutions. This knowledge should be
applied in order to determine the users' habits and to form
effective market segments characterized by similar
consumer habits.
A particular value of this case study lies in the elaboration of
a segmentation model based on knowledge gained from the databases of
a Croatian telecommunication operator. It is a leading regional
information and communication company which, at the moment, does not
implement market segmentation using information from its own and
external databases, but uses the common approach to segmentation based
on location and the size of the revenue from telecommunication services
invoiced to individual users. The study has shown that market
segmentation has to be based on thorough knowledge of users and their
habits and on recording all the interactions with a user. The stored
data can be used for data mining, which will result in new knowledge on
users' habits and inclinations and enable the forming of effective
market segments. A targeted approach to individual market segments
results in significant competitive advantage. By using cluster analysis
as the proposed market segmentation model of a Croatian
telecommunication operator, exceptionally attractive market segments
were created. The model enables the company to manage the profitability
and loyalty of each user. It also demonstrates the importance of
effective and interactive market segmentation, which will increase the
competitiveness of such companies in the conditions of ever-growing
globalization as well as the competitiveness of the Croatian economy in
general. Future studies should be aimed at the implementation of other
statistical methods and techniques as well as methods of artificial
intelligence in the field of market segmentation.
DOI: 10.2507/daaam.scibook.2009.93
6. References
Baragoin, C.; Andersen, C.M.; Bayerl, S.; Bent, G.; Lee, J. &
Schommer, C. (2001). Mining Your Own Business in Banking Using DB2
Intelligent Miner for Data, Available from: http://www.redbooks.ibm.com/
Accessed: 2001-08-31
Cattrell, R.B. (1998). The Scientific Use of Factor Analysis in the
Behavioural and Life Sciences, Plenum Press, ISBN: 0306309394, New York,
USA
Costea, A. (2006). The Analysis of the Telecommunication Sector by
the Means of Data Mining Techniques. Journal of Applied Quantitative
Methods, Vol. 1, No. 2, (December, 2006) pp. 144-150, ISSN: 1842-4562
Daskalaki, S.; Kopanas, I.; Goudara, M. & Avouris, N. (2003).
Data mining for decision support on customer insolvency in
telecommunications business. European Journal of Operational Research,
Vol. 145, No. 2, (March, 2003) pp. 239-255, ISSN: 0377-2217
Hung, S.; Yen, D.C. & Wang, H. Y. (2006). Applying data mining
to telecom churn management. Expert Systems with Applications, Vol. 31,
No. 3, (October, 2006) pp. 515-524, ISSN: 0957-4174
Lejeune, M.A.P.M. (2001). Measuring the impact of data mining on
churn management. Internet Research, Vol. 11, No. 5, (December, 2001)
pp. 375-387, ISSN: 1066-2243
Malabocchia, G.; Buriano, L.; Mollo, M.J.; Richeldi, M. &
Rossotto, M. (1998). Mining telecommunications databases: an approach to
support the business management, Available from: Network Operations and
Management Symposium, 1998. NOMS 98., IEEE Accessed:1998-02-15
This Publication has to be referred as: Pejic Bach, M[irjana];
Simicevic, V[anja] & Leskovic, D[arko] (2009). Microsegmentation in
Telecom Market: Data Mining Approach, Chapter 93 in DAAAM International
Scientific Book 2009, pp. 951-964, B. Katalinic (Ed.), Published by
DAAAM International, ISBN 978-3-901509-69-8, ISSN 1726-9687, Vienna,
Austria
Authors' data: Univ.Prof. PhD. Pejic Bach, M[irjana] *;
Assistant Prof. PhD. Simicevic, V[anja] **; Ma. Dipl.-Ing. Assistant
Manager, Leskovic, D[arko]***, * The University of Zagreb, Faculty of
Economics & Business Zagreb, Trg J. F. Kennedyja 6, 10 000 Zagreb,
Croatia, ** Centre for Croatian Studies, Borongajska c. 83d, 10000
Zagreb, Croatia, *** HT, Ivana Mestrovica 66, 33000 Virovitica, Croatia,
mpejic@efzg.hr, vsimicevic@hrstud.hr, darko.leskovic@t.ht.hr
Tab. 1. Descriptive statistics of the total revenue from the users'
companies (KN)

                          Number
                          of users  Average    Median     Min   Max        SD
Total revenue (APRUnet)   1.978     29.869,49  15.742,76  0,00  1.216.892  65.417,02

** All prices are presented in kuna (KN), the Croatian currency
Tab. 2. Potential of companies' branch of economic activity from the
sample

Potential (0-5)              Number of companies (f_i)  Percentage of companies (P_i)
Very low (1)                 104                        5.26
Low (2)                      281                        14.20
Medium (3)                   1030                       52.05
High (4)                     258                        13.04
Very high (5)                299                        15.11
No data on the activity (0)  7                          0.35
Tab. 3. ICT potential of the users from the sample

ICT potential (0-5)  k_ICT  Number of companies (f_i)  Percentage of companies (P_i)
No data (0)          0      990                        50.0
Very low (1)         1      196                        9.9
Low (2)              2-8    260                        13.1
Medium (3)           9-19   109                        5.5
High (4)             20-99  172                        8.7
Very high (5)        100>   251                        12.7
Tab. 4. Compactness of the relationship between a user and the
telecommunication operator

Compactness of the relationship  C      Number of users (f_i)  Percentage of users (p_i)
No data (0)                      0      210                    10.6
Very low (1)                     1      106                    5.4
Low (2)                          2-8    223                    11.3
Average (3)                      9-19   1036                   52.3
High (4)                         20-99  368                    18.6
Very high (5)                    >100   36                     1.8
Tab. 5. Loyalty coefficient

Loyalty coefficient                                     Number of        Percentage of
of the user          Loyalty level                      users (f_i)      users (P_i)
--                   No data                            237              12.0
1.00-0.90            Unloyal, from 90% to 100% of       89               4.5
                     channels with the competition
0.89-0.61            Hardly loyal                       42               2.1
0.60-0.26            Averagely loyal                    26               1.3
0.25-0.01            Quite loyal                        9                0.5
0.00                 The most loyal, 100% of            1576             79.6
                     channels with this operator
Tab. 6. Average values of the variables in individual clusters based on
the analysis with chosen variables in three clusters

                                               Cluster 1   Cluster 2  Cluster 3
Total annual revenue                           956.195,64  22.711,67  223.768,43
Coefficient of the revenue size                5.00        1.92       4.11
Potential of the branch of economic activity   3.00        3.21       2.84
ICT potential                                  5.00        1.50       4.63
Compactness of the relationship                5.00        2.84       3.74
Tab. 7. The average values of the variables in individual clusters
according to the analysis with chosen variables in four clusters

                                               Cluster 1     Cluster 2   Cluster 3   Cluster 4
Total annual revenue                           1.193.926,91  142.007,05  539.711,10  20.427,90
Coefficient of the revenue size                5.00          3.71        4.83        1.87
Potential of the branch of economic activity   3.00          3.03        2.75        3.21
ICT potential                                  5.00          4.55        5.00        1.41
Compactness of the relationship                5.00          3.55        4.58        2.82
Tab. 8. Average values of the variables from individual clusters
according to the analysis with selected variables in four clusters,
with the omission of the variables "Total revenue" and "Potential of
the branch of economic activity"

                                  Cluster 1  Cluster 2  Cluster 3  Cluster 4
Coefficient of the revenue size   0.90       3.93       1.51       4.10
ICT potential                     2.04       2.78       1.27       1.67
Compactness of the relationship   3.13       3.86       0.38       2.36
Tab. 9. Descriptive statistics of Internet revenues and revenues from
fixed telephony according to individual clusters from Table 8

                                 Internet revenue  Revenue from fixed telephony
Cluster 1  Average               3.482,95          27.105,89
           Number of companies   831               831
           Standard deviation    17.887,54         42.700,91
Cluster 2  Average               13.896,70         67.643,47
           Number of companies   266               266
           Standard deviation    45.028,97         107.933,87
Cluster 3  Average               346,53            7.316,76
           Number of companies   198               198
           Standard deviation    794,54            12.571,21
Cluster 4  Average               1.041,68          14.266,54
           Number of companies   683               683
           Standard deviation    1.269,77          17.157,07
Tab. 10. ANOVA analysis of average values of Internet revenue and
revenue from fixed telephony according to individual clusters from
Table 8

Internet revenue
                                         Degrees of
                  Sum of squares         freedom     Mean square          F-value  P-value
Between groups    34.747.704.784,397     3           11.582.568.261,466   28.434   0.000 **
Within groups     804.110.363.161,419    1.974       407.350.741,217
Total             838.858.067.945,816    1.977

Revenue from fixed telephony
                                         Degrees of
                  Sum of squares         freedom     Mean square          F-value  P-value
Between groups    625.413.318.862,137    3           208.471.106.287,379  85.158   0.000 **
Within groups     4.832.460.832.867,110  1.974       2.448.055.133,165
Total             5.457.874.151.729,250  1.977

** Statistically significant at 0.1 probability level
Tab. 11. P-values for Scheffe post-hoc analysis of average values of
Internet revenue and revenue from fixed telephony according to
individual clusters from Table 8

Internet revenue
            Cluster 1  Cluster 2  Cluster 3  Cluster 4
Cluster 1   -          0.000 **   0.277      0.140
Cluster 2   0.000 **   -          0.000 **   0.000 **
Cluster 3   0.277      0.000 **   -          0.980
Cluster 4   0.140      0.000 **   0.980      -

Revenue from fixed telephony
            Cluster 1  Cluster 2  Cluster 3  Cluster 4
Cluster 1   -          0.000 **   0.000 **   0.000 **
Cluster 2   0.000 **   -          0.000 **   0.000 **
Cluster 3   0.000 **   0.000 **   -          0.387
Cluster 4   0.000 **   0.000 **   0.387      -

** Statistically significant at 0.1 probability level