Forecasting the risk of traffic accidents by using artificial neural networks
Sliupas, Tomas ; Bazaras, Zilvinas
1. Introduction
Traffic accident risk is defined as the relative probability of
having a traffic accident resulting in death (fatal accident) or injury
on a particular road section while driving a distance of 1 km. This
definition makes it easier to compare the safety characteristics of
different road sections because it removes the influence of distance
(length of a road section) and traffic volume (measured as AADT, Annual
Average Daily Traffic, in vehicles per day). The traffic accident risk
level of a road section is affected by the quality of the road surface
and of road infrastructure in general, the social composition of
drivers, which may differ between areas of a country, speed limits and
other traffic restrictions (established by road signs), meteorological
conditions, etc. The goal of this research is to find out whether and
how the traffic accident risk is related to the aforementioned factors
by using Artificial Neural Networks (ANN).
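The normalisation described above can be illustrated with a minimal sketch. The paper does not state its exact formula, so the exposure-based rate below (accidents per million vehicle-kilometres) is an assumption, and the function and parameter names are hypothetical.

```python
def accident_risk(accidents, aadt, length_km, years, per_vehicle_km=1_000_000):
    """Accidents per `per_vehicle_km` vehicle-kilometres driven (assumed formula).

    accidents  -- fatal and injury accidents recorded on the section
    aadt       -- Annual Average Daily Traffic, vehicles per day
    length_km  -- length of the road section, km
    years      -- length of the observation period, years
    """
    # Exposure: total vehicle-kilometres driven on the section in the period.
    exposure = aadt * 365 * years * length_km
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return accidents * per_vehicle_km / exposure
```

Under this assumption, two sections with the same AADT get the same risk when one is twice as long and has twice as many accidents, which is the sense in which the definition removes the influence of section length.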
Only a relatively small number of Lithuanian researchers have
attempted to forecast the number of traffic accidents on Lithuanian
roads. The methods applied included linear regression equations
(Ratkeviciute 2009), different types of regression equations (Sliupas
2009b), and other research models (Jasiuniene 2012). Using ANN for
traffic accident forecasting is not a common practice abroad (Maher,
Summersgill 1996). Factors influencing traffic accidents (based on the
Lithuanian road data) are analysed in various sources: Miskinis,
Valuntaite (2010), Sliupas (2009a), Ratkeviciute et al. (2011a),
Ratkeviciute (2010). The feasibility of traffic accident prediction in
Lithuania was studied by Ratkeviciute et al. (2011b). One of the most
comprehensive and systematic studies of traffic accident factors carried
out abroad was conducted by Elvik and Vaa (2004). The use of Geographic
Information System (GIS) data for the forecasting of accident risk is
described by Andrey and Lister (1999).
2. Research data
State roads of Lithuania are subdivided into 3 networks: main,
national and regional roads. The objects of this research are the main
and national roads. Only fatal and injury traffic accidents are analysed
in the study. All accidents that were registered in the LAKIS database
(abbreviated from the Lithuanian Lietuvos automobiliu keliu informacine
sistema) and occurred during the period 2002-2006 were selected for the
analysis. The analysed roads are divided into 341 sections. These road
sections were formed by singling out road intervals with the same AADT
value. Normally a section starts after one significant intersection,
ends at the next, and always contains an AADT measuring post. Sections
of roads that cross small towns and highly populated areas are excluded
from the study because they differ considerably from the rest of the
road sections and resemble streets in their characteristics. The average
length of a road section is 18.77 km. The total length of these road
sections amounts to approx 6401 km. According to the Lithuanian Road
Administration under the Ministry of Transport and Communications of the
Republic of Lithuania, the total length of the main and national road
network is 6722.80 km. These 341 sections cover 30.10% of the Lithuanian
state roads.
After the primary research, the following data on the parameters
affecting the traffic accident risk was collected:
--number of inhabitants within 17.00 km radius from a road section;
--road section class variable;
--weighted average of road pavement width, m;
--quantity of 3-way junctions in a road section;
--quantity of 4-way junctions in a road section;
--number of bus stop grounds;
--number of rest grounds;
--number of large long time rest grounds;
--number of gasoline stations;
--ratio of pavement length, bicycle path length or
bicycle/pedestrian path length along a road section and length of a road
section;
--ratio of guardrail length in a road section and road section
length;
--ratio of illuminated road section length and total road section
length.
All 12 of these parameters were used to forecast the accident risk
with the ANNs described below. More comprehensive research on the
influence of population density (within a chosen radius from a road
section) on the number of traffic accidents and the traffic volume in
particular road sections can be found in Sliupas et al. (2006) and
Sliupas (2009b). The road section class variable defines the class of a
road following the official state classification described in the Road
Technical Regulation of Lithuania KTR 1.01:2008. The variable is helpful
when a single road section contains different road classes. A detailed
description of how the aforementioned calculations are made is given by
Sliupas (2009). The rest of the road parameter information for the
analysis was extracted from the LAKIS (2008) database, which is held at
the Road and Transport Research Institute, Lithuania. Classified traffic
volume information for each road section was collected from the annual
traffic volume research reports of the same institute.
[FIGURE 1 OMITTED]
3. Method of the analysis
The forecasts of road traffic accident risk are calculated by using
ANN and following the procedure displayed in Fig. 1.
The dataset used in the research is multidimensional. One of many
ways to display this kind of data is a matrix of scatter plots (Medvedev
2007). Visual analysis did not reveal strict dependency patterns. The
more interesting fragments of the dataset view are displayed in Figs 2
and 3.
As can be seen from Fig. 2, the accidents are concentrated in two
larger clusters: around roads with a 9 m wide pavement and roads with a
22 m wide pavement. The dots are not distributed over the whole scale
because road widths follow standards. The same figure also suggests that
wider roads are safer than narrow ones, but that presumption may not be
correct. The wider roads represented in the study are the best roads
(highways) of the country, with the best upkeep and traffic safety
financing. These roads also carry high traffic volumes. Most of them
have grade-separated intersections, limited U-turn opportunities, etc.;
that is why the presumption that wider roads are safer might not be a
universal rule. Elvik and Vaa (2004) note that road widening improves
traffic safety in unpopulated areas but worsens it in densely populated
territories.
Fig. 3 reveals that road sections with a larger share of guardrail
length are slightly safer than sections with a smaller share. That seems
logical because guardrails prevent head-on collisions and run-off-road
accidents. Positive effects of guardrails were noted by Hunter et al.
(2001), Short and Robertson (1998), Griffith (1999), etc. However, Fig. 3
shows a large dispersion of dots and the pattern is not clearly
expressed.
ANNs are usually applied to model complex relationships between
inputs and outputs or to find patterns in the data. As ANNs are
non-linear statistical data modelling tools, in this case they might
show better results than other methods.
4. ANN types applied and the process of calculation
Three types of ANN were used in the research:
1. Feedforward two-layer ANN (its scheme is displayed in Fig. 4).
The variables displayed in Fig. 4 are: $p_1, \dots, p_n$, the parameters
affecting traffic accident risk; $m$, the number of neurons in the first
layer; $w'_{1,1}, \dots, w'_{m,n}$, the network weights of the first
layer neurons; $f$, the hyperbolic tangent function; $a'_1, \dots,
a'_m$, the outputs of the first layer neurons; $w''_1, \dots, w''_m$,
the weight vector of the second layer neuron; $b$, the bias; $f''$, a
linear function; $a''$ or $r$, the output of the second layer neuron,
i.e. the forecasted result of the neural network.
2. Feedforward ANN of one neuron, using the hyperbolic tangent
function (Fig. 5). $w_1, \dots, w_n$ is the weight vector of the neuron;
$f$ is the hyperbolic tangent function.
3. Feedforward ANN of one neuron, using a linear function.
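The forward pass of the three network types listed above can be sketched as follows. This is an illustrative sketch only, not the authors' MATLAB implementation: the function names are hypothetical, and the per-neuron biases of the first layer and the bias of the single-neuron networks are assumptions, since the text names only one bias b.

```python
import math


def linear(x):
    """Linear (identity) transfer function."""
    return x


def single_neuron(p, w, b, transfer):
    """One neuron: transfer(w . p + b). Covers ANN types 2 (tanh) and 3 (linear)."""
    return transfer(sum(wi * pi for wi, pi in zip(w, p)) + b)


def two_layer(p, w1, b1, w2, b2):
    """ANN type 1: m tanh neurons in the first layer, one linear output neuron.

    p  -- n input parameters (12 in the study)
    w1 -- m rows of n first-layer weights
    b1 -- m first-layer biases (assumed; set to 0.0 to disable)
    w2 -- m second-layer weights
    b2 -- bias of the output neuron
    """
    a1 = [single_neuron(p, row, bi, math.tanh) for row, bi in zip(w1, b1)]
    return single_neuron(a1, w2, b2, linear)
```

With all inputs equal to zero and zero first-layer biases, every first-layer output is tanh(0) = 0, so the forecast reduces to the output bias.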
These three types of ANN were used to forecast accident risks on
the main and national road networks separately, following the scheme
displayed in Fig. 1. The quantity of analysed data was not large;
therefore, the data was split into training and testing datasets three
times. This is reflected in Table 1 as the test number. The data was
sorted and every tenth data unit was moved to the testing dataset of
test No. 1. The process was then repeated twice, starting from different
dataset row numbers. This made the composition of the datasets
effectively random. After that, the arithmetic mean of the forecast
error over the three tests was calculated. As a result, the estimate of
the forecast error is more reliable than it would have been if it had
been evaluated just once.
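The splitting and averaging procedure can be sketched as below. The function names and the use of a row offset to select "different dataset row numbers" are assumptions made for illustration; the paper does not say which rows started each test.

```python
def every_tenth_split(rows, offset):
    """Move every tenth row (starting at `offset`) to the testing dataset."""
    test = [r for i, r in enumerate(rows) if i % 10 == offset]
    train = [r for i, r in enumerate(rows) if i % 10 != offset]
    return train, test


def mean_error(errors):
    """Arithmetic mean of the forecast errors of the individual tests."""
    return sum(errors) / len(errors)


# Three tests, each with a different (hypothetical) starting row.
rows = list(range(100))  # placeholder for the sorted road-section records
splits = [every_tenth_split(rows, offset) for offset in (0, 1, 2)]

# Averaging the three test errors of ANN type 2 on the main roads (Table 1).
average = mean_error([52.48, 53.53, 74.50])
```

Each split keeps roughly 90% of the rows for training and 10% for testing, and the three testing datasets do not overlap.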
Before the calculations, the dataset values were rescaled by
multiplying each parameter by a constant to avoid excessive differences
in the amplitude of values (that is, very small and very large values).
This had no effect on the relation between accident risk and the
affecting parameters; however, it helped to improve the calculation
result. The rescaled values gave more accurate results when processed
with the computing program MATLAB by MathWorks.
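One possible way of choosing such constants is sketched below. The paper does not say how the constants were selected, so the power-of-ten rule and the function names are assumptions.

```python
import math


def scale_constant(column):
    """Power of ten that brings the largest absolute value of a column to at most 1."""
    peak = max(abs(v) for v in column)
    if peak == 0:
        return 1.0  # nothing to rescale
    return 10.0 ** (-math.ceil(math.log10(peak)))


def rescale(column):
    """Multiply every value of a column by its scale constant."""
    c = scale_constant(column)
    return [v * c for v in column]
```

Because each parameter is multiplied by a single constant, the ordering and relative spacing of its values are unchanged; only the magnitude seen by the network differs.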
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
In all three ANN types the same network training function
("trainbr") was used. It updates the weight and bias values (the bias
being the constant offset added to a neuron's weighted input) according
to Levenberg-Marquardt optimization with Bayesian regularization. It
minimizes a combination of squared errors and weights, and then
determines the correct combination so as to produce a network that
generalises well.
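The "combination of squared errors and weights" is usually written in the following form for Bayesian regularization. The symbols $\alpha$, $\beta$, $t_i$ and $N$ are not used in the paper and are introduced here as assumptions; $r_i$ is the forecasted risk for road section $i$ and $w_j$ runs over all network weights and biases.

```latex
F(\mathbf{w}) = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i=1}^{N} (t_i - r_i)^2, \qquad
E_W = \sum_{j} w_j^2
```

Here $t_i$ is the observed accident risk of section $i$. A larger ratio $\alpha / \beta$ favours smaller weights and a smoother network response, while a smaller ratio favours a closer fit to the training data; the training function estimates both coefficients during training.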
In the first ANN configuration, the number of neurons m in the
first layer of the network was varied from 1 to 15 (Fig. 4) in order to
find the optimal network configuration. As mentioned above, the process
was repeated three times, once per data split. Thus the first ANN type
(Fig. 4) required 45 calculations (15 configurations x 3 tests) for the
main roads alone. The same process was repeated for the national roads.
The second and third ANN types were significantly less time- and
effort-consuming, as they differ in the transfer function only.
[FIGURE 4 OMITTED]
[FIGURE 5 OMITTED]
5. Results
The results of traffic accident risk forecasting obtained with the
ANN types described above are displayed in Fig. 6 and Table 1. Although
12 parameters describing the road sections were applied, the results
show large forecasting errors. In a well-developed road network
accidents occur randomly. Road sections that are subject to more road
accidents than is common are called "black spots". Traffic safety
engineers endeavour to reduce or eliminate these spots, thus they
usually do not exist for a long period of time. It can therefore be
presumed that the forecasting method is suitable and yields satisfactory
results when it provides more exact results than could be obtained by
calculating the average traffic accident density in a network and then
applying it to a road section.
Large errors do not necessarily show that the method is unsuitable.
The accuracy of the results can be expected to improve once more
accurate data is collected. The average annual number of road accidents
per road section was low in the analysed road networks during the chosen
period. Approx 46% of road sections had 3 or fewer road accidents per
year, and 54% had more than 3. The most recent data available in
Lithuania was used in the study; however, it is possible that it was not
sufficient. In some road sections a single accident changes the result
significantly. Moreover, the impact of any single factor on the accident
risk is small. The method is expected to provide more meaningful results
by evaluating these factors in combination.
It is suggested that the method be applied when evaluating traffic
accident risk on newly built roads, road sections or bypasses for which
there is no historical traffic accident data. Traffic accident risk is
one of the determining factors when calculating the profit of road
infrastructure investment projects, their payback period and net present
value.
It is also possible to use other mathematical methods to forecast
accident risk from the same or similar data (Sliupas 2011). That source
describes a method which forecasts the accident risk by evaluating the
impact of a bicycle/pedestrian trail, traffic barriers and illuminated
road length in the analysed road section. The extent of the impact is
calculated by using the "before and after" method. The road category and
the number of intersections, grounds and inhabitants in the road section
area are also evaluated. The application of the method gives a 53.60%
forecast error, which is similar to the results displayed in Fig. 6 and
Table 1.
6. Conclusions
1. The best traffic accident risk forecasting results for the main
roads are obtained using type 1 of ANN with 9 neurons in the first layer
(forecast error of 45.86%).
2. The best traffic accident risk forecasting results for the
national roads are obtained using type 1 of ANN with 2 neurons in the
first layer (forecast error of 45.38%).
3. The application of the method reveals large forecasting errors,
but the result was obtained using a relatively small quantity of data.
More accurate data, as well as larger volumes of data, may change the
result. Such data was not available in Lithuania at the time the study
was carried out.
Caption: Fig. 1. Structural scheme of traffic accident risk
forecasting by using ANN
Caption: Fig. 2. Weighted average of road pavement width in a road
section and traffic accident risk of a road section for all the main
road network sections
Caption: Fig. 3. Ratio of guardrail length in a road section and
road section length, and traffic accident risk for all the main road
network sections
Caption: Fig. 4. Scheme of a two-layer feedforward neural network
which was used for forecasting the traffic accident risk
Caption: Fig. 5. Feedforward ANN of one neuron used in the
forecasting of the accident risk
doi: 10.3846/bjrbe.2013.37
Received 14 October 2011; accepted 6 December 2011
References
Andrey, J. C.; Lister, M. 1999. Using Origin-Destination Data and a
Geographic Information System to Estimate Risk Exposure in Urban Areas,
Transportation Research Record 1665: 51-58.
http://dx.doi.org/10.3141/1665-08
Elvik, R.; Vaa, T. 2004. The Handbook of Road Safety Measures. 1st
edition. Amsterdam: Elsevier. 1090 p. ISBN 0080440916.
Griffith, M. S. 1999. Safety Evaluation of Rolled-In Continuous
Shoulder Rumble Strips Installed on Freeways, Transportation Research
Record 1665: 28-34. http://dx.doi.org/10.3141/1665-05
Hunter, W. W.; Stewart, J. R.; Eccles, K. A.; Huang, H. F.;
Council, F. M.; Harkey, D. L. 2001. Three-Strand Cable Median Barrier in
North Carolina: in-Service Evaluation, Transportation Research Record
1743: 97-103. http://dx.doi.org/10.3141/1743-13
Jasiuniene, V. 2012. Road Accident Prediction Model for the Roads
of National Significance of Lithuania. PhD thesis. Vilnius: Technika,
109 p.
Maher, M. J.; Summersgill, I. 1996. A Comprehensive Methodology for
the Fitting of Predictive Accident Models, Accident Analysis and
Prevention 28(3): 281-296.
http://dx.doi.org/10.1016/0001-4575(95)00059-3
Medvedev, V. 2007. Research of Feedforward Neural Network
Application to Multidimensional Data Visualisation. PhD thesis. Vilnius:
Technika, 144 p.
Miskinis, P.; Valuntaite, V. 2010. Mathematical Simulation of the
Correlation between the Frequency of Road Traffic Accidents and Driving
Experience, Transport 25(3): 237-243.
http://dx.doi.org/10.3846/transport.2010.29
Ratkeviciute, K.; Jasiuniene, V.; Cygas, D. 2011a. Methodology for
the Substantiation of Road Safety Improvement Measures on the Roads of
Lithuania, in Proc. of the 8th International Conference
"Environmental Engineering": selected papers, vol 3. Ed. by
Cygas, D.; Froehner, K. D. May 19-20, 2011, Vilnius, Lithuania. Vilnius:
Technika, 1200-1204. ISSN 2029-7106.
Ratkeviciute, K.; Vakriniene, S.; Jasiuniene, V.; Cygas, D. 2011b.
Analysis of Accident Prediction Feasibility on the Roads of Lithuania,
in Proc. of the 8th International Conference "Environmental
Engineering": selected papers, vol 3. Ed. by Cygas, D.; Froehner,
K. D. May 19-20, 2011, Vilnius, Lithuania. Vilnius: Technika, 1205-1209.
ISSN 2029-7106.
Ratkeviciute, K. 2010. Model for the Substantiation of Road Safety
Improvement Measures on the Roads of Lithuania, The Baltic Journal of
Road and Bridge Engineering 5(2): 116-123.
http://dx.doi.org/10.3846/bjrbe.2010.17
Ratkeviciute, K. 2009. Model for the Substantiation of Road Safety
Improvement Measures on the Roads of Lithuania. PhD thesis. Vilnius:
Technika. 110 p.
Short, D.; Robertson, L. S. 1998. Motor Vehicle Death Reductions
from Guardrail Installation, Journal of Transportation Engineering
124(5): 501-502. http://dx.doi.org/10.1061/(ASCE)0733-947X(1998)124:5(501)
Sliupas, T. 2011. Investigation and Forecasting of Fatal and Injury
Traffic Accidents on Main and National Roads of Lithuania. PhD thesis.
Kaunas: Technologija, 116 p.
Sliupas, T. 2009a. Affect of Meteorological Situation to Accident
Volume on Lithuanian Roads, Advances in Transport Systems Telematics.
Ed. by Mikulski, J. Warszawa: Wydawnictwa Komunikacji i Łączności Sp. z
o.o., 253-259.
Sliupas, T. 2009b. The Impact of Road Parameters and the
Surrounding Area on Traffic Accidents, Transport 24(1): 42-47.
http://dx.doi.org/10.3846/1648-4142.2009.24.42-47
Sliupas, T.; Radvilavicius, R.; Antanavicius, T. 2006. Interaction
between Population in a Road Section Area and AADT. AADT Forecasting
Using Area Population and Traffic Research Data from Neighbouring Road
Sections, in Proc. of 10th International Conference on Transport Means.
October 19-20, 2006, Kaunas University of Technology, Kaunas, Lithuania,
143-146.
Received 19 October June 2011; accepted 28 September 2012
Tomas Sliupas (1), Zilvinas Bazaras (2)
(1) PE Road and Transport Research Institute, I. Kanto g. 23, P.O.
Box 2082, 44009 Kaunas, Lithuania
(2) Dept of Transport Engineering, Kaunas University of Technology,
Kestucio g. 27, 44312 Kaunas, Lithuania
E-mails: (1) t.sliupas@ktti.lt; (2) zilvinas.bazaras@ktu.lt
Table 1. Error of traffic accident risk forecasting with the
application of ANN

ANN type         Test I    Test II   Test III  Average
Main roads
2                52.48%    53.53%    74.50%    60.17%
3                46.74%    49.01%    70.92%    55.55%
National roads
2                39.46%    59.34%    60.15%    52.98%
3                36.76%    53.29%    49.11%    46.39%
Fig. 6. Change of forecasting accuracy when the number of neurons
in the first layer of ANN type 1 is changing (values in %)

Number of neurons   Main roads   National roads
1                   49.0         55.1
2                   46.0         45.4
3                   52.6         52.0
4                   57.3         50.4
5                   59.8         53.1
6                   55.5         50.4
7                   66.5         54.0
8                   54.9         45.7
9                   45.9         48.5
10                  57.2         49.5
11                  55.1         53.4
12                  63.6         49.2
13                  56.3         53.0
14                  60.1         57.2
15                  56.3         49.5

Note: Table made from bar graph.
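The values read from the bar graph can be used to check the network configurations reported in the conclusions. The sketch below assumes that the Fig. 6 values are forecast errors in % (they match the 45.86% and 45.38% quoted in the conclusions after rounding); the names are hypothetical.

```python
# Number of first-layer neurons -> (main roads, national roads), values from Fig. 6.
FIG6_ERROR = {
    1: (49.0, 55.1), 2: (46.0, 45.4), 3: (52.6, 52.0),
    4: (57.3, 50.4), 5: (59.8, 53.1), 6: (55.5, 50.4),
    7: (66.5, 54.0), 8: (54.9, 45.7), 9: (45.9, 48.5),
    10: (57.2, 49.5), 11: (55.1, 53.4), 12: (63.6, 49.2),
    13: (56.3, 53.0), 14: (60.1, 57.2), 15: (56.3, 49.5),
}

MAIN, NATIONAL = 0, 1


def best_neuron_count(errors, column):
    """Number of first-layer neurons with the lowest averaged forecast error."""
    return min(errors, key=lambda m: errors[m][column])


best_main = best_neuron_count(FIG6_ERROR, MAIN)          # 9 neurons
best_national = best_neuron_count(FIG6_ERROR, NATIONAL)  # 2 neurons
```

The error does not fall steadily as neurons are added, so the selection is a plain search over the 15 tested configurations rather than a trend.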