Software designed for modelling and simulation using three-layer neural networks
Ciupan, Emilia
1. INTRODUCTION
The paper presents original software, called MLP, developed by the
author. The goal of this application is to solve problems belonging to
different domains, such as engineering and economics, using methods of
neural computing.
Process modelling and simulation using neural networks involves several
phases. First, an appropriate architecture of the neural network must be
chosen. Then, the network is trained so that it simulates the process as
accurately as possible. The last phase is testing and using the network.
The software is divided into several modules which carry out different
tasks: creating the neural network, training the network, testing it,
and the recall function.
The software was developed in Visual C++.
2. CONSIDERATIONS REGARDING THE IMPLEMENTED METHODS
The "MLP" software implements the back propagation
algorithm as a training method by the means of two methods: the descent
gradient method (GD) and the Levenberg-Marquardt method (LM).
Both methods are iterative and solve an optimization problem:
determining the synaptic weights w of the network that minimize the
error function, expressed as a sum of squares of the individual error
functions (Hagan & Menhaj, 1994):
E(w) = \frac{1}{2} \sum_{i=1}^{m} (e_i(w))^2    (1)
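In code, the error function of equation (1) could be evaluated as in the sketch below; the error vector is assumed to be precomputed (e.g. as the difference between desired and actual outputs for each example), and the function name is illustrative rather than taken from the MLP software.

    #include <vector>

    // Sketch of Eq. (1): E(w) = 1/2 * sum_i e_i(w)^2.
    // The individual errors e_i(w) are assumed to be precomputed.
    double sum_squared_error(const std::vector<double>& e)
    {
        double E = 0.0;
        for (double ei : e)
            E += ei * ei;   // accumulate e_i^2
        return 0.5 * E;     // apply the 1/2 factor of Eq. (1)
    }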
The GD method is sequential and gradient-based. It assumes a step by
step presentation of the training examples. For each example, the error
is calculated as the difference between the desired output and the
actual output. This error is then used to compute the weight
modification Δw through back propagation.
The new weights are calculated using the following equation
(Dumitrescu & Hariton, 1996; Zilouchian & Jamshidi, 2001):
w(k+1) = w(k) + \alpha \, \Delta w    (2)
where α is a constant between 0 and 1 called the learning rate.
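As an illustration, a minimal sketch of the update rule of equation (2) follows; the names are hypothetical and not taken from the MLP software.

    #include <cstddef>
    #include <vector>

    // Minimal sketch of the GD update of Eq. (2): w(k+1) = w(k) + alpha * dw.
    // w holds the synaptic weights w(k); dw is the correction (delta)w
    // obtained through back propagation; alpha is the learning rate.
    void gd_update(std::vector<double>& w, const std::vector<double>& dw,
                   double alpha)
    {
        for (std::size_t i = 0; i < w.size(); ++i)
            w[i] += alpha * dw[i];   // Eq. (2), applied component-wise
    }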
The LM method is based on the Hessian matrix and is a batch method: the
weight modification is computed only after all of the examples of the
training set have been presented and the total error e(w) has been
determined. The modification is given by equation (3) (Hagan & Menhaj,
1994):
\left[ J^T(w) \, J(w) + \mu I \right] \Delta w = -J^T(w) \, e(w)    (3)
where e : R^n → R^m, e = (e_1, e_2, ..., e_m) is the error function, J
is the Jacobian matrix of e, μ is a variable damping factor, and I is
the identity matrix.
The matrix J^T J is the approximate Hessian of the error function E
(Madsen et al., 2004).
An iteration consists of determining a modification Δw of the weights
that leads to a reduction of the error.
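A minimal sketch of one LM step, solving the linear system of equation (3) for Δw, is shown below; the dense-matrix representation and the naive Gaussian-elimination solver are illustrative assumptions, not the actual implementation of the MLP software.

    #include <cstddef>
    #include <vector>

    using Mat = std::vector<std::vector<double>>;
    using Vec = std::vector<double>;

    // Sketch of one Levenberg-Marquardt step, Eq. (3):
    //   [J^T J + mu I] dw = -J^T e
    // J is the m x n Jacobian of the error function e (m, n >= 1),
    // mu is the damping factor.
    Vec lm_step(const Mat& J, const Vec& e, double mu)
    {
        const std::size_t m = J.size(), n = J[0].size();

        // Build the augmented system [A | b] with
        // A = J^T J + mu I (approximate Hessian plus damping), b = -J^T e.
        Mat A(n, Vec(n + 1, 0.0));
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j)
                for (std::size_t k = 0; k < m; ++k)
                    A[i][j] += J[k][i] * J[k][j];
            A[i][i] += mu;
            for (std::size_t k = 0; k < m; ++k)
                A[i][n] -= J[k][i] * e[k];
        }

        // Gaussian elimination without pivoting (adequate for this sketch,
        // since J^T J + mu I is symmetric positive definite for mu > 0).
        for (std::size_t p = 0; p < n; ++p)
            for (std::size_t r = p + 1; r < n; ++r) {
                double f = A[r][p] / A[p][p];
                for (std::size_t c = p; c <= n; ++c)
                    A[r][c] -= f * A[p][c];
            }

        // Back substitution.
        Vec dw(n);
        for (std::size_t p = n; p-- > 0; ) {
            dw[p] = A[p][n];
            for (std::size_t c = p + 1; c < n; ++c)
                dw[p] -= A[p][c] * dw[c];
            dw[p] /= A[p][p];
        }
        return dw;   // the weight modification (delta)w
    }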
3. SOFTWARE TESTING
The software was tested in two ways: by comparing the results obtained
with MLP to those obtained with MATLAB 7.0, and by comparing the neural
models with the mathematical model. Two mathematical functions,
described in examples 1 and 2, were modelled (Ciupan, 2008).
Example 1: Damped wave function
A damped wave function described by equation (4) was studied.
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (4)
Because of the random values of the initial weights, an average
performance was calculated over 6 training sessions of 100 epochs each.
A training set of 30 examples was used, consisting of (t, f(t)) pairs
obtained with the mathematical model (Eq. 4).
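As equation (4) is not reproducible here, the sketch below builds such a training set using a generic damped sine, f(t) = exp(-λt) sin(ωt), purely as a stand-in; the coefficients and the sampling step are assumptions.

    #include <cmath>
    #include <utility>
    #include <vector>

    // Sketch of building the 30-example training set of (t, f(t)) pairs.
    // The damped-sine form and its parameters are assumptions standing in
    // for Eq. (4), which is not reproducible from the source.
    std::vector<std::pair<double, double>> make_training_set()
    {
        const double lambda = 0.5, omega = 4.0;   // hypothetical coefficients
        std::vector<std::pair<double, double>> set;
        for (int i = 0; i < 30; ++i) {
            double t = 0.1 * i;                   // assumed sampling step
            set.emplace_back(t, std::exp(-lambda * t) * std::sin(omega * t));
        }
        return set;
    }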
Both methods, GD and LM, were used. Other features of the neural
models are presented in table 1.
The model with the best performance under the LM method, 6.3 x 10^-4,
was chosen in order to compare the neural model with the mathematical
one. This comparison may be observed in figure 2, where f(t) denotes the
theoretical values of the function f given by equation (4) and f(t)R the
values simulated by the neural model.
[FIGURE 2 OMITTED]
Example 2: Dirichlet function
The general definition of the Dirichlet function is described by
equation (5):
f(x) = \begin{cases} \dfrac{\sin(nx/2)}{n \sin(x/2)}, & x \neq 2k\pi \\ (-1)^{k(n-1)}, & x = 2k\pi, \ k \in \mathbb{Z} \end{cases}    (5)
where n is a positive integer parameter.
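Assuming equation (5) takes the periodic sinc form of MATLAB's diric function (consistent with the 2π period reported below for n = 7), it could be evaluated as follows; since the reconstruction of equation (5) above is itself an assumption, treat this sketch as illustrative.

    #include <cmath>

    // Sketch of the Dirichlet function of Eq. (5), assuming the periodic
    // sinc form of MATLAB's diric(x, n). At multiples of 2*pi the ratio
    // is replaced by its limit, (-1)^(k(n-1)).
    double dirichlet(double x, int n)
    {
        const double pi = 3.14159265358979323846;
        double s = std::sin(x / 2.0);
        if (std::fabs(s) < 1e-12) {                      // x = 2*k*pi
            long k = std::lround(x / (2.0 * pi));
            return (k * (n - 1)) % 2 == 0 ? 1.0 : -1.0;  // (-1)^(k(n-1))
        }
        return std::sin(n * x / 2.0) / (n * s);
    }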
Consider the case n = 7. The function is then periodic, with period 2π.
A 1-8-1 network was trained to approximate the function during 5
training sessions, each session consisting of 100 epochs. A training set
containing 100 input/output pairs of type (x, f(x)), x ∈ [0, 4π], was
created using the mathematical model (5). The value 10^-3 was chosen as
the desired performance.
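For reference, a sketch of the forward pass of such a 1-8-1 network follows; the sigmoid hidden layer and linear output are assumptions, as the paper does not specify the activation functions used by the software.

    #include <cmath>

    // Sketch of the forward pass of a 1-8-1 network: one input, 8 hidden
    // neurons, one output. The sigmoid/linear activations are assumed.
    double forward(double x,
                   const double w1[8], const double b1[8],  // input -> hidden
                   const double w2[8], double b2)           // hidden -> output
    {
        double y = b2;
        for (int j = 0; j < 8; ++j) {
            double h = 1.0 / (1.0 + std::exp(-(w1[j] * x + b1[j])));  // sigmoid
            y += w2[j] * h;                                  // linear output
        }
        return y;
    }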
Tables 2 and 3 present the performance reached over these 5
training sessions.
Further testing was done by choosing the best neural model obtained in
the training process through the LM method (table 2, session #1,
performance 6.0 x 10^-3). The approximation of the function was carried
out by this neural model.
[FIGURE 3 OMITTED]
Figure 3 illustrates the comparison between the simulated values
obtained by the neural model and the theoretical values obtained by the
mathematical one.
4. CONCLUSIONS
The results analysis shows that:
a) Both programs, MLP and MATLAB, generally lead to close results when
modelling with three-layer neural networks.
b) It is impossible to conclude which program, MLP or MATLAB, leads to
better results when training is done through the LM method or through
the GD method. The results depend on several factors, including the
initial random weights. For each program, the performance may be
improved by running more training sessions or by increasing the number
of examples.
c) For both programs, the neural model approximates the theoretical
values of the studied functions well for those values of the argument
where the function varies sufficiently. For those values of the argument
where the amplitude of the function values is small, the neural model
tends to average the theoretical values.
d) The neural model maintains certain features of the mathematical
model, such as oscillation, damping, or periodicity.
5. REFERENCES
Ciupan, E. (2008). Integrated Management of the Systems Using Open
Control Platforms. PhD thesis, Technical University of Cluj-Napoca, pp.
90-100.
Dumitrescu, D. & Hariton, C. (1996). Retele neuronale: teorie si
aplicatii (Neural networks: theory and applications). Ed. Teora, ISBN
973-601-461-4, Bucuresti.
Hagan, M. T. & Menhaj, M. B. (1994). Training Feedforward Networks with
the Marquardt Algorithm. IEEE Transactions on Neural Networks, vol. 5,
no. 6, November 1994, pp. 989-993.
Madsen, K.; Nielsen, H. B. & Tingleff, O. (2004). Methods for Non-Linear
Least Squares Problems, 2nd edition. Informatics and Mathematical
Modelling, Technical University of Denmark.
Zilouchian, A. & Jamshidi, M. (2001). Intelligent Control Systems Using
Soft Computing Methodologies. CRC Press, ISBN 0-8493-1875-0.
Tab. 1. Characteristics of the neural models

Method  Network       Desired       Reached performance
        architecture  performance   MLP           MATLAB
LM      1-8-1         10^-3         4.0 x 10^-3   1.9 x 10^-3
GD      1-8-1         10^-5         1.0 x 10^-4   38.3 x 10^-4
Tab. 2. The performance obtained with the LM method

Session   MLP          MATLAB
1         0.00607977   0.00681455
2         0.08666790   0.00959747
3         0.08496530   0.01249476
4         0.07041760   0.01129421
5         0.05172580   0.00697923
Average   0.05997127   0.00943605
Tab. 3. The performance obtained with the GD method

Session   MLP          MATLAB
1         0.00314583   0.09752570
2         0.02416600   0.03229670
3         0.02190806   0.11534800
4         0.01805316   0.09075010
5         0.05172580   0.07358620
Average   0.02379977   0.08190134