Algorithms for Automatic/Semiautomatic Geographic Data Acquisition and Error Assessment
Alexei, Adrian; Tomoiaga, Tiberius; Marinescu, Mirel et al.
1. INTRODUCTION
With the development of the first commercial GIS systems at the
beginning of the 1980s, new methods of data acquisition appeared as
well: vector digitization of paper maps and, later, after hardware
improvements, raster digitization (Nitu, 1992).
Vector digitization uses paper maps and digitizing tablets.
Raster vectorization uses scanned maps and is currently the most
widely used method. It requires large-format scanners, plotters,
powerful computers, DBMSs and raster-vector conversion software.
Existing raster-vector conversion software is not fully automated for
data acquisition from scanned maps (Liu et al., 1997).
Because of its great importance and the difficulties it involves,
automatic raster-vector conversion has been one of the major research
topics of the last decade.
Algorithms for the automatic acquisition of polygonal, linear and
punctual elements from scanned maps will contribute substantially to
geodatabase creation from 1:25,000, 1:50,000 and 1:100,000 maps.
By implementing these algorithms, users will be able to produce
their own geodatabases easily and at very low cost.
2. ALGORITHMS DESCRIPTION AND ERROR ASSESSMENT
Based on skeletonization, there are two classes of vectorization
algorithms (Alexei, 1997):
--One-pass algorithms, applied directly to the primary image;
--Two-pass algorithms, which have skeletonization as a preliminary
phase.
2.1. Linear elements vectorization
The first step determines one end of the linear element.
The second step tracks the linear element using the circle method.
After a complete exploration around the moving point (x0, y0), one of
the following situations occurs:
1. One pixel set to 1 -- a new point (x, y) of the linear element has
been found. The current point becomes (x0 = x, y0 = y) and the circle
method continues.
2. Two or more pixels set to 1 -- an intersection of linear elements
has been reached (fig. 1). In this case the software either requires
human intervention, or the point is memorized so that the algorithm can
return to it after all other linear elements have been analyzed (a
backtracking-type algorithm).
[FIGURE 1 OMITTED]
3. No pixel set to 1 -- the end of the linear element and the end of
the analysis.
When two or more pixels are set to 1, the way to evaluate and
eliminate the ambiguity is presented in fig. 2.
[FIGURE 2 OMITTED]
If the linear elements are not tangential, the ambiguity is
eliminated by decreasing the analysis radius until a unique pixel set
to 1 is found.
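As an illustration, the tracking procedure above can be sketched in Python (the original software described in section 3 was written in Delphi); the default radius, the hit-collection details and the visited-disk bookkeeping are assumptions of this sketch, not taken from the text:

```python
import numpy as np

def circle_pixels(x0, y0, r):
    """Integer pixel coordinates lying on a circle of radius r around (x0, y0)."""
    pts = set()
    for t in np.linspace(0.0, 2.0 * np.pi, max(8, 8 * r), endpoint=False):
        pts.add((x0 + int(round(r * np.cos(t))), y0 + int(round(r * np.sin(t)))))
    pts.discard((x0, y0))
    return pts

def track_line(img, x0, y0, r=2):
    """Track a linear element with the circle method: advance while exactly
    one unvisited pixel set to 1 lies on the exploration circle, shrink the
    radius on ambiguity, stop when no pixel is found."""
    h, w = img.shape
    path = [(x0, y0)]
    visited = {(x0, y0)}
    while True:
        radius = r
        while True:
            hits = [(x, y) for (x, y) in circle_pixels(x0, y0, radius)
                    if 0 <= x < w and 0 <= y < h
                    and img[y, x] and (x, y) not in visited]
            if len(hits) <= 1 or radius == 1:
                break
            radius -= 1          # ambiguity: decrease the analysis radius
        if not hits:
            return path          # no pixel set to 1: end of the element
        # mark the explored disk as visited so the tracking keeps moving forward
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                if dx * dx + dy * dy <= radius * radius:
                    visited.add((x0 + dx, y0 + dy))
        # at a true intersection (several hits remaining at radius 1) a real
        # tool would ask the user or backtrack; the sketch takes the first hit
        x0, y0 = hits[0]
        path.append((x0, y0))
```

On a thin raster line the loop returns the ordered points from the chosen end to the opposite free end.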
2.2. Polygonal elements vectorization.
The first stage determines one pixel of the contour to be
vectorized.
The second stage follows the contour. It ends when the current pixel
coincides with the first one or when the current pixel leaves the map.
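The two stages can be sketched in Python using Moore-neighbour tracing as one possible realization of the contour-following stage; the neighbour ordering and the decision to skip off-map neighbours are choices of this sketch, not prescribed by the text:

```python
import numpy as np

# Clockwise order of the 8 neighbours, starting east: (dx, dy)
NEIGHBOURS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]

def follow_contour(img, x0, y0):
    """Trace a region border in a binary image, starting from a known
    contour pixel (x0, y0). The trace stops when it returns to the
    first pixel; neighbours outside the image are skipped."""
    h, w = img.shape
    contour = [(x0, y0)]
    x, y, backtrack = x0, y0, 4          # pretend we arrived from the west
    while True:
        for i in range(8):
            k = (backtrack + 1 + i) % 8
            nx, ny = x + NEIGHBOURS[k][0], y + NEIGHBOURS[k][1]
            if not (0 <= nx < w and 0 <= ny < h):
                continue                 # neighbour is off the map
            if img[ny, nx]:
                backtrack = (k + 4) % 8  # direction pointing back at (x, y)
                x, y = nx, ny
                break
        else:
            return contour               # isolated pixel, nothing to follow
        if (x, y) == (x0, y0):
            return contour               # closed: back at the first pixel
        contour.append((x, y))
```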
2.3. Punctual elements vectorization.
A shape signature is a one-variable function associated with a
planar shape. The signature is reversible: given a signature, the shape
can be rebuilt.
Signatures are invariant to translation because they are referenced
directly to the center of mass (Maguire et al., 1991). Invariance to
scale can be obtained by determining the average scale factor between
distances on the reference shape and distances on the analyzed shape.
Invariance to rotation can be obtained by a cyclic permutation of the
signature.
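A common concrete signature is the centroid-distance function; the Python sketch below shows how the three invariances described above can be realized (the resampling length and the cyclic-shift matching are assumptions of the sketch):

```python
import numpy as np

def centroid_distance_signature(points, n_samples=64):
    """Centroid-distance signature of a closed contour: the distance from
    the mass centre to the contour, resampled to a fixed length and divided
    by its mean (scale invariance). Referencing the mass centre gives
    translation invariance."""
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    d = np.linalg.norm(pts - center, axis=1)
    idx = np.linspace(0, len(d) - 1, n_samples).astype(int)
    sig = d[idx]
    return sig / sig.mean()

def signature_distance(sig_a, sig_b):
    """Compare under all cyclic permutations of one signature (rotation
    invariance) and keep the best match."""
    return min(np.abs(sig_a - np.roll(sig_b, k)).mean()
               for k in range(len(sig_b)))
```

A translated and scaled copy of a shape then yields a near-zero distance, while a genuinely different shape does not.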
Fourier Descriptors characterize the shape. This is a supervised
method and needs reference sets of Fourier descriptors.
If U^1(k) and U^2(k) are the Fourier Descriptors corresponding to
two shapes u^1(n) and u^2(n), then the shapes are similar if

d = min over (u0, α, φ0, n0) of Σ_n | u^1(n) - α e^(jφ0) u^2(n - n0) - u0 |^2 (1)

is close to zero. The parameters u0, α, φ0 and n0 result from the
condition of minimizing the effects of translation, scaling and
rotation.
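A Python sketch of shape comparison with Fourier descriptors: instead of explicitly minimizing over u0, α, φ0 and n0 as in (1), it uses the standard normalization shortcut (drop U(0), divide by |U(1)|, compare magnitudes only), which removes the same four effects; the number of coefficients kept is an assumption of the sketch:

```python
import numpy as np

def fourier_descriptor(contour, n_coeff=16):
    """Normalised Fourier descriptors of a closed contour given as
    (N, 2) points, encoded as u(n) = x(n) + j*y(n). Zeroing U(0)
    removes translation, dividing by |U(1)| removes scale, and keeping
    magnitudes only removes rotation and starting-point effects."""
    u = contour[:, 0] + 1j * contour[:, 1]
    U = np.fft.fft(u)
    U[0] = 0.0                       # translation invariance
    mags = np.abs(U) / np.abs(U[1])  # scale invariance
    return mags[1:n_coeff + 1]

def fd_distance(c1, c2, n_coeff=16):
    """Similarity measure: small values mean similar shapes."""
    return np.linalg.norm(fourier_descriptor(c1, n_coeff)
                          - fourier_descriptor(c2, n_coeff))
```

A rotated, scaled, shifted copy of a contour with a different starting point then gives a distance near zero.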
Mask-based punctual symbol recognition. This algorithm is a simple
method for recognizing conventional symbols, based on the correlation
between the theoretical and the real (scanned) image of the symbol.
If f(x, y) is the theoretical image and g(x, y) the scanned image,
the following relation should hold:
f AND g = f (2)
The relation above is an ideal one; because of scanning errors we
actually have:
f AND g = f', where f' ≠ f. (3)
To decide whether the images f and g are correlated, the following
algorithm is used:
1) Determine the differences between the theoretical and the scanned
image using a logical exclusive OR (XOR);
2) Determine the plus pixels (pP) and the minus pixels (pM).
If pP and pM are lower than a certain threshold ε, then g is the
symbol f affected by noise; ε is determined experimentally.
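The two-step test above can be written directly in Python; expressing the threshold as a fraction ε of the mask area is an assumption of this sketch (the text only says ε is determined experimentally):

```python
import numpy as np

def match_symbol(f, g, eps=0.05):
    """Mask-based symbol test: XOR the theoretical image f with the scanned
    image g, then count plus pixels pP (set only in the scan) and minus
    pixels pM (set only in the mask). Accept when both counts stay below
    eps * mask size; eps stands in for the experimental threshold."""
    diff = np.logical_xor(f, g)
    plus = int(np.logical_and(diff, g).sum())   # pP
    minus = int(np.logical_and(diff, f).sum())  # pM
    limit = eps * f.size
    return plus <= limit and minus <= limit
```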
2.4. Geodatabases error assessment.
The elements acquired in a geodatabase cannot be more precise than
the original information. For error assessment, algorithms will be
developed for the following accuracy indicators:
Accuracy indicators for linear elements. Linear element accuracy
can be split into two components (Heuvelink, 2000):
--Positioning accuracy;
--Shape fidelity.
Accuracy indicators for polygonal elements. The indicators for
linear elements will be applied along the polygon border.
Accuracy indicators for attributes. This algorithm will be based on
Cohen's Kappa (K) coefficient.
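Cohen's Kappa itself is standard; a small self-contained Python implementation of attribute agreement might look like this:

```python
import numpy as np

def cohens_kappa(reference, classified):
    """Cohen's Kappa: observed attribute agreement between the reference
    data and the geodatabase, corrected for agreement expected by chance.
    Returns 1 for perfect agreement, 0 for chance-level agreement."""
    ref = np.asarray(reference)
    cls = np.asarray(classified)
    p_observed = np.mean(ref == cls)
    p_chance = sum(np.mean(ref == c) * np.mean(cls == c)
                   for c in np.union1d(ref, cls))
    return (p_observed - p_chance) / (1.0 - p_chance)
```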
Accuracy indicators for conceptual accuracy. Conceptual accuracy
verifies the selection criteria and the quantity of information. It has
two components:
--Missing information;
--Supplemental information.
3. TEST RESULTS
To test the algorithms, a software package was developed in
Delphi 7.0. Scanned images in .bmp and .jpg format were used as input
data, and the ESRI shape format was used as output.
Figure 3 shows how a church symbol is recognized and marked with
the corresponding vector symbol from a symbol database.
All punctual elements identified on the map are stored in one file
of ESRI SHAPE type, based on unique codes. The codes are used to
associate the symbols from the symbol database for representation.
Figure 4 shows a comparison between the implemented algorithm for
automatic linear vectorization and a similar solution offered by one
of the most powerful GIS products on the market, ESRI ArcGIS. The
results are comparable in both accuracy and speed.
[FIGURE 3 OMITTED]
[FIGURE 4 OMITTED]
4. CONCLUSION
The developed algorithms for linear and polygonal element
vectorization were tested with good results and can be used
successfully for interactive vectorization of contour lines and
hydrography. The false pixel detection rate can be lowered by
decreasing the analysis radius, which also increases the number of
detected points.
The algorithm for automatic recognition of punctual symbol shapes
can be easily implemented, and the analysis speed for a class of
objects is independent of the image. To increase the speed, the
approximate coordinates of the element can be pointed out with a
locator, the algorithm then finding the precise coordinates and the
associated code.
Having a dual character, the proposed solutions can be implemented
in both military and civil environments. The Military Topographical
Department will be one of the major users of the proposed solutions.
5. REFERENCES
Alexei, A. (1997). Raster-Vector Conversion Algorithms, MTA Review,
Military Technical Academy, Bucharest, May 1997, pp. 59-62.
Heuvelink, G. (2000). Error Propagation in Environmental Modelling
with GIS, Taylor & Francis, London.
Liu, W., Dori, D. (1997). A Protocol for Performance Evaluation of
Line Detection Algorithms, Machine Vision and Applications,
Springer-Verlag, Vol. 9, No. 5-6, March 1997, pp. 240-250.
Maguire, D. J., Goodchild, M. F. (1991). Section I. Introduction.
In: Maguire, D. J., Goodchild, M. F., Rhind, D. W. (eds.),
Geographical Information Systems: Principles and Applications, Vol. 1,
Longman, London, pp. 3-7.
Nitu, C. (1992). Developing Trends in Automated Cartography,
Geodesy, Cartography and Cadastral Survey Review, No. 1, 1992, p. 27.