文章基本信息

标题：Confidence measure for skew detection in photographed documents.
作者：Boiangiu, Costin-Anton ; Rosner, Daniel ; Olteanu, Alexandra 等
期刊名称：Annals of DAAAM & Proceedings
印刷版ISSN：1726-9679
出版年度：2010
期号：January
语种：English
出版社：DAAAM International Vienna
摘要：In recent years, more and more institutions, companies and libraries have made the transition from plain paper documents and books towards digital content. As many of them retain large paper based archives that need to be integrated in their new digital systems, faster and more robust automated content conversion systems are needed to speed up and cut down on costs for converting books and archives to digital form.
关键词：Imaging;Imaging systems;Quality control

Confidence measure for skew detection in photographed documents.

Boiangiu, Costin-Anton ; Rosner, Daniel ; Olteanu, Alexandra 等

1. INTRODUCTION

In recent years, more and more institutions, companies and libraries have made the transition from plain paper documents and books towards digital content. As many of them retain large paper based archives that need to be integrated in their new digital systems, faster and more robust automated content conversion systems are needed to speed up and cut down on costs for converting books and archives to digital form.

Given the size of such archives, which in many cases can be in the range of millions of pages, improvements in the automatic digitalization system can lead to savings in terms of time, man power and mainframe power consumption.

The first stage in an automated content conversion and one that can induce many errors is the scanning procedure. A typical fault, and the one this paper will address, is the scanning or photographing of the input image at a skew angle. There are many scenarios for this error to occur, but generally there are caused either by poor alignment on the scanning surface, by human error or by imperfections in the optic system of the scanning device. Skew errors can also be expected when scanning thick books that cannot be properly placed on the scanner surface, or place leveled in front of a camera.

The current article presents an approach for detecting the skew angle of an image, with a new method for estimating a confidence level for the detected skew angle in the presence camera lens distortions or similar faults.

2. RELATED WORK

Skew detection and correction for scanned or photographed images is a field of great interest and thus many solutions have been proposed over the years.

There are four main classes of algorithms (Baird, H. S. 1995): Hough transform, projection profiling, cross-correlation and neighbor clustering.

Hough transform (Chandan Singh et al., 2008) and projection profiling based algorithms are constructed on the presumption that parallel lines of text will generate collinear agglomerations of foreground pixels, but use different mathematic representations of the idea. Cross correlation based algorithms use the general coherency between stripes from lines of text that can be expected in a skew free document.

Neighbor clustering techniques use clustering methods to rebuild lines of text. Then, using specific feature-points selected from inside a detected line of text, the skew angle is estimated by various line fitting techniques (R. Smith., 1995), (Lu, Y et al., 2003).

[FIGURE 1 OMITTED]

All this techniques will generally yield excellent results, but will lose precision in the presence of camera lens distortions. Moreover, these techniques cannot offer a measure of the accuracy of the returned skew angle.

3. ALGORITHM

The present algorithm is based on the observation that good skew features can be extracted from the borders of image entities. For most characters, especially Latin ones, bottom borderlines can provide excellent measurement of skew angles. The present skew detection algorithm identifies lines of text and extracts skew angle information based on these lines.

In order to overcome errors introduced by camera lens distortions in images, the current paper introduces a modified skew angle estimation method and a new approximation of the accuracy of results.

3.1 Preprocessing

If the image is not bitonal, the image will first be binarized, using an existing algorithm, like (J. Sauvola, 2000). ntities are then discovered and contour information is preserved.

3.2 Noise removal and entities filter

The main filter in the present work is a minimal size threshold used to remove noise. Considering the 6 point letter 'e' as a minimum size reference, it can be shown that most Latin characters are not readable below this size. Thus, entities below 6 pixels can be viewed as noise.

Considering that the human eye cannot distinguish, unaided, details smaller than 0.1 mm, entities below 0.6 mm of height can be removed from the computation. Thus, the minimum accepted height, measured in pixels, will be (considering that an inch has 25.4 mm):

noise_threshold = round (max (6,0.6*Image_DPI/25.4)). (1)

A second heuristic threshold is used for large entities removal. If too much of the input entities are removed, the second threshold can be reduced in order to retain more input data. Further filtering is done within the clustering algorithm.

3.3 Cluster construction

Clusters are initially constructed based on a vicinity clustering algorithm that is optimized for horizontal Latin based text.

A character's vicinity is defined as rectangular area: of 120% height, and of a length of 200% the character's height, starting from its right bottom pixel. A maximum height difference of 200% is also used. This character clustering algorithm yields good results in the presence of skew angle in the range [+ or -]15[degrees] and normal camera lens distortions.

3.4 Cluster selection

It is important that the clusters offering better skew information are retained, while the clusters that can introduce errors are discarded.

Clusters of small length will generally not retain enough input information for an accurate skew detection. However, depending on the page layout, there are pictures that do not contain sufficiently long lines. Thus, cluster filtering will be tried in an iterative fashion, from highest precision, until a sufficient number of clusters are found:

cluster_threshold = 1/tan( i x angular_precisori) (2) where i is the iteration number and angular_precisionis the desired skew angle detection accuracy.

3.5 Skew angle estimation

For each cluster, two approximate skew angle computations are made by means of a least square fitting method performed on the bottom pixels of each entity in the cluster. The first is used to find the direction of the skew fault, and the second is used to find the angle, while removing descenders in the computation.

Next, the input points are rotated towards the OX axes with the determined skew angle.

Finally, the data points are fitted to a parabolic function in order to determine the straightness of the cluster. There are two indicators of the quality of a cluster for skew detection: cluster length and cluster "flatness"--determined by the distance between the focal point and the extreme of the parabolic function.

Thus, considering that the cluster length in pixels is cluster_length and that the fitting function was determined as:

a [x.sup.2] + b x + c (3)

a confidence degree can be defined as:

confidence_degree = cluster_length x a . (4)

3.6 Final skew angle computation

A histogram is used to eliminate values that are too far off the main peaks. The final skew angle computation takes into account the confidence of each cluster, and is thus computed as a weighted mean between the skew angle and the confidence degree for each cluster.

final_angle = [SIGMA](angle [i])*confidence _degree[i])/ [SIGMA]confidence _degree [i] (5)

3.7 Cluster recomputation

After the final skew angle was computed, if the input data quality is considered of low quality, a new computation can be made, by recomputing clusters using a modified version of the vicinity clustering that account for the direction and the approximate angle of the skew fault.

Finally, steps 3.4 to 3.7 are reiterated and the final skew angle is returned.

4. DEGREE OF CONFIDENCE OF FINAL RESULTS ESTIMATION

As the next steps in an automated digitalization process depend on the results obtained by the skew detection algorithm, a degree of confidence is introduced as a second returned value for the presented algorithm.

The expected error is inverse proportional to the length of the clusters and with the straightness of the detected parabolas.

Thus, the algorithm will return two values:

--the skew angle, in degrees

--a confidence measure

The confidence measure is given by the formula:

[SIGMA] confidence_degree[i] / number_of_clusters (6)

In the present form, the confidence measure is not normalized, but represents a way of evaluating the quality of the returned value by comparing the confidence measure returned for various image.

Moreover, in systems that use feedback loops, an image can be processed several times, each time with different preprocessing filters, and the confidence measure can be used to determine which filters yield the best results.

5. FURTHER WORK

For now, the algorithm achieves great accuracy and provides a measure of the degree of accuracy for the returned skew angle, depending on the quality of the input data.

Further work will be done in order to obtain a normalized degree of confidence that will likewise be returned together with the skew angle, but providing a better measure of the skew detection success. Two values shall be returned.

First, the algorithm will return the angle accuracy, measured in degrees, which will represent the maximum possible error in the returned angle, and thus an angle space in which the true skew angle will surely reside.

The second value will be a percentage indicating the probability that the returned skew angle is the actual true skew angle.

6. ACKNOWLEDGMENTS

The research presented in this paper is supported by the national project "Excelenta in cercetare prin programe post doctorale in domenii prioritare ale societatii bazate pe cunoastere (EXCEL)", Project POSDRU/89/1.5/S/62557.

7. REFERENCES

Baird, H. S. (1995). The skew angle of printed documents. Document Image Analysis, Eds. IEEE Computer Society Press, Los Alamitos, CA, 204-208

Bhatia, C.S.N. & Kaur, A. (2008). Hough transform based fast skew detection and accurate skew correction methods. Pattern Recognition, Volume 41, Issue 12, pp. 3528-3546

Sauvola, J. & Pietikainen, M. (2000). Adaptive document image binarization. Pattern Recognition, 33(2): 225-236

Lu, Y. & Tan, C. L. (2003). Improved Nearest Neighbor Based Approach to Accurate Document Skew Estimation. Proceedings of the Seventh international Conference on Document Analysis and Recognition--Volume I. CDAR. IEEE Computer Society, Washington, DC, pp. 503

Smith, R. (1995). A Simple and Efficient Skew Detection Algorithm via Text Row Accumulation. Proc. of the 3th International Conference on Document Analysis and Recognition, Montreal, Canada, pp. 1145-1148