Article information

Title: Local thresholding image binarization using variable-window standard deviation response.
Authors: Boiangiu, Costin-Anton ; Olteanu, Alexandra ; Stefanescu, Alexandru Victor et al.
Journal: Annals of DAAAM & Proceedings
Print ISSN: 1726-9679
Publication year: 2010
Issue: January
Language: English
Publisher: DAAAM International Vienna
Abstract: In content conversion systems, image binarization aims to separate pixels into two classes, one carrying information and the other background. The result of binarization is affected by the degradation of poorly preserved documents, as well as by inappropriate lighting conditions and handling during the acquisition stage. Since no solution overcomes all of these problems, this remains an open and active area of research.
Keywords: Document processing; Image processing

Local thresholding image binarization using variable-window standard deviation response.

Boiangiu, Costin-Anton ; Olteanu, Alexandra ; Stefanescu, Alexandru Victor et al.

1. INTRODUCTION

In content conversion systems, image binarization aims to separate pixels into two classes, one carrying information and the other background. The result of binarization is affected by the degradation of poorly preserved documents, as well as by inappropriate lighting conditions and handling during the acquisition stage. Since no solution overcomes all of these problems, this remains an open and active area of research.

Because document image binarization is the processing step at the base of a conversion system, it influences all subsequent processing steps and its output must therefore be of high quality. The algorithm presented in this paper aims to provide a reliable binarization mechanism.

2. RELATED WORK

Binarization techniques are mostly based on thresholding methods, which are classified as either global (Otsu, 1979) or local (Niblack, 1986; Gatos et al., 2006). The major problem with global thresholding is that it considers no relation between pixels other than their intensity: a single threshold is computed and used to classify all image pixels into foreground and background. To overcome this problem, local adaptive methods have been proposed (Boiangiu et al., 2009). These methods use information from a local area to determine a threshold value for each pixel. However, depending on the density of objects in the local region, a minor modification of the threshold value can bring significant changes in the output in some cases and only minor differences in others.
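To make the global case concrete, the following is a minimal sketch of a single-threshold binarization in the spirit of Otsu (1979), which picks the gray level that maximizes the between-class variance of the histogram. The function names, the flat list input and the use of "less than or equal" for the dark class are assumptions made for illustration only.

```python
def otsu_threshold(pixels):
    """Gray level (0..255) that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += hist[level]          # pixels with gray level <= level
        sum_dark += level * hist[level]
        if weight_dark == 0:
            continue                        # dark class still empty
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break                           # bright class empty from here on
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_variance, best_level = variance, level
    return best_level


def binarize_global(pixels):
    """Apply one threshold to every pixel: dark class -> 0, bright class -> 255."""
    threshold = otsu_threshold(pixels)
    return [0 if p <= threshold else 255 for p in pixels]
```

Because the same threshold is applied everywhere, a page whose background brightness drifts from one side to the other cannot be separated correctly by such a function, which is the limitation the local methods address.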

One of the most popular local threshold binarization methods was proposed by Niblack (Niblack, 1986). For each image pixel, this method calculates the mean and the standard deviation of the gray levels of the neighboring pixels found in a window of predefined size. This size influences the quality of the output: it should be small enough to preserve local details and large enough to suppress noise. The formula for determining the threshold is:

T = mean + k × stdev (1)

where mean is the average value of the pixels in the local area, stdev is the standard deviation of the same pixels, and k is a constant, preselected coefficient. Shortcomings of this method include the persistence of background noise and a significant sensitivity to the window size. To reduce the amount of background noise in homogeneous regions larger than the window size, an improved version of Niblack's method was proposed by Sauvola (Kasar et al., 2007). This method assumes that the gray values of text pixels are close to 0 and those of background pixels are close to 255. The formula for determining the threshold is:

T = mean × [1 + k × (stdev/R - 1)] (2)

where parameters R, the dynamic range of the standard deviation, and k are fixed to 128 and 0.5, respectively. Sauvola's and Niblack's algorithms are very rigid in their approach because they compute local statistical functions by operating on an image dependent, fixed window size. The current approach tries to take into account the advantages of both local statistics and variable window size for local threshold computation.
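Equations (1) and (2) can be compared on the same window with a short sketch. It assumes the window is passed as a flat list of gray levels and that the population standard deviation is used; neither detail is stated above. The default k = -0.2 for Niblack is the value used in the tests of Section 4, and the Sauvola defaults are the fixed values given above.

```python
import math


def window_stats(values):
    """Mean and population standard deviation of the gray levels in a window."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def niblack_threshold(values, k=-0.2):
    """Equation (1): T = mean + k * stdev."""
    mean, stdev = window_stats(values)
    return mean + k * stdev


def sauvola_threshold(values, k=0.5, R=128.0):
    """Equation (2): T = mean * [1 + k * (stdev / R - 1)]."""
    mean, stdev = window_stats(values)
    return mean * (1.0 + k * (stdev / R - 1.0))
```

For a nearly flat background window such as [198, 202, 198, 202], the Niblack threshold is 199.6, so the pixels of value 198 are turned black, while the Sauvola threshold is 101.5625 and the whole window stays white. This illustrates the background noise problem that equation (2) is meant to reduce.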

3. ALGORITHM DESCRIPTION

The problem with binarization is that no mature algorithm deals effectively with all of the problems found in the wide variety of scanned document images. As a step toward this goal, we propose a modification of Niblack's method, which is based on sliding a square window over the document image and calculating the mean and standard deviation of the gray pixel values in each window. However, instead of using a predefined value, the window size is computed dynamically.

The window is gradually grown until the standard deviation of the gray levels within the window, multiplied by the logarithm of the window size, reaches its first local maximum (see Figure 1). This is a compromise between the quality of the output and the speed of conversion.

[FIGURE 1 OMITTED]
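The stopping criterion can be sketched for a single pixel as follows. Several details are assumptions, since the text does not fix them: the window is square with side 2 × radius + 1, the "window size" inside the logarithm is taken to be that side length, the natural logarithm is used, the window is clipped at the image borders, and growth is capped at a hypothetical max_radius.

```python
import math


def window_response(image, row, col, radius):
    """Standard deviation of the clipped window times log of its nominal side."""
    height, width = len(image), len(image[0])
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    mean = sum(values) / len(values)
    stdev = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return stdev * math.log(2 * radius + 1)


def first_peak_radius(image, row, col, max_radius=20):
    """Grow the window until the response drops; return the radius at the peak."""
    best_radius = 1
    best_response = window_response(image, row, col, best_radius)
    for radius in range(2, max_radius + 1):
        response = window_response(image, row, col, radius)
        if response < best_response:
            break  # the previous window was the first local maximum
        best_radius, best_response = radius, response
    return best_radius
```

In this sketch a perfectly flat region never produces a drop in the response, so the window grows up to max_radius there; how the original implementation bounds the growth is not specified.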

The window size determined as above is then used for computing the mean of gray level values of pixels within the current window. The threshold T for determining if the pixel is converted to black or white is calculated using:

T = mean × (1 + t) (3)

where t is a preselected coefficient. Because the mean depends on the window size, which is itself selected through the standard deviation response, the direct contribution of the standard deviation to the threshold value has been removed.

Furthermore, the preselected coefficient t plays a major role. This coefficient should be chosen negative when the window size imposed by the standard deviation threshold is small, in order to suppress noise, and positive when the computed window size is larger, for preserving local details.
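The effect of the sign of t on equation (3) can be shown with a small sketch. The selection rule in choose_t, including the cutoff small_radius and the magnitude 0.1, is purely hypothetical: the text only states which sign is preferable for small and for large windows.

```python
def choose_t(radius, small_radius=3, magnitude=0.1):
    """Hypothetical rule: negative t for small windows, positive for larger ones."""
    return -magnitude if radius <= small_radius else magnitude


def local_threshold(mean, t):
    """Equation (3): T = mean * (1 + t)."""
    return mean * (1.0 + t)
```

With a window mean of 160, t = -0.1 lowers the threshold to 144, so fewer pixels are turned black, while t = 0.1 raises it to 176.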

The main algorithm steps are:

1. For each image pixel, compute the sum and the sum of squares of the gray levels of the pixels contained in the rectangular area spanned by the top-left corner of the image and the current pixel (integral images);

2. For each image pixel do:

a) grow the window size until the standard deviation of the current window multiplied by the logarithm of the window size is smaller than the value computed for the previous window;

b) determine the mean of the current window and then compute the local threshold using (3);

c) set the pixel to 0 (black) if its gray level is lower than the obtained threshold, or to 255 (white) otherwise.

This approach assumes that a rectangular window whose size corresponds to the first local maximum in the series of standard deviation responses described above offers adequate local statistics, so that the most appropriate threshold value for the window's center pixel can be determined from the mean gray level in that window. In this way, the window size is selected so as to preserve local properties and to suppress noise at the same time.
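The steps above can be put together in a compact sketch that uses the integral images of step 1 to obtain the mean and standard deviation of any window in constant time. It is an illustration under assumptions rather than the original implementation: the image is a list of rows of integers in 0..255, windows are square with side 2 × radius + 1 and clipped at the borders, the natural logarithm of the nominal side length is used, and growth is capped at a hypothetical max_radius.

```python
import math


def integral_images(image):
    """Step 1: summed-area tables of gray levels and of squared gray levels."""
    height, width = len(image), len(image[0])
    sums = [[0] * (width + 1) for _ in range(height + 1)]
    squares = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        for c in range(width):
            v = image[r][c]
            sums[r + 1][c + 1] = v + sums[r][c + 1] + sums[r + 1][c] - sums[r][c]
            squares[r + 1][c + 1] = (v * v + squares[r][c + 1]
                                     + squares[r + 1][c] - squares[r][c])
    return sums, squares


def box_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


def window_mean_stdev(sums, squares, row, col, radius, height, width):
    """Mean and standard deviation of the clipped window around (row, col)."""
    top, bottom = max(0, row - radius), min(height, row + radius + 1)
    left, right = max(0, col - radius), min(width, col + radius + 1)
    n = (bottom - top) * (right - left)
    mean = box_sum(sums, top, left, bottom, right) / n
    variance = box_sum(squares, top, left, bottom, right) / n - mean * mean
    return mean, math.sqrt(max(variance, 0.0))  # guard against rounding below 0


def binarize(image, t=0.0, max_radius=20):
    height, width = len(image), len(image[0])
    sums, squares = integral_images(image)
    output = [[255] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Step 2a: grow until stdev * log(side) falls below the previous value.
            radius = 1
            mean, stdev = window_mean_stdev(sums, squares, row, col,
                                            radius, height, width)
            response = stdev * math.log(2 * radius + 1)
            while radius < max_radius:
                next_mean, next_stdev = window_mean_stdev(
                    sums, squares, row, col, radius + 1, height, width)
                next_response = next_stdev * math.log(2 * radius + 3)
                if next_response < response:
                    break
                radius, mean, response = radius + 1, next_mean, next_response
            # Step 2b: threshold from the mean of the selected window, equation (3).
            threshold = mean * (1.0 + t)
            # Step 2c: black if below the threshold, white otherwise.
            if image[row][col] < threshold:
                output[row][col] = 0
    return output
```

On a 21 × 21 white image containing a 3 × 3 black square, calling binarize with t = 0 turns exactly the nine pixels of the square black and leaves the rest white.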

4. VISUAL EXAMPLES

The tests were performed on scanned documents consisting of various old library documents and newspapers. Most of the document scans used for testing presented problems such as low and uneven contrast, noise, and inconsistent lighting across the page. The results obtained using the proposed algorithm were compared with those of Niblack's original method (15x15 window size and k = -0.2), Sauvola's method (He et al., 2005) (15x15 window size and k = 0.5) and the global threshold method proposed by Otsu. For the proposed method the parameter t was set to 0.

[FIGURE 2 OMITTED]

Two test images are used to illustrate the results. The first image, shown in Figure 2, is representative of historical books set in Fraktur-style fonts, with heavy background noise, high text density and difficulties caused by uneven exposure during document acquisition. Compared with the other methods tested, the proposed approach correctly classified the object pixels without misclassifying noise pixels; its result is very similar to that of Sauvola's method and is the best of those obtained. In the second test, presented in Figure 3, the proposed method produced by far the best results, even in comparison to Sauvola's. This case is representative of historical newspapers with large areas of uneven contrast and lighting, Antiqua-style fonts, medium to low text density and blurry textual regions.

[FIGURE 3 OMITTED]

5. CONCLUSIONS AND FUTURE WORK

The proposed method is very useful in content conversion systems, mainly due to its resistance to noise and imperfections, which allows subsequent stages to benefit from a clean object/background classification of the image data. The comparison with well-known image binarization algorithms, both local and global, has shown that the proposed algorithm succeeded in producing excellent output. Our future work will focus on improving the results in the case of large compact noise zones.

6. ACKNOWLEDGMENT

The research presented in this paper is supported by the national project "Excelenta in cercetare prin programe postdoctorale in domenii prioritare ale societatii bazate pe cunoastere (EXCEL)", Project POSDRU/89/1.5/S/62557.

7. REFERENCES

C.-A. Boiangiu, A. I. Dvornic, and D. C. Cananau (2009). Binarization for digitization projects using hybrid foreground-reconstruction. Proceedings of the IEEE 5th International Conference on Intelligent Computer Communication and Processing, Cluj-Napoca, Romania, August 27-29, 2009, pp. 141-144

B. Gatos, I. Pratikakis, and S. J. Perantonis (2006). Adaptive degraded document image binarization. Pattern Recognition. New York, NY, USA: Elsevier Science Inc., vol. 39, no. 3, pp. 317-327

J. He, Q. D. M. Do, A. C. Downton, and J. H. Kim (2005). A comparison of binarization methods for historical archive documents. Proceedings of the Eighth International Conference on Document Analysis and Recognition. Washington, DC, USA: IEEE Computer Society, pp. 538-542

T. Kasar, J. Kumar, and A. G. Ramakrishnan (2007). Font and background color independent text binarization. Second International Workshop on Camera-Based Document Analysis and Recognition, Bangalore, India

W. Niblack (1986). An introduction to digital image processing. Englewood Cliffs, NJ: Prentice-Hall, pp. 115-116

N. Otsu (1979). A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man and Cybernetics, vol. 9, pp. 62-66. ISSN 0018-9472