
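Li Xiangfeng1, Wang Bin1, Liu Feng1, Hu Fuqiao1 (Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200030)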

Abstract
The Binarization for Color Text Images Based on Graph-theoretical Clustering and Binary Texture Analysis


Text is an important feature in computer vision, especially for information retrieval applications. In this paper, the authors present a novel algorithm for text/background separation, or binarization, of color images with complicated backgrounds. The algorithm first performs dimensionality reduction and graph-theoretical clustering. Each cluster yields one binary image, and additional binary images are obtained by combining these cluster-related binary images. Two kinds of features that effectively characterize binary texture images, run-length-histogram-based features and spatial-size-distribution-based features, are then extracted from each binary image. By analyzing these texture features with an LDA classifier, the binary image that gives the best text/background separation is selected as the final binarization result. Experiments on images collected from the Internet show that the method handles color text images with complex backgrounds effectively, and a comparison with existing techniques shows a notable improvement from the proposed method.
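
As an illustration of the run-length-histogram idea mentioned above, the following is a minimal Python sketch. The abstract does not give the paper's exact feature definition, so this is an assumption: it counts horizontal runs of foreground pixels only and normalizes the counts. The function names `run_lengths` and `run_length_histogram` and the `max_len` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def run_lengths(row):
    """Lengths of consecutive runs of 1s in a 1-D binary sequence."""
    # Pad with zeros so runs touching either border are closed properly.
    padded = np.concatenate(([0], np.asarray(row, dtype=np.int8), [0]))
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)   # positions where a run begins
    ends = np.flatnonzero(diff == -1)    # positions just past where a run ends
    return ends - starts

def run_length_histogram(binary, max_len=8):
    """Normalized histogram of horizontal foreground run lengths.

    Assumption for this sketch: runs longer than max_len are counted
    in the last bin.
    """
    hist = np.zeros(max_len, dtype=float)
    for row in np.asarray(binary):
        for length in run_lengths(row):
            hist[min(int(length), max_len) - 1] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist
```

A feature vector of this kind could be computed for every candidate binary image and passed to a classifier; text regions tend to produce many short runs of similar length (stroke widths), whereas background clutter tends to give a flatter distribution.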
