A Binarization Method for Color Text Images Based on Graph-Theoretical Clustering and Binary Texture Analysis

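Li Xiangfeng (李向丰), Wang Bin (汪斌), Liu Feng (刘峰), Hu Fuqiao (胡福乔) (Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200030)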

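Abstract
To segment color text images effectively, this paper proposes a new method for separating text from background in color images with complex backgrounds. The method first applies color-space dimensionality reduction and graph-theoretical color clustering to the color text image and derives a series of binary images from the clustering result; these binary images, together with combinations of them, form the candidate binarization results. It then analyzes two classes of texture features, one related to run-length histograms and one related to the spatial size distribution, and uses a linear discriminant analysis classifier to select from the candidates the binary image that best separates text from background. Experimental results show that the method binarizes markedly better than existing methods and can therefore segment color text images with complex backgrounds more effectively.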
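The candidate-generation step can be illustrated with a minimal sketch. It assumes the clustering has already produced a per-pixel label map, and it assumes that "combination" means the union (logical OR) of cluster masks; the abstract does not specify the combination rule. The function name candidate_binary_images and the parameter max_union are hypothetical.

```python
from itertools import combinations


def candidate_binary_images(labels, max_union=2):
    """Build candidate binary images from a cluster label map.

    Assumption: a candidate is the mask of one cluster or the union of up
    to `max_union` cluster masks. The paper's actual combination rule is
    not given in the abstract.
    """
    cluster_ids = sorted({v for row in labels for v in row})
    candidates = {}
    for k in range(1, max_union + 1):
        for combo in combinations(cluster_ids, k):
            chosen = set(combo)
            # A pixel is foreground (1) if its cluster is in the chosen set.
            candidates[combo] = [
                [1 if v in chosen else 0 for v in row] for row in labels
            ]
    return candidates


# Toy label map with three color clusters (0, 1, 2).
labels = [
    [0, 0, 1],
    [2, 1, 1],
]
cands = candidate_binary_images(labels)
```

Each key of cands is the tuple of cluster ids that form the foreground of that candidate, so a later selection step can report which clusters were judged to be text.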
Keywords
The Binarization for Color Text Images Based on Graph-theoretical Clustering and Binary Texture Analysis


Abstract
Text is an important feature in computer vision, especially for information retrieval applications. In this paper, the authors develop a novel algorithm for text-background separation, or binarization, of color images with complicated backgrounds. The algorithm first performs dimensionality reduction and graph-theoretical clustering. A binary image is obtained for each cluster, and additional binary images are obtained by combining these cluster-related binary images. Two kinds of features that effectively characterize binary texture images, run-length-histogram-based and spatial-size-distribution-based features, are then extracted from each binary image. Based on an analysis of these texture features with a linear discriminant analysis (LDA) classifier, the binary image that gives the best text-background separation is selected as the final binarization result. Experiments on images collected from the Internet show that the method handles color text images with complex backgrounds effectively, and a comparison with existing techniques shows a notable improvement.
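As an illustration of a run-length-histogram feature, the following sketch counts horizontal runs of a given pixel value in a binary image. It is one plausible form of such a feature, not the authors' definition: the scan direction, the bin count max_len, and the function name run_length_histogram are all assumptions, since the abstract does not give these details.

```python
from itertools import groupby


def run_length_histogram(binary, value=1, max_len=8):
    """Histogram of horizontal run lengths of `value` in a binary image.

    Bin i counts runs of length i + 1; runs longer than `max_len` are
    placed in the last bin. (Assumed feature layout, for illustration.)
    """
    hist = [0] * max_len
    for row in binary:
        # groupby splits each row into maximal runs of equal pixels.
        for pixel, run in groupby(row):
            if pixel == value:
                length = sum(1 for _ in run)
                hist[min(length, max_len) - 1] += 1
    return hist


image = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
]
hist = run_length_histogram(image, max_len=4)  # foreground runs: 2, 1, 3, 1
```

Text regions tend to produce many short, regular runs, which is why histograms of this kind are a natural input to a classifier such as LDA when ranking candidate binary images.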
Keywords
