Text Detection and Segmentation Based on Connected-Component Features

Jiang Renjie, Qi Feihu, Xu Li, Wu Guorong (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240)

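Abstract
Text recognition in natural scenes has great practical value, but its application has long been limited by text detection and segmentation techniques. To detect and segment text more effectively, this paper proposes an algorithm for detecting and segmenting text in natural scenes based on connected-component features. The algorithm first decomposes the original image into many connected components using the Niblack method. A two-stage classification module, consisting of a cascade classifier followed by an SVM, then verifies the text features of these connected components. Because text and non-text connected components differ in their features, most non-text components are discarded by the cascade classifier, and the SVM further verifies the remaining ones, so the final output is a binary image containing only text. In evaluation experiments on test data, the algorithm achieved a detection precision above 90% and a recall above 93%.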
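The first step described above, Niblack binarization followed by decomposition into connected components, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window size, the value of `k`, the choice of 8-connectivity, and the assumption of dark text on a lighter background are all illustrative, and the function names `niblack_threshold` and `connected_components` are hypothetical.

```python
from collections import deque

import numpy as np


def niblack_threshold(gray, window=15, k=-0.2):
    """Per-pixel Niblack threshold T = mean + k * std over a window x window
    neighbourhood. `window` and `k` are assumed values, not from the paper."""
    assert window % 2 == 1, "window must be odd"
    gray = gray.astype(np.float64)
    h, w = gray.shape
    pad = window // 2
    padded = np.pad(gray, pad, mode="reflect")
    # Integral images of values and squared values, with a leading zero row/column.
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    ii2 = np.pad(padded ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

    def window_sum(s):
        # Sum over the window centred on each original pixel.
        return (s[window:window + h, window:window + w]
                - s[:h, window:window + w]
                - s[window:window + h, :w]
                + s[:h, :w])

    n = window * window
    mean = window_sum(ii) / n
    var = np.maximum(window_sum(ii2) / n - mean ** 2, 0.0)
    return mean + k * np.sqrt(var)


def connected_components(mask):
    """Group foreground pixels into 8-connected components.
    Returns a list of components, each a list of (row, col) coordinates."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or seen[y, x]:
                continue
            queue = deque([(y, x)])
            seen[y, x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components
```

For dark text on a light background, the foreground mask would be `gray < niblack_threshold(gray)`, and each component returned by `connected_components(mask)` becomes a candidate for the classification stages.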
Using Connected-Components' Features to Detect and Segment Text


Abstract
Text recognition in natural scenes has a promising future, but its application is limited by the available techniques for text detection and segmentation. To detect and segment text effectively, this paper proposes an approach for detecting and segmenting text in scene images using connected-component features. First, the image is decomposed into a list of connected components (CCs) by the Niblack algorithm. The features of all CCs are then verified by a two-stage classification module composed of a cascade classifier and an SVM. Most non-text CCs are filtered out by the cascade classifier, and the remaining CCs are further verified by the SVM. The final outputs are binary images containing text only. In experiments on many images, precision exceeds 90% and recall exceeds 93%.
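The control flow of the two-stage verification described above might look like the sketch below. It is only a structural illustration: the features (`height`, `aspect_ratio`, `fill_ratio`), their thresholds, and the names `Component`, `CASCADE`, and `two_stage_filter` are hypothetical, since the abstract does not list the features used. The hand-set thresholds stand in for a trained cascade, and the second stage is passed in as a plain callable standing in for a trained SVM (for example, a wrapper around a fitted `sklearn.svm.SVC`).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Component:
    width: int        # bounding-box width in pixels
    height: int       # bounding-box height in pixels
    pixel_count: int  # number of foreground pixels in the component


def aspect_ratio(c: Component) -> float:
    return c.width / c.height


def fill_ratio(c: Component) -> float:
    return c.pixel_count / (c.width * c.height)


# Each cascade stage is (feature, low, high). All values are assumed for
# illustration; a real cascade would learn its stages from training data.
Stage = Tuple[Callable[[Component], float], float, float]
CASCADE: List[Stage] = [
    (lambda c: float(c.height), 8.0, 300.0),
    (aspect_ratio, 0.1, 10.0),
    (fill_ratio, 0.1, 0.95),
]


def passes_cascade(c: Component, stages: Sequence[Stage] = CASCADE) -> bool:
    # all() short-circuits, so a component is rejected at the first failing
    # stage and later (possibly costlier) features are never computed for it.
    return all(lo <= feature(c) <= hi for feature, lo, hi in stages)


def two_stage_filter(components: Sequence[Component],
                     second_stage: Callable[[Component], bool]) -> List[Component]:
    """Keep components that pass the cheap cascade and then the second-stage
    classifier (a stand-in for the SVM)."""
    survivors = [c for c in components if passes_cascade(c)]
    return [c for c in survivors if second_stage(c)]
```

The point of this ordering is that the inexpensive cascade removes most non-text candidates early, so the more expensive classifier only has to examine the few that remain.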
