Character Localization in Images Based on the DCT Compressed Domain

Huang Xianglin; Shen Lansun

DOI: 10.11834/jig.20020108
2002 | Volume 7 | Number 1


Huang Xianglin¹, Shen Lansun¹ (Signal and Information Processing Laboratory, Beijing University of Technology, Beijing 100022)

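Abstract

Text contained in images can support fast, efficient image browsing and retrieval, and fast character localization is an important step toward that goal. This paper therefore presents a method that locates characters directly in the image compressed domain. The method exploits the directional nature of character texture: it first extracts the horizontal, vertical, and diagonal texture information of characters directly in the DCT domain, and then segments the character regions from the image background according to a separate threshold for each direction. During processing, morphological filtering effectively removes noise points. The algorithm operates directly on compressed data whose coding is based on the DCT, such as JPEG and MPEG, and needs only a small amount of decoding (Huffman decoding) to locate characters. Because less data has to be processed, the algorithm both increases processing speed and reduces the demand on computing resources. Experimental results show that the method achieves relatively high accuracy.

Keywords

Character localization; DCT transform; Compressed-domain processing; Morphological filtering; Image processing; Image compressed domain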

Character-Localization in DCT-Compressed Domain


Abstract

Segmenting character regions in an image is important because the characters provide clear clues for retrieving and browsing images in video/image databases efficiently and effectively. In this paper, we propose a method that locates character regions of video frames and images directly in the DCT compressed domain. Character texture has distinguishing directional characteristics (such as the horizontal, vertical, and slanted lines in a character) that can be extracted directly in the DCT compressed domain, so character regions can be segmented from their backgrounds quickly, and the noise that arises during processing can be removed with a morphological filter. With this method, compressed bit streams produced by DCT-based coding algorithms such as JPEG and MPEG-1/2 can be processed directly to locate character regions; only a very small amount of decoding (Huffman decoding only) is required. As a result, less data has to be processed, processing is faster, and less computer memory is needed. The experimental results show that the algorithm achieves a high correct-localization rate.
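The pipeline summarized above (directional DCT features, per-direction thresholds, morphological clean-up) can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract does not give the exact features, thresholds, or structuring element, so everything below is an assumption. The sketch takes the summed magnitudes of the first-row, first-column, and main-diagonal AC coefficients of each 8×8 block as directional energies, flags a block when any energy exceeds its threshold, and applies a 1×3 binary opening. The function names, the threshold values, and `block_dct` (a stand-in that computes coefficients from pixels, where a real system would read them from the Huffman-decoded stream) are all hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (row k is the k-th basis vector)."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_dct(image, n=8):
    """Stand-in for decoded coefficients: 2-D DCT of every n x n block.
    Returns an array of shape (rows // n, cols // n, n, n)."""
    h, w = image.shape
    d = dct_matrix(n)
    blocks = image[:h - h % n, :w - w % n]
    blocks = blocks.reshape(h // n, n, w // n, n).swapaxes(1, 2)
    return d @ blocks @ d.T

def directional_energy(coeffs):
    """Assumed features: summed |AC| along first row, first column, diagonal."""
    a = np.abs(coeffs)
    row = a[..., 0, 1:].sum(axis=-1)   # changes along x (vertical strokes)
    col = a[..., 1:, 0].sum(axis=-1)   # changes along y (horizontal strokes)
    idx = np.arange(1, a.shape[-1])
    diag = a[..., idx, idx].sum(axis=-1)  # slanted strokes
    return row, col, diag

def _window_reduce(mask, kh, kw, reduce_fn):
    """Apply reduce_fn over a kh x kw window (odd sizes), padding with False."""
    ph, pw = kh // 2, kw // 2
    padded = np.pad(mask, ((ph, ph), (pw, pw)), constant_values=False)
    h, w = mask.shape
    shifted = [padded[i:i + h, j:j + w] for i in range(kh) for j in range(kw)]
    return reduce_fn(np.stack(shifted), axis=0)

def binary_opening(mask, kh=1, kw=3):
    """Erosion followed by dilation; removes runs narrower than the window."""
    eroded = _window_reduce(mask, kh, kw, np.all)
    return _window_reduce(eroded, kh, kw, np.any)

def locate_text_blocks(coeffs, thresholds=(200.0, 200.0, 100.0)):
    """Block-level text mask; the threshold values are illustrative only."""
    row, col, diag = directional_energy(coeffs)
    t_row, t_col, t_diag = thresholds
    candidate = (row > t_row) | (col > t_col) | (diag > t_diag)
    return binary_opening(candidate)

# Synthetic test image: flat background, a striped "text" band, one noise block.
image = np.full((64, 64), 128.0)
x = np.arange(64)
stripes = 128.0 + 100.0 * (2 * ((x // 2) % 2) - 1)
image[16:40, 8:56] = stripes[8:56]      # band covering blocks 2..4 x 1..6
image[56:64, 56:64] = stripes[56:64]    # isolated textured block at (7, 7)

mask = locate_text_blocks(block_dct(image))
print(mask.astype(int))
```

On this synthetic input the isolated block passes the threshold test but is dropped by the opening, while the striped band survives. A horizontal 1×3 structuring element is chosen here only because text lines tend to extend horizontally and may be a single block high; a different element may suit other layouts.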

Keywords

Character localization; DCT; Compressed domain processing; Morphological filtering
