形状的圆内距离变换
摘 要
目的 形状的描述、匹配、相似性判定和检索是计算机视觉和图像识别的基本问题,也是一个开问题。在目前公开的方法中,除了只能应用于简单形状的几何复变换和基于边界的傅里叶描述子外,其他的方法均不能由构建的形状特征描述符重建原形状,因此不能保证所建立的形状特征能客观地描述原形状。本文提出了形状的圆内距离变换,该方法所建立的描述符可用于形状匹配、相似性度量和形状检索。该方法是可逆的,也就是可以从形状描述符重建原形状。方法 形状的圆内距离变换通过在形状的最小外接圆内旋转和切分形状,求出形状相邻切分点之间的距离,并由此构建形状的特征矩阵。对于任意相似的形状,从理论上证明了形状的圆内距离变换具有缩放、旋转和位移不变性。结果 对发生了形变、扭曲和仿射变换的形状,采用圆内距离变换方法进行了形状的相似性度量、检索和重建实验,结果表明,形状的圆内距离变换可以准确地描述形状、度量形状的相似性、检索形状并重建原形状。在形状的相似性度量上,形状的圆内距离变换能给出与人类视觉一致的结果,并且当两个形状相似时,还能计算出它们的尺度缩放和角度旋转。通过与经典的方法,包括形状上下文方法、傅里叶描述子方法、拉东柱状图方法,针对典型的MPEG-7形状库进行对比实验,发现形状的圆内距离变换在形状检索的综合得分上相比这些经典方法提高了近20%。结论 形状的圆内距离变换在形状的描述、相似性判定和检索上是有效和可逆的,具有广泛的可适用性且优于本文比较的其他经典方法。
关键词
Inside-circle distance transform for shapes
Wu Shaogen1, Nie Weiqing2, Lu Lijun3, Liu Yaqing3(1.Guangdong industry technical college, Guangzhou 510300, China;2.Guangzhou Radio Group, Guangzhou 510656, China;3.Southern Medical University, Guangzhou 510515, China) Abstract
Objective Description, matching, similarity measurement, and retrieval of shapes are basic tasks in computer vision, image recognition, and machine intelligence, and they remain as open issues. Except for the methods of geometry complex transform and shape contour-based Fourier descriptor, all other methods are not information preserving, which means that the original shapes cannot be reconstructed from their descriptors. Consequently, the descriptors cannot be guaranteed to fully represent the characteristics of original shapes. Although geometry complex transform and shape contour-based Fourier descriptor are information preserving, they are only applicable to simple closed shapes or some other special types of shapes, which limits their applications. We propose a generic shape descriptor, which is named inside-circle distance transform (ICD). The ICD method can be used for matching, similarity measurement, and retrieval of any shape with obvious contours, and it is information preserving. Method In the ICD method, we initially calculate the minimum circumscribed circle of a shape. Then, we draw a set of equidistant parallel lines that are perpendicular to x-axes, calculate the intersections of each line and shape contour, and compute the distance vector for each line. We finally form a distance matrix with all distance vectors. We yield another distance matrix by rotating the shape anticlockwise around the center of its minimum circumscribed circle by θ degree and repeating the aforementioned process. A set of distance matrices is generated by repeating the process for[360/θ] times. With all distance matrices of the shape in hand, we construct the feature matrix of the shape. The feature matrix of a shape is a representation of the original shape with rich information, which can be used to describe the original shape, to measure the similarity of two shapes and reconstruct the original shape. Therefore, ICD is an information-preserving method. This capability of ICD to reconstruct original shapes is useful. We can thoroughly understand the intrinsic of shapes using the ICD method. Feature matrix is a powerful tool for shape representation and shape matching. We prove that ICD is scaling, rotation, and translation invariant, which is an important property in shape description, matching, and retrieval. Result We construct 40 shapes to verify the capability and test the effectiveness of the ICD method. The shapes are categorized into eight classes, with five shapes in each class. In each of these eight classes, one basic shape exists, and the others are deformations of the basic shape through modifications, such as twisting the contour or adding noise. We further expand the set of shapes by performing affine transformation with random scale factor, random rotation angle, and random translation of position to generate two new shapes for each of the 40 shapes. This expansion results in a total of 120 shapes. We initially perform a similarity measurement between each of the eight basic shapes and each of 12 randomly selected shapes. Experimental result shows that in similarity measurement of shapes, the ICD method generates the same vision results as those of human vision. If two shapes are determined to be similar, then the ICD method can calculate their scaling factor and orientation differences. We also test the effectiveness of retrieval through the well-known method of "Bullseye score." Results show that 38 out of 40 subclasses achieve a score of 100, which is extremely satisfactory. We compare the ICD method with three other classic shape description and matching methods, namely, shape context method, histogram of Radon transform, and generic Fourier descriptor, on the basis of the widely used shape database of MPEG-7. This database consists of 70 classes, with 20 shapes in each class and hence a total of 1400 shapes. The Bullseye score method is adopted. Results indicate that all the methods under evaluation have their own advantages and disadvantages with respect to different shape classes. Moreover, on average, the test score of the ICD method is approximately 20 points higher than those of the other three classic methods. ICD shows significant effects outperforming those of the other methods in all experiments. In the reconstruction experiments, we randomly select four shapes from the MPEG-7 database and reconstruct them by using the ICD method with varying parameters k and θ, where k=30, 50, 150 and θ=1, 6. Experiment results show that the reconstructed shapes become accurate with the increase of parameter k and decrease of parameter θ. The reconstruction experiments also imply that shape reconstruction is more sensitive to parameter k than to parameter θ. In application scenarios in which the rotational angle is insignificant, θ=3 is an optimal recommendation by our experiments. Conclusion The ICD method and its corresponding feature matrix can be used to represent, match, and retrieve shapes effectively. This method has the prominent feature of being information preserving, thereby assuring that it represents a shape without losing information. When this method is used to compute similarity of shapes, it can generate the same result as that of human vision. For two similar shapes, the ICD method can compute their scale factor and rotation differences. Theoretical analysis, mathematical proofs, and experiments show that ICD is effective, useful, and information preserving, and it outperforms several important classic methods.
Keywords
|