Single-person adult image recognition based on torso detection
Abstract
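Objective Pornographic images spread widely on the Internet, making their automatic recognition and filtering increasingly important, yet most existing adult image recognition methods tend to produce false positives on benign images that contain large skin-colored regions. To address this, targeting the single-person pornographic portrait images common on the web and building on an analysis of the shortcomings of existing methods, an adult image recognition algorithm that takes the torso as the region of interest is proposed. Method First, a Poselet-based (pose part) human torso detection method locates the torso region, which is closely related to pornographic content; then discriminative Fisher vectors are extracted from the torso region; finally, a linear support vector machine (SVM) performs the classification. However, because human appearance varies greatly, the highest-confidence position output by the torso detector is often offset from the true torso position. To overcome this drawback, an adaptive algorithm is proposed: it adaptively selects multiple candidate torso regions according to the confidence output by the torso detector and obtains the final result by integrating the classification results of these regions. In addition, to train the torso-based SVM classifier and validate the algorithm, a large-scale dataset of 30000 single-person pornographic portrait images was collected by downloading from the Internet, and the pornographic body parts were annotated; the annotations can be used to generate training data automatically. Result The proposed torso-based adaptive classification algorithm achieves a recognition accuracy of 91.7% on the collected large-scale dataset, clearly higher than that of the traditional skin-color model; the improvement is especially pronounced for images with much exposed skin or many skin-colored regions, such as swimsuit models. Conclusion The Poselet-based torso detection captures information more relevant to pornographic content. Compared with traditional methods, it therefore detects adult images fairly accurately while effectively reducing the false positive rate on benign images with much exposed skin, meeting the requirements of practical applications.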
Keywords
Adult image recognition based on torso detection
Chen Xiao, Jin Xin, Tan Xiaoyang (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
Abstract
Objective With the steady growth in the number of images publicly available on the Internet, adult image recognition is of great significance for web security and content monitoring. This paper studies single-person adult pictures. Current skin-based adult image recognition algorithms usually have a high false positive rate. Here, an adult image recognition algorithm that takes the human torso as the region of interest (ROI) is proposed to reduce false positives. Method The proposed algorithm uses Poselets to detect ROIs that carry plenty of discriminative information. Each Poselet provides examples for training a linear SVM (support vector machine) classifier that can be run over the image in a multiscale scanning mode, after which the outputs of these Poselet detectors vote for the localization of the torso and other body parts. Then, based on the ROIs, discriminative Fisher vectors for nude breast image classification are obtained. However, owing to variations in human body appearance, the highest-confidence output of the torso detector may be offset from the ground-truth torso position. An adaptive algorithm is proposed to overcome this weakness. It selects several torso candidate areas according to the confidence values produced by the torso detector and then integrates the classification results of these areas to obtain the final result. To train the SVM classifier based on torso detection, a set of 30 000 pornographic images was collected, and the pornographic regions in the images were manually labeled. The labeling information can be used to generate the training data automatically. Result To evaluate the method, a new large dataset is built, which includes adult, benign, and bikini images. Experiments on this dataset show that the proposed method obtains an accuracy of 91.7%, which is much higher than that of the traditional skin color-based method.
Conclusion The Poselet-based torso detection method obtains more pornography-related information compared with other methods. Thus, the proposed method can detect adult images with a high detection rate and a low false positive rate, making it suitable for practical applications.
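The adaptive step described in the Method part (selecting several torso candidates by detector confidence and integrating their classification results) could be sketched roughly as follows. The abstract does not state the selection rule or the fusion rule, so everything concrete here is an assumption made for illustration: the relative-confidence threshold `rel_threshold`, the cap `max_k`, the confidence-weighted average in `fuse_scores`, and the names `TorsoCandidate`, `select_candidates`, `fuse_scores`, and `classify_image` are all hypothetical. The `score_region` callback stands in for the Fisher-vector plus linear-SVM scorer.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


@dataclass
class TorsoCandidate:
    box: Box
    confidence: float  # torso detector confidence, assumed positive here


def select_candidates(
    candidates: Sequence[TorsoCandidate],
    rel_threshold: float = 0.7,  # hypothetical: fraction of the best confidence
    max_k: int = 5,              # hypothetical: upper bound on kept regions
) -> List[TorsoCandidate]:
    """Keep the candidates whose confidence is close to the best one."""
    if not candidates:
        return []
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    best = ranked[0].confidence
    kept = [c for c in ranked if c.confidence >= rel_threshold * best]
    return kept[:max_k]


def fuse_scores(
    candidates: Sequence[TorsoCandidate],
    score_region: Callable[[Box], float],
) -> float:
    """Confidence-weighted average of per-region classifier scores (assumed rule)."""
    total = sum(c.confidence for c in candidates)
    if total <= 0:
        raise ValueError("candidate confidences must sum to a positive value")
    weighted = sum(c.confidence * score_region(c.box) for c in candidates)
    return weighted / total


def classify_image(
    candidates: Sequence[TorsoCandidate],
    score_region: Callable[[Box], float],
    decision_threshold: float = 0.0,
) -> bool:
    """Return True if the fused score flags the image as adult content."""
    kept = select_candidates(candidates)
    if not kept:
        return False  # no torso found: treated as benign in this sketch
    return fuse_scores(kept, score_region) > decision_threshold


# Toy usage with made-up detections and made-up SVM decision values.
detections = [
    TorsoCandidate((10, 20, 80, 120), 0.9),
    TorsoCandidate((14, 26, 80, 120), 0.8),
    TorsoCandidate((200, 40, 60, 90), 0.3),
]
toy_scores = {
    (10, 20, 80, 120): -0.5,
    (14, 26, 80, 120): 1.0,
    (200, 40, 60, 90): 5.0,
}
kept = select_candidates(detections)
fused = fuse_scores(kept, toy_scores.__getitem__)
is_adult = classify_image(detections, toy_scores.__getitem__)
```

In this toy input the top-confidence box alone would score below zero, while the slightly shifted second box pulls the fused score above the threshold, which is the kind of localization offset the adaptive step is meant to tolerate.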
Keywords