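Long-term waste classification method based on self-training

Liu Yaxuan1, Pan Wanbin1,2 (1. School of Media and Design, Hangzhou Dianzi University, Hangzhou 310018; 2. Department of Mechanical Engineering, National University of Singapore, Singapore 117576)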

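Abstract
Objective Waste is currently classified mainly by looking up its name. Such methods usually rely on a predefined classification of data, so they can hardly cover all existing waste, let alone the growing variety of waste to come. To address this problem, a long-term waste classification method based on self-training is proposed for domestic waste. Method First, bagging is used to combine two kinds of base classifiers that differ in classification ability and training mechanism, the K-nearest neighbor classifier and the support vector machine, according to their independent votes and weights, yielding a novel ensemble classifier for domestic waste. Second, on the basis of intuitive image-based interactive feedback, the confidence of the classifier's results and the cloud-based training sample set are updated dynamically, which improves the accuracy of subsequent classification and the self-learning ability of the method. Result The prototype system was trained on a training set of 233 domestic waste samples and tested on 151 waste samples. The experiments show that the proposed ensemble classifier reaches a classification accuracy of about 95% on domestic waste. The proportion of incorrect samples in the training set was then raised step by step (≤ 30%), the ensemble classifier was retrained each time, and a total of 150 classification tests were run on the same 151 samples. The resulting average accuracy shows that the ensemble classifier keeps a high and fairly stable classification accuracy (≥ 93%). In addition, when the feedback mechanism was added to these experiments, the average accuracy shows that the mechanism effectively reduces the accuracy loss caused by incorrect samples. Conclusion The proposed method classifies domestic waste with high accuracy and robustness and has good long-term effectiveness.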
Keywords
Long-term waste classification based on self-training

Liu Yaxuan1, Pan Wanbin1,2(1.School of Media and Design, Hangzhou Dianzi University, Hangzhou 310018, China;2.Department of Mechanical Engineering, National University of Singapore, Singapore 117576, Singapore)

Abstract
Objective As consumption levels rise, daily waste grows in both quantity and variety. Classifying waste correctly is important for protecting human health and keeping the environment clean and safe. With the spread of the internet and the development of information technology, looking up waste by name on a smartphone has become a popular way to classify it. However, this approach usually relies on static, predefined classification data, so it is difficult to cover all waste or to extend the approach to new types of waste. To address this problem, this study proposes a long-term classification method for domestic waste based on self-training. Method The proposed method uses machine learning to update its training set and retrain itself on the basis of user input and feedback, which users provide by selecting waste images. Thus, greater user participation leads to higher classification accuracy. The method consists of two main parts. 1) For effective classification, we adopt a new ensemble classifier that uses bagging to integrate K-nearest neighbor (KNN) classifiers and support vector machines (SVMs) as base classifiers, combining them through independent voting and weights. Misclassification oversampling is combined with bagging to improve the accuracy of these base classifiers. 2) A feedback mechanism based on image selection automatically updates the classifier's confidence and extends the waste training set, thereby improving classification accuracy and self-training ability. Result A domestic waste classification prototype is developed to validate the effectiveness of the method.
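As a rough illustration of the voting step in part 1), the following minimal sketch shows how independent votes from several base classifiers could be merged by weight. It is not the authors' implementation: the functions `weighted_vote` and `ensemble_predict`, the keyword-set features, the category names, the weights, and the three rule-based stand-ins for trained KNN and SVM members are all assumptions made for illustration, and the vote share is only an assumed proxy for a confidence value.

```python
from collections import defaultdict


def weighted_vote(votes):
    """Sum the weights per label; return the winning label and its weight share."""
    totals = defaultdict(float)
    for label, weight in votes:
        totals[label] += weight
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())


def ensemble_predict(members, features):
    """Each member votes independently; votes are merged by member weight."""
    votes = [(predict(features), weight) for predict, weight in members]
    return weighted_vote(votes)


# Hypothetical stand-ins for trained base classifiers (assumption: in a real
# system these would be KNN and SVM models fitted on bootstrap samples).
def knn_a(features):
    return "recyclable" if "bottle" in features else "residual"


def knn_b(features):
    return "recyclable" if "plastic" in features else "residual"


def svm_a(features):
    return "hazardous" if "battery" in features else "recyclable"


# Assumed weights, e.g. each member's accuracy on held-out data.
members = [(knn_a, 0.9), (knn_b, 0.8), (svm_a, 0.95)]
label, share = ensemble_predict(members, {"bottle"})
print(label, round(share, 2))
```

Weighting each vote lets a more reliable member outvote a weaker one, while a disagreeing member still lowers the vote share of the winning label.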
Here, a training set of 233 waste samples is used to train the ensemble classifier, and a test set of 151 waste samples is used to evaluate its accuracy and robustness. The experiments demonstrate that the average classification accuracy of the ensemble classifier (approximately 95%) is higher than that of each base classifier. As the proportion of incorrect samples in the training set is gradually increased (≤ 30%), we retrain the ensemble classifier on the data and run a classification test on the same test set. The resulting average accuracy analyses show that the ensemble classifier maintains a relatively high and stable classification accuracy (≥ 93%), and that the feedback mechanism effectively alleviates the negative influence of incorrect samples. Conclusion Classifying waste is closely related to people's health and environmental protection. However, long-term methods that can do this effectively as the number and types of waste increase remain rare, especially on mobile platforms. Thus, this work presents a new long-term classification method for domestic waste based on self-training. The method is characterized by accurate and robust domestic waste classification and by a self-learning ability. These abilities, provided by a novel ensemble classifier and a feedback mechanism, are verified experimentally. However, the method still has limitations that should be addressed. 1) The waste image input is used mainly by the feedback mechanism, whereas the corresponding features are described mainly by text, because general and effective methods for extracting waste features from images remain rare. 2) An automatic feedback mechanism should be studied to raise the automation level of the entire method.
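The robustness experiment summarized in the Result part (retraining with a growing share of incorrect training samples, then measuring accuracy on a fixed test set) could be set up along the following lines. This is only a sketch under stated assumptions: `inject_label_noise` and `accuracy` are hypothetical helpers, the category names and the ten toy samples are invented for illustration, and the abstract does not say how the incorrect samples were actually generated.

```python
import random

# Assumed category names, for illustration only.
LABELS = ["recyclable", "residual", "hazardous"]


def inject_label_noise(samples, labels, ratio, rng):
    """Return a copy of samples in which a `ratio` share carries a wrong label."""
    noisy = list(samples)
    n_flip = int(round(ratio * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_flip):
        features, true_label = noisy[i]
        wrong = [label for label in labels if label != true_label]
        noisy[i] = (features, rng.choice(wrong))
    return noisy


def accuracy(predict, test_set):
    """Share of test samples whose predicted label matches the true label."""
    hits = sum(1 for features, label in test_set if predict(features) == label)
    return hits / len(test_set)


# Toy data: ten samples described by keyword sets (hypothetical).
clean = [({"item%d" % i}, "recyclable" if i < 6 else "residual") for i in range(10)]

rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = inject_label_noise(clean, LABELS, 0.3, rng)
changed = sum(1 for a, b in zip(clean, noisy) if a[1] != b[1])
print(changed, "of", len(noisy), "training labels were corrupted")
```

In a full experiment, the classifier would be retrained on `noisy` for each noise ratio and `accuracy` would be averaged over repeated runs on the unchanged test set.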
Keywords
