Two-stream deep transfer learning with multi-source domain confusion

Yan Meiyang, Li Yuan (School of Automation, Beijing Institute of Technology, Beijing 100081, China)

Abstract
Objective To address deep learning's heavy dependence on large sample sets, a two-stream deep transfer learning method with multi-source domain confusion is proposed, which improves the applicability of transferred features over traditional deep transfer learning. Method A multi-source domain transfer strategy is adopted to increase the coverage of the target domain's transferable features by the source domains. A two-stage adaptation learning method is proposed to obtain domain-invariant deep feature representations and similar recognition results across inter-domain classifiers. Two-dimensional features from natural light images and three-dimensional features from depth images are fused, which enriches the feature dimensions of small-sample data while suppressing the interference of complex backgrounds on target recognition. In addition, to improve classifier performance in small-sample machine learning, a center loss is introduced into the conventional softmax loss, strengthening the penalty and supervision capability of the classification loss function. Result Comparative experiments on a public small-sample gesture dataset show that the proposed model achieves higher recognition accuracy than traditional recognition and transfer models; with DenseNet-169 as the pre-training network, the recognition rate reaches 97.17%. Conclusion A two-stream deep transfer learning model with multi-source domain confusion is constructed from multi-source domain datasets, two-stage adaptation learning, two-stream convolutional fusion, and a composite loss function. The proposed model increases the distribution matching rate between the source and target domains, enriches the feature dimensions of target samples, improves the supervisory performance of the loss function, and improves the applicability of transferred features in arbitrary small-sample scenarios.
Keywords
Two-stream deep transfer learning with multi-source domain confusion

Yan Meiyang, Li Yuan(School of Automation, Beijing Institute of Technology, Beijing 100081, China)

Abstract
Objective Feature extraction can be completed automatically by using a nonlinear network structure for deep learning. Thus, multi-dimensional features can be obtained through the distributed expression of features. Deep convolutional neural networks are supported by a large volume of valid data. However, obtaining a large volume of effectively labeled data is often labor-intensive and time-consuming. Hence, deep learning that relies on large labeled datasets remains a challenge when such data are unavailable. Presently, deep convolutional neural networks on few-shot datasets have become a popular research topic in deep learning, and combining deep learning with transfer learning is the latest approach to the problem of data scarcity. In this paper, two-stream deep transfer learning with multi-source domain confusion is proposed to address the limited adaptation of the source model's general features to the target data. Method The proposed deep transfer learning network is based on the confusion-domain deep transfer learning model. First, a multi-source domain transfer strategy is used to increase the coverage of target-domain transfer features by the source domains. Second, a two-stage adaptive learning method is proposed to achieve domain-invariant deep feature representations and similar recognition results from the inter-domain classifiers. Third, a data fusion strategy for natural light images with two-dimensional features and depth images with three-dimensional features is proposed to enrich the feature dimensions of few-shot datasets and suppress the influence of complex backgrounds. Finally, a composite loss function combining the softmax and center loss functions is presented to improve the recognition performance of the classifier in few-shot deep learning; intra- and inter-class distances are shortened and expanded, respectively. The proposed method increases the recognition rate by improving the feature extraction and loss function of the deep convolutional neural network.
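The composite loss described above — softmax cross-entropy plus a weighted center-loss term — can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the function names, the weight `lam`, and the fixed class centers are assumptions for the example:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Numerically stable softmax cross-entropy, averaged over the batch.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(features, labels, centers):
    # Mean squared distance between each feature vector and its class
    # center; minimising it pulls intra-class features together.
    return 0.5 * ((features - centers[labels]) ** 2).sum(axis=1).mean()

def composite_loss(logits, features, labels, centers, lam=0.01):
    # Total loss = softmax loss + lambda * center loss, where lambda
    # trades classification accuracy against intra-class compactness.
    return softmax_cross_entropy(logits, labels) + lam * center_loss(features, labels, centers)
```

In practice the class centers are learned jointly with the network (updated per mini-batch); here they are shown as a fixed array purely for illustration.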
Regarding feature extraction, the efficiency of feature transfer is enhanced, and the feature parameters of few-shot datasets are enriched by multi-source deep transfer features and feature fusion. The efficiency of multi-source domain feature transfer is improved with three kinds of loss functions. The inter- and intra-class feature distances are adjusted by introducing the center loss function. To extract the deep adaptation features, the difference loss of the domain-invariant deep feature representation is calculated, and the inter-domain features are aligned with one another. In addition, the mutual adaptation of different domain classifiers is designed with the difference loss function. A two-stream deep transfer learning model with multi-source domain confusion is developed by combining the above methods. The model enhances the characterization of targets in complex contexts while improving the applicability of transfer features. Gesture recognition experiments are conducted on public datasets to verify the validity of the proposed model. Quantitative analysis of comparative experiments shows that the performance of the proposed model is superior to that of other classical gesture recognition models. Result The two-stream deep transfer learning model with multi-source domain confusion demonstrates a more effective gesture recognition performance on few-shot datasets than previous models. In the model with the DenseNet-169 pre-training network, the proposed network achieves 97.17% accuracy. Compared with other classic gesture recognition and transfer learning models, the two-stream deep transfer learning model with multi-source domain confusion has 2.34% higher accuracy. The recognition performance of the proposed model on a small gesture sample dataset is evaluated through comparison as follows.
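The two difference losses mentioned here — feature alignment across domains and mutual adaptation of domain classifiers — can be illustrated with common stand-ins from the domain-adaptation literature: a linear-kernel maximum mean discrepancy (MMD) averaged over multiple source domains, and an L1 discrepancy between two classifiers' probability outputs. The abstract does not specify the authors' exact kernels or weights, so this is a hedged sketch of the general idea:

```python
import numpy as np

def mmd_linear(src, tgt):
    # Linear-kernel MMD: squared Euclidean distance between the mean
    # feature embeddings of two domains; zero when the means coincide.
    delta = src.mean(axis=0) - tgt.mean(axis=0)
    return float(delta @ delta)

def multi_source_mmd(sources, tgt):
    # Average alignment loss between the target domain and each of
    # several source domains (the multi-source transfer setting).
    return sum(mmd_linear(s, tgt) for s in sources) / len(sources)

def classifier_discrepancy(probs_a, probs_b):
    # Mean absolute difference between two classifiers' predicted
    # probabilities; minimising it drives the inter-domain classifiers
    # toward similar recognition results.
    return float(np.abs(probs_a - probs_b).mean())
```

Minimising the MMD term aligns the deep features across domains, while minimising the classifier discrepancy term realises the "similar recognition results of the inter-domain classifiers" described above.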
First, compared with other transfer learning models, the proposed framework of the two-stream fusion model with multi-source domain confusion transfer learning can effectively complete the transfer of features. Second, the performance of the proposed fusion model is superior to that of the traditional two-stream information fusion model, which verifies that the proposed fusion model can improve recognition efficiency while effectively combining natural light and depth image features. Conclusion A deep transfer learning method with multi-source domain confusion is proposed. By studying the principle and mechanism of deep learning and transfer learning, a multi-source domain transfer method that covers the characteristics of the target domain is proposed. First, an adaptable feature is introduced to enhance the description capability of the transfer feature. Second, a two-stage adaptive learning method is proposed to represent the deep features of the invariant domain and reduce the prediction differences of inter-domain classifiers. Third, combined with the three-dimensional feature information of the depth image, a two-stream convolution fusion strategy that can realize the full use of scene information is proposed. Through the fusion of natural light imaging and depth information, the capability to segment the foreground and background in the image is improved, and the data fusion strategy realizes the recombination of the two types of modal information. Finally, the efficiency of multi-source domain feature transfer is improved by three kinds of loss functions. To improve the recognition performance of the classifier on few-shot datasets, the penalty performance of classifiers on inter- and intra-class features is adjusted by introducing center loss to softmax loss. The inter-domain features are adapted to one another by calculating the loss of the domain-invariant deep feature.
The mutual adaptation of different domain classifiers is designed with the difference loss function of inter-domain classifiers. The two-stream deep transfer learning model with multi-source domain confusion is generated through two-stage adaptive learning, which can facilitate the feature transfer from the source domain to the target domain. The model structure of the two-stream deep transfer learning with multi-source domain confusion is designed by combining the proposed deep transfer learning method and data fusion strategy with multi-source domain confusion. On the public gesture dataset, the superior performance of the proposed model is verified through comparisons from multiple angles. Experimental results prove that the proposed method can increase the matching rate of the source and target domains, enrich the feature dimension, and enhance the penalty supervision capability of the loss function. The proposed method can improve the recognition accuracy of the deep transfer network on few-shot datasets.
Keywords
