Spatiotemporal data learning with graph convolutional neural networks for COVID-19 epidemic prediction
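Yang Chengyi1,2, Liu Feng1,3, Qi Jiayin1, Duan Yan1,2, Lyu Runqian1,2, Xiao Zilong4 (1. Institute of Artificial Intelligence and Change Management, Shanghai University of International Business and Economics, Shanghai 200336; 2. School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai 201620; 3. School of Computer Science and Technology, East China Normal University, Shanghai 200062; 4. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006) Abstract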
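Objective Current research on disease transmission concentrates on time-series data and infectious disease models, and lacks exploration and explanation of how spatial information can improve prediction accuracy. Processing spatiotemporal data usually requires extracting temporal and spatial features separately and then fusing them to obtain reasonably reliable predictions. This paper proposes a spatiotemporal data learning method based on the graph convolutional neural network (GCN) that learns spatiotemporal data end to end with a spatial model, replacing the earlier pattern of integrating multiple modular units. Method Based on the positive correlation between geographic space, high-speed railway lines, airline routes, and the number of infections observed at the data visualization stage, the spatial distribution and transport connections among Chinese cities are mapped into network graphs and encoded as a geographic adjacency matrix, a high-speed railway direct-connection matrix, an airline route direct-connection matrix, and an airline-or-high-speed-railway direct-connection matrix. The epidemic data are sliced with a sliding time window into tensors, which are fed in batches into the graph deep learning model for convolution; the trainable parameters are updated through message passing, backpropagation, and gradient descent. Result Experiments on the COVID-19 dataset show that, compared with recurrent neural network models, using GCN to learn the distribution features of this spatiotemporal data yields stronger fitting ability during training, saves more than 75% of the computational cost in training time, and lowers the average test-set loss under two loss functions by about 80%. Conclusion The spatiotemporal data learning method adopted here has a lower computational cost and higher prediction accuracy, performs particularly well on spatiotemporal data whose spatial features are stronger than its temporal features, and offers a new method and line of thinking for predicting the spread range of epidemics and the number of infections, which helps the relevant authorities formulate response measures and disease prevention and control decisions during public health events.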
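The encoding of city connections into direct-connection matrices can be illustrated with a minimal sketch. The city names, the links between them, and the helper `direct_matrix` are hypothetical and chosen only for illustration; they are not the paper's data or code. The combined "airline or high-speed railway" matrix is taken here as the element-wise logical OR of the two transport matrices, which is one plausible reading of the description.

```python
# Hypothetical cities and links, for illustration only (not the paper's data).
cities = ["Wuhan", "Shanghai", "Guangzhou", "Xiaogan"]
index = {name: i for i, name in enumerate(cities)}


def direct_matrix(links, n):
    """Return a symmetric 0/1 matrix whose entry [i][j] is 1 when
    cities i and j are directly linked."""
    matrix = [[0] * n for _ in range(n)]
    for a, b in links:
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = 1
    return matrix


n = len(cities)
geo = direct_matrix([("Wuhan", "Xiaogan")], n)  # geographic adjacency
rail = direct_matrix([("Wuhan", "Shanghai"), ("Wuhan", "Guangzhou")], n)
air = direct_matrix([("Shanghai", "Guangzhou"), ("Wuhan", "Guangzhou")], n)

# Assumed combination rule: a pair is connected if either mode links it.
air_or_rail = [
    [max(a, r) for a, r in zip(air_row, rail_row)]
    for air_row, rail_row in zip(air, rail)
]
```

Each matrix defines a different city graph over the same set of nodes, so the same epidemic data can be paired with any of the four connection modes.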
Keywords
Spatiotemporal data learning of graph convolutional neural network for epidemic prediction of COVID-19
Yang Chengyi1,2, Liu Feng1,3, Qi Jiayin1, Duan Yan1,2, Lyu Runqian1,2, Xiao Zilong4(1.Institute of Artificial Intelligence and Change Management, Shanghai University of International Business and Economics, Shanghai 200336, China;2.School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai 201620, China;3.School of Computer Science and Technology, East China Normal University, Shanghai 200062, China;4.School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510006, China) Abstract
Objective COVID-19 has severely affected the medical systems and economic growth of countries all over the world. Epidemic information for each city is therefore an important reference for governments and enterprises when they formulate public health prevention and control measures and decide how to reopen the economy. According to relevant research, infectious disease models and time series models have played an important role in finding potential hosts, confirming human-to-human transmission, and estimating the basic reproduction number. Research methods for disease transmission and confirmed-case prediction have evolved through demographic methods, dynamic models, social network analysis, flight passenger volume estimation, data mining, and machine learning. However, the prediction accuracy of these methods can still be improved by using spatial information in the study of epidemic transmission. In recent years, the boom in graph deep learning has provided new techniques for estimating epidemic spread. Early work iterated trainable parameters through information interaction, and later work optimized graph types, propagation mechanisms, and output steps; this development laid the foundation for the graph convolutional neural network (GCN). Graph convolutional networks in turn improved the performance of graph neural networks in the spectral and spatial domains by changing the convolution kernel and the information aggregation mode. Progress in representation learning has made graph data easier to process, and the rise of integrated frameworks has enabled more accurate prediction on spatiotemporal data, with traffic flow as the representative case.
Method Compared with traffic flow data, epidemic data have a stronger spatial attribute and a weaker temporal attribute, which is why GCN is used alone rather than within an integrated approach. First, based on the positive correlation between epidemic information, geographical location, and the traffic network observed at the data visualization stage, the spatial distribution and traffic connections between affected cities in China are mapped into graph networks and encoded as a geographic adjacency matrix, a high-speed railway direct matrix, an aircraft route direct matrix, and an aircraft route or high-speed railway direct matrix. These four adjacency matrices form four city networks with different connection modes, and a corresponding GCN model is constructed on each network: the geographical proximity graph convolutional neural network (GPGCN), the airline graph convolutional neural network (ALGCN), the high-speed railway graph convolutional neural network (HSRGCN), and the airline and high-speed railway graph convolutional neural network (ALHSRGCN). After the training, validation, and test sets were divided at a ratio of 6:2:2, the epidemic data were sliced with a sliding time window to form a three-dimensional tensor with a size of 30×327×7 as the test set, which was input into the graph deep learning model in batches for the convolution operation. The trainable parameters were updated by information transmission, back propagation, and gradient descent. Result The experimental results on the COVID-19 dataset demonstrated that the GCN model, when learning the distribution features of spatiotemporal data, showed stronger fitting ability during training than the recurrent neural network model. It saved more than 75% of the computation cost in training time, and the average test set loss under mean absolute error (MAE) and mean square error (MSE) decreased by about 80%.
When MSE was chosen as the loss function, the loss converged to a lower value and training was more stable, which led to more accurate predictions. Conclusion The spatiotemporal data learning method in this study has a lower computational cost and higher prediction accuracy, and it performs especially well on spatiotemporal data whose spatial characteristics are stronger than its temporal characteristics. It provides a new approach to predicting the spread range of an epidemic and the number of infected people, which helps relevant departments formulate countermeasures and disease prevention and control decisions in public health events.
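The sliding-window slicing and the graph convolution described in the Method part can be sketched in NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy data, and the hidden size are hypothetical; the tensor layout (samples × cities × window days) is one possible reading of the reported 30×327×7 shape; and the propagation rule uses the widely known symmetric normalization with self-loops, which the abstract does not specify.

```python
import numpy as np


def sliding_windows(series, window):
    """Slice a (days, cities) array into inputs of shape
    (samples, cities, window) and next-day targets of shape (samples, cities)."""
    days = series.shape[0]
    inputs = np.stack([series[t:t + window].T for t in range(days - window)])
    targets = series[window:]
    return inputs, targets


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Toy data (assumption): 37 days of case counts for 4 cities on a path graph.
rng = np.random.default_rng(0)
series = rng.poisson(5, size=(37, 4)).astype(float)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

inputs, targets = sliding_windows(series, window=7)  # 30 samples of 7 days
a_norm = normalize_adjacency(adj)
weights = rng.normal(size=(7, 8))                    # window -> 8 hidden units
hidden = gcn_layer(a_norm, inputs, weights)          # (30, 4, 8)
```

In a trainable model, `weights` would be updated by backpropagation and gradient descent against a loss such as MAE or MSE between the model output and `targets`; that training loop is omitted here.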
Keywords
deep learning; graph convolutional neural network (GCN); spatiotemporal data processing; coronavirus disease 2019 (COVID-19); epidemic prediction