3D Skeleton Data Double Human Interaction Recognition Based on Graph Convolution Network

doi:10.12422/j.issn.1672-6952.2023.03.014

Journal of Liaoning Petrochemical University ›› 2023, Vol. 43 ›› Issue (3): 86-90. DOI: 10.12422/j.issn.1672-6952.2023.03.014

Jingting Zhang¹, Jiangtao Cao¹, Xiaofei Ji²

^1. School of Information and Control Engineering, Liaoning Petrochemical University, Fushun, Liaoning 113001, China
^2. School of Automation, Shenyang Aerospace University, Shenyang, Liaoning 110136, China

Received: 2021-12-16 Revised: 2022-01-21 Published: 2023-06-25 Online: 2023-06-25
Corresponding author: Jiangtao Cao
About the author: Jingting Zhang (1995-), female, master's student, engaged in research on deep-learning-based behavior recognition; E-mail: 18946104650@163.com.
Funding:
National Natural Science Foundation of China (61673199)

Abstract

Existing graph convolutional network methods for double human (two-person) interaction recognition do not represent interaction semantics adequately. To address this, a double human interaction spatial-temporal graph convolutional network (DHI-STGCN) is proposed for behavior recognition. The network consists of a spatial sub-network module and a temporal sub-network module. From the 3D skeleton data extracted from interaction videos, a spatial action graph of the two-person interaction is generated to represent spatial information; in this graph, the edges connecting the two persons are assigned different weights according to joint position information. For temporal information, contextual temporal links are added to the constructed adjacency matrix, so that each joint is also connected to nodes within a certain time range. The resulting spatial-temporal graph data are fed into the spatial graph convolution module, which is combined with the temporal graph convolution module to strengthen the continuity of inter-frame motion features for temporal modeling. The model takes full account of the close relationship between the two persons in an interaction. Comparative experiments on the NTU RGB+D dataset show that the method is robust and achieves better interaction recognition performance than existing models.

Key words: spatial-temporal graph convolution, skeleton data, double human interaction, behavior recognition
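The graph construction described in the abstract can be illustrated with a minimal sketch. The abstract does not state the weighting function or the temporal range, so everything specific here is an assumption made for illustration: the Gaussian distance kernel, the `sigma` and `window` parameters, the function names, and the three-joint toy skeleton are all hypothetical and are not taken from the paper.

```python
import numpy as np


def two_person_adjacency(joints_a, joints_b, bones, sigma=1.0):
    """Build a (2V, 2V) adjacency matrix for two skeletons with V joints each.

    joints_a, joints_b: (V, 3) arrays of 3D joint positions for each person.
    bones: list of (i, j) joint index pairs that are physically connected.
    sigma: width of the Gaussian kernel (an assumed weighting, not the paper's).
    """
    v = joints_a.shape[0]
    adj = np.zeros((2 * v, 2 * v))

    # Intra-person edges: the same bone list is applied to both skeletons.
    for i, j in bones:
        for offset in (0, v):
            adj[offset + i, offset + j] = 1.0
            adj[offset + j, offset + i] = 1.0

    # Inter-person edges: closer joint pairs receive larger weights.
    diff = joints_a[:, None, :] - joints_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)            # (V, V) pairwise distances
    weights = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    adj[:v, v:] = weights
    adj[v:, :v] = weights.T
    return adj


def temporal_adjacency(num_frames, window):
    """Connect each frame to the frames at most `window` steps away (no self-links)."""
    idx = np.arange(num_frames)
    gap = np.abs(idx[:, None] - idx[None, :])
    return ((gap > 0) & (gap <= window)).astype(float)


# Toy example: a three-joint chain for each person, one unit apart along x.
person_a = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 2.0, 0.0]])
person_b = person_a + np.array([1.0, 0.0, 0.0])
spatial = two_person_adjacency(person_a, person_b, bones=[(0, 1), (1, 2)])
temporal = temporal_adjacency(num_frames=5, window=2)
print(spatial.shape, temporal.shape)
```

A distance-based kernel is only one way to make inter-person edge weights depend on joint positions; any monotonically decreasing function of distance would fit the description equally well.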

CLC number: TP391.1

Jingting Zhang, Jiangtao Cao, Xiaofei Ji. 3D Skeleton Data Double Human Interaction Recognition Based on Graph Convolution Network[J]. Journal of Liaoning Petrochemical University, 2023, 43(3): 86-90.

Figures/Tables 8

References 14

1	Zhang Y, Cao J T, Kan Z, et al. Network intrusion detection method based on CPS system[J]. Transducer and Microsystem Technologies, 2020, 39(10): 126-128. (in Chinese)
2	Wei P, Cao J T, Ji X F. Double human interaction recognition based on fusion of whole and individual segmentation[J]. Journal of Liaoning Petrochemical University, 2019, 39(6): 91-96. (in Chinese)
3	Ji X F, Wang Z B, Wang Y. Design and implementation of an interactive demonstration system based on video gesture recognition[J]. Journal of Shenyang Aerospace University, 2020, 37(2): 56-62. (in Chinese)
4	Chen S Q, Cao J T, Zhao T, et al. Double human interaction recognition based on joint data[J]. Journal of Electronic Measurement and Instrumentation, 2020, 34(6): 124-130. (in Chinese)
5	Ke Q, Bennamoun M, An S, et al. A new representation of skeleton sequences for 3D action recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: [s.n.], 2017.
6	Liu J, Shahroudy A, Xu D, et al. Spatio-temporal LSTM with trust gates for 3D human action recognition[C]//Proceedings of the European Conference on Computer Vision. Amsterdam: [s.n.], 2016.
7	Lee I, Kim D, Kang S, et al. Ensemble deep learning for skeleton-based action recognition using temporal sliding LSTM networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: [s.n.], 2017.
8	Li S, Li W, Cook C, et al. Independently recurrent neural network (IndRNN): Building a longer and deeper RNN[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: [s.n.], 2018.
9	Zhang P F, Xue J R, Lan C L, et al. EleAtt-RNN: Adding attentiveness to neurons in recurrent neural networks[C]//European Conference on Computer Vision. Munich: [s.n.], 2018.
10	Wang P, Li Z, Hou Y, et al. Action recognition based on joint trajectory maps using convolutional neural networks[C]//Proceedings of the 24th ACM International Conference on Multimedia. Amsterdam: [s.n.], 2016.
11	Yan S J, Xiong Y J, Lin D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]//Proceedings of the AAAI Conference on Artificial Intelligence. New Orleans: [s.n.], 2018.
12	Wen Y H, Gao L, Fu H, et al. Graph CNNs with motif and variable temporal block for skeleton-based action recognition[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Hawaii: [s.n.], 2019.
13	Tang Y, Yi T, Lu J, et al. Deep progressive reinforcement learning for skeleton-based action recognition[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City: [s.n.], 2018.
14	Shahroudy A, Liu J, Ng T T, et al. NTU RGB+D: A large scale dataset for 3D human activity analysis[C]//IEEE Computer Society. [s.l.]: IEEE, 2016.

Layer	Network layer	(Input, output) channels
1	SGCN-TGCN-0	(3, 64)
2	SGCN-TGCN-1	(64, 64)
3	SGCN-TGCN-2	(64, 64)
4	SGCN-TGCN-3	(64, 64)
5	SGCN-TGCN-4	(64, 128)
6	SGCN-TGCN-5	(128, 128)
7	SGCN-TGCN-6	(128, 128)
8	SGCN-TGCN-7	(128, 256)
9	SGCN-TGCN-8	(256, 256)
10	SGCN-TGCN-9	(256, 256)
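The channel progression in the table above can be traced with a small NumPy sketch. Only the ten (input, output) channel pairs come from the table; the internals of each SGCN-TGCN block are not given here, so the sketch assumes a plain spatial aggregation over joints, a linear channel projection, a temporal aggregation over frames, and a ReLU. The function names, the random weights, and the toy graph sizes are hypothetical.

```python
import numpy as np

# (input, output) channel pairs of the ten SGCN-TGCN blocks, as listed in the table.
CHANNELS = [(3, 64), (64, 64), (64, 64), (64, 64), (64, 128),
            (128, 128), (128, 128), (128, 256), (256, 256), (256, 256)]


def normalize(adj):
    """Add self-loops and row-normalize so each node averages over its neighbours."""
    adj = adj + np.eye(adj.shape[0])
    return adj / adj.sum(axis=1, keepdims=True)


def sgcn_tgcn_block(x, a_spatial, a_temporal, weight):
    """Map features of shape (T, V, C_in) to (T, V, C_out).

    Assumed structure: spatial aggregation, channel projection,
    temporal aggregation, then ReLU.
    """
    x = np.einsum("uv,tvc->tuc", a_spatial, x)    # mix information across joints
    x = x @ weight                                # project C_in -> C_out
    x = np.einsum("st,tvc->svc", a_temporal, x)   # mix information across frames
    return np.maximum(x, 0.0)


def forward(x, a_spatial, a_temporal, rng):
    """Run the ten blocks in sequence with randomly initialized weights."""
    for c_in, c_out in CHANNELS:
        weight = rng.standard_normal((c_in, c_out)) * np.sqrt(2.0 / c_in)
        x = sgcn_tgcn_block(x, a_spatial, a_temporal, weight)
    return x


# Toy input: 4 frames, 6 graph nodes, 3 coordinates (x, y, z) per node.
rng = np.random.default_rng(0)
num_frames, num_nodes = 4, 6
ring = np.zeros((num_nodes, num_nodes))
for i in range(num_nodes):
    ring[i, (i + 1) % num_nodes] = ring[(i + 1) % num_nodes, i] = 1.0
frames = np.arange(num_frames)
band = (np.abs(frames[:, None] - frames[None, :]) == 1).astype(float)

features = forward(rng.standard_normal((num_frames, num_nodes, 3)),
                   normalize(ring), normalize(band), rng)
print(features.shape)
```

The first block takes 3 input channels because each skeleton node carries a 3D coordinate; the width then doubles at blocks 5 and 8 until it reaches 256 channels.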

Method	Accuracy/%
Clips+CNN+MTLN^[5]	84.8
TS-LSTM^[7]	81.2
IndRNN^[8]	87.9
EleAtt-RNN^[9]	87.1
ST-GCN^[11]	88.3
DPRL+GCNN^[13]	89.8
motif-GCNs+non-local VTDB^[12]	90.2
DHI-STGCN	92.4
