In the domain of human action recognition, skeleton-based methods have attracted widespread attention for their superior robustness. Although the Spatial-Temporal Graph Convolutional Network (ST-GCN) was the first to apply GCNs to model skeleton data, it still struggles to effectively differentiate between essential and redundant features. To address this limitation, we propose a novel Channel Attention-based Spatial-Temporal Graph Convolutional Network (CA-STGCN). Our model integrates SENet with SoftPool, introducing the SoftPool-SENet (S-SE) module to enhance pooling operations and preserve critical feature information. We validate CA-STGCN on two public datasets, NTU-RGB+D and Kinetics. Experimental results demonstrate that our model outperforms the original ST-GCN and offers valuable insights for advancing skeleton-based action recognition.
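The S-SE idea can be illustrated as a squeeze-and-excitation channel gate whose squeeze step uses SoftPool (an exponentially weighted average) in place of plain average pooling. The following minimal NumPy sketch is illustrative only: the function names, tensor shapes, and reduction ratio are our assumptions, not the paper's actual implementation.

```python
import numpy as np

def softpool(x):
    """SoftPool: softmax-weighted average over the spatial axis.

    x: (channels, length) feature map; returns a (channels,) descriptor.
    Activations are weighted by their own softmax, so salient values
    dominate the pooled summary instead of being averaged away.
    """
    w = np.exp(x - x.max(axis=1, keepdims=True))  # numerically stable weights
    return (w * x).sum(axis=1) / w.sum(axis=1)

def s_se_block(x, w1, w2):
    """Hypothetical S-SE module: SoftPool squeeze + SE-style excitation.

    x:  (channels, length) input features
    w1: (channels // r, channels) channel-reduction weights
    w2: (channels, channels // r) channel-expansion weights
    Returns x rescaled per channel by sigmoid attention weights in (0, 1).
    """
    z = softpool(x)                       # squeeze: (channels,)
    h = np.maximum(w1 @ z, 0.0)           # reduce + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # expand + sigmoid gate
    return x * s[:, None]                 # channel-wise reweighting

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))          # 8 channels, 16 time steps
w1 = rng.standard_normal((2, 8)) * 0.1    # reduction ratio r = 4
w2 = rng.standard_normal((8, 2)) * 0.1
y = s_se_block(x, w1, w2)
print(y.shape)
```

Because the gate is a per-channel sigmoid, every output channel is a scaled copy of its input channel; channels the gate deems redundant are attenuated rather than removed outright.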