Static topology
The learned static topology graphs A in GC-GT.
 
In this work, we propose DSTSA-GCN, a skeleton-based gesture recognition model that integrates channel-specific and temporal-specific topology modeling to capture more expressive spatio-temporal features. By sharing feature transformation functions (STCA) between channel-wise and temporal-wise topology modeling, the model becomes more sensitive to variations across spatio-temporal locations. Furthermore, inspired by grouped convolution, we design a corresponding grouped graph convolution that preserves model complexity while mitigating the local bias of the static topology at deeper layers.
Through comprehensive experiments on two gesture datasets and two full-body action datasets, we demonstrate that DSTSA-GCN effectively extracts flexible gesture features and validate its promising potential for full-body action recognition tasks.
A static topology is fixed during inference, and this fixed structure causes the local bias problem, which becomes more severe in deeper layers. Traditional dynamic topology-based GCN models cannot capture channel-specific and temporal-specific topologies simultaneously.
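
The difference between the three kinds of topology can be illustrated with a minimal NumPy sketch. It is not the DSTSA-GCN implementation: the tensor layout (channels, frames, joints), the toy sizes, the helper name `pairwise_topology`, and the tanh-of-pairwise-differences rule used to derive input-dependent graphs are all assumptions chosen for illustration and stand in for the STCA functions described in the paper.

```python
import numpy as np


def pairwise_topology(feat):
    """Hypothetical helper: build one V x V graph per row of `feat`.

    feat has shape (K, V); the result has shape (K, V, V), where entry
    [k, v, w] is tanh(feat[k, v] - feat[k, w]).
    """
    return np.tanh(feat[:, :, None] - feat[:, None, :])


rng = np.random.default_rng(0)
C, T, V = 4, 3, 5  # channels, frames, joints (toy sizes, assumed)
x = rng.standard_normal((C, T, V))

# Static topology: a single graph, fixed at inference, shared by all
# channels and all frames.
A_static = rng.standard_normal((V, V))

# Channel-specific topology: one input-dependent graph per channel,
# derived here from features pooled over time.
A_channel = pairwise_topology(x.mean(axis=1))   # shape (C, V, V)

# Temporal-specific topology: one input-dependent graph per frame,
# derived here from features pooled over channels.
A_temporal = pairwise_topology(x.mean(axis=0))  # shape (T, V, V)

# Aggregate joint features along each topology.
y_static = np.einsum("ctv,vw->ctw", x, A_static)
y_channel = np.einsum("ctv,cvw->ctw", x, A_channel)
y_temporal = np.einsum("ctv,tvw->ctw", x, A_temporal)
```

In this sketch the static graph is identical for every sample, whereas the channel-specific and temporal-specific graphs change with the input, each along a different axis.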
 
The DSTSA-GCN model consists of three main components: channel-specific topology modeling, temporal-specific topology modeling, and multi-branch temporal convolution. The two topology modeling components are integrated through the shared feature transformation functions (STCA) to capture more expressive spatio-temporal features. Their grouped versions can be used to mitigate the local bias problem of the static topology at deeper layers.
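
The idea of a grouped graph convolution can be sketched as follows. This is a simplified illustration, not the authors' layer: the function name `grouped_graph_conv`, the (channels, frames, joints) layout, and the choice of one static graph per channel group are assumptions made for the example, and the feature transformation and dynamic topology terms are omitted.

```python
import numpy as np


def grouped_graph_conv(x, A_groups):
    """Aggregate joint features with one static graph per channel group.

    x:        array of shape (C, T, V)
    A_groups: array of shape (G, V, V); C must be divisible by G
    Channels are split into G contiguous groups, and every channel in
    group g is aggregated with A_groups[g].
    """
    C, T, V = x.shape
    G = A_groups.shape[0]
    if C % G != 0:
        raise ValueError("channels must be divisible by the number of groups")
    xg = x.reshape(G, C // G, T, V)
    yg = np.einsum("gctv,gvw->gctw", xg, A_groups)
    return yg.reshape(C, T, V)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3, 5))         # 8 channels, 3 frames, 5 joints
A_groups = rng.standard_normal((4, 5, 5))  # 4 groups of 2 channels each
y = grouped_graph_conv(x, A_groups)
```

With a single group the operation reduces to an ordinary graph convolution with one shared static graph; with more groups, different channel subsets can follow different joint connectivity.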
 
 
          The learned dynamic topologies A_T in GT-GC.
 
The class activation maps (CAMs) for the actions Grab and Tap.
 
The visualization of the continuous skeleton action corresponding to the heat map of the action sample: Grab.
 
The visualization of the continuous skeleton action corresponding to the heat map of the action sample: Tap.
