A Survey of Recent Graph-Based Methods for Skeleton-Based Action Recognition
DOI: https://doi.org/10.56147/aaiet.2.1.111
Keywords:
- Skeleton-based action recognition
- Graph convolutional networks
- Dynamic topology
- Hierarchical graphs
- Information bottleneck
- Language supervision
- Temporal-channel aggregation
Abstract
Skeleton-Based Action Recognition (SBAR) leverages 3D joint trajectories to recognize human activities while offering privacy, robustness to illumination/background changes and computational efficiency. Recent progress is dominated by spatio-temporal graph neural networks (ST-GNNs) that model the human body as a graph and incorporate data-adaptive connectivity, hierarchical structure, compact representations, multimodal supervision and efficient temporal fusion. This survey focuses on five representative methods, namely CTR-GCN, HD-GCN, InfoGCN, Language Supervised Training (LST) and Temporal-Channel Aggregation (TCA-GCN), and positions them within the broader SBAR literature. We analyze their modeling assumptions, architectural choices, training objectives and empirical results on NTU RGB+D 60/120 and Northwestern-UCLA. We additionally trace the trajectory from dynamic topology learning to the emerging foundation and sequence models reported in 2024-2025. Finally, we summarize open challenges and outline research directions for scalable, robust and semantically grounded SBAR.