Unsupervised skeleton-based action recognition has attracted increasing attention. Existing methods suffer from several limitations: (1) many actions are strongly related to local joints, which is often neglected; (2) most methods directly use joint coordinates as the frame feature and do not exploit the skeleton graph, i.e., its topological information; (3) long-range temporal dependencies are not captured well. In this work, a novel unsupervised method, the Global-Local Temporal Attention Graph Convolutional Network (GLTA-GCN), is proposed to alleviate these problems. The network consists of two branches, a local branch and a global branch, each of which uses graph convolution units and a self-attention mechanism to better extract spatio-temporal features. Furthermore, two loss functions are designed to constrain the model to extract more essential local joint features and to preserve intrinsic structural information. Extensive experiments demonstrate that GLTA-GCN achieves state-of-the-art performance. Our code is available at https://github.com/HaoyueQiu/GLTA-GCN.
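The two building blocks named above, graph convolution over the skeleton adjacency and self-attention over the temporal axis, can be sketched as follows. This is a minimal NumPy illustration under assumed dimensions (a hypothetical 5-joint chain skeleton, 4 frames, 3D coordinates), not the authors' implementation:

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalize the skeleton adjacency with self-loops:
    D^{-1/2} (A + I) D^{-1/2}, the standard GCN propagation matrix."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(X, A_norm, W):
    """One graph convolution: aggregate joint features along skeleton edges,
    then project channels. X: (T, V, C) frames x joints x channels."""
    return np.einsum('uv,tvc,cd->tud', A_norm, X, W)

def temporal_self_attention(F):
    """Scaled dot-product self-attention over frames to model long-range
    temporal dependencies. F: (T, D) per-frame features."""
    d = F.shape[1]
    scores = F @ F.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ F

# Toy skeleton (hypothetical): 5 joints connected in a chain.
rng = np.random.default_rng(0)
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

X = rng.standard_normal((4, 5, 3))              # (frames, joints, coords)
W = rng.standard_normal((3, 8))                 # channel projection
H = graph_conv(X, normalized_adjacency(A), W)   # (4, 5, 8)
Z = temporal_self_attention(H.reshape(4, -1))   # (4, 40)
```

In the paper's design, such units are stacked in both the local and global branches; the sketch only shows the shape of one spatial aggregation step followed by one temporal attention step.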