Users' diverse intents drive their subsequent interactions, which unfold along dynamic behavior trajectories. Despite impressive progress in capturing user intent, existing methods still face two challenges: (1) How can intent be exploited effectively when the interacted items are sparse and far apart in the sequence? (2) How can the performance of a recommendation system be improved by fusing spatial and temporal information? To further exploit intent from both local and global contexts, we present a novel framework for sequential recommendation (SR), namely Hierarchical Intent Contrastive Learning (HICL). First, a graph encoder enhances the item embeddings by incorporating global context information. Then, a sequential encoder serves as the base predictor over the user's interaction history. Moreover, several intent contrastive learning (CL) branches are integrated to model users' latent intents through a new loss function. Specifically, sequence CL uses sequence augmentation to model users' local intent, while graph CL uses graph augmentation and feature clustering to model users' global latent intent from semantic and structural graph views. Extensive experiments on four real-world datasets, including highly sparse ones, demonstrate the superiority of our model over state-of-the-art methods.
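For readers unfamiliar with contrastive objectives, the following minimal sketch illustrates the general idea of pulling two augmented views of the same sequence together while pushing other sequences in the batch apart. It is a generic InfoNCE loss, not the new loss function proposed here, which this abstract does not specify; the function name `info_nce`, the `temperature` value, and the use of in-batch negatives are all assumptions made for illustration.

```python
import numpy as np

def info_nce(view_a, view_b, temperature=0.5):
    """Generic InfoNCE loss between two batches of embeddings.

    Assumption: row i of view_a and row i of view_b are two augmented
    views of the same user sequence (the positive pair); every other
    row in the batch acts as a negative. This is NOT the HICL loss.
    """
    # L2-normalise rows so dot products become cosine similarities.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)

    # Pairwise similarity matrix of shape (batch, batch), temperature-scaled.
    logits = a @ b.T / temperature

    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-likelihood.
    return float(-np.mean(np.diag(log_prob)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_emb = rng.normal(size=(8, 16))                       # hypothetical sequence embeddings
    augmented = seq_emb + 0.01 * rng.normal(size=(8, 16))    # stand-in for an augmented view
    print(info_nce(seq_emb, augmented))
```

In a hierarchical setting such as the one described above, one would presumably compute a loss of this kind per branch (sequence views, graph views) and combine the results with the prediction loss; how the branches are weighted is not stated in the abstract.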