Multimodal depression detection (MDD) has garnered significant interest in recent years. Existing methods typically integrate multimodal information within each sample to distinguish positive from negative samples, but they often neglect the relationships between samples. Although samples in the same class share similarities, individual variations remain; exploiting these inter-sample relationships provides supervision signals for both inter- and intra-class samples and thereby strengthens the discriminative power of user representations. Motivated by this observation, we introduce IISFD, a novel approach that jointly exploits intra-sample contrastive learning and inter-sample contrastive learning with hard negative sampling, thus considering information both within individual samples and across samples. Specifically, we decompose the multimodal input of each sample, comprising audio, visual, and textual modalities, into modality-common and modality-specific features. The two contrastive objectives guide this decomposition toward better feature representations, while unimodal reconstruction preserves fine-grained modality information. Passing the decomposed features through a carefully designed multimodal fusion module yields more discriminative user representations. Experimental results on two publicly available datasets demonstrate the superiority of our model, highlighting its effectiveness in leveraging both intra- and inter-sample information for enhanced MDD.
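To make the two core ingredients concrete, the sketch below pairs a common/specific feature decomposition with an inter-sample supervised contrastive loss that mines hard negatives. This is a minimal illustration under our own assumptions: all module names, dimensions, loss form, and the concatenation-based fusion stand-in are hypothetical and are not the paper's actual implementation (the intra-sample contrastive term and the reconstruction objective are omitted for brevity).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityDecomposer(nn.Module):
    """Hypothetical decomposer: splits one modality's embedding into a
    modality-common part and a modality-specific part via two projections."""
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.common = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                    nn.Linear(hid_dim, hid_dim))
        self.specific = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                      nn.Linear(hid_dim, hid_dim))

    def forward(self, x):
        return self.common(x), self.specific(x)


def inter_sample_contrastive(z, labels, temperature=0.1, num_hard=4):
    """Supervised InfoNCE over user representations: same-label samples are
    positives; the `num_hard` most similar different-label samples act as
    hard negatives. Loss form and hyperparameters are assumptions."""
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / temperature                      # (B, B) similarities
    B = z.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=z.device)
    pos_mask = (labels[:, None] == labels[None, :]) & ~eye
    neg_mask = labels[:, None] != labels[None, :]

    losses = []
    for i in range(B):
        pos = sim[i][pos_mask[i]]
        neg = sim[i][neg_mask[i]]
        if pos.numel() == 0 or neg.numel() == 0:
            continue  # skip anchors lacking positives or negatives
        # Hard negative sampling: keep only the most similar negatives.
        k = min(num_hard, neg.numel())
        hard_neg = neg.topk(k).values
        logits = torch.cat([pos, hard_neg])
        # Log-softmax over positives + hard negatives, averaged over positives.
        log_prob = logits - torch.logsumexp(logits, dim=0)
        losses.append(-log_prob[: pos.numel()].mean())
    return torch.stack(losses).mean() if losses else z.sum() * 0.0


# Toy usage: three modalities (audio, visual, text) for a batch of 8 users.
decomposers = {m: ModalityDecomposer(128, 64) for m in ("audio", "visual", "text")}
feats = {m: torch.randn(8, 128) for m in decomposers}
labels = torch.randint(0, 2, (8,))

common, specific = {}, {}
for m, dec in decomposers.items():
    common[m], specific[m] = dec(feats[m])

# Fusion stand-in: average the common parts, concatenate the specific parts.
user_repr = torch.cat([sum(common.values()) / 3] + list(specific.values()), dim=-1)
loss = inter_sample_contrastive(user_repr, labels)
print(loss.item())
```

In a full training loop this term would be combined with the intra-sample contrastive loss (pulling the common features of a sample's modalities together while separating them from its specific features) and the unimodal reconstruction loss, with the weighted sum backpropagated jointly.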