Owing to the inherent shortcomings of traditional deep learning models, the Transformer, built on the self-attention mechanism, has become popular in the field of fault diagnosis. Self-attention offers an alternative way of modeling signals, as it can directly associate every position in a signal sequence with every other. However, it captures only the associations within a single sequence and struggles to model the relationships between samples. Therefore, this paper proposes a two-branch Twins attention, which for the first time uses cross-attention to capture the associations between samples: the cross-attention branch learns inter-sample associations, while the retained self-attention branch continues to learn intra-sequence associations. The performance of the proposed model was validated on four popular bearing datasets. Compared with the original Transformer structure, the average accuracy on each dataset improved by 1.73%, reaching 99.42%, and the proposed model also led in the noise experiments.
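To make the two-branch idea concrete, the sketch below shows one plausible reading of Twins attention: a self-attention branch operating over the tokens of each sample, and a cross-attention branch operating over the samples in a batch at each token position. The module name `TwinsAttentionSketch`, the fusion by residual summation, and all hyperparameters are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class TwinsAttentionSketch(nn.Module):
    """A minimal two-branch attention block: intra-sequence self-attention
    plus inter-sample cross-attention (assumed interpretation)."""

    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        # Branch 1: self-attention within each sample's token sequence.
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Branch 2: attention across the samples of a batch, per token position.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) -- embedded vibration-signal patches.
        intra, _ = self.self_attn(x, x, x)            # associations within a sequence
        x_t = x.transpose(0, 1)                        # (seq_len, batch, dim): samples act as the "sequence"
        inter, _ = self.cross_attn(x_t, x_t, x_t)      # associations between samples
        inter = inter.transpose(0, 1)                  # back to (batch, seq_len, dim)
        # Fuse the two branches with a residual connection (one plausible choice).
        return self.norm(x + intra + inter)


if __name__ == "__main__":
    block = TwinsAttentionSketch()
    signals = torch.randn(8, 128, 64)   # 8 samples, 128 patches, 64-dim embeddings
    out = block(signals)
    print(out.shape)                     # torch.Size([8, 128, 64])
```

In this reading, transposing the batch and sequence dimensions lets a standard multi-head attention layer relate different samples to one another, which is one way to realize "information associations between samples" alongside ordinary self-attention.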