Multi-contrast magnetic resonance imaging (MC-MRI) is widely used for the diagnosis and characterization of tumors and lesions, as multi-contrast MR images provide complementary information for more comprehensive diagnosis and evaluation. However, acquiring multiple contrasts requires long scanning times, which may in turn introduce motion artifacts that degrade image quality. Recently, many studies have proposed to use the fully-sampled image of one contrast with a short acquisition time to guide the reconstruction of another contrast with a long acquisition time, thereby accelerating the scan. These studies, however, still have two shortcomings. First, they simply concatenate the features of the two contrast images without exploring and exploiting the deep, inherent correlation between them. Second, because aliasing artifacts are complicated and non-local, image-domain reconstruction alone, which captures only local dependencies, is insufficient to eliminate these artifacts and achieve faithful reconstruction. We present a novel Dual-Domain Cross-Attention Fusion (DuDoCAF) scheme with a recurrent transformer to address these shortcomings comprehensively. Specifically, the proposed CAF scheme enables deep and effective fusion of features extracted from the two modalities. Dual-domain recurrent learning allows our model to restore signals in both the k-space and image domains, and hence remove artifacts more thoroughly. In addition, we tame recurrent transformers to capture long-range dependencies in the fused feature maps, further enhancing reconstruction performance. Extensive experiments on the public fastMRI and clinical brain datasets demonstrate that the proposed DuDoCAF outperforms state-of-the-art methods under different under-sampling patterns and acceleration rates.
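The two core ideas above can be sketched in a few lines of NumPy. The first function illustrates single-head cross-attention fusion, where queries come from the under-sampled target contrast and keys/values come from the fully-sampled reference contrast; the second illustrates one dual-domain iteration with a k-space data-consistency step. The shapes, random projections, function names, and single-head design are illustrative assumptions, not the paper's actual CAF module or recurrent transformer.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(feat_ref, feat_tar, rng=None):
    """Fuse reference-contrast features into target-contrast features
    via single-head cross-attention (a sketch of the idea, not the
    paper's exact CAF scheme).

    feat_ref, feat_tar: (N, C) arrays of N spatial tokens, C channels.
    """
    n, c = feat_tar.shape
    rng = np.random.default_rng(0) if rng is None else rng
    # Hypothetical learned projections; random here for illustration.
    w_q = rng.standard_normal((c, c)) / np.sqrt(c)
    w_k = rng.standard_normal((c, c)) / np.sqrt(c)
    w_v = rng.standard_normal((c, c)) / np.sqrt(c)
    q = feat_tar @ w_q            # queries from the target contrast
    k = feat_ref @ w_k            # keys from the reference contrast
    v = feat_ref @ w_v            # values from the reference contrast
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (N, N) cross-modal weights
    fused = attn @ v
    # Residual connection keeps the target's own information.
    return feat_tar + fused

def dual_domain_step(kspace_under, mask, image_refine):
    """One hypothetical dual-domain iteration: refine in the image
    domain, then enforce data consistency at the sampled k-space
    locations (a standard ingredient of dual-domain reconstruction).
    """
    img = np.fft.ifft2(kspace_under)
    img = image_refine(img)                 # e.g. a learned denoiser
    k = np.fft.fft2(img)
    k = np.where(mask, kspace_under, k)     # keep measured samples
    return np.fft.ifft2(k)
```

Because cross-attention compares every target token with every reference token, it exposes the non-local cross-contrast correlations that plain feature concatenation misses, while the data-consistency step guarantees the reconstruction never contradicts the acquired k-space samples.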