Protein-protein interactions (PPIs) are essential to many biological processes and are implicated in numerous diseases. However, most existing computational methods for identifying PPI modulators require either the target structure or known reference modulators, which restricts their applicability to novel PPI targets. To address this challenge, we propose MultiPPIMI, a sequence-based deep learning framework that predicts the interaction between any given PPI target and modulator. MultiPPIMI integrates multimodal representations of PPI targets and modulators and uses a bilinear attention network to capture intermolecular interactions. Experimental results on our curated benchmark data set show that MultiPPIMI achieves an average AUROC of 0.837 across three cold-start scenarios and an AUROC of 0.994 in the random-split scenario. Furthermore, a case study shows that MultiPPIMI can assist molecular docking simulations in screening inhibitors of the Keap1/Nrf2 PPI. We believe that the proposed method provides a promising way to screen PPI-targeted modulators.
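To make the distinction between random-split and cold-start evaluation concrete, the following is a minimal sketch of how cold-start splits could be constructed. It is not the authors' code. The abstract does not define the three cold-start scenarios, so the sketch assumes a common setup: held-out modulators, held-out PPI targets, and both held out at once. The function name `cold_start_split`, its parameters, and the `(ppi_id, modulator_id, label)` record layout are hypothetical.

```python
import random


def cold_start_split(pairs, mode, test_frac=0.2, seed=0):
    """Split (ppi_id, modulator_id, label) records so that test entities are unseen.

    mode is an assumed scenario name, not taken from the paper:
      "modulator": test modulators never appear in training
      "ppi":       test PPI targets never appear in training
      "both":      neither the PPI target nor the modulator of a test pair
                   appears in training
    """
    if mode not in ("modulator", "ppi", "both"):
        raise ValueError(f"unknown mode: {mode!r}")

    rng = random.Random(seed)

    def hold_out(ids):
        # Randomly pick a fraction of entity IDs to reserve for the test set.
        ids = sorted(ids)
        rng.shuffle(ids)
        k = max(1, int(len(ids) * test_frac))
        return set(ids[:k])

    held_ppis = hold_out({p for p, _, _ in pairs}) if mode in ("ppi", "both") else set()
    held_mods = hold_out({m for _, m, _ in pairs}) if mode in ("modulator", "both") else set()

    train, test = [], []
    for record in pairs:
        ppi_id, mod_id, _ = record
        ppi_new = ppi_id in held_ppis
        mod_new = mod_id in held_mods
        if mode == "both":
            if ppi_new and mod_new:
                test.append(record)
            elif not ppi_new and not mod_new:
                train.append(record)
            # Pairs with exactly one held-out entity are discarded so that
            # no test entity leaks into training.
        elif ppi_new or mod_new:
            test.append(record)
        else:
            train.append(record)
    return train, test
```

A random split, by contrast, shuffles the pairs themselves, so the same PPI target or modulator can appear on both sides. That overlap is one plausible reason random-split scores tend to be higher than cold-start scores.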