The detection of social bots has become a critical task in maintaining the integrity of social media. With social bots evolving continually, they primarily evade detection by imitating human features and engaging in interactions with humans. To reduce the impact of social bots imitating human features, also known as feature camouflage, existing methods mainly utilize multi-modal user information for detection, especially GNN-based methods that utilize additional topological structure information. However, these methods ignore relation camouflage, which involves disguising through interactions with humans. We find that relation camouflage results in both homophilic connections formed by nodes of the same type and heterophilic connections formed by nodes of different types in social networks. The existing GNN-based detection methods assume all connections are homophilic while ignoring the difference among neighbors in heterophilic connections, which leads to a poor detection performance for bots with relation camouflage. To address this, we propose a multi-modal social bot detection method with learning homophilic and heterophilic connections adaptively (BothH for short). Specifically, firstly we determine whether each connection is homophilic or heterophilic with the connection classifier, and then we design a novel message propagating strategy that can learn the homophilic and heterophilic connections adaptively. We conduct experiments on the mainstream datasets and the results show that our model is superior to state-of-the-art methods.