Multi-agent deep reinforcement learning (MADRL) has attracted tremendous interest in recent years. In this paper, we apply MADRL to the confrontation scene of Unmanned Aerial Vehicle (UAV) swarms. To analyze the dynamic game process of UAV swarm confrontation, we build two non-cooperative game models based on the MADRL paradigm. Using the multi-agent deep deterministic policy gradient (MADDPG) algorithm and the centralized-training-with-decentralized-execution (CTDE) method, we reach a Nash equilibrium in 5 vs. 5 UAV confrontation scenes. We also introduce the multi-agent soft actor-critic (MASAC) method into UAV swarm confrontation; simulation results indicate that the MASAC-based model outperforms the MADDPG-based model in exploring the UAV swarm combat environment and converges to the Nash equilibrium more effectively. Our work provides new insights into UAV swarm confrontation.
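
The CTDE paradigm named above can be illustrated with a minimal sketch: each agent's actor acts on its local observation only, while a shared critic is trained on the joint observation-action vector. The linear function approximators, dimensions, and single TD update below are illustrative assumptions for clarity, not the paper's actual networks or hyperparameters.

```python
import numpy as np

# Sketch of centralized training with decentralized execution (CTDE),
# the structure underlying MADDPG. All sizes here are hypothetical.
rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, ACT_DIM = 2, 4, 2

# Decentralized actors: one linear policy per agent, input = local obs only.
actors = [rng.normal(size=(ACT_DIM, OBS_DIM)) * 0.1 for _ in range(N_AGENTS)]

# Centralized critic: linear in the concatenated (all obs, all actions) vector.
critic_w = rng.normal(size=N_AGENTS * (OBS_DIM + ACT_DIM)) * 0.1

def act(i, obs_i):
    """Execution is decentralized: agent i sees only its own observation."""
    return np.tanh(actors[i] @ obs_i)

def q_value(all_obs, all_acts):
    """Training is centralized: the critic sees every agent's obs and action."""
    joint = np.concatenate([*all_obs, *all_acts])
    return critic_w @ joint

# One illustrative critic update toward a TD target.
all_obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
all_acts = [act(i, all_obs[i]) for i in range(N_AGENTS)]
reward, lr = 1.0, 0.01
target = reward  # terminal transition assumed, so no bootstrap term
joint = np.concatenate([*all_obs, *all_acts])
td_error = target - critic_w @ joint
critic_w += lr * td_error * joint  # gradient step on the squared TD error

print(f"Q after update: {q_value(all_obs, all_acts):.3f}")
```

At execution time only `act` is needed, so each UAV can run its policy on-board without communicating; the joint critic exists purely during training, which is what makes the otherwise non-stationary multi-agent problem tractable.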