Knowledge of fish assemblages in submarine environments is a keystone of fishery management, yet these assemblages remain poorly monitored because scientific fish sampling has heavy methodological requirements. Passive acoustic monitoring has emerged as a promising non-intrusive alternative to fish sampling, potentially providing more objective data at a reduced cost, but it still relies on heavy expert analysis of the acoustic signal. In this paper, we propose to improve the efficiency of passive acoustic monitoring with deep learning. Using convolutional recurrent neural networks, we built and tested models able to detect two types of fish vocalizations as well as motor engine sounds. The detector achieved a high F1 score $$({>}0.94)$$ and a low error rate $$({<}0.18)$$. These methods are promising tools for the management of fish communities, as they allow automatic detection and classification of fish sounds.
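For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them. It assumes the usual sound event detection definitions: F1 as the harmonic mean of precision and recall, and error rate as substitutions, deletions and insertions divided by the number of reference events. The abstract does not state which definitions were used, so these formulas are an assumption. The function names and the counts are hypothetical, chosen for illustration only, and are not results from this study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def error_rate(substitutions: int, deletions: int, insertions: int,
               n_reference: int) -> float:
    """Error rate = (S + D + I) / N, with N the number of reference events.

    This is an assumed definition (common in sound event detection),
    not one confirmed by the text above.
    """
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return (substitutions + deletions + insertions) / n_reference


# Hypothetical counts, for illustration only (not data from the paper).
tp, fp, fn = 90, 5, 5            # correct detections, false alarms, misses
subs, dels, ins = 2, 3, 3        # fn = subs + dels, fp = subs + ins
n_ref = tp + fn                  # 95 reference events

f1 = f1_score(tp, fp, fn)
er = error_rate(subs, dels, ins, n_ref)
print(f"F1 = {f1:.3f}, ER = {er:.3f}")
```

Under this definition the error rate can exceed 1 when insertions are numerous, so it is not simply the complement of accuracy, which is why it is reported alongside F1 rather than in place of it.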