Improving carbon emission efficiency is essential to addressing climate change. Widely used methods for modelling heterogeneity in efficiency evaluation tend to classify groups artificially on the basis of a single variable, which results in biased estimates. To fill this gap, this paper proposes a new method that combines machine learning with the radial directional distance function (DDF) to estimate carbon emission efficiency and reduction potential, with heterogeneous groups determined endogenously. Furthermore, index decomposition analysis (IDA) is incorporated to explore the dynamic determinants of carbon emission reduction potential. Using city-level data for China from 2010 to 2018, we find that carbon emission efficiency, once technology heterogeneity is considered, ranges from 0.569 to 0.822. This implies a substantial emission reduction potential of around 5.9 million tons in 2018. The reduction potential is attributable to managerial failure and the technology gap; the latter accounts for 46–55% of the total. We argue that the proposed method captures each city's economic and environmental information more accurately than previous methods based on geographic grouping, which may underestimate the reduction potential. We anticipate that the machine learning method in this paper could provide insights into clustering technological heterogeneity and into efficiency evaluation.

• Machine learning provides insight into evaluating carbon emission efficiency.
• Exogenous technical variables are introduced to capture heterogeneity.
• Carbon emission efficiencies in China's cities range from 0.569 to 0.822.
• China's carbon emission reduction potential was around 5.9 million tons in 2018.
• Abatement potential due to the technology gap accounts for 46–55% of the total.
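
To make the split of reduction potential into managerial failure and technology gap concrete, the following is a minimal sketch, not the paper's actual model. It assumes a textbook radial DDF under constant returns to scale, with CO2 treated as weakly disposable through an equality constraint. It also assumes a common meta-frontier decomposition: inefficiency against a city's own group frontier is read as managerial failure, and the additional inefficiency against the pooled frontier as technology gap. The group labels are hard-coded here where a clustering step would normally supply them. The four "cities", the function names `ddf_beta` and `decompose`, and all numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def ddf_beta(x0, y0, b0, X, Y, B):
    """Radial DDF score beta of one unit against a reference set (assumed form).

    max beta  s.t.  sum_j lam_j X_j <= x0              (inputs)
                    sum_j lam_j Y_j >= (1 + beta) y0   (desirable output)
                    sum_j lam_j B_j  = (1 - beta) b0   (CO2, weak disposability)
                    lam_j >= 0, beta >= 0
    """
    n = len(Y)
    c = np.zeros(n + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -beta
    A_ub = np.vstack([
        np.hstack([X.T, np.zeros((X.shape[1], 1))]),  # X^T lam <= x0
        np.hstack([-Y, [y0]])[None, :],               # -Y lam + beta*y0 <= -y0
    ])
    b_ub = np.concatenate([x0, [-y0]])
    A_eq = np.hstack([B, [b0]])[None, :]              # B lam + beta*b0 = b0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[b0],
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[-1]


def decompose(i, X, Y, B, labels):
    """Split unit i's CO2 reduction potential into two parts (assumed rule)."""
    in_group = labels == labels[i]
    beta_group = ddf_beta(X[i], Y[i], B[i], X[in_group], Y[in_group], B[in_group])
    beta_meta = ddf_beta(X[i], Y[i], B[i], X, Y, B)
    managerial = beta_group * B[i]               # gap to own group frontier
    tech_gap = (beta_meta - beta_group) * B[i]   # gap between the two frontiers
    return managerial, tech_gap


# Hypothetical toy data: one input, one output, one emission per "city".
X = np.array([[1.0], [2.0], [1.0], [2.0]])
Y = np.array([1.0, 1.0, 2.0, 2.0])
B = np.array([1.0, 2.0, 1.0, 4.0])
labels = np.array([0, 0, 1, 1])  # stand-in for endogenously clustered groups

managerial, tech_gap = decompose(1, X, Y, B, labels)
print(f"managerial failure: {managerial:.3f}, technology gap: {tech_gap:.3f}")
```

In this toy case the second city could cut emissions by about 0.667 units relative to the best practice in its own group and by a further 0.533 units if it had access to the pooled technology. Replacing the hard-coded `labels` with the output of a clustering algorithm run on technical variables would mirror the endogenous grouping the abstract describes.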