Abstract Artificial intelligence (AI), particularly deep learning, has demonstrated remarkable performance in medical imaging across a variety of modalities, including X-ray, computed tomography (CT), magnetic resonance imaging (MRI), ultrasound, positron emission tomography (PET), and pathological imaging. However, most existing state-of-the-art AI techniques are task-specific and focus on a limited range of imaging modalities. Compared to these task-specific models, emerging foundation models represent a significant milestone in AI development. These models can learn generalized representations of medical images and apply them to downstream tasks through zero-shot or few-shot fine-tuning. Foundation models have the potential to address the comprehensive and multifactorial challenges encountered in clinical practice. This article reviews the clinical applications of both task-specific and foundation models, highlighting their differences, complementarities, and clinical relevance. We also examine their future research directions and potential challenges. Unlike the replacement relationship seen between deep learning and traditional machine learning, task-specific and foundation models are complementary, despite inherent differences. While foundation models primarily focus on segmentation and classification, task-specific models are integrated into nearly all medical image analyses. However, with further advancements, foundation models could be applied to other clinical scenarios. In conclusion, all indications suggest that task-specific and foundation models, especially the latter, have the potential to drive breakthroughs in medical imaging, from image processing to clinical workflows.