Cardiovascular diseases (CVDs) impose a substantial global health burden, and tobacco use is a major risk factor. Although extensive research has identified many individual risk factors for CVDs, a gap remains in predictive models that combine clinical factors, lifestyle factors, and other determinants to predict CVD risk. Existing studies also tend to overlook interactions among risk factors within high-risk populations such as tobacco users. In this study, we examined phenotype data from over 15,000 tobacco users in the UK Biobank to investigate which additional phenotype factors have predictive power for CVD in this population. We applied multiple machine learning (ML) algorithms, namely Decision Trees (DT), Gradient Boosting (GB), Logistic Regression (LR), Random Forest (RF), and Support Vector Classification (SVC), to predict CVD risk and to estimate the importance of individual phenotype features. Analyzing the rich UK Biobank phenotype data with these algorithms allowed us to identify factors relevant to risk prediction and offers insight into the interplay of risk factors that contribute to cardiovascular events in this high-risk population.