A data set of sixty diverse black tea samples was collected and analysed using high-performance liquid chromatography-mass spectrometry (HPLC-MS). Chemical variation among black tea infusions as a function of origin, botanical variety, and processing was investigated using multivariate statistical techniques, including principal component analysis (PCA), hierarchical cluster analysis (HCA), and partial least squares discriminant analysis (PLS-DA), together with analysis of variance (ANOVA). In particular, PLS-DA allowed the identification of a variety of marker compounds responsible for differences among black teas of different origin, plant variety, and processing method. Among the most variable compounds were catechins, derivatives of quercetin, apigenin, quinic acid, and kaempferol. Rutin, epigallocatechin gallate (EGCG), quinic acid, and theaflavin (TF) contributed most to the variance. Products of black tea fermentation (theaflavin, theasinensin, and theacitrin derivatives) contributed to the PLS-DA discrimination associated with the processing of black tea.
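As an illustration of the univariate step mentioned above, the following minimal sketch computes a one-way ANOVA F statistic for a single compound across groups of samples (for example, teas grouped by origin). It is not the authors' actual workflow: the function name `one_way_anova_f` and the example peak areas are hypothetical, and the numbers are invented purely to show the calculation.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list holds the measurements
    (e.g. peak areas of one compound) for one group of samples.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the individual samples around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("zero within-group variance; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical peak areas for one compound in three origin groups
# (illustrative values only, not data from the study).
example_groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the F distribution with the reported degrees of freedom suggests that the compound differs between groups; in practice a library routine that also returns a p-value would normally be used, with correction for testing many compounds.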