单变量
统计
数学
病态的
对数正态分布
正态分布
标准差
病理
多元统计
医学
数学分析
作者
Rui Zhen Tan,C. Markus,Samuel Vasikaran,Tze Ping Loh
标识
DOI:10.1016/j.clinbiochem.2022.02.006
摘要
Indirect reference intervals and biological variation studies heavily rely on statistical methods to separate pathological and non-pathological subpopulations within the same dataset. In recognition of this, we compare the performance of eight univariate statistical methods for identification and exclusion of values originating from pathological subpopulations.The eight approaches examined were: Tukey's rule with and without Box-Cox transformation; median absolute deviation; double median absolute deviation; Gaussian mixture models; van der Loo (Vdl) methods 1 and 2; and the Kosmic approach. Using four scenarios including lognormal distributions and varying the conditions through the number of pathological populations, central location, spread and proportion for a total of 256 simulated mixed populations. A performance criterion of ± 0.05 fractional error from the true underlying lower and upper reference interval was chosen.Overall, the Kosmic method was a standout with the highest number of scenarios lying within the acceptable error, followed by Vdl method 1 and Tukey's rule. Kosmic and Vdl method 1 appears to discriminate better the non-pathological reference population in the case of log-normal distributed data. When the proportion and spread of pathological subpopulations is high, the performance of statistical exclusion deteriorated considerably.It is important that laboratories use a priori defined clinical criteria to minimise the proportion of pathological subpopulation in a dataset prior to analysis. The curated dataset should then be carefully examined so that the appropriate statistical method can be applied.
科研通智能强力驱动
Strongly Powered by AbleSci AI