An Appraisal of the Carlisle-Stouffer-Fisher Method for Assessing Study Data Integrity and Fraud

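Keywords: Medicine; Outlier; Statistics; Null hypothesis; Randomized controlled trial; Randomization; Statistical hypothesis testing; Econometrics; Mathematics; Surgery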

Authors

Edward J. Mascha, Thomas R. Vetter, Jean François Pittet

Source

Journal: Anesthesia & Analgesia [Ovid Technologies (Wolters Kluwer)]
Date: 2017-10-01 Volume (issue): 125 (4): 1381-1385 Citations: 19

Links

nih.gov, doi.org

Identifiers

DOI: 10.1213/ane.0000000000002415

Abstract

Data fabrication and scientific misconduct have been recently uncovered in the anesthesia literature, partly via the work of John Carlisle. In a recent article in Anaesthesia, Carlisle analyzed 5087 randomized clinical trials from anesthesia and general medicine journals from 2000 to 2015. He concluded that in about 6% of studies, data comparing randomized groups on baseline variables, before the given intervention, were either too similar or too dissimilar relative to what would be expected from usual sampling variability under the null hypothesis. Carlisle used the Stouffer-Fisher method of combining P values in Table 1 (the conventional table reporting baseline patient characteristics) for each study, then calculated trial P values and assessed whether they followed a uniform distribution across studies. Extreme P values flagged studies as likely to contain data fabrication or errors. In this Statistical Grand Rounds article, we explain Carlisle’s methods, highlight perceived limitations of the proposed approach, and offer recommendations. Our main findings are (1) independence was assumed between variables in a study, which is often false and would lead to “false positive” findings; (2) an “unusual” result from a trial cannot easily be concluded to represent fraud; (3) the cutoff values used to determine extreme P values were arbitrary; (4) trials were analyzed as if simple randomization was used, introducing bias; (5) not all P values can be accurately generated from summary statistics in a Table 1, sometimes giving incorrect conclusions; (6) assessing outlier status within a study from a small number of P values is not reliable; (7) the method used to assess deviations from expected distributions may stack the deck; (8) P values across trials were assumed to be independent; (9) P value variability was not accounted for; and (10) more detailed methods are needed to understand exactly what was done.
It is not yet known to what extent these concerns affect the accuracy of Carlisle’s results. We recommend that Carlisle’s methods be improved before widespread use (that is, applying them to every manuscript submitted for publication). Furthermore, lack of data integrity and fraud should ideally be assessed using multiple simultaneous statistical methods to yield more confident results. More sophisticated methods are needed for nonrandomized trials, randomized trial data reported beyond Table 1, and combating growing fraudster sophistication. We encourage all authors to more carefully scrutinize their own reporting. Finally, we believe that reporting of suspected data fraud and integrity issues should be done more discreetly and directly by the involved journal to protect honest authors from the stigma of being associated with potential fraud.
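
Illustration (not part of the published abstract): the idea of combining the baseline P values of one trial into a single trial-level P value can be sketched with the textbook forms of Stouffer's Z-score method and Fisher's method. This is a minimal sketch only. The function names and the example P values are hypothetical, the inputs are assumed to be independent and to lie strictly between 0 and 1, and nothing here is claimed to reproduce Carlisle's exact procedure.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def stouffer_combine(p_values):
    """Combine P values with Stouffer's Z-score method (textbook form).

    Assumes the P values are independent and strictly between 0 and 1.
    """
    # Map each P value to a z score; a small P value gives a large positive z.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Under the null hypothesis the scaled sum is again standard normal.
    z_combined = sum(z_scores) / math.sqrt(len(z_scores))
    return 1.0 - _STD_NORMAL.cdf(z_combined)


def fisher_combine(p_values):
    """Combine P values with Fisher's method (textbook form).

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis, again assuming
    independent P values strictly between 0 and 1.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square upper tail probability for an even number of
    # degrees of freedom (2k): exp(-X/2) * sum_{i<k} (X/2)^i / i!
    tail_sum = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_sum


if __name__ == "__main__":
    # Hypothetical baseline P values from the Table 1 of one imaginary trial.
    baseline_p = [0.91, 0.88, 0.95, 0.97, 0.93]
    print("Stouffer:", stouffer_combine(baseline_p))
    print("Fisher:  ", fisher_combine(baseline_p))
```

With the convention used in this sketch, a combined P value near 1 corresponds to groups that look too similar at baseline, and a combined P value near 0 to groups that look too dissimilar. Both functions rely on the independence assumption that finding (1) above questions: correlated baseline variables would make extreme combined values more common than the formulas suggest.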
