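Statistical inference
Estimator
Consistency (knowledge bases)
Inference
Data set
Computer science
Data mining
Mathematics
Set (abstract data type)
Statistical model
Data point
Normality
Function (biology)
Statistics
Algorithm
Artificial intelligence
Programming language
Evolutionary biology
Biology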
Authors
Lan Luo,Jingshen Wang,Emily C. Hector
Source
Journal: Biometrika
[Oxford University Press]
Date: 2023-02-20
Volume (issue): 110 (4): 841-858
Citations: 2
Identifier
DOI:10.1093/biomet/asad010
Abstract
Modern longitudinal data, for example from wearable devices, may consist of measurements of biological signals on a fixed set of participants at a diverging number of time-points. Traditional statistical methods are not equipped to handle the computational burden of repeatedly analysing the cumulatively growing dataset each time new data are collected. We propose a new estimation and inference framework for dynamic updating of point estimates and their standard errors along sequentially collected datasets with dependence, both within and between the datasets. The key technique is a decomposition of the extended inference function vector of the quadratic inference function constructed over the cumulative longitudinal data into a sum of summary statistics over data batches. We show how this sum can be recursively updated without the need to access the whole dataset, resulting in a computationally efficient streaming procedure with minimal loss of statistical efficiency. We prove consistency and asymptotic normality of our streaming estimator as the number of data batches diverges, even as the number of independent participants remains fixed. Simulations demonstrate the advantages of our approach over traditional statistical methods that assume independence between data batches. Finally, we investigate the relationship between physical activity and several diseases through analysis of accelerometry data from the National Health and Nutrition Examination Survey.
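The general idea of recursively updating a sum of per-batch summary statistics can be illustrated with a much simpler stand-in. The sketch below is not the quadratic inference function estimator described in the abstract: it uses ordinary least squares for a single covariate and assumes independent observations, so it ignores exactly the within-batch and between-batch dependence that the paper addresses. The class name `StreamingOLS` and its methods are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
import math


@dataclass
class StreamingOLS:
    """Running sums for the model y = b0 + b1 * x.

    Illustrative assumption: observations are independent. Only these
    six sums are kept; raw batches are discarded after each update.
    """
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0
    syy: float = 0.0

    def update(self, xs, ys):
        """Fold one new batch into the running summary statistics."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y
            self.syy += y * y

    def estimate(self):
        """Return (intercept, slope) from the accumulated sums."""
        d = self.n * self.sxx - self.sx ** 2
        b1 = (self.n * self.sxy - self.sx * self.sy) / d
        b0 = (self.sy - b1 * self.sx) / self.n
        return b0, b1

    def slope_se(self):
        """Standard error of the slope under the independence assumption."""
        b0, b1 = self.estimate()
        # Residual sum of squares expressed through the running sums.
        rss = max(self.syy - b0 * self.sy - b1 * self.sxy, 0.0)
        sigma2 = rss / (self.n - 2)
        return math.sqrt(sigma2 / (self.sxx - self.sx ** 2 / self.n))


# Usage: two batches arrive in sequence; estimates are refreshed after each.
stream = StreamingOLS()
stream.update([0.0, 1.0, 2.0], [1.1, 2.9, 5.2])
stream.update([3.0, 4.0], [6.8, 9.1])
intercept, slope = stream.estimate()
```

Because the sums are additive across batches, the estimate after the last batch matches what a single fit on the pooled data would give, while memory use stays constant in the number of batches.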