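Computer science
Cloud computing
Python (programming language)
Edge computing
Workload
Cluster analysis
Enhanced Data Rates for GSM Evolution
Function (biology)
Data mining
Distributed computing
Operating system
Machine learning
Artificial intelligence
Evolutionary biology
Biology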
Authors
André Bauer, Haochen Pan, Ryan Chard, Yadu Babuji, Josh Bryan, Devesh Tiwari, Ian Foster, Kyle Chard
Identifier
DOI:10.1016/j.future.2023.12.007
Abstract
We present a unique function-as-a-service (FaaS) dataset capturing the use of the Globus Compute (previously funcX) platform. Globus Compute implements a federated model in which users may deploy endpoints on arbitrary remote computers, from the edge to high performance computing (HPC) clusters, and may then invoke Python functions on those endpoints via a reliable cloud-hosted service. The dataset covers 31 weeks and includes 2,121,472 task submissions from 252 users executed on 580 remote computing endpoints. It includes 277,386 registered functions. We describe the dataset and various observations, some that are similar to other FaaS datasets, for example, that 74% of tasks run for less than 1 s, and some that are unique to Globus Compute, for example, that endpoints are used in different ways and that the majority of functions are related to scientific computing and machine learning. To the best of our knowledge, this dataset represents the first federated FaaS dataset that includes user workloads, distributed computing endpoints, and analysis of registered function bodies. We expect the dataset to be useful for researching FaaS architectures, workload modeling, container warming, and other distributed computing architectures.
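The federated model described in the abstract (endpoints deployed on remote computers, functions registered once, tasks routed through a hosted service) can be illustrated with a small in-memory sketch. This is not the Globus Compute API: `ToyService`, `ToyEndpoint`, and all of their methods are hypothetical names invented for this illustration, and everything runs locally in one process.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ToyEndpoint:
    """Hypothetical stand-in for a remote computer that executes tasks."""
    name: str
    completed: int = 0  # number of tasks this endpoint has run

    def run(self, fn: Callable[..., Any], *args: Any) -> Any:
        self.completed += 1
        return fn(*args)


@dataclass
class ToyService:
    """Hypothetical stand-in for the cloud-hosted routing service."""
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    endpoints: Dict[str, ToyEndpoint] = field(default_factory=dict)

    def register_function(self, fn: Callable[..., Any]) -> str:
        # Registering returns an ID that later task submissions refer to.
        function_id = str(uuid.uuid4())
        self.functions[function_id] = fn
        return function_id

    def register_endpoint(self, endpoint: ToyEndpoint) -> str:
        endpoint_id = str(uuid.uuid4())
        self.endpoints[endpoint_id] = endpoint
        return endpoint_id

    def submit(self, endpoint_id: str, function_id: str, *args: Any) -> Any:
        # A task pairs one registered function with one chosen endpoint.
        endpoint = self.endpoints[endpoint_id]
        return endpoint.run(self.functions[function_id], *args)


def square(x: int) -> int:
    return x * x


service = ToyService()
laptop_id = service.register_endpoint(ToyEndpoint("edge-laptop"))
cluster_id = service.register_endpoint(ToyEndpoint("hpc-cluster"))
square_id = service.register_function(square)

# The same registered function is invoked on two different endpoints.
results = [
    service.submit(laptop_id, square_id, 3),
    service.submit(cluster_id, square_id, 4),
]
```

In this toy model, the three kinds of records the abstract counts (task submissions, endpoints, and registered functions) correspond to calls to `submit`, entries in `endpoints`, and entries in `functions`.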