Performance Challenges and Solutions in Big Data Platform Hadoop

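Computer science; Big data; Scheduling (production processes); Popularity; Analytics; Data science; Job scheduler; Batch processing; Process (computing); Field (mathematics); Resource (disambiguation); Distributed computing; Cloud computing; Data mining; Engineering; Operating system; Social psychology; Mathematics; Pure mathematics; Computer network; Operations management; Psychology
Authors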
Balraj Singh,Harsh Kumar Verma,Vishu Madaan
Source
Journal: Recent advances in computer science and communications [Bentham Science]
Volume (issue): 16 (9) Cited by: 1
Identifier
DOI:10.2174/2666255816666230608165146
Abstract

Background: The present era demands continuous improvement in executing complex analytics on large-scale data and in working beyond traditional systems.

Objective: The need to process diverse data types and to provide solutions for different industry domains is rising. Such needs increase the demand for sophisticated techniques and methods that further enhance existing platforms and mechanisms. This gives the research community an opportunity to investigate existing systems further, find potential issues, and propose new ways to improve them. Hadoop is a popular choice for managing and processing Big data. It is an open-source platform and a front-runner in the batch processing of large-scale jobs. The cost of scaling a cluster is low compared to other platforms. However, this popularity by no means guarantees high performance in all scenarios. With the continuous evolution of data and industrial requirements, it is imperative to investigate new methods and techniques that advance the existing system.

Method: A systematic review is presented in this paper to give insight into the current progress in this field. Research publications from various sources are collected and analyzed. The performance of a cluster largely depends on the job processing mechanisms and policies associated with it.

Conclusion: Although extensive studies and solutions have been proposed, performance bottlenecks in load balancing, resource utilization, content management, and efficient processing persist. Few scheduling solutions address the trade-off between different parameters; the process of content splitting and merging has not been explored to a large extent; and skew mitigation solutions focus mostly on the Reduce side of MapReduce, while the Map side is little used for load balancing.