The Design of Distributed File System Based on HDFS Yannan Wang 1, a, Shudong Zhang 2, b, Hui Liu 3, c


Applied Mechanics and Materials, Trans Tech Publications, Switzerland
All rights reserved. No part of contents of this paper may be reproduced or transmitted in any form or by any means without the written permission of Trans Tech Publications.

1, 2, 3 College of Information Engineering, Capital Normal University, Beijing, China
a wangyannanme@163.com, b zsd@mail.cnu.edu.cn, c liuhui_cnu@yahoo.com.cn

Keywords: HDFS, Small File, Binary Serialization, SequenceFile

Abstract. HDFS is a distributed file system designed for access to large files, and it is inefficient at storing small files. To address this, this paper designs a new storage architecture based on HDFS. The paper mainly uses SequenceFile to merge small files and, to overcome the shortcomings of SequenceFile-based merging, proposes a new system structure based on HDFS. The system adds a file judgment unit that marks and identifies small files, creates a local index file that records the size and offset of each small file to improve retrieval efficiency, and finally uses binary serialization to merge the small files, so that small files are written into large files in time order.

Introduction

Cloud computing has not yet been given a uniform definition by the academic and industrial communities. To a certain extent, cloud computing can be regarded as the commercial development of computing concepts including distributed computing, parallel computing and grid computing, whose basic principle is that people use resources on a computer cluster via the Internet [1]. Hadoop is an open-source distributed computing framework from the Apache open-source organization. It focuses on distributed systems for mass data storage and processing, provides the MapReduce framework implemented in Java, and can deploy distributed applications on low-cost servers [2]. Hadoop handles massive large files very well, but as the number of small files grows it becomes ineffective, because storing a small file requires repeatedly requesting memory addresses and allocating blocks. A large number of small files overwhelms the single NameNode, and a large amount of metadata occupies the NameNode's memory [3]. To address these problems, this paper designs a distributed file system based on HDFS that is intended to solve the low efficiency of HDFS in processing small files.

HDFS Architecture Analysis

The HDFS architecture is built on a cluster of a large number of ordinary computers. Nodes in the cluster usually run a GNU/Linux operating system, which must support Java because HDFS is implemented in Java. HDFS uses a master/slave architecture: a cluster has one Master, called the name node (NameNode), and multiple Slaves, called data nodes (DataNodes), as shown in Figure 1. In theory, a single computer can run multiple DataNode processes and one NameNode process (the NameNode process is unique throughout the cluster), but in practice a computer usually runs either one DataNode or the NameNode [4]. A file is divided into a number of blocks, which are stored on a set of DataNodes.

Figure 1. HDFS structure
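To make the link between block division and NameNode metadata concrete, the following is a minimal sketch rather than anything taken from HDFS itself. It counts one namespace object per file plus one per block and multiplies by the figure of about 150 bytes per object that this paper quotes. The 64 MB block size and the function names `namespace_objects` and `namenode_bytes` are assumptions made for illustration.

```python
MB = 1024 * 1024
BYTES_PER_OBJECT = 150   # approximate per-object cost quoted in this paper
BLOCK_SIZE = 64 * MB     # assumed block size; the real value is configurable


def namespace_objects(file_sizes, block_size=BLOCK_SIZE):
    """Count one object per file plus one object per block it occupies."""
    # -(-a // b) is integer ceiling division; an empty file has no blocks.
    blocks = sum(-(-size // block_size) for size in file_sizes)
    return len(file_sizes) + blocks


def namenode_bytes(file_sizes, block_size=BLOCK_SIZE,
                   bytes_per_object=BYTES_PER_OBJECT):
    """Rough NameNode memory estimate for the given file sizes (in bytes)."""
    return namespace_objects(file_sizes, block_size) * bytes_per_object


if __name__ == "__main__":
    many_small = [1 * MB] * 1000   # 1000 files of 1 MB each
    one_large = [1000 * MB]        # the same amount of data in a single file
    print(namenode_bytes(many_small))  # 2000 objects -> 300000 bytes
    print(namenode_bytes(one_large))   # 17 objects   -> 2550 bytes
```

Under these assumptions the same volume of data costs roughly a hundred times more NameNode memory when stored as small files, which is the pressure the paper sets out to relieve. The estimate ignores directories and replication bookkeeping.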

Problems in HDFS Storage and Processing of Small Files

HDFS is designed for large files, and its performance advantages show when storing large files, but it has no good way to optimize small files. First, every block, file or directory in HDFS is stored as an object in NameNode memory, and each object takes about 150 bytes. If there are ten million small files, the NameNode needs 2G space (save two), and if the number of small files increases to 100 million, the NameNode needs 20G space. Small files therefore consume a lot of NameNode memory, so NameNode memory capacity severely constrains cluster expansion and its applications. Secondly, accessing a large number of small files is much slower than accessing several large files. HDFS was originally developed for streaming access to large files, and when a large number of small files are accessed, the client has to jump constantly from one DataNode to another, which seriously affects performance. Finally, it is much faster to process large files than to process a large number of small files of the same total size. Each small file takes up a slot, and task startup and release consume a lot of time, in some cases most of the task's running time [5].

Related Research

At present, there are three technologies for processing small files [6].

HAR Archive Technology [7]. The Hadoop Archives (HAR files) file system is a file system that Hadoop provides, generally used to archive files. It is designed to reduce the NameNode memory consumed by a large number of small files. A HAR file is a special file format. It is created by the Hadoop archive command, which runs a MapReduce task to package a number of smaller files into a HAR file. A HAR file cannot be changed once created; to add or delete a file, the client must re-create the archive.

SequenceFile Technology. A SequenceFile is a flat file consisting of binary key/value pairs, and it can be used as an input/output format for MapReduce [8]. A SequenceFile can use a file name as the key and the file content as the value. A program can write a number of small files into a single SequenceFile, which can then be used directly. However, SequenceFile does not establish a mapping from the small files to the large file. Without an index, querying a small file requires traversing the entire SequenceFile, which reduces read efficiency.

CombineFileInputFormat. Hadoop is not suitable for processing a large number of small files because an InputSplit generated by FileInputFormat is always the whole or a part of a single input file. When dealing with a large number of small files, each map operation handles only a small amount of input data, which results in too many map tasks and reduces overall performance. CombineFileInputFormat is a new InputFormat that can alleviate this problem. It merges multiple files into a single split, and it can take the storage location of the data into account [9].

Design of Storage Structure

The three methods described above all have problems: they still require small files to be archived in HDFS in order to reduce the number of small files, which brings a lot of inconvenience. This paper adds a judgment module to the original HDFS. The structure is shown in Figure 2. When a file arrives, the system first determines whether it is a small file. If it is, the file is handed to the Merge Small Files Unit; if it is not, the file is uploaded directly to HDFS. The following is a brief introduction to each part.
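The routing decision described above can be sketched as follows. This is only an illustration of the idea, not the authors' implementation. It uses the 1M threshold that the paper sets for its judgment unit, read here as 1 MiB, which is an assumption. The names `is_small_file` and `dispatch` are hypothetical.

```python
SMALL_FILE_THRESHOLD = 1024 * 1024  # 1M threshold from the paper, read as 1 MiB


def is_small_file(size_bytes, threshold=SMALL_FILE_THRESHOLD):
    """A file strictly smaller than the threshold counts as a small file."""
    return size_bytes < threshold


def dispatch(files, threshold=SMALL_FILE_THRESHOLD):
    """Split (name, size) pairs into merge candidates and direct uploads.

    Arrival order is preserved in both lists, so the merge step can later
    write the small files in time order.
    """
    to_merge, to_hdfs = [], []
    for name, size in files:
        if is_small_file(size, threshold):
            to_merge.append(name)   # goes to the Merge Small Files Unit
        else:
            to_hdfs.append(name)    # uploaded to HDFS directly
    return to_merge, to_hdfs


if __name__ == "__main__":
    incoming = [("log.txt", 4096), ("video.bin", 300 * 1024 * 1024),
                ("exact.bin", 1024 * 1024)]
    print(dispatch(incoming))
```

A file of exactly the threshold size is treated as large here, following the paper's wording that a small file is one whose size is less than 1M.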

Figure 2. The structure of the data storage system based on HDFS

Determine the File Unit. The user can easily upload, view and download data and complete other related operations. The unit takes the needs of non-professional users into account, providing them only with simple business operations and the final valid data. The Determine the File Type unit judges each uploaded file. To decide whether an uploaded file is small, the paper sets a specific threshold: the system uses a 1M threshold, so a file whose size is less than 1M is a small file and all others are large files. When a file is judged to be large, the unit hands it directly to the HDFS client. When a file is judged to be small, it is passed to the Merge Small Files Unit, which first creates an index file to record the size and offset of the small file.

Merge Small Files Unit. The main function of the Merge Small Files Unit is to merge small files into large files, in order to reduce the waste of Map resources caused by a large number of small files. To read small files more effectively and to overcome the low retrieval efficiency of SequenceFile-based merging, a local index file is created to store the size and offset of the current file. At the same time, to facilitate the storage of small files, this paper uses a binary serialization scheme to merge small files and processes them in time order.

Storage Section. The storage section is composed of a large number of low-cost servers and is a collection of multiple devices. The entire storage layer consists of one NameNode and multiple DataNodes, which together complete the storage operations of the whole system. The NameNode is responsible for managing the namespace of the cluster file system. The DataNode is mainly responsible for the data blocks on its storage node, reports status to the NameNode, and performs the pipeline operations of data replication.

System Flowchart. The specific workflow is shown in Figure 3.

Figure 3. System flowchart
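The paper does not give the on-disk layout of the merged file or the index, so the following is only a sketch of the general technique under assumed formats. Small files are appended to one large file in arrival order, and a local index records each file's offset and size so that a read needs one seek instead of a full traversal. The index record layout (2-byte name length, UTF-8 name, two 64-bit integers) and all function names are hypothetical.

```python
import io
import struct

ENTRY = struct.Struct(">QQ")  # assumed layout: offset and size, 64-bit big-endian


def merge_small_files(files, merged):
    """Append (name, data) pairs to `merged` in arrival order; return the index."""
    index = {}
    merged.seek(0, io.SEEK_END)           # always append at the end
    for name, data in files:
        index[name] = (merged.tell(), len(data))
        merged.write(data)
    return index


def dump_index(index):
    """Serialize the index to bytes so it can be stored as a local index file."""
    out = bytearray()
    for name, (offset, size) in index.items():
        raw = name.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw + ENTRY.pack(offset, size)
    return bytes(out)


def load_index(blob):
    """Rebuild the name -> (offset, size) mapping from serialized bytes."""
    index, pos = {}, 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        index[name] = ENTRY.unpack_from(blob, pos)
        pos += ENTRY.size
    return index


def read_small_file(merged, index, name):
    """Fetch one small file with a single seek, without scanning the merged file."""
    offset, size = index[name]
    merged.seek(offset)
    return merged.read(size)


if __name__ == "__main__":
    merged = io.BytesIO()  # stands in for the large file written to HDFS
    index = merge_small_files([("a.txt", b"hello"), ("b.txt", b"hdfs!")], merged)
    index = load_index(dump_index(index))
    print(read_small_file(merged, index, "b.txt"))
```

An in-memory `io.BytesIO` stands in for the merged file here. In a real deployment the merged data would be written through an HDFS client and the index kept locally, as the paper describes.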

Conclusions

This paper analyzes the architecture of HDFS and its deficiencies in dealing with small files. To address these shortcomings, it improves on the HDFS-based distributed file system design and proposes a new distributed file system based on HDFS that can improve the processing performance for small files. An uploaded file is first passed to the Determine the File Type unit. If the file is large, it is handed directly to HDFS. If the file is small, it is passed to the Merge Small Files Unit, where an index file that records the size and offset of the current small file is created. After a certain period of time, SequenceFile starts to merge the small files, which reduces the number of small files and the memory usage of the NameNode.

Acknowledgment

This research was supported by the China National Key Technology R&D Program (2012BAH20B03), (2013BAH19F01), (2012BAZ03836), the National Nature Science Foundation ( ), the Beijing Nature Science Foundation ( ), "The computer application technology" Beijing municipal key construction of the discipline, the Beijing Engineering Research Center, and the Beijing Educational Committee science and technology development plan project (KM ).

References

[1] Jianguang Deng, Xiaoheng Pan, Huaqiang Yuan, Research of Cloud storage and its Distributed File System, Journal of Dong Guan University of Technology, vol.19, no.7, pp.41-45.
[2] Weijiao Hao, Shijian Zhou, Dawei Peng, Research of the Cloud GIS Frame with Hadoop Cloud Platform, Jiangxi Science, vol.31, no.1.
[3] Dongxue Qin, Study on Processing of Massive Small Files Based on Hadoop, Liaoning University, China.
[4] Chunling Xu, Guangquan Zhang, Comparison and analysis of distributed file system Hadoop HDFS with traditional file system Linux FS, Journal of SuZhou University, vol.30, no.4, pp.5-9.
[5] Guangyao Zhu, The Hadoop mass processing and analysis of small files, Science and Technology Information.
[6] Yannan Wang, Hui Liu, Shudong Zhang, Research of Processing Massive Small Files Based on Hadoop, Journal of Convergence Information Technology, vol.8, no.9.
[7]
[8]
[9] Xusheng Hong, Shiping Lin, Efficiency of Storaging Small Files in HDFS Based on MapFile, Computer Systems & Applications, vol.21, no.11, 2013.

Applied Materials and Technologies for Modern Manufacturing / The Design of Distributed File System Based on HDFS


More information

Research of 3D parametric design system of worm drive based on Pro/E. Hongbin Niu a, Xiaohua Li b

Research of 3D parametric design system of worm drive based on Pro/E. Hongbin Niu a, Xiaohua Li b Advanced Materials Research Online: 2013-06-27 ISSN: 1662-8985, Vols. 712-715, pp 1107-1110 doi:10.4028/www.scientific.net/amr.712-715.1107 2013 Trans Tech Publications, Switzerland Research of 3D parametric

More information

Applied Mechanics and Materials Vol

Applied Mechanics and Materials Vol Applied Mechanics and Materials Online: 2014-02-27 ISSN: 1662-7482, Vol. 532, pp 280-284 doi:10.4028/www.scientific.net/amm.532.280 2014 Trans Tech Publications, Switzerland A Practical Real-time Motion

More information

The Research of A multi-language supporting description-oriented Clustering Algorithm on Meta-Search Engine Result Wuling Ren 1, a and Lijuan Liu 2,b

The Research of A multi-language supporting description-oriented Clustering Algorithm on Meta-Search Engine Result Wuling Ren 1, a and Lijuan Liu 2,b Applied Mechanics and Materials Online: 2012-01-24 ISSN: 1662-7482, Vol. 151, pp 549-553 doi:10.4028/www.scientific.net/amm.151.549 2012 Trans Tech Publications, Switzerland The Research of A multi-language

More information

Research on Load Balancing in Task Allocation Process in Heterogeneous Hadoop Cluster

Research on Load Balancing in Task Allocation Process in Heterogeneous Hadoop Cluster 2017 2 nd International Conference on Artificial Intelligence and Engineering Applications (AIEA 2017) ISBN: 978-1-60595-485-1 Research on Load Balancing in Task Allocation Process in Heterogeneous Hadoop

More information

The Research and Design of the Application Domain Building Based on GridGIS

The Research and Design of the Application Domain Building Based on GridGIS Journal of Geographic Information System, 2010, 2, 32-39 doi:10.4236/jgis.2010.21007 Published Online January 2010 (http://www.scirp.org/journal/jgis) The Research and Design of the Application Domain

More information

Cloud Computing. Hwajung Lee. Key Reference: Prof. Jong-Moon Chung s Lecture Notes at Yonsei University

Cloud Computing. Hwajung Lee. Key Reference: Prof. Jong-Moon Chung s Lecture Notes at Yonsei University Cloud Computing Hwajung Lee Key Reference: Prof. Jong-Moon Chung s Lecture Notes at Yonsei University Cloud Computing Cloud Introduction Cloud Service Model Big Data Hadoop MapReduce HDFS (Hadoop Distributed

More information

Constructing an University Scientific Research Management Information System of NET Platform Jianhua Xie 1, a, Jian-hua Xiao 2, b

Constructing an University Scientific Research Management Information System of NET Platform Jianhua Xie 1, a, Jian-hua Xiao 2, b Applied Mechanics and Materials Online: 2013-12-04 ISSN: 1662-7482, Vol. 441, pp 984-988 doi:10.4028/www.scientific.net/amm.441.984 2014 Trans Tech Publications, Switzerland Constructing an University

More information
