Research on Mining Cloud Data Based on Correlation Dimension Feature

Similar documents
Research on Design and Application of Computer Database Quality Evaluation Model

Fault Diagnosis of Wind Turbine Based on ELMD and FCM

Design of student information system based on association algorithm and data mining technology. CaiYan, ChenHua

Traffic Flow Prediction Based on the location of Big Data. Xijun Zhang, Zhanting Yuan

Research and Application of E-Commerce Recommendation System Based on Association Rules Algorithm

A Data Classification Algorithm of Internet of Things Based on Neural Network

Research on QR Code Image Pre-processing Algorithm under Complex Background

A Novel Image Super-resolution Reconstruction Algorithm based on Modified Sparse Representation

Research on Hybrid Network Technologies of Power Line Carrier and Wireless MAC Layer Hao ZHANG 1, Jun-yu LIU 2, Yi-ying ZHANG 3 and Kun LIANG 3,*

Hotel Recommendation Based on Hybrid Model

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

Test Analysis of Serial Communication Extension in Mobile Nodes of Participatory Sensing System Xinqiang Tang 1, Huichun Peng 2

Two Algorithms of Image Segmentation and Measurement Method of Particle s Parameters

An Adaptive Threshold LBP Algorithm for Face Recognition

Design and Implementation of Networked CNC Machine DNC System in. Colleges and Universities Based on Internet Plus

DOTNET PROJECTS. DOTNET Projects. I. IEEE based IOT IEEE BASED CLOUD COMPUTING

Qiqihar University, China *Corresponding author. Keywords: Highway tunnel, Variant monitoring, Circle fit, Digital speckle.

Prediction of traffic flow based on the EMD and wavelet neural network Teng Feng 1,a,Xiaohong Wang 1,b,Yunlai He 1,c

A Multi-Analyzer Machine Learning Model for Marine Heterogeneous Data Schema Mapping

Multisource Remote Sensing Data Mining System Construction in Cloud Computing Environment Dong YinDi 1, Liu ChengJun 1

The Establishment of Large Data Mining Platform Based on Cloud Computing. Wei CAI

Mobile Wireless Sensor Network enables convergence of ubiquitous sensor services

The Flooding Time Synchronization Protocol

A Novel Method for Activity Place Sensing Based on Behavior Pattern Mining Using Crowdsourcing Trajectory Data

Fuzzy C-means Clustering with Temporal-based Membership Function

FUZZY C-MEANS ALGORITHM BASED ON PRETREATMENT OF SIMILARITY RELATIONTP

Design of Underground Current Detection Nodes Based on ZigBee

Research on the New Image De-Noising Methodology Based on Neural Network and HMM-Hidden Markov Models

Information Security Coding Rule Based on Neural Network and Greedy Algorithm and Application in Network Alarm Detection

Personalized Search for TV Programs Based on Software Man

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. Research on motion tracking and detection of computer vision ABSTRACT KEYWORDS

Organization and Retrieval Method of Multimodal Point of Interest Data Based on Geo-ontology

The Improved Embedded Zerotree Wavelet Coding (EZW) Algorithm

A Novel Image Semantic Understanding and Feature Extraction Algorithm. and Wenzhun Huang

Open Access Research on the Data Pre-Processing in the Network Abnormal Intrusion Detection

T-S Neural Network Model Identification of Ultra-Supercritical Units for Superheater Based on Improved FCM

Research on the Application of Digital Images Based on the Computer Graphics. Jing Li 1, Bin Hu 2

International Journal of Electrical, Electronics ISSN No. (Online): and Computer Engineering 3(2): 85-90(2014)

The Application of Clustering Algorithm Based on Improved Canopy -Kmeans in Operators Data

Video Inter-frame Forgery Identification Based on Optical Flow Consistency

Comprehensive analysis and evaluation of big data for main transformer equipment based on PCA and Apriority

manufacturing process.

A Design of Remote Monitoring System based on 3G and Internet Technology

2 Proposed Methodology

Design Analysis Method for Multidisciplinary Complex Product using SysML

A liquid level control system based on LabVIEW and MATLAB hybrid programming

Face Recognition Using Vector Quantization Histogram and Support Vector Machine Classifier Rong-sheng LI, Fei-fei LEE *, Yan YAN and Qiu CHEN

Anti-Distortion Image Contrast Enhancement Algorithm Based on Fuzzy Statistical Analysis of the Histogram Equalization

HFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging

A Real-time Detection for Traffic Surveillance Video Shaking

The Study on Paper Board Thickness Measurement by Using Data Fusion

Research of Traffic Flow Based on SVM Method. Deng-hong YIN, Jian WANG and Bo LI *

Open Access Research on the Prediction Model of Material Cost Based on Data Mining

A Rapid Automatic Image Registration Method Based on Improved SIFT

Research on An Electronic Map Retrieval Algorithm Based on Big Data

Object Tracking Algorithm based on Combination of Edge and Color Information

Fuzzy Double-Threshold Track Association Algorithm Using Adaptive Threshold in Distributed Multisensor-Multitarget Tracking Systems

This paper puts forward three evaluation functions: image difference method, gradient area method and color ratio method. We propose that estimating t

Homework # 4. Example: Age in years. Answer: Discrete, quantitative, ratio. a) Year that an event happened, e.g., 1917, 1950, 2000.

Preliminary Research on Distributed Cluster Monitoring of G/S Model

The Application Research of Neural Network in Embedded Intelligent Detection

Multi-Source Spatial Data Distribution Model and System Implementation

PUBLICATIONS. Journal Papers

Information Fusion Based Fault Location Technology for Distribution Network

Compressed Sensing Algorithm for Real-Time Doppler Ultrasound Image Reconstruction

The Application of CAD/CAM in the Design of Industrial Products

Training and Application of Radial-Basis Process Neural Network Based on Improved Shuffled Flog Leaping Algorithm

An Indoor Mobile Robot Localization Method Based on Information Fusion

Analysis Range-Free Node Location Algorithm in WSN

Iterative Removing Salt and Pepper Noise based on Neighbourhood Information

Orange3 Data Fusion Documentation. Biolab

RESEARCH ON INTER-SATELLITE COMMUNICATION SYSTEM OF FRACTIONATED SPACECRAFT

Energy efficient optimization method for green data center based on cloud computing

High Resolution Remote Sensing Image Classification based on SVM and FCM Qin LI a, Wenxing BAO b, Xing LI c, Bin LI d

Research on the Checkpoint Server Selection Strategy Based on the Mobile Prediction in Autonomous Vehicular Cloud

BIG DATA-DRIVEN FAST REDUCING THE VISUAL BLOCK ARTIFACTS OF DCT COMPRESSED IMAGES FOR URBAN SURVEILLANCE SYSTEMS

An Inspection Method of Rice Milling Degree Based on Machine Vision and Gray-Gradient Co-occurrence Matrix

The Level Set Method applied to Structural Topology Optimization

Open Access Apriori Algorithm Research Based on Map-Reduce in Cloud Computing Environments

A Robust Watermarking Algorithm For JPEG Images

Temperature Calculation of Pellet Rotary Kiln Based on Texture

A Kind of Wireless Sensor Network Coverage Optimization Algorithm Based on Genetic PSO

The Analysis and Implementation of the K - Means Algorithm Based on Hadoop Platform

Efficient Path Finding Method Based Evaluation Function in Large Scene Online Games and Its Application

Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification

Implementation Of Harris Corner Matching Based On FPGA

The Design and Implementation of the Unmanned Vehicle Fixed-point Tracking System Liang-liang CHEN, Yu-ying LIU and Chun ZHAN

Research on Parallelized Stream Data Micro Clustering Algorithm Ke Ma 1, Lingjuan Li 1, Yimu Ji 1, Shengmei Luo 1, Tao Wen 2

Efficient Algorithm for Frequent Itemset Generation in Big Data

The Design of Distributed File System Based on HDFS Yannan Wang 1, a, Shudong Zhang 2, b, Hui Liu 3, c

AN EFFECTIVE DETECTION OF SATELLITE IMAGES VIA K-MEANS CLUSTERING ON HADOOP SYSTEM. Mengzhao Yang, Haibin Mei and Dongmei Huang

Wireless Smart Home Security System Based on Android

Creating Time-Varying Fuzzy Control Rules Based on Data Mining

Social Network Recommendation Algorithm based on ICIP

Design of an Intelligent PH Sensor for Aquaculture Industry

Study on Digitized Measuring Technique of Thrust Line for Rocket Nozzle

Dynamic Data Placement Strategy in MapReduce-styled Data Processing Platform Hua-Ci WANG 1,a,*, Cai CHEN 2,b,*, Yi LIANG 3,c

A Unified Data Publishing Protocol in Health Big Data Processing

Design of Coal Mine Power Supply Monitoring System

Research of Fault Diagnosis in Flight Control System Based on Fuzzy Neural Network

Transcription:

2016 4 th International Conference on Advances in Social Science, Humanities, and Management (ASSHM 2016) ISBN: 978-1-60595-412-7 Research on Mining Cloud Data Based on Correlation Dimension Feature Jingwen Tu 1, Li Tao 2 Abstract Large hierarchical cloud storage database has distributed of non-continuous massive data, the data has nonlinear characteristics of strong coupling, and using traditional methods for data mining, mining exist difficult problems. This paper proposes mining algorithm based a cloud non continuous layer data, and analyze the overall data mining model. The paper use fuzzy C means clustering algorithm to complete the semantic ontology feature point clustering beam based on semantic feature extraction and quantization encoding, to realize improved data mining algorithm. The experimental results show that the improved algorithm, the non-continuous mining level data have high precision, good performance, anti-interference ability strong, performance is superior to the traditional method. Keywords: K means algorithm; FCM algorithm; Cloud Computing; Data clustering. 1. Introduction With the rapid development of network information and large data processing technology, a large number of data are distributed in the network space to constitute the network Web Deep database through the cloud storage model. The big data information processing technology is highly developed today, using cloud computing method for data transmission and scheduling, which can effectively raise the ability to access the ability of information retrieval and the Deep Web database. In a large cloud storage database, which distribute a massive non continuous level data with self-coupling nonlinear characteristics, there is mining difficult in other world environmental disturbance. In order to improve the semantic of network database retrieval and information analysis capabilities, we need the mining method for cloud computing discontinuous hierarchical data based on cloud computing platform to build cloud mining environment data. Therefore, this paper proposes a mining algorithm for nonhierarchical data semantic ontology feature point beam based on clustering, the algorithm can analyze the overall level and non-continuous data structure model of data mining, but also extract and encode on the non-continuous level data semantic feature. On the basis of quantization coding, this paper uses the fuzzy C means clustering algorithm to realize the semantic ontology characteristic directional beam clustering, and realize the data mining algorithm improvement. Finally, the performance test is carried out by simulation experiment. 1 Nanchang Business College, Jiangxi Agriculture University, Nanchang, Jiangxi, 330044 China 2 Jiangxi Radio & TV University, Nanchang, Jiangxi, China 134

2. Working principle of data mining model The realization of the overall framework of the non-continuous data mining model based on cloud computing is shown in Figure 1. Figure 1. Realization of overall framework of data mining model based on cloud computing. 2.1 Quantization and coding In a cloud of non-continuous levels of data mining method based on the overall framework, we design a hierarchical data mining model of non-continuous mass distribution of large cloud storage in the database, due to the non-continuous level data have strong self-coupling nonlinear characteristics, it has mining difficult under big disturbance. In the cloud computing environment, this paper proposes a nonhierarchical data mining algorithm based on point semantic ontology feature beam clustering. We extract and encode non-continuous level data semantic feature, by adaptive weighted for nonhierarchical data semantic ontology model of continuous gradient window and the maximum value, we get non-continuous level data association directional weighted vector: (1) We used a 1 * N time window to compress feature, and determine data mining time window size N of the non-continuous levels, the time window is divided into many small intervals, the vector quantization encoding, testing function is x (T), the distance of a continuous sliding encoding window is as shows: (2) Among them WJ is the maximum gradient difference weighted factor of the non-continuous data, expressed as (3) In order to extract directivity of semantic feature through useful text of non-continuous level data, assuming that the data is piecewise smooth, in the linear region of piecewise stationary, they judge non-continuous hierarchical data in narrow time window TLx, TLy, to get the text feature extraction judgment type of non-continuous data: (4) The energy density spectrum of the non-continuous layer data is m; at the minimum distance of the window, we obtain the clock sampling Nj*, Among them, the vector space trajectory function of vector quantization coding is: (5) The semantic ontology is divided into 3*3 topological structure, and the specific window function is selected to obtain the output vector quantization coding object set Fm (x, y): (6) (7) among: is the estimated value of the frequency resolution;uj(k) is a set of information fusion attributes of non-continuous level data.;uj is adjustable window of Fu Liye transform;pj is 135

probability density function of semantic ontology feature. According to the above analysis,we quantify the new code as the gradient direction of the discontinuous data, and we give the non-continuous hierarchical data edge fusion series N; initialize the semantic features of non-continuous data threshold levelε; the training sequence for the edge pixels of the given non-continuous hierarchy data is {xj},j=0,1,,m-1,feature decomposition in an initial N level symbol,i=1, 2,,N.Set n=0,d-1 =. According to the above analysis, based on the vector quantization coding, the semantic ontology characteristic directional beam clustering is carried out. 2.2 Data mining implementation The improved algorithm mainly aimed at FCM algorithm cannot automatically obtain the optimal cluster number and sensitive to initial cluster. According to a great deal of information * display: The optimal cluster number usually meet c n.therefore, the range of optimal cluster number is set to: (1, n ). The basic idea of the algorithm is: first get the n clustering results by using K means algorithm is improved, the results of the n cluster is as the cluster center of FCM algorithm, corresponding calculated the effectiveness function based on principle of granularity value when the C = n, using the method of clustering center to get the C 1 clustering center, the clustering number minus 1 again, and calculate results of FCM algorithm and the effectiveness of the function value when C = n 1 clustering, order cycle, until C reaches the lower clustering number range. Finally the validity of all function value, effectiveness evaluation to obtain the best clustering. The basic flow chart is as shown in figure 2 based on K algorithm and the principle of mean grain size. Figure 2. The improved algorithm flow chart based on the K mean and the principle of granularity. According to the basic idea of the algorithm, can get the specific steps of the algorithm: 1. InitialParameter() // initializae each parameter 2. V (0)=Improved_KMeans_Centers(cmax,n ) // using improved K means algorithm,get the Cmax clustering center, as the initial clustering center of FCM algorithm 3. For c= n to 2 { 136

4. Repeat // Repeat FCM algorithm clustering process (b) 5. U = UpdataMatrix(c, n) // Update the classification matrix (b) 6. V = UpdataCenters(c, n) // updating the cluster center (b 1) (b) 7. Until V + V < ε // until to satisfy the convergence condition 8. Dis tan cemin (i, j) // Get the between class distance, calculate the distance of two kinds of min 9.MergerCenters(c) // Combined with the cluster center, the distance between classes of two kinds of minimum, merged into one category, the clustering number is C-1 10.M=CountEffectiveFunction () /// Calculation effective function value, save it to the M collection 11. } End For * 12. c = min(m) // to find out the best clustering validity function minimum value corresponding to the number in the M collection 13. Output(C*)// output clustering 3. Simulation experiment and performance test We can see from Figure 3, the original data flow is distribute in the cloud storage database by self-coupling nonlinear strong interference, which result the system has the low accuracy of data mining, poor performance. In order to quantitatively analyze the performance of mining, using this algorithm and the traditional method in data mining accuracy as test index, the experiment achieved across 10000 Monte-Carlo the output of data mining, the root mean square error of RMSE, the results shown in Figure 4. Figure 3. Time domain waveform of primary data info flow. From Figure 4, we can see that this algorithm is the non-continuous data mining, the root mean square error of the data mining is low, which shows that the data mining accuracy is higher than the traditional method, and the anti-jamming performance is strong. Figure 4. Performance comparison of data mining. 137

4. Conclusions According to the traditional data mining method mining with low precision, big error problem. This paper presents the cloud data mining algorithm based on non-continuous level, to analyze the overall data mining model, extract the non-continuous level data semantic feature and quantization encoding, using the fuzzy C mean clustering complete algorithm based on encoding, semantic ontology feature directional beam clustering, improved data mining algorithm is realized. The experimental results show that the algorithm of data mining has high precision, good performance, semantic ontology feature a directional beam has better clustering effect, strong anti-interference ability. Reference [1] Zhang Jianping, Liu Xiyu. Research and application of K-means algorithm based on clustering analysis [J]. Application Research of computers, 2007, 24 (5): 166-168. [2] Eldemerdash Y A, Dobre O A, Liao B J. Blind identification of SM and alamouti STBC-OFDM signals[j]. IEEE Transactions on Wireless Communications, 2015, 14(2): 972-982. [3] Xu Y, Tong S, LI Y. Prescribed performance fuzzy adaptive fault-tolerant control of non-linear systems with actuator faults[j]. IET Control Theory and Applications, 2014, 8(6): 420-431. [4] Huang X, Wabg Z, Li Y, et al. Design of fuzzy state feedback controller for robust stabilization of uncertain fractional-order chaotic systems[j]. Journal of the Franklin Institute 2015, 351(12): 5480-5493. 138