Performance Analysis of Video Data Image using Clustering Technique

Indian Journal of Science and Technology, Vol 9(10), DOI: 10.17485/ijst/2016/v9i10/79731, March 2016
ISSN (Print): 0974-6846, ISSN (Online): 0974-5645

D. Saravanan*
IFHE University, IBS Hyderabad, Telangana State 501203, India; sa_roin@yahoo.com

Abstract
Objectives: This paper focuses on the design of a hierarchical clustering algorithm for efficient and effective organization of data for information retrieval. Method/Analysis: A classification tree, which represents a hierarchical clustering model, is formed in COBWEB. Findings: The proposed method uses less memory and worked well for all types of video files. The paper also compares the performance of three existing video clustering algorithms: BIRCH, CURE and CHAMELEON.

Keywords: Clustering, Hierarchical Clustering, Image Processing, Performance Analysis, Video Data Mining

1. Introduction
Video clustering differs from normal clustering techniques. Because video content is unstructured, it must first be converted into a structured format before video data mining can be performed on it. The videos can then be accessed based on the content available in the video file [1]. In video clustering, time plays an important role. Due to technological developments, many duplicate files are available on the web [2].

The first type of clustering algorithm considered here is CHAMELEON, which measures the similarity between clusters based on a dynamic model. In the clustering process, two clusters are merged only if the interconnectivity and closeness (proximity) [3] between them are high relative to the internal interconnectivity of the clusters and the closeness of items within the clusters. It discovers natural clusters of different shapes and sizes.

Frequently inserting and querying enormous amounts of data requires large databases for storage.
Extracting important models from big data sets and analyzing them is of interest to researchers. There are two main groups of work in mining huge databases. The first group applies mining techniques to streaming data. The second group solves the problem with a suitable algorithm. Many researchers state that, for huge databases, processing the data as a stream is a better approach than mining the entire database. The data are scanned only once and can be retrieved, but the accuracy is lower.

The second type of clustering process is CURE. CURE uses a hierarchical clustering technique and is mostly used for large databases such as image databases. Because of its running time, the algorithm is never applied directly to large databases. To overcome this drawback, a stochastic (random) sample is drawn from the original data set and the partition technique is then applied to it. After this, the chief (representative) points are identified. With these three steps the algorithm works effectively.

The third type of clustering algorithm is BIRCH, a remembrance-type (memory-based) algorithm: the clustering process is carried out in memory, under a memory limitation.

Existing clustering algorithms can be broadly classified into partitioned and hierarchical [4-6]. Of these, CHAMELEON captures the concept of neighbourhood dynamically by taking into account the density of the region. BIRCH makes full use of available memory to derive the finest possible sub-clusters (to ensure accuracy) while minimizing I/O costs (to ensure efficiency). CURE is a hierarchical clustering technique in which each partition is nested into the next partition in the sequence. The main drawback of CURE is that identifying the chief points takes a long time, which increases the running time of the algorithm [12]. A comparative study identified the three existing algorithms BIRCH, CURE and CHAMELEON and found that each has its own drawbacks in terms of cluster formation, method of implementation and applications.

*Author for correspondence

2. Existing System
A single search is enough to form clusters. The existing techniques are not suitable when the database size increases. Video data performance has been analyzed using only one of the existing clustering techniques at a time. The existing algorithms are inefficient in time complexity and also suffer in frequency estimation of data points. No single algorithm is suitable for video data mining. The amount of stored information is large.

2.1 Proposed System
Given the desired number of clusters K and a distance-based measurement function, the task is to find a partition of the dataset that minimizes the value of the measurement function. A single search is enough to form the clusters. Performance is analyzed for all three existing clustering algorithms. Space and time complexity are reduced. With the proposed technique the efficiency of clustering increases considerably and gives optimum results.

2.2 Advantages of Proposed System
The method operates within a limited amount of available memory. CURE can identify clusters that are not only spherical but also ellipsoidal. Using partitioning and sampling, CURE can be applied to large datasets. CHAMELEON takes into account interconnectivity and relative closeness. The drawback of BIRCH regarding exact quality measurement is eliminated. The proposed method was applied effectively to various video files.

3. Experimental Setup
Implementation is the most crucial stage in achieving a successful system and in giving the users confidence that the new system is workable and effective. Here, a modified application is implemented to replace an existing one. This type of conversion is relatively easy to handle, provided there are no major changes in the system. Searching for an image in a huge amount of content is very complex work [7]. As a first step, I took a dataset as input for video data mining. I then implemented a combination of three algorithms: BIRCH, CURE and CHAMELEON [8, 9]. Implementation is the stage at which the theoretical design is turned into a working system. It can therefore be considered the most critical stage in achieving successful formation of new clusters within a short time. The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, and designing of methods. Implementation is the process of converting a new system design into operation. By implementing the three algorithms, outliers are removed. With the new algorithm, the original clusters can be obtained within a short time. The experiment uses the following setup.

Figure 1. Proposed architecture. (Block labels: Input Data; Partitioning Algorithm; Cluster using Sparse Graph; Classification tree is formed; Hierarchical clustering is applied; CURE Methodology; Merge the centroids and eliminate the outliers; CHAMELEON; Divide and conquer method; Detecting outliers; Random sampling method; Partitioning the centroids; Label the clusters.)

3.1 Initial Sub-Clusters
The first phase is finding the initial sub-clusters. It takes the input dataset and applies the sparse graph to it. It produces the edge cut of the dataset.
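As an illustration of the first phase, the sketch below builds a sparse graph over a handful of 2-D points and computes the edge cut of one candidate split. The paper does not say how its sparse graph is built, so the k-nearest-neighbour construction, the Euclidean distance, the inverse-distance edge weight, and the names `knn_sparse_graph`, `edge_cut`, `points` and `k` are all assumptions made for this example.

```python
import math


def knn_sparse_graph(points, k):
    """Return {i: {j: weight}} linking each point to its k nearest neighbours.

    The edge weight is an assumed similarity: 1 / (1 + Euclidean distance).
    """
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in dists[:k]:
            w = 1.0 / (1.0 + d)
            # Store the edge in both directions so the graph is undirected.
            graph[i][j] = w
            graph[j][i] = w
    return graph


def edge_cut(graph, part_a):
    """Sum the weights of edges with exactly one endpoint in part_a."""
    part_a = set(part_a)
    return sum(
        w
        for i, nbrs in graph.items() if i in part_a
        for j, w in nbrs.items() if j not in part_a
    )


# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
graph = knn_sparse_graph(points, k=2)
cut = edge_cut(graph, part_a=[0, 1, 2])
```

With `k=2` every point in this toy data links only to members of its own group, so splitting the graph between the two groups cuts no edges. A partitioning step that looks for a small edge cut would therefore favour exactly that split.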

3.2 The Partition Graph Function
After the neighbor graph is built, the graph is partitioned. This is done with multilevel graph partitioning algorithms, which can compute a partitioning that has a very small edge-cut.

3.3 COBWEB Clustering Technique
COBWEB is used to discover understandable patterns in data. COBWEB forms a classification tree, which represents a hierarchical clustering model. Each node holds a brief description of a concept, and the objects classified under a node are summarized by that concept. The tree structure includes the outliers, which add overhead to managing the tree.

Input: Each sub-cluster is taken as input.
Output: Outliers are removed from each sub-cluster.

3.3.1 Algorithm Approach
Step 1: Extract the frames of the video.
Step 2: Preprocess the extracted frames.
Step 3: Apply the clustering algorithm to cluster the frames.
Step 4: Store the clustered frames in the database.
Step 5: Give an image as the input query.
Step 6: Find the similarity of the image with the video content.
Step 7: Retrieve the related video for the requesting user.

3.4 Divide and Conquer Method
The divide and conquer technique is applied to improve clustering of the data stream. A two-level divide and conquer clustering algorithm is used, and it is applied to 2000 data points. The first level is the Leader algorithm, which forms a number of clusters from the original data. A representation of these clusters is obtained, and the representations are then clustered using a hierarchical clustering algorithm. This results in high quality and efficiency for objects with high dimensionality.
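The first level can be sketched as follows, assuming the usual single-pass form of the Leader algorithm: the first point becomes a leader, and every later point either joins the first leader within a distance threshold or becomes a new leader itself. The threshold value, the Euclidean distance and the names `leader_clustering`, `points` and `threshold` are assumptions for illustration; the paper does not give these details.

```python
import math


def leader_clustering(points, threshold):
    """Single-pass Leader clustering.

    Returns (leaders, members), where members[i] lists the indices of the
    points assigned to leaders[i].
    """
    leaders = []   # representative point of each cluster
    members = []   # point indices per cluster
    for idx, p in enumerate(points):
        for c, leader in enumerate(leaders):
            # Assign the point to the first leader that is close enough.
            if math.dist(p, leader) <= threshold:
                members[c].append(idx)
                break
        else:
            # No leader lies within the threshold: p starts a new cluster.
            leaders.append(p)
            members.append([idx])
    return leaders, members


# Made-up 2-D data with two obvious groups.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.2, 5.1), (0.1, 0.4)]
leaders, members = leader_clustering(points, threshold=1.0)
```

The `leaders` list plays the role of the cluster representation: it is far smaller than the original data, so a hierarchical algorithm can be run on it cheaply. Note that the result of a Leader pass depends on the order in which points arrive.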
The second level is the C-Means algorithm for data streams [10]. Here, the clusters are weighted iteratively, and the weighted clusters are incrementally clustered with the next data. The weights of outliers are not considered. Supplying proper input improves the quality of the image and helps to form better clusters [13]. Clustering both video and time series data is a very difficult process because the data points change [14].

Input: Clusters without outliers are taken as input.
Output: High quality and efficiency for objects with high dimensionality.

3.5 Random with Merging the Clusters
Because the image database is large, the algorithm is never applied to it directly. The process is performed in the following steps.
Step 1: Draw stochastic (random) data points from the stored data points.
Step 2: Divide the sample into partitions.
Step 3: Form partial clusters from the partitions of Step 2.
Step 4: Eliminate the noise.
Step 5: Identify the chief (representative) points once Step 4 is complete.

Input: Unstructured, huge data points are chosen as input.
Output: Merged data clusters.

3.6 Selecting Data Points
Following procedure 3.5, random points are selected from the data generated by procedure 3.4. Clusters are selected with the help of the chief points: points close to a chief point are assigned to that chief point's cluster.

Input: The original data set is taken as input.
Output: Labeled clusters are formed.

3.7 Pseudo Code of Frame Comparison

    ' Compare two frames pixel by pixel and count the matching pixels.
    ' Closing statements (End If, Next) are restored here; the targets of
    ' GoTo 2 and GoTo 3 are not shown in the original listing.
    x1 = imgwidth1 / 2
    y1 = imgheight1 / 2
    For y = 0 To imgheight - 1
        For x = 0 To imgwidth - 1
            ' Pixel of the first frame
            colorpixel = DisplayBM.GetPixel(x, y)
            str = Format(x, "000" & "")
            str1 = Format(y, "000" & "")
            str2 = colorpixel.r
            str3 = colorpixel.g
            str4 = colorpixel.b
            For y1 = 0 To imgheight1 - 1
                For x1 = 0 To imgwidth1 - 1
                    If x1 = x And y1 = y Then
                        ' Pixel of the second frame at the same position
                        colorpixel = DisplayOBM.GetPixel(x1, y1)
                        stro = Format(x1, "000" & "")
                        stro1 = Format(y1, "000" & "")
                        stro2 = colorpixel.r
                        stro3 = colorpixel.g
                        stro4 = colorpixel.b
                        If stro = str And stro1 = str1 Then
                            StatusBar1.Text = "Comparing Values..."
                            ' Count the pixel when all three colour channels match
                            If str2 = stro2 And str3 = stro3 And str4 = stro4 Then
                                cnt = cnt + 1
                                Me.Refresh()
                                lblmsg.visible = True
                                lblmsg.text = cnt
                            Else
                                GoTo 3
                            End If
                        End If
                    End If
                Next
            Next
        Next
    Next
    ' Store the frame once enough pixels match the threshold Val
    ' (placement of this check after the loops is assumed).
    If cnt >= Val Then
        storedb()
        GoTo 2
    End If

4. Results

Figure 2. Open input image.
Figure 3. Input image.
Figure 4. Perform clustering.
Figure 5. Clustering completed.
Figure 6. Comparison of clustering process.
Figure 7. Comparison of frames.

Figure 8. Duplicate found.
Figure 9. Eliminate the duplicate.
Figure 10. Cluster for all.
Figure 11. Graph result of comparison of clustering process.
Table 1. Table information for the cluster formation.
Table 2. Table result.

5. Conclusion
Using the CHAMELEON mechanism, an image is converted into pixel format, and images that do not belong to any cluster are also formed into a cluster [11]. However, the dataset takes more space. With BIRCH, the dataset is reduced by detecting the outliers, and the representing grids are formed. However, labeling the grids introduces some errors during clustering. CURE is therefore used for clustering after eliminating the outliers. The centroids are clustered, and the dataset is thereby reduced.

5.1 Future Enhancement
The purpose of video data mining is to discover and describe interesting patterns in large databases that hold different kinds of data file formats, such as image, audio and video data files. Here I describe video data files such as sports video, picture video and news video files. Among these video files, which can be clustered best in the minimum time? Performance measurements with the existing algorithms show that each of them performs well for only one or two types of video file, while the remaining videos are not clustered efficiently. A single clustering algorithm that performs effectively for all sets of video files is needed.

6. References
1. Cao L, Ji R, Gao Y, Liu W, Tian Q. Mining spatiotemporal video patterns towards robust action retrieval. Neurocomputing. 2013 April; 1(105):61-9.
2. Wu X, Ngo CW, Hauptmann A, Tan HK. Real-Time Near-Duplicate Elimination for Web Video Search with Content and Context. IEEE Transactions on Multimedia. 2009; 11(2):196-207.
3. Saravanan D, Tony RA. Text Taxonomy using Data Mining Clustering System. Asian Journal of Information Technology. 2015; 14(3):97-104.

4. Saravanan D, Srinivasan S. Video image retrieval using data mining Techniques. Journal of Computer Applications (JCA). 2012; 1:39-42.
5. Saravanan D, Srinivasan S. Matrix Based Indexing Technique for video data. Journal of Computer Science. 2013; 9(5):34-542.
6. Ester M, Kriegel H-P, Sander J, Xu X. A density-based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. KDD; 1996. p. 226-31.
7. Mezaris V, Kompatsiaris I, Strintzis MG. Region-based Image Retrieval using an Object Ontology and Relevance Feedback. EURASIP Journal on Applied Signal Processing. 2004; (6):886-90.
8. Saravanan D, Srinivasan S. Indexing and Accessing Video Frames by Histogram Approach. Proc. of International Conference on RSTSCC. p. 196-99.
9. Saravanan D, Srinivasan S. Video information retrieval using: CHEMELEON Clustering. International Journal of Emerging Trends and Technology in Computer Science (IJETTCS). 2013; 2(1):166-70.
10. Hiremath PS, Pujari J. Content Based Image Retrieval Using Color, Texture and Shape Features. Proceedings of Advanced Computing and Communications, ADCOM; Guwahati, Assam. 2007. p. 780-84.
11. Saravanan D, Somasundaram V. Matrix Based Sequential Indexing Technique for Video Data Mining. Journal of Theoretical and Applied Information Technology. 2014; 67(3):725-31.
12. Saravanan D, Kumar RA. Content Based Image Retrieval using Color Histogram. International Journal of Computer Science and Information Technology (IJCSIT). 2013; 4(2):242-45.
13. Janani P, Premaladha J, Ravichandran KS. Image Enhancement Techniques. Indian Journal of Science and Technology. 2015 Sep; 8(22):1-12.
14. Muruga Radha Devi D, Thambidurai T. Similarity Measurement in Recent Biased Time Series Databases using Different Clustering Methods. Indian Journal of Science and Technology. 2014 Jan; 7(2):189-98.