TABLE OF CONTENTS

CHAPTER NO.    TITLE    PAGE NO.

ABSTRACT    5
LIST OF TABLES    xv
LIST OF FIGURES    xviii
LIST OF SYMBOLS AND ABBREVIATIONS    xxi

1 INTRODUCTION    1
  1.1 INTRODUCTION    1
  1.2 WEB CACHING    2
    1.2.1 Classification of Web Cache    4
    1.2.2 Cache Replacement Algorithms    5
    1.2.3 Properties of WWW Caching System    6
  1.3 PREFETCHING    8
    1.3.1 Classification of Prefetching Algorithms    9
  1.4 CO-OPERATIVE CACHING    10
  1.5 OBJECTIVES    11
  1.6 PROPOSED SYSTEM ARCHITECTURE    12
    1.6.1 Access Log Manager (ALM) with SVM Classifier    13
    1.6.2 Cluster Based Proxy Cache Manager (PCM)    13
    1.6.3 Dynamic Hash Table (DHT) Based Co-operative Client Cache Manager System (CCM)    14
  1.7 PERFORMANCE METRICS    15
    1.7.1 Web Caching Performance Metrics    15
    1.7.2 Prefetch Metrics    16
  1.8 THESIS ORGANIZATION    18

2 LITERATURE REVIEW    20
  2.1 INTRODUCTION    20
  2.2 WEB CACHING    20
    2.2.1 Web Caching Algorithms    21
    2.2.2 Studies Based on Web Caching    23
  2.3 PREFETCHING    26
    2.3.1 Types of Web Prefetching    26
    2.3.2 Approaches to Web Prefetching    27
    2.3.3 Studies on Web Prefetching Techniques    28
    2.3.4 Clustering Based Prefetching    29
  2.4 INTEGRATING WEB CACHING AND PREFETCHING TECHNIQUES    32
  2.5 CO-OPERATIVE CACHING    35
    2.5.1 Co-operative Caching Mechanisms    35
    2.5.2 Co-operative Caching Algorithms    36
    2.5.3 Studies Based on Co-operative Caching    38
  2.6 SUMMARY    44

3 ACCESS LOG MANAGER (ALM)    45
  3.1 INTRODUCTION    45
  3.2 INTRODUCTION TO ACCESS LOG MANAGER    45
  3.3 ACCESS LOG MANAGER PROCESS    47
  3.4 SVM CLASSIFIER    48
  3.5 FRAMEWORK FOR GENERATING INPUT DATASET USING ALM    50
    3.5.1 Raw Data Collection    51
  3.6 OFFLINE COMPONENT OF ACCESS LOG MANAGER    52
    3.6.1 Sample Log File    53
    3.6.2 Log File Contents    53
  3.7 DATA PRE-PROCESSING    54
  3.8 DATA CLEANING    56
  3.9 TRAINING PHASE    57
  3.10 ALM PERFORMANCE    60
    3.10.1 Classifier Metrics    61
    3.10.2 Web Cache Performance Measures    62
  3.11 SUMMARY    64

4 CLUSTER BASED PROXY CACHE MANAGER    65
  4.1 INTRODUCTION    65
  4.2 FRAMEWORK OF PROXY CACHE MANAGER    65
    4.2.1 Authentication Manager for Proxy Cache System    67
    4.2.2 Constructing Web Navigational Graph (WNG)    70
    4.2.3 Association Rule Mining    71
    4.2.4 Inter Cluster Creation Algorithm for Proxy Cache System    72
  4.3 IMPACT OF SVM FOR VARIOUS CONFIDENCE AND SUPPORT THRESHOLDS    74
  4.4 CLUSTER BASED PREDICTION AND PREFETCHING    76
    4.4.1 Hybrid CRF Algorithm for Cache Replacement    77
    4.4.2 Performance Evaluation of CRF Algorithm    79
  4.5 IMPACT OF CACHE SIZE ON PERFORMANCE MEASURES    79
  4.6 SUMMARY    82

5 CO-OPERATIVE CLIENT CACHE MANAGER SYSTEM    83
  5.1 INTRODUCTION    83
  5.2 CO-OPERATIVE HYBRID ARCHITECTURE    84
  5.3 DHT BASED CO-OPERATIVE CLIENT CACHE MANAGER SYSTEM (CCM)    86
    5.3.1 Identifier Creation by Hashing Algorithm    87
      5.3.1.1 Finger table    91
      5.3.1.2 Key value pair table    94
    5.3.2 Node Create and Join Procedures    97
    5.3.3 Routing Algorithm    100
    5.3.4 Node Stabilize Algorithm    100
    5.3.5 Resource Searching Algorithm    103
    5.3.6 Node Exit Algorithm    105
  5.4 QUERY INTEGRATOR    107
  5.5 CLIENT CACHE MANAGEMENT    111
  5.6 SUMMARY    112

6 PERFORMANCE EVALUATION AND ANALYSIS    116
  6.1 INTRODUCTION    116
  6.2 EXPERIMENTAL SETUP AND IMPLEMENTATION    117
  6.3 ACCESS LOG MANAGER PERFORMANCE ANALYSIS    118
  6.4 CLASSIFIER EVALUATION    119
  6.5 EFFECTIVENESS OF THE WEB CACHING    120
    6.5.1 SOS Management through Hybrid Algorithm    121
    6.5.2 Cluster Based Proxy Server Cache Management through Combined LRU and LFU Algorithm    123
      6.5.2.1 Efficiency improvement by CRF in sample data sets    124
  6.6 THE EFFECTIVENESS OF PREFETCHING    126
  6.7 AVERAGE NETWORK TRAFFIC    127
  6.8 IMPACT OF CONFIDENCE AND SUPPORT THRESHOLD    129
    6.8.1 Analysis of Precision and Recall for Different Support Values    136
  6.9 IMPACT OF CACHE SIZE ON PERFORMANCE MEASURES    136
  6.10 PERFORMANCE ANALYSIS OF SIMULATION    138
  6.11 COMPARISON OF DEVELOPED APPROACH WITH EXISTING APPROACHES    139
  6.12 SUMMARY    143

7 CONCLUSION AND FUTURE WORKS    144
  7.1 SUMMARY    144
  7.2 MAJOR FINDINGS    145
  7.3 CONCLUSION    146
  7.4 SCOPE FOR FUTURE WORK    147

APPENDIX 1    148
APPENDIX 2    151
REFERENCES    156
LIST OF PUBLICATIONS    162

LIST OF TABLES

TABLE NO.    TITLE    PAGE NO.

3.1 Details of proxy server    52
3.2 Resultant preprocessed data    56
3.3 Algorithm for removing irrelevant records    57
3.4 SVM efficiency for various values of C and    60
3.5 SVM classification algorithm    60
3.6 Classifier measures of testing datasets    62
3.7 Statistics of proxy datasets after SVM classification    62
3.8 No. of web objects Vs classification efficiency    63
4.1 Algorithm for cluster creation    73
4.2 Impact of cache size on hit ratio for support 2 and confidence 0.3    74
4.3 Impact of cache size on hit ratio for support 4 and confidence 0.3    74
4.4 Impact of cache size on hit ratio for support 6 and confidence 0.3    75
4.5 Impact of cache size on hit ratio for support 8 and confidence 0.3    75
4.6 Pseudo-code for prediction and prefetching    77
4.7 CRF algorithm    79
4.8 Impact of cache size on hit ratio for proxy data set UC    80
4.9 Impact of cache size on hit ratio for proxy data set BO2    80
4.10 Impact of cache size on hit ratio for proxy data set SV    81
4.11 Impact of cache size on hit ratio for proxy data set SD    81
4.12 Impact of cache size on hit ratio for proxy data set NY    82
5.1 Object ID Table with sample clients    89
5.2 Finger Table for node C1    91
5.3 Finger Table for node C3    91
5.4 Finger Table for node C5    92
5.5 Finger Table for node C8    92
5.6 Finger Table for node C20    92
5.7 Finger Table for node C24    93
5.8 Finger Table for node C28    93
5.9 Finger Table for node C30    93
5.10 Key Value Pair Table for Node C1    95
5.11 Key Value Pair Table for Node C3    95
5.12 Key Value Pair Table for Node C5    95
5.13 Key Value Pair Table for Node C8    95
5.14 Key Value Pair Table for Node C20    96
5.15 Key Value Pair Table for Node C24    96
5.16 Key Value Pair Table for Node C28    96
5.17 Key Value Pair Table for Node C30    97
5.18 Object ID table after joining node C16    98
5.19 Object ID table after deleting node C8    105
5.20 Comparison of recency and frequency wise retrieval with proposed system    114
6.1 Average network traffic    128
6.2 CPU utilization and server load based on proxy presence    129
6.3 Improvement ratio comparison of SVM hybrid with other methods    137
6.4 Performance analysis of proposed system    141
6.5 Cache performance results    142

LIST OF FIGURES

FIGURE NO.    TITLE    PAGE NO.

1.1 Storage hierarchy    3
1.2 Classification of web cache    5
1.3 Classification of prediction algorithms    10
1.4 Co-operative web caching system    11
1.5 System architecture for information retrieval    12
3.1 Access Log Manager Process    48
3.2 Classifications of data by SVM    49
3.3 Training data classification by machine learning technique    51
3.4 Snapshot of sample proxy server log    53
4.1 Framework of cluster based proxy cache manager    67
4.2 Client registration    68
4.3 Client registration failure    68
4.4 Snapshot of sample proxy cache content    69
4.5 Snapshot of frequency updated sample cache content optimization    70
4.6 Example Web Navigational Graph    71
5.1 Client Cache Manager Modules integration process    84
5.2 Hybrid architecture used in the system    85
5.3 Co-operative chord network for shared objects with finger table    90
5.4 Co-operative chord network after joining C16    99
5.5 Co-operative chord network after relieving C8    106
5.6 Snapshot of query integrator    108
5.7 SOS object creation    109
5.8 Work break down structure    113
6.1 Classification efficiency comparison    119
6.2 Redundancy rate comparison with web objects    120
6.3 Analysis of hit ratio    122
6.4 Analysis of byte hit ratio    123
6.5 Comparison of cache hit ratio in UC data set    124
6.6 Comparison of cache hit ratio in BO2 data set    124
6.7 Comparison of cache hit ratio in SV data set    125
6.8 Comparison of cache hit ratio in SD data set    125
6.9 Comparison of cache hit ratio in NY data set    126
6.10 Precision and Recall percentages for all five data sets    127
6.11 Hit ratio Vs cache size for Support: 2, Confidence: 0.3    131
6.12 Hit ratio Vs cache size for Support: 4, Confidence: 0.3    131
6.13 Hit ratio Vs cache size for Support: 6, Confidence: 0.3    132
6.14 Hit ratio Vs cache size for Support: 8, Confidence: 0.3    132
6.15 Byte hit ratio Vs cache size for Support: 2, Confidence: 0.3    133
6.16 Byte hit ratio Vs cache size for Support: 4, Confidence: 0.3    133
6.17 Byte hit ratio Vs cache size for Support: 6, Confidence: 0.3    134
6.18 Byte hit ratio Vs cache size for Support: 8, Confidence: 0.3    134
6.19 Analysis of access latency    135
6.20 Precision and Recall analysis    136
6.21 Breakdown of request handling    139
6.22 Comparison of access latency Vs number of peers    140
6.23 Performance comparison of client server and developed system    140
6.24 Performance analysis of proposed system    142

LIST OF SYMBOLS AND ABBREVIATIONS

APACS - A Proxy Agent for Client System
ALM - Access Log Manager
AI - Artificial Intelligence
ANN - Artificial Neural Network
ANFIS - Adaptive Neuro-Fuzzy Inference System
AS - Autonomous System
BPNN - Back Propagation Neural Network
BHR - Byte Hit Ratio
CCM - Client Cache Manager
CRF - Combined Recency Frequency
CCN - Content Centric Network
CON - Content Oriented Network
CP - Content Provider
CCR - Correct Classification Rate
DG - Dependency Graph
DNS - Domain Name Server
DDG - Double Dependency Graph
DHT - Dynamic Hash Table
EC - End Consumer
XML - Extensible Markup Language
FFS - Fast Frequency Server
FI - Finite Inductive
FIFO - First In First Out
GA - Genetic Algorithm
GM - Geometric Mean
GDS - Greedy Dual Size
GDFS - Greedy Dual Size Frequency
HR - Hit Ratio
HTML - Hyper Text Markup Language
ID - Identifier
IP - Internet Protocol
ISP - Internet Service Provider
LFU - Least Frequently Used
LRU - Least Recently Used
LAC - Local Access Counter
MRU - Most Recently Used
NLANR - National Laboratory for Applied Network Research
NN - Neural Network
PSO - Particle Swarm Optimization
P2P - Peer to Peer
PPE - Prediction Prefetching Engine
PCM - Proxy Cache Manager
RR - Random Replacement
RS - Rough Set
RTT - Round Trip Time
SHA1 - Secure Hash Algorithm 1
SAC - Sharable Access Counter
SOS - Sharable Object Space
SWNET - Social Wireless Network
SV - Support Vector
SVM - Support Vector Machine
TTL - Time To Live
TNR - True Negative Rate
TPR - True Positive Rate
URL - Uniform Resource Locator
WNG - Web Navigational Graph
WSE - Web Search Engine
WAN - Wide Area Network
WWW - World Wide Web