ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System

1 ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System. Xiaodong Shi, Dan Feng. Wuhan National Laboratory for Optoelectronics, Wuhan, Hubei, PR China. Journal of Computers, Vol 7, No 8 (2012), Aug 2012

2 OUTLINE: INTRODUCTION, MOTIVATION, RELATED WORK, THE ASEP SCHEME, PERFORMANCE EVALUATION, CONCLUSION

3 INTRODUCTION Buffer caches are widely used to bridge the performance gap between the processor and mechanical disks. Multilevel buffer caches are used to break the I/O bottleneck; for example, a web page request involves caches at the client browser, proxy server, web server, database server, and background storage server. All these buffer caches form a hierarchy, and distinct buffer caches have different access patterns.

4 MOTIVATION(1/3) Pages referenced in the second-level buffer cache have two features: a larger reuse distance than in the first-level buffer cache, and active time point loss. Larger reuse distance: the reuse distance is the number of accesses between two accesses to the same block in a reference sequence. Weak temporal locality: 99% of references in second-level buffer caches have a reuse distance larger than 512. More sensitive to changes in prefetching depth: the marginal cache area becomes warmer than in a single-level buffer cache, and active time points fall in the marginal cache area.
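
The reuse distance defined on this slide can be computed directly from a reference sequence. The following is a minimal sketch, assuming the trace is a list of block identifiers; the function name `reuse_distances` and the sample trace are illustrative and not taken from the paper.

```python
def reuse_distances(trace):
    """Return (block, distance) for every re-access in the trace.

    The distance is the number of accesses that occur between two
    consecutive accesses to the same block, as defined on the slide.
    """
    last_seen = {}  # block -> position of its most recent access
    result = []
    for position, block in enumerate(trace):
        if block in last_seen:
            result.append((block, position - last_seen[block] - 1))
        last_seen[block] = position
    return result


# Hypothetical reference sequence, for illustration only.
print(reuse_distances(["A", "B", "C", "A", "B", "B"]))
# prints [('A', 2), ('B', 2), ('B', 0)]
```

A cache whose lifetime for a page is shorter than that page's reuse distance evicts the page before it is referenced again, which is why large reuse distances make the second level sensitive to prefetching depth.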

5 MOTIVATION(2/3) Compare the number of I/Os under two different prefetching depths. A, B, and C are random streams, and X1, X2, and X3 are sequential accesses. Consider the 4th access and the 8th access. These assumptions are based on the access characteristics of the second-level buffer cache.

6 MOTIVATION(3/3) The lifetimes of A, B, and C are larger than their reuse distances, which conserves the active time points of these three pages. To achieve effective sequential prefetching, keep track of the accesses to a marginal cache area of a given size to learn how many pages will reach their active time point. If pages prefetched < pages missed from active time point loss, prefetching is too aggressive and the depth should be decreased.

7 RELATED WORK(1/3) AMP adjusts the prefetching depth to avoid the premature eviction of prefetched pages: if a prefetched page is evicted before it is accessed, AMP reduces the sequential prefetching request length; otherwise, it heuristically increases the sequential request length. STEP improves the hit probability of prefetched pages: by identifying confident sequential streams, it requests more prefetched pages. These solutions cannot effectively improve the performance of the second-level buffer cache because they ignore the active time point loss.
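
The feedback rule attributed to AMP on this slide can be illustrated with a small sketch. This is not AMP's actual implementation: the step size of one page and the bounds `min_depth` and `max_depth` are assumptions chosen only to show the shrink-on-waste, grow-otherwise idea.

```python
def adjust_depth(depth, evicted_before_access, min_depth=1, max_depth=64):
    """Return the next prefetch depth for one sequential stream.

    evicted_before_access: True if a prefetched page of this stream was
    evicted before it was ever read (the prefetch was wasted).
    The +/-1 step and the bounds are illustrative assumptions.
    """
    if evicted_before_access:
        return max(min_depth, depth - 1)  # too aggressive: shrink the request
    return min(max_depth, depth + 1)      # otherwise grow heuristically
```

Note that this rule reacts only to the fate of prefetched pages; it has no term for resident pages that lose their re-access opportunity, which is the gap the slide points out.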

8 RELATED WORK(2/3) Cache management focuses on prefetched pages. LRU variations: LRU-bottom, AMP (Adaptive Multistream Prefetching), and SARC (Sequential prefetching in Adaptive Replacement Cache). LRU-bottom: maintains a single LRU list; sequential data is inserted at the LRU end; random access data is inserted at the MRU end. AMP: the replacement policy is based on LRU; sequential data is moved to the MRU end only on repeated access.
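
The LRU-bottom insertion rule described above can be sketched with a single ordered list. This is a minimal illustration, assuming a fixed page capacity and that a re-inserted page follows the same placement rule as a new one; the class name `LRUBottomCache` is hypothetical.

```python
from collections import OrderedDict


class LRUBottomCache:
    """Single LRU list: first item is the LRU end, last item is the MRU end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def insert(self, page, sequential):
        if page in self.pages:
            del self.pages[page]            # re-place an already cached page
        elif len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict from the LRU end
        self.pages[page] = True             # lands at the MRU end
        if sequential:
            # Sequential data goes to the LRU end, so it is evicted first.
            self.pages.move_to_end(page, last=False)

    def order(self):
        """Pages from the LRU end to the MRU end."""
        return list(self.pages)


cache = LRUBottomCache(capacity=3)
cache.insert("R1", sequential=False)
cache.insert("R2", sequential=False)
cache.insert("S1", sequential=True)
print(cache.order())  # prints ['S1', 'R1', 'R2']
```

Because sequential pages sit at the LRU end, the next insertion into a full cache evicts them before any random-access page.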

9 RELATED WORK(3/3) SARC optimizes the division of cache space between sequential and random access by equalizing their marginal utility. It maintains a desired size for the sequential list; if the bottom portion of the sequential list is found to be more valuable, the desired size increases.

10 THE ASEP SCHEME(1/7) A. Evaluating the active time point loss. B. Inferring the prefetching accuracy. C. Architecture overview. D. Prefetching algorithm.

11 THE ASEP SCHEME(2/7) If the depth of a sequential prefetching request is larger than 2, it will make this page lose its re-access opportunity. ASEP evaluates the misses induced by aggressive sequential prefetching, that is, accesses to evicted pages that suffered active time point loss. ASEP first identifies the pages whose lifetime will be influenced. When a prefetching request with M pages is generated: $\mathit{NUM} = (L - M)(1 - \mathit{HR}) + M$; $\mathit{NUM} = L(1 - \mathit{HR})$.

12 THE ASEP SCHEME(3/7) Evaluating the number of pages with active time point loss: $\mathit{Misses} = \mathit{NUM} \cdot (Q/P) = L(1 - \mathit{HR})(Q/P)$. Because Q is far less than P, ASEP needs a long time before it can update the value of Misses. We therefore keep track of the marginal cache area, which is set to 2% of the available cache capacity: $\mathit{Misses} = L(1 - \mathit{HR}) \cdot \frac{Q_N / (N/M)}{P_N}$.
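
The two Misses expressions on this slide are plain arithmetic and can be written as small functions. The sketch below mirrors the slide's symbols (L, HR, Q, P, Q_N, P_N, N, M) as parameter names; the slides do not define each symbol precisely, so the functions treat them as opaque numeric inputs, and the sample values are made up for illustration.

```python
def estimated_misses(L, HR, Q, P):
    """Misses = L * (1 - HR) * (Q / P), as written on the slide."""
    return L * (1.0 - HR) * (Q / P)


def estimated_misses_marginal(L, HR, Q_N, P_N, N, M):
    """Marginal-area variant: Misses = L * (1 - HR) * ((Q_N / (N / M)) / P_N)."""
    return L * (1.0 - HR) * ((Q_N / (N / M)) / P_N)


# Made-up inputs, chosen only to show the arithmetic.
print(estimated_misses(L=100, HR=0.5, Q=2, P=10))
print(estimated_misses_marginal(L=100, HR=0.5, Q_N=4, P_N=10, N=8, M=4))
```

With these inputs both expressions come to about 10 pages: 100 × 0.5 × 0.2 in the first case, and 100 × 0.5 × ((4 / 2) / 10) in the second.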

13 THE ASEP SCHEME(4/7) Inferring the prefetching accuracy: for growing sequential stream accesses, we construct a correlation graph. Predecessor node Len: history access length. Successor node Len: prefetched page length. Weighted edges: strength of correlation. A sequential stream with X pages can support $X(X-1)/2$ correlations.

14 THE ASEP SCHEME(5/7) For a node (Len: X, Count: N), N is the total number of accessed sequential streams that consist of at least X pages. Accuracy: for two nodes, node (Len: X, Count: R) and node (Len: Y, Count: S), $A(X \to Y) = T/R$.
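
The correlation graph can be illustrated with a small counting structure. This is a sketch under stated assumptions, not the paper's data structure: it assumes an edge (X, Y) links a history of X accessed pages to a further Y pages of the same stream, and that T is the number of streams containing that pair, which the slides do not spell out. Under this reading a stream of X pages yields exactly X(X-1)/2 pairs, matching the count given on the previous slide. The names `correlations` and `CorrelationGraph` are hypothetical.

```python
from collections import Counter


def correlations(stream_len):
    """All (history_len, prefetch_len) pairs supported by one stream."""
    return [(h, p)
            for h in range(1, stream_len)
            for p in range(1, stream_len - h + 1)]


class CorrelationGraph:
    def __init__(self):
        self.node_count = Counter()   # X -> streams with at least X pages (R)
        self.edge_weight = Counter()  # (X, Y) -> assumed edge weight (T)

    def record_stream(self, stream_len):
        """Fold one completed sequential stream into the graph."""
        for x in range(1, stream_len + 1):
            self.node_count[x] += 1
        for edge in correlations(stream_len):
            self.edge_weight[edge] += 1

    def accuracy(self, x, y):
        """A(X -> Y) = T / R; 0.0 when no stream of length X was seen."""
        r = self.node_count[x]
        return self.edge_weight[(x, y)] / r if r else 0.0


graph = CorrelationGraph()
for length in (4, 2, 6):  # hypothetical stream lengths
    graph.record_stream(length)
print(len(correlations(4)))  # prints 6, i.e. 4 * 3 / 2
print(graph.accuracy(2, 2))  # 2 of the 3 streams reaching 2 pages reached 4
```

In this toy history, prefetching 2 pages after a 2-page history would have been useful for two streams out of three, while prefetching 4 pages would have been useful for only one.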

15 THE ASEP SCHEME(6/7) Detector: if a predecessor block is found, a sequential stream is detected. After the current sequential stream is extended or a new sequential access is detected, the detector triggers the sequential prefetching executor.
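
The predecessor-block check performed by the detector can be sketched as follows. This is a simplified illustration, assuming integer block numbers and an unbounded tracking table; a real detector would bound the table and hand the result to the prefetching executor. The class name `SequentialDetector` is hypothetical.

```python
class SequentialDetector:
    def __init__(self):
        # Last block of each tracked stream -> length of that stream so far.
        self.stream_len = {}

    def access(self, block):
        """Record an access.

        Returns the current stream length (>= 2) when the predecessor
        block was found, i.e. a sequential stream is detected or extended,
        and 0 otherwise.
        """
        prev_len = self.stream_len.pop(block - 1, 0)
        if prev_len:
            self.stream_len[block] = prev_len + 1
            return prev_len + 1  # caller would trigger the prefetch executor
        self.stream_len[block] = 1  # possible start of a new stream
        return 0


detector = SequentialDetector()
print([detector.access(b) for b in (10, 11, 12, 50, 13)])
# prints [0, 2, 3, 0, 4]
```

The random access to block 50 does not disturb the stream ending at block 12, so the later access to block 13 still extends it.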

16 THE ASEP SCHEME(7/7) Prefetching algorithm: ASEP establishes and updates a GSA-CG model, which is used to predict the accuracy (A) of an expected prefetch request of size M. ASEP chooses M as the maximum value that satisfies A*M > Misses, so that the size M is maximized without thrashing. STEP computes the optimum value of the sequential size (N) with its cost-benefit module. The prefetch depth is determined by Min(M, N).
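
The depth selection rule on this slide can be sketched as a short search. The sketch assumes that the predicted accuracy is available as a function of the candidate size M, that Misses is a single precomputed number, and that the search is capped at a hypothetical `max_depth`; the depth N from STEP's cost-benefit module is passed in as `step_depth` rather than computed here. Returning 0 when no size qualifies is also an assumption.

```python
def choose_prefetch_depth(accuracy, misses, step_depth, max_depth=64):
    """Pick the prefetch depth as Min(M, N).

    accuracy:   callable, accuracy(m) -> predicted accuracy A for size m
    misses:     estimated misses caused by active time point loss
    step_depth: N, the size proposed by STEP's cost-benefit module
    """
    best_m = 0
    for m in range(1, max_depth + 1):
        if accuracy(m) * m > misses:  # expected hits outweigh induced misses
            best_m = m                # keep the largest qualifying size
    return min(best_m, step_depth)


# Hypothetical accuracy table for illustration.
table = {1: 0.9, 2: 0.8, 3: 0.5, 4: 0.2}
print(choose_prefetch_depth(lambda m: table.get(m, 0.0), misses=1.0, step_depth=8))
# prints 3
```

In the example, sizes 2 and 3 both satisfy A*M > Misses (1.6 and 1.5 against 1.0), size 4 does not (0.8), so M is 3 and the result is min(3, 8).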

17 PERFORMANCE EVALUATION(1/10) Experimental setup: Intel Xeon 3.0 GHz processor; 1 GB memory; 3 Seagate ST AS SATA disks, 250 GB each; rotational speed 7200 RPM; average seek time 8.5 ms; RAID-5 with the striping size set to 256 KB. We evaluate performance through trace-driven experiments: replay tool based on RAIDmeter; RAID driver (md) embedded in the Linux kernel; Fedora Core 4 Linux, kernel version; EXT3 file system; cache memory 512 MB.

18 PERFORMANCE EVALUATION(2/10) Workload: Storage Performance Council traces (OLTP and Web). The OLTP traces were obtained from a financial institution and are characterized by a sequential access pattern. The Web traces were collected from a web search workload, which is more random. Synthetic workload: we decompose a trace into multiple sub-traces and set their start times to the same time. We denote the traces as Finx-n or Webx-n, where n represents the number of sub-traces replayed simultaneously.
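
The synthetic workload construction can be sketched as follows. The slides do not say how a trace is cut into sub-traces, so this sketch assumes contiguous chunks of roughly equal record count, rebases each chunk so that it starts at time 0, and merges the chunks by timestamp for concurrent replay. The record layout `(timestamp, request)` and the function name are illustrative.

```python
import heapq


def decompose_and_overlay(trace, n):
    """Split a trace into n sub-traces that all start at time 0.

    trace: list of (timestamp, request) tuples sorted by timestamp.
    Returns one merged list ordered by the rebased timestamps.
    """
    if not trace:
        return []
    chunk = (len(trace) + n - 1) // n  # records per sub-trace, rounded up
    subtraces = []
    for i in range(0, len(trace), chunk):
        part = trace[i:i + chunk]
        start = part[0][0]
        # Rebase so every sub-trace begins at the same time.
        subtraces.append([(t - start, req) for t, req in part])
    return list(heapq.merge(*subtraces, key=lambda rec: rec[0]))


# Hypothetical four-record trace split into two sub-traces.
merged = decompose_and_overlay([(0, "a"), (5, "b"), (10, "c"), (12, "d")], n=2)
print([t for t, _ in merged])  # prints [0, 0, 2, 5]
```

Replaying more sub-traces at once interleaves more independent streams, which raises the pressure on the shared second-level cache.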

19 PERFORMANCE EVALUATION(3/10) ASEP can efficiently improve the performance of the storage system under the FIN workload. Under scarce cache space, the active time point loss dominates the efficiency of sequential prefetching.

20 PERFORMANCE EVALUATION(4/10) A small cache space worsens the active time point loss of pages.

21 PERFORMANCE EVALUATION(5/10)

22 PERFORMANCE EVALUATION(6/10)

23 PERFORMANCE EVALUATION(7/10) If the algorithm is effective, its curve should be steadier than the others.

24 PERFORMANCE EVALUATION(8/10) STEP: response time grows more quickly. APP: too aggressive under light load.

25 PERFORMANCE EVALUATION(9/10)

26 PERFORMANCE EVALUATION(10/10)

27 CONCLUSION In the second-level buffer cache, pages have a large reuse distance, which tends to cause them to be evicted prematurely. The ASEP algorithm balances the accesses to prefetched pages against the misses caused by the active time point loss that prefetching induces. ASEP can improve performance by up to 49.7% while using only 55.6% of the cache space of other prefetching algorithms.

ASEP: An Adaptive Sequential Prefetching Scheme for Second-level Storage System. JOURNAL OF COMPUTERS, VOL. 7, NO. 8, AUGUST 2012, 1853. Xiaodong Shi, Computer College, Huazhong University of Science and Technology,