Availability and Utility of Idle Memory in Workstation Clusters. Anurag Acharya, UC-Santa Barbara; Sanjeev Setia, George Mason Univ.

1 Availability and Utility of Idle Memory in Workstation Clusters. Anurag Acharya, UC-Santa Barbara; Sanjeev Setia, George Mason Univ.

2 Motivation. Explosive growth in data-intensive applications: large-scale scientific computations, data mining, multimedia apps. With high-speed LANs, remote memory can be accessed faster than local disk. Exploit idle memory on workstation clusters for data-intensive applications. ACM Sigmetrics 99

3 Exploiting Idle Memory. OS support for Global Memory Management: GMS (Univ. of Washington), Duke, UC Berkeley. No previous study of memory availability in workstation clusters. User-level software: can we build something similar to Condor for harvesting idle memory on clusters of workstations?

4 Issues addressed. What are the typical memory usage patterns on a desktop workstation? How much memory in a cluster can be expected to be idle? How much benefit can a user expect from exploiting idle memory? Can we exploit idle memory without inconveniencing workstation owners?

5 Outline. Memory availability: metrics, methodology, results. User-level software for exploiting idle memory. Utility of exploiting idle memory.

6 Memory Usage. Memory usage: kernel, file cache, process memory (virtual memory), free memory. Free memory, idle memory. In modern systems, the file cache and the virtual memory system are unified.

7 Idle memory. Busy memory: kernel memory; Resident Set Size (RSS) of active processes (CPU usage > 0); active file cache memory. UNIX always maintains a pool of free pages, determined by the kernel variable lotsfree. Idle mem = Total mem - Busy mem - Lotsfree.
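
The idle-memory definition on this slide can be written as a small calculation. This is a minimal sketch, assuming all quantities are given in MB; the function name `idle_memory_mb`, the clamp at zero, and the sample values are illustrative assumptions, not taken from the study.

```python
def idle_memory_mb(total_mb, kernel_mb, active_rss_mb, active_file_cache_mb, lotsfree_mb):
    """Idle mem = Total mem - Busy mem - Lotsfree.

    Busy memory is the sum of kernel memory, the RSS of active processes,
    and active file cache memory. The result is clamped at zero
    (an assumption added for this sketch).
    """
    busy_mb = kernel_mb + active_rss_mb + active_file_cache_mb
    return max(0, total_mb - busy_mb - lotsfree_mb)


# Hypothetical 64 MB host; the component sizes are made up for illustration.
print(idle_memory_mb(total_mb=64, kernel_mb=10, active_rss_mb=20,
                     active_file_cache_mb=6, lotsfree_mb=2))
```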

8 Active File Cache Memory. Two components: open files in currently active processes are considered active; cached files that will be needed by processes that will be active in the near future. Executables, library files, etc. that are mapped into process address space. Files that are opened by the process.

9 Measuring busy memory. Kernel memory: in Solaris, kernel statistics interface (kstat). Process memory = sum of RSS of active processes: active processes are reported by ps and top; the X server and periodic daemons are always considered active. RSS of a process = private + shared parts: avoid double-counting of shared memory; memtool kernel module for Solaris.
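
One way to sum the RSS of active processes without double-counting shared pages is sketched below. It assumes each process snapshot already provides its private resident size and a per-segment breakdown of its shared resident size (roughly the kind of information a per-mapping tool reports). The `ProcSnapshot` type, its field names, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcSnapshot:
    name: str
    cpu_percent: float
    private_kb: int
    # Shared segment name -> resident KB of that segment (hypothetical layout).
    shared_kb: dict = field(default_factory=dict)
    # For processes such as the X server that are always treated as active.
    always_active: bool = False


def active_process_memory_kb(procs):
    """Sum private memory per active process; count each shared segment once."""
    private_total = 0
    shared_seen = {}
    for p in procs:
        if not (p.cpu_percent > 0 or p.always_active):
            continue  # inactive processes do not contribute to busy memory
        private_total += p.private_kb
        for segment, kb in p.shared_kb.items():
            # Keep the largest resident size seen for a segment, counted once.
            shared_seen[segment] = max(shared_seen.get(segment, 0), kb)
    return private_total + sum(shared_seen.values())


procs = [
    ProcSnapshot("emacs", 1.5, 900, {"libc.so.1": 400}),
    ProcSnapshot("Xsun", 0.0, 3000, {"libc.so.1": 400}, always_active=True),
    ProcSnapshot("idle-shell", 0.0, 200, {"libc.so.1": 400}),
]
print(active_process_memory_kb(procs))
```

Taking the maximum resident size per shared segment is one possible convention; a tool with page-level information could count shared pages exactly.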

10 Snippet of pmem output. Columns: Address, Kbytes, Resident, Shared, Private, Permissions, Mapped File. Mappings listed (permissions, mapped file): read/exec emacs; read/write/exec emacs; read/write/exec [ heap ]; read/exec libc_psr.so.1; read/exec libmp.so.2; read/write/exec libmp.so.2.

11 Measuring Busy Memory, cont'd. Active file cache memory: open files are reported by lsof; memory associated with files in the cache is reported by memtool; look ahead in the gathered traces to account for cached files that will be needed in the future. Caveat: this is an approximation, since we do not capture file system calls in our traces.
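
The look-ahead step can be illustrated with a short sketch. It assumes the trace is a list of snapshots, each a set of files open in active processes, and treats a cached file as active if it is open now or in any of the next few snapshots. The function name and the window length `lookahead` are assumptions; the slides do not state how far ahead the analysis looks.

```python
def active_cached_files(snapshots, index, cached_files, lookahead=40):
    """Return cached files that are open now or within the look-ahead window.

    snapshots    -- list of sets; snapshots[i] holds files open in active
                    processes at snapshot i
    index        -- position of the current snapshot
    cached_files -- set of files currently held in the file cache
    lookahead    -- number of future snapshots to inspect (hypothetical value)
    """
    window = snapshots[index:index + 1 + lookahead]
    needed = set().union(*window)
    return cached_files & needed


trace = [{"a"}, set(), {"b"}, {"c"}]
print(sorted(active_cached_files(trace, 0, {"a", "b", "c", "d"}, lookahead=2)))
```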

12 Methodology. Took snapshots of the system at 90-second intervals for several weeks on two clusters. Sun workstations, since the kernel tools are only available for Solaris. George Mason cluster: 23 workstations: 7 with 32 MB, 14 with 64 MB; total memory 1.4 GB. UCSB cluster: 29 workstations: 6 with 64 MB, 14 with 128 MB; total memory 5.2 GB.

13 Figure: Memory Usage (64 MB Hosts). Series: Avail Mem, Proc Mem, File Mem, Dyn Kmem, Stat Kmem, Lotsfree. Axes: Memory (MB) vs. Time (days).

14 Figure: Memory Usage (128 MB Hosts). Series: Avail Mem, Proc Mem, File Mem, Dyn Kmem, Static Kmem, Lotsfree. Axes: Memory (MB) vs. Time (days).

15 Figure: Memory Usage (256 MB Hosts). Series: Avail Mem, Proc Mem, File Mem, Dyn Kmem, Static Kmem, Lotsfree. Axes: Memory (MB) vs. Time (days).

16 Table: average amount of memory (KB) used for different purposes, by host type (32 MB hosts and larger). Columns: kernel, file-cache, process, available.

17 Figure: Variation of Busy Memory with Time on a 32 MB workstation. Series: availmem. Axes: Available Memory (MB) vs. Time (days).

18 Figure: Variation of Busy Memory with Time on a 256 MB workstation. Series: availmem. Axes: Available Memory (MB) vs. Time (days).

19 Figure: Aggregate Available Memory (ucsb). Series: Idle Hosts Only, All Hosts, Total Memory. Axes: Memory (MB) vs. Time (days).

20 Figure: Aggregate Available Memory (gmu). Series: Idle Hosts Only, All Hosts, Total Memory. Axes: Memory (MB) vs. Time (days).

21 Results. A substantial fraction of the total memory on desktop machines is idle: it grows from 16 MB on 32 MB hosts to 186 MB on 256 MB hosts. A large fraction of the total memory on a cluster is available: 60-68% considering all hosts; 53% considering only idle hosts. Many dips in memory availability on workstations => perception of memory shortage.
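
As a quick arithmetic check, the per-host figures quoted on this slide can be turned into fractions of installed memory. The helper name `idle_fraction` is illustrative; the input numbers are the ones stated on the slide.

```python
def idle_fraction(idle_mb, total_mb):
    """Fraction of a host's installed memory that is idle."""
    return idle_mb / total_mb


# (total MB, idle MB) pairs as quoted on the slide.
for total_mb, idle_mb in [(32, 16), (256, 186)]:
    print(f"{total_mb} MB host: {idle_fraction(idle_mb, total_mb):.1%} of memory idle")
```

On these figures the idle share rises with host size, from half of memory on 32 MB hosts to roughly 73% on 256 MB hosts.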

22 Exploiting Idle Memory. User-level software inspired by Condor. Goals: allow data-intensive applications to use remote memory as an intermediate layer in the storage hierarchy between local memory and disk; no social nuisance: only harvest memory from idle workstations.

23 Figure: system architecture. Components: APPLICATION, LIBRARY, CENTRAL MANAGER, WORKSTATION W, IDLE MEMORY DAEMON, RESOURCE MANAGER. Edge labels: mopen, mread, mwrite, mclose; mopen, mclose; mread, mwrite; idle / active; fork / kill.

24 Memory Recruitment Policy. Based on the availability study. Recruit memory only from idle machines. Limit recruited memory to Total Memory - (Active Memory + 15% of Total Memory + Lotsfree). Active Memory can be measured using user-level tools.
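
The recruitment limit on this slide can be sketched as a function. This is only an illustration of the stated formula, assuming all sizes are in MB; the function name, the `host_is_idle` flag, the clamp at zero, and the sample host are assumptions.

```python
def recruitable_memory_mb(total_mb, active_mb, lotsfree_mb, host_is_idle,
                          headroom_fraction=0.15):
    """Upper bound on memory to recruit from one workstation.

    Limit = Total - (Active + 15% of Total + Lotsfree), and only idle
    machines are eligible. Negative limits are clamped to zero
    (an assumption added for this sketch).
    """
    if not host_is_idle:
        return 0.0
    limit = total_mb - (active_mb + headroom_fraction * total_mb + lotsfree_mb)
    return max(0.0, limit)


# Hypothetical idle 128 MB host with 40 MB active memory and 2 MB lotsfree.
print(round(recruitable_memory_mb(128, 40, 2, host_is_idle=True), 1))
```

The 15% term leaves headroom beyond the measured active memory for the owner's own use.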

25 Simulation Study. Workload: data-intensive applications. LU: out-of-core LU factorization (dataset 536 MB). DMINE: data mining (dataset 4 GB). DB2: 5 queries on a large DB2 database (5.2 GB). RANDOM (dataset 2 GB). HOTCOLD (dataset 2 GB). Traces of file system calls made by these apps. Memory and CPU availability traces for two weeks. Repeated execution of benchmarks DB2 and DMINE with an inter-execution delay.

26 System Model. Application executes on its own workstation: 4 processors, 256 MB memory, 2 disks. Local disk based on Seagate Cheetah 9: avg. seek 5.8 ms, max seek 15.7 ms, RPM, transfer rate 18 MB/sec. High-speed network: bandwidth 70 MB/sec, host-to-host latency 7.5 µs.
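
To see why remote memory is attractive under this system model, a back-of-the-envelope comparison of fetching one block from local disk and from remote memory is sketched below. The 8 KB block size is an assumption, MB is taken as 10^6 bytes, and the model is deliberately simplified: it ignores rotational latency, queueing, and software overhead, so it is not the simulator used in the study.

```python
def disk_access_ms(block_kb, avg_seek_ms=5.8, transfer_mb_per_s=18.0):
    """Average seek time plus transfer time for one block from local disk."""
    transfer_ms = block_kb / 1000.0 / transfer_mb_per_s * 1000.0
    return avg_seek_ms + transfer_ms


def remote_memory_access_ms(block_kb, latency_us=7.5, bandwidth_mb_per_s=70.0):
    """Host-to-host latency plus transfer time for one block over the network."""
    transfer_ms = block_kb / 1000.0 / bandwidth_mb_per_s * 1000.0
    return latency_us / 1000.0 + transfer_ms


block_kb = 8  # hypothetical block size
disk_ms = disk_access_ms(block_kb)
remote_ms = remote_memory_access_ms(block_kb)
print(f"disk: {disk_ms:.3f} ms, remote memory: {remote_ms:.3f} ms")
```

Under these assumptions the seek time dominates the disk access, while the network fetch is dominated by transfer time.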

27 Simulation Study. Workload: 5 applications (3 real, 2 synthetic), data set sizes 536 MB GB. Traces of file system calls made by these apps. Memory and CPU availability traces for two weeks. Repeated execution of applications. System model: application platform: 4 CPUs, 256 MB, 2 disks; disk: RPM, transfer rate 18 MB/sec; network: bandwidth 70 MB/sec, latency 7.5 µs.

28 Results. Significant benefit to exploiting idle memory on workstation clusters: for applications whose footprints fit in available memory, speedup of up to 2.1. The memory recruitment policy minimizes any delays experienced by the owner on workstation reclamation. Simulation results validated via implementation: Koussih, Acharya, Setia (HPDC 99).

29 Conclusions. A substantial fraction of the total memory on desktop machines is not in use: it grows from 16 MB on 32 MB hosts to 186 MB on 256 MB hosts. A large fraction of the total memory on a cluster is available: 60-68% considering all hosts; 53% considering only idle hosts. The memory recruitment policy results in no adverse impact on the owner while leading to speedups of up to 2.1.

30 Figure: Speedup for DMINE and RANDOM. Series: lat 30, bw 10; lat 30, bw 70; lat 7.4, bw 70.
