Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems. Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross


1 Two-Choice Randomized Dynamic I/O Scheduler for Object Storage Systems Dong Dai, Yong Chen, Dries Kimpe, and Robert Ross

2-4 Parallel Object Storage Many HPC systems utilize object storage: PVFS, Lustre, PanFS, Ceph, etc. Files are arranged as lists of objects, which are physically distributed across OSDs. Each I/O request is mapped to objects, and each object is served by one OSD (Object Storage Device). [Figure: applications issue requests p1..pn; a file is decomposed into objects, which are placed on OSDs; some OSDs become I/O stragglers]
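A hypothetical sketch of how a byte range in a file might map to objects and OSDs under simple fixed striping; the object size, OSD count, and round-robin placement rule are illustrative assumptions, not the layout of any particular file system:

```python
# Map a (offset, length) file request onto objects and OSDs under a fixed
# striping layout. All constants and the placement rule are assumptions.

OBJECT_SIZE = 4 * 1024 * 1024   # assumed 4 MB objects
NUM_OSDS = 4                    # assumed number of object storage devices

def map_request(offset, length):
    """Split a file request into per-object pieces and assign each object
    to an OSD round-robin by object index."""
    pieces = []
    end = offset + length
    while offset < end:
        obj_index = offset // OBJECT_SIZE
        obj_offset = offset % OBJECT_SIZE
        chunk = min(OBJECT_SIZE - obj_offset, end - offset)
        osd = obj_index % NUM_OSDS          # fixed placement: object -> OSD
        pieces.append((obj_index, obj_offset, chunk, osd))
        offset += chunk
    return pieces

if __name__ == "__main__":
    for piece in map_request(offset=6 * 1024 * 1024, length=3 * 1024 * 1024):
        print("object %d, offset %d, length %d -> OSD %d" % piece)
```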

5-7 Motivation: I/O Stragglers The existence of I/O stragglers is a well-known problem. Long-term I/O stragglers last hours, days, or even persist permanently; statistical data help identify them; they are caused by software bugs, hardware failures, or outdated hardware. Short-term, dynamic I/O stragglers last minutes, seconds, or even less; there is no good strategy to identify them; they come from interference between applications or resource contention. With more clients and storage servers, this problem will not become better. In this research, we focus on detecting and avoiding short-term, dynamic I/O stragglers.

8 Two-choice randomized, dynamic HPC I/O scheduler Identify and avoid short-term stragglers by tracking the real-time performance of storage servers. Dynamically place write operations on OSDs in a decentralized way (using two-choice randomization). Provide an efficient way to track the dynamic data placement introduced by the I/O scheduling.

9-13 Two-Choice Algorithm A parallel, randomized load balancer (M. D. Mitzenmacher et al., The Power of Two Choices in Randomized Load Balancing), also applied in task schedulers recently (K. Ousterhout, Sparrow: Distributed, Low Latency Scheduling, SOSP 2013). Having two choices yields a qualitative improvement in the maximal queue length: instead of randomly choosing one server for each request, randomly choose two and select the less loaded one. [Figure: requests queuing at servers; maximal queue length under one random choice vs. two random choices]
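A minimal sketch of the two-choice idea and its effect on the maximal queue length; the server count, request count, and random seed are illustrative assumptions rather than values from the talk:

```python
# Power of two choices: each request samples d random servers and joins the
# shortest of the sampled queues. d=1 degenerates to pure random placement.
import random

def schedule(num_servers, num_requests, d):
    """Return final queue lengths when each request probes d random servers
    and is placed on the least-loaded one."""
    queues = [0] * num_servers
    for _ in range(num_requests):
        candidates = random.sample(range(num_servers), d)
        target = min(candidates, key=lambda s: queues[s])
        queues[target] += 1
    return queues

if __name__ == "__main__":
    random.seed(0)
    for d in (1, 2):   # d=1: random selection; d=2: two-choice
        max_len = max(schedule(num_servers=1000, num_requests=10000, d=d))
        print("d=%d -> maximal queue length %d" % (d, max_len))
```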

14-16 Is Two-Choice Good Enough? Two-choice seems promising, so we simulate it in HPC I/O to see what happens: run background workloads that issue I/O operations on the storage servers; synthetically create some slower servers (I/O stragglers) by putting 5x load on them; then run a new application with parallel I/O operations and measure its response time (determined by the slowest I/O operation). Setup: 1,000 storage nodes, 10,000 processes, 1 MB writes, 5 ms round-trip time; the straggler ratio is the percentage of servers that are much slower. Observations: random selection can easily be worse than the fixed strategy, and two-choice converges to the same performance as the fixed scheduler. [Figure: response time vs. straggler ratio for the Fixed Scheduler, Random Selection, and Two-Choice Random]
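A rough sketch of this kind of simulation, under simplifying assumptions that are not the talk's simulator: pending load is used as a proxy for service time, stragglers start with 5x the background load, and the application's response time is taken as the slowest write's wait:

```python
# Compare random selection (d=1) against two-choice (d=2) in the presence of
# synthetic stragglers. All constants and the cost model are illustrative.
import random

def simulate(num_servers=1000, num_writes=10000, straggler_ratio=0.1,
             base_load=10, d=2):
    stragglers = set(random.sample(range(num_servers),
                                   int(straggler_ratio * num_servers)))
    # Background load: stragglers start with 5x the pending work.
    load = [base_load * (5 if s in stragglers else 1)
            for s in range(num_servers)]
    finish = []
    for _ in range(num_writes):
        probed = random.sample(range(num_servers), d)
        target = min(probed, key=lambda s: load[s])
        load[target] += 1
        finish.append(load[target])        # proxy for this write's wait time
    return max(finish)                     # response time = slowest write

if __name__ == "__main__":
    random.seed(1)
    print("two-choice response proxy:", simulate(d=2))
    print("random-selection proxy:  ", simulate(d=1))
```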

17 Why do the I/O schedulers perform like this? By analyzing the scheduling results, we find: random selection will most likely hit some of the stragglers and will still put I/O requests on heavily loaded servers; the native two-choice algorithm can avoid stragglers when there are not many of them, but it tends to put many more I/O requests on servers with slightly less load and may generate new hotspots. [Figure: maximal number of scheduled I/O requests on servers at each load level (pending I/O requests), for the Fixed Scheduler, Random Selection, and Two-Choice Random]

18-19 Extend Native Two-Choice We cannot simply apply native two-choice in HPC I/O because: each scheduler probes only two random storage servers, so it cannot avoid stragglers effectively; and highly concurrent HPC applications place I/O requests at the same time, so all schedulers probe at the same time, make the same scheduling decision, and may generate new stragglers. We extend native two-choice with two strategies: the Collaborative Probe strategy and the Preassign strategy.

20-27 Collaborative Probe (CP) Strategy Combine the concurrent probes issued from a single compute node, and attach more scheduling information to the probe replies from storage servers. Instead of probing two servers and selecting one per request, the k concurrent requests of a node probe 2*k servers and select k of them. The schedulers thereby know k times more server information, and they also learn from short-term cached server information attached to the probe replies.
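A minimal sketch of the collaborative-probe idea, assuming an in-memory stand-in for the probe RPC and a plain dictionary as the short-term cache; class and method names are illustrative:

```python
# k requests issued concurrently from one compute node share a single probe
# of 2*k random servers; the k least-loaded servers are selected, one per
# request, and everything learned is cached for later scheduling decisions.
import random

class CollaborativeScheduler:
    def __init__(self, num_servers):
        self.num_servers = num_servers
        self.cached_loads = {}            # short-term server info from probes

    def probe(self, servers):
        """Stand-in for an RPC returning each server's pending-I/O count."""
        return {s: random.randint(0, 20) for s in servers}

    def schedule_batch(self, k):
        """Place k concurrent requests: probe 2*k servers, select the k
        least-loaded, and remember what the probes revealed."""
        probed = random.sample(range(self.num_servers), 2 * k)
        loads = self.probe(probed)
        self.cached_loads.update(loads)   # k times more server info retained
        return sorted(probed, key=lambda s: loads[s])[:k]

if __name__ == "__main__":
    random.seed(2)
    sched = CollaborativeScheduler(num_servers=128)
    print("targets for 8 concurrent writes:", sched.schedule_batch(k=8))
```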

28-30 With the Collaborative Probe (CP) Strategy The scheduler observes more load information and delivers better performance when the straggler ratio is low. But with more stragglers, the performance drops quickly, caused by the high concurrency of HPC I/O. [Figure: response time vs. straggler ratio for the Fixed Scheduler, Random Selection, Two-Choice Random, and Two-Choice with Collaborative Probe]

31-36 Preassign Strategy Storage servers maintain statistics about how often a probe issued at a given local load value is accepted. When answering a probe, a server adds a pre-assigned fraction of load to the value it reports, based on these historical statistics. After receiving the real requests (or not), storage servers adjust these preassigned loads. [Figure: timeline on an OSD: a probe returns the local load info, the scheduler compares and selects, the I/O arrives (or not), and the load information is updated (or left unchanged); later probes repeat the cycle]
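A sketch of the preassign bookkeeping from the storage server's perspective; the acceptance statistics, the fractional pre-assignment, and the adjustment on arrival are simplified assumptions, not the exact formulas from the talk:

```python
# When probed, the server reports its load plus a fraction of one request,
# where the fraction is the historical acceptance rate at this load level.
# The pre-assigned fraction is later replaced by a full request (if the I/O
# arrives) or rolled back (if it does not).
class PreassignOSD:
    def __init__(self):
        self.load = 0.0                    # pending I/O requests
        self.preassigned = {}              # probe id -> (load level, fraction)
        self.stats = {}                    # load level -> (probes, accepts)
        self.next_probe_id = 0

    def on_probe(self):
        """Answer a probe with load plus a pre-assigned fraction."""
        level = int(self.load)
        probes, accepts = self.stats.get(level, (1, 1))
        fraction = accepts / probes
        self.stats[level] = (probes + 1, accepts)
        pid = self.next_probe_id
        self.next_probe_id += 1
        self.preassigned[pid] = (level, fraction)
        self.load += fraction
        return pid, self.load

    def on_io_arrived(self, pid):
        """A probe turned into a real write: count the acceptance and make
        the pre-assigned fraction a whole pending request."""
        level, fraction = self.preassigned.pop(pid)
        probes, accepts = self.stats[level]
        self.stats[level] = (probes, accepts + 1)
        self.load += 1.0 - fraction

    def on_io_skipped(self, pid):
        """No write followed the probe: drop the pre-assigned fraction."""
        _, fraction = self.preassigned.pop(pid)
        self.load -= fraction

if __name__ == "__main__":
    osd = PreassignOSD()
    pid, reported = osd.on_probe()
    osd.on_io_arrived(pid)                 # the scheduler picked this OSD
    print("pending load:", osd.load)
```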

37 With CP + Preassign With preassignment, schedulers no longer receive the same load information: with more probes, the expected load reported by a server increases, and if the pre-assigned load is large, the probability of that server being selected becomes low. Even with more stragglers, we are still able to keep the performance stable. [Figure: response time vs. straggler ratio; better performance by avoiding stragglers and stable performance without creating new stragglers, comparing the Fixed Scheduler, Random Selection, Two-Choice Random, Collaborative Probe, and CP + Preassign]

38-40 Implementation Issues To implement a dynamic scheduler in an object storage system, we must be able to redirect I/O requests to arbitrary storage servers, and to remember these redirections (which create small fragments) so the data can be read back in the future. Core idea: do not update the metadata servers, update the storage servers; do not invalidate the client-side cache, move the redirected data back. Components: a redirect table and metadata migration. [Figure: writes redirected between OSDs]

41 Read/Write under this solution Write(obj, offset, len): randomly probe several storage servers and get their real-time loads; compare and select the one that should serve this I/O request; put the real data on that selected server; put the redirection information on the original server. Read(obj, offset, len): find the original server based on the client-side metadata cache; send the read request to the original server, which may be a false hit; get the redirected server location and send the read request there again; return the value from the redirected server.
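A minimal end-to-end sketch of these write and read paths, using in-memory dictionaries in place of real OSDs and RPCs; the hash-based home placement and all names are illustrative assumptions:

```python
# Writes probe d random OSDs, land on the least loaded one, and leave a
# redirect entry on the object's home OSD; reads go home first and follow
# the redirect on a false hit.
import random

class OSD:
    def __init__(self):
        self.load = 0
        self.objects = {}       # (obj, offset) -> data
        self.redirects = {}     # (obj, offset) -> index of the redirected OSD

class Cluster:
    def __init__(self, n):
        self.osds = [OSD() for _ in range(n)]

    def home(self, obj):
        return hash(obj) % len(self.osds)     # fixed layout clients know

    def write(self, obj, offset, data, d=2):
        probed = random.sample(range(len(self.osds)), d)
        target = min(probed, key=lambda i: self.osds[i].load)
        self.osds[target].objects[(obj, offset)] = data
        self.osds[target].load += 1
        origin = self.home(obj)
        if origin != target:
            self.osds[origin].redirects[(obj, offset)] = target
        return target

    def read(self, obj, offset):
        origin = self.osds[self.home(obj)]
        if (obj, offset) in origin.objects:
            return origin.objects[(obj, offset)]
        target = origin.redirects[(obj, offset)]          # false hit
        return self.osds[target].objects[(obj, offset)]

if __name__ == "__main__":
    random.seed(3)
    cluster = Cluster(8)
    cluster.write("obj-42", 0, b"hello")
    print(cluster.read("obj-42", 0))
```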

42-44 Performance Considerations Redirection is not free; it introduces extra latency: during writes, probing and updating the redirect table; during reads, false hits and querying the redirect table. Therefore: move data back quickly to reduce cache misses, and use an efficient redirect table that supports fast updating and querying.

45 Efficient Redirect Table Supported operations: create/query/delete. The redirect table is a time-ordered array storing redirection ranges and their target servers: create appends a new entry at the end of the table; query walks back through the table entry by entry to find the visited range; delete also appends new entries. Tips: maintain an in-memory structure for fast access; append new entries with timestamps; remove overlapped existing entries while appending new ones by checking their ranges.
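A small sketch of such a time-ordered redirect table; the byte-offset ranges, the full-overlap removal rule, and the use of None to express a delete are assumptions made for illustration:

```python
# Create/delete append timestamped entries; query scans from the newest
# entry backwards; appending removes older entries it fully covers.
import time

class RedirectTable:
    def __init__(self):
        self.entries = []                 # list of (timestamp, start, end, target)

    def create(self, start, end, target):
        """Append a new redirection; drop older entries it fully covers."""
        self.entries = [e for e in self.entries
                        if not (start <= e[1] and e[2] <= end)]
        self.entries.append((time.time(), start, end, target))

    def delete(self, start, end):
        """A delete is also an append: a redirection to no server (None)."""
        self.create(start, end, None)

    def query(self, offset):
        """Walk back from the newest entry to find where this offset lives."""
        for ts, start, end, target in reversed(self.entries):
            if start <= offset < end:
                return target
        return None                       # not redirected: data is local

if __name__ == "__main__":
    table = RedirectTable()
    table.create(0, 1 << 20, target=7)
    table.create(1 << 20, 2 << 20, target=3)
    print(table.query(512 * 1024))        # -> 7
```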

46-47 Move Data Back: Metadata Migration Metadata migration threads run in the background on the storage servers, with the lowest priority, and pause if there are I/O requests. We have plenty of opportunities to do this (Carns, Philip, et al., Understanding and Improving Computational Science Storage Access through Continuous Characterization, TOS, 2011). Consistency is easier to manage because each storage server manages its own redirect table; timestamps are used to direct the consistency management.
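A sketch of such a background migration worker; the idle check, the pop_oldest_redirect helper, and the move_back callback are hypothetical stand-ins for the server's real internals:

```python
# Runs only when the server is idle, moves redirected data back to its home
# OSD, and clears the corresponding redirect entry via the injected helpers.
import threading
import time

class MigrationWorker(threading.Thread):
    def __init__(self, osd, check_idle, move_back, interval=1.0):
        super().__init__(daemon=True)
        self.osd = osd                    # OSD whose redirect table we drain
        self.check_idle = check_idle      # callable: True when no pending I/O
        self.move_back = move_back        # callable(entry): copy data home
        self.interval = interval

    def run(self):
        while True:
            if not self.check_idle():     # lowest priority: yield to real I/O
                time.sleep(self.interval)
                continue
            entry = self.osd.pop_oldest_redirect()   # assumed helper
            if entry is None:
                time.sleep(self.interval)
                continue
            self.move_back(entry)         # afterwards, reads hit the home OSD
```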

48 Implementation and Evaluation Prototype based on Triton, an object-based storage system. All evaluations were conducted on the Fusion cluster at ANL: 320 compute nodes, each with a 2.53 GHz Xeon CPU, 36 GB of memory, and a 250 GB local hard disk. Strategy: evaluate the critical components separately (probing performance, redirect performance), then evaluate the overall performance on real-world workloads (whole-cluster load balancing, single-application finish time).

49 Evaluate with Workloads We synthetically create a group of I/O workloads to mimic real-world applications' behaviors, using 128 storage servers and 64 compute nodes (512 cores). The Fixed Scheduler is the round-robin scheduler (with a random start index); the Dynamic Scheduler is the two-choice randomized scheduler. Result: a better-balanced load on each storage server.

50-53 Evaluate with an Application We use the previous workloads running in the background to generate unbalanced load, then schedule one new application to run. Results: the fixed scheduler schedules more processes to finish in the first several seconds, but only 60% of the I/Os finish quickly and 100% finish at 200+ s; the native two-choice scheduler suffers from creating new stragglers, with 90% of the I/Os finishing quickly and 100% only at around 200 s; the proposed two-choice scheduler finishes faster, with 100% of the I/Os finishing quickly, since it avoids the stragglers. [Figure: fraction of I/Os finished over time for the three schedulers]

54 Conclusion We extend the native two-choice randomized algorithm (with the collaborative probe and preassign strategies) into an I/O scheduler for object storage systems. We implement the new components (redirect table, metadata migration) required for such a dynamic I/O scheduler in parallel file systems. Evaluations confirm better response times for applications and also a better load balance across storage servers.

55 Thanks & Questions
