NEXPReS WP8 Progress Meeting
1 NEXPReS WP8 Progress Meeting: Preparations for Multi-Site Testing
Ari Mujunen, Aalto University Metsähovi Radio Observatory
VLBI2SKA Workshop in Aveiro, Portugal
Research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° RI-261525. This presentation reflects only the author's views. The European Union is not liable for any use that may be made of the information contained therein.
2 Partner Focus Areas
- AALTO: Coordination, basic technologies
- ASTRON: Long-term archival & reprocessing
- INAF: Global/local allocation/deallocation schemes
- JIVE: Augmenting correlation capabilities w/ buffering
- OSO: Trial-site performance & applicability testing
- PSNC: Role & trials of computing center buffering
- UMAN: Trial-site performance & applicability testing
3 WP8 Objective
Determine the best practical mix of solutions:
- What kind of storage
- Where located & packaged
- Connected in which ways
- How storage is allocated/deallocated and accessed
...which will serve the needs of evolving (>1 Gbps) VLBI data acquisition and processing.
4 WP8 Schedule
5 D8.1: Interface design document of storage element API
- Delivered in December 2010
- Main results:
  - One buffer = one computer, many disks
  - Control API through SSH
  - Simple SSH-protected CLI in/out, potentially opening up UDP/TCP ports on public 10GE
  - Simple (UDP-based/Tsunami-like) transfer protocol
  - Separation of data and metadata
    - Though no software cleanup of native data formats of data sources (Mk4 timecode, VDIF headers, UDP packet numbers, ...)
  - Data chunks in memory and on individual disks
6 D8.2: Hardware design document for simultaneous I/O storage elements
- For the highest capacity and bandwidth at the lowest cost, 3.5" hard disks are still (for a while) the best match
- To fill one 10GE port we need disks
- We presented two alternative designs
7 Current Issues
- MS801: Hardware decision for subsequent prototyping and integration tests
  - D8.2 alternative solution with server motherboard and CPUs selected
- D8.3: Design document of storage element allocation methods
  - Due ~now...
  - Should describe how to manage buffer space across stations
  - Could describe how SSH keys (or WP5 sshbroker?) are used to allow cross-access
8 Multi-Station Tests are Scheduled for Summer 2011
- Striving to deliver 21-Sep-2011 as scheduled
  - Although it is a milestone, not a deliverable...
- Mh, On, Jb → JIVE
- D8.4: Design document of transparent local/remote application programming interface for storage elements
- MS802: Review of prototype test results, decision of the architecture and scope of subsequent integration tests
9 Platform Selection
10 Platform Selection in D8.2
Two alternatives:
- Very basic COTS chassis with customization
- COTS chassis (more expensive)
11 Lowest-Cost Solution by Customized Chassis
- Suggested a basic 4U rack enclosure: Chieftec UNC-410F-B
12 Customized Chassis Was a Great Idea, but...
- Wiring needs more work than predicted
13 Adding ~+30% to Cost
- 2× Intel Xeon QP
- More disks (36)
- 2× 1400 W power supplies; uses 420 W
- Neat, w/ 3-yr warranty
14 Final Recommendation as MS801
- We ended up using the D8.2 alternative solution, and we also chose a (potentially?) more powerful motherboard/CPU from the same vendor
- This fulfills the recommendations in D8.2 of a COTS implementation of an 8 Gbps capable 36-disk storage unit in a fairly standard 4U rack enclosure
  - 72 TB, or 20 hours at 8 Gbps
- Add a heavy-duty 2×4-core motherboard with 10GE and disk controllers
- With room for expansion with external disk controllers, cf. the recent Mark6 proposal
15 Parts List
16 36 Disks + 7 (Low-Profile) PCIe x8 Slots
- Still only 4U
17 D8.2 Recommendation OK, Too
- Though now recommending the SC disk case...
- Asus Crosshair IV Formula (+846TQ): 3 PCIe x8, 16+6=22 disks, 10GE, no fans
- Extreme (+846A): 4 PCIe x8, 24(+6) disks, 10GE, fans, Hydralogix
18 Pictures from Mh
19 Linux Environment, Software Algorithms
20 Linux Environment
- 64-bit Ubuntu/Debian server booted from the network recommended
  - 64-bit: 16 GB (and up; >4 GB) needed for latency buffers
  - Ubuntu/Debian: familiarity within our community
  - Server: avoiding extra services (X, net) in storage nodes
  - Netboot: central management of (potentially) multiple storage nodes; maximizes the number of data disks
- Netboot via the motherboard 1GE interfaces
  - Local .vlbi VLAN for that, while 10GE is connected to Internet/LP
21 Netboot Server
- TFTP+NFS small server
22 Lessons Learned from EXPReS 6 Gbps / 20-disk RAID0
- Total throughput bottlenecked in a single kernel thread flushing all data from the block buffer cache to the single raid0
- Tried avoiding the buffer cache with O_DIRECT; got abysmal performance in EXPReS trial tests
  - Did not understand the importance of O_DIRECT writing large blocks in a single write() (64 MB or more)
- Substituted two raid0s; still all CPU being consumed in copying data from user programs to the buffer cache
- Giving each disk spindle its own process/thread doing O_DIRECT write()s with a large enough block size seems fine
  - With 22 disks attained >17 Gbps using only 20% of the CPUs
23 RAID Arrays?
- For high bandwidth and high capacity at the same time, one cannot really use other RAID levels than raid0 (striping)
  - raid5/raid6 slow down in streaming writing, especially in degraded mode (with failed disks)
    - False illusion of a guard against disk failures
  - Cannot afford losing half the disk space with raid10
- But raid0 is vulnerable to single-disk failures
- Why use any RAID at all for astronomy data? Why use any form of redundancy for volatile, non-archival data?
24 Writing to a Set of Disks
- Divide the incoming data stream into consecutive largish (50-500 MB) chunks of memory buffer
- Write chunks into files on a set of stand-alone disks
  - Local or even remote disks
- Disk handler processes/threads
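The chunk-dispatch scheme above can be sketched as one handler thread per disk, each consuming chunks from its own queue. This is an illustrative sketch, not WP8 code: all names are invented, the chunks are tiny, and a production writer would use O_DIRECT with page-aligned buffers as discussed on the EXPReS lessons slide, whereas plain buffered I/O keeps the sketch portable.

```python
import os
import queue
import threading

def disk_handler(mount: str, q: "queue.Queue") -> None:
    """Consume (sequence, chunk) pairs and write each chunk as its own file.

    Each disk spindle gets its own thread and queue, so no single kernel
    thread becomes the bottleneck for all disks."""
    while True:
        item = q.get()
        if item is None:          # shutdown sentinel
            break
        seq, chunk = item
        # Zero-padded sequence numbers so 'ls' sorts in time order
        path = os.path.join(mount, f"chunk-{seq:08d}.m5a")
        with open(path, "wb") as f:
            f.write(chunk)

def record_stream(stream, mounts) -> None:
    """Split `stream` (an iterable of byte chunks) across one handler per disk."""
    queues = [queue.Queue(maxsize=4) for _ in mounts]
    workers = [threading.Thread(target=disk_handler, args=(m, q))
               for m, q in zip(mounts, queues)]
    for w in workers:
        w.start()
    for seq, chunk in enumerate(stream):
        queues[seq % len(queues)].put((seq, chunk))   # simple round-robin
    for q in queues:
        q.put(None)
    for w in workers:
        w.join()
```

Bounded queues give natural back-pressure: if one disk stalls, only its queue fills, which is where the hand-over logic of the later error-recovery slide would hook in.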
25 Naming Files on Disks
- Prefer naming conventions which ls-list alphabetically in time/sequence order
  - Determining the file name list in correct order becomes as easy as 'ls /*/<experiment>/*'...
  - Detecting missing files or holes
- For example: /d001/n11k2mh/ m5a, /d001/n11k2mh/ m5a
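A minimal sketch of such a convention, with an invented name pattern (zero-padded sequence numbers make lexicographic order equal time order, and hole detection falls out of the numbering):

```python
def chunk_name(experiment: str, seq: int) -> str:
    # Zero padding => 'ls' alphabetical order == sequence/time order
    return f"{experiment}_{seq:06d}.m5a"

def missing_chunks(names):
    """Given the file names of one experiment, report sequence-number holes."""
    seqs = sorted(int(n.rsplit("_", 1)[1].split(".")[0]) for n in names)
    present = set(seqs)
    return [s for s in range(seqs[0], seqs[-1] + 1) if s not in present]
```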
26 Managing Local Disk Space
- Initially thought of writing chunks in round-robin fashion
- The current idea is to select, among idle disks, the one with the largest amount of free space
  - Round-robin (or longest idle time?) for disks with the same (largest) amount of free space
- Balances disk consumption automatically; allows different disk sizes & usage levels
  - Also allows subsets of disks
- Tracking the number of idle disks & idle time gives an idea of how much headroom is still left
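The selection rule above fits in a few lines. The dict fields below are illustrative, not the WP8 API; the tie-break shown is the "longest idle time" variant the slide floats as an option:

```python
def pick_disk(disks):
    """Select the idle disk with the most free space.

    `disks`: list of dicts like {"name": ..., "free": bytes,
    "idle_since": seconds idle, "idle": bool}.
    Ties on free space go to the longest-idle disk.
    Returns None when no disk is idle."""
    idle = [d for d in disks if d.get("idle", True)]
    if not idle:
        return None
    return max(idle, key=lambda d: (d["free"], d["idle_since"]))
```

Because only free space and idleness matter, disks of different sizes and usage levels are balanced automatically, exactly as the slide claims.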
27 Minimizing CPU Usage & Memory Bandwidth
- At 10 Gbps rates, avoiding unnecessary memory copies becomes essential to avoid exhausting CPU & memory
  - Current memory bandwidth is only 30-60 Gbps in total
- The standard Linux disk r/w model makes one extra memcpy() between user mode and the kernel buffer cache
  - Must use O_DIRECT with large enough transfer sizes to allow disk drives to queue large enough DMA requests / NCQ queues
- By default the network stack makes another memcpy() when adding/removing IP packet headers
  - Can be circumvented with splice() or a raw packet ring buffer
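A back-of-the-envelope budget shows why the copies matter. The model below is an assumption, not a measurement: DMA-in and DMA-out each touch memory once, and every memcpy() reads and writes the payload, adding twice the stream rate:

```python
def memory_traffic_gbps(stream_gbps: float, extra_copies: int) -> float:
    """Total memory traffic for a pass-through stream.

    1x for the NIC DMA writing data into memory, 1x for the disk DMA
    reading it back out, plus 2x (read + write) per extra memcpy()."""
    return stream_gbps * (2 + 2 * extra_copies)

# A 10 Gbps stream with two extra copies (buffer cache + net stack)
# already needs 60 Gbps of memory traffic -- the whole 30-60 Gbps budget.
```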
28 Networking CPU Overhead
- Planning to tolerate one memcpy() with the simplest UDP packet processing
  - Though avoiding IP software checksumming & fragmentation
  - Retaining user headers (like packet #)
- Assuming UDP "mostly in order"; attempting UDP receive at the assumed packet location and copying out-of-order packets back
- If zero-copy is needed: PacketShader raw ring buffer driver
  - Though it dictates using the latest Intel cards...
  - mmap()ed PCAP raw ring buffer turned out to be one memcpy() plus regular IP stack processing
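The "mostly in order" trick can be simulated as follows. This is a sketch with invented names: each datagram is placed straight into the slot where the next packet is expected (the common, copy-free path), and only a packet whose sequence number disagrees pays one extra copy to its true slot. Stale optimistic writes land in slots that are overwritten when their true packets arrive:

```python
def receive(packets, nslots: int, psize: int):
    """Simulate optimistic in-order UDP placement into a slot buffer.

    `packets`: iterable of (sequence_number, payload) with len(payload)
    == psize. Returns (buffer bytes, number of extra copies made)."""
    buf = bytearray(nslots * psize)
    expected, extra = 0, 0
    for seq, payload in packets:
        slot = expected % nslots
        buf[slot * psize:(slot + 1) * psize] = payload   # zero-extra-copy path
        if seq != expected:                              # out of order:
            real = seq % nslots                          # one memcpy back
            buf[real * psize:(real + 1) * psize] = payload
            extra += 1
        expected = seq + 1
    return bytes(buf), extra
```

With genuinely in-order traffic `extra` stays at zero, which is the whole point of the assumption.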
29 10GE Cards
- Now recommending the latest Intel cards
  - PacketShader raw ring buffer driver compatible
30 Reading from a Set of Disks
- Initiate preloading of memory buffers from a set of disks
- Once all disks in a set have had a chance to preload, consuming the streaming data can commence at full rate
31 Simultaneous Read and Write
- Two problems, which can be solved with interleaving:
  - HDD seek time slows down using more than one spot of the disk
  - Even without seeks, double the data streaming bandwidth is required throughout the internal data paths
- Duration of one sequential r/w on the order of ms, >>10 ms, one disk rotation
32 Simultaneous Read and Write
- Avoiding hard disk 8-12 ms seek times
  - Must stream in either the read or the write direction for much longer than the seek time (or full disk rotation time, approx. ms)
  - E.g. read streaming for about 120 ms gives the data of 10 rotations, then wastes the time of only one rotation: 90% efficiency
- So nothing really fancy is required, but:
  - Usually simultaneous read and write will halve the total bandwidth
  - If the bottleneck in one subsystem can be removed (e.g. hard disks → SSDs), another bottleneck will surface in another subsystem
  - Memory and/or disk controller (PCIe) bandwidth limits are the most probable bottleneck candidates
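The 90% figure is a simple duty-cycle ratio: each burst of streaming pays one switch penalty (a seek or a full rotation). The 12 ms penalty below is inferred from the slide's 120 ms / 10 rotations example, not stated elsewhere:

```python
def interleave_efficiency(stream_ms: float, penalty_ms: float) -> float:
    """Fraction of time spent streaming when each read/write burst of
    `stream_ms` pays one switch penalty of `penalty_ms` (seek or rotation)."""
    return stream_ms / (stream_ms + penalty_ms)

# 120 ms bursts, one 12 ms rotation lost per direction switch:
# 120 / 132, i.e. about the 90% quoted on the slide.
```

The same formula shows why bursts must be much longer than the seek time: with 12 ms bursts the efficiency would drop to 50%.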
33 Error Recovery
- If one disk handler does not complete in the allotted time, it can hand over the memory block to another handler
  - A spare disk
  - Or even to another disk in the current set, provided there is enough slack time in the set, so it can eventually catch up
- Disk handler processes/threads
- Testing/recovery of failed disk
- Spare
34 Read Error Recovery
- Failure to read a chunk file from one disk causes the loss of a well-defined (0.1-1 sec) time interval in the data stream
- In the worst case (a total disk failure) the loss repeats at n*(0.1-1 s) intervals
- Can adjust the read retry strategy according to whether the reader/consumer can wait or needs quasi-realtime data
35 Turning Recordings into Archives
- If, some time after a recording is made, it is found that:
  - it needs to be retained/archived for a longer period of time,
  - and there is spare disk capacity available,
  - and there is some idle time for the data disk set available,
- then redundancy (aka raid5/raid6) can be calculated afterwards (file-based) and the ECC information stored onto the spare disks
- If a disk of the disk set later fails, it can be replaced and its content files restored by recovering from the ECC files
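The simplest file-based flavor of this after-the-fact redundancy is single parity (raid5-style): XOR one chunk file from each disk into a parity file on a spare disk. The function names are illustrative, and this sketch tolerates only one lost disk per parity group, as raid5 does:

```python
def parity(chunks) -> bytes:
    """XOR of equal-sized chunk files -> one parity (ECC) file for a spare disk."""
    assert len({len(c) for c in chunks}) == 1, "chunk files must be equal size"
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)

def recover(surviving, parity_file: bytes) -> bytes:
    """Rebuild the single lost chunk: XOR of all survivors and the parity file."""
    return parity(list(surviving) + [parity_file])
```

Since parity is computed file by file during idle time, the recording stays raid0-fast while it is volatile and only pays the redundancy cost once it is promoted to an archive.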
36 Multi-Site Test Setup
37 Test Cases
- We divide the test into two cases:
  - Receiving data to the local buffer
  - Transferring data from the local buffer to a remote buffer/processing
- Later, if this works, we can test doing these two simultaneously == simultaneous buffering
  - Possibly even multiple concurrent reads?
    - To support multiple (slower) reads by several SFX instances
- Divide main memory roughly in half for write / read
  - Memory chunk == file size, half of memory divided by # of disks
  - Divide read memory chunks among concurrent readers, a minimum of 2-3 chunks per reader
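The chunk-size rule above is just arithmetic; plugging in the 16 GB RAM and 36 disks mentioned on earlier slides (an illustration, not a stated configuration of the test) gives a chunk size comfortably inside the 50-500 MB range quoted for chunk files:

```python
def chunk_bytes(ram_bytes: int, n_disks: int) -> int:
    """Memory chunk == file size: half of main memory (the write half),
    divided evenly among the disks."""
    return (ram_bytes // 2) // n_disks

GIB = 1024 ** 3
# 16 GiB of RAM across 36 disks -> roughly 227 MiB per chunk.
```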
38 Boundary Conditions
- Although we talked about "transparent access" in the DoW:
  - Transparent remote writing over the network makes very little sense
    - As it is effectively real-time e-VLBI, defeating the very purpose of buffering, namely independence of network performance
  - Transparent remote reading looks fine, esp. with software correlators
- Thus most of the buffer disk space will be located at stations, close to the data sources
  - Fits the practical economic model, too: nobody is going to buy the disk space needed by a given station except the station itself...
- Transparent "teleporting" of data from one buffer to another rare (contingency?)
39 More Boundary Conditions
Our available data sources:
- Test 10GE Linux PC: dummy data up to 9 Gbps
  - Could devise something to generate artificial noise data for testing?
- iBOBs: 8 Gbps UDP packet stream (1 Gsps / 8-bit)
  - Not readily correlatable (nor super-interesting: too many bits/sample)
  - Though the only real data source readily available
- Jivemk5 software: numbered UDP packets up to 512/896 Mbps via the std Mk5 1GE
  - Or up to 1024 Mbps via an additional Mk5 10GE card...
- Unlikely that JIVE can have lots of disks in a buffer machine already during summer
  - But could use JIVE Mk5s to access station buffers (Mh, On, Jb) and feed the data to either SFX or the Mk4 correlator
40 Summer Test Scenario
- Due to boundary conditions, combining high data rate and buffered correlation seems unlikely this summer
- Thus recording a 22 GHz UDP packet stream from Jivemk5 locally and correlating that with SFX (with UDT or even TCP remote access) seems like the only viable real-correlation option
- Could spice that up with recording more scans while SFX is reading the previous scan(s)...
- Even better if there could be multiple SFX instances running concurrently, from different locations, processing several past scans
- Any way to show high-rate performance? >1 Gbps remote reads Mh<->Jb<->On?
41 What Do I Have to Do Next?
- Test partners need a similar test system
  - 36, 24, or 22 disks + 10GE
- Install Linux on the test system
  - Preferably with a small (leftover PC / HP Microserver) TFTP+NFS netboot server
- Check a suitable schedule for 10GE test traffic with your operator (or WP6?)
- Agree on a test schedule with partners
42 Tasks 1
- AALTO: Create UDP stream writer + UDT(/TCP) reader software plus Debian-based installation for OSO, UMAN
- JIVE: Remote UDT(/TCP) reader component for SFX
- OSO, UMAN: Acquire test systems (36/24/22/6? disks)
43 Tasks 2
- PSNC: Role & trials of computing center buffering
  - Could we run SFX from there, accessing station buffers over the network?
  - Could that be combined with a contingency transfer of some observation from Mh/On/Jb to PSNC buffer disks?
- INAF: Global/local allocation/deallocation schemes
  - The deliverable D8.3
- ASTRON: Long-term archival & reprocessing?
  - Start scheduled for
44 The Inconvenient Truths :-)
- About networked data streaming: A given station cannot really sustain a recording bandwidth larger than its network connectivity, unless given an unlimited disk buffer or long enough breaks between recordings.
- For the VLBI case: A single slow (or high-latency, like shipping) connection in a given e-VLBI network will force the others (or some buffering party) to buffer most of the VLBI data of the whole network, if not all.
More informationMultiprocessor Systems. COMP s1
Multiprocessor Systems 1 Multiprocessor System We will look at shared-memory multiprocessors More than one processor sharing the same memory A single CPU can only go so fast Use more than one CPU to improve
More informationDefinition of RAID Levels
RAID The basic idea of RAID (Redundant Array of Independent Disks) is to combine multiple inexpensive disk drives into an array of disk drives to obtain performance, capacity and reliability that exceeds
More informationDatabase Systems. November 2, 2011 Lecture #7. topobo (mit)
Database Systems November 2, 2011 Lecture #7 1 topobo (mit) 1 Announcement Assignment #2 due today Assignment #3 out today & due on 11/16. Midterm exam in class next week. Cover Chapters 1, 2,
More informationDistributed File Systems II
Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation
More informationu Covered: l Management of CPU & concurrency l Management of main memory & virtual memory u Currently --- Management of I/O devices
Where Are We? COS 318: Operating Systems Storage Devices Jaswinder Pal Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) u Covered: l Management of CPU
More informationVX3000-E Unified Network Storage
Datasheet VX3000-E Unified Network Storage Overview VX3000-E storage, with high performance, high reliability, high available, high density, high scalability and high usability, is a new-generation unified
More informationGFS: The Google File System
GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one
More informationHigh bandwidth, Long distance. Where is my throughput? Robin Tasker CCLRC, Daresbury Laboratory, UK
High bandwidth, Long distance. Where is my throughput? Robin Tasker CCLRC, Daresbury Laboratory, UK [r.tasker@dl.ac.uk] DataTAG is a project sponsored by the European Commission - EU Grant IST-2001-32459
More informationSTORAGE SYSTEMS. Operating Systems 2015 Spring by Euiseong Seo
STORAGE SYSTEMS Operating Systems 2015 Spring by Euiseong Seo Today s Topics HDDs (Hard Disk Drives) Disk scheduling policies Linux I/O schedulers Secondary Storage Anything that is outside of primary
More informationJMR ELECTRONICS INC. WHITE PAPER
THE NEED FOR SPEED: USING PCI EXPRESS ATTACHED STORAGE FOREWORD The highest performance, expandable, directly attached storage can be achieved at low cost by moving the server or work station s PCI bus
More informationVess A2000 Series. NVR Storage Appliance. Milestone Surveillance Solution. Version PROMISE Technology, Inc. All Rights Reserved.
Vess A2000 Series NVR Storage Appliance Milestone Surveillance Solution Version 1.0 2014 PROMISE Technology, Inc. All Rights Reserved. Contents Introduction 1 Overview 1 Purpose 2 Scope 2 Audience 2 Components
More informationCh 11: Storage and File Structure
Ch 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files Data-Dictionary Dictionary Storage
More informationStorage and File Structure. Classification of Physical Storage Media. Physical Storage Media. Physical Storage Media
Storage and File Structure Classification of Physical Storage Media Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files
More informationCS3600 SYSTEMS AND NETWORKS
CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 9: Mass Storage Structure Prof. Alan Mislove (amislove@ccs.neu.edu) Moving-head Disk Mechanism 2 Overview of Mass Storage Structure Magnetic
More informationImplementing a Statically Adaptive Software RAID System
Implementing a Statically Adaptive Software RAID System Matt McCormick mattmcc@cs.wisc.edu Master s Project Report Computer Sciences Department University of Wisconsin Madison Abstract Current RAID systems
More informationCPS104 Computer Organization and Programming Lecture 18: Input-Output. Outline of Today s Lecture. The Big Picture: Where are We Now?
CPS104 Computer Organization and Programming Lecture 18: Input-Output Robert Wagner cps 104.1 RW Fall 2000 Outline of Today s Lecture The system Magnetic Disk Tape es DMA cps 104.2 RW Fall 2000 The Big
More informationFlavors of Memory supported by Linux, their use and benefit. Christoph Lameter, Ph.D,
Flavors of Memory supported by Linux, their use and benefit Christoph Lameter, Ph.D, Twitter: @qant Flavors Of Memory The term computer memory is a simple term but there are numerous nuances
More informationUsing Synology SSD Technology to Enhance System Performance Synology Inc.
Using Synology SSD Technology to Enhance System Performance Synology Inc. Synology_WP_ 20121112 Table of Contents Chapter 1: Enterprise Challenges and SSD Cache as Solution Enterprise Challenges... 3 SSD
More informationRed Hat Gluster Storage performance. Manoj Pillai and Ben England Performance Engineering June 25, 2015
Red Hat Gluster Storage performance Manoj Pillai and Ben England Performance Engineering June 25, 2015 RDMA Erasure Coding NFS-Ganesha New or improved features (in last year) Snapshots SSD support Erasure
More informationI/O Management Software. Chapter 5
I/O Management Software Chapter 5 1 Learning Outcomes An understanding of the structure of I/O related software, including interrupt handers. An appreciation of the issues surrounding long running interrupt
More informationTechnical Note P/N REV A01 March 29, 2007
EMC Symmetrix DMX-3 Best Practices Technical Note P/N 300-004-800 REV A01 March 29, 2007 This technical note contains information on these topics: Executive summary... 2 Introduction... 2 Tiered storage...
More informationNPTEL Course Jan K. Gopinath Indian Institute of Science
Storage Systems NPTEL Course Jan 2012 (Lecture 39) K. Gopinath Indian Institute of Science Google File System Non-Posix scalable distr file system for large distr dataintensive applications performance,
More informationCSE 124: Networked Services Fall 2009 Lecture-19
CSE 124: Networked Services Fall 2009 Lecture-19 Instructor: B. S. Manoj, Ph.D http://cseweb.ucsd.edu/classes/fa09/cse124 Some of these slides are adapted from various sources/individuals including but
More informationASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed
ASPERA HIGH-SPEED TRANSFER Moving the world s data at maximum speed ASPERA HIGH-SPEED FILE TRANSFER 80 GBIT/S OVER IP USING DPDK Performance, Code, and Architecture Charles Shiflett Developer of next-generation
More informationThe Lion of storage systems
The Lion of storage systems Rakuten. Inc, Yosuke Hara Mar 21, 2013 1 The Lion of storage systems http://www.leofs.org LeoFS v0.14.0 was released! 2 Table of Contents 1. Motivation 2. Overview & Inside
More informationASN Configuration Best Practices
ASN Configuration Best Practices Managed machine Generally used CPUs and RAM amounts are enough for the managed machine: CPU still allows us to read and write data faster than real IO subsystem allows.
More informationTake control of storage performance
Take control of storage performance Transition From Speed To Management SSD + RAID 2008-2011 Reduce time to market Inherent bottlenecks Re-architect for better performance NVMe, SCSI Express Reads & Writes
More informationUC Santa Barbara. Operating Systems. Christopher Kruegel Department of Computer Science UC Santa Barbara
Operating Systems Christopher Kruegel Department of Computer Science http://www.cs.ucsb.edu/~chris/ Input and Output Input/Output Devices The OS is responsible for managing I/O devices Issue requests Manage
More informationCSE 451: Operating Systems Winter Redundant Arrays of Inexpensive Disks (RAID) and OS structure. Gary Kimura
CSE 451: Operating Systems Winter 2013 Redundant Arrays of Inexpensive Disks (RAID) and OS structure Gary Kimura The challenge Disk transfer rates are improving, but much less fast than CPU performance
More information