COOPERATIVE CROSS-LAYER PROTECTION FOR RESOURCE CONSTRAINED MOBILE MULTIMEDIA SYSTEMS
1 COOPERATIVE CROSS-LAYER PROTECTION FOR RESOURCE CONSTRAINED MOBILE MULTIMEDIA SYSTEMS Prof. Nikil Dutt Prof. Nalini Venkatasubramanian Prof. Lichun Bao Nov. 26, 2008 Kyoungwoo Lee (final defense)
2 2 Contents Thesis Motivation Thesis Proposal Cooperative, Cross-layer Methods PPC (Partially Protected Caches) EAVE (Error-Aware Video Encoding) CC-PROTECT (Cooperative, Cross-layer Protection) Thesis Contribution and Future Direction
3 3 Mobile Multimedia Embedded Systems Resource-limited mobile devices! Applications: 3D Graphics, Image Browsing, Map Routing, Mobile TV, Animation, Web Browsing, Satellite TV, Video Streaming, Video Conferencing. Main problem is to achieve low power with high performance, high QoS, and high reliability.
4 4 Reliability Reliability is an emerging and critical concern in mobile devices New enhanced technology makes devices vulnerable to errors due to high complexity and high integration Exponential increase of soft error rate as technology scales [Baumann, 05] Mobile applications are running close to humans In pervasive computing, failures of healthcare mobile devices cause serious results Redundancy techniques incur high overheads of power and performance TMR (Triple Modular Redundancy) may exceed 200% overheads without optimization [Nieuwland, 06] Challenging to optimize multiple properties (e.g., performance, power, QoS, and reliability) in mobile embedded systems
5 5 Soft error is becoming an every second concern! Soft Error Rate (SER): FIT (Failures in Time) = number of errors in 10^9 hours. Table columns: system, SER (FIT), MTTF, reason.
- … µm: MTTF … years
- … µm: SER 64x8x…, MTTF … days. Reason: High Integration
- … nm: SER 2x1000x64x8x…, MTTF … hour. Reason: Technology scaling and Twice Integration
- A … 65 nm: SER 2x2x1000x64x8x…, MTTF … minutes. Reason: Memory takes up 50% of soft errors in a system
- A system with voltage … 65 nm: SER 100x2x2x1000x64x8x…, MTTF 18 seconds. Reason: Exponential relationship between SER & Supply Voltage
- A system with voltage … flight (35,… nm): SER …x100x2x2x1000x64x8x1000 FIT, MTTF 0.02 seconds. Reason: High Intensity of Neutron Flux at flight (high altitude)
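The FIT definition on this slide converts directly into an MTTF, which makes the cumulative multipliers easier to follow. The sketch below is only an illustration: the 1,000 FIT baseline is borrowed from the "0.18 µm 1,000 FIT per Mbit of SRAM" figure on the backup slide "Soft Errors on Increase", the pairing of each multiplier with a reason is an assumption read off the flattened table, and the flight-altitude row is left out because its multiplier did not survive. The names `mttf_hours` and `steps` are hypothetical.

```python
HOURS_PER_FIT_WINDOW = 1e9  # FIT counts failures per 10^9 hours


def mttf_hours(fit):
    """Mean time to failure, in hours, for a given FIT rate."""
    return HOURS_PER_FIT_WINDOW / fit


# Assumed baseline: 1,000 FIT (1 Mbit of SRAM at 0.18 um, from the backup slide).
base_fit = 1_000

# Cumulative multipliers as printed in the SER column; the labels are assumptions.
steps = [
    ("baseline", 1),
    ("high integration (64x8)", 64 * 8),
    ("technology scaling, twice integration (2x1000)", 2 * 1000),
    ("memory is 50% of system soft errors (2x)", 2),
    ("voltage scaling (100x)", 100),
]

fit = base_fit
mttf_by_step = {}
for label, factor in steps:
    fit *= factor
    mttf_by_step[label] = mttf_hours(fit)
    seconds = mttf_by_step[label] * 3600
    print(f"{label:48s} {fit:.3e} FIT  MTTF = {seconds:.4g} s")
```

Under these assumptions the last step lands at roughly 18 seconds, which agrees with the MTTF that survives in the table.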
6 6 Errors and Failures in Mobile Embedded Systems Faults or Errors can cause Failures [Figure: system layers (Application, Middleware/OS, Hardware, Network) annotated with error sources (Exception, Bug, Soft Error, Packet Loss)]
7 7 Errors and Error Control Schemes at Hardware. Failures: Soft Errors, Hard Failures, System Crash. Causes: External Radiations, Thermal Effects, Power Loss, Poor Design, Aging. Metrics: FIT, MTTF, MTBF. Traditional Approaches: Spatial Redundancy (TMR, Duplex, RAID-1, etc.) and Data Redundancy (EDC, ECC, RAID-5, etc.). Hardware failures are increasing as technology scales (e.g., SER increases by up to 1000 times [Mastipuram, 04]). Redundancy techniques are expensive (e.g., ECC-based protection in caches can incur 95% performance penalty [Li, 05]). FIT: Failures in Time (per 10^9 hours) MTTF: Mean Time To Failure MTBF: Mean Time Between Failures TMR: Triple Modular Redundancy EDC: Error Detection Codes ECC: Error Correction Codes RAID: Redundant Array of Inexpensive Disks
8 8 Errors and Error Control Schemes at Software. Failures: Wrong outputs, Infinite loops, Crash. Causes: Incomplete Specification, Poor software design, Bugs, Unhandled Exception. Metrics: Number of Bugs/Klines, QoS, MTTF, MTBF. Traditional Approaches: Spatial Redundancy (N-version Programming, etc.), Temporal Redundancy (Checkpoints and Backward Recovery, etc.). Software errors become dominant as a system's complexity increases (e.g., several bugs per kilo lines). Hard to debug, and redundancy techniques are expensive (e.g., backward recovery with checkpoints is inappropriate for real-time applications). QoS: Quality of Service
9 9 Errors and Error Control Schemes in Networks. Failures: Data Losses, Deadline Misses, Node (Link) Failure, System Down. Causes: Network Congestion, Noise/Interference, Malicious Attacks. Metrics: Packet Loss Rate, Deadline Miss Rate, SNR, MTTF, MTBF, MTTR. Traditional Approaches: Resource Reservation, Data Redundancy (CRC, etc.), Temporal Redundancy (Retransmission, etc.), Spatial Redundancy (Replicated Nodes, MIMO, etc.). Network is unreliable (especially wireless networks). Joint approaches across OSI layers have been investigated for minimal costs [Vuran, 06][Schaar, 07]. SNR: Signal to Noise Ratio MTTR: Mean Time To Recovery CRC: Cyclic Redundancy Check MIMO: Multiple-Input Multiple-Output
10 10 Conventional Approaches Most redundancy techniques incur overheads in terms of performance, power, area, etc. Conventional TMR (Triple Modular Redundancy) can incur 200% overheads without optimization. Backward Recovery with Checkpoints cannot guarantee the completion time of a task. Recently proposed techniques have focused on cost reduction without losing reliability. However, they still incur overheads.
11 11 Thesis Problem Statement Study tradeoffs among system properties (e.g.) Redundancy incurs energy overheads while DVS increases SER significantly Examine errors and error control schemes across system abstraction layers (e.g.) network errors & error-resilient video encoding, soft errors & ECC or EDC, etc. Maximize reliability with minimal costs of power and performance for mobile embedded systems
12 12 Cross-Layer Methods Cross-layer approaches aim at system-level optimization: integrate and coordinate techniques across system layers. Classification [Srivastava, 05]: (1) Top-down, Bottom-up, or Both directions. Top-down: PPC, PDVS [GRACE], etc. Bottom-up: EAVE, etc. Both directions: CC-PROTECT, etc. (2) Coupling or Merging layers: Dynamo [Mohapatra], xtune [Kim], etc.
13 13 Cross-Layer Approaches GRACE GRACE UIUC [W. Yuan Ph.D. thesis in 04 and A. F. Harris III, Ph.D. thesis in 06] QoS/Power tradeoffs Primarily OS adaptation for power management in multimedia mobile devices Network adaptation for power management in multimedia communications [GRACE, 05] Application Operating System Hardware
14 14 Cross-Layer Approaches DYNAMO & FORGE DYNAMO middleware for FORGE UCI [S. Mohapatra Ph.D. thesis in 05 and R. Cornea Ph.D. thesis in 07] QoS/Power tradeoffs for mobile embedded systems Middleware-driven coordination and proxy-based cooperation 1. Content transcoding at the application layer 2. Network traffic shaping at the network layer 3. Backlight (LCD display) setting at the hardware layer 4. NIC shutdown, CPU DVS/DFS at the hardware layer 1 Proxy Server (NW & MW) 2 Application Middleware/ OS 3 4 Hardware
15 15 Cross-Layer Approaches xtune xtune UCI and SRI [M. Kim Ph.D. thesis in 08] QoS/Power/Timeliness adaptation for distributed real-time embedded systems A Formal Methodology for cross-layer tuning and verifiable timeliness of Mobile Embedded Systems Application Middleware/ OS Proxy Server Handheld Server Hardware
16 16 Thesis Proposed Contribution Thesis proposes a cross-layer design methodology for mobile multimedia embedded systems with minimal costs Reliability/QoS/Power/Performance system optimization for mobile multimedia systems Cooperative, Cross-Layer Protection PPC, EAVE, & CC-PROTECT Low-cost reliability
17 17 Overview of Thesis Proposals [Figure: a mobile video application over error-prone networks. Labels: Original Video; Error-Aware Video Encoder (EAVE) containing an Error-Controller (e.g., frame drop) and an Error-Resilient Encoder (e.g., PBPAIR); Error-Aware Video; Frame Drop; Packet Loss; PPC with Unprotected Cache and Protected Cache (EDC, ECC); Error detection at Hardware; Monitor & Translate at MW/OS with QoS and SER; Application Correction] PPC (Partially Protected Caches) EAVE (Error-Aware Video Encoding) CC-PROTECT (Cooperative, Cross-layer Protection)
18 18 Contents Thesis Motivation Thesis Proposal Cooperative, Cross-layer Methods PPC (Partially Protected Caches) EAVE CC-PROTECT Thesis Contribution and Future Direction Application Middleware/ OS Hardware Network
19 19 Conventional Protection for Caches Conventional Protected Caches Unaware of fault tolerance at applications Implement a redundancy technique such as ECC to protect all data for every access Overkill for multimedia applications ECC (e.g., a Hamming Code) incurs high performance penalty by up to 95%, power overhead by up to 22%, and area cost by up to 25% Unaware of Application High Cost Cache ECC
20 20 PPC (Partially Protected Caches) Observation Not all data are equally failure critical Multimedia data vs. control variables Propose PPC architectures to provide an unequal protection for mobile multimedia systems [Lee, CASES06][Lee, TVLSI08] Unprotected cache and Protected cache at the same level of memory hierarchy Protected cache is typically smaller to keep power and delay the same as or less than those of Unprotected cache PPC Unprotected Cache Memory Protected Cache
21 21 PPC for Multimedia Applications Unprotected Cache Memory PPC Protected Cache Propose a selective data protection [Lee, CASES06] Unequal protection at hardware layer exploiting error-tolerance of multimedia data at application layer Simple data partitioning for multimedia applications Multimedia data is failure noncritical All other data is failure critical Power/Delay Reduction Fault Tolerance
22 22 PPC for General Applications DPExplore [Lee, PPCDIPES08] Explore the partitioning space by exploiting awareness of the vulnerability of each data page. Vulnerable time: data in the cache is vulnerable during any interval after which it is eventually read by the CPU or written back to Memory. Pages causing high vulnerable time are failure critical. Vulnerable time closely estimates failure rate. [Figure: timeline of data in the Unprotected Cache of a PPC with events Incoming, Read, Write, and Eviction to Memory at times t0, t1, t2, t3]
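The vulnerable-time idea above can be sketched in a few lines. This is a simplified reading of the slide, not the DPExplore implementation: an interval counts as vulnerable when it ends in a CPU read, or when it ends in an eviction of modified data that must be written back to memory; an interval that ends in an overwrite is treated as harmless. The function names, event names, and the threshold are hypothetical.

```python
def vulnerable_time(events):
    """Sum the vulnerable intervals of one cached datum.

    events: list of (time, kind) sorted by time, with kind in
    {"incoming", "read", "write", "evict"}.
    """
    total = 0
    dirty = False
    prev_time = None
    for time, kind in events:
        if prev_time is not None:
            if kind == "read":
                # An error during this interval would be consumed by the CPU.
                total += time - prev_time
            elif kind == "evict" and dirty:
                # An error during this interval would be written back to memory.
                total += time - prev_time
        if kind == "write":
            dirty = True
        elif kind == "incoming":
            dirty = False
        prev_time = time
    return total


def failure_critical_pages(page_vulnerable_time, threshold):
    """Pages whose accumulated vulnerable time exceeds a (hypothetical) threshold."""
    return {page for page, vt in page_vulnerable_time.items() if vt > threshold}


# Timeline like the slide's t0..t3: incoming, read, write, eviction.
timeline = [(0, "incoming"), (10, "read"), (25, "write"), (40, "evict")]
print(vulnerable_time(timeline))
```

For this timeline, the interval before the read and the interval between the write and the eviction are counted, while the interval that ends in the overwrite is not.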
23 23 Summary PPC All data are not equally failure critical Propose a PPC architecture to provide unequal protection Support an unequal protection at hardware layer by exploiting error-tolerance and vulnerability at application Present cost-efficient reliability Related Publications [Lee, CASES06] PPC for multimedia embedded systems [Lee, PPCDIPES08] PPC for general applications [Lee, TVLSI08] PPC and design space exploration Under submission [Lee, TODAES??] PPC for general applications and instruction caches Application Data & Code Error-tolerance of MM data Vulnerability of Data & Code Page Partitioning Algorithms Failure Non- Failure Critical Critical FNC & FC are mapped into Unprotected & Protected Caches Unprotected Cache PPC Protected Cache
24 24 Contents Thesis Motivation Thesis Proposal Cooperative, Cross-layer Methods PPC EAVE (Error-Aware Video Encoding) CC-PROTECT Thesis Contribution and Future Direction Application Middleware/ OS Network
25 25 Active Error Exploitation Intentional Frame Drop Intentional Frame Drop (one way to actively exploit errors) can result in energy reduction for each operation FDT-1 affects the following components with respect to power, performance, and QoS in mobile video applications Enc CPU FDT-1 Tx WNI FDT-2 Error-prone Networks Mobile Video Application Packet Loss Rx WNI Dec CPU FDT-3 FDT: Frame Drop Type Enc: Encoding, Dec: Decoding WNI: Wireless Network Interface
26 26 Error-Aware Video Encoding EIR: Error Injection Rate Propose EE-PBPAIR [Lee, DIPES08] Intentionally drop frames at video encoding Reduce the energy consumption for video encoding Maintain the video quality by exploiting error-resilience of PBPAIR Original Video Intentional frame drop Error-prone Networks Packet Loss Error-Aware Video Encoder (EAVE) Error-Controller (e.g., frame dropping) Error-Resilient Encoder (e.g., PBPAIR) Error- Resilient Aware Video
27 27 Summary EAVE Intentional Frame Drop is one way to exploit errors actively Propose an error-aware video encoding (EE-PBPAIR) Present a knob (EIR) to adjust the amount of errors considering the QoS feedback Maintain the video quality using error-resilience of PBPAIR Related Publication [Lee, DIPES08] EE-PBPAIR Considering Submission [Lee, TECS??] Generalized idea for error-resilient video encodings [Figure: the Error Controller (Middleware) sets EIR; the Error Resilient Video Encoder (Application) uses Error Rate = PLR + EIR; Energy Reduction at CPU, Memory, and WNIC (Hardware); Error-Aware Video Data goes to the Network or Decoding Side, which returns PLR & QoS] EIR: Error Injection Rate PLR: Packet Loss Rate
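A minimal sketch of the EIR knob, assuming rates are given as integer percentages and that QoS feedback arrives as a PSNR value. None of these function names, the step size, or the cap come from the thesis; they are placeholders that only illustrate "Error Rate = PLR + EIR" and a feedback-driven adjustment of EIR.

```python
def encoder_error_rate(plr_percent, eir_percent):
    """Error rate handed to the error-resilient encoder: PLR + EIR."""
    return plr_percent + eir_percent


def drop_pattern(num_frames, eir_percent):
    """Indices of frames to drop intentionally so that eir_percent of them are skipped.

    Integer credit counting avoids floating-point drift in the drop rate.
    """
    dropped = []
    credit = 0
    for index in range(num_frames):
        credit += eir_percent
        if credit >= 100:
            dropped.append(index)
            credit -= 100
    return dropped


def adapt_eir(eir_percent, psnr_db, target_psnr_db, step=1, max_eir=50):
    """Feedback rule (assumed): lower EIR when quality is below target, else raise it."""
    if psnr_db < target_psnr_db:
        return max(0, eir_percent - step)
    return min(max_eir, eir_percent + step)


print(drop_pattern(30, 10))        # frames skipped before encoding
print(encoder_error_rate(10, 10))  # resilience level the encoder is asked for
```

Dropping frames before encoding is what saves encoding and transmission energy; the encoder is told about the extra losses so that its error resilience covers them.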
28 28 Contents Thesis Motivation Thesis Proposal Cooperative, Cross-layer Methods PPC EAVE CC-PROTECT (Cooperative Cross-layer Protection) Thesis Contribution and Future Direction Application Middleware/ OS Hardware Network
29 29 Errors and Error Control Schemes No Coupling Different errors and their protection techniques have not been considered jointly No coupling and no cooperation Cooperating control schemes in a cross-layer manner can open a new venue Application Middleware/ OS Hardware Error-prone Networks Mobile Video Application Soft Error Packet Loss Network
30 30 PPC still incurs overheads due to ECC-protection Propose PPC architectures to provide an unequal protection for mobile multimedia systems [Lee, TVLSI08] Unprotected cache and Protected cache at the same level of memory hierarchy PPC still incurs overheads due to highly expensive ECC-protection at the protected cache 29% energy reduction compared to the protected cache 10% energy overhead compared to the unprotected cache PPC Unprotected Cache Memory Protected Cache
31 31 PBPAIR is energy-inefficient in error-free network PBPAIR is error-resilient and energy-efficient in general PBPAIR may not be energy efficient in case of an error-free network [Figure labels: network, Packet Loss, PLR, PBPAIR, Intra_Threshold] PBPAIR: Probability-Based Power Aware Intra Refresh [Kim, 06]
32 32 Outline of CC-PROTECT [Figure: a mobile video application over error-prone networks. Labels: Original Video; Error-Aware Video Encoder (EAVE) containing an Error-Controller (e.g., frame drop) and an Error-Resilient Encoder (e.g., PBPAIR); Error-Aware Video; QoS Loss; Frame Drop; Packet Loss; MW/OS: Monitor & Translate SER, Feedback Parameter, Trigger Selective DFR, Support EAVE & PPC; BER (Backward Error Recovery) versus DFR (Drop & Forward Recovery) across frame K and frame K+1; PPC with Unprotected Cache and Protected Cache with EDC; Error detection]
33 33 Energy Saving
BASE = Error-prone video encoding + unprotected cache
HW-PROTECT = Error-prone video encoding + PPC with ECC
APP-PROTECT = Error-resilient video encoding + unprotected cache
MULTI-PROTECT = Error-resilient video encoding + PPC with ECC
CC-PROTECT1 = Error-prone video encoding + PPC with EDC
CC-PROTECT2 = Error-prone video encoding + PPC with EDC + DFR
CC-PROTECT = Error-resilient video encoding + PPC with EDC + DFR
Energy reduction by step: EDC impact, + DFR impact, + PBPAIR impact (CC-PROTECT)
Reduction compared to HW-PROTECT: 17%, 36%, 56%
Reduction compared to BASE: 4%, 26%, 49%
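As a consistency check on the two rows of percentages, one can back out the energy of HW-PROTECT relative to BASE that each column implies. The sketch assumes that both rows measure the same energy quantity and that the three columns correspond to CC-PROTECT1, CC-PROTECT2, and CC-PROTECT in that order; neither assumption is stated explicitly on the slide.

```python
# Reported reductions, as fractions (column-to-configuration mapping is assumed).
reduction_vs_hw_protect = {"CC-PROTECT1": 0.17, "CC-PROTECT2": 0.36, "CC-PROTECT": 0.56}
reduction_vs_base = {"CC-PROTECT1": 0.04, "CC-PROTECT2": 0.26, "CC-PROTECT": 0.49}

# E_cc = (1 - r_hw) * E_hw = (1 - r_base) * E_base
# so E_hw / E_base = (1 - r_base) / (1 - r_hw)
implied_hw_over_base = {
    name: (1 - reduction_vs_base[name]) / (1 - reduction_vs_hw_protect[name])
    for name in reduction_vs_hw_protect
}

for name, ratio in implied_hw_over_base.items():
    print(f"{name:12s} implies E(HW-PROTECT) / E(BASE) = {ratio:.3f}")
```

All three columns imply about the same ratio, close to 1.16, so under these assumptions the rows are mutually consistent and suggest that HW-PROTECT costs roughly 16% more energy than BASE.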
34 34 Summary CC-PROTECT Propose the CC-PROTECT approach, which coordinates existing schemes across layers to mitigate the impact of soft errors on the failure rate and video quality in mobile video encoding systems: PPC (Partially Protected Caches) with EDC (Error Detection Codes) at the hardware layer, DFR (Drop and Forward Recovery) at the middleware, and PBPAIR (Probability-Based Power Aware Intra Refresh) at the application layer. Demonstrate the effectiveness of low-cost (about 50%) reliability (1,000x) at the minimal cost of QoS (less than 1%). Related Publication [Lee, ACMMM08] CC-PROTECT Considering Submission [Lee, ACMTOMCCAP??] Tradeoff space exploration with CC-PROTECT [Figure: Application: PBPAIR (Error Resilience); Middleware/OS: DFR (Error Correction); Hardware: Unprotected Cache and Protected Cache, with EDC in place of ECC]
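The cooperation between EDC at the hardware layer and DFR at the middleware can be illustrated with a toy encoder loop. This is a hedged sketch rather than the CC-PROTECT implementation: a single even-parity bit stands in for the EDC, frames are plain byte strings, and `encode_with_dfr` and its arguments are hypothetical names. When the check fails for frame K, the frame is dropped and encoding moves forward to frame K+1 instead of rolling back and re-encoding.

```python
def parity(data):
    """Even-parity bit over all bits of data (a minimal error detection code)."""
    bit = 0
    for byte in data:
        bit ^= bin(byte).count("1") & 1
    return bit


def encode_with_dfr(frames, stored_parity, encode):
    """Encode frames; on a detected error, drop the frame and continue (DFR)."""
    encoded = []
    dropped = []
    for k, frame in enumerate(frames):
        if parity(frame) != stored_parity[k]:
            dropped.append(k)  # drop frame K, recover forward at frame K+1
            continue
        encoded.append(encode(frame))
    return encoded, dropped


frames = [b"\x01\x02", b"\x03", b"\x07"]
stored = [parity(f) for f in frames]
frames[1] = b"\x02"  # simulated soft error: one bit flipped in frame 1
encoded, dropped = encode_with_dfr(frames, stored, encode=lambda f: f.hex())
print(encoded, dropped)
```

The dropped frame then looks like one more lost frame to the error-resilient encoder, which is the role PBPAIR plays at the application layer in the summary above.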
35 35 Contents Thesis Motivation Thesis Proposal Cooperative, Cross-layer Methods PPC EAVE CC-PROTECT Thesis Contribution and Future Direction Application Middleware/ OS Hardware Network
36 36 Overall Thesis Contribution Cross-layer methodology to design mobile multimedia embedded systems with minimal costs 1. Effective Cross-layer approaches for reliability 2. Low-cost reliability 3. Expanded trade-off space 4. Extended applicability of existing techniques Application Middleware/ OS Hardware Soft Error Packet Loss Network
37 37 Effectiveness of Thesis Proposals (Energy Saving) PPC: 25% energy reduction, as compared to a conventional protected cache with ECC. EAVE: 30% energy reduction, as compared to a conventional video encoding. CC-PROTECT: 56% energy reduction, as compared to a conventional composition of protections.
38 38 Publication [Figure: publications placed by system layer (Application, Middleware/OS, Hardware, Network): [Lee, ACMMM08] [Mohapatra, IPDPS05] [Lee, ICME05] [Lee, DIPES08] [Lee, TVLSI08] [Lee, PPCDIPES08] [Lee, CASES06]] [Lee, ACMMM08] K. Lee, A. Shrivastava, M. Kim, N. Dutt, and N. Venkatasubramanian, Mitigating the impact of hardware defects on multimedia applications: A cross-layer approach, In ACM International Conference on Multimedia, Oct [Lee, TVLSI08] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, Partially protected caches to reduce failures due to soft errors in multimedia applications, In IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2008, to appear. [Lee, DIPES08] K. Lee, M. Kim, N. Dutt, and N. Venkatasubramanian, Error exploiting video encoder to extend energy/qos tradeoffs for mobile embedded systems, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep [Lee, PPCDIPES08] K. Lee, A. Shrivastava, N. Dutt, and N. Venkatasubramanian, Data partitioning techniques for partially protected caches to reduce soft error induced failures, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep [Lee, CASES06] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, Mitigating soft error failures for multimedia applications by selective data protection, In Int. Conference on Compilers, Architecture, & Synthesis for Embedded Systems (CASES), Oct [Lee, ICME05] K. Lee, N. Dutt, and N. Venkatasubramanian, Experimental Study on Energy Consumption of Video Encryption for Mobile Handheld Devices, In IEEE International Conference on Multimedia and Expo (ICME 05), Poster Session, July [Mohapatra, IPDPS05] S. Mohapatra, R. Cornea, H. Oh, K. Lee, M. Kim, N. Dutt, R. Gupta, A. Nicolau, S. Shukla, and N. Venkatasubramanian, A cross-layer approach for power-performance optimization in distributed mobile systems, In Next Generation Software Program in conjunction with IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2005.
39 39 Future Direction Error Rate Translation/Integration: different types of errors; different components across system layers. Cross-layer methods for distributed embedded systems (Horizontal Expansion): network-aware methods; context-aware approaches. [Figure: system layers (Application, Middleware/OS, Hardware, Network) of a mobile video application over error-prone networks, annotated with error sources (Exception, Bug, Soft Error, Packet Loss)]
40 40 Thank you! Any Questions or Comments?
41 41 Backup Slides
42 42 Why Cross-Layer Approach? Cross-layer interactions and conflicts arise between system properties DVS increases SER exponentially Over protection or under protection All ECC for multimedia data is an overkill Cross-layer approaches can maximize the reliability with minimal power and performance overheads Benefits of Cross-layer approaches Global system view Coordination for intelligent selection Adaptation Cross-layer approaches have been promising to save the resources at the cost of QoS [Mohapatra, 05][Yuan, 04] DVS: Dynamic Voltage Scaling SER: Soft Error Rate ECC: Error Correction Codes QoS: Quality of Service
43 43 Thesis Proposed Contribution: CC-PROTECT Cooperative Cross-layer Protection (CC-PROTECT) by exploiting error-awareness and error control schemes across system abstraction layers Contribution Present cost-efficient reliability methods (cooperative cross-layer protection) Open expanded tradeoff spaces and operating points Rediscover applicability of existing approaches for other purposes
44 44 Performance vs. Capacity Total energy available from a battery is a design issue and is fixed at a design time, along with its weight and size Stark contrast between linear growth rate of battery capacity and exponential technology improvement rate of system components [Udani] Sanjay Udani and Jonathan Smith, Power management in mobile computing
45 45 Generalized Fault Tolerance Techniques 1) Modular Redundancy 2) N-Version Programming 3) Error-Control Coding 4) Checkpoints and Rollbacks 5) Recovery Blocks [Chetan, SPC04] S. Chetan, A. Ranganathan, and R. Campbell, Towards Fault Tolerant Pervasive Computing, in SPC 04 [Somani, IEEECom97] A. K. Somani and N. H. Vaidya, Understanding Fault Tolerance and Reliability, in IEEE Computer 97 vol. 30 issue 4
46 46 1) Modular Redundancy Modular Redundancy Multiple identical replicas of hardware modules Voter mechanism Compare outputs and select the correct output fault Producer A Producer B voter Data Consumer Tolerate most hardware faults Effective but expensive
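A minimal sketch of the voter in a modular-redundancy setup, assuming the replicas' outputs are hashable values that can be compared for equality. The function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


# Three replicas (TMR); one of them suffers a fault and returns a wrong value.
replica_outputs = [42, 42, 7]
print(majority_vote(replica_outputs))
```

With three replicas a single faulty output is outvoted, which is why the scheme is effective, and every computation runs three times, which is why it is expensive.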
47 47 2) N-version Programming N-version Programming Different versions by different teams Different versions may not contain the same bugs Voter mechanism Tolerate some software bugs Program fault i Programmer K Producer A voter Program j Programmer L Data Consumer
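N-version programming can be sketched with three independently written versions of the same function behind a voter. The example is contrived for illustration: the task (integer square root) and the deliberate bug in the third version are assumptions, not taken from the slides.

```python
import math
from collections import Counter


def isqrt_a(n):
    """Version A: relies on the standard library."""
    return math.isqrt(n)


def isqrt_b(n):
    """Version B: binary search for the largest r with r*r <= n."""
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def isqrt_c(n):
    """Version C: contains a bug, it rounds instead of truncating."""
    return round(n ** 0.5)


def n_version_isqrt(n, versions=(isqrt_a, isqrt_b, isqrt_c)):
    """Run every version and return the majority result."""
    outputs = [version(n) for version in versions]
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("versions disagree and no majority exists")
    return value


print(n_version_isqrt(15))
```

For the input 15 the buggy version disagrees with the other two and is outvoted. The scheme only helps when the versions do not share the same bug, as the slide notes.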
48 48 3) Error-Control Coding Error-Control Coding Replication is effective but expensive Error-Detection Coding and Error-Correction Coding (example) Parity Bit, Hamming Code, CRC Much less redundancy than replication Producer A fault Data Error Control Consumer Data
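To make "much less redundancy than replication" concrete, here is a sketch of a Hamming(7,4) code, one of the examples named on the slide. It adds 3 check bits to 4 data bits and corrects any single-bit error, whereas triplicating the same 4 bits would add 8 bits. The function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword positions 1..7 hold p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, corrected position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome is the 1-based error position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulated soft error in one stored bit
print(hamming74_decode(word))
```

A plain parity bit would only detect this error; the Hamming code also locates and repairs it, which is the difference between EDC and ECC used throughout the talk.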
49 49 4) Checkpoints & Rollbacks Checkpoints and Rollbacks Checkpoint A copy of an application s state Save it in storage immune to the failures Rollback Restart the execution from a previously saved checkpoint Recover from transient and permanent hardware and software failures Data Producer A Application State K state (K-1) state K Checkpoint Rollback fault Consumer
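Checkpointing and rollback can be sketched as follows, assuming the application state is a small dictionary and that a fault shows up as an exception. The fault injection, the `"total"` key, and the helper names are hypothetical and exist only to drive the example.

```python
import copy


def add(amount):
    """Build a step that adds amount to state['total'] (hypothetical workload)."""
    return lambda state: state.update(total=state["total"] + amount)


def run_with_checkpoints(steps, state, fault_at=None):
    """Run steps in order; on a fault, roll back to the last checkpoint and retry."""
    checkpoint = copy.deepcopy(state)
    rollbacks = 0
    fault_pending = fault_at is not None
    i = 0
    while i < len(steps):
        try:
            if fault_pending and i == fault_at:
                fault_pending = False
                state["total"] = -999              # the fault corrupts live state
                raise RuntimeError("transient fault")
            steps[i](state)
            checkpoint = copy.deepcopy(state)      # save state K after step K
            i += 1
        except RuntimeError:
            state.clear()
            state.update(copy.deepcopy(checkpoint))  # restore state K-1
            rollbacks += 1
    return rollbacks


state = {"total": 0}
print(run_with_checkpoints([add(1), add(2), add(3)], state, fault_at=1), state)
```

Each rollback re-executes work, so the completion time depends on how many faults occur, which is why the talk considers backward recovery a poor fit for real-time encoding.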
50 50 5) Recovery Blocks Recovery Blocks Multiple alternates to perform the same functionality One Primary module and Secondary modules Different approaches 1) Select a module with output satisfying acceptance test 2) Recovery Blocks and Rollbacks Restart the execution from a previously saved checkpoint with secondary module Tolerate software failures Producer A Application Block X Block X2 Block Y Block Z state (K-1) Checkpoint state K fault Data Rollback Consumer
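A recovery block can be sketched as a primary module, one or more secondary modules, and an acceptance test. In this illustration the task (sorting), the faulty primary, and all names are hypothetical; each alternate starts from a saved copy of the input, which plays the role of the checkpoint.

```python
def primary_sort(values):
    """Hypothetical faulty primary module: returns the input unsorted."""
    return list(values)


def secondary_sort(values):
    """Secondary module with a different implementation."""
    return sorted(values)


def is_sorted(result):
    """Acceptance test applied to every alternate's output."""
    return all(a <= b for a, b in zip(result, result[1:]))


def recovery_block(alternates, acceptance_test, data):
    """Try alternates in order until one passes the acceptance test."""
    for alternate in alternates:
        checkpoint = list(data)  # every alternate restarts from the saved input
        try:
            result = alternate(checkpoint)
        except Exception:
            continue             # a crash counts as a failed alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


print(recovery_block([primary_sort, secondary_sort], is_sorted, [3, 1, 2]))
```

Unlike N-version programming, the alternates run one after another and only as needed, so the common case costs a single execution plus the acceptance test.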
51 51 Soft Errors (Transient Faults) SER increases exponentially as technology scales Integration, voltage scaling, altitude, latitude [Baumann, 05] Caches are most hit due to: Larger portion in processors (more than 50%) No masking effects (e.g., logical masking) Intel Itanium II Processor Transistor 01 Bit Flip 5 hours MTTF 1 month MTTF MTTF: Mean time To Failure
52 52 Related Work Process Technology Solutions: Hardening [Baze, IEEE Trans. on Nuclear Science 00], SOI [O. Musseau, IEEE Trans. on Nuclear Science 96]. Drawbacks: process complexity, yield loss, and substrate cost. Microarchitectural Solutions for Caches: Cache Scrubbing [Mukherjee, PRDC04], Low Power Cache [Li, ISLPED04], Area Efficient Protection [Kim, DATE06], Multiple Bit Correction [Neuberger, TODAES 03], Cache Size Selection [Cai, ASP-DAC06], In-Cache Replication [Zhang, DSN03], Replication Cache [Zhang, IEEE Computers 05]. Drawbacks: high overheads in terms of power, performance, and area. Our Solution: protects caches from failures due to soft errors by exploiting the error-tolerance of applications; the protection can be used in conjunction with any of these techniques.
53 53 Unequal Data Protection All pages are not equally failure critical Multimedia data is failure noncritical Program variables are failure critical Failures: system crash, infinite loop, segmentation faults, etc QoS degradation is not a failure Only 9 pages out of 83 are failure critical
54 54 Failure Critical and Failure Non-Critical Data
55 55 Soft Errors on Increase SER increases exponentially due to technology scaling: 0.18 µm: 1,000 FIT per Mbit of SRAM; 0.13 µm: 10,000 to 100,000 FIT per Mbit of SRAM. Voltage scaling increases SER significantly: $\mathrm{SER} \propto N_{\mathrm{flux}} \times CS \times \exp\{-Q_{\mathrm{critical}} / Q_s\}$, where $Q_{\mathrm{critical}} = C \times V$
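The exponential dependence on supply voltage in the relation above can be explored numerically. The sketch below evaluates the proportionality only up to a constant factor, and every parameter value in it (capacitance, charge collection efficiency Q_s, the two voltages) is a placeholder chosen for illustration rather than a figure from the slides. The function names are hypothetical.

```python
import math


def relative_ser(n_flux, cross_section, capacitance, voltage, q_s):
    """SER up to a constant factor: N_flux * CS * exp(-Q_critical / Q_s)."""
    q_critical = capacitance * voltage  # Q_critical = C * V
    return n_flux * cross_section * math.exp(-q_critical / q_s)


def ser_increase(v_from, v_to, capacitance, q_s):
    """Factor by which SER grows when the supply voltage changes from v_from to v_to."""
    return math.exp(capacitance * (v_from - v_to) / q_s)


# Placeholder values in arbitrary consistent units.
C, Q_S = 1.0, 0.1
print(ser_increase(v_from=1.2, v_to=0.9, capacitance=C, q_s=Q_S))
```

Because the voltage sits in the exponent, a linear reduction of V multiplies SER by a constant factor per step, which is why aggressive DVS conflicts with reliability.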
56 56 Experimental Setup for Page Failure Rates
57 57 Experimental Framework
58 Experimental Results Failure Rate 58 Failure rate of PPC is close to that of Safe (Safe is a protected cache configuration with an ECC protection, i.e., protecting all data, and Unsafe is an unprotected cache)
59 Experimental Results Performance 59 Runtime of PPC is close to that of Unsafe
60 Experimental Results Power 60 Energy consumption of PPC is close to that of Unsafe
61 61 Experimental Setup for DPExplore
62 62 DPExplore Results
63 63 Video Encoding
64 64 Error-Resilient Video Encoding Error-resilient video encodings have been developed to combat errors in networks. PBPAIR: energy-efficient and error-resilient video encoding [Kim, 06]. Passive Error Exploitation: it compresses video data according to PLR, embeds error-resilience against packet losses, and maintains the QoS. [Figure labels: Parameters, PLR, Resilience, Network, Error-prone Networks, Mobile Video Application, Packet Loss] PBPAIR: Probability-Based Power Aware Intra Refresh
65 65 Related Work Energy/QoS-aware video encoding: Video encoding parameters [Mohapatra, IPDPS05], Motion estimation algorithm [Tourapis, VCIP00], Integrated power management [Mohapatra, ACM MM03], Global cross-layer adaptation [Yuan, MMCN04], Transmission power and QoS [Eisenberg, IEEE Trans. on CSVT 02]. These do not consider error-resilience. Error-resilient video encoding: Error-resilient GOP [Yang, JVCIP07], AIR (Adaptive Intra Refreshing) [Worrall, ICASSP01], PGOP (Progressive GOP) [Cheng, PCS04], PBPAIR (Probability-Based Power Aware Intra Refresh) [Kim, MCCR06]. These exploit errors only passively. Our Solution: error-aware video encoding exploits errors actively to minimize energy consumption.
66 66 EE-PBPAIR
67 67 Experimental Setup
68 68 Experimental Results Energy Reduction Energy saving occurs at every component in a path from encoding to decoding in mobile video applications EC = Energy Consumption Enc EC = EC for Encoding Tx EC = EC for Transmission Dec EC = EC for Decoding Rx EC = EC for Receiving PLR = 10% and EIR = 10% PSNR: Peak Signal to Noise Ratio
69 69 Experimental Results Expanded Tradeoff Space
70 70 Experimental Energy Saving Source EC = Enc EC + Tx EC Destination EC = Rx EC + Dec EC
71 71 Experimental Results Adaptive EIR Feedback-based approach (Adaptive EE-PBPAIR) maintains the required video quality compared to Static EE-PBPAIR
72 72 Adaptive EIR
73 73 Conclusion Studied two main cross-layer approaches: PPC and EAVE. Demonstrated the effectiveness of our cooperative cross-layer approaches by exploiting error tolerance and error control schemes. [Figure labels: EIR feedback, Tolerance, Unequal Protection, FLR, Resilience, PLR, Network]
74 74 Failure Rate
75 75 Video Quality
76 76 Memory Access Time (performance)
77 77 Future Direction Cooperative approaches combining PPC and EAVE: a middleware-driven cross-layer approach manages error control schemes; translate errors to exploit existing approaches at other abstraction layers. Apply our approach to other components: instruction caches and logic. Intelligent frame dropping techniques: to maximize the energy saving while minimizing the quality degradation. [Figure labels: PPC, EAVE, EIR feedback, Unequal Protection, FLR, SER, Tolerance, Resilience, PLR]
78 78 Thesis Outline Thesis proposes a cross-layer method: exploit errors and error control schemes across layers to maximize reliability with minimal costs for mobile embedded systems. Topic 1, approach at hardware and application layers: PPC (unequal data protection at hardware exploiting error tolerance at application) [Lee, CASES06][Lee, DIPES08][Lee, TVLSI08]. Topic 2, approach at application, middleware, and network layers: EAVE (intentional exploitation of errors at application, incorporating error resilience in networks) [Lee, DIPES08]. Topic 3, approach across application/middleware-OS/hardware: CC-PROTECT (middleware-driven cooperative exploitation of errors and error control schemes across layers) [Lee, ACM MM 08]. [Figure labels: Application, Middleware/OS, Hardware, Network]
79 79 References (cross-layers and tools) [Bajic, 07] I. V. Bajic. Efficient cross-layer error control for wireless video multicast. 53(1): , Mar [Dynamo] DYNAMO. Power Aware Middleware for Distributed Mobile Computing. University of California at Irvine, [Forge] FORGE Project. A Framework for Optimization of Distributed Embedded Systems Software. University of California at Irvine, [Grace] GRACE Project. Global Resource Adaptation through CoopEration. University of Illinois at Urbana-Champaign, [Kim, 08] M. Kim, N. Dutt, N. Venkatasubramanian, and C. Talcott. xtune: Online verifiable cross-layer adaptation for distributed real-time embedded systems. ACM SIGBED Review: Special Issue on the RTSS Forum on Deeply Embedded Real-Time Computing, 5(1), Jan [Mohapatra, 03] S. Mohapatra, R. Cornea, N. Dutt, A. Nicolau, and N. Venkatasubramanian. Integrated power management for video streaming to mobile handheld devices. In ACM international conference on Multimedia, [Mohapatra, 05] S. Mohapatra, R. Cornea, H. Oh, K. Lee, M. Kim, N. Dutt, R. Gupta, A. Nicolau, S. Shukla, and N. Venkatasubramanian. A cross-layer approach for power-performance optimization in distributed mobile systems. In Next Generation Software Program in conjunction with IPDPS, page 218.1, April [Shivakumar, 01] P. Shivakumar and N. Jouppi. CACTI 3.0: An Integrated Cache Timing, Power, and Area Model. In WRL Technical Report 2001/2, [Synopsys] Synopsys Inc., Mountain View, CA, USA. Design Compiler Reference Manual, [Schaar, 07] M. van der Schaar and D. S. Turaga. Cross-layer packetization and retransmission strategies for delay-sensitive wireless multimedia transmission. IEEE Transactions on Multimedia, 9(1): , Jan [Vuran, 06] M. C. Vuran and I. F. Akyildiz. Cross-layer analysis of error control in wireless sensor networks. In IEEE Communications Society on Sensor and Ad Hoc Communications and Networks (SECON), pages , Sep [Yuan, 03] W. Yuan and K. Nahrstedt. Energy-efficient soft real-time CPU scheduling for mobile multimedia systems. 37(5): , Dec [Yuan, 04] W. Yuan and K. Nahrstedt. Practical voltage scaling for mobile multimedia devices. In ACM international conference on Multimedia, pages , 2004.
80 80 References (soft errors and reliability) [Baumann, 05] R. Baumann. Soft errors in advanced computer systems. IEEE Design and Test of Computers, pages , [Hazucha, 00] P. Hazucha and C. Svensson. Impact of CMOS technology scaling on the atmospheric neutron soft error rate. IEEE Trans. on Nuclear Science, 47(6): , [Li, 05] J.-F. Li and Y.-J. Huang. An error detection and correction scheme for RAMs with partial-write function. In IEEE International Workshop on Memory Technology, Design and Testing (MTDT), pages , [Li, 04] L. Li, V. Degalahal, N. Vijaykrishnan, M. Kandemir, and M. J. Irwin. Soft error and energy consumption interactions: A data cache perspective. In ISLPED, Aug [Mastipuram, 04] R. Mastipuram and E. C. Wee. Soft Errors Impact on System Reliability. Sep [Phelan, 03] R. Phelan. Addressing soft errors in arm core-based designs. Technical report, ARM, [Pradhan, 96] D. K. Pradhan. Fault-Tolerant Computer System Design. Prentice Hall, ISBN [Shrivastava, 05] A. Shrivastava, I. Issenin, and N. Dutt. Compilation techniques for energy reduction in horizontally partitioned cache architectures. In CASES, pages 90 96, [Wrobel, 01] F. Wrobel, J. M. Palau, M. C. Calvet, O. Bersillon, and H. Duarte. Simulation of nucleon-induced nuclear reactions in a simplified SRAM structure: Scaling effects on SEU and MBU cross sections. IEEE Trans. on Nuclear Science, 48(6), [Xu, 96] J. Xu and B. Randell. Roll-forward error recovery in embedded real-time systems. In ICPADS, page 414, [Nieuwland, 06] A. K. Nieuwland and S. Jasarevic and G. Jerin. Combinational Logic Soft Error Analysis and Protection. In IOLTS06, 2006.
81 81 References (error-resilient encoding, etc.) [Cheng, 04] L. Cheng and M. E. Zarki. PGOP: An error resilient techniques for low bit rate and low latency video communications. In Picture Coding Symposium (PCS), Dec [Kim, 06] M. Kim, H. Oh, N. Dutt, A. Nicolau, and N. Venkatasubramanian. PBPAIR: An energy-efficient error-resilient encoding using probability based power aware intra refresh. ACM SIGMOBILE Mobile Computing and Communications Review, 10(3):58 69, July [Wang, 98] Y.Wang and Q.-F. Zhu. Error control and concealment for video communication: A review. 86(5): , May [Worrall, 01] S. Worrall, A. Sadka, P. Sweeney, and A. Kondoz. Motion adaptive error resilient encoding for MPEG-4. In ICASSP, May 2001.