COOPERATIVE CROSS-LAYER PROTECTION FOR RESOURCE CONSTRAINED MOBILE MULTIMEDIA SYSTEMS


1 COOPERATIVE CROSS-LAYER PROTECTION FOR RESOURCE CONSTRAINED MOBILE MULTIMEDIA SYSTEMS
Kyoungwoo Lee (final defense), Nov. 26, 2008
Prof. Nikil Dutt, Prof. Nalini Venkatasubramanian, Prof. Lichun Bao

2 Contents
Thesis Motivation
Thesis Proposal: Cooperative, Cross-layer Methods
  PPC (Partially Protected Caches)
  EAVE (Error-Aware Video Encoding)
  CC-PROTECT (Cooperative, Cross-layer Protection)
Thesis Contribution and Future Direction

3 Mobile Multimedia Embedded Systems
Applications: 3D graphics, image browsing, map routing, mobile TV, animation, web browsing, satellite TV, video streaming, video conferencing.
Resource-limited mobile devices!
The main problem is to achieve low power with high performance, high QoS, and high reliability.

4 Reliability
Reliability is an emerging and critical concern in mobile devices.
New technology makes devices vulnerable to errors because of high complexity and high integration: the soft error rate increases exponentially as technology scales [Baumann, 05].
Mobile applications run close to humans: in pervasive computing, failures of healthcare mobile devices have serious consequences.
Redundancy techniques incur high power and performance overheads: TMR (Triple Modular Redundancy) may exceed 200% overhead without optimization [Nieuwland, 06].
It is challenging to optimize multiple properties (e.g., performance, power, QoS, and reliability) in mobile embedded systems.

5 Soft error is becoming an every-second concern!
Soft Error Rate (SER) is measured in FIT (Failures in Time) = number of errors in 10^9 hours.
[Table: SER (FIT), MTTF, and reason for a series of system configurations; most numeric entries were lost in transcription. The MTTF shrinks step by step: from years, to days (high integration), to hours (technology scaling and twice the integration), to minutes (memory takes up 50% of soft errors in a system), to 18 seconds (exponential relationship between SER and supply voltage), to 0.02 seconds (high intensity of neutron flux in flight at high altitude).]
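The FIT definition above gives a direct conversion to MTTF if the failure rate is assumed constant. The following is a minimal sketch under that assumption; the function names and the 8 Mbit example are hypothetical illustrations, not values from the slides (the 1,000 FIT per Mbit figure is the one quoted for 0.18 µm SRAM in the backup slides).

```python
HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year, for a rough conversion


def fit_to_mttf_hours(fit: float) -> float:
    """Return the MTTF in hours for a failure rate given in FIT.

    FIT counts failures per 10^9 hours, so MTTF = 10^9 / FIT
    (assuming a constant failure rate).
    """
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return 1e9 / fit


def system_fit(component_fits):
    """Sum component FIT rates (assumes independent components in series)."""
    return sum(component_fits)


if __name__ == "__main__":
    # Hypothetical example: 8 Mbit of SRAM at 1,000 FIT per Mbit.
    total = system_fit([1000.0] * 8)
    mttf_h = fit_to_mttf_hours(total)
    print(f"{total:.0f} FIT -> MTTF of {mttf_h:.0f} hours "
          f"({mttf_h / HOURS_PER_YEAR:.1f} years)")
```

Because FIT rates add, each multiplication of the failure rate (more bits, lower voltage, higher neutron flux) divides the MTTF by the same factor, which is why the MTTF in the table collapses so quickly.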

6 Errors and Failures in Mobile Embedded Systems
Faults or errors can cause failures.
[Diagram: error sources (exception, bug, soft error, packet loss) across the Application, Middleware/OS, Hardware, and Network layers]

7 Errors and Error Control Schemes at Hardware
Failures: soft errors, hard failures, system crash
Causes: external radiation, thermal effects, power loss, poor design, aging
Metrics: FIT, MTTF, MTBF
Traditional approaches: spatial redundancy (TMR, duplex, RAID-1, etc.) and data redundancy (EDC, ECC, RAID-5, etc.)
Hardware failures are increasing as technology scales, e.g., SER increases by up to 1000 times [Mastipuram, 04].
Redundancy techniques are expensive, e.g., ECC-based protection in caches can incur a 95% performance penalty [Li, 05].
FIT: Failures in Time (per 10^9 hours); MTTF: Mean Time To Failure; MTBF: Mean Time Between Failures; TMR: Triple Modular Redundancy; EDC: Error Detection Codes; ECC: Error Correction Codes; RAID: Redundant Array of Inexpensive Disks

8 Errors and Error Control Schemes at Software
Failures: wrong outputs, infinite loops, crash
Causes: incomplete specification, poor software design, bugs, unhandled exceptions
Metrics: number of bugs per thousand lines, QoS, MTTF, MTBF
Traditional approaches: spatial redundancy (N-version programming, etc.) and temporal redundancy (checkpoints and backward recovery, etc.)
Software errors become dominant as system complexity increases, e.g., several bugs per thousand lines of code.
Software is hard to debug, and redundancy techniques are expensive, e.g., backward recovery with checkpoints is inappropriate for real-time applications.
QoS: Quality of Service

9 Errors and Error Control Schemes in Networks
Failures: data losses, deadline misses, node (link) failure, system down
Causes: network congestion, noise/interference, malicious attacks
Metrics: packet loss rate, deadline miss rate, SNR, MTTF, MTBF, MTTR
Traditional approaches: resource reservation, data redundancy (CRC, etc.), temporal redundancy (retransmission, etc.), spatial redundancy (replicated nodes, MIMO, etc.)
The network is unreliable (especially wireless networks).
Joint approaches across OSI layers have been investigated for minimal costs [Vuran, 06][Schaar, 07].
SNR: Signal to Noise Ratio; MTTR: Mean Time To Recovery; CRC: Cyclic Redundancy Check; MIMO: Multiple-In Multiple-Out

10 Conventional Approaches
Most redundancy techniques incur overheads in terms of performance, power, area, etc.
Conventional TMR (Triple Modular Redundancy) can incur 200% overhead without optimization.
Backward recovery with checkpoints cannot guarantee the completion time of a task.
Recently proposed techniques have focused on reducing cost without losing reliability. However, they still incur overheads.

11 Thesis Problem Statement
Study tradeoffs among system properties, e.g., redundancy incurs energy overheads while DVS increases SER significantly.
Examine errors and error control schemes across system abstraction layers, e.g., network errors and error-resilient video encoding, soft errors and ECC or EDC, etc.
Maximize reliability with minimal costs of power and performance for mobile embedded systems.

12 Cross-Layer Methods
Cross-layer approaches aim at system-level optimization: they integrate and coordinate techniques across system layers.
Classification [Srivastava, 05]:
Top-down, bottom-up, or both directions
  Top-down: PPC, PDVS [GRACE], etc.
  Bottom-up: EAVE, etc.
  Both directions: CC-PROTECT, etc.
Coupling or merging layers: Dynamo [Mohapatra], xtune [Kim], etc.
[Diagram: coupling, merging, bottom-up, and top-down interactions between layers]

13 Cross-Layer Approaches: GRACE
GRACE at UIUC [W. Yuan, Ph.D. thesis, 04; A. F. Harris III, Ph.D. thesis, 06]
QoS/power tradeoffs
Primarily OS adaptation for power management in multimedia mobile devices
Network adaptation for power management in multimedia communications [GRACE, 05]
[Diagram: Application, Operating System, and Hardware layers]

14 Cross-Layer Approaches: DYNAMO & FORGE
DYNAMO middleware for FORGE at UCI [S. Mohapatra, Ph.D. thesis, 05; R. Cornea, Ph.D. thesis, 07]
QoS/power tradeoffs for mobile embedded systems
Middleware-driven coordination and proxy-based cooperation:
1. Content transcoding at the application layer
2. Network traffic shaping at the network layer
3. Backlight (LCD display) setting at the hardware layer
4. NIC shutdown, CPU DVS/DFS at the hardware layer
[Diagram: Proxy Server (NW & MW), Application, Middleware/OS, and Hardware layers, annotated with items 1 to 4]

15 Cross-Layer Approaches: xtune
xtune at UCI and SRI [M. Kim, Ph.D. thesis, 08]
QoS/power/timeliness adaptation for distributed real-time embedded systems
A formal methodology for cross-layer tuning and verifiable timeliness of mobile embedded systems
[Diagram: Proxy Server, Handheld, and Server, each with Application, Middleware/OS, and Hardware layers]

16 Thesis Proposed Contribution
The thesis proposes a cross-layer design methodology for mobile multimedia embedded systems with minimal costs.
Reliability/QoS/power/performance system optimization for mobile multimedia systems
Cooperative, cross-layer protection: PPC, EAVE, and CC-PROTECT
Low-cost reliability

17 Overview of Thesis Proposals
PPC (Partially Protected Caches), EAVE (Error-Aware Video Encoding), CC-PROTECT (Cooperative, Cross-layer Protection)
[Diagram: a mobile video application on error-prone networks. Application layer: original video enters EAVE, which combines an error controller (e.g., frame drop) with an error-resilient encoder (e.g., PBPAIR) and emits error-aware video. MW/OS layer: QoS monitoring, translation of SER, and correction. Hardware layer: PPC with an unprotected cache and a protected cache (EDC, ECC) and error detection. Network: frame drop and packet loss.]

18 Contents
Thesis Motivation
Thesis Proposal: Cooperative, Cross-layer Methods
  PPC (Partially Protected Caches)
  EAVE
  CC-PROTECT
Thesis Contribution and Future Direction
[Diagram: Application, Middleware/OS, Hardware, and Network layers]

19 Conventional Protection for Caches
Conventional protected caches are unaware of fault tolerance at applications.
They implement a redundancy technique such as ECC to protect all data on every access, which is overkill for multimedia applications.
ECC (e.g., a Hamming code) incurs a performance penalty of up to 95%, a power overhead of up to 22%, and an area cost of up to 25%.
[Diagram: cache with ECC; unaware of application, high cost]

20 PPC (Partially Protected Caches)
Observation: not all data are equally failure critical (multimedia data vs. control variables).
Propose PPC architectures to provide unequal protection for mobile multimedia systems [Lee, CASES06][Lee, TVLSI08].
An unprotected cache and a protected cache sit at the same level of the memory hierarchy.
The protected cache is typically smaller, to keep its power and delay the same as or less than those of the unprotected cache.
[Diagram: PPC with an unprotected cache and a protected cache, backed by memory]

21 PPC for Multimedia Applications
Propose selective data protection [Lee, CASES06]: unequal protection at the hardware layer, exploiting the error tolerance of multimedia data at the application layer.
Simple data partitioning for multimedia applications:
  Multimedia data is failure non-critical.
  All other data is failure critical.
[Diagram: PPC with an unprotected cache (power/delay reduction) and a protected cache (fault tolerance), backed by memory]

22 PPC for General Applications
DPExplore [Lee, PPCDIPES08] explores the partitioning space by exploiting awareness of the vulnerability of each data page.
Vulnerable time: data in the cache is vulnerable during any interval that ends with a read by the CPU or a write-back to memory.
Pages causing high vulnerable time are failure critical.
Vulnerable time closely estimates the failure rate.
[Diagram: timeline of a datum in the unprotected cache: incoming at t0, read at t1, write at t2, eviction to memory at t3]
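The vulnerable-time metric can be illustrated on an access trace. The sketch below is one plausible reading of the definition on this slide, not the DPExplore implementation: an interval counts as vulnerable only if it ends in a CPU read or in an eviction that writes dirty data back to memory. The event names, the trace format, and the assumption that a write overwrites the whole datum are hypothetical.

```python
def vulnerable_time(events):
    """Sum the intervals during which a cached datum is vulnerable.

    events: time-ordered list of (kind, time) pairs, with kind in
    {"incoming", "read", "write", "evict"}. The first event must be
    "incoming" (the datum is brought into the cache).
    """
    if not events or events[0][0] != "incoming":
        raise ValueError("trace must start with an 'incoming' event")
    total = 0
    dirty = False
    last_time = events[0][1]
    for kind, t in events[1:]:
        interval = t - last_time
        if kind == "read":
            # An error during this interval would be consumed by the CPU.
            total += interval
        elif kind == "write":
            # Assumption: the write overwrites the datum, so an error in
            # the old value is masked and this interval is not vulnerable.
            dirty = True
        elif kind == "evict":
            # Only dirty data is written back; a clean copy is discarded.
            if dirty:
                total += interval
        else:
            raise ValueError(f"unknown event kind: {kind}")
        last_time = t
    return total


if __name__ == "__main__":
    # The timeline from the slide with hypothetical times t0..t3 = 0..3.
    trace = [("incoming", 0), ("read", 1), ("write", 2), ("evict", 3)]
    print(vulnerable_time(trace))
```

For the slide's timeline this counts (t1 - t0) and (t3 - t2) but not (t2 - t1), since the value read at t1 is overwritten at t2 before anything consumes it again.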

23 Summary: PPC
Not all data are equally failure critical.
Propose a PPC architecture to provide unequal protection.
Support unequal protection at the hardware layer by exploiting error tolerance and vulnerability at the application layer.
Present cost-efficient reliability.
Related publications:
  [Lee, CASES06] PPC for multimedia embedded systems
  [Lee, PPCDIPES08] PPC for general applications
  [Lee, TVLSI08] PPC and design space exploration
Under submission:
  [Lee, TODAES??] PPC for general applications and instruction caches
[Diagram: application data and code are split by page partitioning algorithms, using the error tolerance of multimedia data and the vulnerability of data and code, into failure non-critical (FNC) and failure critical (FC) pages; FNC and FC are mapped into the unprotected and protected caches of the PPC]

24 Contents
Thesis Motivation
Thesis Proposal: Cooperative, Cross-layer Methods
  PPC
  EAVE (Error-Aware Video Encoding)
  CC-PROTECT
Thesis Contribution and Future Direction
[Diagram: Application, Middleware/OS, and Network layers]

25 Active Error Exploitation: Intentional Frame Drop
Intentional frame drop (one way to actively exploit errors) can result in energy reduction for each operation.
FDT-1 affects the following components with respect to power, performance, and QoS in mobile video applications.
[Diagram: video path through Enc CPU, Tx WNI, error-prone networks (packet loss), Rx WNI, and Dec CPU, with frame drop points FDT-1, FDT-2, and FDT-3 marked along it]
FDT: Frame Drop Type; Enc: Encoding; Dec: Decoding; WNI: Wireless Network Interface

26 Error-Aware Video Encoding
Propose EE-PBPAIR [Lee, DIPES08]:
  Intentionally drop frames at video encoding.
  Reduce the energy consumption of video encoding.
  Maintain the video quality by exploiting the error resilience of PBPAIR.
EIR: Error Injection Rate
[Diagram: original video enters the Error-Aware Video Encoder (EAVE), which combines an error controller (e.g., frame dropping, intentional frame drop) with an error-resilient encoder (e.g., PBPAIR); the resulting error-aware video is sent over error-prone networks with packet loss]

27 Summary: EAVE
Intentional frame drop is one way to exploit errors actively.
Propose an error-aware video encoding (EE-PBPAIR).
Present a knob (EIR) to adjust the amount of errors, considering the QoS feedback.
Maintain the video quality using the error resilience of PBPAIR.
Related publication: [Lee, DIPES08] EE-PBPAIR
Considering submission: [Lee, TECS??] Generalized idea for error-resilient video encodings
[Diagram: the error controller (middleware) feeds EIR to the error-resilient video encoder (application), with error rate = PLR + EIR; energy reduction at CPU, memory, and WNIC (hardware); error-aware video data goes to the network or decoding side, which returns PLR and QoS]
EIR: Error Injection Rate; PLR: Packet Loss Rate
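The EIR knob and its QoS feedback can be sketched as a small control loop. This is an illustrative sketch only, not the EE-PBPAIR controller: the step-based adjustment rule, the use of PSNR as the QoS metric, the `max_eir` cap, and all function names are assumptions. Only the relation error rate = PLR + EIR comes from the slide.

```python
import math


def total_error_rate(plr: float, eir: float) -> float:
    """Error rate seen by the error-resilient encoder: PLR + EIR, capped at 1."""
    return min(1.0, plr + eir)


def adjust_eir(eir, measured_psnr_db, target_psnr_db, step=0.01, max_eir=0.5):
    """Hypothetical feedback rule: back off when quality is below target,
    otherwise inject more errors (drop more frames) to save energy."""
    if measured_psnr_db < target_psnr_db:
        eir -= step
    else:
        eir += step
    return min(max(eir, 0.0), max_eir)


def frames_to_drop(num_frames: int, eir: float):
    """Indices of frames to drop so that about eir * num_frames frames
    are skipped, spread evenly over the sequence."""
    return [i for i in range(num_frames)
            if math.floor((i + 1) * eir) > math.floor(i * eir)]


if __name__ == "__main__":
    eir = 0.25
    print("dropped frames:", frames_to_drop(8, eir))
    print("encoder error rate:", total_error_rate(plr=0.10, eir=eir))
    # Quality fell below the (hypothetical) 32 dB target: reduce EIR.
    print("next EIR:", adjust_eir(eir, measured_psnr_db=30.0, target_psnr_db=32.0))
```

A fixed `eir` corresponds to the static variant; calling `adjust_eir` once per feedback period corresponds to the adaptive variant described in the backup slides.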

28 Contents
Thesis Motivation
Thesis Proposal: Cooperative, Cross-layer Methods
  PPC
  EAVE
  CC-PROTECT (Cooperative Cross-layer Protection)
Thesis Contribution and Future Direction
[Diagram: Application, Middleware/OS, Hardware, and Network layers]

29 Errors and Error Control Schemes: No Coupling
Different errors and their protection techniques have not been considered jointly: no coupling and no cooperation.
Cooperating control schemes in a cross-layer manner can open a new avenue.
[Diagram: a mobile video application on error-prone networks, with soft errors at the hardware layer and packet loss at the network layer, across the Application, Middleware/OS, Hardware, and Network layers]

30 PPC still incurs overheads due to ECC protection
Propose PPC architectures to provide unequal protection for mobile multimedia systems [Lee, TVLSI08]: an unprotected cache and a protected cache at the same level of the memory hierarchy.
PPC still incurs overheads due to the highly expensive ECC protection at the protected cache:
  29% energy reduction compared to the protected cache
  10% energy overhead compared to the unprotected cache
[Diagram: PPC with an unprotected cache and a protected cache, backed by memory]

31 PBPAIR is energy-inefficient in an error-free network
PBPAIR is error-resilient and energy-efficient in general.
PBPAIR may not be energy efficient in an error-free network.
[Diagram: network packet loss (PLR) feeding PBPAIR's Intra_Threshold]
PBPAIR: Probability-Based Power Aware Intra Refresh [Kim, 06]

32 Outline of CC-PROTECT
[Diagram: a mobile video application on error-prone networks. Application: original video enters the Error-Aware Video Encoder (EAVE), which combines an error controller (e.g., frame drop) with an error-resilient encoder (e.g., PBPAIR) and emits error-aware video; QoS loss comes from frame drop and packet loss. MW/OS: monitor and translate SER, feed back parameters, trigger selective DFR, and support EAVE and PPC; BER (Backward Error Recovery) vs. DFR (Drop & Forward Recovery) shown on frames K and K+1. Hardware: PPC with an unprotected cache and a protected cache with EDC for error detection.]

33 Energy Saving
BASE = error-prone video encoding + unprotected cache
HW-PROTECT = error-prone video encoding + PPC with ECC
APP-PROTECT = error-resilient video encoding + unprotected cache
MULTI-PROTECT = error-resilient video encoding + PPC with ECC
CC-PROTECT1 = error-prone video encoding + PPC with EDC
CC-PROTECT2 = error-prone video encoding + PPC with EDC + DFR
CC-PROTECT = error-resilient video encoding + PPC with EDC + DFR

                                    EDC impact   + DFR impact   + PBPAIR (CC-PROTECT) impact
Reduction compared to HW-PROTECT       17%           36%            56%
Reduction compared to BASE              4%           26%            49%

34 Summary: CC-PROTECT
Propose the CC-PROTECT approach, which coordinates existing schemes across layers to mitigate the impact of soft errors on the failure rate and video quality in mobile video encoding systems:
  PPC (Partially Protected Caches) with EDC (Error Detection Codes) at the hardware layer
  DFR (Drop and Forward Recovery) at the middleware layer
  PBPAIR (Probability-Based Power Aware Intra Refresh) at the application layer
Demonstrate the effectiveness: low cost (about 50%) and high reliability (1,000x) at a minimal cost in QoS (less than 1%).
Related publication: [Lee, ACMMM08] CC-PROTECT
Considering submission: [Lee, ACMTOMCCAP??] Tradeoff space exploration with CC-PROTECT
[Diagram: Application: PBPAIR (error resilience); Middleware/OS: DFR (error correction); Hardware: unprotected cache and protected cache, with ECC replaced by EDC]

35 Contents
Thesis Motivation
Thesis Proposal: Cooperative, Cross-layer Methods
  PPC
  EAVE
  CC-PROTECT
Thesis Contribution and Future Direction
[Diagram: Application, Middleware/OS, Hardware, and Network layers]

36 Overall Thesis Contribution
Cross-layer methodology to design mobile multimedia embedded systems with minimal costs:
1. Effective cross-layer approaches for reliability
2. Low-cost reliability
3. Expanded trade-off space
4. Extended applicability of existing techniques
[Diagram: soft errors and packet loss across the Application, Middleware/OS, Hardware, and Network layers]

37 Effectiveness of Thesis Proposals (Energy Saving)
PPC: 25% energy reduction, compared to a conventional protected cache with ECC
EAVE: 30% energy reduction, compared to a conventional video encoding
CC-PROTECT: 56% energy reduction, compared to a conventional composition of protections

38 Publication
[Diagram: publications placed by layer (Application, Middleware/OS, Hardware, Network): [Lee, ACMMM08], [Mohapatra, IPDPS05], [Lee, ICME05], [Lee, DIPES08], [Lee, TVLSI08], [Lee, PPCDIPES08], [Lee, CASES06]]
[Lee, ACMMM08] K. Lee, A. Shrivastava, M. Kim, N. Dutt, and N. Venkatasubramanian, Mitigating the impact of hardware defects on multimedia applications: A cross-layer approach, In ACM International Conference on Multimedia, Oct
[Lee, TVLSI08] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, Partially protected caches to reduce failures due to soft errors in multimedia applications, In IEEE Transactions on Very Large Scale Integration Systems (TVLSI), 2008, to appear.
[Lee, DIPES08] K. Lee, M. Kim, N. Dutt, and N. Venkatasubramanian, Error exploiting video encoder to extend energy/QoS tradeoffs for mobile embedded systems, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep
[Lee, PPCDIPES08] K. Lee, A. Shrivastava, N. Dutt, and N. Venkatasubramanian, Data partitioning techniques for partially protected caches to reduce soft error induced failures, In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES), Sep
[Lee, CASES06] K. Lee, A. Shrivastava, I. Issenin, N. Dutt, and N. Venkatasubramanian, Mitigating soft error failures for multimedia applications by selective data protection, In Int. Conference on Compilers, Architecture, & Synthesis for Embedded Systems (CASES), Oct
[Lee, ICME05] K. Lee, N. Dutt, and N. Venkatasubramanian, Experimental Study on Energy Consumption of Video Encryption for Mobile Handheld Devices, In IEEE International Conference on Multimedia and Expo (ICME 05), Poster Session, July
[Mohapatra, IPDPS05] S. Mohapatra, R. Cornea, H. Oh, K. Lee, M. Kim, N. Dutt, R. Gupta, A. Nicolau, S. Shukla, and N. Venkatasubramanian, A cross-layer approach for power-performance optimization in distributed mobile systems, In Next Generation Software Program in conjunction with IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2005.

39 Future Direction
Error rate translation/integration:
  Different types of errors
  Different components across system layers
Cross-layer methods for distributed embedded systems (horizontal expansion):
  Network-aware methods
  Context-aware approaches
[Diagram: a mobile video application on error-prone networks, with exceptions, bugs, soft errors, and packet loss across the Application, Middleware/OS, Hardware, and Network layers]

40 Thank you! Any questions or comments?

41 Backup Slides

42 Why Cross-Layer Approach?
Cross-layer interactions and conflicts arise between system properties:
  DVS increases SER exponentially.
  Over-protection or under-protection: applying ECC to all multimedia data is overkill.
Cross-layer approaches can maximize reliability with minimal power and performance overheads.
Benefits of cross-layer approaches: global system view, coordination for intelligent selection, adaptation.
Cross-layer approaches have been promising for saving resources at the cost of QoS [Mohapatra, 05][Yuan, 04].
DVS: Dynamic Voltage Scaling; SER: Soft Error Rate; ECC: Error Correction Codes; QoS: Quality of Service

43 Thesis Proposed Contribution: CC-PROTECT
Cooperative Cross-layer Protection (CC-PROTECT) exploits error-awareness and error control schemes across system abstraction layers.
Contribution:
  Present cost-efficient reliability methods (cooperative cross-layer protection).
  Open expanded tradeoff spaces and operating points.
  Rediscover the applicability of existing approaches for other purposes.

44 Performance vs. Capacity
The total energy available from a battery is a design issue and is fixed at design time, along with its weight and size.
There is a stark contrast between the linear growth rate of battery capacity and the exponential technology improvement rate of system components.
[Udani] Sanjay Udani and Jonathan Smith, Power management in mobile computing

45 Generalized Fault Tolerance Techniques
1) Modular Redundancy
2) N-Version Programming
3) Error-Control Coding
4) Checkpoints and Rollbacks
5) Recovery Blocks
[Chetan, SPC04] S. Chetan, A. Ranganathan, and R. Campbell, Towards Fault Tolerant Pervasive Computing, in SPC 04
[Somani, IEEECom97] A. K. Somani and N. H. Vaidya, Understanding Fault Tolerance and Reliability, in IEEE Computer 97 vol. 30 issue 4

46 1) Modular Redundancy
Multiple identical replicas of hardware modules
Voter mechanism: compare outputs and select the correct output
Tolerates most hardware faults
Effective but expensive
[Diagram: Producer A (faulty) and Producer B feed a voter, which passes data to the consumer]
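The voter mechanism can be sketched as a simple majority vote over replica outputs. This is a minimal software illustration with hypothetical names; a hardware voter would compare signals rather than Python values.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of the replicas.

    Raises RuntimeError when no value has a strict majority, since the
    voter then cannot tell which output is correct.
    """
    if not outputs:
        raise ValueError("at least one replica output is required")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among replica outputs")
    return value


def run_replicated(replicas, data):
    """Run every replica on the same input and vote on the results."""
    return majority_vote([replica(data) for replica in replicas])


if __name__ == "__main__":
    good = lambda x: x + 1
    faulty = lambda x: x + 100   # hypothetical replica hit by a fault
    # TMR: three replicas, one of them faulty; the fault is masked.
    print(run_replicated([good, faulty, good], 41))
```

With three replicas a single faulty module is outvoted, which is why TMR costs roughly 200% extra hardware for tolerance of one fault.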

47 2) N-version Programming
Different versions written by different teams
Different versions may not contain the same bugs
Voter mechanism
Tolerates some software bugs
[Diagram: Program i by Programmer K (faulty) and Program j by Programmer L feed a voter, which passes data to the consumer]

48 3) Error-Control Coding
Replication is effective but expensive.
Error-detection coding and error-correction coding, for example parity bit, Hamming code, CRC
Much less redundancy than replication
[Diagram: Producer A sends data through a fault to an error control stage, which delivers data to the consumer]
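As an illustration of the two kinds of codes named on this slide, the sketch below shows an even parity bit (detects a single bit flip) and a textbook Hamming(7,4) code (corrects a single bit flip). It is a generic example, not the cache ECC evaluated in the thesis; the function names are hypothetical.

```python
def even_parity_bit(bits):
    """Parity bit that makes the total number of 1s even (detection only)."""
    return sum(bits) % 2


def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword.

    Codeword positions 1..7 hold p1, p2, d1, p4, d2, d3, d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit; return (data bits, syndrome).

    The syndrome is 0 for a clean codeword, otherwise the 1-based
    position of the flipped bit.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    code = hamming74_encode(word)
    code[4] ^= 1                      # inject a soft error at position 5
    decoded, syndrome = hamming74_decode(code)
    print(decoded == word, "flipped position:", syndrome)
```

The code adds 3 check bits to 4 data bits, compared with 8 extra bits for triplicating the same data, which is the sense in which coding needs much less redundancy than replication.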

49 4) Checkpoints & Rollbacks
Checkpoint: a copy of an application's state, saved in storage immune to the failures
Rollback: restart the execution from a previously saved checkpoint
Recovers from transient and permanent hardware and software failures
[Diagram: Producer A passes data to the consumer; the application state K is checkpointed, and after a fault the application rolls back from state K to the checkpointed state (K-1)]
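The checkpoint and rollback steps can be sketched as follows. This is a minimal illustration with hypothetical names: the checkpoint is kept as an in-memory deep copy, standing in for the failure-immune storage that the slide calls for, and a fault is modeled as a Python exception.

```python
import copy


class CheckpointedState:
    """Application state with a single saved checkpoint."""

    def __init__(self, state):
        self.state = state
        self._saved = copy.deepcopy(state)

    def checkpoint(self):
        """Save a copy of the current state."""
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        """Restore the state from the last checkpoint."""
        self.state = copy.deepcopy(self._saved)


def run_with_rollback(app, step, max_retries=3):
    """Run one step; on a fault, roll back and retry from the checkpoint.

    Returns True if the step eventually completed, False otherwise.
    """
    for _ in range(max_retries):
        try:
            step(app.state)
            app.checkpoint()      # the step succeeded: commit its effects
            return True
        except Exception:
            app.rollback()        # discard partial updates and retry
    return False


if __name__ == "__main__":
    app = CheckpointedState({"frame": 0})
    attempts = []

    def encode_next_frame(state):
        state["frame"] += 1
        attempts.append(1)
        if len(attempts) == 1:    # hypothetical transient fault, first try only
            raise RuntimeError("transient fault")

    print(run_with_rollback(app, encode_next_frame), app.state)
```

The retry loop also shows why the approach is a poor fit for real-time tasks: the completion time depends on how many rollbacks occur.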

50 5) Recovery Blocks
Multiple alternates perform the same functionality: one primary module and secondary modules.
Different approaches:
1) Select a module whose output satisfies an acceptance test.
2) Recovery blocks and rollbacks: restart the execution from a previously saved checkpoint with a secondary module.
Tolerates software failures
[Diagram: Producer A passes data to the consumer; the application consists of Block X (with alternate Block X2), Block Y, and Block Z; after a fault at state K, execution rolls back to the checkpoint at state (K-1)]
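The second approach (recovery blocks combined with rollback) can be sketched as a loop over alternates that restarts each one from the same checkpoint. The names and the acceptance test in the example are hypothetical.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Try the primary module, then the secondary modules, in order.

    Each alternate starts from a fresh copy of the checkpointed state,
    so side effects of a failed alternate are rolled back. The first
    result that passes the acceptance test is returned.
    """
    checkpoint = copy.deepcopy(state)
    for module in alternates:
        working_copy = copy.deepcopy(checkpoint)   # rollback to the checkpoint
        try:
            result = module(working_copy)
        except Exception:
            continue                               # crash: try the next alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


if __name__ == "__main__":
    def primary(state):            # hypothetical buggy primary module
        return -3

    def secondary(state):          # simpler, independently written alternate
        root = 0
        while root * root < state["x"]:
            root += 1
        return root

    accept = lambda r: r >= 0 and r * r == 9
    print(recovery_block({"x": 9}, [primary, secondary], accept))
```

Unlike N-version programming, the alternates run one after another rather than in parallel, trading extra latency on failure for lower cost in the fault-free case.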

51 Soft Errors (Transient Faults)
SER increases exponentially as technology scales: integration, voltage scaling, altitude, latitude [Baumann, 05].
Caches are hit hardest because of:
  Their large portion of the processor (more than 50%)
  No masking effects (e.g., logical masking)
[Figure: Intel Itanium II processor; bit flip in a transistor; MTTF labels of 5 hours and 1 month]
MTTF: Mean Time To Failure

52 Related Work
Process technology solutions:
  Hardening [Baze, IEEE Trans. on Nuclear Science 00]
  SOI [O. Musseau, IEEE Trans. on Nuclear Science 96]
  Drawbacks: process complexity, yield loss, and substrate cost
Microarchitectural solutions for caches:
  Cache Scrubbing [Mukherjee, PRDC04]
  Low Power Cache [Li, ISLPED04]
  Area Efficient Protection [Kim, DATE06]
  Multiple Bit Correction [Neuberger, TODAES 03]
  Cache Size Selection [Cai, ASP-DAC06]
  In-Cache Replication [Zhang, DSN03]
  Replication Cache [Zhang, IEEE Computers 05]
  Drawbacks: high overheads in terms of power, performance, and area
Our solution:
  Protects caches from failures due to soft errors by exploiting the error tolerance of applications
  Protection can be used in conjunction with any of these techniques

53 Unequal Data Protection
Not all pages are equally failure critical:
  Multimedia data is failure non-critical.
  Program variables are failure critical.
Failures: system crash, infinite loop, segmentation faults, etc.
QoS degradation is not a failure.
Only 9 pages out of 83 are failure critical.

54 Failure Critical and Failure Non-Critical Data

55 Soft Errors on Increase
SER increases exponentially due to technology scaling:
  0.18 µm: 1,000 FIT per Mbit of SRAM
  0.13 µm: 10,000 to 100,000 FIT per Mbit of SRAM
Voltage scaling increases SER significantly:
  $SER \propto N_{flux} \times CS \times \exp\left(-\frac{Q_{critical}}{Q_s}\right)$, where $Q_{critical} = C \times V$
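The effect of voltage scaling in the model SER ∝ N_flux × CS × exp(−Q_critical / Q_s), with Q_critical = C × V, can be explored numerically. In the sketch below the neutron flux and cross section are assumed unchanged, so they cancel in the ratio; the capacitance and Q_s values are hypothetical placeholders chosen only to show the trend, not measured parameters.

```python
import math


def relative_ser(v_new, v_ref, capacitance, q_s):
    """Return SER(v_new) / SER(v_ref) for the exponential model.

    With N_flux and CS held constant, the ratio reduces to
    exp((Q_critical(v_ref) - Q_critical(v_new)) / q_s),
    where Q_critical = capacitance * voltage.
    """
    q_new = capacitance * v_new
    q_ref = capacitance * v_ref
    return math.exp((q_ref - q_new) / q_s)


if __name__ == "__main__":
    # Hypothetical device: C = 10 fF, Q_s = 1 fC (so C * V is in fC).
    for v in (1.0, 0.9, 0.8, 0.7):
        ratio = relative_ser(v, v_ref=1.0, capacitance=10.0, q_s=1.0)
        print(f"V = {v:.1f} V -> SER x {ratio:.1f}")
```

Since the voltage appears in the exponent, each equal step down in supply voltage multiplies the SER by the same factor, which is the conflict between DVS and reliability that the thesis highlights.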

56 Experimental Setup for Page Failure Rates

57 Experimental Framework

58 Experimental Results: Failure Rate
The failure rate of PPC is close to that of Safe. (Safe is a protected cache configuration with ECC protection, i.e., protecting all data, and Unsafe is an unprotected cache.)

59 Experimental Results: Performance
The runtime of PPC is close to that of Unsafe.

60 Experimental Results: Power
The energy consumption of PPC is close to that of Unsafe.

61 Experimental Setup for DPExplore

62 DPExplore Results

63 Video Encoding

64 Error-Resilient Video Encoding
Error-resilient video encodings have been developed to combat errors in networks.
PBPAIR: energy-efficient and error-resilient video encoding [Kim, 06]
Passive error exploitation:
  It compresses video data according to PLR.
  It embeds error resilience against packet losses.
  It maintains the QoS.
[Diagram: the network reports PLR as a parameter to the encoder, which adds resilience for a mobile video application on error-prone networks with packet loss]
PBPAIR: Probability-Based Power Aware Intra Refresh

65 Related Work
Energy/QoS-aware video encoding (does not consider error resilience):
  Video encoding parameters [Mohapatra, IPDPS05]
  Motion estimation algorithm [Tourapis, VCIP00]
  Integrated power management [Mohapatra, ACM MM03]
  Global cross-layer adaptation [Yuan, MMCN04]
  Transmission power and QoS [Eisenberg, IEEE Trans. on CSVT 02]
Error-resilient video encoding (passive error exploitation):
  Error-resilient GOP [Yang, JVCIP07]
  AIR (Adaptive Intra Refreshing) [Worrall, ICASSP01]
  PGOP (Progressive GOP) [Cheng, PCS04]
  PBPAIR (Probability-Based Power Aware Intra Refresh) [Kim, MCCR06]
Our solution:
  Error-aware video encoding: exploits errors actively to minimize energy consumption

66 EE-PBPAIR

67 Experimental Setup

68 Experimental Results: Energy Reduction
Energy saving occurs at every component in the path from encoding to decoding in mobile video applications.
EC = Energy Consumption; Enc EC = EC for Encoding; Tx EC = EC for Transmission; Dec EC = EC for Decoding; Rx EC = EC for Receiving
PLR = 10% and EIR = 10%
PSNR: Peak Signal to Noise Ratio

69 Experimental Results: Expanded Tradeoff Space

70 Experimental Energy Saving
Source EC = Enc EC + Tx EC
Destination EC = Rx EC + Dec EC

71 Experimental Results: Adaptive EIR
The feedback-based approach (Adaptive EE-PBPAIR) maintains the required video quality, compared to Static EE-PBPAIR.

72 Adaptive EIR

73 Conclusion
Studied two main cross-layer approaches: PPC and EAVE.
Demonstrated the effectiveness of our cooperative cross-layer approaches by exploiting error tolerance and error control schemes.
[Diagram labels: EIR feedback, tolerance, unequal protection, FLR, resilience, PLR, network]

74 Failure Rate

75 Video Quality

76 Memory Access Time (performance)

77 Future Direction
Cooperative approaches combining PPC and EAVE:
  A middleware-driven cross-layer approach manages error control schemes.
  Translate errors to exploit existing approaches at other abstraction layers.
Apply our approach to other components: instruction caches and logic.
Intelligent frame dropping techniques, to maximize the energy saving while minimizing the quality degradation.
[Diagram labels: PPC, EAVE, EIR feedback, unequal protection, FLR, SER, tolerance, resilience, PLR]

78 Thesis Outline
The thesis proposes a cross-layer method: exploit errors and error control schemes across layers to maximize reliability with minimal costs for mobile embedded systems.
Topic 1: approach at the hardware and application layers
  PPC (unequal data protection at hardware, exploiting error tolerance at application) [Lee, CASES06][Lee, DIPES08][Lee, TVLSI08]
Topic 2: approach at the application, middleware, and network layers
  EAVE (intentional exploitation of errors at application, incorporating error resilience in networks) [Lee, DIPES08]
Topic 3: approach across application, middleware/OS, and hardware
  CC-PROTECT (middleware-driven cooperative exploitation of errors and error control schemes across layers) [Lee, ACM MM 08]
[Diagram: Application, Middleware/OS, Hardware, and Network layers]

79 References (cross-layers and tools)
[Bajic, 07] I. V. Bajic. Efficient cross-layer error control for wireless video multicast. 53(1): , Mar
[Dynamo] DYNAMO. Power Aware Middleware for Distributed Mobile Computing. University of California at Irvine,
[Forge] FORGE Project. A Framework for Optimization of Distributed Embedded Systems Software. University of California at Irvine,
[Grace] GRACE Project. Global Resource Adaptation through CoopEration. University of Illinois at Urbana-Champaign,
[Kim, 08] M. Kim, N. Dutt, N. Venkatasubramanian, and C. Talcott. xtune: Online verifiable cross-layer adaptation for distributed real-time embedded systems. ACM SIGBED Review: Special Issue on the RTSS Forum on Deeply Embedded Real-Time Computing, 5(1), Jan
[Mohapatra, 03] S. Mohapatra, R. Cornea, N. Dutt, A. Nicolau, and N. Venkatasubramanian. Integrated power management for video streaming to mobile handheld devices. In ACM international conference on Multimedia,
[Mohapatra, 05] S. Mohapatra, R. Cornea, H. Oh, K. Lee, M. Kim, N. Dutt, R. Gupta, A. Nicolau, S. Shukla, and N. Venkatasubramanian. A cross-layer approach for power-performance optimization in distributed mobile systems. In Next Generation Software Program in conjunction with IPDPS, page 218.1, April
[Shivakumar, 01] P. Shivakumar and N. Jouppi. CACTI 3.0: An Integrated Cache Timing, Power, and Area Model. In WRL Technical Report 2001/2,
[Synopsys] Synopsys Inc., Mountain View, CA, USA. Design Compiler Reference Manual,
[Schaar, 07] M. van der Schaar and D. S. Turaga. Cross-layer packetization and retransmission strategies for delay-sensitive wireless multimedia transmission. IEEE Transactions on Multimedia, 9(1): , Jan
[Vuran, 06] M. C. Vuran and I. F. Akyildiz. Cross-layer analysis of error control in wireless sensor networks. In IEEE Communications Society on Sensor and Ad Hoc Communications and Networks (SECON), pages , Sep
[Yuan, 03] W. Yuan and K. Nahrstedt. Energy-efficient soft real-time CPU scheduling for mobile multimedia systems. 37(5): , Dec
[Yuan, 04] W. Yuan and K. Nahrstedt. Practical voltage scaling for mobile multimedia devices. In ACM international conference on Multimedia, pages , 2004.

80 References (soft errors and reliability)
[Baumann, 05] R. Baumann. Soft errors in advanced computer systems. IEEE Design and Test of Computers, pages ,
[Hazucha, 00] P. Hazucha and C. Svensson. Impact of CMOS technology scaling on the atmospheric neutron soft error rate. IEEE Trans. on Nuclear Science, 47(6): ,
[Li, 05] J.-F. Li and Y.-J. Huang. An error detection and correction scheme for RAMs with partial-write function. In IEEE International Workshop on Memory Technology, Design and Testing (MTDT), pages ,
[Li, 04] L. Li, V. Degalahal, N. Vijaykrishnan, M. Kandemir, and M. J. Irwin. Soft error and energy consumption interactions: A data cache perspective. In ISLPED, Aug
[Mastipuram, 04] R. Mastipuram and E. C. Wee. Soft Errors Impact on System Reliability. Sep
[Phelan, 03] R. Phelan. Addressing soft errors in arm core-based designs. Technical report, ARM,
[Pradhan, 96] D. K. Pradhan. Fault-Tolerant Computer System Design. Prentice Hall, ISBN
[Shrivastava, 05] A. Shrivastava, I. Issenin, and N. Dutt. Compilation techniques for energy reduction in horizontally partitioned cache architectures. In CASES, pages 90-96,
[Wrobel, 01] F. Wrobel, J. M. Palau, M. C. Calvet, O. Bersillon, and H. Duarte. Simulation of nucleon-induced nuclear reactions in a simplified SRAM structure: Scaling effects on SEU and MBU cross sections. IEEE Trans. on Nuclear Science, 48(6),
[Xu, 96] J. Xu and B. Randell. Roll-forward error recovery in embedded real-time systems. In ICPADS, page 414,
[Nieuwland, 06] A. K. Nieuwland, S. Jasarevic, and G. Jerin. Combinational Logic Soft Error Analysis and Protection. In IOLTS06, 2006.

81 References (error-resilient encoding, etc.)
[Cheng, 04] L. Cheng and M. E. Zarki. PGOP: An error resilient techniques for low bit rate and low latency video communications. In Picture Coding Symposium (PCS), Dec
[Kim, 06] M. Kim, H. Oh, N. Dutt, A. Nicolau, and N. Venkatasubramanian. PBPAIR: An energy-efficient error-resilient encoding using probability based power aware intra refresh. ACM SIGMOBILE Mobile Computing and Communications Review, 10(3):58-69, July
[Wang, 98] Y. Wang and Q.-F. Zhu. Error control and concealment for video communication: A review. 86(5): , May
[Worrall, 01] S. Worrall, A. Sadka, P. Sweeney, and A. Kondoz. Motion adaptive error resilient encoding for MPEG-4. In ICASSP, May 2001.

Cross-Layer Interactions of Error Control Schemes in Mobile Multimedia Systems

Cross-Layer Interactions of Error Control Schemes in Mobile Multimedia Systems Center for Embedded Computer Systems University of California, Irvine Cross-Layer Interactions of Error Control Schemes in Mobile Multimedia Systems Kyoungwoo Lee, Aviral Shrivastava, Minyoung Kim, Nikil

More information

Error-Exploiting Video Encoder to Extend Energy/QoS Tradeoffs for Mobile Embedded Systems

Error-Exploiting Video Encoder to Extend Energy/QoS Tradeoffs for Mobile Embedded Systems Error-Exploiting Video Encoder to Extend Energy/QoS Tradeoffs for Mobile Embedded Systems Kyoungwoo Lee, Minyoung Kim, Nikil Dutt, and Nalini Venkatasubramanian Abstract Energy/QoS provisioning is a challenging

More information

Data Partitioning Techniques for Partially Protected Caches to Reduce Soft Error Induced Failures

Data Partitioning Techniques for Partially Protected Caches to Reduce Soft Error Induced Failures Data Partitioning Techniques for Partially Protected Caches to Reduce Soft Error Induced Failures Kyoungwoo Lee, Aviral Shrivastava, Nikil Dutt, and Nalini Venkatasubramanian Abstract Exponentially increasing

More information

Mitigating Soft Error Failures for Multimedia Applications by Selective Data Protection

Mitigating Soft Error Failures for Multimedia Applications by Selective Data Protection Mitigating Soft Error Failures for Multimedia Applications by Selective Data Protection Kyoungwoo Lee 1, Aviral Shrivastava 2, Ilya Issenin 1, Nikil Dutt 1, and Nalini Venkatasubramanian 1 1 Department

More information

For the estimation we use the following equations (1) and (2), adapted from [5]: T (n) = n k (t load +t store ) +t sort unit(n) (1)

For the estimation we use the following equations (1) and (2), adapted from [5]: T (n) = n k (t load +t store ) +t sort unit(n) (1) 20 Rui Marcelino, Horácio Neto, and João M. P. Cardoso As previously referred, when the number of data elements to be sorted surpasses the number of elements sorted by each execution of the sorting unit,

More information

Partially Protected Caches to Reduce Failures due to Soft Errors in Multimedia Applications

Partially Protected Caches to Reduce Failures due to Soft Errors in Multimedia Applications Partially Protected Caches to Reduce Failures due to Soft Errors in Multimedia Applications Kyoungwoo Lee 1, Aviral Shrivastava 2, Ilya Issenin 1, Nikil Dutt 1, and Nalini Venkatasubramanian 1 1 Department

More information

Compiler-Assisted Soft Error Correction by Duplicating Instructions for VLIW Architecture

Compiler-Assisted Soft Error Correction by Duplicating Instructions for VLIW Architecture R1-11 SASIMI 2012 Proceedings Compiler-Assisted Soft Error Correction by Duplicating Instructions for VLIW Architecture Yunrong Li 1, Jongwon Lee 1, Yohan Ko 2, Kyoungwoo Lee 2, and Yunheung Paek 1 1 School

More information

Outline. Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication. Outline

Outline. Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication. Outline Parity-based ECC and Mechanism for Detecting and Correcting Soft Errors in On-Chip Communication Khanh N. Dang and Xuan-Tu Tran Email: khanh.n.dang@vnu.edu.vn VNU Key Laboratory for Smart Integrated Systems

More information

Area-Efficient Error Protection for Caches

Area-Efficient Error Protection for Caches Area-Efficient Error Protection for Caches Soontae Kim Department of Computer Science and Engineering University of South Florida, FL 33620 sookim@cse.usf.edu Abstract Due to increasing concern about various

More information

Constraint Refinement for Online Verifiable Cross-Layer System Adaptation

Constraint Refinement for Online Verifiable Cross-Layer System Adaptation Constraint Refinement for Online Verifiable Cross-Layer System Adaptation Minyoung Kim, Mark-Oliver Stehr 2, Carolyn Talcott 2, Nikil Dutt, Nalini Venkatasubramanian School of Information and Computer

More information

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 1/26 ! Real-time Video Transmission! Challenges and Opportunities! Lessons Learned for Real-time Video! Mitigating Losses in Scalable

More information

Soft Error and Energy Consumption Interactions: A Data Cache Perspective

Soft Error and Energy Consumption Interactions: A Data Cache Perspective Soft Error and Energy Consumption Interactions: A Data Cache Perspective Lin Li, Vijay Degalahal, N. Vijaykrishnan, Mahmut Kandemir, Mary Jane Irwin Department of Computer Science and Engineering Pennsylvania

More information

Answers to comments from Reviewer 1

Answers to comments from Reviewer 1 Answers to comments from Reviewer 1 Question A-1: Though I have one correction to authors response to my Question A-1,... parity protection needs a correction mechanism (e.g., checkpoints and roll-backward

More information

Fault Tolerance. The Three universe model

Fault Tolerance. The Three universe model Fault Tolerance High performance systems must be fault-tolerant: they must be able to continue operating despite the failure of a limited subset of their hardware or software. They must also allow graceful

More information

RELIABILITY and RELIABLE DESIGN. Giovanni De Micheli Centre Systèmes Intégrés

RELIABILITY and RELIABLE DESIGN. Giovanni De Micheli Centre Systèmes Intégrés RELIABILITY and RELIABLE DESIGN Giovanni Centre Systèmes Intégrés Outline Introduction to reliable design Design for reliability Component redundancy Communication redundancy Data encoding and error correction

More information

Fault Tolerant Computing CS 530

Fault Tolerant Computing CS 530 Fault Tolerant Computing CS 530 Lecture Notes 1 Introduction to the class Yashwant K. Malaiya Colorado State University 1 Instructor, TA Instructor: Yashwant K. Malaiya, Professor malaiya @ cs.colostate.edu

More information

Analysis of Soft Error Mitigation Techniques for Register Files in IBM Cu-08 90nm Technology

Analysis of Soft Error Mitigation Techniques for Register Files in IBM Cu-08 90nm Technology Analysis of Soft Error Mitigation Techniques for s in IBM Cu-08 90nm Technology Riaz Naseer, Rashed Zafar Bhatti, Jeff Draper Information Sciences Institute University of Southern California Marina Del

More information

Reliable Architectures

Reliable Architectures 6.823, L24-1 Reliable Architectures Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology 6.823, L24-2 Strike Changes State of a Single Bit 10 6.823, L24-3 Impact

More information

FAULT TOLERANT SYSTEMS

FAULT TOLERANT SYSTEMS FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 18 Chapter 7 Case Studies Part.18.1 Introduction Illustrate practical use of methods described previously Highlight fault-tolerance

More information

Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing

Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing Error Mitigation of Point-to-Point Communication for Fault-Tolerant Computing Authors: Robert L Akamine, Robert F. Hodson, Brock J. LaMeres, and Robert E. Ray www.nasa.gov Contents Introduction to the

More information

Parallel Streaming Computation on Error-Prone Processors. Yavuz Yetim, Margaret Martonosi, Sharad Malik

Parallel Streaming Computation on Error-Prone Processors. Yavuz Yetim, Margaret Martonosi, Sharad Malik Parallel Streaming Computation on Error-Prone Processors Yavuz Yetim, Margaret Martonosi, Sharad Malik Upsets/B muons/mb Average Number of Dopant Atoms Hardware Errors on the Rise Soft Errors Due to Cosmic

More information

Quality-Assured Energy Balancing for Multi-hop Wireless Multimedia Networks via 2-D Channel Coding Rate Allocation

Quality-Assured Energy Balancing for Multi-hop Wireless Multimedia Networks via 2-D Channel Coding Rate Allocation Quality-Assured Energy Balancing for Multi-hop Wireless Multimedia Networks via 2-D Channel Coding Rate Allocation Lin Xing, Wei Wang, Gensheng Zhang Electrical Engineering and Computer Science, South

More information

By Charvi Dhoot*, Vincent J. Mooney &,

By Charvi Dhoot*, Vincent J. Mooney &, By Charvi Dhoot*, Vincent J. Mooney &, -Shubhajit Roy Chowdhury*, Lap Pui Chau # *International Institute of Information Technology, Hyderabad, India & School of Electrical and Computer Engineering, Georgia

More information

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE 5359 Gaurav Hansda 1000721849 gaurav.hansda@mavs.uta.edu Outline Introduction to H.264 Current algorithms for

More information

An Allocation Optimization Method for Partially-reliable Scratch-pad Memory in Embedded Systems

An Allocation Optimization Method for Partially-reliable Scratch-pad Memory in Embedded Systems [DOI: 10.2197/ipsjtsldm.8.100] Short Paper An Allocation Optimization Method for Partially-reliable Scratch-pad Memory in Embedded Systems Takuya Hatayama 1,a) Hideki Takase 1 Kazuyoshi Takagi 1 Naofumi

More information

A Partial Memory Protection Scheme for Higher Effective Yield of Embedded Memory for Video Data

A Partial Memory Protection Scheme for Higher Effective Yield of Embedded Memory for Video Data A Partial Protection Scheme for Higher Effective Yield of Embedded for Video Data Kang Yi1, Shih-Yang Cheng2, Fadi Kurdahi2, and Ahmed Eltawil2 1 School of Computer Sci. and Electrical Eng., Handong Global

More information

HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES

HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES HDL IMPLEMENTATION OF SRAM BASED ERROR CORRECTION AND DETECTION USING ORTHOGONAL LATIN SQUARE CODES (1) Nallaparaju Sneha, PG Scholar in VLSI Design, (2) Dr. K. Babulu, Professor, ECE Department, (1)(2)

More information

A Framework for Video Streaming to Resource- Constrained Terminals

A Framework for Video Streaming to Resource- Constrained Terminals A Framework for Video Streaming to Resource- Constrained Terminals Dmitri Jarnikov 1, Johan Lukkien 1, Peter van der Stok 1 Dept. of Mathematics and Computer Science, Eindhoven University of Technology

More information

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007 TU Wien 1 Fault Isolation and Error Containment in the TT-SoC H. Kopetz TU Wien July 2007 This is joint work with C. El.Salloum, B.Huber and R.Obermaisser Outline 2 Introduction The Concept of a Distributed

More information

CS 470 Spring Fault Tolerance. Mike Lam, Professor. Content taken from the following:

CS 470 Spring Fault Tolerance. Mike Lam, Professor. Content taken from the following: CS 47 Spring 27 Mike Lam, Professor Fault Tolerance Content taken from the following: "Distributed Systems: Principles and Paradigms" by Andrew S. Tanenbaum and Maarten Van Steen (Chapter 8) Various online

More information

ARCHITECTURE DESIGN FOR SOFT ERRORS

ARCHITECTURE DESIGN FOR SOFT ERRORS ARCHITECTURE DESIGN FOR SOFT ERRORS Shubu Mukherjee ^ШВпШшр"* AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO T^"ТГПШГ SAN FRANCISCO SINGAPORE SYDNEY TOKYO ^ P f ^ ^ ELSEVIER Morgan

More information

Energy-aware Fault-tolerant and Real-time Wireless Sensor Network for Control System

Energy-aware Fault-tolerant and Real-time Wireless Sensor Network for Control System Energy-aware Fault-tolerant and Real-time Wireless Sensor Network for Control System Thesis Proposal Wenchen Wang Computer Science, University of Pittsburgh Committee: Dr. Daniel Mosse, Computer Science,

More information

Software-based Fault Tolerance Mission (Im)possible?

Software-based Fault Tolerance Mission (Im)possible? Software-based Fault Tolerance Mission Im)possible? Peter Ulbrich The 29th CREST Open Workshop on Software Redundancy November 18, 2013 System Software Group http://www4.cs.fau.de Embedded Systems Initiative

More information

Fault Tolerant Parallel Filters Based on ECC Codes

Fault Tolerant Parallel Filters Based on ECC Codes Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 11, Number 7 (2018) pp. 597-605 Research India Publications http://www.ripublication.com Fault Tolerant Parallel Filters Based on

More information

Issues in Programming Language Design for Embedded RT Systems

Issues in Programming Language Design for Embedded RT Systems CSE 237B Fall 2009 Issues in Programming Language Design for Embedded RT Systems Reliability and Fault Tolerance Exceptions and Exception Handling Rajesh Gupta University of California, San Diego ES Characteristics

More information

Multi-path Forward Error Correction Control Scheme with Path Interleaving

Multi-path Forward Error Correction Control Scheme with Path Interleaving Multi-path Forward Error Correction Control Scheme with Path Interleaving Ming-Fong Tsai, Chun-Yi Kuo, Chun-Nan Kuo and Ce-Kuen Shieh Department of Electrical Engineering, National Cheng Kung University,

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May-2013 ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 5, May-2013 ISSN 255 CORRECTIONS TO FAULT SECURE OF MAJORITY LOGIC DECODER AND DETECTOR FOR MEMORY APPLICATIONS Viji.D PG Scholar Embedded Systems Prist University, Thanjuvr - India Mr.T.Sathees Kumar AP/ECE Prist University,

More information

Lecture 8 Wireless Sensor Networks: Overview

Lecture 8 Wireless Sensor Networks: Overview Lecture 8 Wireless Sensor Networks: Overview Reading: Wireless Sensor Networks, in Ad Hoc Wireless Networks: Architectures and Protocols, Chapter 12, sections 12.1-12.2. I. Akyildiz, W. Su, Y. Sankarasubramaniam

More information

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP 1 M.DEIVAKANI, 2 D.SHANTHI 1 Associate Professor, Department of Electronics and Communication Engineering PSNA College

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more

More information

NEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect

NEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect 1 A Soft Tolerant Network-on-Chip Router Pipeline for Multi-core Systems Pavan Poluri and Ahmed Louri Department of Electrical and Computer Engineering, University of Arizona Email: pavanp@email.arizona.edu,

More information

Delay Constrained ARQ Mechanism for MPEG Media Transport Protocol Based Video Streaming over Internet

Delay Constrained ARQ Mechanism for MPEG Media Transport Protocol Based Video Streaming over Internet Delay Constrained ARQ Mechanism for MPEG Media Transport Protocol Based Video Streaming over Internet Hong-rae Lee, Tae-jun Jung, Kwang-deok Seo Division of Computer and Telecommunications Engineering

More information

Exploiting Unused Spare Columns to Improve Memory ECC

Exploiting Unused Spare Columns to Improve Memory ECC 2009 27th IEEE VLSI Test Symposium Exploiting Unused Spare Columns to Improve Memory ECC Rudrajit Datta and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer Engineering

More information

Ultra Low-Cost Defect Protection for Microprocessor Pipelines

Ultra Low-Cost Defect Protection for Microprocessor Pipelines Ultra Low-Cost Defect Protection for Microprocessor Pipelines Smitha Shyam Kypros Constantinides Sujay Phadke Valeria Bertacco Todd Austin Advanced Computer Architecture Lab University of Michigan Key

More information

OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden)

OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden) OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING Björn Döbel (TU Dresden) Brussels, 02.02.2013 Hardware Faults Radiation-induced soft errors Mainly an issue in avionics+space 1 DRAM errors in large

More information

Implementation of single bit Error detection and Correction using Embedded hamming scheme

Implementation of single bit Error detection and Correction using Embedded hamming scheme Implementation of single bit Error detection and Correction using Embedded hamming scheme Anoop HK 1, Subodh kumar panda 2 and Vasudeva G 1 M.tech(VLSI & ES), BNMIT, Bangalore 2 Assoc Prof,Dept of ECE,

More information

ECC Protection in Software

ECC Protection in Software Center for RC eliable omputing ECC Protection in Software by Philip P Shirvani RATS June 8, 1999 Outline l Motivation l Requirements l Coding Schemes l Multiple Error Handling l Implementation in ARGOS

More information

Energy-Aware MPEG-4 4 FGS Streaming

Energy-Aware MPEG-4 4 FGS Streaming Energy-Aware MPEG-4 4 FGS Streaming Kihwan Choi and Massoud Pedram University of Southern California Kwanho Kim Seoul National University Outline! Wireless video streaming! Scalable video coding " MPEG-2

More information

EDAC FOR MEMORY PROTECTION IN ARM PROCESSOR

EDAC FOR MEMORY PROTECTION IN ARM PROCESSOR EDAC FOR MEMORY PROTECTION IN ARM PROCESSOR Mrs. A. Ruhan Bevi ECE department, SRM, Chennai, India. Abstract: The ARM processor core is a key component of many successful 32-bit embedded systems. Embedded

More information

Adaptive Middleware for Distributed Sensor Environments

Adaptive Middleware for Distributed Sensor Environments Adaptive Middleware for Distributed Sensor Environments Xingbo Yu, Koushik Niyogi, Sharad Mehrotra, Nalini Venkatasubramanian University of California, Irvine {xyu, kniyogi, sharad, nalini}@ics.uci.edu

More information

Fault-tolerant techniques

Fault-tolerant techniques What are the effects if the hardware or software is not fault-free in a real-time system? What causes component faults? Specification or design faults: Incomplete or erroneous models Lack of techniques

More information

SNR Scalable Transcoding for Video over Wireless Channels

SNR Scalable Transcoding for Video over Wireless Channels SNR Scalable Transcoding for Video over Wireless Channels Yue Yu Chang Wen Chen Department of Electrical Engineering University of Missouri-Columbia Columbia, MO 65211 Email: { yyu,cchen} @ee.missouri.edu

More information

Low Power Cache Design. Angel Chen Joe Gambino

Low Power Cache Design. Angel Chen Joe Gambino Low Power Cache Design Angel Chen Joe Gambino Agenda Why is low power important? How does cache contribute to the power consumption of a processor? What are some design challenges for low power caches?

More information

Fault Tolerance. Distributed Systems IT332

Fault Tolerance. Distributed Systems IT332 Fault Tolerance Distributed Systems IT332 2 Outline Introduction to fault tolerance Reliable Client Server Communication Distributed commit Failure recovery 3 Failures, Due to What? A system is said to

More information

ECE 574 Cluster Computing Lecture 19

ECE 574 Cluster Computing Lecture 19 ECE 574 Cluster Computing Lecture 19 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 10 November 2015 Announcements Projects HW extended 1 MPI Review MPI is *not* shared memory

More information

Dep. Systems Requirements

Dep. Systems Requirements Dependable Systems Dep. Systems Requirements Availability the system is ready to be used immediately. A(t) = probability system is available for use at time t MTTF/(MTTF+MTTR) If MTTR can be kept small

More information

I/O CANNOT BE IGNORED

I/O CANNOT BE IGNORED LECTURE 13 I/O I/O CANNOT BE IGNORED Assume a program requires 100 seconds, 90 seconds for main memory, 10 seconds for I/O. Assume main memory access improves by ~10% per year and I/O remains the same.

More information

Subject: Adhoc Networks

Subject: Adhoc Networks ISSUES IN AD HOC WIRELESS NETWORKS The major issues that affect the design, deployment, & performance of an ad hoc wireless network system are: Medium Access Scheme. Transport Layer Protocol. Routing.

More information

416 Distributed Systems. Errors and Failures Oct 16, 2018

416 Distributed Systems. Errors and Failures Oct 16, 2018 416 Distributed Systems Errors and Failures Oct 16, 2018 Types of Errors Hard errors: The component is dead. Soft errors: A signal or bit is wrong, but it doesn t mean the component must be faulty Note:

More information

Redundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992

Redundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 Redundancy in fault tolerant computing D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 1 Redundancy Fault tolerance computing is based on redundancy HARDWARE REDUNDANCY Physical

More information

Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 13:

More information

ReSpace/MAPLD Conference Albuquerque, NM, August A Fault-Handling Methodology by Promoting Hardware Configurations via PageRank

ReSpace/MAPLD Conference Albuquerque, NM, August A Fault-Handling Methodology by Promoting Hardware Configurations via PageRank ReSpace/MAPLD Conference Albuquerque, NM, August 2011. A Fault-Handling Methodology by Promoting Hardware Configurations via PageRank Naveed Imran and Ronald F. DeMara Department of Electrical Engineering

More information

Redundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992

Redundancy in fault tolerant computing. D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 Redundancy in fault tolerant computing D. P. Siewiorek R.S. Swarz, Reliable Computer Systems, Prentice Hall, 1992 1 Redundancy Fault tolerance computing is based on redundancy HARDWARE REDUNDANCY Physical

More information

Remote Health Monitoring for an Embedded System

Remote Health Monitoring for an Embedded System July 20, 2012 Remote Health Monitoring for an Embedded System Authors: Puneet Gupta, Kundan Kumar, Vishnu H Prasad 1/22/2014 2 Outline Background Background & Scope Requirements Key Challenges Introduction

More information

An Approach for Adaptive DRAM Temperature and Power Management

An Approach for Adaptive DRAM Temperature and Power Management IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 1 An Approach for Adaptive DRAM Temperature and Power Management Song Liu, Yu Zhang, Seda Ogrenci Memik, and Gokhan Memik Abstract High-performance

More information

A Low-Cost Correction Algorithm for Transient Data Errors

A Low-Cost Correction Algorithm for Transient Data Errors A Low-Cost Correction Algorithm for Transient Data Errors Aiguo Li, Bingrong Hong School of Computer Science and Technology Harbin Institute of Technology, Harbin 150001, China liaiguo@hit.edu.cn Introduction

More information

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS

ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS ERROR-ROBUST INTER/INTRA MACROBLOCK MODE SELECTION USING ISOLATED REGIONS Ye-Kui Wang 1, Miska M. Hannuksela 2 and Moncef Gabbouj 3 1 Tampere International Center for Signal Processing (TICSP), Tampere,

More information

Multiple Event Upsets Aware FPGAs Using Protected Schemes

Multiple Event Upsets Aware FPGAs Using Protected Schemes Multiple Event Upsets Aware FPGAs Using Protected Schemes Costas Argyrides, Dhiraj K. Pradhan University of Bristol, Department of Computer Science Merchant Venturers Building, Woodland Road, Bristol,

More information

File systems CS 241. May 2, University of Illinois

File systems CS 241. May 2, University of Illinois File systems CS 241 May 2, 2014 University of Illinois 1 Announcements Finals approaching, know your times and conflicts Ours: Friday May 16, 8-11 am Inform us by Wed May 7 if you have to take a conflict

More information

Energy-Efficient Cooperative Communication In Clustered Wireless Sensor Networks

Energy-Efficient Cooperative Communication In Clustered Wireless Sensor Networks Energy-Efficient Cooperative Communication In Clustered Wireless Sensor Networks Reza Aminzadeh Electrical Engineering Department Khavaran Higher Education Institute Mashhad, Iran. reza.aminzadeh@ieee.com

More information

NETWORKS on CHIP A NEW PARADIGM for SYSTEMS on CHIPS DESIGN

NETWORKS on CHIP A NEW PARADIGM for SYSTEMS on CHIPS DESIGN NETWORKS on CHIP A NEW PARADIGM for SYSTEMS on CHIPS DESIGN Giovanni De Micheli Luca Benini CSL - Stanford University DEIS - Bologna University Electronic systems Systems on chip are everywhere Technology

More information

Eliminating Single Points of Failure in Software Based Redundancy

Eliminating Single Points of Failure in Software Based Redundancy Eliminating Single Points of Failure in Software Based Redundancy Peter Ulbrich, Martin Hoffmann, Rüdiger Kapitza, Daniel Lohmann, Reiner Schmid and Wolfgang Schröder-Preikschat EDCC May 9, 2012 SYSTEM

More information

4G WIRELESS VIDEO COMMUNICATIONS

4G WIRELESS VIDEO COMMUNICATIONS 4G WIRELESS VIDEO COMMUNICATIONS Haohong Wang Marvell Semiconductors, USA Lisimachos P. Kondi University of Ioannina, Greece Ajay Luthra Motorola, USA Song Ci University of Nebraska-Lincoln, USA WILEY

More information

Channel-Adaptive Error Protection for Scalable Audio Streaming over Wireless Internet

Channel-Adaptive Error Protection for Scalable Audio Streaming over Wireless Internet Channel-Adaptive Error Protection for Scalable Audio Streaming over Wireless Internet GuiJin Wang Qian Zhang Wenwu Zhu Jianping Zhou Department of Electronic Engineering, Tsinghua University, Beijing,

More information

Distributed Rate Allocation for Video Streaming over Wireless Networks. Wireless Home Video Networking

Distributed Rate Allocation for Video Streaming over Wireless Networks. Wireless Home Video Networking Ph.D. Oral Defense Distributed Rate Allocation for Video Streaming over Wireless Networks Xiaoqing Zhu Tuesday, June, 8 Information Systems Laboratory Stanford University Wireless Home Video Networking

More information

Multi-level Fault Tolerance in 2D and 3D Networks-on-Chip

Multi-level Fault Tolerance in 2D and 3D Networks-on-Chip Multi-level Fault Tolerance in 2D and 3D Networks-on-Chip Claudia usu Vladimir Pasca Lorena Anghel TIMA Laboratory Grenoble, France Outline Introduction Link Level outing Level Application Level Conclusions

More information

STLAC: A Spatial and Temporal Locality-Aware Cache and Networkon-Chip

STLAC: A Spatial and Temporal Locality-Aware Cache and Networkon-Chip STLAC: A Spatial and Temporal Locality-Aware Cache and Networkon-Chip Codesign for Tiled Manycore Systems Mingyu Wang and Zhaolin Li Institute of Microelectronics, Tsinghua University, Beijing 100084,

More information

Distributed Systems COMP 212. Lecture 19 Othon Michail

Distributed Systems COMP 212. Lecture 19 Othon Michail Distributed Systems COMP 212 Lecture 19 Othon Michail Fault Tolerance 2/31 What is a Distributed System? 3/31 Distributed vs Single-machine Systems A key difference: partial failures One component fails

More information

MITIGATING THE EFFECT OF PACKET LOSSES ON REAL-TIME VIDEO STREAMING USING PSNR AS VIDEO QUALITY ASSESSMENT METRIC ABSTRACT

MITIGATING THE EFFECT OF PACKET LOSSES ON REAL-TIME VIDEO STREAMING USING PSNR AS VIDEO QUALITY ASSESSMENT METRIC ABSTRACT MITIGATING THE EFFECT OF PACKET LOSSES ON REAL-TIME VIDEO STREAMING USING PSNR AS VIDEO QUALITY ASSESSMENT METRIC Anietie Bassey, Kufre M. Udofia & Mfonobong C. Uko Department of Electrical/Electronic

More information

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure

Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Reliability of Memory Storage System Using Decimal Matrix Code and Meta-Cure Iswarya Gopal, Rajasekar.T, PG Scholar, Sri Shakthi Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India Assistant

More information

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 VL-ECC: Variable Data-Length Error Correction Code for Embedded Memory in DSP Applications Jangwon Park,

More information

ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS

ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS ADAPTIVE PICTURE SLICING FOR DISTORTION-BASED CLASSIFICATION OF VIDEO PACKETS E. Masala, D. Quaglia, J.C. De Martin Λ Dipartimento di Automatica e Informatica/ Λ IRITI-CNR Politecnico di Torino, Italy

More information

Adaptation of Scalable Video Coding to Packet Loss and its Performance Analysis

Adaptation of Scalable Video Coding to Packet Loss and its Performance Analysis Adaptation of Scalable Video Coding to Packet Loss and its Performance Analysis Euy-Doc Jang *, Jae-Gon Kim *, Truong Thang**,Jung-won Kang** *Korea Aerospace University, 100, Hanggongdae gil, Hwajeon-dong,

More information

Recommended Readings

Recommended Readings Lecture 11: Media Adaptation Scalable Coding, Dealing with Errors Some slides, images were from http://ip.hhi.de/imagecom_g1/savce/index.htm and John G. Apostolopoulos http://www.mit.edu/~6.344/spring2004

More information

CDA 5140 Software Fault-tolerance. - however, reliability of the overall system is actually a product of the hardware, software, and human reliability

CDA 5140 Software Fault-tolerance. - however, reliability of the overall system is actually a product of the hardware, software, and human reliability CDA 5140 Software Fault-tolerance - so far have looked at reliability as hardware reliability - however, reliability of the overall system is actually a product of the hardware, software, and human reliability

More information

AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors

AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors Computer Sciences Department University of Wisconsin Madison http://www.cs.wisc.edu/~ericro/ericro.html ericro@cs.wisc.edu High-Performance

More information

Introduction to Robust Systems

Introduction to Robust Systems Introduction to Robust Systems Subhasish Mitra Stanford University Email: subh@stanford.edu 1 Objective of this Talk Brainstorm What is a robust system? How can we build robust systems? Robust systems

More information












