Hardware Verification 2IMF20

Size: px
Start display at page:

Download "Hardware Verification 2IMF20"

Transcription

1 Hardware Verification 2IMF20 Julien Schmaltz Lecture 06: Verification of on-chip communications

2 Living a revolution It is a really exiting time to be in computing right now, because we are [experiencing] a major change. A change that happens maybe once every 20 or 30 years... a fundamental re-thinking of... what computation is William Dally, DAC keynote

3 Multi-core shift reality 3

4 Growing number of cores AMD Opteron 12 cores Sun Niagara3 16 cores Intel 8 cores ~1.8 Bill. T. on 2x3.46cm 2 ~1 Bill. T. on 3.7cm 2 ~2.3 Bill. T. on 6.8cm 2 Intel SCC 48 cores ~1.3 Bill. T. on 5.6cm 2 Intel 4 cores ~582 Mio. T. on 2.86cm 2 Intel Research 80 cores ~100 Mio. T. on 2.75cm 2 Tilera TILEPro64 64 cores Intel 2 cores ~167 Mio. T. on 1.1cm 2 4

5 80 Cores Research Chip (Intel) Teraflops, 62 Watts 100 millions transistors, 275 mm2 25% node area for router ASCI Red Supercomputer Teraflops (Dec. 1996) 10, 000 Pentium Pro 104 cabinets, 230 m2 5

6 It is only the beginning Source. IEEE Computers

7 Communications are key 80 core research chip about 25% area for the NoC about 30% power consumption for the NoC Communication fabrics key to performance and efficiency to functional correctness 7

8 Verification Challenges NoCs are very large systems Methods must scale up to 100s of agents Large number of parameters (routing, switching, buffers, etc.) Regular and irregular structures NoCs must be fault-tolerant Deep sub-micron effect Not all routers/processors are working Static and dynamic fault-models NoCs have intricate message dependencies Mix between interconnect and protocols e.g. cache coherency or master/slave Deadlocks can emerge from deadlock-free routing and protocols 8

9 Protocol'between' applica/ons' Applica/on'layer' Processing'nodes,' routers,'channels' Network'layer' Protocol'between' routers'' Data'link'layer'

10 NoC Example: Hermes 10

11 Application layer Masters send requests and wait for responses Slaves produce responses when receiving requests Deadlock-free protocol No cyclic message dependencies: 11

12 Network layer Deterministic simple routing algorithm First route to the destination column and then to the correct row No cyclic dependencies and thus deadlock-free 12

13 Link layer A packet moves if the next channel has free space p A packet moves if it has sufficiently credits p A packet is joined with another packet p p 13

14 Link layer A packet moves if the next channel has free space p 13

15 Deadlock-free? Deadlock-free application layer + Deadlock-free network layer + Deadlock-free link layer? = Deadlock-free system 14

16 All masters right, slaves left» Is the system deadlock-free? Master Slave 15

17 All masters right, slaves left» Is the system deadlock-free?» Yes! A deadlock would require a response to wait for a request Master Slave 16

18 Masters even columns, slaves odd» Is the system deadlock-free? Master Slave Slave Master Slave Master 17

19 Masters even columns, slaves odd» Is the system deadlock-free?» No if at least four columns, yes otherwise. Slave Slave Slave Master Master Master Purple requests are transformed into orange responses Blue requests are transformed into green responses 18

20 Masters even columns, slaves odd» Is the system deadlock-free?» No if at least four columns, yes otherwise. Slave Slave Slave Master Master Master Orange responses blocked by blue requests Green responses blocked by purple requests 19

21 Deadlock-free application layer + Deadlock-free network layer + Deadlock-free link layer? = Deadlock-free system 20

22 NoC Example (2) - Spidergon Design by STMicroelectronics 21

23 Application layer» Nodes send requests High-level protocol req! 22

24 Network layer Routing logic RelAd = (dest - current ) mod 4 * N if RelAd = 0 then stop elseif 0 < RelAd <= N then go clockwise elseif 3*N <= RelAd <= 4*N then go counter clockwise else go across endif 23

25 The network has a deadlock!» For instance, we can have a cycle of packets

26 Dividing in two networks» Is the system deadlock-free? Send packets Idle cores

27 Dividing in two networks» Is the system deadlock-free?» Yes! None of the dependencies in the right upper quarter occur Send packets Idle cores

28 Slaves in 1st quarter only?» Is the system deadlock-free? Send packets Idle cores

29 Slaves in 1st quarter only?» Is the system deadlock-free?» No! Send packets Idle cores

30 Slaves in 1st quarter only?» Is the system deadlock-free?» No! Send packets Idle cores

31 Slaves in 1st quarter only?» Is the system deadlock-free?» No! Send packets Idle cores

32 Slaves in 1st quarter only?» Is the system deadlock-free?» No! Send packets Idle cores

33 Slaves in 1st quarter only?» Is the system deadlock-free?» No! Send packets Idle cores

34 Deadlock-free application layer + Network layer with deadlocks + Deadlock-free link layer? = Deadlock-free system 31

35 Confusing» We need tools to (quickly) check for deadlocks taking details of all three layers into account in large systems 32

36 Outline» Intel's micro-architectural description language xmas language Capturing high-level structure and message dependencies Extended to MaDL at TU/e. Micro-architectural Description Language» Deadlock verification for MaDL Definition of deadlocks Labelled dependency graph Feasible logically closed subgraph» Conclusion and future work 33

37 Intel s abstraction for networks High-level of abstraction Exploit high-level structure Automatic proofs using invariant generation and hardware model-checking 34

38 xmas Executable Micro-Architectural Specification» Fair sinks and sometimes sources» Diagram is formal model» Friendly to microarchitects 35

39 MaDL (1) Channels with three signals data, input ready, target ready Transfer cycle both input and target are "true" Notion of a dead channel: F (c.irdy & G! c.trdy) {req,rsp} rsp {req} req 36

40 MaDL (2) param int QSIZE; textual input const pkt; chan src := Source(pkt); chan one, two := Fork(src); chan long := Queue(QSIZE,Queue(QSIZE, one)); chan short := Queue(QSIZE, two); Sink(CtrlJoin(long,short)); predicates and functions pred P (p : pkt_t, q : pkt_t ) { p.id == q.id }; function F (p : pkt_t) : pkt_t { flda = p.fldb; fldb = p.flda; }; structured recursive data types IO-FSM additional complex primitives LoadBalancer Joitch Multi-Match struct pkt0_t { flda : [2:0]; fldb : [3:0]; }; union pkt1_t { optiona : pkta_t; optionb : pktb_t; }; enum pkt2_t { optiona; optionb; }; And more: for-loops, if-then-else, uses, const optiona; 37

41 Composing modules via channels» Channels with three signals data, input ready, target ready» Transfer cycle both input and target are "true" 38

42 MaDL example req,rsp q 0 q 1 rsp req q 2 req» Two sources one for requests one for responses 39

43 MaDL example P req,rsp q 0 q 1 rsp req q 2 req» Two sources one for requests one for responses 39

44 MaDL example req,rsp q 0 q 1 P rsp req q 2 req» Two sources one for requests one for responses 39

45 MaDL example req,rsp req P q 0 q 1 P q 2 rsp req» Two sources one for requests one for responses 39

46 MaDL example req,rsp req q 0 q 1 q 2 P P rsp req» Two sources one for requests one for responses 39

47 MaDL example req,rsp q 0 q 1 P rsp req q 2 req» Two sources one for requests one for responses 39

48 Processing node for XY routing in a 2D-mesh L N S E h.x = X h.x X h.y = Y h.y Y h.x > X h.x < X h.y < YN h.y > Y S E W W

49 Processing node with requests and responses L (dst,src,req)! (src, _,rsp) req rsp N S E h.x = X h.x X h.y = Y h.y Y h.x > X h.x < X h.y < YN h.y > Y S E W W

50 A more complex processing nodes with virtual channels and credits

51 Outline» Intel's micro-architectural description language xmas language Capturing high-level structure and message dependencies Extended to MaDL at TU/e. Micro-architectural Description Language» Deadlock verification for MaDL Definition of deadlocks Labelled dependency graph Feasible logically closed subgraph» Conclusion and future work 43

52 Formal definition of "deadlock" in MaDL Intuition is a "dead" channel Formal definition based on Linear Temporal Logic Predicate logic Temporal operators "eventually" ( ) and "globally" ( ) Channel c is dead iff (c.irdy c.trdy)

53 MaDL example (c.irdy c.trdy) requests req,rsp q 0 q 1 rsp req q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

54 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0

55 MaDL example requests req,rsp q 0 q 1 rsp responses req x q 2 req Inject two requests in q0 Fork creates two copies

56 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

57 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

58 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

59 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

60 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk

61 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk Inject two responses in q0

62 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk Inject two responses in q0 q2 is permanently idle for responses, q1 is permanently blocking

63 MaDL example requests req,rsp q 0 q 1 rsp responses reqx q 2 req Inject two requests in q0 Fork creates two copies One pair is sunk Inject two responses in q0 q2 is permanently idle for responses, q1 is permanently blocking We have a deadlock without a circular wait!

64 General approach for deadlock detection in MaDL networks Define deadlock equations for all components Equations capture the reason why a component is idle or blocking Build a labelled waiting graph for each queue Labels correspond to the equations Graph captures the topology, i.e., the dependencies between the MaDL components Search for a feasible logically closed subgraph Corresponds to a deadlock situation Feasibility checked using Linear Programming This approach may output unreachable deadlocks A first step generates invariants to rule out false deadlocks Invariants are rather weak and simple - false deadlocks are in theory still possible

65 General approach for deadlock detection in xmas networks Define deadlock equations for all components Equations capture the reason why a component is idle or blocking Build a labelled waiting graph for each queue Labels correspond to the equations Graph captures the topology, i.e., the dependencies between the MaDL components Search for a feasible logically closed subgraph Corresponds to a deadlock situation Feasibility checked using Linear Programming This approach may output unreachable deadlocks A first step generates invariants to rule out false deadlocks Invariants are rather weak and simple - false deadlocks are in theory still possible

66 Deadlock equations for a channel Depends on the target component connected to the channel We look at the input port of the target component trdy data irdy Initiator u target

67 Deadlock equations for a join 2 cases output is blocked the other input is idle data irdy Block(u) = Idle(v) + Block(w) trdy u v w req

68 Deadlock equations for a join 2 cases output is blocked the other input is idle We need to know when a channel is idle! data irdy Block(u) = Idle(v) + Block(w) trdy u v w req

69 Idle equations for a channel Depends on the initiator component connected to the channel We are looking at the input port of the initiator data irdy trdy Initiator u target

70 Idle equations for a join A join is idle if one of the input channels is idle Idle(w) = Idle(u) + Idle(v) data irdy trdy u v w req

71 Idle equations for a fork A fork output is idle if the input is idle or the other output is blocked Idle(w) = Idle(u) + Block(v) data irdy trdy u v w req

72 Idle equations for a queue A queue is idle if it is empty and its input channel is idle This is for one message type which might be blocked by another type data Idle(w) = Empty(q). Idle(u) + Block(w') where w' is a message with a type different from w irdy trdy u w req

73 Our quest for "dead" queues Definition of a deadlock F (u.irdy => G~u.trdy) We look for a "dead" queue with a message in it (u.irdy) output blocked (G~u.trdy) Over approximation configuration not always reachable we may output false deadlocks

74 MaDL Verification: Overview Basic idea: Generate Invariants Generate SMT Problem - Express deadlock in SMT - Over-approximate deadlocks - Use invariants to rule-out false deadlocks - Reachability analysis of found deadlocks SMT Solver (Z3) (unsat) No deadlock! (sat) Generate SMV Model nuxmv Deadlock!

75 General approach for deadlock detection in MaDL networks Define deadlock equations for all components Equations capture the reason why a component is idle or blocking Build a labelled waiting graph for each queue Labels correspond to the equations Graph captures the topology, i.e., the dependencies between the MaDL components Search for a feasible logically closed subgraph Corresponds to a deadlock situation Feasibility checked using Linear Programming This approach may output unreachable deadlocks A first step generates invariants to rule out false deadlocks Invariants are rather weak and simple - false deadlocks are in theory still possible

76 Step 1 / simulation - req queues with req inject req req

77 Step 1 / simulation - req queues with req inject req req

78 Step 1 / simulation - req queues with req inject req req

79 Step 1 / simulation - rsp queues with req and rsp inject rsp req

80 Step 1 / simulation - rsp queues with req and rsp inject rsp req

81 Step 1 / simulation - rsp queues with req and rsp inject rsp req

82 General approach for deadlock detection in MaDL networks Define deadlock equations for all components Equations capture the reason why a component is idle or blocking Build a labelled waiting graph for each queue Labels correspond to the equations Graph captures the topology, i.e., the dependencies between the MaDL components Search for a feasible logically closed subgraph Corresponds to a deadlock situation Feasibility checked using Linear Programming This approach may output unreachable deadlocks A first step generates invariants to rule out false deadlocks Invariants are rather weak and simple - false deadlocks are in theory still possible

83 Step 2 / labelled dependency graph (1) start join req q1 q 1.req 1 join start with a message in q1 and visit the join

84 Step 2 / labelled dependency graph (2) start join u v w req sw mrg2 Block(u) = Idle(v) + Block(w) analyse the join according to its deadlock equation q1 q 1.req 1 join we go forward to the merge and backward to the switch + sw mrg2

85 Step 2 / labelled dependency graph (2) start req join sw u w v mrg2 Block(u) = Block(w) q1 q 1.req 1 join forwards to the switch - then the sink can never be blocked we assume fair sinks + sw mrg2 false sink

86 Step 2 / labelled dependency graph (2) start join req w v u mrg2 sw Idle(u) = Idle(w) backwards to the switch q1 q 1.req 1 join + sw mrg2 false sink

87 Step 2 / labelled dependency graph (2) start join w u req mrg2 sw Idle(u) = Idle(w). Empty(q2) backwards to the queue note that we forgot the Block(w') case q1 q2 q 1.req 1 join + sw q 2.rsp =0 mrg2 false sink

88 Step 2 / labelled dependency graph (2) start join req u v w u mrg2 sw Idle(w) = Idle(u). Idle(v) backwards to the merge and branch note branching is bad for us mrg1 q1 q2 q 1.req 1 join + sw q 2.rsp =0 mrg2 false sink

89 Step 2 / labelled dependency graph (2) start w v u u join req mrg2 Idle(u) = Block(v) + Idle(w) sw backwards to the merge and branch to the source - idle if no type produced to the fork q1 q 1.req 1 join src2. rsp 62 x frk mrg1 q2 + sw q 2.rsp =0 mrg2 false sink

90 Step 2 / labelled dependency graph (2) start join w u u req mrg2 Idle(u) = Idle(w). Empty(q0) sw backwards to q0 and the source src1 q0 q1 q 1.req 1 join false q 0.rsp =0. src2 rsp 62 x frk mrg1 q2 + sw q 2.rsp =0 mrg2 false sink

91 Step 2 / labelled dependency graph (2) start u w join req Block(u) = Block(w). Full(q1) sw mrg2 forwards back to q1 and stop expansion src1 false q0 q 0.rsp =0 src2. + rsp 62 x frk mrg1 q1 q 1 = q 1.size q2 q 1.req 1 join + sw q 2.rsp =0 mrg2 false sink

92 General approach for deadlock detection in MaDL networks Define deadlock equations for all components Equations capture the reason why a component is idle or blocking Build a labelled waiting graph for each queue Labels correspond to the equations Graph captures the topology, i.e., the dependencies between the MaDL components Search for a feasible logically closed subgraph Corresponds to a deadlock situation Feasibility checked using Linear Programming This approach may output unreachable deadlocks A first step generates invariants to rule out false deadlocks Invariants are rather weak and simple - false deadlocks are in theory still possible

93 Step 2 / logically closed subgraph 1 req src1 false q0 q 0.rsp =0 +. src2 rsp 62 x q1 q 1.req 1 join q 1 = q 1.size frk + sw q 2.rsp =0 mrg1 q2 mrg2 false sink

94 Step 2 / logically closed subgraph 1 req src1 false q0 q 0.rsp =0 +. src2 rsp 62 x q1 q 1.req 1 join q 1 = q 1.size frk + sw q 2.rsp =0 mrg1 q2 mrg2 false sink

95 Step 2 / logically closed subgraph 1 req src1 false q0 q 0.rsp =0 +. src2 rsp 62 x q1 q 1.req 1 join q 1 = q 1.size frk + sw q 2.rsp =0 mrg1 q2 mrg2 false sink not feasible

96 Step 2 / logically closed subgraph 2 req src1 false q0 q 0.rsp =0. src2 true + q1 q 1.req 1 join q 1 = q 1.size frk + sw q 2.rsp =0 mrg1 q2 mrg2 false sink

97 Step 2 / logically closed subgraph 2 req src1 false q0 q 0.rsp =0. src2 true + q1 q 1.req 1 join q 1 = q 1.size frk + sw q 2.rsp =0 mrg1 q2 mrg2 sink false q 1.req 1 q 1 = q 1.size q 2.rsp =0 true

98 Invariant Generation (1) Flow invariants: - Automatically generated - Linear equations over the # of packets in channels - Gaussian elimination - Equalities of # of packet between queues - NumX = #In - #Out Linear equations: - #in = #A_in = #B_in - #A = #A_in - #A_out - #B = #B_in - #B_out - #A_out = #B_out = #out After Gaussian elimination: - #A = #B 77

99 [Verbeek et al. 2016] Invariant Generation Highlight (2) Cross-layer invariants: - Automatically generated - Linear equations over: - (1) # transitions and being in a given state - (2) # of events on a channel and # transitions 78

100 Invariant Generation Highlight (2) [Verbeek et al. 2016] = req = ack 79

101 Experimental results 80

102 [Verbeek et al. 2016] Case-study: 2D Mesh - XY routing - MI protocol Interconnect: 81

103 [Verbeek et al. 2016] Case-study: 2D Mesh - XY routing - MI protocol Protocol: L2 caches Directory 82

104 [Verbeek et al. 2016] Case-study: 2D Mesh - XY routing - MI protocol Router MI Cache Core Router MI Directory Core 83

105 Case-study: Experimental results [Verbeek et al. 2016] 84

106 Reachability analysis 85

107 MaDL Verification: Overview Basic idea: Generate Invariants Generate SMT Problem - Express deadlock in SMT - Over-approximate deadlocks - Use invariants to rule-out false deadlocks - Reachability analysis of found deadlocks SMT Solver (Z3) (unsat) No deadlock! (sat) Generate SMV Model nuxmv Deadlock!

108 Reachability checking flow xmas network transfer islands state nuxmv model interleaving, or synchronous invariants reachable (under certain conditions)? definitely unreachable 87

109 Transfer islands 88

110 Enabled island 89

111 Enabled island 90

112 Combination of islands 91

113 Combination of islands 92

114 Conflicts 93

115 Different semantics interleaving asynchronous synchronous 94

116 Two Agents Example c c c c 95

117 Deadlock found by SMT c c "However if credit counters are sized incorrectly to provide more credits (say k + 1) than the capacity k of the ingress queues, the system deadlocks..." -- Chatterjee et al. VMCAI 2011 c c 96 96

118 Deadlock found by SMT c c "However if credit counters are sized incorrectly to provide more credits (say k + 1) than the capacity k of the ingress queues, the system deadlocks..." -- Chatterjee et al. VMCAI 2011 Incorrect! c c 96 96

119 Let s rewind one cycle c c No credit available. (3 requests) c c 97 97

120 Let s move one cycle forward c c 1 credit back. c c 98 98

121 Some experimental results Two agents example IC3 or BDD optional invariants synchronous or interleaving c = 2; k = 2 unreachable deadlock 2; 3 unreachable deadlock 4; 4 unreachable deadlock synchronous/ic3+inv ~3s state space: ~2 40 (about ) 4; 5 unreachable deadlock 4; 6 reachable deadlock 99

122 Outline» Intel's micro-architectural description language xmas language Capturing high-level structure and message dependencies Extended to MaDL at TU/e. Micro-architectural Description Language» Deadlock verification for MaDL Definition of deadlocks Labelled dependency graph Feasible logically closed subgraph» Conclusion and future work 100

123 Research questions and axes Q1: How can we formalise the generic aspects of specific domains? Foundations Formalise domain-specific theories Abstractions Q2: How can we define specification languages with built-in support for formal analyses? Q3: How do we relate the formally proven correct specification to the actual design? Focus on aspects and properties Language to express and analyse Approximations Ignore details to scale-up

124 Research questions Q1: How can we formalise the generic aspects of specific domains? Q2: How can we define specification languages with built-in support for formal analyses? Q3: How do we relate the formal proven correct specification to the actual design?

125 General Theory of Networking Architectures (Q1) GeNoC theory with Productivity Theorem Deadlock-free routing theory & algorithms Languages (Q2) Verification Technologies (Q2 & Q3) MaDL language & algorithms Integration Real World Applications Cooperation with industrial partners (Intel, ARM, NXP?) and academic partners (FBK, UC Irvine, Tsinghua, )

126 Today s focus General Theory of Networking Architectures (Q1) GeNoC theory with Productivity Theorem Deadlock-free routing theory & algorithms Languages (Q2) Verification Technologies (Q2 & Q3) MaDL language & algorithms Integration Real World Applications Cooperation with industrial partners (Intel, ARM, NXP?) and academic partners (FBK, UC Irvine, Tsinghua, )

127 MaDL: Conclusions & Future Work Conclusions Adequate DSL for prototyping network architectures. Being applied to some industrial case-studies Future work Equivalence relations between MaDL and RTL Performance (throughput & latency) evaluation Extend to entire systems

128 THANKS! 106

Formal Verification of Communications in Networks on Chips

Formal Verification of Communications in Networks on Chips Formal Verification of Communications in Networks on Chips Sebastiaan J.C. Joosten Julien Schmaltz Freek Verbeek Bernard van Gastel Open University of the Netherlands, Heerlen Radboud University Nijmegen

More information

A formalisation of XMAS

A formalisation of XMAS A formalisation of XMAS Bernard van Gastel Julien Schmaltz Open University of the Netherlands {Bernard.vanGastel, Julien.Schmaltz}@ou.nl Communication fabrics play a key role in the correctness and performance

More information

Deadlock. Reading. Ensuring Packet Delivery. Overview: The Problem

Deadlock. Reading. Ensuring Packet Delivery. Overview: The Problem Reading W. Dally, C. Seitz, Deadlock-Free Message Routing on Multiprocessor Interconnection Networks,, IEEE TC, May 1987 Deadlock F. Silla, and J. Duato, Improving the Efficiency of Adaptive Routing in

More information

Towards A Formally Verified Network-on-Chip

Towards A Formally Verified Network-on-Chip Towards A Formally Verified Network-on-Chip Tom van den Broek 1 Julien Schmaltz 12 1 Institute for Computing and Information Sciences Radboud University Nijmegen The Netherlands 2 School of Computer Science

More information

Synchronized Progress in Interconnection Networks (SPIN) : A new theory for deadlock freedom

Synchronized Progress in Interconnection Networks (SPIN) : A new theory for deadlock freedom ISCA 2018 Session 8B: Interconnection Networks Synchronized Progress in Interconnection Networks (SPIN) : A new theory for deadlock freedom Aniruddh Ramrakhyani Georgia Tech (aniruddh@gatech.edu) Tushar

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Packet Switch Architecture

Packet Switch Architecture Packet Switch Architecture 3. Output Queueing Architectures 4. Input Queueing Architectures 5. Switching Fabrics 6. Flow and Congestion Control in Sw. Fabrics 7. Output Scheduling for QoS Guarantees 8.

More information

Towards A Formal Theory of On Chip Communications in the ACL2 Logic

Towards A Formal Theory of On Chip Communications in the ACL2 Logic (c) Julien Schmaltz, ACL2 2006, San José August 15-16 p. 1/37 Towards A Formal Theory of On Chip Communications in the ACL2 Logic Julien Schmaltz Saarland University - Computer Science Department Saarbrücken,

More information

Basic Switch Organization

Basic Switch Organization NOC Routing 1 Basic Switch Organization 2 Basic Switch Organization Link Controller Used for coordinating the flow of messages across the physical link of two adjacent switches 3 Basic Switch Organization

More information

Automated Refinement Checking of Asynchronous Processes. Rajeev Alur. University of Pennsylvania

Automated Refinement Checking of Asynchronous Processes. Rajeev Alur. University of Pennsylvania Automated Refinement Checking of Asynchronous Processes Rajeev Alur University of Pennsylvania www.cis.upenn.edu/~alur/ Intel Formal Verification Seminar, July 2001 Problem Refinement Checking Given two

More information

Lecture 3: Flow-Control

Lecture 3: Flow-Control High-Performance On-Chip Interconnects for Emerging SoCs http://tusharkrishna.ece.gatech.edu/teaching/nocs_acaces17/ ACACES Summer School 2017 Lecture 3: Flow-Control Tushar Krishna Assistant Professor

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel

More information

Lecture 2: Topology - I

Lecture 2: Topology - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and

More information

INF672 Protocol Safety and Verification. Karthik Bhargavan Xavier Rival Thomas Clausen

INF672 Protocol Safety and Verification. Karthik Bhargavan Xavier Rival Thomas Clausen INF672 Protocol Safety and Verication Karthik Bhargavan Xavier Rival Thomas Clausen 1 Course Outline Lecture 1 [Today, Sep 15] Introduction, Motivating Examples Lectures 2-4 [Sep 22,29, Oct 6] Network

More information

Lecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks

Lecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks Lecture: Transactional Memory, Networks Topics: TM implementations, on-chip networks 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock

More information

Promela and SPIN. Mads Dam Dept. Microelectronics and Information Technology Royal Institute of Technology, KTH. Promela and SPIN

Promela and SPIN. Mads Dam Dept. Microelectronics and Information Technology Royal Institute of Technology, KTH. Promela and SPIN Promela and SPIN Mads Dam Dept. Microelectronics and Information Technology Royal Institute of Technology, KTH Promela and SPIN Promela (Protocol Meta Language): Language for modelling discrete, event-driven

More information

Concurrent/Parallel Processing

Concurrent/Parallel Processing Concurrent/Parallel Processing David May: April 9, 2014 Introduction The idea of using a collection of interconnected processing devices is not new. Before the emergence of the modern stored program computer,

More information

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Dataflow Lecture: SDF, Kahn Process Networks Stavros Tripakis University of California, Berkeley Stavros Tripakis: EECS

More information

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics

Lecture 16: On-Chip Networks. Topics: Cache networks, NoC basics Lecture 16: On-Chip Networks Topics: Cache networks, NoC basics 1 Traditional Networks Huh et al. ICS 05, Beckmann MICRO 04 Example designs for contiguous L2 cache regions 2 Explorations for Optimality

More information

Lecture: Interconnection Networks. Topics: TM wrap-up, routing, deadlock, flow control, virtual channels

Lecture: Interconnection Networks. Topics: TM wrap-up, routing, deadlock, flow control, virtual channels Lecture: Interconnection Networks Topics: TM wrap-up, routing, deadlock, flow control, virtual channels 1 TM wrap-up Eager versioning: create a log of old values Handling problematic situations with a

More information

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013 NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching

More information

NoC Test-Chip Project: Working Document

NoC Test-Chip Project: Working Document NoC Test-Chip Project: Working Document Michele Petracca, Omar Ahmad, Young Jin Yoon, Frank Zovko, Luca Carloni and Kenneth Shepard I. INTRODUCTION This document describes the low-power high-performance

More information

Estimating Worst-case Latency of on-chip Interconnects with Formal Simulation

Estimating Worst-case Latency of on-chip Interconnects with Formal Simulation Estimating Worst-case Latency of on-chip Interconnects with Formal Simulation Freek Verbeek Open University of the Netherlands Radboud University, Nijmegen Nikè van Vugt Open University of the Netherlands

More information

T Reactive Systems: Kripke Structures and Automata

T Reactive Systems: Kripke Structures and Automata Tik-79.186 Reactive Systems 1 T-79.186 Reactive Systems: Kripke Structures and Automata Spring 2005, Lecture 3 January 31, 2005 Tik-79.186 Reactive Systems 2 Properties of systems invariants: the system

More information

udirec: Unified Diagnosis and Reconfiguration for Frugal Bypass of NoC Faults

udirec: Unified Diagnosis and Reconfiguration for Frugal Bypass of NoC Faults 1/45 1/22 MICRO-46, 9 th December- 213 Davis, California udirec: Unified Diagnosis and Reconfiguration for Frugal Bypass of NoC Faults Ritesh Parikh and Valeria Bertacco Electrical Engineering & Computer

More information

NoCAlert: An On-Line and Real- Time Fault Detection Mechanism for Network-on-Chip Architectures

NoCAlert: An On-Line and Real- Time Fault Detection Mechanism for Network-on-Chip Architectures NoCAlert: An On-Line and Real- Time Fault Detection Mechanism for Network-on-Chip Architectures Andreas Prodromou, Andreas Panteli, Chrysostomos Nicopoulos, and Yiannakis Sazeides University of Cyprus

More information

Diagnostic Information for Control-Flow Analysis of Workflow Graphs (aka Free-Choice Workflow Nets)

Diagnostic Information for Control-Flow Analysis of Workflow Graphs (aka Free-Choice Workflow Nets) Diagnostic Information for Control-Flow Analysis of Workflow Graphs (aka Free-Choice Workflow Nets) Cédric Favre(1,2), Hagen Völzer(1), Peter Müller(2) (1) IBM Research - Zurich (2) ETH Zurich 1 Outline

More information

HECTOR: Formal System-Level to RTL Equivalence Checking

HECTOR: Formal System-Level to RTL Equivalence Checking ATG SoC HECTOR: Formal System-Level to RTL Equivalence Checking Alfred Koelbl, Sergey Berezin, Reily Jacoby, Jerry Burch, William Nicholls, Carl Pixley Advanced Technology Group Synopsys, Inc. June 2008

More information

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP

CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located

More information

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed) Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2012/13 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2012/13 1 2

More information

Petri Nets ~------~ R-ES-O---N-A-N-C-E-I--se-p-te-m--be-r Applications.

Petri Nets ~------~ R-ES-O---N-A-N-C-E-I--se-p-te-m--be-r Applications. Petri Nets 2. Applications Y Narahari Y Narahari is currently an Associate Professor of Computer Science and Automation at the Indian Institute of Science, Bangalore. His research interests are broadly

More information

High Performance Interconnect and NoC Router Design

High Performance Interconnect and NoC Router Design High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali

More information

Interconnection Networks

Interconnection Networks Lecture 18: Interconnection Networks Parallel Computer Architecture and Programming CMU 15-418/15-618, Spring 2015 Credit: many of these slides were created by Michael Papamichael This lecture is partially

More information

Finite State Verification. CSCE Lecture 14-02/25/2016

Finite State Verification. CSCE Lecture 14-02/25/2016 Finite State Verification CSCE 747 - Lecture 14-02/25/2016 So, You Want to Perform Verification... You have a property that you want your program to obey. Great! Let s write some tests! Does testing guarantee

More information

Thomas Moscibroda Microsoft Research. Onur Mutlu CMU

Thomas Moscibroda Microsoft Research. Onur Mutlu CMU Thomas Moscibroda Microsoft Research Onur Mutlu CMU CPU+L1 CPU+L1 CPU+L1 CPU+L1 Multi-core Chip Cache -Bank Cache -Bank Cache -Bank Cache -Bank CPU+L1 CPU+L1 CPU+L1 CPU+L1 Accelerator, etc Cache -Bank

More information

Reducing Fair Stuttering Refinement of Transaction Systems

Reducing Fair Stuttering Refinement of Transaction Systems Reducing Fair Stuttering Refinement of Transaction Systems Rob Sumners Advanced Micro Devices robert.sumners@amd.com November 16th, 2015 Rob Sumners (AMD) Transaction Progress Checking November 16th, 2015

More information

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5

More information

Distributed Systems Programming (F21DS1) Formal Verification

Distributed Systems Programming (F21DS1) Formal Verification Distributed Systems Programming (F21DS1) Formal Verification Andrew Ireland Department of Computer Science School of Mathematical and Computer Sciences Heriot-Watt University Edinburgh Overview Focus on

More information

ECE/CS 757: Advanced Computer Architecture II Interconnects

ECE/CS 757: Advanced Computer Architecture II Interconnects ECE/CS 757: Advanced Computer Architecture II Interconnects Instructor:Mikko H Lipasti Spring 2017 University of Wisconsin-Madison Lecture notes created by Natalie Enright Jerger Lecture Outline Introduction

More information

Quest for High-Performance Bufferless NoCs with Single-Cycle Express Paths and Self-Learning Throttling

Quest for High-Performance Bufferless NoCs with Single-Cycle Express Paths and Self-Learning Throttling Quest for High-Performance Bufferless NoCs with Single-Cycle Express Paths and Self-Learning Throttling Bhavya K. Daya, Li-Shiuan Peh, Anantha P. Chandrakasan Dept. of Electrical Engineering and Computer

More information

Ultra-Fast NoC Emulation on a Single FPGA

Ultra-Fast NoC Emulation on a Single FPGA The 25 th International Conference on Field-Programmable Logic and Applications (FPL 2015) September 3, 2015 Ultra-Fast NoC Emulation on a Single FPGA Thiem Van Chu, Shimpei Sato, and Kenji Kise Tokyo

More information

Network on Chip Architecture: An Overview

Network on Chip Architecture: An Overview Network on Chip Architecture: An Overview Md Shahriar Shamim & Naseef Mansoor 12/5/2014 1 Overview Introduction Multi core chip Challenges Network on Chip Architecture Regular Topology Irregular Topology

More information

Efficient Throughput-Guarantees for Latency-Sensitive Networks-On-Chip

Efficient Throughput-Guarantees for Latency-Sensitive Networks-On-Chip ASP-DAC 2010 20 Jan 2010 Session 6C Efficient Throughput-Guarantees for Latency-Sensitive Networks-On-Chip Jonas Diemer, Rolf Ernst TU Braunschweig, Germany diemer@ida.ing.tu-bs.de Michael Kauschke Intel,

More information

[module 2.2] MODELING CONCURRENT PROGRAM EXECUTION

[module 2.2] MODELING CONCURRENT PROGRAM EXECUTION v1.0 20130407 Programmazione Avanzata e Paradigmi Ingegneria e Scienze Informatiche - UNIBO a.a 2013/2014 Lecturer: Alessandro Ricci [module 2.2] MODELING CONCURRENT PROGRAM EXECUTION 1 SUMMARY Making

More information

The SPIN Model Checker

The SPIN Model Checker The SPIN Model Checker Metodi di Verifica del Software Andrea Corradini Lezione 1 2013 Slides liberamente adattate da Logic Model Checking, per gentile concessione di Gerard J. Holzmann http://spinroot.com/spin/doc/course/

More information

ECE 587 Hardware/Software Co-Design Lecture 12 Verification II, System Modeling

ECE 587 Hardware/Software Co-Design Lecture 12 Verification II, System Modeling ECE 587 Hardware/Software Co-Design Spring 2018 1/20 ECE 587 Hardware/Software Co-Design Lecture 12 Verification II, System Modeling Professor Jia Wang Department of Electrical and Computer Engineering

More information

ECE 2162 Intro & Trends. Jun Yang Fall 2009

ECE 2162 Intro & Trends. Jun Yang Fall 2009 ECE 2162 Intro & Trends Jun Yang Fall 2009 Prerequisites CoE/ECE 0142: Computer Organization; or CoE/CS 1541: Introduction to Computer Architecture I will assume you have detailed knowledge of Pipelining

More information

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)

Memory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed) Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2011/12 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2011/12 1 2

More information

Lecture 7: Flow Control - I

Lecture 7: Flow Control - I ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 7: Flow Control - I Tushar Krishna Assistant Professor School of Electrical

More information

Finite State Verification. CSCE Lecture 21-03/28/2017

Finite State Verification. CSCE Lecture 21-03/28/2017 Finite State Verification CSCE 747 - Lecture 21-03/28/2017 So, You Want to Perform Verification... You have a property that you want your program to obey. Great! Let s write some tests! Does testing guarantee

More information

EE 382C Interconnection Networks

EE 382C Interconnection Networks EE 8C Interconnection Networks Deadlock and Livelock Stanford University - EE8C - Spring 6 Deadlock and Livelock: Terminology Deadlock: A condition in which an agent waits indefinitely trying to acquire

More information

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Lecture 12: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) 1 Topologies Internet topologies are not very regular they grew

More information

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 15: Memory Consistency and Synchronization. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 15: Memory Consistency and Synchronization Prof. Yanjing Li University of Chicago Administrative Stuff! Lab 5 (multi-core) " Basic requirements: out later today

More information

Generating Finite State Machines from SystemC

Generating Finite State Machines from SystemC Generating Finite State Machines from SystemC Ali Habibi, Haja Moinudeen, and Sofiène Tahar Department of Electrical and Computer Engineering Concordia University 1455 de Maisonneuve West Montréal, Québec,

More information

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger

Interconnection Networks: Flow Control. Prof. Natalie Enright Jerger Interconnection Networks: Flow Control Prof. Natalie Enright Jerger Switching/Flow Control Overview Topology: determines connectivity of network Routing: determines paths through network Flow Control:

More information

Continuum Computer Architecture

Continuum Computer Architecture Plenary Presentation to the Workshop on Frontiers of Extreme Computing: Continuum Computer Architecture Thomas Sterling California Institute of Technology and Louisiana State University October 25, 2005

More information

Petri Nets ee249 Fall 2000

Petri Nets ee249 Fall 2000 Petri Nets ee249 Fall 2000 Marco Sgroi Most slides borrowed from Luciano Lavagno s lecture ee249 (1998) 1 Models Of Computation for reactive systems Main MOCs: Communicating Finite State Machines Dataflow

More information

Interconnection Networks: Routing. Prof. Natalie Enright Jerger

Interconnection Networks: Routing. Prof. Natalie Enright Jerger Interconnection Networks: Routing Prof. Natalie Enright Jerger Routing Overview Discussion of topologies assumed ideal routing In practice Routing algorithms are not ideal Goal: distribute traffic evenly

More information

Model-based Analysis of Event-driven Distributed Real-time Embedded Systems

Model-based Analysis of Event-driven Distributed Real-time Embedded Systems Model-based Analysis of Event-driven Distributed Real-time Embedded Systems Gabor Madl Committee Chancellor s Professor Nikil Dutt (Chair) Professor Tony Givargis Professor Ian Harris University of California,

More information

Fault-adaptive routing

Fault-adaptive routing Fault-adaptive routing Presenter: Zaheer Ahmed Supervisor: Adan Kohler Reviewers: Prof. Dr. M. Radetzki Prof. Dr. H.-J. Wunderlich Date: 30-June-2008 7/2/2009 Agenda Motivation Fundamentals of Routing

More information

Prediction Router: Yet another low-latency on-chip router architecture

Prediction Router: Yet another low-latency on-chip router architecture Prediction Router: Yet another low-latency on-chip router architecture Hiroki Matsutani Michihiro Koibuchi Hideharu Amano Tsutomu Yoshinaga (Keio Univ., Japan) (NII, Japan) (Keio Univ., Japan) (UEC, Japan)

More information

Property-based design with HORUS / SYNTHORUS

Property-based design with HORUS / SYNTHORUS Property-based design with HORUS / SYNTHORUS Dominique Borrione, Negin Javaheri, Katell Morin-Allory, Yann Oddos, Alexandre Porcher Radboud University, Nijmegen 1 March 27, 2013 Functional specifications

More information

Computer Aided Verification 2015 The SPIN model checker

Computer Aided Verification 2015 The SPIN model checker Computer Aided Verification 2015 The SPIN model checker Grigory Fedyukovich Universita della Svizzera Italiana March 11, 2015 Material borrowed from Roberto Bruttomesso Outline 1 Introduction 2 PROcess

More information

Lecture 13: Memory Consistency. + a Course-So-Far Review. Parallel Computer Architecture and Programming CMU , Spring 2013

Lecture 13: Memory Consistency. + a Course-So-Far Review. Parallel Computer Architecture and Programming CMU , Spring 2013 Lecture 13: Memory Consistency + a Course-So-Far Review Parallel Computer Architecture and Programming Today: what you should know Understand the motivation for relaxed consistency models Understand the

More information

Timo Latvala. January 28, 2004

Timo Latvala. January 28, 2004 Reactive Systems: Kripke Structures and Automata Timo Latvala January 28, 2004 Reactive Systems: Kripke Structures and Automata 3-1 Properties of systems invariants: the system never reaches a bad state

More information

Chapter Seven Morgan Kaufmann Publishers

Chapter Seven Morgan Kaufmann Publishers Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be

More information

Interconnection Networks

Interconnection Networks Interconnection Networks Interconnection Networks Introduction How to connect individual devices together into a group of communicating devices? Device: r r r Component within a computer Single computer

More information

Embedded Systems: Hardware Components (part II) Todor Stefanov

Embedded Systems: Hardware Components (part II) Todor Stefanov Embedded Systems: Hardware Components (part II) Todor Stefanov Leiden Embedded Research Center, Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded

More information

Formal Verification Adoption. Mike Bartley TVS, Founder and CEO

Formal Verification Adoption. Mike Bartley TVS, Founder and CEO Formal Verification Adoption Mike Bartley TVS, Founder and CEO Agenda Some background on your speaker Formal Verification An introduction Basic examples A FIFO example Adoption Copyright TVS Limited Private

More information

Checker Processors. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India

Checker Processors. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India Advanced Department of Computer Science Indian Institute of Technology New Delhi, India Outline Introduction Advanced 1 Introduction 2 Checker Pipeline Checking Mechanism 3 Advanced Core Checker L1 Failure

More information

Outline. Petri nets. Introduction Examples Properties Analysis techniques. 1 EE249Fall04

Outline. Petri nets. Introduction Examples Properties Analysis techniques. 1 EE249Fall04 Outline Petri nets Introduction Examples Properties Analysis techniques 1 Petri Nets (PNs) Model introduced by C.A. Petri in 1962 Ph.D. Thesis: Communication with Automata Applications: distributed computing,

More information

Conquering Memory Bandwidth Challenges in High-Performance SoCs

Conquering Memory Bandwidth Challenges in High-Performance SoCs Conquering Memory Bandwidth Challenges in High-Performance SoCs ABSTRACT High end System on Chip (SoC) architectures consist of tens of processing engines. In SoCs targeted at high performance computing

More information

Mix n Match Async and Group Replication for Advanced Replication Setups. Pedro Gomes Software Engineer

Mix n Match Async and Group Replication for Advanced Replication Setups. Pedro Gomes Software Engineer Mix n Match Async and Group Replication for Advanced Replication Setups Pedro Gomes (pedro.gomes@oracle.com) Software Engineer 4th of February Copyright 2017, Oracle and/or its affiliates. All rights reserved.

More information

From synchronous models to distributed, asynchronous architectures

From synchronous models to distributed, asynchronous architectures From synchronous models to distributed, asynchronous architectures Stavros Tripakis Joint work with Claudio Pinello, Cadence Alberto Sangiovanni-Vincentelli, UC Berkeley Albert Benveniste, IRISA (France)

More information

Control Hazards. Prediction

Control Hazards. Prediction Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional

More information

CAD System Lab Graduate Institute of Electronics Engineering National Taiwan University Taipei, Taiwan, ROC

CAD System Lab Graduate Institute of Electronics Engineering National Taiwan University Taipei, Taiwan, ROC QoS Aware BiNoC Architecture Shih-Hsin Lo, Ying-Cherng Lan, Hsin-Hsien Hsien Yeh, Wen-Chung Tsai, Yu-Hen Hu, and Sao-Jie Chen Ying-Cherng Lan CAD System Lab Graduate Institute of Electronics Engineering

More information

4. Networks. in parallel computers. Advances in Computer Architecture

4. Networks. in parallel computers. Advances in Computer Architecture 4. Networks in parallel computers Advances in Computer Architecture System architectures for parallel computers Control organization Single Instruction stream Multiple Data stream (SIMD) All processors

More information

EE 6900: Interconnection Networks for HPC Systems Fall 2016

EE 6900: Interconnection Networks for HPC Systems Fall 2016 EE 6900: Interconnection Networks for HPC Systems Fall 2016 Avinash Karanth Kodi School of Electrical Engineering and Computer Science Ohio University Athens, OH 45701 Email: kodi@ohio.edu 1 Acknowledgement:

More information

Formal modelling and verification in UPPAAL

Formal modelling and verification in UPPAAL Budapest University of Technology and Economics Department of Measurement and Information Systems Fault Tolerant Systems Research Group Critical Embedded Systems Formal modelling and verification in UPPAAL

More information

Control Hazards. Branch Prediction

Control Hazards. Branch Prediction Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional

More information

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching.

Switching/Flow Control Overview. Interconnection Networks: Flow Control and Microarchitecture. Packets. Switching. Switching/Flow Control Overview Interconnection Networks: Flow Control and Microarchitecture Topology: determines connectivity of network Routing: determines paths through network Flow Control: determine

More information

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste

Outline Computer Networking. TCP slow start. TCP modeling. TCP details AIMD. Congestion Avoidance. Lecture 18 TCP Performance Peter Steenkiste Outline 15-441 Computer Networking Lecture 18 TCP Performance Peter Steenkiste Fall 2010 www.cs.cmu.edu/~prs/15-441-f10 TCP congestion avoidance TCP slow start TCP modeling TCP details 2 AIMD Distributed,

More information

High Performance Computing

High Performance Computing Master Degree Program in Computer Science and Networking, 2014-15 High Performance Computing 1 st appello January 13, 2015 Write your name, surname, student identification number (numero di matricola),

More information

Verifying C & C++ with ESBMC

Verifying C & C++ with ESBMC Verifying C & C++ with ESBMC Denis A Nicole dan@ecs.soton.ac.uk CyberSecuritySoton.org [w] @CybSecSoton [fb & tw] ESBMC ESBMC, the Efficient SMT-Based Context-Bounded Model Checker was originally developed

More information

Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation

Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Achieving Lightweight Multicast in Asynchronous Networks-on-Chip Using Local Speculation Kshitij Bhardwaj Dept. of Computer Science Columbia University Steven M. Nowick 2016 ACM/IEEE Design Automation

More information

Networks-on-Chip Router: Configuration and Implementation

Networks-on-Chip Router: Configuration and Implementation Networks-on-Chip : Configuration and Implementation Wen-Chung Tsai, Kuo-Chih Chu * 2 1 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan,

More information

Hardware Verification 2IMF20

Hardware Verification 2IMF20 Hardware Verification 2IMF20 Julien Schmaltz Lecture 01: Introduction to Hardware (Formal) Verification Lectures» Two blocks every week (w36 to w43)» Tue 13:45-15:30 Room LUNA 1.056» Thu 08:45-10:30 Room

More information

EE382C Lecture 1. Bill Dally 3/29/11. EE 382C - S11 - Lecture 1 1

EE382C Lecture 1. Bill Dally 3/29/11. EE 382C - S11 - Lecture 1 1 EE382C Lecture 1 Bill Dally 3/29/11 EE 382C - S11 - Lecture 1 1 Logistics Handouts Course policy sheet Course schedule Assignments Homework Research Paper Project Midterm EE 382C - S11 - Lecture 1 2 What

More information

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Outline History & Motivation Architecture Core architecture Network Topology Memory hierarchy Brief comparison to GPU & Tilera Programming Applications

More information

Interconnection Networks: Topology. Prof. Natalie Enright Jerger

Interconnection Networks: Topology. Prof. Natalie Enright Jerger Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design

More information

Research Collection. Formal background and algorithms. Other Conference Item. ETH Library. Author(s): Biere, Armin. Publication Date: 2001

Research Collection. Formal background and algorithms. Other Conference Item. ETH Library. Author(s): Biere, Armin. Publication Date: 2001 Research Collection Other Conference Item Formal background and algorithms Author(s): Biere, Armin Publication Date: 2001 Permanent Link: https://doi.org/10.3929/ethz-a-004239730 Rights / License: In Copyright

More information

Lecture 14: Large Cache Design III. Topics: Replacement policies, associativity, cache networks, networking basics

Lecture 14: Large Cache Design III. Topics: Replacement policies, associativity, cache networks, networking basics Lecture 14: Large Cache Design III Topics: Replacement policies, associativity, cache networks, networking basics 1 LIN Qureshi et al., ISCA 06 Memory level parallelism (MLP): number of misses that simultaneously

More information

FlexRay International Workshop. FAN analysis

FlexRay International Workshop. FAN analysis FlexRay International Workshop 16 th and 17 th April, 2002 Munich FAN analysis Dipl. Inf. Jens Lisner - University of Essen Project FAN - Goals Verify the design of FlexRay in particular: countermeasures

More information

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) Past & Present Have looked at two constraints: Mutual exclusion constraint between two events is a requirement that

More information

Under the Hood, Part 1: Implementing Message Passing

Under the Hood, Part 1: Implementing Message Passing Lecture 27: Under the Hood, Part 1: Implementing Message Passing Parallel Computer Architecture and Programming CMU 15-418/15-618, Fall 2017 Today s Theme 2 Message passing model (abstraction) Threads

More information

Total No. of Questions : 18] [Total No. of Pages : 02. M.Sc. DEGREE EXAMINATION, DEC First Year COMPUTER SCIENCE.

Total No. of Questions : 18] [Total No. of Pages : 02. M.Sc. DEGREE EXAMINATION, DEC First Year COMPUTER SCIENCE. (DMCS01) Total No. of Questions : 18] [Total No. of Pages : 02 M.Sc. DEGREE EXAMINATION, DEC. 2016 First Year COMPUTER SCIENCE Data Structures Time : 3 Hours Maximum Marks : 70 Section - A (3 x 15 = 45)

More information

Module 18: "TLP on Chip: HT/SMT and CMP" Lecture 39: "Simultaneous Multithreading and Chip-multiprocessing" TLP on Chip: HT/SMT and CMP SMT

Module 18: TLP on Chip: HT/SMT and CMP Lecture 39: Simultaneous Multithreading and Chip-multiprocessing TLP on Chip: HT/SMT and CMP SMT TLP on Chip: HT/SMT and CMP SMT Multi-threading Problems of SMT CMP Why CMP? Moore s law Power consumption? Clustered arch. ABCs of CMP Shared cache design Hierarchical MP file:///e /parallel_com_arch/lecture39/39_1.htm[6/13/2012

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

Binary Decision Diagrams and Symbolic Model Checking

Binary Decision Diagrams and Symbolic Model Checking Binary Decision Diagrams and Symbolic Model Checking Randy Bryant Ed Clarke Ken McMillan Allen Emerson CMU CMU Cadence U Texas http://www.cs.cmu.edu/~bryant Binary Decision Diagrams Restricted Form of

More information