ECE 172 Digital Systems. Chapter 14 Itanium EPIC Processor Architecture. Herbert G. Mayer, PSU Status 5/10/2018

1 ECE 172 Digital Systems. Chapter 14: Itanium EPIC Processor Architecture. Herbert G. Mayer, PSU. Status 5/10/2018

2 Syllabus: Introduction; Intel Itanium Architecture; Data and Memory; Itanium Registers; Instruction Set Architecture (ISA); Assembler Source Program; Bibliography

3 Photo of Itanium 2 Processor

4 Itanium Processor Block Diagram

5 Introduction. Itanium is Intel's first publicly announced, commercial 64-bit computer product, launched in 2001 and co-developed with HP Corp. IPF stands for Itanium Processor Family. "Publicly announced" matters here: Intel was diligently and secretly developing a contemporaneous, competing 64-bit processor, an extended version of its ancient x86 architecture, just in case, as a backup risk hedge. Lucky they did! 64-bit means that the logical address range spans 2^64 different memory bytes; natural integer objects are also 64 bits wide. The exact format of data objects is described in the section Data and Memory. During its development at Intel, the first generation of Itanium processors was internally code-named Merced. The family is now officially called IPF, for Itanium Processor Family, while early on it was referred to as IA-64, for Intel 64-bit architecture; that name later conflicted with the 64-bit version of the x86 family.

6 Introduction. Intel's Itanium architecture is radically, completely different from the widely used 32-bit IA-32 architecture. IA-32 should be referred to as the x86 architecture, lest one incorrectly infer today that it is restricted to 32-bit addresses and 32-bit integer types. That limitation no longer exists since Intel introduced 64-bit versions about half a year after AMD's extension of IA-32 to 64 bits; see also EM64T. Imagine how Intel felt when AMD, the company that had produced CPUs compatible with Intel's chips, suddenly had a more advanced, more attractive x86 CPU!

7 Introduction. Pat Gelsinger, former Intel VP, with Itanium chips

8 Intel Itanium Architecture. Interestingly, IA-32 object code is executable on Itanium processors, with a caveat. More interesting yet, even Hewlett-Packard PA-RISC code is executable on Intel's and HP's novel 64-bit IPF processor. HP and Intel were strategic partners in the definition, development, and cost sharing of the IPF, with HP having initiated the development. Be cautious about performance inferences: just because IA-32 object code is executable on IPF, one should not deduce that such code executes on IPF as fast as on an x86 processor.

9 Intel Itanium Architecture. IPF is Intel's and HP's first instance of the novel EPIC architecture; it is different from PA-RISC and different from x86. EPIC stands for Explicitly Parallel Instruction Computing. It is Intel's first launched 64-bit architecture; the second was launched later (1Q 2004) with EM64T, the first 64-bit version of the ancient x86 architecture. HP already had a 64-bit version of its Precision Architecture (PA) RISC processor at the time of the Itanium launch. Explicit means that the assembly language programmer (or the smart compiler) bears the intellectual burden of taking advantage of the parallelism in the architecture; see ref [8]. It is not the processor that automatically exploits the numerous parallel computing modules; the microprocessor needs to be told.

10 Intel Itanium Architecture. As a consequence, compilers for IPF are highly complex; see Donald Knuth's comment, ref [7]. Compiler complexity is not desirable, as it means more errors and decreased object code quality, something a new architecture should avoid. On the other hand, the IPF provides explicit architectural features that enable implementing highly optimizing compilers. A case in point is architectural support for software pipelined loops (SW PL). Certain source constructs let the compiler emit SW PL loops that need no prologue and epilogue. Absence of prologue and epilogue renders the object code not only more compact, but also faster.

11 Intel Itanium Architecture. Parallel means an Itanium processor gains speed not solely via high clock rates, but via simultaneous execution of multiple operations in one clock cycle. Key concepts refined, or newly introduced, in IPF include: predication, branch prediction, branch elimination, conditional move, speculation, parallel comparisons, and a large register file. The first implementation of the new 64-bit Intel + HP Itanium architecture implemented only 44 physical of the 64 logical address bits.

12 Intel Itanium Architecture. With just 44 address bits, the total initial address range of the first Itanium HW was about a millionth of the logical address range, yet ~4000 times larger than on an old-fashioned 32-bit architecture. In its second generation, 56 physical bits of the 64-bit logical address space were implemented in HW. Product name of the second generation: Itanium 2. Short-term, no severe limitations were expected with restricted 56-bit addresses; that is still about 16 million times larger than the 32-bit addressing space. Integer type operands are of course a full 64 bits wide.
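The ratios quoted on this slide follow directly from powers of two. A minimal Python sketch (the constant and function names are illustrative, not from the slides) that recomputes them:

```python
# Address-range arithmetic for the figures quoted on the slide.
LOGICAL_BITS = 64   # logical address width of IPF
GEN1_BITS = 44      # physical address bits of the first Itanium, per the text
GEN2_BITS = 56      # physical address bits of the second generation, per the text
LEGACY_BITS = 32    # old-fashioned 32-bit architecture


def address_range(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits


logical_vs_gen1 = address_range(LOGICAL_BITS) // address_range(GEN1_BITS)
gen1_vs_32bit = address_range(GEN1_BITS) // address_range(LEGACY_BITS)
gen2_vs_32bit = address_range(GEN2_BITS) // address_range(LEGACY_BITS)

print(logical_vs_gen1)  # 1048576   (logical range is about a million times larger)
print(gen1_vs_32bit)    # 4096      (about 4000 times the 32-bit range)
print(gen2_vs_32bit)    # 16777216  (about 16 million times the 32-bit range)
```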

13 Intel Itanium Architecture. Unlike earlier parallel VLIW architectures, EPIC has no fixed-width instruction encoding. Instead, operations can be combined to function in parallel; anything from a single instruction to many instructions can be combined. Critical in EPIC is that all code be written assuming parallel semantics within a group (to be explained later) and sequential semantics across groups. To be able to run in parallel, the machine is built with multiple execution modules that can all work at the same time. This allows natural architecture migration from, say, 6 HW modules executing on today's Itanium to as many as can be crammed into a future silicon microprocessor, years from now.

14 Intel Itanium Architecture. To illustrate, with a sample taken from ref [1], consider 2 memory operands a and b to be swapped:
    temp := a;   // a, b, temp are memory locs
    a := b;
    b := temp;
The semicolon operator ; implies sequential semantics. On a machine with parallel semantics, it would be sufficient to write:
    a := b,      // operand latching needed
    b := a;      // operand latching needed
with the comma operator implying parallel semantics, similar to syntactic conventions in the programming language Algol-68. This source snippet is just a generic example, NOT a sample of the Itanium assembly language.
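The difference between sequential and parallel semantics can be tried out in Python, whose tuple assignment evaluates the whole right-hand side before storing anything, which plays the role of the operand latching mentioned on the slide. This is only an analogy with hypothetical function names, not Itanium code:

```python
def swap_sequential(a, b):
    """Sequential semantics: a temporary is needed to preserve the old value of a."""
    temp = a
    a = b
    b = temp
    return a, b


def swap_parallel(a, b):
    """Parallel semantics: both right-hand sides are read (latched) before any write."""
    a, b = b, a
    return a, b


def swap_without_latching(a, b):
    """The same two assignments under sequential semantics: a is overwritten too early."""
    a = b
    b = a
    return a, b


print(swap_sequential(1, 2))        # (2, 1)
print(swap_parallel(1, 2))          # (2, 1)
print(swap_without_latching(1, 2))  # (2, 2)
```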

15 Data & Memory

16 Data and Memory. Native data types of IPF resemble those of conventional 32-bit architectures, except for the longer 64-bit integer and unsigned formats. An extension over IA-32 object code is the IPF bundle. Data types include integer, unsigned, floating-point, and pointer. Integers come in different widths: byte, word, doubleword, or quad-word precision. Length in bits as well as min and max values are listed below:

17 Data and Memory, Min Max
Type             Byte   Word     Doubleword       Quad-word
Integer [bits]   8      16       32               64
Unsigned [bits]  8      16       32               64
Pointer [bits]   NA     NA       Comp             64
Float [bits]     NA     NA       32, 64           64, 80

Type             Byte   Word     Doubleword       Quad-word
Minint           -128   -32,768  -2,147,483,648   -9,223,372,036,854,775,808
Maxint           127    32,767   2,147,483,647    9,223,372,036,854,775,807
Minunsigned      0      0        0                0
Maxunsigned      255    65,535   4,294,967,295    18,446,744,073,709,551,615
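The minimum and maximum values in this table are determined by the bit width alone, assuming two's complement for the signed types. A small Python sketch (helper names are illustrative) that derives them:

```python
# Bit widths of the IPF integer types, using the Intel naming (word = 2 bytes).
WIDTHS = {"byte": 8, "word": 16, "doubleword": 32, "quadword": 64}


def signed_range(bits: int) -> tuple:
    """(min, max) of a two's complement integer with the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(bits: int) -> tuple:
    """(min, max) of an unsigned integer with the given width."""
    return 0, (1 << bits) - 1


for name, bits in WIDTHS.items():
    print(name, signed_range(bits), unsigned_range(bits))
```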

18 Data and Memory. Negative numbers are represented in two's complement format, with the sign bit in the most significant position. Floating-point data use the IEEE 754 standard. Bits representing integer values are numbered from 0 in the least significant (rightmost) position to higher values. For example, the most significant bit in a doubleword is in the position indexed 31. (Note the unusual word definition on Intel architectures: 2 bytes.) The address range on first-generation Itanium was only 17,592,186,044,416 bytes, or 2^44; it grew in the 2nd generation to 56 bits, and is now a full 64 bits long.

19 Data and Memory. Bytes are stored in little-endian order by default. It is possible to programmatically select little- or big-endian order by setting the be bit in the user mask, a special status register. That be bit (for big-endian) does not affect how instructions are stored or fetched from memory. Object code is always represented in little-endian order; programmer-selected endianness only impacts data. In little-endian order, the least significant data byte is stored in the byte with the lowest address; conversely for big-endian order.

20 Data and Memory. Data quad-word 0x1102030455060708 is stored in 8 adjacent bytes in memory in little-endian order as:
    addr:  0     1     2     3     4     5     6     7
    byte:  0x08  0x07  0x06  0x55  0x04  0x03  0x02  0x11
The same integer value 0x1102030455060708 stored in big-endian order:
           byte7 byte6 byte5 byte4 byte3 byte2 byte1 byte0
    byte:  0x11  0x02  0x03  0x04  0x55  0x06  0x07  0x08
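The two byte layouts can be reproduced with Python's built-in integer conversion. The quad-word value 0x1102030455060708 used here is inferred from the byte listing on the slide, so treat it as an assumption:

```python
# Quad-word value inferred from the slide's byte listing (assumption).
value = 0x1102030455060708

little = value.to_bytes(8, "little")  # least significant byte at the lowest address
big = value.to_bytes(8, "big")        # most significant byte at the lowest address

print(little.hex(" "))  # 08 07 06 55 04 03 02 11
print(big.hex(" "))     # 11 02 03 04 55 06 07 08

# Reading the bytes back with the matching byte order recovers the value.
print(hex(int.from_bytes(little, "little")))  # 0x1102030455060708
```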

21 Itanium Registers. The Itanium processor has 128 general registers (GR), 128 floating-point registers (FR), 64 single-bit predicate registers (PR), 8 branch registers (BR), and 128 application registers (AR). In addition, there are Performance Monitor Data registers (PMD), processor identifiers (CPUID), a Current Frame Marker register (CFM), a user mask (UM), and an instruction pointer register (IP). GRs, FRs, BRs, ARs, CPUIDs, IP, and PMDs are 64 bits wide. PRs are 1 bit wide, while the UM holds 6 and the CFM 38 bits; depicted below:

22 Itanium Register File (figure). Register columns shown: GR, FR, PR, BR, AR. Named application registers: ar0 to ar7 = KR0 to KR7, ar16 = RSC, ar17 = BSP, ar18 = BSPSTORE, ar19 = RNAT, ar21 = FCR, ar30 = FDR, ar32 = CCV, ar36 = UNAT, ar40 = FPSR, ar44 = ITC, ar65 = LC, ar66 = EC, up to ar127. Also shown: ip (bits 63..0), cfm (bits 37..0), user mask um (bits 5..0), CPUID registers cpuid0 to cpuid n, and PMD registers pmd0 to pmd m (bits 63..0).

23 Itanium Registers GR. The 128 GR registers are the common workhorses during computation. They contain integer values being computed. It is possible to use these integer values as machine addresses; thus GRs can be used as pointers in load and store operations. All machine instructions can refer to these registers, for reading and writing values. In addition to the 64 data bits, each GR has an associated NAT bit, which stands for Not A Thing. NAT is 1 if the associated register has not been initialized with valid data.

24 Itanium Registers GR. NATs support speculation. For example, if a speculative load is issued but aborted before the value arrives in its destined GR, the NAT state records that fact. This enables integrity of the machine's exception process. There are 2 groups of GR registers: the first 32, GR0 through GR31, are visible to all software and are used to hold globally computed, intermediate values. However, GR0 is read-only, providing the constant 0, 64 bits long.

25 Itanium Registers GR. The next 96, GR32 to GR127, are used to implement a small but frequently used portion of the top of the run-time stack; i.e. they work like a special-purpose top-of-stack cache. These stack registers are made available to SW by allocation of a register stack frame, which includes from 0 to 96 registers. Registers not used from this subset are inaccessible to general SW. The stack frame portion implemented via GRs is further partitioned into subsections, one meant to hold local registers, the other output registers, i.e. results of the current function call.

26 Sample Stack Frame, Generic (figure). Labels shown: sp, Locals + Temps, Stack Marker, Stack Frame, bp, Actual Parameters

27 Itanium Predicate Registers PR. Execution of most IPF instructions can be predicated by one of the PRs. Value 1 in the PR means the operation can be completed normally. PR value 0 means the result will not be posted (AKA not committed), even if it has already been computed; i.e. there will be no stores and no impact on any AR of the machine. An example of an instruction that cannot be predicated is the loop operation.

28 Itanium Predicate Registers. The PRs are also partitioned into 2 sections: PR0 through PR15 are static PRs; the other 48 are so-called rotating PRs. PR0 is an exceptional register: it can only be read, and its value is always 1, meaning the predicate is true; thus PR0 denotes unconditional execution. The remaining 48 PRs are used to hold stage predicates, used during software pipelining.

29 Branch Registers BR. IPF instructions are grouped in bundles, which are 16-byte aligned byte sequences holding executable code. Hence their rightmost 4 address bits will always be 0 due to alignment; these 4 address bits don't need to be stored explicitly. Execution of an indirect branch requires an explicit operand. On the Itanium architecture this operand is a branch register; a branch register BR holds the branch destination. The machine then loads the value of the referenced BR into the IP register and execution continues from there; IP stands for Instruction Pointer. Executing branch-related instructions is the way to directly affect the value in the instruction pointer, the register that holds the address of the next bundle to be executed.

30 Current Frame Marker Register CFM. Note: the Frame Marker is often referred to in the literature as Stack Frame; its fixed portion as Stack Marker. Each function has a specific stack frame associated with it, which is created at function invocation; it is cleared at function return. If all relevant data of a function's stack frame fit, they are placed in the stack of general registers; else the overflowing data must reside in memory. Either way, the current frame marker (CFM) holds the frame marker for the function that is currently active. Generally, most functions have small stack frames.

31 Current Frame Marker Register CFM. Layout of the CFM register, from most to least significant field: rrb.pr | rrb.fr | rrb.gr | sor | sol | sof. Meaning of bits in CFM:
    Name    Field meaning
    sof     Bits 0..6: total size of stack frame
    sol     Size of local part of stack frame, in words
    sor     Size of rotating portion of stack frame; the number of rotating registers is 8 times the sor value
    rrb.gr  Register rename base for GRs
    rrb.fr  Register rename base for FRs
    rrb.pr  Register rename base for PRs
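A 38-bit register made of packed fields is easy to model with shifts and masks. The sketch below is a toy model, not a hardware description: only the sof position (bits 0..6) survives in the slide text, and the other bit positions are an assumption taken from the commonly documented CFM layout (sol 7..13, sor 14..17, rrb.gr 18..24, rrb.fr 25..31, rrb.pr 32..37). The sample values correspond to the `alloc ..., 0, 4, 1, 0` used in the Hello World listing (5 stacked registers in total, 4 of them local, none rotating).

```python
# Field table: name -> (lowest bit, width). Only sof's position is from the slide;
# the remaining positions are assumed from the published CFM layout.
CFM_FIELDS = {
    "sof": (0, 7),
    "sol": (7, 7),
    "sor": (14, 4),
    "rrb_gr": (18, 7),
    "rrb_fr": (25, 7),
    "rrb_pr": (32, 6),
}


def encode_cfm(**values) -> int:
    """Pack named field values into a single CFM integer."""
    cfm = 0
    for name, value in values.items():
        low, width = CFM_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        cfm |= value << low
    return cfm


def decode_cfm(cfm: int) -> dict:
    """Split a CFM integer into its named fields."""
    return {
        name: (cfm >> low) & ((1 << width) - 1)
        for name, (low, width) in CFM_FIELDS.items()
    }


cfm = encode_cfm(sof=5, sol=4, sor=0)
fields = decode_cfm(cfm)
rotating_registers = 8 * fields["sor"]  # per the slide: 8 times the sor value
print(fields["sof"], fields["sol"], rotating_registers)  # 5 4 0
```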

32 Application Registers AR. Application registers (t.b.d.):
    Register       Mnemonic       Description
    ar0 to ar7     KR0 to KR7     Kernel registers
    ar8 to ar15                   Reserved
    ar16           t.b.d.

33 Instruction Pointer IP. IPF instructions are fetched in units of bundles, which are chunks of 16 bytes, or 128 bits. Bundles are stored bundle-aligned. The ip can address 18,446,744,073,709,551,616 different bytes (but only at bundle addresses). The rightmost 4 bits of the ip will thus always be zero, due to bundle alignment. Hence these 4 bits don't need to be stored on the microprocessor.
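The alignment argument can be checked with a few bit operations. A minimal sketch with hypothetical helper names, treating an address as a plain Python integer:

```python
BUNDLE_BYTES = 16  # one bundle = 16 bytes = 128 bits


def is_bundle_aligned(address: int) -> bool:
    """True if the low 4 address bits are zero."""
    return address & (BUNDLE_BYTES - 1) == 0


def bundle_base(address: int) -> int:
    """Address of the bundle that contains the given byte address."""
    return address & ~(BUNDLE_BYTES - 1)


def bundle_index(ip: int) -> int:
    """Drop the 4 always-zero bits of a bundle-aligned ip; 60 bits carry information."""
    if not is_bundle_aligned(ip):
        raise ValueError("ip must be bundle-aligned")
    return ip >> 4


print(is_bundle_aligned(0x4000))  # True
print(hex(bundle_base(0x4007)))   # 0x4000
print(bundle_index(0x40))         # 4
```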

34 Performance Monitor Data Register. These are architecture-provided resources that record the use of hardware modules. Their contents are read-only for SW. Contrary to performance monitor registers on Intel Pentium architectures, they are user-visible on Itanium!

35 Itanium ISA: Instruction Set Architecture

36 Instruction Set Architecture ISA: Parallelism, Dependences, and Groups. Itanium instructions packaged in groups can execute in parallel; this allows fast execution, if HW is available! The assembly programmer or compiler may craft groups as large as desired; the performance consequence is that all operations embedded in a single group can be executed simultaneously, in parallel, saving time over the equivalent sequential execution. The physical silicon angle of this is: of all operations that could be executed in parallel, only those are actually performed in parallel for which HW resources exist. E.g. on an Itanium 2 implementation of IPF, there are 6 units available to operate in parallel.

37 Instruction Set Architecture ISA: Parallelism, Dependences, and Groups. If fewer actions are enclosed in a group than there are HW units, some HW will idle. If more actions are included in a group than the HW can handle at once, then all HW elements are active, yet some degree of possible parallelism will be lost; future HW implementations may execute that same object code faster due to the higher degree of parallelism. Parallel execution is not feasible if dependences exist between instructions. On Itanium these dependences are not resolved by the machine. It is the human programmer or optimizer that explicitly tracks what can be done in parallel and what must be done in sequence. The machine just runs it; goal: BE FAST!

38 Instruction Set Architecture ISA: Parallelism, Dependences, and Groups. If a result has to be computed first before it can be read somewhere else (memory or register), a true dependence exists, AKA data dependence; it is conventional to just say dependence. On Itanium we call this a RAW (Read after Write) dependence. If a result has to be read first before it can be re-computed, a false dependence is created, AKA anti-dependence. On Itanium this is named WAR (Write after Read) dependence. If a result has to be computed first before it can be computed again, assuming that an intermediate reference is possible, an output dependence is created. Itanium calls this third dependence WAW (Write after Write) dependence.

39 Instruction Set Architecture ISA: Parallelism, Dependences, and Groups. In all these cases, the prior operation has to complete before the dependent one can be started; e.g.:
    ld8 r14 = [r3]        -- load GR14 w. 8 bytes addr. by GR3
    add r15 = r14, r16    -- integer sum into GR15, RAW dep
This is an example of RAW dependence, AKA true dependence. The loading of an 8-byte value into (8-byte) register GR14 must complete first, before the addition of the 2 long integer values, held in GR14 and GR16, can be started. Note the assembler register names: r14, and not gr14. This is the Intel and HP assembly language convention! Another assembler may use different conventions.
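The three dependence kinds can be expressed as a small classifier over the registers an instruction writes and reads. This is a simplified sketch with a hypothetical `Instr` record (one destination, a tuple of sources); real instructions can have several destinations and memory dependences, which are ignored here:

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """Toy instruction: at most one destination register and some source registers."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def dependences(first: Instr, second: Instr) -> Set[str]:
    """Dependences that force `second` to wait for `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes (true dependence)
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")  # second overwrites what first still has to read
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")  # both write the same register (output dependence)
    return found


load = Instr(dest="r14", srcs=("r3",))        # ld8 r14 = [r3]
add = Instr(dest="r15", srcs=("r14", "r16"))  # add r15 = r14, r16
print(dependences(load, add))                 # {'RAW'}
```

Two instructions with an empty result set are candidates for the same group; anything else needs a stop between them.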

40 Instruction Set Architecture ISA: Assembly Language Format. Format of an Itanium assembler instruction: in the meta-syntax, [ and ] brackets mean that the bracketed portion of the instruction is optional. In assembly syntax, square bracket pairs [] express indirection. Be careful not to get confused by the 2 different contexts!
    [(pr)] mnemonic[.comp] dest = src1 [, src2 [, src3 ] ]
Meaning of the various assembly language fields:

41 Instruction Set Architecture ISA
    Syntax    Name                Meaning
    (pr)      Predicate register  Used to predicate execution; if the value is 0 (false), the result is not committed; if 1 (true), the result is committed. pr0 is always 1, hence the associated instructions are executed unconditionally
    mnemonic  Instruction         Name of the instruction, telling the assembler which operation to perform
    comp      Completer           Further qualifies or completes the instruction specification. There may be multiple completers per instruction; not all instructions have a completer
    dest      Destination         The destination of the specified instruction. Choices are: register or memory
    src1      Source one          Source operand. Not all instructions require a source. Some instructions allow multiple sources. Sources may be immediate operands or registers. Memory can be a source via indirection (through a register)
    src2      Source two          Ditto
    src3      Source three        Ditto

42 Instruction Set Architecture ISA: Assembly Language Format. A sample assembly language instruction is shown next:
    (p0) add r5 = r4, r3, 1    // (p0) can be skipped
This is an integer add instruction that sums up the integer values in GR4 and GR3 and also adds the integer literal 1. It assigns the sum to register GR5. Since the predicate register used is PR0, which is always true, the commit of the sum to register GR5 is unconditional, as if no predicate qualifier had been given. Predicate registers, when listed, are enclosed in ( ) parentheses. Not all instructions allow or need a completer. Typical completers are shown below. Some instructions allow multiple completers, notably the memory access instructions and branch instructions.
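The effect of the qualifying predicate on this add can be modeled in a few lines. The sketch below uses plain Python lists as toy register files; the function name and the 64-bit wrap-around mask are illustrative choices, not part of the slides:

```python
MASK64 = (1 << 64) - 1


def predicated_add(gr, pr, qp, dest, src1, src2, imm=0):
    """Toy model of `(qp) add dest = src1, src2, imm`."""
    result = (gr[src1] + gr[src2] + imm) & MASK64  # may be computed regardless of qp
    if pr[qp]:                                     # committed only under a true predicate
        gr[dest] = result
    return gr


gr = [0] * 128          # general registers
pr = [0] * 64           # predicate registers
pr[0] = 1               # p0 always reads as 1
gr[3], gr[4] = 10, 20

predicated_add(gr, pr, qp=0, dest=5, src1=4, src2=3, imm=1)
print(gr[5])  # 31

predicated_add(gr, pr, qp=7, dest=5, src1=4, src2=3, imm=100)
print(gr[5])  # 31, because p7 is 0 and the new sum is not committed
```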

43 Instruction Set Architecture ISA
    Completer  Meaning
    .a         For advanced load; check later if successful
    .c         Check
    .clr       If advanced load was not successful, clear the reg
    .nc        No clear
    .s         Speculative, e.g. for load; NOT allowed for store!
    .many      t.b.d.
    .few       t.b.d.
    .excl      t.b.d.
    Many more: .equ, .unc, etc.

44 Instruction Set Architecture ISA: Itanium Bundle Format. Executable code on Itanium comes in units of bundles. A bundle consists of 3 instructions, all grouped with an associated template. The template completes the instruction specification and, above all, defines group boundaries. A boundary is also known as a stop. A stop defines where one group ends and another group starts. If no stop is included in a template, this means that the bundle will be part of a larger group, consisting of more instructions in the next bundle.

45 Instruction Set Architecture ISA: Itanium Bundle Format. Each instruction is 41 bits long; a template consumes 5 bits, with one template per bundle. With 3 instructions per bundle, the overall bundle length is 3 * 41 + 5 = 128 bits, fitting into 16 bytes. All bundles are bundle-aligned, which is easily accomplished by having the first bundle reside on a mod-16 memory boundary; from then on all will be aligned on 16-byte boundaries. With the memory bus being 128 bits wide (or wider on future IPF implementations) and bundles being bundle-aligned, fetching instruction memory is fast, requiring one single transfer on the bus.

46 Instruction Set Architecture ISA: Itanium Bundle Format. General layout of a bundle is shown next, with bits ordered from 0 through 127, increasing right to left:
    | instruction 2 (bits 127..87) | instruction 1 (bits 86..46) | instruction 0 (bits 45..5) | template (bits 4..0) |
The template serves as a means for the compiler to communicate additional information about instructions 0, 1, and 2, without which they could be ambiguous. One such key piece of information is the placement of an instruction group stop, written ;; in assembler.
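The 128-bit layout (a 5-bit template in the low bits followed by three 41-bit instruction slots) can be sketched with shifts and masks. The function names are hypothetical and the slot contents are arbitrary test patterns, not real instruction encodings:

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOTS_PER_BUNDLE = 3
BUNDLE_BITS = TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS  # 5 + 3 * 41 = 128


def pack_bundle(template: int, slots) -> int:
    """Combine a template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly 3 instruction slots")
    bundle = template
    for index, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each slot must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle: int):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & ((1 << SLOT_BITS) - 1)
        for index in range(SLOTS_PER_BUNDLE)
    ]
    return template, slots


bundle = pack_bundle(0x01, [0x123, 0x456, 0x789])
print(BUNDLE_BITS)            # 128
print(unpack_bundle(bundle))  # (1, [291, 1110, 1929])
```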

47 Instruction Set Architecture ISA: Itanium Bundle Format. A group stop can occur after instruction 2, or 1, or 0, indicating that an earlier group must complete execution before another starts. But the Itanium bundle format allows at most 2 stops in a bundle. If 3 stops are needed, a NOOP must be packed into one of the instruction slots, to effectively create 2 physical groups, with the third being the NOOP, whose execution order does not matter. Compiler-generated code performs this workaround automatically.

48 Instruction Set Architecture ISA: Itanium Bundle Format. The template specifies which types of instructions are assembled into slots 0, 1, and 2. IPF instructions are partitioned into the following 6 groups:
    Type   Meaning
    A      ALU: integer or memory unit
    I      Non-ALU: integer unit
    M      Memory unit
    F      Floating-point unit
    B      Branch unit
    L + X  Extended unit, or branch unit

49 Instruction Set Architecture ISA: Itanium Bundle Format. Providing such information in the template speeds up instruction decoding, improving execution speed. A list with the Instruction Set Architecture (ISA) templates and embedded stops is shown next. Note that there are at most 2 stops in any of the formats. On an architecture that aims to have large groups, it seems logical to have few stops (max 2) per bundle.

50 Instruction Set Architecture ISA (;; marks a stop after that slot)
    Template #  Type    Slot 0          Slot 1               Slot 2
    0 = 0x00    MII     Memory unit     Integer unit         Integer unit
    1 = 0x01    MII_    Memory unit     Integer unit         Integer unit ;;
    2 = 0x02    MI_I    Memory unit     Integer unit ;;      Integer unit
    3 = 0x03    MI_I_   Memory unit     Integer unit ;;      Integer unit ;;
    4 = 0x04    MLX     Memory unit     L unit?              Extended unit
    5 = 0x05    MLX_    Memory unit     L unit?              Extended unit ;;
    6 = 0x06    reserved
    7 = 0x07    reserved
    8 = 0x08    MMI     Memory unit     Memory unit          Integer unit
    9 = 0x09    MMI_    Memory unit     Memory unit          Integer unit ;;
    10 = 0x0a   M_MI    Memory unit ;;  Memory unit          Integer unit
    11 = 0x0b   M_MI_   Memory unit ;;  Memory unit          Integer unit ;;
    12 = 0x0c   MFI     Memory unit     Floating-point unit  Integer unit
    13 = 0x0d   MFI_    Memory unit     Floating-point unit  Integer unit ;;
    14 = 0x0e   MMF     Memory unit     Memory unit          Floating-point unit
    15 = 0x0f   MMF_    Memory unit     Memory unit          Floating-point unit ;;
    16 = 0x10   MIB     Memory unit     Integer unit         Branch unit
    17 = 0x11   MIB_    Memory unit     Integer unit         Branch unit ;;
    18 = 0x12   MBB     Memory unit     Branch unit          Branch unit
    19 = 0x13   MBB_    Memory unit     Branch unit          Branch unit ;;
    20 = 0x14   reserved
    21 = 0x15   reserved
    22 = 0x16   BBB     Branch unit     Branch unit          Branch unit
    23 = 0x17   BBB_    Branch unit     Branch unit          Branch unit ;;
    24 = 0x18   MMB     Memory unit     Memory unit          Branch unit
    25 = 0x19   MMB_    Memory unit     Memory unit          Branch unit ;;
    26 = 0x1a   reserved
    27 = 0x1b   reserved
    28 = 0x1c   MFB     Memory unit     Floating-point unit  Branch unit
    29 = 0x1d   MFB_    Memory unit     Floating-point unit  Branch unit ;;
    30 = 0x1e   reserved
    31 = 0x1f   reserved

51 Instruction Set Architecture ISA: Itanium Bundle Format. The difference between the above templates 0x00 and 0x01, both being MII type operations, is: after instruction 2 in template 0x01 there is a stop, while in template 0x00 there is none. In other words, the next bundle after the one for template 0x00 will belong to the same group, and a higher degree of parallelism will be possible there.
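The template table lends itself to a lookup that returns the unit types and the stop positions for a template number. The sketch below transcribes the table using the slide's own naming, where an underscore marks a stop after the preceding slot; the function names are hypothetical:

```python
# Template number -> type name as listed on the slide ("_" = stop after the preceding slot).
TEMPLATES = {
    0x00: "MII", 0x01: "MII_", 0x02: "MI_I", 0x03: "MI_I_",
    0x04: "MLX", 0x05: "MLX_",
    0x08: "MMI", 0x09: "MMI_", 0x0A: "M_MI", 0x0B: "M_MI_",
    0x0C: "MFI", 0x0D: "MFI_", 0x0E: "MMF", 0x0F: "MMF_",
    0x10: "MIB", 0x11: "MIB_", 0x12: "MBB", 0x13: "MBB_",
    0x16: "BBB", 0x17: "BBB_", 0x18: "MMB", 0x19: "MMB_",
    0x1C: "MFB", 0x1D: "MFB_",
}


def parse_template(code: int):
    """Return (unit letters per slot, slot indexes that are followed by a stop)."""
    name = TEMPLATES.get(code)
    if name is None:
        raise ValueError(f"template 0x{code:02x} is reserved")
    units, stops = [], []
    for char in name:
        if char == "_":
            stops.append(len(units) - 1)  # stop belongs to the slot just read
        else:
            units.append(char)
    return units, stops


def group_continues(code: int) -> bool:
    """True if the bundle's last slot is not followed by a stop."""
    _, stops = parse_template(code)
    return 2 not in stops


print(parse_template(0x03))   # (['M', 'I', 'I'], [1, 2])
print(group_continues(0x00))  # True
print(group_continues(0x01))  # False
```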

52 Instruction Set Architecture ISA: Itanium Assembly Code. A group is a sequence of 1 or more instructions delimited by a stop. The first instruction in a whole program is thought to be preceded by a stop. Similarly, the last instruction of a complete program is thought to be followed by a stop. All instructions placed into a single group can be executed in parallel. Whether or not they will depends on the number of hardware resources available. In the initial Itanium architecture only 6 resources were available. In a later implementation, more HW resources may become available, thus potentially speeding up execution of the same old, unchanged Itanium code on a future generation. The ;; indicates to the assembler where one group ends and thus the next group starts.

53 Instruction Set Architecture ISA: Itanium Assembly Code. Some assembly language instructions follow:
    cmp.eq p1, p2 = r33, r34
This checks general purpose registers 33 and 34 for equality; if equal, predicate register 1 is set to true and predicate register 2 to false. Otherwise p1 is set to false and p2 to true. A more complicated case is:
    (p3) cmp.eq.unc p1, p2 = r33, r34
This checks if predicate register 3 is true at the start. If so, and if registers GR33 and GR34 are equal, register p1 is set to true and p2 to false, else the reverse. Else, i.e. if p3 is false a priori, predicate registers 1 and 2 are both set to false.
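The two compare forms can be modeled on a toy predicate register file. This is a sketch under stated assumptions: the function name is hypothetical, and the case of a plain compare under a false qualifying predicate (targets left unchanged) is not spelled out on the slide; it is modeled here as "result not committed", in line with the predication rule given earlier:

```python
def cmp_eq(pr, qp, p_true, p_false, a, b, unc=False):
    """Toy model of `(qp) cmp.eq[.unc] p_true, p_false = a, b` on a list of predicate bits."""
    if pr[qp]:
        pr[p_true] = int(a == b)
        pr[p_false] = int(a != b)
    elif unc:
        # .unc form with a false qualifying predicate: both targets are cleared
        pr[p_true] = 0
        pr[p_false] = 0
    # plain form with a false qualifying predicate: targets stay as they were (assumption)
    return pr


pr = [0] * 64
pr[0] = 1                                   # p0 always reads as 1

cmp_eq(pr, 0, 1, 2, 7, 7)                   # cmp.eq p1, p2 = r33, r34 with equal values
print(pr[1], pr[2])                         # 1 0

pr[3] = 0
cmp_eq(pr, 3, 1, 2, 7, 7, unc=True)         # (p3) cmp.eq.unc ..., p3 false
print(pr[1], pr[2])                         # 0 0
```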

54 Assembler Source With & Without Stack Unwind Operations. From ref [8]

55 Assembler for Hello World, With
// hello_world.c assembly with unwind directive
// sample taken from ref [8]
// page 1/3
        .file "hello.c"
        .pred.safe_across_calls p1-p5, p16-p63
        .section .rdata, "a", "progbits"
        .align 8
.STRING1: stringz "Hello World!!!\n"
        .text
        .align 16
        .global hello#
        .proc hello#
hello:
        .prologue
        .save ar.pfs, r34

56 Assembler for Hello World, With
// hello_world.c assembly with unwind directive
// sample taken from ref [8]
// page 2/3
        alloc r34 = ar.pfs, 0, 4, 1, 0
        .vframe r35
        mov r35 = r12
        .save rp, r33
        mov r33 = b0                        // load branch register into GR33
        .body
        add r36 = @ltoff(.STRING1), gp ;;
        ld8 r36 = [r36]
        mov r32 = r1
        br.call.sptk.many b0 = printf#      // b0!
        ;;

57 Assembler for Hello World, With
// hello_world.c assembly with unwind directive
// sample taken from ref [8]
// page 3/3
        mov r1 = r32
        mov ar.pfs = r34
        mov b0 = r33                        // restore branch register
        .restore sp
        mov r12 = r35
        br.ret.sptk.many b0
        .endp hello#
        .global printf#
        .type printf#, @function

58 Assembler for Hello World, Without
// hello_world.c assembly without unwind directive
// sample taken from ref [8]
// page 1/3
// The string is defined in the read only data section
        .section .rdata, "a", "progbits"
        .align 8
.STRING1: stringz "Hello World!!!\n"
// definition of function hello is in text section
// Registers to be saved in local registers:
//   gp = r1   - loc0 = r32
//   rp = b0   - loc1 = r33
//   ar.pfs    - loc2 = r34
//   sp = r12  - loc3 = r35

59 Assembler for Hello World, Without
// hello_world.c assembly without unwind directive
// sample taken from ref [8]
// page 2/3
        .text
        .global hello
        .proc hello
hello:
        alloc loc2 = ar.pfs, 0, 4, 1, 0
        mov loc3 = sp
        mov loc1 = b0                       // save branch register b0
        add out0 = @ltoff(.STRING1), gp ;;
        ld8 out0 = [out0]                   // group of 3 instructions
        mov loc0 = gp
        br.call.sptk.many b0 = printf ;;

60 Assembler for Hello World, Without
// hello_world.c assembly without unwind directive
// sample taken from ref [8]
// page 3/3
        mov gp = loc0
        mov ar.pfs = loc2
        mov b0 = loc1
        mov sp = loc3
        br.ret.sptk.many b0
        .endp hello
        .global printf
        .type printf, @function

61 Bibliography
1. Triebel, Walter: IA-64 Architecture for Software Developers, Intel Press 2000, 308 pages
attachment_ciid=c2d2e0aecd2b7110vgnvcm d6e10rc RD&ciid=ce1fd701521c7110VgnVCM d6e10RCRD
7. Donald Knuth: Interview with Donald Knuth
8. Intel Itanium Architecture Assembly Reference Guide, 2002, Intel order number , at

62 Definitions

63 Definitions: Branch Elimination. Replacing object code that has conditional branches with code that has a straight-line execution path, lacking branches. The second version, with branches eliminated, must be semantically equivalent to the original code with branches. Everything else being equal, the version without branches generally executes faster due to fewer cache misses.

64 Definitions: Bundle. Group of 3 instructions plus a template that all fit into a 16-byte long, 16-byte aligned section of instruction memory on Itanium. Total number of bits = 3 * 41 + 5 = 128.

65 Definitions: Conditional Move. A move instruction that transfers bits from source to destination, but only if an associated condition is true. Otherwise the instruction operates like a noop. Such a move can serve as a special case of branch elimination. For example, the C source construct:
    if ( a > 0 ) x = 99;          -- HL source program
could be mapped into the conditional move:
    cmov x, #99, a, #0, gt        -- hypothetical asm
which has no branches. Source operand #99 is moved into memory location x only if the > condition holds between operand a and integer literal 0.
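The idea of a move without a branch can be mimicked with a mask-based select. The sketch below is only an analogy in Python (which offers no control over machine branches); the function name `cmov` mirrors the hypothetical mnemonic on the slide and is not a real API:

```python
def cmov(dest: int, src: int, condition: bool) -> int:
    """Return src if condition holds, else dest, using masks instead of an if statement."""
    mask = -int(condition)              # all ones (-1) when true, 0 when false
    return (src & mask) | (dest & ~mask)


def assign_if_positive(a: int, x: int) -> int:
    """Branch-free equivalent of: if ( a > 0 ) x = 99;"""
    return cmov(x, 99, a > 0)


print(assign_if_positive(a=5, x=1))   # 99
print(assign_if_positive(a=-5, x=1))  # 1
```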

66 Definitions: Endian, Endianness. A convention that defines in which order the higher-valued bytes of a multi-byte data object are addressed. It can be programmed on Itanium with the be bit. If the higher address byte holds the higher numeric value, we call this little-endian, typical on the Intel x86 architecture. The other way around we call big-endian ordering, typical on the IBM 370 architecture.

67 Definitions: EPIC. Explicitly Parallel Instruction Computing, with IPF being the first commercial architecture that implements EPIC. Note IPF's ability to also execute old Intel x86 and old HP PA object code.

68 Definitions: Epilogue. When the steady state of a software pipelined loop completes, there may be operands yet to be used and operations yet to be computed that would not fit into the steady state. These last operands must be consumed, some even generated, during the epilogue, and ultimately the pipeline must be drained. This is accomplished in the object code after the steady state, and that portion of code is called the epilogue. See also prologue.

69 Definitions: Group. A sequence of one or more instructions ending at a defined stop; stops are encoded in the bundle templates. A group may cover part of a bundle, one bundle, or several bundles. The stop means the hardware cannot start executing any subsequent group until the current group has completed. Syntax notation for a stop in Itanium assembler is the double semicolon ;;

70 Definitions: Parallel Comparison. A composite source program condition of the form ( ( a > b ) && ( c <= d ) ) requires multiple steps to compute a boolean predicate. Generally, on a sequential architecture these multiple steps are combined via explicit instructions for and-ing and or-ing, or else the flow of control of execution selects a matching true label. All this takes time. The Itanium processor allows parallel evaluation of certain composite boolean expressions in one single step. The result can be used as a predicate in subsequent instructions. Notice that such combined boolean expressions must be side-effect free. This is not equivalent to C's short-circuit evaluation of complex boolean expressions!

71 Definitions: Parallel Comparison, Cont'd. For example, another complex boolean expression ( fun( j, k ) && ( i < MAX ) ) cannot be mapped into a parallel EPIC comparison, since one operand is a function call fun( j, k ), possibly with a large number of parameters, which may have a side effect on one of the other operands, for example i, which is yet to be compared. This type of boolean expression is mapped into sequential code.
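Why side effects matter can be shown by comparing short-circuit evaluation with an evaluation that always computes both operands, as a parallel compare would. This is a language-level analogy with hypothetical helper names; operands are passed as callables so that their evaluation can be observed:

```python
def short_circuit_and(left, right) -> bool:
    """C-style &&: the right operand is evaluated only if the left one is true."""
    return bool(left()) and bool(right())


def parallel_and(left, right) -> bool:
    """Both operands are always evaluated, then combined."""
    left_value = bool(left())
    right_value = bool(right())
    return left_value and right_value


log = []


def always_false() -> bool:
    return False


def noisy_true() -> bool:
    log.append("called")  # the side effect: observable even if the result is discarded
    return True


print(short_circuit_and(always_false, noisy_true), len(log))  # False 0
print(parallel_and(always_false, noisy_true), len(log))       # False 1
```

Both forms return the same truth value, but only the short-circuit form skips the side effect, which is why operands with side effects have to be mapped into sequential code.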

72 Definitions: Predication. The association of a boolean condition with the execution of an instruction sequence. This allows the following: two instruction streams can be executed in parallel, clearly requiring multiple hardware modules, which are provided on EPIC. Both streams have a predicate associated with their operations. Only the stream with the true predicate is actually retired; the other will be aborted and ignored. The abort can happen as soon as the predicate is known. This means the computation of the predicate can proceed in parallel with the execution of the two code streams, but must complete by the time the 2 code streams wait to learn which one is the winner. An ISA with predication requires bits to name the predicate to use and the direction (true or false) to select. Also, the discarded code path must not cause side effects, such as a write to memory!

73 Prologue Definitions. Before a software pipelined loop body can be initiated, hardware resources (e.g. registers) must be initialized; we say the loop must be primed. This is accomplished in the object code executed before the steady state, called the Prologue. See also epilogue.

74 Register File Definitions. The IPF has a rich set of registers. This includes 128 general purpose registers (for integer operations), 128 floating-point, 64 predicate, 8 branch, and 128 so-called application registers. Also, a variety of special purpose registers is visible; visible means accessible by the assembly language program. These include a user mask, stack marker (frame marker), ip, processor id, and performance monitoring registers.

75 Speculation Definitions. If it is suspected, but not certain, that an operand o will be used in the future, and this operand is not readily available (not yet in a high-speed register), and it takes long to fetch o, a processor may initiate the fetch well before o is actually used. Advantage: by the time o is needed, it is already available without delay. Disadvantage: if the flow of control never reaches the place where o was thought to be needed, then the speculative fetch was superfluous. Speculation may still be worthwhile if a) no side effects occurred that are harmful to program correctness, and b) the hardware resource required to fetch o was idle anyway; then there is no loss!

76 Steady State Definitions. The software pipelined object code that is executed repeatedly, after the Prologue has completed and before the Epilogue becomes active, is called the Steady State. Each iteration of the Steady State makes some progress toward multiple iterations of the original source loop. See also prologue and epilogue.
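
The three phases can be sketched for a loop that doubles every element of a list. The three-stage split (load, compute, store) and the function name pipelined_double are assumptions chosen for illustration; the sketch shows only the schedule, not real parallel execution.

```python
def pipelined_double(src):
    """Compute [2 * x for x in src] using a software-pipelined schedule."""
    n = len(src)
    if n < 2:
        return [2 * x for x in src]  # too short to pipeline
    dst = [None] * n

    # Prologue: prime the pipeline "registers".
    loaded = src[0]        # load for iteration 0
    computed = 2 * loaded  # compute for iteration 0
    loaded = src[1]        # load for iteration 1

    # Steady state: each pass works on three source iterations at once.
    for i in range(2, n):
        dst[i - 2] = computed  # store for iteration i-2
        computed = 2 * loaded  # compute for iteration i-1
        loaded = src[i]        # load for iteration i

    # Epilogue: drain the iterations still in flight.
    dst[n - 2] = computed
    computed = 2 * loaded
    dst[n - 1] = computed
    return dst
```

Each pass through the steady-state loop advances three different iterations of the original loop by one stage each, which is the sense in which one pass makes progress toward multiple source iterations.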

77 Syllable Definitions. A syllable is the instruction-only portion of a bundle. A bundle always holds 3 instructions plus a template, the template specifying additional necessary information about the instructions. The instruction alone, without the needed template information, is a syllable.
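
A bundle can be modeled as a packed integer, as in the sketch below. The widths used (a 128-bit bundle with a 5-bit template in the low bits followed by three 41-bit instruction slots) are the commonly documented IA-64 layout and are not stated on the slide; the function names are hypothetical.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41  # 5 + 3 * 41 = 128 bits in total


def pack_bundle(template, slots):
    """Pack a template and three syllables into one 128-bit integer."""
    assert 0 <= template < (1 << TEMPLATE_BITS)
    assert len(slots) == 3
    word = template
    for n, syllable in enumerate(slots):
        assert 0 <= syllable < (1 << SLOT_BITS)
        word |= syllable << (TEMPLATE_BITS + n * SLOT_BITS)
    return word


def unpack_bundle(word):
    """Return (template, [syllable0, syllable1, syllable2])."""
    template = word & ((1 << TEMPLATE_BITS) - 1)
    mask = (1 << SLOT_BITS) - 1
    slots = [(word >> (TEMPLATE_BITS + n * SLOT_BITS)) & mask for n in range(3)]
    return template, slots
```

Unpacking returns each syllable separately from the template, which mirrors the distinction the definition draws between the instruction alone and the bundle as a whole.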


More information

ECE 172 Digital Systems. Chapter 3 Registers. Herbert G. Mayer, PSU Status 7/12/2018

ECE 172 Digital Systems. Chapter 3 Registers. Herbert G. Mayer, PSU Status 7/12/2018 ECE 172 Digital Systems Chapter 3 Registers Herbert G. Mayer, PSU Status 7/12/2018 1 Syllabus l Definitions, Introduction l Register Transfer & RTL l Register Shift Operations l Register Windows l Vector

More information

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The

More information

MILITARY i387tm SX MATH COPROCESSOR

MILITARY i387tm SX MATH COPROCESSOR MILITARY i387tm SX MATH COPROCESSOR Miitary Y Interfaces with Miitary i386 TM SX Microprocessor Y Expands Miitary i386 SX CPU Data Types to Incude 32-64- 80-Bit Foating Point 32-64-Bit Integers and 18-Digit

More information

Brad A. Myers Human Computer Interaction Institute Carnegie Mellon University Pittsburgh, PA

Brad A. Myers Human Computer Interaction Institute Carnegie Mellon University Pittsburgh, PA PAPERS CHI 98. 18-23 APRIL 1998 Scripting Graphica Appications ABSTRACT Writing scripts (often caed macros ) can be hepfu for automating repetitive tasks. Scripting faciities for text editors ike Emacs

More information

C PTS 3.3, class 0.05 Three-phase Stationary Test System

C PTS 3.3, class 0.05 Three-phase Stationary Test System C PTS 3.3, cass 0.05 Three-phase Stationary Test System Three phase test system PTS 3.3 C The PTS 3.3 C portabe test system consists of an integrated three-phase current and votage source and a three-phase

More information

(12) United States Patent

(12) United States Patent US006697794B1 (12) United States Patent (10) Patent N0.: Miby (45) Date of Patent: Feb. 24, 2004 (54) PROVDNG DATABASE SYSTEM NATVE 6,285,996 B1 * 9/2001 Jou et a1...... 707/2 OPERATONS FOR USER DEFNED

More information

A Taxonomy for Computer Architectures

A Taxonomy for Computer Architectures A Taxonomy for Computer Architectures David B. Skiicorn Queen s University at Kingston F ynn s cassification of architectures does not discriminate ceary between different mutiprocessor architectures.

More information

Chapter 3: KDE Page 1 of 31. Put icons on the desktop to mount and unmount removable disks, such as floppies.

Chapter 3: KDE Page 1 of 31. Put icons on the desktop to mount and unmount removable disks, such as floppies. Chapter 3: KDE Page 1 of 31 Chapter 3: KDE In This Chapter What Is KDE? Instaing KDE Seecting KDE Basic Desktop Eements Running Programs Stopping KDE KDE Capabiities Configuring KDE with the Contro Center

More information

Authorization of a QoS Path based on Generic AAA. Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taal

Authorization of a QoS Path based on Generic AAA. Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taal Abstract Authorization of a QoS Path based on Generic Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taa Advanced Internet Research Group, Department of Computer Science, University of Amsterdam.

More information

SA2100X-UG001 SA2100. User Guide

SA2100X-UG001 SA2100. User Guide SA2100X-UG001 SA2100 User Guide Version 2.0 August 7,2015 INSEEGO COPYRIGHT STATEMENT 2015 Inseego Corporation. A rights reserved. The information contained in this document is subject to change without

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Scheduing Announcement Homework 2 due on October 25th Project 1 due on October 26th 2 CSE 120 Scheduing and Deadock Scheduing Overview In discussing

More information

Realization of GGF DAIS Data Service Interface for Grid Access to Data Streams

Realization of GGF DAIS Data Service Interface for Grid Access to Data Streams Reaization of GGF DAIS Data Interface for Grid Access to Data Streams Ying Liu, Beth Pae, Nithya Vijayakumar Indiana University Boomington, IN IU-CS TR 613 ABSTRACT As the computation power of hardware

More information

Special Edition Using Microsoft Office Sharing Documents Within a Workgroup

Special Edition Using Microsoft Office Sharing Documents Within a Workgroup Specia Edition Using Microsoft Office 2000 - Chapter 7 - Sharing Documents Within a.. Page 1 of 8 [Figures are not incuded in this sampe chapter] Specia Edition Using Microsoft Office 2000-7 - Sharing

More information

Several Common Compiler Strategies. Instruction scheduling Loop unrolling Static Branch Prediction Software Pipelining

Several Common Compiler Strategies. Instruction scheduling Loop unrolling Static Branch Prediction Software Pipelining Several Common Compiler Strategies Instruction scheduling Loop unrolling Static Branch Prediction Software Pipelining Basic Instruction Scheduling Reschedule the order of the instructions to reduce the

More information