Decoding bitstreams for fun and profit

by lft

This article describes a technique for extracting bitfields from a long sequence of bytes stored in RAM.

As an example application, consider a scroller where the text is a string of 5-bit character codes. The entire text could then be stored as a bitstream, from which you read five bits at a time. But you might save some space if you represent, say, the eight most common characters as 0xxx, and all other characters as 1xxxxx. (This would also give you 40 different characters, rather than 32.) In that case, you'd first want to read a 1-bit field, to differentiate between the two cases. Then you'd read either a 3-bit field or a 5-bit field.

We will discuss how to do this efficiently and elegantly on the 6502. In particular, we will look at a technique that performs the two-stage procedure described above, and even navigates arbitrary decision trees, as part of its normal operation.

The schoolbook application for this kind of routine would be a Lempel-Ziv-Welch decruncher or a Huffman decoder. But anything is possible! For instance, you could use it to parse note events in a playroutine, instructions in a virtual machine, or entropy encoded sound samples.

We will start with a simple design, and then add complexity step by step, also optimising it to the point where the complete decoder is quite devilish to follow.

From bytes to bits

At the heart of the bitfield decoder is the shifter. This is essentially a mini-buffer of pending bits, represented as a single byte in the zero-page. As we shift out bits, the buffer occasionally becomes empty, at which time a new byte is loaded into it.

The clever part is how we represent the shifter. This is an old established technique, but it can be rather baffling when you see it for the first time. The idea is that the shifter contains (from left to right) 0-7 bits of pending data, followed by a single 1-bit that we'll refer to as the token, followed by zeros.

So, the following shifter contains three bits of data (1, 0, 1):

        %10110000

At program start, the shifter is initialised to $80. Here is a first attempt at a getbit routine:

getbit
        asl     shifter
        bne     done
        jsr     getbyte         ; Fetch the next byte of the stream into A.
        sta     shifter
        sec
        rol     shifter
done
        ; The bit is now in C.
        rts

In order to read a bit from the shifter, we first perform an ASL. Normally, this puts one bit of data in the carry flag, while also preparing the shifter for the next bit. But if the Z flag was set by the ASL, the buffer was in fact empty, and we shifted out the token bit. In that case, we grab a new byte, store it in the shifter, and then ROL to get the first data bit from the new byte. The ROL will also shift in a new token bit.

In practice, it would be slow to call a subroutine in order to fetch new bytes. After all, this will happen for every eighth bit, which is quite often. Instead we'll use some self-modifying code, and keep a pointer to the next byte inside an instruction operand, like this:

getbit
        asl     shifter
        bne     done
mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
        rol     shifter
done
        ; The bit is now in C.
        rts

We're using the X register because we're going to need A for something else soon. Note that the SEC has now been removed, because carry is already set from the previous token bit. If you want to get philosophical about it, you might say that it's the same token bit that gets re-used over and over.

Next, we will rearrange the code to reduce the number of branch-taken penalty cycles. From now on, we must make sure to CLC before calling getbit.

mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
getbit
        rol     shifter
        beq     mod_source
        ; The bit is now in C.
        rts

From bits to fields

So now we can read individual bits from the stream. Let's pack them together into bitfields! We could of course call the getbit routine from a loop:

getfield
        ; Y contains the requested number of bits
        lda     #0
        clc
field_loop
        jsr     getbit
        rol
        dey
        bne     field_loop
        ; The bitfield is now in A.
        rts

(This is why we had to preserve the A register during getbit.)

But again, subroutine calls are costly, so we'll merge getfield and getbit into a single routine. However, getting a single bit is now slower, because we have to treat it as a field of size one.

getbit
        ldy     #1
getfield
        ; Y contains the requested number of bits
        lda     #0
        clc
        jmp     field_loop
mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
field_loop
        rol     shifter
        beq     mod_source
        rol
        dey
        bne     field_loop
        ; C is clear
        rts

Note that, because we clear A at the beginning, we don't have to CLC before looping back to field_loop: each ROL of the accumulator shifts a zero out into carry.

But we can do better than this! Instead of representing the requested number of bits as an integer in the Y register, we can represent it as a single 1-bit in the accumulator. As we shift new data into the accumulator, the 1-bit gets closer and closer to the MSB, and when it finally falls off the edge, we terminate the loop:

getbit
        lda     #%10000000
getfield
        ; Position of 1-bit in A represents requested number of bits
        clc
        jmp     field_loop
mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
field_loop
        rol     shifter
        beq     mod_source
        rol
        bcc     field_loop
        rts

This preserves Y and saves two cycles per bit (DEY).
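The behaviour of the shifter and the field specifier can be checked with a small model in a higher-level language. The following Python sketch is not part of the original code: the class name BitReader, the attribute pos (standing in for the self-modified operand of LDX) and the helper _rol_shifter are made up for illustration, and the byte stream is assumed to be an ordinary bytes-like object.

```python
class BitReader:
    """Illustrative model of the token-bit shifter and the getfield loop."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0          # stands in for the self-modified LDX operand
        self.shifter = 0x80   # token bit only: the mini-buffer starts out empty

    def _rol_shifter(self, carry):
        # ROL on the zero-page byte; returns the bit that was shifted out.
        value = (self.shifter << 1) | carry
        self.shifter = value & 0xFF
        return value >> 8

    def getfield(self, spec):
        """spec is a field specifier: zeros, then a single 1-bit.
        %10000000 fetches 1 bit, %01000000 fetches 2 bits, and so on."""
        a = spec
        carry = 0                                  # CLC
        while True:
            carry = self._rol_shifter(carry)       # ROL shifter
            if self.shifter == 0:                  # BEQ: the token fell out
                self.shifter = self.data[self.pos]
                self.pos += 1
                # Carry is still 1 here, so this ROL re-inserts the token.
                carry = self._rol_shifter(carry)
            a = (a << 1) | carry                   # ROL A
            carry, a = a >> 8, a & 0xFF
            if carry:                              # BCC not taken: marker fell out
                return a


reader = BitReader([0b10110010, 0b11000000])
one_bit = reader.getfield(0b10000000)      # first bit of the stream
three_bits = reader.getfield(0b00100000)   # next three bits
five_bits = reader.getfield(0b00001000)    # next five bits, crossing a byte boundary
```

Note how the refill needs no special case in the caller: the five-bit fetch above straddles the two bytes, and the token bit is carried over from the old byte to the new one exactly as in the assembly version.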

Two-stage fields

Given the above routine, we are now in a position to implement the scroller scenario described in the introduction. Here is some code to fetch a new character from the bitstream:

getchar
        jsr     getbit
        bne     large
        ; 3-bit character code
        lda     #%00100000
        jmp     getfield
large
        ; 5-bit character code
        lda     #%00001000
        jsr     getfield
        clc
        adc     #8
        rts

Actually, we can shave off a byte and a pair of cycles by recognising that getfield always returns with carry set: We can safely omit the CLC and do ADC #7 instead.

In more complex scenarios, such as decrunchers, we often need to distinguish between more than two cases. Perhaps we read two bits in order to select between four differently-sized encodings:

        Value range     Coded as        Value offset (what to add to x)
        0-1             00x             0
        2-5             01xx            2
        6-21            10xxxx          6
        22-149          11xxxxxxx       22

Rather than spelling out these four cases as different paths through the code, we can use a table-based approach. This helps keep down the size of the decruncher, which is often very important. It will also enable some more optimisations further down the rabbit hole.

We will use one table for the field widths, and one table for the value offsets.

getvalue
        lda     #%01000000      ; Get two bits.
        jsr     getfield
        tay
        lda     fields,y
        jsr     getfield
        clc
        adc     offsets,y
        ; 9-bit value returned in A and C.
        rts

fields
        .byt    %10000000       ; Get one more bit.
        .byt    %01000000       ; Get two more bits.
        .byt    %00010000       ; Get four more bits.
        .byt    %00000010       ; Get seven more bits.
offsets
        .byt    0
        .byt    2
        .byt    6
        .byt    22

Note that in the example, the maximum value returned is 149. Therefore, rather than saying that the result is a 9-bit value, we could simply say that the routine returns with carry undefined, and with an 8-bit result in A. We could then eliminate the CLC, and compensate by subtracting one from each value in the offset table. The reason why we can't do this for 9-bit values is that the first entry in the offset table would become $ff, and this would cause values 0 and 1 to instead come out as 256 and 257.

Decoding with arbitrary decision trees

Consider again our scroller example. Suppose we wish to encode a particularly common character (such as space) using a single bit. We might decide on the following encoding scheme:

        Value range     Coded as        Value offset (what to add to x)
        0               0               0
        1-8             10xxx           1
        9-40            11xxxxx         9

To fetch a character now, we start by getting a single bit. Based on the value of this bit, we're either done or we fetch one more bit. Based on this bit, we then either fetch three or five bits.

This algorithm is essentially a tree of decisions, as illustrated by the following flowchart:

[Flowchart: decision tree for the character encoding above]

We will refer to the rhombus-shaped nodes as branch nodes and the rounded-rectangle nodes as return nodes.

Such decision trees are usually implemented explicitly, as code. But for large trees, the decoder becomes unwieldy. Next up, we'll see how we can represent decision trees more compactly using tables.

In each node of the flowchart above, we first fetch a bitfield (possibly of size zero), and then either:

  * Branch to a different node, or
  * Add a constant and return.

It is time to introduce another decoding trick! So far, the field specifiers (what we put in A prior to calling getfield) have consisted of a number of zeros followed by a single set bit. But the remaining bits have no purpose yet, and they will be available in A when getfield returns, shifted into a position immediately to the left of the fetched bits.

So, if we call getfield with A set to 001ttttt (t is for tag), we'll get tttttxxx back, where x is the fetched bitfield. The most significant bit of the tag will also be in the sign bit of the status register.

Some decrunchers, e.g. Doynamite, use this to determine whether the value returned is complete, or whether it's just the high-byte of a larger value. In the latter case, the low-byte can be grabbed very quickly straight from the byte stream. Essentially, one tag bit is used to differentiate between two cases.

However, in the present technique, we wish to encode a generic decision tree, and for this we'll have to use more tag bits.

(In the following, the word branch will refer to branches in the flowchart, not 6502 branch instructions!)
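To make the tag mechanism concrete, here is a small Python model of what happens to the accumulator. It is only a sketch: the function getfield below takes its bits from a plain iterator rather than from the shifter, and the tag value 10110 is an arbitrary choice for illustration.

```python
def getfield(spec, bits):
    """Shift bits into an 8-bit accumulator until the marker 1-bit of the
    field specifier falls out of bit 7 (the BCC loop of the assembly)."""
    a = spec
    while True:
        a = (a << 1) | next(bits)   # ROL A with the fetched bit in carry
        carry, a = a >> 8, a & 0xFF
        if carry:
            return a


# 001ttttt with the (arbitrary) tag ttttt = 10110: fetch three bits.
spec = 0b00110110
result = getfield(spec, iter([1, 0, 1]))

# result is now tttttxxx, i.e. the tag followed by the fetched bits 101.
tag = result >> 3
field = result & 0b111
negative = bool(result & 0x80)      # what a BMI would test: the tag MSB
```

The tag costs nothing at run time: it simply rides along in the bits of the specifier that the marker does not use.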

Suppose we put a number on each node in the flowchart. The current node number will be kept in the Y register. From this number, we can deduce (using a lookup table) how many bits to fetch, whether we should branch or return after fetching, and in case of a branch node, what range of nodes we should branch to. All of this information can be encoded as a single byte, and placed in the accumulator before calling getfield.

As we have already seen, the number of leading zeros determines the field width. They are followed by a single set bit and a tag. We will use the most significant tag bit to keep track of what kind of node we're in. If this bit is clear, we're in a branch node, in which case the remaining tag bits will be used to encode the range of branch targets.

A separate lookup table, also indexed by the current node number in Y, will be used to hold the constants that are added in return nodes.

decode
        ldy     #4      ; Start at node 4, the last node in the table.
decode_loop
        ; Y represents the current node, and is an index into the field and
        ; offset tables.
        lda     fields,y

        ; In A, we now have:
        ;       a number of zero bits, controlling how many bits to fetch
        ;       a one bit
        ;       if we are in a return node:
        ;               a one bit (tag MSB)
        ;               fill up with zeros
        ;       if we are in a branch node:
        ;               a zero bit (tag MSB)
        ;               tag bits --> first target node (after shift)

        ; Special exception to the above:
        ; If we're going to fetch a zero-length field, A is zero.
        ; Handle that now.
        beq     done

        ; Otherwise, fetch the field.
        jsr     getfield

        ; In A, we now have:
        ;       a bit indicating whether we are in a branch or return node
        ;       more tag bits (all zero in case of a return node)
        ;       the field we just fetched

        ; Are we in a return node?
        bmi     done

        ; No, this was a branch node. The branch target is in A.
        ; Note that the target has been constructed automatically by
        ; concatenating the tag with the fetched bits: if we fetched 101,
        ; the target node number is the tag bits followed by 101.
        tay
        jmp     decode_loop
done
        ; Add constant and return.
        clc
        adc     offsets,y
        rts

fields
        .byt    %00000000       ; Node 0: Fetch no more bits.
        .byt    %10000001       ; Node 1: Fetch 1 bit, then branch to node 2/3.
        .byt    %00110000       ; Node 2: Fetch 3 bits, then return.
        .byt    %00001100       ; Node 3: Fetch 5 bits, then return.
        .byt    %10000000       ; Node 4: Fetch 1 bit, then branch to node 0/1.
offsets
        .byt    0               ; Add constant to obtain range 0-0.
        .byt    0               ; Unused (branch node)
        .byt    $80+1           ; Add constant to obtain range 1-8.
        .byt    $80+9           ; Add constant to obtain range 9-40.
        .byt    0               ; Unused (branch node)

A subtlety is that when we return without fetching anything (node 0), the accumulator will be zero before adding the constant. Otherwise, the tag bit leaves $80 added to the fetched field, and we have to compensate accordingly in the offset table.

The above code was organised for clarity. However, we can rearrange the loop to eliminate the JMP instruction. There's also no need to start by setting up a constant Y, as we could just as well load A directly. Since the first node is always a branch node, we won't be using Y after the fetch, so we can leave it uninitialised. Hence:

decode
        lda     #%10000000      ; Fetch 1 bit, then branch to node 0/1.
decode_loop
        jsr     getfield
        bmi     done
        tay
        lda     fields,y
        bne     decode_loop
done
        clc
        adc     offsets,y
        rts

fields
        .byt    %00000000       ; Node 0: Fetch no more bits.
        .byt    %10000001       ; Node 1: Fetch 1 bit, then branch to node 2/3.
        .byt    %00110000       ; Node 2: Fetch 3 bits, then return.
        .byt    %00001100       ; Node 3: Fetch 5 bits, then return.
offsets
        .byt    0               ; Add constant to obtain range 0-0.
        .byt    0               ; Unused (branch node)
        .byt    $80+1           ; Add constant to obtain range 1-8.
        .byt    $80+9           ; Add constant to obtain range 9-40.

The CLC at done can be removed if we adjust the offset table: We subtract one from each table entry that corresponds to a return node where a non-zero-sized field was fetched.

Putting it all together

Cramming an arbitrary decision tree into the field table is all very nifty, and it keeps down the size of the decoder considerably. But what about performance? Surely, putting a flowchart in a table can't be faster than simply coding it with explicit branch instructions?

But as a consequence of the table-driven design, there is now a great optimisation opportunity staring us in the face: We're down to a single call to the getfield routine, and that means we can inline it!

decode
        lda     #%10000000      ; Fetch 1 bit, then branch to node 0/1.
        clc
        jmp     decode_loop
mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
decode_loop
        rol     shifter
        beq     mod_source
        rol
        bcc     decode_loop
        bmi     done
        tay
        lda     fields,y
        clc
        bne     decode_loop
done
        ; Carry will be set if we got here via the BMI, i.e. after fetching
        ; a non-zero-sized field. Compensate in the table.
        adc     offsets,y
        rts

fields
        .byt    %00000000       ; Node 0: Fetch no more bits.
        .byt    %10000001       ; Node 1: Fetch 1 bit, then branch to node 2/3.
        .byt    %00110000       ; Node 2: Fetch 3 bits, then return.
        .byt    %00001100       ; Node 3: Fetch 5 bits, then return.
offsets
        .byt    0               ; Add constant to obtain range 0-0 (Carry clear).
        .byt    0               ; Unused (branch node)
        .byt    $7f+1           ; Add constant to obtain range 1-8 (Carry set).
        .byt    $7f+9           ; Add constant to obtain range 9-40 (Carry set).

Indeed, with such a flexible routine, one might even be able to drive all decoding from a single call site, and thus to inline the call to the decoder itself. For a real-world example of this, please have a look at the decruncher in Spindle 2.1.

A final touch

The code is already looking rather streamlined, but let's top it off with one more optimisation: We can get rid of two cycles for each step through the decision tree, by eliminating the CLC right before branching back to decode_loop.

The following trick is only possible if, for each node, the number in the field table is either zero (for a zero-size fetch) or strictly larger than the node number. Many decision trees have this property, because node numbers are small integers, while numbers in the field table tend to be large. If not, it may be possible to fix it by rearranging the node numbers.

The idea is to access the table a little differently: Instead of simply loading from it, we perform an ADC. Naturally, we then have to compensate in the table, by subtracting from each element the node number (which happens to be in A at the time of the addition) and 1 (for the carry flag, which is set).
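The arithmetic behind this compensation can be worked through in a few lines of Python. This is a sketch under stated assumptions: the field values are the ones implied by the node descriptions (fetch nothing; fetch 1 bit and branch to node 2/3; fetch 3 bits and return; fetch 5 bits and return), and the helper adc is a simplified model of the 6502 instruction in binary mode.

```python
# Field specifiers for nodes 0..3, as implied by the node descriptions.
fields = [0b00000000, 0b10000001, 0b00110000, 0b00001100]

# Precondition for the trick: every entry is zero or larger than its index.
assert all(f == 0 or f > n for n, f in enumerate(fields))

# Compensate each entry for the node number in A and for the set carry.
adjusted = [(f - n - 1) & 0xFF for n, f in enumerate(fields)]


def adc(a, operand, carry):
    """8-bit add with carry, returning (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


for n, f in enumerate(fields):
    a, carry = adc(n, adjusted[n], 1)   # A holds the node number, carry is set
    assert a == f                       # the original specifier is recovered
    # Non-zero specifiers leave carry clear, ready for the next ROL.
    # The zero-size node wraps around to zero and leaves carry set.
    assert carry == (1 if f == 0 else 0)
```

The wrap-around in the zero-size case is what makes the carry predictable on both exits: clear when looping back to fetch more bits, set when falling through to the final addition.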

With that, we are ready for the final version of the decoder. It is listed below in the form of a subroutine, but, as mentioned earlier, it should be inlined for maximum performance.

decode
        lda     #%10000000      ; Fetch 1 bit, then branch to node 0/1.
        clc
        jmp     decode_loop
mod_source
        ldx     buffer
        inc     mod_source+1
        bne     no_new_page
        inc     mod_source+2
no_new_page
        stx     shifter
decode_loop
        rol     shifter
        beq     mod_source
        rol
        bcc     decode_loop
        bmi     done
        tay
        adc     fields,y
        ; Carry is clear when branching.
        bne     decode_loop
        ; Carry is set.
done
        adc     offsets,y
        rts

        ; Each field entry is the specifier minus the node number minus 1.
fields
        .byt    %11111111       ; Node 0: Fetch no more bits.
        .byt    %01111111       ; Node 1: Fetch 1 bit, then branch to node 2/3.
        .byt    %00101101       ; Node 2: Fetch 3 bits, then return.
        .byt    %00001000       ; Node 3: Fetch 5 bits, then return.
offsets
        .byt    $ff             ; Add constant to obtain range 0-0.
        .byt    0               ; Unused (branch node)
        .byt    $7f+1           ; Add constant to obtain range 1-8.
        .byt    $7f+9           ; Add constant to obtain range 9-40.
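For readers who want to experiment with the decision tree without a 6502 at hand, the control flow of the decoder can be modelled in Python. The sketch below is an illustration only, not a translation of the listing: it reads bits from a plain iterator, keeps the unadjusted field specifiers and plain offsets (so it does not model the carry tricks), and the encode helper is a hypothetical counterpart written just to produce test input for the 0 / 10xxx / 11xxxxx scheme.

```python
FIELDS = [0b00000000, 0b10000001, 0b00110000, 0b00001100]
OFFSETS = [0, 0, 1, 9]          # plain offsets; the $80 tag is removed in code


def getfield(spec, bits):
    """Shift bits into A until the marker bit of the specifier falls out."""
    a = spec
    while True:
        a = (a << 1) | next(bits)
        carry, a = a >> 8, a & 0xFF
        if carry:
            return a


def decode(bits):
    node = None
    spec = 0b10000000           # root: fetch 1 bit, then branch to node 0/1
    while spec != 0:
        a = getfield(spec, bits)
        if a & 0x80:            # tag MSB set: return node
            return (a - 0x80) + OFFSETS[node]
        node = a                # branch node: tag plus fetched bits is the target
        spec = FIELDS[node]
    return OFFSETS[node]        # zero-size fetch: nothing to add to


def encode(value):
    """Hypothetical encoder for values 0..40, used only to build test input."""
    if value == 0:
        return [0]
    if value <= 8:
        return [1, 0] + [((value - 1) >> i) & 1 for i in (2, 1, 0)]
    return [1, 1] + [((value - 9) >> i) & 1 for i in (4, 3, 2, 1, 0)]


stream = iter(encode(0) + encode(5) + encode(40))
decoded = [decode(stream) for _ in range(3)]   # decoded == [0, 5, 40]
```

Changing the tree is then a matter of editing the two tables, which is the property that keeps the real decoder small.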

Conclusion

We have seen how to extract bitfields from byte sequences stored in RAM, using a highly efficient technique that is capable of navigating arbitrary decision trees as part of the decoding process.

From: Codebase 64 wiki (base:decoding_bitstreams)


More information

INSTITUTE OF ENGINEERING AND MANAGEMENT, KOLKATA Microprocessor

INSTITUTE OF ENGINEERING AND MANAGEMENT, KOLKATA Microprocessor INSTITUTE OF ENGINEERING AND MANAGEMENT, KOLKATA Microprocessor Subject Name: Microprocessor and Microcontroller Year: 3 rd Year Subject Code: CS502 Semester: 5 th Module Day Assignment 1 Microprocessor

More information

The X86 Assembly Language Instruction Nop Means

The X86 Assembly Language Instruction Nop Means The X86 Assembly Language Instruction Nop Means As little as 1 CPU cycle is "wasted" to execute a NOP instruction (the exact and other "assembly tricks", as explained also in this thread on Programmers.

More information

5.7. Microprogramming: Simplifying Control Design 5.7

5.7. Microprogramming: Simplifying Control Design 5.7 5.7 Microprogramming: Simplifying Control Design 5.7 For the of our simple MIPS subset, a graphical representation of the finite state machine, as in Figure 5.40 on page 345, is certainly adequate. We

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

0b) [2] Can you name 2 people form technical support services (stockroom)?

0b) [2] Can you name 2 people form technical support services (stockroom)? ECE 372 1 st Midterm ECE 372 Midterm Exam Fall 2004 In this exam only pencil/pen are allowed. Please write your name on the front page. If you unstaple the papers write your name on the loose papers also.

More information

AN1742. Programming the 68HC705J1A In-Circuit By Chris Falk CSG Product Engineering Austin, Texas. Introduction. Overview

AN1742. Programming the 68HC705J1A In-Circuit By Chris Falk CSG Product Engineering Austin, Texas. Introduction. Overview Order this document by /D Programming the 68HC705J1A In-Circuit By Chris Falk CSG Product Engineering Austin, Texas Introduction Overview This application note describes how a user can program the 68HC705J1A

More information

the SAP-2 I. Intro cmpt-150-arc Sections 8-8, 8-9, 9-4, 9-5, 9.6, We ll do this in bits and pieces, doing the beginning of each section first.

the SAP-2 I. Intro cmpt-150-arc Sections 8-8, 8-9, 9-4, 9-5, 9.6, We ll do this in bits and pieces, doing the beginning of each section first. I. Intro the SAP-2 cmpt-150-arc Sections 8-8, 8-9, 9-4, 9-5, 9.6, 9.8 1. We ll do this in bits and pieces, doing the beginning of each section first. 1. The SAP-2 adds a lot of functionality to the SAP-1

More information

CHAPTER 3 RESOURCE MANAGEMENT

CHAPTER 3 RESOURCE MANAGEMENT CHAPTER 3 RESOURCE MANAGEMENT SUBTOPIC Understand Memory Management Understand Processor Management INTRODUCTION Memory management is the act of managing computer memory. This involves providing ways to

More information

What Are The Main Differences Between Program Counter Pc And Instruction Register Ir

What Are The Main Differences Between Program Counter Pc And Instruction Register Ir What Are The Main Differences Between Program Counter Pc And Instruction Register Ir and register-based instructions - Anatomy on a CPU - Program Counter (PC): holds memory address of next instruction

More information

Wednesday, February 4, Chapter 4

Wednesday, February 4, Chapter 4 Wednesday, February 4, 2015 Topics for today Introduction to Computer Systems Static overview Operation Cycle Introduction to Pep/8 Features of the system Operational cycle Program trace Categories of

More information

PROBLEMS. 7.1 Why is the Wait-for-Memory-Function-Completed step needed when reading from or writing to the main memory?

PROBLEMS. 7.1 Why is the Wait-for-Memory-Function-Completed step needed when reading from or writing to the main memory? 446 CHAPTER 7 BASIC PROCESSING UNIT (Corrisponde al cap. 10 - Struttura del processore) PROBLEMS 7.1 Why is the Wait-for-Memory-Function-Completed step needed when reading from or writing to the main memory?

More information

Computer Organization I. Lecture 28: Architecture of M68HC11

Computer Organization I. Lecture 28: Architecture of M68HC11 Computer Organization I Lecture 28: Architecture of M68HC11 Overview Architecture of HC11 Microprocessor Format of HC11 Assembly Code Objectives To understand the simplified architecture of HC11 To know

More information

The PC's keyboard. PC Keyboard Theory. Quality Information in one Place...

The PC's keyboard. PC Keyboard Theory. Quality Information in one Place... Interfacing the PC / Beyond Logic Quality Information in one Place... Parallel Ports Serial Ports Interrupts AT Keyboard Ports USB The PC's keyboard. Why would you want to interface the Keyboard? The IBM

More information

Course Schedule. CS 221 Computer Architecture. Week 3: Plan. I. Hexadecimals and Character Representations. Hexadecimal Representation

Course Schedule. CS 221 Computer Architecture. Week 3: Plan. I. Hexadecimals and Character Representations. Hexadecimal Representation Course Schedule CS 221 Computer Architecture Week 3: Information Representation (2) Fall 2001 W1 Sep 11- Sep 14 Introduction W2 Sep 18- Sep 21 Information Representation (1) (Chapter 3) W3 Sep 25- Sep

More information

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression An overview of Compression Multimedia Systems and Applications Data Compression Compression becomes necessary in multimedia because it requires large amounts of storage space and bandwidth Types of Compression

More information

BINARY LOAD AND PUNCH

BINARY LOAD AND PUNCH BINARY LOAD AND PUNCH To easily decrease the amount of time it takes to load a long tape (Cassette or paper) a BINARY formatting technique can be used instead of the conventional ASCII format used by the

More information

Lempel-Ziv-Welch (LZW) Compression Algorithm

Lempel-Ziv-Welch (LZW) Compression Algorithm Lempel-Ziv-Welch (LZW) Compression lgorithm Introduction to the LZW lgorithm Example 1: Encoding using LZW Example 2: Decoding using LZW LZW: Concluding Notes Introduction to LZW s mentioned earlier, static

More information

Wednesday, October 17, 2012

Wednesday, October 17, 2012 Wednesday, October 17, 2012 Topics for today Arrays and Indexed Addressing Arrays as parameters of functions Multi-dimensional arrays Indexed branching Implementation of switch statement Arrays as parameters

More information

Outline. CPE/EE 422/522 Advanced Logic Design L15. Files. Files

Outline. CPE/EE 422/522 Advanced Logic Design L15. Files. Files Outline CPE/EE 422/522 Advanced Logic Design L15 Electrical and Computer Engineering University of Alabama in Huntsville VHDL What we know (additional topics) Attributes Transport and Inertial Delays Operator

More information

CHAPTER SEVEN PROGRAMMING THE BASIC COMPUTER

CHAPTER SEVEN PROGRAMMING THE BASIC COMPUTER CHAPTER SEVEN 71 Introduction PROGRAMMING THE BASIC COMPUTER A computer system as it was mentioned before in chapter 1, it is subdivided into two functional parts: 1 Hardware, which consists of all physical

More information

Micro-KIM Tutorial. Aart J.C. Bik

Micro-KIM Tutorial. Aart J.C. Bik Micro-KIM Tutorial Aart J.C. Bik http://www.aartbik.com/ 1 Getting Started Perhaps reminiscing the past is a sign of getting older, but I cannot help but look back fondly at the times I learned programming

More information

srl - shift right logical - 0 enters from left, bit drops off right end note: little-endian bit notation msb lsb "b" for bit

srl - shift right logical - 0 enters from left, bit drops off right end note: little-endian bit notation msb lsb b for bit Clemson University -- CPSC 231 Shifts (p. 123) srl - shift right logical - 0 enters from left, bit drops off right end 0 b 31 b 30 b 2 b 1 b 0 note: little-endian bit notation msb lsb "b" for bit a f 5

More information

STEVEN R. BAGLEY ARM: PROCESSING DATA

STEVEN R. BAGLEY ARM: PROCESSING DATA STEVEN R. BAGLEY ARM: PROCESSING DATA INTRODUCTION CPU gets instructions from the computer s memory Each instruction is encoded as a binary pattern (an opcode) Assembly language developed as a human readable

More information

NAM M6800 DISK-BUG DS VER 3.5 OPT PAG

NAM M6800 DISK-BUG DS VER 3.5 OPT PAG NAM M6800 DISK-BUG DS VER 3.5 OPT PAG Floppy Disk Controller Debug Monitor Written 27 Aug 1980 Michael Holley Record of modifications 18 OCT 1981 Disk routines DC-1 23 JAN 1982 Command Table 8 MAY 1982

More information

PROFESSOR: Last time, we took a look at an explicit control evaluator for Lisp, and that bridged the gap between

PROFESSOR: Last time, we took a look at an explicit control evaluator for Lisp, and that bridged the gap between MITOCW Lecture 10A [MUSIC PLAYING] PROFESSOR: Last time, we took a look at an explicit control evaluator for Lisp, and that bridged the gap between all these high-level languages like Lisp and the query

More information

PROGRAM CONTROL UNIT (PCU)

PROGRAM CONTROL UNIT (PCU) nc. SECTION 5 PROGRAM CONTROL UNIT (PCU) MOTOROLA PROGRAM CONTROL UNIT (PCU) 5-1 nc. SECTION CONTENTS 5.1 INTRODUCTION........................................ 5-3 5.2 PROGRAM COUNTER (PC)...............................

More information

DAN64: an AVR based 8-bit Microcomputer

DAN64: an AVR based 8-bit Microcomputer DAN64: an AVR based 8-bit Microcomputer Juan J. Martínez jjm@usebox.net Manual for V.R - May 0, 06 Features Composite video black and white output, 56 x 9 resolution, x 4 characters (8 x 8 pixels font,

More information

The SURE Architecture

The SURE Architecture The SURE Architecture David May: December 11, 2016 Background Computer programming is changing. Object-oriented languages, functional languages and others have accelerated software development. But these

More information

We can create PDAs with multiple stacks. At each step we look at the current state, the current input symbol, and the top of each stack.

We can create PDAs with multiple stacks. At each step we look at the current state, the current input symbol, and the top of each stack. Other Automata We can create PDAs with multiple stacks. At each step we look at the current state, the current input symbol, and the top of each stack. From all of this information we decide what state

More information

Control Unit: The control unit provides the necessary timing and control Microprocessor resembles a CPU exactly.

Control Unit: The control unit provides the necessary timing and control Microprocessor resembles a CPU exactly. Unit I 8085 and 8086 PROCESSOR Introduction to microprocessor A microprocessor is a clock-driven semiconductor device consisting of electronic logic circuits manufactured by using either a large-scale

More information

AN1818. Motorola Semiconductor Application Note

AN1818. Motorola Semiconductor Application Note Order this document by /D Motorola Semiconductor Application Note Software SCI Routines with the 16-Bit Timer Module By Brad Bierschenk MMD Applications Engineering Austin, Texas Introduction Many applications

More information

2. Define Instruction Set Architecture. What are its two main characteristics? Be precise!

2. Define Instruction Set Architecture. What are its two main characteristics? Be precise! Chapter 1: Computer Abstractions and Technology 1. Assume two processors, a CISC processor and a RISC processor. In order to run a particular program, the CISC processor must execute 10 million instructions

More information

Appendix A The Co-processor Instructions

Appendix A The Co-processor Instructions Appendix A The Co-processor Instructions In Chapter Seven, we talked about the undefined instruction trap. This occurs when the ARM tries to execute an instruction which does not have a valid interpretation.

More information

Wednesday, September 13, Chapter 4

Wednesday, September 13, Chapter 4 Wednesday, September 13, 2017 Topics for today Introduction to Computer Systems Static overview Operation Cycle Introduction to Pep/9 Features of the system Operational cycle Program trace Categories of

More information

Code Generation. The Main Idea of Today s Lecture. We can emit stack-machine-style code for expressions via recursion. Lecture Outline.

Code Generation. The Main Idea of Today s Lecture. We can emit stack-machine-style code for expressions via recursion. Lecture Outline. The Main Idea of Today s Lecture Code Generation We can emit stack-machine-style code for expressions via recursion (We will use MIPS assembly as our target language) 2 Lecture Outline What are stack machines?

More information

We can emit stack-machine-style code for expressions via recursion

We can emit stack-machine-style code for expressions via recursion Code Generation The Main Idea of Today s Lecture We can emit stack-machine-style code for expressions via recursion (We will use MIPS assembly as our target language) 2 Lecture Outline What are stack machines?

More information

Job Posting (Aug. 19) ECE 425. ARM7 Block Diagram. ARM Programming. Assembly Language Programming. ARM Architecture 9/7/2017. Microprocessor Systems

Job Posting (Aug. 19) ECE 425. ARM7 Block Diagram. ARM Programming. Assembly Language Programming. ARM Architecture 9/7/2017. Microprocessor Systems Job Posting (Aug. 19) ECE 425 Microprocessor Systems TECHNICAL SKILLS: Use software development tools for microcontrollers. Must have experience with verification test languages such as Vera, Specman,

More information

6.001 Notes: Section 15.1

6.001 Notes: Section 15.1 6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define

More information

Chapter 12. CPU Structure and Function. Yonsei University

Chapter 12. CPU Structure and Function. Yonsei University Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor

More information

Memory. Objectives. Introduction. 6.2 Types of Memory

Memory. Objectives. Introduction. 6.2 Types of Memory Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts

More information

3.0 Instruction Set. 3.1 Overview

3.0 Instruction Set. 3.1 Overview 3.0 Instruction Set 3.1 Overview There are 16 different P8 instructions. Research on instruction set usage was the basis for instruction selection. Each instruction has at least two addressing modes, with

More information

ECE331 Handout 3- ASM Instructions, Address Modes and Directives

ECE331 Handout 3- ASM Instructions, Address Modes and Directives ECE331 Handout 3- ASM Instructions, Address Modes and Directives ASM Instructions Functional Instruction Groups Data Transfer/Manipulation Arithmetic Logic & Bit Operations Data Test Branch Function Call

More information

Appendix A: The ISA of a Small 8-bit Processor

Appendix A: The ISA of a Small 8-bit Processor Computer Architecture in VHDL 1 Appendix A: The ISA of a Small 8-bit Processor Introduction to Small8 An Instruction Set Processor (ISP) is characterized by its instruction set, address modes (means to

More information

6 Direct Memory Access (DMA)

6 Direct Memory Access (DMA) 1 License: http://creativecommons.org/licenses/by-nc-nd/3.0/ 6 Direct Access (DMA) DMA technique is used to transfer large volumes of data between I/O interfaces and the memory. Example: Disk drive controllers,

More information

CSC201, SECTION 002, Fall 2000: Homework Assignment #3

CSC201, SECTION 002, Fall 2000: Homework Assignment #3 1 of 7 11/8/2003 7:34 PM CSC201, SECTION 002, Fall 2000: Homework Assignment #3 DUE DATE October 25 for the homework problems, in class. October 27 for the programs, in class. INSTRUCTIONS FOR HOMEWORK

More information

Chapter 7 Central Processor Unit (S08CPUV2)

Chapter 7 Central Processor Unit (S08CPUV2) Chapter 7 Central Processor Unit (S08CPUV2) 7.1 Introduction This section provides summary information about the registers, addressing modes, and instruction set of the CPU of the HCS08 Family. For a more

More information

Operating system Dr. Shroouq J.

Operating system Dr. Shroouq J. 2.2.2 DMA Structure In a simple terminal-input driver, when a line is to be read from the terminal, the first character typed is sent to the computer. When that character is received, the asynchronous-communication

More information

Fixed-Point Math and Other Optimizations

Fixed-Point Math and Other Optimizations Fixed-Point Math and Other Optimizations Embedded Systems 8-1 Fixed Point Math Why and How Floating point is too slow and integers truncate the data Floating point subroutines: slower than native, overhead

More information

S12CPUV2. Reference Manual HCS12. Microcontrollers. S12CPUV2/D Rev. 0 7/2003 MOTOROLA.COM/SEMICONDUCTORS

S12CPUV2. Reference Manual HCS12. Microcontrollers. S12CPUV2/D Rev. 0 7/2003 MOTOROLA.COM/SEMICONDUCTORS HCS12 Microcontrollers /D Rev. 0 7/2003 MOTOROLA.COM/SEMICONDUCTORS To provide the most up-to-date information, the revision of our documents on the World Wide Web will be the most current. Your printed

More information