High Level Software Cost Estimation


Per Bjuréus

Abstract

This report is dedicated to the processor characterization method and software cost estimation technique used in the Polis codesign tool environment. The processor characterization method has been exercised by applying it to the ARM processor family. In particular, two processors, ARM7TDMI and ARM920T, have been examined. An improved method is proposed, which is supported and partially automated by two utility tools. The improved method is based on an iterative two-pass technique. The first pass involves processor characteristic extraction from generic software templates. The second pass improves the parameters from the first pass using a validation method. The results obtained during the exercise are presented and discussed. In particular, the effect of instruction and data cache memory is addressed. The estimation technique used today is well suited for processors without cache, but processors with cache call for new techniques.

Introduction

High-level software cost estimation is an attractive feature in a system design flow. It allows the designer to estimate software code size and performance in an early design phase. The approach to high-level software cost estimation that is available in the Polis codesign environment developed at UC Berkeley [2] is based on work done by K. Suzuki et al. [1]. Under the assumption that the software program can be represented as a set of directed acyclic graphs (DAGs), called S-graphs, the estimation is performed using macro modeling. A number of macros, representing the different types of nodes that constitute the S-graph, are collected in a set of template files. The template files are profiled for the processor that will be used, and size and time parameters for each individual macro are extracted. The profiling is only performed once, using an Instruction Set Simulator (ISS) or debugger for the processor in question.
The macro parameters are collected in a parameter file, which is used to estimate the cost of any software program that runs on the processor. In this way there is no need for a designer to install and learn any simulators or debuggers to evaluate the software cost on different microprocessors. This amounts to a fast and convenient way to make design trade-off decisions between hardware, software, and functionality of an embedded system.

The main objective of this project was to exercise and possibly improve the processor characterization methodology in Polis. The ARM processor family was selected as a suitable target for the experiments for several reasons. The ARM processors are widely used and have up until now not been available in the Polis environment. The ARM cores are a family of processors with different characteristics, which allows a wide range of processor configurations to be analyzed without changing the experimental framework radically. The ARM processor comes with a Software Development Kit (SDK) that contains a set of tools in an open environment.

A second objective was to study the effects that instruction and data caches have on software cost estimation and to possibly suggest solutions to the expected problems. Some previous work in this direction has been performed by Lajolo et al. [3].

Processor Characterization

Processor characterization in Polis is based on the assumption that the software program is decomposed into communicating Codesign Finite State Machines (CFSMs) that are executed upon request by a scheduler or operating system. The communication between CFSMs is asynchronous, and a signal enables a CFSM when it is received. The scheduler executes enabled CFSMs according to a scheduling policy. Each CFSM can be represented by a polar directed acyclic graph (DAG) called an S-graph. When the CFSM is executed, only one execution path is traversed in the S-graph from the Begin to the End node. The S-graph is composed of a fixed set of node types. Each node type in an S-graph is represented by a macro, which is an atomic operation. A macro will eventually be executed as a sequence of instructions on a microprocessor. The idea is that if the code size and execution time for each macro can be estimated, so can the S-graph, and hence the CFSM. If the CFSMs are annotated with the estimated execution time, simulation of the system will yield a performance estimate of the whole system. This is referred to as performance simulation.

The goal of processor characterization is thus to estimate the code size and execution time of the macros that constitute the S-graphs. All macro cost estimates are collected in a parameter file, which is processor specific. The parameter file is read by Polis, which annotates the software before performance simulation can be carried out. Estimating the macros involves compiling and analyzing the code for the intended processor. The current methodology for parameter file generation is outlined in Figure 1.
[Figure 1. Processor Characterization Flow: Template Files -> Compiler -> Assembler Files -> Debugger -> Parameter File]

The macros are collected in template files that are written in ANSI C. A processor-specific compiler compiles the template files, and assembler files are generated. The assembler files are analyzed manually or with a debugger or Instruction Set Simulator (ISS), and the code size and execution time for each macro are extracted. The code size and execution time estimates are manually collected in the processor-specific parameter file. This approach requires good knowledge of the compiler and debugger. It also involves a lot of tedious work analyzing the debugger output and converting it to a parameter file.
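The idea that per-macro costs determine the cost of an S-graph can be illustrated with a small sketch. It is not Polis code: the graph, the node names, and all parameter values except AVV (11 cycles, 16 bytes, as reported later in the parameter file example) are hypothetical, and the macro names TEST and EMIT are used only as placeholders. The sketch assumes that the minimum and maximum execution times are the shortest and longest Begin-to-End paths and that the code size is the sum over all nodes.

```python
from functools import lru_cache

# Hypothetical per-macro parameters: macro -> (cycles, bytes).
# Only the AVV values come from the report; the rest are made up.
PARAMS = {
    "BEGIN": (0, 0),
    "TEST": (5, 12),
    "AVV": (11, 16),
    "EMIT": (20, 24),
    "END": (0, 0),
}

# Hypothetical S-graph: node -> (macro, list of successor nodes).
SGRAPH = {
    "begin": ("BEGIN", ["t1"]),
    "t1": ("TEST", ["a1", "e1"]),   # branch: assign first, or emit directly
    "a1": ("AVV", ["e1"]),
    "e1": ("EMIT", ["end"]),
    "end": ("END", []),
}


def path_time_bounds(graph, params, start="begin"):
    """Return (min_cycles, max_cycles) over all paths from start to a sink."""

    @lru_cache(maxsize=None)
    def bounds(node):
        macro, successors = graph[node]
        cycles = params[macro][0]
        if not successors:
            return (cycles, cycles)
        sub = [bounds(s) for s in successors]
        return (cycles + min(lo for lo, _ in sub),
                cycles + max(hi for _, hi in sub))

    return bounds(start)


def code_size(graph, params):
    """Sum the size of every node, assuming each node is emitted once."""
    return sum(params[macro][1] for macro, _ in graph.values())


if __name__ == "__main__":
    print(path_time_bounds(SGRAPH, PARAMS))  # (25, 36)
    print(code_size(SGRAPH, PARAMS))         # 52
```

Because only one path is traversed per CFSM execution, the two bounds bracket the cost of any single run, while the size estimate covers the whole graph.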

    ;;;4 void tmp_avv(int proc, int inst)
    ;;;5 {
    tmp_avv
        e1a02001 MOV a3,a2
    ;;;6 v_st1_enc = v_st2_enc;
        e59f1010 LDR a2,[pc, #L00001c-.-8]
        e LDR a2,[a2,#0]
        00000c e59f300c LDR a4,[pc, #L ]
        e STR a2,[a4,#0]
    ;;;7
    ;;;8 return;
        e1a0f00e MOV pc,lr
    ;;;9 }

Figure 2. Compiled Macro with Interleaved Source Code

Figure 2 shows an excerpt from the AVV macro that has been compiled into assembler code interleaved with the original macro C code. Since the objective of this project was to evaluate several different processors in the ARM family, the above methodology was improved and automated for efficiency. The new methodology is outlined in Figure 3.

[Figure 3. Improved Processor Characterization Flow. Elements: Template Files, Annotation, Compiler, Assembler Files, Archar, Debugger Script, Debugger, Log File, log2param, Parameter File, Makefile]

First, the template files are annotated with macro Entry and Exit points. A program, called Archar, operates on the annotated template files and generates a debugger script. The log file generated by the debugger is converted by another program, log2param, into a raw parameter file. Compilation, debugger execution, and log file conversion are performed by a Makefile, which keeps track of changes and dependencies between the files. Archar reads the template files and generates a debugger script that is used by the ARM symbolic debugger armsd for analysis. The annotated AVV macro is shown in Figure 4.

    void tmp_avv(int proc, int inst)
    {
        /*Enter AVV*/
        v_st1_enc = v_st2_enc;
        /*Exit AVV*/
        return;
    }

Figure 4. Annotated Macro Function

The Archar program generates a debugger script, which has three parts: a prefix, macro entry and exit commands, and a postfix. The prefix is used to configure the debugger. The entry and exit commands insert breakpoints at macro Entry and Exit. The breakpoints are programmed with a command that is executed when the breakpoint is reached. The breakpoint command outputs the macro name, the current program counter address, and the current cycle count. A portion of the log file produced when the debugger runs the script generated by Archar is shown in Figure 5.

    enter:avv Total
    exit:avv Total

Figure 5. Excerpt From Debugger Log File

To convert the debugger log file into a parameter file, another program, log2param, was written. The program reads the log file, records the macro Entry and Exit points, and performs the necessary operations to output a parameter file. log2param accepts nested macro Entry and Exit points, allows multiple calls to the same macro, and supports several cycle count variables.

Another feature of both Archar and log2param is that they make a distinction between macros and software library functions. A macro name followed by a colon and the func keyword indicates a function. The func keyword must be followed by the function output bit width in parentheses. An example of an annotated software library function is shown in Figure 6.

    void tmp_timesl(int proc, int inst)
    {
        /*Enter _TIMES:func(32)*/
        v_2_enc = _TIMES(v_sL3_enc, v_const2l);
        /*Exit _TIMES:func(32)*/
        return;
    }

Figure 6. Annotated Software Library Function

The corresponding log file generated by the debugger is shown in Figure 7.

    enter:_times:func(32) Total
    exit:_times:func(32) Total

Figure 7. Software Library Function Log

If multiple calls to the macros are performed during debugger analysis, each call will generate an Entry and Exit point in the log file. The parameter file generated by log2param will then contain the average execution time for macros, and the maximum and minimum execution times for software library functions. The parameter file generated by log2param from the AVV macro and the _TIMES software library function is shown in Figure 8.

    .time AVV 11
    .size AVV 16
    .dp func=times max_cycle=19 min_cycle=19 size=28 out_width=32

Figure 8. Parameter File Example

The output from log2param is a raw parameter file, which means that it still needs some manual adjustment before it can be read into Polis. All software library functions involve the assignment of the function output to a variable. Thus, the assignment, which belongs to a separate macro, is counted in the parameters for software library functions and needs to be subtracted. For example, consider the software library function call in Figure 6. The size and execution time of the assignment (v_2_enc = _TIMES) must be subtracted from the size and execution time parameters for _TIMES, respectively.

Parameter values for ENC, TIENCT, TIENCF, and TIENC are currently not implemented in the template files, and it is unclear what their purpose is. All those parameters are set to zero. The pointer size parameter PTR is entered manually. The parameter file must contain the name of the processor, the units that are used, and the bit width of an integer variable; those lines are simply added to the beginning of the parameter file according to Figure 9.

    .name ARM7TDMI
    .unit_time cycle
    .unit_size byte
    .int_width 32

Figure 9. Parameter File Additions for the ARM7TDMI

When the parameter file has been modified, it can be used by the Polis tool for size and execution time estimation.
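The log-to-parameter conversion described above can be sketched roughly as follows. This is not the log2param source. The log line layout used here (marker, program counter in hex, cycle count) is an assumption, since the report only states that the breakpoint command outputs the macro name, the program counter, and the cycle count. The upper-casing of macro names and the stripping of the leading underscore from function names are guesses made to mimic the parameter file example. The input values are hypothetical and chosen to reproduce the reported AVV and _TIMES parameters.

```python
import re
from collections import defaultdict

# Assumed log line layout: "<enter|exit>:<name>[:func(<width>)] <pc hex> <cycles>"
LINE = re.compile(
    r"^(enter|exit):(\w+)(?::func\((\d+)\))?\s+(0x[0-9a-fA-F]+)\s+(\d+)$"
)


def convert(log_lines):
    """Turn Entry/Exit records into raw parameter file lines."""
    open_marks = defaultdict(list)  # name -> stack of (pc, cycles); allows nesting
    samples = defaultdict(list)     # name -> list of (cycles, bytes) per call
    widths = {}                     # function name -> output bit width

    for line in log_lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an Entry/Exit record
        kind, name, width, pc, cycles = match.groups()
        pc, cycles = int(pc, 16), int(cycles)
        if width is not None:
            widths[name] = int(width)
        if kind == "enter":
            open_marks[name].append((pc, cycles))
        else:
            pc0, cycles0 = open_marks[name].pop()
            samples[name].append((cycles - cycles0, pc - pc0))

    out = []
    for name, records in samples.items():
        times = [t for t, _ in records]
        size = records[0][1]
        if name in widths:
            # Library functions: maximum and minimum execution time.
            out.append(
                f".dp func={name.lstrip('_')} max_cycle={max(times)} "
                f"min_cycle={min(times)} size={size} out_width={widths[name]}"
            )
        else:
            # Macros: average execution time over all recorded calls.
            out.append(f".time {name.upper()} {round(sum(times) / len(times))}")
            out.append(f".size {name.upper()} {size}")
    return out


# Hypothetical log excerpt (addresses and cycle counts are made up).
EXAMPLE_LOG = [
    "enter:avv 0x8000 100",
    "exit:avv 0x8010 111",
    "enter:avv 0x8000 200",
    "exit:avv 0x8010 211",
    "enter:_times:func(32) 0x8100 300",
    "exit:_times:func(32) 0x811c 319",
]

if __name__ == "__main__":
    print("\n".join(convert(EXAMPLE_LOG)))
```

The sketch does not perform the manual steps mentioned in the text, such as subtracting the assignment cost from library function parameters or adding the processor name and unit lines.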

Estimation Validation

For validation purposes, an ATM switch example was selected, which will be referred to as the ATM. The ATM consists of several modules, each representing one CFSM in the system. This section describes the design flow used to synthesize the software and to profile it using the ARMulator. The validation flow is outlined in Figure 10.

[Figure 10. Validation Flow. Elements: Esterel Program, Polis, Parameter File, Source Files, Cost Estimation, Compiler, Size?, DoDelay Image, Simulation, Time?, Debug Image, Execution]

The parameter file from the processor characterization is used together with the ATM Esterel program. The Esterel program is compiled into a SHIFT file (Software Hardware Intermediate FormaT) using strl2shift. Polis converts the SHIFT file into an internal S-graph representation. The parameter file is used to assign weights to the edges of the S-graph, which enables Polis to generate a software source file and a cost estimation file. The cost estimation file contains the expected maximum and minimum execution time of each CFSM in the design along with a code size estimate. The Polis execution script that was used to generate the source files and the cost estimate file is depicted in Figure 11.

    read_shift atm_v.shift
    propagate_const
    set_impl -s
    partition
    build_sg
    set arch ARM7TDMI
    read_cost_param
    sg_to_c -D -d software
    gen_os -D os -d software
    set polisout software/sgraph.txt
    print_sg
    set polisout software/cost.txt
    print_cost -sn
    quit

Figure 11. Polis Execution Script

It is beyond the scope of this report to go into details about all the steps that Polis performs. However, it is worth pointing out that in order to run the ATM on a workstation for performance simulation, all modules (CFSMs) were implemented as software.
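To make the structure of the parameter file consumed in this flow more concrete, the following sketch reads the directives that appear in the report's parameter file examples (.name, .unit_time, .unit_size, .int_width, .time, .size, .dp). It is an illustration only and says nothing about how Polis itself parses the file. It assumes one directive per line, and the function name parse_params and the returned dictionary layout are hypothetical.

```python
def parse_params(text):
    """Parse parameter file text into a nested dictionary (illustrative only)."""
    params = {"meta": {}, "time": {}, "size": {}, "dp": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("."):
            continue  # ignore blank lines and anything that is not a directive
        parts = line[1:].split()
        if not parts:
            continue
        key, rest = parts[0], parts[1:]
        if key in ("time", "size"):
            # e.g. ".time AVV 11" -> params["time"]["AVV"] = 11
            params[key][rest[0]] = int(rest[1])
        elif key == "dp":
            # e.g. ".dp func=times max_cycle=19 min_cycle=19 size=28 out_width=32"
            fields = dict(item.split("=", 1) for item in rest)
            name = fields.pop("func")
            params["dp"][name] = {k: int(v) for k, v in fields.items()}
        else:
            # e.g. ".name ARM7TDMI", ".unit_time cycle"
            params["meta"][key] = " ".join(rest)
    return params


# Directives taken from the report's parameter file examples.
EXAMPLE = """\
.name ARM7TDMI
.unit_time cycle
.unit_size byte
.int_width 32
.time AVV 11
.size AVV 16
.dp func=times max_cycle=19 min_cycle=19 size=28 out_width=32
"""

if __name__ == "__main__":
    parsed = parse_params(EXAMPLE)
    print(parsed["meta"]["name"], parsed["time"]["AVV"], parsed["dp"]["times"])
```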

When the software source files have been generated by Polis, the application is built using the ARM project manager. All source files, including the generated OS file (os.c) and a file containing memory library functions (mem_lib.c), are compiled and linked into an ARM executable image. The UNIX, ESTEREL, and BENCH variables are set during compilation to build a stand-alone executable image called Debug. The estimated code size for each module, found in the cost file (cost.txt), can be compared to the number reported by the map file generated by the ARM link tool.

To measure the execution time, a debugger script was written that inserts breakpoints at module Entry and Exit points. The breakpoints are programmed to execute commands very similar to the Entry and Exit commands used for macro estimation. An input pattern (i.e. a scenario) is applied to the application, and the cycle times are written to a log file at the module Entry and Exit points. The log file generated by the debugger has the same format as the log file generated during processor characterization and can be read by log2param. The log2param program has a simple mode, which does not generate a formatted parameter file but rather outputs an unformatted one. An excerpt from such a validation parameter file is shown in Figure 12.

    collision_detector 604 bytes
        Instructions S_Cycles I_Cycles Total
    arbiter_sc 308 bytes
        Instructions S_Cycles I_Cycles Total

Figure 12. Validation Parameter File

The numbers from the validation parameter file are imported into Excel for analysis. The maximum and minimum execution times found in the cost file generated by Polis refer to the longest and shortest paths in the S-graph, respectively. However, it is hard to run an exhaustive simulation of the application to make sure that exactly those execution paths are traversed. Therefore, for validation purposes, the estimated execution times of the execution paths actually traversed must be recorded instead.
This is accomplished by building an executable image with the DO_DELAY pre-processing variable set during compilation. The OS file has to be slightly modified to allow the debugger to run without user interaction through a debugger script. The image generated is called DoDelay. It is important to remember that the DoDelay image cannot be used to measure the actual code size or execution time. When the DoDelay image is executed, it reads the input pattern from a file and reports the estimated execution times upon completion. The numbers reported are imported into Excel for comparison with the measured values.

Results

Two ARM processor configurations were characterized and validated: the ARM7TDMI core and the ARM920T processor. Two memory configurations were analyzed for both processors, one fast and one slow. The fast memory was configured not to impose any wait states on program execution. The slow memory configuration, on the other hand, had a sequential/non-sequential read and write latency of 120/90 ns, which required wait states to be inserted during program execution. However, due to the current uncertainty of the analysis of the slow memory configuration, those results are omitted from this report. The ARM7TDMI core was not equipped with cache memory, whereas the ARM920T processor was equipped with a 16KB data cache and a 16KB instruction cache. Both processors are 32-bit processors. The 16-bit Thumb instruction set has not been examined.

Code Size

The estimated and measured code sizes for the individual ATM modules were compared for the ARM7TDMI processor. Originally, the maximum absolute estimation error was 70%. The large difference was unexpected, and the error source was investigated immediately by comparing the estimated size reported for each S-graph node with the actual size in the assembly program. Three major error sources were identified rather quickly:

1. The Polis tool did not estimate variables that were local to the modules. Each simple variable occupied 4 bytes of data memory, and an array occupied 4 bytes per element and an additional 8 bytes for reference variables (pointers). The initialization of a simple variable required 4 instructions (16 bytes), and the initialization of an array required 6 instructions (24 bytes).

2. The net_quid/memory2, net_state/memory, and net_m2/sub_sort modules called the create, getdata, and putdata memory library functions in mem_lib.c. Each such function call required additional instructions compared to the generic SWL macro: create required 1 more instruction, getdata required 4 more instructions, and putdata required 7 more instructions.

3. The TIDT macro and the AVC macro required one more instruction each.
Those deficiencies were corrected manually in the Excel worksheet, and an absolute maximum error of 12% was achieved. The results are shown in Figure 13.

[Figure 13. ATM Size Estimation Error for ARM7TDMI. Bar chart of the original and corrected size estimation error per module (arbiter_sc, arbiter_sorter, collision_detector, counter, extract_cell2, first_cell, lqm_arbiter3, msd_technique, net_m2/sub_sort, net_quid/memory2, net_state/memory, sorter2, space_controller, supervisor3, Total); error axis from -80% to 20%]
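The worksheet arithmetic behind these corrections can be sketched as follows. The byte counts for local variables are taken from the first error source above, and the error values are the corrected Size column of Table 1. The function names are hypothetical, and the sign convention of relative_error (negative means under-estimation) is an assumption, since the report does not state the formula explicitly.

```python
def local_variable_overhead(n_simple, array_lengths):
    """Return (data_bytes, init_code_bytes) for a module's local variables.

    Per the report: a simple variable takes 4 bytes of data and 16 bytes of
    initialization code; an array takes 4 bytes per element plus 8 bytes for
    reference variables, and 24 bytes of initialization code.
    """
    data = 4 * n_simple + sum(4 * n + 8 for n in array_lengths)
    init = 16 * n_simple + 24 * len(array_lengths)
    return data, init


def relative_error(estimated, measured):
    """Assumed convention: negative values mean the estimate is too low."""
    return (estimated - measured) / measured


def abs_max_error(errors):
    """Largest error magnitude over all modules."""
    return max(abs(e) for e in errors)


def mean_error(errors):
    """Signed average error over all modules."""
    return sum(errors) / len(errors)


# Corrected size errors (percent) per module, from Table 1.
CORRECTED_SIZE = [7, 7, 2, -4, 2, 8, 7, -2, -7, -12, -7, -4, 3, 4]

if __name__ == "__main__":
    print(local_variable_overhead(2, [10]))   # (56, 56)
    print(abs_max_error(CORRECTED_SIZE))      # 12
    print(round(mean_error(CORRECTED_SIZE)))  # 0
```

The last two results agree with the 12% absolute maximum error quoted in the text and with the 0% average corrected size error listed in Table 1.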

The corrections made were fed back into the parameter file. The software library functions in the mem_lib.c file were added to the parameter file as dp parameters, which override the generic SWL macro. The size parameters for the TIDT and AVC macros were corrected by adding 4 bytes to each of them. The size estimation for the ARM920T was performed using the same procedure. The same observations were made, and no major differences were found. The result of the code size estimation for the ARM920T is shown in Figure 14.

[Figure 14. ATM Size Estimation Error for ARM920T. Bar chart of the original and corrected size estimation error per module (arbiter_sc, arbiter_sorter, collision_detector, counter, extract_cell2, first_cell, lqm_arbiter3, msd_technique, net_m2/sub_sort, net_quid/memory2, net_state/memory, sorter2, space_controller, supervisor3, Total); error axis from -80% to 20%]

The estimation error for the ARM920T is displaced towards over-estimation compared to the ARM7TDMI. No actions were taken to investigate this phenomenon any further.

Execution Time

The execution time was measured by running the Debug executable image using a simple input scenario. The estimated execution time was extracted by running the DoDelay executable image. The original difference between estimation and measurement was 98%, which was clearly unacceptable. An investigation into the cause of the large error was carried out. It was learned from the size estimation that Polis did not estimate the initialization of the internal variables of the modules, so an additional 12 cycles per variable were added to the Min, Max, and Avg execution times for initialization. The memory access functions create, putdata, and getdata, which are called by some of the modules, required separate modeling, because the average software library function execution time could not be applied to those functions.
This was accomplished by adding the maximum and minimum execution times to the corresponding dp parameters in the parameter file. The AEMIT macro caused major problems, because its execution time had large variations. Execution times between 6 and 139 cycles were recorded. The two modules that used AEMIT most were net_m2/sub_sort and supervisor3. Those modules are also the ones that exhibit the largest deviations between estimated and measured execution time. To tackle the problem, an average execution time for the AEMIT macro was used. An average execution time is probably application dependent and might also depend on the input pattern. Therefore, this approach can easily cause large errors if the number is used for other applications or input patterns.

Corrections for execution time errors cannot be applied directly in the Excel worksheet, as is done for size corrections, because they depend on the execution path. A second pass is needed, in which the appropriate time variables in the parameter file are changed and the improved model is examined. The results are shown in Figure 15.

[Figure 15. ATM Execution Time Estimation Error for ARM7TDMI (2nd pass). Bar chart of the Max, Min, and Avg execution time estimation error per module (arbiter_sc, arbiter_sorter, collision_detector, counter, extract_cell2, first_cell, lqm_arbiter3, msd_technique, net_m2/sub_sort, net_quid/memory2, net_state/memory, sorter2, space_controller, supervisor3, Total); error axis from -15% to 30%]

The size and execution time errors for the ARM7TDMI core, before and after correction, are listed in Table 1.

Table 1. Estimation Errors for the ARM7TDMI

                        Original Error                Corrected Error
    Module              Size   Max    Min    Avg     Size   Max    Min    Avg
    arbiter_sc          -18%   -12%   -19%   -17%     7%     1%    -4%    -3%
    arbiter_sorter      -23%   -19%   -24%   -23%     7%    -1%    -4%    -3%
    collision_detector  -51%   -17%   -55%   -41%     2%    11%    -7%    -1%
    counter             -35%    -5%   -27%    -7%    -4%    11%     2%    10%
    extract_cell2       -26%   -36%   -38%   -37%     2%    -1%     0%    -1%
    first_cell          -31%   -48%   -49%   -49%     8%     7%     7%     7%
    lqm_arbiter3        -12%   -35%   -37%   -36%     7%     5%     6%     5%
    msd_technique       -24%   -58%   -61%   -59%    -2%    -5%    -6%    -6%
    net_m2/sub_sort     -35%   -18%   -17%   -17%    -7%    12%    14%    13%
    net_quid/memory2    -70%   -98%   -45%   -93%   -12%    -4%   -13%    -3%
    net_state/memory    -64%   -97%   -47%   -94%    -7%    -1%   -12%   -12%
    sorter2             -45%   -27%   -29%   -28%    -4%    -3%    -3%    -3%
    space_controller    -33%   -47%   -48%   -47%     3%     3%     4%     3%
    supervisor3         -42%    -7%   -63%   -48%     4%    13%    -4%    -1%
    Total                33%   -49%    51%    57%     0%    -1%     0%    -3%
    Abs Max Error       -70%   -98%   -63%   -94%    12%    13%    14%    13%
    Average Error        36%    38%    40%    43%     0%     3%    -1%     0%

The same procedure was applied to the ARM920T processor. After a first pass with very large estimation errors, corrections for software library functions were made.
The result of a second pass estimation is shown in Figure 16.

[Figure 16. ATM Execution Time Estimation Error for ARM920T (2nd pass). Bar chart of the Max, Min, and Avg execution time estimation error per module (arbiter_sc, arbiter_sorter, collision_detector, counter, extract_cell2, first_cell, lqm_arbiter3, msd_technique, net_m2/sub_sort, net_quid/memory2, net_state/memory, sorter2, space_controller, supervisor3, Total); error axis from -60% to 120%]

After a second pass, the estimation errors of the ARM920T processor were still much larger than the errors observed with the ARM7TDMI core. The minimum execution times were generally over-estimated, and the maximum execution times were generally under-estimated. The size and execution time estimation errors before and after correction are summarized in Table 2.

Table 2. Estimation Errors for the ARM920T

                        Original Error                Corrected Error
    Module              Size   Max    Min    Avg     Size   Max    Min    Avg
    arbiter_sc           -5%   -33%    -3%   -21%     7%   -23%    15%    -7%
    arbiter_sorter      -11%   -36%     1%   -21%     5%   -21%    29%     0%
    collision_detector  -41%   -35%   -39%   -48%    -8%    -7%    29%    -7%
    counter             -27%   -11%   -10%    22%    -7%     4%    25%    44%
    extract_cell2       -18%   -47%   -37%   -41%    -4%   -17%     2%    -6%
    first_cell          -28%   -57%   -27%   -43%    -2%   -12%    54%    19%
    lqm_arbiter3         -6%   -48%   -23%   -38%     4%   -14%    30%     3%
    msd_technique       -19%   -77%   -58%   -67%    -9%   -47%     2%   -23%
    net_m2/sub_sort     -33%   -43%    31%     0%     1%   -23%    79%    36%
    net_quid/memory2    -64%   -97%   -21%   -90%    -7%    -2%    24%     3%
    net_state/memory    -59%   -96%   -24%   -91%    -1%     0%    26%    -8%
    sorter2             -33%   -39%   -18%   -29%    -7%   -19%    12%    -3%
    space_controller    -27%   -58%   -44%   -50%    -5%   -17%    12%     0%
    supervisor3         -38%   -16%   -50%   -45%    -7%     3%    31%     6%
    Total               -28%   -79%   -24%   -62%    -3%    -6%    30%     2%
    Abs Max Error        64%    97%    58%    91%     9%    47%    79%    44%
    Average Error       -29%   -50%   -23%   -40%    -3%   -14%    26%     4%

Conclusions

The processor characterization flow developed in this project was very useful for quick evaluation of different processor and memory configurations building on the ARM processor. The ARM Software Development Kit is sophisticated but lacks some documentation.

A valuable addition to the log2param utility would be an expression builder that would allow a fully automatic conversion of the debugger log file to a parameter file. The Archar tool currently runs on Windows NT, and several extensions are possible. The original goal was to let Archar house the whole processor characterization flow, but the solution with a complementary Makefile was chosen due to lack of time.

Using template files for macro profiling is generally a good idea, but it is intrinsically hard to develop templates that capture all the effects of software compilation. Further research is needed on the topic, and eventually a better set of template files should be developed. Meanwhile, the two-pass approach used in this project can be successfully applied. Starting with a set of template files that generate reasonable numbers, a second pass in which closer analysis of a real application is taken into account will hopefully provide the required accuracy. The actual proposed methodology thus involves iteration.

Going into detail, the EMIT macro alone constituted one of the major problems encountered during validation. A deeper study of the execution time of the EMIT macro is suggested. From the superficial analysis made of the execution of EMIT, a subdivision of the EMIT macro seems to be needed. At least three different types of EMIT were observed in the execution of the ATM example.

In this project, one processor with cache and one without were deliberately chosen to investigate the effect that cache memory has on execution time. Studying the parameter file generated for the ARM920T, it immediately becomes obvious that the cache memory will affect the estimation. The minimum and maximum execution times for software library functions (defined by dp parameters) differ by up to a factor of four. The large estimation error after the second pass on the ARM920T processor can also be explained by the cache behavior.
A cache miss will generate an execution time that is larger than the average execution time captured by the parameter. Thus, the actual maximum execution time will be larger than the estimated maximum execution time, so the maximum execution time is under-estimated. Conversely, a cache hit will generate an execution time smaller than the average, so the minimum execution time is over-estimated. The large estimation errors (up to 79%) motivate further investigation of cache behavior and cache estimation techniques.

References

[1] K. Suzuki, A. Sangiovanni-Vincentelli, Efficient Software Performance Estimation Methods for Hardware-Software Codesign, Proceedings of Design Automation Conference DAC,

[2] F. Balarin, M. Chiodo, A. Jurecska, H. Hsieh, A. L. Lavagno, C. Passerone, A. Sangiovanni-Vincentelli, E. Sentovich, K. Suzuki, B. Tabbara, Hardware-Software Co-Design of Embedded Systems: The Polis Approach, Kluwer Academic Press, June 1997

[3] M. Lajolo, L. Lavagno, A. Sangiovanni-Vincentelli, Fast Instruction Cache Simulation Strategies in a Hardware/Software Co-Design Environment, Proceedings of the ASP-DAC 99 Asian and South Pacific Design Automation Conference,

ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN

ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN / ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN Gerard Berry Ed Harcourt Luciano Lavagno Ellen Sentovich Abstract We propose a new specification environment for system-level design called ECL.

More information

Extending POLIS with User-Defined Data Types

Extending POLIS with User-Defined Data Types Extending POLIS with User-Defined Data Types EE 249 Final Project by Arvind Thirunarayanan Prof. Alberto Sangiovanni-Vincentelli Mentors: Marco Sgroi, Bassam Tabbara Introduction POLIS is, by definition,

More information

Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator

Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator Jie Liu Department of EECS University of California Berkeley, CA 94720 liuj@eecs.berkeley.edu Marcello Lajolo Dipartimento

More information

Task Response Time Optimization Using Cost-Based Operation Motion

Task Response Time Optimization Using Cost-Based Operation Motion Task Response Time Optimization Using Cost-Based Operation Motion Abdallah Tabbara +1-510-643-5187 atabbara@eecs.berkeley.edu ABSTRACT We present a technique for task response time improvement based on

More information

Task Response Time Optimization Using Cost-Based Operation Motion

Task Response Time Optimization Using Cost-Based Operation Motion Task Response Time Optimization Using Cost-Based Operation Motion Bassam Tabbara +1-510-643-5187 tbassam@eecs.berkeley.edu ABSTRACT We present a technique for task response time improvement based on the

More information

Efficient Power Estimation Techniques for HW/SW Systems

Efficient Power Estimation Techniques for HW/SW Systems Efficient Power Estimation Techniques for HW/SW Systems Marcello Lajolo Anand Raghunathan Sujit Dey Politecnico di Torino NEC USA, C&C Research Labs UC San Diego Torino, Italy Princeton, NJ La Jolla, CA

More information

Architecture choices. Functional specifications. expensive loop (in time and cost) Hardware and Software partitioning. Hardware synthesis

Architecture choices. Functional specifications. expensive loop (in time and cost) Hardware and Software partitioning. Hardware synthesis Introduction of co-simulation in the design cycle of the real- control for electrical systems R.RUELLAND, J.C.HAPIOT, G.GATEAU Laboratoire d'electrotechnique et d'electronique Industrielle Unité Mixte

More information
