Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier"

Transcription

1 Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence

2 Performance = 1 Executon tme Speedup = Performance (B) Performance (A) = Tme (A) Tme (B) CPU tme = Instructons Program Cycles Instructon Seconds Cycle CPU clock cycles = n =1 CPI Instructons

3 The performance enhancement possble wth a gven mprovement s lmted by the amount that the mproved feature s used Executon tme after mprovement = Executon tme affected by the mprovement Amount of mprovement + Executon tme unaffected A common theme n Hardware desgn s to make the common case fast Increasng the clock rate would not affect memory access tme Usng a floatng pont processng unt does not speed nteger ALU operatons Example: Floatng pont nstructons mproved to run 2X; but only 10% of actual nstructons are floatng pont Exec-Tme new = Exec-Tme old x ( /2) = 0.95 x Exec-Tme old Speedup overall = Exec-Tme new / Exec-Tme old = 1/0.95 = 1.053

4 Tme old = Tme old * ( Fracton unchanged + Fracton enhanced) Tme new = Tme old * Fracton unchanged + Fracton enhanced Speedup enhanced Speedup overall = Tme old Tme new = = Speedup overall = Tme old Tme old * Fracton unchanged + Fracton enhanced 1 Fracton unchanged + Fracton enhanced Speedup enhanced 1 1 Fracton enhanced ( )+ Fracton enhanced Speedup enhanced Speedup enhanced

5 Tme KDF9 B5500 Instructons executed Code sze n nstructons Code sze n bts ICL μs ATLAS CDC 6600 NU The Burroughs B5500 machne s desgned specfcally for Algol 60 programs Although CDC 6600 s programs are over 3 tmes as bg as those of B5500, yet the CDC machne runs them almost 6 tmes faster Code sze cannot be used as an ndcaton for performance

6 Computer A Computer B Program 1 (seconds) 1 10 Program 2 (seconds) Total tme (seconds) Wrong summary can present a confusng pcture A s 10 tmes faster than B for program 1 B s 10 tmes faster than A for program 2 Total executon tme s a consstent summary measure Relatve executon tmes for the same workload Assumng that programs 1 and 2 are executng for the same number of tmes on computers A and B CPU Performance (B) CPU Performance (A) = Total executon tme (A) Total executon tme (B) = = 9.1 Executon tme s the only vald and unmpeachable measure of performance

7 Arthmetc Mean (AM) = 1 n Executon_ Tme n 1 = Weghted Arthmetc Mean (WAM) = n = 1 w Executon_ Tme Where: n s the number of programs executed w s a weghtng factor that ndcates the frequency of executng program n w = wth and = w 1 Weghted arthmetc means summarze performance whle trackng exec. tme Never use AM for normalzng tme relatve to a reference machne Tme on A Tme on B Norm. to A Norm. to B A B A B Program Program AM of normalzed tme AM of tme

8 Geometrc Mean (GM) = n n = 1 Executon_Tme_rato Where: n s the number of programs executed Wth Geometrc Mean ( X ) Geometrc Mean ( Y ) = Geometrc Mean X Y Geometrc mean s sutable for reportng average normalzed executon tme Tme on A Tme on B Norm. to A Norm. to B A B A B Program Program GM of tme or normalzed tme

9 Many wdely-used benchmarks are small programs that have sgnfcant localty of nstructon and data reference Unversal benchmarks can be msleadng snce hardware and compler vendors do optmze ther desgn for these programs The best types of benchmarks are real applcatons snce they reflect the end-user nterest Archtectures mght perform well for some applcatons and poorly for others Complaton can boost performance by takng advantage of archtecture-specfc features Applcaton-specfc compler optmzaton are becomng more popular

10 gcc espresso spce doduc nasa7 l eqntott matrx300 fpppp tomcatv Benchmark Compler Enhanced compler App. and arch. specfc optmzaton can dramatcally mpact performance

11 SPEC stands for System Performance Evaluaton Cooperatve sute of benchmarks Created by a set of companes to mprove the measurement and reportng of CPU performance SPEC2000 s the latest sute that conssts of 12 nteger (wrtten n C) and 14 floatng-pont (n Fortran 77) programs Customzed SPEC sutes have been recently ntroduced to assess performance of graphcs and transacton systems. Snce SPEC requres runnng applcatons on real hardware, the memory system has a sgnfcant effect on performance

12 Hardware Model number Powerstaton 550 CPU MHz POWER 4164 FPU (floatng pont) Integrated Number of CPU 1 Cache sze per CPU 64K data/8k nstructon Memory 64 MB Dsk subsystem Network nterface N/A Software MB SCSI OS type and revson AIX Ver Compler revson AIX XL C/6000 Ver AIX XL Fortran Ver. 2.2 Other software Fle system type Frmware level Tunng parameters Background load System state None AIX N/A System None None Mult-user (sngle-user logn) Gudng prncple s reproducblty (report envronment & experments setup)

13 SPEC rato = Executon tme on SUN SPARCstaton10/40 Executon tme on the measure machne Bgger numerc values of the SPEC rato ndcate faster machne

14 Clock rate (MHz) Pentum Clock rate (MHz) Pentum Pentum Pro Pentum Pro The performance measured may be dfferent on other Pentum-based hardware wth dfferent memory system and usng dfferent complers At the same clock rate, the SPECnt95 measure shows that Pentum Pro s tmes faster whle the SPECfp95 shows that t s tmes faster When the clock rate s ncreased by a certan factor, the processor performance ncreases by a lower factor

15 SPECbase CINT2000 Prces reflects those of July 2001 SPEC CINT2000 per $1000 n prce Dfferent results are obtaned for other benchmarks, e.g. SPEC CFP2000 Wth the excepton of the Sunblade prce-performance metrcs were consstent wth performance

16 In early computers most nstructons of a machne took the same executon tme The measure of performance for old machnes was the tme requred performng an ndvdual operaton (e.g. addton) New computers have dverse set of nstructons wth dfferent executon tmes The relatve frequency of nstructons across many programs was calculated The average nstructon executon tme was measured by multplyng the tme of each nstructon by ts frequency The average nstructon executon tme was a small step to MIPS that grew n popularty

17 MIPS = Mllon of Instructons Per Second one of the smplest metrcs vald only n a lmted context Instructon count MIPS (natve MIPS) = 6 Executon tme 10 There are three problems wth MIPS: MIPS specfes the nstructon executon rate but not the capabltes of the nstructons MIPS vares between programs on the same computer MIPS can vary nversely wth performance (see next example) The use of MIPS s smple and ntutve, faster machnes have bgger MIPS

18 Consder the machne wth the followng three nstructon classes and CPI: Now suppose we measure the code for the same program from two dfferent complers and obtan the followng data: Assume that the machne s clock rate s 500 MHz. Whch code sequence wll execute faster accordng to MIPS? Accordng to executon tme? Answer: Usng the formula: Instructon class CPI for ths nstructon class A 1 B 2 C 3 Instructon count n (bllons) for each Code from nstructon class A B C Compler Compler CPU clock cycles = CPI C Sequence 1: CPU clock cycles = ( ) 10 9 = cycles Sequence 2: CPU clock cycles = ( ) 10 9 = cycles n =1

19 Usng the formula: Execton tme = CPU clock cycles Clock rate Sequence 1: Executon tme = ( )/( ) = 20 seconds Sequence 2: Executon tme = ( )/( ) = 30 seconds Therefore compler 1 generates a faster program Usng the formula: MIPS = Instructon count Executon tme 10 6 ( ) 10 Sequence 1: MIPS = = ( ) 10 Sequence 2: MIPS = 6 = Although compler 2 has a hgher MIPS ratng, the code from generated by compler 1 runs faster 9 9

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science

More information

Performance Evaluation

Performance Evaluation Performance Evaluaton [Ch. ] What s performance? of a car? of a car wash? of a TV? How should we measure the performance of a computer? The response tme (or wall-clock tme) t takes to complete a task?

More information

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to:

Motivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to: 4.1 4.2 Motvaton EE 457 Unt 4 Computer System Performance An ndvdual user wants to: Mnmze sngle program executon tme A datacenter owner wants to: Maxmze number of Mnmze ( ) http://e-tellgentnternetmarketng.com/webste/frustrated-computer-user-2/

More information

CMSC 611: Advanced Computer Architecture

CMSC 611: Advanced Computer Architecture CMSC 611: Advanced Computer Architecture Cost, Performance & Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from David Culler, UC Berkeley

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Assembler. Building a Modern Computer From First Principles.

Assembler. Building a Modern Computer From First Principles. Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Review of Basic Computer Architecture

Review of Basic Computer Architecture of Basc Computer Archtecture 1 Computer Archtecture What s Computer Archtecture From Wkpeda, the free encyclopeda In computer scence and engneerng, computer archtecture refers to specfcaton of the relatonshp

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

High level vs Low Level. What is a Computer Program? What does gcc do for you? Program = Instructions + Data. Basic Computer Organization

High level vs Low Level. What is a Computer Program? What does gcc do for you? Program = Instructions + Data. Basic Computer Organization What s a Computer Program? Descrpton of algorthms and data structures to acheve a specfc ojectve Could e done n any language, even a natural language lke Englsh Programmng language: A Standard notaton

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Random Kernel Perceptron on ATTiny2313 Microcontroller

Random Kernel Perceptron on ATTiny2313 Microcontroller Random Kernel Perceptron on ATTny233 Mcrocontroller Nemanja Djurc Department of Computer and Informaton Scences, Temple Unversty Phladelpha, PA 922, USA nemanja.djurc@temple.edu Slobodan Vucetc Department

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Conditional Speculative Decimal Addition*

Conditional Speculative Decimal Addition* Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS CACHE MEMORY DESIGN FOR INTERNET PROCESSORS WE EVALUATE A SERIES OF THREE PROGRESSIVELY MORE AGGRESSIVE ROUTING-TABLE CACHE DESIGNS AND DEMONSTRATE THAT THE INCORPORATION OF HARDWARE CACHES INTO INTERNET

More information

Lecture 3: Computer Arithmetic: Multiplication and Division

Lecture 3: Computer Arithmetic: Multiplication and Division 8-447 Lecture 3: Computer Arthmetc: Multplcaton and Dvson James C. Hoe Dept of ECE, CMU January 26, 29 S 9 L3- Announcements: Handout survey due Lab partner?? Read P&H Ch 3 Read IEEE 754-985 Handouts:

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

RADIX-10 PARALLEL DECIMAL MULTIPLIER

RADIX-10 PARALLEL DECIMAL MULTIPLIER RADIX-10 PARALLEL DECIMAL MULTIPLIER 1 MRUNALINI E. INGLE & 2 TEJASWINI PANSE 1&2 Electroncs Engneerng, Yeshwantrao Chavan College of Engneerng, Nagpur, Inda E-mal : mrunalngle@gmal.com, tejaswn.deshmukh@gmal.com

More information

Floating-Point Division Algorithms for an x86 Microprocessor with a Rectangular Multiplier

Floating-Point Division Algorithms for an x86 Microprocessor with a Rectangular Multiplier Floatng-Pont Dvson Algorthms for an x86 Mcroprocessor wth a Rectangular Multpler Mchael J. Schulte Dmtr Tan Carl E. Lemonds Unversty of Wsconsn Advanced Mcro Devces Advanced Mcro Devces Schulte@engr.wsc.edu

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Review of Basic. Computer Architecture. Theory Goals Specification

Review of Basic. Computer Architecture. Theory Goals Specification Computer Archtecture What s Computer Archtecture of Basc Computer Archtecture From Wkpeda, the free encyclopeda In computer scence and engneerng, computer archtecture refers to specfcaton of the relatonshp

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Brave New World Pseudocode Reference

Brave New World Pseudocode Reference Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

Computer Architecture ELEC3441

Computer Architecture ELEC3441 Causes of Cache Msses: The 3 C s Computer Archtecture ELEC3441 Lecture 9 Cache (2) Dr. Hayden Kwo-Hay So Department of Electrcal and Electronc Engneerng Compulsory: frst reference to a lne (a..a. cold

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Topology Design using LS-TaSC Version 2 and LS-DYNA

Topology Design using LS-TaSC Version 2 and LS-DYNA Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Rules for Using Multi-Attribute Utility Theory for Estimating a User s Interests

Rules for Using Multi-Attribute Utility Theory for Estimating a User s Interests Rules for Usng Mult-Attrbute Utlty Theory for Estmatng a User s Interests Ralph Schäfer 1 DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken Ralph.Schaefer@dfk.de Abstract. In ths paper, we show that Mult-Attrbute

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach

Data Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer

More information

A Clustering Algorithm Solution to the Collaborative Filtering

A Clustering Algorithm Solution to the Collaborative Filtering Internatonal Journal of Scence Vol.4 No.8 017 ISSN: 1813-4890 A Clusterng Algorthm Soluton to the Collaboratve Flterng Yongl Yang 1, a, Fe Xue, b, Yongquan Ca 1, c Zhenhu Nng 1, d,* Hafeng Lu 3, e 1 Faculty

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Alternating Direction Method of Multipliers Implementation Using Apache Spark

Alternating Direction Method of Multipliers Implementation Using Apache Spark Alternatng Drecton Method of Multplers Implementaton Usng Apache Spark Deterch Lawson June 4, 2014 1 Introducton Many applcaton areas n optmzaton have benefted from recent trends towards massve datasets.

More information

Nachos Project 3. Speaker: Sheng-Wei Cheng 2010/12/16

Nachos Project 3. Speaker: Sheng-Wei Cheng 2010/12/16 Nachos Project Speaker: Sheng-We Cheng //6 Agenda Motvaton User Programs n Nachos Related Nachos Code for User Programs Project Assgnment Bonus Submsson Agenda Motvaton User Programs n Nachos Related Nachos

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Maintaining temporal validity of real-time data on non-continuously executing resources

Maintaining temporal validity of real-time data on non-continuously executing resources Mantanng temporal valdty of real-tme data on non-contnuously executng resources Tan Ba, Hong Lu and Juan Yang Hunan Insttute of Scence and Technology, College of Computer Scence, 44, Yueyang, Chna Wuhan

More information

LS-TaSC Version 2.1. Willem Roux Livermore Software Technology Corporation, Livermore, CA, USA. Abstract

LS-TaSC Version 2.1. Willem Roux Livermore Software Technology Corporation, Livermore, CA, USA. Abstract 12 th Internatonal LS-DYNA Users Conference Optmzaton(1) LS-TaSC Verson 2.1 Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2.1,

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

Cache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access

Cache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access Agenda Cache Performance Samra Khan March 28, 217 Revew from last lecture Cache access Assocatvty Replacement Cache Performance Cache Abstracton and Metrcs Address Tag Store (s the address n the cache?

More information

Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs

Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs Utlty-Based Acceleraton of Multthreaded Applcatons on Asymmetrc CMPs José A. Joao M. Aater Suleman Onur Mutlu Yale N. Patt ECE Department The Unversty of Texas at Austn Austn, TX, USA {joao, patt}@ece.utexas.edu

More information

SOLUTION APPROACHES FOR THE CLUSTER TOOL SCHEDULING PROBLEM IN SEMICONDUCTOR MANUFACTURING

SOLUTION APPROACHES FOR THE CLUSTER TOOL SCHEDULING PROBLEM IN SEMICONDUCTOR MANUFACTURING SOLUTION APPROACHES FOR THE CLUSTER TOOL SCHEDULING PROBLEM IN SEMICONDUCTOR MANUFACTURING Heko Nedermayer Olver Rose Computer Networks and Internet Wlhelm-Schckard-Insttute for Computer Scence Dstrbuted

More information

and NSF Engineering Research Center Abstract Generalized speedup is dened as parallel speed over sequential speed. In this paper

and NSF Engineering Research Center Abstract Generalized speedup is dened as parallel speed over sequential speed. In this paper Shared Vrtual Memory and Generalzed Speedup Xan-He Sun Janpng Zhu ICASE NSF Engneerng Research Center Mal Stop 132C Dept. of Math. and Stat. NASA Langley Research Center Msssspp State Unversty Hampton,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introducton 1.1 Parallel Processng There s a contnual demand for greater computatonal speed from a computer system than s currently possble (.e. sequental systems). Areas need great computatonal

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Optimizing for Speed. What is the potential gain? What can go Wrong? A Simple Example. Erik Hagersten Uppsala University, Sweden

Optimizing for Speed. What is the potential gain? What can go Wrong? A Simple Example. Erik Hagersten Uppsala University, Sweden Optmzng for Speed Er Hagersten Uppsala Unversty, Sweden eh@t.uu.se What s the potental gan? Latency dfference L$ and mem: ~5x Bandwdth dfference L$ and mem: ~x Repeated TLB msses adds a factor ~-3x Execute

More information

Cache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory

Cache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory Topcs Lecture 4 Cache Memores Generc cache memory organzaton Drect mapped caches Set assocate caches Impact of caches on performance Cache Memores Cache memores are small, fast SRAM-based memores managed

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Secure and Fast Fingerprint Authentication on Smart Card

Secure and Fast Fingerprint Authentication on Smart Card SETIT 2005 3 rd Internatonal Conference: Scences of Electronc, Technologes of Informaton and Telecommuncatons March 27-31, 2005 TUNISIA Secure and Fast Fngerprnt Authentcaton on Smart Card Y. S. Moon*,

More information

Giving credit where credit is due

Giving credit where credit is due CSCE 23J Computer Organzaton Cache Memores Dr. Stee Goddard goddard@cse.unl.edu Gng credt where credt s due Most of sldes for ths lecture are based on sldes created by Drs. Bryant and O Hallaron, Carnege

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

Optimized caching in systems with heterogeneous client populations

Optimized caching in systems with heterogeneous client populations Performance Evaluaton 42 (2000) 163 185 Optmzed cachng n systems wth heterogeneous clent populatons Derek L. Eager a,, Mchael C. Ferrs b, Mary K. Vernon b a Department of Computer Scence, Unversty of Saskatchewan,

More information

Quantifying Performance Models

Quantifying Performance Models Quantfyng Performance Models Prof. Danel A. Menascé Department of Computer Scence George Mason Unversty www.cs.gmu.edu/faculty/menasce.html 1 Copyrght Notce Most of the fgures n ths set of sldes come from

More information

Article RGCA: a Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

Article RGCA: a Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization Artcle RGCA: a Relable GPU Cluster Archtecture for Large-Scale Internet of Thngs Computng Based on Effectve Performance-Energy Optmzaton Yulng Fang, Qngku Chen *, Neal N. Xong, Deyu Zhao and Jngjuan Wang

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.

Assembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface. IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005

Exercises (Part 4) Introduction to R UCLA/CCPR. John Fox, February 2005 Exercses (Part 4) Introducton to R UCLA/CCPR John Fox, February 2005 1. A challengng problem: Iterated weghted least squares (IWLS) s a standard method of fttng generalzed lnear models to data. As descrbed

More information

Cracking of the Merkle Hellman Cryptosystem Using Genetic Algorithm

Cracking of the Merkle Hellman Cryptosystem Using Genetic Algorithm Crackng of the Merkle Hellman Cryptosystem Usng Genetc Algorthm Zurab Kochladze 1 * & Lal Besela 2 1 Ivane Javakhshvl Tbls State Unversty, 1, I.Chavchavadze av 1, 0128, Tbls, Georga 2 Sokhum State Unversty,

More information

Memory and I/O Organization

Memory and I/O Organization Memory and I/O Organzaton 8-1 Prncple of Localty Localty small proporton of memory accounts for most run tme Rule of thumb For 9% of run tme next nstructon/data wll come from 1% of program/data closest

More information

Optimized Resource Scheduling Using Classification and Regression Tree and Modified Bacterial Foraging Optimization Algorithm

Optimized Resource Scheduling Using Classification and Regression Tree and Modified Bacterial Foraging Optimization Algorithm World Engneerng & Appled Scences Journal 7 (1): 10-17, 2016 ISSN 2079-2204 IDOSI Publcatons, 2016 DOI: 10.5829/dos.weasj.2016.7.1.22540 Optmzed Resource Schedulng Usng Classfcaton and Regresson Tree and

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Alufix Expert D Design Software #85344

Alufix Expert D Design Software #85344 238 ALUFIX SOFTWARE Alufx Expert 2014 3D Desgn Software #85344 Alufx Expert software makes automatc desgns for fxtures wth correspondng partlsts. You choose the system and defne clampng ponts. The software

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

A Parallelization Design of JavaScript Execution Engine

A Parallelization Design of JavaScript Execution Engine , pp.171-184 http://dx.do.org/10.14257/mue.2014.9.7.15 A Parallelzaton Desgn of JavaScrpt Executon Engne Duan Huca 1,2, N Hong 2, Deng Feng 2 and Hu Lnln 2 1 Natonal Network New eda Engneerng Research

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Energy-Efficient Workload Placement in Enterprise Datacenters

Energy-Efficient Workload Placement in Enterprise Datacenters COVER FEATURE CLOUD COMPUTING Energy-Effcent Workload Placement n Enterprse Datacenters Quan Zhang and Wesong Sh, Wayne State Unversty Power loss from an unnterruptble power supply can account for 15 percent

More information

Real-Time Systems. Real-Time Systems. Verification by testing. Verification by testing

Real-Time Systems. Real-Time Systems. Verification by testing. Verification by testing EDA222/DIT161 Real-Tme Systems, Chalmers/GU, 2014/2015 Lecture #8 Real-Tme Systems Real-Tme Systems Lecture #8 Specfcaton Professor Jan Jonsson Implementaton System models Executon-tme analyss Department

More information

Multiblock method for database generation in finite element programs

Multiblock method for database generation in finite element programs Proc. of the 9th WSEAS Int. Conf. on Mathematcal Methods and Computatonal Technques n Electrcal Engneerng, Arcachon, October 13-15, 2007 53 Multblock method for database generaton n fnte element programs

More information

Burst Round Robin as a Proportional-Share Scheduling Algorithm

Burst Round Robin as a Proportional-Share Scheduling Algorithm Burst Round Robn as a Proportonal-Share Schedulng Algorthm Tarek Helmy * Abdelkader Dekdouk ** * College of Computer Scence & Engneerng, Kng Fahd Unversty of Petroleum and Mnerals, Dhahran 31261, Saud

More information

Loop Pipelining for High-Throughput Stream Computation Using Self-Timed Rings

Loop Pipelining for High-Throughput Stream Computation Using Self-Timed Rings Loop Ppelnng for Hgh-Throughput Stream Computaton Usng Self-Tmed Rngs Gennette Gll, John Hansen and Montek Sngh Dept. of Computer Scence Unv. of North Carolna, Chapel Hll, NC 27599, USA {gllg,jbhansen,montek}@cs.unc.edu

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Desgn and Analyss of Algorthms Heaps and Heapsort Reference: CLRS Chapter 6 Topcs: Heaps Heapsort Prorty queue Huo Hongwe Recap and overvew The story so far... Inserton sort runnng tme of Θ(n 2 ); sorts

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Verification by testing

Verification by testing Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over

More information

Newton-Raphson division module via truncated multipliers

Newton-Raphson division module via truncated multipliers Newton-Raphson dvson module va truncated multplers Alexandar Tzakov Department of Electrcal and Computer Engneerng Illnos Insttute of Technology Chcago,IL 60616, USA Abstract Reducton n area and power

More information

FPGA-based implementation of circular interpolation

FPGA-based implementation of circular interpolation Avalable onlne www.jocpr.com Journal of Chemcal and Pharmaceutcal Research, 04, 6(7):585-593 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 FPGA-based mplementaton of crcular nterpolaton Mngyu Gao,

More information

THE IMPACT OF SMT/SMP DESIGNS ON MULTIMEDIA SOFTWARE ENGINEERING - A WORKLOAD ANALYSIS STUDY

THE IMPACT OF SMT/SMP DESIGNS ON MULTIMEDIA SOFTWARE ENGINEERING - A WORKLOAD ANALYSIS STUDY THE IMPACT OF SMT/SMP DESIGNS ON MULTIMEDIA SOFTWARE ENGINEERING - A WORKLOAD ANALYSIS STUDY Yen-Kuang Chen, Raner Lenhart, Erc Debes, Matthew Hollman, and Mnerva Yeung Mcroprocessor Research Labs, Intel

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information