Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier
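2 $\text{Performance} = \dfrac{1}{\text{Execution time}}$
$\text{Speedup} = \dfrac{\text{Performance}(B)}{\text{Performance}(A)} = \dfrac{\text{Time}(A)}{\text{Time}(B)}$
$\text{CPU time} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Cycle}}$
$\text{CPU clock cycles} = \sum_{i=1}^{n} \text{CPI}_i \times \text{Instructions}_i$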
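The formulas on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names and the sample numbers (instruction count, CPI, clock rate) are illustrative assumptions, not values from the slides.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    # Dividing by the clock rate is the same as multiplying by seconds/cycle.
    return instruction_count * cpi / clock_rate_hz


def total_cycles(classes):
    """CPU clock cycles = sum over classes of CPI_i x Instructions_i.

    `classes` is a list of (cpi_i, instruction_count_i) pairs.
    """
    return sum(cpi * count for cpi, count in classes)


def speedup(time_a, time_b):
    """How many times faster B is than A: Time(A) / Time(B)."""
    return time_a / time_b


# Hypothetical machines running the same 2-billion-instruction program at 1 GHz.
time_a = cpu_time(2_000_000_000, 1.5, 1_000_000_000)
time_b = cpu_time(2_000_000_000, 1.0, 1_000_000_000)
print(f"A: {time_a:.2f} s, B: {time_b:.2f} s, speedup of B over A: {speedup(time_a, time_b):.2f}")
```

Because performance is the reciprocal of execution time, the speedup is written as a ratio of times with the slower machine in the numerator.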
3 The performance enhancement possible with a given improvement is limited by the amount that the improved feature is used (Amdahl's Law).
$\text{Execution time after improvement} = \dfrac{\text{Execution time affected by the improvement}}{\text{Amount of improvement}} + \text{Execution time unaffected}$
A common theme in hardware design is to make the common case fast.
Increasing the clock rate would not affect memory access time.
Using a floating-point processing unit does not speed up integer ALU operations.
Example: floating-point instructions are improved to run 2X faster, but only 10% of the instructions actually executed are floating point.
$\text{Exec-Time}_{new} = \text{Exec-Time}_{old} \times (0.9 + 0.1/2) = 0.95 \times \text{Exec-Time}_{old}$
$\text{Speedup}_{overall} = \text{Exec-Time}_{old} / \text{Exec-Time}_{new} = 1/0.95 = 1.053$
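4 $\text{Time}_{old} = \text{Time}_{old} \times (\text{Fraction}_{unchanged} + \text{Fraction}_{enhanced})$
$\text{Time}_{new} = \text{Time}_{old} \times \left(\text{Fraction}_{unchanged} + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}\right)$
$\text{Speedup}_{overall} = \dfrac{\text{Time}_{old}}{\text{Time}_{new}} = \dfrac{\text{Time}_{old}}{\text{Time}_{old} \times \left(\text{Fraction}_{unchanged} + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}\right)} = \dfrac{1}{\text{Fraction}_{unchanged} + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$
$\text{Speedup}_{overall} = \dfrac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}}$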
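The final expression of the derivation translates directly into code. The following is a minimal sketch; the function name `amdahl_speedup` and its argument checks are assumptions made for illustration.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - F_enhanced) + F_enhanced / S_enhanced)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# The floating-point example: 10% of the instructions run 2X faster.
print(round(amdahl_speedup(0.10, 2.0), 3))  # 1.053

# Even an arbitrarily fast floating-point unit cannot beat 1 / 0.9.
print(round(amdahl_speedup(0.10, 1e12), 3))  # 1.111
```

The second call shows the limit stated by the law: the unaffected 90% of the execution time caps the overall speedup at about 1.11, however large the enhancement.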
5 [Table: time, instructions executed, code size in instructions, and code size in bits for the KDF9, B5500, ICL, ATLAS, CDC 6600 and NU machines; the numeric entries are not available.]
The Burroughs B5500 machine is designed specifically for Algol 60 programs.
Although the CDC 6600's programs are over 3 times as big as those of the B5500, the CDC machine runs them almost 6 times faster.
Code size cannot be used as an indication of performance.
6 Execution times on two computers:
Program 1 (seconds): Computer A = 1, Computer B = 10
Program 2 (seconds): Computer A = 1000, Computer B = 100
Total time (seconds): Computer A = 1001, Computer B = 110
A wrong summary can present a confusing picture: A is 10 times faster than B for program 1, while B is 10 times faster than A for program 2.
Total execution time is a consistent summary measure. It compares relative execution times for the same workload, assuming that programs 1 and 2 are executed the same number of times on computers A and B.
$\dfrac{\text{CPU Performance}(B)}{\text{CPU Performance}(A)} = \dfrac{\text{Total execution time}(A)}{\text{Total execution time}(B)} = \dfrac{1001}{110} = 9.1$
Execution time is the only valid and unimpeachable measure of performance.
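The comparison by total execution time can be sketched as follows. The Program 2 times used here (1000 s on A and 100 s on B) are an assumption chosen to match the stated 10-times ratios and the 9.1 result; the dictionary layout and function names are illustrative.

```python
# Execution times in seconds; the program 2 values are assumed, see the note above.
TIMES = {
    "A": {"program 1": 1.0, "program 2": 1000.0},
    "B": {"program 1": 10.0, "program 2": 100.0},
}


def total_time(machine):
    """Total execution time when each program is run once."""
    return sum(TIMES[machine].values())


def relative_performance(faster, slower):
    """Performance(faster) / Performance(slower) = Time(slower) / Time(faster)."""
    return total_time(slower) / total_time(faster)


print(total_time("A"), total_time("B"))            # 1001.0 110.0
print(round(relative_performance("B", "A"), 1))    # 9.1
```

Per-program ratios point in opposite directions, but the totals give a single answer for this equal-frequency workload.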
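7 $\text{Arithmetic Mean (AM)} = \dfrac{1}{n} \sum_{i=1}^{n} \text{Execution\_Time}_i$
$\text{Weighted Arithmetic Mean (WAM)} = \sum_{i=1}^{n} w_i \times \text{Execution\_Time}_i$
Where: $n$ is the number of programs executed, and $w_i$ is a weighting factor that indicates the frequency of executing program $i$, with $w_i \ge 0$ and $\sum_{i=1}^{n} w_i = 1$.
Weighted arithmetic means summarize performance while tracking execution time.
Never use AM for normalizing time relative to a reference machine.
[Table: time on A and time on B for Programs 1 and 2, the same times normalized to A and normalized to B, with summary rows for AM of normalized time and AM of time; the numeric entries are not available.]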
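A short sketch shows why the arithmetic mean of normalized times is unreliable. The times below (1 s and 1000 s on A, 10 s and 100 s on B) are assumed for illustration, and the helper names are hypothetical.

```python
def arithmetic_mean(values):
    """AM = (1/n) * sum of the values."""
    return sum(values) / len(values)


def weighted_arithmetic_mean(values, weights):
    """WAM = sum of w_i * value_i, with the weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * v for w, v in zip(weights, values))


def normalize(times, reference):
    """Divide each program's time by the reference machine's time."""
    return [t / r for t, r in zip(times, reference)]


time_a = [1.0, 1000.0]   # assumed times for programs 1 and 2 on A
time_b = [10.0, 100.0]   # assumed times for programs 1 and 2 on B

# Normalized to A, machine B looks about 5 times slower than A ...
print(arithmetic_mean(normalize(time_b, time_a)))
# ... but normalized to B, machine A looks about 5 times slower than B.
print(arithmetic_mean(normalize(time_a, time_b)))
# The AM of the raw times does not depend on any reference machine.
print(arithmetic_mean(time_a), arithmetic_mean(time_b))
```

The two normalized summaries contradict each other, which is the reason for the warning on this slide.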
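8 $\text{Geometric Mean (GM)} = \sqrt[n]{\prod_{i=1}^{n} \text{Execution\_Time\_ratio}_i}$
Where: $n$ is the number of programs executed.
With $\dfrac{\text{Geometric Mean}(X_i)}{\text{Geometric Mean}(Y_i)} = \text{Geometric Mean}\left(\dfrac{X_i}{Y_i}\right)$
Geometric mean is suitable for reporting average normalized execution time.
[Table: time on A and time on B for Programs 1 and 2, the same times normalized to A and normalized to B, with a summary row for GM of time or normalized time; the numeric entries are not available.]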
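The ratio property of the geometric mean can be verified numerically. This is a minimal sketch with assumed execution times (1 s and 1000 s on A, 10 s and 100 s on B); `geometric_mean` is a hypothetical helper built on the standard `math` module.

```python
import math


def geometric_mean(values):
    """GM = n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


time_a = [1.0, 1000.0]   # assumed times for programs 1 and 2 on A
time_b = [10.0, 100.0]   # assumed times for programs 1 and 2 on B

ratios_b_over_a = [b / a for a, b in zip(time_a, time_b)]
ratios_a_over_b = [a / b for a, b in zip(time_a, time_b)]

# GM(X) / GM(Y) equals GM(X / Y), so the choice of reference machine
# cannot change the conclusion.
lhs = geometric_mean(time_b) / geometric_mean(time_a)
rhs = geometric_mean(ratios_b_over_a)
print(math.isclose(lhs, rhs))

# Both normalizations agree: their product is 1.
print(math.isclose(geometric_mean(ratios_b_over_a) * geometric_mean(ratios_a_over_b), 1.0))
```

Unlike the arithmetic mean of normalized times, the two normalizations here are reciprocals of each other, so they always tell the same story.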
9 Many widely used benchmarks are small programs that have significant locality of instruction and data reference.
Universal benchmarks can be misleading, since hardware and compiler vendors optimize their designs for these programs.
The best types of benchmarks are real applications, since they reflect the end-user's interest.
Architectures might perform well for some applications and poorly for others.
Compilation can boost performance by taking advantage of architecture-specific features.
Application-specific compiler optimizations are becoming more popular.
10 [Figure: results per benchmark (gcc, espresso, spice, doduc, nasa7, li, eqntott, matrix300, fpppp, tomcatv) for two series, "Compiler" and "Enhanced compiler".]
Application- and architecture-specific optimization can dramatically impact performance.
11 SPEC stands for System Performance Evaluation Cooperative, a suite of benchmarks created by a set of companies to improve the measurement and reporting of CPU performance.
SPEC2000 is the latest suite and consists of 12 integer programs (written in C) and 14 floating-point programs (in Fortran 77).
Customized SPEC suites have recently been introduced to assess the performance of graphics and transaction systems.
Since SPEC requires running applications on real hardware, the memory system has a significant effect on performance.
12 Hardware
Model number: Powerstation 550
CPU: MHz POWER 4164
FPU (floating point): Integrated
Number of CPU: 1
Cache size per CPU: 64K data/8K instruction
Memory: 64 MB
Disk subsystem: MB SCSI
Network interface: N/A
Software
OS type and revision: AIX Ver
Compiler revision: AIX XL C/6000 Ver, AIX XL Fortran Ver. 2.2
Other software: None
File system type: AIX
Firmware level: N/A
System
Tuning parameters: None
Background load: None
System state: Multi-user (single-user login)
The guiding principle is reproducibility (report the environment and the experimental setup).
13 $\text{SPEC ratio} = \dfrac{\text{Execution time on SUN SPARCstation 10/40}}{\text{Execution time on the measured machine}}$
Bigger numeric values of the SPEC ratio indicate a faster machine.
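As an illustration of the ratio, the sketch below normalizes measured times to a reference machine and summarizes the ratios with a geometric mean, the measure recommended earlier for normalized times. The benchmark names come from the slides, but every timing value and function name here is a hypothetical assumption.

```python
import math


def spec_ratio(reference_time, measured_time):
    """Reference-machine time divided by measured-machine time (bigger is faster)."""
    return reference_time / measured_time


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times in seconds: (reference machine, measured machine).
timings = {
    "gcc": (100.0, 25.0),
    "espresso": (80.0, 40.0),
    "tomcatv": (120.0, 15.0),
}

ratios = {name: spec_ratio(ref, measured) for name, (ref, measured) in timings.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}")
print(f"summary (geometric mean): {geometric_mean(list(ratios.values())):.2f}")
```

A ratio above 1 means the measured machine finished that benchmark faster than the reference machine.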
14 [Figure: two plots against clock rate (MHz), each comparing the Pentium and the Pentium Pro.]
The performance measured may be different on other Pentium-based hardware with a different memory system and different compilers.
At the same clock rate, the SPECint95 measure shows that the Pentium Pro is [factor not recovered] times faster, while SPECfp95 shows that it is [factor not recovered] times faster.
When the clock rate is increased by a certain factor, the processor performance increases by a lower factor.
15 [Figure: SPECbase CINT2000 and SPEC CINT2000 per $1000 in price for several systems; prices reflect those of July 2001.]
Different results are obtained for other benchmarks, e.g. SPEC CFP2000.
With the exception of the Sunblade, price-performance metrics were consistent with performance.
16 In early computers, most instructions of a machine took the same execution time.
The measure of performance for old machines was the time required to perform an individual operation (e.g. addition).
New computers have a diverse set of instructions with different execution times.
The relative frequency of instructions across many programs was calculated.
The average instruction execution time was computed by multiplying the time of each instruction by its frequency and summing the results.
The average instruction execution time was a small step away from MIPS, which grew in popularity.
17 MIPS = Million Instructions Per Second: one of the simplest metrics, valid only in a limited context.
$\text{MIPS (native MIPS)} = \dfrac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$
There are three problems with MIPS: (1) MIPS specifies the instruction execution rate but not the capabilities of the instructions; (2) MIPS varies between programs on the same computer; (3) MIPS can vary inversely with performance (see next example).
The use of MIPS is simple and intuitive: faster machines have bigger MIPS.
18 Consider a machine with the following three instruction classes and CPI:
Instruction class A: CPI = 1
Instruction class B: CPI = 2
Instruction class C: CPI = 3
Now suppose we measure the code for the same program from two different compilers and obtain the instruction counts (in billions) for each instruction class A, B and C. [Table: instruction counts per class for the code from compiler 1 and compiler 2; the numeric entries are not available.]
Assume that the machine's clock rate is 500 MHz. Which code sequence will execute faster according to MIPS? According to execution time?
Answer: using the formula $\text{CPU clock cycles} = \sum_{i=1}^{n} \text{CPI}_i \times C_i$, where $C_i$ is the instruction count for class $i$:
Sequence 1: CPU clock cycles $= 10 \times 10^{9}$ cycles
Sequence 2: CPU clock cycles $= 15 \times 10^{9}$ cycles
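19 Using the formula $\text{Execution time} = \dfrac{\text{CPU clock cycles}}{\text{Clock rate}}$:
Sequence 1: Execution time $= (10 \times 10^{9}) / (500 \times 10^{6}) = 20$ seconds
Sequence 2: Execution time $= (15 \times 10^{9}) / (500 \times 10^{6}) = 30$ seconds
Therefore compiler 1 generates a faster program.
Using the formula $\text{MIPS} = \dfrac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$:
Sequence 1: MIPS $= \dfrac{\text{Instruction count}_1}{20 \times 10^{6}}$ [value not recovered]
Sequence 2: MIPS $= \dfrac{\text{Instruction count}_2}{30 \times 10^{6}}$ [value not recovered]
Although compiler 2 has a higher MIPS rating, the code generated by compiler 1 runs faster.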
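The whole example can be replayed in a few lines. The per-class instruction counts below (5, 1, 1 billion for compiler 1 and 10, 1, 1 billion for compiler 2) are an assumption: they are chosen to reproduce the stated 20-second and 30-second execution times and are not taken from the transcription. The function names are illustrative.

```python
CPI = {"A": 1, "B": 2, "C": 3}      # cycles per instruction for each class
CLOCK_RATE_HZ = 500_000_000         # 500 MHz

# Assumed instruction counts per class (see the note above).
INSTRUCTION_COUNTS = {
    "compiler 1": {"A": 5_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000},
    "compiler 2": {"A": 10_000_000_000, "B": 1_000_000_000, "C": 1_000_000_000},
}


def clock_cycles(counts):
    """CPU clock cycles = sum over classes of CPI_i x C_i."""
    return sum(CPI[cls] * n for cls, n in counts.items())


def execution_time(counts):
    """Execution time = CPU clock cycles / clock rate."""
    return clock_cycles(counts) / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (execution_time(counts) * 1e6)


for name, counts in INSTRUCTION_COUNTS.items():
    print(f"{name}: {execution_time(counts):.0f} s, {mips(counts):.0f} MIPS")
```

Under these assumed counts, compiler 2 achieves the higher MIPS rating (400 against 350) while taking longer to run (30 s against 20 s), which is the inverse relationship the slide warns about.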
Loop Ppelnng for Hgh-Throughput Stream Computaton Usng Self-Tmed Rngs Gennette Gll, John Hansen and Montek Sngh Dept. of Computer Scence Unv. of North Carolna, Chapel Hll, NC 27599, USA {gllg,jbhansen,montek}@cs.unc.edu
More informationBurst Round Robin as a Proportional-Share Scheduling Algorithm
Burst Round Robn as a Proportonal-Share Schedulng Algorthm Tarek Helmy * Abdelkader Dekdouk ** * College of Computer Scence & Engneerng, Kng Fahd Unversty of Petroleum and Mnerals, Dhahran 31261, Saud
More informationDesign and Analysis of Algorithms
Desgn and Analyss of Algorthms Heaps and Heapsort Reference: CLRS Chapter 6 Topcs: Heaps Heapsort Prorty queue Huo Hongwe Recap and overvew The story so far... Inserton sort runnng tme of Θ(n 2 ); sorts
More informationLecture 5: Multilayer Perceptrons
Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented
More informationVerification by testing
Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over
More informationNewton-Raphson division module via truncated multipliers
Newton-Raphson dvson module va truncated multplers Alexandar Tzakov Department of Electrcal and Computer Engneerng Illnos Insttute of Technology Chcago,IL 60616, USA Abstract Reducton n area and power
More informationFPGA-based implementation of circular interpolation
Avalable onlne www.jocpr.com Journal of Chemcal and Pharmaceutcal Research, 04, 6(7):585-593 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 FPGA-based mplementaton of crcular nterpolaton Mngyu Gao,
More informationTHE IMPACT OF SMT/SMP DESIGNS ON MULTIMEDIA SOFTWARE ENGINEERING - A WORKLOAD ANALYSIS STUDY
THE IMPACT OF SMT/SMP DESIGNS ON MULTIMEDIA SOFTWARE ENGINEERING - A WORKLOAD ANALYSIS STUDY Yen-Kuang Chen, Raner Lenhart, Erc Debes, Matthew Hollman, and Mnerva Yeung Mcroprocessor Research Labs, Intel
More informationReducing Frame Rate for Object Tracking
Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg
More informationLife Tables (Times) Summary. Sample StatFolio: lifetable times.sgp
Lfe Tables (Tmes) Summary... 1 Data Input... 2 Analyss Summary... 3 Survval Functon... 5 Log Survval Functon... 6 Cumulatve Hazard Functon... 7 Percentles... 7 Group Comparsons... 8 Summary The Lfe Tables
More information