DOI: /jos Tel/Fax: by Journal of Software. All rights reserved. , )

Size: px

Start display at page:

Logan Morgan
5 years ago
Views:

1 ISSN , CODEN RUXUEW Journal of Software, Vol18, No4, April 2007, pp DOI: /jos Tel/Fax: by Journal of Software All rights reserved CPU SimOS-Goodson 1,2+, 1, 1, 1, 1 1, 1 (, ) 2 (, ) SimOS-Goodson: A Goodson-Processor Based Multi-Core Full-System Simulator GAO Xiang 1,2+, ZHANG Fu-Xin 1, TANG Yan 1, ZHANG Long-Bing 1, HU Wei-Wu 1, TANG Zhi-Min 1 1 (Institute of Computing Technology, The Chinese Academy of Sciences, Beijing , China) 2 (Department of Computer Science and Technology, University of Science and Technology of China, Hefei , China) + Corresponding author: Phn: ext 9320, gaoxiang@ictaccn Gao X, Zhang FX, Tang Y, Zhang LB, Hu WW, Tang ZM SimOS-Goodson: A Goodson-processor based multi-core full-system simulator Journal of Software, 2007,18(4): /1047htm Abstract: As the Chip MultiProcessors (CMPs) have become the trend of high performance microprocessors, the target workloads become more and more diversified The traditional user-level simulators cannot handle them, so new simulators are needed for the future architecture research Based on the SimOS full-system environment, a new multi-core full-system simulator of Goodson processors, SimOS-Goodson, has been designed and implemented The SimOS-Goodson decouples the simulation functionality and timing It adopts a new value-prediction approach to implement memory consistency in the simulation environment The credibility and accuracy of SimOS-Goodson are achieved by cross-validating the simulator with the actual hardware The simulator inherits the benefits such as high speed and high flexibility from the traditional user-level simulators It also has the new benefits such as accuracy, full-system support and easy to use By porting the entire Linux OS, analysis and evaluation of the microarchitecture and workloads can be conducted easily in the SimOS-Goodson full-system environment On a machine of Pentium4 30GHz, the speed of SimOS-Goodson exceeds 300K instructions per second SimOS-Goodson will play a key role in the research of future Goodson multi-core architecture Key words: simulator; Goodson-2 processor; full-system; multi-core; SimOS : Supported by the National Natural Foundation of China for Distinguished Young Scholars under Grant No ( ); the National High-Tech Research and Development Plan of China under Grant Nos2005AA110010, 2005AA ( (863)); the National Grand Fundamental Research 973 Program of China under Grant No2005CB ( (973)); the Basic Research Foundation of the Institute of Computing Technology, the Chinese Academy of Sciences under Grant No ( ); the Knowledge Innovation Program of the Institute of Computing Technology, the Chinese Academy of Sciences under Grant No ( ) Received ; Accepted

2 1048 Journal of Software Vol18, No4, April 2007 SimOS, CPU SimOS-Goodson SimOS-Goodson,SimOS-Goodson, Linux, SimOS-Goodson 30GHz Pentium4,SimOS-Goodson 300K/ SimOS-Goodson CPU : ; 2 ; ;SimOS : TP302 : A, 3 SimpleScalar [1], :, Web,, ;,,Intel,AMD, IBM,CMP(chip multiprocessor), 3, [2],,,, Stanford SimOS [4], CPU [5] SimOS-Goodson :, SimOS-Goodson 300K/ ; SimOS-Goodson 2 CPU, 1 2 SimOS-Goodson 3 SimOS-Goodson 4 SimOS-Goodson 5 SimOS-Goodson 6 1 [3] 11 2 CPU 2 4,,

3 : CPU SimOS-Goodson , ICT-Goodson,,, ICT-Goodson Sim-Goodson ICT-Goodson 2 (execution-driven) SimpleScalar 2 CPU SPEC CPU2000,Sim-Goodson,,Sim-Goodson 12 SimOS SimOS Stanford DASH/FLASH SimOS MIPS R4000 SimOS : ; IRIX ( ) 2 SimOS-Goodson SimOS-Goodson,SimOS-Goodson, SimOS, 2 SimOS SimOS-Goodson 1 Linux/MIPS, Application: SPEC CPU2000, Splash2, Linux-Mips operating system SimOS interface/device driver Console Interrupt-Controller, Timer NIC DISK Memory controller CPU interface/interconnect SimOS Goodson CPU core Goodson CPU core Fig1 Diagram of SimOS-Goodson 1 SimOS-Goodson SimOS 2 3 : 2 21 SimOS-Goodson 2 Sim-Goodson

4 1050 Journal of Software Vol18, No4, April 2007 (proxy), ; SimOS-Goodson 2 (reorder queue, ROQ),,,,, Cache : ; 22 2 (processor consistency),,,, :, [1],,,,,, 2, [6], 2,, 2 23

5 : CPU SimOS-Goodson 1051 SimOS-Goodson SimOS,, : ; : 8255, Linux/MIPS,SimOS-Goodson,Linux/MIPS ELF(executable and linkable format),, SimOS-Goodson ELF : 2 Linux ;, Linux SimOS-Goodson SCSI(small computer system interface) SimOS-Goodson Linux ( 2420) SimOS-Goodson, VxWorks SimOS-Goodson SimOS-Goodson 3 31 SimOS-Goodson 3, GDB(the GNU project debugger) (stub) (checkpoint), SimOS-Goodson SimOS-Fast,,SimOS-Fast SimOS, 2M / 2,,SimOS-Fast,SimOS-Goodson ;, SimOS-Fast, SimOS-Goodson SimOS-Goodson Share memory PC trace SimOS-Fast InterProcess communication Fig3 On-Line cross validating 3 SimOS GDB, SimOS GDB (416) Linux/MIPS GDB(53 60) SimOS-Goodson, 2 SimOS-Goodson Linux/MIPS SimOS-Goodson,,

6 1052 Journal of Software Vol18, No4, April 2007 (warm-up), 32 SimOS-Goodson Tcl, SimOS ELF dwarf,, Tcl, SimOS IRIX, Linux Linux Tcl 2 Load/Store ( cp0queue) [5],cp0queue, Load/Store, Store CACHE,Load Store cp0queue LW/SW LW/SW,Load Store ( 30%), Load Store Load, SPEC CPU 11% 7% 4 SimOS-Goodson Linux Unix 30G Pentium 4, 2GB, 4GB Red Hat Linux 73, GCC 296 Linux/MIPS SimOS-Goodson,, SPEC CPU 2000 [7] LMbench [8], SPEC CPU 2000 CPU IPC(instructons per cycle)simos-goodson CPU,,, 1 2 CPU,SPEC CPU2000 test, 10, 2 3 IPC 1 ICT-Goodson SimOS-Goodson 10%, SimOS-Goodson ; Sim-Goodson,SimOS-Goodson IPC Cache LMbench, ( ) 2 300MHZ 2 SimOS-Goodson 15%, SimOS-Goodson

7 : CPU SimOS-Goodson Table 1 Validating with SPEC CPU SPEC CPU 2000 Prog (SPECint) ICT-Goodson Sim-Goodson SimOS-Goodson IPC-Err Mcf Parser Perlbmk Crafty Twolf Vortex Vpr Gap Gcc Gzip Eon Applu Apsi Art Swim Equake Mesa Wupwise Sixtrack Table 2 Validating with LMbench 2 LMbench Application Real SimOS-Goodson Err Description Syscall null Simple syscall Syscall read Read syscall Pipe Interprocess communication through pipes Sig catch Signal handler Unix Unix stream socket Fifo Pipe operation Ctx Context switching, IPS(instructions per second) CPS(cycles per second) 4 3 Linux SPEC CPU 2000 SPLASH2 [9] Apache Benchmark,,SPLASH2 3 SimOS-Goodson Table 3 Simlating speed of SimOS-Goodson 3 SimOS-Goodson Core count Workload Cycles (K) Instructions (K) Time (s) CPS (K) IPS (K) 1 Linux OS boot Spec2k (MESA) Splash2 (radix) Apache benchmark Linux OS boot Splash2 (radix) Spec2k (MESA+MESA) Apache benchmark Linux OS boot Splash2 (radix) Apache benchmark

8 1054 Journal of Software Vol18, No4, April 2007 Linux 506K CPS IPC IPC,,SimOS-Goodson 300K IPS, 405K IPS 4 Linux Linux cpu_idle Cache 1, IPC 2 CPU,ICT-Goodson 50K / ;Sim-Goodson 500K / ;SimOS-Goodson 300K / 400K / ICT-Goodson,,, ;Sim-Goodson, ;SimOS-Goodson, Sim-Goodson, 5 DEC IBM SimOS Alpha PowerPC SimOS-Alpha SimOS-PPC SimOS-Goodson SimOS,, ;, GEMS(general execution model simulator) [10] Wisconsin, Simics [11] Simics SimOS-Goodson, GEMS Simics, [2] 120K IPS,GEMS (flat), SimOS-Goodson,,SimOS-Goodson 300 K IPS,SimOS-Goodson, 6 SimOS SimOS-Goodson,, 3 SPEC CPU 2000 LMbench, SimOS-Goodson 15% SimOS-Goodson, 2 CPU Linux,SimOS-Goodson 30%, 300K / SimOS-Goodson, SimOS-Goodson Cache,

: CPU SimOS-Goodson 1055 SimOS-Goodson, References: [1] Burger DC, Austin TM The simplescalar tool set, version 20 Technical Report, CS-TR-97-1342, Madison: University of Wisconsin,

116 [3] Gibson J, Kunz R, Ofelt D, Horowitz M, Hennessy J, Heinrich M FLASH vs (simulated) FLASH: Closing the simulation loop In: Proc of the 9th Int l Conf on Architectural Support

SimOS approach IEEE Parallel and Distributed Technology: Systems and Applications, 1995,3(4):34 43 [5] Hu WW, Zhang FX, Li ZS Microarchitecture of the Goodson-2 processor Journal of

multithreading or multiprocessing In: Proc of the 34th Int l Symp on Microarchitecture IEEE Computer Society, 2001 328 337 [7] Henning J SPEC CPU2000: Measuring CPU performance in

[9] Woo SC, Ohara M, Torrie E, Singh JP, Gupta A The SPLASH-2 programs: Characterization and methodological considerations In: Proc of the 22nd Annual Int l Symp on Computer

multiprocessor simulator (GEMS) toolset, submitted to computer architecture news (CAN) 2005 http://wwwcs wiscedu/gems [11] Magnusson PS, Cbristensson M, Eskilson J, Forsgren D,

9 : CPU SimOS-Goodson 1055 SimOS-Goodson, References: [1] Burger DC, Austin TM The simplescalar tool set, version 20 Technical Report, CS-TR , Madison: University of Wisconsin, 1997 [2] Mauer CJ, Hill MD, Wood DA Full system timing-first simulation In: Proc of the 2002 ACM Sigmetrics Conf on Measurement and Modeling of Computer Systems ACM Press, [3] Gibson J, Kunz R, Ofelt D, Horowitz M, Hennessy J, Heinrich M FLASH vs (simulated) FLASH: Closing the simulation loop In: Proc of the 9th Int l Conf on Architectural Support for Programming Languages and Operating Systems (ASPLOS) IEEE Computer Society, [4] Rosenblum M, Herrod SA, Witchel E, Gupta A Complete computer system simulation: The SimOS approach IEEE Parallel and Distributed Technology: Systems and Applications, 1995,3(4):34 43 [5] Hu WW, Zhang FX, Li ZS Microarchitecture of the Goodson-2 processor Journal of Computer Science and Technology, 2005, 20(2): [6] Martin MMK, Sorin DJ, Cain HW, Hill MD, Lipasti MH Correctly implementing value prediction in microprocessors that support multithreading or multiprocessing In: Proc of the 34th Int l Symp on Microarchitecture IEEE Computer Society, [7] Henning J SPEC CPU2000: Measuring CPU performance in the new millennium IEEE Computer, 2000,33(7):28 35 [8] McVoy L, Staelin C LMbench: Portable tools for performance analysis In: USENIX Annual Technical Conf San Diego, [9] Woo SC, Ohara M, Torrie E, Singh JP, Gupta A The SPLASH-2 programs: Characterization and methodological considerations In: Proc of the 22nd Annual Int l Symp on Computer Architecture IEEE Computer Society, [10] Martin MM, Sorin DJ, Beckmann BM, Marty MR, Xu M, Alameldeen AR, Moore KE, Hill MD, Wood DA Multifacet s general execution-driven multiprocessor simulator (GEMS) toolset, submitted to computer architecture news (CAN) wiscedu/gems [11] Magnusson PS, Cbristensson M, Eskilson J, Forsgren D, Hällberg G, Högberg J, Larsson F, Moestedt A, Werner B Simics: A full system simulation platform IEEE Computer, 2002,35(2):50 58 (1982 ),, (1974 ),, (1976 ),,,,CCF, (1968 ),,,VLSI (1981 ),,, (1966 ),,,CCF,

2. Futile Stall HTM HTM HTM. Transactional Memory: TM [1] TM. HTM exponential backoff. magic waiting HTM. futile stall. Hardware Transactional Memory:

2. Futile Stall HTM HTM HTM. Transactional Memory: TM [1] TM. HTM exponential backoff. magic waiting HTM. futile stall. Hardware Transactional Memory: 1 1 1 1 1,a) 1 HTM 2 2 LogTM 72.2% 28.4% 1. Transactional Memory: TM [1] TM Hardware Transactional Memory: 1 Nagoya Institute of Technology, Nagoya, Aichi, 466-8555, Japan a) tsumura@nitech.ac.jp HTM HTM