GPU. OpenMP. OMPCUDA OpenMP. forall. Omni CUDA 3) Global Memory OMPCUDA. GPU Thread. Block GPU Thread. Vol.2012-HPC-133 No.


GPU CUDA OpenMP OpenMP CUDA OMPCUDA OMPCUDA GPU CUDA CUDA

1. GPU GPGPU 1)2) GPGPU CUDA 3) CPU CUDA GPGPU CPU GPU OpenMP GPU CUDA OMPCUDA 4)5) OMPCUDA GPU OpenMP GPU CUDA OMPCUDA/MG 2 GPU OMPCUDA 3 OMPCUDA/MG

1 Graduate School of Information Systems, The University of Electro-Communications
2 Information Technology Center, The University of Tokyo
3 /JST The University of Electro-Communications/JST

GPU OMPCUDA OMPCUDA GPU OpenMP Omni OpenMP Compiler 6) ( Omni) OpenMP CUDA OMPCUDA OpenMP GPU CUDA OpenMP GPU CUDA OMPCUDA OMPCUDA OpenMP forall forall 1GPU Thread GPU GPU forall Omni OMPCUDA GPU Global Memory GPU GPU Omni forall CPU ID OMPCUDA GPU Thread ID GPU Block ID GPU forall OMPCUDA GPU Block GPU Thread GPU Block GPU Thread forall GPU Thread

© 2012 Information Processing Society of Japan

3. GPU CUDA

3.1 OpenMP CPU CPU CUDA GPU GPU Thread Global Memory GPU CUDA GPU Global Memory GPU CPU OpenMP GPU CUDA CPU GPU Global Memory CPU GPU CUDA GPU Global Memory GPU CUDA CPU GPU GPU CPU GPU CPU GPU GPU CPU Global Memory GPU

3.2 OpenMP GPU CUDA 1 OpenMP 2 GPU CUDA OpenMP CPU ( 1) CPU forall CPU CPU GPU CUDA GPU GPU GPU Block GPU Block GPU Thread ( 2) GPU

CPU-GPU GPU Block GPU Thread OMPCUDA GPU GPU Block GPU Thread GPU Block GPU Thread OpenMP CUDA forall schedule(static, 1) GPU GPU GPU Block, GPU Thread forall ( ) GPU Thread GPU GPU Block GPU Thread 1CPU GPU GPU GPU CPU CPU

3.3 OMPCUDA OpenMP GPU CUDA 1 2 Omni 3 OMPCUDA 3
( 1 ) OpenMP Frontend OpenMP (Xcode)
( 2 ) OpenMP OpenMP Omni OpenMP Omni
( 3 )
( 4 ) CPU-GPU Omni
( 5 ) ( GPU CPU GPU )
( 6 ) Omni GPU CPU-GPU GPU CPU nvcc OMPCUDA CPU
( 7 ) GPU CUDA ( __global__ ) CPU-GPU CUDA (nvcc)
CUDA OpenMP GPU CPU CUDA CPU GPU

3.4 GPU OMPCUDA GPU Owens 7) Shared Memory CUDA OMPCUDA 1GPU
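The schedule(static, 1) assignment of forall iterations to GPU, GPU Block and GPU Thread mentioned in Section 3.2 can be illustrated with a small host-side sketch in plain C. This is only a sketch under stated assumptions, not the code OMPCUDA generates: the names worker_t, worker_id, owner_of and iterations_for are hypothetical, and the GPU-major flattening of (GPU, GPU Block, GPU Thread) into a single worker ID is an assumption made for illustration. Only the cyclic rule itself (iteration i goes to worker i mod the number of workers) follows from the standard meaning of schedule(static, 1) in OpenMP.

```c
#include <assert.h>

/* Hypothetical type: identifies one GPU Thread in a multi-GPU run. */
typedef struct {
    int gpu;    /* index of the GPU */
    int block;  /* GPU Block ID within that GPU */
    int thread; /* GPU Thread ID within that GPU Block */
} worker_t;

/* Assumption: workers are numbered GPU-major, then by block, then by thread. */
static int worker_id(worker_t w, int blocks_per_gpu, int threads_per_block)
{
    return (w.gpu * blocks_per_gpu + w.block) * threads_per_block + w.thread;
}

/* schedule(static, 1): iteration i is executed by worker (i mod total workers). */
static worker_t owner_of(int i, int num_gpus, int blocks_per_gpu,
                         int threads_per_block)
{
    int total = num_gpus * blocks_per_gpu * threads_per_block;
    assert(i >= 0 && total > 0);

    int id = i % total;
    worker_t w;
    w.thread = id % threads_per_block;
    w.block = (id / threads_per_block) % blocks_per_gpu;
    w.gpu = id / (threads_per_block * blocks_per_gpu);
    return w;
}

/* Number of iterations of the loop [0, n) that a given worker executes. */
static int iterations_for(worker_t w, int n, int num_gpus, int blocks_per_gpu,
                          int threads_per_block)
{
    int total = num_gpus * blocks_per_gpu * threads_per_block;
    int id = worker_id(w, blocks_per_gpu, threads_per_block);
    if (id >= n)
        return 0; /* this worker has no iteration to run */
    /* Iterations id, id + total, id + 2 * total, ... below n. */
    return (n - id + total - 1) / total;
}
```

For example, with 2 GPUs, 2 GPU Blocks per GPU and 4 GPU Threads per GPU Block, owner_of(13, 2, 2, 4) selects thread 1 of block 1 on the second GPU (index 1). With a chunk size of 1, neighbouring iterations go to neighbouring GPU Threads, which is the access pattern that usually allows Global Memory accesses to be coalesced.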

4 5 3 GPU GPU GPU CPU GPU GPU CPU GPU GPU CPU

4. GPU OMPCUDA OMPCUDA ( OMPCUDA) Omni CPU ( Omni) CUDA 1

Table 1 Evaluation environment.
CPU: Intel(R) Xeon(R) CPU X5550 (4 ) 2.67GHz 2 4.0GB DDR GB ECC Reg
GPU: Tesla T10 Processor (240 ) GHz GDDR3 4GB 4 GPU PCI-Express Gen
nvcc 3.2 V

( SimpleCUDA) CPU GPU. 1 CPU Hyper-Threading GPU 1 PCI-Express Gen GPU GPU n CPU GPU

OMPCUDA Omni CPU OMPCUDA SimpleCUDA OMPCUDA Omni OMPCUDA CUDA OMPCUDA GPU 2GPU 4GPU 1GPU 4GPU OMPCUDA 4GPU CPU

10 11 Global Memory (GPU ) 1GPU 2GPU 2 4GPU 4 GPU

4.3 Global Memory 12 Global Memory GPU 1GPU 4GPU Omni

5. GPU CUDA OpenMP OMPCUDA/MG OpenMP GPU OpenMP CUDA OpenMPC 8) OpenMPC Cetus Compiler OpenMP CUDA OpenMPC OpenMP OpenMP GPU OpenACC 10) Nvidia OpenACC CUDA GPU 4GPU

12 PGI PGI Accelerator Model 9) OpenACC CUDA GPU-CPU CUDA GPU OpenMP OpenMP GPU OpenMP OpenMPD 11) OpenMPD Omni OpenMP Compiler MPI OpenMP OpenMPD OpenMP MPI GPU MPI GPU 12) GPU CUDA MPI GPU CPU OpenMP GPU GPU CUDA OpenMP OMPCUDA GPU OpenMP GPU CUDA OMPCUDA CUDA Global Memory

JST CREST ULP-HPC:

1) Takashi Shimokawabe, et al., An 80-Fold Speedup, 15.0 TFlops, Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code, Proceeding of the 2009 ACM/IEEE conference on SuperComputing 2010, PP.1-11 (2010).
2) ,, GPU Phase Field,, PP (2010).
3) NVIDIA, NVIDIA CUDA C Programming Guide 3.2, NVIDIA,
4) ,, OMPCUDA: GPU OpenMP, HPCS2009, PP (2009).
5) S.Ohshima, S.Shoichi H.Honda, OMPCUDA: OpenMP Execution Framework for CUDA Based on Omni OpenMP Compiler, In: EWOMP 10, PP (2010).
6) M.Sato, S.Satoh, K.Kusano and Y.Tanaka, Design of OpenMP Compiler for an SMP Cluster, EWOMP 99, PP.32-39 (1999).
7) John Owens and UC Davis, Data-parallel algorithms and data structures, In SUPERCOMPUTING 2007 Tutorial: High Performance Computing with CUDA,
8) Seyong Lee, Rudolf Eigenmann, OpenMPC: Extended OpenMP Programming and Tuning for GPUs, Proceeding of the 2010 ACM/IEEE conference on SuperComputing 2010, PP.1-11 (2010).
9) PGI, PGI Compiler and Tools,
10) NVIDIA, Cray Inc., Portland Group, CAPS enterprise, OpenACC DIRECTIVE FOR ACCELERATOR, November
11) ,,, OpenMPD, HOKKE2007, PP (2007).
12) MPI GPU, SACSIS2011, PP (2011).
