Large-Scale Spatial Query Processing on GPU-Accelerated Big Data Systems
1 Large-Scale Spatial Query Processing on GPU-Accelerated Big Data Systems Jianting Zhang 1,2 Simin You 2 1 Department of Computer Science, CUNY City College (CCNY) 2 Department of Computer Science, CUNY Graduate Center
2 Outline Introduction Spatial data, GIS, BigData and HPC Taxi trip data in NYC and Global Biodiversity Applications Spatial query processing on GPUs ISP-GPU Architecture and Implementations Experiment Results Alternative Techniques SpatialSpark Lightweight Distributed Execution (LDE) Engine Summary and Future Work
3 Geographical Information System Social Studies Computational Geometry Computer Graphics Spatial Databases: data modeling, indexing, query processing Scientific Data/Information Visualization Statistics/Machine learning Image Processing/Computer Vision GIS Remote Sensing Social- Economic Modeling Environmental Modeling Census/Taxation Urban planning Transportation Air quality Hydrology Ecology
4 Big Geospatial Data Challenges Event Locations, trajectories and O-D data E.g., Taxi trip records (GPS traces or O-D locations) 0.5 million in NYC (medallion taxi cab only) and 1.2 million in Beijing per day From O-D locations to trajectories to frequent patterns Satellite: e.g., from GOES to GOES-R (2015/2016) [$11B] Spectral (3X)*spatial (4X)* temporal (5X)=60X 2km*2km*5min*16bands (360*60)*(180*60)*(12*24)*16~ 1+ trillion pixels per day Derived thematic data products (vector) Species distributions E.g million occurrence records (GBIF) E.g. 717,057 polygons and 78,929,697 vertices for 4148 birds distribution data (NatureServe)
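The satellite estimate on this slide can be checked with a few lines of arithmetic. The sketch below assumes the slide's factors mean a full-globe grid of 1 arc-minute cells (taken here as a stand-in for roughly 2 km x 2 km), one scan every 5 minutes, and 16 spectral bands; the variable names are illustrative only.

```python
# Back-of-the-envelope pixel count for the GOES-R style product on the slide.
# Assumption: (360*60) x (180*60) is a global grid of 1 arc-minute cells.

cols = 360 * 60           # cells across 360 degrees of longitude
rows = 180 * 60           # cells across 180 degrees of latitude
scans_per_day = 12 * 24   # one scan every 5 minutes
bands = 16                # spectral bands

pixels_per_day = cols * rows * scans_per_day * bands
print(f"{pixels_per_day:,} pixels per day")
```

The product is just over one trillion, which matches the "1+ trillion pixels per day" figure quoted on the slide.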
5 Cloud computing + MapReduce + Hadoop [Figure: hardware architecture diagram with labels for a multi-core CPU host (CMP: cores, local caches, shared cache, DRAM, HDD, SSD), a GPU (SIMD, thread blocks, GDRAM, PCI-E) and a MIC (in-order cores with 4 threads each, local caches, ring bus, GDRAM, PCI-E)] 16 Intel Sandy Bridge CPU cores + 128GB RAM + 8TB disk + GTX TITAN + Xeon Phi 3120A ~ $9,994
6 ASCI Red: 1997 First 1 Teraflops (sustained) system with 9298 Intel Pentium II Xeon processors (in 72 Cabinets) Feb billion transistors (551mm²) 2,688 processors 4.5 TFLOPS SP and 1.3 TFLOPS DP Max bandwidth GB/s PCI-E peripheral device 250 W (17.98 GFLOPS/W -SP) Suggested retail price: $999 What can we do today using a device that is more powerful than ASCI Red 16 years ago?
7 Affiliated Institutions Students: Simin You (Ph.D. 2009-), Siyu Liao (Ph.D. 2014-), Costin Vicoveanu (Undergraduate, 2014-), Bharat Rosanlall (Undergraduate, 2014), Jay Yao (MS-thesis, 2011-2012), Chandrashekar Singh (MS 2013), Agniva Banerjee (MS, 2012), Roger King (MS, 2012), Wahyu Nugroho (MS, 2011), Xiao Quan Cen Feng (MS 2011), Chetram Dasrat (Undergraduate, 2008) Collaborating Institutions Geospatial Technologies and Environmental CyberInfrastructure (GeoTECI) Lab Dr. Jianting Zhang Department of Computer Science The City College of New York
8 $449,845/4yr (08/01/ /31/2017) HIGHEST-DB HIgh-performance GrapHics units based Engine for Spatial-Temporal data Spatial and Spatiotemporal indexing, query processing and optimization Trajectory data management on GPUs Segmentation/simplification/compression/Aggregation/Warehousing Map matching with road networks Data mining (moving cluster, convoy, swarm...) when yellow cabs, green cabs and MTA buses meet with multicore CPUs, GPUs and MICs in NYC
9 when GOES-R satellites, extratropical cyclones and hummingbirds meet with TITAN [Figure: workflow diagram with labels for high-resolution satellite imagery, in-situ observation sensor data, global and regional climate model outputs, temporal trends, data assimilation, zonal statistics (ecological, environmental and administrative zones), ROIs, thread blocks and a high-end computing facility]
10 ...building a highly-configurable experimental computing environment for innovative BigData technologies CCNY Computer Science LAN GeoTECI@CCNY CUNY HPCC KVM SGI Octane III Brawny GPU cluster Microway DIY Web Server/ Linux App Server Dell T5400 Windows App Server HP 8740w HP 8740w Lenovo T400s Dual Quadcore 48GB memory *2 Nvidia C2050*2 8 TB storage Dual 8-core 128GB memory Nvidia GTX Titan Intel Xeon Phi 3120A 8 TB storage Dual-core 8GB memory Nvidia GTX Titan 3 TB storage Dual Quadcore 16GB memory Nvidia Quadro TB storage Quadcore 8 GB memory Nvidia Quadro 5000m Wimpy GPU cluster Dell T7500 Dell T7500 Dell T5400 DIY Dual 6-core 24 GB memory Nvidia Quadro 6000 Dual 6-core 24 GB memory Nvidia GTX 480 Dual Quadcore 16GB memory Nvidia FX3700*2 Quadcore (Haswell) 16 GB memory AMD/ATI 7970
11 Taxi trip data in NYC Taxicabs 13,000 Medallion taxi cabs License priced at > $1M Car services and taxi services are separate Taxi trip records ~170 million trips (300 million passengers) in /5 of that of subway riders and 1/3 of that of bus riders in NYC
12 Taxi trip data in NYC Overall distributions of trip distance, time, speed and fare (2009) [Figure: four histograms of trip counts: Count-Distance Distribution (Trip Distance, mile), Count-Time Distribution (Trip Time, minute), Count-Speed Distribution (Speed, MPH), Count-Fare Distribution (Fare, $)]
13 Taxi trip data in NYC How to manage taxi trip data? Geographical Information System (GIS) Spatial Databases (SDB) Moving Object Databases (MOD) How good are they? Pretty good for small amounts of data But rather poor for large-scale data
14 Example 1: Taxi trip data in NYC Loading 170 million taxi pickup locations into PostgreSQL UPDATE t SET PUGeo = ST_SetSRID(ST_Point("PULong","PuLat"),4326); hours! Example 2: Finding the nearest tax blocks for 170 million taxi pickup locations using open source libspatialindex + GDAL 30.5 hours! Intel Xeon 2.26 GHz processors with 48G memory I do not have time to wait... Can we do better?
15 Global Biodiversity Data at GBIF
SELECT aoi_id, sp_id, SUM(ST_Area(inter_geom))
FROM (
  SELECT aoi_id, sp_id, ST_Intersection(sp_geom, qw_geom) AS inter_geom
  FROM SP_TB, QW_TB
  WHERE ST_Intersects(sp_geom, qw_geom)
) AS inter
GROUP BY aoi_id, sp_id
HAVING SUM(ST_Area(inter_geom)) > T;
16 Spatial Data Processing on GPUs
17 Spatial query processing on GPUs Single-Level Grid-File based Spatial Filtering Nested-Loop based Refinement Points Vertices (polygon/ polyline) Perfect coalesced memory accesses Utilizing GPU floating point computing power J. Zhang, S. You and L. Gruenwald, "Parallel Online Spatial and Temporal Aggregations on Multi-core CPUs and Many-Core GPUs," Information Systems, vol. 44, p , 2014.
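The filter-and-refine pattern named on this slide can be sketched sequentially. The code below is a minimal CPU-only illustration, not the authors' GPU implementation: it bins points into a single-level uniform grid (filtering), pairs each polygon with the points in the cells its bounding box touches, and then runs a nested-loop point-in-polygon test (refinement). The function names, the ray-casting test and the sample data are assumptions made for this example.

```python
from collections import defaultdict


def cell_of(x, y, origin, cell_size):
    """Map a coordinate to its (column, row) cell in a uniform grid."""
    return (int((x - origin[0]) // cell_size), int((y - origin[1]) // cell_size))


def point_in_polygon(px, py, ring):
    """Ray-casting (even-odd) test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > py) != (y2 > py):  # edge straddles the horizontal ray
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def grid_pip_join(points, polygons, origin, cell_size):
    """Return sorted (point_id, polygon_id) pairs using grid filtering + refinement."""
    # Filtering, step 1: bin every point into exactly one grid cell.
    cells = defaultdict(list)
    for pid, (x, y) in enumerate(points):
        cells[cell_of(x, y, origin, cell_size)].append(pid)

    matches = []
    for gid, ring in enumerate(polygons):
        xs = [v[0] for v in ring]
        ys = [v[1] for v in ring]
        cx0, cy0 = cell_of(min(xs), min(ys), origin, cell_size)
        cx1, cy1 = cell_of(max(xs), max(ys), origin, cell_size)
        # Filtering, step 2: only cells touched by the polygon's bounding box.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                # Refinement: nested loop over candidate points and polygon vertices.
                for pid in cells.get((cx, cy), ()):
                    px, py = points[pid]
                    if point_in_polygon(px, py, ring):
                        matches.append((pid, gid))
    return sorted(matches)


points = [(1.5, 1.5), (5.0, 5.0), (8.2, 1.0)]
polygons = [
    [(1, 1), (3, 1), (3, 3), (1, 3)],   # square
    [(4, 4), (9, 4), (4, 9)],           # triangle
]
print(grid_pip_join(points, polygons, origin=(0.0, 0.0), cell_size=2.0))
# prints [(0, 0), (1, 1)]
```

On a GPU the two inner loops are the part that maps naturally onto thread blocks, since candidate points and polygon vertices for a cell pair can be laid out contiguously; the sketch keeps them as plain loops for readability.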
18 Spatial query processing on GPUs Top: grid size =256*256 resolution=128 feet Right: grid size =8192*8192 resolution=4 feet Spatial Aggregation 9,424 /326=30X (8192*8192) Temporal Aggregation 1709/198=8.6X (minute) 1598 /165 = 9.7X (hour)
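As a rough illustration of what the two aggregations compute, the sketch below counts points per grid cell at a chosen resolution and counts events per time bin. It is a sequential stand-in, not the parallel implementation benchmarked on the slide; the function names and sample values are hypothetical. Note that both grid settings on the slide cover the same extent, since 256 x 128 feet equals 8192 x 4 feet.

```python
from collections import Counter


def spatial_aggregate(points, origin, resolution):
    """Count points per grid cell; resolution is the cell edge length (e.g. feet)."""
    counts = Counter()
    for x, y in points:
        cell = (int((x - origin[0]) // resolution), int((y - origin[1]) // resolution))
        counts[cell] += 1
    return counts


def temporal_aggregate(timestamps, bin_seconds):
    """Count events per time bin; 60 gives minute bins, 3600 gives hour bins."""
    return Counter(int(t // bin_seconds) for t in timestamps)


# Hypothetical pickup locations (feet from the grid origin) and times (seconds).
pickups = [(10, 10), (100, 20), (130, 5), (500, 500)]
times = [0, 59, 60, 3599, 3600]

print(spatial_aggregate(pickups, origin=(0, 0), resolution=128))  # coarse grid
print(spatial_aggregate(pickups, origin=(0, 0), resolution=4))    # fine grid
print(temporal_aggregate(times, 60))
print(temporal_aggregate(times, 3600))
```

Finer resolutions spread the same points over many more cells, which is why the 8192 x 8192 setting is the more demanding case.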
19 Spatial query processing on GPUs
P2N-D: 147,011 street segments
P2P-T: 38,794 census blocks (470,941 points)
P2P-D: 735,488 tax blocks (4,698,986 points)
Experiment   CPU time   GPU time   Speedup
P2N-D                   10.9 s     -
P2P-T        h          11.2 s     4,900X
P2P-D        30.5 h     33.1 s     3,200X
Algorithmic improvement: 3.7X Using main-memory data structures: 37.4X GPU Acceleration: 24.3X
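The three improvement factors can be checked against the end-to-end speedup. The sketch below assumes the 30.5 h CPU time and the 33.1 s GPU time refer to the same experiment (P2P-D) and that the three factors compose multiplicatively; neither assumption is stated explicitly on the slide.

```python
# End-to-end speedup from the reported times (assumed to be the P2P-D experiment).
cpu_seconds = 30.5 * 3600          # 30.5 hours on the CPU baseline
gpu_seconds = 33.1                 # 33.1 seconds on the GPU
end_to_end = cpu_seconds / gpu_seconds

# Product of the three reported factors, assuming they multiply.
algorithmic = 3.7
main_memory = 37.4
gpu_acceleration = 24.3
combined = algorithmic * main_memory * gpu_acceleration

print(f"end-to-end: {end_to_end:,.0f}X, product of factors: {combined:,.0f}X")
```

Both numbers land a little above 3,300X, close to the roughly 3,200X reported. They also show that most of the gain comes from the algorithm and the in-memory data structures (about 138X together), with the GPU contributing a further factor of about 24X.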
20 Outline Introduction Spatial data, GIS, BigData and HPC Taxi trip data in NYC and Global Biodiversity Applications Spatial query processing on GPUs ISP-GPU Architecture and Implementations Experiment Results Alternative Techniques SpatialSpark Lightweight Distributed Execution (LDE) Engine Summary and Future Work
21 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters
22 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters Attractive Features SQL Frontend: translate SQL queries into execution plans C/C++ backend with SSE4 support (for strings operations) Efficient implementations of hash-joins (partitioned and nonpartitioned) LLVM-based JIT. Extension is challenging!
23 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters
class SpatialJoinNode : public BlockingJoinNode {
 public:
  SpatialJoinNode(ObjectPool* pool, const TPlanNode& tnode, const DescriptorTbl& descs);
  virtual Status Prepare(RuntimeState* state);
  virtual Status GetNext(RuntimeState* state, RowBatch* row_batch, bool* eos);
  virtual void Close(RuntimeState* state);
 protected:
  virtual Status InitGetNext(TupleRow* first_left_row);
  virtual Status ConstructBuildSide(RuntimeState* state);
 private:
  boost::scoped_ptr<TPlanNode> thrift_plan_node_;
  RuntimeState* runtime_state_;
};
create_rtree( ) pip_join( ) nearest_join( )
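The control flow implied by a blocking join node can be sketched in a few lines. The Python class below is a hypothetical analogue, not Impala or ISP-GPU code: the build side is consumed in full before any output is produced, and left-side batches are then probed against it one batch at a time. A plain list stands in for the R-tree that create_rtree( ) would build, and the `in_box` predicate is a simplified stand-in for pip_join( ).

```python
class SpatialJoinSketch:
    """Hypothetical analogue of a blocking spatial join node (illustration only)."""

    def __init__(self, predicate):
        self.predicate = predicate   # function (left_row, right_row) -> bool
        self.build_rows = None       # filled by construct_build_side

    def construct_build_side(self, right_rows):
        # Blocking step: read the whole right side before producing output.
        # A real node would build a spatial index here; a list stands in for it.
        self.build_rows = list(right_rows)

    def get_next(self, left_batch):
        # Probe step: join one batch of left rows against the build side.
        if self.build_rows is None:
            raise RuntimeError("construct_build_side must run before get_next")
        return [(left, right)
                for left in left_batch
                for right in self.build_rows
                if self.predicate(left, right)]


def in_box(point, zone):
    """Point-in-rectangle test; zone is (name, (xmin, ymin, xmax, ymax))."""
    x, y = point
    _, (xmin, ymin, xmax, ymax) = zone
    return xmin <= x <= xmax and ymin <= y <= ymax


node = SpatialJoinSketch(in_box)
node.construct_build_side([("a", (0, 0, 2, 2)), ("b", (5, 5, 6, 6))])
print(node.get_next([(1, 1), (5.5, 5.5), (9, 9)]))
```

Separating the build step from the per-batch probe is what lets the engine keep streaming left-side row batches while the right-side structure stays resident in memory.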
24 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters Scalable and Efficient Spatial Data Management on Multi-Core CPU and GPU Clusters. IEEE HardBD'15 Workshop
25 ISP-GPU: Scaling out Geospatial Data Processing to GPU Clusters Single-node results: 16-core CPU/128 GB, GTX Titan ISP-GPU ISP-MC+ GPU-Standalone MC-Standalone taxi-nycb (s) GBIF-WWF (s) taxi-nycb: ~170 million points, ~40 thousand polygons (9 vertices/polygon) GBIF-WWF: ~375 million points, ~15 thousand polygons (279 vertices/polygon) Cluster results: 2-10 nodes, each with 8 vCPU cores/15 GB, 1536 CUDA cores/4 GB (50 million species locations used due to memory constraint)
26 Outline Introduction Spatial data, GIS, BigData and HPC Taxi trip data in NYC and Global Biodiversity Applications Spatial query processing on GPUs ISP-GPU Architecture and Implementations Experiment Results Alternative Techniques SpatialSpark Lightweight Distributed Execution (LDE) Engine Summary and Future Work
27 Alternative Techniques SpatialSpark: Just Open-Sourced
val sc = new SparkContext(conf)
// read left-side data from HDFS and pre-process it
val leftData = sc.textFile(leftFile, numPartitions).map(x => x.split(separator)).zipWithIndex()
val leftGeometryById = leftData.map(x => (x._2, Try(new WKTReader().read(x._1.apply(leftGeometryIndex))))).filter(_._2.isSuccess).map(x => (x._1, x._2.get))
// similarly for right-side data
// ready for spatial query (broadcast-based)
val joinPredicate = SpatialOperator.Within // NearestD can be applied similarly
var matchedPairs: RDD[(Long, Long)] = BroadcastSpatialJoin(sc, leftGeometryById, rightGeometryById, joinPredicate)
Large-Scale Spatial Join Query Processing in Cloud (Comparison with ISP-MC) IEEE CloudDM'15 Workshop
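The idea behind a broadcast-based spatial join can be shown without Spark. In the sketch below, plain Python lists stand in for RDD partitions and for the broadcast variable, and axis-aligned rectangles stand in for the WKT geometries; none of this is SpatialSpark's actual code, and the function names are assumptions. The small right side is copied to every left partition, and each partition is joined locally with no shuffle of the left data.

```python
def within(point, rect):
    """Simplified 'Within' predicate: point inside an axis-aligned rectangle."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x <= xmax and ymin <= y <= ymax


def broadcast_spatial_join(left_partitions, right_by_id, predicate):
    """Join each left partition against a full copy of the right side."""
    broadcast = list(right_by_id)  # stands in for a Spark broadcast variable

    def join_partition(partition):
        # Runs independently per partition; no left-side data moves between them.
        return [(left_id, right_id)
                for left_id, left_geom in partition
                for right_id, right_geom in broadcast
                if predicate(left_geom, right_geom)]

    matched_pairs = []
    for partition in left_partitions:
        matched_pairs.extend(join_partition(partition))
    return matched_pairs


# Two hypothetical partitions of (id, point) and a small right side of (id, rectangle).
left_partitions = [[(0, (1, 1)), (1, (7, 7))], [(2, (5.5, 5.5))]]
right_by_id = [(10, (0, 0, 2, 2)), (11, (5, 5, 6, 6))]

print(broadcast_spatial_join(left_partitions, right_by_id, within))
# prints [(0, 10), (2, 11)]
```

This strategy fits the case where one side is small enough to hold in memory on every worker, such as tens of thousands of polygons joined against hundreds of millions of points.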
28 Alternative Techniques Lightweight Distributed Execution Engine for Large-Scale Spatial Join Query Processing
29 Spatial Data Processing and IoT Cell-phone based sensing and querying 3D world (personal navigation) Crowd-sourcing 3D urban infrastructure/traffic monitoring using RGB-D videos Building Information System and energy control Emergency response and disaster relief
30 Summary and Future Work Designs and implementations of an in-memory spatial data management system on multi-core CPU and many-core GPU clusters by extending Cloudera Impala for distributed spatial join query processing Experiments on the initial implementations have revealed both advantages and disadvantages of extending a tightly-coupled big data system to support spatial data types and their operations. Alternative techniques are being developed to further improve efficiency, scalability, extensibility and portability.
31 Q&A
More informationX-ray imaging software tools for HPC clusters and the Cloud
X-ray imaging software tools for HPC clusters and the Cloud Darren Thompson Application Support Specialist 9 October 2012 IM&T ADVANCED SCIENTIFIC COMPUTING NeAT Remote CT & visualisation project Aim:
More informationTesla GPU Computing A Revolution in High Performance Computing
Tesla GPU Computing A Revolution in High Performance Computing Mark Harris, NVIDIA Agenda Tesla GPU Computing CUDA Fermi What is GPU Computing? Introduction to Tesla CUDA Architecture Programming & Memory
More informationGPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS
GPGPU, 1st Meeting Mordechai Butrashvily, CEO GASS Agenda Forming a GPGPU WG 1 st meeting Future meetings Activities Forming a GPGPU WG To raise needs and enhance information sharing A platform for knowledge
More informationResources Current and Future Systems. Timothy H. Kaiser, Ph.D.
Resources Current and Future Systems Timothy H. Kaiser, Ph.D. tkaiser@mines.edu 1 Most likely talk to be out of date History of Top 500 Issues with building bigger machines Current and near future academic
More informationCS GPU and GPGPU Programming Lecture 8+9: GPU Architecture 7+8. Markus Hadwiger, KAUST
CS 380 - GPU and GPGPU Programming Lecture 8+9: GPU Architecture 7+8 Markus Hadwiger, KAUST Reading Assignment #5 (until March 12) Read (required): Programming Massively Parallel Processors book, Chapter
More informationAutomatic Scaling Iterative Computations. Aug. 7 th, 2012
Automatic Scaling Iterative Computations Guozhang Wang Cornell University Aug. 7 th, 2012 1 What are Non-Iterative Computations? Non-iterative computation flow Directed Acyclic Examples Batch style analytics
More informationRed Fox: An Execution Environment for Relational Query Processing on GPUs
Red Fox: An Execution Environment for Relational Query Processing on GPUs Haicheng Wu 1, Gregory Diamos 2, Tim Sheard 3, Molham Aref 4, Sean Baxter 2, Michael Garland 2, Sudhakar Yalamanchili 1 1. Georgia
More informationImproving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine
Improving performances of an embedded RDBMS with a hybrid CPU/GPU processing engine Samuel Cremer 1,2, Michel Bagein 1, Saïd Mahmoudi 1, Pierre Manneback 1 1 UMONS, University of Mons Computer Science
More informationSEASHORE / SARUMAN. Short Read Matching using GPU Programming. Tobias Jakobi
SEASHORE SARUMAN Summary 1 / 24 SEASHORE / SARUMAN Short Read Matching using GPU Programming Tobias Jakobi Center for Biotechnology (CeBiTec) Bioinformatics Resource Facility (BRF) Bielefeld University
More informationContact: Ye Zhao, Professor Phone: Dept. of Computer Science, Kent State University, Ohio 44242
Table of Contents I. Overview... 2 II. Trajectory Datasets and Data Types... 3 III. Data Loading and Processing Guide... 5 IV. Account and Web-based Data Access... 14 V. Visual Analytics Interface... 15
More informationIntel Many Integrated Core (MIC) Architecture
Intel Many Integrated Core (MIC) Architecture Karl Solchenbach Director European Exascale Labs BMW2011, November 3, 2011 1 Notice and Disclaimers Notice: This document contains information on products