
IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. 6, NO. 1, JANUARY-MARCH 2018

Scalable pct Image Reconstruction Delivered as a Cloud Service

Ryan Chard, Student Member, IEEE, Ravi Madduri, Nicholas T. Karonis, Kyle Chard, Member, IEEE, Kirk L. Duffin, Caesar E. Ordoñez, Thomas D. Uram, Justin Fleischauer, Ian T. Foster, Senior Member, IEEE, Michael E. Papka, Senior Member, IEEE, and John Winans

Abstract: We describe a cloud-based medical image reconstruction service designed to meet a real-time and daily demand to reconstruct thousands of images from proton cancer treatment facilities worldwide. Rapid reconstruction of a three-dimensional Proton Computed Tomography (pct) image can require the transfer of 100 GB of data and the use of approximately 120 GPU-enabled compute nodes. The nature of proton therapy means that demand for such a service is sporadic and comes from potentially hundreds of clients worldwide. We thus explore the use of a commercial cloud as a scalable and cost-efficient platform for pct reconstruction. To address the high performance requirements of this application we leverage Amazon Web Services GPU-enabled cluster resources that are provisioned with high performance networks between nodes. To support episodic demand, we develop an on-demand multi-user provisioning service that can dynamically provision and resize clusters based on image reconstruction requirements, priorities, and wait times. We compare the performance of our pct reconstruction service running on commercial cloud resources with that of the same application on dedicated local high performance computing resources. We show that we can achieve scalable and on-demand reconstruction of large scale pct images for simultaneous multi-client requests, processing images in less than 10 minutes for less than $10 per image.

Index Terms: Cloud computing, proton computed tomography, medical imaging

R. Chard is with the School of Engineering and Computer Science, Victoria University of Wellington, New Zealand. ryan@ecs.vuw.ac.nz.
R. Madduri is with the Computation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL, and the Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL. madduri@mcs.anl.gov.
K. Chard is with the Computation Institute, University of Chicago and Argonne National Laboratory, Chicago, IL, USA. chard@uchicago.edu.
I.T. Foster is with the Computation Institute, University of Chicago, Chicago, IL, and Argonne National Laboratory, Argonne, IL, the Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, and the Department of Computer Science, University of Chicago, Chicago, IL. foster@anl.gov.
M.E. Papka is with the Department of Computer Science, Northern Illinois University, DeKalb, IL, and Argonne National Laboratory, Argonne, IL. papka@niu.edu.
N.T. Karonis is with the Department of Computer Science, Northern Illinois University, DeKalb, IL, and the Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL. karonis@niu.edu.
K.L. Duffin, C.E. Ordoñez, J. Fleischauer, and J. Winans are with the Department of Computer Science, Northern Illinois University, DeKalb, IL. {duffin, cordonez}@cs.niu.edu, justin_fleischauer@hotmail.com, jwinans@niu.edu.
T.D. Uram is with the Argonne Leadership Computing Facility, Argonne National Laboratory, Argonne, IL. turam@anl.gov.

Manuscript received 24 June 2014; revised 30 June 2015; accepted 2 July 2015. Date of publication 16 July 2015; date of current version 7 March 2018. Recommended for acceptance by P. Corcoran.

1 INTRODUCTION

PROTON Computed Tomography (pct) [1] is a medical imaging modality based on tracking the change in trajectory and energy loss of protons as they pass through an object.
pct imaging was initially developed as a method for acquiring high accuracy images for proton cancer therapy applications. pct systems provide a number of advantages over traditional X-ray Computed Tomography (xct) scanners, such as higher accuracy of electron density reconstruction and lower dose for the same density resolution [2]. The enhanced accuracy in reconstructed electron density serves to improve the quality of care delivered to patients. It first allows physicians to develop more accurate treatment plans, thus sparing healthy tissue during treatment. It also allows health care providers to more accurately position patients during treatment sessions. Presently, a patient's position is verified just prior to receiving each treatment through the use of two-dimensional orthogonal projections. Position verification can be significantly improved by instead using the three-dimensional image produced by pct. In order to use pct imaging for position verification, images must be reconstructed in near real-time; initial studies suggest within 10 to 15 minutes [1].

Due to the nonlinear path of protons through an imaged material, data reduction techniques cannot be applied to pct datasets in the same way that they can to other modalities such as positron emission tomography (PET) and xct. Thus, extremely large datasets must be processed in order to reconstruct an image. It is estimated that the ratio of total protons to total number of 1 mm³ voxels in a target should be greater than 100 to 1 in order to image the target [3]. This gives a conservative upper limit of two billion proton histories for a reconstruction. Each history can be represented in 50 bytes, producing a dataset of 100 GB. Considerable compute resources are needed to reconstruct a dataset of this size in a timely manner.
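These dataset sizes follow from simple arithmetic; the short sketch below reproduces them. It is only an illustration: the 100-to-1 proton-to-voxel ratio and the 50-byte history record come from the text above, while the head-sized reconstruction volume used for the sanity check is our own assumption.

    # Back-of-the-envelope check of the dataset sizes quoted above.
    # The 100:1 proton-to-voxel ratio and the 50-byte history record are taken
    # from the text; the head-sized reconstruction volume is our own assumption.
    BYTES_PER_HISTORY = 50          # compact encoding of one proton history
    HISTORIES = 2_000_000_000       # conservative upper limit used in the paper

    dataset_bytes = HISTORIES * BYTES_PER_HISTORY
    print(f"dataset size: {dataset_bytes / 1e9:.0f} GB")   # -> 100 GB

    # Sanity check against the 100-protons-per-voxel rule of thumb for a
    # (hypothetical) 25 x 25 x 30 cm reconstruction volume at 1 mm^3 voxels.
    voxels = 250 * 250 * 300
    print(f"voxels: {voxels:.2e}, implied histories: {voxels * 100:.2e}")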

For example, Penfold [4] reports that a reconstruction of a small 6 GB dataset of 131 million proton histories, using a single CPU and GPU, took almost seventy minutes. At this rate, two billion histories would require almost nine hours to process.

In previous work we developed a high performance, parallel, Message Passing Interface (MPI)-based pct image reconstruction code [5]. Using Gaea, a 60-node, GPU-enhanced, high performance computing (HPC) cluster at Northern Illinois University (NIU), we reduced the time required to reconstruct an image considerably. Running on 60 nodes, our parallelized, hardware-accelerated software can reconstruct a two billion history image within seven minutes, and a small 131 million history image in less than 30 seconds. This work demonstrated for the first time the feasibility of using a pct scanner to provide near real-time images for clinical treatment.

However, the use of an HPC cluster limits the applicability of our results. Clusters are expensive both to acquire and to maintain, making it impractical for many medical centers to perform real-time reconstructions of pct data. This limitation is unfortunate, as both the number of patients treated by proton therapy and the number of proton cancer treatment facilities have been on the rise [6]. As of December 2012, over 90,000 patients had received proton therapy in almost 50 different medical centers, with nearly 40,000 of them treated in recent years. Over 10,000 patients were treated each year in 2011 and 2012. Approximately two thirds of those patients received treatment for prostate cancer, which requires 45 treatment sessions, and one third received treatment for some other form of cancer, each requiring three to eight treatment sessions. These figures conservatively place the current global demand for real-time pct imaging at over 1,200 images per day, and that demand is expected to rise.

Commercial cloud resources represent a promising alternative platform for pct reconstruction. They have the advantage that computing resources can be obtained quasi-instantaneously, when required, and paid for only when in use. Furthermore, cloud providers are increasingly offering high performance and GPU-enabled nodes, capable of running high performance applications such as pct image reconstruction. However, no one has previously explored the feasibility of using cloud resources for pct reconstruction.

We describe in this paper an on-demand and scalable pct reconstruction service that combines our parallel pct reconstruction software with on-demand computing resources provided by the Amazon Web Services (AWS) Elastic Compute Cloud (EC2). Our contributions include an investigation of the challenges related to deploying a data- and compute-intensive service in the cloud; a novel architecture that includes scalable data transfer and elastic, on-demand resource provisioning; and a detailed evaluation of the application of this architecture under a range of real-world scenarios. Our results show that our implementation dynamically provisions, resizes, and removes GPU-enhanced clusters efficiently to fulfill workloads. We demonstrate that our service can compute billion-history reconstructions in under 10 minutes, for as little as $7 per image, thus meeting the goals specified.
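As an aside, the global demand figure cited above (over 1,200 images per day) can be reproduced with rough arithmetic. The sketch below is illustrative only: the patient counts and sessions per course are those quoted above, while the assumption of roughly 250 treatment days per year is ours, not the paper's.

    # Rough reproduction of the ">1,200 images per day" demand estimate.
    # Patient counts and sessions per course are from the text; the ~250
    # treatment days per year (weekdays only) is our own assumption.
    patients_per_year = 10_000
    prostate_fraction = 2 / 3
    prostate_sessions = 45
    other_sessions = 8             # upper end of the 3-8 range quoted above
    treatment_days_per_year = 250  # assumed clinic operating days

    sessions = (patients_per_year * prostate_fraction * prostate_sessions
                + patients_per_year * (1 - prostate_fraction) * other_sessions)
    images_per_day = sessions / treatment_days_per_year
    print(f"~{images_per_day:.0f} position-verification images per day")  # ~1300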
pct is not yet common in clinical practice and is currently the subject of a multidisciplinary research effort to develop a solution for widespread clinical adoption. Our work addresses one significant aspect of this investigation: the computational viability and cost efficiency of real-time pct reconstruction.

2 RELATED WORK

Both the HPC and medical imaging communities have explored the use of public cloud resources [7]. However, uptake has been limited due to issues related to inter-node latencies, data privacy, cost, and other constraints. HPC applications often have vastly different quality of service requirements than the e-commerce applications for which clouds were originally designed [8]. HPC applications can be extremely sensitive to bandwidth and latency variations, where small overheads can significantly affect performance. Comparisons between HPC applications on clouds and on HPC resources using standard benchmarking suites, such as the NAS parallel benchmarks, have shown that network performance is a key limitation of HPC execution on clouds [9]. Concerns have also been raised with respect to the economic models employed by cloud providers, especially when moving and analyzing large scientific data [10].

Despite these limitations, the use of clouds for scientific applications is growing rapidly. Lifka et al. [7] survey the use of clouds for research and education and report that cloud resources have been successfully adopted in many projects spanning over 25 scientific domains from science and engineering as well as the humanities, arts, and social sciences. On-demand access to burst resources and support for high throughput scientific workflows were found to be two of the main reasons for adoption of the cloud. The growth of cloud computing as a viable platform for science has also been demonstrated via scientometric analysis [11], with an emphasis on current trends toward Big Data and data analytics. There is also significant literature related to efficient execution of scientific workflows on clouds. For example, Sossa and Buyya [12] propose a resource provisioning and scheduling algorithm for minimizing execution cost while meeting deadline constraints.

While much medical imaging cloud research focuses on the exchange and storage of images [13], [14], there is widespread belief that the use of cloud resources will become commonplace for medical image processing [15]. Kim et al. [16] use the CometCloud engine to integrate local and public cloud resources dynamically to facilitate image registration requests from various research groups on small EC2 instances. Parsonson et al. [17] create an image processing framework that exploits cloud resources for tasks such as volume rendering; however, unlike our work, they focus on creating single-instance environments for multiple researchers and clinicians to access collaboratively. Bednarz et al. [18] present an image analysis toolkit that enables access to cutting-edge analysis tools on cloud resources. Rather than build services or leverage scalable infrastructure, the authors instead focus on providing accessible interfaces to pre-deployed software packages on a cloud VM.

GPU and parallel programming techniques have been explored for medical image analyses in positron emission tomography [19], magnetic induction tomography [20], and transmission tomography [21]. Our previous work is the first such approach for pct [5].

3 PROTON COMPUTED TOMOGRAPHY

Proton therapy is a form of radiation treatment that delivers highly directed and localized doses of radiation to areas of interest. Proton therapy uses a beam of protons as agents, allowing for higher degrees of conformality than conventional external beam X-ray therapy. Protons also have little lateral scatter due to their mass, and therefore proton beams can focus more precisely on a tumor, with reduced side effects to surrounding tissues. Protons lose energy as they travel through a medium; protons with a given energy therefore have a particular range, and few protons pass beyond this range. The rate at which energy is lost is related to the electron density of the medium being targeted. The energy lost through a medium can be quantified via the Relative Stopping Power (RSP), the ratio of the stopping power of the target material to that of water. When treating tumors at different depths, the proton accelerator must generate beams with different energies: tumors closer to the surface of the body require less beam energy than those deeper in the body. Efficient and accurate proton therapy is therefore dependent on the accuracy of the scan reconstructions used in treatment planning.

As tumors and tissue change over time, patients require a number of scans to direct and fine tune the therapy. An initial scan is used to establish a treatment plan and subsequent scans are used for online position verification. Reconstructions of these scans have different priorities with varying deadlines. For example, reconstructions for a treatment plan are often not required for several days, while reconstructions for online verification scans are needed immediately.

Historically, proton therapy treatment plans have used pre-treatment X-ray CT scans to determine the RSP of a medium, which in turn is used to estimate the required beam energy. This process requires mapping the Hounsfield units of X-ray CT scans to proton RSPs. However, this conversion is unique to each X-ray CT machine, requiring calibration and leading to uncertainties during treatment. The goal of pct is to establish the RSP of each target directly, using the same particles for imaging and treatment, and thereby reducing uncertainty and increasing treatment efficiency. The concept of pct was first proposed by Cormack in 1976 [22] and has recently seen increased interest in the development of clinical pct devices [1], [23], [24].

In order to measure the RSP of a medium, pct typically employs a detection configuration such as that shown in Fig. 1 [1]. This configuration enables the path and energy of individual protons, known as a proton's history, to be recorded as a broad beam is directed at the target. Each proton has a known input energy, and passes through two sensor planes before and after the target. The sensor planes collect the entry and exit positions, as well as the angle of a proton, allowing the trajectory to be estimated. The exit energy of each proton is then captured by the energy detector, enabling the calculation of total energy lost, and therefore of electron density and RSP. The goal of pct is to reconstruct a map of these densities given a set of proton measurements.

Fig. 1. The configuration of the NICADD/NIU pct detector. Protons pass left-to-right through sensor planes and traverse the target before stopping in the detector at the far right.
This approach requires significant computation, and many protons must be collected in order to generate statistically reliable measurements and therefore images. Furthermore, as protons passing through different media travel in non-straight paths, the optimization techniques applied in other imaging modalities cannot be applied to reduce the data.

3.1 pct Reconstruction

Our pct reconstruction code is a multi-stage, multi-process application that begins with each participating process reading a subset of proton histories into memory before performing a series of preliminary calculations. The preliminary calculations evaluate entry and exit data to filter out statistically abnormal histories. Typically 30-40 percent of histories have either not successfully passed through the target or do not meet the statistical requirements for reconstruction, and are removed from the dataset. The remaining histories are then rebinned according to their direction through the target, and filtered back projection (FBP) is used to estimate an initial reconstruction solution. Solution bounds produced by FBP are used to create refined proton trajectories through the target, known as most-likely-paths (MLPs). The voxels of the MLP for each proton identify the non-zero coefficients in a row of the matrix representing the system of linear equations.

The majority of the reconstruction execution time is spent computing the MLPs and iteratively solving the system of linear equations. Thus, it is advantageous to compute the MLP paths once and store the result in memory. In our pct code the MLP phase utilizes the GPUs of a compute node in order to improve parallelism and increase the performance of the reconstruction. However, the MLP phase can generate up to two terabytes of data for a two billion history image, requiring many nodes to store the MLP paths.

The final stage of reconstruction is the iterative linear solver, which uses a version of the string averaging algorithm, Component-Averaged Row-Action Projections (CARP) [25]. The initial distribution of data divides the protons (and the corresponding rows of the linear system matrix) into blocks, one block per executing process. The result of the FBP is broadcast to each process and used as an initial solution. For each iteration of the linear solver, every process computes the rows of its block and updates the solution vector. A reduction across all processes computes an average solution for the iteration. The average solution is broadcast to each process and is used to start the subsequent iteration. A detailed description of the pct reconstruction workflow can be found in our previous work [5].
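To make the structure of this final stage concrete, the sketch below implements a much-simplified, single-process version of the block-parallel row-action scheme described above: each block performs a Kaczmarz sweep over its rows starting from the shared iterate, and the per-block results are then averaged to form the next iterate. It is only an illustration (dense matrix, plain averaging, no MPI, no GPU); the production CARP code distributes blocks across MPI ranks and averages components only over the blocks that touch them.

    import numpy as np

    def kaczmarz_sweep(A_block, b_block, x, relax=1.0):
        """One row-action (Kaczmarz) sweep over a block of rows, starting from x."""
        x = x.copy()
        for a_i, b_i in zip(A_block, b_block):
            denom = a_i @ a_i
            if denom > 0.0:
                x += relax * (b_i - a_i @ x) / denom * a_i
        return x

    def carp_like_solve(A, b, n_blocks=4, iterations=50):
        """Simplified CARP-style iteration: sweep each block independently from the
        shared iterate, then average the block results to form the next iterate."""
        x = np.zeros(A.shape[1])                  # in pct, the FBP image seeds x
        row_blocks = np.array_split(np.arange(A.shape[0]), n_blocks)
        for _ in range(iterations):
            block_results = [kaczmarz_sweep(A[rows], b[rows], x) for rows in row_blocks]
            x = np.mean(block_results, axis=0)    # "reduction" step across blocks
        return x

    # Tiny synthetic system standing in for the (huge, sparse) pct linear system.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(200, 50))
    x_true = rng.normal(size=50)
    x_est = carp_like_solve(A, A @ x_true)
    print("relative error:", np.linalg.norm(x_est - x_true) / np.linalg.norm(x_true))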

4 PCT AND THE CLOUD

Cloud computing has gained significant popularity in the last several years by providing convenient, self-serviceable, and cost effective infrastructure and services to users. Cloud computing enables on-demand and elastic provisioning of virtualized computing resources. Commercial cloud providers employ utility computing models through which consumers pay only for the resources used. In this section we outline background information about the commercial cloud infrastructure that we use in this work, as well as important considerations in the deployment and execution of our pct reconstruction software on the cloud.

4.1 Amazon Elastic Compute Cloud

The Amazon Elastic Compute Cloud platform offers many different virtualized instance types to consumers. These instances are optimized for different scenarios (e.g., CPU, I/O, or memory). Recently, EC2 has incorporated cluster compute instances aimed at HPC applications. Importantly, these instances offer high CPU and memory capacity as well as improved network performance between instances. Some enhanced instances also include GPUs.

EC2 allows consumers to lease resources following two distinct pricing models: on-demand and spot. On-demand instances incur the standard advertised price for each instance type, allowing a user to request an instance at their convenience, pay the hourly rate, and release the instance at their discretion. Spot pricing provides a potentially discounted option to acquire resources. Users bid on excess instances and acquire them when their bid exceeds the market price of the instance. The market price varies with demand and is recalculated hourly; as the market rate increases, spot instances can be reclaimed if the market price exceeds the bid.

4.2 pct Reconstruction Instance Types

Prior to deploying the pct reconstruction software on EC2 we first established a mapping of application requirements to cloud instances. Based on our experience with our parallel, GPU-enabled, MPI application we identified requirements for high-CPU, high-memory, and GPU-enabled instances, as well as low latency between instances. The only instance type that matched these requirements at the time of our study was the Amazon EC2 GPU-enhanced cluster compute instance, termed CG1. CG1 instances include two Intel Xeon X5570 quad-core CPUs with hyperthreading, 22.5 GB of RAM, and two NVIDIA Tesla M2050 GPUs, each containing 3 GB of RAM. The cluster compute instances are connected by a high performance 10-Gigabit network. To run pct reconstruction on these resources, we deployed an Ubuntu Server instance on a CG1 instance and installed the appropriate drivers, tools, and dependencies. We then tested a small-scale version of the pct reconstruction software on a single instance.

Due to the difference in architecture between the Gaea cluster and the Amazon CG1 instance, the reconstruction software could not be directly mapped to the cloud. Gaea has 60 nodes, each with 72 GB of RAM, two six-core Intel Xeon X5650 CPUs, and two NVIDIA Tesla M2070s with 6 GB of RAM each. Gaea nodes are connected by QDR Infiniband, a high performance, switched network fabric [26]. The key difference between Gaea and Amazon CG1 instances is the lower available memory in CG1. We therefore modified our pct reconstruction parameters to reduce the number of proton histories that can be processed concurrently per node.
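The effect of the smaller CG1 memory footprint on cluster size can be illustrated with a rough sizing rule. The sketch below is a back-of-the-envelope estimate only: the 2 TB MLP working set, the 100 GB of histories, and the per-node RAM figures come from the text, while the assumed usable-memory fraction is ours; the real code must also respect GPU memory and per-process overheads, and Gaea runs on all 60 of its nodes in practice.

    import math

    # Rough cluster sizing from aggregate memory demand.  Working-set sizes and
    # per-node RAM come from the text; the usable-memory fraction is assumed.
    MLP_WORKING_SET_GB = 2000      # up to ~2 TB of MLP data for 2e9 histories
    HISTORY_DATA_GB = 100          # 2e9 histories at 50 bytes each
    USABLE_FRACTION = 0.8          # assumption: leave headroom for OS/MPI buffers

    def nodes_needed(node_ram_gb):
        total = MLP_WORKING_SET_GB + HISTORY_DATA_GB
        return math.ceil(total / (node_ram_gb * USABLE_FRACTION))

    print("CG1 (22.5 GB/node):", nodes_needed(22.5), "instances")   # ~117, i.e. ~120
    print("Gaea (72 GB/node): ", nodes_needed(72.0), "nodes")       # ~37 by memory alone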
4.3 Shared File System

The pct reconstruction software requires a shared data source from which each of the working MPI processes can access proton information. As the application requires on the order of 120 CG1 instances to process large images (based on memory requirement calculations for billions of proton histories), a high performance data storage model is required. We chose to use GlusterFS, an open source distributed file system that provides scalable and high performance access to files [27]. The GlusterFS model relies on one or more storage bricks (or servers) that allow client applications, in this case the pct reconstruction worker nodes, to mount the data source.

To evaluate the use of GlusterFS we deployed a small (non-optimized) EC2 instance and evaluated its ability to satisfy the data access requirements when running the pct reconstruction software over different topologies. For a small number of instances the software performed as predicted (based on theoretical calculations); however, as the number of instances was increased to 120, a significant decrease in performance was observed due to a network bottleneck, which resulted in high latency between the worker nodes running on CG1 instances and the Gluster node running on a separate non-cluster instance. To resolve this problem we deployed the GlusterFS storage node on a co-located cluster compute instance, with a 10-Gigabit network interconnect between the working cluster and the GlusterFS storage node.

Even using co-located storage, we found the data distribution phase of our pct reconstruction code to be significantly longer on the cloud than on our dedicated HPC cluster. Because the HPC cluster utilizes a dedicated QDR Infiniband connection and requires fewer MPI ranks, the distribution phase there is negligible in comparison to execution. Thus, the pct code focuses on optimizing the execution phase by allocating each process a specific range of proton histories from each input file. However, due to an increased number of processes, decreased network performance, and the overhead of GlusterFS, these parallel reads reduced the performance of cloud-based reconstructions. To overcome this problem, we modified the data distribution algorithm in our pct reconstruction code so that processes are allocated a sequential set of histories.
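The change to the distribution algorithm can be summarized as follows: instead of giving each rank a strided slice of every input file, each rank reads one contiguous range of histories. The sketch below shows the index arithmetic only; the function and variable names are ours, not from the pct code.

    def sequential_ranges(total_histories, n_ranks):
        """Assign each MPI rank one contiguous [start, end) range of histories,
        so that each rank issues a single large sequential read from the shared
        file system instead of many small interleaved (strided) reads."""
        base, extra = divmod(total_histories, n_ranks)
        ranges, start = [], 0
        for rank in range(n_ranks):
            count = base + (1 if rank < extra else 0)
            ranges.append((start, start + count))
            start += count
        return ranges

    # Example: 2e9 histories spread over 960 ranks (120 nodes x PPN = 8).
    print(sequential_ranges(2_000_000_000, 960)[:2])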

4.4 Data Upload/Download

Transferring large datasets to and from the cloud can be challenging, as we must ensure that bandwidth is maximized and that data is transferred reliably and, in the case of medical images, securely. For the real-time pct reconstruction application that we consider here, we require a high performance transfer system that can move data quickly between distributed source nodes (at hospitals) and our reconstruction service (in the cloud). Importantly, we require reliable data transfer, as corruption may result in incorrect reconstructions.

To address these requirements we chose to use Globus [28] to provide high performance, secure, and reliable third party data movement. Globus moves data between Globus endpoints, the name given to resources on which a Globus agent is installed. Globus endpoints can be created by installing Globus Connect Server for multi-user environments or, for an individual user, the lightweight Globus Connect Personal client for Linux, Windows, or Mac. Globus handles all the difficult aspects of data transfer, allowing a user to "fire and forget" transfers. Globus automatically tunes parameters to maximize bandwidth usage, manages security configurations, provides automatic fault recovery, encrypts data channels, ensures that files are transferred reliably by matching checksums, and notifies users of completion and problems.

4.5 Cloud Images and Elastic Scale Out

One advantage of cloud computing models is the ability to take snapshots of virtual machines that can be reused without requiring re-installation and reconfiguration of software. In Amazon, these snapshots are referred to as Amazon Machine Images (AMIs). We leverage this approach to provision pct worker nodes that are preinstalled with the pct reconstruction software and all of its dependencies (e.g., GPU/MPI drivers and libraries) and the GlusterFS client software. The task of configuring a pct reconstruction cluster is therefore limited to setting appropriate configuration settings on each node, a process which we also automate.

The pct reconstruction software requires seven instances to reconstruct a small, 131 million history, image. In order to perform larger tasks, such as an adult human head, we need to be able to easily and consistently launch entire clusters with potentially hundreds of nodes. To do so, we leverage the Amazon EC2 APIs to automatically provision an arbitrary number of instances of a specified type, with customized security policies, running the predefined AMI. Automatically scaling out clusters requires coordinated mechanisms to contextualize nodes and to manage the cluster. Rather than implement a new scheduler, we leverage Apache Mesos [29] to configure, resize, and terminate provisioned clusters. Due to application specific requirements of pct, such as the need for shared file systems and GPU drivers, we extended Mesos to provide additional contextualization functionality when deploying pct worker nodes.

Our scripts use AWS APIs to request instances by specifying the instance type, security group, number of instances, and, if requesting spot instances, a bid price. Our provisioning tool deploys the customized pct AMI described above. The tool creates a cluster consisting of one master node and as many slaves as required for the reconstruction. It also monitors instances as they are started to ensure they start correctly. Provided the instances are successfully provisioned, a second contextualization tool is used to ensure that each node in the cluster is correctly assembled and is capable of executing the MPI workload. This tool performs basic assembly functions, such as connecting the instance to the shared GlusterFS drive and ensuring the appropriate GPU devices are loaded. The tool also generates an MPI hostfile, specifying each node in the cluster and the available execution slots. Once these tools have executed, MPI workloads can be deployed across the cluster.
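Using the EC2 API, the provisioning step reduces to a few calls. The sketch below is a simplified stand-in for our Mesos-based tooling, not the tooling itself: it requests a mix of on-demand and spot CG1 instances with boto3 and writes an OpenMPI-style hostfile from the addresses of the launched instances. The AMI ID, security group, and bid price are placeholders.

    import boto3

    # Simplified provisioning sketch (not our Mesos-based tooling): launch a mix
    # of on-demand and spot CG1 instances and write an OpenMPI hostfile.
    AMI_ID = "ami-00000000"          # pre-built pct worker image (placeholder)
    SECURITY_GROUP = "sg-00000000"   # placeholder
    INSTANCE_TYPE = "cg1.4xlarge"

    ec2 = boto3.client("ec2", region_name="us-east-1")

    def launch_cluster(n_on_demand, n_spot, bid="0.40"):
        resp = ec2.run_instances(ImageId=AMI_ID, InstanceType=INSTANCE_TYPE,
                                 MinCount=n_on_demand, MaxCount=n_on_demand,
                                 SecurityGroupIds=[SECURITY_GROUP])
        if n_spot > 0:
            # Spot capacity arrives asynchronously; fulfilled instances would be
            # discovered later by polling describe_spot_instance_requests.
            ec2.request_spot_instances(SpotPrice=bid, InstanceCount=n_spot,
                                       LaunchSpecification={
                                           "ImageId": AMI_ID,
                                           "InstanceType": INSTANCE_TYPE,
                                           "SecurityGroupIds": [SECURITY_GROUP]})
        return [i["PrivateIpAddress"] for i in resp["Instances"]]

    def write_hostfile(addresses, slots_per_node=8, path="hostfile"):
        # OpenMPI-style hostfile: one line per node with its available slots.
        with open(path, "w") as f:
            for addr in addresses:
                f.write(f"{addr} slots={slots_per_node}\n")

    # Example invocation: a small cluster of four on-demand and four spot workers.
    write_hostfile(launch_cluster(n_on_demand=4, n_spot=4))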
Fig. 2. The pct reconstruction service. Hospitals request image reconstructions, transferring input data via Globus. The scheduler dynamically creates and manages HPC, MPI-capable, cloud clusters to service requests. Once reconstructed, the resulting image is asynchronously pushed back to the client via Globus.

5 PCT RECONSTRUCTION SERVICE

We have created an on-demand and elastic pct reconstruction service that leverages elastic cloud resources to support time-varying workloads that may involve many concurrent reconstructions of different sizes. The service processes requests for pct image reconstruction by provisioning cloud resources in an on-demand fashion. The general deployment model relies on a persistent representational state transfer (REST) service hosted on AWS. Clients worldwide can connect to this single service to request reconstructions. A single instance is responsible for hosting the REST interface, managing data transfers and the shared file system, provisioning clusters, and scheduling reconstructions across these clusters. The service uses Globus for asynchronous upload and download of input datasets and reconstructed images.

Fig. 2 shows the core components of the system. At the bottom of the figure, clients, representing hospitals or proton imaging centers, request image reconstructions from the pct reconstruction service. The service includes a scheduler that both schedules reconstructions and dynamically provisions and manages clusters as required. Requests submitted to the scheduler are created with a priority that represents the type of reconstruction (e.g., treatment plan or position verification). The priority determines whether the work must be immediately processed (which may require starting a new cluster), or whether the job can be cost effectively scheduled over existing infrastructure as it becomes idle. The resulting images are then pushed back to the requesting client.

5.1 pct Reconstruction Service

The pct reconstruction service provides a machine-accessible REST interface and is hosted on a CG1 cluster instance to ensure low latency connections to the working cluster nodes.

The REST API includes functionality to enable clients (hospitals) to request processing of reconstruction workloads; manage and monitor the status of a reconstruction; and retrieve a computed reconstruction. The service includes a co-located Globus Connect Server endpoint for data transfer to and from the cloud. The service also includes a database that is used to maintain state relating to the available clusters and to current and previous reconstructions.

The pct reconstruction service includes a user interface (UI) for the creation, monitoring, and management of reconstructions. An administration UI provides information regarding active clusters, clients, and existing reconstructions. Once a reconstruction is complete, details regarding its execution, such as the time required for each phase, are stored in the database and displayed through the UI.

When a new reconstruction request is submitted, either programmatically or through the UI, a record is created in the service database with the associated metadata (e.g., file transfer endpoint, priority, and name). The client's proton history dataset is then transferred to the service's Globus endpoint. The service creates a unique identifier for the data, stores this identifier in the database to associate the data with the reconstruction job, and then monitors the Globus transfer for completion. Once the transfer is complete, the reconstruction job is marked as ready to be scheduled.

5.2 Scheduler

The pct reconstruction service includes an asynchronous scheduler that is responsible for creating and managing clusters, and then deploying and monitoring reconstruction jobs over them. Once a reconstruction is ready to be scheduled, it is queued for execution. The scheduler relies on predefined policies to determine how reconstructions are executed. By default, if no existing clusters are available and the reconstruction is of high priority, a new cluster will be instantiated. Otherwise the reconstruction is added to a queue of low priority requests to be serviced when excess cluster time is available.

We have designed the scheduler so that a wide variety of policies can be implemented. These policies provide a way to trade off cost against compute time by managing how clusters are deployed and managed. For example, the stated quality of service goal for near real-time pct reconstruction requires responses in approximately 10 to 15 minutes, while EC2 resources are paid for by the hour; it is thus economically inefficient to create a new cluster for every reconstruction. The scheduler takes this information into account and only destroys clusters (minutes before the next billing hour) if they are no longer in use. Where possible, unused clusters are resized to satisfy the requirements of larger reconstructions, to avoid constructing new clusters.

Amazon EC2 also offers a number of different economic models for provisioning instances. Spot instances allow bidding on excess resources and potentially acquiring them at a significantly lower price than on-demand instances. Spot instances provide a trade-off between price and reliability; if a bid is exceeded, an instance can be destroyed without warning. Due to the unstable nature of spot instances, the practicality of launching entire clusters of spot instances is questionable, especially when reconstructions may have fixed deadlines.
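The default policy can be expressed compactly. The sketch below is a schematic restatement of that policy rather than the service's actual scheduler code: high-priority jobs get an idle cluster (resized if needed) or a new one, low-priority jobs wait for spare capacity, and idle clusters are released shortly before their next whole billed hour. The field and callback names are illustrative.

    import time

    BILLING_PERIOD_S = 3600          # EC2 bills CG1 instances by the hour
    SHUTDOWN_MARGIN_S = 300          # release idle clusters ~5 min before renewal

    def schedule(job, clusters, low_priority_queue, provision, resize):
        """Schematic version of the default scheduling policy described above.
        `provision` and `resize` stand in for the cluster-management actions;
        `job.nodes_needed`, `job.high_priority`, and `cluster.size` are
        illustrative fields, not the service's actual data model."""
        idle = [c for c in clusters if c.idle]
        if job.high_priority:
            if idle:
                cluster = max(idle, key=lambda c: c.size)
                if cluster.size < job.nodes_needed:
                    resize(cluster, job.nodes_needed)   # grow rather than rebuild
                return cluster.run(job)
            return provision(job.nodes_needed).run(job) # no idle capacity: new cluster
        low_priority_queue.append(job)                  # wait for spare cluster time

    def reap_idle(clusters, release):
        """Release idle clusters just before they would incur another billed hour."""
        now = time.time()
        for c in list(clusters):
            billed = (now - c.started) % BILLING_PERIOD_S
            if c.idle and billed > BILLING_PERIOD_S - SHUTDOWN_MARGIN_S:
                release(c)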
We provision the master of a cluster as an on-demand instance and the remaining worker nodes as spot instances. However, this approach is not without risk. For example, a recent increase in CG1 instance usage has caused the availability and price of spot instances to become increasingly volatile. Our scheduler implements policies that dictate the instances provisioned. We have found that a combination of approximately half spot and half on-demand instances achieves a good compromise between cost and reliability. Where possible, we leverage clusters composed of on-demand instances for high priority reconstructions. In the future, we aim to extend the scheduler to monitor spot prices to determine the volatility of the market; the scheduler would then adjust the ratio of spot instances used in the cluster accordingly.

6 EVALUATION

Our evaluation focuses on several areas. First, we quantify the performance of the pct reconstruction software on both cloud resources and a dedicated HPC cluster. Second, we investigate the performance of the reconstruction service, looking specifically at reconstruction time, transfer rate, and cost when reconstructing images of various sizes. Third, we study the performance of our pct reconstruction service when used for end-to-end, multi-client reconstructions.

6.1 pct Reconstruction

We investigate the performance of the individual phases of the pct reconstruction on a pool of Amazon CG1 cluster instances and on the dedicated HPC cluster Gaea.

6.1.1 Input Reconstruction Data

We use data collected at the Loma Linda University Medical Center on a phantom target object to evaluate the pct reconstruction software. Using a proton detector similar to that depicted in Fig. 1, we obtain a phantom dataset of 131 million proton histories. In order to scale the analysis to larger target areas we read the phantom dataset multiple times to create larger reconstructions. This is a fair reflection of pct reconstruction, as the software operates at the individual voxel level and does not optimize calculations for repeated data. The pct software includes the ability to specify the number of times to process an input dataset. Reading 16 iterations of the phantom dataset provides approximately two billion proton histories, which is approximately our conservative upper limit on the number of histories generated when scanning a human head.

6.1.2 Cloud Reconstruction

To evaluate reconstruction on cloud resources we created a number of clusters of different sizes and evaluated the time to compute reconstructions of varying numbers of proton histories. Each CG1 instance has eight physical cores (two hyperthreaded quad cores). We consider two different MPI configurations for the pct reconstruction software, one with Processes Per Node (PPN) = 2 and one with PPN = 8, where PPN defines the number of MPI ranks run on a single node.

Fig. 3. Performance for PPN = 2 and PPN = 8 on AWS.

PPN = 8 is used in order to utilize each of the physical cores. Higher PPN values, which leverage hyperthreading, were not found to provide any significant advantage.

Fig. 3a shows the time taken to reconstruct an image of a given size when PPN = 2. We see that performance improves as more nodes are added, especially for larger datasets. Due to the limited memory available within each instance, we can only reconstruct large datasets on large clusters. We have therefore gathered results over clusters consisting of up to 120 CG1 instances. Because such large numbers of instances are required, and as these instances are subject to public demand, multiple availability zones are needed to ensure that the required number of instances can be acquired. For these tests, instances were acquired evenly over two availability zones. It is important to note that this distribution may affect the latency between nodes.

The PPN = 2 configuration provides a baseline against which to evaluate executions using each of the available physical cores. In an attempt to optimize pct image reconstruction, we also performed reconstructions using PPN = 8. Fig. 3b shows the time taken to reconstruct images of different sizes over clusters of various sizes at PPN = 8. We see slightly improved performance relative to PPN = 2, as individual nodes are used more efficiently. However, due to overheads associated with a far greater number of processes (and therefore the increased cost of data distribution), the execution time was found to be more variable than when PPN = 2. As above, restrictions related to the number of processes prevent us from reconstructing the largest, two billion history, datasets.

6.1.3 Dedicated HPC Cluster Reconstruction

We now compare the performance of the pct reconstruction code on Gaea, a dedicated HPC cluster with 60 nodes, each with two GPU units and 12 physical cores. As each Gaea node has more physical cores than a CG1 instance, the pct software can be run at up to PPN = 12. Fig. 4 shows the time required to reconstruct different datasets for increasing cluster sizes on Gaea at PPN = 2 and PPN = 12. As on the cloud, there are only small differences between PPN = 2 and PPN = 12. However, with fewer MPI ranks, a more efficient shared file system, and a dedicated Infiniband network connection, the data distribution phase is much faster than in the cloud solution. Thus, PPN = 12 is more efficient for all reconstruction sizes, as opposed to the cloud case where PPN = 8 only becomes more efficient for large datasets.

Fig. 4. Performance for PPN = 2 and PPN = 12 on NIU's Gaea, from Karonis et al. [5].

6.1.4 Discussion

The primary limitation of the MPI-based pct reconstruction software is its requirement for large amounts of memory. The software was developed for Gaea, where each of the nodes provides 72 GB of memory and each GPU has 6 GB. As the AWS instances have less than a third of the memory

(22.5 GB) and half the GPU memory (3 GB), clusters must be significantly larger on AWS than when running on Gaea. To process a two billion history dataset, more than 100 instances are required on AWS, whereas only 60 nodes are required on Gaea. Moreover, to achieve similar reconstruction performance, Gaea requires roughly one third as many nodes. This is approximately proportional to the ratio of CPU RAM between Gaea nodes and cloud instances. In addition, the reduced GPU memory requires more data staging, which in turn also reduces the performance of the software.

Network performance is another limitation for cloud-based reconstructions. Even when using cluster compute instances connected through a high speed network, network limitations, multi-availability-zone deployments, poor GlusterFS performance, and an increased number of MPI ranks make data distribution considerably more costly than on dedicated infrastructure. We plan to investigate the use of more GlusterFS bricks to distribute the workload, co-located bricks across availability zones, and optimized SSD storage instances to provide more efficient I/O. We also expect that Amazon will soon release cluster nodes with faster network connections, which will improve application performance.

While these limitations reduce performance when compared with a dedicated cluster, our results demonstrate that large-scale pct images can be reconstructed in a timely manner on commercial cloud resources, well within our stated quality of service goals. We believe that such approaches offer significant advantages in terms of cost and parallelism, as many Amazon clusters can be created and used simultaneously, far exceeding the capabilities of our dedicated cluster.

6.2 pct Reconstruction Service

The pct reconstruction service described in Section 5 provides end-to-end support for pct reconstructions. The service enables clients (hospitals) to upload and reconstruct images of different sizes and different priorities. The service then allocates these reconstructions over a dynamic pool of cloud resources. In this section we investigate the total reconstruction time, including transfer and processing, and the cost of reconstructing images.

6.2.1 Reconstruction Time

The total time required for a reconstruction can be determined by calculating the combined time necessary to transfer the input data to the service, reconstruct the image, and transfer the resulting image back. Fig. 5 depicts the total time required to reconstruct images for various sized datasets. For these results, we assume that a 120-node cluster capable of fulfilling the request is operational, and idle, at the time of receiving the input dataset. We measure transfer time by transferring different dataset sizes between AWS and the University of Chicago using Globus. The figure highlights one key limitation of a data-intensive cloud service: the ability to transfer data to and from the cloud efficiently. Our results show that transfer time is a significant component of overall reconstruction time and, while smaller datasets can be reconstructed within our stated goal (10 to 15 minutes), larger datasets can take almost an hour to process, as transfer time alone exceeds our goal.

Fig. 5. The total time required to transfer and reconstruct a pct image. Each column includes the time taken to transfer datasets to and from the cloud service, as well as to perform the reconstruction. The forecast times required for transfer and reconstruction when supported by a 1-Gigabit and a 10-Gigabit network with 100 percent utilization are also shown.
The figure also shows the total reconstruction time that we would predict if the service were supported by a high speed (1-Gigabit or 10-Gigabit) network with maximum network utilization. Based on the previously calculated execution times, we require a 2.5-Gigabit connection between the client and the service to reconstruct a two billion history dataset within 15 minutes. With a 10-Gigabit connection, we would be able to reconstruct two billion history datasets within 10 minutes. Fortunately, Amazon offers a direct connect capability that enables the creation of private high speed connections (1-Gigabit or 10-Gigabit) between AWS and client applications.

We expect that as pct technology becomes more sophisticated, fewer histories will be required to reconstruct a statistically accurate image. Other approaches, such as the data reduction technique proposed by Herman and Davidi [30], in which an object is scanned from only one side and the data generated is therefore halved, can also be applied to reduce data sizes; however, the authors also recognize that this approach could result in noise masking the presence of tumors. There is also potential to optimize our reconstruction code to process streamed data. At present, computation waits for all data to be uploaded; however, if a sufficient data rate can be sustained we could overlap data upload with computation to produce results more quickly. We aim to investigate these approaches as future work.

6.2.2 Transfer Rate

We have seen that data transfer rate has a major influence on total reconstruction time. As transfer rates may differ significantly between locations, we measure the time required to move different amounts of data between various centers and Amazon. We selected endpoints at the University of Chicago (UC), NIU, and the National Energy Research Scientific Computing Center (NERSC) in Berkeley, California. These centers provide geographical distribution across the US and represent the types of locations that we would expect to use such a service. In each location we create a virtual machine with Globus Connect Personal and measure the transfer time to a Globus Connect Server running on an Amazon EC2 instance in the US East region.
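The roughly 2.5-Gigabit requirement quoted above follows from simple arithmetic, reproduced below. It is a rough check only: the 100 GB dataset size and the roughly nine-minute reconstruction time come from our measurements, and protocol overheads and the (small) image download are ignored.

    # Back-of-the-envelope check of the link speed needed for a two-billion-history
    # reconstruction.  Dataset size and reconstruction time are from the text;
    # protocol overheads and the (small) image download are ignored.
    DATASET_GB = 100
    RECONSTRUCTION_MIN = 9
    DEADLINE_MIN = 15

    upload_budget_s = (DEADLINE_MIN - RECONSTRUCTION_MIN) * 60
    required_gbps = DATASET_GB * 8 / upload_budget_s
    print(f"required sustained upload rate: {required_gbps:.1f} Gb/s")   # ~2.2 Gb/s

    # Conversely, on a fully utilized 10 Gb/s link the upload takes ~80 s,
    # giving a total time of roughly 9 + 1.3 = ~10 minutes.
    print(f"upload on 10 Gb/s link: {DATASET_GB * 8 / 10 / 60:.1f} min")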

Fig. 6. Upload (solid) and download (dashed) transfer rates between Globus endpoints at various sites and Amazon.

Fig. 6 shows the result of uploading (solid lines) and downloading (dashed lines) various size files to and from Amazon. The rates are computed based on total elapsed transfer times, which include costs associated with ensuring reliability, managing security, and monitoring the transfer. For example, total time includes the computation and comparison of checksums between source and destination, and any re-transfers of files that are found to be corrupt. From a pct perspective, data upload is most important, as reconstructed images are negligible in size relative to input datasets.

As with the results above, the transfer rates represent a barrier to real-time pct reconstruction. Uploading a modest dataset (131 million histories, 6 GB) typically takes between two and three minutes from UC, NIU, and NERSC. The variability of the results can be partly explained by the tests being conducted over public networks with varying load. In order to minimize this effect, we recorded measurements multiple times. The download rate from AWS to UC is significantly lower than to any other center, a result that we attribute to restrictions placed on the data center in which our VM was housed.

6.2.3 Cost

Due to the scale of resources required to perform reconstructions, the cost of provisioning AWS clusters is significant. Each GPU-enhanced cluster compute instance has an on-demand price of $2.10 per hour, meaning an entire 120-instance cluster costs $252 an hour. The largest envisioned images, consisting of two billion histories, require approximately nine minutes to reconstruct. Thus six of these reconstructions can be completed within an hour, making the price per reconstruction $42, assuming sufficient demand. As datasets vary significantly in size, smaller and cheaper clusters can be used to service many smaller reconstruction requests.

Spot instances can provide significantly lower prices than on-demand instances. During our initial evaluation in 2013 the average spot price for CG1 instances was $0.34 per hour; we found that a bid price of $0.40 provided high reliability when provisioning large clusters and that a 120-instance cluster could be provisioned for approximately $50 an hour. Two billion history reconstructions could therefore be performed for less than $10 each. However, due to an increase in demand in 2014, the price for spot CG1 instances is now typically over $1 and can sometimes even exceed on-demand prices. In addition, the volatility of spot instance prices has increased, making clusters comprised only of spot instances increasingly unreliable. We have recently adopted a hybrid approach with half spot and half on-demand instances; this approach results in 120-instance clusters costing approximately $200 an hour.

Fig. 7. Per-image PPN = 2 reconstruction costs for various datasets, when using clusters of different sizes that are made up of either entirely on-demand (solid) or entirely spot (dashed) instances.

Fig. 7 shows the projected cost of reconstructing various size datasets under different cluster configurations with both on-demand (solid lines) and low spot prices (dashed lines). The projected spot prices are based on our initial experiments, in which spot prices were regularly at $0.34 per hour. Interestingly, the cheapest reconstructions tend to use smaller clusters.
This is indicative of a slight trade-off between time and cost, and of the lack of perfect linear scaling (as illustrated in Fig. 3). These results show that small reconstructions (approximately 500 million histories or fewer) can be conducted for under $10 using on-demand ($2.10 per hour) instances, while large reconstructions (e.g., two billion histories) can also be executed for under $10 using spot instances.

It should be noted that the specialized GPU-enhanced cluster instances used by our service are only offered in two Amazon regions (only one in the US), which removes the need for complex provisioning approaches at present. While limiting execution to a single region may affect cost and performance, in related work we have developed cost-aware provisioning techniques that could be used to further reduce costs [31].

While it is difficult to compare these costs accurately with those of using a dedicated HPC cluster, we present estimates here based on the cost of Gaea. Costs associated with a dedicated cluster fall into two categories: the upfront cost of the resources and the operational cost. The initial cost for a cluster of that size is approximately $800,000, and thereafter it can cost an institution over $150,000 annually in staffing, contracts, and other operational expenses. Moreover, these estimates do not include the cost to replace or upgrade the cluster, which will likely have to occur every five years or less. Even considering only the conservative annual cost, a center would need to process over 15,000 reconstructions annually to make the cost comparable to our cloud-based solution.
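The per-image cost figures above follow directly from the hourly rates, as the sketch below shows. It is a simple restatement of the arithmetic in this subsection, using the prices observed during our experiments, not a general pricing model.

    # Per-image cost arithmetic for a 120-instance cluster, using the hourly
    # rates observed during our experiments.
    CLUSTER_SIZE = 120
    ON_DEMAND_RATE = 2.10      # $/instance-hour for CG1
    SPOT_RATE = 0.40           # $/instance-hour at our bid price
    RECONSTRUCTION_MIN = 9     # two-billion-history image
    IMAGES_PER_HOUR = 60 // RECONSTRUCTION_MIN   # 6, assuming back-to-back demand

    for label, rate in [("on-demand", ON_DEMAND_RATE), ("spot", SPOT_RATE)]:
        hourly = CLUSTER_SIZE * rate
        print(f"{label}: ${hourly:.0f}/hour, ~${hourly / IMAGES_PER_HOUR:.0f} per image")

    # Break-even against a dedicated cluster costing ~$150,000/year to operate:
    print(f"break-even: {150_000 / 10:.0f} reconstructions per year at $10 each")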

6.3 Workflow Simulation

In order to evaluate the ability of our pct reconstruction service to elastically respond to and service job requests, we have developed and tested a simulation workflow. We use a Python client application to simulate requests from a single location. Using several of these clients we are able to study the performance of the service when multiple simultaneous requests are submitted. Each client creates reconstruction requests and transfers the input proton history dataset to the pct reconstruction service. Clients are constructed with a name and an associated Globus Connect endpoint from which datasets are transferred. Each reconstruction request includes a size and a priority based on the workload. The reconstruction description is sent to the pct reconstruction service, and a Globus transfer is initiated to move the input dataset to the shared data store. When the reconstruction is complete, a second Globus transfer returns the resulting image to the client's endpoint.

To reduce experiment cost, simulation jobs are created with sizes of between 10 and 50 million proton histories. The service, in turn, creates and resizes clusters based on the clusters' queue lengths, preferring to resize an existing cluster before creating a new one. We also restrict the scheduler to assume that only 10 million histories can be processed by an individual instance, meaning that the size of a job determines the minimum number of instances a cluster must have to successfully execute that job.

Fig. 8a shows the simulation workload and the time spent on each reconstruction job. It illustrates when jobs are created, which client creates them, and how long each job takes to complete. Fig. 8b shows the sizes of the clusters that are created by the reconstruction service to satisfy requests. We see that four clusters are created over the duration of the simulation, two clusters are resized to service larger jobs, and idle clusters are shut down before their next billing cycle. Fig. 8c shows the total time spent for each request to be completed. An interesting feature of this graph is the representation of the differing queue time for each job. Longer queue times are caused by backlog in the system, where a new cluster must be started to service an influx of job requests. Once another cluster is operational, subsequent jobs are serviced more efficiently. In our simulation, Cluster 1 ran jobs 1-9, 11, 14, 17, 19, and 20; Cluster 2 ran jobs 10, 13, 15, 16, and 18; Cluster 3 ran job 12; and Cluster 4 ran the remaining jobs.

Fig. 8. The result of simulating multiple clients requesting several reconstruction jobs over the pct reconstruction service: (a) the simulation workload, in terms of when requests are submitted, by which client, and how long they take to be satisfied; (b) the history of the four clusters that the service creates and resizes to service waiting jobs; (c) the time consumed by each phase of the submitted reconstruction jobs.
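The simulated clients are thin wrappers around the service's REST interface. The sketch below shows their general shape only: the endpoint URL, the JSON field names, and the helper that triggers the Globus transfer are illustrative placeholders, not the service's actual API.

    import random
    import time
    import requests

    SERVICE_URL = "https://pct-service.example.org/api/reconstructions"  # placeholder

    def submit_job(client_name, globus_endpoint, start_transfer):
        """Create one reconstruction request, as the simulated clients do.
        `start_transfer` stands in for the step that pushes the input proton
        histories to the service's Globus endpoint; field names are illustrative."""
        histories = random.randint(10, 50) * 1_000_000   # 10-50 million histories
        job = {
            "client": client_name,
            "source_endpoint": globus_endpoint,
            "histories": histories,
            "priority": random.choice(["high", "low"]),
        }
        resp = requests.post(SERVICE_URL, json=job)
        resp.raise_for_status()
        job_id = resp.json()["id"]
        start_transfer(globus_endpoint, job_id)           # upload input dataset
        return job_id

    def run_client(client_name, globus_endpoint, start_transfer, n_jobs=5):
        """Submit a handful of jobs at random intervals, mimicking sporadic demand."""
        for _ in range(n_jobs):
            submit_job(client_name, globus_endpoint, start_transfer)
            time.sleep(random.uniform(30, 300))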
7 DISCUSSION AND FUTURE WORK

We briefly discuss important challenges identified in deploying and evaluating the pct reconstruction service, and present potential areas for future work.

7.1 File System Bottlenecks

In any data-intensive application in which many nodes simultaneously access shared data, file system access may become a bottleneck. Our approach, using GlusterFS, scaled adequately to the 120-node clusters we deployed. However, we also found that performance degraded when additional processes per node were used. Designing an appropriate shared file system is therefore an important challenge for such applications. Our initial experience striping GlusterFS over multiple CG1 instances proved to be both costly and ineffective, as it did not improve performance for our workload, presumably because the workload did not require additional blocks. Moreover, as demand for CG1 instances grows, we increasingly see the need to deploy clusters that span multiple availability zones, which in turn requires deployment-aware data distribution. In future work we aim to investigate whether replicated GlusterFS bricks in each availability zone can reduce the setup time and improve the overall performance of the application when deployed in such configurations.

Discussions with Amazon developers suggested that storage-optimized I2 instances could be used to host GlusterFS to provide improved file system access. I2 instances include solid state drives to store data with high I/O requirements; importantly, they also share the same high speed interconnect used by CG1 instances. We plan to investigate performance when using I2 instances to host the pct file system, as they are more cost efficient than CG1 cluster instances and may reduce the bottlenecks associated with file system access.

7.2 Cloud Deployment

New instance types and high speed network connections introduced by Amazon may provide yet more benefits to our pct reconstruction service. For example, a new GPU-enabled instance type, G2, is designed to facilitate streaming capabilities within the cloud. Although these instances are significantly less computationally powerful than CG1 instances, they are much cheaper and could potentially be employed to reduce the cost of pct reconstructions.

Data transfer rates are a severe limitation on deploying our current implementation in production. Our results show that high speed networks between proton therapy treatment facilities and an operating pct reconstruction service are required before such a service would be viable for large reconstructions. Our future work will investigate the use of Amazon's direct connect capabilities as a model for providing 1-Gigabit to 10-Gigabit connections to the pct reconstruction service.

We have applied leading commercial approaches to the design of the pct reconstruction service to ensure that the service can be deployed in a highly available and reliable manner. For instance, where possible we leverage reliable cloud-based services such as RDS to host databases and EC2 to host stateless service instances. Thus, using multi-availability-zone deployments and elastic load balancers (ELB) to allocate requests across a pool of running services, we can ensure a high degree of availability. Moreover, using Amazon health checks and services such as CloudWatch, we can monitor running instances, generate alerts, and automatically remove and replace unhealthy instances.

7.3 Pricing Strategies

We have shown that cost-effective pct image reconstruction can be achieved via on-demand provisioning of public cloud resources. We note, however, that for large-scale image reconstructions to be practical, a high demand for such a service is required to amortize fixed overheads. These fixed overheads include, in our current implementation, a persistent CG1 instance that is used to maintain the shared file system and facilitate data transfers; they could be reduced by using smaller and I/O-optimized instances, but cannot be eliminated.

In an attempt to reduce cluster operation costs, we have developed policies to shut down idle resources prior to an upcoming billing cycle. It may also be possible to make more effective use of spot instances. For example, Poola et al. [32] describe an approach in which spot instances are used until a specified slack time is exceeded, at which time a workflow is migrated to more costly on-demand instances in order to fulfill obligations. This approach could be applied to the pct reconstruction service by incorporating service level agreements on image reconstructions and extending our model with various costing strategies. We also require better algorithms and mechanisms to predict spot price volatility, to provision instances by bidding appropriately, and to trade off cost and availability. In future work we plan to further evaluate the use of a combination of on-demand, spot, and reserved pricing models to establish a cost-efficient cloud service for on-demand HPC workloads.

Finally, in order to operate the pct reconstruction service commercially we must develop a billing model in which users are charged for the resources consumed. We aim to leverage billing capabilities we have developed for the Globus Genomics [33] service for this purpose.
7.4 Privacy on Clouds

Image reconstructions may, depending on the target, contain identifiable information such as human faces. Privacy of medical information in the US is governed by the Health Insurance Portability and Accountability Act (HIPAA), which places technical and non-technical restrictions on accessing and managing personal health information. There are two approaches to compliance when analyzing and reconstructing images: 1) anonymization, or 2) analysis on HIPAA-compliant infrastructure. Anonymization covers both phenotype information (e.g., subject name, age, and address) and, in the case of neuroimages, removal of face images. It has long been held that identifiable health information cannot be stored or processed on commercial clouds because of data privacy requirements. However, this situation is changing rapidly. Cloud providers such as AWS now offer a number of compliance frameworks, including HIPAA, and are also able to sign the Business Associate Agreements (BAAs) necessary for HIPAA compliance. Moreover, compliance and privacy are active research areas for the entire cloud community, and a variety of new approaches for providing secure storage and computation on clouds have emerged [34].

Our pct reconstruction service currently requires that clients explicitly remove all directly identifiable information. Before deploying the service for clinical use we will investigate best practice approaches with respect to operations and security. For this purpose we expect to leverage our experience operating the Globus and Globus Genomics services. We also expect to leverage AWS HIPAA compliance to provide a secure and compliant reconstruction service for clinical use. Finally, as part of our operating procedures we will develop a comprehensive threat model prior to making the service generally available.
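As an illustration of the kind of client-side scrubbing we currently require, the sketch below removes directly identifying metadata from a reconstruction request before upload. The field names are hypothetical: the actual schema of client metadata is not defined here, so this is a minimal example of the idea rather than our client implementation.

```python
import copy
import uuid

# Hypothetical metadata fields that directly identify a patient; a real
# deployment would follow the full HIPAA Safe Harbor list of identifiers.
IDENTIFYING_FIELDS = {
    "patient_name", "date_of_birth", "address", "phone_number",
    "medical_record_number",
}

def scrub_metadata(metadata: dict) -> dict:
    """Return a copy of the request metadata with identifying fields removed.

    Non-identifying fields needed for reconstruction (e.g., scanner geometry
    or number of proton histories) are passed through unchanged.
    """
    clean = copy.deepcopy(metadata)
    for field in IDENTIFYING_FIELDS:
        clean.pop(field, None)
    # Replace the patient identifier with an opaque token; the treatment
    # facility keeps a local mapping from token to patient so it can match
    # the returned image, while the token itself carries no identity.
    if "patient_id" in clean:
        clean.pop("patient_id")
        clean["subject_token"] = uuid.uuid4().hex
    return clean

if __name__ == "__main__":
    request = {
        "patient_name": "Jane Doe",          # hypothetical example values
        "patient_id": "SITE-12345",
        "date_of_birth": "1970-01-01",
        "num_histories": 2_000_000_000,
        "scanner": "prototype head scanner",
    }
    print(scrub_metadata(request))
```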

8 CONCLUSION

Real-time pct image reconstruction represents an exciting new approach to providing precise proton imaging immediately before proton therapy. Our parallel GPU-enabled software can produce large image reconstructions in a matter of minutes. We have shown in this paper that commercial cloud services can provide a cost-effective alternative to dedicated HPC infrastructure. We demonstrated that specialized EC2 cluster instances can provide on-demand and highly scalable reconstruction. Our results show that large scale reconstructions can be performed within minutes using up to 120 GPU-enhanced cluster compute nodes. While dedicated HPC clusters can achieve similar reconstruction performance with half the number of nodes, the extra flexibility of cloud platforms and the ability to satisfy sporadic usage requirements at low cost may be advantageous in many settings. Our results also reveal some limitations of currently available cloud instances for pct reconstruction, in particular their reduced network performance compared with our dedicated cluster. We show that enhancements such as instances with more RAM and faster networks, as well as software and deployment improvements such as tuning shared file systems, can improve reconstruction performance.

In order to demonstrate that such capabilities could be offered to a wide community, we have constructed a scalable end-to-end reconstruction service that provisions cloud clusters dynamically to meet demand. This service shields users from the complexities involved in configuring and maintaining the MPI- and GPU-enabled reconstruction code, allowing them simply to submit requests through a simple REST interface. The service handles dataset upload, the prioritization and scheduling of workloads, the creation and resizing of cloud clusters, and the return of results to clients. Furthermore, by using spot instances, the service can compute two-billion-history reconstructions in under 10 minutes for less than $10. The different instance pricing models supported by commercial cloud providers, plus the different provisioning, scheduling, and execution options, suggest exciting new strategies for optimizing reconstruction time and cost.

The service is not yet commercially available. The biggest limitation that prohibits immediate use of our pct reconstruction service for real-time imaging is data upload overhead. Our experiments indicate that for 100 GB datasets, upload time significantly exceeds reconstruction time. Amazon's high-speed network capabilities provide a possible solution to this problem. Nevertheless, even with low bandwidth links, realistic reconstructions of up to half a million histories can be uploaded and processed within 10 to 15 minutes.

ACKNOWLEDGMENTS

This work was supported by Amazon.com, Inc., US Department of Defense contract no. W81XWH, and US Department of Energy contract no. DE-SC. The authors thank David Pellerin, Steve Elliott, and Jamie Kinney from Amazon for their continued support; Keith Schubert and his students Scott McCallister and Micah Witt for conversations on hardware accelerated computing; Gabor Herman, Yair Censor, Ran Davidi, and Joanna Klukowski for many valued discussions of pct mathematics; and Ford Hurley and the Loma Linda University Medical Center for sharing the LUCY phantom proton history data. Finally, they thank Reinhard Schulte and Scott Penfold for their collaboration and insight into pct and their work. They especially thank Dr. Schulte for his comments on this manuscript.

REFERENCES

[1] R. Schulte, V. Bashkirov, T. Li, Z. Liang, K. Mueller, J. Heimann, L. Johnson, B. Keeney, H. F.-W. Sadrozinski, A. Seiden, D. Williams, L. Zhang, Z. Li, S. Peggs, T. Satogata, and C. Woody, Conceptual design of a proton computed tomography system for applications in proton radiation therapy, IEEE Trans. Nuclear Sci., vol. 51, no. 3, pp , Jun
[2] V. Bashkirov, R. Schulte, G. Coutrakon, B. Erdelyi, K. Wong, H. Sadrozinski, S. Penfold, A. Rosenfeld, S. McAllister, and K. Schubert, Development of proton computed tomography for applications in proton therapy, in Proc. Am. Inst. Phys. Conf. Series, Mar. 2009, vol. 1099, pp
[3] R. W. Schulte, V. Bashkirov, M. C. Loss Klock, T. Li, A. J. Wroe, I. Evseev, D. C. Williams, and T. Satogata, Density resolution of proton computed tomography, Med. Phys., vol. 32, no. 4, pp ,
[4] S. Penfold, Image reconstruction and Monte Carlo simulations in the development of proton computed tomography for applications in proton radiation therapy, Ph.D. dissertation, Centre for Medical Radiation Physics, Univ. of Wollongong, New South Wales, Australia,
[5] N. T. Karonis, K. L. Duffin, C. E. Ordoñez, B. Erdelyi, T. D. Uram, E. C. Olson, G. Coutrakon, and M. E. Papka, Distributed and hardware accelerated computing for clinical medical imaging using proton computed tomography (pct), J. Parallel Distrib. Comput., vol. 73, no. 12, pp ,
[6] (2014, Apr.). Particle therapy co-operative group [Online]. Available:
[7] D. Lifka, I. Foster, S. Mehringer, M. Parashar, P. Redfern, C. Stewart, and S. Tuecke, XSEDE cloud survey report, Technical report, National Science Foundation, USA, XSEDE, Tech. Rep XSEDE-Reports-CloudSurvey-v1.0,
[8] A. Gupta and D. Milojicic, Evaluation of HPC applications on cloud, in Proc. 6th Open Cirrus Summit, Oct. 2011, pp
[9] K. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. Wasserman, and N. Wright, Performance analysis of high performance computing applications on the Amazon Web Services cloud, in Proc. IEEE Int. Conf. Cloud Comput. Technol. Sci., Nov. 2010, pp
[10] E. Deelman, G. Singh, M. Livny, B. Berriman, and J. Good, The cost of doing science on the cloud: The Montage example, in Proc. Int. Conf. High Perform. Comput., Netw., Storage Anal., Nov. 2008, pp
[11] L. Heilig and S. Voß, A scientometric analysis of cloud computing literature, IEEE Trans. Cloud Comput., vol. 2, no. 3, pp , Apr
[12] M. R. Sossa and R. Buyya, Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds, IEEE Trans. Cloud Comput., vol. 2, no. 2, pp , Apr
[13] G. Kanagaraj and A. C. Sumathi, Proposal of an open-source cloud computing system for exchanging medical images of a hospital information system, in Proc. 3rd Int. Conf. Trendz Inform. Sci. Comput., Dec. 2011, pp
[14] A. Reddy and R. Bhatnagar, Distributed medical image management: A platform for storing, analysis and processing of image database over the cloud, in Proc. Int. Conf. Adv. Energy Convers. Technol., Jan. 2014, pp
[15] G. C. Kagadis, C. Kloukinas, K. Moore, J. Philbin, P. Papadimitroulas, C. Alexakos, P. G. Nagy, D. Visvikis, and W. R. Hendee, Cloud computing in medical imaging, Med. Phys., vol. 40, no. 7, p ,
[16] H. Kim, M. Parashar, D. J. Foran, and L. Yang, Investigating the use of autonomic cloudbursts for high-throughput medical image registration, in Proc. 10th IEEE/ACM Int. Conf. Grid Comput., 2009, pp
[17] L. Parsonson, S. Grimm, A. Bajwa, L. Bourn, and L. Bai, A cloud computing medical image analysis and collaboration platform, in Cloud Computing and Services Science (Series Service Science: Research and Innovations in the Service Economy), I. Ivanov, M. van Sinderen, and B. Shishkov, Eds. New York, NY, USA: Springer, 2012, pp
[18] T. Bednarz, P. Szul, Y. Arzhaeva, D. Wang, N. Burdett, A. Khassapov, S. Chen, P. Vallotton, R. Lagerstrom, T. Gureyev, and J. Taylor, Biomedical image analysis and processing in clouds, in Proc. AIP Conf., 2013, vol. 1559, no. 1, pp
[19] T. Beisel, S. Lietsch, and K. Thielemans, A method for OSEM PET reconstruction on parallel architectures using STIR, in Proc. IEEE Nuclear Sci. Symp. Conf. Record, 2008, pp
[20] Y. Maimaitijiang, M. Roula, S. Watson, G. Meriadec, K. Sobaihi, and R. Williams, Evaluation of parallel accelerators for high performance image reconstruction for magnetic induction tomography, J. Select. Areas Softw. Eng., vol. 170, pp. 1–7, 2011.

[21] D. Vintache, B. Humbert, and D. Brasse, Iterative reconstruction for transmission tomography on GPU using NVIDIA CUDA, Tsinghua Sci. Technol., vol. 15, no. 1, pp ,
[22] A. M. Cormack and A. M. Koehler, Quantitative proton tomography: preliminary experiments, Phys. Med. Biol., vol. 21, no. 4, pp ,
[23] G. Cirrone, G. Cuttone, G. Candiano, F. Di Rosa, S. Lo Nigro, D. Lo Presti, N. Randazzo, V. Sipala, M. Bruzzi, D. Menichelli, M. Scaringella, V. Bashkirov, R. D. Williams, Hartmut F.-W. Sadrozinski, J. Heimann, J. Feldt, N. Blumenkrantz, C. Talamonti, and R. Schulte, Monte Carlo studies of a proton computed tomography system, IEEE Trans. Nuclear Sci., vol. 54, no. 5, pp , Oct
[24] V. Sipala, M. Bruzzi, M. Bucciolini, M. Carpinelli, G. Cirrone, C. Civinini, G. Cuttone, D. L. Presti, S. Pallotta, C. Pugliatti, N. Randazzo, F. Romano, M. Scaringella, C. Stancampiano, C. Talamonti, M. Tesi, E. Vanzi, and M. Zani, A proton computed tomography system for medical applications, J. Instrumentation, vol. 8, no. 02, p. C02021,
[25] D. Gordon and R. Gordon, Component-averaged row projections: A robust, block-parallel scheme for sparse linear systems, SIAM J. Sci. Comput., vol. 27, no. 3, pp ,
[26] InfiniBand Trade Association, InfiniBand Architecture Specification: Release 1.0,
[27] (2014, Apr.). The Gluster web site [Online]. Available:
[28] I. Foster, Globus Online: Accelerating and democratizing science through cloud-based services, IEEE Internet Comput., vol. 15, no. 3, pp , May
[29] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica, Mesos: A platform for fine-grained resource sharing in the data center, in Proc. 8th USENIX Conf. Netw. Syst. Des. Implementation, 2011, pp
[30] G. T. Herman and R. Davidi, Image reconstruction from a small number of projections, Inverse Problems, vol. 24, no. 4, p ,
[31] R. Chard, K. Chard, K. Bubendorfer, L. Lacinski, R. Madduri, and I. Foster, Cost-aware cloud provisioning, in Proc. 11th IEEE Int. Conf. eScience,
[32] D. Poola, K. Ramamohanarao, and R. Buyya, Fault-tolerant workflow scheduling using spot instances on clouds, Procedia Comput. Sci., vol. 29, no. 0, pp ,
[33] R. K. Madduri, D. Sulakhe, L. Lacinski, B. Liu, A. Rodriguez, K. Chard, U. J. Dave, and I. T. Foster, Experiences building Globus Genomics: A next-generation sequencing analysis service using Galaxy, Globus, and Amazon Web Services, Concurrency Comput.: Practice Exp., vol. 26, no. 13, pp ,
[34] L. Wei, H. Zhu, Z. Cao, X. Dong, W. Jia, Y. Chen, and A. V. Vasilakos, Security and privacy for storage and computation in cloud computing, Inform. Sci., vol. 258, pp ,

Ryan Chard received the BSc (Hons) and MSc degrees from the Victoria University of Wellington. He is currently working toward the PhD degree at the Victoria University of Wellington. He is a student member of the IEEE.

Nicholas T. Karonis is a professor of computer science at Northern Illinois University. He received the PhD degree in computer science from Syracuse University and is a Resident Guest Associate at Argonne National Laboratory.

Kyle Chard received the PhD degree in computer science from the Victoria University of Wellington. He is a senior researcher and a fellow at the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory. He is a member of the IEEE.
Kirk L. Duffin received the PhD degree in computer science from Brigham Young University. He is an associate professor of computer science at Northern Illinois University.

Caesar E. Ordoñez received the PhD degree in nuclear physics from the Massachusetts Institute of Technology. He is a researcher at Northern Illinois University and has been working in the field of medical imaging for more than 20 years.

Thomas D. Uram is a member of the research staff at the Argonne National Laboratory and the Computation Institute at the University of Chicago.

Justin Fleischauer is a graduate student at Northern Illinois University.

Ravi Madduri is a project manager at the Argonne National Laboratory and a fellow of the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory.

Ian T. Foster is a director of the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory. He is also an Argonne senior scientist and distinguished fellow and the Arthur Holly Compton distinguished service professor of computer science. He is a senior member of the IEEE.

Michael E. Papka received the PhD degree in computer science from the University of Chicago. He is a senior scientist at the Argonne National Laboratory, where he serves both as a deputy associate laboratory director and as a director of the Argonne Leadership Computing Facility. He is a senior fellow of the Computation Institute, a joint institute of the University of Chicago and Argonne National Laboratory, and an associate professor of computer science at Northern Illinois University. He is a senior member of the IEEE.

John Winans received the MS degree from Northern Illinois University. He is a research associate at Northern Illinois University.
