Cyber-Physical Systems and Mixed Simulations

Master Research Internship
Master Thesis
Cyber-Physical Systems and Mixed Simulations
Author: Tran Van Hoang
Supervisor: Professor Bernard Pottier

Abstract

Climate change has received much attention in recent years, and the need to predict and validate the behaviour of real systems and natural phenomena has become critical. Simulation is a good candidate for this mission. However, the major problem is that modeling and simulating large, complicated physical systems is time-consuming. Although many commercial software packages now exist for such systems (water and forest modeling, for example), they require considerable knowledge of specific physical processes and of the study areas. Thus, as a first step, we propose a practical way of modeling physical systems, especially natural systems, simply by using Cellular Automata (CA). The PickCell tool developed at the Lab-STICC laboratory facilitates that process. GPU computation and parallelism are proposed as an important part of this methodology; the purpose is to accelerate large physical simulations. In addition, we propose the use of distributed simulations to deal with the lack of interoperability between simulations. To do that, we use an IEEE standard, the High Level Architecture (HLA), to design a system that supports mixed simulations based on synchronous systems. This also opens the way to simulating Cyber-Physical Systems.

Acknowledgments

I would like to give special thanks to Professor Bernard Pottier and to all of my colleagues at Lab-STICC, UBO. I appreciate their support, not only in research activities but also in my daily life.

Tran Van Hoang, Brest, France, 12/06/2015

Contents

1 Introduction
    Motivations and Objectives
    Cyber-Physical Systems (CPS)
    Cellular Automata (CA)
2 Physical simulations based on cell networks
    PickCell tool and cell networks
    Physical simulations based on cell networks
    Case study and applications
    Routing algorithm
    Remarks
3 Simulations with Cuda programming model
    GPU and Cuda programming model
    Accelerating simulations by using Cuda
    Details of GPU implementation of simulations
    Performance measurement principles
4 Distributed simulation with HLA
    Overview of The High Level Architecture (HLA)
    Time management in HLA
    Distributed physical simulation
5 Conclusion
    Contributions
    Future works
Bibliography

1 Introduction

1.1 Motivations and Objectives

Developing countries have suffered from natural disasters such as typhoons, tsunamis, fires, and floods. For example, in the Mekong Delta of Vietnam, the sea level is rising under the impact of climate change, which could flood the Mekong Delta every year. Environmental surveillance and the prediction of such phenomena have therefore become necessary. Simulation is a good approach for that purpose: it helps people make better decisions to prevent or relieve the impacts.

In recent years, wireless sensor networks (WSN) have emerged as a good candidate for monitoring the environment. Several inspiring projects have been launched with the common aim of sensing the environment [10]. Sensors collect the status of physical systems and send status data to computer systems for processing and analysis; some reactions are then sent back to the physical systems. Such an integration of physical systems and computer systems pertains to Cyber-Physical Systems (CPS), as presented in Section 1.2. It is therefore necessary to consider the sensing processes as well. The objective is to support and validate the operation of the WSN, particularly where it is responsible for detecting dangerous accidents, such as when monitoring a chemical store located in a residential area. A model composing the parallel simulations of the two sides of the CPS will thus be built.

However, modeling and simulating physical systems raises many issues. These systems are often huge and exhibit complex behaviour, which leads to a lot of effort in designing the models. Moreover, the lack of interoperability between simulations is also a major challenge, because in the real world these systems always affect each other. For instance, fire spread is influenced by several other factors, namely weather conditions, wind direction and speed, response capabilities, and the sensing performance of the wireless sensor network (WSN).
In such systems, the model consists of many components (fire spreading, weather conditions, firefighters, and WSN). These components and their relations result in large-scale models that are very difficult to maintain and adapt. These circumstances bring about:

- long run times for simulation runs;
- long development and testing times for such models;
- huge effort for maintaining the models and adapting them to other perspectives;
- low flexibility and reusability.

Traditionally, there are two common approaches to handling these problems. One is the use of powerful hardware. The other is breaking up the model into a set of submodels, which are distributed over different computer systems. However, these approaches come from separate lines of work. In this project, we therefore use a hybrid approach that combines distributed models with parallel computation. It aims to accommodate physical systems of huge size and complex behaviour. The approach can be viewed under two main aspects. For computing performance, the use of parallel simulations based on the GPU is suggested; powerful GPUs have been considered in several studies over the last years to speed up large simulations. To deal with the lack of interoperability between simulations, we use an IEEE standard, the High Level Architecture (HLA), which gives independent simulations the ability to communicate with each other in the context of a synchronous system.

The thesis is divided into five chapters:

Chapter 1: Presents the motivations and the objectives of the study, together with an overview of related concepts such as Cyber-Physical Systems (CPS) and Cellular Automata (CA). A description of the PickCell tool and its applications ends the chapter.

Chapter 2: Proposes a new approach that simplifies the process of modeling physical systems. The approach is facilitated by the PickCell tool in accordance with CA.

Chapter 3: Describes the use of the Cuda programming model to simulate physical models. Some experiments are conducted to evaluate the feasibility of the solution.

Chapter 4: Uses the HLA standard to deal with the lack of interoperability between several simulations. It enables parallel simulations to communicate with each other in the context of distributed systems.

Chapter 5: Summarises the contributions and presents future work.
1.2 Cyber-Physical Systems (CPS)

Cyber-Physical Systems (CPS) are integrations of computation and physical processes [9], [1], in which embedded computers and networks monitor and control the physical processes. They include feedback loops where physical processes affect computations and vice versa.

Figure 1.1: An example of Cyber-Physical System.

An example of a CPS is shown in Figure 1.1, which illustrates the monitoring of accidents (pollution, floods, landslides, or chemical spreading, for example) in a river. A WSN can be used to observe the status of the river via sensors. The sensors forward status data to computer systems, which carry out computations. An analysis of the computed results can lead to emergency operations, such as giving signals or closing the basin, in the case of an accident.

For the implementation of this type of system, one of the critical challenges is system integration. Therefore, to obtain interoperability between simulations, an integration solution is required. In [22], the authors presented a co-simulation framework based on the HLA standard. That work focuses on integrating heterogeneous systems, designed with different tools and languages, as CPSs. However, the given prototype addresses neither natural phenomena nor computational performance. Thus, the considerations in this project are expected to provide another perspective on phenomena simulations.

1.3 Cellular Automata (CA)

A Cellular Automaton (CA) is one of the techniques used to simulate complex physical systems, such as self-reproduction in biology or diffusion models in chemistry. The famous Game of Life illustrates that cellular automata are capable of producing dynamic patterns and structures [2], [3]. A major effort is made in [4] to show the advantages of using CA for modeling systems, especially natural phenomena. According to that work, the use of CA for modeling phenomena is clearer, more accurate, and more complete than conventional mathematical systems. Moreover, the transition rules of CA models are often simpler than mathematical equations, yet the results produced are more comprehensive; a CA can mimic the actions of any possible physical system.

A CA typically consists of two main components. The first component is a cellular space: a lattice of cells, each with an identical pattern of local connections to other cells for input and output. Each cell takes its state from a finite set of states; in the simplest case, each cell has a binary state, 1 or 0. A set of cells called the neighbourhood is defined relative to a given cell (the centre). The states of the neighbours are used to calculate the next state of the centre according to the defined rule. The number of neighbours depends on the pattern chosen in the modeling process.

The second component is a transition rule (CA rule), which gives the updated state of each cell (at time t+1) from its current state and the states of its neighbourhood (at time t). Typically, the rule for updating states is the same for all cells and does not change over time.

CA exist in various forms. The simplest is the one-dimensional lattice, in which all the cells are arranged in a line; the neighbours of a cell are then the cells to its left and right. For the two-dimensional lattice, the most common types of neighbourhood are the Von Neumann neighbourhood and the Moore neighbourhood (see Figure 1.2).

Figure 1.2: Von Neumann and Moore neighbourhood (distance = 1).

In the Von Neumann neighbourhood, each cell has four neighbours: north (N), south (S), east (E), and west (W). With binary states, the five cells (centre included) give 32 (2^5) possible patterns. In the Moore neighbourhood, the centre and its eight neighbours make nine cells, so 512 (2^9) possible patterns can be produced. In both cases, the distance is one, and the transition function has to define the next state for each pattern. Therefore, to model a system with this approach, two components need to be defined: the cellular space and the transition rules (or behaviour).
In the next chapter, an approach proposed at the Lab-STICC laboratory to automatically generate cellular spaces (cell networks) from geographic data is briefly presented. The input data and the behaviour of each cell are determined later, according to the particular interest in a given physical system.
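As an illustration of the two components described in Section 1.3, the following C sketch performs one synchronous step of a two-dimensional CA with binary states and the Von Neumann neighbourhood. It is a minimal sketch and is not taken from the thesis: the names `vn_pattern` and `ca_step`, the row-major grid layout, the bit order of the pattern, and the convention that cells outside the lattice count as 0 are all assumptions made for the example. The rule is given as a lookup table with one entry for each of the 32 patterns.

```c
#include <assert.h>
#include <stddef.h>

/* Number of distinct Von Neumann patterns for binary states: 2^5 = 32. */
enum { VN_PATTERNS = 32 };

/* State of cell (x, y); cells outside the lattice count as 0 (assumption). */
static unsigned cell_at(const unsigned char *grid, int w, int h, int x, int y)
{
    if (x < 0 || y < 0 || x >= w || y >= h)
        return 0u;
    return grid[(size_t)y * (size_t)w + (size_t)x] & 1u;
}

/* Packs centre, north, south, east, west into a 5-bit pattern index. */
unsigned vn_pattern(const unsigned char *grid, int w, int h, int x, int y)
{
    unsigned c   = cell_at(grid, w, h, x, y);
    unsigned n   = cell_at(grid, w, h, x, y - 1);   /* y grows southwards */
    unsigned s   = cell_at(grid, w, h, x, y + 1);
    unsigned e   = cell_at(grid, w, h, x + 1, y);
    unsigned wst = cell_at(grid, w, h, x - 1, y);
    return (c << 4) | (n << 3) | (s << 2) | (e << 1) | wst;
}

/* One synchronous step: every cell of `next` is computed from `cur` only. */
void ca_step(const unsigned char *cur, unsigned char *next, int w, int h,
             const unsigned char rule[VN_PATTERNS])
{
    assert(cur != next);   /* the state at t+1 must not overwrite the state at t */
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            next[(size_t)y * (size_t)w + (size_t)x] =
                rule[vn_pattern(cur, w, h, x, y)];
}
```

For instance, a table with `rule[0] = 0` and every other entry set to 1 makes a single live cell grow into a plus shape after one step. Keeping two separate buffers is what makes the update synchronous: all cells read the states of time t while the states of time t+1 are being written.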

2 Physical simulations based on cell networks

This chapter gives a brief description of PickCell, a tool that generates cell networks of physical systems; the structure of these networks is also described. We then propose a methodology for developing physical simulations in terms of cell networks. Lastly, some cases are examined to demonstrate the use of the proposed methodology.

2.1 PickCell tool and cell networks

PickCell tool

PickCell is a modeling tool that has been developed at Lab-STICC in recent years (see [8] for more details). It can access geographic data from various public resources as input, namely GoogleMap, OpenStreetMap, or even picture files. The tool uses these data to analyze, process, and generate cell network structures of physical processes. Its main feature is the extraction of visible properties (potential physical systems) from geographic data, such as rivers, forests, or road systems. The process starts from the input data. The final result is a set of separate physical systems, each represented by a cell network, as presented later in this section. This process is performed in three main steps:

- Preprocessing data: Geographic data are usually not yet well presented, especially in the case of satellite and aerial images. At this step, the tool increases the contrast of the data to serve the following steps.
- Segmenting data into cells: To isolate regions of interest in the data, such as rivers or roads, the data are divided into small cells. The cell size (x, y), where x and y are the width and height of a cell, depends on the objective of the desired model. For input data of a given size, the smaller x and y are, the larger the number of cells, and vice versa.
- Recognizing similar cells and grouping them into layers: Typically, the tool uses the three standard colour components (Red-Green-Blue) to classify the cells into defined layers. Each layer contains a set of cells with similar colours. Next, the relations between the cells of the same layer are defined according to a chosen CA pattern.

As a result, for each layer, we have a set of cells organized as a network through their relations (or links). These sets are called cell networks. The details of cell networks are presented next.

Cell network

As mentioned previously, a cell network is a group of cells and the relations between them. Each cell typically carries data consisting of four elements: an identity, a local state (such as pollution density, insect population, or geographic position), links to other cells (its neighbours), and the relative positions of its neighbours. The last element means that a cell can determine the direction of each neighbour, which can be located to the east, west, north, or south. This property is useful in various situations, such as simulating the weather or the flow of a fluid. For simplicity, directions can be encoded as pairs of numbers, as shown in Table 2.1.

    Direction   Value
    East        (1,0)
    West        (-1,0)
    North       (0,-1)
    South       (0,1)

Table 2.1: A proposed organization of directions in a cell network.

Table 2.2 shows an example of a cell network generated by the PickCell tool, except for its data, represented by the column named Pollution Density. The data can be loaded at the beginning of a simulation or at runtime.

    Cell Id   Pollution Density   Neighbour Id    Directions
                                  , 25, 1, 600    (-1,0), (1,0), (0,-1), (0,1)
                                  , 0             (-1,0), (0,1)
                                  , 26            (-1,0), (0,1)
                                  , 0, 589        (-1,0), (1,0), (0,-1) (1,0)

Table 2.2: A cell network structure of 601 cells generated by the PickCell tool (Von Neumann 1 CA).

The use of cell networks brings several advantages in developing physical simulations. Firstly, each cell network is a clear and consistent structure. All of its cells come from one physical system; they hold the same type of local data and have the same behaviour. The structure resembles a class in OOP (Object-Oriented Programming), with the cells as objects instantiated from that class. From a software engineering point of view, this is especially useful for maintaining the systems: it is simple to add properties to the states or transitions of the models.

Secondly, cell networks generated by the PickCell tool help to overcome a weakness of the input data. Many phenomena simulations use raster data as the input for their models. With this type of data, it is often difficult to distinguish the regions of interest, which causes useless computations outside those regions. For example, in [23], data cells that do not belong to the real area of interest (rivers) are marked NoData in the preprocessing step. Models built from cell networks avoid this useless processing by default.

In addition to the cell network structure, the PickCell tool can also extract visual data, which is useful for displaying and analyzing simulation results. Figure 2.1 shows how a river system is displayed from extracted visual data. In the current version, the tool generates two-dimensional data in the format of two concurrent programming languages, Cuda and Occam [6], [7]. Third-dimension data for elevation is planned for the next version.
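The four elements of cell data described in this section (identity, local state, links, and relative positions) could be held in a C structure such as the one below. This is only a sketch under stated assumptions: the type and field names (`Cell`, `Direction`, `DirName`, `neighbour_in_direction`) are hypothetical and do not reproduce the code that PickCell generates. Only the direction values follow Table 2.1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEIGHBOURS 4   /* Von Neumann 1 pattern */

/* Relative position of a neighbour, encoded as a pair as in Table 2.1. */
typedef struct { int dx, dy; } Direction;

typedef enum { DIR_EAST, DIR_WEST, DIR_NORTH, DIR_SOUTH, DIR_COUNT } DirName;

static const Direction DIR_VALUE[DIR_COUNT] = {
    { 1,  0},   /* East  */
    {-1,  0},   /* West  */
    { 0, -1},   /* North */
    { 0,  1}    /* South */
};

/* One cell: identity, local state, links, and relative positions. */
typedef struct {
    int       id;
    float     pollution_density;             /* local state (example)  */
    int       nb_count;                      /* number of neighbours   */
    int       neighbour_id[MAX_NEIGHBOURS];  /* links to other cells   */
    Direction direction[MAX_NEIGHBOURS];     /* where each link points */
} Cell;

/* Returns the id of the neighbour lying in the given direction, or -1. */
int neighbour_in_direction(const Cell *cell, DirName name)
{
    assert(cell != NULL && cell->nb_count <= MAX_NEIGHBOURS);
    Direction d = DIR_VALUE[name];
    for (int i = 0; i < cell->nb_count; i++)
        if (cell->direction[i].dx == d.dx && cell->direction[i].dy == d.dy)
            return cell->neighbour_id[i];
    return -1;
}
```

A border cell simply has fewer than four links; looking up a direction in which it has no neighbour returns -1. Adding a property to the model, as discussed above, amounts to adding a field to `Cell`.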

Figure 2.1: A cell network of a river system generated by the PickCell tool with Von Neumann 1.

In short, cell networks generated by the PickCell tool serve as skeletons for simulation models. To obtain a complete model with this approach, two other components need to be considered: input data and transition rules. These are presented in the next section.

2.2 Physical simulations based on cell networks

The cell network structure presented earlier is one of the main components of this methodology. Each model has at least three components: a cell network, input data, and a transition rule. The first is generated from geographic data with the help of the PickCell tool, whereas the other two are defined according to the characteristics of the physical system. A summary of the methodology is depicted in Figure 2.2. The process has three main steps. It begins with geographic data, which are processed by the PickCell tool to generate a cell network. The cell network is then associated with input data and a transition rule to make up a complete model. Lastly, this model is executed by a simulator. Currently, cell networks are generated in two versions, as Cuda and Occam code. Cuda was chosen in this work because its programming model fits the cell network structure well.
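The composition of the three components can be sketched sequentially in C as follows. This is an illustrative assumption, not the thesis code: the network is stored as adjacency lists in two arrays, the transition rule is passed as a function pointer, and all names (`CellNetwork`, `TransitionRule`, `simulate`, `fire_rule`) are hypothetical. The example rule is the forest fire rule given in Section 2.3.

```c
#include <assert.h>
#include <string.h>

/* Cell network as adjacency lists: the neighbours of cell i are
   nb_index[nb_start[i]] .. nb_index[nb_start[i + 1] - 1]. */
typedef struct {
    int        n_cells;
    const int *nb_start;   /* n_cells + 1 offsets        */
    const int *nb_index;   /* concatenated neighbour ids */
} CellNetwork;

/* Transition rule: next state of one cell from the states at time t. */
typedef int (*TransitionRule)(int self, const int *states,
                              const int *neighbours, int nb_count);

/* Example rule (fire spread): tree -> fire -> ash -> empty. */
enum { EMPTY, TREE, FIRE, ASH };

int fire_rule(int self, const int *states, const int *neighbours, int nb_count)
{
    switch (states[self]) {
    case TREE:
        for (int k = 0; k < nb_count; k++)
            if (states[neighbours[k]] == FIRE)
                return FIRE;      /* at least one burning neighbour */
        return TREE;
    case FIRE: return ASH;
    case ASH:  return EMPTY;
    default:   return EMPTY;
    }
}

/* Simulator: runs `steps` synchronous steps; `scratch` holds time t+1. */
void simulate(const CellNetwork *net, int *states, int *scratch,
              TransitionRule rule, int steps)
{
    assert(states != scratch);
    for (int t = 0; t < steps; t++) {
        for (int i = 0; i < net->n_cells; i++) {
            int first = net->nb_start[i];
            scratch[i] = rule(i, states, net->nb_index + first,
                              net->nb_start[i + 1] - first);
        }
        memcpy(states, scratch, (size_t)net->n_cells * sizeof *states);
    }
}
```

Because the rule is a parameter, the same simulator loop can run another model on the same or a different cell network; only the state type would need to be generalised for models whose state is not an integer.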

Figure 2.2: A summary of the proposed process used to conduct physical simulations.

2.3 Case study and applications

This section describes a case study applied to a study region, a small area located in the Mekong Delta of Vietnam, as shown in Figure 2.3. The region contains three physical systems: river, forest, and road. The first two, the river system and the forest system, were considered in this project. As applications of the proposed approach, two models are built from the study region: one of forest fire spread and one of river pollution diffusion. In addition, we assume that a wireless sensor network (WSN) is used to monitor the status of the forest, so a model of the WSN is also developed. The three models are detailed later in this section.

Another assumption is that these three systems communicate. When the fire spreads close to the river, its ashes pollute the river. Meanwhile, when sensors of the WSN detect fire near them, they raise emergency signals. This scenario is clarified and used as an application of the solution presented in Chapter 4.

Figure 2.3: The study region: a small area in the Mekong Delta, in the south of Vietnam (data source: OpenStreetMap [16]).

In reality, models use many elements of input data, and transition rules are often very complicated, since the goal is to make simulations as realistic as possible. In our case, however, a few basic characteristics are selected to demonstrate the feasibility of the proposed methodology. The input data and the transition rule of each model are as follows.

The diffusion of pollution in the river

This model simulates the diffusion of pollution in a river. Pollution can arise in various situations, involving chemicals, oil, or other contaminants, and its diffusion depends strongly on its density. The pollution density was therefore kept as the input data for this model. Each cell holds a pollution density, which represents the cell state. The states change according to the transition rule.

Input data: Pollution density.

Transition rule: At every time step, to reach its new state at time t+1, each cell performs the following tasks in sequence:

- If its local density is larger than zero, a randomly chosen amount is subtracted from it, and that amount is transported in equal shares to its neighbours.
- Next, the cell receives shares from its neighbours.
- Finally, the additions and the subtraction are applied to the state to prepare for the next step (time t+1).
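The three tasks of this rule can be sketched as a sequential C function. The sketch makes several assumptions that are not in the text: the amount given away is a fixed `fraction` of the density rather than a random amount, links are symmetric (if j is a neighbour of i, then i is a neighbour of j), and the network is stored as adjacency lists in the hypothetical arrays `nb_start` and `nb_index`.

```c
#include <assert.h>

/* One diffusion step on a cell network stored as adjacency lists:
   the neighbours of cell i are nb_index[nb_start[i] .. nb_start[i+1]-1].
   `fraction` in [0,1] is the share of its density that a cell gives away.
   Links are assumed symmetric, so every sender has at least one neighbour. */
void diffuse_step(int n_cells, const int *nb_start, const int *nb_index,
                  const float *density, float *next, float fraction)
{
    assert(density != next && fraction >= 0.0f && fraction <= 1.0f);

    /* Subtraction: what each cell keeps. Isolated cells keep everything. */
    for (int i = 0; i < n_cells; i++) {
        int nb = nb_start[i + 1] - nb_start[i];
        next[i] = (nb > 0) ? density[i] * (1.0f - fraction) : density[i];
    }

    /* Addition: each cell receives an equal share from every neighbour. */
    for (int i = 0; i < n_cells; i++) {
        for (int k = nb_start[i]; k < nb_start[i + 1]; k++) {
            int j = nb_index[k];                        /* sender          */
            int nb_j = nb_start[j + 1] - nb_start[j];   /* sender's degree */
            next[i] += density[j] * fraction / (float)nb_j;
        }
    }
}
```

With symmetric links the total amount of pollution is preserved, because everything a cell gives away is received by its neighbours. Setting `fraction` to 0.5 corresponds to the halves used in the transition function of Listing 3.1.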

The fire spread in the forest

This model simulates the spread of fire in a forest. It is reproduced from a sample in CORMAS [15]. Each cell has four possible states: tree, fire, ash, and empty. At the beginning, some cells are initialized with the state fire, while the others are tree.

Input data: Tree, fire, ash, and empty.

Transition rule:

- If a cell is tree at time t, it becomes fire at time t+1 if at least one of its neighbours is fire.
- If a cell is fire at time t, it becomes ash at time t+1.
- If a cell is ash at time t, it becomes empty at time t+1.

Wireless sensor network (WSN)

In this study, the WSN plays the role of a sensing component. It regularly collects raw data from the environment, processes that data, and raises an emergency alert when fire is detected. The WSN monitors the status of the forest. To do that, a set of sensors is deployed along the forest border, because our concern is the spread of the fire to other systems. We use a simple distributed algorithm for the deployment of the sensors; it is described in Section 2.4. The resulting WSN is shown in Figure 2.4.

Figure 2.4: Deploying sensors along the forest border extracted from the study region with the 4-neighbour pattern. The communication range and the sensing range are 25 and 5 cell units, respectively.

Typically, sensors have two types of range. One indicates the sensing capacity of the sensor; this sensing range can be small. The other, the communication range, can be longer thanks to radio link technology. When deploying sensors, it is thus necessary to make sure that they remain connected to each other, given their communication ranges.

Input data: Sensing data.

Transition rule: At every step, the nodes check the data received from the forest fire simulation. If fire is detected at some points, signals are raised.

2.4 Routing algorithm

This section presents a routing algorithm implemented in parallel. To take advantage of GPU computation, a new version of this algorithm was implemented in Cuda, starting from an Occam program. The algorithm builds the routing tables, which can be used for deploying sensors as described previously. We assume that the network has the same shape and structure as the cell networks introduced in Section 2.1. It consists of n nodes, numbered 0 to n-1; these numbers serve as their identities, as shown in Figure 2.5. Each node holds two tables: a route table and a temporary table. The route table stores the identities of the node itself and of the other nodes it has reached after t steps; its structure is presented in Table 2.3. The temporary table contains only the identities of the nodes newly reached at the current step. After each step, the contents of the temporary table are therefore completely replaced by new ones, while the route table either gains new records or remains unchanged. At each step, each node performs two main tasks: sending its local temporary table to its neighbours and receiving temporary tables from them. These tasks are performed n-1 times, which guarantees that the maximum possible distance is covered. The algorithm is as follows:

Algorithm in parallel:
    Initializing:
        Adding the node's id to the local temporary table and to the route table, with distance zero and link index -1.
    For i from 1 to n-1:
        For each neighbour:
            Sending the local temporary table to the neighbour.
            Receiving a temporary table from the neighbour.
        Emptying the local temporary table.
        For each id in the received temporary tables:
            If id does not exist in the route table:
                Adding id, with i as distance and a link index, to the route table.
                Adding id to the local temporary table.

Figure 2.5: A simple network.

    Node 0: Known Id, Distance, Links
    Node 1: Known Id, Distance, Links
    Node 2: Known Id, Distance, Links
    Node 3: Known Id, Distance, Links

Table 2.3: An example of route table at node 0 after 3 steps.

These tables show the information held by the nodes of the network. Each node knows which nodes it can reach and the distance to each of these destinations.

2.5 Remarks

This chapter presented a variety of subjects. The most notable is the concept of the cell network, which plays an important role in developing physical models. In the next chapter, parallel computation is employed to simulate these models.
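The routing algorithm of Section 2.4 can be emulated sequentially in C as sketched below. This is not the Cuda or Occam implementation mentioned in the text: the fixed bound `MAX_NODES`, the `Network` structure, the function name `build_routes`, and the representation of the per-node tables as matrices are assumptions made to keep the example short. The matrix `fresh` plays the role of the table of ids learned in the previous step, and the loop over receivers stands in for the nodes that run in parallel.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16
#define UNKNOWN   (-1)

typedef struct {
    int n;                            /* number of nodes, ids 0..n-1   */
    int nb_count[MAX_NODES];          /* number of links of each node  */
    int nb[MAX_NODES][MAX_NODES];     /* nb[u][k]: node behind link k  */
    int dist[MAX_NODES][MAX_NODES];   /* route table: distance u -> id */
    int link[MAX_NODES][MAX_NODES];   /* route table: link index used  */
} Network;

void build_routes(Network *net)
{
    /* fresh[u][id] != 0 means that u learned id during the previous step. */
    unsigned char fresh[MAX_NODES][MAX_NODES];
    unsigned char learned[MAX_NODES][MAX_NODES];
    int n = net->n;
    assert(n > 0 && n <= MAX_NODES);

    /* Initializing: each node knows itself, distance 0, link index -1. */
    memset(fresh, 0, sizeof fresh);
    for (int u = 0; u < n; u++) {
        for (int id = 0; id < n; id++) {
            net->dist[u][id] = UNKNOWN;
            net->link[u][id] = UNKNOWN;
        }
        net->dist[u][u] = 0;
        fresh[u][u] = 1;
    }

    for (int step = 1; step <= n - 1; step++) {
        memset(learned, 0, sizeof learned);
        for (int u = 0; u < n; u++)                    /* receiver         */
            for (int k = 0; k < net->nb_count[u]; k++) {
                int v = net->nb[u][k];                 /* sender on link k */
                for (int id = 0; id < n; id++)
                    if (fresh[v][id] && net->dist[u][id] == UNKNOWN) {
                        net->dist[u][id] = step;       /* new record       */
                        net->link[u][id] = k;
                        learned[u][id] = 1;
                    }
            }
        /* The previous contents are completely replaced by the new ids. */
        memcpy(fresh, learned, sizeof fresh);
    }
}
```

On a line of four nodes 0-1-2-3, node 0 learns node 1 at step 1, node 2 at step 2, and node 3 at step 3, each time through its only link. Since an id is recorded the first time it arrives, the stored distance is the number of hops of a shortest path.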

3 Simulations with Cuda programming model

This chapter describes the Cuda programming model and its applications. One goal is to show how well the cell network structure maps onto the GPU architecture. This mapping also makes it possible to handle both large cell networks and complicated behaviour. Performance tests are then conducted in different scenarios to assess the effectiveness of this approach.

3.1 GPU and Cuda programming model

Introduction to GPU

The Graphics Processing Unit (GPU) [5] is a massively multithreaded many-core chip, composed of hundreds of cores that run thousands of threads. This provides the capacity to process large amounts of data in parallel, so GPUs are widely used in parallel computing. A simplified motherboard architecture is depicted in Figure 3.1. There are two parts: the left part for the CPU (host) and the right part for the GPU (device). They are connected by a PCI bus. On the CPU side, only the host memory is considered in this model. The GPU chip comes with a set of streaming multiprocessors (SM). Each SM consists of several scalar processors (SP), a set of registers, and a shared memory. This on-chip shared memory is visible to all threads executed on the SM. A global memory is shared by all SMs.

Figure 3.1: A simplified motherboard architecture.

Cuda programming model

Cuda (Compute Unified Device Architecture) was created by NVIDIA. It provides a platform and a programming model for parallel computing, and it increases computing performance by harnessing the power of the GPU. Cuda provides a set of extensions to the C/C++ language for expressing parallel programs. The GPU runs thousands of threads handling multiple tasks, while a CPU runs a few threads suited to sequential processing. A Cuda program thus typically consists of CPU code (host code) and one or more kernels (device code) running on the GPU. As shown in Figure 3.2, the compute-intensive portions of the application are sent to the GPU, while the remainder of the code still runs on the CPU. Kernels are executed by many threads, each with private local variables and access to shared memory. Thread blocks execute independently of one another, while the threads within a block can synchronize and cooperate through shared memory. In addition, the CPU and the GPU each have their own separate memory, and neither can directly access the memory of the other. We therefore need to transfer data explicitly between the two memories via the PCI bus.

Figure 3.2: Anatomy of a CUDA program.

3.2 Accelerating simulations by using Cuda

Programming with Cuda means programming a large number of threads, each with its own local data and access to shared memory, that concurrently execute the same task. The model is therefore convenient whenever a large amount of identical, repeated work has to be done. In our case, each model owns a cell network, input data for each cell, and a common transition rule for all cells. In other words, each cell has local data and a global behaviour. At each step, every cell must perform the same computation on its own data to produce the new state of the system. It is thus simple to map each cell to one thread, which is responsible for processing that cell, as illustrated by the figure mapping the cell network structure onto the GPU architecture.
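The arithmetic behind the mapping of one cell to one thread can be written down in a few lines of plain C. The sketch below is an emulation under stated assumptions, not device code: the function names are hypothetical, a one-dimensional launch is assumed, and in an actual kernel the cell index would be obtained from the built-in variables as `blockIdx.x * blockDim.x + threadIdx.x`.

```c
#include <assert.h>

/* Number of blocks needed so that blocks * threads_per_block >= n_cells. */
int blocks_needed(int n_cells, int threads_per_block)
{
    assert(n_cells >= 0 && threads_per_block > 0);
    return (n_cells + threads_per_block - 1) / threads_per_block;
}

/* Cell handled by thread `thread` of block `block`, or -1 when the thread
   is surplus (the last block is usually only partly used). */
int cell_of_thread(int block, int threads_per_block, int thread, int n_cells)
{
    int cell = block * threads_per_block + thread;
    return (cell < n_cells) ? cell : -1;
}
```

With the figures reported later in this chapter (a network of 83,661 cells and at most 1,024 threads per block), `blocks_needed(83661, 1024)` gives 82 blocks, assuming full-size blocks are used. The guard on surplus threads matters because the number of cells is rarely a multiple of the block size.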

Figure 3.3: The mapping between the cell network structure and the GPU architecture.

According to this model, data need to be moved to the global memory to be shared between threads. Figure 3.4 shows the data flow of physical simulations in terms of CUDA programming. It can be summarized in the following main steps:

- Initializing the initial states (input data) of all network cells.
- Transferring data (cell states and network structure) to the GPU for computation.
- For each cycle, computing the new states of all nodes concurrently on the GPU. The states are updated with the new values to prepare for the next cycle.
- Sending data back to the CPU memory, for example to display and analyze the results. This step is optional: if the result of each step does not need to be displayed or analyzed at run time, these operations can be omitted.

Figure 3.4: Data flow in the system.

If the display and analysis phase is skipped, the simulation runs almost entirely on the device. Hence, the performance benefit in this case is expected to be proportional to the size of the cell network. It becomes more worthwhile when simulating phenomena, which often come with large sizes and very complicated transition functions. Moreover, this approach provides an opportunity to perform computations and statistics in real time. This is increasingly important as the need grows to predict emergencies such as swarms of insects, flooding, traffic congestion, tsunamis, and fires. In those situations, the systems can directly access data available from the natural environment via observation systems. The simulations use these input data to derive useful information (for example, the direction of an insect swarm or the flood level at a certain time in the future).

3.3 Details of GPU implementation of simulations

This section presents the details of the GPU implementations of the three main simulations: pollution diffusion, forest fire, and wireless sensor network. All of them are developed in the C programming language in accordance with the Cuda model. The implementations result from the analysis in the previous section and follow the structure below.

Host program implemented on the CPU
(1) Initializing the initial values for all cells.
(2) Copying the cell network structure and data from the CPU host memory to the GPU device memory and launching the kernel.

Kernel program implemented on the GPU
(3) Looping over the cycles.
(4) Computing the new state of each cell.
(5) Updating each cell with its new state.
(6) Reading the results back to the CPU and outputting them (once per time step or more).

The execution runs mostly on the GPU (steps (3) to (5)). The other steps do not much affect the overall performance if step (6) is not considered: step (1) is executed once and step (2) is run twice.
Thus, for the comparison, the execution time on the CPU can be neglected. In the next section, some initial measurements are performed to evaluate the effectiveness of using the massively parallel GPU architecture to accelerate the computation of phenomena simulations.

3.4 Performance measurement principles

To validate the performance of the proposed methodology, a few measurement tests were performed. The simulation of pollution diffusion in the river was chosen as the test case. The pollution diffusion model is described in Section 2.3.1. The implementation of the transition function is presented in Listing 3.1. Two data structures are used: the NodeState structure contains the states of the cells, and the Canaux structure holds the links to neighbours.

Listing 3.1: Transition function

    __device__ NodeState computestate(NodeState *nowstate, int nodeindex, Canaux *channels)
    {
        NodeState mystate;
        int nbin, nodein;
        float receive;

        /// Getting pollution density of the cell
        mystate = nowstate[nodeindex];
        /// Getting number of neighbours of the cell
        nbin = channels[nodeindex].nbin;
        receive = 0;
        for (int i = 0; i < nbin; i++) {
            /// Getting id of the neighbour
            nodein = channels[nodeindex].read[i].node;
            /// Each neighbour sends half of its density, split equally among its own neighbours
            receive = receive + ((nowstate[nodein].density / 2.0) / (float) channels[nodein].nbin);
        }
        /// Computing the new state: the cell keeps half of its density and adds what it received
        mystate.density = (mystate.density / 2.0) + receive;
        return mystate;
    }

We tested and evaluated the computational efficiency in various settings. The focus of these tests is to show how much the GPU speeds up the simulations compared to the CPU. Therefore, the time for transferring data between the CPU and the GPU is omitted in most cases. The execution time of the host part of the simulation is also ignored, since most of the computation is moved to the device. As mentioned earlier, the simulation execution costs depend on two main components: the cell network (its size and the type of CA pattern chosen) and the complexity of the transition rule. Several aspects related to these components are therefore examined.
All tests were run on a PC with the hardware configuration shown in Table 3.1. Information about the graphics device is presented in Table 3.2 (for more details, see [11]). We used the profiling tool nvprof [17] to measure the GPU computation time and the standard library time.h to measure the CPU time.

Table 3.1: Technical data of PC used.

    Processor         Intel(R) Xeon(R) CPU E GHz
    Num. CPUs         8
    Num. cores/CPU    4
    Architecture      i
    RAM               GB

Table 3.2: Technical data of NVidia graphics card used.

    Graphics card                          GeForce GTX 680
    Num. cores                             1536
    Maximum number of threads per block    1024
    Global memory                          4 GB

The first scenario: The computation time on the CPU was compared with that on the GPU. All tests follow the model of river pollution diffusion (Section 2.3.1) with the pattern of 8 neighbours and 1,000 cycles for each test. The data transfer time was taken into account in this case study.

The computation on both the CPU and the GPU is influenced by the size of the cell network (the number of cells), but not by the size of the cells. Since the cell is the basic element of a cell network, the computation does not depend on the number of pixels per cell. For the same study region, a smaller cell size produces a larger cell network, and a larger cell size produces a smaller one. Thus, cell size was disregarded in the performance tests.

Table 3.3 shows the execution times of the pollution diffusion model on the CPU and the GPU for 1,000 cycles. The network sizes range from 1,220 to 83,661 cells. The number of cells influences the performance of both the CPU and the GPU. On the CPU, the upward trend is very noticeable: from 10,703 to 83,661 cells, the time increases at a rate of 0.26 s per 1,000 cells, and this trend is expected to continue for bigger sizes. On the GPU, the increase is not dramatic: the time rises gradually between 1,220 and 83,661 cells at a rate of 0.01 s per 1,000 cells. Table 3.3 shows that the GPU is overwhelmingly faster than the CPU. The gap becomes increasingly significant as the number of cells rises.
This is visually expressed in Figure 3.5. When the cell network has 83,661 cells, the GPU is approximately 22 times faster than the CPU. The use of the GPU is therefore vital for very large systems.

Table 3.3: The computation comparison between the CPU and the GPU in the case of the pollution diffusion model. Time in seconds per 1,000 cycles.

    Num. cells    Cell size (pixels)    CPU    GPU
    1,220         10x10
    10,703        5x5
    ,425          2x2
    83,661        2x2

Figure 3.5: Demonstrating the acceleration obtained by using the GPU for physical simulation.

Figure 3.6 shows an example of a physical simulation on the GPU. The cell network of a river is generated by the PickCell tool using the four-neighbour pattern. The pollution diffusion model is the one described in Section 2.2. Initially, two polluted points are randomly created in the river. These points hold an amount of pollution density as their state data. At every step, the system states are changed according to the transition function.

Figure 3.6: Illustrating a simulation of pollution diffusing in a river, following the model described in Section 2.2. It is initialized with two polluted points (black points).

The second scenario: Different sizes of cell networks are still taken into account. The two popular CA patterns (Von Neumann 1 and Moore 1) and different numbers of cycles are considered as well. The model is the same as in the previous case. The results are presented in Table 3.4. One of these runs is shown in Figure 3.6. The values in Table 3.4 indicate that increasing the number of cycles does not much affect the execution time. This can be explained by the fact that the transition functions are too simple to generate major differences.
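The two neighbourhood patterns compared in this scenario can be sketched as follows. This is an illustration on a regular rectangular grid, which is an assumption: the cell networks produced by PickCell follow the shape of the study region. The names `Pattern`, `offsets` and `build_links` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class Pattern { VonNeumann1, Moore1 };

// Relative (dx, dy) offsets of the radius-1 neighbourhoods.
std::vector<std::pair<int, int>> offsets(Pattern p) {
    std::vector<std::pair<int, int>> out;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;             // skip the cell itself
            bool diagonal = (dx != 0 && dy != 0);
            if (p == Pattern::VonNeumann1 && diagonal) continue;
            out.emplace_back(dx, dy);
        }
    return out;
}

// Neighbour ids of every cell of a width x height grid (row-major ids).
// Cells on the border simply have fewer neighbours.
std::vector<std::vector<int>> build_links(int width, int height, Pattern p) {
    assert(width > 0 && height > 0);
    const auto offs = offsets(p);
    std::vector<std::vector<int>> links(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (const auto& o : offs) {
                int nx = x + o.first, ny = y + o.second;
                if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                links[static_cast<std::size_t>(y * width + x)].push_back(ny * width + nx);
            }
    return links;
}
```

An interior cell has 4 links with Von Neumann 1 and 8 with Moore 1, so the loop over neighbours in the transition function does about twice as many reads per cell with Moore 1.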

Table 3.4: Measurement results. Time in seconds for 100, 1,000, 10,000, 100,000 and 1,000,000 cycles.

    Num. cells    Cell size (pixels)    CA pattern    100    1,000    10,000    100,000    1,000,000
    1,220         10x10                 VN
    1,220         10x10                 Moore
    10,703        5x5                   VN
    10,703        5x5                   Moore
    ,425          2x2                   VN
    ,425          2x2                   Moore
    83,661        2x2                   VN
    83,661        2x2                   Moore

Regarding CA patterns, for small networks the differences between Von Neumann 1 and Moore 1 are not remarkable. For larger networks, however, Von Neumann 1 is significantly faster. For example, with 10,000 cycles and a network of 83,661 cells, Von Neumann 1 takes 8.948 s, and Moore 1 is about 1.6 times slower, as shown in Figure 3.7.

Figure 3.7: The graph displays the increase of the gap between the two CA patterns with 10,000 cycles.

The third scenario: This scenario aims to show that the execution time also depends on the transition function. To do that, we slightly modified the previous version: at every step, each cell loses a random amount of its pollution density. The implementation is shown in Listing 3.2.

Listing 3.2: Transition function (version 2)

```cuda
__device__ NodeState computestate(NodeState *nowstate, int nodeindex, Canaux *channels, curandState *devstates)
{
    NodeState mystate;
    float losspercentage, receive, loss;
    int nbin, nodein;

    mystate = nowstate[nodeindex];
    /// Generating a random value in [ ] by generatenumber function
    losspercentage = generatenumber(devstates, nodeindex);
    /// Calculating the amount of loss
    loss = losspercentage * mystate.density;
    /// Getting number of neighbours
    nbin = channels[nodeindex].nbin;
    receive = 0;
    for (int i = 0; i < nbin; i++) {
        /// Getting id of the neighbour
        nodein = channels[nodeindex].read[i].node;
        receive = receive + ((nowstate[nodein].density / 2.0) / (float) channels[nodein].nbin);
    }
    /// Computing the new state; the density is never allowed to become negative
    mystate.density = (mystate.density / 2.0) + receive - loss;
    if (mystate.density < 0.0) {
        mystate.density = 0.0;
    }
    return mystate;
}
```

Figure 3.8 demonstrates the influence of the transition rules on the execution time. Version 2 is slower than version 1 because of its more complex behaviour. The additional time grows steadily with the size of the network.
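The extra work of version 2 is the random draw and the clamped subtraction. A minimal host-side sketch of that part is given below. The helper names are hypothetical, `std::mt19937` merely stands in for the per-thread generator state used on the device, and the range of the loss fraction is an assumption, since it is not recoverable from the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// New density of one cell in version 2, given what it received from its
// neighbours and the fraction of its own density that is lost this step.
float update_with_loss(float density, float receive, float loss_fraction) {
    assert(loss_fraction >= 0.0f);
    float loss = loss_fraction * density;
    // Same clamp as the listing: the density never becomes negative.
    return std::max(0.0f, density / 2.0f + receive - loss);
}

// Hypothetical stand-in for generatenumber: a uniform draw from 0 to max_fraction.
float draw_loss_fraction(std::mt19937& rng, float max_fraction) {
    assert(max_fraction > 0.0f);
    std::uniform_real_distribution<float> dist(0.0f, max_fraction);
    return dist(rng);
}
```

Because a cell keeps only half of its density, any loss fraction above one half needs the clamp unless enough density arrives from the neighbours.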

Figure 3.8: Comparing the execution time between the previous transition function (version 1) and the new one (version 2).

4 Distributed simulation with HLA

Simulations of large systems often face performance issues. The Cuda programming model can deal with those. However, the lack of interoperability between simulations poses a major challenge. Thus, the High Level Architecture (HLA) [12], [13], [20] standard is proposed as a solution to that new demand. With this standard, many sub-simulations can be distributed instead of developing one vast simulation. The integration of the Cuda model and the HLA leads to a hybrid solution in which several parallel simulations can be distributed over different computer systems. This chapter gives a brief description of the application of the HLA to parallel simulations.

4.1 Overview of The High Level Architecture (HLA)

The High Level Architecture (HLA) [12], [13], [20] is a standard for distributed simulations whose main goal is to support the interoperability and reusability of simulations. The HLA was developed by the United States Department of Defense (DoD) to facilitate the integration of distributed simulation models within an HLA environment. It allows a large-scale model to be divided into a number of manageable components while maintaining the interaction between them. Over the last years, the HLA has been deployed in a wide range of simulation application areas, including transportation and the manufacturing industry. However, it hardly appears in simulations of natural phenomena, especially in the area of climate change. The HLA is thus suggested in this project as a potential approach for composing parallel simulations.

Figure 4.1: HLA Federation.

In HLA terminology, the entire system is represented by a federation. Each simulator belonging to the federation is called a federate. A set of federates is connected via the Run Time Infrastructure (RTI). These federates can run on different platforms and are connected together by a network. In that case, the RTI can be viewed as a distributed operating system that interconnects the cooperating federates. Figure 4.1 describes the global architecture of an HLA simulation. Generally, the HLA specification defines:

- A set of rules: This describes the responsibilities of federates and their relationship with the RTI. There are ten rules. One of them is that all exchange of data among federates should occur via the RTI during a federation execution.
- An interface specification: The interface specification prescribes the interface between each federate and the Runtime Infrastructure (RTI), which provides communication services to the federates. The interface specification is divided into several main management areas:
  - Federation management: This includes the main tasks of creating federations, joining federates to federations, resigning federates from federations, and destroying federations.
  - Declaration management: This allows federates to publish and subscribe to class attributes and interactions through the RTI. Other federates can only subscribe to an attribute or an interaction once it has been published by the federate owning it.
  - Object management: This includes the tasks of creating objects and sending their updates to other federates.
  - Ownership management: The RTI allows federates to distribute the responsibility for updating and deleting object instances, with a few restrictions.
  - Time management: This focuses on implementing time management policies and negotiating time advances. This mechanism makes it possible to run several simulations concurrently.

- An Object Model Template (based on the OMT standard [14]): This component defines how information is communicated between federates, and how the federates and the federation have to be documented (using the Federation Object Model, FOM). The FOM defines the shared objects, attributes, and interactions for the whole federation. Two kinds of elements can be exchanged between federates:
  - An object: An entity that represents an actor playing in the simulation. It contains shared data that are created by a federate during the federation execution and persist until the object is destroyed. The FOM defines all object classes; an example is presented in Table 4.2. When a federate wants to publish or subscribe to an object, it must define that object compatibly in its FOM. Objects store their data in attributes.
  - An interaction: A broadcast message that any federate can send or receive. A publishing federate sends out an interaction to the federates that have subscribed to it. If no subscribing federate receives the interaction, the data it carries are lost. The FOM also defines all interaction classes. When a federate wants to publish or subscribe to an interaction, it must define that interaction compatibly in its FOM. Interactions carry data in parameters.

Figure 4.2: Illustrating, at a high level, the interplay between a federate and a federation.

Figure 4.2 depicts the interplay between a federate and a federation. Initially, a federate tries to create a federation, or to connect to an existing one on the RTI. It then specifies what data will be shared with other federates by using the publishing services. These published objects or interactions are available to all federates that are connected to the same federation. When a federate wants to send data to other federates, it has to register objects and call an update service. The data are then automatically reflected to the subscribers by the RTI. Releasing the allocated resources is always necessary at the end.

4.2 Time management in HLA

The RTI provides a variety of optional time management services. Understanding time management is important for controlling how events are exchanged between federates. Each federate manages its own logical time and communicates this time to the RTI. The RTI ensures the correct coordination of federates by advancing time coherently. In the discrete event simulation literature, logical time is equivalent to simulation time. It is used to make sure that federates observe events in the same order [19]. It helps to avoid problems such as causality violations, or different results from repeated executions of the simulation with the same input data. Logical time is not mapped to real time.

Time policies

According to the HLA time policies, each federate is involved in the progress of time. In some cases, it is necessary to map the progress of one federate to the progress of another. A federate needs to request the regulating policy to participate in the decision on the progress of time. A constrained federate follows the time progress imposed by other federates. In our approach, the logical times of the different federates have to be synchronized. Thus, both the regulating and the constrained policies are enabled, as shown in Table 4.1, so that the participating federates can exchange data with each other.

Time progress

The second part of the time management component provides a mechanism to advance the simulation time within each federate. There are two particular services that federates can invoke to request time advancement from the RTI.
The timeAdvanceRequest service is used to implement time-stepped federates; the nextEventRequest service is used to implement event-based federates. The granted time is delivered by the timeAdvanceGrant service. Generally, a time management cycle consists of three steps. First, a federate sends a request for time advancement. Next, the federate may receive ReflectAttributeValues callbacks. Finally, the RTI completes the cycle by invoking a federate-defined procedure called timeAdvanceGrant to indicate that the federate's logical time has been advanced.
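The request and grant cycle described above can be illustrated with a toy model. The sketch below is not the RTI API and does not reproduce the algorithm of any real RTI: `ToyFederate`, `lower_bound`, `request_advance` and `grant_ready` are hypothetical names, and the lookahead value is an assumption. It only shows the conservative idea that a constrained federate is granted a time once no regulating federate can still send an earlier message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct ToyFederate {
    double time = 0.0;       // current logical time
    double lookahead = 1.0;  // promise: nothing is sent earlier than base + lookahead
    bool regulating = true;
    bool constrained = true;
    bool pending = false;    // a time advance request is outstanding
    double requested = 0.0;
};

// Earliest time stamp that federates other than `self` may still send.
double lower_bound(const std::vector<ToyFederate>& feds, std::size_t self) {
    double bound = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < feds.size(); ++i) {
        if (i == self || !feds[i].regulating) continue;
        double base = feds[i].pending ? feds[i].requested : feds[i].time;
        bound = std::min(bound, base + feds[i].lookahead);
    }
    return bound;
}

// Step 1 of the cycle: the federate asks to advance to time t.
void request_advance(std::vector<ToyFederate>& feds, std::size_t self, double t) {
    assert(t >= feds[self].time);
    feds[self].pending = true;
    feds[self].requested = t;
}

// Step 3 of the cycle: grant every pending request that is safe.
// Returns the number of federates whose logical time was advanced.
int grant_ready(std::vector<ToyFederate>& feds) {
    int granted = 0;
    for (std::size_t i = 0; i < feds.size(); ++i) {
        if (!feds[i].pending) continue;
        if (!feds[i].constrained || feds[i].requested <= lower_bound(feds, i)) {
            feds[i].time = feds[i].requested;
            feds[i].pending = false;
            ++granted;
        }
    }
    return granted;
}
```

With two federates at time 0, a request by the first one to advance to time 2 stays pending until the second one has also requested time 2. This is the lock-step behaviour expected from time-stepped federates that are both regulating and constrained.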

Figure 4.3: The model of time advancement request used in this project.

Time synchronization

As presented in the previous sections, all simulations have to synchronize their local logical times to ensure causality. The constrained parameter and the regulating parameter are enabled for all simulations. The regulating parameter allows a federate to send its updates and interactions in causal order. The constrained parameter allows a federate to receive those updates and interactions from the RTI in that order. Since a passive visualization federate does not send any updates or interactions, it has no impact on the time advance of the federation. Therefore, for visualization only the constrained parameter needs to be enabled, and regulating can be switched off. Table 4.1 shows the time policies proposed for the case study (Section 2.3).

Table 4.1: Time management of the federation.

    Federate         Time constrained    Time regulating    Time advance
    Forest           Yes                 Yes                Time stepped
    River            Yes                 Yes                Time stepped
    WSN              Yes                 Yes                Time stepped
    Visualization    Yes                 Yes/No             Time stepped

To synchronize the activities of the federates participating in a federation, the RTI provides a mechanism for exchanging data between them. In this case, time stamps are associated with the exchanged data to coordinate the federates' activities. The RTI also allows federates to communicate explicit synchronization points. Figure 4.4 illustrates the synchronization between two federates, the river federate and the forest federate.

Figure 4.4: Federate synchronization.

First of all, one of the available federates, in this case the river federate, sends a synchronization request to the RTI. Then, the RTI sends a response to the river federate and afterwards announces the synchronization point to the other federates. Each federate uses a service to confirm that it has achieved the synchronization point. In the next part, some issues related to exchanging data are considered in the context of distributed simulations.

Exchanging data

Exchanging data between simulation federates is an important part of distributed systems. However, two questions arise: what kind of data must be shared, and where the communication takes place. The type of data exchanged is determined by the characteristics of the real systems and by the interoperability between them. In our case, there is communication between four federates: forest, river, WSN, and visualization. The forest federate sends its status to the river federate and the WSN federate. Meanwhile, these three federates need to provide their data to the visualization federate for the analysis of the results. To achieve this, the forest federate publishes its data (forest status and position) as an object class (ForestNode). The river federate and the WSN federate need to subscribe to it. In the same way, the river federate and the WSN federate publish the object classes RiverNode and WSNNode, respectively. These published classes have to be declared in the FOM of the federation, whose structure is indicated in Table 4.2.
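The publish and subscribe relationships described above can be sketched with a small in-memory model. This is not the RTI interface: `ToyBus` and `Update` are hypothetical names, and delivery is immediate instead of being ordered by time stamps. The sketch only shows who receives what once the classes have been declared.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Update {
    std::string object_class;  // e.g. "ForestNode"
    std::string attribute;     // e.g. "State"
    double value;
};

// Toy stand-in for the declaration and object management services.
class ToyBus {
public:
    void publish(const std::string& federate, const std::string& object_class) {
        publishers_[object_class].insert(federate);
    }
    void subscribe(const std::string& federate, const std::string& object_class) {
        subscribers_[object_class].insert(federate);
    }
    // Reflects the update to every subscriber of its class.
    // Returns the number of federates that received it.
    int update(const std::string& sender, const Update& u) {
        assert(publishers_[u.object_class].count(sender) == 1);  // publish first
        int delivered = 0;
        for (const std::string& f : subscribers_[u.object_class]) {
            if (f == sender) continue;
            inbox_[f].push_back(u);
            ++delivered;
        }
        return delivered;
    }
    const std::vector<Update>& inbox(const std::string& federate) {
        return inbox_[federate];
    }

private:
    std::map<std::string, std::set<std::string>> publishers_;
    std::map<std::string, std::set<std::string>> subscribers_;
    std::map<std::string, std::vector<Update>> inbox_;
};
```

In this model, an update of a ForestNode reaches the river, WSN and visualization federates, while an update of a RiverNode reaches only the visualization federate.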

Table 4.2: Objects and their attributes, publishers and subscribers.

    Object class    Attributes                     Published by    Subscribed by
    ForestNode      State, Position                Forest          River, WSN, Visualization
    RiverNode       Pollution density, Position    River           Visualization
    WSNNode         State, Position                WSN             Visualization

In some cases, it is also important to specify where data are exchanged between two federates, especially for very large physical systems. Indeed, it is often inefficient to send the entire data set via the RTI, because of network performance and local computation costs. Thus, we propose a solution for the general case of exchanging data between two adjacent systems. Two physical systems are adjacent when they share a common frontier or have some places in common. Sending unrelated information to other federates is useless. For example, as shown in Figure 2.3, new polluted points in the river can be caused by the ashes of a forest fire, and this only happens at the frontier of the two systems. The forest fire federate regularly sends its states to the river federate. The latter only needs the states of the points close to it, not the states of the entire forest. Sending the entire forest state would not only take time to transport between federates but also make the computation at the receiver side less efficient. A solution based on morphology theory [21] can be used to address that issue. It makes it possible to smooth the boundary of physical systems by applying basic operations such as erosion and dilation. To summarize, only the status data at the boundary of the forest are sent to the RTI.

4.3 Distributed physical simulation

This section presents an application of the HLA standard to unify several parallel simulations into what is called a mixed simulation. The study region is shown in Figure 2.3. The whole model was split into three simulation federates: forest fire spread, river pollution diffusion, and WSN.
The simulation federates were all implemented in accordance with both the Cuda programming model and the HLA standard. These parallel simulations are executed concurrently as three different simulators. Their models were presented in Section 2.2. In addition to these three federates, a fourth one, visualization, is designed as a supporting federate. An overview of the federation can be seen in Figure 4.5.
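The boundary restriction described earlier for data exchange can be sketched with a basic morphological erosion: the boundary of a region is the set of its cells that do not survive one erosion step. The code below is a minimal illustration on a rectangular boolean mask with a 4-neighbour structuring element. Both choices are assumptions, as are the names `Mask`, `inside`, `erode` and `boundary`; cells outside the grid are treated as not belonging to the region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mask = std::vector<std::vector<bool>>;  // true = cell belongs to the region

// True when (x, y) lies on the grid and belongs to the region.
bool inside(const Mask& m, int x, int y) {
    return y >= 0 && y < static_cast<int>(m.size()) &&
           x >= 0 && x < static_cast<int>(m[y].size()) && m[y][x];
}

// Erosion: a cell survives only if it and its four neighbours are in the region.
Mask erode(const Mask& m) {
    Mask out = m;
    for (int y = 0; y < static_cast<int>(m.size()); ++y) {
        assert(m[y].size() == m[0].size());  // rectangular mask expected
        for (int x = 0; x < static_cast<int>(m[y].size()); ++x)
            out[y][x] = inside(m, x, y) && inside(m, x - 1, y) && inside(m, x + 1, y) &&
                        inside(m, x, y - 1) && inside(m, x, y + 1);
    }
    return out;
}

// Boundary: region cells removed by the erosion.
Mask boundary(const Mask& m) {
    Mask eroded = erode(m);
    Mask out = m;
    for (std::size_t y = 0; y < m.size(); ++y)
        for (std::size_t x = 0; x < m[y].size(); ++x)
            out[y][x] = m[y][x] && !eroded[y][x];
    return out;
}
```

For a 5 x 5 block of forest cells, only the 16 cells of the outer ring are kept, so 9 of the 25 cell states would not need to be sent. The saving grows with the size of the region, since the boundary grows more slowly than the area.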

Figure 4.5: A structure for the proposed federation.

To recall the communication proposed in Section 2.3, the forest fire spread produces ashes, which result in new polluted points and dust in the river pollution diffusion at time t. The latter includes these new data in its model at time t+1. The communication depends on a specific condition. There is also communication between the WSN and the forest fire spread: the sensors regularly collect the forest status, since this is the goal of sensing. New information is sent to the observers. When a fire is detected, the observers raise emergency signals. To do that, the synchronization needs to be achieved as indicated in Table 4.1, and the shared data have to be declared as shown in Table 4.2. The FOM of the federation is represented by the cyber.fed file shown in Listing 4.1.

Listing 4.1: cyber.fed file

```
;; Cyber physical simulation
(Fed
  (Federation Cyber)
  (FedVersion v1.0)
  (Federate river Public)
  (Federate forest Public)
  (Federate wsn Public)
  (Federate visualization Public)
  (Objects
    (Class ObjectRoot
      (Attribute privilegeToDelete reliable timestamp)
      (Class RTIprivate)
      (Class ForestNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute State RELIABLE TIMESTAMP)
      )
      (Class RiverNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute Density RELIABLE TIMESTAMP)
      )
      (Class SensorNode
        (Attribute PositionX RELIABLE TIMESTAMP)
        (Attribute PositionY RELIABLE TIMESTAMP)
        (Attribute State RELIABLE TIMESTAMP)
      )
    )
  )
)
```

Forest fire spread federate

The model of this simulation federate was presented in Chapter 2. Some fires (red points) are randomly initialized in the forest. These fires spread according to the transition function and the CA pattern of the model. An example of the spreading is shown in Figure 4.6. The green, red, grey, and white points represent the tree, fire, ash, and empty states, respectively.

Figure 4.6: An example of simulating fire spread in the forest. The 4-neighbour pattern is used. The red colour represents burning trees and the grey colour the ashes formed by the fire.

The ashes form after some steps. These ashes are able to pollute the river, as shown in Figure 4.7.

River pollution diffusion federate

The model of pollution diffusion in the river was also presented in Chapter 2. Initially, some polluted points are randomly generated in the river. During the diffusion, the federate continuously receives status data about the fire from the forest federate via the RTI. It checks these data to determine whether the ashes will pollute some river cells. This depends on a specific condition: for each river cell, if the distance to an ash cell is less than or equal to a specified threshold, the pollution density received by the river cell decreases in inverse proportion to that distance. The RTI sends the data to the river federate as soon as it receives an update call from the forest federate. The update call is only issued when ashes are present within the scope of the forest boundary.

Figure 4.7: A result obtained from the visualization federate. It demonstrates the exchange of data between the two simulations via the RTI. The two regions marked with red circles represent the new pollution created by the ashes, which are formed by the forest fire after 4 steps.

WSN federate

The model of the WSN was also introduced in Chapter 2. At every time step, the nodes receive the data from the forest federate via the RTI and only consider the cells within their sensing range. If a node detects a fire, it forwards that information to an observer for decision making. Signals are raised when the fire is recognized. As depicted in Figure 4.8, the red rings indicate that fires have been detected at those sensors.

Visualization federate

The viewer federate is based on 2D visualization with the X Window System. As mentioned above, it first subscribes to all the necessary data published by the other federates. The aim is to provide an overview of the results, as shown in Figure 4.8. Initially, the background of the viewer is drawn from the visible data extracted from the PickCell tool. During the federation execution, this federate receives data from the others and updates the view at every step.
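The distance tests used by the river federate (ash within a threshold) and by the WSN federate (fire within the sensing range) can be sketched in the same way. The code below is an illustration under stated assumptions: Euclidean distance between cell centres, a hypothetical `amount` parameter for the pollution contributed at distance 1, and a contribution of `amount / max(d, 1)` as one possible reading of "inverse proportion". None of these details is given in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Cell { double x; double y; };  // cell centre, in cell units (assumed)

double cell_distance(const Cell& a, const Cell& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// WSN-style test: is any source cell (e.g. a fire) within `range` of the probe?
bool within_range(const Cell& probe, const std::vector<Cell>& sources, double range) {
    for (const Cell& s : sources)
        if (cell_distance(probe, s) <= range) return true;
    return false;
}

// River-style coupling: pollution received by one river cell from the ash
// cells closer than `threshold`; farther ash cells contribute nothing.
double ash_pollution(const Cell& river, const std::vector<Cell>& ashes,
                     double threshold, double amount) {
    assert(threshold >= 0.0 && amount >= 0.0);
    double added = 0.0;
    for (const Cell& ash : ashes) {
        double d = cell_distance(river, ash);
        if (d <= threshold) added += amount / std::max(d, 1.0);  // inverse to distance
    }
    return added;
}
```

Since only ash cells near the river can pass the threshold test, this rule works unchanged when the forest federate sends boundary cells only.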

A case study

This section describes one run of the federation. Initially, one federate creates a federation on the RTI and waits for the other federates to participate. Each of the other federates connects to that federation and also waits until the last one has joined. The first federate then sends a request to the others to achieve a synchronization point. Once the other federates have responded, the synchronization point is achieved, and all federates advance in time together. At each time step, these federates exchange data with each other via the RTI. Figure 4.8 presents the results captured from the visualization federate.

Figure 4.8: Illustrating the interoperability between the four federates via the RTI.

Simulation tools

Along with the PickCell tool, which is developed at the LabSTICC laboratory, the open source software CERTI [18] was used in this project. The CERTI RTI supports the HLA 1.3 specification (C++ and Java). The X Window System was used to display the results of the simulation federates.
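The start-up sequence of the case study relies on a synchronization point that is reached only when every joined federate has confirmed it. The following toy model sketches that bookkeeping. `ToySyncPoint` and the label used in the example are hypothetical, and the class does not correspond to any RTI service signature.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Tracks which of the expected federates have confirmed a synchronization point.
class ToySyncPoint {
public:
    ToySyncPoint(std::string label, std::set<std::string> federates)
        : label_(std::move(label)), expected_(std::move(federates)) {}

    // Called once a federate reports that it has reached the point.
    void achieve(const std::string& federate) {
        assert(expected_.count(federate) == 1);  // only joined federates may confirm
        achieved_.insert(federate);
    }

    // The federation is synchronized when every expected federate has confirmed.
    bool synchronized() const { return achieved_.size() == expected_.size(); }

    const std::string& label() const { return label_; }

private:
    std::string label_;
    std::set<std::string> expected_;
    std::set<std::string> achieved_;
};
```

For a point registered for the forest, river, WSN and visualization federates, `synchronized()` stays false after three confirmations and becomes true with the fourth, which is when the time-stepped execution can start.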


More information

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini

Data Warehousing. Ritham Vashisht, Sukhdeep Kaur and Shobti Saini Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 6 (2013), pp. 669-674 Research India Publications http://www.ripublication.com/aeee.htm Data Warehousing Ritham Vashisht,

More information

GPU Computing: Development and Analysis. Part 1. Anton Wijs Muhammad Osama. Marieke Huisman Sebastiaan Joosten

GPU Computing: Development and Analysis. Part 1. Anton Wijs Muhammad Osama. Marieke Huisman Sebastiaan Joosten GPU Computing: Development and Analysis Part 1 Anton Wijs Muhammad Osama Marieke Huisman Sebastiaan Joosten NLeSC GPU Course Rob van Nieuwpoort & Ben van Werkhoven Who are we? Anton Wijs Assistant professor,

More information

Technical Brief. AGP 8X Evolving the Graphics Interface

Technical Brief. AGP 8X Evolving the Graphics Interface Technical Brief AGP 8X Evolving the Graphics Interface Increasing Graphics Bandwidth No one needs to be convinced that the overall PC experience is increasingly dependent on the efficient processing of

More information

Top-Level View of Computer Organization

Top-Level View of Computer Organization Top-Level View of Computer Organization Bởi: Hoang Lan Nguyen Computer Component Contemporary computer designs are based on concepts developed by John von Neumann at the Institute for Advanced Studies

More information

ACCELERATED COMPLEX EVENT PROCESSING WITH GRAPHICS PROCESSING UNITS

ACCELERATED COMPLEX EVENT PROCESSING WITH GRAPHICS PROCESSING UNITS ACCELERATED COMPLEX EVENT PROCESSING WITH GRAPHICS PROCESSING UNITS Prabodha Srimal Rodrigo Registration No. : 138230V Degree of Master of Science Department of Computer Science & Engineering University

More information

A Road Marking Extraction Method Using GPGPU

A Road Marking Extraction Method Using GPGPU , pp.46-54 http://dx.doi.org/10.14257/astl.2014.50.08 A Road Marking Extraction Method Using GPGPU Dajun Ding 1, Jongsu Yoo 1, Jekyo Jung 1, Kwon Soon 1 1 Daegu Gyeongbuk Institute of Science and Technology,

More information

A Review on Cache Memory with Multiprocessor System

A Review on Cache Memory with Multiprocessor System A Review on Cache Memory with Multiprocessor System Chirag R. Patel 1, Rajesh H. Davda 2 1,2 Computer Engineering Department, C. U. Shah College of Engineering & Technology, Wadhwan (Gujarat) Abstract

More information

APNIC input to the Vietnam Ministry of Information and Communications ICT Journal on IPv6

APNIC input to the Vietnam Ministry of Information and Communications ICT Journal on IPv6 APNIC input to the Vietnam Ministry of Information and Communications ICT Journal on IPv6 April 2013 Question One Since APNIC formally announce that Asia Pacific was the first region on the world coming

More information

Mobile Cloud Multimedia Services Using Enhance Blind Online Scheduling Algorithm

Mobile Cloud Multimedia Services Using Enhance Blind Online Scheduling Algorithm Mobile Cloud Multimedia Services Using Enhance Blind Online Scheduling Algorithm Saiyad Sharik Kaji Prof.M.B.Chandak WCOEM, Nagpur RBCOE. Nagpur Department of Computer Science, Nagpur University, Nagpur-441111

More information
