Forecasting Model for Network Throughput of Remote Data Access in Computing Grids

Volodimir Begy (CERN, Geneva, Switzerland), Martin Barisits (CERN, Geneva, Switzerland), Mario Lassnig (CERN, Geneva, Switzerland), Erich Schikuta (University of Vienna, Vienna, Austria)

ATL-SOFT-PROC /09/2018

Copyright 2018 CERN for the benefit of the ATLAS Collaboration. Reproduction of this article or parts of it is allowed as specified in the CC-BY-4.0 license.

Abstract: Computing grids are one of the key enablers of eScience. Researchers from many fields (e.g. High Energy Physics, Bioinformatics or Climatology) employ grids to run computational jobs in a highly distributed manner. The current state-of-the-art approach for data access in the grid is data placement: a job is scheduled to run at a specific data center, and its execution starts only when the complete input data has been transferred there. This approach has two major disadvantages: (1) the jobs stay idle while waiting for the input data; (2) due to the limited infrastructure resources, the distributed data management system handling the data placement may queue the transfers for up to several days. An alternative approach is remote data access: a job may stream the input data directly from storage elements, which may be located at local or remote data centers. Remote data access brings two innovative benefits: (1) the jobs can be executed asynchronously with respect to the data transfer; (2) when combined with data placement on the policy level, it may help to optimize the network load grid-wide, since these two data access methodologies partially exhibit non-overlapping bottlenecks. However, in order to employ such a technique systematically, the properties of its network throughput need to be studied carefully. This paper presents the results of an experimental identification of parameters influencing the throughput of remote data access, a statistically tested mathematical formalization of these parameters, and, finally, an Auto-Regressive Integrated Moving Average (ARIMA) based forecasting model. In the context of a given virtual link between a storage element and a worker node, the model predicts the future relationship between the amount of data to be transferred and the amount of concurrent traffic as the independent variables and the transfer time as the dependent variable. The purpose of the model is to assist various stakeholders of the grid in decision-making related to data access. A benchmark of the forecasting performance of Long Short-Term Memory Artificial Neural Networks (LSTM ANN) is also presented. This work is based on measurements taken on the World-Wide LHC Computing Grid (WLCG), which is employed by the ATLAS experiment at CERN in the domain of High Energy Physics.

Index Terms: Grid Computing, Remote Data Access, Network Modeling, Time Series Forecasting, Machine Learning

I. MOTIVATION

Computing grids [1] are collections of geographically sparse data centers. Employing a wide range of heterogeneous hardware, the participating facilities work together to reach a common goal in a coordinated manner. Grids store vast amounts of scientific data, and numerous users run computational jobs to analyze these data in a highly distributed fashion. For example, within the World-Wide LHC Computing Grid (WLCG) [2] more than 150 computing sites are employed by the ATLAS experiment [3] at CERN. WLCG stores more than 360 petabytes of ATLAS data, which is used for distributed analysis by more than 5000 users.
In order to achieve a modular architecture, the grid resources are typically divided into three major classes: storage elements, worker nodes and the network. The function of storage elements is to replicate and persist the data for the long term using various technologies (e.g. hard disks, solid state drives or tapes). The worker nodes, on the other hand, perform CPU-intensive tasks. WAN and LAN networks facilitate the data transfer between storage elements and worker nodes. This strict division of labour has led to the current best practice for data access in the grid being data placement [4]. Data placement is performed by dedicated distributed data management systems. Consider a job scheduled to run at the data center DC1, while the required input replica is located at the data center DC2. The job may commence its execution only after the completion of the following workflow: (1) the input replica is transferred from the relevant storage element at DC2 to a storage element at DC1; (2) the input replica is staged in from the relevant storage element at DC1 to the worker node. An alternative approach is remote data access, which allows a job to stream the input replica directly from a local or remote storage element. To demonstrate the potential of remote data access to optimize resource utilization when combined with traditional data placement on the policy level, consider the following two scenarios. The first example is depicted in Fig. 1. A computational job is submitted to the worker node at the data center DC1. The job requires the input replica R, which is persisted in the remote storage element at the data center DC2. Currently, the storage element in the local data center DC1 has a high load because of transfers to various other data centers. Assume that the bottleneck is only present at this local storage element. This is recognized by the distributed data management system, and it queues the transfer of the input replica R. Employing remote data access, the bottleneck can be bypassed immediately. In another example, the input replica is located at the local storage element. However, the worker node does not have enough disk space available for stage-in.

Fig. 1. Bottleneck at the local storage element.

The distributed data management system queues the file transfer until the required disk space becomes available. In contrast, using remote data access to the local storage element, the input data can be streamed in fractions without persisting the complete input file. Taking into account the magnitude of real-life computing grids, it becomes obvious that many more potential scenarios would benefit from remote data access in the context of resource utilization. This has motivated us to study the properties of its network throughput. Furthermore, a forecasting model for such network throughput is needed for the planning and coordination of scientific workflows.

II. STATE OF THE ART

The state-of-the-art approach for data access in computing grids is data placement. An example of a distributed data management system facilitating inter-storage-element data placement and stage-in is Rucio [5]. Besides data placement, Rucio is also responsible for data cataloging and replication. In its deployment by the ATLAS experiment at CERN, Rucio is coupled with the File Transfer Service (FTS) and the workload management system PanDA [6]. However, the modular architecture allows for a flexible integration with other grid frameworks. Due to its horizontally scalable architecture and high portability, it currently handles about 2 million file transfers per day, equivalent to 2 petabytes, among more than 150 data centers. Another example of a data placement framework is Stork [7]. The authors propose to strictly differentiate between data placement and computational activities within the workflows, enabling their asynchronous execution. Stork is interoperable with workflow management systems (e.g. DAGMan [8]), supports various data transfer protocols (e.g. HTTP, GridFTP [9]) and allows the load on storage elements and network links to be controlled. There is an ongoing effort at CERN to enable grid-wide remote data access in the WLCG based on the WebDAV protocol [10]-[12]. Currently, it is deployed only at a fraction of the data centers, allowing analysis tasks to access local storage elements. An open issue is the thorough monitoring and logging of such file accesses. Our research on remote data access network throughput is intended to help enable more advanced data access policies. Similar research on network throughput forecasting is presented in [13]. The authors present a methodology for the derivation of short-term predictions based on regression. The effect of concurrent traffic is not explicitly investigated in that work, and the experiments are performed with data probes of up to 16 megabytes. In contrast, our model delivers long-term predictions, taking into account the concurrent traffic of parallel threads and processes, and our work is based on experiments with concurrent file transfers totaling up to tens of gigabytes.

III. PARAMETERS INFLUENCING REMOTE DATA ACCESS THROUGHPUT

The first phase of this study is the experimental identification of the crucial parameters impacting remote data access throughput. In our experiment the throughput is represented indirectly through the transfer time for a given file size. Note that these quantities can be easily converted into one another. The hardware of the communicating hosts is a black box; the properties of the hosts are represented exclusively by the observed network throughput. A single computational job is equivalent to a process. Within this process, each requested input file is transferred by an individual thread.
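Since the study works with transfer times while throughput is often the quantity of interest, the conversion between the two is direct. A minimal sketch follows; the function names and units are illustrative, not taken from the paper.

```python
# Units are illustrative: file size in megabytes, transfer time in seconds.

def throughput_mb_per_s(size_mb: float, transfer_time_s: float) -> float:
    """Average throughput observed for a single transfer."""
    return size_mb / transfer_time_s

def transfer_time_for(size_mb: float, rate_mb_per_s: float) -> float:
    """Expected transfer time of a file at a given average throughput."""
    return size_mb / rate_mb_per_s
```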
The null hypothesis states that for a computational job streaming a given remote file, the transfer time T cannot be linearly regressed onto the following independent variables: (1) S, the size of the original file; (2) ConTh, the cumulative concurrent traffic originating from other threads in the same process during the streaming of the original file; (3) ConPr, the cumulative concurrent traffic originating from other processes on the same worker node during the streaming of the original file. The alternative hypothesis states that such a regression represents the true relationship between the mentioned independent and dependent variables. The independent variable ConPr stands for traffic which is created as part of the study. Obviously, other concurrent traffic is present on the investigated links, but it is not feasible to monitor all the traffic in the grid. Thus, only the additional effect imposed on the grid by the given bag of jobs is measured. It will be shown below how other dynamic aspects of the grid are implicitly encapsulated in the forecasting model. To sample the data, 12 computational jobs are launched on a single worker node at the CERN data center (Switzerland). In 15-minute steps the processes initiate remote accesses to the storage element GRIF-LPNHE SCRATCHDISK at the GRIF-LPNHE data center (France) in the following order:
1) Process 1: launch 1 thread to stream 300MB

2) Process 1: launch 1 thread to stream 1GB
3) Process 1: launch 1 thread to stream 3GB
4) Processes 1-4: each launch 1 thread to stream 300MB per thread
5) Process 1: launch 2 threads to stream 300MB per thread
6) Processes 1-4: each launch 1 thread to stream 1GB per thread
7) Process 1: launch 2 threads to stream 1GB per thread
8) Processes 1-4: each launch 1 thread to stream 3GB per thread
9) Process 1: launch 2 threads to stream 3GB per thread
10) Processes 1-12: each launch 1 thread to stream 300MB per thread
11) Process 1: launch 4 threads to stream 300MB per thread
12) Processes 1-12: each launch 1 thread to stream 1GB per thread
13) Process 1: launch 4 threads to stream 1GB per thread
14) Processes 1-12: each launch 1 thread to stream 3GB per thread
15) Process 1: launch 4 threads to stream 3GB per thread

Each launched thread is treated as an observation in the final dataset. The drawing interval of 15 minutes was chosen to reduce the probability of overlapping traffic among neighboring drawing steps; in the end, such overlaps did not occur. In each observation, the variable S takes on one of the predefined values (meaning that all transfers terminated successfully). Based on the experimental design it is known which concurrent threads and processes belong to which drawing step. Over a sampling period of more than 2.5 days we have repeatedly drawn the transfer time T for all the mentioned kinds of threads. Given a single observation x, its value for the variable ConTh is approximated in the following way. First, note down the drawn transfer time T of the observation x. Next, note down the set of all concurrent threads originating from the same process in the same drawing step; denote this set by xth. Next, iterate over all the elements of xth, denoting the current element by y. If the drawn transfer time T of the thread x is greater than or equal to the drawn transfer time T of the thread y, then add the predefined value of S of y to the ConTh of x. Otherwise, calculate the fraction f by which the transfer times overlap and add f * (S of y) to the ConTh of x. The value of the variable ConPr is approximated by an analogous algorithm, based on the drawn transfer times of concurrent processes. After such sampling and transformations we derive 1122 observations, each having values for the variables T, S, ConTh and ConPr. The protocols used for remote data access are WebDAV/HTTP [14]. Note that such an intrusive sampling design was necessary, because currently these metrics are not logged. Once proper remote data access monitoring is implemented grid-wide, the observations can be mined from logs unintrusively. A multiple linear regression is fit to the sampled data using ordinary least squares, resulting in an equation of the form

T = a * S + b * ConTh + c * ConPr   (1)

where a, b and c are the fitted coefficients. The fit has an F-statistic of 4.549e+04 on 3 degrees of freedom and 1119 residual degrees of freedom, and a p-value of <2.2e-16. Thus, the null hypothesis is rejected. The coefficient of S shows that files of various sizes are transferred with various throughputs. Also note that the coefficient of ConTh is about 34.4 times greater than the coefficient of ConPr. This means that concurrent traffic among threads within one process is penalized to a much higher extent in terms of throughput than concurrent traffic among different processes. To check for interaction between the variables ConTh and ConPr, two reduced multiple linear regression models are fit on accordingly adjusted datasets using ordinary least squares.
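As an illustration of the sampling transformation and the regression just described, the following minimal Python sketch computes ConTh via the overlap rule and indicates how the no-intercept fit of Eq. (1) could be reproduced. The helper name, the example numbers and the use of pandas/statsmodels are assumptions made for illustration, not the tooling used in the paper.

```python
# Each observation is a dict with the predefined file size S (in MB) and the
# drawn transfer time T (in seconds); threads of one drawing step are assumed
# to start simultaneously.
import pandas as pd
import statsmodels.formula.api as smf

def approx_con_th(obs, peers):
    """Approximate the cumulative concurrent-thread traffic for one observation."""
    total = 0.0
    for y in peers:
        if obs["T"] >= y["T"]:
            total += y["S"]                        # peer finished first: its full size counts
        else:
            total += (obs["T"] / y["T"]) * y["S"]  # only the overlapping fraction counts
    return total

# Hypothetical drawing step: two threads of one process, no concurrent processes.
threads = [{"S": 2000.0, "T": 21.5}, {"S": 300.0, "T": 5.6}]
rows = [{"T": x["T"],
         "S": x["S"],
         "ConTh": approx_con_th(x, [y for y in threads if y is not x]),
         "ConPr": 0.0}
        for x in threads]
observations = pd.DataFrame(rows)

# On the full table of 1122 observations, the no-intercept fit of Eq. (1) could
# be obtained with ordinary least squares, e.g.:
# fit = smf.ols("T ~ 0 + S + ConTh + ConPr", data=observations).fit()
# print(fit.params, fit.fvalue, fit.f_pvalue)
```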
The first reduced model is fit on observations for which ConPr is equal to 0; the second reduced model is fit on observations for which ConTh is equal to 0. The resulting models are shown below:

T = a * S + b * ConTh   (2)

which is statistically significant, having an F-statistic of 3.046e+04 on 2 degrees of freedom and 340 residual degrees of freedom and a p-value of <2.2e-16, and

T = a * S + c * ConPr   (3)

which is statistically significant with an F-statistic of 1.978e+04 on 2 degrees of freedom and 832 residual degrees of freedom and a p-value of <2.2e-16. These fits are depicted in Fig. 2 and 3. Note how the coefficient of ConTh has changed in the reduced model by only -1.7% in relation to the full model. The coefficient of ConPr has changed by 2.5%. Among all 3 regression fits, the coefficient of S has changed by at most 9.7% in absolute value. This is a hint that in the true relationship there is no interaction between the variables ConTh and ConPr. Also note that Fig. 2 and 3 exhibit a few outliers and a certain amount of spread along the axis of the dependent variable for observations that are similar in terms of the independent variables. This is to be expected, because the data was sampled over a period of more than 2.5 days, and computing grids are highly dynamic systems. It will be shown below how our forecasting approach handles this dynamic behavior. It is evident from the figures that once similar observations are averaged over the sampling period, the linear fit remains a good approximation of the true relationship. Thus, this finding generalizes well.

IV. THROUGHPUT FORECASTING MODEL

Our forecasting approach is based on the demonstrated fact that the transfer time T can be regressed onto the variables S, ConTh and ConPr. Consider the time series plot in Fig. 4,

which shows transfer times for 2 threads, transferring 2 gigabytes and 300 megabytes respectively, within one computing job over the sampled time window. The observations are drawn once every 30 minutes. The computing job runs in the worker node pool ANALY AUSTRALIA and streams the data from the local storage element AUSTRALIA-ATLAS SCRATCHDISK. Once per day, the job gets resubmitted to a new worker node in the pool, which is selected by the workload management system. This is done to emulate the real-world workflow during job submission.

Fig. 2. Regression fit T~0+S+ConTh.
Fig. 3. Regression fit T~0+S+ConPr.
Fig. 4. Australia: transfer times.

It is obvious from the daily shifts along the y-axis that various worker nodes exhibit either very different or very similar throughputs. Consider, e.g., the throughput during the first and the seventh drawing days (here, 1 day is equal to 48 points). Even though there are multiple days between these intervals, and even though the transfers run on two different machines during these days (00:1B:21:8F:1F:44 and 00:1B:21:90:0C:60), the throughput is nearly constant. We assume that the large shifts are caused by factors such as significantly different hardware or network loads. For more advanced analytics, worker nodes with similar throughputs can be clustered together, and based on the number of occurrences of each cluster in a time window of interest, a probability distribution can be derived. This probability distribution can be useful for scheduling, since it gives an insight into what fractions of jobs in a computational campaign are likely to be submitted to which families of worker nodes. Let a, b and c denote the coefficients of the regression

T ~ 0 + a * S + b * ConTh + c * ConPr   (4)

From Fig. 4 it is evident that the values of the coefficients vary over time. The sampled multivariate time series can be transformed into a multivariate time series of the coefficients a and b. The reduced regression is written as

T ~ 0 + a * S + b * ConTh   (5)

Let T1 and T2 be the sampled transfer times for the 2GB and 300MB files respectively. Since the transfer of the 300MB file always terminates first in our data, the coefficients a and b can be calculated for each time step using the following system of equations:

T1 = 2000 * a + 300 * b
T2 = 300 * a + (2000 * (T2 / T1)) * b   (6)

After substitution we get:

a = (17 * T1 * T2) / (100 * (400 * T2 - 9 * T1))   (7)

and

b = (T1 - 2000 * a) / 300   (8)

Note that we did not sample data which would allow us to reconstruct the time series for the coefficient c. This is due to the previously demonstrated fact that a vast amount of concurrent traffic among processes is required in order to reach a noticeable effect; such sampling would be highly intrusive. Also note that the complete multivariate time series for the coefficients a, b and c can be mined unintrusively from logs, once proper remote data access monitoring is implemented. We have transformed the sampled time series of transfer times from Fig. 4 into time series of the coefficients a and b. Next, for forecasting purposes, we have normalized it with respect to the daily shifts among clusters of worker nodes. This normalization can be reversed. The rest of the variance in the normalized data comes from factors which are not exactly known due to the high complexity and dynamic behavior of the grid. We assume that the remaining variance is determined by the unexplained behavior of the storage elements and network links. Finally, we have simulated the variate for the coefficient c, based on the previous demonstration that in the case of remote data access from the CERN to the GRIF-LPNHE data center, its value on average was 34.4 times smaller than that of the coefficient b. The resulting time series is presented in Fig. 5.

Fig. 5. Australia: normalized coefficients.

We have also performed an analogous sampling and analysis for the storage element - worker node pool pair BNL-OSG2 SCRATCHDISK - ANALY BNL LONG. The transfer time and coefficient time series are shown in Fig. 6 and 7.

Fig. 6. BNL: transfer times.

Note that in infrequent instances in Fig. 7 the values of the coefficients are negative. This is due to the fact that the time series is normalized with respect to the first drawing day. Obviously, these observations can be transformed or cleaned in accordance with the specific analysis requirements. The demonstrated multivariate coefficient time series encapsulates all the unknown factors which are associated with the high complexity and dynamic behavior of the grid. Using time series forecasting methods, its future values can be derived. Thus, it allows us to calculate the normalized transfer time (and thus throughput) given arbitrary amounts for S, ConTh and ConPr, at any past, present or forecasted time step. For advanced analytics, based on the probability distribution representing job assignment to various clusters of worker nodes and the inverse transform for neutralization of the initial normalization, concrete transfer times can be estimated. However, such estimation will only make sense for bags of jobs. In this case the assignment of numerous jobs to various clusters of worker nodes will lend itself to a meaningful probabilistic analysis.

A. Input Features of Forecasting Model

The forecasting model individually predicts the 3 variates of the multivariate coefficient time series. When forecasting a single variate, the following features are considered as inputs to the forecasting model:
1) Lagged values of the variate itself.
2) Lagged values of the 2 other variates.

3) Slightly lagged values of the time series representing the overall traffic at the data center of interest.

Consider Fig. 8 and 9, which depict the cross-correlation functions (CCF) between the normalized coefficients a and b sampled at the data centers Australia-ATLAS and BNL-ATLAS.

Fig. 7. BNL: normalized coefficients.
Fig. 8. Australia: CCF between normalized coefficients a and b.
Fig. 9. BNL: CCF between normalized coefficients a and b.

It is evident from the CCF plots that the values are highly symmetric around the origin. Thus, we do not use the other variates as input features to predict a given variate, since such input features would not bring any additional valuable information for model construction. Without a posteriori knowledge, a plausible hypothesis is that the time series of the overall network load at the data center of interest leads the amplitude of the coefficient time series with a positive cross-correlation at small negative lags (i.e. if the amount of overall traffic increases, the time to transfer an additional unit will increase shortly after). From the logs recorded by the distributed data management system Rucio, we have mined 4 different traffic time series for the period of interest at the data center Australia-ATLAS: (1) intra-data center traffic in bytes; (2) inter-data center traffic in bytes; (3) the sum of intra- and inter-data center traffic in bytes; (4) the number of intra-data center file accesses. It can be seen from the cross-correlation plots in Fig. 10 that none of the 4 mentioned traffic time series significantly leads the normalized coefficient a at small lags with positive cross-correlation. Due to the symmetric CCF between the coefficients a and b, this result is also present in the analysis of the CCF between the mentioned traffic categories and the normalized coefficient b. The same analysis delivers no meaningful results at the BNL-ATLAS data center. Thus, the overall traffic time series is not used as an input feature for the prediction of the coefficients.

Fig. 10. Australia: CCF between traffic time series and normalized coefficient a.
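A minimal sketch of such a cross-correlation check is given below, assuming the two normalized coefficient series are available as equal-length NumPy arrays. The function is a simple sample estimate written for illustration; it is not the tool used to produce the figures.

```python
# At lag k the function correlates a[t] with b[t - k], so a peak at negative k
# would indicate that b leads a under this convention.
import numpy as np

def cross_correlation(a, b, max_lag=48):
    """Simple sample CCF estimate of a and b for lags -max_lag..max_lag."""
    a = (np.asarray(a, dtype=float) - np.mean(a)) / np.std(a)
    b = (np.asarray(b, dtype=float) - np.mean(b)) / np.std(b)
    n = len(a)
    ccf = {}
    for k in range(-max_lag, max_lag + 1):
        x = a[max(0, k):n + min(0, k)]
        y = b[max(0, -k):n - max(0, k)]
        ccf[k] = float(np.mean(x * y))
    return ccf

# Symmetry around the origin can be checked by comparing ccf[k] and ccf[-k]
# for small k.
```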

Finally, only the lagged values of the variate of interest will be used as inputs to predict its future values. As previously mentioned, we have not sampled authentic data for the derivation of the coefficient c. We assume that in the demonstrated scenarios the CCF between c and the other variates exhibits an analogous symmetric behavior.

B. Benchmark of Forecasting Methods

The time series of the normalized coefficient a at the Australia-ATLAS (Fig. 5) and BNL-ATLAS (Fig. 7) data centers are used for demonstrational purposes in this paper. Due to the shown symmetric CCF between the coefficients a and b, it is known that the coefficient b exhibits an analogous behavior; we assume that so does the coefficient c. The previously presented time series for the data centers Australia-ATLAS and BNL-ATLAS are each split into 16 datasets, each spanning a period of 2 drawing days. 80% of each dataset is used for the construction of the model, and the remaining 20% is used for its evaluation, with the normalized root mean squared error (NRMSE) as the metric of choice. Thus, we are forecasting the series for the next 9.5 hours. The following methods are benchmarked in the context of coefficient time series forecasting: (1) Auto-Regressive Integrated Moving Average (ARIMA) [15]; (2) Long Short-Term Memory Artificial Neural Networks (LSTM ANN) [16]. ARMA is a statistical state-of-the-art approach for deriving a mathematical model that describes the time series of interest in terms of its past values (AR part) and errors (MA part). ARMA models work with stationary time series. ARIMA, the integrated version of ARMA, can handle time series which become stationary after a certain degree of differencing. The model parameters are identified based on the analysis of the (partial) auto-correlation function and stationarity tests. Consider Fig. 11, which depicts the auto-correlation functions of the normalized coefficients a and b at both data centers. It is evident from the figure that all the time series have a seasonal component with a period of 48 points, which is equivalent to a day.

Fig. 11. ACF of normalized coefficients a and b.

To fit ARIMA models we use the auto.arima() function from R [17], which estimates the model parameters based on a number of statistical tests. When passing a time series object to the function, we indicate the discovered seasonality. Long Short-Term Memory neural networks are a kind of recurrent neural network. LSTM networks are optimized for learning significant information at arbitrarily extended lags. They have been used for time series forecasting by researchers in different domains, e.g. for traffic speed prediction [18] and the detection of faults in industrial data [19]. It has also been demonstrated at the recent time series forecasting competition CIF 2016 [20], hosted by IEEE, that LSTM delivers outstanding results, outperforming common statistical and soft computing methods. In our work we use the Keras API [21] to construct the neural network instances. The input to the network is one-dimensional, accepting a single time step from the time series. The network has a single hidden layer consisting of an LSTM cell. The output layer has a size of 19. Thus, the model is trained to do a one-shot 19-step-ahead forecast given a single input observation. The training lasts 1000 epochs. The hyperparameters are chosen based on initial tuning. Before training, first-order differencing and then scaling to the interval [-1;1] are applied to the data.
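A minimal sketch of the LSTM forecaster described above is given below. The architecture (single LSTM hidden layer, 19-unit output, 1000 training epochs, first-order differencing and scaling to [-1;1]) follows the text, while the number of LSTM units, the batch size and the use of the TensorFlow Keras API in Python are illustrative assumptions not specified in the paper.

```python
# `series` is assumed to be a 1-D NumPy array holding one normalized
# coefficient series.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

HORIZON = 19  # one-shot 19-step-ahead forecast

def make_supervised(series, horizon=HORIZON):
    """First-order differencing, scaling to [-1, 1], and pairing each single
    time step with the following `horizon` values."""
    diffed = np.diff(np.asarray(series, dtype=float))
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaled = scaler.fit_transform(diffed.reshape(-1, 1)).ravel()
    x, y = [], []
    for i in range(len(scaled) - horizon):
        x.append(scaled[i])
        y.append(scaled[i + 1:i + 1 + horizon])
    x = np.array(x).reshape(-1, 1, 1)  # (samples, timesteps=1, features=1)
    return x, np.array(y), scaler

def build_model(units=16):
    model = Sequential([
        LSTM(units, input_shape=(1, 1)),  # single hidden layer with an LSTM cell
        Dense(HORIZON),                   # one-shot multi-step output
    ])
    model.compile(loss="mse", optimizer="adam")
    return model

# x, y, scaler = make_supervised(series)
# model = build_model()
# model.fit(x, y, epochs=1000, batch_size=1, verbose=0)
```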
In comparison to ARIMA, LSTM has the following benefits: LSTM allows framing the learning problem in a flexible manner; e.g. the specified shape of the output layer tailors the training procedure towards delivering forecasts of the specific length of interest. ARIMA requires a thorough statistical analysis and treatment of the underlying time series (e.g. in terms of stationarity, Gaussian distribution of residuals, etc.); LSTM does not explicitly pose such restrictions. Conversely, ARIMA models are computationally cheaper to construct. Given the input time series, the auto.arima() function returns a deterministically computed ARIMA model. However, due to the random initialization of weights in the LSTM model, a trained instance may differ from run to run. Thus, in order to compare the performance of both approaches without setting a random seed, a new LSTM model was trained 50 times on each dataset from the Australia-ATLAS and BNL-ATLAS data centers. For each of the datasets the corresponding LSTM training runs are considered, and the log changes [22] between the ARIMA and LSTM NRMSE (holding the ARIMA NRMSE as the reference) are computed, resulting in a distribution. Our findings show that, given a dataset, one approach will systematically outperform the other. In rare cases (for 5 out of 32 datasets), both methods perform equally well on average. Consider Fig. 12 and 13, which show concrete examples of LSTM outperforming ARIMA and vice versa.

For these 2 datasets, the distributions of the log NRMSE changes are depicted in Fig. 14.

Fig. 12. LSTM outperforming ARIMA(1,0,0)(1,0,0)[48].
Fig. 13. ARIMA(0,1,1) outperforming LSTM.
Fig. 14. Distributions of log NRMSE changes for 2 datasets.

We have performed the augmented Dickey-Fuller (ADF) test, implying stationarity under the alternative hypothesis, on each dataset (on the fraction used for model construction). We have then tested for an association between the ADF test statistic and the type of forecasting method which dominates in terms of performance on the given dataset. Unfortunately, this analysis did not deliver significant results. Thus, the method which is more accurate on average needs to be selected. Fig. 15 shows boxplots of all measured log NRMSE changes for the two separate data centers. The average log NRMSE changes are 0.51 and 0.57 for the Australia-ATLAS and BNL-ATLAS data centers respectively. Thus, on average, ARIMA outperforms LSTM. Based on these findings, we adopt the ARIMA method for our throughput forecasting model. The boxplots of the ARIMA NRMSE for all datasets are shown in Fig. 16.

Fig. 15. Log NRMSE changes.
Fig. 16. Performance of ARIMA in terms of NRMSE.
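One step of the benchmark can be sketched as follows, under stated assumptions: the paper uses R's auto.arima(), for which pmdarima's auto_arima serves here as a Python analogue, and the NRMSE normalization and log-change formula are plausible readings rather than quotations from the text.

```python
# `train` and `test` are hypothetical 80%/20% splits of one 2-day dataset of a
# normalized coefficient series (48 points per drawing day, so the test part
# covers roughly the next 9.5 hours).
import numpy as np
import pmdarima as pm

def nrmse(actual, predicted):
    """RMSE normalized by the range of the actual values (one common convention;
    the exact normalization is not specified in the text)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return rmse / (actual.max() - actual.min())

def arima_forecast(train, horizon):
    """Fit a seasonal ARIMA automatically; m=48 encodes the daily seasonality."""
    model = pm.auto_arima(train, seasonal=True, m=48, suppress_warnings=True)
    return model.predict(n_periods=horizon)

# Usage (with a forecast from a trained LSTM, e.g. the sketch above):
# nrmse_arima = nrmse(test, arima_forecast(train, len(test)))
# nrmse_lstm = nrmse(test, lstm_forecast)
# log_change = np.log(nrmse_lstm / nrmse_arima)  # ARIMA NRMSE as reference [22]
```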

V. CONCLUSIONS AND FUTURE WORK

In this paper we have demonstrated that remote data access throughput can be framed as a multiple linear regression. Furthermore, we have shown how the estimates for the regression coefficients can be mined in the form of time series from logs provided by a remote data access monitoring system. We have identified useful input features for the forecasting model. Based on the ARIMA method, it is possible to do long-term forecasts for the coefficient time series with state-of-the-art accuracy. A benchmark of LSTM ANN has partially shown good results, but did not outperform ARIMA. The throughput forecasting model was constructed and validated using performance metrics from ATLAS data centers in the World-Wide LHC Computing Grid. Currently we are focusing on the following aspects for future work:
1) A heuristic model for the selection of the data access profile in computational jobs. Given a bag of jobs, the model decides which input data should be accessed remotely and which in the data placement fashion, with the goal of minimizing the total transfer time. Since the model handles bags of jobs, it also deals with bottlenecks on the side of the storage elements and network links.
2) A framework for remote data access management and monitoring which is interoperable with the overall grid environment.

ACKNOWLEDGMENT

This work was performed on behalf of the ATLAS Collaboration.

REFERENCES

[1] I. Foster and C. Kesselman, The Grid 2: Blueprint for a New Computing Infrastructure. Elsevier.
[2] I. Bird, "Computing for the Large Hadron Collider," Annual Review of Nuclear and Particle Science, vol. 61.
[3] ATLAS Collaboration, "The ATLAS Experiment at the CERN Large Hadron Collider," 2008 JINST 3 S.
[4] A. Chervenak, E. Deelman, M. Livny, M.-H. Su, R. Schuler, S. Bharathi, G. Mehta, and K. Vahi, "Data Placement for Scientific Applications in Distributed Environments," in Proceedings of the 8th IEEE/ACM International Conference on Grid Computing. IEEE Computer Society, 2007.
[5] V. Garonne, R. Vigne, G. Stewart, M. Barisits, M. Lassnig, C. Serfon, L. Goossens, and A. Nairz, on behalf of the ATLAS Collaboration, "Rucio - The next generation of large scale distributed system for ATLAS Data Management," in Journal of Physics: Conference Series, vol. 513, no. 4. IOP Publishing, 2014.
[6] T. Maeno, on behalf of the ATLAS Collaboration, "PanDA: Distributed Production and Distributed Analysis System for ATLAS," in Journal of Physics: Conference Series, vol. 119, no. 6. IOP Publishing, 2008.
[7] T. Kosar and M. Livny, "Stork: Making Data Placement a First Class Citizen in the Grid," in 24th International Conference on Distributed Computing Systems, 2004.
[8] J. Frey, "Condor DAGMan: Handling Inter-Job Dependencies," University of Wisconsin, Dept. of Computer Science, Tech. Rep.
[9] J. Bresnahan, M. Link, G. Khanna, Z. Imani, R. Kettimuthu, and I. Foster, "Globus GridFTP: what's new in 2007," in Proceedings of the First International Conference on Networks for Grid Applications. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2007, p. 19.
[10] J. Elmsheuser, R. Walker, C. Serfon, V. Garonne, S. Blunier, V. Lavorini, and P. Nilsson, on behalf of the ATLAS Collaboration, "New data access with HTTP/WebDAV in the ATLAS experiment," in Journal of Physics: Conference Series, vol. 664, no. 4. IOP Publishing, 2015.
[11] G. Bernabeu, F. Martinez, E. Acción, A. Bria, M. Caubet, M. Delfino, and X. Espinal, "Experiences with http/webdav protocols for data access in high throughput computing," in Journal of Physics: Conference Series, vol. 331, no. 7. IOP Publishing, 2011.
[12] F. Furano, A. Devresse, O. Keeble, M. Hellmich, and A. Á. Ayllón, "Towards an HTTP Ecosystem for HEP Data Access," in Journal of Physics: Conference Series, vol. 513, no. 3. IOP Publishing, 2014.
[13] M. Faerman, A. Su, R. Wolski, and F. Berman, "Adaptive Performance Prediction for Distributed Data-Intensive Applications," in Supercomputing, ACM/IEEE 1999 Conference. IEEE, 1999.
[14] Y. Goland, E. Whitehead, A. Faizi, S. Carter, and D. Jensen, "HTTP Extensions for Distributed Authoring - WEBDAV," Tech. Rep.
[15] R. Shumway and D. Stoffer, Time Series Analysis and Its Applications: With R Examples, ser. Springer Texts in Statistics. Springer New York.
[16] S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Computation, vol. 9, no. 8.
[17] R Development Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2008. [Online].
[18] X. Ma, Z. Tao, Y. Wang, H. Yu, and Y. Wang, "Long short-term memory neural network for traffic speed prediction using remote microwave sensor data," Transportation Research Part C: Emerging Technologies, vol. 54.
[19] P. Filonov, A. Lavrentyev, and A. Vorontsov, "Multivariate Industrial Time Series with Cyber-Attack Simulation: Fault Detection Using an LSTM-based Predictive Data Model," arXiv preprint.
[20] M. Štěpnička and M. Burda, "On the Results and Observations of the Time Series Forecasting Competition CIF 2016," in 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), July 2017.
[21] F. Chollet et al., Keras.
[22] L. Törnqvist, P. Vartia, and Y. O. Vartia, "How Should Relative Changes be Measured?" The American Statistician, vol. 39, no. 1, 1985.


More information

ATLAS NOTE. December 4, ATLAS offline reconstruction timing improvements for run-2. The ATLAS Collaboration. Abstract

ATLAS NOTE. December 4, ATLAS offline reconstruction timing improvements for run-2. The ATLAS Collaboration. Abstract ATLAS NOTE December 4, 2014 ATLAS offline reconstruction timing improvements for run-2 The ATLAS Collaboration Abstract ATL-SOFT-PUB-2014-004 04/12/2014 From 2013 to 2014 the LHC underwent an upgrade to

More information

The ATLAS Trigger Simulation with Legacy Software

The ATLAS Trigger Simulation with Legacy Software The ATLAS Trigger Simulation with Legacy Software Carin Bernius SLAC National Accelerator Laboratory, Menlo Park, California E-mail: Catrin.Bernius@cern.ch Gorm Galster The Niels Bohr Institute, University

More information

Data Management for Distributed Scientific Collaborations Using a Rule Engine

Data Management for Distributed Scientific Collaborations Using a Rule Engine Data Management for Distributed Scientific Collaborations Using a Rule Engine Sara Alspaugh Department of Computer Science University of Virginia alspaugh@virginia.edu Ann Chervenak Information Sciences

More information

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster

Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster Storage and Compute Resource Management via DYRE, 3DcacheGrid, and CompuStore Ioan Raicu, Ian Foster. Overview Both the industry and academia have an increase demand for good policies and mechanisms to

More information

A Data-Aware Resource Broker for Data Grids

A Data-Aware Resource Broker for Data Grids A Data-Aware Resource Broker for Data Grids Huy Le, Paul Coddington, and Andrew L. Wendelborn School of Computer Science, University of Adelaide Adelaide, SA 5005, Australia {paulc,andrew}@cs.adelaide.edu.au

More information

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland Abstract. The Data and Storage Services group at CERN is conducting

More information

Using S3 cloud storage with ROOT and CvmFS

Using S3 cloud storage with ROOT and CvmFS Journal of Physics: Conference Series PAPER OPEN ACCESS Using S cloud storage with ROOT and CvmFS To cite this article: María Arsuaga-Ríos et al 05 J. Phys.: Conf. Ser. 66 000 View the article online for

More information

Performance Analysis of Applying Replica Selection Technology for Data Grid Environments*

Performance Analysis of Applying Replica Selection Technology for Data Grid Environments* Performance Analysis of Applying Replica Selection Technology for Data Grid Environments* Chao-Tung Yang 1,, Chun-Hsiang Chen 1, Kuan-Ching Li 2, and Ching-Hsien Hsu 3 1 High-Performance Computing Laboratory,

More information

Next-generation IT Platforms Delivering New Value through Accumulation and Utilization of Big Data

Next-generation IT Platforms Delivering New Value through Accumulation and Utilization of Big Data Next-generation IT Platforms Delivering New Value through Accumulation and Utilization of Big Data 46 Next-generation IT Platforms Delivering New Value through Accumulation and Utilization of Big Data

More information

CernVM-FS beyond LHC computing

CernVM-FS beyond LHC computing CernVM-FS beyond LHC computing C Condurache, I Collier STFC Rutherford Appleton Laboratory, Harwell Oxford, Didcot, OX11 0QX, UK E-mail: catalin.condurache@stfc.ac.uk Abstract. In the last three years

More information

Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures*

Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures* Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures* Tharso Ferreira 1, Antonio Espinosa 1, Juan Carlos Moure 2 and Porfidio Hernández 2 Computer Architecture and Operating

More information

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski Data Analysis and Solver Plugins for KSpread USER S MANUAL Tomasz Maliszewski tmaliszewski@wp.pl Table of Content CHAPTER 1: INTRODUCTION... 3 1.1. ABOUT DATA ANALYSIS PLUGIN... 3 1.3. ABOUT SOLVER PLUGIN...

More information

The glite File Transfer Service

The glite File Transfer Service The glite File Transfer Service Peter Kunszt Paolo Badino Ricardo Brito da Rocha James Casey Ákos Frohner Gavin McCance CERN, IT Department 1211 Geneva 23, Switzerland Abstract Transferring data reliably

More information

Operating the Distributed NDGF Tier-1

Operating the Distributed NDGF Tier-1 Operating the Distributed NDGF Tier-1 Michael Grønager Technical Coordinator, NDGF International Symposium on Grid Computing 08 Taipei, April 10th 2008 Talk Outline What is NDGF? Why a distributed Tier-1?

More information

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques 24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE

More information

A Compact Computing Environment For A Windows PC Cluster Towards Seamless Molecular Dynamics Simulations

A Compact Computing Environment For A Windows PC Cluster Towards Seamless Molecular Dynamics Simulations A Compact Computing Environment For A Windows PC Cluster Towards Seamless Molecular Dynamics Simulations Yuichi Tsujita Abstract A Windows PC cluster is focused for its high availabilities and fruitful

More information

Performance of popular open source databases for HEP related computing problems

Performance of popular open source databases for HEP related computing problems Journal of Physics: Conference Series OPEN ACCESS Performance of popular open source databases for HEP related computing problems To cite this article: D Kovalskyi et al 2014 J. Phys.: Conf. Ser. 513 042027

More information

Report. Middleware Proxy: A Request-Driven Messaging Broker For High Volume Data Distribution

Report. Middleware Proxy: A Request-Driven Messaging Broker For High Volume Data Distribution CERN-ACC-2013-0237 Wojciech.Sliwinski@cern.ch Report Middleware Proxy: A Request-Driven Messaging Broker For High Volume Data Distribution W. Sliwinski, I. Yastrebov, A. Dworak CERN, Geneva, Switzerland

More information

New Optimal Load Allocation for Scheduling Divisible Data Grid Applications

New Optimal Load Allocation for Scheduling Divisible Data Grid Applications New Optimal Load Allocation for Scheduling Divisible Data Grid Applications M. Othman, M. Abdullah, H. Ibrahim, and S. Subramaniam Department of Communication Technology and Network, University Putra Malaysia,

More information

ANSE: Advanced Network Services for [LHC] Experiments

ANSE: Advanced Network Services for [LHC] Experiments ANSE: Advanced Network Services for [LHC] Experiments Artur Barczyk California Institute of Technology Joint Techs 2013 Honolulu, January 16, 2013 Introduction ANSE is a project funded by NSF s CC-NIE

More information

LHCb Distributed Conditions Database

LHCb Distributed Conditions Database LHCb Distributed Conditions Database Marco Clemencic E-mail: marco.clemencic@cern.ch Abstract. The LHCb Conditions Database project provides the necessary tools to handle non-event time-varying data. The

More information

Scientific data management

Scientific data management Scientific data management Storage and data management components Application database Certificate Certificate Authorised users directory Certificate Certificate Researcher Certificate Policies Information

More information

Recurrent Neural Network Models for improved (Pseudo) Random Number Generation in computer security applications

Recurrent Neural Network Models for improved (Pseudo) Random Number Generation in computer security applications Recurrent Neural Network Models for improved (Pseudo) Random Number Generation in computer security applications D.A. Karras 1 and V. Zorkadis 2 1 University of Piraeus, Dept. of Business Administration,

More information

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,

More information

A Capacity Planning Methodology for Distributed E-Commerce Applications

A Capacity Planning Methodology for Distributed E-Commerce Applications A Capacity Planning Methodology for Distributed E-Commerce Applications I. Introduction Most of today s e-commerce environments are based on distributed, multi-tiered, component-based architectures. The

More information

Modeling VM Performance Interference with Fuzzy MIMO Model

Modeling VM Performance Interference with Fuzzy MIMO Model Modeling VM Performance Interference with Fuzzy MIMO Model ABSTRACT Virtual machines (VM) can be a powerful platform for multiplexing resources for applications workloads on demand in datacenters and cloud

More information

CouchDB-based system for data management in a Grid environment Implementation and Experience

CouchDB-based system for data management in a Grid environment Implementation and Experience CouchDB-based system for data management in a Grid environment Implementation and Experience Hassen Riahi IT/SDC, CERN Outline Context Problematic and strategy System architecture Integration and deployment

More information

DiPerF: automated DIstributed PERformance testing Framework

DiPerF: automated DIstributed PERformance testing Framework DiPerF: automated DIstributed PERformance testing Framework Ioan Raicu, Catalin Dumitrescu, Matei Ripeanu, Ian Foster Distributed Systems Laboratory Computer Science Department University of Chicago Introduction

More information

An Experiment in Visual Clustering Using Star Glyph Displays

An Experiment in Visual Clustering Using Star Glyph Displays An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master

More information

Streamlining CASTOR to manage the LHC data torrent

Streamlining CASTOR to manage the LHC data torrent Streamlining CASTOR to manage the LHC data torrent G. Lo Presti, X. Espinal Curull, E. Cano, B. Fiorini, A. Ieri, S. Murray, S. Ponce and E. Sindrilaru CERN, 1211 Geneva 23, Switzerland E-mail: giuseppe.lopresti@cern.ch

More information