Data Processing and Analysis Requirements for CMS-HI Computing


Charles F. Maguire, version of August 21

Executive Summary

The annual bandwidth, CPU power, data storage, and tape archiving requirements for CMS-HI in the U.S. for 2008 through 2011 are calculated according to certain assumptions described in the main text. The 2011 calendar year is taken to be the point when the computing center is operating at its design goals. The suggested milestones for a separate HI compute center are:

Bandwidth and archive specification for a HI compute center functioning as Tier 1 + Tier 2
    Input: 0.5 GBits/s by the end of 2008, growing to 2.5 GBits/s by 2011, 2 months/year
    Output: 0.4 GBits/s by the end of 2008, growing to 2.0 GBits/s by 2011, year round operation
    Raw data archiving: 60 TBytes by the end of 2008, growing to 300 TBytes by 2011, done in one month's time

Annual CPU: scaled to a 2011 single-pass reconstruction power of about 3.4 x 10^6 SpecInt2K, tabulated per year as the number of CPUs, the total power (SpecInt2K), the available time, and the MC computing, real computing available, and real computing needed (all in SpecInt2K-seconds).

Annual Disk Storage: tabulated per year as the MC, RAW, RECO, AOD, and PWG data plus user volumes, and the total (all in TBytes).

Annual Tape Archiving: the raw data are archived in one month, the rest over the course of each year; tabulated per year as the MC, RAW, RECO, AOD, and PWG volumes and the cumulative total (all in TBytes).

While it may be possible for the FermiLab Tier 1 center to carry out some of these functions for CMS-HI, it appears that a separate HI compute center is the most practical solution.

1 Introduction

This document summarizes the data processing and analysis requirements for a CMS-HI computing center in the U.S. These requirements are developed in large measure from the information given in the tables presented by Olga Barannikova at the DOE review committee meeting on October 24. Additional guidance came from the discussion at the CMS-HI-US computing committee VRVS meeting on August 15, and from the phone conference with the MIT CMS-HEP computing model contact held the following day.

This document does not go into detail about various infrastructure requirements: electric power, air conditioning capacity, and technical staffing. It is assumed that the four sites which have expressed interest in hosting the CMS-HI-US computing center (Iowa, MIT, UIC, and Vanderbilt) will specify those infrastructure requirements according to the demonstrated experience at already working, large computing centers. Similarly, an annual budget is not provided. The guidelines appear to be 1) $ K per year for the first four years to cover both capital costs and operations, and 2) $250K per year after the first four years to cover operations and maintenance.

2 Standard CMS Computing Model for p + p Data

The DAQ system for the CMS detector is specified to write 225 MBytes/s of RAW data onto buffer disks in the CERN Tier 0 center for both p + p and heavy ion running. Subdetector alignment and calibration data will be made available for prompt processing of the raw data at the Tier 0. After the RAW data have been processed into what are called RECO data, these files will be written to write-only tapes at the Tier 0. These tapes are intended only as emergency archives, and will not normally be read for additional processing.

Subsets of the RAW and RECO data will be transferred to the seven Tier 1 institutions. These subsets are arranged according to physics content. For example, sets with electron data will go to Tier 1 site X, sets with muon data will go to Tier 1 site Y, sets with jet data will go to Tier 1 site Z, and so on. I presume that these event sets are identified either on-line or as part of the reconstruction process. There are not enough resources at the Tier 1 centers to have duplicates of these data sets. The Tier 1 centers will have their own archives of the RAW data which they receive.

Additionally, the Tier 1 centers will derive AOD (Analysis Object Data) files from the RECO files. The AOD files are an order of magnitude smaller than the RECO files, and are intended as the primary input data for physics analysis at the Tier 2 centers. The AOD event format is for the most part fixed, but different event sets may have slightly varying AOD event types. Each Tier 1 site will get a copy of all the AOD files from the other Tier 1 sites, such that any Tier 2 site coupled to a given Tier 1 site can in principle have access to any AOD file. I assume this last specification is subject to disk space limitations at the Tier 2 sites. Lastly, the Tier 1 sites will reprocess the RAW data files, with better calibrations and possibly better algorithms, at least once per year to produce new sets of RECO and AOD files. The Tier 1 sites are responsible for all tape archiving of their respective data files.

3 Raw Data Volume in Heavy Ion Operations

The effective running time for p + p is said to be 10^7 seconds per year, and 10^6 seconds for heavy ion operations.
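As a quick orientation, the annual RAW data volumes implied by these live times and the 225 MBytes/s DAQ rate quoted in Section 2 can be worked out in a few lines of Python. This is only an illustrative sketch of the arithmetic carried through in the rest of this section, not part of any CMS software; the variable names are mine.

    # Annual RAW data volume at the nominal CMS DAQ rate (illustrative only).
    DAQ_RATE_MB_PER_S = 225.0                         # MBytes/s onto Tier 0 buffer disks

    live_time = {"p+p": 1.0e7, "heavy ion": 1.0e6}    # effective seconds per year

    for mode, seconds in live_time.items():
        volume_tb = DAQ_RATE_MB_PER_S * seconds / 1.0e6   # MBytes -> TBytes
        print(f"{mode}: {volume_tb:,.0f} TBytes of RAW data per year")
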
In the CMS computing TDR (2005), a design luminosity for heavy ion running is quoted (in cm^-2 s^-1), to be reached only after an initial period of operation. In this document, I take the start of heavy ion operations to be in 2009, and the ramp-up to full data taking is assumed to occur over three years as 20%, 40%, and 100%.

These are also the ramp-up numbers being assumed in the ALICE computing model for assembling their computing resources.

From the MIT phone conversation of August 16, the conclusion was that there would not be prompt reconstruction of CMS-HI RAW data at the CERN Tier 0 site. Apparently the ALICE experiment will have first priority for any CPU cycles at the Tier 0 which are not being used for reconstruction by the CMS, ATLAS, and LHCb experiments during the heavy ion running. Possibly there would be resources available at the Tier 0 for developing the initial calibration and alignment files needed for the first heavy ion reconstruction pass. Otherwise, that information would have to be developed at another computer center. At a minimum we would expect that the RAW data from HI running would be written to the emergency backup tapes at the CERN Tier 0.

From Olga's document the average RAW data event size will be taken as 4.5 MBytes. At the nominal DAQ rate of 225 MBytes/s for 10^6 seconds there would be 225 TBytes of RAW data per year, corresponding to 50 million events. Allowing for the calibration and alignment data, and possibly other file types, we can conservatively round up the volume of HI files coming from CERN to 300 TBytes/year. In the 20%, 40%, and 100% ramp-up scenario, this would mean volumes of 60, 120, and 300 TBytes/year for 2009, 2010, and 2011, respectively. [1]

4 RAW Data Archive and Reconstruction Siting Solutions

During the August 16 phone meeting there was an extended discussion of where the RAW data files should be reconstructed, and re-reconstructed. The original idea was that the RAW data files would first be sent to the FermiLab Tier 1 for archiving, and then sent to the heavy ion computing center for initial reconstruction. However, since a re-reconstruction will be necessary, this would likely mean two file transfers from FermiLab, as well as one reading of the RAW data tapes at FermiLab. The only way to avoid the second file transfer would be to have a dedicated 300 TBytes of disk space for RAW storage at the HI computing center, in addition to whatever other disk space was needed there. [2] The extra transport of files from FermiLab to the HI computing center was characterized as an undesirable new dependency, meaning a new component in the CMS computing model which would have to be supported with software and other resources.

An alternate solution would be to add CPU, disk, and tape resources to the FermiLab site to serve as a full-fledged Tier 1 center for the HI program. [3] Actually, since the FermiLab Tier 1 would then be doing two reconstructions of the HI RAW data, it would be going beyond what it does for the p + p data. This solution would still require a HI computing center which functioned as a regular Tier 2 site for the purposes of receiving AOD files from FermiLab and generating physics analyses. The conclusion at the August 16 phone meeting was that it would be prudent to consult with the FermiLab Tier 1 on whether this solution would be practical. We would have to know the costs of going this route for the Tier 1 and also of supporting a separate HI computing center as a Tier 2 site.

A second solution to the problem of needing two RAW data file transfers from the FermiLab Tier 1 site is to bypass FermiLab entirely.
[1] If the actual data volumes in the first two years are much closer to the third year volume, then we would not be able to process all of those data with the computing power available in those years under any linearly staged budget plan. Possibly the data could be archived to tape for future processing when more CPU became available.

[2] Reserving 300 TBytes of disk for files which will be read only twice per year seems extravagant.

[3] A variant of this solution would be to have either RCF, or the US-ALICE associated NERSC or LLNL, as the CMS-HI Tier 1 site. Effectively this would be a contracting out of the reconstruction computing to a non-CMS but still CERN-linked large computing center. Such non-CMS sites would be required to maintain the standard CMS software.

Instead, the HI computing center would function as both a Tier 1 and a Tier 2 center. As a Tier 1 center it would receive the RAW data and other support files from CERN. These files would be archived locally as they were received. The first and (time-separated) second reconstruction passes would both be done at this one site. To save time, some fraction of the first pass could be done while the RAW data were on buffers awaiting transfer to the tape archive. The remainder of the first pass would be done by reading from the tape archive. The second pass input would come entirely from the tape archive. There would be no need to reserve 300 TBytes of disk space dedicated to maintaining the RAW data files themselves as long as there was a local tape archiving facility.

A critical element in this second solution is the bandwidth capability into the HI computing center. By definition the FermiLab Tier 1 center solution should be expected to have the needed bandwidth; this assumes that the extra input load of the HI RAW data is not excessive compared to the normal input of the p + p RAW data. Any proposed HI computing center, in the second solution model, would have to meet the specification of accepting 300 TBytes and archiving it locally in one month's time. The minimum input bandwidth is easily calculated as

    I = 3 x 10^8 (MBytes) / [30 (days) x 24 (hours/day) x 3600 (seconds/hour)] ≈ 120 MBytes/s ≈ 1 GBit/s

While in the first two years of operation one might be able to make do with less than 1 GBit/s, there is no question that by the third year of operation the HI compute center must have an average input acceptance of 1 GBit/s. In order to accommodate fluctuations in the data rate or other exigencies, the actual capacity should be significantly greater, say a factor of 2.5 higher, meaning 2.5 GBits/s. By comparison, the CMS computing TDR quotes the incoming bandwidth requirement in p + p data for a Tier 1 center to be 7.2 GBits/s. A major difference between a p + p Tier 1 facility and the proposed HI compute facility is that the input bandwidth requirement need be sustained over only 1-2 months for the HI data, but over at least 7 months for the p + p data.

A separately functioning HI compute center must archive the input RAW data to tape. Ideally, the archiving to tape will proceed as rapidly as the data arrive. If the HI compute center were to dedicate 300 TBytes of disk space to the incoming data, then the archiving could proceed at a slower rate. In the more likely case where only a limited buffer area, say 50 TBytes, is dedicated to the RAW data, the archiving would have to be more rapid. In this model some of the RAW data might be processed immediately in a first pass, while the first pass for the remainder of the data would read from the tape archive. The second reconstruction of the data would also read its RAW input from the tape archive. [4]

The CMS computing TDR quotes the outgoing bandwidth requirement in p + p data for a Tier 1 center to be 3.5 GBits/s. If the HI compute center is to be a combined Tier 1 and Tier 2 facility, there is no obvious output bandwidth requirement for Tier 1 functions regarding the HI data. For a Tier 2 center, the CMS specification for p + p data is at least 1 GBit/s, with some Tier 2 centers hoped to be as large as 10 GBits/s. The output capacity is intended to serve the Tier 3 computing facilities located at the institutions participating in the CMS-HI program.
To be conservative, we can specify that the HI compute center must be rated at 2 GBits/s for output. This output capacity must be maintained for the entire year.

[4] There could be a question of how long data on the tape archive should be retained. Given the demands of processing the newest data, it becomes less likely that one will want to process data more than, say, 3 years old with the available CPUs. Perhaps those data could be transferred to a less expensive archive medium.
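The input-bandwidth arithmetic above can be summarized in a short Python sketch. It simply re-derives the numbers quoted in Sections 3 and 4 (a 300 TByte annual volume archived in one month, and a factor 2.5 headroom on the rounded 1 GBit/s figure); the variable names are mine and the snippet is not part of any CMS tool.

    # Minimum input bandwidth for archiving one year of HI RAW data in one month
    # (illustrative re-derivation of the estimate in Section 4).
    total_tb = 300.0                    # annual HI volume from CERN, rounded up
    month_s = 30 * 24 * 3600            # seconds in the one-month archiving window

    input_mb_per_s = total_tb * 1.0e6 / month_s       # ~116 MBytes/s
    input_gbit_per_s = input_mb_per_s * 8.0 / 1000.0  # ~0.93 GBits/s

    print(f"Archiving {total_tb:.0f} TB in one month needs about "
          f"{input_mb_per_s:.0f} MBytes/s, i.e. roughly {input_gbit_per_s:.1f} GBit/s")
    print("With a factor 2.5 headroom on the rounded 1 GBit/s figure: 2.5 GBits/s")
    for year, frac in [(2009, 0.2), (2010, 0.4), (2011, 1.0)]:
        print(f"  {year}: about {frac * total_tb:.0f} TBytes to archive")
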

Bandwidth and archive specification for a HI compute center functioning as Tier 1 + Tier 2
    Input: 0.5 GBits/s end 2008, growing to 2.5 GBits/s by 2011, 2 months/year
    Output: 0.4 GBits/s end 2008, growing to 2.0 GBits/s by 2011, year round operation
    Archiving: 60 TBytes end 2008, growing to 300 TBytes by 2011, done in one month's time

5 Reconstruction, Analysis and Simulation Requirements

5.1 Reconstruction Passes

For the reconstruction time of an average event, Olga essentially quoted a time of 621 seconds on a 900 SpecInt2K processor, leading to an integrated 556 KSpecInt2K-seconds per event. The 621 seconds was in turn derived from old ORCA simulations for which the quoted time was 20 minutes; perhaps the 621 seconds assumed an eventual factor of two improvement. In any case, the conclusion of both the August 15 VRVS meeting and the August 16 phone meeting was that we would not obtain any new or more believable information on the CPU time per event using the CMSSW framework for several more weeks. Moreover, even such a new number would be subject to code-profiling tools in order to squeeze out all the less efficient coding. So the decision was to retain the 556 KSpecInt2K-seconds number for the purpose of setting the reconstruction requirements in this document. In the CMS language, this discussion involves the process of producing RECO events from RAW events.

For nominal year operations at design HI luminosity we expect 50 million events. Hence the integrated CPU time to reconstruct these events is

    R = 5 x 10^7 (events) x 5.56 x 10^5 (SpecInt2K-seconds/event) = 2.78 x 10^13 SpecInt2K-seconds

We would want the first reconstruction pass, and the re-reconstruction pass as well, to each take no more than 4 months (T) apiece. That would leave 4 more months total for analysis of each year's data. We will assume an effective 0.80 utilization rate of the CPUs, to account for I/O overhead, nodes which are down, and other outage sources during the reconstruction period. With such a utilization factor, the required compute power P_RD for raw data reconstruction can be calculated as

    P_RD = R / (0.8 T) = 2.78 x 10^13 (SpecInt2K-seconds) / [0.8 x 120 (days) x 24 (hours/day) x 3600 (seconds/hour)]
    P_RD ≈ 3.4 x 10^6 SpecInt2K

In order to calculate the number of CPUs required to achieve this value of P_RD, we assume that the HI computing center will be assembled over a 4 year period, with equal numbers of nodes being purchased each year in order to take advantage of Moore's Law. The first purchase year would be mid-calendar 2008. The remaining three years would see the ramp-up of the RAW data volume as 20%, 40%, and 100% of the final nominal year volume. We can calculate the total number of CPUs required using the following conservative assumptions:
1) Equal numbers of compute nodes are purchased each year for four years, starting sometime in mid-2008. For definiteness, we assume that the nodes are single quad-CPUs, although one can get the same CPU number result in terms of dual quad-CPUs.
2) We assume that the first purchase year has CPU processors rated at S = 1900 SpecInt2K.
3) We assume that each year there is a 25% growth in power per CPU processor for the same cost per CPU, in some currently valid approximation of Moore's Law.

With these assumptions, the only unknown is the number of CPUs to be purchased in each of the four years. It is easy to work out [5] that the number of CPUs to be purchased each year under the above assumptions is 320. After four years, the total number of CPUs would be 1280, and these CPUs would total about 3.5 x 10^6 SpecInt2K in overall CPU power, enough to meet the goal of one reconstruction pass in four months. The analysis and simulation goals will be discussed in the following two sections. Whether these 1280 CPUs are in single quad-CPU units or dual quad-CPU units is a second order question to be answered in terms of cost/performance comparisons and networking tests.

If these assumptions are placed into a spreadsheet, it is straightforward to change some of the parameters to see the effect on the total number of CPUs. For example, if one wanted to be more optimistic about the Moore's Law growth, and assumed say 1.40 instead of 1.25, then the required number of CPUs at the end of four years would be 1040, corresponding to buying 65 quad-CPU nodes per year. Similarly, if one retained the 1.25 Moore's Law parameter but took an initial CPU processor power of 2300 instead of 1900, then the target CPU power would also be reached after four years with 65 quad-CPU nodes per year.

5.2 Real Data Analysis Passes

Each reconstruction pass will yield RECO and AOD type event files, with the AOD files a factor of 10 smaller than the RECO files. The CMS model is that the AOD files will serve as input for the vast majority of analysis projects. Possibly a subset of the RECO files will be used for additional verification of the analysis methods or for specialized analyses.

In her presentation, Olga had the real data analysis requirement as 0.25R in the first year of data taking, and 0.5R in the second and third years of data taking. Here, R represents one pass of reconstruction over the nominal year data, 2.78 x 10^13 SpecInt2K-seconds. For this document I will be a little more conservative and state that the real data analysis requirement R_A is 0.5R in all three data taking years. This means R_A = 1.39 x 10^13 SpecInt2K-seconds. Furthermore, in the simplest approach, this analysis will be carried out in two passes of two months each, one after each of the four-month reconstruction passes [6], with each of these two-month analysis passes requiring R_A/2 = R/4 of computing time. Thus the net value of the SpecInt2K-seconds has been reduced by a factor of four compared to a reconstruction pass, while the time to complete the analysis pass has been reduced by a factor of two. Trivially then, each analysis pass will require one-half the CPU power of each reconstruction pass. Hence the power for each analysis pass, symbolized as P_AD, is

    P_AD = 0.5 P_RD ≈ 1.7 x 10^6 SpecInt2K

Moreover, this means that during each two-month analysis pass the computer farm will have half of its power available for simulation purposes.

5.3 Simulation Analysis

For the simulation analysis, Olga assumed that we would be simulating and reconstructing a number of events equal to 5% of the real data in the nominal data production years. The CPU time to produce and analyze a simulation event is taken to be twenty times the time to reconstruct a real data event. So the simulation requirement per nominal year is 0.05 x 20 x R = R = 2.78 x 10^13 SpecInt2K-seconds.

[5] See the spreadsheet at maguirc/cms-hi/computingrequirementsexcel.xls, which also allows for different input parameter assumptions.
[6] One can modify this approach to have partial overlaps of the reconstruction and analysis processing, but that will not change the total required SpecInt2K-seconds.
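To keep the power requirements of Sections 5.1-5.3 in one place, the following Python sketch re-derives R, P_RD, P_AD, and the annual simulation requirement from the numbers quoted in the text. It is an illustration under the stated assumptions, not the spreadsheet referenced in footnote 5, and the variable names are mine.

    # Illustrative CPU power budget for a nominal data-taking year.
    SECONDS_PER_DAY = 24 * 3600

    events_per_year = 50.0e6      # nominal year of HI RAW data
    reco_per_event  = 556.0e3     # SpecInt2K-seconds to reconstruct one event
    utilization     = 0.80        # effective CPU utilization during a pass
    pass_days       = 120         # one reconstruction pass lasts about 4 months

    R    = events_per_year * reco_per_event                  # ~2.78e13 SpecInt2K-s
    P_RD = R / (utilization * pass_days * SECONDS_PER_DAY)   # ~3.4e6 SpecInt2K
    P_AD = 0.5 * P_RD                                        # per 2-month analysis pass

    sim_fraction   = 0.05         # MC sample equal to 5% of the real data
    sim_cost_ratio = 20.0         # one MC event costs ~20x one reconstruction
    R_sim = sim_fraction * sim_cost_ratio * R                # equals R per year

    print(f"R = {R:.3g} SpecInt2K-s, P_RD = {P_RD:.3g} SpecInt2K, "
          f"P_AD = {P_AD:.3g} SpecInt2K, simulation need = {R_sim:.3g} SpecInt2K-s")
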

In the above time division of functions, there are four months in which simulations could be sharing the farm with the analysis jobs. Since the analysis jobs will exhaust R/2, this means that only half of the specified simulations can be carried out in these four months. In order to complete the remaining R/2 of simulations, we have to add CPUs which can be dedicated to simulations for the entire year. Under the same numerical assumptions used for the assembly of the computer farm, a corresponding number of additional CPUs would be needed to carry out the remainder of the simulations.

5.4 Comparison of HI Computing Center CPU Installation with Projected Needs

The four year plan for purchasing the HI Computing Center CPUs, in comparison with the assumed simulation and real data computing needs in each of these years, is shown in the following table (columns: Year; CPUs; Total Power (SpInt2K); Available Time; MC Computing, Real Computing Available, and Real Computing Needed, in SpInt2K-seconds), with the year-by-year assumptions noted below.

a) In 2008 the CPUs are assumed to arrive in mid-year. The desired number of simulation events for a thorough study would be the nominal year amount, but there is not enough first-year CPU power to accomplish that goal.
b) In 2009 the real data are assumed to arrive at 20% of the nominal year amount, meaning 10 million events. There will be 500 thousand more MC events in this year.
c) In 2010 the real data are assumed to arrive at 40% of the nominal year amount, meaning 20 million events. There will be 1 million more MC events in this year.
d) In 2011 the real data are assumed to arrive at the nominal year amount, meaning 50 million events. There will be 2.5 million more MC events in this year.

In 2008, when there are no data, the computing will be all simulations, largely done for the purpose of perfecting the reconstruction and analysis algorithms to be applied to the next year's real data. Since it is doubtful that the HI computing center will be installed and operational before June 2008, I am allowing for only half a year's production. This means that there will be at most only about 18 million MC events processed instead of a desired 50 million events. Hence, the table shows a computing deficit for 2008.

For 2009, when 20% of the nominal year RAW data is assumed to arrive, there is an apparent surplus of computing. However, this surplus will be easily expended in making up for the missing simulation events from the first year, or in doing more than two passes on the real data. For 2010, there is again a similar surplus forecast, equal to 40% of the installed CPU power. If the real data volume is 60% of the nominal year value instead of 40%, then this surplus will largely disappear. Nonetheless, we may decide to defer purchases in 2010 into 2011 in order to take advantage of any further price reductions. Finally, in 2011, which is the year when the design goal of 50 million RAW events is expected, the available computing power and the projected needed computing power are matched to within 4% by design of this plan.
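The four-year purchase plan and the parameter variations discussed in Section 5.1 can be checked with a few more lines of Python. This is only a sketch in the spirit of the spreadsheet mentioned in footnote 5 (not that spreadsheet); the function name and its default arguments are mine.

    # Total installed power after four equal annual CPU purchases, with the
    # per-CPU rating growing each year (Moore's Law approximation).
    def total_power(cpus_per_year, first_year_rating=1900.0, growth=1.25, years=4):
        return sum(cpus_per_year * first_year_rating * growth**i for i in range(years))

    P_RD = 3.35e6   # SpecInt2K needed for one 4-month reconstruction pass (Sec. 5.1)

    baseline = total_power(320)          # 320 CPUs/year -> 1280 CPUs after 4 years
    print(f"Baseline plan: {baseline:.3g} SpecInt2K installed (need {P_RD:.3g})")

    # Variations quoted in the text: both reach the target with 260 CPUs/year
    # (65 quad-CPU nodes per year, 1040 CPUs in total).
    print(f"Growth 1.40:         {total_power(260, growth=1.40):.3g} SpecInt2K")
    print(f"Initial rating 2300: {total_power(260, first_year_rating=2300.0):.3g} SpecInt2K")
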

6 Disk Storage Requirement for the HI Computing Center

The HI Computing Center will need significant disk storage for three purposes:
1) Input buffers for the incoming RAW and other data files sent by the CERN Tier 0 while, and immediately after, the HI data are acquired.
2) Resident disk areas for the MC production and analysis. One does not want to be repeatedly replaying MC production off tape, at least in the initial few years.
3) Resident disk areas for the real data analysis output.

Items 2) and 3) will also be archived to tape when appropriate, but we want to be working off disk areas as much as possible in the first two or three years, instead of having to wait for tape accesses. In addition to these major disk uses, individual analyzers or Physics Working Groups (PWGs) will need space for developing their software. These disk areas will be much smaller than the disk areas which accommodate large volumes of MC or real data production. However, these individual and PWG areas will need to be backed up on a daily basis. That backup service is a separate cost which will eventually have to be counted, or else donated.

In the first year of operation (2008) the needs will be primarily MC driven, although we should be practicing the transfer of files from the Tier 0 in volume amounts approaching what we expect in 2009. So some of the item 1) disk space should be installed.

6.1 New Forecasts of Disk Storage

As with predicting CPU power requirements, the prediction of disk requirements involves making assumptions which cannot be made with good precision several years in advance. My table of anticipated disk needs for the first four years of operation (columns: Year; MC; RAW; RECO; AOD; PWG Data+User; Total, all in TBytes) is summarized below, followed by the assumptions being made in each year.

In 2008, I am assuming that the MC production will be about 1/3 of the nominal year production, in line with the number of CPUs available in that (half) year. This amounts to about 875,000 events. Each of those simulation events, as produced by GEANT4, is assumed to be five times the size of a raw event; this factor of five includes the extra simulation information which goes into producing a simulated RAW event. So MC production, without further reconstruction, is forecast to take about 26 TBytes.

The third column in the above table shows the volume of the RAW data buffer. In the first year, I am assigning 20 TBytes of disk space on which to practice the transfer of files from the CERN Tier 0. That 20 TBytes corresponds to 1/3 of the initial data year's (2009) anticipated volume.

In the fourth column for the first year I have the RECO output corresponding only to the 875K MC events as input, since there are no real data in the first year. I allow for three passes of the MC data to produce three versions of RECO output.

The fifth column shows the AOD output. According to Olga's presentation, one scales the AOD output by a factor 0.20 (= 0.3/1.5) compared to the RECO data. The sixth column shows the analysis output from the AOD input. On an annual basis I simply put in the PWG output as two times the AOD input. We may do more than two passes over each AOD set, but we can decide to retain on disk only the two most recent sets of results in a given year. However, I keep accumulating the analysis output from the prior years. The last column shows the total of columns 2-6 with an extra 5% to account for user software areas which are backed up.

In the second operations year, 2009, I assume that enough additional CPU is available that we can run the full amount (2.5 million) of nominal year MC events. In the third and fourth years of operation, I keep the same amount of MC disk space, assuming that we can reuse, or replace, the prior year simulations. Similarly, the RAW data input buffer volumes for all three data years in column three are kept fixed at 50 TBytes. As mentioned earlier in this document, we should not plan on reserving dedicated space for all the RAW data if that space will be read only twice per year. The 50 TBytes represents 1/6 of the nominal year RAW data input, which should be an adequate buffer for archiving to tape.

The RECO column for the first data year, 2009, assumes that we will keep all of that year's RAW data reconstruction (RECO) output on disk. In addition, we will be keeping all the MC reconstruction output on disk. The CMS TDR says that the RECO outputs will not normally be used for physics analysis, but we can plan on using them in the first year to assist in developing better analysis algorithms, which are supposed to be based only on the smaller AOD data format. In 2010 I am assuming that we keep only half the produced real RECO output on disk (one pass), along with the same amount of MC reconstruction output as in 2009. The RECO disk area remains the same in 2010 as in 2009 because the assumption is that there will be twice as much real data in 2010 as in 2009. Lastly, in 2011, when the real data volume should go up by a factor of 2.5 compared to 2010, I am again assuming that we keep only one RECO pass on disk. The amount of RECO disk space in 2011 jumps accordingly, compared to 2010.

6.2 Comparison with Previous Forecast for Disk Storage

The previous forecast for disk storage was contained in the lower figure on page 9 of Olga's presentation. For 2011, Olga's figure has about 1.5 PBytes as the predicted disk storage, as compared with the 385 TBytes shown in the above table. Our computer center charges $700 per TByte of purchased disk space, so the 1.5 PBytes would cost approximately one million dollars to buy.

So why is there such a huge difference between Olga's prediction and mine? The major difference is that Olga's amount is a cumulative one, meaning that disk space used in previous years is never overwritten and new space is constantly being added for each next year's work. In my model, last year's disk space tends to get overwritten with this year's needs, or re-used in the case of MC data, except for the final analysis output. If one were to sum all the annual numbers in my model it would come to about 750 TBytes, half of what Olga showed. This re-writing in my model may be too optimistic an assumption.
On the other hand, with a small group like CMS-HI-US, can we really expect to be looking at all three years of previous files in the fourth year? There may not be enough CPU power to do that, even if we had the human resources. It should also be realized that the previous year's output will have been archived to tape. A user could re-read that information from tape onto the local (non-networked) disks of the compute nodes, which do not need to be counted in the above table. So one could have a certain amount of prior-year data being accessed on a node-by-node basis, which may be sufficient for the intended purposes.

A second, smaller difference is that Olga might be assuming that the entire accumulated amount of RAW and RECO data is being kept on disk. In my model, I am assuming that even in a single year we never store all the RAW data on disk, and we only keep fractions of the RECO data on disk, relying primarily on the smaller AOD data set. A third difference with Olga's presentation is that she had multiple AOD outputs per RECO output. Since the AOD events are a defined subset of the RECO events, I did not see why multiple AOD outputs would be necessary. It is possible that these multiple AOD outputs were accounting for what I have in the multiple PWG outputs.

Ultimately the price of disk space will be a seriously painful constraint. In this respect, I point out the current numbers from the PHENIX experiment, which also tends to recycle disk space in subsequent years. The equivalent RAW data input in PHENIX for 2007 was over 600 TBytes, at least a factor of two more than we expect to get from CMS in nominal years. The available disk space in PHENIX is also about 600 TBytes. So in this respect, the final year total of 385 TBytes almost scales with what PHENIX presently has. On the other hand, ALICE is requesting 10.2 PBytes of transient disk storage, and that is also based on non-cumulative one-year needs. ALICE must account for the needs of its p + p program, storage at 4 Tier 1 sites (7.5 PBytes), and an unspecified number of Tier 2 sites (2.6 PBytes). Nonetheless, a factor of 26 difference between what ALICE claims to need and what CMS-HI claims to need as transient disk space is hard to reconcile.

7 Tape Archiving Requirement

The tape archiving requirement follows rather directly from the disk storage requirements discussion; the corresponding table has columns Year; MC; RAW; RECO; AOD; PWG; Cumulative (all in TBytes). The major difference from the disk table is that the RAW and RECO requirements must include the total amount of data, not the partial amounts which are being stored on disk. Also, a cumulative total is given in the final column instead of just the annual amounts, since the annual cost of storage will be based on the cumulative number. Naturally, after two years we will have enough experience to adjust all of these predicted numbers.

8 Summary

Specifications have been presented for the needed data bandwidths, CPU processing power, data storage, and tape archiving capabilities for CMS-HI computing. The most practical solution appears to be a separate, dedicated HI computing center functioning as a combined Tier 1 and Tier 2 facility. The CPU power numbers derived here are comparable to what has previously been shown, within the uncertainties of the various input parameters. The disk storage and tape archiving requirements are a factor of two smaller than previously shown, but the numbers here may be closer to what can be realistically budgeted in the near future.


More information

Spanish Tier-2. Francisco Matorras (IFCA) Nicanor Colino (CIEMAT) F. Matorras N.Colino, Spain CMS T2,.6 March 2008"

Spanish Tier-2. Francisco Matorras (IFCA) Nicanor Colino (CIEMAT) F. Matorras N.Colino, Spain CMS T2,.6 March 2008 Spanish Tier-2 Francisco Matorras (IFCA) Nicanor Colino (CIEMAT) Introduction Report here the status of the federated T2 for CMS basically corresponding to the budget 2006-2007 concentrate on last year

More information

CERN and Scientific Computing

CERN and Scientific Computing CERN and Scientific Computing Massimo Lamanna CERN Information Technology Department Experiment Support Group 1960: 26 GeV proton in the 32 cm CERN hydrogen bubble chamber 1960: IBM 709 at the Geneva airport

More information

Technology Watch. Data Communications 2 nd Quarter, 2012

Technology Watch. Data Communications 2 nd Quarter, 2012 Technology Watch Data Communications 2 nd Quarter, 2012 Page 1 Commercial Version Technology Watch DCCC August 2012 Table of Contents 1.0 Introduction... 1 2.0 Internet Traffic Growth Analysis... 1 3.0

More information

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation report prepared under contract with Dot Hill August 2015 Executive Summary Solid state

More information

Power of the Portfolio. Copyright 2012 EMC Corporation. All rights reserved.

Power of the Portfolio. Copyright 2012 EMC Corporation. All rights reserved. Power of the Portfolio 1 VMAX / VPLEX K-12 School System District seeking system to support rollout of new VDI implementation Customer found Vblock to be superior solutions versus competitor Customer expanded

More information

Tracking the future? Orbcomm s proposed IPO

Tracking the future? Orbcomm s proposed IPO Tracking the future? Orbcomm s proposed IPO This week has seen the announcement of a proposed IPO by Orbcomm, seeking to raise up to $150M, and potentially marking the first IPO by one of the mobile satellite

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

Understanding Managed Services

Understanding Managed Services Understanding Managed Services The buzzword relating to IT Support is Managed Services, and every day more and more businesses are jumping on the bandwagon. But what does managed services actually mean

More information

Keeping the lid on storage

Keeping the lid on storage Keeping the lid on storage Drive significant cost savings through innovation and efficiency Publication date: December 2011 Optimising storage performance & costs through innovation As the compute power

More information

Deferred High Level Trigger in LHCb: A Boost to CPU Resource Utilization

Deferred High Level Trigger in LHCb: A Boost to CPU Resource Utilization Deferred High Level Trigger in LHCb: A Boost to Resource Utilization The use of periods without beam for online high level triggers Introduction, problem statement Realization of the chosen solution Conclusions

More information

YOUR CONDUIT TO THE CLOUD

YOUR CONDUIT TO THE CLOUD COLOCATION YOUR CONDUIT TO THE CLOUD MASSIVE NETWORKS Enterprise-Class Data Transport Solutions SUMMARY COLOCATION PROVIDERS ARE EVERYWHERE. With so many to choose from, how do you know which one is right

More information

Module 15 Communication at Data Link and Transport Layer

Module 15 Communication at Data Link and Transport Layer Computer Networks and ITCP/IP Protocols 1 Module 15 Communication at Data Link and Transport Layer Introduction Communication at data link layer is very important as it is between two adjacent machines

More information

Computing. DOE Program Review SLAC. Rainer Bartoldus. Breakout Session 3 June BaBar Deputy Computing Coordinator

Computing. DOE Program Review SLAC. Rainer Bartoldus. Breakout Session 3 June BaBar Deputy Computing Coordinator Computing DOE Program Review SLAC Breakout Session 3 June 2004 Rainer Bartoldus BaBar Deputy Computing Coordinator 1 Outline The New Computing Model (CM2) New Kanga/ROOT event store, new Analysis Model,

More information

Precision Timing in High Pile-Up and Time-Based Vertex Reconstruction

Precision Timing in High Pile-Up and Time-Based Vertex Reconstruction Precision Timing in High Pile-Up and Time-Based Vertex Reconstruction Cedric Flamant (CERN Summer Student) - Supervisor: Adi Bornheim Division of High Energy Physics, California Institute of Technology,

More information

Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures

Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures Next 1 What we see happening today. The amount of data businesses must cope with on a daily basis is getting

More information

TSM Offsite installation

TSM Offsite installation CNS Internal Project Charter Allen Rout asr@ufl.edu 1. Background Computing and Network Services (CNS) provides a campus-wide "Network Storage and Archive Management" (NSAM) service implemented with the

More information

Automatic Format Generation Techniques For Network Data Acquisition Systems

Automatic Format Generation Techniques For Network Data Acquisition Systems Automatic Format Generation Techniques For Network Data Acquisition Systems Benjamin Kupferschmidt Technical Manager - TTCWare Teletronics Technology Corporation Eric Pesciotta Director of Systems Software

More information

File Open, Close, and Flush Performance Issues in HDF5 Scot Breitenfeld John Mainzer Richard Warren 02/19/18

File Open, Close, and Flush Performance Issues in HDF5 Scot Breitenfeld John Mainzer Richard Warren 02/19/18 File Open, Close, and Flush Performance Issues in HDF5 Scot Breitenfeld John Mainzer Richard Warren 02/19/18 1 Introduction Historically, the parallel version of the HDF5 library has suffered from performance

More information

THE COMPLETE GUIDE COUCHBASE BACKUP & RECOVERY

THE COMPLETE GUIDE COUCHBASE BACKUP & RECOVERY THE COMPLETE GUIDE COUCHBASE BACKUP & RECOVERY INTRODUCTION Driven by the need to remain competitive and differentiate themselves, organizations are undergoing digital transformations and becoming increasingly

More information

LTO and Magnetic Tapes Lively Roadmap With a roadmap like this, how can tape be dead?

LTO and Magnetic Tapes Lively Roadmap With a roadmap like this, how can tape be dead? LTO and Magnetic Tapes Lively Roadmap With a roadmap like this, how can tape be dead? By Greg Schulz Founder and Senior Analyst, the StorageIO Group Author The Green and Virtual Data Center (CRC) April

More information

Caching & Tiering BPG

Caching & Tiering BPG Intro: SSD Caching and SSD Tiering functionality in the StorTrends 3500i offers the most intelligent performance possible from a hybrid storage array at the most cost-effective prices in the industry.

More information

HP Dynamic Deduplication achieving a 50:1 ratio

HP Dynamic Deduplication achieving a 50:1 ratio HP Dynamic Deduplication achieving a 50:1 ratio Table of contents Introduction... 2 Data deduplication the hottest topic in data protection... 2 The benefits of data deduplication... 2 How does data deduplication

More information