HPS Data Analysis Group Summary
Matt Graham
HPS Collaboration Meeting, June 6, 2013
Data Analysis 101
- define what you want to measure
- get data to disk (by magic or whatever)
- select subset of data to optimize signal vs background
  - this is a complicated step and can include fancy statistical tricks
- develop method to extract signal from the data
  - cut-and-count or ML fit...more tricks here
- discovery and/or limit setting criteria...and here
- correct for biases, efficiencies, etc.
- write paper and go through long, painful, soul-crushing review process
Notes:
- "optimize" doesn't necessarily mean S/sqrt(B)
- "signal" may be a parameter you want to measure
- verification that you know what's going on should be in here somewhere
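The point that "optimize" need not mean S/sqrt(B) can be illustrated with a small cut scan in which the figure of merit is a plug-in function. This is only a sketch: the Gaussian toy samples, the threshold grid, and the names `scan_cut`, `s_over_sqrt_b` and `s_over_sqrt_s_plus_b` are assumptions made for illustration, not HPS code or HPS distributions.

```python
import math
import random


def scan_cut(signal, background, thresholds, fom):
    """Scan 'keep x > threshold' cuts and return (best_threshold, best_fom)."""
    best = None
    for t in thresholds:
        s = sum(1 for x in signal if x > t)
        b = sum(1 for x in background if x > t)
        value = fom(s, b)
        if best is None or value > best[1]:
            best = (t, value)
    return best


def s_over_sqrt_b(s, b):
    # Undefined for b == 0; return 0 so empty-background cuts are never chosen.
    return s / math.sqrt(b) if b > 0 else 0.0


def s_over_sqrt_s_plus_b(s, b):
    return s / math.sqrt(s + b) if s + b > 0 else 0.0


# Toy samples (assumption): one discriminating variable, unit-width Gaussians.
rng = random.Random(1)
signal = [rng.gauss(2.0, 1.0) for _ in range(2000)]
background = [rng.gauss(0.0, 1.0) for _ in range(20000)]
thresholds = [i * 0.1 for i in range(-20, 41)]

best_sb = scan_cut(signal, background, thresholds, s_over_sqrt_b)
best_ssb = scan_cut(signal, background, thresholds, s_over_sqrt_s_plus_b)
print("S/sqrt(B)   best cut, value:", best_sb)
print("S/sqrt(S+B) best cut, value:", best_ssb)
```

The two figures of merit generally prefer different cut values, which is the reason to choose the figure of merit to match the measurement (discovery, limit, or parameter precision) before tuning anything.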
HPS Data Analysis & DAWG
- On HPS, we have a variety of data analysis tasks
  - calibrations, t0 extraction, alignment, angular distribution measurements (e.g. in test run), A' bump hunt & vertexing searches, absolute rate measurements...
- The Data Analysis Working Group is commissioned to be the first level of review for physics analysis topics
  - things like calibrations, etc., are the responsibility of detector groups
- Also, we are responsible for data & simulation production (Sho), DSTs (Omar), and distribution (Homer)
- Also, data quality and maintaining DQ conditions lists/database (need someone!)
- Also, a good place to discuss potential physics topics and general analysis procedures
Physics analysis so far
- The only physics analysis done so far has been for the A' reach calculation
  - get mass & vertex resolution, efficiency, etc. from MC
  - for vertexing, a roughly optimized set of square cuts was applied
  - these are inputs into a highly extrapolated, cut-and-count-like calculation
- good enough for a rough reach calculation...but it's not an experimental data analysis.
Some philosophy
- Work to squeeze all of the information possible out of the data...
  - be smart optimizing cuts...don't be afraid to use MVAs if appropriate
  - instead of cutting on discriminating variables, put them in the ML fit
- Don't trust anything...check, test, verify...
  - make sure all distributions make sense; if data & MC don't agree, figure out why (and fix MC if necessary)
  - perform many toy & embedded MC fits to assure that the ML fit is well understood...don't just trust a single fit to data
- Be blind to the result as long as possible
  - most cuts can be defined by MC, data must be well understood, signal extraction & limit setting should be written in stone, expected sensitivity should be computed, most systematics computed (or at least have a plan for)
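A toy study of the kind recommended here usually ends in a pull distribution: for an unbiased fit with correct errors, the pulls should have mean near 0 and width near 1. The sketch below swaps in the simplest possible ML fit (an exponential lifetime, whose ML estimate is the sample mean) purely to show the mechanics; the function name `run_toys`, the toy sizes and the true lifetime are assumptions, and this is not the HPS fit.

```python
import math
import random
import statistics


def run_toys(n_toys=500, n_events=200, tau_true=1.0, seed=42):
    """Generate toy datasets, fit each one, and return the list of pulls."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        sample = [rng.expovariate(1.0 / tau_true) for _ in range(n_events)]
        tau_hat = sum(sample) / n_events            # ML estimate for an exponential
        sigma_hat = tau_hat / math.sqrt(n_events)   # its asymptotic uncertainty
        pulls.append((tau_hat - tau_true) / sigma_hat)
    return pulls


pulls = run_toys()
pull_mean = statistics.mean(pulls)
pull_width = statistics.stdev(pulls)
print(f"pull mean = {pull_mean:.3f}, pull width = {pull_width:.3f}")
```

A pull mean well away from 0 would point to a biased estimator, and a width away from 1 to mis-estimated errors; either is something to understand before fitting the real data.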
Cuts & Selectors
- good tracks: chi^2, timing, # of hits, residual pattern?, isolation, ECal match, etc...
- combine into good vertex: vertex chi^2, probability it came from target, etc...
- good event: # hits, # tracks, # clusters, P(rad) vs P(BH), etc.
- For these types of things, make MVA selectors instead of cuts to optimize selection
  - important to make cuts that don't bias momentum, direction, mass, vertex
- Eventually, I'd like to make standard definitions (cut or selector) so that lists of (e.g.) good tracks can be made...lets us speak the same language
- Optimization will be different for bump hunt vs. vertex vs. whatever
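One way a shared "good track" definition could look in code is a single named object that every analysis imports, so that "good track" means the same thing everywhere. Everything in this sketch is hypothetical: the `Track` fields, the `GoodTrackDefinition` class, and especially the threshold values, which are placeholders and not proposed cuts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical per-track quantities, loosely following the list above.
    chi2: float
    n_hits: int
    time_ns: float
    has_ecal_match: bool


@dataclass(frozen=True)
class GoodTrackDefinition:
    """A named, versionable set of track requirements."""
    max_chi2: float
    min_hits: int
    max_abs_time_ns: float
    require_ecal_match: bool = True

    def accepts(self, track: Track) -> bool:
        return (
            track.chi2 <= self.max_chi2
            and track.n_hits >= self.min_hits
            and abs(track.time_ns) <= self.max_abs_time_ns
            and (track.has_ecal_match or not self.require_ecal_match)
        )


def good_tracks(tracks, definition):
    """Return the sub-list of tracks passing the shared definition."""
    return [t for t in tracks if definition.accepts(t)]


# Placeholder thresholds (assumption): for illustration only.
EXAMPLE_DEFINITION = GoodTrackDefinition(max_chi2=10.0, min_hits=5, max_abs_time_ns=4.0)

tracks = [
    Track(chi2=3.2, n_hits=6, time_ns=0.5, has_ecal_match=True),
    Track(chi2=25.0, n_hits=6, time_ns=0.1, has_ecal_match=True),   # fails chi2
    Track(chi2=2.0, n_hits=4, time_ns=0.2, has_ecal_match=True),    # fails hit count
    Track(chi2=1.5, n_hits=6, time_ns=-1.0, has_ecal_match=False),  # fails ECal match
]
selected = good_tracks(tracks, EXAMPLE_DEFINITION)
print(len(selected), "of", len(tracks), "tracks pass")
```

An MVA selector could sit behind the same `accepts` interface, so downstream code would not need to know whether a cut or a selector is in use.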
Bump-hunt analysis sketch
- select e+e- pair events...don't have to worry much about track purity here; focus on reducing BH background
- extract signal using 1D ML fit in m(e+e-); single background component (polynomial?) and Gaussian for signal
  - probably binned fit; too many events for unbinned
  - fix signal width to MC
- perform fits in steps of mass, either bigger steps (float mass) or small (fix mass)
- get significance & limit at each step from ML fit
  - discovery criteria need to account for trials factor
  - events -> cross section (not trivial)
[Plot: toy MC m(e+e-) spectrum, 5k A' at 75 MeV on 1M bkg events (50-100 MeV); sidebands determine background. Toy MC for example only...does not reflect reality.]
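The trials-factor remark can be made concrete with a crude estimate that treats the mass scan as a set of independent search windows, so that the global p-value is 1 - (1 - p_local)^N. This is a rough sketch, not the procedure the analysis would use: the independent-window approximation, the assumed 1 MeV mass resolution, and the helper names are all assumptions; only the 50-100 MeV range comes from the toy above.

```python
import math
from statistics import NormalDist


def local_p_value(z_local):
    """One-sided Gaussian tail probability for a local significance z_local."""
    return 0.5 * math.erfc(z_local / math.sqrt(2.0))


def global_p_value(p_local, n_trials):
    """1 - (1 - p_local)**n_trials, written to stay accurate for tiny p_local."""
    return -math.expm1(n_trials * math.log1p(-p_local))


def z_from_p(p):
    """Convert a one-sided p-value back to a Gaussian significance."""
    return -NormalDist().inv_cdf(p)


mass_range_mev = 50.0   # 50-100 MeV scan range, as in the toy
sigma_m_mev = 1.0       # assumed mass resolution, for illustration only
n_trials = mass_range_mev / sigma_m_mev   # roughly one trial per resolution element

p_loc = local_p_value(3.0)
p_glob = global_p_value(p_loc, n_trials)
z_glob = z_from_p(p_glob)
print(f"local p = {p_loc:.2e}, global p = {p_glob:.2e}, global Z = {z_glob:.2f}")
```

With these assumed numbers a 3 sigma local excess corresponds to a much less significant global one. Overlapping fit windows are correlated, so toy MC over the full scan is the safer way to calibrate the real trials factor.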
Vertex analysis sketch
- The vertexing analysis is quite different from pure BH
  - looking for ~small signal (10s of events) on ~small background (less)
  - two main discriminating vars: m(e+e-) & vertex position
- here, track purity is key (see tracking talk), particularly in first few layers. Track isolation cut is very powerful...Kalman filter (e.g.) should help some (a lot?) as well
- extract signal using 2D ML fit, again stepping in m(e+e-), and float (or step) in tau and N(signal) -> epsilon
  - in the regular A' hypothesis, N(sig), mass, and tau are related...for limits, can impose this relationship
  - vertex Z shape extrapolated from mass sidebands at each step (see plot)
- same comments on trials factors
[Plot: vertex Z (mm), 6.6 GeV, 200 MeV; double Gaussian plus exponential tail (plot label "e 1.2 Z"). I wish the exponential was signal...but it's still the tail of the prompt events.]
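The relation between mass, tau and epsilon mentioned above can be sketched with the standard minimal kinetic-mixing width, Gamma(A' -> e+e-) = (alpha epsilon^2 m / 3) sqrt(1 - 4 m_e^2/m^2) (1 + 2 m_e^2/m^2), together with c*tau = hbar*c / Gamma. The sketch assumes the A' decays only to e+e- (so it applies below the di-muon threshold); the function names are illustrative, and taking the A' energy equal to the beam energy in the example is an assumption. The production rate, and hence N(sig), scales as epsilon^2 in the same model, but its normalization is not attempted here.

```python
import math

ALPHA = 1.0 / 137.035999            # fine-structure constant
HBAR_C_MEV_MM = 197.3269804e-12     # hbar*c = 197.327 MeV fm, expressed in MeV mm
M_ELECTRON_MEV = 0.51099895


def width_ee_mev(mass_mev, epsilon):
    """Partial width for A' -> e+e- in the minimal kinetic-mixing model."""
    x = (M_ELECTRON_MEV / mass_mev) ** 2
    phase_space = math.sqrt(1.0 - 4.0 * x) * (1.0 + 2.0 * x)
    return ALPHA * epsilon**2 * mass_mev / 3.0 * phase_space


def ctau_mm(mass_mev, epsilon):
    """Proper decay length, assuming e+e- is the only decay channel."""
    return HBAR_C_MEV_MM / width_ee_mev(mass_mev, epsilon)


def lab_decay_length_mm(mass_mev, epsilon, energy_mev):
    """Mean lab-frame decay length, beta*gamma*c*tau, for an A' of given energy."""
    beta_gamma = math.sqrt(energy_mev**2 - mass_mev**2) / mass_mev
    return beta_gamma * ctau_mm(mass_mev, epsilon)


# Example point (assumption): m = 100 MeV, epsilon = 1e-4, E(A') = 6.6 GeV.
print(f"c*tau      = {ctau_mm(100.0, 1e-4):.4f} mm")
print(f"lab length = {lab_decay_length_mm(100.0, 1e-4, 6600.0):.2f} mm")
```

Because c*tau scales as 1/(epsilon^2 m), fixing the mass step and one of tau or epsilon determines the other, which is the constraint that can be imposed when setting limits.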
Blinding
- blind analysis: the entire analysis chain, from event selection to the method of extracting signal significance & limits, is determined prior to looking at the data under analysis
- We have some choices on how to blind. Here are a few options:
  - tune all cuts on MC data
    - MC needs to be validated by data somehow...
  - tune cuts on a sub-set of all data (say, 10%)
    - this is what APEX did
    - include that data in the search? limit?
  - tune cuts on all data in one slice of mass
    - rely on MC to give mass dependence
  - for displaced vertex search, blind data above X cm
- Some combination of these will probably work for us...we need to decide, as a collaboration, on a strategy & the mechanics
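For the "tune on a sub-set of the data" option, the mechanics matter: the sub-sample should be reproducible and independent of any physics quantity. One possible sketch is to hash the run and event numbers; the function name `in_tuning_sample`, the use of SHA-256, and the (run, event) key are assumptions for illustration, not an agreed HPS procedure.

```python
import hashlib


def in_tuning_sample(run, event, fraction=0.10):
    """Deterministically assign about `fraction` of events to the unblinded tuning sample.

    The decision depends only on (run, event), so every analyst gets the same
    sub-sample, and it is uncorrelated with mass, vertex position, etc.
    """
    key = f"{run}:{event}".encode()
    digest = hashlib.sha256(key).digest()
    value = int.from_bytes(digest[:8], "big") / 2**64   # uniform in [0, 1)
    return value < fraction


# Example with made-up run and event numbers.
n_events = 100000
n_selected = sum(in_tuning_sample(1234, event) for event in range(n_events))
print(f"{n_selected} of {n_events} events in the tuning sample")
```

Whether the tuning sample is later included in the search or the limit is a separate decision; the function only fixes which events are visible during tuning.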
Current reach (proposal)
[Plot: reach curves]
- 1 week @ 1.1 GeV, 6 weeks @ 2.2 GeV, 4 weeks @ 6.6 GeV
- dashed lines: 1 week @ 1.1 GeV, 2 weeks @ 2.2 GeV
- shaded green: 6 months each @ 2.2 & 6.6 GeV
(all floor time = 2 x beam time)
Reach with pions @ 6.6 GeV
Assume:
- trigger di-pions at 80% (same as electrons): fantasy
- mass resolution & acceptance same as di-muons: mass resolution probably ok...acceptance? different angular distribution
- BH background same as in di-muons: ???
- 1 month beam time
How can we improve?
- Push up luminosity: increase current
  - occupancy increase doesn't hurt bump-hunt much; lose efficiency (for same purity) for vertexing
  - since at high beam energies we don't have much vertexing reach anyway, this might be a good way to go?
- Push up luminosity: increase target thickness
  - occupancy increase + worse resolution...probably not a big winner
  - for displaced vertex, resolution not affected...
- Improving mass resolution effectively reduces background...resolution improvement by factor of 2 increases reach by sqrt(2)
  - bump-hunt can use beam constraint...resolution is ~all momentum; vertexing mass resolution has momentum & angle contributions
  - high-res ECal could contribute to momentum measurement, recoil detector improves high mass resolution
- Bigger detector: more acceptance
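The sqrt(2) statement follows from a simple counting argument: for a smooth background, the number of background events under the peak scales with the width of the mass window, which scales with the resolution, so S/sqrt(B) goes as 1/sqrt(sigma_m). The sketch below checks this numerically. The flat background, the +-2 sigma window and the 1 MeV starting resolution are assumptions; the 5k signal and 1M background events over 50-100 MeV are the toy numbers from the bump-hunt slide.

```python
import math


def significance(n_signal, bkg_per_mev, sigma_m_mev, n_sigma=2.0):
    """Counting significance S/sqrt(B) in a +-n_sigma mass window.

    Assumes a Gaussian signal of width sigma_m_mev and a locally flat background.
    """
    window_mev = 2.0 * n_sigma * sigma_m_mev
    s = n_signal * math.erf(n_sigma / math.sqrt(2.0))   # signal fraction inside the window
    b = bkg_per_mev * window_mev
    return s / math.sqrt(b)


n_signal = 5000.0
bkg_per_mev = 1.0e6 / 50.0   # 1M events spread over 50-100 MeV

z_nominal = significance(n_signal, bkg_per_mev, sigma_m_mev=1.0)
z_improved = significance(n_signal, bkg_per_mev, sigma_m_mev=0.5)
ratio = z_improved / z_nominal
print(f"improvement factor = {ratio:.4f} (sqrt(2) = {math.sqrt(2.0):.4f})")
```

The factor does not depend on the signal or background normalization in this approximation, only on the ratio of the two resolutions.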
Other (publishable?) results?
- True muonium (Sarah et al.) & multi-lepton (Yuri et al.)
- Trident rates (as a function of angle, energy, mass...)
  - probably want to measure this at least as a sanity check
- Hadron production rates?
  - how well can we ID hadrons?
- Highly ionizing particles?
- Other ideas?
Di-pions at 11 GeV