A Distributed Simplex B-Spline Based Wavefront Reconstructor
Coen de Visser and Michel Verhaegen
14-12-2012
Delft University of Technology
Contents
Introduction
Wavefront reconstruction using Simplex B-Splines
Distributed wavefront reconstruction using Simplex B-Splines
Computational Aspects
Conclusion & Future Work
Introduction
Wavefront reconstruction (WFR) is necessary because wavefront phase cannot be measured directly. It is computationally expensive and a key operation in AO.
Example: for the E-ELT XAO system using standard Matrix-Vector-Multiplication: 4.8 TFLOPS
Current single core CPU performance: 18 GFLOPS (Core i7-980)
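To put the two figures quoted on this slide side by side, the short calculation below divides the MVM requirement by the single-core peak. It assumes ideal scaling with no communication or memory overhead, which is optimistic, and the variable names are illustrative only.

```python
import math

mvm_requirement_flops = 4.8e12   # E-ELT XAO requirement quoted on the slide
single_core_flops = 18e9         # single-core peak quoted on the slide

# Lower bound on the core count, assuming ideal (overhead-free) parallel scaling.
cores_needed = math.ceil(mvm_requirement_flops / single_core_flops)
print(cores_needed)
```

Even under this idealized assumption the core count runs into the hundreds, which is the motivation for a reconstructor that parallelizes well.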
Introduction
Increase of computational performance in the near future only through parallelization.
Large scale WFR for XAO requires parallelization!
The Simplex B-spline (SABRE*) method is a WFR method that enables massive parallelization and implementation on GPU.
* C.C. de Visser and M. Verhaegen, Wavefront Reconstruction in Adaptive Optics Systems using Nonlinear Multivariate Splines, JOSA A, accepted for publication.
Spline based Aberration Reconstruction
Recently, a new method called the SABRE (Spline based ABerration REconstruction) for local wavefront reconstruction was introduced*.
The SABRE uses nonlinear bivariate splines to locally approximate the wavefront.
The SABRE uses triangular sub-partitions of the global wavefront sensor grid and estimates local wavefront phase.
* C.C. de Visser and M. Verhaegen, Wavefront Reconstruction in Adaptive Optics Systems using Nonlinear Multivariate Splines, JOSA A, accepted for publication.
Spline based Aberration Reconstruction
SABRE is compatible with many different wavefront sensor geometries (occlusion, misalignment, etc.).
SABRE can approximate the wavefront using nonlinear polynomial basis functions.
SABRE was shown to exceed the reconstruction accuracy of Fried FD methods for all noise levels (*).
SABRE can be implemented in a distributed manner *
[Figure: black crosses are SH lenslet locations; grey lines are triangular sub-partitions]
* This lecture
Spline based Aberration Reconstruction
SABRE models the wavefront through local basis functions plus continuity constraints:
$\hat{\phi}(x, y) = B^d(x, y)\, \hat{c}$
with $B^d(x, y)$ the polynomial basis function of degree $d$ and $\hat{c}$ the estimated spline coefficients.
The SABRE slope sensor model is linear in the parameters (c):
$s(x, y) = d\, B^{d-1}(x, y)\, P^{d,d-1}(u)\, c + n(x, y)$
with $s(x, y)$ the slope measurements, $P^{d,d-1}(u)$ the de Casteljau matrix (*) of degree d to d-1 as a function of derivative direction u, and $n(x, y)$ the noise model.
(*) C.C. de Visser et al., Differential Constraints for Bounded Recursive Identification with Multivariate Splines, Automatica, 2011
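To make the "linear in the parameters" statement concrete, the sketch below builds the slope rows for a single triangle in the simplest case d = 1, where the basis functions are the barycentric coordinates and the slope in any direction is constant over the triangle. This is an illustrative special case, not the general SABRE construction; the function names `barycentric_gradients` and `slope_rows` are hypothetical.

```python
import numpy as np

def barycentric_gradients(vertices):
    """Return a 3x2 array whose row i is the (x, y) gradient of barycentric coordinate b_i."""
    v = np.asarray(vertices, dtype=float)      # shape (3, 2)
    # The barycentric coordinates b solve M @ b = [x, y, 1]^T, so b = inv(M) @ [x, y, 1]^T.
    M = np.vstack([v.T, np.ones(3)])           # shape (3, 3)
    return np.linalg.inv(M)[:, :2]

def slope_rows(vertices):
    """Rows mapping the three d = 1 B-coefficients to the x- and y-slopes on one triangle."""
    return barycentric_gradients(vertices).T   # shape (2, 3)

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# For d = 1 the B-coefficients equal the wavefront values at the vertices.
c = np.array([2.0 * x + 3.0 * y + 1.0 for x, y in triangle])
D = slope_rows(triangle)
slopes = D @ c   # noise-free x- and y-slopes, a linear function of c
```

For the planar test wavefront 2x + 3y + 1 the product `D @ c` recovers the two gradient components, which is the property the linear sensor model relies on.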
Spline based Aberration Reconstruction
Constrained optimization problem for the spline coefficients c: [equation not recovered]
with the sparse matrix A containing the spline smoothness constraints.
Now define $N_A$ as the null-space projector of A: $N_A = \ker(A)$
The constrained optimization problem can now be reduced to an unconstrained problem by using a projector on the null-space of A as follows: [equation not recovered]
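The null-space reduction described on this slide can be sketched numerically. The sketch assumes an ordinary least-squares cost on the slopes, min ||s - D c||^2 subject to A c = 0, which the slide text suggests but does not spell out. The matrices `D` and `A` below are small random or hand-made stand-ins, and the helper names are hypothetical.

```python
import numpy as np

def null_space_basis(A, tol=1e-10):
    """Orthonormal basis of ker(A) from the SVD (columns span the null space)."""
    _, sing, vt = np.linalg.svd(A)
    rank = int(np.sum(sing > tol))
    return vt[rank:].T

def constrained_lsq(D, s, A):
    """Solve min ||s - D c||^2 subject to A c = 0 by substituting c = N z."""
    N = null_space_basis(A)
    # Unconstrained least squares in the reduced coordinates z.
    z, *_ = np.linalg.lstsq(D @ N, s, rcond=None)
    return N @ z

rng = np.random.default_rng(0)
D = rng.normal(size=(8, 4))               # hypothetical slope regression matrix
A = np.array([[1.0, -1.0, 0.0, 0.0]])     # hypothetical constraint: c[0] = c[1]
s = rng.normal(size=8)                    # hypothetical slope measurements
c_hat = constrained_lsq(D, s, A)
```

Because every candidate solution is written as `N @ z`, the constraint holds by construction and only an unconstrained problem in `z` remains.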
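Comparison of Fried FD and SABRE
Wavefront model. Fried Finite Difference: $\hat{\phi}_{FD} = G^{+} s$. SABRE: $\hat{\phi}_{SABRE}(x, y) = B^d(x, y)\, c$, $d > 0$.
Reconstruction matrix. Fried Finite Difference: $G^{+}$ (pseudo inverse of G). SABRE: $N_A (D^T D)^{-1} D^T$.
Sensor geometry. [not recovered]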
Distributed-SABRE
The full domain is partitioned into any number of partitions.
Each partition runs on a separate CPU/GPU core.
Distributed-SABRE
Principle of distributed WFR: each partition depends only on its direct neighbors.
Problem: each partition will have an unknown piston mode, and will be discontinuous with its neighbors on its borders.
This calls for a three stage solution.
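The "direct neighbors" dependency can be illustrated with a small lookup function. The sketch assumes the partitions form a regular rows-by-cols layout numbered row by row, and that "direct neighbor" means sharing an edge. Both are assumptions made for illustration, and `direct_neighbors` is a hypothetical helper.

```python
def direct_neighbors(index, rows, cols):
    """Return the indices of the partitions sharing an edge with partition `index`."""
    r, c = divmod(index, cols)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [rr * cols + cc for rr, cc in candidates
            if 0 <= rr < rows and 0 <= cc < cols]

# Hypothetical 20 x 20 layout of partitions, numbered row by row.
corner = direct_neighbors(0, 20, 20)      # a corner partition has two neighbors
interior = direct_neighbors(21, 20, 20)   # an interior partition has four
```

The neighbor count stays bounded no matter how many partitions there are, so the communication per partition does not grow with the size of the sensor grid.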
Distributed-SABRE
D-SABRE is a three stage method:
Stage 1: local wavefront reconstruction (local LS problem) for partition i:
$\hat{c}_i = N_{A_i} (D_i^T D_i)^{-1} D_i^T s_i$
where $\hat{c}_i$ are the coefficients of the splines used to model the wavefront over the i-th partition.
Stage 2: distributed (iterative) Piston Mode Equalization (PME) for partition i with respect to neighbor partition j:
$m_i = \mathrm{mean}\left(\hat{c}_i(I) - \hat{c}_j(J)\right)$
$\hat{c}_i \leftarrow \hat{c}_i - m_i$
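A single Stage 2 step for one pair of partitions might look like the sketch below. It assumes the piston offset is estimated as the mean difference of the B-coefficients on the shared border and is then subtracted from partition i. The sign convention, the index sets `border_i` and `border_j`, and the function name `equalize_piston` are all assumptions made for illustration.

```python
import numpy as np

def equalize_piston(c_i, c_j, border_i, border_j):
    """Shift partition i so its border coefficients match those of neighbor j on average."""
    m = np.mean(c_i[border_i] - c_j[border_j])   # estimated piston offset
    return c_i - m

# Two local estimates that disagree by a piston of 0.7 on their shared border.
c_j = np.array([1.0, 2.0, 3.0, 4.0])
c_i = np.array([3.7, 4.7, 5.7, 6.7])
# Hypothetical index sets: the last two coefficients of j coincide with the first two of i.
c_i_eq = equalize_piston(c_i, c_j, border_i=[0, 1], border_j=[2, 3])
```

In a full grid this step would be repeated over all neighbor pairs until the offsets have propagated across the domain, which is why the stage is iterative.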
First 2 stages of D-SABRE illustrated
[Figure panels: Stage 1, Stage 2]
Local WF is estimated using local WF measurements.
Global WF is reconstructed in two extra stages: distributed piston mode equalization (PME) and inter-partition smoothing.
Stage 3 of D-SABRE
Stage 3: distributed iterative inter-partition smoothing using the distributed Dual Ascent (DA) method (**):
Dual variable y is updated using partition $A_j$ of constraint matrix A:
$y(k+1) = y(k) + \alpha A_j \hat{c}_j(k), \quad 0 < \alpha < 1$
Spline coefficients are updated using dual variable y(k+1) and local partition $A_i$ of constraint matrix A:
$\hat{c}_i(k+1) = \hat{c}_i(k) - (A_i)^T y(k+1)$
Distributed optimization is made possible by the highly sparse structure of the constraint matrix A!
(**) S. Boyd et al., Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Foundations and Trends in Machine Learning, 2010
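A centralized, single-block sketch of a dual ascent iteration is given below. It assumes the smoothing problem is min 0.5*||c - c0||^2 subject to A c = 0, with c0 the estimate coming out of Stage 2, and it writes the primal update relative to c0 as in the textbook form of dual ascent. Both choices are assumptions and may differ from the exact update used on the slide. The function name `dual_ascent` and the small constraint matrix are hypothetical.

```python
import numpy as np

def dual_ascent(c0, A, alpha, iterations):
    """Dual ascent for min 0.5*||c - c0||^2 subject to A c = 0."""
    y = np.zeros(A.shape[0])
    c = c0.copy()
    for _ in range(iterations):
        y = y + alpha * (A @ c)   # dual update: step along the constraint residual
        c = c0 - A.T @ y          # primal update: minimizer of the Lagrangian for this y
    return c, y

c0 = np.array([1.0, 2.0, 4.0])
# Hypothetical continuity constraints: c[0] = c[1] and c[1] = c[2].
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
# For this cost the iteration converges when alpha < 2 / lambda_max(A @ A.T), here 2/3.
c_smooth, y = dual_ascent(c0, A, alpha=0.3, iterations=200)
```

Each update only involves products with A and its transpose, so when A is sparse and block-structured a partition needs only its own coefficients and those of its direct neighbors.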
Distributed-SABRE
Movie: Stage 2; distributed PME
Movie: Stage 3; distributed Dual Ascent
Numerical Experiment with D-SABRE
Quarter scale (100x100 sensor grid) numerical experiment setup:
Simulated EPICS turbulence wavefronts (Strehl@750nm = 0.3 +/- 0.1)
Dynamic wavefront reconstruction using a simple bi-cubic DM model
38 dB signal to noise ratio
500 turbulence realizations
100x100 sensor grid
400 partitions for the distributed method
Numerical Experiment with D-SABRE
[Figure]
Computational Aspects of D-SABRE
D-SABRE compute requirements per triangulation partition per stage
Stage 1 (local wavefront reconstruction): Matrix-Vector-Multiplication $\hat{c}_i = Q_i s_i$. Requirement: $O(N^2)$
Stage 2 (Distributed Piston Mode Equalization): p Vector-Add operations $\hat{c}_i \leftarrow \hat{c}_i - m_i$. Requirement: $O(pN)$
Stage 3 (Distributed Dual Ascent Smoothing): k iterative Sparse-MVM operations $(A_i)^T y(k+1)$ and $A_j \hat{c}_j(k)$. Requirement: $O(kN/E)$
N = total number of B-coefficients per partition
Computational Aspects of D-SABRE
D-SABRE total compute requirements per triangulation partition
Stage 1+2+3 compute requirement: $O(N^2 + pN + kN/E)$
Stage 2 iteration count p depends on the total number of simplices in a partition; Stage 3 iteration count k depends on continuity order and noise levels.
Stage 1 (local reconstruction) is dominant if $p < N$ and if $k < EN$. In general $p \ll N$ and $k \ll EN$.
Conclusion: Stage 1 reconstruction is the determining factor in compute performance!
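Reading the total cost as N^2 + pN + kN/E, the dominance condition can be checked with a few lines of arithmetic. The values of p, k and E below are hypothetical, chosen only to illustrate the comparison, and `stage_costs` is an illustrative helper.

```python
def stage_costs(N, p, k, E):
    """Leading-order operation counts per partition for the three stages."""
    return {"stage1": N ** 2, "stage2": p * N, "stage3": k * N / E}

# Hypothetical values for p, k and E; N = 450 coefficients per partition.
costs = stage_costs(N=450, p=20, k=50, E=1.0)
dominant = max(costs, key=costs.get)
```

Stage 1 dominates here because p < N and k < E*N, which is the same condition stated on the slide.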
Computational Aspects of D-SABRE
Compute budget for WFR on an ELT class system:
Sensor grid: 240x240
Total triangles: 2*240^2 = 115200 triangles
Total partitions: 768, with 150 triangles per partition (includes overlap)
FLOPs per partition per cycle: (150*3)^2 = 202 KFLOP
FLOPS per partition for 3000 Hz update rate = 3000*202e3 = 610 MFLOPS
TOTAL FLOPS for 768 partitions = 469 GFLOPS
Conclusion, hardware requirement:
2 NVidia Tesla C2050 GPUs with peak DP performance 2 * 448 cores * 1 GFLOPS = 896 GFLOPS, running 1 partition per core (requires 768 cores total)
8 Intel Core i7-980 CPUs with peak DP performance 8 * 6 cores * 18 GFLOPS = 864 GFLOPS, running 18 partitions per core (requires 43 cores total)
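The budget on this slide can be reproduced step by step. The sketch below uses only the numbers quoted on the slide and counts one dense matrix-vector product as N^2 operations, as the slide does. Variable names are illustrative.

```python
import math

triangles_total = 2 * 240 ** 2                # 2*240^2, as on the slide
partitions = 768
coeffs_per_partition = 150 * 3                # 150*3, as on the slide
flop_per_cycle = coeffs_per_partition ** 2    # one dense MVM counted as N^2
flops_per_partition = 3000 * flop_per_cycle   # 3000 Hz update rate
flops_total = partitions * flops_per_partition
cpu_cores = math.ceil(partitions / 18)        # 18 partitions per CPU core

print(triangles_total, flop_per_cycle, flops_total / 1e9, cpu_cores)
```

Without rounding the intermediate values the total comes out at about 467 GFLOPS. The slide's 469 GFLOPS follows from rounding the per-partition rate up to 610 MFLOPS first. Either way the total is below the quoted peak of both hardware options.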
Conclusion
The SABRE method can locally reconstruct wavefronts on non-rectangular domains using non-linear spline functions.
The D-SABRE method is a distributed version of the SABRE spline WFR method published in JOSA-2012; it is specifically designed for parallel operations on multi-core hardware.
D-SABRE has the potential to perform real-time wavefront reconstruction at 3000 Hz for the E-ELT challenges using 8 Intel Core i7-980 class CPUs, or 2 NVidia Tesla C2050 class GPUs.
Future Work
The D-SABRE method will be implemented in a C-based GPU language such as CUDA or OpenCL.
The SABRE method will be refined to enable non-linear wavefront reconstruction, and the use of non-Shack-Hartmann based wavefront sensors.
A full scale simulation based on simulated E-ELT phase screens and operational (GPU) hardware will be created.
Thank you for your attention!