
Boosting theoretical zeolitic framework generation for the prediction of new materials structures using GPU programming

Laurent A. Baumes,*a Frederic Kruger,b Santiago Jimenez,a Pierre Collet,b and Avelino Corma a

Supplementary Information

Table S1. Flynn's taxonomy. See Fig. S1 for architecture diagrams.

                    Single instruction    Multiple instruction
    Single data     SISD                  MISD
    Multiple data   SIMD                  MIMD

Figure S1. GPU architecture. Texture cache and atomics are not shown, for clarity.

Table S2. GeForce GTX 295 description

    Model                                GeForce GTX 295
    Year                                 2009
    Average component size (nm)          55
    Transistors (million)                2x 1400
    Die size (mm2)                       2x 470
    Number of dies                       2
    Bus interface                        PCIe
    Memory (MB)                          2x 896
    Reference clock rate (MHz)           Core 576, Shader 1242, Memory 1998
    Fillrate                             Pixel (GP/s) 2x, Texture (GT/s) 2x
    Reference memory bandwidth (GB/s)    2x
    Memory configuration                 DRAM type GDDR3, Bus width (bit) 2x
    Graphics library support (version)   DirectX 10.0, OpenGL 3.2
    GFLOPs (MADD+MUL)
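As a brief illustration of the SIMD/SIMT execution model of Flynn's taxonomy that the GPU architecture of Fig. S1 implements, the sketch below (not part of the original supplementary material) shows a minimal CUDA C kernel in which thousands of threads execute the same instruction stream, each on its own array element; the kernel and variable names are placeholders.

    // Minimal CUDA C sketch of the SIMD/SIMT model (illustrative only):
    // one instruction stream, many data elements, one element per thread.
    #include <cuda_runtime.h>

    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
        if (i < n)                                      // guard the last partial block
            data[i] *= factor;                          // same instruction, different datum
    }

    int main(void)
    {
        const int n = 1 << 20;
        float *d_data;
        cudaMalloc((void **)&d_data, n * sizeof(float));
        cudaMemset(d_data, 0, n * sizeof(float));
        scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);  // 4096 blocks of 256 threads
        cudaDeviceSynchronize();
        cudaFree(d_data);
        return 0;
    }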

Figure S3. Diagram comparing architectures, where PU is a processing unit.

Figure S4. Parallel evolutionary loop. Evaluation is done in parallel on the GPU while the rest of the algorithm runs on the CPU.
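To make the division of labour in Figure S4 concrete, here is a hedged CUDA C sketch of such a loop (assumed structure and placeholder names such as Individual, eval_one, vary_population and run_ea, not the code actually generated by EASEA): the population and the variation operators stay on the CPU, and only the fitness evaluation is launched on the GPU each generation.

    // Hedged sketch of the parallel evolutionary loop of Figure S4 (assumed
    // structure, placeholder names): variation on the CPU, evaluation on the GPU.
    #include <cuda_runtime.h>
    #include <stdlib.h>

    #define GENOME_SIZE 10
    #define POP_SIZE    1024

    typedef struct { float genes[GENOME_SIZE]; float fitness; } Individual;

    // Placeholder fitness (sphere function); the real evaluator would go here.
    __device__ float eval_one(const Individual *ind)
    {
        float s = 0.0f;
        for (int j = 0; j < GENOME_SIZE; j++) s += ind->genes[j] * ind->genes[j];
        return s;
    }

    __global__ void evaluate_kernel(Individual *pop, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per individual
        if (i < n) pop[i].fitness = eval_one(&pop[i]);
    }

    // Placeholder for the CPU-side selection/crossover/mutation/reduction step.
    static void vary_population(Individual *pop, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < GENOME_SIZE; j++)
                pop[i].genes[j] += 0.01f * ((float)rand() / RAND_MAX - 0.5f);
    }

    void run_ea(Individual *pop, int generations)
    {
        Individual *d_pop;
        cudaMalloc((void **)&d_pop, POP_SIZE * sizeof(Individual));
        for (int g = 0; g < generations; g++) {
            vary_population(pop, POP_SIZE);                        // CPU: variation
            cudaMemcpy(d_pop, pop, POP_SIZE * sizeof(Individual),
                       cudaMemcpyHostToDevice);                    // host -> device
            evaluate_kernel<<<(POP_SIZE + 255) / 256, 256>>>(d_pop, POP_SIZE);
            cudaMemcpy(pop, d_pop, POP_SIZE * sizeof(Individual),
                       cudaMemcpyDeviceToHost);                    // device -> host
        }
        cudaFree(d_pop);
    }

In EASEA's CUDA mode only the evaluator section is turned into such a kernel; selection, crossover, mutation and reduction remain sequential host code, which is why the achievable speedup is bounded by the fraction of time spent in evaluation.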

Figure S5. Evolutionary algorithm minimizing the Weierstrass function in EASEA.

The idea behind EASEA is to allow virtually any basic programmer to try out an evolutionary algorithm by typing only the code that is specific to the problem to be solved. The implementation of the GPGPU algorithm that minimises the Weierstrass benchmark function does not contain much more than the following sections.

The initialiser section describes how the genome of a newly created individual is initialised. EASEA provides a random function that returns a random value between its two arguments, of the same type as its arguments, i.e. floats here.

The evaluator section contains a straightforward C-like implementation of the function to be optimised, which anyone with even basic programming skills in C should be able to write. This is the function that is sent to the GPGPU for parallel evaluation of the individuals (i.e. the genomes, arrays of floats containing the different values to be tested by the function). One must understand that once the code has been sent to the GPGPU it runs there on its own and must therefore be totally autonomous, as it is cut off from the address space of the main program. Referring to global variables makes no sense, and neither does using functions such as printf, which would have nowhere to print to. In fact, function calls are not allowed in GPGPU programs. If function calls are nevertheless found in the code (a call to the Abs function, for instance), the compiler fetches the body of the function and inlines it automatically. Functions can therefore still be used in the code, but the programmer must keep in mind that they are inlined at compilation time, so they are not true function calls. Recursive functions cannot be inlined, so they must be turned into iterative functions before they can be used on a GPGPU.

The crossover section implements a standard barycentric crossover; child, parent1 and parent2 are EASEA-defined pointers to the two selected parents and to the child to be created.

In the mutator, tosscoin is a function provided by EASEA that returns 1 with the probability given by its argument (here pmutpergene). pmutpergene is a global variable that can be used in the mutator because the mutator is executed on the main CPU, not on the GPGPU (which knows nothing about the global variables of the evolutionary program). MAX is a macro function defined by the user, and X_MIN and X_MAX are global variables. Since the mutation function is fed by EASEA with a new child resulting from the crossover above, the Genome variable can be used directly.

Finally, the program ends with a section containing default run parameters that specify the evolutionary algorithm to be used:
- the number of generations,
- the probabilities of calling the mutation and crossover operators,
- the population size,
- the selection method used to choose the parents for a crossover,
- the number of children (offspring) per generation (100% of the population size),
- the number of parents that will compete with the children in order to make it to the next generation (50% of the population size),
- how the competing parents are selected,
- how individuals from the temporary (competing parents + offspring) population are selected,
- whether elitism is used, and
- whether the fitness function is to be maximised or minimised.

The .ez file containing these sections is compiled by typing $ easea weierstrass.ez on the command line.
The cuda option outputs code for any NVIDIA GPGPU card. When it is used, the evaluation function is sent to the GPGPU and run in parallel on the population to be evaluated. The rest of the algorithm that manages the population (selections, crossovers, mutations, reductions, ...) stays on the host CPU and executes sequentially. The speedup therefore depends only on the population size, the size of the genome and the evaluation time.
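The evaluator listing itself is not reproduced in this transcription. As a hedged illustration only (not the authors' code), an evaluator for a common form of the Weierstrass benchmark (a = 0.5, b = 3, k_max = 20; the exact variant used in the paper may differ) could be written in plain C as follows; the genome length SIZE and the array name x are assumptions.

    /* Hedged sketch of a Weierstrass evaluator in plain C (not the original
       EASEA listing). The value is minimal (equal to 0) when every x[i] is 0. */
    #include <math.h>

    #define SIZE  10       /* genome length (assumption) */
    #define W_A   0.5f
    #define W_B   3.0f
    #define K_MAX 20
    #define PI    3.14159265358979f

    float weierstrass(const float x[SIZE])
    {
        float sum = 0.0f;
        for (int i = 0; i < SIZE; i++)
            for (int k = 0; k <= K_MAX; k++)
                sum += powf(W_A, (float)k)
                     * cosf(2.0f * PI * powf(W_B, (float)k) * (x[i] + 0.5f));

        /* subtract the constant term so that the global minimum is exactly 0 */
        for (int k = 0; k <= K_MAX; k++)
            sum -= SIZE * powf(W_A, (float)k) * cosf(PI * powf(W_B, (float)k));

        return sum;   /* returned as the individual's fitness (to be minimised) */
    }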

Figure S5. AFX fitness landscape.

Fitness Function (pseudo-code)

Constants:
    ANGLE_MIN, ANGLE_MAX, ANGLE_AVG_OPT, ANGLE_AVG_OPT_MIN, ANGLE_AVG_OPT_MAX,
    DIST_MIN, DIST_MAX, DIST_OPT, DIST_OPT_MIN, DIST_OPT_MAX,
    DIST_MIN_SQ, DIST_MAX_SQ, DIST_OPT_SQ, DIST_OPT_MIN_SQ, DIST_OPT_MAX_SQ

Inputs:
    UnitCell uc;            // Unit cell dimensions (a, b, c, α, β, γ)
    Atom[] auatoms;         // Asymmetric unit atom coordinates (T-Atoms)
    int[] aumultiplicity;   // Symmetry multiplicity of each T-Atom
    Atom[] ucatoms;         // Unit cell atom coordinates (symmetry operations applied to the T-Atoms)
    Atom[] nbatoms;         // Atoms in the neighbouring cells; only the atoms that could create links with ucatoms

Function GPU_GetFitness:

    // Local variables
    float distsq;           // squared distance between two given atoms
    float dist;             // distance between two given atoms
    float aux;              // auxiliary value used to build the fitness
    float errors;           // distance and link-count error measure
    float errors3mr;        // 3MR (three-membered ring) error measure
    float errorsangles;     // angle error measure
    int mult;               // multiplicity of the current atau
    Atom linkedatoms[];     // atoms linked to atau
    float linkeddists[];    // link distances to atau
    int nblinks;            // current number of links
    float angle;            // angle formed by three given atoms
    float avgangles;        // average of the angles around the current atau
    int nbangles;           // current number of angles formed with atau

    // Local variable initialisation
    int NBLinkErrors = 4 * auatoms.count;
    float Fitness = (4 * ucatoms.count * DIST_OPT) + (6 * ucatoms.count * ANGLE_AVG_OPT);
    aux = 0.0f; errors = 0.0f; errors3mr = 0.0f; errorsangles = 0.0f;

    foreach(Atom atau in auatoms)
        nblinks = 0;
        mult = aumultiplicity[atau];

        // AU x UC
        foreach(Atom atuc in ucatoms)
            if(atau is not atuc)
                distsq = distancesq(uc, atau, atuc);
                if(distsq in [DIST_MIN_SQ, DIST_MAX_SQ])
                    dist = sqrt(distsq);
                    addatom(linkedatoms, atuc);
                    adddist(linkeddists, dist);
                    nblinks++;
                    if(nblinks <= 4)
                        if(dist in [DIST_OPT_MIN, DIST_OPT_MAX])
                            aux += DIST_OPT * mult;
                        else
                            aux += (DIST_OPT - abs(DIST_OPT - dist)) * mult;
                        NBLinkErrors--;
                    else    // more than 4 links
                        errors += DIST_OPT * mult * (1.0f + abs(DIST_OPT - dist));
                        NBLinkErrors++;
                if(distsq < DIST_MIN_SQ)    // too close distance
                    errors += DIST_OPT_SQ * mult * (1.0f + abs(DIST_MIN_SQ - distsq));

        // AU x NB
        foreach(Atom atnb in nbatoms)
            distsq = distancesq(uc, atau, atnb);
            if(distsq in [DIST_MIN_SQ, DIST_MAX_SQ])
                dist = sqrt(distsq);
                addatom(linkedatoms, atnb);
                adddist(linkeddists, dist);
                nblinks++;
                if(nblinks <= 4)
                    if(dist in [DIST_OPT_MIN, DIST_OPT_MAX])
                        aux += DIST_OPT * mult;
                    else
                        aux += (DIST_OPT - abs(DIST_OPT - dist)) * mult;
                    NBLinkErrors--;
                else    // more than 4 links
                    errors += DIST_OPT * mult * (1.0f + abs(DIST_OPT - dist));
                    NBLinkErrors++;
            if(distsq < DIST_MIN_SQ)    // too close distance
                errors += DIST_OPT_SQ * mult * (1.0f + abs(DIST_MIN_SQ - distsq));

        // Reset the angle count and average for this atau
        nbangles = 0;
        avgangles = 0.0f;
        foreach(Atom linkedatomj (j = 0, ...) in linkedatoms)
            foreach(Atom linkedatomk (k = j+1, ...) in linkedatoms)
                // 3MR
                distsq = distancesq(uc, linkedatomj, linkedatomk);
                if(distsq in [DIST_MIN_SQ, DIST_MAX_SQ])
                    errors3mr += DIST_OPT_SQ * mult * (1.0f + (DIST_MAX_SQ - distsq));

                // Angles
                dist = sqrt(distsq);
                angle = getangle(atau, linkedatomj, linkedatomk);
                if(angle in [ANGLE_MIN, ANGLE_MAX])
                    nbangles++;
                    if(nbangles <= 6)
                        avgangles += angle;
                    else    // more than 6 angles
                        errorsangles += ANGLE_AVG_OPT - abs(ANGLE_AVG_OPT - angle);
                else
                    errorsangles += ANGLE_AVG_OPT - abs(ANGLE_AVG_OPT - angle);

        // Only the first 6 angles are taken into account
        if(nbangles > 6)
            nbangles = 6;
        if(nbangles > 0)
            avgangles = avgangles / nbangles;
            if(avgangles in [ANGLE_AVG_OPT_MIN, ANGLE_AVG_OPT_MAX])
                aux += ANGLE_AVG_OPT * nbangles * mult;
            else
                aux += (ANGLE_AVG_OPT - abs(ANGLE_AVG_OPT - avgangles)) * nbangles * mult;

    Fitness -= aux;
    if(Fitness >= 0)
        Fitness += errors + errors3mr + errorsangles;
    else
        Fitness = abs(Fitness - errors - errors3mr - errorsangles);

    output [Fitness];
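The pseudo-code relies on a distancesq(uc, atom1, atom2) helper that measures squared interatomic distances inside the unit cell. As a hedged illustration only (not the authors' implementation), the following CUDA C device function computes the squared distance between two atoms given in fractional coordinates for a general (triclinic) cell described by a, b, c, α, β, γ, applying the minimum-image convention; the struct layouts are assumptions.

    /* Hedged sketch (assumed types and layout, not the authors' code): squared
       distance between two atoms in fractional coordinates for a triclinic
       unit cell, using the minimum-image convention. */
    struct UnitCell { float a, b, c, alpha, beta, gamma; };  /* angles in radians */
    struct Atom     { float x, y, z; };                      /* fractional coordinates */

    __device__ float distancesq(const UnitCell uc, const Atom p, const Atom q)
    {
        /* fractional differences wrapped to [-0.5, 0.5] (minimum image) */
        float dx = p.x - q.x;  dx -= rintf(dx);
        float dy = p.y - q.y;  dy -= rintf(dy);
        float dz = p.z - q.z;  dz -= rintf(dz);

        /* metric-tensor form of the squared distance in a triclinic cell */
        return uc.a * uc.a * dx * dx
             + uc.b * uc.b * dy * dy
             + uc.c * uc.c * dz * dz
             + 2.0f * uc.b * uc.c * dy * dz * cosf(uc.alpha)
             + 2.0f * uc.a * uc.c * dx * dz * cosf(uc.beta)
             + 2.0f * uc.a * uc.b * dx * dy * cosf(uc.gamma);
    }

For the orthorhombic cells used here (α = β = γ = 90°), the cosine terms vanish and the expression reduces to a²dx² + b²dy² + c²dz².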

Description of the files in the Supplementary Information (compressed in a Zip archive)

Note that the CIF files contain oxygen and have been optimized using GULP.

CIF_DATA/UNIT_CELL_A_SPG_74: Subset of solutions for Unit Cell A, defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Imma (74), in CIF format, with 6 and 8 T-Atoms respectively.

CIF_DATA/UNIT_CELL_B_SPG_74: Subset of solutions for Unit Cell B, defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Imma (74), in CIF format, with 6 and 8 T-Atoms respectively.

CIF_DATA/UNIT_CELL_C_SPG_46: Subset of solutions for Unit Cell C, defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Ima2 (46), in CIF format, with 10, 12 and 14 T-Atoms respectively.

GULP_DATA: Contains an example of the GULP input and output files for solution #21 in Unit Cell A.

The contained subset of structures is the following (grey rows refer to the structures integrated in the manuscript):

Table S1. Subset of solutions for Unit Cell A defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Imma (74)
    T_Atoms#    Solution_#    Energy    Fitness

Table S2. Subset of solutions for Unit Cell B defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Imma (74)
    T_Atoms#    Solution_#    Energy    Fitness

Table S3. Subset of solutions for Unit Cell C defined by dimensions a, b, c = , , , angles α, β, γ = 90, 90, 90, and space group Ima2 (46)
    T_Atoms#    Solution_#    Energy    Fitness
