A Multithreaded Genetic Algorithm for Floorplanning


Jake Adriaens, ECE 556, Fall 2004

Introduction

I have chosen to implement the algorithm described in the paper "Distributed Genetic Algorithms for the Floorplan Design Problem" by J.P. Cohoon, S.U. Hegde, W.N. Martin, and D.S. Richards. Standard genetic algorithms operate sequentially, with all new solutions depending on previous solutions. This is acceptable for small problems, but as the number of modules M in the floorplan increases, the number of solutions grows faster than M! (it lies between M! and M!*M*(M-1)). This is a very steep curve, which makes floorplanning a good problem to solve in a distributed fashion. The algorithm I implement is intended to provide a divide-and-conquer method for solving the floorplanning problem.

Multithreading Motivation

In the paper the algorithm is presented as distributed across a network of computers. I have instead implemented it as a multithreaded program, for several reasons. The paper states that all of the genetic instances are assumed to have a large shared memory to communicate through. In a network environment, separate instances would have to exchange data over the network, which is slow relative to computational speed. If the program were multi-process on a single machine, the instances would have to communicate through inter-process techniques, because processes do not share memory, so this would also be slower than using a shared memory. In a multithreaded environment the threads share memory, so they can exchange data directly through it.

Traditionally, multithreading has been avoided as a way of distributing a computationally intense algorithm, because older architectures support only one thread in flight at a time. In such an environment multithreading is useful only for I/O-bound programs.
If part of a program has to stop and wait for something slow, such as keyboard input, that thread may go to sleep and give processor time to the program's other threads. If the program is bound by the computational speed of the CPU, there is no sense in making it multithreaded, because no extra CPU time can be gained. Modern architectures support multiple threads in flight at once (hyper-threading). This means that while one computationally complex part of a program is executing, another part may be executing on another arithmetic/logic unit within the processor, or even on another processor that shares the same memory. As these architectures become more popular, there is more opportunity to distribute CPU-limited algorithms over multiple threads instead of over multiple processes or a network.

Algorithm

Genetic

The non-distributed genetic floorplanning algorithm is an important component of the distributed algorithm. It consists of four parts: spawning, crossover, mutation and merging.

The initial step, spawning, generates the initial set of solutions to work with. To spawn solutions I start with the initial floorplan:

    12v3h4v5h6v

To make the next member of the initial population I perform N mutations on the previously generated solution, where N is the desired size of the initial population. This is repeated N-1 times to generate the N initial solutions. N is a parameter passed to the program on the command line at run-time.

A mutation consists of performing an M1, M2, or M3 move on the floorplan. M1 swaps two adjacent operands, 2 and 3 for example. M2 complements some chain of operators (h's or v's). M3 swaps an adjacent operator and operand.

Crossover is done to generate new, possibly better, solutions. To perform crossover, two solutions are chosen randomly from the population, then one of four functions is chosen to combine them into one new solution. The following are graphical illustrations of each of the crossover functions from the paper "Distributed Genetic Algorithms for the Floorplan Design Problem"; the * and + represent vertical and horizontal cuts:

[Figures: Crossover 1, Crossover 2, Crossover 3, Crossover 4]
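The spawning step and the three mutation moves described above can be sketched in Python as follows. This is a minimal sketch, not the project's actual code: the function names are hypothetical, module names are assumed to be single characters so the floorplan can be held as a list of tokens, and the validity check (every operator needs two sub-floorplans before it, and no operator is immediately repeated) is an assumption borrowed from the usual normalized Polish expression formulation rather than something stated in this report.

```python
import random

OPERATORS = {"h", "v"}

def is_valid(expr):
    """Assumed validity rule: a well-formed postfix expression in which
    no operator is immediately followed by the same operator."""
    depth, prev = 0, None
    for tok in expr:
        if tok in OPERATORS:
            if depth < 2 or tok == prev:
                return False
            depth -= 1          # an operator joins two sub-floorplans into one
        else:
            depth += 1          # an operand is a new sub-floorplan
        prev = tok
    return depth == 1

def m1_swap_operands(expr, k):
    """M1: swap the k-th and (k+1)-th operands, skipping over operators."""
    pos = [i for i, t in enumerate(expr) if t not in OPERATORS]
    out = list(expr)
    out[pos[k]], out[pos[k + 1]] = out[pos[k + 1]], out[pos[k]]
    return out

def m2_complement_chain(expr, i):
    """M2: complement the whole run of operators that contains index i."""
    out = list(expr)
    start = end = i
    while start > 0 and out[start - 1] in OPERATORS:
        start -= 1
    while end + 1 < len(out) and out[end + 1] in OPERATORS:
        end += 1
    for j in range(start, end + 1):
        out[j] = "v" if out[j] == "h" else "h"
    return out

def m3_swap_operand_operator(expr, i):
    """M3: swap tokens i and i+1 if one is an operand and the other an
    operator; return None when the result would not be a valid floorplan."""
    a, b = expr[i], expr[i + 1]
    if (a in OPERATORS) == (b in OPERATORS):
        return None
    out = list(expr)
    out[i], out[i + 1] = b, a
    return out if is_valid(out) else None

def mutate(expr, rng):
    """Apply one randomly chosen move, retrying until a move is legal."""
    while True:
        move = rng.choice((1, 2, 3))
        if move == 1:
            n_operands = sum(t not in OPERATORS for t in expr)
            cand = m1_swap_operands(expr, rng.randrange(n_operands - 1))
        elif move == 2:
            ops = [i for i, t in enumerate(expr) if t in OPERATORS]
            cand = m2_complement_chain(expr, rng.choice(ops))
        else:
            cand = m3_swap_operand_operator(expr, rng.randrange(len(expr) - 1))
        if cand is not None:
            return cand

def spawn(initial, n, rng):
    """Build n solutions; each new one is the previous one mutated n times."""
    population = [list(initial)]
    for _ in range(n - 1):
        nxt = population[-1]
        for _ in range(n):
            nxt = mutate(nxt, rng)
        population.append(nxt)
    return population
```

For example, `m1_swap_operands(list("12v3h4v5h6v"), 1)` swaps operands 2 and 3 and gives `13v2h4v5h6v`, while `spawn(list("12v3h4v5h6v"), 5, random.Random(0))` produces a five-member initial population.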

The number of offspring to produce through crossover is passed on the command line as a percentage of the total population.

Mutation, as described earlier, is one of the three move operations from the simulated annealing floorplanning algorithm. Mutation is applied only to the offspring produced by crossover; the percentage of offspring to mutate is passed to the program on the command line. Mutation keeps the solutions from stagnating too quickly at a local minimum by introducing new possible floorplans.

The last step of the genetic algorithm is merging. This part of the algorithm decides which offspring to keep for the next round. If an offspring has a smaller cost (area of the floorplan) than the population member with the worst cost, that population member is thrown out and the offspring replaces it. This is done for all the offspring.

The genetic algorithm is performed as follows. First the initial solutions are spawned, then crossover, mutation and merging are performed for a number of generations. After all generations the best solution is chosen. The number of generations to run is specified on the command line at run-time. Here is the pseudo-code for the genetic floorplanning algorithm:

    Spawn initial population
    For G generations
        Produce offspring through crossover
        Mutate offspring
        Merge offspring into population
    Choose best solution

Distributed Genetic

The genetic algorithm is modified slightly to make it distributed. A number of instances of the genetic algorithm are spawned and run independently and in parallel for a set number of generations. The separate instances then stop and trade solutions with each other, to introduce diversity into their populations and keep them from stagnating at local minima. They repeat this process for a set number of epochs, which can also be specified on the command line.
After all epochs the best solution is chosen from all the instances of the genetic algorithm. To keep the separate instances from reaching the same local minimum, only one crossover function is used per instance: thread A uses crossover function A mod 4. The following is a graphical model of the distributed genetic algorithm using two threads:

[Figure: two parallel threads, each with the stages Spawn Init Population, Run Genetic, Trade Data, followed by a single Choose Best stage]
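The merge rule, the generational loop and the epoch structure described above can be sketched as below. This is only an illustration under several assumptions: the instances run one after another rather than in threads, the crossover and mutation functions are passed in as callables because the four crossover functions are only shown as figures, each instance is assumed to trade its single best solution, and the `area` cost assumes a vertical cut places two sub-floorplans side by side while a horizontal cut stacks them. All function and parameter names are hypothetical.

```python
import random

def area(expr, sizes):
    """Bounding-box area of a floorplan given as a postfix token list.
    sizes maps each module name to a (width, height) pair."""
    stack = []
    for tok in expr:
        if tok in ("h", "v"):
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if tok == "v":   # assumed: vertical cut, blocks side by side
                stack.append((w1 + w2, max(h1, h2)))
            else:            # assumed: horizontal cut, blocks stacked
                stack.append((max(w1, w2), h1 + h2))
        else:
            stack.append(sizes[tok])
    w, h = stack.pop()
    return w * h

def merge(population, offspring, cost):
    """An offspring replaces the current worst member if it costs less."""
    pop = list(population)
    for child in offspring:
        worst = max(range(len(pop)), key=lambda i: cost(pop[i]))
        if cost(child) < cost(pop[worst]):
            pop[worst] = child
    return pop

def run_generations(population, cost, crossover, mutate, generations,
                    offspring_frac, mutation_frac, rng):
    """One instance: crossover, mutate some offspring, merge, repeat."""
    pop = list(population)
    for _ in range(generations):
        n_children = int(len(pop) * offspring_frac)
        children = [crossover(*rng.sample(pop, 2)) for _ in range(n_children)]
        n_mutants = int(len(children) * mutation_frac)
        for i in rng.sample(range(len(children)), n_mutants):
            children[i] = mutate(children[i])
        pop = merge(pop, children, cost)
    return pop

def run_islands(islands, cost, crossovers, mutate, epochs, generations,
                offspring_frac, mutation_frac, rng):
    """Run every instance for an epoch, trade, repeat, then pick the best."""
    for _ in range(epochs):
        # Instance i uses crossover function i mod (number of functions).
        islands = [run_generations(pop, cost, crossovers[i % len(crossovers)],
                                   mutate, generations, offspring_frac,
                                   mutation_frac, rng)
                   for i, pop in enumerate(islands)]
        # Assumed trade: every instance offers its best solution to the others.
        bests = [min(pop, key=cost) for pop in islands]
        islands = [merge(pop, bests[:i] + bests[i + 1:], cost)
                   for i, pop in enumerate(islands)]
    return min((member for pop in islands for member in pop), key=cost)
```

Because `merge` only ever replaces the worst member with something cheaper, the worst cost in a population never increases from one generation to the next.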

Code Highlights

One particularly hard problem to solve in development was having the threads communicate. Each thread must give data once to every other thread and must receive data once from every other thread; in other words, it is a handshaking problem. To make the problem even harder, only the main program thread knows who all the other threads are; the threads that actually need to communicate have no idea who the other threads are. I solved the problem by creating a buffer that all the threads share. The main program signals whose turn it is to write the buffer, the writing thread signals when the buffer is ready to be read, and each reading thread increments a counter when it is done reading, so the main thread can check when all the threads have finished reading and it is safe to signal the next writer:

Main thread:

    Set the writer to thread 0
    For N threads
        Signal it is ok to write
        Wait until all threads have read the buffer (except the writer)
        Signal reading is not ok
        Set the read counter to 0
        Writer = writer + 1

Child threads:

    For N threads
        Wait until writing is ok
        If I am not the writer
            Wait until reading is ok
            Read the buffer
            Increment the read counter to signal I have completed reading
        Else
            Signal writing is not ok
            Write the buffer
            Signal reading is ok

Results

The results I have come up with are misleading as to the effectiveness of the algorithm. They show the distributed genetic floorplanning algorithm performing considerably slower than the simulated annealing floorplanning algorithm. There were two major reasons my algorithm didn't perform as well as expected. The first is that I was running it on a machine that supported only a single thread in flight at once. Because I was running four threads in my tests, the distributed algorithm had an almost 4x increase in run-time (not quite 4x, because trading is done sequentially among the threads).
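The sequential trading mentioned above follows the shared-buffer handshake from Code Highlights. A minimal Python sketch of that idea is given here; it is a variation rather than a transcription of the pseudo-code, since it uses one condition variable and a writer turn number in place of separate "writing ok" and "reading ok" signals. The function name `trade_all` and the `state` fields are hypothetical.

```python
import threading

def trade_all(payloads):
    """Each of n threads writes payloads[i] to a shared buffer once and
    reads every other thread's payload once. Returns what each thread read."""
    n = len(payloads)
    cond = threading.Condition()
    state = {"writer": 0, "ready": False, "reads": 0, "buffer": None}
    received = [[] for _ in range(n)]

    def child(me):
        for turn in range(n):
            with cond:
                if turn == me:
                    # My turn: wait until the main thread has handed it over.
                    cond.wait_for(lambda: state["writer"] == turn
                                  and not state["ready"])
                    state["buffer"] = payloads[me]
                    state["ready"] = True          # signal reading is ok
                else:
                    # Not my turn: wait until this turn's writer has written.
                    cond.wait_for(lambda: state["writer"] == turn
                                  and state["ready"])
                    received[me].append(state["buffer"])
                    state["reads"] += 1            # signal I have finished reading
                cond.notify_all()

    threads = [threading.Thread(target=child, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()

    # Main thread: advance the turn once the buffer has been written and
    # every thread except the writer has read it.
    for turn in range(n):
        with cond:
            cond.wait_for(lambda: state["ready"] and state["reads"] == n - 1)
            state["reads"] = 0
            state["ready"] = False                 # reading is no longer ok
            state["writer"] = turn + 1             # next thread may write
            cond.notify_all()

    for t in threads:
        t.join()
    return received
```

With three threads, `trade_all(["a", "b", "c"])` returns `[["b", "c"], ["a", "c"], ["a", "b"]]`: each thread ends up with every other thread's data, in writer order. Tying each wait to the turn number means a slow reader cannot mistake the next writer's turn for the current one.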
The other major slowdown in my implementation of the algorithm was the floorplanning data structure. I reused my data structure from the simulated annealing floorplanner. This data structure was optimized for doing moves and undos, and it also used a large amount of memory (to make it faster). That wasn't an issue there, because the simulated annealing floorplanner used only two floorplan objects (one for the current solution and one for the best). The most common operations in the distributed algorithm are making new floorplans (the crossover stage) and removing and copying floorplans (the merging and trading stages). Removing a floorplan involves dynamically deallocating memory, and copying and creating floorplans involve dynamically allocating it, each with a corresponding request to the operating system. The distributed genetic floorplanner also requires a large number of floorplan objects, which ends up using a large amount of memory (4 million objects overflowed 1 GB of RAM).

Here is a summary of run times and areas produced by both the multithreaded genetic and simulated annealing algorithms, followed by a graph of the parameters required to generate the provided solutions in the multithreaded genetic algorithm:

[Table: Area and Run-time versus Modules. Columns: Number of Modules; Multithreaded Genetic Area, Run-time (sec); Simulated Annealing Area, Run-time (sec)]

[Graph: Population, Epochs and Run-time versus Modules. Series: Population, Epochs, Run-time (hundreds of secs)]

A measure of the algorithm that I feel is important is the number of solutions generated. With 200 modules, the simulated annealing algorithm generates [count missing] solutions, while the distributed genetic algorithm looks at 205 solutions per thread. This means that if the time to generate one solution in the distributed genetic algorithm were cut down to 10x that of the simulated annealing algorithm, then on a machine that can keep all four genetic threads in flight at once the run time would be about ¼ that of the simulated annealing algorithm.

The code size of the multithreaded genetic program is 40.8 KB with an executable of 50.2 KB, while the simulated annealing program has a code size of 20.1 KB and an executable of 31.7 KB.

Conclusion

Overall I am pleased with my implementation of the algorithm. I feel it would benefit greatly from a different floorplan data structure. With some restructuring, the data structure could avoid dynamically allocating and deallocating memory, which takes a large amount of time. It could also avoid storing the module sizes, since they are the same for each instance of a module in every floorplan; currently the size of a given module is stored in every floorplan, which is quite wasteful. If the data structure is rewritten, the merging function could be rewritten as well. It currently has O(M^4) complexity, where M is the number of modules in the floorplan; unfortunately, with the current data structure this implementation is necessary. I think this algorithm will work quite well on hyper-threaded machines. It is an efficient way to distribute the genetic algorithm and seems to be a natural extension of it.

References

Cohoon, J.P., Hegde, S.U., Martin, W.N., Richards, D.S. "Distributed Genetic Algorithms for the Floorplan Design Problem," IEEE Trans. Computer-Aided Design, vol. 10, no. 4, April.

Rose, J.B., Snelgrove, W.M., Vranesic, Z.G. "Parallel standard cell placement algorithms with quality equivalent to simulated annealing," IEEE Trans. Computer-Aided Design, vol. 7, no. 3, Mar.

Sait, S.M., Youssef, H. VLSI Physical Design Automation: Theory and Practice, World Scientific Publishing Co., River Edge, NJ.

Genetic Placement: Genie Algorithm Way Sern Shong ECE556 Final Project Fall 2004

Genetic Placement: Genie Algorithm Way Sern Shong ECE556 Final Project Fall 2004 Genetic Placement: Genie Algorithm Way Sern Shong ECE556 Final Project Fall 2004 Introduction Overview One of the principle problems in VLSI chip design is the layout problem. The layout problem is complex

More information

Summary of Computer Architecture

Summary of Computer Architecture Summary of Computer Architecture Summary CHAP 1: INTRODUCTION Structure Top Level Peripherals Computer Central Processing Unit Main Memory Computer Systems Interconnection Communication lines Input Output

More information

A Genetic Algorithm for VLSI Floorplanning

A Genetic Algorithm for VLSI Floorplanning A Genetic Algorithm for VLSI Floorplanning Christine L. Valenzuela (Mumford) 1 and Pearl Y. Wang 2 1 Cardiff School of Computer Science & Informatics, Cardiff University, UK. C.L.Mumford@cs.cardiff.ac.uk

More information

Genetic Algorithm for Circuit Partitioning

Genetic Algorithm for Circuit Partitioning Genetic Algorithm for Circuit Partitioning ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

CAD Algorithms. Placement and Floorplanning

CAD Algorithms. Placement and Floorplanning CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:

More information

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18 PROCESS VIRTUAL MEMORY CS124 Operating Systems Winter 2015-2016, Lecture 18 2 Programs and Memory Programs perform many interactions with memory Accessing variables stored at specific memory locations

More information

Two Efficient Algorithms for VLSI Floorplanning. Chris Holmes Peter Sassone

Two Efficient Algorithms for VLSI Floorplanning. Chris Holmes Peter Sassone Two Efficient Algorithms for VLSI Floorplanning Chris Holmes Peter Sassone ECE 8823A July 26, 2002 1 Table of Contents 1. Introduction 2. Traditional Annealing 3. Enhanced Annealing 4. Contiguous Placement

More information

Machine Architecture. or what s in the box? Lectures 2 & 3. Prof Leslie Smith. ITNP23 - Autumn 2014 Lectures 2&3, Slide 1

Machine Architecture. or what s in the box? Lectures 2 & 3. Prof Leslie Smith. ITNP23 - Autumn 2014 Lectures 2&3, Slide 1 Machine Architecture Prof Leslie Smith or what s in the box? Lectures 2 & 3 ITNP23 - Autumn 2014 Lectures 2&3, Slide 1 Basic Machine Architecture In these lectures we aim to: understand the basic architecture

More information

GENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T. R.

GENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T. R. GENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T R FPGA PLACEMENT PROBLEM Input A technology mapped netlist of Configurable Logic Blocks (CLB) realizing a given circuit Output

More information

System Call. Preview. System Call. System Call. System Call 9/7/2018

System Call. Preview. System Call. System Call. System Call 9/7/2018 Preview Operating System Structure Monolithic Layered System Microkernel Virtual Machine Process Management Process Models Process Creation Process Termination Process State Process Implementation Operating

More information

The CPU and Memory. How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram:

The CPU and Memory. How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram: The CPU and Memory How does a computer work? How does a computer interact with data? How are instructions performed? Recall schematic diagram: 1 Registers A register is a permanent storage location within

More information

Computer System Overview OPERATING SYSTEM TOP-LEVEL COMPONENTS. Simplified view: Operating Systems. Slide 1. Slide /S2. Slide 2.

Computer System Overview OPERATING SYSTEM TOP-LEVEL COMPONENTS. Simplified view: Operating Systems. Slide 1. Slide /S2. Slide 2. BASIC ELEMENTS Simplified view: Processor Slide 1 Computer System Overview Operating Systems Slide 3 Main Memory referred to as real memory or primary memory volatile modules 2004/S2 secondary memory devices

More information

School of Computer and Information Science

School of Computer and Information Science School of Computer and Information Science CIS Research Placement Report Multiple threads in floating-point sort operations Name: Quang Do Date: 8/6/2012 Supervisor: Grant Wigley Abstract Despite the vast

More information

Section 7: Wait/Exit, Address Translation

Section 7: Wait/Exit, Address Translation William Liu October 15, 2014 Contents 1 Wait and Exit 2 1.1 Thinking about what you need to do.............................. 2 1.2 Code................................................ 2 2 Vocabulary 4

More information

Chapter 3 - Memory Management

Chapter 3 - Memory Management Chapter 3 - Memory Management Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ L. Tarrataca Chapter 3 - Memory Management 1 / 222 1 A Memory Abstraction: Address Spaces The Notion of an Address Space Swapping

More information

A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2

A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2 Chapter 5 A Genetic Algorithm for Graph Matching using Graph Node Characteristics 1 2 Graph Matching has attracted the exploration of applying new computing paradigms because of the large number of applications

More information

Genetic Algorithm for FPGA Placement

Genetic Algorithm for FPGA Placement Genetic Algorithm for FPGA Placement Zoltan Baruch, Octavian Creţ, and Horia Giurgiu Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

A Simple Efficient Circuit Partitioning by Genetic Algorithm

A Simple Efficient Circuit Partitioning by Genetic Algorithm 272 A Simple Efficient Circuit Partitioning by Genetic Algorithm Akash deep 1, Baljit Singh 2, Arjan Singh 3, and Jatinder Singh 4 BBSB Engineering College, Fatehgarh Sahib-140407, Punjab, India Summary

More information

Selection Queries. to answer a selection query (ssn=10) needs to traverse a full path.

Selection Queries. to answer a selection query (ssn=10) needs to traverse a full path. Hashing B+-tree is perfect, but... Selection Queries to answer a selection query (ssn=) needs to traverse a full path. In practice, 3-4 block accesses (depending on the height of the tree, buffering) Any

More information

Database Systems II. Secondary Storage

Database Systems II. Secondary Storage Database Systems II Secondary Storage CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 29 The Memory Hierarchy Swapping, Main-memory DBMS s Tertiary Storage: Tape, Network Backup 3,200 MB/s (DDR-SDRAM

More information

CS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017

CS 137 Part 8. Merge Sort, Quick Sort, Binary Search. November 20th, 2017 CS 137 Part 8 Merge Sort, Quick Sort, Binary Search November 20th, 2017 This Week We re going to see two more complicated sorting algorithms that will be our first introduction to O(n log n) sorting algorithms.

More information

Using Genetic Algorithm with Triple Crossover to Solve Travelling Salesman Problem

Using Genetic Algorithm with Triple Crossover to Solve Travelling Salesman Problem Proc. 1 st International Conference on Machine Learning and Data Engineering (icmlde2017) 20-22 Nov 2017, Sydney, Australia ISBN: 978-0-6480147-3-7 Using Genetic Algorithm with Triple Crossover to Solve

More information

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface. Placement Introduction A very important step in physical design cycle. A poor placement requires larger area. Also results in performance degradation. It is the process of arranging a set of modules on

More information

Introduction VLSI PHYSICAL DESIGN AUTOMATION

Introduction VLSI PHYSICAL DESIGN AUTOMATION VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.

More information

Computer System Overview

Computer System Overview Computer System Overview Operating Systems 2005/S2 1 What are the objectives of an Operating System? 2 What are the objectives of an Operating System? convenience & abstraction the OS should facilitate

More information

Simulated annealing/metropolis and genetic optimization

Simulated annealing/metropolis and genetic optimization Simulated annealing/metropolis and genetic optimization Eugeniy E. Mikhailov The College of William & Mary Lecture 18 Eugeniy Mikhailov (W&M) Practical Computing Lecture 18 1 / 8 Nature s way to find a

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture II: Indexing Part I of this course Indexing 3 Database File Organization and Indexing Remember: Database tables

More information

Constructive floorplanning with a yield objective

Constructive floorplanning with a yield objective Constructive floorplanning with a yield objective Rajnish Prasad and Israel Koren Department of Electrical and Computer Engineering University of Massachusetts, Amherst, MA 13 E-mail: rprasad,koren@ecs.umass.edu

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Using Genetic Algorithms to Solve the Box Stacking Problem

Using Genetic Algorithms to Solve the Box Stacking Problem Using Genetic Algorithms to Solve the Box Stacking Problem Jenniffer Estrada, Kris Lee, Ryan Edgar October 7th, 2010 Abstract The box stacking or strip stacking problem is exceedingly difficult to solve

More information

Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE

Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE Ken Christensen Department of Computer Science and Engineering University of South Florida Tampa, FL 33620 christen@cse.usf.edu (813) 974-4761

More information

Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE

Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE Rapid PHY Selection (RPS): Emulation and Experiments using PAUSE Ken Christensen Department of Computer Science and Engineering University of South Florida Tampa, FL 33620 christen@cse.usf.edu (813) 974-4761

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Genetic programming. Lecture Genetic Programming. LISP as a GP language. LISP structure. S-expressions

Genetic programming. Lecture Genetic Programming. LISP as a GP language. LISP structure. S-expressions Genetic programming Lecture Genetic Programming CIS 412 Artificial Intelligence Umass, Dartmouth One of the central problems in computer science is how to make computers solve problems without being explicitly

More information

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat:

Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: Local Search Local Search (Greedy Descent): Maintain an assignment of a value to each variable. Repeat: Select a variable to change Select a new value for that variable Until a satisfying assignment is

More information

A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture

A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture Robert S. French April 5, 1989 Abstract Computational origami is a parallel-processing concept in which

More information

Optimizing Testing Performance With Data Validation Option

Optimizing Testing Performance With Data Validation Option Optimizing Testing Performance With Data Validation Option 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording

More information

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg

Short-term Memory for Self-collecting Mutators. Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg Short-term Memory for Self-collecting Mutators Martin Aigner, Andreas Haas, Christoph Kirsch, Ana Sokolova Universität Salzburg CHESS Seminar, UC Berkeley, September 2010 Heap Management explicit heap

More information

The Parallel Software Design Process. Parallel Software Design

The Parallel Software Design Process. Parallel Software Design Parallel Software Design The Parallel Software Design Process Deborah Stacey, Chair Dept. of Comp. & Info Sci., University of Guelph dastacey@uoguelph.ca Why Parallel? Why NOT Parallel? Why Talk about

More information

Caching Basics. Memory Hierarchies

Caching Basics. Memory Hierarchies Caching Basics CS448 1 Memory Hierarchies Takes advantage of locality of reference principle Most programs do not access all code and data uniformly, but repeat for certain data choices spatial nearby

More information

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14

CS24: INTRODUCTION TO COMPUTING SYSTEMS. Spring 2014 Lecture 14 CS24: INTRODUCTION TO COMPUTING SYSTEMS Spring 2014 Lecture 14 LAST TIME! Examined several memory technologies: SRAM volatile memory cells built from transistors! Fast to use, larger memory cells (6+ transistors

More information

CSE332: Data Abstractions Lecture 7: B Trees. James Fogarty Winter 2012

CSE332: Data Abstractions Lecture 7: B Trees. James Fogarty Winter 2012 CSE2: Data Abstractions Lecture 7: B Trees James Fogarty Winter 20 The Dictionary (a.k.a. Map) ADT Data: Set of (key, value) pairs keys must be comparable insert(jfogarty,.) Operations: insert(key,value)

More information

DETERMINING MAXIMUM/MINIMUM VALUES FOR TWO- DIMENTIONAL MATHMATICLE FUNCTIONS USING RANDOM CREOSSOVER TECHNIQUES

DETERMINING MAXIMUM/MINIMUM VALUES FOR TWO- DIMENTIONAL MATHMATICLE FUNCTIONS USING RANDOM CREOSSOVER TECHNIQUES DETERMINING MAXIMUM/MINIMUM VALUES FOR TWO- DIMENTIONAL MATHMATICLE FUNCTIONS USING RANDOM CREOSSOVER TECHNIQUES SHIHADEH ALQRAINY. Department of Software Engineering, Albalqa Applied University. E-mail:

More information

Kanban Scheduling System

Kanban Scheduling System Kanban Scheduling System Christian Colombo and John Abela Department of Artificial Intelligence, University of Malta Abstract. Nowadays manufacturing plants have adopted a demanddriven production control

More information

Optimizing Replication, Communication, and Capacity Allocation in CMPs

Optimizing Replication, Communication, and Capacity Allocation in CMPs Optimizing Replication, Communication, and Capacity Allocation in CMPs Zeshan Chishti, Michael D Powell, and T. N. Vijaykumar School of ECE Purdue University Motivation CMP becoming increasingly important

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Fundamentals of Computer Systems and Engineering Fall 2017 The Operating System (OS) Prof. John Board Duke University Slides are derived from work by Profs. Tyler Bletsch and Andrew Hilton (Duke)

More information

NETW3005 Operating Systems Lecture 1: Introduction and history of O/Ss

NETW3005 Operating Systems Lecture 1: Introduction and history of O/Ss NETW3005 Operating Systems Lecture 1: Introduction and history of O/Ss General The Computer Architecture section SFDV2005 is now complete, and today we begin on NETW3005 Operating Systems. Lecturers: Give

More information

A Hybrid Genetic Algorithms and Tabu Search for Solving an Irregular Shape Strip Packing Problem
