Parallel Block-Layered Nonbinary QC-LDPC Decoding on GPU

Huyen Thi Pham, Sabooh Ajaz and Hanho Lee
Department of Information and Communication Engineering, Inha University, Incheon, Korea

Abstract: This paper presents an efficient implementation of a parallel block-layered nonbinary quasi-cyclic low-density parity-check (NB-QC-LDPC) decoder on a graphics processing unit (GPU) to achieve significant improvements in both flexibility and scalability. An efficient block-layered scheme and a data structure suitable for parallel computing are proposed to perform decoding on the GPU. The scheme is applied to a min-max decoding algorithm that exploits the inherent massive parallelism of the NB-QC-LDPC decoder. The results demonstrate that the layered scheme can be efficiently implemented on a GPU device. Moreover, experimental results show that the proposed GPU-based block-layered NB-QC-LDPC decoder provides a faster decoding runtime than a CPU-based implementation and obtains a coding gain at a low BER of $10^{-10}$ and a low FER of $10^{-7}$.

Keywords: nonbinary; quasi-cyclic; LDPC; GPU; parallel computation; CUDA

I. INTRODUCTION

Binary low-density parity-check (LDPC) codes, whose performance approaches the Shannon limit for long code lengths, were investigated by Gallager [1]. Recently, nonbinary LDPC (NB-LDPC) codes [2-7] have attracted a tremendous amount of research interest because of their excellent error correction capabilities. Davey and MacKay [2] showed that NB-LDPC codes provide significant performance improvements when the code lengths are short or moderate. However, the decoding algorithms for NB-LDPC codes require complex computations and large memories [6]. NB-LDPC codes over higher-order Galois fields GF(q) have been shown to provide better performance; however, the decoding complexity grows rapidly with q, and simulation on a central processing unit (CPU) becomes extremely slow.
Therefore, it is impractical to observe the error-floor behavior of NB-LDPC codes at low bit error rate (BER) and frame error rate (FER) using CPU-based simulations. Recently, graphics processing units (GPUs) have been widely used for their high computational power, with which they can execute numerous threads simultaneously. NVIDIA introduced the Compute Unified Device Architecture (CUDA), based on the high-level C programming language, which offers a software environment that facilitates the development of high-performance applications. With its many-core architecture, the GPU provides massively parallel computation threads that can accelerate simulations of NB-LDPC decoding. The implementation of NB-LDPC decoders on GPU devices is currently being actively researched [8-12], but it remains very challenging. The recent work of Beermann et al. [12] applied a layered scheme to the belief propagation algorithm on the GPU. Our work demonstrates that the horizontal-layered scheme can also be efficiently implemented with a min-max algorithm on a GPU device.

In this paper, an efficient GPU-based implementation of the parallel block-layered NB-QC-LDPC decoder is presented to accelerate the decoding process. The rest of this paper is organized as follows. In Section II, NB-LDPC codes are briefly reviewed, and an efficient parallel block-layered min-max algorithm for the GPU is proposed. In Section III, the proposed parallel architecture and its implementation on a GPU using CUDA are described. Experimental results are presented in Section IV. Finally, conclusions are given in Section V.

© 2015 IEEE

II. NB-LDPC CODES AND DECODING ALGORITHM

A. NB-LDPC Codes

An (N, K) NB-LDPC code (N code symbols, K information symbols, and M = N - K parity symbols) is defined by a parity-check matrix H, which contains a small proportion of nonzero Galois-field elements. The NB-QC-LDPC code can be illustrated by a Tanner graph.
Each row of H corresponds to a check node, and each column of H corresponds to a variable node on the Tanner graph. These codes offer good BER performance and efficient parallel processing. Zhou et al. [3] presented two algebraic constructions for NB-LDPC codes based on array dispersions of matrices. In [3], the parity-check matrices of NB-QC-LDPC codes are row-column (RC) constrained arrays whose elements are expanded into circulant permutation matrices (CPMs) over nonbinary Galois fields. The RC constraint is a structural constraint on the rows and columns of H. A (744, 653) NB-LDPC code over GF($2^5$), generated using RC-constrained arrays, is used in the simulations of this study. Using the method of [3], a base submatrix H($d_v$, $d_c$) is generated. For the pair ($d_v$ = 3, $d_c$ = 24), $d_v$ is the column weight (variable node degree) and $d_c$ is the row weight (check node degree). Each element of the submatrix H(3, 24) is dispersed either as an all-zero matrix of size (q-1) × (q-1) or as an α-multiplied CPM of size (q-1) × (q-1). Fig. 1(a) shows the (744, 653) NB-QC-LDPC code constructed from the submatrix H(3, 24). In an α-multiplied CPM generated by dispersing an element $\alpha^i$, the first row has a single nonzero entry, in the i-th position; all other entries are zero. Each of the other rows is the right cyclic shift of the previous row multiplied by α. Fig. 1(b) shows an α-multiplied CPM for $\alpha^2$.

Fig. 1. (a) H matrix for a (744, 653) NB-QC-LDPC code over GF($2^5$). (b) Example of an α-multiplied CPM for $\alpha^2$.

B. Proposed GPU-based Parallel Block-layered Min-max Decoding Algorithm

In this section, horizontal layered decoding [5, 7] is applied to reduce both the memory and the number of decoding iterations. The H matrix is divided into layers, and within each iteration the decoding is performed sequentially, one layer at a time. We propose a parallel block-layered min-max decoding algorithm on a GPU, shown in Algorithm 1, in which kernels are designed to process (q-1) check nodes simultaneously. One block layer consists of (q-1) nonoverlapping rows; each column of a block layer has weight one.

Algorithm 1: Proposed GPU-based parallel block-layered min-max decoding algorithm
  Initialization:
    $L_n(a) = \ln\big(\Pr(x_n = s_n \mid \text{channel}) / \Pr(x_n = a \mid \text{channel})\big)$; $L_n^{1,0} = L_n$; $R_{mn}^{0,l} = 0$;
  Iterations:
  For (k = 1; k <= I_max; k++)        // loop over iterations
    For (l = 1; l <= L; l++)          // loop over layers
      For (m = 0; m < q-1; m++)       // (q-1) check nodes processed in parallel
        Step 1: $\tilde{L}_{n,m}^{k,l}(a) = L_n^{k,l-1}(a) - R_{mn}^{k-1,l}(a)$;
                $\tilde{L}_{n,m}^{\min} = \min_{a \in GF(q)} \tilde{L}_{n,m}^{k,l}(a)$;
                $\tilde{L}_{n,m}^{k,l}(a) = \tilde{L}_{n,m}^{k,l}(a) - \tilde{L}_{n,m}^{\min}$;
        Step 2: $R_{mn}^{k,l}(a) = \min_{(a_{n'})_{n' \in N(m)} \in \mathcal{L}_{mn}(a)} \Big( \max_{n' \in N(m) \setminus \{n\}} \tilde{L}_{n',m}^{k,l}(a_{n'}) \Big)$;
        Step 3: $L_n^{k,l} = \tilde{L}_{n,m}^{k,l} + R_{mn}^{k,l}$;
      End for
    End for
    Decision: $\hat{x}_n = \arg\min_{a} L_n^{k,l}(a)$;
  End for

Here N(m) is the set of variable nodes connected to check node m, and $\mathcal{L}_{mn}(a)$ denotes the set of symbol configurations that satisfy check m with $a_n = a$.

Algorithm 1 is briefly summarized as follows. The layered decoding divides the rows of H into L = $d_v$ layers. Layers 1, 2, ..., $d_v$ are processed sequentially to complete a single iteration, and the extrinsic values are exchanged among the layers. This process is repeated until the number of iterations reaches the maximum value $I_{max}$ or until the parity check is satisfied. The initialization of the parallel block-layered min-max decoding algorithm follows the Initialization step of Algorithm 1.
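The initialization and Steps 1 and 3 of Algorithm 1 can be sketched on the CPU as follows. This is a minimal Python sketch, not the CUDA kernels of this paper: the function names (`channel_llr`, `v2c_message`, `app_update`, `decide`) and the plain-list message layout are assumptions made for illustration, and the check node update of Step 2 is left out.

```python
import math

def channel_llr(probs):
    """Initialization: L_n(a) = ln(Pr(x_n = s_n) / Pr(x_n = a)),
    where s_n is the most likely symbol, so min(L_n) == 0."""
    p_max = max(probs)
    return [math.log(p_max / p) for p in probs]

def v2c_message(app, r_old):
    """Step 1: subtract the previous C2V message from the a posteriori
    message, then normalize so that the smallest entry becomes zero."""
    raw = [l - r for l, r in zip(app, r_old)]
    offset = min(raw)
    return [v - offset for v in raw]

def app_update(v2c, r_new):
    """Step 3: new a posteriori message = V2C message + new C2V message."""
    return [v + r for v, r in zip(v2c, r_new)]

def decide(app):
    """Decision: index of the symbol with the smallest metric."""
    return min(range(len(app)), key=app.__getitem__)
```

With this convention every message vector keeps one zero entry (the currently most likely symbol) and nonnegative entries elsewhere, which is what the min-max check node update expects.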
In addition, the variable-to-check (V2C) messages $\tilde{L}_{n,m}^{k,l}$ of layer l in iteration k are computed from $L_n^{k,l-1}$ and $R_{mn}^{k-1,l}$, which are the a posteriori messages of layer l-1 in iteration k and the check-to-variable (C2V) messages of layer l in iteration k-1, respectively. $\tilde{L}_{n,m}^{k,l}$ is expressed as follows:

$\tilde{L}_{n,m}^{k,l} = L_n^{k,l-1} - R_{mn}^{k-1,l}$    (1)

In the first layer of the first iteration, the V2C messages $L_n^{1,0}$ are the reliability information from the channel, $L_n$ ($L_n^{1,0} = L_n$), and the check node memory (CMEM) values are equal to zero ($R_{mn}^{0,l} = 0$). Let $x_n$ be the n-th symbol of a received codeword, and let $s_n$ be the most likely symbol of $x_n$. The $L_n$ vector has q elements: one zero element and (q-1) positive elements. The new C2V message $R_{mn}^{k,l}(a)$ is the minimum, over all symbol configurations that satisfy check m with $a_n = a$, of the maximum V2C message among the other variable nodes of the check; the a posteriori message is then updated as $L_n^{k,l} = \tilde{L}_{n,m}^{k,l} + R_{mn}^{k,l}$. This min-max decoding is implemented in the check node process with the forward-backward algorithm (FBA) [4]. This paper proposes a modified FBA that removes the multiplication by the nonzero elements of H from the conditional equation of the merger step, which reduces the complexity of the check node processing (CNP).

III. PARALLEL BLOCK-LAYERED NB-QC-LDPC DECODING ON GPU

A. Data Flow of NB-QC-LDPC Decoding on GPU

NVIDIA GPUs are powerful arithmetic engines that can run thousands of lightweight threads in parallel. A GPU-based heterogeneous platform has one or more CPUs and GPUs and is well suited to implementing NB-LDPC decoding algorithms. In addition, the NB-LDPC decoding algorithm has a high computation-to-memory-access ratio (CMAR), meaning that the amount of computation justifies the cost of moving data to and from the device. To exploit the processor architecture of modern GPUs, the application must first be profiled to identify the hotspots that can be parallelized. The runtime of the main blocks of the min-max algorithm was measured by running serial C code on a CPU.
The check node processing was found to be the bottleneck, accounting for 95.2% of the processing time. Hence, the CNP computations are parallelized on the GPU platform. In this section, we present an efficient implementation of a parallel block-layered NB-QC-LDPC decoding scheme on a GPU platform to accelerate decoding. Fig. 2 shows the data flow of the parallel block-layered decoding scheme on the CPU and GPU platforms. The CUDA program is divided into two tasks: one for the CPU and one for the GPU. The host CPU handles kernel scheduling, control of the decoding iterations, computation of the BER performance, and so on. The host CPU transfers the symbols of the received codeword to the GPU and then receives the decoded symbols from the GPU. Most of the decoding computations are carried out on the GPU. All intermediate messages are stored in device memory to limit data transfer between the host and the device. Each module in Fig. 2 corresponds to a kernel mapped onto the GPU platform.

Algorithm 2: Modified forward-backward algorithm
  Forward metrics:
    First step:      $F_0(a) = L_0(h_0^{-1} a)$    (2)
    Recursive step:  for i = 1 to $d_c$ - 2
                     $F_i(a) = \min_{a', a'' \in GF(q),\ a' + h_i a'' = a} \max\big(F_{i-1}(a'), L_i(a'')\big)$    (3)
  Backward metrics:
    First step:      $B_{d_c-1}(a) = L_{d_c-1}(h_{d_c-1}^{-1} a)$    (4)
    Recursive step:  for i = $d_c$ - 2 down to 1
                     $B_i(a) = \min_{a', a'' \in GF(q),\ a' + h_i a'' = a} \max\big(B_{i+1}(a'), L_i(a'')\big)$    (5)
  Modified merger:
    $M_0 = B_1$;  $M_{d_c-1} = F_{d_c-2}$    (6)
    $M_k(a) = \min_{a', a'' \in GF(q),\ a' + a'' = a} \max\big(F_{k-1}(a'), B_{k+1}(a'')\big)$, for 1 <= k <= $d_c$ - 2    (7)

Here $L_i$ is the V2C vector of the i-th variable node connected to the check node, and $h_i$ is the corresponding nonzero entry of H.

B. Data and Memory Structure

As mentioned in Section II, the H matrix is built from the ($d_v$, $d_c$) submatrix, whose elements are expanded into α-multiplied CPMs of size (q-1) × (q-1). To take advantage of the α-multiplied CPM structure, only the submatrix is stored in memory instead of the full H matrix. This method, called the compression technique, reduces the storage required for H and enables fast memory access. In this work, we propose a layered decoding scheme in which the (q-1) check nodes of a row block are processed simultaneously. Therefore, in this section, we describe the data and memory structures for single-layer processing. A total of $d_c$ × (q-1) V2C vectors for one layer are distributed among the (q-1) check nodes. The computations for the (q-1) check nodes of one layer need only (q-1) × $d_c$ × q messages, which are stored in a variable node processing (VNP) temporary memory.
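The compression technique can be illustrated with a short Python sketch: only the exponent i of each dispersed element $\alpha^i$ is kept, and the position and value of the single nonzero entry in any row of the (q-1) × (q-1) block are derived on demand. The rule used below (row r has its nonzero entry in column (i + r) mod (q-1), with value $\alpha^{(i+r) \bmod (q-1)}$) is an assumption read off from the description of the α-multiplied CPM in Section II-A, and the function names and the use of `None` for an all-zero block are hypothetical.

```python
Q = 32        # field size of GF(2^5), as used by the (744, 653) code
P = Q - 1     # CPM size (q - 1)

def cpm_entry(i, r, p=P):
    """Return (column, exponent) of the nonzero entry in row r of the
    alpha-multiplied CPM obtained by dispersing alpha^i."""
    col = (i + r) % p      # one right cyclic shift per row
    return col, col        # one multiplication by alpha per row (assumed rule)

def expand_cpm(i, p=P):
    """Expand one stored exponent into a dense p x p matrix of exponents
    (None marks a zero entry). Only for inspection; the decoder never needs it."""
    block = [[None] * p for _ in range(p)]
    for r in range(p):
        col, exp = cpm_entry(i, r, p)
        block[r][col] = exp
    return block

def check_row_neighbors(base_row, r, p=P):
    """Variable node indices and H exponents seen by row r of one block layer,
    given one row of the compressed submatrix (None = all-zero block)."""
    out = []
    for j, i in enumerate(base_row):
        if i is None:
            continue
        col, exp = cpm_entry(i, r, p)
        out.append((j * p + col, exp))
    return out
```

Under these assumptions a block layer of the (744, 653) code is described by 24 stored exponents rather than 31 × 744 matrix entries.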
Because only the messages of one layer are held at a time, the memory required for the layered scheme is reduced by a factor of the number of layers, $d_v$, compared to the flooding scheme. Fig. 3(a) depicts the 3D structure of the C2V vectors, [q, $d_c$, q-1]. The three dimensions are as follows: the width q corresponds to the q entries of a vector, the height $d_c$ corresponds to the $d_c$ V2C vectors connected to one check node, and the depth corresponds to the (q-1) check nodes. This structure allows the (q-1) check nodes, which operate in parallel, to access the $d_c$ V2C vectors in alignment. If a 3D array is laid out as [width, height, depth], each element [x, y, z] of the array is uniquely indexed by [x + y × width + z × width × height] in the 1D array, as shown in Fig. 3(b). By arranging the V2C vectors $\tilde{L}_{n,m}^{k,l}$ and the C2V vectors $R_{mn}^{k,l}$ in this format, q adjacent data entries are accessed by q adjacent threads; memory accesses are therefore coalesced, which achieves high throughput.

Fig. 2. Data flow of parallel block-layered NB-QC-LDPC decoding on CPU and GPU platforms.

Furthermore, additions and subtractions in GF(q) are implemented as exclusive OR (XOR) operations, and division by $\alpha^a$ is computed as multiplication by $\alpha^{(31-a) \% 31}$. Lookup tables for the nonbinary arithmetic in GF(q) are placed in the GPU's texture memory, which is available to all kernels. Two 2D lookup tables of size q × q are used for multiplication and addition; two 1D lookup tables of size q are used for conversion between the exponential and decimal representations. A 64-KB constant memory stores the values from the parity-check matrix, namely the values and indices of the variable nodes connected to each check node.

C. Parallel Forward-Backward Scheme in Check Node Processing

The proposed decoding algorithm is partitioned into four main kernels; the kernel sizes are listed in Table I. The kernel configuration parameters can be changed flexibly according to the GPU card used. Each check node computes forward (FD), backward (BD), and merger messages in sequence. However, the FD and BD messages are calculated independently of each other.
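The flattened [width, height, depth] message layout of Section III-B can be sketched as follows. This is a minimal Python sketch with hypothetical names (`flat_index`, `unflatten`, `thread_addresses`); the constants are the q = 32 and $d_c$ = 24 of the (744, 653) code. It shows that the q threads reading one message vector touch q consecutive addresses, which is the access pattern that allows coalescing.

```python
Q, DC = 32, 24                          # q and d_c for the (744, 653) code over GF(2^5)
WIDTH, HEIGHT, DEPTH = Q, DC, Q - 1     # [width, height, depth] = [q, d_c, q - 1]

def flat_index(x, y, z, width=WIDTH, height=HEIGHT):
    """1D position of element [x, y, z]: x + y*width + z*width*height."""
    return x + y * width + z * width * height

def unflatten(idx, width=WIDTH, height=HEIGHT):
    """Inverse mapping from a 1D position back to (x, y, z)."""
    z, rem = divmod(idx, width * height)
    y, x = divmod(rem, width)
    return x, y, z

def thread_addresses(y, z):
    """Addresses read by q adjacent threads (x = 0..q-1) for message vector y
    of check node z; they are consecutive, so the access can be coalesced."""
    return [flat_index(x, y, z) for x in range(WIDTH)]
```

For one layer the array holds WIDTH × HEIGHT × DEPTH = 23808 messages, which matches the (q-1) × $d_c$ × q count given in Section III-B.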
In this architecture, the FD and BD messages are processed simultaneously by using q threads for the forward step and q threads for the backward step. Once the forward and backward messages are available, the merger computation begins. Fig. 4(a) shows the detailed kernel architecture for one check node. The input messages of the kernel are stored in the VNP temporary memory. The output forward and backward messages are kept in the forward and backward memories for the later merger computation, and they are also stored in on-chip local memory for the computation of the next forward and backward messages. Because the $F_i$ and $B_j$ messages are used to compute $F_{i+1}$ and $B_{j-1}$ in the next step, on-chip local memory with high bandwidth and low latency is used to store the output messages $F_i$ and $B_j$, which speeds up the FD and BD steps. One step of the FD, BD, or merger computation is called an elementary step. The memory requirement for $F_i$ and $B_j$ in (q-1) check nodes is 2 × q × sizeof(float) × (q-1) bytes. For example, 7.75 KB of local memory is required for the 31 check nodes in GF($2^5$). A barrier synchronization function, __syncthreads(), is called after each forward or backward step, $F_i$ or $B_j$, to ensure that the threads are synchronized.

TABLE I. KERNEL ARCHITECTURE FOR MAIN BLOCKS OF THE DECODER OVER GF($2^5$)

Function          | Initial LLR   | CNP: FD, BD   | CNP: MG       | VNP           | Decision
No. thread blocks | d_c           | q-1           | q-1           | d_c           | 1
No. threads       | q×(q-1)       | q+q           | q×d_c         | q×(q-1)       | d_c×(q-1)
Total no. threads | d_c×q×(q-1)   | (q-1)×(q+q)   | (q-1)×q×d_c   | d_c×q×(q-1)   | d_c×(q-1)

Fig. 3. Data structure for coalescing memory access in CNP: (a) 3D structure of CN messages, (b) 1D structure of CN messages.

Fig. 4. Forward-backward kernel: (a) implementation of the FBA on GPU, (b) shared memory for random memory access.

Fig. 5. BER and FER performance of a (744, 653) block-layered NB-QC-LDPC code over GF($2^5$) with the min-max algorithm using the GPU. (Axes: bit/frame error rate versus Eb/N0 in dB; curves: BER and FER at 15 iterations.)

An elementary step of the forward computation has two input message vectors, $F_i(a')$ and $L_{i+1}(a'')$, and one output message vector, $F_{i+1}$. According to equations (3) and (5), there are q different pairs of a' and a'' that satisfy a' + h a'' = a in each forward or backward step. Suppose that the $F_1$ vector is to be computed, given h, the current V2C vector $L_1(a'')$, and the previous forward vector $F_0(a')$. To compute one message of the $F_1$ output vector, the q combinations of $F_0(a')$ and $L_1(a'')$ are determined by substituting into a' and a'' the indices that satisfy a' + h a'' = a. For each of the q pairs, the larger of $F_0(a')$ and $L_1(a'')$ is selected, which yields q candidate messages. The minimum of these q messages is then taken as the output message of the forward vector $F_1$. For example, for h = $\alpha^2$ and a = $\alpha^1$, there are 32 pairs that satisfy a' + $\alpha^2$ a'' = $\alpha^1$: (a', a'') = {(0, $\alpha^{30}$), ($\alpha^0$, $\alpha^{16}$), ($\alpha^1$, 0), ..., ($\alpha^{30}$, $\alpha^2$)}.
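The pair enumeration and the elementary forward step described above can be reproduced with a short Python sketch. The primitive polynomial $x^5 + x^2 + 1$ is an assumption (the paper does not state it); it is used here because it reproduces the pairs listed for h = $\alpha^2$ and a = $\alpha^1$. The table and function names are likewise illustrative. Field elements are stored in binary ("decimal") form so that addition is XOR, and `EXP`/`LOG` play the role of the two 1D conversion tables.

```python
Q = 32
PRIM_POLY = 0b100101            # x^5 + x^2 + 1 (assumed, not given in the paper)

EXP = [0] * (Q - 1)             # EXP[e] = binary form of alpha^e
LOG = [None] * Q                # LOG[v] = e with alpha^e == v (LOG[0] is undefined)
v = 1
for e in range(Q - 1):
    EXP[e], LOG[v] = v, e
    v <<= 1
    if v & Q:                   # reduce modulo the primitive polynomial
        v ^= PRIM_POLY

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % (Q - 1)]

def gf_div(a, b):
    """a / b for b != 0, computed as a * alpha^((31 - log b) % 31)."""
    if a == 0:
        return 0
    return EXP[(LOG[a] + (Q - 1 - LOG[b])) % (Q - 1)]

def pairs(h, a):
    """All q pairs (a1, a2) with a1 + h*a2 == a, for a1 = 0, alpha^0, ..., alpha^30."""
    return [(a1, gf_div(a ^ a1, h)) for a1 in [0] + EXP]

def forward_step(f_prev, l_cur, h):
    """Elementary step: F_i(a) = min over the q pairs of max(F_{i-1}(a1), L_i(a2))."""
    return [min(max(f_prev[a1], l_cur[a2]) for a1, a2 in pairs(h, a))
            for a in range(Q)]

def show(x):
    """Exponential notation of a field element, for printing."""
    return "0" if x == 0 else f"alpha^{LOG[x]}"
```

Calling `pairs(EXP[2], EXP[1])` gives (0, alpha^30), (alpha^0, alpha^16), (alpha^1, 0), ..., (alpha^30, alpha^2). The second components are the order in which the entries of $L_1$ are read, which is why the access pattern is not linear.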
As mentioned above, the variable node messages $L_1(a'')$ are stored in linear memory order. However, to compute the forward messages $F_1$ in an elementary step, the V2C messages $L_1(a'')$ are accessed in an arbitrary order; in the example above, the access order is {$\alpha^{30}$, $\alpha^{16}$, 0, ..., $\alpha^2$}. To address this problem, the V2C messages are copied to on-chip shared memory, which has high bandwidth and low latency, before the computation begins. Moreover, the additions and divisions are implemented with lookup tables in texture memory. First, the variable node messages are copied directly to shared memory. Then, the output is written using the indices computed from the lookup tables in texture memory, as shown in Fig. 4(b). Thus, bank conflicts are avoided and memory access is sped up.
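Putting the elementary steps of Section III-C together, Algorithm 2 can be sketched on the CPU and checked against an exhaustive min-max search. The sketch below is not the kernel code of this paper. It uses GF(8) with primitive polynomial $x^3 + x + 1$ so that the exhaustive check stays small, and the final re-indexing $R_k(s) = M_k(h_k s)$ is an assumed output convention that compensates for the multiplication removed from the merger condition.

```python
from itertools import product

Q, PRIM_POLY = 8, 0b1011            # GF(2^3), x^3 + x + 1 (small field, assumed)
EXP, LOG = [0] * (Q - 1), [None] * Q
v = 1
for e in range(Q - 1):
    EXP[e], LOG[v] = v, e
    v <<= 1
    if v & Q:
        v ^= PRIM_POLY

def gf_mul(a, b):
    return 0 if 0 in (a, b) else EXP[(LOG[a] + LOG[b]) % (Q - 1)]

def gf_div(a, b):                    # b must be nonzero
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % (Q - 1)]

def first_step(l, h):
    """Eqs. (2), (4): re-index a V2C vector by its contribution h * symbol."""
    return [l[gf_div(a, h)] for a in range(Q)]

def step(prev, l, h):
    """Eqs. (3), (5): min over a1 + h*a2 = a of max(prev(a1), L(a2))."""
    return [min(max(prev[a1], l[gf_div(a ^ a1, h)]) for a1 in range(Q))
            for a in range(Q)]

def merge(f, b):
    """Eq. (7): min over a1 + a2 = a of max(F(a1), B(a2)); no multiplication by h."""
    return [min(max(f[a1], b[a ^ a1]) for a1 in range(Q)) for a in range(Q)]

def fba_check_node(L, H):
    d = len(L)
    F, B = [None] * d, [None] * d
    F[0] = first_step(L[0], H[0])
    for i in range(1, d - 1):
        F[i] = step(F[i - 1], L[i], H[i])
    B[d - 1] = first_step(L[d - 1], H[d - 1])
    for i in range(d - 2, 0, -1):
        B[i] = step(B[i + 1], L[i], H[i])
    M = [B[1]] + [merge(F[k - 1], B[k + 1]) for k in range(1, d - 1)] + [F[d - 2]]
    # Assumed output convention: C2V message for symbol s is M_k(h_k * s).
    return [[M[k][gf_mul(H[k], s)] for s in range(Q)] for k in range(d)]

def brute_force_check_node(L, H):
    """Reference: enumerate every symbol configuration that satisfies the check."""
    d = len(L)
    R = [[float("inf")] * Q for _ in range(d)]
    for syms in product(range(Q), repeat=d):
        total = 0
        for h, s in zip(H, syms):
            total ^= gf_mul(h, s)
        if total:                    # parity check not satisfied
            continue
        for k in range(d):
            worst = max(L[j][syms[j]] for j in range(d) if j != k)
            R[k][syms[k]] = min(R[k][syms[k]], worst)
    return R
```

The forward and backward recursions in `fba_check_node` are independent of each other, which is what allows the two groups of q threads described above to run them concurrently.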

IV. EXPERIMENTAL RESULTS

The experimental setup used to evaluate the performance of the proposed NB-QC-LDPC decoder consisted of an NVIDIA GTX 650 Ti GPU and an Intel Core i7-4770 CPU. The CPU platform, an Intel Core i7-4770 at 3.4 GHz with 16 GB RAM, was used to run the serial C code. An NVIDIA GTX 650 Ti GPU with 768 CUDA cores at 0.97 GHz and 1024 MB of GDDR5 device memory was used to run the CUDA C code. In addition, an NVIDIA GTX TITAN Black graphics card was used to run the CUDA C code. CUDA toolkit v5.5 was used for the implementation. A regular (744, 653) NB-QC-LDPC code constructed over GF($2^5$) with a code rate of 0.877 was used in the simulation.

The decoding performance of the (744, 653) NB-QC-LDPC code and its random counterpart over an additive white Gaussian noise (AWGN) channel with binary phase shift keying (BPSK) is illustrated in Fig. 5. The simulation uses the min-max algorithm for the NB-QC-LDPC code over GF($2^5$) with 15 decoding iterations on the GPU. It demonstrates that the GPU-accelerated decoding process enables the detection of error floors at approximately $10^{-7}$ FER within days, instead of the weeks of computation required in C++. As a result, the GPU implementation compares well with VLSI approaches. Furthermore, the GPU processing led to superior FER and BER performance compared with the VLSI solutions [5, 7].

Table II shows the decoding runtime on the CPU platform and on various GPU devices. Execution times were obtained with CPU timers. For (744, 653) NB-QC-LDPC decoding on the GTX TITAN Black, the decoding runtime at 4 dB (see Table II) is 7.5 times faster than that of the CPU-based implementation. On the other hand, the different GPU devices were set up on different CPU platforms, which produced varying decoding runtimes. The GTX TITAN Black graphics card is more capable than the GTX 650 Ti, and its decoding runtime is approximately twice as fast as that of the GTX 650 Ti. Moreover, general-purpose GPUs can perform floating-point arithmetic operations with better accuracy, and hence lower BER, than very-large-scale integration (VLSI) LDPC decoders.

TABLE II. DECODING TIME (ms) ON DIFFERENT DEVICES AT $I_{max}$ = 10. (Columns, Eb/N0: 0-3 dB, 4 dB, 5 dB, 6 dB, 7 dB, 8 dB. Rows: CPU (Intel i7), GTX 650 Ti, GTX TITAN Black.)

Fig. 6 shows the average throughput of the decoder on the CPU platform and on the GPU platforms over different channel qualities. The platforms yield almost identical BER and FER results. However, their speeds differ; speed is measured by the average runtime per decoding iteration at different Eb/N0 values. In Fig. 6, from 1 dB to 3 dB, the throughput remains nearly constant at its lowest value. This is because the channel is poor at low Eb/N0 values, so the decoder has to run for the maximum of ten iterations. The throughput increases with increasing Eb/N0 because fewer decoding iterations are needed until a correct codeword is recovered.

Fig. 6. Average decoder throughput on GPUs and CPU at a maximum of 10 iterations. (Axes: throughput in Mbit/s versus Eb/N0 in dB; curves: GTX TITAN Black, GTX 650 Ti, CPU.)

Two factors affect the decoding runtime of the layered scheme in this study. First, the runtime depends on the number of layers, because decoding is performed sequentially on each layer to finish one iteration, instead of once per iteration as in the flooding scheme. Thus, the decoding time per iteration of the layered scheme can be estimated to be $d_v$ times that of the flooding scheme. However, the layered scheme approximately doubles the convergence speed of the iterative decoder, which means that the number of required decoding iterations can be significantly reduced compared with the flooding scheme. Second, the check node degree $d_c$ also affects the decoding runtime, because the FBA used in the CNP is executed sequentially in ($d_c$ - 1) steps. In conclusion, for the same decoding performance, the decoding time is a trade-off between the layered and flooding schemes.
Nonetheless, the layered scheme is more memory-efficient than the flooding scheme. To take advantage of the independent computation of the forward and backward steps, we used q threads for the forward step and q threads for the backward step so that both are processed simultaneously. Therefore, the running time of the CNP kernel is estimated to be halved compared to [8].

V. CONCLUSION

In this paper, we presented an efficient GPU-based implementation of the parallel block-layered NB-QC-LDPC decoder to accelerate the decoding process. Owing to its inherently massive parallelism, NB-QC-LDPC decoding is better suited to GPU implementation than the decoding of binary LDPC codes. The experimental results show that the GPU-based implementation of the layered decoding scheme for the NB-QC-LDPC code provides a faster decoding runtime and a coding gain at a low BER of $10^{-10}$ and a low FER of $10^{-7}$. A new solution is thereby provided for NB-QC-LDPC decoding on a GPU, which is more efficient than decoding on a CPU platform.

ACKNOWLEDGMENT

This research was supported by the Basic Science Research Program through the NRF funded by the Ministry of Science, ICT and Future Planning (213R1A2A2A168628).

REFERENCES

[1] R. G. Gallager, "Low density parity check codes," IRE Trans. on Information Theory, vol. 8, no. 1.
[2] M. C. Davey and D. MacKay, "Low-Density Parity Check Codes over GF(q)," IEEE Communications Letters, vol. 2, no. 6, Jun.
[3] B. Zhou, J. Kang, S. Song, S. Lin, K. A. Ghaffar, and M. Xu, "Construction of non-binary Quasi-cyclic LDPC codes by arrays and array dispersions," IEEE Trans. on Communications, vol. 57, no. 6, Jun. 2009.
[4] V. Savin, "Min-Max decoding for nonbinary LDPC codes," Proc. IEEE Int. Symp. Inf. Theory, Toronto, Canada, Jul. 2008.
[5] C.-S. Choi and H. Lee, "A Block-Layered Decoder Architecture for Quasi-Cyclic Non-Binary LDPC codes," Journal of Signal Processing Systems, vol. 78, no. 2, Feb.
[6] D. Declercq and M. Fossorier, "Decoding Algorithms for Nonbinary LDPC Codes Over GF(q)," IEEE Trans. on Communications, vol. 55, no. 4, Apr. 2007.
[7] X. Zhang and F. Cai, "Efficient Partial-Parallel Decoder Architecture for Quasi-Cyclic Nonbinary LDPC Codes," IEEE Trans. on Circuits and Systems I, vol. 58, no. 2, Feb.
[8] J. Andrade, G. Falcao, V. Silva, and K. Kasai, "FFT-SPA Non-binary LDPC decoding on GPU," Proc. IEEE International Conference on Speech and Signal Processing, Vancouver, BC, May 26-31, 2013.
[9] G. Wang, H. Shen, et al., "Parallel Nonbinary LDPC Decoding on GPU," Proc. 46th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, Nov. 4-7, 2012.
[10] M. Beermann, E. Monzo, L. Schmalen, and P. Vary, "High speed decoding of non-binary irregular LDPC codes using GPUs," Proc. IEEE Workshop on Signal Processing Systems, Oct. 16-18, 2013.
[11] H. Pham Thi, S. Ajaz, and H. Lee, "Efficient Min-max Nonbinary LDPC Decoding on GPU," IEEE SoC Design Conference (ISOCC), Nov. 3-6, 2014.
[12] M. Beermann, E. Monzo, L. Schmalen, and P. Vary, "GPU Accelerated Belief Propagation Decoding of Non-Binary LDPC Codes with Parallel and Sequential Scheduling," Journal of Signal Processing Systems, vol. 78, no. 1, Jan. 2015.

Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes

Reduced-Complexity Column-Layered Decoding and. Implementation for LDPC Codes Redued-Complexity Column-Layered Deoding and Implementation for LDPC Codes Zhiqiang Cui 1, Zhongfeng Wang 2, Senior Member, IEEE, and Xinmiao Zhang 3 1 Qualomm In., San Diego, CA 92121, USA 2 Broadom Corp.,

More information

Pipelined Multipliers for Reconfigurable Hardware

Pipelined Multipliers for Reconfigurable Hardware Pipelined Multipliers for Reonfigurable Hardware Mithell J. Myjak and José G. Delgado-Frias Shool of Eletrial Engineering and Computer Siene, Washington State University Pullman, WA 99164-2752 USA {mmyjak,

More information

DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Euncheol Kim, Gwan Choi, Mark Yeary *

DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Euncheol Kim, Gwan Choi, Mark Yeary * DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Eunheol Kim, Gwan Choi, Mark Yeary * Dept. of Eletrial Engineering, Texas A&M University, College Station, TX-77840

More information

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR

A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Malaysian Journal of Computer Siene, Vol 10 No 1, June 1997, pp 36-41 A DYNAMIC ACCESS CONTROL WITH BINARY KEY-PAIR Md Rafiqul Islam, Harihodin Selamat and Mohd Noor Md Sap Faulty of Computer Siene and

More information

We don t need no generation - a practical approach to sliding window RLNC

We don t need no generation - a practical approach to sliding window RLNC We don t need no generation - a pratial approah to sliding window RLNC Simon Wunderlih, Frank Gabriel, Sreekrishna Pandi, Frank H.P. Fitzek Deutshe Telekom Chair of Communiation Networks, TU Dresden, Dresden,

More information

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2 On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,

More information

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1.

Abstract. Key Words: Image Filters, Fuzzy Filters, Order Statistics Filters, Rank Ordered Mean Filters, Channel Noise. 1. Fuzzy Weighted Rank Ordered Mean (FWROM) Filters for Mixed Noise Suppression from Images S. Meher, G. Panda, B. Majhi 3, M.R. Meher 4,,4 Department of Eletronis and I.E., National Institute of Tehnology,

More information

Outline: Software Design

Outline: Software Design Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling

More information

Design of High Speed Mac Unit

Design of High Speed Mac Unit Design of High Speed Ma Unit 1 Harish Babu N, 2 Rajeev Pankaj N 1 PG Student, 2 Assistant professor Shools of Eletronis Engineering, VIT University, Vellore -632014, TamilNadu, India. 1 harishharsha72@gmail.om,

More information

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer

Cross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer Communiations and Networ, 2013, 5, 69-73 http://dx.doi.org/10.4236/n.2013.53b2014 Published Online September 2013 (http://www.sirp.org/journal/n) Cross-layer Resoure Alloation on Broadband Power Line Based

More information

High-level synthesis under I/O Timing and Memory constraints

High-level synthesis under I/O Timing and Memory constraints Highlevel synthesis under I/O Timing and Memory onstraints Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn, Eri Martin To ite this version: Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn,

More information

A {k, n}-secret Sharing Scheme for Color Images

A {k, n}-secret Sharing Scheme for Color Images A {k, n}-seret Sharing Sheme for Color Images Rastislav Luka, Konstantinos N. Plataniotis, and Anastasios N. Venetsanopoulos The Edward S. Rogers Sr. Dept. of Eletrial and Computer Engineering, University

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

Extracting Partition Statistics from Semistructured Data

Extracting Partition Statistics from Semistructured Data Extrating Partition Statistis from Semistrutured Data John N. Wilson Rihard Gourlay Robert Japp Mathias Neumüller Department of Computer and Information Sienes University of Strathlyde, Glasgow, UK {jnw,rsg,rpj,mathias}@is.strath.a.uk

More information

A Novel Validity Index for Determination of the Optimal Number of Clusters

A Novel Validity Index for Determination of the Optimal Number of Clusters IEICE TRANS. INF. & SYST., VOL.E84 D, NO.2 FEBRUARY 2001 281 LETTER A Novel Validity Index for Determination of the Optimal Number of Clusters Do-Jong KIM, Yong-Woon PARK, and Dong-Jo PARK, Nonmembers

More information

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT?

3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? 3-D IMAGE MODELS AND COMPRESSION - SYNTHETIC HYBRID OR NATURAL FIT? Bernd Girod, Peter Eisert, Marus Magnor, Ekehard Steinbah, Thomas Wiegand Te {girod eommuniations Laboratory, University of Erlangen-Nuremberg

More information

Acoustic Links. Maximizing Channel Utilization for Underwater
