Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio

Size: px

Start display at page:

Download "Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio"

Aleesha Walker
5 years ago
Views:

1 Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio Michael L Dickens, J Nicholas Laneman, and Brian P Dunn WINNF 11 Europe 2011-Jun-22/24

2 Overview! Background on relevant SDR! Problem formulation! 3 issues and our solutions! Example: NBFM decoding / CPU load! Conclusions 2 / 17 WINNF 11 Europe 2011-Jun-22/24

.. L-Way Sync Adder Hardware (Description) SDR }Kernel { Mapping

3 SDR Kernel Role Application Graph 1:L Demux Quadrature Demodulator... Subfilter #1... Subfilter #L Sink Source... L-Way Sync Adder Hardware (Description) SDR }Kernel { Mapping Buffering Thresholding Processing Scheduling 3 / 17 WINNF 11 Europe 2011-Jun-22/24

4 Relevant Work GNU Radio! Single shared GPP (C2D, Cell, PPC, Arm)! Ettus E series: could allow heterogeneous processing! Continuing to work towards GPU using NVIDIA CUDA! Each signal processing block is processor specific SCA / CORBA! Any processor that can be interfaced into CORBA! Heterogeneous processing via a-priori scheduling! Full availability of CPU, GPU, DSP, FPGA,! Each signal processing block is processor specific 4 / 17 WINNF 11 Europe 2011-Jun-22/24

Our Work: Surfer 10 Kernel Application API Graphics API and GUI Scheduler API Surfer

Kernel Hardware Signal Processing Block API Block Runner API Host Hardware Abstraction

Separated block overhead from processing! Block threshold determined globally!

5 Our Work: Surfer 10 Kernel Application API Graphics API and GUI Scheduler API Surfer Surfer Operating System and Hardware Abstraction Layer Operating System API, ABI, and Kernel Hardware Signal Processing Block API Block Runner API Host Hardware Abstraction Layer! Shared and/or dedicated processors! Pool of Runner Threads! Separated block overhead from processing! Block threshold determined globally! Overhead load distributed across all threads! Runtime statistics collection 5 / 17 WINNF 11 Europe 2011-Jun-22/24

6 Problem Statement To allow waveform reconfiguration at the block level, during runtime, without interrupting data processing To allow signal processing to be switched between processing devices while maintaining state! Requires a switch somewhere 6 / 17 WINNF 11 Europe 2011-Jun-22/24

7 Primary Issues Where and how to switch? Waveform or Signal Processing Block #1 Waveform or Signal Processing Block #2 7 / 17 WINNF 11 Europe 2011-Jun-22/24

8 Our Solution Split blocks Block Processing Flavors Combined Local CPU Generic SSE3 Optimized Processing, State, State Glue Lookup Table OpenCL Generic OpenCL Optimized and Glue AltiVec Optimized NEON Optimized! Processing is cleanly separated from state! Processing can take place on any device! Processing flavor can be optimized for any device! User can modify lookup table with new processing flavors 8 / 17 WINNF 11 Europe 2011-Jun-22/24

9 Another Issue How to keep state?! Must be easily copyable; require contiguous memory! Must be fully interpretable by any processor! Must provide alignment for each variable in structure! C++ API should closely match that for scalars, std::vector, std::string! Memory must be resizable, e.g., to accommodate vectors and strings 9 / 17 WINNF 11 Europe 2011-Jun-22/24

10 Our Solution Create processor-agnostic dynamic structure State Static Handles DSScalar<int32> DSScalar<float64> DSVector<int16> DSString DSScalar<bool> Variables Dynamic Pointers int32 I float64 F int16 V[N] char C[M] bool B Alignment, Padding, & Type Info Allocation Contiguous Memory Header I: 4 Bytes F: 8 Bytes V: 2N Bytes C: M Bytes B: 1 Byte 10 / 17 WINNF 11 Europe 2011-Jun-22/24

11 Final Issue How to control flavor selection?! Currently uses a Scheduler! Need a way to use A-priori collected statistics Runtime collected statistics! Need a way to control Computation flavor selection Internal Surfer parameters 9 / 17 WINNF 11 Europe 2011-Jun-22/24

12 Our Solution Create supervisor to handle monitoring system! Plugins include statistics collection for Host CPU load (per core) Computation throughput (per block) Data transfer throughput and latency Surfer load (per thread) Input / output data latency (per block) System overhead! Can control Computation flavor selection (per block) Threshold settings (per block) Any other Surfer parameters 11 / 17 WINNF 11 Europe 2011-Jun-22/24

13 Surfer 11 Changes! Signal Processing Kernel Graphics API and GUI Signal Processing Flavor Block API Block becomes Flavor! Scheduler becomes Supervisor Application API Supervisor Scheduler API Surfer Block Runner API! New C++ namespace for OS and hardware abstraction: Salt Salt: Surfer operating system and hardware Abstraction Layer Surfer Operating System and Hardware Abstraction Layer nacl: OpenCL Abstraction Layer Operating System API, ABI, and Kernel Host Hardware Abstraction Layer Hardware! New C++ namespace for OpenCL abstraction: nacl 12 / 17 WINNF 11 Europe 2011-Jun-22/24

14 Example 1. Start CPU load collection 2. Start Surfer Supervisor set to switch block flavors when host CPU load reaches 60% for 3 consecutive samples (0.3 seconds) Decoding NBFM, ~9 seconds of data 3. Wait 1.5 seconds 4. Start CPU load hog, runs for ~4 seconds! Expect to see Surfer CPU load decrease when CPU load hog is executing 13 / 17 WINNF 11 Europe 2011-Jun-22/24

15 Surfer Starts External Load Starts CPU Load Graph External Load Finishes Surfer Finishes 15 / 17 WINNF 11 Europe 2011-Jun-22/24

16 Conclusions! Enabled seamless dynamic runtime reconfiguration Split block implementations Portable state construct Supervisor to collect statistics and modify system runtime behavior 16 / 17 WINNF 11 Europe 2011-Jun-22/24

GPU Fundamentals Jeff Larkin November 14, 2016

GPU Fundamentals Jeff Larkin November 14, 2016 GPU Fundamentals Jeff Larkin , November 4, 206 Who Am I? 2002 B.S. Computer Science Furman University 2005 M.S. Computer Science UT Knoxville 2002 Graduate Teaching Assistant 2005 Graduate