Rama Sangireddy
Department of Electrical Engineering
University of Texas at Dallas, Richardson, TX 75080
Phone: (972) 883 6143; E-mail: rama.sangireddy@utdallas.edu

PROFESSIONAL APPOINTMENT:
Aug. 2003 - to date, Assistant Professor (tenure-track)
Department of Electrical Engineering, University of Texas at Dallas

EDUCATION:
Ph.D. in Computer Engineering, August 2003
Iowa State University, Ames, Iowa, USA
Dissertation: On-Chip Adaptive Components for Balanced Computing
Received Research Excellence Award for outstanding doctoral research.

M.S. in Electrical Engineering, May 1999
University of Missouri-Rolla, USA.

B.S. (distinction) in Electrical & Electronics Engineering, May 1996
National Institute of Technology (formerly Regional Engineering College), Warangal, India.

ORIGINAL SCIENTIFIC RESEARCH:

Contributions
Developed novel low-power computing models in which under-utilized on-chip SRAM memory elements can be dynamically adapted as computing elements to accelerate multimedia and other relevant applications. This is done by using a part of the SRAM cache memory for computational purposes, striking a balance in the usage of memory and computing resources for various applications. A part of an L1 data cache was designed as a Reconfigurable Functional Cache (RFC). The SRAM logic in the cache was partitioned, without much overhead in area and access time, to perform lookup table (LUT) based computations. The RFC plays a dual role by operating as a conventional cache memory or as a specialized computing unit, depending on the application requirements.
Designed an adaptive register file computing (ARC) unit, a novel on-chip dual-role circuit, with insignificant area overhead. The ARC unit supplements the conventional register bank to provide larger register storage capacity or acts as a specialized computing unit, depending on the requirement of a specific application.
Presented circuit-level details for the implementation of the dual-role ARC unit, its integration in a processor pipeline, and the corresponding performance enhancement in various multimedia applications. This work of designing unique methodologies to adaptively utilize on-chip SRAM logic resources was the first of its kind in the reconfigurable computing domain.
Developed two path-breaking IP address lookup schemes, the Elevator-Stairs algorithm and the logw-elevators algorithm, which forward IP packets efficiently at high data rates. Unlike earlier schemes, the algorithms achieved the rarely attained simultaneous optimization of multiple metrics such as lookup time, update time, memory usage, and time to construct the data structure.
Developed a unique, optimized combinational logic based on binary decision diagrams (BDDs) for efficient implementation of a high-speed IP address lookup scheme in reconfigurable hardware. The results showed that the BDD hardware engine gives a throughput of up to 175.7 million lookups per second (Ml/s) for a large routing table with 33,796 prefixes.
Developed a split pipeline architecture, a novel technique to distinguish and process instructions in separate pipeline hardware based on their source operand requirements. The split pipeline can process instructions at a higher clock rate than a conventional processor, with almost the same instructions per cycle (IPC) throughput. Results showed that an 8-wide processor with split integer pipelines achieved a clock rate 15.8% faster than that of an 8-wide conventional processor with only 0.7% IPC loss. Similarly, a 4-wide processor with the split pipeline design achieved a 19.69% faster clock rate than a 4-wide conventional processor with only 1.9% IPC loss.
Proposed a novel rename SRAM logic for wide-issue processor pipelines, where the logic is optimized based on instruction source operand demands while retaining in-order instruction renaming. With this technique in a 4-wide integer pipeline, the optimized rename logic reduced access time, power, and area by 14%, 42%, and 49%, respectively, with only 4.7% loss in IPC.
Developed the CMP-SIM tool, which provides a highly flexible and user-adaptable platform for modeling and evaluating multi-core SoC architectures, a tool evidently the first of its kind. The tool was made available for public use at http://www.utdallas.edu/~rama.sangireddy/cmp-sim in October 2006. Since then it has attracted the attention of several researchers across the world. In addition, the tool has been introduced with significant success into the graduate level computer architecture courses at UT-Dallas for students to experiment with and evaluate state-of-the-art multi-core processor design techniques. This tool benefits the research community at large.
Proposed streamline buffers, a novel mechanism to handle long latency instructions that have entered the pipeline and clogged the instruction window (IW). Supplementing the out-of-order IW with in-order streamline buffers significantly alleviated the clogging in the pipeline of a high throughput processor.
The current research project is developing a novel programmable system-on-chip (SoC) architecture that will have multiple heterogeneous SIMPLE cores to meet the performance requirements of diverse wireless standards. The envisioned SoC will consist of a layout of multiple programmable radio processor (PRP) cores, where each PRP core is designed as a Single Instruction Multiple Programmable elements (SIMPLE) processor. This design framework will facilitate a performance-effective yet flexible platform for radio processing.

Articles in Refereed Journals
1. Hui Wang, Rama Sangireddy, and Sandeep Baldawa, "Optimizing Instruction Scheduling Through Combined In-order and o-o-o Execution in SMT Processors," IEEE Transactions on Parallel and Distributed Systems, accepted in April 2008 and to appear in an upcoming issue.
2. Hui Wang, Jatan Shah, and Rama Sangireddy, "Streamlining long latency instructions for seamlessly combined in-order and out-of-order execution," Elsevier Journal of Microprocessors and Microsystems, accepted in April 2008 and to appear in an upcoming issue.
3. Jatan Shah and Rama Sangireddy, "Operand load based split pipeline architecture for high clock rate and commensurable IPC," IEEE Transactions on Parallel and Distributed Systems, Vol. 19, No. 4, April 2008, pp. 529-544.
4. Rama Sangireddy and Prabhu Rajamani, "Scalable Reconfigurable Architectures for High-Performance Energy-Efficient Multimedia Processing," ISCA International Journal of Computers and Their Applications, Vol. 14, No. 2, June 2007, pp. 68-78.
5. Rama Sangireddy, "Register Port Complexity Reduction in Wide-Issue Processors with Selective Instruction Execution," Elsevier Journal of Microprocessors and Microsystems, Vol. 31, Issue 1, February 2007, pp. 51-62.
6. Rama Sangireddy and Arun K. Somani, "On-Chip Adaptive Circuits for Fast Media Processing," IEEE Transactions on Circuits and Systems II, Vol. 53, No. 9, September 2006, pp. 946-950.
7. Rama Sangireddy, "Reducing Rename Logic Complexity for Low-power High-speed Front-end Architectures," IEEE Transactions on Computers, Vol. 55, No. 6, June 2006, pp. 672-685.
8. Rama Sangireddy, Natsuhiko Futamura, Srinivas Aluru, and Arun K. Somani, "Scalable, Memory Efficient, High-Speed Algorithms for IP Lookups," IEEE/ACM Transactions on Networking, Vol. 13, Issue 4, August 2005, pp. 802-812.
9. Rama Sangireddy, Huesung Kim, and Arun K. Somani, "Low-Power High-Performance Reconfigurable Computing Cache Architectures," IEEE Transactions on Computers, Vol. 53, No. 10, October 2004, pp. 1274-1290.
10. Rama Sangireddy and Arun K. Somani, "High-Speed IP Routing with Binary Decision Diagrams Based Hardware Address Lookup Engine," IEEE Journal on Selected Areas in Communications, IEEE J-SAC, Vol. 21, No. 4, May 2003, pp. 513-521.
Articles in Review with Refereed Journals
11. Hui Wang, Sandeep Baldawa, Rama Sangireddy, Sreekala Puduru, and Sm M Rahman, "A scalable technique for dependable cache consistency in chip multiprocessor (CMP) systems," IEEE Transactions on Dependable and Secure Computing, revised manuscript to be submitted for second round review.
12. Rama Sangireddy, "Mitigating hotspots in dispatch logic through partitioned renaming," IEEE Transactions on Computers, submitted and in review.
13. Terrell Bennett and Rama Sangireddy, "Novel Submicron Technology Designs for Multi-Functional Unit Dynamic Instruction Selection Logic," IEEE Transactions on Computers, submitted and in review.
Articles in preparation to be submitted to Journals
14. Mangesh Kunchanwar, Durga Prasad, and Rama Sangireddy, "On future of wireless digital radio processing," manuscript in preparation to be submitted to IEEE Transactions on Mobile Computing.
15. Durga Prasad, Mangesh Kunchanwar, and Rama Sangireddy, "Dynamics of accelerator selection for programmable radio architectures," manuscript in preparation to be submitted to IEEE Transactions on Mobile Computing.
16. Prabhu Rajamani and Rama Sangireddy, "Power and performance trade-offs in CPU-OS management," manuscript in preparation to be submitted to IEEE Transactions on Computers.

Tools Developed for Scientific Community's Benefit
Sandeep Baldawa and Rama Sangireddy, "CMP-SIM: An Environment for Simulating Chip Multiprocessor (CMP) Architectures," University of Texas at Dallas, October 2006.
   o http://www.utdallas.edu/~rama.sangireddy/cmp-sim
   o An innovative tool to model, simulate, and evaluate state-of-the-art multi-core processor architectures
   o The tool is being used by various researchers at several universities such as UT Austin, UC Davis, Iowa State University, University of Cincinnati-Ohio, Indian Institute of Science, University of Belgrade, and UC Irvine.

Articles in Peer-reviewed and Refereed Conferences
1. Terrell Bennett and Rama Sangireddy, "An Optimal Multi-Functional Unit Dynamic Instruction Selection Logic at Submicron Technologies," Proceedings of 21st IEEE International Conference on VLSI Design (VLSI-2008), January 2008, pp. 267-272.
2. Hui Wang, Sandeep Baldawa, and Rama Sangireddy, "Dynamic Error Detection for Dependable Cache Coherency in Multicore Architecture," Proceedings of 21st IEEE International Conference on VLSI Design (VLSI-2008), January 2008, pp. 279-285.
3. Jatan Shah and Rama Sangireddy, "Higher Clock Rate at Comparable IPC Through Reduced Circuit Complexity in Instruction Format Based Pipeline Clustering," Proceedings of 2007 IEEE International Symposium on Circuits and Systems (ISCAS-2007), May 2007, pp. 4012-4015.
4. Rama Sangireddy, "Fast and Low-Power Front-End Architectures with Reduced Rename Logic Complexity," Proc. 2006 IEEE International Symposium on Circuits and Systems (ISCAS-2006), May 2006, pp. 53-56.
5. Prabhu Rajamani, Jatan Shah, Vadhiraj Sankaranarayanan, and Rama Sangireddy, "High Performance and Alleviated Hot-spot Problem in Processor Front-end with Enhanced Instruction Fetch Bandwidth Utilization," Proc. IEEE International Performance Computing and Communications Conference, April 2006, pp. 63-70.
6. Rama Sangireddy, "Instruction Format Based Selective Execution for Register Port Complexity Reduction in High-Performance Processors," Proceedings of High-Performance Computing Architecture track in IEEE Third International Conference on Information Technology: New Generations, April 2006, pp. 227-232.
7. Rama Sangireddy and Prabhu Rajamani, "Performance Optimization with Scalable Reconfigurable Computing Systems," Proceedings of IEEE 19th International Conference on VLSI Design, January 2006, pp. 381-386.
8. Hui Wang, Sreekala Puduru, and Rama Sangireddy, "A scalable scheme for dependable cache consistency architecture in wireless ad-hoc networks," Proceedings of First IEEE International Workshop on Next Generation Wireless Networks, December 2005.
9. Rama Sangireddy, "Register Organization for Enhanced On-chip Parallelism," Proceedings of ASAP2004, The IEEE 15th International Conference on Application-specific Systems, Architectures and Processors, September 2004, pp. 180-190.
10. Rama Sangireddy and Arun K. Somani, "Exploiting Quiescent States in Register Lifetime," Proceedings of ICCD2004, The IEEE 22nd International Conference on Computer Design, October 2004, pp. 368-374.
11. Rama Sangireddy, Huesung Kim, and Arun K. Somani, "Timing Issues of Operating Mode Switch in High Performance Reconfigurable Architectures," Proceedings of HiPC2003, The Tenth Annual International Conference on High Performance Computing, December 2003, pp. 23-33.
12. Natsuhiko Futamura, Rama Sangireddy, Srinivas Aluru, and Arun K. Somani, "Scalable, Memory Efficient, High-Speed Lookup and Update Algorithms for IP Routing," Proceedings of 12th IEEE International Conference on Computer Communications and Networks, ICCCN, October 2003, pp. 257-263.
13. Rama Sangireddy and Arun K. Somani, "Application-Specific Computing with Adaptive Register file Architectures," Proceedings of ASAP-2003, The IEEE 14th International Conference on Application-specific Systems, Architectures and Processors, June 2003, pp. 183-193.
14. Rama Sangireddy, "Shadow IP Route Caching for Trusted Internet Routing," Proceedings of TIW-2002, The Trusted Internet Workshop, December 2002.
15. Rama Sangireddy, Huesung Kim, and Arun K. Somani, "Low-Power High-Performance Adaptive Computing Architectures for Multimedia Processing," Proceedings of HiPC2002, The Ninth Annual International Conference on High Performance Computing, December 2002, pp. 124-134.
16. Rama Sangireddy, Huesung Kim, and Arun K. Somani, "Timing Configuration Switch in Reconfigurable Functional Cache Based Architectures," Proceedings of FPGA2002, Tenth ACM International Symposium on Field-Programmable Gate Arrays, February 2002.
17. Rama Sangireddy and Arun K. Somani, "Binary Decision Diagrams for Efficient Hardware Implementation of Fast IP Routing Lookups," Proceedings of ICCCN2001, Tenth IEEE International Conference on Computer Communications and Networks, October 2001, pp. 12-17.
18. Rama Sangireddy and Sreenivas Aluru, "Harmonic Elimination in HVDC Transmission-Recent Trends," CURRENTS'95, National Symposium in Electrical Engineering, Tiruchirapalli, India, March 1995.
19. Mangesh Kunchanwar, Durga Prasad, and Rama Sangireddy, "Novel instruction accelerators for future mobile digital computing," to be submitted to Hotmobile 2009, The Tenth Workshop on Mobile Computing, Systems, and Applications.
20. Durga Prasad, Mangesh Kunchanwar, and Rama Sangireddy, "Dynamics of accelerator selection for programmable mobile radio processors," to be submitted to Mobicom 2009.
21. Prabhu Rajamani and Rama Sangireddy, "Dynamic trade-offs between Power and performance in CPU-OS management," to be submitted.

STUDENTS GRADUATED:
1. Hui Wang (Ph.D.) (fall 2007)
   o Dissertation title: Conjoint Component Design for High Performance and Dependable Chip Multithreading Systems
   o Committee: Prof. Cyrus D. Cantrell, Prof. Edwin Sha, Dr. Nourani.
   o Employed with Agilent Technologies.
2. Sandeep Baldawa (M.S.) (fall 2007)
   o Thesis title: CMP-SIM: A Chip Multiprocessor Simulation Environment
   o Committee: Prof. Edwin Sha, Dr. Roozbeh Jafari.
   o Employed with Agilent Technologies.
3. Terrell Bennett (M.S.) (spring 2007)
   o Thesis title: Design and Analysis of Dynamic Instruction Scheduling Logic Using Submicron Technologies
   o Committee: Prof. Poras Balsara, Dr. Mehrdad Nourani.
   o Employed with Texas Instruments.
4. Jatan Shah (M.S.) (fall 2006)
   o Thesis title: Split Pipeline Architectures for Higher Clock Rate and Commensurable IPC
   o Committee: Prof. Poras Balsara, Dr. Mehrdad Nourani.
   o Employed with Texas Instruments.
5. Prabhu Rajamani (M.S.) (summer 2006)
   o Thesis title: Caching, Tracing, and Replicating Non-contiguous Instruction Blocks
   o Committee: Dr. Dinesh Bhatia, Dr. Mehrdad Nourani.
   o Continuing Ph.D. program since fall 2006

CURRENT GRADUATE STUDENTS:
1. Prabhu Rajamani (Ph.D.)
   o Commenced in May 2006
   o Passed Ph.D. qualifying examination in spring 2006
   o Expected Ph.D. proposal oral examination in fall 2008
   o Expected graduation in spring 2009
2. Terrell Bennett (Ph.D.)
   o Commencing in August 2008
   o Expected Ph.D. qualifying examination in spring 2009
   o Expected Ph.D. proposal oral examination in spring 2010
   o Expected graduation in spring 2011
3. Durga P. Prasad (M.S.)
   a. Commenced in August 2007, Expected graduation in spring 2009
4. Mangesh K. Kunchamwar (M.S.)
   a. Commenced in August 2007, Expected graduation in spring 2009
INVITED TALKS:
High Performance Dependable Single Chip Multithreading Systems
   IEEE Dallas Chapter, Dallas (June 2007)
Adaptive and Reconfigurable Computing: Opportunities and New Paradigms
   University of Nevada, Las Vegas (April 2006)
Adaptive On-Chip Architectures for Low-Power High-Performance Multimedia Processing
   Texas Instruments, Dallas, Texas (February 2006)
Trends in High Performance and Low-Power Computing
   National Institute of Technology, Warangal, India (January 2006)
Exploiting quiescent states in register lifetime: Processor design implications
   Intel India Research Labs, Intel Corporation, India (January 2006)
On-chip Adaptive components for low-power high-performance multimedia processing
   Texas Instruments Inc., Bangalore, India (December 2005)
Trends in Processor Architectures
   National Institute of Technology, Warangal, India (December 2005)

EXTERNAL FUNDING:
1. Semiconductor Research Corporation (SRC), Novel Instruction Set Architecture Designs for Programmable Multiple Standard Radio Processors. Total amount USD 150,000, 08/01/2008-07/31/2011 (co-PI: Poras Balsara)
2. IRIS Technologies, Research in High-Performance Hybrid Systems Architecture, January 2007. Total amount USD 15,000.
3. Calpont Corporation, Research in Computer Systems Architecture, in Sept 2006. Total amount USD 12,000.
4. UTD Erik Jonsson School, SimpleCMP: An Environment for Simulating Chip Multiprocessor (CMP) Architectures, funded in January 2006. Total amount USD 23,400.

PROFESSIONAL RECOGNITIONS, HONORS & MEMBERSHIPS:
Senior Member of IEEE and IEEE Computer Society.
Received Research Excellence Award for accomplishments in Ph.D. at Iowa State University.
Completed B.S. in Electrical Engineering with distinction.
Invited as a panel member, National Science Foundation (NSF) proposal review on Computer Architecture, January 2007.
Invited as a panel member, National Science Foundation (NSF) proposal review on Computer Research Infrastructure (CRI), October 2005.
Invited to deliver guest research lectures at IEEE chapters of Dallas and Las Vegas, Texas Instruments, Intel Research labs, and several other academic institutions.

CLASSROOM TEACHING:
2008, fall, EE6304, Computer Architecture for graduate students, enrollment 74
2008, fall, EE4304, Computer Architecture for undergraduate students, enrollment 44
2008, summer, EE4304, Computer Architecture for undergraduate students, enrollment 32
2008, spring, EE7304, Advanced Computer Architecture, enrollment 8, evaluation 4.25
2007, fall, EE6304, Computer Architecture for graduate students, enrollment 45, evaluation 4.16
2007, fall, EE4304, Computer Architecture for undergraduate students, enrollment 45, evaluation 3.81
2007, summer, EE4304, Computer Architecture for undergraduate students, enrollment 34, evaluation 4.22
2007, spring, EE7304, Advanced Computer Architecture, enrollment 14, evaluation 4.1
2006, fall, EE6304, Computer Architecture for graduate students, enrollment 41, evaluation 4.42
2006, fall, EE4304, Computer Architecture for undergraduate students, enrollment 21, evaluation 3.87
2006, summer, EE4304, Computer Architecture for undergraduate students, enrollment 30, evaluation 3.71
2006, spring, EE7304, Advanced Computer Architecture, enrollment 12, evaluation 4.25
2005, fall, EE6304, Computer Architecture for graduate students, enrollment 39, evaluation 3.57
2005, fall, EE4304, Computer Architecture for undergraduate students, enrollment 46, evaluation 3.74
2005, spring, EE6304, Computer Architecture for graduate students, enrollment 36, evaluation 3.93
2004, fall, EE4304, Computer Architecture for undergraduate students, enrollment 19, evaluation 4.237
2004, fall, EE7304, Advanced Computer Architecture, enrollment 6, evaluation 4.0
2004, spring, EE6307, Fault Tolerant Digital Systems for graduate students, enrollment 11, evaluation 2.78
PROFESSIONAL SERVICE ACTIVITIES:
Member of Graduate committee, Department of Electrical Engineering, UT-Dallas, October 2003 to date.
Member of Graduate committee, Computer Engineering Program, UT-Dallas, January 2008 to date.
Local Arrangements Chair, 7th IEEE Dallas Circuits and Systems (DCAS) Workshop, October 2008.
Panel member, National Science Foundation (NSF) proposal review on Computer Architecture, January 2007.
Panel member, National Science Foundation (NSF) proposal review on Computer Research Infrastructure (CRI), October 2005.
Publicity Chair, First IEEE International Workshop on Next Generation Wireless Networks 2005 [IEEE WoNGeN '05].
Member of Technical Program Committee, 4th International Trusted Internet Workshop 2005 [TIW-2005].
Sessions chair, 12th IEEE International Conference on Computer Communications and Networks (ICCCN), October 2003.
Technical paper reviewer for IEEE Transactions on Computers, IEEE Transactions on Circuits & Systems, IEEE Journal on Selected Areas in Communications (JSAC), IEEE Transactions on Parallel and Distributed Systems, IEEE Communications Letters, and several international conferences and workshops.
Established the High Performance Dependable Computing Laboratory in October 2003, after joining the EE department at UTD. Currently responsible for supervising a team of graduate students conducting research in the areas of Computer Architecture, Computer Communications, and Dependable Computing.
To date, served as a member or an external chair of 8 Ph.D. dissertation committees and 7 M.S. thesis committees at UT-Dallas.