Reconfigurable Spintronic Fabric using Domain Wall Devices

Ronald F. DeMara, Ramtin Zand, Arman Roohi, Soheil Salehi, and Steven Pyle
Department of Electrical and Computer Engineering
University of Central Florida
Orlando, FL 32816-2362
December 20, 2014

We introduce a novel spintronic device and architecture to realize a reconfigurable fabric for super-high-performance computing at ultra-low power, while providing the greater resiliency that reconfiguration allows. Figure 1 shows the characteristics of spintronic-based technologies and architectures along with their advantages and challenges. Spintronic devices such as Magnetic Tunnel Junctions (MTJs) and Domain Wall Magnets (DWMs) are proven for memory applications, and we research their potential in non-von Neumann computation for improved energy efficiency and throughput [1].

Figure 1: Taxonomy of Nanocomputing Architectures highlighting advantages of proposed LIM approach.

Introduction

While spintronic-based neuromorphic architectures offer analog computation strategies [2], in this proposal we exploit reconfigurability and associative processing using a Logic-In-Memory (LIM) paradigm. LIM is compatible with conventional computing algorithms and integrates logical operations with data storage. This makes it well suited to parallel SIMD operations, because it eliminates the frequent memory accesses that are major contributors to energy consumption. Spin-based LIM architectures can increase computational throughput, reduce die area, provide instant-on functionality, and reduce static power consumption [3]. The feasibility of a low-power spintronic LIM chip was recently demonstrated in [4] for database applications.

As shown in Figure 2, to facilitate a variety of highly data-parallel Air Force applications such as Image Processing, Weather Forecasting, Big Data Analysis, and Physics Simulations, we propose a novel reconfigurable fabric succeeding FPGAs to allow unprecedented gains in nanocomputation. Specifically, we will research 1) energy-efficient associative computing paradigms and 2) a DW-based LIM reconfigurable fabric.

Figure 2: Non-Conventional Ultra Low Power Computing Architectures.

DWM logic devices, initially proposed in [5], have the potential to alleviate power consumption issues. Specifically, following the analytical expressions in [6], the wall energy density $\varepsilon_W$ depends sub-linearly on the material constants as $\varepsilon_W = 2\pi\sqrt{AK}$, and the wall width is $\delta_W = \pi\sqrt{A/K}$, where $A$ is the exchange constant and $K$ is the magnetic anisotropy constant. Domain Wall (DW) Racetrack Memory was fabricated by IBM in 2011 [7]. Our team utilized DW racetrack memory to implement a power-efficient GPGPU register file [8]. The results, summarized in Figure 3, show that energy efficiency is significantly improved. Although DW devices could provide the high-speed switching necessary for a LIM architecture, reliability remains a major concern for DW logic. To enhance reliability and exploit associative processing, a novel redesign of the conventional racetrack array, called Domain Wall Nanomagnet-based Ladders (DWNL), is proposed.

Figure 3: Parameters of DW Racetrack Memory for GPGPU register file [8].
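
The square-root forms from [6], $\varepsilon_W = 2\pi\sqrt{AK}$ and $\delta_W = \pi\sqrt{A/K}$, can be evaluated numerically to see how the two material constants trade off. The following is a minimal sketch only: the function names and the values chosen for A and K are hypothetical placeholders for illustration and are not taken from this proposal or from any measured device.

```python
import math


def wall_energy_density(A: float, K: float) -> float:
    """Wall energy density 2*pi*sqrt(A*K), in J/m^2 for A in J/m and K in J/m^3."""
    return 2.0 * math.pi * math.sqrt(A * K)


def wall_width(A: float, K: float) -> float:
    """Wall width pi*sqrt(A/K), in meters for A in J/m and K in J/m^3."""
    return math.pi * math.sqrt(A / K)


# Hypothetical material constants, chosen only to make the arithmetic easy.
A_EXAMPLE = 1.0e-11  # exchange constant, J/m (assumed)
K_EXAMPLE = 1.0e5    # anisotropy constant, J/m^3 (assumed)

if __name__ == "__main__":
    eps = wall_energy_density(A_EXAMPLE, K_EXAMPLE)
    width = wall_width(A_EXAMPLE, K_EXAMPLE)
    print(f"energy density: {eps * 1e3:.2f} mJ/m^2")
    print(f"wall width:     {width * 1e9:.1f} nm")

    # Quadrupling K doubles the energy density and halves the wall width,
    # which is the sub-linear (square-root) dependence noted in the text.
    print(wall_energy_density(A_EXAMPLE, 4 * K_EXAMPLE) / eps)
    print(wall_width(A_EXAMPLE, 4 * K_EXAMPLE) / width)
```

Under these assumed constants the sketch gives roughly 6.3 mJ/m^2 and a wall width near 31 nm; the point of the exercise is the scaling behavior, not the absolute numbers.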

Reconfigurable Spintronic Fabric (RSF)

Rather than pursuing the fixed, pre-determined computing architectures that have recently been researched, a more effective approach is to serve the entire spectrum of applications by designing a Reconfigurable Spintronic Fabric (RSF). As shown in Figure 5, the RSF is a 2D array of Configurable Logic In Memory Blocks (CLIMBs), each comprising an array of DWNL cells. In recent years our team has developed the use of reconfiguration to address the challenges of AFRL-related applications with low energy budgets while maintaining availability and resilience [9-14].

Figure 4: (a) Domain Wall Nanomagnet-based Ladder (DWNL), (b) Reflexive Referencing Cell Operation Cycle 1, (c) Reflexive Referencing Cell Operation Cycle 2.

Conclusion

DWNL will be utilized in CLIMB arrays to store bits as the spin magnetization direction of different domains separated by domain walls, which can be shifted along a magnetic nanowire with the last domain reserved for sensing. This novel Reflexive Referencing Cell consists of a reference MTJ that has a common fixed layer and oxide layer with the last domain. Such a design has the potential to reduce the effect of cell-to-cell variation. Figure 4(a) delineates our proposed 2-cycle self-reflexive variation-tolerant reading scheme. Cycle 1 and Cycle 2 sense the reference and the output, respectively, as shown in Figure 4(b) and 4(c). If the voltage from the second cycle is greater than the voltage from the first, the stored value is 1; otherwise it is 0. Moreover, DWNL is intrinsically compatible with associative computing instructions such as shift, compare, and write.

Figure 5: System Hierarchy of Nanocomputing Architecture: RSF, CLIMB, Ladder.

Figure 5 shows the proposed computing architecture, which provides an appropriate platform for ultra-low-power, data-intensive processing applications. The core populates the RSF DWNL cells with application data and writes the CLIMBs' instruction memory with the appropriate associative computing programs to perform the desired application. Only the final output data needs to be transmitted to the core.
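
The two-cycle read and the shift, compare, and write operations described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the proposed circuit: the resistance and current values, the `Ladder` class, the per-cell `variation` scale factor, the midpoint reference resistance, and the modeling of a shift as a rotation are all hypothetical choices made for illustration. The only rule taken from the text is that the value is 1 when the Cycle 2 voltage exceeds the Cycle 1 voltage.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical nominal values chosen for illustration; none come from the source.
R_P = 2000.0    # ohms, parallel (low-resistance) state, assumed to encode 0
R_AP = 4000.0   # ohms, antiparallel (high-resistance) state, assumed to encode 1
I_READ = 20e-6  # amperes, assumed read current


@dataclass
class Ladder:
    """Behavioral stand-in for one DWNL nanowire; the last domain is sensed."""
    domains: List[int]
    variation: float = 1.0  # assumed per-cell resistance scale factor

    def shift(self) -> None:
        # Assumption: a shift is modeled as a rotation so that no bit is lost.
        self.domains = [self.domains[-1]] + self.domains[:-1]

    def write(self, bit: int) -> None:
        # Assumption: a write targets the domain at the sensing position.
        self.domains[-1] = bit

    def sense_reference(self) -> float:
        # Cycle 1: the reference MTJ shares layers with the sensed domain, so it
        # is scaled by the same variation. The midpoint value is an assumption.
        return I_READ * self.variation * (R_P + R_AP) / 2

    def sense_output(self) -> float:
        # Cycle 2: voltage across the MTJ formed with the last domain.
        resistance = R_AP if self.domains[-1] else R_P
        return I_READ * self.variation * resistance

    def read(self) -> int:
        # The value is 1 when the Cycle 2 voltage exceeds the Cycle 1 voltage.
        return 1 if self.sense_output() > self.sense_reference() else 0

    def compare(self, key_bit: int) -> bool:
        # Associative compare of the sensed bit against one search-key bit.
        return self.read() == key_bit


def read_word(ladder: Ladder) -> List[int]:
    """Read every domain by alternating a two-cycle read with a shift."""
    bits = []
    for _ in range(len(ladder.domains)):
        bits.append(ladder.read())
        ladder.shift()
    return bits


def fixed_reference_read(ladder: Ladder) -> int:
    """Baseline for contrast: one global reference that ignores cell variation."""
    return 1 if ladder.sense_output() > I_READ * (R_P + R_AP) / 2 else 0


if __name__ == "__main__":
    cell = Ladder([1, 0, 1, 1], variation=1.6)
    print(read_word(cell))  # bits appear last-domain-first

    zero = Ladder([0], variation=1.6)
    print(zero.read(), fixed_reference_read(zero))
```

In this toy model a cell whose resistances are 60% above nominal is still read correctly by the self-referenced scheme, because the reference shifts with the cell, whereas the fixed global reference misreads a stored 0 as 1. That contrast is one way to picture the claimed tolerance to cell-to-cell variation.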

References

[1] Kim, Jongyeon, et al. "Spin-Based Computing: Device Concepts, Current Status, and a Case Study on a High-Performance Microprocessor." Proceedings of the IEEE 103.1, 2015.
[2] Sharad, Mrigank, et al. "Energy-Efficient Non-Boolean Computing With Spin Neurons and Resistive Memory." IEEE Transactions on Nanotechnology, pp. 23-34, 2014.
[3] Zhang, Yue, et al. "Spintronics for low-power computing." Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014.
[4] Jarollahi, Onizawa, et al. "A Nonvolatile Associative Memory-Based Context-Driven Search Engine Using 90 nm CMOS/MTJ-Hybrid Logic-in-Memory Architecture." IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 4, no. 4, pp. 460-474, 2014.
[5] Allwood, Dan A., et al. "Magnetic domain-wall logic." Science, pp. 1688-1692, 2005.
[6] Tauxe, Lisa. Essentials of Paleomagnetism. Univ. of California Press, 2010.
[7] Annunziata, A. J., et al. "Racetrack memory cell array with integrated magnetic tunnel junction readout." IEEE International Electron Devices Meeting (IEDM), 2011.
[8] Mao, Mengjie, et al. "Exploration of GPGPU register file architecture using domain-wall-shift-write based racetrack memory." Design Automation Conference (DAC), IEEE, 2014.
[9] N. Imran, R. F. DeMara, J. Lee, and J. Huang, "Self-adapting Resource Escalation for Resilient Signal Processing Architectures." Journal of Signal Processing Systems, 2013.
[10] R. Al-Haddad, R. Oreifej, R. A. Ashraf, and R. F. DeMara, "Sustainable Modular Adaptive Redundancy Technique Emphasizing Partial Reconfiguration for Reduced Power Consumption." International Journal of Reconfigurable Computing, 25 pages, 2011.
[11] M. Alawad, Y. Bai, R. F. DeMara, and M. Lin, "Energy-Efficient Multiplier-Less Discrete Convolver through Probabilistic Domain Transformation." ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 185-188, 2014.
[12] N. Imran and R. F. DeMara, "Heterogeneous Concurrent Error Detection (hced) Based On Output Anticipation." Proceedings of 2011 International Conference on Reconfigurable Computing and FPGAs, Cancun, Mexico, November 30 - December 2, 2011, pp. 61-66.
[13] N. Imran, J. Lee, Y. Kim, M. Lin, and R. F. DeMara, "Fault-Mitigation by Adaptive Dynamic Reconfiguration for Survivable Signal-Processing Architectures." International Journal of Control and Automation, vol. 6, no. 2, pp. 111-120, April 2013.
[14] R. F. DeMara, K. Zhang, and C. A. Sharma, "Autonomic Fault-Handling and Refurbishment Using Throughput-Driven Assessment." Applied Soft Computing, vol. 11, no. 2, pp. 1588-1599, March 2011.