Abstract
Title: Intel Rack Scale Architecture
This talk provides an overview of Intel Rack Scale Architecture and discusses how the architecture addresses underutilized and stranded resources in a data center through resource pooling. We will specifically discuss the concept of a pooled system, the storage node, and pooling of PCIe as well as NVMe-based storage. The impact of pooling on latency, radix, and failure domains will also be discussed. Pooling also introduces a need for composition of the platform. We will discuss the characteristics of such platform composition and how software can evolve to take advantage of these capabilities.
Rack Scale Architecture
Mohan J. Kumar
Intel Fellow, Data Center Group
Intel Confidential
Data Center Challenges
Infrastructure has not kept up with increasing business demands.
Inefficiency: less than 50% server utilization [2]
Growth: data growth doubles every 18 months [1]
Agility: new services can take a week or more to provision [1]
Business Needs
Reduce operational and capital expenses.
Deliver new services in minutes, not months.
Optimize data center based on real-time analytics.
Address application workload needs with agility.
Scale capacity without interruption.
[1] Worldwide and Regional Public IT Cloud Services 2013-2017 Forecast. IDC (August 2013).
[2] IDC's Digital Universe Study, sponsored by EMC, December 2012.
Today's Architecture
Proprietary and preconfigured
Upgrade as a system
Limited flexibility
What's Next?
A seismic shift in how data centers are built and managed, powered by Intel
All infrastructure delivered as a service
Hyper-scalable to keep up with business demands
Resources automatically tuned to application workloads
Software Defined Infrastructure
[Diagram: Dedicated Appliances (SAN, NAS, Network Appliance, Telco Appliance) contrasted with Software Defined Infrastructure]
Intel Rack Scale Architecture
Logical architecture for efficiently building and managing cloud infrastructure and providing the simplest path to a software defined data center.
Disaggregate, Pool & Compose
Simplified Platform Management
User-Defined Performance
Maximum Utilization
Interoperable Solutions
Increase performance per TCO$ and accelerate cloud adoption
Rack Scale Architecture Framework
1. Pooled systems
2. Network fabric management
3. Pod-wide storage
4. Pod management
[Diagram labels: Pod-wide management, Pod-wide storage, Network fabric, Scalability]
Modular scalable management architecture
Management Software Framework
Flexible management architecture allowing for a range of implementation options
Asset and location discovery
Disaggregated resource management
Composable system support
Support for compute, network, and storage
[Diagram labels: Rack Scale POD Management API, Rack Scale API, Network Switch]
Comprehensive management architecture
Rack Scale Pooled System Platform
Compute Node
Supports range of server processors
Pooled NVM
Supports Ethernet fabric
Service model requires full node replacement
Storage Node
Supports range of server processors
Direct Attach Storage
Pooled NVM
Supports Ethernet fabric
Redundant networks, scalable DAS storage, sub-node FRUs
Rack Scale Pooled NVMe Controller (PNC)
[Diagram: Node 1, Node 2, ... Node n attach to the Pooled NVMe Controller, which fronts Drive 1, Drive 2, Drive 3, ... Drive n. Assignment requests arrive through the PSME management port.]
Assign Drive 2 to Node 1
Assign Drive 1, Drive 3 to Node 2
Logical effect of the assignment: Node 1 sees Drive 2; Node 2 sees Drive 1 and Drive 3
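The whole-drive assignment on this slide can be sketched as a small ownership map. This is only an illustrative toy model, not the PNC or PSME interface: the class name PooledNVMeController and the methods assign and view are hypothetical names introduced here.

```python
class PooledNVMeController:
    """Toy model (hypothetical): tracks which node owns each whole drive."""

    def __init__(self, drives):
        # None means the drive is still unassigned in the pool.
        self.owner = {drive: None for drive in drives}

    def assign(self, drive, node):
        """Assign one whole drive to one node; a drive has at most one owner."""
        if drive not in self.owner:
            raise KeyError(f"unknown drive: {drive}")
        if self.owner[drive] is not None:
            raise ValueError(f"{drive} is already assigned to {self.owner[drive]}")
        self.owner[drive] = node

    def view(self, node):
        """Drives that logically appear attached to the given node."""
        return sorted(d for d, n in self.owner.items() if n == node)


# The assignments shown on the slide.
pnc = PooledNVMeController(["Drive 1", "Drive 2", "Drive 3"])
pnc.assign("Drive 2", "Node 1")
pnc.assign("Drive 1", "Node 2")
pnc.assign("Drive 3", "Node 2")

print(pnc.view("Node 1"))  # ['Drive 2']
print(pnc.view("Node 2"))  # ['Drive 1', 'Drive 3']
```

Each node only sees the drives assigned to it, which is the "logical effect" the slide illustrates.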
Rack Scale Pooled NVMe Controller (PNC)
[Diagram: Drive 1 (capacity: 1 TB), Drive 2, ... Drive n sit behind the Pooled NVMe Controller. Assignment requests arrive through the PSME management port.]
Assign ½ capacity of Drive 1 to Node 1
Assign ½ capacity of Drive 1 to Node 2
Logical effect of the assignment: Node 1 and Node 2 each see a Drive 1 with ½ TB capacity
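Partial drive assignment can be sketched the same way, as capacity bookkeeping on a single drive. The class PartitionedDrive and its methods are hypothetical, and 1 TB is treated as 1000 GB purely for the arithmetic; how the controller actually partitions a drive is not described on the slide.

```python
class PartitionedDrive:
    """Toy model (hypothetical): one physical drive shared by several nodes."""

    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.allocations = {}  # node name -> GB assigned to that node

    def free_gb(self):
        """Capacity not yet assigned to any node."""
        return self.capacity_gb - sum(self.allocations.values())

    def assign(self, node, size_gb):
        """Give a node a slice of the drive, refusing to oversubscribe it."""
        if size_gb <= 0:
            raise ValueError("size must be positive")
        if size_gb > self.free_gb():
            raise ValueError(f"only {self.free_gb()} GB free on {self.name}")
        self.allocations[node] = self.allocations.get(node, 0) + size_gb


# Assumption: 1 TB is modeled as 1000 GB.
drive1 = PartitionedDrive("Drive 1", 1000)
half = drive1.capacity_gb // 2
drive1.assign("Node 1", half)
drive1.assign("Node 2", half)

print(drive1.allocations)  # {'Node 1': 500, 'Node 2': 500}
print(drive1.free_gb())    # 0
```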
Local vs. Pooled Storage
[Figure: chart of the value of pooling by deployment model. Recoverable labels: Deployment model? (Local: less than capacity deployed; More than capacity deployed); WL/SVR; Pooling Value; Consolidation of Storage; Pooled Infrastructure cost; Consolidation of compute; OpEx; Value of pooling; Avg IOPS/SVR < Local Capacity; Avg IOPS/SVR = Local Capacity; Diverse Workload deployment; IOPS/SVR > Local Capacity]
Rack Scale Pooled NVMe Controller (PNC)
[Diagram: Node 1, Node 2, ... Node n attach to the Pooled NVMe Controller, which fronts Drive 1, Drive 2, Drive 3, ... Drive n; the PSME connects to the controller's management port.]
Enable pooling and disaggregation of PCIe devices away from compute / storage nodes
Enable disaggregation of PCIe devices, including storage and FPGA
Assign high-performance storage to nodes based on workload demand
Allow full and partial drive assignment
Prevent SPOF through host failover
Ease workload migration in hyperscale environments
Enable better utilization of data center resources by allowing composable high-performance IO capacity
Rack Scale Pooled NVMe Controller (PNC)
[Diagram: Node 1, Node 2, ... Node n connect over Ethernet to the PNC, which fronts Drive 1, Drive 2, ... Drive n; the PSME connects to the PNC over a management link.]
Enable disaggregation of NVM Express devices
Utilize NVMe over Fabrics (NVMeOF) to expand the radix of pooling
Assign storage to compute or storage nodes based on workload demand
Prevent SPOF through host failover
Ease workload migration in hyperscale environments
Enable better utilization of DC resources through composition
Local vs. Pooled Resources
Attribute | Resource Local to a Node | Pooled Resource
Device Attach and Capacity | Limited by physical constraints | Not constrained by node volumetrics
Device Availability (Failure Domain) | Node is SPOF | Pooled fabric
Utilization | Limited by local use | No stranded capacity/capability
Latency | Local access | Incurs additional pooling latency
Radix | Local | Limited by pooling fabric (one rack to multiple racks)
Refresh and Life Cycle Management | Node based | Component based
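The utilization row can be made concrete with a small worked calculation. All numbers below are hypothetical and chosen only to illustrate the arithmetic: four nodes with uneven storage demand, identical local drives sized for the largest demand, and a shared pool sized for aggregate demand plus some headroom.

```python
# Hypothetical per-node storage demand in TB (not from the talk).
demand_tb = [0.5, 3.0, 1.0, 0.5]
used_tb = sum(demand_tb)

# Local model (assumption): every node ships with the same 4 TB of local storage.
local_per_node_tb = 4.0
local_deployed_tb = local_per_node_tb * len(demand_tb)
# Capacity a node does not use is stranded: no other node can reach it.
stranded_local_tb = sum(local_per_node_tb - d for d in demand_tb)
local_utilization = used_tb / local_deployed_tb

# Pooled model (assumption): one 6 TB pool shared by all nodes.
pool_deployed_tb = 6.0
# Unused pool capacity is headroom, not stranded: any node can be assigned it.
pool_headroom_tb = pool_deployed_tb - used_tb
pool_utilization = used_tb / pool_deployed_tb

print(f"local:  deployed {local_deployed_tb} TB, stranded {stranded_local_tb} TB, "
      f"utilization {local_utilization:.0%}")
print(f"pooled: deployed {pool_deployed_tb} TB, headroom {pool_headroom_tb} TB, "
      f"utilization {pool_utilization:.0%}")
```

The sketch covers only the capacity side of the comparison. The latency and radix rows are costs of pooling that a fuller model would weigh against the utilization gain.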
Composable Infrastructure Software Implications
[Diagram: a Pool of Resources is composed into Composed System 1 ... Composed System n (Compose), and composed systems are returned to the pool (Release).]
Orchestration that comprehends composition capabilities
Location-aware placement of workloads
Location-aware placement in hyperscale storage
Monitoring software that knows the physical bounds of the hardware
Software (OS, VMM, app) capability to take advantage of dynamically added resources
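The compose and release cycle in the diagram can be sketched as bookkeeping against a pool of free resources. This is a minimal illustration under stated assumptions: ResourcePool, compose, release, and the resource names are hypothetical and are not the Rack Scale API.

```python
class ResourcePool:
    """Toy model (hypothetical): counts of free resources by type."""

    def __init__(self, free):
        self.free = dict(free)  # resource type -> free unit count

    def compose(self, request):
        """Carve a composed system out of the pool, all or nothing."""
        short = {k: n for k, n in request.items() if self.free.get(k, 0) < n}
        if short:
            # Check everything first so a failed compose allocates nothing.
            raise RuntimeError(f"insufficient pooled resources: {short}")
        for kind, count in request.items():
            self.free[kind] -= count
        return dict(request)

    def release(self, system):
        """Return a composed system's resources to the pool."""
        for kind, count in system.items():
            self.free[kind] = self.free.get(kind, 0) + count
        system.clear()


pool = ResourcePool({"compute_nodes": 4, "nvme_drives": 8})
system1 = pool.compose({"compute_nodes": 1, "nvme_drives": 2})
system2 = pool.compose({"compute_nodes": 2, "nvme_drives": 5})
print(pool.free)  # {'compute_nodes': 1, 'nvme_drives': 1}

pool.release(system1)
print(pool.free)  # {'compute_nodes': 2, 'nvme_drives': 3}
```

An orchestrator that "comprehends composition" would sit above a model like this and add the location and failure-domain awareness listed on the slide, which the sketch leaves out.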
Summary
USER-DEFINED PERFORMANCE
Tailor performance to meet application SLAs by selecting from pooled compute, storage, and network resources
Easily scale capacity with a modular, buy-as-you-go architecture
MAXIMUM UTILIZATION
Autonomously manage compute, network, and storage pools to virtually eliminate stranded resources
INTEROPERABLE SOLUTIONS
Interoperable system architecture simplifies data center operations and integration of multi-vendor solutions
Buy What You Need. Use What You Buy.