DELL MICROSOFT REFERENCE CONFIGURATIONS PHASE II 7 TERABYTE DATA WAREHOUSE


Deploying Microsoft SQL Server 2005 Business Intelligence and Data Warehousing Solutions on Dell PowerEdge Servers and Dell PowerVault Storage

ABSTRACT

This white paper provides an architectural overview and configuration guidelines for deploying Microsoft SQL Server 2005 Business Intelligence and Data Warehouse solutions on Dell PowerEdge servers with Dell PowerVault storage. Using the knowledge gained through joint development, testing, and support with Microsoft, this Reference Configuration for SQL Server documents best practices that can help speed SQL Server 2005 solution implementation, simplify operations, and improve performance and availability.

March 2008

TABLE OF CONTENTS

ABSTRACT
INTRODUCTION
OVERVIEW OF BUSINESS INTELLIGENCE AND DATA WAREHOUSING
    Business Intelligence
    Data Warehousing
MICROSOFT BUSINESS INTELLIGENCE TOOLS
    BI/DW Reporting Tools
        Microsoft SQL Server Reporting Services (SSRS)
        Microsoft Performance Point 2007
        Digital Dashboards
    Data Extract Transform Load Tools
        Microsoft SQL Server Integration Services
OVERVIEW OF DELL SERVERS AND STORAGE FOR BUSINESS INTELLIGENCE AND DATA WAREHOUSING
    Dell PowerEdge Servers
    Dell PowerVault Storage Arrays
BEST PRACTICES FOR BUSINESS INTELLIGENCE
    Hardware Configuration
    Software Configuration
DELL MICROSOFT BUSINESS INTELLIGENCE TEST SCENARIO
    Test System Configuration
    Overview of Tests Performed
CONCLUSION
APPENDIX A: DELL MICROSOFT REFERENCE PLATFORMS
    2 Terabyte Data Warehouse
    4 Terabyte Data Warehouse
    7 Terabyte Data Warehouse

FIGURE 1: SSRS REPORT
FIGURE 2: PERFORMANCE POINT 2007 SCREENSHOT
FIGURE 3: EXCEL 2007 DIGITAL DASHBOARD SCREENSHOT
FIGURE 4: SQL SERVER INTEGRATION SERVICES (SSIS) SCREENSHOT
FIGURE 5: DELL MICROSOFT BI/DW TEST CONFIGURATION
FIGURE 6: CONCURRENT QUERY PERFORMANCE RESULTS
FIGURE 7: 2 TB SERVER
FIGURE 8: 4 TB SERVER
FIGURE 9: 7 TB SERVER

SQL SERVER 2005 BUSINESS INTELLIGENCE USING DELL HARDWARE

SECTION 1
INTRODUCTION

Dell and Microsoft partnered to build Reference Architectures for Business Intelligence and Data Warehousing using the Microsoft BI software stack. Phase I of this effort was jointly launched in September 2007. Dell and Microsoft published a set of white papers covering three reference architectures for data warehousing and business intelligence, using database sizes of 1, 2, and 4 terabytes. This white paper is a Phase II revision describing the results of testing a fourth reference architecture, a 7 terabyte platform that takes advantage of improved server and disk storage hardware.

Dell PowerEdge servers and Dell PowerVault storage systems are ideal choices for deploying highly available, mission-critical Microsoft SQL Server 2005 data warehouses for Business Intelligence solutions. Dell and Microsoft have worked together to run various tests and analyze performance results for a sample Business Intelligence (BI) and Data Warehouse (DW) solution. This BI testing was designed to validate the Dell PowerEdge server and PowerVault storage platforms for BI and DW, to demonstrate the stability and performance of the systems, and to create Reference Configurations that can assist customers in creating baseline systems for BI. An overview of these tests is provided in the section "Dell Microsoft Business Intelligence Test Scenario" in this paper. A more detailed description of the testing effort, along with specific performance results, can be found in a companion white paper entitled "Performance of Microsoft SQL Server 2005 Business Intelligence and Data Warehousing Solutions on Dell PowerEdge Servers and Dell PowerVault Storage", available online from www.dell.com/sqlbi

The purpose of this white paper is to help IT professionals understand the key components and tools available from Microsoft for Business Intelligence and Data Warehousing, to document Dell and Microsoft's best-practices approach to cost-effective platforms for BI/DW, and to identify four Dell reference configurations (1, 2, 4, and 7 terabyte (TB) database sizes) that can be used as reference points when developing a BI/DW solution. Detailed hardware and software configurations for the reference platforms are provided in Appendix A: Dell Microsoft Reference Platforms.

SECTION 2
OVERVIEW OF BUSINESS INTELLIGENCE AND DATA WAREHOUSING

In recent years the demand for Business Intelligence and Data Warehousing solutions has grown as new hardware and software technologies lowered cost and simplified implementation. As demand for BI and DW has grown, so too have the size and complexity of databases. Microsoft has developed a set of tools and processes to meet the demands of this type of information management. Before getting into the tools and technologies that are expanding the use of BI and DW, let's take an overview of BI and DW and how organizations use them to become more effective and efficient.

BUSINESS INTELLIGENCE

Business Intelligence is a broad term that refers to applications and technologies used to manage and analyze business operations data. BI is a way to monitor and report on various aspects of the business, from sales and marketing efforts, productivity, and operations to profitability.
BI is designed to provide business decision makers with information that enables optimal business decisions. The choice of what data is analyzed within a BI system is purely a business decision, which is made possible by technology. Some examples of how BI is used include: to inform management of the overall status and performance of the company, to provide information needed to be more competitive, to identify which areas of the business are in a positive or critical state, and to make decisions based on changes in the market, customer purchasing trends, product pricing, and sales. BI allows analysts and managers to quickly understand the state of the business and make informed real-time decisions on how to adjust. In addition, tools like business scorecards and executive dashboards provide instantaneous state-of-the-business information to executive management. For these systems to be considered effective, large amounts of data must be analyzed very quickly.

BI is usually thought to operate against a data warehouse, but it is also possible to run BI against an online relational database. This white paper addresses BI primarily as it relates to data warehousing. A data warehouse can be thought of as the corporation's repository of historical data. The data warehouse is typically very large and contains years of data that is used for analysis and reporting. These reports are then used to make decisions on direction for the company.

The technology of BI is made up of three main categories: the design of the data warehouse, the movement of data into the data warehouse, and reporting from the data. The design of the data warehouse includes either a relational database (SQL Server) or OLAP cubes (Analysis Services). Movement of data into the data warehouse can be done using any number of ETL (Extract Transform Load) tools, including SQL Server Integration Services (SSIS). Reporting and data mining can be accomplished with SQL Server Reporting Services (SSRS) or Microsoft Performance Point Server 2007. Business performance management can be accomplished with Microsoft's Performance Point Server product, with its three core features of planning, monitoring, and analyzing. Learn more at www.microsoft.com/performancepoint

DATA WAREHOUSING

A data warehouse is a repository of historical information that is used to make organizational decisions. A data warehouse can contain data from many sources and can be very large, such as tens or hundreds of terabytes. How the data is stored depends in part on the software used to access that data.
The performance of the data warehouse is dictated by the software, the hardware, and the configuration of each. Microsoft solutions enable organizations to use several models to store data in their data warehouse. The model chosen depends on the data and how it will be accessed. Data in the data warehouse is stored in a Relational Database Management System (RDBMS) such as SQL Server, in Analysis Services (or OLAP) cubes, or in a combination of the two. An Analysis Services cube can be thought of as a multidimensional abstraction of relational data.

Note: Because of the large amounts of memory required for BI, Dell and Microsoft recommend using the 64-bit editions of Windows and SQL Server for BI activities.

OLAP, or Online Analytical Processing, is a subset of BI and is essentially a database designed for quick analysis of multidimensional data. OLAP tools can access several types of data models, including the following:

- ROLAP: Relational OLAP refers to analysis of data that is stored in the relational model, i.e., within the SQL Server database.
- MOLAP: Multidimensional OLAP refers to the analysis of data that is stored in OLAP cubes, i.e., within SQL Server Analysis Services cubes.
- HOLAP: Hybrid OLAP refers to the analysis of data that is stored in a combination of both SQL Server and OLAP cubes.

SECTION 3
MICROSOFT BUSINESS INTELLIGENCE TOOLS

Microsoft provides a broad range of tools for BI. These tools cover the key areas of BI, including the structure and creation of the data warehouse, analysis, reporting, and data Extract, Transform and Load (ETL). The following sections provide an overview of Microsoft's reporting and ETL tools.

BI/DW REPORTING TOOLS

BI/DW reporting tools tend to receive the most attention, since they are the components most visible to end users. The reporting tool needed may vary based on the type of report desired and the data used to create the reports. For example, SQL Server Reporting Services (SSRS) is an excellent tool that integrates well with SQL Server databases as well as with Online Analytical Processing (OLAP) cubes. If data mining and analytics against OLAP cubes are the focus, then Microsoft Performance Point Server 2007 might be the right choice. If dashboards are needed, then Microsoft Excel 2007 and Dundas Data Visualization, recently purchased by Microsoft, might be the right choice. For the testing performed for this white paper, SSRS and Performance Point 2007 were used. No one tool is right for every task; a variety of tools might be necessary to achieve all of the business objectives for a given BI/DW solution.

MICROSOFT SQL SERVER REPORTING SERVICES (SSRS)

SQL Server Reporting Services is a server-based reporting tool designed to create, deploy, and serve web-based BI reports. These reports can be based on both relational and multidimensional data. The SQL Server reporting architecture consists of the Report Server engine, which runs the reports, and the report server website, which displays them. One major advantage of SSRS is the wide range of data sources that can be used. This flexibility allows reports to be created from many different data sources in a heterogeneous environment.
Currently SSRS can create reports from the following data sources:

- SQL Server Database
- SQL Server Analysis Services (or OLAP) Cubes
- OLE DB
- Oracle Database
- ODBC
- XML
- SAP NetWeaver BI
- Hyperion Essbase

The following figure illustrates a sample report where business data is transformed into a format easily interpreted by the business community. The report shows the download attempts, download successes, and download failures by country for the Windows Update database for 2006.

Figure 1: SSRS Report

SSRS is an excellent tool for creating business reports. Keep in mind that the report creation process consists of both executing database queries to collect the data and rendering the report. Reports based on large amounts of data can consume significant CPU and memory resources.

MICROSOFT PERFORMANCE POINT SERVER 2007

ProClarity Analytics was purchased by Microsoft in April 2006 in order to enhance Microsoft's suite of BI analytics products. Enhanced and launched as Microsoft Performance Point Server 2007 (PPS 2007), it is an OLAP cube analytics tool that allows for data mining and analysis of multidimensional data, with rich reporting features. It provides the capability to explore data interactively. Unlike SSRS, PPS 2007 is a client tool that runs on a workstation rather than a web-based server tool.

The following PPS 2007 screenshot shows business data in a format that is not only informative but also supports drill-down. This example shows the download attempts, successes, and failures for the Windows Update database, graphed by country and category. PPS 2007 provides the ability to view reports, analyze the data as desired, and drill down into the details of the underlying OLAP cubes.

Figure 2: Performance Point 2007 Screenshot

DIGITAL DASHBOARDS

Digital Dashboards, also known as Enterprise Dashboards or Executive Dashboards, are tools that provide important high-level metrics to organization decision makers. Digital Dashboards are designed to visually display the health or status of the business by graphically displaying key performance indicators (KPIs) such as inventory levels, customer satisfaction ratings, or manufacturing quality measures. The Digital Dashboard provides a visual heads-up display of data that can be pulled from multiple systems and data sources in order to provide notices, alerts, warnings, and summaries of key performance indicators. Digital Dashboards are emerging as a leading means for reporting organizational performance, but they do not provide the full analytics of products such as Performance Point Server 2007 or the reporting features of SSRS.

An example of an Excel 2007 generated dashboard is shown below. Unlike the Performance Point 2007 example, the dashboard shown here does not provide drill-down capability. This Excel 2007 graph displays the same type of data that was used in the previous screenshots: download information from the Microsoft Windows Update database.

Figure 3: Excel 2007 Digital Dashboard Screenshot

DATA EXTRACT TRANSFORM LOAD TOOLS

SQL Server Integration Services (SSIS) is an essential piece of the SQL Server BI model. SSIS is an ETL (Extract Transform Load) tool that is used to move data from one or more OLTP systems into the data warehouse, where it can be accessed or transformed into OLAP cubes. In any BI or DW project, a majority of the effort involves ETL, which is closely tied to the warehouse design. The data that is used to populate a BI system might come from one source or from hundreds of sources. It might involve simple data loads or complex transformations. This depends entirely on the data design and the reporting capabilities desired.

MICROSOFT SQL SERVER INTEGRATION SERVICES

SQL Server Integration Services in SQL Server 2005 is the follow-on to the popular Microsoft SQL 2000 Data Transformation Services (DTS). SSIS is both a tool and a development platform. With SSIS, data can be extracted from almost any data source, standard or custom transformations can be performed, and data can be loaded into almost any data destination. SSIS is extremely flexible and provides the capability of writing custom components.

A sample screenshot of SSIS is shown below. The simple task shown takes data from one database, transforms that data, and inserts it into a different database. SSIS supports error handlers, data transformation, and automation.

Figure 4: SQL Server Integration Services (SSIS) Screenshot

The flexibility of SSIS is one of its main assets. Because custom transformations can be added to an SSIS package, that flexibility is almost limitless. Often, when using OLAP cubes, a model database is updated with SSIS and then Analysis Services transforms that data into the cube.

SECTION 4
OVERVIEW OF DELL SERVERS AND STORAGE FOR BUSINESS INTELLIGENCE AND DATA WAREHOUSING

Regardless of the method used to build and report on a data warehouse, the amount and type of hardware is extremely important. A data warehouse is typically very large, often multiple terabytes. BI can be very CPU, memory, and I/O intensive on a database system. In addition, the larger the data warehouse, the more important it is to properly size and configure it. It is important to properly size not only the database server but the reporting and analysis servers as well, because these applications can be very hardware-intensive.

DELL POWEREDGE SERVERS

Dell PowerEdge servers are designed to deliver the highest performance for mission-critical enterprise applications for database, business intelligence, and data warehousing. Today's proprietary systems are increasingly expensive to maintain, in both manpower and maintenance costs. Efforts to reduce IT costs and leverage technical skill sets have pushed the industry toward standards-based hardware and software architectures. Customers looking for ease of implementation choose Dell PowerEdge servers, which are standards-based systems that are easy to manage, simple to deploy and upgrade, and scalable as the enterprise moves to consolidate and virtualize computing resources.

All Dell PowerEdge servers provide a number of scalability and manageability features in common, such as:

- Hot-plug, redundant power supplies, memory, PCI Express slots, hard drives, and cooling fans, allowing replacement of components without having to make the server unavailable to users.
- State-of-the-art multi-core processors, with 2 to 4 sockets, featuring AMD and Intel dual-core and quad-core processors.
- Up to 64 GB of high-quality RAM (in the 6950), with advanced high-availability features such as Memory RAID, mirroring, and hot-plug DDR-2 SDRAM with Error Correcting Code (ECC) plus SDDC.
- Multiple internal hot-plug SAS or SATA disks, with hardware RAID protection options.
- Microsoft Windows operating systems available pre-installed, including several versions of Windows Server 2003.
- Built-in, standards-based server management utilizing the Intelligent Platform Management Interface (IPMI), optional remote management, and Dell OpenManage software for server deployment, monitoring, and update.
- Full testing and validation for Microsoft Windows Server clustering.

DELL POWERVAULT STORAGE ARRAYS

For this white paper, Dell and Microsoft used the Dell PowerVault MD1000 storage array, which is a high-performance storage system built for enterprise applications. The MD1000 is a modular disk storage expansion enclosure for PowerEdge servers, capable of housing up to fifteen 3.5-inch disk drives in a single 3U rack enclosure. Dell PowerVault MD1000 arrays use Serial Attached SCSI (SAS) disks, which provide significant performance gains based on the higher performance potential of the SAS protocol compared to the traditional parallel SCSI protocol. The MD1000 storage array features read and write performance comparable to Fibre Channel based SAN storage systems, but at a lower cost. To enhance scalability, disk sizes as large as 300 GB are now offered, with drive speeds up to 15,000 rpm.
In terms of reliability, data stored in the MD1000 array may be protected with RAID levels 1, 5, and 10.

SECTION 5
BEST PRACTICES FOR BUSINESS INTELLIGENCE

When building a SQL Server 2005 BI solution, there are several best practices that should be followed. These best practices have been developed based on extensive experience and testing at both Dell and Microsoft. Since all systems and applications are unique, not all of these best practices will apply to every system.

HARDWARE CONFIGURATION

An optimal hardware configuration is the foundation for getting the best performance from a Data Warehouse or Business Intelligence system. This is true whether storing data in a SQL Server relational database or in OLAP cubes. The key hardware components are CPU, memory, and disks.

- CPU: Whether using cubes or relational data, BI tends to be CPU-intensive at all levels (database, analysis, and reporting server). Take advantage of multi-core systems and make sure there is sufficient CPU power. It's highly recommended to use 64-bit Windows Server and 64-bit SQL Server with BI configurations.
- MEMORY: Configure the server with as much RAM as is applicable for the database size. In general, have at least 16 GB of RAM for BI systems.
- DISKS: The number of disks necessary will vary based on the database size and I/O characteristics of the system. Monitor disk I/O and add more disks when necessary. With BI systems, Microsoft has determined that configuring SQL Server with many small, dedicated disks is optimal. Don't share physical disks with other applications.

For specific hardware configuration recommendations for BI systems, see Appendix A: Dell Microsoft Reference Platforms.

SOFTWARE CONFIGURATION

There are several configuration options to consider for SQL Server. For BI systems, memory usage is a key factor. Make sure that SQL Server is allocated sufficient memory while leaving some memory for Windows, usually 10% for memory allocations over 4 GB. It is highly recommended to use x64 (64-bit) editions of both Windows Server 2003 and SQL Server 2005. 64-bit memory addressing eliminates the overhead incurred with large memory access on 32-bit systems.

When running SQL Server with a memory allocation over 4 GB, take advantage of large-page extensions for the buffer pool. Each page in memory is managed by the Windows operating system, and the default page size is 4 KB. Thus in a 4 GB system there are 1,048,576 pages to manage. In a 16 GB system, this number jumps to 4,194,304. Eventually there are so many pages that the management overhead becomes detrimental to system performance. With Windows large-page extensions, memory used for SQL Server is allocated in 2 MB to 16 MB pages, which greatly reduces the management overhead because there are fewer, larger pages to manage. Large-page extensions are enabled in SQL Server with trace flag 834, which can be added to the SQL Server startup parameters in SQL Server Configuration Manager. In addition, the "Lock Pages in Memory" permission must be granted to the Windows user that will be running SQL Server.

When running SQL Server 2005 as both an OLTP system and a BI reporting system, the OLTP users often suffer due to the long-running reports. This is due in part to the efficiency of SQL Server parallelism, which allows a large query to be processed in parallel streams on multiple processors. A long-running query can therefore easily consume all of the system resources, starving other processes.
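The page counts in the large-page discussion above follow from dividing the memory size by the page size. The short sketch below reproduces that arithmetic for 4 KB pages and for the 2 MB and 16 MB large-page sizes mentioned in the text. It assumes binary units (1 KB = 1024 bytes), and the helper name `page_count` is hypothetical, introduced only for illustration.

```python
# Sketch: how many pages the operating system must manage for a given
# amount of memory. Binary units are assumed; names are illustrative only.
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def page_count(memory_bytes: int, page_bytes: int) -> int:
    """Return the number of pages of page_bytes needed to cover memory_bytes."""
    # Ceiling division, so a partial trailing page still counts as one page.
    return -(-memory_bytes // page_bytes)


for mem_gb in (4, 16):
    small = page_count(mem_gb * GB, 4 * KB)
    large_2mb = page_count(mem_gb * GB, 2 * MB)
    large_16mb = page_count(mem_gb * GB, 16 * MB)
    print(f"{mem_gb} GB: {small:,} x 4 KB pages, "
          f"{large_2mb:,} x 2 MB pages, {large_16mb:,} x 16 MB pages")
# 4 GB: 1,048,576 x 4 KB pages, 2,048 x 2 MB pages, 256 x 16 MB pages
# 16 GB: 4,194,304 x 4 KB pages, 8,192 x 2 MB pages, 1,024 x 16 MB pages
```

Moving from 4 KB to 2 MB pages cuts the number of pages to manage by a factor of 512, which is the reduction in bookkeeping that the large-page recommendation relies on.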
To leave more resources for online users, this issue can sometimes be alleviated by reducing the allowed degree of parallelism through the "max degree of parallelism" configuration option. The value is the number of processors a single query may use in parallel. In some cases, setting this option to 1 (thus disabling parallelism) is optimal, while in other cases setting it to 2 or higher is optimal. Testing large queries with different settings is the best way to determine which value to choose. Another option is to set this value to 1 during daytime hours when online users are active, and to 2 or higher after hours when batches that could take advantage of parallelism are run.

These best practices should be considered general guidelines, as performance tuning is an iterative process and not all optimizations work for all implementations.

SECTION 6
DELL MICROSOFT BUSINESS INTELLIGENCE TEST SCENARIO

As mentioned throughout this paper, Dell and Microsoft have worked together to execute various tests and analyze performance results of a sample BI and DW system under test. This test system was based on the Microsoft Windows Update database running on Dell PowerEdge 6950 and 2970 servers attached to Dell PowerVault MD1000 storage systems. The following sections provide an overview of the test environment and the tests performed. Details of this test effort and results are provided in the companion paper "Performance of Microsoft SQL Server 2005 Business Intelligence and Data Warehousing Solutions on Dell PowerEdge Servers and Dell PowerVault Storage", which is available online at www.dell.com/sql

The testing was designed to validate the platforms for BI/DW, to demonstrate the stability and performance of the hardware systems, and to define reference configurations that can assist customers in creating baselines for choosing BI/DW solutions.
These reference platforms combine the results of this testing with industry best practices to provide baseline configurations that the customer can build upon. The platform was subjected to rigorous testing in the lab, including batch processing tasks such as index creation, backup, and recovery, as well as online processing tasks such as concurrent query loads. These tests exercised the I/O subsystem, the memory subsystem, and all CPUs in the system. Summaries of the test system configuration and the test results are provided in the next sections.

TEST SYSTEM CONFIGURATION

The test platform utilized Dell PowerEdge 6950 and 2970 servers and PowerVault MD1000 storage arrays with Serial Attached SCSI (SAS) disks. These multi-socket PowerEdge servers are a good match for SQL Server 2005, as SQL Server licensing is determined by CPU socket, not by core. For example, a dual-core processor and a quad-core processor each count as a single processor for licensing. An additional processor core can deliver almost as much additional performance as an additional CPU.

Figure 5: Dell Microsoft BI/DW Test Configuration

For the test storage platform, Dell PowerVault MD1000 direct-attached storage arrays were chosen rather than SAN storage because of their lower cost. Since features such as consolidation and clustering were not implemented, direct-attached arrays were suitable for the tests. Regardless of the model and type of storage chosen, it is extremely important to properly size and configure database storage.

For the test data, a 1.8 Terabyte subset of the Microsoft Windows Update database was used. This database is a data warehouse that contains information pertaining to customers who have updated their computers via Microsoft Update. It contains several large tables, including a table with over 5 billion rows.

The hardware configurations for the test system are outlined in the tables that follow. Three physical servers were used: database, analysis, and reporting. For the database server storage, the user database files and the SQL Server tempdb files were physically separated in order to achieve optimal performance [1]. The 75 external disks were configured as 35 x 2-disk RAID 1 pairs and one hot spare per MD1000 array (five hot spares total). The SQL Server transaction logs were placed on the internal disks. In addition, disk storage for database backups was configured across 30 x 73 GB 10K SAS drives as four RAID 0 sets: two sets of eight drives and two sets of seven drives. These drives were configured as RAID 0 (striped, with no fault tolerance) for testing purposes only, as RAID 0 is not recommended for real-world scenarios.

[1] Performance of Microsoft SQL Server 2005 Business Intelligence and Data Warehousing Solutions on Dell PowerEdge Servers and Dell PowerVault Storage.

SQL SERVER 2005 DATABASE SERVER
  Dell PowerEdge 6950
  64 GB RAM (4 GB DIMMs)
  4 x dual-core AMD Opteron CPUs (2.8 GHz)
  4 x 146 GB, 15K SAS internal disks
  Four external SAS Disk Controllers (2 external connectors per controller)
  Sixteen Dell PowerVault MD1000 Storage Arrays (2 daisy-chained per controller connection)
  224 x 73 GB 15K SAS Drives (1 hot spare disk per MD1000)

  DATA TYPE                  STORAGE    # DISKS   RAID
  SQL Transaction Logs       Internal   4         2 x RAID 1 pairs
  SQL User DB Data Files     External   210       15 RAID 10 LUNs (14 disks per LUN)
  SQL Tempdb Data and Log    External   14        1 x RAID 10 LUN
  Hot Spares                 External   16        One per MD1000

SQL SERVER 2005 (SSAS) ANALYSIS SERVER
  PowerEdge 6950
  32 GB RAM
  4 x dual-core AMD Opteron CPUs
  4 x 146 GB, 15K SAS internal disks
  Two external SAS Disk Controllers
  Two Dell PowerVault MD1000 Storage Arrays
  30 x 73 GB 10K SAS Drives
  x64 (per processor licensing)
  MOM

SQL SERVER 2005 (SSRS) REPORTING SERVER
  PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks
  x64
  MOM

PERFORMANCE POINT SERVER 2007 ANALYSIS SERVER
  PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks
  Performance Point Server 2007
  MOM
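The external disk counts in the database server listing above can be cross-checked against the controller and array topology. The sketch below is illustrative only: the constant and function names are hypothetical, and the figure of 15 drive bays per MD1000 is an assumption, chosen because it is consistent with the single-array, 15-drive analysis server configuration in Appendix A.

```python
# Topology figures taken from the database server listing.
CONTROLLERS = 4
CONNECTORS_PER_CONTROLLER = 2
ARRAYS_PER_CONNECTOR = 2   # two daisy-chained MD1000 arrays per connector
BAYS_PER_ARRAY = 15        # assumption: 15 drive bays per MD1000

def array_count():
    """Total MD1000 arrays reachable through all controller connectors."""
    return CONTROLLERS * CONNECTORS_PER_CONTROLLER * ARRAYS_PER_CONNECTOR

def external_disk_budget():
    """External disks per role, following the storage table."""
    return {
        "user_db": 15 * 14,             # 15 RAID 10 LUNs of 14 disks each
        "tempdb": 14,                   # one RAID 10 LUN
        "hot_spares": array_count(),    # one hot spare per MD1000
    }

budget = external_disk_budget()
total_bays = array_count() * BAYS_PER_ARRAY
print(f"arrays: {array_count()}, drive bays: {total_bays}")
print(f"disks allocated: {sum(budget.values())} -> {budget}")
```

Under these assumptions the 224 data and tempdb drives plus 16 hot spares fill every bay of the sixteen arrays, so the layout leaves no unused slots.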

OVERVIEW OF TESTS PERFORMED (UPDATED)

Various tests were performed on the system to demonstrate performance under different processing conditions. In order to isolate and test the I/O subsystem, the Microsoft sqlio test tool was run. This tool simulates I/Os as SQL Server would generate them, but without utilizing the full end-to-end functionality of SQL Server. Because other components are eliminated, sqlio can stress the I/O subsystem to its limit. The result of this testing was peak read rates of up to 23,414 reads per second with 7 ms latencies, as shown here (using a single disk controller and two MD1000 PowerVaults):

  BLOCK SIZE (KB)   OUTSTANDING I/Os   READ OR WRITE   I/Os PER SEC   MB/SEC   LATENCY (MS)
  64                4                  Read            18,790         1,174    2
  64                8                  Read            16,415         1,025    6
  64                12                 Read            23,414         1,463    7
  64                16                 Read            18,893         1,180    6
  64                20                 Read            18,834         1,149    15
  64                32                 Read            23,440         1,465    19

As part of the testing process, indexes were created on some of the larger tables and the timings were recorded. For example, a non-clustered index was created on a table containing over five billion rows. It took 53 minutes 26 seconds, which is a rate of 1,563,285 rows per second indexed. This index creation performance shows that very large tables can be properly maintained within normal maintenance windows. In addition, with SQL Server 2005, indexes can be rebuilt online, without locking the underlying tables from user access.

Backup and restore testing was done with both Quest Software SQL Litespeed and Microsoft SQL Server native backup commands. A full database backup of the 1.8 TB database (containing over 1.1 TB of data) completed in less than 16 minutes. Sustained input from the data drives exceeded 1 GB/sec. The backups were written to the 30 disks dedicated to backup files.

In order to demonstrate the ability of the system to support a large user community under stress, a test was run which simulated multiple users accessing the database.
This test was taken from Microsoft Project REAL, which can be found on the web at http://www.microsoft.com/sql/solutions/bi/projectreal.mspx. This project was designed to develop best practices based on real customer scenarios. Microsoft Visual Studio Team Edition was used to generate the user loads. The test system was able to sustain 250 concurrent users with response times under five seconds. Figure 6 shows user response time (blue line) and user load (red line) over time.
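Several of the figures reported in this section can be sanity-checked with simple arithmetic. The sketch below is illustrative and makes assumptions that the paper does not state explicitly: the sqlio block size of 64 is taken to be in kilobytes, 1 MB is taken as 1,024 KB, and 1 TB as 1,024 GB. All function names are hypothetical.

```python
KB_PER_MB = 1024  # assumption: binary megabytes

def throughput_mb_per_sec(ios_per_sec, block_size_kb=64):
    """Throughput implied by an I/O rate at a fixed block size (assumed KB)."""
    return ios_per_sec * block_size_kb / KB_PER_MB

def rows_indexed(rows_per_sec, minutes, seconds):
    """Total rows processed at a steady rate over the given duration."""
    return rows_per_sec * (minutes * 60 + seconds)

def backup_rate_gb_per_sec(data_tb, minutes):
    """Average read rate needed to back up data_tb terabytes in the given time."""
    return data_tb * 1024 / (minutes * 60)

# I/O rates from the sqlio results table.
for ios in (18790, 16415, 23414, 18893, 18834, 23440):
    print(f"{ios:,} I/Os per sec -> {throughput_mb_per_sec(ios):,.1f} MB/sec")

print(f"rows indexed in 53 min 26 s: {rows_indexed(1563285, 53, 26):,}")
print(f"1.1 TB in 16 min: {backup_rate_gb_per_sec(1.1, 16):.2f} GB/sec")
```

Under these assumptions, five of the six sqlio rows reproduce the reported MB/sec values after truncation. The row with 20 outstanding I/Os does not, which may reflect a transcription difference in one of its two figures. The index rate and duration imply slightly more than five billion rows, and the backup figures imply an average read rate above 1 GB/sec, both in line with the text.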

Figure 6: Concurrent Query Performance Results

SECTION 7 CONCLUSION

Dell Solutions for SQL Server 2005 are designed to simplify operations, improve utilization, and cost-effectively scale as your needs grow over time. This reference configuration white paper provides a blueprint for deploying SQL Server 2005 Business Intelligence and Data Warehousing solutions on Dell PowerEdge servers and Dell PowerVault MD1000 storage arrays. Working together, Dell and Microsoft have designed, tested, and validated reference configurations for a range of BI/DW needs. The best practices described here are intended to help achieve optimal performance of SQL Server 2005.

To learn more about deploying SQL Server 2005 on Dell PowerEdge servers and Dell storage, please visit www.dell.com/sql or contact your Dell representative for up-to-date information on Dell servers, storage, and services for SQL Server 2005 solutions.

SECTION 8 APPENDIX A: DELL MICROSOFT REFERENCE PLATFORMS

Dell and Microsoft have partnered to develop a range of reference configurations for Microsoft BI/DW solutions utilizing Dell servers and storage. These reference configurations come in three sizes: 2 TB, 4 TB, and 7 TB DW databases using SQL Server 2005. The reference configurations call for a Database Server tier, an Analysis Services tier, and a Reporting Server tier, and are detailed on the following pages.

2 TERABYTE DATA WAREHOUSE

This configuration is designed for 2 Terabytes of data. It consists of a Database Server, an Analysis Server, and a Reporting Server as follows:

SQL SERVER 2005 DATABASE SERVER
  Dell PowerEdge 6950
  24 GB RAM (2 GB DIMMs)
  4 x dual-core AMD Opteron CPUs
  4 x 73 GB, 15K SAS internal disks
  Five external SAS Disk Controllers
  Five Dell PowerVault MD1000 Storage Arrays
  75 x 73 GB 10K SAS Drives

  DATA TYPE                  STORAGE    # DISKS   RAID
  SQL Transaction Logs       Internal   4         2 x RAID 1 pairs
  SQL User DB Data Files     External   54        27 x RAID 1 pairs
  SQL Tempdb Data and Log    External   16        8 x RAID 1 pairs
  Hot Spares                 External   5         One per MD1000

SQL SERVER 2005 (SSAS) ANALYSIS SERVER
  Dell PowerEdge 6950
  24 GB RAM (2 GB DIMMs)
  4 x dual-core AMD Opteron CPUs
  4 x 73 GB, 15K SAS internal disks
  One external SAS Disk Controller
  One Dell PowerVault MD1000 Storage Array
  15 x 73 GB 15K SAS Drives

SQL SERVER 2005 (SSRS) REPORTING SERVER
  Dell PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks

PERFORMANCE POINT SERVER 2007 ANALYSIS SERVER
  Dell PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks

Figure 7: 2 TB Server
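The nominal size of each reference configuration can be related to the mirrored capacity of its user database disks. The sketch below is a rough illustration rather than a sizing method from this paper: it uses the disk counts and drive sizes from the storage tables in this appendix, assumes that RAID 1 and RAID 10 both yield half of the raw capacity, and ignores formatting overhead and free-space headroom. The function name is hypothetical.

```python
def mirrored_capacity_gb(disk_count, disk_size_gb):
    """Usable capacity of RAID 1 / RAID 10 storage: half of the raw capacity."""
    return disk_count * disk_size_gb / 2

# (user database disks, drive size in GB) per reference configuration,
# taken from the "SQL User DB Data Files" rows of the appendix tables.
configs = {
    "2 TB": (54, 73),
    "4 TB": (54, 146),
    "7 TB": (210, 73),
}

for name, (disks, size_gb) in configs.items():
    usable = mirrored_capacity_gb(disks, size_gb)
    print(f"{name} configuration: {disks} x {size_gb} GB mirrored -> {usable:,.0f} GB usable")
```

Under these assumptions the user database storage comes to roughly 2 TB, 4 TB, and 7.7 TB respectively, which lines up with the three configuration names.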

4 TERABYTE DATA WAREHOUSE

This configuration is designed for 4 Terabytes of data. It consists of a Database Server, an Analysis Server, and a Reporting Server as follows:

SQL SERVER 2005 DATABASE SERVER
  Dell PowerEdge 6950
  32 GB RAM (2 GB DIMMs)
  4 x dual-core AMD Opteron CPUs
  4 x 146 GB, 15K SAS internal disks
  Five external SAS Disk Controllers
  Five Dell PowerVault MD1000 Storage Arrays
  75 x 146 GB 15K SAS Drives

  DATA TYPE                  STORAGE    # DISKS   RAID
  SQL Transaction Logs       Internal   4         2 x RAID 1 pairs
  SQL User DB Data Files     External   54        27 x RAID 1 pairs
  SQL Tempdb Data and Log    External   16        8 x RAID 1 pairs
  Hot Spares                 External   5         One per MD1000

SQL SERVER 2005 (SSAS) ANALYSIS SERVER
  Dell PowerEdge 6950
  32 GB RAM (2 GB DIMMs)
  4 x dual-core AMD Opteron CPUs
  4 x 146 GB, 15K SAS internal disks
  Two external SAS Disk Controllers
  Two Dell PowerVault MD1000 Storage Arrays
  30 x 73 GB 15K SAS Drives

SQL SERVER 2005 (SSRS) REPORTING SERVER
  Dell PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks

PERFORMANCE POINT SERVER 2007 ANALYSIS SERVER
  PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks
  Performance Point Server 2007
  MOM

Figure 8: 4 TB Server

7 TERABYTE DATA WAREHOUSE

This configuration is designed for 7 Terabytes of data. It consists of a Database Server, an Analysis Server, and a Reporting Server as follows:

SQL SERVER 2005 DATABASE SERVER
  Dell PowerEdge 6950
  64 GB RAM (4 GB DIMMs)
  4 x dual-core AMD Opteron CPUs (2.8 GHz)
  4 x 146 GB, 15K SAS internal disks
  Four external SAS Disk Controllers (2 external connectors per controller)
  Sixteen Dell PowerVault MD1000 Storage Arrays (2 daisy-chained per controller connection)
  224 x 73 GB 15K SAS Drives (1 hot spare disk per MD1000)

  DATA TYPE                  STORAGE    # DISKS   RAID
  SQL Transaction Logs       Internal   4         2 x RAID 1 pairs
  SQL User DB Data Files     External   210       15 RAID 10 LUNs (14 disks per LUN)
  SQL Tempdb Data and Log    External   14        1 x RAID 10 LUN
  Hot Spares                 External   16        One per MD1000

SQL SERVER 2005 (SSAS) ANALYSIS SERVER
  Dell PowerEdge 6950
  32 GB RAM (2 GB DIMMs)
  4 x dual-core AMD Opteron CPUs
  4 x 146 GB, 15K SAS internal disks
  Two external SAS Disk Controllers
  Two Dell PowerVault MD1000 Storage Arrays
  30 x 73 GB 15K SAS Drives

SQL SERVER 2005 (SSRS) REPORTING SERVER
  Dell PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks

PERFORMANCE POINT SERVER 2007 ANALYSIS SERVER
  PowerEdge 2970
  32 GB RAM
  2 x dual-core AMD Opteron CPUs
  4 x 73 GB, 10K SAS internal disks
  Performance Point Server 2007

Figure 9: 7 TB Server

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

© 2008 Dell Inc. Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.

Trademarks used in this text: Intel and Xeon are registered trademarks of Intel Corporation; AMD and Opteron are registered trademarks of AMD Corporation; EMC, Navisphere, and PowerPath are registered trademarks of EMC Corporation; Microsoft, Windows, and Windows Server are registered trademarks of Microsoft Corporation.

March 2008: Rev A03