Improving Performance and Power of Multi-Core Processors with Wonderware and System Platform 3.0

Similar documents
Understanding Dual-processors, Hyper-Threading Technology, and Multicore Systems

Parallelism in Hardware

Multicore Hardware and Parallelism

MICROPROCESSOR ARCHITECTURE

Multi-threading technology and the challenges of meeting performance and power consumption demands for mobile applications

Chapter 1: Introduction to the Microprocessor and Computer 1 1 A HISTORICAL BACKGROUND

Scaling through more cores

The Benefits of Component Object- Based SCADA and Supervisory System Application Development

ECE 486/586. Computer Architecture. Lecture # 2

Comprehensive Training for Wonderware System Platform

MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS

InfoBrief. Dell 2-Node Cluster Achieves Unprecedented Result with Three-tier SAP SD Parallel Standard Application Benchmark on Linux

IBM InfoSphere Streams v4.0 Performance Best Practices

Computers: Inside and Out

Operator actions are initiated in visualization nodes, processed in dedicated server nodes, and propagated to other nodes requiring it.

Now we are going to speak about the CPU, the Central Processing Unit.

Benchmark Performance Results for Pervasive PSQL v11. A Pervasive PSQL White Paper September 2010

Manual Sql Server 2012 Express Limit Cpu

Virtuozzo Hyperconverged Platform Uses Intel Optane SSDs to Accelerate Performance for Containers and VMs

2008 International ANSYS Conference

EE5780 Advanced VLSI CAD

Concurrency & Parallelism, 10 mi

Business Aspects of FibeAir IP-20C

LECTURE 1. Introduction

Microprocessor Trends and Implications for the Future

STAR Watch Statewide Technology Assistance Resources Project A publication of the Western New York Law Center,Inc.

Fundamentals of Quantitative Design and Analysis

Parallelism and Concurrency. COS 326 David Walker Princeton University

CS377P Programming for Performance Multicore Performance Multithreading

Introduction to OS. Introduction MOS Mahmoud El-Gayyar. Mahmoud El-Gayyar / Introduction to OS 1

Broadcast-Quality, High-Density HEVC Encoding with AMD EPYC Processors

Industrial Application Server Glossary. Appendix C

MICROPROCESSOR TECHNOLOGY

Advances of parallel computing. Kirill Bogachev May 2016

Competitive Power Savings with VMware Consolidation on the Dell PowerEdge 2950

Key Considerations for Improving Performance And Virtualization in Microsoft SQL Server Environments

Harnessing Multicore Hardware to Accelerate PrimeTime Static Timing Analysis

Map3D V58 - Multi-Processor Version

VLSI Design Automation

Multicore Computing and Scientific Discovery

WHITE PAPER AGILOFT SCALABILITY AND REDUNDANCY

Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.

Optimizing Performance of C++ Threading Libraries

Let s say I give you a homework assignment today with 100 problems. Each problem takes 2 hours to solve. The homework is due tomorrow.

WebSphere Application Server Base Performance

HISTORY OF MICROPROCESSORS

VLSI Design Automation

45-year CPU Evolution: 1 Law -2 Equations

IJESR Volume 1, Issue 1 ISSN:

Parallel Programming Multicore systems

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator

L-Series 30 User Environment

Performance of computer systems

The Implications of Multi-core

Outline Marquette University

Determining the Relevancy of Moore's Law Through the Comparison of Ten Distinct Processor Systems

UNIT 1.1 SYSTEMS ARCHITECTURE MCQS

Processor Performance and Parallelism Y. K. Malaiya

1.3 Data processing; data storage; data movement; and control.

Manual Sql Server 2012 Express Limit Cpu Cores

DELL EMC POWER EDGE R940 MAKES DE NOVO ASSEMBLY EASIER

Waiting on IO: The Straw That Broke Virtualization s Back

Since the invention of microprocessors at Intel in the late 1960s, Multicore Chips and Parallel Processing for High-End Learning Environments

7/28/ Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc Prentice-Hall, Inc.

Lecture 28 Multicore, Multithread" Suggested reading:" (H&P Chapter 7.4)"

Parallelism Marco Serafini

Double Rewards of Porting Scientific Applications to the Intel MIC Architecture

The Hardware Realities of Virtualization

CS758: Multicore Programming

What is This Course About? CS 356 Unit 0. Today's Digital Environment. Why is System Knowledge Important?

Enhancing Analysis-Based Design with Quad-Core Intel Xeon Processor-Based Workstations

Low Latency Evaluation of Fibre Channel, iscsi and SAS Host Interfaces

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

The Beauty and Joy of Computing

Introduction to GPU computing

Whitepaper / Benchmark

Performance of Multicore LUP Decomposition

Twos Complement Signed Numbers. IT 3123 Hardware and Software Concepts. Reminder: Moore s Law. The Need for Speed. Parallelism.

Introduction to Parallel Computing

SUPERMICRO, VEXATA AND INTEL ENABLING NEW LEVELS PERFORMANCE AND EFFICIENCY FOR REAL-TIME DATA ANALYTICS FOR SQL DATA WAREHOUSE DEPLOYMENTS

Multi-threaded Queries. Intra-Query Parallelism in LLVM

Accelerating Implicit LS-DYNA with GPU

(ii) Why are we going to multi-core chips to find performance? Because we have to.

1.13 Historical Perspectives and References

Concurrent Programing: Why you should care, deeply

Why do we care about parallel?

Evaluation Report: HP StoreFabric SN1000E 16Gb Fibre Channel HBA

Exercise 1 Due 02.November 2010, 12:15pm

Introduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29

RIGHTNOW A C E

Comparing SYMMIC to ANSYS and TAS

Software within building physics and ground heat storage. HEAT3 version 7. A PC-program for heat transfer in three dimensions Update manual

Alarm Archiver Object for Wonderware Application Server Alarm Archiver Object User Guide Ver 1.0 Rev 1.1

CS Computer Architecture Spring Lecture 01: Introduction

An Introduction to Parallel Programming

Benchmarking AMD64 and EMT64

administrivia final hour exam next Wednesday covers assembly language like hw and worksheets

The HP Blade Workstation Solution A new paradigm in workstation computing featuring the HP ProLiant xw460c Blade Workstation

Microelettronica. J. M. Rabaey, "Digital integrated circuits: a design perspective" EE141 Microelettronica

Intel Core i7 Processor

Transcription:

Improving Performance and Power of Multi-Core Processors with Wonderware and Krishna Gummuluri, Wonderware Software Development Manager

Introduction Wonderware has spent a considerable effort to enable its Applications Server software component within the Wonderware to take advantage of new breakthroughs in IT hardware and software technologies. As a result, most Wonderware Application Server users will be able to enhance the power of multi-core processors by implementing the Wonderware. The proper distribution of loads between multiple application engines enables users to maximize the benefits of run-time hardware parallelism provided by multi-core CPUs. Users also will experience shorter development times due to the optimizations in configuration services. In addition, software helps make it easier to develop and test applications incrementally due to faster and more robust configuration time operations such as application deployment, un-deployment, redeployment, check-in, check-out, configuration loading and export. Evolution of Processor Technology For the past four decades Moore s law, which states that the number of transistors that can be placed on a chip will double every 24 months, has held true. Until recently chip manufacturers kept up with this law by increasing the clock speed to achieve increased performance. However, increasing clock speed is not viable anymore due to heat dissipation and power consumption constraints. Instead of trying to increase the clock speed, chip manufacturers shifted to newer multi-core processor architectures. A multicore processor is a single integrated circuit in which two or more processors have been attached for enhanced performance, reduced power consumption and more efficient simultaneous processing of multiple tasks. The advent of multi-core (e.g. dual-core, quad-core and 8-core) processors presents both new opportunities and challenges. With increased hardware parallelism more work can be accomplished on a single processor. However, to take advantage of this hardware parallelism, software applications will need to be appropriately and adequately multithreaded. Unthreaded or improperly threaded software applications will perform poorly and become less competitive in the marketplace. For example, on a dual-core processor, an unthreaded or improperly threaded software program may utilize as little as 50 percent of the theoretical CPU available, whereas a multithreaded program could achieve up to two times improvement in performance. Page 2

The Wonderware Application Server componentized run-time architecture and its multithreaded configuration services make it multi-core-ready. The Application Server, built using "multi-core-ready libraries, runs on the Windows operating system. Therefore, the Wonderware Application Server satisfies the multi-core-readiness requirements as defined by Intel, which requires the four layers of the software stack, including the operating system, device drivers, application libraries and development tools, to be multithreaded and multi-core-ready. Wonderware Application Server 3.0 is multi-core-ready Componentized Run-time Architecture Takes Advantage of Multi-Core Processors The Wonderware Application Server run-time services are separated into several distinct processes and services (separate executables) such as aabootstrap.exe, nmxsvc.exe, view.exe and aaengine.exe, one for each Application Engine. Redundancy services enable aabootstrap.exe and aaengine.exe to run on separate worker threads. Nmxsvc.exe, which provides off-platform communication services to application engines, also is multithreaded. If Wonderware InTouch HMI is used for visualization on a run-time node, it also runs as a separate process (view.exe). Optimal componentization and multithreading of essential run-time services has been achieved through careful application of standard computer science principles such as functional and data decomposition. On multi-core processors, at run-time, these distinct processes and threads could be running on separate cores as determined by the operating system. This provides better performance in terms of CPU utilization and added robustness to end-user applications. The tests below compare CPU behavior and load on single-core (2.80 GHz Intel Pentium 4) and two dual-core (2.66 GHz Intel Xeon ) processors. The first test was performed on the single core system using 2,500 objects that communicate with a PLC and perform calculations. Single-Core application details test 1 1 Application Engine with 2,500 Objects, 2,048 I/O, 2 Active scripts on each object 1 Application Engine with ABCIP Objects with 2,500 Active advised I/O Page 3

The total average CPU load is 43 percent as shown in the chart below. CPU load for single core test The same application is then spread across four Wonderware Application Engines and run on a computer with two dual-core processors with the following specification: Quad-Core application details test 1 4 Application Engines with 2,500 Objects, 2,048 I/O, 2 Active scripts on each object 4 Application Engines with ABCIP Objects with 2,500 Active advised I/O The total average CPU load is 7 percent as shown in the chart below. The chart also shows that the load is distributed across all the processor cores. Page 4

CPU load for multi-core test Finally the application size was quadrupled and the same test was run on the computer with the two dual-core processors: Quad-Core application details test 2 4 Application Engines with 10,000 Objects, 8,192 I/O, 2 active scripts on each object 4 Application Engines with ABCIP Objects with 10,000 active advised I/O The total average CPU load is now 21 percent as shown in the chart below. The chart again shows that the load is distributed across all the processor cores. CPU load with increased application size on multiple cores Page 5

Although the application size on the quad-core system is four times than that running on the single-core system the screen shots in the figures above demonstrate that the Wonderware Application Server runs more efficiently on the two dual-core CPUs, consuming roughly half the CPU load. Observations during the stress and functional testing have shown that on heavily loaded systems (i.e. running above the recommended/supported system loads) important operations such as failover caused by hardware or network failures and off-platform object-to-object communication were significantly faster on multi-core processors than on single-core processors. This observation can again be directly correlated to the componentized architecture and proper multithreading of the run-time services. Configuration Services Optimized to Run Faster on Multi-Core Processors The Wonderware Application Server provides a number of configuration services through the aagr.exe process. In general, these configuration services help users in developing applications and maintaining them. In version 3.0, a number of the more frequently used and computationally intensive operations are multithreaded to take advantage of hardware parallelism provided by multi-core processors. Without these optimizations, the same operations could perform much slower than on a single core processor. Significant performance improvements can be observed in deploy and un-deploy operations when there are a large number of platforms and engines in this Galaxy (A Galaxy is an Application Server implementation). The chart below compares the performance of optimized (as in Application Server Version 3.0) and non-optimized deployment, non-deployment and redeployment operations on a dual-core processor. These performance benchmarks were collected using a test Galaxy with 40 platforms and 40 Application Engines. Page 6

The above chart clearly demonstrates how poorly an un-optimized (un-threaded or improperly threaded) software program could perform on a multi-core processor when compared to an optimized software program such as the Wonderware Application Server. The chart also shows the benefits of using multi-core computers in distributed applications, where each computer is able to perform optimally without waiting for the other computers. Further, configuration operations heavily use Microsoft SQL Server 2005 which in turn is optimized for multi-core processors. Multi-user application development scenarios, which involve multiple users working on the same application, should also perform better. Overall, the Wonderware Application Server helps make developing and managing ArchestrA technology applications on multi-core CPUs faster and easier. As the number of cores on a single chip is sure to increase in the not-so-distant future, these optimizations ensure the scalability of customer applications built with the Wonderware Application Server. Future-proof your application with Wonderware Application Server 3.0 Page 7