Is Intel's Hyper-Threading Technology Worth the Extra Money to the Average User?

Andrew Murray
Villanova University
800 Lancaster Avenue, Villanova, PA, United States of America

ABSTRACT

In the mid-1990s, Intel Corporation decided to use symmetric multiprocessing (SMP) in order to increase the number of instructions that could execute simultaneously by putting more than one processor on a motherboard. This idea increased the overall performance of a system, but it was too expensive for the average user to afford. Intel then looked into the idea of simultaneous multithreading (SMT) for a single processor. The idea was to allow one processor to execute two threads simultaneously to increase the performance of the system. This technology was applied to Intel's processors and is called hyper-threading technology. This paper compares SMP to SMT and shows how the two technologies are similar, yet very different. It then describes hyper-threading technology in a little more detail and shows how hyper-threading processors compared to non-hyper-threading processors on a number of benchmarks and tests.

1. INTRODUCTION

Every year since their creation, computers have continued to save time for the people who use them by completing tasks with lightning speed. It is because of this that people are now looking for ways to get more out of this fabulous machine. They have begun making larger and more complex programs, executing multiple processes at the same time, and using computers to run the servers that are the backbone of the Internet. For the longest time, computer and processor architects were able to keep up with the growing demand for more speed, but they needed something that would put them ahead of the demand. The idea of symmetric multiprocessing (SMP) was an answer to this problem for large-scale computer users. Using more than one processor to handle the workload definitely increased the power of the computer.
The only problem was that this type of computer was a lot more expensive, and it was not feasible for the average user to go out and buy one. This meant that the average user was stuck with a single processor. In order to make the general public happy, processor architects did all they could to increase the speed of these processors to handle the average user's workload. To do this, they needed to add more transistors to increase the overall speed, but at the same time, these processors were consuming more power. This put the processor architects into a dilemma because speed sells, but they needed a way to improve the performance at a greater rate than transistor counts and power dissipation [3]. This is when the Intel Corporation created Hyper-Threading technology, which was based on the idea of simultaneous multithreading (SMT) [1]. The very broad idea was to take a single processor and enable it to execute two separate threads simultaneously in order to improve the overall performance without increasing power and transistor counts. However, the main question that arises is whether or not this technology is better and more cost efficient than a regular processor of the same speed without hyper-threading. A number of benchmarks have examined that very question, and the results, which will be seen later, can be very surprising.

2. SMP vs. SMT

Processor architects found a way to increase the overall performance of a system through the use of parallelism. Parallelism is the basic idea of having more than one independent thread executing simultaneously in order to boost the performance of a system [7]. In the mid-1990s, Intel decided to use the idea of parallelism by putting more than one processor in a machine in order to execute different threads of a process simultaneously. This became known as symmetric multiprocessing (SMP).
SMP did improve the overall performance of a system because it was able to execute more instructions simultaneously than a single processor could. What made it so powerful was its ability to continue execution. For example, if processor A receives an instruction to execute but must stall during its execution, processor B can receive the next instruction or an instruction from another program, which keeps the system busy and hides the stall latency on processor A. This is a huge improvement over single processor systems. In order for SMP systems to achieve these types of results, they must share the system resources and find different ways to schedule threads for all the available processors. Programs that are already written for multithreaded environments fit perfectly into this type of system and will see a drastic increase in performance. However, non-multithreaded programs need to be scheduled in a way that achieves a multithreaded state. One way to achieve this is through out-of-order execution of instructions. This means that the processor or compiler takes multiple instruction sequences that were meant to be executed in a specific order and reschedules them so that they can be executed with the highest efficiency [3, 4]. This helps to ensure that all the processors receive some work to do and helps increase the overall performance. SMP is still used today, especially in environments such as servers that require a lot of work to be done in a relatively short amount of time.

The only problem was how to bring the idea of SMP to the home user. SMP systems are relatively expensive because there is more than one processor on a motherboard and the hardware becomes more difficult and expensive to create. This cost puts the technology out of reach for most average users. The problem was finally solved when the idea of simultaneous multithreading (SMT) was applied to processors. SMT is based on an idea known as thread-level parallelism (TLP), where multiple independent execution states can occur within a larger program context [3, 7]. Intel looked into this idea as a way to gain better performance relative to transistor count and power [3]. It was discovered that when TLP is utilized, the overall performance of the program is increased. That is when Intel architects decided to apply this idea, by allowing the processor to handle multiple threads, in order to increase the performance of the processor. They decided to use SMT, which is a very fine-grained form of hardware multithreading that allows simultaneous execution of more than one thread [1].
The main advantage of SMT is its ability to better utilize processor resources and to hide memory hierarchy latency by providing more independent work to keep the processor busy [1]. This is a similar idea to SMP, but instead of having more than one processor, everything occurs in the same processor. This has become known as Intel's new technology called Hyper-Threading.

3. Hyper-Threading

Hyper-Threading Technology makes a single physical processor appear as multiple logical processors [3, 5, 6]. In order for this to happen, a copy of the architecture state is given to each logical processor so that two separate threads can execute at the same time, one on each architecture state. Each logical processor shares a single set of physical execution resources, whereas the separate processors of an SMP system share only the system resources.

Figure 1: Processors without Hyper-Threading Technology [3]

Figure 1 shows what a classic two processor SMP system would look like during its execution. Each processor has its own architecture state in order to execute separate threads simultaneously.

Figure 2: Processors with Hyper-Threading Technology [3]

Figure 2 shows what a two processor SMP system would look like with hyper-threading technology. Each processor now contains two copies of the architecture state, which means that each processor can execute two threads simultaneously. Since each of the two processors can execute two threads, a two processor SMP system with hyper-threading could execute four threads simultaneously; in other words, it appears to have four logical processors [3]. This apparent increase in the number of processors, without that number of physical processors being present, is a huge space saver on the motherboard and is also more cost efficient. Hyper-threading uses these logical processors in order to increase the performance of the system.
It accomplishes this by switching the utilization of chip resources from the currently executing thread to a new thread when the currently executing thread initiates a long latency operation [7]. This ensures that any long pipeline stalls can be avoided by allowing the second logical processor to take over execution as the other logical processor stalls.
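The switching behavior described above can be illustrated with a small toy model. The sketch below is only an illustration under simplifying assumptions and is not Intel's actual scheduling logic: each thread is a list of hypothetical operation latencies in cycles, the core issues at most one operation per cycle, and a thread whose operation takes longer than one cycle is treated as stalled until that operation completes. The function names and the example latencies are invented for this sketch.

```python
def run_single_context(threads):
    """Cycles needed when one architecture state runs the threads back to back.

    Each thread is a list of operation latencies in cycles. A latency greater
    than 1 models a long latency operation during which the core sits idle.
    """
    return sum(sum(ops) for ops in threads)


def run_switch_on_stall(threads):
    """Cycles needed when each thread has its own architecture state.

    Toy assumption: the core issues one operation per cycle and prefers the
    lowest-numbered thread that is not stalled, so it only switches to another
    thread when the preferred one is waiting on a long latency operation.
    """
    next_op = [0] * len(threads)   # index of the next operation per thread
    ready_at = [0] * len(threads)  # cycle at which each thread can issue again
    cycle = 0

    def unfinished():
        return [i for i, ops in enumerate(threads) if next_op[i] < len(ops)]

    while unfinished():
        runnable = [i for i in unfinished() if ready_at[i] <= cycle]
        if runnable:
            i = runnable[0]
            ready_at[i] = cycle + threads[i][next_op[i]]
            next_op[i] += 1
            cycle += 1
        else:
            # Every remaining thread is stalled: skip ahead to the first wake-up.
            cycle = min(ready_at[i] for i in unfinished())
    return max(ready_at, default=0)


if __name__ == "__main__":
    # Hypothetical workloads: thread A hits one 10-cycle stall, thread B never stalls.
    thread_a = [1, 1, 10, 1]
    thread_b = [1, 1, 1, 1, 1, 1]
    print("one architecture state: ", run_single_context([thread_a, thread_b]))
    print("two architecture states:", run_switch_on_stall([thread_a, thread_b]))
```

With these made-up latencies the model needs 19 cycles with a single architecture state and 13 cycles with two, because thread B's six operations fit inside thread A's stall. When neither thread stalls, both functions return the same total, which is consistent with the idea that the benefit comes from hiding long latency operations.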

This apparent increase in performance should make the decision of buying a hyper-threading processor an easy one. If it can run two threads simultaneously and help hide long latency operations, then it must be a better processor than one of the same speed without hyper-threading. The next section will prove or disprove that statement by showing a number of benchmarks and tests that were run on two processors of similar speed, only one of which had hyper-threading technology.

4. Analysis

This section takes a look at six benchmarks and tests that were taken from three online sites that perform a number of hardware performance benchmark tests. These sites are hardwareanalysis.com, linuxhardware.org, and tomshardware.com. Each site used a number of different benchmarks that are supposed to cover the range of user actions, like playing games, audio/video, and multitasking. They each used a Pentium GHz processor with hyper-threading technology. They were able to disable the hyper-threading technology in order to obtain the non-hyper-threading results. The first test measures how many frames per second can be achieved while playing the video game Quake 3, which is a very intense video game that requires a lot of graphics.

Figure 3: Quake 3 Demo [2]

Figure 3 shows the results of the Quake 3 Demo benchmark at three different resolutions. At the first two resolutions, the hyper-threading processor does a little bit better than the non-hyper-threading processor. However, at the last resolution, the non-hyper-threading processor just edges out the hyper-threading processor. It can be seen that there is no real advantage in having a hyper-threading processor because the results are too close to show a statistically significant difference.

Figure 4: SPECViewperf 7.0 [2]

The next benchmark is a professional graphics benchmark shown in Figure 4. Its results are measured in frames per second and, as can be seen, the hyper-threading technology does not make much of a difference. The results are way too close to those of the non-hyper-threading processor to say that either processor is better than the other.

Figure 5: BAPCo SYSMark 2002 [4]

Figure 5 shows the BAPCo SYSMark 2002 benchmark, which is based on scripted runs of several popular office and workstation applications, including Microsoft Office, Adobe Photoshop and Premiere, and other popular applications. One of the best features of SYSMark 2002 is that it runs multiple applications at once, unlike its predecessors, so it is much more realistic and representative of how an actual user would work at the PC [4]. It shows that the hyper-threading processor gives about a 3% increase in performance over the non-hyper-threading processor. This is also not a large enough difference to make it clear that one processor is better than the other.

After three benchmarks, the hyper-threading processor has done just slightly better than the non-hyper-threading processor. However, there has not been enough evidence to say that it is better than the non-hyper-threading processor. The next three benchmarks try to simulate the situations in which hyper-threading should be the best choice.

Figure 6: MadOnion 3DMark2001SE [4]

The MadOnion 3DMark2001SE benchmark is a single process and is not multithreaded. When it is executed on both the hyper-threading and non-hyper-threading processors, the result is about the same. However, when SETI@Home is executed in the background, the hyper-threading processor improves the performance by almost 40% compared to the other processor. SETI@Home is a processor-intensive application that requests only specific execution resources, leaving the other resources to be used by the other logical processor and thus increasing the overall performance [4]. This is the first case in which resources were available to each logical processor, so the true nature of hyper-threading could be put to use, and the effect can definitely be seen.

Figure 7: 1GB ZIP Compression [4]

The next test, shown in Figure 7, measures how compressing a 1 GB file fares on the two processors. Since the compression program is not multithreaded, no improvement in performance is expected from the hyper-threading processor. However, when more programs begin to execute, like the mp3 playing with a plug-in, the hyper-threading processor is able to handle the workload a lot better than the non-hyper-threading processor. This is a good example of how well a hyper-threading processor can handle multitasking, which is what most average users are going to be doing.

Figure 8: SiSoft Sandra 2002 SP1 [7]

The last benchmark, shown in Figure 8, is the SiSoft Sandra 2002 SP1 benchmark. It is a multimedia benchmark and, from the results, a Pentium GHz hyper-threading processor beat every other processor. It even beat a Pentium GHz processor in this particular benchmark. This benchmark shows how well a hyper-threading processor can handle integer MMX, floating point SSE and SSE2, and 3DNow operations compared to the other processors.
Overall, it can be seen through these various benchmarks and tests that hyper-threading technology performs best in a multitasking environment or when a particular application is multithreaded. This is a problem because a lot of developers were not aware of the potential benefits of hyper-threading when writing programs such as Adobe Photoshop and Windows Media Decoder [7]. Once developers realize this benefit and start to develop multithreaded applications, the true power of hyper-threading will be seen across those processors. Until then, the only situation in which a hyper-threading processor is beneficial is the multitasking environment. A good area for research is to determine how much multitasking must be done at once, with what size programs, and whether or not any of the programs are multithreaded, in order to achieve the best overall performance gain from a hyper-threading processor.
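The percentage comparisons used throughout this section follow from simple arithmetic on a pair of benchmark scores. The sketch below is a minimal illustration: the scores are hypothetical placeholders chosen only to reproduce gains of about 3% and 40% and are not the measured values from the cited benchmarks, and the 5% threshold for calling a difference meaningful is an assumption made for this example.

```python
def relative_gain(ht_score, baseline_score):
    """Percent improvement of ht_score over baseline_score.

    Assumes a higher score is better (for example frames per second).
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (ht_score - baseline_score) * 100.0 / baseline_score


def is_meaningful(gain_percent, noise_percent=5.0):
    """Treat a gain as meaningful only if it exceeds an assumed noise margin."""
    return abs(gain_percent) > noise_percent


if __name__ == "__main__":
    # Hypothetical scores, not taken from the cited benchmark results.
    for label, ht, base in [("office workload", 103.0, 100.0),
                            ("background task running", 140.0, 100.0)]:
        gain = relative_gain(ht, base)
        print(f"{label}: {gain:.1f}% gain, meaningful: {is_meaningful(gain)}")
```

Under the assumed 5% margin, a 3% gain is treated as too close to call while a 40% gain is not, which mirrors how the results above are interpreted.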

5. CONCLUSION

Intel's hyper-threading technology is a very creative way to increase the overall performance of a processor without having to add more transistors or consume more power. It was shown that in certain situations hyper-threading was able to drastically increase the performance of a system. However, these situations were very limited and very dependent on the nature of the workload being done by the user. Multithreaded applications and multitasking were the only situations in which a substantial improvement was observed. This means that single, non-multithreaded applications that were executed alone showed no improvement with this new technology. This is a problem because a lot of average computer users only execute one application at a time. It is only the more advanced home users, who really multitask these non-multithreaded applications, that would see a significant performance gain with hyper-threading technology.

All of this evidence shows that hyper-threading technology is only worth the extra money if the user is going to take advantage of its power. This means that only users who will use multithreaded applications or who do a lot of multitasking should consider purchasing a processor with this additional power. Otherwise, the extra money is simply wasted, since there is no significant gain in performance when the processor is not used with multithreaded applications or in a multitasking environment. This conclusion should change within the next couple of years as software engineers begin to create more multithreaded applications. However, until multithreaded applications become the norm in the computer industry, a processor with hyper-threading technology should only be purchased if the user intends to take advantage of what this new technology has to offer.

REFERENCES

[1] J. R. Bulpin and I. A. Pratt. Multiprogramming Performance of the Pentium 4 with Hyper-Threading. In Proceedings of 31st International Symposium on Computer Architecture (ISCA-31), pages Munich, Germany, June
[2] Linux Hardware Hyper-Threading Benchmarks. Linux Hardware.com &mode=thread
[3] D. Marr et al. Hyper-Threading Technology Architecture and Microarchitecture: A Hypertext History. Intel Technology J., vol. 6, issue 1, Feb
[4] S. Sassen. Hyper-Threading on the Desktop. Hardware Analysis.com. Nov. 14,
[5] The Standard Performance Evaluation Corporation.
[6] N. Tuck and D. M. Tullsen. Initial observations of the simultaneous multithreading Pentium 4 processor. In Proceedings of the 12th International Conference on Parallel Architecture and Compilation Techniques (PACT 2003), pages IEEE Computer Society, Sept
[7] F. Volkel et al. Single CPU in Dual Operation: P GHz with Hyper-Threading Technology. Tom's Hardware Guide. Nov. 2,
