Difference between revisions of "Stress Test"

From ArchWiki
m (CPU Stressing Programs)
(40 intermediate revisions by 10 users not shown)
Line 1: Line 1:
 +
[[Category:CPU]]
 
== Introduction ==
 
== Introduction ==
Running Arch on an overclocked PC is totally fine provided that the PC is stable at the overclock settings.  There are several programs available to you that will help you stress test your system.  The steps of overclocking a PC are beyond the scope of this article, but there is pretty inclusive guide written by graysky on the topic you can read at your leisure.
+
Running an overclocked PC is totally fine provided that the PC is stable at the overclock settings.  There are several programs available to assess system stability through stress testing the system and thereby the overclock level.  The steps of overclocking a PC are beyond the scope of this article, but there is pretty inclusive guide written by graysky on the topic: [[http://www.hardforum.com/showthread.php?t=1198647 Overclocking guide]].
  
[[http://www.hardforum.com/showthread.php?t=1198647 Overclocking guide]]
+
{{Note|The linked guide is a bit dated. More contemporary guides are recommended for modern hardware.}}
  
== Stressing Memory ==
+
== Discovering Errors ==
A very good program for stress testing your memory is [[http://www.memtest.org/ Memtest86+]].  It is based on the well-known original memtest86 written by Chris BradyMemtest86+ is, like the original, released under the terms of the Gnu Public License (GPL). No restrictions for use, private or commercial exist other than the ones mentioned in the Gnu Public License (GPL).
+
Some stressing applications like mprime and linpack (see below) have built in consistency checks to discover errors due to non-matching results.  A more general and simple method for measuring hardware instabilities can be found in the kernel itselfTo use it, simply watch the output from the kernel ring buffer by this command:
 +
# cat /proc/kmsg
  
You may download it from the webhost of memtest86+ [[http://www.memtest.org/#downiso here]] either as a bootable CD ISO or as an pre-compiled bootable binary.  The later can be called by GRUB with a minor modification to your menu.lst to allow you to boot directly into Memtest86+ without a CD/DVDROM.
+
The key error to watch for looks like this:
  
=== Running Memtest86+ ===
+
  [Hardware Error]: Machine check events logged
Either download and burn the ISO to a CD and boot from it, or follow the instructions in the next section to add an entry to your GRUB boot menu. Either way when you enter Memtest86+, the application begins testing your memory without your intervention.  It will run indefinitely until you stop it reporting any errors as it goes.  When it has completed a number of iterations without errors or runs for an arbitrary amount of time without errors, you can pretty much call your memory "good" or "stable" at the settings you have chosen for it in your BIOS.
+
  
{{tip|Allowing Memtest86+ to run for >10 cycles without errors is usually sufficient.}}
+
The kernel can throw these errors during an mprime run before mprime itself finishes the calculate and reports the error thus providing a very sensitive method to assess stability.
=== Running Memtest86+ from GRUB's Bootscreen ===
+
Download the pre-compiled bootable binary from the webhost above and place the .bin file from the archive in your /boot directory.  I renamed the binary to simply 'memtest86.bin' on my system.  Next edit your /boot/grub/menu.lst and add the following entry:
+
Title  Memtest86+ v2.11 (28-Dec-2008)
+
root            (hd0,2)
+
kernel          /boot/memtest86.bin
+
  
You will obviously need to change your root line to match that of your own system.  Remember that the (hdx,y) format takes its inputs starting from 0, not 1In other words, your first hardrive is #0 and your first partition is also #0If you're root partition resides on the 1st partition of the 1st HDD you would use the following line:
+
== CPU Stressing Programs ==
root      (hd0,0)
+
These are listed in two categories: 'higher demand voltage' and 'medium demand voltage'.  It is important to use some from each category to evaluate system stabilityIronically, machines can be more sensitive to selections from the 'medium demand' category than from the 'high demand' category'Higher demand voltage' programs demand the most vcore when run due to intense hardware usage.  'Medium demand voltage' programs do not always call for the highest vcore when running and as such can be more prone to throwing errors for systems that are undervolted relative to the clock speed requested.
  
{{note|If your system that uses a dedicated /boot partition you MUST omit the preceding '/boot' from the kernel line of the above example. Your kernel line in this scenario would read simply, 'kernel      /memtest86.bin'}}
+
Example on an overclocked i7-3770K (4.50 GHz); vcore is +0.020 V in offset mode with all powersaving features enabled.
  
== Stressing CPU and/or Memory ==
+
Idle: 0.7440 V - 0.8320 V (varies).
A very good program for CPU and CPU/memory stress testing is [[http://www.mersenne.org/ prime95]]There are both x86 and x86_64 version for Linux you can freely use for stress testing purposes under LinuxPrime95 under torture test mode will preform a series of very CPU intensive calculations and compare the values it gets to known good valuesThe theory is that if your system is sufficiently stable to get the right answers, it should be stable to most anything you will throw at itPrime95 is pretty much recognized universally as one defacto measure of an overclocked system's stability.
+
Mprime small FFTs: 1.2880 V (steady).
 +
Mprime large FFTs: 1.3040 V (steady).
 +
  Mprime blend: 1.2960 V (steady).
 +
  Linpack: 1.2320 V - 1.2720 V (varies).
 +
  x264 encoding: 1.2320 V - 1.2720 V (varies).
 +
  gcc compiling: 1.2720 V (steady).
  
=== Getting Prime95 ===
+
This machine running with a vcore of +0.005 (in offset mode) remains stable in both mprime and linpack for hours, but throws errors under both x264 and gcc after only several minutes.
Download either the 'Linux' version (i.e. x86) or the 'Linux64' version (i.e. x86_64) from the aforementioned website.  It is a precompiled binary so once you have untarred the archive file, simply run mprime as you would any other executable under Linux.
+
  
  $ tar zxvf mprime259-linux64.tar.gz
+
{| class="wikitable" align="center"
 +
|-
 +
! Voltage Demand !! Program !! Description
 +
|-
 +
| rowspan="4" bgcolor=#fffacd| '''<span style="color: #CD8500;">Medium</span>'''
 +
|-
 +
| ''Cc/Gcc'' || Both cc/gcc compilation is a great method of stress testing. Both are available in the ''base-devel'' group.
 +
|-
 +
| ''HandBrake-cli'' || {{Pkg|handbrake-cli}} can be used to encode using high quality settings.
 +
|-
 +
| ''Systester'' || {{AUR|systester}} Systester is a multithreaded piece of software capable of deriving values of pi out to 128,000,000 decimal places. It has built in check for system stability.
 +
|-
 +
| rowspan="3" bgcolor=#f7e3e3| '''<span style="color: #e62c2c;">High</span>'''
 +
| ''mprime'' ||  {{AUR|mprime-bin}} factors large numbers and is an excellent way to stress CPU and memory.
 +
|-
 +
| ''linpack'' ||  {{AUR|linpack}} - Linpack makes use of the BLAS (Basic Linear Algebra Subprograms) libraries for performing basic vector and matrix operations. and is an excellent way to stress CPUs for stability.
 +
|-
 +
|}
  
=== Running Prime95 ===
+
== Memory Stressing Programs ==
To run prime95, simply enter the directory you unpacked the archive and run mprime:
+
=== Memtest86+ ===
$ cd mprime259-linux64
+
Memtest86+ is a standard memory testing util and is packaged in [extra].
$ ./mprime
+
  
{{note| That if you're using a cpu-frequency scaler such as [[cpufrequtils]] or [[powernowd]] you will have to manually set your processor to run with its highest multiplier because prime95 uses a nice value that doesn't trip the step-up in your multiplier}}
+
== Stressing CPU and Memory ==
 +
===Mprime (Prime95 for Windows and MacOS)===
 +
 
 +
Prime95 is recognized universally as one defacto measure of system stability.  Mprime under torture test mode will preform a series of very CPU intensive calculations and compare the values it gets to known good values.
 +
 
 +
Prime95 for Linux is called {{AUR|mprime}} and is available in the AUR.
 +
 
 +
{{Warning|Before proceeding, it is '''HIGHLY''' recommended that users have some means to monitor the CPU temperature.  Packages such as [[Lm_sensors]] can do this.}}
 +
 
 +
To run mprime, simply open a shell and type "mprime"
 +
$ mprime
 +
 
 +
{{note| If using a cpu-frequency scaler such as [[cpufrequtils]] or [[powernowd]] sometimes, users need to manually set the processor to run with its highest multiplier because mprime uses a nice value that doesn't always trip the step-up in multiplier.}}
 +
 
 +
When the software loads, simply answer 'N' to the first question to begin the torture testing:
  
When the software loads, simply answer 'N' to the first question to begin the torture testing.  The software begins with the torture test, but if you hit {{Keypress|CTRL}} + {{Keypress|C}} you can break out and return to the main prime95 menu shown here:
 
 
  Main Menu
 
  Main Menu
 
   
 
   
Line 63: Line 92:
 
There are several options for the torture test (menu option 15).
 
There are several options for the torture test (menu option 15).
  
Use small FFTs or In-place large FFTs (options 1 and 2 respectively) mainly for CPU stress testing.  Option 2 (In-place large FFTs) are preferred.
+
* Small FFTs (option 1) to stress the CPU (option 1)
 +
* In-place large FFTs (option 1) to test the CPU and memory controller
 +
* Blend (option 3) is the default and constitutes a hybrid mode which stresses the CPU and RAM.
  
Blend (option 3) is a hybrid mode between the a CPU and RAM stress.
+
Errors will be reported should they occur both to stdout and to {{ic|~/results.txt}} for review laterMany do not consider a system as 'stable' unless it can run the Large FFTs for a 24 hour period.
{{tip|If you enter modes 11, 12, or 13 you can further customize the first three optionsFor example, if you wish to use a maximal amount of RAM in the tests, select option 13 and manually enter 95 % of your memory as the amount to use.}}
+
  
Errors will be reported should they occurMany do not consider a system as 'stable' unless it can run the Large FFTs for a 24 h period.
+
Example {{ic|~/results.txt}}; note that the two runs from 26-June indicate a hardware failure.  In this case, due to insufficient vcore to the CPU:
 +
<pre>[Sun Jun 26 20:10:35 2011]
 +
FATAL ERROR: Rounding was 0.5, expected less than 0.4
 +
Hardware failure detected, consult stress.txt file.
 +
FATAL ERROR: Rounding was 0.5, expected less than 0.4
 +
Hardware failure detected, consult stress.txt file.
 +
[Sat Aug 20 10:50:45 2011]
 +
Self-test 480K passed!
 +
Self-test 480K passed!
 +
[Sat Aug 20 11:06:02 2011]
 +
Self-test 128K passed!
 +
Self-test 128K passed!
 +
[Sat Aug 20 11:22:10 2011]
 +
Self-test 560K passed!
 +
Self-test 560K passed!
 +
...</pre>
 +
 
 +
{{Note|Users suspecting bad memory or memory controllers should try the blend test first as the small FFT test uses very little memory.}}
 +
 
 +
=== Linpack ===
 +
Linpack makes use of the BLAS (Basic Linear Algebra Subprograms) libraries for performing basic vector and matrix operations. and is an excellent way to stress CPUs for stability{{AUR|linpack}} is available from the AUR.  After installation, users should adjust {{ic|/etc/linpack.conf}} according to the amount of memory on the target system.
 +
 
 +
=== Systester (SuperPi for Windows) ===
 +
{{AUR|Systester}} is available in the AUR in both cli and gui version.  It tests system stability by calculating up to 128 millions of Pi digits and includes error checking.  Note that one can select from two different calculation algorithms:  Quadratic Convergence of Borwein and Gauss-Legendre.  The latter being the same method that the popular SuperPi for Windows uses.
 +
 
 +
A cli example using 8 threads is given:
 +
$ systester-cli -gausslg 64M -threads 8
 +
 
 +
== Stressing Memory ==
 +
A very good program for stress testing memory is [[http://www.memtest.org/ Memtest86+]].  It is based on the well-known original memtest86 written by Chris Brady.  Memtest86+ is, like the original, released under the terms of the GNU General Public License (GPL). No restrictions for use, private or commercial exist other than the ones mentioned in the GNU GPL.
 +
 
 +
=== Running Memtest86+ ===
 +
Either download and burn the ISO to a CD and boot from it, or install {{Pkg|memtest86+}} from [extra] and update GRUB which will auto-detect the package and allow users to boot directly to it.
 +
 
 +
{{tip|Allowing Memtest86+ to run for >10 cycles without errors is usually sufficient.}}

Revision as of 10:47, 9 June 2013

Introduction

Running an overclocked PC is fine provided that the PC is stable at the overclock settings. Several programs are available to assess system stability, and thereby the overclock level, through stress testing. The steps of overclocking a PC are beyond the scope of this article, but graysky has written a fairly comprehensive guide on the topic: [Overclocking guide].

Note: The linked guide is a bit dated. More contemporary guides are recommended for modern hardware.

Discovering Errors

Some stressing applications like mprime and linpack (see below) have built-in consistency checks that discover errors through non-matching results. A more general and simple method for detecting hardware instabilities is provided by the kernel itself. To use it, watch the output of the kernel ring buffer with this command:

# cat /proc/kmsg

The key error to watch for looks like this:

[Hardware Error]: Machine check events logged

The kernel can throw these errors during an mprime run before mprime itself finishes the calculation and reports the error, which makes this a very sensitive method of assessing stability.
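As a minimal sketch of checking for these messages after a test run, the function below counts machine-check lines in kernel log text supplied on stdin. The name count_mce is a hypothetical helper, not part of any package, and it assumes the log lines contain the literal "[Hardware Error]" tag shown above.

```shell
# Count machine-check lines in kernel log text read from stdin.
# count_mce is a hypothetical helper name used only for illustration.
count_mce() {
    # -F treats the pattern as a fixed string, so the brackets are literal;
    # -c prints the number of matching lines.
    # grep exits non-zero when nothing matches, so '|| true' keeps the
    # function's exit status at 0 while the printed count stays "0".
    grep -cF '[Hardware Error]' || true
}

# Example (reading the kernel log may require root on some systems):
#   dmesg | count_mce
```

A count above zero after a stress run suggests the hardware reported machine-check events during the test.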

CPU Stressing Programs

These are listed in two categories: 'high voltage demand' and 'medium voltage demand'. It is important to use some from each category to evaluate system stability. Counterintuitively, machines can be more sensitive to programs from the 'medium' category than to those from the 'high' category. 'High voltage demand' programs draw the most vcore when run due to intense hardware usage. 'Medium voltage demand' programs do not always call for the highest vcore and as such can be more prone to throwing errors on systems that are undervolted relative to the requested clock speed.

Example on an overclocked i7-3770K (4.50 GHz); vcore is +0.020 V in offset mode with all powersaving features enabled.

Idle: 0.7440 V - 0.8320 V (varies).
Mprime small FFTs: 1.2880 V (steady).
Mprime large FFTs: 1.3040 V (steady).
Mprime blend: 1.2960 V (steady).
Linpack: 1.2320 V - 1.2720 V (varies).
x264 encoding: 1.2320 V - 1.2720 V (varies).
gcc compiling: 1.2720 V (steady).

This machine, running with a vcore offset of +0.005 V, remains stable in both mprime and linpack for hours, but throws errors under both x264 and gcc after only several minutes.

Voltage Demand | Program | Description
Medium | Cc/Gcc | Compiling with cc/gcc is a great method of stress testing. Both are available in the base-devel group.
Medium | HandBrake-cli | handbrake-cli can be used to encode using high quality settings.
Medium | Systester | systester (AUR) is multithreaded software capable of calculating pi to 128,000,000 decimal places. It has a built-in check for system stability.
High | mprime | mprime-bin (AUR) factors large numbers and is an excellent way to stress CPU and memory.
High | linpack | linpack (AUR) makes use of the BLAS (Basic Linear Algebra Subprograms) libraries for performing basic vector and matrix operations and is an excellent way to stress CPUs for stability.

Memory Stressing Programs

Memtest86+

Memtest86+ is a standard memory testing utility and is packaged in [extra].

Stressing CPU and Memory

Mprime (Prime95 for Windows and MacOS)

Prime95 is widely recognized as a de facto measure of system stability. In torture test mode, mprime will perform a series of very CPU-intensive calculations and compare the values it gets to known good values.

Prime95 for Linux is called mprime and is available in the AUR.

Warning: Before proceeding, it is HIGHLY recommended that users have some means to monitor the CPU temperature. Packages such as Lm_sensors can do this.

To run mprime, open a shell and type "mprime":

$ mprime
Note: When a CPU frequency scaler such as cpufrequtils or powernowd is in use, users sometimes need to manually set the processor to run at its highest multiplier, because mprime runs at a nice value that does not always trigger the step-up in multiplier.
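Before starting a run it can help to see which frequency governor each core is using. The following is a minimal sketch, assuming the kernel exposes the cpufreq interface under /sys/devices/system/cpu; show_governors is a hypothetical helper name, and the base directory is a parameter only so the function can be pointed at another tree.

```shell
# Print the cpufreq governor of every CPU found under a sysfs-style tree.
# show_governors is a hypothetical helper; the default path assumes the
# kernel's cpufreq sysfs interface is available.
show_governors() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue     # skip CPUs without a readable governor file
        cpu=${f#"$base"/}           # strip the base directory prefix
        cpu=${cpu%%/*}              # keep only the cpuN path component
        echo "$cpu: $(cat "$f")"
    done
}

# Example:
#   show_governors
```

If the cores report an on-demand style governor and the clock does not step up under mprime, that is the situation the note above describes.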

When the software loads, answer 'N' to the first question to begin the torture test. The main menu looks like this:

Main Menu

1.  Test/Primenet
2.  Test/Worker threads
3.  Test/Status
4.  Test/Continue
5.  Test/Exit
6.  Advanced/Test
7.  Advanced/Time
8.  Advanced/P-1
9.  Advanced/ECM
10.  Advanced/Manual Communication
11.  Advanced/Unreserve Exponent
12.  Advanced/Quit Gimps
13.  Options/CPU
14.  Options/Preferences
15.  Options/Torture Test
16.  Options/Benchmark
17.  Help/About
18.  Help/About PrimeNet Server

There are several options for the torture test (menu option 15).

  • Small FFTs (option 1) to stress the CPU
  • In-place large FFTs (option 2) to test the CPU and memory controller
  • Blend (option 3) is the default and constitutes a hybrid mode which stresses the CPU and RAM.

Should errors occur, they are reported both to stdout and to ~/results.txt for later review. Many do not consider a system 'stable' unless it can run the Large FFTs for a 24-hour period.

Example ~/results.txt; note that the two failures logged on 26 June indicate a hardware failure, in this case due to insufficient vcore to the CPU:

[Sun Jun 26 20:10:35 2011]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected, consult stress.txt file.
[Sat Aug 20 10:50:45 2011]
Self-test 480K passed!
Self-test 480K passed!
[Sat Aug 20 11:06:02 2011]
Self-test 128K passed!
Self-test 128K passed!
[Sat Aug 20 11:22:10 2011]
Self-test 560K passed!
Self-test 560K passed!
...
Note: Users suspecting bad memory or memory controllers should try the blend test first as the small FFT test uses very little memory.
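After a long unattended run, the results file can be scanned for the failure messages shown in the example output. This is a minimal sketch, not part of mprime: check_results is a hypothetical helper, and it assumes failures are logged with the "FATAL ERROR" or "Hardware failure detected" wording seen above.

```shell
# Report whether an mprime results file contains failure lines.
# check_results is a hypothetical helper; the path defaults to ~/results.txt.
check_results() {
    file="${1:-$HOME/results.txt}"
    # Refuse to report success when the file cannot be read at all
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # Match the two failure messages from the example results file
    if grep -qE 'FATAL ERROR|Hardware failure detected' "$file"; then
        echo "errors found"
        return 1
    fi
    echo "no errors logged"
}

# Example:
#   check_results            # checks ~/results.txt
#   check_results /tmp/r.txt # checks another file
```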

Linpack

Linpack makes use of the BLAS (Basic Linear Algebra Subprograms) libraries for performing basic vector and matrix operations and is an excellent way to stress CPUs for stability. linpack is available from the AUR. After installation, users should adjust /etc/linpack.conf according to the amount of memory on the target system.

Systester (SuperPi for Windows)

Systester is available in the AUR in both CLI and GUI versions. It tests system stability by calculating up to 128 million digits of pi and includes error checking. Two calculation algorithms can be selected: Quadratic Convergence of Borwein and Gauss-Legendre. The latter is the same method that the popular SuperPi for Windows uses.

A cli example using 8 threads is given:

$ systester-cli -gausslg 64M -threads 8

Stressing Memory

A very good program for stress testing memory is [Memtest86+]. It is based on the well-known original memtest86 written by Chris Brady. Like the original, Memtest86+ is released under the terms of the GNU General Public License (GPL). No restrictions on use, private or commercial, exist other than those stated in the GNU GPL.

Running Memtest86+

Either download the ISO, burn it to a CD and boot from it, or install memtest86+ from [extra] and update GRUB, which will auto-detect the package and allow users to boot directly into it.

Tip: Allowing Memtest86+ to run for >10 cycles without errors is usually sufficient.