Difference between revisions of "Machine-check exception"

From ArchWiki
(Created the initial page dedicated for information on handling MCEs)
 
m (don't include parent category, see Help:Style#Categories)
(24 intermediate revisions by 7 users not shown)
Line 1: Line 1:
[[Category:Hardware (English)]]
+
[[Category:CPU]]
[[Category:CPU (English)]]
+
[[Category:Kernel]]
[[Category:Kernel (English)]]
+
{{Out of date|mentions [[rc.conf]]}}
[[Category:Daemons and system services (English)]]
+
 
+
 
This article aims to help users implement services to actively monitor, log, and report hardware errors. A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.
 
  
 
==Introduction==
 
Machine check exceptions (MCEs) can occur for a variety of reasons ranging from undesired or out-of-spec voltages from the power supply, from cosmic radiation flipping bits in memory DIMMs, or from other miscellaneous faults, including faulty software triggering hardware errors.
+
Machine check exceptions (MCEs) can occur for a variety of reasons ranging from undesired or out-of-spec voltages from the power supply, from cosmic radiation flipping bits in memory DIMMs or the CPU, or from other miscellaneous faults, including faulty software triggering hardware errors.
  
 
==Installing mcelog==
 
The [http://www.mcelog.org/ mcelog] daemon written by Andi Kleen is one of the methods in which one can handle MCEs. The {{Package Official|mcelog}} daemon can be found in the {{Codeline|[community]}} repository and be installed with [[pacman]].
+
The [http://www.mcelog.org/ mcelog] daemon written by Andi Kleen is one of the tools one can use to gather MCE information.
pacman -S mcelog
+
 
 +
[[pacman|Install]] the {{Pkg|mcelog}} package from the [[Official Repositories|official repositories]].
  
 
==Configuring mcelog==
 
pass
+
mcelog's configuration file is located at {{ic|/etc/mcelog/mcelog.conf}}.
 +
 
 +
===Running mcelog as a daemon===
 +
It is recommended by upstream to always run mcelog as a daemon, so edit {{ic|/etc/mcelog/mcelog.conf}} and set {{ic|daemon <nowiki>=</nowiki> yes}}.
 +
 
 +
Finally, use the {{ic|/etc/rc.d/mcelog}} script to start mcelog at boot via {{ic|/etc/[[rc.conf]]}}.
 +
 
 +
{{Note|If running mcelog via the {{ic|rc.d}} command or via {{ic|/etc/[[rc.conf]]}}, it is unnecessary to set {{ic|daemon <nowiki>=</nowiki> yes}} in {{ic|/etc/mcelog/mcelog.conf}} because {{ic|/etc/rc.d/mcelog}} starts mcelog in daemon mode by default.}}
 +
 
 +
===Additional configuration options===
 +
The following option is probably recommended:
 +
syslog = yes
  
 
==Hardware documentation from CPU manufacturers==
 
* [http://support.amd.com/us/Processor_TechDocs/24593.pdf AMD64 Architecture Programmer's Manual, Volume 2: System Programming]
+
* [http://support.amd.com/us/Processor_TechDocs/APM_v2_24593.pdf AMD64 Architecture Programmer's Manual, Volume 2: System Programming]
* [http://support.amd.com/us/Processor_TechDocs/26094.pdf BIOS and Kernel Developer's Guide for AMD Athlon™ 64 and AMD Opteron™ Processors]
+
* [http://support.amd.com/us/Processor_TechDocs/26094.PDF BIOS and Kernel Developer's Guide for AMD Athlon™ 64 and AMD Opteron™ Processors]
  
 
==See Also==
 
 
* [http://en.wikipedia.org/wiki/Machine_Check_Exception Wikipedia's article on machine check exceptions]
 
 
* [http://en.wikipedia.org/wiki/Machine_check_architecture Wikipedia's article on the machine check architecture]
 
* [http://www.mcelog.org/ mcelog daemon by Andi Kleen; mirrored on GitHub]
+
* [http://www.mcelog.org/ mcelog daemon page by Andi Kleen]
 +
* [http://www.mcelog.org/references.html References page from mcelog site]

Revision as of 15:20, 7 August 2013

This article or section is out of date.

Reason: mentions rc.conf

This article aims to help users implement services to actively monitor, log, and report hardware errors. A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.

Introduction

Machine check exceptions (MCEs) can occur for a variety of reasons: undesired or out-of-spec voltages from the power supply, cosmic radiation flipping bits in memory DIMMs or the CPU, or other miscellaneous faults, including faulty software triggering hardware errors.

Installing mcelog

The mcelog daemon, written by Andi Kleen, is one of the tools available for gathering MCE information.

Install the mcelog package from the official repositories.

Configuring mcelog

mcelog's configuration file is located at /etc/mcelog/mcelog.conf.

Running mcelog as a daemon

Upstream recommends always running mcelog as a daemon. To do so, edit /etc/mcelog/mcelog.conf and set daemon = yes.

Finally, to start mcelog at boot, add mcelog to the DAEMONS array in /etc/rc.conf so that the /etc/rc.d/mcelog script runs at startup.

Note: If mcelog is started with the rc.d command or via /etc/rc.conf, setting daemon = yes in /etc/mcelog/mcelog.conf is unnecessary, because /etc/rc.d/mcelog starts mcelog in daemon mode by default.
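
As a sketch of what starting the script via /etc/rc.conf looks like under the legacy initscripts setup this section describes: daemons are listed in the DAEMONS array, and each name corresponds to a script in /etc/rc.d. The other daemon names below are placeholders (assumptions), not a recommended set.

```shell
# /etc/rc.conf (fragment; sourced as shell by the legacy initscripts)
# syslog-ng, network and crond are placeholder entries: keep whatever
# the array already contains and append mcelog.
DAEMONS=(syslog-ng network crond mcelog)
```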

Additional configuration options

The following option, which sends mcelog's decoded output to syslog, is probably worth enabling:

syslog = yes
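
To check which of these options are active after editing, one can filter the configuration file for uncommented lines. The following is a minimal sketch: the function name show_mcelog_opts is made up for illustration, and it assumes options are written as key = value lines with # starting a comment.

```shell
# Print the active (uncommented) daemon and syslog settings from an
# mcelog configuration file. The function name is hypothetical; the
# path defaults to the location named in this article.
show_mcelog_opts() {
    grep -E '^[[:space:]]*(daemon|syslog)[[:space:]]*=' "${1:-/etc/mcelog/mcelog.conf}"
}
```

Calling show_mcelog_opts with no argument reads /etc/mcelog/mcelog.conf; pass another path to inspect a different file. If neither option is set, nothing is printed.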

Hardware documentation from CPU manufacturers

* AMD64 Architecture Programmer's Manual, Volume 2: System Programming
* BIOS and Kernel Developer's Guide for AMD Athlon™ 64 and AMD Opteron™ Processors

See Also

* Wikipedia's article on machine check exceptions
* Wikipedia's article on the machine check architecture
* mcelog daemon page by Andi Kleen
* References page from mcelog site