Machine-check exception
[[Category:Daemons and system services]]
Revision as of 14:10, 23 April 2012
This article aims to help users implement services to actively monitor, log, and report hardware errors. A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.
MCEs can occur for a variety of reasons, including out-of-spec voltages from the power supply, cosmic radiation flipping bits in memory DIMMs or the CPU, and other miscellaneous faults, such as faulty software triggering hardware errors.
The mcelog daemon, written by Andi Kleen, is one of the tools that can be used to gather MCE information.
mcelog's configuration file is located at
Running mcelog as a daemon
Upstream recommends always running mcelog as a daemon, so edit /etc/mcelog/mcelog.conf and set daemon = yes.
Finally, use the
/etc/rc.d/mcelog script to start mcelog at boot via
Additional configuration options
The following option is also recommended, so that decoded machine check errors are written to syslog:
syslog = yes
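Taken together, the two settings discussed in this article might look roughly like this in /etc/mcelog/mcelog.conf. This is a minimal sketch, assuming the stock configuration layout in which global options sit at the top of the file, before any bracketed section. The comments are illustrative and are not copied from the upstream file.

```ini
# /etc/mcelog/mcelog.conf (fragment)

# Run mcelog in the background as a daemon, as upstream recommends.
daemon = yes

# Log decoded machine check errors to syslog.
syslog = yes
```

Any other options already present in the file can stay as shipped. Only these two lines need to be set or uncommented.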
Hardware documentation from CPU manufacturers
- AMD64 Architecture Programmer's Manual, Volume 2: System Programming
- BIOS and Kernel Developer's Guide for AMD Athlon™ 64 and AMD Opteron™ Processors