Difference between revisions of "Machine-check exception"
Revision as of 04:01, 16 November 2013
This article aims to help users implement services to actively monitor, log, and report hardware errors. A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.
Introduction
Machine check exceptions (MCEs) can occur for a variety of reasons, including undesired or out-of-spec voltages from the power supply, cosmic radiation flipping bits in memory DIMMs or the CPU, and other miscellaneous faults, such as faulty software triggering hardware errors.
Installing mcelog
The mcelog daemon, written by Andi Kleen, is one of the tools that can be used to gather MCE information.
Install the mcelog package from the official repositories.
Configuring mcelog
mcelog's configuration file is located at /etc/mcelog/mcelog.conf.
Running mcelog as a daemon
Upstream recommends always running mcelog as a daemon, so edit /etc/mcelog/mcelog.conf and set daemon = yes.
Finally, start the mcelog service and enable it to start automatically on boot:
# systemctl start mcelog
# systemctl enable mcelog
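To confirm that the service is running and set to start on boot, something along these lines may help. It is a minimal sketch: report_unit is a hypothetical helper name, and the unit is assumed to be called mcelog, as in the commands above.

```shell
# Hypothetical helper (not part of mcelog or systemd): print whether a unit
# is running now and whether it is set to start on boot.
report_unit() {
    unit=$1
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found; is this a systemd system?"
        return 1
    fi
    # is-active and is-enabled each print a one-word state,
    # such as "active" or "enabled".
    echo "$unit active:  $(systemctl is-active "$unit" 2>/dev/null)"
    echo "$unit enabled: $(systemctl is-enabled "$unit" 2>/dev/null)"
}

# Assumes the unit name used above.
report_unit mcelog || true
```

If the first line does not report an active state, systemctl status mcelog shows the most recent log output from the unit.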
Additional configuration options
The following option, which makes mcelog log errors to syslog, is probably worth enabling:
syslog = yes
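After editing the file, a quick check along these lines can confirm that both settings are active rather than commented out. This is a rough sketch: conf_enabled is a hypothetical helper, and it matches lines anywhere in the file without regard to section headers, which is a simplification.

```shell
# Hypothetical helper: succeed if KEY is set to "yes" in FILE.
# Commented-out lines (starting with "#") do not match.
# Usage: conf_enabled KEY FILE
conf_enabled() {
    grep -Eqs "^[[:space:]]*$1[[:space:]]*=[[:space:]]*yes[[:space:]]*$" "$2"
}

conf=/etc/mcelog/mcelog.conf
for key in daemon syslog; do
    if conf_enabled "$key" "$conf"; then
        echo "$key: enabled"
    else
        echo "$key: not enabled"
    fi
done
```

A missing or unreadable file is reported the same way as a disabled option, so check the path first if both keys show as not enabled.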
Hardware documentation from CPU manufacturers
- AMD64 Architecture Programmer's Manual, Volume 2: System Programming
- BIOS and Kernel Developer's Guide for AMD Athlon™ 64 and AMD Opteron™ Processors