Machine-check exception

From ArchWiki
Revision as of 19:48, 4 October 2011 by Jstjohn (Talk | contribs) (Configuring mcelog: added Note box about rc.d stuff; removed info about having to manually add files and set permissions, etc. (fixed in 1.0pre3-3); remove socket-path config option because that is default)


This article aims to help users implement services to actively monitor, log, and report hardware errors. A machine check exception (MCE) is an error generated by the CPU when the CPU detects that a hardware error or failure has occurred.

Introduction

Machine check exceptions (MCEs) can occur for a variety of reasons, including out-of-spec voltages from the power supply, cosmic radiation flipping bits in memory DIMMs, and other miscellaneous faults, such as faulty software triggering hardware errors.

Installing mcelog

The mcelog daemon, written by Andi Kleen, is one way to handle MCEs. The mcelog package is available in the official repositories and can be installed with pacman.

pacman -S mcelog
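
After installation, it can be useful to confirm that the pieces mcelog relies on are present. The following is a minimal sketch, not part of the package: the function name check_mce_support is hypothetical, and it assumes the kernel exposes MCE records through the /dev/mcelog device.

```shell
# Report whether the mcelog binary and the kernel's MCE device are present.
# Assumption: the running kernel provides /dev/mcelog.
check_mce_support() {
    if command -v mcelog >/dev/null 2>&1; then
        echo "mcelog binary: found"
    else
        echo "mcelog binary: not found"
    fi

    if [ -e /dev/mcelog ]; then
        echo "/dev/mcelog: present"
    else
        echo "/dev/mcelog: missing"
    fi
}

check_mce_support
```

If the device is reported as missing, the kernel may have been built without machine check support, in which case mcelog has nothing to read.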

Configuring mcelog

mcelog's configuration file is located at Template:Filename.

Running mcelog as a daemon

Upstream recommends always running mcelog as a daemon. To do so, edit Template:Filename and set Template:Codeline.

Finally, Template:Codeline needs to be added to the Template:Codeline array in Template:Filename.

Note: If running mcelog via the Template:Codeline command or the Template:Codeline array in Template:Filename, it is unnecessary to set Template:Codeline in Template:Filename because Template:Filename starts mcelog in daemon mode by default.
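
Once the daemon is running, it can be asked for the errors it has accumulated so far. The sketch below is one possible way to do that; the function name query_mcelog is hypothetical, it assumes pgrep is available, and it will most likely need to be run as root so that mcelog can reach the daemon's client socket.

```shell
# Ask a running mcelog daemon for the errors it has accumulated.
# Prints a short message instead if no mcelog process is found.
query_mcelog() {
    if pgrep -x mcelog >/dev/null 2>&1; then
        # --client queries the running daemon rather than reading /dev/mcelog directly
        mcelog --client
    else
        echo "mcelog daemon does not appear to be running"
    fi
}

query_mcelog
```

An empty response from a running daemon simply means that no errors have been recorded yet.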

Additional configuration options

The following option, which makes mcelog log errors to syslog, is recommended:

syslog = yes
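
To check which value a given option currently has, the configuration file can be inspected from the shell. The following is a rough sketch rather than an mcelog feature: conf_get is a hypothetical helper, the sample file is created only for illustration, and the helper ignores section headers, so a key repeated inside a section would also match.

```shell
# Print the value assigned to KEY in a "key = value" style file.
# Commented-out lines (starting with '#') never match; the last match wins.
conf_get() {
    key="$1"
    file="$2"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Illustration with a throwaway sample file (not the real configuration).
sample="$(mktemp)"
printf '# sample settings\n#syslog = no\nsyslog = yes\n' > "$sample"
conf_get syslog "$sample"   # prints: yes
rm -f "$sample"
```

To use it on a real system, pass the path of mcelog's configuration file as the second argument.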

Hardware documentation from CPU manufacturers

See Also