Machine Check Error (MCE) decoding example

The PSOD and the vmkernel-1.log show the following:
28:06:50:01.381 cpu0:1283)ALERT: MCE: 579: Machine Check Exception
28:06:50:01.381 cpu0:1283)ALERT: MCE: 169: Machine Check Exception: General Status 0000000000000004
28:06:50:01.381 cpu0:1283)ALERT: MCE: 193: Machine Check Exception: Bank 0, Status b66d400000000135
28:06:50:01.381 cpu0:1283)ALERT: MCE: 226: Machine Check Exception: Bank 0, Addr 00000000c60e2be0, Valid TRUE
This occurred on CPU 0, and both the General Status register (MCG_STATUS) and the Bank 0 Status register (MC0_STATUS) are populated.
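If you have many such logs to go through, the register values can be pulled out of the lines with a few regular expressions. This is only a rough sketch: the function name parse_mce and the result layout are made up for illustration, and the patterns assume the exact line format shown in the log above.

```python
import re

# Sample lines copied from the vmkernel log above.
LOG = """\
28:06:50:01.381 cpu0:1283)ALERT: MCE: 169: Machine Check Exception: General Status 0000000000000004
28:06:50:01.381 cpu0:1283)ALERT: MCE: 193: Machine Check Exception: Bank 0, Status b66d400000000135
28:06:50:01.381 cpu0:1283)ALERT: MCE: 226: Machine Check Exception: Bank 0, Addr 00000000c60e2be0, Valid TRUE
"""

def parse_mce(log_text):
    """Hypothetical helper: collect the CPU number and register values."""
    result = {"cpu": None, "general_status": None, "banks": {}}
    for line in log_text.splitlines():
        # The reporting CPU appears as "cpuN:" near the start of each line.
        cpu = re.search(r"\bcpu(\d+):", line)
        if cpu and result["cpu"] is None:
            result["cpu"] = int(cpu.group(1))
        # MCG_STATUS is logged as "General Status <16 hex digits>".
        m = re.search(r"General Status ([0-9a-fA-F]{16})", line)
        if m:
            result["general_status"] = int(m.group(1), 16)
        # Per-bank registers are logged as "Bank N, Status ..." or "Bank N, Addr ...".
        m = re.search(r"Bank (\d+), (Status|Addr) ([0-9a-fA-F]{16})", line)
        if m:
            bank = result["banks"].setdefault(int(m.group(1)), {})
            bank[m.group(2).lower()] = int(m.group(3), 16)
    return result

parsed = parse_mce(LOG)
```

Here parsed["banks"][0]["status"] holds the Bank 0 Status value as an integer, ready for the bit decoding that follows.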
The General Status value 4 is 0100 in binary, so only bit 2 is set: MCIP, machine check in progress.
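The same check can be done in a couple of lines of Python. This is a minimal sketch; the table name and the function decode_mcg_status are my own, and only the three low bits of MCG_STATUS (RIPV, EIPV, MCIP) are covered.

```python
# Low bits of MCG_STATUS; any other bits are ignored in this sketch.
MCG_STATUS_BITS = {
    0: "RIPV (restart IP valid)",
    1: "EIPV (error IP valid)",
    2: "MCIP (machine check in progress)",
}

def decode_mcg_status(value):
    """Return the names of the bits that are set in the given value."""
    return [name for bit, name in MCG_STATUS_BITS.items() if (value >> bit) & 1]

general_status = 0x0000000000000004  # value from the log
set_bits = decode_mcg_status(general_status)
```

For the value 4 the list contains only the MCIP entry.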
b66d400000000135 in binary, grouped by hex digit:
1011 0110 0110 1101 0100 0000 0000 0000 0000 0000 0000 0000 0000 0001 0011 0101
The 7 most significant bits (63 down to 57) are 1011 011:
Bit 63 = 1 – MC0_STATUS register contents are valid
Bit 62 = 0 – An overflow did not occur
Bit 61 = 1 – Error was not corrected
Bit 60 = 1 – Error checking was enabled
Bit 59 = 0 – Contents of the MC0_MISC register are invalid
Bit 58 = 1 – Contents of the MC0_ADDR register are valid
Bit 57 = 1 – Processor context is corrupt – register values are unreliable
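The bit-by-bit reading above can be reproduced with shifts and masks. A small sketch follows; the short flag names (VAL, OVER, UC, EN, MISCV, ADDRV, PCC) are the usual mnemonics for these bits, while the FLAGS table and the function name are just for illustration.

```python
STATUS = 0xb66d400000000135  # Bank 0 Status from the log

# (bit position, mnemonic) for the seven most significant bits.
FLAGS = [
    (63, "VAL"),    # register contents valid
    (62, "OVER"),   # overflow occurred
    (61, "UC"),     # error uncorrected
    (60, "EN"),     # error checking enabled
    (59, "MISCV"),  # MISC register contents valid
    (58, "ADDRV"),  # ADDR register contents valid
    (57, "PCC"),    # processor context corrupt
]

def decode_status_flags(status):
    """Map each flag mnemonic to True or False for the given status value."""
    return {name: bool((status >> bit) & 1) for bit, name in FLAGS}

flags = decode_status_flags(STATUS)
top7 = format(STATUS >> 57, "07b")  # bits 63..57 as a binary string
```

For this status value, top7 is the same 1011011 pattern read off by hand, and flags["MISCV"] is the only entry besides OVER that comes out False.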
MCA Error code: Bits 0-15: 0000 0001 0011 0101
This matches the pattern 0000 0001 RRRR TTLL, whose mnemonic is {TT}CACHE{LL}_{RRRR}_ERR.
Therefore it’s a Memory Hierarchy Error.
RRRR = 0011 – Operation was a data read
TT = 01 – Data
LL = 01 – Level 1
So the problem is a data read error in the Level 1 data cache of CPU 0.
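The error code decoding can also be sketched in Python. The lookup tables use the standard RRRR, TT and LL mnemonics; the function name decode_memory_error and the returned dict are assumptions for this example, and the check only recognises the 0000 0001 RRRR TTLL pattern described above.

```python
# Memory transaction type (RRRR), transaction type (TT), cache level (LL).
RRRR = {
    0b0000: "GEN", 0b0001: "RD", 0b0010: "WR", 0b0011: "DRD",
    0b0100: "DWR", 0b0101: "IRD", 0b0110: "PREFETCH",
    0b0111: "EVICT", 0b1000: "SNOOP",
}
TT = {0b00: "I", 0b01: "D", 0b10: "G"}  # instruction, data, generic
LL = {0b00: "L0", 0b01: "L1", 0b10: "L2", 0b11: "LG"}

def decode_memory_error(status):
    """Decode bits 0-15 if they match 0000 0001 RRRR TTLL, else return None."""
    code = status & 0xFFFF
    if code >> 8 != 0x01:
        return None  # not a memory hierarchy error in this sketch
    rrrr = RRRR.get((code >> 4) & 0xF, "RESERVED")
    tt = TT.get((code >> 2) & 0x3, "RESERVED")
    ll = LL[code & 0x3]
    return {
        "code": code,
        "rrrr": rrrr,
        "tt": tt,
        "ll": ll,
        "mnemonic": f"{tt}CACHE{ll}_{rrrr}_ERR",
    }

decoded = decode_memory_error(0xb66d400000000135)
```

For the status value from the log, decoded["mnemonic"] comes out as DCACHEL1_DRD_ERR, matching the data read / data / Level 1 reading above.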
The AMD MCAT tool confirms this:
C:\Program Files\AMD\MCAT>mcat /cmd 0 0xb66d400000000135 0x00000000c60e2be0 0 /ghx4
Processor Number : Unknown
Bank Number : 0
Time Stamp (0x): 00000000 00000000
Error Status (0x): B66D4000 00000135
Error Address (0x): 00000000 C60E2BE0
Error Misc (0x): 00000000 00000000
Status Bit Decode :
Correctable ECC error
Processor context corrupt
Error address valid
Error enable
Error uncorrected
Error valid
Error Code (0x): 0135
Error Type – Memory
Memory Transaction Type (RRRR) – Data read (DRD)
Transaction Type (TT) – Data
Cache Level (LL) – Level 1 (L1)
Bank 0 Data Cache Errors:
Data Load – A data error occured while accessing or managing data.
