Error Handling
Linux on Power Architecture Reference
System Software Work Group <syssw-chair@openpowerfoundation.org>
OpenPOWER Foundation
2016, 2018, 2020 OpenPOWER Foundation
Revision 0.5_pre5
OpenPOWER

The purpose of this document is to provide firmware and software architectural details associated with Error Recovery and Logging on OpenPOWER Systems.

The base content for this document was contributed to the OpenPOWER Foundation in the IBM Linux on Power Architecture Platform Reference (LoPAPR) draft document, which detailed Linux running on PowerVM. While this information is not always immediately applicable to the new OpenPOWER modes of bare metal or KVM, many of the concepts and interfaces remain in some form. Until the document addresses these new OpenPOWER modes and components, its version will remain below 1.0. Note also that the original document had numerous contributors inside IBM.

This document is a Standard Track, Work Group Specification work product owned by the System Software Work Group and handled in compliance with the requirements outlined in the OpenPOWER Foundation Work Group (WG) Process document. It was created using the Master Template Guide version 0.9.5. Comments, questions, etc. can be submitted to the public mailing list for this document at TBD.
2020-04-06 Revision 0.5_pre5 - Updates to include latest PAPR ACRs (2.9) as follows:
  Add H_VIOCTL subfunctions for VNIC failover support
  Add H_VIOCTL subfunction for virtual ethernet MAC scan functionality
  Add H_VIOCTL subfunctions for virtual SCSI and FC mobility preparation functionality
  ibm,current-associativity-domain property
  HPT resizing option - KVM only
  Add Coherent Platform Facilities (CAPI)
  XIVE Exploitation
  Add 'OCC online/offline' events to 'IE' error log subsection
  LPM Redundancy Phase II: Redundancy
  Add optional sub-queue support to VFC on P9 and newer
  Increase max num-entries for H_SEND_SUB_CRQ_INDIRECT to 128
  Add Virtual Serial Multiplex adapter interfaces
  Maximum size of Dispatch Trace Log Buffer
  Eliminate requirement for clearing TCP checksum field for ILLAN checksum calculation
  Continued Extension of H_Send_Logical_LAN for large send packets
  Add LPM Capability keyword to RTAS AIX Support system parameter
  XIVE Exploitation addition: Add ESB Reset Status to RTAS ibm,read-slot-reset-state2
  Add NVDIMM Protection and Encryption State system parameters
  Change or Remove 0x9 and 0xA event subtypes for 'IE' error log subsection
Additional, post PAPR 2.9 ACRs as follows:
  Reserve a range of hcalls to support Ultravisor
  Add New CAS Bit For SRIOV Virtual Function (VF) Dynamic DMA Window (DDW) Support
  Updates to support vTPM 2.0
  Update XIVE Legacy hcalls to add H_Function
  Add NVDIMM Secure Erase Command system parameter
  Update H_REGISTER_VPA to add H_STATE return code for VPA and SLB shadow buffer
  Extend Firmware Assisted Dump for ISA Version 3.0
  Add a new return code, H_NOT_AVAILABLE, to start-cpu RTAS call
  Document already-implemented NVRAM variables
  Update ibm,dynamic-memory-vN flags to include a "Hotplugged Memory" flag
2019-01-08 Revision 0.5_pre4 - Update document type to Work Group Note. Final review ready.
2018-07-30 Revision 0.5_pre3 - Updates to documentation in preparation for System SW WG review:
  Reset document version to 0.5
  Improved Abstract
2017-10-11 Revision 2.0_pre2 - Updates to include latest PAPR ACRs (2.8) as follows:
  ISA 2.07 privileged doorbell extensions (9/16/2012)
  POWER ISA Name Change Category Vector.XOR to Vector.CRYPTO (11/4/2012)
  Enable Multiple Redirected RDMA mappings per page (3/5/2013)
  Add Block Invalidate Option (3/5/2013)
  Implementation Dependent Optimizations (3/13/2013)
  System Firmware Service Entitlement Date (Warranty Date) Check (4/3/2013)
  New Function for ibm,change-msi to specify 32-bit MSI (5/14/2013)
  Remove Client-Architecture-Support bit for UUID option (4/16/2013)
  Add Client Architecture Support bit for RTAS ibm,change-msi (5/28/2013)
  Add VNIC Server (5/24/2014)
  VPA changes for P8 (EBB) (5/24/2013)
  Add an hcall to clean up the entire MMU hashtable (11/20/2013)
  Add LPCR[ILE] support to H_SET_MODE (5/31/2013)
  New Root Node Properties (1/12/2016)
  Extended Firmware Assisted Dump for P8 Registers (1/24/2014)
  Sufficient H_COP_OP output buffer (6/21/2014)
  Extend H_SEND_LOGICAL_LAN for large send packets (6/29/2014)
  Extend H_GET_MPP_X reporting coalesced pages (8/24/2014)
  Update ibm,pcie-link-speed-stats property to support PCIe 3.0 link speeds (6/12/2015)
  Extend ibm,get-system-parameters RTAS to report Energy Management Tuning Parameters (3/18/2015)
  Additional System Parameters related to mgmt of FW Service Entitlement Warranty period (6/22/2015)
  Additional System Parameter to read LPAR Name string (10/7/2015)
  Redesign of properties for DRC information and dynamic memory (7/23/2015)
  Add additional logical location code sections (3/4/2016)
  Add ibm,vnic-client-mac to support vNIC failover (2/29/2016)
  hcall for registering the process table (3/21/2016)
  New device tree property for UUID (3/21/2016)
  Changes for Hotplug RTAS Events (10/24/2016)
  Support 64-bit PE TCEs in ibm,query-pe-dma-window (7/14/2016)
2016-05-04 Revision 2.0_pre1 - Initial conversion from IBM document. Extracted from Linux on Power Architecture Platform Reference (LoPAPR) version 1.1 dated March 24, 2016: Section 7.3.3 ([RTAS] Error and Event Reporting), Chapter 10 (Error and Event Notification), Sections 1-3 of Chapter 16 (Service Indicators), and Appendix L (When to use: Fault vs. Error Log Indicators).