Environmental and Power Warnings

Environmental and Power Warnings (EPOW) is an option that provides a means for the platform to inform the OS of environmental and power events. The intent is to enable the OS to provide basic information to the user about environmental and power problems and to minimize the logical damage done by these problems. For example, an OS might want to abort all disk I/O operations in progress to ensure that disk sectors are not corrupted by the loss of power. Even on platforms that provide hardware protection of data during environmental events, EPOW notification allows the OS to distinguish I/O errors caused by hardware failures from those caused by EPOW events.

These warnings include action codes that the platform can use to influence the OS behavior when various hardware components fail. For example, a fan failure where the system can continue to operate in the safe cooling range may just generate an action code of WARN_COOLING, but a fan failure where the system cannot operate in the safe cooling range may generate an action code of SYSTEM_HALT.

Implementation Note: Hardware cannot assume that the OS will process or take action on these warnings. These warnings are only provided to the OS in order to allow the OS a chance to cleanly abort operations in progress at the time of the warning. Hardware still assumes responsibility for preventing hardware damage due to environmental or power problems.

An OS that wants to be EPOW-aware will look for the epow-events node in the OF device tree, enable the interrupts listed in its “interrupts” property, and provide an interrupt handler to call check-exception when one of those interrupts is received.

When an EPOW event occurs, whether reported by check-exception or event-scan, RTAS will directly pass back the EPOW sensor value as part of the Extended Error Log format as described in , assuming the extended log is requested. Doing so avoids the need for the OS to make an extra RTAS call to obtain the sensor value.
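The setup flow described above (find the epow-events node, enable its interrupts, and call check-exception from the handler) can be modeled as a minimal sketch. Everything here is a simplifying assumption made for illustration: the device tree is a plain dictionary keyed by node name, the “interrupts” property is a flat list of interrupt numbers, `enable_irq` and `check_exception` are caller-supplied stand-ins for the real OS and RTAS services, and the `epow_sensor_value` key is a hypothetical name for the sensor value carried in the extended log.

```python
def epow_interrupts(device_tree):
    """Return the interrupt numbers listed by the epow-events node.

    Returns an empty list when the node is absent, i.e. when the
    platform does not offer the EPOW option.
    """
    node = device_tree.get("epow-events")
    if node is None:
        return []
    return list(node.get("interrupts", []))


def make_epow_handler(check_exception, on_epow):
    """Build an interrupt handler that calls check-exception.

    check_exception(irq) is assumed to return a dict-like extended log,
    or None when no log is available. The 'epow_sensor_value' key is a
    hypothetical field name, not part of the specification.
    """
    def handler(irq):
        log = check_exception(irq)
        if log is not None:
            # The sensor value arrives with the log, so no separate
            # get-sensor-state call is needed. The action code is in
            # the least significant 4 bits.
            on_epow(log["epow_sensor_value"] & 0xF)
    return handler


def enable_epow_interrupts(device_tree, enable_irq, handler):
    """Enable every EPOW interrupt and attach the handler to it."""
    irqs = epow_interrupts(device_tree)
    for irq in irqs:
        enable_irq(irq, handler)
    return irqs
```

Passing the OS and RTAS services in as callables keeps the sketch independent of any particular kernel interface. A real implementation would use the platform's own device tree and interrupt APIs.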
For critical power problems, the check-exception function is used to immediately report changes of state to the OS, while the get-sensor-state function allows the OS to monitor the condition (for example, loss of AC power) to see if the problem corrects itself.

R1--1. If the platform supports Environmental and Power Warnings by including an EPOW device tree entry, then the platform must support the EPOW sensor for the get-sensor-state RTAS function.

R1--2. The EPOW sensor, if provided, must contain the EPOW action code (defined in ) in the least significant 4 bits. In cases where multiple EPOW actions are required, the action code with the highest numerical value (where 0 is lowest and 7 is highest) must be presented to the OS. The platform may implement any subset of these action codes, but must operate as described in for those it does implement.

R1--3. To ensure adequate response time, platforms which implement the EPOW_MAIN_ENCLOSURE or EPOW_POWER_OFF action codes must do so via interrupt and check-exception notification, rather than by event-scan notification. (Except as modified by Requirement )

R1--4. If the platform does not notify EPOW_MAIN_ENCLOSURE or EPOW_POWER_OFF via interrupt, then the platform must protect data on I/O storage devices from corruption due to the EPOW event.

R1--5. For interrupt-driven EPOW events, the platform must ensure that an EPOW interrupt is not lost in the case where a numerically higher-priority EPOW event occurs between the time when check-exception gathers the sensor value and when it resets the interrupt.

R1--6. For SYSTEM_SHUTDOWN EPOW class 3, after a SYSTEM_SHUTDOWN EPOW commences and when the delay interval timer expires, if an “ibm,recoverable-epow3” encode-null property in the /rtas node is present, then the OS code that manages preserving storage must check the EPOW sensor state and the “ibm,request-partition-shutdown” property if present.
A normal boot must only occur when the EPOW sensor state indicates that the EPOW condition requiring a shutdown no longer exists (EPOW 0) and the “ibm,request-partition-shutdown” property is not present. Otherwise, the code that manages preserving storage must take the action as identified by the property.

Implementation Note: One way for hardware to prevent the loss of an EPOW interrupt is by deferring the generation of a new EPOW interrupt until the existing EPOW interrupt is reset by a call to the RTAS check-exception function. Another way is to ignore resets to the interrupt until all EPOW events have been reported.

EPOW Action Codes

| Action Code | Value | Description |
|---|---|---|
| EPOW_RESET/MESSAGE | 0 | No EPOW event is pending. This action code is the lowest priority. |
| WARN_COOLING | 1 | A non-critical cooling problem exists. An EPOW-aware OS logs the EPOW information. |
| WARN_POWER | 2 | A non-critical power problem exists. An EPOW-aware OS logs the EPOW information. |
| SYSTEM_SHUTDOWN | 3 | The system must be shut down. An EPOW-aware OS logs the EPOW error log information, then schedules the system shutdown to begin after an OS-defined delay interval (the default is 10 minutes). |
| SYSTEM_HALT | 4 | The system must be shut down quickly. An EPOW-aware OS logs the EPOW error log information, then schedules the system to be shut down in 20 seconds. |
| EPOW_MAIN_ENCLOSURE | 5 | The system may lose power. The hardware ensures that at least 4 milliseconds of power within operational thresholds is available after signalling an interrupt. An EPOW-aware OS performs any desired functions, masks the EPOW interrupt, and monitors the sensor to see if the condition changes. Hardware does not clear this action code until the system resumes operation within safe power levels. |
| EPOW_POWER_OFF | 7 | The system will lose power. The hardware ensures that at least 4 milliseconds of power within operational thresholds is available after signalling an interrupt. An EPOW-aware OS performs any desired operations, then attempts to turn system power off. An EPOW-aware OS does not clear the EPOW interrupt for this action code. This action code is the highest priority. |
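As an illustration of how the action codes and Requirements R1--2 and R1--3 fit together, the following is a minimal sketch, not a reference implementation. The enumeration values come from the action code table. The function names, the `ACTION_CODE_MASK` constant, and the choice to ignore the upper sensor bits are assumptions made for this example.

```python
from enum import IntEnum
from typing import Iterable, Optional


class EpowAction(IntEnum):
    """EPOW action codes, using the values from the action code table."""
    EPOW_RESET = 0           # listed as EPOW_RESET/MESSAGE; lowest priority
    WARN_COOLING = 1
    WARN_POWER = 2
    SYSTEM_SHUTDOWN = 3
    SYSTEM_HALT = 4
    EPOW_MAIN_ENCLOSURE = 5
    EPOW_POWER_OFF = 7       # highest priority


# R1--2: the action code occupies the least significant 4 bits.
ACTION_CODE_MASK = 0xF


def action_from_sensor(sensor_value: int) -> EpowAction:
    """Extract the action code from a raw EPOW sensor value.

    Raises ValueError for a code that the table does not define.
    """
    return EpowAction(sensor_value & ACTION_CODE_MASK)


def highest_pending(actions: Iterable[EpowAction]) -> EpowAction:
    """R1--2: with several pending actions, report the numerically highest."""
    return max(actions, default=EpowAction.EPOW_RESET)


def requires_interrupt(action: EpowAction) -> bool:
    """R1--3: these codes need interrupt and check-exception notification."""
    return action in (EpowAction.EPOW_MAIN_ENCLOSURE, EpowAction.EPOW_POWER_OFF)


def shutdown_delay_seconds(action: EpowAction,
                           os_defined_delay: int = 600) -> Optional[int]:
    """Return the scheduled shutdown delay, or None if none is scheduled.

    SYSTEM_SHUTDOWN uses an OS-defined delay (default 10 minutes);
    SYSTEM_HALT uses 20 seconds.
    """
    if action == EpowAction.SYSTEM_SHUTDOWN:
        return os_defined_delay
    if action == EpowAction.SYSTEM_HALT:
        return 20
    return None
```

For example, `action_from_sensor(0x25)` masks off the upper bits and yields `EpowAction.EPOW_MAIN_ENCLOSURE`, and `highest_pending([EpowAction.WARN_POWER, EpowAction.SYSTEM_HALT])` yields `EpowAction.SYSTEM_HALT`.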
Software Implementation Note: A recommended OS processing method for an EPOW_MAIN_ENCLOSURE event is as follows: Prepare for shutdown, mask the EPOW interrupt, and wait for 50 milliseconds. Then call get-sensor-state to read the EPOW sensor. If the EPOW action code is unchanged, wait an additional 50 milliseconds. If the action code is EPOW_POWER_OFF, attempt to power off. Otherwise, the power condition may have stabilized, so interrupts may be enabled and normal operation resumed.

Implementation Note: EPOW_RESET (EPOW action code 0) may be used to indicate that a previously reported EPOW condition is no longer present. For instance, a system might see a WARN_POWER action code for the loss of a redundant line input power. EPOW_RESET may subsequently be issued if the line power is restored. The same bits in the EPOW error log that specified the type of WARN_POWER EPOW generated would be set in the EPOW_RESET error log to indicate the specific EPOW event that was reset.

Systems that do not support an EPOW interrupt would generally be unable to support the EPOW action codes 5 and 7. In those cases, there could not be an EPOW event to indicate a loss of power. However, after power is restored, generating the EPOW_RESET EPOW would indicate that the system had previously lost power and that power has been restored. EPOW_RESET should only be used in this way if the system is unable to generate an EPOW class 5 or class 7.
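The recommended EPOW_MAIN_ENCLOSURE processing method from the Software Implementation Note can be sketched as a polling loop. This is a minimal model under stated assumptions: `read_action` stands in for a get-sensor-state call that returns the action code, the other callables are hypothetical OS hooks, and the note's "wait an additional 50 milliseconds" step is read as "wait, then read the sensor again", which the note implies but does not state outright.

```python
import time

EPOW_MAIN_ENCLOSURE = 5
EPOW_POWER_OFF = 7


def handle_main_enclosure(read_action, prepare_for_shutdown, mask_interrupt,
                          unmask_interrupt, power_off,
                          sleep=time.sleep, wait_s=0.050):
    """Model of the recommended EPOW_MAIN_ENCLOSURE processing method.

    All callables are hypothetical hooks supplied by the caller.
    Returns "power_off" or "resumed" so the outcome is easy to inspect.
    """
    prepare_for_shutdown()
    mask_interrupt()
    sleep(wait_s)                      # initial 50 ms wait
    while True:
        action = read_action()         # stands in for get-sensor-state
        if action == EPOW_MAIN_ENCLOSURE:
            sleep(wait_s)              # unchanged: wait another 50 ms
            continue                   # assumption: then re-read the sensor
        if action == EPOW_POWER_OFF:
            power_off()                # power loss is now certain
            return "power_off"
        # Any other code: the power condition may have stabilized.
        unmask_interrupt()
        return "resumed"
```

Injecting `sleep` and the hooks lets the flow be exercised without real hardware. For instance, a `read_action` that yields 5, 5, then 7 ends in `"power_off"`. The loop has no upper bound on polling; a production handler might add one.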