Dynamic Reconfiguration (DR) Architecture

Dynamic Reconfiguration (DR) is the capability of a system to adapt to
changes in the hardware/firmware physical or logical configuration, and to be
able to use the new configuration, all without having to turn the platform
power off or restart the OS. This section will define the requirements for
systems that support DR operations.

The figure “DR Architecture Structure” shows the relationship of the DR
architecture with LoPAR and the relationship of the individual DR pieces
with the base DR architecture. Each specific DR option (for example, PCI
Hot Plug) will have a piece that sits on top of the base DR option. The
base DR option is the set of requirements that will be implemented by all
DR platforms and that will be utilized by the OS that supports any of the
specific DR options. The specific DR options will call out the base DR
option requirements as being required. Therefore, in the figure, any
specific DR option is really that specific DR option piece plus the base DR
option. The base DR option is not a stand-alone option; a platform which
supports the base DR option without one or more of the specific DR option
pieces that sit on top of it, has not implemented the DR architecture to a
level that will provide any DR function to the user. Likewise, a DR entity
will meet the requirements of at least one of the specific DR options, or
else software is not required to support it as a DR entity. Thus, the base
DR option is the common building block and structure upon which all other
specific DR options are built.

DR operations can be physical or logical. Currently, the only physical
DR entities are PCI Hot Plug entities. That is, the OS only has control over
the physical DR operations on PCI IOAs. The current direction for hot plug of
other DR entities is to do the physical hot plug (power up/down, control of
service indicators, etc.) via the HMC and to bring the entity into usage by
an OS via logical DR operations (Logical Resource DR -- LRDR). The PCI Hot
Plug DR option can be found in
. The Logical Resource Dynamic
Reconfiguration option can be found in
. It is expected that as time
goes on, the base DR option may be expanded upon by addition of other DR
options.

Definitions Used in DR

DR Definitions (each term is followed by its definition)

Base Dynamic Reconfiguration (DR) option
The base on which all of the specific DR options are built.
Specific DR options include, for example, the PCI Hot Plug DR
option, processor card DR option, etc. These specific DR options
each include the requirement that all the base DR option
requirements be met. See
for more information
about the structure of the DR architecture pieces.

Dynamic Reconfiguration (DR)
The capability of a system to adapt to changes in the
hardware/firmware configuration with the power on and the OS
operating, and to be able to use the new configuration. This is a
piece of the High Availability puzzle, but only one of the
pieces.

Addition, removal, and replacement may, in general, be done
with the power on or off on the connector into which the entity
is being added or removed. For the PCI Hot Plug option, the power
to the slot is turned off and the logic signals are electrically
isolated from the connector during the plug or unplug
operation.

Depth First
Refers to a method where a tree structure (for example, a
set of PCI buses connected by PCI to PCI bridges) is traversed
from the top to the bottom before all the siblings at any
particular level are acted upon.

DR Connector (DRC)
The term “DR connector” will be used here to
define the plug-in point for the entity that is participating in
DR. For example, a ‘slot’ into which a PCI IOA is
inserted is a DRC.

DR Entity
An entity that can participate in DR operations. That is,
an entity that can be added or removed from the platform while
the platform power is on and the system remains operational. See
also the definitions of logical and physical DR entities.

DR Operation
The act of removing, adding or replacing a DR
Entity.

Entity
One or more I/O devices, IOAs, Processor cards, etc., that
are treated as one unit.

High Availability (HA) System
A system that gives the customer “close” to
continuous availability, but allows for some system down-time.
Besides DR, other factors that need to be considered in the
design of an HA system include system partitioning, clustering,
redundancy, error recovery, failure prediction, Error Detection
and Fault Isolation (EDFI), software failure detection/recovery,
etc.

I/O Adapter (IOA)
A device which attaches to a physical bus which is capable
of supporting I/O (a physical IOA) or logical bus (a virtual IOA)
and which has its own separate set of resources is referred to as
an IOA. The term “IOA” without the usage of the
qualifier “physical” or “virtual” will be
used to designate a physical IOA. Virtual IOAs are defined
further in
. Resources which must
have the capability of being separate (from other devices)
include: MMIO Load/Store address spaces, configuration address
spaces, DMA address spaces, power domains, error domains,
interrupt domains, and reset domains. Note that the hardware of
an IOA may allow for separation of these resources but the
platform or system implementation may limit the separation (for
example, shared error domains). In PCI terms, an IOA may be
defined by a unique combination of its assigned bus number and
device number, but not including its function number; an IOA may
be a single or multi-function device, unless otherwise specified
by the context of the text. Examples include LAN and SCSI IOAs. A
PCI IOA may exist as multiple device nodes in the OF device tree;
that is, the OF may treat separate “functions” in an
IOA as separate OF device tree nodes.

IOA: built-in
An IOA that is not pluggable by the user. Sometimes called
integrated I/O. As opposed to an IOA that may be removed as part
of a plug-in card removal (see definition for a plug-in card,
below).

I/O Bus
A hardware interface onto which an IOA can be plugged on a
platform. I/O buses discussed here include:

I/O Bus: PCI
The term “PCI” refers to one of: conventional
PCI, PCI-X, or PCI Express. The term “bus” in the
case of PCI Express refers to a PCI Express link.

I/O Bus: System Bus
The system bus in a platform is normally used only to
attach CPUs, memory controllers, and Host Bridges to bridge to
I/O buses. A platform’s system bus may, in certain
circumstances, be used to attach very high speed IOAs. DR of
system bus-attached entities is not considered here.

I/O Device
An entity that is connected to an IOA (usually through a
cable). A SCSI-attached DASD device is an example. Some I/O
devices and their connection points to the IOAs are designed to
be plugged while the connection point is operational to the other
I/O devices connected to the same IOA, and some are not. For
example, while the SCSI bus was not initially designed to have
devices added and removed while the SCSI bus was operational,
different vendors have found ways to do so. For example,
SCSI-attached DASD is pluggable and unpluggable from the SCSI bus
in some platforms.

Live Insertion
A DR operation where the power remains on at the DR
connector. Live insertion entities are always powered unless the
machine power is shut off or unless a subsystem containing those
entities is shut off.

Logical DR entity
A DR entity which does not have to be physically plugged or
unplugged during a DR operation on that entity. See
for a list of the supported
Logical DR types.

Logical Resource DR
The name of the option for support of DR of logical
entities. See
.

PCI Hot Plug
DR for PCI plug-in cards where there is a separate power
domain for each PCI Hot Plug slot. Platforms which do not provide
individual control of power and isolation for each PCI slot but
which do provide power and isolation control for groups of PCI
slots (that is, multiple slots per power domain), do not provide
“PCI Hot Plug,” but can support PCI DR.

Physical DR entity
A DR entity which may need to be physically plugged or
unplugged during a DR operation on that entity. See
for a list of the supported
physical DR types.

Plug-in card
A card which can be plugged into an I/O connector in a
platform and which contains one or more IOAs and potentially one
or more I/O bridges or switches.

Subsystem
One or more I/O devices, IOAs, Processor cards, etc., that
are treated as one unit, for purposes of
removal/insertion.
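
The “Depth First” definition above can be illustrated with a short sketch. The tree layout, the node names, and the depth_first helper are hypothetical and chosen only for illustration; they are not defined by this architecture.

```python
# Hypothetical PCI topology: each key is a node, each value lists its children.
TREE = {
    "phb0": ["bridge0", "ioa2"],
    "bridge0": ["ioa0", "ioa1"],
    "ioa0": [],
    "ioa1": [],
    "ioa2": [],
}


def depth_first(tree, root):
    """Return the nodes in depth-first (pre-order) order, starting at root."""
    order = [root]
    for child in tree[root]:
        # Descend fully into this child before visiting its siblings.
        order.extend(depth_first(tree, child))
    return order
```

In this sketch, everything below bridge0 is visited before its sibling ioa2, which is the ordering the definition describes.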

Architectural Limitations

The DR architecture places a few limitations on the implementations.
Current architectural limitations include:

- DR operations will be user initiated at the software level before
any physical plugging or unplugging of hardware is performed. This
architecture will be flexible enough to add additional methods for invoking
the process in the future, but for the initial architecture it will be
assumed that the operation is invoked by the user via a software method
(for example, invoking an OS DR services program). It is expected that some
technologies which will be added in the future will allow
plugging/unplugging without the user first informing the software (for
example, P1394 and USB).
- Critical system resources cannot be removed via a DR operation.
Which system resources are critical will not be defined by this
architecture; it is expected that this determination will be made by the OS
implementation and/or architecture. Loss of a critical resource would stop
the system from operating.
- Many of the RTAS calls will need to work properly, independent of
what is powered off (for example, NVRAM access must work during DR
operations). This is partially encompassed by the previous bullet. For more
information, see
.
- Any special requirements relative to redundant power supplies or
cooling are not addressed here.
- Moving of a DR entity from one location to another in a platform is
supported through a “remove and add” methodology rather than a
specific architecture which defines the constructs necessary to allow
moving of pieces of the platform around.

Note: The current AIX implementation does a “remove and
add” sequence even when the overall DR operation is a replacement.
That is, first the old entity is removed, and then the new entity is
added.

The figure “Dynamic Reconfiguration State Transitions” shows the states and transitions
for the dynamic reconfiguration entities (DR Entities). The transition
between states is initiated by a program action (RTAS functions) provided
the conditions for the transition are met.

Note: Relative to
, physical DRC types are brought
in to the “owned by the OS” states either: (1) by the Device
Tree at boot time, or (2) by a DLPAR operation, which brings in the logical
DRC “above” the physical DRC first, and drags the physical in
as part of transferring from state 3 to state 4. Therefore no states appear
in the “owned by platform” section under Hot Plug DR in the
figure. So, for example, the DLPAR assignment of a PCI physical slot to an
OS is done by assigning the logical SLOT DRC above the physical PCI slot,
thus giving the following state transitions: state 1, to state 2, to state
3, to state 4, at which time the OS sees the physical slot, sees an IOA in
the physical slot (via
get-sensor-state (dr-entity-sense) of the physical DRC
returning “present”), and then proceeds with the state
transitions of: state 5, to state 6, to state 7, to state 8. The reverse of
this (DLPAR removal of the PCI slot) is: state 8, to state 6, to state 5,
to state 4, to state 2, to state 1.

Notes:
1. In State 5, if empty status is returned from the
get-sensor-state dr-entity-sense call, then do not attempt to power on.
2. Transitions from State 8 to 6 or from State 6 to 5 may fail
(set-indicator isolation-state isolate, and get-sensor-state
dr-entity-sense) if the hardware cannot be accessed to control
these operations. In this case, the OS may ignore
those errors if the operation is a DLPAR to remove the
hardware. See also the “ibm,ignore-hp-po-fails-for-dlpar”
property in .

Base DR Option

For All DR Options - Platform Requirements

This section contains the extra requirements placed on
the platform for all of the various DR configurations.

At this time, there are no provisions made in the DR architecture
for unexpected removal of hardware or insertion of hardware into a DR
connector. Therefore the user is expected to interact with the DR
software prior to changing the hardware configuration. For example, it is
expected that most systems will require a keyboard action prior to the
hardware configuration change. Future architecture might allow for other
possibilities. For example, a push-button switch at the DR connector may
be provided which causes an interrupt to the OS to signal that an
operation is about to take place on the connector. (Footnote: The push-button method is one that has been mentioned as a
possible enhancement for systems that are produced for telephone
company applications.)

As mentioned in
, the requirements in this
section are not stand-alone requirements; the platform will also need to
implement one or more of the specific DR options.

R1--1. For all DR options: If the
“ibm,configure-connector”
property exists in the
/rtas node of the OF device tree, then the platform
must meet all of the requirements for the Base DR option (that is, all of
the requirements labeled “For all DR options”), and must also
meet all the requirements for at least one of the specific DR
options.

R1--2. For all DR options: The platform and
OS must adhere to the design and usage restrictions on RTAS routines
defined in
, and any RTAS calls not
specified in
must comply with
Note
and
.

RTAS Call Operation During DR Operations

RTAS Call Name             Reference to Note Numbers
rtas-last-error            1
ibm,read-pci-config        4
check-exception            1, 2
ibm,write-pci-config       4, 7
display-character          1
restart-rtas               1
event-scan                 1, 2
set-indicator              3, 4, 5
query-cpu-stopped-state    4
set-power-level            3, 4, 5
get-power-level            4
set-time-for-power-on      1
get-sensor-state           3, 4
set-time-of-day            1
get-time-of-day            1
start-cpu                  4
ibm,configure-connector    7
stop-self                  7
ibm,exti2c                 1
system-reboot              1
ibm,os-term                1
nvram-store                1
nvram-fetch                1
power-off                  1, 6
ibm,power-off-ups

Notes:
1. These RTAS calls
function as specified in this architecture, regardless of the power state
of any DR entity in the platform (providing the call is
implemented).
2. These RTAS calls
do not cause errors nor return an error status by accessing hardware
which is isolated, unusable and/or powered down.
3. These RTAS calls function properly when dealing with a DR
connector, when the parent of that DR connector is powered and
configured, regardless of the state of the child of the parent (for
set-indicator, the isolation-state and dr-indicator names, and for
get-sensor-state, the dr-entity-sense sensor name).
4. The results of the OS issuing these RTAS calls to hardware when
the access to that hardware is through hardware which is isolated,
unusable, powered off, or incompletely configured, are
indeterminate.
5. The results of the OS changing the power or isolation state of a
Dynamic Reconfigure connector while there is an uncompleted
ibm,configure-connector operation in progress against
that connector are indeterminate.
6. Power domains which were defined within sub-trees which have
been subsequently isolated may remain un-modified by this call; their
state will be platform dependent.
7. The results of the OS issuing these RTAS calls to hardware which
is isolated and/or powered off are indeterminate.

R1--3. For all DR options: If there is Forth code associated
with a DR entity, it must not modify the OF device tree properties or
methods unless modifications can be hidden by the
ibm,configure-connector RTAS call (that is, where
this RTAS routine recognizes the entity and creates the appropriate OF
device tree characteristics that would have been created by the Forth
code).

R1--4. For all DR options: The hardware must protect against
any physical damage to components if the DR entity is removed or inserted
while power is on at the DR connector.

R1--5. For all DR options: During a DR operation (including
resetting and removing the reset from the entity, powering up and
powering down the entity, unisolating and isolating the entity, and
physically inserting and removing the entity), the platform must prevent
the introduction of unrecoverable errors on the bus or interconnect into
which the DR entity is being inserted or removed.

R1--6. For all DR options: During a DR operation (including
resetting and removing the reset from the entity, powering up and
powering down the entity, unisolating and isolating the entity, and
physically inserting and removing the entity), the platform must prevent
damage to the DR entity and the planar due to any electrical
transitions.

R1--7. For all DR options: If there are any
live insertion DR entities in a platform and if those entities or the
rest of the platform cannot tolerate the power being turned off to those
entities during DR operations on other DR entities, then they must not be
placed in the same power domain as the DR entities that will be powered
off.

R1--8. For all DR options: A separate visual
indicator must be provided for each physical DR connector which can be
used for insertion of a DR Entity or which contains a DR entity that can
be removed, and the indicator must be individually controllable via the
set-indicator RTAS call, and must have the capability
to be set to the states as indicated in
and
.

R1--9. For all DR options:
If a platform provides a separate indicator to indicate the state of the power for the
DR connector, then that indicator must be turned on by the platform when the
platform turns the power on to the DR connector and must be turned off by
the platform when the platform turns the power off to the DR
connector.

R1--10. For all DR options: If a DR entity requires power to
be turned off prior to the physical removal of the DR entity from the
platform, then the hardware must provide a green power indicator to
indicate the power state of the DR entity.

R1--11. For all DR options: The platform must provide any
necessary power sequencing between voltages within a power domain during
DR operations (for example, during the
set-power-level RTAS call).

R1--12. For all DR options: If a platform supports DR, then
all DR entities must support the full on to off and the off to full on
power transitions.

Architecture Note: Requirement
is necessary so that the OS can
count on the availability of certain RTAS facilities and so that the OS
does not use other RTAS facilities when they are not available. This may
put certain hardware restrictions on what can and cannot be shut
down.

Hardware Implementation Notes:
- Requirement
requires careful planning of
hardware design and platform structure to assure that no resources
critical to RTAS are put into power domains that are powered down as part
of a DR operation. In addition, the platform is required to provide the
facilities (registers and bits in registers readable by firmware, etc.)
so that RTAS can query the state of the hardware and determine if
something is powered off before actually accessing the powered-off
hardware.
- Requirement
indicates that there cannot be
any sharing of indicators between DR connectors.
- In some large systems (for example, systems with many racks of
equipment) it may not be possible or convenient to view the individual DR
visual indicators without opening cabinet doors, etc. In such cases, the
designers of such systems could consider putting a “summary”
visual indicator where the user could readily see it, which is basically
a logical “or” of the visual indicators which are out of
sight. For example, in a rack-based system, the drawers might have an
indicator on the front of the drawer that indicates if any indicators on
the back of the drawer are flashing. This summary indicator will not be
accessed by the software (that is, will be transparent to the software)
but it is permissible for the indicator to have firmware
dependencies.

For All DR Options - OF Requirements

This section describes the OF properties added for DR and any
additional requirements placed on OF due to DR.

This section defines a number of new DR properties which are
arrays. All properties for a specific DR connector under a node are at
the same offset into each array. Also, when the descriptive text states
“the first connector” this does not imply any physical
position or numbering, but rather a logical “first” connector
beneath a particular node in the OF device tree.

General Requirements

R1--1. For all DR options: When the firmware passes control
to the OS, the DR hardware must be initialized such that all of the DR
connectors which would return “DR entity present” to a
get-sensor-state (dr-entity-sense) are fully powered
and operational and any DR visual indicators are set to the appropriate
state (on or off) as indicated by
.

R1--2. For all DR options: After the firmware has passed
control to the OS, the state of the DR visual indicators must not change
except under the following conditions:
- As directed to do so by the
set-indicator RTAS call.
- Under the condition of a power-fault, in which case the hardware
may change the state of the visual indicator to the “off”
state if it turns the power off to the slot.

R1--3. For all DR options: The platforms which have
hierarchical power domains must provide the
“power-domains-tree” property in the OF
device tree.

Property “ibm,drc-indexes”

This property is added for the DR option to specify for each DR
connector an index to be passed between the OS and RTAS to identify the
DR connector to be operated upon. This property is in the parent node of
the DR connector to which the property applies. See
for the definition of this
property.

R1--1. For all DR options: For each OF device tree node
which supports DR operations on its children, the OF must provide an
“ibm,drc-indexes” property for that
node.

Property “ibm,my-drc-index”

This property is added for the DR option to specify for each node
which has a DR connector between it and its parent, the value of the
entry in the
“ibm,drc-indexes” property for that
connector. This property is used for correlation purposes. See
for the definition of this
property.

R1--1. For all DR options: For each OF device tree node
which has a DR connector between it and its parent, the OF must provide an
“ibm,my-drc-index” property for that
node.

Property “ibm,drc-names”

This property is added for the DR option to specify for each DR
connector a user-readable location code for the connector. See
for the definition of this
property.

R1--1. For all DR options: For each OF device tree node
which supports DR operations on its children, the OF must provide an
“ibm,drc-names” property for that
node.

R1--2. For all DR options: The content of the
“ibm,drc-names” property must be of the
format defined in
.

“ibm,drc-names” Property Format

DRC Type                     DRC Name
1-8, 11-30 (PCI Hot Plug)    Location code
SLOT                         Location code (built-in has port suffix)
PORT                         Port x
CPU                          CPU x
MEM or MEM-n                 LMB x
PHB                          PHB x

For the CPU, LMB, and PHB names, “x” is a decimal number with one or
more digits and no leading zeroes.
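
The name formats in the table above can be checked mechanically. The following is a minimal sketch, not part of the architecture: the function name is_valid_drc_name is hypothetical, a lone “0” is assumed to be an acceptable value of “x”, and the same numbering rule is assumed to apply to “Port x” even though the table states it only for CPU, LMB, and PHB. Location-code names are not validated because their format is not defined in this section.

```python
import re

# Name prefix expected for each DRC type that uses the "<prefix> x" form.
PREFIX_BY_TYPE = {"PORT": "Port", "CPU": "CPU", "MEM": "LMB", "PHB": "PHB"}

# Decimal number with one or more digits and no leading zeroes.
# Assumption: a lone "0" is accepted.
_NUMBER = r"(?:0|[1-9][0-9]*)"


def is_valid_drc_name(drc_type, name):
    """Check a DRC name against the '<prefix> x' format for its DRC type."""
    # MEM-n types share the MEM naming rule.
    base = "MEM" if drc_type.startswith("MEM-") else drc_type
    prefix = PREFIX_BY_TYPE.get(base)
    if prefix is None:
        # Location-code names (PCI Hot Plug types, SLOT) are not checked here.
        return True
    return re.fullmatch(rf"{prefix} {_NUMBER}", name) is not None
```

For example, under these assumptions “CPU 12” and “LMB 3” are accepted, while “CPU 012” is rejected because of its leading zero.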

“ibm,drc-power-domains” Property

This property is added for the DR option to specify for each DR
connector the power domain in which the connector resides. See
for the definition of this
property.

R1--1. For all DR options:
For each OF device tree node which supports DR operations on its children, the OF
must provide an
“ibm,drc-power-domains” property for that
node.

Software Implementation Notes:
- Software will not call the
set-power-level RTAS call with an invalid power
domain number, and for purposes of this call, a power domain number of -1
(a live insert connector) is considered invalid.
- For the case where the power domain is -1 (the live insert case),
this does not imply that the connector does not need isolating before the
DR operation, only that it does not need to be powered off.

Property “ibm,drc-types”

This property is added for the DR option to specify for each DR
connector a user-readable connector type for the connector. See
for the definition of this
property.

Architecture Note: The logical connectors (CPU, MEM,
etc.) represent DR boundaries that may not have physical DR connectors
associated with them. If physical DR boundaries were present, they would
be represented by a different DR connector type. It is possible that a
given boundary may be represented by both a physical and a logical
connector. In that case, logical assignment would be managed with the
logical connector and physical add/remove would be managed by specifying
the physical DR connector.

R1--1. For all DR options: For each OF device tree node
which supports DR operations on its children, the OF must provide an
“ibm,drc-types” property for that
node.

Property “ibm,phandle”

This property is added for the DR option to specify the phandle for
each OF device tree node returned by ibm,configure-connector. See
for the definition of this
property.

R1--1. For all DR options: The
ibm,configure-connector RTAS call will include the
“ibm,phandle” property in each OF device
tree node that it returns. This phandle must be unique and consistent
with any phandle visible to an OF client program or any other information
returned by
ibm,configure-connector.

For All DR Options - RTAS Requirements

For platforms that implement DR, there is one new RTAS call and
some changes (new requirements) placed on existing ones.

General Requirements

The following are the general requirements for RTAS for all DR
options.

R1--1. For all DR options:
If there is Forth
code associated with a DR entity and that Forth code would normally
modify the OF device tree properties or methods, then if that entity is
to be supported as a DR entity on a particular platform, the
ibm,configure-connector RTAS call on that platform
must recognize that entity and create the appropriate OF device tree
characteristics that would have been created by the Forth code.

set-power-level

This RTAS call is defined in
. Several additional requirements are placed
on this call when the platform implements DR along with PM.

This RTAS call is used in DR to power up or power down a DR
connector, if necessary (that is, if there is a non-zero power domain
listed for the DR connector in the
“ibm,drc-power-domains” property). The
input is the power domain and the output is the power level that is
actually to be set for that domain; for purposes of DR, only two of the
current power levels are of interest: “full on” and
“off.”

For sequencing requirements between this RTAS routine and others,
see Requirements
and
.

R1--1. For all DR options: The
set-power-level RTAS call must be implemented as
specified in
and the further requirements of this DR
option.

R1--2. For all DR options: The
set-power-level RTAS call must initiate the operation
and return “busy” status for each call until the operation is
actually complete.

R1--3. For all DR options:
If a DR operation
involves the user inserting a DR entity, then if the firmware can
determine that the inserted entity would cause a system disturbance, then
the
set-power-level RTAS call must not power up the
entity and must return an error status which is unique to that particular
type of error, as indicated in
.

set-power-level Error Status for specific DR options

Parameter Type: Out
Name: Status
Option Name: PCI Hot Plug DR option
Values:
-9000: Powering entity would create change of frequency
on the bus and would disturb the operation of other PCI IOAs on
the bus, therefore entity not powered up.
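
The “busy” behavior required of set-power-level above, together with the -9000 status, can be sketched as a caller-side retry loop. This is an illustration under stated assumptions, not a definition of the RTAS interface: rtas_set_power_level is a hypothetical callable that takes a power domain and a level and returns only a status, the numeric values assumed for “success” (0) and “busy” (-2) are not given in this section, and the retry limit is arbitrary.

```python
import time

RTAS_SUCCESS = 0               # assumed numeric value of a successful status
RTAS_BUSY = -2                 # assumed numeric value of the "busy" status
PCI_FREQ_DISTURBANCE = -9000   # from the set-power-level error status table


def set_power_level_blocking(rtas_set_power_level, domain, level,
                             delay_s=0.0, max_tries=1000):
    """Repeat the call with the same parameters until it stops reporting busy.

    rtas_set_power_level is a hypothetical callable(domain, level) -> status.
    """
    for _ in range(max_tries):
        status = rtas_set_power_level(domain, level)
        if status != RTAS_BUSY:
            # Either the operation completed or an error such as -9000 occurred.
            return status
        time.sleep(delay_s)
    raise TimeoutError("set-power-level still busy after max_tries calls")
```

A caller would compare the returned status against PCI_FREQ_DISTURBANCE to detect the case in which the entity was deliberately left powered down.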

Hardware Implementation Notes:
- For any DR operation, the firmware could optionally not allow
powering up of a DR entity, if the powering up would cause a platform
over-power condition (the firmware would have to be provided with the DR
Entities’ power requirements and the platform’s power
capability by a method which is not architected by the DR
architecture).
- If PM is not implemented in the platform, then only the
“full on” and “off” states need to be implemented
for DR and only those two states will be used.

Software Implementation Note: The operation of the
set-power-level call is not complete at the time of
the return from the call if the “busy” status is returned. If
it is necessary to know when the operation is complete, the routine
should be called with the same parameters until a non-busy status is
returned.

get-sensor-state

This RTAS call is defined in
. This RTAS call will be used
in DR to determine if there is something connected to the DR
connector.

The
“rtas-sensors” and
“ibm,sensor-<token>”
OF properties are not applicable to DR
sensors defined in
.

R1--1. For all DR options: RTAS must implement the
get-sensor-state RTAS call.

R1--2. For all DR options: The sensor values specified in
must be implemented as
specified in that table.

get-sensor-state Defined Sensors for All DR Options

Sensor Name: dr-entity-sense
Token Value: 9003
Defined Sensor Values and Descriptions:

DR connector empty (0)
Returned for physical DR entities if the connector is
available (empty) for an add operation. The DR connector must
be allocated to the OS to return this value, otherwise a status
of -3, no such sensor implemented, will be returned from the
get-sensor-state RTAS call.

DR entity present (1)
Returned for logical and physical DR entities when the DR
connector is allocated to the OS and the DR entity is present.
For physical DR entities, this indicates that the DR connector
actually has a DR entity plugged into it. For DR connectors of
physical DR entities, the DR connector must be allocated to the
OS to return this value, otherwise a status of -3, no such
sensor implemented, will be returned from the
get-sensor-state RTAS call. For DR
connectors of logical DR entities, the DR connector must be
allocated to the OS to return this value, otherwise a sensor
value of 2 or 3 will be returned.

DR entity unusable (2)
Returned for logical DR entities when the DR entity is
not currently available to the OS, but may possibly be made
available to the OS by calling
set-indicator with the allocation-state
indicator, setting that indicator to usable.

DR entity available for exchange (3)
Returned for logical DR entities when the DR entity is
available for exchange in a sparing type operation, in which
case the OS can claim that resource by doing a
set-indicator RTAS call with
allocation-state set to exchange.

DR entity available for recovery (4)
Returned for logical DR entities when the DR entity can
be recovered by the platform and used by the partition
performing a
set-indicator RTAS call with
allocation-state set to recover.
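
The sensor values above, and the allocation-state request each logical-entity value points to, can be captured in a small lookup. This is a sketch only: the class and function names are hypothetical, and the strings “usable”, “exchange”, and “recover” are used as labels for the allocation-state settings named in the table, not as the numeric indicator values passed to set-indicator.

```python
from enum import IntEnum

DR_ENTITY_SENSE_TOKEN = 9003  # token value from the table above


class DrEntitySense(IntEnum):
    """dr-entity-sense sensor values from the table above."""
    EMPTY = 0
    PRESENT = 1
    UNUSABLE = 2
    EXCHANGE = 3
    RECOVER = 4


# allocation-state setting the OS could request via set-indicator
# to obtain a logical DR entity reported in each of these states.
ALLOCATION_REQUEST = {
    DrEntitySense.UNUSABLE: "usable",
    DrEntitySense.EXCHANGE: "exchange",
    DrEntitySense.RECOVER: "recover",
}


def allocation_request_for(sensor_value):
    """Return the allocation-state label to request, or None if none applies."""
    return ALLOCATION_REQUEST.get(DrEntitySense(sensor_value))
```

Values 0 and 1 map to None because the table describes no allocation-state request for an empty connector or for an entity that is already present.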

R1--3. For all DR options except the PCI Hot Plug and LRDR
options:
If the
get-sensor-state call with the dr-entity-sense sensor
requires the DR entity to be powered up and/or unisolated to sense the
presence of the DR entity, then the
get-sensor-state call must return the error code of
-9000 or -9001, as defined in
, if the DR entity is powered
down or is isolated when the call is made.

get-sensor-state Error Status for All DR Options

Parameter Type: Out
Name: Status
Values:
-9000: Need DR entity to be powered up and unisolated before RTAS call
-9001: Need DR entity to be powered up, but not unisolated, before RTAS call
-9002: (see architecture note, directly below)

Architecture Note: The -9002 return code should not
be implemented. For legacy implementations if it is returned, then it
should be treated by the caller the same as a return value of 2 (DR
entity unusable).

R1--4. For all DR options:
The value used
for the sensor-index input to the
get-sensor-state RTAS call for the sensors in
must be the index for the
connector, as passed in the
“ibm,drc-indexes” property.

Hardware and Software Implementation Note: The status
introduced in Requirement
is not valid for
get-sensor-state calls when trying to sense insertion
status for PCI slots (see Requirement
).

Architecture Note: The “DR entity available for recovery”
state is intended to allow a platform to temporarily allocate resources to itself
on a reboot and then allow the OS to subsequently recover those
resources when no longer needed by the platform. An example of use would
be the platform temporarily reserving some LMBs to itself during a reboot
to store dump data, and then making the LMBs available to an OS partition
by marking them with the state of “available for recovery”
after the dump data has been transferred to the OS.

set-indicator

This RTAS call is defined as shown in
. This RTAS call is used in DR to transition between isolation states and allocation states, and to control DR indicators. In some cases, a state transition fails due to various conditions; however, a null transition (commanding that the new state be what it already is) always succeeds. As a consequence, this RTAS call is used in all DR sequences to logically (and if necessary physically) isolate and unisolate the connection between a DR entity and the platform. If physical isolation is indeed required for the DR entity, this RTAS call determines the necessity for isolation, not the calling program.

The
set-indicator allocation-state and
set-indicator isolation-state are linked. Before
calling
set-indicator with isolation-state set to unisolate,
the DR entity being unisolated will first need to be allocated to the OS.
If the
get-sensor-state call would return a value of DR
entity unusable or if it would return an error like -3 for the DR entity,
then the
set-indicator isolation-state to unisolate would fail
for that DR entity.

For sequencing requirements between this RTAS routine and others, see Requirements
and
.

A single
set-indicator operation for indicator type 9001 may
require an extended period of time for execution. Following the
initiation of the hardware operation, if the
set-indicator call returns prior to successful
completion of the operation, the call will return either a status code of
-2 or 990x. A status code of -2 indicates that RTAS may be capable of
doing useful processing immediately. A status code of 990x indicates that
the platform requires an extended period of time, and hints at how much
time will be required before completion status can be obtained. Neither
the 990x nor the -2 status codes imply that the platform has initiated
the operation, but it is expected that the 990x status would only be used
if the operation had been initiated.

The following are the requirements for the base DR option. Other DR options may put additional requirements on this RTAS call. indicates which DR indicators are used with which DR connector types.

The
“rtas-indicators” and
“ibm,indicator-<token>”
OF properties are not applicable to DR
indicators defined in
.

R1--1. For all DR options: The indicator state values
specified in
must be implemented as
specified in that table.

set-indicator Defined Indicators for all DR Options

Indicator Name: isolation-state
Token Value: 9001
Defined State Values: Isolate (0), Unisolate (1)
Default Value: Unisolated
Examples/Comments: This indicator must be implemented for DR connectors for both physical and logical DR entities. Isolate refers to the DR action to logically disconnect the DR entity from the platform. An isolate operation makes the DR entity available to the firmware, and in the case of a physical DR entity like a PCI IOA, logically disconnects the DR entity from the platform (for example, from the PCI bus). Unisolate refers to the DR action to logically connect the entity. Before set-indicator isolation-state to unisolate, the DR entity being unisolated must first be allocated to the OS. If the get-sensor-state call with the dr-entity-sense token would return a value of DR entity unusable or if it would return an error like -3 for the DR entity, then the set-indicator isolation-state to unisolate must fail for that DR entity.

Indicator Name: dr-indicator
Token Value: 9002
Defined State Values: Inactive (0), Active (1), Identify (2), Action (3)
Default Value: 0 if Inactive, 1 if Active
Examples/Comments: This indicator must be implemented for DR connectors for physical DR entities. If the DR indicators exist for the DR connector, then they are used to indicate the state of the DR connector to the user. Usage of these states is as defined in
and
.

Indicator Name: allocation-state
Token Value: 9003
Defined State Values: unusable (0), usable (1), exchange (2), recover (3)
Default Value: NA
Examples/Comments: This indicator must be implemented for DR connectors for logical DR entities. Used to allocate and deallocate entities to the OS. The initial allocation state of a connector is established based upon the initial allocation of resources to the OS image. Subsequently, an OS may request a change of allocation state by use of the set-indicator with allocation-state token. If the transition to the usable state is not possible the -3 (no such indicator implemented) status is returned.

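
As an illustration of the indicator table, the token and state values can be captured in a small lookup that rejects state names not defined for a given indicator. This is a sketch under stated assumptions: the names STATES, encode, and unisolate_would_fail are hypothetical, and no actual RTAS call is made.

```python
# Indicator tokens from the set-indicator table.
ISOLATION_STATE = 9001
DR_INDICATOR = 9002
ALLOCATION_STATE = 9003

# Defined state values for each indicator token.
STATES = {
    ISOLATION_STATE: {"isolate": 0, "unisolate": 1},
    DR_INDICATOR: {"inactive": 0, "active": 1, "identify": 2, "action": 3},
    ALLOCATION_STATE: {"unusable": 0, "usable": 1, "exchange": 2, "recover": 3},
}

# dr-entity-sense value meaning "DR entity unusable".
DR_ENTITY_UNUSABLE = 2


def encode(token, state_name):
    """Translate a state name into (token, value); reject undefined pairs."""
    try:
        return token, STATES[token][state_name]
    except KeyError:
        raise ValueError(
            f"{state_name!r} is not defined for indicator {token}"
        ) from None


def unisolate_would_fail(entity_sense):
    """Unisolate must fail while dr-entity-sense reports the entity unusable."""
    return entity_sense == DR_ENTITY_UNUSABLE
```

A caller could use encode(ISOLATION_STATE, "unisolate") to obtain the pair (9001, 1) before passing it, together with the connector's index, to whatever RTAS interface its environment provides.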
R1--2. For all DR options: The value used for the indicator-index input to the set-indicator RTAS call for the indicators in
must be the index for the connector, as passed in the “ibm,drc-indexes” property.

R1--3. For all DR options: The set-indicator call must return a -2 status, or optionally for indicator type 9001 the 990x status, for each call until the operation is complete; where the 990x status is defined in
.

R1--4. For all DR options: If this is a DR operation that
involves the user inserting a DR entity, then if the firmware can
determine that the inserted entity would cause a system disturbance, then
the
set-indicator RTAS call must not unisolate the entity
and must return an error status which is unique to the particular
error.

R1--5. For all DR options: If the set-indicator index refers to a connector that would return a “DR entity unusable” status (2) to the get-sensor-state call with the dr-entity-sense token, the set-indicator RTAS return code must be “No such indicator implemented” (-3), except in response to a successful set-indicator call that sets allocation-state to usable.

R1--6. For all DR options combined with the LPAR option: The
RTAS
set-indicator specifying unusable allocation-state of
a DR connector must unmap the resource from the partition’s Page
Frame Table(s) and, as appropriate, its Translation Control Entry
tables.

R1--7. For all DR options combined with the LPAR option: The successful completion of the RTAS set-indicator specifying usable allocation-state of a DR connector must allow subsequent mapping of the resource as appropriate within the partition’s Page Frame Table(s) and/or its Translation Control Entry tables.

Software Implementation Note: The operation of the set-indicator call is not complete at the time of the return from the call if the “busy” status is returned. If it is necessary to know when the operation is complete, the routine should be called with the same parameters until a non-busy status is returned.

Hardware and Software Implementation Note: The set-indicator (isolation-state) call is used to clear RTAS internal tables regarding this device. The ibm,configure-connector RTAS routine will need to be called before using the entities below this connector, even if power was never removed from an entity while it was in the isolated state.

ibm,configure-connector RTAS Call

The RTAS function
ibm,configure-connector is a new RTAS call introduced
by DR and is used to configure a DR entity after it has been added by
either an add or replace operation. It is expected that the
ibm,configure-connector RTAS routine will have to be
called several times to complete the configuration of a dynamic
reconfiguration connector, due to the time required to complete the
entire configuration process. The work area contains the intermediate
state that RTAS needs to retain between calls. The work area consists of 4096-byte pages of real storage on 4096-byte boundaries, which can be increased by one page on each call. The OS may interleave calls to ibm,configure-connector for different dynamic reconfiguration connectors; however, a separate work area will be associated with each dynamic reconfiguration connector which is actively being configured. Other standard RTAS locking rules apply.

The properties generated by the
ibm,configure-connector call are dependent on the
type of DR entities. For a list of properties generated, see the RTAS
Requirements section for each specific DR option. For example, for a list
of properties generated for PCI Hot Plug, see
.

For sequencing requirements between this RTAS routine and others, see Requirement
.

R1--1. For all DR options: The RTAS function
ibm,configure-connector must be implemented and must
implement the argument call buffer defined by
.

ibm,configure-connector Argument Call Buffer

Parameter Type   Name             Values
In               Token            Token for ibm,configure-connector
In               Number Inputs    2
In               Number Outputs   1
In               Work area        Address of work area
In               Memory extent    0 or address of additional page
Out              Status           -9003: Cannot configure - Logical DR connector unusable, available for exchange, or available for recovery.
                                  -9002: Cannot configure - DR Entity cannot be supported in this connector
                                  -9001: Cannot configure - DR Entity cannot be supported in this system
                                  -2: Call again
                                  -1: Hardware error
                                  0: Configuration complete
                                  1: Next sibling
                                  2: Next child
                                  3: Next property
                                  4: Previous parent
                                  5: Need more memory
                                  990X: Extended Delay

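
One way a caller might dispatch on the Status values in the table above is sketched here. It is illustrative only: the names Status, is_extended_delay, and next_step are hypothetical, and treating "990X" as the range 9900 through 9909 is an assumption based on the notation in the table.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status values from the ibm,configure-connector argument call buffer."""
    LOGICAL_CONNECTOR_UNUSABLE = -9003
    UNSUPPORTED_IN_CONNECTOR = -9002
    UNSUPPORTED_IN_SYSTEM = -9001
    CALL_AGAIN = -2
    HARDWARE_ERROR = -1
    CONFIGURATION_COMPLETE = 0
    NEXT_SIBLING = 1
    NEXT_CHILD = 2
    NEXT_PROPERTY = 3
    PREVIOUS_PARENT = 4
    NEED_MORE_MEMORY = 5


def is_extended_delay(status):
    """Assumption: '990X' covers 9900..9909 (X is a single decimal digit)."""
    return 9900 <= status <= 9909


def next_step(status):
    """Classify a returned status as 'retry', 'grow', 'data', 'done' or 'failed'."""
    if status == Status.CALL_AGAIN or is_extended_delay(status):
        return "retry"   # call again with the work area unmodified
    if status == Status.NEED_MORE_MEMORY:
        return "grow"    # supply one more page via the Memory extent parameter
    if status in (Status.NEXT_SIBLING, Status.NEXT_CHILD,
                  Status.NEXT_PROPERTY, Status.PREVIOUS_PARENT):
        return "data"    # tree-walk information was returned
    if status == Status.CONFIGURATION_COMPLETE:
        return "done"
    return "failed"      # -1, -9001, -9002, -9003, or an unknown value
```

Keeping the classification separate from the code that actually issues the call makes it possible to exercise the retry and error paths without firmware.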
R1--2. For all DR options: On the first call of a dynamic
reconfiguration sequence, the one page work area must be initialized by
the OS as in
.

Initial Work Area Initialization

Entry Offset   Value
0              entry from the “ibm,drc-indexes” property for the connector to configure
1              0

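
The initial layout in the table above can be sketched as follows. This is a minimal illustration with several assumptions: a bytearray stands in for a page of real storage (alignment is not modeled), entries are written big-endian, and the name init_work_area is hypothetical. The entry width follows the 32-bit or 64-bit RTAS instantiation mode.

```python
import struct

PAGE_SIZE = 4096  # the work area is built from 4096-byte pages


def init_work_area(drc_index, entry_bytes=4):
    """Build the first work-area page for a new configuration sequence.

    entry_bytes is 4 for a 32-bit RTAS and 8 for a 64-bit RTAS.
    Big-endian byte order is an assumption of this sketch.
    """
    fmt = {4: ">I", 8: ">Q"}[entry_bytes]
    area = bytearray(PAGE_SIZE)
    # Entry offset 0: the connector's entry from "ibm,drc-indexes".
    struct.pack_into(fmt, area, 0 * entry_bytes, drc_index)
    # Entry offset 1: zero.
    struct.pack_into(fmt, area, 1 * entry_bytes, 0)
    return area
```

The drc_index passed in would come from the connector's “ibm,drc-indexes” entry; any value used to try the sketch is only a placeholder.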
Architecture Note: The entry offset in
is either four bytes or eight
bytes depending on whether RTAS was instantiated in 32-bit or 64-bit
mode, respectively.

R1--3. For all DR options: On all subsequent calls of the sequence, the work area must be returned unmodified from its state at the last return from RTAS.

R1--4. For all DR options: The
ibm,configure-connector RTAS call must update any
necessary RTAS configuration state based upon the configuration changes
effected through the specified DR connector.

The sequence ends when RTAS returns either a “hardware
error” or “configuration complete” status code, at
which time the contents of the work area are undefined. If the OS no
longer wishes to continue configuring the connector, the OS may recycle
the work area and never recall RTAS with that work area. Unless the
sequence ends with Configuration Complete, the OS will assume that any
reported devices remain unconfigured and unusable. RTAS internal data
structures (outside of the work area) are not updated until the call
which returns “configuration complete” status. A subsequent
sequence of calls to
ibm,configure-connector with the same entry from the
“ibm,drc-indexes” property will restart
the configuration of devices which were not completely configured.

If the index from “ibm,drc-indexes” refers to a connector that would return a “DR entity unusable” status (2) to the get-sensor-state RTAS call with the dr-entity-sense token, the ibm,configure-connector RTAS call for that index immediately returns “-9003: Cannot configure - Logical DR connector unusable” on the first call without any configuration action taken on the DR connector.

A dynamic reconfiguration connector may attach several sibling OF
device tree architected devices. Each such device may be the parent of
one or more device sub-trees. The
ibm,configure-connector RTAS routine configures and
reports the entire sub-tree of devices rooted in previously unconfigured
architected devices found below the connector whose index is specified in
the first entry of the work area, except those that are associated with
an empty or unowned dynamic reconfiguration connector, where unowned refers to a DR connector that would return a DR entity unusable, a DR entity available for exchange, or a DR entity available for recovery value for a get-sensor-state call with the dr-entity-sense sensor. Configuration proceeds in a depth-first order.

If the
ibm,configure-connector RTAS routine returns with the
“call again” or 990x status, configuration is proceeding but
had to be suspended to maintain the short execution time requirement of
RTAS routines. No results are available. The OS should call the
ibm,configure-connector RTAS routine passing back the
work area unmodified at a later time to continue the configuration
process.

If the ibm,configure-connector RTAS routine returns with a “Cannot configure - DR Entity cannot be supported in this connector”, then there is a lack of one or more resources at this connector for this DR Entity and there is at least one DR connector in the system into which this DR Entity can be configured. In this case, the DR program should indicate to the user that they need to consult the appropriate system documentation relative to the DR Entity that they are trying to insert into the system.

The “need more memory” status code is similar in
semantics to the “call again” status. However, on the next
ibm,configure-connector call, the OS will supply, via
the
Memory extent parameter, the address of another page
of memory for RTAS to add to the work area in order for configuration to
continue. On all other calls to
ibm,configure-connector the contents of the
Memory extent parameter should be 0. It is the
responsibility of the OS to recover all work area memory after a sequence
of
ibm,configure-connector calls is completed.

Software Implementation Note: The OS may allocate the work area from contiguous virtual space and pass individual discontiguous real pages to ibm,configure-connector as needed.

If the
ibm,configure-connector RTAS routine returns either
the “next sibling” or “next child” status codes,
configuration has detected an architected OF device tree device, and is
returning its OF device tree node-name. Work Area entry offset 2 contains an offset within the first page of the work area to a NULL-terminated string containing the node-name. Note that if the caller needs to preserve this or any other returned parameters between the various calls of a configuration sequence, it will copy the value to its own area. Also, the first call returning configuration data will have a “next child” status code.

The “next property” status code indicates that a
subsequent property is being returned for the device. Work Area entry
offset 2 contains an offset within the first page of the work area to a
NULL terminated string containing the property name. Work Area entry
offset 3 contains the length of the property value in bytes. Work Area
entry offset 4 contains an offset within the first page of the work area
to the value of the property.

Architecture Note: The ibm,configure-connector RTAS routine returns those applicable properties that can be determined without interpreting any FCode ROM which is associated with the IOA. Additionally, it is permissible for this RTAS call to be aware of various specific IOAs and emulate the action of any FCode associated with the IOA.

If the
ibm,configure-connector RTAS routine returns the
“previous parent” status code, it has come to the end of the
string of siblings, and will back up the tree one level following its
depth first order algorithm. The 2nd through 4th work area entries are
undefined for this status code.

Software Implementation Notes:

Any attempts to configure an already configured connector or one in the process of being configured will produce unpredictable results.

The software will put the DR entity in the full on power state before issuing the ibm,configure-connector RTAS call to configure the DR entity.

For All DR Options - OS Requirements

Visual Indicator States

DR Visual indicator usage will be as indicated in the following
requirement, in order to provide for a consistent user interface across
platforms. Information on implementation dependent aspects of the DR
indicators can be found in
.

R1--1. For all DR options: The visual indicators must be
used as defined in
.

Visual Indicator Usage

State of indicator   Usage
Inactive             The DR connector is inactive and entity may be removed or added without system disruption. For DR entities that require power off at the connector, then the caller of set-indicator must turn power off prior to setting the indicator to this state. See also .
Identify (Locate)    This indicator state is used to allow the user to identify the physical location of the DR connector. This state may map to the same visual state (for example, blink rate) as the Action state, or may map to a different state. See also .
Action               Used to indicate to the user the DR connector on which the user is to perform the current DR operation. This state may map to the same visual state (for example, blink rate) as the Identify state, or may map to a different state. See also .
Active               The DR connector is active and entity removal may disrupt system operation. See also .

Other Requirements

R1--1. For all DR options: The OS must detect hierarchical power domains (as specified in the “power-domains-tree” property) and must handle those properly during a DR operation.

R1--2. For all DR options: When bringing a DR entity online, the OS must issue the following RTAS calls in the following order:

1. If the power domain is not 0, then call set-power-level
2. set-indicator (with the isolation-state token and a state value of unisolate)
3. ibm,configure-connector

R1--3. For all DR options: When taking a DR entity offline, the OS must issue the following RTAS calls in the following order:

1. set-indicator (with the isolation-state token and a state value of isolate)
2. If the power domain is not 0, then call set-power-level

R1--4. When bringing a DR entity online that
utilizes TCEs (see
), the OS must initialize the DR
entity's TCEs.

PCI Hot Plug DR Option

This section will develop the requirements, over and beyond the base DR option requirements, that are unique to being able to perform DR operations on PCI plug-in cards that do not share power domains with other PCI plug-in cards.

PCI Hot Plug DR - Platform Requirements

A method will be provided to isolate the plug-in card (power and
logic signals) and to physically remove the plug-in card from the
machine. The physical removal may pose an interesting mechanical
challenge, due to the position of the card edge connector relative to the
desired direction of insertion of the card from the outside of the
machine. In addition, PCI plug-in cards may have internal cables and may
span multiple slots. Such mechanical issues are not addressed by this
architecture.

This section describes the requirements for the platform when a platform implements the PCI Hot Plug DR option.

R1--1. For the PCI Hot Plug DR option: All platform requirements of the base DR option architecture must be met (
).

R1--2. For the PCI Hot Plug DR option: All PCI requirements
must be met (for example, timing rules, power slew rates, etc.) as
specified in the appropriate PCI specifications, and in the
.

R1--3. For the PCI Hot Plug DR option: The hardware must provide two indicators per PCI Hot Plug slot, and all the following must be true:

One indicator must be green and the platform must use the indicator to indicate the power state of the PCI Hot Plug slot, turning on the indicator when the slot power is turned on and turning off the indicator when the slot power is turned off.

The other indicator must be amber and must be controllable by RTAS, separately from all other indicators, and must be used as a slot Identify indicator, as defined in
.

R1--4. For the PCI Hot Plug DR option:
The hardware must
provide a separate power domain for each PCI Hot Plug slot, controllable
by RTAS, and that power domain must not be used by any other DR connector
in the platform.

R1--5. For the PCI Hot Plug DR option: The hardware must provide the capability to RTAS to be able to read the insertion state of each PCI Hot Plug slot individually and must provide the capability of reading this information independent of the power and isolation status of the plug-in card.

R1--6. For the PCI Hot Plug DR option:
The hardware must
provide individually controllable electrical isolation (disconnect) from
the PCI bus for each PCI Hot Plug slot, controllable by RTAS and this
isolation when set to the isolation mode must protect against errors
being introduced on the bus, and damage to the plug-in cards or planars
during the plug-in card power up, power down, insertion, and
removal.

R1--7. For the PCI Hot Plug option: A platform must prevent the change in frequency of a bus segment (for example, on the insertion or removal of a plug-in card) while that change of frequency would result in improper operation of the system.

R1--8. For the PCI Hot Plug option: For each PCI Hot Plug
slot which will accept only 32-bit (data width) plug-in cards, the
platform must:
Accommodate plug-in cards requiring up to 64 MB of PCI Memory Space and 64 KB of PCI I/O space

For TCE-mapped DMA address space, provide the capability to map simultaneously and at all times at least 128 MB of PCI Memory space for the slot.

R1--9. For the PCI Hot Plug option: For each PCI Hot Plug slot which will accept 64-bit (data width) plug-in cards, the platform must:

Accommodate plug-in cards requiring up to 128 MB of PCI Memory Space and 64 KB of PCI I/O space

For TCE-mapped DMA address space, provide the capability to map simultaneously and at all times at least 256 MB of PCI Memory space for the slot.

R1--10. For the PCI Hot Plug option with PCI Express: The
power and isolation controls must be implemented by use of the PCI
Standard Hot-Plug Controller (see
).

R1--11. For the PCI Hot Plug option with PCI Express: If a PCI Hot Plug DRC contains multiple PEs, then that DRC must be owned by the platform or a trusted platform agent.

Hardware Implementation Notes:

Surge current protection on the planar is one way to provide the required protection against damage to components if an entity is removed from or inserted into a connector with the power still applied to the connector.

Removal of an entity without the proper quiescing operation may result in a system crash.

In order for hot plugging of PCI plug-in cards with the system operational to be useful, a mechanical means is needed in order to be able to remove or insert PCI plug-in cards without shutting off system power and without removing the covers above the plug-in cards (which in general, would require powering-down the system).

It is recommended that the control of the indicators required by Requirement
be via the PCI Standard Hot Plug Controller (see
).

PCI Hot Plug DR - Boot Time Firmware Requirements

R1--1. For the PCI Hot Plug DR option: All OF requirements
of the base DR option architecture must be met (
).

R1--2. For the PCI Hot Plug DR option: The OF must only generate the “clock-frequency” OF property for PCI bridge nodes which cannot change bus clock frequency during a PCI Hot Plug operation.

R1--3. For the PCI Hot Plug DR option: The OF must set the PCI configuration register bits and fields appropriately.

Hardware Implementation Note: The OF should leave
sufficient gaps in the bus numbers when configuring bridges and switches
such that plug-in cards with bridges and switches which are to be
supported by the platform’s DR operations can be plugged into every
slot in the platform in which those plug-in cards are supported. That is, insertion of a plug-in card that contains a bridge or switch into a platform requires that there be sufficient available bus numbers allocated to that PCI bus such that new bus numbers can be assigned to the buses generated by the bridges and switches on the plug-in cards.

PCI Hot Plug DR - Run Time Firmware Requirements

R1--1. For the PCI Hot Plug DR option: All RTAS requirements of the base DR option architecture must be met (
).

R1--2. For the PCI Hot Plug DR option: The set-indicator RTAS call with an indicator type of
isolation-state and a state value of unisolate (1) must not return a
“success” status until any IOA on a plug-in card inserted
into the PCI slot is ready to accept configuration cycles, and must
return a “success” status if the PCI slot is empty.

R1--3. For the PCI Hot Plug DR option: The ibm,configure-connector RTAS call must initialize the PCI configuration registers and platform to the same values as at boot time.

Architecture Note: During a DR replace operation, the
replacement PCI IOA may not get placed back at the same addresses, etc.,
as the original DR entity by the firmware (although it has to be placed
back into the same DR connector, or it is not a DR replace operation). On
a replace operation, the configuration information cannot reliably be
read from the IOA being replaced (the IOA might be broken), so the
firmware cannot read the configuration information from the old IOA and
replace the configuration information into the new IOA.

PCI I/O sub-systems architecturally consist of two classes of devices, bus bridges (Processor Host Bridges (PHBs), PCI to PCI Bridges, and PCI Express switches and bridges) and IOAs. The support that ibm,configure-connector provides for these two classes is different.

For Bus Bridges, firmware will totally configure the bridge so that it can probe down the depth of the tree. For this reason, the firmware must include support for all bridges the platform supports. This includes interrupt controllers as well as miscellaneous unarchitected devices that do not appear in the OF device tree. The properties supported and reported are the same as provided by the boot time firmware.

For PCI plug-in cards, the support is significantly less; it is
essentially the functionality specified in section 2.5 FCode Evaluation
Semantics of the
. However, the configuration
proceeds as if all devices do not have an expansion ROM since the RTAS
code does not attempt to determine if an FCode ROM is present, nor does it
attempt to execute it. This may, in some cases, generate different
device node properties, values and methods than would happen had the IOA
been configured during boot. If the IOA’s device driver or
configuration support cannot deal with such differences, then the IOA is
not dynamically reconfigurable. The other properties generated are
dependent upon the IOA’s configuration header from the following
list. If the property is not on this list the reader should assume that
RTAS
ibm,configure-connector will not generate it. shows what PCI OF properties
can be expected to be returned from the
ibm,configure-connector call for PCI Hot Plug
operations and
shows some which can be
expected to not be returned.

R1--4. For the PCI Hot Plug DR option: The
ibm,configure-connector RTAS call when used for PCI
IOAs must return the properties named in
except as indicated in the
Present?/Source column.

PCI Property Names which will be Generated by ibm,configure-connector

Property Name             Present?/Source
“name”                    Always present.
“vendor-id”               Always present. From PCI header.
“device-id”               Always present. From PCI header.
“revision-id”             Always present. From PCI header.
“class-code”              Always present. From PCI header.
“interrupts”              Only present if Interrupt Pin register not 0.
“min-grant”               Present unless Header Type is 0x01.
“max-latency”             Present unless Header Type is 0x01.
“devsel-speed”            Only present for conventional PCI and PCI-X.
“compatible”              Always present. Constructed from the PCI header information for the IOA or bridge.
“fast-back-to-back”       Only present for conventional PCI and PCI-X when Status Register bit 7 is set.
“subsystem-id”            Only present if “Subsystem ID” register not 0.
“subsystem-vendor-id”     Only present if “Subsystem vendor ID” register not 0.
“66mhz-capable”           Only present for conventional PCI and PCI-X when Status Register bit 5 is set.
“133mhz-capable”          Only present for PCI-X when PCI-X Status Register bit 17 is set.
“266mhz-capable”          Only present for PCI-X when PCI-X Status Register bit 30 is set.
“533mhz-capable”          Only present for PCI-X when PCI-X Status Register bit 31 is set.
“reg”                     Always present. Specifies address requirements.
“assigned-addresses”      Always present. Specifies address assignment.
“ibm,loc-code”            Always present. RTAS will have to remember the location codes associated with all DR connectors so that it can build this property.
“ibm,my-drc-index”        Always present.
“ibm,vpd”                 Always present for sub-systems and for PCI IOAs which follow the PCI VPD proposed standard. See and note to see the effect of using different PCI versions.
“device_type”             For bridges, always present with a value of “PCI” otherwise not present.
“ibm,req#msi”             Present for all PCI Express IOA nodes which are requesting MSI support, when the platform supports MSIs.

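
To make the Present?/Source conditions concrete, the sketch below derives a subset of the expected property names from a few PCI header fields. It is illustrative only and covers just the rows that depend on the Header Type, Interrupt Pin, Status, and Subsystem registers; the function name expected_properties and its parameters are hypothetical, and header_type is assumed to be passed without the multi-function bit.

```python
def expected_properties(header_type, interrupt_pin, status,
                        subsystem_id=0, subsystem_vendor_id=0,
                        conventional_or_pcix=True):
    """Return the set of property names expected for one PCI function.

    Only a subset of the table rows is modeled. header_type is assumed
    to be the Header Type value with the multi-function bit masked off.
    """
    props = {
        "name", "vendor-id", "device-id", "revision-id", "class-code",
        "compatible", "reg", "assigned-addresses",
        "ibm,loc-code", "ibm,my-drc-index",
    }
    if interrupt_pin != 0:
        props.add("interrupts")
    if header_type != 0x01:
        props.update({"min-grant", "max-latency"})
    if conventional_or_pcix:
        props.add("devsel-speed")
        if status & (1 << 7):       # Status Register bit 7
            props.add("fast-back-to-back")
        if status & (1 << 5):       # Status Register bit 5
            props.add("66mhz-capable")
    if subsystem_id != 0:
        props.add("subsystem-id")
    if subsystem_vendor_id != 0:
        props.add("subsystem-vendor-id")
    return props
```

For instance, a function with Header Type 0x01 and an Interrupt Pin of 0 would yield neither “min-grant”, “max-latency”, nor “interrupts” under this sketch.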
is a non-exhaustive list of
common properties that may not be generated by RTAS
ibm,configure-connector for a PCI IOA. Also, the
concept of a phandle does not apply to nodes reported by
ibm,configure-connector.

Non-exhaustive list of PCI properties that may not be generated by ibm,configure-connector

Property Name                Present?/Source
“ibm,connector-type”         Never present -- only for built-in entries not for pluggable ones.
“ibm,wrap-plug-pn”           Never present -- only for built-in entries not for pluggable ones.
“alternate-reg”              Never present -- needs FCode.
“fcode-rom-offset”           Never present -- RTAS does not look for this.
“wide”                       Never present -- needs FCode.
“model”                      Never present -- needs FCode.
“supported-network-types”    Never present -- needs FCode.
“address-bits”               Never present -- needs FCode.
“max-frame-size”             Never present -- needs FCode.
“local-mac-address”          Never present -- needs FCode.
“mac-address”                Never present -- needs FCode.
“built-in”                   Not present for PCI Hot Plug connectors.

Architecture Note: Without
“device_type” and other properties, the
OS cannot append an IOA added via DR to the boot list for use during the
next boot.

R1--5. For the PCI Hot Plug option: When the ibm,configure-connector RTAS call returns to the caller, if the device driver(s) for any IOA(s) configured as part of the call are EEH unaware (that is, may produce data integrity exposures due to an EEH stopped state) or if they may be EEH unaware, then the ibm,configure-connector call must disable EEH prior to returning to the caller.

Software Implementation Note: To be EEH aware, a
device driver does not need to be able to recover from an EEH stopped
state, only recognize the all-1's condition and not use data from
operations that may have occurred since the last all-1's checkpoint. In
addition, the device driver under such failure circumstances needs to
turn off interrupts (using the
ibm,set-int-off RTAS call) in order to make sure that
any (unserviceable) interrupts from the IOA do not affect the system.
Note that this is the same device driver support needed to protect
against an IOA dying or against a no-DEVSEL type error (which may or may
not be the result of an IOA that has died). Note that if all-1’s
data may be valid, the
ibm,read-slot-reset-state2 RTAS call should be used
to discover the true EEH state of the device.

PCI Hot Plug DR - OS Requirements

R1--1. For the PCI Hot Plug DR option: All OS requirements of the base DR option architecture must be met (
).

Logical Resource Dynamic Reconfiguration (LRDR)

The Logical Resource Dynamic Reconfiguration option allows a platform
to make available and recover platform resources such as CPUs, Memory
Regions, Processor Host Bridges, and I/O slots to/from its operating OS
image(s). The Logical Resource Dynamic Reconfiguration option provides the
means for providing capacity on demand to the running OS and provides the
capability for the platform to make available spare parts (for example,
CPUs) to replace failing ones (called
sparing operations). Combined with the LPAR option,
platforms can move resources between partitions without rebooting the
partitions’ OS images.

The Logical Resource Dynamic Reconfiguration (LRDR) option deals with
logical rather than physical resources. These logical resources are already
physically installed (dynamic installation/removal of these resources, if
supported, is managed via the Hardware Management Console (HMC) or Service
Focal Point (SFP)). As such, the OS does not manage either connector power
or DR visual indicators. Logical connector power domains are specified as
“hot pluggable” (value -1) and DR visual indicators are not
defined for logical connectors.

The device tree contains logical resource DR connectors for the
maximum number of resources that the platform can allocate to the specific
OS. In some cases such as for processors and PHBs, this may be the maximum
number of these resources that the platform supports even if there are
fewer than that currently installed. In other cases, such as memory regions
in an LPARed system, the number may be limited to the amount of memory that
can be supported without resizing the CPU page frame table. The OS may use
the
get-sensor-state RTAS call with the dr-entity-sense
token to determine if a given drc-index refers to a connector that is
currently usable for DR operations. If the connector is not currently
usable the return state is “DR entity unusable” (2). A
set-indicator (isolation state) RTAS call to an
unusable connector or (dr-indicator) to any logical resource connector
results in a “No such indicator implemented” return
status.

Two allocation models are supported. In the first, resources are
specifically assigned to one and only one partition at a time by the HMC.
In this model, a DR entity state is changed from unusable to usable only by
firmware in response to HMC requests to explicitly move the allocation of
the resource between partitions. In the second model, certain resources may
“float” between cooperating partitions. A partition issues a
set-indicator (allocation state usable) RTAS call and,
if the resource is free, the firmware assigns the resource to the
requesting partition and returns the success status. The
set-indicator call returns the code
“no-such-indicator” if either the resource is not free or the
platform is operating in the first model. To return a resource to the
platform firmware, the OS issues a
set-indicator (allocation state unusable) RTAS call for
the resource’s DR connector.

Platform Requirements for LRDR

The following requirements apply to the hardware and/or firmware as
a result of implementing LRDR on a platform.

R1--1. For the LRDR option: The hardware must provide the
capability to power-cycle any hardware that is going to be switched
between partitions as part of LRDR, if that hardware requires
power-cycling to put the hardware into a known state (for example, PCI
IOAs).

Architecture Note: Except for PCI Express IOAs that
implement the Function Level Reset (FLR) option, the PCI
architecture is not specific as to the state of the IOA when the IOA's
reset is activated and deactivated. Therefore, either the platform designer
needs to guarantee that all logic in all IOAs (including any internal
storage associated with the IOA) is cleared to a known state by use of
the IOA's reset, or else the platform needs to provide the capability
to power-cycle those IOAs, including the integrated ones (that is,
including the non-pluggable ones). Also note that hardware which requires
power-cycling to initialize may impact the capability to reliably reboot
an OS, independent of whether or not LRDR is implemented.

R1--2. For the LRDR option:
Any power-cycling
of the hardware which is done by the platform during an LRDR operation
(for example, as part of an ibm,configure-connector operation), must be
functionally transparent to the software, except that PCI plug-in cards
that are plugged into a PCI Hot Plug DR connector do not need to be
powered on before the
ibm,configure-connector call for a logical SLOT DR
connector returns to the caller.

Architecture Note: PCI plug-in cards that are plugged
into a DR connector will not be configured as part of an
ibm,configure-connector operation on a logical DR connector of type SLOT
above the plug-in card (see section 17.6.3.3 ibm,configure-connector).
However, Requirement
does require a PCI IOA which is
not plugged in to a PCI Hot Plug DR connector (for example, soldered on
the planar) be powered up and configured as a result of an
ibm,configure-connector operation on a logical DR connector of type SLOT
above such an IOA, and requires this powering up to be functionally
transparent to the caller of ibm,configure-connector operation (a longer
busy time is not considered to be a violation of the functional
transparency requirement).

DR Properties for Logical Resources

Logical resource dynamic reconfiguration is a special case of
general DR; therefore, certain DR properties take on special
values.
Table: DR Property Values for Logical Resources

Property Name | Property Value
“ibm,drc-indexes” | As defined in .
“ibm,my-drc-index” | As defined in .
“ibm,drc-names” | As defined in . Note: This name allows for correlation between the OS and HMC user interfaces.
“ibm,drc-power-domains” | Logical Resource connectors are defined to be “hot pluggable” having a domain value of -1 per definition in .
“ibm,drc-types” | Shall be one of the values “CPU”, “MEM”, “PHB”, or “SLOT” as defined in .
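
The property values in the table above lend themselves to a simple consistency check. The following is a minimal sketch, assuming a hypothetical in-memory model in which each DR connector is one record combining its entries from the four “ibm,drc-*” properties; the class and function names are illustrative only and are not part of this architecture.

```python
from dataclasses import dataclass
from typing import List

# Values taken from the table above.
LOGICAL_DRC_TYPES = {"CPU", "MEM", "PHB", "SLOT"}
HOT_PLUGGABLE_POWER_DOMAIN = -1


@dataclass(frozen=True)
class DrConnector:
    """Hypothetical record: one connector's entries from the ibm,drc-* properties."""
    drc_index: int
    drc_name: str
    drc_type: str
    power_domain: int


def logical_connector_problems(conn: DrConnector) -> List[str]:
    """Return a list of deviations from the logical-resource property values."""
    problems = []
    if conn.drc_type not in LOGICAL_DRC_TYPES:
        problems.append(f"{conn.drc_name}: unexpected drc-type {conn.drc_type!r}")
    if conn.power_domain != HOT_PLUGGABLE_POWER_DOMAIN:
        problems.append(
            f"{conn.drc_name}: power domain {conn.power_domain}, expected -1"
        )
    return problems
```

An empty list means the record is consistent with the table; the index and name values themselves are not checked, because their definitions lie outside this section.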
R1--1. For the LRDR option: All platform requirements of the
base DR option architecture must be met (
).

R1--2. For the LRDR option: The /cpus OF device tree node must include
“ibm,drc-types” (of type CPU), “ibm,drc-power-domains” (of value -1),
“ibm,drc-names”, and “ibm,drc-indexes” properties with entries for each
potentially supported dynamically reconfigurable processor.

R1--3. For the LRDR option: The root node of the OF device tree must include
“ibm,drc-types” (of type MEM), “ibm,drc-power-domains” (of value -1),
“ibm,drc-names”, and “ibm,drc-indexes” properties with entries for each
potentially supported dynamically reconfigurable memory region.

R1--4. For the LRDR option: The root node of the OF device
tree must not include any drc properties (
“ibm,drc-*”) for the base memory region
(reg value 0).

R1--5. For the LRDR option: The root node of the OF device tree must include
“ibm,drc-types” (of type PHB), “ibm,drc-power-domains” (of value -1),
“ibm,drc-names”, and “ibm,drc-indexes” properties with entries for each
potentially supported dynamically reconfigurable PHB.

R1--6. For the LRDR option: The /pci OF device tree node representing a PHB
must include “ibm,drc-types” (of type SLOT), “ibm,drc-power-domains”
(of value -1), “ibm,drc-names”, and “ibm,drc-indexes” properties with
entries for each potentially supported dynamically reconfigurable PCI SLOT.

R1--7. For the LRDR option: Platforms must implement the
allocation-state indicator 9003, as defined in
.

R1--8. For the LRDR option: For memory LRDR, the
“ibm,lrdr-capacity” property must be
included in the
/rtas node of the partition device tree (see
).

Architectural Intent -- Logical DR Sequences:

This architecture is designed to support the logical DR sequences
specified in the following sections. See also
.

Acquire Logical Resource from Resource Pool

1. The OS responds to some stimuli (command, workload manager, HMC,
   etc.) to acquire the resource, perhaps using the
   “ibm,drc-names” value as a reference if a
   human interface is involved.
2. The OS determines if the resource is usable:
   a. The OS uses get-sensor-state (dr-entity-sense) to determine the
      state of the DR connector.
   b. If the state is “unusable”, the OS issues
      set-indicator (allocation-state, usable) to attempt
      to allocate the resource. Similarly, if the state is “available for
      exchange”, the OS issues
      set-indicator (allocation-state, exchange) to attempt
      to allocate the resource, and if the state is “available for
      recovery”, the OS issues
      set-indicator (allocation-state, recover) to attempt
      to allocate the resource.
   c. If successful, continue; otherwise, return error status to the
      requester. If successful, this is the point where the resource is
      allocated to the OS.
   d. Continue with DR operation.
3. The OS unisolates the resource via
   set-indicator (isolation-state, unisolate). This is
   the point where the OS takes ownership of the resource from the platform
   firmware and the firmware removes the resource from its resource
   pool.
4. The OS configures the resource using the
   ibm,configure-connector RTAS call.
5. The OS incorporates the resource into its resource pool.
   a. If the resource is a processor, the OS must use the
      start-cpu RTAS call to move the processor from the
      stopped state (at the end of the
      ibm,configure-connector) to the running
      state.
6. The OS returns status of operation to the requester.
7. The OS notifies the requesting entity of the OS state relative to
   the resource acquisition.

Release Logical Resource

1. Some entity (System administrator commanding from the HMC, a
   workload manager, etc.) requests the OS to release the resource using the
   “ibm,drc-names” value as a
   reference.
2. The OS attempts to stop using the logical resource.
   a. If the resource is a processor, the OS calls the
      stop-self RTAS call then waits for the processor to
      enter the stopped state using the RTAS
      query-cpu-stopped-state call.
3. The OS isolates the resource via
   set-indicator (isolation-state, isolate).
4. Unless the isolated resource was the partition’s last
   processor, the OS deallocates the resource via
   set-indicator (allocation-state, unusable). This is
   the point where the platform firmware takes ownership of the resource
   from the OS. That is, the OS removes the resource from its resource pool
   and the firmware adds it to the firmware resource pool.
5. The OS returns status of operation to the requester.
6. The OS unallocates the resource by
   set-indicator (allocation-state, unusable).
7. The system administrator may command the HMC to allocate the
   logical resource to another partition (LPAR) or reserved pool
   (COD).
8. Any needed hardware removal is handled by HMC/SPC.

RTAS Call Semantics/Restrictions

This section describes the unique application of DR RTAS functions
to the dynamic reconfiguration of logical resources.

set-indicator (isolation-state, isolate)

Dynamic reconfiguration of logical resources introduces special
meaning and restrictions to the DR connector isolation function depending
upon the logical resource being isolated.

Isolation of CPUs

The isolation of a CPU, in all cases, is preceded by the
stop-self RTAS function for all processor threads,
and the OS ensures that all the CPU’s threads are in the RTAS
stopped state prior to isolating the CPU. Isolation of a processor that
is not stopped produces unpredictable results. The stopping of the last
processor thread of an LPAR partition effectively kills the partition, and
at that point, ownership of all partition resources reverts to the
platform firmware.

R1--1. For the LRDR option: Prior to issuing the RTAS
set-indicator specifying isolate isolation-state of a
CPU DR connector type, all the CPU threads must be in the RTAS stopped
state.

R1--2. For the LRDR option: Stopping of the last processor
thread of an LPAR partition with the
stop-self RTAS function must kill the partition,
with ownership of all partition resources reverting to the platform
firmware.

Isolation of MEM Regions

Isolation of a MEM region creates a paradox if the MEM region being
isolated contains the calling program (there being no program left for
the firmware to return to).

Note: The base memory region (starting at address
zero) is not associated with a MEM DR connector. This means that the base
memory region cannot be isolated. This restriction avoids two fatal
conditions: attempts to isolate the region containing RTAS, and attempts
to isolate the region containing the interrupt vectors.

It is the responsibility of the OS to unmap the addresses of the
MEM region being isolated from both the PFT and the TCE tables. When the LRDR
option is combined with the LPAR option, the hypervisor ensures that the
addresses of the MEM region being isolated are unmapped from both the PFT
and TCE tables before successfully completing the isolation of the MEM
region. If any valid mappings are found, the RTAS
set-indicator (isolation-state) does not change the
isolation-state and returns with a
Status-9001 (Valid outstanding translation).

R1--1. For the LRDR option: The caller of the RTAS
set-indicator specifying isolate isolation-state of a
MEM DR connector type must not be within the region being
isolated.

R1--2. For the LRDR option combined with the LPAR option:
The RTAS set-indicator specifying isolate isolation-state of a
MEM DR connector type must check that the region is unmapped from both
the partition’s Page Frame Table(s) and any Translation Control
Entries that would reference the memory, else the RTAS routine must
return with a status of
Status-9001 (Valid outstanding translation) and the
isolation-state is not changed.

Implementation Note: The algorithm chosen for
implementing Requirement
depends upon the expected
frequency of isolation events. For RAS reasons, they should be seldom.
For load balancing, they may be far more frequent. These methods are
briefly described here:

First pull the corresponding logical address from the
partition’s valid space so that setting new translations to the logical
address is not possible. Then wait for any translation additions currently
in flight to complete. Finally, either scan the entire PFT and TCE
tables looking for valid translations or check a use count for the
particular logical address range. The PFT/TCE table search may be long;
however, it is only done at isolation time.

The use count method requires that a use count, accessed based upon the
physical real address of the memory block, be maintained on each addition
and removal of an address translation.

Isolation of PHBs and Slots

An isolation of a PHB naturally disconnects the OS image from any
of the DR connectors downstream of the PHB (specifically any I/O slots
and PCI Hot Plug connectors associated with the PHB). To avoid the
complexity of gracefully managing multi-level isolation, isolation is
restricted to only “leaf” DR connectors, that is, connectors
that have no unisolated or usable DR connectors below them. Specifically, for
logical DR connectors below the connector being isolated, a
get-sensor-state dr-entity-sense needs to return
unusable (2), and for physical DR connectors below the connector being
isolated, the DR entity needs to be isolated first via
set-indicator (isolation-state, isolate). The OS is
responsible for removing all virtual address mappings to the address
range associated with a logical I/O SLOT before making the RTAS
set-indicator (isolation-state) call that isolates
the SLOT. When the LRDR option is combined with the LPAR option, the
hypervisor ensures that the addresses associated with the logical SLOT
being isolated are unmapped from both the PFT and TCE tables before
successfully completing the isolation of the SLOT connector. If any valid
mappings are found, the RTAS
set-indicator (isolation-state) does not change the
isolation-state and returns with a
Status-9001 (Valid outstanding translation).

R1--1. For all LRDR options: If a request to
set-indicator (isolation-state, isolate) would result
in the isolation of one or more other DR connectors which are currently
unisolated or usable, then the
set-indicator RTAS must fail with a return code of
“Multi-level isolation error” (-9000).

R1--2. For the LRDR option combined with the LPAR
option: The RTAS
set-indicator specifying isolate isolation-state of a
SLOT DR connector type must check that the IOA address range associated
with the slot is unmapped from both the partition’s Page Frame
Table(s) and any Translation Control Entries that would reference those
locations, else the RTAS routine must return with a
Status-9001 (Valid outstanding translation) and the
isolation-state is not changed.

set-indicator (dr-indicator)

Logical connectors do not have associated dr-indicators (token
value 9002). An attempt to set the state of such an indicator results in
a “No such indicator implemented” return status.
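
The set-indicator outcomes described in this section for logical connectors can be summarized in a short sketch. It is an illustrative model only, not firmware behavior: the class, field, and function names are hypothetical, the token value 9001 for isolation-state, the sense value 1 for a usable entity, the success status 0, and the reading of “Status-9001” as -9001 are assumptions, and the order in which the two isolation checks are made is arbitrary because the text does not specify one.

```python
from dataclasses import dataclass, field
from typing import List

ISOLATION_STATE = 9001   # assumed token value for isolation-state
DR_INDICATOR = 9002      # token value given in the text

DR_ENTITY_USABLE = 1     # assumed dr-entity-sense value
DR_ENTITY_UNUSABLE = 2   # value given in the text

SUCCESS = 0                              # assumed
NO_SUCH_INDICATOR = -3                   # given in the text
MULTI_LEVEL_ISOLATION_ERROR = -9000      # given in the text
VALID_OUTSTANDING_TRANSLATION = -9001    # assumed reading of "Status-9001"


@dataclass
class Connector:
    """Hypothetical model of a DR connector in the tree."""
    sense: int = DR_ENTITY_USABLE
    physical: bool = False
    isolated: bool = False
    mapped: bool = False   # valid PFT/TCE translations still present
    children: List["Connector"] = field(default_factory=list)


def blocks_parent_isolation(child: Connector) -> bool:
    """A physical child must be isolated; a logical child must be unusable."""
    if child.physical:
        return not child.isolated
    return child.sense != DR_ENTITY_UNUSABLE


def set_indicator_isolate(conn: Connector, token: int) -> int:
    """Model an isolate request (or any dr-indicator request) on a logical connector."""
    if token == DR_INDICATOR:
        return NO_SUCH_INDICATOR            # logical connectors have no dr-indicator
    if token != ISOLATION_STATE:
        raise ValueError("token outside the scope of this sketch")
    if conn.sense == DR_ENTITY_UNUSABLE:
        return NO_SUCH_INDICATOR            # unusable connector
    if any(blocks_parent_isolation(c) for c in conn.children):
        return MULTI_LEVEL_ISOLATION_ERROR  # only "leaf" connectors may be isolated
    if conn.mapped:
        return VALID_OUTSTANDING_TRANSLATION  # isolation-state is left unchanged
    conn.isolated = True
    return SUCCESS
```

The translation check models the case where the LRDR option is combined with the LPAR option; without the hypervisor, unmapping remains the responsibility of the OS.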
R1--1. For all LRDR options: The calling of
set-indicator with a token value of 9002
(dr-indicator) and an index representing a logical connector must fail
with a return code of “No such indicator implemented”
(-3).

ibm,configure-connector

The
ibm,configure-connector RTAS call is used to return
to the OS the device tree nodes and properties associated with the newly
un-isolated logical resources and configure them for use.

The
ibm,configure-connector RTAS call used against a
logical DR connector can encounter other logical DR connectors or
physical DR connectors below it in the tree. If a logical connector is
encountered below a logical connector that is being configured, the
ibm,configure-connector RTAS call will not configure
the sub-tree if it is not owned by the OS (where owned refers to a DR
connector that would return “DR entity usable” for a
get-sensor-state dr-entity-sense sensor). If a physical
connector is encountered, then the sub-tree below the physical connector
may or may not be configured, depending on the implementation.

Architecture Note: The requirements of this section
specify the minimum sub-tree contents returned for various connector
types. Implementations may optionally return other valid previously
reported nodes that represent the current configuration of the device
tree. Previously reported nodes may not have any changes from their
previously reported state. A node that was removed from the configuration
due to a DR operation and returns due to a subsequent DR operation is not
considered to have been previously reported. It is the caller's
responsibility to recognize previously reported nodes.

R1--1. For all LRDR options: If a request to
ibm,configure-connector specifies a connector that is
isolated,
ibm,configure-connector must immediately return
configuration complete.

R1--2. For all LRDR options: If the connector index refers
to a connector that would return a “DR entity unusable”
status (2), “DR entity available for exchange” status (3), or
“DR entity available for recovery” status (4) to the
get-sensor-state dr-entity-sense token, the
ibm,configure-connector RTAS call must return
“-9003: Cannot configure - Logical DR connector unusable, available
for exchange, or available for recovery” on the first call without
any configuration action taken on the DR connector.

R1--3. For all LRDR options: If a request to
ibm,configure-connector specifies a connector of type
CPU,
the returned sub-tree must consist of the specific
cpu-node, its children, and any referenced nodes that had not been
previously reported (such as L2 and L3 caches), all containing the
properties as would be contained in those nodes had they been available
at boot time.

Implementation Note: Future platforms that support
concurrent maintenance of caches will require that high level cache
nodes (L2, L3, etc.) be added by
ibm,configure-connector such that their properties
can change as new/repaired hardware is added to the platform. Therefore,
it is the OS's responsibility when isolating a CPU to purge any
information it may have regarding an orphaned high level cache node. The
OS may use the
“ibm,phandle” property to selectively
remove caches when a processor is removed. The platform considers any
high level cache that is newly referenced (reference count for this
partition goes from 0 to 1) to have been previously unreported.

R1--4. For all LRDR options: If a request to
ibm,configure-connector specifies a connector of type
MEM,
the returned sub-tree must consist of the specific
ibm,memory-region node containing the properties as
would be contained in that node had it been available at boot
time.

R1--5. For all LRDR options: If a request to
ibm,configure-connector specifies a connector of type
PHB or SLOT, then all of the following must be true:

a. The returned values must represent the sub-tree for the specific
I/O sub-system represented by the connector, except for entities below
any DR connectors (logical or physical) which are below the connector
which is the target of the ibm,configure-connector operation (that is,
the ibm,configure-connector operation stops at any DR connector).

b. The sub-tree must consist of the specific node and its children
all containing the properties as would be contained in those nodes had
they been available at boot time, including (if they exist) built-in PCI
IOAs.

R1--6. For all LRDR options: If a request to
ibm,configure-connector specifies a connector of type
SLOT,
the returned values must represent the sub-tree for
the specific I/O sub-system represented by the SLOT connector, and the
sub-tree must consist of the specific
/pci node and its children all containing the
properties as would be contained in those nodes had they been available
at boot time, except for the PCI IOA nodes assigned to the OS image that
contain the same properties as they would following a PCI hot plug
operation (see
).

R1--7. For all LRDR options: If a platform implementation
powers up and configures physical DR entities in the sub-tree under a
logical DR connector, then a request to
ibm,configure-connector of the logical DR connector
must use the return status of 990x from the
ibm,configure-connector call, as necessary, during
the DR entity power-up sequence(s) and must control any power-up and
sequencing requirements, as would be done by the platform during platform
power-up.
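
The first-call rules for ibm,configure-connector on a logical connector, and the caller's handling of a 990x status, can be sketched as follows. This is a simplified model under stated assumptions, not the RTAS interface itself: the function names are hypothetical, the value 0 for “configuration complete” is assumed, the interpretation of 990x as a suggested wait of 10^x milliseconds (x from 0 to 5) is an assumption not stated in this section, and the usability check is placed before the isolation check only because the text does not give an order.

```python
from typing import Callable, Optional

CONFIGURATION_COMPLETE = 0   # assumed status value
CANNOT_CONFIGURE = -9003     # status given in the text

# dr-entity-sense values given in the text:
# unusable (2), available for exchange (3), available for recovery (4).
NOT_CONFIGURABLE_SENSE = {2, 3, 4}


def first_call_status(sense: int, isolated: bool) -> Optional[int]:
    """Status forced on the first call, or None if configuration proceeds."""
    if sense in NOT_CONFIGURABLE_SENSE:
        return CANNOT_CONFIGURE          # no configuration action is taken
    if isolated:
        return CONFIGURATION_COMPLETE    # returned immediately
    return None


def extended_delay_ms(status: int) -> Optional[int]:
    """Suggested wait for a 990x status (assumed to mean 10**x milliseconds)."""
    if 9900 <= status <= 9905:
        return 10 ** (status - 9900)
    return None


def configure_until_done(call_once: Callable[[], int],
                         sleep_ms: Callable[[int], None]) -> int:
    """Repeat a hypothetical configure call while it reports a 990x status."""
    while True:
        status = call_once()
        delay = extended_delay_ms(status)
        if delay is None:
            return status
        sleep_ms(delay)
```

A real caller also has to handle the statuses that walk the returned sub-tree, which are outside the scope of this section and are therefore omitted from the sketch.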