I/O Devices
This chapter describes requirements for IOAs. It adds detail to areas
of the PCI architectures (conventional PCI, PCI-X and PCI Express) that are
either unaddressed or optional. It also places some requirements on firmware
and the OS for IOA support. It provides references to specifications with which
IOAs must comply and gives design notes for IOAs that run on LoPAR systems.
PCI IOAs
R1--1.
All PCI IOAs must be capable of decoding and
generating either a full 32-bit address or a full 64-bit address.
R1--2.
IOAs that implement conventional PCI must be compliant with the
most recent version of the PCI Local Bus Specification at the
time of their design, including any approved Engineering Change Requests (ECRs)
against that document.
R1--3.
IOAs that implement PCI-X must be compliant
with the most recent version of the PCI-X Protocol Addendum to the PCI Local Bus Specification
at the time of their design, including any approved Engineering Change Requests
(ECRs) against that document.
R1--4.
IOAs that implement PCI Express must be
compliant with the most recent version of the PCI Express Base Specification at the time of their design, including any
approved Engineering Change Requests (ECRs) against that document.
Architecture Note: Revision 2.1 and later of
the PCI Local Bus Specification requires PCI masters
that receive a Retry target termination to unconditionally repeat the same
request until it completes. The master may perform other bus transactions, but
cannot require those to complete before repeating the original transaction
that was previously target terminated with Retry. Revision 2.1 of the
specification (page 49) also includes an example which describes how the
requirement above applies to a multi-function IOA. See pages 48-49 of the 2.1
revision of the PCI Local Bus Specification for more
detail. Revision 2.0 of the PCI Local Bus Specification
includes a definition of target termination via Retry, but does not spell out
the requirement described above for masters, as the 2.1 revision of the
specification does. Masters that are designed to revision 2.0 of the
specification and that perform other transactions following target termination
with Retry may cause live-locks and/or deadlocks when installed in a system that
uses bridges (host bridge or PCI-PCI bridges) that implement Retry, delayed
transactions, and/or TCEs, if those masters require the following transactions
to complete before the original transaction that was terminated with the target
Retry. This revision 2.0 to revision 2.1 compatibility problem has been
observed on several IOAs that have asked for deviations to the conventional PCI
compliance requirement above. Wording was added to revision 2.2 of
the PCI Local Bus Specification which makes a statement
similar to this Architecture Note.
Resource Locking
R1--1.
PCI IOAs, excepting bridges, must not depend on the PCI LOCK#
signal for correct operation nor require any other PCI IOA to assert LOCK# for
correct operation.
There are some legacy IOAs on legacy buses which require LOCK#.
Additionally, LOCK# is used in some implementations to resolve deadlocks
between bridges under a single PHB. These uses of LOCK# are permitted.
PCI Expansion ROMs
R1--1.
PCI expansion ROMs must have a ROM image
with a code type of 1 for OF as provided in the PCI Local Bus Specification. This ROM image must abide by the ROM image
format for OF as documented in the PCI bus binding to OF.
LoPAR systems rely on OF - not BIOS - to boot. This is why strong
requirements for OF device support are made.
Vital Product Data (VPD) is an optional feature for PCI adapters
and it is strongly recommended that VPD be included in all PCI expansion ROMs.
If it is put in the PCI expansion ROM in accordance with Revision 2.1 of the PCI Local Bus Specification, VPD will be reported in the OF device
tree. If the VPD information is formatted as defined in Revision 2.2 with the
new capabilities feature, or in any other format, firmware will not read the
VPD, and the device driver for the IOA will have to reformat any provided VPD
into an OS specified format. It is still required that the keywords and their
values must conform to those specified by either PCI 2.1 or PCI 2.2, no matter
how they are formatted. Refer to Requirement .
Assignment of Interrupts to PCI IOAs
R1--1.
All PCI IOAs must use the PowerPC
interrupt controller, except when made transparent to the OS by the platform
through the architected hcall()s.
R1--2.
PCI IOAs that do not reside in the
Peripheral Memory Space and Peripheral I/O Space of the same PHB must not share
the same LSI source.
For further information on the interrupt controller refer to .
It is strongly advised that system board designers assign one
interrupt for each interrupt source. Additionally, multi-function PCI IOAs
should have multiple interrupt sources. For restrictions on sharing interrupts
with the LPAR option, see Requirement .
For restrictions on sharing MSIs, see Requirement and Requirement .
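The LSI sharing restriction above can be checked mechanically when interrupt assignments are planned. The following is a minimal sketch only; the function name and the (IOA, PHB, LSI source) tuple layout are hypothetical and are not part of this architecture.

```python
def lsi_sharing_violations(assignments):
    """Return LSI sources shared by IOAs that sit under different PHBs.

    assignments: hypothetical list of (ioa_name, phb_name, lsi_source) tuples.
    """
    phbs_by_source = {}
    for _ioa, phb, source in assignments:
        # Collect the set of PHBs that use each LSI source.
        phbs_by_source.setdefault(source, set()).add(phb)
    # A source used under more than one PHB breaks the restriction.
    return sorted(s for s, phbs in phbs_by_source.items() if len(phbs) > 1)
```

Sharing within one PHB is not reported by this sketch, although assigning one interrupt per source remains the strongly advised design.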
PCI-PCI Bridge Devices
R1--1.
Firmware must initialize all PCI-to-PCI
bridges. See .
All bridges and switches are required to comply with the bus
specification(s) of the buses to which they are attached. See Requirement .
Graphics Controller and Monitor Requirements for Clients
The graphics requirements for servers are different from those for
portable and personal systems.
R1--1.
Plug-in graphics controllers for portable
and personal platforms must provide graphics mode sets in the OF PCI expansion
ROM image in accordance with the
.
Portable and personal platforms are strongly urged to support some
mechanism which allows the platform to electronically sense the display
capabilities of monitors.
For graphics controllers that are placed on the system board, the
graphics mode sets can be put in system ROM. The mode set software put in the
system ROM in this case would be FCode and would be largely or entirely the
same as the FCode that would be in the PCI expansion ROM if the same graphics
controller were put on a plug-in PCI card.
PCI Plug-in Graphics Cards
R1--1.
(Requirement Number Reserved For Compatibility)
R1--2.
PCI plug-in graphics cards which are
going to be the primary display IOA during the time prior to the OS device
driver being loaded must contain an OF display driver on the IOA.
PCI Cache Support Protocol
The PCI architecture allows for the optional implementation of
caching of data. This architecture basically assumes that the data in I/O
memory is non-coherent. As such, platforms are not required to implement the
optional PCI Cache Support protocol using the SBO# and SDONE signals.
Therefore, IOAs used in LoPAR platforms should not rely on those signals for
proper operation.
R1--1.
IOAs used in LoPAR platforms
and their device drivers must not require the use of the PCI signals SBO# and
SDONE for proper operations.
PCI Configuration Space for IOAs
There are several writable fields in the PCI Configuration Header.
Some of these are written by the firmware and should never be changed by the
device driver.
R1--1.
All registers and bits in the PCI Configuration Header must be
set to a platform specific value by firmware and preserved by software, except
that software is responsible for setting the configuration space as indicated
in the table below.
Software Programming of PCI Configuration Header Registers

| Register Name | Bit Name | Software Action |
|---|---|---|
| Command | Bus Master | Must write to a 1 before the first DMA operation after a reset. Must write to a 0 before unconfiguring device driver. |
| Command | Memory Space | Must write to a 1 before the first MMIO operation to IOA’s memory space (if any) after a reset. Must write to a 0 before unconfiguring device driver. |
| Command | IO Space | Must write to a 1 before the first MMIO operation to IOA’s I/O space (if any) after a reset. Must write to a 0 before unconfiguring device driver. |
| Command | all other bits | Must restore to previous value after any reset operation (for example, via ibm,set-slot-reset Function 1 or 3). The ibm,configure-bridge RTAS call is available to assist in restoring values, where appropriate. |
| Built in Self Test (BIST) | all | If implemented, software may use if desired. |
| All other PCI header registers that may be modified by firmware after initial reset or by ibm,configure-connector for DR operations | all | Must restore to previous value after any reset operation (for example, via ibm,set-slot-reset Function 1). The ibm,configure-bridge RTAS call is available to assist in configuring PCI bridges and switches, where appropriate. |
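The Command register rules above amount to a read-modify-write that touches only the three driver-owned enable bits. The sketch below illustrates this under stated assumptions: the bit positions are those of the PCI Command register (I/O Space is bit 0, Memory Space is bit 1, Bus Master is bit 2), while the helper names are hypothetical and the actual configuration read and write calls are left to the platform.

```python
# Bit positions in the PCI Command register.
CMD_IO_SPACE = 1 << 0
CMD_MEMORY_SPACE = 1 << 1
CMD_BUS_MASTER = 1 << 2
DRIVER_OWNED = CMD_IO_SPACE | CMD_MEMORY_SPACE | CMD_BUS_MASTER


def enable_for_use(command, uses_dma=True, uses_mem=True, uses_io=False):
    """Return a new Command value with the needed driver-owned bits set.

    All other bits pass through unchanged, because firmware owns them.
    """
    wanted = 0
    if uses_dma:
        wanted |= CMD_BUS_MASTER
    if uses_mem:
        wanted |= CMD_MEMORY_SPACE
    if uses_io:
        wanted |= CMD_IO_SPACE
    return (command | wanted) & 0xFFFF


def disable_for_unconfigure(command):
    """Clear the three driver-owned bits before the driver is unconfigured."""
    return command & ~DRIVER_OWNED & 0xFFFF


def restore_after_reset(saved, current):
    """Rebuild the firmware-owned bits from the value saved before a reset.

    The driver-owned bits keep whatever the register currently holds.
    """
    return ((saved & ~DRIVER_OWNED) | (current & DRIVER_OWNED)) & 0xFFFF
```

A driver would read the Command register, pass the value through one of these helpers, and write the result back, so that firmware-set bits are never disturbed.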
R1--2.
All IOAs that implement PCI-X Mode 2 or PCI Express must supply
the “ibm,pci-config-space-type” property
(see ).
Implementation Note: The
“ibm,pci-config-space-type”
property in the requirement above is added for platforms that support
I/O fabric and IOAs that implement PCI-X Mode 2 and PCI Express. To access the
extended configuration space provided by PCI-X Mode 2 and PCI Express, all I/O
fabric leading up to an IOA must support a 12-bit register number. In other
words, if a platform implementation has a conventional PCI bridge leading up to
an IOA that implements PCI-X Mode 2, the platform will not be able to provide
access to the extended configuration space of that IOA. The
“ibm,pci-config-space-type” property in the IOA's OF node
is used by device drivers to determine if an IOA’s extended
configuration space can be accessed.
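A device driver's use of this property can be sketched as follows. This is an illustration only: the dict standing in for the IOA's OF node and the function names are hypothetical, and the encoding of the property value (1 meaning that a 12-bit register number is supported) is an assumption that this section does not state.

```python
CONVENTIONAL_REGISTER_BITS = 8   # 256-byte configuration space
EXTENDED_REGISTER_BITS = 12      # 4096-byte extended configuration space


def config_space_size(node_properties):
    """Return the number of configuration bytes a driver may address.

    node_properties is a hypothetical dict modelling an IOA's OF node.
    ASSUMPTION: a property value of 1 means the whole fabric path carries
    a 12-bit register number; a missing property or any other value is
    treated as conventional 8-bit access.
    """
    value = node_properties.get("ibm,pci-config-space-type")
    bits = EXTENDED_REGISTER_BITS if value == 1 else CONVENTIONAL_REGISTER_BITS
    return 1 << bits


def can_access(node_properties, register_offset):
    """True if register_offset lies inside the accessible space."""
    return 0 <= register_offset < config_space_size(node_properties)
```

With this model, a driver would check `can_access(node, offset)` before touching any register at offset 0x100 or above.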
PCI IOA Use of PCI Bus Memory Space Address 0
Some PCI IOAs will fail when given a bus address of 0. In the PC
world, address 0 would not be a good address, so some PCI IOA designs which
were designed for the PC arena will check for an address of 0, and fail the
operation if it is 0.
R1--1.
For systems that use PCI IOAs
which will fail when given a bus address of 0 for DMA operations, and when the
operations for which those IOAs are used are other than system memory dump
operations, then the OS must prevent the mapping of PCI bus address 0 for PCI
DMA operation for such IOAs.
R1--2.
PCI IOAs used for dumping contents of
system memory must operate properly with a PCI bus address of 0 for PCI DMA
operations.
R1--3.
The firmware must not map an IOA used for
loading a boot image to an address of 0, when loading a boot image, if that IOA
cannot accept an address of 0.
Implementation Note: A reasonable
implementation of the OS requirement above would
be to have an interface between the device driver and the kernel that allows the
device driver to indicate to the kernel that the restriction is required for
that IOA, so that the other IOAs for that kernel image are not affected.
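One way to picture the per-IOA interface described in this note is an allocator that skips bus address 0 only on request. The sketch below is a toy model under stated assumptions: the class, its methods, the `avoid_zero` flag, and the 4096-byte page size are all hypothetical and do not represent any particular kernel interface.

```python
PAGE_SIZE = 4096


class DmaWindow:
    """Toy allocator for page-sized slots in a DMA window starting at bus address 0."""

    def __init__(self, num_pages):
        self.free = [True] * num_pages

    def map_page(self, avoid_zero=False):
        """Return the bus address of a free page, or None if none is available.

        avoid_zero is the hypothetical per-IOA flag a device driver would pass
        to the kernel when its IOA fails on bus address 0.
        """
        start = 1 if avoid_zero else 0
        for index in range(start, len(self.free)):
            if self.free[index]:
                self.free[index] = False
                return index * PAGE_SIZE
        return None

    def unmap_page(self, bus_address):
        """Release a page previously returned by map_page."""
        self.free[bus_address // PAGE_SIZE] = True
```

Because the flag is passed per mapping, IOAs that tolerate address 0 (including those used for system memory dump) can still be given it.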
PCI Express Completion Timeout
Prior to the implementation of the PCI Express additional
capability to set the Completion Timeout Value and Completion Timeout Disable
in the PCI Express Device Control 2 register of an IOA, the IOAs need a
device-specific way to provide the disable capability. In addition, the
platforms need to provide a way for the OSs and device drivers to know when to
disable the completion timeout of these devices that only provide a
device-specific way of doing so.
R1--1.
PCI Express IOAs must either provide a
device-specific way to disable their DMA Completion Timeout timer or
provide the Completion Timeout Disable or Completion Timeout Value capability
in the PCI Express Device Control 2 register. Device drivers for IOAs that
provide a device-specific way must disable the DMA Completion Timeout timer
if it is unknown whether the IOA provides a sufficiently long timer
value for the platform (for example, if the
“ibm,max-completion-latency” property is not
provided), or if it is known that the IOA does not provide a sufficient
timeout value.
R1--2.
Platforms must provide the
“ibm,max-completion-latency” property in
each PCI Express PHB node of the OF Device Tree.
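For a device driver whose IOA offers only a device-specific disable, the decision described above can be sketched as follows. The function name and the dict standing in for the PHB node are hypothetical, and the unit of the property value is not stated here, so the sketch simply assumes that both values are expressed in the same unit.

```python
def must_disable_completion_timeout(phb_properties, ioa_max_timeout=None):
    """Decide whether a driver should disable the IOA's completion timeout.

    phb_properties: hypothetical dict modelling the PCI Express PHB node.
    ioa_max_timeout: longest timer value the IOA can be set to, in the same
        (assumed) unit as the property value, or None when it is not known.
    """
    latency = phb_properties.get("ibm,max-completion-latency")
    if latency is None or ioa_max_timeout is None:
        # Unknown whether the timer is long enough for the platform.
        return True
    # Known: disable only when the timer cannot cover the platform latency.
    return ioa_max_timeout < latency
```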
PCI Express I/O Virtualized (IOV) Adapters
PCI Express defines I/O Virtualized (IOV) adapters, where such an
adapter has separate resources for each virtual instance, called a Virtual
Function (VF). Two PCI specifications define such adapters:
The Single Root I/O Virtualization and Sharing Specification defines the
requirements for SR-IOV adapters.
The Multi-Root I/O Virtualization and Sharing Specification defines the
requirements for MR-IOV adapters.
The interface presented to an OS by an MR-IOV adapter looks
the same as that presented by an SR-IOV adapter, and is therefore not described
separately here.
IOV adapters, and/or the VFs of an IOV adapter that has IOV enabled,
are assigned to OSs as follows (see also the IOV Environment Characteristics table for a full set of characteristics of these
environments):
For the Legacy Dedicated environment, the
entire adapter is assigned to one LPAR, with the IOV functionality not enabled.
In this mode, the OS provides device driver(s) for the adapter Function(s). VFs
do not exist, because IOV is not enabled. The OS is given the capability to do
Hot Plug add, remove, and replace in a non-managed environment (without an
HMC), and may be given that capability in a managed environment.
For the SR-IOV Non-shared environment, the
entire adapter is assigned to one LPAR, with IOV functionality enabled, but
with the Physical Function(s) (PFs) of the adapter hosted by the platform. Only
VFs are presented to the OS. The OS is given the capability to do Hot Plug add,
remove, and replace in a non-managed environment (without an HMC), and may be
given that capability in a managed environment.
For the SR-IOV Shared environment, the
adapter is assigned to the platform, with IOV functionality enabled. The
platform then assigns VF(s) to OS(s). Only the managed environment applies, and
add/remove/replace operations are controlled by DLPAR operations to the OS(s)
from the management console.
For all environments except the SR-IOV Shared, multiple functions
will appear as a multi-function IOA with possible sharing of a single PE. For
example, the multi-function adapters may have a shared EEH domain and shared
DMA window.
Determination of which of the above environments is supported for a
given platform and partition or OS type is beyond the scope of this
architecture.
The following table defines the
characteristics of these environments.
IOV Environment Characteristics

| Characteristic | Legacy Dedicated | SR-IOV Non-shared | SR-IOV Shared |
|---|---|---|---|
| Entire adapter assigned to OS, IOV not enabled | yes | n/a | n/a |
| Entire adapter assigned to OS, IOV enabled | n/a | yes | n/a |
| Adapter can be shared across multiple OSs, IOV enabled | n/a | n/a | yes |
| Function DD support | Plain Function only (not VF or PF) | VF only | VF only |
| PFs managed by platform? | n/a | yes | yes |
| Managed environment support? | yes | yes | yes |
| Non-managed environment support? | yes | yes | no |
| OS controlled Hot Plug capable? | yes | yes | no |
| DLPAR capable? | yes | yes | yes |
| All functions under one PHB in the OF Device Tree for the adapter? | yes | yes | no |
| All functions under separate PHBs in the OF Device Tree for the same adapter? (see note) | no | no | yes |
| config_addr translation (virtualization) by the platform (that is, the bus/device/function of the config_addr does not necessarily correspond to what the device has programmed) | no | yes | yes |
| Shared PE domain (for example, shared EEH domain, shared DMA window) | yes | yes | no |

Note: The adapter is physically under one PHB, but the platform creates separate
“virtual” PHBs in the OF Device Tree and virtualizes the PCI
Express configuration space for the various functions.
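Software that must behave differently per environment could carry a subset of these characteristics as data. The following is a minimal sketch; the type, field, and function names are hypothetical, and the "n/a" entry for Legacy Dedicated PFs is encoded as False purely for convenience.

```python
from collections import namedtuple

IovEnvironment = namedtuple(
    "IovEnvironment",
    "iov_enabled pfs_hosted_by_platform non_managed os_hot_plug shared_pe",
)

# Values transcribed from the IOV Environment Characteristics table.
ENVIRONMENTS = {
    # "PFs managed by platform?" is n/a here; stored as False.
    "Legacy Dedicated": IovEnvironment(False, False, True, True, True),
    "SR-IOV Non-shared": IovEnvironment(True, True, True, True, True),
    "SR-IOV Shared": IovEnvironment(True, True, False, False, False),
}


def functions_seen_by_os(name):
    """Return which kind of function the OS device driver binds to."""
    return "VF" if ENVIRONMENTS[name].iov_enabled else "plain Function"
```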
R1--1.
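PCI Express Single Root IOV (SR-IOV)
adapters must comply with the Single Root I/O Virtualization and Sharing Specification.
R1--2.
PCI Express Multi-Root IOV (MR-IOV)
adapters must comply with the Multi-Root I/O Virtualization and Sharing Specification.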
R1--3.
The platform must present within the
device tree nodes for all PCI Express adapters configured to operate in IOV
mode the "ibm,is-vf" property as defined in section
.
Multi-Initiator SCSI Support
Multi-initiator SCSI support is identified in the OF device tree.
R1--1.
Platform Implementation:
Platforms must support the “scsi-initiator-id”
property as described in and .
Contiguous Memory
I/O devices that require contiguous memory pages (either real or via
contiguous TCEs) cannot reasonably be accommodated in LoPAR platforms. When
TCEs are turned off, that would require that contiguous real physical memory
addresses be allocated. When TCEs are on, that would require that contiguous
TCEs be assigned; although the OS’s TCE assignment algorithm attempts that
first, it will assign non-contiguous TCEs if contiguous ones
cannot be assigned. Dynamic Reconfiguration complicates the contiguous memory
problem even further.
R1--1.
I/O devices and/or their device drivers used
in LoPAR platforms must implement scatter/gather capability for DMA operations
such that they do not require contiguous memory pages to be allocated for
proper operation.
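The scatter/gather capability required above can be illustrated by building a list of address and length pairs from pages that need not be contiguous. This is a sketch under stated assumptions: the function name, the 4096-byte page size, and the tuple layout are hypothetical, and merging adjacent pages into one entry is an optional optimization rather than a requirement.

```python
PAGE_SIZE = 4096


def build_sg_list(page_addresses, offset, length):
    """Build a scatter/gather list of (bus_address, byte_count) entries.

    page_addresses: bus address of each page backing the buffer, in buffer
        order; the pages need not be contiguous.
    offset: byte offset of the buffer start within the first page.
    length: total number of bytes to transfer.
    """
    entries = []
    remaining = length
    for page in page_addresses:
        if remaining <= 0:
            break
        chunk = min(PAGE_SIZE - offset, remaining)
        address = page + offset
        if entries and entries[-1][0] + entries[-1][1] == address:
            # Adjacent to the previous entry: extend it instead of adding one.
            entries[-1] = (entries[-1][0], entries[-1][1] + chunk)
        else:
            entries.append((address, chunk))
        remaining -= chunk
        offset = 0  # Only the first page starts part-way through.
    if remaining > 0:
        raise ValueError("buffer is longer than the pages provided")
    return entries
```

For example, a buffer that starts 0x800 bytes into a page at 0x5000 and continues through pages at 0x2000 and 0x3000 yields two entries, because the last two pages happen to be adjacent.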
Re-directed Serial Ports
The “ibm,vty-wrap-capable” OF
device tree property will be present in an OF device tree of a serial port node
when the OS data communication with that serial port controller can be
redirected, or wrapped, away from the physical serial port connector to an
ibm,vty device, which is often a virtual terminal session
of the Hardware Management Console (HMC). This property indicates to serial
port diagnostic programs that additional end user information should be
displayed during the serial port diagnostic test indicating that it is possible
that serial port data could be redirected away from the physical serial port
preventing the execution of wrap tests with physical wrap plugs. The end user
information should describe that initiating a virtual terminal session causes
the serial port controller's data to be wrapped away from the physical serial
port connection and that terminating a virtual terminal session causes the
serial port controller's data to be returned to the physical serial port
connection. The “ibm,vty-wrap-capable”
property is present with a value of null when this re-direction capability
exists and is absent when this capability does not exist.
R1--1.
The “ibm,vty-wrap-capable”
OF device tree property must
be present in an OF device tree of a serial port node when the OS data
communication with that serial port controller can be redirected, or wrapped,
away from the physical serial port connector to an ibm,vty device, and must not
be present if this capability does not exist.
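Because the property carries a null value, a diagnostic program tests only for its presence. The sketch below assumes a hypothetical dict that models the serial port node's properties; the function names and the notice text are illustrative only.

```python
def is_vty_wrap_capable(serial_node_properties):
    """True when the serial port's data can be redirected to an ibm,vty device.

    serial_node_properties is a hypothetical dict modelling the serial port
    node; a property encoded with a null value is stored as b"".
    """
    # Presence is what matters; the (empty) value is never inspected.
    return "ibm,vty-wrap-capable" in serial_node_properties


def diagnostic_notice(serial_node_properties):
    """Return the extra end-user text for the wrap test, or None."""
    if not is_vty_wrap_capable(serial_node_properties):
        return None
    return (
        "Serial port data may be redirected to a virtual terminal session; "
        "end that session before running a test with a physical wrap plug."
    )
```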
System Bus IOAs
This section lists the requirements for the systems to support IOAs
connected to the system bus or main I/O expansion bus.
R1--1.
Each system bus IOA must be a bus
master.
R1--2.
Firmware must assign unique addresses to all
system bus IOA facilities.
R1--3.
Addresses assigned to system bus IOA
facilities must not conflict with the addresses mapped by any host bridge on
the system bus.
R1--4.
System bus IOAs must be assigned interrupt
sources for their interrupt requirements by firmware.
R1--5.
A system bus IOA’s OF
“interrupts” property must reflect
the interrupt source and type allocation for the device.
R1--6.
All system bus IOA interrupts must be low
true level sensitive (referred to as level sensitive).
R1--7.
Interrupts assigned to system bus IOAs must
not be shared with other IOAs.
R1--8.
The OF unit address (first entry of the
“reg” property) of a system
bus IOA must stay the same from boot to boot.
R1--9.
Each system bus IOA must have documentation
for programming the IOA and an OF binding which describes at least the
“name”,
“reg”,
“interrupts”, and
“interrupt-parent”
properties for the device.
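Several of the requirements above (unique addresses, no conflict with host bridge mappings, no shared interrupts) can be checked against a proposed firmware assignment. The following is a minimal sketch; the function names and the dict layout used to describe each IOA are hypothetical.

```python
def ranges_overlap(a, b):
    """a and b are (start, size) tuples; True if they share any address."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def check_system_bus_ioas(ioas, host_bridge_ranges):
    """Return a list of problems found in a proposed firmware assignment.

    ioas: hypothetical list of dicts with "name", "reg" (a list of
        (start, size) tuples) and "interrupts" (a list of source numbers).
    host_bridge_ranges: (start, size) tuples mapped by host bridges.
    """
    problems = []
    claimed = [(r, "host bridge") for r in host_bridge_ranges]
    seen_sources = {}
    for ioa in ioas:
        for reg in ioa["reg"]:
            # Compare against every range claimed so far.
            for other, owner in claimed:
                if ranges_overlap(reg, other):
                    problems.append(f"{ioa['name']} address conflicts with {owner}")
            claimed.append((reg, ioa["name"]))
        for source in ioa["interrupts"]:
            if source in seen_sources:
                problems.append(
                    f"{ioa['name']} shares interrupt {source} with {seen_sources[source]}"
                )
            seen_sources[source] = ioa["name"]
    return problems
```

An empty result means the sketch found no address conflict and no shared interrupt source in the proposed assignment.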