diff --git a/LoPAR/app_bibliography.xml b/LoPAR/app_bibliography.xml index dce18da..eae8f8f 100644 --- a/LoPAR/app_bibliography.xml +++ b/LoPAR/app_bibliography.xml @@ -28,38 +28,6 @@ revision, the revision shall apply. - - - - Linux on Power Architecture Reference: Platform and Device Tree - - - - - Linux on Power Architecture Reference: Device Tree - - - - - Linux on Power Architecture Reference: Error Recovery and Logging - - - - - Linux on Power Architecture Reference: Virtualization - - - - - Linux on Power Architecture Reference: Runtime Abstraction Services (RTAS) - - - Power ISA diff --git a/LoPAR/app_fault_v_errorlog.xml b/LoPAR/app_fault_v_errorlog.xml index 0b50f15..29a30d1 100644 --- a/LoPAR/app_fault_v_errorlog.xml +++ b/LoPAR/app_fault_v_errorlog.xml @@ -1,7 +1,7 @@ When to use: Fault vs. Error Log Indicators (Lightpath Mode) - This appendix gives + This appendix gives highly recommended Service Indicator activation models for typical system issues, when the Lightpath mode is implemented. The purpose of this appendix is to get consistency across platforms, and to @@ -28,18 +28,18 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo that are involved, specifically related to the different types of physical layouts (for example: deskside, blade and blade chassis, rack-mounted and particularly high end racks). - This appendix does + This appendix does not change the architectural requirements specified in other parts of this document, nor the requirement for implementations to support those requirements. If there are any inconsistencies between this appendix and the requirements in the rest of this document, the requirements take precedence over this appendix. It is very important, therefore, that designers understand the requirements in this architecture, and more - specifically, those in + specifically, those in . gives the recommended models. 
The general model, though, is still dictated by the following requirement, copied - here from the : + here from the : @@ -54,7 +54,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Service Indicator Activation Models for Typical System - Issues (Lightpath Mode) + Issues (Lightpath Mode) <emphasis></emphasis> @@ -79,8 +79,8 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Indicator activation? - (see notes - , + (see notes + , ) @@ -104,7 +104,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo FRU Fault indicator?(see notes - , + , ) @@ -745,7 +745,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo no - See also Requirement + See also Requirement @@ -847,13 +847,13 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Notes: - - + + Never activate both a Fault indicator and an Error Log indicator for the same problem. See also - Requirement - , referenced immediately above + Requirement + , referenced immediately above . @@ -867,11 +867,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Enclosure Fault indicators and above are only roll-up indicators and are never activated - without a FRU Fault indicator being activated. Therefore the column in + without a FRU Fault indicator being activated. Therefore the column in indicates a FRU Fault indicator. That is, if no FRU Fault indicator exists for the problem, then the Error Log - indicator is used instead (per Requirement - , referenced immediately above + indicator is used instead (per Requirement + , referenced immediately above ). 
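The selection rule described in the notes above can be condensed into a small sketch. This is illustrative only (the function name and return labels are not from the architecture): exactly one of the FRU Fault indicator or the Error Log indicator is activated for a given problem, never both, and enclosure-level (and higher) Fault indicators are only roll-ups of an activated FRU Fault indicator.

```python
def select_indicator(fru_fault_indicator_exists: bool) -> str:
    """Illustrative sketch of the Lightpath-mode note above.

    A Fault indicator and an Error Log indicator are never both activated
    for the same problem; enclosure (and higher) Fault indicators are only
    roll-ups and never activate without a FRU Fault indicator.
    """
    if fru_fault_indicator_exists:
        # The FRU Fault indicator activates; enclosure roll-up
        # indicators follow from it automatically.
        return "FRU Fault"
    # No FRU Fault indicator exists for this problem, so the
    # Error Log indicator is used instead.
    return "Error Log"
```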
@@ -880,11 +880,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Error Log indicator (previously known as the System Information (Attention) indicator) and Fault indicators are regulated by the following requirements, among others: - + - + - + diff --git a/LoPAR/app_firmware_dump.xml b/LoPAR/app_firmware_dump.xml index 0a65f42..abf2ce8 100644 --- a/LoPAR/app_firmware_dump.xml +++ b/LoPAR/app_firmware_dump.xml @@ -23,8 +23,8 @@ Firmware Assisted Dump Data Format This appendix documents the dump data format, in support of the - Configure Platform Assisted Kernel Dump option in - . + Configure Platform Assisted Kernel Dump option + ()).
Register Save Area diff --git a/LoPAR/app_glossary.xml b/LoPAR/app_glossary.xml index 0419382..fc35ebc 100644 --- a/LoPAR/app_glossary.xml +++ b/LoPAR/app_glossary.xml @@ -1,7 +1,7 @@ Term Definition - + AC Alternating current - + ACR Architecture Change Request - + AD Address Data line - + Adapter - A device which attaches a device to a bus or which converts one - bus to another; for example, an I/O Adapter (IOA), a PCI Host Bridge (PHB), + A device which attaches a device to a bus or which converts one + bus to another; for example, an I/O Adapter (IOA), a PCI Host Bridge (PHB), or a NUMA fabric attachment device. - + addr Address - + Architecture - The hardware/software interface definition or software module to + The hardware/software interface definition or software module to software module interface definition. - + ASCII - American Standard Code for Information + American Standard Code for Information Interchange - + ASR Address Space Register - + BAT Block Address Translation - + BE - Big-Endian or Branch Trace Enable bit in the + Big-Endian or Branch Trace Enable bit in the MSR (MSRBE) - + BIO Bottom of Peripheral Input/Output Space - + BIOS Basic Input/Output System - + BIST Built-In Self-Test - + Boundedly undefined - Describes some addresses and registers which when referenced provide + Describes some addresses and registers which when referenced provide one of a small set of predefined results. - + BPA Bulk Power Assembly. Refers to components used for power distribution from a central point in the rack. - + BPM Bottom of Peripheral Memory - + BSCA Bottom of System Control Area - + BSM Bottom of System Memory - + BUID - Bus Unit Identifier. The high-order part of an interrupt source number + Bus Unit Identifier. The high-order part of an interrupt source number which is used for hardware routing purposes by the platform. 
- + CCIN Custom Card Identification Number - + CD-ROM Compact Disk Read-Only Memory - + CIS Client Interface Service - + CMO - Cooperative Memory Over-commitment option. See - for more information. + Cooperative Memory Over-commitment option. See + for more information. - + CMOS Complementary Metal Oxide Semiconductor - + Conventional PCI Behavior or features that conform to . - + CPU Central Processing Unit - + CR Condition Register - + CTR Count Register - + DABR Data Address Breakpoint Register - + DAR Data Address Register - + DASD Direct Access Storage Device (a synonym for “hard disk”) - + DBAT Data Block Address Translation - + DC Direct current - + DEC Decrementer - + DIMM Dual In-line Memory Module - + DMA Direct Memory Access - + DMA Read - A data transfer from System Memory to I/O. A DMA Read Request - is the inbound operation and the DMA Read Reply (or Read Completion) is the + A data transfer from System Memory to I/O. A DMA Read Request + is the inbound operation and the DMA Read Reply (or Read Completion) is the outbound data coming back from a DMA Read Request. - + DMA Write A data transfer to System Memory from I/O or a Message Signalled Interrupt (MSI) DMA Write. This is an inbound operation. - + DOS Disk OS - + DR Data Relocate bit in MSR (MSRDR) - + DRA Deviation Risk Assessment - + DRAM Dynamic Random Access Memory - + DRC - Delayed Read Completion. A transaction that has completed + Delayed Read Completion. A transaction that has completed on the destination bus and is now moving toward the originating bus to complete. DR Connector. - + DR entity - An entity that can participate in DR operations. That is, an entity - that can be added or removed from the platform while the platform power is on and the + An entity that can participate in DR operations. That is, an entity + that can be added or removed from the platform while the platform power is on and the system remains operational. - + DRR Delayed Read Request. 
A transaction that must complete on the destination bus before completing on the originating bus. - + DSISR Data Storage Interrupt Status Register - + DWR Delayed Write Request. A transaction that must complete on the destination bus before completing on the originating bus. - + EA Effective Address - + EAR External Access Register - + ECC Error Checking and Correction - + EE External interrupt Enable bit in the MSR (MSREE) - + EEH Enhanced I/O Error Handling - + EEPROM Electrically Erasable Programmable Read Only Memory - + EPOW Environment and Power Warning - + - Error Log indicator An amber indicator that indicates that the user needs to - look at the error log or problem determination procedures, in order to determine the cause. + Error Log indicator An amber indicator that indicates that the user needs to + look at the error log or problem determination procedures, in order to determine the cause. Previously called System Information (Attention). - + FCode - A computer programming language defined by the OF standard which is semantically - similar to the Forth programming language, but is encoded as a sequence of binary byte codes + A computer programming language defined by the OF standard which is semantically + similar to the Forth programming language, but is encoded as a sequence of binary byte codes representing a defined set of Forth words. - + FE0 Floating-point Exception mode 0 bit in the MSR (MSRFE0) - + FE1 Floating-point Exception mode 1 bit in the MSR (MSRFE1) - + FIR Fault Isolation Registers - + FLR - Function Level Reset (see PCI Express documentation). An optional reset for PCI Express + Function Level Reset (see PCI Express documentation). An optional reset for PCI Express functions that allows resetting a single function of a multi-function IOA. 
- + FP Floating-Point available bit in the MSR (MSRFP) - + FPSCR Floating-Point Status And Control Register - + FRU Field Replaceable Unit - + FSM Finite State Machine - + GB Gigabytes - as used in this document it is 2 raised to the power of 30 - + HB Host Bridge - + HMC - Hardware Management Console - used generically to refer to the system - component that performs platform administration functions wherever physically located. - The HMC is outside of this architecture and may be implemented in multiple ways. - Examples include: a special HMC application in another system, an external appliance, - or in an LPAR partition using the Virtual Management Channel (VMC) interface to the + Hardware Management Console - used generically to refer to the system + component that performs platform administration functions wherever physically located. + The HMC is outside of this architecture and may be implemented in multiple ways. + Examples include: a special HMC application in another system, an external appliance, + or in an LPAR partition using the Virtual Management Channel (VMC) interface to the hypervisor. - + Hz Hertz - + IBAT Instruction block address translation - + ID Identification - + IDE Integrated Device Electronics - + IDU Interrupt Delivery Unit - + IEEE Institute of Electrical and Electronics Engineers - + I2C Inter Integrated-circuit Communications - + I/O Input/Output - + I/O bus master - Any entity other than a processor, cache, - memory controller, or host bridge which supplies both address and data in - write transactions or supplies the address and is the sink for the data in + Any entity other than a processor, cache, + memory controller, or host bridge which supplies both address and data in + write transactions or supplies the address and is the sink for the data in read transactions. 
- + I/O device - Generally refers to any entity that is connected - to an IOA (usually through a cable), but in some cases may refer to the IOA - itself (that is, a device in the device tree that happens to be used for I/O + Generally refers to any entity that is connected + to an IOA (usually through a cable), but in some cases may refer to the IOA + itself (that is, a device in the device tree that happens to be used for I/O operations). - + I/O Drawer - An enclosure in a rack that holds at least one PHB and at + An enclosure in a rack that holds at least one PHB and at least one IOA. - + ILE Interrupt Little-Endian bit in MSR (MSRILE) - + Instr Instruction - + Interrupt Number See Interrupt Vector below. - + Interrupt Vector - The identifier associated with a specific interrupt source. - The identifier’s value is loaded into the source’s Interrupt Vector Register and + The identifier associated with a specific interrupt source. + The identifier’s value is loaded into the source’s Interrupt Vector Register and is read from the Interrupt Delivery Unit’s Interrupt Acknowledge Register. - + IOA - I/O Adapter. A device which attaches to a physical bus which is capable - of supporting I/O (a physical IOA) or logical bus (a virtual IOA). The term “IOA” - without the usage of the qualifier “physical” or “virtual” will be - used to designate a physical IOA. Virtual IOAs are defined further in - - . - In PCI terms, an IOA may be defined by a unique combination of its assigned - bus number and device number, but not necessarily including its function number. - That is, an IOA may be a single or multi-function device, unless otherwise specified by - the context of the text. In the context of a PCIe I/O Virtualized (IOV) device (not to be - confused with a virtual IOA), an IOA is a single or multiple function device (for example, a - PCIe Virtual Function (VF) or multiple VFs). 
An IOA function may or may not have its own set of - resources, that is may or may not be in its own Partitionable Endpoint (PE) domain + I/O Adapter. A device which attaches to a physical bus which is capable + of supporting I/O (a physical IOA) or logical bus (a virtual IOA). The term “IOA” + without the usage of the qualifier “physical” or “virtual” will be + used to designate a physical IOA. Virtual IOAs are defined further in + . + In PCI terms, an IOA may be defined by a unique combination of its assigned + bus number and device number, but not necessarily including its function number. + That is, an IOA may be a single or multi-function device, unless otherwise specified by + the context of the text. In the context of a PCIe I/O Virtualized (IOV) device (not to be + confused with a virtual IOA), an IOA is a single or multiple function device (for example, a + PCIe Virtual Function (VF) or multiple VFs). An IOA function may or may not have its own set of + resources, that is may or may not be in its own Partitionable Endpoint (PE) domain (see also ). - + IOA function - That part of an IOA that deals with a specific part of the - IOA as defined by the configuration space “Function” part of Bus/Device/Function. + That part of an IOA that deals with a specific part of the + IOA as defined by the configuration space “Function” part of Bus/Device/Function. For single-function IOAs, the IOA Function and the IOA are synonymous. 
- + IP Interrupt Prefix bit in MSR (MSRIP) - + IPI Interprocessor Interrupt - + IR Instruction Relocate bit in MSR register (MSRIR) or infrared - + ISF Interrupt 64-bit processor mode bit in the MSR (MSRISF) - + ISO International Standards Organization - + ISR Interrupt Source Register - + ISU Interrupt Source Unit - + KB Kilobytes - as used in this document it is 2 raised to the power of 10 - + KHz Kilo Hertz - + LAN Local Area Network - + LCD Liquid Crystal Display - + LE Little-Endian bit in MSR (MSRLE) or Little-Endian - + LED Light Emitting Diode - + LMB - Logical Memory Block. The Block of logical memory addresses associated with a dynamically + Logical Memory Block. The Block of logical memory addresses associated with a dynamically reconfigurable memory node. - + Load - A Load Request is the outbound (from the processor) operation - and the Load Reply is the inbound data coming back from a - Load Request. When it relates to I/O operations, this is an + A Load Request is the outbound (from the processor) operation + and the Load Reply is the inbound data coming back from a + Load Request. When it relates to I/O operations, this is an MMIO Load . - + LR Link Register - + LSb Least Significant bit - + LSB Least Significant Byte - + LSI Level Sensitive Interrupt - + LUN Logical Unit Number - + L1 Primary cache - + L2 Secondary cache - + MB Megabytes - as used in this document it is 2 raised to the power of 20 - + ME Machine check Enable - + MMIO - Memory Mapped I/O. This refers to the mapping of the address space required - by an I/O device for Load or Store operations into + Memory Mapped I/O. This refers to the mapping of the address space required + by an I/O device for Load or Store operations into the system’s address space. 
- + MES Miscellaneous Equipment Specification - + MFM Modified frequency modulation - + MHz Mega Hertz - + MOD - Address modification bit in the MSR + Address modification bit in the MSR (MSRMOD) - + MP Multiprocessor - + MSb Most Significant bit - + MSB Most Significant Byte - + MSI Message Signalled Interrupt - + MSR Machine State Register - + MTT - Multi-TCE-Table option. See - - - . + Multi-TCE-Table option. See + . - + N/A Not Applicable - + Nibble Refers to the first or last four bits in an 8 bit byte - + NUMA Non-Uniform Memory Access - + NUMA fabric Mechanism and method for connecting the multiple nodes of a NUMA system - + NVRAM Nonvolatile Random Access Memory - + OF Open Firmware - + OP Operator - + OS Operating System - + OUI Organizationally Unique Identifier - + PA Processor Architecture - + PAP Privileged Access Password - + LoPAR - Used within the Linux on Power Architecture - Reference documents to denote: (1) the architectural requirements specified - by the Linux on Power Architecture Reference document, (2) the Linux on Power Architecture - Reference documents themselves, and (3) as an adjective to qualify an entity as being + Used within the Linux on Power Architecture + Reference documents to denote: (1) the architectural requirements specified + by the Linux on Power Architecture Reference document, (2) the Linux on Power Architecture + Reference documents themselves, and (3) as an adjective to qualify an entity as being related to this architecture. - + Partitionable Endpoint - This refers to the I/O granule that may be treated as one for - purposes of assignment to an OS (for example, to an LPAR partition). May be an - I/O adapter (IOA), or groups of IOAs and bridges, or portions of IOAs. PE granularity - supported by the hardware may be finer than is supported by the firmware. 
Grouping - of multiple PEs into one DR entity may limit assignment of the separate PEs to different + This refers to the I/O granule that may be treated as one for + purposes of assignment to an OS (for example, to an LPAR partition). May be an + I/O adapter (IOA), or groups of IOAs and bridges, or portions of IOAs. PE granularity + supported by the hardware may be finer than is supported by the firmware. Grouping + of multiple PEs into one DR entity may limit assignment of the separate PEs to different LPAR partitions. See also DR entity. - + PC Personal Computer - + PCI - Peripheral Component Interconnect. An all-encompassing term referring to + Peripheral Component Interconnect. An all-encompassing term referring to conventional PCI, PCI-X, and PCI Express. - + PCI bus - A general term referring to either the PCI Local Bus, as - specified in and - for conventional PCI and PCI-X, or a PCI Express link, as specified in + A general term referring to either the PCI Local Bus, as + specified in and + for conventional PCI and PCI-X, or a PCI Express link, as specified in for PCI Express. - + PCI Express - Behavior or features that conform to + Behavior or features that conform to . - + PCI link A PCI Express link, as specified in . - + PCI-X Behavior or features that conform to . - + PD Presence Detect - + PE - When referring to the body of the LoPAR, this refers to a Partitionable + When referring to the body of the LoPAR, this refers to a Partitionable Endpoint. - - + PE has a different meaning relative to + + (see + for that definition). - + PEM Partition Energy Management option. See - - - . + . - + Peripheral I/O Space - The range of real addresses which are assigned - to the I/O Space of a Host Bridge (HB) and which are sufficient to contain all of - the Load and Store address space requirements of all the devices in the I/O Space - of the I/O bus that is generated by the HB. 
A keyboard controller is an example of + The range of real addresses which are assigned + to the I/O Space of a Host Bridge (HB) and which are sufficient to contain all of + the Load and Store address space requirements of all the devices in the I/O Space + of the I/O bus that is generated by the HB. A keyboard controller is an example of a device which may require Peripheral I/O Space addresses. - + Peripheral Memory Space - The range of real addresses which are assigned to the Memory - Space of a Host Bridge (HB) and which are sufficient to contain all of the Load and - Store address space requirements of the devices in the Memory Space of the I/O bus - that is generated by the HB. The frame buffer of a graphics adapter is an example + The range of real addresses which are assigned to the Memory + Space of a Host Bridge (HB) and which are sufficient to contain all of the Load and + Store address space requirements of the devices in the Memory Space of the I/O bus + that is generated by the HB. The frame buffer of a graphics adapter is an example of a device which may require Peripheral Memory Space addresses. - + Peripheral Space - Refers to the physical address space which may - be accessed by a processor, but which is controlled by a host bridge. At least one - peripheral space must be present and it is referred to by the suffix 0. A host bridge - will typically provide access to at least a memory space and possibly to an I/O + Refers to the physical address space which may + be accessed by a processor, but which is controlled by a host bridge. At least one + peripheral space must be present and it is referred to by the suffix 0. A host bridge + will typically provide access to at least a memory space and possibly to an I/O space. 
- + PHB PCI Host Bridge - + PIC Programmable Interrupt Controller - + PIR Processor Identification Register - + Platform - Refers to the hardware plus firmware portion of a system composed of hardware, + Refers to the hardware plus firmware portion of a system composed of hardware, firmware, and OS. - + Platform firmware - Refers to all firmware on a system including the software or firmware in a + Refers to all firmware on a system including the software or firmware in a support processor. - + Plug-in I/O card - A card which can be plugged into an I/O - connector in a platform and which contains one or more IOAs and potentially + A card which can be plugged into an I/O + connector in a platform and which contains one or more IOAs and potentially one or more I/O bridges or switches. - + Plug-in Card An entity that plugs into a physical slot. - + PMW - Posted memory write. A transaction that has completed on the + Posted memory write. A transaction that has completed on the originating bus before completing on the destination bus - + PnP Plug and Play - + POP Power On Password - + POST Power-On Self Test - + PR Privileged bit in the MSR (MSRPR) - + Processor Architecture - Used throughout this document to - mean compliance with the requirements specified in + Used throughout this document to + mean compliance with the requirements specified in . - + Processor revision number - A 16-bit number that distinguishes between various releases - of a particular processor version, for example different engineering change + A 16-bit number that distinguishes between various releases + of a particular processor version, for example different engineering change levels. - + PVN - Processor Version Number. Uniquely determines the particular + Processor Version Number. Uniquely determines the particular processor and PA version. - + PVR - Processor Version Register. A register in each processor - that identifies its type. 
The contents of the PVR include the processor + Processor Version Register. A register in each processor + that identifies its type. The contents of the PVR include the processor version number and processor revision number. - + RAID Redundant Array of Independent Disks - + RAM Random Access Memory - + RAS Reliability, Availability, and Serviceability - + Real address - A real address results from doing address - translation on an effective address when address translation is enabled. - If address translation is not enabled, the real address is the same as the - effective address. An attempt to fetch from, load from, or store to a real - address that is not physically present in the machine may result in a + A real address results from doing address + translation on an effective address when address translation is enabled. + If address translation is not enabled, the real address is the same as the + effective address. An attempt to fetch from, load from, or store to a real + address that is not physically present in the machine may result in a machine check interrupt. - + Reserved - The term “reserved” is used within this - document to refer to bits in registers or areas in the address space - which should not be referenced by software except as described in this + The term “reserved” is used within this + document to refer to bits in registers or areas in the address space + which should not be referenced by software except as described in this document. - + Reserved for firmware use - Refers to a given location or bit which may not be used by + Refers to a given location or bit which may not be used by software, but are used by firmware. - + Reserved for future use - Refers to areas of address space or bits in registers which may be + Refers to areas of address space or bits in registers which may be used by future versions of this architecture. 
- + RI Recoverable interrupt bit in the MSR (MSRRI) - + RISC Reduced Instruction Set Computing - + RMA - Real Mode Area. The first block of logical memory addresses - owned by a logical partition, containing the storage that may be accessed with + Real Mode Area. The first block of logical memory addresses + owned by a logical partition, containing the storage that may be accessed with translate off. - + ROM Read Only Memory - + Root Complex - A PCI Express root complex as specified in + A PCI Express root complex as specified in . - + RPN Real Page Number - + RTAS Run-Time Abstraction Services - + RTC Real Time Clock - + SAE Log Service Action Event log - + SCC Serial Communications Controller - + SCSI Small Computer System Interface - + SE - Single-step trace enabled bit in the MSR + Single-step trace enabled bit in the MSR (MSRSE) - + Service Focal Point - The common point of control in the system for handling all + The common point of control in the system for handling all service actions - + Serviceable Event - Serviceable Events are platform, - global, regional and local error events that require a service action - and possibly a call home when the serviceable event must be handled by a - service representative or at least reported to the service provider. - Activation of the Error Log indicator notifies the customer of the event - and the event indicates to the customer that there must be some intervention - to rectify the problem. The intervention may be a service action that the + Serviceable Events are platform, + global, regional and local error events that require a service action + and possibly a call home when the serviceable event must be handled by a + service representative or at least reported to the service provider. + Activation of the Error Log indicator notifies the customer of the event + and the event indicates to the customer that there must be some intervention + to rectify the problem. 
The intervention may be a service action that the customer can perform or it may require a service provider. - + SES - Storage Enclosure Services (can also mean SCSI Enclosure + Storage Enclosure Services (can also mean SCSI Enclosure Services in relation to SCSI storage) - + SF - Processor 32-bit or 64-bit processor mode bit in the MSR + Processor 32-bit or 64-bit processor mode bit in the MSR (MSRSF) - + SFP Service Focal Point - + Shrink-wrap OS - A single version of an OS that runs on all + A single version of an OS that runs on all compliant platforms. - + Shrink-wrap Application - A single version of an application program + A single version of an application program that runs on all compliant platforms with the applicable OS. - + SMP Symmetric multiprocessor - + SMS System Management Services - + Snarf - An industry colloquialism for cache-to-cache - transfer. A typical scenario is as follows: (1) cache miss from cache A, - (2) line found modified in cache B, (3) cache B performs castout of modified - line, and (4) cache A allocates the modified line as it is being written back + An industry colloquialism for cache-to-cache + transfer. A typical scenario is as follows: (1) cache miss from cache A, + (2) line found modified in cache B, (3) cache B performs castout of modified + line, and (4) cache A allocates the modified line as it is being written back to memory. - + Snoop - The act of interrogating a cache for the presence of a - line, usually in response to another party on a shared bus attempting to + The act of interrogating a cache for the presence of a + line, usually in response to another party on a shared bus attempting to allocate that line. - + SPRG Special Purpose Registers for General use - + SR System Registers - + SRC Service Reference Code - + SRN Service Request Number - + Store - A Store Request is an - outbound (from the processor) operation. When it relates to I/O + A Store Request is an + outbound (from the processor) operation. 
When it relates to I/O operations, this is an MMIO Store. - + System - Refers to the collection of hardware, system firmware, + Refers to the collection of hardware, system firmware, and OS software which comprise a computer model. - + System address space - The total range of addressability as established by the + The total range of addressability as established by the processor implementation. - + System Control Area - Refers to a range of addresses which - contains the system ROM(s) and an unarchitected, reserved, platform-dependent - area used by firmware and Run-Time Abstraction services for control of the - platform. The ROM areas are defined by the OF properties in the - openprom and os-rom nodes + Refers to a range of addresses which + contains the system ROM(s) and an unarchitected, reserved, platform-dependent + area used by firmware and Run-Time Abstraction services for control of the + platform. The ROM areas are defined by the OF properties in the + openprom and os-rom nodes of the OF device tree. - + System Information (Attention) indicator See Error Log indicator. - + System firmware - Refers to the collection of all firmware on a system + Refers to the collection of all firmware on a system including OF, RTAS and any legacy firmware. - + System Memory - Refers to those areas of memory which form - a coherency domain with respect to the PA processor or processors that + Refers to those areas of memory which form + a coherency domain with respect to the PA processor or processors that execute application software on a system. - + System software - Refers to the combination of OS software, - device driver software, and any hardware abstraction software, but + Refers to the combination of OS software, + device driver software, and any hardware abstraction software, but excludes the application software. 
- + TB Time Base - + TCE Translation Control Entry - + TLB Translation Look-aside Buffer - + TOD Time Of Day - + TOSM Top of system memory - + TPM Top of Peripheral Memory Trusted Platform Module - + tty - Teletypewriter or ASCII character driven + Teletypewriter or ASCII character driven terminal device - + UI User Interface - + USB Universal Serial Bus - + v Volt - + VGA Video Graphics Array - + VMC Virtual Management Channel - + VPD Vital Product Data - + VPNH Virtual Processor Home Node option. See - - - . + . - - - - - - - - + + + + + + diff --git a/LoPAR/app_pa_processor_binding.xml b/LoPAR/app_pa_processor_binding.xml index 3f10b2c..82ad959 100644 --- a/LoPAR/app_pa_processor_binding.xml +++ b/LoPAR/app_pa_processor_binding.xml @@ -704,7 +704,7 @@ location of resources) to the processor are preserved by the device tree once presented upon boot. For a list of properties that may change before a reboot, see - . + . diff --git a/LoPAR/app_papr_binding.xml b/LoPAR/app_papr_binding.xml index 16e9997..4b972f0 100644 --- a/LoPAR/app_papr_binding.xml +++ b/LoPAR/app_papr_binding.xml @@ -1710,6 +1710,156 @@ + + “ibm,partition-uuid” + + + property name specifies a universally unique identifier for this partition. + + prop-encoded-array: A string of data as described below, encoded as with + encode-string + The Universally Unique IDentifier (UUID) option provides each partition with a + Universally Unique Identifier that is persisted by the platform across partition + reboots, reconfigurations, OS reinstalls, partition migration, hibernation etc. + The UUID is a 16 byte string of format fields and random bits as defined in + . + The random bits are generated in an implementation-dependent manner to + achieve a projected probability of collision of not greater than one in 260. + +
+ UUID Format + + + + + + + + + + Field + + + + + Byte:Bit + + + + + Size (Bits) + + + + + Values + + + + + + + + Version + + + 0:0 + + + 1 + + + 0: Initial Version + 1: Reserved + + + + + Random Bits + + + 0:1 thru 5:7 + + + 47 + + + Random Bits + + + + + Generation Method + + + 6:0-3 + + + 4 + + + 0b0000 Never Used + 0b0100 Random Generated + All other values are reserved + + + + + Random Bits + + + 6:4 - 7:7 + + + 12 + + + Random Bits + + + + + Variant + + + 8:0-1 + + + 2 + + + 0b10 DCE Variant UUID + All other values are reserved + + + + + Random Bits + + + 8:2 - 15:7 + + + 62 + + + Random Bits + + + + +
+ + + For the GET_PARTNER_UUID subfunction (See ), the data is + represented as 16 bytes as described in . + For the ibm,partition-uuid property, the data is represented as a string of + hexadecimal characters, with hyphens added for readability. + Hexadecimal values a through f are lower case. An example of the string + representation of the UUID is 648a9ca6-1fb4-4f7e-9436-14d015f3dd74 +
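The bit layout in the table above (Generation Method 0b0100 in the high nibble of byte 6, DCE Variant 0b10 in the top two bits of byte 8) is the RFC 4122 version-4 layout, with the extra constraint that bit 0:0 (the Version field) is 0. A hedged sketch in Python — not part of the binding, purely an illustration of how the fields map onto the lower-case hyphenated string form shown above:

```python
import uuid

def make_partition_uuid() -> str:
    """Illustrative only: build a UUID matching the format table above.

    uuid.uuid4() already sets byte 6's high nibble to 0b0100 (the
    "Random Generated" Generation Method) and byte 8's top two bits
    to 0b10 (DCE Variant); we additionally clear bit 0:0 so the
    Version field reads 0 ("Initial Version").
    """
    b = bytearray(uuid.uuid4().bytes)
    b[0] &= 0x7F                            # Version field (bit 0:0) = 0
    return str(uuid.UUID(bytes=bytes(b)))   # lower-case, hyphen-separated

# The example string from the text decodes consistently with the table:
example = uuid.UUID("648a9ca6-1fb4-4f7e-9436-14d015f3dd74").bytes
assert (example[6] >> 4) == 0b0100          # Generation Method: Random Generated
assert (example[8] >> 6) == 0b10            # DCE Variant
assert (example[0] >> 7) == 0               # Version: Initial Version
```

Note that `str(uuid.UUID(...))` already yields the required lower-case, hyphenated representation; only the Version bit needs explicit handling on top of a stock version-4 generator.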
+ + “ibm,platform-hardware-notification” @@ -1828,7 +1978,7 @@ prop-encoded-array: An integer encoded as with encode-int that represents the maximum VIOS level that the client shall negotiate. See - for the definition of the + for the definition of the values of this property. @@ -1962,7 +2112,7 @@ property name to define that the OS may ignore failures of Hot Plug power off and isolate operations during a DLPAR remove operation. See also Note 2 in - . + . prop-encoded-array: None, this is a name only property. @@ -2066,6 +2216,28 @@ 1 = Platform is operating in the Lightpath mode. + + + Implementation Notes: + + + + + In the absence of this property, the determination of how the OS + is to behave is made by the platform presenting or not presenting FRU + Fault indicators to the OS see chapter + . In the case where there are + no FRUs owned by the partition, the OS will not observe any FRU Fault + indicators assigned, even when the platform is operating in the Lightpath + mode. + + + + Presenting this property does not imply any relaxation of the + requirements specified in chapter + . + + @@ -2090,181 +2262,7 @@ prop-encoded-array: <NULL> - - - “ibm,partition-uuid” - - - property name specifies a universally unique identifier for this partition. - - prop-encoded-array: A string of data as described below, encoded as with - encode-string - The Universally Unique IDentifier (UUID) option provides each partition with a - Universally Unique Identifier that is persisted by the platform across partition - reboots, reconfigurations, OS reinstalls, partition migration, hibernation etc. - The UUID is a 16 byte string of format fields and random bits as defined in - . - The random bits are generated in an implementation-dependent manner to - achieve a projected probability of collision of not greater than one in 260. 
- - - UUID Format - - - - - - - - - - Field - - - - - Byte:Bit - - - - - Size (Bits) - - - - - Values - - - - - - - - Version - - - 0:0 - - - 1 - - - 0: Initial Version - 1: Reserved - - - - - Random Bits - - - 0:1 thru 5:7 - - - 47 - - - Random Bits - - - - - Generation Method - - - 6:0-3 - - - 4 - - - 0b0000 Never Used - 0b0100 Random Generated - All other values are reserved - - - - - Random Bits - - - 6:4 - 7:7 - - - 12 - - - Random Bits - - - - - Variant - - - 8:0-1 - - - 2 - - - 0b10 DCE Variant UUID - All other values are reserved - - - - - Random Bits - - - 8:2 - 15:7 - - - 62 - - - Random Bits - - - - -
- - - For the GET_PARTNER_UUID subfunction (See ), the data is - represented as 16 bytes as described in . - For the ibm,partition-uuid property, the data is represented as a string of - hexadecimal characters, with hyphens added for readability. - Hexadecimal values a through f are lower case. An example of the string - representation of the UUID is 648a9ca6-1fb4-4f7e-9436-14d015f3dd74 -
-
- - - Implementation Notes: - - - - - - In the absence of this property, the determination of how the OS - is to behave is made by the platform presenting or not presenting FRU - Fault indicators to the OS see chapter - . In the case where there are - no FRUs owned by the partition, the OS will not observe any FRU Fault - indicators assigned, even when the platform is operating in the Lightpath - mode. - - - - Presenting this property does not imply any relaxation of the - requirements spe3cified in chapter - . - - -
@@ -2329,7 +2327,7 @@ “ibm,client-architecture-support” and invoke that method with the - >ibm,??? compatible (wording???) with the Real Base and Real Size constraints of the + >ibm,??? compatible with the Real Base and Real Size constraints of the kernel being loaded. @@ -3267,7 +3265,7 @@ value indicates that the client supports the I/O Super Page Option (Support of >4K I/O pages) (Includes extensions to H_MIGRATE_DMA for >4K I/O pages and >256 xlates). - See . + See . In the ibm,architecture-vec-5 property of the /chosen node, a non-zero value indicates @@ -3287,7 +3285,7 @@ /chosen node, this field represents the implementation dependent number of xlates entries supported per migration operation as: 256 * 2**N. - See . + See . @@ -3302,7 +3300,7 @@ /chosen node, this field represents the implementation dependent number of simultaneous migration options supported as: 2**N. - See . + See . @@ -3364,7 +3362,7 @@ = the “Form value” of the “ibm,associativity” and “ibm,associativity-reference-points” - properties. See for further details. + properties. See for further details. @@ -3399,7 +3397,7 @@ Enable MTT Option See - . + . @@ -3449,7 +3447,7 @@ Enable Hotplug Interrupts - See Hot Plug Events in . + See Hot Plug Events in . @@ -4388,7 +4386,7 @@ token) for the defined indicators and the number of indicators ( maxindex) for that token which are implemented (see - ) on the platform. + ) on the platform. Note: The indicator indices for a given token are numbered 0... maxindex-1. @@ -4410,7 +4408,7 @@ token) for the defined sensors and the number of sensors ( maxindex) for that token which are implemented (see - ) on the platform. + ) on the platform. Note: The sensor indices for a given token are numbered 0 ... maxindex-1. @@ -4933,7 +4931,7 @@ prop-encoded-array: Contains the description of the registered kernel dump in the format described in - . + . 
@@ -4949,7 +4947,7 @@ the first 3 inputs and the first 4 outputs ( Number Inputs is required to be 3 and the Number Outputs is required to be 4), as defined in - . + . prop-encoded-array: Contains a 32 bit cell, with the bits defined as follows: @@ -4958,13 +4956,13 @@ ibm,read-slot-reset-state2 RTAS call checks the Number Outputs and the implements the 5th output ( Number Outputs of 5), as defined by - . + . Bit 31: When a value of 1, the ibm,read-slot-reset-state2 RTAS call implements the first 3 inputs and the first 4 outputs ( Number Inputs of 3 and the Number Outputs of 4), as defined in - . This bit is always required + . This bit is always required to be a value of 1 when this property is implemented. @@ -5035,7 +5033,7 @@ property-name indicating that the platform supports extended ibm,os-term behavior as described in - . + . prop-encoded-array: encode-null @@ -5048,8 +5046,8 @@ This section defines the property names associated with the various RTAS functions defined by - . - should be used as the reference + . + should be used as the reference for RTAS Functions currently implemented. Each RTAS function that a platform implements shall be represented by its own function property, @@ -5076,7 +5074,7 @@ rtas-call interface (see below), invokes the named RTAS function. If a RTAS function is not implemented, there will not be a property corresponding to that function name. See the - for more information about RTAS + for more information about RTAS functions. @@ -5502,7 +5500,7 @@ The first specification shall specify the configured address and size of this PHB’s I/O Space. (I/O Space is shown as “BIOn” to “TIOn” in - "Address Map" section.) The + .) The second specification shall specify the configured address and size of this PHB’s Memory Space. (Memory Space is shown as “BPMn” to “TPMn” in the Common Hardware Reference @@ -5595,7 +5593,7 @@ prop-encoded-array: Integer, encoded as with encode-int. 
This property, when present (for example, see Requirement - ), indicates the maximum DMA + ), indicates the maximum DMA Read completion latency for IOAs under this PHB, in microseconds. For plug-in adapters, the latency value does not include latency of any additional PCI fabric (for example, PCI Express switches) on the plug-in @@ -5974,7 +5972,7 @@ as a token for an additional RTAS call or an architectural level of an extended interface. The value of one indicates that only a single extension is implemented as specified by the second integer in the list. - provides the definition of the + provides the definition of the subsequent integers as defined for the LoPAR level of the DDW option. @@ -7452,7 +7450,7 @@ property name to provide Vital Product Data (VPD) information as defined in - . + . prop-encoded-array: the concatenation, with encode+, of one or more pairs of elements, the first @@ -7831,7 +7829,7 @@
-
+
hot-plug-events The presence of the node indicates that all or some of the function @@ -8367,7 +8365,7 @@ See - for further detail on this + for further detail on this virtual device. @@ -8412,7 +8410,7 @@ “reg” property value. The following properties are the minimum required, optional support such as dynamic reconfiguration will add properties per requirements called out in the - . + . @@ -10643,7 +10641,7 @@ where: power management related information shall be resident in the OF device tree prior to the transfer phase of software operation (see the definition of transfer phase in - ). Dummy devices shall be + ). Dummy devices shall be placed in the device tree for all standard I/O bus connectors which are not in use to provide a node to assign the slot-names, power-domains, and power-sources properties. diff --git a/LoPAR/app_splar.xml b/LoPAR/app_splar.xml index 55294bf..510b809 100644 --- a/LoPAR/app_splar.xml +++ b/LoPAR/app_splar.xml @@ -7,7 +7,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> This appendix defines the string that is returned by the ibm,get-system-parameter RTAS call when the parameter token value of 20 (SPLPAR Characteristics) is specified on the ibm,get-system-parameter RTAS - call as per . + call as per .
SPLPAR Terms diff --git a/LoPAR/ch_address_map.xml b/LoPAR/ch_address_map.xml index 333f6ce..1c3ee5f 100644 --- a/LoPAR/ch_address_map.xml +++ b/LoPAR/ch_address_map.xml @@ -1,95 +1,95 @@ - - 3 Address Map - - The address map of an LoPAR platform is made up of several distinct - areas. These areas are one of five basic types. Each of these types has its own - general characteristics such as coherency, alignment, size restrictions, - variability of starting address and size, the system action on access of the - area, and so on. This chapter gives details on some of those characteristics, - and other chapters define the other characteristics. The variable - characteristics of these areas are reported to the OS via properties in the OF + Address Map + + The address map of an LoPAR platform is made up of several distinct + areas. These areas are one of five basic types. Each of these types has its own + general characteristics such as coherency, alignment, size restrictions, + variability of starting address and size, the system action on access of the + area, and so on. This chapter gives details on some of those characteristics, + and other chapters define the other characteristics. The variable + characteristics of these areas are reported to the OS via properties in the OF device tree. - +
Address Areas - The following is a definition of the five areas and some of their + The following is a definition of the five areas and some of their characteristics: - + - System Memory refers to memory - which forms a coherency domain with respect - to the PA processor(s) that execute application software on a system. See - for details on aspects of coherence. - System Memory Spaces refer to one or more pieces that - together form the System Memory. System Memory areas may be marked with a - special value of the “status” property of - “reserved” which means that this memory is not for general use by - the base OS, but may be reserved for use by OS extensions (see - ). Some System Memory areas may be + System Memory refers to memory + which forms a coherency domain with respect + to the PA processor(s) that execute application software on a system. See + for details on aspects of coherence. + System Memory Spaces refer to one or more pieces that + together form the System Memory. System Memory areas may be marked with a + special value of the “status” property of + “reserved” which means that this memory is not for general use by + the base OS, but may be reserved for use by OS extensions (see + ). Some System Memory areas may be preservable across boots (see ). - + - Peripheral Memory Space refers to a range of real addresses which - are assigned to the Memory Space of a Host Bridge (HB) or System Bus attached - IOA, and which are sufficient to contain all of the Load and Store address - space requirements of all IOAs in the Memory Space of the I/O bus that is - generated by the HB or which are encompassed by the System Bus attached IOA. - The frame buffer of a graphics IOA is an example of a device which may reside - in the Peripheral Memory Space. 
Due to space limitations in the address space - below 4 GB, the HBs of platforms may split this space into two pieces; one to - support the IOAs that need to have their addresses below 4 GB (because they - only support 32-bit addresses) and another to support the IOAs that can have - their addresses above 4 GB (because they support 64-bit addresses). In addition - to a Memory Space, many types of I/O buses have a separate address space called - the I/O Space. An HB which generates such I/O buses must decode another address - range, the Peripheral I/O Space.A - peripheral space may also include a “configuration” address - space. The configuration space is abstracted by a Run-Time Abstraction Service - (for example, see ). + Peripheral Memory Space refers to a range of real addresses which + are assigned to the Memory Space of a Host Bridge (HB) or System Bus attached + IOA, and which are sufficient to contain all of the Load and Store address + space requirements of all IOAs in the Memory Space of the I/O bus that is + generated by the HB or which are encompassed by the System Bus attached IOA. + The frame buffer of a graphics IOA is an example of a device which may reside + in the Peripheral Memory Space. Due to space limitations in the address space + below 4 GB, the HBs of platforms may split this space into two pieces; one to + support the IOAs that need to have their addresses below 4 GB (because they + only support 32-bit addresses) and another to support the IOAs that can have + their addresses above 4 GB (because they support 64-bit addresses). In addition + to a Memory Space, many types of I/O buses have a separate address space called + the I/O Space. An HB which generates such I/O buses must decode another address + range, the Peripheral I/O Space.A + peripheral space may also include a “configuration” address + space. The configuration space is abstracted by a Run-Time Abstraction Service + (for example, see ). 
- + - Peripheral I/O Space refers to a range of real addresses which - are assigned to the I/O Space of an HB or System Bus attached IOA and which are - sufficient to contain all of the Load and Store address space requirements of - all the IOAs in the I/O Space of the I/O bus that is generated by the HB or - which are encompassed by the System Bus IOA. A keyboard controller is an + Peripheral I/O Space refers to a range of real addresses which + are assigned to the I/O Space of an HB or System Bus attached IOA and which are + sufficient to contain all of the Load and Store address space requirements of + all the IOAs in the I/O Space of the I/O bus that is generated by the HB or + which are encompassed by the System Bus IOA. A keyboard controller is an example of an IOA which may require Peripheral I/O Space addresses. - + - System Control Area (SCA) refers to a range of addresses which - contains all reserved addresses (architected or unarchitected) which are not - part of one of the other defined address spaces. For example, the system ROM(s), - unarchitected platform-dependent addresses used by firmware and Run-Time - Abstraction Services for control of the platform, and architected entities like - interrupt controller addresses when those addresses are not in another defined + System Control Area (SCA) refers to a range of addresses which + contains all reserved addresses (architected or unarchitected) which are not + part of one of the other defined address spaces. For example, the system ROM(s), + unarchitected platform-dependent addresses used by firmware and Run-Time + Abstraction Services for control of the platform, and architected entities like + interrupt controller addresses when those addresses are not in another defined address space. - + - Undefined refers to areas that are not one of the above four - areas. The result of accessing one of these areas is defined in as an invalid address error. 
+ Undefined refers to areas that are not one of the above four + areas. The result of accessing one of these areas is defined in as an invalid address error. - - - In addition to the above definitions, it is convenient, relative to - I/O op erations, to define a Partitionable Endpoint. A Partitionable - Endpoint (PE) is an I/O subtree that can be treated as a unit for - the purposes of partitioning and error recovery. A PE may be a single or - multi-function IOA, a function of a multi-function IOA, or multiple IOAs - (possibly including switch and bridge structures above the multiple IOAs). See + + + In addition to the above definitions, it is convenient, relative to + I/O op erations, to define a Partitionable Endpoint. A Partitionable + Endpoint (PE) is an I/O subtree that can be treated as a unit for + the purposes of partitioning and error recovery. A PE may be a single or + multi-function IOA, a function of a multi-function IOA, or multiple IOAs + (possibly including switch and bridge structures above the multiple IOAs). See for more information about PEs. - In describing the characteristics of these various areas, it is - convenient to have a nomenclature for the various boundary addresses. - defines the labels which are used in this - document when describing the various address ranges. Note that - “bottom” refers to the smallest address of the range and + In describing the characteristics of these various areas, it is + convenient to have a nomenclature for the various boundary addresses. + defines the labels which are used in this + document when describing the various address ranges. Note that + “bottom” refers to the smallest address of the range and “top” refers to the largest address. @@ -117,7 +117,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BIOn - Bottom of Peripheral I/O Space for HBn (n=0, 1, 2,...). + Bottom of Peripheral I/O Space for HBn (n=0, 1, 2,...). 
The OF property “ranges” in the OF device tree for HBn contains the value of BIOn. @@ -127,13 +127,13 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TIOn - Top of Peripheral I/O Space for HBn (n=0, 1, 2,...). The - value of TIOn can be determined by adding the size of the area as found in the - OF property “ranges” in the OF device tree - for HBn to the value of BIOn found in that same property and then subtracting + Top of Peripheral I/O Space for HBn (n=0, 1, 2,...). The + value of TIOn can be determined by adding the size of the area as found in the + OF property “ranges” in the OF device tree + for HBn to the value of BIOn found in that same property and then subtracting 1. - This architecture allows at most one Peripheral I/O area - per HB which may be above or below 4 GB. For any given n, BIOn to TIOn cannot + This architecture allows at most one Peripheral I/O area + per HB which may be above or below 4 GB. For any given n, BIOn to TIOn cannot span from the first 4 GB of address space to the second. @@ -142,9 +142,9 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BPMn,m - Bottom of Peripheral Memory Space m (m=0,1) for HBn (n=0, - 1, 2,...), as viewed from the system side of HBn. The OF property - “ranges” in the OF device tree for HBn contains the + Bottom of Peripheral Memory Space m (m=0,1) for HBn (n=0, + 1, 2,...), as viewed from the system side of HBn. The OF property + “ranges” in the OF device tree for HBn contains the value of BPMn,m. @@ -153,11 +153,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BPM’n,m - Bottom of Peripheral Memory Space m (m=0,1) for HBn (n=0, - 1, 2,...), as viewed from the I/O side of the HBn. That is, this is the value - to which BPMn,m gets translated to as it passes through the HB. The OF property - “ranges” in the OF device tree for HBn - contains the value of BPM’n,m. 
BPM’n,m may be equal to BPMn,m or + Bottom of Peripheral Memory Space m (m=0,1) for HBn (n=0, + 1, 2,...), as viewed from the I/O side of the HBn. That is, this is the value + to which BPMn,m gets translated to as it passes through the HB. The OF property + “ranges” in the OF device tree for HBn + contains the value of BPM’n,m. BPM’n,m may be equal to BPMn,m or may not be. @@ -166,17 +166,17 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TPMn,m - Top of Peripheral Memory Space m (m=0,1) for HBn (n=0, 1, - 2,...) as viewed from the system side of HBn. The Peripheral Memory Space - address range is in the OF device tree, as indicated by the - “ranges” property in the node in the OF device tree - for HBn; BPMn,m to TPMn,m. The value of TPMn,m can be determined by adding the - size of the area as found in the OF property - “ranges” in the OF device tree for HBn to the value of + Top of Peripheral Memory Space m (m=0,1) for HBn (n=0, 1, + 2,...) as viewed from the system side of HBn. The Peripheral Memory Space + address range is in the OF device tree, as indicated by the + “ranges” property in the node in the OF device tree + for HBn; BPMn,m to TPMn,m. The value of TPMn,m can be determined by adding the + size of the area as found in the OF property + “ranges” in the OF device tree for HBn to the value of BPMn,m found in that same property and then subtracting 1. - This architecture allows for one or two Peripheral Memory - areas per HB (hence, m=0,1). A Peripheral Memory area may be above 4 GB or - below. For any given n, BPMn,m to TPMn,m cannot span from the first 4 GB of + This architecture allows for one or two Peripheral Memory + areas per HB (hence, m=0,1). A Peripheral Memory area may be above 4 GB or + below. For any given n, BPMn,m to TPMn,m cannot span from the first 4 GB of address space to the second. 
@@ -185,12 +185,12 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TPM’n,m - Top of Peripheral Memory Space m (m=0,1) for HBn (n=0, 1, - 2,...) as viewed from the I/O side of HBn. The value of TPM’n,m can be - calculated from the values in the“ranges” - property as was TPMn,m. In some cases TPM’n,m is required to be equal to - TPMn,m and in some cases it is not required to be equal. For any given n, - BPM’n,m to TPM’n,m cannot span from the first 4 GB of address + Top of Peripheral Memory Space m (m=0,1) for HBn (n=0, 1, + 2,...) as viewed from the I/O side of HBn. The value of TPM’n,m can be + calculated from the values in the“ranges” + property as was TPMn,m. In some cases TPM’n,m is required to be equal to + TPMn,m and in some cases it is not required to be equal. For any given n, + BPM’n,m to TPM’n,m cannot span from the first 4 GB of address space to the second. @@ -199,9 +199,9 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BSCAn - Bottom of System Control Area. Corresponding top of the - System Control Area is TSCAn. This architecture allows for one or two SCAs per - platform. The SCA below 4 GB is at the top (largest addresses) of the lower 4 + Bottom of System Control Area. Corresponding top of the + System Control Area is TSCAn. This architecture allows for one or two SCAs per + platform. The SCA below 4 GB is at the top (largest addresses) of the lower 4 GB range. @@ -210,7 +210,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TSCAn - Top of System Control Area. For any given n, BSCAn to + Top of System Control Area. For any given n, BSCAn to TSCAn cannot span from the first 4 GB of address space to the second. @@ -219,8 +219,8 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BSMn - Bottom of System Memory Space n (n=0, 1, 2,...); BSM0 = 0. 
- The OF property “reg” in the OF device tree + Bottom of System Memory Space n (n=0, 1, 2,...); BSM0 = 0. + The OF property “reg” in the OF device tree for the Memory Controller’s node contains the value of BSMn. @@ -229,9 +229,9 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TSMn - Top of System Memory Space n (n=0, 1, 2,...). The value of - TSMn can be determined by adding the value of BSMn as found in the Memory - Controller’s node of the OF device tree to the value of the size of that + Top of System Memory Space n (n=0, 1, 2,...). The value of + TSMn can be determined by adding the value of BSMn as found in the Memory + Controller’s node of the OF device tree to the value of the size of that area as found in the same property, and then subtracting 1. @@ -240,12 +240,12 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo BTTAn,m - Bottom of TCE Translatable Address space m (m=0, 1, 2,...) - for HBn (n=0, 1, 2,...) as viewed from the I/O side of HBn. This is the bottom - of an address range that is translatable by a Translation Control Entry (TCE) - table. The value of BTTAn,m is obtained from the - “ibm,dma-window” or - “ibm,my-dma-window” property in the OF device + Bottom of TCE Translatable Address space m (m=0, 1, 2,...) + for HBn (n=0, 1, 2,...) as viewed from the I/O side of HBn. This is the bottom + of an address range that is translatable by a Translation Control Entry (TCE) + table. The value of BTTAn,m is obtained from the + “ibm,dma-window” or + “ibm,my-dma-window” property in the OF device tree. @@ -254,100 +254,100 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo TTTAn,m - Top of TCE Translatable Address space m (m=0, 1, 2,...) - for HBn (n=0, 1, 2,...) as viewed from the I/O side of HBn. This is the top of - an address range that is translatable by a TCE table. 
The range BTTAn,m to - TTTAn,m is not accessible by more than one PE for any given “n”. - The value of TTTAn,m can be determined by adding the size of the area as found - in the OF property “ibm,dma-window” or - “ibm,my-dma-window” in the OF device tree - for HBn to the value of BTTAn,m found in that same property and then + Top of TCE Translatable Address space m (m=0, 1, 2,...) + for HBn (n=0, 1, 2,...) as viewed from the I/O side of HBn. This is the top of + an address range that is translatable by a TCE table. The range BTTAn,m to + TTTAn,m is not accessible by more than one PE for any given “n”. + The value of TTTAn,m can be determined by adding the size of the area as found + in the OF property “ibm,dma-window” or + “ibm,my-dma-window” in the OF device tree + for HBn to the value of BTTAn,m found in that same property and then subtracting 1.
- The figures found in , show - examples of the areas referenced by the labels in The figures found in , show + examples of the areas referenced by the labels in . - The OS and other software should not use fixed addresses for these - various areas. A given platform may, however, make some of these addresses - unchangeable. Each of these areas is defined in the OF device tree in the node - of the appropriate controller. This gives platforms the most flexibility in + The OS and other software should not use fixed addresses for these + various areas. A given platform may, however, make some of these addresses + unchangeable. Each of these areas is defined in the OF device tree in the node + of the appropriate controller. This gives platforms the most flexibility in implementing the System Address Map to meet their market requirements. - R2-R2--1. - All unavailable addresses in the Peripheral Memory and Peripheral + All unavailable addresses in the Peripheral Memory and Peripheral I/O Spaces must be conveyed in the OF device tree. - + - A “device_type” of - “reserved” must be used to specify areas which are not + A “device_type” of + “reserved” must be used to specify areas which are not to be used by software and not otherwise reported by OF. - Shadow aliases must be communicated as specified by the + Shadow aliases must be communicated as specified by the appropriate OF bus binding. - + - + - R2-R2--2. - There must not be any address generated by + There must not be any address generated by the system which causes the system to hang. - Hardware Implementation Note: The reason for Requirement - is to reserve address space for registers - used only by the firmware or addresses which are used only by the + Hardware Implementation Note: The reason for Requirement + is to reserve address space for registers + used only by the firmware or addresses which are used only by the hardware.
- +
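Every "top" label in the nomenclature table above (TIOn, TPMn,m, TSMn, TTTAn,m) is derived the same way from an OF (base, size) pair: add the size to the bottom and subtract 1. A small sketch, with made-up window values (not from any real platform):

```python
def top_of_range(bottom: int, size: int) -> int:
    """TIOn, TPMn,m, TSMn or TTTAn,m from an OF (base, size) pair:
    add the size to the bottom and subtract 1."""
    return bottom + size - 1

def spans_4gb_boundary(bottom: int, size: int) -> bool:
    """True if bottom..top crosses from the first 4 GB of address
    space into the second, which the text above forbids for any
    single BIOn..TIOn, BPMn,m..TPMn,m or BSCAn..TSCAn range."""
    return bottom < 2**32 <= top_of_range(bottom, size)

# Hypothetical 64 KB Peripheral I/O Space with BIO = 0x0010_0000:
assert top_of_range(0x0010_0000, 0x1_0000) == 0x0010_FFFF
assert not spans_4gb_boundary(0x0010_0000, 0x1_0000)
assert spans_4gb_boundary(0xFFFF_0000, 0x2_0000)   # would straddle 4 GB
```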
Address Decoding (or Validating) and Translation - In general, different components in the hardware are going to decode - the address ranges for the various areas. In some cases the component may be - required to translate the address to a new address as it passes through the - component. The requirements, below, describe the various system address decodes - (or validating) and, where appropriate, what address transforms take place + In general, different components in the hardware are going to decode + the address ranges for the various areas. In some cases the component may be + required to translate the address to a new address as it passes through the + component. The requirements, below, describe the various system address decodes + (or validating) and, where appropriate, what address transforms take place outside of the processor. - The HB requirements in this section refer to HBs which are defined by - this architecture. Currently, there is only one HB defined by this - architecture, and that is the PHB. HBs which implement I/O buses other than - those defined by this architecture may or may not require changes to this + The HB requirements in this section refer to HBs which are defined by + this architecture. Currently, there is only one HB defined by this + architecture, and that is the PHB. HBs which implement I/O buses other than + those defined by this architecture may or may not require changes to this addressing model. - The reader may want to reference the example address maps found in - , while reading through the + The reader may want to reference the example address maps found in + , while reading through the requirements of this section.
- <emphasis>Load</emphasis> and <emphasis>Store</emphasis> + <title> <emphasis>Load</emphasis> and <emphasis>Store</emphasis> Address Decoding and Translation - Load and Store - operations may be targeted at System Memory or I/O. The latter is called Memory + Load and Store + operations may be targeted at System Memory or I/O. The latter is called Memory Mapped I/O (MMIO). - R2-R2--1. - Processor Load and Store - operations must be routed and translated as shown in Processor Load and Store + operations must be routed and translated as shown in . @@ -363,19 +363,19 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - Address Range at Processor + Address Range at Processor Bus - Route and Translation + Route and Translation Requirements - Other Requirements and + Other Requirements and Comments @@ -388,11 +388,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo (n=0, 1) - To ROM controller or to a platform dependent area. + To ROM controller or to a platform dependent area. Translation dependent on implementation. - Areas other than ROM are reserved for firmware use, or + Areas other than ROM are reserved for firmware use, or have their address passed by the OF device tree. @@ -402,9 +402,9 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo (n=0,1, 2,...) - Send through the HB to the I/O space of the I/O bus, - translating by subtracting the value of BIO from each address in this range - (that is, translate BIO to TIO to be at 0 to (TIO - BIO) on the I/O + Send through the HB to the I/O space of the I/O bus, + translating by subtracting the value of BIO from each address in this range + (that is, translate BIO to TIO to be at 0 to (TIO - BIO) on the I/O side). @@ -421,25 +421,25 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Send through HBn to the Memory Space of the I/O bus. 
- If BPMn,m < 4 GB, do not translate an address in the - BPMn,m to TPMn,m range as the transaction passes through the bridge (that is, + If BPMn,m < 4 GB, do not translate an address in the + BPMn,m to TPMn,m range as the transaction passes through the bridge (that is, BPM’n,m = BPMn,m and TPM’n,m = TPMn,m). - If BPMn,m is at or above 4 GB then if BPM’n,m is - to be below 4 GB (for 32-bit IOAs) then translate addresses in the BPMn,m to - TPMn,m range so that this address range becomes BPM’n,m to - TPM’n,m (where BPM’n,m and TPM’n,m are less than 4 GB) as - the transaction passes through the bridge, otherwise do not translate an - address in the BPMn,m to TPMn,m range as the transaction passes through the + If BPMn,m is at or above 4 GB then if BPM’n,m is + to be below 4 GB (for 32-bit IOAs) then translate addresses in the BPMn,m to + TPMn,m range so that this address range becomes BPM’n,m to + TPM’n,m (where BPM’n,m and TPM’n,m are less than 4 GB) as + the transaction passes through the bridge, otherwise do not translate an + address in the BPMn,m to TPMn,m range as the transaction passes through the bridge (for 64-bit IOAs which are configured at or above 4 GB). - Platforms that need to support both 32-bit capable and - 64-bit capable IOAs and do not want to configure the 64-bit capable IOAs below + Platforms that need to support both 32-bit capable and + 64-bit capable IOAs and do not want to configure the 64-bit capable IOAs below 4 GB need to support two Peripheral Memory spaces per HB. @@ -471,7 +471,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo All other addresses - See . + See . Access is to undefined space. @@ -483,211 +483,211 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - R2-R2--2. 
- There must be no architected address - spaces (Peripheral Memory, Peripheral I/O, SCA, or System Memory) which span + There must be no architected address + spaces (Peripheral Memory, Peripheral I/O, SCA, or System Memory) which span the (4GB - 1) to 4 GB boundary. - + - R2-R2--3. The following are the System Control Area requirements: - + - The platform must have at most one System Control Area below 4 GB + The platform must have at most one System Control Area below 4 GB and at most one per platform or per NUMA node at or above 4 GB. - The System Control Area must not overlap with the System Memory - Space(s), Peripheral Memory Space(s), or the Peripheral I/O Space(s) in the + The System Control Area must not overlap with the System Memory + Space(s), Peripheral Memory Space(s), or the Peripheral I/O Space(s) in the platform. - + - + - R2-R2--4. The following are the System Memory Space requirements: - + Each platform must have at least one System Memory Space. - The System Memory Space(s) must not overlap with the Peripheral I/O - Space(s), Peripheral Memory Space(s), the System Control Area, or other System + The System Memory Space(s) must not overlap with the Peripheral I/O + Space(s), Peripheral Memory Space(s), the System Control Area, or other System Memory Space(s) in the platform. - The first System Memory Space must start at address 0 (BSM0 = 0), - must be at least 128 MB before a second System Memory Space is added and must + The first System Memory Space must start at address 0 (BSM0 = 0), + must be at least 128 MB before a second System Memory Space is added and must be contiguous. - Each of the additional (optional) System Memory Space(s) + Each of the additional (optional) System Memory Space(s) must start on a 4 KB boundary. - Each of the additional (optional) System Memory Space(s) + Each of the additional (optional) System Memory Space(s) must be contiguous within itself. 
- There must be at most eight System Memory Spaces below BSCA0 and + There must be at most eight System Memory Spaces below BSCA0 and at most eight at or above 4 GB. - If multiple System Memory Spaces exist below 4 GB, then they - must not have any Peripheral Memory or Peripheral I/O Spaces interspersed - between them and if multiple System Memory Spaces exist above 4 GB, then they - must not have any Peripheral Memory or Peripheral I/O Spaces interspersed + If multiple System Memory Spaces exist below 4 GB, then they + must not have any Peripheral Memory or Peripheral I/O Spaces interspersed + between them and if multiple System Memory Spaces exist above 4 GB, then they + must not have any Peripheral Memory or Peripheral I/O Spaces interspersed between them. - + - + - R2-R2--5. The following are the Peripheral Memory Space requirements: - + - The Peripheral Memory Space(s) must not overlap with the System - Memory Space(s), Peripheral I/O Space(s), the System Control Area, or other + The Peripheral Memory Space(s) must not overlap with the System + Memory Space(s), Peripheral I/O Space(s), the System Control Area, or other Peripheral Memory Space(s) in the platform. 
- The size of each - Peripheral Memory Space (TPMn,m - BPMn,m + 1) must be a power of two for sizes - up to and including 256 MB, with the minimum size being 1 MB, and an integer - multiple of 256 MB plus a power of two which is greater than or equal to 1 MB - for sizes greater than 256 MB (for example, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 - MB, 64 MB, 128 MB, 256 MB, (256 + 1) MB, (256 + 2) MB,..., (512 + 1) + The size of each + Peripheral Memory Space (TPMn,m - BPMn,m + 1) must be a power of two for sizes + up to and including 256 MB, with the minimum size being 1 MB, and an integer + multiple of 256 MB plus a power of two which is greater than or equal to 1 MB + for sizes greater than 256 MB (for example, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 + MB, 64 MB, 128 MB, 256 MB, (256 + 1) MB, (256 + 2) MB,..., (512 + 1) MB,...). - The boundary alignment for each Peripheral Memory Space must be an integer multiple - of the size of the space up to and including 256 MB and must be an integer + The boundary alignment for each Peripheral Memory Space must be an integer multiple + of the size of the space up to and including 256 MB and must be an integer multiple of 256 MB for sizes greater than 256 MB. - There must be at most two Peripheral Memory Spaces per HB. + There must be at most two Peripheral Memory Spaces per HB. - If the Peripheral Memory Space for a HB is below 4 GB, then the - address must not be translated as it passes through the HB from the system side - to the I/O side of the HB (see If the Peripheral Memory Space for a HB is below 4 GB, then the + address must not be translated as it passes through the HB from the system side + to the I/O side of the HB (see ). 
- If the Peripheral Memory Space for a HB is above 4 GB, then the - address may or may not be translated as it passes through the HB from the - system side to the I/O side of the HB, but if it is translated, then the - translated address range must be aligned on a boundary which is an integer + If the Peripheral Memory Space for a HB is above 4 GB, then the + address may or may not be translated as it passes through the HB from the + system side to the I/O side of the HB, but if it is translated, then the + translated address range must be aligned on a boundary which is an integer multiple of the size of the Peripheral Memory Space. - + - Implementation Note: Relative to Requirement - , not all OSs can support BPM’ to + Implementation Note: Relative to Requirement + , not all OSs can support BPM’ to TPM’ being above 4 GB. - R2-R2--6. The following are the Peripheral I/O Space requirements: - + - The Peripheral I/O Space(s) must not overlap with the System - Memory Space(s), Peripheral Memory Space(s), the System Control Area, or other + The Peripheral I/O Space(s) must not overlap with the System + Memory Space(s), Peripheral Memory Space(s), the System Control Area, or other Peripheral I/O Space(s) in the platform. - The size of each Peripheral I/O Space (TIOn - BIOn + 1) must be a - power of two with the minimum size being 64 KB (that is, sizes of 64 KB, 128 - KB, 256 KB, 512 KB, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 MB, 64 MB, and so on, are + The size of each Peripheral I/O Space (TIOn - BIOn + 1) must be a + power of two with the minimum size being 64 KB (that is, sizes of 64 KB, 128 + KB, 256 KB, 512 KB, 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, 32 MB, 64 MB, and so on, are acceptable). - The boundary alignment for each Peripheral I/O Space must be an integer multiple of the size of + The boundary alignment for each Peripheral I/O Space must be an integer multiple of the size of the space. 
- There must be at most one Peripheral I/O Space per + There must be at most one Peripheral I/O Space per HB. - + - + - R2-R2--7. - All System Memory must be - accessible via DMA operation from all IOAs in the system, except where LPAR - requirements limit accessibility of an IOA belonging to one partition to the + All System Memory must be + accessible via DMA operation from all IOAs in the system, except where LPAR + requirements limit accessibility of an IOA belonging to one partition to the System Memory of another partition. - Hardware Implementation Notes: Memory controller and memory card - designers who are designing for 64-bit platforms should be careful to consider - that the amount of I/O space below 4 GB is reduced by the amount of System - Memory space below 4 GB. Therefore it may be prudent to design the hardware to - allow minimization of the amount of System Memory below 4 GB, in order to allow - maximization of the space for 32-bit Peripheral Memory and Peripheral I/O + Hardware Implementation Notes: Memory controller and memory card + designers who are designing for 64-bit platforms should be careful to consider + that the amount of I/O space below 4 GB is reduced by the amount of System + Memory space below 4 GB. Therefore it may be prudent to design the hardware to + allow minimization of the amount of System Memory below 4 GB, in order to allow + maximization of the space for 32-bit Peripheral Memory and Peripheral I/O spaces below 4 GB. - The beginning addresses and sizes of the Peripheral I/O Space(s) - and Peripheral - Memory Space(s), are controlled by firmware. Information about the address map - is reported by the OF Device Tree or, for items that can change, through RTAS - calls (for example, for Dynamic Reconfiguration, through the + The beginning addresses and sizes of the Peripheral I/O Space(s) + and Peripheral + Memory Space(s), are controlled by firmware. 
Information about the address map + is reported by the OF Device Tree or, for items that can change, through RTAS + calls (for example, for Dynamic Reconfiguration, through the ibm,configure-connector RTAS call). - Certain System Memory addresses must be reserved in all systems for - specific uses (see and Certain System Memory addresses must be reserved in all systems for + specific uses (see and for more information).
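The Peripheral I/O Space decode described above (subtract BIO so the BIO-to-TIO range lands at 0 to (TIO - BIO) on the I/O side of the HB) can be sketched as follows. This is a minimal illustration, not platform code: the base and top values here are invented, since real platforms report them through the OF device tree as noted above.

```python
# Hypothetical sketch of the Peripheral I/O Space decode/translate step: a
# processor-side address in BIOn..TIOn passes through HBn with BIOn
# subtracted, so the I/O side sees 0..(TIOn - BIOn). Constants are invented
# for illustration only.

GB = 1 << 30

def hb_translate_io(addr, bio, tio):
    """Translate a processor-side Peripheral I/O address to the I/O bus side."""
    if not (bio <= addr <= tio):
        raise ValueError("not in this HB's Peripheral I/O Space")
    return addr - bio  # BIO..TIO becomes 0..(TIO - BIO) on the I/O side

# Assumed example: a 64 KB Peripheral I/O Space (the minimum size allowed by
# the requirements above) at an arbitrary 64 KB-aligned base.
BIO = 2 * GB
TIO = BIO + (64 << 10) - 1
assert hb_translate_io(BIO, BIO, TIO) == 0
assert hb_translate_io(BIO + 0x100, BIO, TIO) == 0x100
```

Note that the size and alignment rules above (power-of-two size, boundary an integer multiple of the size) guarantee the subtraction is exact and the translated range starts at 0.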
DMA Address Validation and Translation - is a representation of - how the validation and translation mechanism works, along with a description of - the steps which are involved. At the core of the translation mechanism is the + is a representation of + how the validation and translation mechanism works, along with a description of + the steps which are involved. At the core of the translation mechanism is the Translation and Control Entry (TCE) table.
@@ -697,7 +697,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - @@ -708,20 +708,20 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - R2-R2--1. - Upon receiving a DMA - transaction to the Memory Space of an I/O bus, the HB must perform the - validation and translation steps, as indicated in - and in + Upon receiving a DMA + transaction to the Memory Space of an I/O bus, the HB must perform the + validation and translation steps, as indicated in + and in . - + - DMA Address Decoding and Translation (I/O Bus Memory + <title>DMA Address Decoding and Translation (I/O Bus Memory Space) @@ -731,19 +731,19 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - Address Range at I/O Side of + Address Range at I/O Side of HBn - Route and Translation + Route and Translation Requirements - Other Requirements and + Other Requirements and Comments @@ -758,8 +758,8 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo (note 1) - HB does not respond or responds and signals an invalid - address error (See ). + HB does not respond or responds and signals an invalid + address error (See ).   @@ -773,11 +773,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo (note 1) - If the PE that is trying to access this space is allowed - to access this space, then translate via the TCE table (as specified in ) and pass the translated address through - the HB, otherwise generate an invalid address or TCE extent error, as - appropriate (See ). + If the PE that is trying to access this space is allowed + to access this space, then translate via the TCE table (as specified in ) and pass the translated address through + the HB, otherwise generate an invalid address or TCE extent error, as + appropriate (See ). 
See Notes 2, 3 @@ -788,7 +788,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo All other addresses - Generate an invalid address error (See ). + Generate an invalid address error (See ). See Note 3 @@ -797,26 +797,26 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo Notes: - + - n = # of HB Viewing or Receiving the Operation, m = # + n = # of HB Viewing or Receiving the Operation, m = # of instance within the HB. - After translation of the address, if the translated - address would re-access the same HB or another HB (for example, is in the - Peripheral Memory Space or Peripheral I/O Space of that HB or another HB), then - the HB generates an invalid address error (See ). + After translation of the address, if the translated + address would re-access the same HB or another HB (for example, is in the + Peripheral Memory Space or Peripheral I/O Space of that HB or another HB), then + the HB generates an invalid address error (See ). - If the Enhanced I/O Error Handling (EEH) option is - implemented and enabled, then on an error, the PE will enter the DMA Stopped + If the Enhanced I/O Error Handling (EEH) option is + implemented and enabled, then on an error, the PE will enter the DMA Stopped State (See ). - + @@ -826,87 +826,87 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo - R2-R2--2. - An HB must not act as a target + An HB must not act as a target for operations in the I/O Space of an I/O bus.
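The inbound (DMA) decode table above reduces to a range check against the TCE-mapped window before any TCE lookup takes place. The sketch below is illustrative only: the window bounds, the `pe_allowed` flag standing in for the PE access check, and the result strings are all invented for this example.

```python
# Hypothetical sketch of the DMA address validation flow: an inbound address
# on the I/O side of the HB either falls in the TCE-mapped window starting at
# BTTAn (translate via the TCE table, if the PE may access it) or produces an
# error. Names and bounds are invented for illustration.

def hb_validate_dma(bus_addr, btta, window_size, pe_allowed):
    """Classify an inbound DMA address per the decode table above."""
    if btta <= bus_addr < btta + window_size:
        if not pe_allowed:
            # PE not permitted: invalid address or TCE extent error
            return "invalid address or TCE extent error"
        return "translate via TCE table"
    # Outside the mapped window: HB does not respond or signals an error
    return "invalid address error"

BTTA = 0x8000_0000          # assumed window base, for illustration
WINDOW = 1 << 20            # assumed 1 MB TCE-mapped window
assert hb_validate_dma(BTTA + 0x400, BTTA, WINDOW, True) == "translate via TCE table"
assert hb_validate_dma(0x0, BTTA, WINDOW, True) == "invalid address error"
```

Per note 3 above, when EEH is implemented and enabled, either error outcome also puts the PE into the DMA Stopped State.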
- DMA Address Translation and Control via the TCE + <title> DMA Address Translation and Control via the TCE Mechanism - This architecture defines a Translation and Control Entry (TCE) - mechanism for translating and controlling DMA addresses. There are several + This architecture defines a Translation and Control Entry (TCE) + mechanism for translating and controlling DMA addresses. There are several reasons for doing such translations, including: - + - To provide a mechanism for increasing the number of addressing - bits for some IOAs. For example, IOAs which are only capable of accessing up to - 4 GB via DMA need a way to access above that limit when used in 64-bit + To provide a mechanism for increasing the number of addressing + bits for some IOAs. For example, IOAs which are only capable of accessing up to + 4 GB via DMA need a way to access above that limit when used in 64-bit addressing systems and the addressing requirements go beyond 4 GB. - + - To provide a redirection mechanism. A redirection mechanism is - needed, even for 64-bit addressing capable IOAs, in order to provide the + To provide a redirection mechanism. A redirection mechanism is + needed, even for 64-bit addressing capable IOAs, in order to provide the protection and indirection benefits provided by such a translation. - - - The description of how the access to the TCE table occurs, for the - translation of a 32-bit address and using a 4 KB I/O page size, follows. The - most significant 20 bits of the address (for example, AD[31:12], for PCI) is - used as an offset into the TCE table for the PE to select the TCE. Thus, the - first TCE maps the addresses BTTAn to BTTAn + 0x00000FFF of the Memory Space of - the I/O bus; the second entry controls translation of addresses BTTAn + - 0x00001000 to BTTAn + 0x00001FFF, and so on. The translated real system address - is generated as follows. 
The Real Page Number (RPN) from the TCE replaces the - 20 most significant bits of the address from the I/O bus. The least significant - 12 bits from the I/O bus address are used as-is for the least significant 12 + + + The description of how the access to the TCE table occurs, for the + translation of a 32-bit address and using a 4 KB I/O page size, follows. The + most significant 20 bits of the address (for example, AD[31:12], for PCI) is + used as an offset into the TCE table for the PE to select the TCE. Thus, the + first TCE maps the addresses BTTAn to BTTAn + 0x00000FFF of the Memory Space of + the I/O bus; the second entry controls translation of addresses BTTAn + + 0x00001000 to BTTAn + 0x00001FFF, and so on. The translated real system address + is generated as follows. The Real Page Number (RPN) from the TCE replaces the + 20 most significant bits of the address from the I/O bus. The least significant + 12 bits from the I/O bus address are used as-is for the least significant 12 bits of the new address. - Thus, the TCE table entries have a one-to-one correspondence with - the first n pages of the Memory Space of the I/O bus starting at BTTAn that - corresponds to the TCE table. The size of the Memory address space of the I/O - bus that can be mapped to the system address space for a particular HB depends - on how much System Memory is allocated to the TCE table(s) and on how much - mappable I/O bus Memory Space is unavailable due to IOAs which are mapped + Thus, the TCE table entries have a one-to-one correspondence with + the first n pages of the Memory Space of the I/O bus starting at BTTAn that + corresponds to the TCE table. The size of the Memory address space of the I/O + bus that can be mapped to the system address space for a particular HB depends + on how much System Memory is allocated to the TCE table(s) and on how much + mappable I/O bus Memory Space is unavailable due to IOAs which are mapped there. - Each TCE also contains two control bits. 
These are used to identify - whether that page is mapped to the system address space, and if the page is - mapped, whether it is mapped read/write, read only, or write only. See the - for a definition of these control + Each TCE also contains two control bits. These are used to identify + whether that page is mapped to the system address space, and if the page is + mapped, whether it is mapped read/write, read only, or write only. See the + for a definition of these control bits. - The TCE table is the analogue of the system translation tables. - However, unlike the system translation tables, the dynamic page faulting of - memory during an I/O operation is not required (the page fault value, 0b00, in - the TCE Page Mapping and Control field is used for error detection; that is, - access to an invalid TCE by the I/O creates an error indication to the + The TCE table is the analogue of the system translation tables. + However, unlike the system translation tables, the dynamic page faulting of + memory during an I/O operation is not required (the page fault value, 0b00, in + the TCE Page Mapping and Control field is used for error detection; that is, + access to an invalid TCE by the I/O creates an error indication to the software). - The size and location of the HB’s TCE table is set up and + The size and location of the HB’s TCE table is set up and changed only by the firmware. - R2-R2--1. - The platform must provide the - “64-bit-addressing” and - “ibm,extended-address” OF properties in all HB nodes - of the device tree and the “ibm,extended-address” + The platform must provide the + “64-bit-addressing” and + “ibm,extended-address” OF properties in all HB nodes + of the device tree and the “ibm,extended-address” OF property in the root node of the OF device tree. - + - R2-R2--2. - The bits of the TCE must be implemented + The bits of the TCE must be implemented as defined in . 
@@ -938,11 +938,11 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo 0 to 51 - RPN: If the page mapping and control field of the TCE - indicate anything other than page fault, then these bits contain the Real Page - Number (RPN) to which the bus address is mapped in the system address space. In - certain HB implementations, all of these bits may not be required, however - enough bits must be implemented to match the largest real address in the + RPN: If the page mapping and control field of the TCE + indicate anything other than page fault, then these bits contain the Real Page + Number (RPN) to which the bus address is mapped in the system address space. In + certain HB implementations, all of these bits may not be required, however + enough bits must be implemented to match the largest real address in the platform. @@ -959,151 +959,151 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo 62 to 63 - Page Mapping and Control: These bits define page mapping + Page Mapping and Control: These bits define page mapping and read-write authority. They are coded as follows: 00 Page fault (no access) 01 System address space (read only) 10 System address space (write only) 11 System address space (read/write) - Code point 0b00 signifies that the page is not mapped. - It must be used to indicate a page fault error. Hardware must not change its - state based on the value in the remaining bits of a TCE when code point 0b00 is + Code point 0b00 signifies that the page is not mapped. + It must be used to indicate a page fault error. Hardware must not change its + state based on the value in the remaining bits of a TCE when code point 0b00 is set in this field of the TCE. - For accesses to system address space with an invalid - operation (write to a read-only page or read to a write-only page), the HB - generates an error. 
See for more information + For accesses to system address space with an invalid + operation (write to a read-only page or read to a write-only page), the HB + generates an error. See for more information about error handling.
- + - R2-R2--3. - If the address that the HB would use to - access the TCE table (in order to get the TCE) would access outside of the TCE - table, then the HB must create a TCE extent error (See ). + If the address that the HB would use to + access the TCE table (in order to get the TCE) would access outside of the TCE + table, then the HB must create a TCE extent error (See ). - + - R2-R2--4. - Enough bits must be implemented in the + Enough bits must be implemented in the TCE so that DMA IOAs are able to access all System Memory addresses. - + - R2-R2--5. - Each PE must have its own independent + Each PE must have its own independent TCE table. - + - R2-R2--6. - Any non-recoverable error while an HB - is accessing its TCE table must result in a TCE access error; the action to be - taken by the HB being defined under the TCE access error in . + Any non-recoverable error while an HB + is accessing its TCE table must result in a TCE access error; the action to be + taken by the HB being defined under the TCE access error in . - + - R2-R2--7. - In implementations which cache TCEs, if - software changes a TCE, then the platform must perform the following steps: - First, if any data associated with the page represented by that TCE is in an - I/O bridge cache or buffer, the hardware must write the data, if modified, to - System Memory. Secondly, it must invalidate the data in the cache. Finally, it + In implementations which cache TCEs, if + software changes a TCE, then the platform must perform the following steps: + First, if any data associated with the page represented by that TCE is in an + I/O bridge cache or buffer, the hardware must write the data, if modified, to + System Memory. Secondly, it must invalidate the data in the cache. Finally, it must invalidate the TCE in the cache. - + - R2-R2--8. - Neither an IOA nor an HB must ever + Neither an IOA nor an HB must ever modify a TCE. - + - R2-R2--9. 
- If the page mapping and control bits in - the TCE are set to 0b00, the hardware must not change its state based on the + If the page mapping and control bits in + the TCE are set to 0b00, the hardware must not change its state based on the values of the remaining bits of the TCE. - + - R2-R2--10. - The OS must initialize all its TCEs + The OS must initialize all its TCEs upon receiving control from the platform. - +
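Putting this section together, the TCE lookup for a 32-bit bus address with a 4 KB I/O page size, including the bits 62:63 Page Mapping and Control check, can be sketched as below. The dict-based table is a stand-in for the real TCE table, which lives in System Memory and is set up only by firmware; the error strings are invented for illustration.

```python
# Illustrative-only model of TCE translation: the most significant 20 bits of
# a 32-bit bus address select a TCE (relative to BTTAn), the TCE's RPN
# replaces those bits, and the low 12 bits pass through unchanged. Bits 62:63
# of the TCE gate the access. The dict "table" is a stand-in for the real
# TCE table in System Memory.

PAGE_FAULT, READ_ONLY, WRITE_ONLY, READ_WRITE = 0b00, 0b01, 0b10, 0b11

def tce_translate(bus_addr, btta, tce_table, is_write):
    index = (bus_addr - btta) >> 12              # 4 KB I/O page size
    if index not in tce_table:
        raise RuntimeError("TCE extent error")   # access outside the TCE table
    rpn, control = tce_table[index]
    if control == PAGE_FAULT:
        raise RuntimeError("page fault (page not mapped)")
    if (is_write and control == READ_ONLY) or (not is_write and control == WRITE_ONLY):
        raise RuntimeError("invalid operation for page mapping")
    return (rpn << 12) | (bus_addr & 0xFFF)      # RPN replaces the upper 20 bits

# Assumed example table: page 0 mapped read/write, page 1 read only.
table = {0: (0x12345, READ_WRITE), 1: (0x00042, READ_ONLY)}
assert tce_translate(0x1ABC, 0x0, table, is_write=False) == 0x42ABC
```

A write to page 1 of this example table raises the invalid-operation error, matching the HB behavior required above for a write to a read-only page.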
Example Address Maps - shows how to construct a - simple address map with one PHB and with Peripheral Memory, Peripheral I/O, and + shows how to construct a + simple address map with one PHB and with Peripheral Memory, Peripheral I/O, and SCA spaces below 4 GB. - shows how to construct - the address map with Peripheral Memory, Peripheral I/O, and SCA spaces above 4 - GB. This configuration allows some overlap of the System Memory space and - 32-bit I/O bus memory space (with the resulting loss of the TCE table in the - overlap), while moving some of the SCA spaces above 4 GB. Several things can be + shows how to construct + the address map with Peripheral Memory, Peripheral I/O, and SCA spaces above 4 + GB. This configuration allows some overlap of the System Memory space and + 32-bit I/O bus memory space (with the resulting loss of the TCE table in the + overlap), while moving some of the SCA spaces above 4 GB. Several things can be noted from this configuration: - + - I/O bus memory areas can overlap System Memory addresses (see - memory space of PHB0). However, significant overlap of these I/O bus memory - areas and the TCE table may significantly reduce the amount of TCE table space - that is available for mapping I/O memory space to system address space (a + I/O bus memory areas can overlap System Memory addresses (see + memory space of PHB0). However, significant overlap of these I/O bus memory + areas and the TCE table may significantly reduce the amount of TCE table space + that is available for mapping I/O memory space to system address space (a potential performance impact). - + - The System Memory which is above 4GB is shown starting at 4GB. - This architecture also allows this to be pushed further up, with Peripheral - Memory, Peripheral I/O, and SCAs existing above 4 GB and below the System + The System Memory which is above 4GB is shown starting at 4GB. 
+ This architecture also allows this to be pushed further up, with Peripheral + Memory, Peripheral I/O, and SCAs existing above 4 GB and below the System Memory areas. - + - BPM’n,m to TPM’n,m spaces for different PHBs - (different “n”) are allowed to occur at the same memory addresses - in the various memory spaces of different I/O buses, but are not required to do - so (and are not shown as the same in the figure). Implementations are likely - have BPM’n,m to TPM’n,m at the same address range for all - “n” when the BPM’n,m to TPM’n,m ranges are below 4 + BPM’n,m to TPM’n,m spaces for different PHBs + (different “n”) are allowed to occur at the same memory addresses + in the various memory spaces of different I/O buses, but are not required to do + so (and are not shown as the same in the figure). Implementations are likely + to have BPM’n,m to TPM’n,m at the same address range for all + “n” when the BPM’n,m to TPM’n,m ranges are below 4 GB. - +
- Example Address Map: One PHB, Peripheral Memory and Peripheral + <title>Example Address Map: One PHB, Peripheral Memory and Peripheral I/O Spaces below 4 GB @@ -1114,9 +1114,9 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo
- +
- Example Address Map: Four PHBs, all Peripheral Memory and + <title>Example Address Map: Four PHBs, all Peripheral Memory and Peripheral I/O Spaces above 4GB diff --git a/LoPAR/ch_dynamic_reconfig.xml b/LoPAR/ch_dynamic_reconfig.xml index 5c93900..6c67c91 100644 --- a/LoPAR/ch_dynamic_reconfig.xml +++ b/LoPAR/ch_dynamic_reconfig.xml @@ -1,6 +1,9 @@ - Dynamic Reconfiguration (DR) Architecture + xmlns:xl="http://www.w3.org/1999/xlink" + version="5.0" + xml:lang="en" + xml:id="dbdoclet.50569342_75822"> + Dynamic Reconfiguration (DR) Architecture Dynamic Reconfiguration (DR) is the capability of a system to adapt to changes in the hardware/firmware physical or logical configuration, and to be @@ -283,7 +286,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> A DR entity which does not have to be physically plugged or unplugged during a DR operation on that entity. See - for a list of the supported + for a list of the supported Logical DR types. @@ -317,7 +320,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> A DR entity which may need to be physically plugged or unplugged during a DR operation on that entity. See - for a list of the supported + for a list of the supported physical DR types. @@ -448,7 +451,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> these operations. In this case, the OS may ignore those errors if the operation is a DLPAR to remove the hardware. See also the “ibm,ignore-hp-po-fails-for-dlpar” - property in . + property in . @@ -1127,7 +1130,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> connector an index to be passed between the OS and RTAS to identify the DR connector to be operated upon. This property is in the parent node of the DR connector to which the property applies. See - for the definition of this + for the definition of this property. See for additional information. 
@@ -1151,7 +1154,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> entry in the “ibm,drc-indexes” property for that connector. This property is used for correlation purposes. See - for the definition of this + for the definition of this property. @@ -1172,7 +1175,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> <emphasis role="bold"><literal>“ibm,drc-names”</literal></emphasis> Property This property is added for the DR option to specify for each DR connector a user-readable location code for the connector. See - for the definition of this + for the definition of this property. See for additional information. @@ -1299,7 +1302,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> <emphasis role="bold"><literal>“ibm,drc-power-domains”</literal></emphasis> Property This property is added for the DR option to specify for each DR connector the power domain in which the connector resides. See - for the definition of this + for the definition of this property. See for additional information. @@ -1339,7 +1342,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> <emphasis role="bold"><literal>“ibm,drc-types”</literal></emphasis> Property This property is added for the DR option to specify for each DR connector a user-readable connector type for the connector. See - for the definition of this + for the definition of this property. See for additional information. Architecture Note: The logical connectors (CPU, MEM @@ -1369,7 +1372,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> <emphasis role="bold"><literal>“ibm,phandle”</literal></emphasis> Property This property is added for the DR option to specify the phandle for each OF device tree node returned by ibm,configure-connector. See - for the definition of this + for the definition of this property. @@ -1473,7 +1476,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
<emphasis>set-power-level</emphasis> This RTAS call is defined in - . Several additional requirements are placed + . Several additional requirements are placed on this call when the platform implements DR along with PM. This RTAS call is used in DR to power up or power down a DR connector, if necessary (that is, if there is a non-zero power domain @@ -1497,7 +1500,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> For all DR options: the set-power-level RTAS call must be implemented as specified in - and the further requirements of this DR + and the further requirements of this DR option. @@ -1859,7 +1862,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
<emphasis>set-indicator</emphasis> This RTAS call is defined as shown in - . This RTAS call is used in DR to transition + . This RTAS call is used in DR to transition between isolation states and allocation states, and to control DR indicators. In some cases, a state transition fails due to various conditions; however, a null transition (commanding that the new state be what it @@ -2015,7 +2018,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> connector, then they are used to indicate the state of the DR connector to the user. Usage of these states is as defined in and - . + . @@ -2074,7 +2077,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> set-indicator call must return a -2 status, or optionally for indicator type 9001 the 990x status, for each call until the operation is complete; where the 990x status is defined in - . + . @@ -2512,7 +2515,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> requirement, in order to provide for a consistent user interface across platforms. Information on implementation dependent aspects of the DR indicators can be found in - . + . @@ -2557,7 +2560,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> power off at the connector, then the caller of set-indicator must turn power off prior to setting the indicator to this state. See also - . + . @@ -2569,7 +2572,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> identify the physical location of the DR connector. This state may map to the same visual state (for example, blink rate) as the Action state, or may map to a different state. See also - . + . @@ -2581,7 +2584,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> the user is to perform the current DR operation. This state may map to the same visual state (for example, blink rate) as the Identify state, or may map to a different state. See also - . + .
@@ -2591,7 +2594,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> The DR connector is active and entity removal may disrupt system operation. See also - . + . @@ -2671,7 +2674,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> When bringing a DR entity online that utilizes TCEs (see - ), the OS must initialize the DR + ), the OS must initialize the DR entity's TCEs. @@ -2743,7 +2746,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> The other indicator must be amber and must be controllable by RTAS, separately from all other indicators, and must be used as a slot Identify indicator, as defined in - . + . @@ -3295,7 +3298,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> Always present for sub-systems and for PCI IOAs which follow the PCI VPD proposed standard. See - and note to see the effect of + and note to see the effect of using different PCI versions. @@ -3735,7 +3738,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> Shall be one of the values “CPU”, “MEM”, “PHB”, or “SLOT” as defined in - . + . @@ -3865,7 +3868,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> “ibm,lrdr-capacity” property must be included in the /rtas node of the partition device tree (see - ). + ). diff --git a/LoPAR/ch_interrupt_controller.xml b/LoPAR/ch_interrupt_controller.xml index 788c5f6..9680c25 100644 --- a/LoPAR/ch_interrupt_controller.xml +++ b/LoPAR/ch_interrupt_controller.xml @@ -1,285 +1,285 @@ - Interrupt Controller - - This chapter specifies the requirements for the LoPAR interrupt - controller. Platforms may chose to virtualize the interrupt controller or to + + This chapter specifies the requirements for the LoPAR interrupt + controller. Platforms may choose to virtualize the interrupt controller or to provide the PowerPC External Interrupt option. - +
Interrupt Controller Virtualization - Virtualization of the interrupt controller is done through the - Interrupt Support hcalls. See . + Virtualization of the interrupt controller is done through the + Interrupt Support hcalls. See .
PowerPC External Interrupt Option - The PowerPC External Interrupt option is based upon a subset of the - PowerPC External Interrupt Architecture. The PowerPC External Interrupt - Architecture contains a register-level architectural definition of an interrupt - control structure. This architecture defines means for assigning properties - such as priority, destination, etc., to I/O and interprocessor interrupts, as - well as an interface for presenting them to processors. It supports both - specific and distributed methods for interrupt delivery. See also - A PowerPC External + The PowerPC External Interrupt option is based upon a subset of the + PowerPC External Interrupt Architecture. The PowerPC External Interrupt + Architecture contains a register-level architectural definition of an interrupt + control structure. This architecture defines means for assigning properties + such as priority, destination, etc., to I/O and interprocessor interrupts, as + well as an interface for presenting them to processors. It supports both + specific and distributed methods for interrupt delivery. See also + A PowerPC External Interrupt.htm#38341.--> - In NUMA platform configurations, the interrupt controllers may be - configured in disjoint domains. The firmware makes the server numbers visible - to any single OS image appear to come from a single space without duplication. - This may be done by appropriately initializing the interrupt presentation - controllers or the firmware may translate the server numbers presented to it in - RTAS calls before entering them into the interrupt controller registers. The OS - is made aware that certain interrupts are only served by certain servers by the - inclusion of the “ibm,interrupt-domain” + In NUMA platform configurations, the interrupt controllers may be + configured in disjoint domains. The firmware makes the server numbers visible + to any single OS image appear to come from a single space without duplication. 
+ This may be done by appropriately initializing the interrupt presentation + controllers or the firmware may translate the server numbers presented to it in + RTAS calls before entering them into the interrupt controller registers. The OS + is made aware that certain interrupts are only served by certain servers by the + inclusion of the “ibm,interrupt-domain” property in the interrupt controller nodes. - +
PowerPC External Interrupt Option Requirements - The following are the requirements for the PowerPC External - Interrupt option. Additional requirements and information relative to the MSI - option, when implemented with this option, are listed in The following are the requirements for the PowerPC External + Interrupt option. Additional requirements and information relative to the MSI + option, when implemented with this option, are listed in . - R1-R1--1. - For the PowerPC External - Interrupt option: Platforms must implement interrupt architectures - that are in register-level architectural compliance with - A PowerPC External + For the PowerPC External + Interrupt option: Platforms must implement interrupt architectures + that are in register-level architectural compliance with + A PowerPC External Interrupt. - + - R1-R1--2. - For the PowerPC External - Interrupt option: The platform’s OF device tree must include - one or more PowerPC External Interrupt Presentation node(s), as children of the + For the PowerPC External + Interrupt option: The platform’s OF device tree must include + one or more PowerPC External Interrupt Presentation node(s), as children of the root node. - + - R1-R1--3. - For the PowerPC External - Interrupt option: The platform’s OF device tree must include - an “ibm,ppc-interrupt-server#s” and an - “ibm,ppc-interrupt-gserver#s” property as defined for - each processor in the processor’s /cpus/cpu + For the PowerPC External + Interrupt option: The platform’s OF device tree must include + an “ibm,ppc-interrupt-server#s” and an + “ibm,ppc-interrupt-gserver#s” property as defined for + each processor in the processor’s /cpus/cpu node. - + - R1-R1--4. - For the PowerPC External - Interrupt option: The various - “ibm,ppc-interrupt-server#s” property values seen by a + For the PowerPC External + Interrupt option: The various + “ibm,ppc-interrupt-server#s” property values seen by a single OS image must be all unique. - + - R1-R1--5. 
- For the PowerPC External - Interrupt option: If an OS image sees multiple global interrupt - server queues, the “ibm,ppc-interrupt-gserver#s” + For the PowerPC External + Interrupt option: If an OS image sees multiple global interrupt + server queues, the “ibm,ppc-interrupt-gserver#s” properties associated with the various queues must have unique values. - + - R1-R1--6. - For the PowerPC External - Interrupt option: The platform’s OF device tree must include - a PowerPC External Interrupt Source Controller node, as defined for each Bus - Unit Controller (BUC) that can generate PowerPC External Interrupt Architecture + For the PowerPC External + Interrupt option: The platform’s OF device tree must include + a PowerPC External Interrupt Source Controller node, as defined for each Bus + Unit Controller (BUC) that can generate PowerPC External Interrupt Architecture interrupts, as a child of the platform’s root node. - + - R1-R1--7. - For the PowerPC External - Interrupt option: The platform’s OF device tree must conform - to the and - include the appropriate mapping and interrupt properties to allow the mapping - of all non-zero XISR values (interrupt#) to the + For the PowerPC External + Interrupt option: The platform’s OF device tree must conform + to the and + include the appropriate mapping and interrupt properties to allow the mapping + of all non-zero XISR values (interrupt#) to the corresponding node generating the interrupt. - + - R1-R1--8. - For the PowerPC External - Interrupt option: The PowerPC External Interrupt Presentation - Controller node must not contain the + For the PowerPC External + Interrupt option: The PowerPC External Interrupt Presentation + Controller node must not contain the “used-by-rtas” property. - + - R1-R1--9. 
- For the PowerPC External - Interrupt option: The PowerPC External Interrupt Source Controller - node must contain the “used-by-rtas” + For the PowerPC External + Interrupt option: The PowerPC External Interrupt Source Controller + node must contain the “used-by-rtas” property. - + - R1-R1--10. - For the PowerPC External - Interrupt option: If the interrupt hardware is configured such that, - viewed from any given OS image, any interrupt source controller cannot direct - interrupts to any interrupt presentation controller, then the platform must - include the “ibm,interrupt-domain” property - in all interrupt source and presentation controller nodes for that OS so that - the OS can determine the servers that may be valid targets for any given + For the PowerPC External + Interrupt option: If the interrupt hardware is configured such that, + viewed from any given OS image, any interrupt source controller cannot direct + interrupts to any interrupt presentation controller, then the platform must + include the “ibm,interrupt-domain” property + in all interrupt source and presentation controller nodes for that OS so that + the OS can determine the servers that may be valid targets for any given interrupt. - + - R1-R1--11. - For the PowerPC External - Interrupt option: All interrupt controller registers must be - accessed via Caching-Inhibited, Memory Coherence not required and Guarded + For the PowerPC External + Interrupt option: All interrupt controller registers must be + accessed via Caching-Inhibited, Memory Coherence not required and Guarded Storage mapping. - + - R1-R1--12. 
- For the PowerPC External - Interrupt option: The platform must manage the Available Processor - Mask Register so that global interrupts (server number field of the eXternal - Interrupt Vector Entry (XIVE) set to a value from - “ibm,ppc-interrupt-gserver#s”) are only sent to one + For the PowerPC External + Interrupt option: The platform must manage the Available Processor + Mask Register so that global interrupts (server number field of the eXternal + Interrupt Vector Entry (XIVE) set to a value from + “ibm,ppc-interrupt-gserver#s”) are only sent to one of the active processors. - + - R1-R1--13. - For the PowerPC External - Interrupt option: The platform must initialize the interrupt - priority in each XIVE to the least favored level (0xFF), enable any associated - IER bit for interrupt sources owned by the OS, and set the Current Processor - Priority Register to the Most favored level (0x00) prior to the transfer of - control to the OS so that no interrupts are signaled to a processor until the + For the PowerPC External + Interrupt option: The platform must initialize the interrupt + priority in each XIVE to the least favored level (0xFF), enable any associated + IER bit for interrupt sources owned by the OS, and set the Current Processor + Priority Register to the Most favored level (0x00) prior to the transfer of + control to the OS so that no interrupts are signaled to a processor until the OS has taken explicit action. - + - R1-R1--14. 
- For the PowerPC External - Interrupt option: Any implemented PowerPC External Interrupt - Architecture registers that are not reported in specific interrupt source or - destination controller nodes (such as the APM register) must be included in the - “reg” property of the + For the PowerPC External + Interrupt option: Any implemented PowerPC External Interrupt + Architecture registers that are not reported in specific interrupt source or + destination controller nodes (such as the APM register) must be included in the + “reg” property of the /reserved node. - + - R1-R1--15. - For the PowerPC External - Interrupt option: The interrupt source controller must prevent signalling new - interrupts when the XIVE interrupt priority field is set to the least favored + For the PowerPC External + Interrupt option: The interrupt source controller must prevent signalling new + interrupts when the XIVE interrupt priority field is set to the least favored level. - + - R1-R1--16. - For the PowerPC External - Interrupt option: Interrupt controllers that do not implement the - behavior of Requirement , must provide + For the PowerPC External + Interrupt option: Interrupt controllers that do not implement the + behavior of Requirement , must provide an Interrupt Enable Register (IER) which can be manipulated by RTAS. - + - R1-R1--17. - For the PowerPC External - Interrupt option: The platform must assign the Bus Unit Identifiers - (BUIDs) such that they form a compact address space. That is, while the first + For the PowerPC External + Interrupt option: The platform must assign the Bus Unit Identifiers + (BUIDs) such that they form a compact address space. That is, while the first BUID value is arbitrary, subsequent BUIDs should be contiguous. - + - R1-R1--18.
- For the PowerPC External - Interrupt option: Platforms implementing interrupt server number - fields greater than 8 bits must include the - “ibm,interrupt-server#-size” property in the interrupt + For the PowerPC External + Interrupt option: Platforms implementing interrupt server number + fields greater than 8 bits must include the + “ibm,interrupt-server#-size” property in the interrupt source controller node. - + - R1-R1--19. - For the PowerPC External - Interrupt option: Platforms implementing interrupt buid number - fields greater than 9 bits must include the - “ibm,interrupt-buid-size” property in the interrupt + For the PowerPC External + Interrupt option: Platforms implementing interrupt buid number + fields greater than 9 bits must include the + “ibm,interrupt-buid-size” property in the interrupt presentation controller node. - + - R1-R1--20. - For the PowerPC External - Interrupt option: Platforms must include the - “ibm,interrupt-server-ranges” property in the + For the PowerPC External + Interrupt option: Platforms must include the + “ibm,interrupt-server-ranges” property in the interrupt presentation controller node. @@ -289,42 +289,42 @@ xml:lang="en">
PowerPC External Interrupt Option Properties - See for property definitions. + See for property definitions.
MSI Option - The Message Signaled Interrupt (MSI) or Enhanced MSI (MSI-X) - capability of PCI IOAs in many cases allows for greater flexibility in - assignment of external interrupts to IOA functions than the predecessor Level - Sensitive Interrupt (LSI) capability, and in some cases treats MSIs as a - resource pool that can be reassigned based on availability of MSIs and the need - of an IOA function for more interrupts than initially assigned. Platforms that - implement the MSI option implement the ibm,change-msi and - ibm,query-interrupt-source-number RTAS calls. These RTAS - calls manage interrupts in a platform that implements the MSI option. In - particular, these calls assign additional MSI resources to an IOA function (as - defined by its PCI configuration address: PHB_Unit_ID_Hi, - PHB_Unit_ID_Low, and config_addr), when supported by the platform. - See for more information on theses RTAS calls for + The Message Signaled Interrupt (MSI) or Enhanced MSI (MSI-X) + capability of PCI IOAs in many cases allows for greater flexibility in + assignment of external interrupts to IOA functions than the predecessor Level + Sensitive Interrupt (LSI) capability, and in some cases treats MSIs as a + resource pool that can be reassigned based on availability of MSIs and the need + of an IOA function for more interrupts than initially assigned. Platforms that + implement the MSI option implement the ibm,change-msi and + ibm,query-interrupt-source-number RTAS calls. These RTAS + calls manage interrupts in a platform that implements the MSI option. In + particular, these calls assign additional MSI resources to an IOA function (as + defined by its PCI configuration address: PHB_Unit_ID_Hi, + PHB_Unit_ID_Low, and config_addr), when supported by the platform. + See for more information on these RTAS calls for MSI management. - This architecture will refer generically to the MSI and MSI-X - capabilities as simply “MSI,” except where differentiation is - required.
In this architecture, MSIs and LSIs are what the IOA function - signals, and what the software sees for that signal is ultimately the LSI or - MSI source number. The interrupt source numbers returned - by the ibm,query-interrupt-source-number RTAS call are + This architecture will refer generically to the MSI and MSI-X + capabilities as simply “MSI,” except where differentiation is + required. In this architecture, MSIs and LSIs are what the IOA function + signals, and what the software sees for that signal is ultimately the LSI or + MSI source number. The interrupt source numbers returned + by the ibm,query-interrupt-source-number RTAS call are the numbers used to control the interrupt as in the ibm,get-xive, - ibm,set-xive, ibm,int-on, + ibm,set-xive, ibm,int-on, and ibm,int-off RTAS calls. - PCI-X and PCI Express IOA functions that signal interrupts are - required by the PCI specifications to implement either the MSI or MSI-X - interrupt capabilities, or both. For PCI Express, it is expected that IOAs will - only support MSI or MSI-X (that is, no support for LSIs). When both MSI and - MSI-X are implemented by an IOA function, the MSI method will be configured by - the platform, but may be overridden by the OS or device driver, via the - ibm,change-msi RTAS call, to be MSI-X or, if assigned by - the firmware, to LSI (by removal of the MSIs assigned). + PCI-X and PCI Express IOA functions that signal interrupts are + required by the PCI specifications to implement either the MSI or MSI-X + interrupt capabilities, or both. For PCI Express, it is expected that IOAs will + only support MSI or MSI-X (that is, no support for LSIs). When both MSI and + MSI-X are implemented by an IOA function, the MSI method will be configured by + the platform, but may be overridden by the OS or device driver, via the + ibm,change-msi RTAS call, to be MSI-X or, if assigned by + the firmware, to LSI (by removal of the MSIs assigned). summarizes the LSI and MSI support. 
@@ -366,9 +366,9 @@ xml:lang="en"> - Initial interrupt assignmentAssignment means to allocate the platform - resources and to enable the interrupt in the IOA function’s + Initial interrupt assignmentAssignment means to allocate the platform + resources and to enable the interrupt in the IOA function’s configuration space. @@ -406,8 +406,8 @@ xml:lang="en"> LSI or MSI - LSIIf MSIs are to - be supported, the device driver must enable via the + LSIIf MSIs are to + be supported, the device driver must enable via the ibm,change-msi RTAS call. @@ -416,7 +416,7 @@ xml:lang="en"> PCI-X - Encouraged when interrupts are required, for backward + Encouraged when interrupts are required, for backward platform compatibility @@ -432,8 +432,8 @@ xml:lang="en"> LSI or MSI - LSIIf MSIs are to - be supported, the device driver must enable via the + LSIIf MSIs are to + be supported, the device driver must enable via the ibm,change-msi RTAS call. @@ -458,11 +458,11 @@ xml:lang="en"> MSI - MSIMSI as an - initial assignment means that one or more MSIs are reported as being available - for the IOA function. In addition, LSIs may also be reported but not enabled, - in which case if the device driver removes the assigned MSIs, the assigned LSI - are enabled by the platform firmware in the IOA function’s configuration + MSIMSI as an + initial assignment means that one or more MSIs are reported as being available + for the IOA function. In addition, LSIs may also be reported but not enabled, + in which case if the device driver removes the assigned MSIs, the assigned LSI + are enabled by the platform firmware in the IOA function’s configuration space. @@ -473,24 +473,24 @@ xml:lang="en"> LSI or not supported - If PCI Express IOA function does not support LSI, + If PCI Express IOA function does not support LSI, then this combination is not supported. 
LSI - If PCI Express - IOA function does not support LSI, then this combination is not + If PCI Express + IOA function does not support LSI, then this combination is not supported. or MSI LSI - If the PCI - Express IOA function does not support LSI, then the platform will set the - initial interrupt assignment to MSI, and if the device driver does not support - MSI, then the IOA function will not be configurable (that is, conversion from - MSI to LSI through the bridge is not supported by this architecture). If LSI is - the initial assignment, then if MSIs are to be supported, device driver must - enable via the ibm,change-msi RTAS + If the PCI + Express IOA function does not support LSI, then the platform will set the + initial interrupt assignment to MSI, and if the device driver does not support + MSI, then the IOA function will not be configurable (that is, conversion from + MSI to LSI through the bridge is not supported by this architecture). If LSI is + the initial assignment, then if MSIs are to be supported, the device driver must + enable via the ibm,change-msi RTAS call.
- The ibm,change-msi RTAS call is used to query - the initial number of MSIs assigned to a PCI configuration address and to - request a change in the number of MSIs assigned. The MSIs interrupt source - numbers assigned to an IOA function are returned via the - ibm,query-interrupt-source-number - RTAS call. In addition, when the - ibm,query-interrupt-source-number RTAS call is - implemented, it may be used to query the LSI source numbers, also. The - ibm,query-interrupt-source-number RTAS call is called - iteratively, once for each interrupt assigned to the IOA function. When an IOA - function receives an initial assignment of an LSI, the interrupt number for - that LSI may also be obtained through the same OF device tree properties that - are used to report interrupt information when the - ibm,query-interrupt-source-number RTAS call is not + The ibm,change-msi RTAS call is used to query + the initial number of MSIs assigned to a PCI configuration address and to + request a change in the number of MSIs assigned. The MSIs interrupt source + numbers assigned to an IOA function are returned via the + ibm,query-interrupt-source-number + RTAS call. In addition, when the + ibm,query-interrupt-source-number RTAS call is + implemented, it may be used to query the LSI source numbers, also. The + ibm,query-interrupt-source-number RTAS call is called + iteratively, once for each interrupt assigned to the IOA function. When an IOA + function receives an initial assignment of an LSI, the interrupt number for + that LSI may also be obtained through the same OF device tree properties that + are used to report interrupt information when the + ibm,query-interrupt-source-number RTAS call is not implemented. - R1-R1--1. - The platform must implement the MSI + The platform must implement the MSI option if the platform contains at least one PCI Express HB. - Architecture and Software Note: The MSI - option may also be implemented in the absence of any PCI Express HBs. 
In that - case, the implementation of the MSI option is via the presence of the - implementation of the associated ibm,change-msi and + Architecture and Software Note: The MSI + option may also be implemented in the absence of any PCI Express HBs. In that + case, the implementation of the MSI option is via the presence of the + implementation of the associated ibm,change-msi and ibm,query-interrupt-source-number RTAS calls. - + - R1-R1--2. - For the MSI option: + For the MSI option: The platform must implement the PowerPC External Interrupt option. - + - R1-R1--3. - For the MSI option: - The platform must implement the ibm,change-msi + For the MSI option: + The platform must implement the ibm,change-msi and ibm,query-interrupt-source-number RTAS calls. - + - R1-R1--4. - For the MSI option: - The platform must initially assign LSI or MSIs to IOA functions as - defined in and must enable the - assigned interrupts in the IOA function’s configuration space (the - interrupts remains disabled at the PHB, and must be enabled by the device - driver though the ibm,set-xive and + For the MSI option: + The platform must initially assign LSI or MSIs to IOA functions as + defined in and must enable the + assigned interrupts in the IOA function’s configuration space (the + interrupts remain disabled at the PHB, and must be enabled by the device + driver through the ibm,set-xive and ibm,int-on RTAS calls). - + - R1-R1--5.
For the MSI option: - The platform - must provide a minimum of one MSI per IOA function (that is per each unique PCI - configuration address, including the Function #) to be supported beneath the - interrupt source controller, and any given MSI and MSI source number must not - be shared between functions or within one function (even within the same + The platform + must provide a minimum of one MSI per IOA function (that is per each unique PCI + configuration address, including the Function #) to be supported beneath the + interrupt source controller, and any given MSI and MSI source number must not + be shared between functions or within one function (even within the same PE). - + - R1-R1--6. For the MSI option: - The platform - must provide at least one MSI port (the address written by the MSI) per + The platform + must provide at least one MSI port (the address written by the MSI) per Partitionable Endpoint (PE). - Platform Implementation Note: Requirement - in conjunction with Requirement may have certain ramifications on the - design. Depending on the implementation, a unique MSI port per IOA function may + Platform Implementation Note: Requirement + in conjunction with Requirement may have certain ramifications on the + design. Depending on the implementation, a unique MSI port per IOA function may be required, and not just a unique port per PE. - + - R1-R1--7. - For the MSI option with the - LPAR option: The platform must prevent a PE from creating an - interrupt to a partition other than those to which the PE is authorized by the + For the MSI option with the + LPAR option: The platform must prevent a PE from creating an + interrupt to a partition other than those to which the PE is authorized by the platform to interrupt. - + - R1-R1--8. 
- For the MSI option: - The platform must set the PCI configuration space MSI registers properly in an + For the MSI option: + The platform must set the PCI configuration space MSI registers properly in an IOA at all the following times: - + Initial boot time - During the ibm,configure-connector RTAS + During the ibm,configure-connector RTAS call - During the ibm,change-msi or + During the ibm,change-msi or ibm,query-interrupt-source-number RTAS call - + - + - R1-R1--9. - For the MSI option: - The platform must initialize any bridges necessary to appropriately route + For the MSI option: + The platform must initialize any bridges necessary to appropriately route interrupts at all the following times: - + At initial boot time - During the ibm,configure-connector RTAS + During the ibm,configure-connector RTAS call - During the ibm,configure-bridge RTAS + During the ibm,configure-bridge RTAS call - During the ibm,change-msi or + During the ibm,change-msi or ibm,query-interrupt-source-number RTAS call - + - + - R1-R1--10. - For the MSI option: - The platform must provide the “ibm,req#msi” - property for any IOA function which is - requesting MSIs; at initial boot time and during the + For the MSI option: + The platform must provide the “ibm,req#msi” + property for any IOA function which is + requesting MSIs; at initial boot time and during the ibm,configure-connector RTAS call. - + - R1-R1--11. For the MSI option: - The platform - must remember and recover on error recovery any previously allocated and setup + The platform + must remember and recover on error recovery any previously allocated and setup interrupt information in the platform-owned hardware. 
Software and Platform Implementation Note: In Requirement , it is possible that some interrupts may be lost as part of the error recovery, and software should be written to take that possibility into consideration.
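The “ibm,req#msi” property named above carries its value, like other OF device tree properties, as big-endian 32-bit cells. A minimal sketch of how a client program might decode the requested-MSI count is shown below; the helper names are illustrative assumptions, not interfaces defined by this architecture.

```c
#include <stdint.h>
#include <stddef.h>

/* Device-tree property cells are stored big-endian; decode one 32-bit cell. */
static uint32_t dt_cell_to_u32(const uint8_t *cell)
{
    return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Return the number of MSIs an IOA function requests, given the raw
 * "ibm,req#msi" property bytes, or 0 if the property is absent
 * (prop == NULL) or shorter than one cell.  Hypothetical helper, for
 * illustration only.
 */
static uint32_t ioa_requested_msis(const uint8_t *prop, size_t len)
{
    if (prop == NULL || len < 4)
        return 0;
    return dt_cell_to_u32(prop);
}
```

A client would treat a return of 0 as "no MSIs requested" and fall back to LSIs for that function.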
Platform Reserved Interrupt Priority Level Option

The Platform Reserved Interrupt Priority Level option allows platforms to reserve interrupt priority levels for internal uses. When the platform exercises this option, it notifies the client program via the “ibm,plat-res-int-priorities” property of the root node of the OF device tree.

R1-R1--1. For the Platform Reserved Interrupt Priority Level option: The platform must include the “ibm,plat-res-int-priorities” property in the root node of the device tree.

R1-R1--2. For the Platform Reserved Interrupt Priority Level option: The platform must not reserve priority levels 0x00 through 0x07 and 0xFF for internal use.

diff --git a/LoPAR/ch_io_devices.xml b/LoPAR/ch_io_devices.xml
index e6df909..8e508ce 100644
--- a/LoPAR/ch_io_devices.xml
+++ b/LoPAR/ch_io_devices.xml

I/O Devices

This chapter describes requirements for IOAs. It adds detail to areas of the PCI architectures (conventional PCI, PCI-X, and PCI Express) that are either unaddressed or optional. It also places some requirements on firmware and the OS for IOA support. It provides references to specifications with which IOAs must comply and gives design notes for IOAs that run on LoPAR systems.
PCI IOAs

R1-R1--1. All PCI IOAs must be capable of decoding and generating either a full 32-bit address or a full 64-bit address.

R1-R1--2. IOAs that implement conventional PCI must be compliant with the most recent version of the  at the time of their design, including any approved Engineering Change Requests (ECRs) against that document.

R1-R1--3. IOAs that implement PCI-X must be compliant with the most recent version of the  at the time of their design, including any approved Engineering Change Requests (ECRs) against that document.

R1-R1--4. IOAs that implement PCI Express must be compliant with the most recent version of the  at the time of their design, including any approved Engineering Change Requests (ECRs) against that document.

Architecture Note: Revision 2.1 and later of the PCI Local Bus Specification requires that a PCI master which receives a Retry target termination unconditionally repeat the same request until it completes. The master may perform other bus transactions, but cannot require those to complete before repeating the original transaction which was previously target terminated with Retry. Revision 2.1 of the specification (page 49) also includes an example which describes how this requirement applies to a multi-function IOA. See pages 48-49 of the 2.1 revision of the PCI Local Bus Specification for more detail. Revision 2.0 of the PCI Local Bus Specification includes a definition of target termination via Retry, but did not spell out the requirement described above for masters, as the 2.1 revision does. Masters designed to revision 2.0 of the specification that perform other transactions following target termination with Retry may cause live-locks and/or deadlocks, when installed in a system that uses bridges (host bridges or PCI-PCI bridges) that implement Retry, delayed transactions, and/or TCEs, if those masters require the following transactions to complete before the original transaction that was terminated with the target Retry. This revision 2.0 to revision 2.1 compatibility problem has been observed on several IOAs that have asked for deviations to Requirement . Wording was added to revision 2.2 of the PCI Local Bus Specification which makes a statement similar to this Architecture Note.
Resource Locking

R1-R1--1. PCI IOAs, excepting bridges, must not depend on the PCI LOCK# signal for correct operation, nor require any other PCI IOA to assert LOCK# for correct operation.

There are some legacy IOAs on legacy buses which require LOCK#. Additionally, LOCK# is used in some implementations to resolve deadlocks between bridges under a single PHB. These uses of LOCK# are permitted.
R1-R1--1. PCI expansion ROMs must have a ROM image with a code type of 1 for OF as provided in the . This ROM image must abide by the ROM image format for OF as documented in the .

LoPAR systems rely on OF, not BIOS, to boot. This is why strong requirements for OF device support are made.

Vital Product Data (VPD) is an optional feature for PCI adapters, and it is strongly recommended that VPD be included in all PCI expansion ROMs. If it is put in the PCI expansion ROM in accordance with the , VPD will be reported in the OF device tree. If the VPD information is formatted as defined in Revision 2.2 with the new capabilities feature, or in any other format, firmware will not read the VPD, and the device driver for the IOA will have to reformat any provided VPD into an OS-specified format. It is still required that the keywords and their values conform to those specified by either PCI 2.1 or PCI 2.2, no matter how they are formatted. Refer to Requirement .
R1-R1--1. All PCI IOAs must use the PowerPC interrupt controller, except when made transparent to the OS by the platform through the architected hcall()s.

R1-R1--2. PCI IOAs that do not reside in the Peripheral Memory Space and Peripheral I/O Space of the same PHB must not share the same LSI source.

For further information on the interrupt controller, refer to .

It is strongly advised that system board designers assign one interrupt for each interrupt source. Additionally, multi-function PCI IOAs should have multiple interrupt sources. For restrictions on sharing interrupts with the LPAR option, see Requirement . For restrictions on sharing MSIs, see Requirement  and Requirement .
R1-R1--1. Firmware must initialize all PCI-to-PCI bridges. See .

All bridges and switches are required to comply with the bus specification(s) of the buses to which they are attached. See Requirement .
Graphics Controller and Monitor Requirements for Clients

The graphics requirements for servers are different from those for portable and personal systems.

R1-R1--1. Plug-in graphics controllers for portable and personal platforms must provide graphics mode sets in the OF PCI expansion ROM image in accordance with the .

Portable and personal platforms are strongly urged to support some mechanism which allows the platform to electronically sense the display capabilities of monitors.

For graphics controllers that are placed on the system board, the graphics mode sets can be put in system ROM. The mode set software put in the system ROM in this case would be FCode, and would be largely or entirely the same as the FCode that would be in the PCI expansion ROM if the same graphics controller were put on a plug-in PCI card.
R1-R1--1. (Requirement Number Reserved For Compatibility)

R1-R1--2. PCI plug-in graphics cards which are going to be the primary display IOA during the time prior to the OS device driver being loaded must contain an OF display driver on the IOA.
PCI Cache Support Protocol

The PCI architecture allows for the optional implementation of caching of data. This architecture basically assumes that the data in I/O memory is non-coherent. As such, platforms are not required to implement the optional PCI Cache Support protocol using the SBO# and SDONE signals. Therefore, IOAs used in LoPAR platforms should not count on those signals for proper operation.

R1-R1--1. IOAs used in LoPAR platforms, and their device drivers, must not require the use of the PCI signals SBO# and SDONE for proper operation.
PCI Configuration Space for IOAs

There are several writable fields in the PCI Configuration Header. Some of these are written by the firmware and should never be changed by the device driver.

R1-R1--1. All registers and bits in the PCI Configuration Header must be set to a platform-specific value by firmware and preserved by software, except that software is responsible for setting the configuration space as indicated in .

- Bus Master: Must write to a 1 before the first DMA operation after a reset. Must write to a 0 before unconfiguring the device driver.
- Memory Space: Must write to a 1 before the first MMIO operation to the IOA’s memory space (if any) after a reset. Must write to a 0 before unconfiguring the device driver.
- IO Space: Must write to a 1 before the first MMIO operation to the IOA’s I/O space (if any) after a reset. Must write to a 0 before unconfiguring the device driver.
- All other bits: Must restore to previous value after any reset operation (for example, via ibm,set-slot-reset Function 1 or 3). The ibm,configure-bridge RTAS call is available to assist in restoring values, where appropriate.
- All other PCI header registers that may be modified by firmware after initial reset, or by ibm,configure-connector for DR operations (all bits): Must restore to previous value after any reset operation (for example, via ibm,set-slot-reset-state Function 1). The ibm,configure-bridge RTAS call is available to assist in configuring PCI bridges and switches, where appropriate.
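The Bus Master, Memory Space, and IO Space enables described above are standard bits of the PCI Command register. A minimal sketch of the driver-side discipline, assuming the standard bit positions from the PCI Local Bus Specification:

```c
#include <stdint.h>
#include <stdbool.h>

/* Standard PCI Command register bits (PCI Local Bus Specification). */
#define PCI_COMMAND_IO      0x0001  /* I/O Space enable */
#define PCI_COMMAND_MEMORY  0x0002  /* Memory Space enable */
#define PCI_COMMAND_MASTER  0x0004  /* Bus Master enable */

/* Set the enables a driver must write to 1 before first use after a reset. */
static uint16_t pci_command_enable(uint16_t cmd, bool io, bool mem, bool dma)
{
    if (io)  cmd |= PCI_COMMAND_IO;
    if (mem) cmd |= PCI_COMMAND_MEMORY;
    if (dma) cmd |= PCI_COMMAND_MASTER;
    return cmd;
}

/* Clear the same enables before the device driver is unconfigured,
 * leaving every other (firmware-owned) bit untouched. */
static uint16_t pci_command_disable(uint16_t cmd)
{
    return cmd & (uint16_t)~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY |
                             PCI_COMMAND_MASTER);
}
```

Note that both helpers preserve all other bits of the register, matching the requirement that firmware-set values be preserved by software.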
R1-R1--2. All IOAs that implement PCI-X Mode 2 or PCI Express must supply the “ibm,pci-config-space-type” property (see ).

Implementation Note: The “ibm,pci-config-space-type” property in Requirement  is added for platforms that support I/O fabric and IOAs that implement PCI-X Mode 2 and PCI Express. To access the extended configuration space provided by PCI-X Mode 2 and PCI Express, all I/O fabric leading up to an IOA must support a 12-bit register number. In other words, if a platform implementation has a conventional PCI bridge leading up to an IOA that implements PCI-X Mode 2, the platform will not be able to provide access to the extended configuration space of that IOA. The “ibm,pci-config-space-type” property in the IOA's OF node is used by device drivers to determine whether an IOA’s extended configuration space can be accessed.
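The 12-bit register number mentioned in the note follows from the sizes of the two spaces: conventional configuration space covers offsets 0x00-0xFF, while the PCI-X Mode 2 / PCI Express extended space covers 0x000-0xFFF. A sketch of the check a driver might make, where the boolean parameter stands in for consulting the “ibm,pci-config-space-type” property (the helper names are illustrative assumptions):

```c
#include <stdint.h>
#include <stdbool.h>

/* Conventional PCI configuration space spans offsets 0x00-0xFF; the
 * PCI-X Mode 2 / PCI Express extended space spans 0x000-0xFFF, so any
 * offset above 0xFF needs a 12-bit register number through the fabric. */
static bool config_offset_needs_extended_space(uint16_t reg)
{
    return reg > 0xFF;
}

/* Decide whether a configuration access is possible.  'fabric_extended'
 * stands in for the answer obtained from "ibm,pci-config-space-type";
 * this is an illustrative helper, not an architected interface. */
static bool config_access_allowed(uint16_t reg, bool fabric_extended)
{
    if (reg > 0xFFF)
        return false;  /* beyond even the extended space */
    return !config_offset_needs_extended_space(reg) || fabric_extended;
}
```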
PCI IOA Use of PCI Bus Memory Space Address 0

Some PCI IOAs will fail when given a bus address of 0. In the PC world, address 0 would not be a good address, so some PCI IOA designs that were designed for the PC arena will check for an address of 0 and fail the operation if it is 0.

R1-R1--1. For systems that use PCI IOAs which will fail when given a bus address of 0 for DMA operations, and when the operations for which those IOAs are used are other than system memory dump operations, the OS must prevent the mapping of PCI bus address 0 for PCI DMA operations for such IOAs.

R1-R1--2. PCI IOAs used for dumping the contents of system memory must operate properly with a PCI bus address of 0 for PCI DMA operations.

R1-R1--3. The firmware must not map an IOA used for loading a boot image to an address of 0, when loading a boot image, if that IOA cannot accept an address of 0.
Implementation Note: A reasonable implementation of Requirement  would be to have an interface between the device driver and the kernel that allows the device driver to indicate to the kernel that the restriction is required for that IOA, so that all IOAs for that kernel image are not affected.
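One way such a per-IOA restriction could be honored is at bus-address assignment time: skip address 0 for the flagged IOA only. A minimal sketch under assumed names (the allocator and page size are illustrative, not interfaces from this architecture):

```c
#include <stdint.h>

/*
 * Adjust a candidate DMA bus address for an IOA that has declared it
 * cannot accept bus address 0: the first usable address moves up one
 * page.  IOAs without the restriction keep the full address range.
 * Illustrative sketch only.
 */
static uint64_t next_dma_bus_addr(uint64_t candidate, uint64_t page_size,
                                  int ioa_rejects_zero)
{
    if (ioa_rejects_zero && candidate == 0)
        return page_size;
    return candidate;
}
```

Only IOAs that set the flag lose the one page at address 0, matching the note's point that other IOAs under the same kernel image are unaffected.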
PCI Express Completion Timeout

Prior to the implementation of the PCI Express additional capability to set the Completion Timeout Value and Completion Timeout Disable in the PCI Express Device Control 2 register of an IOA, IOAs need a device-specific way to provide the disable capability. In addition, platforms need to provide a way for OSs and device drivers to know when to disable the completion timeout of those devices that only provide a device-specific way of doing so.

R1-R1--1. PCI Express IOAs must either provide a device-specific way to disable their DMA Completion Timeout timer or must provide the Completion Timeout Disable or Completion Timeout Value capability in the PCI Express Device Control 2 register, and device drivers for IOAs that provide a device-specific way must disable their DMA Completion Timeout timer if it is either unknown whether the IOA provides a sufficiently long timer value for the platform, or known that it does not provide a sufficient timeout value (for example, if the “ibm,max-completion-latency” property is not provided).

R1-R1--2. Platforms must provide the “ibm,max-completion-latency” property in each PCI Express PHB node of the OF Device Tree.
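For IOAs that do implement the Device Control 2 capability, the register manipulation is straightforward. A sketch assuming the field layout of the PCI Express Base Specification (Completion Timeout Value in bits 3:0, Completion Timeout Disable in bit 4):

```c
#include <stdint.h>

/* PCI Express Device Control 2 register fields (PCI Express Base spec):
 * bits 3:0 = Completion Timeout Value, bit 4 = Completion Timeout Disable. */
#define PCIE_DEVCTL2_CTO_VALUE_MASK 0x000F
#define PCIE_DEVCTL2_CTO_DISABLE    0x0010

/* Disable the completion timeout entirely, preserving all other bits. */
static uint16_t pcie_disable_completion_timeout(uint16_t devctl2)
{
    return devctl2 | PCIE_DEVCTL2_CTO_DISABLE;
}

/* Select a completion timeout range (device-supported encoding),
 * clearing the disable bit so the timer is active again. */
static uint16_t pcie_set_completion_timeout_range(uint16_t devctl2,
                                                  uint16_t range)
{
    devctl2 &= (uint16_t)~(PCIE_DEVCTL2_CTO_VALUE_MASK |
                           PCIE_DEVCTL2_CTO_DISABLE);
    return devctl2 | (range & PCIE_DEVCTL2_CTO_VALUE_MASK);
}
```

A driver would compare the platform's “ibm,max-completion-latency” value against the ranges the device advertises before choosing between these two paths.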
PCI Express I/O Virtualized (IOV) Adapters

PCI Express defines I/O Virtualized (IOV) adapters, where such an adapter has separate resources for each virtual instance, called a Virtual Function (VF). Two PCI specifications exist to define such adapters:

-  defines the requirements for SR-IOV adapters.
-  defines the requirements for MR-IOV adapters.

The interface presented to an OS by an MR-IOV adapter will look the same as that of an SR-IOV adapter, and therefore will not be described separately here.

IOV adapters, and/or the VFs of an IOV adapter that has IOV enabled, are assigned to OSs as follows (see also  for a full set of characteristics of these environments):

- For the Legacy Dedicated environment, the entire adapter is assigned to one LPAR, with the IOV functionality not enabled. In this mode, the OS provides device driver(s) for the adapter Function(s). VFs do not exist, because IOV is not enabled. The OS is given the capability to do Hot Plug add, remove, and replace in a non-managed environment (without an HMC), and may be given that capability in a managed environment.

- For the SR-IOV Non-shared environment, the entire adapter is assigned to one LPAR, with IOV functionality enabled, but with the Physical Function(s) (PFs) of the adapter hosted by the platform. Only VFs are presented to the OS. The OS is given the capability to do Hot Plug add, remove, and replace in a non-managed environment (without an HMC), and may be given that capability in a managed environment.

- For the SR-IOV Shared environment, the adapter is assigned to the platform, with IOV functionality enabled. The platform then assigns VF(s) to OS(s). Only the managed environment applies, and add/remove/replace operations are controlled by DLPAR operations to the OS(s) from the management console.

For all environments except SR-IOV Shared, multiple functions will appear as a multi-function IOA with possible sharing of a single PE. For example, the multi-function adapters may have a shared EEH domain and a shared DMA window.

Determination of which of the above environments is supported for a given platform and partition or OS type is beyond the scope of this architecture.  defines the characteristics of these environments.

- All functions under separate PHBs in the OF Device Tree for the same adapter. (The adapter is physically under one PHB, but the platform creates separate “virtual” PHBs in the OF Device Tree and virtualizes the PCI Express configuration space for the various functions.)
- config_addr translation (virtualization) by the platform (that is, the bus/device/function of the config_addr does not necessarily correspond to what the device has programmed).

R1-R1--1. PCI Express Single Root IOV (SR-IOV) adapters must comply with the .

R1-R1--2. PCI Express Multi-Root IOV (MR-IOV) adapters must comply with the .

R1-R1--3. The platform must present, within the device tree nodes for all PCI Express adapters configured to operate in IOV mode, the "ibm,is-vf" property as defined in section .

R1-R1--1. Platform Implementation: Platforms must support the “scsi-initiator-id” property as described in  and .
Contiguous Memory

I/O devices that require contiguous memory pages (either real, or via contiguous TCEs) cannot reasonably be accommodated in LoPAR platforms. When TCEs are turned off, that would require that real physical memory addresses be allocated. When TCEs are on, that would require that contiguous TCEs be assigned, and although that is the first attempt made by the OS’s TCE-assignment algorithm, the algorithm will assign non-contiguous ones if contiguous ones cannot be assigned. Dynamic Reconfiguration complicates the contiguity problem even further.

R1-R1--1. I/O devices and/or their device drivers used in LoPAR platforms must implement scatter/gather capability for DMA operations, such that they do not require contiguous memory pages to be allocated for proper operation.
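The scatter/gather capability required above amounts to describing a DMA buffer as a list of (address, length) elements rather than one contiguous run. A minimal sketch of building such a list from possibly non-contiguous page bus addresses, coalescing adjacent pages; the element layout is device-specific, so the structure here is an illustrative assumption:

```c
#include <stdint.h>
#include <stddef.h>

/* One scatter/gather element: a bus address and a length. */
struct sg_entry {
    uint64_t addr;
    uint32_t len;
};

/*
 * Build a scatter/gather list from an array of page bus addresses,
 * merging runs that happen to be contiguous.  Returns the number of
 * entries used, or 0 if 'max_sg' entries are not enough.
 */
static size_t sg_build(const uint64_t *pages, size_t npages,
                       uint32_t page_size,
                       struct sg_entry *sg, size_t max_sg)
{
    size_t n = 0;
    for (size_t i = 0; i < npages; i++) {
        if (n > 0 && sg[n - 1].addr + sg[n - 1].len == pages[i]) {
            sg[n - 1].len += page_size;   /* extend a contiguous run */
        } else {
            if (n == max_sg)
                return 0;                  /* list too small */
            sg[n].addr = pages[i];
            sg[n].len  = page_size;
            n++;
        }
    }
    return n;
}
```

A device or driver written this way works identically whether the TCE-assignment algorithm happens to hand back contiguous or non-contiguous I/O addresses.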
Re-directed Serial Ports - The “ibm,vty-wrap-capable” OF - device tree property will be present in an OF device tree of a serial port node - when the OS data communication with that serial port controller can be - redirected, or wrapped, away from the physical serial port connector to an - ibm,vty device, which is often a virtual terminal session - of the Hardware Management Console (HMC). This property indicates to serial - port diagnostic programs that additional end user information should be - displayed during the serial port diagnostic test indicating that it is possible - that serial port data could be redirected away from the physical serial port - preventing the execution of wrap tests with physical wrap plugs. The end user - information should describe that initiating a virtual terminal session causes - the serial port controller's data to be wrapped away from the physical serial - port connection and that terminating a virtual terminal session causes the - serial port controller's data to be returned to the physical serial port - connection. The “ibm,vty-wrap-capable” - property is present with a value of null when this re-direction capability + The “ibm,vty-wrap-capable” OF + device tree property will be present in an OF device tree of a serial port node + when the OS data communication with that serial port controller can be + redirected, or wrapped, away from the physical serial port connector to an + ibm,vty device, which is often a virtual terminal session + of the Hardware Management Console (HMC). This property indicates to serial + port diagnostic programs that additional end user information should be + displayed during the serial port diagnostic test indicating that it is possible + that serial port data could be redirected away from the physical serial port + preventing the execution of wrap tests with physical wrap plugs. 
The end user + information should describe that initiating a virtual terminal session causes + the serial port controller's data to be wrapped away from the physical serial + port connection and that terminating a virtual terminal session causes the + serial port controller's data to be returned to the physical serial port + connection. The “ibm,vty-wrap-capable” + property is present with a value of null when this re-direction capability exists and is absent when this capability does not exist. - R1-R1--1. - The “ibm,vty-wrap-capable” - OF device tree property must - be present in an OF device tree of a serial port node when the OS data - communication with that serial port controller can be redirected, or wrapped, - away from the physical serial port connector to an ibm,vty device, and must not + The “ibm,vty-wrap-capable” + OF device tree property must + be present in an OF device tree of a serial port node when the OS data + communication with that serial port controller can be redirected, or wrapped, + away from the physical serial port connector to an ibm,vty device, and must not be present if this capability does not exist. @@ -891,95 +891,95 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
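Because the property's meaning is carried by its presence (its value is null), a serial port diagnostic only needs an existence test before deciding whether to show the wrap-test warning. A minimal sketch, assuming the diagnostic already has the node's property names in hand (the helper name and the property-list interface are illustrative, not part of any defined API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The property's presence, not its value, is what matters: it exists
 * with a null (zero-length) value when redirection is possible. */
static int vty_wrap_capable(const char **props, size_t nprops)
{
    for (size_t i = 0; i < nprops; i++)
        if (strcmp(props[i], "ibm,vty-wrap-capable") == 0)
            return 1;   /* warn: wrap-plug tests may see redirected data */
    return 0;
}
```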
System Bus IOAs - This section lists the requirements for the systems to support IOAs + This section lists the requirements for the systems to support IOAs connected to the system bus or main I/O expansion bus. - R1-R1--1. - Each system bus IOA must be a bus + Each system bus IOA must be a bus master. - + - R1-R1--2. - Firmware must assign unique addresses to all + Firmware must assign unique addresses to all system bus IOA facilities. - + - R1-R1--3. - Addresses assigned to system bus IOA - facilities must not conflict with the addresses mapped by any host bridge on + Addresses assigned to system bus IOA + facilities must not conflict with the addresses mapped by any host bridge on the system bus. - + - R1-R1--4. - System bus IOAs must be assigned interrupt + System bus IOAs must be assigned interrupt sources for their interrupt requirements by firmware. - + - R1-R1--5. - A system bus IOA’s OF - “interrupts” property must reflect + A system bus IOA’s OF + “interrupts” property must reflect the interrupt source and type allocation for the device. - + - R1-R1--6. - All system bus IOA interrupts must be low + All system bus IOA interrupts must be low true level sensitive (referred to as level sensitive). - + - R1-R1--7. - Interrupts assigned to system bus IOAs must + Interrupts assigned to system bus IOAs must not be shared with other IOAs. - + - R1-R1--8. - The OF unit address (first entry of the - “reg” property) of a system + The OF unit address (first entry of the + “reg” property) of a system bus IOA must stay the same from boot to boot. - + - R1-R1--9. - Each system bus IOA must have documentation - for programming the IOA and an OF binding which describes at least the - “name”, - “reg”, - “interrupts”, and - “interrupt-parent” + Each system bus IOA must have documentation + for programming the IOA and an OF binding which describes at least the + “name”, + “reg”, + “interrupts”, and + “interrupt-parent” properties for the device. 
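Two of the rules above (unique facility addresses, interrupt sources not shared with other IOAs) are simple uniqueness constraints that firmware could verify at assignment time. A hedged sketch with hypothetical data, not a real firmware interface:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if no interrupt source appears twice in the assignment
 * list, as the requirements above demand; layout is illustrative. */
static int irq_sources_unique(const unsigned *irq, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (irq[i] == irq[j])
                return 0;   /* shared source: violates the rule */
    return 1;
}
```

The same check applies verbatim to the facility base addresses firmware assigns, since both requirements reduce to "no two entries collide."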
diff --git a/LoPAR/ch_io_topology.xml b/LoPAR/ch_io_topology.xml index bffc788..babfb62 100644 --- a/LoPAR/ch_io_topology.xml +++ b/LoPAR/ch_io_topology.xml @@ -1,149 +1,150 @@ - I/O Bridges and Topologies - - There will be at least one bridge in a platform which interfaces to the - system interconnect on the processor side, and interfaces to the Peripheral - Component Interface (PCI) bus on the other. This bridge is called the - PCI Host Bridge (PHB). - The architectural requirements on the PHB, as well as other aspects of the I/O - structures, PCI bridges, and PCI Express switches are defined in this chapter. + + There will be at least one bridge in a platform which interfaces to the + system interconnect on the processor side, and interfaces to the Peripheral + Component Interface (PCI) bus on the other. This bridge is called the + PCI Host Bridge (PHB). + The architectural requirements on the PHB, as well as other aspects of the I/O + structures, PCI bridges, and PCI Express switches are defined in this chapter. - +
I/O Topologies and Endpoint Partitioning - As systems get more sophisticated, partitioning of various components - of the system will be used, in order to obtain greater Reliability, - Availability, and Serviceability (RAS). For example, Dynamic Reconfiguration - (DR) allows the removal, addition, and replacement of components from an - OS’s pool of resources, without having to stop the operation of that OS. - In addition, Logical Partitioning (LPAR) allows the isolation of resources used - by one OS from those used by another. This section will discuss aspects of the - partitioning of the I/O subsystem. Further information on DR and LPAR can be - found in . - To be useful, the granularity of assignment of I/O resources to an OS - needs to be fairly fine-grained. For example, it is not generally acceptable to - require assignment of all I/O under the same PCI Host Bridge (PHB) to the same - partition in an LPARed system, as that restricts configurability of the system, - including the capability to dynamically move resources between - partitionsDynamic LPAR or DLPAR is - defined by the Logical Resource Dynamic Reconfiguration (LRDR) option. See - for more information. Assignment of all - IOAs under the same PHB to one partition may be acceptable if that I/O is - shared via the Virtual I/O (VIO) capability defined in .. To be able to partition - I/O adapters (IOAs), groups of IOAs or portions of IOAs for DR or to different - OSs for LPAR will generally require some extra functionality in the platform - (for example, I/O bridges and firmware) in order to be able to partition the - resources of these groups, or endpoints, while at the same time preventing any - of these endpoints from affecting another endpoint or getting access to another - endpoint’s resources. 
These endpoints (that is, I/O subtrees) that can - be treated as a unit for the purposes of partitioning and error recovery will - be called Partitionable Endpoints (PEs)A - “Partitionable Endpoint” in this architecture is not to be - confused with what the PCI Express defines as an “endpoint.” PCI - Express defines an endpoint as “a device with a Type 0x00 Configuration - Space header.” That means PCI Express defines any entity with a unique - Bus/Dev/Func # as an endpoint. In most implementations, a PE will not exactly - correspond to this unit. and this concept will be called + As systems get more sophisticated, partitioning of various components + of the system will be used, in order to obtain greater Reliability, + Availability, and Serviceability (RAS). For example, Dynamic Reconfiguration + (DR) allows the removal, addition, and replacement of components from an + OS’s pool of resources, without having to stop the operation of that OS. + In addition, Logical Partitioning (LPAR) allows the isolation of resources used + by one OS from those used by another. This section will discuss aspects of the + partitioning of the I/O subsystem. Further information on DR and LPAR can be + found in and + . + To be useful, the granularity of assignment of I/O resources to an OS + needs to be fairly fine-grained. For example, it is not generally acceptable to + require assignment of all I/O under the same PCI Host Bridge (PHB) to the same + partition in an LPARed system, as that restricts configurability of the system, + including the capability to dynamically move resources between + partitionsDynamic LPAR or DLPAR is + defined by the Logical Resource Dynamic Reconfiguration (LRDR) option. See + for more information. Assignment of all + IOAs under the same PHB to one partition may be acceptable if that I/O is + shared via the Virtual I/O (VIO) capability defined in .. 
To be able to partition + I/O adapters (IOAs), groups of IOAs or portions of IOAs for DR or to different + OSs for LPAR will generally require some extra functionality in the platform + (for example, I/O bridges and firmware) in order to be able to partition the + resources of these groups, or endpoints, while at the same time preventing any + of these endpoints from affecting another endpoint or getting access to another + endpoint’s resources. These endpoints (that is, I/O subtrees) that can + be treated as a unit for the purposes of partitioning and error recovery will + be called Partitionable Endpoints (PEs)A + “Partitionable Endpoint” in this architecture is not to be + confused with what the PCI Express defines as an “endpoint.” PCI + Express defines an endpoint as “a device with a Type 0x00 Configuration + Space header.” That means PCI Express defines any entity with a unique + Bus/Dev/Func # as an endpoint. In most implementations, a PE will not exactly + correspond to this unit. and this concept will be called Endpoint Partitioning. - A PE is defined by its Enhanced I/O Error Handling (EEH) domain and - associated resources. The resources that need to be partitioned and not overlap + A PE is defined by its Enhanced I/O Error Handling (EEH) domain and + associated resources. The resources that need to be partitioned and not overlap with other PE domains include: - + - The Memory Mapped I/O (MMIO) Load and - Store address space which is available to the PE. This is - accomplished by using the processor’s Page Table mechanism (through - control of the contents of the Page Table Entries) and not having any part of - two separate PEs’ MMIO address space overlap into the same 4 KB system - page. Additionally, for LPAR environments, the Page Table Entries are + The Memory Mapped I/O (MMIO) Load and + Store address space which is available to the PE. 
This is + accomplished by using the processor’s Page Table mechanism (through + control of the contents of the Page Table Entries) and not having any part of + two separate PEs’ MMIO address space overlap into the same 4 KB system + page. Additionally, for LPAR environments, the Page Table Entries are controlled by the hypervisor. - + - The DMA I/O bus address space which is available to the PE. This is - accomplished by a hardware mechanism (in a bridge in the platform) which - enforces the correct DMA addresses, and for LPAR, this hardware enforcement is - set up by the hypervisor. It is also important that a mechanism be provided for - LPAR such that the I/O bus addresses can further be limited at the system level - to not intersect; so that one PE cannot get access to a partition’s - memory to which it should not have access. The Translation Control Entry (TCE) - mechanism, when controlled by the firmware (for example, a hypervisor), is such - a mechanism. See for more information + The DMA I/O bus address space which is available to the PE. This is + accomplished by a hardware mechanism (in a bridge in the platform) which + enforces the correct DMA addresses, and for LPAR, this hardware enforcement is + set up by the hypervisor. It is also important that a mechanism be provided for + LPAR such that the I/O bus addresses can further be limited at the system level + to not intersect; so that one PE cannot get access to a partition’s + memory to which it should not have access. The Translation Control Entry (TCE) + mechanism, when controlled by the firmware (for example, a hypervisor), is such + a mechanism. See for more information on the TCE mechanism. - + - The configuration address space of the PE, as it is made available - to the device driver. 
This is accomplished through controlling access to a - PE’s configuration spaces through Run Time Abstraction Services (RTAS) + The configuration address space of the PE, as it is made available + to the device driver. This is accomplished through controlling access to a + PE’s configuration spaces through Run Time Abstraction Services (RTAS) calls, and for LPAR, these accesses are controlled by the hypervisor. - + - The interrupts which are accessible to the PE. An interrupt cannot - be shared between two PEs. For LPAR environments, the interrupt presentation + The interrupts which are accessible to the PE. An interrupt cannot + be shared between two PEs. For LPAR environments, the interrupt presentation and management is via the hypervisor. - + - The error domains of the PE; that is, the error containment must be - such that a PE error cannot affect another PE or, for LPAR, another partition - or OS image to which the PE is not given access. This is accomplished though - the use of the Enhanced I/O Error Handling (EEH) option of this architecture. - For LPAR environments, the control of EEH is through the hypervisor via several + The error domains of the PE; that is, the error containment must be + such that a PE error cannot affect another PE or, for LPAR, another partition + or OS image to which the PE is not given access. This is accomplished through + the use of the Enhanced I/O Error Handling (EEH) option of this architecture. + For LPAR environments, the control of EEH is through the hypervisor via several RTAS calls. - + - The reset domain: A reset domain contains all the components of a - PE. The reset is provided programmatically and is intended to be implemented - via an architected (non implementation dependent) method.For example, through a Standard Hot Plug - Controller in a bridge, or through the Secondary Bus Reset bit in the Bridge - Control register of a PCI bridge or switch. 
Resetting a - component is sometimes necessary in order to be able to recover from some types - of errors. A PE will equate to a reset domain, such that the entire PE can be - reset by the ibm,set-slot-reset RTAS call. For LPAR, the + The reset domain: A reset domain contains all the components of a + PE. The reset is provided programmatically and is intended to be implemented + via an architected (non implementation dependent) method.For example, through a Standard Hot Plug + Controller in a bridge, or through the Secondary Bus Reset bit in the Bridge + Control register of a PCI bridge or switch. Resetting a + component is sometimes necessary in order to be able to recover from some types + of errors. A PE will equate to a reset domain, such that the entire PE can be + reset by the ibm,set-slot-reset RTAS call. For LPAR, the control of the reset from the RTAS call is through the hypervisor. - - - In addition to the above PE requirements, there may be other - requirements on the power domains. Specifically, if a PE is going to - participate in DR, including DLPAR,To - prevent data from being transferred from one partition to another via data - remaining in an IOA’s memory, most implementations of DLPAR will require - the power cycling of the PE after removal from one partition and prior to - assigning it to another partition. then either the power - domain of the PE is required to be in a power domain which is separate from - other PEs (that is, power domain, reset domain, and PE domain all the same), or - else the control of that power domain and PCI Hot Plug (when implemented) of - the contained PEs will be via the platform or a trusted platform agent. When - the control of power for PCI Hot Plug is via the OS, then for LPAR + + + In addition to the above PE requirements, there may be other + requirements on the power domains. 
Specifically, if a PE is going to + participate in DR, including DLPAR,To + prevent data from being transferred from one partition to another via data + remaining in an IOA’s memory, most implementations of DLPAR will require + the power cycling of the PE after removal from one partition and prior to + assigning it to another partition. then either the power + domain of the PE is required to be in a power domain which is separate from + other PEs (that is, power domain, reset domain, and PE domain all the same), or + else the control of that power domain and PCI Hot Plug (when implemented) of + the contained PEs will be via the platform or a trusted platform agent. When + the control of power for PCI Hot Plug is via the OS, then for LPAR environments, the control is also supervised via the hypervisor. - It is possible to allow several cooperating device drivers to share a - PE. Sharing of a PE between device drivers within one OS image is supported by - the constructs in this architecture. Sharing between device drivers in + It is possible to allow several cooperating device drivers to share a + PE. Sharing of a PE between device drivers within one OS image is supported by + the constructs in this architecture. Sharing between device drivers in different partitions is beyond the scope of the current architecture. - A PE domain is defined by its top-most (closest to the PHB) PCI - configuration address (in the terms of the RTAS calls, the - PHB_Unit_ID_Hi, PHB_Unit_ID_Low, and config_addr ), which will be - called the PE configuration address in this architecture, - and encompasses everything below that in the I/O tree. The top-most PCI bus of - the PE will be called the PE primary bus. 
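The MMIO partitioning rule earlier in this section (no part of two separate PEs' MMIO space may fall into the same 4 KB system page) reduces to a page-number comparison. A sketch of the check a platform configurator might run; the ranges in the usage are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KB system pages */

/* Inclusive MMIO ranges [a_start, a_end] and [b_start, b_end] belonging
 * to different PEs must not touch the same 4 KB page, since the Page
 * Table mechanism can only isolate at page granularity. */
static int mmio_page_disjoint(uint64_t a_start, uint64_t a_end,
                              uint64_t b_start, uint64_t b_end)
{
    return (a_end >> PAGE_SHIFT) < (b_start >> PAGE_SHIFT) ||
           (b_end >> PAGE_SHIFT) < (a_start >> PAGE_SHIFT);
}
```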
Determination - of the PE configuration address is made as described in A PE domain is defined by its top-most (closest to the PHB) PCI + configuration address (in the terms of the RTAS calls, the + PHB_Unit_ID_Hi, PHB_Unit_ID_Low, and config_addr ), which will be + called the PE configuration address in this architecture, + and encompasses everything below that in the I/O tree. The top-most PCI bus of + the PE will be called the PE primary bus. Determination + of the PE configuration address is made as described in . - A summary of PE support can be found in . This architecture assumes that there is a - single level of bridge within a PE if the PE is heterogeneous (some - Conventional PCI Express), and these cases are shown by the shaded cells in the + A summary of PE support can be found in . This architecture assumes that there is a + single level of bridge within a PE if the PE is heterogeneous (some + Conventional PCI Express), and these cases are shown by the shaded cells in the table.
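The RTAS calls that operate on a PE take the triple (PHB_Unit_ID_Hi, PHB_Unit_ID_Low, config_addr), where the PE configuration address identifies the top-most bus/device/function of the PE. The packing below follows the conventional PCI configuration-address layout commonly used with these calls; it is a sketch to make the triple concrete, not a normative definition, so verify the encoding against your platform's binding:

```c
#include <assert.h>
#include <stdint.h>

/* Pack bus/device/function into the config_addr argument of the
 * PE-control RTAS calls (ibm,set-slot-reset, ibm,set-eeh-option, ...).
 * Register-number bits are left zero; layout assumed, not normative. */
static uint32_t pe_config_addr(unsigned bus, unsigned dev, unsigned fn)
{
    return ((uint32_t)(bus & 0xFF) << 16) |
           ((uint32_t)(dev & 0x1F) << 11) |
           ((uint32_t)(fn  & 0x7)  << 8);
}
```

In practice software does not compute this by hand: as the table notes, the ibm,get-config-addr-info2 RTAS call returns the PE configuration address for a given function.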
@@ -181,7 +182,7 @@ xml:lang="en"> All - Use the ibm,read-slot-reset-state2 + Use the ibm,read-slot-reset-state2 RTAS call. @@ -193,9 +194,9 @@ xml:lang="en"> All - PE reset is required for all PEs and is - activated/deactivated via the ibm,set-slot-reset RTAS - call. The PCI configuration address used in this call is the PE configuration + PE reset is required for all PEs and is + activated/deactivated via the ibm,set-slot-reset RTAS + call. The PCI configuration address used in this call is the PE configuration address (the reset domain is the same as the PE domain). @@ -214,12 +215,12 @@ xml:lang="en"> - Top of PE domain determinationPE - configuration address is used as input to the - RTAS calls which are used for PE control, namely: - ibm,set-slot-reset, - ibm,set-eeh-option, - ibm,slot-error-detail, + Top of PE domain determinationPE + configuration address is used as input to the + RTAS calls which are used for PE control, namely: + ibm,set-slot-reset, + ibm,set-eeh-option, + ibm,slot-error-detail, ibm,configure-bridge (How to obtain the PE configuration address) @@ -228,14 +229,14 @@ xml:lang="en"> Express - Use the ibm,get-config-addr-info2 + Use the ibm,get-config-addr-info2 RTAS call to obtain PE configuration address. - Shared PE determinationIf device - driver is written for the shared PE + Shared PE determinationIf device + driver is written for the shared PE environment, then this may be a don’t care. (is there more than one IOA per PE?) @@ -244,13 +245,13 @@ xml:lang="en"> Express - Use the ibm,get-config-addr-info2 + Use the ibm,get-config-addr-info2 RTAS call. - PEs per PCI Hot Plug domain and PCI Hot Plug control + PEs per PCI Hot Plug domain and PCI Hot Plug control point @@ -258,9 +259,9 @@ xml:lang="en"> Express - May have more than one PE per PCI Hot Plug DR entity, but - a PE will be entirely encompassed by the PCI Hot Plug power domain. 
If more - than one PE per DR entity, then PCI Hot Plug control is via the platform or + May have more than one PE per PCI Hot Plug DR entity, but + a PE will be entirely encompassed by the PCI Hot Plug power domain. If more + than one PE per DR entity, then PCI Hot Plug control is via the platform or some trusted platform agent. @@ -270,63 +271,63 @@ xml:lang="en"> - R1-R1--1. - All platforms must implement the + All platforms must implement the ibm,get-config-addr-info2 RTAS call. - + - R1-R1--2. - All platforms must implement the + All platforms must implement the ibm,read-slot-reset-state2 RTAS call. - + - R1-R1--3. - For the EEH option: - The resources of one PE must not overlap the resources of another PE, + For the EEH option: + The resources of one PE must not overlap the resources of another PE, including: - + Error domains - + MMIO address ranges - + I/O bus DMA address ranges (when PEs are below the same PHB) - + Configuration space - + Interrupts - + - + - R1-R1--4. - For the EEH option: + For the EEH option: All the following must be true relative to a PE: - + An IOA function must be totally encompassed by a PE. @@ -334,107 +335,107 @@ xml:lang="en"> All PEs must be independently resetable by a reset domain. - - Architecture Note: The partitioning of PEs - down to a single IOA function within a multi-function IOA requires a way to - reset an individual IOA function within a multi-function IOA. For PCI, the only - mechanism defined to do this is the optional PCI Express Function Level Reset - (FLR). A platform supports FLR if it supports PCI Express and the partitioning - of PEs down to a single IOA function within a multi-function IOA. When FLR is - supported, if the ibm,set-slot-reset RTAS call uses FLR - for the Function 1/Function 0 - (activate/deactivate reset) sequence for an IOA function, then the platform - provides the “ibm,pe-reset-is-flr” property - in the function’s node of the OF device tree, See - for more information. 
+ + Architecture Note: The partitioning of PEs + down to a single IOA function within a multi-function IOA requires a way to + reset an individual IOA function within a multi-function IOA. For PCI, the only + mechanism defined to do this is the optional PCI Express Function Level Reset + (FLR). A platform supports FLR if it supports PCI Express and the partitioning + of PEs down to a single IOA function within a multi-function IOA. When FLR is + supported, if the ibm,set-slot-reset RTAS call uses FLR + for the Function 1/Function 0 + (activate/deactivate reset) sequence for an IOA function, then the platform + provides the “ibm,pe-reset-is-flr” property + in the function’s node of the OF device tree, See + for more information. - + - R1-R1--5. - The platform must own (be - responsible for) any error recovery for errors that occur outside of all PEs + The platform must own (be + responsible for) any error recovery for errors that occur outside of all PEs (for example in switches and bridges above defined PEs). - Implementation Note: As part of the error - recovery of Requirement , the platform - may, as part of the error handling of those errors, establish an equivalent EEH - error state in the EEH domains of all PEs below the error point, in order to - recover the hardware above those EEH domains from its error state. The platform - also returns a PE Reset State of 5 (PE is unavailable) - with a PE Unavailable Info non-zero (temporarily + Implementation Note: As part of the error + recovery of Requirement , the platform + may, as part of the error handling of those errors, establish an equivalent EEH + error state in the EEH domains of all PEs below the error point, in order to + recover the hardware above those EEH domains from its error state. The platform + also returns a PE Reset State of 5 (PE is unavailable) + with a PE Unavailable Info non-zero (temporarily unavailable) while a recovery is in progress. - + - R1-R1--6. 
- The platform must own (be responsible for) - fault isolation for all errors that occur in the I/O fabric (that is, down to - the IOA; including errors that occur on that part of the I/O fabric which is + The platform must own (be responsible for) + fault isolation for all errors that occur in the I/O fabric (that is, down to + the IOA; including errors that occur on that part of the I/O fabric which is within a PE’s domain). - + - R1-R1--7. - For the EEH option with the PCI + For the EEH option with the PCI Hot Plug option: All of the following must be true: - + - If PCI Hot Plug operations are to be controlled by the OS to which - the PE is assigned, then the PE domain for the PCI Hot Plug entity and the PCI + If PCI Hot Plug operations are to be controlled by the OS to which + the PE is assigned, then the PE domain for the PCI Hot Plug entity and the PCI Hot Plug power domain must be the same. - + - All PE domains must be totally encompassed by their respective PCI - Hot Plug power domain, regardless of the entity that controls the PCI Hot Plug + All PE domains must be totally encompassed by their respective PCI + Hot Plug power domain, regardless of the entity that controls the PCI Hot Plug operation. - + - + - R1-R1--8. - All platforms that implement the + All platforms that implement the EEH option must enable that option by default for all PEs. Implementation Notes: - + - See for requirements + See and for requirements relative to EEH requirements with LPAR. - Defaulting to EEH enabled, as required by Requirement - does not imply that the platform has no - responsibility in assuring that all device drivers are EEH enabled or EEH safe - before allowing their the Bus Master, Memory Space or I/O Space bits in the PCI - configuration Command register of their IOA to be set to a 1. Furthermore, even - though a platform defaults its EEH option as enabled, as required by - Requirement does not imply that the - platform cannot disable EEH for a PE. 
See Requirement + Defaulting to EEH enabled, as required by Requirement + does not imply that the platform has no + responsibility in assuring that all device drivers are EEH enabled or EEH safe + before allowing their the Bus Master, Memory Space or I/O Space bits in the PCI + configuration Command register of their IOA to be set to a 1. Furthermore, even + though a platform defaults its EEH option as enabled, as required by + Requirement does not imply that the + platform cannot disable EEH for a PE. See Requirement for more information. - + - The following two figures show some examples of the concept of - Endpoint Partitioning. See also for + The following two figures show some examples of the concept of + Endpoint Partitioning. See also for more information on the EEH option.
@@ -448,7 +449,7 @@ xml:lang="en">
- +
PE and DR Partitioning Examples for PCI Express HBs @@ -459,60 +460,60 @@ xml:lang="en"> -
+ - +
PCI Host Bridge (PHB) Architecture - The PHB architecture places certain requirements on PHBs. There - should be no conflict between this document and the PCI specifications, but if - there is, the PCI documentation takes precedence. The intent of this - architecture is to provide a base architectural level which supports the PCI - architecture and to provide optional constructs which allow for use of 32-bit + The PHB architecture places certain requirements on PHBs. There + should be no conflict between this document and the PCI specifications, but if + there is, the PCI documentation takes precedence. The intent of this + architecture is to provide a base architectural level which supports the PCI + architecture and to provide optional constructs which allow for use of 32-bit PCI IOAs in platforms with greater than 4 GB of system addressability. - R1-R1--1. - All PHBs that implement - conventional PCI must be compliant with the most recent version of the - at the time of their design, including any + All PHBs that implement + conventional PCI must be compliant with the most recent version of the + at the time of their design, including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--2. - All PHBs that - implement PCI-X must be compliant with the most - recent version of the at the time of - their design, including any approved Engineering Change Requests (ECRs) against + All PHBs that + implement PCI-X must be compliant with the most + recent version of the at the time of + their design, including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--3. 
- All PHBs that - implement PCI Express must be compliant with the - most recent version of the at the - time of their design, including any approved Engineering Change Requests (ECRs) + All PHBs that + implement PCI Express must be compliant with the + most recent version of the at the + time of their design, including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--4. - All requirements - defined in for HBs must + All requirements + defined in for HBs must be implemented by all PHBs in the platform. @@ -520,80 +521,81 @@ xml:lang="en">
PHB Implementation Options - There are a few implementation options when it comes to - implementing a PHB. Some of these become requirements, depending on the - characteristics of the system for which the PHB is being designed. The options + There are a few implementation options when it comes to + implementing a PHB. Some of these become requirements, depending on the + characteristics of the system for which the PHB is being designed. The options affecting PHBs, include the following: - + - The Enhanced I/O Error Handling (EEH) option enhances RAS - characteristics of the I/O and allows for smaller granularities of I/O + The Enhanced I/O Error Handling (EEH) option enhances RAS + characteristics of the I/O and allows for smaller granularities of I/O assignments to partitions in an LPAR environment. - + - The Error Injection (ERRINJCT) option enhances the testing of the - I/O error recovery code. This option is required of bridges which implement the + The Error Injection (ERRINJCT) option enhances the testing of the + I/O error recovery code. This option is required of bridges which implement the EEH option. - + - R1-R1--1. - All PHBs - for use in platforms which implement LPAR must support EEH, in support of virtualizations - requirements in . + All PHBs + for use in platforms which implement LPAR must support EEH, in support of virtualization + requirements in and + . - + - R1-R1--2. - All PCI HBs designed for use in platforms - which will support PCI Express must support the PCI extended configuration + All PCI HBs designed for use in platforms + which will support PCI Express must support the PCI extended configuration address space and the MSI option.
PCI Data Buffering and Instruction Queuing

Some PHB implementations may include buffers or queues for DMA, Load, and Store operations. These buffers are required to be transparent to the software with only certain exceptions, as noted in this section.

Most processor accesses to System Memory go through the processor data cache. When sharing System Memory with IOAs, hardware must maintain consistency with the processor data cache and the System Memory, as defined by the requirements in .

R1-R1--1. PHB implementations which include buffers or queues for DMA, Load, and Store operations must make sure that these are transparent to the software, with a few exceptions which are allowed by the PCI architecture, by , and by .

R1-R1--2. PHBs must accept MMIO Loads of up to 128 bytes, and must do so without compromising performance of other operations.
PCI <emphasis>Load</emphasis> and <emphasis>Store</emphasis> Ordering

For the platform Load and Store ordering requirements, see and the appropriate PCI specifications (per Requirements , , and ). Those requirements will, for most implementations, require strong ordering (single threading) of all Load and Store operations through the PHB, regardless of the address space on the PCI bus to which they are targeted. Single threading through the PHB means that processing a Load requires that the PHB wait on the Load response data of a Load issued on the PCI bus prior to issuing the next Load or Store on the PCI bus.
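The single-threading rule above can be sketched as a toy model. This is illustrative only and not part of the architecture text; the class and trace names are invented, and real PHBs enforce this in hardware rather than software.

```python
# Toy model of the "single threading" rule: a PHB may not issue the next
# Load or Store on the PCI bus until the response data for the previously
# issued Load has returned from the IOA.
class StrictlyOrderedPHB:
    def __init__(self):
        self.pci_bus_trace = []       # order in which events appear on the PCI bus
        self.outstanding_load = None  # at most one un-answered Load at a time

    def issue(self, op, addr):
        """op is 'load' or 'store'; a pending Load stalls everything behind it."""
        if self.outstanding_load is not None:
            # Wait for the Load response before putting anything else on the bus.
            self.complete_load()
        self.pci_bus_trace.append((op, addr))
        if op == "load":
            self.outstanding_load = addr

    def complete_load(self):
        # Response data for the outstanding Load arrives from the IOA.
        self.pci_bus_trace.append(("load-response", self.outstanding_load))
        self.outstanding_load = None

phb = StrictlyOrderedPHB()
phb.issue("load", 0x100)   # Load goes out on the PCI bus
phb.issue("store", 0x200)  # cannot issue until the Load response has returned
```

The resulting bus trace shows the Store strictly after the Load response, which is the behavior the requirement mandates for most implementations.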
PCI DMA Ordering

For the platform DMA ordering requirements, see the requirements in this section, in , and the appropriate PCI specifications (per Requirements , , and ).

In general, the ordering for DMA path operations from the I/O bus to the processor side of the PHB is independent from the Load and Store path, with the exception stated in Requirement . Note that in the requirement below, a read request is the initial request to the PHB and the read completion is the data phase of the transaction (that is, the data is returned).

R1-R1--1. (Requirement Number Reserved For Compatibility)

R1-R1--2. (Requirement Number Reserved For Compatibility)

R1-R1--3. (Requirement Number Reserved For Compatibility)

R1-R1--4. The hardware must make sure that a DMA read request from an IOA that specifies any byte address that has been written by a previous DMA write operation (as defined by the untranslated PCI address) does not complete before the data from that previous DMA write is in the coherency domain.

R1-R1--5.
(Requirement Number Reserved For Compatibility)

R1-R1--6. The hardware must make sure that all DMA write data buffered from an IOA, which is destined for system memory, is in the platform’s coherency domain prior to delivering data from a Load operation through the same PHB which has come after the DMA write operation(s).

R1-R1--7. The hardware must make sure that all DMA write data buffered from an IOA, which is destined for system memory, is in the platform’s coherency domain prior to delivering an MSI from that same IOA which has come after the DMA write operation(s).

Architecture Notes:

Requirement clarifies (and may tighten up) the PCI architecture requirement that the read be to the “just-written” data.

The address comparison that determines whether the address of the data being read and the address of the data being written fall in the same cache line is based on the PCI address and not a TCE-translated address. This says that the System Memory cache line address will be the same also, since the requirement is directed towards operations under the same PHB. However, a DMA Read Request and a DMA Write Request that use different PCI addresses (even if they hit the same System Memory address) are not required to be kept in order (see Requirement ).
So, for example, Requirement says that split PHBs that share the same data buffers at the system end do not have to keep a DMA Read Request following a DMA Write Request in order when they do not traverse the same PHB PCI bus (even if they get translated to the same system address) or when they originate on the same PCI bus but have different PCI bus addresses (even if they get translated to the same system address).

Requirement is the only case where the Load and Store paths are coupled to the DMA data path. This requirement guarantees that the software has a method for forcing DMA write data out of any buffers in the path during the servicing of a completion interrupt from the IOA. Note that the IOA can perform the flush prior to the completion interrupt, via Requirement . That is, the IOA can issue a read request to the last word written and wait for the read completion data to return. When the read is complete, the data will have arrived at the destination.
In addition, the use of MSIs, instead of LSIs, allows for a programming model for IOAs where the interrupt signalling itself pushes the last DMA write to System Memory, prior to the signalling of the interrupt to the system (see Requirement ).

A DMA read operation is allowed to be processed prior to the completion of a previous DMA read operation, but is not required to be.
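The two coupling rules above (a DMA read to a just-written PCI address must see the write, and a Load response flushes all buffered DMA writes ahead of it) can be sketched with a toy buffer model. This is a non-normative illustration; the class, its methods, and the addresses are invented for the example.

```python
# Illustrative sketch of two DMA ordering rules:
#  - a DMA read that hits a PCI address written by a buffered DMA write must
#    not complete until that write is in the coherency domain, and
#  - all buffered DMA write data must reach the coherency domain before Load
#    data that came after it is delivered through the same PHB.
class PHBModel:
    def __init__(self):
        self.write_buffer = []   # buffered (pci_addr, data) DMA writes
        self.memory = {}         # stand-in for the platform coherency domain

    def dma_write(self, pci_addr, data):
        self.write_buffer.append((pci_addr, data))   # may linger in the buffer

    def _flush(self, pci_addr=None):
        remaining = []
        for addr, data in self.write_buffer:
            if pci_addr is None or addr == pci_addr:
                self.memory[addr] = data             # now globally coherent
            else:
                remaining.append((addr, data))
        self.write_buffer = remaining

    def dma_read(self, pci_addr):
        # Drain any buffered write to this same PCI address before completing.
        self._flush(pci_addr)
        return self.memory.get(pci_addr)

    def load_response(self):
        # A Load completion through this PHB pushes ALL buffered DMA writes
        # into the coherency domain ahead of the Load data.
        self._flush()

phb = PHBModel()
phb.dma_write(0x1000, "new")
value = phb.dma_read(0x1000)    # sees "new", never stale data
phb.dma_write(0x2000, "done")
phb.load_response()             # a completion-interrupt Load drains the buffer
```

Note that a DMA read to a *different* PCI address deliberately does not drain unrelated writes in this sketch, mirroring the architecture note that such pairs are not required to be kept in order.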
PCI DMA Operations and Coherence

The I/O is not aware of the setting of the coherence required bit when performing operations to System Memory, and so the PHB needs to assume that coherency is required.

R1-R1--1. I/O transactions to System Memory through a PHB must be made with coherency required.
Byte Ordering Conventions

LoPAR platforms operate with either Big-Endian (BE) or Little-Endian (LE) addressing. In Big-Endian systems, the address of a word in memory is the address of the most significant byte (the “big” end) of the word. Increasing memory addresses approach the least significant byte of the word. In Little-Endian addressing, the address of a word in memory is the address of the least significant byte (the “little” end) of the word. See also .

The PCI bus itself can be thought of as not inherently having an endianness associated with it (although its numbering convention indicates LE). It is the IOAs on the PCI bus that can be thought of as having endianness associated with them. Some PCI IOAs will contain a mode bit to allow them to appear as either a BE or LE IOA. Some IOAs will even have multiple mode bits; one for each data path (Load and Store versus DMA). In addition, some IOAs may have multiple concurrent apertures, or address ranges, where the IOA can be accessed as an LE IOA in one aperture and as a BE IOA in another.
R1-R1--1. When the processor is operating in Big-Endian mode, the platform design must produce the results indicated in while issuing Load and Store operations to various entities with various endianness.

R1-R1--2. When performing DMA operations through a PHB, the platform must not modify the data during the transfer process; the lowest addressed byte in System Memory being transferred to the lowest addressed byte on the PCI bus, the second byte in System Memory being transferred as the second byte on the PCI bus, and so on.
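The byte-invariance rule for DMA can be made concrete with a short example. This is not from the specification text; it simply shows that the same byte sequence, moved unmodified lowest-address-to-lowest-address, yields different integer values only in how each party assembles the bytes.

```python
# DMA moves bytes byte-for-byte, lowest address to lowest address;
# "endianness" is only how an integer is assembled from those bytes.
import struct

system_memory = bytes([0x12, 0x34, 0x56, 0x78])   # ascending addresses

# DMA transfer: byte i of System Memory becomes byte i on the PCI bus.
pci_bus_bytes = bytes(system_memory)              # no reordering permitted

# A Big-Endian consumer and a Little-Endian consumer of the SAME bytes:
as_big_endian = struct.unpack(">I", pci_bus_bytes)[0]     # 0x12345678
as_little_endian = struct.unpack("<I", pci_bus_bytes)[0]  # 0x78563412
```

The platform never swaps the bytes in flight; any swapping needed to match an IOA's endianness is a matter of how the endpoints interpret the byte stream.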
<title>Big-Endian Mode <emphasis>Load</emphasis> and <emphasis>Store</emphasis> Programming Considerations</title>
PCI Bus Protocols

This section details the items from the , , and documents where there is variability allowed, and therefore further specifications, requirements, or explanations are needed.

Specifically, details specific PCI Express options and the requirements for usage of such in LoPAR platforms. These requirements will drive the design of PHB implementations. See the for more information.
Usage Legend: NS = Not Supported; O = Optional (see also Description); OR = Optional but Recommended; R = Required; SD = See Description

SD
Required if the platform is going to support any Legacy I/O devices, as defined by the , otherwise support not required. The expectation is that Legacy I/O device support by PHBs will end soon, so platform designers should not rely on this being there when choosing I/O devices.

SD
Implementation is optional, but is expected to be needed in some platforms, especially those with more complex PCI Express fabrics. Although the “ibm,dma-window” property can implement 64-bit addresses, some OSs and Device Drivers may not be able to handle values in the “ibm,dma-window” property that are greater than or equal to 4 GB. Therefore, it is recommended that 64-bit DMA addresses be implemented through the Dynamic DMA Window option (see ).

SD
This has implications in the IOAs selected for use in the platform, as well as the PHB and firmware implementation.
See the .

NS
Enabling either of these options could allow DMA transactions that should be dropped by an EEH Stopped State to get to the system before the EEH Stopped State is set, and therefore these options are not to be enabled. Specifically, either of these could allow DMA transactions that follow a DMA transaction in error to bypass the PCI Express error message signalling an error on a previous packet.

Platform Implementation Note: It is permissible for the platform (for example, the PHB or the nest) to re-order DMA transactions that it knows can be re-ordered -- such as DMA transactions that come from different Requester IDs or come into different PHBs -- as long as the ordering with respect to error signalling is met.

SD
May be required if the IOAs being supported require it. May specifically be needed for certain classes of IOAs such as accelerators.
SD
When 128-bit Atomic Operations are supported, 32- and 64-bit Atomic Operations must also be supported.

SD
It is required that peer-to-peer operation between IOAs be blocked when LPAR is implemented and those IOAs are assigned to different LPAR partitions. For switches below a PHB, when the IOA functions below the switch may be assigned to different partitions, this blocking is provided by ACS in the switch. This is required even in Base platforms, if the above conditions apply.

SD
Required when a PE consists of something other than a full IOA; for example, if each function of a multi-function IOA is in its own PE. An SR-IOV Virtual Function (VF) may be one such example.

SD
This has implications in the IOAs selected for use in the platform, as well as the PHB and firmware implementation. See the .

SD
Implement where appropriate. Platforms need to consider this for platform switches, also.
PHBs may report internal errors to firmware using a different mechanism outside of this architecture.

NS
LoPAR does not support ATS, because the invalidation and modification of the Address Translation and Protection Table (ATPT) -- called the TCEs in LoPAR -- are synchronous operations, whereas ATS invalidation requires a more asynchronous operation.

OR
It is likely that most server platforms will need to be enabled to use SR-IOV IOAs.

SD
Depending on how this is implemented, an MR-IOV device is likely to look like an SR-IOV device to an OS (with the platform hiding the Multi-root aspects). PHBs may be MR enabled or the MR support may be through switches external to the PHBs.
Programming Model

Normal memory mapped Load and Store instructions are used to access a PHB’s facilities or PCI IOAs on the I/O side of the PHB. defines the addressing model. Addresses of IOAs are passed by OF via the OF device tree.

R1-R1--1. If a PHB defines any registers that are outside of the PCI Configuration space, then the address of those registers must be in the Peripheral Memory Space or Peripheral I/O Space for that PHB, or must be in the System Control Area.

PCI master DMA transfers refer to data transfers between a PCI master IOA and another PCI IOA, or System Memory, where the PCI master IOA supplies the addresses and controls all aspects of the data transfer.
Transfers from a PCI master to the PCI I/O Space are essentially ignored by a PHB (except for address parity checking). Transfers from a PCI master to PCI Memory Space are either directed at PCI Memory Space (for peer-to-peer operations) or need to be directed to the host side of the PHB. DMA transfers directed to the host side of a PHB may be to System Memory or may be to another IOA via the Peripheral Memory Space of another HB. Transfers that are directed to the Peripheral I/O Space of another HB are considered to be an addressing error (see ). For information about decoding these address spaces and the address transforms necessary, see .
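The decode just described can be sketched as a classification function. The address ranges below are invented purely for illustration; this architecture does not define them, and real PHBs decode against ranges established by firmware.

```python
# Hypothetical sketch of the PCI-master address decode described above.
# The ranges are made up for the example and are NOT architecturally defined.
PCI_MEMORY_PEER_RANGE = range(0x8000_0000, 0x9000_0000)     # peers on this bus
PERIPHERAL_IO_OF_OTHER_HB = range(0xF000_0000, 0xF100_0000)  # another HB's I/O space

def decode_dma_target(pci_addr):
    """Classify a PCI-master address as a PHB might (toy model)."""
    if pci_addr in PCI_MEMORY_PEER_RANGE:
        return "peer-to-peer on this PCI bus"
    if pci_addr in PERIPHERAL_IO_OF_OTHER_HB:
        return "addressing error"       # Peripheral I/O Space of another HB
    return "host side of the PHB"       # System Memory, or another HB's memory
```

The key behaviors mirrored here: peer Memory Space traffic stays on the bus, transfers into another HB's Peripheral I/O Space are rejected as addressing errors, and everything else is forwarded to the host side for further decode.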
Peer-to-Peer Across Multiple PHBs

This architecture does not architect peer-to-peer traffic between two PCI IOAs when the operation traverses multiple PHBs.

R1-R1--1. The platform must prevent Peer-to-Peer operations that would cross multiple PHBs.
Dynamic Reconfiguration of I/O

Disconnecting or connecting an I/O subsystem while the system is operational, and then having the new configuration be operational, including any newly added subsystems, is a subset of Dynamic Reconfiguration (DR).

Some platforms may also support plugging/unplugging of PCI IOAs while the system is operational. This is another subset of DR.

DR is an option and, as such, is not required by this architecture. Attempts to change the hardware configuration on a platform that does not enable configuration change, whose OS does not support that configuration change, or without the appropriate user configuration change actions may produce unpredictable results (for example, the system may crash).

PHBs in platforms that support the PCI Hot Plug Dynamic Reconfiguration (DR) option may have some unique design considerations. For information about the DR options, see .
Split Bridge Implementations

In some platforms the PHB may be split into two pieces, separated by a cable or fiber optics. The piece that is connected to the system bus (or switch) and which generates the interconnect is called the Hub. There are several implications of such implementations and several requirements to go along with these.
<title>Coherency Considerations with IOA to IOA Communications via System Memory</title>

Bridges which are split across multiple chips may introduce a large enough latency between the time DMA write data is accepted by the PHB and the time that previously cached copies of the same System Memory locations are invalidated. This latency needs to be taken into consideration in designs, as it can introduce the problems described below. This is not a problem if the same PCI address is used under a single PHB by the same or multiple IOAs, but can be a problem under any of the following conditions:

The same PCI address is used by different IOAs under different PHBs.

Different PCI addresses are used which access the same System Memory coherency block, regardless of whether the IOA(s) are under the same PHB or not; for example: two different TCEs accessing the same System Memory coherency block.
An example scenario where this could be a problem is as follows:

Device 1 does a DMA read from System Memory address x using PCI address y.

Device 2 (under the same PHB as Device 1 -- the devices could even be different functions in the same IOA) does a DMA write to System Memory address x using PCI address z.

Device 2 attempts to read back System Memory address x before the time that its previous DMA write is globally coherent (that is, before the DMA write gets to the Hub and an invalidate operation on the cache line containing that data gets back down to the PHB), and gets the data read by Device 1 rather than what it just wrote.

Another example scenario is as follows:

Device 1 under PHB 1 does a DMA read from System Memory location x.

Device 2 under PHB 2 does a DMA write to System Memory location x and signals an interrupt to the system.

The interrupt bypasses the written data which is on its way to the coherency domain.
The device driver for Device 2 services the interrupt and signals Device 1, via a Store to Device 1, that the data is there at location x.

Device 1 sees the Store before the invalidate operation on the cache line containing the data propagates down to invalidate the previous cached copy of x, and does a DMA read of location x using the same address as in step (1), getting the old copy of x instead of the new copy.

This last example is a little far-fetched, since the propagation times should not be longer than the interrupt service latency time, but it is possible. In this example, the device driver should do a Load to Device 2 during the servicing of the interrupt and wait for the Load results before trying to signal Device 1, just the way that this device driver would do a Load if it were a program which was going to use the data written instead of another IOA.
Note that this scenario can also be avoided if the IOA uses a PCI Message + Signalled Interrupt (MSI) rather than the PCI interrupt signal pins, in order + to signal the interrupt (in which case the Load operation is avoided). - R1-R1--1. - A DMA read to a PCI address - which is different than a PCI address used by a previous DMA write or which is - performed under a different PHB must not presume that a previous DMA write is - complete, even if the DMA write is to the same System Memory address, unless + A DMA read to a PCI address + which is different from a PCI address used by a previous DMA write or which is + performed under a different PHB must not presume that a previous DMA write is + complete, even if the DMA write is to the same System Memory address, unless one of the following is true: - + - The IOA doing the DMA write has followed that write by a DMA read - to the address of the last byte of DMA write data to be flushed (the DMA read - request must encompass the address of the last byte written, but does not need - to be limited to just that byte) and has waited for the results to come back - before an IOA is signaled (via peer-to-peer operations or via software) to + The IOA doing the DMA write has followed that write by a DMA read + to the address of the last byte of DMA write data to be flushed (the DMA read + request must encompass the address of the last byte written, but does not need + to be limited to just that byte) and has waited for the results to come back + before an IOA is signaled (via peer-to-peer operations or via software) to perform a DMA read to the same System Memory address. 
- + - The device driver for the IOA doing the DMA write has followed - that write by a Load to that IOA and has waited for the - results to come back before a DMA read to the same System Memory address with a + The device driver for the IOA doing the DMA write has followed + that write by a Load to that IOA and has waited for the + results to come back before a DMA read to the same System Memory address with a different PCI address is attempted. - + - The IOA doing the DMA write has followed the write with a PCI - Message Signalled Interrupt (MSI) as a way to interrupt the device driver, and + The IOA doing the DMA write has followed the write with a PCI + Message Signalled Interrupt (MSI) as a way to interrupt the device driver, and the MSI message has been received by the interrupt controller. - +
- +
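The flushing-read rule in the requirement above can be sketched in miniature. The following C model is purely illustrative (the `struct fabric`, its single posted-write buffer, and all function names are invented for this example; a real platform enforces the rule through bus ordering, not software): a DMA write sits posted in the fabric until the writer itself issues a DMA read that covers the written data, so a peer that reads System Memory before that flushing read may see stale contents.

```c
#include <assert.h>
#include <string.h>

/* Toy model of a posted DMA write held in an I/O fabric buffer. */
enum { MEM_SIZE = 16 };

struct fabric {
    unsigned char memory[MEM_SIZE];  /* coherent System Memory          */
    unsigned char posted[MEM_SIZE];  /* write data still in flight      */
    int           posted_valid;      /* a DMA write is currently posted */
    int           posted_addr;
};

/* DMA write: the data is merely posted into the fabric buffer. */
static void dma_write(struct fabric *f, int addr, unsigned char v)
{
    f->posted[addr] = v;
    f->posted_addr  = addr;
    f->posted_valid = 1;
}

/* A DMA read by the writer that encompasses the last byte written acts
 * as a flushing read: ordering forces the posted write to memory first. */
static unsigned char dma_read(struct fabric *f, int addr)
{
    if (f->posted_valid) {
        f->memory[f->posted_addr] = f->posted[f->posted_addr];
        f->posted_valid = 0;
    }
    return f->memory[addr];
}

/* A peer device (different PCI address or different PHB) reads memory
 * directly, so it does NOT flush the writer's posted buffer. */
static unsigned char peer_read(const struct fabric *f, int addr)
{
    return f->memory[addr];
}
```

The writer's flushing `dma_read`, with its result waited for, is what makes the data globally visible; a `peer_read` issued before that point may legitimately return the old contents, which is exactly the hazard the requirement closes.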
I/O Bus to I/O Bus Bridges - The PCI bus architecture was designed to allow for bridging to other - slower speed I/O buses or to another PCI bus. The requirements when bridging + The PCI bus architecture was designed to allow for bridging to other + slower speed I/O buses or to another PCI bus. The requirements when bridging from one I/O bus to another I/O bus in the platform are defined below. - R1-R1--1. - All bridges must comply with the + All bridges must comply with the bus specification(s) of the buses to which they are attached. - +
What Must Talk to What - Platforms are not required to support peer to peer operations - between IOAs. IOAs on the same shared bus segment will generally be able to do - peer to peer operations between themselves. Peer to peer operations in an LPAR - environment, when the operations are between IOAs that are not in the same - partition, is specifically prohibited (see Requirement - ). + Platforms are not required to support peer to peer operations + between IOAs. IOAs on the same shared bus segment will generally be able to do + peer to peer operations between themselves. Peer to peer operations in an LPAR + environment, when the operations are between IOAs that are not in the same + partition, are specifically prohibited (see Requirement + ).
PCI to PCI Bridges - This architecture allows the use of PCI to PCI bridges and - PCI Express switches in the platform. TCEs are used with the IOAs attached to - the other side of the PCI to PCI bridge or PCI Express switch when those IOAs - are accessing something on the processor side of the PHB. After configuration, - PCI to PCI bridges and PCI Express switches are basically transparent to the - software as far as addressing is concerned (the exception is error handling). - For more information, see the appropriate PCI Express switch + This architecture allows the use of PCI to PCI bridges and + PCI Express switches in the platform. TCEs are used with the IOAs attached to + the other side of the PCI to PCI bridge or PCI Express switch when those IOAs + are accessing something on the processor side of the PHB. After configuration, + PCI to PCI bridges and PCI Express switches are basically transparent to the + software as far as addressing is concerned (the exception is error handling). + For more information, see the appropriate PCI Express switch specification. - R1-R1--1. - Conventional PCI to PCI bridges used on - the base platform and plug-in cards must be compliant with the most recent - version of the at the time of the - platform design, including any approved Engineering Change Requests (ECRs) - against that document. PCI-X to PCI-X bridges used on the base platform and - plug-in cards must be compliant with the most recent version of the - at the time of the platform design, - including any approved Engineering Change Requests (ECRs) against that + Conventional PCI to PCI bridges used on + the base platform and plug-in cards must be compliant with the most recent + version of the at the time of the + platform design, including any approved Engineering Change Requests (ECRs) + against that document. 
PCI-X to PCI-X bridges used on the base platform and + plug-in cards must be compliant with the most recent version of the + at the time of the platform design, + including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--2. - PCI Express to PCI/PCI-X and PCI/PCI-X to - PCI Express bridges used on the base platform and plug-in cards must be - compliant with the most recent version of the - at the time of the platform design, - including any approved Engineering Change Requests (ECRs) against that + PCI Express to PCI/PCI-X and PCI/PCI-X to + PCI Express bridges used on the base platform and plug-in cards must be + compliant with the most recent version of the + at the time of the platform design, + including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--3. - PCI Express switches used on the base - platform and plug-in cards must be compliant with the most recent version of - the at the time of the platform - design, including any approved Engineering Change Requests (ECRs) against that + PCI Express switches used on the base + platform and plug-in cards must be compliant with the most recent version of + the at the time of the platform + design, including any approved Engineering Change Requests (ECRs) against that document. - + - R1-R1--4. - Bridges - and switches used in platforms which will support PCI - Express IOAs beneath them must support pass-through of PCI configuration cycles + Bridges + and switches used in platforms which will support PCI + Express IOAs beneath them must support pass-through of PCI configuration cycles which access the PCI extended configuration space. 
Software and Platform Implementation Notes: - + - Bridges used on plug-in cards that do not follow Requirement - will presumably allow for the - operation of their IOAs on the plug-in card, even though not supporting the PCI - extended configuration address space, because the card was designed with the + Bridges used on plug-in cards that do not follow Requirement + will presumably allow for the + operation of their IOAs on the plug-in card, even though not supporting the PCI + extended configuration address space, because the card was designed with the bridges and IOAs in mind. - Determination of support of the PCI configuration address space - is via the “ibm,pci-config-space-type” + Determination of support of the PCI configuration address space + is via the “ibm,pci-config-space-type” property in the IOA's node. - + - + - R1-R1--5. - Bridges and switches used in platforms - which will support PCI Express IOAs beneath them must support 64-bit + Bridges and switches used in platforms + which will support PCI Express IOAs beneath them must support 64-bit addressing.
- +
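The requirement above that bridges and switches pass through configuration cycles to the PCI extended configuration space, and the companion 64-bit addressing requirement, both hinge on where that space sits. As a hedged sketch of the standard PCI Express ECAM layout (the base address and bus/device/function numbers below are made-up example values, not anything this architecture mandates), each function owns a 4 KB configuration region whose offsets 0x100-0xFFF form the extended space:

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI Express ECAM address layout for one configuration
 * access: bus[27:20] | device[19:15] | function[14:12] | offset[11:0],
 * added to the platform's ECAM base address. */
static uint64_t ecam_addr(uint64_t base, unsigned bus, unsigned dev,
                          unsigned fn, unsigned off)
{
    return base | ((uint64_t)bus << 20) | ((uint64_t)dev << 15)
                | ((uint64_t)fn  << 12) | off;
}

/* Offsets 0x000-0x0FF are the conventional configuration space;
 * 0x100-0xFFF are the PCI Express extended configuration space. */
static int in_extended_space(unsigned off)
{
    return off >= 0x100;
}
```

A bridge that cannot forward accesses with offsets at 0x100 and above cuts the IOAs below it off from capabilities that live only in the extended space, which is why the pass-through requirement exists.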
Bridge Extensions
Enhanced I/O Error Handling (EEH) Option The EEH option uses the following terminology. - PE A Partitionable Endpoint. This refers to the granule that is - treated as one for purposes of EEH recovery and for assignment to an OS image - (for example, in an LPAR environment). Note that the PE granularity supported - by the hardware may be finer than is supported by the firmware. See also - . A PE may be any one of the + PE A Partitionable Endpoint. This refers to the granule that is + treated as one for purposes of EEH recovery and for assignment to an OS image + (for example, in an LPAR environment). Note that the PE granularity supported + by the hardware may be finer than is supported by the firmware. See also + . A PE may be any one of the following: A single-function or multi-function IOA - A set of IOAs and some piece of I/O fabric above the IOAs that + A set of IOAs and some piece of I/O fabric above the IOAs that consists of one or more bridges or switches. - EEH Stopped state The state of a PE being in both the MMIO Stopped + EEH Stopped state The state of a PE being in both the MMIO Stopped state and DMA Stopped state. - MMIO Stopped state The state of the PE which will discard any MMIO - Store s to that PE, and will return all-1's data for - Load s to that PE. If the PE is in the MMIO Stopped state - and EEH is disabled, then a Load will also return a - machine check to the processor that issued the Load, for - the Load that had the initial error and while the PE + MMIO Stopped state The state of the PE which will discard any MMIO + Store s to that PE, and will return all-1's data for + Load s to that PE. If the PE is in the MMIO Stopped state + and EEH is disabled, then a Load will also return a + machine check to the processor that issued the Load, for + the Load that had the initial error and while the PE remains in the MMIO Stopped state. 
- DMA Stopped state The state of the PE which will block any further - DMA requests from that PE (DMA completions that occur after the DMA Stopped - state is entered that correspond to DMA requests that occurred before the DMA + DMA Stopped state The state of the PE which will block any further + DMA requests from that PE (DMA completions that occur after the DMA Stopped + state is entered that correspond to DMA requests that occurred before the DMA Stopped state is entered, may be completed). - Failure A detected error between the PE and the system (for - example, processor or memory); errors internal to the PE are not considered - failures unless the PE signals the error via a normal I/O fabric error + Failure A detected error between the PE and the system (for - example, processor or memory); errors internal to the PE are not considered + failures unless the PE signals the error via a normal I/O fabric error signalling protocol (for example, SERR or ERR_FATAL). - The Enhanced I/O Error Handling (EEH) option is defined primarily - to enhance the system recoverability from failures that occur during - Load and Store operations. In addition, - certain failures that are normally non-recoverable during DMA are prevented - from causing a catastrophic failure to the system (for example, a conventional + The Enhanced I/O Error Handling (EEH) option is defined primarily + to enhance the system recoverability from failures that occur during + Load and Store operations. In addition, + certain failures that are normally non-recoverable during DMA are prevented + from causing a catastrophic failure to the system (for example, a conventional PCI address parity error). 
- The basic concept behind the EEH option is to turn all failures - that cannot be reported to the IOA, into something that looks like a - conventional PCI Master Abort (MA) errorA conventional PCI MA error is where the - conventional PCI IOA does not respond as a target with a device select - indication (that is, the IOA does not respond by activating the DEVSEL signal - back to the master). For PCI Express, the corresponding error is Unsupported - Request (UR). on a Load or Store operation to the PE during - and after the failure; responding with all-1’s data and no error - indication on a Load instruction and ignoring - Store instructions. The MA error should be handled by a device - driver, so this approach should just be an extension to what should be the + The basic concept behind the EEH option is to turn all failures + that cannot be reported to the IOA, into something that looks like a + conventional PCI Master Abort (MA) errorA conventional PCI MA error is where the + conventional PCI IOA does not respond as a target with a device select + indication (that is, the IOA does not respond by activating the DEVSEL signal + back to the master). For PCI Express, the corresponding error is Unsupported + Request (UR). on a Load or Store operation to the PE during + and after the failure; responding with all-1’s data and no error + indication on a Load instruction and ignoring + Store instructions. The MA error should be handled by a device + driver, so this approach should just be an extension to what should be the error handling without this option implemented. The following is the general idea behind the EEH option: - + - On a failure that occurs in an operation between the PHB and + On a failure that occurs in an operation between the PHB and PE: - + - Put the PE into the MMIO Stopped and DMA Stopped states (also - known as the EEH Stopped state). 
This is defined as a state where the PE is - prevented from doing any further operations that could corrupt the system; - which for the most part means blocking DMA from the PE and preventing load and + Put the PE into the MMIO Stopped and DMA Stopped states (also + known as the EEH Stopped state). This is defined as a state where the PE is + prevented from doing any further operations that could corrupt the system, + which for the most part means blocking DMA from the PE and preventing load and store completions to the PE. - + - While the PE is in the MMIO Stopped state, if a - Load or Store is targeted to that PE, then - return all-1’s data with no error indication on a - Load and discard all Stores to that PE. That - is, essentially treat the Load or - Store the same way as if a MA error was received on that + While the PE is in the MMIO Stopped state, if a + Load or Store is targeted to that PE, then + return all-1’s data with no error indication on a + Load and discard all Stores to that PE. That + is, essentially treat the Load or + Store the same way as if an MA error was received on that operation. 
- + - The device driver and OS recovers a PE by removing it from the - MMIO Stopped state (keeping it in the DMA Stopped state) and doing any - necessary loads to the PE to capture PE state, and then either doing the - necessary stores to the PE to set the appropriate state before removing the PE - from the DMA Stopped state and continuing operations, or doing a reset of the - PE and then re-initializing and restarting the PE.Most - device drivers will implement a reset and - restart in order to assure a clean restart of + The device driver and OS recovers a PE by removing it from the + MMIO Stopped state (keeping it in the DMA Stopped state) and doing any + necessary loads to the PE to capture PE state, and then either doing the + necessary stores to the PE to set the appropriate state before removing the PE + from the DMA Stopped state and continuing operations, or doing a reset of the + PE and then re-initializing and restarting the PE.Most + device drivers will implement a reset and + restart in order to assure a clean restart of operations. - + - + - In order to make sure that there are no interactions necessary - between device drivers during recovery operations, each PE will have the - capability of being removed from its MMIO Stopped and DMA Stopped states - independent from any other PE which is in the MMIO Stopped or DMA Stopped + In order to make sure that there are no interactions necessary + between device drivers during recovery operations, each PE will have the + capability of being removed from its MMIO Stopped and DMA Stopped states + independent from any other PE which is in the MMIO Stopped or DMA Stopped state. - + - In order to take into account device drivers which do not - correctly implement MA recovery, make sure that the EEH option can be enabled - and disabled independently for each PE.LPAR - implementations limit the capability of - running with EEH disabled (see virtualization rsequirements in - ). 
+ In order to take into account device drivers which do not + correctly implement MA recovery, make sure that the EEH option can be enabled + and disabled independently for each PE.LPAR + implementations limit the capability of + running with EEH disabled (see Requirement + and Requirement ). - + - EEH, as defined, only extends to operations between the processor - and a PE and between a PE and System Memory. It does not extend to direct IOA + EEH, as defined, only extends to operations between the processor + and a PE and between a PE and System Memory. It does not extend to direct IOA to IOA peer to peer operations. - + + + Hardware changes for this option are detailed in the next section. + RTAS changes required are detailed in . - Hardware changes for this option are detailed in the next section. - RTAS changes required are detailed in . -
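The observable semantics of the EEH Stopped state sketched in the list above can be captured in a small state model. This is a toy for illustration only, not the RTAS interface: the `struct pe` and every function name are invented here, and the two release functions merely stand in for the effects of `ibm,set-eeh-option` functions 2 and 3.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a Partitionable Endpoint's EEH Stopped state. */
struct pe {
    int      mmio_stopped;   /* MMIO Stopped state flag */
    int      dma_stopped;    /* DMA Stopped state flag  */
    uint32_t reg;            /* one stand-in MMIO register */
};

/* A failure between the PHB and the PE enters the EEH Stopped state
 * (both MMIO Stopped and DMA Stopped). */
static void pe_failure(struct pe *pe)
{
    pe->mmio_stopped = 1;
    pe->dma_stopped  = 1;
}

/* MMIO Load: all-1's with no error indication while MMIO Stopped. */
static uint32_t pe_load(const struct pe *pe)
{
    return pe->mmio_stopped ? 0xFFFFFFFFu : pe->reg;
}

/* MMIO Store: silently discarded while MMIO Stopped. */
static void pe_store(struct pe *pe, uint32_t v)
{
    if (!pe->mmio_stopped)
        pe->reg = v;
}

/* New DMA requests are blocked while DMA Stopped. */
static int pe_dma_allowed(const struct pe *pe)
{
    return !pe->dma_stopped;
}

/* Stand-ins for ibm,set-eeh-option function 2 and function 3. */
static void pe_release_mmio(struct pe *pe) { pe->mmio_stopped = 0; }
static void pe_release_dma(struct pe *pe)  { pe->dma_stopped  = 0; }
```

Note the asymmetry the recovery model relies on: MMIO can be released first, so the device driver can capture PE state with Loads while DMA remains fenced, before it either restores state or resets the PE.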
EEH Option Requirements - Although the EEH option architecture may be extended to other I/O - topologies in the future, for now this recovery architecture will be limited to + Although the EEH option architecture may be extended to other I/O + topologies in the future, for now this recovery architecture will be limited to PCI. - In order to be able to test device driver additional code for the - EEH-enabled case, the EEH option also requires the Error Injection option be + In order to be able to test the additional device driver code for the + EEH-enabled case, the EEH option also requires that the Error Injection option be implemented concurrently. - The additional requirements on the hardware for this option are as - follows. For the RTAS requirements for this option, see - . + The additional requirements on the hardware for this option are as + follows. For the RTAS requirements for this option, see + . - R1-R1--1. - For the EEH option: - A platform must implement the Error Injection option concurrently + For the EEH option: + A platform must implement the Error Injection option concurrently with the EEH option, with an error injection granularity to the PE level. - + - R1-R1--2. - For the EEH option: - If a platform is going to implement the EEH option, then the I/O + For the EEH option: + If a platform is going to implement the EEH option, then the I/O topology implementing EEH must only consist of PCI components. - + - R1-R1--3. - For the EEH option: - The hardware must provide a way to independently enable and disable + For the EEH option: + The hardware must provide a way to independently enable and disable the EEH option for each PE with normal processor Load - and Store - instructions, and must provide the capability of doing this while not + and Store + instructions, and must provide the capability of doing this while not disturbing operations to other PEs in the platform. - + - R1-R1--4. 
- For the EEH option: The hardware - fault isolation register bits must be set the same way on errors when the EEH - option is enabled as they were when the EEH option is not implemented or when + For the EEH option: The hardware + fault isolation register bits must be set the same way on errors when the EEH + option is enabled as they were when the EEH option is not implemented or when it is implemented but disabled. - + - R1-R1--5. - For the EEH option: Any - detected failure to/from a PE must set both the MMIO Stopped and DMA Stopped - states for the PE, unless the error that caused the failure can be reported to - the IOA in a way that the IOA will report the error to its device driver in a + For the EEH option: Any + detected failure to/from a PE must set both the MMIO Stopped and DMA Stopped + states for the PE, unless the error that caused the failure can be reported to + the IOA in a way that the IOA will report the error to its device driver in a way that will avoid any data corruption. - + - R1-R1--6. - For the EEH - option: If an - I/O fabric consists of a hierarchy of components, then when a failure is - detected in the fabric, all PEs that are downstream of the failure must enter - the MMIO Stopped and DMA Stopped states if they may be affected by the + For the EEH + option: If an + I/O fabric consists of a hierarchy of components, then when a failure is + detected in the fabric, all PEs that are downstream of the failure must enter + the MMIO Stopped and DMA Stopped states if they may be affected by the failure. - + - R1-R1--7. - For the EEH - option: While a PE has its EEH option enabled, if a failure occurs, - the platform must not propagate it to the system as any type of error (for + For the EEH + option: While a PE has its EEH option enabled, if a failure occurs, + the platform must not propagate it to the system as any type of error (for example, as an SERR for a PE which is a conventional PCI-to-PCI bridge). - + - R1-R1--8. 
- For the EEH option: - From the time that the MMIO Stopped state is entered for a PE, the - PE must be prevented from responding to Load and Store operations including the - operation that caused the PE to enter the MMIO Stopped state; a Load operation - must return all-1’s with no error indication and a Store operation must - be discarded (that is, Load and Store - operations being treated like they received a conventional PCI + For the EEH option: + From the time that the MMIO Stopped state is entered for a PE, the + PE must be prevented from responding to Load and Store operations including the + operation that caused the PE to enter the MMIO Stopped state; a Load operation + must return all-1’s with no error indication and a Store operation must + be discarded (that is, Load and Store + operations are treated as if they had received a conventional PCI Master Abort error), until one of the following is true: - + - The ibm,set-eeh-option RTAS call is called - with function 2 (Release PE for MMIO + The ibm,set-eeh-option RTAS call is called + with function 2 (Release PE for MMIO Load /Store operations). - The ibm, set-slot-reset RTAS call is - called with function 0 (Deactivate the reset signal to + The ibm,set-slot-reset RTAS call is + called with function 0 (Deactivate the reset signal to the PE). @@ -1883,30 +1885,30 @@ xml:lang="en"> The partition or system is rebooted. - + - + - R1-R1--9. 
- For the EEH option: - From the time that the DMA Stopped state is entered for a PE, the - PE must be prevented from initiating a new DMA request or completing a DMA - request that caused the PE to enter the DMA Stopped state (DMA requests that - were started before the DMA Stopped State is entered may be completed), and + For the EEH option: + From the time that the DMA Stopped state is entered for a PE, the + PE must be prevented from initiating a new DMA request or completing a DMA + request that caused the PE to enter the DMA Stopped state (DMA requests that + were started before the DMA Stopped State is entered may be completed), and including MSI DMA operations, until one of the following is true: - + - The ibm,set-eeh-option RTAS call is called + The ibm,set-eeh-option RTAS call is called with function 3 (Release PE for DMA operations). - The ibm, set-slot-reset RTAS call is - called with function 0 (Deactivate the reset signal to + The ibm,set-slot-reset RTAS call is + called with function 0 (Deactivate the reset signal to the PE). @@ -1917,363 +1919,364 @@ xml:lang="en"> The partition or system is rebooted. - + - + - R1-R1--10. - For the EEH option: - The hardware must provide the capability to the firmware to - determine, on a per-PE basis, that a failure has occurred which has caused the - PE to be put into the MMIO Stopped and DMA Stopped states and to read the + For the EEH option: + The hardware must provide the capability to the firmware to + determine, on a per-PE basis, that a failure has occurred which has caused the + PE to be put into the MMIO Stopped and DMA Stopped states and to read the actual state information (MMIO Stopped state and DMA Stopped state). - R1-R1--11. - For the EEH option: - The hardware must provide the capability of separately enabling and - resetting the DMA Stopped and MMIO Stopped states for a PE without disturbing - other PEs on the platform. 
The hardware must provide this capability without - requiring a PE reset and must do so through normal processor + For the EEH option: + The hardware must provide the capability of separately enabling and + resetting the DMA Stopped and MMIO Stopped states for a PE without disturbing + other PEs on the platform. The hardware must provide this capability without + requiring a PE reset and must do so through normal processor Store instructions. - + - R1-R1--12. - For the EEH option: The hardware - must provide the capability to the firmware to deactivate the reset to each PE, - independent of other PEs, and the hardware must provide the proper controls on - the reset transitions in order to prevent failures from being introduced into + For the EEH option: The hardware + must provide the capability to the firmware to deactivate the reset to each PE, + independent of other PEs, and the hardware must provide the proper controls on + the reset transitions in order to prevent failures from being introduced into the platform by the changing of the reset. - + - R1-R1--13. - For the EEH option: The hardware - must provide the capability to the firmware to activate the reset to each PE, - independent of other PEs, and the hardware must provide the proper controls on - the reset transitions in order to prevent failures from being introduced into + For the EEH option: The hardware + must provide the capability to the firmware to activate the reset to each PE, + independent of other PEs, and the hardware must provide the proper controls on + the reset transitions in order to prevent failures from being introduced into the platform by the changing of the reset. - + - R1-R1--14. - For the EEH option: - The hardware must provide the capability to the firmware to read + For the EEH option: + The hardware must provide the capability to the firmware to read the state of the reset signal to each PE. - + - R1-R1--15. 
- For the EEH option: When a PE is - put into the MMIO Stopped and DMA Stopped states, it must be done in such a way + For the EEH option: When a PE is + put into the MMIO Stopped and DMA Stopped states, it must be done in such a way to not introduce failures that may corrupt other parts of the platform. - + - R1-R1--16. - For the EEH option: - The hardware must allow firmware access to internal bridge and I/O - fabric control registers when any or all of the PEs are in the MMIO Stopped + For the EEH option: + The hardware must allow firmware access to internal bridge and I/O + fabric control registers when any or all of the PEs are in the MMIO Stopped state. - Platform Implementation Note: It is expected - that bridge and fabric control registers will have their own PE state separate + Platform Implementation Note: It is expected + that bridge and fabric control registers will have their own PE state separate from the PEs for IOAs. - + - R1-R1--17. - For the EEH - option: A PE that supports the EEH option must not share an + For the EEH + option: A PE that supports the EEH option must not share an interrupt with another PE in the platform. Hardware Implementation Notes: - + - Requirement means that - the hardware must always update the standard PCI error/status registers in the - bus’ configuration space as defined by the bus architecture, even when + Requirement means that + the hardware must always update the standard PCI error/status registers in the + bus’ configuration space as defined by the bus architecture, even when the EEH option is enabled. - The type of error information trapped by the hardware when a PE - is placed into the MMIO Stopped and DMA Stopped states is implementation - dependent. 
It is expected that the system software will do an check-exception - or ibm,slot-error-detail RTAS call to gather the error information when a + The type of error information trapped by the hardware when a PE + is placed into the MMIO Stopped and DMA Stopped states is implementation + dependent. It is expected that the system software will do an check-exception + or ibm,slot-error-detail RTAS call to gather the error information when a failure is detected. A DMA operation (Read or Write) that was initiated before a Load, - Store, or DMA error, does not - necessarily need to be blocked, as it was not a result of the - Load, Store, or DMA that failed. The normal - PCI Express ordering rules require that an ERR_FATAL or ERR_NONFATAL from a - failed Store or DMA error, or a - Load Completion with error status, will reach the PHB prior to any - DMA that might have been kicked-off in error as a result of a failed - Load or Store or a Load - or Store that follows a failed Load - or Store. This means that as long as the PHB processes - an ERR_FATAL, ERR_NONFATAL, or Load Completion which - indicates a failure, prior to processing any more DMA operations or - Load Completions, and puts the PE into the MMIO and Stopped DMA - Stopped states, implementations should be able to block DMA operations that - were kicked-off after a failing DMA operation and allow DMA operations that - were kicked off before a failing DMA operation without violating the normal PCI + Store, or DMA error, does not + necessarily need to be blocked, as it was not a result of the + Load, Store, or DMA that failed. The normal + PCI Express ordering rules require that an ERR_FATAL or ERR_NONFATAL from a + failed Store or DMA error, or a + Load Completion with error status, will reach the PHB prior to any + DMA that might have been kicked-off in error as a result of a failed + Load or Store or a Load + or Store that follows a failed Load + or Store. 
This means that as long as the PHB processes + an ERR_FATAL, ERR_NONFATAL, or Load Completion which + indicates a failure, prior to processing any more DMA operations or + Load Completions, and puts the PE into the MMIO Stopped and DMA + Stopped states, implementations should be able to block DMA operations that + were kicked-off after a failing DMA operation and allow DMA operations that + were kicked off before a failing DMA operation without violating the normal PCI Express ordering rules. - In reference to Requirements - , and - , PCI Express implementations may choose to - enter the MMIO Stopped and DMA Stopped states even if an error can be reported + In reference to Requirements + , and + , PCI Express implementations may choose to + enter the MMIO Stopped and DMA Stopped states even if an error can be reported back to the IOA. - + - + - R1-R1--18. - For the EEH option: If - the device driver(s) for any IOA(s) in a PE in the platform are EEH unaware - (that is may produce data integrity exposures due to a MMIO Stopped or DMA - Stopped state), then the firmware must prevent the IOA(s) in such a PE from - being enabled for operations (that is, do not allow the Bus Master, Memory - Space or I/O Space bits in the PCI configuration Command register from being - set to a 1) while EEH is enabled for that PE, and instead of preventing the PE - from being enabled, may instead turn off EEH when such an enable is attempted - without first an attempt by the device driver to enable EEH (by the - ibm,set-eeh-option ), providing such EEH disablement does not - violate any other requirement for EEH enablement (for example, virtualization - requirement in ). - Software Implementation Note: To be EEH - aware, a device driver does not need to be able to recover from an MMIO Stopped - or DMA Stopped state, only recognize the all-1's condition and not use data - from operations that may have occurred since the last all-1's checkpoint. 
In - addition, the device driver under such failure circumstances needs to turn off - interrupts (using the ibm,set-int-off RTAS call or by - resetting the PE and keeping it reset with ibm,set-slot-reset or - ibm,slot-error-detail) in order to make - sure that any (unserviceable) interrupts from the PE do not affect the system. - Note that this is the same device driver support needed to protect against an - IOA dying or against a no-DEVSEL type error (which may or may not be the result + For the EEH option: If + the device driver(s) for any IOA(s) in a PE in the platform are EEH unaware + (that is, may produce data integrity exposures due to an MMIO Stopped or DMA + Stopped state), then the firmware must prevent the IOA(s) in such a PE from + being enabled for operations (that is, do not allow the Bus Master, Memory + Space or I/O Space bits in the PCI configuration Command register to be + set to a 1) while EEH is enabled for that PE, and instead of preventing the PE + from being enabled, may instead turn off EEH when such an enable is attempted + without a prior attempt by the device driver to enable EEH (by the + ibm,set-eeh-option ), providing such EEH disablement does not + violate any other requirement for EEH enablement (for example, Requirement + or + ). + Software Implementation Note: To be EEH + aware, a device driver does not need to be able to recover from an MMIO Stopped + or DMA Stopped state, only recognize the all-1's condition and not use data + from operations that may have occurred since the last all-1's checkpoint. In + addition, the device driver under such failure circumstances needs to turn off + interrupts (using the ibm,set-int-off RTAS call or by + resetting the PE and keeping it reset with ibm,set-slot-reset or + ibm,slot-error-detail) in order to make + sure that any (unserviceable) interrupts from the PE do not affect the system. 
+ Note that this is the same device driver support needed to protect against an + IOA dying or against a no-DEVSEL type error (which may or may not be the result of an IOA that has died).
- +
Slot Level EEH Event Interrupt Option - Some platform implementations may allow asynchronous notification - of EEH events via an external interrupt. This is called the Slot Level EEH - Event Interrupt option. When implemented, the platform will implement the - “ibm,io-events-capable” property in the - nodes where the EEH control resides, and the ibm,set-eeh-option - RTAS call will implement function 4 to enable the - EEH interrupt for each of these nodes and function 5 to disable the EEH - interrupt for each of these nodes (individual control by node). Calling the - ibm,set-eeh-option RTAS call with function 4 or function - 5 when the node specified does not implement this capability will return a -3, + Some platform implementations may allow asynchronous notification + of EEH events via an external interrupt. This is called the Slot Level EEH + Event Interrupt option. When implemented, the platform will implement the + “ibm,io-events-capable” property in the + nodes where the EEH control resides, and the ibm,set-eeh-option + RTAS call will implement function 4 to enable the + EEH interrupt for each of these nodes and function 5 to disable the EEH + interrupt for each of these nodes (individual control by node). Calling the + ibm,set-eeh-option RTAS call with function 4 or function + 5 when the node specified does not implement this capability will return a -3, indicating invalid parameters. 
- The interrupt source specified in the - ibm,io-events child must be enabled (in addition to any individual - node enables) via the ibm,int-on RTAS call and the - priority for that interrupt, as set in the XIVE by the - ibm,set-xive RTAS call, must be something other than 0xFF, in order + The interrupt source specified in the + ibm,io-events child must be enabled (in addition to any individual + node enables) via the ibm,int-on RTAS call and the + priority for that interrupt, as set in the XIVE by the + ibm,set-xive RTAS call, must be something other than 0xFF, in order for the external interrupt to be presented to the system. - The “ibm,io-events-capable” - property, when it exists, contains 0 to N interrupt specifiers (per the - definition of interrupt specifiers for the node's interrupt parent). When no - interrupt specifiers are specified by the “ibm,io-events-capable” - property, then the interrupt, if enabled, is signaled via the interrupt specifier given in the - ibm,io-events child node of the /events + The “ibm,io-events-capable” + property, when it exists, contains 0 to N interrupt specifiers (per the + definition of interrupt specifiers for the node's interrupt parent). When no + interrupt specifiers are specified by the “ibm,io-events-capable” + property, then the interrupt, if enabled, is signaled via the interrupt specifier given in the + ibm,io-events child node of the /events node. - R1-R1--1. - For the Slot Level EEH Event + For the Slot Level EEH Event Interrupt option: All of the following must be true: - + - The platform must implement the “ibm,io-events-capable” - property in all device tree nodes which represent bridges where EEH is implemented and for which the EEH + The platform must implement the “ibm,io-events-capable” + property in all device tree nodes which represent bridges where EEH is implemented and for which the EEH io-event interrupt is to be signaled. 
- The platform must implement functions 4 and 5 of the - ibm,set-eeh-option RTAS call for all PEs under nodes that contain + The platform must implement functions 4 and 5 of the + ibm,set-eeh-option RTAS call for all PEs under nodes that contain the “ibm,io-events-capable” property. - +
- +
Error Injection (ERRINJCT) Option - The Error Injection (ERRINJCT) option is defined primarily to test - enhanced error recovery software. As implemented in the I/O bridge, this option - is used to test the software which implements the recovery which is enabled by - the EEH option in that bridge. Specifically, the ioa-bus-error and - ioa-bus-error-64 functions - of the ibm,errinjct RTAS call are used to inject errors - onto each PE primary bus, which in turn will cause certain actions on the bus - and certain actions by the PE, the EEH logic, and by the error recovery + The Error Injection (ERRINJCT) option is defined primarily to test + enhanced error recovery software. As implemented in the I/O bridge, this option + is used to test the software which implements the recovery which is enabled by + the EEH option in that bridge. Specifically, the ioa-bus-error and + ioa-bus-error-64 functions + of the ibm,errinjct RTAS call are used to inject errors + onto each PE primary bus, which in turn will cause certain actions on the bus + and certain actions by the PE, the EEH logic, and by the error recovery software. - +
ERRINJCT Option Hardware Requirements - Although the ioa-bus-error and - ioa-bus-error-64 functions of the - ibm,errinjct RTAS call may be extended to other I/O buses and PEs in + Although the ioa-bus-error and + ioa-bus-error-64 functions of the + ibm,errinjct RTAS call may be extended to other I/O buses and PEs in the future, for now this architecture will be limited to PCI buses. - The type of errors, and the injection qualifiers, place the + The type of errors, and the injection qualifiers, place the following additional requirements on the hardware for this option. - R1-R1--1. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - If a platform is going to implement either of these functions of this option, then + ioa-bus-error-64 functions of the Error Injection option: + If a platform is going to implement either of these functions of this option, then the I/O topology must be PCI. - + - R1-R1--2. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The hardware must provide a way to inject the required errors for - each PE primary bus, and the errors must be injectable independently, without + ioa-bus-error-64 functions of the Error Injection option: + The hardware must provide a way to inject the required errors for + each PE primary bus, and the errors must be injectable independently, without affecting the operations on the other buses in the platform. - + - R1-R1--3. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The hardware must provide a way to set up for the injection of the - required errors without disturbing operations to other buses outside the + ioa-bus-error-64 functions of the Error Injection option: + The hardware must provide a way to set up for the injection of the + required errors without disturbing operations to other buses outside the PE. - + - R1-R1--4. 
- For the + ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The hardware must provide a way for the firmware to set up the - following information for the error injection operation by normal processor + ioa-bus-error-64 functions of the Error Injection option: + The hardware must provide a way for the firmware to set up the + following information for the error injection operation by normal processor Load and Store instructions: - + Address at which to inject the error - + - Address mask to mask off any combination of the least significant - 24 (64 for the ioa-bus-error-64 function) bits of the + Address mask to mask off any combination of the least significant + 24 (64 for the ioa-bus-error-64 function) bits of the address - + PE primary bus number which is to receive the error - + Type of error to be injected - + - + - R1-R1--5. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The platform must have the capability of selecting the errors - specified in when the bus directly - below the bridge injecting the error is a Conventional PCI or PCI-X Bus, and - the errors specified in when the bus - directly below the bridge injecting the error is a PCI Express link, and when - that error is appropriate for the platform configuration, and the platform must - limit the injection of errors which are inappropriate for the given platform + ioa-bus-error-64 functions of the Error Injection option: + The platform must have the capability of selecting the errors + specified in when the bus directly + below the bridge injecting the error is a Conventional PCI or PCI-X Bus, and + the errors specified in when the bus + directly below the bridge injecting the error is a PCI Express link, and when + that error is appropriate for the platform configuration, and the platform must + limit the injection of errors which are inappropriate for the given platform configuration. 
- Platform Implementation Note: As an example - of inappropriate errors to inject in Requirement - , consider the configuration where there is - an I/O bridge or switch below the bridge with the injector and that bridge - generates multiple PEs, and those PEs are assigned to different LPAR - partitions. In that case, injection of some real errors may cause the switches - or bridges to react and generate an error that affects multiple partitions, - which would be inappropriate. Therefore, to comply with Requirement - , the platform may either emulate some - errors in some configurations instead of injecting real errors on the link or - bus, or else the platform may not support injection at all to those PEs. - Another example where a particular error may be inappropriate is when there is - a heterogeneous network between the PHB and the PE (for example, a PCI Express + Platform Implementation Note: As an example + of inappropriate errors to inject in Requirement + , consider the configuration where there is + an I/O bridge or switch below the bridge with the injector and that bridge + generates multiple PEs, and those PEs are assigned to different LPAR + partitions. In that case, injection of some real errors may cause the switches + or bridges to react and generate an error that affects multiple partitions, + which would be inappropriate. Therefore, to comply with Requirement + , the platform may either emulate some + errors in some configurations instead of injecting real errors on the link or + bus, or else the platform may not support injection at all to those PEs. + Another example where a particular error may be inappropriate is when there is + a heterogeneous network between the PHB and the PE (for example, a PCI Express bridge that converts between a PCI Express PHB and a PCI-X PE). 
- Supported Errors for Conventional PCI, PCI-X Mode 1 + <title>Supported Errors for Conventional PCI, PCI-X Mode 1 or PCI-X Mode 2 Error Injectors @@ -2316,9 +2319,9 @@ xml:lang="en"> Data Parity Error - All PCI-X adapters operating in Mode 2 and some - operating in Mode 1 utilize a double bit detecting, single bit correcting Error - Correction Code (ECC). In these cases, ensure that at least two bits are + All PCI-X adapters operating in Mode 2 and some + operating in Mode 1 utilize a double bit detecting, single bit correcting Error + Correction Code (ECC). In these cases, ensure that at least two bits are modified to detect this error. @@ -2354,9 +2357,9 @@ xml:lang="en"> Data Parity Error - All PCI-X adapters operating in Mode 2 and some - operating in Mode 1 utilize a double bit detecting, single bit correcting Error - Correction Code (ECC). In these cases, ensure that at least two bits are + All PCI-X adapters operating in Mode 2 and some + operating in Mode 1 utilize a double bit detecting, single bit correcting Error + Correction Code (ECC). In these cases, ensure that at least two bits are modified to detect this error. @@ -2392,9 +2395,9 @@ xml:lang="en"> Data Parity Error - All PCI-X adapters operating in Mode 2 and some - operating in Mode 1 utilize a double bit detecting, single bit correcting Error - Correction Code (ECC). In these cases, ensure that at least two bits are + All PCI-X adapters operating in Mode 2 and some + operating in Mode 1 utilize a double bit detecting, single bit correcting Error + Correction Code (ECC). In these cases, ensure that at least two bits are modified to detect this error. @@ -2466,8 +2469,8 @@ xml:lang="en"> TLP ECRC Error - The TLP ECRC covers the address and data bits of a TLP. - Therefore, one cannot determine if the integrity error resides in the address + The TLP ECRC covers the address and data bits of a TLP. 
+ Therefore, one cannot determine if the integrity error resides in the address or data portion of a TLP. @@ -2493,8 +2496,8 @@ xml:lang="en"> TLP ECRC Error - The TLP ECRC covers the address and data bits of a TLP. - Therefore, one cannot determine if the integrity error resides in the address + The TLP ECRC covers the address and data bits of a TLP. + Therefore, one cannot determine if the integrity error resides in the address or data portion of a TLP. @@ -2517,8 +2520,8 @@ xml:lang="en"> TLP ECRC Error - The TLP ECRC covers the address and data bits of a TLP. - Therefore, one cannot determine if the integrity error resides in the address + The TLP ECRC covers the address and data bits of a TLP. + Therefore, one cannot determine if the integrity error resides in the address or data portion of a TLP. @@ -2527,16 +2530,16 @@ xml:lang="en">
- + - R1-R1--6. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The hardware must provide a way to inject the errors in - in a non-persistent manner (that is, at + ioa-bus-error-64 functions of the Error Injection option: + The hardware must provide a way to inject the errors in + in a non-persistent manner (that is, at most one injection for each invocation of the ibm,errinjct RTAS call). @@ -2546,61 +2549,61 @@ xml:lang="en">
ERRINJCT Option OF Requirements - The Error Injection option will be disabled for all IOAs prior to + The Error Injection option will be disabled for all IOAs prior to the OS getting control. - R1-R1--1. For the ioa-bus-error and - ioa-bus-error-64 functions of the Error Injection option: - The OF must disable the ERRINJCT option for all PEs and all empty - slots on all bridges which implement this option prior to passing control to + ioa-bus-error-64 functions of the Error Injection option: + The OF must disable the ERRINJCT option for all PEs and all empty + slots on all bridges which implement this option prior to passing control to the OS. - Hardware and Firmware Implementation Note: - The platform only needs the capability to setup the injection of one error at a - time, and therefore injection facilities can be shared. The - ibm,open-errinjct and ibm,close-errinjct are - used to make sure that only one user is using the injection facilities at a + Hardware and Firmware Implementation Note: + The platform only needs the capability to setup the injection of one error at a + time, and therefore injection facilities can be shared. The + ibm,open-errinjct and ibm,close-errinjct are + used to make sure that only one user is using the injection facilities at a time.
- +
Bridged-I/O EEH Support Option - If a platform requires multi-function I/O cards which are - constructed by placing multiple IOAs beneath a PCI to PCI bridge, then extra - support is needed to support such cards in an EEH-enabled environment. If this - option is implemented, then the ibm,configure-bridge RTAS - call will be implemented and therefore the - “ibm,configure-bridge” property will exist in the + If a platform requires multi-function I/O cards which are + constructed by placing multiple IOAs beneath a PCI to PCI bridge, then extra + support is needed to support such cards in an EEH-enabled environment. If this + option is implemented, then the ibm,configure-bridge RTAS + call will be implemented and therefore the + “ibm,configure-bridge” property will exist in the rtas device node. - R1-R1--1. - For the Bridged-I/O EEH - Support option: The platform must support the + For the Bridged-I/O EEH + Support option: The platform must support the ibm,configure-bridge RTAS call. - + - R1-R1--2. - For the Bridged-I/O EEH - Support option: The OS must provide the correct EEH coordination - between device drivers that control multiple IOAs that are in the same + For the Bridged-I/O EEH + Support option: The OS must provide the correct EEH coordination + between device drivers that control multiple IOAs that are in the same PE. diff --git a/LoPAR/ch_lpar_option.xml b/LoPAR/ch_lpar_option.xml index 40e35e5..aa40499 100644 --- a/LoPAR/ch_lpar_option.xml +++ b/LoPAR/ch_lpar_option.xml @@ -211,7 +211,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo I/O Sub-System Requirements The platform divides the I/O subsystem up into Partitionable Endpoints (PEs). See - for more information on PEs. Each PE has + for more information on PEs. Each PE has its own (separate) error, addressing, and interrupt domains which allows the assignment of separate PEs to different LPAR partitions. 
The following are the requirements for I/O subsystems when the @@ -2034,7 +2034,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo For the LPAR option: If a platform reports in its “ibm,hypertas-functions” property (see - ) that it supports a function set, then it + ) that it supports a function set, then it must support all hcall()s of that function set as defined in . @@ -6190,7 +6190,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo using 64 bit linkage conventions and apply to all page sizes that the platform supports as specified by the “ibm,processor-page-sizes” property. (See - for more details.) + for more details.) The Page actual size is encoded in the PFT entry per the architecture Book IIIs along with the segment base page size per the @@ -6463,7 +6463,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo logical region supports (see “ibm,dynamic-memory” and “ibm,lmb-page-sizes” in - as well as + as well as for more details). @@ -6487,7 +6487,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo For the LPAR option: Each logical region must support all page sizes presented in the “ibm,processor-page-sizes” property in - that are less than or equal to the + that are less than or equal to the size of the logical region as specified by either the OF standard “reg” property of the logical region’s OF @@ -6495,7 +6495,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo “ibm,lmb-size” property of the logical region’s /ibm,dynamic-reconfiguration-memory node in - . + . Implementation Note: 32 bit versions of AIX only support 36 bit logical address memory spaces. 
Providing such a partition with a larger @@ -7883,7 +7883,7 @@ hcall ( const int64 H_BULK_REMOVE, /* Function Code */ fewer than 8 entries for a given actual page size / base page size combination as communicated by the “Block Invalidate Characteristics” system parameter (see - ). + ). The virtual pages are all within the same naturally aligned 8 page virtual address block and have the same page and segment size encodings. The AVA parameter, @@ -8648,7 +8648,7 @@ hcall ( const int64 H_RESIZE_HPT_COMMIT, /* Function Code */ “ibm,dma-window” property associated with the particular IOA. For the format of the “ibm,dma-window” property, reference - . + .
H_GET_TCE @@ -8710,7 +8710,7 @@ hcall ( const uint64 H_GET_TCE, /* Return the contents of the specified TCE */ If specified TCE’s Page Mapping and Control bits (see - ) specify “Page Fault” then + ) specify “Page Fault” then return H_Success @@ -8783,7 +8783,7 @@ hcall ( const uint64 H_PUT_TCE, /* Function Token */ If the Page Mapping and Control field of the TCE is not “Page Fault” (see - ) + ) @@ -8882,7 +8882,7 @@ hcall ( const uint64 H_STUFF_TCE, /* Function Token */ If the Page Mapping and Control field of the TCE is not “Page Fault” (see - ) + ) @@ -9136,7 +9136,7 @@ hcall ( const uint64 H_PUT_TCE_INDIRECT, /* Function Token */ If the Page Mapping and Control field of the 8 byte entry “T” is not “Page Fault” (see - ) then do + ) then do @@ -10314,7 +10314,7 @@ hcall ( const uint64 H_CLEAR_HPT);]]> /chosen “ibm,architecture-vec” Byte 23 bits 0-1 undefined or 0b00 See ibm,architecture vector 5, byte 23 in - + for more details. @@ -12314,7 +12314,7 @@ hcall ( const uint64 H_INT_RESET, /* Reset all interrupt structures */ returns; this requires that the OS not reuse/modify the data within the old page until the worst case DMA read access time has expired. The “ibm,dma-delay-time” property (see - ) gives the OS this implementation + ) gives the OS this implementation dependent delay value. Failure to observe this delay time may result in data corruption as seen by the caller’s I/O adapter(s). @@ -15913,7 +15913,7 @@ hcall ( const uint64 H_GET_DMA_XLATES_LIMITED, /*Return I/O Bus and correspondin performed, resulting in a constrained return from such a request. System Parameters readable via the ibm,get-system-parameter RTAS call (see - ) + ) communicate a variety of configuration and constraint parameters among which are determined by the partition definition. 
@@ -16742,7 +16742,7 @@ hcall ( const uint64 H_PROD, /* Mark the target processor runable */ When the value of the “ibm,partition-performance-parameters-level” see - ) is >=1 then register R8 contains + ) is >=1 then register R8 contains the processor virtualization resource allocations. In the case of a dedicated processor partition R8 contains 0: @@ -16922,7 +16922,7 @@ hcall ( const uint64 H_SET_PPP, /* Modifies the specified partition’s performa running on processors that do not implement the register in hardware, firmware simulates the function. On platforms that present the property “ibm,rks-hcalls” with bit 2 set (see - ), this call provides a reduced + ), this call provides a reduced “kill set” of volatile registers, GPRs r0 and r5-r13 are preserved. @@ -17086,7 +17086,7 @@ hcall ( const uint64 H_PIC ); /*Returns in R4 the value of the Pool Idle Count * When the value of the “ibm,partition-performance-parameters-level” (see - ) is >=1 then: + ) is >=1 then: @@ -17338,7 +17338,7 @@ hcall ( const uint64 H_JOIN /* Join active threads and return H_CONTINUE to fina For the Virtual Processor Home Node option: The platform must support the “Form 1” of the “ibm,associativity-reference-points” property per - . The client program may call + . The client program may call H_HOME_NODE_ASSOCIATIVITY hcall() with a valid identifier input parameter (such as from the device tree or from the ibm,configure-connector RTAS call) even if the @@ -17581,7 +17581,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no For the Partition Migration and Partition Hibernation options: The platform must implement the Partition Suspension option (See - ). + ). @@ -17613,7 +17613,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no For the Partition Migration and Partition Hibernation options: The platform must implement the Version 6 Extensions of Event Log Format for all reported events (See - ). + ). 
@@ -17658,7 +17658,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no For the Partition Migration and Partition Hibernation options: The platform must present the “ibm,nominal-tbf” property (See - ) with the value of 512 MHz. + ) with the value of 512 MHz. @@ -17668,7 +17668,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no For the Partition Suspension option: The platform must present the properties from - , as specified by + , as specified by , to a partition. @@ -17681,7 +17681,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no value of all properties in must not change while a partition is suspended except for those properties described by - . + . @@ -18300,7 +18300,7 @@ hcall ( const uint64 H_HOME_NODE_ASSOCIATIVITY), /* Returns in R4-R9 the home no For the VRMA option: The platform must include the “ibm,vrma-page-sizes” property (See - ) in the + ) in the /cpu node. @@ -18450,7 +18450,7 @@ hcall ( const uint64 H_VRMASD, /* Change the page mapping characteristics of the (except for time delays) handle all effects of any memory expropriation that it may introduce unless the CMO option is explicitly enabled by the setting of architecture.vec option vector 5 byte 4 bit 0 (See - for details). + for details). The CMO option consists of the following LoPAR extensions: @@ -19977,7 +19977,7 @@ hcall ( const H_GET_MPP_X /* Returns in R4-R10 extended Memory Performance */ Check Interrupt by returning to the partition’s interrupt vector at location 0x0200. Note the subsequent firmware assisted NMI and check exception processing returns a VPM SUE error log (See - ). + ). @@ -21604,7 +21604,7 @@ hcall ( const uint64 H_REGISTER_PROCESS_TABLE, /* Set translation mode */ the value 1) cede latency specifier settings. Platforms that implement cede latency specifier settings greater than the value of 1 implement the cede latency settings system parameter see - . 
The hypervisor is then free to take energy + . The hypervisor is then free to take energy management actions with this hint in mind. @@ -21666,7 +21666,7 @@ hcall ( const uint64 H_REGISTER_PROCESS_TABLE, /* Set translation mode */ For the PEM option: If the platform implements cede latency specifier values greater than 1 it must implement the cede latency settings system parameter see - . + . diff --git a/LoPAR/ch_nonvolatile_memory.xml b/LoPAR/ch_nonvolatile_memory.xml index 28bdebd..00afa53 100644 --- a/LoPAR/ch_nonvolatile_memory.xml +++ b/LoPAR/ch_nonvolatile_memory.xml @@ -1,136 +1,136 @@ - Non-Volatile Memory - This chapter describes the requirements relating to Non-Volatile - Memory. Non-Volatile Memory is the repository for system information that must + This chapter describes the requirements relating to Non-Volatile + Memory. Non-Volatile Memory is the repository for system information that must be persistent across reboots and power cycles. - +
System Requirements - R1-R1--1. - Platforms must implement at least 8 KB of - Non-Volatile Memory. The actual amount is platform dependent and must allow for - 4 KB for the OS. Platforms must provide an additional 4 KB for each installed + Platforms must implement at least 8 KB of + Non-Volatile Memory. The actual amount is platform dependent and must allow for + 4 KB for the OS. Platforms must provide an additional 4 KB for each installed OS beyond the first. - + - R1-R1--2. - Non-Volatile Memory must maintain its + Non-Volatile Memory must maintain its contents in the absence of system power. - + - R1-R1--3. - Firmware must reinitialize NVRAM to a + Firmware must reinitialize NVRAM to a bootable state if NVRAM data corruption is detected. - + - R1-R1--4. - OSs must reinitialize their own NVRAM - partitions if NVRAM data corruption is detected. OSs may create free space from - the first corrupted NVRAM partition header to the end of NVRAM and utilize this + OSs must reinitialize their own NVRAM + partitions if NVRAM data corruption is detected. OSs may create free space from + the first corrupted NVRAM partition header to the end of NVRAM and utilize this area to initialize their NVRAM partitions. - Hardware Implementation Note: The NVRAM terminology used in this - chapter goes back to historic implementations that have used battery-powered - RAM to implement the non-volatile memory. It should be understood that this is - not the only possible implementation. Implementers need to understand that - there are no limits on the frequency of writing to the non-volatile memory, so - certain technologies may not be applicable. Also, it should be noted that the - nvram-fetch and nvram-store RTAS - calls do not allow a “busy” Status return, and this may further + Hardware Implementation Note: The NVRAM terminology used in this + chapter goes back to historic implementations that have used battery-powered + RAM to implement the non-volatile memory. 
It should be understood that this is + not the only possible implementation. Implementers need to understand that + there are no limits on the frequency of writing to the non-volatile memory, so + certain technologies may not be applicable. Also, it should be noted that the + nvram-fetch and nvram-store RTAS + calls do not allow a “busy” Status return, and this may further limit the implementation choices. - Software Implementation Note: Refer to + Software Implementation Note: Refer to for information on accessing NVRAM.
- +
Structure - NVRAM is formatted as a set of NVRAM partitions that adhere to the - structure in . NVRAM partitions are - prefixed with a header containing signature, - checksum, length, and - name fields. The structure of the data field - is defined by the NVRAM partition creator/owner (designated by + NVRAM is formatted as a set of NVRAM partitions that adhere to the + structure in . NVRAM partitions are + prefixed with a header containing signature, + checksum, length, and + name fields. The structure of the data field + is defined by the NVRAM partition creator/owner (designated by signature and name). - R1-R1--1. - NVRAM partitions must be structured as shown + NVRAM partitions must be structured as shown in . - + - R1-R1--2. - All NVRAM space must be accounted for by + All NVRAM space must be accounted for by NVRAM partitions. - + - R1-R1--3. - All IBM-defined NVRAM partitions that are - intended to be IBM-unique must have names prefixed with the ASCII + All IBM-defined NVRAM partitions that are + intended to be IBM-unique must have names prefixed with the ASCII representation of the four characters: ibm,. - Software Implementation Note: Although the data - areas of NVRAM partitions are not required to have error checking, it is - strongly recommended that the system software implement robust data structures - and error checking. Loss of NVRAM structures due to data corruption can be - catastrophic, potentially leading to OS reinstallation and/or complete system + Software Implementation Note: Although the data + areas of NVRAM partitions are not required to have error checking, it is + strongly recommended that the system software implement robust data structures + and error checking. Loss of NVRAM structures due to data corruption can be + catastrophic, potentially leading to OS reinstallation and/or complete system initialization.
- +
Signatures - The signature field is used as the first level - of NVRAM partition identification. - lists all the currently defined signature types and their ownership classes. - The ownership class determines the permission of a particular system software - component to create and/or modify NVRAM partitions and/or NVRAM partition - contents. All NVRAM partitions may be read by any system software component, - but the ownership class has exclusive write permission. Global ownership gives - read/write permission to all system software components. These restrictions are - made to minimize the possibility of corruption of NVRAM during update + The signature field is used as the first level + of NVRAM partition identification. + lists all the currently defined signature types and their ownership classes. + The ownership class determines the permission of a particular system software + component to create and/or modify NVRAM partitions and/or NVRAM partition + contents. All NVRAM partitions may be read by any system software component, + but the ownership class has exclusive write permission. Global ownership gives + read/write permission to all system software components. These restrictions are + made to minimize the possibility of corruption of NVRAM during update activities. - Hardware and Software Implementation Note: It is recommended that - NVRAM partitions be ordered on the signature field with the lowest value - signature NVRAM partition at the lowest NVRAM address (with the exception of - signature = 0x7F, free space). This will minimize the effect of NVRAM data + Hardware and Software Implementation Note: It is recommended that + NVRAM partitions be ordered on the signature field with the lowest value + signature NVRAM partition at the lowest NVRAM address (with the exception of + signature = 0x7F, free space). This will minimize the effect of NVRAM data corruption on system operation. 
@@ -167,9 +167,9 @@ version="5.0" xml:lang="en"> 1 byte - The signature field is used to - identify the NVRAM partition type and provide some level of checking for - overall NVRAM contamination. Signature assignments are given in + The signature field is used to + identify the NVRAM partition type and provide some level of checking for + overall NVRAM contamination. Signature assignments are given in . @@ -181,11 +181,11 @@ version="5.0" xml:lang="en"> 1 byte - The checksum field is included to - provide a check on the validity of the header. The checksum covers the - signature, length, and - name fields and is calculated (on a byte by byte or equivalent - basis) by: add, and add 1 back to the sum if a carry resulted as demonstrated + The checksum field is included to + provide a check on the validity of the header. The checksum covers the + signature, length, and + name fields and is calculated (on a byte by byte or equivalent + basis) by: add, and add 1 back to the sum if a carry resulted as demonstrated with the following program listing. - This checksum algorithm guarantees 0 to be an impossible + This checksum algorithm guarantees 0 to be an impossible calculated value. A valid header cannot have a checksum of zero. @@ -215,13 +215,13 @@ unsigned int nbytes; /* number of bytes to sum */ 2 bytes - The length field designates the - total length of the NVRAM partition, in 16-byte blocks, beginning with the - signature and ending with the last byte of the data area. A length of zero is + The length field designates the + total length of the NVRAM partition, in 16-byte blocks, beginning with the + signature and ending with the last byte of the data area. A length of zero is invalid. 
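The end-around-carry rule described above (add byte by byte, and add 1 back to the sum whenever a carry results) can be sketched as follows. This is an illustrative sketch of the stated rule, not the specification's own program listing; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the signature, length, and name bytes of an NVRAM partition
 * header one at a time; whenever a carry out of the low byte occurs,
 * add 1 back into the sum (end-around carry). */
static uint8_t nvram_header_checksum(const uint8_t *bytes, size_t nbytes)
{
    unsigned int sum = 0;

    for (size_t i = 0; i < nbytes; i++) {
        sum += bytes[i];
        if (sum > 0xFF)                 /* carry out of the byte */
            sum = (sum & 0xFF) + 1;     /* add the carry back in */
    }
    return (uint8_t)sum;
}
```

Note that with end-around carry a nonzero running sum can wrap to 0x01 but never back to 0x00, which is why a valid header (whose signature byte is nonzero) cannot have a checksum of zero.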
- Software Implementation Note: The - length field must always provide valid offsets to the - next header since an invalid length effectively causes the loss of access to + Software Implementation Note: The + length field must always provide valid offsets to the + next header since an invalid length effectively causes the loss of access to every NVRAM partition beyond it. @@ -233,18 +233,18 @@ unsigned int nbytes; /* number of bytes to sum */ 12 bytes - The name field is a 12 byte string - (or a NULL-terminated string of less than 12 bytes) used to identify a - particular NVRAM partition within a signature group. In order to reduce the - likelihood of a naming conflict, each platform-specific or OS-specific NVRAM - partition name should be prefixed with a company name as specified under the - description of the “name” string in the - , that is, a company name string - in one of the three forms described in the reference, followed by a comma - (“,”). If the company name string is null, the name will be + The name field is a 12 byte string + (or a NULL-terminated string of less than 12 bytes) used to identify a + particular NVRAM partition within a signature group. In order to reduce the + likelihood of a naming conflict, each platform-specific or OS-specific NVRAM + partition name should be prefixed with a company name as specified under the + description of the “name” string in the + , that is, a company name string + in one of the three forms described in the reference, followed by a comma + (“,”). If the company name string is null, the name will be interpreted as “other”. - Before assigning a new name to a NVRAM partition, software - should scan the existing NVRAM partitions and ensure that an unwanted name + Before assigning a new name to a NVRAM partition, software + should scan the existing NVRAM partitions and ensure that an unwanted name conflict is not created. 
@@ -256,7 +256,7 @@ unsigned int nbytes; /* number of bytes to sum */ length minus 16 bytes - The structure of the data area is + The structure of the data area is controlled by the creator/owner of the NVRAM partition. @@ -350,8 +350,8 @@ unsigned int nbytes; /* number of bytes to sum */ 0 to n - This signature is used to mark free space in the NVRAM - array. The name field of all signature 0x7F NVRAM + This signature is used to mark free space in the NVRAM + array. The name field of all signature 0x7F NVRAM partitions must be set to 0x7...77. @@ -374,8 +374,8 @@ unsigned int nbytes; /* number of bytes to sum */ - Note: Any signature not defined above is reserved, and - signatures 0x02, 0x50, 0x51, 0x52, 0x71, and 0x72 are reserved for legacy + Note: Any signature not defined above is reserved, and + signatures 0x02, 0x50, 0x51, 0x52, 0x71, and 0x72 are reserved for legacy reasons. @@ -383,136 +383,136 @@ unsigned int nbytes; /* number of bytes to sum */
- +
Architected NVRAM Partitions
System (0x70) - System NVRAM partitions are used for storing information - (typically, configuration variables) accessible to both OF and the OS. Refer to - for the definition of the contents of the + System NVRAM partitions are used for storing information + (typically, configuration variables) accessible to both OF and the OS. Refer to + for the definition of the contents of the System NVRAM partition named common. - R1-R1--1. - Every system NVRAM must contain a System + Every system NVRAM must contain a System NVRAM partition with the NVRAM partition name = common. - + - R1-R1--2. - Data in the common - NVRAM partition must be stored as NULL-terminated strings of the form: - <name>=<string> and the - data area must be terminated with at least two NULL + Data in the common + NVRAM partition must be stored as NULL-terminated strings of the form: + <name>=<string> and the + data area must be terminated with at least two NULL characters. - + - R1-R1--3. - All names used in + All names used in the common NVRAM partition must be unique. - + - R1-R1--4. - Device and file specifications used in - the common NVRAM partition must follow IEEE Std 1275 + Device and file specifications used in + the common NVRAM partition must follow IEEE Std 1275 nomenclature conventions. - +
System NVRAM Partition - The System NVRAM partition, with name = common, - contains information that is accessible to both OF and OSs. - The contents of this NVRAM partition are represented in the OF device tree as - properties (i.e., (name, value) - pairs) in the /options node. While OF is available, the - OS can alter the contents of these properties by using the - setprop client interface service. When OF is no longer available, - the OS can alter the contents of the System NVRAM partition itself, following - the rules below for the formats of the name and - value. Information is stored in the System NVRAM - partition as a sequence of (name, + The System NVRAM partition, with name = common, + contains information that is accessible to both OF and OSs. + The contents of this NVRAM partition are represented in the OF device tree as + properties (i.e., (name, value) + pairs) in the /options node. While OF is available, the + OS can alter the contents of these properties by using the + setprop client interface service. When OF is no longer available, + the OS can alter the contents of the System NVRAM partition itself, following + the rules below for the formats of the name and + value. Information is stored in the System NVRAM + partition as a sequence of (name, value) pairs in the following format: name = value - where name follows the rules defined in - and value follows the - rules defined in . The end of the + where name follows the rules defined in + and value follows the + rules defined in . The end of the sequence of pairs is denoted by a NULL (0x00) byte. - +
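The name = value layout described above (NULL-terminated strings, with an empty string ending the sequence) lends itself to a simple linear scan. The helper below is a hypothetical illustration, not an interface defined by this architecture; it assumes only the data-area format stated in the text.

```c
#include <stddef.h>
#include <string.h>

/* Look up a variable in a System ("common") partition data area: a
 * sequence of NULL-terminated "name=value" strings, terminated by an
 * empty string (i.e., at least two NULL characters in a row).
 * Returns a pointer to the value text, or NULL if absent. */
static const char *find_nvram_var(const char *data, size_t len,
                                  const char *name)
{
    size_t nlen = strlen(name);
    const char *p = data;
    const char *end = data + len;

    while (p < end && *p != '\0') {     /* empty string ends the list */
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)
            break;                      /* malformed: no terminator */
        if ((size_t)(nul - p) > nlen &&
            memcmp(p, name, nlen) == 0 && p[nlen] == '=')
            return p + nlen + 1;        /* text after "name=" */
        p = nul + 1;                    /* advance past string and NULL */
    }
    return NULL;
}
```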
Name - Since the data in the System NVRAM partition is an external - representation of properties of the /options node, the - name component must follow the rules for property names + Since the data in the System NVRAM partition is an external + representation of properties of the /options node, the + name component must follow the rules for property names as defined by Section 3.2.2.1.1 Property names of - ; i.e., a string of 1-31 printable - characters containing no uppercase characters or the characters - “/”, “\”, “:”, “[”, - “]” or “@”. In addition to these rules, a naming - convention is required for OS-specific names to avoid name conflicts. Each such - name must begin with the OS vendor’s OUI followed by a - “,”; e.g., aapl,xxx or - ibm,xxx. This introduces separate name spaces for each vendor in which + ; i.e., a string of 1-31 printable + characters containing no uppercase characters or the characters + “/”, “\”, “:”, “[”, + “]” or “@”. In addition to these rules, a naming + convention is required for OS-specific names to avoid name conflicts. Each such + name must begin with the OS vendor’s OUI followed by a + “,”; e.g., aapl,xxx or + ibm,xxx. This introduces separate name spaces for each vendor in which it manages its own naming conventions.
- +
Value - The value component of System NVRAM partition data can contain an - arbitrary number of bytes in the range 0x01 to 0xFF, terminated by a NULL - (0x00) byte. Bytes in the range 0x01 to 0xFE represent themselves. In order to - allow arbitrary byte data to be represented, an encoding is used to represent - strings of 0x00 or 0xFF bytes. This encoding uses the 0xFF byte as an escape, + The value component of System NVRAM partition data can contain an + arbitrary number of bytes in the range 0x01 to 0xFF, terminated by a NULL + (0x00) byte. Bytes in the range 0x01 to 0xFE represent themselves. In order to + allow arbitrary byte data to be represented, an encoding is used to represent + strings of 0x00 or 0xFF bytes. This encoding uses the 0xFF byte as an escape, indicating that the following byte is encoded as: bnnnnnnn - where b, the most-significant bit, is 0 to represent a sequence of - 0x00 bytes or 1 to represent a sequence of 0xFF bytes. nnnnnnn, the - least-significant 7 bits, is a binary number (in the range 0x01 to 0x7F) that + where b, the most-significant bit, is 0 to represent a sequence of + 0x00 bytes or 1 to represent a sequence of 0xFF bytes. nnnnnnn, the + least-significant 7 bits, is a binary number (in the range 0x01 to 0x7F) that represents the number of repetitions of 0x00 or 0xFF.
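The escape encoding described above can be sketched as a decoder. This is an illustrative sketch of the stated encoding; the function name and error handling are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a value field: 0x01..0xFE stand for themselves; 0xFF is an
 * escape whose following byte bnnnnnnn selects the fill (b=0 -> 0x00,
 * b=1 -> 0xFF) and a repetition count nnnnnnn (1..0x7F). A 0x00 byte
 * terminates the encoded value. Returns the decoded length. */
static size_t decode_nvram_value(const uint8_t *in, uint8_t *out,
                                 size_t outmax)
{
    size_t n = 0;

    while (*in != 0x00) {
        if (*in == 0xFF) {              /* escape: expand a run */
            uint8_t rep   = in[1];
            uint8_t fill  = (rep & 0x80) ? 0xFF : 0x00;
            uint8_t count = rep & 0x7F;
            if (count == 0 || n + count > outmax)
                break;                  /* malformed, or out of room */
            memset(out + n, fill, count);
            n += count;
            in += 2;
        } else {                        /* ordinary byte */
            if (n >= outmax)
                break;
            out[n++] = *in++;
        }
    }
    return n;
}
```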
- +
OF Configuration Variables - OF configuration variables control the operation of OF. In addition - to the standard configuration variables defined in - , other configuration variables are defined - in . While such variables are stored in the System - NVRAM partition as described above, they have additional rules placed on the - format of the value component. Each configuration variable is also represented - by a user interface word (of the same name) that returns stack value(s) when - that word is evaluated. Each also has a platform defined default value; the - absence of a configuration variable in the System NVRAM partition indicates - that the value is set to its default value. The format of the external - representation of configuration variables, and their stack representation, is - defined by Section 7.4.4.1 Configuration Variables of - ; the format depends upon the data type of - the configuration variable. Whereas the internal storage format is not defined - by , this architecture specifies them - as described below. The names of configuration variables are defined in OF configuration variables control the operation of OF. In addition + to the standard configuration variables defined in + , other configuration variables are defined + in . While such variables are stored in the System + NVRAM partition as described above, they have additional rules placed on the + format of the value component. Each configuration variable is also represented + by a user interface word (of the same name) that returns stack value(s) when + that word is evaluated. Each also has a platform defined default value; the + absence of a configuration variable in the System NVRAM partition indicates + that the value is set to its default value. The format of the external + representation of configuration variables, and their stack representation, is + defined by Section 7.4.4.1 Configuration Variables of + ; the format depends upon the data type of + the configuration variable. 
Whereas the internal storage format is not defined + by , this architecture specifies them + as described below. The names of configuration variables are defined in , except as noted otherwise. - +
Boolean Configuration Variables - The value of a boolean configuration variable is represented in the - System NVRAM partition as the string “true” or - “false”. The following configuration variables are of type + The value of a boolean configuration variable is represented in the + System NVRAM partition as the string “true” or + “false”. The following configuration variables are of type boolean: @@ -547,9 +547,9 @@ unsigned int nbytes; /* number of bytes to sum */
Integer Configuration Variables - The value of an integer configuration variable is represented in - the System NVRAM partition as a decimal number or a hexadecimal number preceded - by “0x”. The following configuration variables are of type + The value of an integer configuration variable is represented in + the System NVRAM partition as a decimal number or a hexadecimal number preceded + by “0x”. The following configuration variables are of type integer: @@ -584,14 +584,14 @@ unsigned int nbytes; /* number of bytes to sum */
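The boolean and integer external representations described above can be read back as sketched below. The function names are hypothetical, and `strtol` with base 0 is a convenience that also accepts leading-0 octal, a simplification of the stated decimal-or-"0x" format.

```c
#include <stdlib.h>
#include <string.h>

/* A boolean configuration variable is stored as the string "true"
 * or "false"; treat anything other than "true" as false. */
static int parse_config_bool(const char *value)
{
    return strcmp(value, "true") == 0;
}

/* An integer configuration variable is stored as a decimal number
 * or a hexadecimal number preceded by "0x"; base 0 accepts both. */
static long parse_config_int(const char *value)
{
    return strtol(value, NULL, 0);
}
```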
- +
String Configuration Variables - The value of a string configuration variable is represented in the - System NVRAM partition as the characters of the string. Where multiple - “lines” of text are represented, each line is terminated by a - carriage-return (0x0D), a line-feed (0x0A), or carriage-return, line-feed - sequence (0x0D, 0x0A). The following configuration variables are of type + The value of a string configuration variable is represented in the + System NVRAM partition as the characters of the string. Where multiple + “lines” of text are represented, each line is terminated by a + carriage-return (0x0D), a line-feed (0x0A), or carriage-return, line-feed + sequence (0x0D, 0x0A). The following configuration variables are of type string: @@ -641,8 +641,8 @@ unsigned int nbytes; /* number of bytes to sum */
Byte Configuration Variables - The value of a bytes configuration variable is represented by an - arbitrary number of bytes, using the encoding escape for values of 0x00 and + The value of a bytes configuration variable is represented by an + arbitrary number of bytes, using the encoding escape for values of 0x00 and 0xFF. The following configuration variables are of type bytes: @@ -655,89 +655,89 @@ unsigned int nbytes; /* number of bytes to sum */
DASD Spin-up Control - In order to reduce the boot time of platforms, a configuration - variable is defined to communicate from the platform to the OS to what extent - spin-up of hard disk drives can be overlapped. Disk drives generally draw more - current as the motors spin up to operating speed, thus the capacity of the + In order to reduce the boot time of platforms, a configuration + variable is defined to communicate from the platform to the OS to what extent + spin-up of hard disk drives can be overlapped. Disk drives generally draw more + current as the motors spin up to operating speed, thus the capacity of the power supply limits the ability to spin up drives simultaneously. - The configuration variable - ibm,dasd-spin-interval indicates the minimum time, in seconds, that - must be allowed between initiating the spin-up of hard disk drives on the - platform. Presence of this variable potentially allows starting up a drive - prior to receiving completion status from a drive previously started. The - absence of this variable implies no platform knowledge regarding the capability - to overlap and, hence, the OS should wait for the appropriate device status + The configuration variable + ibm,dasd-spin-interval indicates the minimum time, in seconds, that + must be allowed between initiating the spin-up of hard disk drives on the + platform. Presence of this variable potentially allows starting up a drive + prior to receiving completion status from a drive previously started. The + absence of this variable implies no platform knowledge regarding the capability + to overlap and, hence, the OS should wait for the appropriate device status before proceeding to subsequent drives (no overlap). - R1-R1--1. 
- If a platform wants to overlap spinning - up its hard disk drives to improve boot performance, it must create the - ibm,dasd-spin-interval OF configuration variable in the - NVRAM signature 0x70 NVRAM partition named common and set - it equal to an integer that represents the minimum time, in seconds, that must + If a platform wants to overlap spinning + up its hard disk drives to improve boot performance, it must create the + ibm,dasd-spin-interval OF configuration variable in the + NVRAM signature 0x70 NVRAM partition named common and set + it equal to an integer that represents the minimum time, in seconds, that must be allowed between initiating the spin-up of drives on the platform. - Firmware Implementation Note: The platform - should provide a user-friendly interface to this variable to allow for the - possibility of a user installing hard disks that do not conform to the original + Firmware Implementation Note: The platform + should provide a user-friendly interface to this variable to allow for the + possibility of a user installing hard disks that do not conform to the original setting of the variable.
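The staggered spin-up rule above amounts to a simple schedule: drive n may have its spin-up initiated no earlier than n times the interval after the first drive, without waiting for completion status in between. A minimal sketch (the function name is illustrative only):

```c
/* Earliest time, in seconds after the first drive's spin-up, at which
 * drive_index may be started, given the ibm,dasd-spin-interval value.
 * When the variable is absent, the OS instead waits for device status
 * before each drive (no overlap), which this sketch does not model. */
static unsigned earliest_spin_up(unsigned drive_index, unsigned interval_s)
{
    return drive_index * interval_s;
}
```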
- +
Free Space (0x7F) - R1-R1--1. - All unused NVRAM space must be included + All unused NVRAM space must be included in a signature = 0x7F Free Space NVRAM partition. - + - R1-R1--2. - All Free Space NVRAM partitions must have + All Free Space NVRAM partitions must have the name field set to 0x7...77.
- +
NVRAM Space Management - The only NVRAM partitions whose size an OS can modify are OS and Free - Space signature NVRAM partitions. As NVRAM partitions are created and modified - by an OS, it is likely that free space will become fragmented; free space + The only NVRAM partitions whose size an OS can modify are OS and Free + Space signature NVRAM partitions. As NVRAM partitions are created and modified + by an OS, it is likely that free space will become fragmented; free space consolidation may become necessary. - R1-R1--1. - An OS must not move or delete any NVRAM + An OS must not move or delete any NVRAM partition, except OS and Free Space signature NVRAM partitions. - + - R1-R1--2. - The NVRAM partition header checksum must be + The NVRAM partition header checksum must be calculated as shown in . diff --git a/LoPAR/ch_numa.xml b/LoPAR/ch_numa.xml index e813542..f011e26 100644 --- a/LoPAR/ch_numa.xml +++ b/LoPAR/ch_numa.xml @@ -72,7 +72,7 @@ xml:lang="en"> (processor, memory region, and IO slot) conveys information about the resources statically assigned to the client program; and contains the “ibm,associativity” - property (see ). This property allows the client + property (see ). This property allows the client program to determine the associativity between any two of its resources. The greater the associativity, the greater the expected performance when using those two resources in a given operation. @@ -96,7 +96,7 @@ xml:lang="en"> information for the resources is not provided to prevent erroneous operation. If the long-term mapping changes, the client program can be made aware of the new associativity information using the ibm,update-properties RTAS call (See - ). + ).
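The associativity comparison described above can be illustrated by treating each resource's “ibm,associativity” value simply as an array of associativity-domain numbers, ordered from most to least significant: the more leading domains two resources share, the closer they are expected to be. This simplifies the property's actual encoding and the role of “ibm,associativity-reference-points”, so treat it as a sketch only.

```c
#include <stddef.h>
#include <stdint.h>

/* Count how many leading associativity-domain entries two resources
 * share; a larger count suggests greater expected performance when
 * the two resources are used together. */
static size_t shared_domains(const uint32_t *a, const uint32_t *b, size_t n)
{
    size_t i = 0;

    while (i < n && a[i] == b[i])       /* matching leading domains */
        i++;
    return i;
}
```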
@@ -221,7 +221,7 @@ xml:lang="en"> property byte 5 bit 0 has the value of zero, the “ibm,associativity-reference-points” property defines reference points in the “ibm,associativity” - property (see ) which roughly correspond to + property (see ) which roughly correspond to traditional notions of platform topology constructs. It is important for the user to realize that these reference points are not exact and their characteristics vary among implementations. @@ -281,7 +281,7 @@ xml:lang="en"> Dynamic Reconfiguration with Cross CEC I/O Drawers Should the configuration change in such a way that the associativity between an OS image’s resources changes, the platform notifies the OS - via an event scan log. See . + via an event scan log. See . @@ -307,7 +307,7 @@ xml:lang="en"> and the “ibm,current-associativity-domains” properties in the /rtas node of the device tree (see - ). + ). @@ -319,7 +319,7 @@ xml:lang="en"> “ibm,max-associativity-domains” and the “ibm,current-associativity-domains” - properties in + properties in the /rtas node of the device tree. @@ -343,7 +343,7 @@ xml:lang="en"> preferred. The OS and platform firmware negotiate their mutual support of the PRRN option via the ibm,client-architecture-support - interface (See ). Should a partition be + interface (See ). Should a partition be migrated from a platform that did not support the PRRN option, the target platform does not notify the partition’s OS of any PRRN events and, when possible avoids changing the affinity among the partition’s resources. @@ -353,7 +353,7 @@ xml:lang="en"> events. A PRRN event is signaled via the RTAS event-scan mechanism, which returns a Hot Plug Event message “fixed - part” (See ) indicating “Platform + part” (See ) indicating “Platform Resource Reassignment”. 
In response to the Hot Plug Event message, the OS may call ibm,update-nodes to determine which resources were reassigned, and then ibm,update-properties to obtain @@ -443,7 +443,7 @@ xml:lang="en"> - Description, Values (Described in ) + Description, Values (Described in ) diff --git a/LoPAR/ch_processors_memory.xml b/LoPAR/ch_processors_memory.xml index ff7d24b..4e1b7a3 100644 --- a/LoPAR/ch_processors_memory.xml +++ b/LoPAR/ch_processors_memory.xml @@ -1,344 +1,344 @@ - Processor and Memory - The purpose of this chapter is to specify the processor and memory - requirements of this architecture. The processor architecture section addresses - differences between the processors in the PA family as well as their interface - variations and features of note. The memory architecture section addresses - coherency, minimum system memory requirements, memory controller requirements, + The purpose of this chapter is to specify the processor and memory + requirements of this architecture. The processor architecture section addresses + differences between the processors in the PA family as well as their interface + variations and features of note. The memory architecture section addresses + coherency, minimum system memory requirements, memory controller requirements, and cache requirements. - +
Processor Architecture - The Processor Architecture (PA) governs software compatibility at an - instruction set and environment level. However, each processor implementation - has unique characteristics which are described in its user’s manual. To - facilitate shrink-wrapped software, this architecture places some limitations - on the variability in processor implementations. Nonetheless, evolution of the - PA and implementations creates a need for both software and hardware developers - to stay current with its progress. The following material highlights areas - deserving special attention and provides pointers to the latest + The Processor Architecture (PA) governs software compatibility at an + instruction set and environment level. However, each processor implementation + has unique characteristics which are described in its user’s manual. To + facilitate shrink-wrapped software, this architecture places some limitations + on the variability in processor implementations. Nonetheless, evolution of the + PA and implementations creates a need for both software and hardware developers + to stay current with its progress. The following material highlights areas + deserving special attention and provides pointers to the latest information. - +
Processor Architecture Compliance The PA is defined in . - R1-R1--1. - Platforms must incorporate only processors which comply fully + Platforms must incorporate only processors which comply fully with . - + - R1-R1--2. For the Symmetric Multiprocessor option: - Multiprocessing platforms must use only processors which + Multiprocessing platforms must use only processors which implement the processor identification register. - + - R1-R1--3. - Platforms must incorporate only processors which implement - tlbie and tlbsync, and - slbie and slbia for + Platforms must incorporate only processors which implement + tlbie and tlbsync, and + slbie and slbia for 64-bit implementations. - + - R1-R1--4. - Except where specifically noted otherwise - in , platforms must support all + Except where specifically noted otherwise + in , platforms must support all functions specified by the PA. - Hardware and Software Implementation Note: The PA and this - architecture view tlbia - as an optional performance enhancement. Processors need not - implement tlbia. Software that needs to purge the TLB should provide a sequence - of instructions that is functionally equivalent to tlbia and use the content of - the OF device tree to choose the software implementation or the hardware + Hardware and Software Implementation Note: The PA and this + architecture view tlbia + as an optional performance enhancement. Processors need not + implement tlbia. Software that needs to purge the TLB should provide a sequence + of instructions that is functionally equivalent to tlbia and use the content of + the OF device tree to choose the software implementation or the hardware instruction. See for details.
- +
PA Processor Differences - A complete understanding of processor differences may be obtained - by studying and the user’s + A complete understanding of processor differences may be obtained + by studying and the user’s manuals for the various processors. - The creators of this architecture cooperate with processor - designers to maintain a list of supported differences, to be used by the OS - instead of the processor - version number (PVN), - enabling execution on future processors. OF communicates these differences via properties of the - cpu node of the OF device tree. Examples of OF device - tree properties which support these differences include “64-bit” - and “performance-monitor”. See - for a complete listing and more details. + The creators of this architecture cooperate with processor + designers to maintain a list of supported differences, to be used by the OS + instead of the processor + version number (PVN), + enabling execution on future processors. OF communicates these differences via properties of the + cpu node of the OF device tree. Examples of OF device + tree properties which support these differences include “64-bit” + and “performance-monitor”. See + for a complete listing and more details. - R1-R1--1. - The OS must use the properties of the cpu - node of the OF device tree to determine the programming model of the processor + The OS must use the properties of the cpu + node of the OF device tree to determine the programming model of the processor implementation. - + - R1-R1--2. - The OS must provide an execution path - which uses the properties of the cpu node of the OF - device. The PVN - is available to the platform aware OS for exceptional cases such as performance + The OS must provide an execution path + which uses the properties of the cpu node of the OF + device. The PVN + is available to the platform aware OS for exceptional cases such as performance optimization and errata handling. - + - R1-R1--3. 
- The OS must - support the 64-bit page table formats defined by + The OS must + support the 64-bit page table formats defined by . - + - R1-R1--4. - Processors which exhibit the - “64-bit” property of the - cpu node of the OF device tree must also implement the - “bridge architecture,” an option in . + Processors which exhibit the + “64-bit” property of the + cpu node of the OF device tree must also implement the + “bridge architecture,” an option in . - + - R1-R1--5. - Platforms must restrict their choice of processors to those whose - programming models may be described by the properties defined for the - cpu node of the OF device tree in - . + Platforms must restrict their choice of processors to those whose + programming models may be described by the properties defined for the + cpu node of the OF device tree in + . - + - R1-R1--6. - Platform firmware must initialize the - second and third pages above Base correctly for the + Platform firmware must initialize the + second and third pages above Base correctly for the processor in the platform prior to giving control to the OS. - + - R1-R1--7. - OS and application software must not + OS and application software must not alter the state of the second and third pages above Base. - + - R1-R1--8. - Platforms must implement the - “ibm,platform-hardware-notification” property (see - ) and include all PVRs that the platform may + Platforms must implement the + “ibm,platform-hardware-notification” property (see + ) and include all PVRs that the platform may contain. - +
64-bit Implementations - - Some 64-bit processor implementations will not support the full - virtual address allowed by . As a - result, this architecture adds a 64-bit virtual address subset to the PA and - the corresponding cpu node property + + Some 64-bit processor implementations will not support the full + virtual address allowed by . As a + result, this architecture adds a 64-bit virtual address subset to the PA and + the corresponding cpu node property “64-bit-virtual-address” to OF. - In order for an OS to make use of the increased addressability of + In order for an OS to make use of the increased addressability of 64-bit processor implementations: - + - The memory subsystem must support the addressing of memory + The memory subsystem must support the addressing of memory located at or beyond 4 GB, and - + - Any system memory located at or beyond 4 GB must be reported via + Any system memory located at or beyond 4 GB must be reported via the OF device tree. - + - At an abstract level, the effort to support 64-bit architecture in + At an abstract level, the effort to support 64-bit architecture in platforms is modest. The requirements follow. - R1-R1--1. - The OS must support the 64-bit virtual - address subset, but may defer support of the full 80-bit virtual address until + The OS must support the 64-bit virtual + address subset, but may defer support of the full 80-bit virtual address until such time as it is required. - + - R1-R1--2. - Firmware must report the + Firmware must report the “64-bit-virtual-address” property for processors which implement the 64-bit virtual address subset. - + - R1-R1--3. - RTAS must be capable of being - instantiated in either a 32-bit or 64-bit mode on a platform with addressable + RTAS must be capable of being + instantiated in either a 32-bit or 64-bit mode on a platform with addressable memory above 4 GB. - Software Implementation Note: A 64-bit OS need not require 64-bit - client interface services in order to boot. 
Because of the problems that might - be introduced by dynamically switching between 32-bit and 64-bit modes in OF, - the configuration variable 64-bit-mode? is provided so + Software Implementation Note: A 64-bit OS need not require 64-bit + client interface services in order to boot. Because of the problems that might + be introduced by dynamically switching between 32-bit and 64-bit modes in OF, + the configuration variable 64-bit-mode? is provided so that OF can statically configure itself to the needs of the OS.
- +
Processor Interface Variations - Individual processor interface implementations are described in + Individual processor interface implementations are described in their respective user’s manuals.
- +
PA Features Deserving Comment - Some PA features are optional, and need not be implemented in a - platform. Usage of others may be discouraged due to their potential for poor - performance. The following sections elaborate on the disposition of these + Some PA features are optional, and need not be implemented in a + platform. Usage of others may be discouraged due to their potential for poor + performance. The following sections elaborate on the disposition of these features in regard to compliance with the PA. - +
Multiple Scalar Operations - The PA supports multiple scalar operations. The multiple scalar - operations are Load and Store String and Load and Store Multiple. - Due to the long-term performance disadvantage associated with multiple scalar + The PA supports multiple scalar operations. The multiple scalar + operations are Load and Store String and Load and Store Multiple. + Due to the long-term performance disadvantage associated with multiple scalar operations, their use by software is not recommended.
- +
External Control Instructions (Optional) - The external control instructions - (eciwx and ecowx) are not supported + The external control instructions + (eciwx and ecowx) are not supported by this architecture.
- +
<emphasis role="bold"><literal>cpu</literal></emphasis> Node <emphasis role="bold"><literal>“Status”</literal></emphasis> Property - See for the values of the - “status” property of the cpu + See for the values of the + “status” property of the cpu node.
- +
Multi-Threading Processor Option Power processors may optionally support multi-threading. - R1-R1--1. - For the Multi-threading - Processor option: The platform must supply one entry in the - ibm,ppc-interrupt-server#s property associated with the + For the Multi-threading + Processor option: The platform must supply one entry in the + ibm,ppc-interrupt-server#s property associated with the processor for each thread that the processor supports. - Refer to for the definition of + Refer to for the definition of the ibm,ppc-interrupt-server#s property.
- +
Memory Architecture - The Memory Architecture of an LoPAR implementation is defined by - and - , which defines what platform elements - are accessed by each real (physical) system address, as well as the sections + The Memory Architecture of an LoPAR implementation is defined by + and + , which defines what platform elements + are accessed by each real (physical) system address, as well as the sections which follow. - The PA allows implementations to incorporate such performance - enhancing features as write-back caching, non-coherent instruction caches, - pipelining, and out-of-order and speculative execution. These features - introduce the concepts of coherency (the apparent order - of storage operations to a single memory location as observed by other - processors and DMA) and consistency (the order of storage - accesses among multiple locations). In most cases, these features are - transparent to software. However, in certain circumstances, OS software - explicitly manages the order and buffering of storage operations. By - selectively eliminating ordering options, either via storage access mode bits - or the introduction of storage barrier instructions, software can force - increasingly restrictive ordering semantics upon its storage operations. Refer + The PA allows implementations to incorporate such performance + enhancing features as write-back caching, non-coherent instruction caches, + pipelining, and out-of-order and speculative execution. These features + introduce the concepts of coherency (the apparent order + of storage operations to a single memory location as observed by other + processors and DMA) and consistency (the order of storage + accesses among multiple locations). In most cases, these features are + transparent to software. However, in certain circumstances, OS software + explicitly manages the order and buffering of storage operations. 
By + selectively eliminating ordering options, either via storage access mode bits + or the introduction of storage barrier instructions, software can force + increasingly restrictive ordering semantics upon its storage operations. Refer to for further details. - PA processor designs usually allow, under certain conditions, for - caching, buffering, combining, and reordering in the platform’s memory - and I/O subsystems. The platform’s memory subsystem, system - interconnect, and processors, which cooperate through a platform implementation - specific protocol to meet the PA specified memory coherence, consistency, and - caching rules, are said to be within the platform’s coherency + PA processor designs usually allow, under certain conditions, for + caching, buffering, combining, and reordering in the platform’s memory + and I/O subsystems. The platform’s memory subsystem, system + interconnect, and processors, which cooperate through a platform implementation + specific protocol to meet the PA specified memory coherence, consistency, and + caching rules, are said to be within the platform’s coherency domain. - shows an example system. - The shaded portion is the PA coherency domain. Buses 1 through 3 lie outside - this domain. The figure shows two - I/O subsystems, each interfacing with the host system via a Host Bridge. Notice that - the domain includes portions of the Host Bridges. This symbolizes the role of - the bridge to apply PA semantics to reference streams as they enter or leave - the coherency domain, while implementing the ordering rules of the I/O bus + shows an example system. + The shaded portion is the PA coherency domain. Buses 1 through 3 lie outside + this domain. The figure shows two + I/O subsystems, each interfacing with the host system via a Host Bridge. Notice that + the domain includes portions of the Host Bridges. 
This symbolizes the role of + the bridge to apply PA semantics to reference streams as they enter or leave + the coherency domain, while implementing the ordering rules of the I/O bus architecture. - Memory, other than System Memory, is not required to be coherent. + Memory, other than System Memory, is not required to be coherent. Such memory may include memory in IOAs. - +
Example System Diagram Showing the PA Coherency Domain @@ -351,238 +351,238 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
- Hardware Implementation Note: Components of the platform within the - coherency domain (memory controllers and in-line caches, for example) - collectively implement the PA memory model, including the ordering of - operations. Special care should be given to configurations for which multiple - paths exist between a component that accesses memory and the memory itself, if + Hardware Implementation Note: Components of the platform within the + coherency domain (memory controllers and in-line caches, for example) + collectively implement the PA memory model, including the ordering of + operations. Special care should be given to configurations for which multiple + paths exist between a component that accesses memory and the memory itself, if accesses for which ordering is required are permitted to use different paths. - +
System Memory - System Memory normally consists of dynamic read/write random access - memory which is used for the temporary storage of programs and data being - operated on by the processor(s). A platform usually provides for the expansion + System Memory normally consists of dynamic read/write random access + memory which is used for the temporary storage of programs and data being + operated on by the processor(s). A platform usually provides for the expansion of System Memory via plug-in memory modules and/or memory boards. - R1-R1--1. - Platforms must provide at least 128 MB of - System Memory. (Also see for - other requirements which apply to memory within the first 32MB of System + Platforms must provide at least 128 MB of + System Memory. (Also see for + other requirements which apply to memory within the first 32MB of System Memory.) - + - R1-R1--2. - Platforms must support the expansion of + Platforms must support the expansion of System Memory to 2 GB or more. - Hardware Implementation Note: These requirements are minimum - requirements. Each OS has its own recommended configuration which may be + Hardware Implementation Note: These requirements are minimum + requirements. Each OS has its own recommended configuration which may be greater. - Software Implementation Note: System Memory will be described by - the properties of the memory node(s) of the OF + Software Implementation Note: System Memory will be described by + the properties of the memory node(s) of the OF device tree.
- +
Memory Mapped I/O (MMIO) and DMA Operations - Storage operations which cross the coherency domain boundary are - referred to as Memory Mapped I/O (MMIO) operations if they are initiated within - the coherency domain, and DMA operations - if they are initiated outside the coherency domain - and target storage within it. Accesses with targets outside the coherency - domain are assumed to be made to IOAs. These accesses are considered performed + Storage operations which cross the coherency domain boundary are + referred to as Memory Mapped I/O (MMIO) operations if they are initiated within + the coherency domain, and DMA operations + if they are initiated outside the coherency domain + and target storage within it. Accesses with targets outside the coherency + domain are assumed to be made to IOAs. These accesses are considered performed (or complete) when they complete at the IOA’s I/O bus interface. - Bus bridges translate between bus operations on the initiator and - target buses. In some cases, there may not be a one-to-one correspondence - between initiator and target bus transactions. In these cases, the bridge - selects one or a sequence of transactions which most closely matches the - meaning of the transaction on the source bus. See also - for more details and the appropriate PCI + Bus bridges translate between bus operations on the initiator and + target buses. In some cases, there may not be a one-to-one correspondence + between initiator and target bus transactions. In these cases, the bridge + selects one or a sequence of transactions which most closely matches the + meaning of the transaction on the source bus. See also + for more details and the appropriate PCI specifications. - For MMIO Load and Store - instructions, the software needs to set up the WIMG bits - appropriately to control Load and Store caching, - Store combining, and - speculative Load execution to I/O addresses. 
This - architecture does not require platform support of caching of MMIO - Load and Store instructions. + For MMIO Load and Store + instructions, the software needs to set up the WIMG bits + appropriately to control Load and Store caching, + Store combining, and + speculative Load execution to I/O addresses. This + architecture does not require platform support of caching of MMIO + Load and Store instructions. See the PA for more information. - R1-R1--1. - For MMIO Load and Store instructions, - the hardware outside of the processor must not - introduce any reordering of the MMIO instructions for a processor or processor - thread which would not be allowed by the PA for the instruction stream executed + For MMIO Load and Store instructions, + the hardware outside of the processor must not + introduce any reordering of the MMIO instructions for a processor or processor + thread which would not be allowed by the PA for the instruction stream executed by the processor or processor thread. - Hardware Implementation Note: Requirement - may imply that hardware outside of - the processor cannot reorder MMIO instructions from the same processor or - processor thread, but this depends on the processor implementation. For - example, some processor implementations will not allow multiple - Loads to be issued when those Loads are to - Cache Inhibited and Guarded space (as are MMIO Loads ) or - allow multiple Stores to be issued when those - Stores are to Cache Inhibited and Guarded space (as are MMIO - Stores). In this example, hardware external to the - processors could re-order Load instructions with respect - to other Load instructions or re-order - Store instructions with respect to other Store - instructions since they would not be from the same processor or thread. 
- However, hardware outside of the processor must still take care not to re-order - Loads with respect to Stores or - vice versa, unless the hardware has access to the entire instruction stream to - see explicit ordering instructions, like eieio. Hardware outside of the - processor includes, but is not limited to, buses, interconnects, bridges, and - switches, and includes hardware inside and outside of the coherency + Hardware Implementation Note: Requirement + may imply that hardware outside of + the processor cannot reorder MMIO instructions from the same processor or + processor thread, but this depends on the processor implementation. For + example, some processor implementations will not allow multiple + Loads to be issued when those Loads are to + Cache Inhibited and Guarded space (as are MMIO Loads ) or + allow multiple Stores to be issued when those + Stores are to Cache Inhibited and Guarded space (as are MMIO + Stores). In this example, hardware external to the + processors could re-order Load instructions with respect + to other Load instructions or re-order + Store instructions with respect to other Store + instructions since they would not be from the same processor or thread. + However, hardware outside of the processor must still take care not to re-order + Loads with respect to Stores or + vice versa, unless the hardware has access to the entire instruction stream to + see explicit ordering instructions, like eieio. Hardware outside of the + processor includes, but is not limited to, buses, interconnects, bridges, and + switches, and includes hardware inside and outside of the coherency domain. - + - R1-R1--2. 
- (Requirement Number Reserved + (Requirement Number Reserved For Compatibility) - Apart from the ordering disciplines stated in Requirements - and, for PCI the ordering of MMIO - Load data return versus buffered DMA data, as defined by - Requirement , no other ordering - discipline is guaranteed by the system hardware for Load - and Store instructions performed by a processor to - locations outside the PA coherency domain. Any other ordering discipline, if + Apart from the ordering disciplines stated in Requirements + and, for PCI the ordering of MMIO + Load data return versus buffered DMA data, as defined by + Requirement , no other ordering + discipline is guaranteed by the system hardware for Load + and Store instructions performed by a processor to + locations outside the PA coherency domain. Any other ordering discipline, if necessary, must be enforced by software via programming means. - The elements of a system outside its coherency domain are not - expected to issue explicit PA ordering operations. System hardware must - therefore take appropriate action to impose ordering disciplines on storage - accesses entering the coherency domain. In general, a strong-ordering rule is - enforced on an IOA’s accesses to the same location, and write operations - from the same source are completed in a sequentially consistent manner. The - exception to this rule is for the special protocol ordering modifiers that may - exist in certain I/O bus protocols. An example of such a protocol ordering - modifier is the PCI Relaxed Ordering bitThe PCI - Relaxed Ordering bit is an optional - implementation, from both the IOA and platform perspective. , + The elements of a system outside its coherency domain are not + expected to issue explicit PA ordering operations. System hardware must + therefore take appropriate action to impose ordering disciplines on storage + accesses entering the coherency domain. 
In general, a strong-ordering rule is + enforced on an IOA’s accesses to the same location, and write operations + from the same source are completed in a sequentially consistent manner. The + exception to this rule is for the special protocol ordering modifiers that may + exist in certain I/O bus protocols. An example of such a protocol ordering + modifier is the PCI Relaxed Ordering bitThe PCI + Relaxed Ordering bit is an optional + implementation, from both the IOA and platform perspective. , as indicated in the requirements, below. - + - R1-R1--3. - Platforms must guarantee that accesses - entering the PA coherency domain that are from the same IOA and to the same - location are completed in a sequentially consistent manner, except transactions - from PCI-X and PCI Express masters may be reordered when the Relaxed Ordering - bit in the transaction is set, as specified in the - and + Platforms must guarantee that accesses + entering the PA coherency domain that are from the same IOA and to the same + location are completed in a sequentially consistent manner, except transactions + from PCI-X and PCI Express masters may be reordered when the Relaxed Ordering + bit in the transaction is set, as specified in the + and . - + - R1-R1--4. - Platforms must guarantee that multiple write operations entering - the PA coherency domain that are issued by the same IOA are completed in a - sequentially consistent manner, except transactions from PCI-X and PCI Express - masters may be reordered when the Relaxed Ordering bit in the transaction is - set, as specified in the - and + Platforms must guarantee that multiple write operations entering + the PA coherency domain that are issued by the same IOA are completed in a + sequentially consistent manner, except transactions from PCI-X and PCI Express + masters may be reordered when the Relaxed Ordering bit in the transaction is + set, as specified in the + and . - + - R1-R1--5. 
- Platforms must be designed to present I/O DMA writes to the coherency domain in the order required by - , except transactions from PCI-X and PCI - Express masters may be reordered when the Relaxed Ordering bit in the - transaction is set, as specified in the - and + Platforms must be designed to present I/O DMA writes to the coherency domain in the order required by + , except transactions from PCI-X and PCI + Express masters may be reordered when the Relaxed Ordering bit in the + transaction is set, as specified in the + and .
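The MMIO ordering discussion in this section notes that any ordering beyond what the hardware guarantees must be enforced by software, for example with an explicit barrier such as eieio between Caching-Inhibited stores. A portable C sketch of that idiom follows; the register array and the indexed-register device model are illustrative assumptions, and the compiler fence is only a stand-in for the actual eieio instruction:

```c
#include <stdint.h>

/* Stand-in for a pair of memory-mapped device registers.  In a real
 * platform these would be Caching-Inhibited, Guarded MMIO addresses;
 * an ordinary array lets the sketch run anywhere. */
static volatile uint32_t fake_regs[2];
enum { REG_ADDR = 0, REG_DATA = 1 };

/* Stand-in for the PA eieio instruction, which orders Caching-Inhibited
 * stores; on Power this would be: __asm__ volatile("eieio" ::: "memory"). */
#define IO_BARRIER() __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Indexed-register programming: the device latches REG_DATA into whatever
 * internal register REG_ADDR currently selects, so the two stores must
 * reach the device in program order. */
static void write_indexed_reg(uint32_t index, uint32_t value)
{
    fake_regs[REG_ADDR] = index;
    IO_BARRIER();        /* keep the address store ahead of the data store */
    fake_regs[REG_DATA] = value;
}
```

On real hardware the mapping's WIMG attributes (Caching Inhibited, Guarded) do part of this work; the barrier covers the remaining store-versus-store ordering that the platform is not required to preserve.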
- +
Storage Ordering and I/O Interrupts - The conclusion of I/O operations is often communicated to - processors via interrupts. For example, at the end of a DMA operation that - deposits data in the System Memory, the IOA performing the operation might send - an interrupt to the processor. Arrival of the interrupt, however, may be no - guarantee that all the data has actually been deposited; some might be on its - way. The receiving program must not attempt to read the data from the memory - before ensuring that all the data has indeed been deposited. There may be - system and I/O subsystem specific methods for guaranteeing this. See The conclusion of I/O operations is often communicated to + processors via interrupts. For example, at the end of a DMA operation that + deposits data in the System Memory, the IOA performing the operation might send + an interrupt to the processor. Arrival of the interrupt, however, may be no + guarantee that all the data has actually been deposited; some might be on its + way. The receiving program must not attempt to read the data from the memory + before ensuring that all the data has indeed been deposited. There may be + system and I/O subsystem specific methods for guaranteeing this. See .
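The interrupt-versus-DMA-data race described above maps onto the familiar publish/consume (release/acquire) pattern. In the hedged C11 sketch below, an atomic flag stands in for whatever platform- or I/O-subsystem-specific completion check is actually provided; the names and the flag mechanism are illustrative, not LoPAR interfaces:

```c
#include <stdatomic.h>

static int dma_buffer[4];        /* stand-in for DMA'd System Memory */
static atomic_int dma_done;      /* stand-in for the completion check */

/* "IOA" side: deposit the data, then publish completion with release
 * ordering so the data is globally visible before the flag is. */
static void ioa_deposit(void)
{
    for (int i = 0; i < 4; i++)
        dma_buffer[i] = i + 1;
    atomic_store_explicit(&dma_done, 1, memory_order_release);
}

/* Processor side: do not touch the buffer until the completion check
 * succeeds; the acquire load orders the flag read before the data reads. */
static int consume_after_dma(void)
{
    while (!atomic_load_explicit(&dma_done, memory_order_acquire))
        ;                        /* data not guaranteed deposited yet */
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += dma_buffer[i];
    return sum;
}
```

In a real system the two sides run concurrently on an IOA and a processor; calling `ioa_deposit()` and then `consume_after_dma()` in one thread merely demonstrates the pairing.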
- +
Atomic Update Model - An update of a memory location by a processor, involving a - Load followed by a Store, can be - considered “atomic” if there are no intervening - Stores to that location from another processor or mechanism. The PA - provides primitives in the form of Load - And Reserve and Store - Conditional instructions which can be used to determine if the update was - indeed atomic. These primitives can be used to emulate operations such as - “atomic read-modify-write” and “atomic - fetch-and-add.” Operation of the atomic update primitives is based on - the concept of “Reservation,”See - Book I and II of . + An update of a memory location by a processor, involving a + Load followed by a Store, can be + considered “atomic” if there are no intervening + Stores to that location from another processor or mechanism. The PA + provides primitives in the form of Load + And Reserve and Store + Conditional instructions which can be used to determine if the update was + indeed atomic. These primitives can be used to emulate operations such as + “atomic read-modify-write” and “atomic + fetch-and-add.” Operation of the atomic update primitives is based on + the concept of “Reservation,”See + Book I and II of . which is supported in an LoPAR system via the coherence mechanism. - R1-R1--1. - Load And Reserve and - Store Conditional instructions + Load And Reserve and + Store Conditional instructions must not be assumed to be supported for Write-Through storage. - Software Implementation Note: To emulate an - atomic read-modify-write operation, the instruction pair must access the same - storage location, and the location must have the Memory Coherence Required + Software Implementation Note: To emulate an + atomic read-modify-write operation, the instruction pair must access the same + storage location, and the location must have the Memory Coherence Required attribute. 
- Hardware Implementation Note: The reservation - protocol is defined in Book II of the + Hardware Implementation Note: The reservation + protocol is defined in Book II of the for atomic updates to locations in the same coherency domain. - + - R1-R1--2. - The Load And - Reserve and Store Conditional instructions + The Load And + Reserve and Store Conditional instructions must not be assumed to be supported for Caching-Inhibited storage. @@ -591,116 +591,116 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
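The atomic fetch-and-add emulation mentioned above is typically written with compiler atomic builtins, which GCC lowers on Power to a Load And Reserve / Store Conditional (larx/stcx.) reservation loop. A minimal sketch, with an illustrative function name; per the requirements above, the target location must have the Memory Coherence Required attribute and be neither Write-Through nor Caching-Inhibited:

```c
#include <stdint.h>

/* Emulated atomic fetch-and-add: read the value, attempt a conditional
 * update, and retry if another processor or mechanism intervened.  On
 * Power, GCC lowers __atomic_compare_exchange_n to a Load And Reserve /
 * Store Conditional reservation loop. */
static uint32_t atomic_fetch_add_u32(uint32_t *addr, uint32_t delta)
{
    uint32_t old = __atomic_load_n(addr, __ATOMIC_RELAXED);
    /* A failed exchange reloads 'old' with the current value; retry. */
    while (!__atomic_compare_exchange_n(addr, &old, old + delta,
                                        1 /* weak */,
                                        __ATOMIC_ACQ_REL, __ATOMIC_RELAXED))
        ;
    return old;    /* value observed before the successful update */
}
```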
Memory Controllers - A Memory Controller responds to the real (physical) addresses - produced by a processor or a host bridge for accesses to System Memory. It is - responsible for handling the translation from these addresses to the physical + A Memory Controller responds to the real (physical) addresses + produced by a processor or a host bridge for accesses to System Memory. It is + responsible for handling the translation from these addresses to the physical memory modules within its configured domain of control. - R1-R1--1. - Memory controller(s) must support the + Memory controller(s) must support the accessing of System Memory as defined in . - + - R1-R1--2. - Memory controller(s) must be fully initialized and + Memory controller(s) must be fully initialized and set to full power mode prior to the transfer of control to the OS. - + - R1-R1--3. - All allocations of System Memory space - among memory controllers must have been done prior to the transfer of control + All allocations of System Memory space + among memory controllers must have been done prior to the transfer of control to the OS. - Software Implementation Note: Memory controller(s) are described by - properties of the memory-controller node(s) of the OF device + Software Implementation Note: Memory controller(s) are described by + properties of the memory-controller node(s) of the OF device tree.
- +
Cache Memory - - All of the PA processors include some amount of on-chip or - internal cache memory. - This architecture allows for cache memory which is external to the processor - chip, and this external + + All of the PA processors include some amount of on-chip or + internal cache memory. + This architecture allows for cache memory which is external to the processor + chip, and this external cache memory forms an extension to internal cache memory. - R1-R1--1. - If a platform implementation elects not - to cache portions of the address map in all external levels of the cache - hierarchy, the result of not doing so must be transparent to the operation of + If a platform implementation elects not + to cache portions of the address map in all external levels of the cache + hierarchy, the result of not doing so must be transparent to the operation of the software, other than as a difference in performance. - + - R1-R1--2. - All caches must be fully - initialized and enabled, and they must have + All caches must be fully + initialized and enabled, and they must have accurate state bits prior to the transfer of control to the OS. - + - R1-R1--3. - If an in-line external - cache is used, it must support one reservation as - defined for the Load And Reserve and + If an in-line external + cache is used, it must support one reservation as + defined for the Load And Reserve and Store Conditional instructions. - + - R1-R1--4. 
- For the Symmetric - Multiprocessor option: Platforms must implement their cache - hierarchy such that all caches at a given level in the cache hierarchy can be - flushed and disabled before any caches at the next level which may cache the - same data are flushed and disabled (that is, L1 first, then L2, and so + For the Symmetric + Multiprocessor option: Platforms must implement their cache + hierarchy such that all caches at a given level in the cache hierarchy can be + flushed and disabled before any caches at the next level which may cache the + same data are flushed and disabled (that is, L1 first, then L2, and so on). - + - R1-R1--5. - For the Symmetric - Multiprocessor option: If a cache implements snarfing, - then the cache must be capable of disabling the snarfing during flushing in order to implement + For the Symmetric + Multiprocessor option: If a cache implements snarfing, + then the cache must be capable of disabling the snarfing during flushing in order to implement the RTAS stop-self function in an atomic way. - + - R1-R1--6. - Software must not depend on being able to + Software must not depend on being able to change a cache from copy-back to write-through. @@ -708,94 +708,94 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en"> Software Implementation Notes: - + - Each first level cache will be defined via properties of the - cpu node(s) of the OF device tree. Each higher level cache will be - defined via properties of the l2-cache node(s) - of the OF device tree. See for more details. + Each first level cache will be defined via properties of the + cpu node(s) of the OF device tree. Each higher level cache will be + defined via properties of the l2-cache node(s) + of the OF device tree. See for more details. 
- To ensure proper operation, cache(s) at the same level in the - cache hierarchy should be flushed and disabled before cache(s) at the next + To ensure proper operation, cache(s) at the same level in the + cache hierarchy should be flushed and disabled before cache(s) at the next level (that is, L1 first, then L2, and so on). - +
- +
Memory Status Information - New OF properties are defined to support the identification of, and to contain the status information on, good and bad system memory. - R1-R1--1. - Firmware must implement all of the - properties for memory modules, as specified by , and any other properties defined by this document which apply to memory modules.
- +
Reserved Memory - Sections of System Memory may be reserved for usage by OS - extensions, with the restrictions detailed below. Memory nodes marked with the - special value of the “status” property of - “reserved” is not to be used or altered by the base OS. Several - different ranges of memory may be marked as “reserved”. If DLPAR - of memory is to be supported and growth is expected, then, an address range - must be unused between these areas in order to allow growth of these areas. - Each area has its own DRC Type (starting at 0, MEM, MEM-1, MEM-2, and so on). - Each area has a current and a maximum size, with the current size being the sum - of the sizes of the populated DRCs for the area and the max being the sum total - of the sizes of all the DRCs for that area. The logical address space allocated - is the size of the sum of the all the areas' maximum sizes. Starting with - logical real address 0, the address areas are allocated in the following order: - OS, DLPAR growth space for OS (if DLPAR is supported), reserved area (if any) - followed by the DLPAR growth space for that reserved area (if DLPAR is - supported), followed by the next reserved space (if any), and so on. The - current memory allocation for each area is allocated contiguously from the - beginning of the area. On a boot or reboot, including hypervisor reboot, if - there is any data to be preserved (that is, the - “ibm,preserved-storage” - property exists in the RTAS - node), then the starting logical real address of each LMB is maintained through - the reboot. The memory in each region can be independently increased or - decreased using DLPAR memory functions, when DLPAR is supported. Changes to the - current memory allocation for an area results in the addition or removal of + Sections of System Memory may be reserved for usage by OS + extensions, with the restrictions detailed below. 
Memory nodes marked with the + special value of the “status” property of + “reserved” are not to be used or altered by the base OS. Several + different ranges of memory may be marked as “reserved”. If DLPAR + of memory is to be supported and growth is expected, then an address range + must be left unused between these areas in order to allow growth of these areas. + Each area has its own DRC Type (starting at 0, MEM, MEM-1, MEM-2, and so on). + Each area has a current and a maximum size, with the current size being the sum + of the sizes of the populated DRCs for the area and the maximum being the sum total + of the sizes of all the DRCs for that area. The logical address space allocated + is the sum of all the areas' maximum sizes. Starting with + logical real address 0, the address areas are allocated in the following order: + OS, DLPAR growth space for OS (if DLPAR is supported), reserved area (if any) + followed by the DLPAR growth space for that reserved area (if DLPAR is + supported), followed by the next reserved space (if any), and so on. The + current memory allocation for each area is allocated contiguously from the + beginning of the area. On a boot or reboot, including hypervisor reboot, if + there is any data to be preserved (that is, the + “ibm,preserved-storage” + property exists in the RTAS + node), then the starting logical real address of each LMB is maintained through + the reboot. The memory in each region can be independently increased or + decreased using DLPAR memory functions, when DLPAR is supported. Changes to the + current memory allocation for an area result in the addition or removal of + memory at the end of the existing memory allocation.
- Implementation Note: If the shared memory - regions are not accessed by the programs, and are just used for DMA most of the - time, then the same HPFT hit rate could be achieved with a far lower ratio of + Implementation Note: If the shared memory + regions are not accessed by the programs, and are just used for DMA most of the + time, then the same HPFT hit rate could be achieved with a far lower ratio of HPFT entries to logical storage space. - R1-R1--1. - For the Reserved Memory option: - Memory nodes marked with the special value of the “status” + For the Reserved Memory option: + Memory nodes marked with the special value of the “status” property of “reserved” must not be used or altered by the base OS. - Implementation Note: How areas get chosen to + Implementation Note: How areas get chosen to be marked as reserved is beyond the scope of this architecture. - + - R1-R1--2. - For the Reserved Memory option - with the LRDR option: Each unique memory area that is to be changed - independently via DLPAR must have different DRC Types (for example, MEM, MEM-1, + For the Reserved Memory option + with the LRDR option: Each unique memory area that is to be changed + independently via DLPAR must have different DRC Types (for example, MEM, MEM-1, and so on).
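The logical-address layout rules above can be illustrated with a small calculation: areas are placed from logical real address 0 in order, and each consumes its maximum size so that its current allocation can grow in place via DLPAR without moving any later area. The structure and function names below are illustrative sketches, not LoPAR interfaces:

```c
#include <stdint.h>
#include <stddef.h>

/* One logical address area (the OS area or a reserved area).  Sizes are
 * sums over the area's DRCs, as described in the text. */
struct mem_area {
    uint64_t current_size;   /* sum of the populated DRC sizes */
    uint64_t max_size;       /* sum of all DRC sizes for the area */
};

/* Lay areas out from logical real address 0 in order.  Each area consumes
 * its *maximum* size, so its current allocation (always contiguous from
 * the start of the area) can grow in place without moving later areas.
 * Returns the total logical address space allocated. */
static uint64_t layout_areas(const struct mem_area *areas, size_t n,
                             uint64_t *start)
{
    uint64_t addr = 0;
    for (size_t i = 0; i < n; i++) {
        start[i] = addr;
        addr += areas[i].max_size;
    }
    return addr;             /* sum of all the areas' maximum sizes */
}
```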
Persistent Memory - Selected regions of storage (LMBs) may be optionally preserved - across client program boot cycles. See and - "Managing Storage Preservations" in specification. + Selected regions of storage (LMBs) may be optionally preserved + across client program boot cycles. See + and .
diff --git a/LoPAR/ch_product_topology.xml b/LoPAR/ch_product_topology.xml index 8c8aa71..eade577 100644 --- a/LoPAR/ch_product_topology.xml +++ b/LoPAR/ch_product_topology.xml @@ -10,7 +10,7 @@ xml:lang="en"> VPD and Location Code OF Properties A set of OF properties is defined to facilitate asset protection and RAS capabilities in LoPAR systems. The following properties are defined in -): +):
@@ -137,7 +137,7 @@ xml:lang="en">
System Identification - provides properties in the + provides properties in the “OF Root Node” section called “system-id” and“model”. @@ -4707,7 +4707,7 @@ xml:lang="en"> Up to 56 - Processor CoD Capacity Card Info per + Processor CoD Capacity Card Info per @@ -4721,7 +4721,7 @@ xml:lang="en"> Up to 56 - Memory CoD Capacity Card Info per + Memory CoD Capacity Card Info per diff --git a/LoPAR/ch_service_indicators.xml b/LoPAR/ch_service_indicators.xml index 13201f5..9a2ce40 100644 --- a/LoPAR/ch_service_indicators.xml +++ b/LoPAR/ch_service_indicators.xml @@ -431,7 +431,7 @@ Dynamic Reconfiguration (LoPAR indicator type 9002) to indicate the status of DR operations on a Field Replacable Unit (FRU). More information on DR indicators can be found in - . This indicator is amber + . This indicator is amber The term “amber” will be used in this chapter to mean any wavelength between yellow and amber. diff --git a/LoPAR/ch_smp.xml b/LoPAR/ch_smp.xml index 50bf499..c515f45 100644 --- a/LoPAR/ch_smp.xml +++ b/LoPAR/ch_smp.xml @@ -1,166 +1,166 @@ - The Symmetric Multiprocessor Option - This architecture supports the implementation of symmetric - multiprocessor (SMP) systems as an optional feature. This Chapter provides - information concerning the design and programming of such systems. For SMP OF - binding information, see . - SMP systems differ from uniprocessors in a number of ways. These - differences are not all covered in this chapter. Other chapters that cover + This architecture supports the implementation of symmetric + multiprocessor (SMP) systems as an optional feature. This Chapter provides + information concerning the design and programming of such systems. For SMP OF + binding information, see . + SMP systems differ from uniprocessors in a number of ways. These + differences are not all covered in this chapter. 
Other chapters that cover SMP-related topics include: - + - Non-processor-related initialization and other requirements: + Non-processor-related initialization and other requirements: - + Interrupts: - + - Error handling: + Error handling: - + - Many other general characteristics of SMPs—such as - interprocessor communication, load/store ordering, and cache - coherence—are defined in . - Requirements and recommendations for system organization and time base synchronization are + Many other general characteristics of SMPs—such as + interprocessor communication, load/store ordering, and cache + coherence—are defined in . + Requirements and recommendations for system organization and time base synchronization are discussed here, along with SMP-specific aspects of the boot process. - SMP platforms require SMP-specific OS support. An OS supporting only - uniprocessor platforms - may not be usable on an SMP, even when an SMP platform has only a single - processor installed; conversely, an SMP-supporting OS may not be usable on a - uniprocessor. It is, however, a requirement that uniprocessor OSs be able to - run on one-processor SMPs, and that SMP-enabled OSs also run on uniprocessors. + SMP platforms require SMP-specific OS support. An OS supporting only + uniprocessor platforms + may not be usable on an SMP, even when an SMP platform has only a single + processor installed; conversely, an SMP-supporting OS may not be usable on a + uniprocessor. It is, however, a requirement that uniprocessor OSs be able to + run on one-processor SMPs, and that SMP-enabled OSs also run on uniprocessors. See the next section. - +
SMP System Organization - This chapter only addresses SMP multiprocessor platforms. This is a - computer system in which multiple processors equally share functional and - timing access to and control over all other system components, including memory - and I/O, as defined in the requirements below. Other - multiprocessor organizations (“asymmetric - multiprocessors,” “ attached - processors,” etc.) are not included in this architecture. These might, - for example, include systems in which only one processor can perform I/O - operations; or in which processors have private memory that is not accessible + This chapter only addresses SMP multiprocessor platforms. This is a + computer system in which multiple processors equally share functional and + timing access to and control over all other system components, including memory + and I/O, as defined in the requirements below. Other + multiprocessor organizations (“asymmetric + multiprocessors,” “ attached + processors,” etc.) are not included in this architecture. These might, + for example, include systems in which only one processor can perform I/O + operations; or in which processors have private memory that is not accessible by other processors. - Requirements through - , further require that all processors - be of (nearly) equal speed, type, cache characteristics, etc. Requirements for - optional non-uniform multiprocessor platforms are found in + Requirements through + , further require that all processors + be of (nearly) equal speed, type, cache characteristics, etc. Requirements for + optional non-uniform multiprocessor platforms are found in . - R1-R1--1. - OSs that do not explicitly support the SMP option must support + OSs that do not explicitly support the SMP option must support SMP-enabled platforms, actively using only one processor. - + - R1-R1--2. - For the Symmetric Multiprocessor + For the Symmetric Multiprocessor option: SMP OSs must support uniprocessor platforms. - + - R1-R1--3. 
- For the Symmetric Multiprocessor - option: The extensions defined in - , and the SMP support section of the RTAS - specifications (see ) must be implemented. + For the Symmetric Multiprocessor + option: The extensions defined in + , and the SMP support section of the RTAS + specifications (see ) must be implemented. - + - R1-R1--4. - For the Symmetric Multiprocessor or Power Management - option: All processors in the configuration must have equal - functional access and “quasi-equal” - timing access to all of system memory, - including other processors’ caches, via cache coherence. - “Quasi-equal” means that the time required for processors to - access memory is sufficiently close to being equal that all software can ignore - the difference without a noticeable negative impact on system performance; and + For the Symmetric Multiprocessor or Power Management + option: All processors in the configuration must have equal + functional access and “quasi-equal” + timing access to all of system memory, + including other processors’ caches, via cache coherence. + “Quasi-equal” means that the time required for processors to + access memory is sufficiently close to being equal that all software can ignore + the difference without a noticeable negative impact on system performance; and no software is expected to profitably exploit the difference in timing. - + - R1-R1--5. - For the Symmetric Multiprocessor option: - All processors in the configuration must have equal functional and - “quasi-equal” - timing access to all I/O devices and IOAs. - “Quasi-equal” is defined as in Requirement , + For the Symmetric Multiprocessor option: + All processors in the configuration must have equal functional and + “quasi-equal” + timing access to all I/O devices and IOAs. + “Quasi-equal” is defined as in Requirement , above, with I/O access replacing memory access for this case. - + - R1-R1--6. 
- For the Symmetric Multiprocessor option: - SMP OSs must at least support SMPs with the same PVR contents and speed. The + For the Symmetric Multiprocessor option: + SMP OSs must at least support SMPs with the same PVR contents and speed. The PVR contents includes both the PVN and the revision number. - + - R1-R1--7. - For the Symmetric Multiprocessor option: + For the Symmetric Multiprocessor option: All caches at the same hierarchical level must have the same OF properties. - + - R1-R1--8. - Hardware for SMPs must provide a means for synchronizing all the - time bases of all the processors in the platform, for use by platform firmware. - See . This is for purposes of clock synchronization + Hardware for SMPs must provide a means for synchronizing all the + time bases of all the processors in the platform, for use by platform firmware. + See . This is for purposes of clock synchronization at initialization and at times when the processor loses time base state. - + - R1-R1--9. - The platform must initialize and maintain the synchronization of - the time bases and timers of all platform processors such that; for any code - sequence “C”, run between any two platform processors - “A” and “B”, where the reading of the time base or - timer in processor “A” can be architecturally guaranteed to have - happened later in time than the reading of the time base or timer in processor - “B”, the value of the time base read by processor - “A” is greater than or equal to the value of the time base read + The platform must initialize and maintain the synchronization of + the time bases and timers of all platform processors such that; for any code + sequence “C”, run between any two platform processors + “A” and “B”, where the reading of the time base or + timer in processor “A” can be architecturally guaranteed to have + happened later in time than the reading of the time base or timer in processor + “B”, the value of the time base read by processor + “A” is greater than or equal 
to the value of the time base read by processor “B”. @@ -168,204 +168,204 @@ xml:lang="en"> Software Implementation Notes: - + - Requirement has - implications on the design of uniprocessor OSs, particularly regarding the - handling of interrupts. See the sections that follow, particularly + Requirement has + implications on the design of uniprocessor OSs, particularly regarding the + handling of interrupts. See the sections that follow, particularly . - While Requirement does - not require this, OSs are encouraged to support processors of the same type but - different PVR contents as long as their programming models are + While Requirement does + not require this, OSs are encouraged to support processors of the same type but + different PVR contents as long as their programming models are compatible. - Because of performance penalties associated with inter-processor - synchronization, the weakest synchronization primitive that produces correct - operation should be used. For example, eieio can often be - used as part of a sequence that unlocks a data structure, rather than the + Because of performance penalties associated with inter-processor + synchronization, the weakest synchronization primitive that produces correct + operation should be used. For example, eieio can often be + used as part of a sequence that unlocks a data structure, rather than the higher-overhead but more general sync instruction. - + Hardware Implementation Notes: - + - Particularly when used as servers, SMP systems make heavy demands - on the I/O and memory subsystems. Therefore, it is strongly recommended that - the I/O and memory subsystem of an SMP platform should either be expandable as - additional processors are added, or else designed to handle the load of the + Particularly when used as servers, SMP systems make heavy demands + on the I/O and memory subsystems. 
Therefore, it is strongly recommended that + the I/O and memory subsystem of an SMP platform should either be expandable as + additional processors are added, or else designed to handle the load of the maximum system configuration. - Defining an exact numeric threshold for - “quasi-equal” is not feasible because it depends on the - application, compiler, subsystem, and OS software that the system is to run. It - is highly likely that a wider range of timing differences can be absorbed in - I/O access time than in memory access time. An illustrative example that is - deliberately far from an upper bound: A 2% timing difference is certainly - quasi-equal by this definition. While significantly larger timing differences - are undoubtedly also quasi-equal, more conclusive statements must be the + Defining an exact numeric threshold for + “quasi-equal” is not feasible because it depends on the + application, compiler, subsystem, and OS software that the system is to run. It + is highly likely that a wider range of timing differences can be absorbed in + I/O access time than in memory access time. An illustrative example that is + deliberately far from an upper bound: A 2% timing difference is certainly + quasi-equal by this definition. While significantly larger timing differences + are undoubtedly also quasi-equal, more conclusive statements must be the province of the OS and other software. - +
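The time base synchronization requirement above (if a read on processor “A” is architecturally guaranteed to happen later than a read on processor “B”, then A's value must be greater than or equal to B's) can be demonstrated in miniature with an ordinary monotonic clock standing in for a synchronized time base. The thread and variable names are illustrative; a `threading.Event` plays the role of the "architecturally guaranteed" ordering:

```python
# Two "processors" read a shared monotonic clock; an Event orders the
# reads so A's read provably happens after B's. The requirement then
# says A's value must be >= B's value.
import threading
import time

b_value = None
b_done = threading.Event()

def reader_b():
    global b_value
    b_value = time.monotonic_ns()   # processor B reads the time base ...
    b_done.set()                    # ... then signals processor A

def reader_a(results):
    b_done.wait()                   # A's read is ordered after B's read
    results.append(time.monotonic_ns())

results = []
ta = threading.Thread(target=reader_a, args=(results,))
tb = threading.Thread(target=reader_b)
ta.start(); tb.start()
ta.join(); tb.join()

a_value = results[0]
assert a_value >= b_value           # the ordering guarantee holds
```

On real hardware the guarantee must hold for the per-processor time base registers themselves, which is why the platform, not the OS, is required to establish and maintain the synchronization.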
- +
An SMP Boot Process - Booting - an SMP entails considerations not present when booting a - uniprocessor. This section indicates those considerations by describing a way - in which an SMP system can be booted. It does not pretend to describe - “the” way to boot an SMP, since there are a wide variety of ways - to do this, depending on engineering choices that can differ from platform to - platform. To illustrate the possibilities, several variations on the SMP + Booting + an SMP entails considerations not present when booting a + uniprocessor. This section indicates those considerations by describing a way + in which an SMP system can be booted. It does not pretend to describe + “the” way to boot an SMP, since there are a wide variety of ways + to do this, depending on engineering choices that can differ from platform to + platform. To illustrate the possibilities, several variations on the SMP booting theme will be described after the initial description. - This section concentrates solely on SMP-related issues, and ignores a - number of other initialization issues such as hibernation and suspension. See - for a discussion of those other + This section concentrates solely on SMP-related issues, and ignores a + number of other initialization issues such as hibernation and suspension. See + for a discussion of those other issues. - +
SMP-Safe Boot - The basic booting process described here is called - “SMP-Safe” because it tolerates the presence of multiple + The basic booting process described here is called + “SMP-Safe” because it tolerates the presence of multiple processors, but does not exploit them. This process proceeds as follows: - + - At power on, one or more finite state machines (FSMs) built into - the system hardware initialize each processor independently. FSMs also perform - basic initialization of other system elements, such as the memory and interrupt + At power on, one or more finite state machines (FSMs) built into + the system hardware initialize each processor independently. FSMs also perform + basic initialization of other system elements, such as the memory and interrupt controllers. - After the FSM initialization of each processor concludes, it - begins execution at a location in ROM that the FSM has specified. This is the - start of execution of the system firmware that eventually provides the OF + After the FSM initialization of each processor concludes, it + begins execution at a location in ROM that the FSM has specified. This is the + start of execution of the system firmware that eventually provides the OF interfaces to the OS. - One of the first things that firmware does is establish one of the processors as the - master: The - master is a single processor which - continues with the rest of the booting process; all the others are placed in a - stopped state. A processor in this - stopped state is out of the picture; it does nothing that affects - the state of the system and will continue to be in that state until awakened by - some outside force, such as an inter-processor interrupt (IPI).Another - characteristic of the stopped state, - defined in , is that the - processor remembers nothing of its prior life when placed in a - stopped state; this distinguishes it from the - idle state. 
That isn’t strictly necessary for this booting - process; idle could have been used. However, since the - non-master processor must be in the - stopped state when the OS is started, + One of the first things that firmware does is establish one of the processors as the + master: The + master is a single processor which + continues with the rest of the booting process; all the others are placed in a + stopped state. A processor in this + stopped state is out of the picture; it does nothing that affects + the state of the system and will continue to be in that state until awakened by + some outside force, such as an inter-processor interrupt (IPI).Another + characteristic of the stopped state, + defined in , is that the + processor remembers nothing of its prior life when placed in a + stopped state; this distinguishes it from the + idle state. That isn’t strictly necessary for this booting + process; idle could have been used. However, since the + non-master processor must be in the + stopped state when the OS is started, stopped might as well be used. - One way to choose the master is to include a special register - at a fixed address in the memory controller. That special register has the + One way to choose the master is to include a special register + at a fixed address in the memory controller. That special register has the following properties: - + - The FSM initializing the memory controller sets this + The FSM initializing the memory controller sets this register’s contents to 0 (zero). - + - The first time that register is read, it returns the value 0 and - then sets its own contents to non-zero. This is performed as an atomic - operation; if two or more processors attempt to read the register at the same - time, exactly one of them will get the 0 and the rest will get a non-zero + The first time that register is read, it returns the value 0 and + then sets its own contents to non-zero. 
This is performed as an atomic + operation; if two or more processors attempt to read the register at the same + time, exactly one of them will get the 0 and the rest will get a non-zero value. - + - After the first attempt, all attempts to read that + After the first attempt, all attempts to read that register’s contents return a non-zero value. - + - The master is then picked by having all the - processors read from that special register. Exactly one of them will receive a + The master is then picked by having all the + processors read from that special register. Exactly one of them will receive a 0 and thereby become the master. - Note that the operation of choosing the - master cannot be done using the PA memory locking instructions, - since at this point in the boot process the memory is not initialized. The - advantage to using a register in the memory controller is that system bus - serialization can be used to automatically provide the required + Note that the operation of choosing the + master cannot be done using the PA memory locking instructions, + since at this point in the boot process the memory is not initialized. The + advantage to using a register in the memory controller is that system bus + serialization can be used to automatically provide the required atomicity. - The master chosen in step - then proceeds to do the remainder of the - system initialization. This includes, for example, the remainder of Power-On - Self Test, initialization of OF, discovery of devices and construction of the - OF device tree, loading the OS, starting it, and so on. Since one processor is - performing all these functions, and the rest are in a state where they are not - affecting anything, code that is at least very close to the uniprocessor code - can be used for all of this (but see + The master chosen in step + then proceeds to do the remainder of the + system initialization. 
This includes, for example, the remainder of Power-On + Self Test, initialization of OF, discovery of devices and construction of the + OF device tree, loading the OS, starting it, and so on. Since one processor is + performing all these functions, and the rest are in a state where they are not + affecting anything, code that is at least very close to the uniprocessor code + can be used for all of this (but see below). - The OS begins execution on the single - master processor. It uses the OF Client Interface - Services to start each of the other processors, taking them out of the + The OS begins execution on the single + master processor. It uses the OF Client Interface + Services to start each of the other processors, taking them out of the stopped state and setting them loose on the SMP OS code. - This completes the example SMP boot process. Variations are - discussed beginning at . Before - discussing those variations, an element of the system initialization not + This completes the example SMP boot process. Variations are + discussed beginning at . Before + discussing those variations, an element of the system initialization not discussed above will be covered. - +
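The pick-a-master step described above can be modeled as a short sketch. The `MasterRegister` class, the lock, and the eight-processor count are illustrative stand-ins: the lock plays the role of the system bus serialization that the memory controller's special register relies on for atomicity, since real firmware cannot use memory locking instructions at this point in the boot:

```python
# Sketch of master election via a read-once register: the register
# returns 0 to exactly the first reader and non-zero to every later
# reader, atomically. A lock emulates the memory controller's bus
# serialization described in the text.
import threading

class MasterRegister:
    def __init__(self):
        self._value = 0                # FSM initializes the register to 0
        self._lock = threading.Lock()

    def read(self):
        # Atomic: return the current value, then set it non-zero.
        with self._lock:
            value = self._value
            self._value = 1
            return value

reg = MasterRegister()
masters = []

def processor(cpu_id):
    if reg.read() == 0:                # exactly one processor sees 0 ...
        masters.append(cpu_id)         # ... and becomes the master
    # every other processor would enter the stopped state here

threads = [threading.Thread(target=processor, args=(i,))
           for i in range(8)]          # hypothetical 8-way SMP
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(masters) == 1               # exactly one master was chosen
```

After this point every subsequent read of the register returns non-zero, so a late-arriving processor can never mistakenly become a second master.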
- +
Finding the Processor Configuration - Unlike uniprocessor initialization, SMP initialization must also - discover the number and identities of the processors installed in the system. - “Identity” means the interrupt address of each processor as seen - by the interrupt controller; without that information, a processor cannot reset - interrupts directed at it. This identity is determined by board wiring: The - processor attached to the “processor 0” wire from the interrupt - controller has identity 0. For information about how this identity is used, see - . - The method used by a platform to identify its processors is - dependent upon the platform hardware design and may be based upon service - processor information, identification registers, inter-processor interrupts, or + Unlike uniprocessor initialization, SMP initialization must also + discover the number and identities of the processors installed in the system. + “Identity” means the interrupt address of each processor as seen + by the interrupt controller; without that information, a processor cannot reset + interrupts directed at it. This identity is determined by board wiring: The + processor attached to the “processor 0” wire from the interrupt + controller has identity 0. For information about how this identity is used, see + . + The method used by a platform to identify its processors is + dependent upon the platform hardware design and may be based upon service + processor information, identification registers, inter-processor interrupts, or other novel techniques.
SMP-Efficient Boot - The booting - process as described so far tolerates the existence of multiple processors but - does not attempt to exploit them. It is possible that the booting process can - be sped up by actively using multiple processors simultaneously. In that case, - the pick-a-master technique must still be - used to perform sufficient initialization that other inter-processor - coordination facilities—in-memory locks and IPIs—can be used. - Once that is accomplished, normal parallel SMP programming techniques can be + The booting + process as described so far tolerates the existence of multiple processors but + does not attempt to exploit them. It is possible that the booting process can + be sped up by actively using multiple processors simultaneously. In that case, + the pick-a-master technique must still be + used to perform sufficient initialization that other inter-processor + coordination facilities—in-memory locks and IPIs—can be used. + Once that is accomplished, normal parallel SMP programming techniques can be used within the initialization process itself.
Use of a Service Processor - A system might contain a service processor that is distinct from the processors - that form the SMP. If that service processor has suitably intimate access to - and control over each of the SMP processors, it can perform the operations of - choosing a master and discovering the SMP processor + A system might contain a service processor that is distinct from the processors + that form the SMP. If that service processor has suitably intimate access to + and control over each of the SMP processors, it can perform the operations of + choosing a master and discovering the SMP processor configuration.  
diff --git a/LoPAR/ch_system_reqs.xml b/LoPAR/ch_system_reqs.xml index 8e6eb3d..a2a18e6 100644 --- a/LoPAR/ch_system_reqs.xml +++ b/LoPAR/ch_system_reqs.xml @@ -331,7 +331,7 @@ xml:lang="en">
Locate an OS Boot Image The OS boot image is located as described in - . A device and filename can be specified directly + . A device and filename can be specified directly from the command interpreter (the boot command) or OF will locate the image through an automatic boot process controlled by configuration variables. Once a boot image is located, the device path is set @@ -345,7 +345,7 @@ xml:lang="en"> boot-device entries that the platform processes. If multi-boot (multiple bootable OSs residing on the same platform) is supported, a configuration variable instructs the firmware to display a multi-boot menu - from which the OS and bootpath are selected. See + from which the OS and bootpath are selected. See for information relating to the multiboot process. @@ -368,10 +368,10 @@ xml:lang="en">
Boot Process - The boot process is described in . + The boot process is described in . Steps in the process are reviewed here, but the authoritative and complete description of the process is included in - . is a + . is a depiction of the boot flow showing the action of the f1, f5, and f6 function keys. The figure should only be used as an aid in understanding the requirements for LoPAR systems. @@ -431,7 +431,7 @@ xml:lang="en"> Once the boot prompt is displayed, the System Management Services (SMS) menu can be invoked. SMS provides a user interface for utilities, configuration, and the Multiboot Menu (as introduced in - ) for boot/install and the OF command + ) for boot/install and the OF command interpreter. The Multiboot menu is formatted so that block devices that currently contain boot information are most easily selected by the user. @@ -721,7 +721,7 @@ xml:lang="en"> diag-device configuration variables must include the standard block device bootinfo.txt file specification as documented in - (\ppc\bootinfo.txt). + (\ppc\bootinfo.txt). @@ -729,7 +729,7 @@ xml:lang="en">
Tape Boot - Boot from tape is defined in . + Boot from tape is defined in .
@@ -897,7 +897,7 @@ ELSE the platform using the ibm,manage-storage-preservation RTAS call if it wants the contents of the storage preserved across client boot cycles (see also "Managing Storage Preservations" in - specification). The architectural intent of this + specification). The architectural intent of this facility is to enable client programs to emulate persistent storage. This is done by a client program registering preservable LMBs. Then, after a subsequent boot cycle (perhaps due to error or impending power loss) the presence of the @@ -1029,7 +1029,7 @@ ELSE R1--1. - Platforms must implement OF as defined in . + Platforms must implement OF as defined in . @@ -1053,7 +1053,7 @@ ELSE xrefstyle="select: labelnumber nopage"/>-3. Platforms must implement the Run-Time Abstraction Services (RTAS) as described in - . + . @@ -1138,7 +1138,7 @@ ELSE
Tape Install The OF definition of installation from tape is defined in - . + .
@@ -1167,7 +1167,7 @@ ELSE will run on other vendors’ platforms which might not have permission to use AIX diagnostics, the “ibm,aix-diagnostics” property indicates that AIX diagnostics are permitted (see "Root - Node Properties" in ). + Node Properties" in ). @@ -1184,15 +1184,14 @@ ELSE Software Implementation Note: Each OS may implement an OS-specific run-time diagnostics package, but should, for purposes of consistency, adhere - to the error log formats in . + to the error log formats in .
Platform Class The “ibm,model-class” OF property is defined to classify platforms for planning, marketing, licensing, and - service purposes (see "Root Node Properties" in - ). + service purposes (see ). @@ -1696,7 +1695,7 @@ ELSE OR - See for more information. + See for more information. @@ -1710,7 +1709,7 @@ ELSE OR - . + . @@ -1726,8 +1725,8 @@ ELSE See and . Requirements for platforms that implement - LPAR, regardless of the number of partitions, are contained in - . + LPAR, regardless of the number of partitions (Requirements + and ). @@ -1755,7 +1754,7 @@ ELSE R - See . + See . @@ -1801,7 +1800,7 @@ ELSE O - See for more information on + See for more information on support of I2C buses. @@ -1816,7 +1815,7 @@ ELSE R - . + . @@ -1830,7 +1829,7 @@ ELSE R - . + . @@ -1844,7 +1843,7 @@ ELSE O - . + . @@ -1858,7 +1857,7 @@ ELSE O - . + . @@ -1888,7 +1887,7 @@ ELSE O - . + . @@ -1902,7 +1901,7 @@ ELSE O - . + . @@ -1916,7 +1915,7 @@ ELSE O - . + . @@ -1930,7 +1929,7 @@ ELSE O - . + . @@ -1944,7 +1943,7 @@ ELSE O - . + . @@ -1958,7 +1957,7 @@ ELSE O - . + . @@ -1972,7 +1971,7 @@ ELSE O - See . + See . @@ -1986,7 +1985,7 @@ ELSE O - . + . @@ -2000,7 +1999,7 @@ ELSE O - See . + See . @@ -2014,8 +2013,7 @@ ELSE NS - and "Managing - Storage Preservations" in specification. + and . @@ -2032,7 +2030,7 @@ ELSE Required of all platforms that support LPAR, otherwise not implemented. Provides a virtual “Asynchronous” IOA for connecting to a server Vterm IOA, the hypervisor, or HMC (for example, to a virtual - console). See for more + console). See for more information. @@ -2068,7 +2066,7 @@ ELSE - Performance Tool Support + Performance Tool Support O @@ -2078,8 +2076,9 @@ ELSE Provides access to platform-level facilities for - performance tools running in a partition on an LPAR system. See - . + performance tools running in a partition on an LPAR system. + See + .> @@ -2107,7 +2106,7 @@ ELSE O - See . + See . @@ -2121,7 +2120,7 @@ ELSE O - See . + See . 
@@ -2216,7 +2215,7 @@ ELSE Allows an authorized virtual server partition (VSP) to safely access the internal state of a specific partition. See - for more details. Requires the Reliable + for more details. Requires the Reliable Command/Response Transport option. @@ -2248,7 +2247,7 @@ ELSE Allows the OS to indicate that there is no need to search secondary page table entry groups to determine a page table search has failed. - See for more details. + See for more details. @@ -2292,7 +2291,7 @@ ELSE Support for the Subordinate CRQs as needed by some Virtual - IOAs. See . + IOAs. See . @@ -2308,7 +2307,7 @@ ELSE The CMO option allows for partition participation in the over-commitment of logical memory by the platform. See - . + . @@ -2323,7 +2322,7 @@ ELSE Allows the OS to cooperate with platform energy - management. See . + management. See . @@ -2338,7 +2337,7 @@ ELSE Support for the Multi-TCE-Table Option. See - . + . @@ -2354,7 +2353,7 @@ ELSE Provides substantially consistent virtual processor associativity in a shared processor LPAR environment. See - . + . @@ -2382,7 +2381,7 @@ ELSE O - See . + See . @@ -2397,7 +2396,7 @@ ELSE Allows OS notification of a cooperative memory - overcommitment page fault see . + overcommitment page fault see . @@ -2413,7 +2412,7 @@ ELSE Allows the platform to communicate and the availability of performance boost modes along with any ability to manage the same. See - + @@ -2445,7 +2444,7 @@ ELSE Allows the creation of DMA Windows above 4 GB. See - . + . @@ -2460,7 +2459,7 @@ ELSE - for information on ibm,partition-uuid. + for information on ibm,partition-uuid. @@ -2475,7 +2474,9 @@ ELSE O - See for more information. + See , + , and + for more information. @@ -2490,7 +2491,7 @@ ELSE Introduces additional cooperative memory overcommitment - functions see + functions see @@ -2504,7 +2505,7 @@ ELSE O - See . + See . @@ -2563,7 +2564,7 @@ ELSE Allows partitions to resize their HPT. See - . + . 
@@ -2577,8 +2578,8 @@ ELSE O - Allows partitions to resize their HPT. See - . + See + . diff --git a/LoPAR/ch_virtual_io.xml b/LoPAR/ch_virtual_io.xml index 397ff52..8f1bc55 100644 --- a/LoPAR/ch_virtual_io.xml +++ b/LoPAR/ch_virtual_io.xml @@ -193,7 +193,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo access modes, like writing to a read only page). More information on TCEs and TCE tables, which are used for physical IOAs, can be found in - . The RTCE table for Remote + . The RTCE table for Remote DMA (RDMA) is analogous to the TCE table for physical IOAs. The RTCE table does, however, have a little more information in it (as placed there by the hypervisor) in order to, among other @@ -728,7 +728,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo The virtual location code (see - ) + ) @@ -953,7 +953,7 @@ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en" xml:id="dbdo assignment number is uniquely generated when the virtual IOA is assigned to the partition and remains invariably associated with that virtual IOA for the duration of the partition definition. For more information, see - . + .
@@ -1153,7 +1153,7 @@ hcall ( const int64 H_VIO_SIGNAL, /* Function Code */ root node, a node of type vdevice as the parent of a sub-tree representing the virtual IOAs assigned to the partition (see - for details). + for details). @@ -1252,7 +1252,7 @@ hcall ( const int64 H_VIO_SIGNAL, /* Function Code */ For all VIO options: The platform must assign an invariant virtual location code to each virtual IOA as described in - . + . @@ -3326,7 +3326,7 @@ hcall ( const unit64 H_VIOCTL, /* Query/Set behaviors for the virtual IOA */ Transfer into registers R4 (High order 8 bytes) and R5 (low order 8 bytes) of the UUID of the client partition that owns the virtual device ( - for the format of the UUID string. + for the format of the UUID string. @@ -8852,7 +8852,7 @@ hcall ( const int64 H_SEND_SUB_CRQ, /* Function Code */ Property name specifying the unique and persistent location code associated with this virtual IOA, the value shall be of the form defined in - . + . @@ -9108,7 +9108,7 @@ hcall ( const int64 H_SEND_SUB_CRQ, /* Function Code */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -9128,7 +9128,7 @@ hcall ( const int64 H_SEND_SUB_CRQ, /* Function Code */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -12455,7 +12455,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ encoded array as with encode-string. The value shall be of the form specified in - . + . @@ -12552,7 +12552,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -12572,7 +12572,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . 
@@ -12750,7 +12750,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ encoded array as with encode-string. The value shall be of the form - . + . @@ -12872,7 +12872,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -12892,7 +12892,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -13467,7 +13467,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ encoded array as with encode-string. The value shall be of the form specified in - . + . @@ -13720,7 +13720,7 @@ hcall ( const uint64 H_ILLAN_ATTRIBUTES,/* Returns in R4 the resulting ILLAN */ encoded array as with encode-string. The value shall be of the form - . + . @@ -15231,8 +15231,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne presented as an encoded array as with encode-string. The value shall be of the form specified in - information on - Virtual Card Connector Location Codes. + . @@ -15317,7 +15316,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne the method described in the definition for the “ibm,#dma-size-cells” property in - section on System Bindings. + . @@ -15336,7 +15335,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne the method described in the definition for the “ibm,#dma-address-cells” property in - section on System Bindings. + . @@ -15542,8 +15541,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne presented as an encoded array as with encode-string. The value shall be of the form specified in - information on - Virtual Card Connector Location Codes. + . 
@@ -15633,7 +15631,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne the method described in the definition for the “ibm,#dma-size-cells” property in - section on System Bindings. + . @@ -15652,7 +15650,7 @@ hcall ( const uint64 H_FREE_VTERM, /* Break connection between server and partne the method described in the definition for the “ibm,#dma-address-cells” property in - section on System Bindings. + . @@ -18503,8 +18501,7 @@ able 252‚ “VASI Reliable CRQ Response Status Values‚” on page 721. presented as an encoded array as with encode-string. The value shall be of the form specified in - information on - Virtual Card Connector Location Codes. + . @@ -20749,7 +20746,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ encoded array as with encode-string. The value shall be of the form specified in - . + . @@ -20846,7 +20843,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -20866,7 +20863,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -21101,7 +21098,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ encoded array as with encode-string. The value shall be of the form - . + . @@ -21223,7 +21220,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -21243,7 +21240,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -21717,7 +21714,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ definition for the “ibm,#dma-size-cells” property in - . 
+ . @@ -21740,7 +21737,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ format cannot be derived using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -22208,7 +22205,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ using the method described in the definition for the “ibm,#dma-address-cells” property in - . + . @@ -22229,7 +22226,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ definition for the “ibm,#dma-size-cells” property in - . + . @@ -22270,7 +22267,7 @@ hcall ( const uint64 H_VASI_STATE, /* Return the state of the VASI service */ Vendor unique property name indicating ranges of the client program virtual address space that are used by the virtual device serving partition adjunct. - See information about the children + See information about the children of the /vdevice node. diff --git a/LoPAR/sec_rtas_call_defn.xml b/LoPAR/sec_rtas_call_defn.xml index bf471fb..0a6a57a 100644 --- a/LoPAR/sec_rtas_call_defn.xml +++ b/LoPAR/sec_rtas_call_defn.xml @@ -33,7 +33,7 @@ to hold OF options, RTAS information, machine configuration state, OS state, diagnostic logs, etc. The type and size of NVRAM is specified in the OF device tree. The format of NVRAM is detailed in - . + . In order to give the OS the ability to access NVRAM on different platforms that may use different implementations or locations @@ -346,7 +346,7 @@ The caller of the nvram-store RTAS call must maintain the NVRAM partitions as specified in - . + . @@ -362,7 +362,7 @@ clock which maintains the time of day even if power to the machine is removed. Minimum requirements for this clock are described in Requirement - . + .
Time of Day Inputs/Outputs @@ -785,7 +785,7 @@ Software Implementation Note: The OS maintains the clock in UTC. This allows the OS and diagnostics to co-exist with each other and provide uniform handling of time. Refer to Requirement - for further details on the time + for further details on the time of day clock. @@ -1232,7 +1232,7 @@ The event-scan call must fill in the error log with a single error log formatted as specified in - . If necessary, the data placed + . If necessary, the data placed into the error log must be truncated to length bytes. @@ -1246,7 +1246,7 @@ that are within the classes defined by the Event mask. Event mask is a bit mask of error and event classes. Refer to - for the definition of the bit + for the definition of the bit positions. @@ -1602,7 +1602,7 @@ The check-exception call must fill in the error log with a single error log formatted as specified in - . The data in the error log + . The data in the error log must be truncated to length bytes. @@ -1636,7 +1636,7 @@ that are within the classes defined by the Event mask. Event mask is a bit mask of error and event classes. Refer to - for the definition of the bit + for the definition of the bit positions. @@ -1655,7 +1655,7 @@ The interrupt number for external device interrupts is provided in the OF device tree as specified in - . + . @@ -1789,7 +1789,7 @@ The rtas-last-error call must fill in the error log with a single error log formatted as specified in - . If necessary, the data placed + . If necessary, the data placed into the error log must be truncated to “length” bytes. @@ -2689,7 +2689,7 @@ Device drivers and system software need access to PCI configuration space. - section on "Address Map" defines + defines system address spaces for PCI memory and PCI I/O spaces. It does not define an address space for PCI configuration. Different PCI bridges may implement the mechanisms for accessing PCI configuration space in @@ -2842,7 +2842,7 @@ xrefstyle="select: labelnumber nopage"/>-3.
RTAS must follow the rules of - when accessing PCI + when accessing PCI configuration space. @@ -2866,7 +2866,7 @@ PCI-X Mode 2 and PCI Express devices, an IOA device driver is responsible for checking if the “ibm,pci-config-space-type” property (see - ) of the IOA's node exists and + ) of the IOA's node exists and is set to a non-zero value. @@ -3677,7 +3677,7 @@ control characters carriage-return (CR) (0x0D) and line-feed (LF) (0x0A). The following OF properties are defined in - : + : @@ -4993,7 +4993,7 @@ When tone is required. See Requirement - . + . ibm @@ -5023,7 +5023,7 @@ When tone is required. See Requirement - . + . ibm @@ -5170,7 +5170,7 @@ Isolate refers to the DR action to logically disconnect from the platform and/or OS (for example, for PCI, isolate from the bus and from the OS). See - for more + for more details. @@ -5206,8 +5206,8 @@ or just an Identify/Action indicator. Identify and Action may map to the same visual state (for example, the same blink rate). See - and - for more + and + for more information. @@ -5240,7 +5240,7 @@ Allows an OS image to assign (usable, exchange, or recover) resources from the firmware or, release resources from the OS to the firmware. See - for more + for more details. @@ -5321,7 +5321,7 @@ Yes See - . + . ibm @@ -5335,8 +5335,8 @@ system or a partition requires operator intervention for another reason. The Error Log indicator is located only on the Primary Enclosure. See - and - for more + and + for more information. @@ -5360,7 +5360,7 @@ Yes See - . + . ibm @@ -5377,8 +5377,8 @@ protect against the use of multiple 9007 indicators simultaneously or multiple uses of the same 9007 indicator simultaneously. See - and - for more + and + for more information. @@ -5696,8 +5696,7 @@ -1: Hardware Error -2: Hardware Busy, Try again later -3: No such sensor implemented - -9000: DR Entity isolated ( - ) + -9000: DR Entity isolated () @@ -5774,7 +5773,7 @@ Critical High - The sensor value is greater than or equal to this limit. 
The platform may take some action and may initiate an EPOW (see - ). The OS may take some action + ). The OS may take some action to correct this situation or to perform an orderly shutdown. @@ -6260,8 +6259,8 @@ Used in Dynamic Reconfiguration operations to determine if connector is available and whether the user performed a particular DR operation correctly. See - and - . + and + . @@ -6331,7 +6330,7 @@ Yes See - . + . ibm @@ -6359,7 +6358,7 @@ Yes See - . + . ibm @@ -6743,7 +6742,7 @@ sensor. For example, the first entry of “ibm,sensor-9001” contains the location code for fan#1. Location codes are shown in - . Of course, since it is an + . Of course, since it is an abstracted sensor, the entry for “ibm,sensor-9000” is NULL.
@@ -6853,7 +6852,7 @@ property “ibm,environmental-sensors” in the /rtas node (see - ). + ). @@ -7749,7 +7748,7 @@ under a UPS would be given by the platform as an EPOW event with EPOW event modifier being given as, 0x02 = Loss of utility power, system is running on UPS/Battery, as described in section - . + . @@ -7786,7 +7785,7 @@ system-reboot call which resets all processors and all attached devices. After reset, the system must be booted with the current settings of the System Environment Variables (refer to - for more information). + for more information). @@ -7894,7 +7893,7 @@ in this section. It does not return to the OS if successful. This call supports RTAS instantiated in 32 bit mode to access storage at addresses above 4GB. In an exception to the LPAR Requirement - this call supports block lists + this call supports block lists being outside of the Real Mode Area (RMA) as long as the initial block list is at an address below the limits of the cell size of the Block_list argument. @@ -8213,7 +8212,7 @@ Flash Update with Discontiguous Block Lists The property “ibm,flash-block-version” (see - ) is defined to describe the + ) is defined to describe the following definition and operation of the Block_list shown in . @@ -9014,7 +9013,7 @@ xrefstyle="select: labelnumber nopage"/>-1. (Merged into Requirement - ) + ) @@ -9023,7 +9022,7 @@ xrefstyle="select: labelnumber nopage"/>-2. (Merged into Requirement - ) + ) @@ -9363,7 +9362,7 @@ Note: Requirement - applies to the start-cpu RTAS + applies to the start-cpu RTAS call. At the completion of start-cpu, the caches to be used by the specified processor must have been initialized and the state bits made accurate prior to beginning execution at the start address. @@ -10593,7 +10592,7 @@ favored level by firmware at boot), of the External Interrupt Vector Entry associated with the interrupt number provided as an input argument unless prevented by Requirement - . + . 
@@ -10789,7 +10788,7 @@ ibm,int-on call since boot), associated with the interrupt number provided as an input argument unless prevented by Requirement - . + . @@ -10978,7 +10977,7 @@ ibm,int-off call must disable interrupts from the interrupt source associated with the interrupt number provided as an input argument unless prevented by Requirement - . + . @@ -10997,7 +10996,7 @@ ibm,get-xive call and set the priority value of the XIVE to the least favored priority value (0xFF), unless prevented by Requirement - . + . @@ -11150,7 +11149,7 @@ ibm,int-on call must enable interrupts from the interrupt source associated with the interrupt number provided as an input argument unless prevented by Requirement - . + . @@ -11167,7 +11166,7 @@ saved by the previous ibm,int-off call (initialized by the firmware to the least favored level at boot) unless prevented by Requirement - . + . @@ -11280,7 +11279,7 @@ MSI Support This section describes the RTAS calls required when the MSI option is implemented. See - for other platform requirements + for other platform requirements for the MSI option. The Message Signaled Interrupt (MSI) and Enhanced MSI (MSI-X) capability of PCI IOAs in many cases allows for greater flexibility in @@ -11371,7 +11370,7 @@ interrupts from the IOA function. It is permissible to use LSI, MSI and MSI-X on different IOA functions. The default (initial) assignment of interrupts is defined in - . + . @@ -11818,7 +11817,7 @@ MSIs and MSI source numbers are not shared (see Requirement - ). + ). @@ -11859,8 +11858,7 @@ The platform will return a status -2 or 990x only when the OS indicates support. The OS indicates support via ibm,client-architecture-support, - vector 4. See section on "Root Node Methods" - for more information. + vector 4. See . @@ -12121,7 +12119,7 @@ order to be able to test device driver code that implements recovery based on the EEH option. 
See also, - , for additional information + , for additional information about implementing EEH error recovery. @@ -13212,15 +13210,14 @@ - The PE configuration address ( - PHB_Unit_ID_Hi, PHB_Unit_ID_Low, and config_addr) + The PE configuration address (PHB_Unit_ID_Hi, PHB_Unit_ID_Low, and config_addr) for the domain is the PCI configuration address for the PE primary bus and is the same format as used for the ibm,read-pci-config and ibm,write-pci-config calls (see Requirement ), except that the Register field is set to 0. The PE configuration address is obtained as indicated in - . + .
<emphasis>ibm,set-eeh-option</emphasis> @@ -13386,7 +13383,7 @@ ibm,set-eeh-option Function 1 (enable EEH) is still required as a signalling method from the device driver to the platform that the device driver is at least EEH aware (see Requirement - ). + ). @@ -13910,7 +13907,7 @@ that call can be used instead of this one to determine the PE configuration address. See and - . + . @@ -13986,7 +13983,7 @@ “ibm,read-slot-reset-state-functions” property in the RTAS node of the device tree ( - ). + ). @@ -14345,7 +14342,7 @@ This call is used to obtain information about fabric configuration addresses, given the PCI configuration address. See - for more information on PEs and + for more information on PEs and determining PE configuration addresses. The PCI configuration address ( PHB_Unit_ID_Hi, PHB_Unit_ID_Low, and config_addr) @@ -14617,7 +14614,7 @@ log version used, the data in the Returned_Error_Buffer is in an extended log format as defined in - . When the call returns data + . When the call returns data for version 6 or greater, the device driver error buffer data is included as the last User Data section. The device driver data in the return buffer may be truncated from what is passed by the device driver or @@ -14932,7 +14929,7 @@ Device_Driver_Error_Buffer_Length argument is non-zero, indicating the existence of optional device driver error data, the referenced buffer must contain an extended event log as defined in - . + . @@ -16023,7 +16020,7 @@ device driver or OS from the restoration of non-interrupts the PCI configuration space for changes that were made to the configuration space after boot (see Requirement - ). + ). @@ -16596,7 +16593,7 @@ “ibm,errinjct-tokens” property as defined below in the /rtas node (see - ) of the OF device tree with a + ) of the OF device tree with a specification for each implemented error injection class.
@@ -17173,7 +17170,7 @@ For PHB implementations that do not allow injection of a TLP ECRC error into the request, or for the case where the injection would be in violation of Requirement - due to the hardware + due to the hardware configuration, the platform should emulate the error by setting the appropriate error state in the PHB when EEH is enabled. @@ -17486,7 +17483,7 @@ should the hardware signal a machine check or system reset interrupt. The results of an error analysis are reported via a standard error log structure as defined in - . The storage containing the + . The storage containing the error log structure is subsequently released back to firmware use by the OS after it has completed its event handling by the issuance, from the interrupted processor, of the @@ -17819,7 +17816,7 @@ contains the real address of a 16 byte memory buffer containing the original contents of GPR R3 in the first 8 bytes and the RTAS Error Log (fixed part) (per - ) in the second 8 bytes. + ) in the second 8 bytes. @@ -17841,7 +17838,7 @@ For the FWNMI option: Once the firmware has reported a “fatal” machine check event to an OS image it must only report “fatal error previously reported” (see - ) in response to machine checks + ) in response to machine checks on any processor belonging to that image. @@ -19012,7 +19009,7 @@ The format of the SPLPAR string is beyond the scope of this architecture. See also, - . + . diff --git a/LoPAR/sec_rtas_dma_window.xml b/LoPAR/sec_rtas_dma_window.xml index de5c0b5..d0c5aa3 100644 --- a/LoPAR/sec_rtas_dma_window.xml +++ b/LoPAR/sec_rtas_dma_window.xml @@ -833,7 +833,7 @@ Platforms supporting the DDW option implement extensions described in this section. 
These extensions include: adding the “ibm,ddw-extensions” property see - to those nodes that include the + to those nodes that include the “ibm,ddw-applicable” property, and implementing the functional extensions specified for the architectural level in diff --git a/LoPAR/sec_rtas_environment.xml b/LoPAR/sec_rtas_environment.xml index b55e964..439acf2 100644 --- a/LoPAR/sec_rtas_environment.xml +++ b/LoPAR/sec_rtas_environment.xml @@ -43,7 +43,7 @@ If the LPAR option is enabled, multiple partitions may exist, each with its own OS instance. This requires some changes to the RTAS environment. These changes are discussed in - . + .
Machine State @@ -1059,7 +1059,7 @@ Required for DR operations (see - ) + )   @@ -1327,7 +1327,7 @@ “ibm,configure-connector” - + @@ -1335,7 +1335,7 @@ See - . + . @@ -1546,7 +1546,7 @@ Sometimes (see - ) + )   @@ -1924,8 +1924,8 @@ (9003). DR indicators and sensors are required to be there based on the DR entity being supported. Their indices are specified by the DR index for the DR entity. See - and - for more information. + and + for more information. are static since they represent the base hardware, others are dynamic coming and going with extensions to the base hardware. Indices for DR indicators and sensors are obtained from the DRC index for the DRC @@ -1964,7 +1964,7 @@ contiguous, and any of the indices between 0 and maxindex may be missing. The formats for location codes are defined in - . For indicators and sensors, + . For indicators and sensors, these location codes are for the location of the device being manipulated or measured, not the location of the specific controller or sensor. The location code for an abstracted indicator or sensor is a NULL @@ -1985,9 +1985,9 @@ For static indicators, except DR indicators, the extension property, <vendor>,indicator-<token> - (see ), provides an array of strings + (see ), provides an array of strings containing the FRU location codes associated with each indicator. See - . Here, “ + . Here, “ <vendor>” corresponds to the “<vendor>” column of @@ -2007,7 +2007,7 @@ /rtas node. Indices for DR indicators 9001, 9002, and 9003 are obtained from the DRC index for the DRC connector. See Requirement - . + . @@ -2082,9 +2082,9 @@ platform provides. For static sensors, except DR sensors, the extension property, <vendor>,sensor-<token> - (see ), provides an array of strings + (see ), provides an array of strings containing the FRU location codes associated with each sensor. See - . Here, “ + . Here, “ <vendor>” corresponds to the “<vendor>” column of @@ -2102,7 +2102,7 @@ /rtas node. 
Indices for DR sensors 9003 are obtained from the DRC index for the DRC connector. See Requirement - . + . @@ -2493,7 +2493,7 @@ Multi-level isolation error (see - ). + ). diff --git a/LoPAR/sec_rtas_error_classes.xml b/LoPAR/sec_rtas_error_classes.xml index 232c37e..244ac6c 100644 --- a/LoPAR/sec_rtas_error_classes.xml +++ b/LoPAR/sec_rtas_error_classes.xml @@ -36,8 +36,8 @@ OSs know which interrupts may be handled by calling check-exception. The OF structure for describing these interrupts is defined in - . - This document also defines the mask parameter for the + . + also defines the mask parameter for the check-exception and event-scan RTAS functions which limits the search for diff --git a/LoPAR/sec_rtas_error_reporting_location_codes.xml b/LoPAR/sec_rtas_error_reporting_location_codes.xml index 36bddfd..c0bd3b7 100644 --- a/LoPAR/sec_rtas_error_reporting_location_codes.xml +++ b/LoPAR/sec_rtas_error_reporting_location_codes.xml @@ -1,7 +1,7 @@ -
Location Codes - + This document defines an architecture extension for physical location codes. One use of location codes is to append failing location - information to error logs returned by the - event-scan and - check-exception RTAS services. Refer to - for more information on the + information to error logs returned by the + event-scan and + check-exception RTAS services. Refer to + for more information on the format and use of location codes. For event logs with Version 6 or later, the location code of FRU call out is contained in the Primary SRC section, FRU call out sub-section of the Platform Event Log format. - +
diff --git a/LoPAR/sec_rtas_get_indices.xml b/LoPAR/sec_rtas_get_indices.xml index e84fbba..28cf6f5 100644 --- a/LoPAR/sec_rtas_get_indices.xml +++ b/LoPAR/sec_rtas_get_indices.xml @@ -43,8 +43,8 @@ This RTAS call is not used for DR indicators (9001, 9002, and 9003) or DR sensors (9003). See the following sections in the DR chapter for more information: - and - . + and + . It may require several calls to the ibm,get-indices RTAS routine to get the entire list of indicators or sensors of a particular type. Each call may specify a diff --git a/LoPAR/sec_rtas_hot_plug.xml b/LoPAR/sec_rtas_hot_plug.xml index 5da520a..ce55888 100644 --- a/LoPAR/sec_rtas_hot_plug.xml +++ b/LoPAR/sec_rtas_hot_plug.xml @@ -1,7 +1,7 @@ -
- + Hot Plug Events - - Hot Plug Events, when implemented, are reported through + + Hot Plug Events, when implemented, are reported through either the event-scan RTAS call or a hotplug interrupt. - An OS that wants to be notified of hotplug events will need to - set the appropriate arch-vector bit. Look for the hot-plug-events - node in the /event-sources node of the OF device tree (see - ), enable the interrupts listed - in its “interrupts” property and provide an interrupt handler to call + An OS that wants to be notified of hotplug events will need to + set the appropriate arch-vector bit. Look for the hot-plug-events + node in the /event-sources node of the OF device tree (see + ), enable the interrupts listed + in its “interrupts” property and provide an interrupt handler to call check-exception when one of those interrupts is received. - When a hotplug event occurs, whether reported by check-exception - or event-scan, RTAS will directly pass back the Hotplug Event Log as + When a hotplug event occurs, whether reported by check-exception + or event-scan, RTAS will directly pass back the Hotplug Event Log as described in . - + - + - R1-R1--1. If FRUs can be hot plugged in the system @@ -48,7 +48,7 @@ signaling the OS about the event. - + - +
diff --git a/LoPAR/sec_rtas_manage_storage_preservation.xml b/LoPAR/sec_rtas_manage_storage_preservation.xml index 7fac71c..18de6ea 100644 --- a/LoPAR/sec_rtas_manage_storage_preservation.xml +++ b/LoPAR/sec_rtas_manage_storage_preservation.xml @@ -25,7 +25,7 @@ Platforms may optionally preserve selected regions of storage (LMBs) across client program boot cycles. - for more information. + for more information.
diff --git a/LoPAR/sec_rtas_suspend_me.xml b/LoPAR/sec_rtas_suspend_me.xml index 5d987f3..5d494e4 100644 @@ -28,14 +28,14 @@ part of OS hibernation or migration to another platform. This RTAS call is made by the last active processor thread of a partition. The OS uses the H_JOIN hcall() (see - ) to deactivate other + ) to deactivate other processing threads. Processing threads may exit H_JOIN due to an unmaskable interrupt; if a thread has exited H_JOIN, ibm,suspend-me fails with a status of “multiple processor threads active”. The wake up from suspension is triggered by partition state change (see - sections on "Partition Migration" - and "Partition Hibernation"). The + and + ). The ibm,suspend-me RTAS call returns only on the calling virtual processor. Other virtual processors that were inactive when ibm,suspend-me was called remain so until they are @@ -50,7 +50,7 @@ ibm,update-properties (see ) and/or ibm,configure-connector (see - ). Also during suspension, some + ). Also during suspension, some system parameters may have changed. See , for details. The OS may want to re-scan selected system parameters. @@ -62,7 +62,7 @@ For the Partition Suspension option: The platform must implement the Logical Partitioning option (see - ) + ) . @@ -191,7 +191,7 @@ For the Partition Suspension option: The platform must implement the Thread Join option (see - ). + ). @@ -236,7 +236,7 @@ For the Partition Suspension option: The platform must support the “Partner partition suspended” CRQ Transport Event (See - ). + ). @@ -273,7 +273,7 @@ For the Partition Suspension option: The platform must implement the H_ENABLE_CRQ hcall() using the syntax and semantics described in - . + . @@ -312,7 +312,7 @@ For the Partition Suspension option: The platform must implement the LRDR option (See - ). + ).
diff --git a/LoPAR/sec_rtas_update_properties.xml b/LoPAR/sec_rtas_update_properties.xml index 21e720b..0d90f5e 100644 --- a/LoPAR/sec_rtas_update_properties.xml +++ b/LoPAR/sec_rtas_update_properties.xml @@ -903,7 +903,7 @@ TLB properties (See - ) + )