From b2db419922c1b2c79f0d8ee4bd311d9e45200149 Mon Sep 17 00:00:00 2001
From: Bill Schmidt
Date: Tue, 9 Jun 2020 13:22:14 -0500
Subject: [PATCH] Fixes per internal review comments

Signed-off-by: Bill Schmidt
---
 specification/bk_main.xml | 2 +-
 specification/ch_1.xml | 16 +++-
 specification/ch_2.xml | 150 +++++++++++++++--------------------------------
 specification/ch_3.xml | 5 +-
 specification/ch_4.xml | 2 +-
 specification/ch_5.xml | 2 +-
 6 files changed, 75 insertions(+), 102 deletions(-)

diff --git a/specification/bk_main.xml b/specification/bk_main.xml
index dd948b8..ef261b1 100644
--- a/specification/bk_main.xml
+++ b/specification/bk_main.xml
@@ -94,7 +94,7 @@
- 2020-05-21
+ 2020-06-04
diff --git a/specification/ch_1.xml b/specification/ch_1.xml
index 921ab34..aed65cf 100644
--- a/specification/ch_1.xml
+++ b/specification/ch_1.xml
@@ -69,9 +69,7 @@ IBM Power Instruction Set Architecture, Versions 2.7 and 3.0, 2.07, 3.0, and 3.1, - IBMOpenPOWER Foundation, 2.07, 3.0, and 3.1, IBM, 2013-20162013-2020.
@@ -214,7 +212,6 @@ describe the implications of this new capability. For specifics, see , , , , , , .
+
+
+ Appendix A, "Predefined Functions for Vector Programming,"
+ and most of Chapter 6, "Vector Programming Interfaces," have
+ been removed from this document. This material is now
+ incorporated into the POWER Vector Intrinsics
+ Programming Reference. See
+ for a link to this document.
+
+
diff --git a/specification/ch_2.xml b/specification/ch_2.xml
index bf21515..d8e9c44 100644
--- a/specification/ch_2.xml
+++ b/specification/ch_2.xml
@@ -2384,24 +2384,10 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
Function pointer - - + + Binary Floating-Point - - _Float16 - - - 2 - - - Halfword - - - Half-precision float - - - float
@@ -2778,7 +2764,7 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
Vector of 1 signed quadword.
- + @@ -4091,9 +4077,9 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> If a function contains any prefixed (8-byte) instructions, functions should preferably be aligned on at least a 64-byte boundary. In ISA 3.1, executing a prefixed instruction that - crosses a 64-byte boundary will cause a SIGILL that must be - handled by the kernel. Compilers and assemblers can avoid - this if functions are aligned on a 64-byte boundary. + crosses a 64-byte boundary causes an alignment interrupt. + Compilers and assemblers can avoid this if functions are + aligned on a 64-byte boundary. @@ -4142,12 +4128,12 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> tables is further described in the referenced section. A program may contain any combination of the function call protocols in these tables. - Note that - this ABI does not define protocols where the caller does not use - a TOC pointer, but does preserve r2. It is most efficient when - such functions are always leaf procedures. It is not forbidden for - such a function to call another function, but in this case it is - up to the caller to save and restore r2 around each call. + This ABI does not define protocols where the + caller does not use a TOC pointer, but does preserve r2. It + is most efficient when such functions are always leaf + procedures. It is not forbidden for such a function to call + another function, but in this case it is up to the caller to + save and restore r2 around each call. linkage table (PLT) stub that saves r2 and replaces the nop instruction with a restore of r2. (The save of r2 may be omitted from the PLT stub if the R_PPC64_TOCSAVE relocation is used; see - .) If the callee requires - a TOC, the PLT stub also includes code to place the callee's global - entry point into r12. See for a full description of PLT stubs. + .) See for a full description + of PLT stubs.
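Review note (illustrative only, not part of the patch): the alignment hunk above says that, in ISA 3.1, a prefixed (8-byte) instruction that crosses a 64-byte boundary causes an alignment interrupt. The condition can be sketched in C as below. The helper name `prefixed_insn_crosses_64` is hypothetical, and the sketch assumes the instruction occupies the 8 bytes starting at `addr`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the ABI text.
 * Returns true when an 8-byte prefixed instruction that starts at addr
 * would straddle a 64-byte boundary. The instruction occupies bytes
 * [addr, addr + 8), so it crosses a boundary exactly when the offset of
 * addr within its 64-byte block is greater than 56. */
static bool prefixed_insn_crosses_64(uint64_t addr)
{
    return (addr % 64) > 56;
}
```

Because instructions are word aligned, the only offending offset in practice is 60. When a function starts on a 64-byte boundary, a toolchain can determine at build time which offsets within the function would straddle a boundary and pad accordingly.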
@@ -4487,8 +4472,7 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> after the bl instruction for the call. Instead, the compiler annotates the bl instruction with an R_PPC64_REL24_NOTOC relocation. The linker generates a PLT stub that does not include - a save of r2. If the callee requires a TOC, the PLT stub also - includes code to place the callee's global entry point into r12. + a save of r2.
@@ -4685,19 +4669,23 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> Nonvolatile Register r2 is nonvolatile with respect to calls - between most functions - in the same compilation unit. It is saved and restored by - code inserted - by the linker resolving a call to an external function. For - more information, see and . + between functions in the same compilation + unit. It is saved and + restored by code inserted by the linker resolving a + call to an external function., except under the conditions + in footnote (b). For more information, see + and . or Volatile Register r2 is volatile and available for use in a function that does not use a TOC pointer and that does - not preserve r2. See - . + not guarantee that it preserves r2. See + and + . @@ -5042,11 +5030,7 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> revisionflag="changed">least-significant halves of those VSX registers corresponding to the classic floating-point registers (that is, vsr0–vsr31), are volatile. If the most-significant half of such a - VSX register is a non-volatile floating-point register that is - not used for a function call, the entire VSX register is - volatile. + revisionflag="changed">are volatile.
Floating-Point Register Roles for Binary Floating-Point @@ -6009,8 +5993,15 @@ xml:id="dbdoclet.50655240_pgfId-1156194"> <para>Any future type requiring 16-byte alignment (see <xref linkend="dbdoclet.50655240_15141" />) or processed in vector registers</para> - <para>For the purpose of determining a qualified floating-point argument, - _Float128 shall be considered a vector data type. In addition, _Float128 + <para>For the purpose of determining a qualified + floating-point argument, <phrase + revisionflag="deleted">_Float128</phrase><phrase + revisionflag="added">IEEE BINARY 128 QUADRUPLE + PRECISION</phrase> shall be considered a vector data type. In + addition, <phrase + revisionflag="deleted">_Float128</phrase><phrase + revisionflag="added">IEEE BINARY 128 QUADRUPLE + PRECISION</phrase> is like a vector data type for determining if multiple aggregate members are like.</para> <para>A homogeneous aggregate can consist of a variety of nested @@ -6594,7 +6585,7 @@ s6 - 72 (stored)</programlisting> Area must be large enough to accommodate all parameters, including parameters passed in registers.</para> <para revisionflag="added"> - The caller of any function with an ellipsis in its prototype + The caller of any function with a variable argument list must allocate a Parameter Save Area, as described in <xref linkend="dbdoclet.50655240_78421" />. </para> @@ -6652,44 +6643,6 @@ s6 - 72 (stored)</programlisting> </listitem> </itemizedlist> </section> - <section xml:id="dbdoclet.50655240___tailcall" - revisionflag="added"> - <title>Tail-Call Optimization - - When the last action of a function F is - to perform a function call to a function - G, and optionally return the value - returned from G, a compiler may perform a - tail-call optimization so long as the - optimization is undetectable by the caller of - G. 
The full details of and requirements - for tail-call optimization will not be described here, but in - essence F removes its stack frame and - issues a direct branch to G, which reuses - the stack space and the saved link register so that - G eventually returns to the caller of - F. - - - When the call from F to - G is not local, and - F is a TOC-preserving function, tail-call - optimization is disallowed because F and - G may have different TOC pointers. - Tail-call optimization cannot guarantee that the correct TOC - will be restored when G returns. - - - When the call from F to - G is local, and F is - a TOC-preserving function, but G is - not a TOC-preserving function, then - tail-call optimization is again disallowed. In this case, - G may have placed any value into register - r2, and the correct TOC will not be restored when - G returns. - -
Coding Examples @@ -8255,19 +8208,30 @@ bctrl
- Function calls often + Function calls need to be performed in conjunction with establishing, maintaining, and restoring addressability through the TOC pointer register, r2. When a function is called, the TOC pointer register - may be modified. In many cases, - the caller must provide a nop + may be modified. The caller must provide a nop after the bl instruction performing a call, if r2 is not known to have the same value in the callee. This is generally true for external calls. The linker will replace the nop with an r2 restoring instruction if the - caller and callee use different r2 values. The linker leaves it unchanged if they + caller and callee use different r2 values, The linker leaves it + unchanged if they use the same r2 value. This scheme avoids having a compiler generate an - overconservative r2 save and restore around every external call. + overconservative r2 save and restore around every external + call. + + When a function requires addressability through the TOC + pointer register, r2, and that function calls another function + that may not preserve the value of r2, the caller must provide + a nop after the bl instruction performing the call. The + linker will replace the nop with an r2-restoring instruction + if it determines that r2 may be changed as a result of the + call; otherwise the linker will leave the nop unchanged. See + for a full + description of when a nop must be inserted. + There are two cases where the caller need not provide a nop after the bl instruction performing a call:
diff --git a/specification/ch_3.xml b/specification/ch_3.xml
index 1852399..5db9554 100644
--- a/specification/ch_3.xml
+++ b/specification/ch_3.xml
@@ -5606,7 +5606,7 @@ addi r4, r4, lower
typedef struct {
/* Reservation for HWCAP data. */
unsigned int hwcap2;
- unsigned int hwcap; /* not used in LE ABI */
+ uint64_t hwcap; /* not used in LE ABI */
/* Indicate if HTM capable (ISA 2.07). */
int tm_capable;
@@ -9083,7 +9083,8 @@ nop
Initial Exec to Local Exec (PC-Relative)
- Initial-Exec-to-Local-Exec Initial Relocations
+ Initial-Exec-to-Local-Exec Initial Relocations
+ (PC-Relative)
diff --git a/specification/ch_4.xml b/specification/ch_4.xml
index 8427354..5326017 100644
--- a/specification/ch_4.xml
+++ b/specification/ch_4.xml
@@ -718,7 +718,7 @@ PPC_FEATURE2_DARN 0x00200000 /* darn instruction */
PPC_FEATURE2_SCV 0x00100000 /* scv syscall */
PPC_FEATURE2_HTM_NO_SUSPEND 0x00080000 /* TM without suspended state */
PPC_FEATURE2_ARCH_3_1 0x00040000 /* ISA 3.1 */
-PPC_FEATURE2_MMA 0x00020000 /* Matrix Multiply Accumulate */
+PPC_FEATURE2_MMA 0x00020000 /* Matrix Multiply Assist */
When a process starts to execute, its stack holds the arguments, environment, and auxiliary vector received from the exec call. The system makes no guarantees about the relative arrangement of argument strings,
diff --git a/specification/ch_5.xml b/specification/ch_5.xml
index 9875b86..3e8e99d 100644
--- a/specification/ch_5.xml
+++ b/specification/ch_5.xml
@@ -320,7 +320,7 @@ xml:id="dbdoclet.50655243_pgfId-1099317">
__VEC_ELEMENT_REG_ORDER__
- For more information, see
+ For more information, see .
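
Review note (illustrative only, not part of the patch): the ch_4.xml hunk lists feature bits such as PPC_FEATURE2_MMA. A minimal sketch of how a program might test such a bit is shown below. The two constants are copied from that hunk. The helper `has_feature2` is a hypothetical name, and the `hwcap2` argument is assumed to be the feature word the program has already obtained.

```c
#include <stdbool.h>

/* Bit values copied from the ch_4.xml listing in this patch. */
#define PPC_FEATURE2_ARCH_3_1 0x00040000UL /* ISA 3.1 */
#define PPC_FEATURE2_MMA      0x00020000UL /* Matrix Multiply Assist */

/* Hypothetical helper: true when the given feature bit is set in the
 * feature word passed in by the caller. */
static bool has_feature2(unsigned long hwcap2, unsigned long bit)
{
    return (hwcap2 & bit) != 0;
}
```

On Linux with glibc, the feature word is usually read with `getauxval(AT_HWCAP2)` from `<sys/auxv.h>`. That header may already supply these macros on Power targets, in which case the local definitions above would be unnecessary.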