|
|
|
@ -4032,7 +4032,8 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
</figure>
|
|
|
|
|
|
|
|
|
|
<note>
|
|
|
|
|
<para><xref linkend="dbdoclet.50655240_30073" /> , the alignment of the
|
|
|
|
|
<para>In <xref linkend="dbdoclet.50655240_30073" />, the alignment
|
|
|
|
|
of the
|
|
|
|
|
structure is not affected by the unnamed short and int fields. The
|
|
|
|
|
named members are aligned relative to the start of the structure.
|
|
|
|
|
However, it is possible that the alignment of the named members is
|
|
|
|
@ -4044,6 +4045,70 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
</section>
|
|
|
|
|
</section>
|
|
|
|
|
</section>
|
|
|
|
|
<section revisionflag="added" xml:id="dbdoclet.50655240_AddrModel">
|
|
|
|
|
<title revisionflag="added">Global Data Addressing Models</title>
|
|
|
|
|
<para revisionflag="added">This specification provides for two global data
|
|
|
|
|
addressing models. The traditional addressing model, which we will call
|
|
|
|
|
"TOC-based," relies on a dedicated table-of-contents (TOC) pointer to
|
|
|
|
|
obtain the addresses of global data. Power ISA version 3.1 introduces new
|
|
|
|
|
"PC-relative" instructions that can be used to obtain the addresses of
|
|
|
|
|
global data relative to the current instruction address (CIA). Code that
|
|
|
|
|
is targeted to run on hardware compliant with Power ISA 3.1 may make use of
|
|
|
|
|
this capability with a "PC-relative" addressing model.</para>
|
|
|
|
|
<para revisionflag="added">Each compilation unit must adhere entirely to
|
|
|
|
|
one addressing model or the other. However, it is expressly possible to
|
|
|
|
|
link TOC-based and PC-relative compilation units into a single
|
|
|
|
|
executable, or to dynamically link from a compilation unit with one
|
|
|
|
|
addressing model to a compilation unit with the other addressing model.
|
|
|
|
|
In particular, a PC-relative compilation unit may be linked with an
|
|
|
|
|
existing TOC-based library. Note that a "compilation unit" may consist of
|
|
|
|
|
hand-written assembly code as well as high-level source code.</para>
|
|
|
|
|
<para revisionflag="added">Compilers and other tools performing
|
|
|
|
|
link-time optimizations that repackage functions into different
|
|
|
|
|
compilation units must not mix PC-relative and TOC-based functions in
|
|
|
|
|
the same compilation unit. [To discuss: This could be permitted, but
|
|
|
|
|
the value is unclear and it would be likely to spawn occasional
|
|
|
|
|
linker bugs.] Similarly, programmers should not be allowed to
|
|
|
|
|
specify a single function in a TOC-based compilation unit to use the
|
|
|
|
|
PC-relative addressing model or vice versa; for example, using GCC's
|
|
|
|
|
"#pragma GCC target" syntax. [To discuss: How should this be recorded and
|
|
|
|
|
communicated? Perhaps add to e_flags in the ELF header for module
|
|
|
|
|
objects only? We can communicate the need for PC-relative PLT stubs
|
|
|
|
|
to the linker on calls with a reloc, so the linker may not need this,
|
|
|
|
|
but perhaps other tools will?]</para>
|
|
|
|
|
<para revisionflag="added">Details of the two addressing models will be
|
|
|
|
|
provided throughout this specification. However, a brief description
|
|
|
|
|
of each is in order.</para>
|
|
|
|
|
<section revisionflag="added" xml:id="dbdoclet.50655240_TOCBased">
|
|
|
|
|
<title revisionflag="added">TOC-Based Addressing Model</title>
|
|
|
|
|
<para revisionflag="added">In the traditional TOC-based addressing model,
|
|
|
|
|
each function uses register r2 (see <xref
|
|
|
|
|
linkend="dbdoclet.50655240_68174" />) to access global memory. A variety
|
|
|
|
|
of techniques, known as TOC-relative, TOC-indirect, GOT-relative, etc.,
|
|
|
|
|
may be used to address the global data, but all these techniques use the
|
|
|
|
|
TOC pointer r2 as part of the data reference.</para>
|
|
|
|
|
<para revisionflag="added">With the cooperation of the linker, each
|
|
|
|
|
function in a TOC-based compilation unit is responsible for the
|
|
|
|
|
establishment and maintenance of its own TOC pointer. All functions
|
|
|
|
|
within a compilation unit have the same TOC pointer, so local function
|
|
|
|
|
calls may assume it does not change. An external function call may be
|
|
|
|
|
resolved to a function in a shared object having a different TOC
|
|
|
|
|
pointer, so a caller in a TOC-based compilation unit must save its TOC
|
|
|
|
|
pointer prior to making a call outside the compilation unit, and restore
|
|
|
|
|
its value upon return before the TOC pointer may be used to access global
|
|
|
|
|
data.</para>
|
|
|
|
|
</section>
|
|
|
|
|
<section revisionflag="added" xml:id="dbdoclet.50655240_PCRel">
|
|
|
|
|
<title revisionflag="added">PC-Relative Addressing Model</title>
|
|
|
|
|
<para revisionflag="added">A function in a PC-relative compilation unit
|
|
|
|
|
has no TOC pointer. All accesses to global data are made relative to
|
|
|
|
|
the current instruction address. Since functions in TOC-based
|
|
|
|
|
compilation units are responsible for establishment and maintenance
|
|
|
|
|
of their own TOC pointers, register r2 may be used freely within a
|
|
|
|
|
PC-relative compilation unit, with no need to save or restore the
|
|
|
|
|
register when modifying it.</para>
|
|
|
|
|
</section>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_85672">
|
|
|
|
|
<title>Function Calling Sequence</title>
|
|
|
|
|
<para>The standard sequence for function calls is outlined in this section.
|
|
|
|
@ -4208,15 +4273,22 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>Nonvolatile<footnote>
|
|
|
|
|
<para>Register r2 is nonvolatile with respect to calls
|
|
|
|
|
between functions in the same compilation unit. It is saved
|
|
|
|
|
and restored by code inserted by the linker resolving a
|
|
|
|
|
call to an external function. For more information, see
|
|
|
|
|
<xref linkend="dbdoclet.50655240_51083" />.</para>
|
|
|
|
|
</footnote></para>
|
|
|
|
|
<para><phrase revisionflag="changed">In a TOC-based
|
|
|
|
|
compilation unit, register</phrase> r2 is nonvolatile with
|
|
|
|
|
respect to calls between functions in the same compilation
|
|
|
|
|
unit. It is saved and restored by code inserted by the linker
|
|
|
|
|
resolving a call to an external function. For more
|
|
|
|
|
information, see <xref linkend="dbdoclet.50655240_51083"
|
|
|
|
|
/>.</para>
|
|
|
|
|
</footnote><phrase revisionflag="added"> or
|
|
|
|
|
Volatile<footnote>
|
|
|
|
|
<para>Register r2 is volatile and available for use in
|
|
|
|
|
PC-relative compilation units.</para>
|
|
|
|
|
</footnote></phrase></para>
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>TOC pointer.</para>
|
|
|
|
|
<para>TOC pointer <phrase revisionflag="added"> for
|
|
|
|
|
TOC-based compilation units</phrase>.</para>
|
|
|
|
|
</entry>
|
|
|
|
|
</row>
|
|
|
|
|
<row>
|
|
|
|
@ -4388,7 +4460,8 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
</table>
|
|
|
|
|
<para> </para>
|
|
|
|
|
<bridgehead xml:id="dbdoclet.50655240_51083">TOC Pointer
|
|
|
|
|
Usage</bridgehead>
|
|
|
|
|
Usage <phrase revisionflag="added">(TOC-Based Compilation Units
|
|
|
|
|
Only)</phrase></bridgehead>
|
|
|
|
|
<para>As described in
|
|
|
|
|
<xref linkend="dbdoclet.50655241_73385" />, the TOC pointer, r2, is
|
|
|
|
|
commonly initialized by the global function entry point when a function
|
|
|
|
@ -4497,12 +4570,15 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
mask the value received from
|
|
|
|
|
<emphasis role="bold">mfocr</emphasis> to avoid corruption of the resulting
|
|
|
|
|
(partial) condition register word.</para>
|
|
|
|
|
<para>This erratum does not apply to the POWER9 processor.</para>
|
|
|
|
|
<para>This erratum does not apply to <phrase
|
|
|
|
|
revisionflag="changed">POWER9 and subsequent
|
|
|
|
|
processors.</phrase></para>
|
|
|
|
|
</note>
|
|
|
|
|
|
|
|
|
|
<para><anchor xml:id="dbdoclet.50655240_Power-ISA-version-and-the-user-s-manual"
|
|
|
|
|
xreflabel="" />For more information, see
|
|
|
|
|
<citetitle>Power ISA</citetitle>, version 3.0 and "Fixed-Point Invalid
|
|
|
|
|
<citetitle>Power ISA</citetitle>, version <phrase
|
|
|
|
|
revisionflag="changed">3.0B</phrase> and "Fixed-Point Invalid
|
|
|
|
|
Forms and Undefined Conditions" in
|
|
|
|
|
<citetitle>POWER9 Processor User's Manual.</citetitle></para>
|
|
|
|
|
<bridgehead>Floating-Point Registers</bridgehead>
|
|
|
|
@ -5124,8 +5200,16 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
is volatile over a function call.</para>
|
|
|
|
|
<para> </para>
|
|
|
|
|
<bridgehead>TOC Pointer Doubleword</bridgehead>
|
|
|
|
|
<para>If a function changes the value of the TOC pointer register, it
|
|
|
|
|
shall first save it in the TOC pointer doubleword.</para>
|
|
|
|
|
<para>If a function <phrase revisionflag="added">in a TOC-based
|
|
|
|
|
compilation unit</phrase> changes the value of the TOC pointer
|
|
|
|
|
register, it shall first save it in the TOC pointer doubleword.
|
|
|
|
|
<phrase revisionflag="added">The TOC pointer doubleword is reserved
|
|
|
|
|
for future use for functions in a PC-relative compilation
|
|
|
|
|
unit. [To discuss: This has implications for alloca, as if we
|
|
|
|
|
reserve it for future use, then the TOC pointer doubleword must be
|
|
|
|
|
copied during a dynamic allocation operation. I suspect it is
|
|
|
|
|
better to suffer that slight penalty rarely in order to have the
|
|
|
|
|
flexibility to use this for another future purpose.]</phrase></para>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_15141">
|
|
|
|
|
<title>Optional Save Areas</title>
|
|
|
|
@ -5252,7 +5336,8 @@ xml:id="dbdoclet.50655240_pgfId-1156194">
|
|
|
|
|
<para>Functions without a suitable declaration available to the
|
|
|
|
|
caller to determine the called function's characteristics (for
|
|
|
|
|
example, functions in C without a prototype in scope, in accordance
|
|
|
|
|
with Brian Kernighan and Dennis Ritche,
|
|
|
|
|
with Brian Kernighan and Dennis <phrase
|
|
|
|
|
revisionflag="changed">Ritchie</phrase>,
|
|
|
|
|
<citetitle>The C Programming Language</citetitle>, 1st
|
|
|
|
|
edition).</para>
|
|
|
|
|
</listitem>
|
|
|
|
@ -6220,6 +6305,16 @@ ld r12, 0(r12)
|
|
|
|
|
|
|
|
|
|
ld r12, symbol2@got(r2)
|
|
|
|
|
lvx v1, 0, r12</programlisting>
|
|
|
|
|
<itemizedlist>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
By using PC-relative addressing.
|
|
|
|
|
</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<programlisting revisionflag="added">pld r12, symbol@pcrel(0), 1
|
|
|
|
|
|
|
|
|
|
plxv vs33, symbol@pcrel(0), 1</programlisting>
|
|
|
|
|
<para>In the OpenPOWER ELF V2 ABI, position-dependent code built with
|
|
|
|
|
this addressing scheme may have a Global Offset Table (GOT) in the data
|
|
|
|
|
segment that holds addresses. (For more information, see
|
|
|
|
@ -6259,6 +6354,12 @@ lvx v1, 0, r12</programlisting>
|
|
|
|
|
loaded in the first 2 GB of the address space because direct address
|
|
|
|
|
references and TOC-pointer initializations can be performed using a
|
|
|
|
|
two-instruction sequence.</para>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
PC-relative offsets are always 34 bits for all code models, with
|
|
|
|
|
a maximum addressing reach of 16 GB. The effective addressing reach
|
|
|
|
|
for global data is 8 GB, since data sections are always located at
|
|
|
|
|
higher virtual addresses than text sections.
|
|
|
|
|
</para>
|
|
|
|
|
</section>
|
|
|
|
|
<section>
|
|
|
|
|
<title>Position-Independent Code</title>
|
|
|
|
@ -6318,6 +6419,47 @@ ld r12, 0(r12)
|
|
|
|
|
|
|
|
|
|
ld r12, symbol2@got(r2)
|
|
|
|
|
lvx v1, 0, r12</programlisting>
|
|
|
|
|
<itemizedlist>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para revisionflag="added">By using PC-relative addressing (for
|
|
|
|
|
private data).</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<programlisting revisionflag="added">pld r12, symbol@pcrel(0), 1
|
|
|
|
|
|
|
|
|
|
plxv vs33, symbol@pcrel(0), 1</programlisting>
|
|
|
|
|
<itemizedlist>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para revisionflag="added">By using PC-relative GOT-indirect
|
|
|
|
|
addressing (for shared data or very large span from code to data):
|
|
|
|
|
</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<programlisting revisionflag="added">pld r12, symbol@got@pcrel(0), 1
|
|
|
|
|
ld r12, 0(r12)
|
|
|
|
|
|
|
|
|
|
pld r12, symbol@got@pcrel(0), 1
|
|
|
|
|
lvx v1, 0, r12</programlisting>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
A compiler may generate a PC-relative addressing sequence to access
|
|
|
|
|
static or restricted-visibility data, but must generate a PC-relative
|
|
|
|
|
GOT-indirect sequence for extern data. Extern data may be satisfied
|
|
|
|
|
from a statically or dynamically linked source, so the compiler must
|
|
|
|
|
be conservative. The compiler and linker can cooperate to replace a
|
|
|
|
|
PC-relative GOT-indirect sequence with a PC-relative sequence when
|
|
|
|
|
the data reference is satisfied at static link time. See
|
|
|
|
|
<xref linkend="dbdoclet.50655241_OptPCRel" />.
|
|
|
|
|
</para>
|
|
|
|
|
<para revisionflag="added">[To discuss: I'd like to see the assembler
|
|
|
|
|
support "pld r12, symbol@pcrel" as an alternative to "pld r12,
|
|
|
|
|
symbol@pcrel(0), 1", and "pld r12, symbol@got@pcrel" as an
|
|
|
|
|
alternative to "pld r12, symbol@got@pcrel(0), 1". In general, any
|
|
|
|
|
prefix load/store with only two arguments is PC-relative; the
|
|
|
|
|
second argument is either a 34-bit offset or a GPR. Is this
|
|
|
|
|
reasonable or too confusing? Another alternative would be "pld r12,
|
|
|
|
|
symbol@pcrel(cia)" for an offset, and "pld r12, r5, cia" for the
|
|
|
|
|
GPR case. I guess we want something readable that isn't too
|
|
|
|
|
complex for the assembler to sort out.]</para>
|
|
|
|
|
<para>Position-independent executables or shared objects have a GOT in
|
|
|
|
|
the data segment that holds addresses. When the system creates a memory
|
|
|
|
|
image from the file, the GOT entries are updated to reflect the
|
|
|
|
@ -6335,6 +6477,8 @@ lvx v1, 0, r12</programlisting>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_19143">
|
|
|
|
|
<title>Code Models</title>
|
|
|
|
|
<bridgehead revisionflag="added">TOC-Based Compilation
|
|
|
|
|
Units</bridgehead>
|
|
|
|
|
<para>Compilers may provide different code models depending on the
|
|
|
|
|
expected size of the TOC and the size of the entire executable or
|
|
|
|
|
shared library.</para>
|
|
|
|
@ -6359,7 +6503,8 @@ lvx v1, 0, r12</programlisting>
|
|
|
|
|
addition, accesses to module-local code and data objects use TOC
|
|
|
|
|
pointer relative addressing with 32-bit offsets. Using TOC pointer
|
|
|
|
|
relative addressing removes a level of indirection, resulting in
|
|
|
|
|
faster access and a smaller GOT. However. it limits the size of the
|
|
|
|
|
faster access and a smaller GOT. <phrase
|
|
|
|
|
revisionflag="changed">However,</phrase> it limits the size of the
|
|
|
|
|
entire binary to between 2 GB and 4 GB, depending on the placement
|
|
|
|
|
of the TOC base.</para>
|
|
|
|
|
<note>
|
|
|
|
@ -6379,6 +6524,53 @@ lvx v1, 0, r12</programlisting>
|
|
|
|
|
TOCs, or by some other method. The suggested allocation order of
|
|
|
|
|
sections is provided in
|
|
|
|
|
<xref linkend="dbdoclet.50655241_66700" />.</para>
|
|
|
|
|
<bridgehead revisionflag="added">PC-Relative Compilation
|
|
|
|
|
Units</bridgehead>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
Compilers may provide different code models depending on the size of
|
|
|
|
|
the entire executable or shared library. There is no small code
|
|
|
|
|
model for PC-relative compilation units.
|
|
|
|
|
</para>
|
|
|
|
|
<itemizedlist revisionflag="added">
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>
|
|
|
|
|
Medium code model: Accesses to module-local code and data objects
|
|
|
|
|
use PC-relative addressing with 34-bit offsets.
|
|
|
|
|
Position-independent code uses PC-relative GOT-indirect
|
|
|
|
|
addressing to access other objects in the binary.
|
|
|
|
|
</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>
|
|
|
|
|
Large code model: Used when 34-bit offsets are insufficient to
|
|
|
|
|
reach global data or the GOT from at least one text section;
|
|
|
|
|
this is similar to the medium code model, except that up to
|
|
|
|
|
64-bit PC-relative offsets are used by generating them into a
|
|
|
|
|
register. [To discuss: None of the options for this seem ideal.
|
|
|
|
|
It takes about 5 instructions to generate a 64-bit constant into
|
|
|
|
|
a register, though we can perhaps use linker optimizations to
|
|
|
|
|
replace with a smaller sequence when available. A second choice
|
|
|
|
|
is to place the offset in a .quad in the text section to reach
|
|
|
|
|
the .got entry, but this would incur a load-load dependency.
|
|
|
|
|
(Are there cases where this requires a text relocation resolution
|
|
|
|
|
during dynamic linking?) A third choice is to fail the compile
|
|
|
|
|
and require TOC addressing with large code model when 34-bit
|
|
|
|
|
offsets aren't enough, though that doesn't initially seem
|
|
|
|
|
reasonable. Whatever we choose, we should document the sequence
|
|
|
|
|
and any associated linker optimizations.]
|
|
|
|
|
</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
As with TOC-based compilation units, the medium code model is the
|
|
|
|
|
default for compilers, and is applicable to most programs and
|
|
|
|
|
libraries. The code examples in this document generally use the
|
|
|
|
|
medium code model.
|
|
|
|
|
</para>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
When linking PC-relative relocatable objects, the linker should
|
|
|
|
|
attempt to place the .got section near the text sections.
|
|
|
|
|
</para>
|
|
|
|
|
</section>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_12107">
|
|
|
|
@ -6387,9 +6579,50 @@ lvx v1, 0, r12</programlisting>
|
|
|
|
|
section.</para>
|
|
|
|
|
<section xml:id="dbdoclet.50655240___RefHeading___Toc377640597">
|
|
|
|
|
<title>Function Prologue</title>
|
|
|
|
|
<para>A function's prologue establishes addressability by initializing
|
|
|
|
|
a TOC pointer in register r2, if necessary, and a stack frame, if
|
|
|
|
|
necessary, and may save any nonvolatile registers it uses.</para>
|
|
|
|
|
<para revisionflag="added">The function prologue is responsible for
|
|
|
|
|
the following functions:</para>
|
|
|
|
|
<itemizedlist revisionflag="added">
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>Establishing addressability to global data</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>Creating a stack frame when required</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>Saving any nonvolatile registers that are used by the
|
|
|
|
|
function</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>Saving any limited-access bits that are used by the function,
|
|
|
|
|
per the rules described in <xref
|
|
|
|
|
linkend="dbdoclet.50655240___RefHeading___Toc377640581" /></para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<para revisionflag="added">This ABI shall be used in conjunction with
|
|
|
|
|
the Power Architecture that implements the
|
|
|
|
|
<emphasis role="bold">mfocrf</emphasis> architecture level. Further,
|
|
|
|
|
OpenPOWER-compliant processors shall implement implementation-defined
|
|
|
|
|
bits in a manner to allow the combination of multiple
|
|
|
|
|
<emphasis role="bold">mfocrf</emphasis> results with an OR instruction;
|
|
|
|
|
for example, to yield a word in r0 including all three preserved CRs as
|
|
|
|
|
follows:</para>
|
|
|
|
|
<programlisting revisionflag="added">mfocrf r0, crf2
|
|
|
|
|
mfocrf r1, crf3
|
|
|
|
|
or r0, r0, r1
|
|
|
|
|
mfocrf r1, crf4
|
|
|
|
|
or r0, r0, r1</programlisting>
|
|
|
|
|
<para revisionflag="added">Specifically, this allows each
|
|
|
|
|
OpenPOWER-compliant processor implementation to set each field to hold
|
|
|
|
|
either 0 or the correct in-order value of the corresponding CR field at
|
|
|
|
|
the point where the <emphasis role="bold">mfocrf</emphasis>
|
|
|
|
|
instruction is performed.</para>
|
|
|
|
|
<bridgehead revisionflag="added">TOC-Based Compilation
|
|
|
|
|
Units</bridgehead>
|
|
|
|
|
<para><phrase revisionflag="changed">In a TOC-based compilation unit,
|
|
|
|
|
a</phrase> function's prologue establishes addressability by
|
|
|
|
|
initializing a TOC pointer in register r2, if necessary, and a stack
|
|
|
|
|
frame, if necessary, and may save any nonvolatile registers it
|
|
|
|
|
uses.</para>
|
|
|
|
|
<para>All functions have a global entry point (GEP) available to any
|
|
|
|
|
caller and pointing to the beginning of the prologue. Some functions
|
|
|
|
|
may have a secondary entry point to optimize the cost of TOC pointer
|
|
|
|
@ -6420,9 +6653,10 @@ addi r2, r2, .TOC.-func@l</programlisting>
|
|
|
|
|
form that is faster due to instruction fusion, such as:</para>
|
|
|
|
|
<programlisting>lis r2, .TOC.@ha
|
|
|
|
|
addi r2, r2, .TOC.@l</programlisting>
|
|
|
|
|
<para>In addition to establishing addressability, the function prologue
|
|
|
|
|
<para revisionflag="deleted">In addition to establishing
|
|
|
|
|
addressability, the function prologue
|
|
|
|
|
is responsible for the following functions:</para>
|
|
|
|
|
<itemizedlist>
|
|
|
|
|
<itemizedlist revisionflag="deleted">
|
|
|
|
|
<listitem>
|
|
|
|
|
<para>Creating a stack frame when required</para>
|
|
|
|
|
</listitem>
|
|
|
|
@ -6436,24 +6670,25 @@ addi r2, r2, .TOC.@l</programlisting>
|
|
|
|
|
<xref linkend="dbdoclet.50655240___RefHeading___Toc377640581" /></para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</itemizedlist>
|
|
|
|
|
<para>This ABI shall be used in conjunction with the Power Architecture
|
|
|
|
|
that implements the
|
|
|
|
|
<para revisionflag="deleted">This ABI shall be used in conjunction with
|
|
|
|
|
the Power Architecture that implements the
|
|
|
|
|
<emphasis role="bold">mfocrf</emphasis> architecture level. Further,
|
|
|
|
|
OpenPOWER-compliant processors shall implement implementation-defined
|
|
|
|
|
bits in a manner to allow the combination of multiple
|
|
|
|
|
<emphasis role="bold">mfocrf</emphasis> results with an OR instruction; for example,
|
|
|
|
|
to yield a word in r0 including all three preserved CRs as
|
|
|
|
|
follows:</para>
|
|
|
|
|
<programlisting>mfocrf r0, crf2
|
|
|
|
|
<programlisting revisionflag="deleted">mfocrf r0, crf2
|
|
|
|
|
mfocrf r1, crf3
|
|
|
|
|
or r0, r0, r1
|
|
|
|
|
mfocrf r1, crf4
|
|
|
|
|
or r0, r0, r1</programlisting>
|
|
|
|
|
<para>Specifically, this allows each OpenPOWER-compliant processor
|
|
|
|
|
implementation to set each field to hold either 0 or the correct
|
|
|
|
|
in-order value of the corresponding CR field at the point where the
|
|
|
|
|
<emphasis role="bold">mfocrf</emphasis> instruction is performed.</para>
|
|
|
|
|
<para> </para>
|
|
|
|
|
<para revisionflag="deleted">Specifically, this allows each
|
|
|
|
|
OpenPOWER-compliant processor implementation to set each field to hold
|
|
|
|
|
either 0 or the correct in-order value of the corresponding CR field at
|
|
|
|
|
the point where the <emphasis role="bold">mfocrf</emphasis>
|
|
|
|
|
instruction is performed.</para>
|
|
|
|
|
<para revisionflag="deleted"> </para>
|
|
|
|
|
<bridgehead>Assembly Language Syntax for Defining Entry
|
|
|
|
|
Points</bridgehead>
|
|
|
|
|
<para>When a function has two entry points, the global entry point is
|
|
|
|
@ -6472,6 +6707,14 @@ or r0, r0, r1</programlisting>
|
|
|
|
|
the meaning of the second parameter, which is put in the three
|
|
|
|
|
most-significant bits of the st_other field in the ELF Symbol Table
|
|
|
|
|
entry.</para>
|
|
|
|
|
<bridgehead revisionflag="added">PC-Relative Compilation
|
|
|
|
|
Units</bridgehead>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
In a PC-relative compilation unit, the function prologue does not
|
|
|
|
|
require any setup code to establish addressability to global data.
|
|
|
|
|
Therefore, there is also no need for a function to have a separate
|
|
|
|
|
local entry point.
|
|
|
|
|
</para>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_13754">
|
|
|
|
|
<title>Function Epilogue</title>
|
|
|
|
@ -6884,11 +7127,13 @@ _restvr_31: addi r12,r0,-16
|
|
|
|
|
<xref linkend="dbdoclet.50655242_page119" /> shows an example of this
|
|
|
|
|
method.</para>
|
|
|
|
|
<para>Examples of absolute and position-independent compilations are
|
|
|
|
|
shown in
|
|
|
|
|
<xref linkend="dbdoclet.50655240_12719" />,
|
|
|
|
|
<xref linkend="dbdoclet.50655240_page77" />, and
|
|
|
|
|
<xref linkend="dbdoclet.50655240_19926" />. These examples show the C
|
|
|
|
|
language statements together with the generated assembly language. The
|
|
|
|
|
shown in <phrase revisionflag="changed"><xref
|
|
|
|
|
linkend="dbdoclet.50655240_12719" />,
|
|
|
|
|
<xref linkend="dbdoclet.50655240_page77" />,
|
|
|
|
|
<xref linkend="dbdoclet.50655240_19926" />, and
|
|
|
|
|
<xref linkend="dbdoclet.50655240_StaticPCRel" /></phrase>. These
|
|
|
|
|
examples show the
|
|
|
|
|
C language statements together with the generated assembly language. The
|
|
|
|
|
assumption for these figures is that only executables can use absolute
|
|
|
|
|
addressing while shared objects must use position-independent code
|
|
|
|
|
addressing. The figures are intended to demonstrate the compilation of
|
|
|
|
@ -7151,6 +7396,60 @@ stw r0,0,(r7)</programlisting>
|
|
|
|
|
</tbody>
|
|
|
|
|
</tgroup>
|
|
|
|
|
</table>
|
|
|
|
|
|
|
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655240_StaticPCRel"
|
|
|
|
|
revisionflag="added">
|
|
|
|
|
<title>PC-Relative Load and Store</title>
|
|
|
|
|
<tgroup cols="2">
|
|
|
|
|
<colspec colname="c1" colwidth="30*" />
|
|
|
|
|
<colspec colname="c2" colwidth="70*" />
|
|
|
|
|
<thead>
|
|
|
|
|
<row>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>
|
|
|
|
|
<emphasis role="bold">C Code</emphasis>
|
|
|
|
|
</para>
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>
|
|
|
|
|
<emphasis role="bold">Assembly Code</emphasis>
|
|
|
|
|
</para>
|
|
|
|
|
</entry>
|
|
|
|
|
</row>
|
|
|
|
|
</thead>
|
|
|
|
|
<tbody>
|
|
|
|
|
<row>
|
|
|
|
|
<entry>
|
|
|
|
|
<programlisting>extern int src;
|
|
|
|
|
extern int dst;
|
|
|
|
|
int *ptr;
|
|
|
|
|
|
|
|
|
|
dst = src;
|
|
|
|
|
|
|
|
|
|
ptr = &dst;
|
|
|
|
|
|
|
|
|
|
*ptr = src;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</programlisting>
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<programlisting>.extern src
|
|
|
|
|
.extern dst
|
|
|
|
|
.extern ptr
|
|
|
|
|
.section ".text"
|
|
|
|
|
plwz r9, src@pcrel(0), 1
|
|
|
|
|
pstw r9, dst@pcrel(0), 1
|
|
|
|
|
paddi r11, 0, dst@pcrel, 1
|
|
|
|
|
pstd r11, ptr@pcrel(0), 1
|
|
|
|
|
pld r11, ptr@pcrel(0), 1
|
|
|
|
|
plwz r9, src@pcrel(0), 1
|
|
|
|
|
stw r9, 0(r11)</programlisting>
|
|
|
|
|
</entry>
|
|
|
|
|
</row>
|
|
|
|
|
</tbody>
|
|
|
|
|
</tgroup>
|
|
|
|
|
</table>
|
|
|
|
|
<note>
|
|
|
|
|
<itemizedlist>
|
|
|
|
|
<listitem>
|
|
|
|
@ -7311,9 +7610,16 @@ nop</programlisting>
|
|
|
|
|
<xref linkend="dbdoclet.50655242_20388" />.</para>
|
|
|
|
|
</listitem>
|
|
|
|
|
</orderedlist>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
For a function call in a PC-relative compilation unit, the nop in
|
|
|
|
|
<xref linkend="dbdoclet.50655240_85319" /> should not be generated.
|
|
|
|
|
</para>
|
|
|
|
|
<para>For indirect function calls, the address of the function to be
|
|
|
|
|
called is placed in r12 and the CTR register. A bctrl instruction is used
|
|
|
|
|
to perform the indirect branch as shown in
|
|
|
|
|
to perform the indirect branch as shown in
|
|
|
|
|
<phrase revisionflag="added">
|
|
|
|
|
<xref linkend="dbdoclet.50655240_95364" />,
|
|
|
|
|
</phrase>
|
|
|
|
|
<xref linkend="dbdoclet.50655240_16744" />, and
|
|
|
|
|
<xref linkend="dbdoclet.50655240_95225" />. The ELF V2 ABI requires the
|
|
|
|
|
address of the called function to be in r12 when a cross-module function
|
|
|
|
@ -7381,7 +7687,11 @@ bctrl</programlisting>
|
|
|
|
|
</table -->
|
|
|
|
|
<para>
|
|
|
|
|
<xref linkend="dbdoclet.50655240_16744" /> shows how to make an indirect
|
|
|
|
|
function call using small-model position-independent code.</para>
|
|
|
|
|
function call using small-model position-independent code.
|
|
|
|
|
<phrase revisionflag="added">Note that the store and reload of the
|
|
|
|
|
TOC pointer r2 is not required in a PC-relative compilation
|
|
|
|
|
unit.</phrase>
|
|
|
|
|
</para>
|
|
|
|
|
|
|
|
|
|
<figure xml:id="dbdoclet.50655240_16744">
|
|
|
|
|
<title>Small-Model Position-Independent Indirect Function Call</title>
|
|
|
|
@ -7451,7 +7761,11 @@ ld r2,24(r1)</programlisting>
|
|
|
|
|
</table -->
|
|
|
|
|
<para>
|
|
|
|
|
<xref linkend="dbdoclet.50655240_95225" /> shows how to make an indirect
|
|
|
|
|
function call using large-model position-independent code.</para>
|
|
|
|
|
function call using large-model position-independent code.
|
|
|
|
|
<phrase revisionflag="added">Note that the store and reload of the
|
|
|
|
|
TOC pointer r2 is not required in a PC-relative compilation
|
|
|
|
|
unit.</phrase>
|
|
|
|
|
</para>
|
|
|
|
|
|
|
|
|
|
<figure xml:id="dbdoclet.50655240_95225">
|
|
|
|
|
<title>Large-Model Position-Independent Indirect Function Call</title>
|
|
|
|
@ -7521,6 +7835,7 @@ ld r2,24(r1)</programlisting>
|
|
|
|
|
</tbody>
|
|
|
|
|
</tgroup>
|
|
|
|
|
</table -->
|
|
|
|
|
<bridgehead revisionflag="added">TOC-Based Compilation Units</bridgehead>
|
|
|
|
|
<para>Function calls need to be performed in conjunction with
|
|
|
|
|
establishing, maintaining, and restoring addressability through the TOC
|
|
|
|
|
pointer register, r2. When a function is called, the TOC pointer register
|
|
|
|
@ -7553,6 +7868,19 @@ bl target
|
|
|
|
|
<xref linkend="dbdoclet.50655240___RefHeading___Toc377640597" />,
|
|
|
|
|
<xref linkend="dbdoclet.50655241_95185" />, and
|
|
|
|
|
<xref linkend="dbdoclet.50655241_47572" />.</para>
|
|
|
|
|
<bridgehead revisionflag="added">PC-Relative Compilation
|
|
|
|
|
Units</bridgehead>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
As with TOC-based compilation units, for calls to functions resolved at
|
|
|
|
|
runtime, the linker must generate stub code to load the function
|
|
|
|
|
address from the PLT. When the stub code is generated on behalf of
|
|
|
|
|
an indirect call in a PC-relative compilation unit, the linker may
|
|
|
|
|
omit the save and restore of r2 from the stub code. This behavior
|
|
|
|
|
is optional but recommended. Calls in PC-relative code should not
|
|
|
|
|
be marked with the R_PPC64_TOCSAVE or R_PPC64_REL24_NOTOC relocations.
|
|
|
|
|
[To discuss: Do we need a relocation to identify this as a PC-relative
|
|
|
|
|
call?]
|
|
|
|
|
</para>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_47036">
|
|
|
|
|
<title>Branching</title>
|
|
|
|
@ -7947,6 +8275,75 @@ f1:
|
|
|
|
|
.long .TOC. - Ldefault
|
|
|
|
|
.long .TOC. - Lcase13</programlisting>
|
|
|
|
|
</figure>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
<xref linkend="dbdoclet.50655240_PCRelSwitch" /> shows a switch
|
|
|
|
|
implementation for PC-relative compilation units. [TBD: This needs to
|
|
|
|
|
be a figure, not a table, which may require working with Annette and
|
|
|
|
|
FrameMaker to get something that looks similar to the other figures.
|
|
|
|
|
All we have in the document for the other figures is .png files from
|
|
|
|
|
the old FrameMaker version. Or maybe we should just convert all the
|
|
|
|
|
other figures to tables.]
|
|
|
|
|
</para>
|
|
|
|
|
<table frame="all" pgwide="1" xml:id="dbdoclet.50655240_PCRelSwitch"
|
|
|
|
|
revisionflag="added">
|
|
|
|
|
<title>
|
|
|
|
|
Position-Independent Switch Code (PC-Relative Addressing)
|
|
|
|
|
</title>
|
|
|
|
|
<tgroup cols="2">
|
|
|
|
|
<colspec colname="c1" colwidth="30*" />
|
|
|
|
|
<colspec colname="c2" colwidth="70*" />
|
|
|
|
|
<thead>
|
|
|
|
|
<row>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>
|
|
|
|
|
<emphasis role="bold">C Code</emphasis>
|
|
|
|
|
</para>
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<para>
|
|
|
|
|
<emphasis role="bold">Assembly Code</emphasis>
|
|
|
|
|
</para>
|
|
|
|
|
</entry>
|
|
|
|
|
</row>
|
|
|
|
|
</thead>
|
|
|
|
|
<tbody>
|
|
|
|
|
<row>
|
|
|
|
|
<entry>
|
|
|
|
|
<programlisting>switch(j)
|
|
|
|
|
{
|
|
|
|
|
case 0:
|
|
|
|
|
...
|
|
|
|
|
case 1:
|
|
|
|
|
...
|
|
|
|
|
case 3:
|
|
|
|
|
...
|
|
|
|
|
default:
|
|
|
|
|
...
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
</programlisting>
|
|
|
|
|
</entry>
|
|
|
|
|
<entry>
|
|
|
|
|
<programlisting> cmplwi r12, 4
|
|
|
|
|
bge .Ldefault
|
|
|
|
|
        slwi r12, r12, 2
|
|
|
|
|
        paddi r10, 0, .Ltab@pcrel, 1
|
|
|
|
|
lwax r8, r10, r12
|
|
|
|
|
add r10, r8, r10
|
|
|
|
|
mtctr r10
|
|
|
|
|
bctr
|
|
|
|
|
.p2align 2
|
|
|
|
|
.Ltab:
|
|
|
|
|
.word (.Lcase0-.Ltab)
|
|
|
|
|
.word (.Lcase1-.Ltab)
|
|
|
|
|
.word (.Ldefault-.Ltab)
|
|
|
|
|
.word (.Lcase3-.Ltab)</programlisting>
|
|
|
|
|
</entry>
|
|
|
|
|
</row>
|
|
|
|
|
</tbody>
|
|
|
|
|
</tgroup>
|
|
|
|
|
</table>
|
|
|
|
|
</section>
|
|
|
|
|
<section xml:id="dbdoclet.50655240_32686">
|
|
|
|
|
<title>Dynamic Stack Space Allocation</title>
|
|
|
|
@ -8019,6 +8416,11 @@ addi r3,r1,p ; R3 = new data area following parameter save area.</pro
|
|
|
|
|
a value that needs to be preserved. In the future, if it is defined and
|
|
|
|
|
if the function uses the Reserved word, the LR save doubleword must also
|
|
|
|
|
be copied.</para>
|
|
|
|
|
<para revisionflag="added">
|
|
|
|
|
It is unnecessary to copy the TOC pointer doubleword for a
|
|
|
|
|
PC-relative compilation unit. [To discuss: Should we, for future
|
|
|
|
|
use of this slot for another purpose?]
|
|
|
|
|
</para>
|
|
|
|
|
<note>
|
|
|
|
|
<para>Additional instructions will be necessary for an allocation of
|
|
|
|
|
variable size. If a dynamic deallocation will occur, the r1 stack
|
|
|
|
|