Changed to consistently use Power versus POWER, Power ISA versus PowerISA, etc.  Added graphic to vec_gb.
pull/69/head
Bill Schmidt 5 years ago
parent ec386314da
commit b2e4fce15b

@ -18,32 +18,32 @@
xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian">
<!-- Chapter Title goes here. -->
<title>The POWER Bi-Endian Vector Programming Model</title>
<title>The Power Bi-Endian Vector Programming Model</title>

<para>
To ensure portability of applications optimized to exploit the
SIMD functions of POWER ISA processors, this reference defines a
SIMD functions of Power ISA processors, this reference defines a
set of functions and data types for SIMD programming. Compliant
compilers will provide suitable support for these functions,
preferably as built-in functions that translate to one or more
POWER ISA instructions.
Power ISA instructions.
</para>
<para>
Compilers are encouraged, but not required, to provide built-in
functions to access individual instructions in the IBM POWER®
functions to access individual instructions in the IBM Power®
instruction set architecture. In most cases, each such built-in
function should provide direct access to the underlying
instruction.
</para>
<para>
However, to ease porting between little-endian (LE) and big-endian
(BE) POWER systems, and between POWER and other platforms, it is
(BE) Power systems, and between Power and other platforms, it is
preferable that some built-in functions provide the same semantics
on both LE and BE POWER systems, even if this means that the
on both LE and BE Power systems, even if this means that the
built-in functions are implemented with different instruction
sequences for LE and BE. To achieve this, vector built-in
functions provide a set of functions derived from the set of
hardware functions provided by the POWER SIMD instructions. Unlike
hardware functions provided by the Power SIMD instructions. Unlike
traditional “hardware intrinsic” built-in functions, no fixed
mapping exists between these built-in functions and the generated
hardware instruction sequence. Rather, the compiler is free to
@ -52,13 +52,13 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian">
built-in functions.
</para>
<para>
As we've seen, the POWER SIMD instructions operate on groups of 1,
As we've seen, the Power SIMD instructions operate on groups of 1,
2, 4, 8, or 16 vector elements at a time in 128-bit registers. On
a big-endian POWER platform, vector elements are loaded from
a big-endian Power platform, vector elements are loaded from
memory into a register so that the 0th element occupies the
high-order bits of the register, and the (N &#8211; 1)th element
occupies the low-order bits of the register. This is referred to
as big-endian element order. On a little-endian POWER platform,
as big-endian element order. On a little-endian Power platform,
vector elements are loaded from memory such that the 0th element
occupies the low-order bits of the register, and the (N &#8211;
1)th element occupies the high-order bits. This is referred to as
@ -68,7 +68,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.biendian">
<note>
<para>
Much of the information in this chapter was formerly part of
Chapter 6 of the 64-Bit ELF V2 ABI Specification for POWER.
Chapter 6 of the 64-Bit ELF V2 ABI Specification for Power.
</para>
</note>

@ -123,7 +123,7 @@ vector double g = (vector double) { 3.5, -24.6 };</programlisting>
For the C and C++ programming languages (and related/derived
languages), these data types may be accessed based on the type
names listed in <xref linkend="VIPR.biendian.vectypes" /> when
POWER SIMD language extensions are enabled using either the
Power SIMD language extensions are enabled using either the
<code>vector</code> or <code>__vector</code> keywords.
</para>
<para>
@ -478,7 +478,7 @@ register vector double vd = vec_splats(*double_ptr);</programlisting>
<title>Vector Operators</title>
<para>
In addition to the dereference and assignment operators, the
POWER Bi-Endian Vector Programming Model provides the usual
Power Bi-Endian Vector Programming Model provides the usual
operators that are valid on pointers; these operators are also
valid for pointers to vector types.
</para>
@ -589,7 +589,7 @@ register vector double vd = vec_splats(*double_ptr);</programlisting>
<section>
<title>Vector Built-In Functions</title>
<para>
Some of the POWER SIMD hardware instructions refer, implicitly
Some of the Power SIMD hardware instructions refer, implicitly
or explicitly, to vector element numbers. For example, the
<code>vspltb</code> instruction has as one of its inputs an
index into a vector. The element at that index position is to
@ -650,7 +650,7 @@ register vector double vd = vec_splats(*double_ptr);</programlisting>
</entry>
<entry>
<para>
<emphasis role="bold">Corresponding POWER
<emphasis role="bold">Corresponding Power
Instructions</emphasis>
</para>
</entry>
@ -761,7 +761,7 @@ register vector double vd = vec_splats(*double_ptr);</programlisting>
(Deprecated)</title>
<para>
Versions 1.0 through 1.4 of the 64-Bit ELFv2 ABI Specification
for POWER provided for optional compiler support for using
for Power provided for optional compiler support for using
big-endian element ordering in little-endian environments.
This was initially deemed useful for porting certain libraries
that assumed big-endian element ordering regardless of the
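
The element-order rule stated in this file (element 0 in the high-order bits for big-endian order, element 0 in the low-order bits for little-endian order) can be illustrated with a small portable model. This is only a sketch, not how any compiler implements vector loads: the function name `load_words_model`, the `little_endian` flag, and the choice to store the modeled 128-bit register as 16 bytes with `reg[0]` as the most significant byte are all assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model of loading four 32-bit elements into a 128-bit
 * register. reg[0] is the MOST significant byte of the modeled register
 * and reg[15] the least significant byte (an assumption of this sketch).
 */
void load_words_model(const uint32_t elems[4], int little_endian,
                      uint8_t reg[16])
{
    for (unsigned e = 0; e < 4; e++) {
        /* Big-endian element order: element 0 takes the high-order slot.
         * Little-endian element order: element 0 takes the low-order slot. */
        unsigned slot = little_endian ? 3u - e : e;

        /* Write the element's value into its slot, most significant
         * byte first, so the slot holds the same numeric value. */
        for (unsigned b = 0; b < 4; b++)
            reg[4 * slot + b] = (uint8_t)(elems[e] >> (8 * (3 - b)));
    }
}
```

In this model the same array produces mirrored element placement in the two modes, which is why a built-in function that names an element number may need different instruction sequences for LE and BE to keep the same semantics.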

@ -18,12 +18,12 @@
xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
<!-- Chapter Title goes here. -->
<title>Introduction to Vector Programming on POWER</title>
<title>Introduction to Vector Programming on Power</title>

<section>
<title>A Brief History</title>
<para>
The history of vector programming on POWER processors begins
The history of vector programming on Power processors begins
with the AIM (Apple, IBM, Motorola) alliance in the 1990s. The
AIM partners developed the Power Vector Media Extension (VMX) to
accelerate multimedia applications, particularly image
@ -87,15 +87,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
a VSR can now contain a single 128-bit integer; and starting
with POWER9, a VSR can contain a single 128-bit floating-point
value. The VMX and VSX instruction sets together may be
referred to as the POWER SIMD (single-instruction,
referred to as the Power SIMD (single-instruction,
multiple-data) instructions.
</para>
<section>
<title>Little-Endian Linux</title>
<para>
The POWER architecture has supported operation in either
The Power architecture has supported operation in either
big-endian (BE) or little-endian (LE) mode from the
beginning. However, IBM's POWER servers were only shipped
beginning. However, IBM's Power servers were only shipped
with big-endian operating systems (AIX, Linux, i5/OS) prior to
the introduction of POWER8. With POWER8, IBM began
supporting little-endian Linux distributions for the first
@ -106,7 +106,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
currently used only for little-endian Linux.
</para>
<para>
Although POWER has always supported big- and little-endian
Although Power has always supported big- and little-endian
memory accesses, the introduction of vector register support
added a layer of complexity to programming for processors
operating in different endian modes. Arrays of elements
@ -137,7 +137,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
<para>
The vector-scalar registers can be addressed with VSX
instructions, for vector and scalar processing of all 64
registers, or with the "classic" POWER floating-point
registers, or with the "classic" Power floating-point
instructions to refer to a 32-register subset of these, having
64 bits per register. They can also be addressed with VMX
instructions to refer to a 32-register subset of 128-bit registers.
@ -198,6 +198,16 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
</emphasis>
</para>
</listitem>
<listitem>
<para>
<emphasis>Power Instruction Set Architecture</emphasis>,
Version 3.0B Specification.
<emphasis>
<link xlink:href="https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0">https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
</link>
</emphasis>
</para>
</listitem>
<listitem>
<para>
<emphasis>Power Vector Library.</emphasis>

@ -31,11 +31,11 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
intrinsics the best way to ensure that the compiler does exactly
what you want? Well, sometimes. But the problem is that the
best instruction sequence today may not be the best instruction
sequence tomorrow. As the PowerISA moves forward, new
sequence tomorrow. As the Power ISA moves forward, new
instruction capabilities appear, and the old code you wrote can
easily become obsolete. Then you start having to create
different versions of the code for different levels of the
PowerISA, and it can quickly become difficult to maintain.
Power ISA, and it can quickly become difficult to maintain.
</para>
<para>
Most often programmers use vector intrinsics to increase the
@ -141,7 +141,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<para>
This reference provides intrinsics that are guaranteed to be
portable across compliant compilers. In particular, both the
GCC and Clang compilers for POWER implement the intrinsics in
GCC and Clang compilers for Power implement the intrinsics in
this manual. The compilers may each implement many more
intrinsics, but the ones in this manual are the only ones
guaranteed to be portable. So if you are using an interface not
@ -151,7 +151,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<para>
There are also other vector APIs that may be of use to you (see
<xref linkend="VIPR.techniques.apis" />). In particular, the
POWER Vector Library (see <xref
Power Vector Library (see <xref
linkend="VIPR.techniques.pveclib" />) provides additional
portability across compiler versions.
</para>
@ -221,7 +221,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
and will not necessarily perform optimally (although in many
cases the performance is very good). Using these headers is
often a good first step in porting a library using Intel
intrinsics to POWER, after which more detailed rewriting of
intrinsics to Power, after which more detailed rewriting of
algorithms is usually desirable for best performance.
</para>
<para>
@ -231,8 +231,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
</para>
</section>
<section xml:id="VIPR.techniques.pveclib">
<title>The POWER Vector Library (pveclib)</title>
<para>The POWER Vector Library, also known as
<title>The Power Vector Library (pveclib)</title>
<para>The Power Vector Library, also known as
<code>pveclib</code>, is a separate project available from
github (see <xref linkend="VIPR.intro.links" />). The
<code>pveclib</code> project builds on top of the intrinsics
@ -244,10 +244,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<listitem>
<para>
Providing equivalent functions across versions of the
PowerISA. For example, the <emphasis>Vector
Power ISA. For example, the <emphasis>Vector
Multiply-by-10 Unsigned Quadword</emphasis> operation
introduced in PowerISA 3.0 (POWER9) can be implemented
using a few vector instructions on earlier PowerISA
introduced in Power ISA 3.0 (POWER9) can be implemented
using a few vector instructions on earlier Power ISA
versions.
</para>
</listitem>
@ -262,7 +262,7 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<listitem>
<para>
Providing higher-order functions not provided directly by
the PowerISA. One example is a vector SIMD implementation
the Power ISA. One example is a vector SIMD implementation
for ASCII <code>__isalpha</code> and similar functions.
Another example is full <code>__int128</code>
implementations of <emphasis>Count Leading

@ -15594,15 +15594,28 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="VIPR.vec-ref">
<emphasis role="bold">r</emphasis> is set to the value of the
<emphasis>i</emphasis>th bit of the <emphasis>j</emphasis>th byte
element of <emphasis role="bold">a</emphasis>.</para>
<para>
<xref linkend="VIPR.ch-vec.vec_gb" />, taken from the
Power ISA, shows how bits are combined by the
<code>vec_gb</code> intrinsic. Here <code>VR[VRT]</code> is
equivalent to <emphasis role="bold">r</emphasis>, and
<code>VR[VRB]</code> is equivalent to <emphasis
role="bold">a</emphasis>.
</para>
<figure pgwide="1" xml:id="VIPR.ch-vec.vec_gb">
<title>Operation of vec_gb</title>
<mediaobject>
<imageobject>
<imagedata fileref="vgbbd.png" format="PNG"
scalefit="1" width="100%" />
</imageobject>
</mediaobject>
</figure>
<para><emphasis role="bold">Endian considerations:</emphasis>
The <emphasis role="bold">vec_gb</emphasis> intrinsic function assumes
big-endian (left-to-right) numbering for both bits and bytes, matching
the ISA 2.07 <emphasis role="bold">vgbbd</emphasis> instruction.
</para>
<para><emphasis role="bold">Notes:</emphasis>
<emphasis>Try to get the diagram from the ISA manual to include
here.</emphasis>
</para>
<indexterm>
<primary>vgbbd</primary>
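
For readers of this hunk, the bit gathering that the new figure depicts can also be sketched as a scalar reference model. This is a minimal sketch under stated assumptions, not the compiler's or the ISA's implementation: the name `gather_bits_model` is hypothetical, the 16 bytes are indexed left to right (index 0 is the leftmost byte), and bit 0 of a byte is its most significant bit, following the big-endian numbering that the endian note above says `vec_gb` assumes. Each 8-byte half is treated as an 8x8 bit matrix and transposed.

```c
#include <stdint.h>

/* Return bit i of a byte using big-endian bit numbering (bit 0 = MSB). */
static unsigned get_bit_be(uint8_t byte, unsigned i)
{
    return (byte >> (7u - i)) & 1u;
}

/*
 * Hypothetical scalar model of the gather-bits-by-bytes operation.
 * Within each doubleword, bit k of result byte j is taken from
 * bit j of source byte k (an 8x8 bit-matrix transpose).
 */
void gather_bits_model(const uint8_t a[16], uint8_t r[16])
{
    for (unsigned dw = 0; dw < 2; dw++) {          /* two doublewords    */
        for (unsigned j = 0; j < 8; j++) {         /* result byte index  */
            uint8_t out = 0;
            for (unsigned k = 0; k < 8; k++) {     /* result bit index   */
                unsigned bit = get_bit_be(a[8 * dw + k], j);
                out |= (uint8_t)(bit << (7u - k));
            }
            r[8 * dw + j] = out;
        }
    }
}
```

Because the operation is a transpose, applying the model twice returns the original bytes, which is a convenient sanity check when comparing against real `vec_gb` results on hardware.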

Binary file not shown.
