Incorporate changes following Paul Clarke's admirable review

Signed-off-by: Bill Schmidt <wschmidt@linux.ibm.com>
Bill Schmidt 3 years ago
parent ae4cd5ccc6
commit 321ac9e713

@ -804,7 +804,7 @@ a[3] = c;</programlisting>
</entry>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_signextll"
<code><xref linkend="vec_signextq"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
@ -817,10 +817,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergee" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_signextq"
xrefstyle="select:title nopage"/></code>
</para>
<para><code><xref linkend="vec_sld" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -831,7 +828,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergeh" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sld" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_sldw" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -845,7 +842,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergel" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sldw" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_sll" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -859,7 +856,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mergeo" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sll" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_slo" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -870,7 +867,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mfvscr" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_slo" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_slv" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -881,7 +878,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mule" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_slv" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_splat" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -892,7 +889,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_mulo" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_splat" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_srl" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -903,7 +900,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_ncipher_be" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_srl" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_sro" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -914,7 +911,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_ncipherlast_be" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sro" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_srv" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -925,7 +922,10 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_pack" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_srv" xrefstyle="select:title nopage"/></code></para>
<para revisionflag="added">
<code><xref linkend="vec_stril"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
</row>
<row>
@ -937,7 +937,7 @@ a[3] = c;</programlisting>
</entry>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_stril"
<code><xref linkend="vec_stril_p"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
@ -964,7 +964,10 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_packs" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sum2s" xrefstyle="select:title nopage"/></code></para>
<para revisionflag="added">
<code><xref linkend="vec_strir_p"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
</row>
<row>
@ -975,7 +978,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_packsu" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_sums" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_sum2s" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -986,7 +989,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_perm" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_unpackh" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_sums" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -997,7 +1000,8 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_permxor" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_unpackl" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_unpackh"
xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -1008,7 +1012,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_pmsum_be" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_unsigned2" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_unpackl" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -1019,7 +1023,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_reve" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_unsignede" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_unsigned2" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -1030,7 +1034,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_sbox_be" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_unsignedo" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_unsignede" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -1044,7 +1048,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_shasigma_be" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_xl" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
<para><code><xref linkend="vec_unsignedo" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
@ -1058,7 +1062,7 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signed2" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_xl_be" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_xl" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
</entry>
</row>
<row>
@ -1072,13 +1076,13 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signede" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_xst" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
<para><code><xref linkend="vec_xl_be" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_genwm"
<code><xref linkend="vec_genpcvm"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
@ -1086,12 +1090,15 @@ a[3] = c;</programlisting>
<para><code><xref linkend="vec_signedo" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para><code><xref linkend="vec_xst_be" xrefstyle="select:title nopage"/></code></para>
<para><code><xref linkend="vec_xst" xrefstyle="select:title nopage"/></code> (ISA 2.07 only)</para>
</entry>
</row>
<row>
<entry>
<para><code><xref linkend="vec_insert" xrefstyle="select:title nopage"/></code></para>
<para revisionflag="added">
<code><xref linkend="vec_genwm"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
<entry>
<para revisionflag="added">
@ -1099,6 +1106,20 @@ a[3] = c;</programlisting>
xrefstyle="select:title nopage"/></code>
</para>
</entry>
<entry>
<para><code><xref linkend="vec_xst_be" xrefstyle="select:title nopage"/></code></para>
</entry>
</row>
<row>
<entry>
<para><code><xref linkend="vec_insert" xrefstyle="select:title nopage"/></code></para>
</entry>
<entry>
<para revisionflag="added">
<code><xref linkend="vec_signextll"
xrefstyle="select:title nopage"/></code>
</para>
</entry>
<entry>
</entry>
</row>
@ -1255,13 +1276,14 @@ a[3] = c;</programlisting>
introduced serious compiler complexity without much utility.
Thus this support (previously controlled by switches
<code>-maltivec=be</code> and/or <code>-qaltivec=be</code>) is
now deprecated. Current versions of the GCC and Clang
open-source compilers do not implement this support.
now deprecated. Current versions of the <phrase
revisionflag="changed">GCC, Clang, and Open XL</phrase>
compilers do not implement this support.
</para>
</section>
</section>

<section>
<section revisionflag="deleted">
<title>Language-Specific Vector Support for Other
Languages</title>
<section>

@ -201,11 +201,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
</listitem>
<listitem>
<para>
<emphasis role="underline">The XL compilers</emphasis>. For
XL compilers provided with the Linux Community Edition, you
can provide feedback to the XL compiler team via email
<emphasis role="underline">The XL <phrase
revisionflag="added">and Open XL</phrase>
compilers</emphasis>. For XL <phrase
revisionflag="added">and Open XL</phrase> compilers provided
with the Linux Community Edition, you can provide feedback
to the XL compiler team via email
(<email>compinfo@cn.ibm.com</email>); for other editions of
XL compilers, please open a <link
XL <phrase revisionflag="added">and Open XL</phrase>
compilers, please open a <link
xlink:href="https://www.ibm.com/mysupport/s/">Case</link>.
</para>
</listitem>
@ -335,6 +339,22 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_intro">
</emphasis>
</para>
</listitem>
<listitem revisionflag="added">
<para>
<emphasis>The GNU C Library Project.</emphasis>
<emphasis>
<link xlink:href="https://www.gnu.org/software/libc">https://www.gnu.org/software/libc</link>
</emphasis>
</para>
</listitem>
<listitem revisionflag="added">
<para>
<emphasis>Matrix-Multiply Assist Best Practices Guide.</emphasis>
<emphasis>
<link xlink:href="https://www.redbooks.ibm.com/redpapers/pdfs/redp5612.pdf">https://www.redbooks.ibm.com/redpapers/pdfs/redp5612.pdf</link>
</emphasis>
</para>
</listitem>
</itemizedlist>
</section>


@ -19,7 +19,7 @@
revisionflag="added">

<!-- Chapter Title goes here. -->
<title>Matrix Multiply Accelerate (MMA) Intrinsic Reference</title>
<title>Matrix-Multiply Assist (MMA) Intrinsic Reference</title>

<section>
<title>Introduction</title>
@ -43,8 +43,14 @@
instruction directly writes to one of these VSRs.
</para>
<para>
<emphasis role="bold">Review status:</emphasis> This chapter is
not yet reviewed by anyone.
This reference is not intended to be a complete introduction to
MMA concepts. The reader is directed to the Matrix-Multiply
Assist Best Practices Guide (see <xref
linkend="VIPR.intro.links" />) and to the POWER ISA.
</para>
<para>
<emphasis role="bold">Review status:</emphasis> Chapter reviewed
by Paul Clarke; changes made.
</para>
</section>

@ -76,6 +82,14 @@
<para>
Load and store vector pairs.
</para>
<indexterm>
<primary>lxvp</primary>
<secondary>__builtin_vsx_lxvp</secondary>
</indexterm>
<indexterm>
<primary>stxvp</primary>
<secondary>__builtin_vsx_stxvp</secondary>
</indexterm>
<para>
<informaltable frame="all">
<tgroup cols="2">
@ -95,7 +109,7 @@
<row>
<entry>
<programlisting>
__vector pair __builtin_vsx_lxvp (long long int a, const __vector_pair* b)
__vector_pair __builtin_vsx_lxvp (long long a, const __vector_pair* b)
</programlisting>
</entry>
<entry>
@ -107,7 +121,7 @@
<row>
<entry>
<programlisting>
void __builtin_vsx_stxvp (__vector_pair s, long long int a, const __vector_pair* b)
void __builtin_vsx_stxvp (__vector_pair s, long long a, const __vector_pair* b)
</programlisting>
</entry>
<entry>
@ -226,6 +240,18 @@
(a "priming" operation) or vice versa (a "depriming"
operation), or initialize an accumulator to zeros.
</para>
<indexterm>
<primary>xxmfacc</primary>
<secondary>__builtin_mma_xxmfacc</secondary>
</indexterm>
<indexterm>
<primary>xxmtacc</primary>
<secondary>__builtin_mma_xxmtacc</secondary>
</indexterm>
<indexterm>
<primary>xxsetaccz</primary>
<secondary>__builtin_mma_xxsetaccz</secondary>
</indexterm>
<para>
<informaltable frame="all">
<tgroup cols="2">
@ -289,6 +315,238 @@
Each of these intrinsics generates an instruction to perform
an outer product operation.
</para>
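The outer product operation mentioned above can be modeled in plain C. What follows is only a scalar sketch: it assumes a single-precision rank-1 update of the kind an instruction such as xvf32gerpp performs, accumulating x[i] * y[j] into a 4x4 accumulator. The function name outer_product_acc_f32 is hypothetical, and no MMA intrinsic is invoked.

```c
/* Scalar model of a 4x4 single-precision outer-product accumulate.
   This is NOT an MMA intrinsic; the name is hypothetical and the
   code only illustrates the arithmetic of a rank-1 update. */
void outer_product_acc_f32(float acc[4][4], const float x[4],
                           const float y[4])
{
    for (int i = 0; i < 4; i++)        /* row index, taken from x    */
        for (int j = 0; j < 4; j++)    /* column index, taken from y */
            acc[i][j] += x[i] * y[j];
}
```

Calling the function repeatedly with successive columns of one matrix and rows of another builds up a 4x4 tile of their product, which is the usual way outer-product instructions are composed into a matrix multiply.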
<indexterm>
<primary>pmxvbf16ger2</primary>
<secondary>__builtin_mma_pmxvbf16ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2nn</primary>
<secondary>__builtin_mma_pmxvbf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2np</primary>
<secondary>__builtin_mma_pmxvbf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2pn</primary>
<secondary>__builtin_mma_pmxvbf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>pmxvbf16ger2pp</primary>
<secondary>__builtin_mma_pmxvbf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2</primary>
<secondary>__builtin_mma_pmxvf16ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2nn</primary>
<secondary>__builtin_mma_pmxvf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2np</primary>
<secondary>__builtin_mma_pmxvf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2pn</primary>
<secondary>__builtin_mma_pmxvf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf16ger2pp</primary>
<secondary>__builtin_mma_pmxvf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32ger</primary>
<secondary>__builtin_mma_pmxvf32ger</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gernn</primary>
<secondary>__builtin_mma_pmxvf32gernn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gernp</primary>
<secondary>__builtin_mma_pmxvf32gernp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gerpn</primary>
<secondary>__builtin_mma_pmxvf32gerpn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf32gerpp</primary>
<secondary>__builtin_mma_pmxvf32gerpp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64ger</primary>
<secondary>__builtin_mma_pmxvf64ger</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gernn</primary>
<secondary>__builtin_mma_pmxvf64gernn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gernp</primary>
<secondary>__builtin_mma_pmxvf64gernp</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gerpn</primary>
<secondary>__builtin_mma_pmxvf64gerpn</secondary>
</indexterm>
<indexterm>
<primary>pmxvf64gerpp</primary>
<secondary>__builtin_mma_pmxvf64gerpp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi16ger2</primary>
<secondary>__builtin_mma_pmxvi16ger2</secondary>
</indexterm>
<indexterm>
<primary>pmxvi16ger2pp</primary>
<secondary>__builtin_mma_pmxvi16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi16ger2s</primary>
<secondary>__builtin_mma_pmxvi16ger2s</secondary>
</indexterm>
<indexterm>
<primary>pmxvi16ger2spp</primary>
<secondary>__builtin_mma_pmxvi16ger2spp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi4ger8</primary>
<secondary>__builtin_mma_pmxvi4ger8</secondary>
</indexterm>
<indexterm>
<primary>pmxvi4ger8pp</primary>
<secondary>__builtin_mma_pmxvi4ger8pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4</primary>
<secondary>__builtin_mma_pmxvi8ger4</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4pp</primary>
<secondary>__builtin_mma_pmxvi8ger4pp</secondary>
</indexterm>
<indexterm>
<primary>pmxvi8ger4spp</primary>
<secondary>__builtin_mma_pmxvi8ger4spp</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2</primary>
<secondary>__builtin_mma_xvbf16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2nn</primary>
<secondary>__builtin_mma_xvbf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2np</primary>
<secondary>__builtin_mma_xvbf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2pn</primary>
<secondary>__builtin_mma_xvbf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>xvbf16ger2pp</primary>
<secondary>__builtin_mma_xvbf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2</primary>
<secondary>__builtin_mma_xvf16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2nn</primary>
<secondary>__builtin_mma_xvf16ger2nn</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2np</primary>
<secondary>__builtin_mma_xvf16ger2np</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2pn</primary>
<secondary>__builtin_mma_xvf16ger2pn</secondary>
</indexterm>
<indexterm>
<primary>xvf16ger2pp</primary>
<secondary>__builtin_mma_xvf16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvf32ger</primary>
<secondary>__builtin_mma_xvf32ger</secondary>
</indexterm>
<indexterm>
<primary>xvf32gernn</primary>
<secondary>__builtin_mma_xvf32gernn</secondary>
</indexterm>
<indexterm>
<primary>xvf32gernp</primary>
<secondary>__builtin_mma_xvf32gernp</secondary>
</indexterm>
<indexterm>
<primary>xvf32gerpn</primary>
<secondary>__builtin_mma_xvf32gerpn</secondary>
</indexterm>
<indexterm>
<primary>xvf32gerpp</primary>
<secondary>__builtin_mma_xvf32gerpp</secondary>
</indexterm>
<indexterm>
<primary>xvf64ger</primary>
<secondary>__builtin_mma_xvf64ger</secondary>
</indexterm>
<indexterm>
<primary>xvf64gernn</primary>
<secondary>__builtin_mma_xvf64gernn</secondary>
</indexterm>
<indexterm>
<primary>xvf64gernp</primary>
<secondary>__builtin_mma_xvf64gernp</secondary>
</indexterm>
<indexterm>
<primary>xvf64gerpn</primary>
<secondary>__builtin_mma_xvf64gerpn</secondary>
</indexterm>
<indexterm>
<primary>xvf64gerpp</primary>
<secondary>__builtin_mma_xvf64gerpp</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2</primary>
<secondary>__builtin_mma_xvi16ger2</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2pp</primary>
<secondary>__builtin_mma_xvi16ger2pp</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2s</primary>
<secondary>__builtin_mma_xvi16ger2s</secondary>
</indexterm>
<indexterm>
<primary>xvi16ger2spp</primary>
<secondary>__builtin_mma_xvi16ger2spp</secondary>
</indexterm>
<indexterm>
<primary>xvi4ger8</primary>
<secondary>__builtin_mma_xvi4ger8</secondary>
</indexterm>
<indexterm>
<primary>xvi4ger8pp</primary>
<secondary>__builtin_mma_xvi4ger8pp</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4</primary>
<secondary>__builtin_mma_xvi8ger4</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4pp</primary>
<secondary>__builtin_mma_xvi8ger4pp</secondary>
</indexterm>
<indexterm>
<primary>xvi8ger4spp</primary>
<secondary>__builtin_mma_xvi8ger4spp</secondary>
</indexterm>
<para>
<informaltable frame="all">
<tgroup cols="2">

@ -113,9 +113,10 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
references. (<code>restrict</code> can be used only in C
when compiling for the C99 standard or later.
<code>__restrict__</code> is a language extension, available
in GCC, Clang, and the XL compilers, that can be used
without restriction for both C and C++. See your compiler's
user manual for details.)
in GCC, Clang, and the XL <phrase revisionflag="added">and
Open XL</phrase> compilers, that can be used without
restriction for both C and C++. See your compiler's user
manual for details.)
</para>
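As a minimal sketch of the qualifier described above, the following uses `__restrict__` on both pointer parameters. The function name scale_add is hypothetical and is not part of this manual.

```c
#include <stddef.h>

/* Hypothetical example: y[i] += k * x[i] for i in [0, n).
   __restrict__ is the caller's promise that x and y never overlap,
   which may let the compiler vectorize the loop without having to
   allow for aliasing between the two arrays. */
void scale_add(float *__restrict__ y, const float *__restrict__ x,
               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += k * x[i];
}
```

The promise is not checked: passing overlapping arrays to such a function is undefined behavior.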
<para>
Suppose you have a function that takes two pointer
@ -142,8 +143,9 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
</para>
<para>
This reference provides intrinsics that are guaranteed to be
portable across compliant compilers. In particular, both the
GCC and Clang compilers for Power implement the intrinsics in
portable across compliant compilers. In particular, the <phrase
revisionflag="changed">GCC, Clang, and Open XL</phrase>
compilers for Power implement the intrinsics in
this manual. The compilers may each implement many more
intrinsics, but the ones in this manual are the only ones
guaranteed to be portable. So if you are using an interface not
@ -204,10 +206,15 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
responsible for following the calling conventions established by
the ABI (see <xref linkend="VIPR.intro.links" />). Again, it is
best to look at examples. One place to find well-written
<code>.S</code> files is in the GLIBC project. You can also
<code>.S</code> files is in the <phrase
revisionflag="changed">GNU C Library project (see <xref
linkend="VIPR.intro.links" />).</phrase> You can also
study the assembly output from your favorite compiler, which can
be obtained with the <code>-S</code> or similar option, or by
using the <emphasis role="bold">objdump</emphasis> utility.
using the <emphasis role="bold">objdump</emphasis> utility:
</para>
<para revisionflag="added">
<programlisting> objdump -dr &lt;binary or object file&gt;</programlisting>
</para>
</section>

@ -219,7 +226,8 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<section>
<title>x86 Vector Portability Headers</title>
<para>
Recent versions of the GCC and Clang open-source compilers
Recent versions of the <phrase revisionflag="changed">GCC,
Clang, and Open XL</phrase> compilers
for Power provide "drop-in" portability headers for portions
of the Intel Architecture Instruction Set Extensions (see <xref
linkend="VIPR.intro.links" />). These headers mirror the APIs
@ -243,14 +251,18 @@ xmlns:xlink="http://www.w3.org/1999/xlink" xml:id="section_techniques">
<para>
Access to the portability APIs occurs automatically when
including one of the corresponding Intel header files, such as
<code>&lt;mmintrin.h&gt;</code>.
<code>&lt;mmintrin.h&gt;</code>. <phrase
revisionflag="added">You must also compile with
<code>-DNO_WARN_X86_INTRINSICS</code> to opt into using the
headers.</phrase>
</para>
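A short sketch of what this looks like in practice, assuming the SSE header `<xmmintrin.h>` is among the headers your compiler provides for Power. The helper name add4 and the macro USE_SSE_API are hypothetical; the scalar fallback is included only so the file also builds on targets that have no such header.

```c
/* On Power, compile with -DNO_WARN_X86_INTRINSICS, for example:
     gcc -O2 -DNO_WARN_X86_INTRINSICS -c add4.c                   */
#if defined(__x86_64__) || defined(__powerpc64__)
#include <xmmintrin.h>   /* Intel SSE API, or its Power portability header */
#define USE_SSE_API 1    /* hypothetical macro local to this example */
#endif

/* Hypothetical helper: element-wise sum of four floats. */
void add4(float *out, const float *a, const float *b)
{
#ifdef USE_SSE_API
    __m128 va = _mm_loadu_ps(a);             /* unaligned 4 x float load */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));  /* add and store the result */
#else
    for (int i = 0; i < 4; i++)              /* plain C fallback */
        out[i] = a[i] + b[i];
#endif
}
```

The same source is intended to build unchanged for x86 and for Power; only the command line differs.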
</section>
<section xml:id="VIPR.techniques.pveclib">
<title>The Power Vector Library (pveclib)</title>
<para>The Power Vector Library, also known as
<code>pveclib</code>, is a separate project available from
github (see <xref linkend="VIPR.intro.links" />). The
<phrase revisionflag="changed">GitHub</phrase> (see <xref
linkend="VIPR.intro.links" />). The
<code>pveclib</code> project builds on top of the intrinsics
described in this manual to provide higher-level vector
interfaces that are highly portable. The goals of the project

File diff suppressed because it is too large.